From: Spike Du
To: Stephen Hemminger
CC: Matan Azrad, Slava Ovsiienko, Ori Kam, "NBU-Contact-Thomas Monjalon (EXTERNAL)", "dev@dpdk.org", Raslan Darawsheh
Subject: RE: [RFC v2 3/7] ethdev: introduce Rx queue based limit watermark
Date: Tue, 24 May 2022 03:46:30 +0000
References: <20220506035645.4101714-1-spiked@nvidia.com> <20220522055900.417282-1-spiked@nvidia.com> <20220522055900.417282-4-spiked@nvidia.com> <20220522082321.3cdb7693@hermes.local> <20220523155437.764bea10@hermes.local>
In-Reply-To: <20220523155437.764bea10@hermes.local>

> -----Original Message-----
> From: Stephen Hemminger
> Sent: Tuesday, May 24, 2022 6:55 AM
> To: Spike Du
> Cc: Matan Azrad; Slava Ovsiienko; Ori Kam; NBU-Contact-Thomas Monjalon
> (EXTERNAL); dev@dpdk.org; Raslan Darawsheh
> Subject: Re: [RFC v2 3/7] ethdev: introduce Rx queue based limit watermark
>
> On Mon, 23 May 2022 03:01:20 +0000
> Spike Du wrote:
>
> > Hi, pls see below.
> >
> > > -----Original Message-----
> > > From: Stephen Hemminger
> > > Sent: Sunday, May 22, 2022 11:23 PM
> > > To: Spike Du
> > > Cc: Matan Azrad; Slava Ovsiienko; Ori Kam; NBU-Contact-Thomas Monjalon
> > > (EXTERNAL); dev@dpdk.org; Raslan Darawsheh
> > > Subject: Re: [RFC v2 3/7] ethdev: introduce Rx queue based limit
> > > watermark
> > >
> > > On Sun, 22 May 2022 08:58:56 +0300
> > > Spike Du wrote:
> > >
> > > > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> > > > index 04cff8ee10..687ae5ff29 100644
> > > > --- a/lib/ethdev/rte_ethdev.h
> > > > +++ b/lib/ethdev/rte_ethdev.h
> > > > @@ -1249,7 +1249,16 @@ struct rte_eth_rxconf {
> > > >           */
> > > >          union rte_eth_rxseg *rx_seg;
> > > >
> > > > -        uint64_t reserved_64s[2]; /**< Reserved for future fields */
> > > > +        /**
> > > > +         * Per-queue Rx limit watermark defined as percentage of Rx queue
> > > > +         * size. If Rx queue receives traffic higher than this percentage,
> > > > +         * the event RTE_ETH_EVENT_RX_LWM is triggered.
> > > > +         */
> > > > +        uint8_t lwm;
> > > > +
> > > > +        uint8_t reserved_bits[3];
> > > > +        uint32_t reserved_32s;
> > > > +        uint64_t reserved_64s;
> > > >          void *reserved_ptrs[2];   /**< Reserved for future fields */
> > > >  };
> > >
> > > Ok but, this is an ABI risk about this because reserved stuff was
> > > never required before.
> > > Whenever is a reserved field is introduced the code (in this case
> > > rte_ethdev_configure).
> > >
> > > Best practice would have been to have the code require all reserved
> > > fields be 0 in earlier releases. In this case an application is like
> > > to define a watermark of zero; how will your code handle it.
> >
> > Having watermark of 0 is desired, which is the default. LWM of 0 means
> > the Rx Queue's watermark is not monitored, hence no LWM event is
> > generated.
> >
> > > Also, using 8 bits as percentage is different than how other API's
> > > handle this.
> > > Since Rx queue size is in packets, why is this not in packets?
> >
> > The short answer is to simply the LWM configuration.
> > Rx queue descriptor is complex nowadays.
> > For normal queue, user may configure LWM according to queue descriptor
> > number easily.
> > But for below queues, it's not easy:
> > Take mprq as example, the testpmd cmd options can be
> > "-a 0000:03:00.0,rxqs_min_mprq=1,mprq_en=1,mprq_max_memcpy_len=465,mprq_log_stride_size=8,mprq_log_stride_num=3
> > -- --mbcache=512 -i --nb-cores=7 --txd=1024 --rxd=1024".
> > For MLX5 implementation, the minimum "unit" in queue has 64 descriptors,
> > the "unit" number is 16, if you configure according to descriptor
> > number(1024)
> > Here, you may easily set LWM as something like 512, but HW doesn't allow
> > it, because 512 > 16. If you want the watermark to be half, the correct
> > value is 8.
> > The same issue happens to feature like "Rx queue buffer split" where a
> > packet can be split to multiple descriptors.
> > Using percentage doesn't have such issues, PMD will cover all the details.
> >
> > > Also document what behavior of 0 is.
> >
> > Sure. The behavior is like the old days without this feature, pls see
> > above.
> >
> > > Why introduce new query/set operations? This should just be part of
> > > the overall device configuration.
> >
> > Due to different implementation. LWM can be a dynamic configuration
> > which can help user design a flexible flow control.
> > User may feel ok with LWM of 80% to get high throughput, or later on with
> > 50% to throttle the traffic responsively by handling LWM event in order to
> > reduce drop.
> > Some driver like mlx5 may implement LWM event as one-time shot. When
> > you receive LWM event, you need to reconfigure LWM in order to receive
> > the event again, thus you will not likely to be overwhelmed by the events.
> > These all require set operation.
> >
> > For the query operation. The rte_event API
> > rte_eth_dev_callback_process() is per-port API, it doesn't carry much
> > information when an event happens.
> > When a LWM event happens, we need to know in which Rx queue it
> > happens or optionally what's the current LWM percentage of this queue.
> > The query operation serves this purpose.
> >
> > Regards,
> > Spike.
>
> The bigger question is why does this have to be just MLX5 and why can't it fit
> into the existing DPDK RX interrupt framework?
>
> Linux and BSD have had this for years in their packet coalescing logic.
> Ethtool provides ability to set lot of irq coalescing parameters like:
>
>     ethtool -C|--coalesce devname [adaptive-rx on|off] [adaptive-tx on|off]
>         [rx-usecs N] [rx-frames N] [rx-usecs-irq N] [rx-frames-irq N]
>         [tx-usecs N] [tx-frames N] [tx-usecs-irq N] [tx-frames-irq N]
>         [stats-block-usecs N] [pkt-rate-low N] [rx-usecs-low N]
>         [rx-frames-low N] [tx-usecs-low N] [tx-frames-low N]
>         [pkt-rate-high N] [rx-usecs-high N] [rx-frames-high N]
>         [tx-usecs-high N] [tx-frames-high N] [sample-interval N]
>         [cqe-mode-rx on|off] [cqe-mode-tx on|off]
>
> It feels like this is just the DPDK version of a small subset of that.
> Since many device already support IRQ coalescing, it would be best to build
> one new API that has most of these. Rather than a MLX/Nvidia only API for a
> single parameter.

I use MLX5 as the example here because it is the only NIC whose internals I
know; I don't know how other NICs work. That does not mean I am changing
common code only to satisfy mlx5's needs.

I think interrupt coalescing is different from LWM:
Interrupt coalescing delays the interrupt until a batch of packets has been
received (or an interval has elapsed).
LWM is intended to notify the application when an Rx queue is running out of
buffers. Delaying interrupts cannot detect a specific fullness level of the
Rx queue, but LWM can, if the driver supports it.

Regards,
Spike.
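
To illustrate the percentage argument quoted above, here is a minimal sketch
of how a PMD might translate the percentage into its own hardware units. The
helper name lwm_pct_to_units() and the rounding policy are assumptions made
for illustration only; they are not taken from the patch or from the mlx5
driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of DPDK or the mlx5 PMD: convert a
 * percentage-based limit watermark into the number of hardware "units"
 * the queue exposes (16 units of 64 descriptors in the mprq example).
 * Returns 0 when lwm_pct is 0, meaning the watermark is not monitored.
 */
static uint16_t
lwm_pct_to_units(uint8_t lwm_pct, uint16_t nb_units)
{
	uint32_t units;

	assert(nb_units > 0);
	assert(lwm_pct <= 100);

	if (lwm_pct == 0)
		return 0; /* 0 disables the watermark, no event is generated */

	/* Integer scaling, rounded down. */
	units = ((uint32_t)lwm_pct * nb_units) / 100;

	/*
	 * Assumed policy: a non-zero percentage never rounds down to 0,
	 * so it cannot silently disable the watermark.
	 */
	return units == 0 ? 1 : (uint16_t)units;
}
```

With the numbers from the mprq example, lwm_pct_to_units(50, 16) gives 8,
while the application never needs to know that the queue has 16 units rather
than 1024 descriptors.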