From: Jerin Jacob
Date: Thu, 26 Aug 2021 17:28:14 +0530
To: "Xueming(Steven) Li"
Cc: dpdk-dev, Ferruh Yigit, NBU-Contact-Thomas Monjalon, Andrew Rybchenko
Subject: Re: [dpdk-dev] [PATCH v2 01/15] ethdev: introduce shared Rx queue
References: <20210727034204.20649-1-xuemingl@nvidia.com> <20210811140418.393264-1-xuemingl@nvidia.com>
List-Id: DPDK patches and discussions

On Thu, Aug 19, 2021 at 5:39 PM Xueming(Steven) Li wrote:
>
>
>
> > -----Original Message-----
> > From: Jerin Jacob
> > Sent: Thursday, August 19, 2021 1:27 PM
> > To: Xueming(Steven) Li
> > Cc: dpdk-dev; Ferruh Yigit; NBU-Contact-Thomas Monjalon;
> > Andrew Rybchenko
> > Subject: Re: [PATCH v2 01/15] ethdev: introduce shared Rx queue
> >
> > On Wed, Aug 18, 2021 at 4:44 PM Xueming(Steven) Li wrote:
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Jerin Jacob
> > > > Sent: Tuesday, August 17, 2021 11:12 PM
> > > > To: Xueming(Steven) Li
> > > > Cc: dpdk-dev; Ferruh Yigit;
> > > > NBU-Contact-Thomas Monjalon; Andrew Rybchenko
> > > >
> > > > Subject: Re: [PATCH v2 01/15] ethdev: introduce shared Rx queue
> > > >
> > > > On Tue, Aug 17, 2021 at 5:01 PM Xueming(Steven) Li wrote:
> > > > >
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Jerin Jacob
> > > > > > Sent: Tuesday, August 17, 2021 5:33 PM
> > > > > > To: Xueming(Steven) Li
> > > > > > Cc: dpdk-dev; Ferruh Yigit
> > > > > > ; NBU-Contact-Thomas Monjalon
> > > > > > ; Andrew Rybchenko
> > > > > >
> > > > > > Subject: Re: [PATCH v2 01/15] ethdev: introduce shared Rx queue
> > > > > >
> > > > > > On Wed, Aug 11, 2021 at 7:34 PM Xueming Li wrote:
> > > > > > >
> > > > > > > In current DPDK framework, each RX queue is pre-loaded with
> > > > > > > mbufs for incoming packets. When number of representors scale
> > > > > > > out in a switch domain, the memory consumption became
> > > > > > > significant. Most important, polling all ports leads to high
> > > > > > > cache miss, high latency and low throughput.
> > > > > > >
> > > > > > > This patch introduces shared RX queue. Ports with same
> > > > > > > configuration in a switch domain could share RX queue set by
> > > > > > > specifying sharing group.
> > > > > > > Polling any queue using same shared RX queue receives packets
> > > > > > > from all member ports. Source port is identified by mbuf->port.
> > > > > > >
> > > > > > > Port queue number in a shared group should be identical. Queue
> > > > > > > index is 1:1 mapped in shared group.
> > > > > > >
> > > > > > > Share RX queue must be polled on single thread or core.
> > > > > > >
> > > > > > > Multiple groups is supported by group ID.
> > > > > > >
> > > > > > > Signed-off-by: Xueming Li
> > > > > > > Cc: Jerin Jacob
> > > > > > > ---
> > > > > > > Rx queue object could be used as shared Rx queue object, it's
> > > > > > > important to clear all queue control callback api that using
> > > > > > > queue object:
> > > > > > > https://mails.dpdk.org/archives/dev/2021-July/215574.html
> > > > > >
> > > > > > >  #undef RTE_RX_OFFLOAD_BIT2STR
> > > > > > > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> > > > > > > index d2b27c351f..a578c9db9d 100644
> > > > > > > --- a/lib/ethdev/rte_ethdev.h
> > > > > > > +++ b/lib/ethdev/rte_ethdev.h
> > > > > > > @@ -1047,6 +1047,7 @@ struct rte_eth_rxconf {
> > > > > > >         uint8_t rx_drop_en; /**< Drop packets if no descriptors are available.
> > > > > > > */
> > > > > > >         uint8_t rx_deferred_start; /**< Do not start queue with rte_eth_dev_start(). */
> > > > > > >         uint16_t rx_nseg; /**< Number of descriptions in rx_seg array.
> > > > > > > */
> > > > > > > +       uint32_t shared_group; /**< Shared port group index in
> > > > > > > + switch domain. */
> > > > > >
> > > > > > Not to able to see anyone setting/creating this group ID test application.
> > > > > > How this group is created?
> > > > >
> > > > > Nice catch, the initial testpmd version only support one default group(0).
> > > > > All ports that supports shared-rxq assigned in same group.
> > > > >
> > > > > We should be able to change "--rxq-shared" to "--rxq-shared-group"
> > > > > to support group other than default.
> > > > >
> > > > > To support more groups simultaneously, need to consider testpmd
> > > > > forwarding stream core assignment, all streams in same group need to
> > > > > stay on same core.
> > > > > It's possible to specify how many ports to increase group number,
> > > > > but user must schedule stream affinity carefully - error prone.
> > > > >
> > > > > On the other hand, one group should be sufficient for most
> > > > > customer, the doubt is whether it valuable to support multiple groups test.
> > > >
> > > > Ack. One group is enough in testpmd.
> > > >
> > > > My question was more about who and how this group is created,
> > > > Shouldn't we need API to create shared_group? If we do the following,
> > > > at least, I can think, how it can be implemented in SW or other HW.
> > > >
> > > > - Create aggregation queue group
> > > > - Attach multiple Rx queues to the aggregation queue group
> > > > - Pull the packets from the queue group(which internally fetch from
> > > > the Rx queues _attached_)
> > > >
> > > > Does the above kind of sequence, break your representor use case?
> > >
> > > Seems more like a set of EAL wrapper. Current API tries to minimize the
> > > application efforts to adapt shared-rxq.
> > > - step 1, not sure how important it is to create group with API, in
> > > rte_flow, group is created on demand.
> >
> > Which rte_flow pattern/action for this?
>
> No rte_flow for this, just recalled that the group in rte_flow is not
> created along with flow, not via api.
> I don't see anything else to create along with group, just double whether
> it valuable to introduce a new api set to manage group.

See below.

>
> >
> > > - step 2, currently, the attaching is done in rte_eth_rx_queue_setup,
> > > specify offload and group in rx_conf struct.
> > > - step 3, define a dedicate api to receive packets from shared rxq?
> > > Looks clear to receive packets from shared rxq.
> > > currently, rxq objects in share group is same - the shared rxq, so the
> > > eth callback eth_rx_burst_t(rxq_obj, mbufs, n) could
> > > be used to receive packets from any ports in group, normally the
> > > first port(PF) in group.
> > > An alternative way is defining a vdev with same queue number and copy
> > > rxq objects will make the vdev a proxy of
> > > the shared rxq group - this could be an helper API.
> > >
> > > Anyway the wrapper doesn't break use case, step 3 api is more clear,
> > > need to understand how to implement efficiently.
> >
> > Are you doing this feature based on any HW support or it just pure SW
> > thing, If it is SW, It is better to have just new vdev for like
> > drivers/net/bonding/. This we can help aggregate multiple Rxq across the
> > multiple ports of same the driver.
>
> Based on HW support.

In Marvell HW, we have some support for this. I will outline it here,
along with some queries.

# We need to create a new HW structure for aggregation
# Connect each Rxq to the new HW structure for aggregation
# Use rx_burst from the new HW structure

Could you outline your HW support?

Also, I am not able to understand how this will reduce memory. At least
in our HW, we need to allocate more memory to handle this, because we
have to manage the new HW structure.

How does your HW reduce the memory?
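The three steps above (create an aggregation structure, connect each Rxq to it, then rx_burst from it) can be sketched in plain C as follows. This is only an illustrative software model under stated assumptions: `struct pkt`, `struct rxq`, `struct agg_group` and every function name are hypothetical stand-ins invented for this sketch, not ethdev or driver APIs, and the round-robin drain order is an assumption rather than a description of any hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MEMBERS 8  /* hypothetical cap on attached queues */
#define RING_SIZE 16   /* hypothetical per-queue depth (power of two) */

/* Hypothetical stand-in for a packet; only the source port matters here. */
struct pkt {
	uint16_t port;
	uint32_t id;
};

/* Hypothetical stand-in for one per-port Rx queue (simple FIFO ring). */
struct rxq {
	struct pkt ring[RING_SIZE];
	unsigned int head;
	unsigned int tail;
};

/* Hypothetical aggregation structure. */
struct agg_group {
	struct rxq *member[MAX_MEMBERS];
	unsigned int n_members;
	unsigned int next; /* round-robin cursor */
};

/* Step 1: create (here: initialize) the aggregation structure. */
static void agg_group_create(struct agg_group *g)
{
	g->n_members = 0;
	g->next = 0;
}

/* Step 2: connect one Rx queue. Returns 0 on success, -1 if full. */
static int agg_group_attach(struct agg_group *g, struct rxq *q)
{
	if (g->n_members == MAX_MEMBERS)
		return -1;
	g->member[g->n_members++] = q;
	return 0;
}

/* Simulates a packet arriving on one member queue. -1 if the ring is full. */
static int rxq_push(struct rxq *q, struct pkt p)
{
	if (q->tail - q->head == RING_SIZE)
		return -1;
	q->ring[q->tail++ % RING_SIZE] = p;
	return 0;
}

/*
 * Step 3: one burst call on the aggregation structure. It visits the
 * attached queues round-robin and stops when the output array is full
 * or every member has been found empty in a row.
 */
static unsigned int agg_rx_burst(struct agg_group *g, struct pkt *out,
				 unsigned int n)
{
	unsigned int got = 0, idle = 0;

	assert(g != NULL && out != NULL);
	while (got < n && idle < g->n_members) {
		struct rxq *q = g->member[g->next];

		g->next = (g->next + 1) % g->n_members;
		if (q->head == q->tail) {
			idle++;
			continue;
		}
		out[got++] = q->ring[q->head++ % RING_SIZE];
		idle = 0;
	}
	return got;
}
```

In this model the caller polls only `agg_rx_burst()`; each member queue still owns its own ring, which is the extra-memory concern raised above.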
Also, if memory is the constraint, why NOT reduce the number of queues?

# Also, I was thinking, one way to avoid the fast path or ABI change
would be as follows:

# The driver initializes one more eth_dev_ops in the driver as an
aggregator ethdev
# devargs of the new ethdev, or a specific API like
drivers/net/bonding/rte_eth_bond.h, can take the (port, queue) tuples
that need to be aggregated by the new ethdev port
# No change in fastpath or ABI is required in this model.

> Most user might uses PF in group as the anchor port to rx burst, current
> definition should be easy for them to migrate.
> but some user might prefer grouping some hot plug/unplugged representors,
> EAL could provide wrappers, users could do
> that either due to the strategy not complex enough. Anyway, welcome any
> suggestion.
>
> >
> > >
> > > >
> > > > >
> > > > > >
> > > > > > >         /**
> > > > > > >          * Per-queue Rx offloads to be set using DEV_RX_OFFLOAD_* flags.
> > > > > > >          * Only offloads set on rx_queue_offload_capa or rx_offload_capa
> > > > > > > @@ -1373,6 +1374,12 @@ struct rte_eth_conf {
> > > > > > >  #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
> > > > > > >  #define DEV_RX_OFFLOAD_RSS_HASH         0x00080000
> > > > > > >  #define RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT 0x00100000
> > > > > > > +/**
> > > > > > > + * Rx queue is shared among ports in same switch domain to save memory,
> > > > > > > + * avoid polling each port. Any port in group can be used to receive packets.
> > > > > > > + * Real source port number saved in mbuf->port field.
> > > > > > > + */
> > > > > > > +#define RTE_ETH_RX_OFFLOAD_SHARED_RXQ   0x00200000
> > > > > > >
> > > > > > >  #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
> > > > > > >                                   DEV_RX_OFFLOAD_UDP_CKSUM | \
> > > > > > > --
> > > > > > > 2.25.1
> > > > > > >
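For reference, the behaviour described in the quoted patch (ports that request the shared-Rx-queue offload with the same group end up on one queue object, any member port can be polled, and the real source is read from mbuf->port) can be modelled with the small C sketch below. It is a software model only and an assumption about the intended semantics: `struct mbuf`, `struct rxconf`, `struct port` and all functions are hypothetical stand-ins, not the real rte_mbuf, rte_eth_rxconf or ethdev calls. Only the flag value 0x00200000 is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHARED_RXQ_FLAG 0x00200000u /* value of RTE_ETH_RX_OFFLOAD_SHARED_RXQ in the patch */
#define MAX_GROUPS 2  /* hypothetical */
#define MAX_PORTS 4   /* hypothetical */
#define RING_SIZE 16  /* hypothetical queue depth (power of two) */

/* Stand-in for rte_mbuf: only the source port field is modelled. */
struct mbuf {
	uint16_t port;
};

/* Stand-in for the two rte_eth_rxconf fields discussed in the thread. */
struct rxconf {
	uint64_t offloads;
	uint32_t shared_group;
};

/* One queue object per sharing group. */
struct shared_rxq {
	struct mbuf ring[RING_SIZE];
	unsigned int head;
	unsigned int tail;
};

struct port {
	struct shared_rxq *rxq;
};

static struct shared_rxq group_rxq[MAX_GROUPS];

/* Queue setup: ports in the same group receive the same queue object. */
static int port_rx_queue_setup(struct port *p, const struct rxconf *conf)
{
	if (!(conf->offloads & SHARED_RXQ_FLAG) ||
	    conf->shared_group >= MAX_GROUPS)
		return -1;
	p->rxq = &group_rxq[conf->shared_group];
	return 0;
}

/* Simulates hardware placing a packet that arrived on src_port. */
static int hw_deliver(struct port *p, uint16_t src_port)
{
	struct shared_rxq *q = p->rxq;

	if (q->tail - q->head == RING_SIZE)
		return -1;
	q->ring[q->tail++ % RING_SIZE].port = src_port;
	return 0;
}

/* Polling any member port drains the shared queue of the whole group. */
static unsigned int port_rx_burst(struct port *p, struct mbuf *out,
				  unsigned int n)
{
	struct shared_rxq *q = p->rxq;
	unsigned int got = 0;

	assert(q != NULL && out != NULL);
	while (got < n && q->head != q->tail)
		out[got++] = q->ring[q->head++ % RING_SIZE];
	return got;
}

/* The application attributes each packet to its source via mbuf.port. */
static void count_by_port(const struct mbuf *pkts, unsigned int n,
			  unsigned int counts[MAX_PORTS])
{
	for (unsigned int i = 0; i < n; i++)
		if (pkts[i].port < MAX_PORTS)
			counts[pkts[i].port]++;
}
```

Because all member ports point at the same queue object, the single-thread polling restriction stated in the commit message applies to the group as a whole in this model.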