References: <20210820162834.12544-1-konstantin.ananyev@intel.com>
 <CALBAE1P+gCEJgxBqCXW7fjFEcBcHVFWTMAmsG_mvozA78FJYTQ@mail.gmail.com>
 <DM6PR11MB4491DAA82C445BA0278C0CD09ADA9@DM6PR11MB4491.namprd11.prod.outlook.com>
In-Reply-To: <DM6PR11MB4491DAA82C445BA0278C0CD09ADA9@DM6PR11MB4491.namprd11.prod.outlook.com>
From: Jerin Jacob <jerinjacobk@gmail.com>
Date: Wed, 15 Sep 2021 15:15:05 +0530
Message-ID: <CALBAE1Nce=wAysKp-FMy8MvD7fcjHOCbSJ2Qxqr4VnmoY5abdA@mail.gmail.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Cc: dpdk-dev <dev@dpdk.org>, Thomas Monjalon <thomas@monjalon.net>, 
 "Yigit, Ferruh" <ferruh.yigit@intel.com>,
 Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>, 
 "Yang, Qiming" <qiming.yang@intel.com>, "Zhang, Qi Z" <qi.z.zhang@intel.com>, 
 "Xing, Beilei" <beilei.xing@intel.com>,
 "techboard@dpdk.org" <techboard@dpdk.org>
Subject: Re: [dpdk-dev] [RFC 0/7] hide eth dev related structures

On Tue, Sep 14, 2021 at 7:03 PM Ananyev, Konstantin
<konstantin.ananyev@intel.com> wrote:
>
>
> Hi Jerin,
>
> > > NOTE: This is just an RFC to start further discussion and collect
> > > feedback. Due to the significant amount of work, the required changes
> > > are applied only to two PMDs so far: net/i40e and net/ice.
> > > So to build it you'll need to add:
> > > -Denable_drivers='common/*,mempool/*,net/ice,net/i40e'
> > > to your config options.
> >
> > >
> > > That approach was selected to avoid (or at least minimize) possible
> > > performance losses.
> > >
> > > So far I have done only a limited amount of functional and performance testing.
> > > I didn't spot any functional problems, and the performance numbers
> > > remain the same before and after the patch on my box (testpmd, macswap fwd).
> >
> >
> > Based on testing on octeontx2, we see some regression in testpmd and
> > a bit on l3fwd too.
> >
> > Without the patch: 73.5mpps/core in testpmd iofwd
> > With the patch: 72.5mpps/core in testpmd iofwd
> >
> > Based on my understanding, it is due to the additional indirection.
>
> From your patch below, it looks like it is not actually additional indirection,
> but an extra memory dereference - the func and dev pointers are now stored
> in different places.

Yup. I meant the same. We are on the same page.

> Plus the fact that now we dereference rte_eth_devices[]
> data inside the PMD function, which probably prevents the compiler and CPU
> from loading rte_eth_devices[port_id].data and
> rte_eth_devices[port_id].pre_tx_burst_cbs[queue_id]
> in advance, before calling the actual RX/TX function.

Yes.
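
For reference, the pre-RFC inline path looks roughly like the sketch below
(simplified; the debug checks, the atomic load of the callback pointer and the
tracepoint are trimmed). Both the queue data and the callback list head are
read from the same rte_eth_devices[] entry around the PMD call, which is
exactly the advance loading described above:

static inline uint16_t
rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
                 struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
{
        struct rte_eth_dev *dev = &rte_eth_devices[port_id];
        uint16_t nb_rx;

        /* queue data is resolved here, outside the PMD */
        nb_rx = (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id],
                        rx_pkts, nb_pkts);

#ifdef RTE_ETHDEV_RXTX_CALLBACKS
        /* callback list head comes from the same dev entry */
        struct rte_eth_rxtx_callback *cb = dev->post_rx_burst_cbs[queue_id];

        while (unlikely(cb != NULL)) {
                nb_rx = cb->fn.rx(port_id, queue_id, rx_pkts, nb_rx,
                                nb_pkts, cb->param);
                cb = cb->next;
        }
#endif
        return nb_rx;
}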

> About your approach: I don't mind adding an extra opaque 'void *data' pointer,
> but I would prefer not to expose the callback invocation code in an inline
> function.
> The main reason for that: I think it still needs to be reworked to allow
> adding/removing callbacks without stopping the device, something similar to
> what was done for cryptodev callbacks. To be able to do that in the future
> without another ABI breakage, the callback-related part needs to be kept
> internal.
> Though what we probably can do: add two dynamic arrays of opaque pointers to
> rte_eth_burst_api - one for rx/tx queue data pointers, a second for rx/tx
> callback pointers.
> To be more specific, something like:
>
> typedef uint16_t (*rte_eth_rx_burst_t)(void *rxq, struct rte_mbuf **rx_pkts,
>                 uint16_t nb_pkts, void *cbs);
> typedef uint16_t (*rte_eth_tx_burst_t)(void *txq, struct rte_mbuf **tx_pkts,
>                 uint16_t nb_pkts, void *cbs);
> ....
>
> struct rte_eth_burst_api {
>         rte_eth_rx_burst_t rx_pkt_burst;
>         /**< PMD receive function. */
>         rte_eth_tx_burst_t tx_pkt_burst;
>         /**< PMD transmit function. */
>         rte_eth_tx_prep_t tx_pkt_prepare;
>         /**< PMD transmit prepare function. */
>         rte_eth_rx_queue_count_t rx_queue_count;
>         /**< Get the number of used RX descriptors. */
>         rte_eth_rx_descriptor_status_t rx_descriptor_status;
>         /**< Check the status of a Rx descriptor. */
>         rte_eth_tx_descriptor_status_t tx_descriptor_status;
>         /**< Check the status of a Tx descriptor. */
>         struct {
>                 void **queue_data; /* points to rte_eth_devices[port_id].data->rx_queues */
>                 void **cbs;        /* points to rte_eth_devices[port_id].post_rx_burst_cbs */
>         } rx_data, tx_data;
> } __rte_cache_aligned;
>
> static inline uint16_t
> rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
>                  struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
> {
>         struct rte_eth_burst_api *p;
>
>         if (port_id >= RTE_MAX_ETHPORTS || queue_id >= RTE_MAX_QUEUES_PER_PORT)
>                 return 0;
>
>         p = &rte_eth_burst_api[port_id];
>         return p->rx_pkt_burst(p->rx_data.queue_data[queue_id], rx_pkts,
>                         nb_pkts, p->rx_data.cbs[queue_id]);



That works.


> }
>
> Same for TX.
>
> If that looks ok to everyone, I'll try to prepare the next version based on that.


Looks good to me.
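
Presumably the TX side would mirror that one-for-one; a sketch under the same
assumptions, with the names taken from the struct above:

static inline uint16_t
rte_eth_tx_burst(uint16_t port_id, uint16_t queue_id,
                 struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
{
        struct rte_eth_burst_api *p;

        if (port_id >= RTE_MAX_ETHPORTS || queue_id >= RTE_MAX_QUEUES_PER_PORT)
                return 0;

        p = &rte_eth_burst_api[port_id];
        return p->tx_pkt_burst(p->tx_data.queue_data[queue_id], tx_pkts,
                        nb_pkts, p->tx_data.cbs[queue_id]);
}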

> In theory that should avoid the extra dereference problem and even reduce
> indirection.
> As a drawback, data->rxq/txq would always have to be allocated for
> RTE_MAX_QUEUES_PER_PORT entries, but I presume that's not a big deal.
>
> As a side question - is there any reason why rte_ethdev_trace_rx_burst() is
> invoked at the very last point, while rte_ethdev_trace_tx_burst() is invoked
> after the CBs but before the actual tx_pkt_burst()?
> It would make things simpler if tracing were always done either on entrance
> or on exit of rx/tx_burst.

Tracing on exit is fine.

>
> >
> > My suggestion to fix the problem:
> > Remove the additional `data` redirection, pull the callback function
> > pointers back,
> > and keep the rest as opaque as done in the existing patch, like [1].
> >
> > I don't believe this has any real implication on future ABI stability,
> > as we will not be adding
> > any new item to rte_eth_fp in any way; new features can be added in the
> > slow-path rte_eth_dev, as mentioned in the patch.

Ack

I will be happy to test again after the rework and report any performance
issues, if any.

Thanks for the great work :-)


> >
> > [2] is the patch doing the same; I don't see any performance
> > regression after [2].
> >
> >
> > [1]
> > - struct rte_eth_burst_api {
> > + struct rte_eth_fp {
> > + void *data;
> >   rte_eth_rx_burst_t rx_pkt_burst;
> >   /**< PMD receive function. */
> >   rte_eth_tx_burst_t tx_pkt_burst;
> > @@ -85,8 +100,19 @@ struct rte_eth_burst_api {
> >   /**< Check the status of a Rx descriptor. */
> >   rte_eth_tx_descriptor_status_t tx_descriptor_status;
> >   /**< Check the status of a Tx descriptor. */
> > + /**
> > + * User-supplied functions called from rx_burst to post-process
> > + * received packets before passing them to the user
> > + */
> > + struct rte_eth_rxtx_callback
> > + *post_rx_burst_cbs[RTE_MAX_QUEUES_PER_PORT];
> > + /**
> > + * User-supplied functions called from tx_burst to pre-process
> > + * received packets before passing them to the driver for transmission.
> > + */
> > + struct rte_eth_rxtx_callback *pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT];
> >   uintptr_t reserved[2];
> > -} __rte_cache_min_aligned;
> > +} __rte_cache_aligned;
> >
> > [2]
> > https://pastebin.com/CuqkrCW4
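
For completeness, with the callback arrays kept inside rte_eth_fp as in [1],
the callback handling itself could stay essentially as it is today. The sketch
below is purely illustrative - how the opaque 'data' pointer and queue_id
actually reach the PMD is defined by [2], not by this snippet, and the
rx_pkt_burst signature shown here is an assumption:

static inline uint16_t
rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
                 struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
{
        struct rte_eth_fp *fp;
        uint16_t nb_rx;

        if (port_id >= RTE_MAX_ETHPORTS || queue_id >= RTE_MAX_QUEUES_PER_PORT)
                return 0;

        fp = &rte_eth_fp[port_id]; /* assumed per-port fast-path array */

        /* 'data' stays opaque to the application; only the PMD knows its
         * layout (assumed calling convention, see [2] for the real one) */
        nb_rx = fp->rx_pkt_burst(fp->data, queue_id, rx_pkts, nb_pkts);

        /* callback pointers sit next to the burst functions, so no extra
         * dereference is needed to reach them */
        struct rte_eth_rxtx_callback *cb = fp->post_rx_burst_cbs[queue_id];

        while (unlikely(cb != NULL)) {
                nb_rx = cb->fn.rx(port_id, queue_id, rx_pkts, nb_rx,
                                nb_pkts, cb->param);
                cb = cb->next;
        }
        return nb_rx;
}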