From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
To: Olivier Matz <olivier.matz@6wind.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
"konstantin.ananyev@intel.com" <konstantin.ananyev@intel.com>,
"david.marchand@redhat.com" <david.marchand@redhat.com>,
nd <nd@arm.com>,
Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
nd <nd@arm.com>
Subject: Re: [dpdk-dev] [RFC v2 1/1] lib/ring: add scatter gather APIs
Date: Thu, 8 Oct 2020 20:47:39 +0000 [thread overview]
Message-ID: <DBAPR08MB5814795ED609968E8A1F7A0C980B0@DBAPR08MB5814.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <DBAPR08MB5814C9B0AB1E569C3D51CC65980B0@DBAPR08MB5814.eurprd08.prod.outlook.com>
<snip>
>
> >
> > Hi Honnappa,
> >
> > From a quick walkthrough, I have some questions/comments, please see
> > below.
> Hi Olivier, appreciate your input.
>
> >
> > On Tue, Oct 06, 2020 at 08:29:05AM -0500, Honnappa Nagarahalli wrote:
> > > Add scatter gather APIs to avoid intermediate memcpy. Use cases that
> > > involve copying large amount of data to/from the ring can benefit
> > > from these APIs.
> > >
> > > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > > ---
> > > lib/librte_ring/meson.build | 3 +-
> > > lib/librte_ring/rte_ring_elem.h | 1 +
> > > lib/librte_ring/rte_ring_peek_sg.h | 552
> > > +++++++++++++++++++++++++++++
> > > 3 files changed, 555 insertions(+), 1 deletion(-) create mode
> > > 100644 lib/librte_ring/rte_ring_peek_sg.h
> > >
> > > diff --git a/lib/librte_ring/meson.build
> > > b/lib/librte_ring/meson.build index 31c0b4649..377694713 100644
> > > --- a/lib/librte_ring/meson.build
> > > +++ b/lib/librte_ring/meson.build
> > > @@ -12,4 +12,5 @@ headers = files('rte_ring.h',
> > > 'rte_ring_peek.h',
> > > 'rte_ring_peek_c11_mem.h',
> > > 'rte_ring_rts.h',
> > > - 'rte_ring_rts_c11_mem.h')
> > > + 'rte_ring_rts_c11_mem.h',
> > > + 'rte_ring_peek_sg.h')
> > > diff --git a/lib/librte_ring/rte_ring_elem.h
> > > b/lib/librte_ring/rte_ring_elem.h index 938b398fc..7d3933f15 100644
> > > --- a/lib/librte_ring/rte_ring_elem.h
> > > +++ b/lib/librte_ring/rte_ring_elem.h
> > > @@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct rte_ring
> > > *r, void *obj_table,
> > >
> > > #ifdef ALLOW_EXPERIMENTAL_API
> > > #include <rte_ring_peek.h>
> > > +#include <rte_ring_peek_sg.h>
> > > #endif
> > >
> > > #include <rte_ring.h>
> > > diff --git a/lib/librte_ring/rte_ring_peek_sg.h
> > > b/lib/librte_ring/rte_ring_peek_sg.h
> > > new file mode 100644
> > > index 000000000..97d5764a6
> > > --- /dev/null
> > > +++ b/lib/librte_ring/rte_ring_peek_sg.h
> > > @@ -0,0 +1,552 @@
> > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > + *
> > > + * Copyright (c) 2020 Arm
> > > + * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
> > > + * All rights reserved.
> > > + * Derived from FreeBSD's bufring.h
> > > + * Used as BSD-3 Licensed with permission from Kip Macy.
> > > + */
> > > +
> > > +#ifndef _RTE_RING_PEEK_SG_H_
> > > +#define _RTE_RING_PEEK_SG_H_
> > > +
> > > +/**
> > > + * @file
> > > + * @b EXPERIMENTAL: this API may change without prior notice
> > > + * It is not recommended to include this file directly.
> > > + * Please include <rte_ring_elem.h> instead.
> > > + *
> > > + * Ring Peek Scatter Gather APIs
> >
> > I am not fully convinced by the API name. To me, "scatter/gather" is
> > associated to iovecs, like for instance in [1]. The wikipedia
> > definition [2] may be closer though.
> >
> > [1]
> >
> https://www.gnu.org/software/libc/manual/html_node/Scatter_002dGathe
> > r.html
> > [2] https://en.wikipedia.org/wiki/Gather-scatter_(vector_addressing)
> The way I understand scatter-gather is, the data to be sent to something (like
> a device) is coming from multiple sources. It would require copying to put the
> data together to be contiguous. If the device supports scatter-gather, such
> copying is not required. The device can collect data from multiple locations
> and make it contiguous.
>
> In the case I was looking at, one part of the data was coming from the user of
> the API and another was generated by the API itself. If these two pieces of
> information have to be enqueued as a single object on the ring, I had to
> create an intermediate copy. But by exposing the ring memory to the user,
> the intermediate copy is avoided. Hence I called it scatter-gather.
>
> >
> > What about "zero-copy"?
> I think no-copy (nc for short) or user-copy (uc for short) would convey the
> meaning better. These would indicate that the rte_ring APIs are not copying
> the objects and it is left to the user to do the actual copy.
>
> >
> > Also, the "peek" term looks also a bit confusing to me, but it existed
> > before your patch. I understand it for dequeue, but not for enqueue.
> I kept 'peek' there because the API still offers the 'peek' API capabilities. I am
> also not sure on what 'peek' means for enqueue API. The enqueue 'peek'
> API was provided to be symmetric with dequeue peek API.
>
> >
> > Or, what about replacing the existing experimental peek API by this one?
> > They look quite similar to me.
> I agree, scatter gather APIs provide the peek capability and the no-copy
> benefits.
> Konstantin, any opinions here?
>
> >
> > > + * Introduction of rte_ring with scatter gather serialized
> > > + producer/consumer
> > > + * (HTS sync mode) makes it possible to split public
> > > + enqueue/dequeue API
> > > + * into 3 phases:
> > > + * - enqueue/dequeue start
> > > + * - copy data to/from the ring
> > > + * - enqueue/dequeue finish
> > > + * Along with the advantages of the peek APIs, these APIs provide
> > > + the ability
> > > + * to avoid copying of the data to temporary area.
> > > + *
> > > + * Note that right now this new API is available only for two sync modes:
> > > + * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST)
> > > + * 2) Serialized Producer/Serialized Consumer
> (RTE_RING_SYNC_MT_HTS).
> > > + * It is a user responsibility to create/init ring with appropriate
> > > + sync
> > > + * modes selected.
> > > + *
> > > + * Example usage:
> > > + * // read 1 elem from the ring:
> >
> > Comment should be "prepare enqueuing 32 objects"
> >
> > > + * n = rte_ring_enqueue_sg_bulk_start(ring, 32, &sgd, NULL);
> > > + * if (n != 0) {
> > > + * //Copy objects in the ring
> > > + * memcpy (sgd->ptr1, obj, sgd->n1 * sizeof(uintptr_t));
> > > + * if (n != sgd->n1)
> > > + * //Second memcpy because of wrapround
> > > + * n2 = n - sgd->n1;
> > > + * memcpy (sgd->ptr2, obj[n2], n2 * sizeof(uintptr_t));
> >
> > Missing { }
> >
> > > + * rte_ring_dequeue_sg_finish(ring, n);
> >
> > Should be enqueue
> >
> Thanks, will correct them.
>
> > > + * }
> > > + *
> > > + * Note that between _start_ and _finish_ none other thread can
> > > + proceed
> > > + * with enqueue(/dequeue) operation till _finish_ completes.
> > > + */
> > > +
> > > +#ifdef __cplusplus
> > > +extern "C" {
> > > +#endif
> > > +
> > > +#include <rte_ring_peek_c11_mem.h>
> > > +
> > > +/* Rock that needs to be passed between reserve and commit APIs */
> > > +struct rte_ring_sg_data {
> > > + /* Pointer to the first space in the ring */
> > > + void **ptr1;
> > > + /* Pointer to the second space in the ring if there is wrap-around */
> > > + void **ptr2;
> > > + /* Number of elements in the first pointer. If this is equal to
> > > + * the number of elements requested, then ptr2 is NULL.
> > > + * Otherwise, subtracting n1 from number of elements requested
> > > + * will give the number of elements available at ptr2.
> > > + */
> > > + unsigned int n1;
> > > +};
> >
> > Would it be possible to simply return the offset instead of this structure?
> > The wrap could be managed by a __rte_ring_enqueue_elems() function.
> Trying to use __rte_ring_enqueue_elems() will force temporary copy. See
> below.
>
> >
> > I mean something like this:
> >
> > uint32_t start;
> > n = rte_ring_enqueue_sg_bulk_start(ring, 32, &start, NULL);
> > if (n != 0) {
> > /* Copy objects in the ring. */
> > __rte_ring_enqueue_elems(ring, start, obj, sizeof(uintptr_t),
> n);
> For ex: 'obj' here is temporary copy.
The example I provided in the comment at the top of the file is not good. I will replace the 'memcpy' with calling another API that copies the data to the ring directly. That should show the clear benefit.
>
> > rte_ring_enqueue_sg_finish(ring, n);
> > }
> >
> > It would require to slightly change __rte_ring_enqueue_elems() to
> > support to be called with prod_head >= size, and wrap in that case.
> >
> The alternate solution I can think of requires 3 things 1) the base address of
> the ring 2) Index to where to copy 3) the mask. With these 3 things one could
> write the code like below:
> for (i = 0; i < n; i++) {
> ring_addr[(index + i) & mask] = obj[i]; // ANDing with mask will take
> care of wrap-around.
> }
>
> However, I think this does not allow for passing the address in the ring to
> another function/API to copy the data (It is possible, but the user has to
> calculate the actual address, worry about the wrap-around, second pointer
> etc).
>
> The current approach hides some details and provides flexibility to the
> application to use the pointer the way it wants.
>
> >
> > Regards,
> > Olivier
next prev parent reply other threads:[~2020-10-08 20:47 UTC|newest]
Thread overview: 69+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-24 20:39 [dpdk-dev] [RFC 0/1] lib/ring: add scatter gather and serial dequeue APIs Honnappa Nagarahalli
2020-02-24 20:39 ` [dpdk-dev] [RFC 1/1] " Honnappa Nagarahalli
2020-02-26 20:38 ` Ananyev, Konstantin
2020-02-26 23:21 ` Ananyev, Konstantin
2020-02-28 0:18 ` Honnappa Nagarahalli
2020-03-02 18:20 ` Ananyev, Konstantin
2020-03-04 23:21 ` Honnappa Nagarahalli
2020-03-05 18:26 ` Ananyev, Konstantin
2020-03-25 20:43 ` Honnappa Nagarahalli
2020-10-06 13:29 ` [dpdk-dev] [RFC v2 0/1] lib/ring: add scatter gather APIs Honnappa Nagarahalli
2020-10-06 13:29 ` [dpdk-dev] [RFC v2 1/1] " Honnappa Nagarahalli
2020-10-07 8:27 ` Olivier Matz
2020-10-08 20:44 ` Honnappa Nagarahalli
2020-10-08 20:47 ` Honnappa Nagarahalli [this message]
2020-10-09 7:33 ` Olivier Matz
2020-10-09 8:05 ` Ananyev, Konstantin
2020-10-09 22:54 ` Honnappa Nagarahalli
2020-10-12 17:06 ` Ananyev, Konstantin
2020-10-12 16:20 ` Ananyev, Konstantin
2020-10-12 22:31 ` Honnappa Nagarahalli
2020-10-13 11:38 ` Ananyev, Konstantin
2020-10-23 4:43 ` [dpdk-dev] [PATCH v3 0/5] lib/ring: add zero copy APIs Honnappa Nagarahalli
2020-10-23 4:43 ` [dpdk-dev] [PATCH v3 1/5] test/ring: fix the memory dump size Honnappa Nagarahalli
2020-10-23 13:24 ` Ananyev, Konstantin
2020-10-23 4:43 ` [dpdk-dev] [PATCH v3 2/5] lib/ring: add zero copy APIs Honnappa Nagarahalli
2020-10-23 13:59 ` Ananyev, Konstantin
2020-10-24 15:45 ` Honnappa Nagarahalli
2020-10-23 4:43 ` [dpdk-dev] [PATCH v3 3/5] test/ring: move common function to header file Honnappa Nagarahalli
2020-10-23 14:22 ` Ananyev, Konstantin
2020-10-23 23:54 ` Honnappa Nagarahalli
2020-10-24 0:29 ` Stephen Hemminger
2020-10-24 0:31 ` Honnappa Nagarahalli
2020-10-23 4:43 ` [dpdk-dev] [PATCH v3 4/5] test/ring: add functional tests for zero copy APIs Honnappa Nagarahalli
2020-10-23 14:20 ` Ananyev, Konstantin
2020-10-23 22:47 ` Honnappa Nagarahalli
2020-10-23 4:43 ` [dpdk-dev] [PATCH v3 5/5] test/ring: add stress " Honnappa Nagarahalli
2020-10-23 14:11 ` Ananyev, Konstantin
2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add " Honnappa Nagarahalli
2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 1/8] " Honnappa Nagarahalli
2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 2/8] test/ring: move common function to header file Honnappa Nagarahalli
2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 3/8] test/ring: add functional tests for zero copy APIs Honnappa Nagarahalli
2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 4/8] test/ring: add stress " Honnappa Nagarahalli
2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 5/8] doc/ring: add zero copy peek APIs Honnappa Nagarahalli
2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 6/8] test/ring: fix the memory dump size Honnappa Nagarahalli
2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 7/8] test/ring: remove unnecessary braces Honnappa Nagarahalli
2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 8/8] test/ring: user uintptr_t instead of unsigned long Honnappa Nagarahalli
2020-10-24 16:18 ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add zero copy APIs Honnappa Nagarahalli
2020-10-25 7:16 ` David Marchand
2020-10-25 8:14 ` Thomas Monjalon
2020-10-25 5:45 ` [dpdk-dev] [PATCH v5 " Honnappa Nagarahalli
2020-10-25 5:45 ` [dpdk-dev] [PATCH v5 1/8] " Honnappa Nagarahalli
2020-10-27 14:11 ` Ananyev, Konstantin
2020-10-25 5:45 ` [dpdk-dev] [PATCH v5 2/8] test/ring: move common function to header file Honnappa Nagarahalli
2020-10-27 13:51 ` Ananyev, Konstantin
2020-10-25 5:45 ` [dpdk-dev] [PATCH v5 3/8] test/ring: add functional tests for zero copy APIs Honnappa Nagarahalli
2020-10-25 5:45 ` [dpdk-dev] [PATCH v5 4/8] test/ring: add stress " Honnappa Nagarahalli
2020-10-25 5:45 ` [dpdk-dev] [PATCH v5 5/8] doc/ring: add zero copy peek APIs Honnappa Nagarahalli
2020-10-27 14:11 ` Ananyev, Konstantin
2020-10-29 10:52 ` David Marchand
2020-10-29 11:28 ` Ananyev, Konstantin
2020-10-29 12:35 ` David Marchand
2020-10-29 17:29 ` Honnappa Nagarahalli
2020-10-25 5:45 ` [dpdk-dev] [PATCH v5 6/8] test/ring: fix the memory dump size Honnappa Nagarahalli
2020-10-25 5:45 ` [dpdk-dev] [PATCH v5 7/8] test/ring: remove unnecessary braces Honnappa Nagarahalli
2020-10-27 14:13 ` Ananyev, Konstantin
2020-10-25 5:45 ` [dpdk-dev] [PATCH v5 8/8] test/ring: user uintptr_t instead of unsigned long Honnappa Nagarahalli
2020-10-27 14:14 ` Ananyev, Konstantin
2020-10-29 13:57 ` [dpdk-dev] [PATCH v5 0/8] lib/ring: add zero copy APIs David Marchand
2020-10-29 22:11 ` Honnappa Nagarahalli
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DBAPR08MB5814795ED609968E8A1F7A0C980B0@DBAPR08MB5814.eurprd08.prod.outlook.com \
--to=honnappa.nagarahalli@arm.com \
--cc=david.marchand@redhat.com \
--cc=dev@dpdk.org \
--cc=konstantin.ananyev@intel.com \
--cc=nd@arm.com \
--cc=olivier.matz@6wind.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).