DPDK patches and discussions
 help / color / mirror / Atom feed
From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	"olivier.matz@6wind.com" <olivier.matz@6wind.com>
Cc: Gavin Hu <Gavin.Hu@arm.com>, "dev@dpdk.org" <dev@dpdk.org>,
	nd <nd@arm.com>,
	Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
	nd <nd@arm.com>
Subject: Re: [dpdk-dev] [RFC 1/1] lib/ring: add scatter gather and serial dequeue APIs
Date: Wed, 25 Mar 2020 20:43:47 +0000	[thread overview]
Message-ID: <VE1PR08MB5149F1C5B39DE98D71B74B7D98CE0@VE1PR08MB5149.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <SN6PR11MB25581C8D63FC2268B814427A9AE20@SN6PR11MB2558.namprd11.prod.outlook.com>

<snip>

> 
> > >
> > > > > > +/**
> > > > > > + * @internal Reserve ring elements to enqueue several objects
> > > > > > +on the ring
> > > > > > + *
> > > > > > + * @param r
> > > > > > + *   A pointer to the ring structure.
> > > > > > + * @param esize
> > > > > > + *   The size of ring element, in bytes. It must be a multiple of 4.
> > > > > > + *   This must be the same value used while creating the ring.
> > > Otherwise
> > > > > > + *   the results are undefined.
> > > > > > + * @param n
> > > > > > + *   The number of elements to reserve in the ring.
> > > > > > + * @param behavior
> > > > > > + *   RTE_RING_QUEUE_FIXED:    Reserve a fixed number of
> elements
> > > from a
> > > > > ring
> > > > > > + *   RTE_RING_QUEUE_VARIABLE: Reserve as many elements as
> > > possible
> > > > > from ring
> > > > > > + * @param is_sp
> > > > > > + *   Indicates whether to use single producer or multi-producer
> reserve
> > > > > > + * @param old_head
> > > > > > + *   Producer's head index before reservation.
> > > > > > + * @param new_head
> > > > > > + *   Producer's head index after reservation.
> > > > > > + * @param free_space
> > > > > > + *   returns the amount of space after the reserve operation has
> > > finished.
> > > > > > + *   It is not updated if the number of reserved elements is zero.
> > > > > > + * @param dst1
> > > > > > + *   Pointer to location in the ring to copy the data.
> > > > > > + * @param n1
> > > > > > + *   Number of elements to copy at dst1
> > > > > > + * @param dst2
> > > > > > + *   In case of ring wrap around, this pointer provides the location
> to
> > > > > > + *   copy the remaining elements. The number of elements to copy
> at
> > > this
> > > > > > + *   location is equal to (number of elements reserved - n1)
> > > > > > + * @return
> > > > > > + *   Actual number of elements reserved.
> > > > > > + *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> > > > > > + */
> > > > > > +static __rte_always_inline unsigned int
> > > > > > +__rte_ring_do_enqueue_elem_reserve(struct rte_ring *r,
> > > > > > +unsigned int esize,
> > > > >
> > > > >
> > > > > I do understand the purpose of reserve, then either commit/abort
> > > > > for serial sync mode, but what is the purpose of non-serial
> > > > > version of
> > > reserve/commit?
> > > > In RCU, I have the need for scatter-gather feature. i.e. the data
> > > > in the ring element is coming from multiple sources ('token' is
> > > > generated by the RCU library and the application provides
> > > > additional data). If I do not
> > > provide the reserve/commit, I need to introduce an intermediate
> > > memcpy to get these two data contiguously to copy to the ring
> > > element. The sequence is 'reserve(1), memcpy1, mempcy2, commit(1)'.
> > > > Hence, you do not see the abort API for the enqueue.
> > > >
> > > > > In serial  MP/MC case, after _reserve_(n) you always have to do
> > > > > _commit_(n) - you can't reduce number of elements, or do _abort_.
> > > > Agree, the intention here is to provide the scatter/gather feature.
> > > >
> > > > > Again you cannot avoid memcpy(n) here anyhow.
> > > > > So what is the point of these functions for non-serial case?
> > > > It avoids an intermediate memcpy when the data is coming from
> > > > multiple
> > > sources.
> > >
> > > Ok, I think I understand what was my confusion:
> > Yes, the following understanding is correct.
> >
> > > Your intention:
> > >  1) reserve/commit for both serial and non-serial mode -
> > >      to allow user get/set contents of the ring manually and avoid
> > >      intermediate load/stores.
> > > 2) abort only for serial mode.
> > >
> > > My intention:
> > > 1) commit/reserve/abort only for serial case
> > >     (as that's the only mode where we can commit less
> > >      then was reserved or do abort).
> > I do not know if there is a requirement on committing less than reserved.
> 
> From my perspective, that's a necessary part of peek functionality.
> revert/abort function you introduced below is just one special case of it.
> Having just abort is enough when you processing elements in the ring one by
> one, but not sufficient if someone would try to operate in bulks.
> Let say you read (reserved) N objects from the ring, inspected them and
> found that first M (<N) are ok to be removed from the ring, others should
> remain.
Agree, it makes sense from a dequeue perspective. Does it make sense from enqueue perspective?

> 
> > I think, if the size of commit is not known during reservation, may be
> > the reservation can be delayed till it is known.
> 
> In some cases, you do know how much you'd like to commit, but you can't
> guarantee that you can commit that much, till you inspect contents of
> reserved elems.
The above comment was from enqueue perspective.

> 
> > If there is no requirement to commit less than reserved, then I do not see a
> need for serial APIs for enqueue operation.
> 
> >
> > > 2) get/set of ring contents are done as part of either
> > >     reserve(for dequeue) or commit(for enqueue) API calls
> > >     (no scatter-gather ability).
> > >
> > > I still think that this new API you suggest creates too big exposure
> > > of ring internals, and makes it less 'safe-to-use':
> > > - it provides direct access to contents of the ring.
> > > - user has to specify head/tail values directly.
> > It is some what complex. But, with the support of user defined element
> > size, I think it becomes necessary to support scatter gather feature (since it
> is not a single pointer that will be stored).
> 
> I suppose to see the real benefit from scatter-gather, we need a scenario
> where there are relatively big elems in the ring (32B+ or so), plus
> enqueue/dequeue done in bulks.
> If you really  envision such use case - I am ok to consider scatter-gather API
> too, but I think it shouldn't be the only available API for serial mode.
> Might be we can have 'normal' enqueue/dequeue API for serial mode (actual
> copy done internally in ring functions, head/tail values are not exposed
> directly), plus SG API as addon for some special cases.
I will try to run some benchmarks and take a decision on if SG makes an impact on RCU defer APIs.

> 
> > >
> > > So in case of some programmatic error in related user code, there
> > > are less chances it could be catch-up by API, and we can easily
> > > end-up with silent memory corruption and other nasty things that
> > > would be hard to catch/reproduce.
> > >
> > > That makes me wonder how critical is this scatter-gather ability in
> > > terms of overall RCU performance?
> > > Is the gain provided really that significant, especially if you'll
> > > update the ring by one element at a time?
> > For RCU, it is 64b token and the size of the user data. Not sure how much
> difference it will make.
> > I can drop the scatter gather requirement for now.
> >
> > >
> > > >
> > > > >
> > > > > BTW, I think it would be good to have serial version of _enqueue_ too.
> > > > If there is a good use case, they should be provided. I did not
> > > > come across a
> > > good use case.
> > > >
> > > > >
> > > > > > +		unsigned int n, enum rte_ring_queue_behavior
> behavior,
> > > > > > +		unsigned int is_sp, unsigned int *old_head,
> > > > > > +		unsigned int *new_head, unsigned int *free_space,
> > > > > > +		void **dst1, unsigned int *n1, void **dst2)
> > > > >
> > > > > I do understand the intention to avoid memcpy(), but proposed
> > > > > API seems overcomplicated, error prone, and not very convenient for
> the user.
> > > > The issue is the need to handle the wrap around in ring storage array.
> > > > i.e. when the space is reserved for more than 1 ring element, the
> > > > wrap
> > > around might happen.
> > > >
> > > > > I don't think that avoiding memcpy() will save us that many
> > > > > cycles here, so
> > > > This depends on the amount of data being copied.
> > > >
> > > > > probably better to keep API model a bit more regular:
> > > > >
> > > > > n = rte_ring_mp_serial_enqueue_bulk_reserve(ring, num,
> &free_space); ...
> > > > > /* performs actual memcpy(), m<=n */
> > > > > rte_ring_mp_serial_enqueue_bulk_commit(ring, obj,  m);
> > > > These do not take care of the wrap-around case or I am not able to
> > > understand your comment.
> > >
> > > I meant that serial_enqueue_commit() will do both:
> > > actual copy of elements to the ring and tail update (no
> > > Scatter-Gather), see above.
> > RCU does not require the serial enqueue APIs, do you have any use case?
> 
> I agree that serial dequeue seems to have more usages then enqueue.
> Though I still can name at least two cases for enqueue, from top of my head:
> 1. serial mode (both enqueue/dequeue) helps to mitigate ring slowdown
> overcommitted scenarios, see RFC I submitted:
> http://patches.dpdk.org/cover/66001/
> 2. any intermediate node when you have pop/push from/to some external
> queue, and enqueue/dequeue to/from the ring, would like to avoid any
> elem drops in between, and by some reason don't want your own
> intermediate bufferization.
> Let say:
> dequeue_from_ring -> tx_burst/cryptodev_enqueue
> rx_burst/cryptodev_dequeue -> enqueue_to_ring
> 
> Plus as enqueue/dequeue are sort of mirror, I think it is good to have both
> identical.
Ok, agreed. I think we need to allow for combination of APIs to be used. i.e. MP enqueue and serialization on dequeue.

> 
> 


  reply	other threads:[~2020-03-25 20:44 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-24 20:39 [dpdk-dev] [RFC 0/1] " Honnappa Nagarahalli
2020-02-24 20:39 ` [dpdk-dev] [RFC 1/1] " Honnappa Nagarahalli
2020-02-26 20:38   ` Ananyev, Konstantin
2020-02-26 23:21     ` Ananyev, Konstantin
2020-02-28  0:18     ` Honnappa Nagarahalli
2020-03-02 18:20       ` Ananyev, Konstantin
2020-03-04 23:21         ` Honnappa Nagarahalli
2020-03-05 18:26           ` Ananyev, Konstantin
2020-03-25 20:43             ` Honnappa Nagarahalli [this message]
2020-10-06 13:29 ` [dpdk-dev] [RFC v2 0/1] lib/ring: add scatter gather APIs Honnappa Nagarahalli
2020-10-06 13:29   ` [dpdk-dev] [RFC v2 1/1] " Honnappa Nagarahalli
2020-10-07  8:27     ` Olivier Matz
2020-10-08 20:44       ` Honnappa Nagarahalli
2020-10-08 20:47         ` Honnappa Nagarahalli
2020-10-09  7:33         ` Olivier Matz
2020-10-09  8:05           ` Ananyev, Konstantin
2020-10-09 22:54             ` Honnappa Nagarahalli
2020-10-12 17:06               ` Ananyev, Konstantin
2020-10-12 16:20     ` Ananyev, Konstantin
2020-10-12 22:31       ` Honnappa Nagarahalli
2020-10-13 11:38         ` Ananyev, Konstantin
2020-10-23  4:43 ` [dpdk-dev] [PATCH v3 0/5] lib/ring: add zero copy APIs Honnappa Nagarahalli
2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 1/5] test/ring: fix the memory dump size Honnappa Nagarahalli
2020-10-23 13:24     ` Ananyev, Konstantin
2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 2/5] lib/ring: add zero copy APIs Honnappa Nagarahalli
2020-10-23 13:59     ` Ananyev, Konstantin
2020-10-24 15:45       ` Honnappa Nagarahalli
2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 3/5] test/ring: move common function to header file Honnappa Nagarahalli
2020-10-23 14:22     ` Ananyev, Konstantin
2020-10-23 23:54       ` Honnappa Nagarahalli
2020-10-24  0:29         ` Stephen Hemminger
2020-10-24  0:31           ` Honnappa Nagarahalli
2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 4/5] test/ring: add functional tests for zero copy APIs Honnappa Nagarahalli
2020-10-23 14:20     ` Ananyev, Konstantin
2020-10-23 22:47       ` Honnappa Nagarahalli
2020-10-23  4:43   ` [dpdk-dev] [PATCH v3 5/5] test/ring: add stress " Honnappa Nagarahalli
2020-10-23 14:11     ` Ananyev, Konstantin
2020-10-24 16:11 ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add " Honnappa Nagarahalli
2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 1/8] " Honnappa Nagarahalli
2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 2/8] test/ring: move common function to header file Honnappa Nagarahalli
2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 3/8] test/ring: add functional tests for zero copy APIs Honnappa Nagarahalli
2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 4/8] test/ring: add stress " Honnappa Nagarahalli
2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 5/8] doc/ring: add zero copy peek APIs Honnappa Nagarahalli
2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 6/8] test/ring: fix the memory dump size Honnappa Nagarahalli
2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 7/8] test/ring: remove unnecessary braces Honnappa Nagarahalli
2020-10-24 16:11   ` [dpdk-dev] [PATCH v4 8/8] test/ring: user uintptr_t instead of unsigned long Honnappa Nagarahalli
2020-10-24 16:18   ` [dpdk-dev] [PATCH v4 0/8] lib/ring: add zero copy APIs Honnappa Nagarahalli
2020-10-25  7:16     ` David Marchand
2020-10-25  8:14       ` Thomas Monjalon
2020-10-25  5:45 ` [dpdk-dev] [PATCH v5 " Honnappa Nagarahalli
2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 1/8] " Honnappa Nagarahalli
2020-10-27 14:11     ` Ananyev, Konstantin
2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 2/8] test/ring: move common function to header file Honnappa Nagarahalli
2020-10-27 13:51     ` Ananyev, Konstantin
2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 3/8] test/ring: add functional tests for zero copy APIs Honnappa Nagarahalli
2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 4/8] test/ring: add stress " Honnappa Nagarahalli
2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 5/8] doc/ring: add zero copy peek APIs Honnappa Nagarahalli
2020-10-27 14:11     ` Ananyev, Konstantin
2020-10-29 10:52     ` David Marchand
2020-10-29 11:28       ` Ananyev, Konstantin
2020-10-29 12:35         ` David Marchand
2020-10-29 17:29           ` Honnappa Nagarahalli
2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 6/8] test/ring: fix the memory dump size Honnappa Nagarahalli
2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 7/8] test/ring: remove unnecessary braces Honnappa Nagarahalli
2020-10-27 14:13     ` Ananyev, Konstantin
2020-10-25  5:45   ` [dpdk-dev] [PATCH v5 8/8] test/ring: user uintptr_t instead of unsigned long Honnappa Nagarahalli
2020-10-27 14:14     ` Ananyev, Konstantin
2020-10-29 13:57   ` [dpdk-dev] [PATCH v5 0/8] lib/ring: add zero copy APIs David Marchand
2020-10-29 22:11     ` Honnappa Nagarahalli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=VE1PR08MB5149F1C5B39DE98D71B74B7D98CE0@VE1PR08MB5149.eurprd08.prod.outlook.com \
    --to=honnappa.nagarahalli@arm.com \
    --cc=Gavin.Hu@arm.com \
    --cc=dev@dpdk.org \
    --cc=konstantin.ananyev@intel.com \
    --cc=nd@arm.com \
    --cc=olivier.matz@6wind.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).