DPDK usage discussions

* [dpdk-users] Beginners question: rte_eth_tx_burst, rte_mbuf access synchronization
@ 2016-11-11  9:49 Philipp Beyer
  2016-11-11 12:35 ` Anupam Kapoor
  2016-11-11 13:45 ` Matt Laswell
  0 siblings, 2 replies; 6+ messages in thread
From: Philipp Beyer @ 2016-11-11  9:49 UTC (permalink / raw)
  To: users

Hi!

I am just writing my first code using DPDK, a traffic generator, for 
which I started with the l2fwd example.

Basically, I need to send the same packet over a single interface, over 
and over again, with single bytes changed each time.
I use rte_eth_tx_burst to send 16 packets at once. As I want to re-use 
the same buffers in a very simple way, I just increment the refcnt
accordingly.

My current code prepares all 16 buffers, calls rte_eth_tx_burst until 
all 16 packets are stored in the transmit ring, and starts over again, 
adjusting the buffers to send the next 16 packets.

Currently I observe duplicate packets, although every packet should be 
individual due to single byte adjustments.

My current problem is, I guess, that rte_eth_tx_burst does not 
synchronously transmit the number of packets it returns to the 
caller, but just stores them in the transmit queue. So I am not allowed 
to re-use these buffers immediately.

My question is: how do I know when I may re-use buffers passed to 
rte_eth_tx_burst? Of course, I can check their refcnt member, and this 
would be perfectly fine. Apparently, I should have at least BURST_SIZE*2 
buffers, passing BURST_SIZE buffers at once, so I can manipulate one set 
of buffers while the other is transmitted. But I am missing the best 
synchronization scheme here: how should I wait for this refcnt to drop?

Some blind guessing:
If I take the documentation of rte_eth_tx_burst literally, I could get 
the idea that refcounts of buffers are only decreased (buffers are 
'freed') while rte_eth_tx_burst is executing, and that one function call 
might free buffers used by previous function calls. If this is correct, 
I still do not see a complete synchronization scheme. There is still a 
chance that I end up without any buffers left, which means I do not have 
a chance to call rte_eth_tx_burst again to free buffers.

Thanks for any help,
Philipp


* Re: [dpdk-users] Beginners question: rte_eth_tx_burst, rte_mbuf access synchronization
  2016-11-11  9:49 [dpdk-users] Beginners question: rte_eth_tx_burst, rte_mbuf access synchronization Philipp Beyer
@ 2016-11-11 12:35 ` Anupam Kapoor
  2016-11-11 13:09   ` Philipp Beyer
  2016-11-11 13:45 ` Matt Laswell
  1 sibling, 1 reply; 6+ messages in thread
From: Anupam Kapoor @ 2016-11-11 12:35 UTC (permalink / raw)
  To: Philipp Beyer; +Cc: users

On Fri, Nov 11, 2016 at 3:19 PM, Philipp Beyer <pbeyer@voipfuture.com>
wrote:

> Basically, I need to send the same packet over a single interface, over and
> over again, with single bytes changed each time.
> I use rte_eth_tx_burst to send 16 packets at once. As I want to re-use the
> same buffers in a very simple way, I just increment the refcnt
> accordingly.
>

Just throwing it out there: have you considered the trivial scheme of
repeatedly invoking 'rte_eth_tx_burst(...)' until it returns a value less
than 'nb_pkts'? Once you reach that state, the reuse can happen...

--
kind regards
anupam

In the beginning was the lambda, and the lambda was with Emacs, and Emacs
was the lambda.


* Re: [dpdk-users] Beginners question: rte_eth_tx_burst, rte_mbuf access synchronization
  2016-11-11 12:35 ` Anupam Kapoor
@ 2016-11-11 13:09   ` Philipp Beyer
  0 siblings, 0 replies; 6+ messages in thread
From: Philipp Beyer @ 2016-11-11 13:09 UTC (permalink / raw)
  Cc: users

Hi Anupam,

I'm afraid I don't get your point.  rte_eth_tx_burst returning a 
reduced buffer count means that the TX queue is full, doesn't it? I 
don't see why "buffer M does not fit into the TX queue" implies 
"buffers 1..N have already been transmitted".
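
The distinction can be illustrated with a toy model in plain C. This is
not DPDK code: struct toy_txq, toy_tx_burst and toy_nic_transmit are
hypothetical names, and the ring size of 4 is an arbitrary assumption.
The sketch only shows that a short return count reflects free ring
slots, while transmission progress is tracked separately.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RING_SIZE 4   /* assumed, tiny on purpose */

/* Hypothetical stand-in for a TX queue: a ring of packet pointers. */
struct toy_txq {
    const void *desc[TOY_RING_SIZE];
    size_t head;       /* next slot the "NIC" will transmit */
    size_t in_flight;  /* queued, but not yet transmitted */
    size_t sent;       /* total packets actually transmitted */
};

/* Queue up to nb_pkts packets; return how many fitted into the ring. */
static size_t toy_tx_burst(struct toy_txq *q, const void **pkts, size_t nb_pkts)
{
    size_t n = 0;

    assert(q != NULL && pkts != NULL);
    while (n < nb_pkts && q->in_flight < TOY_RING_SIZE) {
        q->desc[(q->head + q->in_flight) % TOY_RING_SIZE] = pkts[n];
        q->in_flight++;
        n++;
    }
    return n;  /* a short count only says: the ring is full */
}

/* The "NIC" transmits up to max queued packets at some later time. */
static size_t toy_nic_transmit(struct toy_txq *q, size_t max)
{
    size_t n = 0;

    assert(q != NULL);
    while (n < max && q->in_flight > 0) {
        q->head = (q->head + 1) % TOY_RING_SIZE;
        q->in_flight--;
        q->sent++;
        n++;
    }
    return n;
}
```

Offering 6 packets to an empty queue in this model returns 4 while sent
is still 0: the ring being full says nothing about whether the earlier
packets have left the wire.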

Thanks,
Philipp


* Re: [dpdk-users] Beginners question: rte_eth_tx_burst, rte_mbuf access synchronization
  2016-11-11  9:49 [dpdk-users] Beginners question: rte_eth_tx_burst, rte_mbuf access synchronization Philipp Beyer
  2016-11-11 12:35 ` Anupam Kapoor
@ 2016-11-11 13:45 ` Matt Laswell
  2016-11-11 14:06   ` Philipp Beyer
  1 sibling, 1 reply; 6+ messages in thread
From: Matt Laswell @ 2016-11-11 13:45 UTC (permalink / raw)
  To: Philipp Beyer; +Cc: users

Hi Philipp,

I'm a little unclear what you mean with your comments about adjusting the
refcnt in your mbufs.  You are absolutely correct that rte_eth_tx_burst
doesn't synchronously transmit the packets.  Instead, it puts them in a
ring that is serviced by the poll mode driver.  Eventually, they are handed
off to the NIC, which copies them into its buffer and ultimately sends them
on the wire.

The architecture you've described won't work for the reasons you've
surmised - when you hand a pointer to the packet to the device driver, you
are giving it control of the memory pointed to.  If you continue to modify
its contents at that point, the results will be unpredictable.  Also, it
sounds as though you might really just have 16 pointers to a single packet,
with a reference count of 16.  Since you don't actually have 16 buffers, if
you modify the contents of any one packet, you're modifying them all.
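
A toy model in plain C shows how this produces duplicate packets. It is
not DPDK code: struct toy_ring and the toy_* functions are hypothetical,
and the only assumption carried over is that the transmit ring records
pointers and the memory is read later.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RING_SIZE 8
#define PKT_LEN   4

/* Hypothetical stand-in for a TX ring: it stores pointers, not copies. */
struct toy_ring {
    const unsigned char *slot[RING_SIZE];
    size_t count;
};

/* Queue a packet: only the pointer is recorded. */
static int toy_enqueue(struct toy_ring *r, const unsigned char *pkt)
{
    assert(r != NULL && pkt != NULL);
    if (r->count == RING_SIZE)
        return 0;
    r->slot[r->count++] = pkt;
    return 1;
}

/* The "NIC" reads the queued memory later and copies it to the wire. */
static size_t toy_transmit_all(struct toy_ring *r, unsigned char wire[][PKT_LEN])
{
    size_t n = r->count;

    for (size_t i = 0; i < n; i++)
        memcpy(wire[i], r->slot[i], PKT_LEN);
    r->count = 0;
    return n;
}

/* Reuse ONE buffer for two packets, changing a byte in between. */
static size_t toy_send_two_reusing_buffer(unsigned char wire[][PKT_LEN])
{
    struct toy_ring ring = { .count = 0 };
    unsigned char buf[PKT_LEN] = { 0xAA, 0xBB, 0xCC, 0x01 };

    toy_enqueue(&ring, buf);  /* intended packet 1, last byte 0x01 */
    buf[3] = 0x02;            /* modified before the "NIC" has read it */
    toy_enqueue(&ring, buf);  /* intended packet 2, last byte 0x02 */

    return toy_transmit_all(&ring, wire);
}
```

Both packets on the "wire" end in 0x02 here, because the first one was
overwritten while it was still queued. With real hardware the outcome
depends on timing, which is why the results look unpredictable rather
than consistently wrong.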

Let me suggest that you might want to rethink your scheme.  Rather than
trying to reverse engineer a way to either make the PMD behave
synchronously or to give you a callback, I would consider prebuilding
packet contents at init time, then allocating mbufs and copying the
contents in.  I suspect you've avoided an approach like this because you'd
like to not copy mostly the same data over and over when you only want to
modify one byte.

An alternative approach would be to use indirect mbufs.  In essence, each
packet you want to send might be made up of three mbufs.  The first is an
indirect mbuf that points to one that contains the common data at the start
of your packets.  The second contains the one byte that you wish to
change.  The third is an indirect mbuf that points to the common data at
the end of your packets.  I haven't used this approach myself, but I
suspect it would let you avoid copying so much data.
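
The layout of such a packet can be sketched in plain C. This is a toy
model, not the mbuf API: struct toy_seg and struct toy_pkt are
hypothetical, and the 4-byte limit on the variable part is an arbitrary
assumption. In DPDK the equivalent building blocks would presumably be
rte_pktmbuf_attach() and rte_pktmbuf_chain(), and the approach needs a
driver that can transmit multi-segment packets.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical segment: it references data, it does not own it. */
struct toy_seg {
    const unsigned char *data;
    size_t len;
    struct toy_seg *next;
};

/* One packet = shared head -> private variable bytes -> shared tail. */
struct toy_pkt {
    struct toy_seg head, mid, tail;
    unsigned char var[4];  /* the only per-packet storage */
};

static void toy_pkt_build(struct toy_pkt *p,
                          const unsigned char *common_head, size_t head_len,
                          const unsigned char *var, size_t var_len,
                          const unsigned char *common_tail, size_t tail_len)
{
    assert(p != NULL && var_len <= sizeof(p->var));
    memcpy(p->var, var, var_len);  /* copy only the bytes that differ */
    p->head = (struct toy_seg){ common_head, head_len, &p->mid };
    p->mid  = (struct toy_seg){ p->var, var_len, &p->tail };
    p->tail = (struct toy_seg){ common_tail, tail_len, NULL };
}

/* Walk the chain and gather the bytes, as a scatter-gather NIC would.
 * Returns the total length, or 0 if out is too small. */
static size_t toy_pkt_linearize(const struct toy_pkt *p,
                                unsigned char *out, size_t cap)
{
    size_t off = 0;

    for (const struct toy_seg *s = &p->head; s != NULL; s = s->next) {
        if (off + s->len > cap)
            return 0;
        memcpy(out + off, s->data, s->len);
        off += s->len;
    }
    return off;
}
```

Two packets built from the same head and tail share those bytes and
differ only in their private middle segment, so the common data is never
copied or modified after it has been queued.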

- Matt


* Re: [dpdk-users] Beginners question: rte_eth_tx_burst, rte_mbuf access synchronization
  2016-11-11 13:45 ` Matt Laswell
@ 2016-11-11 14:06   ` Philipp Beyer
       [not found]     ` <9754A038-DB66-417F-8958-2DDDE317E7A2@net.in.tum.de>
  0 siblings, 1 reply; 6+ messages in thread
From: Philipp Beyer @ 2016-11-11 14:06 UTC (permalink / raw)
  To: Matt Laswell; +Cc: users

Hi Matt,

Thanks for your answers. This helps as I am still stabbing in the dark 
quite a lot.

I actually use 16, not one, distinct buffers to be sent in one burst. 
But still, your conclusion is correct: I mess with the refcount, adjust 
payload after calling rte_eth_tx_burst, and therefore get undefined 
behaviour.

Your answer pretty much sounds like you understood my point, so it seems 
the solution I am looking for does not exist. Unfortunately, it is not 
really only one byte I am changing. This was just a simplification; it's 
a few bytes actually, but still a small portion of the payload. So your 
idea won't really work.

But I might have found another idea: What about preparing all buffers of 
a memory pool with the same payload? I should then get a pre-filled 
buffer from rte_pktmbuf_alloc, right? Let's say I initialize a buffer 
for transmission, the transmitting code frees this buffer, and I get 
the same buffer back from rte_pktmbuf_alloc. What do I have to 
re-initialize to have the same buffer again? Only the payload length? Is 
this approach feasible, based on documented/specified behaviour?


Philipp





* Re: [dpdk-users] Beginners question: rte_eth_tx_burst, rte_mbuf access synchronization
       [not found]     ` <9754A038-DB66-417F-8958-2DDDE317E7A2@net.in.tum.de>
@ 2016-11-11 14:16       ` Paul Emmerich
  0 siblings, 0 replies; 6+ messages in thread
From: Paul Emmerich @ 2016-11-11 14:16 UTC (permalink / raw)
  To: users

Hi,


> Philipp Beyer <pbeyer@voipfuture.com>:
> But I might have found another idea: What about preparing all buffers of
> a memory pool with the same payload? I should then get a pre-filled
> buffer from rte_pktmbuf_alloc, right? Let's say I initialize a buffer
> for transmission, the transmitting code frees this buffer, and I get the
> same buffer back from rte_pktmbuf_alloc. What do I have to re-initialize
> to have the same buffer again? Only the payload length? Is this approach
> feasible, based on documented/specified behaviour?

Yes, that's the easiest way to do it with DPDK's mbuf model. Your first
approach would work well on frameworks that expose the ringbuffers
directly in their API (e.g., netmap).

I've implemented it like this in my packet generator MoonGen; you can
read Section 4.2 of our paper at
https://www.net.in.tum.de/fileadmin/bibtex/publications/papers/MoonGen_IMC2015.pdf
for further details.
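
The pre-filled pool idea can be sketched as a toy model in plain C. This
is not MoonGen or DPDK code: struct toy_pool and the toy_* functions are
hypothetical, and the pool and buffer sizes are arbitrary. The sketch
assumes what the approach relies on, namely that allocating a buffer
resets its metadata (here only data_len) but leaves the data area as the
previous user left it. With real mbufs, pkt_len and any offload flags
would need to be set again as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POOL_SIZE 4
#define BUF_SIZE  64

struct toy_buf {
    uint16_t data_len;             /* metadata: reset on every alloc */
    unsigned char data[BUF_SIZE];  /* payload: untouched by alloc/free */
};

struct toy_pool {
    struct toy_buf bufs[POOL_SIZE];
    struct toy_buf *free_list[POOL_SIZE];
    size_t n_free;
};

/* One-time init: copy the template payload into every buffer. */
static void toy_pool_init(struct toy_pool *p, const unsigned char *tmpl, size_t len)
{
    assert(p != NULL && len <= BUF_SIZE);
    for (size_t i = 0; i < POOL_SIZE; i++) {
        memcpy(p->bufs[i].data, tmpl, len);
        p->bufs[i].data_len = 0;
        p->free_list[i] = &p->bufs[i];
    }
    p->n_free = POOL_SIZE;
}

static struct toy_buf *toy_alloc(struct toy_pool *p)
{
    if (p->n_free == 0)
        return NULL;               /* everything is still in flight */
    struct toy_buf *b = p->free_list[--p->n_free];
    b->data_len = 0;               /* metadata reset, data kept */
    return b;
}

/* Called by the "driver" once the packet has been transmitted. */
static void toy_free(struct toy_pool *p, struct toy_buf *b)
{
    assert(p->n_free < POOL_SIZE);
    p->free_list[p->n_free++] = b;
}

/* Per packet: restore the length and patch only the bytes that vary. */
static struct toy_buf *toy_next_packet(struct toy_pool *p, size_t len,
                                       size_t off, uint32_t seq)
{
    assert(len <= BUF_SIZE && off + sizeof(seq) <= len);
    struct toy_buf *b = toy_alloc(p);
    if (b == NULL)
        return NULL;
    b->data_len = (uint16_t)len;
    memcpy(b->data + off, &seq, sizeof(seq));  /* overwrites the stale value */
    return b;
}
```

The sender never waits on a particular buffer in this scheme. It
allocates a fresh one per packet, and an empty pool is the signal to back
off. The varying bytes must be written every time, since a recycled
buffer still holds the values from its previous use.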

 Paul

