From: "Pattan, Reshma" <reshma.pattan@intel.com>
To: Jay Rolette <rolette@infiniteio.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH RFC] librte_reorder: new reorder library
Date: Fri, 17 Oct 2014 09:44:49 +0000	[thread overview]
Message-ID: <3AEA2BF9852C6F48A459DA490692831FE240FC@IRSMSX109.ger.corp.intel.com> (raw)
In-Reply-To: <CADNuJVpwfiC2x3joDnkgUX=fpEMe2MAS46ozkvojsaKy0=UGZA@mail.gmail.com>

Hi Jay,

Please find comments inline.

Thanks,
Reshma

From: Jay Rolette [mailto:rolette@infiniteio.com]
Sent: Thursday, October 9, 2014 8:02 PM
To: Pattan, Reshma
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH RFC] librte_reorder: new reorder library

Hi Reshma,

A few comments and questions about your design...

1) How do you envision the reorder library to be used? Based on the description, it seems like the expectation is that packet order would be maintained at either the interface/port level or maybe at the RX queue level. Is that right or am I reading too much between the lines?

For my purposes (and for network security products I've developed in the past), I mostly don't care about box or port-level order. The most important thing is to maintain packet order within a flow. Relative order of packets in different flows doesn't matter. If there is a way I can process packets in parallel and still avoid out-of-order transmission *within the flow*, that's very useful. Architecturally, it helps avoid hot-spotting in my packet processing pipeline and wasting cycles when load-balancing isn't perfect (and it never is).
[Reshma]: Generic parallel processing of packets based on sequence numbers is planned for the phase 2 version of the distributor, but not flow-based parallel processing.

2) If the reorder library is "flow aware", then give me flexibility in deciding what a flow is. Let me define pseudo-flows even if the protocol itself isn't connection-oriented (i.e., it is frequently useful to treat UDP 5-tuples as a flow). I may want to include tunnels/VLANs/etc. as part of my "flow" definition. I may need to include the physical port as part of the flow definition.

Ideally, the library includes the common cases and gives me the option to register a callback function for doing whatever sort of "flows" I require for my app.
[Reshma]: It is not flow aware. But to reorder the packets of a particular flow, you can hand that flow's packets over to the library and the library will give you back the reordered packets.
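For illustration, a minimal sketch of that per-flow usage on the application side, assuming the insert/drain API proposed below; app_flow_id() is a hypothetical application function (e.g. a 5-tuple hash), and the flow table size and TX handling are illustrative only:

#include <rte_mbuf.h>
#include <rte_ethdev.h>

/* prototypes from the proposed library API (see below) */
struct rte_reorder_buffer;
int rte_reorder_insert(struct rte_reorder_buffer *buffer, struct rte_mbuf *mbuf);
int rte_reorder_drain(struct rte_reorder_buffer *buffer, struct rte_mbuf **mbufs);

uint32_t app_flow_id(const struct rte_mbuf *m);  /* hypothetical per-flow hash */

#define MAX_FLOWS 1024          /* illustrative, power of two */

/* one reorder buffer per flow, created and owned by the application */
static struct rte_reorder_buffer *flow_buf[MAX_FLOWS];

static void
handle_flow_packet(struct rte_mbuf *m)
{
        /* pick this flow's buffer using the hypothetical flow hash */
        struct rte_reorder_buffer *b = flow_buf[app_flow_id(m) & (MAX_FLOWS - 1)];
        struct rte_mbuf *out[64];   /* assumed large enough for one drain */
        int nb, i;

        rte_reorder_insert(b, m);          /* proposed insert API */
        nb = rte_reorder_drain(b, out);    /* proposed drain API */
        for (i = 0; i < nb; i++)
                rte_eth_tx_burst(0, 0, &out[i], 1);   /* transmit in order */
}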

3) Is there a way to apply the reorder library to some packets and not others? I might want to use it for TCP and UDP, but not care about order for other IP traffic (for example).
[Reshma]: No, the reorder library will not have any intelligence about traffic type (i.e. flow- or protocol-based).
Applications can split traffic into flows or by protocol and hand the relevant packets over to the library for reordering.

4) How are you dealing with internal congestion? If I drop a packet somewhere in my processing pipeline, how does the TX side of the reorder queue/buffer deal with the missing sequence number? Is there some sort of timeout mechanism so that it will only wait for X microseconds for a missing sequence number?
[Reshma]: The library only takes care of the packets it has been given; there is no waiting mechanism for missing packets.
Reorder processing will skip the dropped packets (i.e. leave a gap in the reorder buffer) and proceed to allocate slots to the later packets that are available.

Need the ability to bound how long packets are held up in the reorder engine before they are released.
[Reshma]: This depends on how frequently packets are enqueued to and dequeued from the library. Packets which are in order and without gaps are dequeued at the next call to the dequeue API. If there is a gap, the time taken to skip over it will depend on the size of the reorder ring.

Assuming you address this, the reorder engine will also need to deal with slow packets that show up after "later" packets were transmitted.
[Reshma]: As of now, the plan is to compare the sequence number of the current packet received by the library with the minimum sequence number maintained in the library.
The difference between them must not exceed 2 * reorder_buffer_size; if it does, the packet is not handled and is dropped.
But we are open to suggestions on how to handle late packets: should there be a config option to drop them, or should they simply be dequeued in the next immediate dequeue operation?
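For illustration, a minimal sketch of that window check, assuming the field names from the rte_reorder_buffer structure proposed below; the unsigned subtraction makes both stale (late) and too-far-ahead sequence numbers fall outside the window:

/* sketch only: returns non-zero if the packet's sequence number is acceptable */
static int
seqn_in_window(const struct rte_reorder_buffer *b, uint32_t seqn)
{
        uint32_t offset = seqn - b->min_seqn;   /* wraps to a large value for late packets */

        /* reject anything 2 * buffer size or further away from min_seqn */
        return offset < 2 * b->size;
}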

Regards,
Jay


On Tue, Oct 7, 2014 at 4:33 AM, Pattan, Reshma <reshma.pattan@intel.com> wrote:
Hi All,

I am planning to implement a packet reorder library. Details are below; please go through them and provide comments.

Requirement:
To reorder out-of-order packets that are received from different cores.

Usage:
To be used along with the distributor library. The next version of the distributor is planned to distribute incoming packets to all worker cores irrespective of flow type.
In this case, to ensure in-order delivery of packets at the output side, the reorder library will be used at the TX end.

Assumption:
All input packets will be marked with a sequence number in the seqn field of the mbuf; this will be the reference for reordering at the TX end.
The sequence number will be of type uint32_t. A new sequence number field, seqn, will be added to the mbuf structure.
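As an illustration of this assumption, a sketch of how the RX (or distributor) core might stamp the proposed seqn field before packets fan out to the workers; a single stamping core is assumed so the counter needs no synchronisation:

#include <rte_mbuf.h>

/* stamp a burst of received packets with consecutive sequence numbers */
static void
stamp_seqn(struct rte_mbuf **pkts, uint16_t nb_pkts)
{
        static uint32_t next_seqn;     /* single RX core assumed */
        uint16_t i;

        for (i = 0; i < nb_pkts; i++)
                pkts[i]->seqn = next_seqn++;   /* proposed uint32_t field, wraps naturally */
}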

Design:
a) A reorder buffer (circular buffer) structure will be maintained in the reorder library to store reordered packets, together with buffer details such as the head to drain packets from and the minimum sequence number.
b) The library will provide insert and drain functions to reorder packets and to fetch out the reordered packets, respectively.
c) Users of the library should pass packets to the insert function for reordering.

Insertion logic:
The sequence number of the current packet is used to calculate an offset into the reorder buffer, and the packet is written to the location in the reorder buffer corresponding to that offset.
The offset is calculated as the difference between the current packet's sequence number and the sequence number associated with the reorder buffer.

On sequence number wrap-around, or when the offset runs past the reorder buffer size, before inserting the new packet we move 'offset' packets to another buffer called the overflow buffer, advance the head of the reorder buffer by 'offset - reorder buffer size', and then insert the new packet. (A minimal sketch of this insertion logic appears after the structure below.)

Insert function:
int rte_reorder_insert(struct rte_reorder_buffer *buffer, struct rte_mbuf *mbuf);
Note: Another insert function, to insert a burst of packets, is also planned.

               Reorder buffer:
struct rte_reorder_buffer {
        unsigned int size;      /* The size (number of entries) of the buffer. */
        unsigned int mask;      /* Mask (size - 1) of the buffer */
        unsigned int head;      /* Current head of buffer */
        uint32_t min_seqn;      /* Minimum (lowest expected) sequence number associated with the buffer */
        struct rte_mbuf *entries[MAX_REORDER_BUFFER_SIZE]; /* buffer to hold reordered mbufs */
};
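A minimal sketch of the insertion path described above, using the proposed structure; the overflow-buffer handling is only indicated, not implemented:

/* sketch only: returns 0 on success, -1 when the overflow path is needed */
static int
reorder_insert_sketch(struct rte_reorder_buffer *b, struct rte_mbuf *m)
{
        /* distance of this packet from the start of the buffer window */
        uint32_t offset = m->seqn - b->min_seqn;

        if (offset < b->size) {
                /* slot index wraps around the ring via the power-of-two mask */
                b->entries[(b->head + offset) & b->mask] = m;
                return 0;
        }

        /*
         * offset >= size: move packets at the head to the overflow buffer,
         * advance the head by (offset - size) as described above, update
         * min_seqn accordingly, then insert the packet (not shown here).
         */
        return -1;
}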

d) Users can fetch out the reordered packets using the drain function provided by the library. Users must pass an mbuf array; the drain function will fill the passed mbuf array with the reordered packets.
During a drain operation, the overflow buffer packets are fetched out first, followed by the reorder buffer packets.

Drain function:
int rte_reorder_drain(struct rte_reorder_buffer *buffer, struct rte_mbuf **mbufs);
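A sketch of how a TX core might call it, assuming the return value is the number of mbufs written into the caller's array and that the array is sized for the worst case; the array size here is illustrative:

#include <rte_mbuf.h>
#include <rte_ethdev.h>

#define DRAIN_ARRAY_SIZE 2048          /* illustrative worst-case drain size */

static void
tx_drain(struct rte_reorder_buffer *b, uint8_t port, uint16_t queue)
{
        struct rte_mbuf *pkts[DRAIN_ARRAY_SIZE];
        int nb, sent = 0;

        nb = rte_reorder_drain(b, pkts);    /* in-order packets, oldest first */
        while (sent < nb)                   /* retry until the whole burst is transmitted */
                sent += rte_eth_tx_burst(port, queue, pkts + sent,
                                         (uint16_t)(nb - sent));
}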

Thanks,
Reshma



