From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 17 Oct 2014 11:26:39 -0500
From: Jay Rolette
To: "Pattan, Reshma"
Cc: "dev@dpdk.org"
Subject: Re: [dpdk-dev] [PATCH RFC] librte_reorder: new reorder library

Thanks for the responses, Reshma. Can you provide a little more context
about the use case that your reorder library is intended to help with? If
I'm understanding your answers correctly, the functionality seems pretty
limited and not something I would ever end up using, but that may be more
about the types of products I build (deep packet inspection and working at
L4-L7 generally, even though running at or near line-rate).

Please take my comments in the spirit intended... If the design makes
sense for different use cases and I'm not the target audience, that's
perfectly ok and there are probably different trade-offs being made. But
if it is intended to be useful for DPI applications, I'd hate to just be
quiet and end up with something that doesn't get used as much as it might.
I haven't looked at the distributor library, so entirely possible it makes
more sense in that context.

More detailed responses to your previous answers inline.

Regards,
Jay

On Fri, Oct 17, 2014 at 4:44 AM, Pattan, Reshma wrote:

> Hi Jay,
>
> Please find comments inline.
>
> Thanks,
> Reshma
>
> *From:* Jay Rolette [mailto:rolette@infiniteio.com]
> *Sent:* Thursday, October 9, 2014 8:02 PM
> *To:* Pattan, Reshma
> *Cc:* dev@dpdk.org
> *Subject:* Re: [dpdk-dev] [PATCH RFC] librte_reorder: new reorder library
>
> Hi Reshma,
>
> A few comments and questions about your design...
>
> 1) How do you envision the reorder library to be used?
> Based on the description, it seems like the expectation is that packet
> order would be maintained at either the interface/port level or maybe at
> the RX queue level. Is that right or am I reading too much between the
> lines?
>
> For my purposes (and for network security products I've developed in the
> past), I mostly don't care about box or port-level order. The most
> important thing is to maintain packet order within a flow. Relative order
> of packets in different flows doesn't matter. If there is a way I can
> process packets in parallel and correct out-of-order transmission *within
> the flow*, that's very useful. Architecturally, it helps avoid
> hot-spotting in my packet processing pipeline and wasting cycles when
> load-balancing isn't perfect (and it never is).
>
> [Reshma]: Generic parallel processing of packets is planned in the phase 2
> version of the distributor, based on sequence numbers, but not flow-based
> parallel processing.

See the question at the top of my email about the intended use case. For
DPI applications, global (box-wide or per-port) reordering isn't normally
required. Maintaining order within flows is the important part. Depending
on your implementation and the guarantees you make, the impact it has on
aggregate system throughput can be significant.

> 2) If the reorder library is "flow aware", then give me flexibility on
> deciding what a flow is. Let me define pseudo-flows even if the protocol
> itself isn't connection oriented (i.e., it is frequently useful to treat
> UDP 5-tuples as a flow). I may want to include tunnels/VLANs/etc. as part
> of my "flow" definition. I may need to include the physical port as part
> of the flow definition.
>
> Ideally, the library includes the common cases and gives me the option to
> register a callback function for doing whatever sort of "flows" I require
> for my app.
>
> [Reshma]: It is not flow aware. But to reorder packets of a particular
> flow, you can hand over that flow to the library and the library will
> give you back the reordered data.

I think given how a couple of other bits are described, this doesn't end
up helping. More on that a bit further down.

> 3) Is there a way to apply the reorder library to some packets and not
> others? I might want to use it for TCP and UDP, but not care about order
> for other IP traffic (for example).
>
> [Reshma]: No, the reorder library will not have intelligence about
> traffic type (i.e., flow- or protocol-based).
>
> Applications can split traffic into flows or by protocol and hand it over
> to the library for reordering.

Ditto.

> 4) How are you dealing with internal congestion? If I drop a packet
> somewhere in my processing pipeline, how does the TX side of the reorder
> queue/buffer deal with the missing sequence number? Is there some sort of
> timeout mechanism so that it will only wait for X microseconds for a
> missing sequence number?
>
> [Reshma]: The library just takes care of the packets it has got. No
> waiting mechanism is used for missing packets.
>
> Reorder processing will skip the dropped packets (i.e., it will create a
> gap in the reorder buffer) and proceed with allocating slots to the later
> packets which are available.
>
> Need the ability to bound how long packets are held up in the reorder
> engine before they are released.
>
> [Reshma]: This is dependent upon how frequently packets are enqueued and
> dequeued from it. Packets which are in order and without gaps are
> dequeued at the next call to the dequeue API. If there is a gap, the time
> taken to skip over the gap will depend on the size of the reorder ring.

So the window for correcting out-of-order is nothing more than whatever
queueing delays happen to be on the TX queue? That seems... not very
useful. Am I missing something about the design?

For DPI applications, processing time is somewhat variable between
different packets in a flow. I'm assuming L3 apps have similar issues with
control plane traffic. In a low-latency architecture, very few packets
should be sitting in any TX queues, so you really need something with some
time/cycle-count constraints to manage that window - i.e., how long should
other packets in a flow be held up in the TX queue waiting for "earlier"
packets vs. transmitting them anyway?
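To make that concrete, here is roughly the kind of bound I have in mind on
the TX side. This is only a sketch: rte_reorder_drain() is the call
proposed in the RFC quoted below, while rte_reorder_flush(), MAX_HOLD_US
and the rest of the names are mine and don't exist anywhere. The proposed
drain() also takes no max count, so the array size here is a guess.

#include <rte_cycles.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE   32   /* guess: proposed drain() has no max-count arg */
#define MAX_HOLD_US  50   /* upper bound on latency added by reordering */

static void
tx_loop(struct rte_reorder_buffer *b, uint8_t port, uint16_t queue)
{
        struct rte_mbuf *pkts[BURST_SIZE];
        uint64_t hold_cycles = (rte_get_tsc_hz() * MAX_HOLD_US) / 1000000;
        uint64_t last_progress = rte_rdtsc();

        for (;;) {
                /* Proposed API: returns only in-order, gap-free packets. */
                int n = rte_reorder_drain(b, pkts);

                if (n == 0 && rte_rdtsc() - last_progress > hold_cycles) {
                        /* Nothing released for MAX_HOLD_US: stop waiting
                         * for the missing sequence numbers and transmit
                         * whatever is buffered. This flush is the
                         * hypothetical piece I'm asking for. */
                        n = rte_reorder_flush(b, pkts);
                }
                if (n > 0) {
                        /* Partial-send handling omitted for brevity. */
                        rte_eth_tx_burst(port, queue, pkts, (uint16_t)n);
                        last_progress = rte_rdtsc();
                }
        }
}

Whether that bound lives inside the library or in the application doesn't
matter much to me, as long as TX queue depth isn't the only knob.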
> Assuming you address this, the reorder engine will also need to deal with
> slow packets that show up after "later" packets were transmitted.
>
> [Reshma]: As of now, the plan is to check the sequence number of the
> current packet the library has got against the min sequence number
> maintained in the library.
>
> The difference between them should not cross 2*reorder_buffer_size. If it
> does, we don't handle such a packet and drop it.
>
> But we are open to suggestions on how to handle late packets. Should we
> have a config option to drop them, or just dequeue them in the next
> immediate dequeue operation?

Config option is the most flexible, but I would expect the normal case to
be to TX the packet "quickly". I'm hesitant to say dequeue it in the next
immediate dequeue operation because there are potential DoS attack vectors
on the system depending on implementation details.

> On Tue, Oct 7, 2014 at 4:33 AM, Pattan, Reshma wrote:
>
> Hi All,
>
> I am planning to implement a packet reorder library. Details are as
> below; please go through them and provide comments.
>
> Requirement:
> To reorder out-of-order packets that are received from different cores.
>
> Usage:
> To be used along with the distributor library. The next version of the
> distributor is planned to distribute incoming packets to all worker cores
> irrespective of the flow type.
> In this case, to ensure in-order delivery of the packets at the output
> side, the reorder library will be used by the tx end.
>
> Assumption:
> All input packets will be marked with a sequence number in the seqn field
> of the mbuf; this will be the reference for reordering at the tx end.
> The sequence number will be of type uint32_t. A new sequence number field
> seqn will be added to the mbuf structure.
>
> Design:
> a) There will be a reorder buffer (circular buffer) structure maintained
> in the reorder library to store reordered packets and other details of
> the buffer, like the head to drain packets from, the min sequence number,
> and so on.
> b) The library will provide insert and drain functions to reorder and
> fetch out the reordered packets respectively.
> c) Users of the library should pass the packets to the insert function
> for reordering.
>
> Insertion logic:
> The sequence number of the current packet will be used to calculate its
> offset in the reorder buffer, and the packet will be written to the
> location in the reorder buffer corresponding to that offset.
> The offset is calculated as the difference between the current packet's
> sequence number and the sequence number associated with the reorder
> buffer.
>
> During sequence number wrapping, or wrapping over of the reorder buffer
> size, before inserting the new packet we should move "offset" number of
> packets to another buffer, called the overflow buffer, advance the head
> of the reorder buffer by "offset - reorder buffer size", and insert the
> new packet.
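(Not part of the RFC; just checking my reading of the insertion logic
above. The sketch below uses the rte_reorder_buffer fields defined a
little further down, plus the 2*reorder_buffer_size late-packet rule
discussed earlier in the thread. The slot calculation and everything else
here is my guess, not code from the patch.)

#include <rte_mbuf.h>

static inline int
reorder_insert_sketch(struct rte_reorder_buffer *b, struct rte_mbuf *m)
{
        /* seqn is the new mbuf field proposed in the RFC. */
        uint32_t offset = m->seqn - b->min_seqn;

        if (offset < b->size) {
                /* Falls inside the current window: drop it into its slot. */
                b->entries[(b->head + offset) & b->mask] = m;
                return 0;
        }
        if (offset > 2 * b->size) {
                /* Too far outside the window (late or far ahead): dropped,
                 * per the 2*reorder_buffer_size rule described above. */
                rte_pktmbuf_free(m);
                return -1;
        }
        /* Otherwise the window slides: move packets out to the overflow
         * buffer, advance head/min_seqn by "offset - size" as described
         * above, then insert. Omitted here. */
        return 1;
}

If I have that wrong, particularly the slot calculation, please correct me.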
> Insert function:
> int rte_reorder_insert(struct rte_reorder_buffer *buffer,
>                        struct rte_mbuf *mbuf);
> Note: Another insert function is also planned, to insert a burst of
> packets.
>
> Reorder buffer:
> struct rte_reorder_buffer {
>     unsigned int size;   /* The size (number of entries) of the buffer */
>     unsigned int mask;   /* Mask (size - 1) of the buffer */
>     unsigned int head;   /* Current head of buffer */
>     uint32_t min_seqn;   /* latest sequence number associated with buffer */
>     struct rte_mbuf *entries[MAX_REORDER_BUFFER_SIZE]; /* buffer to hold
>                                                           reordered mbufs */
> };
>
> d) Users can fetch out the reordered packets with the drain function
> provided by the library. Users must pass an mbuf array; the drain
> function will fill the passed mbuf array with the reordered buffer
> packets.
> During the drain operation, overflow buffer packets will be fetched out
> first and then the reorder buffer packets.
>
> Drain function:
> int rte_reorder_drain(struct rte_reorder_buffer *buffer,
>                       struct rte_mbuf **mbufs);
>
> Thanks,
> Reshma
>
> --------------------------------------------------------------
> Intel Shannon Limited
> Registered in Ireland
> Registered Office: Collinstown Industrial Park, Leixlip, County Kildare
> Registered Number: 308263
> Business address: Dromore House, East Park, Shannon, Co. Clare
>
> This e-mail and any attachments may contain confidential material for the
> sole use of the intended recipient(s). Any review or distribution by
> others is strictly prohibited. If you are not the intended recipient,
> please contact the sender and delete all copies.
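P.S. To make the "per flow" part of my comments concrete: what I'd picture
is one reorder buffer per flow rather than a single global one, keyed off
whatever 5-tuple (or tunnel-aware) classification the application already
does. Nothing below comes from the patch - the struct and helper names are
mine, seqn is the mbuf field proposed in the RFC, and rte_reorder.h is the
header the RFC would presumably add.

#include <rte_mbuf.h>
#include <rte_reorder.h>   /* header the RFC would add */

struct flow_ctx {
        struct rte_reorder_buffer reorder; /* one RFC buffer per flow */
        uint32_t next_seqn;                /* per-flow sequence space */
        /* ... whatever per-flow state the DPI engine keeps anyway ... */
};

/* Classification side: stamp the packet within its own flow. */
static inline void
flow_stamp(struct flow_ctx *f, struct rte_mbuf *m)
{
        m->seqn = f->next_seqn++;
}

/* TX side: reorder within the flow only. Relative order across flows is
 * irrelevant, so flows never hold each other up. */
static inline int
flow_tx(struct flow_ctx *f, struct rte_mbuf *m, struct rte_mbuf **out)
{
        rte_reorder_insert(&f->reorder, m);         /* proposed API */
        return rte_reorder_drain(&f->reorder, out); /* proposed API */
}

That's the usage model where a reorder library would actually pay off for
the kinds of products I build.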