From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Wiles, Keith"
To: Harsh Patel
CC: "users@dpdk.org"
Date: Wed, 14 Nov 2018 15:04:46 +0000
Subject: Re: [dpdk-users] Query on handling packets
Message-ID: <2455C953-0D3C-4B5E-BCDB-AEAE1700935B@intel.com>
References: <71CBA720-633D-4CFE-805C-606DAAEDD356@intel.com> <3C60E59D-36AD-4382-8CC3-89D4EEB0140D@intel.com> <76959924-D9DB-4C58-BB05-E33107AD98AC@intel.com>

Sorry, I did not send a plain text email again.

> On Nov 14, 2018, at 7:54 AM, Harsh Patel wrote:
> 
> Hello,
> This is a link to the complete source code of our project: https://github.com/ns-3-dpdk-integration/ns-3-dpdk
> For a description of the project, see https://ns-3-dpdk-integration.github.io/
> Once you go through it, you will have a basic understanding of the project.
> Installation instructions are provided on the github.io page.
> 
> In the code mentioned above, the master branch contains the implementation of the logic using rte_rings, which we mentioned at the very beginning of the discussion. There is a branch named "newrxtx" which contains the implementation following the logic you provided.
> 
> We would like you to take a look at the code in the newrxtx branch (https://github.com/ns-3-dpdk-integration/ns-3-dpdk/tree/newrxtx).
> In this branch, go to the ns-allinone-3.28.1/ns-3.28.1/src/fd-net-device/model/ directory. Here we have implemented the DpdkNetDevice model, which contains the code providing the interaction between ns-3 and DPDK.
> We would like you to take a look at our Read function (https://github.com/ns-3-dpdk-integration/ns-3-dpdk/blob/newrxtx/ns-allinone-3.28.1/ns-3.28.1/src/fd-net-device/model/dpdk-net-device.cc#L626) and Write function (https://github.com/ns-3-dpdk-integration/ns-3-dpdk/blob/newrxtx/ns-allinone-3.28.1/ns-3.28.1/src/fd-net-device/model/dpdk-net-device.cc#L576). These contain the logic you suggested.
> 

I looked at the read and write routines briefly. The one thing that jumped out at me is that you copy the packet from an internal data buffer to the mbuf, or from the mbuf to the data buffer. You should try your hardest to remove these memcpy calls from the data path, as they will kill your performance. If you have to use memcpy, look at the rte_memcpy() routine, as it is highly optimized for DPDK. Even with DPDK's rte_memcpy() you will still see a big performance hit.

I did not look at where the buffer comes from, but maybe you could allocate a pktmbuf pool (as you did) and, when your main code asks for a buffer, grab an mbuf and return a pointer to the start of its data area instead. Then, when you get to the write or read routine, you find the start of the mbuf header based on the buffer address, or from some metadata attached to the buffer, and call the rte_eth_tx_buffer() routine with that mbuf pointer. For the TX side the mbuf is freed by the driver, though it could sit on the TX done queue for a while, so just make sure you have enough buffers.
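To make that idea concrete, here is a rough sketch of the write path only. The pool/tx_buffer globals, the port and queue constants, and the helper names below are made up for illustration (they are not from your code), and setup and error handling are left out:

#include <rte_mbuf.h>
#include <rte_ethdev.h>

#define PORT_ID   0   /* illustrative values, not from your code */
#define QUEUE_ID  0

static struct rte_mempool *pkt_pool;             /* created with rte_pktmbuf_pool_create() */
static struct rte_eth_dev_tx_buffer *tx_buffer;  /* allocated with RTE_ETH_TX_BUFFER_SIZE(n) bytes
                                                  * and set up with rte_eth_tx_buffer_init() */

/* Hand the caller the mbuf's own data area instead of a private buffer,
 * so no memcpy is needed later. The mbuf pointer is kept as per-buffer
 * metadata so the write routine can find it again. */
static void *
buffer_alloc(struct rte_mbuf **m_out)
{
    struct rte_mbuf *m = rte_pktmbuf_alloc(pkt_pool);
    if (m == NULL)
        return NULL;
    *m_out = m;
    return rte_pktmbuf_mtod(m, void *);   /* caller writes the frame directly here */
}

/* Write path: the frame already sits in the mbuf, so just set the lengths
 * and hand the mbuf to the TX buffering API; it goes out as part of a burst. */
static void
buffer_send(struct rte_mbuf *m, uint16_t len)
{
    m->data_len = len;
    m->pkt_len  = len;
    rte_eth_tx_buffer(PORT_ID, QUEUE_ID, tx_buffer, m);   /* driver frees the mbuf */
    /* call rte_eth_tx_buffer_flush(PORT_ID, QUEUE_ID, tx_buffer) periodically
     * to push out a partially filled burst */
}

The point is that the buffer your main code fills is already the mbuf's data area, so the write routine only has to set the lengths and queue the mbuf; no copy is needed.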
On the read side you also need to find the place where the buffer is allocated, allocate an mbuf there, and save the mbuf pointer in the buffer's metadata (if you have metadata per buffer); then you can free the mbuf at some point after you have processed the data buffer.
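Again, only a rough sketch of the read side; the constants and helper names are made up, not from your code, and it also folds in the burst-caching idea I describe further down in this thread:

#define BURST_SZ  32   /* illustrative burst size */

static struct rte_mbuf *rx_cache[BURST_SZ];
static uint16_t rx_count;   /* mbufs left in the cache */
static uint16_t rx_next;    /* index of the next mbuf to hand out */

/* Return one packet at a time; refill the cache with rte_eth_rx_burst()
 * only when it runs empty. The caller gets a pointer into the mbuf data
 * and must hand the mbuf back to buffer_free() when done with it. */
static void *
read_one(struct rte_mbuf **m_out, uint16_t *len)
{
    if (rx_next == rx_count) {
        rx_next = 0;
        rx_count = rte_eth_rx_burst(PORT_ID, QUEUE_ID, rx_cache, BURST_SZ);
        if (rx_count == 0)
            return NULL;                    /* nothing arrived, caller polls again */
    }
    struct rte_mbuf *m = rx_cache[rx_next++];
    *m_out = m;                             /* keep the mbuf pointer with the buffer */
    *len = rte_pktmbuf_data_len(m);
    return rte_pktmbuf_mtod(m, void *);     /* process the frame in place, no memcpy */
}

static void
buffer_free(struct rte_mbuf *m)
{
    rte_pktmbuf_free(m);                    /* release once the data has been consumed */
}

The data is processed in place and the mbuf is only freed once your code is done with the buffer, so there is no copy on this side either.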
I hope that is clear; I have meetings I must attend.

> Can you go through this and suggest some changes, or point out any mistakes in our code? If you need any help or have any doubts, ping us.
> 
> Thanks and Regards,
> Harsh & Hrishikesh
> 
> On Tue, 13 Nov 2018 at 19:17, Wiles, Keith wrote:
> 
> 
> > On Nov 12, 2018, at 8:25 PM, Harsh Patel wrote:
> > 
> > Hello,
> > It would be really helpful if you could provide us a link (for both Tx and Rx) to the project you mentioned earlier where you worked on a similar problem, if possible.
> > 
> 
> At this time I can not provide a link. I will try and see what I can do, but do not hold your breath; it could be a while as we have to go through a lot of legal stuff. You could try the VTune tool from Intel for x86 systems, if you can get a copy for your platform, as it can tell you a lot about the code and where the performance issues are located. If you are not running Intel x86 then my code may not work for you; I do not remember if you told me which platform you are on.
> 
> 
> > Thanks and Regards,
> > Harsh & Hrishikesh.
> > 
> > On Mon, 12 Nov 2018 at 01:15, Harsh Patel wrote:
> > Thanks a lot for all the support. We are looking into our work as of now and will contact you once we are done checking it completely from our side. Thanks for the help.
> > 
> > Regards,
> > Harsh and Hrishikesh
> > 
> > On Sat, 10 Nov 2018 at 11:47, Wiles, Keith wrote:
> > Please make sure to send your emails in plain text format. The Mac mail program loves to use rich-text format if the original email used it, even though I have told it to only send plain text :-(
> > 
> > > On Nov 9, 2018, at 4:09 AM, Harsh Patel wrote:
> > > 
> > > We have implemented the logic for Tx/Rx as you suggested. We compared the obtained throughput with another version of the same application that uses Linux raw sockets.
> > > Unfortunately, the throughput we get in our DPDK application is lower by a good margin. Is there any way we can optimize our implementation, or anything that we are missing?
> > > 
> > 
> > The PoC code I was developing for DAPI did not have any performance issues; it ran just as fast in my limited testing. I converted the l3fwd code and, as I remember, I saw 10G 64-byte wire rate using pktgen to generate the traffic.
> > 
> > Not sure why you would see a big performance drop, but I do not know your application or code.
> > 
> > > Thanks and regards
> > > Harsh & Hrishikesh
> > > 
> > > On Thu, 8 Nov 2018 at 23:14, Wiles, Keith wrote:
> > > 
> > > 
> > >> On Nov 8, 2018, at 4:58 PM, Harsh Patel wrote:
> > >> 
> > >> Thanks for your insight on the topic. Transmission is working with the functions you mentioned. We tried to search for some similar functions for handling incoming packets but could not find anything. Can you help us with that as well?
> > >> 
> > > 
> > > I do not know of a DPDK API set for the RX side. But in the DAPI (DPDK API) PoC I was working on, and presented at the DPDK Summit last September, I did create an RX side version. The issue is that it is a bit tangled up in the DAPI PoC.
> > > 
> > > The basic concept is that a call to RX a single packet does an rx_burst of N packets, keeping them in an mbuf list. The code would spin waiting for mbufs to arrive, or return quickly if a flag was set. When it did find RX mbufs it would return a single mbuf and keep the list of mbufs for later requests; once the list is empty it does another rx_burst call.
> > > 
> > > Sorry, this is a really quick note on how it works. If you need more details we can talk more later.
> > >> 
> > >> Regards,
> > >> Harsh and Hrishikesh.
> > >> 
> > >> 
> > >> On Thu, 8 Nov 2018 at 14:26, Wiles, Keith wrote:
> > >> 
> > >> 
> > >> > On Nov 8, 2018, at 8:24 AM, Harsh Patel wrote:
> > >> > 
> > >> > Hi,
> > >> > We are working on a project where we are trying to integrate DPDK with
> > >> > another software. We are able to pass packets from the other environment
> > >> > into the DPDK environment one by one. On the other hand, DPDK sends and
> > >> > receives bursts of data packets. We want to know if there is any
> > >> > functionality in DPDK to convert a single incoming packet
> > >> > into a burst of packets sent on the NIC and, similarly, to take a burst of
> > >> > packets read from the NIC and hand them to the other environment sequentially.
> > >> 
> > >> 
> > >> Search the docs or the lib/librte_ethdev directory for rte_eth_tx_buffer_init, rte_eth_tx_buffer, ...
> > >> 
> > >> 
> > >> 
> > >> > Thanks and regards
> > >> > Harsh Patel, Hrishikesh Hiraskar
> > >> > NITK Surathkal
> > >> 
> > >> Regards,
> > >> Keith
> > >> 
> > > 
> > > Regards,
> > > Keith
> > > 
> > 
> > Regards,
> > Keith
> > 
> 
> Regards,
> Keith
> 

Regards,
Keith