From mboxrd@z Thu Jan 1 00:00:00 1970
From: Prashant Upadhyaya
To: "Benson, Bryan", Stephen Hemminger
Cc: "dev@dpdk.org"
Date: Thu, 5 Dec 2013 19:46:21 +0530
Subject: Re: [dpdk-dev] generic load balancing
References: <03f701cef134$7e564720$7b02d560$@com>
List-Id: patches and discussions about DPDK

Hi Bryan,

Regarding your 1st point, the single core becomes the RX bottleneck, which is clearly not desirable.

I am not sure how to use the feature you mentioned in your 2nd point. Is there some DPDK API which lets me configure this? Kindly let me know.

Regards
-Prashant

From: Benson, Bryan [mailto:bmbenson@amazon.com]
Sent: Thursday, December 05, 2013 1:14 PM
To: Prashant Upadhyaya; Stephen Hemminger
Cc: dev@dpdk.org
Subject: RE: [dpdk-dev] generic load balancing

Prashant,

I assume your use case is not one of IP/UDP/TCP - or if it is, you are dealing with a single tuple that is not evenly distributed.

You have a few options with the NIC that I can think of.

1) Use a single core to RX each port's frames and use your own software solution to RR to worker rings. There is an example of this in the Load Balancer sample application.

2) If your packets/frames have an evenly distributed field in the first 64 bytes of the frame, you can use the 2 byte match feature of flow director to send to different queues (with multiple match signatures). This will give even distribution, but not round robin behavior.

3) Modify the RSS redirection table for the NIC in the order you desire. I am unsure how often this can happen, or if there are performance issues with reprogramming it. It would definitely need some experimentation.

What is it you are trying to achieve with round robin? A distribution of packets to multiple cores for processing, or something else?

Without knowing the use case, my main suggestion is to use the LB sample application - that way you can distribute in any way you please.

Thanks,
Bryan Benson

-------- Original message --------
From: Prashant Upadhyaya
Date: 12/04/2013 9:30 PM (GMT-08:00)
To: Stephen Hemminger
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] generic load balancing

Hi Stephen,

The awfulness depends upon the use case.
I have, e.g., a use case where I want this round robin behaviour.
I just want the NIC to give me a facility to use this.

Regards
-Prashant

-----Original Message-----
From: Stephen Hemminger [mailto:stephen@networkplumber.org]
Sent: Thursday, December 05, 2013 10:25 AM
To: Prashant Upadhyaya
Cc: François-Frédéric Ozog; Michael Quicquaro; dev@dpdk.org
Subject: Re: [dpdk-dev] generic load balancing

Round robin would actually be awful for any protocol because it would cause out-of-order packets.
That is why flow-based algorithms like flow director and RSS work much better.

On Wed, Dec 4, 2013 at 8:31 PM, Prashant Upadhyaya wrote:
> Hi,
>
> It's a real pity that the Intel 82599 NIC (and possibly others) doesn't have simple round robin scheduling of packets on the configured queues.
>
> I have requested this from Intel earlier, and I am using this forum to request it again -- please put this facility in the NIC: if I set up N queues there and configure the NIC for round robin scheduling on the queues, then the NIC should simply put the received packets one by one on queue 1, then on queue 2, ..., then on queue N, and then back on queue 1.
> The above is very useful in a lot of load balancing cases.
>
> Regards
> -Prashant
>
>
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of François-Frédéric Ozog
> Sent: Thursday, December 05, 2013 2:35 AM
> To: 'Michael Quicquaro'
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] generic load balancing
>
> Hi,
>
> As far as I can tell, this is really hardware dependent. Some hash functions allow uplink and downlink packets of the same "session" to go to the same queue (I know Chelsio can do this).
>
> For the Intel card, you may find what you want in:
> http://www.intel.com/content/www/us/en/ethernet-controllers/82599-10-gbe-controller-datasheet.html
>
> Other cards require an NDA or other agreements to get details of RSS.
>
> If you have a performance problem, may I suggest you use kernel 3.10 and then monitor system activity with the "perf" command.
> For instance, you can start with "perf top -a"; this will give you nice information. Then your creativity will do the rest ;-) You may be surprised by what comes up among the top hot spots...
> (the most unexpected hot function I found here was the Linux syscall gettimeofday!!!)
>
> François-Frédéric
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Michael Quicquaro
>> Sent: Wednesday, December 4, 2013 18:53
>> To: dev@dpdk.org
>> Subject: [dpdk-dev] generic load balancing
>>
>> Hi all,
>> I am writing a DPDK application that will receive packets from one interface and process them. It does not forward packets in the traditional sense. However, I do need to process them at full line rate and therefore need more than one core. The packets can be somewhat generic in nature and can be nearly identical (especially at the beginning of the packet). I've used the rxonly function of testpmd as a model.
>>
>> I've run into problems in processing a full line rate of data since the nature of the data causes all the data to be presented to only one core. I get a large percentage of dropped packets (shows up as Rx-Errors in "port stats") because of this. I've tried modifying the data so that packets have different UDP ports and that seems to work when I use --rss-udp
>>
>> My questions are:
>> 1) Is there a way to configure RSS so that it alternates packets to all configured cores regardless of the packet data?
>>
>> 2) Where is the best place to learn more about RSS and how to configure it? I have not found much in the DPDK documentation.
>>
>> Thanks for the help,
>> - Mike
>
> ===============================================================================
> Please refer to
> http://www.aricent.com/legal/email_disclaimer.html
> for important disclosures regarding this electronic communication.
> ===============================================================================
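
The trade-off argued over in this thread (round robin spreads load evenly but splits a flow across cores, while flow hashing keeps a flow together but can pin everything to one core) can be sketched in a few lines of Python. This is only a simulation under stated assumptions, not DPDK code and not how any NIC implements it: the function names, the packet dictionaries, the worker count, and the use of CRC32 as a stand-in for the NIC hash are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import cycle
import zlib

NUM_WORKERS = 4  # assumed number of worker rings/cores (hypothetical)


def distribute_round_robin(packets, num_workers=NUM_WORKERS):
    """Assign packets to workers 0..N-1 in strict rotation, ignoring content."""
    workers = cycle(range(num_workers))
    return [(next(workers), pkt) for pkt in packets]


def distribute_by_flow_hash(packets, num_workers=NUM_WORKERS):
    """Assign each packet by a hash of its flow key, so a flow stays on one worker.

    CRC32 is only a stand-in here; real RSS hardware uses a Toeplitz hash.
    """
    return [(zlib.crc32(pkt["flow"].encode()) % num_workers, pkt)
            for pkt in packets]


def workers_per_flow(assignments):
    """Map each flow key to the set of workers that received its packets."""
    seen = {}
    for worker, pkt in assignments:
        seen.setdefault(pkt["flow"], set()).add(worker)
    return seen


# Eight nearly identical packets that all belong to a single flow,
# similar in spirit to the traffic described in the original question.
FLOW = "10.0.0.1:5000->10.0.0.2:6000"
packets = [{"flow": FLOW, "seq": i} for i in range(8)]

rr = distribute_round_robin(packets)
fh = distribute_by_flow_hash(packets)

rr_load = Counter(worker for worker, _ in rr)
fh_load = Counter(worker for worker, _ in fh)

print("round robin load per worker:", dict(rr_load))
print("flow hash load per worker:  ", dict(fh_load))
print("round robin workers per flow:", workers_per_flow(rr))
print("flow hash workers per flow:  ", workers_per_flow(fh))
```

With a single flow, the round robin assignment gives every worker the same number of packets but spreads consecutive packets of that flow over all workers, which is where the out-of-order concern comes from. The hash-based assignment sends the whole flow to one worker, which preserves order but reproduces the single-core bottleneck reported at the start of the thread.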