From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <prvs=0447c0877=bmbenson@amazon.com>
Received: from smtp-fw-9102.amazon.com (smtp-fw-9102.amazon.com
 [207.171.184.29]) by dpdk.org (Postfix) with ESMTP id 7C6A59DE
 for <dev@dpdk.org>; Thu,  5 Dec 2013 08:43:32 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209;
 t=1386229475; x=1417765475;
 h=from:to:cc:subject:date:message-id:references:
 in-reply-to:mime-version;
 bh=EEZI3flu1Ql4BXjCS8g8zEo9dq6jyjcsH8dJR0R/vgI=;
 b=IBI5vKLjI4kYk/xjw7htakWik0zQsbexCTBPVxuMlvmxPwAuVlaT6BkQ
 N2DcFpjLa8XhU1GxkLpkNtErWu6a1pmm/ET0hhLjnNXD7oCRLO1TzKr3h
 nuoKHCzwyiUME0K0DgR8HwOywEbVu/1XTxDAdEPuJL4uX7fT4RnmZk0O/ M=;
X-IronPort-AV: E=Sophos;i="4.93,831,1378857600"; d="scan'208,217";a="42244922"
Received: from smtp-in-31001.sea31.amazon.com ([10.184.168.27])
 by smtp-border-fw-out-9102.sea19.amazon.com with ESMTP/TLS/DHE-RSA-AES256-SHA;
 05 Dec 2013 07:44:33 +0000
Received: from ex10-hub-31002.ant.amazon.com (ex10-hub-31002.sea31.amazon.com
 [10.185.169.193])
 by smtp-in-31001.sea31.amazon.com (8.14.7/8.14.7) with ESMTP id rB57iWKL005515
 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=OK);
 Thu, 5 Dec 2013 07:44:33 GMT
Received: from EX10-MBX-9002.ant.amazon.com ([fe80::f0a2:552e:430e:3cdb]) by
 ex10-hub-31002.ant.amazon.com ([::1]) with mapi id 14.02.0342.003; Wed, 4 Dec
 2013 23:44:30 -0800
From: "Benson, Bryan" <bmbenson@amazon.com>
To: Prashant Upadhyaya <prashant.upadhyaya@aricent.com>, Stephen Hemminger
 <stephen@networkplumber.org>
Thread-Topic: [dpdk-dev] generic load balancing
Thread-Index: AQHO8RnFWyvPo7f6W02hdASxHnM2OppFDOSAgAB82wCAAAZaAIAACdCA//+fi/A=
Date: Thu, 5 Dec 2013 07:44:29 +0000
Message-ID: <eexymb17q9xfuyslcxlpfvhi.1386229465261@email.android.com>
References: <CAAD-K94YUY6aUPzvJyqJ7w4W2_81d0Fq7EvwJ1xKOzzd3Ld4Lw@mail.gmail.com>
 <03f701cef134$7e564720$7b02d560$@com>
 <C7CE7EEF248E2B48BBA63D0ABEEE700C5353AEF76F@GUREXMB01.ASIAN.AD.ARICENT.COM>
 <CAOaVG16OSgtUOTiv6nqOfuz7MgDP+Hygqv7hEKUMxMaFctnCpg@mail.gmail.com>,
 <C7CE7EEF248E2B48BBA63D0ABEEE700C5353AEF790@GUREXMB01.ASIAN.AD.ARICENT.COM>
In-Reply-To: <C7CE7EEF248E2B48BBA63D0ABEEE700C5353AEF790@GUREXMB01.ASIAN.AD.ARICENT.COM>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
MIME-Version: 1.0
Precedence: Bulk
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.15
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] generic load balancing
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Thu, 05 Dec 2013 07:43:33 -0000

Prashant,
I assume your use case is not of one IP/UDP/TCP - or if it is, you are
dealing with a single tuple that is not evenly distributed.

You have a few options with the NIC that I can think of.

1) Use a single core to RX each port's frames and use your own software
solution to RR to worker rings.  There is an example of this in the Load
Balancer sample application.

2) If your packets/frames have an evenly distributed field in the first 64
bytes of the frame, you can use the 2 byte match feature of flow director
to send to different queues (with multiple match signatures).  This will
give even distribution, but not round robin behavior.

3) Modify the RSS redirection table for the NIC in the order you desire.  I
am unsure how often it can be reprogrammed, or if there are performance
issues with reprogramming it.  Definitely would need some experimentation.
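
To make option 3 a little more concrete, here is a minimal sketch in plain C
of how a redirection table lookup behaves. It is a model only, not DPDK or
driver code. The 128-entry table indexed by the low 7 bits of the RSS hash
is how I understand the 82599 datasheet; the helper names reta_fill_cyclic
and reta_lookup are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 128-entry table, as I understand the 82599 datasheet. */
#define RETA_SIZE 128

/* Hypothetical helper: fill the table so its entries cycle over the
 * queues 0, 1, ..., nb_queues - 1. */
static void reta_fill_cyclic(uint8_t reta[RETA_SIZE], unsigned nb_queues)
{
    assert(nb_queues > 0 && nb_queues <= 16);
    for (unsigned i = 0; i < RETA_SIZE; i++)
        reta[i] = (uint8_t)(i % nb_queues);
}

/* Hypothetical helper: the queue selected for a given 32-bit RSS hash.
 * Only the low 7 bits of the hash pick the table entry. */
static unsigned reta_lookup(const uint8_t reta[RETA_SIZE], uint32_t rss_hash)
{
    return reta[rss_hash & (RETA_SIZE - 1)];
}
```

Note that in this model the same hash always selects the same entry, so
packets with identical hashed fields keep landing on one queue until the
table itself is rewritten. That is why the reprogramming rate matters here.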

What is it you are trying to achieve with Round Robin?  A distribution of
packets to multiple cores for processing, or something else?

Without knowing the use case, my main suggestion is to use the LB sample
application - that way you can distribute in any way you please.
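
A rough sketch of what such a software distributor does, in plain C, is
below. It is not taken from the LB sample application: the ring here is a
toy single-threaded array, and the names (struct ring, struct rr_dist,
rr_dispatch, RING_SIZE) are hypothetical. A real DPDK application would use
a proper lock-free ring such as rte_ring between the RX core and the
workers.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8  /* hypothetical per-worker capacity */

/* Toy ring: head and tail are free-running counters. */
struct ring {
    void *slots[RING_SIZE];
    unsigned head;  /* next slot to write */
    unsigned tail;  /* next slot to read */
};

/* Returns 0 on success, -1 if the ring is full. */
static int ring_enqueue(struct ring *r, void *pkt)
{
    if (r->head - r->tail == RING_SIZE)
        return -1;
    r->slots[r->head % RING_SIZE] = pkt;
    r->head++;
    return 0;
}

/* Returns the oldest packet, or NULL if the ring is empty. */
static void *ring_dequeue(struct ring *r)
{
    if (r->head == r->tail)
        return NULL;
    void *pkt = r->slots[r->tail % RING_SIZE];
    r->tail++;
    return pkt;
}

/* Round-robin distributor state, owned by the single RX core. */
struct rr_dist {
    struct ring *workers;
    unsigned nb_workers;
    unsigned next;  /* worker that receives the next packet */
};

/* Hands pkt to the next worker in turn. Returns the worker index, or -1
 * if that worker's ring is full (the caller decides whether to drop). */
static int rr_dispatch(struct rr_dist *d, void *pkt)
{
    assert(d->nb_workers > 0);
    unsigned w = d->next;
    d->next = (d->next + 1) % d->nb_workers;
    if (ring_enqueue(&d->workers[w], pkt) != 0)
        return -1;
    return (int)w;
}
```

With this scheme consecutive packets of one flow go to different workers, so
per-flow ordering is only preserved if the workers or a later stage restore
it. Swapping the index computation for a hash of the flow fields would keep
each flow on one worker instead.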

Thanks,
Bryan Benson


-------- Original message --------
From: Prashant Upadhyaya
Date: 12/04/2013 9:30 PM (GMT-08:00)
To: Stephen Hemminger
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] generic load balancing

Hi Stephen,

The awfulness depends upon the use case.
I have, e.g., a use case where I want this round-robin behaviour.

I just want the NIC to give me a facility to use this.

Regards
-Prashant


-----Original Message-----
From: Stephen Hemminger [mailto:stephen@networkplumber.org]
Sent: Thursday, December 05, 2013 10:25 AM
To: Prashant Upadhyaya
Cc: François-Frédéric Ozog; Michael Quicquaro; dev@dpdk.org
Subject: Re: [dpdk-dev] generic load balancing

Round robin would actually be awful for any protocol because it would cause
out of order packets.
That is why flow based algorithms like flow director and RSS work much
better.

On Wed, Dec 4, 2013 at 8:31 PM, Prashant Upadhyaya <prashant.upadhyaya@aricent.com> wrote:
> Hi,
>
> It's a real pity that the Intel 82599 NIC (and possibly others) doesn't
> have a simple round robin scheduling of packets on the configured queues.
>
> I have requested Intel earlier, and using this forum requesting again --
> please please put this facility in the NIC that if I drop N queues there
> and configure the NIC for some round robin scheduling on queues, then the
> NIC should simply put the received packets one by one on queue 1, then on
> queue 2, ..., then on queue N, and then back on queue 1.
> The above is very useful in a lot of load balancing cases.
>
> Regards
> -Prashant
>
>
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of François-Frédéric Ozog
> Sent: Thursday, December 05, 2013 2:35 AM
> To: 'Michael Quicquaro'
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] generic load balancing
>
> Hi,
>
> As far as I can tell, this is really hardware dependent. Some hash
> functions allow uplink and downlink packets of the same "session" to go to
> the same queue (I know Chelsio can do this).
>
> For the Intel card, you may find what you want in:
> http://www.intel.com/content/www/us/en/ethernet-controllers/82599-10-gbe-controller-datasheet.html
>
> Other cards require NDA or other agreements to get details of RSS.
>
> If you have a performance problem, may I suggest you use kernel 3.10, then
> monitor system activity with the "perf" command. For instance, you can
> start with "perf top -a"; this will give you nice information. Then your
> creativity will do the rest ;-) You may be surprised what comes up as the
> top hot points...
> (the most unexpected hot function I found here was Linux syscall
> gettimeofday!!!)
>
> François-Frédéric
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Michael Quicquaro
>> Sent: Wednesday, December 4, 2013 18:53
>> To: dev@dpdk.org
>> Subject: [dpdk-dev] generic load balancing
>>
>> Hi all,
>> I am writing a dpdk application that will receive packets from one
>> interface and process them.  It does not forward packets in the
>> traditional sense.  However, I do need to process them at full line rate
>> and therefore need more than one core.  The packets can be somewhat
>> generic in nature and can be nearly identical (especially at the
>> beginning of the packet).  I've used the rxonly function of testpmd as a
>> model.
>>
>> I've run into problems in processing a full line rate of data since the
>> nature of the data causes all the data to be presented to only one core.
>> I get a large percentage of dropped packets (shows up as Rx-Errors in
>> "port stats") because of this.  I've tried modifying the data so that
>> packets have different UDP ports and that seems to work when I use
>> --rss-udp
>>
>> My questions are:
>> 1) Is there a way to configure RSS so that it alternates packets to
>> all configured cores regardless of the packet data?
>>
>> 2)  Where is the best place to learn more about RSS and how to
>> configure it? I have not found much in the DPDK documentation.
>>
>> Thanks for the help,
>> - Mike
>
> ======================================================================
> Please refer to
> http://www.aricent.com/legal/email_disclaimer.html
> for important disclosures regarding this electronic communication.
> ======================================================================



