From: Tony Hart
Date: Mon, 8 Jul 2024 16:19:48 -0400
Subject: Re: Performance of CX7 with 'eth' pattern versus 'eth/ipv4' in hairpin
To: Bing Zhao
Cc: users@dpdk.org

Hi Bing,
Thanks for your help on this.

Let me check whether I understand your analysis correctly. With just 'eth' as
the pattern, the CX7 has to do two (maybe more) lookups: the first matches
Ethernet packets; once that's done, a second match has to occur on IPv4
(since the RSS function is L3), and only then can the RSS action be
performed. When the pattern is eth/ipv4, only one lookup is required before
the RSS action can occur?

thanks,
tony

On Mon, Jul 8, 2024 at 12:56 PM Bing Zhao wrote:
> Hi,
>
> Apologies for the late response. PSB
>
> > -----Original Message-----
> > From: Tony Hart
> > Sent: Wednesday, June 26, 2024 9:25 PM
> > To: Bing Zhao
> > Cc: users@dpdk.org
> > Subject: Re: Performance of CX7 with 'eth' pattern versus 'eth/ipv4' in
> > hairpin
> >
> > External email: Use caution opening links or attachments
> >
> >
> > Hi Bing,
> > Thanks for the quick reply. The results are...
> >
> > With a single hairpin queue I get approximately the same rate for both
> > patterns, ~54Gbps. I assume this is less than the RSS rates due to
> > fewer queues?
> >
> > flow create 0 ingress group 1 pattern eth / end actions count / queue index 6 / end
> > flow create 0 ingress group 1 pattern eth / ipv4 / end actions count / queue index 6 / end
>
> The reason I wanted to compare with a single queue is to confirm whether
> the difference is caused by the RSS action.
> And the result is as expected.
>
> >
> > With the split ipv6/ipv4 I'm getting ~124Gbps
> >
> > flow create 0 ingress group 1 priority 1 pattern eth / ipv6 / end actions count / rss queues 6 7 8 9 end / end
> > flow create 0 ingress group 1 priority 1 pattern eth / ipv4 / end actions count / rss queues 6 7 8 9 end / end
> >
> > testpmd> flow list 0
> > ID Group Prio Attr Rule
> > 0  0     0    i--  => JUMP
> > 1  1     1    i--  ETH IPV6 => COUNT RSS
> > 2  1     1    i--  ETH IPV4 => COUNT RSS
> >
>
> I tried to debug on my local setup; the reason is related to RSS
> expansion.
> The mlx5 PMD doesn't currently support RSS on Ethernet header fields.
> When only ETH is in the pattern but the RSS is the default (L3 IP),
> several rules will be inserted:
> 1. Ethernet + IPv6 / RSS based on the IPv6 header
> 2. Ethernet + IPv4 / RSS based on the IPv4 header
> 3. Other Ethernet packets / single default queue
>
> This adds some more hops for an IPv4 packet.
> So, it is better to match IPv4 explicitly if you are using the default
> RSS fields.
> Note: If you are using RSS on the (IP +) TCP / UDP fields, the expansion
> to the L4 headers may be involved. To avoid this, the rule's match can
> be specified down to L4 as well.
>
> > On Wed, Jun 26, 2024 at 8:10 AM Bing Zhao wrote:
> > >
> > > Hi Tony,
> > >
> > > Could you also try to test with:
> > > 1. QUEUE action instead of RSS, and check single-queue performance.
> > > 2. 
when trying to test the IPv4-only case, try the following 3 commands
> > > in this order -
> > >         flow create 0 ingress group 0 pattern end actions jump group 1 / end
> > >         flow create 0 ingress group 1 priority 1 pattern eth / ipv6 / end actions count / rss queues 6 7 8 9 end / end
> > >         flow create 0 ingress group 1 priority 1 pattern eth / ipv4 / end actions count / rss queues 6 7 8 9 end / end
> > >
> > > BR. Bing
> > >
> > > > -----Original Message-----
> > > > From: Tony Hart
> > > > Sent: Wednesday, June 26, 2024 7:39 PM
> > > > To: users@dpdk.org
> > > > Subject: Performance of CX7 with 'eth' pattern versus 'eth/ipv4' in
> > > > hairpin
> > > >
> > > > External email: Use caution opening links or attachments
> > > >
> > > >
> > > > I'm using a CX7 and testing hairpin queues. The test traffic is
> > > > entirely IPv4+UDP with distributed SIP,DIP pairs, and received
> > > > packets are u-turned via hairpin in the CX7 (single 400G
> > > > interface).
> > > >
> > > > I see different performance when I use a pattern of 'eth' versus
> > > > 'eth/ipv4' in the hairpin flow entry. From testing it seems that
> > > > specifying just 'eth' is sufficient to invoke RSS, and 'eth/ipv4'
> > > > should be equivalent since the traffic is all IPv4, but I'm getting
> > > > ~104Gbps for the 'eth' pattern and ~124Gbps for the 'eth/ipv4'
> > > > pattern.
> > > >
> > > > Any thoughts on why there is such a performance difference here?
> > > >
> > > > thanks
> > > > tony
> > > >
> > > > These are the testpmd commands for the 'eth' pattern:
> > > > flow create 0 ingress group 0 pattern end actions jump group 1 / end
> > > > flow create 0 ingress group 1 pattern eth / end actions count / rss queues 6 7 8 9 end / end
> > > >
> > > > The testpmd commands for 'eth/ipv4':
> > > > flow create 0 ingress group 0 pattern end actions jump group 1 / end
> > > > flow create 0 ingress group 1 pattern eth / ipv4 / end actions count / rss queues 6 7 8 9 end / end
> > > >
> > > > This is the testpmd command line...
> > > > dpdk-testpmd -l8-14 -a81:00.0,dv_flow_en=1 -- -i --nb-cores 6 --rxq 6 --txq 6 \
> > > >   --port-topology loop --forward-mode=rxonly --hairpinq 4 --hairpin-mode 0x10
> > > >
> > > > Versions
> > > > mlnx-ofa_kernel-24.04-OFED.24.04.0.6.6.1.rhel9u4.x86_64
> > > > kmod-mlnx-ofa_kernel-24.04-OFED.24.04.0.6.6.1.rhel9u4.x86_64
> > > > mlnx-ofa_kernel-devel-24.04-OFED.24.04.0.6.6.1.rhel9u4.x86_64
> > > > ofed-scripts-24.04-OFED.24.04.0.6.6.x86_64
> > > >
> > > > DPDK: v24.03
>
> --
> tony

--
tony
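[Editor's note] The RSS expansion Bing describes can be sketched as a toy
model. This is an illustrative Python sketch under stated assumptions, not
mlx5 PMD code; the function names and rule representation are invented. It
shows why an 'eth'-only rule with L3 RSS costs an IPv4 packet an extra
lookup hop compared with an explicit 'eth / ipv4' match:

```python
# Toy model of the mlx5 RSS expansion described above (hypothetical, for
# illustration only). A rule matching only ETH with default (L3) RSS is
# expanded into one rule per L3 protocol plus a non-IP fallback.

def expand_rss_rule(pattern, rss_level="L3"):
    """Return the list of (pattern, action) rules the PMD would install."""
    if rss_level == "L3" and pattern == ["eth"]:
        return [
            (["eth", "ipv6"], "RSS on IPv6 header"),
            (["eth", "ipv4"], "RSS on IPv4 header"),
            (["eth"], "single default queue"),  # non-IP fallback
        ]
    # An explicit L3 match needs no expansion.
    return [(pattern, f"RSS on {rss_level} fields")]

def lookups_for_ipv4_packet(rules):
    """Count rule lookups until an IPv4 packet reaches its RSS rule."""
    for hops, (pattern, _action) in enumerate(rules, start=1):
        if "ipv4" in pattern:
            return hops
    return len(rules)  # fell through to the default-queue rule

# 'eth'-only rule: the expansion adds a hop before IPv4 RSS is reached.
print(lookups_for_ipv4_packet(expand_rss_rule(["eth"])))           # 2
# Explicit 'eth / ipv4' rule: matched directly in a single lookup.
print(lookups_for_ipv4_packet(expand_rss_rule(["eth", "ipv4"])))   # 1
```

Under this model, all-IPv4 traffic behind an 'eth'-only rule always takes
the longer path, which is consistent with the ~104Gbps vs ~124Gbps gap
reported in the thread.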