DPDK usage discussions
* Flow rules performance with ConnectX-6 Dx
@ 2022-03-11 11:10 Дмитрий Степанов
  2022-03-11 11:37 ` Dmitry Kozlyuk
  0 siblings, 1 reply; 5+ messages in thread
From: Дмитрий Степанов @ 2022-03-11 11:10 UTC (permalink / raw)
  To: users


Hi, folks!

I'm using a Mellanox ConnectX-6 Dx EN adapter card (100GbE; dual-port QSFP56;
PCIe 4.0/3.0 x16) with DPDK 21.11 on Ubuntu 20.04.

I want to drop particular packets in the NIC hardware using rte_flow.
The flow configuration is rather straightforward: RTE_FLOW_ACTION_TYPE_DROP
as the single action and RTE_FLOW_ITEM_TYPE_ETH/RTE_FLOW_ITEM_TYPE_IPV4 as
the pattern items (I used the flow_filtering DPDK example as a starting point).

I'm using the following IPv4 pattern for the rte_flow drop rule: 0.0.0.0/0
as the source IP and 10.0.0.2/32 as the destination IP, i.e. I want to drop
all packets addressed to 10.0.0.2, whatever their source IP.
To test this I generate TCP packets with two different destination IPs,
10.0.0.1 and 10.0.0.2; source IPs are drawn randomly from the range
10.0.0.0-10.255.255.255. Half of the traffic should be dropped and the
other half passed to my application.
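
In code the rule is essentially the following (a simplified sketch in the
spirit of the flow_filtering example, not my exact code; error handling
trimmed):

```c
/* Sketch of the drop rule described above, modeled on the DPDK
 * flow_filtering example (DPDK 21.11 API). Illustration only. */
#include <rte_byteorder.h>
#include <rte_flow.h>
#include <rte_ip.h>

static struct rte_flow *
create_drop_rule(uint16_t port_id, struct rte_flow_error *error)
{
	struct rte_flow_attr attr = { .ingress = 1 };

	/* Match any Ethernet frame carrying IPv4 with dst 10.0.0.2/32;
	 * the source mask stays all-zero, i.e. 0.0.0.0/0. */
	struct rte_flow_item_ipv4 ip_spec = {
		.hdr.dst_addr = RTE_BE32(RTE_IPV4(10, 0, 0, 2)),
	};
	struct rte_flow_item_ipv4 ip_mask = {
		.hdr.dst_addr = RTE_BE32(0xffffffff),
	};
	struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_ETH },
		{ .type = RTE_FLOW_ITEM_TYPE_IPV4,
		  .spec = &ip_spec, .mask = &ip_mask },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};

	/* Single action: drop in hardware. */
	struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_DROP },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};

	if (rte_flow_validate(port_id, &attr, pattern, actions, error) != 0)
		return NULL;
	return rte_flow_create(port_id, &attr, pattern, actions, error);
}
```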

If I generate 20 Mpps in total, I see that 10 Mpps is dropped by rte_flow
and 10 Mpps is passed to my application, so everything is fine there.
But if I increase the input traffic to 40/50/100/148 Mpps, at most 15 Mpps
is passed to my application, regardless of the input rate; the rest is
dropped. I have checked that my generator behaves correctly: destination
IPs are distributed equally across the generated traffic. If I generate
packets that don't match the rte_flow drop rule (e.g. with destinations
10.0.0.1 and 10.0.0.3), all traffic is passed to my application without
problems.

Another example: if I generate traffic with three destination IPs
(10.0.0.1, 10.0.0.2, 10.0.0.3) at 60 Mpps (20 Mpps per destination, where
10.0.0.2 matches the drop rule), I get only 30 Mpps in total passed to my
application (15 Mpps for each non-matching destination instead of 20 Mpps).
If I replace 10.0.0.2 (which matches the drop rule) with 10.0.0.4, all
60 Mpps are passed to my application.

To summarize: if the generated traffic contains a destination IP that
matches the rte_flow drop rule, every non-matching destination is capped
at about 15 Mpps. If no destination in the traffic matches the drop rule,
the 15 Mpps limit does not apply and all traffic is passed to my
application.

Is there any explanation for this behavior, or am I doing something wrong?
I haven't found any explanation in the mlx5 PMD documentation.



Thanks, Dmitriy Stepanov


^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: Flow rules performance with ConnectX-6 Dx
  2022-03-11 11:10 Flow rules performance with ConnectX-6 Dx Дмитрий Степанов
@ 2022-03-11 11:37 ` Dmitry Kozlyuk
  2022-03-11 12:58   ` Дмитрий Степанов
  0 siblings, 1 reply; 5+ messages in thread
From: Dmitry Kozlyuk @ 2022-03-11 11:37 UTC (permalink / raw)
  To: Дмитрий
	Степанов,
	users

Hi Dmitry,

Can it be that RSS, to which non-matching traffic goes by default,
is configured in a way that steers each destination IP to a single queue,
and that this 15 Mpps is in fact how much one core can read from a queue?
In general, it is always worth trying to reproduce the issue with testpmd
and to describe the flow rules in full testpmd format ("flow create ...").


* Re: Flow rules performance with ConnectX-6 Dx
  2022-03-11 11:37 ` Dmitry Kozlyuk
@ 2022-03-11 12:58   ` Дмитрий Степанов
  2022-03-18 14:54     ` Dmitry Kozlyuk
  0 siblings, 1 reply; 5+ messages in thread
From: Дмитрий Степанов @ 2022-03-11 12:58 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: users


Hey, Dmitry!
Thanks for the reply!

I'm using the global RSS configuration (set up via rte_eth_dev_configure),
which distributes incoming packets across queues, each handled by its own
lcore. I've checked that incoming traffic is distributed evenly among them:
for example, with 16 queues (lcores) I see about 900 Kpps per lcore, which
sums to roughly 15 Mpps.

I was able to reproduce the same behavior with the testpmd utility.

My steps:

- Start generator at 50 Mpps with 2 IP dest addresses: 10.0.0.1 and 10.0.0.2

- Start testpmd in interactive mode with 16 queues/lcores:

numactl -N 1 -m 1 ./dpdk-testpmd -l 64-127 -a 0000:c1:00.0 -- --nb-cores=16 --rxq=16 --txq=16 -i

- Create flow rule:

testpmd> flow create 0 group 0 priority 0 ingress pattern eth / ipv4 dst is 10.0.0.2 / end actions drop / end

- Start forwarding:

testpmd> start

- Show stats (they show the same 15 Mpps instead of the expected 25 Mpps):

testpmd> show port stats 0

  ######################## NIC statistics for port 0 ########################
  RX-packets: 1127219612 RX-missed: 0          RX-bytes:  67633178722
  RX-errors: 0
  RX-nombuf:  0
  TX-packets: 1127219393 TX-errors: 0          TX-bytes:  67633171416

  Throughput (since last show)
  Rx-pps:     14759286          Rx-bps:   7084457512
  Tx-pps:     14758730          Tx-bps:   7084315448

############################################################################

- Ensure incoming traffic is properly distributed among queues (lcores):

testpmd> show port xstats 0

rx_q0_packets: 21841125
rx_q1_packets: 21847375
rx_q2_packets: 21833731
rx_q3_packets: 21837461
rx_q4_packets: 21842922
rx_q5_packets: 21843999
rx_q6_packets: 21838775
rx_q7_packets: 21833429
rx_q8_packets: 21838033
rx_q9_packets: 21835210
rx_q10_packets: 21833261
rx_q11_packets: 21833059
rx_q12_packets: 21849831
rx_q13_packets: 21843589
rx_q14_packets: 21842721
rx_q15_packets: 21834222

- If I use destination IP addresses that don't match the drop rule
(replacing 10.0.0.2 with 10.0.0.3), I get the expected 50 Mpps:

  ######################## NIC statistics for port 0 ########################
  RX-packets: 1988576249 RX-missed: 0          RX-bytes:  119314577228
  RX-errors: 0
  RX-nombuf:  0
  TX-packets: 1988576248 TX-errors: 0          TX-bytes:  119314576882

  Throughput (since last show)
  Rx-pps:     49999534          Rx-bps:  23999776424
  Tx-pps:     49999580          Tx-bps:  23999776424

############################################################################

Fri, 11 Mar 2022 at 14:37, Dmitry Kozlyuk <dkozlyuk@nvidia.com>:

> Hi Dmitry,
>
> Can it be that RSS, to which non-matching traffic gets by default,
> is configured in a way that steers each destination IP to one queue?
> And this 15 Mpps is in fact how much a core can read from a queue?
> In general, it is always worth trying to reproduce the issue with testpmd
> and to describe flow rules in full testpmd format ("flow create...").
>



* RE: Flow rules performance with ConnectX-6 Dx
  2022-03-11 12:58   ` Дмитрий Степанов
@ 2022-03-18 14:54     ` Dmitry Kozlyuk
  2022-03-21  8:40       ` Дмитрий Степанов
  0 siblings, 1 reply; 5+ messages in thread
From: Dmitry Kozlyuk @ 2022-03-18 14:54 UTC (permalink / raw)
  To: Дмитрий
	Степанов
  Cc: users

Hi Dmitry,

Can you check if the issue reproduces for you when using group 1?
Group 0 is special for mlx5, its behavior differs from other groups.
In testpmd terms:

flow create 0 ingress pattern end actions jump group 1 / end
flow create 0 ingress group 1 pattern eth / ipv4 dst is 10.0.0.2 / end actions drop / end
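
Or, sketched with the rte_flow API (illustration only; error handling
omitted, and struct rte_flow_action_jump carries the target group):

```c
/* Sketch of the group-1 approach in C: jump everything from group 0 to
 * group 1, then drop IPv4 dst 10.0.0.2 there. Illustration only. */
#include <rte_byteorder.h>
#include <rte_flow.h>
#include <rte_ip.h>

static void
create_group1_drop(uint16_t port_id, struct rte_flow_error *err)
{
	/* Rule 1, group 0: jump all ingress traffic to group 1. */
	struct rte_flow_attr attr0 = { .group = 0, .ingress = 1 };
	struct rte_flow_item any[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	struct rte_flow_action_jump jump = { .group = 1 };
	struct rte_flow_action jump_actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_JUMP, .conf = &jump },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};
	rte_flow_create(port_id, &attr0, any, jump_actions, err);

	/* Rule 2, group 1: drop packets with IPv4 dst 10.0.0.2/32. */
	struct rte_flow_attr attr1 = { .group = 1, .ingress = 1 };
	struct rte_flow_item_ipv4 spec = {
		.hdr.dst_addr = RTE_BE32(RTE_IPV4(10, 0, 0, 2)),
	};
	struct rte_flow_item_ipv4 mask = {
		.hdr.dst_addr = RTE_BE32(0xffffffff),
	};
	struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_ETH },
		{ .type = RTE_FLOW_ITEM_TYPE_IPV4,
		  .spec = &spec, .mask = &mask },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	struct rte_flow_action drop_actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_DROP },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};
	rte_flow_create(port_id, &attr1, pattern, drop_actions, err);
}
```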


* Re: Flow rules performance with ConnectX-6 Dx
  2022-03-18 14:54     ` Dmitry Kozlyuk
@ 2022-03-21  8:40       ` Дмитрий Степанов
  0 siblings, 0 replies; 5+ messages in thread
From: Дмитрий Степанов @ 2022-03-21  8:40 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: users


For some reason I'm not able to create the group 1 / jump rules. I get
these errors:

testpmd> flow create 0 ingress pattern end actions jump group 1 / end
port_flow_complain(): Caught PMD error type 16 (specific action): cannot create jump action.: Operation not supported
testpmd> flow create 0 ingress group 1 pattern eth / ipv4 dst is 10.0.0.2 / end actions drop / end
port_flow_complain(): Caught PMD error type 1 (cause unspecified): cannot get table: Cannot allocate memory

This looks similar to
https://www.mail-archive.com/dev@dpdk.org/msg152132.html
I've checked the UCTX_EN bit, and it is already set on the NIC:

mstconfig -d 0000:c1:00.0 q | grep UCTX_EN
         UCTX_EN                             True(1)



Fri, 18 Mar 2022 at 17:54, Dmitry Kozlyuk <dkozlyuk@nvidia.com>:

> Hi Dmitry,
>
> Can you check if the issue reproduces for you when using group 1?
> Group 0 is special for mlx5, its behavior differs from other groups.
> In testpmd terms:
>
> flow create 0 ingress pattern end actions jump group 1 / end
> flow create 0 ingress group 1 pattern eth / ipv4 dst is 10.0.0.2 / end
> actions drop / end
>



end of thread, other threads:[~2022-03-21  8:41 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-11 11:10 Flow rules performance with ConnectX-6 Dx Дмитрий Степанов
2022-03-11 11:37 ` Dmitry Kozlyuk
2022-03-11 12:58   ` Дмитрий Степанов
2022-03-18 14:54     ` Dmitry Kozlyuk
2022-03-21  8:40       ` Дмитрий Степанов
