From: Edwin Brossette
Date: Thu, 15 Feb 2024 14:12:44 +0100
Subject: [ixgbevf] Problem with RSS initial config after device init on X550 nic
To: dev@dpdk.org
Cc: Olivier Matz

Hello,

We recently ran into an issue with our product when working with an X550
NIC on stable dpdk-23.11: all incoming traffic was directed to a single
queue.

The issue became evident after displaying the RSS reta, which was fully
zeroed after device init, thus directing all traffic to rxq0. Moreover, the
RSS hash key did not seem to be correctly initialized. Manually setting the
reta afterwards was enough to balance the incoming traffic between our
queues, which convinced me that the issue is simply a matter of correctly
initializing the device on port start.

Looking into the PMD's code, I couldn't see any RSS configuration done on
the VF side at device startup, at least not in ixgbevf_dev_rx_init(). I did
see that ixgbe_dev_mq_rx_configure() is called during the PF's init and
configures RSS to be handled by the VF if SR-IOV is on, but this isn't
enough to fully configure RSS for the VF.
(See the code here:
https://git.dpdk.org/dpdk/tree/drivers/net/ixgbe/ixgbe_rxtx.c#n4644 )

I have also observed that the different models of NICs using ixgbe do not
all handle RSS the same way. For example, the datasheet for the 82599
series says, for IOV mode: "— Note that RSS is not supported in IOV mode
since there is only a single RSS hash function in the hardware." X550 NICs,
on the contrary, have dedicated registers to handle RSS in the VF, such as
VFRSSRK and VFRETA. I believe the RSS config not being initialized for X550
NICs may stem from a slight misunderstanding on this point.

Therefore, I suggest a patch adding a call to ixgbe_rss_configure() in
ixgbevf_dev_rx_init(), specifically for this model of NIC. Despite the
function being named ixgbe_xxx rather than ixgbevf_xxx, it does the correct
RSS initialization in VF mode, because the helpers that access RSS-related
registers, such as ixgbe_reta_reg_get() or ixgbe_reta_size_get(), check
whether the device is in VF or PF mode and fetch the appropriate registers.
Here is a way to reproduce on an X550 card.

Here are the NICs I am using:

0000:08:00.0  ntfp1  ac:1f:6b:57:57:74  ixgbe    1x2.5 GT/s PCIe  1x2.5 GT/s PCIe  Intel Corporation Ethernet Connection X553 10 GbE SFP+
0000:08:00.1  ntfp2  ac:1f:6b:57:57:75  ixgbe    1x2.5 GT/s PCIe  1x2.5 GT/s PCIe  Intel Corporation Ethernet Connection X553 10 GbE SFP+
0000:08:10.0  eth0   d2:c4:fc:c5:c3:05  ixgbevf  1x2.5 GT/s PCIe  0xUnknown        Intel Corporation X553 Virtual Function
0000:08:10.2  eth1   e2:a8:68:09:20:29  ixgbevf  1x2.5 GT/s PCIe  0xUnknown        Intel Corporation X553 Virtual Function

    1) Starting up dpdk-testpmd:

sudo dpdk-hugepages.py --setup 2G
dpdk-devbind --bind=vfio-pci 0000:08:10.0
dpdk-devbind --bind=vfio-pci 0000:08:10.2

dpdk-testpmd -a 0000:08:10.0 -a 0000:08:10.2 -- -i --rxq=2 --txq=2 --coremask=0xff0 --total-num-mbufs=250000
EAL: Detected CPU lcores: 12
EAL: Detected NUMA nodes: 1
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: VFIO support initialized
EAL: Using IOMMU type 1 (Type 1)
EAL: Probe PCI driver: net_ixgbe_vf (8086:15c5) device: 0000:08:10.0 (socket -1)
EAL: Probe PCI driver: net_ixgbe_vf (8086:15c5) device: 0000:08:10.2 (socket -1)
Interactive-mode selected
previous number of forwarding cores 1 - changed to number of configured cores 8
Warning: NUMA should be configured manually by using --port-numa-config and --ring-numa-config parameters along with --numa.
testpmd: create a new mbuf pool <mb_pool_0>: n=250000, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
Port 0: 02:09:C0:9E:09:75
Configuring Port 1 (socket 0)
Port 1: 02:09:C0:76:6D:4B
Checking link statuses...
Done
testpmd>

    2) Display port info:

testpmd> show port info 0

********************* Infos for port 0  *********************
MAC address: 02:09:C0:B8:68:2F
Device name: 0000:08:10.0
Driver name: net_ixgbe_vf
Firmware-version: not available
Devargs:
Connect to socket: 0
memory allocation on the socket: 0
Link status: down
Link speed: None
Link duplex: half-duplex
Autoneg status: On
MTU: 1500
Promiscuous mode: disabled
Allmulticast mode: disabled
Maximum number of MAC addresses: 128
Maximum number of MAC addresses of hash filtering: 4096
VLAN offload:
  strip off, filter off, extend off, qinq strip off
Hash key size in bytes: 40
Redirection table size: 64
Supported RSS offload flow types:
  ipv4  ipv4-tcp  ipv4-udp  ipv6  ipv6-tcp  ipv6-udp  ipv6-ex
  ipv6-tcp-ex  ipv6-udp-ex
Minimum size of RX buffer: 1024
Maximum configurable length of RX packet: 9728
Maximum configurable size of LRO aggregated packet: 0
Maximum number of VMDq pools: 64
Current number of RX queues: 2
Max possible RX queues: 4
Max possible number of RXDs per queue: 4096
Min possible number of RXDs per queue: 32
RXDs number alignment: 8
Current number of TX queues: 2
Max possible TX queues: 4
Max possible number of TXDs per queue: 4096
Min possible number of TXDs per queue: 32
TXDs number alignment: 8
Max segment number per packet: 40
Max segment number per MTU/TSO: 40
Device capabilities: 0x0( )
Device error handling mode: passive
Device private info:
  none

    3) Display RSS conf:

testpmd> show port 0 rss-hash
ixgbe_dev_rss_hash_conf_get(): <log>: rss is enabled
RSS functions:
  ipv4  ipv4-tcp  ipv6  ipv6-tcp
testpmd> show port 0 rss-hash key
ixgbe_dev_rss_hash_conf_get(): <log>: rss is enabled
RSS functions:
  ipv4  ipv4-tcp  ipv6  ipv6-tcp
RSS key:
88F1A05B9FFCD601333EB3FF4176AE8836B36D67D4013A4B75F25806D17078D08C1EF6A69FF29A78
testpmd> show port 0 rss reta
 [UINT16]: show port <port_id> rss reta <size> <mask0[,mask1]*>
testpmd> show port 0 rss-hash algorithm
ixgbe_dev_rss_hash_conf_get(): <log>: rss is enabled
RSS algorithm:
  default
testpmd> show port 0 rss reta 64 (0xfff)
RSS RETA configuration: hash index=0, queue=0
RSS RETA configuration: hash index=1, queue=0
RSS RETA configuration: hash index=2, queue=0
RSS RETA configuration: hash index=3, queue=0
RSS RETA configuration: hash index=4, queue=0
RSS RETA configuration: hash index=5, queue=0
RSS RETA configuration: hash index=6, queue=0
RSS RETA configuration: hash index=7, queue=0
RSS RETA configuration: hash index=8, queue=0
RSS RETA configuration: hash index=9, queue=0
RSS RETA configuration: hash index=10, queue=0
RSS RETA configuration: hash index=11, queue=0

Here you can see the reta is full of 0s, which is what causes the
performance issue. The log appearing above, "ixgbe_dev_rss_hash_conf_get():
<log>: rss is enabled", is a custom log I added in
ixgbe_dev_rss_hash_conf_get(), just after the check that
ixgbe_rss_enabled() didn't return 0 (see
https://git.dpdk.org/dpdk/tree/drivers/net/ixgbe/ixgbe_rxtx.c#n3670 ).

I also had occurrences where the RSS key and algorithm weren't set. In
those cases, testpmd would tell me that RSS is disabled, even though it was
enabled and simply not configured. After adding my suggested patch, the key
set was the default key and the reta was correctly initialized (alternating
1s and 0s).
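As an interim workaround until the init path is fixed, the reta can be
repopulated by hand from the testpmd prompt with the `port config ... rss
reta` command (a sketch only; the (hash_index,queue) pairs below just
alternate the first four entries between the two queues, and the exact
syntax may vary across testpmd versions):

```
testpmd> port config 0 rss reta (0,0),(1,1),(2,0),(3,1)
testpmd> show port 0 rss reta 4 (0xf)
```

This mirrors what we did in our product: once the reta entries point at
both queues, incoming traffic balances correctly.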