* [dpdk-dev] [PATCH] ethdev: make struct rte_eth_dev cache aligned @ 2016-05-02 8:07 Jerin Jacob 2016-05-03 9:40 ` Bruce Richardson 2016-05-03 12:42 ` [dpdk-dev] [PATCH v2] " Jerin Jacob 0 siblings, 2 replies; 10+ messages in thread From: Jerin Jacob @ 2016-05-02 8:07 UTC (permalink / raw) To: dev; +Cc: thomas.monjalon, Jerin Jacob Elements of struct rte_eth_dev used in the fast path. Make struct rte_eth_dev cache aligned to avoid the cases where rte_eth_dev elements share the same cache line with other structures. Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> --- lib/librte_ether/rte_ethdev.c | 2 +- lib/librte_ether/rte_ethdev.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index a31018e..04f492d 100644 --- a/lib/librte_ether/rte_ethdev.c +++ b/lib/librte_ether/rte_ethdev.c @@ -70,7 +70,7 @@ #include "rte_ethdev.h" static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data"; -struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS]; +struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS] __rte_cache_aligned; static struct rte_eth_dev_data *rte_eth_dev_data; static uint8_t nb_ports; diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h index 8519ff6..e359dda 100644 --- a/lib/librte_ether/rte_ethdev.h +++ b/lib/librte_ether/rte_ethdev.h @@ -1630,7 +1630,7 @@ struct rte_eth_dev { struct rte_eth_rxtx_callback *pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT]; uint8_t attached; /**< Flag indicating the port is attached */ enum rte_eth_dev_type dev_type; /**< Flag indicating the device type */ -}; +} __rte_cache_aligned; struct rte_eth_dev_sriov { uint8_t active; /**< SRIOV is active with 16, 32 or 64 pools */ -- 2.1.0 ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [dpdk-dev] [PATCH] ethdev: make struct rte_eth_dev cache aligned 2016-05-02 8:07 [dpdk-dev] [PATCH] ethdev: make struct rte_eth_dev cache aligned Jerin Jacob @ 2016-05-03 9:40 ` Bruce Richardson 2016-05-03 12:10 ` Jerin Jacob 2016-05-03 12:42 ` [dpdk-dev] [PATCH v2] " Jerin Jacob 1 sibling, 1 reply; 10+ messages in thread From: Bruce Richardson @ 2016-05-03 9:40 UTC (permalink / raw) To: Jerin Jacob; +Cc: dev, thomas.monjalon On Mon, May 02, 2016 at 01:37:45PM +0530, Jerin Jacob wrote: > Elements of struct rte_eth_dev used in the fast path. > Make struct rte_eth_dev cache aligned to avoid the cases where > rte_eth_dev elements share the same cache line with other structures. > > Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> > --- > lib/librte_ether/rte_ethdev.c | 2 +- > lib/librte_ether/rte_ethdev.h | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c > index a31018e..04f492d 100644 > --- a/lib/librte_ether/rte_ethdev.c > +++ b/lib/librte_ether/rte_ethdev.c > @@ -70,7 +70,7 @@ > #include "rte_ethdev.h" > > static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data"; > -struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS]; > +struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS] __rte_cache_aligned; > static struct rte_eth_dev_data *rte_eth_dev_data; > static uint8_t nb_ports; > > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h > index 8519ff6..e359dda 100644 > --- a/lib/librte_ether/rte_ethdev.h > +++ b/lib/librte_ether/rte_ethdev.h > @@ -1630,7 +1630,7 @@ struct rte_eth_dev { > struct rte_eth_rxtx_callback *pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT]; > uint8_t attached; /**< Flag indicating the port is attached */ > enum rte_eth_dev_type dev_type; /**< Flag indicating the device type */ > -}; > +} __rte_cache_aligned; > > struct rte_eth_dev_sriov { > uint8_t active; /**< SRIOV is active with 16, 32 or 64 pools */ > -- Adding cache aligned in two places is overkill, I think. If the structure is marked as being cache aligned, there is no need to have the rte_eth_devices marked that way too. I suggest therefore just keeping the structure alignment. Regards, /Bruce ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [dpdk-dev] [PATCH] ethdev: make struct rte_eth_dev cache aligned 2016-05-03 9:40 ` Bruce Richardson @ 2016-05-03 12:10 ` Jerin Jacob 0 siblings, 0 replies; 10+ messages in thread From: Jerin Jacob @ 2016-05-03 12:10 UTC (permalink / raw) To: Bruce Richardson; +Cc: dev, thomas.monjalon On Tue, May 03, 2016 at 10:40:53AM +0100, Bruce Richardson wrote: > On Mon, May 02, 2016 at 01:37:45PM +0530, Jerin Jacob wrote: > > Elements of struct rte_eth_dev used in the fast path. > > Make struct rte_eth_dev cache aligned to avoid the cases where > > rte_eth_dev elements share the same cache line with other structures. > > > > Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> > > --- > > lib/librte_ether/rte_ethdev.c | 2 +- > > lib/librte_ether/rte_ethdev.h | 2 +- > > 2 files changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c > > index a31018e..04f492d 100644 > > --- a/lib/librte_ether/rte_ethdev.c > > +++ b/lib/librte_ether/rte_ethdev.c > > @@ -70,7 +70,7 @@ > > #include "rte_ethdev.h" > > > > static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data"; > > -struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS]; > > +struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS] __rte_cache_aligned; > > static struct rte_eth_dev_data *rte_eth_dev_data; > > static uint8_t nb_ports; > > > > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h > > index 8519ff6..e359dda 100644 > > --- a/lib/librte_ether/rte_ethdev.h > > +++ b/lib/librte_ether/rte_ethdev.h > > @@ -1630,7 +1630,7 @@ struct rte_eth_dev { > > struct rte_eth_rxtx_callback *pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT]; > > uint8_t attached; /**< Flag indicating the port is attached */ > > enum rte_eth_dev_type dev_type; /**< Flag indicating the device type */ > > -}; > > +} __rte_cache_aligned; > > > > struct rte_eth_dev_sriov { > > uint8_t active; /**< SRIOV is active with 16, 32 or 64 pools */ > > -- > Adding cache aligned in two places is overkill, I think. If the structure is > marked as being cache aligned, there is no need to have the rte_eth_devices > marked that way too. I suggest therefore just keeping the structure alignment. OK. Will fix it in v2 > > Regards, > /Bruce ^ permalink raw reply [flat|nested] 10+ messages in thread
* [dpdk-dev] [PATCH v2] ethdev: make struct rte_eth_dev cache aligned 2016-05-02 8:07 [dpdk-dev] [PATCH] ethdev: make struct rte_eth_dev cache aligned Jerin Jacob 2016-05-03 9:40 ` Bruce Richardson @ 2016-05-03 12:42 ` Jerin Jacob 2016-05-04 11:09 ` Bruce Richardson 2016-06-22 21:20 ` Thomas Monjalon 1 sibling, 2 replies; 10+ messages in thread From: Jerin Jacob @ 2016-05-03 12:42 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, thomas.monjalon, Jerin Jacob Elements of struct rte_eth_dev used in the fast path. Make struct rte_eth_dev cache aligned to avoid the cases where rte_eth_dev elements share the same cache line with other structures. Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> --- v2: Remove __rte_cache_aligned from rte_eth_devices and keep it only at struct rte_eth_dev definition as suggested by Bruce http://dpdk.org/dev/patchwork/patch/12328/ --- lib/librte_ether/rte_ethdev.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h index 2757510..48f14d5 100644 --- a/lib/librte_ether/rte_ethdev.h +++ b/lib/librte_ether/rte_ethdev.h @@ -1615,7 +1615,7 @@ struct rte_eth_dev { struct rte_eth_rxtx_callback *pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT]; uint8_t attached; /**< Flag indicating the port is attached */ enum rte_eth_dev_type dev_type; /**< Flag indicating the device type */ -}; +} __rte_cache_aligned; struct rte_eth_dev_sriov { uint8_t active; /**< SRIOV is active with 16, 32 or 64 pools */ -- 2.1.0 ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ethdev: make struct rte_eth_dev cache aligned 2016-05-03 12:42 ` [dpdk-dev] [PATCH v2] " Jerin Jacob @ 2016-05-04 11:09 ` Bruce Richardson 2016-05-04 13:42 ` Jerin Jacob 2016-06-22 21:20 ` Thomas Monjalon 1 sibling, 1 reply; 10+ messages in thread From: Bruce Richardson @ 2016-05-04 11:09 UTC (permalink / raw) To: Jerin Jacob; +Cc: dev, thomas.monjalon On Tue, May 03, 2016 at 06:12:07PM +0530, Jerin Jacob wrote: > Elements of struct rte_eth_dev used in the fast path. > Make struct rte_eth_dev cache aligned to avoid the cases where > rte_eth_dev elements share the same cache line with other structures. > > Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> > --- > v2: > Remove __rte_cache_aligned from rte_eth_devices and keep > it only at struct rte_eth_dev definition as suggested by Bruce > http://dpdk.org/dev/patchwork/patch/12328/ > --- > lib/librte_ether/rte_ethdev.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h > index 2757510..48f14d5 100644 > --- a/lib/librte_ether/rte_ethdev.h > +++ b/lib/librte_ether/rte_ethdev.h > @@ -1615,7 +1615,7 @@ struct rte_eth_dev { > struct rte_eth_rxtx_callback *pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT]; > uint8_t attached; /**< Flag indicating the port is attached */ > enum rte_eth_dev_type dev_type; /**< Flag indicating the device type */ > -}; > +} __rte_cache_aligned; > > struct rte_eth_dev_sriov { > uint8_t active; /**< SRIOV is active with 16, 32 or 64 pools */ > -- Hi Jerin, have you seen a performance degradation due to ethdev elements sharing a cache line? I ask because, surprisingly for me, I actually see a performance regression when I apply the above patch. It's not a big change - perf reduction of <1% - but still noticable across multiple runs using testpmd. I'm using two 1x40G NICs using i40e driver, and I see ~100kpps less traffic per port after applying the patch. [CPU: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz] Testpmd cmd and output shown below. Regards, /Bruce $ sudo ./x86_64-native-linuxapp-gcc/app/testpmd -c C0000 -n 4 -- --rxd=512 --txd=512 --numa EAL: Detected 36 lcore(s) EAL: Probing VFIO support... PMD: nfp_net_pmd_init(): librte_pmd_nfp_net version 0.1 EAL: PCI device 0000:01:00.0 on NUMA socket 0 EAL: probe driver: 8086:1521 rte_igb_pmd EAL: PCI device 0000:01:00.1 on NUMA socket 0 EAL: probe driver: 8086:1521 rte_igb_pmd EAL: PCI device 0000:05:00.0 on NUMA socket 0 EAL: probe driver: 8086:154a rte_ixgbe_pmd EAL: PCI device 0000:05:00.1 on NUMA socket 0 EAL: probe driver: 8086:154a rte_ixgbe_pmd EAL: PCI device 0000:08:00.0 on NUMA socket 0 EAL: probe driver: 8086:154a rte_ixgbe_pmd EAL: PCI device 0000:08:00.1 on NUMA socket 0 EAL: probe driver: 8086:154a rte_ixgbe_pmd EAL: PCI device 0000:81:00.0 on NUMA socket 1 EAL: probe driver: 8086:1584 rte_i40e_pmd PMD: eth_i40e_dev_init(): FW 5.0 API 1.5 NVM 05.00.02 eetrack 80002281 EAL: PCI device 0000:88:00.0 on NUMA socket 1 EAL: probe driver: 8086:1584 rte_i40e_pmd PMD: eth_i40e_dev_init(): FW 5.0 API 1.5 NVM 05.00.02 eetrack 80002281 Configuring Port 0 (socket 1) Port 0: 68:05:CA:27:D4:4E Configuring Port 1 (socket 1) Port 1: 68:05:CA:27:D2:0A Checking link statuses... Port 0 Link Up - speed 40000 Mbps - full-duplex Port 1 Link Up - speed 40000 Mbps - full-duplex Done No commandline core given, start packet forwarding io packet forwarding - CRC stripping disabled - packets/burst=32 nb forwarding cores=1 - nb forwarding ports=2 RX queues=1 - RX desc=512 - RX free threshold=32 RX threshold registers: pthresh=8 hthresh=8 wthresh=0 TX queues=1 - TX desc=512 - TX free threshold=32 TX threshold registers: pthresh=32 hthresh=0 wthresh=0 TX RS bit threshold=32 - TXQ flags=0xf01 Press enter to exit Telling cores to stop... Waiting for lcores to finish... ---------------------- Forward statistics for port 0 ---------------------- RX-packets: 1940564672 RX-dropped: 1456035742 RX-total: 3396600414 TX-packets: 1940564736 TX-dropped: 0 TX-total: 1940564736 ---------------------------------------------------------------------------- ---------------------- Forward statistics for port 1 ---------------------- RX-packets: 1940564671 RX-dropped: 1456036082 RX-total: 3396600753 TX-packets: 1940564736 TX-dropped: 0 TX-total: 1940564736 ---------------------------------------------------------------------------- +++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++ RX-packets: 3881129343 RX-dropped: 2912071824 RX-total: 6793201167 TX-packets: 3881129472 TX-dropped: 0 TX-total: 3881129472 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Done. Shutting down port 0... Stopping ports... Done Closing ports... Done Shutting down port 1... Stopping ports... Done Closing ports... Done Bye... ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ethdev: make struct rte_eth_dev cache aligned 2016-05-04 11:09 ` Bruce Richardson @ 2016-05-04 13:42 ` Jerin Jacob 2016-05-04 13:53 ` Richardson, Bruce 0 siblings, 1 reply; 10+ messages in thread From: Jerin Jacob @ 2016-05-04 13:42 UTC (permalink / raw) To: Bruce Richardson; +Cc: dev, thomas.monjalon On Wed, May 04, 2016 at 12:09:50PM +0100, Bruce Richardson wrote: > On Tue, May 03, 2016 at 06:12:07PM +0530, Jerin Jacob wrote: > > Elements of struct rte_eth_dev used in the fast path. > > Make struct rte_eth_dev cache aligned to avoid the cases where > > rte_eth_dev elements share the same cache line with other structures. > > > > Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> > > --- > > v2: > > Remove __rte_cache_aligned from rte_eth_devices and keep > > it only at struct rte_eth_dev definition as suggested by Bruce > > http://dpdk.org/dev/patchwork/patch/12328/ > > --- > > lib/librte_ether/rte_ethdev.h | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h > > index 2757510..48f14d5 100644 > > --- a/lib/librte_ether/rte_ethdev.h > > +++ b/lib/librte_ether/rte_ethdev.h > > @@ -1615,7 +1615,7 @@ struct rte_eth_dev { > > struct rte_eth_rxtx_callback *pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT]; > > uint8_t attached; /**< Flag indicating the port is attached */ > > enum rte_eth_dev_type dev_type; /**< Flag indicating the device type */ > > -}; > > +} __rte_cache_aligned; > > > > struct rte_eth_dev_sriov { > > uint8_t active; /**< SRIOV is active with 16, 32 or 64 pools */ > > -- > > Hi Jerin, Hi Bruce, > > have you seen a performance degradation due to ethdev elements sharing a cache No. Not because of sharing the cache line. > line? I ask because, surprisingly for me, I actually see a performance regression I see performance degradation in PMD in my setup where independent changes are causing the performance issue in PMD(~<100k). That's the reason I thought making aligned cache line stuff where ever it makes sense so that independent change shouldn't impact the PMD performance and this patch was an initiative for the same. > when I apply the above patch. It's not a big change - perf reduction of <1% - but > still noticable across multiple runs using testpmd. I'm using two 1x40G NICs > using i40e driver, and I see ~100kpps less traffic per port after applying the > patch. [CPU: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz] This particular patch does not have any performance degradation in my setup. CPU: ThunderX > > Testpmd cmd and output shown below. > > Regards, > /Bruce > > $ sudo ./x86_64-native-linuxapp-gcc/app/testpmd -c C0000 -n 4 -- --rxd=512 --txd=512 --numa > EAL: Detected 36 lcore(s) > EAL: Probing VFIO support... > PMD: nfp_net_pmd_init(): librte_pmd_nfp_net version 0.1 > > EAL: PCI device 0000:01:00.0 on NUMA socket 0 > EAL: probe driver: 8086:1521 rte_igb_pmd > EAL: PCI device 0000:01:00.1 on NUMA socket 0 > EAL: probe driver: 8086:1521 rte_igb_pmd > EAL: PCI device 0000:05:00.0 on NUMA socket 0 > EAL: probe driver: 8086:154a rte_ixgbe_pmd > EAL: PCI device 0000:05:00.1 on NUMA socket 0 > EAL: probe driver: 8086:154a rte_ixgbe_pmd > EAL: PCI device 0000:08:00.0 on NUMA socket 0 > EAL: probe driver: 8086:154a rte_ixgbe_pmd > EAL: PCI device 0000:08:00.1 on NUMA socket 0 > EAL: probe driver: 8086:154a rte_ixgbe_pmd > EAL: PCI device 0000:81:00.0 on NUMA socket 1 > EAL: probe driver: 8086:1584 rte_i40e_pmd > PMD: eth_i40e_dev_init(): FW 5.0 API 1.5 NVM 05.00.02 eetrack 80002281 > EAL: PCI device 0000:88:00.0 on NUMA socket 1 > EAL: probe driver: 8086:1584 rte_i40e_pmd > PMD: eth_i40e_dev_init(): FW 5.0 API 1.5 NVM 05.00.02 eetrack 80002281 > Configuring Port 0 (socket 1) > Port 0: 68:05:CA:27:D4:4E > Configuring Port 1 (socket 1) > Port 1: 68:05:CA:27:D2:0A > Checking link statuses... > Port 0 Link Up - speed 40000 Mbps - full-duplex > Port 1 Link Up - speed 40000 Mbps - full-duplex > Done > No commandline core given, start packet forwarding > io packet forwarding - CRC stripping disabled - packets/burst=32 > nb forwarding cores=1 - nb forwarding ports=2 > RX queues=1 - RX desc=512 - RX free threshold=32 > RX threshold registers: pthresh=8 hthresh=8 wthresh=0 > TX queues=1 - TX desc=512 - TX free threshold=32 > TX threshold registers: pthresh=32 hthresh=0 wthresh=0 > TX RS bit threshold=32 - TXQ flags=0xf01 > Press enter to exit > > Telling cores to stop... > Waiting for lcores to finish... > > ---------------------- Forward statistics for port 0 ---------------------- > RX-packets: 1940564672 RX-dropped: 1456035742 RX-total: 3396600414 > TX-packets: 1940564736 TX-dropped: 0 TX-total: 1940564736 > ---------------------------------------------------------------------------- > > ---------------------- Forward statistics for port 1 ---------------------- > RX-packets: 1940564671 RX-dropped: 1456036082 RX-total: 3396600753 > TX-packets: 1940564736 TX-dropped: 0 TX-total: 1940564736 > ---------------------------------------------------------------------------- > > +++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++ > RX-packets: 3881129343 RX-dropped: 2912071824 RX-total: 6793201167 > TX-packets: 3881129472 TX-dropped: 0 TX-total: 3881129472 > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > Done. > > Shutting down port 0... > Stopping ports... > Done > Closing ports... > Done > > Shutting down port 1... > Stopping ports... > Done > Closing ports... > Done > > Bye... > > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ethdev: make struct rte_eth_dev cache aligned 2016-05-04 13:42 ` Jerin Jacob @ 2016-05-04 13:53 ` Richardson, Bruce 2016-05-04 15:19 ` Jerin Jacob 0 siblings, 1 reply; 10+ messages in thread From: Richardson, Bruce @ 2016-05-04 13:53 UTC (permalink / raw) To: Jerin Jacob; +Cc: dev, thomas.monjalon > -----Original Message----- > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com] > Sent: Wednesday, May 4, 2016 2:43 PM > To: Richardson, Bruce <bruce.richardson@intel.com> > Cc: dev@dpdk.org; thomas.monjalon@6wind.com > Subject: Re: [dpdk-dev] [PATCH v2] ethdev: make struct rte_eth_dev cache > aligned > > On Wed, May 04, 2016 at 12:09:50PM +0100, Bruce Richardson wrote: > > On Tue, May 03, 2016 at 06:12:07PM +0530, Jerin Jacob wrote: > > > Elements of struct rte_eth_dev used in the fast path. > > > Make struct rte_eth_dev cache aligned to avoid the cases where > > > rte_eth_dev elements share the same cache line with other structures. > > > > > > Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> > > > --- > > > v2: > > > Remove __rte_cache_aligned from rte_eth_devices and keep it only at > > > struct rte_eth_dev definition as suggested by Bruce > > > http://dpdk.org/dev/patchwork/patch/12328/ > > > --- > > > lib/librte_ether/rte_ethdev.h | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/lib/librte_ether/rte_ethdev.h > > > b/lib/librte_ether/rte_ethdev.h index 2757510..48f14d5 100644 > > > --- a/lib/librte_ether/rte_ethdev.h > > > +++ b/lib/librte_ether/rte_ethdev.h > > > @@ -1615,7 +1615,7 @@ struct rte_eth_dev { > > > struct rte_eth_rxtx_callback > *pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT]; > > > uint8_t attached; /**< Flag indicating the port is attached */ > > > enum rte_eth_dev_type dev_type; /**< Flag indicating the device > > > type */ -}; > > > +} __rte_cache_aligned; > > > > > > struct rte_eth_dev_sriov { > > > uint8_t active; /**< SRIOV is active with 16, 32 or 64 > pools */ > > > -- > > > > Hi Jerin, > > Hi Bruce, > > > > > have you seen a performance degradation due to ethdev elements sharing > > a cache > > No. Not because of sharing the cache line. > > > line? I ask because, surprisingly for me, I actually see a performance > > regression > > I see performance degradation in PMD in my setup where independent changes > are causing the performance issue in PMD(~<100k). That's the reason I > thought making aligned cache line stuff where ever it makes sense so that > independent change shouldn't impact the PMD performance and this patch was > an initiative for the same. > > > when I apply the above patch. It's not a big change - perf reduction > > of <1% - but still noticable across multiple runs using testpmd. I'm > > using two 1x40G NICs using i40e driver, and I see ~100kpps less > > traffic per port after applying the patch. [CPU: Intel(R) Xeon(R) CPU > > E5-2699 v3 @ 2.30GHz] > > This particular patch does not have any performance degradation in my > setup. > CPU: ThunderX Ok, so I take it that this patch is performance neutral on your setup, then? If that's the case, can we hold off on merging it on the basis that it's not needed and does cause a slight regression. Thanks, /Bruce ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ethdev: make struct rte_eth_dev cache aligned 2016-05-04 13:53 ` Richardson, Bruce @ 2016-05-04 15:19 ` Jerin Jacob 2016-05-04 15:48 ` Bruce Richardson 0 siblings, 1 reply; 10+ messages in thread From: Jerin Jacob @ 2016-05-04 15:19 UTC (permalink / raw) To: Richardson, Bruce; +Cc: dev, thomas.monjalon On Wed, May 04, 2016 at 01:53:39PM +0000, Richardson, Bruce wrote: > > > > -----Original Message----- > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com] > > Sent: Wednesday, May 4, 2016 2:43 PM > > To: Richardson, Bruce <bruce.richardson@intel.com> > > Cc: dev@dpdk.org; thomas.monjalon@6wind.com > > Subject: Re: [dpdk-dev] [PATCH v2] ethdev: make struct rte_eth_dev cache > > aligned > > > > On Wed, May 04, 2016 at 12:09:50PM +0100, Bruce Richardson wrote: Snip > > > > > > Hi Jerin, > > > > Hi Bruce, > > > > > > > > have you seen a performance degradation due to ethdev elements sharing > > > a cache > > > > No. Not because of sharing the cache line. > > > > > line? I ask because, surprisingly for me, I actually see a performance > > > regression > > > > I see performance degradation in PMD in my setup where independent changes > > are causing the performance issue in PMD(~<100k). That's the reason I > > thought making aligned cache line stuff where ever it makes sense so that > > independent change shouldn't impact the PMD performance and this patch was > > an initiative for the same. > > > > > when I apply the above patch. It's not a big change - perf reduction > > > of <1% - but still noticable across multiple runs using testpmd. I'm > > > using two 1x40G NICs using i40e driver, and I see ~100kpps less > > > traffic per port after applying the patch. [CPU: Intel(R) Xeon(R) CPU > > > E5-2699 v3 @ 2.30GHz] > > > > This particular patch does not have any performance degradation in my > > setup. > > CPU: ThunderX > > Ok, so I take it that this patch is performance neutral on your setup, then? > If that's the case, can we hold off on merging it on the basis that it's not needed and does cause a slight regression. OK Can you share the base dpdk.org upstream change set where this patch creates the slight regression? I will test it the same on IA and ThunderX platform. In my testpmd build, rte_eth_devices(0x0000000000751ef8) share the cache line with inactive "notify_ops" that the reason for its not creating any benefit. I guess the case would have been different if its active write location. COMMON 0x0000000000751ef0 0x8 /home/jerin/dpdk-thunderx/build/lib/librte_vhost.a(virtio-net.o) 0x0000000000751ef0 notify_ops COMMON 0x0000000000751ef8 0x80900 /home/jerin/dpdk-thunderx/build/lib/libethdev.a(rte_ethdev.o) 0x0000000000751ef8 rte_eth_devices Thanks, Jerin > > Thanks, > /Bruce ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ethdev: make struct rte_eth_dev cache aligned 2016-05-04 15:19 ` Jerin Jacob @ 2016-05-04 15:48 ` Bruce Richardson 0 siblings, 0 replies; 10+ messages in thread From: Bruce Richardson @ 2016-05-04 15:48 UTC (permalink / raw) To: Jerin Jacob; +Cc: dev, thomas.monjalon On Wed, May 04, 2016 at 08:49:38PM +0530, Jerin Jacob wrote: > On Wed, May 04, 2016 at 01:53:39PM +0000, Richardson, Bruce wrote: > > > > > > > -----Original Message----- > > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com] > > > Sent: Wednesday, May 4, 2016 2:43 PM > > > To: Richardson, Bruce <bruce.richardson@intel.com> > > > Cc: dev@dpdk.org; thomas.monjalon@6wind.com > > > Subject: Re: [dpdk-dev] [PATCH v2] ethdev: make struct rte_eth_dev cache > > > aligned > > > > > > On Wed, May 04, 2016 at 12:09:50PM +0100, Bruce Richardson wrote: > > Snip > > > > > > > > > Hi Jerin, > > > > > > Hi Bruce, > > > > > > > > > > > have you seen a performance degradation due to ethdev elements sharing > > > > a cache > > > > > > No. Not because of sharing the cache line. > > > > > > > line? I ask because, surprisingly for me, I actually see a performance > > > > regression > > > > > > I see performance degradation in PMD in my setup where independent changes > > > are causing the performance issue in PMD(~<100k). That's the reason I > > > thought making aligned cache line stuff where ever it makes sense so that > > > independent change shouldn't impact the PMD performance and this patch was > > > an initiative for the same. > > > > > > > when I apply the above patch. It's not a big change - perf reduction > > > > of <1% - but still noticable across multiple runs using testpmd. I'm > > > > using two 1x40G NICs using i40e driver, and I see ~100kpps less > > > > traffic per port after applying the patch. [CPU: Intel(R) Xeon(R) CPU > > > > E5-2699 v3 @ 2.30GHz] > > > > > > This particular patch does not have any performance degradation in my > > > setup. > > > CPU: ThunderX > > > > Ok, so I take it that this patch is performance neutral on your setup, then? > > If that's the case, can we hold off on merging it on the basis that it's not needed and does cause a slight regression. > > OK > > Can you share the base dpdk.org upstream change set where this patch > creates the slight regression? I will test it the same on IA and > ThunderX platform. > > In my testpmd build, rte_eth_devices(0x0000000000751ef8) share the cache > line with inactive "notify_ops" that the reason for its not creating any benefit. > I guess the case would have been different if its active write location. > > COMMON 0x0000000000751ef0 0x8 > /home/jerin/dpdk-thunderx/build/lib/librte_vhost.a(virtio-net.o) > 0x0000000000751ef0 notify_ops > COMMON 0x0000000000751ef8 0x80900 > /home/jerin/dpdk-thunderx/build/lib/libethdev.a(rte_ethdev.o) > 0x0000000000751ef8 rte_eth_devices > > Thanks, > Jerin > > > Hi Jerin, I see no change with the latest code on dpdk.org mainline. I do see a change though, with the latest code on the dpdk-next-net/rel_16_07 branch, which has additional i40e driver improvements on it. However, on re-running the tests, I realise I may have got my measurement results backwards and that this patch may actually cause a slight increase instead. :-( Therefore, I withdraw my current objections to this patch. But I would like to see someone else verify what performance difference, if any, this causes on different platforms, so your planned tests on ThunderX and IA would still be great to see! :-) Thanks, /Bruce ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [dpdk-dev] [PATCH v2] ethdev: make struct rte_eth_dev cache aligned 2016-05-03 12:42 ` [dpdk-dev] [PATCH v2] " Jerin Jacob 2016-05-04 11:09 ` Bruce Richardson @ 2016-06-22 21:20 ` Thomas Monjalon 1 sibling, 0 replies; 10+ messages in thread From: Thomas Monjalon @ 2016-06-22 21:20 UTC (permalink / raw) To: Jerin Jacob; +Cc: dev, bruce.richardson 2016-05-03 18:12, Jerin Jacob: > Elements of struct rte_eth_dev used in the fast path. > Make struct rte_eth_dev cache aligned to avoid the cases where > rte_eth_dev elements share the same cache line with other structures. > > Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Let's try it in real tests. Applied, thanks. ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2016-06-22 21:20 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-05-02 8:07 [dpdk-dev] [PATCH] ethdev: make struct rte_eth_dev cache aligned Jerin Jacob 2016-05-03 9:40 ` Bruce Richardson 2016-05-03 12:10 ` Jerin Jacob 2016-05-03 12:42 ` [dpdk-dev] [PATCH v2] " Jerin Jacob 2016-05-04 11:09 ` Bruce Richardson 2016-05-04 13:42 ` Jerin Jacob 2016-05-04 13:53 ` Richardson, Bruce 2016-05-04 15:19 ` Jerin Jacob 2016-05-04 15:48 ` Bruce Richardson 2016-06-22 21:20 ` Thomas Monjalon
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).