DPDK patches and discussions
 help / color / mirror / Atom feed
* [PATCH] netvsc: optimize stats counters performance
@ 2024-08-02 14:40 Morten Brørup
  2024-08-02 16:48 ` Long Li
  0 siblings, 1 reply; 7+ messages in thread
From: Morten Brørup @ 2024-08-02 14:40 UTC (permalink / raw)
  To: longli, weh; +Cc: stephen, dev, Morten Brørup

Optimized the performance of updating the statistics counters by reducing
the number of branches.

Ordered the packet size comparisons according to the probability with
typical internet traffic mix.

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
---
 drivers/net/netvsc/hn_rxtx.c | 32 ++++++++++----------------------
 1 file changed, 10 insertions(+), 22 deletions(-)

diff --git a/drivers/net/netvsc/hn_rxtx.c b/drivers/net/netvsc/hn_rxtx.c
index 9bf1ec5509..b704b2c971 100644
--- a/drivers/net/netvsc/hn_rxtx.c
+++ b/drivers/net/netvsc/hn_rxtx.c
@@ -110,30 +110,18 @@ hn_update_packet_stats(struct hn_stats *stats, const struct rte_mbuf *m)
 	uint32_t s = m->pkt_len;
 	const struct rte_ether_addr *ea;
 
-	if (s == 64) {
-		stats->size_bins[1]++;
-	} else if (s > 64 && s < 1024) {
-		uint32_t bin;
-
-		/* count zeros, and offset into correct bin */
-		bin = (sizeof(s) * 8) - rte_clz32(s) - 5;
-		stats->size_bins[bin]++;
-	} else {
-		if (s < 64)
-			stats->size_bins[0]++;
-		else if (s < 1519)
-			stats->size_bins[6]++;
-		else
-			stats->size_bins[7]++;
-	}
+	if (s >= 1024)
+		stats->size_bins[6 + (s > 1518)]++;
+	else if (s <= 64)
+		stats->size_bins[s >> 6]++;
+	else
+		stats->size_bins[32UL - rte_clz32(s) - 5]++;
 
 	ea = rte_pktmbuf_mtod(m, const struct rte_ether_addr *);
-	if (rte_is_multicast_ether_addr(ea)) {
-		if (rte_is_broadcast_ether_addr(ea))
-			stats->broadcast++;
-		else
-			stats->multicast++;
-	}
+	RTE_BUILD_BUG_ON(offsetof(struct hn_stats, broadcast) !=
+			offsetof(struct hn_stats, multicast) + sizeof(uint64_t));
+	if (unlikely(rte_is_multicast_ether_addr(ea)))
+		(&stats->multicast)[rte_is_broadcast_ether_addr(ea)]++;
 }
 
 static inline unsigned int hn_rndis_pktlen(const struct rndis_packet_msg *pkt)
-- 
2.43.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH] netvsc: optimize stats counters performance
  2024-08-02 14:40 [PATCH] netvsc: optimize stats counters performance Morten Brørup
@ 2024-08-02 16:48 ` Long Li
  2024-08-02 17:28   ` Morten Brørup
  0 siblings, 1 reply; 7+ messages in thread
From: Long Li @ 2024-08-02 16:48 UTC (permalink / raw)
  To: Morten Brørup, Wei Hu; +Cc: stephen, dev

> Subject: [PATCH] netvsc: optimize stats counters performance
> 
> Optimized the performance of updating the statistics counters by reducing the
> number of branches.
> 
> Ordered the packet size comparisons according to the probability with typical
> internet traffic mix.
> 
> Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> ---
>  drivers/net/netvsc/hn_rxtx.c | 32 ++++++++++----------------------
>  1 file changed, 10 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/net/netvsc/hn_rxtx.c b/drivers/net/netvsc/hn_rxtx.c index
> 9bf1ec5509..b704b2c971 100644
> --- a/drivers/net/netvsc/hn_rxtx.c
> +++ b/drivers/net/netvsc/hn_rxtx.c
> @@ -110,30 +110,18 @@ hn_update_packet_stats(struct hn_stats *stats,
> const struct rte_mbuf *m)
>  	uint32_t s = m->pkt_len;
>  	const struct rte_ether_addr *ea;
> 
> -	if (s == 64) {
> -		stats->size_bins[1]++;
> -	} else if (s > 64 && s < 1024) {
> -		uint32_t bin;
> -
> -		/* count zeros, and offset into correct bin */
> -		bin = (sizeof(s) * 8) - rte_clz32(s) - 5;
> -		stats->size_bins[bin]++;
> -	} else {
> -		if (s < 64)
> -			stats->size_bins[0]++;
> -		else if (s < 1519)
> -			stats->size_bins[6]++;
> -		else
> -			stats->size_bins[7]++;
> -	}
> +	if (s >= 1024)
> +		stats->size_bins[6 + (s > 1518)]++;
> +	else if (s <= 64)
> +		stats->size_bins[s >> 6]++;
> +	else
> +		stats->size_bins[32UL - rte_clz32(s) - 5]++;

This part looks good.

> 
>  	ea = rte_pktmbuf_mtod(m, const struct rte_ether_addr *);
> -	if (rte_is_multicast_ether_addr(ea)) {
> -		if (rte_is_broadcast_ether_addr(ea))
> -			stats->broadcast++;
> -		else
> -			stats->multicast++;
> -	}
> +	RTE_BUILD_BUG_ON(offsetof(struct hn_stats, broadcast) !=
> +			offsetof(struct hn_stats, multicast) + sizeof(uint64_t));
> +	if (unlikely(rte_is_multicast_ether_addr(ea)))
> +		(&stats->multicast)[rte_is_broadcast_ether_addr(ea)]++;
>  }

This makes the code a little harder to read. How about just add "unlikely" to rte_is_multicast_ether_addr(ea) and keep the rest unchanged?

Thank you,

Long


^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH] netvsc: optimize stats counters performance
  2024-08-02 16:48 ` Long Li
@ 2024-08-02 17:28   ` Morten Brørup
  2024-08-02 17:33     ` Stephen Hemminger
  0 siblings, 1 reply; 7+ messages in thread
From: Morten Brørup @ 2024-08-02 17:28 UTC (permalink / raw)
  To: Long Li, Wei Hu, maxime.coquelin, chenbox, stephen; +Cc: dev

> From: Long Li [mailto:longli@microsoft.com]
> Sent: Friday, 2 August 2024 18.49
> 
> > Subject: [PATCH] netvsc: optimize stats counters performance
> >
> > Optimized the performance of updating the statistics counters by
> reducing the
> > number of branches.
> >
> > Ordered the packet size comparisons according to the probability with
> typical
> > internet traffic mix.
> >
> > Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> > ---
> >  drivers/net/netvsc/hn_rxtx.c | 32 ++++++++++----------------------
> >  1 file changed, 10 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/net/netvsc/hn_rxtx.c
> b/drivers/net/netvsc/hn_rxtx.c index
> > 9bf1ec5509..b704b2c971 100644
> > --- a/drivers/net/netvsc/hn_rxtx.c
> > +++ b/drivers/net/netvsc/hn_rxtx.c
> > @@ -110,30 +110,18 @@ hn_update_packet_stats(struct hn_stats *stats,
> > const struct rte_mbuf *m)
> >  	uint32_t s = m->pkt_len;
> >  	const struct rte_ether_addr *ea;
> >
> > -	if (s == 64) {
> > -		stats->size_bins[1]++;
> > -	} else if (s > 64 && s < 1024) {
> > -		uint32_t bin;
> > -
> > -		/* count zeros, and offset into correct bin */
> > -		bin = (sizeof(s) * 8) - rte_clz32(s) - 5;
> > -		stats->size_bins[bin]++;
> > -	} else {
> > -		if (s < 64)
> > -			stats->size_bins[0]++;
> > -		else if (s < 1519)
> > -			stats->size_bins[6]++;
> > -		else
> > -			stats->size_bins[7]++;
> > -	}
> > +	if (s >= 1024)
> > +		stats->size_bins[6 + (s > 1518)]++;
> > +	else if (s <= 64)
> > +		stats->size_bins[s >> 6]++;
> > +	else
> > +		stats->size_bins[32UL - rte_clz32(s) - 5]++;
> 
> This part looks good.
> 
> >
> >  	ea = rte_pktmbuf_mtod(m, const struct rte_ether_addr *);
> > -	if (rte_is_multicast_ether_addr(ea)) {
> > -		if (rte_is_broadcast_ether_addr(ea))
> > -			stats->broadcast++;
> > -		else
> > -			stats->multicast++;
> > -	}
> > +	RTE_BUILD_BUG_ON(offsetof(struct hn_stats, broadcast) !=
> > +			offsetof(struct hn_stats, multicast) +
> sizeof(uint64_t));
> > +	if (unlikely(rte_is_multicast_ether_addr(ea)))
> > +		(&stats->multicast)[rte_is_broadcast_ether_addr(ea)]++;
> >  }
> 
> This makes the code a little harder to read.

I agree it is somewhat convoluted.
It's a tradeoff... I preferred performance at the cost of making the code somewhat harder to read.
The RTE_BUILD_BUG_ON() also helps showing what is going on with the weird indexing.

> How about just add
> "unlikely" to rte_is_multicast_ether_addr(ea) and keep the rest
> unchanged?

Then I could also add "unlikely" to is_broadcast().
In theory, there should be minimal broadcast traffic in any normal network. (Except when experiencing broadcast storms due to network loops or other network problems.)
But in reality, I think most real life networks see less multicast (non-broadcast) than broadcast traffic.

I don' think the following alternative makes the code significantly more readable than the method in this patch, but I'll mention it for the sake of discussion:

I could modify the hn_stats type like this:

struct hn_stats {
other fields...
+	union {
+		struct {
			uint64_t	multicast;
			uint64_t	broadcast;
+		};
+		uint64_t	multicast_broadcast[2];
+	};
other fields...
};

So the code would become:

+	if (unlikely(rte_is_multicast_ether_addr(ea)))
+		 stats->multicast_broadcast[rte_is_broadcast_ether_addr(ea)]++;

Whatever we decide, we should use the same method in all three patches (netvsc, virtio, vhost-user).
The choices are:
1. Stick with the original code (with two branches), and add unlikely().
2. Use the method provided in this patch (with only one branch).
3. Introduce a multicast_broadcast[2] to the stats structure as described above (also with only one branch).

@Long, @Wei, @Maxime, @Chenbo, @Stephen and anyone else interested, please cast your votes.
Comments are also welcome! :-)

I'm in favor of #2.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] netvsc: optimize stats counters performance
  2024-08-02 17:28   ` Morten Brørup
@ 2024-08-02 17:33     ` Stephen Hemminger
  2024-09-19 13:51       ` Morten Brørup
  0 siblings, 1 reply; 7+ messages in thread
From: Stephen Hemminger @ 2024-08-02 17:33 UTC (permalink / raw)
  To: Morten Brørup; +Cc: Long Li, Wei Hu, maxime.coquelin, chenbox, dev

On Fri, 2 Aug 2024 19:28:26 +0200
Morten Brørup <mb@smartsharesystems.com> wrote:

> > >  	ea = rte_pktmbuf_mtod(m, const struct rte_ether_addr *);
> > > -	if (rte_is_multicast_ether_addr(ea)) {
> > > -		if (rte_is_broadcast_ether_addr(ea))
> > > -			stats->broadcast++;
> > > -		else
> > > -			stats->multicast++;
> > > -	}
> > > +	RTE_BUILD_BUG_ON(offsetof(struct hn_stats, broadcast) !=
> > > +			offsetof(struct hn_stats, multicast) +  
> > sizeof(uint64_t));  
> > > +	if (unlikely(rte_is_multicast_ether_addr(ea)))
> > > +		(&stats->multicast)[rte_is_broadcast_ether_addr(ea)]++;
> > >  }  
> > 
> > This makes the code a little harder to read.  
> 
> I agree it is somewhat convoluted.
> It's a tradeoff... I preferred performance at the cost of making the code somewhat harder to read.
> The RTE_BUILD_BUG_ON() also helps showing what is going on with the weird indexing.

Optimizing for multicast packets is not worth bothering.
Keep the original code it is simpler.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH] netvsc: optimize stats counters performance
  2024-08-02 17:33     ` Stephen Hemminger
@ 2024-09-19 13:51       ` Morten Brørup
  2024-09-19 18:06         ` Long Li
  0 siblings, 1 reply; 7+ messages in thread
From: Morten Brørup @ 2024-09-19 13:51 UTC (permalink / raw)
  To: Long Li, Wei Hu; +Cc: maxime.coquelin, chenbox, dev, Stephen Hemminger

> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Friday, 2 August 2024 19.33
> 
> On Fri, 2 Aug 2024 19:28:26 +0200
> Morten Brørup <mb@smartsharesystems.com> wrote:
> 
> > > >  	ea = rte_pktmbuf_mtod(m, const struct rte_ether_addr *);
> > > > -	if (rte_is_multicast_ether_addr(ea)) {
> > > > -		if (rte_is_broadcast_ether_addr(ea))
> > > > -			stats->broadcast++;
> > > > -		else
> > > > -			stats->multicast++;
> > > > -	}
> > > > +	RTE_BUILD_BUG_ON(offsetof(struct hn_stats, broadcast) !=
> > > > +			offsetof(struct hn_stats, multicast) +
> > > sizeof(uint64_t));
> > > > +	if (unlikely(rte_is_multicast_ether_addr(ea)))
> > > > +		(&stats->multicast)[rte_is_broadcast_ether_addr(ea)]++;
> > > >  }
> > >
> > > This makes the code a little harder to read.
> >
> > I agree it is somewhat convoluted.
> > It's a tradeoff... I preferred performance at the cost of making the code
> somewhat harder to read.
> > The RTE_BUILD_BUG_ON() also helps showing what is going on with the weird
> indexing.

Similar patches have been accepted by other drivers:
[virtio]: https://patchwork.dpdk.org/project/dpdk/patch/20240801160312.205281-1-mb@smartsharesystems.com/
[vhost-user]: https://patchwork.dpdk.org/project/dpdk/patch/20240802143259.269827-1-mb@smartsharesystems.com/

> 
> Optimizing for multicast packets is not worth bothering.

Optimizing for multicast/broadcast comes into play in multicast environments, and during network broadcast storms.
Although I don't know if any of those two scenarios are relevant for this specific driver.

> Keep the original code it is simpler.

Let's keep similar code similar across drivers.

@Long, @Wei, please Review/Ack, so the patch can be applied.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH] netvsc: optimize stats counters performance
  2024-09-19 13:51       ` Morten Brørup
@ 2024-09-19 18:06         ` Long Li
  2024-09-22  2:36           ` Ferruh Yigit
  0 siblings, 1 reply; 7+ messages in thread
From: Long Li @ 2024-09-19 18:06 UTC (permalink / raw)
  To: Morten Brørup, Wei Hu
  Cc: maxime.coquelin, chenbox, dev, Stephen Hemminger

> Subject: RE: [PATCH] netvsc: optimize stats counters performance
> 
> > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > Sent: Friday, 2 August 2024 19.33
> >
> > On Fri, 2 Aug 2024 19:28:26 +0200
> > Morten Brørup <mb@smartsharesystems.com> wrote:
> >
> > > > >  	ea = rte_pktmbuf_mtod(m, const struct rte_ether_addr *);
> > > > > -	if (rte_is_multicast_ether_addr(ea)) {
> > > > > -		if (rte_is_broadcast_ether_addr(ea))
> > > > > -			stats->broadcast++;
> > > > > -		else
> > > > > -			stats->multicast++;
> > > > > -	}
> > > > > +	RTE_BUILD_BUG_ON(offsetof(struct hn_stats, broadcast) !=
> > > > > +			offsetof(struct hn_stats, multicast) +
> > > > sizeof(uint64_t));
> > > > > +	if (unlikely(rte_is_multicast_ether_addr(ea)))
> > > > > +		(&stats-
> >multicast)[rte_is_broadcast_ether_addr(ea)]++;
> > > > >  }
> > > >
> > > > This makes the code a little harder to read.
> > >
> > > I agree it is somewhat convoluted.
> > > It's a tradeoff... I preferred performance at the cost of making the code
> > somewhat harder to read.
> > > The RTE_BUILD_BUG_ON() also helps showing what is going on with the
> weird
> > indexing.
> 
> Similar patches have been accepted by other drivers:
> [virtio]:
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwor
> k.dpdk.org%2Fproject%2Fdpdk%2Fpatch%2F20240801160312.205281-1-
> mb%40smartsharesystems.com%2F&data=05%7C02%7Clongli%40microsoft.com
> %7C014c845ad8ec4ea25fb508dcd8b22474%7C72f988bf86f141af91ab2d7cd011
> db47%7C1%7C0%7C638623506854135960%7CUnknown%7CTWFpbGZsb3d8eyJ
> WIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0
> %7C%7C%7C&sdata=DrVP6TuTruREAxAoSEBED11TOapSqDN5stsDJJTuux0%3D&r
> eserved=0
> [vhost-user]:
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwor
> k.dpdk.org%2Fproject%2Fdpdk%2Fpatch%2F20240802143259.269827-1-
> mb%40smartsharesystems.com%2F&data=05%7C02%7Clongli%40microsoft.com
> %7C014c845ad8ec4ea25fb508dcd8b22474%7C72f988bf86f141af91ab2d7cd011
> db47%7C1%7C0%7C638623506854150865%7CUnknown%7CTWFpbGZsb3d8eyJ
> WIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0
> %7C%7C%7C&sdata=85T9qnC4nbAwU53sxnfQ2lMayp5soX8BMEJsbB58ZBw%3D
> &reserved=0
> 
> >
> > Optimizing for multicast packets is not worth bothering.
> 
> Optimizing for multicast/broadcast comes into play in multicast environments,
> and during network broadcast storms.
> Although I don't know if any of those two scenarios are relevant for this specific
> driver.
> 
> > Keep the original code it is simpler.
> 
> Let's keep similar code similar across drivers.
> 
> @Long, @Wei, please Review/Ack, so the patch can be applied.

Reviewed-by: Long Li <longli@microsoft.com>



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] netvsc: optimize stats counters performance
  2024-09-19 18:06         ` Long Li
@ 2024-09-22  2:36           ` Ferruh Yigit
  0 siblings, 0 replies; 7+ messages in thread
From: Ferruh Yigit @ 2024-09-22  2:36 UTC (permalink / raw)
  To: Long Li, Morten Brørup, Wei Hu
  Cc: maxime.coquelin, chenbox, dev, Stephen Hemminger

On 9/19/2024 7:06 PM, Long Li wrote:
>> Subject: RE: [PATCH] netvsc: optimize stats counters performance
>>
>>> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
>>> Sent: Friday, 2 August 2024 19.33
>>>
>>> On Fri, 2 Aug 2024 19:28:26 +0200
>>> Morten Brørup <mb@smartsharesystems.com> wrote:
> 
> Reviewed-by: Long Li <longli@microsoft.com>
> 

Applied to dpdk-next-net/main, thanks.


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-09-22  2:36 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-08-02 14:40 [PATCH] netvsc: optimize stats counters performance Morten Brørup
2024-08-02 16:48 ` Long Li
2024-08-02 17:28   ` Morten Brørup
2024-08-02 17:33     ` Stephen Hemminger
2024-09-19 13:51       ` Morten Brørup
2024-09-19 18:06         ` Long Li
2024-09-22  2:36           ` Ferruh Yigit

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).