DPDK patches and discussions
* [PATCH] examples/ipsec-secgw: fix IPsec performance drop
@ 2024-02-06 12:38 Rahul Bhansali
  2024-02-06 18:25 ` Ferruh Yigit
  0 siblings, 1 reply; 8+ messages in thread
From: Rahul Bhansali @ 2024-02-06 12:38 UTC (permalink / raw)
  To: dev, Radu Nicolau, Akhil Goyal, Konstantin Ananyev, Anoob Joseph
  Cc: Rahul Bhansali

Freeing a single packet with rte_pktmbuf_free_bulk() degrades
performance. On cn10k, a drop of up to ~4% is observed in the IPsec
event mode, single SA outbound case.

To fix this, free a single packet with the rte_pktmbuf_free() API
instead.

Fixes: bd7c063561b3 ("examples/ipsec-secgw: use bulk free")

Signed-off-by: Rahul Bhansali <rbhansali@marvell.com>
---
 examples/ipsec-secgw/ipsec-secgw.h | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/examples/ipsec-secgw/ipsec-secgw.h b/examples/ipsec-secgw/ipsec-secgw.h
index 8baab44ee7..ec33a982df 100644
--- a/examples/ipsec-secgw/ipsec-secgw.h
+++ b/examples/ipsec-secgw/ipsec-secgw.h
@@ -229,11 +229,10 @@ free_reassembly_fail_pkt(struct rte_mbuf *mb)
 }
 
 /* helper routine to free bulk of packets */
-static inline void
-free_pkts(struct rte_mbuf *mb[], uint32_t n)
+static __rte_always_inline void
+free_pkts(struct rte_mbuf *mb[], const uint32_t n)
 {
-	rte_pktmbuf_free_bulk(mb, n);
-
+	n == 1 ? rte_pktmbuf_free(mb[0]) : rte_pktmbuf_free_bulk(mb, n);
 	core_stats_update_drop(n);
 }
 
-- 
2.25.1



* Re: [PATCH] examples/ipsec-secgw: fix IPsec performance drop
  2024-02-06 12:38 [PATCH] examples/ipsec-secgw: fix IPsec performance drop Rahul Bhansali
@ 2024-02-06 18:25 ` Ferruh Yigit
  2024-02-07  6:46   ` [EXT] " Rahul Bhansali
  0 siblings, 1 reply; 8+ messages in thread
From: Ferruh Yigit @ 2024-02-06 18:25 UTC (permalink / raw)
  To: Rahul Bhansali, dev, Radu Nicolau, Akhil Goyal,
	Konstantin Ananyev, Anoob Joseph

On 2/6/2024 12:38 PM, Rahul Bhansali wrote:
> Single packet free using rte_pktmbuf_free_bulk() is dropping the
> performance. On cn10k, maximum of ~4% drop observed for IPsec
> event mode single SA outbound case.
> 
> To fix this issue, single packet free will use rte_pktmbuf_free
> API.
> 
> Fixes: bd7c063561b3 ("examples/ipsec-secgw: use bulk free")
> 
> Signed-off-by: Rahul Bhansali <rbhansali@marvell.com>
> ---
>  examples/ipsec-secgw/ipsec-secgw.h | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/examples/ipsec-secgw/ipsec-secgw.h b/examples/ipsec-secgw/ipsec-secgw.h
> index 8baab44ee7..ec33a982df 100644
> --- a/examples/ipsec-secgw/ipsec-secgw.h
> +++ b/examples/ipsec-secgw/ipsec-secgw.h
> @@ -229,11 +229,10 @@ free_reassembly_fail_pkt(struct rte_mbuf *mb)
>  }
>  
>  /* helper routine to free bulk of packets */
> -static inline void
> -free_pkts(struct rte_mbuf *mb[], uint32_t n)
> +static __rte_always_inline void
> +free_pkts(struct rte_mbuf *mb[], const uint32_t n)
>  {
> -	rte_pktmbuf_free_bulk(mb, n);
> -
> +	n == 1 ? rte_pktmbuf_free(mb[0]) : rte_pktmbuf_free_bulk(mb, n);
>  	core_stats_update_drop(n);
>  }
>  

Hi Rahul,

Do you think the performance of the 'rte_pktmbuf_free_bulk()' API can be
improved by a similar change?


* RE: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec performance drop
  2024-02-06 18:25 ` Ferruh Yigit
@ 2024-02-07  6:46   ` Rahul Bhansali
  2024-02-07 10:35     ` Ferruh Yigit
  0 siblings, 1 reply; 8+ messages in thread
From: Rahul Bhansali @ 2024-02-07  6:46 UTC (permalink / raw)
  To: Ferruh Yigit, dev, Radu Nicolau, Akhil Goyal, Konstantin Ananyev,
	Anoob Joseph



> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Tuesday, February 6, 2024 11:55 PM
> 
> Hi Rahul,
> 
> Do you think the 'rte_pktmbuf_free_bulk()' API performance can be improved by
> similar change?

Hi Ferruh,
Currently 'rte_pktmbuf_free_bulk()' is not inline. If we make both it and '__rte_pktmbuf_free_seg_via_array()' inline, its performance can be improved similarly.


* Re: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec performance drop
  2024-02-07  6:46   ` [EXT] " Rahul Bhansali
@ 2024-02-07 10:35     ` Ferruh Yigit
  2024-02-09 13:10       ` Rahul Bhansali
  0 siblings, 1 reply; 8+ messages in thread
From: Ferruh Yigit @ 2024-02-07 10:35 UTC (permalink / raw)
  To: Rahul Bhansali, dev, Radu Nicolau, Akhil Goyal,
	Konstantin Ananyev, Anoob Joseph

On 2/7/2024 6:46 AM, Rahul Bhansali wrote:
> Hi Ferruh,
> Currently 'rte_pktmbuf_free_bulk() is not inline. If we make that along with __rte_pktmbuf_free_seg_via_array()  both inline then performance can be improved similar.
>

Ah, so the performance improvement comes from 'rte_pktmbuf_free()' being
inline, OK.

As you are doing performance testing in that area, can you please check
whether '__rte_pktmbuf_free_seg_via_array()' is inlined? As it is a static
function, I expect it to be inlined. If not, can you please test with
force inlining it (__rte_always_inline)?


And I wonder whether the bulk() API getting a single mbuf is a common theme.
If so, and if it brings a ~4% improvement, does it make sense to add a new
inline wrapper to the library to cover this case, like:
```
static inline void
rte_pktmbuf_free_bulk_or_one(struct rte_mbuf **mb, unsigned int n)
{
	/* A single mbuf takes the inline free; anything else goes to bulk free. */
	if (n == 1)
		rte_pktmbuf_free(mb[0]);
	else
		rte_pktmbuf_free_bulk(mb, n);
}
```
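
For anyone who wants to try the runtime-check wrapper proposed above without a DPDK build, here is a minimal sketch using stand-ins. Everything in it is an assumption made for illustration: struct mbuf_stub, stub_free(), stub_free_bulk() and free_bulk_or_one() are hypothetical names that only mimic rte_pktmbuf_free() and rte_pktmbuf_free_bulk(), and the two counters exist purely to show which path a call takes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_mbuf; not a DPDK type. */
struct mbuf_stub {
	int freed;
};

static unsigned int single_free_calls; /* calls that took the single path */
static unsigned int bulk_free_calls;   /* calls that took the bulk path */

/* Stand-in for the inline rte_pktmbuf_free(). */
static inline void
stub_free(struct mbuf_stub *m)
{
	assert(m != NULL);
	m->freed = 1;
	single_free_calls++;
}

/* Stand-in for the out-of-line rte_pktmbuf_free_bulk(). */
static void
stub_free_bulk(struct mbuf_stub **mb, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++) {
		assert(mb[i] != NULL);
		mb[i]->freed = 1;
	}
	bulk_free_calls++;
}

/* Runtime dispatch: exactly one packet uses the single free, else bulk. */
static inline void
free_bulk_or_one(struct mbuf_stub **mb, unsigned int n)
{
	if (n == 1)
		stub_free(mb[0]);
	else
		stub_free_bulk(mb, n);
}
```

A call such as free_bulk_or_one(pkts, 1) bumps single_free_calls, while any other count goes through stub_free_bulk(). The branch is evaluated on every call, which is the extra cost the wrapper would carry.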



* RE: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec performance drop
  2024-02-07 10:35     ` Ferruh Yigit
@ 2024-02-09 13:10       ` Rahul Bhansali
  2024-02-09 13:51         ` Ferruh Yigit
  0 siblings, 1 reply; 8+ messages in thread
From: Rahul Bhansali @ 2024-02-09 13:10 UTC (permalink / raw)
  To: Ferruh Yigit, dev, Radu Nicolau, Akhil Goyal, Konstantin Ananyev,
	Anoob Joseph



> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Wednesday, February 7, 2024 4:06 PM
> 
> As you are doing performance testing in that area, can you please check if
> '__rte_pktmbuf_free_seg_via_array()' is inlined, as it is static function I expect it
> to be inlined. If not, can you please test with force inlining it
> (__rte_always_inline)?
It was not inlined. I also checked with force inline and it had no impact, so I can make it force inline.
> 
> 
> And I wonder if bulk() API may get single mbuf is a common theme, does it makes
> sense add a new inline wrapper to library to cover this case, if it is bringing ~4%
> improvement, like:
> ```
> static inline void
> rte_pktmbuf_free_bulk_or_one(... **mb, unsigned int n) {
> 	if (n == 1)
> 		return rte_pktmbuf_free(mb[0]);
> 	return rte_pktmbuf_free_bulk(mb, n);
> }
Agreed, we can add this wrapper to cover the case where the bulk free API is called but might have a single mbuf, to get better performance. The "if (n == 1)" check can be further optimized with a compile-time constant check:
```
static inline void
rte_pktmbuf_free_bulk_or_one(struct rte_mbuf **mb, unsigned int n)
{
       if (__builtin_constant_p(n) && (n == 1))
               rte_pktmbuf_free(mb[0]);
       else
               rte_pktmbuf_free_bulk(mb, n);
}
```
Let me know if this is fine and I'll send v2. Also, this will be "__rte_experimental", right?
> ```
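
A stand-alone sketch of the compile-time variant, for experimenting outside DPDK. It assumes a GCC or Clang compiler, since __builtin_constant_p() and the always_inline attribute are compiler extensions. struct mbuf_stub, stub_free(), stub_free_bulk() and free_bulk_or_one_const() are hypothetical stand-ins, not DPDK APIs. Whether the single-packet branch is actually selected depends on the compiler and optimization level, so the sketch only tracks how many buffers were released, which is the same on either path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_mbuf; not a DPDK type. */
struct mbuf_stub {
	int freed;
};

/* Total number of stub mbufs released, whichever path was taken. */
static unsigned int freed_total;

/* Stand-in for the inline rte_pktmbuf_free(). */
static inline void
stub_free(struct mbuf_stub *m)
{
	assert(m != NULL);
	m->freed = 1;
	freed_total++;
}

/* Stand-in for the out-of-line rte_pktmbuf_free_bulk(). */
static void
stub_free_bulk(struct mbuf_stub **mb, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		stub_free(mb[i]);
}

/*
 * The single path is taken only when the compiler can prove that n is
 * the constant 1 at this call site; otherwise the call falls through
 * to the bulk path, even if n happens to be 1 at run time.
 */
static inline __attribute__((always_inline)) void
free_bulk_or_one_const(struct mbuf_stub **mb, unsigned int n)
{
	if (__builtin_constant_p(n) && n == 1)
		stub_free(mb[0]);
	else
		stub_free_bulk(mb, n);
}
```

With a literal count, as in free_bulk_or_one_const(pkts, 1), an optimizing build can resolve the condition at compile time and drop the branch. With a count that is only known at run time, the wrapper behaves like a plain bulk free.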



* Re: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec performance drop
  2024-02-09 13:10       ` Rahul Bhansali
@ 2024-02-09 13:51         ` Ferruh Yigit
  2024-02-13 12:50           ` Rahul Bhansali
  0 siblings, 1 reply; 8+ messages in thread
From: Ferruh Yigit @ 2024-02-09 13:51 UTC (permalink / raw)
  To: Rahul Bhansali, dev, Radu Nicolau, Akhil Goyal,
	Konstantin Ananyev, Anoob Joseph

On 2/9/2024 1:10 PM, Rahul Bhansali wrote:
>> As you are doing performance testing in that area, can you please check if
>> '__rte_pktmbuf_free_seg_via_array()' is inlined, as it is static function I expect it
>> to be inlined. If not, can you please test with force inlining it
>> (__rte_always_inline)?
> It was not inline, did check with force inline also and no impact with this, so I can make it force inline.
>

If there is no performance improvement, I think there is no need to force
inline '__rte_pktmbuf_free_seg_via_array()'.

>>
>>
>> And I wonder if bulk() API may get single mbuf is a common theme, does it makes
>> sense add a new inline wrapper to library to cover this case, if it is bringing ~4%
>> improvement, like:
>> ```
>> static inline void
>> rte_pktmbuf_free_bulk_or_one(... **mb, unsigned int n) {
>> 	if (n == 1)
>> 		return rte_pktmbuf_free(mb[0]);
>> 	return rte_pktmbuf_free_bulk(mb, n);
>> }
> Agree, can make this wrapper to cover a case where bulk free API is called but might have single mbuf to get better perf. It can be further optimize " if (n == 1)" with compile time constant check,
> ```
> static inline void
> rte_pktmbuf_free_bulk_or_one(struct rte_mbuf **mb, unsigned int n)
> {
>        if (__builtin_constant_p(n) && (n == 1))
>                rte_pktmbuf_free(mb[0]);
>        else
>                rte_pktmbuf_free_bulk(mb, n);
> }
> ```
> Let me know if it is fine. I'll send v2. And, this will be " __rte_experimental" right ?
>

A compile-time constant check can avoid the penalty of the additional check,
which is good, and I can see this working for the examples/ipsec-secgw
use case above, which has some hardcoded single mbuf free calls.

But in most other use cases I think 'n' won't be known at compile
time, so the API will effectively be the same as free_bulk().

If you use a runtime check instead, do you still observe any performance
improvement? If not, perhaps we can go with only the example code update,
without a new API.



* RE: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec performance drop
  2024-02-09 13:51         ` Ferruh Yigit
@ 2024-02-13 12:50           ` Rahul Bhansali
  2024-02-29 17:06             ` Akhil Goyal
  0 siblings, 1 reply; 8+ messages in thread
From: Rahul Bhansali @ 2024-02-13 12:50 UTC (permalink / raw)
  To: Ferruh Yigit, dev, Radu Nicolau, Akhil Goyal, Konstantin Ananyev,
	Anoob Joseph



> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Friday, February 9, 2024 7:21 PM
> 
> Compile time constant check can prevent penalty from additional check, which is
> good, and I can see this can work for the examples/ipsec-secgw usecase above,
> which has some hardcoded single mbuf free calls.
> 
> But most of the other usecases I think 'n' won't be known in compile time, so API
> will be effectively same as free_bulk().
Agree.
> 
> If you have it with runtime check, do you still observe any performance
> improvement? If not perhaps we can go only with example code update, without
> new API.
With a runtime check, the performance improvement is small compared to the compile-time check, so we can continue without this new API.


* RE: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec performance drop
  2024-02-13 12:50           ` Rahul Bhansali
@ 2024-02-29 17:06             ` Akhil Goyal
  0 siblings, 0 replies; 8+ messages in thread
From: Akhil Goyal @ 2024-02-29 17:06 UTC (permalink / raw)
  To: Rahul Bhansali, Ferruh Yigit, dev, Radu Nicolau,
	Konstantin Ananyev, Anoob Joseph

Acked-by: Akhil Goyal <gakhil@marvell.com>
Applied to dpdk-next-crypto
Thanks.
