* [PATCH] examples/ipsec-secgw: fix IPsec performance drop
@ 2024-02-06 12:38 Rahul Bhansali
  2024-02-06 18:25 ` Ferruh Yigit
  0 siblings, 1 reply; 8+ messages in thread
From: Rahul Bhansali @ 2024-02-06 12:38 UTC
To: dev, Radu Nicolau, Akhil Goyal, Konstantin Ananyev, Anoob Joseph
Cc: Rahul Bhansali

Single packet free using rte_pktmbuf_free_bulk() is dropping the
performance. On cn10k, maximum of ~4% drop observed for IPsec
event mode single SA outbound case.

To fix this issue, single packet free will use rte_pktmbuf_free
API.

Fixes: bd7c063561b3 ("examples/ipsec-secgw: use bulk free")

Signed-off-by: Rahul Bhansali <rbhansali@marvell.com>
---
 examples/ipsec-secgw/ipsec-secgw.h | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/examples/ipsec-secgw/ipsec-secgw.h b/examples/ipsec-secgw/ipsec-secgw.h
index 8baab44ee7..ec33a982df 100644
--- a/examples/ipsec-secgw/ipsec-secgw.h
+++ b/examples/ipsec-secgw/ipsec-secgw.h
@@ -229,11 +229,10 @@ free_reassembly_fail_pkt(struct rte_mbuf *mb)
 }

 /* helper routine to free bulk of packets */
-static inline void
-free_pkts(struct rte_mbuf *mb[], uint32_t n)
+static __rte_always_inline void
+free_pkts(struct rte_mbuf *mb[], const uint32_t n)
 {
-	rte_pktmbuf_free_bulk(mb, n);
-
+	n == 1 ? rte_pktmbuf_free(mb[0]) : rte_pktmbuf_free_bulk(mb, n);
 	core_stats_update_drop(n);
 }

--
2.25.1
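For readers without a DPDK tree at hand, the dispatch in the patched free_pkts() can be sketched stand-alone. This is only an illustration under stated assumptions: struct pkt, free_one_stub(), free_bulk_stub() and free_pkts_sketch() are hypothetical stand-ins for struct rte_mbuf, rte_pktmbuf_free(), rte_pktmbuf_free_bulk() and free_pkts(), and the counters exist only to show which path was taken. It says nothing about the measured ~4% figure.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rte_mbuf; it only records that it was freed. */
struct pkt {
	int freed;
};

static unsigned int single_calls; /* times the single-free path ran */
static unsigned int bulk_calls;   /* times the bulk-free path ran */
static unsigned int dropped;      /* plays the role of core_stats_update_drop() */

/* Stub in the role of rte_pktmbuf_free(), which is an inline function in DPDK. */
static inline void
free_one_stub(struct pkt *p)
{
	p->freed = 1;
	single_calls++;
}

/* Stub in the role of rte_pktmbuf_free_bulk(), which is a regular (non-inline) call. */
static void
free_bulk_stub(struct pkt **p, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		p[i]->freed = 1;
	bulk_calls++;
}

/* Same shape as the patched helper: n == 1 avoids the bulk function call. */
static inline void
free_pkts_sketch(struct pkt *mb[], const uint32_t n)
{
	if (n == 1)
		free_one_stub(mb[0]);
	else
		free_bulk_stub(mb, n);
	dropped += n;
}
```

Calling free_pkts_sketch() with a one-element array goes through the single-free stub, and any other count goes through the bulk stub; the drop counter is updated in both cases, as in the patch.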
* Re: [PATCH] examples/ipsec-secgw: fix IPsec performance drop
From: Ferruh Yigit @ 2024-02-06 18:25 UTC
To: Rahul Bhansali, dev, Radu Nicolau, Akhil Goyal, Konstantin Ananyev, Anoob Joseph

On 2/6/2024 12:38 PM, Rahul Bhansali wrote:
> Single packet free using rte_pktmbuf_free_bulk() is dropping the
> performance. On cn10k, maximum of ~4% drop observed for IPsec
> event mode single SA outbound case.
>
> To fix this issue, single packet free will use rte_pktmbuf_free
> API.
>
> Fixes: bd7c063561b3 ("examples/ipsec-secgw: use bulk free")
>
> Signed-off-by: Rahul Bhansali <rbhansali@marvell.com>
> ---
>  examples/ipsec-secgw/ipsec-secgw.h | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/examples/ipsec-secgw/ipsec-secgw.h b/examples/ipsec-secgw/ipsec-secgw.h
> index 8baab44ee7..ec33a982df 100644
> --- a/examples/ipsec-secgw/ipsec-secgw.h
> +++ b/examples/ipsec-secgw/ipsec-secgw.h
> @@ -229,11 +229,10 @@ free_reassembly_fail_pkt(struct rte_mbuf *mb)
>  }
>
>  /* helper routine to free bulk of packets */
> -static inline void
> -free_pkts(struct rte_mbuf *mb[], uint32_t n)
> +static __rte_always_inline void
> +free_pkts(struct rte_mbuf *mb[], const uint32_t n)
>  {
> -	rte_pktmbuf_free_bulk(mb, n);
> -
> +	n == 1 ? rte_pktmbuf_free(mb[0]) : rte_pktmbuf_free_bulk(mb, n);
>  	core_stats_update_drop(n);
>  }
>

Hi Rahul,

Do you think the 'rte_pktmbuf_free_bulk()' API performance can be
improved by similar change?
* RE: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec performance drop
From: Rahul Bhansali @ 2024-02-07 6:46 UTC
To: Ferruh Yigit, dev, Radu Nicolau, Akhil Goyal, Konstantin Ananyev, Anoob Joseph

> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Tuesday, February 6, 2024 11:55 PM
>
> On 2/6/2024 12:38 PM, Rahul Bhansali wrote:
> >  /* helper routine to free bulk of packets */
> > -static inline void
> > -free_pkts(struct rte_mbuf *mb[], uint32_t n)
> > +static __rte_always_inline void
> > +free_pkts(struct rte_mbuf *mb[], const uint32_t n)
> >  {
> > -	rte_pktmbuf_free_bulk(mb, n);
> > -
> > +	n == 1 ? rte_pktmbuf_free(mb[0]) : rte_pktmbuf_free_bulk(mb, n);
> >  	core_stats_update_drop(n);
> >  }
> >
>
> Hi Rahul,
>
> Do you think the 'rte_pktmbuf_free_bulk()' API performance can be improved by
> similar change?

Hi Ferruh,

Currently 'rte_pktmbuf_free_bulk()' is not inline. If we make it and
'__rte_pktmbuf_free_seg_via_array()' both inline, then performance can be
improved similarly.
* Re: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec performance drop
From: Ferruh Yigit @ 2024-02-07 10:35 UTC
To: Rahul Bhansali, dev, Radu Nicolau, Akhil Goyal, Konstantin Ananyev, Anoob Joseph

On 2/7/2024 6:46 AM, Rahul Bhansali wrote:
>> Hi Rahul,
>>
>> Do you think the 'rte_pktmbuf_free_bulk()' API performance can be improved by
>> similar change?
>
> Hi Ferruh,
> Currently 'rte_pktmbuf_free_bulk()' is not inline. If we make it and
> '__rte_pktmbuf_free_seg_via_array()' both inline, then performance can be
> improved similarly.
>

Ah, so the performance improvement is coming from 'rte_pktmbuf_free()' being
inline, OK.

As you are doing performance testing in that area, can you please check if
'__rte_pktmbuf_free_seg_via_array()' is inlined? As it is a static function, I
expect it to be inlined. If not, can you please test with force inlining it
(__rte_always_inline)?

And I wonder if the bulk() API getting a single mbuf is a common theme; does it
make sense to add a new inline wrapper to the library to cover this case, if it
is bringing ~4% improvement, like:

```
static inline void
rte_pktmbuf_free_bulk_or_one(... **mb, unsigned int n)
{
	if (n == 1)
		return rte_pktmbuf_free(mb[0]);
	return rte_pktmbuf_free_bulk(mb, n);
}
```
* RE: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec performance drop
From: Rahul Bhansali @ 2024-02-09 13:10 UTC
To: Ferruh Yigit, dev, Radu Nicolau, Akhil Goyal, Konstantin Ananyev, Anoob Joseph

> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Wednesday, February 7, 2024 4:06 PM
>
> Ah, so the performance improvement is coming from 'rte_pktmbuf_free()' being
> inline, OK.
>
> As you are doing performance testing in that area, can you please check if
> '__rte_pktmbuf_free_seg_via_array()' is inlined? As it is a static function, I
> expect it to be inlined. If not, can you please test with force inlining it
> (__rte_always_inline)?

It was not inlined. I also checked with force inline and it has no impact, so I
can make it force inline.

> And I wonder if the bulk() API getting a single mbuf is a common theme; does it
> make sense to add a new inline wrapper to the library to cover this case, if it
> is bringing ~4% improvement, like:
> ```
> static inline void
> rte_pktmbuf_free_bulk_or_one(... **mb, unsigned int n)
> {
> 	if (n == 1)
> 		return rte_pktmbuf_free(mb[0]);
> 	return rte_pktmbuf_free_bulk(mb, n);
> }
> ```

Agree, we can make this wrapper to cover the case where the bulk free API is
called but might have a single mbuf, to get better perf. The "if (n == 1)"
check can be further optimized with a compile-time constant check:

```
static inline void
rte_pktmbuf_free_bulk_or_one(struct rte_mbuf **mb, unsigned int n)
{
	if (__builtin_constant_p(n) && (n == 1))
		rte_pktmbuf_free(mb[0]);
	else
		rte_pktmbuf_free_bulk(mb, n);
}
```

Let me know if it is fine. I'll send v2. And this will be "__rte_experimental",
right?
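As a side note on the compile-time variant proposed above, a minimal stand-alone sketch can show what __builtin_constant_p() does to the dispatch. The assumptions are stated loudly: struct pkt, free_one_stub(), free_bulk_stub(), free_bulk_or_one_ct() and free_runtime_count() are hypothetical names, not DPDK code, and __builtin_constant_p() is a GCC/Clang builtin. Whether a literal count of 1 reaches the single-free branch depends on inlining and optimization level (without optimization the builtin typically evaluates to 0 for a function parameter), so the sketch only relies on the runtime case, where the count is never a compile-time constant.

```c
/* Hypothetical stand-in for struct rte_mbuf; it only records that it was freed. */
struct pkt {
	int freed;
};

static unsigned int single_calls; /* times the single-free path ran */
static unsigned int bulk_calls;   /* times the bulk-free path ran */

/* Stub in the role of the inline single free. */
static inline void
free_one_stub(struct pkt *p)
{
	p->freed = 1;
	single_calls++;
}

/* Stub in the role of the non-inline bulk free. */
static void
free_bulk_stub(struct pkt **p, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		p[i]->freed = 1;
	bulk_calls++;
}

/*
 * Compile-time variant: the single-free branch is taken only when the
 * compiler can prove n is the constant 1 at this call site. For any
 * other n, including a runtime value that happens to be 1, the bulk
 * path runs and no extra runtime comparison is paid.
 */
static inline void
free_bulk_or_one_ct(struct pkt **mb, unsigned int n)
{
	if (__builtin_constant_p(n) && n == 1)
		free_one_stub(mb[0]);
	else
		free_bulk_stub(mb, n);
}

/* The count is read through a volatile pointer, so it is never a constant. */
static void
free_runtime_count(struct pkt **mb, volatile unsigned int *count)
{
	free_bulk_or_one_ct(mb, *count);
}
```

With a runtime count the wrapper behaves like a plain bulk free even when the count is 1, so any benefit of this variant is limited to call sites with a hardcoded single-packet free.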
* Re: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec performance drop
From: Ferruh Yigit @ 2024-02-09 13:51 UTC
To: Rahul Bhansali, dev, Radu Nicolau, Akhil Goyal, Konstantin Ananyev, Anoob Joseph

On 2/9/2024 1:10 PM, Rahul Bhansali wrote:
>> As you are doing performance testing in that area, can you please check if
>> '__rte_pktmbuf_free_seg_via_array()' is inlined? As it is a static function, I
>> expect it to be inlined. If not, can you please test with force inlining it
>> (__rte_always_inline)?
> It was not inlined. I also checked with force inline and it has no impact, so I
> can make it force inline.
>

If there is no performance improvement, I think there is no need to force inline
'__rte_pktmbuf_free_seg_via_array()'.

>> And I wonder if the bulk() API getting a single mbuf is a common theme; does it
>> make sense to add a new inline wrapper to the library to cover this case, if it
>> is bringing ~4% improvement, like:
>> ```
>> static inline void
>> rte_pktmbuf_free_bulk_or_one(... **mb, unsigned int n)
>> {
>> 	if (n == 1)
>> 		return rte_pktmbuf_free(mb[0]);
>> 	return rte_pktmbuf_free_bulk(mb, n);
>> }
>> ```
> Agree, we can make this wrapper to cover the case where the bulk free API is
> called but might have a single mbuf, to get better perf. The "if (n == 1)"
> check can be further optimized with a compile-time constant check:
> ```
> static inline void
> rte_pktmbuf_free_bulk_or_one(struct rte_mbuf **mb, unsigned int n)
> {
> 	if (__builtin_constant_p(n) && (n == 1))
> 		rte_pktmbuf_free(mb[0]);
> 	else
> 		rte_pktmbuf_free_bulk(mb, n);
> }
> ```
> Let me know if it is fine. I'll send v2. And this will be "__rte_experimental",
> right?
>

A compile-time constant check can prevent the penalty of an additional check,
which is good, and I can see this working for the examples/ipsec-secgw use case
above, which has some hardcoded single mbuf free calls.

But for most of the other use cases I think 'n' won't be known at compile time,
so the API will be effectively the same as free_bulk().

If you have it with a runtime check, do you still observe any performance
improvement? If not, perhaps we can go only with the example code update,
without a new API.
* RE: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec performance drop
From: Rahul Bhansali @ 2024-02-13 12:50 UTC
To: Ferruh Yigit, dev, Radu Nicolau, Akhil Goyal, Konstantin Ananyev, Anoob Joseph

> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Friday, February 9, 2024 7:21 PM
>
> A compile-time constant check can prevent the penalty of an additional check,
> which is good, and I can see this working for the examples/ipsec-secgw use case
> above, which has some hardcoded single mbuf free calls.
>
> But for most of the other use cases I think 'n' won't be known at compile time,
> so the API will be effectively the same as free_bulk().

Agree.

> If you have it with a runtime check, do you still observe any performance
> improvement? If not, perhaps we can go only with the example code update,
> without a new API.

With a runtime check, the performance improvement is small compared to the
compile-time check. So we can continue without this new API.
* RE: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec performance drop
From: Akhil Goyal @ 2024-02-29 17:06 UTC
To: Rahul Bhansali, Ferruh Yigit, dev, Radu Nicolau, Konstantin Ananyev, Anoob Joseph

> > If you have it with runtime check, do you still observe any performance
> > improvement? If not perhaps we can go only with example code update,
> > without new API.
> With runtime check, performance improvement is small only in compare to
> compile time check. So can continue without this new API.

Acked-by: Akhil Goyal <gakhil@marvell.com>

Applied to dpdk-next-crypto

Thanks.