* [dpdk-stable] [PATCH v2 2/5] net/hns3: fix build with sve enabled [not found] ` <20210108082523.1062058-1-ruifeng.wang@arm.com> @ 2021-01-08 8:25 ` Ruifeng Wang 2021-01-09 0:06 ` Honnappa Nagarahalli 2021-01-09 2:15 ` oulijun 2021-01-08 8:25 ` [dpdk-stable] [PATCH v2 3/5] net/octeontx: " Ruifeng Wang 2021-01-08 8:25 ` [dpdk-stable] [PATCH v2 4/5] common/octeontx2: " Ruifeng Wang 2 siblings, 2 replies; 17+ messages in thread From: Ruifeng Wang @ 2021-01-08 8:25 UTC (permalink / raw) To: Wei Hu (Xavier), Min Hu (Connor), Yisen Zhuang, Lijun Ou, Huisong Li, Chengchang Tang, Chengwen Feng Cc: dev, vladimir.medvedkin, jerinj, hemant.agrawal, honnappa.nagarahalli, nd, Ruifeng Wang, stable Building with SVE extension enabled stopped with error: error: ACLE function ‘svwhilelt_b64_s32’ requires ISA extension ‘sve’ 18 | #define PG64_256BIT svwhilelt_b64(0, 4) This is caused by unintentional cflags reset. Fixed the issue by appending required flag to cflags instead of overriding it. Fixes: 952ebacce4f2 ("net/hns3: support SVE Rx") Cc: xavier.huwei@huawei.com Cc: stable@dpdk.org Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> --- drivers/net/hns3/meson.build | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/hns3/meson.build b/drivers/net/hns3/meson.build index 45cee34d9..798086357 100644 --- a/drivers/net/hns3/meson.build +++ b/drivers/net/hns3/meson.build @@ -32,7 +32,7 @@ deps += ['hash'] if arch_subdir == 'arm' and dpdk_conf.get('RTE_ARCH_64') sources += files('hns3_rxtx_vec.c') if cc.get_define('__ARM_FEATURE_SVE', args: machine_args) != '' - cflags = ['-DCC_SVE_SUPPORT'] + cflags += ['-DCC_SVE_SUPPORT'] sources += files('hns3_rxtx_vec_sve.c') endif endif -- 2.25.1 ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [dpdk-stable] [PATCH v2 2/5] net/hns3: fix build with sve enabled 2021-01-08 8:25 ` [dpdk-stable] [PATCH v2 2/5] net/hns3: fix build with sve enabled Ruifeng Wang @ 2021-01-09 0:06 ` Honnappa Nagarahalli 2021-01-09 2:11 ` oulijun 2021-01-09 2:15 ` oulijun 1 sibling, 1 reply; 17+ messages in thread From: Honnappa Nagarahalli @ 2021-01-09 0:06 UTC (permalink / raw) To: Ruifeng Wang, Wei Hu (Xavier), Min Hu (Connor), Yisen Zhuang, Lijun Ou, Huisong Li, Chengchang Tang, Chengwen Feng Cc: dev, vladimir.medvedkin, jerinj, hemant.agrawal, nd, Ruifeng Wang, stable, Honnappa Nagarahalli, nd <snip> > > Building with SVE extension enabled stopped with error: > > error: ACLE function ‘svwhilelt_b64_s32’ requires ISA extension ‘sve’ > 18 | #define PG64_256BIT svwhilelt_b64(0, 4) > > This is caused by unintentional cflags reset. > Fixed the issue by appending required flag to cflags instead of overriding it. > > Fixes: 952ebacce4f2 ("net/hns3: support SVE Rx") > Cc: xavier.huwei@huawei.com > Cc: stable@dpdk.org > > Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> > --- > drivers/net/hns3/meson.build | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/hns3/meson.build b/drivers/net/hns3/meson.build > index 45cee34d9..798086357 100644 > --- a/drivers/net/hns3/meson.build > +++ b/drivers/net/hns3/meson.build > @@ -32,7 +32,7 @@ deps += ['hash'] > if arch_subdir == 'arm' and dpdk_conf.get('RTE_ARCH_64') > sources += files('hns3_rxtx_vec.c') > if cc.get_define('__ARM_FEATURE_SVE', args: machine_args) != '' > - cflags = ['-DCC_SVE_SUPPORT'] > + cflags += ['-DCC_SVE_SUPPORT'] This comment is unrelated to this patch. We need to be consistent with the macro definitions. Is '__ARM_FEATURE_SVE' not enough? If we need to define an additional flag, I would name it something like 'RTE_ARM_FEATURE_SVE'. > sources += files('hns3_rxtx_vec_sve.c') > endif > endif > -- > 2.25.1 ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [dpdk-stable] [PATCH v2 2/5] net/hns3: fix build with sve enabled 2021-01-09 0:06 ` Honnappa Nagarahalli @ 2021-01-09 2:11 ` oulijun 2021-01-11 2:39 ` Ruifeng Wang 0 siblings, 1 reply; 17+ messages in thread From: oulijun @ 2021-01-09 2:11 UTC (permalink / raw) To: Honnappa Nagarahalli, Ruifeng Wang, Wei Hu (Xavier), Min Hu (Connor), Yisen Zhuang, Huisong Li, Chengchang Tang, Chengwen Feng Cc: dev, vladimir.medvedkin, jerinj, hemant.agrawal, nd, stable 在 2021/1/9 8:06, Honnappa Nagarahalli 写道: > <snip> > >> >> Building with SVE extension enabled stopped with error: >> >> error: ACLE function ‘svwhilelt_b64_s32’ requires ISA extension ‘sve’ >> 18 | #define PG64_256BIT svwhilelt_b64(0, 4) >> >> This is caused by unintentional cflags reset. >> Fixed the issue by appending required flag to cflags instead of overriding it. >> >> Fixes: 952ebacce4f2 ("net/hns3: support SVE Rx") >> Cc: xavier.huwei@huawei.com >> Cc: stable@dpdk.org >> >> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> >> --- >> drivers/net/hns3/meson.build | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/net/hns3/meson.build b/drivers/net/hns3/meson.build >> index 45cee34d9..798086357 100644 >> --- a/drivers/net/hns3/meson.build >> +++ b/drivers/net/hns3/meson.build >> @@ -32,7 +32,7 @@ deps += ['hash'] >> if arch_subdir == 'arm' and dpdk_conf.get('RTE_ARCH_64') >> sources += files('hns3_rxtx_vec.c') >> if cc.get_define('__ARM_FEATURE_SVE', args: machine_args) != '' >> - cflags = ['-DCC_SVE_SUPPORT'] >> + cflags += ['-DCC_SVE_SUPPORT'] > This comment is unrelated to this patch. We need to be consistent with the macro definitions. Is '__ARM_FEATURE_SVE' not enough? If we need to define an additional flag, I would name it something like 'RTE_ARM_FEATURE_SVE'. > I think the __ARM_FEATURE_SVE is ok. if use the gcc version included SVE flag, it will be identified as __ARM_FEATURE_SVE. it is defined in the ARM SVE document. >> sources += files('hns3_rxtx_vec_sve.c') >> endif >> endif >> -- >> 2.25.1 > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [dpdk-stable] [PATCH v2 2/5] net/hns3: fix build with sve enabled 2021-01-09 2:11 ` oulijun @ 2021-01-11 2:39 ` Ruifeng Wang 2021-01-11 13:38 ` Honnappa Nagarahalli 0 siblings, 1 reply; 17+ messages in thread From: Ruifeng Wang @ 2021-01-11 2:39 UTC (permalink / raw) To: oulijun, Honnappa Nagarahalli, Min Hu (Connor), Yisen Zhuang, Huisong Li, Chengchang Tang, Chengwen Feng Cc: dev, vladimir.medvedkin, jerinj, hemant.agrawal, nd, stable, nd > -----Original Message----- > From: oulijun <oulijun@huawei.com> > Sent: Saturday, January 9, 2021 10:12 AM > To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang > <Ruifeng.Wang@arm.com>; Wei Hu (Xavier) <xavier.huwei@huawei.com>; > Min Hu (Connor) <humin29@huawei.com>; Yisen Zhuang > <yisen.zhuang@huawei.com>; Huisong Li <lihuisong@huawei.com>; > Chengchang Tang <tangchengchang@huawei.com>; Chengwen Feng > <fengchengwen@huawei.com> > Cc: dev@dpdk.org; vladimir.medvedkin@intel.com; jerinj@marvell.com; > hemant.agrawal@nxp.com; nd <nd@arm.com>; stable@dpdk.org > Subject: Re: [PATCH v2 2/5] net/hns3: fix build with sve enabled > > > 在 2021/1/9 8:06, Honnappa Nagarahalli 写道: > > <snip> > > > >> > >> Building with SVE extension enabled stopped with error: > >> > >> error: ACLE function ‘svwhilelt_b64_s32’ requires ISA extension ‘sve’ > >> 18 | #define PG64_256BIT svwhilelt_b64(0, 4) > >> > >> This is caused by unintentional cflags reset. > >> Fixed the issue by appending required flag to cflags instead of overriding it. > >> > >> Fixes: 952ebacce4f2 ("net/hns3: support SVE Rx") > >> Cc: xavier.huwei@huawei.com > >> Cc: stable@dpdk.org > >> > >> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> > >> --- > >> drivers/net/hns3/meson.build | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/drivers/net/hns3/meson.build > >> b/drivers/net/hns3/meson.build index 45cee34d9..798086357 100644 > >> --- a/drivers/net/hns3/meson.build > >> +++ b/drivers/net/hns3/meson.build > >> @@ -32,7 +32,7 @@ deps += ['hash'] > >> if arch_subdir == 'arm' and dpdk_conf.get('RTE_ARCH_64') > >> sources += files('hns3_rxtx_vec.c') > >> if cc.get_define('__ARM_FEATURE_SVE', args: machine_args) != '' > >> - cflags = ['-DCC_SVE_SUPPORT'] > >> + cflags += ['-DCC_SVE_SUPPORT'] > > This comment is unrelated to this patch. We need to be consistent with the > macro definitions. Is '__ARM_FEATURE_SVE' not enough? If we need to > define an additional flag, I would name it something like > 'RTE_ARM_FEATURE_SVE'. > > > I think the __ARM_FEATURE_SVE is ok. if use the gcc version included SVE > flag, it will be identified as __ARM_FEATURE_SVE. it is defined in the ARM > SVE document. Yes, we can rely on flags defined by compiler and no extra flag is needed. I can update in next version to remove this section from meson file and replace CC_SVE_SUPPORT in code. > >> sources += files('hns3_rxtx_vec_sve.c') > >> endif > >> endif > >> -- > >> 2.25.1 > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [dpdk-stable] [PATCH v2 2/5] net/hns3: fix build with sve enabled 2021-01-11 2:39 ` Ruifeng Wang @ 2021-01-11 13:38 ` Honnappa Nagarahalli 0 siblings, 0 replies; 17+ messages in thread From: Honnappa Nagarahalli @ 2021-01-11 13:38 UTC (permalink / raw) To: Ruifeng Wang, oulijun, Min Hu (Connor), Yisen Zhuang, Huisong Li, Chengchang Tang, Chengwen Feng Cc: dev, vladimir.medvedkin, jerinj, hemant.agrawal, nd, stable, Honnappa Nagarahalli, nd <snip> > > > > > >> > > >> Building with SVE extension enabled stopped with error: > > >> > > >> error: ACLE function ‘svwhilelt_b64_s32’ requires ISA extension ‘sve’ > > >> 18 | #define PG64_256BIT svwhilelt_b64(0, 4) > > >> > > >> This is caused by unintentional cflags reset. > > >> Fixed the issue by appending required flag to cflags instead of overriding it. > > >> > > >> Fixes: 952ebacce4f2 ("net/hns3: support SVE Rx") > > >> Cc: xavier.huwei@huawei.com > > >> Cc: stable@dpdk.org > > >> > > >> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> > > >> --- > > >> drivers/net/hns3/meson.build | 2 +- > > >> 1 file changed, 1 insertion(+), 1 deletion(-) > > >> > > >> diff --git a/drivers/net/hns3/meson.build > > >> b/drivers/net/hns3/meson.build index 45cee34d9..798086357 100644 > > >> --- a/drivers/net/hns3/meson.build > > >> +++ b/drivers/net/hns3/meson.build > > >> @@ -32,7 +32,7 @@ deps += ['hash'] > > >> if arch_subdir == 'arm' and dpdk_conf.get('RTE_ARCH_64') > > >> sources += files('hns3_rxtx_vec.c') > > >> if cc.get_define('__ARM_FEATURE_SVE', args: machine_args) != '' > > >> -cflags = ['-DCC_SVE_SUPPORT'] > > >> +cflags += ['-DCC_SVE_SUPPORT'] > > > This comment is unrelated to this patch. We need to be consistent > > > with the > > macro definitions. Is '__ARM_FEATURE_SVE' not enough? If we need to > > define an additional flag, I would name it something like > > 'RTE_ARM_FEATURE_SVE'. > > > > > I think the __ARM_FEATURE_SVE is ok. if use the gcc version included > > SVE flag, it will be identified as __ARM_FEATURE_SVE. it is defined in > > the ARM SVE document. > > Yes, we can rely on flags defined by compiler and no extra flag is needed. > I can update in next version to remove this section from meson file and replace > CC_SVE_SUPPORT in code. Sounds good to me. > > >> sources += files('hns3_rxtx_vec_sve.c') > > >> endif > > >> endif > > >> -- > > >> 2.25.1 > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [dpdk-stable] [PATCH v2 2/5] net/hns3: fix build with sve enabled 2021-01-08 8:25 ` [dpdk-stable] [PATCH v2 2/5] net/hns3: fix build with sve enabled Ruifeng Wang 2021-01-09 0:06 ` Honnappa Nagarahalli @ 2021-01-09 2:15 ` oulijun 2021-01-11 2:27 ` Ruifeng Wang 1 sibling, 1 reply; 17+ messages in thread From: oulijun @ 2021-01-09 2:15 UTC (permalink / raw) To: Ruifeng Wang, Wei Hu (Xavier), Min Hu (Connor), Yisen Zhuang, Huisong Li, Chengchang Tang, Chengwen Feng Cc: dev, vladimir.medvedkin, jerinj, hemant.agrawal, honnappa.nagarahalli, nd, stable 在 2021/1/8 16:25, Ruifeng Wang 写道: > Building with SVE extension enabled stopped with error: > > error: ACLE function ‘svwhilelt_b64_s32’ requires ISA extension ‘sve’ > 18 | #define PG64_256BIT svwhilelt_b64(0, 4) > > This is caused by unintentional cflags reset. > Fixed the issue by appending required flag to cflags instead of > overriding it. > > Fixes: 952ebacce4f2 ("net/hns3: support SVE Rx") > Cc: xavier.huwei@huawei.com > Cc: stable@dpdk.org > > Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> > --- > drivers/net/hns3/meson.build | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/hns3/meson.build b/drivers/net/hns3/meson.build > index 45cee34d9..798086357 100644 > --- a/drivers/net/hns3/meson.build > +++ b/drivers/net/hns3/meson.build > @@ -32,7 +32,7 @@ deps += ['hash'] > if arch_subdir == 'arm' and dpdk_conf.get('RTE_ARCH_64') > sources += files('hns3_rxtx_vec.c') > if cc.get_define('__ARM_FEATURE_SVE', args: machine_args) != '' > - cflags = ['-DCC_SVE_SUPPORT'] > + cflags += ['-DCC_SVE_SUPPORT'] Hi I noticed this patch, but I checked that the hns3 driver did not use this function.How did you compile it? Thanks Lijun Ou > sources += files('hns3_rxtx_vec_sve.c') > endif > endif > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [dpdk-stable] [PATCH v2 2/5] net/hns3: fix build with sve enabled 2021-01-09 2:15 ` oulijun @ 2021-01-11 2:27 ` Ruifeng Wang 0 siblings, 0 replies; 17+ messages in thread From: Ruifeng Wang @ 2021-01-11 2:27 UTC (permalink / raw) To: oulijun, Wei Hu (Xavier), Min Hu (Connor), Yisen Zhuang, Huisong Li, Chengchang Tang, Chengwen Feng Cc: dev, vladimir.medvedkin, jerinj, hemant.agrawal, Honnappa Nagarahalli, nd, stable, nd > -----Original Message----- > From: oulijun <oulijun@huawei.com> > Sent: Saturday, January 9, 2021 10:16 AM > To: Ruifeng Wang <Ruifeng.Wang@arm.com>; Wei Hu (Xavier) > <xavier.huwei@huawei.com>; Min Hu (Connor) <humin29@huawei.com>; > Yisen Zhuang <yisen.zhuang@huawei.com>; Huisong Li > <lihuisong@huawei.com>; Chengchang Tang > <tangchengchang@huawei.com>; Chengwen Feng > <fengchengwen@huawei.com> > Cc: dev@dpdk.org; vladimir.medvedkin@intel.com; jerinj@marvell.com; > hemant.agrawal@nxp.com; Honnappa Nagarahalli > <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>; stable@dpdk.org > Subject: Re: [PATCH v2 2/5] net/hns3: fix build with sve enabled > > > > 在 2021/1/8 16:25, Ruifeng Wang 写道: > > Building with SVE extension enabled stopped with error: > > > > error: ACLE function ‘svwhilelt_b64_s32’ requires ISA extension ‘sve’ > > 18 | #define PG64_256BIT svwhilelt_b64(0, 4) > > > > This is caused by unintentional cflags reset. > > Fixed the issue by appending required flag to cflags instead of > > overriding it. > > > > Fixes: 952ebacce4f2 ("net/hns3: support SVE Rx") > > Cc: xavier.huwei@huawei.com > > Cc: stable@dpdk.org > > > > Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> > > --- > > drivers/net/hns3/meson.build | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/net/hns3/meson.build > > b/drivers/net/hns3/meson.build index 45cee34d9..798086357 100644 > > --- a/drivers/net/hns3/meson.build > > +++ b/drivers/net/hns3/meson.build > > @@ -32,7 +32,7 @@ deps += ['hash'] > > if arch_subdir == 'arm' and dpdk_conf.get('RTE_ARCH_64') > > sources += files('hns3_rxtx_vec.c') > > if cc.get_define('__ARM_FEATURE_SVE', args: machine_args) != '' > > - cflags = ['-DCC_SVE_SUPPORT'] > > + cflags += ['-DCC_SVE_SUPPORT'] > Hi > I noticed this patch, but I checked that the hns3 driver did not use this > function.How did you compile it? Hi, The hns3 driver has sve rx/tx implementation in hns3_rxtx_vec_sve.c. This path will be enabled when compiling with sve feature enabled. I compiled it by using gcc-10.2 with flag '-march=armv8.3-a+sve'. You can try compile for n2 with the cross file added in this series (5/5). Thanks, Ruifeng > > Thanks > Lijun Ou > > sources += files('hns3_rxtx_vec_sve.c') > > endif > > endif > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* [dpdk-stable] [PATCH v2 3/5] net/octeontx: fix build with sve enabled [not found] ` <20210108082523.1062058-1-ruifeng.wang@arm.com> 2021-01-08 8:25 ` [dpdk-stable] [PATCH v2 2/5] net/hns3: fix build with sve enabled Ruifeng Wang @ 2021-01-08 8:25 ` Ruifeng Wang 2021-01-08 8:25 ` [dpdk-stable] [PATCH v2 4/5] common/octeontx2: " Ruifeng Wang 2 siblings, 0 replies; 17+ messages in thread From: Ruifeng Wang @ 2021-01-08 8:25 UTC (permalink / raw) To: Harman Kalra, Jerin Jacob, Santosh Shukla Cc: dev, vladimir.medvedkin, jerinj, hemant.agrawal, honnappa.nagarahalli, nd, Ruifeng Wang, stable Building with gcc 10.2 with SVE extension enabled got error: {standard input}: Assembler messages: {standard input}:91: Error: selected processor does not support `addvl x4,x8,#-1' {standard input}:95: Error: selected processor does not support `ptrue p1.d,all' {standard input}:135: Error: selected processor does not support `whilelo p2.d,xzr,x5' {standard input}:137: Error: selected processor does not support `decb x1' This is because inline assembly code explicitly resets cpu model to not have SVE support. Thus SVE instructions generated by compiler auto vectorization got rejected by assembler. Fixed the issue by replacing inline assembly with equivalent atomic built-ins. Compiler will generate LSE instructions for cpu that has the extension. Fixes: f0c7bb1bf778 ("net/octeontx/base: add octeontx IO operations") Cc: jerinj@marvell.com Cc: stable@dpdk.org Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> --- drivers/net/octeontx/base/octeontx_io.h | 16 ++++------------ 1 file changed, 4 insertions(+), 12 deletions(-) diff --git a/drivers/net/octeontx/base/octeontx_io.h b/drivers/net/octeontx/base/octeontx_io.h index 04b9ce191..0bf9b100d 100644 --- a/drivers/net/octeontx/base/octeontx_io.h +++ b/drivers/net/octeontx/base/octeontx_io.h @@ -58,14 +58,8 @@ do { \ static inline uint64_t octeontx_reg_ldadd_u64(void *addr, int64_t off) { - uint64_t old_val; - - __asm__ volatile( - " .cpu generic+lse\n" - " ldadd %1, %0, [%2]\n" - : "=r" (old_val) : "r" (off), "r" (addr) : "memory"); - - return old_val; + return (uint64_t)__atomic_fetch_add((int64_t *)addr, off, + __ATOMIC_RELAXED); } /** @@ -97,10 +91,8 @@ octeontx_reg_lmtst(void *lmtline_va, void *ioreg_va, const uint64_t cmdbuf[], } /* LDEOR initiates atomic transfer to I/O device */ - __asm__ volatile( - " .cpu generic+lse\n" - " ldeor xzr, %0, [%1]\n" - : "=r" (result) : "r" (ioreg_va) : "memory"); + result = __atomic_fetch_xor((uint64_t *)ioreg_va, 0, + __ATOMIC_RELAXED); } while (!result); } -- 2.25.1 ^ permalink raw reply [flat|nested] 17+ messages in thread
* [dpdk-stable] [PATCH v2 4/5] common/octeontx2: fix build with sve enabled [not found] ` <20210108082523.1062058-1-ruifeng.wang@arm.com> 2021-01-08 8:25 ` [dpdk-stable] [PATCH v2 2/5] net/hns3: fix build with sve enabled Ruifeng Wang 2021-01-08 8:25 ` [dpdk-stable] [PATCH v2 3/5] net/octeontx: " Ruifeng Wang @ 2021-01-08 8:25 ` Ruifeng Wang 2021-01-08 10:29 ` [dpdk-stable] [EXT] " Pavan Nikhilesh Bhagavatula 2 siblings, 1 reply; 17+ messages in thread From: Ruifeng Wang @ 2021-01-08 8:25 UTC (permalink / raw) To: Jerin Jacob, Nithin Dabilpuram, Pavan Nikhilesh Cc: dev, vladimir.medvedkin, hemant.agrawal, honnappa.nagarahalli, nd, Ruifeng Wang, stable Building with gcc 10.2 with SVE extension enabled got error: {standard input}: Assembler messages: {standard input}:4002: Error: selected processor does not support `mov z3.b,#0' {standard input}:4003: Error: selected processor does not support `whilelo p1.b,xzr,x7' {standard input}:4005: Error: selected processor does not support `ld1b z0.b,p1/z,[x8]' {standard input}:4006: Error: selected processor does not support `whilelo p4.s,wzr,w7' This is because inline assembly code explicitly resets cpu model to not have SVE support. Thus SVE instructions generated by compiler auto vectorization got rejected by assembler. Fixed the issue by replacing inline assembly with equivalent atomic built-ins. Compiler will generate LSE instructions for cpu that has the extension. Fixes: 8a4f835971f5 ("common/octeontx2: add IO handling APIs") Cc: jerinj@marvell.com Cc: stable@dpdk.org Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> --- drivers/common/octeontx2/otx2_io_arm64.h | 37 +++--------------------- 1 file changed, 4 insertions(+), 33 deletions(-) diff --git a/drivers/common/octeontx2/otx2_io_arm64.h b/drivers/common/octeontx2/otx2_io_arm64.h index b5c85d9a6..8843a79b5 100644 --- a/drivers/common/octeontx2/otx2_io_arm64.h +++ b/drivers/common/octeontx2/otx2_io_arm64.h @@ -24,55 +24,26 @@ static __rte_always_inline uint64_t otx2_atomic64_add_nosync(int64_t incr, int64_t *ptr) { - uint64_t result; - /* Atomic add with no ordering */ - asm volatile ( - ".cpu generic+lse\n" - "ldadd %x[i], %x[r], [%[b]]" - : [r] "=r" (result), "+m" (*ptr) - : [i] "r" (incr), [b] "r" (ptr) - : "memory"); - return result; + return (uint64_t)__atomic_fetch_add(ptr, incr, __ATOMIC_RELAXED); } static __rte_always_inline uint64_t otx2_atomic64_add_sync(int64_t incr, int64_t *ptr) { - uint64_t result; - - /* Atomic add with ordering */ - asm volatile ( - ".cpu generic+lse\n" - "ldadda %x[i], %x[r], [%[b]]" - : [r] "=r" (result), "+m" (*ptr) - : [i] "r" (incr), [b] "r" (ptr) - : "memory"); - return result; + return (uint64_t)__atomic_fetch_add(ptr, incr, __ATOMIC_ACQUIRE); } static __rte_always_inline uint64_t otx2_lmt_submit(rte_iova_t io_address) { - uint64_t result; - - asm volatile ( - ".cpu generic+lse\n" - "ldeor xzr,%x[rf],[%[rs]]" : - [rf] "=r"(result): [rs] "r"(io_address)); - return result; + return __atomic_fetch_xor((uint64_t *)io_address, 0, __ATOMIC_RELAXED); } static __rte_always_inline uint64_t otx2_lmt_submit_release(rte_iova_t io_address) { - uint64_t result; - - asm volatile ( - ".cpu generic+lse\n" - "ldeorl xzr,%x[rf],[%[rs]]" : - [rf] "=r"(result) : [rs] "r"(io_address)); - return result; + return __atomic_fetch_xor((uint64_t *)io_address, 0, __ATOMIC_RELEASE); } static __rte_always_inline void -- 2.25.1 ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [dpdk-stable] [EXT] [PATCH v2 4/5] common/octeontx2: fix build with sve enabled 2021-01-08 8:25 ` [dpdk-stable] [PATCH v2 4/5] common/octeontx2: " Ruifeng Wang @ 2021-01-08 10:29 ` Pavan Nikhilesh Bhagavatula 2021-01-11 9:51 ` Ruifeng Wang 0 siblings, 1 reply; 17+ messages in thread From: Pavan Nikhilesh Bhagavatula @ 2021-01-08 10:29 UTC (permalink / raw) To: Ruifeng Wang, Jerin Jacob Kollanukkaran, Nithin Kumar Dabilpuram Cc: dev, vladimir.medvedkin, hemant.agrawal, honnappa.nagarahalli, nd, stable Hi Ruifeng, >Building with gcc 10.2 with SVE extension enabled got error: > >{standard input}: Assembler messages: >{standard input}:4002: Error: selected processor does not support `mov >z3.b,#0' >{standard input}:4003: Error: selected processor does not support >`whilelo p1.b,xzr,x7' >{standard input}:4005: Error: selected processor does not support `ld1b >z0.b,p1/z,[x8]' >{standard input}:4006: Error: selected processor does not support >`whilelo p4.s,wzr,w7' > >This is because inline assembly code explicitly resets cpu model to >not have SVE support. Thus SVE instructions generated by compiler >auto vectorization got rejected by assembler. > >Fixed the issue by replacing inline assembly with equivalent atomic >built-ins. Compiler will generate LSE instructions for cpu that has >the extension. > >Fixes: 8a4f835971f5 ("common/octeontx2: add IO handling APIs") >Cc: jerinj@marvell.com >Cc: stable@dpdk.org > >Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> >--- > drivers/common/octeontx2/otx2_io_arm64.h | 37 +++-------------------- >- > 1 file changed, 4 insertions(+), 33 deletions(-) > >diff --git a/drivers/common/octeontx2/otx2_io_arm64.h >b/drivers/common/octeontx2/otx2_io_arm64.h >index b5c85d9a6..8843a79b5 100644 >--- a/drivers/common/octeontx2/otx2_io_arm64.h >+++ b/drivers/common/octeontx2/otx2_io_arm64.h >@@ -24,55 +24,26 @@ > static __rte_always_inline uint64_t > otx2_atomic64_add_nosync(int64_t incr, int64_t *ptr) > { >- uint64_t result; >- > /* Atomic add with no ordering */ >- asm volatile ( >- ".cpu generic+lse\n" >- "ldadd %x[i], %x[r], [%[b]]" >- : [r] "=r" (result), "+m" (*ptr) >- : [i] "r" (incr), [b] "r" (ptr) >- : "memory"); >- return result; >+ return (uint64_t)__atomic_fetch_add(ptr, incr, >__ATOMIC_RELAXED); > } > Here LDADD acts as a way to interface to co-processors i.e. LDADD instruction opcode + specific io address are recognized by HW interceptor and dispatched to the specific coprocessor. Leaving it to the compiler to use the correct instruction is a bad idea. This breaks the arm64_armv8_linux_gcc build as it doesn't have the +lse enabled. __atomic_fetch_add will generate a different instruction with SVE enabled. Instead can we add +sve to the first line to prevent outer loop from optimizing out the trap? I tested with 10.2 and n2 config below change works fine. -" .cpu generic+lse\n" +" .cpu generic+lse+sve\n" Regards, Pavan. > static __rte_always_inline uint64_t > otx2_atomic64_add_sync(int64_t incr, int64_t *ptr) > { >- uint64_t result; >- >- /* Atomic add with ordering */ >- asm volatile ( >- ".cpu generic+lse\n" >- "ldadda %x[i], %x[r], [%[b]]" >- : [r] "=r" (result), "+m" (*ptr) >- : [i] "r" (incr), [b] "r" (ptr) >- : "memory"); >- return result; >+ return (uint64_t)__atomic_fetch_add(ptr, incr, >__ATOMIC_ACQUIRE); > } > > static __rte_always_inline uint64_t > otx2_lmt_submit(rte_iova_t io_address) > { >- uint64_t result; >- >- asm volatile ( >- ".cpu generic+lse\n" >- "ldeor xzr,%x[rf],[%[rs]]" : >- [rf] "=r"(result): [rs] "r"(io_address)); >- return result; >+ return __atomic_fetch_xor((uint64_t *)io_address, 0, >__ATOMIC_RELAXED); > } > > static __rte_always_inline uint64_t > otx2_lmt_submit_release(rte_iova_t io_address) > { >- uint64_t result; >- >- asm volatile ( >- ".cpu generic+lse\n" >- "ldeorl xzr,%x[rf],[%[rs]]" : >- [rf] "=r"(result) : [rs] "r"(io_address)); >- return result; >+ return __atomic_fetch_xor((uint64_t *)io_address, 0, >__ATOMIC_RELEASE); > } > > static __rte_always_inline void >-- >2.25.1 ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [dpdk-stable] [EXT] [PATCH v2 4/5] common/octeontx2: fix build with sve enabled 2021-01-08 10:29 ` [dpdk-stable] [EXT] " Pavan Nikhilesh Bhagavatula @ 2021-01-11 9:51 ` Ruifeng Wang 0 siblings, 0 replies; 17+ messages in thread From: Ruifeng Wang @ 2021-01-11 9:51 UTC (permalink / raw) To: Pavan Nikhilesh Bhagavatula, jerinj, Nithin Kumar Dabilpuram Cc: dev, vladimir.medvedkin, hemant.agrawal, Honnappa Nagarahalli, nd, stable, nd > -----Original Message----- > From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com> > Sent: Friday, January 8, 2021 6:29 PM > To: Ruifeng Wang <Ruifeng.Wang@arm.com>; jerinj@marvell.com; Nithin > Kumar Dabilpuram <ndabilpuram@marvell.com> > Cc: dev@dpdk.org; vladimir.medvedkin@intel.com; > hemant.agrawal@nxp.com; Honnappa Nagarahalli > <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>; stable@dpdk.org > Subject: RE: [EXT] [PATCH v2 4/5] common/octeontx2: fix build with sve > enabled > > Hi Ruifeng, > > >Building with gcc 10.2 with SVE extension enabled got error: > > > >{standard input}: Assembler messages: > >{standard input}:4002: Error: selected processor does not support `mov > >z3.b,#0' > >{standard input}:4003: Error: selected processor does not support > >`whilelo p1.b,xzr,x7' > >{standard input}:4005: Error: selected processor does not support `ld1b > >z0.b,p1/z,[x8]' > >{standard input}:4006: Error: selected processor does not support > >`whilelo p4.s,wzr,w7' > > > >This is because inline assembly code explicitly resets cpu model to not > >have SVE support. Thus SVE instructions generated by compiler auto > >vectorization got rejected by assembler. > > > >Fixed the issue by replacing inline assembly with equivalent atomic > >built-ins. Compiler will generate LSE instructions for cpu that has the > >extension. > > > >Fixes: 8a4f835971f5 ("common/octeontx2: add IO handling APIs") > >Cc: jerinj@marvell.com > >Cc: stable@dpdk.org > > > >Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> > >--- > > drivers/common/octeontx2/otx2_io_arm64.h | 37 +++-------------------- > >- > > 1 file changed, 4 insertions(+), 33 deletions(-) > > > >diff --git a/drivers/common/octeontx2/otx2_io_arm64.h > >b/drivers/common/octeontx2/otx2_io_arm64.h > >index b5c85d9a6..8843a79b5 100644 > >--- a/drivers/common/octeontx2/otx2_io_arm64.h > >+++ b/drivers/common/octeontx2/otx2_io_arm64.h > >@@ -24,55 +24,26 @@ > > static __rte_always_inline uint64_t > > otx2_atomic64_add_nosync(int64_t incr, int64_t *ptr) { > >- uint64_t result; > >- > > /* Atomic add with no ordering */ > >- asm volatile ( > >- ".cpu generic+lse\n" > >- "ldadd %x[i], %x[r], [%[b]]" > >- : [r] "=r" (result), "+m" (*ptr) > >- : [i] "r" (incr), [b] "r" (ptr) > >- : "memory"); > >- return result; > >+ return (uint64_t)__atomic_fetch_add(ptr, incr, > >__ATOMIC_RELAXED); > > } > > > > Here LDADD acts as a way to interface to co-processors i.e. > LDADD instruction opcode + specific io address are recognized by HW > interceptor and dispatched to the specific coprocessor. OK. Now I understand the background. > > Leaving it to the compiler to use the correct instruction is a bad idea. > This breaks the arm64_armv8_linux_gcc build as it doesn't have the > +lse enabled. > __atomic_fetch_add will generate a different instruction with SVE enabled. > > Instead can we add +sve to the first line to prevent outer loop from > optimizing out the trap? Since the inline assembly needs to be preserved, we have to tune the enabled extensions. I will change in next version. Thanks, Ruifeng > > I tested with 10.2 and n2 config below change works fine. > -" .cpu generic+lse\n" > +" .cpu generic+lse+sve\n" > > Regards, > Pavan. > > > static __rte_always_inline uint64_t > > otx2_atomic64_add_sync(int64_t incr, int64_t *ptr) { > >- uint64_t result; > >- > >- /* Atomic add with ordering */ > >- asm volatile ( > >- ".cpu generic+lse\n" > >- "ldadda %x[i], %x[r], [%[b]]" > >- : [r] "=r" (result), "+m" (*ptr) > >- : [i] "r" (incr), [b] "r" (ptr) > >- : "memory"); > >- return result; > >+ return (uint64_t)__atomic_fetch_add(ptr, incr, > >__ATOMIC_ACQUIRE); > > } > > > > static __rte_always_inline uint64_t > > otx2_lmt_submit(rte_iova_t io_address) { > >- uint64_t result; > >- > >- asm volatile ( > >- ".cpu generic+lse\n" > >- "ldeor xzr,%x[rf],[%[rs]]" : > >- [rf] "=r"(result): [rs] "r"(io_address)); > >- return result; > >+ return __atomic_fetch_xor((uint64_t *)io_address, 0, > >__ATOMIC_RELAXED); > > } > > > > static __rte_always_inline uint64_t > > otx2_lmt_submit_release(rte_iova_t io_address) { > >- uint64_t result; > >- > >- asm volatile ( > >- ".cpu generic+lse\n" > >- "ldeorl xzr,%x[rf],[%[rs]]" : > >- [rf] "=r"(result) : [rs] "r"(io_address)); > >- return result; > >+ return __atomic_fetch_xor((uint64_t *)io_address, 0, > >__ATOMIC_RELEASE); > > } > > > > static __rte_always_inline void > >-- > >2.25.1 ^ permalink raw reply [flat|nested] 17+ messages in thread
[parent not found: <20210112025709.1121523-1-ruifeng.wang@arm.com>]
* [dpdk-stable] [PATCH v3 2/5] net/hns3: fix build with sve enabled [not found] ` <20210112025709.1121523-1-ruifeng.wang@arm.com> @ 2021-01-12 2:57 ` Ruifeng Wang 2021-01-13 2:16 ` Honnappa Nagarahalli 2021-01-12 2:57 ` [dpdk-stable] [PATCH v3 3/5] net/octeontx: " Ruifeng Wang 2021-01-12 2:57 ` [dpdk-stable] [PATCH v3 4/5] common/octeontx2: " Ruifeng Wang 2 siblings, 1 reply; 17+ messages in thread From: Ruifeng Wang @ 2021-01-12 2:57 UTC (permalink / raw) To: Wei Hu (Xavier), Min Hu (Connor), Yisen Zhuang, Lijun Ou, Chengwen Feng, Chengchang Tang, Huisong Li Cc: dev, vladimir.medvedkin, pbhagavatula, jerinj, hemant.agrawal, honnappa.nagarahalli, nd, Ruifeng Wang, stable Building with SVE extension enabled stopped with error: error: ACLE function ‘svwhilelt_b64_s32’ requires ISA extension ‘sve’ 18 | #define PG64_256BIT svwhilelt_b64(0, 4) This is caused by unintentional cflags reset. Fixed the issue by not touching cflags, and using flags defined by compiler. Fixes: 952ebacce4f2 ("net/hns3: support SVE Rx") Cc: stable@dpdk.org Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> --- v3: Removed extra flag, use compiler flag instead. drivers/net/hns3/hns3_rxtx.c | 4 ++-- drivers/net/hns3/meson.build | 1 - 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c index 88d3baba4..5ac36b314 100644 --- a/drivers/net/hns3/hns3_rxtx.c +++ b/drivers/net/hns3/hns3_rxtx.c @@ -10,7 +10,7 @@ #include <rte_io.h> #include <rte_net.h> #include <rte_malloc.h> -#if defined(RTE_ARCH_ARM64) && defined(CC_SVE_SUPPORT) +#if defined(RTE_ARCH_ARM64) && defined(__ARM_FEATURE_SVE) #include <rte_cpuflags.h> #endif @@ -2467,7 +2467,7 @@ hns3_rx_burst_mode_get(struct rte_eth_dev *dev, __rte_unused uint16_t queue_id, static bool hns3_check_sve_support(void) { -#if defined(RTE_ARCH_ARM64) && defined(CC_SVE_SUPPORT) +#if defined(RTE_ARCH_ARM64) && defined(__ARM_FEATURE_SVE) if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SVE)) return true; #endif diff --git a/drivers/net/hns3/meson.build b/drivers/net/hns3/meson.build index 45cee34d9..5674d986b 100644 --- a/drivers/net/hns3/meson.build +++ b/drivers/net/hns3/meson.build @@ -32,7 +32,6 @@ deps += ['hash'] if arch_subdir == 'arm' and dpdk_conf.get('RTE_ARCH_64') sources += files('hns3_rxtx_vec.c') if cc.get_define('__ARM_FEATURE_SVE', args: machine_args) != '' - cflags = ['-DCC_SVE_SUPPORT'] sources += files('hns3_rxtx_vec_sve.c') endif endif -- 2.25.1 ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [dpdk-stable] [PATCH v3 2/5] net/hns3: fix build with sve enabled 2021-01-12 2:57 ` [dpdk-stable] [PATCH v3 2/5] net/hns3: " Ruifeng Wang @ 2021-01-13 2:16 ` Honnappa Nagarahalli 0 siblings, 0 replies; 17+ messages in thread From: Honnappa Nagarahalli @ 2021-01-13 2:16 UTC (permalink / raw) To: Ruifeng Wang, Wei Hu (Xavier), Min Hu (Connor), Yisen Zhuang, Lijun Ou, Chengwen Feng, Chengchang Tang, Huisong Li Cc: dev, vladimir.medvedkin, pbhagavatula, jerinj, hemant.agrawal, nd, Ruifeng Wang, stable, Honnappa Nagarahalli, nd <snip> > > Building with SVE extension enabled stopped with error: > > error: ACLE function ‘svwhilelt_b64_s32’ requires ISA extension ‘sve’ > 18 | #define PG64_256BIT svwhilelt_b64(0, 4) > > This is caused by unintentional cflags reset. > Fixed the issue by not touching cflags, and using flags defined by compiler. > > Fixes: 952ebacce4f2 ("net/hns3: support SVE Rx") > Cc: stable@dpdk.org > > Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> > --- > v3: > Removed extra flag, use compiler flag instead. > > drivers/net/hns3/hns3_rxtx.c | 4 ++-- > drivers/net/hns3/meson.build | 1 - > 2 files changed, 2 insertions(+), 3 deletions(-) > > diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c > index 88d3baba4..5ac36b314 100644 > --- a/drivers/net/hns3/hns3_rxtx.c > +++ b/drivers/net/hns3/hns3_rxtx.c > @@ -10,7 +10,7 @@ > #include <rte_io.h> > #include <rte_net.h> > #include <rte_malloc.h> > -#if defined(RTE_ARCH_ARM64) && defined(CC_SVE_SUPPORT) > +#if defined(RTE_ARCH_ARM64) && defined(__ARM_FEATURE_SVE) > #include <rte_cpuflags.h> > #endif > > @@ -2467,7 +2467,7 @@ hns3_rx_burst_mode_get(struct rte_eth_dev > *dev, __rte_unused uint16_t queue_id, static bool > hns3_check_sve_support(void) > { > -#if defined(RTE_ARCH_ARM64) && defined(CC_SVE_SUPPORT) > +#if defined(RTE_ARCH_ARM64) && defined(__ARM_FEATURE_SVE) > if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SVE)) > return true; > #endif > diff --git a/drivers/net/hns3/meson.build b/drivers/net/hns3/meson.build > index 45cee34d9..5674d986b 100644 > --- a/drivers/net/hns3/meson.build > +++ b/drivers/net/hns3/meson.build > @@ -32,7 +32,6 @@ deps += ['hash'] > if arch_subdir == 'arm' and dpdk_conf.get('RTE_ARCH_64') > sources += files('hns3_rxtx_vec.c') > if cc.get_define('__ARM_FEATURE_SVE', args: machine_args) != '' > - cflags = ['-DCC_SVE_SUPPORT'] > sources += files('hns3_rxtx_vec_sve.c') > endif > endif > -- > 2.25.1 ^ permalink raw reply [flat|nested] 17+ messages in thread
* [dpdk-stable] [PATCH v3 3/5] net/octeontx: fix build with sve enabled [not found] ` <20210112025709.1121523-1-ruifeng.wang@arm.com> 2021-01-12 2:57 ` [dpdk-stable] [PATCH v3 2/5] net/hns3: " Ruifeng Wang @ 2021-01-12 2:57 ` Ruifeng Wang 2021-01-12 4:39 ` [dpdk-stable] [dpdk-dev] " Jerin Jacob 2021-01-12 2:57 ` [dpdk-stable] [PATCH v3 4/5] common/octeontx2: " Ruifeng Wang 2 siblings, 1 reply; 17+ messages in thread From: Ruifeng Wang @ 2021-01-12 2:57 UTC (permalink / raw) To: Harman Kalra, Santosh Shukla, Jerin Jacob Cc: dev, vladimir.medvedkin, pbhagavatula, jerinj, hemant.agrawal, honnappa.nagarahalli, nd, Ruifeng Wang, stable Building with gcc 10.2 with SVE extension enabled got error: {standard input}: Assembler messages: {standard input}:91: Error: selected processor does not support `addvl x4,x8,#-1' {standard input}:95: Error: selected processor does not support `ptrue p1.d,all' {standard input}:135: Error: selected processor does not support `whilelo p2.d,xzr,x5' {standard input}:137: Error: selected processor does not support `decb x1' This is because inline assembly code explicitly resets cpu model to not have SVE support. Thus SVE instructions generated by compiler auto vectorization got rejected by assembler. Added SVE to the cpu model specified by inline assembly for SVE support. Not replacing the inline assembly with C atomics because the driver relies on specific LSE instruction to interface to co-processor [1]. Fixes: f0c7bb1bf778 ("net/octeontx/base: add octeontx IO operations") Cc: jerinj@marvell.com Cc: stable@dpdk.org [1] https://mails.dpdk.org/archives/dev/2021-January/196092.html Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> --- v3: Keep inline assembly and add sve extension to fix issue. (Pavan) drivers/net/octeontx/base/octeontx_io.h | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/net/octeontx/base/octeontx_io.h b/drivers/net/octeontx/base/octeontx_io.h index 04b9ce191..d0b9cfbc6 100644 --- a/drivers/net/octeontx/base/octeontx_io.h +++ b/drivers/net/octeontx/base/octeontx_io.h @@ -52,6 +52,11 @@ do { \ #endif #if defined(RTE_ARCH_ARM64) +#if defined(__ARM_FEATURE_SVE) +#define __LSE_PREAMBLE " .cpu generic+lse+sve\n" +#else +#define __LSE_PREAMBLE " .cpu generic+lse\n" +#endif /** * Perform an atomic fetch-and-add operation. */ @@ -61,7 +66,7 @@ octeontx_reg_ldadd_u64(void *addr, int64_t off) uint64_t old_val; __asm__ volatile( - " .cpu generic+lse\n" + __LSE_PREAMBLE " ldadd %1, %0, [%2]\n" : "=r" (old_val) : "r" (off), "r" (addr) : "memory"); @@ -98,12 +103,13 @@ octeontx_reg_lmtst(void *lmtline_va, void *ioreg_va, const uint64_t cmdbuf[], /* LDEOR initiates atomic transfer to I/O device */ __asm__ volatile( - " .cpu generic+lse\n" + __LSE_PREAMBLE " ldeor xzr, %0, [%1]\n" : "=r" (result) : "r" (ioreg_va) : "memory"); } while (!result); } +#undef __LSE_PREAMBLE #else static inline uint64_t -- 2.25.1 ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [dpdk-stable] [dpdk-dev] [PATCH v3 3/5] net/octeontx: fix build with sve enabled 2021-01-12 2:57 ` [dpdk-stable] [PATCH v3 3/5] net/octeontx: " Ruifeng Wang @ 2021-01-12 4:39 ` Jerin Jacob 0 siblings, 0 replies; 17+ messages in thread From: Jerin Jacob @ 2021-01-12 4:39 UTC (permalink / raw) To: Ruifeng Wang Cc: Harman Kalra, Santosh Shukla, Jerin Jacob, dpdk-dev, Vladimir Medvedkin, Pavan Nikhilesh, Jerin Jacob, Hemant Agrawal, Honnappa Nagarahalli, nd, dpdk stable On Tue, Jan 12, 2021 at 8:28 AM Ruifeng Wang <ruifeng.wang@arm.com> wrote: > > Building with gcc 10.2 with SVE extension enabled got error: > > {standard input}: Assembler messages: > {standard input}:91: Error: selected processor does not support `addvl x4,x8,#-1' > {standard input}:95: Error: selected processor does not support `ptrue p1.d,all' > {standard input}:135: Error: selected processor does not support `whilelo p2.d,xzr,x5' > {standard input}:137: Error: selected processor does not support `decb x1' > > This is because inline assembly code explicitly resets cpu model to > not have SVE support. Thus SVE instructions generated by compiler > auto vectorization got rejected by assembler. > > Added SVE to the cpu model specified by inline assembly for SVE support. > Not replacing the inline assembly with C atomics because the driver relies > on specific LSE instruction to interface to co-processor [1]. > > Fixes: f0c7bb1bf778 ("net/octeontx/base: add octeontx IO operations") > Cc: jerinj@marvell.com > Cc: stable@dpdk.org > > [1] https://mails.dpdk.org/archives/dev/2021-January/196092.html > > Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com> > --- > v3: > Keep inline assembly and add sve extension to fix issue. (Pavan) > > drivers/net/octeontx/base/octeontx_io.h | 10 ++++++++-- > 1 file changed, 8 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/octeontx/base/octeontx_io.h b/drivers/net/octeontx/base/octeontx_io.h > index 04b9ce191..d0b9cfbc6 100644 > --- a/drivers/net/octeontx/base/octeontx_io.h > +++ b/drivers/net/octeontx/base/octeontx_io.h > @@ -52,6 +52,11 @@ do { \ > #endif > > #if defined(RTE_ARCH_ARM64) > +#if defined(__ARM_FEATURE_SVE) > +#define __LSE_PREAMBLE " .cpu generic+lse+sve\n" > +#else > +#define __LSE_PREAMBLE " .cpu generic+lse\n" > +#endif > /** > * Perform an atomic fetch-and-add operation. > */ > @@ -61,7 +66,7 @@ octeontx_reg_ldadd_u64(void *addr, int64_t off) > uint64_t old_val; > > __asm__ volatile( > - " .cpu generic+lse\n" > + __LSE_PREAMBLE > " ldadd %1, %0, [%2]\n" > : "=r" (old_val) : "r" (off), "r" (addr) : "memory"); > > @@ -98,12 +103,13 @@ octeontx_reg_lmtst(void *lmtline_va, void *ioreg_va, const uint64_t cmdbuf[], > > /* LDEOR initiates atomic transfer to I/O device */ > __asm__ volatile( > - " .cpu generic+lse\n" > + __LSE_PREAMBLE > " ldeor xzr, %0, [%1]\n" > : "=r" (result) : "r" (ioreg_va) : "memory"); > } while (!result); > } > > +#undef __LSE_PREAMBLE > #else > > static inline uint64_t > -- > 2.25.1 > ^ permalink raw reply [flat|nested] 17+ messages in thread
* [dpdk-stable] [PATCH v3 4/5] common/octeontx2: fix build with sve enabled [not found] ` <20210112025709.1121523-1-ruifeng.wang@arm.com> 2021-01-12 2:57 ` [dpdk-stable] [PATCH v3 2/5] net/hns3: " Ruifeng Wang 2021-01-12 2:57 ` [dpdk-stable] [PATCH v3 3/5] net/octeontx: " Ruifeng Wang @ 2021-01-12 2:57 ` Ruifeng Wang 2021-01-12 4:38 ` [dpdk-stable] [dpdk-dev] " Jerin Jacob 2 siblings, 1 reply; 17+ messages in thread From: Ruifeng Wang @ 2021-01-12 2:57 UTC (permalink / raw) To: Jerin Jacob, Nithin Dabilpuram, Pavan Nikhilesh Cc: dev, vladimir.medvedkin, hemant.agrawal, honnappa.nagarahalli, nd, Ruifeng Wang, stable Building with gcc 10.2 with SVE extension enabled got error: {standard input}: Assembler messages: {standard input}:4002: Error: selected processor does not support `mov z3.b,#0' {standard input}:4003: Error: selected processor does not support `whilelo p1.b,xzr,x7' {standard input}:4005: Error: selected processor does not support `ld1b z0.b,p1/z,[x8]' {standard input}:4006: Error: selected processor does not support `whilelo p4.s,wzr,w7' This is because inline assembly code explicitly resets cpu model to not have SVE support. Thus SVE instructions generated by compiler auto vectorization got rejected by assembler. Added SVE to the cpu model specified by inline assembly for SVE support. Not replacing the inline assembly with C atomics because the driver relies on specific LSE instruction to interface to co-processor [1]. Fixes: 8a4f835971f5 ("common/octeontx2: add IO handling APIs") Cc: jerinj@marvell.com Cc: stable@dpdk.org [1] https://mails.dpdk.org/archives/dev/2021-January/196092.html Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> --- v3: Keep inline assembly and add sve extension to fix issue. (Pavan) drivers/common/octeontx2/otx2_io_arm64.h | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/drivers/common/octeontx2/otx2_io_arm64.h b/drivers/common/octeontx2/otx2_io_arm64.h index b5c85d9a6..34268e3af 100644 --- a/drivers/common/octeontx2/otx2_io_arm64.h +++ b/drivers/common/octeontx2/otx2_io_arm64.h @@ -21,6 +21,12 @@ #define otx2_prefetch_store_keep(ptr) ({\ asm volatile("prfm pstl1keep, [%x0]\n" : : "r" (ptr)); }) +#if defined(__ARM_FEATURE_SVE) +#define __LSE_PREAMBLE " .cpu generic+lse+sve\n" +#else +#define __LSE_PREAMBLE " .cpu generic+lse\n" +#endif + static __rte_always_inline uint64_t otx2_atomic64_add_nosync(int64_t incr, int64_t *ptr) { @@ -28,7 +34,7 @@ otx2_atomic64_add_nosync(int64_t incr, int64_t *ptr) /* Atomic add with no ordering */ asm volatile ( - ".cpu generic+lse\n" + __LSE_PREAMBLE "ldadd %x[i], %x[r], [%[b]]" : [r] "=r" (result), "+m" (*ptr) : [i] "r" (incr), [b] "r" (ptr) @@ -43,7 +49,7 @@ otx2_atomic64_add_sync(int64_t incr, int64_t *ptr) /* Atomic add with ordering */ asm volatile ( - ".cpu generic+lse\n" + __LSE_PREAMBLE "ldadda %x[i], %x[r], [%[b]]" : [r] "=r" (result), "+m" (*ptr) : [i] "r" (incr), [b] "r" (ptr) @@ -57,7 +63,7 @@ otx2_lmt_submit(rte_iova_t io_address) uint64_t result; asm volatile ( - ".cpu generic+lse\n" + __LSE_PREAMBLE "ldeor xzr,%x[rf],[%[rs]]" : [rf] "=r"(result): [rs] "r"(io_address)); return result; @@ -69,7 +75,7 @@ otx2_lmt_submit_release(rte_iova_t io_address) uint64_t result; asm volatile ( - ".cpu generic+lse\n" + __LSE_PREAMBLE "ldeorl xzr,%x[rf],[%[rs]]" : [rf] "=r"(result) : [rs] "r"(io_address)); return result; @@ -104,4 +110,5 @@ otx2_lmt_mov_seg(void *out, const void *in, const uint16_t segdw) dst128[i] = src128[i]; } +#undef __LSE_PREAMBLE #endif /* _OTX2_IO_ARM64_H_ */ -- 2.25.1 ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [dpdk-stable] [dpdk-dev] [PATCH v3 4/5] common/octeontx2: fix build with sve enabled 2021-01-12 2:57 ` [dpdk-stable] [PATCH v3 4/5] common/octeontx2: " Ruifeng Wang @ 2021-01-12 4:38 ` Jerin Jacob 0 siblings, 0 replies; 17+ messages in thread From: Jerin Jacob @ 2021-01-12 4:38 UTC (permalink / raw) To: Ruifeng Wang Cc: Jerin Jacob, Nithin Dabilpuram, Pavan Nikhilesh, dpdk-dev, Vladimir Medvedkin, Hemant Agrawal, Honnappa Nagarahalli, nd, dpdk stable On Tue, Jan 12, 2021 at 8:28 AM Ruifeng Wang <ruifeng.wang@arm.com> wrote: > > Building with gcc 10.2 with SVE extension enabled got error: > > {standard input}: Assembler messages: > {standard input}:4002: Error: selected processor does not support `mov z3.b,#0' > {standard input}:4003: Error: selected processor does not support `whilelo p1.b,xzr,x7' > {standard input}:4005: Error: selected processor does not support `ld1b z0.b,p1/z,[x8]' > {standard input}:4006: Error: selected processor does not support `whilelo p4.s,wzr,w7' > > This is because inline assembly code explicitly resets cpu model to > not have SVE support. Thus SVE instructions generated by compiler > auto vectorization got rejected by assembler. > > Added SVE to the cpu model specified by inline assembly for SVE support. > Not replacing the inline assembly with C atomics because the driver relies > on specific LSE instruction to interface to co-processor [1]. > > Fixes: 8a4f835971f5 ("common/octeontx2: add IO handling APIs") > Cc: jerinj@marvell.com > Cc: stable@dpdk.org Reviewed-by: Jerin Jacob <jerinj@marvell.com> > > [1] https://mails.dpdk.org/archives/dev/2021-January/196092.html > > Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> > --- > v3: > Keep inline assembly and add sve extension to fix issue. (Pavan) > > drivers/common/octeontx2/otx2_io_arm64.h | 15 +++++++++++---- > 1 file changed, 11 insertions(+), 4 deletions(-) > > diff --git a/drivers/common/octeontx2/otx2_io_arm64.h b/drivers/common/octeontx2/otx2_io_arm64.h > index b5c85d9a6..34268e3af 100644 > --- a/drivers/common/octeontx2/otx2_io_arm64.h > +++ b/drivers/common/octeontx2/otx2_io_arm64.h > @@ -21,6 +21,12 @@ > #define otx2_prefetch_store_keep(ptr) ({\ > asm volatile("prfm pstl1keep, [%x0]\n" : : "r" (ptr)); }) > > +#if defined(__ARM_FEATURE_SVE) > +#define __LSE_PREAMBLE " .cpu generic+lse+sve\n" > +#else > +#define __LSE_PREAMBLE " .cpu generic+lse\n" > +#endif > + > static __rte_always_inline uint64_t > otx2_atomic64_add_nosync(int64_t incr, int64_t *ptr) > { > @@ -28,7 +34,7 @@ otx2_atomic64_add_nosync(int64_t incr, int64_t *ptr) > > /* Atomic add with no ordering */ > asm volatile ( > - ".cpu generic+lse\n" > + __LSE_PREAMBLE > "ldadd %x[i], %x[r], [%[b]]" > : [r] "=r" (result), "+m" (*ptr) > : [i] "r" (incr), [b] "r" (ptr) > @@ -43,7 +49,7 @@ otx2_atomic64_add_sync(int64_t incr, int64_t *ptr) > > /* Atomic add with ordering */ > asm volatile ( > - ".cpu generic+lse\n" > + __LSE_PREAMBLE > "ldadda %x[i], %x[r], [%[b]]" > : [r] "=r" (result), "+m" (*ptr) > : [i] "r" (incr), [b] "r" (ptr) > @@ -57,7 +63,7 @@ otx2_lmt_submit(rte_iova_t io_address) > uint64_t result; > > asm volatile ( > - ".cpu generic+lse\n" > + __LSE_PREAMBLE > "ldeor xzr,%x[rf],[%[rs]]" : > [rf] "=r"(result): [rs] "r"(io_address)); > return result; > @@ -69,7 +75,7 @@ otx2_lmt_submit_release(rte_iova_t io_address) > uint64_t result; > > asm volatile ( > - ".cpu generic+lse\n" > + __LSE_PREAMBLE > "ldeorl xzr,%x[rf],[%[rs]]" : > [rf] "=r"(result) : [rs] "r"(io_address)); > return result; > @@ -104,4 +110,5 @@ otx2_lmt_mov_seg(void *out, const void *in, const uint16_t segdw) > dst128[i] = src128[i]; > } > > +#undef __LSE_PREAMBLE > #endif /* _OTX2_IO_ARM64_H_ */ > -- > 2.25.1 > ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2021-01-13 2:17 UTC | newest] Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <20201218101210.356836-1-ruifeng.wang@arm.com> [not found] ` <20210108082523.1062058-1-ruifeng.wang@arm.com> 2021-01-08 8:25 ` [dpdk-stable] [PATCH v2 2/5] net/hns3: fix build with sve enabled Ruifeng Wang 2021-01-09 0:06 ` Honnappa Nagarahalli 2021-01-09 2:11 ` oulijun 2021-01-11 2:39 ` Ruifeng Wang 2021-01-11 13:38 ` Honnappa Nagarahalli 2021-01-09 2:15 ` oulijun 2021-01-11 2:27 ` Ruifeng Wang 2021-01-08 8:25 ` [dpdk-stable] [PATCH v2 3/5] net/octeontx: " Ruifeng Wang 2021-01-08 8:25 ` [dpdk-stable] [PATCH v2 4/5] common/octeontx2: " Ruifeng Wang 2021-01-08 10:29 ` [dpdk-stable] [EXT] " Pavan Nikhilesh Bhagavatula 2021-01-11 9:51 ` Ruifeng Wang [not found] ` <20210112025709.1121523-1-ruifeng.wang@arm.com> 2021-01-12 2:57 ` [dpdk-stable] [PATCH v3 2/5] net/hns3: " Ruifeng Wang 2021-01-13 2:16 ` Honnappa Nagarahalli 2021-01-12 2:57 ` [dpdk-stable] [PATCH v3 3/5] net/octeontx: " Ruifeng Wang 2021-01-12 4:39 ` [dpdk-stable] [dpdk-dev] " Jerin Jacob 2021-01-12 2:57 ` [dpdk-stable] [PATCH v3 4/5] common/octeontx2: " Ruifeng Wang 2021-01-12 4:38 ` [dpdk-stable] [dpdk-dev] " Jerin Jacob
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).