* [dpdk-dev] [PATCH] eal: fix up bad asm in rte_cpu_get_features @ 2014-03-18 20:43 Neil Horman 2014-03-19 14:48 ` [dpdk-dev] [PATCH v2] " Neil Horman 2014-03-20 11:42 ` [dpdk-dev] [PATCH v3] eal: Fix up assembly for x86_64 " Neil Horman 0 siblings, 2 replies; 9+ messages in thread From: Neil Horman @ 2014-03-18 20:43 UTC (permalink / raw) To: dev The recent conversion to build dpdk as a DSO has an error in rte_cpu_get_features. When being build with -fpie, %ebx gets clobbered by the cpuid instruction which is also the pic register. Therefore the inline asm tries to save off %ebx, but does so incorrectly. It starts by loading params.ebx to "D" which is %edi, but then the first instruction moves %ebx to %edi, clobbering the input value. Then after the operation is complete, "D" (%edi) is stored to the local ebx variable, but only after the xchgl instruction has happened, which means ebx holds only the PIC pointer. This behavior was causing strange segfults for me when running the cpuid instruction. The fix is pretty easy, split the asm into two separate directives, the first saving ebx, and using it to grab the appropriate cpuid info (and correctly listing %edi as a clobbered register in the process, and then a subsequent asm directive preforming the reverse exchange (again, listing %edi as being clobbered). Signed-off-by: Neil Horman <nhorman@tuxdriver.com> --- lib/librte_eal/common/eal_common_cpuflags.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/lib/librte_eal/common/eal_common_cpuflags.c b/lib/librte_eal/common/eal_common_cpuflags.c index 1ebf78c..2072a0c 100644 --- a/lib/librte_eal/common/eal_common_cpuflags.c +++ b/lib/librte_eal/common/eal_common_cpuflags.c @@ -208,16 +208,19 @@ rte_cpu_get_features(struct cpuid_parameters_t params) asm volatile ( "mov %%ebx, %%edi\n" "cpuid\n" - "xchgl %%ebx, %%edi;\n" : "=a" (eax), - "=D" (ebx), + "=b" (ebx), "=c" (ecx), "=d" (edx) /* input */ : "a" (params.eax), - "D" (params.ebx), + "b" (params.ebx), "c" (params.ecx), - "d" (params.edx)); + "d" (params.edx) + : "%edi"); + + asm volatile ("xchgl %%ebx, %%edi;\n" + : :); #endif switch (params.return_register) { -- 1.8.3.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
* [dpdk-dev] [PATCH v2] eal: fix up bad asm in rte_cpu_get_features 2014-03-18 20:43 [dpdk-dev] [PATCH] eal: fix up bad asm in rte_cpu_get_features Neil Horman @ 2014-03-19 14:48 ` Neil Horman 2014-03-19 15:44 ` H. Peter Anvin 2014-03-20 11:42 ` [dpdk-dev] [PATCH v3] eal: Fix up assembly for x86_64 " Neil Horman 1 sibling, 1 reply; 9+ messages in thread From: Neil Horman @ 2014-03-19 14:48 UTC (permalink / raw) To: dev The recent conversion to build dpdk as a DSO has an error in rte_cpu_get_features. When being build with -fpie, %ebx gets clobbered by the cpuid instruction which is also the pic register. Therefore the inline asm tries to save off %ebx, but does so incorrectly. It starts by loading params.ebx to "D" which is %edi, but then the first instruction moves %ebx to %edi, clobbering the input value. Then after the operation is complete, "D" (%edi) is stored to the local ebx variable, but only after the xchgl instruction has happened, which means ebx holds only the PIC pointer. This behavior was causing strange segfults for me when running the cpuid instruction. The fix is pretty easy, split the asm into two separate directives, the first saving ebx, and using it to grab the appropriate cpuid info (and correctly listing %edi as a clobbered register in the process, and then a subsequent asm directive preforming the reverse exchange (again, listing %edi as being clobbered). Signed-off-by: Neil Horman <nhorman@tuxdriver.com> --- Change notes v2) Fix constraints to ensure that ebx isn't overwritten before asm starts --- lib/librte_eal/common/eal_common_cpuflags.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/lib/librte_eal/common/eal_common_cpuflags.c b/lib/librte_eal/common/eal_common_cpuflags.c index 1ebf78c..75b505f 100644 --- a/lib/librte_eal/common/eal_common_cpuflags.c +++ b/lib/librte_eal/common/eal_common_cpuflags.c @@ -190,7 +190,7 @@ static const struct feature_entry cpu_feature_table[] = { static inline int rte_cpu_get_features(struct cpuid_parameters_t params) { - int eax, ebx, ecx, edx; /* registers */ + int eax, ebx, ecx, edx, oldebx; /* registers */ #ifndef __PIC__ asm volatile ("cpuid" @@ -206,18 +206,21 @@ rte_cpu_get_features(struct cpuid_parameters_t params) "d" (params.edx)); #else asm volatile ( - "mov %%ebx, %%edi\n" + "xchgl %%ebx, %%edi\n" "cpuid\n" - "xchgl %%ebx, %%edi;\n" : "=a" (eax), - "=D" (ebx), + "=b" (ebx), "=c" (ecx), - "=d" (edx) + "=d" (edx), + "=D" (oldebx) /* input */ : "a" (params.eax), "D" (params.ebx), "c" (params.ecx), "d" (params.edx)); + + asm volatile ("xchgl %%ebx, %%edi;\n" + : : "D" (oldebx)); #endif switch (params.return_register) { -- 1.8.3.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dpdk-dev] [PATCH v2] eal: fix up bad asm in rte_cpu_get_features 2014-03-19 14:48 ` [dpdk-dev] [PATCH v2] " Neil Horman @ 2014-03-19 15:44 ` H. Peter Anvin 2014-03-20 0:40 ` Neil Horman 0 siblings, 1 reply; 9+ messages in thread From: H. Peter Anvin @ 2014-03-19 15:44 UTC (permalink / raw) To: Neil Horman, dev On 03/19/2014 07:48 AM, Neil Horman wrote: > The recent conversion to build dpdk as a DSO has an error in > rte_cpu_get_features. When being build with -fpie, %ebx gets clobbered by the > cpuid instruction which is also the pic register. Therefore the inline asm > tries to save off %ebx, but does so incorrectly. It starts by loading > params.ebx to "D" which is %edi, but then the first instruction moves %ebx to > %edi, clobbering the input value. Then after the operation is complete, "D" > (%edi) is stored to the local ebx variable, but only after the xchgl instruction > has happened, which means ebx holds only the PIC pointer. This behavior was > causing strange segfults for me when running the cpuid instruction. > > The fix is pretty easy, split the asm into two separate directives, the first > saving ebx, and using it to grab the appropriate cpuid info (and correctly > listing %edi as a clobbered register in the process, and then a subsequent asm > directive preforming the reverse exchange (again, listing %edi as being > clobbered). > > Signed-off-by: Neil Horman <nhorman@tuxdriver.com> > Hi Neil :) If I'm reading this correctly, this is at the very best extremely dangerous (I'm confused why it would compile at all with PIC enabled), since it leaves the CPU state in an unexpected way between two asm statements, where the compiler is perfectly allowed to put code. Instead, I would do simple xchg/xchg, which is an idiom I have used for this particular purpose in a lot of code. The minimal patch is simply to change "mov" to "xchg" inside the asm statement. There is no fundamental reason to nail down the register to %edi, though; thus I would suggest instead: diff --git a/lib/librte_eal/common/eal_common_cpuflags.c b/lib/librte_eal/common/eal_common_cpuflags.c index 1ebf78cc2a48..6b75992fec1a 100644 --- a/lib/librte_eal/common/eal_common_cpuflags.c +++ b/lib/librte_eal/common/eal_common_cpuflags.c @@ -206,16 +206,16 @@ rte_cpu_get_features(struct cpuid_parameters_t params) "d" (params.edx)); #else asm volatile ( - "mov %%ebx, %%edi\n" + "xchgl %%ebx, %1\n" "cpuid\n" - "xchgl %%ebx, %%edi;\n" + "xchgl %%ebx, %1;\n" : "=a" (eax), - "=D" (ebx), + "=r" (ebx), "=c" (ecx), "=d" (edx) /* input */ : "a" (params.eax), - "D" (params.ebx), + "1" (params.ebx), "c" (params.ecx), "d" (params.edx)); #endif > --- > Change notes > > v2) Fix constraints to ensure that ebx isn't overwritten before asm starts > --- > lib/librte_eal/common/eal_common_cpuflags.c | 13 ++++++++----- > 1 file changed, 8 insertions(+), 5 deletions(-) > > diff --git a/lib/librte_eal/common/eal_common_cpuflags.c b/lib/librte_eal/common/eal_common_cpuflags.c > index 1ebf78c..75b505f 100644 > --- a/lib/librte_eal/common/eal_common_cpuflags.c > +++ b/lib/librte_eal/common/eal_common_cpuflags.c > @@ -190,7 +190,7 @@ static const struct feature_entry cpu_feature_table[] = { > static inline int > rte_cpu_get_features(struct cpuid_parameters_t params) > { > - int eax, ebx, ecx, edx; /* registers */ > + int eax, ebx, ecx, edx, oldebx; /* registers */ > > #ifndef __PIC__ > asm volatile ("cpuid" > @@ -206,18 +206,21 @@ rte_cpu_get_features(struct cpuid_parameters_t params) > "d" (params.edx)); > #else > asm volatile ( > - "mov %%ebx, %%edi\n" > + "xchgl %%ebx, %%edi\n" > "cpuid\n" > - "xchgl %%ebx, %%edi;\n" > : "=a" (eax), > - "=D" (ebx), > + "=b" (ebx), > "=c" (ecx), > - "=d" (edx) > + "=d" (edx), > + "=D" (oldebx) > /* input */ > : "a" (params.eax), > "D" (params.ebx), > "c" (params.ecx), > "d" (params.edx)); > + > + asm volatile ("xchgl %%ebx, %%edi;\n" > + : : "D" (oldebx)); > #endif > > switch (params.return_register) { > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dpdk-dev] [PATCH v2] eal: fix up bad asm in rte_cpu_get_features 2014-03-19 15:44 ` H. Peter Anvin @ 2014-03-20 0:40 ` Neil Horman 2014-03-20 4:22 ` H. Peter Anvin 0 siblings, 1 reply; 9+ messages in thread From: Neil Horman @ 2014-03-20 0:40 UTC (permalink / raw) To: H. Peter Anvin; +Cc: dev On Wed, Mar 19, 2014 at 08:44:46AM -0700, H. Peter Anvin wrote: > On 03/19/2014 07:48 AM, Neil Horman wrote: > > The recent conversion to build dpdk as a DSO has an error in > > rte_cpu_get_features. When being build with -fpie, %ebx gets clobbered by the > > cpuid instruction which is also the pic register. Therefore the inline asm > > tries to save off %ebx, but does so incorrectly. It starts by loading > > params.ebx to "D" which is %edi, but then the first instruction moves %ebx to > > %edi, clobbering the input value. Then after the operation is complete, "D" > > (%edi) is stored to the local ebx variable, but only after the xchgl instruction > > has happened, which means ebx holds only the PIC pointer. This behavior was > > causing strange segfults for me when running the cpuid instruction. > > > > The fix is pretty easy, split the asm into two separate directives, the first > > saving ebx, and using it to grab the appropriate cpuid info (and correctly > > listing %edi as a clobbered register in the process, and then a subsequent asm > > directive preforming the reverse exchange (again, listing %edi as being > > clobbered). > > > > Signed-off-by: Neil Horman <nhorman@tuxdriver.com> > > > So after some discussion with hpa, I need to self NAK this again, apologies for the noise. Theres some clean up to be done in this area, and I'm still getting a segfault that is in some way related to this code, but I need to dig deeper to understand it. Neil > Hi Neil :) > > If I'm reading this correctly, this is at the very best extremely > dangerous (I'm confused why it would compile at all with PIC enabled), > since it leaves the CPU state in an unexpected way between two asm > statements, where the compiler is perfectly allowed to put code. > > Instead, I would do simple xchg/xchg, which is an idiom I have used for > this particular purpose in a lot of code. The minimal patch is simply > to change "mov" to "xchg" inside the asm statement. > > There is no fundamental reason to nail down the register to %edi, > though; thus I would suggest instead: > > diff --git a/lib/librte_eal/common/eal_common_cpuflags.c > b/lib/librte_eal/common/eal_common_cpuflags.c > index 1ebf78cc2a48..6b75992fec1a 100644 > --- a/lib/librte_eal/common/eal_common_cpuflags.c > +++ b/lib/librte_eal/common/eal_common_cpuflags.c > @@ -206,16 +206,16 @@ rte_cpu_get_features(struct cpuid_parameters_t params) > "d" (params.edx)); > #else > asm volatile ( > - "mov %%ebx, %%edi\n" > + "xchgl %%ebx, %1\n" > "cpuid\n" > - "xchgl %%ebx, %%edi;\n" > + "xchgl %%ebx, %1;\n" > : "=a" (eax), > - "=D" (ebx), > + "=r" (ebx), > "=c" (ecx), > "=d" (edx) > /* input */ > : "a" (params.eax), > - "D" (params.ebx), > + "1" (params.ebx), > "c" (params.ecx), > "d" (params.edx)); > #endif > > > > --- > > Change notes > > > > v2) Fix constraints to ensure that ebx isn't overwritten before asm starts > > --- > > lib/librte_eal/common/eal_common_cpuflags.c | 13 ++++++++----- > > 1 file changed, 8 insertions(+), 5 deletions(-) > > > > diff --git a/lib/librte_eal/common/eal_common_cpuflags.c b/lib/librte_eal/common/eal_common_cpuflags.c > > index 1ebf78c..75b505f 100644 > > --- a/lib/librte_eal/common/eal_common_cpuflags.c > > +++ b/lib/librte_eal/common/eal_common_cpuflags.c > > @@ -190,7 +190,7 @@ static const struct feature_entry cpu_feature_table[] = { > > static inline int > > rte_cpu_get_features(struct cpuid_parameters_t params) > > { > > - int eax, ebx, ecx, edx; /* registers */ > > + int eax, ebx, ecx, edx, oldebx; /* registers */ > > > > #ifndef __PIC__ > > asm volatile ("cpuid" > > @@ -206,18 +206,21 @@ rte_cpu_get_features(struct cpuid_parameters_t params) > > "d" (params.edx)); > > #else > > asm volatile ( > > - "mov %%ebx, %%edi\n" > > + "xchgl %%ebx, %%edi\n" > > "cpuid\n" > > - "xchgl %%ebx, %%edi;\n" > > : "=a" (eax), > > - "=D" (ebx), > > + "=b" (ebx), > > "=c" (ecx), > > - "=d" (edx) > > + "=d" (edx), > > + "=D" (oldebx) > > /* input */ > > : "a" (params.eax), > > "D" (params.ebx), > > "c" (params.ecx), > > "d" (params.edx)); > > + > > + asm volatile ("xchgl %%ebx, %%edi;\n" > > + : : "D" (oldebx)); > > #endif > > > > switch (params.return_register) { > > > > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dpdk-dev] [PATCH v2] eal: fix up bad asm in rte_cpu_get_features 2014-03-20 0:40 ` Neil Horman @ 2014-03-20 4:22 ` H. Peter Anvin 2014-03-20 11:03 ` Neil Horman 0 siblings, 1 reply; 9+ messages in thread From: H. Peter Anvin @ 2014-03-20 4:22 UTC (permalink / raw) To: Neil Horman, dev On 03/19/2014 05:40 PM, Neil Horman wrote: > So after some discussion with hpa, I need to self NAK this again, apologies for > the noise. Theres some clean up to be done in this area, and I'm still getting > a segfault that is in some way related to this code, but I need to dig deeper to > understand it. > > Neil I still believe we should add the patch I posted in the previous email; I should clean it up and put a proper header on it. This is, if there is actually a need to feed %ebx and %edx into CPUID (the native instruction is sensitive to %eax and %ecx, but not %ebx or %edx.) For reference, this is a version of CPUID I personally often use: struct cpuid { unsigned int eax, ecx, edx, ebx; }; static inline void cpuid(unsigned int leaf, unsigned int subleaf, struct cpuid *out) { #if defined(__i386__) && defined(__PIC__) /* %ebx is a forbidden register */ asm volatile("movl %%ebx,%0 ; cpuid ; xchgl %%ebx,%0" : "=r" (out->ebx), "=a" (out->eax), "=c" (out->ecx), "=d" (out->edx) : "a" (leaf), "c" (subleaf)); #else asm volatile("cpuid" : "=b" (out->ebx), "=a" (out->eax), "=c" (out->ecx), "=d" (out->edx) : "a" (leaf), "c" (subleaf)); #endif } ... but that is a pretty significant API change. Making it an inline lets gcc elide the entire memory structure, so that is definitely useful. >> >> diff --git a/lib/librte_eal/common/eal_common_cpuflags.c >> b/lib/librte_eal/common/eal_common_cpuflags.c >> index 1ebf78cc2a48..6b75992fec1a 100644 >> --- a/lib/librte_eal/common/eal_common_cpuflags.c >> +++ b/lib/librte_eal/common/eal_common_cpuflags.c >> @@ -206,16 +206,16 @@ rte_cpu_get_features(struct cpuid_parameters_t params) >> "d" (params.edx)); >> #else >> asm volatile ( >> - "mov %%ebx, %%edi\n" >> + "xchgl %%ebx, %1\n" >> "cpuid\n" >> - "xchgl %%ebx, %%edi;\n" >> + "xchgl %%ebx, %1;\n" >> : "=a" (eax), >> - "=D" (ebx), >> + "=r" (ebx), >> "=c" (ecx), >> "=d" (edx) >> /* input */ >> : "a" (params.eax), >> - "D" (params.ebx), >> + "1" (params.ebx), >> "c" (params.ecx), >> "d" (params.edx)); >> #endif >> -hpa ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dpdk-dev] [PATCH v2] eal: fix up bad asm in rte_cpu_get_features 2014-03-20 4:22 ` H. Peter Anvin @ 2014-03-20 11:03 ` Neil Horman 2014-03-20 11:27 ` Neil Horman 0 siblings, 1 reply; 9+ messages in thread From: Neil Horman @ 2014-03-20 11:03 UTC (permalink / raw) To: H. Peter Anvin; +Cc: dev On Wed, Mar 19, 2014 at 09:22:03PM -0700, H. Peter Anvin wrote: > On 03/19/2014 05:40 PM, Neil Horman wrote: > > So after some discussion with hpa, I need to self NAK this again, apologies for > > the noise. Theres some clean up to be done in this area, and I'm still getting > > a segfault that is in some way related to this code, but I need to dig deeper to > > understand it. > > > > Neil > > I still believe we should add the patch I posted in the previous email; > I should clean it up and put a proper header on it. > I agree, but the fact of the matter is that I'm still getting a segfault very close to these instructions and I dont' understand why yet. I'd hate to just make the problem go away without understanding the reason why. The patch you propose doesn't fix (yet moving the xchgl to its own asm statement does). > This is, if there is actually a need to feed %ebx and %edx into CPUID > (the native instruction is sensitive to %eax and %ecx, but not %ebx or > %edx.) > > For reference, this is a version of CPUID I personally often use: > > struct cpuid { > unsigned int eax, ecx, edx, ebx; > }; > > static inline void cpuid(unsigned int leaf, unsigned int subleaf, > struct cpuid *out) > { > #if defined(__i386__) && defined(__PIC__) So, this is an additional difference and this in fact does make the problem clear up. By applying only this patch: @@ -192,7 +192,7 @@ rte_cpu_get_features(struct cpuid_parameters_t params) { int eax, ebx, ecx, edx; /* registers */ -#ifndef __PIC__ +#if !defined(__PIC__) || !defined(__i386__) asm volatile ("cpuid" /* output */ : "=a" (eax), my build compiles the cpuid instruction branch, not the mov;cpuid; xchgl branch (its an x86_64 build). Is there any reason that x86_64 doesn't need to save the ebx register when running cpuid while building PIE code? Neil ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dpdk-dev] [PATCH v2] eal: fix up bad asm in rte_cpu_get_features 2014-03-20 11:03 ` Neil Horman @ 2014-03-20 11:27 ` Neil Horman 2014-03-20 15:20 ` H. Peter Anvin 0 siblings, 1 reply; 9+ messages in thread From: Neil Horman @ 2014-03-20 11:27 UTC (permalink / raw) To: H. Peter Anvin; +Cc: dev On Thu, Mar 20, 2014 at 07:03:23AM -0400, Neil Horman wrote: > On Wed, Mar 19, 2014 at 09:22:03PM -0700, H. Peter Anvin wrote: > > On 03/19/2014 05:40 PM, Neil Horman wrote: > > > So after some discussion with hpa, I need to self NAK this again, apologies for > > > the noise. Theres some clean up to be done in this area, and I'm still getting > > > a segfault that is in some way related to this code, but I need to dig deeper to > > > understand it. > > > > > > Neil > > > > I still believe we should add the patch I posted in the previous email; > > I should clean it up and put a proper header on it. > > > I agree, but the fact of the matter is that I'm still getting a segfault very > close to these instructions and I dont' understand why yet. I'd hate to just > make the problem go away without understanding the reason why. The patch you > propose doesn't fix (yet moving the xchgl to its own asm statement does). > > > This is, if there is actually a need to feed %ebx and %edx into CPUID > > (the native instruction is sensitive to %eax and %ecx, but not %ebx or > > %edx.) > > > > For reference, this is a version of CPUID I personally often use: > > > > struct cpuid { > > unsigned int eax, ecx, edx, ebx; > > }; > > > > static inline void cpuid(unsigned int leaf, unsigned int subleaf, > > struct cpuid *out) > > { > > #if defined(__i386__) && defined(__PIC__) > So, this is an additional difference and this in fact does make the problem > clear up. By applying only this patch: > > @@ -192,7 +192,7 @@ rte_cpu_get_features(struct cpuid_parameters_t params) > { > int eax, ebx, ecx, edx; /* registers */ > > -#ifndef __PIC__ > +#if !defined(__PIC__) || !defined(__i386__) > asm volatile ("cpuid" > /* output */ > : "=a" (eax), > > my build compiles the cpuid instruction branch, not the mov;cpuid; xchgl branch > (its an x86_64 build). Is there any reason that x86_64 doesn't need to save the > ebx register when running cpuid while building PIE code? > > Neil > > So, I answered my own question, sort of. The __i386__ is clear: x86_64 uses RIP relative addressing, making the saving of ebx not needed - thats perfectly clear. Whats a bit less clear to me is why it matters. Ideally moving ebx and restoring it with an xchg should change the register state at all. It would clobber the lower part of rbx I think, but looking at the disassembly that shouldn't be used, so as long as the calling function saves its value of rbx, it should be ok. The odd part is, if I look at the disassembly of rte_cpu_get_flag_enabled compiled with and without the mov and xchgl operations, I see that without those additional instructions the compiler adds a push rbx and pop rbx instruction at the start and end of the assembly, but not when the mov ebx, %0 and xchgl %ebx, %0 instructions are added. I'm not sure what the compiler is sensitive to when adding those instructions, but it seems like it should be sensitive to the cpuid instruction, and should be adding it to both. I'd like your thought Peter on that, but either way it seems clear that the mov/xchgl aren't needed for x86_64 code, so I'll clean that up and post a new patch shortly. Neil ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dpdk-dev] [PATCH v2] eal: fix up bad asm in rte_cpu_get_features 2014-03-20 11:27 ` Neil Horman @ 2014-03-20 15:20 ` H. Peter Anvin 0 siblings, 0 replies; 9+ messages in thread From: H. Peter Anvin @ 2014-03-20 15:20 UTC (permalink / raw) To: Neil Horman; +Cc: dev On 03/20/2014 04:27 AM, Neil Horman wrote: >> > So, I answered my own question, sort of. The __i386__ is clear: x86_64 uses RIP > relative addressing, making the saving of ebx not needed - thats perfectly > clear. > > Whats a bit less clear to me is why it matters. Ideally moving ebx and > restoring it with an xchg should change the register state at all. It would > clobber the lower part of rbx I think, but looking at the disassembly that > shouldn't be used, so as long as the calling function saves its value of rbx, it > should be ok. I think you just hit on the real bug. If this code were compiled on 64 bits, it would clobber the *upper* half of %rbx, because a 32-bit operation on 64 bits clobber the upper half of the register. Since the compiler isn't being told that %rbx is being modified, it expects %rbx to be unmodified and disaster ensues. It just clicked on me, though, that this function is actually a static function in a .c file, meaning it is not an API at all. This code can be simplified dramatically as a result. Let me see if I can hack up something quickly. > The odd part is, if I look at the disassembly of > rte_cpu_get_flag_enabled compiled with and without the mov and xchgl operations, > I see that without those additional instructions the compiler adds a push rbx > and pop rbx instruction at the start and end of the assembly, but not when the > mov ebx, %0 and xchgl %ebx, %0 instructions are added. I'm not sure what the > compiler is sensitive to when adding those instructions, but it seems like it > should be sensitive to the cpuid instruction, and should be adding it to both. It's not the instruction, it is the fact that the constraints include a "=b". This explains why your little hack happens to work... I was wondering how it compiled at all. The answer, of course, is that it it on x86-64 where the hack is neither necessary nor correct. -hpa ^ permalink raw reply [flat|nested] 9+ messages in thread
* [dpdk-dev] [PATCH v3] eal: Fix up assembly for x86_64 in rte_cpu_get_features 2014-03-18 20:43 [dpdk-dev] [PATCH] eal: fix up bad asm in rte_cpu_get_features Neil Horman 2014-03-19 14:48 ` [dpdk-dev] [PATCH v2] " Neil Horman @ 2014-03-20 11:42 ` Neil Horman 1 sibling, 0 replies; 9+ messages in thread From: Neil Horman @ 2014-03-20 11:42 UTC (permalink / raw) To: dev; +Cc: H. Peter Anvin x86_64 doesn't need to save off and restore ebx when issuing cpuid, since x86_64 uses RIP relative addressing. Doing the save actually clobbers the lower half of rbx, which could be used and not saved off independently, leading to undefined behavior. Fix up the defines so that for x86_64 we just issue the cpuid instruction, which is safe. Also, while we're at it, lets clean up the input and output constraints on the inline asm, so that we don't load registers that the cpuid instruction isn't sensitive to. Note that this patch does alter the API, in that specifcations to ebx and edx are ignored. I chose to go ahead and do that because there is only a single caller of this function and neither register is ever written currently. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: "H. Peter Anvin" <hpa@zytor.com> --- lib/librte_eal/common/eal_common_cpuflags.c | 18 +++++++----------- 1 file changed, 7 insertions(+), 11 deletions(-) diff --git a/lib/librte_eal/common/eal_common_cpuflags.c b/lib/librte_eal/common/eal_common_cpuflags.c index 1ebf78c..0a18d53 100644 --- a/lib/librte_eal/common/eal_common_cpuflags.c +++ b/lib/librte_eal/common/eal_common_cpuflags.c @@ -192,7 +192,7 @@ rte_cpu_get_features(struct cpuid_parameters_t params) { int eax, ebx, ecx, edx; /* registers */ -#ifndef __PIC__ +#if !defined(__PIC__) || !defined(__i386__) asm volatile ("cpuid" /* output */ : "=a" (eax), @@ -201,23 +201,19 @@ rte_cpu_get_features(struct cpuid_parameters_t params) "=d" (edx) /* input */ : "a" (params.eax), - "b" (params.ebx), - "c" (params.ecx), - "d" (params.edx)); + "c" (params.ecx)); #else asm volatile ( - "mov %%ebx, %%edi\n" + "mov %%ebx, %0\n" "cpuid\n" - "xchgl %%ebx, %%edi;\n" - : "=a" (eax), - "=D" (ebx), + "xchgl %%ebx, %0\n" + : "=r" (ebx), + "=a" (eax), "=c" (ecx), "=d" (edx) /* input */ : "a" (params.eax), - "D" (params.ebx), - "c" (params.ecx), - "d" (params.edx)); + "c" (params.ecx)); #endif switch (params.return_register) { -- 1.8.3.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2014-03-20 15:19 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-03-18 20:43 [dpdk-dev] [PATCH] eal: fix up bad asm in rte_cpu_get_features Neil Horman 2014-03-19 14:48 ` [dpdk-dev] [PATCH v2] " Neil Horman 2014-03-19 15:44 ` H. Peter Anvin 2014-03-20 0:40 ` Neil Horman 2014-03-20 4:22 ` H. Peter Anvin 2014-03-20 11:03 ` Neil Horman 2014-03-20 11:27 ` Neil Horman 2014-03-20 15:20 ` H. Peter Anvin 2014-03-20 11:42 ` [dpdk-dev] [PATCH v3] eal: Fix up assembly for x86_64 " Neil Horman
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).