From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail.zytor.com (terminus.zytor.com [198.137.202.10])
 by dpdk.org (Postfix) with ESMTP id C3A44530A
 for ; Thu, 20 Mar 2014 16:19:03 +0100 (CET)
Received: from tazenda.hos.anvin.org ([IPv6:2601:9:7280:8f0:cc79:79ff:fead:f559])
 (authenticated bits=0)
 by mail.zytor.com (8.14.7/8.14.5) with ESMTP id s2KFKVJ2012644
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO);
 Thu, 20 Mar 2014 08:20:32 -0700
Message-ID: <532B073A.5010709@zytor.com>
Date: Thu, 20 Mar 2014 08:20:26 -0700
From: "H. Peter Anvin"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.3.0
MIME-Version: 1.0
To: Neil Horman
References: <1395175414-25232-1-git-send-email-nhorman@tuxdriver.com>
 <1395240524-412-1-git-send-email-nhorman@tuxdriver.com>
 <5329BB6E.8080509@zytor.com>
 <20140320004010.GA20693@neilslaptop.think-freely.org>
 <532A6CEB.1070106@zytor.com>
 <20140320110323.GA7721@hmsreliant.think-freely.org>
 <20140320112734.GB7721@hmsreliant.think-freely.org>
In-Reply-To: <20140320112734.GB7721@hmsreliant.think-freely.org>
X-Enigmail-Version: 1.6
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v2] eal: fix up bad asm in rte_cpu_get_features
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
X-List-Received-Date: Thu, 20 Mar 2014 15:19:04 -0000

On 03/20/2014 04:27 AM, Neil Horman wrote:
>>
> So, I answered my own question, sort of.  The __i386__ is clear: x86_64 uses RIP
> relative addressing, making the saving of ebx not needed - that's perfectly
> clear.
>
> What's a bit less clear to me is why it matters.  Ideally moving ebx and
> restoring it with an xchg shouldn't change the register state at all.  It would
> clobber the lower part of rbx I think, but looking at the disassembly that
> shouldn't be used, so as long as the calling function saves its value of rbx, it
> should be ok.

I think you just hit on the real bug.  If this code were compiled on 64
bits, it would clobber the *upper* half of %rbx, because a 32-bit
operation on 64 bits clobbers (zeroes) the upper half of the register.
Since the compiler isn't being told that %rbx is being modified, it
expects %rbx to be unmodified and disaster ensues.

It just clicked for me, though, that this function is actually a static
function in a .c file, meaning it is not an API at all.  This code can
be simplified dramatically as a result.  Let me see if I can hack up
something quickly.

> The odd part is, if I look at the disassembly of
> rte_cpu_get_flag_enabled compiled with and without the mov and xchgl operations,
> I see that without those additional instructions the compiler adds a push rbx
> and pop rbx instruction at the start and end of the assembly, but not when the
> mov ebx, %0 and xchgl %ebx, %0 instructions are added.  I'm not sure what the
> compiler is sensitive to when adding those instructions, but it seems like it
> should be sensitive to the cpuid instruction, and should be adding it to both.

It's not the instruction, it is the fact that the constraints include a
"=b".  This explains why your little hack happens to work... I was
wondering how it compiled at all.  The answer, of course, is that it is
on x86-64, where the hack is neither necessary nor correct.

	-hpa
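
For illustration only, here is a minimal sketch of the point about the "=b" constraint. It is not the patch that was posted to the list. The names cpuid_regs, cpuid_query and cpuid_bit_set are hypothetical and do not come from DPDK. The sketch assumes GCC- or Clang-style extended asm. On non-x86 targets the query returns 0 so the file still builds. On 32-bit PIC builds some older compilers may reject "=b" because ebx is reserved there, which is the situation the original mov/xchgl hack was written for.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the four CPUID output registers. */
struct cpuid_regs {
    uint32_t eax, ebx, ecx, edx;
};

/*
 * Run CPUID for the given leaf/subleaf.
 * Returns 1 on x86 targets, 0 (with zeroed outputs) elsewhere.
 */
static inline int cpuid_query(uint32_t leaf, uint32_t subleaf,
                              struct cpuid_regs *out)
{
#if defined(__x86_64__) || defined(__i386__)
    /*
     * "=b" tells the compiler that ebx/rbx is written by the asm, so the
     * compiler itself saves and restores the callee-saved register (the
     * push/pop rbx seen in the disassembly). No manual mov/xchgl pair is
     * needed, and nothing is modified behind the compiler's back.
     */
    __asm__ volatile("cpuid"
                     : "=a"(out->eax), "=b"(out->ebx),
                       "=c"(out->ecx), "=d"(out->edx)
                     : "a"(leaf), "c"(subleaf));
    return 1;
#else
    (void)leaf;
    (void)subleaf;
    out->eax = out->ebx = out->ecx = out->edx = 0;
    return 0;
#endif
}

/* Test one bit of a 32-bit CPUID output register. */
static inline int cpuid_bit_set(uint32_t reg, unsigned int bit)
{
    assert(bit < 32);
    return (int)((reg >> bit) & 1u);
}
```

A caller could, for example, run cpuid_query(1, 0, &r) and then check cpuid_bit_set(r.edx, 25) to see whether SSE is reported. Listing all four registers as outputs keeps the register bookkeeping in the compiler's hands, which is the behaviour described above.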