Date: Mon, 12 May 2014 08:59:24 -0700
From: Stephen Hemminger
To: Olivier MATZ
Cc: "dev@dpdk.org"
Subject: Re: [dpdk-dev] [PATCH RFC 06/11] mbuf: replace data pointer by an offset
Message-ID: <20140512085924.20a29cad@nehalam.linuxnetplumber.net>
In-Reply-To: <5370E397.7000706@6wind.com>
References: <1399647038-15095-1-git-send-email-olivier.matz@6wind.com>
 <1399647038-15095-7-git-send-email-olivier.matz@6wind.com>
 <3144526.CGFdr4BbI8@xps13>
 <1FD9B82B8BF2CF418D9A1000154491D9740A92B8@ORSMSX102.amr.corp.intel.com>
 <20140512144108.GB21298@hmsreliant.think-freely.org>
 <5370E397.7000706@6wind.com>

On Mon, 12 May 2014 17:07:03 +0200
Olivier MATZ wrote:

> Hi Venky,
>
> On 05/12/2014 04:41 PM, Neil Horman wrote:
> >> This is a hugely problematic change, and has a pretty large
> >> performance impact (because of the dependency to compute and access). We
> >> debated this for a long time during the early days of DPDK and
> >> decided against it. This is also a repeated sequence - the driver
> >> will do it twice (Rx + Tx) and the next level stack will do it twice
> >> (Rx + Tx) ...
> >>
> >> My vote is to reject this particular change to the mbuf.
> >>
> >> Regards,
> >> -Venky
> >>
> > Do you have performance numbers to compare throughput with and without this
> > change? I always feel suspicious when I see the spectre of performance used to
> > support or deny a change without supporting reasoning or metrics.
>
> I agree with Neil. My feeling is that it won't impact performance, and
> it is correlated with the forwarding tests I've done with this patch.
>
> I don't really understand what would cost more by storing the offset
> instead of the virtual address. I agree that each time the stack
> accesses the beginning of the mbuf, there will be an arithmetic
> operation, but it is compensated by other operations that will be
> accelerated:
>
> - When receiving a packet, the driver will do:
>
>     m->data_off = RTE_PKTMBUF_HEADROOM;
>
>   instead of:
>
>     m->data = (char*) rxm->buf_addr + RTE_PKTMBUF_HEADROOM;
>
> - Each time the stack prepends data, it has to check if the headroom
>   is large enough to do the operation. This will be faster as data_off
>   is the headroom.
>
> - When transmitting a packet, the driver will get the physical address:
>
>     phys_addr = m->buf_physaddr + m->data_off
>
>   instead of:
>
>     phys_addr = (m->buf_physaddr + \
>                  ((char *)m->data - (char *)m->buf_addr))
>
> Moreover, these operations look negligible to me (few cycles) compared
> to the large amount of arithmetic operations and tests done in the
> driver.
>
> Regards,
> Olivier

There is one case that this change might make problematic.
Right now it is possible to clone an mbuf and, in the cloned mbuf,
use the associated data buffer as a private metadata store.
This is convenient (like skb->cb in Linux) and avoids an additional
allocation.
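
For reference, the offset arithmetic described in the quoted message can
be sketched roughly as below. This is only an illustration under stated
assumptions, not the patch itself: struct sketch_mbuf, the sketch_*
helpers, and the SKETCH_HEADROOM value are hypothetical names made up for
this example, and only the field names buf_addr, buf_physaddr, and
data_off are taken from the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed headroom size, for illustration only. */
#define SKETCH_HEADROOM 128

/* Hypothetical stand-in for the mbuf fields discussed in the thread;
 * this is NOT the real struct rte_mbuf layout. */
struct sketch_mbuf {
    void     *buf_addr;     /* virtual address of the buffer start */
    uint64_t  buf_physaddr; /* physical address of the buffer start */
    uint16_t  data_off;     /* offset of packet data from buf_addr */
    uint16_t  data_len;     /* bytes of packet data */
};

/* Rx path: resetting the data start is a constant store. */
static inline void sketch_rx_reset(struct sketch_mbuf *m)
{
    m->data_off = SKETCH_HEADROOM;
    m->data_len = 0;
}

/* Accessing the data costs one addition (the overhead under debate). */
static inline char *sketch_data(const struct sketch_mbuf *m)
{
    return (char *)m->buf_addr + m->data_off;
}

/* Tx path: the physical address needs no pointer subtraction. */
static inline uint64_t sketch_data_phys(const struct sketch_mbuf *m)
{
    return m->buf_physaddr + m->data_off;
}

/* The headroom is the offset itself. */
static inline uint16_t sketch_headroom(const struct sketch_mbuf *m)
{
    return m->data_off;
}

/* Prepend len bytes; returns the new data start, or NULL if the
 * headroom is too small. */
static inline char *sketch_prepend(struct sketch_mbuf *m, uint16_t len)
{
    if (len > m->data_off)
        return NULL;
    m->data_off = (uint16_t)(m->data_off - len);
    m->data_len = (uint16_t)(m->data_len + len);
    return sketch_data(m);
}
```

Whether the extra addition in sketch_data() is measurable is exactly the
open question in this thread; the sketch only shows where each operation
lands, it says nothing about the cost.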