From: Ivan Boule
Date: Tue, 08 Jul 2014 09:16:54 +0200
To: "Zhang, Helin", Olivier MATZ, "Richardson, Bruce"
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] Making space in mbuf data-structure
Message-ID: <53BB9AE6.4030503@6wind.com>

On 07/08/2014 09:04 AM, Zhang, Helin wrote:
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier MATZ
>> Sent: Monday, July 7, 2014 6:19 PM
>> To: Richardson, Bruce; dev@dpdk.org
>> Subject: Re: [dpdk-dev] Making space in mbuf data-structure
>>
>> Hello Bruce,
>>
>> Thank you for reviving this discussion now that 1.7 is released.
>>
>> First, I would like to reference my previous patch series that first
>> reworks the mbuf to gain 9 bytes [1]. The v1 of the patch was
>> discussed at [2].
>>
>> Now, let's list what I would find useful to have in this mbuf rework:
>>
>> - a larger ol_flags field: this is needed at least for TSO, and as
>>   the field is completely full today, I expect other features to
>>   need more bits as well.
>> - other offload fields: l4_len and mss, required for TSO.
>> - removal of ctrl_mbuf: it could be replaced by a packet mbuf. This
>>   would simplify the mbuf structure and save room in the mbuf.
>> - a new vlan tag: I suppose this could be useful in some use-cases
>>   where vlans are stacked.
>> - splitting out fields that are superimposed, if two features can be
>>   used at the same time.
>>
>> On the other hand, I'm not convinced by this:
>>
>> - new filters in the i40e driver: I don't think the mbuf is the
>>   right place for driver-specific flags. If a feature is brought
>>   by a new driver requiring a flag in the mbuf, we should take care
>>   that the flag is not bound to this particular driver and would
>>   match the same feature in another driver.
>> - sequence number: I'm not sure I understand the use-case; maybe this
>>   could stay in mbuf metadata in the reordering module.
>>
>>> Firstly, we believe that there is no possible way that we can ever
>>> fit all the fields we need into a 64-byte mbuf, and so we need to
>>> start looking at a 128-byte mbuf instead.
>>
>> The TSO patches show that it is possible to keep a 64-byte mbuf (of
>> course, it depends on what we want to add in the mbuf). I'm not
>> fundamentally against having a 128-byte mbuf. But:
>>
>> - it should not be a reason for just adding things and not reworking
>>   things that could be enhanced
>> - it should not be a reason for not optimizing the current mbuf
>>   structure
>> - if we can do the same with a 64-byte mbuf, we need to carefully
>>   compare the solutions, as fetching a second cache line is not
>>   costless in all situations. The 64-byte solution I'm proposing
>>   in [1] may cost a bit more in CPU cycles but avoids an additional
>>   cache prefetch (or miss). In some situations (I'm thinking about
>>   use-cases where we are memory-bound, e.g. an application processing
>>   a lot of data), it is better to lose a few CPU cycles.
>>
>>> First off the blocks is to look at moving the mempool pointer into
>>> the second cache line [...] Beyond this change, I'm also
>>> investigating potentially moving the "next" pointer to the second
>>> cache line, but it's looking harder to move without serious impact.
>>
>> I think we can easily find DPDK applications that would use the
>> "next" field of the mbuf on the rx side, as it is the standard way
>> of chaining packets. For instance: IP reassembly, TCP/UDP socket
>> queues, or any other protocol that needs a reassembly queue. This is
>> at least what we do in the 6WINDGate fast path stack, and I suppose
>> other network stack implementations do something similar, so we
>> should probably avoid moving this field to the 2nd cache line.
>>
>> One more issue I foresee: with slower CPUs like the Atom, having two
>> cache lines will add more cost than on a Xeon. I'm wondering if it
>> makes sense to have a compile-time option to select either limited
>> features with one cache line, or full features with two cache lines.
>> I don't know if it's a good idea because it would make the code more
>> complex, but we could consider it. I think we don't target binary
>> compatibility today?
>>
>> From a functional point of view, we could check that my TSO patch
>> can be adapted to your proposal so we can challenge and merge both
>> approaches.
>>
>> As this change would impact the core of DPDK, I think it would be
>> interesting to list some representative use-cases in order to
>> evaluate the cost of each solution. This would also help for future
>> modifications, and could be included in a sort of non-regression
>> test?
>>
>> Regards,
>> Olivier
>>
>> [1] http://dpdk.org/ml/archives/dev/2014-May/002537.html
>> [2] http://dpdk.org/ml/archives/dev/2014-May/002322.html
>
> Hi Olivier
>
> I am trying to convince you about the new "filter status" field.
> It holds the matched Flow Director filter ID, and might be reused for
> the HASH signature if the packet matches a hash filter, or others.
> It is quite useful for Flow Director, and it is not a flag.
> I guess non-Intel NICs should have a similar feature as well.

By construction, since a packet cannot match more than one filter with an
associated identifier, this is typically the kind of field that should be
put in a union with the standard 32-bit RSS id.

Regards,
Ivan

> Regards,
> Helin

-- 
Ivan Boule
6WIND Development Engineer