From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Zhou, Danny"
To: "John W.
Linville", Neil Horman
Cc: "dev@dpdk.org"
Date: Mon, 15 Sep 2014 19:11:37 +0000
Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices
References: <1405024369-30058-1-git-send-email-linville@tuxdriver.com>
 <1405362290-6753-1-git-send-email-linville@tuxdriver.com>
 <20140912180523.GB7145@tuxdriver.com>
 <20140912185423.GD7145@tuxdriver.com>
 <20140915150946.GA11690@hmsreliant.think-freely.org>
 <20140915162244.GB11690@hmsreliant.think-freely.org>
 <20140915174822.GG28459@tuxdriver.com>
In-Reply-To: <20140915174822.GG28459@tuxdriver.com>

> -----Original Message-----
> From: John W. Linville [mailto:linville@tuxdriver.com]
> Sent: Tuesday, September 16, 2014 1:48 AM
> To: Neil Horman
> Cc: Zhou, Danny; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices
>
> On Mon, Sep 15, 2014 at 12:22:44PM -0400, Neil Horman wrote:
> > On Mon, Sep 15, 2014 at 03:43:07PM +0000, Zhou, Danny wrote:
> > >
> > > > -----Original Message-----
> > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > Sent: Monday, September 15, 2014 11:10 PM
> > > > To: Zhou, Danny
> > > > Cc: John W.
Linville; dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices
> > > >
> > > > On Fri, Sep 12, 2014 at 08:35:47PM +0000, Zhou, Danny wrote:
> > > > > > -----Original Message-----
> > > > > > From: John W. Linville [mailto:linville@tuxdriver.com]
> > > > > > Sent: Saturday, September 13, 2014 2:54 AM
> > > > > > To: Zhou, Danny
> > > > > > Cc: dev@dpdk.org
> > > > > > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices
> > > > > >
> > > > > > On Fri, Sep 12, 2014 at 06:31:08PM +0000, Zhou, Danny wrote:
> > > > > > > I am concerned about the performance cost of too many
> > > > > > > memcpy() calls. Specifically, on the Rx side, the kernel NIC
> > > > > > > driver needs to copy packets to an skb, then af_packet copies
> > > > > > > the packets to the AF_PACKET buffers, which are mapped to user
> > > > > > > space, and then those packets are copied to DPDK mbufs. In
> > > > > > > addition, 3 copies are needed on the Tx side. So to run a
> > > > > > > simple DPDK L2/L3 forwarding benchmark, each packet needs 6
> > > > > > > copies, which has a significant negative performance impact. We
> > > > > > > had a bifurcated driver prototype that can do zero-copy and achieve
> > > > > > > native DPDK performance, but it depends on base driver and AF_PACKET
> > > > > > > code changes in the kernel. John R will be presenting it at the
> > > > > > > coming Linux Plumbers Conference. Once the kernel adopts it, the
> > > > > > > relevant PMD will be submitted to dpdk.org.
> > > > > >
> > > > > > Admittedly, this is not as good a performer as most of the existing
> > > > > > PMDs. It serves a different purpose, after all. FWIW, you did
> > > > > > previously indicate that it performed better than the pcap-based PMD.
> > > > >
> > > > > Yes, slightly higher but makes no big difference.
> > > > >
> > > > Do you have numbers for this?
It seems to me faster is faster as long as it's
> > > > statistically significant. Even if it's not, John's AF_PACKET PMD has the ability
> > > > to scale to multiple CPUs more easily than the pcap PMD, as it can make use of
> > > > the AF_PACKET fanout feature.
> > >
> > > For 64B small packets, 1.35M pps with 1 queue.
> > Why did you only test with a single queue? Multiqueue operation was one of the
> > big advantages of the AF_PACKET-based PMD. I would expect a single-queue setup
> > to perform in a very similar fashion to the pcap PMD.
> >
> > > As both the pcap and AF_PACKET PMDs depend on interrupt-based NIC kernel
> > > drivers, none of the DPDK performance optimization techniques are
> > > utilized. Why should DPDK adopt two similar and poorly performing PMDs
> > > which cannot demonstrate DPDK's key value, "high performance"?
> > Several reasons:
> > * "High performance" isn't always the key need for end users. Consider
> > the pre-hardware-availability development phase.
> >
> > * Better hardware modeling (consider AF_PACKET's multiqueue ability)
> >
> > * Better scaling (pcap doesn't make use of the fanout features that AF_PACKET
> > does)
> >
> > * Space savings. Building the AF_PACKET PMD doesn't require the additional
> > building/storage of the pcap driver.
>
> This would include not requiring a dependency on libpcap, if nothing else.

librte_pmd_pcap and librte_pmd_packet are both DPDK wrapper libraries, on top
of the libpcap library and the AF_PACKET module respectively, so they were not
built for high performance, which is understandable. DPDK is opening up to a
wider audience of data center users who do not care about very high
performance, so from that angle it makes sense to me to adopt
librte_pmd_packet.

>
> >
> > > > > >
> > > > > > I look forward to seeing the changes you mention -- they sound very
> > > > > > exciting. But, they will still require both networking core and
> > > > > > driver changes in the kernel.
And as I understand things today,
> > > > > > the userland code will still need at least some knowledge of specific
> > > > > > devices and how they lay out their packet descriptors, etc. So while
> > > > > > those changes sound very promising, they will still have certain
> > > > > > drawbacks in common with the current situation.
> > > > >
> > > > > Yes, we would like the DPDK performance optimization techniques, such
> > > > > as huge pages, efficient rx/tx routines to manipulate device-specific
> > > > > packet descriptors, and the polling model, to still be usable. We have
> > > > > to trade off performance against commonality. But we believe it will
> > > > > be much easier to develop a DPDK PMD for non-Intel NICs than to port
> > > > > entire kernel drivers to DPDK.
> > > > >
> > > >
> > > > Not sure how this relates. What you're describing is the feature Intel has been
> > > > working on to augment kernel drivers to provide better throughput via direct
> > > > hardware access to user space. John's PMD provides ubiquitous function on all
> > > > hardware. I'm not sure how the desire for one implies the other isn't valuable?
> > > >
> > >
> > > Performance is the key value of DPDK, instead of commonality. But we are
> > > trying to improve the commonality of our solution so that it can be easily
> > > adopted by other NIC vendors.
> > >
> > That's completely irrelevant to the question at hand. To go with your reasoning,
> > if performance is the key value of the DPDK, then you should remove all driver
> > support save for the most performant hardware you have. By that same token,
> > you should deprecate the pcap driver in favor of this AF_PACKET driver, because
> > it has shown performance improvement.
> >
> > I'm being facetious, of course, but the facts remain: Lack of superior
> > performance from one PMD to the next does not immediately obviate the need for
> > one PMD over another, as they quite likely address differing needs.
As you note
> > the DPDK seeks performance as a key goal, but it's an open source project, and there
> > are other needs from other users in play here. The AF_PACKET PMD provides
> > superior performance on Linux platforms when hardware independence is required.
> > It differs from the pcap PMD as it uses features that are only available on the
> > Linux platform, so it stands to reason we should have both.
>
> IMHO, the biggest deficiency in DPDK is the lack of apps. Let's face
> it, no one really cares about running l2fwd except for testing the
> drivers. What people want is applications. Providing a PMD to use
> while developing an app without requiring specific hardware seems like
> a win to me. The pcap PMD addresses some of that, but it is more of
> a stop-gap or special purpose thing (like for playing back captures).
>

That is not true for network middleboxes, which solve L2/L3 packet processing
problems (the main problem DPDK was created to solve), but it might be true
for data center or endpoint applications that focus primarily on L4-L7 packet
processing. Those do not care much about L2/L3 throughput and packet latency,
because the system performance bottlenecks are in the L4-L7 routines.

> > > > > > It seems like the changes you mention will still need some sort of
> > > > > > AF_PACKET-based PMD driver. Have you implemented that completely
> > > > > > separate from the code I already posted? Or did you add that work
> > > > > > on top of mine?
> > > > > >
> > > > >
> > > > > For the userland code, it certainly uses some of your code related to
> > > > > the raw socket, but highly modified. A layer will be added to the
> > > > > eth_dev library to do device probe and support new socket options.
> > > > >
> > > >
> > > > Ok, but again, PMDs are independent, and serve different needs.
If their use
> > > > is at all overlapping from a functional standpoint, take this one now, and
> > > > deprecate it when a better one comes along. Though from your description it
> > > > seems like both have a valid place in the ecosystem.
> > > >
> > >
> > > I am OK with this approach, as long as this AF_PACKET PMD does not add
> > > extra maintenance effort. Thomas might make the call.
> > >
> > What extra maintainer efforts do you think are required here, that wouldn't be
> > required for any PMD? To suggest that a given PMD shouldn't be included because
> > it would require additional effort to maintain holds it to a higher standard
> > than the PMDs already included. I don't recall anyone asking if the i40e or
> > bonding PMDs would require additional effort before being integrated.
>
> Right -- how much maintainer effort is put into the pcap driver
> these days?

I do not know the details, but I DO know the validation guys need to put a lot
of effort into measuring its performance on different platforms. An automation
function and a performance test suite would probably help a lot.

>
> John
> --
> John W. Linville		Someday the world will need a hero, and you
> linville@tuxdriver.com			might be all we have.  Be ready.