From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Venkatesan, Venky"
To: Neil Horman
Cc: "dev@dpdk.org"
Date: Tue, 13 May 2014 13:54:07 +0000
Message-ID: <1FD9B82B8BF2CF418D9A1000154491D9740AA95E@ORSMSX102.amr.corp.intel.com>
In-Reply-To: <20140512183943.GC21298@hmsreliant.think-freely.org>
Subject: Re: [dpdk-dev] [PATCH RFC 06/11] mbuf:
replace data pointer by an offset

An alternative way to save 6 bytes (without the side effects this change
has) would be to change the mempool struct * to a uint16_t mempool_id.
That limits the changes to a return function, and the performance impact
of that can be mitigated quite easily.

-Venky

-----Original Message-----
From: Neil Horman [mailto:nhorman@tuxdriver.com]
Sent: Monday, May 12, 2014 11:40 AM
To: Venkatesan, Venky
Cc: Olivier MATZ; Thomas Monjalon; dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH RFC 06/11] mbuf: replace data pointer by an offset

On Mon, May 12, 2014 at 04:06:23PM +0000, Venkatesan, Venky wrote:
> Olivier,
>
> The impact isn't going to be felt in the driver quite as much (and can
> be mitigated) - the driver runs at a pretty low IPC (~1.7) compared to
> some of the more optimized code above it that actually accesses the
> data. The problem with the dependent compute is this: in effect you
> are changing
>
>   struct eth_hdr *eth = (struct eth_hdr *)m->data;
>
> into
>
>   struct eth_hdr *eth = (struct eth_hdr *)((char *)m->buf_addr + m->data_offset);
>
> We have some code that actually processes 4-8 packets in parallel
> (parse + hash), with a pretty high IPC. What we've done here is
> essentially replace a simple load with a load, load, add sequence in
> front of it. There is no real way to do these computations in parallel
> for multiple packets - it has to be done one or two at a time. What
> suffers quite significantly is the IPC of the overall function that
> does the parse/hash. It's those functions that I worry about more than
> the driver.
> I haven't yet been able to come up with a mitigation for this.
>
> Neil,
>
> The last time we looked at this change - and it's been a while - the
> negative effect on the upper-level functions built on this was on the
> order of 15-20%. It will probably get worse once we tune the code even
> more. I hope the above explanation gives you a flavour of the problem
> this will introduce.
>

I'm sorry, it doesn't. I take you at your word that it was a problem,
but I don't think we can just categorically deny patches based on past
testing of potentially similar code, especially given that this series
attempts to improve some traffic patterns via the implementation of TSO
(meaning the net result will differ based on the use case).

I understand what you're saying above, that this code incurs a second
load operation (though I would think the loads could be issued in
parallel, or at the very least accelerated by clever placement of
data_offset relative to buf_addr to ensure that the second load is
cache hot).

Regardless, my point is that just saying this can't be done because you
saw a performance hit with something similar in the past isn't helpful.
If you think it's a problem, then we really need the details of your
test case and the measurements you took, so that they can be reproduced
and confirmed or refuted.

Regards
Neil
> Regards,
> -Venky
>
> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Monday, May 12, 2014 8:07 AM
> To: Neil Horman; Venkatesan, Venky
> Cc: Thomas Monjalon; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH RFC 06/11] mbuf: replace data pointer
> by an offset
>
> Hi Venky,
>
> On 05/12/2014 04:41 PM, Neil Horman wrote:
> >> This is a hugely problematic change, and has a pretty large
> >> performance impact (because of the dependency to compute and
> >> access). We debated this for a long time during the early days of
> >> DPDK and decided against it. This is also a repeated sequence - the
> >> driver will do it twice (Rx + Tx) and the next-level stack will do
> >> it twice (Rx + Tx) ...
> >>
> >> My vote is to reject this particular change to the mbuf.
> >>
> >> Regards,
> >> -Venky
> >>
> > Do you have performance numbers to compare throughput with and
> > without this change? I always feel suspicious when I see the spectre
> > of performance used to support or deny a change without supporting
> > reasoning or metrics.
>
> I agree with Neil. My feeling is that it won't impact performance, and
> this is corroborated by the forwarding tests I've done with this patch.
>
> I don't really understand what would cost more by storing the offset
> instead of the virtual address. I agree that each time the stack
> accesses the beginning of the mbuf there will be an arithmetic
> operation, but it is compensated by other operations that will be
> accelerated:
>
> - When receiving a packet, the driver will do:
>
>     m->data_off = RTE_PKTMBUF_HEADROOM;
>
>   instead of:
>
>     m->data = (char *)rxm->buf_addr + RTE_PKTMBUF_HEADROOM;
>
> - Each time the stack prepends data, it has to check whether the
>   headroom is large enough for the operation. This will be faster, as
>   data_off is the headroom.
>
> - When transmitting a packet, the driver will get the physical
>   address with:
>
>     phys_addr = m->buf_physaddr + m->data_off;
>
>   instead of:
>
>     phys_addr = m->buf_physaddr +
>                 ((char *)m->data - (char *)m->buf_addr);
>
> Moreover, these operations look negligible to me (a few cycles)
> compared to the large amount of arithmetic operations and tests done
> in the driver.
>
> Regards,
> Olivier