From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 19 May 2016 19:05:54 +0530
From: Jerin Jacob
To: "Ananyev, Konstantin"
CC: "Richardson, Bruce", "dev@dpdk.org", "thomas.monjalon@6wind.com", "viktorin@rehivetech.com", "jianbo.liu@linaro.org"
Message-ID: <20160519133548.GA5308@localhost.localdomain>
References: <1463579863-32053-1-git-send-email-jerin.jacob@caviumnetworks.com> <20160518164300.GA12324@bricha3-MOBL3> <20160518185011.GA4432@localhost.localdomain> <20160519085047.GA17500@bricha3-MOBL3> <2601191342CEEE43887BDE71AB97725836B5AB67@irsmsx105.ger.corp.intel.com>
In-Reply-To: <2601191342CEEE43887BDE71AB97725836B5AB67@irsmsx105.ger.corp.intel.com>
Subject: Re: [dpdk-dev] [PATCH] mbuf: make rearm_data address naturally aligned
List-Id: patches and discussions about DPDK

On Thu, May 19, 2016 at 12:18:57PM +0000, Ananyev, Konstantin wrote:
>
> Hi everyone,
>
> > On Thu, May 19, 2016 at 12:20:16AM +0530, Jerin Jacob wrote:
> > > On Wed, May 18, 2016 at 05:43:00PM +0100, Bruce Richardson wrote:
> > > > On Wed, May 18, 2016 at 07:27:43PM +0530, Jerin Jacob wrote:
> > > > > To avoid multiple stores on fast path, Ethernet drivers
> > > > > aggregate the
> > > > > writes to data_off, refcnt, nb_segs and port
> > > > > to a uint64_t data and write the data in one shot
> > > > > with uint64_t* at &mbuf->rearm_data address.
> > > > >
> > > > > Some of the non-IA platforms have store operation overhead
> > > > > if the store address is not naturally aligned. This patch
> > > > > fixes the performance issue on those targets.
> > > > >
> > > > > Signed-off-by: Jerin Jacob
> > > > > ---
> > > > >
> > > > > Tested this patch on IA and non-IA (ThunderX) platforms.
> > > > > This patch shows a 400Kpps/core improvement on ThunderX + ixgbe + vector environment,
> > > > > and this patch does not have any overhead on the IA platform.
> > > > >
> > > > > Have tried another similar approach by replacing "buf_len" with "pad"
> > > > > (in this patch context).
> > > > > Since it has the additional overhead of a read and then a mask to keep "buf_len" intact,
> > > > > not much improvement is shown.
> > > > > ref: http://dpdk.org/ml/archives/dev/2016-May/038914.html
> > > > >
> > > > > ---
> > > > While this will work and from your tests doesn't seem to have a performance
> > > > impact, I'm not sure I particularly like it. It's extending out the end of
> > > > cacheline0 of the mbuf by 16 bytes, though I suppose it's not technically using
> > > > up any more space of it.
> > >
> > > Extending by 2 bytes, right? Yes, I guess now we are using only 56 out of 64 bytes
> > > in the first 64-byte cache line.
> > >
> > > >
> > > > What I'm wondering about though, is do we have any usecases where we need a
> > > > variable buf_len for packets for RX. These mbufs come directly from a mempool,
> > > > which is generally understood to be a set of fixed-sized buffers.
> > > > I realise that
> > > > this change was made in the past after some discussion, but one of the key points
> > > > there [at least to my reading] was that - even though nobody actually made a
> > > > concrete case where they had variable-sized buffers - having support for them
> > > > made no performance difference.
>
> I was going to point to vhost zcp support, but as Thomas pointed out
> that functionality was removed from dpdk.org recently.
> So I am not aware whether such a case exists right now in the 'real world' or not.
> Though I still think the RX function should leave the buf_len field intact.
>
> > > >
> > > > The latter part of that has now changed, and supporting variable-sized mbufs
> > > > from an mbuf pool has a perf impact. Do we definitely need that functionality,
> > > > because the easiest fix here is just to move the rxrearm marker back above
> > > > mbuf_len as it was originally in releases like 1.8?
> > >
> > > And initialize the buf_len with mp->elt_size - sizeof(struct rte_mbuf).
> > > Right?
> > >
> > > I don't have a strong opinion on this, I can do this if there is no
> > > objection to it. Let me know.
> > >
> > > However, I do see that in future "buf_len" may belong at the end of the first 64-byte
> > > cache line, as currently "port" is defined as uint8_t, which IMO is too small.
> > > We may need to increase that to uint16_t. The reason I think so is that
> > > currently in ThunderX HW we have 128 VFs per socket for the
> > > built-in NIC, so a two-node configuration plus one external PCIe NW card
> > > can easily go beyond 256 ports.
>
> I wonder, does anyone really use the mbuf port field?
> My thought was - could we drop it completely?
> Actually, after discussing it with Bruce offline, an interesting idea came out:
> if we drop port and make mbuf_prefree() reset nb_segs=1, then
> we can reduce RX rearm_data to 4B.
> So with that layout:
>
> struct rte_mbuf {
>
>         MARKER cacheline0;
>
>         void *buf_addr;
>         phys_addr_t buf_physaddr;
>         uint16_t buf_len;
>         uint8_t nb_segs;
>         uint8_t reserved_1byte; /* former port */
>
>         MARKER32 rearm_data;
>         uint16_t data_off;
>         uint16_t refcnt;
>
>         uint64_t ol_flags;
>         ...
>
> We can keep buf_len at its place and avoid the 2B gap, while making rearm_data
> 4B long and 4B aligned.

Couple of comments:
- IMO, it would be good if nb_segs could move under rearm_data, as some
  drivers (maybe not ixgbe) can also write nb_segs in one shot in the
  segmented RX handler case.
- I think it makes sense to keep port in the mbuf so that applications can
  make use of it (not sure what real application developers think of this).
- If writing 4B and 8B consumes the same cycles (at least on arm64), then I
  think it makes sense to keep it 8B wide, with as many pre-built constants
  as possible.

>
> Another similar alternative is to make mbuf_prefree() set refcnt=1
> (as it updates it anyway). Then we can remove refcnt from the RX rearm_data,
> and again make rearm_data 4B long and 4B aligned:
>
> struct rte_mbuf {
>
>         MARKER cacheline0;
>
>         void *buf_addr;
>         phys_addr_t buf_physaddr;
>         uint16_t buf_len;
>         uint16_t refcnt;
>
>         MARKER32 rearm_data;
>         uint16_t data_off;
>         uint8_t nb_segs;
>         uint8_t port;

The only problem I see with this approach is that the port data type cannot be
extended to uint16_t in the future.

>
>         uint64_t ol_flags;
>         ..
>
> As an additional plus, __rte_mbuf_raw_alloc() wouldn't need to modify mbuf contents at all -
> which probably is a good thing.
> As a drawback - we'll have free mbufs in the pool with refcnt==1, which probably reduces
> the debuggability of the mbuf code.
>
> Konstantin
>
> > >
> > Ok, good point. If you think it's needed, and if we are changing the mbuf
> > structure, it might be a good time to extend that field while you are at it, save
> > a second ABI break later on.
> >
> > /Bruce
> >
> > > >
> > > > Regards,
> > > > /Bruce
> > > >
> > > > Ref: http://dpdk.org/ml/archives/dev/2014-December/009432.html
> > > >
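
For reference, the first 4B rearm_data layout quoted above can be sketched in plain C as follows. This is only an illustration under stated assumptions, not DPDK code: struct sketch_mbuf, make_rearm_template() and rearm() are hypothetical names, uint64_t stands in for phys_addr_t, and only the leading fields of the quoted layout are modelled. The static assertions check that data_off and refcnt are adjacent and start on a 4-byte boundary, so both can be reset with one naturally aligned 32-bit write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the quoted layout; NOT the real struct rte_mbuf. */
struct sketch_mbuf {
        void     *buf_addr;
        uint64_t  buf_physaddr;    /* assumption: stands in for phys_addr_t */
        uint16_t  buf_len;
        uint8_t   nb_segs;
        uint8_t   reserved_1byte;  /* former port */
        /* the 4B rearm_data region would start here */
        uint16_t  data_off;
        uint16_t  refcnt;
        uint64_t  ol_flags;
};

/* The rearm region must be naturally aligned for a 32-bit store ... */
_Static_assert(offsetof(struct sketch_mbuf, data_off) % sizeof(uint32_t) == 0,
               "rearm region is not 4-byte aligned");
/* ... and data_off/refcnt must be contiguous so one store covers both. */
_Static_assert(offsetof(struct sketch_mbuf, refcnt) ==
               offsetof(struct sketch_mbuf, data_off) + sizeof(uint16_t),
               "data_off and refcnt are not adjacent");

/* Pre-build the 32-bit constant once; copying from a uint16_t pair keeps
 * the byte order identical to the in-struct field order on any endianness. */
static uint32_t make_rearm_template(uint16_t data_off, uint16_t refcnt)
{
        const uint16_t fields[2] = { data_off, refcnt };
        uint32_t tmpl;

        memcpy(&tmpl, fields, sizeof(tmpl));
        return tmpl;
}

/* Reset data_off and refcnt together with one fixed-size 4-byte copy. */
static void rearm(struct sketch_mbuf *m, uint32_t tmpl)
{
        memcpy((char *)m + offsetof(struct sketch_mbuf, data_off),
               &tmpl, sizeof(tmpl));
}
```

In this sketch a driver would build the template once at queue setup and call rearm() per buffer; a fixed-size 4-byte memcpy() like this is typically compiled down to a single 32-bit store, and buf_len, nb_segs and ol_flags are left untouched.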