From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 19 May 2016 12:16:55 +0530
From: Jerin Jacob
To: Olivier Matz
Message-ID: <20160519064653.GA4790@localhost.localdomain>
References: <1462810707-7434-1-git-send-email-olivier.matz@6wind.com> <1463587328-13019-1-git-send-email-olivier.matz@6wind.com>
In-Reply-To: <1463587328-13019-1-git-send-email-olivier.matz@6wind.com>
Subject: Re: [dpdk-dev] [PATCH v2] mbuf: add helpers to prefetch mbuf
List-Id: patches and discussions about DPDK

On Wed, May 18, 2016 at 06:02:08PM +0200, Olivier Matz wrote:
> Some architectures (ex: Power8) have a cache line size of 128 bytes,
> so the drivers should not expect that prefetching the second part of
> the mbuf with rte_prefetch0(&m->cacheline1) is valid.
>
> This commit add helpers that can be used by drivers to prefetch the
> rx or tx part of the mbuf, whatever the cache line size.
>
> Signed-off-by: Olivier Matz

Reviewed-by: Jerin Jacob

> ---
>
> v1 -> v2:
> - rename part0 as part1 and part1 as part2, as suggested by Thomas
>
>
>  drivers/net/fm10k/fm10k_rxtx_vec.c |  8 ++++----
>  drivers/net/i40e/i40e_rxtx_vec.c   |  8 ++++----
>  drivers/net/ixgbe/ixgbe_rxtx_vec.c |  8 ++++----
>  drivers/net/mlx4/mlx4.c            |  4 ++--
>  drivers/net/mlx5/mlx5_rxtx.c       |  4 ++--
>  examples/ipsec-secgw/ipsec-secgw.c |  2 +-
>  lib/librte_mbuf/rte_mbuf.h         | 38 ++++++++++++++++++++++++++++++++++++++
>  7 files changed, 55 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c b/drivers/net/fm10k/fm10k_rxtx_vec.c
> index 03e4a5c..ef256a5 100644
> --- a/drivers/net/fm10k/fm10k_rxtx_vec.c
> +++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
> @@ -487,10 +487,10 @@ fm10k_recv_raw_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
>  		rte_compiler_barrier();
>
>  		if (split_packet) {
> -			rte_prefetch0(&rx_pkts[pos]->cacheline1);
> -			rte_prefetch0(&rx_pkts[pos + 1]->cacheline1);
> -			rte_prefetch0(&rx_pkts[pos + 2]->cacheline1);
> -			rte_prefetch0(&rx_pkts[pos + 3]->cacheline1);
> +			rte_mbuf_prefetch_part2(rx_pkts[pos]);
> +			rte_mbuf_prefetch_part2(rx_pkts[pos + 1]);
> +			rte_mbuf_prefetch_part2(rx_pkts[pos + 2]);
> +			rte_mbuf_prefetch_part2(rx_pkts[pos + 3]);
>  		}
>
>  		/* D.1 pkt 3,4 convert format from desc to pktmbuf */
> diff --git a/drivers/net/i40e/i40e_rxtx_vec.c b/drivers/net/i40e/i40e_rxtx_vec.c
> index f7a62a8..eef80d9 100644
> --- a/drivers/net/i40e/i40e_rxtx_vec.c
> +++ b/drivers/net/i40e/i40e_rxtx_vec.c
> @@ -297,10 +297,10 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>  		_mm_storeu_si128((__m128i *)&rx_pkts[pos+2], mbp2);
>
>  		if (split_packet) {
> -			rte_prefetch0(&rx_pkts[pos]->cacheline1);
> -			rte_prefetch0(&rx_pkts[pos + 1]->cacheline1);
> -			rte_prefetch0(&rx_pkts[pos + 2]->cacheline1);
> -			rte_prefetch0(&rx_pkts[pos + 3]->cacheline1);
> +			rte_mbuf_prefetch_part2(rx_pkts[pos]);
> +			rte_mbuf_prefetch_part2(rx_pkts[pos + 1]);
> +			rte_mbuf_prefetch_part2(rx_pkts[pos + 2]);
> +			rte_mbuf_prefetch_part2(rx_pkts[pos + 3]);
>  		}
>
>  		/* avoid compiler reorder optimization */
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> index c4d709b..e97ea82 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> @@ -307,10 +307,10 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>  		_mm_storeu_si128((__m128i *)&rx_pkts[pos+2], mbp2);
>
>  		if (split_packet) {
> -			rte_prefetch0(&rx_pkts[pos]->cacheline1);
> -			rte_prefetch0(&rx_pkts[pos + 1]->cacheline1);
> -			rte_prefetch0(&rx_pkts[pos + 2]->cacheline1);
> -			rte_prefetch0(&rx_pkts[pos + 3]->cacheline1);
> +			rte_mbuf_prefetch_part2(rx_pkts[pos]);
> +			rte_mbuf_prefetch_part2(rx_pkts[pos + 1]);
> +			rte_mbuf_prefetch_part2(rx_pkts[pos + 2]);
> +			rte_mbuf_prefetch_part2(rx_pkts[pos + 3]);
>  		}
>
>  		/* avoid compiler reorder optimization */
> diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
> index c5d8535..733d192 100644
> --- a/drivers/net/mlx4/mlx4.c
> +++ b/drivers/net/mlx4/mlx4.c
> @@ -3235,8 +3235,8 @@ mlx4_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
>  		 * Fetch initial bytes of packet descriptor into a
>  		 * cacheline while allocating rep.
>  		 */
> -		rte_prefetch0(seg);
> -		rte_prefetch0(&seg->cacheline1);
> +		rte_mbuf_prefetch_part1(seg);
> +		rte_mbuf_prefetch_part2(seg);
>  		ret = rxq->if_cq->poll_length_flags(rxq->cq, NULL, NULL,
>  						    &flags);
>  		if (unlikely(ret < 0)) {
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index 1832a21..5be8c62 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -1086,8 +1086,8 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
>  		 * Fetch initial bytes of packet descriptor into a
>  		 * cacheline while allocating rep.
>  		 */
> -		rte_prefetch0(seg);
> -		rte_prefetch0(&seg->cacheline1);
> +		rte_mbuf_prefetch_part1(seg);
> +		rte_mbuf_prefetch_part2(seg);
>  		ret = rxq->poll(rxq->cq, NULL, NULL, &flags, &vlan_tci);
>  		if (unlikely(ret < 0)) {
>  			struct ibv_wc wc;
> diff --git a/examples/ipsec-secgw/ipsec-secgw.c b/examples/ipsec-secgw/ipsec-secgw.c
> index 1dc505c..ebd7c23 100644
> --- a/examples/ipsec-secgw/ipsec-secgw.c
> +++ b/examples/ipsec-secgw/ipsec-secgw.c
> @@ -298,7 +298,7 @@ prepare_tx_burst(struct rte_mbuf *pkts[], uint16_t nb_pkts, uint8_t port)
>  	const int32_t prefetch_offset = 2;
>
>  	for (i = 0; i < (nb_pkts - prefetch_offset); i++) {
> -		rte_prefetch0(pkts[i + prefetch_offset]->cacheline1);
> +		rte_mbuf_prefetch_part2(pkts[i + prefetch_offset]);
>  		prepare_tx_pkt(pkts[i], port);
>  	}
>  	/* Process left packets */
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 7b92b88..3ee8d66 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -842,6 +842,44 @@ struct rte_mbuf {
>  	uint16_t timesync;
>  } __rte_cache_aligned;
>
> +/**
> + * Prefetch the first part of the mbuf
> + *
> + * The first 64 bytes of the mbuf corresponds to fields that are used early
> + * in the receive path. If the cache line of the architecture is higher than
> + * 64B, the second part will also be prefetched.
> + *
> + * @param m
> + *   The pointer to the mbuf.
> + */
> +static inline void
> +rte_mbuf_prefetch_part1(struct rte_mbuf *m)
> +{
> +	rte_prefetch0(&m->cacheline0);
> +}
> +
> +/**
> + * Prefetch the second part of the mbuf
> + *
> + * The next 64 bytes of the mbuf corresponds to fields that are used in the
> + * transmit path. If the cache line of the architecture is higher than 64B,
> + * this function does nothing as it is expected that the full mbuf is
> + * already in cache.
> + *
> + * @param m
> + *   The pointer to the mbuf.
> + */
> +static inline void
> +rte_mbuf_prefetch_part2(struct rte_mbuf *m)
> +{
> +#if RTE_CACHE_LINE_SIZE == 64
> +	rte_prefetch0(&m->cacheline1);
> +#else
> +	RTE_SET_USED(m);
> +#endif
> +}
> +
> +
>  static inline uint16_t rte_pktmbuf_priv_size(struct rte_mempool *mp);
>
>  /**
> --
> 2.8.0.rc3
>
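
For anyone who wants to try the idea outside a DPDK tree, here is a
minimal sketch of the two helpers and of a prefetch-ahead loop in the
style of the ipsec-secgw hunk. It is not the patch itself. Everything
prefixed with demo_ or DEMO_ is a hypothetical stand-in: struct
demo_mbuf only models the two 64-byte halves of the mbuf,
DEMO_CACHE_LINE_SIZE plays the role of RTE_CACHE_LINE_SIZE, and the
GCC/Clang builtin __builtin_prefetch() is assumed in place of
rte_prefetch0(). The loop also checks the bound explicitly for each
packet, which is a choice made for this sketch and not something the
patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for RTE_CACHE_LINE_SIZE. Build with
 * -DDEMO_CACHE_LINE_SIZE=128 to model a Power8-like target. */
#ifndef DEMO_CACHE_LINE_SIZE
#define DEMO_CACHE_LINE_SIZE 64
#endif

/* Hypothetical header with two 64-byte halves, loosely modelled on the
 * cacheline0/cacheline1 split of struct rte_mbuf. The aligned attribute
 * is a GCC/Clang extension. */
struct demo_mbuf {
	uint8_t part1[64];	/* fields used early in the receive path */
	uint8_t part2[64];	/* fields used in the transmit path */
} __attribute__((aligned(DEMO_CACHE_LINE_SIZE)));

/* The second half always starts 64 bytes in, whatever the line size. */
static_assert(offsetof(struct demo_mbuf, part2) == 64,
	      "part2 must start at byte 64");

/* Number of hardware cache lines covered by one header. */
static inline unsigned int
demo_mbuf_cache_lines(void)
{
	return (unsigned int)(sizeof(struct demo_mbuf) / DEMO_CACHE_LINE_SIZE);
}

/* Prefetch the first half; with lines wider than 64 bytes this also
 * brings in the second half, because both share one line. */
static inline void
demo_mbuf_prefetch_part1(const struct demo_mbuf *m)
{
	__builtin_prefetch(m->part1, 0, 3);
}

/* Prefetch the second half only when it sits on its own cache line. */
static inline void
demo_mbuf_prefetch_part2(const struct demo_mbuf *m)
{
#if DEMO_CACHE_LINE_SIZE == 64
	__builtin_prefetch(m->part2, 0, 3);
#else
	(void)m;	/* same line as part1, nothing to do */
#endif
}

/* Touch part2 of every packet while prefetching part2 of the packet two
 * slots ahead. Returns the number of packets handled. */
static unsigned int
demo_prepare_burst(struct demo_mbuf *pkts[], unsigned int nb_pkts)
{
	const unsigned int ahead = 2;
	unsigned int i, done = 0;

	for (i = 0; i < nb_pkts; i++) {
		if (i + ahead < nb_pkts)
			demo_mbuf_prefetch_part2(pkts[i + ahead]);
		pkts[i]->part2[0] = 1;	/* placeholder for per-packet work */
		done++;
	}
	return done;
}
```

A caller would fill an array of struct demo_mbuf pointers and pass it
to demo_prepare_burst(). With the default 64-byte setting the header
covers two cache lines and part2 gets its own prefetch; with 128 it
covers one line and demo_mbuf_prefetch_part2() compiles to nothing,
which is the behaviour the commit message describes.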