From mboxrd@z Thu Jan  1 00:00:00 1970
From: Hiroshi Shimamoto
To: "dev@dpdk.org"
Cc: Hayato Momma
Subject: [dpdk-dev] [memnic PATCH v2 4/7] pmd: use compiler barrier
Date: Tue, 30 Sep 2014 11:13:43 +0000
Message-ID: <7F861DC0615E0C47A872E6F3C5FCDDBD02AE268C@BPXM14GP.gisp.nec.co.jp>

From: Hiroshi Shimamoto

x86 preserves store ordering for standard (write-back) store
operations, so a full memory barrier is not needed here. rte_mb()
is much more expensive than a compiler barrier in the main packet
processing loop; replacing it improves xmit/recv packet performance.

We can see the performance improvement with memnic-tester.
Using a Xeon E5-2697 v2 @ 2.70GHz, 4 vCPU:

  size |  before  |  after
  -----+----------+----------
    64 | 4.18Mpps | 4.59Mpps
   128 | 3.85Mpps | 4.87Mpps
   256 | 4.01Mpps | 4.72Mpps
   512 | 3.52Mpps | 4.41Mpps
  1024 | 3.18Mpps | 3.64Mpps
  1280 | 2.86Mpps | 3.15Mpps
  1518 | 2.59Mpps | 2.87Mpps

Note: we have to take care if we ever use non-temporal stores, which
are not covered by x86's store-ordering guarantee.

Signed-off-by: Hiroshi Shimamoto
Reviewed-by: Hayato Momma
---
 pmd/pmd_memnic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/pmd/pmd_memnic.c b/pmd/pmd_memnic.c
index 872f3c4..0783440 100644
--- a/pmd/pmd_memnic.c
+++ b/pmd/pmd_memnic.c
@@ -316,7 +316,7 @@ static uint16_t memnic_recv_pkts(void *rx_queue,
 	bytes += p->len;
 
 drop:
-	rte_mb();
+	rte_compiler_barrier();
 	p->status = MEMNIC_PKT_ST_FREE;
 
 	if (++idx >= MEMNIC_NR_PACKET)
@@ -403,7 +403,7 @@ retry:
 	pkts++;
 	bytes += pkt_len;
 
-	rte_mb();
+	rte_compiler_barrier();
 	p->status = MEMNIC_PKT_ST_FILLED;
 
 	rte_pktmbuf_free(tx_pkts[nr]);
-- 
1.8.3.1
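
For context on why a compiler barrier is sufficient here, the following
is a minimal standalone sketch of the publish pattern both hunks rely
on. It is not part of the patch: the struct layout, status values, and
the publish_packet() helper are hypothetical, and only the barrier
macro mirrors what DPDK's rte_compiler_barrier() actually expands to.
The sketch assumes x86's total-store-order (TSO) model, as the commit
message does.

#include <stdint.h>
#include <string.h>

/*
 * DPDK's rte_compiler_barrier() expands to essentially this empty asm
 * statement: the "memory" clobber forbids the compiler from reordering
 * memory accesses across it, yet it emits no CPU instruction. rte_mb(),
 * by contrast, emits a full fence instruction on x86.
 */
#define compiler_barrier() __asm__ __volatile__ ("" : : : "memory")

/* Hypothetical shared-memory packet slot; the field names follow the
 * diff context, but the real memnic layout and status values differ. */
#define PKT_ST_FREE   0
#define PKT_ST_FILLED 1

struct shm_packet {
	volatile uint32_t status;   /* polled by the peer */
	uint32_t len;
	uint8_t data[2048];
};

/*
 * Producer side, mirroring the xmit hunk: fill the payload first, then
 * publish it by storing the status flag. x86 (TSO) never lets one
 * ordinary store become visible to other cores ahead of an earlier
 * store, so preventing *compiler* reordering is all that is required.
 */
static void publish_packet(struct shm_packet *p, const void *buf, uint32_t len)
{
	memcpy(p->data, buf, len);
	p->len = len;

	compiler_barrier();         /* payload stores must not sink below... */
	p->status = PKT_ST_FILLED;  /* ...this store, which publishes them */
}

The consumer side (the recv hunk) is symmetric: read status, compiler
barrier, then read the payload; x86 likewise does not reorder loads
against older loads. This is also why the note about non-temporal
stores matters: movnt* stores fall outside the TSO guarantee and would
require a real fence again.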