From: "Liu, Yong"
To: Maxime Coquelin, "Ye, Xiaolong", "Wang, Zhihong"
CC: dev@dpdk.org, Honnappa Nagarahalli, jerinj@marvell.com
Date: Tue, 28 Apr 2020 15:35:49 +0000
Message-ID: <86228AFD5BCD8E4EBFD2B90117B5E81E63547785@SHSMSX103.ccr.corp.intel.com>
In-Reply-To: <2c383833-c67f-c1fb-ea3f-c3802b24de79@redhat.com>
Subject: Re: [dpdk-dev] [PATCH v10 6/9] net/virtio: add vectorized packed ring Rx path

> -----Original Message-----
> From: Maxime Coquelin
> Sent: Tuesday, April 28, 2020 10:50 PM
> To: Liu, Yong; Ye, Xiaolong; Wang, Zhihong
> Cc: dev@dpdk.org; Honnappa Nagarahalli; jerinj@marvell.com
> Subject: Re: [PATCH v10 6/9] net/virtio: add vectorized packed ring Rx path
>
>
> On 4/28/20 4:43 PM, Liu, Yong wrote:
> >
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin
> >> Sent: Tuesday, April 28, 2020 9:46 PM
> >> To: Liu, Yong; Ye, Xiaolong; Wang, Zhihong
> >> Cc: dev@dpdk.org; Honnappa Nagarahalli; jerinj@marvell.com
> >> Subject: Re: [PATCH v10 6/9] net/virtio: add vectorized packed ring Rx path
> >>
> >>
> >> On 4/28/20 3:01 PM, Liu, Yong wrote:
> >>>>> Maxime,
> >>>>> Thanks for pointing it out; it will add an extra cache miss in the
> >>>>> datapath, and its impact on performance is around 1% in the loopback case.
> >>>> Ok, thanks for doing the test. I'll try to run some PVP benchmarks
> >>>> on my side, because when doing IO loopback the cache pressure is
> >>>> much less important.
> >>>>
> >>>>> While the benefit of the vectorized path will be more than that number.
> >>>> Ok, but I disagree for two reasons:
> >>>> 1. You have to keep in mind that non-vectorized is the default and
> >>>> encouraged mode to use. Indeed, it takes a lot of shortcuts, like not
> >>>> checking the header length (so no error stats), etc.
> >>>>
> >>> Ok, I will keep the non-vectorized path the same as before.
> >>>
> >>>> 2. It's like saying it's OK that it degrades by 5% on $CPU_VENDOR_A
> >>>> because the gain is 20% on $CPU_VENDOR_B.
> >>>>
> >>>> In case we see more degradation in real-world scenarios, you might
> >>>> want to consider using ifdefs to avoid adding padding in the non-
> >>>> vectorized case, like you did to differentiate the Virtio PMD from the
> >>>> Virtio-user PMD in patch 7.
> >>>>
> >>> Maxime,
> >>> The performance difference is so slight that I ignored it; it looks
> >>> like a sampling error.
> >>
> >> Agreed for IO loopback, but it adds one more cache line access per
> >> burst, which might be seen in some real-life use cases.
> >>
> >>> It may not be suitable to add a new configuration option for a setting
> >>> which is only used inside the driver.
> >>
> >> Wait, the Virtio-user #ifdef is based on the defconfig options? How can
> >> it work, since both the Virtio PMD and the Virtio-user PMD can be
> >> selected at the same time?
> >>
> >> I thought it was a define set before the header inclusions and unset
> >> afterwards, but I didn't check carefully.
> >>
> >
> > Maxime,
> > The difference between Virtio PMD and Virtio-user PMD addresses is
> > handled by vq->offset.
> >
> > When the Virtio PMD is running, offset will be set to buf_iova:
> > vq->offset = offsetof(struct rte_mbuf, buf_iova);
> >
> > When the Virtio-user PMD is running, offset will be set to buf_addr:
> > vq->offset = offsetof(struct rte_mbuf, buf_addr);
>
> Ok, but below is a build-time check:
>
> +#ifdef RTE_VIRTIO_USER
> +	__m128i flag_offset = _mm_set_epi64x(flags_temp, (uint64_t)vq->offset);
> +#else
> +	__m128i flag_offset = _mm_set_epi64x(flags_temp, 0);
> +#endif
>
> So how can it work in a single build for both Virtio and Virtio-user?
>
Sorry, this is an implementation error. vq->offset should be used in
descs_base for getting the IOVA address.
It will work the same as the VIRTIO_MBUF_ADDR macro.
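
To illustrate the idea, a rough sketch only (the helper name and the
pared-down virtqueue struct below are illustrative, not the actual v11 code):

#include <stdint.h>
#include <stddef.h>
#include <rte_mbuf.h>

/* Pared-down stand-in for the driver-internal struct virtqueue; only
 * the field used in this sketch is shown. */
struct virtqueue {
	uint16_t offset; /* offsetof(struct rte_mbuf, buf_iova) for the
			  * Virtio PMD, offsetof(struct rte_mbuf, buf_addr)
			  * for the Virtio-user PMD */
};

/* Resolve the buffer address through vq->offset at runtime, the same
 * way VIRTIO_MBUF_ADDR does, so one binary serves both PMDs and no
 * RTE_VIRTIO_USER #ifdef is needed in the vectorized path. */
static inline uint64_t
vq_mbuf_addr(struct rte_mbuf *mb, struct virtqueue *vq)
{
	return *(uint64_t *)((uintptr_t)mb + vq->offset);
}

The value returned here is what should feed the descriptor address
computation, instead of choosing buf_iova vs. buf_addr at compile time.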
> >>> The Virtio driver can check whether the virtqueue is using the
> >>> vectorized path at initialization, and will use the padded structure
> >>> if it is.
> >>> I have added some tested code and now the performance came back. Since
> >>> the code has changed in the initialization process, it needs some time
> >>> for regression checking.
> >>
> >> Ok, works for me.
> >>
> >> I am investigating a linkage issue with your series, which does not
> >> happen systematically (see below, it happens also with clang). David
> >> pointed me to some Intel patches removing the usage of __rte_weak,
> >> could it be related?
> >>
> >
> > I checked David's patch; it only changed the i40e driver. Meanwhile, the
> > __rte_weak attribute should still be in virtio_rxtx.c.
> > I will follow David's patch and eliminate the usage of the weak attribute.
>
> Yeah, I meant the below issue could be linked to __rte_weak, not that the
> i40e patch was the cause of this problem.
>
Maxime,
I haven't seen any build issue related to __rte_weak with either gcc or clang.

Thanks,
Marvin
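
P.S. For anyone following along, this is the weak-symbol pattern under
discussion, as a generic sketch (the function name follows the series, but
treat the whole snippet as illustrative rather than the exact code):

#include <stdint.h>
#include <rte_common.h>
#include <rte_mbuf.h>

/* virtio_rxtx.c: weak default, used when no vectorized object provides
 * an override. */
__rte_weak uint16_t
virtio_recv_pkts_packed_vec(void *rx_queue,
			    struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
{
	RTE_SET_USED(rx_queue);
	RTE_SET_USED(rx_pkts);
	RTE_SET_USED(nb_pkts);
	return 0;
}

/* The vectorized object, compiled only when the CPU and compiler support
 * it, defines a strong symbol with the same signature, and the linker is
 * expected to prefer it over the weak stub. Whether that actually happens
 * can depend on link order and static-archive handling, which is one way
 * non-systematic linkage failures appear; selecting the Rx function
 * explicitly at runtime avoids relying on weak-symbol resolution. */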