From: "Xie, Huawei"
To: "dev@dpdk.org", Thomas Monjalon
Date: Tue, 29 Sep 2015 15:41:41 +0000
Subject: Re: [dpdk-dev] [PATCH 0/8] virtio: virtio ring layout optimization and RX vector processing
References: <1443537953-23917-1-git-send-email-huawei.xie@intel.com>

Thomas:
Let
us review first, then discuss the macro after that.
My preference is to use the configure macro and then fix this before the next
release. We could give some development features or aggressive changes a
time buffer.


On 9/29/2015 10:46 PM, Huawei Xie wrote:
> Copied some messages from patch 4.
> In a DPDK based switching environment, vhost mostly runs on a dedicated core
> while virtio processing in guest VMs runs on different cores.
> Take RX for example, with the generic implementation, for each guest buffer,
> a) virtio driver allocates a descriptor from the free descriptor list
> b) modifies the entry of the avail ring to point to the allocated descriptor
> c) after the packet is received, frees the descriptor
>
> When vhost fetches the avail ring, it needs to fetch the modified L1 cache line from
> the virtio core, which is a heavy cost in current CPU implementations.
>
> The idea of this optimization is:
> allocate a fixed descriptor for each entry of the avail ring,
> so the avail ring will always be the same during the run.
> This removes the L1 cache transfer from virtio core to vhost core for the avail ring.
> Besides, no descriptor free and allocation is needed.
>
> Most importantly, this makes vector processing possible to further accelerate
> the processing.
>
> This is the layout for the avail ring (take 256 ring entries for example), with
> each entry pointing to the descriptor with the same index.
>                     avail
>                     idx
>                     +
>                     |
> +----+----+---+-------------+------+
> | 0  | 1  | 2 | ... |  254  | 255  |  avail ring
> +-+--+-+--+-+-+---------+---+--+---+
>   |    |    |       |   |      |
>   |    |    |       |   |      |
>   v    v    v       |   v      v
> +-+--+-+--+-+-+---------+---+--+---+
> | 0  | 1  | 2 | ... |  254  | 255  |  desc ring
> +----+----+---+-------------+------+
>                     |
>                     |
> +----+----+---+-------------+------+
> | 0  | 1  | 2 |     |  254  | 255  |  used ring
> +----+----+---+-------------+------+
>                     |
>                     +
>
> This is the ring layout for TX.
> As one virtio header is needed for each xmit packet, we have 128 slots available.
>
>                          ++
>                          ||
>                          ||
> +-----+-----+-----+--------------+------+------+------+
> |  0  |  1  | ... |  127 || 128  | 129  | ...  | 255  |   avail ring
> +--+--+--+--+-----+---+------+---+--+---+------+--+---+
>    |     |            |  ||  |      |             |
>    v     v            v  ||  v      v             v
> +--+--+--+--+-----+---+------+---+--+---+------+--+---+
> | 127 | 128 | ... |  255 || 127  | 128  | ...  | 255  |   desc ring for virtio_net_hdr
> +--+--+--+--+-----+---+------+---+--+---+------+--+---+
>    |     |            |  ||  |      |             |
>    v     v            v  ||  v      v             v
> +--+--+--+--+-----+---+------+---+--+---+------+--+---+
> |  0  |  1  | ... |  127 ||  0   |  1   | ...  | 127  |   desc ring for tx data
> +-----+-----+-----+--------------+------+------+------+
>                          ||
>                          ||
>                          ++
>
> A performance boost could be observed if the virtio backend isn't the bottleneck or in the VM2VM
> case.
> There are also several vhost optimization patches to be submitted later.
>
> Huawei Xie (8):
>   virtio: add configure for simple virtio rx/tx
>   virtio: add virtio_rxtx.h header file
>   virtio: add software rx ring, fake_buf, simple_rxtx into virtqueue
>   virtio: rx/tx ring layout optimization
>   virtio: fill RX avail ring with blank mbufs
>   virtio: virtio vec rx
>   virtio: simple tx routine
>   virtio: rxtx_func_get
>
>  config/common_linuxapp                  |   1 +
>  drivers/net/virtio/Makefile             |   2 +-
>  drivers/net/virtio/virtio_ethdev.c      |  29 ++-
>  drivers/net/virtio/virtio_ethdev.h      |   5 +
>  drivers/net/virtio/virtio_rxtx.c        |  70 +++++-
>  drivers/net/virtio/virtio_rxtx.h        |  39 ++++
>  drivers/net/virtio/virtio_rxtx_simple.c | 403 ++++++++++++++++++++++++++++++++
>  drivers/net/virtio/virtqueue.h          |   7 +
>  8 files changed, 550 insertions(+), 6 deletions(-)
>  create mode 100644 drivers/net/virtio/virtio_rxtx.h
>  create mode 100644 drivers/net/virtio/virtio_rxtx_simple.c
>
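
For readers following the layout description in the quoted cover letter, here is a minimal sketch of the fixed RX avail ring in C. It is not the code from the patch set: struct sketch_avail, sketch_rx_fixed_layout() and sketch_tx_slots() are hypothetical names made up for illustration, and the struct is a simplified stand-in rather than the real virtio ring definition. It only shows that the avail ring is written once with entry i naming descriptor i, and that TX gets half the ring as packet slots because each packet needs a header descriptor plus a data descriptor.

```c
#include <stdint.h>

#define RING_SIZE 256 /* example ring size taken from the cover letter */

/* Hypothetical, simplified avail ring; NOT the DPDK or virtio spec struct. */
struct sketch_avail {
    uint16_t idx;             /* next avail index published to the backend */
    uint16_t ring[RING_SIZE]; /* descriptor index held by each avail entry */
};

/*
 * Fill the avail ring once at setup time so that entry i always points to
 * descriptor i. After this, the driver never rewrites ring[], so the
 * backend core does not have to pull a modified cache line for it.
 */
static void sketch_rx_fixed_layout(struct sketch_avail *avail)
{
    for (uint16_t i = 0; i < RING_SIZE; i++)
        avail->ring[i] = i;
    avail->idx = 0;
}

/*
 * Each TX packet consumes two descriptors (virtio_net_hdr plus data),
 * so only half of the ring entries are usable as packet slots.
 */
static unsigned int sketch_tx_slots(unsigned int ring_size)
{
    return ring_size / 2;
}
```

With RING_SIZE at 256, sketch_tx_slots(RING_SIZE) gives the 128 TX slots mentioned above, and the RX ring contents stay constant for the lifetime of the queue; only the avail idx moves.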