From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mga02.intel.com (mga02.intel.com [134.134.136.20])
 by dpdk.org (Postfix) with ESMTP id A4321CF9
 for ; Tue, 29 Sep 2015 16:45:59 +0200 (CEST)
Received: from fmsmga003.fm.intel.com ([10.253.24.29])
 by orsmga101.jf.intel.com with ESMTP; 29 Sep 2015 07:45:58 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.17,608,1437462000";
 d="scan'208";a="570551486"
Received: from shvmail01.sh.intel.com ([10.239.29.42])
 by FMSMGA003.fm.intel.com with ESMTP; 29 Sep 2015 07:45:57 -0700
Received: from shecgisg003.sh.intel.com (shecgisg003.sh.intel.com [10.239.29.90])
 by shvmail01.sh.intel.com with ESMTP id t8TEju2A024030
 for ; Tue, 29 Sep 2015 22:45:56 +0800
Received: from shecgisg003.sh.intel.com (localhost [127.0.0.1])
 by shecgisg003.sh.intel.com (8.13.6/8.13.6/SuSE Linux 0.8) with ESMTP id t8TEjsdb023952
 for ; Tue, 29 Sep 2015 22:45:56 +0800
Received: (from hxie5@localhost)
 by shecgisg003.sh.intel.com (8.13.6/8.13.6/Submit) id t8TEjsX3023948
 for dev@dpdk.org; Tue, 29 Sep 2015 22:45:54 +0800
From: Huawei Xie
To: dev@dpdk.org
Date: Tue, 29 Sep 2015 22:45:45 +0800
Message-Id: <1443537953-23917-1-git-send-email-huawei.xie@intel.com>
X-Mailer: git-send-email 1.7.4.1
Subject: [dpdk-dev] [PATCH 0/8] virtio: virtio ring layout optimization and RX vector processing
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 29 Sep 2015 14:46:00 -0000

Some of this message is copied from patch 4.

In a DPDK-based switching environment, vhost mostly runs on a dedicated core
while virtio processing in guest VMs runs on different cores.
Take RX for example. With the generic implementation, for each guest buffer:
a) the virtio driver allocates a descriptor from the free descriptor list
b) it modifies the avail ring entry to point to the allocated descriptor
c) after the packet is received, it frees the descriptor

When vhost fetches the avail ring, it needs to fetch the modified L1 cache
line from the virtio core, which is a heavy cost on current CPU
implementations.

The idea of this optimization is:
    allocate a fixed descriptor for each entry of the avail ring,
    so the avail ring stays the same for the whole run.
This removes the L1 cache transfer from the virtio core to the vhost core for
the avail ring. Besides, no descriptor free or allocation is needed.

Most importantly, this makes vector processing possible, which further
accelerates the processing.

This is the layout for the avail ring (taking 256 ring entries as an example),
with each entry pointing to the descriptor with the same index.

                  avail
                   idx
                    +
                    |
+----+----+---+-------------+------+
| 0  | 1  | 2 | ... |  254  | 255  |  avail ring
+-+--+-+--+-+-+---------+---+--+---+
  |    |    |       |   |      |
  |    |    |       |   |      |
  v    v    v       |   v      v
+-+--+-+--+-+-+---------+---+--+---+
| 0  | 1  | 2 | ... |  254  | 255  |  desc ring
+----+----+---+-------------+------+
                    |
                    |
+----+----+---+-------------+------+
| 0  | 1  | 2 |     |  254  | 255  |  used ring
+----+----+---+-------------+------+
                    |
                    +

This is the ring layout for TX.
As one virtio header is needed for each transmitted packet, we have 128 slots
available.

                         ++
                         ||
                         ||
+-----+-----+-----+--------------+------+------+------+
|  0  |  1  | ... |  127 || 128  | 129  | ...  | 255  |   avail ring
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
   |     |            |  ||  |      |             |
   v     v            v  ||  v      v             v
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
| 128 | 129 | ... |  255 || 128  | 129  | ...  | 255  |   desc ring for virtio_net_hdr
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
   |     |            |  ||  |      |             |
   v     v            v  ||  v      v             v
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
|  0  |  1  | ... |  127 ||  0   |  1   | ...  | 127  |   desc ring for tx data
+-----+-----+-----+--------------+------+------+------+
                         ||
                         ||
                         ++

A performance boost can be observed if the virtio backend isn't the
bottleneck, or in the VM2VM case.
There are also several vhost optimization patches to be submitted later.

Huawei Xie (8):
  virtio: add configure for simple virtio rx/tx
  virtio: add virtio_rxtx.h header file
  virtio: add software rx ring, fake_buf, simple_rxtx into virtqueue
  virtio: rx/tx ring layout optimization
  virtio: fill RX avail ring with blank mbufs
  virtio: virtio vec rx
  virtio: simple tx routine
  virtio: rxtx_func_get

 config/common_linuxapp                  |   1 +
 drivers/net/virtio/Makefile             |   2 +-
 drivers/net/virtio/virtio_ethdev.c      |  29 ++-
 drivers/net/virtio/virtio_ethdev.h      |   5 +
 drivers/net/virtio/virtio_rxtx.c        |  70 +++++-
 drivers/net/virtio/virtio_rxtx.h        |  39 ++++
 drivers/net/virtio/virtio_rxtx_simple.c | 403 ++++++++++++++++++++++++++++++++
 drivers/net/virtio/virtqueue.h          |   7 +
 8 files changed, 550 insertions(+), 6 deletions(-)
 create mode 100644 drivers/net/virtio/virtio_rxtx.h
 create mode 100644 drivers/net/virtio/virtio_rxtx_simple.c

-- 
1.8.1.4
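
Illustrative note on the fixed ring layouts described in this cover letter.
The sketch below is not taken from the patches. It only shows one way the
one-time setup of the RX and TX mappings could look. All names (sketch_desc,
sketch_vring, sketch_rx_layout_init, sketch_tx_layout_init,
sketch_tx_data_desc) are hypothetical, the structures are simplified stand-ins
rather than the real vring definitions, and it assumes a 256-entry ring in
which the virtio_net_hdr descriptors occupy indices 128..255 and the tx data
descriptors occupy indices 0..127.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 256u /* ring size used in the diagrams (assumption) */

/* Hypothetical, simplified stand-ins for the real vring structures. */
struct sketch_desc {
    uint16_t next;     /* index of the chained descriptor */
    uint16_t has_next; /* 1 if 'next' is valid, 0 otherwise */
};

struct sketch_vring {
    uint16_t avail[RING_SIZE];
    struct sketch_desc desc[RING_SIZE];
};

/* RX: avail entry i always points at descriptor i, with no chaining. */
static void sketch_rx_layout_init(struct sketch_vring *vr)
{
    for (uint16_t i = 0; i < RING_SIZE; i++) {
        vr->avail[i] = i;
        vr->desc[i].next = 0;
        vr->desc[i].has_next = 0;
    }
}

/*
 * TX: both halves of the avail ring point at the header descriptors in the
 * upper half of the desc ring; each header descriptor is chained to the
 * data descriptor with the same offset in the lower half.
 */
static void sketch_tx_layout_init(struct sketch_vring *vr)
{
    const uint16_t mid = RING_SIZE / 2;

    for (uint16_t i = 0; i < mid; i++) {
        const uint16_t hdr = (uint16_t)(i + mid);

        vr->avail[i] = hdr;         /* slots 0..127   -> hdr 128..255 */
        vr->avail[hdr] = hdr;       /* slots 128..255 -> hdr 128..255 */
        vr->desc[hdr].next = i;     /* header chains to data descriptor i */
        vr->desc[hdr].has_next = 1;
        vr->desc[i].next = 0;       /* data descriptor ends the chain */
        vr->desc[i].has_next = 0;
    }
}

/* Follow the chain from a TX avail slot to its data descriptor index. */
static uint16_t sketch_tx_data_desc(const struct sketch_vring *vr,
                                    uint16_t slot)
{
    const uint16_t hdr = vr->avail[slot];

    assert(vr->desc[hdr].has_next);
    return vr->desc[hdr].next;
}
```

Because these mappings are written once at setup and never modified on the
datapath, the cache lines holding the avail ring entries would not be dirtied
by the virtio core per packet, which is the effect the cover letter relies on.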