Date: Tue, 8 Sep 2015 08:39:26 -0700
From: Stephen Hemminger
To: "Xie, Huawei"
Cc: "dev@dpdk.org", "ms >> Michael S. Tsirkin"
Subject: Re: [dpdk-dev] virtio optimization idea
Message-ID: <20150908083926.3f2f409f@urahara>

On Fri, 4 Sep 2015 08:25:05 +0000
"Xie, Huawei" wrote:

> Hi:
>
> Recently I have done one virtio optimization proof of concept. The
> optimization includes two parts:
> 1) avail ring set with fixed descriptors
> 2) RX vectorization
> With the optimizations, we could have a several-fold performance boost
> for pure vhost-virtio throughput.
>
> Here I will only cover the first part, which is the prerequisite for
> the second part.
> Let us first take RX for example. Currently when we fill the avail ring
> with guest mbufs, we need to
> a) allocate one descriptor (for non-sg mbuf) from the free descriptors
> b) set the idx of the desc into the entry of the avail ring
> c) set the addr/len fields of the descriptor to point to the guest's
>    blank mbuf data area
>
> Those operations take time, and especially step b results in modified
> (M) state of the cache line for the avail ring in the virtio processing
> core. When vhost processes the avail ring, the cache line transfer from
> the virtio processing core to the vhost processing core takes a lot of
> CPU cycles.
> To solve this problem, this is the arrangement of the RX ring for the
> DPDK pmd (for the non-mergeable case).
>
>              avail
>               idx
>                +
>                |
> +-----+-----+-----+-----+-----+-----+
> |  0  |  1  |  2  | ... | 254 | 255 |  avail ring
> +--+--+--+--+--+--+-----+--+--+--+--+
>    |     |     |           |     |
>    v     v     v           v     v
> +--+--+--+--+--+--+-----+--+--+--+--+
> |  0  |  1  |  2  | ... | 254 | 255 |  desc ring
> +-----+-----+-----+-----+-----+-----+
>                |
>                |
> +-----+-----+-----+-----+-----+-----+
> |  0  |  1  |  2  |     | 254 | 255 |  used ring
> +-----+-----+-----+-----+-----+-----+
>                |
>                +
>
> Avail ring is initialized with fixed descriptors and is never changed,
> i.e., the index value of the nth avail ring entry is always n, which
> means the virtio PMD is actually refilling the desc ring only, without
> having to change the avail ring.
> When vhost fetches the avail ring, if not evicted, it is always in its
> first level cache.
>
> When RX receives packets from the used ring, we use the used->idx as
> the desc idx. This requires that vhost processes and returns descs from
> the avail ring to the used ring in order, which is true for both the
> current dpdk vhost and kernel vhost implementations. In my
> understanding, there is no necessity for vhost net to process
> descriptors OOO. One case could be zero copy: for example, if one
> descriptor doesn't meet the zero copy requirement, we could directly
> return it to the used ring, earlier than the descriptors in front of it.
> To enforce this, I want to use a reserved bit to indicate in-order
> processing of descriptors.
>
> For the tx ring, the arrangement is like below. Each transmitted mbuf
> needs a desc for virtio_net_hdr, so actually we have only 128 free
> slots.
>
>                         ++
>                         ||
>                         ||
> +-----+-----+-----+-----++-----+-----+-----+-----+
> |  0  |  1  | ... | 127 || 128 | 129 | ... | 255 |  avail ring with fixed descriptor
> +--+--+--+--+-----+--+--++--+--+--+--+-----+--+--+
>    |     |           |  ||  |     |           |
>    v     v           v  ||  v     v           v
> +--+--+--+--+-----+--+--++--+--+--+--+-----+--+--+
> | 127 | 128 | ... | 255 || 127 | 128 | ... | 255 |  desc ring for virtio_net_hdr
> +--+--+--+--+-----+--+--++--+--+--+--+-----+--+--+
>    |     |           |  ||  |     |           |
>    v     v           v  ||  v     v           v
> +--+--+--+--+-----+--+--++--+--+--+--+-----+--+--+
> |  0  |  1  | ... | 127 ||  0  |  1  | ... | 127 |  desc ring for tx data
> +-----+-----+-----+-----++-----+-----+-----+-----+
>

Does this still work with a Linux (or BSD) guest/host? If you are
assuming both virtio and vhost are DPDK, this is never going to be
usable.

On a related note, have you looked at getting virtio to support the
new standard (not legacy) mode?
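
For anyone following the thread, here is a minimal C sketch of the RX
arrangement described in the quoted proposal. It is only an illustration
under stated assumptions: the types and function names (rx_queue,
rxq_init_fixed_avail, rxq_refill, rxq_next_used_desc) are made up for
this example and are not DPDK or virtio APIs, the ring size of 256 is
taken from the diagram, and the receive path assumes vhost returns
descriptors strictly in order, which is the assumption being questioned
above.

```c
#include <stdint.h>

#define RING_SIZE 256u  /* ring size taken from the quoted diagram */

/* Simplified stand-ins for split-virtqueue structures. These are
 * hypothetical illustration types, not the real virtio/DPDK headers. */
struct desc {
    uint64_t addr;
    uint32_t len;
    uint16_t flags;
    uint16_t next;
};

struct avail_ring {
    uint16_t flags;
    uint16_t idx;
    uint16_t ring[RING_SIZE];
};

struct rx_queue {
    struct desc desc[RING_SIZE];
    struct avail_ring avail;
    uint16_t used_cons;  /* next used entry the driver will consume */
};

/* One-time setup: avail entry n always names descriptor n. */
static void rxq_init_fixed_avail(struct rx_queue *q)
{
    for (uint16_t i = 0; i < RING_SIZE; i++)
        q->avail.ring[i] = i;
    q->avail.idx = 0;
    q->used_cons = 0;
}

/* Refill one slot: writes the descriptor and bumps avail.idx, but never
 * writes avail.ring[], so those cache lines are not dirtied again. */
static void rxq_refill(struct rx_queue *q, uint64_t buf_addr, uint32_t buf_len)
{
    uint16_t slot = (uint16_t)(q->avail.idx % RING_SIZE);

    q->desc[slot].addr = buf_addr;
    q->desc[slot].len = buf_len;
    /* A real driver would need a write barrier here before publishing. */
    q->avail.idx++;
}

/* Receive: assuming in-order completion by vhost, the position in the
 * used ring is also the descriptor index, so no lookup is needed. */
static uint16_t rxq_next_used_desc(struct rx_queue *q)
{
    return (uint16_t)(q->used_cons++ % RING_SIZE);
}
```

Compared with steps a) to c) in the proposal, step a (descriptor
allocation) and step b (writing the avail ring entry) disappear from the
refill path; only step c and the idx update remain. Whether a Linux or
BSD peer would tolerate or guarantee the in-order behaviour is not
something this sketch addresses.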