From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <maxime.coquelin@redhat.com>
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
 by dpdk.org (Postfix) with ESMTP id 68FF32931
 for <dev@dpdk.org>; Tue, 23 Aug 2016 16:14:47 +0200 (CEST)
Received: from int-mx10.intmail.prod.int.phx2.redhat.com
 (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.redhat.com (Postfix) with ESMTPS id D48194E4CD;
 Tue, 23 Aug 2016 14:14:46 +0000 (UTC)
Received: from [10.36.4.245] (vpn1-4-245.ams2.redhat.com [10.36.4.245])
 by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id
 u7NEEiJO004694
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
 Tue, 23 Aug 2016 10:14:46 -0400
To: Yuanhan Liu <yuanhan.liu@linux.intel.com>, dev@dpdk.org
References: <1471939839-29778-1-git-send-email-yuanhan.liu@linux.intel.com>
 <1471939839-29778-7-git-send-email-yuanhan.liu@linux.intel.com>
From: Maxime Coquelin <maxime.coquelin@redhat.com>
Message-ID: <373f1d24-f5a5-eed1-e342-87fc6c7cc3cb@redhat.com>
Date: Tue, 23 Aug 2016 16:14:44 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.2.0
MIME-Version: 1.0
In-Reply-To: <1471939839-29778-7-git-send-email-yuanhan.liu@linux.intel.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.23
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
 (mx1.redhat.com [10.5.110.38]); Tue, 23 Aug 2016 14:14:46 +0000 (UTC)
Subject: Re: [dpdk-dev] [PATCH 6/6] examples/vhost: add an option to enable
	Tx zero copy
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Tue, 23 Aug 2016 14:14:47 -0000



On 08/23/2016 10:10 AM, Yuanhan Liu wrote:
> Add an option, --tx-zero-copy, to enable Tx zero copy.
>
> One thing worth noting while using Tx zero copy is the nb_tx_desc has
> to be small enough so that the eth driver will hit the mbuf free
> threshold easily and thus free mbuf more frequently.
>
> The reason behind that is, when Tx zero copy is enabled, guest Tx used
> vring will be updated only when corresponding mbuf is freed. If mbuf is
> not freed frequently, the guest Tx vring could be starved.
>
> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
> ---
>  examples/vhost/main.c | 19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/examples/vhost/main.c b/examples/vhost/main.c
> index 9974f0b..e3437ad 100644
> --- a/examples/vhost/main.c
> +++ b/examples/vhost/main.c
> @@ -130,6 +130,7 @@ static uint32_t enable_tx_csum;
>  static uint32_t enable_tso;
>
>  static int client_mode;
> +static int tx_zero_copy;
>
>  /* Specify timeout (in useconds) between retries on RX. */
>  static uint32_t burst_rx_delay_time = BURST_RX_WAIT_US;
> @@ -297,6 +298,17 @@ port_init(uint8_t port)
>
>  	rx_ring_size = RTE_TEST_RX_DESC_DEFAULT;
>  	tx_ring_size = RTE_TEST_TX_DESC_DEFAULT;
> +
> +	/*
> +	 * When Tx zero copy is enabled, guest Tx used vring will be updated
> +	 * only when corresponding mbuf is freed. Thus, the nb_tx_desc
> +	 * (tx_ring_size here) must be small enough so that the driver will
> +	 * hit the free threshold easily and free mbufs timely. Otherwise,
> +	 * guest Tx vring would be starved.
> +	 */
> +	if (tx_zero_copy)
> +		tx_ring_size = 64;

I have a concern about more complex applications, where the mbufs might
not be consumed sequentially.
If one mbuf gets stuck for a while, whereas all others are consumed,
we would face starvation.
For example, the packet is to be routed to a VM, which is paused,
and the routing thread keeps retrying to enqueue the packet for a while.

Anyway, this feature is optional and off by default, so having the
feature applied is not a blocker.

Thanks!
Maxime