From: Maxime Coquelin
To: "Hu, Jiayu" , "Ma, WenwuX" , "dev@dpdk.org"
Cc: "Xia, Chenbo" , "Jiang, Cheng1" , "Wang, YuanX"
Date: Fri, 16 Jul 2021 09:45:50 +0200
Subject: Re: [dpdk-dev] [PATCH v5 3/4] vhost: support async dequeue for split ring
List-Id: DPDK patches and discussions
References: <20210602083110.5530-1-yuanx.wang@intel.com>
 <20210705181151.141752-1-wenwux.ma@intel.com>
 <20210705181151.141752-4-wenwux.ma@intel.com>
 <74bd35ee-5548-f32d-638f-9ea1748e8e35@redhat.com>

Hi,

On 7/16/21 3:10 AM, Hu, Jiayu wrote:
> Hi, Maxime,
>
>> -----Original Message-----
>> From: Maxime Coquelin
>> Sent: Thursday, July 15, 2021 9:18 PM
>> To: Hu, Jiayu ; Ma, WenwuX ; dev@dpdk.org
>> Cc: Xia, Chenbo ; Jiang, Cheng1 ; Wang, YuanX
>> Subject: Re: [PATCH v5 3/4] vhost: support async dequeue for split ring
>>
>> On 7/14/21 8:50 AM, Hu, Jiayu wrote:
>>> Hi Maxime,
>>>
>>> Thanks for your comments. Replies are inline.
>>>
>>>> -----Original Message-----
>>>> From: Maxime Coquelin
>>>> Sent: Tuesday, July 13, 2021 10:30 PM
>>>> To: Ma, WenwuX ; dev@dpdk.org
>>>> Cc: Xia, Chenbo ; Jiang, Cheng1 ; Hu, Jiayu ; Wang, YuanX
>>>> Subject: Re: [PATCH v5 3/4] vhost: support async dequeue for split ring
>>>>
>>>>>  struct async_inflight_info {
>>>>>  	struct rte_mbuf *mbuf;
>>>>> -	uint16_t descs; /* num of descs inflight */
>>>>> +	union {
>>>>> +		uint16_t descs; /* num of descs in-flight */
>>>>> +		struct async_nethdr nethdr;
>>>>> +	};
>>>>>  	uint16_t nr_buffers; /* num of buffers inflight for packed ring */
>>>>> -};
>>>>> +} __rte_cache_aligned;
>>>>
>>>> Does it really need to be cache aligned?
>>>
>>> How about changing it to 32-byte alignment? Then a cacheline can hold 2 objects.
>>
>> Or not forcing any alignment at all? Would there really be a performance
>> regression?
>>
>>>>>  /**
>>>>>   * dma channel feature bit definition
>>>>> @@ -193,4 +201,34 @@ __rte_experimental
>>>>>  uint16_t rte_vhost_poll_enqueue_completed(int vid, uint16_t queue_id,
>>>>>  		struct rte_mbuf **pkts, uint16_t count);
>>>>>
>>>>> +/**
>>>>> + * This function tries to receive packets from the guest with offloading
>>>>> + * large copies to the DMA engine. Successfully dequeued packets are
>>>>> + * transfer completed, either by the CPU or the DMA engine, and they are
>>>>> + * returned in "pkts". There may be other packets that are sent from
>>>>> + * the guest but being transferred by the DMA engine, called in-flight
>>>>> + * packets. The amount of in-flight packets by now is returned in
>>>>> + * "nr_inflight". This function will return in-flight packets only after
>>>>> + * the DMA engine finishes transferring.
>>>>
>>>> I am not sure to understand that comment. Is it still "in-flight" if
>>>> the DMA transfer is completed?
>>>
>>> "in-flight" means packet copies have been submitted to the DMA, but the
>>> DMA hasn't completed the copies yet.
>>>
>>>> Are we ensuring packets are not reordered with this way of working?
>>>
>>> There is a threshold that can be set by users. If it is set to 0, which
>>> means all packet copies are assigned to the DMA, the packets sent from
>>> the guest will not be reordered.
>>
>> Reordering packets is bad in my opinion. We cannot expect the user to
>> know that he should set the threshold to zero to have packets ordered.
>>
>> Maybe we should consider not having a threshold, and so have every
>> descriptor handled either by the CPU (sync datapath) or by the DMA
>> (async datapath). Doing so would simplify the code a lot, and would
>> make performance/latency more predictable.
>>
>> I understand that we might not get the best performance for every
>> packet size doing that, but that may be a tradeoff we would make to
>> have the feature maintainable and easily usable by the user.
>
> I understand and agree in some way. But before changing the existing
> design in async enqueue and dequeue, we need more careful tests, as the
> current design is well validated and performance looks good. So I
> suggest doing it in 21.11.

My understanding was that for the enqueue path packets were not
reordered; I thought the used ring was written in order, but it seems I
was wrong.

What kind of validation and performance testing has been done? I can
imagine reordering to have a bad impact on L4+ benchmarks.

Let's first fix this for the enqueue path, then submit a new revision
for the dequeue path without packet reordering.

Regards,
Maxime

> Thanks,
> Jiayu
>