Message-ID: <f067b5ea-ade9-4b2d-8a66-9f8446138bb5@redhat.com>
Date: Wed, 29 Jan 2025 17:58:10 +0000
Subject: Re: [PATCH v3 2/2] net/af_xdp: Refactor af_xdp_tx_zc()
To: Ariel Otilibili <ariel.otilibili@6wind.com>, dev@dpdk.org
Cc: stable@dpdk.org, Thomas Monjalon <thomas@monjalon.net>,
 David Marchand <david.marchand@redhat.com>,
 Stephen Hemminger <stephen@networkplumber.org>,
 Ciara Loftus <ciara.loftus@intel.com>
References: <20250116195640.68885-1-ariel.otilibili@6wind.com>
 <20250128231152.249497-1-ariel.otilibili@6wind.com>
 <20250128231152.249497-3-ariel.otilibili@6wind.com>
From: Maryam Tahhan <mtahhan@redhat.com>
In-Reply-To: <20250128231152.249497-3-ariel.otilibili@6wind.com>


On 28/01/2025 23:11, Ariel Otilibili wrote:
> Both branches of the loop share the same logic: either exit the
> function, or fall through to the next instructions.
>
> For that, maybe_cpy_pkt() is introduced.
>
> Bugzilla ID: 1440
> Signed-off-by: Ariel Otilibili <ariel.otilibili@6wind.com>
> ---
>   drivers/net/af_xdp/rte_eth_af_xdp.c | 131 ++++++++++++++++++----------
>   1 file changed, 84 insertions(+), 47 deletions(-)
>
> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
> index 092bcb73aa0a..efe23283fb52 100644
> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
> @@ -536,13 +536,80 @@ kick_tx(struct pkt_tx_queue *txq, struct xsk_ring_cons *cq)
>   		}
>   }
>   
> +static inline uint64_t
> +update_addr(struct rte_mbuf *mbuf, struct xsk_umem_info *umem)
> +{
> +	return (uint64_t)mbuf - (uint64_t)umem->buffer
> +		- umem->mb_pool->header_size;
> +}
> +
> +static inline uint64_t
> +update_offset(struct rte_mbuf *mbuf, struct xsk_umem_info *umem)
> +{
> +	uint64_t offset;
> +
> +	offset = rte_pktmbuf_mtod(mbuf, uint64_t) - (uint64_t)mbuf
> +		+ umem->mb_pool->header_size;
> +	return offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT;
> +}
> +
> +static struct rte_mbuf *
> +maybe_kick_tx(struct pkt_tx_queue *txq, uint32_t *idx_tx, struct rte_mbuf *mbuf)
> +{
> +	struct rte_mbuf *ret = mbuf;
> +
> +	if (!xsk_ring_prod__reserve(&txq->tx, 1, idx_tx)) {
> +		kick_tx(txq, &txq->pair->cq);
> +		if (!xsk_ring_prod__reserve(&txq->tx, 1, idx_tx))
> +			ret = NULL;
> +	}
> +
> +	return ret;
> +}


[MT] I don't see why we are passing mbuf in here? It is only ever
returned unchanged (or replaced with NULL on failure).

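Something like the following would avoid the pass-through parameter by
returning a status instead (untested; reserve_tx_slot is just a
placeholder name):

static int
reserve_tx_slot(struct pkt_tx_queue *txq, uint32_t *idx_tx)
{
    /* Try to reserve one TX slot; kick the kernel and retry once on failure. */
    if (!xsk_ring_prod__reserve(&txq->tx, 1, idx_tx)) {
        kick_tx(txq, &txq->pair->cq);
        if (!xsk_ring_prod__reserve(&txq->tx, 1, idx_tx))
            return -1;
    }

    return 0;
}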

> +
> +static struct rte_mbuf *
> +maybe_create_mbuf(struct pkt_tx_queue *txq, uint32_t *idx_tx,
> +		  struct xsk_umem_info *umem) {
> +	struct rte_mbuf *local_mbuf = rte_pktmbuf_alloc(umem->mb_pool);
> +
> +	if (local_mbuf == NULL)
> +		goto out;
> +
> +	if (!xsk_ring_prod__reserve(&txq->tx, 1, idx_tx)) {
> +		rte_pktmbuf_free(local_mbuf);
> +		local_mbuf = NULL;
> +		goto out;
> +	}
> +
> +out:
> +	return local_mbuf;
> +}
> +
> +static void
> +maybe_cpy_pkt(bool is_mbuf_equal, struct xsk_umem_info *umem,
> +	      uint64_t addr_plus_offset, struct rte_mbuf *mbuf,
> +	      struct xdp_desc *desc)
> +{
> +	void *pkt;
> +
> +	if(is_mbuf_equal)
> +		goto out;
> +
> +	pkt = xsk_umem__get_data(umem->buffer, addr_plus_offset);
> +	rte_memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *), desc->len);
> +	rte_pktmbuf_free(mbuf);
> +
> +out:
> +	return;
> +}


[MT] Does this really need to be a separate function? It wasn't common
code between the two branches.

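i.e. the copy could just stay in the copy branch (untested, lifted from
the existing code, using the pre-shift offset as before):

    pkt = xsk_umem__get_data(umem->buffer, addr + offset);
    rte_memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *), desc->len);
    rte_pktmbuf_free(mbuf);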

> +
>   #if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
>   static uint16_t
>   af_xdp_tx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>   {
>   	struct pkt_tx_queue *txq = queue;
>   	struct xsk_umem_info *umem = txq->umem;
> -	struct rte_mbuf *mbuf;
> +	struct rte_mbuf *mbuf, *local_mbuf;
>   	unsigned long tx_bytes = 0;
>   	int i;
>   	uint32_t idx_tx;
> @@ -551,61 +618,31 @@ af_xdp_tx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>   	uint64_t addr, offset;
>   	struct xsk_ring_cons *cq = &txq->pair->cq;
>   	uint32_t free_thresh = cq->size >> 1;
> +	bool is_true;
>   
>   	if (xsk_cons_nb_avail(cq, free_thresh) >= free_thresh)
>   		pull_umem_cq(umem, XSK_RING_CONS__DEFAULT_NUM_DESCS, cq);
>   
>   	for (i = 0; i < nb_pkts; i++) {
>   		mbuf = bufs[i];
> +		is_true = mbuf->pool == umem->mb_pool;
>   
> -		if (mbuf->pool == umem->mb_pool) {
> -			if (!xsk_ring_prod__reserve(&txq->tx, 1, &idx_tx)) {
> -				kick_tx(txq, cq);
> -				if (!xsk_ring_prod__reserve(&txq->tx, 1,
> -							    &idx_tx))
> -					goto out;
> -			}
> -			desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx);
> -			desc->len = mbuf->pkt_len;
> -			addr = (uint64_t)mbuf - (uint64_t)umem->buffer -
> -					umem->mb_pool->header_size;
> -			offset = rte_pktmbuf_mtod(mbuf, uint64_t) -
> -					(uint64_t)mbuf +
> -					umem->mb_pool->header_size;
> -			offset = offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT;
> -			desc->addr = addr | offset;
> -			tx_bytes += desc->len;
> -			count++;
> -		} else {
> -			struct rte_mbuf *local_mbuf =
> -					rte_pktmbuf_alloc(umem->mb_pool);
> -			void *pkt;
> -
> -			if (local_mbuf == NULL)
> -				goto out;
> +		if (is_true)
> +			local_mbuf = maybe_kick_tx(txq, &idx_tx, mbuf);
> +		else
> +			local_mbuf = maybe_create_mbuf(txq, &idx_tx, umem);
>   
> -			if (!xsk_ring_prod__reserve(&txq->tx, 1, &idx_tx)) {
> -				rte_pktmbuf_free(local_mbuf);
> -				goto out;
> -			}
> +		if (!local_mbuf)
> +		  goto out;
>   
> -			desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx);
> -			desc->len = mbuf->pkt_len;
> -
> -			addr = (uint64_t)local_mbuf - (uint64_t)umem->buffer -
> -					umem->mb_pool->header_size;
> -			offset = rte_pktmbuf_mtod(local_mbuf, uint64_t) -
> -					(uint64_t)local_mbuf +
> -					umem->mb_pool->header_size;
> -			pkt = xsk_umem__get_data(umem->buffer, addr + offset);
> -			offset = offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT;
> -			desc->addr = addr | offset;
> -			rte_memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *),
> -					desc->len);
> -			tx_bytes += desc->len;
> -			rte_pktmbuf_free(mbuf);
> -			count++;
> -		}
> +		desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx);
> +		desc->len = mbuf->pkt_len;
> +		addr = update_addr(local_mbuf, umem);
> +		offset = update_offset(local_mbuf, umem);
> +		desc->addr = addr | offset;
> +		maybe_cpy_pkt(is_true, umem, addr + offset, mbuf, desc);
> +		tx_bytes += desc->len;
> +		count++;
>   	}
>   
>   out:


Firstly, thank you for your efforts to clean up the code. I believe a
simpler cleanup approach would better balance readability and
maintainability: one that focuses on eliminating the repeated code in
both branches of the conditional block. For me, the main repetition
between the branches is the code that reserves and fills in the
descriptor. Please find an example below...

Note: the following is all untested code (I don't even know if it
compiles); it's just to give an idea of the cleanup I had in mind:

static inline int
reserve_and_fill_desc(struct pkt_tx_queue *txq, struct xdp_desc **desc,
                      struct rte_mbuf *mbuf, struct xsk_umem_info *umem,
                      uint32_t *idx_tx)
{
    uint64_t addr, offset;

    /* No free TX slot; let the caller decide whether to kick and retry. */
    if (!xsk_ring_prod__reserve(&txq->tx, 1, idx_tx))
        return -1;

    *desc = xsk_ring_prod__tx_desc(&txq->tx, *idx_tx);
    (*desc)->len = mbuf->pkt_len;

    /* Base address of the mbuf within the umem. */
    addr = (uint64_t)mbuf - (uint64_t)umem->buffer - umem->mb_pool->header_size;
    /* Offset of the packet data within the mbuf, carried in the upper
     * bits of the descriptor address (unaligned chunk mode). */
    offset = rte_pktmbuf_mtod(mbuf, uint64_t) - (uint64_t)mbuf +
             umem->mb_pool->header_size;
    offset = offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT;
    (*desc)->addr = addr | offset;

    return 0;
}


Then the main function would look something like:


static uint16_t
af_xdp_tx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
{
     struct pkt_tx_queue *txq = queue;
     struct xsk_umem_info *umem = txq->umem;
     struct rte_mbuf *mbuf;
     unsigned long tx_bytes = 0;
     int i;
     uint32_t idx_tx;
     uint16_t count = 0;
     struct xdp_desc *desc;
     struct xsk_ring_cons *cq = &txq->pair->cq;
     uint32_t free_thresh = cq->size >> 1;

     if (xsk_cons_nb_avail(cq, free_thresh) >= free_thresh)
         pull_umem_cq(umem, XSK_RING_CONS__DEFAULT_NUM_DESCS, cq);

     for (i = 0; i < nb_pkts; i++) {
         mbuf = bufs[i];

         if (mbuf->pool == umem->mb_pool) {
             if (reserve_and_fill_desc(txq, &desc, mbuf, umem, &idx_tx) < 0) {
                 kick_tx(txq, cq);
                 if (reserve_and_fill_desc(txq, &desc, mbuf, umem, &idx_tx) < 0)
                     goto out;
             }
             
             tx_bytes += desc->len;
             count++;
         } else {
             struct rte_mbuf *local_mbuf = rte_pktmbuf_alloc(umem->mb_pool);
             void *pkt;

             if (local_mbuf == NULL)
                 goto out;

             if (reserve_and_fill_desc(txq, &desc, local_mbuf, umem, &idx_tx) < 0) {
                 rte_pktmbuf_free(local_mbuf);
                 goto out;
             }

            /* desc->addr carries the data offset in its upper bits;
             * unpack it with the xsk.h helper rather than masking on
             * a 4K boundary. */
            pkt = xsk_umem__get_data(umem->buffer,
                                     xsk_umem__add_offset_to_addr(desc->addr));
             rte_memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *), desc->len);
             rte_pktmbuf_free(mbuf);
             tx_bytes += desc->len;
             count++;
        }
    }

out:
     xsk_ring_prod__submit(&txq->tx, count);
     kick_tx(txq, cq);

     txq->stats.tx_pkts += count;
     txq->stats.tx_bytes += tx_bytes;
     txq->stats.tx_dropped += nb_pkts - count;

     return count;
}
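
One note on the copy path above: desc->addr packs the data offset in the
bits above XSK_UNALIGNED_BUF_OFFSET_SHIFT, so I've used
xsk_umem__add_offset_to_addr() from xsk.h to recover the data address;
masking on a 4K boundary wouldn't unpack it correctly. Again, untested.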


Please let me know your thoughts, and I’d be happy to discuss further.