From: Ferruh Yigit <ferruh.yigit@amd.com>
To: Mihai Brodschi <mihai.brodschi@broadcom.com>,
Jakub Grajciar <jgrajcia@cisco.com>
Cc: dev@dpdk.org, stable@dpdk.org
Subject: Re: [PATCH v2] net/memif: fix buffer overflow in zero copy Rx
Date: Sun, 7 Jul 2024 15:05:35 +0100
Message-ID: <ffa4241c-342e-499a-ba21-cde5111018b1@amd.com>
In-Reply-To: <8bf5e505-191b-46ea-8b90-0ed4fc15d306@broadcom.com>
On 7/7/2024 6:50 AM, Mihai Brodschi wrote:
> Hi Ferruh,
>
> On 07/07/2024 05:12, Ferruh Yigit wrote:
>> On 6/28/2024 10:01 PM, Mihai Brodschi wrote:
>>> rte_pktmbuf_alloc_bulk is called by the zero-copy receiver to allocate
>>> new mbufs to be provided to the sender. The allocated mbuf pointers
>>> are stored in a ring, but the alloc function doesn't implement index
>>> wrap-around, so it writes past the end of the array. This results in
>>> memory corruption and duplicate mbufs being received.
>>>
>>
>> Hi Mihai,
>>
>> I am not sure that writing past the ring actually occurs.
>>
>> As far as I can see, the intent is to keep the ring as full as possible:
>> when 'head' and 'tail' are initially 0, the whole ring is filled.
>> Later the tail moves and the emptied space is filled again, so head (in
>> modulo) always ends up just behind tail after a refill. In the next run,
>> the refill only fills the part the tail has moved past, and that is what
>> 'n_slots' calculates. As this is only the size of the gap, starting from
>> 'head' (with modulo) shouldn't go past the ring length.
>>
>> Do you observe this issue in practice? If so, can you please provide a
>> backtrace and the numbers showing how to reproduce the issue?
>
> The alloc function writes starting from the ring's head, but the ring's
> head can be located at the end of the ring's memory buffer (ring_size - 1).
> The correct behavior would be to wrap around to the start of the buffer (0),
> but the alloc function has no awareness of the fact that it's writing to a
> ring, so it writes to ring_size, ring_size + 1, etc.
>
> Let's look at the existing code:
> We assume the ring size is 256 and we just received 32 packets.
> The previous tail was at index 255, now it's at index 31.
> The head is initially at index 255.
>
> head = __atomic_load_n(&ring->head, __ATOMIC_RELAXED); // head = 255
> n_slots = ring_size - head + mq->last_tail; // n_slots = 32
>
> if (n_slots < 32) // not taken
> goto no_free_mbufs;
>
> ret = rte_pktmbuf_alloc_bulk(mq->mempool, &mq->buffers[head & mask], n_slots);
> // This will write 32 mbuf pointers starting at index (head & mask) = 255.
> // The ring size is 256, so apart from the first one all pointers will be
> // written out of bounds (index 256 .. 286, when it should be 0 .. 30).
>
My expectation is that the numbers should be as follows:
Initially:
size = 256
head = 0
tail = 0
After the first refill:
n_slots = 256
head = 256
tail = 0
Subsequent run where 32 slots have been consumed:
head = 256
tail = 32
n_slots = 32
rte_pktmbuf_alloc_bulk(mq->mempool, &mq->buffers[head & mask], n_slots);
head & mask = 0
// So it fills the first 32 elements of the buffer, which is in bounds.
This will continue as above; the combination of refilling only the gap and
masking 'head' with 'mask' provides the wrapping required.
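To make that arithmetic concrete, below is a minimal standalone sketch (plain
C, not the driver code) that just replays these index computations; the
power-of-two ring size and mask = ring_size - 1 are assumptions carried over
from the quoted snippet:

#include <stdio.h>

int main(void)
{
        const unsigned int ring_size = 256;      /* assumed power of two */
        const unsigned int mask = ring_size - 1;
        unsigned int head = 0, last_tail = 0, n_slots;

        /* First refill: the ring starts empty, so all 256 slots are filled. */
        n_slots = ring_size - head + last_tail;                 /* 256 */
        printf("refill %u slots starting at index %u\n", n_slots, head & mask);
        head += n_slots;                                        /* head = 256 */

        /* 32 packets are consumed; the tail advances by 32. */
        last_tail += 32;

        /* Second refill: only the freed gap is refilled. */
        n_slots = ring_size - head + last_tail;                 /* 32 */
        printf("refill %u slots starting at index %u\n", n_slots, head & mask);
        return 0;
}

Whether (head & mask) + n_slots can ever exceed ring_size once the indices
stop lining up like this is exactly the point we need to settle.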
> I can reproduce a crash 100% of the time with my application, but the output
> is not very helpful, since it crashes elsewhere because of mempool corruption.
> Applying this patch fixes the crashes completely.
>
If this causes an always-reproducible crash, it would mean the existing memif
zero-copy Rx is broken and nobody can use it, but I am skeptical that this is
the case; perhaps something special in your use case is triggering this issue.
@Jakub, can you please confirm that memif Rx zero copy is tested?
>>> Allocate 2x the space for the mbuf ring, so that the alloc function
>>> has a contiguous array to write to, then copy the excess entries
>>> to the start of the array.
>>>
>>
>> Even if the issue is valid, I am not sure about the solution of doubling
>> the buffer memory, but let's confirm the issue first before discussing the
>> solution.
>
> Initially, I thought about splitting the call to rte_pktmbuf_alloc_bulk in two,
> but I thought that might be bad for performance if the mempool is being used
> concurrently from multiple threads.
>
> If we want to use only one call to rte_pktmbuf_alloc_bulk, we need an array
> to store the allocated mbuf pointers. This array must be of length ring_size,
> since that's the maximum amount of mbufs which may be allocated in one go.
> We need to copy the pointers from this array to the ring.
>
> If we instead allocate twice the space for the ring, we can skip copying
> the pointers which were written to the ring, and only copy those that were
> written outside of its bounds.
>
The first thing that came to my mind was also using two
'rte_pktmbuf_alloc_bulk()' calls.
I can see why you prefer doubling the buffer size, but it comes with copying
overhead.
So both options come with some overhead, and I am not sure which one is
better; although I am leaning towards the first solution, we should do some
measurements to decide.
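For when we compare them, here is a rough, untested sketch of what the
two-call variant could look like; the helper name and its parameters are
hypothetical, and the rte_* calls and field names are borrowed from the
snippets quoted in this thread:

#include <rte_common.h>
#include <rte_branch_prediction.h>
#include <rte_mbuf.h>

/* Hypothetical helper, for discussion only: bulk-allocate n_slots mbufs into
 * a power-of-two mbuf pointer ring, splitting the allocation at the wrap
 * point so no extra ring space or copy is needed.
 */
static int
refill_split(struct rte_mempool *mp, struct rte_mbuf **buffers,
             uint16_t ring_size, uint16_t head, uint16_t n_slots)
{
        uint16_t mask = ring_size - 1;
        uint16_t start = head & mask;
        uint16_t first = RTE_MIN(n_slots, (uint16_t)(ring_size - start));
        int ret;

        ret = rte_pktmbuf_alloc_bulk(mp, &buffers[start], first);
        if (unlikely(ret < 0))
                return ret;

        if (n_slots > first) {
                /* Wrapped: allocate the remainder at the start of the ring. */
                ret = rte_pktmbuf_alloc_bulk(mp, &buffers[0], n_slots - first);
                if (unlikely(ret < 0)) {
                        rte_pktmbuf_free_bulk(&buffers[start], first);
                        return ret;
                }
        }
        return 0;
}

The second allocation only happens when the refill range actually wraps, so
the common case should not change; the trade-off versus doubling the buffer
is an extra mempool operation in the wrap case instead of a copy.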
BUT let's agree on the problem first, before doing more work; I am still not
fully convinced that the original code is wrong.
>>> Fixes: 43b815d88188 ("net/memif: support zero-copy slave")
>>> Cc: stable@dpdk.org
>>> Signed-off-by: Mihai Brodschi <mihai.brodschi@broadcom.com>
>>> ---
>>> v2:
>>> - fix email formatting
>>>
>>> ---
>>> drivers/net/memif/rte_eth_memif.c | 10 +++++++++-
>>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
>>> index 16da22b5c6..3491c53cf1 100644
>>> --- a/drivers/net/memif/rte_eth_memif.c
>>> +++ b/drivers/net/memif/rte_eth_memif.c
>>> @@ -600,6 +600,10 @@ eth_memif_rx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>>> ret = rte_pktmbuf_alloc_bulk(mq->mempool, &mq->buffers[head & mask], n_slots);
>>> if (unlikely(ret < 0))
>>> goto no_free_mbufs;
>>> + if (unlikely(n_slots > ring_size - (head & mask))) {
>>> + rte_memcpy(mq->buffers, &mq->buffers[ring_size],
>>> + (n_slots + (head & mask) - ring_size) * sizeof(struct rte_mbuf *));
>>> + }
>>>
>>> while (n_slots--) {
>>> s0 = head++ & mask;
>>> @@ -1245,8 +1249,12 @@ memif_init_queues(struct rte_eth_dev *dev)
>>> }
>>> mq->buffers = NULL;
>>> if (pmd->flags & ETH_MEMIF_FLAG_ZERO_COPY) {
>>> + /*
>>> + * Allocate 2x ring_size to reserve a contiguous array for
>>> + * rte_pktmbuf_alloc_bulk (to store allocated mbufs).
>>> + */
>>> mq->buffers = rte_zmalloc("bufs", sizeof(struct rte_mbuf *) *
>>> - (1 << mq->log2_ring_size), 0);
>>> + (1 << (mq->log2_ring_size + 1)), 0);
>>> if (mq->buffers == NULL)
>>> return -ENOMEM;
>>> }
>>
>
> Apologies for sending this multiple times, I'm not familiar with mailing lists.
>
>