DPDK patches and discussions
* [PATCH v2] net/memif: fix buffer overflow in zero copy Rx
@ 2024-06-28 21:01 Mihai Brodschi
  2024-07-01  4:57 ` Patrick Robb
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Mihai Brodschi @ 2024-06-28 21:01 UTC (permalink / raw)
  To: Jakub Grajciar, Ferruh Yigit; +Cc: dev, Mihai Brodschi, stable

rte_pktmbuf_alloc_bulk is called by the zero-copy receiver to allocate
new mbufs to be provided to the sender. The allocated mbuf pointers
are stored in a ring, but the alloc function doesn't implement index
wrap-around, so it writes past the end of the array. This results in
memory corruption and duplicate mbufs being received.

Allocate 2x the space for the mbuf ring, so that the alloc function
has a contiguous array to write to, then copy the excess entries
to the start of the array.

Fixes: 43b815d88188 ("net/memif: support zero-copy slave")
Cc: stable@dpdk.org
Signed-off-by: Mihai Brodschi <mihai.brodschi@broadcom.com>
---
v2:
 - fix email formatting

---
 drivers/net/memif/rte_eth_memif.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
index 16da22b5c6..3491c53cf1 100644
--- a/drivers/net/memif/rte_eth_memif.c
+++ b/drivers/net/memif/rte_eth_memif.c
@@ -600,6 +600,10 @@ eth_memif_rx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 	ret = rte_pktmbuf_alloc_bulk(mq->mempool, &mq->buffers[head & mask], n_slots);
 	if (unlikely(ret < 0))
 		goto no_free_mbufs;
+	if (unlikely(n_slots > ring_size - (head & mask))) {
+		rte_memcpy(mq->buffers, &mq->buffers[ring_size],
+			(n_slots + (head & mask) - ring_size) * sizeof(struct rte_mbuf *));
+	}
 
 	while (n_slots--) {
 		s0 = head++ & mask;
@@ -1245,8 +1249,12 @@ memif_init_queues(struct rte_eth_dev *dev)
 		}
 		mq->buffers = NULL;
 		if (pmd->flags & ETH_MEMIF_FLAG_ZERO_COPY) {
+			/*
+			 * Allocate 2x ring_size to reserve a contiguous array for
+			 * rte_pktmbuf_alloc_bulk (to store allocated mbufs).
+			 */
 			mq->buffers = rte_zmalloc("bufs", sizeof(struct rte_mbuf *) *
-						  (1 << mq->log2_ring_size), 0);
+						  (1 << (mq->log2_ring_size + 1)), 0);
 			if (mq->buffers == NULL)
 				return -ENOMEM;
 		}
-- 
2.43.0


* Re: [PATCH v2] net/memif: fix buffer overflow in zero copy Rx
  2024-06-28 21:01 [PATCH v2] net/memif: fix buffer overflow in zero copy Rx Mihai Brodschi
@ 2024-07-01  4:57 ` Patrick Robb
  2024-07-07  2:12 ` Ferruh Yigit
  2024-10-10  2:33 ` Ferruh Yigit
  2 siblings, 0 replies; 14+ messages in thread
From: Patrick Robb @ 2024-07-01  4:57 UTC (permalink / raw)
  To: Mihai Brodschi; +Cc: Jakub Grajciar, Ferruh Yigit, dev, stable

I see this patch series had a CI testing failure, for the coremask DTS
test on Marvell CN10K. I don't think it relates to the contents of your
patch, though.

It had a timeout:

TestCoremask: Test Case test_individual_coremask Result FAILED:
TIMEOUT on ./arm64-native-linuxapp-gcc/app/test/dpdk-test  -c 0x8000
-n 2 --log-level="lib.eal,8"

So, I'm issuing a retest.
Recheck-request: iol-marvell-Functional

* Re: [PATCH v2] net/memif: fix buffer overflow in zero copy Rx
  2024-06-28 21:01 [PATCH v2] net/memif: fix buffer overflow in zero copy Rx Mihai Brodschi
  2024-07-01  4:57 ` Patrick Robb
@ 2024-07-07  2:12 ` Ferruh Yigit
  2024-07-07  5:50   ` Mihai Brodschi
  2024-10-10  2:33 ` Ferruh Yigit
  2 siblings, 1 reply; 14+ messages in thread
From: Ferruh Yigit @ 2024-07-07  2:12 UTC (permalink / raw)
  To: Mihai Brodschi, Jakub Grajciar; +Cc: dev, stable

On 6/28/2024 10:01 PM, Mihai Brodschi wrote:
> rte_pktmbuf_alloc_bulk is called by the zero-copy receiver to allocate
> new mbufs to be provided to the sender. The allocated mbuf pointers
> are stored in a ring, but the alloc function doesn't implement index
> wrap-around, so it writes past the end of the array. This results in
> memory corruption and duplicate mbufs being received.
> 

Hi Mihai,

I am not sure writing past the ring actually occurs.

As far as I can see, the intent is to keep the ring as full as possible:
when 'head' and 'tail' are initially 0, the whole ring is filled.
Later the tail moves and the emptied space is filled again, so head (in
modulo) is always just behind tail after a refill. In the next run, the
refill only fills the part the tail has moved, and this is calculated by
'n_slots'. As this is only the size of the gap, starting from 'head'
(with modulo) shouldn't pass the ring length.

Do you observe this issue in practice? If so, can you please provide a
backtrace and numbers showing how to reproduce the issue?


> Allocate 2x the space for the mbuf ring, so that the alloc function
> has a contiguous array to write to, then copy the excess entries
> to the start of the array.
> 

Even if the issue is valid, I am not sure about the solution of doubling
the buffer memory, but let's confirm the issue first before discussing
the solution.

> Fixes: 43b815d88188 ("net/memif: support zero-copy slave")
> Cc: stable@dpdk.org
> Signed-off-by: Mihai Brodschi <mihai.brodschi@broadcom.com>
> ---
> v2:
>  - fix email formatting
> 
> ---
>  drivers/net/memif/rte_eth_memif.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
> index 16da22b5c6..3491c53cf1 100644
> --- a/drivers/net/memif/rte_eth_memif.c
> +++ b/drivers/net/memif/rte_eth_memif.c
> @@ -600,6 +600,10 @@ eth_memif_rx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>  	ret = rte_pktmbuf_alloc_bulk(mq->mempool, &mq->buffers[head & mask], n_slots);
>  	if (unlikely(ret < 0))
>  		goto no_free_mbufs;
> +	if (unlikely(n_slots > ring_size - (head & mask))) {
> +		rte_memcpy(mq->buffers, &mq->buffers[ring_size],
> +			(n_slots + (head & mask) - ring_size) * sizeof(struct rte_mbuf *));
> +	}
>  
>  	while (n_slots--) {
>  		s0 = head++ & mask;
> @@ -1245,8 +1249,12 @@ memif_init_queues(struct rte_eth_dev *dev)
>  		}
>  		mq->buffers = NULL;
>  		if (pmd->flags & ETH_MEMIF_FLAG_ZERO_COPY) {
> +			/*
> +			 * Allocate 2x ring_size to reserve a contiguous array for
> +			 * rte_pktmbuf_alloc_bulk (to store allocated mbufs).
> +			 */
>  			mq->buffers = rte_zmalloc("bufs", sizeof(struct rte_mbuf *) *
> -						  (1 << mq->log2_ring_size), 0);
> +						  (1 << (mq->log2_ring_size + 1)), 0);
>  			if (mq->buffers == NULL)
>  				return -ENOMEM;
>  		}


* Re: [PATCH v2] net/memif: fix buffer overflow in zero copy Rx
  2024-07-07  2:12 ` Ferruh Yigit
@ 2024-07-07  5:50   ` Mihai Brodschi
  2024-07-07 14:05     ` Ferruh Yigit
  0 siblings, 1 reply; 14+ messages in thread
From: Mihai Brodschi @ 2024-07-07  5:50 UTC (permalink / raw)
  To: Ferruh Yigit, Jakub Grajciar; +Cc: dev, stable, Mihai Brodschi

Hi Ferruh,

On 07/07/2024 05:12, Ferruh Yigit wrote:
> On 6/28/2024 10:01 PM, Mihai Brodschi wrote:
>> rte_pktmbuf_alloc_bulk is called by the zero-copy receiver to allocate
>> new mbufs to be provided to the sender. The allocated mbuf pointers
>> are stored in a ring, but the alloc function doesn't implement index
>> wrap-around, so it writes past the end of the array. This results in
>> memory corruption and duplicate mbufs being received.
>>
>
> Hi Mihai,
>
> I am not sure writing past the ring actually occurs.
>
> As far as I can see is to keep the ring full as much as possible, when
> initially 'head' and 'tail' are 0, it fills all ring.
> Later tails moves and emptied space filled again. So head (in modulo) is
> always just behind tail after refill. In next run, refill will only fill
> the part tail moved, and this is calculated by 'n_slots'. As this is
> only the size of the gap, starting from 'head' (with modulo) shouldn't
> pass the ring length.
>
> Do you observe this issue practically? If so can you please provide your
> backtrace and numbers that is showing how to reproduce the issue?

The alloc function writes starting from the ring's head, but the ring's
head can be located at the end of the ring's memory buffer (ring_size - 1).
The correct behavior would be to wrap around to the start of the buffer (0),
but the alloc function has no awareness of the fact that it's writing to a
ring, so it writes to ring_size, ring_size + 1, etc.

Let's look at the existing code:
We assume the ring size is 256 and we just received 32 packets.
The previous tail was at index 255, now it's at index 31.
The head is initially at index 255.

head = __atomic_load_n(&ring->head, __ATOMIC_RELAXED);	// head = 255
n_slots = ring_size - head + mq->last_tail;		// n_slots = 32

if (n_slots < 32)					// not taken
	goto no_free_mbufs;

ret = rte_pktmbuf_alloc_bulk(mq->mempool, &mq->buffers[head & mask], n_slots);
// This will write 32 mbuf pointers starting at index (head & mask) = 255.
// The ring size is 256, so apart from the first one all pointers will be
// written out of bounds (index 256 .. 286, when it should be 0 .. 30).
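
To make the arithmetic easy to check, here is a minimal standalone sketch
(illustrative only, not driver code; the constants simply mirror the
example above):

#include <stdio.h>

int main(void)
{
    const unsigned ring_size = 256, mask = ring_size - 1;
    unsigned head = 255;      /* ring head from the example above */
    unsigned last_tail = 31;  /* tail after receiving 32 packets */
    unsigned n_slots = ring_size - head + last_tail;  /* 32 */

    /* rte_pktmbuf_alloc_bulk() writes n_slots consecutive pointers
     * starting at index (head & mask) and never wraps around. */
    printf("writes slots %u .. %u, but the array has %u entries\n",
           head & mask, (head & mask) + n_slots - 1, ring_size);
    return 0;
}

This prints "writes slots 255 .. 286, but the array has 256 entries",
which is exactly the out-of-bounds write described above.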

I can reproduce a crash 100% of the time with my application, but the output
is not very helpful, since it crashes elsewhere because of mempool corruption.
Applying this patch fixes the crashes completely.

>> Allocate 2x the space for the mbuf ring, so that the alloc function
>> has a contiguous array to write to, then copy the excess entries
>> to the start of the array.
>>
>
> Even issue is valid, I am not sure about solution to double to buffer
> memory, but lets confirm the issue first before discussing the solution.

Initially, I thought about splitting the call to rte_pktmbuf_alloc_bulk in two,
but I thought that might be bad for performance if the mempool is being used
concurrently from multiple threads.

If we want to use only one call to rte_pktmbuf_alloc_bulk, we need an array
to store the allocated mbuf pointers. This array must be of length ring_size,
since that's the maximum amount of mbufs which may be allocated in one go.
We need to copy the pointers from this array to the ring.

If we instead allocate twice the space for the ring, we can skip copying
the pointers which were written to the ring, and only copy those that were
written outside of its bounds.
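
For reference, a rough sketch of the split-allocation alternative (my own
illustration only; it reuses the local variables of eth_memif_rx_zc such
as ring_size, head, mask, n_slots and ret, and omits the cleanup that
would be needed if the second allocation fails):

    uint16_t slots_to_end = ring_size - (head & mask);

    if (n_slots <= slots_to_end) {
        /* No wrap: one contiguous allocation is enough. */
        ret = rte_pktmbuf_alloc_bulk(mq->mempool,
                &mq->buffers[head & mask], n_slots);
    } else {
        /* Fill up to the end of the array, then wrap to index 0. */
        ret = rte_pktmbuf_alloc_bulk(mq->mempool,
                &mq->buffers[head & mask], slots_to_end);
        if (ret == 0)
            ret = rte_pktmbuf_alloc_bulk(mq->mempool, mq->buffers,
                    n_slots - slots_to_end);
    }
    if (unlikely(ret < 0))
        goto no_free_mbufs;

The trade-off is a possible second mempool access per refill instead of
the extra copy, which is what the measurements later in this thread compare.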

>> Fixes: 43b815d88188 ("net/memif: support zero-copy slave")
>> Cc: stable@dpdk.org
>> Signed-off-by: Mihai Brodschi <mihai.brodschi@broadcom.com>
>> ---
>> v2:
>>  - fix email formatting
>>
>> ---
>>  drivers/net/memif/rte_eth_memif.c | 10 +++++++++-
>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
>> index 16da22b5c6..3491c53cf1 100644
>> --- a/drivers/net/memif/rte_eth_memif.c
>> +++ b/drivers/net/memif/rte_eth_memif.c
>> @@ -600,6 +600,10 @@ eth_memif_rx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>>  	ret = rte_pktmbuf_alloc_bulk(mq->mempool, &mq->buffers[head & mask], n_slots);
>>  	if (unlikely(ret < 0))
>>  		goto no_free_mbufs;
>> +	if (unlikely(n_slots > ring_size - (head & mask))) {
>> +		rte_memcpy(mq->buffers, &mq->buffers[ring_size],
>> +			(n_slots + (head & mask) - ring_size) * sizeof(struct rte_mbuf *));
>> +	}
>>  
>>  	while (n_slots--) {
>>  		s0 = head++ & mask;
>> @@ -1245,8 +1249,12 @@ memif_init_queues(struct rte_eth_dev *dev)
>>  		}
>>  		mq->buffers = NULL;
>>  		if (pmd->flags & ETH_MEMIF_FLAG_ZERO_COPY) {
>> +			/*
>> +			 * Allocate 2x ring_size to reserve a contiguous array for
>> +			 * rte_pktmbuf_alloc_bulk (to store allocated mbufs).
>> +			 */
>>  			mq->buffers = rte_zmalloc("bufs", sizeof(struct rte_mbuf *) *
>> -						  (1 << mq->log2_ring_size), 0);
>> +						  (1 << (mq->log2_ring_size + 1)), 0);
>>  			if (mq->buffers == NULL)
>>  				return -ENOMEM;
>>  		}
>

Apologies for sending this multiple times, I'm not familiar with mailing lists.



* Re: [PATCH v2] net/memif: fix buffer overflow in zero copy Rx
  2024-07-07  5:50   ` Mihai Brodschi
@ 2024-07-07 14:05     ` Ferruh Yigit
  2024-07-07 15:18       ` Mihai Brodschi
  0 siblings, 1 reply; 14+ messages in thread
From: Ferruh Yigit @ 2024-07-07 14:05 UTC (permalink / raw)
  To: Mihai Brodschi, Jakub Grajciar; +Cc: dev, stable

On 7/7/2024 6:50 AM, Mihai Brodschi wrote:
> Hi Ferruh,
> 
> On 07/07/2024 05:12, Ferruh Yigit wrote:
>> On 6/28/2024 10:01 PM, Mihai Brodschi wrote:
>>> rte_pktmbuf_alloc_bulk is called by the zero-copy receiver to allocate
>>> new mbufs to be provided to the sender. The allocated mbuf pointers
>>> are stored in a ring, but the alloc function doesn't implement index
>>> wrap-around, so it writes past the end of the array. This results in
>>> memory corruption and duplicate mbufs being received.
>>>
>>
>> Hi Mihai,
>>
>> I am not sure writing past the ring actually occurs.
>>
>> As far as I can see is to keep the ring full as much as possible, when
>> initially 'head' and 'tail' are 0, it fills all ring.
>> Later tails moves and emptied space filled again. So head (in modulo) is
>> always just behind tail after refill. In next run, refill will only fill
>> the part tail moved, and this is calculated by 'n_slots'. As this is
>> only the size of the gap, starting from 'head' (with modulo) shouldn't
>> pass the ring length.
>>
>> Do you observe this issue practically? If so can you please provide your
>> backtrace and numbers that is showing how to reproduce the issue?
> 
> The alloc function writes starting from the ring's head, but the ring's
> head can be located at the end of the ring's memory buffer (ring_size - 1).
> The correct behavior would be to wrap around to the start of the buffer (0),
> but the alloc function has no awareness of the fact that it's writing to a
> ring, so it writes to ring_size, ring_size + 1, etc.
> 
> Let's look at the existing code:
> We assume the ring size is 256 and we just received 32 packets.
> The previous tail was at index 255, now it's at index 31.
> The head is initially at index 255.
> 
> head = __atomic_load_n(&ring->head, __ATOMIC_RELAXED);	// head = 255
> n_slots = ring_size - head + mq->last_tail;		// n_slots = 32
> 
> if (n_slots < 32)					// not taken
> 	goto no_free_mbufs;
> 
> ret = rte_pktmbuf_alloc_bulk(mq->mempool, &mq->buffers[head & mask], n_slots);
> // This will write 32 mbuf pointers starting at index (head & mask) = 255.
> // The ring size is 256, so apart from the first one all pointers will be
> // written out of bounds (index 256 .. 286, when it should be 0 .. 30).
> 

My expectation is numbers should be like following:

Initially:
 size = 256
 head = 0
 tail = 0

In first refill:
 n_slots = 256
 head = 256
 tail = 0

Subsequent run that 32 slots used:
 head = 256
 tail = 32
 n_slots = 32
 rte_pktmbuf_alloc_bulk(mq, buf[head & mask], n_slots);
  head & mask = 0
  // So it fills first 32 elements of buffer, which is inbound

This will continue as above, combination of only gap filled and head
masked with 'mask' provides the wrapping required.


> I can reproduce a crash 100% of the time with my application, but the output
> is not very helpful, since it crashes elsewhere because of mempool corruption.
> Applying this patch fixes the crashes completely.
> 

An always-reproducible crash would mean the existing memif zero-copy Rx
is broken and nobody can use it, but I am skeptical that this is the
case; perhaps something special in your use case is triggering this issue.

@Jakub, can you please confirm that memif Rx zero copy is tested?

>>> Allocate 2x the space for the mbuf ring, so that the alloc function
>>> has a contiguous array to write to, then copy the excess entries
>>> to the start of the array.
>>>
>>
>> Even issue is valid, I am not sure about solution to double to buffer
>> memory, but lets confirm the issue first before discussing the solution.
> 
> Initially, I thought about splitting the call to rte_pktmbuf_alloc_bulk in two,
> but I thought that might be bad for performance if the mempool is being used
> concurrently from multiple threads.
> 
> If we want to use only one call to rte_pktmbuf_alloc_bulk, we need an array
> to store the allocated mbuf pointers. This array must be of length ring_size,
> since that's the maximum amount of mbufs which may be allocated in one go.
> We need to copy the pointers from this array to the ring.
> 
> If we instead allocate twice the space for the ring, we can skip copying
> the pointers which were written to the ring, and only copy those that were
> written outside of its bounds.
> 

The first thing that came to my mind was also using two
'rte_pktmbuf_alloc_bulk()' calls.
I can see why you prefer doubling the buffer size, but it comes with
copying overhead.
So both options come with some overhead and I am not sure which one is
better; although I am leaning towards the first solution, we should do
some measurements to decide.

BUT let's agree on the problem first, before doing more work; I am
still not fully convinced that the original code is wrong.

>>> Fixes: 43b815d88188 ("net/memif: support zero-copy slave")
>>> Cc: stable@dpdk.org
>>> Signed-off-by: Mihai Brodschi <mihai.brodschi@broadcom.com>
>>> ---
>>> v2:
>>>  - fix email formatting
>>>
>>> ---
>>>  drivers/net/memif/rte_eth_memif.c | 10 +++++++++-
>>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
>>> index 16da22b5c6..3491c53cf1 100644
>>> --- a/drivers/net/memif/rte_eth_memif.c
>>> +++ b/drivers/net/memif/rte_eth_memif.c
>>> @@ -600,6 +600,10 @@ eth_memif_rx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>>>  	ret = rte_pktmbuf_alloc_bulk(mq->mempool, &mq->buffers[head & mask], n_slots);
>>>  	if (unlikely(ret < 0))
>>>  		goto no_free_mbufs;
>>> +	if (unlikely(n_slots > ring_size - (head & mask))) {
>>> +		rte_memcpy(mq->buffers, &mq->buffers[ring_size],
>>> +			(n_slots + (head & mask) - ring_size) * sizeof(struct rte_mbuf *));
>>> +	}
>>>  
>>>  	while (n_slots--) {
>>>  		s0 = head++ & mask;
>>> @@ -1245,8 +1249,12 @@ memif_init_queues(struct rte_eth_dev *dev)
>>>  		}
>>>  		mq->buffers = NULL;
>>>  		if (pmd->flags & ETH_MEMIF_FLAG_ZERO_COPY) {
>>> +			/*
>>> +			 * Allocate 2x ring_size to reserve a contiguous array for
>>> +			 * rte_pktmbuf_alloc_bulk (to store allocated mbufs).
>>> +			 */
>>>  			mq->buffers = rte_zmalloc("bufs", sizeof(struct rte_mbuf *) *
>>> -						  (1 << mq->log2_ring_size), 0);
>>> +						  (1 << (mq->log2_ring_size + 1)), 0);
>>>  			if (mq->buffers == NULL)
>>>  				return -ENOMEM;
>>>  		}
>>
> 
> Apologies for sending this multiple times, I'm not familiar with mailing lists.
> 
> 


* Re: [PATCH v2] net/memif: fix buffer overflow in zero copy Rx
  2024-07-07 14:05     ` Ferruh Yigit
@ 2024-07-07 15:18       ` Mihai Brodschi
  2024-07-07 18:46         ` Mihai Brodschi
  0 siblings, 1 reply; 14+ messages in thread
From: Mihai Brodschi @ 2024-07-07 15:18 UTC (permalink / raw)
  To: Ferruh Yigit, Jakub Grajciar; +Cc: dev, stable, Mihai Brodschi

On 07/07/2024 17:05, Ferruh Yigit wrote:
> On 7/7/2024 6:50 AM, Mihai Brodschi wrote:
>> Hi Ferruh,
>>
>> On 07/07/2024 05:12, Ferruh Yigit wrote:
>>> On 6/28/2024 10:01 PM, Mihai Brodschi wrote:
>>>> rte_pktmbuf_alloc_bulk is called by the zero-copy receiver to allocate
>>>> new mbufs to be provided to the sender. The allocated mbuf pointers
>>>> are stored in a ring, but the alloc function doesn't implement index
>>>> wrap-around, so it writes past the end of the array. This results in
>>>> memory corruption and duplicate mbufs being received.
>>>>
>>>
>>> Hi Mihai,
>>>
>>> I am not sure writing past the ring actually occurs.
>>>
>>> As far as I can see is to keep the ring full as much as possible, when
>>> initially 'head' and 'tail' are 0, it fills all ring.
>>> Later tails moves and emptied space filled again. So head (in modulo) is
>>> always just behind tail after refill. In next run, refill will only fill
>>> the part tail moved, and this is calculated by 'n_slots'. As this is
>>> only the size of the gap, starting from 'head' (with modulo) shouldn't
>>> pass the ring length.
>>>
>>> Do you observe this issue practically? If so can you please provide your
>>> backtrace and numbers that is showing how to reproduce the issue?
>>
>> The alloc function writes starting from the ring's head, but the ring's
>> head can be located at the end of the ring's memory buffer (ring_size - 1).
>> The correct behavior would be to wrap around to the start of the buffer (0),
>> but the alloc function has no awareness of the fact that it's writing to a
>> ring, so it writes to ring_size, ring_size + 1, etc.
>>
>> Let's look at the existing code:
>> We assume the ring size is 256 and we just received 32 packets.
>> The previous tail was at index 255, now it's at index 31.
>> The head is initially at index 255.
>>
>> head = __atomic_load_n(&ring->head, __ATOMIC_RELAXED);	// head = 255
>> n_slots = ring_size - head + mq->last_tail;		// n_slots = 32
>>
>> if (n_slots < 32)					// not taken
>> 	goto no_free_mbufs;
>>
>> ret = rte_pktmbuf_alloc_bulk(mq->mempool, &mq->buffers[head & mask], n_slots);
>> // This will write 32 mbuf pointers starting at index (head & mask) = 255.
>> // The ring size is 256, so apart from the first one all pointers will be
>> // written out of bounds (index 256 .. 286, when it should be 0 .. 30).
>>
> 
> My expectation is numbers should be like following:
> 
> Initially:
>  size = 256
>  head = 0
>  tail = 0
> 
> In first refill:
>  n_slots = 256
>  head = 256
>  tail = 0
> 
> Subsequent run that 32 slots used:
>  head = 256
>  tail = 32
>  n_slots = 32
>  rte_pktmbuf_alloc_bulk(mq, buf[head & mask], n_slots);
>   head & mask = 0
>   // So it fills first 32 elements of buffer, which is inbound
> 
> This will continue as above, combination of only gap filled and head
> masked with 'mask' provides the wrapping required.

If I understand correctly, this works only if eth_memif_rx_zc always processes
a number of packets which is a power of 2, so that the ring's head always wraps
around at the end of a refill loop, never in the middle of it.
Is there any reason this should be the case?
Maybe the tests don't trigger the crash because this condition holds true for them?

>> I can reproduce a crash 100% of the time with my application, but the output
>> is not very helpful, since it crashes elsewhere because of mempool corruption.
>> Applying this patch fixes the crashes completely.
>>
> 
> This causing always reproducible crash means existing memif zero copy Rx
> is broken and nobody can use it, but I am suspicions that this is the
> case, perhaps something special in your usecase triggering this issue.
> 
> @Jakup, can you please confirm that memif Rx zero copy is tested?
> 
>>>> Allocate 2x the space for the mbuf ring, so that the alloc function
>>>> has a contiguous array to write to, then copy the excess entries
>>>> to the start of the array.
>>>>
>>>
>>> Even issue is valid, I am not sure about solution to double to buffer
>>> memory, but lets confirm the issue first before discussing the solution.
>>
>> Initially, I thought about splitting the call to rte_pktmbuf_alloc_bulk in two,
>> but I thought that might be bad for performance if the mempool is being used
>> concurrently from multiple threads.
>>
>> If we want to use only one call to rte_pktmbuf_alloc_bulk, we need an array
>> to store the allocated mbuf pointers. This array must be of length ring_size,
>> since that's the maximum amount of mbufs which may be allocated in one go.
>> We need to copy the pointers from this array to the ring.
>>
>> If we instead allocate twice the space for the ring, we can skip copying
>> the pointers which were written to the ring, and only copy those that were
>> written outside of its bounds.
>>
> 
> First thing comes my mind was also using two 'rte_pktmbuf_alloc_bulk()'
> calls.
> I can see why you prefer doubling the buffer size, but it comes with
> copying overhead.
> So both options comes with some overhead, not sure which one is better,
> although I am leaning to the first solution we should do some
> measurements to decide.
> 
> BUT first lets agree on the problem first, before doing more work, I am
> not still fully convinced that original code is wrong.
> 
>>>> Fixes: 43b815d88188 ("net/memif: support zero-copy slave")
>>>> Cc: stable@dpdk.org
>>>> Signed-off-by: Mihai Brodschi <mihai.brodschi@broadcom.com>
>>>> ---
>>>> v2:
>>>>  - fix email formatting
>>>>
>>>> ---
>>>>  drivers/net/memif/rte_eth_memif.c | 10 +++++++++-
>>>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
>>>> index 16da22b5c6..3491c53cf1 100644
>>>> --- a/drivers/net/memif/rte_eth_memif.c
>>>> +++ b/drivers/net/memif/rte_eth_memif.c
>>>> @@ -600,6 +600,10 @@ eth_memif_rx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>>>>  	ret = rte_pktmbuf_alloc_bulk(mq->mempool, &mq->buffers[head & mask], n_slots);
>>>>  	if (unlikely(ret < 0))
>>>>  		goto no_free_mbufs;
>>>> +	if (unlikely(n_slots > ring_size - (head & mask))) {
>>>> +		rte_memcpy(mq->buffers, &mq->buffers[ring_size],
>>>> +			(n_slots + (head & mask) - ring_size) * sizeof(struct rte_mbuf *));
>>>> +	}
>>>>  
>>>>  	while (n_slots--) {
>>>>  		s0 = head++ & mask;
>>>> @@ -1245,8 +1249,12 @@ memif_init_queues(struct rte_eth_dev *dev)
>>>>  		}
>>>>  		mq->buffers = NULL;
>>>>  		if (pmd->flags & ETH_MEMIF_FLAG_ZERO_COPY) {
>>>> +			/*
>>>> +			 * Allocate 2x ring_size to reserve a contiguous array for
>>>> +			 * rte_pktmbuf_alloc_bulk (to store allocated mbufs).
>>>> +			 */
>>>>  			mq->buffers = rte_zmalloc("bufs", sizeof(struct rte_mbuf *) *
>>>> -						  (1 << mq->log2_ring_size), 0);
>>>> +						  (1 << (mq->log2_ring_size + 1)), 0);
>>>>  			if (mq->buffers == NULL)
>>>>  				return -ENOMEM;
>>>>  		}
>>>
>>
>> Apologies for sending this multiple times, I'm not familiar with mailing lists.
>>
>>
> 


* Re: [PATCH v2] net/memif: fix buffer overflow in zero copy Rx
  2024-07-07 15:18       ` Mihai Brodschi
@ 2024-07-07 18:46         ` Mihai Brodschi
  2024-07-08  3:39           ` Mihai Brodschi
  0 siblings, 1 reply; 14+ messages in thread
From: Mihai Brodschi @ 2024-07-07 18:46 UTC (permalink / raw)
  To: Ferruh Yigit, Jakub Grajciar; +Cc: dev, stable, Mihai Brodschi



On 07/07/2024 18:18, Mihai Brodschi wrote:
> 
> 
> On 07/07/2024 17:05, Ferruh Yigit wrote:
>>
>> My expectation is numbers should be like following:
>>
>> Initially:
>>  size = 256
>>  head = 0
>>  tail = 0
>>
>> In first refill:
>>  n_slots = 256
>>  head = 256
>>  tail = 0
>>
>> Subsequent run that 32 slots used:
>>  head = 256
>>  tail = 32
>>  n_slots = 32
>>  rte_pktmbuf_alloc_bulk(mq, buf[head & mask], n_slots);
>>   head & mask = 0
>>   // So it fills first 32 elements of buffer, which is inbound
>>
>> This will continue as above, combination of only gap filled and head
>> masked with 'mask' provides the wrapping required.
> 
> If I understand correctly, this works only if eth_memif_rx_zc always processes
> a number of packets which is a power of 2, so that the ring's head always wraps
> around at the end of a refill loop, never in the middle of it.
> Is there any reason this should be the case?
> Maybe the tests don't trigger the crash because this condition holds true for them?

Here's how to reproduce the crash on DPDK stable 23.11.1, using testpmd:

Server:
# ./dpdk-testpmd --vdev=net_memif0,id=1,role=server,bsize=1024,rsize=8 --single-file-segments -l2,3 --file-prefix test1 -- -i

Client:
# ./dpdk-testpmd --vdev=net_memif0,id=1,role=client,bsize=1024,rsize=8,zero-copy=yes --single-file-segments -l4,5 --file-prefix test2 -- -i
testpmd> start

Server:
testpmd> start tx_first
testpmd> set burst 15

At this point, the client crashes with a segmentation fault.
Before the burst is set to 15, its default value is 32.
If the receiver processes packets in bursts of size 2^N, the crash does not occur.
Setting the burst size to any power of 2 works, anything else crashes.
After applying this patch, the crashes are completely gone.
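
For a concrete picture of why burst=15 overflows (my own illustrative
trace, ring_size = 256): after the initial refill, head & mask advances
by 15 on every refill (0, 15, 30, ..., 240, 255). On the refill that
starts at head & mask = 255, n_slots is again 15, so the bulk allocation
writes indices 255..269, and everything past index 255 lands outside the
array. With a power-of-2 burst, 256 is a multiple of the burst size, so
head & mask plus n_slots never exceeds 256 and no refill straddles the
end of the array.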


* Re: [PATCH v2] net/memif: fix buffer overflow in zero copy Rx
  2024-07-07 18:46         ` Mihai Brodschi
@ 2024-07-08  3:39           ` Mihai Brodschi
  2024-07-08 11:45             ` Ferruh Yigit
  0 siblings, 1 reply; 14+ messages in thread
From: Mihai Brodschi @ 2024-07-08  3:39 UTC (permalink / raw)
  To: Ferruh Yigit, Jakub Grajciar; +Cc: dev, stable, Mihai Brodschi

On 07/07/2024 21:46, Mihai Brodschi wrote:
> 
> 
> On 07/07/2024 18:18, Mihai Brodschi wrote:
>>
>>
>> On 07/07/2024 17:05, Ferruh Yigit wrote:
>>>
>>> My expectation is numbers should be like following:
>>>
>>> Initially:
>>>  size = 256
>>>  head = 0
>>>  tail = 0
>>>
>>> In first refill:
>>>  n_slots = 256
>>>  head = 256
>>>  tail = 0
>>>
>>> Subsequent run that 32 slots used:
>>>  head = 256
>>>  tail = 32
>>>  n_slots = 32
>>>  rte_pktmbuf_alloc_bulk(mq, buf[head & mask], n_slots);
>>>   head & mask = 0
>>>   // So it fills first 32 elements of buffer, which is inbound
>>>
>>> This will continue as above, combination of only gap filled and head
>>> masked with 'mask' provides the wrapping required.
>>
>> If I understand correctly, this works only if eth_memif_rx_zc always processes
>> a number of packets which is a power of 2, so that the ring's head always wraps
>> around at the end of a refill loop, never in the middle of it.
>> Is there any reason this should be the case?
>> Maybe the tests don't trigger the crash because this condition holds true for them?
> 
> Here's how to reproduce the crash on DPDK stable 23.11.1, using testpmd:
> 
> Server:
> # ./dpdk-testpmd --vdev=net_memif0,id=1,role=server,bsize=1024,rsize=8 --single-file-segments -l2,3 --file-prefix test1 -- -i
> 
> Client:
> # ./dpdk-testpmd --vdev=net_memif0,id=1,role=client,bsize=1024,rsize=8,zero-copy=yes --single-file-segments -l4,5 --file-prefix test2 -- -i
> testpmd> start
> 
> Server:
> testpmd> start tx_first
> testpmt> set burst 15
> 
> At this point, the client crashes with a segmentation fault.
> Before the burst is set to 15, its default value is 32.
> If the receiver processes packets in bursts of size 2^N, the crash does not occur.
> Setting the burst size to any power of 2 works, anything else crashes.
> After applying this patch, the crashes are completely gone.

Sorry, this might not crash with a segmentation fault. To confirm the mempool is
corrupted, please compile DPDK with debug=true and the c_args -DRTE_LIBRTE_MEMPOOL_DEBUG.
You should see the client panic when changing the burst size to not be a power of 2.
This also works on the latest main branch.


* Re: [PATCH v2] net/memif: fix buffer overflow in zero copy Rx
  2024-07-08  3:39           ` Mihai Brodschi
@ 2024-07-08 11:45             ` Ferruh Yigit
  2024-07-19  9:03               ` Ferruh Yigit
  0 siblings, 1 reply; 14+ messages in thread
From: Ferruh Yigit @ 2024-07-08 11:45 UTC (permalink / raw)
  To: Mihai Brodschi, Jakub Grajciar; +Cc: dev, stable

On 7/8/2024 4:39 AM, Mihai Brodschi wrote:
> 
> 
> On 07/07/2024 21:46, Mihai Brodschi wrote:
>>
>>
>> On 07/07/2024 18:18, Mihai Brodschi wrote:
>>>
>>>
>>> On 07/07/2024 17:05, Ferruh Yigit wrote:
>>>>
>>>> My expectation is numbers should be like following:
>>>>
>>>> Initially:
>>>>  size = 256
>>>>  head = 0
>>>>  tail = 0
>>>>
>>>> In first refill:
>>>>  n_slots = 256
>>>>  head = 256
>>>>  tail = 0
>>>>
>>>> Subsequent run that 32 slots used:
>>>>  head = 256
>>>>  tail = 32
>>>>  n_slots = 32
>>>>  rte_pktmbuf_alloc_bulk(mq, buf[head & mask], n_slots);
>>>>   head & mask = 0
>>>>   // So it fills first 32 elements of buffer, which is inbound
>>>>
>>>> This will continue as above, combination of only gap filled and head
>>>> masked with 'mask' provides the wrapping required.
>>>
>>> If I understand correctly, this works only if eth_memif_rx_zc always processes
>>> a number of packets which is a power of 2, so that the ring's head always wraps
>>> around at the end of a refill loop, never in the middle of it.
>>> Is there any reason this should be the case?
>>> Maybe the tests don't trigger the crash because this condition holds true for them?
>>
>> Here's how to reproduce the crash on DPDK stable 23.11.1, using testpmd:
>>
>> Server:
>> # ./dpdk-testpmd --vdev=net_memif0,id=1,role=server,bsize=1024,rsize=8 --single-file-segments -l2,3 --file-prefix test1 -- -i
>>
>> Client:
>> # ./dpdk-testpmd --vdev=net_memif0,id=1,role=client,bsize=1024,rsize=8,zero-copy=yes --single-file-segments -l4,5 --file-prefix test2 -- -i
>> testpmd> start
>>
>> Server:
>> testpmd> start tx_first
>> testpmt> set burst 15
>>
>> At this point, the client crashes with a segmentation fault.
>> Before the burst is set to 15, its default value is 32.
>> If the receiver processes packets in bursts of size 2^N, the crash does not occur.
>> Setting the burst size to any power of 2 works, anything else crashes.
>> After applying this patch, the crashes are completely gone.
> 
> Sorry, this might not crash with a segmentation fault. To confirm the mempool is
> corrupted, please compile DPDK with debug=true and the c_args -DRTE_LIBRTE_MEMPOOL_DEBUG.
> You should see the client panic when changing the burst size to not be a power of 2.
> This also works on the latest main branch.
> 

Hi Mihai,

Right, if the ring size is not a multiple of the burst size, the issue is
valid. And as there is a requirement for the ring size to be a power of
two, the burst size should be a power of two as well.
I assume this issue was not caught before because the default burst size
is 32.

Can you please share the performance impact of the change, for the two
possible solutions we discussed above?

The other option is to document this as a limitation of memif zero copy,
but that won't be good for usability.

We can decide based on performance numbers.

Thanks,
ferruh


* Re: [PATCH v2] net/memif: fix buffer overflow in zero copy Rx
  2024-07-08 11:45             ` Ferruh Yigit
@ 2024-07-19  9:03               ` Ferruh Yigit
  2024-08-31 13:38                 ` Mihai Brodschi
  0 siblings, 1 reply; 14+ messages in thread
From: Ferruh Yigit @ 2024-07-19  9:03 UTC (permalink / raw)
  To: Mihai Brodschi, Jakub Grajciar; +Cc: dev, stable

On 7/8/2024 12:45 PM, Ferruh Yigit wrote:
> On 7/8/2024 4:39 AM, Mihai Brodschi wrote:
>>
>>
>> On 07/07/2024 21:46, Mihai Brodschi wrote:
>>>
>>>
>>> On 07/07/2024 18:18, Mihai Brodschi wrote:
>>>>
>>>>
>>>> On 07/07/2024 17:05, Ferruh Yigit wrote:
>>>>>
>>>>> My expectation is numbers should be like following:
>>>>>
>>>>> Initially:
>>>>>  size = 256
>>>>>  head = 0
>>>>>  tail = 0
>>>>>
>>>>> In first refill:
>>>>>  n_slots = 256
>>>>>  head = 256
>>>>>  tail = 0
>>>>>
>>>>> Subsequent run that 32 slots used:
>>>>>  head = 256
>>>>>  tail = 32
>>>>>  n_slots = 32
>>>>>  rte_pktmbuf_alloc_bulk(mq, buf[head & mask], n_slots);
>>>>>   head & mask = 0
>>>>>   // So it fills first 32 elements of buffer, which is inbound
>>>>>
>>>>> This will continue as above, combination of only gap filled and head
>>>>> masked with 'mask' provides the wrapping required.
>>>>
>>>> If I understand correctly, this works only if eth_memif_rx_zc always processes
>>>> a number of packets which is a power of 2, so that the ring's head always wraps
>>>> around at the end of a refill loop, never in the middle of it.
>>>> Is there any reason this should be the case?
>>>> Maybe the tests don't trigger the crash because this condition holds true for them?
>>>
>>> Here's how to reproduce the crash on DPDK stable 23.11.1, using testpmd:
>>>
>>> Server:
>>> # ./dpdk-testpmd --vdev=net_memif0,id=1,role=server,bsize=1024,rsize=8 --single-file-segments -l2,3 --file-prefix test1 -- -i
>>>
>>> Client:
>>> # ./dpdk-testpmd --vdev=net_memif0,id=1,role=client,bsize=1024,rsize=8,zero-copy=yes --single-file-segments -l4,5 --file-prefix test2 -- -i
>>> testpmd> start
>>>
>>> Server:
>>> testpmd> start tx_first
>>> testpmt> set burst 15
>>>
>>> At this point, the client crashes with a segmentation fault.
>>> Before the burst is set to 15, its default value is 32.
>>> If the receiver processes packets in bursts of size 2^N, the crash does not occur.
>>> Setting the burst size to any power of 2 works, anything else crashes.
>>> After applying this patch, the crashes are completely gone.
>>
>> Sorry, this might not crash with a segmentation fault. To confirm the mempool is
>> corrupted, please compile DPDK with debug=true and the c_args -DRTE_LIBRTE_MEMPOOL_DEBUG.
>> You should see the client panic when changing the burst size to not be a power of 2.
>> This also works on the latest main branch.
>>
> 
> Hi Mihai,
> 
> Right, if the buffer size is not multiple of burst size, issue is valid.
> And as there is a requirement to have buffer size power of two, burst
> should have the same.
> I assume this issue is not caught before because default burst size is 32.
> 
> Can you please share some performance impact of the change, with two
> possible solutions we discussed above?
> 
> Other option is to add this as a limitation to the memif zero copy, but
> this won't be good for usability.
> 
> We can decide based on performance numbers.
> 
> 

Hi Jakub,

Do you have any comment on this?

I think we should either document this as a limitation of the driver or
fix it, and if we fix it, we need to decide on the approach.


* Re: [PATCH v2] net/memif: fix buffer overflow in zero copy Rx
  2024-07-19  9:03               ` Ferruh Yigit
@ 2024-08-31 13:38                 ` Mihai Brodschi
  2024-10-10  2:00                   ` Ferruh Yigit
  0 siblings, 1 reply; 14+ messages in thread
From: Mihai Brodschi @ 2024-08-31 13:38 UTC (permalink / raw)
  To: Ferruh Yigit, Jakub Grajciar; +Cc: dev, stable, Mihai Brodschi

Hi Ferruh,

Apologies for the late response.
I've run some performance tests for the two proposed solutions.
In the tables below, the rte_memcpy results correspond to this patch.
The 2xpktmbuf_alloc results correspond to the other proposed solution.

bash commands:
server# ./dpdk-testpmd --vdev=net_memif0,id=1,role=server,bsize=1024,rsize=8 --single -l<SERVER_CORES> --file=test1 -- --nb-cores <NB_CORES> --txq <NB_CORES> --rxq <NB_CORES> --burst <BURST> -i
client# ./dpdk-testpmd --vdev=net_memif0,id=1,role=client,bsize=1024,rsize=8,zero-copy=yes --single -l<CLIENT_CORES> --file=test2 -- --nb-cores <NB_CORES> --txq <NB_CORES> --rxq <NB_CORES> --burst <BURST> -i

testpmd commands:
client:
testpmd> start
server:
testpmd> start tx_first


CPU: AMD EPYC 7713P
RAM: DDR4-3200
OS: Debian 12
DPDK: 22.11.1
SERVER_CORES=72,8,9,10,11
CLIENT_CORES=76,12,13,14,15

Results:
==================================================================
|                          | 1 CORE     | 2 CORES    | 4 CORES   |
==================================================================
| unpatched burst=32       | 9.95 Gbps  | 19.24 Gbps | 36.4 Gbps |
------------------------------------------------------------------
| 2xpktmbuf_alloc burst=32 | 9.86 Gbps  | 18.88 Gbps | 36.6 Gbps |
------------------------------------------------------------------
| 2xpktmbuf_alloc burst=31 | 9.17 Gbps  | 18.69 Gbps | 35.1 Gbps |
------------------------------------------------------------------
| rte_memcpy burst=32      | 9.54 Gbps  | 19.10 Gbps | 36.6 Gbps |
------------------------------------------------------------------
| rte_memcpy burst=31      | 9.39 Gbps  | 18.53 Gbps | 35.5 Gbps |
==================================================================


CPU: Intel Core i7-14700HX
RAM: DDR5-5600
OS:  Ubuntu 24.04.1
DPDK: 23.11.1
SERVER_CORES=0,1,3,5,7
CLIENT_CORES=8,9,11,13,15

Results:
==================================================================
|                          | 1 CORE     | 2 CORES    | 4 CORES   |
==================================================================
| unpatched burst=32       | 15.52 Gbps | 27.35 Gbps | 46.8 Gbps |
------------------------------------------------------------------
| 2xpktmbuf_alloc burst=32 | 15.49 Gbps | 27.68 Gbps | 46.4 Gbps |
------------------------------------------------------------------
| 2xpktmbuf_alloc burst=31 | 14.98 Gbps | 26.75 Gbps | 45.2 Gbps |
------------------------------------------------------------------
| rte_memcpy burst=32      | 15.99 Gbps | 28.44 Gbps | 49.3 Gbps |
------------------------------------------------------------------
| rte_memcpy burst=31      | 14.85 Gbps | 27.32 Gbps | 46.3 Gbps |
==================================================================


On 19/07/2024 12:03, Ferruh Yigit wrote:
> On 7/8/2024 12:45 PM, Ferruh Yigit wrote:
>> On 7/8/2024 4:39 AM, Mihai Brodschi wrote:
>>>
>>>
>>> On 07/07/2024 21:46, Mihai Brodschi wrote:
>>>>
>>>>
>>>> On 07/07/2024 18:18, Mihai Brodschi wrote:
>>>>>
>>>>>
>>>>> On 07/07/2024 17:05, Ferruh Yigit wrote:
>>>>>>
>>>>>> My expectation is numbers should be like following:
>>>>>>
>>>>>> Initially:
>>>>>>  size = 256
>>>>>>  head = 0
>>>>>>  tail = 0
>>>>>>
>>>>>> In first refill:
>>>>>>  n_slots = 256
>>>>>>  head = 256
>>>>>>  tail = 0
>>>>>>
>>>>>> Subsequent run that 32 slots used:
>>>>>>  head = 256
>>>>>>  tail = 32
>>>>>>  n_slots = 32
>>>>>>  rte_pktmbuf_alloc_bulk(mq, buf[head & mask], n_slots);
>>>>>>   head & mask = 0
>>>>>>   // So it fills first 32 elements of buffer, which is inbound
>>>>>>
>>>>>> This will continue as above, combination of only gap filled and head
>>>>>> masked with 'mask' provides the wrapping required.
>>>>>
>>>>> If I understand correctly, this works only if eth_memif_rx_zc always processes
>>>>> a number of packets which is a power of 2, so that the ring's head always wraps
>>>>> around at the end of a refill loop, never in the middle of it.
>>>>> Is there any reason this should be the case?
>>>>> Maybe the tests don't trigger the crash because this condition holds true for them?
>>>>
>>>> Here's how to reproduce the crash on DPDK stable 23.11.1, using testpmd:
>>>>
>>>> Server:
>>>> # ./dpdk-testpmd --vdev=net_memif0,id=1,role=server,bsize=1024,rsize=8 --single-file-segments -l2,3 --file-prefix test1 -- -i
>>>>
>>>> Client:
>>>> # ./dpdk-testpmd --vdev=net_memif0,id=1,role=client,bsize=1024,rsize=8,zero-copy=yes --single-file-segments -l4,5 --file-prefix test2 -- -i
>>>> testpmd> start
>>>>
>>>> Server:
>>>> testpmd> start tx_first
>>>> testpmt> set burst 15
>>>>
>>>> At this point, the client crashes with a segmentation fault.
>>>> Before the burst is set to 15, its default value is 32.
>>>> If the receiver processes packets in bursts of size 2^N, the crash does not occur.
>>>> Setting the burst size to any power of 2 works, anything else crashes.
>>>> After applying this patch, the crashes are completely gone.
>>>
>>> Sorry, this might not crash with a segmentation fault. To confirm the mempool is
>>> corrupted, please compile DPDK with debug=true and the c_args -DRTE_LIBRTE_MEMPOOL_DEBUG.
>>> You should see the client panic when changing the burst size to not be a power of 2.
>>> This also works on the latest main branch.
>>>
>>
>> Hi Mihai,
>>
>> Right, if the buffer size is not multiple of burst size, issue is valid.
>> And as there is a requirement to have buffer size power of two, burst
>> should have the same.
>> I assume this issue is not caught before because default burst size is 32.
>>
>> Can you please share some performance impact of the change, with two
>> possible solutions we discussed above?
>>
>> Other option is to add this as a limitation to the memif zero copy, but
>> this won't be good for usability.
>>
>> We can decide based on performance numbers.
>>
>>
> 
> Hi Jakup,
> 
> Do you have any comment on this?
> 
> I think we should either document this as limitation of the driver, or
> fix it, and if so need to decide the fix.
> 



* Re: [PATCH v2] net/memif: fix buffer overflow in zero copy Rx
  2024-08-31 13:38                 ` Mihai Brodschi
@ 2024-10-10  2:00                   ` Ferruh Yigit
  0 siblings, 0 replies; 14+ messages in thread
From: Ferruh Yigit @ 2024-10-10  2:00 UTC (permalink / raw)
  To: Mihai Brodschi, Jakub Grajciar; +Cc: dev, stable

On 8/31/2024 2:38 PM, Mihai Brodschi wrote:
> Hi Ferruh,
> 
> Apologies for the late response.
> I've run some performance tests for the two proposed solutions.
> In the tables below, the rte_memcpy results correspond to this patch.
> The 2xpktmbuf_alloc results correspond to the other proposed solution.
> 
> bash commands:
> server# ./dpdk-testpmd --vdev=net_memif0,id=1,role=server,bsize=1024,rsize=8 --single -l<SERVER_CORES> --file=test1 -- --nb-cores <NB_CORES> --txq <NB_CORES> --rxq <NB_CORES> --burst <BURST> -i
> client# ./dpdk-testpmd --vdev=net_memif0,id=1,role=client,bsize=1024,rsize=8,zero-copy=yes --single -l<CLIENT_CORES> --file=test2 -- --nb-cores <NB_CORES> --txq <NB_CORES> --rxq <NB_CORES> --burst <BURST> -i
> 
> testpmd commands:
> client:
> testpmd> start
> server:
> testpmd> start tx_first
> 
> 
> CPU: AMD EPYC 7713P
> RAM: DDR4-3200
> OS: Debian 12
> DPDK: 22.11.1
> SERVER_CORES=72,8,9,10,11
> CLIENT_CORES=76,12,13,14,15
> 
> Results:
> ==================================================================
> |                          | 1 CORE     | 2 CORES    | 4 CORES   |
> ==================================================================
> | unpatched burst=32       | 9.95 Gbps  | 19.24 Gbps | 36.4 Gbps |
> ------------------------------------------------------------------
> | 2xpktmbuf_alloc burst=32 | 9.86 Gbps  | 18.88 Gbps | 36.6 Gbps |
> ------------------------------------------------------------------
> | 2xpktmbuf_alloc burst=31 | 9.17 Gbps  | 18.69 Gbps | 35.1 Gbps |
> ------------------------------------------------------------------
> | rte_memcpy burst=32      | 9.54 Gbps  | 19.10 Gbps | 36.6 Gbps |
> ------------------------------------------------------------------
> | rte_memcpy burst=31      | 9.39 Gbps  | 18.53 Gbps | 35.5 Gbps |
> ==================================================================
> 
> 
> CPU: Intel Core i7-14700HX
> RAM: DDR5-5600
> OS:  Ubuntu 24.04.1
> DPDK: 23.11.1
> SERVER_CORES=0,1,3,5,7
> CLIENT_CORES=8,9,11,13,15
> 
> Results:
> ==================================================================
> |                          | 1 CORE     | 2 CORES    | 4 CORES   |
> ==================================================================
> | unpatched burst=32       | 15.52 Gbps | 27.35 Gbps | 46.8 Gbps |
> ------------------------------------------------------------------
> | 2xpktmbuf_alloc burst=32 | 15.49 Gbps | 27.68 Gbps | 46.4 Gbps |
> ------------------------------------------------------------------
> | 2xpktmbuf_alloc burst=31 | 14.98 Gbps | 26.75 Gbps | 45.2 Gbps |
> ------------------------------------------------------------------
> | rte_memcpy burst=32      | 15.99 Gbps | 28.44 Gbps | 49.3 Gbps |
> ------------------------------------------------------------------
> | rte_memcpy burst=31      | 14.85 Gbps | 27.32 Gbps | 46.3 Gbps |
> ==================================================================
> 

Hi Mihai,

Thank you for the extensive testing.

The problematic case is "burst=31"; between the '2xpktmbuf_alloc' and
'rte_memcpy' methods there is only a small difference, and neither is
consistently better than the other.

In this case I will proceed with the current patch.


> 
> On 19/07/2024 12:03, Ferruh Yigit wrote:
>> On 7/8/2024 12:45 PM, Ferruh Yigit wrote:
>>> On 7/8/2024 4:39 AM, Mihai Brodschi wrote:
>>>>
>>>>
>>>> On 07/07/2024 21:46, Mihai Brodschi wrote:
>>>>>
>>>>>
>>>>> On 07/07/2024 18:18, Mihai Brodschi wrote:
>>>>>>
>>>>>>
>>>>>> On 07/07/2024 17:05, Ferruh Yigit wrote:
>>>>>>>
>>>>>>> My expectation is numbers should be like following:
>>>>>>>
>>>>>>> Initially:
>>>>>>>  size = 256
>>>>>>>  head = 0
>>>>>>>  tail = 0
>>>>>>>
>>>>>>> In first refill:
>>>>>>>  n_slots = 256
>>>>>>>  head = 256
>>>>>>>  tail = 0
>>>>>>>
>>>>>>> Subsequent run that 32 slots used:
>>>>>>>  head = 256
>>>>>>>  tail = 32
>>>>>>>  n_slots = 32
>>>>>>>  rte_pktmbuf_alloc_bulk(mq, buf[head & mask], n_slots);
>>>>>>>   head & mask = 0
>>>>>>>   // So it fills first 32 elements of buffer, which is inbound
>>>>>>>
>>>>>>> This will continue as above, combination of only gap filled and head
>>>>>>> masked with 'mask' provides the wrapping required.
>>>>>>
>>>>>> If I understand correctly, this works only if eth_memif_rx_zc always processes
>>>>>> a number of packets which is a power of 2, so that the ring's head always wraps
>>>>>> around at the end of a refill loop, never in the middle of it.
>>>>>> Is there any reason this should be the case?
>>>>>> Maybe the tests don't trigger the crash because this condition holds true for them?
>>>>>
>>>>> Here's how to reproduce the crash on DPDK stable 23.11.1, using testpmd:
>>>>>
>>>>> Server:
>>>>> # ./dpdk-testpmd --vdev=net_memif0,id=1,role=server,bsize=1024,rsize=8 --single-file-segments -l2,3 --file-prefix test1 -- -i
>>>>>
>>>>> Client:
>>>>> # ./dpdk-testpmd --vdev=net_memif0,id=1,role=client,bsize=1024,rsize=8,zero-copy=yes --single-file-segments -l4,5 --file-prefix test2 -- -i
>>>>> testpmd> start
>>>>>
>>>>> Server:
>>>>> testpmd> start tx_first
>>>>> testpmt> set burst 15
>>>>>
>>>>> At this point, the client crashes with a segmentation fault.
>>>>> Before the burst is set to 15, its default value is 32.
>>>>> If the receiver processes packets in bursts of size 2^N, the crash does not occur.
>>>>> Setting the burst size to any power of 2 works, anything else crashes.
>>>>> After applying this patch, the crashes are completely gone.
>>>>
>>>> Sorry, this might not crash with a segmentation fault. To confirm the mempool is
>>>> corrupted, please compile DPDK with debug=true and the c_args -DRTE_LIBRTE_MEMPOOL_DEBUG.
>>>> You should see the client panic when changing the burst size to not be a power of 2.
>>>> This also works on the latest main branch.
>>>>
>>>
>>> Hi Mihai,
>>>
>>> Right, if the buffer size is not multiple of burst size, issue is valid.
>>> And as there is a requirement to have buffer size power of two, burst
>>> should have the same.
>>> I assume this issue is not caught before because default burst size is 32.
>>>
>>> Can you please share some performance impact of the change, with two
>>> possible solutions we discussed above?
>>>
>>> Other option is to add this as a limitation to the memif zero copy, but
>>> this won't be good for usability.
>>>
>>> We can decide based on performance numbers.
>>>
>>>
>>
>> Hi Jakup,
>>
>> Do you have any comment on this?
>>
>> I think we should either document this as limitation of the driver, or
>> fix it, and if so need to decide the fix.
>>
> 
> 


* Re: [PATCH v2] net/memif: fix buffer overflow in zero copy Rx
  2024-06-28 21:01 [PATCH v2] net/memif: fix buffer overflow in zero copy Rx Mihai Brodschi
  2024-07-01  4:57 ` Patrick Robb
  2024-07-07  2:12 ` Ferruh Yigit
@ 2024-10-10  2:33 ` Ferruh Yigit
  2 siblings, 0 replies; 14+ messages in thread
From: Ferruh Yigit @ 2024-10-10  2:33 UTC (permalink / raw)
  To: Mihai Brodschi, Jakub Grajciar; +Cc: dev, stable

On 6/28/2024 10:01 PM, Mihai Brodschi wrote:
> rte_pktmbuf_alloc_bulk is called by the zero-copy receiver to allocate
> new mbufs to be provided to the sender. The allocated mbuf pointers
> are stored in a ring, but the alloc function doesn't implement index
> wrap-around, so it writes past the end of the array. This results in
> memory corruption and duplicate mbufs being received.
> 
> Allocate 2x the space for the mbuf ring, so that the alloc function
> has a contiguous array to write to, then copy the excess entries
> to the start of the array.
> 
> Fixes: 43b815d88188 ("net/memif: support zero-copy slave")
> Cc: stable@dpdk.org
>
> Signed-off-by: Mihai Brodschi <mihai.brodschi@broadcom.com>
>

Reviewed-by: Ferruh Yigit <ferruh.yigit@amd.com>

Applied to dpdk-next-net/main, thanks.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] net/memif: fix buffer overflow in zero copy Rx
@ 2024-07-07  5:31 Mihai Brodschi
  0 siblings, 0 replies; 14+ messages in thread
From: Mihai Brodschi @ 2024-07-07  5:31 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: dev, stable, Mihai Brodschi

[-- Attachment #1: Type: text/plain, Size: 5658 bytes --]

Hi Ferruh,

On 07/07/2024 05:12, Ferruh Yigit wrote:
> On 6/28/2024 10:01 PM, Mihai Brodschi wrote:
>> rte_pktmbuf_alloc_bulk is called by the zero-copy receiver to allocate
>> new mbufs to be provided to the sender. The allocated mbuf pointers
>> are stored in a ring, but the alloc function doesn't implement index
>> wrap-around, so it writes past the end of the array. This results in
>> memory corruption and duplicate mbufs being received.
>>
>
> Hi Mihai,
>
> I am not sure writing past the ring actually occurs.
>
> As far as I can see, the intent is to keep the ring as full as possible: when
> 'head' and 'tail' are initially 0, the whole ring is filled.
> Later the tail moves and the emptied space is filled again, so the head (in
> modulo) is always just behind the tail after a refill. In the next run, the
> refill only fills the part the tail has moved, and that is what 'n_slots'
> computes. As this is only the size of the gap, starting from 'head' (with
> modulo) shouldn't go past the ring length.
>
> Do you observe this issue in practice? If so, can you please provide your
> backtrace and the numbers showing how to reproduce the issue?

The alloc function writes starting from the ring's head, but the ring's
head can be located at the end of the ring's memory buffer (ring_size - 1).
The correct behavior would be to wrap around to the start of the buffer (0),
but the alloc function has no awareness of the fact that it's writing to a
ring, so it writes to ring_size, ring_size + 1, etc.

Let's look at the existing code:
We assume the ring size is 256 and we just received 32 packets.
The previous tail was at index 255, now it's at index 31.
The head is initially at index 255.

head = __atomic_load_n(&ring->head, __ATOMIC_RELAXED);	// head = 255
n_slots = ring_size - head + mq->last_tail;		// n_slots = 32

if (n_slots < 32)					// not taken
	goto no_free_mbufs;

ret = rte_pktmbuf_alloc_bulk(mq->mempool, &mq->buffers[head & mask], n_slots);
// This will write 32 mbuf pointers starting at index (head & mask) = 255.
// The ring size is 256, so apart from the first one all pointers will be
// written out of bounds (index 256 .. 286, when it should be 0 .. 30).
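
For comparison, here is how the patched refill handles the same numbers (this is
the hunk from the patch below, annotated with the indices from the example above):

if (unlikely(n_slots > ring_size - (head & mask))) {	// 32 > 256 - 255, so taken
	rte_memcpy(mq->buffers, &mq->buffers[ring_size],
		(n_slots + (head & mask) - ring_size) * sizeof(struct rte_mbuf *));
	// Copies the 31 excess pointers from buffers[256..286] back to buffers[0..30].
}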

I can reproduce a crash 100% of the time with my application, but the output
is not very helpful, since it crashes elsewhere because of mempool corruption.
Applying this patch fixes the crashes completely.

>> Allocate 2x the space for the mbuf ring, so that the alloc function
>> has a contiguous array to write to, then copy the excess entries
>> to the start of the array.
>>
>
> Even if the issue is valid, I am not sure about the solution of doubling the
> buffer memory, but let's confirm the issue first before discussing the solution.

Initially, I thought about splitting the call to rte_pktmbuf_alloc_bulk in two,
but I was worried that might be bad for performance if the mempool is being
used concurrently from multiple threads.
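
Just to illustrate what I mean, a rough sketch of that split-call alternative
(not what the patch does; partial-failure handling is omitted, and it reuses the
local variables from the refill code above):

uint16_t avail = ring_size - (head & mask);	/* slots left before the array ends */

if (n_slots <= avail) {
	ret = rte_pktmbuf_alloc_bulk(mq->mempool, &mq->buffers[head & mask], n_slots);
} else {
	/* Fill up to the end of the array, then wrap to the start of it. */
	ret = rte_pktmbuf_alloc_bulk(mq->mempool, &mq->buffers[head & mask], avail);
	if (ret == 0)
		ret = rte_pktmbuf_alloc_bulk(mq->mempool, mq->buffers, n_slots - avail);
}
if (unlikely(ret < 0))
	goto no_free_mbufs;

Every refill that wraps would then make two trips to the mempool instead of one,
which is the overhead I was worried about.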

If we want to use only one call to rte_pktmbuf_alloc_bulk, we need an array
to store the allocated mbuf pointers. This array must be of length ring_size,
since that's the maximum number of mbufs which may be allocated in one go.
We need to copy the pointers from this array to the ring.
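
Roughly, that staging-array variant would look like this (again just a sketch;
'staging' is an illustrative name for a per-queue scratch array of ring_size
mbuf pointers, which does not exist in the driver today):

ret = rte_pktmbuf_alloc_bulk(mq->mempool, staging, n_slots);
if (unlikely(ret < 0))
	goto no_free_mbufs;
for (uint16_t i = 0; i < n_slots; i++)
	mq->buffers[(head + i) & mask] = staging[i];

Here every refilled slot is copied once, even when no wrap-around happens.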

If we instead allocate twice the space for the ring, we can skip copying
the pointers which were written to the ring, and only copy those that were
written outside of its bounds.

>> Fixes: 43b815d88188 ("net/memif: support zero-copy slave")
>> Cc: stable@dpdk.org
>> Signed-off-by: Mihai Brodschi <mihai.brodschi@broadcom.com>
>> ---
>> v2:
>>  - fix email formatting
>>
>> ---
>>  drivers/net/memif/rte_eth_memif.c | 10 +++++++++-
>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
>> index 16da22b5c6..3491c53cf1 100644
>> --- a/drivers/net/memif/rte_eth_memif.c
>> +++ b/drivers/net/memif/rte_eth_memif.c
>> @@ -600,6 +600,10 @@ eth_memif_rx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>>  	ret = rte_pktmbuf_alloc_bulk(mq->mempool, &mq->buffers[head & mask], n_slots);
>>  	if (unlikely(ret < 0))
>>  		goto no_free_mbufs;
>> +	if (unlikely(n_slots > ring_size - (head & mask))) {
>> +		rte_memcpy(mq->buffers, &mq->buffers[ring_size],
>> +			(n_slots + (head & mask) - ring_size) * sizeof(struct rte_mbuf *));
>> +	}
>>  
>>  	while (n_slots--) {
>>  		s0 = head++ & mask;
>> @@ -1245,8 +1249,12 @@ memif_init_queues(struct rte_eth_dev *dev)
>>  		}
>>  		mq->buffers = NULL;
>>  		if (pmd->flags & ETH_MEMIF_FLAG_ZERO_COPY) {
>> +			/*
>> +			 * Allocate 2x ring_size to reserve a contiguous array for
>> +			 * rte_pktmbuf_alloc_bulk (to store allocated mbufs).
>> +			 */
>>  			mq->buffers = rte_zmalloc("bufs", sizeof(struct rte_mbuf *) *
>> -						  (1 << mq->log2_ring_size), 0);
>> +						  (1 << (mq->log2_ring_size + 1)), 0);
>>  			if (mq->buffers == NULL)
>>  				return -ENOMEM;
>>  		}
>


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4215 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-10-10  2:33 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-06-28 21:01 [PATCH v2] net/memif: fix buffer overflow in zero copy Rx Mihai Brodschi
2024-07-01  4:57 ` Patrick Robb
2024-07-07  2:12 ` Ferruh Yigit
2024-07-07  5:50   ` Mihai Brodschi
2024-07-07 14:05     ` Ferruh Yigit
2024-07-07 15:18       ` Mihai Brodschi
2024-07-07 18:46         ` Mihai Brodschi
2024-07-08  3:39           ` Mihai Brodschi
2024-07-08 11:45             ` Ferruh Yigit
2024-07-19  9:03               ` Ferruh Yigit
2024-08-31 13:38                 ` Mihai Brodschi
2024-10-10  2:00                   ` Ferruh Yigit
2024-10-10  2:33 ` Ferruh Yigit
2024-07-07  5:31 Mihai Brodschi
