DPDK patches and discussions
* [PATCH] rte_memcpy: fix off by one for size 16 and 32
@ 2024-03-02 20:49 Stephen Hemminger
  2024-03-02 20:56 ` Stephen Hemminger
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Stephen Hemminger @ 2024-03-02 20:49 UTC (permalink / raw)
  To: dev
  Cc: Stephen Hemminger, Morten Brørup, Bruce Richardson,
	Konstantin Ananyev, Zhihong Wang, Yuanhan Liu, Xiaoyun Li

The rte_memcpy code executes extra instructions for sizes 16
and 32, which could potentially reference past the end of the data.

For a size of 16, only a single mov16 is needed;
likewise, for a size of 32, only a single mov32.

Fixes: f5472703c0bd ("eal: optimize aligned memcpy on x86")
Fixes: d35cc1fe6a7a ("eal/x86: revert select optimized memcpy at run-time")

Suggested-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 lib/eal/x86/include/rte_memcpy.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/eal/x86/include/rte_memcpy.h b/lib/eal/x86/include/rte_memcpy.h
index 72a92290e05d..00479a4bfbe6 100644
--- a/lib/eal/x86/include/rte_memcpy.h
+++ b/lib/eal/x86/include/rte_memcpy.h
@@ -233,13 +233,15 @@ rte_memcpy_generic(void *dst, const void *src, size_t n)
 	 */
 	if (n <= 32) {
 		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
-		rte_mov16((uint8_t *)dst - 16 + n,
+		if (n > 16)
+			rte_mov16((uint8_t *)dst - 16 + n,
 				  (const uint8_t *)src - 16 + n);
 		return ret;
 	}
 	if (n <= 64) {
 		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
-		rte_mov32((uint8_t *)dst - 32 + n,
+		if (n > 32)
+			rte_mov32((uint8_t *)dst - 32 + n,
 				  (const uint8_t *)src - 32 + n);
 		return ret;
 	}
-- 
2.43.0



* Re: [PATCH] rte_memcpy: fix off by one for size 16 and 32
  2024-03-02 20:49 [PATCH] rte_memcpy: fix off by one for size 16 and 32 Stephen Hemminger
@ 2024-03-02 20:56 ` Stephen Hemminger
  2024-03-02 22:10   ` Morten Brørup
  2024-03-03  6:46   ` Mattias Rönnblom
  2024-03-02 23:57 ` Morten Brørup
  2024-03-03  6:47 ` Mattias Rönnblom
  2 siblings, 2 replies; 6+ messages in thread
From: Stephen Hemminger @ 2024-03-02 20:56 UTC (permalink / raw)
  To: dev
  Cc: Morten Brørup, Bruce Richardson, Konstantin Ananyev,
	Zhihong Wang, Yuanhan Liu, Xiaoyun Li

On Sat,  2 Mar 2024 12:49:23 -0800
Stephen Hemminger <stephen@networkplumber.org> wrote:

> The rte_memcpy code executes extra instructions for sizes 16
> and 32, which could potentially reference past the end of the data.
> 
> For a size of 16, only a single mov16 is needed;
> likewise, for a size of 32, only a single mov32.
> 
> Fixes: f5472703c0bd ("eal: optimize aligned memcpy on x86")
> Fixes: d35cc1fe6a7a ("eal/x86: revert select optimized memcpy at run-time")
> 
> Suggested-by: Morten Brørup <mb@smartsharesystems.com>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Self-NAK, more is needed here.

The code has lots of pre-existing bugs where it will reference past the end
of the data in some cases.


* RE: [PATCH] rte_memcpy: fix off by one for size 16 and 32
  2024-03-02 20:56 ` Stephen Hemminger
@ 2024-03-02 22:10   ` Morten Brørup
  2024-03-03  6:46   ` Mattias Rönnblom
  1 sibling, 0 replies; 6+ messages in thread
From: Morten Brørup @ 2024-03-02 22:10 UTC (permalink / raw)
  To: Stephen Hemminger, dev
  Cc: Bruce Richardson, Konstantin Ananyev, Zhihong Wang, Yuanhan Liu,
	Xiaoyun Li

I'm also working on a fix.

Med venlig hilsen / Kind regards,
-Morten Brørup

> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Saturday, 2 March 2024 21.57
> To: dev@dpdk.org
> Cc: Morten Brørup; Bruce Richardson; Konstantin Ananyev; Zhihong Wang;
> Yuanhan Liu; Xiaoyun Li
> Subject: Re: [PATCH] rte_memcpy: fix off by one for size 16 and 32
> 
> On Sat,  2 Mar 2024 12:49:23 -0800
> Stephen Hemminger <stephen@networkplumber.org> wrote:
> 
> > The rte_memcpy code executes extra instructions for sizes 16
> > and 32, which could potentially reference past the end of the data.
> >
> > For a size of 16, only a single mov16 is needed;
> > likewise, for a size of 32, only a single mov32.
> >
> > Fixes: f5472703c0bd ("eal: optimize aligned memcpy on x86")
> > Fixes: d35cc1fe6a7a ("eal/x86: revert select optimized memcpy at run-time")
> >
> > Suggested-by: Morten Brørup <mb@smartsharesystems.com>
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> 
> Self-NAK, more is needed here.
> 
> The code has lots of pre-existing bugs where it will reference past the
> end of the data in some cases.


* RE: [PATCH] rte_memcpy: fix off by one for size 16 and 32
  2024-03-02 20:49 [PATCH] rte_memcpy: fix off by one for size 16 and 32 Stephen Hemminger
  2024-03-02 20:56 ` Stephen Hemminger
@ 2024-03-02 23:57 ` Morten Brørup
  2024-03-03  6:47 ` Mattias Rönnblom
  2 siblings, 0 replies; 6+ messages in thread
From: Morten Brørup @ 2024-03-02 23:57 UTC (permalink / raw)
  To: Stephen Hemminger, dev
  Cc: Bruce Richardson, Konstantin Ananyev, Zhihong Wang, Yuanhan Liu,
	Xiaoyun Li

> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Saturday, 2 March 2024 21.49
> 
> The rte_memcpy code executes extra instructions for sizes 16
> and 32, which could potentially reference past the end of the data.

It's a somewhat weird concept, but they don't reference past the end of the data.
They reference the data in chunks of 16.
E.g. when copying 17 bytes: first copy bytes [0..15], then copy bytes [1..16] (the "-16 + n" in the code).

By placing the second 16-byte chunk so that it ends at byte n-1 (the "-16 + n" offset), they never reference past the end of the data.
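
A minimal standalone C sketch of that overlap (not DPDK code; copy16() is
just a stand-in here for rte_mov16()):

#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One 16-byte block copy, like rte_mov16() does with SSE on x86. */
static void copy16(uint8_t *dst, const uint8_t *src)
{
	memcpy(dst, src, 16);
}

/* Copy n bytes, 16 <= n <= 32, using two (possibly overlapping) moves. */
static void copy_16_to_32(uint8_t *dst, const uint8_t *src, size_t n)
{
	copy16(dst, src);                   /* bytes [0 .. 15] */
	copy16(dst + n - 16, src + n - 16); /* bytes [n-16 .. n-1] */
}

For n == 17 the two moves overlap on bytes [1..15]; for n == 16 they land on
exactly the same bytes, which is the redundant second copy the patch removes.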

> 
> For a size of 16, only a single mov16 is needed;
> likewise, for a size of 32, only a single mov32.

I fixed the duplicate copies with my patch [1]. Please review.

[1]: https://inbox.dpdk.org/dev/20240302234812.9137-1-mb@smartsharesystems.com/T/#u



* Re: [PATCH] rte_memcpy: fix off by one for size 16 and 32
  2024-03-02 20:56 ` Stephen Hemminger
  2024-03-02 22:10   ` Morten Brørup
@ 2024-03-03  6:46   ` Mattias Rönnblom
  1 sibling, 0 replies; 6+ messages in thread
From: Mattias Rönnblom @ 2024-03-03  6:46 UTC (permalink / raw)
  To: Stephen Hemminger, dev
  Cc: Morten Brørup, Bruce Richardson, Konstantin Ananyev,
	Zhihong Wang, Yuanhan Liu, Xiaoyun Li

On 2024-03-02 21:56, Stephen Hemminger wrote:
> On Sat,  2 Mar 2024 12:49:23 -0800
> Stephen Hemminger <stephen@networkplumber.org> wrote:
> 
>> The rte_memcpy code executes extra instructions for sizes 16
>> and 32, which could potentially reference past the end of the data.
>>
>> For a size of 16, only a single mov16 is needed;
>> likewise, for a size of 32, only a single mov32.
>>
>> Fixes: f5472703c0bd ("eal: optimize aligned memcpy on x86")
>> Fixes: d35cc1fe6a7a ("eal/x86: revert select optimized memcpy at run-time")
>>
>> Suggested-by: Morten Brørup <mb@smartsharesystems.com>
>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> 
> Self-NAK, more is needed here.
> 
> The code has lots of pre-existing bugs where it will reference past the end
> of the data in some cases.

Memory beyond the buffer is not accessed in this case. The rte_mov16() 
copies just overlap.
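
A quick standalone way to check the write side of that claim (copy16() again
stands in for rte_mov16(); catching a source-side over-read would instead need
something like ASan or placing the source at the end of a page):

#include <assert.h>
#include <stdint.h>
#include <string.h>

static void copy16(uint8_t *dst, const uint8_t *src)
{
	memcpy(dst, src, 16);
}

int main(void)
{
	uint8_t src[48], dst[48];

	for (size_t n = 16; n <= 32; n++) {
		memset(src, 0xab, sizeof(src));
		memset(dst, 0xee, sizeof(dst));

		copy16(dst, src);                       /* bytes [0 .. 15] */
		if (n > 16)                             /* skip the redundant move, as in the patch */
			copy16(dst + n - 16, src + n - 16); /* bytes [n-16 .. n-1] */

		assert(memcmp(dst, src, n) == 0);       /* copied region matches */
		for (size_t i = n; i < sizeof(dst); i++)
			assert(dst[i] == 0xee);         /* guard bytes after dst[n-1] untouched */
	}
	return 0;
}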

A colleague pointed out the same "bug" to me a couple of years ago. We 
didn't realize what code would be generated in the n == 16 case though. 
That seems very much worth fixing.

Maybe it's worth adding a comment regarding the overlap.


* Re: [PATCH] rte_memcpy: fix off by one for size 16 and 32
  2024-03-02 20:49 [PATCH] rte_memcpy: fix off by one for size 16 and 32 Stephen Hemminger
  2024-03-02 20:56 ` Stephen Hemminger
  2024-03-02 23:57 ` Morten Brørup
@ 2024-03-03  6:47 ` Mattias Rönnblom
  2 siblings, 0 replies; 6+ messages in thread
From: Mattias Rönnblom @ 2024-03-03  6:47 UTC (permalink / raw)
  To: Stephen Hemminger, dev
  Cc: Morten Brørup, Bruce Richardson, Konstantin Ananyev,
	Zhihong Wang, Yuanhan Liu, Xiaoyun Li

On 2024-03-02 21:49, Stephen Hemminger wrote:
> The rte_memcpy code executes extra instructions for sizes 16
> and 32, which could potentially reference past the end of the data.
> 
> For a size of 16, only a single mov16 is needed;
> likewise, for a size of 32, only a single mov32.
> 
> Fixes: f5472703c0bd ("eal: optimize aligned memcpy on x86")
> Fixes: d35cc1fe6a7a ("eal/x86: revert select optimized memcpy at run-time")
> 
> Suggested-by: Morten Brørup <mb@smartsharesystems.com>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
>   lib/eal/x86/include/rte_memcpy.h | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/eal/x86/include/rte_memcpy.h b/lib/eal/x86/include/rte_memcpy.h
> index 72a92290e05d..00479a4bfbe6 100644
> --- a/lib/eal/x86/include/rte_memcpy.h
> +++ b/lib/eal/x86/include/rte_memcpy.h
> @@ -233,13 +233,15 @@ rte_memcpy_generic(void *dst, const void *src, size_t n)
>   	 */
>   	if (n <= 32) {
>   		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
> -		rte_mov16((uint8_t *)dst - 16 + n,
> +		if (n > 16)
> +			rte_mov16((uint8_t *)dst - 16 + n,
>   				  (const uint8_t *)src - 16 + n);
>   		return ret;
>   	}
>   	if (n <= 64) {
>   		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
> -		rte_mov32((uint8_t *)dst - 32 + n,
> +		if (n > 32)

n is always > 32 here; the n <= 32 branch above has already returned.

> +			rte_mov32((uint8_t *)dst - 32 + n,
>   				  (const uint8_t *)src - 32 + n);
>   		return ret;
>   	}


