* [PATCH] eal/x86: remove redundant round to improve performance
@ 2023-03-29 9:16 Leyi Rong
2023-03-29 9:30 ` Morten Brørup
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Leyi Rong @ 2023-03-29 9:16 UTC (permalink / raw)
To: mb, bruce.richardson; +Cc: dev, Leyi Rong
In rte_memcpy_aligned(), one redundant round is taken in the 64-byte
block copy loop if the size is a multiple of 64, because the trailing
catch-up copy also copies the last 64 bytes. So, stop the loop one
block early and let the catch-up copy handle the last 64 bytes in
this case.
Suggested-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
---
lib/eal/x86/include/rte_memcpy.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/eal/x86/include/rte_memcpy.h b/lib/eal/x86/include/rte_memcpy.h
index d4d7a5cfc8..fd151be708 100644
--- a/lib/eal/x86/include/rte_memcpy.h
+++ b/lib/eal/x86/include/rte_memcpy.h
@@ -846,7 +846,7 @@ rte_memcpy_aligned(void *dst, const void *src, size_t n)
}
/* Copy 64 bytes blocks */
- for (; n >= 64; n -= 64) {
+ for (; n > 64; n -= 64) {
rte_mov64((uint8_t *)dst, (const uint8_t *)src);
dst = (uint8_t *)dst + 64;
src = (const uint8_t *)src + 64;
--
2.34.1
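For context (not part of the patch), the catch-up copy that the commit
message refers to is the rte_mov64() call that follows the block-copy loop
in rte_memcpy_aligned(). A minimal sketch of the pattern, assuming n >= 64
on entry and using the hypothetical name copy_aligned_sketch() (illustrative
only, not the exact DPDK code):

#include <stddef.h>
#include <stdint.h>

#include <rte_memcpy.h>	/* for rte_mov64() */

/* Minimal sketch of the tail handling in rte_memcpy_aligned();
 * assumes n >= 64 (the real function handles smaller sizes earlier). */
static inline void *
copy_aligned_sketch(void *dst, const void *src, size_t n)
{
	void *ret = dst;

	/* Copy 64-byte blocks, stopping while more than 64 bytes remain
	 * so that the last block is left to the catch-up copy below. */
	for (; n > 64; n -= 64) {
		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
		dst = (uint8_t *)dst + 64;
		src = (const uint8_t *)src + 64;
	}

	/* Catch-up copy: move the final 64 bytes, addressed from the end of
	 * the buffers. With the old "n >= 64" loop condition and n a
	 * multiple of 64, this block had already been copied by the loop,
	 * which is the redundant round removed by the patch. */
	rte_mov64((uint8_t *)dst - 64 + n, (const uint8_t *)src - 64 + n);

	return ret;
}

With n = 128, for example, the old "n >= 64" loop copied both 64-byte
blocks and the catch-up copy then rewrote the second block; with "n > 64"
the loop copies only the first block and the catch-up copy handles the
second exactly once.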
* RE: [PATCH] eal/x86: remove redundant round to improve performance
2023-03-29 9:16 [PATCH] eal/x86: remove redundant round to improve performance Leyi Rong
@ 2023-03-29 9:30 ` Morten Brørup
2023-03-29 10:20 ` Bruce Richardson
2023-04-04 13:15 ` David Marchand
2 siblings, 0 replies; 5+ messages in thread
From: Morten Brørup @ 2023-03-29 9:30 UTC (permalink / raw)
To: Leyi Rong, bruce.richardson; +Cc: dev
> From: Leyi Rong [mailto:leyi.rong@intel.com]
> Sent: Wednesday, 29 March 2023 11.17
>
> In rte_memcpy_aligned(), one redundant round is taken in the 64-byte
> block copy loop if the size is a multiple of 64, because the trailing
> catch-up copy also copies the last 64 bytes. So, stop the loop one
> block early and let the catch-up copy handle the last 64 bytes in
> this case.
>
> Suggested-by: Morten Brørup <mb@smartsharesystems.com>
> Signed-off-by: Leyi Rong <leyi.rong@intel.com>
> ---
> lib/eal/x86/include/rte_memcpy.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/lib/eal/x86/include/rte_memcpy.h
> b/lib/eal/x86/include/rte_memcpy.h
> index d4d7a5cfc8..fd151be708 100644
> --- a/lib/eal/x86/include/rte_memcpy.h
> +++ b/lib/eal/x86/include/rte_memcpy.h
> @@ -846,7 +846,7 @@ rte_memcpy_aligned(void *dst, const void *src, size_t n)
> }
>
> /* Copy 64 bytes blocks */
> - for (; n >= 64; n -= 64) {
> + for (; n > 64; n -= 64) {
> rte_mov64((uint8_t *)dst, (const uint8_t *)src);
> dst = (uint8_t *)dst + 64;
> src = (const uint8_t *)src + 64;
> --
> 2.34.1
>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
* Re: [PATCH] eal/x86: remove redundant round to improve performance
2023-03-29 9:16 [PATCH] eal/x86: remove redundant round to improve performance Leyi Rong
2023-03-29 9:30 ` Morten Brørup
@ 2023-03-29 10:20 ` Bruce Richardson
2023-04-04 13:15 ` David Marchand
2 siblings, 0 replies; 5+ messages in thread
From: Bruce Richardson @ 2023-03-29 10:20 UTC (permalink / raw)
To: Leyi Rong; +Cc: mb, dev
On Wed, Mar 29, 2023 at 05:16:58PM +0800, Leyi Rong wrote:
> In rte_memcpy_aligned(), one redundant round is taken in the 64-byte
> block copy loop if the size is a multiple of 64, because the trailing
> catch-up copy also copies the last 64 bytes. So, stop the loop one
> block early and let the catch-up copy handle the last 64 bytes in
> this case.
>
> Suggested-by: Morten Brørup <mb@smartsharesystems.com>
> Signed-off-by: Leyi Rong <leyi.rong@intel.com>
> ---
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Thanks for doing the fix for this.
* Re: [PATCH] eal/x86: remove redundant round to improve performance
2023-03-29 9:16 [PATCH] eal/x86: remove redundant round to improve performance Leyi Rong
2023-03-29 9:30 ` Morten Brørup
2023-03-29 10:20 ` Bruce Richardson
@ 2023-04-04 13:15 ` David Marchand
2023-06-07 16:44 ` David Marchand
2 siblings, 1 reply; 5+ messages in thread
From: David Marchand @ 2023-04-04 13:15 UTC (permalink / raw)
To: Leyi Rong; +Cc: mb, bruce.richardson, dev
On Wed, Mar 29, 2023 at 11:17 AM Leyi Rong <leyi.rong@intel.com> wrote:
>
> In rte_memcpy_aligned(), one redundant round is taken in the 64-byte
> block copy loop if the size is a multiple of 64, because the trailing
> catch-up copy also copies the last 64 bytes. So, stop the loop one
> block early and let the catch-up copy handle the last 64 bytes in
> this case.
Fixes: f5472703c0bd ("eal: optimize aligned memcpy on x86")
>
> Suggested-by: Morten Brørup <mb@smartsharesystems.com>
> Signed-off-by: Leyi Rong <leyi.rong@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
--
David Marchand
* Re: [PATCH] eal/x86: remove redundant round to improve performance
2023-04-04 13:15 ` David Marchand
@ 2023-06-07 16:44 ` David Marchand
0 siblings, 0 replies; 5+ messages in thread
From: David Marchand @ 2023-06-07 16:44 UTC (permalink / raw)
To: Leyi Rong; +Cc: mb, bruce.richardson, dev
On Tue, Apr 4, 2023 at 3:15 PM David Marchand <david.marchand@redhat.com> wrote:
> On Wed, Mar 29, 2023 at 11:17 AM Leyi Rong <leyi.rong@intel.com> wrote:
> >
> > In rte_memcpy_aligned(), one redundant round is taken in the 64-byte
> > block copy loop if the size is a multiple of 64, because the trailing
> > catch-up copy also copies the last 64 bytes. So, stop the loop one
> > block early and let the catch-up copy handle the last 64 bytes in
> > this case.
>
> Fixes: f5472703c0bd ("eal: optimize aligned memcpy on x86")
> >
> > Suggested-by: Morten Brørup <mb@smartsharesystems.com>
> > Signed-off-by: Leyi Rong <leyi.rong@intel.com>
> Reviewed-by: David Marchand <david.marchand@redhat.com>
Applied, thanks.
--
David Marchand