From: Jerin Jacob <jerinjacobk@gmail.com>
To: Timothy McDaniel <timothy.mcdaniel@intel.com>,
"Richardson, Bruce" <bruce.richardson@intel.com>,
konstantin.v.ananyev@yandex.ru
Cc: Jerin Jacob <jerinj@marvell.com>, dpdk-dev <dev@dpdk.org>
Subject: Re: [PATCH] event/dlb2: add support for single 512B write of 4 QEs
Date: Sat, 14 May 2022 17:37:39 +0530 [thread overview]
Message-ID: <CALBAE1Ns=prBKxsvjbOwN93HeC5zekF0jrdWM-bMwFLtiZji4w@mail.gmail.com> (raw)
In-Reply-To: <20220409151849.1007602-1-timothy.mcdaniel@intel.com>
On Sat, Apr 9, 2022 at 8:48 PM Timothy McDaniel
<timothy.mcdaniel@intel.com> wrote:
>
> On Xeon, as 512b accesses are available, movdir64 instruction is able to
> perform 512b read and write to DLB producer port. In order for movdir64
> to be able to pull its data from store buffers (store-buffer-forwarding)
> (before actual write), data should be in single 512b write format.
> This commit add change when code is built for Xeon with 512b AVX support
> to make single 512b write of all 4 QEs instead of 4x64b writes.
>
> Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
> ---
> drivers/event/dlb2/dlb2.c | 86 ++++++++++++++++++++++++++++++---------
> 1 file changed, 67 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
> index 36f07d0061..e2a5303310 100644
> --- a/drivers/event/dlb2/dlb2.c
> +++ b/drivers/event/dlb2/dlb2.c
> @@ -2776,25 +2776,73 @@ dlb2_event_build_hcws(struct dlb2_port *qm_port,
> ev[3].event_type,
> DLB2_QE_EV_TYPE_WORD + 4);
>
> - /* Store the metadata to memory (use the double-precision
> - * _mm_storeh_pd because there is no integer function for
> - * storing the upper 64b):
> - * qe[0] metadata = sse_qe[0][63:0]
> - * qe[1] metadata = sse_qe[0][127:64]
> - * qe[2] metadata = sse_qe[1][63:0]
> - * qe[3] metadata = sse_qe[1][127:64]
> - */
> - _mm_storel_epi64((__m128i *)&qe[0].u.opaque_data, sse_qe[0]);
> - _mm_storeh_pd((double *)&qe[1].u.opaque_data,
> - (__m128d)sse_qe[0]);
> - _mm_storel_epi64((__m128i *)&qe[2].u.opaque_data, sse_qe[1]);
> - _mm_storeh_pd((double *)&qe[3].u.opaque_data,
> - (__m128d)sse_qe[1]);
> -
> - qe[0].data = ev[0].u64;
> - qe[1].data = ev[1].u64;
> - qe[2].data = ev[2].u64;
> - qe[3].data = ev[3].u64;
> + #ifdef __AVX512VL__
+ x86 maintainers
We need a runtime check based on CPU flags. Right? As the build and
run machine can be different?
> +
> + /*
> + * 1) Build avx512 QE store and build each
> + * QE individually as XMM register
> + * 2) Merge the 4 XMM registers/QEs into single AVX512
> + * register
> + * 3) Store single avx512 register to &qe[0] (4x QEs
> + * stored in 1x store)
> + */
> +
> + __m128i v_qe0 = _mm_setzero_si128();
> + uint64_t meta = _mm_extract_epi64(sse_qe[0], 0);
> + v_qe0 = _mm_insert_epi64(v_qe0, ev[0].u64, 0);
> + v_qe0 = _mm_insert_epi64(v_qe0, meta, 1);
> +
> + __m128i v_qe1 = _mm_setzero_si128();
> + meta = _mm_extract_epi64(sse_qe[0], 1);
> + v_qe1 = _mm_insert_epi64(v_qe1, ev[1].u64, 0);
> + v_qe1 = _mm_insert_epi64(v_qe1, meta, 1);
> +
> + __m128i v_qe2 = _mm_setzero_si128();
> + meta = _mm_extract_epi64(sse_qe[1], 0);
> + v_qe2 = _mm_insert_epi64(v_qe2, ev[2].u64, 0);
> + v_qe2 = _mm_insert_epi64(v_qe2, meta, 1);
> +
> + __m128i v_qe3 = _mm_setzero_si128();
> + meta = _mm_extract_epi64(sse_qe[1], 1);
> + v_qe3 = _mm_insert_epi64(v_qe3, ev[3].u64, 0);
> + v_qe3 = _mm_insert_epi64(v_qe3, meta, 1);
> +
> + /* we have 4x XMM registers, one per QE. */
> + __m512i v_all_qes = _mm512_setzero_si512();
> + v_all_qes = _mm512_inserti32x4(v_all_qes, v_qe0, 0);
> + v_all_qes = _mm512_inserti32x4(v_all_qes, v_qe1, 1);
> + v_all_qes = _mm512_inserti32x4(v_all_qes, v_qe2, 2);
> + v_all_qes = _mm512_inserti32x4(v_all_qes, v_qe3, 3);
> +
> + /*
> + * store the 4x QEs in a single register to the scratch
> + * space of the PMD
> + */
> + _mm512_store_si512(&qe[0], v_all_qes);
> +#else
> + /*
> + * Store the metadata to memory (use the double-precision
> + * _mm_storeh_pd because there is no integer function for
> + * storing the upper 64b):
> + * qe[0] metadata = sse_qe[0][63:0]
> + * qe[1] metadata = sse_qe[0][127:64]
> + * qe[2] metadata = sse_qe[1][63:0]
> + * qe[3] metadata = sse_qe[1][127:64]
> + */
> + _mm_storel_epi64((__m128i *)&qe[0].u.opaque_data,
> + sse_qe[0]);
> + _mm_storeh_pd((double *)&qe[1].u.opaque_data,
> + (__m128d)sse_qe[0]);
> + _mm_storel_epi64((__m128i *)&qe[2].u.opaque_data,
> + sse_qe[1]);
> + _mm_storeh_pd((double *)&qe[3].u.opaque_data,
> + (__m128d)sse_qe[1]);
> +
> + qe[0].data = ev[0].u64;
> + qe[1].data = ev[1].u64;
> + qe[2].data = ev[2].u64;
> + qe[3].data = ev[3].u64;
> +#endif
>
> break;
> case 3:
> --
> 2.25.1
>
next prev parent reply other threads:[~2022-05-14 12:08 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-09 15:18 Timothy McDaniel
2022-05-14 12:07 ` Jerin Jacob [this message]
2022-05-16 8:42 ` Bruce Richardson
2022-05-16 17:00 ` McDaniel, Timothy
2022-05-19 20:24 ` [PATCH v3] " Timothy McDaniel
2022-05-23 16:09 ` [PATCH v4] " Timothy McDaniel
2022-05-23 16:34 ` Bruce Richardson
2022-05-23 16:52 ` McDaniel, Timothy
2022-05-23 16:55 ` Bruce Richardson
2022-06-09 17:40 ` Jerin Jacob
2022-06-09 18:02 ` McDaniel, Timothy
2022-05-23 16:37 ` Bruce Richardson
2022-05-23 16:45 ` McDaniel, Timothy
2022-06-10 12:43 ` [PATCH v6] " Timothy McDaniel
2022-06-10 15:41 ` [PATCH v7] " Timothy McDaniel
2022-06-10 16:15 ` Bruce Richardson
2022-06-10 16:27 ` [PATCH v8] " Timothy McDaniel
2022-06-13 6:30 ` Jerin Jacob
2022-06-13 20:39 ` [PATCH v9] " Timothy McDaniel
2022-06-14 10:40 ` Jerin Jacob
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CALBAE1Ns=prBKxsvjbOwN93HeC5zekF0jrdWM-bMwFLtiZji4w@mail.gmail.com' \
--to=jerinjacobk@gmail.com \
--cc=bruce.richardson@intel.com \
--cc=dev@dpdk.org \
--cc=jerinj@marvell.com \
--cc=konstantin.v.ananyev@yandex.ru \
--cc=timothy.mcdaniel@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).