DPDK CI discussions
* Intel QAT 8970 accel card on ARM Ampere Server
@ 2023-07-31 17:13 Patrick Robb
  2023-08-04  9:48 ` Ruifeng Wang
  0 siblings, 1 reply; 31+ messages in thread
From: Patrick Robb @ 2023-07-31 17:13 UTC (permalink / raw)
  To: Ruifeng Wang, Honnappa Nagarahalli, Juraj Linkeš
  Cc: dharmikjayesh.thakkar, ci

Hi Ruifeng, Honnappa, Juraj,

The Intel QAT 8970 accelerator card has arrived at the Community Lab, and
we've installed it in the Ampere server. Presumably, we should test both
crypto and compress operations (and their respective performance metrics).
To that end, there are DTS testsuites for testing QAT crypto/compress
functions. These testsuites make use of the DPDK crypto perf app and the
DPDK compress perf app. If you want, you can set up the DTS side yourself,
both on the system and in Jenkins (you are now allowed to submit PRs on
our GitLab), but we can also do that on the lab side, as we probably have
more experience. I do, however, have a question about the QAT kernel
driver and the corresponding PMDs.

compress suite:
https://git.dpdk.org/tools/dts/tree/test_plans/compressdev_qat_pmd_test_plan.rst
crypto suite:
https://git.dpdk.org/tools/dts/tree/test_plans/crypto_perf_cryptodev_perf_test_plan.rst

For reference, the DPDK docs page explaining QAT driver capabilities and
building the QAT PMDs (crypto sym, crypto asym, and compress) is here:
https://doc.dpdk.org/guides/cryptodevs/qat.html#building-qat

Some notes before I get to my main question:
- The 8970 is a C62x device
- OpenSSL is installed (QAT on Arm requires it)
- 3 PFs are visible in lspci (as expected)
- SR-IOV is enabled

However, although the system is on a kernel version that should be valid
for the QAT driver, the QAT kernel module is not loaded. As a result, while
setting up testing I am unable to create 16 VFs on each of the 3 PFs, as in
the example below:

echo 16 > /sys/bus/pci/drivers/c6xx/(pci address)/sriov_numvfs
echo 16 > /sys/bus/pci/drivers/c6xx/(pci address)/sriov_numvfs
echo 16 > /sys/bus/pci/drivers/c6xx/(pci address)/sriov_numvfs
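
The three commands above can be sketched as one loop over whatever PFs are
bound to the c6xx driver. This is only a minimal sketch: it assumes the
qat_c62x module is loaded (otherwise the c6xx driver directory does not
exist), and the function name create_qat_vfs and the SYSFS_ROOT override
are hypothetical, added only so the loop can be dry-run against a fake
directory tree.

```shell
# Sketch: request N VFs on every PF bound to the c6xx driver.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
create_qat_vfs() {
    numvfs="${1:-16}"
    drv="${SYSFS_ROOT:-/sys}/bus/pci/drivers/c6xx"

    if [ ! -d "$drv" ]; then
        echo "c6xx driver not registered; is the qat_c62x module loaded?" >&2
        return 1
    fi

    found=0
    # Bound devices appear as entries named by PCI address (dddd:bb:dd.f).
    for pf in "$drv"/*:*:*.*; do
        [ -e "$pf/sriov_numvfs" ] || continue
        echo "$numvfs" > "$pf/sriov_numvfs"
        echo "requested $numvfs VFs on ${pf##*/}"
        found=1
    done

    if [ "$found" -ne 1 ]; then
        echo "no PFs are bound to the c6xx driver" >&2
        return 1
    fi
}
```

On the real system this would be run as root, e.g. "create_qat_vfs 16".
Note that the kernel rejects a new non-zero value in sriov_numvfs while VFs
already exist, so write 0 first when changing the count.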

There is also the option of downloading the firmware from the kernel
firmware repo, copying the QAT binaries to /lib/firmware, and starting the
QAT modules from there. I wasn't able to resolve the situation with this
method, but that could also have been user error on my part.
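
One way to rule out a firmware mix-up is to check that both C62x files are
in place before loading the module. A minimal sketch, assuming the file
names listed for C62x devices in the DPDK QAT guide (qat_c62x.bin and
qat_c62x_mmp.bin); the function name and the directory argument are
hypothetical.

```shell
# Sketch: list C62x firmware files missing from a firmware directory.
# Returns 0 when both files are present, 1 otherwise.
missing_qat_c62x_fw() {
    fwdir="${1:-/lib/firmware}"
    rc=0
    for f in qat_c62x.bin qat_c62x_mmp.bin; do
        if [ ! -f "$fwdir/$f" ]; then
            echo "missing: $fwdir/$f"
            rc=1
        fi
    done
    return "$rc"
}
```

If this reports nothing missing, "modprobe qat_c62x" followed by
"dmesg | grep -i qat" shows whether the module exists for the running
kernel and whether the firmware loaded.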

There is also the option of installing the IDZ QAT Driver
<https://www.intel.com/content/www/us/en/developer/topic-technology/open/quick-assist-technology/overview.html>,
but that should not be required given the kernel version the Ampere server
is on, and I don't want to rely on this fallback method without consulting
you first. Is there anything specific to running a QAT device on Arm that
I am missing here? The DTS testsuite test plans actually seem to recommend
this approach in general, but the DPDK docs say to use the kernel driver,
so I'm not sure which is right.

In any case, one of you should be able to log in to the Ampere server in
situations like this, or just in general. Ruifeng/Juraj, I see you both
have accounts on our IdM system, so you should have access. Please let me
know if you need renewed VPN cert configs and I will send them. If you do
log in, be aware that this system could be running CI testing at any time.
I can always schedule time for it to be offline and available for
maintenance if you want to do anything that could disrupt testing.

I also CC'd Dharmik on this as I see he sent an email regarding QAT support
on aarch64 in June.

Let me know if you have any thoughts on the QAT kernel driver part.

Thanks,
Patrick


* RE: Intel QAT 8970 accel card on ARM Ampere Server
  2023-07-31 17:13 Intel QAT 8970 accel card on ARM Ampere Server Patrick Robb
@ 2023-08-04  9:48 ` Ruifeng Wang
  2023-08-08  7:07   ` Juraj Linkeš
  0 siblings, 1 reply; 31+ messages in thread
From: Ruifeng Wang @ 2023-08-04  9:48 UTC (permalink / raw)
  To: Patrick Robb, Honnappa Nagarahalli, Juraj Linkeš
  Cc: Dharmik Jayesh Thakkar, ci, nd

Hi Patrick,

Thanks for reaching out, and my apologies for the delayed response.

We noticed that some information about using QAT with DPDK on Arm is missing.
The DPDK documentation will be updated to include the missing part.
We will get back to you on this later.

Best regards,
Ruifeng



* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-08-04  9:48 ` Ruifeng Wang
@ 2023-08-08  7:07   ` Juraj Linkeš
  2023-08-08  7:11     ` Ruifeng Wang
  0 siblings, 1 reply; 31+ messages in thread
From: Juraj Linkeš @ 2023-08-08  7:07 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: Patrick Robb, Honnappa Nagarahalli, Dharmik Jayesh Thakkar, ci, nd

We've talked about this some more, and the best way forward is to rebuild
the Ubuntu kernel. It should be fairly straightforward according to their
wiki page <https://wiki.ubuntu.com/Kernel/BuildYourOwnKernel>. The page
mentions a fairly old release (19.04), but it was updated a year ago, so
the instructions are likely still valid.

However, I don't have the link to the kernel patch that Honnappa
mentioned. @Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
@Ruifeng Wang <Ruifeng.Wang@arm.com>, can you please provide a reference
for the patch?

Since the patch is small, there shouldn't be problems with applying it. Let
us know whether this is doable.

Regards,
Juraj



* RE: Intel QAT 8970 accel card on ARM Ampere Server
  2023-08-08  7:07   ` Juraj Linkeš
@ 2023-08-08  7:11     ` Ruifeng Wang
  2023-08-11 21:18       ` Patrick Robb
  0 siblings, 1 reply; 31+ messages in thread
From: Ruifeng Wang @ 2023-08-08  7:11 UTC (permalink / raw)
  To: Juraj Linkeš
  Cc: Patrick Robb, Honnappa Nagarahalli, Dharmik Jayesh Thakkar, ci, nd, nd

From: Juraj Linkeš <juraj.linkes@pantheon.tech>
Sent: Tuesday, August 8, 2023 3:07 PM
To: Ruifeng Wang <Ruifeng.Wang@arm.com>
Cc: Patrick Robb <probb@iol.unh.edu>; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Dharmik Jayesh Thakkar <DharmikJayesh.Thakkar@arm.com>; ci@dpdk.org; nd <nd@arm.com>
Subject: Re: Intel QAT 8970 accel card on ARM Ampere Server

We've talked about this some more and the best way to move forward is to rebuild the ubuntu kernel. It should be fairly straightforward according to their wiki page<https://wiki.ubuntu.com/Kernel/BuildYourOwnKernel>. The page mentions a fairly old release (19.04), but was updated a year ago so the instructions are likely still valid.

However, I don't have the link to the kernel patch that Honnappa mentioned. @Honnappa Nagarahalli<mailto:Honnappa.Nagarahalli@arm.com> @Ruifeng Wang<mailto:Ruifeng.Wang@arm.com>, can you please provide a reference for the patch?
[Ruifeng] Here is the kernel patch set: https://lkml.org/lkml/2022/6/17/328

Since the patch is small, there shouldn't be problems with applying it. Let us know whether this is doable.

Regards,
Juraj



* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-08-08  7:11     ` Ruifeng Wang
@ 2023-08-11 21:18       ` Patrick Robb
  2023-08-21  8:45         ` Juraj Linkeš
  0 siblings, 1 reply; 31+ messages in thread
From: Patrick Robb @ 2023-08-11 21:18 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: Juraj Linkeš, Honnappa Nagarahalli, Dharmik Jayesh Thakkar, ci, nd

Sorry for the delay in replying, guys.

Thanks for the information. So I download the 2 diffs from that thread and
make a patch from them. Where and how do I then apply it?

Then I install the packages needed per the Ubuntu page, and then I can skip
down to the "Building The Kernel" section? After that I think we're all
set, and we just have to set up DTS and the associated Jenkins pipelines.

Do you want me to back anything up in advance of this? I don't know whether
that is needed, but the Ampere server is currently live running CI testing,
so I want to act safely. I will try to address this first thing on Monday
and get back to you.




-- 

Patrick Robb

Technical Service Manager

UNH InterOperability Laboratory

21 Madbury Rd, Suite 100, Durham, NH 03824

www.iol.unh.edu


* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-08-11 21:18       ` Patrick Robb
@ 2023-08-21  8:45         ` Juraj Linkeš
  2023-08-30  0:05           ` Patrick Robb
  2023-09-01 21:30           ` Patrick Robb
  0 siblings, 2 replies; 31+ messages in thread
From: Juraj Linkeš @ 2023-08-21  8:45 UTC (permalink / raw)
  To: Patrick Robb
  Cc: Ruifeng Wang, Honnappa Nagarahalli, Dharmik Jayesh Thakkar, ci, nd

Hi Patrick,

On Fri, Aug 11, 2023 at 11:18 PM Patrick Robb <probb@iol.unh.edu> wrote:

> Sorry about the wait on my reply guys.
>
> Thanks for the information. So I download the 2 diffs from that thread,
> make a patch with them. Then where and how do I apply it?
>
>
First get the Ubuntu repo:

   - git clone git://kernel.ubuntu.com/ubuntu/ubuntu-<release codename>.git

22.04 is jammy, but looking at https://kernel.ubuntu.com/git/, the repo is
not under ubuntu/ubuntu-jammy.git but rather under
ubuntu-stable/ubuntu-stable-jammy.git. It also seems the repo has been
relocated:

git clone git://kernel.ubuntu.com/ubuntu-stable/ubuntu-stable-jammy.git
Cloning into 'ubuntu-stable-jammy'...
fatal: remote error: **REPOSITORY RELOCATED**  Updated URL:
https://git.launchpad.net/~ubuntu-kernel-stable/+git/jammy Local path:
/ubuntu-stable/ubuntu-stable-jammy.git


Cloning the new URL worked for me. Then we need to check out the tag that
corresponds to the running kernel (uname -r), apply the patch, and build
the kernel with the running config (in /boot/config-$(uname -r)), enabling
the QAT driver in the config if it is not already enabled.
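
The steps above can be sketched roughly as follows. The two helper
functions are hypothetical names. The tag format (Ubuntu-<version>) is an
assumption based on how the main Ubuntu kernel trees are usually tagged and
may differ for HWE or other kernel variants, so check "git tag" before
relying on it. The version string in the comment is only an example, and
qat.patch is a placeholder file name. The build commands in the trailing
comment are the ones I recall from the wiki page and are not tested here.

```shell
# Derive the git tag from the contents of /proc/version_signature, e.g.
# "Ubuntu 5.15.0-78.85-generic 5.15.99" -> "Ubuntu-5.15.0-78.85".
ubuntu_kernel_tag() {
    # Word-split the signature: $1 becomes the distro, $2 the version.
    set -- $1
    # Drop the flavour suffix (-generic, -generic-64k, ...).
    echo "$1-${2%%-[a-z]*}"
}

# Report how the C62x QAT driver is set in a kernel config file:
# prints "y", "m", or "unset".
qat_c62x_config() {
    line=$(grep -E '^CONFIG_CRYPTO_DEV_QAT_C62X=' "$1") || { echo unset; return; }
    echo "${line#*=}"
}

# Intended use on the server (not executed by this sketch):
#   git clone https://git.launchpad.net/~ubuntu-kernel-stable/+git/jammy
#   cd jammy
#   git checkout "$(ubuntu_kernel_tag "$(cat /proc/version_signature)")"
#   git apply --check qat.patch && git apply qat.patch
#   qat_c62x_config "/boot/config-$(uname -r)"
#   LANG=C fakeroot debian/rules clean
#   LANG=C fakeroot debian/rules binary-headers binary-generic binary-perarch
```

If qat_c62x_config prints "unset" for the running config, the driver has to
be enabled in the config before building.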


> Then I install the packages needed per the ubuntu page, and then I can
> skip down to the "Building The Kernel" section? And then we're all set I
> think, and we just have to setup DTS and associated Jenkins pipelines.
>
> Do you want me to back anything up in advance of this? I don't know if
> that is needed or not, but Ampere is currently live doing testing for CI,
> so I want to act in a safe manner. I will try to address this first thing
> on Monday and get back to you.
>
>
The backup should not be needed, at least in principle, as we can always
reinstall the original kernel packages.
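
A minimal sketch of that fallback, assuming the stock packages follow the
usual linux-image-<release> and linux-modules-<release> naming, where
<release> is the output of "uname -r" on the original kernel. The function
name is hypothetical, and it only prints the command so it can be reviewed
before anything is run.

```shell
# Sketch: print the apt command that would reinstall the stock kernel
# packages for a given release string (as reported by `uname -r`).
stock_kernel_reinstall_cmd() {
    rel="$1"
    echo "apt-get install --reinstall linux-image-$rel linux-modules-$rel"
}
```

For example, "stock_kernel_reinstall_cmd "$(uname -r)"" run before the
rebuild records the exact command to use later.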


>
>
> On Tue, Aug 8, 2023 at 3:11 AM Ruifeng Wang <Ruifeng.Wang@arm.com> wrote:
>
>> *From:* Juraj Linkeš <juraj.linkes@pantheon.tech>
>> *Sent:* Tuesday, August 8, 2023 3:07 PM
>> *To:* Ruifeng Wang <Ruifeng.Wang@arm.com>
>> *Cc:* Patrick Robb <probb@iol.unh.edu>; Honnappa Nagarahalli <
>> Honnappa.Nagarahalli@arm.com>; Dharmik Jayesh Thakkar <
>> DharmikJayesh.Thakkar@arm.com>; ci@dpdk.org; nd <nd@arm.com>
>> *Subject:* Re: Intel QAT 8970 accel card on ARM Ampere Server
>>
>>
>>
>> We've talked about this some more and the best way to move forward is to
>> rebuild the ubuntu kernel. It should be fairly straightforward according to their
>> wiki page <https://wiki.ubuntu.com/Kernel/BuildYourOwnKernel>. The page
>> mentions a fairly old release (19.04), but was updated a year ago so the
>> instructions are likely still valid.
>>
>>
>>
>> However, I don't have the link to the kernel patch that Honnappa
>> mentioned. @Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com> @Ruifeng
>> Wang <Ruifeng.Wang@arm.com>, can you please provide a reference for the
>> patch?
>>
>> *[Ruifeng]* Here is the kernel patch set:
>> https://lkml.org/lkml/2022/6/17/328
>>
>>
>>
>> Since the patch is small, there shouldn't be problems with applying it.
>> Let us know whether this is doable.
>>
>>
>>
>> Regards,
>>
>> Juraj
>>
>>
>>
>> On Fri, Aug 4, 2023 at 11:48 AM Ruifeng Wang <Ruifeng.Wang@arm.com>
>> wrote:
>>
>> Hi Patrick,
>>
>>
>>
>> Thanks for reaching out and my apologies for delayed response.
>>
>>
>>
>> We noticed that some information is missing regarding using QAT with DPDK
>> on Arm.
>>
>> The DPDK document will be updated to include the missing part.
>>
>> Will get back on this later.
>>
>>
>>
>> Best regards,
>>
>> Ruifeng
>>
>>
>>
>> *From:* Patrick Robb <probb@iol.unh.edu>
>> *Sent:* Tuesday, August 1, 2023 1:14 AM
>> *To:* Ruifeng Wang <Ruifeng.Wang@arm.com>; Honnappa Nagarahalli <
>> Honnappa.Nagarahalli@arm.com>; Juraj Linkeš <juraj.linkes@pantheon.tech>
>> *Cc:* Dharmik Jayesh Thakkar <DharmikJayesh.Thakkar@arm.com>; ci@dpdk.org
>> *Subject:* Intel QAT 8970 accel card on ARM Ampere Server
>>
>>
>>
>> Hi Ruifeng, Honnappa, Juraj,
>>
>>
>>
>> The Intel QAT 8970 accelerator card has arrived to the Community Lab, and
>> we've installed it on the Ampere server. Presumably, we should test both
>> crypto and compress operations (and their respective performance metrics).
>> To that end, there are also DTS testsuites for testing QAT crypto/compress
>> functions. These testsuites make use of the crypto perf dpdk app and the
>> compress perf dpdk app. If you want, you can setup the DTS stuff yourself,
>> both on the system side, and the Jenkins side (you are allowed to submit
>> PRs on our gitlab now), but we can also do that on the lab side as we
>> probably have more experience. I do, however, have a question about the QAT
>> kernel driver and corresponding PMDs.
>>
>>
>>
>> compress suite:
>> https://git.dpdk.org/tools/dts/tree/test_plans/compressdev_qat_pmd_test_plan.rst
>>
>> crypto suite:
>> https://git.dpdk.org/tools/dts/tree/test_plans/crypto_perf_cryptodev_perf_test_plan.rst
>>
>>
>>
>> For reference, the DPDK docs page explaining QAT driver capabilities and
>> building the QAT PMDs (crypto sym, crypto asym, and compress) is here:
>> https://doc.dpdk.org/guides/cryptodevs/qat.html#building-qat
>>
>> Some notes before I get to my main question:
>>
>> -The 8970 is a C62x device
>>
>> -OpenSSL (arm requires it for QAT) is installed
>>
>> -3 PFs are visible from lspci (expected)
>>
>> -SRIOV is enabled
>>
>>
>>
>> However, although the system is on a valid kernel version for the QAT
>> driver, the kernel module for QAT is not loaded, so in trying to set up
>> testing, I am unable to create the 16 VFs for the 3 PFs respectively, like
>> the example below:
>>
>>
>>
>> echo 16 > /sys/bus/pci/drivers/c6xx/(pci address)/sriov_numvfs
>> echo 16 > /sys/bus/pci/drivers/c6xx/(pci address)/sriov_numvfs
>> echo 16 > /sys/bus/pci/drivers/c6xx/(pci address)/sriov_numvfs
>>
>>
>>
>> There is also an option to download the firmware from the kernel firmware
>> repo and copy the qat binaries to /lib/firmware and start the qat modules
>> from there. I wasn't able to resolve the situation with this method, but it
>> also could have been user error on my part.
>>
>>
>>
>> There is an option to install using the IDZ QAT Driver
>> <https://www.intel.com/content/www/us/en/developer/topic-technology/open/quick-assist-technology/overview.html>,
>> but it should not be required given the kernel version the Ampere server is
>> on, and I don't want to go down the road of relying on this "fall back"
>> method without consulting you first. Is it possible that there is anything
>> specific to running a QAT device on ARM specifically which I am missing
>> here? The DTS testsuite testplans actually seem to recommend going down
>> this road in general, but the DPDK docs say to use the kernel driver, so I
>> don't know.
>>
>>
>>
>> In any case, one of you should be able to login to the Ampere server in
>> situations like this, or just in general. Ruifeng/Juraj I see you both have
>> accounts on our IdM system, so you should have access. Please let me know
>> if you need renewed vpn cert configs and I will send you one. If you do
>> login, know this system could be running CI testing at any time. I can
>> always schedule time for it to be offline and available for maintenance if
>> you want to do anything which could be disruptive to testing.
>>
>>
>>
>> I also CC'd Dharmik on this as I see he sent an email regarding QAT
>> support on aarch64 in June.
>>
>>
>>
>> Let me know if you have any thoughts on the QAT kernel driver part.
>>
>>
>>
>> Thanks,
>>
>> Patrick
>>
>>
>
> --
>
> Patrick Robb
>
> Technical Service Manager
>
> UNH InterOperability Laboratory
>
> 21 Madbury Rd, Suite 100, Durham, NH 03824
>
> www.iol.unh.edu
>
>
>

[-- Attachment #2: Type: text/html, Size: 16344 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-08-21  8:45         ` Juraj Linkeš
@ 2023-08-30  0:05           ` Patrick Robb
  2023-09-01 21:30           ` Patrick Robb
  1 sibling, 0 replies; 31+ messages in thread
From: Patrick Robb @ 2023-08-30  0:05 UTC (permalink / raw)
  To: Juraj Linkeš
  Cc: Ruifeng Wang, Honnappa Nagarahalli, Dharmik Jayesh Thakkar, ci, nd

[-- Attachment #1: Type: text/plain, Size: 1137 bytes --]

Hi Juraj,

Thanks for the guidance. The kernel version in use on the Ampere server
currently is 5.4.0-155-generic. The tags for the 22.04 repo you suggested
only include 5.15 kernel versions, so I can't checkout to the
currently running kernel per your recommendation. I figure the idea behind
checking out to the kernel currently running is to maintain the current
state as much as possible, and ensure the currently running kernel config
could be re-used. If I instead clone the 20.04/focal repo I can checkout to
5.4.0-155, but the diffs you shared do not cleanly apply.

On the other hand, from looking directly at the files on Jammy/5.15[1], it
looks like the qat diffs (https://lkml.org/lkml/2022/6/17/328) have already
reached that kernel version. If indeed applying these diffs is not needed
in this case, is there any reason why I shouldn't just re-build the kernel
from here? I don't want to do this (and necessarily advance the kernel
version from 5.4 to 5.15) without asking you since I don't know what the
negative implications of this action may be, if any.

[1] https://git.launchpad.net/~ubuntu-kernel-stable/+git/jammy/

[-- Attachment #2: Type: text/html, Size: 1424 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-08-21  8:45         ` Juraj Linkeš
  2023-08-30  0:05           ` Patrick Robb
@ 2023-09-01 21:30           ` Patrick Robb
  2023-09-11  8:13             ` Juraj Linkeš
  1 sibling, 1 reply; 31+ messages in thread
From: Patrick Robb @ 2023-09-01 21:30 UTC (permalink / raw)
  To: Juraj Linkeš; +Cc: Ruifeng Wang, Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 747 bytes --]

Thanks Juraj,

I did bring the system to 22.04 based on our conversation from yesterday.
After checking out the new current kernel (5.15.0-82-generic), the diffs
apply cleanly, removing the x86 dependency from the QAT kernel drivers, and
the kernel then builds with the QAT driver enabled. It looks like that all
worked fine.

I didn't actually install and reboot with the custom kernel today because I
don't want to do that with a production server right before the weekend,
particularly with USA having a holiday on Monday. I will reboot with the
custom kernel on Tuesday morning though, and then hopefully the
compress/crypto testsuites on QAT will be unblocked. Thanks, the guidance
is greatly appreciated.

Best,
Patrick

[-- Attachment #2: Type: text/html, Size: 890 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-09-01 21:30           ` Patrick Robb
@ 2023-09-11  8:13             ` Juraj Linkeš
  2023-09-20 18:28               ` Patrick Robb
  0 siblings, 1 reply; 31+ messages in thread
From: Juraj Linkeš @ 2023-09-11  8:13 UTC (permalink / raw)
  To: Patrick Robb; +Cc: Ruifeng Wang, Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 968 bytes --]

Hi Patrick,

This is good news. How does the server fare after the restart?

Juraj

On Fri, Sep 1, 2023 at 11:30 PM Patrick Robb <probb@iol.unh.edu> wrote:

> Thanks Juraj,
>
> I did bring the system to 22.04 based on our conversation from yesterday.
> From there and from checking out to the new current kernel
> (5.15.0-82-generic) yes the diffs cleanly apply, removing the x86
> dependency on the QAT kernel drivers, and then you can make the kernel,
> enabling the QAT driver. It looks like that all worked fine.
>
> I didn't actually install and reboot with the custom kernel today because
> I don't want to do that with a production server right before the weekend,
> particularly with USA having a holiday on Monday. I will reboot with the
> custom kernel on Tuesday morning though, and then hopefully the
> compress/crypto testsuites on QAT will be unblocked. Thanks, the guidance
> is greatly appreciated.
>
> Best,
> Patrick
>
>

[-- Attachment #2: Type: text/html, Size: 1374 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-09-11  8:13             ` Juraj Linkeš
@ 2023-09-20 18:28               ` Patrick Robb
  2023-09-25 15:19                 ` Ruifeng Wang
  0 siblings, 1 reply; 31+ messages in thread
From: Patrick Robb @ 2023-09-20 18:28 UTC (permalink / raw)
  To: Juraj Linkeš; +Cc: Ruifeng Wang, Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 2824 bytes --]

Hi Juraj,

Sorry for the late reply. So, yes I applied those diffs and set the QAT
modules to =y in the .config file when building the custom kernel. It
appears to have worked correctly. The qat module is now built into the new
kernel running on the ampere server (called 5.15.82+). You can see it
listed on modules.builtin and from modinfo.

probb@arm-ampere-dut:~$ modinfo qat_c62x
name:           qat_c62x
filename:       (builtin)
version:        0.6.0
description:    Intel(R) QuickAssist Technology
firmware:       qat_c62x_mmp.bin
firmware:       qat_c62x.bin
author:         Intel
license:        Dual BSD/GPL
file:           drivers/crypto/qat/qat_c62x/qat_c62x

The /sys/bus/pci/drivers/c6xx/(pci address)/sriov_numvfs path now exists, so
that's good. However, when I tried to create the VFs for the 3 PFs on the
card, the first write returned a segmentation fault, and subsequent attempts
hang. For example:

root@arm-ampere-dut:~# lspci -d:37c8
0000:03:00.0 Co-processor: Intel Corporation C62x Chipset QuickAssist Technology (rev 04)
0000:04:00.0 Co-processor: Intel Corporation C62x Chipset QuickAssist Technology (rev 04)
0000:05:00.0 Co-processor: Intel Corporation C62x Chipset QuickAssist Technology (rev 04)
root@arm-ampere-dut:~# echo 16 > /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs

The sriov_numvfs file should be writable by root, so I'm a bit perplexed.
Could it matter that I built the qat_c62x module statically into the kernel
rather than as a loadable module? Which do you do?
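
One check that can be made before writing to sriov_numvfs is how many VFs the PF advertises and how many are already enabled. The following is a minimal sketch rather than anything taken from this setup: the function name plan_numvfs is hypothetical, and it only reads the standard sriov_totalvfs and sriov_numvfs attributes and prints what it would do. It does not by itself explain the segfault or the hang described above.

```shell
#!/bin/sh
# Hypothetical helper: decide what to write to sriov_numvfs for one PF,
# without writing anything.
#   $1: PF sysfs directory (e.g. /sys/bus/pci/devices/0000:03:00.0)
#   $2: requested number of VFs
# Prints "noop", "write N", "reset-then-write N", or "error: ...".
plan_numvfs() {
    pf_dir=$1
    want=$2
    if [ ! -r "$pf_dir/sriov_totalvfs" ] || [ ! -r "$pf_dir/sriov_numvfs" ]; then
        echo "error: no SR-IOV attributes under $pf_dir"
        return 1
    fi
    total=$(cat "$pf_dir/sriov_totalvfs")
    current=$(cat "$pf_dir/sriov_numvfs")
    if [ "$want" -gt "$total" ]; then
        echo "error: requested $want VFs but device supports $total"
        return 1
    fi
    if [ "$current" -eq "$want" ]; then
        echo "noop"
    elif [ "$current" -eq 0 ] || [ "$want" -eq 0 ]; then
        echo "write $want"
    else
        # To my understanding the kernel rejects changing one non-zero
        # VF count to another; the count has to go back to 0 first.
        echo "reset-then-write $want"
    fi
}
```

For example, `plan_numvfs /sys/bus/pci/devices/0000:03:00.0 16` could be run for each of the three PFs before attempting the write, with the kernel log checked afterwards for messages from the c6xx driver.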

On Mon, Sep 11, 2023 at 4:13 AM Juraj Linkeš <juraj.linkes@pantheon.tech>
wrote:

> Hi Patrick,
>
> This is good news. How does the server fare after the restart?
>
> Juraj
>
> On Fri, Sep 1, 2023 at 11:30 PM Patrick Robb <probb@iol.unh.edu> wrote:
>
>> Thanks Juraj,
>>
>> I did bring the system to 22.04 based on our conversation from yesterday.
>> From there and from checking out to the new current kernel
>> (5.15.0-82-generic) yes the diffs cleanly apply, removing the x86
>> dependency on the QAT kernel drivers, and then you can make the kernel,
>> enabling the QAT driver. It looks like that all worked fine.
>>
>> I didn't actually install and reboot with the custom kernel today because
>> I don't want to do that with a production server right before the weekend,
>> particularly with USA having a holiday on Monday. I will reboot with the
>> custom kernel on Tuesday morning though, and then hopefully the
>> compress/crypto testsuites on QAT will be unblocked. Thanks, the guidance
>> is greatly appreciated.
>>
>> Best,
>> Patrick
>>
>>

-- 

Patrick Robb

Technical Service Manager

UNH InterOperability Laboratory

21 Madbury Rd, Suite 100, Durham, NH 03824

www.iol.unh.edu

[-- Attachment #2: Type: text/html, Size: 5530 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: Intel QAT 8970 accel card on ARM Ampere Server
  2023-09-20 18:28               ` Patrick Robb
@ 2023-09-25 15:19                 ` Ruifeng Wang
  2023-10-09 16:34                   ` Patrick Robb
  0 siblings, 1 reply; 31+ messages in thread
From: Ruifeng Wang @ 2023-09-25 15:19 UTC (permalink / raw)
  To: Patrick Robb, Juraj Linkeš, Dharmik Jayesh Thakkar
  Cc: Honnappa Nagarahalli, ci, nd, nd

[-- Attachment #1: Type: text/plain, Size: 3413 bytes --]

+Dharmik

Hi Dharmik,

Do you see a similar problem on your machine?

Thanks,
Ruifeng

From: Patrick Robb <probb@iol.unh.edu>
Sent: Thursday, September 21, 2023 2:28 AM
To: Juraj Linkeš <juraj.linkes@pantheon.tech>
Cc: Ruifeng Wang <Ruifeng.Wang@arm.com>; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; ci@dpdk.org; nd <nd@arm.com>
Subject: Re: Intel QAT 8970 accel card on ARM Ampere Server

Hi Juraj,

Sorry for the late reply. So, yes I applied those diffs and set the QAT modules to =y in the .config file when building the custom kernel. It appears to have worked correctly. The qat module is now built into the new kernel running on the ampere server (called 5.15.82+). You can see it listed on modules.builtin and from modinfo.

probb@arm-ampere-dut:~$ modinfo qat_c62x
name:           qat_c62x
filename:       (builtin)
version:        0.6.0
description:    Intel(R) QuickAssist Technology
firmware:       qat_c62x_mmp.bin
firmware:       qat_c62x.bin
author:         Intel
license:        Dual BSD/GPL
file:           drivers/crypto/qat/qat_c62x/qat_c62x

And there is a /sys/bus/pci/drivers/c6xx/(pci address)/sriov_numvfs path now so that's good. However, when trying to create the VFs for the 3 PFs on the card, a segmentation fault was returned the first time, and on subsequent tries it hangs now. So like:

root@arm-ampere-dut:~# lspci -d:37c8
0000:03:00.0 Co-processor: Intel Corporation C62x Chipset QuickAssist Technology (rev 04)
0000:04:00.0 Co-processor: Intel Corporation C62x Chipset QuickAssist Technology (rev 04)
0000:05:00.0 Co-processor: Intel Corporation C62x Chipset QuickAssist Technology (rev 04)
root@arm-ampere-dut:~# echo 16 > /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs

The sriov_numvfs file should be writable from root so I'm a bit perplexed. I am wondering whether it is relevant to statically build in the qat_c62x module with the kernel, vs having it be a loadable driver? What do you do?

On Mon, Sep 11, 2023 at 4:13 AM Juraj Linkeš <juraj.linkes@pantheon.tech<mailto:juraj.linkes@pantheon.tech>> wrote:
Hi Patrick,

This is good news. How does the server fare after the restart?

Juraj

On Fri, Sep 1, 2023 at 11:30 PM Patrick Robb <probb@iol.unh.edu<mailto:probb@iol.unh.edu>> wrote:
Thanks Juraj,

I did bring the system to 22.04 based on our conversation from yesterday. From there and from checking out to the new current kernel (5.15.0-82-generic) yes the diffs cleanly apply, removing the x86 dependency on the QAT kernel drivers, and then you can make the kernel, enabling the QAT driver. It looks like that all worked fine.

I didn't actually install and reboot with the custom kernel today because I don't want to do that with a production server right before the weekend, particularly with USA having a holiday on Monday. I will reboot with the custom kernel on Tuesday morning though, and then hopefully the compress/crypto testsuites on QAT will be unblocked. Thanks, the guidance is greatly appreciated.

Best,
Patrick



--

Patrick Robb

Technical Service Manager

UNH InterOperability Laboratory

21 Madbury Rd, Suite 100, Durham, NH 03824

www.iol.unh.edu<http://www.iol.unh.edu/>




[-- Attachment #2: Type: text/html, Size: 9634 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-09-25 15:19                 ` Ruifeng Wang
@ 2023-10-09 16:34                   ` Patrick Robb
  2023-10-10  2:28                     ` Patrick Robb
  0 siblings, 1 reply; 31+ messages in thread
From: Patrick Robb @ 2023-10-09 16:34 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: Juraj Linkeš, Dharmik Jayesh Thakkar, Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 327 bytes --]

Hi Ruifeng, Dharmik,

I'm just bumping this so we can come up with a plan to go forward.

Again, I am wondering whether you built your custom kernel with the
qat_c62x driver statically built in (like I did) or as a loadable module. I
think that's one of the few ways our test beds could differ.

Best,
Patrick

[-- Attachment #2: Type: text/html, Size: 462 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-10-09 16:34                   ` Patrick Robb
@ 2023-10-10  2:28                     ` Patrick Robb
  2023-10-10  3:55                       ` Dharmik Jayesh Thakkar
  0 siblings, 1 reply; 31+ messages in thread
From: Patrick Robb @ 2023-10-10  2:28 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: Juraj Linkeš, Dharmik Jayesh Thakkar, Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 465 bytes --]

Also I am just now thinking I probably should have provided dpdk-devbind.py
output:

probb@arm-ampere-dut:/tmp/dpdk/usertools$ dpdk-devbind.py --status
Crypto devices using kernel driver
==================================
0000:03:00.0 'C62x Chipset QuickAssist Technology 37c8' drv=c6xx unused=vfio-pci
0000:04:00.0 'C62x Chipset QuickAssist Technology 37c8' drv=c6xx unused=vfio-pci
0000:05:00.0 'C62x Chipset QuickAssist Technology 37c8' drv=c6xx unused=vfio-pci

[-- Attachment #2: Type: text/html, Size: 551 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: Intel QAT 8970 accel card on ARM Ampere Server
  2023-10-10  2:28                     ` Patrick Robb
@ 2023-10-10  3:55                       ` Dharmik Jayesh Thakkar
  2023-10-10  7:25                         ` David Marchand
  2023-10-10 15:59                         ` Patrick Robb
  0 siblings, 2 replies; 31+ messages in thread
From: Dharmik Jayesh Thakkar @ 2023-10-10  3:55 UTC (permalink / raw)
  To: Patrick Robb, Ruifeng Wang
  Cc: Juraj Linkeš, Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 2963 bytes --]

Hi Patrick,

Can you provide the grub settings? Is iommu.passthrough=1 included?

Also, is qat_c62xvf loaded as well?

Finally, a few guidelines on the vfio driver:
At times, we need to configure the vfio driver.
On kernel vers. 5.9+ we need to load the vfio-pci driver with the additional parameter disable_denylist=1
Unload the vfio-pci driver if it is already loaded so that we can reload it with the correct parameters :
sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio
If you can't unload the vfio driver because it's been built into the kernel, you'll have to find another way to change VFIO parameters, or to rebuild your kernel with VFIO_PCI set as a module. Failing to do that, you might encounter issues later on when you try to bind the VFs to VFIO.

Load the vfio-pci driver and bind it to QAT VFs device ids:
sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9

Enable no-iommu-mode:
echo "1" | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
/sys/module/vfio/parameters is missing?
If /sys/module/vfio/parameters does not exist, your kernel may have been built without the VFIO_NOIOMMU config option.

Automatically set VFIO params on boot
It's possible to set these parameters automatically on boot by creating a /etc/modprobe.d/vfio-pci.conf file with the parameters :
cat /etc/modprobe.d/vfio-pci.conf
options vfio enable_unsafe_noiommu_mode=1
options vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9
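
To confirm that the settings above actually took effect after a reload or reboot, the live parameter values can be read back from sysfs. This is a minimal sketch under some assumptions: module_param is a hypothetical helper name, and it assumes the parameters are exposed under /sys/module/<module>/parameters (with the module directory spelled with an underscore, vfio_pci), which may differ between kernel versions.

```shell
#!/bin/sh
# Hypothetical helper: print the live value of a module parameter,
# or "unavailable" if sysfs does not expose it.
#   $1: module directory name as it appears under /sys/module (e.g. vfio_pci)
#   $2: parameter name (e.g. disable_denylist)
#   $3: optional sysfs module root, defaults to /sys/module
module_param() {
    root=${3:-/sys/module}
    param_file="$root/$1/parameters/$2"
    if [ -r "$param_file" ]; then
        cat "$param_file"
    else
        echo "unavailable"
    fi
}
```

A quick check would then be `module_param vfio_pci disable_denylist` and `module_param vfio enable_unsafe_noiommu_mode`; a result of "unavailable" for the second one is consistent with the missing /sys/module/vfio/parameters case described above.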

We haven’t encountered this issue in the past, so I'm just making sure the configuration is correct. I don’t think building the driver statically versus as a loadable module should make a difference, but I will try a static build on my setup.

Thank you!


From: Patrick Robb <probb@iol.unh.edu>
Sent: Monday, October 9, 2023 9:29 PM
To: Ruifeng Wang <Ruifeng.Wang@arm.com>
Cc: Juraj Linkeš <juraj.linkes@pantheon.tech>; Dharmik Jayesh Thakkar <DharmikJayesh.Thakkar@arm.com>; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; ci@dpdk.org; nd <nd@arm.com>
Subject: Re: Intel QAT 8970 accel card on ARM Ampere Server

Also I am just now thinking I probably should have provided dpdk-devbind.py output:

probb@arm-ampere-dut:/tmp/dpdk/usertools$ dpdk-devbind.py --status
Crypto devices using kernel driver
==================================
0000:03:00.0 'C62x Chipset QuickAssist Technology 37c8' drv=c6xx unused=vfio-pci
0000:04:00.0 'C62x Chipset QuickAssist Technology 37c8' drv=c6xx unused=vfio-pci
0000:05:00.0 'C62x Chipset QuickAssist Technology 37c8' drv=c6xx unused=vfio-pci
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

[-- Attachment #2: Type: text/html, Size: 5861 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-10-10  3:55                       ` Dharmik Jayesh Thakkar
@ 2023-10-10  7:25                         ` David Marchand
  2023-10-10 15:03                           ` Dharmik Jayesh Thakkar
  2023-10-10 15:59                         ` Patrick Robb
  1 sibling, 1 reply; 31+ messages in thread
From: David Marchand @ 2023-10-10  7:25 UTC (permalink / raw)
  To: Dharmik Jayesh Thakkar
  Cc: Patrick Robb, Ruifeng Wang, Juraj Linkeš,
	Honnappa Nagarahalli, ci, nd, Thomas Monjalon, Maxime Coquelin

Hello,

On Tue, Oct 10, 2023 at 5:56 AM Dharmik Jayesh Thakkar
<DharmikJayesh.Thakkar@arm.com> wrote:
>
> Hi Patrick,
>
> Can you provide the grub settings? Is iommu.passthrough=1 included?
>
>
>
> Also, is qat_c62xvf loaded as well?
>
>
>
> Finally, a few guidelines on the vfio driver:
>
> At times, we need to configure the vfio driver.
>
> On kernel vers. 5.9+ we need to load the vfio-pci driver with the additional parameter disable_denylist=1

o_O
I did not know this option, but it scares me a bit, reading its description.
Could you please elaborate why this is needed?


>
> Unload the vfio-pci driver if it is already loaded so that we can reload it with the correct parameters :
> sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio
>
> If you can't unload the vfio driver because it's been built into the kernel, you'll have to find another way to change VFIO parameters, or to rebuild your kernel with VFIO_PCI set as a module. Failing to do that, you might encounter issues later on when you try to bind the VFs to VFIO.
>
> Load the vfio-pci driver and bind it to QAT VFs device ids:
> sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9
>
> Enable no-iommu-mode:
> echo "1" | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
>
>  /sys/module/vfio/parameter is missing ?
>
> If /sys/module/vfio/parameters does not exist, you might be missing the kernel module VFIO_NOIOMMU
>
>
>
> Automatically set VFIO params on boot
>
> It's possible to set these parameters automatically on boot by creating a /etc/modprobe.d/vfio-pci.conf file with the parameters :
> cat /etc/modprobe.d/vfio-pci.conf
> options vfio enable_unsafe_noiommu_mode=1
> options vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9
>
>
>
> We haven’t encountered this issue in the past, so just making sure the configuration is correct. I don’t think having the driver static/loadable should make a difference, I will try with building statically on my setup.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: Intel QAT 8970 accel card on ARM Ampere Server
  2023-10-10  7:25                         ` David Marchand
@ 2023-10-10 15:03                           ` Dharmik Jayesh Thakkar
  2023-10-10 15:12                             ` David Marchand
  0 siblings, 1 reply; 31+ messages in thread
From: Dharmik Jayesh Thakkar @ 2023-10-10 15:03 UTC (permalink / raw)
  To: David Marchand
  Cc: Patrick Robb, Ruifeng Wang, Juraj Linkeš,
	Honnappa Nagarahalli, ci, nd, thomas, Maxime Coquelin



> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Tuesday, October 10, 2023 2:26 AM
> To: Dharmik Jayesh Thakkar <DharmikJayesh.Thakkar@arm.com>
> Cc: Patrick Robb <probb@iol.unh.edu>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; Juraj Linkeš <juraj.linkes@pantheon.tech>;
> Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; ci@dpdk.org; nd
> <nd@arm.com>; thomas@monjalon.net; Maxime Coquelin
> <maxime.coquelin@redhat.com>
> Subject: Re: Intel QAT 8970 accel card on ARM Ampere Server
>
> Hello,
>
> On Tue, Oct 10, 2023 at 5:56 AM Dharmik Jayesh Thakkar
> <DharmikJayesh.Thakkar@arm.com> wrote:
> >
> > Hi Patrick,
> >
> > Can you provide the grub settings? Is iommu.passthrough=1 included?
> >
> >
> >
> > Also, is qat_c62xvf loaded as well?
> >
> >
> >
> > Finally, a few guidelines on the vfio driver:
> >
> > At times, we need to configure the vfio driver.
> >
> > On kernel vers. 5.9+ we need to load the vfio-pci driver with the
> > additional parameter disable_denylist=1
>
> o_O
> I did not know this option, but it scares me a bit, reading its description.
> Could you please elaborate why this is needed?
>
>

Details for adding QAT to denylist provided in the below commit:
https://github.com/torvalds/linux/commit/50173329c8cc0c892eaa7a9d0f0692ac39cd7b04

> >
> > Unload the vfio-pci driver if it is already loaded so that we can reload it with
> the correct parameters :
> > sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo
> > modprobe -r vfio_virqfd; sudo modprobe -r vfio
> >
> > If you can't unload the vfio driver because it's been built into the kernel,
> you'll have to find another way to change VFIO parameters, or to rebuild your
> kernel with VFIO_PCI set as a module. Failing to do that, you might encounter
> issues later on when you try to bind the VFs to VFIO.
> >
> > Load the vfio-pci driver and bind it to QAT VFs device ids:
> > sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1
> > vfio-pci.ids=8086:37c9
> >
> > Enable no-iommu-mode:
> > echo "1" | sudo tee
> > /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
> >
> >  /sys/module/vfio/parameter is missing ?
> >
> > If /sys/module/vfio/parameters does not exist, you might be missing
> > the kernel module VFIO_NOIOMMU
> >
> >
> >
> > Automatically set VFIO params on boot
> >
> > It's possible to set these parameters automatically on boot by creating a
> /etc/modprobe.d/vfio-pci.conf file with the parameters :
> > cat /etc/modprobe.d/vfio-pci.conf
> > options vfio enable_unsafe_noiommu_mode=1
> > options vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9
> >
> >
> >
> > We haven’t encountered this issue in the past, so just making sure the
> configuration is correct. I don’t think having the driver static/loadable should
> make a difference, I will try with building statically on my setup.
>
>
> --
> David Marchand


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-10-10 15:03                           ` Dharmik Jayesh Thakkar
@ 2023-10-10 15:12                             ` David Marchand
  0 siblings, 0 replies; 31+ messages in thread
From: David Marchand @ 2023-10-10 15:12 UTC (permalink / raw)
  To: Dharmik Jayesh Thakkar, Patrick Robb
  Cc: Ruifeng Wang, Juraj Linkeš,
	Honnappa Nagarahalli, ci, nd, thomas, Maxime Coquelin

On Tue, Oct 10, 2023 at 5:03 PM Dharmik Jayesh Thakkar
<DharmikJayesh.Thakkar@arm.com> wrote:
> > > On kernel vers. 5.9+ we need to load the vfio-pci driver with the
> > > additional parameter disable_denylist=1
> >
> > o_O
> > I did not know this option, but it scares me a bit, reading its description.
> > Could you please elaborate why this is needed?
> >
> >
>
> Details for adding QAT to denylist provided in the below commit:
> https://github.com/torvalds/linux/commit/50173329c8cc0c892eaa7a9d0f0692ac39cd7b04

Dharmik,
Ok, thanks.
That matches what I found in the qat documentation:
http://doc.dpdk.org/guides/cryptodevs/qat.html#binding-the-available-vfs-to-the-vfio-pci-driver


Patrick,
Sorry for jumping in this thread, but to be clear, this
disable_denylist option is really specific to this model of
quickassist crypto card.
It must not be enabled in other setups using vfio.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-10-10  3:55                       ` Dharmik Jayesh Thakkar
  2023-10-10  7:25                         ` David Marchand
@ 2023-10-10 15:59                         ` Patrick Robb
  2023-10-10 21:50                           ` Dharmik Jayesh Thakkar
                                             ` (2 more replies)
  1 sibling, 3 replies; 31+ messages in thread
From: Patrick Robb @ 2023-10-10 15:59 UTC (permalink / raw)
  To: Dharmik Jayesh Thakkar, David Marchand
  Cc: Ruifeng Wang, Juraj Linkeš, Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 3710 bytes --]

On Mon, Oct 9, 2023 at 11:56 PM Dharmik Jayesh Thakkar <
DharmikJayesh.Thakkar@arm.com> wrote:

> Hi Patrick,
>
>
>
> Can you provide the grub settings? Is iommu.passthrough=1 included?
>

Sure. I'm not sure if you just wanted the kernel cmdline options or the
whole grub config, but I assume you just meant kernel cmdline. Let me know
if you meant more.

GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=1G hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=39-79 nohz_full=39-79 rcu_nocbs=39-79 processor.max_cstate=1 intel_pstate=disable console=ttyS0,115200 console=tty0"

But, iommu.passthrough=1 is not included, so I can add that if we need to.
Do you know that this won't have any bad implications for the (intel,
nvidia, broadcom) NICs which we test on this server?
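
Since the question is specifically about iommu.passthrough=1, which is a different token from the iommu=pt already present, it may help to check the running kernel's command line directly after any grub change. The following is a minimal sketch; cmdline_get is a hypothetical helper name, and it reads /proc/cmdline unless a command-line string is passed as the second argument.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a kernel command-line parameter.
#   $1: parameter name, matched exactly (e.g. iommu.passthrough)
#   $2: optional command-line string, defaults to the contents of /proc/cmdline
# Returns 0 and prints the value of the first occurrence (empty for a bare
# flag), or returns 1 if the parameter is absent.
cmdline_get() {
    key=$1
    line=${2:-$(cat /proc/cmdline)}
    for tok in $line; do
        case $tok in
            "$key"=*) echo "${tok#*=}"; return 0 ;;
            "$key") echo ""; return 0 ;;
        esac
    done
    return 1
}
```

After a reboot, `cmdline_get iommu.passthrough || echo "not set"` shows whether the option reached the kernel. The exact-name match means a lookup for iommu does not pick up intel_iommu or iommu.passthrough.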


>
>
> Also, is qat_c62xvf loaded as well?
>
qat_c62xvf is built in to the kernel also.


>
>

>
> Finally, a few guidelines on the vfio driver:
>
> At times, we need to configure the vfio driver.
>
> On kernel vers. 5.9+ we need to load the vfio-pci driver with the
> additional parameter *disable_denylist=1*
>
> Unload the vfio-pci driver if it is already loaded so that we can reload
> it with the correct parameters :
> *sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo
> modprobe -r vfio_virqfd; sudo modprobe -r vfio*
>
> If you can't unload the vfio driver because it's been built into the
> kernel, you'll have to find another way to change VFIO parameters, or to
> rebuild your kernel with VFIO_PCI set as a module. Failing to do that, you
> might encounter issues later on when you try to bind the VFs to VFIO.
>
> Load the vfio-pci driver and bind it to QAT VFs device ids:
> *sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1
> vfio-pci.ids=8086:37c9*
>
> Enable no-iommu-mode:
> *echo "1" | sudo tee
> /sys/module/vfio/parameters/enable_unsafe_noiommu_mode*
>
>  /sys/module/vfio/parameter is missing ?
>
> If /sys/module/vfio/parameters does not exist, you might be missing the
> kernel module VFIO_NOIOMMU
>
>
>
> *Automatically set VFIO params on boot*
>
> It's possible to set these parameters automatically on boot by creating a
> */etc/modprobe.d/vfio-pci.conf *file with the parameters :
> *cat /etc/modprobe.d/vfio-pci.conf*
> *options vfio enable_unsafe_noiommu_mode=1*
> *options vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9*
>
>
>
> We haven’t encountered this issue in the past, so just making sure the
> configuration is correct. I don’t think having the driver static/loadable
> should make a difference, I will try with building statically on my setup.
>
>
>
> Thank you!
>
>
Okay, this should be fine. Like I said, we are also running tests on NICs
on this server. So, in our Jenkinsfiles scripts for running the testing, I
will add a preliminary step only for QAT tests which runs:
sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio
sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9
echo "1" | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
(then run QAT tests)

And if running on NICs, have a preliminary step which runs
sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio
sudo modprobe vfio
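
The two preliminary steps could be kept in one place so the QAT-only options cannot leak into the NIC runs. The following is a minimal sketch, assuming a hypothetical helper named vfio_prestep that is called with either qat or nic; it only prints the commands listed above (a dry run) so the output can be reviewed before anything is executed.

```shell
#!/bin/sh
# Hypothetical helper: print the vfio preparation commands for a test type.
#   $1: "qat" or "nic"
# Only prints commands; nothing is loaded or unloaded here.
vfio_prestep() {
    case $1 in
        qat|nic) ;;
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
    # Unload first so the reload picks up the parameters for this mode.
    for mod in vfio_iommu_type1 vfio_pci vfio_virqfd vfio; do
        echo "sudo modprobe -r $mod"
    done
    if [ "$1" = qat ]; then
        # QAT-specific options, copied from the commands in this thread.
        echo "sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9"
        echo "echo 1 | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode"
    else
        echo "sudo modprobe vfio"
    fi
}
```

Running `vfio_prestep qat` prints six commands and `vfio_prestep nic` prints five; once the output looks right, it could be piped to `sh` from the Jenkins step.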

David does this also sound reasonable to you, per your comment about
isolating this setting to QAT card testing?

Dharmik if this all sounds okay and you can confirm the iommu.passthrough
change is fine, I will proceed. Thank you for providing the assistance.
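
As a rough sketch only, the two preliminary steps above could be wrapped in one helper for the Jenkins scripts. The names run, vfio_unload and vfio_setup and the DRY_RUN variable are hypothetical; the modprobe and tee commands are copied verbatim from this message (whether modprobe accepts the vfio-pci.ids=... form is questioned later in the thread). By default the helper only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper: prints (DRY_RUN=1, the default) or runs (DRY_RUN=0)
# the VFIO reload steps quoted in this message.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

vfio_unload() {
    # Same module order as in the message; a module that is not loaded
    # must not abort the sequence.
    for mod in vfio_iommu_type1 vfio_pci vfio_virqfd vfio; do
        run sudo modprobe -r "$mod" || true
    done
}

vfio_setup() {
    case "$1" in
        qat)
            vfio_unload
            run sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9
            # The echo | tee step needs a pipe, so it goes through sh -c.
            run sh -c 'echo 1 | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode'
            ;;
        nic)
            vfio_unload
            run sudo modprobe vfio
            ;;
        *)
            echo "usage: vfio_setup qat|nic" >&2
            return 1
            ;;
    esac
}
```

Calling "vfio_setup qat" before the QAT suites and "vfio_setup nic" before the NIC suites would keep the unsafe no-IOMMU setting confined to QAT runs.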

[-- Attachment #2: Type: text/html, Size: 6210 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: Intel QAT 8970 accel card on ARM Ampere Server
  2023-10-10 15:59                         ` Patrick Robb
@ 2023-10-10 21:50                           ` Dharmik Jayesh Thakkar
  2023-10-11  8:14                           ` Juraj Linkeš
  2023-10-11 11:51                           ` David Marchand
  2 siblings, 0 replies; 31+ messages in thread
From: Dharmik Jayesh Thakkar @ 2023-10-10 21:50 UTC (permalink / raw)
  To: Patrick Robb, David Marchand
  Cc: Ruifeng Wang, Juraj Linkeš, Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 4469 bytes --]

Thank you for the details, Patrick!
Yes, could you please update the grub and vfio settings and see if it works? I don’t think it should have any implications for the other NICs.

From: Patrick Robb <probb@iol.unh.edu>
Sent: Tuesday, October 10, 2023 11:00 AM
To: Dharmik Jayesh Thakkar <DharmikJayesh.Thakkar@arm.com>; David Marchand <david.marchand@redhat.com>
Cc: Ruifeng Wang <Ruifeng.Wang@arm.com>; Juraj Linkeš <juraj.linkes@pantheon.tech>; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; ci@dpdk.org; nd <nd@arm.com>
Subject: Re: Intel QAT 8970 accel card on ARM Ampere Server



On Mon, Oct 9, 2023 at 11:56 PM Dharmik Jayesh Thakkar <DharmikJayesh.Thakkar@arm.com<mailto:DharmikJayesh.Thakkar@arm.com>> wrote:
Hi Patrick,

Can you provide the grub settings? Is iommu.passthrough=1 included?

Sure. I'm not sure if you just wanted the kernel cmdline options or the whole grub config, but I assume you just meant kernel cmdline. Let me know if you meant more.

GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=1G hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=39-79 nohz_full=39-79 rcu_nocbs=39-79 processor.max_cstate=1 intel_pstate=disable console=ttyS0,115200 console=tty0"

But, iommu.passthrough=1 is not included, so I can add that if we need to. Do you know that this won't have any bad implications for the (intel, nvidia, broadcom) NICs which we test on this server?


Also, is qat_c62xvf loaded as well?
qat_c62xvf is built in to the kernel also.



Finally, a few guidelines on the vfio driver:
At times, we need to configure the vfio driver.
On kernel vers. 5.9+ we need to load the vfio-pci driver with the additional parameter disable_denylist=1
Unload the vfio-pci driver if it is already loaded so that we can reload it with the correct parameters :
sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio
If you can't unload the vfio driver because it's been built into the kernel, you'll have to find another way to change VFIO parameters, or to rebuild your kernel with VFIO_PCI set as a module. Failing to do that, you might encounter issues later on when you try to bind the VFs to VFIO.

Load the vfio-pci driver and bind it to QAT VFs device ids:
sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9

Enable no-iommu-mode:
echo "1" | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
 /sys/module/vfio/parameter is missing ?
If /sys/module/vfio/parameters does not exist, you might be missing the kernel module VFIO_NOIOMMU

Automatically set VFIO params on boot
It's possible to set these parameters automatically on boot by creating a /etc/modprobe.d/vfio-pci.conf file with the parameters :
cat /etc/modprobe.d/vfio-pci.conf
options vfio enable_unsafe_noiommu_mode=1
options vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9

We haven’t encountered this issue in the past, so just making sure the configuration is correct. I don’t think having the driver static/loadable should make a difference, I will try with building statically on my setup.

Thank you!

Okay, this should be fine. Like I said, we are also running tests on NICs on this server. So, in our Jenkinsfiles scripts for running the testing, I will add a preliminary step only for QAT tests which runs:
sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio
sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9
echo "1" | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
(then run QAT tests)

And if running on NICs, have a preliminary step which runs
sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio
sudo modprobe vfio

David does this also sound reasonable to you, per your comment about isolating this setting to QAT card testing?

Dharmik if this all sounds okay and you can confirm the iommu.passthrough change is fine, I will proceed. Thank you for providing the assistance.

[-- Attachment #2: Type: text/html, Size: 10876 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-10-10 15:59                         ` Patrick Robb
  2023-10-10 21:50                           ` Dharmik Jayesh Thakkar
@ 2023-10-11  8:14                           ` Juraj Linkeš
  2023-10-11 20:13                             ` Patrick Robb
  2023-10-11 11:51                           ` David Marchand
  2 siblings, 1 reply; 31+ messages in thread
From: Juraj Linkeš @ 2023-10-11  8:14 UTC (permalink / raw)
  To: Patrick Robb
  Cc: Dharmik Jayesh Thakkar, David Marchand, Ruifeng Wang,
	Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 4518 bytes --]

On Tue, Oct 10, 2023 at 5:59 PM Patrick Robb <probb@iol.unh.edu> wrote:

>
>
> On Mon, Oct 9, 2023 at 11:56 PM Dharmik Jayesh Thakkar <
> DharmikJayesh.Thakkar@arm.com> wrote:
>
>> Hi Patrick,
>>
>>
>>
>> Can you provide the grub settings? Is iommu.passthrough=1 included?
>>
>
> Sure. I'm not sure if you just wanted the kernel cmdline options or the
> whole grub config, but I assume you just meant kernel cmdline. Let me know
> if you meant more.
>
> GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=1G hugepagesz=1G
> hugepages=32 iommu=pt intel_iommu=on isolcpus=39-79 nohz_full=39-79
> rcu_nocbs=39-79 processor.max_cstate=1 intel_pstate=disable
> console=ttyS0,115200 console=tty0"
>
> But, iommu.passthrough=1 is not included, so I can add that if we need to.
> Do you know that this won't have any bad implications for the (intel,
> nvidia, broadcom) NICs which we test on this server?
>
>

Just a note here, Patrick. The iommu and intel_pstate kernel parameters
aren't supported on arm, so you can remove those. And when
iommu.passthrough=1, IOMMU is bypassed and intel_iommu doesn't do anything
(and maybe isn't supported on arm, but that's not clear from the docs
<https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt>),
so that can be removed as well.

From what I can tell, using iommu.passthrough=1 is the standard, so if
there are any negative implications, we should investigate them, but there
shouldn't be anything major.
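
To make the suggested cleanup concrete, here is a minimal sketch that drops the three x86-only parameters named above from a command line string and appends iommu.passthrough=1 if it is not already there. The function name arm_cmdline is hypothetical, and the sketch only transforms a string; writing the result back into the grub configuration is left out.

```shell
# Hypothetical helper: remove iommu=, intel_iommu= and intel_pstate= from a
# kernel command line and append iommu.passthrough=1 when it is missing.
arm_cmdline() {
    out=""
    for arg in $1; do
        case "$arg" in
            # "iommu=*" does not match "iommu.passthrough=1" (dot, not "=").
            iommu=*|intel_iommu=*|intel_pstate=*) continue ;;
        esac
        out="${out:+$out }$arg"
    done
    case " $out " in
        *" iommu.passthrough=1 "*) ;;
        *) out="${out:+$out }iommu.passthrough=1" ;;
    esac
    printf '%s\n' "$out"
}

# Applied to the command line quoted above:
arm_cmdline "default_hugepagesz=1G hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=39-79 nohz_full=39-79 rcu_nocbs=39-79 processor.max_cstate=1 intel_pstate=disable console=ttyS0,115200 console=tty0"
```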


>
>>
>> Also, is qat_c62xvf loaded as well?
>>
> qat_c62xvf is built in to the kernel also.
>
>
>>
>>
>
>>
>> Finally, a few guidelines on the vfio driver:
>>
>> At times, we need to configure the vfio driver.
>>
>> On kernel vers. 5.9+ we need to load the vfio-pci driver with the
>> additional parameter *disable_denylist=1*
>>
>> Unload the vfio-pci driver if it is already loaded so that we can reload
>> it with the correct parameters :
>> *sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo
>> modprobe -r vfio_virqfd; sudo modprobe -r vfio*
>>
>> If you can't unload the vfio driver because it's been built into the
>> kernel, you'll have to find another way to change VFIO parameters, or to
>> rebuild your kernel with VFIO_PCI set as a module. Failing to do that, you
>> might encounter issues later on when you try to bind the VFs to VFIO.
>>
>> Load the vfio-pci driver and bind it to QAT VFs device ids:
>> *sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1
>> vfio-pci.ids=8086:37c9*
>>
>> Enable no-iommu-mode:
>> *echo "1" | sudo tee
>> /sys/module/vfio/parameters/enable_unsafe_noiommu_mode*
>>
>>  /sys/module/vfio/parameter is missing ?
>>
>> If /sys/module/vfio/parameters does not exist, you might be missing the
>> kernel module VFIO_NOIOMMU
>>
>>
>>
>> *Automatically set VFIO params on boot*
>>
>> It's possible to set these parameters automatically on boot by creating a
>> */etc/modprobe.d/vfio-pci.conf *file with the parameters :
>> *cat /etc/modprobe.d/vfio-pci.conf*
>> *options vfio enable_unsafe_noiommu_mode=1*
>> *options vfio-pci disable_denylist=1 enable_sriov=1
>> vfio-pci.ids=8086:37c9*
>>
>>
>>
>> We haven’t encountered this issue in the past, so just making sure the
>> configuration is correct. I don’t think having the driver static/loadable
>> should make a difference, I will try with building statically on my setup.
>>
>>
>>
>> Thank you!
>>
>>
>> Okay, this should be fine. Like I said, we are also running tests on NICs
> on this server. So, in our Jenkinsfiles scripts for running the testing, I
> will add a preliminary step only for QAT tests which runs:
> *sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo
> modprobe -r vfio_virqfd; sudo modprobe -r vfio*
> *sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1
> vfio-pci.ids=8086:37c9*
> *echo "1" | sudo tee
> /sys/module/vfio/parameters/enable_unsafe_noiommu_mode*
> (then run QAT tests)
>
> And if running on NICs, have a preliminary step which runs
> *sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo
> modprobe -r vfio_virqfd; sudo modprobe -r vfio*
> *sudo modprobe vfio*
>
> David does this also sound reasonable to you, per your comment about
> isolating this setting to QAT card testing?
>
> Dharmik if this all sounds okay and you can confirm the iommu.passthrough
> change is fine, I will proceed. Thank you for providing the assistance.
>
>

[-- Attachment #2: Type: text/html, Size: 7019 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-10-10 15:59                         ` Patrick Robb
  2023-10-10 21:50                           ` Dharmik Jayesh Thakkar
  2023-10-11  8:14                           ` Juraj Linkeš
@ 2023-10-11 11:51                           ` David Marchand
  2 siblings, 0 replies; 31+ messages in thread
From: David Marchand @ 2023-10-11 11:51 UTC (permalink / raw)
  To: Patrick Robb
  Cc: Dharmik Jayesh Thakkar, Ruifeng Wang, Juraj Linkeš,
	Honnappa Nagarahalli, ci, nd

On Tue, Oct 10, 2023 at 6:00 PM Patrick Robb <probb@iol.unh.edu> wrote:
> On Mon, Oct 9, 2023 at 11:56 PM Dharmik Jayesh Thakkar <DharmikJayesh.Thakkar@arm.com> wrote:
>>
>> Hi Patrick,
>>
>>
>>
>> Can you provide the grub settings? Is iommu.passthrough=1 included?
>
>
> Sure. I'm not sure if you just wanted the kernel cmdline options or the whole grub config, but I assume you just meant kernel cmdline. Let me know if you meant more.
>
> GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=1G hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=39-79 nohz_full=39-79 rcu_nocbs=39-79 processor.max_cstate=1 intel_pstate=disable console=ttyS0,115200 console=tty0"
>
> But, iommu.passthrough=1 is not included, so I can add that if we need to. Do you know that this won't have any bad implications for the (intel, nvidia, broadcom) NICs which we test on this server?
>
>>
>>
>>
>> Also, is qat_c62xvf loaded as well?
>
> qat_c62xvf is built in to the kernel also.
>
>>
>>
>>
>>
>>
>> Finally, a few guidelines on the vfio driver:
>>
>> At times, we need to configure the vfio driver.
>>
>> On kernel vers. 5.9+ we need to load the vfio-pci driver with the additional parameter disable_denylist=1
>>
>> Unload the vfio-pci driver if it is already loaded so that we can reload it with the correct parameters :
>> sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio
>>
>> If you can't unload the vfio driver because it's been built into the kernel, you'll have to find another way to change VFIO parameters, or to rebuild your kernel with VFIO_PCI set as a module. Failing to do that, you might encounter issues later on when you try to bind the VFs to VFIO.
>>
>> Load the vfio-pci driver and bind it to QAT VFs device ids:
>> sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9
>>
>> Enable no-iommu-mode:
>> echo "1" | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
>>
>>  /sys/module/vfio/parameter is missing ?
>>
>> If /sys/module/vfio/parameters does not exist, you might be missing the kernel module VFIO_NOIOMMU
>>
>>
>>
>> Automatically set VFIO params on boot
>>
>> It's possible to set these parameters automatically on boot by creating a /etc/modprobe.d/vfio-pci.conf file with the parameters :
>> cat /etc/modprobe.d/vfio-pci.conf
>> options vfio enable_unsafe_noiommu_mode=1
>> options vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9
>>
>>
>>
>> We haven’t encountered this issue in the past, so just making sure the configuration is correct. I don’t think having the driver static/loadable should make a difference, I will try with building statically on my setup.
>>
>>
>>
>> Thank you!
>>
>>
> Okay, this should be fine. Like I said, we are also running tests on NICs on this server. So, in our Jenkinsfiles scripts for running the testing, I will add a preliminary step only for QAT tests which runs:
> sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio

- I thought vfio_iommu_type1 was an x86 thing. So it would work for x86
(Intel/AMD) systems, but fail on other arches?
If you tested this on ARM, it is probably ok as is.


> sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9

- Speaking to myself, too bad the disable_denylist param value is only
read once, when loading the vfio-pci kernel module...
So ok, I get why you need to reload the whole chain of kmods.

However, I don't think the vfio-pci.ids syntax works for passing parameters.
And in any case, why do you need to set this initial list?
Binding devices (using either driverctl or dpdk-devbind.py) to
vfio-pci should be done the "usual" way, or is there some special case
again for QAT?


- Besides, from what I understood so far, there are two parts specific
to this QAT test:
* enabling SRIOV so that creating VF is possible with a PF bound to
vfio-pci (option enable_sriov=1),
* for a list of PCI QAT cards, forcing the disable_denylist is needed
(option disable_denylist=1),

For the latter point, at this step of the test setup, do you know
which QAT devices will be used?
If so, the commandline params could be constructed to enable
disable_denylist only for known-broken QAT devices (the list is
available in the kernel commit Dharmik provided earlier).
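
A minimal sketch of that idea, assuming the test setup already knows the vendor:device IDs it will use. The function name vfio_pci_opts and the variable QAT_DENYLISTED_IDS are hypothetical, and the list holds only the one ID mentioned in this thread (8086:37c9); the full list would have to come from the kernel commit. Note that disable_denylist is a single module-wide switch, so the sketch only decides whether to pass it at all.

```shell
# IDs assumed to be on the vfio-pci denylist. Only the C62x VF ID from this
# thread is listed here; extend it from the kernel commit.
QAT_DENYLISTED_IDS="8086:37c9"

# Print vfio-pci module options for the device IDs given as arguments.
# enable_sriov=1 is always emitted (this helper is meant for the QAT setup);
# disable_denylist=1 is added only when one of the IDs is in the list.
vfio_pci_opts() {
    opts="enable_sriov=1"
    for id in "$@"; do
        case " $QAT_DENYLISTED_IDS " in
            *" $id "*) opts="disable_denylist=1 $opts"; break ;;
        esac
    done
    printf '%s\n' "$opts"
}

# Intended use (not run here):
#   sudo modprobe vfio-pci $(vfio_pci_opts 8086:37c9)
```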


> echo "1" | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
> (then run QAT tests)
>
> And if running on NICs, have a preliminary step which runs
> sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio
> sudo modprobe vfio

Given that vfio_iommu_type1 is ok on other arch, this lgtm.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-10-11  8:14                           ` Juraj Linkeš
@ 2023-10-11 20:13                             ` Patrick Robb
  2023-11-02 22:00                               ` Patrick Robb
  0 siblings, 1 reply; 31+ messages in thread
From: Patrick Robb @ 2023-10-11 20:13 UTC (permalink / raw)
  To: Juraj Linkeš
  Cc: Dharmik Jayesh Thakkar, David Marchand, Ruifeng Wang,
	Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 2312 bytes --]

On Wed, Oct 11, 2023 at 4:14 AM Juraj Linkeš <juraj.linkes@pantheon.tech>
wrote:

>
>
> On Tue, Oct 10, 2023 at 5:59 PM Patrick Robb <probb@iol.unh.edu> wrote:
>
>>
>>
>> On Mon, Oct 9, 2023 at 11:56 PM Dharmik Jayesh Thakkar <
>> DharmikJayesh.Thakkar@arm.com> wrote:
>>
>>> Hi Patrick,
>>>
>>>
>>>
>>> Can you provide the grub settings? Is iommu.passthrough=1 included?
>>>
>>
>> Sure. I'm not sure if you just wanted the kernel cmdline options or the
>> whole grub config, but I assume you just meant kernel cmdline. Let me know
>> if you meant more.
>>
>> GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=1G hugepagesz=1G
>> hugepages=32 iommu=pt intel_iommu=on isolcpus=39-79 nohz_full=39-79
>> rcu_nocbs=39-79 processor.max_cstate=1 intel_pstate=disable
>> console=ttyS0,115200 console=tty0"
>>
>> But, iommu.passthrough=1 is not included, so I can add that if we need
>> to. Do you know that this won't have any bad implications for the (intel,
>> nvidia, broadcom) NICs which we test on this server?
>>
>>
>
> Just a note here, Patrick. The iommu and intel_pstate kernel parameters
> aren't supported on arm, so you can remove those. And when
> iommu.passthrough=1, IOMMU is bypassed and intel_iommu doesn't do anything
> (and maybe isn't supported on arm, but that's not clear from the docs
> <https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt>),
> so that can be removed as well.
>

Thanks Dharmik and Juraj. Updated kernel cmdline args:
BOOT_IMAGE=/vmlinuz-5.15.82+ root=/dev/mapper/ubuntu--vg--1-ubuntu--lv ro
default_hugepagesz=1G hugepagesz=1G hugepages=32 isolcpus=39-79
nohz_full=39-79 rcu_nocbs=39-79 processor.max_cstate=1 iommu.passthrough=1
console=ttyS0,115200 console=tty0

I added the iommu.passthrough option and tried again, to no avail. FYI I am
still using the guidance here:
https://doc.dpdk.org/guides/cryptodevs/qat.html along with your added steps.

root@arm-ampere-dut:~# echo 16 >
/sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs
Segmentation fault (core dumped)

As you know the above setting of the 48 VFs is a prerequisite to binding
the VFs to vfio-pci. But, I did run through loading the custom vfio and
there were no issues, so once we clear this initial hurdle we should be
fine.
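
For reference, the 48 VFs are 16 per PF across the 3 PFs, so the step that currently fails would eventually be repeated per PF. A rough sketch of that loop follows; create_vfs and SYSFS_ROOT are hypothetical names, and SYSFS_ROOT exists only so the loop can be tried against a scratch directory instead of the real /sys.

```shell
# Hypothetical helper: write the requested VF count to sriov_numvfs for
# every device bound to the given PCI driver, then echo the value read back.
create_vfs() {
    driver=$1
    numvfs=$2
    root=${SYSFS_ROOT:-/sys}
    for f in "$root/bus/pci/drivers/$driver"/*/sriov_numvfs; do
        # An unmatched glob stays literal; skip it.
        [ -e "$f" ] || continue
        echo "$numvfs" > "$f" || return 1
        echo "$f: $(cat "$f")"
    done
}

# Intended use as root on the DUT (not run here):
#   create_vfs c6xx 16
```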

[-- Attachment #2: Type: text/html, Size: 3569 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-10-11 20:13                             ` Patrick Robb
@ 2023-11-02 22:00                               ` Patrick Robb
  2023-11-14  7:34                                 ` Ruifeng Wang
  0 siblings, 1 reply; 31+ messages in thread
From: Patrick Robb @ 2023-11-02 22:00 UTC (permalink / raw)
  To: Juraj Linkeš
  Cc: Dharmik Jayesh Thakkar, David Marchand, Ruifeng Wang,
	Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 700 bytes --]

On Wed, Oct 11, 2023 at 4:13 PM Patrick Robb <probb@iol.unh.edu> wrote:

>
> root@arm-ampere-dut:~# echo 16 >
> /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs
> Segmentation fault (core dumped)
>
> Hi Aaron,

Thanks for offering to take a look. I'm not sure if you've seen the rest of
this conversation already from it being on the ci mailing list or not, but
modinfo looks good for qat_c62x and qat_c62xvf after the custom kernel was
built. From there, it should be possible to bind some VFs for each PF on
the QAT card, per documentation here
https://doc.dpdk.org/guides/cryptodevs/qat.html but it results in a seg
fault like you see above. Let me know if you have any ideas.

[-- Attachment #2: Type: text/html, Size: 1160 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-11-02 22:00                               ` Patrick Robb
@ 2023-11-14  7:34                                 ` Ruifeng Wang
  2023-11-14 14:36                                   ` Patrick Robb
  2024-02-27  6:58                                   ` Patrick Robb
  0 siblings, 2 replies; 31+ messages in thread
From: Ruifeng Wang @ 2023-11-14  7:34 UTC (permalink / raw)
  To: Patrick Robb, Juraj Linkeš
  Cc: Dharmik Jayesh Thakkar, David Marchand, Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 1488 bytes --]

Hi Patrick,

It seems kernel v5.15 has a defect on this. A similar issue was fixed by commit:
40da865381ad ("crypto: qat - remove unneeded packed attribute")

Could you patch the kernel and try again?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=40da865381ad061ab75a7a9da469ed4e623bdfeb

Thanks,
Ruifeng

From: Patrick Robb <probb@iol.unh.edu>
Date: Friday, November 3, 2023 at 6:01 AM
To: Juraj Linkeš <juraj.linkes@pantheon.tech>
Cc: Dharmik Jayesh Thakkar <DharmikJayesh.Thakkar@arm.com>, David Marchand <david.marchand@redhat.com>, Ruifeng Wang <Ruifeng.Wang@arm.com>, Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>, ci@dpdk.org <ci@dpdk.org>, nd <nd@arm.com>
Subject: Re: Intel QAT 8970 accel card on ARM Ampere Server
On Wed, Oct 11, 2023 at 4:13 PM Patrick Robb <probb@iol.unh.edu<mailto:probb@iol.unh.edu>> wrote:

root@arm-ampere-dut:~# echo 16 > /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs
Segmentation fault (core dumped)

Hi Aaron,

Thanks for offering to take a look. I'm not sure if you've seen the rest of this conversation already from it being on the ci mailing list or not, but modinfo looks good for qat_c62x and qat_c62xvf after the custom kernel was built. From there, it should be possible to bind some VFs for each PF on the QAT card, per documentation here https://doc.dpdk.org/guides/cryptodevs/qat.html but it results in a seg fault like you see above. Let me know if you have any ideas.

[-- Attachment #2: Type: text/html, Size: 5386 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-11-14  7:34                                 ` Ruifeng Wang
@ 2023-11-14 14:36                                   ` Patrick Robb
  2024-02-27  6:58                                   ` Patrick Robb
  1 sibling, 0 replies; 31+ messages in thread
From: Patrick Robb @ 2023-11-14 14:36 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: Juraj Linkeš,
	Dharmik Jayesh Thakkar, David Marchand, Honnappa Nagarahalli, ci,
	nd, Aaron Conole

[-- Attachment #1: Type: text/plain, Size: 2086 bytes --]

Hi Ruifeng,

Okay, thanks for the update. I'll build a new kernel just like before, but
with this patch added too. And, I know it shouldn't matter, but I'll avoid
statically building in the qat modules this go around.

Thanks,
Patrick

On Tue, Nov 14, 2023 at 2:35 AM Ruifeng Wang <Ruifeng.Wang@arm.com> wrote:

> Hi Patrick,
>
>
>
> It seems kernel v5.15 has a defect on this. A similar issue was fixed by
> commit:
>
> 40da865381ad ("crypto: qat - remove unneeded packed attribute")
>
>
>
> Could you patch the kernel and try again?
>
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=40da865381ad061ab75a7a9da469ed4e623bdfeb
>
>
>
> Thanks,
>
> Ruifeng
>
>
>
> *From: *Patrick Robb <probb@iol.unh.edu>
> *Date: *Friday, November 3, 2023 at 6:01 AM
> *To: *Juraj Linkeš <juraj.linkes@pantheon.tech>
> *Cc: *Dharmik Jayesh Thakkar <DharmikJayesh.Thakkar@arm.com>, David
> Marchand <david.marchand@redhat.com>, Ruifeng Wang <Ruifeng.Wang@arm.com>,
> Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>, ci@dpdk.org <
> ci@dpdk.org>, nd <nd@arm.com>
> *Subject: *Re: Intel QAT 8970 accel card on ARM Ampere Server
>
> On Wed, Oct 11, 2023 at 4:13 PM Patrick Robb <probb@iol.unh.edu> wrote:
>
>
>
> root@arm-ampere-dut:~# echo 16 >
> /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs
> Segmentation fault (core dumped)
>
>
>
> Hi Aaron,
>
>
>
> Thanks for offering to take a look. I'm not sure if you've seen the rest
> of this conversation already from it being on the ci mailing list or not,
> but modinfo looks good for qat_c62x and qat_c62xvf after the custom kernel
> was built. From there, it should be possible to bind some VFs for each PF
> on the QAT card, per documentation here
> https://doc.dpdk.org/guides/cryptodevs/qat.html but it results in a seg
> fault like you see above. Let me know if you have any ideas.
>


-- 

Patrick Robb

Technical Service Manager

UNH InterOperability Laboratory

21 Madbury Rd, Suite 100, Durham, NH 03824

www.iol.unh.edu

[-- Attachment #2: Type: text/html, Size: 7488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-11-14  7:34                                 ` Ruifeng Wang
  2023-11-14 14:36                                   ` Patrick Robb
@ 2024-02-27  6:58                                   ` Patrick Robb
  2024-02-27 13:50                                     ` Honnappa Nagarahalli
  1 sibling, 1 reply; 31+ messages in thread
From: Patrick Robb @ 2024-02-27  6:58 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: Juraj Linkeš,
	Dharmik Jayesh Thakkar, David Marchand, Honnappa Nagarahalli, ci,
	nd

[-- Attachment #1: Type: text/plain, Size: 2169 bytes --]

On Tue, Nov 14, 2023 at 2:35 AM Ruifeng Wang <Ruifeng.Wang@arm.com> wrote:

> Hi Patrick,
>
>
>
> It seems kernel v5.15 has a defect on this. A similar issue was fixed by
> commit:
>
> 40da865381ad ("crypto: qat - remove unneeded packed attribute")
>
>
>
> Could you patch the kernel and try again?
>
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=40da865381ad061ab75a7a9da469ed4e623bdfeb
>
>
>
> Thanks,
>
> Ruifeng
>
>
>
Hi Ruifeng,

Sorry for the delay on this - there has been a work item backlog at the
Community Lab we've been working through.

I did rebuild the kernel today with these changes from the commit (or
similar, as the commit above was for the qat_common file in a different
state, but I tried to remain as true to the commit as possible).

And that does seem to have resolved the seg fault problem! Thank you so
much for picking this commit out of obscurity and sending it our way!

root@arm-ampere-dut:~# echo 16 >
/sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs
root@arm-ampere-dut:~# cat
/sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs
16

Wunderbar!

The only other thing I changed (just because I was floating the idea with
Dharmik before) was in the kernel .config I changed the qat_c62x and
qat_c62xvf modules from statically built in (=y) to loadable (=m). Of
course, this should not matter, and I presume the change in behavior
relates to those brought in from the commit above. I just want to present
fully all changes made so there is a complete picture.
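
In case it helps anyone reproduce this, here is a small sketch for checking how the two QAT options are set in a kernel .config. The helper name kconfig_state is hypothetical, and the symbol names CONFIG_CRYPTO_DEV_QAT_C62X and CONFIG_CRYPTO_DEV_QAT_C62XVF as well as the /boot/config-$(uname -r) path are assumptions, not something taken from this thread.

```shell
# Hypothetical helper: report how a Kconfig symbol is set in a .config file.
# Usage: kconfig_state <config-file> <symbol>
# Prints "builtin" (=y), "module" (=m) or "unset".
kconfig_state() {
    case "$(grep -E "^$2=" "$1" 2>/dev/null)" in
        *=y) echo builtin ;;
        *=m) echo module ;;
        *)   echo unset ;;
    esac
}

# Intended use (symbol names and path are assumptions, not run here):
#   kconfig_state "/boot/config-$(uname -r)" CONFIG_CRYPTO_DEV_QAT_C62X
#   kconfig_state "/boot/config-$(uname -r)" CONFIG_CRYPTO_DEV_QAT_C62XVF
```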

I will continue on this tomorrow according to where this conversation left
off, and try to move this quickly. If indeed there are no more blockers I
think we are very close. As a reminder, when standing up a new testing
plan, we want to make sure at least 1 rep from each vendor has SSH access
and can remotely login to help with system tuning, troubleshooting, etc.
for the testbed and test plan. Who would be the best person from ARM for
this at this time, given the context on QAT testing? Ruifeng? Dharmik?
Someone else?

Thanks, I'll keep yall apprised of the situation.

[-- Attachment #2: Type: text/html, Size: 4066 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2024-02-27  6:58                                   ` Patrick Robb
@ 2024-02-27 13:50                                     ` Honnappa Nagarahalli
  2024-02-28 20:00                                       ` Patrick Robb
  0 siblings, 1 reply; 31+ messages in thread
From: Honnappa Nagarahalli @ 2024-02-27 13:50 UTC (permalink / raw)
  To: Patrick Robb
  Cc: Ruifeng Wang, Juraj Linkeš,
	Dharmik Jayesh Thakkar, David Marchand, ci, nd,
	Wathsala Wathawana Vithanage, Paul Szczepanek

+ Paul, Wathsala

> On Feb 27, 2024, at 12:58 AM, Patrick Robb <probb@iol.unh.edu> wrote:
> 
> 
> 
> On Tue, Nov 14, 2023 at 2:35 AM Ruifeng Wang <Ruifeng.Wang@arm.com> wrote:
> Hi Patrick,
>  It seems kernel v5.15 has a defect on this. A similar issue was fixed by commit:
> 40da865381ad ("crypto: qat - remove unneeded packed attribute")
>  Could you patch the kernel and try again?
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=40da865381ad061ab75a7a9da469ed4e623bdfeb
>  Thanks,
> Ruifeng
>  
> Hi Ruifeng,
> 
> Sorry for the delay on this - there has been a work item backlog at the Community Lab we've been working through. 
> 
> I did rebuild the kernel today with these changes from the commit (or similar, as the commit above was for the qat_common file in a different state, but I tried to remain as true to the commit as possible).
> 
> And that does seem to have resolved the seg fault problem! Thank you so much for picking this commit out of obscurity and sending it our way! 
> 
> root@arm-ampere-dut:~# echo 16 > /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs
> root@arm-ampere-dut:~# cat /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs
> 16
> 
> Wunderbar!
> 
> The only other thing I changed (just because I was floating the idea with Dharmik before) was in the kernel .config I changed the qat_c62x and qat_c62xvf modules from statically built in (=y) to loadable (=m). Of course, this should not matter, and I presume the change in behavior relates to those brought in from the commit above. I just want to present fully all changes made so there is a complete picture. 
> 
> I will continue on this tomorrow according to where this conversation left off, and try to move this quickly. If indeed there are no more blockers I think we are very close. As a reminder, when standing up a new testing plan, we want to make sure at least 1 rep from each vendor has SSH access and can remotely login to help with system tuning, troubleshooting, etc. for the testbed and test plan. Who would be the best person from ARM for this at this time, given the context on QAT testing? Ruifeng? Dharmik? Someone else? 
> 
> Thanks, I'll keep yall apprised of the situation.
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2024-02-27 13:50                                     ` Honnappa Nagarahalli
@ 2024-02-28 20:00                                       ` Patrick Robb
  2024-02-28 20:40                                         ` Honnappa Nagarahalli
  0 siblings, 1 reply; 31+ messages in thread
From: Patrick Robb @ 2024-02-28 20:00 UTC (permalink / raw)
  To: Honnappa Nagarahalli
  Cc: Ruifeng Wang, Juraj Linkeš,
	Dharmik Jayesh Thakkar, David Marchand, ci, nd,
	Wathsala Wathawana Vithanage, Paul Szczepanek

[-- Attachment #1: Type: text/plain, Size: 2997 bytes --]

Quick update:

I could bind the QAT VFs to vfio-pci after using the module loading options
Dharmik mentioned.

First, I ran the symmetric crypto (SYM) QAT PMD tests from dpdk-test on a VF and got:

 + Tests Total :       751
 + Tests Skipped :     257
 + Tests Executed :    659
 + Tests Unsupported:   0
 + Tests Passed :      494
 + Tests Failed :       0
 + ------------------------------------------------------- +
Test OK

I can try the crypto performance DTS testsuite next. Let me know if you
have any thoughts.
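
If this run ends up gated in Jenkins, one possible check is sketched below: it reads a summary like the one above on stdin and succeeds only when the "Tests Failed" count is 0. The function name summary_ok is hypothetical, and the sketch assumes the summary keeps the line format shown in this message.

```shell
# Hypothetical helper: read a test summary on stdin and return success only
# if exactly one "Tests Failed" line is present and its count is 0.
summary_ok() {
    failed=$(sed -n 's/^ *+ *Tests Failed *: *\([0-9][0-9]*\).*/\1/p')
    [ -n "$failed" ] && [ "$failed" -eq 0 ] 2>/dev/null
}

# Intended use, with a saved log file whose name is made up here:
#   summary_ok < qat_sym_test.log && echo "QAT sym tests passed"
```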



On Tue, Feb 27, 2024 at 8:51 AM Honnappa Nagarahalli <
Honnappa.Nagarahalli@arm.com> wrote:

> + Paul, Wathsala
>
> > On Feb 27, 2024, at 12:58 AM, Patrick Robb <probb@iol.unh.edu> wrote:
> >
> >
> >
> > On Tue, Nov 14, 2023 at 2:35 AM Ruifeng Wang <Ruifeng.Wang@arm.com>
> wrote:
> > Hi Patrick,
> >  It seems kernel v5.15 has a defect on this. A similar issue was fixed
> by commit:
> > 40da865381ad ("crypto: qat - remove unneeded packed attribute")
> >  Could you patch the kernel and try again?
> >
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=40da865381ad061ab75a7a9da469ed4e623bdfeb
> >  Thanks,
> > Ruifeng
> >
> > Hi Ruifeng,
> >
> > Sorry for the delay on this - there has been a work item backlog at the
> Community Lab we've been working through.
> >
> > I did rebuild the patch today with these changes from the commit (or
> similar, as the commit above was for the qat_common file in a different
> state, but I tried to remain as true to the commit as possible).
> >
> > And that does seem to have resolved the seg fault problem! Thank you so
> much for picking this commit out of obscurity and sending it our way!
> >
> > root@arm-ampere-dut:~# echo 16 >
> /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs
> > root@arm-ampere-dut:~# cat
> /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs
> > 16
> >
> > Wunderbar!
> >
> > The only other thing I changed (just because I was floating the idea
> with Dharmik before) was in the kernel .config I changed the qat_c62x and
> qat_c62xvf modules from statically built in (=y) to loadable (=m). Of
> course, this should not matter, and I presume the change in behavior
> relates to those brought in from the commit above. I just want to present
> fully all changes made so there is a complete picture.
> >
> > I will continue on this tomorrow according to where this conversation
> left off, and try to move this quickly. If indeed there are no more
> blockers I think we are very close. As a reminder, when standing up a new
> testing plan, we want to make sure at least 1 rep from each vendor has SSH
> access and can remotely login to help with system tuning, troubleshooting,
> etc. for the testbed and test plan. Who would be the best person from ARM
> for this at this time, given the context on QAT testing? Ruifeng? Dharmik?
> Someone else?
> >
> > Thanks, I'll keep yall apprised of the situation.
> >
>
>

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2024-02-28 20:00                                       ` Patrick Robb
@ 2024-02-28 20:40                                         ` Honnappa Nagarahalli
  2024-03-07  5:27                                           ` Patrick Robb
  0 siblings, 1 reply; 31+ messages in thread
From: Honnappa Nagarahalli @ 2024-02-28 20:40 UTC (permalink / raw)
  To: Patrick Robb
  Cc: Ruifeng Wang, Juraj Linkeš,
	Dharmik Jayesh Thakkar, David Marchand, ci, nd,
	Wathsala Wathawana Vithanage, Paul Szczepanek, Dhruv Tripathi



> On Feb 28, 2024, at 2:00 PM, Patrick Robb <probb@iol.unh.edu> wrote:
> 
> quick update:
> 
> I could bind the QAT VFs to vfio-pci after using the module loading options Dharmik mentioned. 
> 
> First I tested SYM QAT pmd from dpdk test on the VF and got:
> 
>  + Tests Total :       751
>  + Tests Skipped :     257
>  + Tests Executed :    659
>  + Tests Unsupported:   0
>  + Tests Passed :      494
>  + Tests Failed :       0
>  + ------------------------------------------------------- +
> Test OK
> 
> I can try the crypto performance DTS testsuite next. Let me know if you have any thoughts.
Please go ahead and try. We have not worked on the performance, but it is fine to try.

> 
> 
> 
> On Tue, Feb 27, 2024 at 8:51 AM Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com> wrote:
> + Paul, Wathsala
> 
> > On Feb 27, 2024, at 12:58 AM, Patrick Robb <probb@iol.unh.edu> wrote:
> > 
> > 
> > 
> > On Tue, Nov 14, 2023 at 2:35 AM Ruifeng Wang <Ruifeng.Wang@arm.com> wrote:
> > Hi Patrick,
> >  It seems kernel v5.15 has a defect on this. A similar issue was fixed by commit:
> > 40da865381ad ("crypto: qat - remove unneeded packed attribute")
> >  Could you patch the kernel and try again?
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=40da865381ad061ab75a7a9da469ed4e623bdfeb
> >  Thanks,
> > Ruifeng
> >  
> > Hi Ruifeng,
> > 
> > Sorry for the delay on this - there has been a work item backlog at the Community Lab we've been working through. 
> > 
> > I did rebuild the patch today with these changes from the commit (or similar, as the commit above was for the qat_common file in a different state, but I tried to remain as true to the commit as possible). 
> > 
> > And that does seem to have resolved the seg fault problem! Thank you so much for picking this commit out of obscurity and sending it our way! 
> > 
> > root@arm-ampere-dut:~# echo 16 > /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs
> > root@arm-ampere-dut:~# cat /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs
> > 16
> > 
> > Wunderbar!
> > 
> > The only other thing I changed (just because I was floating the idea with Dharmik before) was in the kernel .config I changed the qat_c62x and qat_c62xvf modules from statically built in (=y) to loadable (=m). Of course, this should not matter, and I presume the change in behavior relates to those brought in from the commit above. I just want to present fully all changes made so there is a complete picture. 
> > 
> > I will continue on this tomorrow according to where this conversation left off, and try to move this quickly. If indeed there are no more blockers I think we are very close. As a reminder, when standing up a new testing plan, we want to make sure at least 1 rep from each vendor has SSH access and can remotely login to help with system tuning, troubleshooting, etc. for the testbed and test plan. Who would be the best person from ARM for this at this time, given the context on QAT testing? Ruifeng? Dharmik? Someone else? 
> > 
> > Thanks, I'll keep yall apprised of the situation.
> > 
> 
> 
> 


* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2024-02-28 20:40                                         ` Honnappa Nagarahalli
@ 2024-03-07  5:27                                           ` Patrick Robb
  2024-03-07  7:56                                             ` David Marchand
  0 siblings, 1 reply; 31+ messages in thread
From: Patrick Robb @ 2024-03-07  5:27 UTC (permalink / raw)
  To: Honnappa Nagarahalli
  Cc: Ruifeng Wang, Juraj Linkeš,
	Dharmik Jayesh Thakkar, David Marchand, ci, nd,
	Wathsala Wathawana Vithanage, Paul Szczepanek, Dhruv Tripathi

Hi all,

I have run the crypto_perf_cryptodev_perf DTS testsuite for the QAT
card on the Ampere server, and have some updates below:

On Wed, Feb 28, 2024 at 3:40 PM Honnappa Nagarahalli
<Honnappa.Nagarahalli@arm.com> wrote:
>
>
>
> > On Feb 28, 2024, at 2:00 PM, Patrick Robb <probb@iol.unh.edu> wrote:
> >
> > quick update:
> >
> > I could bind the QAT VFs to vfio-pci after using the module loading options Dharmik mentioned.
> >
> > First I tested SYM QAT pmd from dpdk test on the VF and got:
> >
> >  + Tests Total :       751
> >  + Tests Skipped :     257
> >  + Tests Executed :    659
> >  + Tests Unsupported:   0
> >  + Tests Passed :      494
> >  + Tests Failed :       0
> >  + ------------------------------------------------------- +
> > Test OK
> >
> > I can try the crypto performance DTS testsuite next. Let me know if you have any thoughts.
> Please go ahead and try. We have not worked on the performance, but it is fine to try.

First, two tiny changes are needed in DTS to make it work:

1. As Dharmik and David discussed, there are some QAT devices that
need vfio-pci loaded with disable_denylist=1. To account for this, in
cryptodev_common.py (which the crypto perf testsuite imports), we need
to add:

given the c62x device id is 37c8

if dev_id in ["37c8", "435", "19e2"]:
    test_case.dut.send_expect('modprobe -r vfio_iommu_type1; modprobe -r vfio_pci; modprobe -r vfio_virqfd; modprobe -r vfio', '# ', 5)
    test_case.dut.send_expect('modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9', '# ', 5)
    test_case.dut.send_expect('echo "1" | tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode', '# ', 5)

This maintains the custom vfio loading Dharmik recommended. The
latter two dev ids in that list are for DH895XCC and C3XXX, since they
are also included in
https://github.com/torvalds/linux/commit/50173329c8cc0c892eaa7a9d0f0692ac39cd7b04

David and Dharmik, I think this is correct, but please chime in if it isn't.

2. For this testsuite we need to add some whitespace stripping on the
lscpu output for ARM systems. For some reason on some systems there is
no leading whitespace before "Core(s) per socket" in lscpu, but in
others (the arm servers we have at the lab) there is.
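
As a rough illustration of the whitespace stripping in item 2 (a minimal
sketch, not the actual DTS patch; the helper name parse_lscpu and the sample
core counts are made up for the example), the lscpu output can be split on the
first colon and both halves stripped, so the lookup works whether or not the
line is indented:

```python
def parse_lscpu(output: str) -> dict:
    """Parse `lscpu` text into a dict, ignoring leading/trailing whitespace.

    Hypothetical helper for illustration; DTS has its own parsing code.
    """
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines without a "key: value" shape
        fields[key.strip()] = value.strip()
    return fields


# Illustrative samples only: one without and one with leading whitespace.
sample_plain = "Thread(s) per core:  2\nCore(s) per socket:  16\n"
sample_indented = "  Thread(s) per core:  1\n  Core(s) per socket:  80\n"

print(int(parse_lscpu(sample_plain)["Core(s) per socket"]))
print(int(parse_lscpu(sample_indented)["Core(s) per socket"]))
```

Stripping the key rather than matching a fixed prefix keeps the same lookup
working on both kinds of systems.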

So, as long as this is all fine, I can submit a patch to DTS for these items.

And from there we can run the testsuite and all QAT testcases are
passing. It will give some results like:

PerfTestsCryptodev: Test Case test_qat_zuc Begin
dut.arm-ampere-dut.dpdklab.iol.unh.edu: lscpu
dut.arm-ampere-dut.dpdklab.iol.unh.edu: x86_64-native-linux-gcc/app/dpdk-test-crypto-perf  -l 9,10 -a 0000:03:01.0 --socket-mem 2048,0 -n 6  -- --ptest throughput --silent --total-
CRYPTODEV: Initialisation parameters - name: 0000:03:01.0_qat_sym,socket id: 0, max queue pairs: 0
Allocated pool "sess_mp_0" on socket 0
    lcore id    Buf Size  Burst Size    Enqueued    Dequeued  Failed Enq  Failed Deq        MOps        Gbps  Cycles/Buf

          10          64          32    30000000    30000000    39393954    33424660      5.5361      2.8345        4.52
          10         128          32    30000000    30000000    40170307    34256181      5.4867      5.6184        4.56
          10         256          32    30000000    30000000    42119414    36231215      5.3883     11.0352        4.64
          10         512          32    30000000    30000000    44557481    38555569      5.2235     21.3955        4.79
          10        1024          32    30000000    30000000    55097817    48193496      4.6161     37.8149        5.42
          10        2048          32    30000000    30000000   126698128   118908347      3.0483     49.9439        8.20

I will let those of you who are working on this assess the performance
metrics. I assume this is useful, and if/when we bring this to CI, all
these results will be stored as artifacts and viewable for any new
series that comes in.
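
For anyone sanity-checking the table, the Gbps column appears to follow from
MOps and buffer size as MOps x buffer bytes x 8 / 1000. This relationship is
inferred from the numbers above, not taken from the dpdk-test-crypto-perf
source, so treat the sketch below as an assumption (the function name is
made up):

```python
def gbps_from_mops(mops: float, buf_size_bytes: int) -> float:
    """Throughput in Gbps, assuming every operation moves one full buffer."""
    return mops * 1e6 * buf_size_bytes * 8 / 1e9


# (Buf Size, MOps, reported Gbps) copied from the test_qat_zuc output above.
rows = [
    (64, 5.5361, 2.8345),
    (128, 5.4867, 5.6184),
    (256, 5.3883, 11.0352),
    (512, 5.2235, 21.3955),
    (1024, 4.6161, 37.8149),
    (2048, 3.0483, 49.9439),
]

for buf_size, mops, reported in rows:
    computed = gbps_from_mops(mops, buf_size)
    # Small differences are expected because MOps is printed to 4 decimals.
    print(f"{buf_size:5d} B: computed {computed:8.4f} Gbps, reported {reported:8.4f} Gbps")
```

The computed values agree with the reported ones to within rounding, which
suggests the columns were read correctly.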

Happy to discuss further tomorrow at the CI meeting. If there are no
issues here, I think we can write up the jenkins scripts pretty
quickly and get this online tomorrow or early next week.

* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2024-03-07  5:27                                           ` Patrick Robb
@ 2024-03-07  7:56                                             ` David Marchand
  0 siblings, 0 replies; 31+ messages in thread
From: David Marchand @ 2024-03-07  7:56 UTC (permalink / raw)
  To: Patrick Robb
  Cc: Honnappa Nagarahalli, Ruifeng Wang, Juraj Linkeš,
	Dharmik Jayesh Thakkar, ci, nd, Wathsala Wathawana Vithanage,
	Paul Szczepanek, Dhruv Tripathi

Hello Patrick,

On Thu, Mar 7, 2024 at 6:27 AM Patrick Robb <probb@iol.unh.edu> wrote:
> 1. As Dharmik and David discussed, there are some QAT devices that
> need VFIO denylist=1. To account for this, in cryptodev_common.py
> (which the crypto perf testsuite imports), we need to add:
>
> given the c62x device id is 37c8
>
> if dev_id in ["37c8", "435", "19e2"]:
>     test_case.dut.send_expect('modprobe -r vfio_iommu_type1; modprobe -r vfio_pci; modprobe -r vfio_virqfd; modprobe -r vfio', '# ', 5)
>     test_case.dut.send_expect('modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9', '# ', 5)
>     test_case.dut.send_expect('echo "1" | tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode', '# ', 5)
>
> In order to maintain the custom vfio loading Dharmik recommended. The
> latter two dev ids in that list are for DH895XCC and C3XXX, since they
> are also included in
> https://github.com/torvalds/linux/commit/50173329c8cc0c892eaa7a9d0f0692ac39cd7b04
>
> David and Dharmik, I think this is correct, but please chime in if it isn't.

You probably missed one question I had, mixed with my grmbl about
disable_denylist.
"""
However, I don't think the vfio-pci.ids syntax works for passing parameters.
And in any case, why do you need to set this initial list?
Binding devices (using either driverctl or dpdk-devbind.py) to
vfio-pci should be done the "usual" way, or is there some special case
again for QAT?
"""

Re-reading vfio-pci kernel parsing code, the syntax for vfio-pci.ids
seems ok, my bad.

But I am still not clear if there is a need for a special case here.
bind_qat_device() calls test_case.dut.bind_eventdev_port which itself
calls dpdk-devbind to bind the VF to vfio-pci.

So here, on the topic of loading vfio-pci wrt the QAT quirk, you only need:
# modprobe vfio-pci disable_denylist=1 enable_sriov=1
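
One way this could look on the DTS side is sketched below. It is only an
illustration under assumptions: the helper name and the plain "modprobe
vfio-pci" fallback are hypothetical, and the device id list is simply the one
from the quoted snippet. The ids= parameter is dropped since dpdk-devbind
already does the binding.

```python
# Device ids copied from the list in the quoted DTS snippet.
QAT_DENYLISTED_DEV_IDS = {"37c8", "435", "19e2"}


def vfio_pci_modprobe_cmd(dev_id: str) -> str:
    """Hypothetical helper: build the modprobe line used to load vfio-pci."""
    if dev_id in QAT_DENYLISTED_DEV_IDS:
        # QAT quirk: these devices are on the vfio-pci denylist.
        return "modprobe vfio-pci disable_denylist=1 enable_sriov=1"
    # Assumed default for devices that need no special handling.
    return "modprobe vfio-pci"


print(vfio_pci_modprobe_cmd("37c8"))
```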


-- 
David Marchand


end of thread, other threads:[~2024-03-07  7:57 UTC | newest]

Thread overview: 31+ messages
2023-07-31 17:13 Intel QAT 8970 accel card on ARM Ampere Server Patrick Robb
2023-08-04  9:48 ` Ruifeng Wang
2023-08-08  7:07   ` Juraj Linkeš
2023-08-08  7:11     ` Ruifeng Wang
2023-08-11 21:18       ` Patrick Robb
2023-08-21  8:45         ` Juraj Linkeš
2023-08-30  0:05           ` Patrick Robb
2023-09-01 21:30           ` Patrick Robb
2023-09-11  8:13             ` Juraj Linkeš
2023-09-20 18:28               ` Patrick Robb
2023-09-25 15:19                 ` Ruifeng Wang
2023-10-09 16:34                   ` Patrick Robb
2023-10-10  2:28                     ` Patrick Robb
2023-10-10  3:55                       ` Dharmik Jayesh Thakkar
2023-10-10  7:25                         ` David Marchand
2023-10-10 15:03                           ` Dharmik Jayesh Thakkar
2023-10-10 15:12                             ` David Marchand
2023-10-10 15:59                         ` Patrick Robb
2023-10-10 21:50                           ` Dharmik Jayesh Thakkar
2023-10-11  8:14                           ` Juraj Linkeš
2023-10-11 20:13                             ` Patrick Robb
2023-11-02 22:00                               ` Patrick Robb
2023-11-14  7:34                                 ` Ruifeng Wang
2023-11-14 14:36                                   ` Patrick Robb
2024-02-27  6:58                                   ` Patrick Robb
2024-02-27 13:50                                     ` Honnappa Nagarahalli
2024-02-28 20:00                                       ` Patrick Robb
2024-02-28 20:40                                         ` Honnappa Nagarahalli
2024-03-07  5:27                                           ` Patrick Robb
2024-03-07  7:56                                             ` David Marchand
2023-10-11 11:51                           ` David Marchand
