* Intel QAT 8970 accel card on ARM Ampere Server
From: Patrick Robb @ 2023-07-31 17:13 UTC
To: Ruifeng Wang, Honnappa Nagarahalli, Juraj Linkeš
Cc: dharmikjayesh.thakkar, ci

Hi Ruifeng, Honnappa, Juraj,

The Intel QAT 8970 accelerator card has arrived at the Community Lab, and we've installed it on the Ampere server. Presumably, we should test both crypto and compress operations (and their respective performance metrics). To that end, there are DTS testsuites for testing QAT crypto/compress functions. These testsuites make use of the crypto perf DPDK app and the compress perf DPDK app. If you want, you can set up the DTS stuff yourself, both on the system side and the Jenkins side (you are allowed to submit PRs on our gitlab now), but we can also do that on the lab side as we probably have more experience. I do, however, have a question about the QAT kernel driver and corresponding PMDs.

compress suite: https://git.dpdk.org/tools/dts/tree/test_plans/compressdev_qat_pmd_test_plan.rst
crypto suite: https://git.dpdk.org/tools/dts/tree/test_plans/crypto_perf_cryptodev_perf_test_plan.rst

For reference, the DPDK docs page explaining QAT driver capabilities and building the QAT PMDs (crypto sym, crypto asym, and compress) is here:
https://doc.dpdk.org/guides/cryptodevs/qat.html#building-qat

Some notes before I get to my main question:
-The 8970 is a C62x device
-OpenSSL (arm requires it for QAT) is installed
-3 PFs are visible from lspci (expected)
-SRIOV is enabled

However, although the system is on a valid kernel version for the QAT driver, the kernel module for QAT is not loaded, so in trying to set up testing, I am unable to create the 16 VFs for each of the 3 PFs, like the example below:

echo 16 > /sys/bus/pci/drivers/c6xx/(pci address)/sriov_numvfs
echo 16 > /sys/bus/pci/drivers/c6xx/(pci address)/sriov_numvfs
echo 16 > /sys/bus/pci/drivers/c6xx/(pci address)/sriov_numvfs

There is also an option to download the firmware from the kernel firmware repo, copy the qat binaries to /lib/firmware, and start the qat modules from there. I wasn't able to resolve the situation with this method, but it also could have been user error on my part.

There is an option to install using the IDZ QAT Driver <https://www.intel.com/content/www/us/en/developer/topic-technology/open/quick-assist-technology/overview.html>, but it should not be required given the kernel version the Ampere server is on, and I don't want to go down the road of relying on this "fall back" method without consulting you first. Is it possible that there is anything specific to running a QAT device on ARM which I am missing here? The DTS testsuite testplans actually seem to recommend going down this road in general, but the DPDK docs say to use the kernel driver, so I don't know.

In any case, one of you should be able to log in to the Ampere server in situations like this, or just in general. Ruifeng/Juraj, I see you both have accounts on our IdM system, so you should have access. Please let me know if you need renewed VPN cert configs and I will send you one. If you do log in, know this system could be running CI testing at any time. I can always schedule time for it to be offline and available for maintenance if you want to do anything which could be disruptive to testing.

I also CC'd Dharmik on this as I see he sent an email regarding QAT support on aarch64 in June.

Let me know if you have any thoughts on the QAT kernel driver part.

Thanks,
Patrick
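The VF-creation step described in the message above can be sketched as a small shell helper. This is only a minimal sketch, not the lab's actual procedure: the function name `set_qat_vfs` and its parameters are hypothetical, and it assumes the PF driver exposes its bound devices under /sys/bus/pci/drivers/c6xx, as in the example commands. The driver directory is a parameter so the logic can be tried against a scratch directory before touching a live system.

```shell
# Hypothetical helper (not from the thread): write the requested VF count
# to every PF bound to the c6xx driver.
#   $1  driver directory in sysfs (default taken from the example above)
#   $2  number of VFs per PF (default 16, as in the example above)
set_qat_vfs() {
    drv_dir="${1:-/sys/bus/pci/drivers/c6xx}"
    num_vfs="${2:-16}"

    # If the kernel module is not loaded, the driver directory does not
    # exist and there is nothing to write to.
    if [ ! -d "$drv_dir" ]; then
        echo "c6xx driver directory not found: $drv_dir" >&2
        return 1
    fi

    count=0
    for f in "$drv_dir"/*/sriov_numvfs; do
        [ -e "$f" ] || continue          # glob matched nothing
        echo "$num_vfs" > "$f" || return 1
        count=$((count + 1))
    done
    echo "$count"                        # number of PFs configured
}
```

Run as root with no arguments, this would attempt the three writes shown in the message and print how many PFs it touched. A non-zero exit status with the "not found" message corresponds to the situation described above, where the QAT kernel module is not loaded.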
* RE: Intel QAT 8970 accel card on ARM Ampere Server
From: Ruifeng Wang @ 2023-08-04 9:48 UTC
To: Patrick Robb, Honnappa Nagarahalli, Juraj Linkeš
Cc: Dharmik Jayesh Thakkar, ci, nd

Hi Patrick,

Thanks for reaching out and my apologies for delayed response.

We noticed that some information is missing regarding using QAT with DPDK on Arm.
The DPDK document will be updated to include the missing part.
Will get back on this later.

Best regards,
Ruifeng
* Re: Intel QAT 8970 accel card on ARM Ampere Server
From: Juraj Linkeš @ 2023-08-08 7:07 UTC
To: Ruifeng Wang
Cc: Patrick Robb, Honnappa Nagarahalli, Dharmik Jayesh Thakkar, ci, nd

We've talked about this some more and the best way to move forward is to rebuild the ubuntu kernel. It should be fairly straightforward according to their wiki page <https://wiki.ubuntu.com/Kernel/BuildYourOwnKernel>. The page mentions a fairly old release (19.04), but was updated a year ago so the instructions are likely still valid.

However, I don't have the link to the kernel patch that Honnappa mentioned. @Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com> @Ruifeng Wang <Ruifeng.Wang@arm.com>, can you please provide a reference for the patch?

Since the patch is small, there shouldn't be problems with applying it. Let us know whether this is doable.

Regards,
Juraj
* RE: Intel QAT 8970 accel card on ARM Ampere Server
From: Ruifeng Wang @ 2023-08-08 7:11 UTC
To: Juraj Linkeš
Cc: Patrick Robb, Honnappa Nagarahalli, Dharmik Jayesh Thakkar, ci, nd, nd

From: Juraj Linkeš <juraj.linkes@pantheon.tech>
Sent: Tuesday, August 8, 2023 3:07 PM

> However, I don't have the link to the kernel patch that Honnappa mentioned. @Honnappa Nagarahalli @Ruifeng Wang, can you please provide a reference for the patch?

[Ruifeng] Here is the kernel patch set: https://lkml.org/lkml/2022/6/17/328
* Re: Intel QAT 8970 accel card on ARM Ampere Server
From: Patrick Robb @ 2023-08-11 21:18 UTC
To: Ruifeng Wang
Cc: Juraj Linkeš, Honnappa Nagarahalli, Dharmik Jayesh Thakkar, ci, nd

Sorry about the wait on my reply guys.

Thanks for the information. So I download the 2 diffs from that thread, make a patch with them. Then where and how do I apply it?

Then I install the packages needed per the ubuntu page, and then I can skip down to the "Building The Kernel" section? And then we're all set I think, and we just have to setup DTS and associated Jenkins pipelines.

Do you want me to back anything up in advance of this? I don't know if that is needed or not, but Ampere is currently live doing testing for CI, so I want to act in a safe manner. I will try to address this first thing on Monday and get back to you.

--
Patrick Robb
Technical Service Manager
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
www.iol.unh.edu
* Re: Intel QAT 8970 accel card on ARM Ampere Server
From: Juraj Linkeš @ 2023-08-21 8:45 UTC
To: Patrick Robb
Cc: Ruifeng Wang, Honnappa Nagarahalli, Dharmik Jayesh Thakkar, ci, nd

Hi Patrick,

On Fri, Aug 11, 2023 at 11:18 PM Patrick Robb <probb@iol.unh.edu> wrote:

> Sorry about the wait on my reply guys.
>
> Thanks for the information. So I download the 2 diffs from that thread,
> make a patch with them. Then where and how do I apply it?

First get the ubuntu repo:

- git clone git://kernel.ubuntu.com/ubuntu/ubuntu-<release codename>.git

22.04 is jammy, but looking at https://kernel.ubuntu.com/git/, it's not under ubuntu/ubuntu-jammy.git, but rather ubuntu-stable/ubuntu-stable-jammy.git. It also seems the repo's been redirected:

git clone git://kernel.ubuntu.com/ubuntu-stable/ubuntu-stable-jammy.git
Cloning into 'ubuntu-stable-jammy'...
fatal: remote error:
**REPOSITORY RELOCATED**
Updated URL: https://git.launchpad.net/~ubuntu-kernel-stable/+git/jammy
Local path: /ubuntu-stable/ubuntu-stable-jammy.git

Cloning the new URL worked for me.

Then we need to checkout the tag that corresponds to the running kernel (uname -r), apply the patch and build the kernel with the running config (in /boot/config-$(uname -r)), possibly enabling the QAT driver if needed.

> Then I install the packages needed per the ubuntu page, and then I can
> skip down to the "Building The Kernel" section? And then we're all set I
> think, and we just have to setup DTS and associated Jenkins pipelines.
>
> Do you want me to back anything up in advance of this? I don't know if
> that is needed or not, but Ampere is currently live doing testing for CI,
> so I want to act in a safe manner. I will try to address this first thing
> on Monday and get back to you.

The backup should not be needed, at least in principle, as we can always reinstall the original kernel packages.
* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-08-21 8:45 ` Juraj Linkeš
@ 2023-08-30 0:05 ` Patrick Robb
  2023-09-01 21:30 ` Patrick Robb
  1 sibling, 0 replies; 31+ messages in thread
From: Patrick Robb @ 2023-08-30 0:05 UTC (permalink / raw)
  To: Juraj Linkeš
  Cc: Ruifeng Wang, Honnappa Nagarahalli, Dharmik Jayesh Thakkar, ci, nd

[-- Attachment #1: Type: text/plain, Size: 1137 bytes --]

Hi Juraj,

Thanks for the guidance. The kernel version in use on the Ampere server currently is 5.4.0-155-generic. The tags for the 22.04 repo you suggested only include 5.15 kernel versions, so I can't checkout to the currently running kernel per your recommendation. I figure the idea behind checking out to the kernel currently running is to maintain the current state as much as possible, and ensure the currently running kernel config could be re-used.

If I instead clone the 20.04/focal repo I can checkout to 5.4.0-155, but the diffs you shared do not cleanly apply.

On the other hand, from looking directly at the files on Jammy/5.15[1], it looks like the qat diffs (https://lkml.org/lkml/2022/6/17/328) have already reached that kernel version. If indeed applying these diffs is not needed in this case, is there any reason why I shouldn't just re-build the kernel from here? I don't want to do this (and necessarily advance the kernel version from 5.4 to 5.15) without asking you since I don't know what the negative implications of this action may be, if any.

[1] https://git.launchpad.net/~ubuntu-kernel-stable/+git/jammy/

[-- Attachment #2: Type: text/html, Size: 1424 bytes --]

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-08-21 8:45 ` Juraj Linkeš
  2023-08-30 0:05 ` Patrick Robb
@ 2023-09-01 21:30 ` Patrick Robb
  2023-09-11 8:13 ` Juraj Linkeš
  1 sibling, 1 reply; 31+ messages in thread
From: Patrick Robb @ 2023-09-01 21:30 UTC (permalink / raw)
  To: Juraj Linkeš; +Cc: Ruifeng Wang, Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 747 bytes --]

Thanks Juraj,

I did bring the system to 22.04 based on our conversation from yesterday. From there and from checking out to the new current kernel (5.15.0-82-generic) yes the diffs cleanly apply, removing the x86 dependency on the QAT kernel drivers, and then you can make the kernel, enabling the QAT driver. It looks like that all worked fine.

I didn't actually install and reboot with the custom kernel today because I don't want to do that with a production server right before the weekend, particularly with USA having a holiday on Monday. I will reboot with the custom kernel on Tuesday morning though, and then hopefully the compress/crypto testsuites on QAT will be unblocked. Thanks, the guidance is greatly appreciated.

Best,
Patrick

[-- Attachment #2: Type: text/html, Size: 890 bytes --]

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-09-01 21:30 ` Patrick Robb
@ 2023-09-11 8:13 ` Juraj Linkeš
  2023-09-20 18:28 ` Patrick Robb
  0 siblings, 1 reply; 31+ messages in thread
From: Juraj Linkeš @ 2023-09-11 8:13 UTC (permalink / raw)
  To: Patrick Robb; +Cc: Ruifeng Wang, Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 968 bytes --]

Hi Patrick,

This is good news. How does the server fare after the restart?

Juraj

On Fri, Sep 1, 2023 at 11:30 PM Patrick Robb <probb@iol.unh.edu> wrote:

> Thanks Juraj,
>
> I did bring the system to 22.04 based on our conversation from yesterday.
> From there and from checking out to the new current kernel
> (5.15.0-82-generic) yes the diffs cleanly apply, removing the x86
> dependency on the QAT kernel drivers, and then you can make the kernel,
> enabling the QAT driver. It looks like that all worked fine.
>
> I didn't actually install and reboot with the custom kernel today because
> I don't want to do that with a production server right before the weekend,
> particularly with USA having a holiday on Monday. I will reboot with the
> custom kernel on Tuesday morning though, and then hopefully the
> compress/crypto testsuites on QAT will be unblocked. Thanks, the guidance
> is greatly appreciated.
>
> Best,
> Patrick
>

[-- Attachment #2: Type: text/html, Size: 1374 bytes --]

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-09-11 8:13 ` Juraj Linkeš
@ 2023-09-20 18:28 ` Patrick Robb
  2023-09-25 15:19 ` Ruifeng Wang
  0 siblings, 1 reply; 31+ messages in thread
From: Patrick Robb @ 2023-09-20 18:28 UTC (permalink / raw)
  To: Juraj Linkeš; +Cc: Ruifeng Wang, Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 2824 bytes --]

Hi Juraj,

Sorry for the late reply. So, yes I applied those diffs and set the QAT modules to =y in the .config file when building the custom kernel. It appears to have worked correctly. The qat module is now built into the new kernel running on the ampere server (called 5.15.82+). You can see it listed on modules.builtin and from modinfo.

probb@arm-ampere-dut:~$ modinfo qat_c62x
name: qat_c62x
filename: (builtin)
version: 0.6.0
description: Intel(R) QuickAssist Technology
firmware: qat_c62x_mmp.bin
firmware: qat_c62x.bin
author: Intel
license: Dual BSD/GPL
file: drivers/crypto/qat/qat_c62x/qat_c62x

And there is a /sys/bus/pci/drivers/c6xx/(pci address)/sriov_numvfs path now so that's good.

However, when trying to create the VFs for the 3 PFs on the card, a segmentation fault was returned the first time, and on subsequent tries it hangs now. So like:

root@arm-ampere-dut:~# lspci -d:37c8
0000:03:00.0 Co-processor: Intel Corporation C62x Chipset QuickAssist Technology (rev 04)
0000:04:00.0 Co-processor: Intel Corporation C62x Chipset QuickAssist Technology (rev 04)
0000:05:00.0 Co-processor: Intel Corporation C62x Chipset QuickAssist Technology (rev 04)
root@arm-ampere-dut:~# echo 16 > /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs

The sriov_numvfs file should be writable from root so I'm a bit perplexed. I am wondering whether it is relevant to statically build in the qat_c62x module with the kernel, vs having it be a loadable driver? What do you do?

On Mon, Sep 11, 2023 at 4:13 AM Juraj Linkeš <juraj.linkes@pantheon.tech> wrote:

> Hi Patrick,
>
> This is good news. How does the server fare after the restart?
>
> Juraj
>
> On Fri, Sep 1, 2023 at 11:30 PM Patrick Robb <probb@iol.unh.edu> wrote:
>
>> Thanks Juraj,
>>
>> I did bring the system to 22.04 based on our conversation from yesterday.
>> From there and from checking out to the new current kernel
>> (5.15.0-82-generic) yes the diffs cleanly apply, removing the x86
>> dependency on the QAT kernel drivers, and then you can make the kernel,
>> enabling the QAT driver. It looks like that all worked fine.
>>
>> I didn't actually install and reboot with the custom kernel today because
>> I don't want to do that with a production server right before the weekend,
>> particularly with USA having a holiday on Monday. I will reboot with the
>> custom kernel on Tuesday morning though, and then hopefully the
>> compress/crypto testsuites on QAT will be unblocked. Thanks, the guidance
>> is greatly appreciated.
>>
>> Best,
>> Patrick
>>
>>

--
Patrick Robb
Technical Service Manager
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
www.iol.unh.edu

[-- Attachment #2: Type: text/html, Size: 5530 bytes --]

^ permalink raw reply [flat|nested] 31+ messages in thread
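[Editor's sketch, not part of the thread.] The message above writes 16 straight into sriov_numvfs. A slightly more defensive version of that step is sketched below. It assumes the standard PCI sysfs attributes sriov_totalvfs and sriov_numvfs; the helper name set_numvfs is hypothetical. It does not explain or fix the segfault/hang reported above, it only rules out two common causes (asking for more VFs than the PF supports, and changing a non-zero VF count without resetting it to 0 first).

```shell
#!/bin/sh
# Hypothetical helper (not from DPDK or the kernel tree).
# Usage: set_numvfs <pci-device-sysfs-dir> <vf-count>
set_numvfs() {
    dev_dir="$1"
    want="$2"

    # sriov_totalvfs reports the maximum number of VFs the PF supports.
    if [ ! -r "$dev_dir/sriov_totalvfs" ]; then
        echo "no sriov_totalvfs under $dev_dir" >&2
        return 1
    fi
    total=$(cat "$dev_dir/sriov_totalvfs")
    if [ "$want" -gt "$total" ]; then
        echo "requested $want VFs, device supports $total" >&2
        return 1
    fi

    # The kernel refuses to go from one non-zero VF count to another,
    # so disable existing VFs before enabling the new count.
    current=$(cat "$dev_dir/sriov_numvfs")
    if [ "$current" -ne 0 ]; then
        echo 0 > "$dev_dir/sriov_numvfs"
    fi
    echo "$want" > "$dev_dir/sriov_numvfs"
}
```

On the lab machine this would be called as root, for example `set_numvfs /sys/bus/pci/devices/0000:03:00.0 16`, followed by a look at `dmesg` for messages from the c6xx driver if the write fails or stalls.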
* RE: Intel QAT 8970 accel card on ARM Ampere Server
  2023-09-20 18:28 ` Patrick Robb
@ 2023-09-25 15:19 ` Ruifeng Wang
  2023-10-09 16:34 ` Patrick Robb
  0 siblings, 1 reply; 31+ messages in thread
From: Ruifeng Wang @ 2023-09-25 15:19 UTC (permalink / raw)
  To: Patrick Robb, Juraj Linkeš, Dharmik Jayesh Thakkar
  Cc: Honnappa Nagarahalli, ci, nd, nd

[-- Attachment #1: Type: text/plain, Size: 3413 bytes --]

+Dharmik

Hi Dharmik,

Do you see a similar problem on your machine?

Thanks,
Ruifeng

From: Patrick Robb <probb@iol.unh.edu>
Sent: Thursday, September 21, 2023 2:28 AM
To: Juraj Linkeš <juraj.linkes@pantheon.tech>
Cc: Ruifeng Wang <Ruifeng.Wang@arm.com>; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; ci@dpdk.org; nd <nd@arm.com>
Subject: Re: Intel QAT 8970 accel card on ARM Ampere Server

Hi Juraj,

Sorry for the late reply. So, yes I applied those diffs and set the QAT modules to =y in the .config file when building the custom kernel. It appears to have worked correctly. The qat module is now built into the new kernel running on the ampere server (called 5.15.82+). You can see it listed on modules.builtin and from modinfo.

probb@arm-ampere-dut:~$ modinfo qat_c62x
name: qat_c62x
filename: (builtin)
version: 0.6.0
description: Intel(R) QuickAssist Technology
firmware: qat_c62x_mmp.bin
firmware: qat_c62x.bin
author: Intel
license: Dual BSD/GPL
file: drivers/crypto/qat/qat_c62x/qat_c62x

And there is a /sys/bus/pci/drivers/c6xx/(pci address)/sriov_numvfs path now so that's good.

However, when trying to create the VFs for the 3 PFs on the card, a segmentation fault was returned the first time, and on subsequent tries it hangs now. So like:

root@arm-ampere-dut:~# lspci -d:37c8
0000:03:00.0 Co-processor: Intel Corporation C62x Chipset QuickAssist Technology (rev 04)
0000:04:00.0 Co-processor: Intel Corporation C62x Chipset QuickAssist Technology (rev 04)
0000:05:00.0 Co-processor: Intel Corporation C62x Chipset QuickAssist Technology (rev 04)
root@arm-ampere-dut:~# echo 16 > /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs

The sriov_numvfs file should be writable from root so I'm a bit perplexed. I am wondering whether it is relevant to statically build in the qat_c62x module with the kernel, vs having it be a loadable driver? What do you do?

On Mon, Sep 11, 2023 at 4:13 AM Juraj Linkeš <juraj.linkes@pantheon.tech<mailto:juraj.linkes@pantheon.tech>> wrote:

Hi Patrick,

This is good news. How does the server fare after the restart?

Juraj

On Fri, Sep 1, 2023 at 11:30 PM Patrick Robb <probb@iol.unh.edu<mailto:probb@iol.unh.edu>> wrote:

Thanks Juraj,

I did bring the system to 22.04 based on our conversation from yesterday. From there and from checking out to the new current kernel (5.15.0-82-generic) yes the diffs cleanly apply, removing the x86 dependency on the QAT kernel drivers, and then you can make the kernel, enabling the QAT driver. It looks like that all worked fine.

I didn't actually install and reboot with the custom kernel today because I don't want to do that with a production server right before the weekend, particularly with USA having a holiday on Monday. I will reboot with the custom kernel on Tuesday morning though, and then hopefully the compress/crypto testsuites on QAT will be unblocked. Thanks, the guidance is greatly appreciated.

Best,
Patrick

--
Patrick Robb
Technical Service Manager
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
www.iol.unh.edu<http://www.iol.unh.edu/>

[-- Attachment #2: Type: text/html, Size: 9634 bytes --]

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-09-25 15:19 ` Ruifeng Wang
@ 2023-10-09 16:34 ` Patrick Robb
  2023-10-10 2:28 ` Patrick Robb
  0 siblings, 1 reply; 31+ messages in thread
From: Patrick Robb @ 2023-10-09 16:34 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: Juraj Linkeš, Dharmik Jayesh Thakkar, Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 327 bytes --]

Hi Ruifeng, Dharmik,

I'm just bumping this so we can come up with a plan to go forward. And again, I am wondering did you all build your custom kernel with the qat_c62x driver statically built in (like I did), or added as a loadable driver? I think that's one of the few ways our test beds could be different.

Best,
Patrick

[-- Attachment #2: Type: text/html, Size: 462 bytes --]

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-10-09 16:34 ` Patrick Robb
@ 2023-10-10 2:28 ` Patrick Robb
  2023-10-10 3:55 ` Dharmik Jayesh Thakkar
  0 siblings, 1 reply; 31+ messages in thread
From: Patrick Robb @ 2023-10-10 2:28 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: Juraj Linkeš, Dharmik Jayesh Thakkar, Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 465 bytes --]

Also I am just now thinking I probably should have provided dpdk-devbind.py output:

probb@arm-ampere-dut:/tmp/dpdk/usertools$ dpdk-devbind.py --status

Crypto devices using kernel driver
==================================
0000:03:00.0 'C62x Chipset QuickAssist Technology 37c8' drv=c6xx unused=vfio-pci
0000:04:00.0 'C62x Chipset QuickAssist Technology 37c8' drv=c6xx unused=vfio-pci
0000:05:00.0 'C62x Chipset QuickAssist Technology 37c8' drv=c6xx unused=vfio-pci

[-- Attachment #2: Type: text/html, Size: 551 bytes --]

^ permalink raw reply [flat|nested] 31+ messages in thread
* RE: Intel QAT 8970 accel card on ARM Ampere Server
  2023-10-10 2:28 ` Patrick Robb
@ 2023-10-10 3:55 ` Dharmik Jayesh Thakkar
  2023-10-10 7:25 ` David Marchand
  2023-10-10 15:59 ` Patrick Robb
  0 siblings, 2 replies; 31+ messages in thread
From: Dharmik Jayesh Thakkar @ 2023-10-10 3:55 UTC (permalink / raw)
  To: Patrick Robb, Ruifeng Wang
  Cc: Juraj Linkeš, Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 2963 bytes --]

Hi Patrick,

Can you provide the grub settings? Is iommu.passthrough=1 included?

Also, is qat_c62xvf loaded as well?

Finally, a few guidelines on the vfio driver:

At times, we need to configure the vfio driver.

On kernel vers. 5.9+ we need to load the vfio-pci driver with the additional parameter disable_denylist=1

Unload the vfio-pci driver if it is already loaded so that we can reload it with the correct parameters :
sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio

If you can't unload the vfio driver because it's been built into the kernel, you'll have to find another way to change VFIO parameters, or to rebuild your kernel with VFIO_PCI set as a module. Failing to do that, you might encounter issues later on when you try to bind the VFs to VFIO.

Load the vfio-pci driver and bind it to QAT VFs device ids:
sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9

Enable no-iommu-mode:
echo "1" | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode

/sys/module/vfio/parameter is missing ?

If /sys/module/vfio/parameters does not exist, you might be missing the kernel module VFIO_NOIOMMU

Automatically set VFIO params on boot

It's possible to set these parameters automatically on boot by creating a /etc/modprobe.d/vfio-pci.conf file with the parameters :
cat /etc/modprobe.d/vfio-pci.conf
options vfio enable_unsafe_noiommu_mode=1
options vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9

We haven’t encountered this issue in the past, so just making sure the configuration is correct. I don’t think having the driver static/loadable should make a difference, I will try with building statically on my setup.

Thank you!

From: Patrick Robb <probb@iol.unh.edu>
Sent: Monday, October 9, 2023 9:29 PM
To: Ruifeng Wang <Ruifeng.Wang@arm.com>
Cc: Juraj Linkeš <juraj.linkes@pantheon.tech>; Dharmik Jayesh Thakkar <DharmikJayesh.Thakkar@arm.com>; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; ci@dpdk.org; nd <nd@arm.com>
Subject: Re: Intel QAT 8970 accel card on ARM Ampere Server

Also I am just now thinking I probably should have provided dpdk-devbind.py output:

probb@arm-ampere-dut:/tmp/dpdk/usertools$ dpdk-devbind.py --status

Crypto devices using kernel driver
==================================
0000:03:00.0 'C62x Chipset QuickAssist Technology 37c8' drv=c6xx unused=vfio-pci
0000:04:00.0 'C62x Chipset QuickAssist Technology 37c8' drv=c6xx unused=vfio-pci
0000:05:00.0 'C62x Chipset QuickAssist Technology 37c8' drv=c6xx unused=vfio-pci

IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

[-- Attachment #2: Type: text/html, Size: 5861 bytes --]

^ permalink raw reply [flat|nested] 31+ messages in thread
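[Editor's sketch, not part of the thread.] The guidelines above hinge on whether /sys/module/vfio/parameters/enable_unsafe_noiommu_mode exists. A small check along those lines is sketched below; the helper name check_vfio_noiommu and its output strings are made up for illustration, and the sysfs root is a parameter only so the logic can be exercised against a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: report the state of the vfio no-IOMMU knob.
# $1 is the module sysfs root, normally /sys/module.
check_vfio_noiommu() {
    mod_dir="${1:-/sys/module}"
    param="$mod_dir/vfio/parameters/enable_unsafe_noiommu_mode"

    if [ ! -d "$mod_dir/vfio" ]; then
        # vfio is not loaded, or is built in without any parameters.
        echo "vfio-not-visible"
    elif [ ! -e "$param" ]; then
        # Per the thread: the kernel was likely built without VFIO_NOIOMMU.
        echo "noiommu-unavailable"
    else
        # Print the current value of the parameter as the kernel reports it.
        echo "noiommu=$(cat "$param")"
    fi
}
```

Running `check_vfio_noiommu` with no argument inspects the live system; only the last case means the `echo "1" | sudo tee ...` step from the message can work.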
* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-10-10 3:55 ` Dharmik Jayesh Thakkar
@ 2023-10-10 7:25 ` David Marchand
  2023-10-10 15:03 ` Dharmik Jayesh Thakkar
  2023-10-10 15:59 ` Patrick Robb
  1 sibling, 1 reply; 31+ messages in thread
From: David Marchand @ 2023-10-10 7:25 UTC (permalink / raw)
  To: Dharmik Jayesh Thakkar
  Cc: Patrick Robb, Ruifeng Wang, Juraj Linkeš, Honnappa Nagarahalli, ci, nd, Thomas Monjalon, Maxime Coquelin

Hello,

On Tue, Oct 10, 2023 at 5:56 AM Dharmik Jayesh Thakkar <DharmikJayesh.Thakkar@arm.com> wrote:
>
> Hi Patrick,
>
> Can you provide the grub settings? Is iommu.passthrough=1 included?
>
> Also, is qat_c62xvf loaded as well?
>
> Finally, a few guidelines on the vfio driver:
>
> At times, we need to configure the vfio driver.
>
> On kernel vers. 5.9+ we need to load the vfio-pci driver with the additional parameter disable_denylist=1

o_O
I did not know this option, but it scares me a bit, reading its description.
Could you please elaborate why this is needed?

>
> Unload the vfio-pci driver if it is already loaded so that we can reload it with the correct parameters :
> sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio
>
> If you can't unload the vfio driver because it's been built into the kernel, you'll have to find another way to change VFIO parameters, or to rebuild your kernel with VFIO_PCI set as a module. Failing to do that, you might encounter issues later on when you try to bind the VFs to VFIO.
>
> Load the vfio-pci driver and bind it to QAT VFs device ids:
> sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9
>
> Enable no-iommu-mode:
> echo "1" | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
>
> /sys/module/vfio/parameter is missing ?
>
> If /sys/module/vfio/parameters does not exist, you might be missing the kernel module VFIO_NOIOMMU
>
> Automatically set VFIO params on boot
>
> It's possible to set these parameters automatically on boot by creating a /etc/modprobe.d/vfio-pci.conf file with the parameters :
> cat /etc/modprobe.d/vfio-pci.conf
> options vfio enable_unsafe_noiommu_mode=1
> options vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9
>
> We haven’t encountered this issue in the past, so just making sure the configuration is correct. I don’t think having the driver static/loadable should make a difference, I will try with building statically on my setup.

--
David Marchand

^ permalink raw reply [flat|nested] 31+ messages in thread
* RE: Intel QAT 8970 accel card on ARM Ampere Server
  2023-10-10 7:25 ` David Marchand
@ 2023-10-10 15:03 ` Dharmik Jayesh Thakkar
  2023-10-10 15:12 ` David Marchand
  0 siblings, 1 reply; 31+ messages in thread
From: Dharmik Jayesh Thakkar @ 2023-10-10 15:03 UTC (permalink / raw)
  To: David Marchand
  Cc: Patrick Robb, Ruifeng Wang, Juraj Linkeš, Honnappa Nagarahalli, ci, nd, thomas, Maxime Coquelin

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Tuesday, October 10, 2023 2:26 AM
> To: Dharmik Jayesh Thakkar <DharmikJayesh.Thakkar@arm.com>
> Cc: Patrick Robb <probb@iol.unh.edu>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; Juraj Linkeš <juraj.linkes@pantheon.tech>;
> Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; ci@dpdk.org; nd
> <nd@arm.com>; thomas@monjalon.net; Maxime Coquelin
> <maxime.coquelin@redhat.com>
> Subject: Re: Intel QAT 8970 accel card on ARM Ampere Server
>
> Hello,
>
> On Tue, Oct 10, 2023 at 5:56 AM Dharmik Jayesh Thakkar
> <DharmikJayesh.Thakkar@arm.com> wrote:
> >
> > Hi Patrick,
> >
> > Can you provide the grub settings? Is iommu.passthrough=1 included?
> >
> > Also, is qat_c62xvf loaded as well?
> >
> > Finally, a few guidelines on the vfio driver:
> >
> > At times, we need to configure the vfio driver.
> >
> > On kernel vers. 5.9+ we need to load the vfio-pci driver with the
> > additional parameter disable_denylist=1
>
> o_O
> I did not know this option, but it scares me a bit, reading its description.
> Could you please elaborate why this is needed?
>

Details for adding QAT to denylist provided in the below commit:
https://github.com/torvalds/linux/commit/50173329c8cc0c892eaa7a9d0f0692ac39cd7b04

> >
> > Unload the vfio-pci driver if it is already loaded so that we can reload it with
> the correct parameters :
> > sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo
> > modprobe -r vfio_virqfd; sudo modprobe -r vfio
> >
> > If you can't unload the vfio driver because it's been built into the kernel,
> you'll have to find another way to change VFIO parameters, or to rebuild your
> kernel with VFIO_PCI set as a module. Failing to do that, you might encounter
> issues later on when you try to bind the VFs to VFIO.
> >
> > Load the vfio-pci driver and bind it to QAT VFs device ids:
> > sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1
> > vfio-pci.ids=8086:37c9
> >
> > Enable no-iommu-mode:
> > echo "1" | sudo tee
> > /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
> >
> > /sys/module/vfio/parameter is missing ?
> >
> > If /sys/module/vfio/parameters does not exist, you might be missing
> > the kernel module VFIO_NOIOMMU
> >
> > Automatically set VFIO params on boot
> >
> > It's possible to set these parameters automatically on boot by creating a
> /etc/modprobe.d/vfio-pci.conf file with the parameters :
> > cat /etc/modprobe.d/vfio-pci.conf
> > options vfio enable_unsafe_noiommu_mode=1 options vfio-pci
> > disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9
> >
> > We haven’t encountered this issue in the past, so just making sure the
> configuration is correct. I don’t think having the driver static/loadable should
> make a difference, I will try with building statically on my setup.
>
>
> --
> David Marchand

IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-10-10 15:03 ` Dharmik Jayesh Thakkar
@ 2023-10-10 15:12 ` David Marchand
  0 siblings, 0 replies; 31+ messages in thread
From: David Marchand @ 2023-10-10 15:12 UTC (permalink / raw)
  To: Dharmik Jayesh Thakkar, Patrick Robb
  Cc: Ruifeng Wang, Juraj Linkeš, Honnappa Nagarahalli, ci, nd, thomas, Maxime Coquelin

On Tue, Oct 10, 2023 at 5:03 PM Dharmik Jayesh Thakkar <DharmikJayesh.Thakkar@arm.com> wrote:
> > > On kernel vers. 5.9+ we need to load the vfio-pci driver with the
> > > additional parameter disable_denylist=1
> >
> > o_O
> > I did not know this option, but it scares me a bit, reading its description.
> > Could you please elaborate why this is needed?
> >
>
> Details for adding QAT to denylist provided in the below commit:
> https://github.com/torvalds/linux/commit/50173329c8cc0c892eaa7a9d0f0692ac39cd7b04

Dharmik,

Ok, thanks.
That matches what I found in the qat documentation:
http://doc.dpdk.org/guides/cryptodevs/qat.html#binding-the-available-vfs-to-the-vfio-pci-driver

Patrick,

Sorry for jumping in this thread, but to be clear, this disable_denylist option is really specific to this model of quickassist crypto card.
It must not be enabled in other setups using vfio.

--
David Marchand

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Intel QAT 8970 accel card on ARM Ampere Server
  2023-10-10 3:55 ` Dharmik Jayesh Thakkar
  2023-10-10 7:25 ` David Marchand
@ 2023-10-10 15:59 ` Patrick Robb
  2023-10-10 21:50 ` Dharmik Jayesh Thakkar
  ` (2 more replies)
  1 sibling, 3 replies; 31+ messages in thread
From: Patrick Robb @ 2023-10-10 15:59 UTC (permalink / raw)
  To: Dharmik Jayesh Thakkar, David Marchand
  Cc: Ruifeng Wang, Juraj Linkeš, Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 3710 bytes --]

On Mon, Oct 9, 2023 at 11:56 PM Dharmik Jayesh Thakkar <
DharmikJayesh.Thakkar@arm.com> wrote:

> Hi Patrick,
>
> Can you provide the grub settings? Is iommu.passthrough=1 included?
>

Sure. I'm not sure if you just wanted the kernel cmdline options or the whole grub config, but I assume you just meant kernel cmdline. Let me know if you meant more.

GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=1G hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=39-79 nohz_full=39-79 rcu_nocbs=39-79 processor.max_cstate=1 intel_pstate=disable console=ttyS0,115200 console=tty0"

But, iommu.passthrough=1 is not included, so I can add that if we need to. Do you know that this won't have any bad implications for the (intel, nvidia, broadcom) NICs which we test on this server?

>
> Also, is qat_c62xvf loaded as well?
>

qat_c62xvf is built in to the kernel also.

>
> Finally, a few guidelines on the vfio driver:
>
> At times, we need to configure the vfio driver.
>
> On kernel vers. 5.9+ we need to load the vfio-pci driver with the
> additional parameter *disable_denylist=1*
>
> Unload the vfio-pci driver if it is already loaded so that we can reload
> it with the correct parameters :
> *sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo
> modprobe -r vfio_virqfd; sudo modprobe -r vfio*
>
> If you can't unload the vfio driver because it's been built into the
> kernel, you'll have to find another way to change VFIO parameters, or to
> rebuild your kernel with VFIO_PCI set as a module. Failing to do that, you
> might encounter issues later on when you try to bind the VFs to VFIO.
>
> Load the vfio-pci driver and bind it to QAT VFs device ids:
> *sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1
> vfio-pci.ids=8086:37c9*
>
> Enable no-iommu-mode:
> *echo "1" | sudo tee
> /sys/module/vfio/parameters/enable_unsafe_noiommu_mode*
>
> /sys/module/vfio/parameter is missing ?
>
> If /sys/module/vfio/parameters does not exist, you might be missing the
> kernel module VFIO_NOIOMMU
>
> *Automatically set VFIO params on boot*
>
> It's possible to set these parameters automatically on boot by creating a
> */etc/modprobe.d/vfio-pci.conf *file with the parameters :
> *cat /etc/modprobe.d/vfio-pci.conf*
> *options vfio enable_unsafe_noiommu_mode=1*
> *options vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9*
>
> We haven’t encountered this issue in the past, so just making sure the
> configuration is correct. I don’t think having the driver static/loadable
> should make a difference, I will try with building statically on my setup.
>
> Thank you!
>

Okay, this should be fine. Like I said, we are also running tests on NICs on this server. So, in our Jenkinsfiles scripts for running the testing, I will add a preliminary step only for QAT tests which runs:

*sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio*
*sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9*
*echo "1" | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode*
(then run QAT tests)

And if running on NICs, have a preliminary step which runs

*sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio*
*sudo modprobe vfio*

David does this also sound reasonable to you, per your comment about isolating this setting to QAT card testing?

Dharmik if this all sounds okay and you can confirm the iommu.passthrough change is fine, I will proceed. Thank you for providing the assistance.

[-- Attachment #2: Type: text/html, Size: 6210 bytes --]

^ permalink raw reply [flat|nested] 31+ messages in thread
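[Editor's sketch, not part of the thread.] The two preliminary steps described above differ only in what gets loaded after the vfio stack is removed. One way to keep them from drifting apart is a single dry-run helper that prints the commands for a given target, so that disable_denylist=1 appears only on the QAT path. The function name vfio_prestep and the qat/nic target names are hypothetical; the commands themselves are copied from the message as-is and are not validated here.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints (does not run) the vfio commands
# for a test target. Usage: vfio_prestep qat|nic
vfio_prestep() {
    # Reject unknown targets before printing anything.
    case "$1" in
        qat|nic) ;;
        *) echo "unknown target: $1" >&2; return 1 ;;
    esac

    # Common to both paths: unload the vfio stack so it can be reloaded.
    for mod in vfio_iommu_type1 vfio_pci vfio_virqfd vfio; do
        echo "sudo modprobe -r $mod"
    done

    if [ "$1" = "qat" ]; then
        # QAT only: the denylist override must not leak into NIC runs.
        echo "sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9"
        echo "echo 1 | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode"
    else
        echo "sudo modprobe vfio"
    fi
}
```

A pipeline stage could review the output of `vfio_prestep qat` in its log and then pipe it to `sh` to execute it; printing first keeps the privileged commands visible in the job history.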
* RE: Intel QAT 8970 accel card on ARM Ampere Server
  2023-10-10 15:59 ` Patrick Robb
@ 2023-10-10 21:50 ` Dharmik Jayesh Thakkar
  2023-10-11 8:14 ` Juraj Linkeš
  2023-10-11 11:51 ` David Marchand
  2 siblings, 0 replies; 31+ messages in thread
From: Dharmik Jayesh Thakkar @ 2023-10-10 21:50 UTC (permalink / raw)
  To: Patrick Robb, David Marchand
  Cc: Ruifeng Wang, Juraj Linkeš, Honnappa Nagarahalli, ci, nd

[-- Attachment #1: Type: text/plain, Size: 4469 bytes --]

Thank you for the details, Patrick! Yeah can you please update the grub and vfio settings and see if it works. I don’t think it should have any implications on other NICs.

From: Patrick Robb <probb@iol.unh.edu>
Sent: Tuesday, October 10, 2023 11:00 AM
To: Dharmik Jayesh Thakkar <DharmikJayesh.Thakkar@arm.com>; David Marchand <david.marchand@redhat.com>
Cc: Ruifeng Wang <Ruifeng.Wang@arm.com>; Juraj Linkeš <juraj.linkes@pantheon.tech>; Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; ci@dpdk.org; nd <nd@arm.com>
Subject: Re: Intel QAT 8970 accel card on ARM Ampere Server

On Mon, Oct 9, 2023 at 11:56 PM Dharmik Jayesh Thakkar <DharmikJayesh.Thakkar@arm.com<mailto:DharmikJayesh.Thakkar@arm.com>> wrote:

Hi Patrick,

Can you provide the grub settings? Is iommu.passthrough=1 included?

Sure. I'm not sure if you just wanted the kernel cmdline options or the whole grub config, but I assume you just meant kernel cmdline. Let me know if you meant more.

GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=1G hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=39-79 nohz_full=39-79 rcu_nocbs=39-79 processor.max_cstate=1 intel_pstate=disable console=ttyS0,115200 console=tty0"

But, iommu.passthrough=1 is not included, so I can add that if we need to. Do you know that this won't have any bad implications for the (intel, nvidia, broadcom) NICs which we test on this server?

Also, is qat_c62xvf loaded as well?

qat_c62xvf is built in to the kernel also.

Finally, a few guidelines on the vfio driver:

At times, we need to configure the vfio driver.

On kernel vers. 5.9+ we need to load the vfio-pci driver with the additional parameter disable_denylist=1

Unload the vfio-pci driver if it is already loaded so that we can reload it with the correct parameters :
sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio

If you can't unload the vfio driver because it's been built into the kernel, you'll have to find another way to change VFIO parameters, or to rebuild your kernel with VFIO_PCI set as a module. Failing to do that, you might encounter issues later on when you try to bind the VFs to VFIO.

Load the vfio-pci driver and bind it to QAT VFs device ids:
sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9

Enable no-iommu-mode:
echo "1" | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode

/sys/module/vfio/parameter is missing ?

If /sys/module/vfio/parameters does not exist, you might be missing the kernel module VFIO_NOIOMMU

Automatically set VFIO params on boot

It's possible to set these parameters automatically on boot by creating a /etc/modprobe.d/vfio-pci.conf file with the parameters :
cat /etc/modprobe.d/vfio-pci.conf
options vfio enable_unsafe_noiommu_mode=1
options vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9

We haven’t encountered this issue in the past, so just making sure the configuration is correct. I don’t think having the driver static/loadable should make a difference, I will try with building statically on my setup.

Thank you!

Okay, this should be fine. Like I said, we are also running tests on NICs on this server. So, in our Jenkinsfiles scripts for running the testing, I will add a preliminary step only for QAT tests which runs:

sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio
sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9
echo "1" | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
(then run QAT tests)

And if running on NICs, have a preliminary step which runs

sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio
sudo modprobe vfio

David does this also sound reasonable to you, per your comment about isolating this setting to QAT card testing?

Dharmik if this all sounds okay and you can confirm the iommu.passthrough change is fine, I will proceed. Thank you for providing the assistance.

IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

[-- Attachment #2: Type: text/html, Size: 10876 bytes --]

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Intel QAT 8970 accel card on ARM Ampere Server 2023-10-10 15:59 ` Patrick Robb 2023-10-10 21:50 ` Dharmik Jayesh Thakkar @ 2023-10-11 8:14 ` Juraj Linkeš 2023-10-11 20:13 ` Patrick Robb 2023-10-11 11:51 ` David Marchand 2 siblings, 1 reply; 31+ messages in thread From: Juraj Linkeš @ 2023-10-11 8:14 UTC (permalink / raw) To: Patrick Robb Cc: Dharmik Jayesh Thakkar, David Marchand, Ruifeng Wang, Honnappa Nagarahalli, ci, nd [-- Attachment #1: Type: text/plain, Size: 4518 bytes --] On Tue, Oct 10, 2023 at 5:59 PM Patrick Robb <probb@iol.unh.edu> wrote: > > > On Mon, Oct 9, 2023 at 11:56 PM Dharmik Jayesh Thakkar < > DharmikJayesh.Thakkar@arm.com> wrote: > >> Hi Patrick, >> >> >> >> Can you provide the grub settings? Is iommu.passthrough=1 included? >> > > Sure. I'm not sure if you just wanted the kernel cmdline options or the > whole grub config, but I assume you just meant kernel cmdline. Let me know > if you meant more. > > GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=1G hugepagesz=1G > hugepages=32 iommu=pt intel_iommu=on isolcpus=39-79 nohz_full=39-79 > rcu_nocbs=39-79 processor.max_cstate=1 intel_pstate=disable > console=ttyS0,115200 console=tty0" > > But, iommu.passthrough=1 is not included, so I can add that if we need to. > Do you know that this won't have any bad implications for the (intel, > nvidia, broadcom) NICs which we test on this server? > > Just a note here, Patrick. The iommu kernel and intel_pstate parameters aren't supported on arm, so you can remove those. And when iommu.passthrouh=1, IOMMU is bypassed and intel_iommu doesn't do anything (and maybe isn't supported on arm, but that's not clear from the docs <https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt>), so that can be removed as well. From what I can tell, using iommu.passthrough=1 is the standard, so if there are any negative implications, we should investigate them, but there shouldn't be anything major. > >> >> Also, is qat_c62xvf loaded as well? 
>> > qat_c62xvf is built in to the kernel also. > > >> >> > >> >> Finally, a few guidelines on the vfio driver: >> >> At times, we need to configure the vfio driver. >> >> On kernel vers. 5.9+ we need to load the vfio-pci driver with the >> additional parameter *disable_denylist=1* >> >> Unload the vfio-pci driver if it is already loaded so that we can reload >> it with the correct parameters : >> *sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo >> modprobe -r vfio_virqfd; sudo modprobe -r vfio* >> >> If you can't unload the vfio driver because it's been built into the >> kernel, you'll have to find another way to change VFIO parameters, or to >> rebuild your kernel with VFIO_PCI set as a module. Failing to do that, you >> might encounter issues later on when you try to bind the VFs to VFIO. >> >> Load the vfio-pci driver and bind it to QAT VFs device ids: >> *sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1 >> vfio-pci.ids=8086:37c9* >> >> Enable no-iommu-mode: >> *echo "1" | sudo tee >> /sys/module/vfio/parameters/enable_unsafe_noiommu_mode* >> >> /sys/module/vfio/parameter is missing ? >> >> If /sys/module/vfio/parameters does not exist, you might be missing the >> kernel module VFIO_NOIOMMU >> >> >> >> *Automatically set VFIO params on boot* >> >> It's possible to set these parameters automatically on boot by creating a >> */etc/modprobe.d/vfio-pci.conf *file with the parameters : >> *cat /etc/modprobe.d/vfio-pci.conf* >> *options vfio enable_unsafe_noiommu_mode=1* >> *options vfio-pci disable_denylist=1 enable_sriov=1 >> vfio-pci.ids=8086:37c9* >> >> >> >> We haven’t encountered this issue in the past, so just making sure the >> configuration is correct. I don’t think having the driver static/loadable >> should make a difference, I will try with building statically on my setup. >> >> >> >> Thank you! >> >> >> Okay, this should be fine. Like I said, we are also running tests on NICs > on this server. 
So, in our Jenkinsfiles scripts for running the testing, I > will add a preliminary step only for QAT tests which runs: > *sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo > modprobe -r vfio_virqfd; sudo modprobe -r vfio* > *sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1 > vfio-pci.ids=8086:37c9* > *echo "1" | sudo tee > /sys/module/vfio/parameters/enable_unsafe_noiommu_mode* > (then run QAT tests) > > And if running on NICs, have a preliminary step which runs > *sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo > modprobe -r vfio_virqfd; sudo modprobe -r vfio* > *sudo modprobe vfio* > > David does this also sound reasonable to you, per your comment about > isolating this setting to QAT card testing? > > Dharmik if this all sounds okay and you can confirm the iommu.passthrough > change is fine, I will proceed. Thank you for providing the assistance. > > [-- Attachment #2: Type: text/html, Size: 7019 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Intel QAT 8970 accel card on ARM Ampere Server 2023-10-11 8:14 ` Juraj Linkeš @ 2023-10-11 20:13 ` Patrick Robb 2023-11-02 22:00 ` Patrick Robb 0 siblings, 1 reply; 31+ messages in thread From: Patrick Robb @ 2023-10-11 20:13 UTC (permalink / raw) To: Juraj Linkeš Cc: Dharmik Jayesh Thakkar, David Marchand, Ruifeng Wang, Honnappa Nagarahalli, ci, nd [-- Attachment #1: Type: text/plain, Size: 2312 bytes --] On Wed, Oct 11, 2023 at 4:14 AM Juraj Linkeš <juraj.linkes@pantheon.tech> wrote: > > > On Tue, Oct 10, 2023 at 5:59 PM Patrick Robb <probb@iol.unh.edu> wrote: > >> >> >> On Mon, Oct 9, 2023 at 11:56 PM Dharmik Jayesh Thakkar < >> DharmikJayesh.Thakkar@arm.com> wrote: >> >>> Hi Patrick, >>> >>> >>> >>> Can you provide the grub settings? Is iommu.passthrough=1 included? >>> >> >> Sure. I'm not sure if you just wanted the kernel cmdline options or the >> whole grub config, but I assume you just meant kernel cmdline. Let me know >> if you meant more. >> >> GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=1G hugepagesz=1G >> hugepages=32 iommu=pt intel_iommu=on isolcpus=39-79 nohz_full=39-79 >> rcu_nocbs=39-79 processor.max_cstate=1 intel_pstate=disable >> console=ttyS0,115200 console=tty0" >> >> But, iommu.passthrough=1 is not included, so I can add that if we need >> to. Do you know that this won't have any bad implications for the (intel, >> nvidia, broadcom) NICs which we test on this server? >> >> > > Just a note here, Patrick. The iommu kernel and intel_pstate parameters > aren't supported on arm, so you can remove those. And when > iommu.passthrouh=1, IOMMU is bypassed and intel_iommu doesn't do anything > (and maybe isn't supported on arm, but that's not clear from the docs > <https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt>), > so that can be removed as well. > Thanks Dharmik and Juraj. 
Updated kernel cmdline args:

BOOT_IMAGE=/vmlinuz-5.15.82+ root=/dev/mapper/ubuntu--vg--1-ubuntu--lv ro default_hugepagesz=1G hugepagesz=1G hugepages=32 isolcpus=39-79 nohz_full=39-79 rcu_nocbs=39-79 processor.max_cstate=1 iommu.passthrough=1 console=ttyS0,115200 console=tty0

I added the iommu.passthrough option and tried again, to no avail. FYI I am still using the guidance here: https://doc.dpdk.org/guides/cryptodevs/qat.html along with your added steps.

root@arm-ampere-dut:~# echo 16 > /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs
Segmentation fault (core dumped)

As you know the above setting of the 48 VFs is a prerequisite to binding the VFs to vfio-pci. But, I did run through loading the custom vfio and there were no issues, so once we clear this initial hurdle we should be fine.

[-- Attachment #2: Type: text/html, Size: 3569 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
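The cmdline change discussed above (dropping the parameters Juraj flagged as not applicable on arm and adding iommu.passthrough=1) can be sketched roughly as below. This is only an illustration: the helper name arm_cmdline and the X86_ONLY_KEYS set are assumptions made for the example, taken from Juraj's note in this thread, and are not part of DTS or the lab scripts.

```python
# Hypothetical helper, not from DTS or the Community Lab tooling.
# Keys listed here follow Juraj's note in this thread (iommu=pt,
# intel_iommu=on and intel_pstate=disable were reported as not useful on arm).
X86_ONLY_KEYS = {"iommu", "intel_iommu", "intel_pstate"}


def arm_cmdline(cmdline):
    """Drop the flagged parameters and make sure iommu.passthrough=1 is set once."""
    kept = []
    for token in cmdline.split():
        key = token.split("=", 1)[0]
        # Skip flagged keys, and any existing iommu.passthrough so it is not duplicated.
        if key in X86_ONLY_KEYS or key == "iommu.passthrough":
            continue
        kept.append(token)
    kept.append("iommu.passthrough=1")
    return " ".join(kept)


if __name__ == "__main__":
    old = ("default_hugepagesz=1G hugepagesz=1G hugepages=32 iommu=pt "
           "intel_iommu=on isolcpus=39-79 intel_pstate=disable console=tty0")
    print(arm_cmdline(old))
```

The result would still need to be placed in GRUB_CMDLINE_LINUX_DEFAULT and the grub config regenerated by hand; the sketch only covers the string handling.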
* Re: Intel QAT 8970 accel card on ARM Ampere Server 2023-10-11 20:13 ` Patrick Robb @ 2023-11-02 22:00 ` Patrick Robb 2023-11-14 7:34 ` Ruifeng Wang 0 siblings, 1 reply; 31+ messages in thread From: Patrick Robb @ 2023-11-02 22:00 UTC (permalink / raw) To: Juraj Linkeš Cc: Dharmik Jayesh Thakkar, David Marchand, Ruifeng Wang, Honnappa Nagarahalli, ci, nd [-- Attachment #1: Type: text/plain, Size: 700 bytes --]

On Wed, Oct 11, 2023 at 4:13 PM Patrick Robb <probb@iol.unh.edu> wrote:
>
> root@arm-ampere-dut:~# echo 16 > /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs
> Segmentation fault (core dumped)
>

Hi Aaron,

Thanks for offering to take a look. I'm not sure whether you've already seen the rest of this conversation on the ci mailing list, but modinfo looks good for qat_c62x and qat_c62xvf after the custom kernel was built. From there, it should be possible to bind some VFs for each PF on the QAT card, per the documentation here https://doc.dpdk.org/guides/cryptodevs/qat.html but it results in a seg fault like you see above. Let me know if you have any ideas.

[-- Attachment #2: Type: text/html, Size: 1160 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Intel QAT 8970 accel card on ARM Ampere Server 2023-11-02 22:00 ` Patrick Robb @ 2023-11-14 7:34 ` Ruifeng Wang 2023-11-14 14:36 ` Patrick Robb 2024-02-27 6:58 ` Patrick Robb 0 siblings, 2 replies; 31+ messages in thread From: Ruifeng Wang @ 2023-11-14 7:34 UTC (permalink / raw) To: Patrick Robb, Juraj Linkeš Cc: Dharmik Jayesh Thakkar, David Marchand, Honnappa Nagarahalli, ci, nd [-- Attachment #1: Type: text/plain, Size: 1488 bytes --] Hi Patrick, It seems kernel v5.15 has a defect on this. A similar issue was fixed by commit: 40da865381ad ("crypto: qat - remove unneeded packed attribute") Could you patch the kernel and try again? https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=40da865381ad061ab75a7a9da469ed4e623bdfeb Thanks, Ruifeng From: Patrick Robb <probb@iol.unh.edu> Date: Friday, November 3, 2023 at 6:01 AM To: Juraj Linkeš <juraj.linkes@pantheon.tech> Cc: Dharmik Jayesh Thakkar <DharmikJayesh.Thakkar@arm.com>, David Marchand <david.marchand@redhat.com>, Ruifeng Wang <Ruifeng.Wang@arm.com>, Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>, ci@dpdk.org <ci@dpdk.org>, nd <nd@arm.com> Subject: Re: Intel QAT 8970 accel card on ARM Ampere Server On Wed, Oct 11, 2023 at 4:13 PM Patrick Robb <probb@iol.unh.edu<mailto:probb@iol.unh.edu>> wrote: root@arm-ampere-dut:~# echo 16 > /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs Segmentation fault (core dumped) Hi Aaron, Thanks for offering to take a look. I'm not sure if you've seen the rest of this conversation already from it being on the ci mailing list or not, but modinfo looks good for qat_c62x andqat_c62xvf after the custom kernel was built. From there, it should be possible to bind some VFs for each PF on the QAT card, per documentation here https://doc.dpdk.org/guides/cryptodevs/qat.html but it results in a seg fault like you see above. Let me know if you have any ideas. 
* Re: Intel QAT 8970 accel card on ARM Ampere Server 2023-11-14 7:34 ` Ruifeng Wang @ 2023-11-14 14:36 ` Patrick Robb 2024-02-27 6:58 ` Patrick Robb 1 sibling, 0 replies; 31+ messages in thread From: Patrick Robb @ 2023-11-14 14:36 UTC (permalink / raw) To: Ruifeng Wang Cc: Juraj Linkeš, Dharmik Jayesh Thakkar, David Marchand, Honnappa Nagarahalli, ci, nd, Aaron Conole [-- Attachment #1: Type: text/plain, Size: 2086 bytes --] Hi Ruifeng, Okay, thanks for the update. I'll build a new kernel just like before, but with this patch added too. And, I know it shouldn't matter, but I'll avoid statically building in the qat modules this go around. Thanks, Patrick On Tue, Nov 14, 2023 at 2:35 AM Ruifeng Wang <Ruifeng.Wang@arm.com> wrote: > Hi Patrick, > > > > It seems kernel v5.15 has a defect on this. A similar issue was fixed by > commit: > > 40da865381ad ("crypto: qat - remove unneeded packed attribute") > > > > Could you patch the kernel and try again? > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=40da865381ad061ab75a7a9da469ed4e623bdfeb > > > > Thanks, > > Ruifeng > > > > *From: *Patrick Robb <probb@iol.unh.edu> > *Date: *Friday, November 3, 2023 at 6:01 AM > *To: *Juraj Linkeš <juraj.linkes@pantheon.tech> > *Cc: *Dharmik Jayesh Thakkar <DharmikJayesh.Thakkar@arm.com>, David > Marchand <david.marchand@redhat.com>, Ruifeng Wang <Ruifeng.Wang@arm.com>, > Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>, ci@dpdk.org < > ci@dpdk.org>, nd <nd@arm.com> > *Subject: *Re: Intel QAT 8970 accel card on ARM Ampere Server > > On Wed, Oct 11, 2023 at 4:13 PM Patrick Robb <probb@iol.unh.edu> wrote: > > > > root@arm-ampere-dut:~# echo 16 > > /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs > Segmentation fault (core dumped) > > > > Hi Aaron, > > > > Thanks for offering to take a look. 
I'm not sure if you've seen the rest > of this conversation already from it being on the ci mailing list or not, > but modinfo looks good for qat_c62x andqat_c62xvf after the custom kernel > was built. From there, it should be possible to bind some VFs for each PF > on the QAT card, per documentation here > https://doc.dpdk.org/guides/cryptodevs/qat.html but it results in a seg > fault like you see above. Let me know if you have any ideas. > -- Patrick Robb Technical Service Manager UNH InterOperability Laboratory 21 Madbury Rd, Suite 100, Durham, NH 03824 www.iol.unh.edu [-- Attachment #2: Type: text/html, Size: 7488 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Intel QAT 8970 accel card on ARM Ampere Server 2023-11-14 7:34 ` Ruifeng Wang 2023-11-14 14:36 ` Patrick Robb @ 2024-02-27 6:58 ` Patrick Robb 2024-02-27 13:50 ` Honnappa Nagarahalli 1 sibling, 1 reply; 31+ messages in thread From: Patrick Robb @ 2024-02-27 6:58 UTC (permalink / raw) To: Ruifeng Wang Cc: Juraj Linkeš, Dharmik Jayesh Thakkar, David Marchand, Honnappa Nagarahalli, ci, nd [-- Attachment #1: Type: text/plain, Size: 2169 bytes --]

On Tue, Nov 14, 2023 at 2:35 AM Ruifeng Wang <Ruifeng.Wang@arm.com> wrote:
> Hi Patrick,
>
> It seems kernel v5.15 has a defect on this. A similar issue was fixed by commit:
>
> 40da865381ad ("crypto: qat - remove unneeded packed attribute")
>
> Could you patch the kernel and try again?
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=40da865381ad061ab75a7a9da469ed4e623bdfeb
>
> Thanks,
> Ruifeng
>

Hi Ruifeng,

Sorry for the delay on this - there has been a work item backlog at the Community Lab we've been working through.

I did rebuild the kernel today with the changes from the commit (or similar, as the commit above was for the qat_common file in a different state, but I tried to remain as true to the commit as possible).

And that does seem to have resolved the seg fault problem! Thank you so much for picking this commit out of obscurity and sending it our way!

root@arm-ampere-dut:~# echo 16 > /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs
root@arm-ampere-dut:~# cat /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs
16

Wunderbar!

The only other thing I changed (just because I was floating the idea with Dharmik before) was in the kernel .config: I changed the qat_c62x and qat_c62xvf modules from statically built in (=y) to loadable (=m). Of course, this should not matter, and I presume the change in behavior comes from the changes brought in by the commit above. I just want to present fully all changes made so there is a complete picture.
I will continue on this tomorrow according to where this conversation left off, and try to move this quickly. If indeed there are no more blockers I think we are very close. As a reminder, when standing up a new testing plan, we want to make sure at least 1 rep from each vendor has SSH access and can remotely login to help with system tuning, troubleshooting, etc. for the testbed and test plan. Who would be the best person from ARM for this at this time, given the context on QAT testing? Ruifeng? Dharmik? Someone else? Thanks, I'll keep yall apprised of the situation. [-- Attachment #2: Type: text/html, Size: 4066 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
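For reference, the VF creation and read-back check shown in the message above (echo 16 into sriov_numvfs, then cat it) could be scripted along these lines. This is a minimal sketch under assumptions: the function names set_numvfs and dry_run are made up for the example, and dry_run works on a scratch directory rather than real sysfs so it can be tried without the card.

```python
import tempfile
from pathlib import Path


def set_numvfs(pf_dir, count):
    """Request `count` VFs on one PF directory and return the value read back."""
    node = Path(pf_dir) / "sriov_numvfs"
    current = int(node.read_text().strip())
    if current == count:
        return current
    if current != 0:
        # The kernel refuses to go from one non-zero VF count to another,
        # so drop back to 0 before requesting the new count.
        node.write_text("0\n")
    node.write_text(f"{count}\n")
    # Read back, mirroring the `cat .../sriov_numvfs` check in the thread.
    return int(node.read_text().strip())


def dry_run(count=16):
    """Exercise set_numvfs() against a scratch directory instead of sysfs."""
    with tempfile.TemporaryDirectory() as scratch:
        (Path(scratch) / "sriov_numvfs").write_text("0\n")
        return set_numvfs(scratch, count)
```

On the DUT this would be called once per PF as root, for example with the directory /sys/bus/pci/drivers/c6xx/0000:03:00.0 from the output above; the other two PF addresses are not given in this message, so they are left out here.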
* Re: Intel QAT 8970 accel card on ARM Ampere Server 2024-02-27 6:58 ` Patrick Robb @ 2024-02-27 13:50 ` Honnappa Nagarahalli 2024-02-28 20:00 ` Patrick Robb 0 siblings, 1 reply; 31+ messages in thread From: Honnappa Nagarahalli @ 2024-02-27 13:50 UTC (permalink / raw) To: Patrick Robb Cc: Ruifeng Wang, Juraj Linkeš, Dharmik Jayesh Thakkar, David Marchand, ci, nd, Wathsala Wathawana Vithanage, Paul Szczepanek + Paul, Wathsala > On Feb 27, 2024, at 12:58 AM, Patrick Robb <probb@iol.unh.edu> wrote: > > > > On Tue, Nov 14, 2023 at 2:35 AM Ruifeng Wang <Ruifeng.Wang@arm.com> wrote: > Hi Patrick, > It seems kernel v5.15 has a defect on this. A similar issue was fixed by commit: > 40da865381ad ("crypto: qat - remove unneeded packed attribute") > Could you patch the kernel and try again? > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=40da865381ad061ab75a7a9da469ed4e623bdfeb > Thanks, > Ruifeng > > Hi Ruifeng, > > Sorry for the delay on this - there has been a work item backlog at the Community Lab we've been working through. > > I did rebuild the patch today with these changes from the commit (or similar, as the commit above was for the qat_common file in a different state, but I tried to remain as true to the commit as possible). > > And that does seem to have resolved the seg fault problem! Thank you so much for picking this commit out of obscurity and sending it our way! > > root@arm-ampere-dut:~# echo 16 > /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs > root@arm-ampere-dut:~# cat /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs > 16 > > Wunderbar! > > The only other thing I changed (just because I was floating the idea with Dharmik before) was in the kernel .config I changed the qat_c62x and qat_c62xvf modules from statically built in (=y) to loadable (=m). Of course, this should not matter, and I presume the change in behavior relates to those brought in from the commit above. 
I just want to present fully all changes made so there is a complete picture. > > I will continue on this tomorrow according to where this conversation left off, and try to move this quickly. If indeed there are no more blockers I think we are very close. As a reminder, when standing up a new testing plan, we want to make sure at least 1 rep from each vendor has SSH access and can remotely login to help with system tuning, troubleshooting, etc. for the testbed and test plan. Who would be the best person from ARM for this at this time, given the context on QAT testing? Ruifeng? Dharmik? Someone else? > > Thanks, I'll keep yall apprised of the situation. > ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Intel QAT 8970 accel card on ARM Ampere Server 2024-02-27 13:50 ` Honnappa Nagarahalli @ 2024-02-28 20:00 ` Patrick Robb 2024-02-28 20:40 ` Honnappa Nagarahalli 0 siblings, 1 reply; 31+ messages in thread From: Patrick Robb @ 2024-02-28 20:00 UTC (permalink / raw) To: Honnappa Nagarahalli Cc: Ruifeng Wang, Juraj Linkeš, Dharmik Jayesh Thakkar, David Marchand, ci, nd, Wathsala Wathawana Vithanage, Paul Szczepanek [-- Attachment #1: Type: text/plain, Size: 2997 bytes --] quick update: I could bind the QAT VFs to vfio-pci after using the module loading options Dharmik mentioned. First I tested SYM QAT pmd from dpdk test on the VF and got: + Tests Total : 751 + Tests Skipped : 257 + Tests Executed : 659 + Tests Unsupported: 0 + Tests Passed : 494 + Tests Failed : 0 + ------------------------------------------------------- + Test OK I can try the crypto performance DTS testsuite next. Let me know if you have any thoughts. On Tue, Feb 27, 2024 at 8:51 AM Honnappa Nagarahalli < Honnappa.Nagarahalli@arm.com> wrote: > + Paul, Wathsala > > > On Feb 27, 2024, at 12:58 AM, Patrick Robb <probb@iol.unh.edu> wrote: > > > > > > > > On Tue, Nov 14, 2023 at 2:35 AM Ruifeng Wang <Ruifeng.Wang@arm.com> > wrote: > > Hi Patrick, > > It seems kernel v5.15 has a defect on this. A similar issue was fixed > by commit: > > 40da865381ad ("crypto: qat - remove unneeded packed attribute") > > Could you patch the kernel and try again? > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=40da865381ad061ab75a7a9da469ed4e623bdfeb > > Thanks, > > Ruifeng > > > > Hi Ruifeng, > > > > Sorry for the delay on this - there has been a work item backlog at the > Community Lab we've been working through. > > > > I did rebuild the patch today with these changes from the commit (or > similar, as the commit above was for the qat_common file in a different > state, but I tried to remain as true to the commit as possible). 
> > > > And that does seem to have resolved the seg fault problem! Thank you so > much for picking this commit out of obscurity and sending it our way! > > > > root@arm-ampere-dut:~# echo 16 > > /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs > > root@arm-ampere-dut:~# cat > /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs > > 16 > > > > Wunderbar! > > > > The only other thing I changed (just because I was floating the idea > with Dharmik before) was in the kernel .config I changed the qat_c62x and > qat_c62xvf modules from statically built in (=y) to loadable (=m). Of > course, this should not matter, and I presume the change in behavior > relates to those brought in from the commit above. I just want to present > fully all changes made so there is a complete picture. > > > > I will continue on this tomorrow according to where this conversation > left off, and try to move this quickly. If indeed there are no more > blockers I think we are very close. As a reminder, when standing up a new > testing plan, we want to make sure at least 1 rep from each vendor has SSH > access and can remotely login to help with system tuning, troubleshooting, > etc. for the testbed and test plan. Who would be the best person from ARM > for this at this time, given the context on QAT testing? Ruifeng? Dharmik? > Someone else? > > > > Thanks, I'll keep yall apprised of the situation. > > > > [-- Attachment #2: Type: text/html, Size: 3917 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Intel QAT 8970 accel card on ARM Ampere Server 2024-02-28 20:00 ` Patrick Robb @ 2024-02-28 20:40 ` Honnappa Nagarahalli 2024-03-07 5:27 ` Patrick Robb 0 siblings, 1 reply; 31+ messages in thread From: Honnappa Nagarahalli @ 2024-02-28 20:40 UTC (permalink / raw) To: Patrick Robb Cc: Ruifeng Wang, Juraj Linkeš, Dharmik Jayesh Thakkar, David Marchand, ci, nd, Wathsala Wathawana Vithanage, Paul Szczepanek, Dhruv Tripathi > On Feb 28, 2024, at 2:00 PM, Patrick Robb <probb@iol.unh.edu> wrote: > > quick update: > > I could bind the QAT VFs to vfio-pci after using the module loading options Dharmik mentioned. > > First I tested SYM QAT pmd from dpdk test on the VF and got: > > + Tests Total : 751 > + Tests Skipped : 257 > + Tests Executed : 659 > + Tests Unsupported: 0 > + Tests Passed : 494 > + Tests Failed : 0 > + ------------------------------------------------------- + > Test OK > > I can try the crypto performance DTS testsuite next. Let me know if you have any thoughts. Please go ahead and try. We have not worked on the performance, but it is fine to try. > > > > On Tue, Feb 27, 2024 at 8:51 AM Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com> wrote: > + Paul, Wathsala > > > On Feb 27, 2024, at 12:58 AM, Patrick Robb <probb@iol.unh.edu> wrote: > > > > > > > > On Tue, Nov 14, 2023 at 2:35 AM Ruifeng Wang <Ruifeng.Wang@arm.com> wrote: > > Hi Patrick, > > It seems kernel v5.15 has a defect on this. A similar issue was fixed by commit: > > 40da865381ad ("crypto: qat - remove unneeded packed attribute") > > Could you patch the kernel and try again? > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=40da865381ad061ab75a7a9da469ed4e623bdfeb > > Thanks, > > Ruifeng > > > > Hi Ruifeng, > > > > Sorry for the delay on this - there has been a work item backlog at the Community Lab we've been working through. 
> > > > I did rebuild the patch today with these changes from the commit (or similar, as the commit above was for the qat_common file in a different state, but I tried to remain as true to the commit as possible). > > > > And that does seem to have resolved the seg fault problem! Thank you so much for picking this commit out of obscurity and sending it our way! > > > > root@arm-ampere-dut:~# echo 16 > /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs > > root@arm-ampere-dut:~# cat /sys/bus/pci/drivers/c6xx/0000:03:00.0/sriov_numvfs > > 16 > > > > Wunderbar! > > > > The only other thing I changed (just because I was floating the idea with Dharmik before) was in the kernel .config I changed the qat_c62x and qat_c62xvf modules from statically built in (=y) to loadable (=m). Of course, this should not matter, and I presume the change in behavior relates to those brought in from the commit above. I just want to present fully all changes made so there is a complete picture. > > > > I will continue on this tomorrow according to where this conversation left off, and try to move this quickly. If indeed there are no more blockers I think we are very close. As a reminder, when standing up a new testing plan, we want to make sure at least 1 rep from each vendor has SSH access and can remotely login to help with system tuning, troubleshooting, etc. for the testbed and test plan. Who would be the best person from ARM for this at this time, given the context on QAT testing? Ruifeng? Dharmik? Someone else? > > > > Thanks, I'll keep yall apprised of the situation. > > > > > ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Intel QAT 8970 accel card on ARM Ampere Server 2024-02-28 20:40 ` Honnappa Nagarahalli @ 2024-03-07 5:27 ` Patrick Robb 2024-03-07 7:56 ` David Marchand 0 siblings, 1 reply; 31+ messages in thread From: Patrick Robb @ 2024-03-07 5:27 UTC (permalink / raw) To: Honnappa Nagarahalli Cc: Ruifeng Wang, Juraj Linkeš, Dharmik Jayesh Thakkar, David Marchand, ci, nd, Wathsala Wathawana Vithanage, Paul Szczepanek, Dhruv Tripathi Hi all, I have run the crypto_perf_cryptodev_perf DTS testsuite for the QAT card on the Ampere server, and have some updates below: On Wed, Feb 28, 2024 at 3:40 PM Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com> wrote: > > > > > On Feb 28, 2024, at 2:00 PM, Patrick Robb <probb@iol.unh.edu> wrote: > > > > quick update: > > > > I could bind the QAT VFs to vfio-pci after using the module loading options Dharmik mentioned. > > > > First I tested SYM QAT pmd from dpdk test on the VF and got: > > > > + Tests Total : 751 > > + Tests Skipped : 257 > > + Tests Executed : 659 > > + Tests Unsupported: 0 > > + Tests Passed : 494 > > + Tests Failed : 0 > > + ------------------------------------------------------- + > > Test OK > > > > I can try the crypto performance DTS testsuite next. Let me know if you have any thoughts. > Please go ahead and try. We have not worked on the performance, but it is fine to try. First, two tiny change are needed in DTS to make it work: 1. As Dharmik and David discussed, there are some QAT devices that need VFIO denylist=1. 
To account for this, in cryptodev_common.py (which the crypto perf testsuite imports), we need to add the following, given that the c62x device id is 37c8:

    if dev_id in ["37c8", "435", "19e2"]:
        test_case.dut.send_expect('modprobe -r vfio_iommu_type1; modprobe -r vfio_pci; modprobe -r vfio_virqfd; modprobe -r vfio', '# ', 5)
        test_case.dut.send_expect('modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9', '# ', 5)
        test_case.dut.send_expect('echo "1" | tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode', '# ', 5)

This maintains the custom vfio loading Dharmik recommended. The latter two dev ids in that list are for DH895XCC and C3XXX, since they are also included in https://github.com/torvalds/linux/commit/50173329c8cc0c892eaa7a9d0f0692ac39cd7b04

David and Dharmik, I think this is correct, but please chime in if it isn't.

2. For this testsuite we need to add some whitespace stripping on the lscpu output for ARM systems. For some reason, on some systems there is no leading whitespace before "Core(s) per socket" in lscpu, but on others (the arm servers we have at the lab) there is.

So, as long as this is all fine, I can submit a patch to DTS for these items. With those changes in place, the testsuite runs and all QAT testcases pass.
It will give some results like:

PerfTestsCryptodev: Test Case test_qat_zuc Begin
dut.arm-ampere-dut.dpdklab.iol.unh.edu: lscpu
dut.arm-ampere-dut.dpdklab.iol.unh.edu: x86_64-native-linux-gcc/app/dpdk-test-crypto-perf -l 9,10 -a 0000:03:01.0 --socket-mem 2048,0 -n 6 -- --ptest throughput --silent --total-
CRYPTODEV: Initialisation parameters - name: 0000:03:01.0_qat_sym,socket id: 0, max queue pairs: 0
Allocated pool "sess_mp_0" on socket 0

lcore id  Buf Size  Burst Size  Enqueued  Dequeued  Failed Enq  Failed Deq    MOps     Gbps  Cycles/Buf
      10        64          32  30000000  30000000    39393954    33424660  5.5361   2.8345        4.52
      10       128          32  30000000  30000000    40170307    34256181  5.4867   5.6184        4.56
      10       256          32  30000000  30000000    42119414    36231215  5.3883  11.0352        4.64
      10       512          32  30000000  30000000    44557481    38555569  5.2235  21.3955        4.79
      10      1024          32  30000000  30000000    55097817    48193496  4.6161  37.8149        5.42
      10      2048          32  30000000  30000000   126698128   118908347  3.0483  49.9439        8.20

I will let you folks who are working on this assess the performance metrics. I assume this is useful, and if/when we bring this to CI, all these results will be stored as artifacts and viewable for any new series which come in.

Happy to discuss further tomorrow at the CI meeting. If there are no issues here, I think we can write up the jenkins scripts pretty quickly and get this online tomorrow or early next week.

^ permalink raw reply [flat|nested] 31+ messages in thread
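If the throughput rows above are to be stored as CI artifacts, they could be pulled out of the captured output with something like the sketch below. It assumes one result row per line with the ten columns shown above (seven integer columns followed by three floating-point ones); the FIELDS names and the parse_throughput_rows function are invented for this example and are not part of DTS or dpdk-test-crypto-perf.

```python
# Column names are made up for this sketch; they mirror the table header above.
FIELDS = ("lcore_id", "buf_size", "burst_size", "enqueued", "dequeued",
          "failed_enq", "failed_deq", "mops", "gbps", "cycles_per_buf")


def parse_throughput_rows(output):
    """Return one dict per numeric result row found in the captured output."""
    rows = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # header, log lines, blank lines
        try:
            ints = [int(p) for p in parts[:7]]
            floats = [float(p) for p in parts[7:]]
        except ValueError:
            continue  # right token count but not a numeric row
        rows.append(dict(zip(FIELDS, ints + floats)))
    return rows


if __name__ == "__main__":
    sample = """lcore id Buf Size Burst Size Enqueued Dequeued Failed Enq Failed Deq MOps Gbps Cycles/Buf
10 64 32 30000000 30000000 39393954 33424660 5.5361 2.8345 4.52
10 128 32 30000000 30000000 40170307 34256181 5.4867 5.6184 4.56"""
    for row in parse_throughput_rows(sample):
        print(row["buf_size"], row["gbps"])
```

Lines that do not have exactly ten numeric tokens (the header and the CRYPTODEV log lines) are skipped rather than treated as errors, which seemed the safer choice for mixed log output.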
* Re: Intel QAT 8970 accel card on ARM Ampere Server
From: David Marchand @ 2024-03-07 7:56 UTC
To: Patrick Robb
Cc: Honnappa Nagarahalli, Ruifeng Wang, Juraj Linkeš, Dharmik Jayesh Thakkar, ci, nd, Wathsala Wathawana Vithanage, Paul Szczepanek, Dhruv Tripathi

Hello Patrick,

On Thu, Mar 7, 2024 at 6:27 AM Patrick Robb <probb@iol.unh.edu> wrote:
> 1. As Dharmik and David discussed, there are some QAT devices that
> need VFIO denylist=1. To account for this, in cryptodev_common.py
> (which the crypto perf testsuite imports), we need to add:
>
> given the c62x device id is 37c8
>
> if dev_id in ["37c8", "435", "19e2"]:
>     test_case.dut.send_expect('modprobe -r vfio_iommu_type1; modprobe -r vfio_pci; modprobe -r vfio_virqfd; modprobe -r vfio', '# ', 5)
>     test_case.dut.send_expect('modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9', '# ', 5)
>     test_case.dut.send_expect('echo "1" | tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode', '# ', 5)
>
> In order to maintain the custom vfio loading Dharmik recommended. The
> latter two dev ids in that list are for DH895XCC and C3XXX, since they
> are also included in
> https://github.com/torvalds/linux/commit/50173329c8cc0c892eaa7a9d0f0692ac39cd7b04
>
> David and Dharmik, I think this is correct, but please chime in if it isn't.

You probably missed one question I had, mixed with my grmbl about
disable_denylist.

"""
However, I don't think the vfio-pci.ids syntax works for passing parameters.
And in any case, why do you need to set this initial list?
Binding devices (using either driverctl or dpdk-devbind.py) to
vfio-pci should be done the "usual" way, or is there some special case
again for QAT?
"""

Re-reading vfio-pci kernel parsing code, the syntax for vfio-pci.ids
seems ok, my bad.
But I am still not clear if there is a need for a special case here.
bind_qat_device() calls test_case.dut.bind_eventdev_port which itself
calls dpdk-devbind to bind the VF to vfio-pci.

So here, on the topic of loading vfio-pci wrt the QAT quirk, you only need:
# modprobe vfio-pci disable_denylist=1 enable_sriov=1

--
David Marchand
* Re: Intel QAT 8970 accel card on ARM Ampere Server
From: David Marchand @ 2023-10-11 11:51 UTC
To: Patrick Robb
Cc: Dharmik Jayesh Thakkar, Ruifeng Wang, Juraj Linkeš, Honnappa Nagarahalli, ci, nd

On Tue, Oct 10, 2023 at 6:00 PM Patrick Robb <probb@iol.unh.edu> wrote:
> On Mon, Oct 9, 2023 at 11:56 PM Dharmik Jayesh Thakkar <DharmikJayesh.Thakkar@arm.com> wrote:
>>
>> Hi Patrick,
>>
>> Can you provide the grub settings? Is iommu.passthrough=1 included?
>
> Sure. I'm not sure if you just wanted the kernel cmdline options or the whole grub config, but I assume you just meant kernel cmdline. Let me know if you meant more.
>
> GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=1G hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=39-79 nohz_full=39-79 rcu_nocbs=39-79 processor.max_cstate=1 intel_pstate=disable console=ttyS0,115200 console=tty0"
>
> But, iommu.passthrough=1 is not included, so I can add that if we need to. Do you know that this won't have any bad implications for the (intel, nvidia, broadcom) NICs which we test on this server?
>
>> Also, is qat_c62xvf loaded as well?
>
> qat_c62xvf is built in to the kernel also.
>
>> Finally, a few guidelines on the vfio driver:
>>
>> At times, we need to configure the vfio driver.
>>
>> On kernel vers. 5.9+ we need to load the vfio-pci driver with the additional parameter disable_denylist=1
>>
>> Unload the vfio-pci driver if it is already loaded so that we can reload it with the correct parameters :
>> sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio
>>
>> If you can't unload the vfio driver because it's been built into the kernel, you'll have to find another way to change VFIO parameters, or to rebuild your kernel with VFIO_PCI set as a module. Failing to do that, you might encounter issues later on when you try to bind the VFs to VFIO.
>>
>> Load the vfio-pci driver and bind it to QAT VFs device ids:
>> sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9
>>
>> Enable no-iommu-mode:
>> echo "1" | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
>>
>> /sys/module/vfio/parameter is missing ?
>>
>> If /sys/module/vfio/parameters does not exist, you might be missing the kernel module VFIO_NOIOMMU
>>
>> Automatically set VFIO params on boot
>>
>> It's possible to set these parameters automatically on boot by creating a /etc/modprobe.d/vfio-pci.conf file with the parameters :
>> cat /etc/modprobe.d/vfio-pci.conf
>> options vfio enable_unsafe_noiommu_mode=1
>> options vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9
>>
>> We haven’t encountered this issue in the past, so just making sure the configuration is correct. I don’t think having the driver static/loadable should make a difference, I will try with building statically on my setup.
>>
>> Thank you!
>
> Okay, this should be fine. Like I said, we are also running tests on NICs on this server. So, in our Jenkinsfiles scripts for running the testing, I will add a preliminary step only for QAT tests which runs:
> sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio

- I thought vfio_iommu_type1 was a x86 thing.
So it would work for x86 (Intel/AMD) systems, but fail on other arches.. ?
If you tested this on ARM, it is probably ok as is.

> sudo modprobe vfio-pci disable_denylist=1 enable_sriov=1 vfio-pci.ids=8086:37c9

- Speaking to myself, too bad the disable_denylist param value is only
read once, when loading the vfio-pci kernel module...
So ok, I get why you need to reload the whole chain of kmods.

However, I don't think the vfio-pci.ids syntax works for passing parameters.
And in any case, why do you need to set this initial list?
Binding devices (using either driverctl or dpdk-devbind.py) to
vfio-pci should be done the "usual" way, or is there some special case
again for QAT?

- Besides, from what I understood so far, there are two parts specific
to this QAT test:
* enabling SRIOV so that creating VF is possible with a PF bound to
  vfio-pci (option enable_sriov=1),
* for a list of PCI QAT cards, forcing the disable_denylist is needed
  (option disable_denylist=1),

For the latter point, at this step of the test setup, do you know
which QAT devices will be used?
If so, the commandline params could be constructed to enable
disable_denylist only for known-broken QAT devices (the list is
available in the kernel commit Dharmik provided earlier).

> echo "1" | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
> (then run QAT tests)
>
> And if running on NICs, have a preliminary step which runs
> sudo modprobe -r vfio_iommu_type1; sudo modprobe -r vfio_pci; sudo modprobe -r vfio_virqfd; sudo modprobe -r vfio
> sudo modprobe vfio

Given that vfio_iommu_type1 is ok on other arch, this lgtm.

--
David Marchand
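The suggestion in the message above, to pass disable_denylist=1 only for the QAT devices that are known to need it, could be sketched roughly as follows. This is an illustration only and not code from DTS: the function name `vfio_reload_commands` and the constant `DENYLISTED_QAT_PF_IDS` are hypothetical, the device IDs are the three mentioned elsewhere in this thread (C62x, DH895XCC, C3XXX), and stripping leading zeros from the ID is an assumption made so that both "435" and "0435" match.

```python
from typing import List

# PF device IDs named in this thread as needing the vfio-pci denylist
# override: C62x (37c8), DH895XCC (435), C3XXX (19e2).
DENYLISTED_QAT_PF_IDS = {"37c8", "435", "19e2"}

# Unload order as given in the thread; disable_denylist is only read
# when vfio-pci is loaded, so the module chain is reloaded first.
UNLOAD_COMMANDS = [
    "modprobe -r vfio_iommu_type1",
    "modprobe -r vfio_pci",
    "modprobe -r vfio_virqfd",
    "modprobe -r vfio",
]


def vfio_reload_commands(pf_device_id: str) -> List[str]:
    """Return shell commands that reload vfio-pci for a given PF device ID.

    enable_sriov=1 is always set so VFs can be created on a PF bound to
    vfio-pci; disable_denylist=1 is added only for the listed QAT IDs.
    """
    # Assumption: normalise case and leading zeros before comparing.
    normalized = pf_device_id.lower().lstrip("0")
    params = ["enable_sriov=1"]
    if normalized in DENYLISTED_QAT_PF_IDS:
        params.append("disable_denylist=1")
    return UNLOAD_COMMANDS + ["modprobe vfio-pci " + " ".join(params)]


if __name__ == "__main__":
    for command in vfio_reload_commands("37c8"):
        print(command)
```

The commands are returned as strings rather than executed, so a caller can decide how to run them (for example through whatever remote-shell helper the test framework already provides) and so the selection logic can be checked without root access.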