From: "Daniel Östman" <daniel.ostman@ericsson.com>
To: Maxime Coquelin <maxime.coquelin@redhat.com>,
Erez Ferber <erezferber@gmail.com>,
Slava Ovsiienko <viacheslavo@nvidia.com>
Cc: "users@dpdk.org" <users@dpdk.org>, Matan Azrad <matan@nvidia.com>,
"david.marchand@redhat.com" <david.marchand@redhat.com>
Subject: RE: mlx5: imissed / out_of_buffer counter always 0
Date: Fri, 18 Aug 2023 12:04:31 +0000
Message-ID: <PAVPR07MB9310A361351619B5B2A93247861BA@PAVPR07MB9310.eurprd07.prod.outlook.com>
In-Reply-To: <e8138048-5065-96ce-c6ea-1d72121e2b8f@redhat.com>
Hi Maxime,
Sorry for the late reply, I've been on vacation.
Please see my answer below.
/ Daniel
> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Thursday, 22 June 2023 17:48
> To: Daniel Östman <daniel.ostman@ericsson.com>; Erez Ferber
> <erezferber@gmail.com>; Slava Ovsiienko <viacheslavo@nvidia.com>
> Cc: users@dpdk.org; Matan Azrad <matan@nvidia.com>;
> david.marchand@redhat.com
> Subject: Re: mlx5: imissed / out_of_buffer counter always 0
>
> Hi,
>
> On 6/21/23 22:22, Maxime Coquelin wrote:
> > Hi Daniel, all,
> >
> > On 6/5/23 16:00, Daniel Östman wrote:
> >> Hi Slava and Erez and thanks for your answers,
> >>
> >> Regarding the firmware, I’ve also deployed in a different OpenShift
> > >> cluster where I see the exact same issue but with a different Mellanox
> >> NIC:
> >>
> >> Mellanox Technologies MT2892 Family - ConnectX-6 DX 2-port 100GbE
> >> QSFP56 PCIe Adapter
> >>
> >> driver: mlx5_core
> >>
> >> version: 5.0-0
> >> firmware-version: 22.36.1010 (DEL0000000027)
> >>
> >> From what I can see the firmware is relatively new on that one?
> >
> > With below configuration:
> > - ConnectX-6 Dx MT2892
> > - Kernel: 6.4.0-rc6
> > - FW version: 22.35.1012 (MT_0000000528)
> >
> > The out-of-buffer counter is fetched via
> > mlx5_devx_cmd_queue_counter_query():
> >
> > > [pid 2942] ioctl(17, RDMA_VERBS_IOCTL, 0x7ffcb15bcd10) = 0
> > > [pid 2942] write(1, "\n ######################## NIC "..., 80) = 80
> > > [pid 2942] write(1, " RX-packets: 630997736 RX-miss"..., 70) = 70
> > > [pid 2942] write(1, " RX-errors: 0\n", 15) = 15
> > > [pid 2942] write(1, " RX-nombuf: 0 \n", 25) = 25
> > > [pid 2942] write(1, " TX-packets: 0 TX-erro"..., 60) = 60
> > > [pid 2942] write(1, "\n", 1) = 1
> > > [pid 2942] write(1, " Throughput (since last show)\n", 31) = 31
> > > [pid 2942] write(1, " Rx-pps: 0 "..., 106) = 106
> > > [pid 2942] write(1, " ##############################"..., 79) = 79
> >
> > > It looks like we may be missing some mlx5 kernel patches needed to use
> > > mlx5_devx_cmd_queue_counter_query() with RHEL?
> >
> > Erez, Slava, any idea on the patches that could be missing?
>
> Above test was on baremetal as root, I get the same "working" behaviour on
> RHEL as root.
>
> We managed to reproduce Daniel's issue by running the same test within a
> container. With debug logs enabled, we get this warning:
>
> mlx5_common: DevX create q counter set failed errno=121 status=0x2
> syndrome=0x8975f1
> mlx5_net: Port 0 queue counter object cannot be created by DevX - fall-back
> to use the kernel driver global queue counter.
>
> Running the container as privileged solves the issue, and so does adding
> the SYS_RAWIO capability to the container.
>
> Erez, Slava, is that expected to require SYS_RAWIO just to get a stat counter?
>
> Daniel, could you try adding SYS_RAWIO to your pod to confirm you face the
> same issue?
Yes, I can confirm what you are seeing when running in a cluster with OpenShift 4.12 (RHEL 8.6), either with SYS_RAWIO added or running as privileged.
With a privileged container, though, I also need to run as UID 0 for it to work. Is that what you are doing as well?
In both of these cases the counter can be successfully retrieved through the DevX interface.
However, when running in a cluster with OpenShift 4.10 (RHEL 8.4) I cannot get it to work with either of these two approaches.
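For anyone reproducing this, here is a minimal sketch of how one might check from inside the pod whether the process really holds SYS_RAWIO in its effective capability set. It is not part of the test described above. It assumes a Linux /proc filesystem; the helper names parse_cap_eff and has_cap are made up for illustration, and the bit index 17 for CAP_SYS_RAWIO is taken from <linux/capability.h>.

```python
CAP_SYS_RAWIO = 17  # bit index from <linux/capability.h>


def parse_cap_eff(status_text):
    """Return the effective capability mask from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The value is a hexadecimal bitmask without a 0x prefix.
            return int(line.split(":", 1)[1].strip(), 16)
    raise ValueError("no CapEff line found")


def has_cap(mask, cap):
    """True if capability bit `cap` is set in `mask`."""
    return bool((mask >> cap) & 1)


if __name__ == "__main__":
    with open("/proc/self/status") as f:
        mask = parse_cap_eff(f.read())
    print("CAP_SYS_RAWIO effective:", has_cap(mask, CAP_SYS_RAWIO))
```

If this prints False for the DPDK process, the DevX queue counter creation would be expected to fail in the way Maxime's debug log shows, at least on the setups where adding SYS_RAWIO made a difference.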
> Thanks in advance,
> Maxime
> > Regards,
> > Maxime
> >
> >>
> >> I tried setting dv_flow_en=0 (and saw that it was propagated to
> >> config->dv_flow_en) but it didn’t seem to help.
> >>
> >> Erez, I’m not sure what you mean by shared or non-shared mode in this
> >> case, however it seems it could be related to the fact that the
> >> container is running in a separate network namespace. Because the
> >> hw_counter directory is available on the host (cluster node), but not
> >> in the pod container.
> >>
> >> Best regards,
> >>
> >> Daniel
> >>
> >> *From:*Erez Ferber <erezferber@gmail.com>
> >> *Sent:* Monday, 5 June 2023 12:29
> >> *To:* Slava Ovsiienko <viacheslavo@nvidia.com>
> >> *Cc:* Daniel Östman <daniel.ostman@ericsson.com>; users@dpdk.org;
> >> Matan Azrad <matan@nvidia.com>; maxime.coquelin@redhat.com;
> >> david.marchand@redhat.com
> >> *Subject:* Re: mlx5: imissed / out_of_buffer counter always 0
> >>
> >> Hi Daniel,
> >>
> >> is the container running in shared or non-shared mode ?
> >>
> >> For shared mode, I assume the kernel sysfs counters which DPDK relies
> >> on for imissed/out_of_buffer are not exposed.
> >>
> >> Best regards,
> >>
> >> Erez
> >>
> > >> On Fri, 2 Jun 2023 at 18:07, Slava Ovsiienko <viacheslavo@nvidia.com>
> > >> wrote:
> >>
> >> Hi, Daniel
> >>
> >> I would recommend to take the following action:
> >>
> >> - update the firmware, 16.33.xxxx looks to be outdated a little bit.
> >> Please, try 16.35.1012 or later.
> >> mlx5_glue->devx_obj_create might succeed with the newer FW.
> >>
> > >> - try to specify dv_flow_en=0 devarg, it forces mlx5 PMD to use
> > >> rdma_core library for queue management and kernel driver will be
> > >> aware about Rx queues being created and attach them to the kernel
> > >> counter set
> >>
> >> With best regards,
> >> Slava
> >>
> > >> *From:*Daniel Östman <daniel.ostman@ericsson.com>
> > >> *Sent:* Friday, June 2, 2023 3:59 PM
> > >> *To:* users@dpdk.org
> > >> *Cc:* Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> > >> <viacheslavo@nvidia.com>; maxime.coquelin@redhat.com;
> > >> david.marchand@redhat.com
> >> *Subject:* mlx5: imissed / out_of_buffer counter always 0
> >>
> >> Hi,
> >>
> >> I’m deploying a containerized DPDK application in an OpenShift
> >> Kubernetes environment using DPDK 21.11.3.
> >>
> >> The application uses a Mellanox ConnectX-5 100G NIC through VFs.
> >>
> > >> The problem I have is that the ETH stats counter imissed (which
> > >> seems to be mapped to “out_of_buffer” internally in mlx5 PMD driver)
> > >> is 0 when I don’t expect it to be, i.e. when the application doesn’t
> > >> read the packets fast enough.
> > >>
> > >> Using GDB I can see that it tries to access the counter through
> > >> /sys/class/infiniband/mlx5_99/ports/1/hw_counters/out_of_buffer but
> > >> the hw_counters directory is missing so it will just return a zero
> > >> value. I don’t know why it is missing.
> > >>
> > >> When looking at mlx5_os_read_dev_stat() I can see that there is an
> > >> alternative way of reading the counter, through
> > >> mlx5_devx_cmd_queue_counter_query() but under the condition that
> > >> priv->q_counters are set.
> > >>
> > >> It doesn’t get set in my case because mlx5_glue->devx_obj_create()
> > >> fails (errno 22) in mlx5_devx_cmd_queue_counter_alloc().
> >>
> >> Have I missed something?
> >>
> >> NIC info:
> >>
> >> Mellanox Technologies MT27800 Family [ConnectX-5] - 100Gb 2-port
> >> QSFP28 MCX516A-CCHT
> >> driver: mlx5_core
> >> version: 5.0-0
> >> firmware-version: 16.33.1048 (MT_0000000417)
> >>
> >> Please let me know if I need to provide more information.
> >>
> >> Best regards,
> >>
> >> Daniel
> >>