From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <5d9ae8ec-450a-c411-c044-577f00b127f5@redhat.com>
Date: Wed, 21 Jun 2023 22:22:01 +0200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.12.0
To: Daniel Östman, Erez Ferber, Slava Ovsiienko
Cc: "users@dpdk.org", Matan Azrad, "david.marchand@redhat.com"
From: Maxime Coquelin
Subject: Re: mlx5: imissed / out_of_buffer counter always 0
Content-Language: en-US
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
List-Id: DPDK usage discussions

Hi Daniel, all,

On 6/5/23 16:00, Daniel Östman wrote:
> Hi Slava and Erez and thanks for your answers,
>
> Regarding the firmware, I’ve also deployed in a different OpenShift
> cluster where I see the exact same issue but with a different Mellanox NIC:
>
> Mellanox Technologies MT2892 Family - ConnectX-6 DX 2-port 100GbE QSFP56
> PCIe Adapter
>
> driver: mlx5_core
>
> version: 5.0-0
> firmware-version: 22.36.1010 (DEL0000000027)
>
> From what I can see the firmware is relatively new on that one?

With the configuration below:
- ConnectX-6 Dx MT2892
- Kernel: 6.4.0-rc6
- FW version: 22.35.1012 (MT_0000000528)

The out-of-buffer counter is fetched via
mlx5_devx_cmd_queue_counter_query():

[pid 2942] ioctl(17, RDMA_VERBS_IOCTL, 0x7ffcb15bcd10) = 0
[pid 2942] write(1, "\n ######################## NIC "..., 80) = 80
[pid 2942] write(1, " RX-packets: 630997736 RX-miss"..., 70) = 70
[pid 2942] write(1, " RX-errors: 0\n", 15) = 15
[pid 2942] write(1, " RX-nombuf: 0 \n", 25) = 25
[pid 2942] write(1, " TX-packets: 0 TX-erro"..., 60) = 60
[pid 2942] write(1, "\n", 1) = 1
[pid 2942] write(1, " Throughput (since last show)\n", 31) = 31
[pid 2942] write(1, " Rx-pps: 0 "..., 106) = 106
[pid 2942] write(1, " ##############################"..., 79) = 79

It looks like we may be missing some mlx5 kernel patches needed to use
mlx5_devx_cmd_queue_counter_query() with RHEL.

Erez, Slava, any idea which patches could be missing?

Regards,
Maxime

>
> I tried setting dv_flow_en=0 (and saw that it was propagated to
> config->dv_flow_en) but it didn’t seem to help.
>
> Erez, I’m not sure what you mean by shared or non-shared mode in this
> case, however it seems it could be related to the fact that the
> container is running in a separate network namespace, because the
> hw_counters directory is available on the host (cluster node), but not in
> the pod container.
>
> Best regards,
>
> Daniel
>
> *From:* Erez Ferber
> *Sent:* Monday, 5 June 2023 12:29
> *To:* Slava Ovsiienko
> *Cc:* Daniel Östman; users@dpdk.org; Matan Azrad;
> maxime.coquelin@redhat.com; david.marchand@redhat.com
> *Subject:* Re: mlx5: imissed / out_of_buffer counter always 0
>
> Hi Daniel,
>
> Is the container running in shared or non-shared mode?
>
> For shared mode, I assume the kernel sysfs counters which DPDK relies on
> for imissed/out_of_buffer are not exposed.
>
> Best regards,
>
> Erez
>
> On Fri, 2 Jun 2023 at 18:07, Slava Ovsiienko wrote:
>
> Hi, Daniel
>
> I would recommend taking the following actions:
>
> - Update the firmware; 16.33.xxxx looks a little outdated.
>   Please try 16.35.1012 or later.
>   mlx5_glue->devx_obj_create might succeed with the newer FW.
>
> - Try specifying the dv_flow_en=0 devarg. It forces the mlx5 PMD to use
>   the rdma_core library for queue management, so the kernel driver will
>   be aware of the Rx queues being created and attach them to the kernel
>   counter set.
>
> With best regards,
> Slava
>
> *From:* Daniel Östman
> *Sent:* Friday, June 2, 2023 3:59 PM
> *To:* users@dpdk.org
> *Cc:* Matan Azrad; Slava Ovsiienko; maxime.coquelin@redhat.com;
> david.marchand@redhat.com
> *Subject:* mlx5: imissed / out_of_buffer counter always 0
>
> Hi,
>
> I’m deploying a containerized DPDK application in an OpenShift
> Kubernetes environment using DPDK 21.11.3.
>
> The application uses a Mellanox ConnectX-5 100G NIC through VFs.
>
> The problem I have is that the ETH stats counter imissed (which
> seems to be mapped to “out_of_buffer” internally in the mlx5 PMD)
> is 0 when I don’t expect it to be, i.e. when the application doesn’t
> read the packets fast enough.
>
> Using GDB I can see that it tries to access the counter through
> /sys/class/infiniband/mlx5_99/ports/1/hw_counters/out_of_buffer but
> the hw_counters directory is missing so it will just return a zero
> value. I don’t know why it is missing.
>
> When looking at mlx5_os_read_dev_stat() I can see that there is an
> alternative way of reading the counter, through
> mlx5_devx_cmd_queue_counter_query(), but only under the condition that
> priv->q_counters is set.
>
> It doesn’t get set in my case because mlx5_glue->devx_obj_create()
> fails (errno 22) in mlx5_devx_cmd_queue_counter_alloc().
>
> Have I missed something?
>
> NIC info:
>
> Mellanox Technologies MT27800 Family [ConnectX-5] - 100Gb 2-port
> QSFP28 MCX516A-CCHT
> driver: mlx5_core
> version: 5.0-0
> firmware-version: 16.33.1048 (MT_0000000417)
>
> Please let me know if I need to provide more information.
>
> Best regards,
>
> Daniel
>
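
For anyone reproducing this, the sysfs check that Daniel did with GDB can also be done from inside the pod without a debugger. The sketch below is only an illustration, not code from DPDK: the function name read_out_of_buffer and its root parameter are hypothetical, and only the sysfs path layout is taken from the thread. It returns None when the hw_counters entry is absent, so that a missing counter is not mistaken for a genuine zero.

```python
from pathlib import Path
from typing import Optional

# Path layout taken from the thread:
# /sys/class/infiniband/<ibdev>/ports/<port>/hw_counters/out_of_buffer
SYSFS_IB_ROOT = Path("/sys/class/infiniband")


def read_out_of_buffer(ibdev: str, port: int = 1,
                       root: Path = SYSFS_IB_ROOT) -> Optional[int]:
    """Return the out_of_buffer counter, or None if it is not exposed.

    Hypothetical helper: 'root' exists only so the function can be
    pointed at a different directory tree (for example in a test).
    """
    counter = (root / ibdev / "ports" / str(port)
               / "hw_counters" / "out_of_buffer")
    try:
        return int(counter.read_text().strip())
    except (FileNotFoundError, NotADirectoryError, PermissionError):
        # The entry is missing or unreadable in this namespace.
        return None


if __name__ == "__main__":
    # mlx5_99 is the device name quoted in the original report.
    value = read_out_of_buffer("mlx5_99")
    if value is None:
        print("hw_counters/out_of_buffer is not exposed here")
    else:
        print(f"out_of_buffer = {value}")
```

Running it once on the cluster node and once inside the pod should show whether the counter is hidden only in the container, which is what Daniel observed.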