From mboxrd@z Thu Jan 1 00:00:00 1970
From: Anna Tauzzi
Date: Mon, 5 Sep 2022 12:14:20 +0200
Subject: Re: Initializing and starting port on primary but transmitting on secondary I get port not ready
To: Stephen Hemminger
Cc: users@dpdk.org
References: <20220831182546.228d64a3@hermes.local> <20220901082159.31573889@hermes.local>
In-Reply-To: <20220901082159.31573889@hermes.local>
List-Id: DPDK usage discussions
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="000000000000765a8005e7eb5807"

--000000000000765a8005e7eb5807
Content-Type: text/plain; charset="UTF-8"

Thank you for your interest in the problem.

It seems the error message was caused by the option --allow 0000:00.0 being
passed, by mistake, to the secondary process as well.

The primary correctly performed all the initialization steps:

rte_dev_probe(vf)
rte_eth_dev_configure(port_id, ... );
rte_eth_dev_adjust_nb_rx_tx_desc(port_id, ... );
rte_eth_rx_queue_setup(port_id, ... );
rte_eth_tx_queue_setup(port_id, ... );
rte_eth_dev_start(port_id ... );

The secondary did nothing apart from calling tx_burst, but it did not see the
port at all because of the wrong --allow option.

BR,
Anna.
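Based on the report above, one way to avoid this mistake is to strip the
allow-list options before reusing the primary's command line for the
secondary. The following is only a minimal sketch, assuming the secondary's
arguments are built from a copy of the primary's. drop_allow_opts is a
hypothetical helper, not a DPDK API, and it only recognises the "-a X",
"--allow X" and "--allow=X" spellings.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of DPDK): copy an EAL-style argument
 * vector into out, skipping every -a / --allow option and its value.
 * Arguments after a bare "--" belong to the application and are copied
 * unchanged. Returns the number of arguments written (at most max).
 */
static size_t drop_allow_opts(size_t argc, const char *const *argv,
                              const char **out, size_t max)
{
    size_t n = 0;
    int eal_done = 0; /* set once "--" ends the EAL options */

    for (size_t i = 0; i < argc && n < max; i++) {
        const char *arg = argv[i];

        /* argv[0] is the program name and is always kept. */
        if (i > 0 && !eal_done) {
            if (strcmp(arg, "--") == 0) {
                eal_done = 1;
            } else if (strcmp(arg, "-a") == 0 ||
                       strcmp(arg, "--allow") == 0) {
                i++; /* skip the option's value as well */
                continue;
            } else if (strncmp(arg, "--allow=", 8) == 0) {
                continue;
            }
        }
        out[n++] = arg;
    }
    return n;
}
```

Given the arguments "simple_eth_tx_mp -l 0-3 --allow 0000:00.0", the helper
keeps "simple_eth_tx_mp -l 0-3". A launcher would still have to add
--proc-type=secondary and choose the secondary's own cores.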

On Thu, 1 Sep 2022 at 17:22, Stephen Hemminger <stephen@networkplumber.org> wrote:

> On Thu, 1 Sep 2022 09:33:54 +0200
> Anna Tauzzi <admin@argonnetech.net> wrote:
>
> > I'm using the Mellanox ConnectX-5:
> >
> > pci@0000:3b:00.0  enp59s0f0np0  network  MT27800 Family [ConnectX-5]
> > pci@0000:3b:00.1  enp59s0f1np1  network  MT27800 Family [ConnectX-5]
> > pci@0000:3b:00.2  enp59s0f0v0   network  MT27800 Family [ConnectX-5 Virtual Function]
> > pci@0000:3b:00.3  enp59s0f0v1   network  MT27800 Family [ConnectX-5 Virtual Function]
> > pci@0000:3b:00.4  enp59s0f0v2   network  MT27800 Family [ConnectX-5 Virtual Function]
> > pci@0000:3b:00.5  enp59s0f0v3   network  MT27800 Family [ConnectX-5 Virtual Function]
> > pci@0000:3b:04.2  enp59s0f1v0   network  MT27800 Family [ConnectX-5 Virtual Function]
> > pci@0000:3b:04.3  enp59s0f1v1   network  MT27800 Family [ConnectX-5 Virtual Function]
> > pci@0000:3b:04.4  enp59s0f1v2   network  MT27800 Family [ConnectX-5 Virtual Function]
> > pci@0000:3b:04.5  enp59s0f1v3   network  MT27800 Family [ConnectX-5 Virtual Function]
> >
> > This is the message:
> > lcore 6 called tx_pkt_burst for not ready port 0
> > 8: [/lib/x86_64-linux-gnu/libc.so.6(+0x126a00) [0x7ffff7c77a00]]
> > 7: [/lib/x86_64-linux-gnu/libc.so.6(+0x94b43) [0x7ffff7be5b43]]
> > 6: [/usr/local/lib/librte_eal.so.22(+0x1559a) [0x7ffff7d8e59a]]
> > 5: [build/simple_eth_tx_mp(+0x1a0c7) [0x55555556e0c7]]
> > 4: [build/simple_eth_tx_mp(+0x19f89) [0x55555556df89]]
> > 3: [build/simple_eth_tx_mp(+0x423c) [0x55555555823c]]
> > 2: [/usr/local/lib/librte_ethdev.so.22(+0x7cbc) [0x7ffff7eb3cbc]]
> > 1: [/usr/local/lib/librte_eal.so.22(rte_dump_stack+0x32) [0x7ffff7daf152]]
> >
> > I'm having all sorts of problems with this Mellanox stuff, Intel cards are
> > much more user friendly.
> >
> > Just to recap:
> > * configure on primary and transmit on primary      ---> GOOD
> >
> > * configure on secondary and transmit on secondary  ---> SIGSEGV
> > Thread 4 "lcore-worker-6" received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 0x7ffff4346640 (LWP 7208)]
> > rte_eth_tx_burst (port_id=0, queue_id=0, tx_pkts=0x7ffff4344ac0, nb_pkts=1)
> > at /usr/local/include/rte_ethdev.h:5650
> > 5650            qd = p->txq.data[queue_id];
> > (gdb) print p->txq
> > $2 = {data = 0x0, clbk = 0x7ffff7f21528 <rte_eth_devices+8296>} (data is NULL)
> >
> > * configure on primary and transmit on secondary    ---> PORT NOT READY
> >
> > Do you know who should be notified of this problem? Should I open a bug on
> > DPDK bugzilla or file it to NVIDIA?
> >
> > Thx.
> >
> > On Thu, 1 Sep 2022 at 03:25, Stephen Hemminger <stephen@networkplumber.org> wrote:
> >
> > > On Wed, 31 Aug 2022 22:59:56 +0200
> > > Anna Tauzzi <admin@argonnetech.net> wrote:
> > >
> > > > I initialize a port with the following methods on a primary process:
> > > >
> > > > rte_dev_probe(vf)
> > > > rte_eth_dev_configure(port_id, ... );
> > > > rte_eth_dev_adjust_nb_rx_tx_desc(port_id, ... );
> > > > rte_eth_rx_queue_setup(port_id, ... );
> > > > rte_eth_tx_queue_setup(port_id, ... );
> > > > rte_eth_dev_start(port_id ... );
> > > >
> > > > Then I use the rte_eth_tx_burst(port_id) in the secondary process but I
> > > > get this message:
> > > >
> > > > called tx_pkt_burst for not ready port 0
> > > >
> > > > Is this expected?
> > >
> > > No, looks like a device driver bug. Which PMD?
>
> What version of rdma-core and kernel?
> There were some bugs in earlier versions around secondary process support.
> They were fixed; some users are using failsafe and mlx5 on Azure with
> secondary processes.

--000000000000765a8005e7eb5807
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

--000000000000765a8005e7eb5807--