From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id AC13345547; Tue, 2 Jul 2024 10:53:56 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 9789640E35; Tue, 2 Jul 2024 10:53:56 +0200 (CEST) Received: from mail-pj1-f49.google.com (mail-pj1-f49.google.com [209.85.216.49]) by mails.dpdk.org (Postfix) with ESMTP id 56A744028A for ; Tue, 2 Jul 2024 10:49:27 +0200 (CEST) Received: by mail-pj1-f49.google.com with SMTP id 98e67ed59e1d1-2c7dd4586afso2884137a91.2 for ; Tue, 02 Jul 2024 01:49:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=netgate.com; s=google; t=1719910166; x=1720514966; darn=dpdk.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=Xoix1Q5P09w+bpSRhc1+hM8wk1Bv3cNVEw4u8MbIF0g=; b=EotgwKLFCEfKcHp9YG36rmLdUkhos9YEhoJmazeSGAZTwQkoAXicIMOKIFpAfFITpO g/eL8kF+DVlMRAq2LuF6CPvAWR/eqiYBSN8VLK3kKsq6F3mh9JOhVDLfX0QSMH/Y17Cp FspG8M2Ept0lL/CkFyYu9ZaQj1F96UDr4CanA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1719910166; x=1720514966; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=Xoix1Q5P09w+bpSRhc1+hM8wk1Bv3cNVEw4u8MbIF0g=; b=H7SBAh+zLVwqJvUfnRyyHTx5zbxB/KWLdyRpw0zpFdzd74UQvYy0tK7GlJqYJfBnuL 9yoc4GGZQ2k8NBIJIlYrb7DdY6wd2idJCAhWVnj4BHMb2ZIzCG955gi+CUUUYUlrq+nq ACU/K8ZxMVtik5/nSEPDUuqX7lMZscwTViTbPxY17Py9J2E+x9vXCCAB4UiuBfhlC96o puwwnW4Ax5IGiA8rSCcMzARxl/nYa/O/j93UFrh/prXdbPt2TkqqSGAXgjimn+u+sJ0b bG3Ev2vV5/wyJsShfYvTwYBLzw6ZEKxVYkXpr0I9Y3DaBmHEiIYVr3hi0gQmoqQ6dVD9 KhOQ== X-Forwarded-Encrypted: i=1; AJvYcCXc1XGkj9hQQRw95HCcgUEJPWWrZf2AZ2O+PSIh7ar8pjXJaLRpWBCNOzbwVki4FTjVklW6/WyzM6UvQrI= X-Gm-Message-State: AOJu0YzMwoblkNQip3Lsqo0AZMPSFNKK+hZBic9QVFIW3uR8YrkDqaHM n/uhLu77Xt6uBhfwfaK51dSA/HLJKov61chYq1dxMSv+sVv/s7izgKvFo3drGOYRsNksUA5q5Im pOtbD4n+8zM5GDZWvgnYISym6QZgXgUybR45g X-Google-Smtp-Source: AGHT+IFkHCitlPalN4jj5xEjRXItWABLfr3MCb9wEGO8DctxdOuCtXq/mq+P88OsPM+I1a8wTDnWfhX/teoa/Ak2I+0= X-Received: by 2002:a17:90a:cb09:b0:2c9:65df:f871 with SMTP id 98e67ed59e1d1-2c965dff9a2mr233558a91.15.1719910166352; Tue, 02 Jul 2024 01:49:26 -0700 (PDT) MIME-Version: 1.0 References: <20240628163503.15893-1-askorichenko@netgate.com> <20240630084041.44e86a57@hermes.local> In-Reply-To: <20240630084041.44e86a57@hermes.local> From: Alexander Skorichenko Date: Tue, 2 Jul 2024 10:49:15 +0200 Message-ID: Subject: Re: [PATCH] net/netvsc: fix mtu_set in netvsc devices To: stephen@networkplumber.org Cc: samandrew@microsoft.com, ferruh.yigit@amd.com, andrew.rybchenko@oktetlabs.ru, longli@microsoft.com, Wei Hu , dev@dpdk.org, Matthew Smith Content-Type: multipart/alternative; boundary="00000000000082ee6c061c3fca35" X-Mailman-Approved-At: Tue, 02 Jul 2024 10:53:55 +0200 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org --00000000000082ee6c061c3fca35 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hello, The failure happens when running under VPP. In terms of dpdk calls the following sequence of events occurs for a NetVSC device: Durung device probing stage of VPP 1. eth_hn_dev_init() memory for a single primary rx_queue is allocated dev->data->rx_queues[0] =3D 0, allocated, but not set yet no allocation happened for tx_queues[i] and non-primary rx_queues[i] During device setup stage of VPP from VPP's own generic dpdk_device_setup()= : 2. rte_eth_dev_set_mtu() currently it segfaults in hn_reinit() when trying to reach into dev->data->rx_queues[i], dev->data->tx_queues[i] 3. rte_eth_tx_queue_setup() dev->data->tx_queues[i] are being allocated and set 4. rte_eth_rx_queue_setup() dev->data->rx_queues[i] get allocated (i > 0) and set So rx_queues[0] could be set in step 1, but rx_queues[i] and tx_queues[i] are still NULL. Allocating all the remaining rx/tx queues in step 1 would prevent the crash, but then in steps 3-4 would go through releasing and allocating all of the queues again. Another comment regarding the uniform condition introduced in hn_reinit() in the patch: if (dev->data->rx_queues[0] !=3D NULL) ... Because of the difference between rx/tx queues described above, it's probably safer to extend the condition to check both rx and tx separately if (dev->data->rx_queues[0] !=3D NULL && dev->data->tx_queues[0] != =3D NULL) - Alexander Skorichenko On Sun, Jun 30, 2024 at 5:40=E2=80=AFPM Stephen Hemminger < stephen@networkplumber.org> wrote: > On Fri, 28 Jun 2024 18:35:03 +0200 > Alexander Skorichenko wrote: > > > Prevent segfault in hn_reinit() caused by changing the MTU for > > an incompletely initialized device. > > > > Signed-off-by: Alexander Skorichenko > > How do you get in that state? > Maybe the init code should set up these pointers and avoid the problem. > --00000000000082ee6c061c3fca35 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hello,
The failure happens when runnin= g under VPP. In terms of dpdk calls the following sequence of events occurs= for a NetVSC device:

Durung device probing stage of VPP
=C2=A0 = =C2=A01. eth_hn_dev_init() memory for a single primary rx_queue is allocate= d
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 dev->data->rx_queues[0= ] =3D 0, allocated, but not set yet
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 no allocation happened for tx_queues[i] and non-primary rx_queues[i]=

During device setup stage of VPP from VPP's own generic dpdk_de= vice_setup():
=C2=A0 =C2=A02. =C2=A0rte_eth_dev_set_mtu()
=C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 currently it segfaults in hn_reinit() when = trying to reach into
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 dev->d= ata->rx_queues[i], dev->data->tx_queues[i]
=C2=A0 =C2=A03. =C2= =A0rte_eth_tx_queue_setup()
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 de= v->data->tx_queues[i] are being allocated and set
=C2=A0 =C2=A04. = =C2=A0rte_eth_rx_queue_setup()
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= dev->data->rx_queues[i] get allocated (i > 0) and set

So r= x_queues[0] could be set in step 1, but rx_queues[i] and tx_queues[i] are s= till NULL.
Allocating all the remaining rx/tx queues in step 1 would pr= event the crash, but then in steps 3-4 would go through releasing and alloc= ating all of the queues again.

Another comment regarding the unifor= m condition introduced in hn_reinit() in the patch:
=C2=A0 =C2=A0 =C2=A0= =C2=A0 if (dev->data->rx_queues[0] !=3D NULL) ...
Because of the = difference between rx/tx queues described above, it's probably safer to= extend the condition to check both rx and tx separately
=C2=A0 =C2=A0 = =C2=A0 =C2=A0 if (dev->data->rx_queues[0] !=3D NULL && dev-&g= t;data->tx_queues[0] !=3D NULL)
=C2=A0 =C2=A0 =C2=A0 =C2=A0
- Ale= xander Skorichenko


On Sun, Jun 30, 2024 at 5:40=E2=80=AFPM Stephen H= emminger <stephen@networkp= lumber.org> wrote:
On Fri, 28 Jun 2024 18:35:03 +0200
Alexander Skorichenko <askorichenko@netgate.com> wrote:

> Prevent segfault in hn_reinit() caused by changing the MTU for
> an incompletely initialized device.
>
> Signed-off-by: Alexander Skorichenko <askorichenko@netgate.com>

How do you get in that state?
Maybe the init code should set up these pointers and avoid the problem.
--00000000000082ee6c061c3fca35--