Date: Tue, 10 Sep 2024 09:58:02 -0700
From: Stephen Hemminger
To: Ferruh Yigit
Cc: Edwin Brossette, dev@dpdk.org, Olivier Matz, Didier Pallard, Laurent Hardy, kparameshwar@vmware.com, ferruh.yigit@intel.com
Subject: Re: Crash in tap pmd when using more than 8 rx queues
Message-ID: <20240910095802.22f3ab60@hermes.local>
List-Id: DPDK patches and discussions

On Fri, 6 Sep 2024 12:16:47 +0100
Ferruh Yigit wrote:

> On 9/5/2024 1:55 PM, Edwin Brossette wrote:
> > Hello,
> >
> > I have recently stumbled into an issue with my DPDK-based application
> > running the failsafe pmd. This pmd uses a tap device, with which my
> > application fails to start if more than 8 rx queues are used. This issue
> > appears to be related to this patch:
> > https://git.dpdk.org/dpdk/commit/?id=c36ce7099c2187926cd62cff7ebd479823554929
> >
> > I have seen in the documentation that there is a limit of 8 queues when
> > a tap device is shared between multiple processes. However, my
> > application uses a single primary process, with no secondary process,
> > yet it appears that I am still running into this limitation.
> >
> > Now if we look at this small chunk of code:
> >
> > memset(&msg, 0, sizeof(msg));
> > strlcpy(msg.name, TAP_MP_REQ_START_RXTX, sizeof(msg.name));
> > strlcpy(request_param->port_name, dev->data->name,
> >         sizeof(request_param->port_name));
> > msg.len_param = sizeof(*request_param);
> > for (i = 0; i < dev->data->nb_tx_queues; i++) {
> >     msg.fds[fd_iterator++] = process_private->txq_fds[i];
> >     msg.num_fds++;
> >     request_param->txq_count++;
> > }
> > for (i = 0; i < dev->data->nb_rx_queues; i++) {
> >     msg.fds[fd_iterator++] = process_private->rxq_fds[i];
> >     msg.num_fds++;
> >     request_param->rxq_count++;
> > }
> >
> > (Note that I am not using the latest DPDK version, but stable v23.11.1.
> > I believe the issue is still present on the latest, though.)
> >
> > There is no check on the maximum value i can take in these for loops.
> > Since the size of msg.fds is limited to a maximum of 8 fds shared
> > between processes by the IPC API, a buffer overflow can happen here.
> >
> > See the struct declaration:
> >
> > struct rte_mp_msg {
> >     char name[RTE_MP_MAX_NAME_LEN];
> >     int len_param;
> >     int num_fds;
> >     uint8_t param[RTE_MP_MAX_PARAM_LEN];
> >     int fds[RTE_MP_MAX_FD_NUM];
> > };
> >
> > This means that if more than 8 queues are used, the program will crash.
> > This is what happens on my end, as I get the following log:
> > *** stack smashing detected ***: terminated
> >
> > Reverting the commit mentioned above fixes my issue. Adding a check
> > like this also works for me:
> >
> > if (dev->data->nb_tx_queues + dev->data->nb_rx_queues > RTE_MP_MAX_FD_NUM)
> >     return -1;
> >
> > I've made the changes on my local branch to fix my issue. This mail is
> > just to bring attention to this problem.
> > Thank you in advance for considering it.
>
> Hi Edwin,
>
> Thanks for the report. I confirm the issue is valid, although that code
> has changed a little (to raise the limit of 8) [3].
>
> And in this release Stephen put in another patch [1] to raise the limit
> even more, but regardless of the limit, the tap code needs to be fixed.
>
> To fix:
> 1. We need to add the "nb_rx_queues > RTE_MP_MAX_FD_NUM" check you
> mentioned, so that 'msg.fds[]' is not updated blindly.
> 2. We should prevent this from being a limit for the tap PMD when there
> is only a primary process; this seems to have been an oversight on our
> end.

It is not clear what the error handling should be if the user requests
10 queues but RTE_MP_MAX_FD_NUM is 8. Ideally, it should work if no
secondary process is used, but there is no good way to know that in the
driver. That is why it is best to just set the TAP maximum queue count
to be less than or equal to RTE_MP_MAX_FD_NUM, and enforce that with a
static assertion at compile time.
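The compile-time guard suggested in the last paragraph could be sketched
roughly as below. The macro values and the name TAP_MAX_QUEUES are
illustrative placeholders, not the actual DPDK definitions:

```c
/* Sketch of a compile-time cap on tap queues, with illustrative values:
 * RTE_MP_MAX_FD_NUM and TAP_MAX_QUEUES here are NOT the real DPDK macros. */
#include <assert.h>

#define RTE_MP_MAX_FD_NUM 8   /* fd slots in one rte_mp_msg (illustrative) */
#define TAP_MAX_QUEUES    4   /* tap PMD's own queue cap (illustrative) */

/* One IPC message carries the fds of every tx queue plus every rx queue,
 * so the worst case is 2 * TAP_MAX_QUEUES fds. If the queue cap is ever
 * raised past what the fds[] array can hold, the build fails instead of
 * the stack being smashed at run time. */
static_assert(2 * TAP_MAX_QUEUES <= RTE_MP_MAX_FD_NUM,
              "tap queue limit must fit in the IPC fd array");
```

With such an assertion in place, the fd-copying loops can never run past
the end of msg.fds[], so no run-time error path is needed for the
primary-only case.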