From: Edwin Brossette
Date: Fri, 6 Sep 2024 16:04:39 +0200
Subject: Re: Crash in tap pmd when using more than 8 rx queues
To: Ferruh Yigit
Cc: dev@dpdk.org, Olivier Matz, Didier Pallard, Laurent Hardy, Stephen Hemminger
List-Id: DPDK patches and discussions

Hello,

I created a Bugzilla PR, just as you requested:
https://bugs.dpdk.org/show_bug.cgi?id=1536

As for the bug resolution, I have other matters to attend to and I'm afraid
I cannot spend more time on this issue, so I was only planning to report it.

Regards,
Edwin Brossette.

On Fri, Sep 6, 2024 at 1:16 PM Ferruh Yigit <ferruh.yigit@amd.com> wrote:

> On 9/5/2024 1:55 PM, Edwin Brossette wrote:
> > Hello,
> >
> > I have recently stumbled into an issue with my DPDK-based application
> > running the failsafe pmd. This pmd uses a tap device, with which my
> > application fails to start if more than 8 rx queues are used. This issue
> > appears to be related to this patch:
> > https://git.dpdk.org/dpdk/commit/?id=c36ce7099c2187926cd62cff7ebd479823554929
> >
> > I have seen in the documentation that there is a limit of at most 8
> > queues shared when using a tap device shared between multiple processes.
> > However, my application uses a single primary process with no secondary
> > process, but it appears that I am still running into this limitation.
> >
> > Now if we look at this small chunk of code:
> >
> > memset(&msg, 0, sizeof(msg));
> > strlcpy(msg.name, TAP_MP_REQ_START_RXTX, sizeof(msg.name));
> > strlcpy(request_param->port_name, dev->data->name,
> >         sizeof(request_param->port_name));
> > msg.len_param = sizeof(*request_param);
> > for (i = 0; i < dev->data->nb_tx_queues; i++) {
> >     msg.fds[fd_iterator++] = process_private->txq_fds[i];
> >     msg.num_fds++;
> >     request_param->txq_count++;
> > }
> > for (i = 0; i < dev->data->nb_rx_queues; i++) {
> >     msg.fds[fd_iterator++] = process_private->rxq_fds[i];
> >     msg.num_fds++;
> >     request_param->rxq_count++;
> > }
> >
> > (Note that I am not using the latest DPDK version, but stable v23.11.1.
> > However, I believe the issue is still present on latest.)
> >
> > There is no check on the maximum value i can take in the for loops.
> > Since the size of msg.fds is limited by the maximum of 8 queues shared
> > between processes imposed by the IPC API, a buffer overflow can
> > happen here.
> >
> > See the struct declaration:
> >
> > struct rte_mp_msg {
> >     char name[RTE_MP_MAX_NAME_LEN];
> >     int len_param;
> >     int num_fds;
> >     uint8_t param[RTE_MP_MAX_PARAM_LEN];
> >     int fds[RTE_MP_MAX_FD_NUM];
> > };
> >
> > This means that if the number of queues used is more than 8, the program
> > will crash. This is what happens on my end, as I get the following log:
> >
> > *** stack smashing detected ***: terminated
> >
> > Reverting the commit mentioned above fixes my issue.
> > Also, adding a check like this works for me:
> >
> > if (dev->data->nb_tx_queues + dev->data->nb_rx_queues > RTE_MP_MAX_FD_NUM)
> >     return -1;
> >
> > I've made the changes on my local branch to fix my issue. This mail is
> > just to bring attention to this problem.
> > Thank you in advance for considering it.
> >
>
> Hi Edwin,
>
> Thanks for the report; I confirm the issue is valid, although that code
> changed a little (to increase the limit of 8) [3].
>
> And in this release Stephen put in another patch [1] to increase the limit
> even more, but irrespective of the limit, the tap code needs to be fixed.
>
> To fix:
> 1. We need to add the "nb_rx_queues > RTE_MP_MAX_FD_NUM" check you
> mentioned, so as not to blindly update 'msg.fds[]'.
> 2. We should prevent this from being a limit for the tap PMD when there
> is only a primary process; this seems to have been an oversight on our end.
>
> Can you work on the issue, or are you just reporting it?
> Can you please report the bug in Bugzilla [2], to record the issue?
>
> [1] https://patches.dpdk.org/project/dpdk/patch/20240905162018.74301-1-stephen@networkplumber.org/
> [2] https://bugs.dpdk.org/
> [3] https://git.dpdk.org/dpdk/commit/?id=72ab1dc1598e