From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: <3chas3@gmail.com> Received: from mail-it0-f65.google.com (mail-it0-f65.google.com [209.85.214.65]) by dpdk.org (Postfix) with ESMTP id 8DB1B2C5 for ; Mon, 20 Aug 2018 02:07:53 +0200 (CEST) Received: by mail-it0-f65.google.com with SMTP id g141-v6so18370392ita.4 for ; Sun, 19 Aug 2018 17:07:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=nqIHGLL20p0mnZ8jwOZHst2cNpGS2FebxAKX/wc6ycA=; b=AtfsE2GjB6z+gpF7lUUob7Hfh9nsoIHOdTSRXTLasLYUdDg47yJCUKF7b5tXyAGL1x p02Cq7ZaytghPpukBtrnI5XbilrPvapP95siRRhCeZVCrG643KcpEN+T8W8C1bHfAmAa SsqxhbqF7vYsnaFq8WbeQywqm1Lhxgl6zKxkMJQ45qix0gNf3GvBuvdIxgm6oSveXQrb pBwQ+uuGf7qx5wYKoEyWloFvbD9muNbE20xgahYAnLqUEAGnRu1Gm2nJjIfL2K/vpUn0 jkWkKHB1hUKw1oOrDwkPqTKQPdCGnoWdMkc0PeKmOUDACkQQSDNNfWDKcbsr0KA6fIzx 9ltg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=nqIHGLL20p0mnZ8jwOZHst2cNpGS2FebxAKX/wc6ycA=; b=Mlu5R4rJwu9/mw+NAotbITN/exw61i6JH89jDlmCYuRV5kBl1bkVweHhQ5i8kLEd9H dtJ0TVHbukJlyyPK0vGlA13GBJgNktygI4M5KUme6MqghOAFtGPxtt982v6sRpD6ctjb BSxu1IS8pM4HnL5tSlB4ItR7L5oAfECugosTqZacUEWSHpbvPseQJgduvLnAwHnR/dUs nm7mdkdyRfnQGx8FVT0cNoRLK0nMTdFAgZxgmRkMY2CtUiYiQd2DmWx6vftr8QYhOKcd SSvlDjytdF443P5AUeaFGTvaDEoPw5QgIv2f+RMXuRlb8VmrY0HHMkaNQmKJZXuSEpYW 5Iqg== X-Gm-Message-State: AOUpUlHCn4vOw6UrD/Jq7+4GnuQw9ookWplkzSspUohpK16qm29AO0Re h0b/G/9+iNJBplIW94LPe+vio6OoPtUNLy2zkZg= X-Google-Smtp-Source: AA+uWPwsX+lOiEoePuHfMg6Ca9GxPH/4SAaCxp4t5DeGjPPQco8ikGxvCllRS3mAKfvliLqkBA+zEc5JlJagDi5Uybc= X-Received: by 2002:a24:6b0d:: with SMTP id v13-v6mr12077534itc.16.1534723672888; Sun, 19 Aug 2018 17:07:52 -0700 (PDT) MIME-Version: 1.0 References: <1534551009-16550-1-git-send-email-jyu@vmware.com> <29C69F93-9D89-4528-A645-78E6C34E8CB5@vmware.com> In-Reply-To: <29C69F93-9D89-4528-A645-78E6C34E8CB5@vmware.com> From: Chas Williams <3chas3@gmail.com> Date: Sun, 19 Aug 2018 20:07:41 -0400 Message-ID: To: jyu@vmware.com Cc: dev@dpdk.org, Declan Doherty , Chas Williams Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.15 Subject: Re: [dpdk-dev] [PATCH] net/bonding: fix buf corruption in merging un-transmitted packets X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 20 Aug 2018 00:07:54 -0000 On Sun, Aug 19, 2018 at 6:19 PM Jia Yu wrote: > Hi Chas, > > > > Thanks for reviewing the change. > > > > Our application crashed after upgrading to DPDK 18.02, when packet rate i= s > high and bond is configured. It happened because txq contains invalid mbu= f > addresses after rte_eth_tx_burst call (for example, 0x100000000 repeated = 13 > times in one core dump). It seems that the invalid addresses came from bu= f > shift code below, so I changed this part of code to earlier version. We > don=E2=80=99t see crash after the fix. Please let us know if the fix is r= easonable. > With respect to correctness, the code you are fixing does seem broken. See inline below. > > > m_table =3D {0x7fdf23a68ec0, 0x7fdf23a68400, 0x7fdf23a66e80, 0x7fdf23a659= 00, > 0x7fdf23960e00, 0x100000000 , 0x7fdf23978640, > 0x7fdf23977b80, 0x7fdf239745c0, 0x7fdf23a4d600, 0x7fdf23a4cb40, > 0x7fdf23a75b00, 0x7fdf23a74580, 0x7fdf23972580, 0x7fdf239a76c0, > 0x7fdf239a5680, 0x7fdf23a7a640, 0x7fdf23a79b80, 0x7fdf23a77b40, > 0x7fdf2395b800} > > > > /* Send packet burst on each slave device */ > for (i =3D 0; i < slave_count; i++) { > if (slave_nb_bufs[i] =3D=3D 0) > continue; > > > slave_tx_count =3D rte_eth_tx_burst(slave_port_ids[i], > bd_tx_q->queue_id, slave_bufs[i], > slave_nb_bufs[i]); > A typical failure here would be to transmit no packets. slave_tx_count =3D= 0. > > > > total_tx_count +=3D slave_tx_count; > > > /* If tx burst fails move packets to end of bufs */ > if (unlikely(slave_tx_count < slave_nb_bufs[i])) { > slave_tx_fail_count[i] =3D slave_nb_bufs[i] - > slave_tx_count; > total_tx_fail_count +=3D slave_tx_fail_count[i]; > > > /* > * Shift bufs to beginning of array to allow reordering > * later > */ > for (j =3D 0; j < slave_tx_fail_count[i]; j++) { > slave_bufs[i][j] =3D > slave_bufs[i][(slave_tx_count - 1) + j]; <=3D=3D=3D= =3D what if > slave_tx_count =3D=3D 0, j =3D=3D 0 > As you correctly point out, slave_tx_count - 1 would be -1. Bad news. So, yes, I agree with your fix. In your fix, you might notice that slave_tx_fail_count doesn't need to be an array. It's local to if (unlikely(slave_tx_count < slave_nb_bufs[i])) now. > > } > } > } > > > > Thanks, > > Jia > > > > *From: *Chas Williams <3chas3@gmail.com> > *Date: *Saturday, August 18, 2018 at 4:50 PM > *To: *Jia Yu > *Cc: *"dev@dpdk.org" , Declan Doherty < > declan.doherty@intel.com>, Chas Williams > *Subject: *Re: [dpdk-dev] [PATCH] net/bonding: fix buf corruption in > merging un-transmitted packets > > > > > > > > On Fri, Aug 17, 2018 at 9:46 PM Jia Yu wrote: > > When bond slave devices cannot transmit all packets in bufs array, > tx_burst callback shall merge the un-transmitted packets back to > bufs array. Recent merge logic introduced a bug which causes > invalid mbuf addresses being written to bufs array. > > > > Can you expand on this a bit? What was the commit? > > > > > > When caller frees the un-transmitted packets, due to invalid addresses, > application will crash. > > The fix is avoid shifting mbufs, and directly write un-transmitted > packets back to bufs array. > > Signed-off-by: Jia Yu > --- > drivers/net/bonding/rte_eth_bond_pmd.c | 98 > +++++----------------------------- > 1 file changed, 14 insertions(+), 84 deletions(-) > > diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c > b/drivers/net/bonding/rte_eth_bond_pmd.c > index 58f7377..cce973a 100644 > --- a/drivers/net/bonding/rte_eth_bond_pmd.c > +++ b/drivers/net/bonding/rte_eth_bond_pmd.c > @@ -303,7 +303,7 @@ bond_ethdev_tx_burst_8023ad_fast_queue(void *queue, > struct rte_mbuf **bufs, > uint16_t slave_tx_count, slave_tx_fail_count[RTE_MAX_ETHPORTS] = =3D { > 0 }; > uint16_t total_tx_count =3D 0, total_tx_fail_count =3D 0; > > - uint16_t i, j; > + uint16_t i; > > if (unlikely(nb_bufs =3D=3D 0)) > return 0; > @@ -361,31 +361,9 @@ bond_ethdev_tx_burst_8023ad_fast_queue(void *queue, > struct rte_mbuf **bufs, > slave_tx_fail_count[i] =3D slave_nb_bufs[i] - > slave_tx_count; > total_tx_fail_count +=3D slave_tx_fail_count[i]; > - > - /* > - * Shift bufs to beginning of array to allow > reordering > - * later > - */ > - for (j =3D 0; j < slave_tx_fail_count[i]; j++) { > - slave_bufs[i][j] =3D > - slave_bufs[i][(slave_tx_count - 1= ) > + j]; > - } > - } > - } > - > - /* > - * If there are tx burst failures we move packets to end of bufs = to > - * preserve expected PMD behaviour of all failed transmitted bein= g > - * at the end of the input mbuf array > - */ > - if (unlikely(total_tx_fail_count > 0)) { > - int bufs_idx =3D nb_bufs - total_tx_fail_count - 1; > - > - for (i =3D 0; i < slave_count; i++) { > - if (slave_tx_fail_count[i] > 0) { > - for (j =3D 0; j < slave_tx_fail_count[i]; > j++) > - bufs[bufs_idx++] =3D > slave_bufs[i][j]; > - } > + memcpy(&bufs[nb_bufs - total_tx_fail_count], > + &slave_bufs[i][slave_tx_count], > + slave_tx_fail_count[i] * sizeof(bufs[0])); > } > } > > @@ -715,8 +693,8 @@ bond_ethdev_tx_burst_round_robin(void *queue, struct > rte_mbuf **bufs, > tx_fail_total +=3D tx_fail_slave; > > memcpy(&bufs[nb_pkts - tx_fail_total], > - > &slave_bufs[i][num_tx_slave], > - tx_fail_slave * > sizeof(bufs[0])); > + &slave_bufs[i][num_tx_slave], > + tx_fail_slave * sizeof(bufs[0])); > } > num_tx_total +=3D num_tx_slave; > } > @@ -1224,7 +1202,7 @@ bond_ethdev_tx_burst_balance(void *queue, struct > rte_mbuf **bufs, > uint16_t slave_tx_count, slave_tx_fail_count[RTE_MAX_ETHPORTS] = =3D { > 0 }; > uint16_t total_tx_count =3D 0, total_tx_fail_count =3D 0; > > - uint16_t i, j; > + uint16_t i; > > if (unlikely(nb_bufs =3D=3D 0)) > return 0; > @@ -1268,31 +1246,9 @@ bond_ethdev_tx_burst_balance(void *queue, struct > rte_mbuf **bufs, > slave_tx_fail_count[i] =3D slave_nb_bufs[i] - > slave_tx_count; > total_tx_fail_count +=3D slave_tx_fail_count[i]; > - > - /* > - * Shift bufs to beginning of array to allow > reordering > - * later > - */ > - for (j =3D 0; j < slave_tx_fail_count[i]; j++) { > - slave_bufs[i][j] =3D > - slave_bufs[i][(slave_tx_count - 1= ) > + j]; > - } > - } > - } > - > - /* > - * If there are tx burst failures we move packets to end of bufs = to > - * preserve expected PMD behaviour of all failed transmitted bein= g > - * at the end of the input mbuf array > - */ > - if (unlikely(total_tx_fail_count > 0)) { > - int bufs_idx =3D nb_bufs - total_tx_fail_count - 1; > - > - for (i =3D 0; i < slave_count; i++) { > - if (slave_tx_fail_count[i] > 0) { > - for (j =3D 0; j < slave_tx_fail_count[i]; > j++) > - bufs[bufs_idx++] =3D > slave_bufs[i][j]; > - } > + memcpy(&bufs[nb_bufs - total_tx_fail_count], > + &slave_bufs[i][slave_tx_count], > + slave_tx_fail_count[i] * sizeof(bufs[0])); > } > } > > @@ -1322,7 +1278,7 @@ bond_ethdev_tx_burst_8023ad(void *queue, struct > rte_mbuf **bufs, > uint16_t slave_tx_count, slave_tx_fail_count[RTE_MAX_ETHPORTS] = =3D { > 0 }; > uint16_t total_tx_count =3D 0, total_tx_fail_count =3D 0; > > - uint16_t i, j; > + uint16_t i; > > if (unlikely(nb_bufs =3D=3D 0)) > return 0; > @@ -1384,35 +1340,9 @@ bond_ethdev_tx_burst_8023ad(void *queue, struct > rte_mbuf **bufs, > slave_tx_count; > total_tx_fail_count +=3D > slave_tx_fail_count[i]; > > - /* > - * Shift bufs to beginning of array to > allow > - * reordering later > - */ > - for (j =3D 0; j < slave_tx_fail_count[i]; > j++) > - slave_bufs[i][j] =3D > - slave_bufs[i] > - [(slave_tx_count = - > 1) > - + j]; > - } > - } > - > - /* > - * If there are tx burst failures we move packets to end = of > - * bufs to preserve expected PMD behaviour of all failed > - * transmitted being at the end of the input mbuf array > - */ > - if (unlikely(total_tx_fail_count > 0)) { > - int bufs_idx =3D nb_bufs - total_tx_fail_count - = 1; > - > - for (i =3D 0; i < slave_count; i++) { > - if (slave_tx_fail_count[i] > 0) { > - for (j =3D 0; > - j < slave_tx_fail_count[i= ]; > - j++) { > - bufs[bufs_idx++] =3D > - slave_bufs[i][j]; > - } > - } > + memcpy(&bufs[nb_bufs - > total_tx_fail_count], > + &slave_bufs[i][slave_tx_count], > + slave_tx_fail_count[i] * > sizeof(bufs[0])); > } > } > } > -- > 2.7.4 > >