From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: <3chas3@gmail.com> Received: from mail-io0-f193.google.com (mail-io0-f193.google.com [209.85.223.193]) by dpdk.org (Postfix) with ESMTP id 0AF182BAE for ; Mon, 20 Aug 2018 01:20:39 +0200 (CEST) Received: by mail-io0-f193.google.com with SMTP id c22-v6so5019357iob.1 for ; Sun, 19 Aug 2018 16:20:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=JgZpDKcgLnEIwJOC4uVHzDYOrOftp7b3Tbp42DUk2uI=; b=b+teV0rPxFUI9GgyMba8+DuixKXdAI9e0FB4TsszVDQ/ER9hZoBmTVCl39399guKx7 iTDOCG2LxjgRsl0uF5NGNDEePEmgoHCoXH71SNn8yJ3t/PLAzhhMMvrbo5YmFvSorj3q vT/B+0ArTqXpYPYm10/hW6UN6pWT6XUyQy6N/+atwB1TsXHM34+trbbOAoHlc+Dcqg8z zogEmlHsQ+as+R2ToUqjC4OWfO6GaD+QhvT/HPEZDw5snMw+qAf6EQ2oqgK0+zXsX3mj yjZb5kpZsMge8t9jXAbtaFzv4igT+zTdbSAmwtiVt5NT3d8z3QajP301/Yz3AXKiOfgD K4CQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=JgZpDKcgLnEIwJOC4uVHzDYOrOftp7b3Tbp42DUk2uI=; b=DFsvM95bnMS5zL45JMsXs6f5ivN43ov987oxLhFMiwLmHrsS7yMK5p83VIrTE4y77m HkPvgOvHJQBdn3XI/dY7p4jWwN/ueimUiEkRqNYRa/dVd8R86wtcOA9+gBs2tllO272G 1lhk5jTLdRoDii3w4GjQbTcVQJyMu02epYSGyZMWZ/WKR0Xygpe+kV2TCWRKj4pFFnYb YS32zwD98/lwuiwxcGybj89TSHs1R8o4WKeb2rsqLNe6HkL3od5/MGrnpCZq/aXnxPxa ElzzR2GIFMS1xifuhABexzu6FnB9Kx9DMJnUBjGWgU2DhIYn6XpIhodjhsjt91d00JNJ flCg== X-Gm-Message-State: APzg51CER6be6HYORDaHIW61XkGfcWs2JVQvagI5Yed+wP0+W0sPaDkU RtYhHmv63FXbAbjBE9heGk+JHN0RgBNiRzQg2bg= X-Google-Smtp-Source: ANB0VdbQ//IezWNueBAb0dOXqktsTDRJCRoNgp6AaDdzDsEEbAm9iF8m5+ubE2nwsMcAXmLepOuKRG8pIRjMyQASRWw= X-Received: by 2002:a6b:9a4d:: with SMTP id c74-v6mr1493974ioe.110.1534720838261; Sun, 19 Aug 2018 16:20:38 -0700 (PDT) MIME-Version: 1.0 References: <1534551009-16550-1-git-send-email-jyu@vmware.com> <29C69F93-9D89-4528-A645-78E6C34E8CB5@vmware.com> In-Reply-To: <29C69F93-9D89-4528-A645-78E6C34E8CB5@vmware.com> From: Chas Williams <3chas3@gmail.com> Date: Sun, 19 Aug 2018 19:20:27 -0400 Message-ID: To: jyu@vmware.com Cc: dev@dpdk.org, Declan Doherty , Chas Williams Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.15 Subject: Re: [dpdk-dev] [PATCH] net/bonding: fix buf corruption in merging un-transmitted packets X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 19 Aug 2018 23:20:39 -0000 On Sun, Aug 19, 2018 at 6:19 PM Jia Yu wrote: > Hi Chas, > > > > Thanks for reviewing the change. > > > > Our application crashed after upgrading to DPDK 18.02, when packet rate i= s > high and bond is configured. It happened because txq contains invalid mbu= f > addresses after rte_eth_tx_burst call (for example, 0x100000000 repeated = 13 > times in one core dump). It seems that the invalid addresses came from bu= f > shift code below, so I changed this part of code to earlier version. We > don=E2=80=99t see crash after the fix. Please let us know if the fix is r= easonable. > Sorry, I was unclear. Can you reference the previous commit that broke wha= t you are fixed? I would like to see a "Fixes" line. See https://core.dpdk.org/contribute/ for guidance or just examine git log for an example of what Fixes: should look like. I am guessing you meant 09150784a776 ("net/bonding: burst mode hash calculation") in your commit message? > > > m_table =3D {0x7fdf23a68ec0, 0x7fdf23a68400, 0x7fdf23a66e80, 0x7fdf23a659= 00, > 0x7fdf23960e00, 0x100000000 , 0x7fdf23978640, > 0x7fdf23977b80, 0x7fdf239745c0, 0x7fdf23a4d600, 0x7fdf23a4cb40, > 0x7fdf23a75b00, 0x7fdf23a74580, 0x7fdf23972580, 0x7fdf239a76c0, > 0x7fdf239a5680, 0x7fdf23a7a640, 0x7fdf23a79b80, 0x7fdf23a77b40, > 0x7fdf2395b800} > > > > /* Send packet burst on each slave device */ > for (i =3D 0; i < slave_count; i++) { > if (slave_nb_bufs[i] =3D=3D 0) > continue; > > > slave_tx_count =3D rte_eth_tx_burst(slave_port_ids[i], > bd_tx_q->queue_id, slave_bufs[i], > slave_nb_bufs[i]); > > > total_tx_count +=3D slave_tx_count; > > > /* If tx burst fails move packets to end of bufs */ > if (unlikely(slave_tx_count < slave_nb_bufs[i])) { > slave_tx_fail_count[i] =3D slave_nb_bufs[i] - > slave_tx_count; > total_tx_fail_count +=3D slave_tx_fail_count[i]; > > > /* > * Shift bufs to beginning of array to allow reordering > * later > */ > for (j =3D 0; j < slave_tx_fail_count[i]; j++) { > slave_bufs[i][j] =3D > slave_bufs[i][(slave_tx_count - 1) + j]; <=3D=3D=3D= =3D what if > slave_tx_count =3D=3D 0, j =3D=3D 0 > } > } > } > > > > Thanks, > > Jia > > > > *From: *Chas Williams <3chas3@gmail.com> > *Date: *Saturday, August 18, 2018 at 4:50 PM > *To: *Jia Yu > *Cc: *"dev@dpdk.org" , Declan Doherty < > declan.doherty@intel.com>, Chas Williams > *Subject: *Re: [dpdk-dev] [PATCH] net/bonding: fix buf corruption in > merging un-transmitted packets > > > > > > > > On Fri, Aug 17, 2018 at 9:46 PM Jia Yu wrote: > > When bond slave devices cannot transmit all packets in bufs array, > tx_burst callback shall merge the un-transmitted packets back to > bufs array. Recent merge logic introduced a bug which causes > invalid mbuf addresses being written to bufs array. > > > > Can you expand on this a bit? What was the commit? > > > > > > When caller frees the un-transmitted packets, due to invalid addresses, > application will crash. > > The fix is avoid shifting mbufs, and directly write un-transmitted > packets back to bufs array. > > Signed-off-by: Jia Yu > --- > drivers/net/bonding/rte_eth_bond_pmd.c | 98 > +++++----------------------------- > 1 file changed, 14 insertions(+), 84 deletions(-) > > diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c > b/drivers/net/bonding/rte_eth_bond_pmd.c > index 58f7377..cce973a 100644 > --- a/drivers/net/bonding/rte_eth_bond_pmd.c > +++ b/drivers/net/bonding/rte_eth_bond_pmd.c > @@ -303,7 +303,7 @@ bond_ethdev_tx_burst_8023ad_fast_queue(void *queue, > struct rte_mbuf **bufs, > uint16_t slave_tx_count, slave_tx_fail_count[RTE_MAX_ETHPORTS] = =3D { > 0 }; > uint16_t total_tx_count =3D 0, total_tx_fail_count =3D 0; > > - uint16_t i, j; > + uint16_t i; > > if (unlikely(nb_bufs =3D=3D 0)) > return 0; > @@ -361,31 +361,9 @@ bond_ethdev_tx_burst_8023ad_fast_queue(void *queue, > struct rte_mbuf **bufs, > slave_tx_fail_count[i] =3D slave_nb_bufs[i] - > slave_tx_count; > total_tx_fail_count +=3D slave_tx_fail_count[i]; > - > - /* > - * Shift bufs to beginning of array to allow > reordering > - * later > - */ > - for (j =3D 0; j < slave_tx_fail_count[i]; j++) { > - slave_bufs[i][j] =3D > - slave_bufs[i][(slave_tx_count - 1= ) > + j]; > - } > - } > - } > - > - /* > - * If there are tx burst failures we move packets to end of bufs = to > - * preserve expected PMD behaviour of all failed transmitted bein= g > - * at the end of the input mbuf array > - */ > - if (unlikely(total_tx_fail_count > 0)) { > - int bufs_idx =3D nb_bufs - total_tx_fail_count - 1; > - > - for (i =3D 0; i < slave_count; i++) { > - if (slave_tx_fail_count[i] > 0) { > - for (j =3D 0; j < slave_tx_fail_count[i]; > j++) > - bufs[bufs_idx++] =3D > slave_bufs[i][j]; > - } > + memcpy(&bufs[nb_bufs - total_tx_fail_count], > + &slave_bufs[i][slave_tx_count], > + slave_tx_fail_count[i] * sizeof(bufs[0])); > } > } > > @@ -715,8 +693,8 @@ bond_ethdev_tx_burst_round_robin(void *queue, struct > rte_mbuf **bufs, > tx_fail_total +=3D tx_fail_slave; > > memcpy(&bufs[nb_pkts - tx_fail_total], > - > &slave_bufs[i][num_tx_slave], > - tx_fail_slave * > sizeof(bufs[0])); > + &slave_bufs[i][num_tx_slave], > + tx_fail_slave * sizeof(bufs[0])); > } > num_tx_total +=3D num_tx_slave; > } > @@ -1224,7 +1202,7 @@ bond_ethdev_tx_burst_balance(void *queue, struct > rte_mbuf **bufs, > uint16_t slave_tx_count, slave_tx_fail_count[RTE_MAX_ETHPORTS] = =3D { > 0 }; > uint16_t total_tx_count =3D 0, total_tx_fail_count =3D 0; > > - uint16_t i, j; > + uint16_t i; > > if (unlikely(nb_bufs =3D=3D 0)) > return 0; > @@ -1268,31 +1246,9 @@ bond_ethdev_tx_burst_balance(void *queue, struct > rte_mbuf **bufs, > slave_tx_fail_count[i] =3D slave_nb_bufs[i] - > slave_tx_count; > total_tx_fail_count +=3D slave_tx_fail_count[i]; > - > - /* > - * Shift bufs to beginning of array to allow > reordering > - * later > - */ > - for (j =3D 0; j < slave_tx_fail_count[i]; j++) { > - slave_bufs[i][j] =3D > - slave_bufs[i][(slave_tx_count - 1= ) > + j]; > - } > - } > - } > - > - /* > - * If there are tx burst failures we move packets to end of bufs = to > - * preserve expected PMD behaviour of all failed transmitted bein= g > - * at the end of the input mbuf array > - */ > - if (unlikely(total_tx_fail_count > 0)) { > - int bufs_idx =3D nb_bufs - total_tx_fail_count - 1; > - > - for (i =3D 0; i < slave_count; i++) { > - if (slave_tx_fail_count[i] > 0) { > - for (j =3D 0; j < slave_tx_fail_count[i]; > j++) > - bufs[bufs_idx++] =3D > slave_bufs[i][j]; > - } > + memcpy(&bufs[nb_bufs - total_tx_fail_count], > + &slave_bufs[i][slave_tx_count], > + slave_tx_fail_count[i] * sizeof(bufs[0])); > } > } > > @@ -1322,7 +1278,7 @@ bond_ethdev_tx_burst_8023ad(void *queue, struct > rte_mbuf **bufs, > uint16_t slave_tx_count, slave_tx_fail_count[RTE_MAX_ETHPORTS] = =3D { > 0 }; > uint16_t total_tx_count =3D 0, total_tx_fail_count =3D 0; > > - uint16_t i, j; > + uint16_t i; > > if (unlikely(nb_bufs =3D=3D 0)) > return 0; > @@ -1384,35 +1340,9 @@ bond_ethdev_tx_burst_8023ad(void *queue, struct > rte_mbuf **bufs, > slave_tx_count; > total_tx_fail_count +=3D > slave_tx_fail_count[i]; > > - /* > - * Shift bufs to beginning of array to > allow > - * reordering later > - */ > - for (j =3D 0; j < slave_tx_fail_count[i]; > j++) > - slave_bufs[i][j] =3D > - slave_bufs[i] > - [(slave_tx_count = - > 1) > - + j]; > - } > - } > - > - /* > - * If there are tx burst failures we move packets to end = of > - * bufs to preserve expected PMD behaviour of all failed > - * transmitted being at the end of the input mbuf array > - */ > - if (unlikely(total_tx_fail_count > 0)) { > - int bufs_idx =3D nb_bufs - total_tx_fail_count - = 1; > - > - for (i =3D 0; i < slave_count; i++) { > - if (slave_tx_fail_count[i] > 0) { > - for (j =3D 0; > - j < slave_tx_fail_count[i= ]; > - j++) { > - bufs[bufs_idx++] =3D > - slave_bufs[i][j]; > - } > - } > + memcpy(&bufs[nb_bufs - > total_tx_fail_count], > + &slave_bufs[i][slave_tx_count], > + slave_tx_fail_count[i] * > sizeof(bufs[0])); > } > } > } > -- > 2.7.4 > >