From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <rsanford@akamai.com>
Received: from prod-mail-xrelay02.akamai.com (prod-mail-xrelay02.akamai.com
 [72.246.2.14]) by dpdk.org (Postfix) with ESMTP id 17912684A
 for <dev@dpdk.org>; Wed, 20 Aug 2014 22:50:31 +0200 (CEST)
Received: from prod-mail-xrelay02.akamai.com (localhost [127.0.0.1])
 by postfix.imss70 (Postfix) with ESMTP id 1ABD4285CC;
 Wed, 20 Aug 2014 20:54:04 +0000 (GMT)
Received: from prod-mail-relay06.akamai.com (prod-mail-relay06.akamai.com
 [172.17.120.126])
 by prod-mail-xrelay02.akamai.com (Postfix) with ESMTP id 070E728583;
 Wed, 20 Aug 2014 20:54:04 +0000 (GMT)
Received: from ustx2ex-cashub.dfw01.corp.akamai.com
 (ustx2ex-cashub1.dfw01.corp.akamai.com [172.27.25.75])
 by prod-mail-relay06.akamai.com (Postfix) with ESMTP id D81562026;
 Wed, 20 Aug 2014 20:54:03 +0000 (GMT)
Received: from USMBX2.msg.corp.akamai.com ([169.254.1.85]) by
 ustx2ex-cashub1.dfw01.corp.akamai.com ([172.27.25.75]) with mapi; Wed, 20 Aug
 2014 15:54:03 -0500
From: "Sanford, Robert" <rsanford@akamai.com>
To: Declan Doherty <declan.doherty@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
Date: Wed, 20 Aug 2014 15:54:02 -0500
Thread-Topic: [dpdk-dev] [PATCH 4/6] bond: free mbufs if transmission fails
 in bonding tx_burst functions
Thread-Index: Ac+8uN+006dAy0+sQSqgPktSQ1ygoQ==
Message-ID: <D01A81B6.27E1%rsanford@akamai.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
user-agent: Microsoft-MacOutlook/14.4.3.140616
acceptlanguage: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-dev] [PATCH 4/6] bond: free mbufs if transmission fails
 in bonding tx_burst functions
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Wed, 20 Aug 2014 20:50:31 -0000

Hi Declan,

I have a problem with the TX-burst logic of this patch. I believe that for
packets that we *don't* enqueue to the device, we should *NOT* free them.
The API expects that the caller will free them or try again to send them.

Here is one way to accomplish selective freeing: Move mbuf pointers of
packets successfully enqueued, to the beginning of the caller's mbuf
vector; move mbuf pointers of packets not enqueued, to the end of the
caller's mbuf list. Although possibly re-ordered, the caller will be able
to free/resend all (and only) mbufs that we failed to enqueue.

Here is a code snippet to do this:

  uint16_t failed_index =3D nb_pkts;

  /* Send packet burst on each slave device */
  for (i =3D 0; i < num_of_slaves; i++) {
    if (slave_nb_pkts[i] > 0) {
      num_tx_slave =3D rte_eth_tx_burst(slaves[i],
          bd_tx_q->queue_id, slave_bufs[i], slave_nb_pkts[i]);

   =20
      /* Move sent packets to the beginning of the caller's array. */
      if (likely(num_tx_slave)) {
        memcpy(&bufs[num_tx_total], slave_bufs[i],
            num_tx_slave * sizeof(void *));
        num_tx_total +=3D num_tx_slave;
      }

   =20
      /* Move failed packets to the end of the caller's array,
       * so that the caller can free or resend the correct ones.
       */
      if (unlikely(num_tx_slave < slave_nb_pkts[i])) {
        uint16_t num_failed =3D slave_nb_pkts[i] - num_tx_slave;
        failed_index -=3D num_failed;
        memcpy(&bufs[failed_index], &slave_bufs[i][num_tx_slave],
            num_failed * sizeof(void *));
      }
    }
  }      =20
=20


--
Regards,
Robert


>Fixing a number of corner cases that if transmission failed on slave
>devices then this
>could lead to leaked mbufs
>
>
>Signed-off-by: Declan Doherty <declan.doherty@intel.com>
>---
> app/test/test_link_bonding.c           |    4 +-
> lib/librte_pmd_bond/rte_eth_bond_pmd.c |   46
>+++++++++++++++++++++++++-------
> 2 files changed, 38 insertions(+), 12 deletions(-)
>
>diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
>index 02823b6..3c265ee 100644
>--- a/app/test/test_link_bonding.c
>+++ b/app/test/test_link_bonding.c
>@@ -3415,7 +3415,7 @@ test_broadcast_tx_burst(void)
> 	/* Send burst on bonded port */
> 	nb_tx =3D rte_eth_tx_burst(test_params->bonded_port_id, 0, pkts_burst,
> 			burst_size);
>-	if (nb_tx !=3D burst_size * test_params->bonded_slave_count) {
>+	if (nb_tx !=3D burst_size) {
> 		printf("Bonded Port (%d) rx burst failed, packets transmitted value
>(%u) not as expected (%d)\n",
> 				test_params->bonded_port_id,
> 				nb_tx, burst_size);
>@@ -3770,7 +3770,7 @@
>test_broadcast_verify_slave_link_status_change_behaviour(void)
> 	}
>=20
> 	if (rte_eth_tx_burst(test_params->bonded_port_id, 0, &pkt_burst[0][0],
>-			burst_size) !=3D (burst_size * slave_count)) {
>+			burst_size) !=3D burst_size) {
> 		printf("rte_eth_tx_burst failed\n");
> 		return -1;
> 	}
>diff --git a/lib/librte_pmd_bond/rte_eth_bond_pmd.c
>b/lib/librte_pmd_bond/rte_eth_bond_pmd.c
>index 70123fc..ae9726e 100644
>--- a/lib/librte_pmd_bond/rte_eth_bond_pmd.c
>+++ b/lib/librte_pmd_bond/rte_eth_bond_pmd.c
>@@ -101,7 +101,7 @@ bond_ethdev_tx_burst_round_robin(void *queue, struct
>rte_mbuf **bufs,
> 	uint8_t num_of_slaves;
> 	uint8_t slaves[RTE_MAX_ETHPORTS];
>=20
>-	uint16_t num_tx_total =3D 0;
>+	uint16_t num_tx_total =3D 0, num_tx_slave;
>=20
> 	static int slave_idx =3D 0;
> 	int i, cs_idx =3D 0;
>@@ -130,9 +130,17 @@ bond_ethdev_tx_burst_round_robin(void *queue, struct
>rte_mbuf **bufs,
>=20
> 	/* Send packet burst on each slave device */
> 	for (i =3D 0; i < num_of_slaves; i++)
>-		if (slave_nb_pkts[i] > 0)
>-			num_tx_total +=3D rte_eth_tx_burst(slaves[i],
>+		if (slave_nb_pkts[i] > 0) {
>+			num_tx_slave =3D rte_eth_tx_burst(slaves[i],
> 					bd_tx_q->queue_id, slave_bufs[i], slave_nb_pkts[i]);
>+			num_tx_total +=3D num_tx_slave;
>+
>+			/* if tx burst fails, free unsent mbufs */
>+			while (unlikely(num_tx_slave < slave_nb_pkts[i])) {
>+				rte_pktmbuf_free(slave_bufs[i][num_tx_slave]);
>+				num_tx_slave++;
>+			}
>+		}
>=20
> 	return num_tx_total;
> }
>@@ -283,7 +291,7 @@ bond_ethdev_tx_burst_balance(void *queue, struct
>rte_mbuf **bufs,
> 	uint8_t num_of_slaves;
> 	uint8_t slaves[RTE_MAX_ETHPORTS];
>=20
>-	uint16_t num_tx_total =3D 0;
>+	uint16_t num_tx_total =3D 0, num_tx_slave =3D 0;
>=20
> 	int i, op_slave_id;
>=20
>@@ -315,11 +323,19 @@ bond_ethdev_tx_burst_balance(void *queue, struct
>rte_mbuf **bufs,
> 	/* Send packet burst on each slave device */
> 	for (i =3D 0; i < num_of_slaves; i++) {
> 		if (slave_nb_pkts[i] > 0) {
>-			num_tx_total +=3D rte_eth_tx_burst(slaves[i], bd_tx_q->queue_id,
>+			num_tx_slave =3D rte_eth_tx_burst(slaves[i], bd_tx_q->queue_id,
> 					slave_bufs[i], slave_nb_pkts[i]);
>+			num_tx_total +=3D num_tx_slave;
>+
>+			/* if tx burst fails, free unsent mbufs */
>+			while (unlikely(num_tx_slave < slave_nb_pkts[i])) {
>+				rte_pktmbuf_free(slave_bufs[i][num_tx_slave]);
>+				num_tx_slave++;
>+			}
> 		}
> 	}
>=20
>+
> 	return num_tx_total;
> }
>=20
>@@ -333,7 +349,7 @@ bond_ethdev_tx_burst_broadcast(void *queue, struct
>rte_mbuf **bufs,
> 	uint8_t num_of_slaves;
> 	uint8_t slaves[RTE_MAX_ETHPORTS];
>=20
>-	uint16_t num_tx_total =3D 0;
>+	uint16_t num_tx_slave =3D 0, max_tx_pkts =3D 0;
>=20
> 	int i;
>=20
>@@ -354,11 +370,21 @@ bond_ethdev_tx_burst_broadcast(void *queue, struct
>rte_mbuf **bufs,
> 		rte_mbuf_refcnt_update(bufs[i], num_of_slaves - 1);
>=20
> 	/* Transmit burst on each active slave */
>-	for (i =3D 0; i < num_of_slaves; i++)
>-		num_tx_total +=3D rte_eth_tx_burst(slaves[i], bd_tx_q->queue_id,
>-				bufs, nb_pkts);
>+	for (i =3D 0; i < num_of_slaves; i++) {
>+		num_tx_slave =3D rte_eth_tx_burst(slaves[i], bd_tx_q->queue_id,
>+					bufs, nb_pkts);
>=20
>-	return num_tx_total;
>+		if (max_tx_pkts < num_tx_slave)
>+			max_tx_pkts =3D num_tx_slave;
>+
>+		/* if tx burst fails, free unsent mbufs */
>+		while (unlikely(num_tx_slave < nb_pkts)) {
>+			rte_pktmbuf_free(bufs[num_tx_slave]);
>+			num_tx_slave++;
>+		}
>+	}
>+
>+	return max_tx_pkts;
> }
>=20
> void
>--=20
>1.7.0.7
>