From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <matan@mellanox.com>
Received: from EUR02-AM5-obe.outbound.protection.outlook.com
 (mail-eopbgr00046.outbound.protection.outlook.com [40.107.0.46])
 by dpdk.org (Postfix) with ESMTP id BC4191B336
 for <dev@dpdk.org>; Tue,  3 Oct 2017 12:50:05 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Mellanox.com;
 s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version;
 bh=QsU4wUKa/dTJpI1INxa2/s8KKVh4loL9DHwM0zyxmm8=;
 b=xOsNhGu6J9CGv33LDI3+X1ljeN/3T970byGdqr15x/F9lbnyjaxKqxQY2c7Tl9vP/qoV7DgrZdME0KmceaHis8QEWh9bX5tclLpsYSqCW4AYUoXx/yqpXt/wgQ0YFarPInMumorfY573FV3bXOaUpPjwjKzMFNqAlt9vvpYwwO0=
Authentication-Results: spf=none (sender IP is )
 smtp.mailfrom=matan@mellanox.com; 
Received: from mellanox.com (37.142.13.130) by
 AM5PR0502MB3042.eurprd05.prod.outlook.com (2603:10a6:203:a1::18) with
 Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P256) id 15.20.77.7; Tue, 3 Oct
 2017 10:50:03 +0000
From: Matan Azrad <matan@mellanox.com>
To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Cc: dev@dpdk.org,
	Moti Haimovsky <motih@mellanox.com>
Date: Tue,  3 Oct 2017 10:48:28 +0000
Message-Id: <1507027711-879-4-git-send-email-matan@mellanox.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1507027711-879-1-git-send-email-matan@mellanox.com>
References: <1503590050-196143-1-git-send-email-motih@mellanox.com>
 <1507027711-879-1-git-send-email-matan@mellanox.com>
MIME-Version: 1.0
Content-Type: text/plain
X-Originating-IP: [37.142.13.130]
X-ClientProxiedBy: HE1PR09CA0085.eurprd09.prod.outlook.com
 (2603:10a6:7:3d::29) To AM5PR0502MB3042.eurprd05.prod.outlook.com
 (2603:10a6:203:a1::18)
X-MS-PublicTrafficType: Email
X-MS-Office365-Filtering-Correlation-Id: 4698ea7c-7913-4d3f-7041-08d50a4c8162
X-MS-Office365-Filtering-HT: Tenant
Received-SPF: None (protection.outlook.com: mellanox.com does not designate
 permitted sender hosts)
X-OriginatorOrg: Mellanox.com
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 03 Oct 2017 10:50:03.9172 (UTC)
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-CrossTenant-Id: a652971c-7d2e-4d9b-a6a4-d149256f461b
X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM5PR0502MB3042
Subject: [dpdk-dev] [PATCH v2 3/6] net/mlx4: support multi-segments Tx
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Tue, 03 Oct 2017 10:50:05 -0000

From: Moti Haimovsky <motih@mellanox.com>

This patch adds support for transmitting packets that span multiple
buffers.
It also takes into account the number of entries a packet occupies in
the TxQ when setting the chip's report-completion flag.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
---
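A minimal sketch of the completion countdown, assuming a simplified
queue. The names txq_sketch and want_completion are hypothetical and only
mirror the elts_comp_cd and elts_comp_cd_init fields touched by this patch:

```c
#include <assert.h>

/* Hypothetical stand-in for the two countdown fields of struct txq. */
struct txq_sketch {
	int comp_cd;      /* TXBBs left before the next completion request. */
	int comp_cd_init; /* Value the countdown is reset to. */
};

/*
 * Charge nr_txbbs against the countdown and return 1 when the WQE being
 * posted should request a completion, 0 otherwise.
 */
static int
want_completion(struct txq_sketch *q, int nr_txbbs)
{
	assert(nr_txbbs > 0);
	q->comp_cd -= nr_txbbs;
	if (q->comp_cd <= 0) {
		q->comp_cd = q->comp_cd_init;
		return 1;
	}
	return 0;
}
```

A packet occupying several TXBBs can take the countdown below zero, which
is why the field has to be signed.
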
 drivers/net/mlx4/mlx4_rxtx.c | 208 ++++++++++++++++++++++++-------------------
 drivers/net/mlx4/mlx4_rxtx.h |   6 +-
 drivers/net/mlx4/mlx4_txq.c  |  12 ++-
 3 files changed, 129 insertions(+), 97 deletions(-)
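
The postponed byte_count writes concern data segments that sit at the
start of a TXBB. A minimal sketch of which segments those are, assuming a
64-byte TXBB, 16-byte control and data segments, and a WQE that starts on
a TXBB boundary (assumed values, not stated in this patch). All names are
hypothetical; the driver tests the segment address instead of an offset:

```c
#include <assert.h>

/* Assumed sizes; the driver takes them from its own headers. */
#define SKETCH_TXBB_SIZE 64u
#define SKETCH_CTRL_SEG_SIZE 16u
#define SKETCH_DATA_SEG_SIZE 16u

/* Number of TXBBs needed for a WQE carrying nb_segs data segments. */
static unsigned int
sketch_wqe_txbbs(unsigned int nb_segs)
{
	unsigned int size;

	assert(nb_segs > 0);
	size = SKETCH_CTRL_SEG_SIZE + nb_segs * SKETCH_DATA_SEG_SIZE;
	return (size + SKETCH_TXBB_SIZE - 1) / SKETCH_TXBB_SIZE;
}

/* Return 1 when data segment seg_idx (0-based) starts a TXBB. */
static int
sketch_dseg_starts_txbb(unsigned int seg_idx)
{
	unsigned int off = SKETCH_CTRL_SEG_SIZE +
			   seg_idx * SKETCH_DATA_SEG_SIZE;

	return (off & (SKETCH_TXBB_SIZE - 1)) == 0;
}

/* Count the byte_count writes that would be postponed for a packet. */
static unsigned int
sketch_deferred_writes(unsigned int nb_segs)
{
	unsigned int i;
	unsigned int n = 0;

	for (i = 0; i < nb_segs; i++)
		n += (unsigned int)sketch_dseg_starts_txbb(i);
	return n;
}
```

Under these assumptions a four-segment packet needs two TXBBs, and its
fourth data segment is the one whose byte_count write is postponed.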

diff --git a/drivers/net/mlx4/mlx4_rxtx.c b/drivers/net/mlx4/mlx4_rxtx.c
index e45bb3b..4200716 100644
--- a/drivers/net/mlx4/mlx4_rxtx.c
+++ b/drivers/net/mlx4/mlx4_rxtx.c
@@ -63,6 +63,16 @@
 #include "mlx4_rxtx.h"
 #include "mlx4_utils.h"
 
+/*
+ * Pointer-value pair structure
+ * used in mlx4_post_send() for saving the first DWORD (32 bits)
+ * of a TXBB.
+ */
+struct pv {
+	struct mlx4_wqe_data_seg *dseg;
+	uint32_t val;
+};
+
 /**
  * Stamp a WQE so it won't be reused by the HW.
  * Routine is used when freeing WQE used by the chip or when failing
@@ -296,34 +306,38 @@
  *
  * @param txq
  *   The Tx queue to post to.
- * @param wr
- *   The work request to handle.
- * @param bad_wr
- *   The wr in case that posting had failed.
+ * @param pkt
+ *   The packet to transmit.
  *
  * @return
  *   0 - success, negative errno value otherwise and rte_errno is set.
  */
 static inline int
 mlx4_post_send(struct txq *txq,
-	       struct rte_mbuf *pkt,
-	       uint32_t send_flags)
+	       struct rte_mbuf *pkt)
 {
 	struct mlx4_wqe_ctrl_seg *ctrl;
 	struct mlx4_wqe_data_seg *dseg;
 	struct mlx4_sq *sq = &txq->msq;
+	struct rte_mbuf *buf;
 	uint32_t head_idx = sq->head & sq->txbb_cnt_mask;
 	uint32_t lkey;
 	uintptr_t addr;
+	uint32_t srcrb_flags;
+	uint32_t owner_opcode = MLX4_OPCODE_SEND;
+	uint32_t byte_count;
 	int wqe_real_size;
 	int nr_txbbs;
 	int rc;
+	struct pv *pv = (struct pv *)txq->bounce_buf;
+	int pv_counter = 0;
 
 	/* Calculate the needed work queue entry size for this packet. */
 	wqe_real_size = sizeof(struct mlx4_wqe_ctrl_seg) +
 			pkt->nb_segs * sizeof(struct mlx4_wqe_data_seg);
 	nr_txbbs = MLX4_SIZE_TO_TXBBS(wqe_real_size);
-	/* Check that there is room for this WQE in the send queue and
+	/*
+	 * Check that there is room for this WQE in the send queue and
 	 * that the WQE size is legal.
 	 */
 	if (likely(((sq->head - sq->tail) + nr_txbbs +
@@ -332,76 +346,108 @@
 		rc = ENOSPC;
 		goto err;
 	}
-	/* Get the control and single-data entries of the WQE */
+	/* Get the control and data entries of the WQE. */
 	ctrl = (struct mlx4_wqe_ctrl_seg *)mlx4_get_send_wqe(sq, head_idx);
 	dseg = (struct mlx4_wqe_data_seg *)(((char *)ctrl) +
 		sizeof(struct mlx4_wqe_ctrl_seg));
-	/*
-	 * Fill the data segment with buffer information.
-	 */
-	addr = rte_pktmbuf_mtod(pkt, uintptr_t);
-	rte_prefetch0((volatile void *)addr);
-	dseg->addr = rte_cpu_to_be_64(addr);
-	/* Memory region key for this memory pool. */
-	lkey = mlx4_txq_mp2mr(txq, mlx4_txq_mb2mp(pkt));
-	if (unlikely(lkey == (uint32_t)-1)) {
-		/* MR does not exist. */
-		DEBUG("%p: unable to get MP <-> MR"
-		      " association", (void *)txq);
-		/*
-		 * Restamp entry in case of failure.
-		 * Make sure that size is written correctly.
-		 * Note that we give ownership to the SW, not the HW.
+	/* Fill the data segments with buffer information. */
+	for (buf = pkt; buf != NULL; buf = buf->next, dseg++) {
+		addr = rte_pktmbuf_mtod(buf, uintptr_t);
+		rte_prefetch0((volatile void *)addr);
+		/* Handle WQE wraparound. */
+		if (unlikely(dseg >= (struct mlx4_wqe_data_seg *)sq->eob))
+			dseg = (struct mlx4_wqe_data_seg *)sq->buf;
+		dseg->addr = rte_cpu_to_be_64(addr);
+		/* Memory region key for this memory pool. */
+		lkey = mlx4_txq_mp2mr(txq, mlx4_txq_mb2mp(buf));
+		if (unlikely(lkey == (uint32_t)-1)) {
+			/* MR does not exist. */
+			DEBUG("%p: unable to get MP <-> MR"
+			      " association", (void *)txq);
+			/*
+			 * Restamp entry in case of failure.
+			 * Make sure that size is written correctly.
+			 * Note that we give ownership to the SW, not the HW.
+			 */
+			ctrl->fence_size = (wqe_real_size >> 4) & 0x3f;
+			mlx4_txq_stamp_freed_wqe(sq, head_idx,
+				     (sq->head & sq->txbb_cnt) ? 0 : 1);
+			rc = EFAULT;
+			goto err;
+		}
+		dseg->lkey = rte_cpu_to_be_32(lkey);
+		if (likely(buf->data_len))
+			byte_count = rte_cpu_to_be_32(buf->data_len);
+		else
+			/*
+			 * Zero length segment is treated as inline segment
+			 * with zero data.
+			 */
+			byte_count = RTE_BE32(0x80000000);
+		/* If the data segment is not at the beginning of a
+		 * Tx basic block (TXBB) then write the byte count,
+		 * else postpone the writing to just before updating the
+		 * control segment.
 		 */
-		ctrl->fence_size = (wqe_real_size >> 4) & 0x3f;
-		mlx4_txq_stamp_freed_wqe(sq, head_idx,
-					 (sq->head & sq->txbb_cnt) ? 0 : 1);
-		rc = EFAULT;
-		goto err;
+		if ((uintptr_t)dseg & (uintptr_t)(MLX4_TXBB_SIZE - 1)) {
+			/*
+			 * Need a barrier here before writing the byte_count
+			 * fields to make sure that all the data is visible
+			 * before the byte_count field is set.
+			 * Otherwise, if the segment begins a new cacheline,
+			 * the HCA prefetcher could grab the 64-byte chunk and
+			 * get a valid (!= 0xffffffff) byte count but stale
+			 * data, and end up sending the wrong data.
+			 */
+			rte_io_wmb();
+			dseg->byte_count = byte_count;
+		} else {
+			/*
+			 * This data segment starts at the beginning of a new
+			 * TXBB, so we need to postpone its byte_count writing
+			 * for later.
+			 */
+			pv[pv_counter].dseg = dseg;
+			pv[pv_counter++].val = byte_count;
+		}
 	}
-	dseg->lkey = rte_cpu_to_be_32(lkey);
-	/*
-	 * Need a barrier here before writing the byte_count field to
-	 * make sure that all the data is visible before the
-	 * byte_count field is set.  Otherwise, if the segment begins
-	 * a new cacheline, the HCA prefetcher could grab the 64-byte
-	 * chunk and get a valid (!= * 0xffffffff) byte count but
-	 * stale data, and end up sending the wrong data.
-	 */
-	rte_io_wmb();
-	if (likely(pkt->data_len))
-		dseg->byte_count = rte_cpu_to_be_32(pkt->data_len);
-	else
-		/*
-		 * Zero length segment is treated as inline segment
-		 * with zero data.
-		 */
-		dseg->byte_count = RTE_BE32(0x80000000);
-	/*
-	 * Fill the control parameters for this packet.
-	 * For raw Ethernet, the SOLICIT flag is used to indicate that no icrc
-	 * should be calculated
-	 */
-	ctrl->srcrb_flags =
-		rte_cpu_to_be_32(MLX4_WQE_CTRL_SOLICIT |
-				 (send_flags & MLX4_WQE_CTRL_CQ_UPDATE));
+	/* Write the first DWORD of each TXBB saved earlier. */
+	if (pv_counter) {
+		/* Need a barrier here before writing the byte_count. */
+		rte_io_wmb();
+		for (--pv_counter; pv_counter >= 0; pv_counter--)
+			pv[pv_counter].dseg->byte_count = pv[pv_counter].val;
+	}
+	/* Fill the control parameters for this packet. */
 	ctrl->fence_size = (wqe_real_size >> 4) & 0x3f;
 	/*
 	 * The caller should prepare "imm" in advance in order to support
 	 * VF to VF communication (when the device is a virtual-function
 	 * device (VF)).
-	 */
+	 */
 	ctrl->imm = 0;
 	/*
+	 * For raw Ethernet, the SOLICIT flag is used to indicate that no icrc
+	 * should be calculated.
+	 */
+	txq->elts_comp_cd -= nr_txbbs;
+	if (unlikely(txq->elts_comp_cd <= 0)) {
+		txq->elts_comp_cd = txq->elts_comp_cd_init;
+		srcrb_flags = RTE_BE32(MLX4_WQE_CTRL_SOLICIT |
+				       MLX4_WQE_CTRL_CQ_UPDATE);
+	} else {
+		srcrb_flags = RTE_BE32(MLX4_WQE_CTRL_SOLICIT);
+	}
+	ctrl->srcrb_flags = srcrb_flags;
+	/*
 	 * Make sure descriptor is fully written before
 	 * setting ownership bit (because HW can start
 	 * executing as soon as we do).
 	 */
-	rte_wmb();
-	ctrl->owner_opcode =
-		rte_cpu_to_be_32(MLX4_OPCODE_SEND |
-				 ((sq->head & sq->txbb_cnt) ?
-				  MLX4_BIT_WQE_OWN : 0));
+	rte_wmb();
+	ctrl->owner_opcode = rte_cpu_to_be_32(owner_opcode |
+					      ((sq->head & sq->txbb_cnt) ?
+					      MLX4_BIT_WQE_OWN : 0));
 	sq->head += nr_txbbs;
 	return 0;
 err:
@@ -428,14 +474,13 @@
 	struct txq *txq = (struct txq *)dpdk_txq;
 	unsigned int elts_head = txq->elts_head;
 	const unsigned int elts_n = txq->elts_n;
-	unsigned int elts_comp_cd = txq->elts_comp_cd;
 	unsigned int elts_comp = 0;
 	unsigned int bytes_sent = 0;
 	unsigned int i;
 	unsigned int max;
 	int err;
 
-	assert(elts_comp_cd != 0);
+	assert(txq->elts_comp_cd != 0);
 	mlx4_txq_complete(txq);
 	max = (elts_n - (elts_head - txq->elts_tail));
 	if (max > elts_n)
@@ -454,8 +499,6 @@
 			(((elts_head + 1) == elts_n) ? 0 : elts_head + 1);
 		struct txq_elt *elt_next = &(*txq->elts)[elts_head_next];
 		struct txq_elt *elt = &(*txq->elts)[elts_head];
-		unsigned int segs = buf->nb_segs;
-		uint32_t send_flags = 0;
 
 		/* Clean up old buffer. */
 		if (likely(elt->buf != NULL)) {
@@ -473,34 +516,16 @@
 				tmp = next;
 			} while (tmp != NULL);
 		}
-		/* Request Tx completion. */
-		if (unlikely(--elts_comp_cd == 0)) {
-			elts_comp_cd = txq->elts_comp_cd_init;
-			++elts_comp;
-			send_flags |= MLX4_WQE_CTRL_CQ_UPDATE;
-		}
-		if (likely(segs == 1)) {
-			/* Update element. */
-			elt->buf = buf;
-			RTE_MBUF_PREFETCH_TO_FREE(elt_next->buf);
-			/* post the pkt for sending */
-			err = mlx4_post_send(txq, buf, send_flags);
-			if (unlikely(err)) {
-				if (unlikely(send_flags &
-					     MLX4_WQE_CTRL_CQ_UPDATE)) {
-					elts_comp_cd = 1;
-					--elts_comp;
-				}
-				elt->buf = NULL;
-				goto stop;
-			}
-			elt->buf = buf;
-			bytes_sent += buf->pkt_len;
-		} else {
-			err = -EINVAL;
-			rte_errno = -err;
+		RTE_MBUF_PREFETCH_TO_FREE(elt_next->buf);
+		/* Post the packet for sending. */
+		err = mlx4_post_send(txq, buf);
+		if (unlikely(err)) {
+			elt->buf = NULL;
 			goto stop;
 		}
+		elt->buf = buf;
+		bytes_sent += buf->pkt_len;
+		++elts_comp;
 		elts_head = elts_head_next;
 	}
 stop:
@@ -516,7 +541,6 @@
 	rte_write32(txq->msq.doorbell_qpn, txq->msq.db);
 	txq->elts_head = elts_head;
 	txq->elts_comp += elts_comp;
-	txq->elts_comp_cd = elts_comp_cd;
 	return i;
 }
 
diff --git a/drivers/net/mlx4/mlx4_rxtx.h b/drivers/net/mlx4/mlx4_rxtx.h
index df83552..1b90533 100644
--- a/drivers/net/mlx4/mlx4_rxtx.h
+++ b/drivers/net/mlx4/mlx4_rxtx.h
@@ -103,13 +103,15 @@ struct txq {
 	struct mlx4_cq mcq; /**< Info for directly manipulating the CQ. */
 	unsigned int elts_head; /**< Current index in (*elts)[]. */
 	unsigned int elts_tail; /**< First element awaiting completion. */
-	unsigned int elts_comp; /**< Number of completion requests. */
-	unsigned int elts_comp_cd; /**< Countdown for next completion. */
+	unsigned int elts_comp; /**< Number of pkts waiting for completion. */
+	int elts_comp_cd; /**< Countdown for next completion. */
 	unsigned int elts_comp_cd_init; /**< Initial value for countdown. */
 	unsigned int elts_n; /**< (*elts)[] length. */
 	struct txq_elt (*elts)[]; /**< Tx elements. */
 	struct mlx4_txq_stats stats; /**< Tx queue counters. */
 	uint32_t max_inline; /**< Max inline send size. */
+	char *bounce_buf;
+	/**< Memory used for storing the first DWORD of data TXBBs. */
 	struct {
 		const struct rte_mempool *mp; /**< Cached memory pool. */
 		struct ibv_mr *mr; /**< Memory region (for mp). */
diff --git a/drivers/net/mlx4/mlx4_txq.c b/drivers/net/mlx4/mlx4_txq.c
index 492779f..9333311 100644
--- a/drivers/net/mlx4/mlx4_txq.c
+++ b/drivers/net/mlx4/mlx4_txq.c
@@ -83,8 +83,14 @@
 		rte_calloc_socket("TXQ", 1, sizeof(*elts), 0, txq->ctrl.socket);
 	int ret = 0;
 
-	if (elts == NULL) {
-		ERROR("%p: can't allocate packets array", (void *)txq);
+	/* Allocate bounce buffer memory. */
+	txq->bounce_buf = (char *)rte_zmalloc_socket("TXQ",
+						     MLX4_MAX_WQE_SIZE,
+						     RTE_CACHE_LINE_MIN_SIZE,
+						     txq->ctrl.socket);
+
+	if ((elts == NULL) || (txq->bounce_buf == NULL)) {
+		ERROR("%p: can't allocate TXQ memory", (void *)txq);
 		ret = ENOMEM;
 		goto error;
 	}
@@ -110,6 +116,7 @@
 	assert(ret == 0);
 	return 0;
 error:
+	rte_free(txq->bounce_buf);
 	rte_free(elts);
 	DEBUG("%p: failed, freed everything", (void *)txq);
 	assert(ret > 0);
@@ -303,7 +310,6 @@ struct txq_mp2mr_mbuf_check_data {
 	struct mlx4dv_obj mlxdv;
 	struct mlx4dv_qp dv_qp;
 	struct mlx4dv_cq dv_cq;
-
 	struct txq tmpl = {
 		.ctrl = {
 			.priv = priv,
-- 
1.8.3.1