From mboxrd@z Thu Jan  1 00:00:00 1970
From: Moti Haimovsky <motih@mellanox.com>
To: adrien.mazarguil@6wind.com, matan@mellanox.com
Cc: dev@dpdk.org, Moti Haimovsky <motih@mellanox.com>
Date: Thu, 28 Jun 2018 15:48:57 +0300
Message-Id: <1530190137-17848-1-git-send-email-motih@mellanox.com>
In-Reply-To: <1530187032-6489-1-git-send-email-motih@mellanox.com>
References: <1530187032-6489-1-git-send-email-motih@mellanox.com>
Subject: [dpdk-dev] [PATCH v3] net/mlx4: support hardware TSO
List-Id: DPDK patches and discussions

Implement support for hardware TSO.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
---
v3: * Fixed compilation errors, in compilers without GNU C extensions,
      caused by the declaration of a zero-length array in the code.
    in reply to 1530187032-6489-1-git-send-email-motih@mellanox.com
v2: * Fixed coding style warning.
    in reply to 1530184583-30166-1-git-send-email-motih@mellanox.com
v1: * Fixed coding style warnings.
    in reply to 1530181779-19716-1-git-send-email-motih@mellanox.com
---
 doc/guides/nics/features/mlx4.ini |   1 +
 doc/guides/nics/mlx4.rst          |   3 +
 drivers/net/mlx4/mlx4.c           |  16 ++
 drivers/net/mlx4/mlx4.h           |   5 +
 drivers/net/mlx4/mlx4_prm.h       |  12 ++
 drivers/net/mlx4/mlx4_rxtx.c      | 372 +++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx4/mlx4_rxtx.h      |   2 +-
 drivers/net/mlx4/mlx4_txq.c       |   8 +-
 8 files changed, 415 insertions(+), 4 deletions(-)

diff --git a/doc/guides/nics/features/mlx4.ini b/doc/guides/nics/features/mlx4.ini
index f6efd21..98a3f61 100644
--- a/doc/guides/nics/features/mlx4.ini
+++ b/doc/guides/nics/features/mlx4.ini
@@ -13,6 +13,7 @@ Queue start/stop     = Y
 MTU update           = Y
 Jumbo frame          = Y
 Scattered Rx         = Y
+TSO                  = Y
 Promiscuous mode     = Y
 Allmulticast mode    = Y
 Unicast MAC filter   = Y
diff --git a/doc/guides/nics/mlx4.rst b/doc/guides/nics/mlx4.rst
index 491106a..12adaeb 100644
--- a/doc/guides/nics/mlx4.rst
+++ b/doc/guides/nics/mlx4.rst
@@ -142,6 +142,9 @@ Limitations
   The ability to enable/disable CRC stripping requires OFED version
   4.3-1.5.0.0 and above or rdma-core version v18 and above.
 
+- TSO (Transmit Segmentation Offload) is supported in OFED version
+  4.4 and above or in rdma-core version v18 and above.
+
 Prerequisites
 -------------
 
diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index d151a90..61b7844 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -519,6 +519,8 @@ struct mlx4_conf {
 		.ports.present = 0,
 	};
 	unsigned int vf;
+	struct rte_mbuf mbuf;
+	uint64_t size_test = UINT_MAX;
 	int i;
 
 	(void)pci_drv;
@@ -677,6 +679,20 @@ struct mlx4_conf {
 			 IBV_RAW_PACKET_CAP_SCATTER_FCS);
 		DEBUG("FCS stripping toggling is %ssupported",
 		      priv->hw_fcs_strip ? "" : "not ");
+		/*
+		 * No TSO SIZE is defined in DPDK, need to figure it out
+		 * in order to see if we can support it.
+		 */
+		mbuf.tso_segsz = size_test;
+		priv->tso =
+			((device_attr_ex.tso_caps.max_tso >= mbuf.tso_segsz) &&
+			 (device_attr_ex.tso_caps.supported_qpts &
+			  (1 << IBV_QPT_RAW_PACKET)));
+		if (priv->tso)
+			priv->tso_max_payload_sz =
+				device_attr_ex.tso_caps.max_tso;
+		DEBUG("TSO is %ssupported",
+		      priv->tso ? "" : "not ");
 		/* Configure the first MAC address by default. */
 		err = mlx4_get_mac(priv, &mac.addr_bytes);
 		if (err) {
diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h
index 300cb4d..742d741 100644
--- a/drivers/net/mlx4/mlx4.h
+++ b/drivers/net/mlx4/mlx4.h
@@ -47,6 +47,9 @@
 /** Interrupt alarm timeout value in microseconds. */
 #define MLX4_INTR_ALARM_TIMEOUT 100000
 
+/* Maximum packet headers size (L2+L3+L4) for TSO. */
+#define MLX4_MAX_TSO_HEADER 192 /* TODO: find the real value. */
+
 /** Port parameter. */
 #define MLX4_PMD_PORT_KVARG "port"
 
@@ -90,6 +93,8 @@ struct priv {
 	uint32_t hw_csum:1; /**< Checksum offload is supported. */
 	uint32_t hw_csum_l2tun:1; /**< Checksum support for L2 tunnels. */
 	uint32_t hw_fcs_strip:1; /**< FCS stripping toggling is supported. */
+	uint32_t tso:1; /**< Transmit segmentation offload is supported. */
+	uint32_t tso_max_payload_sz; /**< Max supported TSO payload size. */
 	uint64_t hw_rss_sup; /**< Supported RSS hash fields (Verbs format). */
 	struct rte_intr_handle intr_handle; /**< Port interrupt handle. */
 	struct mlx4_drop *drop; /**< Shared resources for drop flow rules. */
diff --git a/drivers/net/mlx4/mlx4_prm.h b/drivers/net/mlx4/mlx4_prm.h
index e15a3c1..0484878 100644
--- a/drivers/net/mlx4/mlx4_prm.h
+++ b/drivers/net/mlx4/mlx4_prm.h
@@ -40,6 +40,7 @@
 /* Work queue element (WQE) flags. */
 #define MLX4_WQE_CTRL_IIP_HDR_CSUM (1 << 28)
 #define MLX4_WQE_CTRL_IL4_HDR_CSUM (1 << 27)
+#define MLX4_WQE_CTRL_RR (1 << 6)
 
 /* CQE checksum flags. */
 enum {
@@ -97,6 +98,17 @@ struct mlx4_cq {
 	int arm_sn; /**< Rx event counter. */
 };
 
+/*
+ * WQE LSO segment structure.
+ * Defined here as backward compatibility for rdma-core v17 and below. + * Similar definition is found in infiniband/mlx4dv.h in rdma-core v18 + * and above. + */ +struct mlx4_wqe_lso_seg_ { + __be32 mss_hdr_size; + __be32 header[]; +}; + /** * Retrieve a CQE entry from a CQ. * diff --git a/drivers/net/mlx4/mlx4_rxtx.c b/drivers/net/mlx4/mlx4_rxtx.c index a92da66..992d193 100644 --- a/drivers/net/mlx4/mlx4_rxtx.c +++ b/drivers/net/mlx4/mlx4_rxtx.c @@ -38,10 +38,25 @@ * DWORD (32 byte) of a TXBB. */ struct pv { - volatile struct mlx4_wqe_data_seg *dseg; + union { + volatile struct mlx4_wqe_data_seg *dseg; + volatile uint32_t *dst; + }; uint32_t val; }; +/** A helper struct for TSO packet handling. */ +struct tso_info { + /* Total size of the WQE including padding */ + uint32_t wqe_size; + /* size of TSO header to prepend to each packet to send */ + uint16_t tso_header_sz; + /* Total size of the TSO entry in the WQE. */ + uint16_t wqe_tso_seg_size; + /* Raw WQE size in units of 16 Bytes and without padding. */ + uint8_t fence_size; +}; + /** A table to translate Rx completion flags to packet type. */ uint32_t mlx4_ptype_table[0x100] __rte_cache_aligned = { /* @@ -377,6 +392,349 @@ struct pv { } /** + * Obtain and calculate TSO information needed for assembling a TSO WQE. + * + * @param buf + * Pointer to the first packet mbuf. + * @param txq + * Pointer to Tx queue structure. + * @param tinfo + * Pointer to a structure to fill the info with. + * + * @return + * 0 on success, negative value upon error. 
+ */
+static inline int
+mlx4_tx_burst_tso_get_params(struct rte_mbuf *buf,
+			     struct txq *txq,
+			     struct tso_info *tinfo)
+{
+	struct mlx4_sq *sq = &txq->msq;
+	const uint8_t tunneled = txq->priv->hw_csum_l2tun &&
+				 (buf->ol_flags & PKT_TX_TUNNEL_MASK);
+
+	tinfo->tso_header_sz = buf->l2_len + buf->l3_len + buf->l4_len;
+	if (tunneled)
+		tinfo->tso_header_sz += buf->outer_l2_len + buf->outer_l3_len;
+	if (unlikely(buf->tso_segsz == 0 || tinfo->tso_header_sz == 0)) {
+		DEBUG("%p: Invalid TSO parameters", (void *)txq);
+		return -EINVAL;
+	}
+	/* First segment must contain all TSO headers. */
+	if (unlikely(tinfo->tso_header_sz > MLX4_MAX_TSO_HEADER) ||
+	    tinfo->tso_header_sz > buf->data_len) {
+		DEBUG("%p: Invalid TSO header length", (void *)txq);
+		return -EINVAL;
+	}
+	/*
+	 * Calculate the WQE TSO segment size.
+	 * Note:
+	 * 1. An LSO segment must be padded such that the subsequent data
+	 *    segment is 16-byte aligned.
+	 * 2. The start address of the TSO segment is always 16-byte aligned.
+	 */
+	tinfo->wqe_tso_seg_size = RTE_ALIGN(sizeof(struct mlx4_wqe_lso_seg_) +
+					    tinfo->tso_header_sz,
+					    sizeof(struct mlx4_wqe_data_seg));
+	tinfo->fence_size = ((sizeof(struct mlx4_wqe_ctrl_seg) +
+			      tinfo->wqe_tso_seg_size) >> MLX4_SEG_SHIFT) +
+			    buf->nb_segs;
+	tinfo->wqe_size =
+		RTE_ALIGN((uint32_t)(tinfo->fence_size << MLX4_SEG_SHIFT),
+			  MLX4_TXBB_SIZE);
+	/* Validate WQE size and WQE space in the send queue. */
+	if (sq->remain_size < tinfo->wqe_size ||
+	    tinfo->wqe_size > MLX4_MAX_WQE_SIZE)
+		return -ENOMEM;
+	return 0;
+}
+
+/**
+ * Fill the TSO WQE data segments with info on buffers to transmit.
+ *
+ * @param buf
+ *   Pointer to the first packet mbuf.
+ * @param txq
+ *   Pointer to Tx queue structure.
+ * @param tinfo
+ *   Pointer to TSO info to use.
+ * @param dseg
+ *   Pointer to the first data segment in the TSO WQE.
+ * @param pv
+ *   Pointer to a stash area for saving the first 32-bit word of each TXBB
+ *   used for the TSO WQE.
+ * @param pv_counter
+ *   Current location in the stash.
+ *
+ * @return
+ *   0 on success, negative value upon error.
+ */
+static inline int
+mlx4_tx_burst_fill_tso_segs(struct rte_mbuf *buf,
+			    struct txq *txq,
+			    const struct tso_info *tinfo,
+			    volatile struct mlx4_wqe_data_seg *dseg,
+			    struct pv *pv, int *pv_counter)
+{
+	uint32_t lkey;
+	int nb_segs = buf->nb_segs;
+	int nb_segs_txbb;
+	struct mlx4_sq *sq = &txq->msq;
+	struct rte_mbuf *sbuf = buf;
+	uint16_t sb_of = tinfo->tso_header_sz;
+	uint16_t data_len;
+
+	while (nb_segs > 0) {
+		/* Wrap dseg if it points at the end of the queue. */
+		if ((volatile uint8_t *)dseg >= sq->eob)
+			dseg = (volatile struct mlx4_wqe_data_seg *)
+			       ((volatile uint8_t *)dseg - sq->size);
+		/* How many dseg entries do we have in the current TXBB? */
+		nb_segs_txbb =
+			(MLX4_TXBB_SIZE / sizeof(struct mlx4_wqe_data_seg)) -
+			((uintptr_t)dseg & (MLX4_TXBB_SIZE - 1)) /
+			sizeof(struct mlx4_wqe_data_seg);
+		switch (nb_segs_txbb) {
+		case 4:
+			/* Memory region key for this memory pool. */
+			lkey = mlx4_tx_mb2mr(txq, sbuf);
+			if (unlikely(lkey == (uint32_t)-1))
+				goto lkey_err;
+			dseg->addr =
+			    rte_cpu_to_be_64(rte_pktmbuf_mtod_offset(sbuf,
+								     uintptr_t,
+								     sb_of));
+			dseg->lkey = lkey;
+			/*
+			 * This data segment starts at the beginning of a new
+			 * TXBB, so we need to postpone its byte_count writing
+			 * for later.
+			 */
+			pv[*pv_counter].dseg = dseg;
+			/*
+			 * Zero length segment is treated as inline segment
+			 * with zero data.
+			 */
+			data_len = sbuf->data_len - sb_of;
+			pv[(*pv_counter)++].val =
+				rte_cpu_to_be_32(data_len ?
+						 data_len :
+						 0x80000000);
+			sb_of = 0;
+			sbuf = sbuf->next;
+			dseg++;
+			if (--nb_segs == 0)
+				break;
+			/* fallthrough */
+		case 3:
+			lkey = mlx4_tx_mb2mr(txq, sbuf);
+			if (unlikely(lkey == (uint32_t)-1))
+				goto lkey_err;
+			data_len = sbuf->data_len - sb_of;
+			mlx4_fill_tx_data_seg(dseg, lkey,
+					      rte_pktmbuf_mtod_offset(sbuf,
+								      uintptr_t,
+								      sb_of),
+					      rte_cpu_to_be_32(data_len ?
+							       data_len :
+							       0x80000000));
+			sb_of = 0;
+			sbuf = sbuf->next;
+			dseg++;
+			if (--nb_segs == 0)
+				break;
+			/* fallthrough */
+		case 2:
+			lkey = mlx4_tx_mb2mr(txq, sbuf);
+			if (unlikely(lkey == (uint32_t)-1))
+				goto lkey_err;
+			data_len = sbuf->data_len - sb_of;
+			mlx4_fill_tx_data_seg(dseg, lkey,
+					      rte_pktmbuf_mtod_offset(sbuf,
+								      uintptr_t,
+								      sb_of),
+					      rte_cpu_to_be_32(data_len ?
+							       data_len :
+							       0x80000000));
+			sb_of = 0;
+			sbuf = sbuf->next;
+			dseg++;
+			if (--nb_segs == 0)
+				break;
+			/* fallthrough */
+		case 1:
+			lkey = mlx4_tx_mb2mr(txq, sbuf);
+			if (unlikely(lkey == (uint32_t)-1))
+				goto lkey_err;
+			data_len = sbuf->data_len - sb_of;
+			mlx4_fill_tx_data_seg(dseg, lkey,
+					      rte_pktmbuf_mtod_offset(sbuf,
+								      uintptr_t,
+								      sb_of),
+					      rte_cpu_to_be_32(data_len ?
+							       data_len :
+							       0x80000000));
+			sb_of = 0;
+			sbuf = sbuf->next;
+			dseg++;
+			--nb_segs;
+			break;
+		default:
+			/* Should never happen. */
+			ERROR("%p: invalid number of txbb data segments %d",
+			      (void *)txq, nb_segs_txbb);
+			return -EINVAL;
+		}
+	}
+	return 0;
+lkey_err:
+	DEBUG("%p: unable to get MP <-> MR association", (void *)txq);
+	return -EFAULT;
+}
+
+/**
+ * Fill the packet's L2, L3 and L4 headers into the WQE.
+ * This will be used as the header for each TSO segment that is transmitted.
+ *
+ * @param buf
+ *   Pointer to the first packet mbuf.
+ * @param txq
+ *   Pointer to Tx queue structure.
+ * @param tinfo
+ *   Pointer to TSO info to use.
+ * @param tseg
+ *   Pointer to the TSO header field in the TSO WQE.
+ * @param pv
+ *   Pointer to a stash area for saving the first 32-bit word of each TXBB
+ *   used for the TSO WQE.
+ * @param pv_counter
+ *   Current location in the stash.
+ *
+ * @return
+ *   0 on success, negative value upon error.
+ */
+static inline int
+mlx4_tx_burst_fill_tso_hdr(struct rte_mbuf *buf,
+			   struct txq *txq,
+			   const struct tso_info *tinfo,
+			   volatile struct mlx4_wqe_lso_seg_ *tseg,
+			   struct pv *pv, int *pv_counter)
+{
+	struct mlx4_sq *sq = &txq->msq;
+	int remain_sz = tinfo->tso_header_sz;
+	char *from = rte_pktmbuf_mtod(buf, char *);
+	uint16_t txbb_avail_space;
+	int copy_sz;
+	/* Union to overcome volatile constraints when copying TSO header. */
+	union {
+		volatile uint8_t *vto;
+		uint8_t *to;
+	} thdr = { .vto = (volatile uint8_t *)tseg->header, };
+
+	/*
+	 * TSO data always starts at offset 20 from the beginning of the TXBB
+	 * (16-byte ctrl segment + 4-byte TSO descriptor). Since each TXBB is
+	 * 64-byte aligned we can write the first 44 TSO header bytes without
+	 * worrying about TxQ wrapping or overwriting the first TXBB 32-bit
+	 * word.
+	 */
+	txbb_avail_space = MLX4_TXBB_SIZE -
+			   (sizeof(struct mlx4_wqe_ctrl_seg) +
+			    sizeof(struct mlx4_wqe_lso_seg_));
+	copy_sz = RTE_MIN(txbb_avail_space, remain_sz);
+	rte_memcpy(thdr.to, from, copy_sz);
+	remain_sz -= copy_sz;
+	while (remain_sz > 0) {
+		from += copy_sz;
+		thdr.to += copy_sz;
+		/* Start of a TXBB; need to check for TxQ wrap. */
+		if (thdr.to >= sq->eob)
+			thdr.vto = sq->buf;
+		/* New TXBB, stash the first 32 bits for later use. */
+		pv[*pv_counter].dst = (volatile uint32_t *)thdr.vto;
+		pv[(*pv_counter)++].val = *((uint32_t *)from);
+		from += sizeof(uint32_t);
+		thdr.to += sizeof(uint32_t);
+		remain_sz -= sizeof(uint32_t);
+		if (remain_sz <= 0)
+			break;
+		/* Now copy the rest. */
+		txbb_avail_space = MLX4_TXBB_SIZE - sizeof(uint32_t);
+		copy_sz = RTE_MIN(txbb_avail_space, remain_sz);
+		rte_memcpy(thdr.to, from, copy_sz);
+		remain_sz -= copy_sz;
+	}
+	/* TODO: handle PID and IPID? */
+	tseg->mss_hdr_size = rte_cpu_to_be_32((buf->tso_segsz << 16) |
+					      tinfo->tso_header_sz);
+	return 0;
+}
+
+/**
+ * Write data segments and header for TSO uni/multi segment packet.
+ *
+ * @param buf
+ *   Pointer to the first packet mbuf.
+ * @param txq
+ *   Pointer to Tx queue structure.
+ * @param ctrl
+ *   Pointer to the WQE control segment.
+ *
+ * @return
+ *   Pointer to the next WQE control segment on success, NULL otherwise.
+ */
+static volatile struct mlx4_wqe_ctrl_seg *
+mlx4_tx_burst_tso(struct rte_mbuf *buf, struct txq *txq,
+		  volatile struct mlx4_wqe_ctrl_seg *ctrl)
+{
+	volatile struct mlx4_wqe_data_seg *dseg;
+	volatile struct mlx4_wqe_lso_seg_ *tseg =
+		(volatile struct mlx4_wqe_lso_seg_ *)(ctrl + 1);
+	struct mlx4_sq *sq = &txq->msq;
+	struct tso_info tinfo;
+	struct pv *pv = (struct pv *)txq->bounce_buf;
+	int pv_counter = 0;
+	int ret;
+
+	ret = mlx4_tx_burst_tso_get_params(buf, txq, &tinfo);
+	if (ret)
+		goto error;
+	ret = mlx4_tx_burst_fill_tso_hdr(buf, txq, &tinfo,
+					 tseg, pv, &pv_counter);
+	if (ret)
+		goto error;
+	/* Calculate data segment location. */
+	dseg = (volatile struct mlx4_wqe_data_seg *)
+	       ((uintptr_t)tseg + tinfo.wqe_tso_seg_size);
+	if ((uintptr_t)dseg >= (uintptr_t)sq->eob)
+		dseg = (volatile struct mlx4_wqe_data_seg *)
+		       ((uintptr_t)dseg - sq->size);
+	ret = mlx4_tx_burst_fill_tso_segs(buf, txq, &tinfo,
+					  dseg, pv, &pv_counter);
+	if (ret)
+		goto error;
+	/* Write the first DWORD of each TXBB saved earlier. */
+	if (pv_counter) {
+		/* Need a barrier here before writing the first TXBB word. */
+		rte_io_wmb();
+		for (--pv_counter; pv_counter >= 0; pv_counter--)
+			*pv[pv_counter].dst = pv[pv_counter].val;
+	}
+	ctrl->fence_size = tinfo.fence_size;
+	sq->remain_size -= tinfo.wqe_size;
+	/* Align next WQE address to the next TXBB. */
+	return (volatile struct mlx4_wqe_ctrl_seg *)
+	       ((volatile uint8_t *)ctrl + tinfo.wqe_size);
+error:
+	txq->stats.odropped++;
+	rte_errno = -ret;
+	return NULL;
+}
+
+/**
  * Write data segments of multi-segment packet.
  *
  * @param buf
@@ -569,6 +927,7 @@ struct pv {
 		uint16_t flags16[2];
 	} srcrb;
 	uint32_t lkey;
+	bool tso = txq->priv->tso && (buf->ol_flags & PKT_TX_TCP_SEG);
 
 	/* Clean up old buffer. */
 	if (likely(elt->buf != NULL)) {
@@ -587,7 +946,16 @@ struct pv {
 		} while (tmp != NULL);
 	}
 	RTE_MBUF_PREFETCH_TO_FREE(elt_next->buf);
-	if (buf->nb_segs == 1) {
+	if (tso) {
+		/* Change opcode to TSO. */
+		owner_opcode &= ~MLX4_OPCODE_CONFIG_CMD;
+		owner_opcode |= MLX4_OPCODE_LSO | MLX4_WQE_CTRL_RR;
+		ctrl_next = mlx4_tx_burst_tso(buf, txq, ctrl);
+		if (!ctrl_next) {
+			elt->buf = NULL;
+			break;
+		}
+	} else if (buf->nb_segs == 1) {
 		/* Validate WQE space in the send queue. */
 		if (sq->remain_size < MLX4_TXBB_SIZE) {
 			elt->buf = NULL;
diff --git a/drivers/net/mlx4/mlx4_rxtx.h b/drivers/net/mlx4/mlx4_rxtx.h
index 4c025e3..ffa8abf 100644
--- a/drivers/net/mlx4/mlx4_rxtx.h
+++ b/drivers/net/mlx4/mlx4_rxtx.h
@@ -90,7 +90,7 @@ struct mlx4_txq_stats {
 	unsigned int idx; /**< Mapping index. */
 	uint64_t opackets; /**< Total of successfully sent packets. */
 	uint64_t obytes; /**< Total of successfully sent bytes. */
-	uint64_t odropped; /**< Total of packets not sent when Tx ring full. */
+	uint64_t odropped; /**< Total of packets that failed to transmit. */
 };
 
 /** Tx queue descriptor. */
diff --git a/drivers/net/mlx4/mlx4_txq.c b/drivers/net/mlx4/mlx4_txq.c
index 6edaadb..9aa7440 100644
--- a/drivers/net/mlx4/mlx4_txq.c
+++ b/drivers/net/mlx4/mlx4_txq.c
@@ -116,8 +116,14 @@
 			     DEV_TX_OFFLOAD_UDP_CKSUM |
 			     DEV_TX_OFFLOAD_TCP_CKSUM);
 	}
-	if (priv->hw_csum_l2tun)
+	if (priv->tso)
+		offloads |= DEV_TX_OFFLOAD_TCP_TSO;
+	if (priv->hw_csum_l2tun) {
 		offloads |= DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM;
+		if (priv->tso)
+			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
+				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
+	}
 
 	return offloads;
 }
-- 
1.8.3.1