From mboxrd@z Thu Jan 1 00:00:00 1970
From: Moti Haimovsky
To: adrien.mazarguil@6wind.com, matan@mellanox.com
Cc: dev@dpdk.org, Moti Haimovsky
Date: Mon, 9 Jul 2018 19:33:15 +0300
Message-Id: <1531153995-26627-1-git-send-email-motih@mellanox.com>
X-Mailer: git-send-email 1.7.1
In-Reply-To: <1531132986-5054-1-git-send-email-motih@mellanox.com>
References: <1531132986-5054-1-git-send-email-motih@mellanox.com>
MIME-Version: 1.0
Content-Type: text/plain
Subject: [dpdk-dev] [PATCH v6] net/mlx4: support hardware TSO
X-BeenThere: dev@dpdk.org
List-Id: DPDK patches and discussions

Implement support for hardware TSO.

Signed-off-by: Moti Haimovsky
---
v6:
* Minor bug fixes from previous commit.
* More optimizations on the TSO data-segments creation routine.
in reply to 1531132986-5054-1-git-send-email-motih@mellanox.com

v5:
* Modifications to the code according to review inputs from Matan Azrad.
* Code optimization to the TSO header copy routine.
* Rearranged the TSO data-segments creation routine.
in reply to 1530715998-15703-1-git-send-email-motih@mellanox.com

v4:
* Bug fixes in filling TSO data segments.
* Modifications according to review inputs from Adrien Mazarguil
  and Matan Azrad.
in reply to 1530190137-17848-1-git-send-email-motih@mellanox.com

v3:
* Fixed compilation errors in compilers without GNU C extensions
  caused by a declaration of zero-length array in the code.
in reply to 1530187032-6489-1-git-send-email-motih@mellanox.com

v2:
* Fixed coding style warning.
in reply to 1530184583-30166-1-git-send-email-motih@mellanox.com

v1:
* Fixed coding style warnings.
in reply to 1530181779-19716-1-git-send-email-motih@mellanox.com
---
 doc/guides/nics/features/mlx4.ini |   1 +
 doc/guides/nics/mlx4.rst          |   3 +
 drivers/net/mlx4/Makefile         |   5 +
 drivers/net/mlx4/mlx4.c           |   9 +
 drivers/net/mlx4/mlx4.h           |   5 +
 drivers/net/mlx4/mlx4_prm.h       |  15 ++
 drivers/net/mlx4/mlx4_rxtx.c      | 378 +++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx4/mlx4_rxtx.h      |   2 +-
 drivers/net/mlx4/mlx4_txq.c       |   8 +-
 9 files changed, 422 insertions(+), 4 deletions(-)

diff --git a/doc/guides/nics/features/mlx4.ini b/doc/guides/nics/features/mlx4.ini
index f6efd21..98a3f61 100644
--- a/doc/guides/nics/features/mlx4.ini
+++ b/doc/guides/nics/features/mlx4.ini
@@ -13,6 +13,7 @@
 Queue start/stop     = Y
 MTU update           = Y
 Jumbo frame          = Y
 Scattered Rx         = Y
+TSO                  = Y
 Promiscuous mode     = Y
 Allmulticast mode    = Y
 Unicast MAC filter   = Y
diff --git a/doc/guides/nics/mlx4.rst b/doc/guides/nics/mlx4.rst
index 491106a..12adaeb 100644
--- a/doc/guides/nics/mlx4.rst
+++ b/doc/guides/nics/mlx4.rst
@@ -142,6 +142,9 @@ Limitations
   The ability to enable/disable CRC stripping requires OFED version
   4.3-1.5.0.0 and above or rdma-core version v18 and above.
 
+- TSO (Transmit Segmentation Offload) is supported in OFED version
+  4.4 and above or in rdma-core version v18 and above.
+
 Prerequisites
 -------------
 
diff --git a/drivers/net/mlx4/Makefile b/drivers/net/mlx4/Makefile
index 73f9d40..63bc003 100644
--- a/drivers/net/mlx4/Makefile
+++ b/drivers/net/mlx4/Makefile
@@ -85,6 +85,11 @@ mlx4_autoconf.h.new: FORCE
 mlx4_autoconf.h.new: $(RTE_SDK)/buildtools/auto-config-h.sh
 	$Q $(RM) -f -- '$@'
 	$Q : > '$@'
+	$Q sh -- '$<' '$@' \
+		HAVE_IBV_MLX4_WQE_LSO_SEG \
+		infiniband/mlx4dv.h \
+		type 'struct mlx4_wqe_lso_seg' \
+		$(AUTOCONF_OUTPUT)
 
 # Create mlx4_autoconf.h or update it in case it differs from the new one.
 
diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index d151a90..5d8c76d 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -677,6 +677,15 @@ struct mlx4_conf {
 			 IBV_RAW_PACKET_CAP_SCATTER_FCS);
 	DEBUG("FCS stripping toggling is %ssupported",
 	      priv->hw_fcs_strip ? "" : "not ");
+	priv->tso =
+		((device_attr_ex.tso_caps.max_tso > 0) &&
+		 (device_attr_ex.tso_caps.supported_qpts &
+		  (1 << IBV_QPT_RAW_PACKET)));
+	if (priv->tso)
+		priv->tso_max_payload_sz =
+			device_attr_ex.tso_caps.max_tso;
+	DEBUG("TSO is %ssupported",
+	      priv->tso ? "" : "not ");
 	/* Configure the first MAC address by default. */
 	err = mlx4_get_mac(priv, &mac.addr_bytes);
 	if (err) {
diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h
index 300cb4d..89d8c38 100644
--- a/drivers/net/mlx4/mlx4.h
+++ b/drivers/net/mlx4/mlx4.h
@@ -47,6 +47,9 @@
 /** Interrupt alarm timeout value in microseconds. */
 #define MLX4_INTR_ALARM_TIMEOUT 100000
 
+/* Maximum packet headers size (L2+L3+L4) for TSO. */
+#define MLX4_MAX_TSO_HEADER 192
+
 /** Port parameter. */
 #define MLX4_PMD_PORT_KVARG "port"
 
@@ -90,6 +93,8 @@ struct priv {
 	uint32_t hw_csum:1; /**< Checksum offload is supported. */
 	uint32_t hw_csum_l2tun:1; /**< Checksum support for L2 tunnels. */
 	uint32_t hw_fcs_strip:1; /**< FCS stripping toggling is supported. */
+	uint32_t tso:1; /**< Transmit segmentation offload is supported. */
+	uint32_t tso_max_payload_sz; /**< Max supported TSO payload size.
+				      */
 	uint64_t hw_rss_sup; /**< Supported RSS hash fields (Verbs format). */
 	struct rte_intr_handle intr_handle; /**< Port interrupt handle. */
 	struct mlx4_drop *drop; /**< Shared resources for drop flow rules. */
diff --git a/drivers/net/mlx4/mlx4_prm.h b/drivers/net/mlx4/mlx4_prm.h
index b771d8c..aef77ba 100644
--- a/drivers/net/mlx4/mlx4_prm.h
+++ b/drivers/net/mlx4/mlx4_prm.h
@@ -19,6 +19,7 @@
 #ifdef PEDANTIC
 #pragma GCC diagnostic error "-Wpedantic"
 #endif
+#include "mlx4_autoconf.h"
 
 /* ConnectX-3 Tx queue basic block. */
 #define MLX4_TXBB_SHIFT 6
@@ -40,6 +41,7 @@
 /* Work queue element (WQE) flags. */
 #define MLX4_WQE_CTRL_IIP_HDR_CSUM (1 << 28)
 #define MLX4_WQE_CTRL_IL4_HDR_CSUM (1 << 27)
+#define MLX4_WQE_CTRL_RR (1 << 6)
 
 /* CQE checksum flags. */
 enum {
@@ -98,6 +100,19 @@ struct mlx4_cq {
 	int arm_sn; /**< Rx event counter. */
 };
 
+#ifndef HAVE_IBV_MLX4_WQE_LSO_SEG
+/*
+ * WQE LSO segment structure.
+ * Defined here as backward compatibility for rdma-core v17 and below.
+ * Similar definition is found in infiniband/mlx4dv.h in rdma-core v18
+ * and above.
+ */
+struct mlx4_wqe_lso_seg {
+	rte_be32_t mss_hdr_size;
+	rte_be32_t header[];
+};
+#endif
+
 /**
  * Retrieve a CQE entry from a CQ.
  *
diff --git a/drivers/net/mlx4/mlx4_rxtx.c b/drivers/net/mlx4/mlx4_rxtx.c
index 78b6dd5..6654843 100644
--- a/drivers/net/mlx4/mlx4_rxtx.c
+++ b/drivers/net/mlx4/mlx4_rxtx.c
@@ -38,10 +38,29 @@
  * DWORD (32 byte) of a TXBB.
  */
 struct pv {
-	volatile struct mlx4_wqe_data_seg *dseg;
+	union {
+		volatile struct mlx4_wqe_data_seg *dseg;
+		volatile uint32_t *dst;
+	};
 	uint32_t val;
 };
 
+/** A helper structure for TSO packet handling. */
+struct tso_info {
+	/** Pointer to the array of saved first DWORD (32 byte) of a TXBB. */
+	struct pv *pv;
+	/** Current entry in the pv array. */
+	int pv_counter;
+	/** Total size of the WQE including padding. */
+	uint32_t wqe_size;
+	/** Size of TSO header to prepend to each packet to send.
+	 */
+	uint16_t tso_header_size;
+	/** Total size of the TSO segment in the WQE. */
+	uint16_t wqe_tso_seg_size;
+	/** Raw WQE size in units of 16 Bytes and without padding. */
+	uint8_t fence_size;
+};
+
 /** A table to translate Rx completion flags to packet type. */
 uint32_t mlx4_ptype_table[0x100] __rte_cache_aligned = {
 	/*
@@ -368,6 +387,351 @@ struct pv {
 }
 
 /**
+ * Obtain and calculate TSO information needed for assembling a TSO WQE.
+ *
+ * @param buf
+ *   Pointer to the first packet mbuf.
+ * @param txq
+ *   Pointer to Tx queue structure.
+ * @param tinfo
+ *   Pointer to a structure to fill the info with.
+ *
+ * @return
+ *   0 on success, negative value upon error.
+ */
+static inline int
+mlx4_tx_burst_tso_get_params(struct rte_mbuf *buf,
+			     struct txq *txq,
+			     struct tso_info *tinfo)
+{
+	struct mlx4_sq *sq = &txq->msq;
+	const uint8_t tunneled = txq->priv->hw_csum_l2tun &&
+				 (buf->ol_flags & PKT_TX_TUNNEL_MASK);
+
+	tinfo->tso_header_size = buf->l2_len + buf->l3_len + buf->l4_len;
+	if (tunneled)
+		tinfo->tso_header_size +=
+			buf->outer_l2_len + buf->outer_l3_len;
+	if (unlikely(buf->tso_segsz == 0 ||
+		     tinfo->tso_header_size == 0 ||
+		     tinfo->tso_header_size > MLX4_MAX_TSO_HEADER ||
+		     tinfo->tso_header_size > buf->data_len))
+		return -EINVAL;
+	/*
+	 * Calculate the WQE TSO segment size.
+	 * Note:
+	 * 1. An LSO segment must be padded such that the subsequent data
+	 *    segment is 16-byte aligned.
+	 * 2. The start address of the TSO segment is always 16 Bytes aligned.
+	 */
+	tinfo->wqe_tso_seg_size = RTE_ALIGN(sizeof(struct mlx4_wqe_lso_seg) +
+					    tinfo->tso_header_size,
+					    sizeof(struct mlx4_wqe_data_seg));
+	tinfo->fence_size = ((sizeof(struct mlx4_wqe_ctrl_seg) +
+			      tinfo->wqe_tso_seg_size) >> MLX4_SEG_SHIFT) +
+			    buf->nb_segs;
+	tinfo->wqe_size =
+		RTE_ALIGN((uint32_t)(tinfo->fence_size << MLX4_SEG_SHIFT),
+			  MLX4_TXBB_SIZE);
+	/* Validate WQE size and WQE space in the send queue.
+	 */
+	if (sq->remain_size < tinfo->wqe_size ||
+	    tinfo->wqe_size > MLX4_MAX_WQE_SIZE)
+		return -ENOMEM;
+	/* Init pv. */
+	tinfo->pv = (struct pv *)txq->bounce_buf;
+	tinfo->pv_counter = 0;
+	return 0;
+}
+
+/**
+ * Fill the TSO WQE data segments with info on buffers to transmit.
+ *
+ * @param buf
+ *   Pointer to the first packet mbuf.
+ * @param txq
+ *   Pointer to Tx queue structure.
+ * @param tinfo
+ *   Pointer to TSO info to use.
+ * @param dseg
+ *   Pointer to the first data segment in the TSO WQE.
+ * @param ctrl
+ *   Pointer to the control segment in the TSO WQE.
+ *
+ * @return
+ *   Pointer to the next WQE control segment on success, NULL otherwise.
+ */
+static inline volatile struct mlx4_wqe_ctrl_seg *
+mlx4_tx_burst_fill_tso_dsegs(struct rte_mbuf *buf,
+			     struct txq *txq,
+			     struct tso_info *tinfo,
+			     volatile struct mlx4_wqe_data_seg *dseg,
+			     volatile struct mlx4_wqe_ctrl_seg *ctrl)
+{
+	uint32_t lkey;
+	int nb_segs = buf->nb_segs;
+	int nb_segs_txbb;
+	struct mlx4_sq *sq = &txq->msq;
+	struct rte_mbuf *sbuf = buf;
+	struct pv *pv = tinfo->pv;
+	int *pv_counter = &tinfo->pv_counter;
+	volatile struct mlx4_wqe_ctrl_seg *ctrl_next =
+		(volatile struct mlx4_wqe_ctrl_seg *)
+		((volatile uint8_t *)ctrl + tinfo->wqe_size);
+	uint16_t sb_of = tinfo->tso_header_size;
+	uint16_t data_len = sbuf->data_len - sb_of;
+
+	do {
+		/* How many dseg entries do we have in the current TXBB? */
+		nb_segs_txbb = (MLX4_TXBB_SIZE -
+				((uintptr_t)dseg & (MLX4_TXBB_SIZE - 1))) >>
+			       MLX4_SEG_SHIFT;
+		switch (nb_segs_txbb) {
+		default:
+			/* Should never happen. */
+			rte_panic("%p: Invalid number of SGEs(%d) for a TXBB",
+				  (void *)txq, nb_segs_txbb);
+			/* rte_panic never returns. */
+		case 4:
+			/* Memory region key for this memory pool.
+			 */
+			lkey = mlx4_tx_mb2mr(txq, sbuf);
+			if (unlikely(lkey == (uint32_t)-1))
+				goto err;
+			dseg->addr =
+				rte_cpu_to_be_64(rte_pktmbuf_mtod_offset(sbuf,
+								uintptr_t,
+								sb_of));
+			dseg->lkey = lkey;
+			/*
+			 * This data segment starts at the beginning of a new
+			 * TXBB, so we need to postpone its byte_count writing
+			 * for later.
+			 */
+			pv[*pv_counter].dseg = dseg;
+			/*
+			 * Zero length segment is treated as inline segment
+			 * with zero data.
+			 */
+			pv[(*pv_counter)++].val =
+				rte_cpu_to_be_32(data_len ?
+						 data_len : 0x80000000);
+			if (--nb_segs == 0)
+				return ctrl_next;
+			/* Prepare next buf info. */
+			sbuf = sbuf->next;
+			dseg++;
+			data_len = sbuf->data_len;
+			sb_of = 0;
+			/* fallthrough */
+		case 3:
+			lkey = mlx4_tx_mb2mr(txq, sbuf);
+			if (unlikely(lkey == (uint32_t)-1))
+				goto err;
+			mlx4_fill_tx_data_seg(dseg, lkey,
+					rte_pktmbuf_mtod_offset(sbuf,
+								uintptr_t,
+								sb_of),
+					rte_cpu_to_be_32(data_len ?
+							 data_len :
+							 0x80000000));
+			if (--nb_segs == 0)
+				return ctrl_next;
+			/* Prepare next buf info. */
+			sbuf = sbuf->next;
+			dseg++;
+			data_len = sbuf->data_len;
+			sb_of = 0;
+			/* fallthrough */
+		case 2:
+			lkey = mlx4_tx_mb2mr(txq, sbuf);
+			if (unlikely(lkey == (uint32_t)-1))
+				goto err;
+			mlx4_fill_tx_data_seg(dseg, lkey,
+					rte_pktmbuf_mtod_offset(sbuf,
+								uintptr_t,
+								sb_of),
+					rte_cpu_to_be_32(data_len ?
+							 data_len :
+							 0x80000000));
+			if (--nb_segs == 0)
+				return ctrl_next;
+			/* Prepare next buf info. */
+			sbuf = sbuf->next;
+			dseg++;
+			data_len = sbuf->data_len;
+			sb_of = 0;
+			/* fallthrough */
+		case 1:
+			lkey = mlx4_tx_mb2mr(txq, sbuf);
+			if (unlikely(lkey == (uint32_t)-1))
+				goto err;
+			mlx4_fill_tx_data_seg(dseg, lkey,
+					rte_pktmbuf_mtod_offset(sbuf,
+								uintptr_t,
+								sb_of),
+					rte_cpu_to_be_32(data_len ?
+							 data_len :
+							 0x80000000));
+			if (--nb_segs == 0)
+				return ctrl_next;
+			/* Prepare next buf info. */
+			sbuf = sbuf->next;
+			dseg++;
+			data_len = sbuf->data_len;
+			sb_of = 0;
+			/* fallthrough */
+		}
+		/* Wrap dseg if it points at the end of the queue.
+		 */
+		if ((volatile uint8_t *)dseg >= sq->eob)
+			dseg = (volatile struct mlx4_wqe_data_seg *)
+			       ((volatile uint8_t *)dseg - sq->size);
+	} while (true);
+err:
+	return NULL;
+}
+
+/**
+ * Write the packet's L2, L3 and L4 headers into the WQE.
+ *
+ * These headers are used for each TSO segment that is transmitted.
+ *
+ * @param buf
+ *   Pointer to the first packet mbuf.
+ * @param txq
+ *   Pointer to Tx queue structure.
+ * @param tinfo
+ *   Pointer to TSO info to use.
+ * @param ctrl
+ *   Pointer to the control segment in the TSO WQE.
+ *
+ * @return
+ *   Pointer to the first data segment of the TSO WQE on success,
+ *   NULL otherwise.
+ */
+static inline volatile struct mlx4_wqe_data_seg *
+mlx4_tx_burst_fill_tso_hdr(struct rte_mbuf *buf,
+			   struct txq *txq,
+			   struct tso_info *tinfo,
+			   volatile struct mlx4_wqe_ctrl_seg *ctrl)
+{
+	volatile struct mlx4_wqe_lso_seg *tseg =
+		(volatile struct mlx4_wqe_lso_seg *)(ctrl + 1);
+	struct mlx4_sq *sq = &txq->msq;
+	struct pv *pv = tinfo->pv;
+	int *pv_counter = &tinfo->pv_counter;
+	int remain_size = tinfo->tso_header_size;
+	char *from = rte_pktmbuf_mtod(buf, char *);
+	uint16_t txbb_avail_space;
+	/* Union to overcome volatile constraints when copying TSO header. */
+	union {
+		volatile uint8_t *vto;
+		uint8_t *to;
+	} thdr = { .vto = (volatile uint8_t *)tseg->header, };
+
+	/*
+	 * TSO data always starts at offset 20 from the beginning of the TXBB
+	 * (16 byte ctrl + 4 byte TSO desc). Since each TXBB is 64 Byte aligned
+	 * we can write the first 44 TSO header bytes without worry for TxQ
+	 * wrapping or overwriting the first TXBB 32bit word.
+	 */
+	txbb_avail_space = MLX4_TXBB_SIZE -
+			   (sizeof(struct mlx4_wqe_ctrl_seg) +
+			    sizeof(struct mlx4_wqe_lso_seg));
+	while (remain_size >= (int)(txbb_avail_space + sizeof(uint32_t))) {
+		/* Copy to end of txbb. */
+		rte_memcpy(thdr.to, from, txbb_avail_space);
+		from += txbb_avail_space;
+		thdr.to += txbb_avail_space;
+		/* New TXBB, check for TxQ wrap.
+		 */
+		if (thdr.to >= sq->eob)
+			thdr.vto = sq->buf;
+		/* New TXBB, stash the first 32 bits for later use. */
+		pv[*pv_counter].dst = (volatile uint32_t *)thdr.to;
+		pv[(*pv_counter)++].val = *(uint32_t *)from;
+		from += sizeof(uint32_t);
+		thdr.to += sizeof(uint32_t);
+		remain_size -= txbb_avail_space + sizeof(uint32_t);
+		/* Avail space in new TXBB is TXBB size - 4. */
+		txbb_avail_space = MLX4_TXBB_SIZE - sizeof(uint32_t);
+	}
+	if (remain_size > txbb_avail_space) {
+		rte_memcpy(thdr.to, from, txbb_avail_space);
+		from += txbb_avail_space;
+		thdr.to += txbb_avail_space;
+		remain_size -= txbb_avail_space;
+		/* New TXBB, check for TxQ wrap. */
+		if (thdr.to >= sq->eob)
+			thdr.vto = sq->buf;
+		pv[*pv_counter].dst = (volatile uint32_t *)thdr.to;
+		rte_memcpy(&pv[*pv_counter].val, from, remain_size);
+		(*pv_counter)++;
+	} else if (remain_size) {
+		rte_memcpy(thdr.to, from, remain_size);
+	}
+	tseg->mss_hdr_size = rte_cpu_to_be_32((buf->tso_segsz << 16) |
+					      tinfo->tso_header_size);
+	/* Calculate data segment location. */
+	return (volatile struct mlx4_wqe_data_seg *)
+	       ((uintptr_t)tseg + tinfo->wqe_tso_seg_size);
+}
+
+/**
+ * Write data segments and header for TSO uni/multi segment packet.
+ *
+ * @param buf
+ *   Pointer to the first packet mbuf.
+ * @param txq
+ *   Pointer to Tx queue structure.
+ * @param ctrl
+ *   Pointer to the WQE control segment.
+ *
+ * @return
+ *   Pointer to the next WQE control segment on success, NULL otherwise.
+ */
+static volatile struct mlx4_wqe_ctrl_seg *
+mlx4_tx_burst_tso(struct rte_mbuf *buf, struct txq *txq,
+		  volatile struct mlx4_wqe_ctrl_seg *ctrl)
+{
+	volatile struct mlx4_wqe_data_seg *dseg;
+	volatile struct mlx4_wqe_ctrl_seg *ctrl_next;
+	struct mlx4_sq *sq = &txq->msq;
+	struct tso_info tinfo;
+	struct pv *pv;
+	int pv_counter;
+	int ret;
+
+	ret = mlx4_tx_burst_tso_get_params(buf, txq, &tinfo);
+	if (unlikely(ret))
+		goto error;
+	dseg = mlx4_tx_burst_fill_tso_hdr(buf, txq, &tinfo, ctrl);
+	if (unlikely(dseg == NULL))
+		goto error;
+	if ((uintptr_t)dseg >= (uintptr_t)sq->eob)
+		dseg = (volatile struct mlx4_wqe_data_seg *)
+		       ((uintptr_t)dseg - sq->size);
+	ctrl_next = mlx4_tx_burst_fill_tso_dsegs(buf, txq, &tinfo, dseg, ctrl);
+	if (unlikely(ctrl_next == NULL))
+		goto error;
+	/* Write the first DWORD of each TXBB saved earlier. */
+	if (likely(tinfo.pv_counter)) {
+		pv = tinfo.pv;
+		pv_counter = tinfo.pv_counter;
+		/* Need a barrier here before writing the first TXBB word. */
+		rte_io_wmb();
+		for (--pv_counter; pv_counter >= 0; pv_counter--)
+			*pv[pv_counter].dst = pv[pv_counter].val;
+	}
+	ctrl->fence_size = tinfo.fence_size;
+	sq->remain_size -= tinfo.wqe_size;
+	return ctrl_next;
+error:
+	txq->stats.odropped++;
+	return NULL;
+}
+
+/**
  * Write data segments of multi-segment packet.
  *
  * @param buf
@@ -560,6 +924,7 @@ struct pv {
 		uint16_t flags16[2];
 	} srcrb;
 	uint32_t lkey;
+	bool tso = txq->priv->tso && (buf->ol_flags & PKT_TX_TCP_SEG);
 
 	/* Clean up old buffer. */
 	if (likely(elt->buf != NULL)) {
@@ -578,7 +943,16 @@ struct pv {
 		} while (tmp != NULL);
 	}
 	RTE_MBUF_PREFETCH_TO_FREE(elt_next->buf);
-	if (buf->nb_segs == 1) {
+	if (tso) {
+		/* Change opcode to TSO. */
+		owner_opcode &= ~MLX4_OPCODE_CONFIG_CMD;
+		owner_opcode |= MLX4_OPCODE_LSO | MLX4_WQE_CTRL_RR;
+		ctrl_next = mlx4_tx_burst_tso(buf, txq, ctrl);
+		if (!ctrl_next) {
+			elt->buf = NULL;
+			break;
+		}
+	} else if (buf->nb_segs == 1) {
 		/* Validate WQE space in the send queue.
 		 */
 		if (sq->remain_size < MLX4_TXBB_SIZE) {
 			elt->buf = NULL;
diff --git a/drivers/net/mlx4/mlx4_rxtx.h b/drivers/net/mlx4/mlx4_rxtx.h
index 4c025e3..ffa8abf 100644
--- a/drivers/net/mlx4/mlx4_rxtx.h
+++ b/drivers/net/mlx4/mlx4_rxtx.h
@@ -90,7 +90,7 @@ struct mlx4_txq_stats {
 	unsigned int idx; /**< Mapping index. */
 	uint64_t opackets; /**< Total of successfully sent packets. */
 	uint64_t obytes; /**< Total of successfully sent bytes. */
-	uint64_t odropped; /**< Total of packets not sent when Tx ring full. */
+	uint64_t odropped; /**< Total number of packets failed to transmit. */
 };
 
 /** Tx queue descriptor. */
diff --git a/drivers/net/mlx4/mlx4_txq.c b/drivers/net/mlx4/mlx4_txq.c
index 6edaadb..9aa7440 100644
--- a/drivers/net/mlx4/mlx4_txq.c
+++ b/drivers/net/mlx4/mlx4_txq.c
@@ -116,8 +116,14 @@
 			     DEV_TX_OFFLOAD_UDP_CKSUM |
 			     DEV_TX_OFFLOAD_TCP_CKSUM);
 	}
-	if (priv->hw_csum_l2tun)
+	if (priv->tso)
+		offloads |= DEV_TX_OFFLOAD_TCP_TSO;
+	if (priv->hw_csum_l2tun) {
 		offloads |= DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM;
+		if (priv->tso)
+			offloads |= (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
+				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
+	}
 	return offloads;
 }
-- 
1.8.3.1