From mboxrd@z Thu Jan 1 00:00:00 1970
From: Slava Ovsiienko
To: "dev@dpdk.org"
CC: Matan Azrad, Asaf Penso
Date: Tue, 17 Mar 2020 16:22:12 +0000
Subject: Re: [dpdk-dev] [RFC] net/mlx5: add large packet size support to MPRQ
List-Id: DPDK patches and discussions

The packet rate at 64-byte packet size over a 100 Gbps line can reach 148.8 million packets per second. The part of the ConnectX NIC receive descriptor that specifies the packet data buffer is 16 bytes in size, so the PCIe bandwidth required just for the NIC to read descriptors from host memory is 148.8M * 16B = ~2.38 GB per second, roughly one sixth of the total bandwidth of a PCIe x16 Gen 3 slot. To mitigate this requirement, Mellanox NICs provide the Multi-Packet Receive Queue (MPRQ) feature: with it, a single descriptor specifies one linear buffer that accepts multiple packets, placed into strides within that buffer. The current mlx5 PMD implementation allows a packet to be received into a single stride only; a packet cannot be placed into multiple adjacent strides. This means the stride size must be large enough to store packets up to the MTU. The maximal stride size is limited by hardware capabilities; for example, ConnectX-5 supports strides up to 8KB. Hence, if the MPRQ feature is enabled, the maximal supported MTU is limited by the maximal stride size (minus the space reserved for the HEAD_ROOM).
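For reference, here is a back-of-the-envelope check of the figures above (a standalone snippet, not PMD code; the 20B of per-packet wire overhead, i.e. preamble, SFD and inter-frame gap, is the standard assumption behind the 148.8 Mpps figure):

#include <stdio.h>

int main(void)
{
	const double line_rate  = 100e9;         /* line rate, bits per second   */
	const double frame_bits = (64 + 20) * 8; /* 64B frame + 20B wire overhead */
	const double desc_size  = 16;            /* Rx descriptor size, bytes    */
	double pps = line_rate / frame_bits;     /* ~148.8 Mpps                  */
	double desc_bw = pps * desc_size;        /* descriptor read traffic      */

	printf("packet rate: %.1f Mpps\n", pps / 1e6);
	printf("descriptor read bandwidth: %.2f GB/s\n", desc_bw / 1e9);
	/* Prints ~148.8 Mpps and ~2.38 GB/s; a PCIe Gen3 x16 slot offers
	 * about 15.75 GB/s usable, so this is roughly one sixth of it. */
	return 0;
}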
The MPRQ feature is crucial for sustaining the full line rate with small packets over fast links, and it must be enabled if the full line rate is desired. To support an MTU exceeding the stride size, the MPRQ feature should be updated to allow a packet to occupy more than one stride: receiving a packet into multiple adjacent strides should be implemented.

What prevents a packet from being received into multiple strides is that the data buffer must be preceded by some HEAD_ROOM space. In the current implementation, the PMD borrows the HEAD_ROOM space from the tail of the preceding stride. If a packet took multiple strides, the tail of a stride could be overwritten with packet data, and that memory could no longer be borrowed to provide the HEAD_ROOM space for the next packet. Three ways to resolve the issue are proposed:

1. Copy part of the packet data to a dedicated mbuf in order to free the memory needed for the next packet's HEAD_ROOM. Actual copying is needed only for the range of packet sizes where the tail of the stride is occupied by received data. For example, with an 8KB stride, a 128B HEAD_ROOM, and a 9000B MTU, the copy would happen for packet sizes in the range of 8064-8192 bytes. The dedicated mbuf is then linked into the mbuf chain to build a multi-segment packet: the first mbuf points to the stride as an external buffer, the second mbuf contains the copied data, and the tail of the stride is freed to serve as the HEAD_ROOM of the next packet (a rough sketch of this is appended below the sign-off).

2. Provide the HEAD_ROOM as a dedicated mbuf linked as the first one in the packet mbuf chain. Not all applications and DPDK routines support this approach; for example, rte_vlan_insert() assumes the HEAD_ROOM immediately precedes the packet data, so this solution does not look appropriate.

3. Both approaches above assume the application and PMD support multi-segment packets; if they do not, the entire packet data should be copied into a single mbuf.

To configure one of the approaches above, a new devarg is proposed: mprq_log_stride_size, specifying the desired stride size (as log2). If this parameter is not specified, the mlx5 PMD tries to support MPRQ in the existing fashion, in compatibility mode. Otherwise, the overlapping data copy is engaged, and the exact mode depends on whether multi-segment packet support (scattered Rx) is enabled: if scattering is not enabled, approach (3) is engaged.

Signed-off-by: Viacheslav Ovsiienko
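---

For illustration, a minimal sketch of the mbuf manipulation implied by approach (1), assuming a hypothetical helper mprq_copy_overlap() and a packet that so far occupies a single segment; this is not the actual mlx5 PMD code:

/*
 * Hypothetical sketch of approach (1); the names are illustrative.
 * The first segment "pkt" points into the stride as an external buffer;
 * its last "overlap_len" bytes sit in the area the PMD must reclaim as
 * the next packet's HEAD_ROOM, so they are copied out into a dedicated
 * mbuf and chained as a second segment.
 */
#include <stdint.h>
#include <rte_mbuf.h>
#include <rte_memcpy.h>

static struct rte_mbuf *
mprq_copy_overlap(struct rte_mbuf *pkt, struct rte_mempool *mp,
		  uint16_t overlap_len)
{
	struct rte_mbuf *tail = rte_pktmbuf_alloc(mp);

	if (tail == NULL)
		return NULL;
	/* Copy the overlapping tail out of the stride. */
	rte_memcpy(rte_pktmbuf_mtod(tail, void *),
		   rte_pktmbuf_mtod(pkt, char *) + pkt->data_len - overlap_len,
		   overlap_len);
	/* Shrink the first segment and chain the copy after it;
	 * the total pkt_len of the chain stays the same. */
	pkt->data_len -= overlap_len;
	tail->data_len = overlap_len;
	tail->next = NULL;
	pkt->next = tail;
	pkt->nb_segs = 2;
	return pkt;
}

With the proposed devarg, the feature could then be requested per device, e.g. "testpmd -w <PCI_BDF>,mprq_en=1,mprq_log_stride_size=13 ..." for 8KB strides (2^13); mprq_en is the existing devarg enabling MPRQ, while the exact spelling of the new parameter assumes it is accepted under the proposed name.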