From: Slava Ovsiienko <viacheslavo@mellanox.com>
To: "dev@dpdk.org" <dev@dpdk.org>
Cc: Matan Azrad <matan@mellanox.com>, Asaf Penso <asafp@mellanox.com>
Subject: Re: [dpdk-dev] [RFC] net/mlx5: add large packet size support to MPRQ
Date: Tue, 17 Mar 2020 16:22:12 +0000
Message-ID: <AM4PR05MB326503FC27F3E7BC475AD53AD2F60@AM4PR05MB3265.eurprd05.prod.outlook.com>
The packet rate with 64-byte packets over a 100 Gbps line can reach 148.8 million packets per second. The part of the ConnectX NIC receive descriptor that specifies the packet data buffer is 16 bytes in size, so the PCIe bandwidth required just for the NIC to read descriptors from host memory is 148.8M*16B = 2.38GB per second, roughly one sixth of the total bandwidth of a PCIe x16 Gen 3 slot. To mitigate this requirement, Mellanox NICs provide the Multi-Packet Receive Queue (MPRQ) feature: a single descriptor specifies one linear buffer that accepts multiple packets, each placed into a stride within this buffer.

The current mlx5 PMD implementation allows a packet to be received into a single stride only; a packet cannot be placed into multiple adjacent strides. This means the stride size must be large enough to store packets up to the MTU. The maximal stride size is limited by hardware capabilities; for example, ConnectX-5 supports strides up to 8KB. Hence, if the MPRQ feature is enabled, the maximal supported MTU is limited by the maximal stride size (minus the space for the HEAD_ROOM).
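For clarity, the back-of-the-envelope arithmetic behind these figures (illustrative only, the standard 20B per-packet preamble/IFG wire overhead is assumed):

/* Illustrative arithmetic only, not part of the PMD. */
#include <stdio.h>

int main(void)
{
        const double line_bps   = 100e9;      /* 100 Gbps line */
        const double wire_bytes = 64 + 20;    /* 64B frame + 20B preamble/IFG */
        const double pps        = line_bps / (wire_bytes * 8); /* ~148.8 Mpps */
        const double desc_bw    = pps * 16;   /* 16B descriptor read per packet */

        printf("packet rate:      %.1f Mpps\n", pps / 1e6);     /* 148.8 */
        printf("descriptor reads: %.2f GB/s\n", desc_bw / 1e9); /* 2.38  */
        /* Single-stride MPRQ MTU limit: stride size minus HEAD_ROOM. */
        printf("max single-stride payload: %u bytes\n", 8192u - 128u);
        return 0;
}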
The MPRQ feature is crucial for sustaining full line rate with small packets over fast links and must be enabled whenever full line rate is desired. To support an MTU exceeding the stride size, the MPRQ feature should be updated to allow a packet to take more than one stride, i.e. receiving a packet into multiple adjacent strides should be implemented.
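For reference, a typical way MPRQ is enabled today with the existing devargs (the device address and values below are placeholders, please check the mlx5 guide of the target release):

testpmd -w 0000:03:00.0,mprq_en=1,mprq_log_stride_num=6 -- --rxq=4 --txq=4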
What prevents a packet from being received into multiple strides is that the data buffer must be preceded by some HEAD_ROOM space. In the current implementation, the PMD borrows the HEAD_ROOM space from the tail of the preceding stride (see the simplified sketch below). If a packet took multiple strides, the tail of a stride would be overwritten with packet data, and that memory could no longer be borrowed to provide the HEAD_ROOM for the next packet.
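For illustration only (this is not the actual PMD code, the names are hypothetical), the current HEAD_ROOM borrowing can be sketched as:

/* Illustrative sketch of the current single-stride layout, not the PMD code. */
#include <stdint.h>
#include <stddef.h>

#define HEAD_ROOM 128u /* example HEAD_ROOM size */

/* Hardware writes packet data at the beginning of stride 'idx'. */
static inline uint8_t *
stride_data(uint8_t *mprq_buf, uint32_t stride_sz, uint32_t idx)
{
        return mprq_buf + (size_t)idx * stride_sz;
}

/*
 * The mbuf buffer address is set HEAD_ROOM bytes *before* the stride,
 * i.e. the HEAD_ROOM lives in the tail of the previous stride. This is
 * only safe while the previous packet never reaches its stride's tail.
 */
static inline uint8_t *
mbuf_buf_addr(uint8_t *mprq_buf, uint32_t stride_sz, uint32_t idx)
{
        return stride_data(mprq_buf, stride_sz, idx) - HEAD_ROOM;
}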
Three ways to resolve the issue are proposed:
1. Copy part of the packet data into a dedicated mbuf in order to free the memory needed for the next packet's HEAD_ROOM. Actual copying is needed only for the range of packet sizes where the tail of the stride is occupied by received data. For example, with an 8KB stride size, a 128B HEAD_ROOM and a 9000B MTU, the copy would happen for packet sizes in the range 8064-8192 bytes. The dedicated mbuf is then linked into the mbuf chain to build a multi-segment packet: the first mbuf points to the stride as an external buffer, the second mbuf contains the copied data, and the tail of the stride is free to be used as the HEAD_ROOM of the next packet (see the sketch after this list).
2. Provide the HEAD_ROOM as a dedicated mbuf linked as the first one in the packet mbuf chain. Not all applications and DPDK routines support this approach; for example, rte_vlan_insert() assumes the HEAD_ROOM immediately precedes the packet data, hence this solution does not look appropriate.
3. The approaches above assume the application and the PMD support multi-segment packets; if they do not, the entire packet data should be copied into a single mbuf.
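A minimal sketch of approach (1), assuming a hypothetical helper called once the packet length is known (this is not the actual PMD code; the mempool and size handling are simplified, and the copied tail is assumed to fit into a single mbuf):

#include <errno.h>
#include <rte_mbuf.h>
#include <rte_memcpy.h>

/*
 * Approach (1) sketch: if the packet tail occupies the area that must
 * serve as the HEAD_ROOM of the next packet, copy that tail into a
 * separate mbuf and chain it after the external-buffer mbuf 'pkt'.
 */
static int
mprq_fixup_tail(struct rte_mbuf *pkt, struct rte_mempool *mp,
                uint32_t pkt_len, uint32_t stride_sz, uint32_t headroom)
{
        uint32_t keep = stride_sz - headroom; /* bytes that may stay in the stride */
        uint32_t copy_len;
        struct rte_mbuf *seg;

        if (pkt_len <= keep)
                return 0; /* tail does not reach the HEAD_ROOM area */
        copy_len = pkt_len - keep;
        seg = rte_pktmbuf_alloc(mp);
        if (seg == NULL)
                return -ENOMEM;
        /* Move the overlapping tail out of the stride. */
        rte_memcpy(rte_pktmbuf_mtod(seg, void *),
                   rte_pktmbuf_mtod_offset(pkt, void *, keep), copy_len);
        seg->data_len = copy_len;
        seg->pkt_len = copy_len;
        /* Trim the first (external buffer) segment and chain the copied tail. */
        pkt->data_len = keep;
        pkt->pkt_len = keep;
        return rte_pktmbuf_chain(pkt, seg);
}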
To configure one of the approaches above, a new devarg is proposed: mprq_log_stride_size - specifies the desired stride size (log2). If this parameter is not specified, the mlx5 PMD tries to support MPRQ in the existing fashion, in compatibility mode. Otherwise, the overlapping data copy is engaged, and the exact mode depends on whether multi-segment packet support is enabled. If scattering is not enabled, approach (3) is engaged.
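For example, once the new devarg is implemented, an invocation like the following (hypothetical values: 2KB strides, 9000B max packet length, scattered Rx enabled) would exercise the multi-stride path:

testpmd -w 0000:03:00.0,mprq_en=1,mprq_log_stride_size=11 -- --max-pkt-len=9000 --enable-scatter --rxq=4 --txq=4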
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>