From: "Wiles, Keith" <keith.wiles@intel.com>
To: "Morten Brørup" <mb@smartsharesystems.com>
Cc: "Burakov, Anatoly" <anatoly.burakov@intel.com>,
Sam <batmanustc@gmail.com>, "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] Where is the padding code in DPDK?
Date: Wed, 14 Nov 2018 16:19:08 +0000
Message-ID: <FC92A98D-9FDB-47D5-A7C9-983737DC1E81@intel.com>
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35B4248C@smartserver.smartshare.dk>
> On Nov 14, 2018, at 4:51 AM, Morten Brørup <mb@smartsharesystems.com> wrote:
>
> Anatoly,
>
> This differs from the Linux kernel's behavior, where padding belongs in the NIC driver layer, not in the protocol layer. If you pass a runt frame (too short packet) to a Linux NIC driver's transmission function, the NIC driver (or NIC hardware) will pad the frame to make it valid. E.g. look at the rhine_start_tx() function in the kernel: https://elixir.bootlin.com/linux/v4.9.137/source/drivers/net/ethernet/via/via-rhine.c#L1800
A DPDK PMD either rejects the frame or extends the number of bytes to send. Padding, strictly speaking, means zeroing out the extra bytes so the packet meets the length the NIC requires. Unless they are concerned with security, PMDs just make sure the number of bytes to be sent is correct for the hardware (60 bytes minimum). Most NICs can do this padding in hardware as the packet is sent.
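A minimal sketch of the zero-padding described above, written in plain C against a raw byte buffer rather than a real rte_mbuf. The names pad_runt_frame and ETH_MIN_FRAME_NO_CRC are hypothetical and are not DPDK APIs; this only illustrates what an application could do before handing a short frame to the PMD:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 64-byte minimum Ethernet frame minus the 4-byte CRC. */
#define ETH_MIN_FRAME_NO_CRC 60u

/*
 * Zero-pad the frame in buf (capacity cap bytes) up to 60 bytes.
 * Returns the resulting length, or 0 if the buffer is too small
 * to hold the padded frame. Frames already >= 60 bytes are untouched.
 */
static size_t pad_runt_frame(uint8_t *buf, size_t len, size_t cap)
{
    if (len >= ETH_MIN_FRAME_NO_CRC)
        return len;
    if (cap < ETH_MIN_FRAME_NO_CRC)
        return 0;
    /* Zero the pad bytes so stale buffer contents are not leaked. */
    memset(buf + len, 0, ETH_MIN_FRAME_NO_CRC - len);
    return ETH_MIN_FRAME_NO_CRC;
}
```

With a 54-byte TCP ACK in a 64-byte buffer, the call returns 60 and bytes 54 through 59 are zeroed.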
If we are talking about virtio, and only talking to a virtio software backend, then you can send any size packet, but you need to make sure the stack or code receiving the packet does not throw it away as a runt. Most NICs discard runts, so they are never received into memory. In a software-based design like virtio you can do whatever you want with the length, but I would suggest following the Ethernet standard anyway.
Now, some stacks or code (like Pktgen) assume the hardware will append the CRC (4 bytes), which means the application needs to hand the PMD frames of at least 60 bytes, unless you know the hardware will do the right thing. The challenge is that DPDK applications do not know the details of the NIC at that level and should always assume the packets being sent and received are valid Ethernet frames. That means at least 60 bytes, as all NICs add the CRC nowadays and not all of them adjust the size of the frame.
If you do not send the PMD a 60-byte frame, then you are expecting the NIC to handle the padding and append the CRC, or at least expecting the PMD to adjust the size, which I know from writing Pktgen for DPDK is not done in all PMDs.
If you are expecting DPDK PMDs to behave like Linux drivers, then you need to adjust your thinking and always send the PMD at least 60 bytes. If you want to modify all of the PMDs to force the size to 60 bytes, I have no objection to that patch; you just need to get all of the PMD maintainers to agree with it.
On RX, frames of less than 64 bytes (with CRC) are runts, and most NICs today will not receive these frames unless you program the hardware to do so. ‘In my day’ :-) we had collisions on the wire, which created a huge number of fragments or runts; that is not the case with the point-to-point links we have today.
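The RX rule above is simple arithmetic, sketched here in plain C. The helper is_runt and both constants are hypothetical names for illustration, not part of DPDK; the length passed in is the frame length as an application sees it, without the CRC:

```c
#include <stdbool.h>
#include <stddef.h>

#define ETH_CRC_LEN 4u             /* CRC appended by the NIC on the wire */
#define ETH_MIN_FRAME_WITH_CRC 64u /* shortest valid frame including CRC */

/* True if a frame of len_without_crc bytes would be a runt on the wire. */
static bool is_runt(size_t len_without_crc)
{
    return len_without_crc + ETH_CRC_LEN < ETH_MIN_FRAME_WITH_CRC;
}
```

A 54-byte TCP ACK is 58 bytes on the wire and counts as a runt; a 60-byte frame is exactly 64 bytes with CRC and does not.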
>
> If DPDK does not pad short frames passed to the egress function of the NIC drivers, it should be noted in the documentation - this is not the expected behavior by protocol developers.
>
> Or even better: The NIC hardware (or driver) should ensure padding, possibly considering it a TX Offload feature. Generating packets with fewer than 60 bytes of data is common - just consider the number of TCP ACK packets, which are typically only 14 + 20 + 20 = 54 bytes (incl. the 14 byte Ethernet header).
>
>
> Med venlig hilsen / kind regards
> - Morten Brørup
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Burakov, Anatoly
>> Sent: Wednesday, November 14, 2018 11:18 AM
>> To: Sam
>> Cc: dev@dpdk.org
>> Subject: Re: [dpdk-dev] Where is the padding code in DPDK?
>>
>> On 14-Nov-18 5:45 AM, Sam wrote:
>>> OK, so in short, DPDK will NOT take care of padding.
>>> The NIC takes care of padding when sending and receiving through a NIC.
>>> The kernel takes care of it when sending and receiving through a vhost-user port.
>>>
>>> Is that right?
>>
>> I cannot speak for virtio/vhost-user since I am not terribly familiar
>> with them. For regular packets, generally speaking, packets shorter
>> than 60 bytes are invalid. Whether DPDK does or does not care about
>> padding is irrelevant, because *you* are attempting to transmit packets
>> that are not valid. You shouldn't rely on this behavior.
>>
>>>
>>>
>>> Burakov, Anatoly <anatoly.burakov@intel.com
>>> <mailto:anatoly.burakov@intel.com>> wrote on Tue, Nov 13, 2018 at 5:29 PM:
>>>
>>> On 13-Nov-18 7:16 AM, Sam wrote:
>>>> Hi all,
>>>>
>>>> As we know, an Ethernet frame must be at least 64B long.
>>>>
>>>> So if I create an rte_mbuf and fill it with just 60B of data, will
>>>> rte_eth_tx_burst add padding to bring the frame up to 64B?
>>>>
>>>> If it does, where is the code?
>>>>
>>>
>>> Others can correct me if I'm wrong here, but specifically in the case of
>>> 64-byte packets, these are the shortest valid packets that you can send,
>>> and a 64-byte packet will actually carry only 60 bytes' worth of packet
>>> data, because there's a 4-byte CRC field at the end (see Ethernet frame
>>> format). If you enabled CRC offload, then your NIC will append the 4
>>> bytes at transmit. If you haven't, then it's up to each individual
>>> driver/NIC to accept/reject such a packet because it can rightly be
>>> considered malformed.
>>>
>>> In addition, your NIC may add e.g. VLAN tags or other stuff, again
>>> depending on hardware offloads that you have enabled in your TX
>>> configuration, which may push the packet size beyond 64 bytes while
>>> having only 60 bytes of actual packet data.
>>>
>>> --
>>> Thanks,
>>> Anatoly
>>>
>>
>>
>> --
>> Thanks,
>> Anatoly
>
Regards,
Keith