DPDK patches and discussions
From: Stephen Hemminger <stephen@networkplumber.org>
To: Shahaf Shuler <shahafs@mellanox.com>
Cc: "Morten Brørup" <mb@smartsharesystems.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"Ferruh Yigit" <ferruh.yigit@intel.com>,
	"Declan Doherty" <declan.doherty@intel.com>,
	"Chas Williams" <chas3@att.com>,
	"John W. Linville" <linville@tuxdriver.com>,
	"Marcin Wojtas" <mw@semihalf.com>,
	"Michal Krawczyk" <mk@semihalf.com>,
	"Guy Tzalik" <gtzalik@amazon.com>,
	"Evgeny Schemeilin" <evgenys@amazon.com>,
	"Ravi Kumar" <ravi1.kumar@amd.com>,
	"Igor Russkikh" <igor.russkikh@aquantia.com>,
	"Pavel Belous" <pavel.belous@aquantia.com>,
	"Shepard Siegel" <shepard.siegel@atomicrules.com>,
	"Ed Czeck" <ed.czeck@atomicrules.com>,
	"John Miller" <john.miller@atomicrules.com>,
	"Ajit Khaparde" <ajit.khaparde@broadcom.com>,
	"Somnath Kotur" <somnath.kotur@broadcom.com>,
	"Jerin Jacob" <jerin.jacob@caviumnetworks.com>,
	"Maciej Czekaj" <maciej.czekaj@caviumnetworks.com>,
	"Shijith Thotton" <shijith.thotton@cavium.com>,
	"Srisivasubramanian Srinivasan" <ssrinivasan@cavium.com>,
	"Rahul Lakkireddy" <rahul.lakkireddy@chelsio.com>,
	"John Daley" <johndale@cisco.com>,
	"Hyong Youb Kim" <hyonkim@cisco.com>,
	"Wenzhuo Lu" <wenzhuo.lu@intel.com>,
	"Konstantin Ananyev" <konstantin.ananyev@intel.com>,
	"Beilei Xing" <beilei.xing@intel.com>,
	"Qi Zhang" <qi.z.zhang@intel.com>,
	"Xiao Wang" <xiao.w.wang@intel.com>,
	"Jingjing Wu" <jingjing.wu@intel.com>,
	"Tomasz Duszynski" <tdu@semihalf.com>,
	"Dmitri Epshtein" <dima@marvell.com>,
	"Natalie Samsonov" <nsamsono@marvell.com>,
	"Zyta Szpak" <zr@semihalf.com>,
	"Matan Azrad" <matan@mellanox.com>,
	"Yongseok Koh" <yskoh@mellanox.com>, kys <kys@microsoft.com>,
	haiyangz <haiyangz@microsoft.com>,
	"Jan Remes" <remes@netcope.com>,
	"Alejandro Lucero" <alejandro.lucero@netronome.com>,
	"Hemant Agrawal" <hemant.agrawal@nxp.com>,
	"Shreyansh Jain" <shreyansh.jain@nxp.com>,
	"Gagandeep Singh" <g.singh@nxp.com>,
	"Pankaj Chauhan" <pankaj.chauhan@nxp.com>,
	"Harish Patil" <harish.patil@cavium.com>,
	"Rasesh Mody" <rasesh.mody@cavium.com>,
	"Shahed Shaikh" <shahed.shaikh@cavium.com>,
	"Andrew Rybchenko" <arybchenko@solarflare.com>,
	"Yong Wang" <yongwang@vmware.com>,
	"Maxime Coquelin" <maxime.coquelin@redhat.com>,
	"Tiwei Bie" <tiwei.bie@intel.com>,
	"Zhihong Wang" <zhihong.wang@intel.com>,
	"Allain Legacy" <allain.legacy@windriver.com>,
	"Matt Peters" <matt.peters@windriver.com>,
	"Keith Wiles" <keith.wiles@intel.com>,
	"Bruce Richardson" <bruce.richardson@intel.com>,
	"Tetsuya Mukawa" <mtetsuyah@gmail.com>,
	"Gaetan Rivet" <gaetan.rivet@6wind.com>,
	"Jasvinder Singh" <jasvinder.singh@intel.com>,
	"Cristian Dumitrescu" <cristian.dumitrescu@intel.com>
Subject: Re: [dpdk-dev] [RFC] Ethernet drivers to add padding on egress
Date: Tue, 20 Nov 2018 16:55:17 -0800
Message-ID: <20181120165517.28b21004@xeon-e3>
In-Reply-To: <DB7PR05MB44264384DE7F4B3837E4BADAC3D80@DB7PR05MB4426.eurprd05.prod.outlook.com>

On Mon, 19 Nov 2018 08:02:02 +0000
Shahaf Shuler <shahafs@mellanox.com> wrote:

> Thursday, November 15, 2018 6:57 PM, Morten Brørup:
> > Subject: [RFC] Ethernet drivers to add padding on egress
> > 
> > Hi networking driver maintainers,
> > 
> > I suggest that the TX functions of Ethernet interface drivers accept packets
> > shorter than 60 bytes (excl. Ethernet FCS) and transmit them on the medium as
> > valid Ethernet frames, i.e. by padding the packets up to the minimum Ethernet
> > frame size of 64 bytes incl. Ethernet FCS, instead of discarding them.
> > 
> > This feature makes life easier for application developers who use DPDK as
> > the lower layer in an IP stack, where many packets are shorter than 60 bytes
> > (excl. Ethernet FCS), e.g. TCP SYN and TCP ACK packets.
> > 
> > This feature also makes it easier for application developers who are using
> > DPDK library functions that split, merge or otherwise transform packets into
> > packets of other sizes, e.g. Generic Segmentation Offload, IP Fragmentation
> > and various tunnel encapsulation/decapsulation functions.
> > 
> > Currently (without this feature), the application must check whether
> > packets originating from the IP stack or having passed through a
> > split/merge/transform function are about to egress on an Ethernet interface,
> > and in that case, if some of the packets are shorter than 60 bytes (excl.
> > Ethernet FCS), add padding before passing them on to the driver's TX function.
> > 
> > E.g. when using Generic Segmentation Offload, a packet carrying a 1461 byte
> > TCP payload (excl. 54 bytes of Ethernet+IP+TCP headers) will be split into two
> > packets of 1514 bytes and 55 bytes respectively (each incl. 54 bytes of
> > Ethernet+IP+TCP headers), and the latter must be padded before it is
> > transmitted on an Ethernet interface.
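
To make the arithmetic in the quoted example concrete, here is a minimal sketch in plain C (no DPDK dependency). The function names, the 1460 byte segment size, and the option-free 54 byte header are assumptions chosen for illustration; this is not the GSO library's API.

```c
#include <assert.h>
#include <stddef.h>

#define HDR_LEN        54u  /* assumed: Ethernet 14 + IPv4 20 + TCP 20, no options */
#define ETH_MIN_NO_FCS 60u  /* minimum Ethernet frame length excluding the 4 byte FCS */

/*
 * Hypothetical helper: frame length (excl. FCS) of segment number idx when
 * payload_len bytes of TCP payload are split into pieces of at most mss bytes.
 * Returns 0 when idx is past the last segment.
 */
static size_t seg_frame_len(size_t payload_len, size_t mss, size_t idx)
{
    assert(mss > 0);
    size_t offset = idx * mss;
    if (offset >= payload_len)
        return 0;
    size_t chunk = payload_len - offset;
    if (chunk > mss)
        chunk = mss;
    return HDR_LEN + chunk;
}

/* Hypothetical helper: padding bytes needed to reach the minimum frame length. */
static size_t pad_needed(size_t frame_len)
{
    return frame_len < ETH_MIN_NO_FCS ? ETH_MIN_NO_FCS - frame_len : 0;
}
```

With a 1461 byte payload and a 1460 byte segment size, segment 0 is 1514 bytes and needs no padding, while segment 1 is 55 bytes and needs 5 bytes of padding.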
> > 
> > 
> > In my opinion, it should be a requirement that the Ethernet interface drivers
> > ensure correct padding when egressing the packet on the medium.
> > Alternatively, it can be an optional feature, which could be exposed as a TX
> > capabilities flag of the driver.
> > 
> > What do you think?  
> 
> I think that, as a first stage, it should be a Tx offload capability - the ability to pad the packets (maybe in HW) and avoid the cost of padding in SW.
> PMD vendors who want to make life easier for their customers can implement it in SW; however, the only gain there is simpler application code. Performance-wise it wouldn't matter.
> 
> Once most or all PMDs have this feature, we can discuss making it standard for every PMD (like the CRC strip we have today).

Yet another TX offload flag may look good to a vendor, but it adds nothing useful and hurts usability.
Every driver should accept an Ethernet packet of any size and pad it in hardware (or software), based on what it knows the NIC hardware can do.
For virtual devices where no minimum length is required, nothing needs to be done.

Packets shorter than the Ethernet header are obvious errors and should increment the TX output error counter.
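
As a rough illustration of that policy, here is a minimal sketch in plain C. It models a frame as a flat byte buffer rather than an rte_mbuf, and every name in it (tx_prepare_sketch, tx_stats_sketch, hw_min_len) is hypothetical; the oerrors/opackets field names only mirror the counters in rte_eth_stats. A real PMD would work on mbufs and might let the NIC do the padding instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETHER_HDR_LEN_SKETCH 14u  /* destination MAC + source MAC + EtherType */

/* Hypothetical per-queue counters; names mirror rte_eth_stats fields. */
struct tx_stats_sketch {
    uint64_t opackets;
    uint64_t oerrors;
};

/*
 * Hypothetical TX-side check. hw_min_len is the shortest frame (excl. FCS)
 * the device can send without software help: 60 for a NIC that does not pad
 * in hardware, 0 for a virtual device or a NIC that pads by itself.
 * Returns the frame length to transmit, or 0 if the frame was rejected.
 */
static size_t tx_prepare_sketch(uint8_t *buf, size_t len, size_t cap,
                                size_t hw_min_len, struct tx_stats_sketch *st)
{
    assert(buf != NULL && st != NULL && len <= cap);

    /* Shorter than an Ethernet header: an obvious error, count it and drop. */
    if (len < ETHER_HDR_LEN_SKETCH) {
        st->oerrors++;
        return 0;
    }

    /* Short but valid: zero-pad in software up to the device minimum. */
    if (len < hw_min_len) {
        if (hw_min_len > cap) {      /* no room in the buffer to pad */
            st->oerrors++;
            return 0;
        }
        memset(buf + len, 0, hw_min_len - len);
        len = hw_min_len;
    }

    st->opackets++;
    return len;
}
```

Passing hw_min_len as 0 covers the virtual-device case: the frame goes out unchanged, and only frames shorter than the Ethernet header are rejected.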


Thread overview: 5+ messages
2018-11-15 16:56 Morten Brørup
2018-11-19  8:02 ` Shahaf Shuler
2018-11-21  0:55   ` Stephen Hemminger [this message]
2018-11-19 16:10 ` Stephen Hemminger
2018-11-20  8:16   ` Shahaf Shuler
