From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <konstantin.ananyev@intel.com>
Received: from mga03.intel.com (mga03.intel.com [134.134.136.65])
 by dpdk.org (Postfix) with ESMTP id 26CD51518
 for <dev@dpdk.org>; Wed, 31 Aug 2016 14:35:19 +0200 (CEST)
Received: from fmsmga006.fm.intel.com ([10.253.24.20])
 by orsmga103.jf.intel.com with ESMTP; 31 Aug 2016 05:34:58 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.30,261,1470726000"; 
   d="scan'208";a="3082616"
Received: from irsmsx109.ger.corp.intel.com ([163.33.3.23])
 by fmsmga006.fm.intel.com with ESMTP; 31 Aug 2016 05:34:57 -0700
Received: from irsmsx105.ger.corp.intel.com ([169.254.7.102]) by
 IRSMSX109.ger.corp.intel.com ([169.254.13.24]) with mapi id 14.03.0248.002;
 Wed, 31 Aug 2016 13:34:56 +0100
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: Stephen Hemminger <stephen@networkplumber.org>, "Kulasek, TomaszX"
 <tomaszx.kulasek@intel.com>
CC: "dev@dpdk.org" <dev@dpdk.org>
Thread-Topic: [dpdk-dev] [PATCH 0/6] add Tx preparation
Thread-Index: AQHR/7YytoC5uCoCk0KbBMnsnmaByqBbbxkAgAeVJMA=
Date: Wed, 31 Aug 2016 12:34:56 +0000
Message-ID: <2601191342CEEE43887BDE71AB97725836B95117@irsmsx105.ger.corp.intel.com>
References: <1472228578-6980-1-git-send-email-tomaszx.kulasek@intel.com>
 <20160826103114.5b547cef@xeon-e3>
In-Reply-To: <20160826103114.5b547cef@xeon-e3>
Accept-Language: en-IE, en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [163.33.239.182]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-dev] [PATCH 0/6] add Tx preparation
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Wed, 31 Aug 2016 12:35:19 -0000


>
> On Fri, 26 Aug 2016 18:22:52 +0200
> Tomasz Kulasek <tomaszx.kulasek@intel.com> wrote:
>
> > As discussed in that thread:
> >
> > http://dpdk.org/ml/archives/dev/2015-September/023603.html
> >
> > Different NIC models depending on HW offload requested might impose
> > different requirements on packets to be TX-ed in terms of:
> >
> >  - Max number of fragments per packet allowed
> >  - Max number of fragments per TSO segments
> >  - The way pseudo-header checksum should be pre-calculated
> >  - L3/L4 header fields filling
> >  - etc.
> >
> >
> > MOTIVATION:
> > -----------
> >
> > 1) Some work cannot (and should not) be done in rte_eth_tx_burst.
> >    However, this work is sometimes required, and now, it's an
> >    application issue.
>
> Why not? You are adding an additional API burden on every application.
>
> >
> > 2) Different hardware may have different requirements for TX offloads,
> >    other subset can be supported and so on.
>
> These need to be reported by API so that application can handle it.

If you read the patch description, you'll see that we do both:
- provide tx_prep()
- "2) Also new fields will be introduced in rte_eth_desc_lim:
   nb_seg_max and nb_mtu_seg_max, providing an information about max
   segments in TSO and non-TSO packets acceptable by device.

   This information is useful for application to not create/limit
   malicious packet."
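
As a rough illustration of how an application might use those two limits, here is a minimal sketch. The struct and function names are made up for the example (they are not the DPDK definitions), and it assumes nb_seg_max applies to TSO packets and nb_mtu_seg_max to non-TSO packets, as the description above suggests:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in holding only the two proposed limit fields. */
struct desc_lim_sketch {
    uint16_t nb_seg_max;     /* assumed: max segments for a TSO packet */
    uint16_t nb_mtu_seg_max; /* assumed: max segments for a non-TSO packet */
};

/* Hypothetical stand-in for the parts of an mbuf chain we care about. */
struct pkt_sketch {
    uint16_t nb_segs; /* number of segments in the chain */
    bool     tso;     /* true if TSO offload is requested */
};

/* Return true if the packet's segment count is within the device limit. */
bool
pkt_fits_limits(const struct pkt_sketch *pkt,
                const struct desc_lim_sketch *lim)
{
    assert(pkt != NULL && lim != NULL);

    uint16_t max = pkt->tso ? lim->nb_seg_max : lim->nb_mtu_seg_max;

    return pkt->nb_segs <= max;
}
```

With such a check the application can avoid building (or can linearize) a chain that the device would not accept, before the packet ever reaches the TX path.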

> Doing these transformations in tx_prep seems late in the process.

Why is that?
It is totally up to the application to decide at what stage it wants to call
tx_prep() for each packet: just after it has formed an mbuf to be TX-ed,
just before calling tx_burst() for it, or somewhere in between.
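
To make the split concrete, here is a minimal sketch of the "prepare, then send" flow. Everything in it is hypothetical: tx_prep_sketch() and tx_burst_sketch() are stand-ins, not the proposed API, and the assumption that the prep step stops at the first packet it cannot accept and returns the count of prepared packets is mine for the sake of the example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_SEGS_SKETCH 8 /* illustrative per-packet segment limit */

/* Hypothetical stand-in for an mbuf. */
struct pkt_sketch {
    uint16_t nb_segs;  /* number of segments in the chain */
    bool     prepared; /* set once the prep step has accepted the packet */
};

/* Stand-in for the prep step: validate packets in order, stop at the
 * first one that cannot be sent, return how many were prepared. */
uint16_t
tx_prep_sketch(struct pkt_sketch **pkts, uint16_t n)
{
    uint16_t i;

    for (i = 0; i < n; i++) {
        if (pkts[i]->nb_segs > MAX_SEGS_SKETCH)
            break;
        pkts[i]->prepared = true; /* checksum/header fixups would go here */
    }
    return i;
}

/* Stand-in for the TX step: it only queues packets that are already
 * prepared and does not modify packet data. */
uint16_t
tx_burst_sketch(struct pkt_sketch **pkts, uint16_t n)
{
    uint16_t i;

    for (i = 0; i < n; i++)
        assert(pkts[i]->prepared);
    return n;
}

/* One possible placement: prep immediately before TX. The prep call could
 * equally be made earlier, e.g. right after the packet is formed. */
uint16_t
send_prepared(struct pkt_sketch **pkts, uint16_t n)
{
    uint16_t nb_ok = tx_prep_sketch(pkts, n);

    return tx_burst_sketch(pkts, nb_ok);
}
```

The point of the sketch is only that the two steps are separate calls, so the caller chooses where the packet-modifying work happens.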

>
> >
> > 3) Some parameters (e.g. number of segments in ixgbe driver) may hang
> >    the device. These parameters may vary for different devices.
> >
> >    For example i40e HW allows 8 fragments per packet, but that is after
> >    TSO segmentation. While ixgbe has a 38-fragment pre-TSO limit.
>
> Seems better to handle these limits as exceptions in i40e_tx_burst etc;
> rather than a pre-step. Look at how Linux driver API works, several
> drivers have to have an exception linearize path.

Hmm, doesn't that contradict your statement above:
'Doing these transformations in tx_prep seems late in the process.'? :)
I suppose we all know that the usage models of a Linux kernel driver and a
DPDK PMD are quite different.
As a rule of thumb, we try to avoid modifying packet data inside tx_burst()
itself.
Having this functionality in a separate function gives the upper layer a
choice of when it is best to modify packet contents, and hopefully to
hide/minimize memory access latencies.

>
> >
> > 4) Fields in packet may require different initialization (like e.g. will
> >    require pseudo-header checksum precalculation, sometimes in a
> >    different way depending on packet type, and so on). Now application
> >    needs to care about it.
>
> Once again, the driver should do this in Tx.

Once again, I really doubt it should.

>
>
> >
> > 5) Using additional API (rte_eth_tx_prep) before rte_eth_tx_burst lets
> >    the application prepare a packet burst in a form acceptable to a
> >    specific device.
> >
> > 6) Some additional checks may be done in debug mode keeping tx_burst
> >    implementation clean.
>
> Most of this could be done by refactoring existing tx_burst in drivers.
> Much of the code seems to be written as the "let's write a 2000 line
> function because that is most efficient" rather than "let's write small
> steps and let the compiler optimize it"

I don't see how that could easily be done inside tx_burst() without
significant performance loss.
Especially if we have a pipeline model, where one or several lcores produce
mbufs to be TX-ed, and one or several lcores do the actual TX for these
packets.

Konstantin