From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <helin.zhang@intel.com>
Received: from mga03.intel.com (mga03.intel.com [134.134.136.65])
 by dpdk.org (Postfix) with ESMTP id 5053732A5
 for <dev@dpdk.org>; Tue,  2 Dec 2014 08:26:24 +0100 (CET)
Received: from orsmga001.jf.intel.com ([10.7.209.18])
 by orsmga103.jf.intel.com with ESMTP; 01 Dec 2014 23:22:43 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.07,499,1413270000"; d="scan'208";a="617102833"
Received: from pgsmsx101.gar.corp.intel.com ([10.221.44.78])
 by orsmga001.jf.intel.com with ESMTP; 01 Dec 2014 23:26:02 -0800
Received: from shsmsx103.ccr.corp.intel.com (10.239.4.69) by
 PGSMSX101.gar.corp.intel.com (10.221.44.78) with Microsoft SMTP Server (TLS)
 id 14.3.195.1; Tue, 2 Dec 2014 15:26:01 +0800
Received: from shsmsx104.ccr.corp.intel.com ([169.254.5.182]) by
 SHSMSX103.ccr.corp.intel.com ([169.254.4.240]) with mapi id 14.03.0195.001;
 Tue, 2 Dec 2014 15:26:00 +0800
From: "Zhang, Helin" <helin.zhang@intel.com>
To: Olivier MATZ <olivier.matz@6wind.com>, "Ananyev, Konstantin"
 <konstantin.ananyev@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
Thread-Topic: [PATCH] i40e: Use one bit flag for all hardware detected RX
 packet errors
Thread-Index: AQHQCWs/YCaH2DMkh0m0YaBNr2E3oZxyY4CAgAAJe4CAA0CEsP//iXmAgATBUXCAAAl4gIAB45Ow
Date: Tue, 2 Dec 2014 07:25:59 +0000
Message-ID: <F35DEAC7BCE34641BA9FAC6BCA4A12E70A7CD771@SHSMSX104.ccr.corp.intel.com>
References: <1416982032-28519-1-git-send-email-helin.zhang@intel.com>
 <2601191342CEEE43887BDE71AB977258213BA7CA@IRSMSX105.ger.corp.intel.com>
 <5475B7EE.4020400@6wind.com>
 <2601191342CEEE43887BDE71AB977258213BA8CB@IRSMSX105.ger.corp.intel.com>
 <5475DFB9.7060609@6wind.com>
 <F35DEAC7BCE34641BA9FAC6BCA4A12E70A7CC692@SHSMSX104.ccr.corp.intel.com>
 <547836A9.1010008@6wind.com>
 <F35DEAC7BCE34641BA9FAC6BCA4A12E70A7CCD73@SHSMSX104.ccr.corp.intel.com>
 <547C3BC3.9050505@6wind.com>
In-Reply-To: <547C3BC3.9050505@6wind.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [10.239.127.40]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-dev] [PATCH] i40e: Use one bit flag for all hardware
 detected RX packet errors
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Tue, 02 Dec 2014 07:26:25 -0000


> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Monday, December 1, 2014 5:58 PM
> To: Zhang, Helin; Ananyev, Konstantin; dev@dpdk.org
> Cc: Cao, Waterman; Cao, Min
> Subject: Re: [PATCH] i40e: Use one bit flag for all hardware detected RX =
packet
> errors
>=20
> Hi Helin,
>=20
> On 12/01/2014 02:57 AM, Zhang, Helin wrote:
> >>> #define PKT_RX_EIP_CKSUM_BAD (0ULL << 0)  /**< External IP header
> >>> checksum error. */ [Helin] Nobody complains it, so we will keep it
> >>> there, and
> >> just assign a new value to it.
> >>
> >> ok.
> >>
> >> But it would be nice to have a better definition of this flag: does
> >> external mean outer header? For instance, when you receive a
> >> Ether/IP1/UDP/vxlan/Ether/IP2/xxx, does the flag concerns IP1 or IP2?
> > 'E' means 'external', it indicates the (most) outer IP header checksum
> > error. If you don't think this name is not so clear, I can change it to
> 'PKT_RX_OUTER_IP_CHSUM_BAD'.
> > For inner IP header checksum error, it will be indicated by
> PKT_RX_IP_CKSUM_BAD.
> >
> >>
> >> If it's IP1, it's really strange compared to the current behavior
> >> (the flag PKT_RX_IP_CKSUM_BAD refers to IP1).
>=20
> Ok.
> But the real sense of my question was about the behavior which seems diff=
erent
> than with previous hardware. Today, if you receive the packet
> Ether/IP1/UDP/vxlan/Ether/IP2/xxx on an ixgbe, the flag
> PKT_RX_IP_CKSUM_BAD can be set if the checksum of IP1 is wrong. From your
> explanation, I understand that PKT_RX_EIP_CKSUM_BAD would be set for the
> same thing on i40e. Is it correct?
Yes, it is strange if ixgbe hardware checksum logic knows the packet is tun=
neling.
But, from another point of view, which seems the the ixgbe hardware does, i=
f
checksum logic does not know tunneling, it may treat all others as data. Th=
is is
reasonable to report the error with PKT_RX_IP_CKSUM_BAD.

>=20
>=20
> >>> #define PKT_RX_OVERSIZE      (0ULL << 0)  /**< Num of desc of an RX
> pkt
> >> oversize. */
> >>> [Helin] I don't think it can be merge with other hardware errors. It
> >>> indicates the packet received needs more descriptors than hardware
> >>> allowed, and the part of packets can still be stored in the mbufs
> >>> provided. It is a good hint for users that larger size of mbuf might
> >>> be needed. If just put it as hardware error, users will lose this
> >>> information. So I
> >> prefer to keep it there, and just assign a new value to it.
> >>
> >> Again, a statistic counter would do the job which if it's just to
> >> provide a hint to the application.
> > It seems that we do not maintain a counter for packets in PMD, if I am
> > not wrong. Two ways current DPDK is using.
> > One is hardware provide registers to do that, we can read it directly w=
hen
> needed.
> > The other one is that applications or middle layer sw maintain its own =
statistics.
>=20
> rte_eth_stats_get() gives the generic statistics For specific error stats=
,
> rte_eth_xstats_get() can be used from an application (the driver has to p=
rovide
> the full list of statistics).
Yes, that function read all the statistics from registers directly. I think=
 there might
already have the corresponding registers for it.

>=20
> >> I wonder in which case this flag can happen. If you fill the ring
> >> with mbufs that are large enough compared to your ethernet network,
> >> this should not happen in normal conditions. I really don't believe
> >> that an application receiving an mbuf with this flag would stop the dr=
iver, then
> refill the rings it with larger mbufs.
> > This is not because of it is lack of available RX descriptors. It is
> > because of a hardware requirement. FVL hardware requires that it
> > should not use more than 5 rx descriptors for receiving a single packet=
.
>=20
> I still don't understand what the application should do when the flag is =
set.
> Maybe you could provide an example in l2fwd or testpmd?
For an application, if it is reported with this oversize error, it then eas=
ily knows that
the mbuf size might not be enough. If it is reported with just a hardware e=
rror, then
the application developer needs to dig into the PMD to see what happens. I =
think most
of application developers do not want to debug PMD to much, as it might not=
 be their scope.

>=20
> >> Last but not least: If it's really useful, should we have the same
> >> behavior on other drivers like ixgbe? I think we really need to care
> >> about not having different ways to use the different drivers.
> > I don't see the similar bit in ixgbe datasheet, but this restriction
> > could be common for some other NICs, as it is reasonable from hardware
> perspective.
>=20
> In ixgbe, there are other error cases:
> - frames shorter than 64 bytes
> - oversize (frames larger than MAXFRS)
> - ... maybe others?
>=20
> Should we have a flag for each situation? I think not.
The more the better, but we may need a tradeoff, as the flag bits is limite=
d.

>=20
>=20
> >> To me, the only argument in favor of keeping this flag is when the
> >> mbuf contains a part of the data that could be dumped by a user for de=
bug
> purposes.
> >>
> >>> #define PKT_RX_HBUF_OVERFLOW (0ULL << 0)  /**< Header buffer
> overflow.
> >>> */ [Helin] It indicates the header buff size is not enough, but not
> >>> means hardware cannot process the packet received. It is a good hint
> >>> for the users to provide larger size of header buffers. I also
> >>> prefer to keep it
> >> there, and just assign new value to it.
> >>
> >> Same for this one.
> > It is a bit different from previous one, it always has one header
> > buffer, this flag means the buffer size is not enough for the header.
> > These two flags are because of buffer size or number of buffers. The
> > mbufs are prepared in application or up layer software. If these two
> > flags occur, it is easier for up layer software to debug, and know
> > different buffers are needed. They do not need to debug PMD, as they
> generally don't want to do.
>=20
> You say it's easier for the software to debug, but I cannot see the diffe=
rence.
> When it's a statistics counter, you just have to use rte_eth_xstats_get()=
, which is
> an equivalent of "ethtool -S iface"
> which gives all the hardware statistics. It will work for any driver and =
any
> application.
>=20
> If we add these flags, the application have to know about all these speci=
fic flags
> and how to handle them.
OK. I agree with you not all errors need to be reported as flags. But I'd p=
refer to
add those flags which is useful for applications to bypass the errors.

>=20
> >>> #define PKT_RX_RECIP_ERR     (0ULL << 0)  /**< Hardware processing
> error.
> >> */
> >>> [Helin] In the latest data sheet, it is not opened to external users.
> >>> So we can just remove it from here.
> >>
> >> ok
> >>
> >>> #define PKT_RX_MAC_ERR       (0ULL << 0)  /**< MAC error. */
> >>> [Helin] This indicates a real hardware error happens.
> >>
> >> And what is the content of the mbuf data in this case? Does the
> >> application really need an mbuf?
> > Mbuf contains both the data and other information. I prefer to let the
> > up layer software to decide how to deal with the packet, no matter it
> > is correct or bad. In addition, even hardware errors happened, it
> > still can set a special bit to enable storing the whole packet to the
> > mbuf, for debug purpose. Hardware error bit can be used for all hardwar=
e
> errors. As we do not have one there, why not add one?
>=20
> You say "let the up layer software to decide how to deal with the packet,=
 no
> matter it is correct or bad". But what can do an application with a packe=
t if it does
> not know if the data is correct or bad?
Mbuf flags can tell him if the data is good or bad, if good, more info can =
be seen in flags.

>=20
> Regards,
> Olivier

Yes, flags should be a tradeoff, we can provide useful flags in mbuf, but n=
ot all.

Regards,
Helin