From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <helin.zhang@intel.com>
Received: from mga01.intel.com (mga01.intel.com [192.55.52.88])
 by dpdk.org (Postfix) with ESMTP id ABDFF1518
 for <dev@dpdk.org>; Fri, 28 Nov 2014 09:08:28 +0100 (CET)
Received: from fmsmga001.fm.intel.com ([10.253.24.23])
 by fmsmga101.fm.intel.com with ESMTP; 28 Nov 2014 00:08:18 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.07,475,1413270000"; d="scan'208";a="629536397"
Received: from pgsmsx108.gar.corp.intel.com ([10.221.44.103])
 by fmsmga001.fm.intel.com with ESMTP; 28 Nov 2014 00:07:44 -0800
Received: from shsmsx151.ccr.corp.intel.com (10.239.6.50) by
 PGSMSX108.gar.corp.intel.com (10.221.44.103) with Microsoft SMTP Server (TLS)
 id 14.3.195.1; Fri, 28 Nov 2014 16:07:42 +0800
Received: from shsmsx104.ccr.corp.intel.com ([169.254.5.182]) by
 SHSMSX151.ccr.corp.intel.com ([169.254.3.86]) with mapi id 14.03.0195.001;
 Fri, 28 Nov 2014 16:07:41 +0800
From: "Zhang, Helin" <helin.zhang@intel.com>
To: Olivier MATZ <olivier.matz@6wind.com>, "Ananyev, Konstantin"
 <konstantin.ananyev@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
Thread-Topic: [PATCH] i40e: Use one bit flag for all hardware detected RX
 packet errors
Thread-Index: AQHQCWs/YCaH2DMkh0m0YaBNr2E3oZxyY4CAgAAJe4CAA0CEsA==
Date: Fri, 28 Nov 2014 08:07:41 +0000
Message-ID: <F35DEAC7BCE34641BA9FAC6BCA4A12E70A7CC692@SHSMSX104.ccr.corp.intel.com>
References: <1416982032-28519-1-git-send-email-helin.zhang@intel.com>
 <2601191342CEEE43887BDE71AB977258213BA7CA@IRSMSX105.ger.corp.intel.com>
 <5475B7EE.4020400@6wind.com>
 <2601191342CEEE43887BDE71AB977258213BA8CB@IRSMSX105.ger.corp.intel.com>
 <5475DFB9.7060609@6wind.com>
In-Reply-To: <5475DFB9.7060609@6wind.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [10.239.127.40]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-dev] [PATCH] i40e: Use one bit flag for all hardware
 detected RX packet errors
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Fri, 28 Nov 2014 08:08:29 -0000

Hi Olivier, Konstantin

> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Wednesday, November 26, 2014 10:12 PM
> To: Ananyev, Konstantin; Zhang, Helin; dev@dpdk.org
> Cc: Cao, Waterman; Cao, Min
> Subject: Re: [PATCH] i40e: Use one bit flag for all hardware detected RX =
packet
> errors
>=20
> Hi Konstantin,
>=20
> On 11/26/2014 02:38 PM, Ananyev, Konstantin wrote:
> >>> Probably I didn't explain myself clear enough, sorry.
> >>> I didn't suggest to get rid of setting bits that indicate L3/L4 check=
sum errors:
> >>> PKT_RX_IP_CKSUM_BAD, PKT_RX_L4_CKSUM_BAD,
> PKT_RX_EIP_CKSUM_BAD.
> >>> I think these flags should be set as before.
> >>>
> >>> I was talking only about collapsing only these 4 RX error flags into =
one:
> >>>
> >>> #define PKT_RX_OVERSIZE      (0ULL << 0)  /**< Num of desc of an RX
> pkt oversize. */
> >>> #define PKT_RX_HBUF_OVERFLOW (0ULL << 0)  /**< Header buffer
> overflow. */
> >>> #define PKT_RX_RECIP_ERR     (0ULL << 0)  /**< Hardware processing
> error. */
> >>> #define PKT_RX_MAC_ERR       (0ULL << 0)  /**< MAC error. */
> >>>
> >>>   From my point of view the difference of these 2 groups are:
> >>> First - HW was able to receive whole packet without a problem, but L3=
/L4
> checksum check failed.
> >>>
> >>> Second - HW was not able to receive whole packet properly by whatever
> reason.
> >>>   From upper layer SW perspective - there it probably makes little
> >>> difference, what caused it, as most likely SW has to throw away erron=
eous
> packet.
> >>> And for debugging purposes, we can add PMD_LOG(DEBUG, ...) that would
> print what exactly HW error happened.
> >>
> >> I agree with Konstantin that there are 2 different cases:
> >>
> >> a) the packet is properly received by the hardware, but has a bad
> >>      checksum (or another protocol error, for instance an invalid ip l=
en,
> >>      a ip_version =3D=3D 8 :))
> >>
> >>      in this case, it is useful to the application to have the mbuf wi=
th
> >>      the data + an error flag. Then using a tcpdump-like tool could he=
lp
> >>      to debug what is the cause of the error and what equipment genera=
tes
> >>      a bad packet.
> >>
> >> b) the packet is not properly received by the hardware. In this case
> >>      the data is invalid in the mbuf and not useable by the applicatio=
n.
> >>      I suggest to only have a stats counter in this case, as receiving=
 the
> >>      mbuf is cpu time consuming and the only thing the application can=
 do
> >>      is to drop the packet.
> >
> > So for b) you suggest to drop the packet straight in PMD RX function?
> > Something like:
> > if (unlikely(error_bits & ...)) {
> >          PMD_LOG(DEBUG, ...);
> >           rte_pktmbuf_free(mb);
> > }
> > ?
>=20
> Yes
>=20
> > That's probably a bit too radical.
> > Yes, mbuf doesn't contain the whole packet, but it may contain at least=
 part of it,
> let say in case of 'packet oversize'.
> > So for debugging purposes the user may still like to examine the mbuf
> contents.
>=20
> As soon as there is some exploitable data in the mbuf, I agree it can be =
transfered
> to the application (ex: bad header, bad len, bad checksum...).
>=20
> But if the hardware is not able to provide any exploitable data, it looks=
 a bit
> overkill to give an mbuf with an error flag.
>=20
> But grouping the flags as you suggest is already a good clean-up to me, I=
 don't
> want to be more catholic than the Pope ;)

After I have completed another task, I read the datasheet carefully again. =
For those 5
error bits I introduced for a long time, I'd like to explain one by one as =
below.

#define PKT_RX_EIP_CKSUM_BAD (0ULL << 0)  /**< External IP header checksum =
error. */
[Helin] Nobody complains it, so we will keep it there, and just assign a ne=
w value to it.

#define PKT_RX_OVERSIZE      (0ULL << 0)  /**< Num of desc of an RX pkt ove=
rsize. */
[Helin] I don't think it can be merge with other hardware errors. It indica=
tes the packet
received needs more descriptors than hardware allowed, and the part of pack=
ets can
still be stored in the mbufs provided. It is a good hint for users that lar=
ger size of mbuf
might be needed. If just put it as hardware error, users will lose this inf=
ormation. So I
prefer to keep it there, and just assign a new value to it.

#define PKT_RX_HBUF_OVERFLOW (0ULL << 0)  /**< Header buffer overflow. */
[Helin] It indicates the header buff size is not enough, but not means hard=
ware cannot
process the packet received. It is a good hint for the users to provide lar=
ger size of header
buffers. I also prefer to keep it there, and just assign new value to it.

#define PKT_RX_RECIP_ERR     (0ULL << 0)  /**< Hardware processing error. *=
/
[Helin] In the latest data sheet, it is not opened to external users. So we=
 can just remove
it from here.

#define PKT_RX_MAC_ERR       (0ULL << 0)  /**< MAC error. */
[Helin] This indicates a real hardware error happens.

So my point is to just remove PKT_RX_RECIP_ERR, and we still need other new=
 bit flags.
Any thought from you guys?

Regards,
Helin

>=20
> Regards,
> Olivier