From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by dpdk.org (Postfix) with ESMTP id 8E0FC6CBE for ; Wed, 6 Jul 2016 13:59:10 +0200 (CEST) Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga104.fm.intel.com with ESMTP; 06 Jul 2016 04:59:05 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.28,318,1464678000"; d="scan'208";a="1016528167" Received: from irsmsx154.ger.corp.intel.com ([163.33.192.96]) by fmsmga002.fm.intel.com with ESMTP; 06 Jul 2016 04:59:04 -0700 Received: from irsmsx106.ger.corp.intel.com ([169.254.8.145]) by IRSMSX154.ger.corp.intel.com ([169.254.12.28]) with mapi id 14.03.0248.002; Wed, 6 Jul 2016 12:59:03 +0100 From: "Chilikin, Andrey" To: Olivier MATZ , "Liang, Cunming" , "dev@dpdk.org" CC: "Ananyev, Konstantin" Thread-Topic: [dpdk-dev] [PATCH 05/18] mbuf: add function to get packet type from data Thread-Index: AQHR11HwCOOTbt1pqUinZrKb29oQQKAK9IeAgAA1OlA= Date: Wed, 6 Jul 2016 11:59:03 +0000 Message-ID: References: <1467733310-20875-1-git-send-email-olivier.matz@6wind.com> <1467733310-20875-6-git-send-email-olivier.matz@6wind.com> <577CA8D1.5000203@intel.com> <12989717-cc42-9ce6-f520-0ffbd3db7a8a@6wind.com> In-Reply-To: <12989717-cc42-9ce6-f520-0ffbd3db7a8a@6wind.com> Accept-Language: en-GB, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [163.33.239.181] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH 05/18] mbuf: add function to get packet type from data X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jul 2016 11:59:11 -0000 Hi Oliver, > -----Original Message----- > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier MATZ > Sent: Wednesday, July 6, 2016 8:43 AM > To: Liang, Cunming ; dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH 05/18] mbuf: add function to get packet ty= pe > from data >=20 > Hi Cunming, >=20 > On 07/06/2016 08:44 AM, Liang, Cunming wrote: > > Hi Olivier, > > > > On 7/5/2016 11:41 PM, Olivier Matz wrote: > >> Introduce the function rte_pktmbuf_get_ptype() that parses a mbuf and > >> returns its packet type. For now, the following packet types are > >> parsed: > >> L2: Ether > >> L3: IPv4, IPv6 > >> L4: TCP, UDP, SCTP > >> > >> The goal here is to provide a reference implementation for packet > >> type parsing. This function will be used by testpmd in next commits, > >> allowing to compare its result with the value given by the hardware. > >> > >> This function will also be useful when implementing Rx offload > >> support in virtio pmd. Indeed, the virtio protocol gives the csum > >> start and offset, but it does not give the L4 protocol nor it tells > >> if the checksum is relevant for inner or outer. This information has > >> to be known to properly set the ol_flags in mbuf. > >> > >> Signed-off-by: Didier Pallard > >> Signed-off-by: Jean Dao > >> Signed-off-by: Olivier Matz > >> --- > >> doc/guides/rel_notes/release_16_11.rst | 5 + > >> lib/librte_mbuf/Makefile | 5 +- > >> lib/librte_mbuf/rte_mbuf_ptype.c | 234 > >> +++++++++++++++++++++++++++++++++ > >> lib/librte_mbuf/rte_mbuf_ptype.h | 43 ++++++ > >> lib/librte_mbuf/rte_mbuf_version.map | 1 + > >> 5 files changed, 286 insertions(+), 2 deletions(-) > >> create mode 100644 lib/librte_mbuf/rte_mbuf_ptype.c > >> > >> [...] > >> + > >> +/* parse mbuf data to get packet type */ uint32_t > >> +rte_pktmbuf_get_ptype(const struct rte_mbuf *m, > >> + struct rte_mbuf_hdr_lens *hdr_lens) { > >> + struct rte_mbuf_hdr_lens local_hdr_lens; > >> + const struct ether_hdr *eh; > >> + struct ether_hdr eh_copy; > >> + uint32_t pkt_type =3D RTE_PTYPE_L2_ETHER; > >> + uint32_t off =3D 0; > >> + uint16_t proto; > >> + > >> + if (hdr_lens =3D=3D NULL) > >> + hdr_lens =3D &local_hdr_lens; > >> + > >> + eh =3D rte_pktmbuf_read(m, off, sizeof(*eh), &eh_copy); > >> + if (unlikely(eh =3D=3D NULL)) > >> + return 0; > >> + proto =3D eh->ether_type; > >> + off =3D sizeof(*eh); > >> + hdr_lens->l2_len =3D off; > >> + > >> + if (proto =3D=3D rte_cpu_to_be_16(ETHER_TYPE_IPv4)) { > >> + const struct ipv4_hdr *ip4h; > >> + struct ipv4_hdr ip4h_copy; > >> + > >> + ip4h =3D rte_pktmbuf_read(m, off, sizeof(*ip4h), &ip4h_copy); > >> + if (unlikely(ip4h =3D=3D NULL)) > >> + return pkt_type; > >> + > >> + pkt_type |=3D ptype_l3_ip(ip4h->version_ihl); > >> + hdr_lens->l3_len =3D ip4_hlen(ip4h); > >> + off +=3D hdr_lens->l3_len; > >> + if (ip4h->fragment_offset & > >> + rte_cpu_to_be_16(IPV4_HDR_OFFSET_MASK | > >> + IPV4_HDR_MF_FLAG)) { > >> + pkt_type |=3D RTE_PTYPE_L4_FRAG; > >> + hdr_lens->l4_len =3D 0; > >> + return pkt_type; > >> + } > >> + proto =3D ip4h->next_proto_id; > >> + pkt_type |=3D ptype_l4(proto); > >> + } else if (proto =3D=3D rte_cpu_to_be_16(ETHER_TYPE_IPv6)) { > >> + const struct ipv6_hdr *ip6h; > >> + struct ipv6_hdr ip6h_copy; > >> + int frag =3D 0; > >> + > >> + ip6h =3D rte_pktmbuf_read(m, off, sizeof(*ip6h), &ip6h_copy); > >> + if (unlikely(ip6h =3D=3D NULL)) > >> + return pkt_type; > >> + > >> + proto =3D ip6h->proto; > >> + hdr_lens->l3_len =3D sizeof(*ip6h); > >> + off +=3D hdr_lens->l3_len; > >> + pkt_type |=3D ptype_l3_ip6(proto); > >> + if ((pkt_type & RTE_PTYPE_L3_MASK) =3D=3D RTE_PTYPE_L3_IPV6_E= XT) { > >> + proto =3D skip_ip6_ext(proto, m, &off, &frag); > >> + hdr_lens->l3_len =3D off - hdr_lens->l2_len; > >> + } > >> + if (proto =3D=3D 0) > >> + return pkt_type; > >> + if (frag) { > >> + pkt_type |=3D RTE_PTYPE_L4_FRAG; > >> + hdr_lens->l4_len =3D 0; > >> + return pkt_type; > >> + } > >> + pkt_type |=3D ptype_l4(proto); > >> + } > >> + > >> + if ((pkt_type & RTE_PTYPE_L4_MASK) =3D=3D RTE_PTYPE_L4_UDP) { > >> + hdr_lens->l4_len =3D sizeof(struct udp_hdr); > >> + } else if ((pkt_type & RTE_PTYPE_L4_MASK) =3D=3D RTE_PTYPE_L4_TCP= ) { > >> + const struct tcp_hdr *th; > >> + struct tcp_hdr th_copy; > >> + > >> + th =3D rte_pktmbuf_read(m, off, sizeof(*th), &th_copy); > >> + if (unlikely(th =3D=3D NULL)) > >> + return pkt_type & (RTE_PTYPE_L2_MASK | > >> + RTE_PTYPE_L3_MASK); > >> + hdr_lens->l4_len =3D (th->data_off & 0xf0) >> 2; > >> + } else if ((pkt_type & RTE_PTYPE_L4_MASK) =3D=3D RTE_PTYPE_L4_SCT= P) { > >> + hdr_lens->l4_len =3D sizeof(struct sctp_hdr); > >> + } else { > >> + hdr_lens->l4_len =3D 0; > >> + } > >> + > >> + return pkt_type; > >> +} > >> diff --git a/lib/librte_mbuf/rte_mbuf_ptype.h > >> b/lib/librte_mbuf/rte_mbuf_ptype.h > >> index 4a34678..f468520 100644 > >> --- a/lib/librte_mbuf/rte_mbuf_ptype.h > >> +++ b/lib/librte_mbuf/rte_mbuf_ptype.h > >> @@ -545,6 +545,49 @@ extern "C" { > >> RTE_PTYPE_INNER_L3_MASK | \ > >> RTE_PTYPE_INNER_L4_MASK)) > >> +struct rte_mbuf; > >> + > >> +/** > >> + * Structure containing header lengths associated to a packet. > >> + */ > >> +struct rte_mbuf_hdr_lens { > >> + uint8_t l2_len; > >> + uint8_t l3_len; > >> + uint8_t l4_len; > >> + uint8_t tunnel_len; > >> + uint8_t inner_l2_len; > >> + uint8_t inner_l3_len; > >> + uint8_t inner_l4_len; > >> +}; > > [LC] The header parsing graph usually is not unique. The definition > > maybe nice for the basic IP and L4 tunnel. > > However it can't scale out to other cases, e.g. qinq, mac-in-mac, mpls > > l2/l3 tunnel. > > The parsing logic of "rte_pktmbuf_get_ptype()" and the definition of > > "struct rte_mbuf_hdr_lens" consist a pair for one specific parser schem= e. > > In this case, the fixed function is to support below. > > > > + * Supported packet types are: > > + * L2: Ether > > + * L3: IPv4, IPv6 > > + * L4: TCP, UDP, SCTP > > > > Of course, it can add more packet type detection logic in future. But > > the more support, the higher the cost. > > > > One of the alternative way is to allow registering parser pair. APP > > decides to choose the predefined scheme(by DPDK LIB), or to > > self-define the parsing logic. > > In this way, the scheme can do some assumption for the specific case > > and ignore some useless graph detection. > > In addition, besides the SW parser, the HW parser(identified by > > packet_type in mbuf) can be turn on/off by leveraging the same manner. >=20 > Sorry, I'm not sure I'm fully getting what you are saying. If I understan= d well, > you would like to have something more flexible that supports the registra= tion of > protocol to be recognized? >=20 > I'm not sure having a function with a dynamic registration method would r= eally > increase the performance compared to a static complete function. > Actually, we will never support a tons of protocols since each layer pack= et type > is 4 bits, and since it requires that at least one hw supports it. This patch will be very useful as a reference implementation, but it also h= ighlights an issue with the current implementation of packet types reportin= g by HW and SW - as you just mentioned there are only 4 bits per each layer= . As these 4 bit are used as a enumeration it is impossible to reports mult= iple headers located on the same layer. MPLS is one example, different pack= ets could have different numbers of MPLS labels, but it is impossible to re= port using current packet_type structure. It is possible, however, to program HW to report user (application) specif= ic packet types. For example, for IPoMPLS with one MPLS label, HW will repo= rt packet type A, but for IPoMPLS with two MPLS labels HW will reports pack= et type B. In this case, instead of defining and supporting tons of statica= lly defined (or enumerated) protocol headers combinations, application will= register packet types it expects from HW in addition to standard packet ty= pes. At the moment we have high bits of packet_type reserved, so one possi= ble solution would be to use the highest bit to indicate that this is user = defined packet_type, specific to the application. Then it could be used wit= h HW and with SW parser. For example, packet_type 0x8000000A is IPoMPLS wit= h one MPLS label, 0x8000000B is IPoMPLS with two MPLS labels and so on. Regards, Andrey >=20 > As described in the cover letter, the 2 main goals of this patchset are t= o provide > a reference implementation for packet type recognition, and enable the > support of virtio offloads (I'll send the patchset soon). > This function is adapted to these 2 usages. Are you thinking of another u= se-case > that would not be covered? >=20 > Regards, > Olivier