From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id A4A001EAD9 for ; Tue, 12 Jun 2018 19:17:37 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 12 Jun 2018 10:17:36 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.51,215,1526367600"; d="scan'208";a="47195117" Received: from fmsmsx104.amr.corp.intel.com ([10.18.124.202]) by fmsmga008.fm.intel.com with ESMTP; 12 Jun 2018 10:17:36 -0700 Received: from fmsmsx120.amr.corp.intel.com (10.18.124.208) by fmsmsx104.amr.corp.intel.com (10.18.124.202) with Microsoft SMTP Server (TLS) id 14.3.319.2; Tue, 12 Jun 2018 10:17:36 -0700 Received: from fmsmsx117.amr.corp.intel.com ([169.254.3.220]) by fmsmsx120.amr.corp.intel.com ([10.18.124.208]) with mapi id 14.03.0319.002; Tue, 12 Jun 2018 10:17:36 -0700 From: "Wiles, Keith" To: Ophir Munk CC: "dev@dpdk.org" , Pascal Mazon , Thomas Monjalon , Olga Shern Thread-Topic: [PATCH v4 1/2] net/tap: calculate checksums of multi segs packets Thread-Index: AQHUAmrrQcgt0sC4UE+Q8/uCgdaGwKRdUwQA Date: Tue, 12 Jun 2018 17:17:35 +0000 Message-ID: <848DE22E-9581-4F40-AEF8-52EDEC425DFE@intel.com> References: <1520629826-23055-2-git-send-email-ophirmu@mellanox.com> <1528821108-12405-1-git-send-email-ophirmu@mellanox.com> <1528821108-12405-2-git-send-email-ophirmu@mellanox.com> In-Reply-To: <1528821108-12405-2-git-send-email-ophirmu@mellanox.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.254.100.179] Content-Type: text/plain; charset="us-ascii" Content-ID: <16B96DB09260B84B8FD6093782BC207C@intel.com> Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH v4 1/2] net/tap: calculate checksums of multi segs packets X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Jun 2018 17:17:39 -0000 A few formatting problems I have noticed. We can review the code logic in a= meeting. > On Jun 12, 2018, at 11:31 AM, Ophir Munk wrote: >=20 > Prior to this commit IP/UDP/TCP checksum offload calculations > were skipped in case of a multi segments packet. > This commit enables TAP checksum calculations for multi segments > packets. > The only restriction is that the first segment must contain > headers of layers 3 (IP) and 4 (UDP or TCP) >=20 > Reviewed-by: Raslan Darawsheh > Signed-off-by: Ophir Munk > --- > drivers/net/tap/rte_eth_tap.c | 158 +++++++++++++++++++++++++++++--------= ----- > 1 file changed, 110 insertions(+), 48 deletions(-) >=20 > diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.= c > index df396bf..c19f053 100644 > --- a/drivers/net/tap/rte_eth_tap.c > +++ b/drivers/net/tap/rte_eth_tap.c > @@ -415,12 +415,43 @@ tap_tx_offload_get_queue_capa(void) > DEV_TX_OFFLOAD_TCP_CKSUM; > } >=20 > +/* Finalize l4 checksum calculation */ > static void > -tap_tx_offload(char *packet, uint64_t ol_flags, unsigned int l2_len, > - unsigned int l3_len) > +tap_tx_l4_cksum(uint16_t *l4_cksum, uint16_t l4_phdr_cksum, > + uint32_t l4_raw_cksum) > { > - void *l3_hdr =3D packet + l2_len; > + if (l4_cksum) { > + uint32_t cksum; > + > + cksum =3D __rte_raw_cksum_reduce(l4_raw_cksum); > + cksum +=3D l4_phdr_cksum; > + > + cksum =3D ((cksum & 0xffff0000) >> 16) + (cksum & 0xffff); > + cksum =3D (~cksum) & 0xffff; > + if (cksum =3D=3D 0) > + cksum =3D 0xffff; > + *l4_cksum =3D cksum; > + } > +} >=20 > +/* Accumaulate L4 raw checksums */ > +static void > +tap_tx_l4_add_rcksum(char *l4_data, unsigned int l4_len, uint16_t *l4_ck= sum, > + uint32_t *l4_raw_cksum) > +{ > + if (l4_cksum =3D=3D NULL) > + return; > + > + *l4_raw_cksum =3D __rte_raw_cksum(l4_data, l4_len, *l4_raw_cksum); > +} > + > +/* L3 and L4 pseudo headers checksum offloads */ > +static void > +tap_tx_l3_cksum(char *packet, uint64_t ol_flags, unsigned int l2_len, > + unsigned int l3_len, unsigned int l4_len, uint16_t **l4_cksum, > + uint16_t *l4_phdr_cksum, uint32_t *l4_raw_cksum) > +{ > + void *l3_hdr =3D packet + l2_len; Needs a blank line here. > if (ol_flags & (PKT_TX_IP_CKSUM | PKT_TX_IPV4)) { > struct ipv4_hdr *iph =3D l3_hdr; > uint16_t cksum; > @@ -430,38 +461,21 @@ tap_tx_offload(char *packet, uint64_t ol_flags, uns= igned int l2_len, > iph->hdr_checksum =3D (cksum =3D=3D 0xffff) ? cksum : ~cksum; > } > if (ol_flags & PKT_TX_L4_MASK) { > - uint16_t l4_len; > - uint32_t cksum; > - uint16_t *l4_cksum; > void *l4_hdr; >=20 > l4_hdr =3D packet + l2_len + l3_len; > if ((ol_flags & PKT_TX_L4_MASK) =3D=3D PKT_TX_UDP_CKSUM) > - l4_cksum =3D &((struct udp_hdr *)l4_hdr)->dgram_cksum; > + *l4_cksum =3D &((struct udp_hdr *)l4_hdr)->dgram_cksum; > else if ((ol_flags & PKT_TX_L4_MASK) =3D=3D PKT_TX_TCP_CKSUM) > - l4_cksum =3D &((struct tcp_hdr *)l4_hdr)->cksum; > + *l4_cksum =3D &((struct tcp_hdr *)l4_hdr)->cksum; > else > return; > - *l4_cksum =3D 0; > - if (ol_flags & PKT_TX_IPV4) { > - struct ipv4_hdr *iph =3D l3_hdr; > - > - l4_len =3D rte_be_to_cpu_16(iph->total_length) - l3_len; > - cksum =3D rte_ipv4_phdr_cksum(l3_hdr, 0); > - } else { > - struct ipv6_hdr *ip6h =3D l3_hdr; > - > - /* payload_len does not include ext headers */ > - l4_len =3D rte_be_to_cpu_16(ip6h->payload_len) - > - l3_len + sizeof(struct ipv6_hdr); > - cksum =3D rte_ipv6_phdr_cksum(l3_hdr, 0); > - } > - cksum +=3D rte_raw_cksum(l4_hdr, l4_len); > - cksum =3D ((cksum & 0xffff0000) >> 16) + (cksum & 0xffff); > - cksum =3D (~cksum) & 0xffff; > - if (cksum =3D=3D 0) > - cksum =3D 0xffff; > - *l4_cksum =3D cksum; > + **l4_cksum =3D 0; > + if (ol_flags & PKT_TX_IPV4) > + *l4_phdr_cksum =3D rte_ipv4_phdr_cksum(l3_hdr, 0); > + else > + *l4_phdr_cksum =3D rte_ipv6_phdr_cksum(l3_hdr, 0); > + *l4_raw_cksum =3D __rte_raw_cksum(l4_hdr, l4_len, 0); > } > } >=20 > @@ -482,17 +496,27 @@ pmd_tx_burst(void *queue, struct rte_mbuf **bufs, u= int16_t nb_pkts) > max_size =3D *txq->mtu + (ETHER_HDR_LEN + ETHER_CRC_LEN + 4); > for (i =3D 0; i < nb_pkts; i++) { > struct rte_mbuf *mbuf =3D bufs[num_tx]; > - struct iovec iovecs[mbuf->nb_segs + 1]; > + struct iovec iovecs[mbuf->nb_segs + 2]; > struct tun_pi pi =3D { .flags =3D 0, .proto =3D 0x00 }; > struct rte_mbuf *seg =3D mbuf; > char m_copy[mbuf->data_len]; > + int proto; > int n; > int j; > + int k; /* first index in iovecs for copying segments */ > + uint16_t l234_hlen; /* length of layers 2,3,4 headers */ > + uint16_t seg_len; /* length of first segment */ > + uint16_t nb_segs; > + uint16_t *l4_cksum; /* l4 checksum (pseudo header + payload) */ > + uint32_t l4_raw_cksum =3D 0; /* TCP/UDP payload raw checksum */ > + uint16_t l4_phdr_cksum =3D 0; /* TCP/UDP pseudo header checksum */ > + uint16_t is_cksum =3D 0; /* in case cksum should be offloaded */ >=20 > /* stats.errs will be incremented */ > if (rte_pktmbuf_pkt_len(mbuf) > max_size) > break; >=20 > + l4_cksum =3D NULL; > if (txq->type =3D=3D ETH_TUNTAP_TYPE_TUN) { > /* > * TUN and TAP are created with IFF_NO_PI disabled. > @@ -505,35 +529,73 @@ pmd_tx_burst(void *queue, struct rte_mbuf **bufs, u= int16_t nb_pkts) > * is 4 or 6, then protocol field is updated. > */ > char *buff_data =3D rte_pktmbuf_mtod(seg, void *); > - j =3D (*buff_data & 0xf0); > - pi.proto =3D (j =3D=3D 0x40) ? rte_cpu_to_be_16(ETHER_TYPE_IPv4) : > - (j =3D=3D 0x60) ? rte_cpu_to_be_16(ETHER_TYPE_IPv6) : 0x00; > + proto =3D (*buff_data & 0xf0); > + pi.proto =3D (proto =3D=3D 0x40) ? > + rte_cpu_to_be_16(ETHER_TYPE_IPv4) : > + ((proto =3D=3D 0x60) ? > + rte_cpu_to_be_16(ETHER_TYPE_IPv6) : > + 0x00); > } >=20 > - iovecs[0].iov_base =3D π > - iovecs[0].iov_len =3D sizeof(pi); > - for (j =3D 1; j <=3D mbuf->nb_segs; j++) { > - iovecs[j].iov_len =3D rte_pktmbuf_data_len(seg); > - iovecs[j].iov_base =3D > - rte_pktmbuf_mtod(seg, void *); > - seg =3D seg->next; > - } > + k =3D 0; > + iovecs[k].iov_base =3D π > + iovecs[k].iov_len =3D sizeof(pi); > + k++; Blank lines are always good. > + nb_segs =3D mbuf->nb_segs; > if (txq->csum && > ((mbuf->ol_flags & (PKT_TX_IP_CKSUM | PKT_TX_IPV4) || > (mbuf->ol_flags & PKT_TX_L4_MASK) =3D=3D PKT_TX_UDP_CKSUM || > (mbuf->ol_flags & PKT_TX_L4_MASK) =3D=3D PKT_TX_TCP_CKSUM))) { > - /* Support only packets with all data in the same seg */ > - if (mbuf->nb_segs > 1) > + is_cksum =3D 1; Blank line would be nice. > + /* Support only packets with at least layer 4 > + * header included in the first segment > + */ > + seg_len =3D rte_pktmbuf_data_len(mbuf); > + l234_hlen =3D mbuf->l2_len + mbuf->l3_len + mbuf->l4_len; > + if (seg_len < l234_hlen) > break; > - /* To change checksums, work on a copy of data. */ > + > + /* To change checksums, work on a > + * copy of l2, l3 l4 headers. > + */ > rte_memcpy(m_copy, rte_pktmbuf_mtod(mbuf, void *), > - rte_pktmbuf_data_len(mbuf)); > - tap_tx_offload(m_copy, mbuf->ol_flags, > - mbuf->l2_len, mbuf->l3_len); > - iovecs[1].iov_base =3D m_copy; > + l234_hlen); > + tap_tx_l3_cksum(m_copy, mbuf->ol_flags, > + mbuf->l2_len, mbuf->l3_len, mbuf->l4_len, > + &l4_cksum, &l4_phdr_cksum, > + &l4_raw_cksum); > + iovecs[k].iov_base =3D m_copy; > + iovecs[k].iov_len =3D l234_hlen; > + k++; Some blank lines would make this a lot more readable, like one here. > + /* Update next iovecs[] beyond l2, l3, l4 headers */ > + if (seg_len > l234_hlen) { > + iovecs[k].iov_len =3D seg_len - l234_hlen; > + iovecs[k].iov_base =3D > + rte_pktmbuf_mtod(seg, char *) + > + l234_hlen; > + tap_tx_l4_add_rcksum(iovecs[k].iov_base, > + iovecs[k].iov_len, l4_cksum, > + &l4_raw_cksum); > + k++; > + nb_segs++; > + } > + seg =3D seg->next; > } Another place a blank line would be nice. > + for (j =3D k; j <=3D nb_segs; j++) { > + iovecs[j].iov_len =3D rte_pktmbuf_data_len(seg); > + iovecs[j].iov_base =3D rte_pktmbuf_mtod(seg, void *); > + if (is_cksum) > + tap_tx_l4_add_rcksum(iovecs[j].iov_base, > + iovecs[j].iov_len, l4_cksum, > + &l4_raw_cksum); > + seg =3D seg->next; > + } > + > + if (is_cksum) > + tap_tx_l4_cksum(l4_cksum, l4_phdr_cksum, l4_raw_cksum); > + > /* copy the tx frame data */ > - n =3D writev(txq->fd, iovecs, mbuf->nb_segs + 1); > + n =3D writev(txq->fd, iovecs, j); > if (n <=3D 0) > break; >=20 > --=20 > 2.7.4 >=20 Regards, Keith