Hi! Thanks for submitting this. Some inline comments follow.

> -----Original Message-----
> From: 苏赛
> Sent: Thursday 31 July 2025 10:55
> To: jasvinder.singh@intel.com
> Cc: dev@dpdk.org
> Subject: [PATCH] net/cksum: compute raw cksum for several segments
>
> The rte_raw_cksum_mbuf() function computes the raw checksum of a
> packet. If the packet payload is stored in multiple mbufs, the
> function falls through to the hard case. There, the variable 'tmp'
> has type uint32_t, so rte_bswap16() drops its high 16 bits.
> Meanwhile, the variable 'sum' also has type uint32_t, so 'sum += tmp'
> drops the carry on overflow. Both truncations make the checksum
> incorrect. This commit fixes the above bugs.
>
> Signed-off-by: Su Sai
>
> diff --git a/lib/net/rte_cksum.h b/lib/net/rte_cksum.h
> index a8e8927952..aa584d5f8d 100644
> --- a/lib/net/rte_cksum.h
> +++ b/lib/net/rte_cksum.h
> @@ -80,6 +80,25 @@ __rte_raw_cksum_reduce(uint32_t sum)
>  	return (uint16_t)sum;
>  }
>
> +/**
> + * @internal Reduce a sum to the non-complemented checksum.
> + * Helper routine for the rte_raw_cksum_mbuf().
> + *
> + * @param sum
> + *   Value of the sum.
> + * @return
> + *   The non-complemented checksum.
> + */
> +static inline uint16_t
> +__rte_raw_cksum_reduce_u64(uint64_t sum)
> +{
> +	uint32_t tmp;
> +
> +	tmp = __rte_raw_cksum_reduce((uint32_t)sum);
> +	tmp += __rte_raw_cksum_reduce((uint32_t)(sum >> 32));

What if this addition overflows? To my taste I would not call
`__rte_raw_cksum_reduce()` here and would instead reduce the uint64_t
directly to uint16_t, but it's up to you.

> +	return __rte_raw_cksum_reduce(tmp);
> +}
> +
>  /**
>   * Process the non-complemented checksum of a buffer.
>   *
> @@ -119,8 +138,9 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len,
>  {
>  	const struct rte_mbuf *seg;
>  	const char *buf;
> -	uint32_t sum, tmp;
> +	uint32_t tmp;
>  	uint32_t seglen, done;
> +	uint64_t sum;
>
>  	/* easy case: all data in the first segment */
>  	if (off + len <= rte_pktmbuf_data_len(m)) {
> @@ -157,7 +177,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len,
>  	for (;;) {
>  		tmp = __rte_raw_cksum(buf, seglen, 0);
>  		if (done & 1)
> -			tmp = rte_bswap16((uint16_t)tmp);
> +			tmp = rte_bswap32(tmp);

This part probably deserves a comment: we only need to swap odd and even
bytes, but we instead reverse all four of them, exploiting the fact that
the order of 2-byte pairs does not matter for this algorithm.

>  		sum += tmp;
>  		done += seglen;
>  		if (done == len)
> @@ -169,7 +189,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len,
>  		seglen = len - done;
>  	}
>
> -	*cksum = __rte_raw_cksum_reduce(sum);
> +	*cksum = __rte_raw_cksum_reduce_u64(sum);
>  	return 0;
>  }

Changes to this function look correct to my eye, but given how many
pitfalls we have already found, I think we need tests.
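For illustration, here is one way the direct 64-to-16-bit reduction I
suggested above could look. This is a standalone sketch, not the DPDK
header's code; the name `cksum_reduce_u64` is a local stand-in for
whatever the final helper ends up being called:

```c
#include <stdint.h>

/* Hypothetical direct 64-bit -> 16-bit one's-complement reduction
 * (a sketch; the name is mine, not DPDK's). Each fold adds the upper
 * half into the lower half; the second fold of each pair absorbs the
 * carry the first one may produce, so no addition here can overflow. */
static inline uint16_t
cksum_reduce_u64(uint64_t sum)
{
	sum = (sum & 0xffffffffu) + (sum >> 32);
	sum = (sum & 0xffffffffu) + (sum >> 32);
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}
```

After the first 32-bit fold the value fits in 33 bits, and the second
fold brings it below 2^32; the two 16-bit folds then do the same one
level down, so the result is always a valid 16-bit one's-complement sum.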
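To make the byte-swap argument concrete, a small self-contained
demonstration. `raw_sum`, `reduce16` and `bswap32` below are local
stand-ins for `__rte_raw_cksum`, `__rte_raw_cksum_reduce` and
`rte_bswap32`; `raw_sum` uses a fixed big-endian word view for
simplicity, while the real code reads host-endian uint16_t, but the
one's-complement property being illustrated is byte-order independent:

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement 16-bit word sum over a byte buffer (stand-in for
 * __rte_raw_cksum); an odd trailing byte is padded with a zero low byte. */
static uint32_t
raw_sum(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)((p[i] << 8) | p[i + 1]);
	if (len & 1)
		sum += (uint32_t)p[len - 1] << 8;
	return sum;
}

/* Fold a 32-bit sum to 16 bits (stand-in for __rte_raw_cksum_reduce). */
static uint16_t
reduce16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Full byte reversal, like rte_bswap32(). */
static uint32_t
bswap32(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0x0000ff00) |
	       ((x << 8) & 0x00ff0000) | (x << 24);
}

/* Example: for data[7] = {0x12, 0x34, 0x56, 0x78, 0x9a, 0xbc, 0xde},
 *   reduce16(raw_sum(data, 3) + bswap32(raw_sum(data + 3, 4)))
 * equals reduce16(raw_sum(data, 7)): the second segment starts at the
 * odd offset 3, so byte-reversing its whole 32-bit sum puts every byte
 * back in the right high/low position, and the resulting reordering of
 * 2-byte pairs is harmless because one's-complement addition commutes. */
```

Splitting the buffer at an odd offset and byte-reversing the second
segment's 32-bit sum yields the same reduced checksum as summing the
whole buffer, which is exactly the invariant the patched loop relies on.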