Hi Thomas Monjalon,

First, here is the scenario in which we discovered the problem.

The error can be reproduced as follows:
1. On the client ECS, which has an MTU of 1500, start traffic with the command "iperf3 -c {dst ip} -b 1m -M 125 -t 8000". The small MSS set by -M 125 triggers TCP segmentation.
2. On the host machine, TCP segmentation is performed by the 'rte_gso_segment' function.
3. After GSO, a packet that was held in one mbuf is split into multiple segments.
4. When the TCP checksum is calculated with the 'rte_raw_cksum_mbuf' function, execution takes the 'hard case: process checksum of several segments' branch of the function. At this point, a calculation error may occur.
5. In the destination ECS, the InCsumErrors statistic can be viewed using the command "netstat -st | grep -i csum". The erroneous packets can also be confirmed via the tcpdump command.

The following is a detailed description of a captured erroneous packet.

The hex stream of the packet is as follows:
00163e0b6bd2eeffffffffff0800450000a50d7a40004006b94bc0a8f91dc0a8f91ed5d2145146f9d990e10d6a2d8010020040a200000101080a95ac86ba091145d3ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff

This is a packet in the format of Eth + IPv4 + TCP + Payload.

Taking the above-mentioned packet as an example, the calculation process of 'rte_raw_cksum_mbuf' is as follows:
1. Due to the small MSS, TCP segmentation was triggered, producing a chain of 3 mbufs.
2. The data_len of the first mbuf is 66 bytes, containing the Ethernet header, IPv4 header, and TCP header.
3. The data_len of the second mbuf is 61 bytes.
4. The data_len of the third mbuf is 52 bytes.
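5. When the TCP checksum for this mbuf chain is calculated with the rte_raw_cksum_mbuf function, the 'tmp' value obtained for the third mbuf is 0x19FFE6. The third mbuf holds 52 bytes of 0xFF, i.e. 26 words of 0xFFFF, and 26 * 0xFFFF = 0x19FFE6.
6. The third mbuf starts at an odd offset in the checksummed data (32 bytes of TCP header + 61 bytes = 93), so the byte swap is applied. The (uint16_t) cast keeps only 0xFFE6, rte_bswap16 turns it into 0xE6FF, and the upper bits 0x19 are discarded. This results in an incorrect final checksum.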
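
The arithmetic in steps 5 and 6 can be checked with a small standalone sketch. It does not use DPDK: raw_sum32, fold16 and swap16 are hypothetical stand-ins for the unreduced sum, the 16-bit reduction and the byte swap. Folding before the swap is only one possible way to keep the carry and is not necessarily what the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the unreduced 32-bit sum of 16-bit words.
 * Words are read little-endian; for all-0xFF data the order is irrelevant. */
static uint32_t raw_sum32(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;

    assert(len % 2 == 0); /* this sketch only handles even lengths */
    for (size_t i = 0; i < len; i += 2)
        sum += (uint32_t)buf[i] | ((uint32_t)buf[i + 1] << 8);
    return sum;
}

/* Swap the two bytes of a 16-bit value. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Reduce a 32-bit sum to 16 bits, adding the carries back in. */
static uint16_t fold16(uint32_t sum)
{
    sum = (sum & 0xFFFF) + (sum >> 16);
    sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)sum;
}

/* Third segment of the captured packet: 52 bytes of 0xFF. */
static uint32_t third_segment_sum(void)
{
    uint8_t seg[52];

    memset(seg, 0xFF, sizeof(seg));
    return raw_sum32(seg, sizeof(seg));
}

/* Cast first, then swap: the bits above bit 15 are thrown away. */
static uint16_t swap_truncating(uint32_t tmp)
{
    return swap16((uint16_t)tmp);
}

/* Fold first, then swap: the carry is kept. */
static uint16_t swap_after_fold(uint32_t tmp)
{
    return swap16(fold16(tmp));
}
```

With tmp = 0x19FFE6, swap_truncating() yields 0xE6FF while swap_after_fold() yields 0xFFFF, so the two contributions to 'sum' differ.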

Second, not all multiseg packets cause calculation errors in the 'rte_raw_cksum_mbuf' function. There are two cases that can lead to an incorrect final result.
1. If the value of 'tmp' is greater than 0xFFFF, 'tmp = rte_bswap16((uint16_t)tmp)' drops the high 16 bits.
2. Both 'tmp' and 'sum' are uint32_t. If the true total exceeds 0xFFFFFFFF, 'sum += tmp' wraps around and the carry is lost.
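
The second case can be illustrated in isolation with made-up values. This is a minimal sketch, not the DPDK code: add_wrapping, add_end_around, fold16 and fold64 are hypothetical helpers, and the operands 0xFFFF0000 and 0x00020000 were chosen only to force an overflow.

```c
#include <stdint.h>

/* Reduce a 32-bit sum to 16 bits, adding the carries back in. */
static uint16_t fold16(uint32_t sum)
{
    sum = (sum & 0xFFFF) + (sum >> 16);
    sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)sum;
}

/* Plain uint32_t addition: a carry out of bit 31 is silently lost. */
static uint32_t add_wrapping(uint32_t sum, uint32_t tmp)
{
    return sum + tmp;
}

/* Addition with end-around carry: a lost carry is added back as +1. */
static uint32_t add_end_around(uint32_t sum, uint32_t tmp)
{
    sum += tmp;
    if (sum < tmp) /* wrapped around */
        sum += 1;
    return sum;
}

/* Reference: accumulate in 64 bits, then fold down to 16 bits. */
static uint16_t fold64(uint64_t sum)
{
    sum = (sum & 0xFFFFFFFFu) + (sum >> 32);
    sum = (sum & 0xFFFFFFFFu) + (sum >> 32);
    return fold16((uint32_t)sum);
}
```

For these operands the wrapping addition folds to 0x0001, while both the end-around variant and the 64-bit reference fold to 0x0002.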

Third, in our online cloud network, we have only seen the problem with 3 or more segments. I believe the issue can be triggered whenever there are 3 or more segments, and a test case with 3 segments is sufficient to detect it.
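
As a rough illustration of such a 3-segment check, the sketch below models the multi-segment loop on a flat buffer instead of a real mbuf chain. It is an assumption-laden model, not the unit test from the patch: chained_sum, raw_sum32 and the other helpers are hypothetical, the data is the TCP header and payload of the captured packet above (payload length 113 derived from the IP total length), and the 32/61/52 split is the one seen by the hard case.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Unreduced sum of little-endian 16-bit words; a trailing odd byte
 * is counted as the low byte of a final word. */
static uint32_t raw_sum32(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)buf[i] | ((uint32_t)buf[i + 1] << 8);
    if (i < len)
        sum += buf[i];
    return sum;
}

static uint16_t fold16(uint32_t sum)
{
    sum = (sum & 0xFFFF) + (sum >> 16);
    sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)sum;
}

static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Model of the per-segment loop. fold_first == 0 mimics the truncating
 * cast before the swap; fold_first != 0 folds tmp to 16 bits first. */
static uint16_t chained_sum(const uint8_t *data, const size_t *seg_len,
                            size_t nseg, int fold_first)
{
    uint32_t sum = 0;
    size_t done = 0;

    for (size_t s = 0; s < nseg; s++) {
        uint32_t tmp = raw_sum32(data + done, seg_len[s]);

        if (done & 1) /* segment starts at an odd offset */
            tmp = swap16(fold_first ? fold16(tmp) : (uint16_t)tmp);
        sum += tmp;
        done += seg_len[s];
    }
    return fold16(sum);
}

/* TCP header (32 bytes) and payload (113 bytes of 0xFF) of the capture. */
static size_t build_tcp_region(uint8_t out[145])
{
    static const uint8_t hdr[32] = {
        0xd5, 0xd2, 0x14, 0x51, 0x46, 0xf9, 0xd9, 0x90,
        0xe1, 0x0d, 0x6a, 0x2d, 0x80, 0x10, 0x02, 0x00,
        0x40, 0xa2, 0x00, 0x00, 0x01, 0x01, 0x08, 0x0a,
        0x95, 0xac, 0x86, 0xba, 0x09, 0x11, 0x45, 0xd3,
    };

    memcpy(out, hdr, sizeof(hdr));
    memset(out + sizeof(hdr), 0xFF, 113);
    return 145;
}

/* Returns 1 if the segmented model agrees with the single-pass sum. */
static int model_matches_linear(int fold_first)
{
    uint8_t tcp[145];
    const size_t segs[3] = { 32, 61, 52 };
    size_t len = build_tcp_region(tcp);

    return chained_sum(tcp, segs, 3, fold_first) ==
           fold16(raw_sum32(tcp, len));
}
```

In this model the truncating variant disagrees with the single-pass sum over the same 145 bytes, and the fold-first variant agrees with it.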

On Mon, Aug 11, 2025 at 10:42 PM Thomas Monjalon <thomas@monjalon.net> wrote:
Hello,

04/08/2025 05:54, Su Sai:
> The rte_raw_cksum_mbuf function is used to compute
> the raw checksum of a packet.
> If the packet payload stored in multi mbuf, the function
> will goto the hard case. In hard case,
> the variable 'tmp' is a type of uint32_t,
> so rte_bswap16 will drop high 16 bit.
> Meanwhile, the variable 'sum' is a type of uint32_t,
> so 'sum += tmp' will drop the carry when overflow.
> Both drop will make cksum incorrect.
> This commit fixes the above bug.

Thank you for the fix and the associated test.

Please could describe the exact condition to get a wrong checksum?
Does it happen with all multiseg packets?
3 segments is a minimum? Any other constraint to reproduce?