Hi Jeff,

On 05/09/2014 06:11 PM, Shaw, Jeffrey B wrote:
> I agree, we should wait for comments then test the performance when the patches have settled.

Here are some performance numbers I've measured with the TSO patches.

The test platform is:

  +-----------+           +-----------+
  |           |           |           |
  |  traffic  |-----------|   dpdk    |
  | generator |-----------|  testpmd  |
  |           |-----------|           |
  |           |-----------|           |
  |           |           |           |
  +-----------+           +-----------+

- 4 ixgbe ports
- Sandy Bridge at 2.7 GHz

I've only included numbers for pkt_size=64. Other packet sizes do not
bring more information in this case.

I have 4 test cases:
- testpmd in iofwd mode with normal tx/rx function
- testpmd in iofwd mode with simple tx/rx function (txqflags=0xf01)
- testpmd in macfwd mode with normal tx/rx function
- testpmd in macfwd mode with simple tx/rx function (txqflags=0xf01)

I tested each case with 1c1t, 1c2t, 2c2t, 2c4t and 4c8t on the following
versions:
- dpdk.org head
- dpdk.org + tso patches up to 6/11 (inclusive): this includes all mbuf
  reworks (data_offset instead of data, remove ctrl mbuf, use 48 bits
  physical address)
- dpdk.org + all tso series

The conclusion of the tests is:

Patches up to 6/11 do not bring any performance regression.

On the other hand, the full TSO patch series introduces a small
performance regression (usually corresponding to ~5 cycles per packet).
This is probably due to the additional TSO-related tests done in the
driver. I suppose that this performance loss is acceptable if we consider
that TSO will bring a huge performance enhancement for many real use
cases.

By the way, I found lower numbers in macfwd mode + simple rx/tx with the
current version (without my patches) with 1c2t. It seems reproducible.

I'll soon provide a v2 that will include:
- the split of patch 6/11 (cosmetics vs functional)
- the split of patch 11/11 (ixgbe vs generic changes)
- new checksum flags PKT_RX_L4_CKSUM_GOOD and PKT_RX_IP_CKSUM_GOOD
  proposed by Stephen
- modifications of external PMDs (memnic, virtio, vmxnet3)

Regards,
Olivier
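
For readers who want to relate the "~5 cycles per packet" figure to a forwarding rate, here is a rough sketch of the arithmetic. It only uses the 2.7 GHz clock and the ~5 cycle estimate quoted above; the baseline per-core rates are hypothetical placeholders chosen for illustration, not numbers measured in these tests, and the helper names are made up for this example.

```python
# Sketch: translate a fixed per-packet cycle overhead into a throughput change.
# The 2.7 GHz clock and the ~5 cycle overhead come from the mail above; the
# baseline rates below are hypothetical, NOT measured results.

CPU_HZ = 2.7e9        # Sandy Bridge at 2.7 GHz
EXTRA_CYCLES = 5.0    # approximate per-packet cost of the full TSO series


def cycles_per_packet(rate_pps, cpu_hz=CPU_HZ):
    """Cycles one core spends on each packet at a given forwarding rate."""
    return cpu_hz / rate_pps


def rate_with_overhead(rate_pps, extra_cycles=EXTRA_CYCLES, cpu_hz=CPU_HZ):
    """Forwarding rate once extra_cycles are added to each packet's cost."""
    return cpu_hz / (cycles_per_packet(rate_pps, cpu_hz) + extra_cycles)


if __name__ == "__main__":
    for baseline_mpps in (5.0, 10.0, 15.0):  # hypothetical per-core rates
        base = baseline_mpps * 1e6
        new = rate_with_overhead(base)
        loss_pct = 100.0 * (base - new) / base
        print(f"{baseline_mpps:5.1f} Mpps -> {new / 1e6:6.2f} Mpps "
              f"({loss_pct:.1f}% lower)")
```

The same fixed overhead costs proportionally more at higher per-core rates, because fewer cycles are available per packet to begin with.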