On Wed, Nov 22, 2023 at 4:05 PM Ferruh Yigit <ferruh.yigit@amd.com> wrote:
On 11/22/2023 6:01 AM, kumaraparameshwaran rathinavel wrote:
> Hi Folks,
>
> The current GRO code uses an unoptimised version of flow lookup where
> each flow in the table is iterated over during the flow matching
> process. For a rte_gro_reassemble_burst in lightweight mode this would
> not cause much of an impact. But with rte_gro_reassemble which is done
> with a timeout interval, this causes higher CPU utilisation during
> throughput tests. The proposal here is to use a Hash based flowtable
> which could make use of the rte_hash table implementation in DPDK.
> There could be a hash table for each of the GRO types. The lookup
> function and the key could be different for each one of the types. If
> there is a consensus that this could have a better performance impact I
> would work on an initial patch set. Please let me know your thoughts.
>


Hi Kumara,

Your proposal looks reasonable to me; I think it is worth trying.
Cc'ed techboard for more comments.
Thanks Ferruh. Sure, I will prepare an initial patch set for the TCP/IPv4 GRO type.

Do you have any performance measurements with the existing code? Having
them would help to evaluate the impact of the change.
I did some testing some time back. On a 10 Gbps link, the iperf throughput of the unoptimised and optimised versions
was almost the same, but the CPU saving with the optimised version was up to 30-35%. So tests running in
parallel, such as IMIX-like traffic, would have better results. I will try to profile the two cases and share results
that show the performance impact.
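
To make the difference between the two lookup strategies concrete, here is a minimal, self-contained sketch. It is not the GRO library code and it does not use rte_hash: the key layout, the table size, the FNV-1a hash, and every name below (flow_key, flow_table, flow_lookup_linear, flow_lookup_hash, flow_insert) are assumptions made for illustration only. A small open-addressing table stands in for the hash table so that the example builds without DPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical TCP/IPv4 flow key; a real GRO key would carry more fields. */
struct flow_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
};

#define FLOW_TABLE_SIZE 1024u   /* assumed size, must be a power of two */
#define FLOW_INVALID    (-1)

struct flow_entry {
    struct flow_key key;
    int in_use;
};

struct flow_table {
    struct flow_entry entries[FLOW_TABLE_SIZE];
    uint32_t count;
};

static int flow_key_equal(const struct flow_key *a, const struct flow_key *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/* Iterate over every slot: the cost grows with the table size. */
static int flow_lookup_linear(const struct flow_table *t,
                              const struct flow_key *k)
{
    for (uint32_t i = 0; i < FLOW_TABLE_SIZE; i++)
        if (t->entries[i].in_use && flow_key_equal(&t->entries[i].key, k))
            return (int)i;
    return FLOW_INVALID;
}

/* FNV-1a over the key fields (a stand-in for whatever hash is chosen). */
static uint32_t flow_hash(const struct flow_key *k)
{
    const uint32_t words[3] = {
        k->src_ip, k->dst_ip,
        ((uint32_t)k->src_port << 16) | k->dst_port
    };
    uint32_t h = 2166136261u;

    for (int i = 0; i < 3; i++)
        for (int b = 0; b < 4; b++) {
            h ^= (words[i] >> (8 * b)) & 0xffu;
            h *= 16777619u;
        }
    return h;
}

/* Hashed lookup with linear probing. Entries are never deleted in this
 * sketch, so probing can stop at the first empty slot. */
static int flow_lookup_hash(const struct flow_table *t,
                            const struct flow_key *k)
{
    uint32_t idx = flow_hash(k) & (FLOW_TABLE_SIZE - 1);

    for (uint32_t n = 0; n < FLOW_TABLE_SIZE; n++) {
        const struct flow_entry *e = &t->entries[idx];

        if (!e->in_use)
            return FLOW_INVALID;
        if (flow_key_equal(&e->key, k))
            return (int)idx;
        idx = (idx + 1) & (FLOW_TABLE_SIZE - 1);
    }
    return FLOW_INVALID;
}

/* Insert a flow, or return the slot of an already present identical flow. */
static int flow_insert(struct flow_table *t, const struct flow_key *k)
{
    assert(t != NULL && k != NULL);

    uint32_t idx = flow_hash(k) & (FLOW_TABLE_SIZE - 1);

    for (uint32_t n = 0; n < FLOW_TABLE_SIZE; n++) {
        struct flow_entry *e = &t->entries[idx];

        if (!e->in_use) {
            e->key = *k;
            e->in_use = 1;
            t->count++;
            return (int)idx;
        }
        if (flow_key_equal(&e->key, k))
            return (int)idx;
        idx = (idx + 1) & (FLOW_TABLE_SIZE - 1);
    }
    return FLOW_INVALID;   /* table full */
}
```

Both lookups return the same slot for a flow that is present. The hashed version normally inspects only a few slots, while the linear version may walk the whole table, which is where the CPU saving would be expected to come from when many flows are held across a timeout interval. An actual patch would presumably replace this toy table with rte_hash and use one key layout per GRO type, as described in the proposal.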