DPDK usage discussions
* [dpdk-users] How to use software prefetching for custom structures to increase throughput on the fast path
@ 2018-09-11  8:15 Arvind Narayanan
From: Arvind Narayanan @ 2018-09-11  8:15 UTC
  To: users

Hi,

I am writing a DPDK application and am having difficulty achieving line
rate on a 10G NIC. I suspect this is related to CPU caches and associated
optimizations, and would be grateful if someone could point me in the
right direction.

I wrap every rte_mbuf in my own structure, my_packet. Here is its
declaration:

```
struct my_packet {
    struct rte_mbuf *m;  /* the wrapped mbuf */
    uint16_t tag1;       /* key used for the hash lookup in the datapath */
    uint16_t tag2;
};
```

During initialization, I reserve a mempool of struct my_packet objects with
8192 elements. I get my_packet objects from the pool in bursts, and likewise
put them back into the pool in bursts when freeing.

A loop in the datapath then reads each my_packet's tag to make a decision:

```
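for (i = 0; i < pkt_count; i++) {
    /* Look up each packet's tag1; val[i] receives the stored data on a hit. */
    if (rte_hash_lookup_data(rx_table, &(my_packet[i]->tag1),
                             (void **)&val[i]) < 0) {
        /* lookup miss: nothing is done here */
    }
}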
```

Based on my tests, reading &(my_packet[i]->tag1) is what prevents me from
achieving line rate in the fast path: if I hardcode tag1's value, I achieve
line rate. As a workaround, I tried rte_prefetch0() and
rte_prefetch_non_temporal() to prefetch 2 to 8 my_packet structures from the
my_packet[] array, but nothing seems to improve the throughput.
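
To make the prefetch attempt concrete, here is a minimal sketch of a prefetch-ahead loop. It is not the application's actual code. It assumes my_packet[] is an array of pointers (as the `->` in the loop above suggests), so the prefetch targets the pointed-to structure rather than the pointer slot. `prefetch0()`, `lookup_tag()`, `process_burst()` and `PREFETCH_AHEAD` are hypothetical names: `prefetch0()` stands in for rte_prefetch0() using the GCC/Clang builtin so the sketch builds without DPDK, and `lookup_tag()` stands in for the rte_hash_lookup_data() call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PREFETCH_AHEAD 4    /* hypothetical distance; tune by measurement */

struct rte_mbuf;            /* opaque here; the real type comes from rte_mbuf.h */

struct my_packet {
    struct rte_mbuf *m;
    uint16_t tag1;
    uint16_t tag2;
};

/* Stand-in for rte_prefetch0(): GCC/Clang builtin, read access, high locality. */
static inline void prefetch0(const void *p)
{
    __builtin_prefetch(p, 0, 3);
}

/* Hypothetical stand-in for the hash lookup; always hits and returns 2 * tag. */
static int lookup_tag(uint16_t tag, uint32_t *val)
{
    *val = (uint32_t)tag * 2u;
    return 0;
}

/* Processes one burst and returns the number of lookup hits. */
static unsigned process_burst(struct my_packet **pkts, unsigned n, uint32_t *val)
{
    unsigned i, hits = 0;

    assert(pkts != NULL && val != NULL);

    /* Warm-up: request the first few structures before the main loop. */
    for (i = 0; i < PREFETCH_AHEAD && i < n; i++)
        prefetch0(pkts[i]);

    for (i = 0; i < n; i++) {
        /* Request the structure a later iteration will read, so the load
         * of tag1 below is more likely to hit in cache by then. */
        if (i + PREFETCH_AHEAD < n)
            prefetch0(pkts[i + PREFETCH_AHEAD]);

        if (lookup_tag(pkts[i]->tag1, &val[i]) >= 0)
            hits++;
    }
    return hits;
}
```

In a real datapath, `lookup_tag()` would be replaced by the rte_hash_lookup_data() call and `prefetch0()` by rte_prefetch0(). Whether any prefetch distance helps depends on how much work each iteration does, so the value 4 is only a starting point.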

I also experimented with the flags passed to rte_mempool_create():
-- MEMPOOL_F_NO_SPREAD gives 8.4 Gbps out of 10 Gbps
-- MEMPOOL_F_NO_CACHE_ALIGN initially gives ~9.4 Gbps, but then gradually
settles to ~8.5 Gbps after 20 or 30 seconds
-- no flags gives 7.7 Gbps
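
These flags change how objects are laid out in the pool, which may account for part of the spread in the numbers above. Below is a rough sketch of the arithmetic, assuming 64-byte cache lines and ignoring the per-object header and trailer that the mempool library adds. `CACHE_LINE`, `padded_size()` and `objs_per_line()` are hypothetical helpers, not DPDK functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64u      /* assumption: 64-byte lines, as on typical x86 */

struct rte_mbuf;            /* opaque here; the real type comes from rte_mbuf.h */

struct my_packet {
    struct rte_mbuf *m;
    uint16_t tag1;
    uint16_t tag2;
};

/* Size of one object once rounded up to a whole number of cache lines,
 * roughly what cache-aligned pool objects occupy. */
static size_t padded_size(size_t obj_size)
{
    assert(obj_size > 0);
    return (obj_size + CACHE_LINE - 1) / CACHE_LINE * CACHE_LINE;
}

/* Number of unpadded objects that fit in a single cache line. */
static size_t objs_per_line(size_t obj_size)
{
    assert(obj_size > 0 && obj_size <= CACHE_LINE);
    return CACHE_LINE / obj_size;
}
```

On a typical 64-bit build, struct my_packet is 16 bytes (8-byte pointer, two 2-byte tags, 4 bytes of tail padding). Under these assumptions, a cache-aligned pool spends a full 64-byte line per object, while an unaligned layout could pack up to four objects per line, so a burst would touch fewer lines. This is only one possible factor and would need to be confirmed by measurement.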

I am running DPDK 18.05 on Ubuntu 16.04.3 LTS.

Any help or pointers are highly appreciated.

Thanks,
Arvind
