DPDK usage discussions
* Re: [dpdk-users] users Digest, Vol 155, Issue 7
       [not found] <mailman.1.1539338401.21145.users@dpdk.org>
@ 2018-10-12 12:40 ` waqas ahmed
       [not found]   ` <CAAQUUHXDjV5ZEB6PvoVMCS=KNEDAMMAEE3gdVTNNyAA2kzXCfA@mail.gmail.com>
  0 siblings, 1 reply; 4+ messages in thread
From: waqas ahmed @ 2018-10-12 12:40 UTC (permalink / raw)
  To: users, wajeeha.javed123

I think you are increasing the descriptors to use the descriptor ring as a
FIFO for the 2-second delay; that isn't necessary. What you need is a large
amount of main memory, from which you allocate an appropriate number of huge
pages to the DPDK app. You can do the calculation with the size of an mbuf
and provision about 28 million mbufs for the 2 seconds.
Once the 512 descriptors are exhausted, old descriptors are refilled with new
mbuf pointers every time a packet is received, with each mbuf allocated from
the pool if one is available.
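
Roughly, the sizing math looks like this (an untested sketch, not from the
original mail; the pool name, cache size and the 28M per-port figure are
illustrative):

<Code>

/* Untested sketch: rough hugepage cost of ~28M mbufs (about 2 s of 64-byte
 * packets at 10 Gbit/s on one port) and how such a pool could be created. */
#include <stdio.h>
#include <rte_mbuf.h>
#include <rte_errno.h>

#define DELAY_MBUFS (28u * 1000u * 1000u)

static struct rte_mempool *
create_delay_pool(int socket_id)
{
	struct rte_mempool *mp;

	/* Each mbuf costs roughly sizeof(struct rte_mbuf) (128 B) +
	 * RTE_MBUF_DEFAULT_BUF_SIZE (2176 B) + mempool object overhead,
	 * i.e. ~2.4 KB, so 28M mbufs need on the order of 65 GB of
	 * hugepages on this socket. */
	mp = rte_pktmbuf_pool_create("delay_pool", DELAY_MBUFS,
				     512 /* per-lcore cache */, 0,
				     RTE_MBUF_DEFAULT_BUF_SIZE, socket_id);
	if (mp == NULL)
		printf("mbuf pool creation failed: %s\n",
		       rte_strerror(rte_errno));
	return mp;
}

</Code>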

On Fri, Oct 12, 2018 at 3:00 PM <users-request@dpdk.org> wrote:

> Send users mailing list submissions to
>         users@dpdk.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         https://mails.dpdk.org/listinfo/users
> or, via email, send a message with subject or body 'help' to
>         users-request@dpdk.org
>
> You can reach the person managing the list at
>         users-owner@dpdk.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of users digest..."
>
>
> Today's Topics:
>
>    1. Re: Problems compiling DPDK for MLX4 (Anthony Hart)
>    2. Increasing the NB_MBUFs of PktMbuf MemPool (Wajeeha Javed)
>    3. Re: Increasing the NB_MBUFs of PktMbuf MemPool (Andrew Rybchenko)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 11 Oct 2018 16:21:48 -0400
> From: Anthony Hart <ahart@domainhart.com>
> To: Cliff Burdick <shaklee3@gmail.com>
> Cc: users <users@dpdk.org>
> Subject: Re: [dpdk-users] Problems compiling DPDK for MLX4
> Message-ID: <B10CCD14-3255-499B-99B0-4D92E6EF8CE4@domainhart.com>
> Content-Type: text/plain;       charset=utf-8
>
> Thanks I will try that
>
> > On Oct 11, 2018, at 3:11 PM, Cliff Burdick <shaklee3@gmail.com> wrote:
> >
> > The easy workaround is to install the mellanox OFED package with the
> flags --dpdk --upstream-libs.
> >
> > On Thu, Oct 11, 2018 at 8:57 AM Anthony Hart <ahart@domainhart.com
> <mailto:ahart@domainhart.com>> wrote:
> >
> > Having problems compiling DPDK for the Mellanox PMD.
> >
> > For dpdk-18.08 I get...
> >
> >   CC efx_phy.o
> > In file included from
> /root/th/dpdk-18.08/drivers/net/mlx4/mlx4_txq.c:35:0:
> > /root/th/dpdk-18.08/drivers/net/mlx4/mlx4_glue.h:16:31: fatal error:
> infiniband/mlx4dv.h: No such file or directory
> >  #include <infiniband/mlx4dv.h>
> >                                ^
> > compilation terminated.
> > In file included from
> /root/th/dpdk-18.08/drivers/net/mlx4/mlx4_intr.c:32:0:
> > /root/th/dpdk-18.08/drivers/net/mlx4/mlx4_glue.h:16:31: fatal error:
> infiniband/mlx4dv.h: No such file or directory
> >  #include <infiniband/mlx4dv.h>
> >
> >
> > For dpdk-17.11.3
> >
> >   CC mlx5.o
> > /root/th/dpdk-stable-17.11.3/drivers/net/mlx5/mlx5.c: In function
> ‘mlx5_pci_probe’:
> > /root/th/dpdk-stable-17.11.3/drivers/net/mlx5/mlx5.c:921:21: error:
> ‘struct ibv_device_attr_ex’ has no member named ‘device_cap_flags_ex’
> >     !!(device_attr_ex.device_cap_flags_ex &
> >                      ^
> > /root/th/dpdk-stable-17.11.3/drivers/net/mlx5/mlx5.c:922:7: error:
> ‘IBV_DEVICE_RAW_IP_CSUM’ undeclared (first use in this function)
> >        IBV_DEVICE_RAW_IP_CSUM);
> >        ^
> > /root/th/dpdk-stable-17.11.3/drivers/net/mlx5/mlx5.c:922:7: note: each
> undeclared identifier is reported only once for each function it appears in
> > /root/th/dpdk-stable-17.11.3/drivers/net/mlx5/mlx5.c:942:18: error:
> ‘struct ibv_device_attr_ex’ has no member named ‘rss_caps’
> >     device_attr_ex.rss_caps.max_rwq_indirection_table_size;
> >                   ^
> >
> >
> > This is on CentOS 7.5 (3.10.0-862.14.4.el7.x86_64)
> >
> > With mlnx-en-4.4-1.0.1.0-rhel7.5-x86_64.iso installed
> >
> >
> > Any thoughts?
> >
> > thanks
> > ?
> > tony
> >
> >
>
>
>
> ------------------------------
>
> Message: 2
> Date: Fri, 12 Oct 2018 09:48:06 +0500
> From: Wajeeha Javed <wajeeha.javed123@gmail.com>
> To: users@dpdk.org
> Subject: [dpdk-users] Increasing the NB_MBUFs of PktMbuf MemPool
> Message-ID:
>         <
> CAAQUUHUbowa5EnTiOhsaimAXNJXxjxKgPY1GAsqR+EUtoGL_2w@mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> Hi,
>
> I am in the process of developing a DPDK-based application in which I would
> like to delay packets for about 2 secs. There are two ports connected to the
> DPDK app, each receiving 64-byte packets at a line rate of 10 Gbit/s. Within
> 2 secs I will have about 28 million packets for each port in the delay
> application. The maximum number of Rx descriptors is 16384, and I am unable
> to increase it beyond that value. Is it possible to raise the number of Rx
> descriptors to a larger value, e.g. 65536? Since I cannot, I copy the mbufs
> using the pktmbuf copy code shown below and free the received packet. Now
> the issue is that I cannot copy more than 5 million packets, because the
> nb_mbufs of the mempool can't be more than 5 million (#define NB_MBUF
> 5000000). If I increase the NB_MBUF macro beyond 5 million, an error is
> returned: unable to init mbuf pool. Is there a possible way to increase the
> mempool size?
>
> Furthermore, kindly guide me on whether this is the appropriate mailing list
> for asking this type of question.
>
> <Code>
>
> static inline struct rte_mbuf *
> pktmbuf_copy(struct rte_mbuf *md, struct rte_mempool *mp)
> {
> 	struct rte_mbuf *mc = NULL;
> 	struct rte_mbuf **prev = &mc;
>
> 	do {
> 		struct rte_mbuf *mi;
>
> 		mi = rte_pktmbuf_alloc(mp);
> 		if (unlikely(mi == NULL)) {
> 			rte_pktmbuf_free(mc);
> 			rte_exit(EXIT_FAILURE,
> 				 "Unable to Allocate Memory. Memory Failure.\n");
> 			return NULL;	/* not reached: rte_exit() terminates */
> 		}
>
> 		/* copy the per-segment metadata */
> 		mi->data_off = md->data_off;
> 		mi->data_len = md->data_len;
> 		mi->port = md->port;
> 		mi->vlan_tci = md->vlan_tci;
> 		mi->tx_offload = md->tx_offload;
> 		mi->hash = md->hash;
>
> 		mi->next = NULL;
> 		mi->pkt_len = md->pkt_len;
> 		mi->nb_segs = md->nb_segs;
> 		mi->ol_flags = md->ol_flags;
> 		mi->packet_type = md->packet_type;
>
> 		/* copy the segment payload */
> 		rte_memcpy(rte_pktmbuf_mtod(mi, char *),
> 			   rte_pktmbuf_mtod(md, char *), md->data_len);
>
> 		*prev = mi;
> 		prev = &mi->next;
> 	} while ((md = md->next) != NULL);
>
> 	*prev = NULL;
> 	return mc;
> }
>
> </Code>
>
> *Reference:*  http://patchwork.dpdk.org/patch/6289/
>
> Thanks & Best Regards,
>
> Wajeeha Javed
>
>
> ------------------------------
>
> Message: 3
> Date: Fri, 12 Oct 2018 11:56:48 +0300
> From: Andrew Rybchenko <arybchenko@solarflare.com>
> To: Wajeeha Javed <wajeeha.javed123@gmail.com>, <users@dpdk.org>
> Subject: Re: [dpdk-users] Increasing the NB_MBUFs of PktMbuf MemPool
> Message-ID: <b71716b2-2888-6828-3cfc-568c684c3181@solarflare.com>
> Content-Type: text/plain; charset="utf-8"; format=flowed
>
> Hi,
>
> On 10/12/18 7:48 AM, Wajeeha Javed wrote:
> > Hi,
> >
> > I am in the process of developing  DPDK based Application where I would
> > like to delay the packets for about 2 secs. There are two ports connected
> > to DPDK App and sending traffic of 64 bytes size packets at a line rate
> of
> > 10GB/s. Within 2 secs, I will have 28 Million packets for each of the
> port
> > in delay application. The maximum RX Descriptor size is 16384. I am
> unable
> > to increase the number of Rx descriptors more than 16384 value. Is it
> > possible to increase the number of Rx descriptors to a large value. e.g.
> > 65536.  Therefore I copied the mbufs using the pktmbuf copy code(shown
> > below) and free the packet received. Now the issue is that I can not copy
> > more than 5 million packets because the  nb_mbufs of the mempool can't be
> > more than 5 Million (#define NB_MBUF 5000000). If I increase the NB_MBUF
> > macro from more than 5 Million, the error is returned unable to init mbuf
> > pool. Is there a possible way to increase the mempool size?
>
> At first glance I've failed to find any explicit limitation.
> The NB_MBUF define is typically internal to examples/apps.
> The question I'd like to double-check is whether the host has enough
> RAM and hugepages allocated: 5 million mbufs already require about
> 10G.
>
> Andrew.
>
>
> End of users Digest, Vol 155, Issue 7
> *************************************
>


* Re: [dpdk-users] users Digest, Vol 155, Issue 7
       [not found]   ` <CAAQUUHXDjV5ZEB6PvoVMCS=KNEDAMMAEE3gdVTNNyAA2kzXCfA@mail.gmail.com>
@ 2018-10-16 10:02     ` Wiles, Keith
  2018-10-23  6:17       ` Wajeeha Javed
  0 siblings, 1 reply; 4+ messages in thread
From: Wiles, Keith @ 2018-10-16 10:02 UTC (permalink / raw)
  To: Wajeeha Javed; +Cc: users

Sorry, you must have replied to my screwup not sending the reply in pure text format. I did send an updated reply to hopefully fix that problem. More comments inline below. All emails to the list must be in ‘text' format not ‘Rich Text’ format :-(

> On Oct 15, 2018, at 11:42 PM, Wajeeha Javed <wajeeha.javed123@gmail.com> wrote:
> 
> Hi,
> 
> Thanks, everyone for your reply. Please find below my comments.
> 
> *I've failed to find explicit limitations from the first glance.*
> * NB_MBUF define is typically internal to examples/apps.*
> * The question I'd like to double-check if the host has enought*
> * RAM and hugepages allocated? 5 million mbufs already require about*
> * 10G.*
> 
> Total Ram = 128 GB
> Available Memory = 23GB free
> 
> Total Huge Pages = 80
> 
> Free Huge Page = 38
> Huge Page Size = 1GB
> 
> *The mempool uses uint32_t for most sizes and the number of mempool items
> is uint32_t so ok with the number of entries in a can be ~4G as stated be
> make sure you have enough *
> 
> *memory as the over head for mbufs is not just the header + the packet size*
> 
> > Right. Currently there are a total of 80 huge pages, 40 for each NUMA node
> > (node 0 and node 1). I observed that I was using only 16 huge pages, while
> > the other 16 huge pages were used by another DPDK application. By running
> > my DPDK application only on NUMA node 0, I was able to increase the mempool
> > size to 14M, which uses all the huge pages of NUMA node 0.
> 
> *My question is why are you copying the mbuf and not just linking the mbufs
> into a link list? Maybe I do not understand the reason. I would try to make
> sure you do not do a copy of the *
> 
> *data and just link the mbufs together using the next pointer in the mbuf
> header unless you have chained mbufs already.*
> 
> > The reason for copying the mbuf is a NIC limitation: I cannot have more
> > than 16384 Rx descriptors, whereas I want to hold all the packets arriving
> > at a line rate of 10 Gbit/s on each port. I created a circular queue
> > running on a FIFO basis. Initially I thought of holding the rte_mbuf*
> > packet bursts themselves for the 2-sec delay. But at line rate we receive
> > 14 million

I assume in your driver an mbuf is used to receive the packet data, which means the packet is already inside an mbuf (if not, then why not?). The mbuf data does not need to be copied; you can use the 'next' pointer in the mbuf to create a single linked list. If you use fragmented packets in your design, meaning you already use the 'next' pointer in the mbuf to chain frame fragments into a single packet, then using 'next' will not work. Also, when you call rte_pktmbuf_free() you need to make sure the next pointer is NULL, or it will free the complete chain of mbufs (not what you want here).
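
A rough, untested sketch of that next-pointer FIFO (assuming every packet is
a single segment; delay_head/delay_tail are illustrative names):

<Code>

/* Untested sketch: FIFO of single-segment packets chained via m->next.
 * Each mbuf is unlinked on dequeue so that rte_pktmbuf_free() (or a TX
 * burst) only touches that one mbuf, not the rest of the queue. */
#include <rte_mbuf.h>

static struct rte_mbuf *delay_head, *delay_tail;

static void
delay_enqueue(struct rte_mbuf *m)
{
	m->next = NULL;
	if (delay_tail == NULL)
		delay_head = m;
	else
		delay_tail->next = m;
	delay_tail = m;
}

static struct rte_mbuf *
delay_dequeue(void)
{
	struct rte_mbuf *m = delay_head;

	if (m != NULL) {
		delay_head = m->next;
		if (delay_head == NULL)
			delay_tail = NULL;
		m->next = NULL;		/* unlink: free only this mbuf */
		m->nb_segs = 1;
	}
	return m;
}

</Code>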

In the case where you are using chained mbufs for a single packet, you can create a set of small buffers to hold the STAILQ pointers and the pointer to the mbuf, then add that small structure onto a linked list. This method may be the best solution in the long run, instead of trying to use the mbuf->next pointer.

Have a look at the rte_tailq.h and eal_common_tailqs.c files and rte_mempool.c (plus many other libs in DPDK), which use the rte_tailq_entry structure to create a linked list of mempool structures for searching and debugging mempools in the system. 'struct rte_tailq_entry' just wraps a pointer to the mempool structure and allows it to be put on a linked list with the correct pointer types.

You can create a mempool of rte_tailq_entry structures if you want a fast and clean way to allocate/free the tailq entry structures.

Then you do not need to copy the packet memory anyplace: just allocate a tailq entry structure, set the mbuf pointer in the tailq entry, then link the tailq entry to the tailq list. The macros for tailq support are not the easiest to understand :-(, but once you get the idea it becomes clearer.
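
A bare-bones, untested sketch of that idea (entry_pool and delay_list are
illustrative names; entry_pool would be created once at startup, e.g. with
rte_mempool_create() and an element size of sizeof(struct rte_tailq_entry)):

<Code>

/* Untested sketch: queue mbuf pointers on a tailq of rte_tailq_entry
 * structures allocated from a mempool, so no packet data is copied. */
#include <sys/queue.h>
#include <rte_tailq.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>

static struct rte_mempool *entry_pool;	/* pool of struct rte_tailq_entry */
static TAILQ_HEAD(, rte_tailq_entry) delay_list =
		TAILQ_HEAD_INITIALIZER(delay_list);

static int
delay_enqueue(struct rte_mbuf *m)
{
	struct rte_tailq_entry *e;

	if (rte_mempool_get(entry_pool, (void **)&e) < 0)
		return -1;	/* no entries left: caller drops or stalls */
	e->data = m;		/* keep the mbuf itself, no payload copy */
	TAILQ_INSERT_TAIL(&delay_list, e, next);
	return 0;
}

static struct rte_mbuf *
delay_dequeue(void)
{
	struct rte_tailq_entry *e = TAILQ_FIRST(&delay_list);
	struct rte_mbuf *m;

	if (e == NULL)
		return NULL;
	TAILQ_REMOVE(&delay_list, e, next);
	m = e->data;
	rte_mempool_put(entry_pool, e);
	return m;
}

</Code>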

I hope that helps.

> 
> > packets/s, so the descriptors fill up and I have no option left but to copy
> > the mbuf into the circular queue rather than store an rte_mbuf* pointer. I
> > know I have to compromise on performance to achieve the delay. So to copy
> > mbufs, I allocate memory from the mempool, copy the received mbuf into it,
> > and then free the original. Please find the code snippet below.
> 
> > How can we chain different mbufs together? According to my understanding,
> > chained mbufs in the API are used for storing the segments of fragmented
> > packets that are larger than the MTU. Even if we chain the mbufs together
> > using the next pointer, we still need to free the received mbufs; otherwise
> > we will not be able to get free Rx descriptors at a line rate of 10 Gbit/s,
> > and eventually all the Rx descriptors will be filled and the NIC will not
> > receive any more packets.
> 
> <Code>
> 
> for( j = 0; j < nb_rx; j++) {
> m = pkts_burst[j];
> struct rte_mbuf* copy_mbuf = pktmbuf_copy(m, pktmbuf_pool[sockid]);
> ....
> rte_pktmbuf_free(m);
> }
> 
> </Code>
> 
> *The other question is can you drop any packets if not then you only have
> the linking option IMO. If you can drop packets then you can just start
> dropping them when the ring is getting full. Holding onto 28m packets for
> two seconds can cause other protocol related problems and TCP could be
> sending retransmitted packets and now you have caused a bunch of work on
> the RX side *
> 
> *at **the end point.*
> > I would like my DPDK application to have zero packet loss; it only delays
> > all received packets for 2 secs and then transmits them as they are,
> > without any change or processing.
> > Moreover, the DPDK application is receiving tapped (monitoring) traffic
> > rather than live traffic, so there will not be any TCP or other
> > protocol-related problems.
> 
> Looking forward to your reply.
> 
> 
> Best Regards,
> 
> Wajeeha Javed

Regards,
Keith



* Re: [dpdk-users] users Digest, Vol 155, Issue 7
  2018-10-16 10:02     ` Wiles, Keith
@ 2018-10-23  6:17       ` Wajeeha Javed
  2018-10-23 15:22         ` Wiles, Keith
  0 siblings, 1 reply; 4+ messages in thread
From: Wajeeha Javed @ 2018-10-23  6:17 UTC (permalink / raw)
  To: keith.wiles; +Cc: users

Hi Keith,

Thanks for your reply. Please find below my comments

>> You're right: in my application all the packets are stored inside mbufs.
The reason for not using the mbuf's next pointer is that it might already be
used by fragmented packets larger than the MTU.

>> I have tried using a small buffer with an STAILQ linked list for each port,
holding an STAILQ entry and a pointer to the mbuf from the packet burst. I
allocate the stailq entry, set the mbuf pointer in it, then link the stailq
entry to the stailq list using the stailq macros. I observe millions of
packets lost; the stailq linked list could only hold less than 1 million
packets per second at a line rate of 10 Gbit/s.

>> I would like to prevent data loss. Could you please guide me on the best
solution for increasing the number of mbufs, without freeing or overwriting
them, for a delay of 2 secs?

Thanks & Best Regards,

Wajeeha Javed


On Tue, Oct 16, 2018 at 3:02 PM Wiles, Keith <keith.wiles@intel.com> wrote:

> Sorry, you must have replied to my screwup not sending the reply in pure
> text format. I did send an updated reply to hopefully fix that problem.
> More comments inline below. All emails to the list must be in ‘text' format
> not ‘Rich Text’ format :-(
>
> > On Oct 15, 2018, at 11:42 PM, Wajeeha Javed <wajeeha.javed123@gmail.com>
> wrote:
> >
> > Hi,
> >
> > Thanks, everyone for your reply. Please find below my comments.
> >
> > *I've failed to find explicit limitations from the first glance.*
> > * NB_MBUF define is typically internal to examples/apps.*
> > * The question I'd like to double-check if the host has enought*
> > * RAM and hugepages allocated? 5 million mbufs already require about*
> > * 10G.*
> >
> > Total Ram = 128 GB
> > Available Memory = 23GB free
> >
> > Total Huge Pages = 80
> >
> > Free Huge Page = 38
> > Huge Page Size = 1GB
> >
> > *The mempool uses uint32_t for most sizes and the number of mempool items
> > is uint32_t so ok with the number of entries in a can be ~4G as stated be
> > make sure you have enough *
> >
> > *memory as the over head for mbufs is not just the header + the packet
> size*
> >
> > Right. Currently, there are total of 80 huge pages, 40 for each numa node
> > (Numa node 0 and Numa node 1). I observed that I was using only 16 huge
> > pages while the other 16
> >
> > huge pages were used by other dpdk  application. By running only my dpdk
> > application on numa node 0, I was able to increase the mempool size to
> 14M
> > that uses all the
> >
> > huge pages of Numa node 0.
> >
> > *My question is why are you copying the mbuf and not just linking the
> mbufs
> > into a link list? Maybe I do not understand the reason. I would try to
> make
> > sure you do not do a copy of the *
> >
> > *data and just link the mbufs together using the next pointer in the mbuf
> > header unless you have chained mbufs already.*
> >
> > The reason for copying the Mbuf is due to the NIC limitations, I cannot
> > have more than 16384 Rx descriptors, whereas  I want to withhold all the
> > packets coming at a line rate of 10GBits/sec for each port. I created a
> > circular queue running on a FIFO basis. Initially, I thought of using
> > rte_mbuf* packet burst for a delay of 2 secs. Now at line rate, we
> receive
> > 14Million
>
> I assume in your driver a mbuf is used to receive the packet data, which
> means the packet is inside an mbuf (if not then why not?). The mbuf data
> does not need to be copied you can use the ’next’ pointer in the mbuf to
> create a single link list. If you use fragmented packets in your design,
> which means you are using the ’next’ pointer in the mbuf to chain the frame
> fragments into a single packet then using ’next’ will not work. Plus when
> you call rte_pktmbuf_free() you need to make sure the next pointer is NULL
> or it will free the complete chain of mbufs (not what you want here).
>
> In the case where you are using chained mbufs for a single packet then you
> can create a set of small buffers to hold the STAILQ pointers and the
> pointer to the mbuf. Then add the small structure onto a link list as this
> method maybe the best solution in the long run instead of trying to use the
> mbuf->next pointer.
>
> Have a look at the rte_tailq.h and eal_common_tailqs.c files and
> rte_mempool.c (plus many other libs in DPDK). Use the rte_tailq_entry
> structure to create a linked list of mempool structures for searching and
> debugging mempools in the system. The 'struct rte_tailq_entry’ is just
> adding a simple structure to point to the mempool structure and allows it
> to build a linked list with the correct pointer types.
>
> You can create a mempool of rte_tailq_entry structures if you want a fast
> and clean way to allocate/free the tailq entry structures.
>
> Then you do not need to copy the packet memory anyplace just allocate a
> tailq entry structure, set the mbuf pointer in the tailq entry, the link
> the tailq entry  to the tailq list. These macros for tailq support are not
> the easiest to understand :-(, but once you understand the idea it becomes
> clearer.
>
> I hope that helps.
>
> >
> > Packet/s, so descriptor get full and I don't have other option left than
> > copying the mbuf to the circular queue rather than using a rte_mbuf*
> > pointer. I know I have to make a
> >
> > compromise on performance to achieve a delay for packets. So for copying
> > mbufs, I allocate memory from Mempool to copy the mbuf received and then
> > free it. Please find the
> >
> > code snippet below.
> >
> > How we can chain different mbufs together? According to my understanding
> > chained mbufs in the API are used for storing segments of the fragmented
> > packets that are greater
> >
> > than MTU. Even If we chain the mbufs together using next pointer we need
> to
> > free the mbufs received, otherwise we will not be able to get free Rx
> > descriptors at a line rate of
> >
> > 10GBits/sec and eventually all the Rx descriptors will be filled and NIC
> > will not receive any more packets.
> >
> > <Code>
> >
> > for( j = 0; j < nb_rx; j++) {
> > m = pkts_burst[j];
> > struct rte_mbuf* copy_mbuf = pktmbuf_copy(m, pktmbuf_pool[sockid]);
> > ....
> > rte_pktmbuf_free(m);
> > }
> >
> > </Code>
> >
> > *The other question is can you drop any packets if not then you only have
> > the linking option IMO. If you can drop packets then you can just start
> > dropping them when the ring is getting full. Holding onto 28m packets for
> > two seconds can cause other protocol related problems and TCP could be
> > sending retransmitted packets and now you have caused a bunch of work on
> > the RX side *
> >
> > *at **the end point.*
> > I would like my DPDK application to have zero packet loss, it only delays
> > all the received packet for 2 secs than transmitted them as it is without
> > any change or processing to packets.
> > Moreover, DPDK application is receiving tap traffic(monitoring traffic)
> > rather than real-time traffic. So there will not be any TCP or any other
> > protocol-related problems.
> >
> > Looking forward to your reply.
> >
> >
> > Best Regards,
> >
> > Wajeeha Javed
>
> Regards,
> Keith
>
>


* Re: [dpdk-users] users Digest, Vol 155, Issue 7
  2018-10-23  6:17       ` Wajeeha Javed
@ 2018-10-23 15:22         ` Wiles, Keith
  0 siblings, 0 replies; 4+ messages in thread
From: Wiles, Keith @ 2018-10-23 15:22 UTC (permalink / raw)
  To: Wajeeha Javed; +Cc: users



> On Oct 22, 2018, at 11:17 PM, Wajeeha Javed <wajeeha.javed123@gmail.com> wrote:
> 
> Hi Keith, 

Please try to reply inline to the text and do not top-post; it makes it hard to follow so many email threads.

> 
> Thanks for your reply. Please find below my comments
> 
> >> You're right, in my application all the packets are stored inside mbuf. The reason for not using the next pointer of mbuf is that it might get used by the fragmented packets having size greater than MTU. 
> 
> >> I have tried using small buffer of STAILQ linked list for each port having STAILQ entry and pointer to mbuf packets burst. I allocate the stailq entry, set the mbuf pointer in the stailq entry, the link the stailq entry to the stailq list using stailq macros. I observe millions of packet loss, the stailq linked list could only hold less than 1 million packets per second at line rate of 10Gbits/sec.
> 
> >> I would like to prevent data loss, could you please guide me what is the best optimal solution for increasing the number of mbufs without freeing or overwriting them for a delay of 2 secs.

Using the stailq method is my best guess at solving your problem. If you are calling malloc on each packet you want to save, at the time you link the packets, that would be the reason you cannot hold the packets without dropping some at the wire.

Allocate all of the stailq blocks at startup and keep them in some type of array or free list, to avoid doing an allocation call per packet. Other than this type of help, short of writing the code myself, this is all I have for you, sorry. The amount of memory allocated for the stailq structures is going to be more than 28M blocks, and all sorts of cache issues could be causing the problem.
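
For example (untested sketch; delay_entry, its fields and the entry count are
made up for illustration):

<Code>

/* Untested sketch: preallocate every list entry once at startup and keep the
 * unused ones on a free list, so the datapath never calls malloc. */
#include <stdint.h>
#include <sys/queue.h>
#include <rte_malloc.h>
#include <rte_mbuf.h>
#include <rte_cycles.h>

struct delay_entry {
	STAILQ_ENTRY(delay_entry) next;
	struct rte_mbuf *m;
	uint64_t rx_tsc;		/* arrival time, to test the 2 s deadline */
};
STAILQ_HEAD(delay_head, delay_entry);

static struct delay_entry *entries;	/* one contiguous array, allocated once */
static struct delay_head free_list = STAILQ_HEAD_INITIALIZER(free_list);
static struct delay_head delay_list = STAILQ_HEAD_INITIALIZER(delay_list);

static int
delay_init(uint32_t n_entries, int socket_id)	/* e.g. ~30M for 2 s at 10 Gbit/s */
{
	uint32_t i;

	entries = rte_zmalloc_socket("delay_entries",
				     (size_t)n_entries * sizeof(*entries),
				     0, socket_id);
	if (entries == NULL)
		return -1;
	for (i = 0; i < n_entries; i++)
		STAILQ_INSERT_TAIL(&free_list, &entries[i], next);
	return 0;
}

static int
delay_enqueue(struct rte_mbuf *m)
{
	struct delay_entry *e = STAILQ_FIRST(&free_list);

	if (e == NULL)
		return -1;		/* all entries in use: queue is full */
	STAILQ_REMOVE_HEAD(&free_list, next);
	e->m = m;
	e->rx_tsc = rte_get_tsc_cycles();
	STAILQ_INSERT_TAIL(&delay_list, e, next);
	return 0;
}

</Code>

Dequeue is the mirror image: once rte_get_tsc_cycles() - e->rx_tsc for the
head entry exceeds 2 * rte_get_tsc_hz(), transmit or free that mbuf and push
the entry back onto free_list.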

> 
> Thanks & Best Regards,
> 
> Wajeeha Javed
> 
> 
> 
> On Tue, Oct 16, 2018 at 3:02 PM Wiles, Keith <keith.wiles@intel.com> wrote:
> Sorry, you must have replied to my screwup not sending the reply in pure text format. I did send an updated reply to hopefully fix that problem. More comments inline below. All emails to the list must be in ‘text' format not ‘Rich Text’ format :-(
> 
> > On Oct 15, 2018, at 11:42 PM, Wajeeha Javed <wajeeha.javed123@gmail.com> wrote:
> > 
> > Hi,
> > 
> > Thanks, everyone for your reply. Please find below my comments.
> > 
> > *I've failed to find explicit limitations from the first glance.*
> > * NB_MBUF define is typically internal to examples/apps.*
> > * The question I'd like to double-check if the host has enought*
> > * RAM and hugepages allocated? 5 million mbufs already require about*
> > * 10G.*
> > 
> > Total Ram = 128 GB
> > Available Memory = 23GB free
> > 
> > Total Huge Pages = 80
> > 
> > Free Huge Page = 38
> > Huge Page Size = 1GB
> > 
> > *The mempool uses uint32_t for most sizes and the number of mempool items
> > is uint32_t so ok with the number of entries in a can be ~4G as stated be
> > make sure you have enough *
> > 
> > *memory as the over head for mbufs is not just the header + the packet size*
> > 
> > Right. Currently, there are total of 80 huge pages, 40 for each numa node
> > (Numa node 0 and Numa node 1). I observed that I was using only 16 huge
> > pages while the other 16
> > 
> > huge pages were used by other dpdk  application. By running only my dpdk
> > application on numa node 0, I was able to increase the mempool size to 14M
> > that uses all the
> > 
> > huge pages of Numa node 0.
> > 
> > *My question is why are you copying the mbuf and not just linking the mbufs
> > into a link list? Maybe I do not understand the reason. I would try to make
> > sure you do not do a copy of the *
> > 
> > *data and just link the mbufs together using the next pointer in the mbuf
> > header unless you have chained mbufs already.*
> > 
> > The reason for copying the Mbuf is due to the NIC limitations, I cannot
> > have more than 16384 Rx descriptors, whereas  I want to withhold all the
> > packets coming at a line rate of 10GBits/sec for each port. I created a
> > circular queue running on a FIFO basis. Initially, I thought of using
> > rte_mbuf* packet burst for a delay of 2 secs. Now at line rate, we receive
> > 14Million
> 
> I assume in your driver a mbuf is used to receive the packet data, which means the packet is inside an mbuf (if not then why not?). The mbuf data does not need to be copied you can use the ’next’ pointer in the mbuf to create a single link list. If you use fragmented packets in your design, which means you are using the ’next’ pointer in the mbuf to chain the frame fragments into a single packet then using ’next’ will not work. Plus when you call rte_pktmbuf_free() you need to make sure the next pointer is NULL or it will free the complete chain of mbufs (not what you want here).
> 
> In the case where you are using chained mbufs for a single packet then you can create a set of small buffers to hold the STAILQ pointers and the pointer to the mbuf. Then add the small structure onto a link list as this method maybe the best solution in the long run instead of trying to use the mbuf->next pointer.
> 
> Have a look at the rte_tailq.h and eal_common_tailqs.c files and rte_mempool.c (plus many other libs in DPDK). Use the rte_tailq_entry structure to create a linked list of mempool structures for searching and debugging mempools in the system. The 'struct rte_tailq_entry’ is just adding a simple structure to point to the mempool structure and allows it to build a linked list with the correct pointer types.
> 
> You can create a mempool of rte_tailq_entry structures if you want a fast and clean way to allocate/free the tailq entry structures.
> 
> Then you do not need to copy the packet memory anyplace just allocate a tailq entry structure, set the mbuf pointer in the tailq entry, the link the tailq entry  to the tailq list. These macros for tailq support are not the easiest to understand :-(, but once you understand the idea it becomes clearer.
> 
> I hope that helps.
> 
> > 
> > Packet/s, so descriptor get full and I don't have other option left than
> > copying the mbuf to the circular queue rather than using a rte_mbuf*
> > pointer. I know I have to make a
> > 
> > compromise on performance to achieve a delay for packets. So for copying
> > mbufs, I allocate memory from Mempool to copy the mbuf received and then
> > free it. Please find the
> > 
> > code snippet below.
> > 
> > How we can chain different mbufs together? According to my understanding
> > chained mbufs in the API are used for storing segments of the fragmented
> > packets that are greater
> > 
> > than MTU. Even If we chain the mbufs together using next pointer we need to
> > free the mbufs received, otherwise we will not be able to get free Rx
> > descriptors at a line rate of
> > 
> > 10GBits/sec and eventually all the Rx descriptors will be filled and NIC
> > will not receive any more packets.
> > 
> > <Code>
> > 
> > for( j = 0; j < nb_rx; j++) {
> > m = pkts_burst[j];
> > struct rte_mbuf* copy_mbuf = pktmbuf_copy(m, pktmbuf_pool[sockid]);
> > ....
> > rte_pktmbuf_free(m);
> > }
> > 
> > </Code>
> > 
> > *The other question is can you drop any packets if not then you only have
> > the linking option IMO. If you can drop packets then you can just start
> > dropping them when the ring is getting full. Holding onto 28m packets for
> > two seconds can cause other protocol related problems and TCP could be
> > sending retransmitted packets and now you have caused a bunch of work on
> > the RX side *
> > 
> > *at **the end point.*
> > I would like my DPDK application to have zero packet loss, it only delays
> > all the received packet for 2 secs than transmitted them as it is without
> > any change or processing to packets.
> > Moreover, DPDK application is receiving tap traffic(monitoring traffic)
> > rather than real-time traffic. So there will not be any TCP or any other
> > protocol-related problems.
> > 
> > Looking forward to your reply.
> > 
> > 
> > Best Regards,
> > 
> > Wajeeha Javed
> 
> Regards,
> Keith
> 

Regards,
Keith



end of thread, other threads:[~2018-10-23 15:22 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <mailman.1.1539338401.21145.users@dpdk.org>
2018-10-12 12:40 ` [dpdk-users] users Digest, Vol 155, Issue 7 waqas ahmed
     [not found]   ` <CAAQUUHXDjV5ZEB6PvoVMCS=KNEDAMMAEE3gdVTNNyAA2kzXCfA@mail.gmail.com>
2018-10-16 10:02     ` Wiles, Keith
2018-10-23  6:17       ` Wajeeha Javed
2018-10-23 15:22         ` Wiles, Keith
