* [dpdk-dev] TX unable to enqueue packets to NIC due to no free TX descriptor
@ 2019-01-11 22:10 Soni, Shivam
2019-01-11 23:37 ` Stephen Hemminger
0 siblings, 1 reply; 8+ messages in thread
From: Soni, Shivam @ 2019-01-11 22:10 UTC (permalink / raw)
To: dev, users
Hi All,
We are trying to debug and fix an issue. After deployment, on a few of our hosts TX is unable to enqueue packets to the NIC; bouncing or restarting our packet-processor daemon resolves it.
We are using Intel DPDK version 17.11.4 and the i40e driver.
Looking into the driver's code, we found that whenever the issue occurs the value of nb_tx_free is 0, and the driver then tries to free buffers by calling i40e_tx_free_bufs().
That function returns early because the buffer it is trying to free reports that it has not finished transmitting yet. It returns at this if condition:
/* check DD bits on threshold descriptor */
if ((txq->tx_ring[txq->tx_next_dd].cmd_type_offset_bsz &
                rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) !=
                rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE)) {
        return 0;
}
Hence nb_tx_free remains 0.
Our tx descriptor count is 1024.
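For context, at the application level the symptom looks like this (an illustrative sketch, not our actual code): once nb_tx_free stays at 0, rte_eth_tx_burst() accepts nothing and the caller has to drop or retry the unsent mbufs:

#include <rte_ethdev.h>
#include <rte_mbuf.h>

/* Illustrative TX loop: when the TX ring has no free descriptors,
 * rte_eth_tx_burst() accepts fewer than nb_pkts packets (possibly 0);
 * the un-enqueued mbufs must be freed (or retried) so the mempool
 * does not drain along with the descriptor ring. */
static void
send_burst(uint16_t port_id, uint16_t queue_id,
           struct rte_mbuf **pkts, uint16_t nb_pkts)
{
        uint16_t sent = rte_eth_tx_burst(port_id, queue_id, pkts, nb_pkts);

        while (sent < nb_pkts)
                rte_pktmbuf_free(pkts[sent++]);
}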
How can we fix this issue? Can someone help us out here, please?
Thanks.
* Re: [dpdk-dev] TX unable to enqueue packets to NIC due to no free TX descriptor
2019-01-11 22:10 [dpdk-dev] TX unable to enqueue packets to NIC due to no free TX descriptor Soni, Shivam
@ 2019-01-11 23:37 ` Stephen Hemminger
2019-01-12 0:26 ` Soni, Shivam
0 siblings, 1 reply; 8+ messages in thread
From: Stephen Hemminger @ 2019-01-11 23:37 UTC (permalink / raw)
To: Soni, Shivam; +Cc: dev, users
Use a bigger mbuf pool. To be safe the mbuf pool has to be big enough
for Nports * (NRxd + NTxd) + NCores * (mbuf_pool_cache_size + burst_size).
Each NIC might hold a full receive ring and a full transmit ring,
and each active core might be processing a burst of packets and have
free buffers sitting in its mbuf pool cache. This doesn't account for additional
mbufs created by things like reassembly, encryption, re-encapsulation, or compression.
Anything smaller and your application is relying on statistical averages
never to see resource exhaustion; that is overcommitment.
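Roughly (a sketch; the port and core counts here are placeholders, not taken from your setup):

#include <rte_mbuf.h>
#include <rte_mempool.h>

#define NB_PORTS        2     /* placeholder values - use your own */
#define NB_RXD          1024
#define NB_TXD          1024
#define NB_CORES        12
#define MBUF_CACHE_SIZE 256
#define BURST_SIZE      32

static struct rte_mempool *
create_pktmbuf_pool(int socket_id)
{
        /* every port may hold a full RX and TX ring of mbufs, and every
         * active core may hold a pool cache plus one burst in flight */
        unsigned int nb_mbufs = NB_PORTS * (NB_RXD + NB_TXD) +
                                NB_CORES * (MBUF_CACHE_SIZE + BURST_SIZE);

        return rte_pktmbuf_pool_create("pktmbuf_pool", nb_mbufs,
                                       MBUF_CACHE_SIZE, 0,
                                       RTE_MBUF_DEFAULT_BUF_SIZE, socket_id);
}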
* Re: [dpdk-dev] TX unable to enqueue packets to NIC due to no free TX descriptor
2019-01-11 23:37 ` Stephen Hemminger
@ 2019-01-12 0:26 ` Soni, Shivam
2019-01-14 17:54 ` Soni, Shivam
0 siblings, 1 reply; 8+ messages in thread
From: Soni, Shivam @ 2019-01-12 0:26 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev, users
Hi Stephen,
Thanks for the reply.
Our mbuf pool is big enough. We have 2 RX cores, 2 TX cores and 8 worker cores.
NTxd and NRxd are 1024 each, and we have 16 RX rings (shared between the RX cores and the workers) and 8 TX rings (between the TX cores and the workers).
The mempool cache size is 256 and the burst size is 32.
So the overall calculation comes out to be:
((NIC_RX_QUEUE_SIZE * RX_LCORES) + (NIC_TX_QUEUE_SIZE * TX_LCORES) + \
(WORKER_RX_RING_SIZE * RX_LCORES * NAT_WORKER_LCORES) + (WORKER_TX_RING_SIZE * NAT_WORKER_LCORES) + \
((MBUF_ARRAY_SIZE + CACHE_SIZE) * (RX_LCORES + TX_LCORES + NAT_WORKER_LCORES)))
With this, the mbuf pool size should be 32128. Rounding up to 2^15 - 1, we have kept the mbuf pool size at 32767.
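Spelled out with numbers (the worker ring sizes and MBUF_ARRAY_SIZE are not listed above; 1024 and 32 are assumed here, which is what makes the total come to 32128):

/* assumed: WORKER_RX_RING_SIZE = WORKER_TX_RING_SIZE = 1024,
 *          MBUF_ARRAY_SIZE = 32 (one burst), CACHE_SIZE = 256 */
  (1024 * 2) + (1024 * 2)          /* NIC RX + TX queues      =  4096 */
  + (1024 * 2 * 8)                 /* 16 worker RX rings      = 16384 */
  + (1024 * 8)                     /*  8 worker TX rings      =  8192 */
  + ((32 + 256) * (2 + 2 + 8))     /* per-lcore burst + cache =  3456 */
                                   /* total                   = 32128 */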
Also, the incoming packet rate is pretty low.
For testing I have doubled the pool size for now. Not sure whether this will solve the issue.
Thanks.
* Re: [dpdk-dev] TX unable to enqueue packets to NIC due to no free TX descriptor
2019-01-12 0:26 ` Soni, Shivam
@ 2019-01-14 17:54 ` Soni, Shivam
2019-01-16 21:45 ` Soni, Shivam
0 siblings, 1 reply; 8+ messages in thread
From: Soni, Shivam @ 2019-01-14 17:54 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev, users
I doubled the mempool size to 65535 but the issue is not resolved.
* Re: [dpdk-dev] TX unable to enqueue packets to NIC due to no free TX descriptor
2019-01-14 17:54 ` Soni, Shivam
@ 2019-01-16 21:45 ` Soni, Shivam
2019-01-17 10:56 ` Bruce Richardson
2019-04-15 5:35 ` Xiao, Xiaohong (NSB - CN/Shanghai)
0 siblings, 2 replies; 8+ messages in thread
From: Soni, Shivam @ 2019-01-16 21:45 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev, users, Uppal, Hardeep
On digging further, I found some more data.
On a host where everything works fine, I can see txq->nb_tx_free getting reduced from 1024 down to 31. After it reaches 31, i40e_tx_free_bufs() gets called, which frees buffers, and nb_tx_free goes back up to 63.
Also, in i40e_tx_free_bufs() this if condition never evaluates to true: whatever the value of the index txq->tx_next_dd, the value of cmd_type_offset_bsz is always 15. Hence the condition is always false and the code works fine.
if ((txq->tx_ring[txq->tx_next_dd].cmd_type_offset_bsz &
                rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) !=
                rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE)) {
        return 0;
}
However, on the hosts where we see the issue, after some calls to i40e_tx_free_bufs() the value of txq->tx_ring[txq->tx_next_dd].cmd_type_offset_bsz becomes really weird, like 1099511627888 or 1030792151152 (1099511627888 is 0x10000000070, so the low DTYPE nibble is 0 rather than the descriptor-done value 0xF). Because of these values the if condition becomes true (1099511627888 & 15 is 0, which is != 15), so the function returns right there, nb_tx_free never increases, and it eventually reaches 0.
Are these values expected, or is there some memory corruption happening somewhere in our code?
As far as I understand, the purpose of this if condition is to check whether the buffers being freed are still in flight (not yet transmitted).
Can someone help us out here?
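In case it helps, here is a minimal sketch for polling TX descriptor completion from the application side without patching the driver (this assumes rte_eth_tx_descriptor_status() from the generic ethdev API in 17.11; it returns a negative errno if the PMD does not implement it):

#include <stdio.h>
#include <rte_ethdev.h>

/* Walk the TX ring in steps of 32 and report whether the hardware has
 * written back "descriptor done"; if entries never become DONE while
 * nb_tx_free stays at 0, descriptor write-back is not happening. */
static void
dump_txq_status(uint16_t port_id, uint16_t queue_id, uint16_t nb_txd)
{
        uint16_t off;

        for (off = 0; off < nb_txd; off += 32) {
                int st = rte_eth_tx_descriptor_status(port_id, queue_id, off);

                if (st < 0) {
                        printf("desc %4u: error %d\n", off, st);
                        continue;
                }
                printf("desc %4u: %s\n", off,
                       st == RTE_ETH_TX_DESC_DONE ? "DONE" :
                       st == RTE_ETH_TX_DESC_FULL ? "FULL (in flight)" :
                       "UNAVAIL");
        }
}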
* Re: [dpdk-dev] TX unable to enqueue packets to NIC due to no free TX descriptor
2019-01-16 21:45 ` Soni, Shivam
@ 2019-01-17 10:56 ` Bruce Richardson
2019-04-15 5:35 ` Xiao, Xiaohong (NSB - CN/Shanghai)
1 sibling, 0 replies; 8+ messages in thread
From: Bruce Richardson @ 2019-01-17 10:56 UTC (permalink / raw)
To: Soni, Shivam; +Cc: Stephen Hemminger, dev, users, Uppal, Hardeep
Hi,
what you describe is correct. The check there is for the descriptor-done
bit, which is set by the hardware when it has finished transmitting the packet
corresponding to the descriptor. If that DD bit is not set when you go to
free the buffer, then:
a) the packet has not been transmitted for some reason e.g. no link
b) the hardware is not writing back descriptors properly
c) something in software is corrupting the descriptor ring.
While I can't say what's wrong in your specific scenario, I'd start by
verifying that the initial packets are being sent correctly, to rule
out (a), and thereafter look for issues that would cause (c). If a HW issue is
suspected, you can try swapping the NIC for a different one to eliminate (b)
as a possible cause.
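For instance, a quick sketch (untested; it only uses the generic ethdev link and stats calls) to check the link and confirm packets are actually leaving the port:

#include <stdio.h>
#include <inttypes.h>
#include <rte_ethdev.h>

/* Rule out "no link" and confirm transmission: opackets should keep
 * increasing over time and oerrors should stay at zero. */
static void
check_port_tx_health(uint16_t port_id)
{
        struct rte_eth_link link;
        struct rte_eth_stats stats;

        rte_eth_link_get_nowait(port_id, &link);
        printf("port %u: link %s, %u Mbps\n", port_id,
               link.link_status ? "up" : "down", link.link_speed);

        if (rte_eth_stats_get(port_id, &stats) == 0)
                printf("port %u: opackets=%" PRIu64 " oerrors=%" PRIu64 "\n",
                       port_id, stats.opackets, stats.oerrors);
}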
/Bruce
* Re: [dpdk-dev] TX unable to enqueue packets to NIC due to no free TX descriptor
2019-01-16 21:45 ` Soni, Shivam
2019-01-17 10:56 ` Bruce Richardson
@ 2019-04-15 5:35 ` Xiao, Xiaohong (NSB - CN/Shanghai)
2019-04-15 5:35 ` Xiao, Xiaohong (NSB - CN/Shanghai)
1 sibling, 1 reply; 8+ messages in thread
From: Xiao, Xiaohong (NSB - CN/Shanghai) @ 2019-04-15 5:35 UTC (permalink / raw)
To: Soni, Shivam, Stephen Hemminger
Cc: dev, users, Uppal, Hardeep, Li, Miaocai A. (NSB - CN/Shanghai),
Chen, Fei A. (NSB - CN/Shanghai)
Hello
We met a similar issue with DPDK 17.11 + i40e: the TX queue appears full and hangs, and no packets can be sent out at all.
Has this issue been resolved, and if so, how? Thank you very much.
Regards
Nokia, Xiao Xiaohong
* Re: [dpdk-dev] TX unable to enqueue packets to NIC due to no free TX descriptor
2019-04-15 5:35 ` Xiao, Xiaohong (NSB - CN/Shanghai)
@ 2019-04-15 5:35 ` Xiao, Xiaohong (NSB - CN/Shanghai)
0 siblings, 0 replies; 8+ messages in thread
From: Xiao, Xiaohong (NSB - CN/Shanghai) @ 2019-04-15 5:35 UTC (permalink / raw)
To: Soni, Shivam, Stephen Hemminger
Cc: dev, users, Uppal, Hardeep, Li, Miaocai A. (NSB - CN/Shanghai),
Chen, Fei A. (NSB - CN/Shanghai)
Hello
We met a similar issue with DPDK 17.11 + i40e: the TX queue appears full and hangs, and no packets can be sent out at all.
Has this issue been resolved, and if so, how? Thank you very much.
Regards
Nokia, Xiao Xiaohong
Thread overview: 8 messages
2019-01-11 22:10 [dpdk-dev] TX unable to enqueue packets to NIC due to no free TX descriptor Soni, Shivam
2019-01-11 23:37 ` Stephen Hemminger
2019-01-12 0:26 ` Soni, Shivam
2019-01-14 17:54 ` Soni, Shivam
2019-01-16 21:45 ` Soni, Shivam
2019-01-17 10:56 ` Bruce Richardson
2019-04-15 5:35 ` Xiao, Xiaohong (NSB - CN/Shanghai)
2019-04-15 5:35 ` Xiao, Xiaohong (NSB - CN/Shanghai)