* [PATCH] kernel/kni: retry the xmit in case ring is full
@ 2021-12-17 15:00 Tudor Cornea
2021-12-17 16:24 ` Stephen Hemminger
0 siblings, 1 reply; 3+ messages in thread
From: Tudor Cornea @ 2021-12-17 15:00 UTC (permalink / raw)
To: ferruh.yigit; +Cc: thomas, stephen, dev, Tudor Cornea
This patch attempts to avoid dropping packets that are to be
transmitted, in case there is no space in the KNI ring.
We have a use case in which we leverage the Linux TCP / IP stack for
control plane, and some protocols might be sensitive to packet drops.
This might mean that the sender (Kernel) might be moving at a faster pace
than the receiver end (DPDK application), or it might have some brief
moments of bursty traffic patterns.
Requeuing the packets could add a kind of backpressure until a transmit
window is available to us.
The burden of retransmitting is shifted to the caller of ndo_start_xmit,
which in our case is the configured queuing discipline. This way, the
user should be able to influence the behavior w.r.t dropping packets,
by picking the desired queuing discipline.
Although it should technically be a good approach, from what
I have tested, stopping the queue prior to returning NETDEV_TX_BUSY seems
to add some extra overhead, and degrade the control-plane performance
a bit.
Signed-off-by: Tudor Cornea <tudor.cornea@gmail.com>
---
kernel/linux/kni/kni_net.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c
index 29e5b9e..db0330f 100644
--- a/kernel/linux/kni/kni_net.c
+++ b/kernel/linux/kni/kni_net.c
@@ -321,10 +321,9 @@ kni_net_tx(struct sk_buff *skb, struct net_device *dev)
if (kni_fifo_free_count(kni->tx_q) == 0 ||
kni_fifo_count(kni->alloc_q) == 0) {
/**
- * If no free entry in tx_q or no entry in alloc_q,
- * drops skb and goes out.
+ * Tell the caller to requeue, and retry at a later time.
*/
- goto drop;
+ goto requeue;
}
/* dequeue a mbuf from alloc_q */
@@ -371,6 +370,10 @@ kni_net_tx(struct sk_buff *skb, struct net_device *dev)
dev->stats.tx_dropped++;
return NETDEV_TX_OK;
+
+requeue:
+ /* Signal the caller to re-transmit at a later time */
+ return NETDEV_TX_BUSY;
}
/*
--
2.7.4
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] kernel/kni: retry the xmit in case ring is full
2021-12-17 15:00 [PATCH] kernel/kni: retry the xmit in case ring is full Tudor Cornea
@ 2021-12-17 16:24 ` Stephen Hemminger
2022-01-17 15:29 ` Tudor Cornea
0 siblings, 1 reply; 3+ messages in thread
From: Stephen Hemminger @ 2021-12-17 16:24 UTC (permalink / raw)
To: Tudor Cornea; +Cc: ferruh.yigit, thomas, dev
On Fri, 17 Dec 2021 17:00:32 +0200
Tudor Cornea <tudor.cornea@gmail.com> wrote:
> This patch attempts to avoid dropping packets that are to be
> transmitted, in case there is no space in the KNI ring.
>
> We have a use case in which we leverage the Linux TCP / IP stack for
> control plane, and some protocols might be sensitive to packet drops.
>
> This might mean that the sender (Kernel) might be moving at a faster pace
> than the receiver end (DPDK application), or it might have some brief
> moments of bursty traffic patterns.
>
> Requeuing the packets could add a kind of backpressure until a transmit
> window is available to us.
>
> The burden of retransmitting is shifted to the caller of ndo_start_xmit,
> which in our case is the configured queuing discipline. This way, the
> user should be able to influence the behavior w.r.t dropping packets,
> by picking the desired queuing discipline.
>
> Although it should technically be a good approach, from what
> I have tested, stopping the queue prior to returning NETDEV_TX_BUSY seems
> to add some extra overhead, and degrade the control-plane performance
> a bit.
>
> Signed-off-by: Tudor Cornea <tudor.cornea@gmail.com>
NAK
Doing this risks having a CPU lockup if userspace does not keep up
or the DPDK application gets stuck.
There are better ways to solve the TCP stack queue overrun issue:
1. Use a better queueing discipline on the kni device. The Linux default
of pfifo_fast has bufferbloat issues. Use fq_codel, fq, codel or pie?
2. KNI should implement BQL so that TCP stack can see lock backpressure
about possible queue depth.
As a simple workaround increase the KNI ring size. It won't solve the whole
problem but i tcan help
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] kernel/kni: retry the xmit in case ring is full
2021-12-17 16:24 ` Stephen Hemminger
@ 2022-01-17 15:29 ` Tudor Cornea
0 siblings, 0 replies; 3+ messages in thread
From: Tudor Cornea @ 2022-01-17 15:29 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Tudor Cornea, ferruh.yigit, thomas, dev
[-- Attachment #1: Type: text/plain, Size: 1031 bytes --]
Hi Stephen,
> NAK
> Doing this risks having a CPU lockup if userspace does not keep up
> or the DPDK application gets stuck.
>
> There are better ways to solve the TCP stack queue overrun issue:
> 1. Use a better queueing discipline on the kni device. The Linux default
> of pfifo_fast has bufferbloat issues. Use fq_codel, fq, codel or pie?
> 2. KNI should implement BQL so that TCP stack can see lock backpressure
> about possible queue depth.
>
>
Thanks for the suggestions.
I agree that we risk a lockup, in case the DPDK app gets stuck.
Indeed, I am running on an older Linux kernel, and the default queuing
discipline is pfifo_fast.
I'll experiment with the queuing disciplines you recommended.
> As a simple workaround increase the KNI ring size. It won't solve the whole
> problem but i tcan help
>
I obtained moderate success with increasing MAX_MBUF_BURST_NUM from 32 to
1024 in librte_kni.
I'm not sure if such a change would be upstreamable. Perhaps it needs a bit
of testing.
I'll drop the current patch.
[-- Attachment #2: Type: text/html, Size: 1647 bytes --]
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2022-01-17 15:30 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-17 15:00 [PATCH] kernel/kni: retry the xmit in case ring is full Tudor Cornea
2021-12-17 16:24 ` Stephen Hemminger
2022-01-17 15:29 ` Tudor Cornea
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).