From: Tudor Cornea <tudor.cornea@gmail.com>
To: Ferruh Yigit <ferruh.yigit@intel.com>
Cc: linville@tuxdriver.com, Thomas Monjalon <thomas@monjalon.net>,
Mihai Pogonaru <pogonarumihai@gmail.com>,
dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v2] net/af_packet: fix ignoring full ring on tx
Date: Tue, 5 Oct 2021 18:11:08 +0300
Message-ID: <CAOuQ8vWHLgt=8aNxoSd4XvWeYQqMuJ1Wt4iKpeVmY8tqwVp-_w@mail.gmail.com>
In-Reply-To: <CAOuQ8vW2t_HJPc-M89+FwP0Ed9kpUR82X7FPm1KjdVuTO-kpSQ@mail.gmail.com>
Hi Ferruh,
I have attempted to narrow down the issue.
I have the following bash script, which computes packet rates on an
interface.
[root@localhost ~]# cat compute-rates.sh
#!/usr/bin/env bash
if [[ ${#} -ne 2 ]]; then
    echo "Usage: ${0} <iface-name> <sleep-interval-seconds>"
    exit 1
fi
IFACE_NAME="${1}"
SLEEP_INTERVAL_SECONDS="${2}"
TMP_STATS_FILE="/tmp/netstat"
# Clear Previous stats file
echo "0 0 0 0" > "${TMP_STATS_FILE}"
echo "Press CTRL+C to exit..."
while true; do
    export "RxB=0" "RxP=0" "TxB=0" "TxP=0"
    # Extract Rx{Bytes,Packets} and Tx{Bytes,Packets} and
    # format the output. Individual fields will be exported
    export $(\
        ifconfig "${IFACE_NAME}" \
        | grep 'packets' \
        | awk '{print $5, $3}' \
        | xargs echo \
        | sed -E -e \
            "s/([0-9]+) ([0-9]+) ([0-9]+) ([0-9]+)/RxB=\1 RxP=\2 TxB=\3 TxP=\4/")
    # Print Packet and Byte Rates
    # Format: | Rx Bytes | Rx Packets | Tx Bytes | Tx Packets |
    echo "${RxB}" "${RxP}" "${TxB}" "${TxP}" $(cat "${TMP_STATS_FILE}") \
        | awk '{print "RxB="$1-$5, "RxP="$2-$6, "TxB="$3-$7, "TxP="$4-$8}'
    # Save the new values
    echo "${RxB}" "${RxP}" "${TxB}" "${TxP}" > "${TMP_STATS_FILE}"
    sleep "${SLEEP_INTERVAL_SECONDS}"
done
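For systems where ifconfig is not installed or prints a different layout, the same four counters can be read from sysfs. The following is only a sketch of that variant, not part of the script used for the measurements in this message. The function names and the STATS_ROOT override are hypothetical and exist only to keep the example self-contained.

```shell
#!/usr/bin/env bash
# Sketch only: read RxB RxP TxB TxP from /sys/class/net/<iface>/statistics
# instead of parsing ifconfig output.
# STATS_ROOT is a hypothetical override (not a kernel or DPDK setting).
read_counters() {
    local iface="$1"
    local dir="${STATS_ROOT:-/sys/class/net}/${iface}/statistics"
    echo "$(cat "${dir}/rx_bytes") $(cat "${dir}/rx_packets")" \
         "$(cat "${dir}/tx_bytes") $(cat "${dir}/tx_packets")"
}

# Print per-interval deltas in the same format as compute-rates.sh.
# Arguments: current RxB RxP TxB TxP, followed by previous RxB RxP TxB TxP.
print_deltas() {
    echo "$@" \
        | awk '{print "RxB="$1-$5, "RxP="$2-$6, "TxB="$3-$7, "TxP="$4-$8}'
}

# Possible use inside a polling loop (prev holds the previous sample):
#   cur="$(read_counters eth1)"; print_deltas ${cur} ${prev}; prev="${cur}"
```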
On the transmit side, I'm using the engine behind [1] with the af_packet
PMD.
The configuration for the af_packet PMD is the following:
--vdev=net_af_packet0,iface=eth1,blocksz=16384,framesz=8192,framecnt=2048,qpairs=1,qdisc_bypass=0
I'm configuring a Tx rate of 335 packets / second and a packet size of 300
Bytes.
These are the values with which we seem most likely to reproduce the
problem. I suspect it might also be linked to the af_packet
configuration.
I'm starting traffic using the specified configuration, and in parallel,
running the script that computes the rates as follows:
./compute-rates.sh eth1 0.1
Initially, the packet rates seem steady:
RxB=0 RxP=0 TxB=10952 TxP=37
RxB=0 RxP=0 TxB=10656 TxP=36
RxB=0 RxP=0 TxB=10656 TxP=36
RxB=0 RxP=0 TxB=10656 TxP=36
RxB=0 RxP=0 TxB=10952 TxP=37
RxB=0 RxP=0 TxB=10952 TxP=37
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10952 TxP=37
[...]
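As a sanity check on the samples above, a minimal sketch (the helper names are hypothetical): every sample works out to 296 bytes per packet, and 335 packets / second over a 0.1 s sleep gives about 33.5 packets per interval. The observed 35 to 37 is slightly higher because each loop iteration also spends time running the ifconfig pipeline, so the real interval is somewhat longer than the sleep. The 4-byte gap to the configured 300 bytes may be the Ethernet FCS, but that is an assumption that has not been verified here.

```shell
#!/usr/bin/env bash
# Sketch only: arithmetic checks on compute-rates.sh samples.

# Average bytes per packet for one sample: TxB / TxP (integer division).
bytes_per_packet() {
    echo $(( $1 / $2 ))
}

# Expected packets per interval: rate (packets/s) * interval (s).
expected_packets() {
    awk -v rate="$1" -v interval="$2" 'BEGIN { printf "%.1f\n", rate * interval }'
}

bytes_per_packet 10952 37    # sample with TxP=37
bytes_per_packet 10360 35    # sample with TxP=35
expected_packets 335 0.1     # configured rate, configured sleep
```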
After a while, we toggle the interface up / down with a sleep between the
steps. I suspect the length of the sleep might be a variable in the
equation.
ifconfig eth1 down; sleep 7; ifconfig eth1 up
What we see is that, even after the interface is toggled back up, the rates
never seem to recover.
RxB=0 RxP=0 TxB=0 TxP=0
RxB=0 RxP=0 TxB=0 TxP=0
RxB=0 RxP=0 TxB=0 TxP=0
RxB=0 RxP=0 TxB=0 TxP=0
RxB=0 RxP=0 TxB=2072 TxP=7
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=521256 TxP=1761
RxB=0 RxP=0 TxB=0 TxP=0
RxB=0 RxP=0 TxB=0 TxP=0
RxB=0 RxP=0 TxB=0 TxP=0
[...]
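To make the stall and the burst in the output above easier to spot in a long capture, the lines can be labelled automatically. This is only a sketch: the function name and the "more than twice the nominal count" burst threshold are arbitrary assumptions, not something derived from the driver.

```shell
#!/usr/bin/env bash
# Sketch only: label each compute-rates.sh output line by its TxP value.
# $1 is the nominal packet count per interval (e.g. 35 for the run above).
classify_tx() {
    awk -v nominal="$1" '/TxP=[0-9]+/ {
        split($4, kv, "=")              # $4 looks like "TxP=35"
        txp = kv[2] + 0
        if (txp == 0)
            state = "STALL"
        else if (txp > 2 * nominal)
            state = "BURST"
        else
            state = "OK"
        print state, $0
    }'
}

# Possible use:
#   ./compute-rates.sh eth1 0.1 | classify_tx 35
```

Lines that do not contain a TxP field, such as the "Press CTRL+C to exit..." banner, are skipped.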
I've attempted to reproduce the same behavior using dpdk-pktgen [2] on a
different machine (Ubuntu 20.04). This time, af_packet runs on top of
a Linux virtio_net interface.
I seem to be getting similar behavior. I have used the following
dpdk-pktgen configuration and run-time settings:
pktgen \
-l 1-4 \
-n 4 \
--proc-type=primary \
--no-pci \
--no-telemetry \
--no-huge \
-m 512 \
--vdev=net_af_packet0,iface=eth1,blocksz=16384,framesz=8192,framecnt=2048,qpairs=1,qdisc_bypass=0 \
-- \
-P \
-T \
-m "3.0" \
-f themes/black-yellow.theme
set 0 size 300
set 0 rate 0.008
set 0 burst 1
start 0
[1] https://github.com/open-traffic-generator/ixia-c
[2] http://code.dpdk.org/pktgen-dpdk/pktgen-20.11.2/source/INSTALL.md
On Wed, 29 Sept 2021 at 13:03, Tudor Cornea <tudor.cornea@gmail.com> wrote:
> Hi Ferruh,
>
>> What you described above looks like a ring buffer with single producer and
>> single consumer, and producer overwrites the not consumed items.
>
>
> Indeed. This is also my understanding of the bug.
> I am going to try to isolate the issue, and should probably be able to
> come up with a script in a few days.
>
>> Out of curiosity, are you using a modified af_packet implementation in
>> kernel for above described usage?
>
>
> We are currently using an Ubuntu-based distro with a 4.15 Linux kernel.
> We don't have any kernel patches for the af_packet implementation to my
> knowledge (probably excepting patches that are back-ported by Ubuntu
> maintainers from newer releases).
>
>
> On Mon, 20 Sept 2021 at 20:44, Ferruh Yigit <ferruh.yigit@intel.com>
> wrote:
>
>> On 9/13/2021 2:45 PM, Tudor Cornea wrote:
>> > The poll call can return POLLERR which is ignored, or it can return
>> > POLLOUT, even if there are no free frames in the mmap-ed area.
>> >
>> > We can account for both of these cases by re-checking if the next
>> > frame is empty before writing into it.
>> >
>> > Signed-off-by: Mihai Pogonaru <pogonarumihai@gmail.com>
>> > Signed-off-by: Tudor Cornea <tudor.cornea@gmail.com>
>> > ---
>> > drivers/net/af_packet/rte_eth_af_packet.c | 19 +++++++++++++++++++
>> > 1 file changed, 19 insertions(+)
>> >
>> > diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
>> > index b73b211..087c196 100644
>> > --- a/drivers/net/af_packet/rte_eth_af_packet.c
>> > +++ b/drivers/net/af_packet/rte_eth_af_packet.c
>> > @@ -216,6 +216,25 @@ eth_af_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>> > (poll(&pfd, 1, -1) < 0))
>> > break;
>> >
>> > + /*
>> > + * Poll can return POLLERR if the interface is down
>> > + *
>> > + * It will almost always return POLLOUT, even if there
>> > + * are no extra buffers available
>> > + *
>> > + * This happens, because packet_poll() calls datagram_poll()
>> > + * which checks the space left in the socket buffer and,
>> > + * in the case of packet_mmap, the default socket buffer length
>> > + * doesn't match the requested size for the tx_ring.
>> > + * As such, there is almost always space left in socket buffer,
>> > + * which doesn't seem to be correlated to the requested size
>> > + * for the tx_ring in packet_mmap.
>> > + *
>> > + * This results in poll() returning POLLOUT.
>> > + */
>> > + if (ppd->tp_status != TP_STATUS_AVAILABLE)
>> > + break;
>> > +
>>
>> If 'POLLOUT' doesn't indicate that there is space in the buffer, what is
>> the point of the 'poll()' at all?
>>
>> How can we test/reproduce the mentioned behavior? Or is there a way to
>> fix the behavior of poll() or use an alternative of it?
>>
>>
>> OK to break on the 'POLLERR', I guess it can be detected in the
>> 'pfd.revents'.
>>
>>
>> > /* copy the tx frame data */
>> > pbuf = (uint8_t *) ppd + TPACKET2_HDRLEN -
>> > sizeof(struct sockaddr_ll);
>> >
>>
>>