DPDK patches and discussions
 help / color / mirror / Atom feed
From: Tudor Cornea <tudor.cornea@gmail.com>
To: Ferruh Yigit <ferruh.yigit@intel.com>
Cc: linville@tuxdriver.com, Thomas Monjalon <thomas@monjalon.net>,
	 Mihai Pogonaru <pogonarumihai@gmail.com>,
	dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v2] net/af_packet: fix ignoring full ring on tx
Date: Tue, 5 Oct 2021 18:11:08 +0300	[thread overview]
Message-ID: <CAOuQ8vWHLgt=8aNxoSd4XvWeYQqMuJ1Wt4iKpeVmY8tqwVp-_w@mail.gmail.com> (raw)
In-Reply-To: <CAOuQ8vW2t_HJPc-M89+FwP0Ed9kpUR82X7FPm1KjdVuTO-kpSQ@mail.gmail.com>

Hi Ferruh,

I have attempted to narrow down the issue.
I have the following bash script, which computes packet rates on an
interface.

[root@localhost ~]# cat compute-rates.sh
#!/usr/bin/env bash

if [[ ${#} -ne 2 ]]; then
    echo "Usage: ${0} <iface-name> <sleep-interval-seconds>"
    exit 1
fi

IFACE_NAME="${1}"
SLEEP_INTERVAL_SECONDS="${2}"
TMP_STATS_FILE="/tmp/netstat"

# Clear Previous stats file
echo "0 0 0 0" > "${TMP_STATS_FILE}"

echo "Press CTRL+C to exit..."

while true; do
    export "RxB=0" "RxP=0" "TxB=0" "TxP=0"

    # Extract Rx{Bytes,Packets} and Tx{Bytes,Packets} and
    # format the output. Individual fields will be exported
    export $(\
        ifconfig "${IFACE_NAME}" \
            | grep 'packets' \
            | awk '{print $5, $3}' \
            | xargs echo \
            | sed -E -e \
                "s/([0-9]+) ([0-9]+) ([0-9]+) ([0-9]+)/RxB=\1 RxP=\2
TxB=\3 TxP=\4/")

    # Print Packet and Byte Rates
    # Format: | Rx Bytes | Rx Packets | Tx Bytes | Tx Packets |

    echo "${RxB}" "${RxP}" "${TxB}" "${TxP}" $(cat "${TMP_STATS_FILE}") \
        | awk '{print "RxB="$1-$5, "RxP="$2-$6, "TxB="$3-$7, "TxP="$4-$8}'

    # Save the new values
    echo "${RxB}" "${RxP}" "${TxB}" "${TxP}" > "${TMP_STATS_FILE}"

    sleep "${SLEEP_INTERVAL_SECONDS}"

done

On the transmit side, I'm using the engine behind [1] with the af_packet
PMD.

The configuration for the af_packet PMD is the following:
--vdev=net_af_packet0,iface=eth1,blocksz=16384,framesz=8192,framecnt=2048,qpairs=1,qdisc_bypass=0

I'm configuring a Tx rate of 335 packets / second and a packet size of 300
Bytes.
These seem to be the values using which we seem to have better chances of
seeing the problem. I suspect it might also be linked with the af_packet
configuration.

I'm starting traffic using the specified configuration, and in parallel,
running the script that computes the rates as follows:
./compute-rates.sh eth1 0.1

Initially, the packet rates seem steady

RxB=0 RxP=0 TxB=10952 TxP=37
RxB=0 RxP=0 TxB=10656 TxP=36
RxB=0 RxP=0 TxB=10656 TxP=36
RxB=0 RxP=0 TxB=10656 TxP=36
RxB=0 RxP=0 TxB=10952 TxP=37
RxB=0 RxP=0 TxB=10952 TxP=37
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10952 TxP=37

[...]

After a while, we toggle the interface up / down with a sleep between the
steps. I suspect the length of the sleep might be a variable in the
equation.

ifconfig eth1 down; sleep 7; ifconfig eth1 up


What we see, is that even after the interface is toggled back up, the rates
never seem to recover.

RxB=0 RxP=0 TxB=0 TxP=0
RxB=0 RxP=0 TxB=0 TxP=0
RxB=0 RxP=0 TxB=0 TxP=0
RxB=0 RxP=0 TxB=0 TxP=0
RxB=0 RxP=0 TxB=2072 TxP=7
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=521256 TxP=1761
RxB=0 RxP=0 TxB=0 TxP=0
RxB=0 RxP=0 TxB=0 TxP=0
RxB=0 RxP=0 TxB=0 TxP=0

[...]


I've attempted to mirror the same behavior using dpdk-pktgen [2] on a
different machine (Ubuntu 20.04). This time, af_packet runs on top of
a Linux virtio_net interface.

I seem to be getting a  similar behavior. I have used the following
dpdk-pktgen configuration and run-time settings


pktgen \
    -l 1-4 \
    -n 4 \
    --proc-type=primary \
    --no-pci \
    --no-telemetry \
    --no-huge \
    -m 512 \
    --vdev=net_af_packet0,iface=eth1,blocksz=16384,framesz=8192,framecnt=2048,qpairs=1,qdisc_bypass=0
\
    -- \
    -P \
    -T \
    -m "3.0" \
    -f themes/black-yellow.theme

set 0 size 300
set 0 rate 0.008
set 0 burst 1
start 0


[1] https://github.com/open-traffic-generator/ixia-c
[2] http://code.dpdk.org/pktgen-dpdk/pktgen-20.11.2/source/INSTALL.md

On Wed, 29 Sept 2021 at 13:03, Tudor Cornea <tudor.cornea@gmail.com> wrote:

> Hi Ferruh,
>
> What you described above looks like a ring buffer with single producer and
>> single consumer, and producer overwrites the not consumed items.
>
>
> Indeed. This is also my understanding of the bug.
> I am going to try to isolate the issue, and should probably be able to
> come up with a script in a few days.
>
> Our of curiosity, are you using an modified af_packet implementation in
>> kernel
>> for above described usage?
>
>
> We are currently using an Ubuntu-based distro with a 4.15 Linux kernel.
> We don't have any kernel patches for the af_packet implementation to my
> knowledge (probably excepting patches that are back-ported by Ubuntu
> maintainers from newer releases).
>
>
> On Mon, 20 Sept 2021 at 20:44, Ferruh Yigit <ferruh.yigit@intel.com>
> wrote:
>
>> On 9/13/2021 2:45 PM, Tudor Cornea wrote:
>> > The poll call can return POLLERR which is ignored, or it can return
>> > POLLOUT, even if there are no free frames in the mmap-ed area.
>> >
>> > We can account for both of these cases by re-checking if the next
>> > frame is empty before writing into it.
>> >
>> > Signed-off-by: Mihai Pogonaru <pogonarumihai@gmail.com>
>> > Signed-off-by: Tudor Cornea <tudor.cornea@gmail.com>
>> > ---
>> >  drivers/net/af_packet/rte_eth_af_packet.c | 19 +++++++++++++++++++
>> >  1 file changed, 19 insertions(+)
>> >
>> > diff --git a/drivers/net/af_packet/rte_eth_af_packet.c
>> b/drivers/net/af_packet/rte_eth_af_packet.c
>> > index b73b211..087c196 100644
>> > --- a/drivers/net/af_packet/rte_eth_af_packet.c
>> > +++ b/drivers/net/af_packet/rte_eth_af_packet.c
>> > @@ -216,6 +216,25 @@ eth_af_packet_tx(void *queue, struct rte_mbuf
>> **bufs, uint16_t nb_pkts)
>> >                   (poll(&pfd, 1, -1) < 0))
>> >                       break;
>> >
>> > +             /*
>> > +              * Poll can return POLLERR if the interface is down
>> > +              *
>> > +              * It will almost always return POLLOUT, even if there
>> > +              * are no extra buffers available
>> > +              *
>> > +              * This happens, because packet_poll() calls
>> datagram_poll()
>> > +              * which checks the space left in the socket buffer and,
>> > +              * in the case of packet_mmap, the default socket buffer
>> length
>> > +              * doesn't match the requested size for the tx_ring.
>> > +              * As such, there is almost always space left in socket
>> buffer,
>> > +              * which doesn't seem to be correlated to the requested
>> size
>> > +              * for the tx_ring in packet_mmap.
>> > +              *
>> > +              * This results in poll() returning POLLOUT.
>> > +              */
>> > +             if (ppd->tp_status != TP_STATUS_AVAILABLE)
>> > +                     break;
>> > +
>>
>> If 'POLLOUT' doesn't indicate that there is space in the buffer, what is
>> the
>> point of the 'poll()' at all?
>>
>> What can we test/reproduce the mentioned behavior? Or is there a way to
>> fix the
>> behavior of poll() or use an alternative of it?
>>
>>
>> OK to break on the 'POLLERR', I guess it can be detected in the
>> 'pfd.revent'.
>>
>>
>> >               /* copy the tx frame data */
>> >               pbuf = (uint8_t *) ppd + TPACKET2_HDRLEN -
>> >                       sizeof(struct sockaddr_ll);
>> >
>>
>>

  reply	other threads:[~2021-10-05 15:11 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-20 13:39 [dpdk-dev] [PATCH] " Tudor Cornea
2021-09-01 16:34 ` Ferruh Yigit
2021-09-06 10:23   ` Tudor Cornea
2021-09-20 17:11     ` Ferruh Yigit
2021-09-13 13:45 ` [dpdk-dev] [PATCH v2] " Tudor Cornea
2021-09-20 17:44   ` Ferruh Yigit
2021-09-29 10:03     ` Tudor Cornea
2021-10-05 15:11       ` Tudor Cornea [this message]
2021-10-26 14:30         ` Ferruh Yigit
2021-11-02 15:24           ` Tudor Cornea
2021-11-02 15:47   ` [dpdk-dev] [PATCH v3] " Tudor Cornea
2021-11-02 16:47     ` Ferruh Yigit
2021-11-03  9:31     ` [dpdk-dev] [PATCH v4] " Tudor Cornea
2021-11-04 12:07       ` Ferruh Yigit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOuQ8vWHLgt=8aNxoSd4XvWeYQqMuJ1Wt4iKpeVmY8tqwVp-_w@mail.gmail.com' \
    --to=tudor.cornea@gmail.com \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@intel.com \
    --cc=linville@tuxdriver.com \
    --cc=pogonarumihai@gmail.com \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).