DPDK usage discussions
From: Pavel Vazharov <freakpv@gmail.com>
To: Hao Chen <earthlovepython@outlook.com>
Cc: "users@dpdk.org" <users@dpdk.org>
Subject: Re: [dpdk-users] What is TCP read performance by using DPDK?
Date: Thu, 15 Apr 2021 09:57:47 +0300	[thread overview]
Message-ID: <CAK9EM1-4GxdaVbwDfy_QBHD19y28mz58k3ZSWq_8az_PamKybg@mail.gmail.com> (raw)
In-Reply-To: <BYAPR04MB4167EB734F66A3E5672AE9A2B24D9@BYAPR04MB4167.namprd04.prod.outlook.com>

Hi,

"Does it mean your code just look at IPHeader and TCPheader without
handling TCP payload?"
The proxy works at the application layer. I mean, it works with regular BSD
sockets. As I said, we use a modified version of F-stack
(https://github.com/F-Stack/f-stack) for this. Basically, our version is very
close to the original libuinet (https://github.com/pkelsey/libuinet) but
based on a newer version of the FreeBSD networking stack (FreeBSD 11). Here
is a rough description of how it works:
1. Every thread of our application reads packets in bursts from its single
RX queue using the DPDK API.
2. These packets are then passed/injected into the FreeBSD/F-stack
networking stack. We use a separate networking stack per thread.
3. The networking stack processes the packets, queueing them in the receive
buffers of the TCP sockets. These are regular sockets.
4. Every application thread also regularly calls an epoll_wait API provided
by the F-stack library. It's just a wrapper over the kevent API provided by
FreeBSD.
5. The application gets the read/write events from the epoll_wait call and
reads/writes to the corresponding sockets. Again, this is done exactly like
in a regular Linux application where you read/write data from/to the
sockets.
6. Our test proxy application uses sockets in pairs, and all data read from
a given TCP socket are written to the corresponding TCP socket in the other
direction.
7. Data written to a given socket are put in the send buffers of this
socket and eventually sent out via the given TX queue using the DPDK API.
This happens via a callback that is provided to F-stack. The callback is
called for every single packet that needs to be sent out by F-stack, and
our application implements this callback using the DPDK functionality. In
our design the F-stack/FreeBSD stack doesn't know about DPDK; it can work
with a different packet processing framework. There is a rough sketch of
this per-thread flow just after this list.
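
To make the steps above more concrete, here is a minimal sketch of the
per-thread flow. Only rte_eth_rx_burst/rte_eth_tx_burst/rte_pktmbuf_free are
the real DPDK API; struct net_stack, stack_inject_packet(),
stack_poll_events() and stack_output_cb() are hypothetical placeholders for
the modified F-stack entry points and application logic, not the actual code.

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define BURST_SIZE 32

    /* Hypothetical hooks into the modified F-stack; not the real F-stack API. */
    struct net_stack;                                /* opaque per-thread stack */
    void stack_inject_packet(struct net_stack *s, struct rte_mbuf *m);
    void stack_poll_events(struct net_stack *s);     /* epoll_wait/kevent + relay */

    /* Per-thread loop illustrating steps 1-6. */
    static void proxy_thread_loop(uint16_t port_id, uint16_t queue_id,
                                  struct net_stack *stack)
    {
        struct rte_mbuf *pkts[BURST_SIZE];

        for (;;) {
            /* Step 1: read a burst of packets from this thread's RX queue. */
            uint16_t n = rte_eth_rx_burst(port_id, queue_id, pkts, BURST_SIZE);

            /* Step 2: inject them into the per-thread FreeBSD/F-stack
             * instance, which takes ownership of the mbufs. */
            for (uint16_t i = 0; i < n; ++i)
                stack_inject_packet(stack, pkts[i]);

            /* Steps 4-6: poll for socket events and relay data between
             * the paired sockets. */
            stack_poll_events(stack);
        }
    }

    /* Step 7: output callback registered with the stack. It is invoked for
     * every packet the stack wants to transmit; only here does the code
     * touch DPDK again. */
    static int stack_output_cb(struct rte_mbuf *pkt, uint16_t port_id,
                               uint16_t queue_id)
    {
        if (rte_eth_tx_burst(port_id, queue_id, &pkt, 1) == 0) {
            rte_pktmbuf_free(pkt); /* TX queue full: drop the packet */
            return -1;
        }
        return 0;
    }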

"Does it mean UDP-payload-size is NOT 1400 bytes (MTU size)? And it is as
smaller as 64 bytes for example?"
My personal observation is that, for the same amount of traffic, the UTP
traffic generates many more packets per second than the corresponding HTTP
traffic running over TCP. These are the two tests that we did. I can't
provide you with numbers about this at the moment, but usually there are lots
of packets smaller than the MTU size. I think they come from things like the
internal ACK packets, which seem to be sent more frequently than in TCP.
Also, the request, cancel, have, etc. messages from the BitTorrent protocol
are most of the time sent in smaller packets.

"Do you handle UTP payload, or just "relay" it like proxy?"
Our proxies always work with sockets. We have application business logic
built over the socket layer. For the test case we just proxied the data
between pairs of UTP sockets, in the same way we did it for the TCP proxy
above.
We have an implementation of the UTP protocol which provides a socket API
similar to the BSD socket API, with read/write/shutdown/close/etc. functions.
As you may have read, the UTP protocol is, kind of, a simplified version of
the TCP protocol, but more suitable for the needs of BitTorrent traffic. So
it is a reliable protocol, and this means that there is a need for socket
buffers. Our implementation is built over the UDP sockets provided by the
F-stack. The data are read from the UDP sockets and put into the buffers of
the corresponding UTP socket. Once contiguous data are collected in the
buffers, the implementation fires a notification to the application layer.
The write direction works in the opposite way: the data from the application
are first written to the buffers of the UTP socket and later sent via the
internal UDP socket from the F-stack. A rough sketch of the receive path is
shown below.
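
As an illustration of that receive path, here is a simplified sketch. The
structures and helpers (struct utp_socket, utp_rx_buffer_insert(),
parse_seq_nr(), etc.) are assumptions made for this example, and
ff_recvfrom() stands for the UDP read call of the (modified) F-stack API;
none of this is the actual implementation.

    #include <sys/types.h>
    #include <stddef.h>
    #include <stdint.h>
    #include "ff_api.h"                           /* F-stack socket wrappers */

    /* Hypothetical types/helpers, used only for illustration. */
    struct utp_rx_buffer;                         /* reassembly buffer keyed by seq_nr */
    struct utp_socket {
        int fstack_udp_fd;                        /* underlying F-stack UDP socket */
        struct utp_rx_buffer *rx;                 /* receive-side buffer */
        void (*on_readable)(struct utp_socket *); /* notification to the app layer */
    };
    void   utp_rx_buffer_insert(struct utp_rx_buffer *b, uint16_t seq_nr,
                                const void *data, size_t len);
    size_t utp_rx_buffer_contiguous(const struct utp_rx_buffer *b);
    uint16_t parse_seq_nr(const unsigned char *pkt, size_t len);
    size_t   utp_header_len(const unsigned char *pkt, size_t len);

    /* Called when the underlying UDP socket becomes readable. */
    static void utp_on_udp_readable(struct utp_socket *s)
    {
        unsigned char dgram[2048];
        ssize_t n;

        /* Each UDP datagram carries one UTP packet: a small header followed
         * by the payload. Drain everything that is currently queued. */
        while ((n = ff_recvfrom(s->fstack_udp_fd, dgram, sizeof(dgram), 0,
                                NULL, NULL)) > 0) {
            uint16_t seq = parse_seq_nr(dgram, (size_t)n);
            size_t   off = utp_header_len(dgram, (size_t)n);

            /* Store the payload at its sequence position; out-of-order data
             * waits in the buffer until the gap before it is filled. */
            utp_rx_buffer_insert(s->rx, seq, dgram + off, (size_t)n - off);
        }

        /* Notify the application only when contiguous, in-order data is
         * available at the head of the buffer. */
        if (utp_rx_buffer_contiguous(s->rx) > 0)
            s->on_readable(s);
    }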

So, to summarize the above: we handle the TCP/UDP payload using the regular
BSD socket API provided by the F-stack library and by our UTP stack library.
For the test we just relayed the data between a few thousand pairs of
sockets, roughly as sketched below. Currently we do much more complex
manipulation of this data, but this is still a work in progress and the
final performance hasn't been tested yet.
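
For completeness, here is what one iteration of such a socket-pair relay
might look like against F-stack's epoll-style wrappers. peer_fd_of() is a
hypothetical lookup of the paired socket; error handling, partial writes,
and back-pressure between the two sockets are omitted for brevity.

    #include <sys/types.h>
    #include <sys/epoll.h>
    #include "ff_api.h"             /* ff_epoll_wait, ff_read, ff_write */

    #define MAX_EVENTS 256

    int peer_fd_of(int fd);         /* hypothetical: the socket paired with fd */

    /* One relay iteration: drain the readable sockets and forward the data
     * to their peers. Real code must handle partial writes and EAGAIN. */
    static void relay_once(int epfd)
    {
        struct epoll_event ev[MAX_EVENTS];
        char buf[16 * 1024];

        int n = ff_epoll_wait(epfd, ev, MAX_EVENTS, 0 /* poll, don't block */);
        for (int i = 0; i < n; ++i) {
            if (!(ev[i].events & EPOLLIN))
                continue;

            int src = ev[i].data.fd;
            int dst = peer_fd_of(src);

            ssize_t r;
            while ((r = ff_read(src, buf, sizeof(buf))) > 0)
                ff_write(dst, buf, (size_t)r); /* assumes the full write succeeds */
        }
    }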

Hope the above explanations help.
Pavel.

Thread overview: 6+ messages
2021-03-23 23:06 Hao Chen
2021-03-29  8:38 ` Pavel Vazharov
2021-04-15  5:59   ` Hao Chen
2021-04-15  6:57     ` Pavel Vazharov [this message]
2021-04-15 16:03       ` Hao Chen
2021-04-16  6:00         ` Pavel Vazharov
