DPDK usage discussions
From: Tom Barbette <tom.barbette@uclouvain.be>
To: "Kompella V, Purnima" <Kompella.Purnima@commscope.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>,
	Thea Corinne Rossman <thea.rossman@cs.stanford.edu>,
	"users@dpdk.org" <users@dpdk.org>
Subject: Re: Containernet (Docker/Container Networking) with DPDK?
Date: Wed, 20 Nov 2024 09:27:02 +0000	[thread overview]
Message-ID: <80E33329-12A0-418B-B2F1-CB85E2C2388B@uclouvain.be> (raw)
In-Reply-To: <DM6PR14MB35970935B48DC49870F673F99C212@DM6PR14MB3597.namprd14.prod.outlook.com>


Hi Stephen, Thea,

If you use SR-IOV, then containers behave essentially like VMs: packets are exchanged through the PCIe bus and switched on the NIC ASIC, which, as Stephen mentions, identifies the MAC addresses as "itself", so packets never physically leave the NIC. I'd argue these days that's not as much of a problem: a PCIe 5.0 x16 ConnectX-7 has a bus bandwidth of around 500 Gbps but only one or two 100G ports, so you've got plenty of spare bandwidth for intra-host exchange. NICs are getting smarter and taking on a broader role than pure "external" I/O.

Intra-host networking without going through PCIe can be handled like VMs too: with virtio and the DPDK vhost driver. Memory copies are involved in that case.
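To make the virtio/vhost option concrete, here is a hedged sketch of wiring two DPDK processes together over a vhost-user socket using dpdk-testpmd. The socket path, core lists, and file prefix are illustrative, not taken from the thread; in a container setup the socket and hugepage directory would be shared into the container as volumes.

```shell
# Sketch: intra-host packet exchange via vhost-user/virtio-user, no SR-IOV.
# Paths and core assignments are illustrative.

# Process 1 (e.g. on the host): testpmd exposing a vhost-user port
dpdk-testpmd -l 0-1 --no-pci \
  --vdev 'net_vhost0,iface=/tmp/vhost.sock' \
  -- -i

# Process 2 (e.g. in a container that mounts /tmp/vhost.sock and hugepages):
# a virtio-user endpoint connecting to the same socket
dpdk-testpmd -l 2-3 --no-pci --file-prefix=virtio \
  --vdev 'net_virtio_user0,path=/tmp/vhost.sock' \
  -- -i
```

Packets sent from one testpmd then reach the other through shared memory, with the copies mentioned above, rather than over the PCIe bus.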

I suspect for your matter at hand, Thea, the easiest option is SR-IOV. Research-wise, a simple solution is to use --network=host …

E.g. this works well, but uses a privileged container and lets Docker access the entire host network, for FastClick:
sudo docker run -v /mnt/huge:/dev/hugepages -it --privileged --network host tbarbette/fastclick-dpdk:generic --dpdk -a $VF_PCIE_ADDR -- -e "FromDPDKDevice(0) -> Discard;"

The related sample Dockerfile can be found at: https://github.com/tbarbette/fastclick/blob/main/etc/Dockerfile.dpdk

Another problem with DPDK-based Docker images is that you generally don't want to keep -march=native, so I personally use this script to build versions of my Docker image for many architectures: https://github.com/tbarbette/fastclick/blob/main/etc/docker-build.sh so users can pick the image that targets their own arch.
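The linked docker-build.sh is the authoritative version; the general idea can be sketched as a loop building one image per CPU tuning target, so no single image bakes in -march=native. The target list, build-arg name, and repository tag below are hypothetical, not taken from that script.

```shell
# Sketch: build one image variant per CPU target instead of -march=native.
# MARCH build-arg and tag names are illustrative; the real script may differ.
for march in generic haswell skylake icelake; do
  docker build \
    --build-arg MARCH="$march" \
    -t myrepo/fastclick-dpdk:"$march" \
    -f etc/Dockerfile.dpdk .
done
```

Users then pull the tag matching their own microarchitecture (e.g. `:generic` as a safe fallback).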


Hope that helps,

Tom

On 20 Nov 2024, at 08:10, Kompella V, Purnima <Kompella.Purnima@commscope.com> wrote:

Hi Stephen,

A parallel question about packet flow between VFs of the same PF, when the VFs are assigned to different containers on the same host server:
Create 2 SR-IOV VFs of a PF in the host and assign them to 2 containers (one VF per container).
Send an IP packet from container-1 to container-2 (SRC_MAC address in this Ethernet frame = container-1 VF's MAC address, DST_MAC address = container-2 VF's MAC address).
Container-1 sends the packet by calling rte_eth_tx_burst().
Container-2 is polling for packets from its VF by calling rte_eth_rx_burst().

Will the packet in the above scenario leave the host server, go to the switch, and then come back to the same host machine to enter container-2?
Or is the SR-IOV-capable PF NIC smart enough to identify that the SRC_MAC and DST_MAC of the Ethernet frame belong to its own VFs, and hence switch the packet locally within the NIC (so the packet doesn't reach the switch at all)?
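The two sides of the scenario above can be sketched in DPDK ethdev API terms. This is a minimal, hedged sketch only: it assumes the VF port is already configured and started in each container, uses port 0 / queue 0 arbitrarily, and omits mbuf allocation, error handling, and EAL init.

```c
/* Sketch of the tx/rx sides of the VF-to-VF scenario (per container,
 * the VF shows up as port 0 here by assumption; setup code omitted). */
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST 32

/* container-1 side: transmit a burst of prepared packets on its VF */
static void send_burst(struct rte_mbuf **pkts, uint16_t n)
{
    uint16_t sent = rte_eth_tx_burst(0 /* port */, 0 /* queue */, pkts, n);
    /* free any packets the NIC did not accept */
    for (uint16_t i = sent; i < n; i++)
        rte_pktmbuf_free(pkts[i]);
}

/* container-2 side: poll its VF for received packets */
static void poll_loop(void)
{
    struct rte_mbuf *rx[BURST];
    for (;;) {
        uint16_t nb = rte_eth_rx_burst(0 /* port */, 0 /* queue */, rx, BURST);
        for (uint16_t i = 0; i < nb; i++)
            rte_pktmbuf_free(rx[i]); /* process the packet, then free it */
    }
}
```

Whether the burst crosses the external switch or stays on the NIC's internal switch is invisible at this API level; the answer depends on the NIC, as discussed above.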

Regards,
Purnima


From: Stephen Hemminger <stephen@networkplumber.org>
Sent: Wednesday, November 20, 2024 3:34 AM
To: Thea Corinne Rossman <thea.rossman@cs.stanford.edu>
Cc: users@dpdk.org
Subject: Re: Containernet (Docker/Container Networking) with DPDK?


On Tue, 19 Nov 2024 13:39:38 -0800
Thea Corinne Rossman <thea.rossman@cs.stanford.edu> wrote:

> This is SO helpful -- thank you so much.
>
> One follow-up question regarding NICs: can multiple containers on the same
> host share the same PCI device? If I have a host NIC with (say) VFIO driver
> binding, do I have to split it with some kind of SR-IOV so that each
> container has its own "NIC" binding? Or, when running DPDK's "devbind"
> script, can I set up each one with the same PCI address?


Totally depends on what container system you are using.
If you have two containers sharing the same exact PCI device, chaos would ensue.
You might be able to make two VFs on the host and pass one to each container;
that would make more sense.
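The two-VF approach Stephen suggests can be sketched as follows. The PCI addresses, VF count, and VFIO group numbers are illustrative and depend entirely on the host; the sysfs `sriov_numvfs` interface and the dpdk-devbind.py script are the standard mechanisms, but check your NIC's documentation.

```shell
# Sketch: create two VFs on a PF and bind them for DPDK use in containers.
# PCI addresses and VFIO group numbers are illustrative.
PF=0000:03:00.0

# create two VFs on the PF via sysfs
echo 2 | sudo tee /sys/bus/pci/devices/$PF/sriov_numvfs

# inspect the new VFs, then bind them to vfio-pci
dpdk-devbind.py --status
sudo dpdk-devbind.py -b vfio-pci 0000:03:00.2 0000:03:00.3

# hand one VF's VFIO group to each container, e.g.:
#   docker run --device /dev/vfio/<group-of-VF1> ... container1
#   docker run --device /dev/vfio/<group-of-VF2> ... container2
```

Each container then sees exactly one VF, avoiding the shared-device chaos described above.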



Thread overview: 13+ messages
2024-11-18  5:42 Thea Corinne Rossman
2024-11-19 20:53 ` Thea Corinne Rossman
2024-11-19 21:29   ` Stephen Hemminger
2024-11-19 21:39     ` Thea Corinne Rossman
2024-11-19 22:03       ` Stephen Hemminger
2024-11-20  7:10         ` Kompella V, Purnima
2024-11-20  9:27           ` Tom Barbette [this message]
2024-11-20  9:28             ` Tom Barbette
2024-11-20  9:28               ` Tom Barbette
2024-11-20 19:49               ` Thea Corinne Rossman
2024-11-19 22:14       ` Thomas Monjalon
2024-11-19 23:23         ` Thea Corinne Rossman
2024-11-19 23:30           ` Thomas Monjalon
