DPDK patches and discussions
From: Chaoyong He <chaoyong.he@corigine.com>
To: Ferruh Yigit <ferruh.yigit@xilinx.com>,
	"dev@dpdk.org" <dev@dpdk.org>, Ian Stokes <ian.stokes@intel.com>,
	David Marchand <david.marchand@redhat.com>
Cc: oss-drivers <oss-drivers@corigine.com>,
	Niklas Soderlund <niklas.soderlund@corigine.com>,
	Thomas Monjalon <thomas@monjalon.net>,
	Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>,
	Jerin Jacob Kollanukkaran <jerinj@marvell.com>
Subject: RE: [PATCH v9 05/12] net/nfp: add flower PF setup logic
Date: Wed, 21 Sep 2022 02:50:38 +0000
Message-ID: <SJ0PR13MB55452B2239840826758AC73A9E4F9@SJ0PR13MB5545.namprd13.prod.outlook.com>
In-Reply-To: <cea1de8b-37ef-f098-16ec-fbd604094c18@xilinx.com>

> On 9/15/2022 11:44 AM, Chaoyong He wrote:
> > Adds the vNIC initialization logic for the flower PF vNIC. The flower
> > firmware application exposes this vNIC for the purposes of fallback
> > traffic in the switchdev use-case.
> >
> > Adds minimal dev_ops for this PF vNIC device. Because the device is
> > being exposed externally to DPDK it needs to implement a minimal set
> > of dev_ops.
> >
> > Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
> > Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
> 
> <...>
> 
> > +
> > +struct dp_packet {
> > +	struct rte_mbuf mbuf;
> > +	uint32_t source;
> > +};
> > +
> > +static void
> > +nfp_flower_pf_mp_init(__rte_unused struct rte_mempool *mp,
> > +		__rte_unused void *opaque_arg,
> > +		void *packet,
> > +		__rte_unused unsigned int i)
> > +{
> > +	struct dp_packet *pkt = packet;
> > +	/* Indicate that this pkt is from DPDK */
> > +	pkt->source = 3;
> > +}
> > +
> > +static struct rte_mempool *
> > +nfp_flower_pf_mp_create(void)
> > +{
> > +	uint32_t nb_mbufs;
> > +	unsigned int numa_node;
> > +	struct rte_mempool *pktmbuf_pool;
> > +	uint32_t n_rxd = PF_VNIC_NB_DESC;
> > +	uint32_t n_txd = PF_VNIC_NB_DESC;
> > +
> > +	nb_mbufs = RTE_MAX(n_rxd + n_txd + MAX_PKT_BURST + MEMPOOL_CACHE_SIZE, 81920U);
> > +
> > +	numa_node = rte_socket_id();
> > +	pktmbuf_pool = rte_pktmbuf_pool_create("flower_pf_mbuf_pool", nb_mbufs,
> > +			MEMPOOL_CACHE_SIZE, MBUF_PRIV_SIZE,
> > +			RTE_MBUF_DEFAULT_BUF_SIZE, numa_node);
> > +	if (pktmbuf_pool == NULL) {
> > +		PMD_INIT_LOG(ERR, "Cannot init pf vnic mbuf pool");
> > +		return NULL;
> > +	}
> > +
> > +	rte_mempool_obj_iter(pktmbuf_pool, nfp_flower_pf_mp_init, NULL);
> > +
> > +	return pktmbuf_pool;
> > +}
> > +
> 
> Hi Chaoyong,
> 
> Again, a similar comment to previous versions: what I understand is that this
> new flower FW supports HW flow filtering and the intended use case is OvS HW
> acceleration.
> But does the DPDK driver need to know OvS data structures, like "struct
> dp_packet"? Can it be transparent to the application? I am sure there are
> other devices offloading some OvS tasks to HW.
> 
> @Ian, @David,
> 
> Can you please comment on the above usage? Do you see any way to avoid
> OvS-specific code in the driver?

First, I'll explain why we must include some OvS-specific code in the driver.
If we don't set `pkt->source = 3`, OvS will core dump like this:
```
(gdb) bt
#0  0x00007fe1d48fd387 in raise () from /lib64/libc.so.6
#1  0x00007fe1d48fea78 in abort () from /lib64/libc.so.6
#2  0x00007fe1d493ff67 in __libc_message () from /lib64/libc.so.6
#3  0x00007fe1d4948329 in _int_free () from /lib64/libc.so.6
#4  0x000000000049c006 in dp_packet_uninit (b=0x1f262db80) at lib/dp-packet.c:135
#5  0x000000000061440a in dp_packet_delete (b=0x1f262db80) at lib/dp-packet.h:261
#6  0x0000000000619aa0 in dpdk_copy_batch_to_mbuf (netdev=0x1f0a04a80, batch=0x7fe1b40050c0) at lib/netdev-dpdk.c:274
#7  0x0000000000619b46 in netdev_dpdk_common_send (netdev=0x1f0a04a80, batch=0x7fe1b40050c0, stats=0x7fe1be7321f0) at
#8  0x000000000061a0ba in netdev_dpdk_eth_send (netdev=0x1f0a04a80, qid=0, batch=0x7fe1b40050c0, concurrent_txq=true)
#9  0x00000000004fbd10 in netdev_send (netdev=0x1f0a04a80, qid=0, batch=0x7fe1b40050c0, concurrent_txq=true) at lib/n
#10 0x00000000004aa663 in dp_netdev_pmd_flush_output_on_port (pmd=0x7fe1be735010, p=0x7fe1b4005090) at lib/dpif-netde
#11 0x00000000004aa85d in dp_netdev_pmd_flush_output_packets (pmd=0x7fe1be735010, force=false) at lib/dpif-netdev.c:5
#12 0x00000000004aaaef in dp_netdev_process_rxq_port (pmd=0x7fe1be735010, rxq=0x16f3f80, port_no=3) at lib/dpif-netde
#13 0x00000000004af17a in pmd_thread_main (f_=0x7fe1be735010) at lib/dpif-netdev.c:6958
#14 0x000000000057da80 in ovsthread_wrapper (aux_=0x1608b30) at lib/ovs-thread.c:422
#15 0x00007fe1d51a6ea5 in start_thread () from /lib64/libpthread.so.0
#16 0x00007fe1d49c5b0d in clone () from /lib64/libc.so.6
```
The logic in `dp_packet_delete()` runs into the wrong branch: the mbuf actually comes from a DPDK mempool, but OvS frees it with `free()` as if it had been obtained via `malloc()`, which is what aborts in `_int_free()` above.
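
For reference, here is a simplified sketch of the relevant OvS logic, paraphrased from OvS `lib/dp-packet.h` (the exact code differs between OvS versions, and `struct dp_packet`, `free_dpdk_buf()` and `dp_packet_uninit()` are OvS internals, not redeclared here). It also shows why the driver hard-codes `3`: that value corresponds to `DPBUF_DPDK`, without having to include OvS headers:
```
/* Paraphrased from OvS lib/dp-packet.h -- illustrative only, not the
 * driver's code. */
enum dp_packet_source {
	DPBUF_MALLOC,	/* 0: buffer obtained via malloc() */
	DPBUF_STACK,	/* 1: un-movable stack or static buffer */
	DPBUF_STUB,	/* 2: starts on the stack, may expand into the heap */
	DPBUF_DPDK,	/* 3: buffer is an rte_mbuf from a DPDK mempool */
};

static inline void
dp_packet_delete(struct dp_packet *b)
{
	if (b == NULL)
		return;

	if (b->source == DPBUF_DPDK) {
		/* Return the underlying rte_mbuf to its mempool. */
		free_dpdk_buf(b);
		return;
	}

	/*
	 * Otherwise OvS assumes the buffer was obtained via malloc() and
	 * calls free() on it -- doing that on an mbuf taken from a DPDK
	 * mempool is what triggers the abort in the backtrace above.
	 */
	dp_packet_uninit(b);
	free(b);
}
```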

Then, why does only our PMD need to do this while other PMDs don't?
It largely comes down to the hardware.

The target of this patch series is Netronome's Network Flow Processor 4xxx (NFP-4xxx) card,
which has only one PF but two physical ports, and the NFP PMD can work with up to 8 ports on the same PF device.
Other PMDs' hardware seems to follow a 'one PF <--> one physical port' model.

For the OvS use case, we should add the representor port of the 'physical port' to the bridge, not the representor port of the PF as other PMDs do.

We use a two-layer poll mode architecture (other PMDs use a simple poll mode architecture); see the sketch after the TX steps below.
In the RX direction:
1. When a physical port or VF receives packets, the firmware prepends meta-data (indicating the input port) to each packet.
2. We use the PF vNIC as a multiplexer, which keeps polling packets from the firmware.
3. The PF vNIC parses the meta-data and enqueues each packet into the rte_ring of the corresponding representor port (of the physical port or VF).
4. OvS polls packets from the RX function of the representor port, which dequeues them from that rte_ring.
In the TX direction:
1. OvS sends packets through the TX function of the representor port.
2. The representor port prepends meta-data (indicating the output port) to each packet and sends it to the firmware through queue 0 of the PF vNIC.
3. The firmware parses the meta-data and forwards the packet to the corresponding physical port or VF.
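
Below is a minimal sketch of the RX-side multiplexing under these assumptions; `repr_rings[]` and `nfp_flower_parse_metadata()` are illustrative names, not the actual driver code:
```
#include <stdint.h>

#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <rte_ring.h>

#define PF_VNIC_BURST 32

/*
 * Hypothetical helpers -- not the real driver code:
 *  - repr_rings[] maps the port id carried in the firmware meta-data to
 *    the rte_ring owned by the matching representor port.
 *  - nfp_flower_parse_metadata() strips the prepended meta-data and
 *    returns that port id.
 */
extern struct rte_ring *repr_rings[];
extern uint16_t nfp_flower_parse_metadata(struct rte_mbuf *mbuf);

/* One iteration of the PF vNIC "multiplexer": poll packets coming up from
 * the firmware, then demultiplex each one onto its representor's ring. */
static void
pf_vnic_demux_poll(uint16_t pf_port_id)
{
	struct rte_mbuf *pkts[PF_VNIC_BURST];
	uint16_t nb_rx;
	uint16_t i;

	nb_rx = rte_eth_rx_burst(pf_port_id, 0, pkts, PF_VNIC_BURST);
	for (i = 0; i < nb_rx; i++) {
		uint16_t in_port = nfp_flower_parse_metadata(pkts[i]);

		/* The representor's RX function later dequeues from this
		 * ring when OvS polls it. Drop the packet if the ring is
		 * full. */
		if (rte_ring_enqueue(repr_rings[in_port], pkts[i]) != 0)
			rte_pktmbuf_free(pkts[i]);
	}
}
```
The TX path is the mirror image: the representor's TX function prepends the output-port meta-data and transmits the packet through queue 0 of the PF vNIC.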

So OvS won't create the mempool for us; we must create it ourselves for the PF vNIC to use (which is what `nfp_flower_pf_mp_create()` above does).

Hopefully I have explained things clearly. Thanks.


Thread overview: 22+ messages
2022-09-15 10:44 [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
2022-09-15 10:44 ` [PATCH v9 01/12] net/nfp: move app specific attributes to own struct Chaoyong He
2022-09-15 10:44 ` [PATCH v9 02/12] net/nfp: simplify initialization and remove dead code Chaoyong He
2022-09-15 10:44 ` [PATCH v9 03/12] net/nfp: move app specific init logic to own function Chaoyong He
2022-09-15 10:44 ` [PATCH v9 04/12] net/nfp: add initial flower firmware support Chaoyong He
2022-09-15 10:44 ` [PATCH v9 05/12] net/nfp: add flower PF setup logic Chaoyong He
2022-09-20 14:57   ` Ferruh Yigit
2022-09-21  2:50     ` Chaoyong He [this message]
2022-09-21  7:35       ` Thomas Monjalon
2022-09-21  7:47         ` Chaoyong He
2022-09-15 10:44 ` [PATCH v9 06/12] net/nfp: add flower PF related routines Chaoyong He
2022-09-15 10:44 ` [PATCH v9 07/12] net/nfp: add flower ctrl VNIC related logics Chaoyong He
2022-09-20 14:56   ` Ferruh Yigit
2022-09-21  2:02     ` Chaoyong He
2022-09-21  7:29       ` Thomas Monjalon
2022-09-21  7:42         ` Chaoyong He
2022-09-15 10:44 ` [PATCH v9 08/12] net/nfp: move common rxtx function for flower use Chaoyong He
2022-09-15 10:44 ` [PATCH v9 09/12] net/nfp: add flower ctrl VNIC rxtx logic Chaoyong He
2022-09-15 10:44 ` [PATCH v9 10/12] net/nfp: add flower representor framework Chaoyong He
2022-09-15 10:44 ` [PATCH v9 11/12] net/nfp: move rxtx function to header file Chaoyong He
2022-09-15 10:44 ` [PATCH v9 12/12] net/nfp: add flower PF rxtx logic Chaoyong He
2022-09-20 14:56 ` [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD Ferruh Yigit
