DPDK patches and discussions
 help / color / mirror / Atom feed
From: Thomas Monjalon <thomas@monjalon.net>
To: Ilya Maximets <i.maximets@ovn.org>
Cc: dev@dpdk.org, Shahaf Shuler <shahafs@mellanox.com>,
	Jerin Jacob <jerinjacobk@gmail.com>,
	Andrew Rybchenko <arybchenko@solarflare.com>,
	Ferruh Yigit <ferruh.yigit@intel.com>,
	Stephen Hemminger <stephen@networkplumber.org>,
	Roni Bar Yanai <roniba@mellanox.com>,
	Rony Efraim <ronye@mellanox.com>,
	declan.doherty@intel.com, bernard.iremonger@intel.com,
	ajit.khaparde@broadcom.com
Subject: Re: [dpdk-dev] [PATCH v2 0/3] ethdev: configure SR-IOV VF from host
Date: Wed, 30 Oct 2019 22:42:16 +0100	[thread overview]
Message-ID: <1968866.mbH2BcW0Fd@xps> (raw)
In-Reply-To: <aed395e5-11d5-a947-b621-bd36c904de84@ovn.org>

30/10/2019 17:09, Ilya Maximets:
> On 30.10.2019 16:49, Thomas Monjalon wrote:
> > 30/10/2019 16:07, Ilya Maximets:
> >> On 29.10.2019 19:50, Thomas Monjalon wrote:
> >>> In a virtual environment, the network controller may have to configure
> >>> some SR-IOV VF parameters for security reasons.
> >>>
> >>> When the PF (host port) is driven by DPDK (OVS-DPDK case),
> >>> we face two different cases:
> >>>       - driver is bifurcated (Mellanox case),
> >>>         so the VF can be configured via the kernel.
> >>>       - driver is on top of UIO or VFIO, so DPDK API is required,
> >>>         and PMD-specific APIs were used.
> >>
> >> So, what is wrong with setting VF mac via the representor port as
> >> it done now?  From our previous discussion, I thought that we concluded
> >> that is enough to have current API, i.e. just call set_default_mac()
> >> for a representor port and this will lead to setting mac for VF.
> >> This is how it's implemented in Linux kernel and this is how it's
> >> implemented in current DPDK drivers that supports setting mac for
> >> the representor.
> > 
> > I don't know what is done in the Linux kernel.
> > In DPDK, setting the MAC of the representor is really setting
> > the MAC of the representor. Is it crazy?
> 
> Kind of. And no, it doesn't work this way in DPDK.
> Just look at the i40e driver:
> 
> 325 static int
> 326 i40e_vf_representor_mac_addr_set(struct rte_eth_dev *ethdev,
> 327         struct ether_addr *mac_addr)
> 328 {
> 329     struct i40e_vf_representor *representor = ethdev->data->dev_private;
> 330
> 331     return rte_pmd_i40e_set_vf_mac_addr(
> 332         representor->adapter->eth_dev->data->port_id,
> 333         representor->vf_id, mac_addr);
> 334 }
> ....
> 423 static const struct eth_dev_ops i40e_representor_dev_ops = {
>      <...>
> 445     .mac_addr_set         = i40e_vf_representor_mac_addr_set,
> 
> 
> It really calls the function to set VF mac address.

Indeed, I missed that i40e_vf_representor_mac_addr_set() is calling
rte_pmd_i40e_set_vf_mac_addr().

> And for MLX drivers it's the same.
> MLX driver call netlink to set representor MAC, but netlink is in
> kernel and this call inside the kernel translates to the setting
> of mac address of the VF.
> 
> Am I missing something?

I am not sure about this kernel translation. At least not in mlx5.
Setting MAC address of a VF representor seems not supported on mlx5.
But there is a specific netlink request to set a VF MAC address:
	ip link set <PF> vf <VF> mac <MAC>


> >> The only use case for this new API is to be able to control mac of
> >> the representor itself, which doesn't make much sense. At least there
> >> are only hypothetical use cases. And once again, Linux kernel doesn't
> >> support this behavior.
> > 
> > I think it is sane to be able to set different MAC addresses
> > for the representor and the VF.

Let me explain better my thoughts.
In the switchdev design, VF and uplink ports are all connected
together via a switch.
The representors are mirrors of those switch ports.
So a VF representor port is supposed to mirror the switch port
where a VF is connected to. It is different of the VF itself.

This is a drawing of my understanding of switchdev design:

VF1 ------ VF1 rep /--------\
                   | switch | uplink rep ------ uplink ------ wire
VF2 ------ VF2 rep \--------/    (PF)

Of course, there can be more VF and uplink ports.

With some switch-aware protocols, it might be interesting to have
different MAC addresses on a VF and its representor.
And more generally, I imagine that the config of a VF representor
could be different of the config of a VF.


> >>> This new generic API will avoid vendors fragmentation.
> >>
> >> I don't see the fragmentation. Right now you can set VF mac from DPDK
> >> by calling set_default_mac() for representor.  This API exists for all
> >> vendors. Not implemented for Intel, but new API is not implemented too.
> > 
> > No, the current situation is the following:
> > 	- for mlx5, VF MAC can be configured with iproute2 or netlink
> > 	- for ixgbe, rte_pmd_ixgbe_set_vf_mac_addr()
> > 	- for i40e, rte_pmd_i40e_set_vf_mac_addr()
> > 	- for bnxt, rte_pmd_bnxt_set_vf_mac_addr()

Thanks to Ilya's comment, this is an update of the DPDK situation:
	- for mlx5, VF MAC can be configured with iproute2 or netlink
	  and rte_eth_dev_default_mac_addr_set(rep) is not supported
	- for ixgbe, rte_pmd_ixgbe_set_vf_mac_addr()
	  and rte_eth_dev_default_mac_addr_set(rep) does the same
	- for i40e, rte_pmd_i40e_set_vf_mac_addr()
	  and rte_eth_dev_default_mac_addr_set(rep) does the same
	- for bnxt, rte_pmd_bnxt_set_vf_mac_addr()
	  and no representor

If we consider what Intel did, i.e. configure VF in place of representor
for some operations, there are two drawbacks:
- confusing that some ops apply to representor, others apply to VF
- some ops are not possible on representor (because targetted to VF)

I still feel that the addition of one single bit in the port ID
is an elegant solution to target either the VF or its representor.


> >> The this is that this new API will produce conceptual fragmentation
> >> between DPDK and the Linux kernel, because to do the same thing you'll
> >> have to use different ways. I mean, to change mac of VF in kernel you
> >> need to set mac to the representor, but in DPDK changing setting mac to
> >> representor will lead to changing the mac of the representor itself,
> >> not the VF. This will be really confusing for users.
> > 
> > I am not responsible of the choices in Linux.
> > But I agree it would be interesting to check the reason of such decision.
> > Rony, please could you explain?

I looked at few Linux drivers:

	bnxt_vf_rep_netdev_ops has no op to set MAC
	bnxt_netdev_ops.ndo_set_vf_mac = set VF MAC from PF

	lio_vf_rep_ndev_ops has no op to set MAC
	lionetdevops.ndo_set_vf_mac = set VF MAC from PF

	mlx5e_netdev_ops_rep has no op to set MAC
	mlx5e_netdev_ops.ndo_set_vf_mac = set VF MAC from PF
	mlx5e_netdev_ops_uplink_rep.ndo_set_vf_mac = set VF MAC from PF

	nfp_repr_netdev_ops.ndo_set_mac_address = set representor MAC
	nfp_repr_netdev_ops.ndo_set_vf_mac = set VF MAC from representor
	nfp_net_netdev_ops.ndo_set_vf_mac = set VF MAC from PF

There is a big chance that the behaviour is not standardized in Linux
(as usual). So it is already confusing for users of Linux.



  reply	other threads:[~2019-10-30 21:42 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-15 15:06 [dpdk-dev] [RFC] " Thomas Monjalon
2019-08-15 15:34 ` Jerin Jacob Kollanukkaran
2019-08-15 17:59   ` Thomas Monjalon
2019-08-29 15:02 ` Iremonger, Bernard
2019-09-04  8:23   ` Thomas Monjalon
2019-10-29 18:50 ` [dpdk-dev] [PATCH v2 0/3] " Thomas Monjalon
2019-10-29 18:50   ` [dpdk-dev] [PATCH v2 1/3] ethdev: identify " Thomas Monjalon
2019-10-29 18:50   ` [dpdk-dev] [PATCH v2 2/3] ethdev: set VF MAC address " Thomas Monjalon
2019-11-01  0:18     ` [dpdk-dev] [RFC PATCH] net/i[xgb|40]e: " Thomas Monjalon
2019-10-29 18:50   ` [dpdk-dev] [PATCH v2 3/3] net/mlx5: " Thomas Monjalon
2019-10-30  4:08   ` [dpdk-dev] [PATCH v2 0/3] ethdev: configure SR-IOV VF " Jerin Jacob
2019-10-30  7:22     ` Shahaf Shuler
2019-10-30  9:24       ` Jerin Jacob
2019-11-01  0:24         ` Thomas Monjalon
2019-11-01  9:06           ` Ilya Maximets
2019-11-01  9:56             ` Ilya Maximets
2019-10-30  8:56     ` Thomas Monjalon
2019-10-30  9:15       ` Jerin Jacob
2019-11-01  0:33         ` Thomas Monjalon
2019-11-01 11:01           ` Jerin Jacob
2019-11-01 13:25           ` Jerin Jacob
2019-11-03  6:31             ` Shahaf Shuler
2019-10-30 15:07   ` Ilya Maximets
2019-10-30 15:49     ` Thomas Monjalon
2019-10-30 16:09       ` Ilya Maximets
2019-10-30 21:42         ` Thomas Monjalon [this message]
2019-11-01  9:32           ` Ilya Maximets
2019-11-03  6:48             ` Shahaf Shuler
2019-11-03 15:27               ` Ananyev, Konstantin
2019-11-03 22:09                 ` Thomas Monjalon
2019-11-07 14:44                   ` Thomas Monjalon
2019-11-04 10:28               ` Ilya Maximets
2019-11-04 14:30                 ` Asaf Penso
2019-11-04 14:58                   ` Ilya Maximets
2019-11-04 20:33                 ` Shahaf Shuler
2019-11-05 12:15                   ` Ilya Maximets

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1968866.mbH2BcW0Fd@xps \
    --to=thomas@monjalon.net \
    --cc=ajit.khaparde@broadcom.com \
    --cc=arybchenko@solarflare.com \
    --cc=bernard.iremonger@intel.com \
    --cc=declan.doherty@intel.com \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@intel.com \
    --cc=i.maximets@ovn.org \
    --cc=jerinjacobk@gmail.com \
    --cc=roniba@mellanox.com \
    --cc=ronye@mellanox.com \
    --cc=shahafs@mellanox.com \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).