From: Thomas Monjalon <thomas@monjalon.net>
To: Ilya Maximets <i.maximets@ovn.org>
Cc: dev@dpdk.org, Shahaf Shuler <shahafs@mellanox.com>,
Jerin Jacob <jerinjacobk@gmail.com>,
Andrew Rybchenko <arybchenko@solarflare.com>,
Ferruh Yigit <ferruh.yigit@intel.com>,
Stephen Hemminger <stephen@networkplumber.org>,
Roni Bar Yanai <roniba@mellanox.com>,
Rony Efraim <ronye@mellanox.com>,
declan.doherty@intel.com, bernard.iremonger@intel.com,
ajit.khaparde@broadcom.com
Subject: Re: [dpdk-dev] [PATCH v2 0/3] ethdev: configure SR-IOV VF from host
Date: Wed, 30 Oct 2019 22:42:16 +0100
Message-ID: <1968866.mbH2BcW0Fd@xps>
In-Reply-To: <aed395e5-11d5-a947-b621-bd36c904de84@ovn.org>
30/10/2019 17:09, Ilya Maximets:
> On 30.10.2019 16:49, Thomas Monjalon wrote:
> > 30/10/2019 16:07, Ilya Maximets:
> >> On 29.10.2019 19:50, Thomas Monjalon wrote:
> >>> In a virtual environment, the network controller may have to configure
> >>> some SR-IOV VF parameters for security reasons.
> >>>
> >>> When the PF (host port) is driven by DPDK (OVS-DPDK case),
> >>> we face two different cases:
> >>> - driver is bifurcated (Mellanox case),
> >>> so the VF can be configured via the kernel.
> >>> - driver is on top of UIO or VFIO, so DPDK API is required,
> >>> and PMD-specific APIs were used.
> >>
> >> So, what is wrong with setting VF mac via the representor port as
> >> it done now? From our previous discussion, I thought that we concluded
> >> that is enough to have current API, i.e. just call set_default_mac()
> >> for a representor port and this will lead to setting mac for VF.
> >> This is how it's implemented in Linux kernel and this is how it's
> >> implemented in current DPDK drivers that supports setting mac for
> >> the representor.
> >
> > I don't know what is done in the Linux kernel.
> > In DPDK, setting the MAC of the representor is really setting
> > the MAC of the representor. Is it crazy?
>
> Kind of. And no, it doesn't work this way in DPDK.
> Just look at the i40e driver:
>
> static int
> i40e_vf_representor_mac_addr_set(struct rte_eth_dev *ethdev,
>         struct ether_addr *mac_addr)
> {
>     struct i40e_vf_representor *representor = ethdev->data->dev_private;
>
>     return rte_pmd_i40e_set_vf_mac_addr(
>         representor->adapter->eth_dev->data->port_id,
>         representor->vf_id, mac_addr);
> }
> ....
> static const struct eth_dev_ops i40e_representor_dev_ops = {
> <...>
>     .mac_addr_set = i40e_vf_representor_mac_addr_set,
>
>
> It really calls the function to set VF mac address.
Indeed, I missed that i40e_vf_representor_mac_addr_set() is calling
rte_pmd_i40e_set_vf_mac_addr().
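
To make that forwarding explicit, here is a minimal standalone sketch in C.
It does not use any DPDK header: every model_* name and both structs are
hypothetical stand-ins, assumed only for illustration. It shows a
representor "set MAC" op that writes the VF MAC and never touches the
representor's own address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6
#define MAX_VFS 4

/* Hypothetical stand-in for a PF adapter (not a DPDK type). */
struct model_adapter {
    uint8_t vf_mac[MAX_VFS][MAC_LEN]; /* MAC programmed for each VF */
};

/* Hypothetical stand-in for a VF representor (not a DPDK type). */
struct model_representor {
    struct model_adapter *adapter; /* parent PF */
    uint16_t vf_id;                /* VF mirrored by this representor */
    uint8_t own_mac[MAC_LEN];      /* never written by the op below */
};

/* Models the PMD-specific call: program the MAC of one VF through the PF. */
static int
model_set_vf_mac(struct model_adapter *ad, uint16_t vf_id, const uint8_t *mac)
{
    if (ad == NULL || mac == NULL || vf_id >= MAX_VFS)
        return -1;
    memcpy(ad->vf_mac[vf_id], mac, MAC_LEN);
    return 0;
}

/* Models the representor's mac_addr_set op: it only forwards to the VF. */
static int
model_representor_mac_addr_set(struct model_representor *rep,
        const uint8_t *mac)
{
    assert(rep != NULL);
    return model_set_vf_mac(rep->adapter, rep->vf_id, mac);
}
```

With this shape, a caller holding only the representor has no way to
change the representor's own MAC through the same op.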
> And for MLX drivers it's the same.
> MLX driver call netlink to set representor MAC, but netlink is in
> kernel and this call inside the kernel translates to the setting
> of mac address of the VF.
>
> Am I missing something?
I am not sure about this kernel translation. At least not in mlx5.
Setting MAC address of a VF representor seems not supported on mlx5.
But there is a specific netlink request to set a VF MAC address:
	ip link set <PF> vf <VF> mac <MAC>
> >> The only use case for this new API is to be able to control mac of
> >> the representor itself, which doesn't make much sense. At least there
> >> are only hypothetical use cases. And once again, Linux kernel doesn't
> >> support this behavior.
> >
> > I think it is sane to be able to set different MAC addresses
> > for the representor and the VF.
Let me explain my thoughts better.
In the switchdev design, VF and uplink ports are all connected
together via a switch.
The representors are mirrors of those switch ports.
So a VF representor port is supposed to mirror the switch port
to which a VF is connected. It is different from the VF itself.
This is a drawing of my understanding of switchdev design:
    VF1 ------ VF1 rep /--------\
                       | switch | uplink rep ------ uplink ------ wire
    VF2 ------ VF2 rep \--------/                    (PF)
Of course, there can be more VF and uplink ports.
With some switch-aware protocols, it might be interesting to have
different MAC addresses on a VF and its representor.
And more generally, I imagine that the config of a VF representor
could be different from the config of a VF.
> >>> This new generic API will avoid vendors fragmentation.
> >>
> >> I don't see the fragmentation. Right now you can set VF mac from DPDK
> >> by calling set_default_mac() for representor. This API exists for all
> >> vendors. Not implemented for Intel, but new API is not implemented too.
> >
> > No, the current situation is the following:
> > - for mlx5, VF MAC can be configured with iproute2 or netlink
> > - for ixgbe, rte_pmd_ixgbe_set_vf_mac_addr()
> > - for i40e, rte_pmd_i40e_set_vf_mac_addr()
> > - for bnxt, rte_pmd_bnxt_set_vf_mac_addr()
Thanks to Ilya's comment, this is an update of the DPDK situation:
- for mlx5, VF MAC can be configured with iproute2 or netlink
  and rte_eth_dev_default_mac_addr_set(rep) is not supported
- for ixgbe, rte_pmd_ixgbe_set_vf_mac_addr()
  and rte_eth_dev_default_mac_addr_set(rep) does the same
- for i40e, rte_pmd_i40e_set_vf_mac_addr()
  and rte_eth_dev_default_mac_addr_set(rep) does the same
- for bnxt, rte_pmd_bnxt_set_vf_mac_addr()
  and no representor
If we consider what Intel did, i.e. configure VF in place of representor
for some operations, there are two drawbacks:
- confusing that some ops apply to representor, others apply to VF
- some ops are not possible on representor (because they target the VF)
I still feel that the addition of one single bit in the port ID
is an elegant solution to target either the VF or its representor.
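
As a rough illustration of the single-bit idea, here is a standalone C
sketch. It is not the code of the patch: the flag value (top bit of a
16-bit port ID), MODEL_VF_PORT_FLAG, and all model_* names are assumptions
made up for this example. One setter serves both targets, and the flag in
the port ID decides whether the VF or its representor is configured.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6
#define MAX_PORTS 8
/* Hypothetical flag: the top bit of a 16-bit port ID selects the VF. */
#define MODEL_VF_PORT_FLAG ((uint16_t)0x8000)

/* One representor port and the VF behind it, each with its own MAC. */
struct model_port {
    uint8_t rep_mac[MAC_LEN];
    uint8_t vf_mac[MAC_LEN];
};

static struct model_port model_ports[MAX_PORTS];

/* Derive the ID that targets the VF from the representor port ID. */
static uint16_t
model_vf_port_id(uint16_t rep_port_id)
{
    assert((rep_port_id & MODEL_VF_PORT_FLAG) == 0);
    return (uint16_t)(rep_port_id | MODEL_VF_PORT_FLAG);
}

/* Single entry point: the flag bit picks VF or representor. */
static int
model_mac_addr_set(uint16_t port_id, const uint8_t *mac)
{
    uint16_t index = (uint16_t)(port_id & ~MODEL_VF_PORT_FLAG);

    if (mac == NULL || index >= MAX_PORTS)
        return -1;
    if (port_id & MODEL_VF_PORT_FLAG)
        memcpy(model_ports[index].vf_mac, mac, MAC_LEN);
    else
        memcpy(model_ports[index].rep_mac, mac, MAC_LEN);
    return 0;
}
```

Calling model_mac_addr_set(3, mac) would configure the representor, while
model_mac_addr_set(model_vf_port_id(3), mac) would configure the VF, so
both remain reachable and no op silently changes its target.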
> >> The this is that this new API will produce conceptual fragmentation
> >> between DPDK and the Linux kernel, because to do the same thing you'll
> >> have to use different ways. I mean, to change mac of VF in kernel you
> >> need to set mac to the representor, but in DPDK changing setting mac to
> >> representor will lead to changing the mac of the representor itself,
> >> not the VF. This will be really confusing for users.
> >
> > I am not responsible of the choices in Linux.
> > But I agree it would be interesting to check the reason of such decision.
> > Rony, please could you explain?
I looked at a few Linux drivers:
bnxt_vf_rep_netdev_ops has no op to set MAC
bnxt_netdev_ops.ndo_set_vf_mac = set VF MAC from PF
lio_vf_rep_ndev_ops has no op to set MAC
lionetdevops.ndo_set_vf_mac = set VF MAC from PF
mlx5e_netdev_ops_rep has no op to set MAC
mlx5e_netdev_ops.ndo_set_vf_mac = set VF MAC from PF
mlx5e_netdev_ops_uplink_rep.ndo_set_vf_mac = set VF MAC from PF
nfp_repr_netdev_ops.ndo_set_mac_address = set representor MAC
nfp_repr_netdev_ops.ndo_set_vf_mac = set VF MAC from representor
nfp_net_netdev_ops.ndo_set_vf_mac = set VF MAC from PF
There is a big chance that the behaviour is not standardized in Linux
(as usual). So it is already confusing for users of Linux.