DPDK patches and discussions
 help / color / mirror / Atom feed
* Re: [dpdk-dev] [dpdk-dev, 5/7] net/mlx5: add port representor awareness
@ 2018-06-11 13:05 Xueming(Steven) Li
  2018-06-12 13:20 ` Adrien Mazarguil
  0 siblings, 1 reply; 4+ messages in thread
From: Xueming(Steven) Li @ 2018-06-11 13:05 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: Shahaf Shuler, dev

Hi Adrien,

Couldn't find your original email from inbox anyway, have to start a new thread here.

> The current PCI probing method is not aware of Verbs port representors,
> which appear as standard Verbs devices bound to the same PCI address and
> cannot be distinguished.
> 
> Problem is that more often than not, the wrong Verbs device is used,
> resulting in unexpected traffic.
> 
> This patch adds necessary heuristics to bind affected driver instances to
> the intended (i.e. non-representor) device.
> 
> (Patch based on prior work from Yuanhan Liu)
> 
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> ---
>  drivers/net/mlx5/mlx5.c | 61 +++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 57 insertions(+), 4 deletions(-)
> Patch diffmbox
> 
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index d57e8118c..d3a298332 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -1155,6 +1155,32 @@  mlx5_dev_spawn(struct rte_device *dpdk_dev,
>  }
>  
>  /**
> + * Comparison callback to sort Verbs device names.
> + *
> + * This is meant to be used with qsort().
> + *
> + * @param a[in]
> + *   Pointer to pointer to first Verbs device.
> + * @param b[in]
> + *   Pointer to pointer to second Verbs device.
> + *
> + * @return
> + *   0 if both names are equal, less than 0 if the first argument is less
> + *   than the second, greater than 0 otherwise.
> + */
> +static int
> +mlx5_cmp_ibv_name(const void *a, const void *b)
> +{
> +	const char *name_a = (*(const struct ibv_device *const *)a)->name;
> +	const char *name_b = (*(const struct ibv_device *const *)b)->name;
> +	size_t i = 0;
> +
> +	while (name_a[i] && name_a[i] == name_b[i])
> +		++i;
> +	return atoi(name_a + i) - atoi(name_b + i);

Comparing "1" and "10" here will return 0, does this matter?

> +}
> +
> +/**
>   * DPDK callback to register a PCI device.
>   *
>   * This function creates an Ethernet device for each port of a given
> @@ -1174,6 +1200,7 @@  mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
>  {
>  	struct ibv_device **ibv_list;
>  	struct rte_eth_dev **eth_list = NULL;
> +	int n = 0;
>  	int vf;
>  	int ret;
>  
> @@ -1195,6 +1222,9 @@  mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
>  		DRV_LOG(ERR, "cannot list devices, is ib_uverbs loaded?");
>  		return -rte_errno;
>  	}
> +
> +	struct ibv_device *ibv_match[ret + 1];
> +
>  	while (ret-- > 0) {
>  		struct rte_pci_addr pci_addr;
>  
> @@ -1206,12 +1236,35 @@  mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
>  		    pci_dev->addr.devid != pci_addr.devid ||
>  		    pci_dev->addr.function != pci_addr.function)
>  			continue;
> -		DRV_LOG(INFO, "PCI information matches, using device \"%s\"",
> +		DRV_LOG(INFO, "PCI information matches for device \"%s\"",
>  			ibv_list[ret]->name);
> -		break;
> +		ibv_match[n++] = ibv_list[ret];
> +	}
> +	ibv_match[n] = NULL;
> +	if (n > 1) {
> +		/*
> +		 * The existence of several matching entries means port
> +		 * representors have been instantiated. No existing Verbs
> +		 * call nor /sys entries can tell them apart at this point.
> +		 *
> +		 * While definitely hackish, assume their names are numbered
> +		 * based on order of creation with master device first,
> +		 * followed by first port representor, followed by the
> +		 * second one and so on.
> +		 */
> +		DRV_LOG(WARNING,
> +			"probing device with port representors involves"
> +			" heuristics with uncertain outcome");
> +		qsort(ibv_match, n, sizeof(*ibv_match), mlx5_cmp_ibv_name);
> +		DRV_LOG(WARNING, "assuming \"%s\" is the master device",
> +			ibv_match[0]->name);
> +		for (ret = 1; ret < n; ++ret)
> +			DRV_LOG(WARNING,
> +				"assuming \"%s\" is port representor #%d",
> +				ibv_match[ret]->name, ret - 1);

Such dump will appear when attaching each rep port, how about just 
do it for PF in DEBUG level?

>  	}
> -	if (ret >= 0)
> -		eth_list = mlx5_dev_spawn(&pci_dev->device, ibv_list[ret], vf);
> +	if (n)
> +		eth_list = mlx5_dev_spawn(&pci_dev->device, ibv_match[0], vf);
>  	mlx5_glue->free_device_list(ibv_list);
>  	if (!eth_list || !*eth_list) {
>  		DRV_LOG(WARNING,
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [dpdk-dev] [dpdk-dev, 5/7] net/mlx5: add port representor awareness
  2018-06-11 13:05 [dpdk-dev] [dpdk-dev, 5/7] net/mlx5: add port representor awareness Xueming(Steven) Li
@ 2018-06-12 13:20 ` Adrien Mazarguil
  2018-06-12 13:57   ` Xueming(Steven) Li
  0 siblings, 1 reply; 4+ messages in thread
From: Adrien Mazarguil @ 2018-06-12 13:20 UTC (permalink / raw)
  To: Xueming(Steven) Li; +Cc: Shahaf Shuler, dev

On Mon, Jun 11, 2018 at 01:05:55PM +0000, Xueming(Steven) Li wrote:
> Hi Adrien,
> 
> Couldn't find your original email from inbox anyway, have to start a new thread here.
<snip>
> > +static int
> > +mlx5_cmp_ibv_name(const void *a, const void *b)
> > +{
> > +	const char *name_a = (*(const struct ibv_device *const *)a)->name;
> > +	const char *name_b = (*(const struct ibv_device *const *)b)->name;
> > +	size_t i = 0;
> > +
> > +	while (name_a[i] && name_a[i] == name_b[i])
> > +		++i;
> > +	return atoi(name_a + i) - atoi(name_b + i);
> 
> Comparing "1" and "10" here will return 0, does this matter?

Sure it does! The whole point of this function is precisely to avoid this
kind of issues. I'll fix it for v2, thanks.

<snip>
> > +	if (n > 1) {
> > +		/*
> > +		 * The existence of several matching entries means port
> > +		 * representors have been instantiated. No existing Verbs
> > +		 * call nor /sys entries can tell them apart at this point.
> > +		 *
> > +		 * While definitely hackish, assume their names are numbered
> > +		 * based on order of creation with master device first,
> > +		 * followed by first port representor, followed by the
> > +		 * second one and so on.
> > +		 */
> > +		DRV_LOG(WARNING,
> > +			"probing device with port representors involves"
> > +			" heuristics with uncertain outcome");
> > +		qsort(ibv_match, n, sizeof(*ibv_match), mlx5_cmp_ibv_name);
> > +		DRV_LOG(WARNING, "assuming \"%s\" is the master device",
> > +			ibv_match[0]->name);
> > +		for (ret = 1; ret < n; ++ret)
> > +			DRV_LOG(WARNING,
> > +				"assuming \"%s\" is port representor #%d",
> > +				ibv_match[ret]->name, ret - 1);
> 
> Such dump will appear when attaching each rep port, how about just 
> do it for PF in DEBUG level?

It occurs only once when probing the master device and detecting the
presence of representors, not for each of them.

I prefer to leave it as a warning because this detection approach, while an
undeniable improvement over not checking anything and ending up configuring
the wrong netdevice, is unfortunately not 100% accurate. This will be
improved, however users must be warned of possible issues in the meantime.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [dpdk-dev] [dpdk-dev, 5/7] net/mlx5: add port representor awareness
  2018-06-12 13:20 ` Adrien Mazarguil
@ 2018-06-12 13:57   ` Xueming(Steven) Li
  2018-06-12 14:18     ` Adrien Mazarguil
  0 siblings, 1 reply; 4+ messages in thread
From: Xueming(Steven) Li @ 2018-06-12 13:57 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: Shahaf Shuler, dev



> -----Original Message-----
> From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Sent: Tuesday, June 12, 2018 9:20 PM
> To: Xueming(Steven) Li <xuemingl@mellanox.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev,5/7] net/mlx5: add port representor awareness
> 
> On Mon, Jun 11, 2018 at 01:05:55PM +0000, Xueming(Steven) Li wrote:
> > Hi Adrien,
> >
> > Couldn't find your original email from inbox anyway, have to start a new thread here.
> <snip>
> > > +static int
> > > +mlx5_cmp_ibv_name(const void *a, const void *b) {
> > > +	const char *name_a = (*(const struct ibv_device *const *)a)->name;
> > > +	const char *name_b = (*(const struct ibv_device *const *)b)->name;
> > > +	size_t i = 0;
> > > +
> > > +	while (name_a[i] && name_a[i] == name_b[i])
> > > +		++i;
> > > +	return atoi(name_a + i) - atoi(name_b + i);
> >
> > Comparing "1" and "10" here will return 0, does this matter?
> 
> Sure it does! The whole point of this function is precisely to avoid this kind of issues. I'll fix it
> for v2, thanks.
> 
> <snip>
> > > +	if (n > 1) {
> > > +		/*
> > > +		 * The existence of several matching entries means port
> > > +		 * representors have been instantiated. No existing Verbs
> > > +		 * call nor /sys entries can tell them apart at this point.
> > > +		 *
> > > +		 * While definitely hackish, assume their names are numbered
> > > +		 * based on order of creation with master device first,
> > > +		 * followed by first port representor, followed by the
> > > +		 * second one and so on.
> > > +		 */
> > > +		DRV_LOG(WARNING,
> > > +			"probing device with port representors involves"
> > > +			" heuristics with uncertain outcome");
> > > +		qsort(ibv_match, n, sizeof(*ibv_match), mlx5_cmp_ibv_name);
> > > +		DRV_LOG(WARNING, "assuming \"%s\" is the master device",
> > > +			ibv_match[0]->name);
> > > +		for (ret = 1; ret < n; ++ret)
> > > +			DRV_LOG(WARNING,
> > > +				"assuming \"%s\" is port representor #%d",
> > > +				ibv_match[ret]->name, ret - 1);
> >
> > Such dump will appear when attaching each rep port, how about just do
> > it for PF in DEBUG level?
> 
> It occurs only once when probing the master device and detecting the presence of representors, not for
> each of them.
> 
> I prefer to leave it as a warning because this detection approach, while an undeniable improvement
> over not checking anything and ending up configuring the wrong netdevice, is unfortunately not 100%
> accurate. This will be improved, however users must be warned of possible issues in the meantime.

Yes, the list is different when VF number changed outside, a full dump should be helpful, how about set it to DEBUG or INFO level?
Users don't need to know this, just for debug purpose.

> 
> --
> Adrien Mazarguil
> 6WIND

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [dpdk-dev] [dpdk-dev, 5/7] net/mlx5: add port representor awareness
  2018-06-12 13:57   ` Xueming(Steven) Li
@ 2018-06-12 14:18     ` Adrien Mazarguil
  0 siblings, 0 replies; 4+ messages in thread
From: Adrien Mazarguil @ 2018-06-12 14:18 UTC (permalink / raw)
  To: Xueming(Steven) Li; +Cc: Shahaf Shuler, dev

On Tue, Jun 12, 2018 at 01:57:45PM +0000, Xueming(Steven) Li wrote:
> > -----Original Message-----
> > From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > Sent: Tuesday, June 12, 2018 9:20 PM
> > To: Xueming(Steven) Li <xuemingl@mellanox.com>
> > Cc: Shahaf Shuler <shahafs@mellanox.com>; dev@dpdk.org
> > Subject: Re: [dpdk-dev,5/7] net/mlx5: add port representor awareness
> > 
> > On Mon, Jun 11, 2018 at 01:05:55PM +0000, Xueming(Steven) Li wrote:
> > > Hi Adrien,
> > >
> > > Couldn't find your original email from inbox anyway, have to start a new thread here.
> > <snip>
> > > > +static int
> > > > +mlx5_cmp_ibv_name(const void *a, const void *b) {
> > > > +	const char *name_a = (*(const struct ibv_device *const *)a)->name;
> > > > +	const char *name_b = (*(const struct ibv_device *const *)b)->name;
> > > > +	size_t i = 0;
> > > > +
> > > > +	while (name_a[i] && name_a[i] == name_b[i])
> > > > +		++i;
> > > > +	return atoi(name_a + i) - atoi(name_b + i);
> > >
> > > Comparing "1" and "10" here will return 0, does this matter?
> > 
> > Sure it does! The whole point of this function is precisely to avoid this kind of issues. I'll fix it
> > for v2, thanks.
> > 
> > <snip>
> > > > +	if (n > 1) {
> > > > +		/*
> > > > +		 * The existence of several matching entries means port
> > > > +		 * representors have been instantiated. No existing Verbs
> > > > +		 * call nor /sys entries can tell them apart at this point.
> > > > +		 *
> > > > +		 * While definitely hackish, assume their names are numbered
> > > > +		 * based on order of creation with master device first,
> > > > +		 * followed by first port representor, followed by the
> > > > +		 * second one and so on.
> > > > +		 */
> > > > +		DRV_LOG(WARNING,
> > > > +			"probing device with port representors involves"
> > > > +			" heuristics with uncertain outcome");
> > > > +		qsort(ibv_match, n, sizeof(*ibv_match), mlx5_cmp_ibv_name);
> > > > +		DRV_LOG(WARNING, "assuming \"%s\" is the master device",
> > > > +			ibv_match[0]->name);
> > > > +		for (ret = 1; ret < n; ++ret)
> > > > +			DRV_LOG(WARNING,
> > > > +				"assuming \"%s\" is port representor #%d",
> > > > +				ibv_match[ret]->name, ret - 1);
> > >
> > > Such dump will appear when attaching each rep port, how about just do
> > > it for PF in DEBUG level?
> > 
> > It occurs only once when probing the master device and detecting the presence of representors, not for
> > each of them.
> > 
> > I prefer to leave it as a warning because this detection approach, while an undeniable improvement
> > over not checking anything and ending up configuring the wrong netdevice, is unfortunately not 100%
> > accurate. This will be improved, however users must be warned of possible issues in the meantime.
> 
> Yes, the list is different when VF number changed outside, a full dump should be helpful, how about set it to DEBUG or INFO level?
> Users don't need to know this, just for debug purpose.

Because by "assuming" things, there's a slight possibility for the PMD to be
wrong. It's not a mere debug message.

Using the wrong IB device may silently wreak havoc on some systems,
therefore since the PMD can't be sure, users are warned about this fact and
what IB devices will be used. This calls for extra attention and manual
checks (where possible) *before* an issue is encountered, until we replace
this piece of code with a safer approach.

I think a WARNING level is warranted.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-06-12 14:18 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-11 13:05 [dpdk-dev] [dpdk-dev, 5/7] net/mlx5: add port representor awareness Xueming(Steven) Li
2018-06-12 13:20 ` Adrien Mazarguil
2018-06-12 13:57   ` Xueming(Steven) Li
2018-06-12 14:18     ` Adrien Mazarguil

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).