DPDK patches and discussions
From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>,
	dev@dpdk.org, chenbo.xia@intel.com, amorenoz@redhat.com,
	david.marchand@redhat.com, ferruh.yigit@intel.com,
	michaelba@nvidia.com, viacheslavo@nvidia.com,
	xiaoyun.li@intel.com
Cc: nelio.laranjeiro@6wind.com, yvugenfi@redhat.com, ybendito@redhat.com
Subject: Re: [dpdk-dev] [PATCH v5 1/5] net/virtio: add initial RSS support
Date: Tue, 19 Oct 2021 11:22:50 +0200
Message-ID: <e53974cd-0272-ba22-6149-a9395110c455@redhat.com>
In-Reply-To: <3cf32ebd-47cd-05d0-c64f-67e9418839ba@oktetlabs.ru>

Hi Andrew,

On 10/19/21 09:30, Andrew Rybchenko wrote:
> On 10/18/21 1:20 PM, Maxime Coquelin wrote:
>> Provide the capability to update the hash key, hash types
>> and RETA table on the fly (without needing to stop/start
>> the device). However, the key length and the number of RETA
>> entries are fixed to 40B and 128 entries respectively. This
>> is done in order to simplify the design, but may be
>> revisited later as the Virtio spec provides this
>> flexibility.
>>
>> Note that only VIRTIO_NET_F_RSS support is implemented,
>> VIRTIO_NET_F_HASH_REPORT, which would enable reporting the
>> packet RSS hash calculated by the device into mbuf.rss, is
>> not yet supported.
>>
>> Regarding the default RSS configuration, it has been
>> chosen to use the default Intel ixgbe key as default key,
>> and default RETA is a simple modulo between the hash and
>> the number of Rx queues.
>>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> 
> See review notes below
> 
>> diff --git a/drivers/net/virtio/virtio.h b/drivers/net/virtio/virtio.h
>> index e78b2e429e..7118e5d24c 100644
>> --- a/drivers/net/virtio/virtio.h
>> +++ b/drivers/net/virtio/virtio.h
> 
> [snip]
> 
>> @@ -100,6 +101,29 @@
>>    */
>>   #define VIRTIO_MAX_INDIRECT ((int)(rte_mem_page_size() / 16))
>>   
>> +/*  Virtio RSS hash types */
>> +#define VIRTIO_NET_HASH_TYPE_IPV4	(1 << 0)
>> +#define VIRTIO_NET_HASH_TYPE_TCPV4	(1 << 1)
>> +#define VIRTIO_NET_HASH_TYPE_UDPV4	(1 << 2)
>> +#define VIRTIO_NET_HASH_TYPE_IPV6	(1 << 3)
>> +#define VIRTIO_NET_HASH_TYPE_TCPV6	(1 << 4)
>> +#define VIRTIO_NET_HASH_TYPE_UDPV6	(1 << 5)
>> +#define VIRTIO_NET_HASH_TYPE_IP_EX	(1 << 6)
>> +#define VIRTIO_NET_HASH_TYPE_TCP_EX	(1 << 7)
>> +#define VIRTIO_NET_HASH_TYPE_UDP_EX	(1 << 8)
> 
> I think it is a bit better to use RTE_BIT32() above.

I did not know this macro existed, I will use it in the next revision.
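
A rough sketch of what that conversion could look like. RTE_BIT32() comes
from rte_bitops.h; the fallback definition is only there so the snippet
builds outside a DPDK tree, and EXAMPLE_HASH_TYPE_MASK is a hypothetical
name used for illustration (the real mask definition is snipped from the
quote above).

```c
#include <stdint.h>

/* Stand-in so this builds without DPDK; in-tree, include <rte_bitops.h>. */
#ifndef RTE_BIT32
#define RTE_BIT32(nr) (UINT32_C(1) << (nr))
#endif

/* Virtio RSS hash types, same bit positions as in the patch. */
#define VIRTIO_NET_HASH_TYPE_IPV4	RTE_BIT32(0)
#define VIRTIO_NET_HASH_TYPE_TCPV4	RTE_BIT32(1)
#define VIRTIO_NET_HASH_TYPE_UDPV4	RTE_BIT32(2)
#define VIRTIO_NET_HASH_TYPE_IPV6	RTE_BIT32(3)
#define VIRTIO_NET_HASH_TYPE_TCPV6	RTE_BIT32(4)
#define VIRTIO_NET_HASH_TYPE_UDPV6	RTE_BIT32(5)
#define VIRTIO_NET_HASH_TYPE_IP_EX	RTE_BIT32(6)
#define VIRTIO_NET_HASH_TYPE_TCP_EX	RTE_BIT32(7)
#define VIRTIO_NET_HASH_TYPE_UDP_EX	RTE_BIT32(8)

/* Hypothetical mask for illustration: the OR of all nine types above. */
#define EXAMPLE_HASH_TYPE_MASK ( \
	VIRTIO_NET_HASH_TYPE_IPV4 | VIRTIO_NET_HASH_TYPE_TCPV4 | \
	VIRTIO_NET_HASH_TYPE_UDPV4 | VIRTIO_NET_HASH_TYPE_IPV6 | \
	VIRTIO_NET_HASH_TYPE_TCPV6 | VIRTIO_NET_HASH_TYPE_UDPV6 | \
	VIRTIO_NET_HASH_TYPE_IP_EX | VIRTIO_NET_HASH_TYPE_TCP_EX | \
	VIRTIO_NET_HASH_TYPE_UDP_EX)
```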

> [snip]
> 
>> diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
>> index aff791fbd0..a8e2bf3e1a 100644
>> --- a/drivers/net/virtio/virtio_ethdev.c
>> +++ b/drivers/net/virtio/virtio_ethdev.c
> 
> [snip]
> 
>> @@ -347,20 +357,51 @@ virtio_send_command(struct virtnet_ctl *cvq, struct virtio_pmd_ctrl *ctrl,
>>   }
>>   
>>   static int
>> -virtio_set_multiple_queues(struct rte_eth_dev *dev, uint16_t nb_queues)
>> +virtio_set_multiple_queues_rss(struct rte_eth_dev *dev, uint16_t nb_queues)
>>   {
>>   	struct virtio_hw *hw = dev->data->dev_private;
>>   	struct virtio_pmd_ctrl ctrl;
>> -	int dlen[1];
>> +	struct virtio_net_ctrl_rss rss;
>> +	int dlen, ret;
>> +
>> +	rss.hash_types = hw->rss_hash_types & VIRTIO_NET_HASH_TYPE_MASK;
> 
> RTE_BUILD_BUG_ON(!RTE_IS_POWER_OF_2(VIRTIO_NET_RSS_RETA_SIZE));

Makes sense, it indeed relies on the reta size being a power of 2,
which the spec requires.
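
To illustrate why the check matters, here is a minimal sketch. It uses
C11 static_assert and a local EXAMPLE_IS_POWER_OF_2() macro as stand-ins
for RTE_BUILD_BUG_ON() and RTE_IS_POWER_OF_2(), so that it builds without
DPDK headers; example_reta_index() is a hypothetical helper, not
something from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define VIRTIO_NET_RSS_RETA_SIZE 128

/* Stand-in for RTE_IS_POWER_OF_2() from rte_common.h. */
#define EXAMPLE_IS_POWER_OF_2(n) ((n) && !(((n) - 1) & (n)))

/* Stand-in for RTE_BUILD_BUG_ON(!RTE_IS_POWER_OF_2(VIRTIO_NET_RSS_RETA_SIZE)). */
static_assert(EXAMPLE_IS_POWER_OF_2(VIRTIO_NET_RSS_RETA_SIZE),
	"RETA size must be a power of 2");

/*
 * Hypothetical helper: with a power-of-2 table, "size - 1" is a valid
 * bit mask, so masking the hash selects the same entry as hash % size.
 */
static inline uint16_t
example_reta_index(uint32_t hash)
{
	return hash & (VIRTIO_NET_RSS_RETA_SIZE - 1);
}
```

If the size were not a power of 2, "size - 1" would no longer cover the
table entries uniformly, which is what the build-time check guards against.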

>> +	rss.indirection_table_mask = VIRTIO_NET_RSS_RETA_SIZE - 1;
> 
> It relies on the fact that device is power of 2.
> So, suggest to add above check.
> 
>> +	rss.unclassified_queue = 0;
>> +	memcpy(rss.indirection_table, hw->rss_reta, VIRTIO_NET_RSS_RETA_SIZE * sizeof(uint16_t));
>> +	rss.max_tx_vq = nb_queues;
> 
> Is it guaranteed that driver is configured with equal number
> of Rx and Tx queues? Or is it not a problem otherwise?

Virtio networking devices work with queue pairs.

> [snip]
> 
>> +static int
>> +virtio_dev_get_rss_config(struct virtio_hw *hw, uint32_t *rss_hash_types)
>> +{
>> +	struct virtio_net_config local_config;
>> +	struct virtio_net_config *config = &local_config;
>> +
>> +	virtio_read_dev_config(hw,
>> +			offsetof(struct virtio_net_config, rss_max_key_size),
>> +			&config->rss_max_key_size,
>> +			sizeof(config->rss_max_key_size));
>> +	if (config->rss_max_key_size < VIRTIO_NET_RSS_KEY_SIZE) {
> 
> Shouldn't it be
> config->rss_max_key_size != VIRTIO_NET_RSS_KEY_SIZE ?

I don't think so.

> Or do we just ensure that HW supports at least required key
> size and rely on the fact that it will reject set request later
> if our size is not supported in fact?

Exactly. The device advertises the max key length it supports in its PCI
config space. Then, the driver specifies the key length it uses when
sending the virtio_net_ctrl_rss message on the control queue.

If we later try to set a different key via .rss_hash_update(), its size
is checked there (see virtio_dev_rss_hash_update()).
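
Put differently, the two checks are intentionally not the same. A reduced
sketch of the distinction, with hypothetical helper names (the real checks
live in virtio_dev_get_rss_config() and virtio_dev_rss_hash_update()):

```c
#include <errno.h>
#include <stdint.h>

#define VIRTIO_NET_RSS_KEY_SIZE 40

/*
 * Init time: the device advertises a maximum, so it only has to be
 * large enough for the driver's fixed key size.
 */
static inline int
example_check_device_max_key_size(uint8_t rss_max_key_size)
{
	return rss_max_key_size < VIRTIO_NET_RSS_KEY_SIZE ? -EINVAL : 0;
}

/*
 * Update time: the driver only handles one key length, so a
 * user-provided key has to match it exactly.
 */
static inline int
example_check_user_key_len(uint8_t rss_key_len)
{
	return rss_key_len != VIRTIO_NET_RSS_KEY_SIZE ? -EINVAL : 0;
}
```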

>> +		PMD_INIT_LOG(ERR, "Invalid device RSS max key size (%u)",
>> +				config->rss_max_key_size);
>> +		return -EINVAL;
>> +	}
>> +
>> +	virtio_read_dev_config(hw,
>> +			offsetof(struct virtio_net_config,
>> +				rss_max_indirection_table_length),
>> +			&config->rss_max_indirection_table_length,
>> +			sizeof(config->rss_max_indirection_table_length));
>> +	if (config->rss_max_indirection_table_length < VIRTIO_NET_RSS_RETA_SIZE) {
> 
> Same question here.

Same answer, virtio_dev_rss_reta_update() ensures the table size is
VIRTIO_NET_RSS_RETA_SIZE.

>> +		PMD_INIT_LOG(ERR, "Invalid device RSS max reta size (%u)",
>> +				config->rss_max_indirection_table_length);
>> +		return -EINVAL;
>> +	}
>> +
>> +	virtio_read_dev_config(hw,
>> +			offsetof(struct virtio_net_config, supported_hash_types),
>> +			&config->supported_hash_types,
>> +			sizeof(config->supported_hash_types));
>> +	if ((config->supported_hash_types & VIRTIO_NET_HASH_TYPE_MASK) == 0) {
>> +		PMD_INIT_LOG(ERR, "Invalid device RSS hash types (0x%x)",
>> +				config->supported_hash_types);
>> +		return -EINVAL;
>> +	}
>> +
>> +	*rss_hash_types = config->supported_hash_types & VIRTIO_NET_HASH_TYPE_MASK;
>> +
>> +	PMD_INIT_LOG(DEBUG, "Device RSS config:");
>> +	PMD_INIT_LOG(DEBUG, "\t-Max key size: %u", config->rss_max_key_size);
>> +	PMD_INIT_LOG(DEBUG, "\t-Max reta size: %u", config->rss_max_indirection_table_length);
>> +	PMD_INIT_LOG(DEBUG, "\t-Supported hash types: 0x%x", *rss_hash_types);
>> +
>> +	return 0;
>> +}
>> +
>> +static int
>> +virtio_dev_rss_hash_update(struct rte_eth_dev *dev,
>> +		struct rte_eth_rss_conf *rss_conf)
>> +{
>> +	struct virtio_hw *hw = dev->data->dev_private;
>> +	uint16_t nb_queues;
>> +
>> +	if (!virtio_with_feature(hw, VIRTIO_NET_F_RSS))
>> +		return -ENOTSUP;
>> +
>> +	if (rss_conf->rss_hf & ~virtio_to_ethdev_rss_offloads(VIRTIO_NET_HASH_TYPE_MASK))
>> +		return -EINVAL;
>> +
>> +	hw->rss_hash_types = ethdev_to_virtio_rss_offloads(rss_conf->rss_hf);
>> +
>> +	if (rss_conf->rss_key && rss_conf->rss_key_len) {
>> +		if (rss_conf->rss_key_len != VIRTIO_NET_RSS_KEY_SIZE) {
>> +			PMD_INIT_LOG(ERR, "Driver only supports %u RSS key length",
>> +					VIRTIO_NET_RSS_KEY_SIZE);
>> +			return -EINVAL;
>> +		}
>> +		memcpy(hw->rss_key, rss_conf->rss_key, VIRTIO_NET_RSS_KEY_SIZE);
>> +	}
>> +
>> +	nb_queues = RTE_MAX(dev->data->nb_rx_queues, dev->data->nb_tx_queues);
>> +	return virtio_set_multiple_queues_rss(dev, nb_queues);
> 
> Don't we need to rollback data in hw in the case of failure?

Agree, it would be better.
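
One possible shape for the rollback, as a sketch only: snapshot the
software state, apply the new values, and restore the snapshot if the
control queue command fails. The struct, the callback type and the two
stubs are hypothetical stand-ins for struct virtio_hw and
virtio_set_multiple_queues_rss(), not the code of the next revision.

```c
#include <stdint.h>
#include <string.h>

#define VIRTIO_NET_RSS_KEY_SIZE 40

/* Hypothetical, reduced stand-in for the RSS fields of struct virtio_hw. */
struct example_rss_state {
	uint32_t rss_hash_types;
	uint8_t rss_key[VIRTIO_NET_RSS_KEY_SIZE];
};

/* Hypothetical callback standing in for virtio_set_multiple_queues_rss(). */
typedef int (*example_apply_fn)(const struct example_rss_state *state);

/* Stubs to exercise both paths: device accepts or rejects the command. */
static inline int
example_apply_ok(const struct example_rss_state *state)
{
	(void)state;
	return 0;
}

static inline int
example_apply_fail(const struct example_rss_state *state)
{
	(void)state;
	return -1;
}

static inline int
example_rss_hash_update(struct example_rss_state *state, uint32_t new_types,
		const uint8_t *new_key, example_apply_fn apply)
{
	struct example_rss_state old = *state; /* snapshot for rollback */
	int ret;

	state->rss_hash_types = new_types;
	if (new_key != NULL)
		memcpy(state->rss_key, new_key, VIRTIO_NET_RSS_KEY_SIZE);

	ret = apply(state);
	if (ret != 0)
		*state = old; /* command failed: restore the previous config */

	return ret;
}
```

The same pattern would apply to the RETA update path, with the
indirection table being the state to snapshot.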

> [snip]
> 
>> +static int virtio_dev_rss_reta_update(struct rte_eth_dev *dev,
>> +			 struct rte_eth_rss_reta_entry64 *reta_conf,
>> +			 uint16_t reta_size)
>> +{
>> +	struct virtio_hw *hw = dev->data->dev_private;
>> +	uint16_t nb_queues;
>> +	int idx, pos, i;
>> +
>> +	if (!virtio_with_feature(hw, VIRTIO_NET_F_RSS))
>> +		return -ENOTSUP;
>> +
>> +	if (reta_size != VIRTIO_NET_RSS_RETA_SIZE)
>> +		return -EINVAL;
>> +
>> +	for (i = 0; i < reta_size; i++) {
>> +		idx = i / RTE_RETA_GROUP_SIZE;
>> +		pos = i % RTE_RETA_GROUP_SIZE;
>> +
>> +		if (((reta_conf[idx].mask >> pos) & 0x1) == 0)
>> +			continue;
>> +
>> +		hw->rss_reta[i] = reta_conf[idx].reta[pos];
>> +	}
>> +
>> +	nb_queues = RTE_MAX(dev->data->nb_rx_queues, dev->data->nb_tx_queues);
>> +	return virtio_set_multiple_queues_rss(dev, nb_queues);
> 
> Question about rollback in the case of failure stands here as
> well.
> 
>> +}

Yes.

> [snip]
> 
> 
>> +static int
>> +virtio_dev_rss_init(struct rte_eth_dev *eth_dev)
>> +{
>> +	struct virtio_hw *hw = eth_dev->data->dev_private;
>> +	uint16_t nb_rx_queues = eth_dev->data->nb_rx_queues;
>> +	struct rte_eth_rss_conf *rss_conf;
>> +	int ret, i;
>> +
>> +	rss_conf = &eth_dev->data->dev_conf.rx_adv_conf.rss_conf;
>> +
>> +	ret = virtio_dev_get_rss_config(hw, &hw->rss_hash_types);
>> +	if (ret)
>> +		return ret;
>> +
>> +	if (rss_conf->rss_hf) {
>> +		/*  Ensure requested hash types are supported by the device */
>> +		if (rss_conf->rss_hf & ~virtio_to_ethdev_rss_offloads(hw->rss_hash_types))
>> +			return -EINVAL;
>> +
>> +		hw->rss_hash_types = ethdev_to_virtio_rss_offloads(rss_conf->rss_hf);
>> +	}
>> +
>> +	if (!hw->rss_key) {
>> +		/* Setup default RSS key if not already setup by the user */
>> +		hw->rss_key = rte_malloc_socket("rss_key",
>> +				VIRTIO_NET_RSS_KEY_SIZE, 0,
>> +				eth_dev->device->numa_node);
>> +		if (!hw->rss_key) {
>> +			PMD_INIT_LOG(ERR, "Failed to allocate RSS key");
>> +			return -1;
>> +		}
>> +
>> +		if (rss_conf->rss_key && rss_conf->rss_key_len) {
>> +			if (rss_conf->rss_key_len != VIRTIO_NET_RSS_KEY_SIZE) {
>> +				PMD_INIT_LOG(ERR, "Driver only supports %u RSS key length",
>> +						VIRTIO_NET_RSS_KEY_SIZE);
>> +				return -EINVAL;
>> +			}
>> +			memcpy(hw->rss_key, rss_conf->rss_key, VIRTIO_NET_RSS_KEY_SIZE);
>> +		} else {
>> +			memcpy(hw->rss_key, rss_intel_key, VIRTIO_NET_RSS_KEY_SIZE);
>> +		}
> 
> Above if should work in the case of reconfigure as well when
> array is already allocated.

I'm not sure, because rte_eth_dev_rss_hash_update() API does not update
eth_dev->data->dev_conf.rx_adv_conf.rss_conf, so we may lose the RSS key
the user would have updated using this API. What do you think?

>> +	}
>> +
>> +	if (!hw->rss_reta) {
>> +		/* Setup default RSS reta if not already setup by the user */
>> +		hw->rss_reta = rte_malloc_socket("rss_reta",
>> +				VIRTIO_NET_RSS_RETA_SIZE * sizeof(uint16_t), 0,
>> +				eth_dev->device->numa_node);
>> +		if (!hw->rss_reta) {
>> +			PMD_INIT_LOG(ERR, "Failed to allocate RSS reta");
>> +			return -1;
>> +		}
>> +		for (i = 0; i < VIRTIO_NET_RSS_RETA_SIZE; i++)
>> +			hw->rss_reta[i] = i % nb_rx_queues;
> 
> How should it work in the case of reconfigure if the number of Rx
> queues changes?

Hmm, good catch! I did not think about this, and so did not test it.

To avoid losing a user-provisioned RETA on an unrelated change, maybe I
should save the number of Rx queues when setting the RETA (here and in
virtio_dev_rss_reta_update()), and re-initialize the RETA if that number
is different?
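
A minimal sketch of that idea, assuming a new field that records the
queue count the table was built for. The struct, the rss_rx_queues field
and example_reta_sync() are hypothetical names for illustration only.

```c
#include <stdint.h>

#define VIRTIO_NET_RSS_RETA_SIZE 128

/* Hypothetical, reduced stand-in for the RETA fields of struct virtio_hw. */
struct example_reta_state {
	uint16_t rss_reta[VIRTIO_NET_RSS_RETA_SIZE];
	uint16_t rss_rx_queues; /* queue count the table was built for */
};

/*
 * Rebuild the default modulo table only when the number of Rx queues
 * changed since the table was last set; otherwise keep it, so that a
 * user-provided table survives an unrelated reconfiguration.
 */
static inline void
example_reta_sync(struct example_reta_state *state, uint16_t nb_rx_queues)
{
	int i;

	/* Nothing to distribute to, and avoids a modulo by zero. */
	if (nb_rx_queues == 0)
		return;

	if (state->rss_rx_queues == nb_rx_queues)
		return;

	for (i = 0; i < VIRTIO_NET_RSS_RETA_SIZE; i++)
		state->rss_reta[i] = i % nb_rx_queues;
	state->rss_rx_queues = nb_rx_queues;
}
```

In this sketch the RETA update path would also have to store the current
queue count, so that a user table is tied to the configuration it was
written for.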

> Also I'm wondering how it works...
> virtio_dev_rss_init() is called from eth_virtio_dev_init() as
> well when a number of Rx queues is zero. I guess the reason is
> VIRTIO_PMD_DEFAULT_GUEST_FEATURES, but it looks very fragile.

Yes, we only add VIRTIO_NET_F_RSS to the supported features in
virtio_dev_configure(), if Rx MQ mode is RSS. So virtio_dev_rss_init()
will never be called from eth_virtio_dev_init().

If I'm not mistaken, rte_eth_dev_configure() must be called, as stated
in the API documentation, so the feature will be negotiated if the
conditions are met.

>> +	}
>> +
>> +	return 0;
>> +}
>> +
> 
> [snip]
> 
>> @@ -2107,6 +2465,9 @@ virtio_dev_configure(struct rte_eth_dev *dev)
>>   			return ret;
>>   	}
>>   
>> +	if (rxmode->mq_mode == ETH_MQ_RX_RSS)
>> +		req_features |= (1ULL << VIRTIO_NET_F_RSS);
> 
> RTE_BIT64
> 
>> +
>>   	if ((rx_offloads & DEV_RX_OFFLOAD_JUMBO_FRAME) &&
>>   	    (rxmode->max_rx_pkt_len > hw->max_mtu + ether_hdr_len))
>>   		req_features &= ~(1ULL << VIRTIO_NET_F_MTU);
> 
> [snip]
> 
>> @@ -2578,6 +2946,18 @@ virtio_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
>>   		(1ULL << VIRTIO_NET_F_HOST_TSO6);
>>   	if ((host_features & tso_mask) == tso_mask)
>>   		dev_info->tx_offload_capa |= DEV_TX_OFFLOAD_TCP_TSO;
>> +	if (host_features & (1ULL << VIRTIO_NET_F_RSS)) {
> 
> RTE_BIT64
> 
>> +		virtio_dev_get_rss_config(hw, &rss_hash_types);
>> +		dev_info->hash_key_size = VIRTIO_NET_RSS_KEY_SIZE;
>> +		dev_info->reta_size = VIRTIO_NET_RSS_RETA_SIZE;
>> +		dev_info->flow_type_rss_offloads =
>> +			virtio_to_ethdev_rss_offloads(rss_hash_types);
>> +	} else {
>> +		dev_info->hash_key_size = 0;
>> +		dev_info->reta_size = 0;
>> +		dev_info->flow_type_rss_offloads = 0;
>> +	}
>> +
>>   
>>   	if (host_features & (1ULL << VIRTIO_F_RING_PACKED)) {
>>   		/*
>> diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h
>> index 40be484218..c08f382791 100644
>> --- a/drivers/net/virtio/virtio_ethdev.h
>> +++ b/drivers/net/virtio/virtio_ethdev.h
>> @@ -45,7 +45,8 @@
>>   	 1u << VIRTIO_NET_F_GUEST_TSO6     |	\
>>   	 1u << VIRTIO_NET_F_CSUM           |	\
>>   	 1u << VIRTIO_NET_F_HOST_TSO4      |	\
>> -	 1u << VIRTIO_NET_F_HOST_TSO6)
>> +	 1u << VIRTIO_NET_F_HOST_TSO6      |	\
>> +	 1ULL << VIRTIO_NET_F_RSS)
> 
> IMHO it should be converted to use RTE_BIT64().
> Yes, separate story, but right now it looks
> confusing to see 1u and 1ULL above.

Ok, I'll add that to my Todo list.
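
For the record, a sketch of why the mix is more than cosmetic: 
VIRTIO_NET_F_RSS is feature bit 60 in the Virtio specification, which
does not fit in a 32-bit "1u" shift, so it needs a 64-bit constant while
the low bits happen to work either way. The fallback RTE_BIT64()
definition is only there so the snippet builds outside a DPDK tree, and
EXAMPLE_FEATURES is a hypothetical, shortened feature set.

```c
#include <stdint.h>

/* Stand-in so this builds without DPDK; in-tree, include <rte_bitops.h>. */
#ifndef RTE_BIT64
#define RTE_BIT64(nr) (UINT64_C(1) << (nr))
#endif

/* Feature bit numbers as defined by the Virtio specification. */
#define VIRTIO_NET_F_HOST_TSO6	12
#define VIRTIO_NET_F_RSS	60

/* Hypothetical, shortened feature set using one macro for every bit. */
#define EXAMPLE_FEATURES \
	(RTE_BIT64(VIRTIO_NET_F_HOST_TSO6) | \
	 RTE_BIT64(VIRTIO_NET_F_RSS))
```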

Thanks for the review,
Maxime

>>   
>>   extern const struct eth_dev_ops virtio_user_secondary_eth_dev_ops;
>>   
> 
> [snip]
> 

