From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 8224EA0C43; Tue, 19 Oct 2021 11:23:14 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 1C993410E8; Tue, 19 Oct 2021 11:23:11 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id BD6E8410DF for ; Tue, 19 Oct 2021 11:23:09 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1634635389; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=8ConIWuzEFt9dgrRQLGZFRGtXYxR9oDvj/YgSUXjpVw=; b=H9yWsndWET/c9Y7Oimv2xJBiCr2YWZJviF/WpL1ddcS69BEDIqO7++tM4tl3Ax9xHN2YYk riDHXcSVLeKwvVIlEajYSPM+LTUHyFfG8jEYU7u/C5Yx04mtfma1Ayy07dKXThpkCcpaAP WXHwvob48pR5nu+lpiMrczBhcAClUWs= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-19-10tV7xLaML-naSW0jDpHig-1; Tue, 19 Oct 2021 05:23:06 -0400 X-MC-Unique: 10tV7xLaML-naSW0jDpHig-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 7452019200C2; Tue, 19 Oct 2021 09:23:04 +0000 (UTC) Received: from [10.39.208.27] (unknown [10.39.208.27]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 31A3277E26; Tue, 19 Oct 2021 09:22:51 +0000 (UTC) Message-ID: Date: Tue, 19 Oct 2021 11:22:50 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.1.0 To: Andrew Rybchenko , dev@dpdk.org, chenbo.xia@intel.com, amorenoz@redhat.com, david.marchand@redhat.com, ferruh.yigit@intel.com, michaelba@nvidia.com, viacheslavo@nvidia.com, xiaoyun.li@intel.com Cc: nelio.laranjeiro@6wind.com, yvugenfi@redhat.com, ybendito@redhat.com References: <20211018102045.255831-1-maxime.coquelin@redhat.com> <20211018102045.255831-2-maxime.coquelin@redhat.com> <3cf32ebd-47cd-05d0-c64f-67e9418839ba@oktetlabs.ru> From: Maxime Coquelin In-Reply-To: <3cf32ebd-47cd-05d0-c64f-67e9418839ba@oktetlabs.ru> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=maxime.coquelin@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH v5 1/5] net/virtio: add initial RSS support X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Hi Andrew, On 10/19/21 09:30, Andrew Rybchenko wrote: > On 10/18/21 1:20 PM, Maxime Coquelin wrote: >> Provide the capability to update the hash key, hash types >> and RETA table on the fly (without needing to stop/start >> the device). However, the key length and the number of RETA >> entries are fixed to 40B and 128 entries respectively. This >> is done in order to simplify the design, but may be >> revisited later as the Virtio spec provides this >> flexibility. >> >> Note that only VIRTIO_NET_F_RSS support is implemented, >> VIRTIO_NET_F_HASH_REPORT, which would enable reporting the >> packet RSS hash calculated by the device into mbuf.rss, is >> not yet supported. >> >> Regarding the default RSS configuration, it has been >> chosen to use the default Intel ixgbe key as default key, >> and default RETA is a simple modulo between the hash and >> the number of Rx queues. >> >> Signed-off-by: Maxime Coquelin > > See review notes below > >> diff --git a/drivers/net/virtio/virtio.h b/drivers/net/virtio/virtio.h >> index e78b2e429e..7118e5d24c 100644 >> --- a/drivers/net/virtio/virtio.h >> +++ b/drivers/net/virtio/virtio.h > > [snip] > >> @@ -100,6 +101,29 @@ >> */ >> #define VIRTIO_MAX_INDIRECT ((int)(rte_mem_page_size() / 16)) >> >> +/* Virtio RSS hash types */ >> +#define VIRTIO_NET_HASH_TYPE_IPV4 (1 << 0) >> +#define VIRTIO_NET_HASH_TYPE_TCPV4 (1 << 1) >> +#define VIRTIO_NET_HASH_TYPE_UDPV4 (1 << 2) >> +#define VIRTIO_NET_HASH_TYPE_IPV6 (1 << 3) >> +#define VIRTIO_NET_HASH_TYPE_TCPV6 (1 << 4) >> +#define VIRTIO_NET_HASH_TYPE_UDPV6 (1 << 5) >> +#define VIRTIO_NET_HASH_TYPE_IP_EX (1 << 6) >> +#define VIRTIO_NET_HASH_TYPE_TCP_EX (1 << 7) >> +#define VIRTIO_NET_HASH_TYPE_UDP_EX (1 << 8) > > I think it is a bit better to use RTE_BIT32() above. I did not know this macro existed, I will use it in next revision. > [snip] > >> diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c >> index aff791fbd0..a8e2bf3e1a 100644 >> --- a/drivers/net/virtio/virtio_ethdev.c >> +++ b/drivers/net/virtio/virtio_ethdev.c > > [snip] > >> @@ -347,20 +357,51 @@ virtio_send_command(struct virtnet_ctl *cvq, struct virtio_pmd_ctrl *ctrl, >> } >> >> static int >> -virtio_set_multiple_queues(struct rte_eth_dev *dev, uint16_t nb_queues) >> +virtio_set_multiple_queues_rss(struct rte_eth_dev *dev, uint16_t nb_queues) >> { >> struct virtio_hw *hw = dev->data->dev_private; >> struct virtio_pmd_ctrl ctrl; >> - int dlen[1]; >> + struct virtio_net_ctrl_rss rss; >> + int dlen, ret; >> + >> + rss.hash_types = hw->rss_hash_types & VIRTIO_NET_HASH_TYPE_MASK; > > RTE_BUILD_BUG_ON(!RTE_IS_POWER_OF_2(VIRTIO_NET_RSS_RETA_SIZE)); Makes sense, it indeed relies on the reta size being a power of 2, which the spec requires. >> + rss.indirection_table_mask = VIRTIO_NET_RSS_RETA_SIZE - 1; > > It relies on the fact that device is power of 2. > So, suggest to add above check. > >> + rss.unclassified_queue = 0; >> + memcpy(rss.indirection_table, hw->rss_reta, VIRTIO_NET_RSS_RETA_SIZE * sizeof(uint16_t)); >> + rss.max_tx_vq = nb_queues; > > Is it guaranteed that driver is configured with equal number > of Rx and Tx queues? Or is it not a problem otherwise? Virtio networking devices works with queue pairs. > [snip] > >> +static int >> +virtio_dev_get_rss_config(struct virtio_hw *hw, uint32_t *rss_hash_types) >> +{ >> + struct virtio_net_config local_config; >> + struct virtio_net_config *config = &local_config; >> + >> + virtio_read_dev_config(hw, >> + offsetof(struct virtio_net_config, rss_max_key_size), >> + &config->rss_max_key_size, >> + sizeof(config->rss_max_key_size)); >> + if (config->rss_max_key_size < VIRTIO_NET_RSS_KEY_SIZE) { > > Shouldn't it be > config->rss_max_key_size != VIRTIO_NET_RSS_KEY_SIZE ? I don't think so. > Or do we just ensure that HW supports at least required key > size and rely on the fact that it will reject set request later > if our size is not supported in fact? Exactly. The device advertises the max key length it supports in its PCI config space. Then, the driver specifies the key length it uses when sending the virtio_net_ctrl_rss message on the control queue. If we later try to set a different key via .rss_hash_update(), its size is checked there (see virtio_dev_rss_hash_update()). >> + PMD_INIT_LOG(ERR, "Invalid device RSS max key size (%u)", >> + config->rss_max_key_size); >> + return -EINVAL; >> + } >> + >> + virtio_read_dev_config(hw, >> + offsetof(struct virtio_net_config, >> + rss_max_indirection_table_length), >> + &config->rss_max_indirection_table_length, >> + sizeof(config->rss_max_indirection_table_length)); >> + if (config->rss_max_indirection_table_length < VIRTIO_NET_RSS_RETA_SIZE) { > > Same question here. Same anwser, virtio_dev_rss_reta_update() ensures the table size is VIRTIO_NET_RSS_RETA_SIZE. >> + PMD_INIT_LOG(ERR, "Invalid device RSS max reta size (%u)", >> + config->rss_max_indirection_table_length); >> + return -EINVAL; >> + } >> + >> + virtio_read_dev_config(hw, >> + offsetof(struct virtio_net_config, supported_hash_types), >> + &config->supported_hash_types, >> + sizeof(config->supported_hash_types)); >> + if ((config->supported_hash_types & VIRTIO_NET_HASH_TYPE_MASK) == 0) { >> + PMD_INIT_LOG(ERR, "Invalid device RSS hash types (0x%x)", >> + config->supported_hash_types); >> + return -EINVAL; >> + } >> + >> + *rss_hash_types = config->supported_hash_types & VIRTIO_NET_HASH_TYPE_MASK; >> + >> + PMD_INIT_LOG(DEBUG, "Device RSS config:"); >> + PMD_INIT_LOG(DEBUG, "\t-Max key size: %u", config->rss_max_key_size); >> + PMD_INIT_LOG(DEBUG, "\t-Max reta size: %u", config->rss_max_indirection_table_length); >> + PMD_INIT_LOG(DEBUG, "\t-Supported hash types: 0x%x", *rss_hash_types); >> + >> + return 0; >> +} >> + >> +static int >> +virtio_dev_rss_hash_update(struct rte_eth_dev *dev, >> + struct rte_eth_rss_conf *rss_conf) >> +{ >> + struct virtio_hw *hw = dev->data->dev_private; >> + uint16_t nb_queues; >> + >> + if (!virtio_with_feature(hw, VIRTIO_NET_F_RSS)) >> + return -ENOTSUP; >> + >> + if (rss_conf->rss_hf & ~virtio_to_ethdev_rss_offloads(VIRTIO_NET_HASH_TYPE_MASK)) >> + return -EINVAL; >> + >> + hw->rss_hash_types = ethdev_to_virtio_rss_offloads(rss_conf->rss_hf); >> + >> + if (rss_conf->rss_key && rss_conf->rss_key_len) { >> + if (rss_conf->rss_key_len != VIRTIO_NET_RSS_KEY_SIZE) { >> + PMD_INIT_LOG(ERR, "Driver only supports %u RSS key length", >> + VIRTIO_NET_RSS_KEY_SIZE); >> + return -EINVAL; >> + } >> + memcpy(hw->rss_key, rss_conf->rss_key, VIRTIO_NET_RSS_KEY_SIZE); >> + } >> + >> + nb_queues = RTE_MAX(dev->data->nb_rx_queues, dev->data->nb_tx_queues); >> + return virtio_set_multiple_queues_rss(dev, nb_queues); > > Don't we need to rollback data in hw in the case of failure? Agree, it would be better. > [snip] > >> +static int virtio_dev_rss_reta_update(struct rte_eth_dev *dev, >> + struct rte_eth_rss_reta_entry64 *reta_conf, >> + uint16_t reta_size) >> +{ >> + struct virtio_hw *hw = dev->data->dev_private; >> + uint16_t nb_queues; >> + int idx, pos, i; >> + >> + if (!virtio_with_feature(hw, VIRTIO_NET_F_RSS)) >> + return -ENOTSUP; >> + >> + if (reta_size != VIRTIO_NET_RSS_RETA_SIZE) >> + return -EINVAL; >> + >> + for (i = 0; i < reta_size; i++) { >> + idx = i / RTE_RETA_GROUP_SIZE; >> + pos = i % RTE_RETA_GROUP_SIZE; >> + >> + if (((reta_conf[idx].mask >> pos) & 0x1) == 0) >> + continue; >> + >> + hw->rss_reta[i] = reta_conf[idx].reta[pos]; >> + } >> + >> + nb_queues = RTE_MAX(dev->data->nb_rx_queues, dev->data->nb_tx_queues); >> + return virtio_set_multiple_queues_rss(dev, nb_queues); > > Question about rollback in the case of failure stands here as > well. > >> +} Yes. > [snip] > > >> +static int >> +virtio_dev_rss_init(struct rte_eth_dev *eth_dev) >> +{ >> + struct virtio_hw *hw = eth_dev->data->dev_private; >> + uint16_t nb_rx_queues = eth_dev->data->nb_rx_queues; >> + struct rte_eth_rss_conf *rss_conf; >> + int ret, i; >> + >> + rss_conf = ð_dev->data->dev_conf.rx_adv_conf.rss_conf; >> + >> + ret = virtio_dev_get_rss_config(hw, &hw->rss_hash_types); >> + if (ret) >> + return ret; >> + >> + if (rss_conf->rss_hf) { >> + /* Ensure requested hash types are supported by the device */ >> + if (rss_conf->rss_hf & ~virtio_to_ethdev_rss_offloads(hw->rss_hash_types)) >> + return -EINVAL; >> + >> + hw->rss_hash_types = ethdev_to_virtio_rss_offloads(rss_conf->rss_hf); >> + } >> + >> + if (!hw->rss_key) { >> + /* Setup default RSS key if not already setup by the user */ >> + hw->rss_key = rte_malloc_socket("rss_key", >> + VIRTIO_NET_RSS_KEY_SIZE, 0, >> + eth_dev->device->numa_node); >> + if (!hw->rss_key) { >> + PMD_INIT_LOG(ERR, "Failed to allocate RSS key"); >> + return -1; >> + } >> + >> + if (rss_conf->rss_key && rss_conf->rss_key_len) { >> + if (rss_conf->rss_key_len != VIRTIO_NET_RSS_KEY_SIZE) { >> + PMD_INIT_LOG(ERR, "Driver only supports %u RSS key length", >> + VIRTIO_NET_RSS_KEY_SIZE); >> + return -EINVAL; >> + } >> + memcpy(hw->rss_key, rss_conf->rss_key, VIRTIO_NET_RSS_KEY_SIZE); >> + } else { >> + memcpy(hw->rss_key, rss_intel_key, VIRTIO_NET_RSS_KEY_SIZE); >> + } > > Above if should work in the case of reconfigure as well when > array is already allocated. I'm not sure, because rte_eth_dev_rss_hash_update() API does not update eth_dev->data->dev_conf.rx_adv_conf.rss_conf, so we may lose the RSS key the user would have updated using this API. What do you think? >> + } >> + >> + if (!hw->rss_reta) { >> + /* Setup default RSS reta if not already setup by the user */ >> + hw->rss_reta = rte_malloc_socket("rss_reta", >> + VIRTIO_NET_RSS_RETA_SIZE * sizeof(uint16_t), 0, >> + eth_dev->device->numa_node); >> + if (!hw->rss_reta) { >> + PMD_INIT_LOG(ERR, "Failed to allocate RSS reta"); >> + return -1; >> + } >> + for (i = 0; i < VIRTIO_NET_RSS_RETA_SIZE; i++) >> + hw->rss_reta[i] = i % nb_rx_queues; > > How should it work in the case of reconfigure if a nubmer of Rx > queue changes? Hmm, good catch! I did not think about this, and so did not test it. Not to lose user-provisionned reta in case of unrelated change, maybe I should save the number of Rx queues when setting the reta (here and in virtio_dev_rss_reta_update), and perform a re-initialization if it is different? > Also I'm wondering how it works... > virtio_dev_rss_init() is called from eth_virtio_dev_init() as > well when a number of Rx queues is zero. I guess the reason is > VIRTIO_PMD_DEFAULT_GUEST_FEATURES, but it looks very fragile. Yes, we only add VIRTIO_NET_F_RSS to the supported features in virtio_dev_configure(), if Rx MQ mode is RSS. So virtio_dev_rss_init() will never be called from eth_virtio_dev_init(). If I'm not mistaken, rte_eth_dev_configure() must be called it is stated in the API documentation, so it will be negotiated if conditions are met. >> + } >> + >> + return 0; >> +} >> + > > [snip] > >> @@ -2107,6 +2465,9 @@ virtio_dev_configure(struct rte_eth_dev *dev) >> return ret; >> } >> >> + if (rxmode->mq_mode == ETH_MQ_RX_RSS) >> + req_features |= (1ULL << VIRTIO_NET_F_RSS); > > RTE_BIT64 > >> + >> if ((rx_offloads & DEV_RX_OFFLOAD_JUMBO_FRAME) && >> (rxmode->max_rx_pkt_len > hw->max_mtu + ether_hdr_len)) >> req_features &= ~(1ULL << VIRTIO_NET_F_MTU); > > [snip] > >> @@ -2578,6 +2946,18 @@ virtio_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info) >> (1ULL << VIRTIO_NET_F_HOST_TSO6); >> if ((host_features & tso_mask) == tso_mask) >> dev_info->tx_offload_capa |= DEV_TX_OFFLOAD_TCP_TSO; >> + if (host_features & (1ULL << VIRTIO_NET_F_RSS)) { > > RTE_BIT64 > >> + virtio_dev_get_rss_config(hw, &rss_hash_types); >> + dev_info->hash_key_size = VIRTIO_NET_RSS_KEY_SIZE; >> + dev_info->reta_size = VIRTIO_NET_RSS_RETA_SIZE; >> + dev_info->flow_type_rss_offloads = >> + virtio_to_ethdev_rss_offloads(rss_hash_types); >> + } else { >> + dev_info->hash_key_size = 0; >> + dev_info->reta_size = 0; >> + dev_info->flow_type_rss_offloads = 0; >> + } >> + >> >> if (host_features & (1ULL << VIRTIO_F_RING_PACKED)) { >> /* >> diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h >> index 40be484218..c08f382791 100644 >> --- a/drivers/net/virtio/virtio_ethdev.h >> +++ b/drivers/net/virtio/virtio_ethdev.h >> @@ -45,7 +45,8 @@ >> 1u << VIRTIO_NET_F_GUEST_TSO6 | \ >> 1u << VIRTIO_NET_F_CSUM | \ >> 1u << VIRTIO_NET_F_HOST_TSO4 | \ >> - 1u << VIRTIO_NET_F_HOST_TSO6) >> + 1u << VIRTIO_NET_F_HOST_TSO6 | \ >> + 1ULL << VIRTIO_NET_F_RSS) > > IMHO it should be converted to use RTE_BIT64(). > Yes, separate story, but right now it looks > confusing to see 1u and 1ULL above. Ok, I'll add that to my Todo list. Thanks for the review, Maxime >> >> extern const struct eth_dev_ops virtio_user_secondary_eth_dev_ops; >> > [ > > [snip] >