From mboxrd@z Thu Jan 1 00:00:00 1970
From: Xueming Li
To:
CC: Jerin Jacob, Ferruh Yigit, Andrew Rybchenko, Viacheslav Ovsiienko, Thomas Monjalon, Lior Margalit, "Ananyev Konstantin"
Date: Sat, 16 Oct 2021 16:42:33 +0800
Message-ID: <20211016084237.1808161-2-xuemingl@nvidia.com>
X-Mailer: git-send-email 2.33.0
In-Reply-To: <20211016084237.1808161-1-xuemingl@nvidia.com>
References: <20210727034204.20649-1-xuemingl@nvidia.com> <20211016084237.1808161-1-xuemingl@nvidia.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
Subject: [dpdk-dev] [PATCH v7 1/5] ethdev: introduce shared Rx queue
List-Id: DPDK patches and discussions

In the current DPDK framework, each Rx queue is pre-loaded with mbufs to
hold incoming packets. For some PMDs, when the number of representors
scales out in a switch domain, the memory consumption becomes
significant. Polling all ports also leads to high cache miss rates, high
latency and low throughput.

This patch introduces the shared Rx queue. Ports in the same Rx domain
and switch domain can share an Rx queue set by specifying a non-zero
share group in the Rx queue configuration.

Port A RxQ X can share an Rx queue with Port B RxQ X, but not with
RxQ Y. All member ports in a share group share a list of shared Rx
queues indexed by Rx queue ID.
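The sharing rule above can be sketched as a small predicate. This is only an illustrative model, not code from the patch: `struct rxq_ident` and `rxq_is_shared_with()` are hypothetical names, and the fields merely mirror `domain_id`, `rx_domain` and `share_group` as described in this message. It assumes two queues map to the same shared Rx queue only when the switch domain, Rx domain, non-zero share group and queue index all match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified view of one Rx queue's sharing attributes.
 * Not a DPDK structure; field names mirror the patch for readability.
 */
struct rxq_ident {
	uint16_t domain_id;   /* switch domain of the owning port */
	uint16_t rx_domain;   /* Rx domain within the switch domain */
	uint16_t share_group; /* 0 means the queue is not shared */
	uint16_t queue_id;    /* Rx queue index on the owning port */
};

/*
 * Return true when both queues would refer to the same shared Rx queue
 * under the rules stated in the commit message.
 */
static bool
rxq_is_shared_with(const struct rxq_ident *a, const struct rxq_ident *b)
{
	assert(a != NULL && b != NULL);

	/* A zero share group disables sharing for that queue. */
	if (a->share_group == 0 || b->share_group == 0)
		return false;

	return a->domain_id == b->domain_id &&
	       a->rx_domain == b->rx_domain &&
	       a->share_group == b->share_group &&
	       a->queue_id == b->queue_id;
}
```

Under this model, Port A RxQ 0 and Port B RxQ 0 in the same group compare as shared, while Port A RxQ 0 and Port B RxQ 1 do not.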

No special API is defined to receive packets from a shared Rx queue.
Polling any member port of a shared Rx queue receives packets of that
queue for all member ports; the source port is identified by mbuf->port.
A shared Rx queue must be polled from a single thread or core; polling
the queue ID through any member port is equivalent.

Multiple share groups are supported. A device should support mixed
configurations by allowing multiple share groups alongside non-shared
Rx queues.

Example grouping and polling model to reflect service priority:
 Group1, 2 shared Rx queues per port: PF, rep0, rep1
 Group2, 1 shared Rx queue per port: rep2, rep3, ... rep127
 Core0: poll PF queue0
 Core1: poll PF queue1
 Core2: poll rep2 queue0

A PMD advertises the shared Rx queue capability via
RTE_ETH_DEV_CAPA_RXQ_SHARE.

The PMD is responsible for shared Rx queue consistency checks so that
member ports' configurations do not contradict each other.

Signed-off-by: Xueming Li
---
 doc/guides/nics/features.rst                    | 13 ++++++++++++
 doc/guides/nics/features/default.ini            |  1 +
 .../prog_guide/switch_representation.rst        | 10 +++++++++
 doc/guides/rel_notes/release_21_11.rst          |  6 ++++++
 lib/ethdev/rte_ethdev.c                         |  8 +++++++
 lib/ethdev/rte_ethdev.h                         | 21 +++++++++++++++++++
 6 files changed, 59 insertions(+)

diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
index e346018e4b8..b64433b8ea5 100644
--- a/doc/guides/nics/features.rst
+++ b/doc/guides/nics/features.rst
@@ -615,6 +615,19 @@ Supports inner packet L4 checksum.
 ``tx_offload_capa,tx_queue_offload_capa:DEV_TX_OFFLOAD_OUTER_UDP_CKSUM``.
 
 
+.. _nic_features_shared_rx_queue:
+
+Shared Rx queue
+---------------
+
+Supports shared Rx queue for ports in the same Rx domain of a switch domain.
+
+* **[uses] rte_eth_dev_info**: ``dev_capa:RTE_ETH_DEV_CAPA_RXQ_SHARE``.
+* **[uses] rte_eth_dev_info, rte_eth_switch_info**: ``rx_domain``, ``domain_id``.
+* **[uses] rte_eth_rxconf**: ``share_group``.
+* **[provides] mbuf**: ``mbuf.port``.
+
+
 .. _nic_features_packet_type_parsing:
 
 Packet type parsing
diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
index d473b94091a..93f5d1b46f4 100644
--- a/doc/guides/nics/features/default.ini
+++ b/doc/guides/nics/features/default.ini
@@ -19,6 +19,7 @@ Free Tx mbuf on demand =
 Queue start/stop =
 Runtime Rx queue setup =
 Runtime Tx queue setup =
+Shared Rx queue =
 Burst mode info =
 Power mgmt address monitor =
 MTU update =
diff --git a/doc/guides/prog_guide/switch_representation.rst b/doc/guides/prog_guide/switch_representation.rst
index ff6aa91c806..de41db8385d 100644
--- a/doc/guides/prog_guide/switch_representation.rst
+++ b/doc/guides/prog_guide/switch_representation.rst
@@ -123,6 +123,16 @@ thought as a software "patch panel" front-end for applications.
 .. [1] `Ethernet switch device driver model (switchdev) `_
+- For some PMDs, memory usage of representors becomes huge as the number
+  of representors grows, since mbufs are allocated for each Rx descriptor.
+  Polling a large number of ports brings more CPU load, cache misses and
+  latency. A shared Rx queue can be used to share an Rx queue between PF
+  and representors in the same Rx domain. ``RTE_ETH_DEV_CAPA_RXQ_SHARE``
+  is present in the device capability of device info. Set a non-zero share
+  group in the Rx queue configuration to enable sharing. Polling any member
+  port receives packets of all member ports in the group; the port ID is
+  saved in ``mbuf.port``.
+
 Basic SR-IOV
 ------------
 
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 4c56cdfeaaa..1c84e896554 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -67,6 +67,12 @@ New Features
   * Modified to allow ``--huge-dir`` option to specify a sub-directory
     within a hugetlbfs mountpoint.
 
+* **Added ethdev shared Rx queue support.**
+
+  * Added new device capability flag and Rx domain field to switch info.
+  * Added share group to Rx queue configuration.
+  * Added testpmd support and dedicated forwarding engine.
+
 * **Added new RSS offload types for IPv4/L4 checksum in RSS flow.**
 
   Added macros ETH_RSS_IPV4_CHKSUM and ETH_RSS_L4_CHKSUM, now IPv4 and
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 028907bc4b9..bc55f899f72 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -2159,6 +2159,14 @@ rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
 		return -EINVAL;
 	}
 
+	if (local_conf.share_group > 0 &&
+	    (dev_info.dev_capa & RTE_ETH_DEV_CAPA_RXQ_SHARE) == 0) {
+		RTE_ETHDEV_LOG(ERR,
+			"Ethdev port_id=%d rx_queue_id=%d, enabled share_group=%hu while device doesn't support Rx queue share\n",
+			port_id, rx_queue_id, local_conf.share_group);
+		return -EINVAL;
+	}
+
 	/*
 	 * If LRO is enabled, check that the maximum aggregated packet
 	 * size is supported by the configured device.
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 6d80514ba7a..59d8904ac7c 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -1044,6 +1044,13 @@ struct rte_eth_rxconf {
 	uint8_t rx_drop_en; /**< Drop packets if no descriptors are available. */
 	uint8_t rx_deferred_start; /**< Do not start queue with rte_eth_dev_start(). */
 	uint16_t rx_nseg; /**< Number of descriptions in rx_seg array. */
+	/**
+	 * Share group index in Rx domain and switch domain.
+	 * A non-zero value enables Rx queue sharing, zero disables it.
+	 * The PMD is responsible for Rx queue consistency checks so that
+	 * member ports' configurations do not contradict each other.
+	 */
+	uint16_t share_group;
 	/**
 	 * Per-queue Rx offloads to be set using DEV_RX_OFFLOAD_* flags.
 	 * Only offloads set on rx_queue_offload_capa or rx_offload_capa
@@ -1445,6 +1452,14 @@ struct rte_eth_conf {
 #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
 /** Device supports Tx queue setup after device started. */
 #define RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002
+/**
+ * Device supports shared Rx queue among ports within Rx domain and
+ * switch domain. Mbufs are consumed by the shared Rx queue instead of
+ * by every port. Multiple groups are supported via share_group of Rx
+ * queue configuration. Polling any port in the group receives packets
+ * of all member ports; source port is identified by mbuf->port field.
+ */
+#define RTE_ETH_DEV_CAPA_RXQ_SHARE RTE_BIT64(2)
 /**@}*/
 
 /*
@@ -1488,6 +1503,12 @@ struct rte_eth_switch_info {
 	 * but each driver should explicitly define the mapping of switch
 	 * port identifier to that physical interconnect/switch
 	 */
+	/**
+	 * Shared Rx queue sub-domain boundary. Only ports in the same Rx
+	 * domain and switch domain can share an Rx queue. Valid only if the
+	 * device advertises the RTE_ETH_DEV_CAPA_RXQ_SHARE capability.
+	 */
+	uint16_t rx_domain;
 };
 
 /**
-- 
2.33.0
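
Since polling one member port returns packets of all member ports, an application has to attribute each packet to its source via mbuf->port. A minimal standalone sketch of that demultiplexing step follows. It is an assumption-laden model rather than DPDK code: `struct pkt`, `MAX_PORTS` and `demux_by_port()` are hypothetical names, and `struct pkt` stands in for the single `port` field of a real mbuf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS 8 /* hypothetical bound chosen for this sketch */

/* Hypothetical stand-in for the one mbuf field that matters here. */
struct pkt {
	uint16_t port; /* source port, as mbuf->port would report it */
};

/*
 * Attribute each packet of one burst to its source port by bumping a
 * per-port counter. Packets whose port is outside the table are skipped.
 * Returns the number of packets that were attributed to a port.
 */
static size_t
demux_by_port(const struct pkt *burst, size_t n, uint32_t counts[MAX_PORTS])
{
	size_t accepted = 0;

	assert(burst != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (burst[i].port >= MAX_PORTS)
			continue; /* unknown source port, not counted */
		counts[burst[i].port]++;
		accepted++;
	}
	return accepted;
}
```

In a real application the burst would come from polling a single member port of the share group, and the per-port counter would be replaced by whatever per-port handling the application needs.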