From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Date: Mon, 23 Sep 2024 10:57:01 +0200
Subject: Re: [PATCH v2 1/3] eventdev: introduce event pre-scheduling
To: Pavan Nikhilesh Bhagavatula, "Pathak, Pravin", Jerin Jacob, Shijith Thotton,
 "Sevincer, Abdullah", "hemant.agrawal@nxp.com", "sachin.saxena@oss.nxp.com",
 "Van Haaren, Harry", "mattias.ronnblom@ericsson.com", "liangma@liangbit.com",
 "Mccarthy, Peter"
Cc: "dev@dpdk.org"
Message-ID: <95d75412-3600-4e70-9a06-438af8cc09ea@lysator.liu.se>
References: <20240910083117.4281-1-pbhagavatula@marvell.com>
 <20240917071106.8815-1-pbhagavatula@marvell.com>
 <20240917071106.8815-2-pbhagavatula@marvell.com>
List-Id: DPDK patches and discussions <dev.dpdk.org>

On 2024-09-19 15:13, Pavan Nikhilesh Bhagavatula wrote:
>>> From: pbhagavatula@marvell.com
>>> Sent: Tuesday, September 17, 2024 3:11 AM
>>> To: jerinj@marvell.com; sthotton@marvell.com; Sevincer, Abdullah;
>>> hemant.agrawal@nxp.com; sachin.saxena@oss.nxp.com; Van Haaren, Harry;
>>> mattias.ronnblom@ericsson.com; liangma@liangbit.com; Mccarthy, Peter
>>> Cc: dev@dpdk.org; Pavan Nikhilesh
>>> Subject: [PATCH v2 1/3] eventdev: introduce event pre-scheduling
>>>
>>> From: Pavan Nikhilesh
>>>
>>> Event pre-scheduling improves scheduling performance by assigning events
>>> to event ports in advance when dequeues are issued.
>>> The dequeue operation initiates the pre-schedule operation, which completes
>>> in parallel without affecting the dequeued event flow contexts and dequeue
>>> latency.
>>>
>> Is the prescheduling done to get the event more quickly in the next dequeue?
>> The first dequeue executes pre-schedule to make events available for the next
>> dequeue.
>> Is this how it is supposed to work?
>>
>
> Yes, that is correct.
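
So, if I understand the semantics correctly, from the application's point of
view it amounts to something like the below worker loop (a sketch, not from
the patch; dev_id, port_id and the burst size are placeholders):

#include <rte_common.h>
#include <rte_eventdev.h>

/* Hypothetical worker loop; dev_id/port_id are assumed to be configured,
 * linked and started elsewhere.
 */
static void
worker_loop(uint8_t dev_id, uint8_t port_id)
{
	struct rte_event evs[32];

	for (;;) {
		/* Returns any events pre-scheduled to this port by the
		 * previous call and, with pre-scheduling enabled, also
		 * initiates pre-scheduling of the events to be returned
		 * by the *next* dequeue call.
		 */
		uint16_t nb = rte_event_dequeue_burst(dev_id, port_id, evs,
						      RTE_DIM(evs), 0);

		for (uint16_t i = 0; i < nb; i++) {
			/* Process evs[i]; the flow contexts dequeued here
			 * are not affected by the pre-schedule operation
			 * running in parallel.
			 */
		}
	}
}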
"improves scheduling performance" may be a bit misleading, in that case. I
suggest "reduces scheduling overhead" instead. You can argue it likely reduces
scheduling performance, in certain scenarios: "reduces scheduling overhead, at
the cost of load balancing performance."

It seems to me that this should be a simple hint-type API, where the hint is
used by the event device to decide if pre-scheduling should be used or not
(assuming pre-scheduling on/off is even an option).

The hint would just be a way for the application to express whether or not it
wants the scheduler to prioritize load balancing agility and port-to-port
wall-time latency, or scheduling overhead, which in turn could potentially be
rephrased as the app being throughput or latency/RT-oriented.

It could also be useful for the event device to know which priority levels are
to be considered latency-sensitive, and which are throughput-oriented - maybe
in the form of a threshold.

>>> Event devices can indicate pre-scheduling capabilities using
>>> `RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE` and
>>> `RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE_ADAPTIVE` via the event device
>>> info function `info.event_dev_cap`.
>>>
>>> Applications can select the pre-schedule type and configure it through
>>> `rte_event_dev_config.preschedule_type` during `rte_event_dev_configure`.
>>>
>>> The supported pre-schedule types are:
>>> * `RTE_EVENT_DEV_PRESCHEDULE_NONE` - No pre-scheduling.
>>> * `RTE_EVENT_DEV_PRESCHEDULE` - Always issue a pre-schedule on dequeue.
>>> * `RTE_EVENT_DEV_PRESCHEDULE_ADAPTIVE` - Delay issuing pre-schedule until
>>>   there are no forward progress constraints with the held flow contexts.
>>>
>>> Signed-off-by: Pavan Nikhilesh
>>> ---
>>>  app/test/test_eventdev.c                    | 63 +++++++++++++++++++++
>>>  doc/guides/prog_guide/eventdev/eventdev.rst | 22 +++++++
>>>  lib/eventdev/rte_eventdev.h                 | 48 ++++++++++++++++
>>>  3 files changed, 133 insertions(+)
>>>
>>> diff --git a/app/test/test_eventdev.c b/app/test/test_eventdev.c
>>> index e4e234dc98..cf496ee88d 100644
>>> --- a/app/test/test_eventdev.c
>>> +++ b/app/test/test_eventdev.c
>>> @@ -1250,6 +1250,67 @@ test_eventdev_profile_switch(void)
>>>  	return TEST_SUCCESS;
>>>  }
>>>
>>> +static int
>>> +preschedule_test(rte_event_dev_preschedule_type_t preschedule_type, const char *preschedule_name)
>>> +{
>>> +#define NB_EVENTS 1024
>>> +	uint64_t start, total;
>>> +	struct rte_event ev;
>>> +	int rc, cnt;
>>> +
>>> +	ev.event_type = RTE_EVENT_TYPE_CPU;
>>> +	ev.queue_id = 0;
>>> +	ev.op = RTE_EVENT_OP_NEW;
>>> +	ev.u64 = 0xBADF00D0;
>>> +
>>> +	for (cnt = 0; cnt < NB_EVENTS; cnt++) {
>>> +		ev.flow_id = cnt;
>>> +		rc = rte_event_enqueue_burst(TEST_DEV_ID, 0, &ev, 1);
>>> +		TEST_ASSERT(rc == 1, "Failed to enqueue event");
>>> +	}
>>> +
>>> +	RTE_SET_USED(preschedule_type);
>>> +	total = 0;
>>> +	while (cnt) {
>>> +		start = rte_rdtsc_precise();
>>> +		rc = rte_event_dequeue_burst(TEST_DEV_ID, 0, &ev, 1, 0);
>>> +		if (rc) {
>>> +			total += rte_rdtsc_precise() - start;
>>> +			cnt--;
>>> +		}
>>> +	}
>>> +	printf("Preschedule type : %s, avg cycles %" PRIu64 "\n", preschedule_name,
>>> +	       total / NB_EVENTS);
>>> +
>>> +	return TEST_SUCCESS;
>>> +}
>>> +
>>> +static int
>>> +test_eventdev_preschedule_configure(void)
>>> +{
>>> +	struct rte_event_dev_config dev_conf;
>>> +	struct rte_event_dev_info info;
>>> +	int rc;
>>> +
>>> +	rte_event_dev_info_get(TEST_DEV_ID, &info);
>>> +
>>> +	if ((info.event_dev_cap & RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE) == 0)
>>> +		return TEST_SKIPPED;
>>> +
>>> +	devconf_set_default_sane_values(&dev_conf, &info);
>>> +	dev_conf.preschedule_type = RTE_EVENT_DEV_PRESCHEDULE;
>>> +	rc = rte_event_dev_configure(TEST_DEV_ID, &dev_conf);
>>> +	TEST_ASSERT_SUCCESS(rc, "Failed to configure eventdev");
>>> +
>>> +	rc = preschedule_test(RTE_EVENT_DEV_PRESCHEDULE_NONE, "RTE_EVENT_DEV_PRESCHEDULE_NONE");
>>> +	rc |= preschedule_test(RTE_EVENT_DEV_PRESCHEDULE, "RTE_EVENT_DEV_PRESCHEDULE");
>>> +	if (info.event_dev_cap & RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE_ADAPTIVE)
>>> +		rc |= preschedule_test(RTE_EVENT_DEV_PRESCHEDULE_ADAPTIVE,
>>> +				       "RTE_EVENT_DEV_PRESCHEDULE_ADAPTIVE");
>>> +
>>> +	return rc;
>>> +}
>>> +
>>>  static int
>>>  test_eventdev_close(void)
>>>  {
>>> @@ -1310,6 +1371,8 @@ static struct unit_test_suite eventdev_common_testsuite = {
>>>  			test_eventdev_start_stop),
>>>  		TEST_CASE_ST(eventdev_configure_setup, eventdev_stop_device,
>>>  			test_eventdev_profile_switch),
>>> +		TEST_CASE_ST(eventdev_configure_setup, NULL,
>>> +			test_eventdev_preschedule_configure),
>>>  		TEST_CASE_ST(eventdev_setup_device, eventdev_stop_device,
>>>  			test_eventdev_link),
>>>  		TEST_CASE_ST(eventdev_setup_device, eventdev_stop_device,
>>> diff --git a/doc/guides/prog_guide/eventdev/eventdev.rst b/doc/guides/prog_guide/eventdev/eventdev.rst
>>> index fb6dfce102..341b9bb2c6 100644
>>> --- a/doc/guides/prog_guide/eventdev/eventdev.rst
>>> +++ b/doc/guides/prog_guide/eventdev/eventdev.rst
>>> @@ -357,6 +357,28 @@ Worker path:
>>>          // Process the event received.
>>>      }
>>>
>>> +Event Pre-scheduling
>>> +~~~~~~~~~~~~~~~~~~~~
>>> +
>>> +Event pre-scheduling improves scheduling performance by assigning events
>>> +to event ports in advance when dequeues are issued.
>>> +The `rte_event_dequeue_burst` operation initiates the pre-schedule operation,
>>> +which completes in parallel without affecting the dequeued event flow contexts
>>> +and dequeue latency.
>>> +On the next dequeue operation, the pre-scheduled events are dequeued
>>> +and pre-schedule is initiated again.
>>> +
>>> +An application can use event pre-scheduling if the event device supports it
>>> +at either device level or at a individual port level.
>>> +The application can check pre-schedule capability by checking if
>>> +``rte_event_dev_info.event_dev_cap`` has the bit
>>> +``RTE_EVENT_DEV_CAP_PRESCHEDULE`` set, if present pre-scheduling can be
>>> +enabled at device configuration time by setting appropriate pre-schedule
>>> +type in ``rte_event_dev_config.preschedule``.
>>> +
>>> +Currently, the following pre-schedule types are supported:
>>> + * ``RTE_EVENT_DEV_PRESCHEDULE_NONE`` - No pre-scheduling.
>>> + * ``RTE_EVENT_DEV_PRESCHEDULE`` - Always issue a pre-schedule when dequeue is issued.
>>> + * ``RTE_EVENT_DEV_PRESCHEDULE_ADAPTIVE`` - Issue pre-schedule when dequeue is issued and there are
>>> +   no forward progress constraints.
>>> +
>>>  Starting the EventDev
>>>  ~~~~~~~~~~~~~~~~~~~~~
>>>
>>> diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
>>> index 08e5f9320b..5ea7f5a07b 100644
>>> --- a/lib/eventdev/rte_eventdev.h
>>> +++ b/lib/eventdev/rte_eventdev.h
>>> @@ -446,6 +446,30 @@ struct rte_event;
>>>   * @see RTE_SCHED_TYPE_PARALLEL
>>>   */
>>>
>>> +#define RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE (1ULL << 16)
>>> +/**< Event device supports event pre-scheduling.
>>> + *
>>> + * When this capability is available, the application can enable event pre-scheduling
>>> + * on the event device to pre-schedule events to a event port when
>>> + * `rte_event_dequeue_burst()` is issued.
>>> + * The pre-schedule process starts with the `rte_event_dequeue_burst()` call and the
>>> + * pre-scheduled events are returned on the next `rte_event_dequeue_burst()` call.
>>> + *
>>> + * @see rte_event_dev_configure()
>>> + */
>>> +
>>> +#define RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE_ADAPTIVE (1ULL << 17)
>>> +/**< Event device supports adaptive event pre-scheduling.
>>> + *
>>> + * When this capability is available, the application can enable adaptive pre-scheduling
>>> + * on the event device where the events are pre-scheduled when there are no forward
>>> + * progress constraints with the currently held flow contexts.
>>> + * The pre-schedule process starts with the `rte_event_dequeue_burst()` call and the
>>> + * pre-scheduled events are returned on the next `rte_event_dequeue_burst()` call.
>>> + *
>>> + * @see rte_event_dev_configure()
>>> + */
>>> +
>>>  /* Event device priority levels */
>>>  #define RTE_EVENT_DEV_PRIORITY_HIGHEST 0
>>>  /**< Highest priority level for events and queues.
>>> @@ -680,6 +704,25 @@ rte_event_dev_attr_get(uint8_t dev_id, uint32_t attr_id,
>>>   * @see rte_event_dequeue_timeout_ticks(), rte_event_dequeue_burst()
>>>   */
>>>
>>> +typedef enum {
>>> +	RTE_EVENT_DEV_PRESCHEDULE_NONE = 0,
>>> +	/* Disable pre-schedule across the event device or on a given event port.
>>> +	 * @ref rte_event_dev_config.preschedule_type
>>> +	 */
>>> +	RTE_EVENT_DEV_PRESCHEDULE,
>>> +	/* Enable pre-schedule always across the event device or a given event port.
>>> +	 * @ref rte_event_dev_config.preschedule_type
>>> +	 * @see RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE
>>> +	 */
>>> +	RTE_EVENT_DEV_PRESCHEDULE_ADAPTIVE,
>>> +	/* Enable adaptive pre-schedule across the event device or a given event port.
>>> +	 * Delay issuing pre-schedule until there are no forward progress constraints with
>>> +	 * the held flow contexts.
>>> +	 * @ref rte_event_dev_config.preschedule_type
>>> +	 * @see RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE_ADAPTIVE
>>> +	 */
>>> +} rte_event_dev_preschedule_type_t;
>>> +
>>>  /** Event device configuration structure */
>>>  struct rte_event_dev_config {
>>>  	uint32_t dequeue_timeout_ns;
>>> @@ -752,6 +795,11 @@ struct rte_event_dev_config {
>>>  	 * optimized for single-link usage, this field is a hint for how many
>>>  	 * to allocate; otherwise, regular event ports and queues will be used.
>>>  	 */
>>> +	rte_event_dev_preschedule_type_t preschedule_type;
>>> +	/**< Event pre-schedule type to use across the event device, if supported.
>>> +	 * @see RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE
>>> +	 * @see RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE_ADAPTIVE
>>> +	 */
>>>  };
>>>
>>>  /**
>>> --
>>> 2.25.1
>
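
For reference, my understanding is that enabling pre-scheduling with the
proposed API would look roughly like the below at configuration time (an
illustrative sketch only, not from the patch; the helper name is made up):

#include <rte_eventdev.h>

/* Hypothetical helper: prefer the adaptive variant when the device supports
 * it, before rte_event_dev_configure() is called with *conf.
 */
static void
select_preschedule_type(uint8_t dev_id, struct rte_event_dev_config *conf)
{
	struct rte_event_dev_info info;

	rte_event_dev_info_get(dev_id, &info);

	if (info.event_dev_cap & RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE_ADAPTIVE)
		conf->preschedule_type = RTE_EVENT_DEV_PRESCHEDULE_ADAPTIVE;
	else if (info.event_dev_cap & RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE)
		conf->preschedule_type = RTE_EVENT_DEV_PRESCHEDULE;
	else
		conf->preschedule_type = RTE_EVENT_DEV_PRESCHEDULE_NONE;
}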