From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
To: "Pathak, Pravin" <pravin.pathak@intel.com>,
	"Mattias Rönnblom" <hofors@lysator.liu.se>,
	"Jerin Jacob" <jerinj@marvell.com>,
	"Shijith Thotton" <sthotton@marvell.com>,
	"Sevincer, Abdullah" <abdullah.sevincer@intel.com>,
	"hemant.agrawal@nxp.com" <hemant.agrawal@nxp.com>,
	"sachin.saxena@oss.nxp.com" <sachin.saxena@oss.nxp.com>,
	"Van Haaren, Harry" <harry.van.haaren@intel.com>,
	"mattias.ronnblom@ericsson.com" <mattias.ronnblom@ericsson.com>,
	"liangma@liangbit.com" <liangma@liangbit.com>,
	"Mccarthy, Peter" <peter.mccarthy@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: RE: [EXTERNAL] Re: [PATCH v2 1/3] eventdev: introduce event pre-scheduling
Date: Thu, 26 Sep 2024 10:03:27 +0000	[thread overview]
Message-ID: <PH0PR18MB408621545AA0A042F1888D5EDE6A2@PH0PR18MB4086.namprd18.prod.outlook.com> (raw)
In-Reply-To: <BL1PR11MB5461B9AD533C0A3D549F45DCF46A2@BL1PR11MB5461.namprd11.prod.outlook.com>

> > -----Original Message-----
> > From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
> > Sent: Wednesday, September 25, 2024 6:30 AM
> > To: Mattias Rönnblom <hofors@lysator.liu.se>; Pathak, Pravin
> > <pravin.pathak@intel.com>; Jerin Jacob <jerinj@marvell.com>; Shijith
> Thotton
> > <sthotton@marvell.com>; Sevincer, Abdullah
> <abdullah.sevincer@intel.com>;
> > hemant.agrawal@nxp.com; sachin.saxena@oss.nxp.com; Van Haaren, Harry
> > <harry.van.haaren@intel.com>; mattias.ronnblom@ericsson.com;
> > liangma@liangbit.com; Mccarthy, Peter <peter.mccarthy@intel.com>
> > Cc: dev@dpdk.org
> > Subject: RE: [EXTERNAL] Re: [PATCH v2 1/3] eventdev: introduce event pre-
> > scheduling
> >
> > > On 2024-09-19 15:13, Pavan Nikhilesh Bhagavatula wrote:
> > > >>> From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
> > > >>> Sent: Tuesday, September 17, 2024 3:11 AM
> > > >>> To: jerinj@marvell.com; sthotton@marvell.com; Sevincer, Abdullah
> > > >>> <abdullah.sevincer@intel.com>; hemant.agrawal@nxp.com;
> > > >>> sachin.saxena@oss.nxp.com; Van Haaren, Harry
> > > >> <harry.van.haaren@intel.com>;
> > > >>> mattias.ronnblom@ericsson.com; liangma@liangbit.com; Mccarthy,
> > > >>> Peter <peter.mccarthy@intel.com>
> > > >>> Cc: dev@dpdk.org; Pavan Nikhilesh <pbhagavatula@marvell.com>
> > > >>> Subject: [PATCH v2 1/3] eventdev: introduce event pre-scheduling
> > > >>>
> > > >>> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> > > >>>
> > > >>> Event pre-scheduling improves scheduling performance by assigning
> > > events
> > > >> to
> > > >>> event ports in advance when dequeues are issued.
> > > >>> The dequeue operation initiates the pre-schedule operation, which
> > > >> completes in
> > > >>> parallel without affecting the dequeued event flow contexts and
> > > >>> dequeue latency.
> > > >>>
> > > >> Is the prescheduling done to get the event more quickly in the next
> > > dequeue?
> > > >> The first dequeue executes pre-schedule to make events available
> > > >> for the
> > > next
> > > >> dequeue.
> > > >> Is this how it is supposed to work?
> > > >>
> > > >
> > > > Yes, that is correct.
> > > >
> > >
> > > "improves scheduling performance" may be a bit misleading, in that case.
> > > I suggest "reduces scheduling overhead" instead. You can argue it
> > > likely reduces scheduling performance, in certain scenarios. "reduces
> > > scheduling overhead, at the cost of load balancing performance."
> > >
> >
> > In case of OCTEON, we see double the scheduling performance with
> > prescheduling without affecting any priority/weight aspects.
> >
> > > It seems to me that this should be a simple hint-type API, where the
> > > hint is used by the event device to decide if pre-scheduling should be
> > > used or not (assuming pre-scheduling on/off is even an option). The
> > > hint would just be a way for the application to express whether or not
> > > it want the scheduler to prioritize load balancing agility and
> > > port-to-port wall-time latency, or scheduling overhead, which in turn
> > > could potentially be rephrased as the app being throughput or latency/RT-
> > oriented.
> > >
> >
> > The three prescheduling types are designed based on real world use-cases
> that
> > some of our customers require in their applications.
> > Relying on the application to provide hints might not be possible in all cases as
> it
> > is very timing sensitive.
> >
> >
> > > It could also be useful for the event device to know which priority
> > > levels are to be considered latency-sensitive, and which are
> > > throughput-oriented - maybe in the form of a threshold.
> > >
> > > >>> Event devices can indicate pre-scheduling capabilities using
> > > >>> `RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE` and
> > > >>> `RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE_ADAPTIVE` via the
> event
> > > >> device
> > > >>> info function `info.event_dev_cap`.
> 
> What is PRESCHEDULE_ADAPTIVE? Can you please add more description?

Unlike raw PRESCHEDULE, where every dequeue triggers a pre-scheduling request in parallel,
PRESCHEDULE_ADAPTIVE delays issuing the pre-schedule until the event device knows that the
currently scheduled context can make forward progress.
For example, on OCTEON the hardware issues it only when it sees that the scheduling context
held by the port is at the top/head of the flow.
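
For reference, a minimal configure-time sketch using the capability bits and config field
introduced in this patch (dev_id and the rest of the config setup are assumed; error
handling trimmed):

    struct rte_event_dev_info info;
    struct rte_event_dev_config conf = {0};

    rte_event_dev_info_get(dev_id, &info);
    /* ... fill nb_event_queues/nb_event_ports etc. from info as usual ... */

    if (info.event_dev_cap & RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE_ADAPTIVE)
        conf.preschedule_type = RTE_EVENT_DEV_PRESCHEDULE_ADAPTIVE;
    else if (info.event_dev_cap & RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE)
        conf.preschedule_type = RTE_EVENT_DEV_PRESCHEDULE;
    else
        conf.preschedule_type = RTE_EVENT_DEV_PRESCHEDULE_NONE;

    rte_event_dev_configure(dev_id, &conf);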

> This will be more useful as per port configuration instead of device-level
> configuration.

In patch 2/3 I introduced an API to control pre-scheduling at the event port level.

It is not a port configuration API because applications might want to enable/disable
pre-scheduling in the fast path. An example use case we see is disabling pre-scheduling
when the application wants to preempt an lcore, and re-enabling it at a later point in time.
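
As a rough illustration of that use case (the call name follows the patch 2/3 subject;
treat the exact function/signature as an assumption until that patch settles):

    /* Lcore is about to be preempted: stop pre-scheduling on this port so
     * no events sit pre-scheduled while the core is away.
     */
    rte_event_port_preschedule_modify(dev_id, port_id,
                                      RTE_EVENT_DEV_PRESCHEDULE_NONE);

    /* ... lcore yields / runs other work ... */

    /* Back on the fast path: turn pre-scheduling back on. */
    rte_event_port_preschedule_modify(dev_id, port_id,
                                      RTE_EVENT_DEV_PRESCHEDULE);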


> The application can choose a type based on its requirement on the port it is
> serving.
> As Mattias suggested, if this is made HINT flag for port configuration, other
> PMDs can
> Ignore it based on either they may not need it depending on their architecture
> or not support it.
> 

If a PMD supports pre-scheduling, it has to advertise that via capabilities; silently
ignoring a feature configuration is bad.

We can make the fast-path APIs hints.



> > > >>>
> > > >>> Applications can select the pre-schedule type and configure it
> > > >>> through `rte_event_dev_config.preschedule_type` during
> > > `rte_event_dev_configure`.
> > > >>>
> > > >>> The supported pre-schedule types are:
> > > >>>   * `RTE_EVENT_DEV_PRESCHEDULE_NONE` - No pre-scheduling.
> > > >>>   * `RTE_EVENT_DEV_PRESCHEDULE` - Always issue a pre-schedule on
> > > >> dequeue.
> > > >>>   * `RTE_EVENT_DEV_PRESCHEDULE_ADAPTIVE` - Delay issuing pre-
> > > schedule
> > > >>> until
> > > >>>     there are no forward progress constraints with the held flow
> contexts.
> > > >>>
> > > >>> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> > > >>> ---
> > > >>>   app/test/test_eventdev.c                    | 63 +++++++++++++++++++++
> > > >>>   doc/guides/prog_guide/eventdev/eventdev.rst | 22 +++++++
> > > >>>   lib/eventdev/rte_eventdev.h                 | 48 ++++++++++++++++
> > > >>>   3 files changed, 133 insertions(+)
> > > >>>
> > > >>> diff --git a/app/test/test_eventdev.c b/app/test/test_eventdev.c
> > > >>> index e4e234dc98..cf496ee88d 100644
> > > >>> --- a/app/test/test_eventdev.c
> > > >>> +++ b/app/test/test_eventdev.c
> > > >>> @@ -1250,6 +1250,67 @@ test_eventdev_profile_switch(void)
> > > >>>   	return TEST_SUCCESS;
> > > >>>   }
> > > >>>
> > > >>> +static int
> > > >>> +preschedule_test(rte_event_dev_preschedule_type_t
> > > >>> +preschedule_type, const char *preschedule_name) {
> > > >>> +#define NB_EVENTS     1024
> > > >>> +	uint64_t start, total;
> > > >>> +	struct rte_event ev;
> > > >>> +	int rc, cnt;
> > > >>> +
> > > >>> +	ev.event_type = RTE_EVENT_TYPE_CPU;
> > > >>> +	ev.queue_id = 0;
> > > >>> +	ev.op = RTE_EVENT_OP_NEW;
> > > >>> +	ev.u64 = 0xBADF00D0;
> > > >>> +
> > > >>> +	for (cnt = 0; cnt < NB_EVENTS; cnt++) {
> > > >>> +		ev.flow_id = cnt;
> > > >>> +		rc = rte_event_enqueue_burst(TEST_DEV_ID, 0, &ev,
> 1);
> > > >>> +		TEST_ASSERT(rc == 1, "Failed to enqueue event");
> > > >>> +	}
> > > >>> +
> > > >>> +	RTE_SET_USED(preschedule_type);
> > > >>> +	total = 0;
> > > >>> +	while (cnt) {
> > > >>> +		start = rte_rdtsc_precise();
> > > >>> +		rc = rte_event_dequeue_burst(TEST_DEV_ID, 0, &ev,
> 1, 0);
> > > >>> +		if (rc) {
> > > >>> +			total += rte_rdtsc_precise() - start;
> > > >>> +			cnt--;
> > > >>> +		}
> > > >>> +	}
> > > >>> +	printf("Preschedule type : %s, avg cycles %" PRIu64 "\n",
> > > >>> preschedule_name,
> > > >>> +	       total / NB_EVENTS);
> > > >>> +
> > > >>> +	return TEST_SUCCESS;
> > > >>> +}
> > > >>> +
> > > >>> +static int
> > > >>> +test_eventdev_preschedule_configure(void)
> > > >>> +{
> > > >>> +	struct rte_event_dev_config dev_conf;
> > > >>> +	struct rte_event_dev_info info;
> > > >>> +	int rc;
> > > >>> +
> > > >>> +	rte_event_dev_info_get(TEST_DEV_ID, &info);
> > > >>> +
> > > >>> +	if ((info.event_dev_cap &
> > > >> RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE)
> > > >>> == 0)
> > > >>> +		return TEST_SKIPPED;
> > > >>> +
> > > >>> +	devconf_set_default_sane_values(&dev_conf, &info);
> > > >>> +	dev_conf.preschedule_type =
> RTE_EVENT_DEV_PRESCHEDULE;
> > > >>> +	rc = rte_event_dev_configure(TEST_DEV_ID, &dev_conf);
> > > >>> +	TEST_ASSERT_SUCCESS(rc, "Failed to configure eventdev");
> > > >>> +
> > > >>> +	rc = preschedule_test(RTE_EVENT_DEV_PRESCHEDULE_NONE,
> > > >>> "RTE_EVENT_DEV_PRESCHEDULE_NONE");
> > > >>> +	rc |= preschedule_test(RTE_EVENT_DEV_PRESCHEDULE,
> > > >>> "RTE_EVENT_DEV_PRESCHEDULE");
> > > >>> +	if (info.event_dev_cap &
> > > >>> RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE_ADAPTIVE)
> > > >>> +		rc |=
> > > >>> preschedule_test(RTE_EVENT_DEV_PRESCHEDULE_ADAPTIVE,
> > > >>> +
> > > >>> "RTE_EVENT_DEV_PRESCHEDULE_ADAPTIVE");
> > > >>> +
> > > >>> +	return rc;
> > > >>> +}
> > > >>> +
> > > >>>   static int
> > > >>>   test_eventdev_close(void)
> > > >>>   {
> > > >>> @@ -1310,6 +1371,8 @@ static struct unit_test_suite
> > > >>> eventdev_common_testsuite  = {
> > > >>>   			test_eventdev_start_stop),
> > > >>>   		TEST_CASE_ST(eventdev_configure_setup,
> > > >>> eventdev_stop_device,
> > > >>>   			test_eventdev_profile_switch),
> > > >>> +		TEST_CASE_ST(eventdev_configure_setup, NULL,
> > > >>> +			test_eventdev_preschedule_configure),
> > > >>>   		TEST_CASE_ST(eventdev_setup_device,
> > > >> eventdev_stop_device,
> > > >>>   			test_eventdev_link),
> > > >>>   		TEST_CASE_ST(eventdev_setup_device,
> > > >> eventdev_stop_device,
> > > >>> diff --git a/doc/guides/prog_guide/eventdev/eventdev.rst
> > > >>> b/doc/guides/prog_guide/eventdev/eventdev.rst
> > > >>> index fb6dfce102..341b9bb2c6 100644
> > > >>> --- a/doc/guides/prog_guide/eventdev/eventdev.rst
> > > >>> +++ b/doc/guides/prog_guide/eventdev/eventdev.rst
> > > >>> @@ -357,6 +357,28 @@ Worker path:
> > > >>>          // Process the event received.
> > > >>>      }
> > > >>>
> > > >>> +Event Pre-scheduling
> > > >>> +~~~~~~~~~~~~~~~~~~~~
> > > >>> +
> > > >>> +Event pre-scheduling improves scheduling performance by assigning
> > > >>> +events to event ports in advance when dequeues are issued.
> > > >>> +The `rte_event_dequeue_burst` operation initiates the
> > > >>> +pre-schedule operation, which completes in parallel without
> > > >>> +affecting the dequeued
> > > >> event
> > > >>> flow contexts and dequeue latency.
> > > >>> +On the next dequeue operation, the pre-scheduled events are
> > > >>> +dequeued and pre-schedule is initiated again.
> > > >>> +
> > > >>> +An application can use event pre-scheduling if the event device
> > > >>> +supports it at either device level or at a individual port level.
> > > >>> +The application can check pre-schedule capability by checking if
> > > >>> +``rte_event_dev_info.event_dev_cap``
> > > >>> +has the bit ``RTE_EVENT_DEV_CAP_PRESCHEDULE`` set, if present
> > > >>> +pre-scheduling can be enabled at device configuration time by
> > > >>> +setting
> > > >>> appropriate pre-schedule type in
> ``rte_event_dev_config.preschedule``.
> > > >>> +
> > > >>> +Currently, the following pre-schedule types are supported:
> > > >>> + * ``RTE_EVENT_DEV_PRESCHEDULE_NONE`` - No pre-scheduling.
> > > >>> + * ``RTE_EVENT_DEV_PRESCHEDULE`` - Always issue a pre-schedule
> > > >>> +when
> > > >>> dequeue is issued.
> > > >>> + * ``RTE_EVENT_DEV_PRESCHEDULE_ADAPTIVE`` - Issue pre-schedule
> > > when
> > > >>> dequeue is issued and there are
> > > >>> +   no forward progress constraints.
> > > >>> +
> > > >>>   Starting the EventDev
> > > >>>   ~~~~~~~~~~~~~~~~~~~~~
> > > >>>
> > > >>> diff --git a/lib/eventdev/rte_eventdev.h
> > > >>> b/lib/eventdev/rte_eventdev.h
> > > >> index
> > > >>> 08e5f9320b..5ea7f5a07b 100644
> > > >>> --- a/lib/eventdev/rte_eventdev.h
> > > >>> +++ b/lib/eventdev/rte_eventdev.h
> > > >>> @@ -446,6 +446,30 @@ struct rte_event;
> > > >>>    * @see RTE_SCHED_TYPE_PARALLEL
> > > >>>    */
> > > >>>
> > > >>> +#define RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE (1ULL << 16)
> > /**<
> > > >> Event
> > > >>> +device supports event pre-scheduling.
> > > >>> + *
> > > >>> + * When this capability is available, the application can enable
> > > >>> +event pre-scheduling on the event
> > > >>> + * device to pre-schedule events to a event port when
> > > >>> +`rte_event_dequeue_burst()`
> > > >>> + * is issued.
> > > >>> + * The pre-schedule process starts with the
> > > >>> +`rte_event_dequeue_burst()` call and the
> > > >>> + * pre-scheduled events are returned on the next
> > > >> `rte_event_dequeue_burst()`
> > > >>> call.
> > > >>> + *
> > > >>> + * @see rte_event_dev_configure() */
> > > >>> +
> > > >>> +#define RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE_ADAPTIVE
> (1ULL
> > > <<
> > > >> 17)
> > > >>> /**<
> > > >>> +Event device supports adaptive event pre-scheduling.
> > > >>> + *
> > > >>> + * When this capability is available, the application can enable
> > > >>> +adaptive pre-scheduling
> > > >>> + * on the event device where the events are pre-scheduled when
> > > >>> +there are no forward
> > > >>> + * progress constraints with the currently held flow contexts.
> > > >>> + * The pre-schedule process starts with the
> > > >>> +`rte_event_dequeue_burst()` call and the
> > > >>> + * pre-scheduled events are returned on the next
> > > >> `rte_event_dequeue_burst()`
> > > >>> call.
> > > >>> + *
> > > >>> + * @see rte_event_dev_configure() */
> > > >>> +
> > > >>>   /* Event device priority levels */
> > > >>>   #define RTE_EVENT_DEV_PRIORITY_HIGHEST   0
> > > >>>   /**< Highest priority level for events and queues.
> > > >>> @@ -680,6 +704,25 @@ rte_event_dev_attr_get(uint8_t dev_id,
> > > uint32_t
> > > >>> attr_id,
> > > >>>    *  @see rte_event_dequeue_timeout_ticks(),
> > > rte_event_dequeue_burst()
> > > >>>    */
> > > >>>
> > > >>> +typedef enum {
> > > >>> +	RTE_EVENT_DEV_PRESCHEDULE_NONE = 0,
> > > >>> +	/* Disable pre-schedule across the event device or on a given
> > > >>> +event
> > > >> port.
> > > >>> +	 * @ref rte_event_dev_config.preschedule_type
> > > >>> +	 */
> > > >>> +	RTE_EVENT_DEV_PRESCHEDULE,
> > > >>> +	/* Enable pre-schedule always across the event device or a
> given
> > > >>> +event
> > > >>> port.
> > > >>> +	 * @ref rte_event_dev_config.preschedule_type
> > > >>> +	 * @see RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE
> > > >>> +	 */
> > > >>> +	RTE_EVENT_DEV_PRESCHEDULE_ADAPTIVE,
> > > >>> +	/* Enable adaptive pre-schedule across the event device or a
> > > >>> +given
> > > >> event
> > > >>> port.
> > > >>> +	 * Delay issuing pre-schedule until there are no forward
> > > >>> +progress
> > > >>> constraints with
> > > >>> +	 * the held flow contexts.
> > > >>> +	 * @ref rte_event_dev_config.preschedule_type
> > > >>> +	 * @see
> RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE_ADAPTIVE
> > > >>> +	 */
> > > >>> +} rte_event_dev_preschedule_type_t;
> > > >>> +
> > > >>>   /** Event device configuration structure */  struct
> > rte_event_dev_config {
> > > >>>   	uint32_t dequeue_timeout_ns;
> > > >>> @@ -752,6 +795,11 @@ struct rte_event_dev_config {
> > > >>>   	 * optimized for single-link usage, this field is a hint for how
> many
> > > >>>   	 * to allocate; otherwise, regular event ports and queues will
> be used.
> > > >>>   	 */
> > > >>> +	rte_event_dev_preschedule_type_t preschedule_type;
> > > >>> +	/**< Event pre-schedule type to use across the event device, if
> > > >>> supported.
> > > >>> +	 * @see RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE
> > > >>> +	 * @see
> RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE_ADAPTIVE
> > > >>> +	 */
> > > >>>   };
> > > >>>
> > > >>>   /**
> > > >>> --
> > > >>> 2.25.1
> > > >


Thread overview: 17+ messages
2024-09-10  8:31 [RFC 0/3] Introduce event prefetching pbhagavatula
2024-09-10  8:31 ` [RFC 1/3] eventdev: introduce " pbhagavatula
2024-09-10  8:31 ` [RFC 2/3] eventdev: allow event ports to modified prefetches pbhagavatula
2024-09-10  8:31 ` [RFC 3/3] eventdev: add SW event prefetch hint pbhagavatula
2024-09-10  9:08 ` [RFC 0/3] Introduce event prefetching Mattias Rönnblom
2024-09-10 11:53   ` [EXTERNAL] " Pavan Nikhilesh Bhagavatula
2024-09-17  7:11 ` [PATCH v2 0/3] Introduce event pre-scheduling pbhagavatula
2024-09-17  7:11   ` [PATCH v2 1/3] eventdev: introduce " pbhagavatula
2024-09-18 22:38     ` Pathak, Pravin
2024-09-19 13:13       ` Pavan Nikhilesh Bhagavatula
2024-09-23  8:57         ` Mattias Rönnblom
2024-09-25 10:30           ` [EXTERNAL] " Pavan Nikhilesh Bhagavatula
2024-09-26  2:54             ` Pathak, Pravin
2024-09-26 10:03               ` Pavan Nikhilesh Bhagavatula [this message]
2024-09-27  3:31                 ` Pathak, Pravin
2024-09-17  7:11   ` [PATCH v2 2/3] eventdev: add event port pre-schedule modify pbhagavatula
2024-09-17  7:11   ` [PATCH v2 3/3] eventdev: add SW event preschedule hint pbhagavatula
