* [dpdk-dev] [RFC PATCH] EventDev buffered enqueue API
@ 2016-12-02 19:45 Gage Eads
2016-12-02 19:45 ` [dpdk-dev] [RFC PATCH] eventdev: add buffered enqueue and flush APIs Gage Eads
0 siblings, 1 reply; 8+ messages in thread
From: Gage Eads @ 2016-12-02 19:45 UTC (permalink / raw)
To: jerin.jacob
Cc: dev, bruce.richardson, harry.van.haaren, hemant.agrawal, Gage Eads
This patch extends the currently proposed libeventdev API with support
for buffered enqueue operations.
This functionality can improve application performance when an event
device benefits from bulk enqueue operations but the application's
architecture prevents it from easily combining its enqueue operations
itself.
For instance, consider a network application that dequeues a burst of
events and then processes them individually, where each event may be
processed and enqueued in a different codepath depending on its layer-3
or layer-4 protocol type. To take advantage of burst enqueues with the
current eventdev API, the application would have to be modified with an
ad-hoc buffering solution.
This functionality can be implemented fairly simply in the eventdev itself,
which also reduces the amount of boilerplate code that would otherwise be
repeated across applications of this sort.
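For illustration, a minimal sketch of the intended usage pattern follows. The
worker loop and the is_ipv4()/process_ipv4()/process_ipv6() helpers are
hypothetical and error handling is omitted; the rte_event_dequeue_burst()
signature is assumed from the RFC eventdev API and may differ.

#include <rte_common.h>
#include <rte_eventdev.h>

/* Hypothetical worker loop: per-event processing with batched enqueues. */
static void
worker_loop(uint8_t dev_id, uint8_t port_id, volatile int *done)
{
        struct rte_event evs[32];
        uint16_t i, n;

        while (!*done) {
                n = rte_event_dequeue_burst(dev_id, port_id, evs,
                                            RTE_DIM(evs), 0);

                for (i = 0; i < n; i++) {
                        /* Each event takes its own codepath... */
                        if (is_ipv4(&evs[i]))
                                process_ipv4(&evs[i]);
                        else
                                process_ipv6(&evs[i]);

                        /* ...but the enqueues are transparently batched. */
                        rte_event_enqueue_buffer(dev_id, port_id, &evs[i]);
                }

                /* Push out anything still buffered before dequeuing again. */
                rte_event_enqueue_buffer_flush(dev_id, port_id);
        }
}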
This patch applies on top of Jerin Jacob's eventdev API patchset[1].
[1] http://dpdk.org/ml/archives/dev/2016-November/050355.html
Gage Eads (1):
eventdev: add buffered enqueue and flush APIs
lib/librte_eventdev/rte_eventdev.c | 29 ++++++++++
lib/librte_eventdev/rte_eventdev.h | 106 +++++++++++++++++++++++++++++++++++++
2 files changed, 135 insertions(+)
--
1.9.1
* [dpdk-dev] [RFC PATCH] eventdev: add buffered enqueue and flush APIs
2016-12-02 19:45 [dpdk-dev] [RFC PATCH] EventDev buffered enqueue API Gage Eads
@ 2016-12-02 19:45 ` Gage Eads
2016-12-02 21:18 ` Jerin Jacob
0 siblings, 1 reply; 8+ messages in thread
From: Gage Eads @ 2016-12-02 19:45 UTC (permalink / raw)
To: jerin.jacob
Cc: dev, bruce.richardson, harry.van.haaren, hemant.agrawal, Gage Eads
This commit adds buffered enqueue functionality to the eventdev API.
It is conceptually similar to the ethdev API's tx buffering, however
with a smaller API surface and no dropping of events.
Signed-off-by: Gage Eads <gage.eads@intel.com>
---
lib/librte_eventdev/rte_eventdev.c | 29 ++++++++++
lib/librte_eventdev/rte_eventdev.h | 106 +++++++++++++++++++++++++++++++++++++
2 files changed, 135 insertions(+)
diff --git a/lib/librte_eventdev/rte_eventdev.c b/lib/librte_eventdev/rte_eventdev.c
index 17ce5c3..564573f 100644
--- a/lib/librte_eventdev/rte_eventdev.c
+++ b/lib/librte_eventdev/rte_eventdev.c
@@ -219,6 +219,7 @@
uint16_t *links_map;
uint8_t *ports_dequeue_depth;
uint8_t *ports_enqueue_depth;
+ struct rte_eventdev_enqueue_buffer *port_buffers;
unsigned int i;
EDEV_LOG_DEBUG("Setup %d ports on device %u", nb_ports,
@@ -272,6 +273,19 @@
"nb_ports %u", nb_ports);
return -(ENOMEM);
}
+
+ /* Allocate memory to store port enqueue buffers */
+ dev->data->port_buffers =
+ rte_zmalloc_socket("eventdev->port_buffers",
+ sizeof(dev->data->port_buffers[0]) * nb_ports,
+ RTE_CACHE_LINE_SIZE, dev->data->socket_id);
+ if (dev->data->port_buffers == NULL) {
+ dev->data->nb_ports = 0;
+ EDEV_LOG_ERR("failed to get memory for port enq"
+ " buffers, nb_ports %u", nb_ports);
+ return -(ENOMEM);
+ }
+
} else if (dev->data->ports != NULL && nb_ports != 0) {/* re-config */
RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->port_release, -ENOTSUP);
@@ -279,6 +293,7 @@
ports_dequeue_depth = dev->data->ports_dequeue_depth;
ports_enqueue_depth = dev->data->ports_enqueue_depth;
links_map = dev->data->links_map;
+ port_buffers = dev->data->port_buffers;
for (i = nb_ports; i < old_nb_ports; i++)
(*dev->dev_ops->port_release)(ports[i]);
@@ -324,6 +339,17 @@
return -(ENOMEM);
}
+ /* Realloc memory to store port enqueue buffers */
+ port_buffers = rte_realloc(dev->data->port_buffers,
+ sizeof(dev->data->port_buffers[0]) * nb_ports,
+ RTE_CACHE_LINE_SIZE);
+ if (port_buffers == NULL) {
+ dev->data->nb_ports = 0;
+ EDEV_LOG_ERR("failed to realloc mem for port enq"
+ " buffers, nb_ports %u", nb_ports);
+ return -(ENOMEM);
+ }
+
if (nb_ports > old_nb_ports) {
uint8_t new_ps = nb_ports - old_nb_ports;
@@ -336,12 +362,15 @@
memset(links_map +
(old_nb_ports * RTE_EVENT_MAX_QUEUES_PER_DEV),
0, sizeof(ports_enqueue_depth[0]) * new_ps);
+ memset(port_buffers + old_nb_ports, 0,
+ sizeof(port_buffers[0]) * new_ps);
}
dev->data->ports = ports;
dev->data->ports_dequeue_depth = ports_dequeue_depth;
dev->data->ports_enqueue_depth = ports_enqueue_depth;
dev->data->links_map = links_map;
+ dev->data->port_buffers = port_buffers;
} else if (dev->data->ports != NULL && nb_ports == 0) {
RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->port_release, -ENOTSUP);
diff --git a/lib/librte_eventdev/rte_eventdev.h b/lib/librte_eventdev/rte_eventdev.h
index 778d6dc..3f24342 100644
--- a/lib/librte_eventdev/rte_eventdev.h
+++ b/lib/librte_eventdev/rte_eventdev.h
@@ -246,6 +246,7 @@
#include <rte_dev.h>
#include <rte_memory.h>
#include <rte_errno.h>
+#include <rte_memcpy.h>
#define EVENTDEV_NAME_SKELETON_PMD event_skeleton
/**< Skeleton event device PMD name */
@@ -965,6 +966,26 @@ typedef uint16_t (*event_dequeue_burst_t)(void *port, struct rte_event ev[],
#define RTE_EVENTDEV_NAME_MAX_LEN (64)
/**< @internal Max length of name of event PMD */
+#define RTE_EVENT_BUF_MAX 16
+/**< Maximum number of events in an enqueue buffer. */
+
+/**
+ * @internal
+ * An enqueue buffer for each port.
+ *
+ * The reason this struct is in the header is for inlining the function calls
+ * to enqueue, as doing a function call per packet would incur significant
+ * performance overhead.
+ *
+ * \see rte_event_enqueue_buffer(), rte_event_enqueue_buffer_flush()
+ */
+struct rte_eventdev_enqueue_buffer {
+ /** Count of events in this buffer */
+ uint16_t count;
+ /** Array of events in this buffer */
+ struct rte_event events[RTE_EVENT_BUF_MAX];
+} __rte_cache_aligned;
+
/**
* @internal
* The data part, with no function pointers, associated with each device.
@@ -983,6 +1004,8 @@ struct rte_eventdev_data {
/**< Number of event ports. */
void **ports;
/**< Array of pointers to ports. */
+ struct rte_eventdev_enqueue_buffer *port_buffers;
+ /**< Array of port enqueue buffers. */
uint8_t *ports_dequeue_depth;
/**< Array of port dequeue depth. */
uint8_t *ports_enqueue_depth;
@@ -1132,6 +1155,89 @@ struct rte_eventdev {
}
/**
+ * Flush the enqueue buffer of the event port specified by *port_id*, in the
+ * event device specified by *dev_id*.
+ *
+ * This function attempts to flush as many of the buffered events as possible,
+ * and returns the number of flushed events. Any unflushed events remain in
+ * the buffer.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param port_id
+ * The identifier of the event port.
+ *
+ * @return
+ * The number of event objects actually flushed to the event device.
+ *
+ * \see rte_event_enqueue_buffer(), rte_event_enqueue_burst()
+ * \see rte_event_port_enqueue_depth()
+ */
+static inline int
+rte_event_enqueue_buffer_flush(uint8_t dev_id, uint8_t port_id)
+{
+ struct rte_eventdev *dev = &rte_eventdevs[dev_id];
+ struct rte_eventdev_enqueue_buffer *buf =
+ &dev->data->port_buffers[port_id];
+ int n;
+
+ n = rte_event_enqueue_burst(dev_id, port_id, buf->events, buf->count);
+
+ if (n != buf->count)
+ memmove(buf->events, &buf->events[n], (buf->count - n) * sizeof(buf->events[0]));
+
+ buf->count -= n;
+
+ return n;
+}
+
+/**
+ * Buffer an event object supplied in *rte_event* structure for future
+ * enqueueing on an event device designated by its *dev_id* through the event
+ * port specified by *port_id*.
+ *
+ * This function takes a single event and buffers it for later enqueuing to the
+ * queue specified in the event structure. If the buffer is full, the
+ * function will attempt to flush the buffer before buffering the event.
+ * If the flush operation fails, the previously buffered events remain in the
+ * buffer and an error is returned to the user to indicate that *ev* was not
+ * buffered.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param port_id
+ * The identifier of the event port.
+ * @param ev
+ * Pointer to struct rte_event
+ *
+ * @return
+ * - 0 on success
+ * - <0 on failure. Failure can occur if the event port's output queue is
+ * backpressured, for instance.
+ *
+ * \see rte_event_enqueue_buffer_flush(), rte_event_enqueue_burst()
+ * \see rte_event_port_enqueue_depth()
+ */
+static inline int
+rte_event_enqueue_buffer(uint8_t dev_id, uint8_t port_id, struct rte_event *ev)
+{
+ struct rte_eventdev *dev = &rte_eventdevs[dev_id];
+ struct rte_eventdev_enqueue_buffer *buf =
+ &dev->data->port_buffers[port_id];
+ int ret;
+
+ /* If necessary, flush the enqueue buffer to make space for ev. */
+ if (buf->count == RTE_EVENT_BUF_MAX) {
+ ret = rte_event_enqueue_buffer_flush(dev_id, port_id);
+ if (ret == 0)
+ return -ENOSPC;
+ }
+
+ rte_memcpy(&buf->events[buf->count++], ev, sizeof(struct rte_event));
+ return 0;
+}
+
+/**
* Converts nanoseconds to *wait* value for rte_event_dequeue()
*
* If the device is configured with RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT flag then
--
1.9.1
* Re: [dpdk-dev] [RFC PATCH] eventdev: add buffered enqueue and flush APIs
2016-12-02 19:45 ` [dpdk-dev] [RFC PATCH] eventdev: add buffered enqueue and flush APIs Gage Eads
@ 2016-12-02 21:18 ` Jerin Jacob
2016-12-05 23:30 ` Eads, Gage
0 siblings, 1 reply; 8+ messages in thread
From: Jerin Jacob @ 2016-12-02 21:18 UTC (permalink / raw)
To: Gage Eads; +Cc: dev, bruce.richardson, harry.van.haaren, hemant.agrawal
On Fri, Dec 02, 2016 at 01:45:56PM -0600, Gage Eads wrote:
> This commit adds buffered enqueue functionality to the eventdev API.
> It is conceptually similar to the ethdev API's tx buffering, however
> with a smaller API surface and no dropping of events.
Hello Gage,
Different implementations may have different strategies for holding the buffers,
and some do not need to hold buffers at all if they are DDR backed.
IMHO, this may not be a candidate for common code. I guess you can move this
to the driver side and abstract it under the SW driver's enqueue_burst.
* Re: [dpdk-dev] [RFC PATCH] eventdev: add buffered enqueue and flush APIs
2016-12-02 21:18 ` Jerin Jacob
@ 2016-12-05 23:30 ` Eads, Gage
2016-12-08 4:41 ` Jerin Jacob
0 siblings, 1 reply; 8+ messages in thread
From: Eads, Gage @ 2016-12-05 23:30 UTC (permalink / raw)
To: Jerin Jacob; +Cc: dev, Richardson, Bruce, Van Haaren, Harry, hemant.agrawal
> On Dec 3, 2016, at 5:18 AM, Jerin Jacob <jerin.jacob@caviumnetworks.com> wrote:
>
>> On Fri, Dec 02, 2016 at 01:45:56PM -0600, Gage Eads wrote:
>> This commit adds buffered enqueue functionality to the eventdev API.
>> It is conceptually similar to the ethdev API's tx buffering, however
>> with a smaller API surface and no dropping of events.
>
> Hello Gage,
> Different implementation may have different strategies to hold the buffers.
A benefit of inlining the buffering logic in the header is that we avoid the overhead of entering the PMD for what is a fairly simple operation (common case: add event to an array, increment counter). If we make this implementation-defined (i.e. use PMD callbacks), we lose that benefit.
> and some does not need to hold the buffers if it is DDR backed.
Though DDR-backed hardware doesn't need to buffer in software, doing so would reduce the software overhead of enqueueing. Compared to N individual calls to enqueue, buffering N events then calling enqueue burst once can benefit from amortized (or parallelized) PMD-specific bookkeeping and error-checking across the set of events, and will definitely benefit from the amortized function call overhead and better I-cache behavior. (Essentially this is VPP from the fd.io project.) This should result in higher overall event throughput (agnostic of the underlying device).
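As a rough illustration of the amortization in question (a hypothetical fragment that ignores return values and partial enqueues, with evs/n/dev_id/port_id assumed in scope):

        uint16_t i;

        /* N separate calls: function call overhead and PMD bookkeeping
         * are paid once per event.
         */
        for (i = 0; i < n; i++)
                rte_event_enqueue_burst(dev_id, port_id, &evs[i], 1);

        /* versus one burst call: the same bookkeeping amortized over n events. */
        rte_event_enqueue_burst(dev_id, port_id, evs, n);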
I'm skeptical that other buffering strategies would emerge, but I can only speculate on Cavium/NXP/etc. NPU software.
> IHMO, This may not be the candidate for common code. I guess you can move this
> to driver side and abstract under SW driver's enqueue_burst.
>
I don't think that will work without adding a flush API; otherwise we could have indefinitely buffered events. I see three ways forward:
- The proposed approach
- Add the proposed functions but make them implementation-specific.
- Require the application to write its own buffering logic (i.e. no API change)
Thanks,
Gage
* Re: [dpdk-dev] [RFC PATCH] eventdev: add buffered enqueue and flush APIs
2016-12-05 23:30 ` Eads, Gage
@ 2016-12-08 4:41 ` Jerin Jacob
2016-12-12 17:56 ` Eads, Gage
0 siblings, 1 reply; 8+ messages in thread
From: Jerin Jacob @ 2016-12-08 4:41 UTC (permalink / raw)
To: Eads, Gage; +Cc: dev, Richardson, Bruce, Van Haaren, Harry, hemant.agrawal
On Mon, Dec 05, 2016 at 11:30:46PM +0000, Eads, Gage wrote:
>
> > On Dec 3, 2016, at 5:18 AM, Jerin Jacob <jerin.jacob@caviumnetworks.com> wrote:
> >
> >> On Fri, Dec 02, 2016 at 01:45:56PM -0600, Gage Eads wrote:
> >> This commit adds buffered enqueue functionality to the eventdev API.
> >> It is conceptually similar to the ethdev API's tx buffering, however
> >> with a smaller API surface and no dropping of events.
> >
> > Hello Gage,
> > Different implementation may have different strategies to hold the buffers.
>
> A benefit of inlining the buffering logic in the header is that we avoid the overhead of entering the PMD for what is a fairly simple operation (common case: add event to an array, increment counter). If we make this implementation-defined (i.e. use PMD callbacks), we lose that benefit.
In general, I agree from the system perspective. But there are a few general
issues with the eventdev integration part:
1) What if the burst has ATOMIC flows and we are NOT enqueuing to the
implementation? Then other event ports won't get the packets from the same ATOMIC
tag. Bad, right?
2) At least in our HW implementation, the event buffer strategy is more like:
you only get further events from dequeue if you have enqueued to HW, provided
op == RTE_EVENT_OP_FORWARD. So it will create a deadlock, i.e. the application
cannot hold events with RTE_EVENT_OP_FORWARD.
3) So, considering the above case, there is nothing like a flush for us.
4) In a real high-throughput benchmark case, we will get packets at
the rate of the max burst, and then we always need to memcpy before we flush.
Otherwise there will be an ordering issue, as a burst can give us packets from
different flows (unlike polling mode).
>
> > and some does not need to hold the buffers if it is DDR backed.
>
> Though DDR-backed hardware doesn't need to buffer in software, doing so would reduce the software overhead of enqueueing. Compared to N individual calls to enqueue, buffering N events then calling enqueue burst once can benefit from amortized (or parallelized) PMD-specific bookkeeping and error-checking across the set of events, and will definitely benefit from the amortized function call overhead and better I-cache behavior. (Essentially this is VPP from the fd.io project.) This should result in higher overall event throughout (agnostic of the underlying device).
See above. I am not against burst processing in the "application".
The flush does not make sense for us from the HW perspective, and
it is costly for us if we try to generalize it.
>
> I'm skeptical that other buffering strategies would emerge, but I can only speculate on Cavium/NXP/etc. NPU software.
>
> > IHMO, This may not be the candidate for common code. I guess you can move this
> > to driver side and abstract under SW driver's enqueue_burst.
> >
>
> I don't think that will work without adding a flush API, otherwise we could have indefinitely buffered events. I see three ways forward:
I agree. The more portable way is to move the "flush" into the implementation
and flush whenever it makes sense to the PMD.
>
> - The proposed approach
> - Add the proposed functions but make them implementation-specific.
> - Require the application to write its own buffering logic (i.e. no API change)
I think, if the additional function call overhead cost is too much for the SW
implementation, then we can think of an implementation-specific API or a custom
application flow based on the SW driver.
But I am not a fan of that (though tempted to nowadays). If we take that route,
we will have a truckload of custom implementation-specific APIs, and then we are
trying to hide all the black magic under enqueue/dequeue to make it portable at
some expense.
>
> Thanks,
> Gage
* Re: [dpdk-dev] [RFC PATCH] eventdev: add buffered enqueue and flush APIs
2016-12-08 4:41 ` Jerin Jacob
@ 2016-12-12 17:56 ` Eads, Gage
2016-12-14 7:44 ` Jerin Jacob
2016-12-14 7:52 ` Jerin Jacob
0 siblings, 2 replies; 8+ messages in thread
From: Eads, Gage @ 2016-12-12 17:56 UTC (permalink / raw)
To: Jerin Jacob; +Cc: dev, Richardson, Bruce, Van Haaren, Harry, hemant.agrawal
> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> Sent: Wednesday, December 7, 2016 10:42 PM
> To: Eads, Gage <gage.eads@intel.com>
> Cc: dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>; Van
> Haaren, Harry <harry.van.haaren@intel.com>; hemant.agrawal@nxp.com
> Subject: Re: [RFC PATCH] eventdev: add buffered enqueue and flush APIs
>
> On Mon, Dec 05, 2016 at 11:30:46PM +0000, Eads, Gage wrote:
> >
> > > On Dec 3, 2016, at 5:18 AM, Jerin Jacob
> <jerin.jacob@caviumnetworks.com> wrote:
> > >
> > >> On Fri, Dec 02, 2016 at 01:45:56PM -0600, Gage Eads wrote:
> > >> This commit adds buffered enqueue functionality to the eventdev API.
> > >> It is conceptually similar to the ethdev API's tx buffering,
> > >> however with a smaller API surface and no dropping of events.
> > >
> > > Hello Gage,
> > > Different implementation may have different strategies to hold the buffers.
> >
> > A benefit of inlining the buffering logic in the header is that we avoid the
> overhead of entering the PMD for what is a fairly simple operation (common
> case: add event to an array, increment counter). If we make this
> implementation-defined (i.e. use PMD callbacks), we lose that benefit.
> In general, I agree from the system perspective. But, few general issues with
> eventdev integration part,
>
> 1) What if the burst has ATOMIC flows and if we are NOT en-queuing to the
> implementation then other event ports won't get the packets from the same
> ATOMIC tag ? BAD. Right?
I'm not sure what scenario you're describing here. The buffered (as implemented in my patch) and non-buffered enqueue operations are functionally the same (as long as the buffer is flushed); the difference lies in when the events are moved from the application level to the PMD.
> 2) At least, In our HW implementation, The event buffer strategy is more like, if
> you enqueue to HW then ONLY you get the events from dequeue provided if op
> == RTE_EVENT_OP_FORWARD.So it will create deadlock.i.e application cannot
> hold the events with RTE_EVENT_OP_FORWARD
If I'm reading this correctly, you're concerned that buffered events can result in deadlock if they're not flushed. Whether the buffering is done in the app itself, inline in the API, or in the PMDs, not flushing the buffer is an application bug. E.g. the app could be fixed by flushing its enqueue buffer after processing every dequeued burst of events, or only when dequeue returns 0 events.
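To sketch the second option, a hypothetical loop that flushes only when dequeue returns no events (process() stands in for the application's event handling; dev_id/port_id assumed in scope, and the dequeue signature follows the RFC eventdev API):

        for (;;) {
                struct rte_event evs[32];
                uint16_t i, n;

                n = rte_event_dequeue_burst(dev_id, port_id, evs,
                                            RTE_DIM(evs), 0);
                if (n == 0) {
                        /* No new work: don't leave events buffered indefinitely. */
                        rte_event_enqueue_buffer_flush(dev_id, port_id);
                        continue;
                }

                for (i = 0; i < n; i++) {
                        process(&evs[i]);
                        rte_event_enqueue_buffer(dev_id, port_id, &evs[i]);
                }
        }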
> 3) So considering the above case there is nothing like flush for us
> 4) In real high throughput benchmark case, we will get the packets at the rate
> of max burst and then we always needs to memcpy before we flush.
> Otherwise there will be ordering issue as burst can get us the packet from
> different flows(unlike polling mode)
I take it you're referring to the memcpy in the patch, and not an additional memcpy? At any rate, I'm hoping that SIMD instructions can optimize the 16B event copy.
>
> >
> > > and some does not need to hold the buffers if it is DDR backed.
> >
> > Though DDR-backed hardware doesn't need to buffer in software, doing so
> would reduce the software overhead of enqueueing. Compared to N individual
> calls to enqueue, buffering N events then calling enqueue burst once can
> benefit from amortized (or parallelized) PMD-specific bookkeeping and error-
> checking across the set of events, and will definitely benefit from the amortized
> function call overhead and better I-cache behavior. (Essentially this is VPP from
> the fd.io project.) This should result in higher overall event throughout
> (agnostic of the underlying device).
>
> See above. I am not against burst processing in "application".
> The flush does not make sense for us in HW perspective and it is costly for us if
> we trying generalize it.
>
Besides the data copy that buffering requires, are there additional costs from your perspective?
> >
> > I'm skeptical that other buffering strategies would emerge, but I can only
> speculate on Cavium/NXP/etc. NPU software.
> i>
> > > IHMO, This may not be the candidate for common code. I guess you can
> > > move this to driver side and abstract under SW driver's enqueue_burst.
> > >
> >
> > I don't think that will work without adding a flush API, otherwise we could
> have indefinitely buffered events. I see three ways forward:
>
> I agree. More portable way is to move the "flush" to the implementation and
> "flush"
> whenever it makes sense to PMD.
>
> >
> > - The proposed approach
> > - Add the proposed functions but make them implementation-specific.
> > - Require the application to write its own buffering logic (i.e. no
> > API change)
>
> I think, If the additional function call overhead cost is too much for SW
> implementation then we can think of implementation-specific API or custom
> application flow based on SW driver.
>
> But I am not fan of that(but tempted do now a days), If we take that route, we
> have truckload of custom implementation specific API and now we try to hide
> all black magic under enqueue/dequeue to make it portable at some expense.
Agreed, it's not worth special-casing the API with this relatively minor addition.
Thanks,
Gage
* Re: [dpdk-dev] [RFC PATCH] eventdev: add buffered enqueue and flush APIs
2016-12-12 17:56 ` Eads, Gage
@ 2016-12-14 7:44 ` Jerin Jacob
2016-12-14 7:52 ` Jerin Jacob
1 sibling, 0 replies; 8+ messages in thread
From: Jerin Jacob @ 2016-12-14 7:44 UTC (permalink / raw)
To: Eads, Gage; +Cc: dev, Richardson, Bruce, Van Haaren, Harry, hemant.agrawal
On Mon, Dec 12, 2016 at 05:56:32PM +0000, Eads, Gage wrote:
>
>
> > -----Original Message-----
> > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > Sent: Wednesday, December 7, 2016 10:42 PM
> > To: Eads, Gage <gage.eads@intel.com>
> > 1) What if the burst has ATOMIC flows and if we are NOT en-queuing to the
> > implementation then other event ports won't get the packets from the same
> > ATOMIC tag ? BAD. Right?
>
> I'm not sure what scenario you're describing here. The buffered (as implemented in my patch) and non-buffered enqueue operations are functionally the same (as long as the buffer is flushed), the difference lies in when the events are moved from the application level to the PMD.
OK. I will try to explain with a timeline.
Assume:
flush size: 16
burst size: 4
At t0: dequeued 4 events (3 ordered and 1 atomic)
At t1: after processing the events, stored them in the local event buffer
At t2: requested to dequeue 4 more events
Now, since the scheduler scheduled an atomic event to the port at t0, it
cannot schedule an atomic event of the same TAG to _any port_, because the
in-flight entry for a given atomic TAG is always one, to enable
critical-section processing in the packet flow.
>
> > 2) At least, In our HW implementation, The event buffer strategy is more like, if
> > you enqueue to HW then ONLY you get the events from dequeue provided if op
> > == RTE_EVENT_OP_FORWARD.So it will create deadlock.i.e application cannot
> > hold the events with RTE_EVENT_OP_FORWARD
>
> If I'm reading this correctly, you're concerned that buffered events can result in deadlock if they're not flushed. Whether the buffering is done in the app itself, inline in the API, or in the PMDs, not flushing the buffer is an application bug. E.g. the app could be fixed by flushing its enqueue buffer after processing every burst dequeued event set, or only if dequeue returns 0 events.
No. At least in our HW implementation, the hardware maintains the state of events
scheduled to a port. The driver gets the next set of events ONLY if it submits
the events it already got on the first dequeue, i.e. the application cannot
hold events with RTE_EVENT_OP_FORWARD.
>
> > 3) So considering the above case there is nothing like flush for us
> > 4) In real high throughput benchmark case, we will get the packets at the rate
> > of max burst and then we always needs to memcpy before we flush.
> > Otherwise there will be ordering issue as burst can get us the packet from
> > different flows(unlike polling mode)
>
> I take it you're referring to the memcpy in the patch, and not an additional memcpy? At any rate, I'm hoping that SIMD instructions can optimize the 16B event copy.
Hmm. The point was we need to memcpy all the time to maintain the order.
>
> >
> > >
> > > > and some does not need to hold the buffers if it is DDR backed.
> > >
> >
> > See above. I am not against burst processing in "application".
> > The flush does not make sense for us in HW perspective and it is costly for us if
> > we trying generalize it.
> >
>
> Besides the data copy that buffering requires, are there additional costs from your perspective?
It won't even work in our case, as the HW maintains context on the dequeued
events.
I suggest checking the function call overhead. If it turns out
to have an impact on performance, then we can split the flow based on a
capability flag, but I recommend that as a last option.
>
> > >
> > > I'm skeptical that other buffering strategies would emerge, but I can only
> > speculate on Cavium/NXP/etc. NPU software.
> > i>
> > > > IHMO, This may not be the candidate for common code. I guess you can
> > > > move this to driver side and abstract under SW driver's enqueue_burst.
> > > >
> > >
> > > I don't think that will work without adding a flush API, otherwise we could
> > have indefinitely buffered events. I see three ways forward:
> >
> > I agree. More portable way is to move the "flush" to the implementation and
> > "flush"
> > whenever it makes sense to PMD.
> >
> > >
> > > - The proposed approach
> > > - Add the proposed functions but make them implementation-specific.
> > > - Require the application to write its own buffering logic (i.e. no
> > > API change)
> >
> > I think, If the additional function call overhead cost is too much for SW
> > implementation then we can think of implementation-specific API or custom
> > application flow based on SW driver.
> >
> > But I am not fan of that(but tempted do now a days), If we take that route, we
> > have truckload of custom implementation specific API and now we try to hide
> > all black magic under enqueue/dequeue to make it portable at some expense.
>
> Agreed, it's not worth special-casing the API with this relatively minor addition.
>
> Thanks,
> Gage
* Re: [dpdk-dev] [RFC PATCH] eventdev: add buffered enqueue and flush APIs
2016-12-12 17:56 ` Eads, Gage
2016-12-14 7:44 ` Jerin Jacob
@ 2016-12-14 7:52 ` Jerin Jacob
1 sibling, 0 replies; 8+ messages in thread
From: Jerin Jacob @ 2016-12-14 7:52 UTC (permalink / raw)
To: Eads, Gage; +Cc: dev, Richardson, Bruce, Van Haaren, Harry, hemant.agrawal
On Mon, Dec 12, 2016 at 05:56:32PM +0000, Eads, Gage wrote:
>
>
> > -----Original Message-----
> > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > Sent: Wednesday, December 7, 2016 10:42 PM
> > To: Eads, Gage <gage.eads@intel.com>
> > Cc: dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>; Van
> > Haaren, Harry <harry.van.haaren@intel.com>; hemant.agrawal@nxp.com
> > Subject: Re: [RFC PATCH] eventdev: add buffered enqueue and flush APIs
> >
> > On Mon, Dec 05, 2016 at 11:30:46PM +0000, Eads, Gage wrote:
> > >
> > > > On Dec 3, 2016, at 5:18 AM, Jerin Jacob
> > <jerin.jacob@caviumnetworks.com> wrote:
> > > >
> > > >> On Fri, Dec 02, 2016 at 01:45:56PM -0600, Gage Eads wrote:
> > > >> This commit adds buffered enqueue functionality to the eventdev API.
> > > >> It is conceptually similar to the ethdev API's tx buffering,
> > > >> however with a smaller API surface and no dropping of events.
> > > >
> > > > Hello Gage,
> > > > Different implementation may have different strategies to hold the buffers.
> > >
> > > A benefit of inlining the buffering logic in the header is that we avoid the
> > overhead of entering the PMD for what is a fairly simple operation (common
> > case: add event to an array, increment counter). If we make this
> > implementation-defined (i.e. use PMD callbacks), we lose that benefit.
> > In general, I agree from the system perspective. But, few general issues with
> > eventdev integration part,
> >
> > 1) What if the burst has ATOMIC flows and if we are NOT en-queuing to the
> > implementation then other event ports won't get the packets from the same
> > ATOMIC tag ? BAD. Right?
>
> I'm not sure what scenario you're describing here. The buffered (as implemented in my patch) and non-buffered enqueue operations are functionally the same (as long as the buffer is flushed), the difference lies in when the events are moved from the application level to the PMD.
Another point: we treat dequeue as an _implicit_ event release of the
previously dequeued events, so the proposed buffering scheme breaks that logic.
Thread overview (8 messages):
2016-12-02 19:45 [dpdk-dev] [RFC PATCH] EventDev buffered enqueue API Gage Eads
2016-12-02 19:45 ` [dpdk-dev] [RFC PATCH] eventdev: add buffered enqueue and flush APIs Gage Eads
2016-12-02 21:18 ` Jerin Jacob
2016-12-05 23:30 ` Eads, Gage
2016-12-08 4:41 ` Jerin Jacob
2016-12-12 17:56 ` Eads, Gage
2016-12-14 7:44 ` Jerin Jacob
2016-12-14 7:52 ` Jerin Jacob