* [dpdk-dev] [PATCH 0/3] New API to free consumed buffers in TX ring
@ 2016-12-16 12:48 Billy McFall
  2016-12-16 12:48 ` [dpdk-dev] [PATCH 1/3] ethdev: " Billy McFall
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Billy McFall @ 2016-12-16 12:48 UTC
  To: thomas.monjalon, wenzhuo.lu; +Cc: dev, Billy McFall

Based on a request from Damjan Marion and seconded by Keith Wiles (see
the dpdk-dev mailing list from 11/21/2016), add a new API to free
consumed buffers on the TX ring. This addresses two scenarios:
1) An application is flooding a packet and wants to reuse the existing
   mbuf to avoid a packet copy. It increments the reference count of
   the packet and polls the new API until the reference count is
   decremented.
2) An application runs out of mbufs, so it calls the API to free
   consumed packets so that processing can continue.

The API returns the number of packets freed (0-n), or an error code if
the feature is not supported (-ENOTSUP) or the input is invalid
(-ENODEV).

The API has been implemented for the e1000 igb driver and the vHost
driver. Other drivers can be implemented over time. Some drivers
implement a TX done flush routine that should be reused where possible.
The e1000 igb driver and the vHost driver do not have such functions.

Billy McFall (3):
  ethdev: New API to free consumed buffers in TX ring
  driver: e1000 igb support to free consumed buffers
  driver: vHost support to free consumed buffers

 drivers/net/e1000/e1000_ethdev.h  |   2 +
 drivers/net/e1000/igb_ethdev.c    |   1 +
 drivers/net/e1000/igb_rxtx.c      | 126 ++++++++++++++++++++++++++++++++++++++
 drivers/net/vhost/rte_eth_vhost.c |  11 ++++
 lib/librte_ether/rte_ethdev.h     |  56 +++++++++++++++++
 5 files changed, 196 insertions(+)

-- 
2.9.3

* [dpdk-dev] [PATCH 1/3] ethdev: New API to free consumed buffers in TX ring
  2016-12-16 12:48 [dpdk-dev] [PATCH 0/3] New API to free consumed buffers in TX ring Billy McFall
@ 2016-12-16 12:48 ` Billy McFall
  2016-12-16 16:28   ` Stephen Hemminger
  2016-12-20 11:27   ` Adrien Mazarguil
  2016-12-16 12:48 ` [dpdk-dev] [PATCH 2/3] driver: e1000 igb support to free consumed buffers Billy McFall
  2016-12-16 12:48 ` [dpdk-dev] [PATCH 3/3] driver: vHost " Billy McFall
  2 siblings, 2 replies; 12+ messages in thread
From: Billy McFall @ 2016-12-16 12:48 UTC
  To: thomas.monjalon, wenzhuo.lu; +Cc: dev, Billy McFall

Add a new API to force free consumed buffers on TX ring. API will return
the number of packets freed (0-n) or error code if feature not supported
(-ENOTSUP) or input invalid (-ENODEV).

Because rte_eth_tx_buffer() may be used, and mbufs may still be held
in local buffer, the API also accepts *buffer and *sent. Before
attempting to free, rte_eth_tx_buffer_flush() is called to make sure
all mbufs are sent to Tx ring. rte_eth_tx_buffer_flush() is called even
if threshold is not met.

Signed-off-by: Billy McFall <bmcfall@redhat.com>
---
 lib/librte_ether/rte_ethdev.h | 56 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 9678179..e3f2be4 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1150,6 +1150,9 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev,
 typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset);
 /**< @internal Check DD bit of specific RX descriptor */

+typedef int (*eth_tx_done_cleanup_t)(void *txq, uint32_t free_cnt);
+/**< @internal Force mbufs to be freed from TX ring. */
+
 typedef void (*eth_rxq_info_get_t)(struct rte_eth_dev *dev,
 	uint16_t rx_queue_id, struct rte_eth_rxq_info *qinfo);

@@ -1467,6 +1470,7 @@ struct eth_dev_ops {
 	eth_rx_disable_intr_t rx_queue_intr_disable;
 	eth_tx_queue_setup_t tx_queue_setup;/**< Set up device TX queue.*/
 	eth_queue_release_t tx_queue_release;/**< Release TX queue.*/
+	eth_tx_done_cleanup_t tx_done_cleanup;/**< Free tx ring mbufs */
 	eth_dev_led_on_t dev_led_on; /**< Turn on LED. */
 	eth_dev_led_off_t dev_led_off; /**< Turn off LED. */
 	flow_ctrl_get_t flow_ctrl_get; /**< Get flow control. */
@@ -2943,6 +2947,58 @@ rte_eth_tx_buffer(uint8_t port_id, uint16_t queue_id,
 }

 /**
+ * Request the driver to free mbufs currently cached by the driver. The
+ * driver will only free the mbuf if it is no longer in use.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The index of the transmit queue through which output packets must be
+ *   sent.
+ *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @param free_cnt
+ *   Maximum number of packets to free. Use 0 to indicate all possible packets
+ *   should be freed. Note that a packet may be using multiple mbufs.
+ * @param buffer
+ *   Buffer used to collect packets to be sent. If provided, the buffer will
+ *   be flushed, even if the current length is less than buffer->size. Pass NULL
+ *   if buffer has already been flushed.
+ * @param sent
+ *   Pointer to return number of packets sent if buffer has packets to be sent.
+ *   If *buffer is supplied, *sent must also be supplied.
+ * @return
+ *   Failure: < 0
+ *     -ENODEV: Invalid interface
+ *     -ENOTSUP: Driver does not support function
+ *   Success: >= 0
+ *     0-n: Number of packets freed. More packets may still remain in ring that
+ *     are in use.
+ */
+
+static inline int
+rte_eth_tx_done_cleanup(uint8_t port_id, uint16_t queue_id, uint32_t free_cnt,
+		struct rte_eth_dev_tx_buffer *buffer, uint16_t *sent)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+
+	/* Validate Input Data. Bail if not valid or not supported. */
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_done_cleanup, -ENOTSUP);
+
+	/*
+	 * If transmit buffer is provided and there are still packets to be
+	 * sent, then send them before attempting to free pending mbufs.
+	 */
+	if (buffer && sent)
+		*sent = rte_eth_tx_buffer_flush(port_id, queue_id, buffer);
+
+	/* Call driver to free pending mbufs. */
+	return (*dev->dev_ops->tx_done_cleanup)(dev->data->tx_queues[queue_id],
+			free_cnt);
+}
+
+/**
  * Configure a callback for buffered packets which cannot be sent
  *
  * Register a specific callback to be called when an attempt is made to send
-- 
2.9.3

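The driver-side contract described in the patch (free at most free_cnt completed packets, treat 0 as "all", never touch packets still in use, and report -ENODEV or -ENOTSUP from the wrapper) can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: `toy_txq`, `toy_tx_done_cleanup()` and `toy_cleanup_wrapper()` are hypothetical names invented here, nothing in it is DPDK code, and a real PMD would walk hardware descriptors instead of boolean flags.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_SIZE 8

/* Hypothetical stand-in for a driver TX queue; not a DPDK structure. */
struct toy_txq {
	bool in_ring[TOY_RING_SIZE]; /* slot still holds a packet */
	bool done[TOY_RING_SIZE];    /* hardware has finished with the slot */
	uint16_t tail;               /* oldest slot not yet cleaned */
};

/* Same shape as eth_tx_done_cleanup_t in the patch. */
typedef int (*toy_tx_done_cleanup_t)(void *txq, uint32_t free_cnt);

/* Free completed packets from the tail of the ring, oldest first. */
static int
toy_tx_done_cleanup(void *q, uint32_t free_cnt)
{
	struct toy_txq *txq = q;
	int freed = 0;

	/* free_cnt == 0 means "free everything that is done". */
	while (free_cnt == 0 || (uint32_t)freed < free_cnt) {
		uint16_t i = txq->tail;

		/* Empty slot or packet still in use: stop, never force. */
		if (!txq->in_ring[i] || !txq->done[i])
			break;
		txq->in_ring[i] = false;
		txq->done[i] = false;
		txq->tail = (uint16_t)((i + 1) % TOY_RING_SIZE);
		freed++;
	}
	return freed;
}

/* Mirrors the error handling of the proposed ethdev wrapper. */
static int
toy_cleanup_wrapper(int port_valid, toy_tx_done_cleanup_t op, void *txq,
		uint32_t free_cnt)
{
	if (!port_valid)
		return -ENODEV;  /* invalid interface */
	if (op == NULL)
		return -ENOTSUP; /* driver does not implement the callback */
	return op(txq, free_cnt);
}
```

With five packets queued and only the three oldest completed, a call with free_cnt = 2 frees two packets, a following call with free_cnt = 0 frees the remaining completed one, and the two packets still in use are left in the ring.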
* Re: [dpdk-dev] [PATCH 1/3] ethdev: New API to free consumed buffers in TX ring
  2016-12-16 12:48 ` [dpdk-dev] [PATCH 1/3] ethdev: " Billy McFall
@ 2016-12-16 16:28   ` Stephen Hemminger
  2016-12-20 11:27   ` Adrien Mazarguil
  1 sibling, 0 replies; 12+ messages in thread
From: Stephen Hemminger @ 2016-12-16 16:28 UTC
  To: Billy McFall; +Cc: thomas.monjalon, wenzhuo.lu, dev

On Fri, 16 Dec 2016 07:48:49 -0500
Billy McFall <bmcfall@redhat.com> wrote:

>  /**
> + * Request the driver to free mbufs currently cached by the driver. The
> + * driver will only free the mbuf if it is no longer in use.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param queue_id
> + *   The index of the transmit queue through which output packets must be
> + *   sent.
> + *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
> + *   to rte_eth_dev_configure().
> + * @param free_cnt
> + *   Maximum number of packets to free. Use 0 to indicate all possible packets
> + *   should be freed. Note that a packet may be using multiple mbufs.
> + * @param buffer
> + *   Buffer used to collect packets to be sent. If provided, the buffer will
> + *   be flushed, even if the current length is less than buffer->size. Pass NULL
> + *   if buffer has already been flushed.
> + * @param sent
> + *   Pointer to return number of packets sent if buffer has packets to be sent.
> + *   If *buffer is supplied, *sent must also be supplied.
> + * @return
> + *   Failure: < 0
> + *     -ENODEV: Invalid interface
> + *     -ENOTSUP: Driver does not support function
> + *   Success: >= 0
> + *     0-n: Number of packets freed. More packets may still remain in ring that
> + *     are in use.
> + */
> +
> +static inline int
> +rte_eth_tx_done_cleanup(uint8_t port_id, uint16_t queue_id, uint32_t free_cnt,
> +		struct rte_eth_dev_tx_buffer *buffer, uint16_t *sent)

This API is more complex than it needs to be. For the typical use case
of OOM kind of cleanup, this is overkill.

There is no need for:
  free_cnt     - device driver should just free all
  buffer/param - the application should not care.

The DPDK model is that once mbufs are passed to the device, the device
"owns" the mbuf. I think changing that model is just going to break
things for no gain.

It does make sense to have a "please cleanup your mbufs" call.
If an application is using special mbufs then it can use the normal
callback-on-done model.

* Re: [dpdk-dev] [PATCH 1/3] ethdev: New API to free consumed buffers in TX ring
  2016-12-16 12:48 ` [dpdk-dev] [PATCH 1/3] ethdev: " Billy McFall
  2016-12-16 16:28   ` Stephen Hemminger
@ 2016-12-20 11:27   ` Adrien Mazarguil
  2016-12-20 12:17     ` Ananyev, Konstantin
  1 sibling, 1 reply; 12+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 11:27 UTC
  To: Billy McFall; +Cc: thomas.monjalon, wenzhuo.lu, dev, Stephen Hemminger

Hi Billy,

On Fri, Dec 16, 2016 at 07:48:49AM -0500, Billy McFall wrote:
> Add a new API to force free consumed buffers on TX ring. API will return
> the number of packets freed (0-n) or error code if feature not supported
> (-ENOTSUP) or input invalid (-ENODEV).
>
> Because rte_eth_tx_buffer() may be used, and mbufs may still be held
> in local buffer, the API also accepts *buffer and *sent. Before
> attempting to free, rte_eth_tx_buffer_flush() is called to make sure
> all mbufs are sent to Tx ring. rte_eth_tx_buffer_flush() is called even
> if threshold is not met.
>
> Signed-off-by: Billy McFall <bmcfall@redhat.com>
> ---
>  lib/librte_ether/rte_ethdev.h | 56 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 56 insertions(+)
>
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 9678179..e3f2be4 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1150,6 +1150,9 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev,
>  typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset);
>  /**< @internal Check DD bit of specific RX descriptor */
>
> +typedef int (*eth_tx_done_cleanup_t)(void *txq, uint32_t free_cnt);
> +/**< @internal Force mbufs to be freed from TX ring. */
> +
>  typedef void (*eth_rxq_info_get_t)(struct rte_eth_dev *dev,
>  	uint16_t rx_queue_id, struct rte_eth_rxq_info *qinfo);
>
> @@ -1467,6 +1470,7 @@ struct eth_dev_ops {
>  	eth_rx_disable_intr_t rx_queue_intr_disable;
>  	eth_tx_queue_setup_t tx_queue_setup;/**< Set up device TX queue.*/
>  	eth_queue_release_t tx_queue_release;/**< Release TX queue.*/
> +	eth_tx_done_cleanup_t tx_done_cleanup;/**< Free tx ring mbufs */
>  	eth_dev_led_on_t dev_led_on; /**< Turn on LED. */
>  	eth_dev_led_off_t dev_led_off; /**< Turn off LED. */
>  	flow_ctrl_get_t flow_ctrl_get; /**< Get flow control. */
> @@ -2943,6 +2947,58 @@ rte_eth_tx_buffer(uint8_t port_id, uint16_t queue_id,
>  }
>
>  /**
> + * Request the driver to free mbufs currently cached by the driver. The
> + * driver will only free the mbuf if it is no longer in use.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param queue_id
> + *   The index of the transmit queue through which output packets must be
> + *   sent.
> + *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
> + *   to rte_eth_dev_configure().
> + * @param free_cnt
> + *   Maximum number of packets to free. Use 0 to indicate all possible packets
> + *   should be freed. Note that a packet may be using multiple mbufs.
> + * @param buffer
> + *   Buffer used to collect packets to be sent. If provided, the buffer will
> + *   be flushed, even if the current length is less than buffer->size. Pass NULL
> + *   if buffer has already been flushed.
> + * @param sent
> + *   Pointer to return number of packets sent if buffer has packets to be sent.
> + *   If *buffer is supplied, *sent must also be supplied.
> + * @return
> + *   Failure: < 0
> + *     -ENODEV: Invalid interface
> + *     -ENOTSUP: Driver does not support function
> + *   Success: >= 0
> + *     0-n: Number of packets freed. More packets may still remain in ring that
> + *     are in use.
> + */
> +
> +static inline int
> +rte_eth_tx_done_cleanup(uint8_t port_id, uint16_t queue_id, uint32_t free_cnt,
> +		struct rte_eth_dev_tx_buffer *buffer, uint16_t *sent)
> +{
> +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> +
> +	/* Validate Input Data. Bail if not valid or not supported. */
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_done_cleanup, -ENOTSUP);
> +
> +	/*
> +	 * If transmit buffer is provided and there are still packets to be
> +	 * sent, then send them before attempting to free pending mbufs.
> +	 */
> +	if (buffer && sent)
> +		*sent = rte_eth_tx_buffer_flush(port_id, queue_id, buffer);
> +
> +	/* Call driver to free pending mbufs. */
> +	return (*dev->dev_ops->tx_done_cleanup)(dev->data->tx_queues[queue_id],
> +			free_cnt);
> +}
> +
> +/**
>   * Configure a callback for buffered packets which cannot be sent
>   *
>   * Register a specific callback to be called when an attempt is made to send

Just a thought to follow up on Stephen's comment and further simplify
this API: how about not adding any new eth_dev_ops, but instead defining
what should happen during an empty TX burst call (tx_burst() with 0
packets)?

Several PMDs already have a check for this scenario and start by
cleaning up completed packets anyway, so they effectively implement
part of this definition for free already.

The main difference with this API would be that you wouldn't know how
many mbufs were freed and wouldn't collect them into an array. However,
most applications have one mbuf pool and/or know where their mbufs come
from, so they can just query the pool or attempt to re-allocate from it
after doing empty bursts in case of starvation.

[1] http://dpdk.org/ml/archives/dev/2016-December/052469.html

-- 
Adrien Mazarguil
6WIND

* Re: [dpdk-dev] [PATCH 1/3] ethdev: New API to free consumed buffers in TX ring
  2016-12-20 11:27   ` Adrien Mazarguil
@ 2016-12-20 12:17     ` Ananyev, Konstantin
  2016-12-20 12:58       ` Adrien Mazarguil
  0 siblings, 1 reply; 12+ messages in thread
From: Ananyev, Konstantin @ 2016-12-20 12:17 UTC
  To: Adrien Mazarguil, Billy McFall
  Cc: thomas.monjalon, Lu, Wenzhuo, dev, Stephen Hemminger

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Tuesday, December 20, 2016 11:28 AM
> To: Billy McFall <bmcfall@redhat.com>
> Cc: thomas.monjalon@6wind.com; Lu, Wenzhuo <wenzhuo.lu@intel.com>; dev@dpdk.org; Stephen Hemminger
> <stephen@networkplumber.org>
> Subject: Re: [dpdk-dev] [PATCH 1/3] ethdev: New API to free consumed buffers in TX ring
>
> Hi Billy,
>
> On Fri, Dec 16, 2016 at 07:48:49AM -0500, Billy McFall wrote:
> > Add a new API to force free consumed buffers on TX ring. API will return
> > the number of packets freed (0-n) or error code if feature not supported
> > (-ENOTSUP) or input invalid (-ENODEV).
> >
> > Because rte_eth_tx_buffer() may be used, and mbufs may still be held
> > in local buffer, the API also accepts *buffer and *sent. Before
> > attempting to free, rte_eth_tx_buffer_flush() is called to make sure
> > all mbufs are sent to Tx ring. rte_eth_tx_buffer_flush() is called even
> > if threshold is not met.
> >
> > Signed-off-by: Billy McFall <bmcfall@redhat.com>
> > ---
> >  lib/librte_ether/rte_ethdev.h | 56 +++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 56 insertions(+)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > index 9678179..e3f2be4 100644
> > --- a/lib/librte_ether/rte_ethdev.h
> > +++ b/lib/librte_ether/rte_ethdev.h
> > @@ -1150,6 +1150,9 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev,
> >  typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset);
> >  /**< @internal Check DD bit of specific RX descriptor */
> >
> > +typedef int (*eth_tx_done_cleanup_t)(void *txq, uint32_t free_cnt);
> > +/**< @internal Force mbufs to be freed from TX ring. */
> > +
> >  typedef void (*eth_rxq_info_get_t)(struct rte_eth_dev *dev,
> >  	uint16_t rx_queue_id, struct rte_eth_rxq_info *qinfo);
> >
> > @@ -1467,6 +1470,7 @@ struct eth_dev_ops {
> >  	eth_rx_disable_intr_t rx_queue_intr_disable;
> >  	eth_tx_queue_setup_t tx_queue_setup;/**< Set up device TX queue.*/
> >  	eth_queue_release_t tx_queue_release;/**< Release TX queue.*/
> > +	eth_tx_done_cleanup_t tx_done_cleanup;/**< Free tx ring mbufs */
> >  	eth_dev_led_on_t dev_led_on; /**< Turn on LED. */
> >  	eth_dev_led_off_t dev_led_off; /**< Turn off LED. */
> >  	flow_ctrl_get_t flow_ctrl_get; /**< Get flow control. */
> > @@ -2943,6 +2947,58 @@ rte_eth_tx_buffer(uint8_t port_id, uint16_t queue_id,
> >  }
> >
> >  /**
> > + * Request the driver to free mbufs currently cached by the driver. The
> > + * driver will only free the mbuf if it is no longer in use.
> > + *
> > + * @param port_id
> > + *   The port identifier of the Ethernet device.
> > + * @param queue_id
> > + *   The index of the transmit queue through which output packets must be
> > + *   sent.
> > + *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
> > + *   to rte_eth_dev_configure().
> > + * @param free_cnt
> > + *   Maximum number of packets to free. Use 0 to indicate all possible packets
> > + *   should be freed. Note that a packet may be using multiple mbufs.
> > + * @param buffer
> > + *   Buffer used to collect packets to be sent. If provided, the buffer will
> > + *   be flushed, even if the current length is less than buffer->size. Pass NULL
> > + *   if buffer has already been flushed.
> > + * @param sent
> > + *   Pointer to return number of packets sent if buffer has packets to be sent.
> > + *   If *buffer is supplied, *sent must also be supplied.
> > + * @return
> > + *   Failure: < 0
> > + *     -ENODEV: Invalid interface
> > + *     -ENOTSUP: Driver does not support function
> > + *   Success: >= 0
> > + *     0-n: Number of packets freed. More packets may still remain in ring that
> > + *     are in use.
> > + */
> > +
> > +static inline int
> > +rte_eth_tx_done_cleanup(uint8_t port_id, uint16_t queue_id, uint32_t free_cnt,
> > +		struct rte_eth_dev_tx_buffer *buffer, uint16_t *sent)
> > +{
> > +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> > +
> > +	/* Validate Input Data. Bail if not valid or not supported. */
> > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_done_cleanup, -ENOTSUP);
> > +
> > +	/*
> > +	 * If transmit buffer is provided and there are still packets to be
> > +	 * sent, then send them before attempting to free pending mbufs.
> > +	 */
> > +	if (buffer && sent)
> > +		*sent = rte_eth_tx_buffer_flush(port_id, queue_id, buffer);
> > +
> > +	/* Call driver to free pending mbufs. */
> > +	return (*dev->dev_ops->tx_done_cleanup)(dev->data->tx_queues[queue_id],
> > +			free_cnt);
> > +}
> > +
> > +/**
> >   * Configure a callback for buffered packets which cannot be sent
> >   *
> >   * Register a specific callback to be called when an attempt is made to send
>
> Just a thought to follow-up on Stephen's comment to further simplify this
> API, how about not adding any new eth_dev_ops but instead defining what
> should happen during an empty TX burst call (tx_burst() with 0 packets).
>
> Several PMDs already have a check for this scenario and start by cleaning up
> completed packets anyway, they effectively partially implement this
> definition for free already.

Many PMDs start cleaning up only when the number of free entries
drops below some point.
Also, in that case the author would have to modify (and test) all existing TX routines.
So I think a separate API call seems more plausible.

Though I agree with the previous comment from Stephen that the last two parameters
are redundant and would just overcomplicate things.
Konstantin

>
> The main difference with this API would be that you wouldn't know how many
> mbufs were freed and wouldn't collect them into an array. However most
> applications have one mbuf pool and/or know where they come from, so they
> can just query the pool or attempt to re-allocate from it after doing empty
> bursts in case of starvation.
>
> [1] http://dpdk.org/ml/archives/dev/2016-December/052469.html
>
> --
> Adrien Mazarguil
> 6WIND

* Re: [dpdk-dev] [PATCH 1/3] ethdev: New API to free consumed buffers in TX ring
  2016-12-20 12:17     ` Ananyev, Konstantin
@ 2016-12-20 12:58       ` Adrien Mazarguil
  2016-12-20 14:15         ` Billy McFall
  0 siblings, 1 reply; 12+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 12:58 UTC
  To: Ananyev, Konstantin
  Cc: Billy McFall, thomas.monjalon, Lu, Wenzhuo, dev, Stephen Hemminger

On Tue, Dec 20, 2016 at 12:17:10PM +0000, Ananyev, Konstantin wrote:
>
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> > Sent: Tuesday, December 20, 2016 11:28 AM
> > To: Billy McFall <bmcfall@redhat.com>
> > Cc: thomas.monjalon@6wind.com; Lu, Wenzhuo <wenzhuo.lu@intel.com>; dev@dpdk.org; Stephen Hemminger
> > <stephen@networkplumber.org>
> > Subject: Re: [dpdk-dev] [PATCH 1/3] ethdev: New API to free consumed buffers in TX ring
> >
> > Hi Billy,
> >
> > On Fri, Dec 16, 2016 at 07:48:49AM -0500, Billy McFall wrote:
> > > Add a new API to force free consumed buffers on TX ring. API will return
> > > the number of packets freed (0-n) or error code if feature not supported
> > > (-ENOTSUP) or input invalid (-ENODEV).
> > >
> > > Because rte_eth_tx_buffer() may be used, and mbufs may still be held
> > > in local buffer, the API also accepts *buffer and *sent. Before
> > > attempting to free, rte_eth_tx_buffer_flush() is called to make sure
> > > all mbufs are sent to Tx ring. rte_eth_tx_buffer_flush() is called even
> > > if threshold is not met.
> > >
> > > Signed-off-by: Billy McFall <bmcfall@redhat.com>
> > > ---
> > >  lib/librte_ether/rte_ethdev.h | 56 +++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 56 insertions(+)
> > >
> > > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > > index 9678179..e3f2be4 100644
> > > --- a/lib/librte_ether/rte_ethdev.h
> > > +++ b/lib/librte_ether/rte_ethdev.h
> > > @@ -1150,6 +1150,9 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev,
> > >  typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset);
> > >  /**< @internal Check DD bit of specific RX descriptor */
> > >
> > > +typedef int (*eth_tx_done_cleanup_t)(void *txq, uint32_t free_cnt);
> > > +/**< @internal Force mbufs to be freed from TX ring. */
> > > +
> > >  typedef void (*eth_rxq_info_get_t)(struct rte_eth_dev *dev,
> > >  	uint16_t rx_queue_id, struct rte_eth_rxq_info *qinfo);
> > >
> > > @@ -1467,6 +1470,7 @@ struct eth_dev_ops {
> > >  	eth_rx_disable_intr_t rx_queue_intr_disable;
> > >  	eth_tx_queue_setup_t tx_queue_setup;/**< Set up device TX queue.*/
> > >  	eth_queue_release_t tx_queue_release;/**< Release TX queue.*/
> > > +	eth_tx_done_cleanup_t tx_done_cleanup;/**< Free tx ring mbufs */
> > >  	eth_dev_led_on_t dev_led_on; /**< Turn on LED. */
> > >  	eth_dev_led_off_t dev_led_off; /**< Turn off LED. */
> > >  	flow_ctrl_get_t flow_ctrl_get; /**< Get flow control. */
> > > @@ -2943,6 +2947,58 @@ rte_eth_tx_buffer(uint8_t port_id, uint16_t queue_id,
> > >  }
> > >
> > >  /**
> > > + * Request the driver to free mbufs currently cached by the driver. The
> > > + * driver will only free the mbuf if it is no longer in use.
> > > + *
> > > + * @param port_id
> > > + *   The port identifier of the Ethernet device.
> > > + * @param queue_id
> > > + *   The index of the transmit queue through which output packets must be
> > > + *   sent.
> > > + *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
> > > + *   to rte_eth_dev_configure().
> > > + * @param free_cnt
> > > + *   Maximum number of packets to free. Use 0 to indicate all possible packets
> > > + *   should be freed. Note that a packet may be using multiple mbufs.
> > > + * @param buffer
> > > + *   Buffer used to collect packets to be sent. If provided, the buffer will
> > > + *   be flushed, even if the current length is less than buffer->size. Pass NULL
> > > + *   if buffer has already been flushed.
> > > + * @param sent
> > > + *   Pointer to return number of packets sent if buffer has packets to be sent.
> > > + *   If *buffer is supplied, *sent must also be supplied.
> > > + * @return
> > > + *   Failure: < 0
> > > + *     -ENODEV: Invalid interface
> > > + *     -ENOTSUP: Driver does not support function
> > > + *   Success: >= 0
> > > + *     0-n: Number of packets freed. More packets may still remain in ring that
> > > + *     are in use.
> > > + */
> > > +
> > > +static inline int
> > > +rte_eth_tx_done_cleanup(uint8_t port_id, uint16_t queue_id, uint32_t free_cnt,
> > > +		struct rte_eth_dev_tx_buffer *buffer, uint16_t *sent)
> > > +{
> > > +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> > > +
> > > +	/* Validate Input Data. Bail if not valid or not supported. */
> > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_done_cleanup, -ENOTSUP);
> > > +
> > > +	/*
> > > +	 * If transmit buffer is provided and there are still packets to be
> > > +	 * sent, then send them before attempting to free pending mbufs.
> > > +	 */
> > > +	if (buffer && sent)
> > > +		*sent = rte_eth_tx_buffer_flush(port_id, queue_id, buffer);
> > > +
> > > +	/* Call driver to free pending mbufs. */
> > > +	return (*dev->dev_ops->tx_done_cleanup)(dev->data->tx_queues[queue_id],
> > > +			free_cnt);
> > > +}
> > > +
> > > +/**
> > >   * Configure a callback for buffered packets which cannot be sent
> > >   *
> > >   * Register a specific callback to be called when an attempt is made to send
> >
> > Just a thought to follow-up on Stephen's comment to further simplify this
> > API, how about not adding any new eth_dev_ops but instead defining what
> > should happen during an empty TX burst call (tx_burst() with 0 packets).
> >
> > Several PMDs already have a check for this scenario and start by cleaning up
> > completed packets anyway, they effectively partially implement this
> > definition for free already.
>
> Many PMDs start by cleaning up only when number of free entries
> drop below some point.
> Also in that case the author would have to modify (and test) all existing TX routines.
> So I think a separate API call seems more plausible.

Not necessarily. As I understand it, this API in its current form only
suggests that a PMD should release a few mbufs from a queue if possible,
without any guarantee; PMDs are not forced to comply.

I think the threshold you mention is a valid reason not to release them,
and it wouldn't change a thing in existing tx_burst() implementations in
the meantime (only documentation).

This threshold could also be bypassed rather painlessly in the
"if (unlikely(nb_pkts == 0))" case that all PMDs already check for in
one way or another.

> Though I agree with previous comment from Stephen that last two parameters
> are redundant and would just overcomplicate things.
> Konstantin
>
> >
> > The main difference with this API would be that you wouldn't know how many
> > mbufs were freed and wouldn't collect them into an array. However most
> > applications have one mbuf pool and/or know where they come from, so they
> > can just query the pool or attempt to re-allocate from it after doing empty
> > bursts in case of starvation.
> >
> > [1] http://dpdk.org/ml/archives/dev/2016-December/052469.html

-- 
Adrien Mazarguil
6WIND

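The two behaviours being debated in this exchange (reclaiming only below a free-descriptor threshold, versus letting an empty burst bypass that threshold) can be compared in a small stand-alone model. This is a hedged sketch, not code from any PMD: `toy_txq`, `toy_reclaim()`, `toy_tx_burst_thresh()` and `toy_tx_burst_empty_cleans()` are hypothetical names, descriptors are reduced to counters, and hardware completion of newly sent packets is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical TX queue reduced to counters; not a DPDK structure. */
struct toy_txq {
	uint16_t nb_free;     /* descriptors available for new packets */
	uint16_t nb_done;     /* completed by hardware, mbufs not yet freed */
	uint16_t free_thresh; /* reclaim only when nb_free drops below this */
};

/* Release every completed descriptor and report how many were released. */
static uint16_t
toy_reclaim(struct toy_txq *q)
{
	uint16_t n = q->nb_done;

	q->nb_free = (uint16_t)(q->nb_free + n);
	q->nb_done = 0;
	return n;
}

/* Threshold-only behaviour: cleanup happens only under descriptor pressure. */
static uint16_t
toy_tx_burst_thresh(struct toy_txq *q, uint16_t nb_pkts)
{
	uint16_t sent;

	if (q->nb_free < q->free_thresh)
		toy_reclaim(q);
	sent = nb_pkts < q->nb_free ? nb_pkts : q->nb_free;
	q->nb_free = (uint16_t)(q->nb_free - sent);
	return sent;
}

/* Variant where an empty burst bypasses the threshold and always reclaims. */
static uint16_t
toy_tx_burst_empty_cleans(struct toy_txq *q, uint16_t nb_pkts)
{
	if (nb_pkts == 0) {
		toy_reclaim(q);
		return 0;
	}
	return toy_tx_burst_thresh(q, nb_pkts);
}
```

In this model, with 400 free descriptors, 100 completed ones and a threshold of 32, an empty burst through the threshold-only path leaves the 100 completed entries untouched, while the bypass variant releases them. Note that the bypass variant cannot tell the caller how many were released, since the burst return value is the number of packets sent.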
* Re: [dpdk-dev] [PATCH 1/3] ethdev: New API to free consumed buffers in TX ring 2016-12-20 12:58 ` Adrien Mazarguil @ 2016-12-20 14:15 ` Billy McFall 2016-12-23 9:45 ` Adrien Mazarguil 0 siblings, 1 reply; 12+ messages in thread From: Billy McFall @ 2016-12-20 14:15 UTC (permalink / raw) To: Adrien Mazarguil Cc: Ananyev, Konstantin, thomas.monjalon, Lu, Wenzhuo, dev, Stephen Hemminger Thank you for your responses, see inline. On Tue, Dec 20, 2016 at 7:58 AM, Adrien Mazarguil <adrien.mazarguil@6wind.com> wrote: > On Tue, Dec 20, 2016 at 12:17:10PM +0000, Ananyev, Konstantin wrote: >> >> >> > -----Original Message----- >> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil >> > Sent: Tuesday, December 20, 2016 11:28 AM >> > To: Billy McFall <bmcfall@redhat.com> >> > Cc: thomas.monjalon@6wind.com; Lu, Wenzhuo <wenzhuo.lu@intel.com>; dev@dpdk.org; Stephen Hemminger >> > <stephen@networkplumber.org> >> > Subject: Re: [dpdk-dev] [PATCH 1/3] ethdev: New API to free consumed buffers in TX ring >> > >> > Hi Billy, >> > >> > On Fri, Dec 16, 2016 at 07:48:49AM -0500, Billy McFall wrote: >> > > Add a new API to force free consumed buffers on TX ring. API will return >> > > the number of packets freed (0-n) or error code if feature not supported >> > > (-ENOTSUP) or input invalid (-ENODEV). >> > > >> > > Because rte_eth_tx_buffer() may be used, and mbufs may still be held >> > > in local buffer, the API also accepts *buffer and *sent. Before >> > > attempting to free, rte_eth_tx_buffer_flush() is called to make sure >> > > all mbufs are sent to Tx ring. rte_eth_tx_buffer_flush() is called even >> > > if threshold is not met. 
>> > > >> > > Signed-off-by: Billy McFall <bmcfall@redhat.com> >> > > --- >> > > lib/librte_ether/rte_ethdev.h | 56 +++++++++++++++++++++++++++++++++++++++++++ >> > > 1 file changed, 56 insertions(+) >> > > >> > > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h >> > > index 9678179..e3f2be4 100644 >> > > --- a/lib/librte_ether/rte_ethdev.h >> > > +++ b/lib/librte_ether/rte_ethdev.h >> > > @@ -1150,6 +1150,9 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev, >> > > typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset); >> > > /**< @internal Check DD bit of specific RX descriptor */ >> > > >> > > +typedef int (*eth_tx_done_cleanup_t)(void *txq, uint32_t free_cnt); >> > > +/**< @internal Force mbufs to be from TX ring. */ >> > > + >> > > typedef void (*eth_rxq_info_get_t)(struct rte_eth_dev *dev, >> > > uint16_t rx_queue_id, struct rte_eth_rxq_info *qinfo); >> > > >> > > @@ -1467,6 +1470,7 @@ struct eth_dev_ops { >> > > eth_rx_disable_intr_t rx_queue_intr_disable; >> > > eth_tx_queue_setup_t tx_queue_setup;/**< Set up device TX queue.*/ >> > > eth_queue_release_t tx_queue_release;/**< Release TX queue.*/ >> > > + eth_tx_done_cleanup_t tx_done_cleanup;/**< Free tx ring mbufs */ >> > > eth_dev_led_on_t dev_led_on; /**< Turn on LED. */ >> > > eth_dev_led_off_t dev_led_off; /**< Turn off LED. */ >> > > flow_ctrl_get_t flow_ctrl_get; /**< Get flow control. */ >> > > @@ -2943,6 +2947,58 @@ rte_eth_tx_buffer(uint8_t port_id, uint16_t queue_id, >> > > } >> > > >> > > /** >> > > + * Request the driver to free mbufs currently cached by the driver. The >> > > + * driver will only free the mbuf if it is no longer in use. >> > > + * >> > > + * @param port_id >> > > + * The port identifier of the Ethernet device. >> > > + * @param queue_id >> > > + * The index of the transmit queue through which output packets must be >> > > + * sent. 
>> > > + * The value must be in the range [0, nb_tx_queue - 1] previously supplied >> > > + * to rte_eth_dev_configure(). >> > > + * @param free_cnt >> > > + * Maximum number of packets to free. Use 0 to indicate all possible packets >> > > + * should be freed. Note that a packet may be using multiple mbufs. >> > > + * @param buffer >> > > + * Buffer used to collect packets to be sent. If provided, the buffer will >> > > + * be flushed, even if the current length is less than buffer->size. Pass NULL >> > > + * if buffer has already been flushed. >> > > + * @param sent >> > > + * Pointer to return number of packets sent if buffer has packets to be sent. >> > > + * If *buffer is supplied, *sent must also be supplied. >> > > + * @return >> > > + * Failure: < 0 >> > > + * -ENODEV: Invalid interface >> > > + * -ENOTSUP: Driver does not support function >> > > + * Success: >= 0 >> > > + * 0-n: Number of packets freed. More packets may still remain in ring that >> > > + * are in use. >> > > + */ >> > > + >> > > +static inline int >> > > +rte_eth_tx_done_cleanup(uint8_t port_id, uint16_t queue_id, uint32_t free_cnt, >> > > + struct rte_eth_dev_tx_buffer *buffer, uint16_t *sent) >> > > +{ >> > > + struct rte_eth_dev *dev = &rte_eth_devices[port_id]; >> > > + >> > > + /* Validate Input Data. Bail if not valid or not supported. */ >> > > + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV); >> > > + RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_done_cleanup, -ENOTSUP); >> > > + >> > > + /* >> > > + * If transmit buffer is provided and there are still packets to be >> > > + * sent, then send them before attempting to free pending mbufs. >> > > + */ >> > > + if (buffer && sent) >> > > + *sent = rte_eth_tx_buffer_flush(port_id, queue_id, buffer); >> > > + >> > > + /* Call driver to free pending mbufs. 
*/ >> > > + return (*dev->dev_ops->tx_done_cleanup)(dev->data->tx_queues[queue_id], >> > > + free_cnt); >> > > +} >> > > + >> > > +/** >> > > * Configure a callback for buffered packets which cannot be sent >> > > * >> > > * Register a specific callback to be called when an attempt is made to send >> > I will remove the buffer/sent parameters. It will be the application's responsibility to make sure rte_eth_tx_buffer_flush() is called. I don't feel strongly about the free_cnt parameter. It was in the original request so that if there was a large ring buffer, the API could bail early without having to go through the entire ring. It might be a little unrealistic for the application to truly know how many mbufs it wants freed. Also, as an example, the i40e driver already has an i40e_tx_free_bufs(...) function, so by dropping the free_cnt parameter, this function could be reused without having to account for the free_cnt. >> > Just a thought to follow-up on Stephen's comment to further simplify this >> > API, how about not adding any new eth_dev_ops but instead defining what >> > should happen during an empty TX burst call (tx_burst() with 0 packets). >> > In the original API request thread, see dpdk-dev mailing list from 11/21/2016 with subject "Adding API to force freeing consumed buffers in TX ring", overloading the existing API with nb_pkts == 0 was suggested and the consensus was to go with a new API. I lean towards a new API since this is a special case most applications won't use, but I will go with the community on whether to enhance the existing burst functionality or add a new API. >> > Several PMDs already have a check for this scenario and start by cleaning up >> > completed packets anyway, they effectively partially implement this >> > definition for free already. >> >> Many PMDs start by cleaning up only when number of free entries >> drop below some point. 
True, but the original request for this API was for the scenario where packets are being flooded and the application wanted to reuse mbuf to avoid a packet copy. So the API was to request the driver to free "done" mbufs outside of any threshold. >> Also in that case the author would have to modify (and test) all existing TX routinies. >> So I think a separate API call seems more plausible. > > Not necessarily, as I understand this API in its current form only suggests > that a PMD should release a few mbufs from a queue if possible, without any > guarantee, PMDs are not forced to comply. > > I think the threshold you mention is a valid reason not to release them, and > it wouldn't change a thing to existing tx_burst() implementations in the > meantime (only documentation). > > This threshold could also be bypassed rather painlessly in the > "if (unlikely(nb_pkts == 0))" case that all PMDs already check for in a > way or another. > >> Though I am agree with previous comment from Stephen that last two parameters >> are redundant and would just overcomplicate things. >> tin >> >> > >> > The main difference with this API would be that you wouldn't know how many >> > mbufs were freed and wouldn't collect them into an array. However most >> > applications have one mbuf pool and/or know where they come from, so they >> > can just query the pool or attempt to re-allocate from it after doing empty >> > bursts in case of starvation. >> > >> > [1] http://dpdk.org/ml/archives/dev/2016-December/052469.html > > -- > Adrien Mazarguil > 6WIND ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [dpdk-dev] [PATCH 1/3] ethdev: New API to free consumed buffers in TX ring 2016-12-20 14:15 ` Billy McFall @ 2016-12-23 9:45 ` Adrien Mazarguil 0 siblings, 0 replies; 12+ messages in thread From: Adrien Mazarguil @ 2016-12-23 9:45 UTC (permalink / raw) To: Billy McFall Cc: Ananyev, Konstantin, thomas.monjalon, Lu, Wenzhuo, dev, Stephen Hemminger Hi Billy, On Tue, Dec 20, 2016 at 09:15:50AM -0500, Billy McFall wrote: > Thank you for your responses, see inline. > > On Tue, Dec 20, 2016 at 7:58 AM, Adrien Mazarguil > <adrien.mazarguil@6wind.com> wrote: > > On Tue, Dec 20, 2016 at 12:17:10PM +0000, Ananyev, Konstantin wrote: > >> > >> > >> > -----Original Message----- > >> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil > >> > Sent: Tuesday, December 20, 2016 11:28 AM > >> > To: Billy McFall <bmcfall@redhat.com> > >> > Cc: thomas.monjalon@6wind.com; Lu, Wenzhuo <wenzhuo.lu@intel.com>; dev@dpdk.org; Stephen Hemminger > >> > <stephen@networkplumber.org> > >> > Subject: Re: [dpdk-dev] [PATCH 1/3] ethdev: New API to free consumed buffers in TX ring > >> > > >> > Hi Billy, > >> > > >> > On Fri, Dec 16, 2016 at 07:48:49AM -0500, Billy McFall wrote: > >> > > Add a new API to force free consumed buffers on TX ring. API will return > >> > > the number of packets freed (0-n) or error code if feature not supported > >> > > (-ENOTSUP) or input invalid (-ENODEV). > >> > > > >> > > Because rte_eth_tx_buffer() may be used, and mbufs may still be held > >> > > in local buffer, the API also accepts *buffer and *sent. Before > >> > > attempting to free, rte_eth_tx_buffer_flush() is called to make sure > >> > > all mbufs are sent to Tx ring. rte_eth_tx_buffer_flush() is called even > >> > > if threshold is not met. 
> >> > > > >> > > Signed-off-by: Billy McFall <bmcfall@redhat.com> > >> > > --- > >> > > lib/librte_ether/rte_ethdev.h | 56 +++++++++++++++++++++++++++++++++++++++++++ > >> > > 1 file changed, 56 insertions(+) > >> > > > >> > > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h > >> > > index 9678179..e3f2be4 100644 > >> > > --- a/lib/librte_ether/rte_ethdev.h > >> > > +++ b/lib/librte_ether/rte_ethdev.h > >> > > @@ -1150,6 +1150,9 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev, > >> > > typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset); > >> > > /**< @internal Check DD bit of specific RX descriptor */ > >> > > > >> > > +typedef int (*eth_tx_done_cleanup_t)(void *txq, uint32_t free_cnt); > >> > > +/**< @internal Force mbufs to be from TX ring. */ > >> > > + > >> > > typedef void (*eth_rxq_info_get_t)(struct rte_eth_dev *dev, > >> > > uint16_t rx_queue_id, struct rte_eth_rxq_info *qinfo); > >> > > > >> > > @@ -1467,6 +1470,7 @@ struct eth_dev_ops { > >> > > eth_rx_disable_intr_t rx_queue_intr_disable; > >> > > eth_tx_queue_setup_t tx_queue_setup;/**< Set up device TX queue.*/ > >> > > eth_queue_release_t tx_queue_release;/**< Release TX queue.*/ > >> > > + eth_tx_done_cleanup_t tx_done_cleanup;/**< Free tx ring mbufs */ > >> > > eth_dev_led_on_t dev_led_on; /**< Turn on LED. */ > >> > > eth_dev_led_off_t dev_led_off; /**< Turn off LED. */ > >> > > flow_ctrl_get_t flow_ctrl_get; /**< Get flow control. */ > >> > > @@ -2943,6 +2947,58 @@ rte_eth_tx_buffer(uint8_t port_id, uint16_t queue_id, > >> > > } > >> > > > >> > > /** > >> > > + * Request the driver to free mbufs currently cached by the driver. The > >> > > + * driver will only free the mbuf if it is no longer in use. > >> > > + * > >> > > + * @param port_id > >> > > + * The port identifier of the Ethernet device. > >> > > + * @param queue_id > >> > > + * The index of the transmit queue through which output packets must be > >> > > + * sent. 
> >> > > + * The value must be in the range [0, nb_tx_queue - 1] previously supplied > >> > > + * to rte_eth_dev_configure(). > >> > > + * @param free_cnt > >> > > + * Maximum number of packets to free. Use 0 to indicate all possible packets > >> > > + * should be freed. Note that a packet may be using multiple mbufs. > >> > > + * @param buffer > >> > > + * Buffer used to collect packets to be sent. If provided, the buffer will > >> > > + * be flushed, even if the current length is less than buffer->size. Pass NULL > >> > > + * if buffer has already been flushed. > >> > > + * @param sent > >> > > + * Pointer to return number of packets sent if buffer has packets to be sent. > >> > > + * If *buffer is supplied, *sent must also be supplied. > >> > > + * @return > >> > > + * Failure: < 0 > >> > > + * -ENODEV: Invalid interface > >> > > + * -ENOTSUP: Driver does not support function > >> > > + * Success: >= 0 > >> > > + * 0-n: Number of packets freed. More packets may still remain in ring that > >> > > + * are in use. > >> > > + */ > >> > > + > >> > > +static inline int > >> > > +rte_eth_tx_done_cleanup(uint8_t port_id, uint16_t queue_id, uint32_t free_cnt, > >> > > + struct rte_eth_dev_tx_buffer *buffer, uint16_t *sent) > >> > > +{ > >> > > + struct rte_eth_dev *dev = &rte_eth_devices[port_id]; > >> > > + > >> > > + /* Validate Input Data. Bail if not valid or not supported. */ > >> > > + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV); > >> > > + RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_done_cleanup, -ENOTSUP); > >> > > + > >> > > + /* > >> > > + * If transmit buffer is provided and there are still packets to be > >> > > + * sent, then send them before attempting to free pending mbufs. > >> > > + */ > >> > > + if (buffer && sent) > >> > > + *sent = rte_eth_tx_buffer_flush(port_id, queue_id, buffer); > >> > > + > >> > > + /* Call driver to free pending mbufs. 
*/ > >> > > + return (*dev->dev_ops->tx_done_cleanup)(dev->data->tx_queues[queue_id], > >> > > + free_cnt); > >> > > +} > >> > > + > >> > > +/** > >> > > * Configure a callback for buffered packets which cannot be sent > >> > > * > >> > > * Register a specific callback to be called when an attempt is made to send > >> > > > I will remove the buffer/sent parameters. It will be the applications > responsibility > to make sure rte_eth_tx_buffer_flush() is called. > > I don't feel strongly about the free_cnt parameter. It was in the > original request > so that if there was a large ring buffer, the API could bail early > without having > to go through all the entire ring. It might be a little unrealistic > for the application > to truly know how many mbufs it wants freed. Also, as an example, the I40e > driver already has a i40e_tx_free_bufs(...) function, so by dropping > the free_cnt > parameter, this function could be reused without having to account for > the free_cnt. > > >> > Just a thought to follow-up on Stephen's comment to further simplify this > >> > API, how about not adding any new eth_dev_ops but instead defining what > >> > should happen during an empty TX burst call (tx_burst() with 0 packets). > >> > > > In the original API request thread, see dpdk-dev mailing list from 11/21/2016 > with subject "Adding API to force freeing consumed buffers in TX ring", > overloading the existing API with nb_pkts == 0 was suggested and consensus > was to go with new API. I lean towards a new API since this is a special case > most applications won't use, but I will go with the community on whether to > enhance the existing burst functionality or add a new API. OK, I've just read the original thread. > >> > Several PMDs already have a check for this scenario and start by cleaning up > >> > completed packets anyway, they effectively partially implement this > >> > definition for free already. 
> >> > >> Many PMDs start by cleaning up only when number of free entries > >> drop below some point. > > True, but the original request for this API was for the scenario where packets > are being flooded and the application wanted to reuse mbuf to avoid a packet > copy. So the API was to request the driver to free "done" mbufs outside of any > threshold. Understood, so it's more than just a polite suggestion to PMDs that implement this call. In my opinion it's still better to avoid adding a new callback for that purpose since applications cannot rely on a specific outcome, it cannot guarantee any mbuf would be freed, not unlike calling tx_burst() with 0 packets. That's a separate discussion, however perhaps making struct eth_dev_ops part of the public API was not such a good idea after all. We're unable to maintain ABI compatibility across releases because of it. New callbacks would be met with less resistance (at least on my side) if this whole ABI compat thing was not an issue. > >> Also in that case the author would have to modify (and test) all existing TX routinies. > >> So I think a separate API call seems more plausible. > > > > Not necessarily, as I understand this API in its current form only suggests > > that a PMD should release a few mbufs from a queue if possible, without any > > guarantee, PMDs are not forced to comply. > > > > I think the threshold you mention is a valid reason not to release them, and > > it wouldn't change a thing to existing tx_burst() implementations in the > > meantime (only documentation). > > > > This threshold could also be bypassed rather painlessly in the > > "if (unlikely(nb_pkts == 0))" case that all PMDs already check for in a > > way or another. > > > >> Though I am agree with previous comment from Stephen that last two parameters > >> are redundant and would just overcomplicate things. 
> >> tin > >> > >> > > >> > The main difference with this API would be that you wouldn't know how many > >> > mbufs were freed and wouldn't collect them into an array. However most > >> > applications have one mbuf pool and/or know where they come from, so they > >> > can just query the pool or attempt to re-allocate from it after doing empty > >> > bursts in case of starvation. > >> > > >> > [1] http://dpdk.org/ml/archives/dev/2016-December/052469.html > > > > -- > > Adrien Mazarguil > > 6WIND -- Adrien Mazarguil 6WIND ^ permalink raw reply [flat|nested] 12+ messages in thread
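For readers following the "empty burst" alternative discussed above, here is a minimal sketch of the idea, written against a toy queue model rather than any real PMD. All names (toy_txq, toy_reclaim, toy_tx_burst, pool_avail) are hypothetical and exist only for illustration; the point is that a burst call with nb_pkts == 0 bypasses the free threshold, after which the application can look at its pool instead of receiving a count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a TX queue. Not taken from any DPDK driver. */
struct toy_txq {
	uint16_t nb_free;     /* free descriptors in the ring */
	uint16_t free_thresh; /* reclaim only when nb_free drops below this */
	uint16_t nb_done;     /* descriptors the hardware has finished with */
	uint16_t pool_avail;  /* stand-in for mbufs available in the pool */
};

/* Return completed descriptors to the ring and their mbufs to the pool. */
static void
toy_reclaim(struct toy_txq *q)
{
	q->nb_free = (uint16_t)(q->nb_free + q->nb_done);
	q->pool_avail = (uint16_t)(q->pool_avail + q->nb_done);
	q->nb_done = 0;
}

/*
 * Toy burst function. A normal call reclaims only below the threshold;
 * an empty call (nb_pkts == 0) always reclaims, which is the behaviour
 * proposed in the thread as an alternative to a new callback.
 */
static uint16_t
toy_tx_burst(struct toy_txq *q, uint16_t nb_pkts)
{
	assert(q != NULL);

	if (nb_pkts == 0 || q->nb_free < q->free_thresh)
		toy_reclaim(q);

	if (nb_pkts > q->nb_free)
		nb_pkts = q->nb_free;
	q->nb_free = (uint16_t)(q->nb_free - nb_pkts);
	return nb_pkts;
}
```

In this model the caller learns nothing from the return value of the empty burst; it has to inspect pool_avail (in a real application, query or re-allocate from its mbuf pool) to see whether anything came back, which is the trade-off raised in the message above.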
* [dpdk-dev] [PATCH 2/3] driver: e1000 igb support to free consumed buffers 2016-12-16 12:48 [dpdk-dev] [PATCH 0/3] New API to free consumed buffers in TX ring Billy McFall 2016-12-16 12:48 ` [dpdk-dev] [PATCH 1/3] ethdev: " Billy McFall @ 2016-12-16 12:48 ` Billy McFall 2016-12-16 12:48 ` [dpdk-dev] [PATCH 3/3] driver: vHost " Billy McFall 2 siblings, 0 replies; 12+ messages in thread From: Billy McFall @ 2016-12-16 12:48 UTC (permalink / raw) To: thomas.monjalon, wenzhuo.lu; +Cc: dev, Billy McFall Add support to the e1000 igb driver for the new API to force free consumed buffers on TX ring. e1000 igb driver does not implement a tx_rs_thresh to free mbufs, it frees a slot in the ring as needed, so a new function needed to be written. Signed-off-by: Billy McFall <bmcfall@redhat.com> --- drivers/net/e1000/e1000_ethdev.h | 2 + drivers/net/e1000/igb_ethdev.c | 1 + drivers/net/e1000/igb_rxtx.c | 126 +++++++++++++++++++++++++++++++++++++++ 3 files changed, 129 insertions(+) diff --git a/drivers/net/e1000/e1000_ethdev.h b/drivers/net/e1000/e1000_ethdev.h index 6c25c8d..fce0278 100644 --- a/drivers/net/e1000/e1000_ethdev.h +++ b/drivers/net/e1000/e1000_ethdev.h @@ -308,6 +308,8 @@ int eth_igb_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id, uint16_t nb_tx_desc, unsigned int socket_id, const struct rte_eth_txconf *tx_conf); +int eth_igb_tx_done_cleanup(void *txq, uint32_t free_cnt); + int eth_igb_rx_init(struct rte_eth_dev *dev); void eth_igb_tx_init(struct rte_eth_dev *dev); diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c index 2fddf0c..4010dc4 100644 --- a/drivers/net/e1000/igb_ethdev.c +++ b/drivers/net/e1000/igb_ethdev.c @@ -402,6 +402,7 @@ static const struct eth_dev_ops eth_igb_ops = { .rx_descriptor_done = eth_igb_rx_descriptor_done, .tx_queue_setup = eth_igb_tx_queue_setup, .tx_queue_release = eth_igb_tx_queue_release, + .tx_done_cleanup = eth_igb_tx_done_cleanup, .dev_led_on = eth_igb_led_on, .dev_led_off = 
eth_igb_led_off, .flow_ctrl_get = eth_igb_flow_ctrl_get, diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c index dbd37ac..1d4fbcb 100644 --- a/drivers/net/e1000/igb_rxtx.c +++ b/drivers/net/e1000/igb_rxtx.c @@ -1227,6 +1227,132 @@ eth_igb_tx_queue_release(void *txq) igb_tx_queue_release(txq); } +static int +igb_tx_done_cleanup(struct igb_tx_queue *txq, uint32_t free_cnt) +{ + struct igb_tx_entry *sw_ring; + volatile union e1000_adv_tx_desc *txr; + uint16_t tx_first; /* First segment analyzed. */ + uint16_t tx_id; /* Current segment being processed. */ + uint16_t tx_last; /* Last segment in the current packet. */ + uint16_t tx_next; /* First segment of the next packet. */ + int count; + + if (txq != NULL) { + count = 0; + sw_ring = txq->sw_ring; + txr = txq->tx_ring; + + /* + * tx_tail is the last sent packet on the sw_ring. Goto the end + * of that packet (the last segment in the packet chain) and + * then the next segment will be the start of the oldest segment + * in the sw_ring. This is the first packet that will be + * attempted to be freed. + */ + + /* Get last segment in most recently added packet. */ + tx_first = sw_ring[txq->tx_tail].last_id; + + /* Get the next segment, which is the oldest segment in ring. */ + tx_first = sw_ring[tx_first].next_id; + + /* Set the current index to the first. */ + tx_id = tx_first; + + /* + * Loop through each packet. For each packet, verify that an + * mbuf exists and that the last segment is free. If so, free + * it and move on. + */ + while (1) { + tx_last = sw_ring[tx_id].last_id; + + if (sw_ring[tx_last].mbuf) { + if (txr[tx_last].wb.status & + E1000_TXD_STAT_DD) { + /* + * Increment the number of packets + * freed. + */ + count++; + + /* Get the start of the next packet. */ + tx_next = sw_ring[tx_last].next_id; + + /* + * Loop through all segments in a + * packet. 
+ */ + do { + rte_pktmbuf_free_seg(sw_ring[tx_id].mbuf); + sw_ring[tx_id].mbuf = NULL; + sw_ring[tx_id].last_id = tx_id; + + /* Move to next segment. */ + tx_id = sw_ring[tx_id].next_id; + + } while (tx_id != tx_next); + + if (unlikely(count == (int)free_cnt)) + break; + } else + /* + * mbuf still in use, nothing left to + * free. + */ + break; + } else { + /* + * There are multiple reasons to be here: + * 1) All the packets on the ring have been + * freed - tx_id is equal to tx_first + * and some packets have been freed. + * - Done, exit + * 2) Interface has not sent a ring's worth of + * packets yet, so the segment after tail is + * still empty. Or a previous call to this + * function freed some of the segments but + * not all so there is a hole in the list. + * Hopefully this is a rare case. + * - Walk the list and find the next mbuf. If + * there isn't one, then done. + */ + if (likely((tx_id == tx_first) && (count != 0))) + break; + + /* + * Walk the list and find the next mbuf, if any. + */ + do { + /* Move to next segment. */ + tx_id = sw_ring[tx_id].next_id; + + if (sw_ring[tx_id].mbuf) + break; + + } while (tx_id != tx_first); + + /* + * Determine why previous loop bailed. If there + * is not an mbuf, done. + */ + if (sw_ring[tx_id].mbuf == NULL) + break; + } + } + } else + count = -ENODEV; + + return count; +} + +int +eth_igb_tx_done_cleanup(void *txq, uint32_t free_cnt) +{ + return igb_tx_done_cleanup(txq, free_cnt); +} + static void igb_reset_tx_queue_stat(struct igb_tx_queue *txq) { -- 2.9.3 ^ permalink raw reply [flat|nested] 12+ messages in thread
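The ring walk in the patch above can be hard to follow in diff form, so here is a simplified, self-contained model of the same idea. It is only a sketch under stated assumptions: the toy_* types and helpers are hypothetical, a boolean stands in for the descriptor DD status bit and for the mbuf pointer, and the queue tracks its oldest in-use slot directly, so the "hole in the list" case handled by the patch is deliberately left out. As in the patch, free_cnt == 0 means "free every completed packet".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_SIZE 8

/* One software ring slot. Hypothetical; loosely modelled on the patch. */
struct toy_tx_entry {
	bool has_mbuf;    /* stands in for sw_ring[i].mbuf != NULL */
	bool dd;          /* stands in for the descriptor DD status bit */
	uint16_t last_id; /* last segment of the packet this slot belongs to */
	uint16_t next_id; /* next slot in the ring */
};

struct toy_tx_queue {
	struct toy_tx_entry sw_ring[TOY_RING_SIZE];
	uint16_t oldest;  /* first slot of the oldest packet not yet freed */
};

static void
toy_ring_init(struct toy_tx_queue *txq)
{
	uint16_t i;

	for (i = 0; i < TOY_RING_SIZE; i++) {
		txq->sw_ring[i].has_mbuf = false;
		txq->sw_ring[i].dd = false;
		txq->sw_ring[i].last_id = i;
		txq->sw_ring[i].next_id = (uint16_t)((i + 1) % TOY_RING_SIZE);
	}
	txq->oldest = 0;
}

/* Place a packet of nb_segs segments starting at slot 'first'. */
static void
toy_add_packet(struct toy_tx_queue *txq, uint16_t first, uint16_t nb_segs,
	       bool done)
{
	uint16_t last, id = first, i;

	assert(nb_segs > 0 && nb_segs <= TOY_RING_SIZE);
	last = (uint16_t)((first + nb_segs - 1) % TOY_RING_SIZE);
	for (i = 0; i < nb_segs; i++) {
		txq->sw_ring[id].has_mbuf = true;
		txq->sw_ring[id].last_id = last;
		id = txq->sw_ring[id].next_id;
	}
	txq->sw_ring[last].dd = done; /* DD is reported on the last segment */
}

/* Free completed packets, oldest first. Returns packets freed or -ENODEV. */
static int
toy_tx_done_cleanup(struct toy_tx_queue *txq, uint32_t free_cnt)
{
	uint16_t tx_id, tx_last, tx_next;
	int count = 0;

	if (txq == NULL)
		return -ENODEV;

	tx_id = txq->oldest;
	while (txq->sw_ring[tx_id].has_mbuf) {
		tx_last = txq->sw_ring[tx_id].last_id;
		if (!txq->sw_ring[tx_last].dd)
			break; /* packet still in use, nothing left to free */

		/* Release every segment of this packet. */
		tx_next = txq->sw_ring[tx_last].next_id;
		do {
			txq->sw_ring[tx_id].has_mbuf = false;
			txq->sw_ring[tx_id].dd = false;
			txq->sw_ring[tx_id].last_id = tx_id;
			tx_id = txq->sw_ring[tx_id].next_id;
		} while (tx_id != tx_next);

		count++;
		if (free_cnt != 0 && (uint32_t)count == free_cnt)
			break;
	}
	txq->oldest = tx_id;
	return count;
}
```

Note that the count is in packets, not mbufs: a two-segment packet releases two slots but increments the return value once, matching the "a packet may be using multiple mbufs" remark in the API comment of patch 1/3.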
* [dpdk-dev] [PATCH 3/3] driver: vHost support to free consumed buffers 2016-12-16 12:48 [dpdk-dev] [PATCH 0/3] New API to free consumed buffers in TX ring Billy McFall 2016-12-16 12:48 ` [dpdk-dev] [PATCH 1/3] ethdev: " Billy McFall 2016-12-16 12:48 ` [dpdk-dev] [PATCH 2/3] driver: e1000 igb support to free consumed buffers Billy McFall @ 2016-12-16 12:48 ` Billy McFall 2016-12-16 16:24 ` Stephen Hemminger 2 siblings, 1 reply; 12+ messages in thread From: Billy McFall @ 2016-12-16 12:48 UTC (permalink / raw) To: thomas.monjalon, wenzhuo.lu; +Cc: dev, Billy McFall Add support to the vHost driver for the new API to force free consumed buffers on TX ring. vHost does not cache the mbufs so there is no work to do. Signed-off-by: Billy McFall <bmcfall@redhat.com> --- drivers/net/vhost/rte_eth_vhost.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c index 766d4ef..6493d56 100644 --- a/drivers/net/vhost/rte_eth_vhost.c +++ b/drivers/net/vhost/rte_eth_vhost.c @@ -939,6 +939,16 @@ eth_queue_release(void *q) } static int +eth_tx_done_cleanup(void *txq __rte_unused, uint32_t free_cnt __rte_unused) +{ + /* + * vHost does not hang onto mbuf. eth_vhost_tx() copies packet data + * and releases mbuf, so nothing to cleanup. + */ + return 0; +} + +static int eth_link_update(struct rte_eth_dev *dev __rte_unused, int wait_to_complete __rte_unused) { @@ -979,6 +989,7 @@ static const struct eth_dev_ops ops = { .tx_queue_setup = eth_tx_queue_setup, .rx_queue_release = eth_queue_release, .tx_queue_release = eth_queue_release, + .tx_done_cleanup = eth_tx_done_cleanup, .link_update = eth_link_update, .stats_get = eth_stats_get, .stats_reset = eth_stats_reset, -- 2.9.3 ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [dpdk-dev] [PATCH 3/3] driver: vHost support to free consumed buffers 2016-12-16 12:48 ` [dpdk-dev] [PATCH 3/3] driver: vHost " Billy McFall @ 2016-12-16 16:24 ` Stephen Hemminger 2017-01-11 19:54 ` Billy McFall 0 siblings, 1 reply; 12+ messages in thread From: Stephen Hemminger @ 2016-12-16 16:24 UTC (permalink / raw) To: Billy McFall; +Cc: thomas.monjalon, wenzhuo.lu, dev On Fri, 16 Dec 2016 07:48:51 -0500 Billy McFall <bmcfall@redhat.com> wrote: > Add support to the vHostdriver for the new API to force free consumed > buffers on TX ring. vHost does not cache the mbufs so there is no work > to do. > > Signed-off-by: Billy McFall <bmcfall@redhat.com> > --- > drivers/net/vhost/rte_eth_vhost.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c > index 766d4ef..6493d56 100644 > --- a/drivers/net/vhost/rte_eth_vhost.c > +++ b/drivers/net/vhost/rte_eth_vhost.c > @@ -939,6 +939,16 @@ eth_queue_release(void *q) > } > > static int > +eth_tx_done_cleanup(void *txq __rte_unused, uint32_t free_cnt __rte_unused) > +{ > + /* > + * vHost does not hang onto mbuf. eth_vhost_tx() copies packet data > + * and releases mbuf, so nothing to cleanup. > + */ > + return 0; > +} > + > +static int > eth_link_update(struct rte_eth_dev *dev __rte_unused, > int wait_to_complete __rte_unused) > { > @@ -979,6 +989,7 @@ static const struct eth_dev_ops ops = { > .tx_queue_setup = eth_tx_queue_setup, > .rx_queue_release = eth_queue_release, > .tx_queue_release = eth_queue_release, > + .tx_done_cleanup = eth_tx_done_cleanup, > .link_update = eth_link_update, > .stats_get = eth_stats_get, > .stats_reset = eth_stats_reset, Rather than having to change every drive, why not make this the default behavior? ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [dpdk-dev] [PATCH 3/3] driver: vHost support to free consumed buffers 2016-12-16 16:24 ` Stephen Hemminger @ 2017-01-11 19:54 ` Billy McFall 0 siblings, 0 replies; 12+ messages in thread From: Billy McFall @ 2017-01-11 19:54 UTC (permalink / raw) To: Stephen Hemminger; +Cc: thomas.monjalon, wenzhuo.lu, dev This new API is attempting to address two scenarios: 1) Application wants to reuse existing mbuf to avoid a packet copy (example: Flooding a packet to multiple ports). The application increments the reference count of the packet and then polls new API until the reference count for the given mbuf is decremented. 2) Application runs out of mbufs, or some application like pktgen finishes a run and is preparing for an additional run, calls API to free consumed packets so processing can continue. With the current design, the application calls the new API. If rval >= 0, it can assume mbufs are being freed and can call the API multiple times if need be (to either get enough mbufs to continue or to get the specific one freed). If rval < 0, it takes some other action, like making a copy of the packet in the flooding case or whatever is currently done in the application today. If the default behavior is to return 0, the application can't take any additional actions. Submitting V2 of the patch with the rte_eth_tx_buffer_flush() call and associated parameters removed and to continue the discussion on new API or not. Thanks, Billy McFall On Fri, Dec 16, 2016 at 11:24 AM, Stephen Hemminger < stephen@networkplumber.org> wrote: > On Fri, 16 Dec 2016 07:48:51 -0500 > Billy McFall <bmcfall@redhat.com> wrote: > > > Add support to the vHostdriver for the new API to force free consumed > > buffers on TX ring. vHost does not cache the mbufs so there is no work > > to do. 
> > > > Signed-off-by: Billy McFall <bmcfall@redhat.com> > > --- > > drivers/net/vhost/rte_eth_vhost.c | 11 +++++++++++ > > 1 file changed, 11 insertions(+) > > > > diff --git a/drivers/net/vhost/rte_eth_vhost.c > b/drivers/net/vhost/rte_eth_vhost.c > > index 766d4ef..6493d56 100644 > > --- a/drivers/net/vhost/rte_eth_vhost.c > > +++ b/drivers/net/vhost/rte_eth_vhost.c > > @@ -939,6 +939,16 @@ eth_queue_release(void *q) > > } > > > > static int > > +eth_tx_done_cleanup(void *txq __rte_unused, uint32_t free_cnt > __rte_unused) > > +{ > > + /* > > + * vHost does not hang onto mbuf. eth_vhost_tx() copies packet data > > + * and releases mbuf, so nothing to cleanup. > > + */ > > + return 0; > > +} > > + > > +static int > > eth_link_update(struct rte_eth_dev *dev __rte_unused, > > int wait_to_complete __rte_unused) > > { > > @@ -979,6 +989,7 @@ static const struct eth_dev_ops ops = { > > .tx_queue_setup = eth_tx_queue_setup, > > .rx_queue_release = eth_queue_release, > > .tx_queue_release = eth_queue_release, > > + .tx_done_cleanup = eth_tx_done_cleanup, > > .link_update = eth_link_update, > > .stats_get = eth_stats_get, > > .stats_reset = eth_stats_reset, > > Rather than having to change every drive, why not make this the default > behavior? > ^ permalink raw reply [flat|nested] 12+ messages in thread
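To make the calling convention described in the reply above concrete, here is a minimal sketch of the flooding case, written against mock callbacks so it builds without DPDK. Everything named toy_* is hypothetical: toy_cleanup_fn stands in for the proposed cleanup call on one TX queue, and a plain uint16_t stands in for the mbuf reference count. It assumes the application holds one reference itself, so a count of 1 means the driver has released its copy.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the proposed cleanup call on one TX queue. */
typedef int (*toy_cleanup_fn)(void *ctx, uint32_t free_cnt);

enum toy_wait_result {
	TOY_MBUF_REUSABLE, /* only the application's reference is left */
	TOY_MUST_COPY,     /* rval < 0: fall back to copying the packet */
	TOY_STILL_BUSY     /* gave up polling; driver still holds the mbuf */
};

/*
 * Poll the cleanup call until the reference count drops to 1, the call
 * reports an error, or max_polls attempts have been made.
 */
static enum toy_wait_result
toy_wait_for_mbuf(toy_cleanup_fn cleanup, void *ctx, const uint16_t *refcnt,
		  unsigned int max_polls)
{
	unsigned int i;

	assert(cleanup != NULL && refcnt != NULL);

	for (i = 0; i < max_polls; i++) {
		if (*refcnt == 1)
			return TOY_MBUF_REUSABLE;
		if (cleanup(ctx, 0) < 0)
			return TOY_MUST_COPY;
	}
	return (*refcnt == 1) ? TOY_MBUF_REUSABLE : TOY_STILL_BUSY;
}

/* Mock driver without support: always reports -ENOTSUP. */
static int
toy_cleanup_unsupported(void *ctx, uint32_t free_cnt)
{
	(void)ctx;
	(void)free_cnt;
	return -ENOTSUP;
}

/* Mock driver that releases one reference per call while any are held. */
static int
toy_cleanup_drops_ref(void *ctx, uint32_t free_cnt)
{
	uint16_t *ref = ctx;

	(void)free_cnt;
	if (*ref > 1) {
		(*ref)--;
		return 1;
	}
	return 0;
}
```

The sketch also shows why a default implementation that returns 0 is awkward for this use case: the caller could not tell "nothing to free right now" from "this driver will never free on request", and would keep polling until max_polls instead of copying the packet immediately.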