From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id 3D3328E6A for ; Fri, 15 Jan 2016 19:45:00 +0100 (CET) Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga103.fm.intel.com with ESMTP; 15 Jan 2016 10:44:58 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.22,301,1449561600"; d="scan'208";a="634045514" Received: from irsmsx106.ger.corp.intel.com ([163.33.3.31]) by FMSMGA003.fm.intel.com with ESMTP; 15 Jan 2016 10:44:45 -0800 Received: from irsmsx105.ger.corp.intel.com ([169.254.7.104]) by IRSMSX106.ger.corp.intel.com ([169.254.8.88]) with mapi id 14.03.0248.002; Fri, 15 Jan 2016 18:44:43 +0000 From: "Ananyev, Konstantin" To: "Kulasek, TomaszX" , "dev@dpdk.org" Thread-Topic: [dpdk-dev] [PATCH 1/2] ethdev: add buffered tx api Thread-Index: AQHRT6NgF5mTPTzM9EGbrlxNGe76s57822EQ Date: Fri, 15 Jan 2016 18:44:42 +0000 Message-ID: <2601191342CEEE43887BDE71AB97725836AE637C@irsmsx105.ger.corp.intel.com> References: <1452869038-9140-1-git-send-email-tomaszx.kulasek@intel.com> <1452869038-9140-2-git-send-email-tomaszx.kulasek@intel.com> In-Reply-To: <1452869038-9140-2-git-send-email-tomaszx.kulasek@intel.com> Accept-Language: en-IE, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [163.33.239.181] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH 1/2] ethdev: add buffered tx api X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 Jan 2016 18:45:01 -0000 Hi Tomasz, > static int > rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues) > { > uint16_t old_nb_queues =3D dev->data->nb_tx_queues; > void **txq; > + struct rte_eth_dev_tx_buffer *new_bufs; > unsigned i; >=20 > if (dev->data->tx_queues =3D=3D NULL) { /* first time configuration */ > @@ -841,17 +872,40 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev= , uint16_t nb_queues) > dev->data->nb_tx_queues =3D 0; > return -(ENOMEM); > } > + > + dev->data->txq_bufs =3D rte_zmalloc("ethdev->txq_bufs", > + sizeof(*dev->data->txq_bufs) * nb_queues, 0); > + if (dev->data->txq_bufs =3D=3D NULL) { > + dev->data->nb_tx_queues =3D 0; > + rte_free(dev->data->tx_queues); > + return -(ENOMEM); > + } > + > } else { /* re-configure */ > + > + /* flush the packets queued for all queues*/ > + for (i =3D 0; i < old_nb_queues; i++) > + rte_eth_tx_buffer_flush(dev->data->port_id, i); > + I don't think it is safe to call tx_burst() at queue config stage. Instead you need to flush (or just empty) your txq)bufs at tx_queue_stop st= age. > RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_release, -ENOTSUP); >=20 > + /* get new buffer space first, but keep old space around */ > + new_bufs =3D rte_zmalloc("ethdev->txq_bufs", > + sizeof(*dev->data->txq_bufs) * nb_queues, 0); > + if (new_bufs =3D=3D NULL) > + return -(ENOMEM); > + Why not to allocate space for txq_bufs together with tx_queues (as one chun= k for both)? As I understand there is always one to one mapping between them anyway. Would simplify things a bit. Or even introduce a new struct to group with all related tx queue info toge= tehr struct rte_eth_txq_data { void *queue; /*actual pmd queue*/ struct rte_eth_dev_tx_buffer buf; uint8_t state; } And use it inside struct rte_eth_dev_data? Would probably give a better data locality. > txq =3D dev->data->tx_queues; >=20 > for (i =3D nb_queues; i < old_nb_queues; i++) > (*dev->dev_ops->tx_queue_release)(txq[i]); > txq =3D rte_realloc(txq, sizeof(txq[0]) * nb_queues, > RTE_CACHE_LINE_SIZE); > - if (txq =3D=3D NULL) > - return -ENOMEM; > + if (txq =3D=3D NULL) { > + rte_free(new_bufs); > + return -(ENOMEM); > + } > + > if (nb_queues > old_nb_queues) { > uint16_t new_qs =3D nb_queues - old_nb_queues; >=20 > @@ -861,6 +915,9 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, = uint16_t nb_queues) >=20 > dev->data->tx_queues =3D txq; >=20 > + /* now replace old buffers with new */ > + rte_free(dev->data->txq_bufs); > + dev->data->txq_bufs =3D new_bufs; > } > dev->data->nb_tx_queues =3D nb_queues; > return 0; > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.= h > index bada8ad..23faa6a 100644 > --- a/lib/librte_ether/rte_ethdev.h > +++ b/lib/librte_ether/rte_ethdev.h > @@ -1,7 +1,7 @@ > /*- > * BSD LICENSE > * > - * Copyright(c) 2010-2015 Intel Corporation. All rights reserved. > + * Copyright(c) 2010-2016 Intel Corporation. All rights reserved. > * All rights reserved. > * > * Redistribution and use in source and binary forms, with or without > @@ -182,6 +182,7 @@ extern "C" { > #include > #include > #include > +#include > #include "rte_ether.h" > #include "rte_eth_ctrl.h" > #include "rte_dev_info.h" > @@ -1519,6 +1520,34 @@ enum rte_eth_dev_type { > RTE_ETH_DEV_MAX /**< max value of this enum */ > }; >=20 > +typedef void (*buffer_tx_error_fn)(struct rte_mbuf **unsent, uint16_t co= unt, > + void *userdata); > + > +/** > + * @internal > + * Structure used to buffer packets for future TX > + * Used by APIs rte_eth_tx_buffer and rte_eth_tx_buffer_flush > + */ > +struct rte_eth_dev_tx_buffer { > + struct rte_mbuf *pkts[RTE_ETHDEV_TX_BUFSIZE]; I think it is better to make size of pkts[] configurable at runtime. There are a lot of different usage scenarios - hard to predict what would b= e an optimal buffer size for all cases. =20 > + unsigned nb_pkts; > + uint64_t errors; > + /**< Total number of queue packets to sent that are dropped. */ > +}; > + > +/** > + * @internal > + * Structure to hold a callback to be used on error when a tx_buffer_flu= sh > + * call fails to send all packets. > + * This needs to be a separate structure, as it must go in the ethdev st= ructure > + * rather than ethdev_data, due to the use of a function pointer, which = is not > + * multi-process safe. > + */ > +struct rte_eth_dev_tx_buffer_err_cb { > + buffer_tx_error_fn flush_cb; /* callback for when tx_burst fails */ > + void *userdata; /* userdata for callback */ > +}; > + > /** > * @internal > * The generic data structure associated with each ethernet device. > @@ -1550,6 +1579,9 @@ struct rte_eth_dev { > struct rte_eth_rxtx_callback *pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT]= ; > uint8_t attached; /**< Flag indicating the port is attached */ > enum rte_eth_dev_type dev_type; /**< Flag indicating the device type */ > + > + /** Callbacks to be used on a tx_buffer_flush error */ > + struct rte_eth_dev_tx_buffer_err_cb tx_buf_err_cb[RTE_MAX_QUEUES_PER_PO= RT]; > }; >=20 > struct rte_eth_dev_sriov { > @@ -1610,6 +1642,8 @@ struct rte_eth_dev_data { > enum rte_kernel_driver kdrv; /**< Kernel driver passthrough */ > int numa_node; /**< NUMA node connection */ > const char *drv_name; /**< Driver name */ > + struct rte_eth_dev_tx_buffer *txq_bufs; > + /**< space to allow buffered transmits */ > }; >=20 > /** Device supports hotplug detach */ > @@ -2661,8 +2695,181 @@ rte_eth_tx_burst(uint8_t port_id, uint16_t queue_= id, > } >=20 > /** > - * The eth device event type for interrupt, and maybe others in the futu= re. > + * Buffer a single packet for future transmission on a port and queue > + * > + * This function takes a single mbuf/packet and buffers it for later > + * transmission on the particular port and queue specified. Once the buf= fer is > + * full of packets, an attempt will be made to transmit all the buffered > + * packets. In case of error, where not all packets can be transmitted, = a > + * callback is called with the unsent packets as a parameter. If no call= back > + * is explicitly set up, the unsent packets are just freed back to the o= wning > + * mempool. The function returns the number of packets actually sent i.e= . > + * 0 if no buffer flush occurred, otherwise the number of packets succes= sfully > + * flushed > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param queue_id > + * The index of the transmit queue through which output packets must b= e > + * sent. > + * The value must be in the range [0, nb_tx_queue - 1] previously supp= lied > + * to rte_eth_dev_configure(). > + * @param tx_pkt > + * Pointer to the packet mbuf to be sent. > + * @return > + * 0 =3D packet has been buffered for later transmission > + * N > 0 =3D packet has been buffered, and the buffer was subsequently= flushed, > + * causing N packets to be sent, and the error callback to be called= for > + * the rest. > + */ > +static inline uint16_t __attribute__((always_inline)) > +rte_eth_tx_buffer(uint8_t port_id, uint16_t queue_id, struct rte_mbuf *t= x_pkt) > +{ > + struct rte_eth_dev *dev =3D &rte_eth_devices[port_id]; > + struct rte_eth_dev_tx_buffer *qbuf =3D &dev->data->txq_bufs[queue_id]; > + uint16_t i; > + > + qbuf->pkts[qbuf->nb_pkts++] =3D tx_pkt; > + if (qbuf->nb_pkts < RTE_ETHDEV_TX_BUFSIZE) > + return 0; > + Probably just call rte_eth_tx_buffer_flush() here to avoid duplication. > + const uint16_t sent =3D rte_eth_tx_burst(port_id, queue_id, qbuf->pkts, > + RTE_ETHDEV_TX_BUFSIZE); > + > + qbuf->nb_pkts =3D 0; > + > + /* All packets sent, or to be dealt with by callback below */ > + if (unlikely(sent !=3D RTE_ETHDEV_TX_BUFSIZE)) { > + if (dev->tx_buf_err_cb[queue_id].flush_cb) > + dev->tx_buf_err_cb[queue_id].flush_cb(&qbuf->pkts[sent], > + RTE_ETHDEV_TX_BUFSIZE - sent, > + dev->tx_buf_err_cb[queue_id].userdata); > + else { > + qbuf->errors +=3D RTE_ETHDEV_TX_BUFSIZE - sent; > + for (i =3D sent; i < RTE_ETHDEV_TX_BUFSIZE; i++) > + rte_pktmbuf_free(qbuf->pkts[i]); > + } > + } > + > + return sent; > +} > + > +/** > + * Send any packets queued up for transmission on a port and HW queue > + * > + * This causes an explicit flush of packets previously buffered via the > + * rte_eth_tx_buffer() function. It returns the number of packets succes= sfully > + * sent to the NIC, and calls the error callback for any unsent packets.= Unless > + * explicitly set up otherwise, the default callback simply frees the un= sent > + * packets back to the owning mempool. > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param queue_id > + * The index of the transmit queue through which output packets must b= e > + * sent. > + * The value must be in the range [0, nb_tx_queue - 1] previously supp= lied > + * to rte_eth_dev_configure(). > + * @return > + * The number of packets successfully sent to the Ethernet device. The= error > + * callback is called for any packets which could not be sent. > + */ > +static inline uint16_t > +rte_eth_tx_buffer_flush(uint8_t port_id, uint16_t queue_id) > +{ > + uint16_t i; > + struct rte_eth_dev *dev =3D &rte_eth_devices[port_id]; > + struct rte_eth_dev_tx_buffer *qbuf =3D &dev->data->txq_bufs[queue_id]; > + > + if (qbuf->nb_pkts =3D=3D 0) > + return 0; > + > + const uint16_t to_send =3D qbuf->nb_pkts; > + > + const uint16_t sent =3D rte_eth_tx_burst(port_id, queue_id, qbuf->pkts, > + to_send); Try to avoid defining variables in the middle of the code block. Again no much value in having these 2 above variables as 'const'. Konstantin > + > + qbuf->nb_pkts =3D 0; > + > + /* All packets sent, or to be dealt with by callback below */ > + if (unlikely(sent !=3D to_send)) { > + if (dev->tx_buf_err_cb[queue_id].flush_cb) > + dev->tx_buf_err_cb[queue_id].flush_cb(&qbuf->pkts[sent], > + to_send - sent, > + dev->tx_buf_err_cb[queue_id].userdata); > + else { > + qbuf->errors +=3D to_send - sent; > + for (i =3D sent; i < to_send; i++) > + rte_pktmbuf_free(qbuf->pkts[i]); > + } > + } > + > + return sent; > +} > +