From: "Ananyev, Konstantin"
To: "Ananyev, Konstantin", "Kulasek, TomaszX", "dev@dpdk.org"
Date: Fri, 12 Feb 2016 11:44:42 +0000
Subject: Re: [dpdk-dev] [PATCH 1/2] ethdev: add buffered tx api
Message-ID: <2601191342CEEE43887BDE71AB97725836B062E2@irsmsx105.ger.corp.intel.com>
In-Reply-To: <2601191342CEEE43887BDE71AB97725836B05728@irsmsx105.ger.corp.intel.com>

>
> > -----Original Message-----
> > From: Kulasek, TomaszX
> > Sent: Tuesday, February 09, 2016 5:03 PM
> > To: Ananyev, Konstantin; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH 1/2] ethdev: add buffered tx api
> >
> >
> >
> > > -----Original Message-----
> > > From: Ananyev, Konstantin
> > > Sent: Tuesday, February 2, 2016 14:50
> > > To: Kulasek, TomaszX; dev@dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH 1/2] ethdev: add buffered tx api
> > >
> > > Hi Tomasz,
> > >
> > > > -----Original Message-----
> > > > From: Kulasek, TomaszX
> > > > Sent: Tuesday, February 02, 2016 10:01 AM
> > > > To: Ananyev, Konstantin; dev@dpdk.org
> > > > Subject: RE: [dpdk-dev] [PATCH 1/2] ethdev: add buffered tx api
> > > >
> > > > Hi Konstantin,
> > > >
> > > > > -----Original Message-----
> > > > > From: Ananyev, Konstantin
> > > > > Sent: Friday, January 15, 2016 19:45
> > > > > To: Kulasek, TomaszX; dev@dpdk.org
> > > > > Subject: RE: [dpdk-dev] [PATCH 1/2] ethdev: add buffered tx api
> > > > >
> > > > > Hi Tomasz,
> > > > >
> > > > > >
> > > > > > +	/* get new buffer space first, but keep old space around */
> > > > > > +	new_bufs = rte_zmalloc("ethdev->txq_bufs",
> > > > > > +			sizeof(*dev->data->txq_bufs) * nb_queues, 0);
> > > > > > +	if (new_bufs == NULL)
> > > > > > +		return -(ENOMEM);
> > > > > > +
> > > > >
> > > > > Why not to allocate space for txq_bufs together with tx_queues (as
> > > > > one chunk for both)?
> > > > > As I understand there is always one to one mapping between them anyway.
> > > > > Would simplify things a bit.
> > > > > Or even introduce a new struct to group with all related tx queue
> > > > > info together
> > > > > struct rte_eth_txq_data {
> > > > > 	void *queue; /*actual pmd queue*/
> > > > > 	struct rte_eth_dev_tx_buffer buf;
> > > > > 	uint8_t state;
> > > > > }
> > > > > And use it inside struct rte_eth_dev_data?
> > > > > Would probably give a better data locality.
> > > > >
> > > >
> > > > Introducing such a struct will require a huge rework of pmd drivers. I
> > > > don't think it's worth only for this one feature.
> > >
> > > Why not?
> > > Things are getting more and more messy here: now we have a separate array
> > > of pointer to queues, Separate array of queue states, you are going to add
> > > separate array of tx buffers.
> > > For me it seems logical to unite all these 3 fields into one sub-struct.
> > >
> >
> > I agree with you, and probably such a work will be nice also for rx queues, but these two changes impacts on another part of dpdk.
> > While buffered tx API is more client application helper.
> >
> > For me these two things are different features and should be made separately because:
> > 1) They are independent and can be done separately,
> > 2) They can (and should) be reviewed, tested and approved separately,
> > 3) They are addressed to another type of people (tx buffering to application developers, rte_eth_dev_data to pmd developers), so
> > another people can be interested in having (or not) one or second feature
>
> Such division seems a bit artificial to me :)
> You are making changes in rte_ethdev.[c,h] - I think that field regrouping would make code cleaner and easier to read/maintain.
>
> >
> > Even for bug tracking it will be cleaner to separate these two things. And yes, it is logical to unite it, maybe also for rx queues, but
> > should be discussed separately.
> >
> > I've made a prototype with this rework, and the impact on the code not related to this particular feature is too wide and strong to join
> > them. I would rather to provide it as independent patch for further discussion only on it, if needed.
>
> Sure, separate patch is fine.
> Why not to submit it as extra one in the series?
>
>
> > > >
> > > > >
> > > > > > +/**
> > > > > > + * @internal
> > > > > > + * Structure used to buffer packets for future TX
> > > > > > + * Used by APIs rte_eth_tx_buffer and rte_eth_tx_buffer_flush
> > > > > > + */
> > > > > > +struct rte_eth_dev_tx_buffer {
> > > > > > +	struct rte_mbuf *pkts[RTE_ETHDEV_TX_BUFSIZE];
> > > > >
> > > > > I think it is better to make size of pkts[] configurable at runtime.
> > > > > There are a lot of different usage scenarios - hard to predict what
> > > > > would be an optimal buffer size for all cases.
> > > > >
> > > >
> > > > This buffer is allocated in eth_dev shared memory, so there are two scenarios:
> > > > 1) We have preallocated buffer with maximal size, and then we can set
> > > > threshold level without restarting device, or
> > > > 2) We need to set its size before starting device.
> > > >
> > > > Second one is better, I think.
> > >
> > > Yep, I was thinking about 2) too.
> > > Might be an extra parameter in struct rte_eth_txconf.
> > >
> >
> > Struct rte_eth_txconf is passed to ethdev after rte_eth_dev_tx_queue_config, so we don't know its value when buffers are
> > allocated.
>
> Ok, and why allocation of the tx buffer can't be done at rte_eth_tx_queue_setup()?
>
> Actually just thought why not to let rte_eth_tx_buffer() to accept struct rte_eth_dev_tx_buffer * as a parameter:
> +static inline int __attribute__((always_inline))
> +rte_eth_tx_buffer(uint8_t port_id, uint16_t queue_id, struct rte_eth_dev_tx_buffer *txb, struct rte_mbuf *tx_pkt)
> ?
>
> In that case we don't need to make any changes at rte_ethdev.[h,c] to alloc/free/maintain tx_buffer inside each queue...
> It all will be upper layer responsibility.
> So no need to modify existing rte_ethdev structures/code.
> Again, no need for error callback - caller would check return value and decide what to do with unsent packets in the tx_buffer.
>

Just to summarise why I think it is better to have tx buffering managed on the app level:

1. Avoid any ABI change.
2. Avoid extra changes in rte_ethdev.c: tx_queue_setup/tx_queue_stop.
3. Provides much more flexibility to the user:
   a) where to allocate space for tx_buffer (stack, heap, hugepages, etc).
   b) user can mix and match plain tx_burst() and tx_buffer/tx_buffer_flush()
      in any way he feels is appropriate.
   c) user can change the size of tx_buffer without stop/re-config/start queue:
      just allocate new larger (smaller) tx_buffer & copy contents to the new one.
   d) user can preserve buffered packets through device restart cycle:
      i.e. if, let's say, a TX hang happened and user has to do dev_stop/dev_start -
      contents of tx_buffer will stay unchanged and could be
      (re-)transmitted after device is up again, or through different port/queue if needed.

As a drawback mentioned - tx error handling becomes less transparent...
But we can add an error handling routine and its user provided parameter
into struct rte_eth_dev_tx_buffer, something like that:

+struct rte_eth_dev_tx_buffer {
+	buffer_tx_error_fn cbfn;
+	void *userdata;
+	unsigned nb_pkts;
+	uint64_t errors;
+	/**< Total number of queued packets to send that are dropped. */
+	struct rte_mbuf *pkts[];
+};

Konstantin
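
To make the app-level proposal above a bit more concrete, here is a minimal
sketch of how such a buffer might behave. It is only an illustration under
stated assumptions, not the patch itself: struct pkt stands in for
struct rte_mbuf, mock_tx_burst() stands in for the real tx burst call, and the
names tx_buffer_alloc/tx_buffer_add/tx_buffer_flush/count_drops as well as the
extra "size" field are hypothetical. The struct layout otherwise follows the
one suggested in the mail.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for struct rte_mbuf (assumption: real code would use the DPDK type). */
struct pkt { int id; };

/* Hypothetical callback type, modelled on buffer_tx_error_fn from the mail. */
typedef void (*tx_error_fn)(struct pkt **unsent, unsigned count, void *userdata);

/* App-owned tx buffer; "size" is an addition made for this sketch only. */
struct tx_buffer {
	tx_error_fn cbfn;   /* invoked with the packets the burst did not accept */
	void *userdata;     /* passed through to cbfn unchanged */
	unsigned size;      /* capacity of pkts[], chosen by the application */
	unsigned nb_pkts;   /* packets currently buffered */
	uint64_t errors;    /* total packets handed to the error path */
	struct pkt *pkts[]; /* flexible array member, sized at allocation time */
};

/* Mock tx burst: pretends the queue accepts at most mock_tx_room packets. */
static unsigned mock_tx_room = 2;
static unsigned mock_tx_burst(struct pkt **pkts, unsigned n)
{
	(void)pkts;
	return n < mock_tx_room ? n : mock_tx_room;
}

/* The application decides where the buffer lives; here it is the heap. */
static struct tx_buffer *tx_buffer_alloc(unsigned size, tx_error_fn cbfn,
					 void *userdata)
{
	struct tx_buffer *txb = calloc(1, sizeof(*txb) + size * sizeof(txb->pkts[0]));
	if (txb == NULL)
		return NULL;
	txb->cbfn = cbfn;
	txb->userdata = userdata;
	txb->size = size;
	return txb;
}

/* Send everything buffered; unsent packets are counted and given to cbfn. */
static unsigned tx_buffer_flush(struct tx_buffer *txb)
{
	unsigned to_send = txb->nb_pkts;
	unsigned sent;

	if (to_send == 0)
		return 0;
	sent = mock_tx_burst(txb->pkts, to_send);
	txb->nb_pkts = 0;
	if (sent < to_send) {
		txb->errors += to_send - sent;
		if (txb->cbfn != NULL)
			txb->cbfn(&txb->pkts[sent], to_send - sent, txb->userdata);
	}
	return sent;
}

/* Buffer one packet; flush once the buffer is full. Returns packets sent. */
static unsigned tx_buffer_add(struct tx_buffer *txb, struct pkt *p)
{
	assert(txb->size > 0 && txb->nb_pkts < txb->size);
	txb->pkts[txb->nb_pkts++] = p;
	if (txb->nb_pkts < txb->size)
		return 0;
	return tx_buffer_flush(txb);
}

/* Example error callback: adds the number of unsent packets to *userdata. */
static void count_drops(struct pkt **unsent, unsigned count, void *userdata)
{
	(void)unsent;
	*(unsigned *)userdata += count;
}
```

With a buffer of 4 packets and the mock queue accepting 2, the fourth
tx_buffer_add() call triggers a flush that sends 2 packets and reports the
other 2 through the callback. Since the buffer is owned by the caller, it could
equally be flushed by hand, resized, or kept across a device restart, which is
the flexibility argued for in point 3.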