From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Yang, Zhiyong"
To: "Ananyev, Konstantin", Andrew Rybchenko, dev@dpdk.org
Cc: thomas.monjalon@6wind.com, "Richardson, Bruce"
Date: Sat, 21 Jan 2017 04:07:16 +0000
Subject: Re: [dpdk-dev] [RFC] lib/librte_ether: consistent PMD batching behavior
List-Id: DPDK patches and discussions <dev@dpdk.org>

> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Friday, January 20, 2017 7:25 PM
> To: Andrew Rybchenko; Yang, Zhiyong; dev@dpdk.org
> Cc: thomas.monjalon@6wind.com; Richardson, Bruce
> Subject: RE: [dpdk-dev] [RFC] lib/librte_ether: consistent PMD batching
> behavior
>
> > From: Andrew Rybchenko [mailto:arybchenko@solarflare.com]
> > Sent: Friday, January 20, 2017 10:26 AM
> > To: Yang, Zhiyong; dev@dpdk.org
> > Cc: thomas.monjalon@6wind.com; Richardson, Bruce; Ananyev, Konstantin
> > Subject: Re: [dpdk-dev] [RFC] lib/librte_ether: consistent PMD
> > batching behavior
> >
> > On 01/20/2017 12:51 PM, Zhiyong Yang wrote:
> > The rte_eth_tx_burst() function in the file rte_ethdev.h is invoked
> > by DPDK applications to transmit output packets on an output queue,
> > as follows:
> >
> > static inline uint16_t
> > rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id,
> >                  struct rte_mbuf **tx_pkts, uint16_t nb_pkts);
> >
> > Note: the fourth parameter, nb_pkts, is the number of packets to
> > transmit.
> > The rte_eth_tx_burst() function returns the number of packets it
> > actually sent. A return value equal to *nb_pkts* means that all
> > packets have been sent, and this likely signifies that further
> > output packets could be transmitted immediately. Applications that
> > implement a "send as many packets as possible" policy can check for
> > this specific case and keep invoking rte_eth_tx_burst() until a
> > value less than *nb_pkts* is returned.
> >
> > When you call rte_eth_tx_burst() only once, you may get different
> > behaviors from different PMDs. One problem every DPDK user faces is
> > that, whether or not it is necessary, they must take this policy
> > into consideration at the application level when using any specific
> > PMD to send packets. This adds usage complexity and easily confuses
> > DPDK users, since they have to learn the TX limits of each specific
> > PMD and handle its particular return value (the number of packets
> > transmitted successfully). Some PMD TX functions have a limit of
> > sending at most 32 packets per invocation, some have a limit of at
> > most 64 packets, others are implemented to send as many packets as
> > possible, and so on. This easily leads to incorrect usage by DPDK
> > users.
> >
> > This patch proposes to implement the above policy in the DPDK
> > library, in order to simplify application implementations and avoid
> > incorrect invocations as well. DPDK users would then not need to
> > consider the implementation policy or write duplicated code at the
> > application level when sending packets. In addition, users would not
> > need to know the differences between specific PMD TX routines: they
> > could pass an arbitrary number of packets to rte_eth_tx_burst() and
> > check the return value for the number of packets actually sent.
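[Editor's illustration: the application-level "send as many packets as possible" policy described above can be sketched as below. The burst routine here is a hypothetical mock that only models a 32-packet-per-call limit; a real application would call rte_eth_tx_burst() with an array of struct rte_mbuf pointers.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a PMD TX routine that, like some Intel PMDs, accepts at
 * most 32 packets per invocation. The mock ignores the packets and only
 * models the per-call limit. */
#define MOCK_PMD_MAX_BURST 32

static uint16_t
mock_tx_burst(void **tx_pkts, uint16_t nb_pkts)
{
	(void)tx_pkts;
	return nb_pkts > MOCK_PMD_MAX_BURST ? MOCK_PMD_MAX_BURST : nb_pkts;
}

/* The retry loop each application currently has to implement on top of
 * the burst API: keep calling until everything is sent or the queue
 * stops accepting packets. */
static uint16_t
tx_all(void **tx_pkts, uint16_t nb_pkts)
{
	uint16_t nb_tx = 0;

	while (nb_tx < nb_pkts) {
		uint16_t ret = mock_tx_burst(tx_pkts + nb_tx,
					     (uint16_t)(nb_pkts - nb_tx));
		if (ret == 0)	/* TX queue full: stop, report progress */
			break;
		nb_tx = (uint16_t)(nb_tx + ret);
	}
	return nb_tx;
}
```

With the mock's 32-packet cap, tx_all() on 100 packets loops four times (32 + 32 + 32 + 4) and reports 100 sent; this is exactly the duplicated boilerplate the RFC wants to move out of applications.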
> >
> > How to implement the policy in the DPDK library? Two solutions are
> > proposed below.
> >
> > Solution 1:
> > Implement wrapper functions to remove the limits for each specific
> > PMD, as i40e_xmit_pkts_simple and ixgbe_xmit_pkts_simple do.
> >
> > > IMHO, this solution is a bit better since it:
> > > 1. Does not affect other PMDs at all
> > > 2. Could be a bit faster for the PMDs which require it, since
> > >    there is no indirect function call on each iteration
> > > 3. Requires no ABI change
>
> I also would prefer solution number 1, for the reasons outlined by
> Andrew above.
> Also, IMO the current limitations on the number of packets to TX in
> some Intel PMD TX routines are somewhat artificial:
> - they are not caused by any real HW limitations
> - avoiding them at the PMD level shouldn't cause any performance or
>   functional degradation.
> So I don't see any good reason why, instead of fixing these
> limitations in our own PMDs, we are trying to push them to the upper
> (rte_ethdev) layer.
>
> Konstantin

Solution 1 indeed has the advantages that Andrew and Konstantin
described.

Zhiyong
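[Editor's illustration: the Solution 1 wrapper pattern discussed in this thread can be sketched as below. Names and the 32-packet cap are illustrative, loosely modeled on the i40e_xmit_pkts_simple() approach; this is not the actual driver code.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical PMD-internal per-call limit, like the fixed burst size
 * in some Intel TX fast paths. */
#define TX_MAX_BURST 32

/* Fast-path routine: the caller guarantees nb_pkts <= TX_MAX_BURST.
 * A real PMD would fill TX descriptors here; this mock accepts every
 * packet it is offered. */
static uint16_t
pmd_xmit_fixed_burst(void **tx_pkts, uint16_t nb_pkts)
{
	(void)tx_pkts;
	return nb_pkts;
}

/* Solution 1 wrapper: hides the fast-path limit inside the PMD, so the
 * application can pass an arbitrary nb_pkts and other PMDs are not
 * affected. */
static uint16_t
pmd_xmit_pkts(void **tx_pkts, uint16_t nb_pkts)
{
	uint16_t nb_tx = 0;

	if (nb_pkts <= TX_MAX_BURST)
		return pmd_xmit_fixed_burst(tx_pkts, nb_pkts);

	while (nb_pkts > 0) {
		uint16_t req = nb_pkts > TX_MAX_BURST ?
			       TX_MAX_BURST : nb_pkts;
		uint16_t ret = pmd_xmit_fixed_burst(tx_pkts + nb_tx, req);

		nb_tx = (uint16_t)(nb_tx + ret);
		nb_pkts = (uint16_t)(nb_pkts - ret);
		if (ret < req)	/* TX ring full: report partial send */
			break;
	}
	return nb_tx;
}
```

Because the loop lives inside the PMD's own burst entry point, the common case (nb_pkts within the limit) still goes straight to the fixed-burst fast path with no indirect call per iteration, which is the performance argument made above.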