From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from prod-mail-xrelay05.akamai.com (prod-mail-xrelay05.akamai.com [23.79.238.179]) by dpdk.org (Postfix) with ESMTP id 1242BFE5 for ; Fri, 1 Apr 2016 21:31:57 +0200 (CEST) Received: from prod-mail-xrelay05.akamai.com (localhost.localdomain [127.0.0.1]) by postfix.imss70 (Postfix) with ESMTP id 799B33F4066; Fri, 1 Apr 2016 19:31:56 +0000 (GMT) Received: from prod-mail-relay11.akamai.com (prod-mail-relay11.akamai.com [172.27.118.250]) by prod-mail-xrelay05.akamai.com (Postfix) with ESMTP id 623673F4040; Fri, 1 Apr 2016 19:31:56 +0000 (GMT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=akamai.com; s=a1; t=1459539116; bh=MAfHTxReG9b203AH65xgv9STnIqTa2G25iajoEICw9Q=; l=5425; h=From:To:CC:Date:References:In-Reply-To:From; b=eno9OJTD7ldHqn7kM+ljwUVFVLYNjiLAtiJp+x9QxCbSG7G5c0vfhvGNDD/Dh34Xj VjBgxzxp5wkY5HMpxmXlYvUZUbXuA1sC+T0EVRzfnzQvZOf99GZsjGde0osTKB/rqw V5SCN/peBHH6zhG+35RNsBDjDFGKS46KHgoxxzq4= Received: from email.msg.corp.akamai.com (ustx2ex-cas1.msg.corp.akamai.com [172.27.25.30]) by prod-mail-relay11.akamai.com (Postfix) with ESMTP id 3BFE81FC98; Fri, 1 Apr 2016 19:31:56 +0000 (GMT) Received: from ustx2ex-dag1mb6.msg.corp.akamai.com (172.27.27.107) by ustx2ex-dag1mb6.msg.corp.akamai.com (172.27.27.107) with Microsoft SMTP Server (TLS) id 15.0.1130.7; Fri, 1 Apr 2016 12:31:55 -0700 Received: from ustx2ex-dag1mb6.msg.corp.akamai.com ([172.27.27.107]) by ustx2ex-dag1mb6.msg.corp.akamai.com ([172.27.27.107]) with mapi id 15.00.1130.005; Fri, 1 Apr 2016 12:31:55 -0700 From: "Sanford, Robert" To: "Dumitrescu, Cristian" , "dev@dpdk.org" CC: "Venkatesan, Venky" , "Wiles, Keith" , "Liang, Cunming" , "Zhang, Helin" , "Richardson, Bruce" , "Ananyev, Konstantin" , Olivier MATZ , "adrien.mazarguil@6wind.com" , Stephen Hemminger , "Jerin Jacob (Cavium) (jerin.jacob@caviumnetworks.com)" Thread-Topic: [dpdk-dev] [PATCH 3/4] port: fix full burst checks in f_tx_bulk ops Thread-Index: AQHRjE0mJNHxDh7drkqmTH1A8kkg2A== Date: Fri, 1 Apr 2016 19:31:55 +0000 Message-ID: References: <1459198297-49854-1-git-send-email-rsanford@akamai.com> <1459198297-49854-4-git-send-email-rsanford@akamai.com> <3EB4FA525960D640B5BDFFD6A3D8912647975059@IRSMSX108.ger.corp.intel.com> In-Reply-To: <3EB4FA525960D640B5BDFFD6A3D8912647975059@IRSMSX108.ger.corp.intel.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: user-agent: Microsoft-MacOutlook/14.4.3.140616 x-ms-exchange-messagesentrepresentingtype: 1 x-ms-exchange-transport-fromentityheader: Hosted x-originating-ip: [172.19.132.73] Content-Type: text/plain; charset="us-ascii" Content-ID: Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH 3/4] port: fix full burst checks in f_tx_bulk ops X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Apr 2016 19:31:57 -0000 Hi Cristian, In hindsight, I was overly agressive in proposing the same change (approach #2, as you call it below) for rte_port_ring and rte_port_sched. Changing local variable bsz_mask to uint64_t should be sufficient. Please see additional comments inline below. On 3/31/16 11:41 AM, "Dumitrescu, Cristian" wrote: >> -----Original Message----- >> From: Robert Sanford [mailto:rsanford2@gmail.com] >> Sent: Monday, March 28, 2016 9:52 PM >> To: dev@dpdk.org; Dumitrescu, Cristian >> Subject: [PATCH 3/4] port: fix full burst checks in f_tx_bulk ops >>=20 >> For several f_tx_bulk functions in rte_port_{ethdev,ring,sched}.c, >> it appears that the intent of the bsz_mask logic is to test whether >> pkts_mask contains a full burst (i.e., the least >> significant bits are set). >>=20 >> There are two problems with the bsz_mask code: 1) It truncates >> by using the wrong size for local variable uint32_t bsz_mask, and > >This is indeed a bug: although port->bsz_mask is always defined as >uint64_t, there are several places where we cache it to a local variable >which is defined as 32-bit by mistake: uint32_t bsz =3D p->bsz_mask. >Thanks, Robert! > >> 2) We may pass oversized bursts to the underlying ethdev/ring/sched, >> e.g., tx_burst_sz=3D16, bsz_mask=3D0x8000, and pkts_mask=3D0x1ffff >> (17 packets), results in expr=3D=3D0, and we send a burst larger than >> desired (and non-power-of-2) to the underlying tx burst interface. >>=20 > >As stated in another related email, this is done by design, with the key >assumption being that larger TX burst sizes will always be beneficial. So >tx_burst_size is, by design, a requirement for the *minimal* value of the >TX burst size rather than the *exact* value for the burst size. >As an example, when the TX burst size of 32 is set, then larger burst >sizes of 33, 34, ..., 40, 41, ..., 48, ..., 64 are welcomed and sent out >as a single burst rather than breaking in into multiple fixed size >32-packet bursts.=20 >For PMDs, burst size (smaller than 64) is typically much lower than the >TX ring size (typical value for IXGBE: 512). Same for rte_ring. > >So what we are debating here is which of the following two approaches is >better: >Approach 1: Consider tx_burst_sz as the minimal burst size, welcome >larger bursts and send them as a single burst (i.e. do not break them >into fixed tx_burst_sz bursts). This is the existing approach used >consistently everywhere in librte_port. >Approach 2: Consider tx_burst_sz as an exact burst size requirement, any >larger incoming burst is broken into fixed size bursts of exactly >tx_burst_sz packets before send. This is the approach suggested by Robert. > >I think we should go for the approach that gives the best performance. >Personally, I think Approach 1 (existing) is doing this, but I would like >to get more fact-based opinions from the people on this mail list (CC-ing >a few key folks), especially PMD and ring maintainers. What is your >experience, guys? I only advocate approach #2 (for rte_port_ethdev) if we can't be certain of the semantics of the underlying rte_eth_tx_burst(). e.g., if we attempt to send 33 packets, and it never enqueues more than 32. Off topic, general points about PMD APIs: * From the viewpoint of an API user, having rte_eth_{rx,tx}_burst() unexpectedly truncate a burst request is an unwelcome surprise, and should at least be well-documented. * We previously encountered this problem on the RX-side of IXGBE: We asked rte_eth_rx_burst() for 64 packets, but it never returned more than 32, even when there were hundreds "done". This was a little unexpected. * From a quick look at the latest code, it appears that ixgbe_xmit_pkts_vec() and ixgbe_recv_pkts_vec() still do this. Yes, these are the vector versions, but do we need to study the drivers of every device that we might use, when deciding things such as burst sizes? :) * One simple enhancement idea: ethdev info-get API could convey related info, e.g., whether rx/tx APIs truncate bursts, if so, how big, etc. > >> We propose to effectively set bsz_mask =3D (1 << tx_burst_sz) - 1 >> (while avoiding truncation for tx_burst_sz=3D64), to cache the mask >> value of a full burst, and then do a simple compare with pkts_mask >> in each f_tx_bulk. >>=20 >> Signed-off-by: Robert Sanford >> --- >> lib/librte_port/rte_port_ethdev.c | 15 ++++----------- >> lib/librte_port/rte_port_ring.c | 16 ++++------------ >> lib/librte_port/rte_port_sched.c | 7 ++----- >> 3 files changed, 10 insertions(+), 28 deletions(-) >>=20 >> diff --git a/lib/librte_port/rte_port_ethdev.c >> b/lib/librte_port/rte_port_ethdev.c >> index 1c34602..3fb4947 100644 >> --- a/lib/librte_port/rte_port_ethdev.c >> +++ b/lib/librte_port/rte_port_ethdev.c >> @@ -188,7 +188,7 @@ rte_port_ethdev_writer_create(void *params, int >> socket_id) >> port->queue_id =3D conf->queue_id; >> port->tx_burst_sz =3D conf->tx_burst_sz; >> port->tx_buf_count =3D 0; >> - port->bsz_mask =3D 1LLU << (conf->tx_burst_sz - 1); >> + port->bsz_mask =3D UINT64_MAX >> (64 - conf->tx_burst_sz); > >Another way to write this is: port->bsz_mask =3D >RTE_LEN2MASK(conf->tx_burst_sz, uint64_t); True, didn't know about that macro. -- Thanks, Robert