From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by dpdk.org (Postfix) with ESMTP id A15858E82 for ; Thu, 31 Mar 2016 17:41:15 +0200 (CEST) Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga102.fm.intel.com with ESMTP; 31 Mar 2016 08:41:16 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.24,422,1455004800"; d="scan'208";a="775509185" Received: from irsmsx102.ger.corp.intel.com ([163.33.3.155]) by orsmga003.jf.intel.com with ESMTP; 31 Mar 2016 08:41:13 -0700 Received: from irsmsx111.ger.corp.intel.com (10.108.20.4) by IRSMSX102.ger.corp.intel.com (163.33.3.155) with Microsoft SMTP Server (TLS) id 14.3.248.2; Thu, 31 Mar 2016 16:41:11 +0100 Received: from irsmsx108.ger.corp.intel.com ([169.254.11.13]) by irsmsx111.ger.corp.intel.com ([169.254.2.127]) with mapi id 14.03.0248.002; Thu, 31 Mar 2016 16:41:11 +0100 From: "Dumitrescu, Cristian" To: Robert Sanford , "dev@dpdk.org" CC: "Venkatesan, Venky" , "Wiles, Keith" , "Liang, Cunming" , "Zhang, Helin" , "Richardson, Bruce" , "Ananyev, Konstantin" , Olivier MATZ , "adrien.mazarguil@6wind.com" , Stephen Hemminger , "Jerin Jacob (Cavium) (jerin.jacob@caviumnetworks.com)" Thread-Topic: [PATCH 3/4] port: fix full burst checks in f_tx_bulk ops Thread-Index: AQHRiTO7oYvgDkfFc0uP+ktUhlHi059zqk+A Date: Thu, 31 Mar 2016 15:41:10 +0000 Message-ID: <3EB4FA525960D640B5BDFFD6A3D8912647975059@IRSMSX108.ger.corp.intel.com> References: <1459198297-49854-1-git-send-email-rsanford@akamai.com> <1459198297-49854-4-git-send-email-rsanford@akamai.com> In-Reply-To: <1459198297-49854-4-git-send-email-rsanford@akamai.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiNTc3YTRhNzMtM2EyNy00ODBhLThhMmMtOWJiYzk2N2Y3NTMxIiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX0lDIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE1LjkuNi42IiwiVHJ1c3RlZExhYmVsSGFzaCI6ImoxNDVMUlVhRVRVN0YraVlHc2pzQ1hTaVB4a2c2T3VUQ2xOaURLS2lhZzQ9In0= x-ctpclassification: CTP_IC x-originating-ip: [163.33.239.182] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH 3/4] port: fix full burst checks in f_tx_bulk ops X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2016 15:41:16 -0000 > -----Original Message----- > From: Robert Sanford [mailto:rsanford2@gmail.com] > Sent: Monday, March 28, 2016 9:52 PM > To: dev@dpdk.org; Dumitrescu, Cristian > Subject: [PATCH 3/4] port: fix full burst checks in f_tx_bulk ops >=20 > For several f_tx_bulk functions in rte_port_{ethdev,ring,sched}.c, > it appears that the intent of the bsz_mask logic is to test whether > pkts_mask contains a full burst (i.e., the least > significant bits are set). >=20 > There are two problems with the bsz_mask code: 1) It truncates > by using the wrong size for local variable uint32_t bsz_mask, and This is indeed a bug: although port->bsz_mask is always defined as uint64_t= , there are several places where we cache it to a local variable which is d= efined as 32-bit by mistake: uint32_t bsz =3D p->bsz_mask. Thanks, Robert! > 2) We may pass oversized bursts to the underlying ethdev/ring/sched, > e.g., tx_burst_sz=3D16, bsz_mask=3D0x8000, and pkts_mask=3D0x1ffff > (17 packets), results in expr=3D=3D0, and we send a burst larger than > desired (and non-power-of-2) to the underlying tx burst interface. >=20 As stated in another related email, this is done by design, with the key as= sumption being that larger TX burst sizes will always be beneficial. So tx_= burst_size is, by design, a requirement for the *minimal* value of the TX b= urst size rather than the *exact* value for the burst size. As an example, when the TX burst size of 32 is set, then larger burst sizes= of 33, 34, ..., 40, 41, ..., 48, ..., 64 are welcomed and sent out as a si= ngle burst rather than breaking in into multiple fixed size 32-packet burst= s.=20 For PMDs, burst size (smaller than 64) is typically much lower than the TX = ring size (typical value for IXGBE: 512). Same for rte_ring. So what we are debating here is which of the following two approaches is be= tter: Approach 1: Consider tx_burst_sz as the minimal burst size, welcome larger = bursts and send them as a single burst (i.e. do not break them into fixed t= x_burst_sz bursts). This is the existing approach used consistently everywh= ere in librte_port. Approach 2: Consider tx_burst_sz as an exact burst size requirement, any la= rger incoming burst is broken into fixed size bursts of exactly tx_burst_sz= packets before send. This is the approach suggested by Robert. I think we should go for the approach that gives the best performance. Pers= onally, I think Approach 1 (existing) is doing this, but I would like to ge= t more fact-based opinions from the people on this mail list (CC-ing a few = key folks), especially PMD and ring maintainers. What is your experience, g= uys? > We propose to effectively set bsz_mask =3D (1 << tx_burst_sz) - 1 > (while avoiding truncation for tx_burst_sz=3D64), to cache the mask > value of a full burst, and then do a simple compare with pkts_mask > in each f_tx_bulk. >=20 > Signed-off-by: Robert Sanford > --- > lib/librte_port/rte_port_ethdev.c | 15 ++++----------- > lib/librte_port/rte_port_ring.c | 16 ++++------------ > lib/librte_port/rte_port_sched.c | 7 ++----- > 3 files changed, 10 insertions(+), 28 deletions(-) >=20 > diff --git a/lib/librte_port/rte_port_ethdev.c > b/lib/librte_port/rte_port_ethdev.c > index 1c34602..3fb4947 100644 > --- a/lib/librte_port/rte_port_ethdev.c > +++ b/lib/librte_port/rte_port_ethdev.c > @@ -188,7 +188,7 @@ rte_port_ethdev_writer_create(void *params, int > socket_id) > port->queue_id =3D conf->queue_id; > port->tx_burst_sz =3D conf->tx_burst_sz; > port->tx_buf_count =3D 0; > - port->bsz_mask =3D 1LLU << (conf->tx_burst_sz - 1); > + port->bsz_mask =3D UINT64_MAX >> (64 - conf->tx_burst_sz); Another way to write this is: port->bsz_mask =3D RTE_LEN2MASK(conf->tx_burs= t_sz, uint64_t); >=20 > return port; > } > @@ -229,12 +229,9 @@ rte_port_ethdev_writer_tx_bulk(void *port, > { > struct rte_port_ethdev_writer *p =3D > (struct rte_port_ethdev_writer *) port; > - uint32_t bsz_mask =3D p->bsz_mask; > uint32_t tx_buf_count =3D p->tx_buf_count; > - uint64_t expr =3D (pkts_mask & (pkts_mask + 1)) | > - ((pkts_mask & bsz_mask) ^ bsz_mask); >=20 > - if (expr =3D=3D 0) { > + if (pkts_mask =3D=3D p->bsz_mask) { > uint64_t n_pkts =3D __builtin_popcountll(pkts_mask); > uint32_t n_pkts_ok; >=20 > @@ -369,7 +366,7 @@ rte_port_ethdev_writer_nodrop_create(void > *params, int socket_id) > port->queue_id =3D conf->queue_id; > port->tx_burst_sz =3D conf->tx_burst_sz; > port->tx_buf_count =3D 0; > - port->bsz_mask =3D 1LLU << (conf->tx_burst_sz - 1); > + port->bsz_mask =3D UINT64_MAX >> (64 - conf->tx_burst_sz); >=20 > /* > * When n_retries is 0 it means that we should wait for every packet > to > @@ -435,13 +432,9 @@ rte_port_ethdev_writer_nodrop_tx_bulk(void > *port, > { > struct rte_port_ethdev_writer_nodrop *p =3D > (struct rte_port_ethdev_writer_nodrop *) port; > - > - uint32_t bsz_mask =3D p->bsz_mask; > uint32_t tx_buf_count =3D p->tx_buf_count; > - uint64_t expr =3D (pkts_mask & (pkts_mask + 1)) | > - ((pkts_mask & bsz_mask) ^ bsz_mask); >=20 > - if (expr =3D=3D 0) { > + if (pkts_mask =3D=3D p->bsz_mask) { > uint64_t n_pkts =3D __builtin_popcountll(pkts_mask); > uint32_t n_pkts_ok; >=20 > diff --git a/lib/librte_port/rte_port_ring.c b/lib/librte_port/rte_port_r= ing.c > index 765ecc5..b36e4ce 100644 > --- a/lib/librte_port/rte_port_ring.c > +++ b/lib/librte_port/rte_port_ring.c > @@ -217,7 +217,7 @@ rte_port_ring_writer_create_internal(void *params, > int socket_id, > port->ring =3D conf->ring; > port->tx_burst_sz =3D conf->tx_burst_sz; > port->tx_buf_count =3D 0; > - port->bsz_mask =3D 1LLU << (conf->tx_burst_sz - 1); > + port->bsz_mask =3D UINT64_MAX >> (64 - conf->tx_burst_sz); > port->is_multi =3D is_multi; >=20 > return port; > @@ -299,13 +299,9 @@ rte_port_ring_writer_tx_bulk_internal(void *port, > { > struct rte_port_ring_writer *p =3D > (struct rte_port_ring_writer *) port; > - > - uint32_t bsz_mask =3D p->bsz_mask; > uint32_t tx_buf_count =3D p->tx_buf_count; > - uint64_t expr =3D (pkts_mask & (pkts_mask + 1)) | > - ((pkts_mask & bsz_mask) ^ bsz_mask); >=20 > - if (expr =3D=3D 0) { > + if (pkts_mask =3D=3D p->bsz_mask) { > uint64_t n_pkts =3D __builtin_popcountll(pkts_mask); > uint32_t n_pkts_ok; >=20 > @@ -486,7 +482,7 @@ rte_port_ring_writer_nodrop_create_internal(void > *params, int socket_id, > port->ring =3D conf->ring; > port->tx_burst_sz =3D conf->tx_burst_sz; > port->tx_buf_count =3D 0; > - port->bsz_mask =3D 1LLU << (conf->tx_burst_sz - 1); > + port->bsz_mask =3D UINT64_MAX >> (64 - conf->tx_burst_sz); > port->is_multi =3D is_multi; >=20 > /* > @@ -613,13 +609,9 @@ rte_port_ring_writer_nodrop_tx_bulk_internal(void > *port, > { > struct rte_port_ring_writer_nodrop *p =3D > (struct rte_port_ring_writer_nodrop *) port; > - > - uint32_t bsz_mask =3D p->bsz_mask; > uint32_t tx_buf_count =3D p->tx_buf_count; > - uint64_t expr =3D (pkts_mask & (pkts_mask + 1)) | > - ((pkts_mask & bsz_mask) ^ bsz_mask); >=20 > - if (expr =3D=3D 0) { > + if (pkts_mask =3D=3D p->bsz_mask) { > uint64_t n_pkts =3D __builtin_popcountll(pkts_mask); > uint32_t n_pkts_ok; >=20 > diff --git a/lib/librte_port/rte_port_sched.c > b/lib/librte_port/rte_port_sched.c > index c5ff8ab..5b6afc4 100644 > --- a/lib/librte_port/rte_port_sched.c > +++ b/lib/librte_port/rte_port_sched.c > @@ -185,7 +185,7 @@ rte_port_sched_writer_create(void *params, int > socket_id) > port->sched =3D conf->sched; > port->tx_burst_sz =3D conf->tx_burst_sz; > port->tx_buf_count =3D 0; > - port->bsz_mask =3D 1LLU << (conf->tx_burst_sz - 1); > + port->bsz_mask =3D UINT64_MAX >> (64 - conf->tx_burst_sz); >=20 > return port; > } > @@ -214,12 +214,9 @@ rte_port_sched_writer_tx_bulk(void *port, > uint64_t pkts_mask) > { > struct rte_port_sched_writer *p =3D (struct rte_port_sched_writer *) > port; > - uint32_t bsz_mask =3D p->bsz_mask; > uint32_t tx_buf_count =3D p->tx_buf_count; > - uint64_t expr =3D (pkts_mask & (pkts_mask + 1)) | > - ((pkts_mask & bsz_mask) ^ bsz_mask); >=20 > - if (expr =3D=3D 0) { > + if (pkts_mask =3D=3D p->bsz_mask) { > __rte_unused uint32_t nb_tx; > uint64_t n_pkts =3D __builtin_popcountll(pkts_mask); >=20 > -- > 1.7.1