From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id C6F89A2EFC
	for <public@inbox.dpdk.org>; Tue, 15 Oct 2019 17:47:51 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id 627401ED15;
	Tue, 15 Oct 2019 17:47:51 +0200 (CEST)
Received: from mga17.intel.com (mga17.intel.com [192.55.52.151])
 by dpdk.org (Postfix) with ESMTP id 62F3D1ED15
 for <dev@dpdk.org>; Tue, 15 Oct 2019 17:47:14 +0200 (CEST)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga004.fm.intel.com ([10.253.24.48])
 by fmsmga107.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 15 Oct 2019 08:47:13 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.67,300,1566889200"; d="scan'208";a="220463692"
Received: from irsmsx106.ger.corp.intel.com ([163.33.3.31])
 by fmsmga004.fm.intel.com with ESMTP; 15 Oct 2019 08:47:12 -0700
Received: from irsmsx108.ger.corp.intel.com ([169.254.11.131]) by
 IRSMSX106.ger.corp.intel.com ([169.254.8.185]) with mapi id 14.03.0439.000;
 Tue, 15 Oct 2019 16:47:12 +0100
From: "Dumitrescu, Cristian" <cristian.dumitrescu@intel.com>
To: "Singh, Jasvinder" <jasvinder.singh@intel.com>, "dev@dpdk.org"
 <dev@dpdk.org>
CC: "Krakowiak, LukaszX" <lukaszx.krakowiak@intel.com>
Thread-Topic: [PATCH 2/2] sched: modify internal structs and functions for
 64 bit values
Thread-Index: AQHVgqv0LuEBwb/z1ki3AqI9NCf/rqdb1N3A
Date: Tue, 15 Oct 2019 15:47:11 +0000
Message-ID: <3EB4FA525960D640B5BDFFD6A3D891268E946936@IRSMSX108.ger.corp.intel.com>
References: <20191014172455.139224-1-jasvinder.singh@intel.com>
 <20191014172455.139224-2-jasvinder.singh@intel.com>
In-Reply-To: <20191014172455.139224-2-jasvinder.singh@intel.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiNWExZTc5MmYtMjIwZS00YzJkLTlhNTEtOTFkZTFkZDM3MGYxIiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX05UIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE3LjEwLjE4MDQuNDkiLCJUcnVzdGVkTGFiZWxIYXNoIjoicUFEbVpxaEI3UklGcjl6alJlb3BYendIeXJqSitEXC9yUE1WdzQ5bTM1SVpqYndWc1JNbFN5cmp0WHlPaG5oZXQifQ==
x-ctpclassification: CTP_NT
dlp-product: dlpe-windows
dlp-version: 11.2.0.6
dlp-reaction: no-action
x-originating-ip: [163.33.239.182]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-dev] [PATCH 2/2] sched: modify internal structs and
 functions for 64 bit values
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

Hi Jasvinder,

> -----Original Message-----
> From: Singh, Jasvinder
> Sent: Monday, October 14, 2019 6:25 PM
> To: dev@dpdk.org
> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Krakowiak,
> LukaszX <lukaszx.krakowiak@intel.com>
> Subject: [PATCH 2/2] sched: modify internal structs and functions for 64 =
bit
> values
>=20
> Modify internal structure and functions to support 64-bit
> values for rates and stats parameters.
>=20
> Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
> ---
>  lib/librte_sched/rte_approx.c       |  57 ++++----
>  lib/librte_sched/rte_approx.h       |   3 +-
>  lib/librte_sched/rte_sched.c        | 211 +++++++++++++++-------------
>  lib/librte_sched/rte_sched_common.h |  12 +-
>  4 files changed, 156 insertions(+), 127 deletions(-)
>=20
> diff --git a/lib/librte_sched/rte_approx.c b/lib/librte_sched/rte_approx.=
c
> index 30620b83d..4883d3969 100644
> --- a/lib/librte_sched/rte_approx.c
> +++ b/lib/librte_sched/rte_approx.c
> @@ -18,22 +18,23 @@
>   */
>=20
>  /* fraction comparison: compare (a/b) and (c/d) */
> -static inline uint32_t
> -less(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
> +static inline sched_counter_t
> +less(sched_counter_t a, sched_counter_t b, sched_counter_t c,
> sched_counter_t d)
>  {
>  	return a*d < b*c;
>  }
>=20
> -static inline uint32_t
> -less_or_equal(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
> +static inline sched_counter_t
> +less_or_equal(sched_counter_t a, sched_counter_t b, sched_counter_t c,
> +	sched_counter_t d)
>  {
>  	return a*d <=3D b*c;
>  }
>=20
>  /* check whether a/b is a valid approximation */
> -static inline uint32_t
> -matches(uint32_t a, uint32_t b,
> -	uint32_t alpha_num, uint32_t d_num, uint32_t denum)
> +static inline sched_counter_t
> +matches(sched_counter_t a, sched_counter_t b,
> +	sched_counter_t alpha_num, sched_counter_t d_num,
> sched_counter_t denum)
>  {
>  	if (less_or_equal(a, b, alpha_num - d_num, denum))
>  		return 0;
> @@ -45,33 +46,39 @@ matches(uint32_t a, uint32_t b,
>  }
>=20
>  static inline void
> -find_exact_solution_left(uint32_t p_a, uint32_t q_a, uint32_t p_b, uint3=
2_t
> q_b,
> -	uint32_t alpha_num, uint32_t d_num, uint32_t denum, uint32_t *p,
> uint32_t *q)
> +find_exact_solution_left(sched_counter_t p_a, sched_counter_t q_a,
> +	sched_counter_t p_b, sched_counter_t q_b, sched_counter_t
> alpha_num,
> +	sched_counter_t d_num, sched_counter_t denum,
> sched_counter_t *p,
> +	sched_counter_t *q)
>  {
> -	uint32_t k_num =3D denum * p_b - (alpha_num + d_num) * q_b;
> -	uint32_t k_denum =3D (alpha_num + d_num) * q_a - denum * p_a;
> -	uint32_t k =3D (k_num / k_denum) + 1;
> +	sched_counter_t k_num =3D denum * p_b - (alpha_num + d_num) *
> q_b;
> +	sched_counter_t k_denum =3D (alpha_num + d_num) * q_a - denum *
> p_a;
> +	sched_counter_t k =3D (k_num / k_denum) + 1;
>=20
>  	*p =3D p_b + k * p_a;
>  	*q =3D q_b + k * q_a;
>  }
>=20
>  static inline void
> -find_exact_solution_right(uint32_t p_a, uint32_t q_a, uint32_t p_b,
> uint32_t q_b,
> -	uint32_t alpha_num, uint32_t d_num, uint32_t denum, uint32_t *p,
> uint32_t *q)
> +find_exact_solution_right(sched_counter_t p_a, sched_counter_t q_a,
> +	sched_counter_t p_b, sched_counter_t q_b, sched_counter_t
> alpha_num,
> +	sched_counter_t d_num, sched_counter_t denum,
> sched_counter_t *p,
> +	sched_counter_t *q)
>  {
> -	uint32_t k_num =3D - denum * p_b + (alpha_num - d_num) * q_b;
> -	uint32_t k_denum =3D - (alpha_num - d_num) * q_a + denum * p_a;
> -	uint32_t k =3D (k_num / k_denum) + 1;
> +	sched_counter_t k_num =3D -denum * p_b + (alpha_num - d_num) *
> q_b;
> +	sched_counter_t k_denum =3D -(alpha_num - d_num) * q_a + denum
> * p_a;
> +	sched_counter_t k =3D (k_num / k_denum) + 1;
>=20
>  	*p =3D p_b + k * p_a;
>  	*q =3D q_b + k * q_a;
>  }
>=20
>  static int
> -find_best_rational_approximation(uint32_t alpha_num, uint32_t d_num,
> uint32_t denum, uint32_t *p, uint32_t *q)
> +find_best_rational_approximation(sched_counter_t alpha_num,
> +	sched_counter_t d_num, sched_counter_t denum,
> sched_counter_t *p,
> +	sched_counter_t *q)
>  {
> -	uint32_t p_a, q_a, p_b, q_b;
> +	sched_counter_t p_a, q_a, p_b, q_b;
>=20
>  	/* check assumptions on the inputs */
>  	if (!((0 < d_num) && (d_num < alpha_num) && (alpha_num <
> denum) && (d_num + alpha_num < denum))) {
> @@ -85,8 +92,8 @@ find_best_rational_approximation(uint32_t alpha_num,
> uint32_t d_num, uint32_t de
>  	q_b =3D 1;
>=20
>  	while (1) {
> -		uint32_t new_p_a, new_q_a, new_p_b, new_q_b;
> -		uint32_t x_num, x_denum, x;
> +		sched_counter_t new_p_a, new_q_a, new_p_b, new_q_b;
> +		sched_counter_t x_num, x_denum, x;
>  		int aa, bb;
>=20
>  		/* compute the number of steps to the left */
> @@ -139,9 +146,9 @@ find_best_rational_approximation(uint32_t
> alpha_num, uint32_t d_num, uint32_t de
>  	}
>  }
>=20
> -int rte_approx(double alpha, double d, uint32_t *p, uint32_t *q)
> +int rte_approx(double alpha, double d, sched_counter_t *p,
> sched_counter_t *q)
>  {
> -	uint32_t alpha_num, d_num, denum;
> +	sched_counter_t alpha_num, d_num, denum;
>=20
>  	/* Check input arguments */
>  	if (!((0.0 < d) && (d < alpha) && (alpha < 1.0))) {
> @@ -159,8 +166,8 @@ int rte_approx(double alpha, double d, uint32_t *p,
> uint32_t *q)
>  		d *=3D 10;
>  		denum *=3D 10;
>  	}
> -	alpha_num =3D (uint32_t) alpha;
> -	d_num =3D (uint32_t) d;
> +	alpha_num =3D (sched_counter_t) alpha;
> +	d_num =3D (sched_counter_t) d;
>=20
>  	/* Perform approximation */
>  	return find_best_rational_approximation(alpha_num, d_num,
> denum, p, q);
> diff --git a/lib/librte_sched/rte_approx.h b/lib/librte_sched/rte_approx.=
h
> index 0244d98f1..e591e122d 100644
> --- a/lib/librte_sched/rte_approx.h
> +++ b/lib/librte_sched/rte_approx.h
> @@ -20,6 +20,7 @@ extern "C" {
>   ***/
>=20
>  #include <stdint.h>
> +#include "rte_sched_common.h"
>=20
>  /**
>   * Find best rational approximation
> @@ -37,7 +38,7 @@ extern "C" {
>   * @return
>   *   0 upon success, error code otherwise
>   */
> -int rte_approx(double alpha, double d, uint32_t *p, uint32_t *q);
> +int rte_approx(double alpha, double d, sched_counter_t *p,
> sched_counter_t *q);
>=20
>  #ifdef __cplusplus
>  }

Please keep the rte_approx.[hc] independent of the librte_sched library, so=
 use unit32_t or uint64_t instead of sched_counter_t that is librte_sched d=
ependent. Also, for the same reason, remove the above inclusion of rte_sche=
d_common.h.

Please keep the existing 32-bit functions with their current name & prototy=
pe and create new 64-bit functions that have the "64" suffix to their name,=
 and use the 64-bit versions in the rte_sched.c implementation. Makes sense=
?

The rte_approx.[hc] files represent the implementation of an arithmetic alg=
orithm that is completely independent of the scheduler library. In fact, th=
ey could be moved to a more generic location in DPDK where they could be le=
veraged by other libraries without the need to create a (fake) dependency t=
o librte_sched.

> diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
> index 710ecf65a..11d1febe2 100644
> --- a/lib/librte_sched/rte_sched.c
> +++ b/lib/librte_sched/rte_sched.c
> @@ -49,13 +49,13 @@
>=20
>  struct rte_sched_pipe_profile {
>  	/* Token bucket (TB) */
> -	uint32_t tb_period;
> -	uint32_t tb_credits_per_period;
> -	uint32_t tb_size;
> +	sched_counter_t tb_period;
> +	sched_counter_t tb_credits_per_period;
> +	sched_counter_t tb_size;
>=20
>  	/* Pipe traffic classes */
> -	uint32_t tc_period;
> -	uint32_t
> tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> +	sched_counter_t tc_period;
> +	sched_counter_t
> tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
>  	uint8_t tc_ov_weight;
>=20
>  	/* Pipe best-effort traffic class queues */
> @@ -65,20 +65,20 @@ struct rte_sched_pipe_profile {
>  struct rte_sched_pipe {
>  	/* Token bucket (TB) */
>  	uint64_t tb_time; /* time of last update */
> -	uint32_t tb_credits;
> +	sched_counter_t tb_credits;
>=20
>  	/* Pipe profile and flags */
>  	uint32_t profile;
>=20
>  	/* Traffic classes (TCs) */
>  	uint64_t tc_time; /* time of next update */
> -	uint32_t tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> +	sched_counter_t
> tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
>=20
>  	/* Weighted Round Robin (WRR) */
>  	uint8_t wrr_tokens[RTE_SCHED_BE_QUEUES_PER_PIPE];
>=20
>  	/* TC oversubscription */
> -	uint32_t tc_ov_credits;
> +	sched_counter_t tc_ov_credits;
>  	uint8_t tc_ov_period_id;
>  } __rte_cache_aligned;
>=20
> @@ -141,28 +141,28 @@ struct rte_sched_grinder {
>  struct rte_sched_subport {
>  	/* Token bucket (TB) */
>  	uint64_t tb_time; /* time of last update */
> -	uint32_t tb_period;
> -	uint32_t tb_credits_per_period;
> -	uint32_t tb_size;
> -	uint32_t tb_credits;
> +	sched_counter_t tb_period;
> +	sched_counter_t tb_credits_per_period;
> +	sched_counter_t tb_size;
> +	sched_counter_t tb_credits;
>=20
>  	/* Traffic classes (TCs) */
>  	uint64_t tc_time; /* time of next update */
> -	uint32_t
> tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> -	uint32_t tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> -	uint32_t tc_period;
> +	sched_counter_t
> tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> +	sched_counter_t
> tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> +	sched_counter_t tc_period;
>=20
>  	/* TC oversubscription */
> -	uint32_t tc_ov_wm;
> -	uint32_t tc_ov_wm_min;
> -	uint32_t tc_ov_wm_max;
> +	sched_counter_t tc_ov_wm;
> +	sched_counter_t tc_ov_wm_min;
> +	sched_counter_t tc_ov_wm_max;
>  	uint8_t tc_ov_period_id;
>  	uint8_t tc_ov;
>  	uint32_t tc_ov_n;
>  	double tc_ov_rate;
>=20
>  	/* Statistics */
> -	struct rte_sched_subport_stats stats;
> +	struct rte_sched_subport_stats stats __rte_cache_aligned;
>=20
>  	/* Subport pipes */
>  	uint32_t n_pipes_per_subport_enabled;
> @@ -170,7 +170,7 @@ struct rte_sched_subport {
>  	uint32_t n_max_pipe_profiles;
>=20
>  	/* Pipe best-effort TC rate */
> -	uint32_t pipe_tc_be_rate_max;
> +	sched_counter_t pipe_tc_be_rate_max;
>=20
>  	/* Pipe queues size */
>  	uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> @@ -212,7 +212,7 @@ struct rte_sched_port {
>  	uint16_t pipe_queue[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
>  	uint8_t pipe_tc[RTE_SCHED_QUEUES_PER_PIPE];
>  	uint8_t tc_queue[RTE_SCHED_QUEUES_PER_PIPE];
> -	uint32_t rate;
> +	sched_counter_t rate;
>  	uint32_t mtu;
>  	uint32_t frame_overhead;
>  	int socket;
> @@ -517,33 +517,35 @@ rte_sched_port_log_pipe_profile(struct
> rte_sched_subport *subport, uint32_t i)
>  	struct rte_sched_pipe_profile *p =3D subport->pipe_profiles + i;
>=20
>  	RTE_LOG(DEBUG, SCHED, "Low level config for pipe profile %u:\n"
> -		"	Token bucket: period =3D %u, credits per period =3D %u,
> size =3D %u\n"
> -		"	Traffic classes: period =3D %u,\n"
> -		"	credits per period =3D [%u, %u, %u, %u, %u, %u, %u,
> %u, %u, %u, %u, %u, %u]\n"
> +		"	Token bucket: period =3D %"PRIu64", credits per period
> =3D %"PRIu64", size =3D %"PRIu64"\n"
> +		"	Traffic classes: period =3D %"PRIu64",\n"
> +		"	credits per period =3D [%"PRIu64", %"PRIu64",
> %"PRIu64", %"PRIu64
> +		", %"PRIu64", %"PRIu64", %"PRIu64", %"PRIu64", %"PRIu64",
> %"PRIu64
> +		", %"PRIu64", %"PRIu64", %"PRIu64"]\n"
>  		"	Best-effort traffic class oversubscription: weight =3D
> %hhu\n"
>  		"	WRR cost: [%hhu, %hhu, %hhu, %hhu]\n",
>  		i,
>=20
>  		/* Token bucket */
> -		p->tb_period,
> -		p->tb_credits_per_period,
> -		p->tb_size,
> +		(uint64_t)p->tb_period,
> +		(uint64_t)p->tb_credits_per_period,
> +		(uint64_t)p->tb_size,
>=20
>  		/* Traffic classes */
> -		p->tc_period,
> -		p->tc_credits_per_period[0],
> -		p->tc_credits_per_period[1],
> -		p->tc_credits_per_period[2],
> -		p->tc_credits_per_period[3],
> -		p->tc_credits_per_period[4],
> -		p->tc_credits_per_period[5],
> -		p->tc_credits_per_period[6],
> -		p->tc_credits_per_period[7],
> -		p->tc_credits_per_period[8],
> -		p->tc_credits_per_period[9],
> -		p->tc_credits_per_period[10],
> -		p->tc_credits_per_period[11],
> -		p->tc_credits_per_period[12],
> +		(uint64_t)p->tc_period,
> +		(uint64_t)p->tc_credits_per_period[0],
> +		(uint64_t)p->tc_credits_per_period[1],
> +		(uint64_t)p->tc_credits_per_period[2],
> +		(uint64_t)p->tc_credits_per_period[3],
> +		(uint64_t)p->tc_credits_per_period[4],
> +		(uint64_t)p->tc_credits_per_period[5],
> +		(uint64_t)p->tc_credits_per_period[6],
> +		(uint64_t)p->tc_credits_per_period[7],
> +		(uint64_t)p->tc_credits_per_period[8],
> +		(uint64_t)p->tc_credits_per_period[9],
> +		(uint64_t)p->tc_credits_per_period[10],
> +		(uint64_t)p->tc_credits_per_period[11],
> +		(uint64_t)p->tc_credits_per_period[12],
>=20
>  		/* Best-effort traffic class oversubscription */
>  		p->tc_ov_weight,
> @@ -553,7 +555,7 @@ rte_sched_port_log_pipe_profile(struct
> rte_sched_subport *subport, uint32_t i)
>  }
>=20
>  static inline uint64_t
> -rte_sched_time_ms_to_bytes(uint32_t time_ms, uint32_t rate)
> +rte_sched_time_ms_to_bytes(sched_counter_t time_ms,
> sched_counter_t rate)
>  {
>  	uint64_t time =3D time_ms;
>=20
> @@ -566,7 +568,7 @@ static void
>  rte_sched_pipe_profile_convert(struct rte_sched_subport *subport,
>  	struct rte_sched_pipe_params *src,
>  	struct rte_sched_pipe_profile *dst,
> -	uint32_t rate)
> +	sched_counter_t rate)
>  {
>  	uint32_t wrr_cost[RTE_SCHED_BE_QUEUES_PER_PIPE];
>  	uint32_t lcd1, lcd2, lcd;
> @@ -581,8 +583,8 @@ rte_sched_pipe_profile_convert(struct
> rte_sched_subport *subport,
>  				/ (double) rate;
>  		double d =3D RTE_SCHED_TB_RATE_CONFIG_ERR;
>=20
> -		rte_approx(tb_rate, d,
> -			&dst->tb_credits_per_period, &dst->tb_period);
> +		rte_approx(tb_rate, d, &dst->tb_credits_per_period,
> +			&dst->tb_period);
>  	}
>=20
>  	dst->tb_size =3D src->tb_size;
> @@ -594,8 +596,8 @@ rte_sched_pipe_profile_convert(struct
> rte_sched_subport *subport,
>  	for (i =3D 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
>  		if (subport->qsize[i])
>  			dst->tc_credits_per_period[i]
> -				=3D rte_sched_time_ms_to_bytes(src-
> >tc_period,
> -					src->tc_rate[i]);
> +				=3D (sched_counter_t)
> rte_sched_time_ms_to_bytes(
> +					src->tc_period, src->tc_rate[i]);
>=20
>  	dst->tc_ov_weight =3D src->tc_ov_weight;
>=20
> @@ -637,7 +639,8 @@ rte_sched_subport_config_pipe_profile_table(struct
> rte_sched_subport *subport,
>  	subport->pipe_tc_be_rate_max =3D 0;
>  	for (i =3D 0; i < subport->n_pipe_profiles; i++) {
>  		struct rte_sched_pipe_params *src =3D params->pipe_profiles
> + i;
> -		uint32_t pipe_tc_be_rate =3D src-
> >tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE];
> +		sched_counter_t pipe_tc_be_rate =3D
> +			src->tc_rate[RTE_SCHED_TRAFFIC_CLASS_BE];
>=20
>  		if (subport->pipe_tc_be_rate_max < pipe_tc_be_rate)
>  			subport->pipe_tc_be_rate_max =3D pipe_tc_be_rate;
> @@ -647,7 +650,7 @@ rte_sched_subport_config_pipe_profile_table(struct
> rte_sched_subport *subport,
>  static int
>  rte_sched_subport_check_params(struct rte_sched_subport_params
> *params,
>  	uint32_t n_max_pipes_per_subport,
> -	uint32_t rate)
> +	sched_counter_t rate)
>  {
>  	uint32_t i;
>=20
> @@ -684,7 +687,7 @@ rte_sched_subport_check_params(struct
> rte_sched_subport_params *params,
>  	}
>=20
>  	for (i =3D 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
> -		uint32_t tc_rate =3D params->tc_rate[i];
> +		sched_counter_t tc_rate =3D params->tc_rate[i];
>  		uint16_t qsize =3D params->qsize[i];
>=20
>  		if ((qsize =3D=3D 0 && tc_rate !=3D 0) ||
> @@ -910,36 +913,40 @@ rte_sched_port_log_subport_config(struct
> rte_sched_port *port, uint32_t i)
>  	struct rte_sched_subport *s =3D port->subports[i];
>=20
>  	RTE_LOG(DEBUG, SCHED, "Low level config for subport %u:\n"
> -		"	Token bucket: period =3D %u, credits per period =3D %u,
> size =3D %u\n"
> -		"	Traffic classes: period =3D %u\n"
> -		"	credits per period =3D [%u, %u, %u, %u, %u, %u, %u,
> %u, %u, %u, %u, %u, %u]\n"
> -		"	Best effort traffic class oversubscription: wm min =3D
> %u, wm max =3D %u\n",
> +		"	Token bucket: period =3D %"PRIu64", credits per period
> =3D %"PRIu64
> +		", size =3D %"PRIu64"\n"
> +		"	Traffic classes: period =3D %"PRIu64"\n"
> +		"	credits per period =3D [%"PRIu64", %"PRIu64",
> %"PRIu64", %"PRIu64
> +		", %"PRIu64", %"PRIu64", %"PRIu64", %"PRIu64", %"PRIu64",
> %"PRIu64
> +		", %"PRIu64", %"PRIu64", %"PRIu64"]\n"
> +		"	Best effort traffic class oversubscription: wm min =3D
> %"PRIu64
> +		", wm max =3D %"PRIu64"\n",
>  		i,
>=20
>  		/* Token bucket */
> -		s->tb_period,
> -		s->tb_credits_per_period,
> -		s->tb_size,
> +		(uint64_t)s->tb_period,
> +		(uint64_t)s->tb_credits_per_period,
> +		(uint64_t)s->tb_size,
>=20
>  		/* Traffic classes */
> -		s->tc_period,
> -		s->tc_credits_per_period[0],
> -		s->tc_credits_per_period[1],
> -		s->tc_credits_per_period[2],
> -		s->tc_credits_per_period[3],
> -		s->tc_credits_per_period[4],
> -		s->tc_credits_per_period[5],
> -		s->tc_credits_per_period[6],
> -		s->tc_credits_per_period[7],
> -		s->tc_credits_per_period[8],
> -		s->tc_credits_per_period[9],
> -		s->tc_credits_per_period[10],
> -		s->tc_credits_per_period[11],
> -		s->tc_credits_per_period[12],
> +		(uint64_t)s->tc_period,
> +		(uint64_t)s->tc_credits_per_period[0],
> +		(uint64_t)s->tc_credits_per_period[1],
> +		(uint64_t)s->tc_credits_per_period[2],
> +		(uint64_t)s->tc_credits_per_period[3],
> +		(uint64_t)s->tc_credits_per_period[4],
> +		(uint64_t)s->tc_credits_per_period[5],
> +		(uint64_t)s->tc_credits_per_period[6],
> +		(uint64_t)s->tc_credits_per_period[7],
> +		(uint64_t)s->tc_credits_per_period[8],
> +		(uint64_t)s->tc_credits_per_period[9],
> +		(uint64_t)s->tc_credits_per_period[10],
> +		(uint64_t)s->tc_credits_per_period[11],
> +		(uint64_t)s->tc_credits_per_period[12],
>=20
>  		/* Best effort traffic class oversubscription */
> -		s->tc_ov_wm_min,
> -		s->tc_ov_wm_max);
> +		(uint64_t)s->tc_ov_wm_min,
> +		(uint64_t)s->tc_ov_wm_max);
>  }
>=20
>  static void
> @@ -1023,7 +1030,8 @@ rte_sched_subport_config(struct rte_sched_port
> *port,
>  		double tb_rate =3D ((double) params->tb_rate) / ((double)
> port->rate);
>  		double d =3D RTE_SCHED_TB_RATE_CONFIG_ERR;
>=20
> -		rte_approx(tb_rate, d, &s->tb_credits_per_period, &s-
> >tb_period);
> +		rte_approx(tb_rate, d, &s->tb_credits_per_period,
> +			&s->tb_period);
>  	}
>=20
>  	s->tb_size =3D params->tb_size;
> @@ -1035,8 +1043,8 @@ rte_sched_subport_config(struct rte_sched_port
> *port,
>  	for (i =3D 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
>  		if (params->qsize[i])
>  			s->tc_credits_per_period[i]
> -				=3D rte_sched_time_ms_to_bytes(params-
> >tc_period,
> -					params->tc_rate[i]);
> +				=3D (sched_counter_t)
> rte_sched_time_ms_to_bytes(
> +					params->tc_period, params-
> >tc_rate[i]);
>  	}
>  	s->tc_time =3D port->time + s->tc_period;
>  	for (i =3D 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
> @@ -1970,13 +1978,15 @@ grinder_credits_update(struct rte_sched_port
> *port,
>  	/* Subport TB */
>  	n_periods =3D (port->time - subport->tb_time) / subport->tb_period;
>  	subport->tb_credits +=3D n_periods * subport-
> >tb_credits_per_period;
> -	subport->tb_credits =3D rte_sched_min_val_2_u32(subport-
> >tb_credits, subport->tb_size);
> +	subport->tb_credits =3D rte_sched_min_val_2(subport->tb_credits,
> +				subport->tb_size);
>  	subport->tb_time +=3D n_periods * subport->tb_period;
>=20
>  	/* Pipe TB */
>  	n_periods =3D (port->time - pipe->tb_time) / params->tb_period;
>  	pipe->tb_credits +=3D n_periods * params->tb_credits_per_period;
> -	pipe->tb_credits =3D rte_sched_min_val_2_u32(pipe->tb_credits,
> params->tb_size);
> +	pipe->tb_credits =3D rte_sched_min_val_2(pipe->tb_credits,
> +				params->tb_size);

Can we remove all the usages of rte_sched_min_val() (including its definiti=
on in rte_sched_common.h) and replace it with RTE_MIN, please?

>  	pipe->tb_time +=3D n_periods * params->tb_period;
>=20
>  	/* Subport TCs */
> @@ -1998,13 +2008,13 @@ grinder_credits_update(struct rte_sched_port
> *port,
>=20
>  #else
>=20
> -static inline uint32_t
> +static inline sched_counter_t
>  grinder_tc_ov_credits_update(struct rte_sched_port *port,
>  	struct rte_sched_subport *subport)
>  {
> -	uint32_t
> tc_ov_consumption[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> -	uint32_t tc_consumption =3D 0, tc_ov_consumption_max;
> -	uint32_t tc_ov_wm =3D subport->tc_ov_wm;
> +	sched_counter_t
> tc_ov_consumption[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> +	sched_counter_t tc_consumption =3D 0, tc_ov_consumption_max;
> +	sched_counter_t tc_ov_wm =3D subport->tc_ov_wm;
>  	uint32_t i;
>=20
>  	if (subport->tc_ov =3D=3D 0)
> @@ -2053,13 +2063,15 @@ grinder_credits_update(struct rte_sched_port
> *port,
>  	/* Subport TB */
>  	n_periods =3D (port->time - subport->tb_time) / subport->tb_period;
>  	subport->tb_credits +=3D n_periods * subport-
> >tb_credits_per_period;
> -	subport->tb_credits =3D rte_sched_min_val_2_u32(subport-
> >tb_credits, subport->tb_size);
> +	subport->tb_credits =3D rte_sched_min_val_2(subport->tb_credits,
> +				subport->tb_size);
>  	subport->tb_time +=3D n_periods * subport->tb_period;
>=20
>  	/* Pipe TB */
>  	n_periods =3D (port->time - pipe->tb_time) / params->tb_period;
>  	pipe->tb_credits +=3D n_periods * params->tb_credits_per_period;
> -	pipe->tb_credits =3D rte_sched_min_val_2_u32(pipe->tb_credits,
> params->tb_size);
> +	pipe->tb_credits =3D rte_sched_min_val_2(pipe->tb_credits,
> +				params->tb_size);
>  	pipe->tb_time +=3D n_periods * params->tb_period;
>=20
>  	/* Subport TCs */
> @@ -2101,11 +2113,11 @@ grinder_credits_check(struct rte_sched_port
> *port,
>  	struct rte_sched_pipe *pipe =3D grinder->pipe;
>  	struct rte_mbuf *pkt =3D grinder->pkt;
>  	uint32_t tc_index =3D grinder->tc_index;
> -	uint32_t pkt_len =3D pkt->pkt_len + port->frame_overhead;
> -	uint32_t subport_tb_credits =3D subport->tb_credits;
> -	uint32_t subport_tc_credits =3D subport->tc_credits[tc_index];
> -	uint32_t pipe_tb_credits =3D pipe->tb_credits;
> -	uint32_t pipe_tc_credits =3D pipe->tc_credits[tc_index];
> +	sched_counter_t pkt_len =3D pkt->pkt_len + port->frame_overhead;
> +	sched_counter_t subport_tb_credits =3D subport->tb_credits;
> +	sched_counter_t subport_tc_credits =3D subport-
> >tc_credits[tc_index];
> +	sched_counter_t pipe_tb_credits =3D pipe->tb_credits;
> +	sched_counter_t pipe_tc_credits =3D pipe->tc_credits[tc_index];
>  	int enough_credits;
>=20
>  	/* Check queue credits */
> @@ -2136,21 +2148,22 @@ grinder_credits_check(struct rte_sched_port
> *port,
>  	struct rte_sched_pipe *pipe =3D grinder->pipe;
>  	struct rte_mbuf *pkt =3D grinder->pkt;
>  	uint32_t tc_index =3D grinder->tc_index;
> -	uint32_t pkt_len =3D pkt->pkt_len + port->frame_overhead;
> -	uint32_t subport_tb_credits =3D subport->tb_credits;
> -	uint32_t subport_tc_credits =3D subport->tc_credits[tc_index];
> -	uint32_t pipe_tb_credits =3D pipe->tb_credits;
> -	uint32_t pipe_tc_credits =3D pipe->tc_credits[tc_index];
> -	uint32_t
> pipe_tc_ov_mask1[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> -	uint32_t
> pipe_tc_ov_mask2[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE] =3D {0};
> -	uint32_t pipe_tc_ov_credits, i;
> +	sched_counter_t pkt_len =3D pkt->pkt_len + port->frame_overhead;
> +	sched_counter_t subport_tb_credits =3D subport->tb_credits;
> +	sched_counter_t subport_tc_credits =3D subport-
> >tc_credits[tc_index];
> +	sched_counter_t pipe_tb_credits =3D pipe->tb_credits;
> +	sched_counter_t pipe_tc_credits =3D pipe->tc_credits[tc_index];
> +	sched_counter_t
> pipe_tc_ov_mask1[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
> +	sched_counter_t
> pipe_tc_ov_mask2[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE] =3D {0};
> +	sched_counter_t pipe_tc_ov_credits;
> +	uint32_t i;
>  	int enough_credits;
>=20
>  	for (i =3D 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
> -		pipe_tc_ov_mask1[i] =3D UINT32_MAX;
> +		pipe_tc_ov_mask1[i] =3D ~0;

Please use ~0LLU (or UINT64_MAX) to cover the 64-bit case gracefully. Pleas=
e also double-check that there are no usages of UINT32_MAX left in this cod=
e, unless there is a reason for it. Translation from 32-bit to 64-bit arith=
metic can be very tricky and yield some very difficult to debug issues.

>=20
>  	pipe_tc_ov_mask1[RTE_SCHED_TRAFFIC_CLASS_BE] =3D pipe-
> >tc_ov_credits;
> -	pipe_tc_ov_mask2[RTE_SCHED_TRAFFIC_CLASS_BE] =3D
> UINT32_MAX;
> +	pipe_tc_ov_mask2[RTE_SCHED_TRAFFIC_CLASS_BE] =3D ~0;
>  	pipe_tc_ov_credits =3D pipe_tc_ov_mask1[tc_index];
>=20
>  	/* Check pipe and subport credits */
> diff --git a/lib/librte_sched/rte_sched_common.h
> b/lib/librte_sched/rte_sched_common.h
> index 8c191a9b8..06520a686 100644
> --- a/lib/librte_sched/rte_sched_common.h
> +++ b/lib/librte_sched/rte_sched_common.h
> @@ -14,8 +14,16 @@ extern "C" {
>=20
>  #define __rte_aligned_16 __attribute__((__aligned__(16)))
>=20
> -static inline uint32_t
> -rte_sched_min_val_2_u32(uint32_t x, uint32_t y)
> +//#define COUNTER_SIZE_64
> +
> +#ifdef COUNTER_SIZE_64
> +typedef uint64_t sched_counter_t;
> +#else
> +typedef uint32_t sched_counter_t;
> +#endif
> +
> +static inline sched_counter_t
> +rte_sched_min_val_2(sched_counter_t x, sched_counter_t y)
>  {
>  	return (x < y)? x : y;
>  }
> --
> 2.21.0

I know I have previously suggested the creation of sched_counter_t, but thi=
s was meant to be a temporary solution until the full implementation is mad=
e available. Now that 19.11 is meant to be an ABI stable release, we cannot=
 really afford this trick (which might not be necessary either, since you h=
ave the full implementation), as this #ifdef COUNTER_SIZE is a massive ABI =
breakage.

Therefore, I strongly suggest we remove the sched_counter_t and use uint64_=
t everywhere throughout the implementation. Agree?

Regards,
Cristian