From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: "wei.guo.simon@gmail.com" <wei.guo.simon@gmail.com>, "Lu, Wenzhuo"
 <wenzhuo.lu@intel.com>
CC: "dev@dpdk.org" <dev@dpdk.org>, Thomas Monjalon <thomas@monjalon.net>
Date: Tue, 16 Jan 2018 12:38:35 +0000
Message-ID: <2601191342CEEE43887BDE71AB9772588627E492@irsmsx105.ger.corp.intel.com>
References: <6A0DE07E22DDAD4C9103DF62FEBC09093B7109A8@shsmsx102.ccr.corp.intel.com>
 <1515810914-18762-1-git-send-email-wei.guo.simon@gmail.com>
In-Reply-To: <1515810914-18762-1-git-send-email-wei.guo.simon@gmail.com>
Subject: Re: [dpdk-dev] [PATCH v5] app/testpmd: add option ring-bind-lcpu to bind Q with CPU



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of wei.guo.simon@gmail.com
> Sent: Saturday, January 13, 2018 2:35 AM
> To: Lu, Wenzhuo <wenzhuo.lu@intel.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Simon Guo <wei.guo.simon@gmail.com>
> Subject: [dpdk-dev] [PATCH v5] app/testpmd: add option ring-bind-lcpu to bind Q with CPU
>
> From: Simon Guo <wei.guo.simon@gmail.com>
>
> Currently the rx/tx queue is allocated from the buffer pool on the socket of:
> - the port's socket, if --port-numa-config is specified
> - or the per-port ring-numa-config setting
>
> All of the above "bind" a queue to a single socket per port configuration.
> But it can actually achieve better performance if one port's queues can
> be spread across multiple NUMA nodes, with each rx/tx queue allocated
> on the socket local to its lcpu.
>
> This patch adds a new option "--ring-bind-lcpu" (no parameter). With
> this, testpmd can utilize the PCI-e bus bandwidth on other NUMA
> nodes.
>
> When the --port-numa-config or --ring-numa-config option is specified,
> the --ring-bind-lcpu option will be suppressed.

Instead of introducing one more option - wouldn't it be better to
allow the user to manually define flows and assign them to particular lcores?
Then the user would be able to create any FWD configuration he/she likes.
Something like:
lcore X add flow rxq N,Y txq M,Z

which would mean: on lcore X, receive packets from port N, rx_queue Y,
and send them through port M, tx_queue Z.
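
For example (hypothetical syntax, just to illustrate the idea):

    testpmd> lcore 2 add flow rxq 0,1 txq 1,3

would make lcore 2 poll port 0, rx_queue 1 and forward the received
packets to port 1, tx_queue 3. Repeating the command with different
queues and lcores would let the user spread one port's queues across
several NUMA nodes without any extra option.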

Konstantin



>
> Test result:
> 64-byte packets, running on PowerPC with a Mellanox
> CX-4 card, single port (100G), with 8 cores, fwd mode:
> - Without this patch: 52.5 Mpps throughput
> - With this patch: 66 Mpps throughput
>       ~25% improvement
>
> Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
> ---
>  app/test-pmd/parameters.c             |  14 ++++-
>  app/test-pmd/testpmd.c                | 112 ++++++++++++++++++++++++----------
>  app/test-pmd/testpmd.h                |   7 +++
>  doc/guides/testpmd_app_ug/run_app.rst |   6 ++
>  4 files changed, 107 insertions(+), 32 deletions(-)
>
> diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
> index 304b98d..1dba92e 100644
> --- a/app/test-pmd/parameters.c
> +++ b/app/test-pmd/parameters.c
> @@ -104,6 +104,10 @@
>  	       "(flag: 1 for RX; 2 for TX; 3 for RX and TX).\n");
>  	printf("  --socket-num=N: set socket from which all memory is allocated "
>  	       "in NUMA mode.\n");
> +	printf("  --ring-bind-lcpu: "
> +		"specify that TX/RX rings will be allocated on the local socket of the lcpu. "
> +		"It will be ignored if ring-numa-config or port-numa-config is used. "
> +		"As a result, it allows one port to bind to multiple NUMA nodes.\n");
>  	printf("  --mbuf-size=N: set the data size of mbuf to N bytes.\n");
>  	printf("  --total-num-mbufs=N: set the number of mbufs to be allocated "
>  	       "in mbuf pools.\n");
> @@ -544,6 +548,7 @@
>  		{ "interactive",		0, 0, 0 },
>  		{ "cmdline-file",		1, 0, 0 },
>  		{ "auto-start",			0, 0, 0 },
> +		{ "ring-bind-lcpu",		0, 0, 0 },
>  		{ "eth-peers-configfile",	1, 0, 0 },
>  		{ "eth-peer",			1, 0, 0 },
>  #endif
> @@ -676,6 +681,10 @@
>  				stats_period = n;
>  				break;
>  			}
> +			if (!strcmp(lgopts[opt_idx].name, "ring-bind-lcpu")) {
> +				ring_bind_lcpu |= RBL_BIND_LOCAL_MASK;
> +				break;
> +			}
>  			if (!strcmp(lgopts[opt_idx].name,
>  				    "eth-peers-configfile")) {
>  				if (init_peer_eth_addrs(optarg) != 0)
> @@ -739,11 +748,14 @@
>  				if (parse_portnuma_config(optarg))
>  					rte_exit(EXIT_FAILURE,
>  					   "invalid port-numa configuration\n");
> +				ring_bind_lcpu |= RBL_PORT_NUMA_MASK;
>  			}
> -			if (!strcmp(lgopts[opt_idx].name, "ring-numa-config"))
> +			if (!strcmp(lgopts[opt_idx].name, "ring-numa-config")) {
>  				if (parse_ringnuma_config(optarg))
>  					rte_exit(EXIT_FAILURE,
>  					   "invalid ring-numa configuration\n");
> +				ring_bind_lcpu |= RBL_RING_NUMA_MASK;
> +			}
>  			if (!strcmp(lgopts[opt_idx].name, "socket-num")) {
>  				n = atoi(optarg);
>  				if (!new_socket_id((uint8_t)n)) {
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index 9414d0e..e9e89d0 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -68,6 +68,9 @@
>  uint8_t interactive = 0;
>  uint8_t auto_start = 0;
>  uint8_t tx_first;
> +
> +uint8_t ring_bind_lcpu;
> +
>  char cmdline_filename[PATH_MAX] = {0};
>
>  /*
> @@ -1410,6 +1413,43 @@ static int eth_event_callback(portid_t port_id,
>  	return 1;
>  }
>
> +static int find_local_socket(queueid_t qi, int is_rxq)
> +{
> +	/*
> +	 * try to find the local socket with the following logic:
> +	 * 1) Find the correct stream for the queue;
> +	 * 2) Find the correct lcore for the stream;
> +	 * 3) Find the correct socket for the lcore;
> +	 */
> +	int i, j, socket = NUMA_NO_CONFIG;
> +
> +	/* find the stream based on queue no. */
> +	for (i = 0; i < nb_fwd_streams; i++) {
> +		if (is_rxq) {
> +			if (fwd_streams[i]->rx_queue == qi)
> +				break;
> +		} else {
> +			if (fwd_streams[i]->tx_queue == qi)
> +				break;
> +		}
> +	}
> +	if (i == nb_fwd_streams)
> +		return NUMA_NO_CONFIG;
> +
> +	/* find the lcore based on stream idx */
> +	for (j = 0; j < nb_lcores; j++) {
> +		if (fwd_lcores[j]->stream_idx == i)
> +			break;
> +	}
> +	if (j == nb_lcores)
> +		return NUMA_NO_CONFIG;
> +
> +	/* find the socket for the lcore */
> +	socket = rte_lcore_to_socket_id(fwd_lcores_cpuids[j]);
> +
> +	return socket;
> +}
> +
>  int
>  start_port(portid_t pid)
>  {
> @@ -1469,14 +1509,18 @@ static int eth_event_callback(portid_t port_id,
>  			port->need_reconfig_queues = 0;
>  			/* setup tx queues */
>  			for (qi = 0; qi < nb_txq; qi++) {
> +				int socket = port->socket_id;
>  				if ((numa_support) &&
>  					(txring_numa[pi] != NUMA_NO_CONFIG))
> -					diag = rte_eth_tx_queue_setup(pi, qi,
> -						nb_txd,txring_numa[pi],
> -						&(port->tx_conf));
> -				else
> -					diag = rte_eth_tx_queue_setup(pi, qi,
> -						nb_txd,port->socket_id,
> +					socket = txring_numa[pi];
> +				else if (ring_bind_lcpu) {
> +					int ret = find_local_socket(qi, 0);
> +					if (ret != NUMA_NO_CONFIG)
> +						socket = ret;
> +				}
> +
> +				diag = rte_eth_tx_queue_setup(pi, qi,
> +						nb_txd, socket,
>  						&(port->tx_conf));
>
>  				if (diag == 0)
> @@ -1495,35 +1539,28 @@ static int eth_event_callback(portid_t port_id,
>  			}
>  			/* setup rx queues */
>  			for (qi = 0; qi < nb_rxq; qi++) {
> +				int socket = port->socket_id;
>  				if ((numa_support) &&
> -					(rxring_numa[pi] != NUMA_NO_CONFIG)) {
> -					struct rte_mempool * mp =
> -						mbuf_pool_find(rxring_numa[pi]);
> -					if (mp == NULL) {
> -						printf("Failed to setup RX queue:"
> -							"No mempool allocation"
> -							" on the socket %d\n",
> -							rxring_numa[pi]);
> -						return -1;
> -					}
> -
> -					diag = rte_eth_rx_queue_setup(pi, qi,
> -					     nb_rxd,rxring_numa[pi],
> -					     &(port->rx_conf),mp);
> -				} else {
> -					struct rte_mempool *mp =
> -						mbuf_pool_find(port->socket_id);
> -					if (mp == NULL) {
> -						printf("Failed to setup RX queue:"
> +					(rxring_numa[pi] != NUMA_NO_CONFIG))
> +					socket = rxring_numa[pi];
> +				else if (ring_bind_lcpu) {
> +					int ret = find_local_socket(qi, 1);
> +					if (ret != NUMA_NO_CONFIG)
> +						socket = ret;
> +				}
> +
> +				struct rte_mempool *mp =
> +					mbuf_pool_find(socket);
> +				if (mp == NULL) {
> +					printf("Failed to setup RX queue:"
>  							"No mempool allocation"
>  							" on the socket %d\n",
> -							port->socket_id);
> -						return -1;
> -					}
> -					diag = rte_eth_rx_queue_setup(pi, qi,
> -					     nb_rxd,port->socket_id,
> -					     &(port->rx_conf), mp);
> +							socket);
> +					return -1;
>  				}
> +				diag = rte_eth_rx_queue_setup(pi, qi,
> +						nb_rxd, socket,
> +						&(port->rx_conf), mp);
>  				if (diag == 0)
>  					continue;
>
> @@ -2414,6 +2451,19 @@ uint8_t port_is_bonding_slave(portid_t slave_pid)
>  		       "but nb_txq=%d will prevent to fully test it.\n",
>  		       nb_rxq, nb_txq);
>
> +	if (ring_bind_lcpu & RBL_BIND_LOCAL_MASK) {
> +		if (ring_bind_lcpu &
> +				(RBL_RING_NUMA_MASK | RBL_PORT_NUMA_MASK)) {
> +			printf("ring-bind-lcpu option is suppressed by "
> +					"ring-numa-config or port-numa-config option\n");
> +			ring_bind_lcpu = 0;
> +		} else {
> +			printf("RingBuffer bind with local CPU selected\n");
> +			ring_bind_lcpu = 1;
> +		}
> +	} else
> +		ring_bind_lcpu = 0;
> +
>  	init_config();
>  	if (start_port(RTE_PORT_ALL) != 0)
>  		rte_exit(EXIT_FAILURE, "Start ports failed\n");
> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
> index 2a266fd..99e55b2 100644
> --- a/app/test-pmd/testpmd.h
> +++ b/app/test-pmd/testpmd.h
> @@ -328,6 +328,13 @@ struct queue_stats_mappings {
>  extern uint8_t  interactive;
>  extern uint8_t  auto_start;
>  extern uint8_t  tx_first;
> +
> +/* for --ring-bind-lcpu option checking, RBL means Ring Bind Lcpu related */
> +#define RBL_BIND_LOCAL_MASK (1 << 0)
> +#define RBL_RING_NUMA_MASK  (1 << 1)
> +#define RBL_PORT_NUMA_MASK  (1 << 2)
> +extern uint8_t  ring_bind_lcpu;
> +
>  extern char cmdline_filename[PATH_MAX]; /**< offline commands file */
>  extern uint8_t  numa_support; /**< set by "--numa" parameter */
>  extern uint16_t port_topology; /**< set by "--port-topology" parameter */
> diff --git a/doc/guides/testpmd_app_ug/run_app.rst b/doc/guides/testpmd_app_ug/run_app.rst
> index 4c0d2ce..b88f099 100644
> --- a/doc/guides/testpmd_app_ug/run_app.rst
> +++ b/doc/guides/testpmd_app_ug/run_app.rst
> @@ -240,6 +240,12 @@ The commandline options are:
>      Specify the socket on which the TX/RX rings for the port will be allocated.
>      Where flag is 1 for RX, 2 for TX, and 3 for RX and TX.
>
> +*   ``--ring-bind-lcpu``
> +
> +    Specify that TX/RX rings will be allocated on the local socket of the lcpu.
> +    It will be ignored if ring-numa-config or port-numa-config is used.
> +    As a result, it allows one port to bind to multiple NUMA nodes.
> +
>  *   ``--socket-num=N``
>
>      Set the socket from which all memory is allocated in NUMA mode,
> --
> 1.8.3.1