From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Ananyev, Konstantin"
To: Honnappa Nagarahalli, "dev@dpdk.org"
CC: "david.marchand@redhat.com", "jielong.zjl@antfin.com", nd, nd
Date: Thu, 9 Apr 2020 12:36:35 +0000
Subject: Re: [dpdk-dev] [PATCH v3 1/9] test/ring: add contention stress test
References: <20200402220959.29885-1-konstantin.ananyev@intel.com> <20200403174235.23308-1-konstantin.ananyev@intel.com> <20200403174235.23308-2-konstantin.ananyev@intel.com>
List-Id: DPDK patches and discussions

> >
> > Subject: [PATCH v3 1/9] test/ring: add contention stress test
> Minor, would 'add stress test for overcommitted use case' sound better?

I'd like to point out that this test-case can be used as a contention stress-test
(many threads do enqueue/dequeue to/from the same ring)
for both over-committed and non-over-committed scenarios...
I will probably try to add a few extra explanations in v4.

> >
> > Introduce new test-case to measure ring perfomance under contention
> Minor, 'over committed' seems to the word commonly used from the references you provided. Does it make sense to use that?
>
> > (miltiple producers/consumers).
>    ^^^^^^^ multiple

ack.

>
> > Starts dequeue/enqueue loop on all available slave lcores.
> >
> > Signed-off-by: Konstantin Ananyev
> > ---
> >  app/test/Makefile                |   2 +
> >  app/test/meson.build             |   2 +
> >  app/test/test_ring_mpmc_stress.c |  31 +++
> >  app/test/test_ring_stress.c      |  48 ++++
> >  app/test/test_ring_stress.h      |  35 +++
> >  app/test/test_ring_stress_impl.h | 444 +++++++++++++++++++++++++++++++
> Would be good to change the file names to indicate that these tests are for over-committed usecase/configuration.
> These are performance tests, better to have 'perf' or 'performance' in their names.
>
> >  6 files changed, 562 insertions(+)
> >  create mode 100644 app/test/test_ring_mpmc_stress.c
> >  create mode 100644 app/test/test_ring_stress.c
> >  create mode 100644 app/test/test_ring_stress.h
> >  create mode 100644 app/test/test_ring_stress_impl.h
> >
> > diff --git a/app/test/Makefile b/app/test/Makefile
> > index 1f080d162..4eefaa887 100644
> > --- a/app/test/Makefile
> > +++ b/app/test/Makefile
> > @@ -77,7 +77,9 @@ SRCS-y += test_external_mem.c
> >  SRCS-y += test_rand_perf.c
> >
> >  SRCS-y += test_ring.c
> > +SRCS-y += test_ring_mpmc_stress.c
> >  SRCS-y += test_ring_perf.c
> > +SRCS-y += test_ring_stress.c
> >  SRCS-y += test_pmd_perf.c
> >
> >  ifeq ($(CONFIG_RTE_LIBRTE_TABLE),y)
> > diff --git a/app/test/meson.build b/app/test/meson.build
> > index 351d29cb6..827b04886 100644
> > --- a/app/test/meson.build
> > +++ b/app/test/meson.build
> > @@ -100,7 +100,9 @@ test_sources = files('commands.c',
> >  	'test_rib.c',
> >  	'test_rib6.c',
> >  	'test_ring.c',
> > +	'test_ring_mpmc_stress.c',
> >  	'test_ring_perf.c',
> > +	'test_ring_stress.c',
> >  	'test_rwlock.c',
> >  	'test_sched.c',
> >  	'test_service_cores.c',
> > diff --git a/app/test/test_ring_mpmc_stress.c b/app/test/test_ring_mpmc_stress.c
> > new file mode 100644
> > index 000000000..1524b0248
> > --- /dev/null
> > +++ b/app/test/test_ring_mpmc_stress.c
> > @@ -0,0 +1,31 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2020 Intel Corporation
> > + */
> > +
> > +#include "test_ring_stress_impl.h"
> > +
> > +static inline uint32_t
> > +_st_ring_dequeue_bulk(struct rte_ring *r, void **obj, uint32_t n,
> > +	uint32_t *avail)
> > +{
> > +	return rte_ring_mc_dequeue_bulk(r, obj, n, avail);
> > +}
> > +
> > +static inline uint32_t
> > +_st_ring_enqueue_bulk(struct rte_ring *r, void * const *obj, uint32_t n,
> > +	uint32_t *free)
> > +{
> > +	return rte_ring_mp_enqueue_bulk(r, obj, n, free);
> > +}
> > +
> > +static int
> > +_st_ring_init(struct rte_ring *r, const char *name, uint32_t num)
> > +{
> > +	return rte_ring_init(r, name, num, 0);
> > +}
> > +
> > +const struct test test_ring_mpmc_stress = {
> > +	.name = "MP/MC",
> > +	.nb_case = RTE_DIM(tests),
> > +	.cases = tests,
> > +};
> > diff --git a/app/test/test_ring_stress.c b/app/test/test_ring_stress.c
> > new file mode 100644
> > index 000000000..60706f799
> > --- /dev/null
> > +++ b/app/test/test_ring_stress.c
> > @@ -0,0 +1,48 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2020 Intel Corporation
> > + */
> > +
> > +#include "test_ring_stress.h"
> > +
> > +static int
> > +run_test(const struct test *test)
> > +{
> > +	int32_t rc;
> > +	uint32_t i, k;
> > +
> > +	for (i = 0, k = 0; i != test->nb_case; i++) {
> > +
> > +		printf("TEST-CASE %s %s START\n",
> > +			test->name, test->cases[i].name);
> > +
> > +		rc = test->cases[i].func(test->cases[i].wfunc);
> > +		k += (rc == 0);
> > +
> > +		if (rc != 0)
> > +			printf("TEST-CASE %s %s FAILED\n",
> > +				test->name, test->cases[i].name);
> > +		else
> > +			printf("TEST-CASE %s %s OK\n",
> > +				test->name, test->cases[i].name);
> > +	}
> > +
> > +	return k;
> > +}
> > +
> > +static int
> > +test_ring_stress(void)
> > +{
> > +	uint32_t n, k;
> > +
> > +	n = 0;
> > +	k = 0;
> > +
> > +	n += test_ring_mpmc_stress.nb_case;
> > +	k += run_test(&test_ring_mpmc_stress);
> > +
> > +	printf("Number of tests:\t%u\nSuccess:\t%u\nFailed:\t%u\n",
> > +		n, k, n - k);
> > +	return (k != n);
> > +}
> > +
> > +REGISTER_TEST_COMMAND(ring_stress_autotest, test_ring_stress);
> > diff --git a/app/test/test_ring_stress.h b/app/test/test_ring_stress.h
> > new file mode 100644
> > index 000000000..60eac6216
> > --- /dev/null
> > +++ b/app/test/test_ring_stress.h
> > @@ -0,0 +1,35 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2020 Intel Corporation
> > + */
> > +
> > +
> > +#include
> > +#include
> > +#include
> > +#include
> > +#include
> > +#include
> > +
> > +#include
> > +#include
> > +#include
> > +#include
> > +#include
> > +#include
> > +#include
> > +
> > +#include "test.h"
> > +
> > +struct test_case {
> > +	const char *name;
> > +	int (*func)(int (*)(void *));
> > +	int (*wfunc)(void *arg);
> > +};
> > +
> > +struct test {
> > +	const char *name;
> > +	uint32_t nb_case;
> > +	const struct test_case *cases;
> > +};
> > +
> > +extern const struct test test_ring_mpmc_stress;
> > diff --git a/app/test/test_ring_stress_impl.h b/app/test/test_ring_stress_impl.h
> > new file mode 100644
> > index 000000000..11476d28c
> > --- /dev/null
> > +++ b/app/test/test_ring_stress_impl.h
> > @@ -0,0 +1,444 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2020 Intel Corporation
> > + */
> > +
> > +#include "test_ring_stress.h"
> > +
> > +/*
> > + * Measures performance of ring enqueue/dequeue under high contention
> > + */
> > +
> > +#define RING_NAME	"RING_STRESS"
> > +#define BULK_NUM	32
> > +#define RING_SIZE	(2 * BULK_NUM * RTE_MAX_LCORE)
> > +
> > +enum {
> > +	WRK_CMD_STOP,
> > +	WRK_CMD_RUN,
> > +};
> > +
> > +static volatile uint32_t wrk_cmd __rte_cache_aligned;
> > +
> > +/* test run-time in seconds */
> > +static const uint32_t run_time = 60;
> > +static const uint32_t verbose;
> > +
> > +struct lcore_stat {
> > +	uint64_t nb_cycle;
> > +	struct {
> > +		uint64_t nb_call;
> > +		uint64_t nb_obj;
> > +		uint64_t nb_cycle;
> > +		uint64_t max_cycle;
> > +		uint64_t min_cycle;
> > +	} op;
> > +};
> > +
> > +struct lcore_arg {
> > +	struct rte_ring *rng;
> > +	struct lcore_stat stats;
> > +} __rte_cache_aligned;
> > +
> > +struct ring_elem {
> > +	uint32_t cnt[RTE_CACHE_LINE_SIZE / sizeof(uint32_t)];
> > +} __rte_cache_aligned;
> > +
> > +/*
> > + * redefinable functions
> > + */
> > +static uint32_t
> > +_st_ring_dequeue_bulk(struct rte_ring *r, void **obj, uint32_t n,
> > +	uint32_t *avail);
> > +
> > +static uint32_t
> > +_st_ring_enqueue_bulk(struct rte_ring *r, void * const *obj, uint32_t n,
> > +	uint32_t *free);
> > +
> > +static int
> > +_st_ring_init(struct rte_ring *r, const char *name, uint32_t num);
> > +
> > +
> > +static void
> > +lcore_stat_update(struct lcore_stat *ls, uint64_t call, uint64_t obj,
> > +	uint64_t tm, int32_t prcs)
> > +{
> > +	ls->op.nb_call += call;
> > +	ls->op.nb_obj += obj;
> > +	ls->op.nb_cycle += tm;
> > +	if (prcs) {
> > +		ls->op.max_cycle = RTE_MAX(ls->op.max_cycle, tm);
> > +		ls->op.min_cycle = RTE_MIN(ls->op.min_cycle, tm);
> > +	}
> > +}
> > +
> > +static void
> > +lcore_op_stat_aggr(struct lcore_stat *ms, const struct lcore_stat *ls)
> > +{
> > +
> > +	ms->op.nb_call += ls->op.nb_call;
> > +	ms->op.nb_obj += ls->op.nb_obj;
> > +	ms->op.nb_cycle += ls->op.nb_cycle;
> > +	ms->op.max_cycle = RTE_MAX(ms->op.max_cycle, ls->op.max_cycle);
> > +	ms->op.min_cycle = RTE_MIN(ms->op.min_cycle, ls->op.min_cycle);
> > +}
> > +
> > +static void
> > +lcore_stat_aggr(struct lcore_stat *ms, const struct lcore_stat *ls)
> > +{
> > +	ms->nb_cycle = RTE_MAX(ms->nb_cycle, ls->nb_cycle);
> > +	lcore_op_stat_aggr(ms, ls);
> > +}
> > +
> > +static void
> > +lcore_stat_dump(FILE *f, uint32_t lc, const struct lcore_stat *ls)
> > +{
> > +	long double st;
> > +
> > +	st = (long double)rte_get_timer_hz() / US_PER_S;
> > +
> > +	if (lc == UINT32_MAX)
> > +		fprintf(f, "%s(AGGREGATE)={\n", __func__);
> > +	else
> > +		fprintf(f, "%s(lc=%u)={\n", __func__, lc);
> > +
> > +	fprintf(f, "\tnb_cycle=%" PRIu64 "(%.2Lf usec),\n",
> > +		ls->nb_cycle, (long double)ls->nb_cycle / st);
> > +
> > +	fprintf(f, "\tDEQ+ENQ={\n");
> > +
> > +	fprintf(f, "\t\tnb_call=%" PRIu64 ",\n", ls->op.nb_call);
> > +	fprintf(f, "\t\tnb_obj=%" PRIu64 ",\n", ls->op.nb_obj);
> > +	fprintf(f, "\t\tnb_cycle=%" PRIu64 ",\n", ls->op.nb_cycle);
> > +	fprintf(f, "\t\tobj/call(avg): %.2Lf\n",
> > +		(long double)ls->op.nb_obj / ls->op.nb_call);
> > +	fprintf(f, "\t\tcycles/obj(avg): %.2Lf\n",
> > +		(long double)ls->op.nb_cycle / ls->op.nb_obj);
> > +	fprintf(f, "\t\tcycles/call(avg): %.2Lf\n",
> > +		(long double)ls->op.nb_cycle / ls->op.nb_call);
> > +
> > +	/* if min/max cycles per call stats was collected */
> > +	if (ls->op.min_cycle != UINT64_MAX) {
> > +		fprintf(f, "\t\tmax cycles/call=%" PRIu64 "(%.2Lf usec),\n",
> > +			ls->op.max_cycle,
> > +			(long double)ls->op.max_cycle / st);
> > +		fprintf(f, "\t\tmin cycles/call=%" PRIu64 "(%.2Lf usec),\n",
> > +			ls->op.min_cycle,
> > +			(long double)ls->op.min_cycle / st);
> > +	}
> > +
> > +	fprintf(f, "\t},\n");
> > +	fprintf(f, "};\n");
> > +}
> > +
> > +static void
> > +fill_ring_elm(struct ring_elem *elm, uint32_t fill)
> > +{
> > +	uint32_t i;
> > +
> > +	for (i = 0; i != RTE_DIM(elm->cnt); i++)
> > +		elm->cnt[i] = fill;
> > +}
> > +
> > +static int32_t
> > +check_updt_elem(struct ring_elem *elm[], uint32_t num,
> > +	const struct ring_elem *check, const struct ring_elem *fill)
> > +{
> > +	uint32_t i;
> > +
> > +	static rte_spinlock_t dump_lock;
> > +
> > +	for (i = 0; i != num; i++) {
> > +		if (memcmp(check, elm[i], sizeof(*check)) != 0) {
> > +			rte_spinlock_lock(&dump_lock);
> > +			printf("%s(lc=%u, num=%u) failed at %u-th iter, "
> > +				"offending object: %p\n",
> > +				__func__, rte_lcore_id(), num, i, elm[i]);
> > +			rte_memdump(stdout, "expected", check, sizeof(*check));
> > +			rte_memdump(stdout, "result", elm[i], sizeof(elm[i]));
> > +			rte_spinlock_unlock(&dump_lock);
> > +			return -EINVAL;
> > +		}
> > +		memcpy(elm[i], fill, sizeof(*elm[i]));
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static int
> > +check_ring_op(uint32_t exp, uint32_t res, uint32_t lc,
> minor, lcore instead of lc would be better
>
> > +	const char *fname, const char *opname)
> > +{
> > +	if (exp != res) {
> > +		printf("%s(lc=%u) failure: %s expected: %u, returned %u\n",
> Suggest using lcore in the printf
>
> > +			fname, lc, opname, exp, res);
> > +		return -ENOSPC;
> > +	}
> > +	return 0;
> > +}
> > +
> > +static int
> > +test_worker_prcs(void *arg)
> > +{
> > +	int32_t rc;
> > +	uint32_t lc, n, num;
> minor, lcore instead of lc would be better
>
> > +	uint64_t cl, tm0, tm1;
> > +	struct lcore_arg *la;
> > +	struct ring_elem def_elm, loc_elm;
> > +	struct ring_elem *obj[2 * BULK_NUM];
> > +
> > +	la = arg;
> > +	lc = rte_lcore_id();
> > +
> > +	fill_ring_elm(&def_elm, UINT32_MAX);
> > +	fill_ring_elm(&loc_elm, lc);
> > +
> > +	while (wrk_cmd != WRK_CMD_RUN) {
> > +		rte_smp_rmb();
> > +		rte_pause();
> > +	}
> > +
> > +	cl = rte_rdtsc_precise();
> > +
> > +	do {
> > +		/* num in interval [7/8, 11/8] of BULK_NUM */
> > +		num = 7 * BULK_NUM / 8 + rte_rand() % (BULK_NUM / 2);
> > +
> > +		/* reset all pointer values */
> > +		memset(obj, 0, sizeof(obj));
> > +
> > +		/* dequeue num elems */
> > +		tm0 = rte_rdtsc_precise();
> > +		n = _st_ring_dequeue_bulk(la->rng, (void **)obj, num, NULL);
> > +		tm0 = rte_rdtsc_precise() - tm0;
> > +
> > +		/* check return value and objects */
> > +		rc = check_ring_op(num, n, lc, __func__,
> > +			RTE_STR(_st_ring_dequeue_bulk));
> > +		if (rc == 0)
> > +			rc = check_updt_elem(obj, num, &def_elm, &loc_elm);
> > +		if (rc != 0)
> > +			break;
> Since this seems like a performance test, should we skip validating the objects?
> Did these tests run on Travis CI? I believe Travis CI has trouble running stress/performance tests if they take too much time.
> The RTS and HTS tests should be added to functional tests.
>
> > +
> > +		/* enqueue num elems */
> > +		rte_compiler_barrier();
> > +		rc = check_updt_elem(obj, num, &loc_elm, &def_elm);
> > +		if (rc != 0)
> > +			break;
> > +
> > +		tm1 = rte_rdtsc_precise();
> > +		n = _st_ring_enqueue_bulk(la->rng, (void **)obj, num, NULL);
> > +		tm1 = rte_rdtsc_precise() - tm1;
> > +
> > +		/* check return value */
> > +		rc = check_ring_op(num, n, lc, __func__,
> > +			RTE_STR(_st_ring_enqueue_bulk));
> > +		if (rc != 0)
> > +			break;
> > +
> > +		lcore_stat_update(&la->stats, 1, num, tm0 + tm1, 1);
> > +
> > +	} while (wrk_cmd == WRK_CMD_RUN);
> > +
> > +	la->stats.nb_cycle = rte_rdtsc_precise() - cl;
> > +	return rc;
> > +}
> > +
> > +static int
> > +test_worker_avg(void *arg)
> > +{
> > +	int32_t rc;
> > +	uint32_t lc, n, num;
> > +	uint64_t cl;
> > +	struct lcore_arg *la;
> > +	struct ring_elem def_elm, loc_elm;
> > +	struct ring_elem *obj[2 * BULK_NUM];
> > +
> > +	la = arg;
> > +	lc = rte_lcore_id();
> > +
> > +	fill_ring_elm(&def_elm, UINT32_MAX);
> > +	fill_ring_elm(&loc_elm, lc);
> > +
> > +	while (wrk_cmd != WRK_CMD_RUN) {
> > +		rte_smp_rmb();
> > +		rte_pause();
> > +	}
> > +
> > +	cl = rte_rdtsc_precise();
> > +
> > +	do {
> > +		/* num in interval [7/8, 11/8] of BULK_NUM */
> > +		num = 7 * BULK_NUM / 8 + rte_rand() % (BULK_NUM / 2);
> > +
> > +		/* reset all pointer values */
> > +		memset(obj, 0, sizeof(obj));
> > +
> > +		/* dequeue num elems */
> > +		n = _st_ring_dequeue_bulk(la->rng, (void **)obj, num, NULL);
> > +
> > +		/* check return value and objects */
> > +		rc = check_ring_op(num, n, lc, __func__,
> > +			RTE_STR(_st_ring_dequeue_bulk));
> > +		if (rc == 0)
> > +			rc =
> > +				check_updt_elem(obj, num, &def_elm, &loc_elm);
> > +		if (rc != 0)
> > +			break;
> > +
> > +		/* enqueue num elems */
> > +		rte_compiler_barrier();
> > +		rc = check_updt_elem(obj, num, &loc_elm, &def_elm);
> > +		if (rc != 0)
> > +			break;
> > +
> > +		n = _st_ring_enqueue_bulk(la->rng, (void **)obj, num, NULL);
> > +
> > +		/* check return value */
> > +		rc = check_ring_op(num, n, lc, __func__,
> > +			RTE_STR(_st_ring_enqueue_bulk));
> > +		if (rc != 0)
> > +			break;
> > +
> > +		lcore_stat_update(&la->stats, 1, num, 0, 0);
> > +
> > +	} while (wrk_cmd == WRK_CMD_RUN);
> > +
> > +	/* final stats update */
> > +	cl = rte_rdtsc_precise() - cl;
> > +	lcore_stat_update(&la->stats, 0, 0, cl, 0);
> > +	la->stats.nb_cycle = cl;
> > +
> > +	return rc;
> > +}
> Just wondering about the need of 2 tests which run the same functionality. The difference is the way in which numbers are collected.
> Does 'test_worker_avg' adding any value? IMO, we can remove 'test_worker_avg'.

Yes, they are quite similar.
I added the _average_ version for two reasons:
1. In the precise version I call rte_rdtsc_precise() straight before/after
    the enqueue/dequeue op.
    At least on IA, rte_rdtsc_precise() implies mb().
    This extra sync point might hide some sync problems in the ring
    enqueue/dequeue itself.
    So having a separate test without such extra sync points
    gives extra confidence that these tests would catch ring sync problems, if any.
2. People usually don't do enqueue/dequeue on its own.
    One common pattern is: dequeue, read/write data in the dequeued objects, enqueue.
    So this test measures cycles for dequeue/enqueue plus some reads/writes
    to the objects from the ring.
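To illustrate the difference between the two ways of collecting numbers, here is a minimal stand-alone sketch in plain C, with no DPDK dependency. All names in it (op_stat, now_ns(), dummy_op(), run_precise(), run_avg()) are hypothetical and not part of the patch; a clock_gettime() nanosecond clock stands in for rte_rdtsc_precise(), and dummy_op() stands in for the dequeue/enqueue pair. It only shows where the timestamps are taken; it does not reproduce the memory-barrier effect mentioned in point 1.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical per-worker statistics, loosely modelled on lcore_stat.op. */
struct op_stat {
	uint64_t nb_call;  /* number of measured operations */
	uint64_t total_ns; /* total time attributed to the operations */
	uint64_t max_ns;   /* longest single operation (precise mode only) */
	uint64_t min_ns;   /* shortest single operation (precise mode only) */
};

/* Assumption: a monotonic nanosecond clock replaces the TSC read. */
static uint64_t
now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Placeholder for "dequeue, touch objects, enqueue". */
static void
dummy_op(volatile uint32_t *slot)
{
	*slot = *slot + 1;
}

static void
op_stat_init(struct op_stat *st)
{
	st->nb_call = 0;
	st->total_ns = 0;
	st->max_ns = 0;
	st->min_ns = UINT64_MAX; /* "not collected" marker */
}

/* "Precise" style: timestamps around every single operation. */
static void
run_precise(struct op_stat *st, uint32_t iters)
{
	volatile uint32_t slot = 0;
	uint32_t i;
	uint64_t t;

	op_stat_init(st);
	for (i = 0; i != iters; i++) {
		t = now_ns();
		dummy_op(&slot);
		t = now_ns() - t;

		st->nb_call++;
		st->total_ns += t;
		if (t > st->max_ns)
			st->max_ns = t;
		if (t < st->min_ns)
			st->min_ns = t;
	}
}

/* "Average" style: one timestamp pair around the whole loop. */
static void
run_avg(struct op_stat *st, uint32_t iters)
{
	volatile uint32_t slot = 0;
	uint32_t i;
	uint64_t t;

	op_stat_init(st);
	t = now_ns();
	for (i = 0; i != iters; i++) {
		dummy_op(&slot);
		st->nb_call++;
	}
	st->total_ns = now_ns() - t;
	/* min_ns/max_ns are left untouched: no per-call data exists. */
}
```

Calling run_precise(&st, 1000) fills in min_ns/max_ns, while run_avg(&st, 1000) leaves min_ns at UINT64_MAX. That mirrors how lcore_stat_dump() in the patch checks min_cycle against UINT64_MAX to decide whether per-call min/max were collected.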

> > +
> > +static void
> > +mt1_fini(struct rte_ring *rng, void *data)
> > +{
> > +	rte_free(rng);
> > +	rte_free(data);
> > +}
> > +
> > +static int
> > +mt1_init(struct rte_ring **rng, void **data, uint32_t num)
> > +{
> > +	int32_t rc;
> > +	size_t sz;
> > +	uint32_t i, nr;
> > +	struct rte_ring *r;
> > +	struct ring_elem *elm;
> > +	void *p;
> > +
> > +	*rng = NULL;
> > +	*data = NULL;
> > +
> > +	sz = num * sizeof(*elm);
> > +	elm = rte_zmalloc(NULL, sz, __alignof__(*elm));
> > +	if (elm == NULL) {
> > +		printf("%s: alloc(%zu) for %u elems data failed",
> > +			__func__, sz, num);
> > +		return -ENOMEM;
> > +	}
> > +
> > +	*data = elm;
> > +
> > +	/* alloc ring */
> > +	nr = 2 * num;
> > +	sz = rte_ring_get_memsize(nr);
> > +	r = rte_zmalloc(NULL, sz, __alignof__(*r));
> > +	if (r == NULL) {
> > +		printf("%s: alloc(%zu) for FIFO with %u elems failed",
> > +			__func__, sz, nr);
> > +		return -ENOMEM;
> > +	}
> > +
> > +	*rng = r;
> > +
> > +	rc = _st_ring_init(r, RING_NAME, nr);
> > +	if (rc != 0) {
> > +		printf("%s: _st_ring_init(%p, %u) failed, error: %d(%s)\n",
> > +			__func__, r, nr, rc, strerror(-rc));
> > +		return rc;
> > +	}
> > +
> > +	for (i = 0; i != num; i++) {
> > +		fill_ring_elm(elm + i, UINT32_MAX);
> > +		p = elm + i;
> > +		if (_st_ring_enqueue_bulk(r, &p, 1, NULL) != 1)
> > +			break;
> > +	}
> > +
> > +	if (i != num) {
> > +		printf("%s: _st_ring_enqueue(%p, %u) returned %u\n",
> > +			__func__, r, num, i);
> > +		return -ENOSPC;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static int
> > +test_mt1(int (*test)(void *))
> > +{
> > +	int32_t rc;
> > +	uint32_t lc, mc;
> > +	struct rte_ring *r;
> > +	void *data;
> > +	struct lcore_arg arg[RTE_MAX_LCORE];
> > +
> > +	static const struct lcore_stat init_stat = {
> > +		.op.min_cycle = UINT64_MAX,
> > +	};
> > +
> > +	rc = mt1_init(&r, &data, RING_SIZE);
> > +	if (rc != 0) {
> > +		mt1_fini(r, data);
> > +		return rc;
> > +	}
> > +
> > +	memset(arg, 0, sizeof(arg));
> > +
> > +	/* launch on all slaves */
> > +	RTE_LCORE_FOREACH_SLAVE(lc) {
> > +		arg[lc].rng = r;
> > +		arg[lc].stats = init_stat;
> > +		rte_eal_remote_launch(test, &arg[lc], lc);
> > +	}
> > +
> > +	/* signal worker to start test */
> > +	wrk_cmd = WRK_CMD_RUN;
> > +	rte_smp_wmb();
> > +
> > +	usleep(run_time * US_PER_S);
> > +
> > +	/* signal worker to start test */
> > +	wrk_cmd = WRK_CMD_STOP;
> > +	rte_smp_wmb();
> > +
> > +	/* wait for slaves and collect stats. */
> > +	mc = rte_lcore_id();
> > +	arg[mc].stats = init_stat;
> > +
> > +	rc = 0;
> > +	RTE_LCORE_FOREACH_SLAVE(lc) {
> > +		rc |= rte_eal_wait_lcore(lc);
> > +		lcore_stat_aggr(&arg[mc].stats, &arg[lc].stats);
> > +		if (verbose != 0)
> > +			lcore_stat_dump(stdout, lc, &arg[lc].stats);
> > +	}
> > +
> > +	lcore_stat_dump(stdout, UINT32_MAX, &arg[mc].stats);
> > +	mt1_fini(r, data);
> > +	return rc;
> > +}
> > +
> > +static const struct test_case tests[] = {
> > +	{
> > +		.name = "MT-WRK_ENQ_DEQ-MST_NONE-PRCS",
> > +		.func = test_mt1,
> > +		.wfunc = test_worker_prcs,
> > +	},
> > +	{
> > +		.name = "MT-WRK_ENQ_DEQ-MST_NONE-AVG",
> > +		.func = test_mt1,
> > +		.wfunc = test_worker_avg,
> > +	},
> > +};
> > --
> > 2.17.1