Date: Tue, 7 Nov 2017 15:27:29 +0530
From: Jerin Jacob <jerin.jacob@caviumnetworks.com>
To: Jia He <hejianet@gmail.com>
Cc: dev@dpdk.org, olivier.matz@6wind.com, konstantin.ananyev@intel.com,
 bruce.richardson@intel.com, jianbo.liu@arm.com,
 hemant.agrawal@nxp.com, jie2.liu@hxt-semitech.com,
 bing.zhao@hxt-semitech.com, jia.he@hxt-semitech.com
Message-ID: <20171107095727.GA23010@jerin>
References: <1509612210-5499-1-git-send-email-hejianet@gmail.com>
 <20171102172337.GB1478@jerin>
 <25192429-8369-ac3d-44b0-c1b1d7182ef0@gmail.com>
 <20171103125616.GB20326@jerin>
 <7b7f3677-8313-9a2f-868f-b3a6231548d6@gmail.com>
 <20171107043655.GA3244@jerin>
 <c2ce8774-a1b6-edf6-444e-ee0981df7497@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <c2ce8774-a1b6-edf6-444e-ee0981df7497@gmail.com>
User-Agent: Mutt/1.9.1 (2017-09-22)
Subject: Re: [dpdk-dev] [PATCH v2] ring: guarantee ordering of cons/prod
 loading when doing

-----Original Message-----
> Date: Tue, 7 Nov 2017 16:34:30 +0800
> From: Jia He <hejianet@gmail.com>
> To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Cc: dev@dpdk.org, olivier.matz@6wind.com, konstantin.ananyev@intel.com,
>  bruce.richardson@intel.com, jianbo.liu@arm.com, hemant.agrawal@nxp.com,
>  jie2.liu@hxt-semitech.com, bing.zhao@hxt-semitech.com,
>  jia.he@hxt-semitech.com
> Subject: Re: [PATCH v2] ring: guarantee ordering of cons/prod loading when
>  doing
> User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101
>  Thunderbird/52.4.0
> 
> 
> 
> On 11/7/2017 12:36 PM, Jerin Jacob wrote:
> > -----Original Message-----
> > 
> > One option could be to change the prototype of update_tail() and let the
> > compiler accommodate it at zero cost for arm64 (which I think is the
> > case, but you can check the generated instructions).
> > If not, move __rte_ring_do_dequeue() and __rte_ring_do_enqueue() instead of
> > __rte_ring_move_prod_head()/__rte_ring_move_cons_head()/update_tail()
> > 
> > 
> > ➜ [master][dpdk.org] $ git diff
> > diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> > index 5e9b3b7b4..b32648825 100644
> > --- a/lib/librte_ring/rte_ring.h
> > +++ b/lib/librte_ring/rte_ring.h
> > @@ -358,8 +358,12 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
> >   static __rte_always_inline void
> >   update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val,
> > -               uint32_t single)
> > +               uint32_t single, const uint32_t enqueue)
> >   {
> > +       if (enqueue)
> > +               rte_smp_wmb();
> > +       else
> > +               rte_smp_rmb();
> >          /*
> >           * If there are other enqueues/dequeues in progress that
> >           * preceded us,
> >           * we need to wait for them to complete
> > @@ -470,9 +474,8 @@ __rte_ring_do_enqueue(struct rte_ring *r, void * const *obj_table,
> >                  goto end;
> >          ENQUEUE_PTRS(r, &r[1], prod_head, obj_table, n, void *);
> > -       rte_smp_wmb();
> > -       update_tail(&r->prod, prod_head, prod_next, is_sp);
> > +       update_tail(&r->prod, prod_head, prod_next, is_sp, 1);
> >   end:
> >          if (free_space != NULL)
> >                  *free_space = free_entries - n;
> > @@ -575,9 +578,8 @@ __rte_ring_do_dequeue(struct rte_ring *r, void **obj_table,
> >                  goto end;
> >          DEQUEUE_PTRS(r, &r[1], cons_head, obj_table, n, void *);
> > -       rte_smp_rmb();
> > -       update_tail(&r->cons, cons_head, cons_next, is_sc);
> > +       update_tail(&r->cons, cons_head, cons_next, is_sc, 0);
> >   end:
> >          if (available != NULL)
> > 
> > 
> > 
> Hi Jerin, yes, I am aware of this suggestion for update_tail().
> But what I mean is the rte_smp_rmb() in __rte_ring_move_cons_head() and
> __rte_ring_move_prod_head():
> [option 1]
> +        *old_head = r->cons.head;
> +        rte_smp_rmb();
> +        const uint32_t prod_tail = r->prod.tail;
> 
> [option 2]
> +        *old_head = __atomic_load_n(&r->cons.head,
> +                    __ATOMIC_ACQUIRE);
> 
> i.e. I wonder what would be a suitable new config name to distinguish the
> above 2 options?

Why?
If you fix the generic version with rte_smp_rmb() then we need only one
config to differentiate between the c11 and generic versions. See comments below.
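
For instance, a minimal sketch of how a single config could select between
the two implementations (the name RTE_RING_USE_C11_MEM_MODEL below is only
illustrative; the final config name is up for discussion):

/* in rte_ring.h, after the common definitions */
#ifdef RTE_RING_USE_C11_MEM_MODEL
#include "rte_ring_c11_mem.h"   /* __atomic_load_n()/__atomic_store_n() based */
#else
#include "rte_ring_generic.h"   /* rte_smp_rmb()/rte_smp_wmb() based */
#endif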

> Thanks for the patience :-)
> 
> see my draft patch below, the macro "PREFER":
> + */
> +
> +#ifndef _RTE_RING_C11_MEM_H_
> +#define _RTE_RING_C11_MEM_H_
> +
> +static __rte_always_inline void
> +update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val,
> +        uint32_t single, uint32_t enqueue)
> +{
> +    /* Don't need wmb/rmb when we prefer to use load_acquire/
> +     * store_release barrier */
> +#ifndef PREFER
> +    if (enqueue)
> +        rte_smp_wmb();
> +    else
> +        rte_smp_rmb();
> +#endif

You can remove PREFER and let the "generic" version have this. For x86,
rte_smp_?mb() will be a NOOP, so there is no issue.
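
As a rough illustration of why the generic version is free on x86
(paraphrasing the EAL arch headers, not a literal copy):

/* x86: the ISA already orders the loads/stores we care about here,
 * so only the compiler has to be stopped from reordering */
#define rte_smp_wmb() rte_compiler_barrier()
#define rte_smp_rmb() rte_compiler_barrier()

/* arm64: real barrier instructions are emitted */
#define rte_smp_wmb() asm volatile("dmb ishst" : : : "memory")
#define rte_smp_rmb() asm volatile("dmb ishld" : : : "memory")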

> +
> +    /*
> +     * If there are other enqueues/dequeues in progress that preceded us,
> +     * we need to wait for them to complete
> +     */
> +    if (!single)
> +        while (unlikely(ht->tail != old_val))
> +            rte_pause();
> +
> +#ifdef PREFER
> +    __atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);

For the c11 memory model version, only the __atomic_store_n() variant is
needed; see the sketch after this hunk.

> +#else
> +    ht->tail = new_val;
> +#endif
> +}
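
i.e. something like this (a sketch of the c11 update_tail() with PREFER
dropped; RTE_SET_USED() only silences the unused-parameter warning):

static __rte_always_inline void
update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
        uint32_t new_val, uint32_t single, uint32_t enqueue)
{
    /* no explicit wmb/rmb; the release store below provides the ordering */
    RTE_SET_USED(enqueue);

    /*
     * If there are other enqueues/dequeues in progress that preceded us,
     * we need to wait for them to complete.
     */
    if (!single)
        while (unlikely(ht->tail != old_val))
            rte_pause();

    __atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
}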
> +
> +/**
> + * @internal This function updates the producer head for enqueue
> + *
> + * @param r
> + *   A pointer to the ring structure
> + * @param is_sp
> + *   Indicates whether multi-producer path is needed or not
> + * @param n
> + *   The number of elements we will want to enqueue, i.e. how far should the
> + *   head be moved
> + * @param behavior
> + *   RTE_RING_QUEUE_FIXED:    Enqueue a fixed number of items from a ring
> + *   RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible from ring
> + * @param old_head
> + *   Returns head value as it was before the move, i.e. where enqueue starts
> + * @param new_head
> + *   Returns the current/new head value i.e. where enqueue finishes
> + * @param free_entries
> + *   Returns the amount of free space in the ring BEFORE head was moved
> + * @return
> + *   Actual number of objects enqueued.
> + *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> + */
> +static __rte_always_inline unsigned int
> +__rte_ring_move_prod_head(struct rte_ring *r, int is_sp,
> +        unsigned int n, enum rte_ring_queue_behavior behavior,
> +        uint32_t *old_head, uint32_t *new_head,
> +        uint32_t *free_entries)
> +{
> +    const uint32_t capacity = r->capacity;
> +    unsigned int max = n;
> +    int success;
> +
> +    do {
> +        /* Reset n to the initial burst count */
> +        n = max;
> +
> +#ifdef PREFER
> +        *old_head = __atomic_load_n(&r->prod.head,
> +                    __ATOMIC_ACQUIRE);
> +#else
> +        *old_head = r->prod.head;
> +        /* prevent reorder of load/load */
> +        rte_smp_rmb();
> +#endif

Same as above comment.

> +        const uint32_t cons_tail = r->cons.tail;
> +        /*
> +         *  The subtraction is done between two unsigned 32bits value
> +         * (the result is always modulo 32 bits even if we have
> +         * *old_head > cons_tail). So 'free_entries' is always between 0
> +         * and capacity (which is < size).
> +         */
> +        *free_entries = (capacity + cons_tail - *old_head);
> +
> +        /* check that we have enough room in ring */
> +        if (unlikely(n > *free_entries))

> +static __rte_always_inline unsigned int
> +__rte_ring_do_enqueue(struct rte_ring *r, void * const *obj_table,
> +         unsigned int n, enum rte_ring_queue_behavior behavior,
> +         int is_sp, unsigned int *free_space)
> +{


Duplicate function; no need to replicate it in both versions.


> +static __rte_always_inline unsigned int
> +__rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
> +        unsigned int n, enum rte_ring_queue_behavior behavior,
> +        uint32_t *old_head, uint32_t *new_head,
> +        uint32_t *entries)
> +{
> +    unsigned int max = n;
> +    int success;
> +
> +    /* move cons.head atomically */
> +    do {
> +        /* Restore n as it may change every loop */
> +        n = max;
> +#ifdef PREFER
> +        *old_head = __atomic_load_n(&r->cons.head,
> +                    __ATOMIC_ACQUIRE);
> +#else
> +        *old_head = r->cons.head;
> +        /*  prevent reorder of load/load */
> +        rte_smp_rmb();
> +#endif

Same as above comment.

> +
> +        const uint32_t prod_tail = r->prod.tail;
> +        /* The subtraction is done between two unsigned 32bits value
> +         * (the result is always modulo 32 bits even if we have
> +         * cons_head > prod_tail). So 'entries' is always between 0
> +         * and size(ring)-1. */
> +        *entries = (prod_tail - *old_head);
> +
> +        /* Set the actual entries for dequeue */
> +        if (n > *entries)
> +            n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : *entries;
> +
> +        if (unlikely(n == 0))
> +            return 0;
> +
> +        *new_head = *old_head + n;
> +        if (is_sc)
> +            r->cons.head = *new_head, success = 1;
> +        else
> +#ifdef PREFER
> +            success = arch_rte_atomic32_cmpset(&r->cons.head,
> +                            old_head, *new_head,
> +                            0, __ATOMIC_ACQUIRE,
> +                            __ATOMIC_RELAXED);
> +#else
> +            success = rte_atomic32_cmpset(&r->cons.head, *old_head,
> +                    *new_head);
> +#endif

Same as above comment. Also, see the note after this hunk on using the gcc
builtin directly.

> +    } while (unlikely(success == 0));
> +    return n;
> +}
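
On the gcc builtin: instead of a custom arch_rte_atomic32_cmpset() wrapper,
the c11 version could call __atomic_compare_exchange_n() directly (a sketch,
untested):

            success = __atomic_compare_exchange_n(&r->cons.head,
                            old_head, *new_head,
                            0, /* strong CAS */
                            __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);

On failure the builtin also writes the observed value back into *old_head,
which fits the retry loop naturally.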
> +
> +/**
> + * @internal Dequeue several objects from the ring
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param obj_table
> + *   A pointer to a table of void * pointers (objects).
> + * @param n
> + *   The number of objects to pull from the ring.
> + * @param behavior
> + *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
> + *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from ring
> + * @param is_sc
> + *   Indicates whether to use single consumer or multi-consumer head update
> + * @param available
> + *   returns the number of remaining ring entries after the dequeue has finished
> + * @return
> + *   - Actual number of objects dequeued.
> + *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> + */
> +static __rte_always_inline unsigned int
> +__rte_ring_do_dequeue(struct rte_ring *r, void **obj_table,
> +         unsigned int n, enum rte_ring_queue_behavior behavior,
> +         int is_sc, unsigned int *available)
> +{

Duplicate function; no need to replicate it in both versions.


> +    uint32_t cons_head, cons_next;
> +    uint32_t entries;
> +
> +    n = __rte_ring_move_cons_head(r, is_sc, n, behavior,
> +            &cons_head, &cons_next, &entries);
> +    if (n == 0)
> +        goto end;
> +
> +    DEQUEUE_PTRS(r, &r[1], cons_head, obj_table, n, void *);
> +
> +    update_tail(&r->cons, cons_head, cons_next, is_sc, 0);
> +
> +end:
> +    if (available != NULL)
> +        *available = entries - n;
> +    return n;
> +}
> +
> +#endif /* _RTE_RING_C11_MEM_H_ */
> +
> diff --git a/lib/librte_ring/rte_ring_generic.h b/lib/librte_ring/rte_ring_generic.h
> new file mode 100644
> index 0000000..0ce6d57
> --- /dev/null
> +++ b/lib/librte_ring/rte_ring_generic.h
> @@ -0,0 +1,268 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 hxt-semitech. All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of hxt-semitech nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _RTE_RING_GENERIC_H_
> +#define _RTE_RING_GENERIC_H_
> +
> +static __rte_always_inline void
> +update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val,
> +        uint32_t single, uint32_t enqueue)
> +{
> +    if (enqueue)
> +        rte_smp_wmb();
> +    else
> +        rte_smp_rmb();
> +    /*
> +     * If there are other enqueues/dequeues in progress that preceded us,
> +     * we need to wait for them to complete
> +     */
> +    if (!single)
> +        while (unlikely(ht->tail != old_val))
> +            rte_pause();
> +
> +    ht->tail = new_val;
> +}
> +
> +/**
> + * @internal This function updates the producer head for enqueue
> + *
> + * @param r
> + *   A pointer to the ring structure
> + * @param is_sp
> + *   Indicates whether multi-producer path is needed or not
> + * @param n
> + *   The number of elements we will want to enqueue, i.e. how far should the
> + *   head be moved
> + * @param behavior
> + *   RTE_RING_QUEUE_FIXED:    Enqueue a fixed number of items from a ring
> + *   RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible from ring
> + * @param old_head
> + *   Returns head value as it was before the move, i.e. where enqueue starts
> + * @param new_head
> + *   Returns the current/new head value i.e. where enqueue finishes
> + * @param free_entries
> + *   Returns the amount of free space in the ring BEFORE head was moved
> + * @return
> + *   Actual number of objects enqueued.
> + *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> + */
> +static __rte_always_inline unsigned int
> +__rte_ring_move_prod_head(struct rte_ring *r, int is_sp,
> +        unsigned int n, enum rte_ring_queue_behavior behavior,
> +        uint32_t *old_head, uint32_t *new_head,
> +        uint32_t *free_entries)
> +{
> +    const uint32_t capacity = r->capacity;
> +    unsigned int max = n;
> +    int success;
> +
> +    do {
> +        /* Reset n to the initial burst count */
> +        n = max;
> +
> +        *old_head = r->prod.head;

Adding rte_smp_rmb() here does no harm, as it is a NOOP for x86, and it is
semantically correct too.
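
Concretely, the load sequence in the generic version would become (sketch):

        *old_head = r->prod.head;
        /* prevent the load of cons.tail below from being reordered
         * before the load of prod.head above */
        rte_smp_rmb();
        const uint32_t cons_tail = r->cons.tail;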


> +        const uint32_t cons_tail = r->cons.tail;
> +        /*
> +         *  The subtraction is done between two unsigned 32bits value
> +         * (the result is always modulo 32 bits even if we have
> +         * *old_head > cons_tail). So 'free_entries' is always between 0
> +         * and capacity (which is < size).
> +         */
> +        *free_entries = (capacity + cons_tail - *old_head);
> +
> +        /* check that we have enough room in ring */
> +        if (unlikely(n > *free_entries))
> +            n = (behavior == RTE_RING_QUEUE_FIXED) ?
> +                    0 : *free_entries;
> +
> +        if (n == 0)
> +            return 0;
> +
> +        *new_head = *old_head + n;
> +        if (is_sp)
> +            r->prod.head = *new_head, success = 1;
> +        else
> +            success = rte_atomic32_cmpset(&r->prod.head,
> +                    *old_head, *new_head);
> +    } while (unlikely(success == 0));
> +    return n;
> +}
> +
> +/**
> + * @internal Enqueue several objects on the ring
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param obj_table
> + *   A pointer to a table of void * pointers (objects).
> + * @param n
> + *   The number of objects to add in the ring from the obj_table.
> + * @param behavior
> + *   RTE_RING_QUEUE_FIXED:    Enqueue a fixed number of items from a ring
> + *   RTE_RING_QUEUE_VARIABLE: Enqueue as many items as possible from ring
> + * @param is_sp
> + *   Indicates whether to use single producer or multi-producer head update
> + * @param free_space
> + *   returns the amount of space after the enqueue operation has finished
> + * @return
> + *   Actual number of objects enqueued.
> + *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> + */
> +static __rte_always_inline unsigned int
> +__rte_ring_do_enqueue(struct rte_ring *r, void * const *obj_table,
> +         unsigned int n, enum rte_ring_queue_behavior behavior,
> +         int is_sp, unsigned int *free_space)
> +{

Duplicate function; no need to replicate it in both versions.

> +
> +/**
> + * @internal This function updates the consumer head for dequeue
> + *
> + * @param r
> + *   A pointer to the ring structure
> + * @param is_sc
> + *   Indicates whether multi-consumer path is needed or not
> + * @param n
> + *   The number of elements we will want to enqueue, i.e. how far should the
> + *   head be moved
> + * @param behavior
> + *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
> + *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from ring
> + * @param old_head
> + *   Returns head value as it was before the move, i.e. where dequeue starts
> + * @param new_head
> + *   Returns the current/new head value i.e. where dequeue finishes
> + * @param entries
> + *   Returns the number of entries in the ring BEFORE head was moved
> + * @return
> + *   - Actual number of objects dequeued.
> + *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> + */
> +static __rte_always_inline unsigned int
> +__rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
> +        unsigned int n, enum rte_ring_queue_behavior behavior,
> +        uint32_t *old_head, uint32_t *new_head,
> +        uint32_t *entries)
> +{
> +    unsigned int max = n;
> +    int success;
> +
> +    /* move cons.head atomically */
> +    do {
> +        /* Restore n as it may change every loop */
> +        n = max;
> +
> +        *old_head = r->cons.head;
> +        const uint32_t prod_tail = r->prod.tail;

Same as above comment.

> +        /* The subtraction is done between two unsigned 32bits value
> +         * (the result is always modulo 32 bits even if we have
> +         * cons_head > prod_tail). So 'entries' is always between 0
> +         * and size(ring)-1. */
> +        *entries = (prod_tail - *old_head);
> +
> +        /* Set the actual entries for dequeue */
> +        if (n > *entries)
> +            n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : *entries;
> +
> +        if (unlikely(n == 0))
> +            return 0;
> +
> +        *new_head = *old_head + n;
> +        if (is_sc)
> +            r->cons.head = *new_head, success = 1;
> +        else
> +            success = rte_atomic32_cmpset(&r->cons.head, *old_head,
> +                    *new_head);
> +    } while (unlikely(success == 0));
> +    return n;
> +}
> +
> +/**
> + * @internal Dequeue several objects from the ring
> + *
> + * @param r
> + *   A pointer to the ring structure.
> + * @param obj_table
> + *   A pointer to a table of void * pointers (objects).
> + * @param n
> + *   The number of objects to pull from the ring.
> + * @param behavior
> + *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from a ring
> + *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from ring
> + * @param is_sc
> + *   Indicates whether to use single consumer or multi-consumer head update
> + * @param available
> + *   returns the number of remaining ring entries after the dequeue has finished
> + * @return
> + *   - Actual number of objects dequeued.
> + *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> + */
> +static __rte_always_inline unsigned int
> +__rte_ring_do_dequeue(struct rte_ring *r, void **obj_table,
> +         unsigned int n, enum rte_ring_queue_behavior behavior,
> +         int is_sc, unsigned int *available)
> +{
Duplicate function; no need to replicate it in both versions.
>