From: Thomas Monjalon
To: Phil Yang
Cc: dev@dpdk.org, david.marchand@redhat.com, drc@linux.vnet.ibm.com,
 Honnappa.Nagarahalli@arm.com, jerinj@marvell.com, konstantin.ananyev@intel.com,
 Ola.Liljedahl@arm.com, ruifeng.wang@arm.com, nd@arm.com,
 john.mcnamara@intel.com, bruce.richardson@intel.com
Date: Fri, 10 Jul 2020 18:55:52 +0200
Message-ID: <1746729.2ERU3Mzd2g@thomas>
In-Reply-To: <1594115449-13750-2-git-send-email-phil.yang@arm.com>
References: <1590483667-10318-1-git-send-email-phil.yang@arm.com>
 <1594115449-13750-1-git-send-email-phil.yang@arm.com>
 <1594115449-13750-2-git-send-email-phil.yang@arm.com>
Subject: Re: [dpdk-dev] [PATCH v6 1/4] doc: add generic atomic deprecation section

Interestingly, John, our doc maintainer, is not Cc'ed. I am adding him.
Please use --cc-cmd devtools/get-maintainer.sh

I am expecting a review from an x86 maintainer as well.
If no maintainer replies, ping them.


07/07/2020 11:50, Phil Yang:
> Add deprecating the generic rte_atomic_xx APIs to c11 atomic built-ins
> guide and examples.
[...]
> +Atomic Operations: Use C11 Atomic Built-ins
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +DPDK `generic rte_atomic `_ operations are

Why this github link on 20.02?
Please try to keep lines small.

> +implemented by `__sync built-ins `_.

Long links should be on their own line to avoid long lines.

> +These __sync built-ins result in full barriers on aarch64, which are unnecessary
> +in many use cases. They can be replaced by `__atomic built-ins `_ that
> +conform to the C11 memory model and provide finer memory order control.
> +
> +So replacing the rte_atomic operations with __atomic built-ins might improve
> +performance for aarch64 machines. `More details `_.

"More details." Please make a sentence.

> +
> +Some typical optimization cases are listed below:
> +
> +Atomicity
> +^^^^^^^^^
> +
> +Some use cases require atomicity alone, the ordering of the memory operations
> +does not matter. For example the packets statistics in the `vhost `_ example application.

Again github.
If you really want a web link, use code.dpdk.org or doc.dpdk.org/api
But why give a code example at all?

> +
> +It just updates the number of transmitted packets, no subsequent logic depends
> +on these counters. So the RELAXED memory ordering is sufficient:
> +
> +.. code-block:: c
> +
> +   static __rte_always_inline void
> +   virtio_xmit(struct vhost_dev *dst_vdev, struct vhost_dev *src_vdev,
> +               struct rte_mbuf *m)
> +   {
> +   ...
> +   ...
> +           if (enable_stats) {
> +                   __atomic_add_fetch(&dst_vdev->stats.rx_total_atomic, 1, __ATOMIC_RELAXED);
> +                   __atomic_add_fetch(&dst_vdev->stats.rx_atomic, ret, __ATOMIC_RELAXED);
> +                   ...
> +           }
> +   }

I don't see how adding real code helps here.
Why not just mention __atomic_add_fetch and __ATOMIC_RELAXED?

> +
> +One-way Barrier
> +^^^^^^^^^^^^^^^
> +
> +Some use cases allow for memory reordering in one way while requiring memory
> +ordering in the other direction.
> +
> +For example, the memory operations before the `lock `_ can move to the
> +critical section, but the memory operations in the critical section cannot move
> +above the lock. In this case, the full memory barrier in the CAS operation can
> +be replaced to ACQUIRE. On the other hand, the memory operations after the
> +`unlock `_ can move to the critical section, but the memory operations in the
> +critical section cannot move below the unlock. So the full barrier in the STORE
> +operation can be replaced with RELEASE.

Again github links instead of our doxygen.
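And if an illustration is really needed, the acquire/release lock pattern
described above fits in a few generic lines instead of linking real code.
Rough sketch only, with made-up names (sketch_lock/sketch_unlock), not the
actual rte_spinlock implementation:

    static int lock_var; /* 0 = free, 1 = taken */

    static inline void
    sketch_lock(void)
    {
            int exp = 0;

            /* Acquire: operations after the lock cannot move above it. */
            while (!__atomic_compare_exchange_n(&lock_var, &exp, 1, 0,
                            __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
                    exp = 0;
    }

    static inline void
    sketch_unlock(void)
    {
            /* Release: operations before the unlock cannot move below it. */
            __atomic_store_n(&lock_var, 0, __ATOMIC_RELEASE);
    }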
> +
> +Reader-Writer Concurrency
> +^^^^^^^^^^^^^^^^^^^^^^^^^

No blank line here?

> +Lock-free reader-writer concurrency is one of the common use cases in DPDK.
> +
> +The payload or the data that the writer wants to communicate to the reader,
> +can be written with RELAXED memory order. However, the guard variable should
> +be written with RELEASE memory order. This ensures that the store to guard
> +variable is observable only after the store to payload is observable.
> +Refer to `rte_hash insert `_ for an example.

Hum...

> +
> +.. code-block:: c
> +
> +   static inline int32_t
> +   rte_hash_cuckoo_insert_mw(const struct rte_hash *h,
> +   ...
> +                           int32_t *ret_val)
> +   {
> +   ...
> +   ...
> +
> +           /* Insert new entry if there is room in the primary
> +            * bucket.
> +            */
> +           for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
> +                   /* Check if slot is available */
> +                   if (likely(prim_bkt->key_idx[i] == EMPTY_SLOT)) {
> +                           prim_bkt->sig_current[i] = sig;
> +                           /* Store to signature and key should not
> +                            * leak after the store to key_idx. i.e.
> +                            * key_idx is the guard variable for signature
> +                            * and key.
> +                            */
> +                           __atomic_store_n(&prim_bkt->key_idx[i],
> +                                            new_idx,
> +                                            __ATOMIC_RELEASE);
> +                           break;
> +                   }
> +           }
> +
> +   ...
> +   }
> +
> +Correspondingly, on the reader side, the guard variable should be read
> +with ACQUIRE memory order. The payload or the data the writer communicated,
> +can be read with RELAXED memory order. This ensures that, if the store to
> +guard variable is observable, the store to payload is also observable.
> +Refer to `rte_hash lookup `_ for an example.
> +
> +.. code-block:: c
> +
> +   static inline int32_t
> +   search_one_bucket_lf(const struct rte_hash *h, const void *key, uint16_t sig,
> +                        void **data, const struct rte_hash_bucket *bkt)
> +   {
> +   ...
> +
> +           for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
> +           ....
> +                   if (bkt->sig_current[i] == sig) {
> +                           key_idx = __atomic_load_n(&bkt->key_idx[i],
> +                                           __ATOMIC_ACQUIRE);
> +                           if (key_idx != EMPTY_SLOT) {
> +                                   k = (struct rte_hash_key *) ((char *)keys +
> +                                           key_idx * h->key_entry_size);
> +
> +                                   if (rte_hash_cmp_eq(key, k->key, h) == 0) {
> +                                           if (data != NULL) {
> +                                                   *data = __atomic_load_n(&k->pdata,
> +                                                                   __ATOMIC_ACQUIRE);
> +                                           }
> +
> +                                           /*
> +                                            * Return index where key is stored,
> +                                            * subtracting the first dummy index
> +                                            */
> +                                           return key_idx - 1;
> +                                   }
> +                   ...
> +           }
> +

NACK for the big chunks of real code.
Please use words and avoid code.

If you insist on keeping code in the doc, I will make you responsible
for updating all the code we already have in the doc :)
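To show what I mean: the whole release/acquire guard pattern quoted above
can be described in a dozen generic lines, if a snippet is really wanted.
Rough sketch only, hypothetical names, not rte_hash internals:

    struct msg {
            int payload;   /* data the writer wants to communicate */
            int guard;     /* guard variable, 0 = not yet published */
    };

    static inline void
    writer_publish(struct msg *m, int value)
    {
            m->payload = value;                               /* relaxed/plain store */
            __atomic_store_n(&m->guard, 1, __ATOMIC_RELEASE); /* publish after payload */
    }

    static inline int
    reader_poll(const struct msg *m, int *value)
    {
            /* Acquire: if the guard is seen, the payload store is visible too. */
            if (__atomic_load_n(&m->guard, __ATOMIC_ACQUIRE) == 0)
                    return -1; /* not published yet */
            *value = m->payload;
            return 0;
    }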