From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by dpdk.org (Postfix) with ESMTP id A8708376C for ; Tue, 7 Feb 2017 15:13:20 +0100 (CET) Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga104.jf.intel.com with ESMTP; 07 Feb 2017 06:13:19 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.33,346,1477983600"; d="scan'208";a="61813575" Received: from sivswdev01.ir.intel.com (HELO localhost.localdomain) ([10.237.217.45]) by orsmga005.jf.intel.com with ESMTP; 07 Feb 2017 06:13:17 -0800 From: Bruce Richardson To: olivier.matz@6wind.com Cc: thomas.monjalon@6wind.com, keith.wiles@intel.com, konstantin.ananyev@intel.com, stephen@networkplumber.org, dev@dpdk.org, Bruce Richardson Date: Tue, 7 Feb 2017 14:12:38 +0000 Message-Id: <1486476777-24768-1-git-send-email-bruce.richardson@intel.com> X-Mailer: git-send-email 1.7.0.7 In-Reply-To: <20170125121456.GA24344@bricha3-MOBL3.ger.corp.intel.com> References: <20170125121456.GA24344@bricha3-MOBL3.ger.corp.intel.com> Subject: [dpdk-dev] [PATCH RFCv3 00/19] ring cleanup and generalization X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 07 Feb 2017 14:13:21 -0000 This patchset make a set of, sometimes non-backward compatible, cleanup changes to the rte_ring code in order to improve it. The resulting code is shorter*, since the existing functions are restructured to reduce code duplication, as well as being more consistent in behaviour. The specific changes made are explained in each patch which makes that change. Key incompatibilities: * The biggest, and probably most controversial change is that to the enqueue and dequeue APIs. The enqueue/deq burst and bulk functions have their function prototypes changed so that they all return an additional parameter, indicating the size of next call which is guaranteed to succeed. In case on enq, this is the number of available slots on the ring, and in case of deq, it is the number of objects which can be pulled. As well as this, the return value from the bulk functions have been changed to make them compatible with the burst functions. In all cases, the functions to enq/deq a set of objs now return the number of objects processed, 0 or N, in the case of bulk functions, 0, N or any value in between in the case of the burst ones. [Due to the extra parameter, the compiler will flag all instances of the function to allow the user to also change the return value logic at the same time] * The parameters to the single object enq/deq functions have not been changed. Because of that, the return value is also unmodified - as the compiler cannot automatically flag this to the user. Potential further cleanups: * To a certain extent the rte_ring structure has gone from being a whole ring structure, including a "ring" element itself, to just being a header which can be reused, along with the head/tail update functions to create new rings. For now, the enqueue code works by assuming that the ring data goes immediately after the header, but that can be changed to allow specialised ring implementations to put additional metadata of their own after the ring header. I didn't see this as being needed right now, but it may be worth considering for a V1 patchset. * There are 9 enqueue functions and 9 dequeue functions in rte_ring.h. I suspect not all of those are used, so personally I would consider dropping the functions to enqueue/dequeue a single value using single or multi semantics, i.e. drop rte_ring_sp_enqueue rte_ring_mp_enqueue rte_ring_sc_dequeue rte_ring_mc_dequeue That would still leave a single enqueue and dequeue function for working with a single object at a time. * It should be possible to merge the head update code for enqueue and dequeue into a single function. The key difference between the two is the calculation of how far the index can be moved. I felt that the functions for moving the head index are sufficiently complicated with many parameters to them already, that trying to merge in more code would impede readability. However, if so desired this change can be made at a later stage without affecting ABI or API. PERFORMANCE: I've run performance autotests on a couple of (Intel) platforms. Looking particularly at the core-2-core results, which I expect are the main ones of interest, the performance after this patchset is a few cycles per packet faster in my testing. I'm hoping it should be at least neutral perf-wise. REQUEST FOR FEEDBACK: * Are all of these changes worth making? * Should they be made in existing ring code, or do we look to provide a new fifo library to completely replace the ring one? * How does the implementation of new ring types using this code compare vs that of the previous RFCs? [*] LOC original rte_ring.h: 462. After patchset: 363. [Numbers generated using David A. Wheeler's 'SLOCCount'.] Bruce Richardson (19): app/pdump: fix duplicate macro definition ring: remove split cacheline build setting ring: create common structure for prod and cons metadata ring: add a function to return the ring size crypto/null: use ring size function ring: eliminate duplication of size and mask fields ring: remove debug setting ring: remove the yield when waiting for tail update ring: remove watermark support ring: make bulk and burst fn return vals consistent ring: allow enq fns to return free space value examples/quota_watermark: use ring space for watermarks ring: allow dequeue fns to return remaining entry count ring: reduce scope of local variables ring: separate out head index manipulation for enq/deq ring: create common function for updating tail idx ring: allow macros to work with any type of object ring: add object size parameter to memory size calculation ring: add event ring implementation app/pdump/main.c | 3 +- app/test-pipeline/pipeline_hash.c | 5 +- app/test-pipeline/runtime.c | 19 +- app/test/Makefile | 1 + app/test/commands.c | 52 -- app/test/test_event_ring.c | 85 +++ app/test/test_link_bonding_mode4.c | 6 +- app/test/test_pmd_ring_perf.c | 12 +- app/test/test_ring.c | 704 ++----------------- app/test/test_ring_perf.c | 36 +- app/test/test_table_acl.c | 2 +- app/test/test_table_pipeline.c | 2 +- app/test/test_table_ports.c | 12 +- app/test/virtual_pmd.c | 8 +- config/common_base | 3 - doc/guides/prog_guide/env_abstraction_layer.rst | 5 - doc/guides/prog_guide/ring_lib.rst | 7 - doc/guides/sample_app_ug/server_node_efd.rst | 2 +- drivers/crypto/null/null_crypto_pmd.c | 2 +- drivers/crypto/null/null_crypto_pmd_ops.c | 2 +- drivers/net/bonding/rte_eth_bond_pmd.c | 3 +- drivers/net/ring/rte_eth_ring.c | 4 +- examples/distributor/main.c | 5 +- examples/load_balancer/runtime.c | 34 +- .../client_server_mp/mp_client/client.c | 9 +- .../client_server_mp/mp_server/main.c | 2 +- examples/packet_ordering/main.c | 13 +- examples/qos_sched/app_thread.c | 14 +- examples/quota_watermark/qw/init.c | 5 +- examples/quota_watermark/qw/main.c | 15 +- examples/quota_watermark/qw/main.h | 1 + examples/quota_watermark/qwctl/commands.c | 2 +- examples/quota_watermark/qwctl/qwctl.c | 2 + examples/quota_watermark/qwctl/qwctl.h | 1 + examples/server_node_efd/node/node.c | 2 +- examples/server_node_efd/server/main.c | 2 +- lib/librte_hash/rte_cuckoo_hash.c | 5 +- lib/librte_mempool/rte_mempool_ring.c | 12 +- lib/librte_pdump/rte_pdump.c | 2 +- lib/librte_port/rte_port_frag.c | 3 +- lib/librte_port/rte_port_ras.c | 2 +- lib/librte_port/rte_port_ring.c | 34 +- lib/librte_ring/Makefile | 2 + lib/librte_ring/rte_event_ring.c | 220 ++++++ lib/librte_ring/rte_event_ring.h | 507 ++++++++++++++ lib/librte_ring/rte_ring.c | 82 +-- lib/librte_ring/rte_ring.h | 762 ++++++++------------- 47 files changed, 1340 insertions(+), 1373 deletions(-) create mode 100644 app/test/test_event_ring.c create mode 100644 lib/librte_ring/rte_event_ring.c create mode 100644 lib/librte_ring/rte_event_ring.h -- 2.9.3