From: "Hunt, David"
To: Bruce Richardson
Cc: dev@dpdk.org
Date: Thu, 19 Jan 2017 12:07:41 +0000
Subject: Re: [dpdk-dev] [PATCH v4 1/6] lib: distributor performance enhancements
Message-ID: <70254cc9-2f57-00f0-cde0-bbf0200cb370@intel.com>
In-Reply-To: <20170116163654.GA26296@bricha3-MOBL3.ger.corp.intel.com>
References: <1482381428-148094-2-git-send-email-david.hunt@intel.com>
 <1483948248-91364-1-git-send-email-david.hunt@intel.com>
 <1483948248-91364-2-git-send-email-david.hunt@intel.com>
 <20170116163654.GA26296@bricha3-MOBL3.ger.corp.intel.com>
List-Id: DPDK patches and discussions

Thanks for the comments Bruce. Addressed below.

On 16/1/2017 4:36 PM, Bruce Richardson wrote:
> On Mon, Jan 09, 2017 at 07:50:43AM +0000, David Hunt wrote:
>> Now sends bursts of up to 8 mbufs to each worker, and tracks
>> the in-flight flow-ids (atomic scheduling)
>>
>> New file with a new api, similar to the old API except with _burst
>> at the end of the function names
>>
> Can you explain why this is necessary, and also how the new version
> works compared to the old. I know this is explained in the cover letter,
> but the cover letter does not make the git commit log.

Sure. I'll add extra detail to the git commit message. The main reason
for the new file is to preserve the original API, which gives users the
choice of migrating to the new API if and when they wish.

>> Signed-off-by: David Hunt
>> ---
>
>> diff --git a/lib/librte_distributor/rte_distributor_burst.c b/lib/librte_distributor/rte_distributor_burst.c
>> new file mode 100644
>> index 0000000..ae7cf9d
>> --- /dev/null
>> +++ b/lib/librte_distributor/rte_distributor_burst.c
>> @@ -0,0 +1,558 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2016 Intel Corporation. All rights reserved.
> Update year since we aren't in 2016 any more.
>
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of Intel Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include "rte_distributor_priv.h"
>> +#include "rte_distributor_burst.h"
>> +
>> +TAILQ_HEAD(rte_dist_burst_list, rte_distributor_burst);
>> +
>> +static struct rte_tailq_elem rte_dist_burst_tailq = {
>> +	.name = "RTE_DIST_BURST",
>> +};
>> +EAL_REGISTER_TAILQ(rte_dist_burst_tailq)
>> +
>> +/**** APIs called by workers ****/
>> +
>> +/**** Burst Packet APIs called by workers ****/
>> +
>> +/* This function should really be called return_pkt_burst() */
> 1) Why should it be?
> 2) Why isn't it called that?
> Please explain the naming.

It seemed to me that the main use of this function was to return packets
from the worker rather than to request new ones, while also toggling the
handshake bit that tells the distributor to send more packets. Since it
does both, I guess the name is OK as it is. I've removed the comment to
avoid the confusion.

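To make the "return and request in one call" behaviour concrete, here is a minimal sketch of how a worker could pack returned mbuf pointers into 64-bit slots and then raise the request bit. It follows the shift-and-flag expression quoted later in this mail, but everything prefixed SKETCH_ or sketch_ is hypothetical: the flag values and the helper names are assumptions for illustration, not the definitions from rte_distributor_priv.h, and there are no memory barriers here.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define SKETCH_FLAG_BITS  4   /* low bits of each slot reserved for flags */
#define SKETCH_GET_BUF    1   /* worker asks for attention / more packets */
#define SKETCH_RETURN_BUF 2   /* slot carries a returned packet */
#define SKETCH_BURST_SIZE 8

/* Pack one returned pointer plus the RETURN flag into a 64-bit slot. */
static int64_t
sketch_pack_return(void *pkt)
{
    uint64_t slot = (uint64_t)(uintptr_t)pkt << SKETCH_FLAG_BITS;

    return (int64_t)(slot | SKETCH_RETURN_BUF);
}

/* Recover the pointer by shifting the flag bits back out. */
static void *
sketch_unpack(int64_t slot)
{
    return (void *)(uintptr_t)((uint64_t)slot >> SKETCH_FLAG_BITS);
}

/*
 * Hand back 'count' packets (zero is allowed) and ask for more:
 * fill the returns line, flag it on slot 0, then set the request
 * bit on slot 0 of the packet line.
 */
static void
sketch_request(int64_t *retptr64, int64_t *bufptr64,
        void **oldpkt, unsigned int count)
{
    unsigned int i;

    for (i = 0; i < SKETCH_BURST_SIZE; i++)
        retptr64[i] = 0;
    for (i = 0; i < count && i < SKETCH_BURST_SIZE; i++)
        retptr64[i] = sketch_pack_return(oldpkt[i]);

    retptr64[0] |= SKETCH_GET_BUF;  /* returns line is ready */
    bufptr64[0] |= SKETCH_GET_BUF;  /* please send more packets */
}
```

Leaving out the last line of sketch_request() would give the "return only" behaviour, which is how I read the difference between the request and return functions in this thread.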
>> +void
>> +rte_distributor_request_pkt_burst(struct rte_distributor_burst *d,
>> +		unsigned int worker_id, struct rte_mbuf **oldpkt,
>> +		unsigned int count)
>> +{
>> +	struct rte_distributor_buffer_burst *buf = &(d->bufs[worker_id]);
>> +	unsigned int i;
>> +
>> +	volatile int64_t *retptr64;
>> +
>> +
>> +	/* if we dont' have any packets to return, return. */
>> +	if (count == 0)
>> +		return;
>> +
> So if we don't return anything we don't get any more packets, right?
> What happens if we return fewer packets than we were previously given?
> If that is allowed, why the restriction on returning at least one?

You are correct. We should be able to return zero packets and still flip
the handshake bit to request more. This check will be removed.

>> +	retptr64 = &(buf->retptr64[0]);
>
>> +
>> +int
>> +rte_distributor_get_pkt_burst(struct rte_distributor_burst *d,
>> +		unsigned int worker_id, struct rte_mbuf **pkts,
>> +		struct rte_mbuf **oldpkt, unsigned int return_count)
>> +{
>> +	unsigned int count;
>> +	uint64_t retries = 0;
>> +
>> +	rte_distributor_request_pkt_burst(d, worker_id, oldpkt, return_count);
>> +
>> +	count = rte_distributor_poll_pkt_burst(d, worker_id, pkts);
>> +	while (count == 0) {
>> +		rte_pause();
>> +		retries++;
>> +		if (retries > 1000)
>> +			return 0;
> This behaviour is different to the original get_pkt() behaviour in that
> it has a timeout. Why the change to add the timeout, and should the
> timeout not be user configurable in some way?

I had another look at this and managed to clean up the logic. There is
no longer a need for the retry. In the old API, the poll_pkt function
returned a pointer, or NULL when the handshake bit was not ready. The new
code followed the same pattern up until now, but returned 0 both when the
bit was not ready and when the bit was ready but the number of valid
pointers was 0. That left the loop with no way to break out when the
application was exiting or flushing.
I've now introduced a -1 return for when the bit is not ready, so the
loop keeps going in that case. When the distributor sets the bit with no
packets, the poll_pkt function returns 0, which allows the loop to exit
and return to the caller. Thanks for that comment, Bruce, it's fixed a
major shortcoming in the logic.

>> +
>> +		uint64_t t = rte_rdtsc()+100;
> need spaces around the "+"

Done.

>> +
>> +		while (rte_rdtsc() < t)
>> +			rte_pause();
>> +
>> +		count = rte_distributor_poll_pkt_burst(d, worker_id, pkts);
>> +	}
>> +	return count;
>> +}
>> +
>> +int
>> +rte_distributor_return_pkt_burst(struct rte_distributor_burst *d,
>> +		unsigned int worker_id, struct rte_mbuf **oldpkt, int num)
>> +{
>> +	struct rte_distributor_buffer_burst *buf = &d->bufs[worker_id];
>> +	unsigned int i;
>> +
>> +	for (i = 0; i < RTE_DIST_BURST_SIZE; i++)
>> +		/* Switch off the return bit first */
>> +		buf->retptr64[i] &= ~RTE_DISTRIB_RETURN_BUF;
>> +
>> +	for (i = num; i-- > 0; )
>> +		buf->retptr64[i] = (((int64_t)(uintptr_t)oldpkt[i]) <<
>> +			RTE_DISTRIB_FLAG_BITS) | RTE_DISTRIB_RETURN_BUF;
>> +
>> +	/* set the GET_BUF but even if we got no returns */
>> +	buf->retptr64[0] |= RTE_DISTRIB_GET_BUF;
> Does this mean we are requesting more packets here?

No, we're setting the bit in retptr64, which means that the distributor
will start processing the returns cacheline. The only way to request more
packets is

    buf->bufptr64[0] |= RTE_DISTRIB_GET_BUF;

rte_distributor_return_pkt_burst() is usually called when you are
shutting down a thread and want to return what you have without
requesting any new packets from the distributor.

>
>> +
>> +	return 0;
>> +}
>> +
>> +/**** APIs called on distributor core ***/
>> +
>
>> +
>> +static unsigned int
>> +release(struct rte_distributor_burst *d, unsigned int wkr)
> I think this function needs a comment describing what it is doing,
> and where is it called from and why. Other functions on distributor side
> probably need the same thing too.

Done.

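Going back to the poll return values discussed above, the three-way convention (-1 while the handshake is pending, 0 for an answered but empty burst, n for n valid packets) can be sketched as below. This is only an illustration under stated assumptions: the SKETCH_ flag values, sketch_poll(), sketch_get() and the step callback are hypothetical names, and the callback stands in for rte_pause() plus whatever the distributor core does in the meantime.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define SKETCH_GET_BUF    1   /* set by the worker, cleared by the distributor */
#define SKETCH_VALID_BUF  4   /* slot holds a packet for the worker */
#define SKETCH_BURST_SIZE 8

/*
 * -1     : request bit still set on slot 0, distributor has not answered
 *  0     : distributor answered with no packets (exit or flush)
 *  1..8  : number of valid slots in the burst
 */
static int
sketch_poll(const int64_t *bufptr64)
{
    int i, count = 0;

    if (bufptr64[0] & SKETCH_GET_BUF)
        return -1;

    for (i = 0; i < SKETCH_BURST_SIZE; i++)
        if (bufptr64[i] & SKETCH_VALID_BUF)
            count++;
    return count;
}

/* Hypothetical hook run between polls. */
typedef void (*sketch_step_fn)(int64_t *bufptr64);

/* Spin only while the handshake is pending; an empty answer ends the loop. */
static int
sketch_get(int64_t *bufptr64, sketch_step_fn step)
{
    int count;

    while ((count = sketch_poll(bufptr64)) < 0)
        step(bufptr64);
    return count;
}

/* Example step: the distributor answers with an empty burst. */
static void
sketch_answer_empty(int64_t *bufptr64)
{
    int i;

    for (i = 0; i < SKETCH_BURST_SIZE; i++)
        bufptr64[i] = 0;
}
```

With this shape the caller needs no retry counter or timeout: a return of 0 from sketch_get() is a definite "nothing for you", which is the case that could not be told apart from "not ready" before.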
>> +{
>> +	struct rte_distributor_buffer_burst *buf = &(d->bufs[wkr]);
>> +	unsigned int i;
>> +
>> +	if (d->backlog[wkr].count == 0)
>> +		return 0;
>> +
>> +	while (!(d->bufs[wkr].bufptr64[0] & RTE_DISTRIB_GET_BUF))
>> +		rte_pause();
>> +
>> +	handle_returns(d, wkr);
>> +
>> +	buf->count = 0;
>> +
>> +	for (i = 0; i < d->backlog[wkr].count; i++) {
>> +		d->bufs[wkr].bufptr64[i] = d->backlog[wkr].pkts[i] |
>> +				RTE_DISTRIB_GET_BUF | RTE_DISTRIB_VALID_BUF;
>> +		d->in_flight_tags[wkr][i] = d->backlog[wkr].tags[i];
>> +	}
>> +	buf->count = i;
>> +	for ( ; i < RTE_DIST_BURST_SIZE ; i++) {
>> +		buf->bufptr64[i] = RTE_DISTRIB_GET_BUF;
>> +		d->in_flight_tags[wkr][i] = 0;
>> +	}
>> +
>> +	d->backlog[wkr].count = 0;
>> +
>> +	/* Clear the GET bit */
>> +	buf->bufptr64[0] &= ~RTE_DISTRIB_GET_BUF;
>> +	return buf->count;
>> +
>> +}
>
>> +/**
>> + * API called by a worker to get new packets to process. Any previous packets
>> + * given to the worker is assumed to have completed processing, and may be
>> + * optionally returned to the distributor via the oldpkt parameter.
>> + *
>> + * @param d
>> + *   The distributor instance to be used
>> + * @param worker_id
>> + *   The worker instance number to use - must be less that num_workers passed
>> + *   at distributor creation time.
>> + * @param pkts
>> + *   The mbufs pointer array to be filled in (up to 8 packets)
>> + * @param oldpkt
>> + *   The previous packet, if any, being processed by the worker
>> + * @param retcount
>> + *   The number of packets being returneda
> I think you need to document that it can't be zero, if I read the above
> C implementation correctly.

It can be zero now, after resolving the issues indicated above. We should
be able to return zero to indicate that we've processed all the packets
in the burst but are not returning any (i.e.
drop)

>> + *
>> + * @return
>> + *   The number of packets in the pkts array
>> + */
>> +int
>> +rte_distributor_get_pkt_burst(struct rte_distributor_burst *d,
>> +	unsigned int worker_id, struct rte_mbuf **pkts,
>> +	struct rte_mbuf **oldpkt, unsigned int retcount);
>> +
>> +/**
>
>> +
>> +/**
>> + * Number of packets to deal with in bursts. Needs to be 8 so as to
>> + * fit in one cache line.
>> + */
>> +#define RTE_DIST_BURST_SIZE (sizeof(__m128i) / sizeof(uint16_t))
> Does this compile for non-x86 with the references to __m128i?

Changed to rte_xmm_t

>> +
>
>> +
>> +	struct rte_distributor_returned_pkts returns;
>> +};
>> +
>> +/* All different signature compare functions */
>> +enum rte_distributor_match_function {
>> +	RTE_DIST_MATCH_SCALAR = 0,
>> +	RTE_DIST_MATCH_NUM
> I think this last entry should be "RTE_DIST_NUM_MATCH_FNS", as
> "NUM" is not a match function, and the define doesn't ready right.

Done.

>> +};
>> +
>> +struct rte_distributor_burst {
>> +	TAILQ_ENTRY(rte_distributor_burst) next;    /**< Next in list. */
>> +
>> +	char name[RTE_DISTRIBUTOR_NAMESIZE];  /**< Name of the ring. */
>> +	unsigned int num_workers;             /**< Number of workers polling */
>> +
>> +	/**>
>> +	 * First cache line in the this array are the tags inflight
>> +	 * on the worker core. Second cache line are the backlog
>> +	 * that are going to go to the worker core.
>> +	 */
>> +	uint16_t in_flight_tags[RTE_DISTRIB_MAX_WORKERS][RTE_DIST_BURST_SIZE*2]
>> +			__rte_cache_aligned;
>> +
>> +	struct rte_distributor_backlog backlog[RTE_DISTRIB_MAX_WORKERS]
>> +			__rte_cache_aligned;
>> +
>> +	struct rte_distributor_buffer_burst bufs[RTE_DISTRIB_MAX_WORKERS];
>> +
>> +	struct rte_distributor_returned_pkts returns;
>> +
>> +	enum rte_distributor_match_function dist_match_fn;
>> +};
>> +
>> +#ifdef __cplusplus
>> +}
>> +#endif
>> +
>> +#endif
>> diff --git a/lib/librte_distributor/rte_distributor_version.map b/lib/librte_distributor/rte_distributor_version.map
>> index 73fdc43..39795a1 100644
>> --- a/lib/librte_distributor/rte_distributor_version.map
>> +++ b/lib/librte_distributor/rte_distributor_version.map
>> @@ -2,14 +2,23 @@ DPDK_2.0 {
>>  	global:
>>  	rte_distributor_clear_returns;
>> +	rte_distributor_clear_returns_burst;
>>  	rte_distributor_create;
>> +	rte_distributor_create_burst;
>>  	rte_distributor_flush;
>> +	rte_distributor_flush_burst;
>>  	rte_distributor_get_pkt;
>> +	rte_distributor_get_pkt_burst;
>>  	rte_distributor_poll_pkt;
>> +	rte_distributor_poll_pkt_burst;
>>  	rte_distributor_process;
>> +	rte_distributor_process_burst;
>>  	rte_distributor_request_pkt;
>> +	rte_distributor_request_pkt_burst;
>>  	rte_distributor_return_pkt;
>> +	rte_distributor_return_pkt_burst;
>>  	rte_distributor_returned_pkts;
>> +	rte_distributor_returned_pkts_burst;
>>  	local: *;
>>  };
> The new functions are not present in DPDK 2.0, so you need a new node
> for the 17.02 release.

Sure.

> Regards,
> /Bruce
>

Thanks Bruce. I'll get a new revision up later today.

Regards,
Dave.
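
On the RTE_DIST_BURST_SIZE point, the arithmetic behind the "needs to be 8" comment can be checked at compile time without any x86-only type. The sketch below is an assumption-laden illustration: sketch_xmm_t is a hypothetical 128-bit union standing in for the portable vector type, and the 64-byte cache line is assumed rather than taken from the build configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a portable 128-bit vector type. */
typedef union {
    uint8_t  u8[16];
    uint16_t u16[8];
    uint64_t u64[2];
} sketch_xmm_t;

/* Eight 16-bit flow tags fill one 128-bit register. */
#define SKETCH_BURST_SIZE (sizeof(sketch_xmm_t) / sizeof(uint16_t))

/* Assumed cache line size in bytes. */
#define SKETCH_CACHE_LINE 64

/* 16 bytes / 2 bytes per tag = 8 entries per burst. */
_Static_assert(SKETCH_BURST_SIZE == 8,
        "burst size must be 8 tags per 128-bit vector");

/* 8 slots of 64-bit pointers occupy exactly one 64-byte cache line. */
_Static_assert(SKETCH_BURST_SIZE * sizeof(int64_t) == SKETCH_CACHE_LINE,
        "one burst of pointers must fit one cache line");
```

If either assumption were wrong for a target (a different vector width or cache line size), the build would fail here rather than misbehaving at run time.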