From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 20 May 2014 14:18:44 -0400
From: Neil Horman
To: Bruce Richardson
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH 2/4] distributor: new packet distributor library
Message-ID: <20140520181844.GF6648@hmsreliant.think-freely.org>
References: <1400580057-30155-1-git-send-email-bruce.richardson@intel.com>
 <1400580057-30155-3-git-send-email-bruce.richardson@intel.com>
In-Reply-To: <1400580057-30155-3-git-send-email-bruce.richardson@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
User-Agent: Mutt/1.5.21 (2010-09-15)
List-Id: patches and discussions about DPDK

On Tue, May 20, 2014 at 11:00:55AM +0100, Bruce Richardson wrote:
> This adds the code for a new Intel DPDK library for packet distribution.
> The distributor is a component which is designed to pass packets
> one-at-a-time to workers, with dynamic load balancing. Using the RSS
> field in the mbuf as a tag, the distributor tracks what packet tag is
> being processed by what worker and then ensures that no two packets with
> the same tag are in-flight simultaneously. Once a tag is not in-flight,
> then the next packet with that tag will be sent to the next available
> core.
>
> Signed-off-by: Bruce Richardson
>
> +#define RTE_DISTRIB_GET_BUF (1)
> +#define RTE_DISTRIB_RETURN_BUF (2)
> +

Can you document the meaning of these bits, please? The code makes it
somewhat confusing to differentiate them. As I read the code, GET_BUF is
used as a flag to indicate that rte_distributor_get_pkt needs to wait
while a buffer is filled in by the processing thread, while RETURN_BUF
indicates that a worker is leaving and the buffer needs to be
(re)assigned to an alternate worker. Is that correct?

> +#define RTE_DISTRIB_BACKLOG_SIZE 8
> +#define RTE_DISTRIB_BACKLOG_MASK (RTE_DISTRIB_BACKLOG_SIZE - 1)
> +
> +#define RTE_DISTRIB_MAX_RETURNS 128
> +#define RTE_DISTRIB_RETURNS_MASK (RTE_DISTRIB_MAX_RETURNS - 1)
> +
> +union rte_distributor_buffer {
> +	volatile int64_t bufptr64;
> +	char pad[CACHE_LINE_SIZE*3];

Do you need the pad, if you mark the struct as cache aligned?

> +} __rte_cache_aligned;
> +
>
> +
> +struct rte_mbuf *
> +rte_distributor_get_pkt(struct rte_distributor *d,
> +		unsigned worker_id, struct rte_mbuf *oldpkt,
> +		unsigned reserved __rte_unused)
> +{
> +	union rte_distributor_buffer *buf = &d->bufs[worker_id];
> +	int64_t req = (((int64_t)(uintptr_t)oldpkt) << RTE_DISTRIB_FLAG_BITS) | \
> +			RTE_DISTRIB_GET_BUF;
> +	while (unlikely(buf->bufptr64 & RTE_DISTRIB_FLAGS_MASK))
> +		rte_pause();
> +	buf->bufptr64 = req;
> +	while (buf->bufptr64 & RTE_DISTRIB_GET_BUF)
> +		rte_pause();

You may want to document the fact that this is deadlock-prone. You
clearly state that only a single thread can run the processing routine,
but if a user selects a single worker thread to perform double duty, the
GET_BUF flag will never get cleared here, and no other queues will get
processed.
> +	/* since bufptr64 is a signed value, this should be an arithmetic shift */
> +	int64_t ret = buf->bufptr64 >> RTE_DISTRIB_FLAG_BITS;
> +	return (struct rte_mbuf *)((uintptr_t)ret);
> +}
> +
> +int
> +rte_distributor_return_pkt(struct rte_distributor *d,
> +		unsigned worker_id, struct rte_mbuf *oldpkt)
> +{

Maybe add some optional sanity checking, here and above, to ensure that
a packet returned through get_pkt doesn't also get returned here,
mangling the flags field?

>
> +
> +/* flush the distributor, so that there are no outstanding packets in flight or
> + * queued up. */
> +int
> +rte_distributor_flush(struct rte_distributor *d)
> +{

You need to document that this function can only be called by the same
thread that is running rte_distributor_process, lest you corrupt your
queue data. Alternatively, it might be nicer to modify this function's
internals to set a flag in the distributor status bits, and have the
process routine do the flush work when it sees the flag set. That would
allow this function to be called by any other thread, which seems like a
more natural interface.

>
> +}
> +
> +/* clears the internal returns array in the distributor */
> +void
> +rte_distributor_clear_returns(struct rte_distributor *d)
> +{

This can also only be called by the same thread that runs the process
routine, lest the start and count values get mis-assigned.
> +	d->returns.start = d->returns.count = 0;
> +#ifndef __OPTIMIZE__
> +	memset(d->returns.mbufs, 0, sizeof(d->returns.mbufs));
> +#endif
> +}
> +
> +/* creates a distributor instance */
> +struct rte_distributor *
> +rte_distributor_create(const char *name,
> +		unsigned socket_id,
> +		unsigned num_workers,
> +		struct rte_distributor_extra_args *args __rte_unused)
> +{
> +	struct rte_distributor *d;
> +	struct rte_distributor_list *distributor_list;
> +	char mz_name[RTE_MEMZONE_NAMESIZE];
> +	const struct rte_memzone *mz;
> +
> +	/* compilation-time checks */
> +	RTE_BUILD_BUG_ON((sizeof(*d) & CACHE_LINE_MASK) != 0);
> +	RTE_BUILD_BUG_ON((RTE_MAX_LCORE & 7) != 0);
> +
> +	if (name == NULL || num_workers >= RTE_MAX_LCORE) {
> +		rte_errno = EINVAL;
> +		return NULL;
> +	}
> +	rte_snprintf(mz_name, sizeof(mz_name), RTE_DISTRIB_PREFIX"%s", name);
> +	mz = rte_memzone_reserve(mz_name, sizeof(*d), socket_id, NO_FLAGS);
> +	if (mz == NULL) {
> +		rte_errno = ENOMEM;
> +		return NULL;
> +	}
> +
> +	/* check that we have an initialised tail queue */
> +	if ((distributor_list = RTE_TAILQ_LOOKUP_BY_IDX(RTE_TAILQ_DISTRIBUTOR,
> +			rte_distributor_list)) == NULL) {
> +		rte_errno = E_RTE_NO_TAILQ;
> +		return NULL;
> +	}
> +
> +	d = mz->addr;
> +	rte_snprintf(d->name, sizeof(d->name), "%s", name);
> +	d->num_workers = num_workers;
> +	TAILQ_INSERT_TAIL(distributor_list, d, next);

You need locking around this list, unless you intend to assert that
distributor creation and destruction must only be performed from a
single thread. Also, where is the API method to tear down a distributor
instance?

>
> +#endif
> +
> +#include
> +
> +#define RTE_DISTRIBUTOR_NAMESIZE 32 /**< Length of name for instance */
> +
> +struct rte_distributor;
> +
> +struct rte_distributor_extra_args { }; /**< reserved for future use */
> +

You don't need to reserve a struct name for future use. No one will use
it until you create it.
> +struct rte_mbuf *
> +rte_distributor_get_pkt(struct rte_distributor *d,
> +		unsigned worker_id, struct rte_mbuf *oldpkt, unsigned reserved);
> +

You don't need to reserve an extra argument here. You're not ABI-safe
currently, and if DPDK becomes ABI-safe in the future, we will use a
linker script to provide versions with backward compatibility easily
enough.

Neil