Hi Konstantin, > -----Original Message----- > From: Ananyev, Konstantin > Sent: Tuesday, March 19, 2019 6:15 PM > To: Gavin Hu (Arm Technology China) ; dev@dpdk.org > Cc: nd ; stephen@networkplumber.org; > jerin.jacob@caviumnetworks.com; thomas@monjalon.net; Honnappa > Nagarahalli ; Joyce Kong (Arm Technology > China) > Subject: RE: [dpdk-dev] [PATCH v6 1/2] eal/ticketlock: ticket based to improve > fairness > > > Hi Gavin, > > > > > > > > diff --git a/lib/librte_eal/common/include/generic/rte_ticketlock.h > > > b/lib/librte_eal/common/include/generic/rte_ticketlock.h > > > > new file mode 100644 > > > > index 0000000..d63aaaa > > > > --- /dev/null > > > > +++ b/lib/librte_eal/common/include/generic/rte_ticketlock.h > > > > @@ -0,0 +1,308 @@ > > > > +/* SPDX-License-Identifier: BSD-3-Clause > > > > + * Copyright(c) 2019 Arm Limited > > > > + */ > > > > + > > > > +#ifndef _RTE_TICKETLOCK_H_ > > > > +#define _RTE_TICKETLOCK_H_ > > > > + > > > > +/** > > > > + * @file > > > > + * > > > > + * RTE ticket locks > > > > + * > > > > + * This file defines an API for ticket locks, which give each waiting > > > > + * thread a ticket and take the lock one by one, first come, first > > > > + * serviced. > > > > + * > > > > + * All locks must be initialised before use, and only initialised once. > > > > + * > > > > + */ > > > > + > > > > +#ifdef __cplusplus > > > > +extern "C" { > > > > +#endif > > > > + > > > > +#include > > > > +#include > > > > +#include > > > > + > > > > +/** > > > > + * The rte_ticketlock_t type. > > > > + */ > > > > +typedef struct { > > > > + uint16_t current; > > > > + uint16_t next; > > > > +} rte_ticketlock_t; > > > > + > > > > +/** > > > > + * A static ticketlock initializer. > > > > + */ > > > > +#define RTE_TICKETLOCK_INITIALIZER { 0 } > > > > + > > > > +/** > > > > + * Initialize the ticketlock to an unlocked state. > > > > + * > > > > + * @param tl > > > > + * A pointer to the ticketlock. > > > > + */ > > > > +static inline __rte_experimental void > > > > +rte_ticketlock_init(rte_ticketlock_t *tl) > > > > +{ > > > > + __atomic_store_n(&tl->current, 0, __ATOMIC_RELAXED); > > > > + __atomic_store_n(&tl->next, 0, __ATOMIC_RELAXED); > > > > +} > > > > + > > > > +/** > > > > + * Take the ticketlock. > > > > + * > > > > + * @param tl > > > > + * A pointer to the ticketlock. > > > > + */ > > > > +static inline __rte_experimental void > > > > +rte_ticketlock_lock(rte_ticketlock_t *tl) > > > > +{ > > > > + uint16_t me = __atomic_fetch_add(&tl->next, 1, > > > __ATOMIC_RELAXED); > > > > + while (__atomic_load_n(&tl->current, __ATOMIC_ACQUIRE) != me) > > > > + rte_pause(); > > > > +} > > > > + > > > > +/** > > > > + * Release the ticketlock. > > > > + * > > > > + * @param tl > > > > + * A pointer to the ticketlock. > > > > + */ > > > > +static inline __rte_experimental void > > > > +rte_ticketlock_unlock(rte_ticketlock_t *tl) > > > > +{ > > > > + uint16_t i = __atomic_load_n(&tl->current, __ATOMIC_RELAXED); > > > > + __atomic_store_n(&tl->current, i+1, __ATOMIC_RELEASE); > > > > +} > > > > + > > > > +/** > > > > + * Try to take the lock. > > > > + * > > > > + * @param tl > > > > + * A pointer to the ticketlock. > > > > + * @return > > > > + * 1 if the lock is successfully taken; 0 otherwise. > > > > + */ > > > > +static inline __rte_experimental int > > > > +rte_ticketlock_trylock(rte_ticketlock_t *tl) > > > > +{ > > > > + uint16_t next = __atomic_load_n(&tl->next, __ATOMIC_RELAXED); > > > > + uint16_t cur = __atomic_load_n(&tl->current, __ATOMIC_RELAXED); > > > > + if (next == cur) { > > > > > > Probably a naïve one: > > > Suppose next==cur==1 here, then this thread will experience really long > > > context switch, > > > > By saying context switch, do you mean running to here, it is out of CPU time > and starving for CPU? > > Yes. > > > > > > so next time it continues its execution tl->next value will wrap-up and will > > > be 1 again, and tl->current==0 (lock held). > > > I suppose this function will set tl->next=2 and will return a success? > > > > If this thread was swapped out and another thread took/attempted to take > the lock, yes, tl->next == 2 here, > > But as next == 1 unchanged, so it would not return a success. > > I am not talking about situation when tl->next == 2,tl->current==1 (just one > lock() was executed by different thread). > I am talking about situation when this thread was out of cpu for significant > amount of cycles, > and in that period tl->next and tl->current were wrapped around (they both > reached UINT16_MAX, then 0). > i.e. UINT16_MAX lock/unlock were executed while this thread was away from > cpu. > After that another thread just did successful lock(), so tl->next==1 and tl- > >current==0. > Now this thread wakeups and continues with: > __atomic_compare_exchange_n(&tl->next, &next, next+1, ...) > As both tl->next==1 and next==1, it will succeed. > So we have 2 threads assuming they grabbed the lock successfully. > Konstantin > Now I understood your points, but not sure if it is a rare or even impossible case for this thread stalls for CPU and during this time, the other threads have taken the lock for 2^16 times, to wrap up. Anyway I made a patch, currently in internal review to fix this issue, the basic idea is to compare not only the next, but also the current, and update the next(+1 and take the lock) only if both of them were not changed(or wrapped up and the lock released). I will submit the patch after internal review approved. Please let me know if you have more comments. > > > > > Wouldn't be better here and in _is_locked_ to do load/store for > > > next/current values in one go > > > (using 32bit op)? > > > Konstantin > > > > To load both in one go is feasible, but no necessary as we need to compare > them. > > We don't need store both as in this function tl->current is read only. > > tl->next is read-update-store, I ever thought of combining the two if- > statements to one __atomic_compare_exchange_n(&(&tl->next,&tl- > > >current, tl->next+1, ...), > > but tl->next+1 is out of atomicity and may be the old value and corrupt the > ticket lock waiting chain. > > > > The current code works ok except it may fail spuriously(in case during > context switch, the lock was taken and released by other threads, > > moving tl->next forward, in this case > > The lock is available but not obtained by this trylock). Anyway, as the name > suggests, it is a try/attempt, a spurious fail is not a big deal? > > And in most cases, dpdk running on dedicated cores, > > the context switch will not happen at all. > > > > Any more comments are welcome! > > > > > > > + if (__atomic_compare_exchange_n(&tl->next, &next, > > > next+1, > > > > + 0, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) > > > > + return 1; > > > > + } > > > > + > > > > + return 0; > > > > +} > > > > + > > > > +/** > > > > + * Test if the lock is taken. > > > > + * > > > > + * @param tl > > > > + * A pointer to the ticketlock. > > > > + * @return > > > > + * 1 if the lock icurrently taken; 0 otherwise. > > > > + */ > > > > +static inline __rte_experimental int > > > > +rte_ticketlock_is_locked(rte_ticketlock_t *tl) > > > > +{ > > > > + return (__atomic_load_n(&tl->current, __ATOMIC_ACQUIRE) != > > > > + __atomic_load_n(&tl->next, __ATOMIC_ACQUIRE)); > > > > +} > > > > +