* [dpdk-dev] [PATCH RFC 0/4] Thread safe rte_vhost_enqueue_burst().
From: Ilya Maximets @ 2016-02-19 6:32 UTC
To: dev, Huawei Xie, Yuanhan Liu; +Cc: Ilya Maximets, Dyasly Sergey

The implementation of rte_vhost_enqueue_burst() is based on a lockless
ring-buffer algorithm and contains almost everything needed to be
thread-safe, but it is not. This set adds the required changes.

The first patch in the set is a standalone patch that fixes the
many-times-discussed issue with barriers on different architectures.

The second and third patches add fixes to make rte_vhost_enqueue_burst()
thread safe. The last one is a documentation fix.

Ilya Maximets (4):
  vhost: use SMP barriers instead of compiler ones.
  vhost: make buf vector for scatter RX local.
  vhost: avoid reordering of used->idx and last_used_idx updating.
  doc: add note about rte_vhost_enqueue_burst thread safety.

 .../prog_guide/thread_safety_dpdk_functions.rst |  1 +
 lib/librte_vhost/rte_virtio_net.h               |  1 -
 lib/librte_vhost/vhost_rxtx.c                   | 67 ++++++++++++----------
 3 files changed, 39 insertions(+), 30 deletions(-)

-- 
2.5.0
* [dpdk-dev] [PATCH RFC 1/4] vhost: use SMP barriers instead of compiler ones.
From: Ilya Maximets @ 2016-02-19 6:32 UTC
To: dev, Huawei Xie, Yuanhan Liu; +Cc: Ilya Maximets, Dyasly Sergey

Since commit 4c02e453cc62 ("eal: introduce SMP memory barriers")
virtio uses architecture dependent SMP barriers. vHost should use
them too.

Fixes: 4c02e453cc62 ("eal: introduce SMP memory barriers")

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
---
 lib/librte_vhost/vhost_rxtx.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 5e7e5b1..411dd95 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -274,7 +274,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 		}
 	}
 
-	rte_compiler_barrier();
+	rte_smp_wmb();
 
 	/* Wait until it's our turn to add our buffer to the used ring. */
 	while (unlikely(vq->last_used_idx != res_base_idx))
@@ -575,7 +575,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 		entry_success = copy_from_mbuf_to_vring(dev, queue_id,
 			res_base_idx, res_cur_idx, pkts[pkt_idx]);
 
-		rte_compiler_barrier();
+		rte_smp_wmb();
 
 		/*
 		 * Wait until it's our turn to add our buffer
@@ -917,7 +917,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 		entry_success++;
 	}
 
-	rte_compiler_barrier();
+	rte_smp_rmb();
 	vq->used->idx += entry_success;
 	/* Kick guest if required. */
 	if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
-- 
2.5.0
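
For readers less familiar with the distinction this patch relies on, here is a minimal sketch in portable C11. It is not DPDK code: demo_mailbox, demo_compiler_barrier(), demo_smp_wmb() and demo_smp_rmb() are hypothetical stand-ins built on <stdatomic.h>, assumed here to behave roughly like rte_compiler_barrier(), rte_smp_wmb() and rte_smp_rmb(). A compiler barrier only stops the compiler from reordering accesses; an SMP barrier additionally constrains the CPU where the architecture needs it. On strongly ordered CPUs the two may well compile to the same thing, which is probably why the difference only shows up on other architectures.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical one-slot mailbox, used only for this illustration. */
struct demo_mailbox {
	uint32_t payload;     /* plain data written by the producer */
	_Atomic uint16_t idx; /* index the consumer polls */
};

/* Compiler-only barrier: constrains the compiler, emits no CPU fence. */
static inline void demo_compiler_barrier(void)
{
	atomic_signal_fence(memory_order_seq_cst);
}

/* Stand-in for a write barrier: earlier stores are ordered before later ones. */
static inline void demo_smp_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

/* Stand-in for a read barrier: later loads are ordered after earlier ones. */
static inline void demo_smp_rmb(void)
{
	atomic_thread_fence(memory_order_acquire);
}

/* Producer: write the payload first, then publish it by moving the index. */
static inline void demo_publish(struct demo_mailbox *m, uint32_t value)
{
	uint16_t next = (uint16_t)(atomic_load_explicit(&m->idx,
				memory_order_relaxed) + 1);

	m->payload = value;
	demo_smp_wmb(); /* payload must be visible before the index moves */
	atomic_store_explicit(&m->idx, next, memory_order_relaxed);
}

/* Consumer: returns 1 and fills *out if the index moved past 'seen'. */
static inline int demo_try_consume(struct demo_mailbox *m, uint16_t seen,
				   uint32_t *out)
{
	if (atomic_load_explicit(&m->idx, memory_order_relaxed) == seen)
		return 0;
	demo_smp_rmb(); /* read the payload only after seeing the new index */
	*out = m->payload;
	return 1;
}
```

If demo_smp_wmb() in demo_publish() were replaced by demo_compiler_barrier(), the code would still compile and would likely behave the same on x86, but a weakly ordered CPU would be free to make the index store visible before the payload store.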
* [dpdk-dev] [PATCH RFC 2/4] vhost: make buf vector for scatter RX local.
From: Ilya Maximets @ 2016-02-19 6:32 UTC
To: dev, Huawei Xie, Yuanhan Liu; +Cc: Ilya Maximets, Dyasly Sergey

The array of buf_vector entries is just temporary storage for information
about available descriptors. It is used only locally in virtio_dev_merge_rx()
and there is no reason for that array to be shared.

Fix that by allocating a local buf_vec inside virtio_dev_merge_rx().

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
---
 lib/librte_vhost/rte_virtio_net.h |  1 -
 lib/librte_vhost/vhost_rxtx.c     | 45 ++++++++++++++++++++-------------------
 2 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 10dcb90..ae1e4fb 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -91,7 +91,6 @@ struct vhost_virtqueue {
 	int kickfd; /**< Currently unused as polling mode is enabled. */
 	int enabled;
 	uint64_t reserved[16]; /**< Reserve some spaces for future extension. */
-	struct buf_vector buf_vec[BUF_VECTOR_MAX]; /**< for scatter RX. 
*/ } __rte_cache_aligned; diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c index 411dd95..9095fb1 100644 --- a/lib/librte_vhost/vhost_rxtx.c +++ b/lib/librte_vhost/vhost_rxtx.c @@ -295,7 +295,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id, static inline uint32_t __attribute__((always_inline)) copy_from_mbuf_to_vring(struct virtio_net *dev, uint32_t queue_id, uint16_t res_base_idx, uint16_t res_end_idx, - struct rte_mbuf *pkt) + struct rte_mbuf *pkt, struct buf_vector *buf_vec) { uint32_t vec_idx = 0; uint32_t entry_success = 0; @@ -325,7 +325,7 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint32_t queue_id, */ vq = dev->virtqueue[queue_id]; - vb_addr = gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr); + vb_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr); vb_hdr_addr = vb_addr; /* Prefetch buffer address. */ @@ -345,19 +345,19 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint32_t queue_id, seg_avail = rte_pktmbuf_data_len(pkt); vb_offset = vq->vhost_hlen; - vb_avail = vq->buf_vec[vec_idx].buf_len - vq->vhost_hlen; + vb_avail = buf_vec[vec_idx].buf_len - vq->vhost_hlen; entry_len = vq->vhost_hlen; if (vb_avail == 0) { uint32_t desc_idx = - vq->buf_vec[vec_idx].desc_idx; + buf_vec[vec_idx].desc_idx; if ((vq->desc[desc_idx].flags & VRING_DESC_F_NEXT) == 0) { /* Update used ring with desc information */ vq->used->ring[cur_idx & (vq->size - 1)].id - = vq->buf_vec[vec_idx].desc_idx; + = buf_vec[vec_idx].desc_idx; vq->used->ring[cur_idx & (vq->size - 1)].len = entry_len; @@ -367,12 +367,12 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint32_t queue_id, } vec_idx++; - vb_addr = gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr); + vb_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr); /* Prefetch buffer address. 
*/ rte_prefetch0((void *)(uintptr_t)vb_addr); vb_offset = 0; - vb_avail = vq->buf_vec[vec_idx].buf_len; + vb_avail = buf_vec[vec_idx].buf_len; } cpy_len = RTE_MIN(vb_avail, seg_avail); @@ -399,11 +399,11 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint32_t queue_id, * entry reach to its end. * But the segment doesn't complete. */ - if ((vq->desc[vq->buf_vec[vec_idx].desc_idx].flags & + if ((vq->desc[buf_vec[vec_idx].desc_idx].flags & VRING_DESC_F_NEXT) == 0) { /* Update used ring with desc information */ vq->used->ring[cur_idx & (vq->size - 1)].id - = vq->buf_vec[vec_idx].desc_idx; + = buf_vec[vec_idx].desc_idx; vq->used->ring[cur_idx & (vq->size - 1)].len = entry_len; entry_len = 0; @@ -413,9 +413,9 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint32_t queue_id, vec_idx++; vb_addr = gpa_to_vva(dev, - vq->buf_vec[vec_idx].buf_addr); + buf_vec[vec_idx].buf_addr); vb_offset = 0; - vb_avail = vq->buf_vec[vec_idx].buf_len; + vb_avail = buf_vec[vec_idx].buf_len; cpy_len = RTE_MIN(vb_avail, seg_avail); } else { /* @@ -434,7 +434,7 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint32_t queue_id, * from buf_vec. */ uint32_t desc_idx = - vq->buf_vec[vec_idx].desc_idx; + buf_vec[vec_idx].desc_idx; if ((vq->desc[desc_idx].flags & VRING_DESC_F_NEXT) == 0) { @@ -456,9 +456,9 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint32_t queue_id, /* Get next buffer from buf_vec. 
*/ vec_idx++; vb_addr = gpa_to_vva(dev, - vq->buf_vec[vec_idx].buf_addr); + buf_vec[vec_idx].buf_addr); vb_avail = - vq->buf_vec[vec_idx].buf_len; + buf_vec[vec_idx].buf_len; vb_offset = 0; } @@ -471,7 +471,7 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint32_t queue_id, */ /* Update used ring with desc information */ vq->used->ring[cur_idx & (vq->size - 1)].id - = vq->buf_vec[vec_idx].desc_idx; + = buf_vec[vec_idx].desc_idx; vq->used->ring[cur_idx & (vq->size - 1)].len = entry_len; entry_success++; @@ -485,7 +485,7 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint32_t queue_id, static inline void __attribute__((always_inline)) update_secure_len(struct vhost_virtqueue *vq, uint32_t id, - uint32_t *secure_len, uint32_t *vec_idx) + uint32_t *secure_len, uint32_t *vec_idx, struct buf_vector *buf_vec) { uint16_t wrapped_idx = id & (vq->size - 1); uint32_t idx = vq->avail->ring[wrapped_idx]; @@ -496,9 +496,9 @@ update_secure_len(struct vhost_virtqueue *vq, uint32_t id, do { next_desc = 0; len += vq->desc[idx].len; - vq->buf_vec[vec_id].buf_addr = vq->desc[idx].addr; - vq->buf_vec[vec_id].buf_len = vq->desc[idx].len; - vq->buf_vec[vec_id].desc_idx = idx; + buf_vec[vec_id].buf_addr = vq->desc[idx].addr; + buf_vec[vec_id].buf_len = vq->desc[idx].len; + buf_vec[vec_id].desc_idx = idx; vec_id++; if (vq->desc[idx].flags & VRING_DESC_F_NEXT) { @@ -523,6 +523,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id, uint16_t avail_idx; uint16_t res_base_idx, res_cur_idx; uint8_t success = 0; + struct buf_vector buf_vec[BUF_VECTOR_MAX]; LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_merge_rx()\n", dev->device_fh); @@ -561,8 +562,8 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id, if (unlikely(res_cur_idx == avail_idx)) goto merge_rx_exit; - update_secure_len(vq, res_cur_idx, - &secure_len, &vec_idx); + update_secure_len(vq, res_cur_idx, &secure_len, + &vec_idx, buf_vec); res_cur_idx++; } while (pkt_len > secure_len); @@ -573,7 +574,7 @@ 
virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 		} while (success == 0);
 
 		entry_success = copy_from_mbuf_to_vring(dev, queue_id,
-			res_base_idx, res_cur_idx, pkts[pkt_idx]);
+			res_base_idx, res_cur_idx, pkts[pkt_idx], buf_vec);
 
 		rte_smp_wmb();
 
-- 
2.5.0
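
The idea behind this patch can be shown in isolation. The sketch below is not the vhost code; demo_queue, demo_vec, demo_collect() and demo_enqueue() are hypothetical names, and the addresses are placeholders. It only illustrates the pattern: scratch storage that used to live in the shared queue structure becomes a stack array owned by the caller and is handed to helpers through an extra parameter, so two threads working on the same queue no longer write into the same scratch array.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_VEC_MAX 8

/* Hypothetical scratch entry, loosely modelled on struct buf_vector. */
struct demo_vec {
	uint64_t addr;
	uint32_t len;
};

/* Hypothetical shared queue state: note that it holds no scratch array. */
struct demo_queue {
	const uint32_t *desc_len; /* lengths of the descriptors in a chain */
	size_t nr_desc;
};

/* Helper fills the caller's scratch vector instead of a field of *q. */
static size_t demo_collect(const struct demo_queue *q, struct demo_vec *vec)
{
	size_t i;

	for (i = 0; i < q->nr_desc && i < DEMO_VEC_MAX; i++) {
		vec[i].addr = 0x1000u * (i + 1); /* placeholder address */
		vec[i].len = q->desc_len[i];
	}
	return i;
}

/* Each call owns its scratch space on its own stack. */
static uint32_t demo_enqueue(const struct demo_queue *q)
{
	struct demo_vec vec[DEMO_VEC_MAX]; /* local, like buf_vec in the patch */
	size_t n = demo_collect(q, vec);
	uint32_t total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += vec[i].len;
	return total;
}
```

The cost of this choice is stack usage per call; the benefit is that the helper becomes reentrant with respect to the scratch data without any locking.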
* Re: [dpdk-dev] [PATCH RFC 2/4] vhost: make buf vector for scatter RX local.
From: Yuanhan Liu @ 2016-02-19 7:06 UTC
To: Ilya Maximets; +Cc: dev, Dyasly Sergey

On Fri, Feb 19, 2016 at 09:32:41AM +0300, Ilya Maximets wrote:
> Array of buf_vector's is just an array for temporary storing information
> about available descriptors. It used only locally in virtio_dev_merge_rx()
> and there is no reason for that array to be shared.
>
> Fix that by allocating local buf_vec inside virtio_dev_merge_rx().
>
> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
> ---
>  lib/librte_vhost/rte_virtio_net.h |  1 -
>  lib/librte_vhost/vhost_rxtx.c     | 45 ++++++++++++++++++++-------------------
>  2 files changed, 23 insertions(+), 23 deletions(-)
>
> diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
> index 10dcb90..ae1e4fb 100644
> --- a/lib/librte_vhost/rte_virtio_net.h
> +++ b/lib/librte_vhost/rte_virtio_net.h
> @@ -91,7 +91,6 @@ struct vhost_virtqueue {
>  	int kickfd; /**< Currently unused as polling mode is enabled. */
>  	int enabled;
>  	uint64_t reserved[16]; /**< Reserve some spaces for future extension. */
> -	struct buf_vector buf_vec[BUF_VECTOR_MAX]; /**< for scatter RX. */
>  } __rte_cache_aligned;

I like this kind of cleanup, however, it breaks ABI.

	--yliu
* Re: [dpdk-dev] [PATCH RFC 2/4] vhost: make buf vector for scatter RX local.
From: Ilya Maximets @ 2016-02-19 7:30 UTC
To: Yuanhan Liu; +Cc: dev, Dyasly Sergey

On 19.02.2016 10:06, Yuanhan Liu wrote:
> On Fri, Feb 19, 2016 at 09:32:41AM +0300, Ilya Maximets wrote:
>> Array of buf_vector's is just an array for temporary storing information
>> about available descriptors. It used only locally in virtio_dev_merge_rx()
>> and there is no reason for that array to be shared.
>>
>> Fix that by allocating local buf_vec inside virtio_dev_merge_rx().
>>
>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
>> ---
>>  lib/librte_vhost/rte_virtio_net.h |  1 -
>>  lib/librte_vhost/vhost_rxtx.c     | 45 ++++++++++++++++++++-------------------
>>  2 files changed, 23 insertions(+), 23 deletions(-)
>>
>> diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
>> index 10dcb90..ae1e4fb 100644
>> --- a/lib/librte_vhost/rte_virtio_net.h
>> +++ b/lib/librte_vhost/rte_virtio_net.h
>> @@ -91,7 +91,6 @@ struct vhost_virtqueue {
>>  	int kickfd; /**< Currently unused as polling mode is enabled. */
>>  	int enabled;
>>  	uint64_t reserved[16]; /**< Reserve some spaces for future extension. */
>> -	struct buf_vector buf_vec[BUF_VECTOR_MAX]; /**< for scatter RX. */
>>  } __rte_cache_aligned;
>
> I like this kind of cleanup, however, it breaks ABI.

Should I prepare a version of this patch with the field above marked as
deprecated, and add a note to doc/guides/rel_notes/release_16_04.rst
about its future deletion?

Best regards, Ilya Maximets.
* Re: [dpdk-dev] [PATCH RFC 2/4] vhost: make buf vector for scatter RX local.
From: Xie, Huawei @ 2016-02-19 8:10 UTC
To: Ilya Maximets, Yuanhan Liu; +Cc: dev, Dyasly Sergey

On 2/19/2016 3:31 PM, Ilya Maximets wrote:
> On 19.02.2016 10:06, Yuanhan Liu wrote:
>> On Fri, Feb 19, 2016 at 09:32:41AM +0300, Ilya Maximets wrote:
>>> Array of buf_vector's is just an array for temporary storing information
>>> about available descriptors. It used only locally in virtio_dev_merge_rx()
>>> and there is no reason for that array to be shared.
>>>
>>> Fix that by allocating local buf_vec inside virtio_dev_merge_rx().
>>>
>>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
>>> ---
>>>  lib/librte_vhost/rte_virtio_net.h |  1 -
>>>  lib/librte_vhost/vhost_rxtx.c     | 45 ++++++++++++++++++++-------------------
>>>  2 files changed, 23 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
>>> index 10dcb90..ae1e4fb 100644
>>> --- a/lib/librte_vhost/rte_virtio_net.h
>>> +++ b/lib/librte_vhost/rte_virtio_net.h
>>> @@ -91,7 +91,6 @@ struct vhost_virtqueue {
>>>  	int kickfd; /**< Currently unused as polling mode is enabled. */
>>>  	int enabled;
>>>  	uint64_t reserved[16]; /**< Reserve some spaces for future extension. */
>>> -	struct buf_vector buf_vec[BUF_VECTOR_MAX]; /**< for scatter RX. */
>>>  } __rte_cache_aligned;
>> I like this kind of cleanup, however, it breaks ABI.
> Should I prepare version of this patch with field above marked as
> deprecated and add note to doc/guides/rel_notes/release_16_04.rst
> about future deletion?

Ilya, you could follow the ABI process:
http://dpdk.org/doc/guides/contributing/versioning.html

>
> Best regards, Ilya Maximets.
>
* [dpdk-dev] [RFC] vhost-user public struct refactor (was Re: [PATCH RFC 2/4] vhost: make buf vector for scatter RX) local.
From: Yuanhan Liu @ 2016-04-05 5:47 UTC
To: Ilya Maximets
Cc: dev, Dyasly Sergey, Thomas Monjalon, Flavio Leitner, Xie, Huawei

On Fri, Feb 19, 2016 at 03:06:50PM +0800, Yuanhan Liu wrote:
> On Fri, Feb 19, 2016 at 09:32:41AM +0300, Ilya Maximets wrote:
> > Array of buf_vector's is just an array for temporary storing information
> > about available descriptors. It used only locally in virtio_dev_merge_rx()
> > and there is no reason for that array to be shared.
> >
> > Fix that by allocating local buf_vec inside virtio_dev_merge_rx().
> >
> > Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
> > ---
> >  lib/librte_vhost/rte_virtio_net.h |  1 -
> >  lib/librte_vhost/vhost_rxtx.c     | 45 ++++++++++++++++++++-------------------
> >  2 files changed, 23 insertions(+), 23 deletions(-)
> >
> > diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
> > index 10dcb90..ae1e4fb 100644
> > --- a/lib/librte_vhost/rte_virtio_net.h
> > +++ b/lib/librte_vhost/rte_virtio_net.h
> > @@ -91,7 +91,6 @@ struct vhost_virtqueue {
> >  	int kickfd; /**< Currently unused as polling mode is enabled. */
> >  	int enabled;
> >  	uint64_t reserved[16]; /**< Reserve some spaces for future extension. */
> > -	struct buf_vector buf_vec[BUF_VECTOR_MAX]; /**< for scatter RX. */
> >  } __rte_cache_aligned;
>
> I like this kind of cleanup, however, it breaks ABI.

So, I was recently considering adding vhost-user Tx delayed-copy (or
zero copy) support, which leads to yet another ABI violation, as we need
to add a new field to the virtio_memory_regions struct to do guest
physical address to host physical address translation. You may ask,
however: why do we need to expose the virtio_memory_regions struct to
users at all?

You are right, we don't have to. And here is the thing: we exposed far
more fields (or even structures) than necessary. Say, the vhost_virtqueue
struct should NOT be exposed to the user at all: the application just
needs to pass the right queue id to locate a specific queue, and that's
all. The structure should be defined in an internal header file. With
that, we could make any changes to it we want, without worrying that we
may violate the painful ABI rules.

Similar changes could be made to the virtio_net struct as well, exposing
just the few fields that are necessary and moving all the others to an
internal structure.

Huawei then suggested a more radical yet much cleaner approach: exposing
just a virtio_net handle to the application, much like the way the kernel
exposes an fd to the user for locating a specific file. However, it's
more than an ABI change; it's also an API change: some fields are
referenced by applications, such as flags and virt_qp_nb. We could expose
some new functions to access them, though.

I'd vote for this one, as it sounds very clean to me. This would also
resolve the issue blocking this patch. Though it would break OVS, I'm
thinking that'd be okay, as OVS depends on a specific DPDK version: what
we need to do is just send a few patches to OVS and let it point to the
next release, say DPDK v16.07. Flavio, please correct me if I'm wrong.

Thoughts/comments?

	--yliu
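
To make the handle idea concrete, here is a rough sketch of what such an interface might look like. Everything in it is hypothetical: the demo_ names, the integer handle and the fixed-size table are assumptions made for illustration, not an existing or proposed DPDK API. The point is only that the application holds an integer, in the same spirit as a file descriptor, and reads fields such as flags or virt_qp_nb through accessor functions, so the structure layout can change without affecting the ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* ---- What a public header could declare (hypothetical names) ---- */
int demo_vhost_open(uint32_t nr_queue_pairs, uint64_t flags);
int demo_vhost_get_queue_pair_num(int vid, uint32_t *out);
int demo_vhost_get_flags(int vid, uint64_t *out);

/* ---- Private to the library: applications never see this layout ---- */
#define DEMO_MAX_DEV 4

struct demo_virtio_net {
	int in_use;
	uint64_t flags;
	uint32_t virt_qp_nb;
	/* Fields can be added or removed here without breaking the ABI. */
};

static struct demo_virtio_net demo_devs[DEMO_MAX_DEV];

/* Translate a handle into the internal structure, or NULL if invalid. */
static struct demo_virtio_net *demo_lookup(int vid)
{
	if (vid < 0 || vid >= DEMO_MAX_DEV || !demo_devs[vid].in_use)
		return NULL;
	return &demo_devs[vid];
}

/* Allocate a device slot and return its handle, or -1 if none is free. */
int demo_vhost_open(uint32_t nr_queue_pairs, uint64_t flags)
{
	for (int vid = 0; vid < DEMO_MAX_DEV; vid++) {
		if (!demo_devs[vid].in_use) {
			demo_devs[vid].in_use = 1;
			demo_devs[vid].flags = flags;
			demo_devs[vid].virt_qp_nb = nr_queue_pairs;
			return vid;
		}
	}
	return -1;
}

/* Accessor replacing a direct read of dev->virt_qp_nb. */
int demo_vhost_get_queue_pair_num(int vid, uint32_t *out)
{
	struct demo_virtio_net *dev = demo_lookup(vid);

	if (dev == NULL || out == NULL)
		return -1;
	*out = dev->virt_qp_nb;
	return 0;
}

/* Accessor replacing a direct read of dev->flags. */
int demo_vhost_get_flags(int vid, uint64_t *out)
{
	struct demo_virtio_net *dev = demo_lookup(vid);

	if (dev == NULL || out == NULL)
		return -1;
	*out = dev->flags;
	return 0;
}
```

An application would keep only the returned integer and call demo_vhost_get_queue_pair_num(vid, &n) where it previously dereferenced the structure. The trade-off is one extra function call and lookup per access.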
* Re: [dpdk-dev] [RFC] vhost-user public struct refactor (was Re: [PATCH RFC 2/4] vhost: make buf vector for scatter RX) local.
From: Thomas Monjalon @ 2016-04-05 8:37 UTC
To: Yuanhan Liu
Cc: dev, Ilya Maximets, Dyasly Sergey, Flavio Leitner, Xie, Huawei

2016-04-05 13:47, Yuanhan Liu:
> So, I was considering to add vhost-user Tx delayed-copy (or zero copy)
> support recently, which comes to yet another ABI violation, as we need
> add a new field to virtio_memory_regions struct to do guest phys addr
> to host phys addr translation. You may ask, however, that why do we need
> expose virtio_memory_regions struct to users at all?
>
> You are right, we don't have to. And here is the thing: we exposed way
> too many fields (or even structures) than necessary. Say, vhost_virtqueue
> struct should NOT be exposed to user at all: application just need to
> tell the right queue id to locate a specific queue, and that's all.
> The structure should be defined in an internal header file. With that,
> we could do any changes to it we want, without worrying about that we
> may offense the painful ABI rules.
>
> Similar changes could be done to virtio_net struct as well, just exposing
> very few fields that are necessary and moving all others to an internal
> structure.
>
> Huawei then suggested a more radical yet much cleaner one: just exposing
> a virtio_net handle to application, just like the way kernel exposes an
> fd to user for locating a specific file. However, it's more than an ABI
> change; it's also an API change: some fields are referenced by applications,
> such as flags, virt_qp_nb. We could expose some new functions to access
> them though.
>
> I'd vote for this one, as it sounds very clean to me. This would also
> solve the block issue of this patch. Though it would break OVS, I'm thinking
> that'd be okay, as OVS has dependence on DPDK version: what we need to
> do is just to send few patches to OVS, and let it points to next release,
> say DPDK v16.07. Flavio, please correct me if I'm wrong.
>
> Thoughts/comments?

Do you plan to send a deprecation notice to change API in 16.07?
* Re: [dpdk-dev] [RFC] vhost-user public struct refactor (was Re: [PATCH RFC 2/4] vhost: make buf vector for scatter RX) local.
From: Yuanhan Liu @ 2016-04-05 14:06 UTC
To: Thomas Monjalon
Cc: dev, Ilya Maximets, Dyasly Sergey, Flavio Leitner, Xie, Huawei

On Tue, Apr 05, 2016 at 10:37:13AM +0200, Thomas Monjalon wrote:
> 2016-04-05 13:47, Yuanhan Liu:
> > So, I was considering to add vhost-user Tx delayed-copy (or zero copy)
> > support recently, which comes to yet another ABI violation, as we need
> > add a new field to virtio_memory_regions struct to do guest phys addr
> > to host phys addr translation. You may ask, however, that why do we need
> > expose virtio_memory_regions struct to users at all?
> >
> > You are right, we don't have to. And here is the thing: we exposed way
> > too many fields (or even structures) than necessary. Say, vhost_virtqueue
> > struct should NOT be exposed to user at all: application just need to
> > tell the right queue id to locate a specific queue, and that's all.
> > The structure should be defined in an internal header file. With that,
> > we could do any changes to it we want, without worrying about that we
> > may offense the painful ABI rules.
> >
> > Similar changes could be done to virtio_net struct as well, just exposing
> > very few fields that are necessary and moving all others to an internal
> > structure.
> >
> > Huawei then suggested a more radical yet much cleaner one: just exposing
> > a virtio_net handle to application, just like the way kernel exposes an
> > fd to user for locating a specific file. However, it's more than an ABI
> > change; it's also an API change: some fields are referenced by applications,
> > such as flags, virt_qp_nb. We could expose some new functions to access
> > them though.
> >
> > I'd vote for this one, as it sounds very clean to me. This would also
> > solve the block issue of this patch. Though it would break OVS, I'm thinking
> > that'd be okay, as OVS has dependence on DPDK version: what we need to
> > do is just to send few patches to OVS, and let it points to next release,
> > say DPDK v16.07. Flavio, please correct me if I'm wrong.
> >
> > Thoughts/comments?
>
> Do you plan to send a deprecation notice to change API in 16.07?

Yes, I planned to, shortly. Before that, I'd ask for comments first.

	--yliu
* Re: [dpdk-dev] [RFC] vhost-user public struct refactor (was Re: [PATCH RFC 2/4] vhost: make buf vector for scatter RX) local.
From: Flavio Leitner @ 2016-04-06 4:14 UTC
To: Yuanhan Liu
Cc: Ilya Maximets, dev, Dyasly Sergey, Thomas Monjalon, Xie, Huawei

On Tue, Apr 05, 2016 at 01:47:33PM +0800, Yuanhan Liu wrote:
> On Fri, Feb 19, 2016 at 03:06:50PM +0800, Yuanhan Liu wrote:
> > On Fri, Feb 19, 2016 at 09:32:41AM +0300, Ilya Maximets wrote:
> > > Array of buf_vector's is just an array for temporary storing information
> > > about available descriptors. It used only locally in virtio_dev_merge_rx()
> > > and there is no reason for that array to be shared.
> > >
> > > Fix that by allocating local buf_vec inside virtio_dev_merge_rx().
> > >
> > > Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
> > > ---
> > >  lib/librte_vhost/rte_virtio_net.h |  1 -
> > >  lib/librte_vhost/vhost_rxtx.c     | 45 ++++++++++++++++++++-------------------
> > >  2 files changed, 23 insertions(+), 23 deletions(-)
> > >
> > > diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
> > > index 10dcb90..ae1e4fb 100644
> > > --- a/lib/librte_vhost/rte_virtio_net.h
> > > +++ b/lib/librte_vhost/rte_virtio_net.h
> > > @@ -91,7 +91,6 @@ struct vhost_virtqueue {
> > >  	int kickfd; /**< Currently unused as polling mode is enabled. */
> > >  	int enabled;
> > >  	uint64_t reserved[16]; /**< Reserve some spaces for future extension. */
> > > -	struct buf_vector buf_vec[BUF_VECTOR_MAX]; /**< for scatter RX. */
> > >  } __rte_cache_aligned;
> >
> > I like this kind of cleanup, however, it breaks ABI.
>
> So, I was considering to add vhost-user Tx delayed-copy (or zero copy)
> support recently, which comes to yet another ABI violation, as we need
> add a new field to virtio_memory_regions struct to do guest phys addr
> to host phys addr translation. You may ask, however, that why do we need
> expose virtio_memory_regions struct to users at all?
>
> You are right, we don't have to. And here is the thing: we exposed way
> too many fields (or even structures) than necessary. Say, vhost_virtqueue
> struct should NOT be exposed to user at all: application just need to
> tell the right queue id to locate a specific queue, and that's all.
> The structure should be defined in an internal header file. With that,
> we could do any changes to it we want, without worrying about that we
> may offense the painful ABI rules.
>
> Similar changes could be done to virtio_net struct as well, just exposing
> very few fields that are necessary and moving all others to an internal
> structure.
>
> Huawei then suggested a more radical yet much cleaner one: just exposing
> a virtio_net handle to application, just like the way kernel exposes an
> fd to user for locating a specific file. However, it's more than an ABI
> change; it's also an API change: some fields are referenced by applications,
> such as flags, virt_qp_nb. We could expose some new functions to access
> them though.
>
> I'd vote for this one, as it sounds very clean to me. This would also
> solve the block issue of this patch. Though it would break OVS, I'm thinking
> that'd be okay, as OVS has dependence on DPDK version: what we need to
> do is just to send few patches to OVS, and let it points to next release,
> say DPDK v16.07. Flavio, please correct me if I'm wrong.

There is a plan to use the vHost PMD, so from the OVS point of view the
virtio stuff would be hidden, because the vhost PMD would look just like
a regular ethernet device, right?

I think we are waiting for 16.04 to be released with that so we can
start pushing changes to OVS as well.

-- 
fbl
* Re: [dpdk-dev] [RFC] vhost-user public struct refactor (was Re: [PATCH RFC 2/4] vhost: make buf vector for scatter RX) local.
From: Yuanhan Liu @ 2016-04-06 4:54 UTC
To: Flavio Leitner
Cc: Ilya Maximets, dev, Dyasly Sergey, Thomas Monjalon, Xie, Huawei

On Wed, Apr 06, 2016 at 01:14:09AM -0300, Flavio Leitner wrote:
> >
> > I'd vote for this one, as it sounds very clean to me. This would also
> > solve the block issue of this patch. Though it would break OVS, I'm thinking
> > that'd be okay, as OVS has dependence on DPDK version: what we need to
> > do is just to send few patches to OVS, and let it points to next release,
> > say DPDK v16.07. Flavio, please correct me if I'm wrong.
>
> There is a plan to use vHost PMD,

Great.

> so from OVS point of view the virtio
> stuff would be hidden because vhost PMD would look like just as a
> regular ethernet, right?

Yes.

	--yliu
* [dpdk-dev] [PATCH RFC 3/4] vhost: avoid reordering of used->idx and last_used_idx updating.
From: Ilya Maximets @ 2016-02-19 6:32 UTC
To: dev, Huawei Xie, Yuanhan Liu; +Cc: Ilya Maximets, Dyasly Sergey

Calling rte_vhost_enqueue_burst() simultaneously from different threads
for the same queue_id requires an additional SMP memory barrier to avoid
reordering of the used->idx and last_used_idx updates.

In the case of virtio_dev_rx(), the rte_mb() memory barrier is simply
moved one statement earlier.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
---
 lib/librte_vhost/vhost_rxtx.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 9095fb1..a03f687 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -281,10 +281,13 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 		rte_pause();
 
 	*(volatile uint16_t *)&vq->used->idx += count;
-	vq->last_used_idx = res_end_idx;
 
-	/* flush used->idx update before we read avail->flags. */
+	/*
+	 * Flush used->idx update to make it visible to virtio and all other
+	 * threads before allowing to modify it.
+	 */
 	rte_mb();
+	vq->last_used_idx = res_end_idx;
 
 	/* Kick the guest if necessary. */
 	if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
@@ -586,19 +589,24 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 			rte_pause();
 
 		*(volatile uint16_t *)&vq->used->idx += entry_success;
+		/*
+		 * Flush used->idx update to make it visible to all
+		 * other threads before allowing to modify it.
+		 */
+		rte_smp_wmb();
+
 		vq->last_used_idx = res_cur_idx;
 	}
 
 merge_rx_exit:
 	if (likely(pkt_idx)) {
-		/* flush used->idx update before we read avail->flags. */
+		/* Flush used->idx update to make it visible to virtio. */
 		rte_mb();
 
 		/* Kick the guest if necessary. */
 		if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
 			eventfd_write(vq->callfd, (eventfd_t)1);
 	}
-
 	return pkt_idx;
 }
 
-- 
2.5.0
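
The ordering requirement described in this patch can be sketched with C11 atomics. This is an illustration under stated assumptions, not the vhost implementation: demo_vq and demo_enqueue_ordered() are hypothetical, the copy step is omitted, and the reservation uses a plain atomic fetch-and-add for brevity, which is not necessarily how vhost_rxtx.c reserves entries. What it shows is the hand-off: last_used_idx acts as a "whose turn is it" ticket, so the used index update has to become visible before the ticket is passed to the next thread.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical queue state, reduced to the three indices that matter. */
struct demo_vq {
	_Atomic uint16_t res_idx;       /* next entry to reserve */
	_Atomic uint16_t used_idx;      /* index published to the guest */
	_Atomic uint16_t last_used_idx; /* ticket: who may update used_idx */
};

/* Enqueue 'count' entries; returns the end of the reserved range. */
static uint16_t demo_enqueue_ordered(struct demo_vq *vq, uint16_t count)
{
	/* 1. Reserve the range [base, end) for this caller. */
	uint16_t base = atomic_fetch_add_explicit(&vq->res_idx, count,
						  memory_order_relaxed);
	uint16_t end = (uint16_t)(base + count);
	uint16_t used;

	/* 2. Copy packets into the reserved entries (omitted here). */

	/* 3. Wait until every earlier reservation has been published. */
	while (atomic_load_explicit(&vq->last_used_idx,
				    memory_order_acquire) != base)
		; /* real code would pause or yield here */

	/* 4. Publish our entries by advancing the used index. */
	used = atomic_load_explicit(&vq->used_idx, memory_order_relaxed);
	atomic_store_explicit(&vq->used_idx, (uint16_t)(used + count),
			      memory_order_relaxed);

	/*
	 * 5. Pass the ticket on. The release store keeps the used_idx
	 * update from being reordered after this store, which is the
	 * role the added barrier plays in the patch.
	 */
	atomic_store_explicit(&vq->last_used_idx, end, memory_order_release);
	return end;
}
```

If steps 4 and 5 could be observed in the opposite order, the next thread might leave its wait loop in step 3, read a stale used index in step 4, and overwrite the first thread's update.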
* [dpdk-dev] [PATCH RFC 4/4] doc: add note about rte_vhost_enqueue_burst thread safety.
From: Ilya Maximets @ 2016-02-19 6:32 UTC
To: dev, Huawei Xie, Yuanhan Liu; +Cc: Ilya Maximets, Dyasly Sergey

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
---
 doc/guides/prog_guide/thread_safety_dpdk_functions.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/guides/prog_guide/thread_safety_dpdk_functions.rst b/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
index 403e5fc..13a6c89 100644
--- a/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
+++ b/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
@@ -67,6 +67,7 @@ then locking, or some other form of mutual exclusion, is necessary.
 The ring library is based on a lockless ring-buffer algorithm that maintains its original design for thread safety.
 Moreover, it provides high performance for either multi- or single-consumer/producer enqueue/dequeue operations.
 The mempool library is based on the DPDK lockless ring library and therefore is also multi-thread safe.
+rte_vhost_enqueue_burst() is also thread safe because based on lockless ring-buffer algorithm like the ring library.
 
 Performance Insensitive API
 ---------------------------
-- 
2.5.0
* Re: [dpdk-dev] [PATCH RFC 4/4] doc: add note about rte_vhost_enqueue_burst thread safety.
From: Yuanhan Liu @ 2016-02-19 7:10 UTC
To: Ilya Maximets, Xie, Huawei; +Cc: dev, Dyasly Sergey

On Fri, Feb 19, 2016 at 09:32:43AM +0300, Ilya Maximets wrote:
> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
> ---
>  doc/guides/prog_guide/thread_safety_dpdk_functions.rst | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/doc/guides/prog_guide/thread_safety_dpdk_functions.rst b/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
> index 403e5fc..13a6c89 100644
> --- a/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
> +++ b/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
> @@ -67,6 +67,7 @@ then locking, or some other form of mutual exclusion, is necessary.
>  The ring library is based on a lockless ring-buffer algorithm that maintains its original design for thread safety.
>  Moreover, it provides high performance for either multi- or single-consumer/producer enqueue/dequeue operations.
>  The mempool library is based on the DPDK lockless ring library and therefore is also multi-thread safe.
> +rte_vhost_enqueue_burst() is also thread safe because based on lockless ring-buffer algorithm like the ring library.

FYI, Huawei meant to make rte_vhost_enqueue_burst() not be thread-safe,
to align with the usage of rte_eth_tx_burst().

	--yliu
* Re: [dpdk-dev] [PATCH RFC 4/4] doc: add note about rte_vhost_enqueue_burst thread safety.
From: Xie, Huawei @ 2016-02-19 8:36 UTC
To: Yuanhan Liu, Ilya Maximets; +Cc: dev, Dyasly Sergey

On 2/19/2016 3:10 PM, Yuanhan Liu wrote:
> On Fri, Feb 19, 2016 at 09:32:43AM +0300, Ilya Maximets wrote:
>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
>> ---
>>  doc/guides/prog_guide/thread_safety_dpdk_functions.rst | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/doc/guides/prog_guide/thread_safety_dpdk_functions.rst b/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
>> index 403e5fc..13a6c89 100644
>> --- a/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
>> +++ b/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
>> @@ -67,6 +67,7 @@ then locking, or some other form of mutual exclusion, is necessary.
>>  The ring library is based on a lockless ring-buffer algorithm that maintains its original design for thread safety.
>>  Moreover, it provides high performance for either multi- or single-consumer/producer enqueue/dequeue operations.
>>  The mempool library is based on the DPDK lockless ring library and therefore is also multi-thread safe.
>> +rte_vhost_enqueue_burst() is also thread safe because based on lockless ring-buffer algorithm like the ring library.
> FYI, Huawei meant to make rte_vhost_enqueue_burst() not be thread-safe,
> to align with the usage of rte_eth_tx_burst().
>
> --yliu

I have a patch to remove the lockless enqueue. Unless there is a strong
reason, I prefer the vhost PMD to behave like other PMDs, with no internal
lockless algorithm. In the future, for people who really need it, we could
have a dynamic/static switch to enable it.
* Re: [dpdk-dev] [PATCH RFC 4/4] doc: add note about rte_vhost_enqueue_burst thread safety.
From: Ilya Maximets @ 2016-02-19 9:05 UTC
To: Xie, Huawei, Yuanhan Liu; +Cc: dev, Dyasly Sergey

On 19.02.2016 11:36, Xie, Huawei wrote:
> On 2/19/2016 3:10 PM, Yuanhan Liu wrote:
>> On Fri, Feb 19, 2016 at 09:32:43AM +0300, Ilya Maximets wrote:
>>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
>>> ---
>>>  doc/guides/prog_guide/thread_safety_dpdk_functions.rst | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/doc/guides/prog_guide/thread_safety_dpdk_functions.rst b/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
>>> index 403e5fc..13a6c89 100644
>>> --- a/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
>>> +++ b/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
>>> @@ -67,6 +67,7 @@ then locking, or some other form of mutual exclusion, is necessary.
>>>  The ring library is based on a lockless ring-buffer algorithm that maintains its original design for thread safety.
>>>  Moreover, it provides high performance for either multi- or single-consumer/producer enqueue/dequeue operations.
>>>  The mempool library is based on the DPDK lockless ring library and therefore is also multi-thread safe.
>>> +rte_vhost_enqueue_burst() is also thread safe because based on lockless ring-buffer algorithm like the ring library.
>> FYI, Huawei meant to make rte_vhost_enqueue_burst() not be thread-safe,
>> to align with the usage of rte_eth_tx_burst().
>>
>> --yliu
>
> I have a patch to remove the lockless enqueue. Unless there is a strong
> reason, I prefer the vhost PMD to behave like other PMDs, with no internal
> lockless algorithm. In the future, for people who really need it, we could
> have a dynamic/static switch to enable it.

OK, got it. So, I think, this documentation patch may be dropped.
Other patches of the series may still be merged to fix existing issues and
keep the code in a consistent state for the future.
Am I right?

Best regards, Ilya Maximets.
* Re: [dpdk-dev] [PATCH RFC 4/4] doc: add note about rte_vhost_enqueue_burst thread safety.
From: Xie, Huawei @ 2016-02-22 2:07 UTC
To: Ilya Maximets, Yuanhan Liu, Thomas Monjalon; +Cc: dev, Dyasly Sergey

On 2/19/2016 5:05 PM, Ilya Maximets wrote:
> On 19.02.2016 11:36, Xie, Huawei wrote:
>> On 2/19/2016 3:10 PM, Yuanhan Liu wrote:
>>> On Fri, Feb 19, 2016 at 09:32:43AM +0300, Ilya Maximets wrote:
>>>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
>>>> ---
>>>>  doc/guides/prog_guide/thread_safety_dpdk_functions.rst | 1 +
>>>>  1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/doc/guides/prog_guide/thread_safety_dpdk_functions.rst b/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
>>>> index 403e5fc..13a6c89 100644
>>>> --- a/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
>>>> +++ b/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
>>>> @@ -67,6 +67,7 @@ then locking, or some other form of mutual exclusion, is necessary.
>>>>  The ring library is based on a lockless ring-buffer algorithm that maintains its original design for thread safety.
>>>>  Moreover, it provides high performance for either multi- or single-consumer/producer enqueue/dequeue operations.
>>>>  The mempool library is based on the DPDK lockless ring library and therefore is also multi-thread safe.
>>>> +rte_vhost_enqueue_burst() is also thread safe because based on lockless ring-buffer algorithm like the ring library.
>>> FYI, Huawei meant to make rte_vhost_enqueue_burst() not be thread-safe,
>>> to align with the usage of rte_eth_tx_burst().
>>>
>>> --yliu
>> I have a patch to remove the lockless enqueue. Unless there is a strong
>> reason, I prefer the vhost PMD to behave like other PMDs, with no internal
>> lockless algorithm. In the future, for people who really need it, we could
>> have a dynamic/static switch to enable it.

Thomas, what is your opinion on this and my patch removing lockless enqueue?

> OK, got it. So, I think, this documentation patch may be dropped.
> Other patches of the series may still be merged to fix existing issues and
> keep the code in a consistent state for the future.
> Am I right?

Yes.

> Best regards, Ilya Maximets.
* Re: [dpdk-dev] [PATCH RFC 4/4] doc: add note about rte_vhost_enqueue_burst thread safety.
From: Thomas Monjalon @ 2016-02-22 10:14 UTC
To: Xie, Huawei; +Cc: Dyasly Sergey, dev, Ilya Maximets

2016-02-22 02:07, Xie, Huawei:
> On 2/19/2016 5:05 PM, Ilya Maximets wrote:
> > On 19.02.2016 11:36, Xie, Huawei wrote:
> >> On 2/19/2016 3:10 PM, Yuanhan Liu wrote:
> >>> On Fri, Feb 19, 2016 at 09:32:43AM +0300, Ilya Maximets wrote:
> >>>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
> >>>> ---
> >>>>  doc/guides/prog_guide/thread_safety_dpdk_functions.rst | 1 +
> >>>>  1 file changed, 1 insertion(+)
> >>>>
> >>>> diff --git a/doc/guides/prog_guide/thread_safety_dpdk_functions.rst b/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
> >>>> index 403e5fc..13a6c89 100644
> >>>> --- a/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
> >>>> +++ b/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
> >>>>  The mempool library is based on the DPDK lockless ring library and therefore is also multi-thread safe.
> >>>> +rte_vhost_enqueue_burst() is also thread safe because based on lockless ring-buffer algorithm like the ring library.
> >>> FYI, Huawei meant to make rte_vhost_enqueue_burst() not be thread-safe,
> >>> to align with the usage of rte_eth_tx_burst().
> >>>
> >>> --yliu
> >> I have a patch to remove the lockless enqueue. Unless there is a strong
> >> reason, I prefer the vhost PMD to behave like other PMDs, with no internal
> >> lockless algorithm. In the future, for people who really need it, we could
> >> have a dynamic/static switch to enable it.
>
> Thomas, what is your opinion on this and my patch removing lockless enqueue?

The thread safety behaviour is part of the API specification.
If we want to enable/disable such behaviour, it must be done with an API
function. But it would introduce a conditional statement in the fast path.
That's why the priority must be to keep a simple and consistent behaviour
and try to build around it. API complexity may be considered only if there
is a real (measured) gain.
* Re: [dpdk-dev] [PATCH RFC 4/4] doc: add note about rte_vhost_enqueue_burst thread safety.
From: Xie, Huawei @ 2016-02-23 5:56 UTC
To: Thomas Monjalon; +Cc: Dyasly Sergey, dev, Ilya Maximets

On 2/22/2016 6:16 PM, Thomas Monjalon wrote:
> 2016-02-22 02:07, Xie, Huawei:
>> On 2/19/2016 5:05 PM, Ilya Maximets wrote:
>>> On 19.02.2016 11:36, Xie, Huawei wrote:
>>>> On 2/19/2016 3:10 PM, Yuanhan Liu wrote:
>>>>> On Fri, Feb 19, 2016 at 09:32:43AM +0300, Ilya Maximets wrote:
>>>>>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
>>>>>> ---
>>>>>>  doc/guides/prog_guide/thread_safety_dpdk_functions.rst | 1 +
>>>>>>  1 file changed, 1 insertion(+)
>>>>>>
>>>>>> diff --git a/doc/guides/prog_guide/thread_safety_dpdk_functions.rst b/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
>>>>>> index 403e5fc..13a6c89 100644
>>>>>> --- a/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
>>>>>> +++ b/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
>>>>>>  The mempool library is based on the DPDK lockless ring library and therefore is also multi-thread safe.
>>>>>> +rte_vhost_enqueue_burst() is also thread safe because based on lockless ring-buffer algorithm like the ring library.
>>>>> FYI, Huawei meant to make rte_vhost_enqueue_burst() not be thread-safe,
>>>>> to align with the usage of rte_eth_tx_burst().
>>>>>
>>>>> --yliu
>>>> I have a patch to remove the lockless enqueue. Unless there is a strong
>>>> reason, I prefer the vhost PMD to behave like other PMDs, with no internal
>>>> lockless algorithm. In the future, for people who really need it, we could
>>>> have a dynamic/static switch to enable it.
>> Thomas, what is your opinion on this and my patch removing lockless enqueue?
> The thread safety behaviour is part of the API specification.
> If we want to enable/disable such behaviour, it must be done with an API
> function. But it would introduce a conditional statement in the fast path.
> That's why the priority must be to keep a simple and consistent behaviour
> and try to build around it. API complexity may be considered only if there
> is a real (measured) gain.

Let us put the gain aside temporarily. I will do the measurement.
Vhost is wrapped as a PMD in Tetsuya's patch. And also in DPDK OVS's
case, it is wrapped as a vport like all other physical ports. The DPDK
app/OVS will treat all ports equally. It will add complexity if the app
needs to know that some ports support concurrency while some do not. Since
all other PMDs don't support thread safety, it doesn't make sense for the
vhost PMD to support it. I believe the app will not use that behavior.
From the API's point of view, if we previously implemented it wrongly,
we need to fix it as early as possible.
* Re: [dpdk-dev] [PATCH RFC 4/4] doc: add note about rte_vhost_enqueue_burst thread safety.
From: Ilya Maximets @ 2016-02-24 5:06 UTC
To: Xie, Huawei, Thomas Monjalon; +Cc: Dyasly Sergey, dev

On 23.02.2016 08:56, Xie, Huawei wrote:
> On 2/22/2016 6:16 PM, Thomas Monjalon wrote:
>> 2016-02-22 02:07, Xie, Huawei:
>>> On 2/19/2016 5:05 PM, Ilya Maximets wrote:
>>>> On 19.02.2016 11:36, Xie, Huawei wrote:
>>>>> On 2/19/2016 3:10 PM, Yuanhan Liu wrote:
>>>>>> On Fri, Feb 19, 2016 at 09:32:43AM +0300, Ilya Maximets wrote:
>>>>>>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
>>>>>>> ---
>>>>>>>  doc/guides/prog_guide/thread_safety_dpdk_functions.rst | 1 +
>>>>>>>  1 file changed, 1 insertion(+)
>>>>>>>
>>>>>>> diff --git a/doc/guides/prog_guide/thread_safety_dpdk_functions.rst b/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
>>>>>>> index 403e5fc..13a6c89 100644
>>>>>>> --- a/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
>>>>>>> +++ b/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
>>>>>>>  The mempool library is based on the DPDK lockless ring library and therefore is also multi-thread safe.
>>>>>>> +rte_vhost_enqueue_burst() is also thread safe because based on lockless ring-buffer algorithm like the ring library.
>>>>>> FYI, Huawei meant to make rte_vhost_enqueue_burst() not be thread-safe,
>>>>>> to align with the usage of rte_eth_tx_burst().
>>>>>>
>>>>>> --yliu
>>>>> I have a patch to remove the lockless enqueue. Unless there is a strong
>>>>> reason, I prefer the vhost PMD to behave like other PMDs, with no internal
>>>>> lockless algorithm. In the future, for people who really need it, we could
>>>>> have a dynamic/static switch to enable it.
>>> Thomas, what is your opinion on this and my patch removing lockless enqueue?
>> The thread safety behaviour is part of the API specification.
>> If we want to enable/disable such behaviour, it must be done with an API
>> function. But it would introduce a conditional statement in the fast path.
>> That's why the priority must be to keep a simple and consistent behaviour
>> and try to build around it. API complexity may be considered only if there
>> is a real (measured) gain.
>
> Let us put the gain aside temporarily. I will do the measurement.
> Vhost is wrapped as a PMD in Tetsuya's patch. And also in DPDK OVS's
> case, it is wrapped as a vport like all other physical ports. The DPDK
> app/OVS will treat all ports equally.

That is not true. Currently vhost in Open vSwitch is implemented as a
separate netdev class. So, to use the concurrency of vhost, we just need to
remove 2 lines (rte_spinlock_lock and rte_spinlock_unlock) from the function
__netdev_dpdk_vhost_send(). This will not change the behaviour of other
types of ports.

> It will add complexity if the app
> needs to know that some ports support concurrency while some do not. Since
> all other PMDs don't support thread safety, it doesn't make sense for the
> vhost PMD to support it. I believe the app will not use that behavior.
> From the API's point of view, if we previously implemented it wrongly,
> we need to fix it as early as possible.
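The arrangement under discussion, a lock taken by the caller around an enqueue that is not itself thread safe, can be sketched as follows. This is not the Open vSwitch code: `demo_port`, `demo_enqueue_burst()` and `demo_send_locked()` are hypothetical names, and a C11 `atomic_flag` is used here as a stand-in for rte_spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical port: tx_lock plays the role of the caller-side spinlock. */
struct demo_port {
	atomic_flag tx_lock;
	uint32_t enqueued; /* stand-in for queue state; deliberately not atomic */
};

/* Stand-in for an enqueue that is NOT safe to call concurrently. */
static uint16_t
demo_enqueue_burst(struct demo_port *port, uint16_t count)
{
	port->enqueued += count;
	return count;
}

/* Caller-side serialization: lock, enqueue, unlock. */
static uint16_t
demo_send_locked(struct demo_port *port, uint16_t count)
{
	uint16_t sent;

	assert(port != NULL);

	/* Spin until the flag was previously clear, i.e. we own the lock. */
	while (atomic_flag_test_and_set_explicit(&port->tx_lock,
						 memory_order_acquire))
		;

	sent = demo_enqueue_burst(port, count);

	atomic_flag_clear_explicit(&port->tx_lock, memory_order_release);
	return sent;
}
```

If the enqueue function were thread safe on its own, the two lock operations in `demo_send_locked()` are the only lines a caller would drop, which is the shape of the two-line change described in the message above.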
* Re: [dpdk-dev] [PATCH RFC 4/4] doc: add note about rte_vhost_enqueue_burst thread safety.
From: Xie, Huawei @ 2016-02-25 5:12 UTC
To: Ilya Maximets, Thomas Monjalon; +Cc: Dyasly Sergey, dev

On 2/24/2016 1:07 PM, Ilya Maximets wrote:
> On 23.02.2016 08:56, Xie, Huawei wrote:
>> On 2/22/2016 6:16 PM, Thomas Monjalon wrote:
>>> 2016-02-22 02:07, Xie, Huawei:
>>>> On 2/19/2016 5:05 PM, Ilya Maximets wrote:
>>>>> On 19.02.2016 11:36, Xie, Huawei wrote:
>>>>>> On 2/19/2016 3:10 PM, Yuanhan Liu wrote:
>>>>>>> On Fri, Feb 19, 2016 at 09:32:43AM +0300, Ilya Maximets wrote:
>>>>>>>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
>>>>>>>> ---
>>>>>>>>  doc/guides/prog_guide/thread_safety_dpdk_functions.rst | 1 +
>>>>>>>>  1 file changed, 1 insertion(+)
>>>>>>>>
>>>>>>>> diff --git a/doc/guides/prog_guide/thread_safety_dpdk_functions.rst b/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
>>>>>>>> index 403e5fc..13a6c89 100644
>>>>>>>> --- a/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
>>>>>>>> +++ b/doc/guides/prog_guide/thread_safety_dpdk_functions.rst
>>>>>>>>  The mempool library is based on the DPDK lockless ring library and therefore is also multi-thread safe.
>>>>>>>> +rte_vhost_enqueue_burst() is also thread safe because based on lockless ring-buffer algorithm like the ring library.
>>>>>>> FYI, Huawei meant to make rte_vhost_enqueue_burst() not be thread-safe,
>>>>>>> to align with the usage of rte_eth_tx_burst().
>>>>>>>
>>>>>>> --yliu
>>>>>> I have a patch to remove the lockless enqueue. Unless there is a strong
>>>>>> reason, I prefer the vhost PMD to behave like other PMDs, with no internal
>>>>>> lockless algorithm. In the future, for people who really need it, we could
>>>>>> have a dynamic/static switch to enable it.
>>>> Thomas, what is your opinion on this and my patch removing lockless enqueue?
>>> The thread safety behaviour is part of the API specification.
>>> If we want to enable/disable such behaviour, it must be done with an API
>>> function. But it would introduce a conditional statement in the fast path.
>>> That's why the priority must be to keep a simple and consistent behaviour
>>> and try to build around it. API complexity may be considered only if there
>>> is a real (measured) gain.
>> Let us put the gain aside temporarily. I will do the measurement.
>> Vhost is wrapped as a PMD in Tetsuya's patch. And also in DPDK OVS's
>> case, it is wrapped as a vport like all other physical ports. The DPDK
>> app/OVS will treat all ports equally.
> That is not true. Currently vhost in Open vSwitch is implemented as a
> separate netdev class. So, to use the concurrency of vhost, we just need to
> remove 2 lines (rte_spinlock_lock and rte_spinlock_unlock) from the function
> __netdev_dpdk_vhost_send(). This will not change the behaviour of other
> types of ports.

I checked the OVS implementation. It raised several concerns.

For physical ports, it uses multiple queues to solve concurrent tx.
For vhost ports:
a) The thread-safe behavior of vhost isn't used. rte_spinlock is used
outside. Yes, it could be removed.
b) If a packet is sent to vhost, it is directly enqueued to the guest
without buffering. We could use a thread-safe ring to queue packets first
and then enqueue them to the guest at an appropriate time later; then the
vhost internal lockless algorithm isn't needed.

Besides, IMO a thread-safe implementation adds to the complexity of the
vhost implementation.

>> It will add complexity if the app
>> needs to know that some ports support concurrency while some do not. Since
>> all other PMDs don't support thread safety, it doesn't make sense for the
>> vhost PMD to support it. I believe the app will not use that behavior.
>> From the API's point of view, if we previously implemented it wrongly,
>> we need to fix it as early as possible.
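Point b) above, staging packets in a thread-safe queue and letting a single thread enqueue them to the guest later, can be sketched roughly like this. In DPDK the ring library would be the natural staging structure; to keep the example self-contained, a tiny spinlock-protected ring is swapped in instead. All names (`demo_stage_ring`, `demo_stage()`, `demo_drain()`) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RING_SIZE 8u /* power of two, so index masking works */

/* Hypothetical staging ring shared by all sending threads. */
struct demo_stage_ring {
	atomic_flag lock;            /* serializes producers and the drainer */
	uint32_t head;               /* next slot to write */
	uint32_t tail;               /* next slot to read */
	void *slots[DEMO_RING_SIZE];
};

static void
demo_lock(atomic_flag *lock)
{
	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		;
}

static void
demo_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Any thread: stage one packet. Returns 1 on success, 0 if the ring is full. */
static int
demo_stage(struct demo_stage_ring *r, void *pkt)
{
	int ok = 0;

	assert(r != NULL);
	demo_lock(&r->lock);
	if (r->head - r->tail < DEMO_RING_SIZE) {
		r->slots[r->head & (DEMO_RING_SIZE - 1)] = pkt;
		r->head++;
		ok = 1;
	}
	demo_unlock(&r->lock);
	return ok;
}

/*
 * One designated thread: move up to `max` staged packets into `out`, in
 * FIFO order. The caller would then pass `out` to the enqueue function
 * that is not thread safe. Returns the number of packets moved.
 */
static unsigned
demo_drain(struct demo_stage_ring *r, void **out, unsigned max)
{
	unsigned n = 0;

	assert(r != NULL && out != NULL);
	demo_lock(&r->lock);
	while (n < max && r->tail != r->head) {
		out[n++] = r->slots[r->tail & (DEMO_RING_SIZE - 1)];
		r->tail++;
	}
	demo_unlock(&r->lock);
	return n;
}
```

Because only the draining thread ever touches the guest-facing queue, the enqueue path itself needs no internal synchronization; the cost is an extra hop and some added latency between `demo_stage()` and the eventual enqueue.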