From: Ilya Maximets
To: "Michael S. Tsirkin"
Cc: dev@dpdk.org, Maxime Coquelin, Xiao Wang, Tiwei Bie, Zhihong Wang,
 jfreimann@redhat.com, Jason Wang, xiaolong.ye@intel.com,
 alejandro.lucero@netronome.com, stable@dpdk.org
Date: Fri, 14 Dec 2018 17:03:43 +0300
Message-ID: <31e17b3e-84fa-6e20-bab1-ca7b10e923d8@samsung.com>
In-Reply-To: <20181214075506-mutt-send-email-mst@kernel.org>
References: <20181214123704.20110-1-i.maximets@samsung.com>
 <20181214075506-mutt-send-email-mst@kernel.org>
Subject: Re: [dpdk-dev] [RFC] net/virtio: use real barriers for vDPA
List-Id: DPDK patches and discussions

On 14.12.2018 15:58, Michael S. Tsirkin wrote:
> On Fri, Dec 14, 2018 at 03:37:04PM +0300, Ilya Maximets wrote:
>> SMP barriers are OK for software vhost implementation, but vDPA is a
>> DMA capable hardware and we have to use at least DMA barriers to
>> ensure the memory ordering.
>>
>> This patch mimics the kernel virtio-net driver in part of choosing
>> barriers we need.
>>
>> Fixes: a3f8150eac6d ("net/ifcvf: add ifcvf vDPA driver")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Ilya Maximets
>
> Instead of this vendor specific hack, can you please
> use VIRTIO_F_ORDER_PLATFORM?

Hmm. Yes, you're probably right. But VIRTIO_F_ORDER_PLATFORM (a.k.a.
VIRTIO_F_IO_BARRIER) is not implemented anywhere at the moment.
Supporting it will require changes to the whole stack (vhost, QEMU,
virtio drivers) to allow the negotiation. Moreover, I guess that it's
not offered or correctly treated by the current vDPA HW, i.e. the HW
allows the driver to work with weak barriers. (The spec allows that,
but I don't think it's right.) This vendor specific hack could make
things work right now.

Actually, I'm not sure whether vDPA is usable with current QEMU.
Does anybody know? If it's usable, then having a hack could be
acceptable for now. If not, of course, we could drop this and
implement the virtio native solution.

But yes, I agree that the virtio native solution should be done anyway,
regardless of whether we have hacks like this.

Maybe I can split the patch in two:
1. Add ability to use real barriers. ('weak_barriers' always 'true')
2. Hack to initialize 'weak_barriers' according to subsystem_device_id.

The first patch will be needed anyway. The second one could be dropped,
or applied and later replaced by a proper solution with
VIRTIO_F_ORDER_PLATFORM.

Thoughts?

One more thing I'm concerned about is that vDPA offers support for
earlier versions of the virtio protocol, which looks strange.

>
>
>> ---
>>
>> Sending as RFC, because the patch is completely not tested. I have no
>> HW to test the real behaviour. And I actually do not know if the
>> subsystem_vendor/device_id's are available at the time and has
>> IFCVF_SUBSYS_* values inside the real guest system.
>>
>> One more thing I want to mention that cross net client Live Migration
>> is actually not possible without any support from the guest driver.
>> Because migration from the software vhost to vDPA will lead to using
>> weaker barriers than required.
>>
>> The similar change (weak_barriers = false) should be made in the linux
>> kernel virtio-net driver.
>
> IMHO no, it should use VIRTIO_F_ORDER_PLATFORM.

Sure. I'm not a fan of hacks either, especially inside the kernel.

>
>> Another dirty solution could be to restrict the vDPA support to x86
>> systems only.
>>
>>  drivers/net/virtio/virtio_ethdev.c      |  9 +++++++
>>  drivers/net/virtio/virtio_pci.h         |  1 +
>>  drivers/net/virtio/virtio_rxtx.c        | 14 +++++-----
>>  drivers/net/virtio/virtio_user_ethdev.c |  1 +
>>  drivers/net/virtio/virtqueue.h          | 35 +++++++++++++++++++++----
>>  5 files changed, 48 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
>> index cb2b2e0bf..249b536fd 100644
>> --- a/drivers/net/virtio/virtio_ethdev.c
>> +++ b/drivers/net/virtio/virtio_ethdev.c
>> @@ -1448,6 +1448,9 @@ virtio_configure_intr(struct rte_eth_dev *dev)
>>  	return 0;
>>  }
>>
>> +#define IFCVF_SUBSYS_VENDOR_ID 0x8086
>> +#define IFCVF_SUBSYS_DEVICE_ID 0x001A
>> +
>>  /* reset device and renegotiate features if needed */
>>  static int
>>  virtio_init_device(struct rte_eth_dev *eth_dev, uint64_t req_features)
>> @@ -1477,6 +1480,12 @@ virtio_init_device(struct rte_eth_dev *eth_dev, uint64_t req_features)
>>  	if (!hw->virtio_user_dev) {
>>  		pci_dev = RTE_ETH_DEV_TO_PCI(eth_dev);
>>  		rte_eth_copy_pci_info(eth_dev, pci_dev);
>> +
>> +		if (pci_dev->id.subsystem_vendor_id == IFCVF_SUBSYS_VENDOR_ID &&
>> +		    pci_dev->id.subsystem_device_id == IFCVF_SUBSYS_DEVICE_ID)
>> +			hw->weak_barriers = 0;
>> +		else
>> +			hw->weak_barriers = 1;
>>  	}
>>
>>  	/* If host does not support both status and MSI-X then disable LSC */
>> diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h
>> index e961a58ca..1f8e719a9 100644
>> --- a/drivers/net/virtio/virtio_pci.h
>> +++ b/drivers/net/virtio/virtio_pci.h
>> @@ -240,6 +240,7 @@ struct virtio_hw {
>>  	uint8_t use_simple_rx;
>>  	uint8_t use_inorder_rx;
>>  	uint8_t use_inorder_tx;
>> +	uint8_t weak_barriers;
>>  	bool has_tx_offload;
>>  	bool has_rx_offload;
>>  	uint16_t port_id;
>> diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
>> index cb8f89f18..66195bf47 100644
>> --- a/drivers/net/virtio/virtio_rxtx.c
>> +++ b/drivers/net/virtio/virtio_rxtx.c
>> @@ -906,7 +906,7 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
>>
>>  	nb_used = VIRTQUEUE_NUSED(vq);
>>
>> -	virtio_rmb();
>> +	virtio_rmb(hw->weak_barriers);
>>
>>  	num = likely(nb_used <= nb_pkts) ? nb_used : nb_pkts;
>>  	if (unlikely(num > VIRTIO_MBUF_BURST_SZ))
>> @@ -1017,7 +1017,7 @@ virtio_recv_mergeable_pkts_inorder(void *rx_queue,
>>  	nb_used = RTE_MIN(nb_used, nb_pkts);
>>  	nb_used = RTE_MIN(nb_used, VIRTIO_MBUF_BURST_SZ);
>>
>> -	virtio_rmb();
>> +	virtio_rmb(hw->weak_barriers);
>>
>>  	PMD_RX_LOG(DEBUG, "used:%d", nb_used);
>>
>> @@ -1202,7 +1202,7 @@ virtio_recv_mergeable_pkts(void *rx_queue,
>>
>>  	nb_used = VIRTQUEUE_NUSED(vq);
>>
>> -	virtio_rmb();
>> +	virtio_rmb(hw->weak_barriers);
>>
>>  	PMD_RX_LOG(DEBUG, "used:%d", nb_used);
>>
>> @@ -1365,7 +1365,7 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
>>  	PMD_TX_LOG(DEBUG, "%d packets to xmit", nb_pkts);
>>  	nb_used = VIRTQUEUE_NUSED(vq);
>>
>> -	virtio_rmb();
>> +	virtio_rmb(hw->weak_barriers);
>>  	if (likely(nb_used > vq->vq_nentries - vq->vq_free_thresh))
>>  		virtio_xmit_cleanup(vq, nb_used);
>>
>> @@ -1407,7 +1407,7 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
>>  		/* Positive value indicates it need free vring descriptors */
>>  		if (unlikely(need > 0)) {
>>  			nb_used = VIRTQUEUE_NUSED(vq);
>> -			virtio_rmb();
>> +			virtio_rmb(hw->weak_barriers);
>>  			need = RTE_MIN(need, (int)nb_used);
>>
>>  			virtio_xmit_cleanup(vq, need);
>> @@ -1463,7 +1463,7 @@ virtio_xmit_pkts_inorder(void *tx_queue,
>>  	PMD_TX_LOG(DEBUG, "%d packets to xmit", nb_pkts);
>>  	nb_used = VIRTQUEUE_NUSED(vq);
>>
>> -	virtio_rmb();
>> +	virtio_rmb(hw->weak_barriers);
>>  	if (likely(nb_used > vq->vq_nentries - vq->vq_free_thresh))
>>  		virtio_xmit_cleanup_inorder(vq, nb_used);
>>
>> @@ -1511,7 +1511,7 @@ virtio_xmit_pkts_inorder(void *tx_queue,
>>  		need = slots - vq->vq_free_cnt;
>>  		if (unlikely(need > 0)) {
>>  			nb_used = VIRTQUEUE_NUSED(vq);
>> -			virtio_rmb();
>> +			virtio_rmb(hw->weak_barriers);
>>  			need = RTE_MIN(need, (int)nb_used);
>>
>>  			virtio_xmit_cleanup_inorder(vq, need);
>> diff --git a/drivers/net/virtio/virtio_user_ethdev.c b/drivers/net/virtio/virtio_user_ethdev.c
>> index f8791391a..f075774b4 100644
>> --- a/drivers/net/virtio/virtio_user_ethdev.c
>> +++ b/drivers/net/virtio/virtio_user_ethdev.c
>> @@ -434,6 +434,7 @@ virtio_user_eth_dev_alloc(struct rte_vdev_device *vdev)
>>  	hw->use_simple_rx = 0;
>>  	hw->use_inorder_rx = 0;
>>  	hw->use_inorder_tx = 0;
>> +	hw->weak_barriers = 1;
>>  	hw->virtio_user_dev = dev;
>>  	return eth_dev;
>>  }
>> diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
>> index 26518ed98..7bf17d3bf 100644
>> --- a/drivers/net/virtio/virtqueue.h
>> +++ b/drivers/net/virtio/virtqueue.h
>> @@ -19,15 +19,40 @@
>>  struct rte_mbuf;
>>
>>  /*
>> - * Per virtio_config.h in Linux.
>> + * Per virtio_ring.h in Linux.
>>   * For virtio_pci on SMP, we don't need to order with respect to MMIO
>>   * accesses through relaxed memory I/O windows, so smp_mb() et al are
>>   * sufficient.
>>   *
>> + * For using virtio to talk to real devices (eg. vDPA) we do need real
>> + * barriers.
>>   */
>> -#define virtio_mb() rte_smp_mb()
>> -#define virtio_rmb() rte_smp_rmb()
>> -#define virtio_wmb() rte_smp_wmb()
>> +static inline void
>> +virtio_mb(uint8_t weak_barriers)
>> +{
>> +	if (weak_barriers)
>> +		rte_smp_mb();
>> +	else
>> +		rte_mb();
>> +}
>> +
>> +static inline void
>> +virtio_rmb(uint8_t weak_barriers)
>> +{
>> +	if (weak_barriers)
>> +		rte_smp_rmb();
>> +	else
>> +		rte_cio_rmb();
>> +}
>> +
>> +static inline void
>> +virtio_wmb(uint8_t weak_barriers)
>> +{
>> +	if (weak_barriers)
>> +		rte_smp_wmb();
>> +	else
>> +		rte_cio_wmb();
>> +}
>>
>>  #ifdef RTE_PMD_PACKET_PREFETCH
>>  #define rte_packet_prefetch(p) rte_prefetch1(p)
>
>
> Note kernel uses dma_ barriers not full barriers.
> I don't know whether it applies to dpdk.

Yes, I know. I used the rte_cio_* barriers for rmb/wmb, which are the
DPDK equivalent of the kernel's dma_* barriers.

>
>> @@ -312,7 +337,7 @@ void vq_ring_free_inorder(struct virtqueue *vq, uint16_t desc_idx,
>>  static inline void
>>  vq_update_avail_idx(struct virtqueue *vq)
>>  {
>> -	virtio_wmb();
>> +	virtio_wmb(vq->hw->weak_barriers);
>>  	vq->vq_ring.avail->idx = vq->vq_avail_idx;
>>  }
>>
>> --
>> 2.17.1
>
>
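
For illustration only, the virtio native direction discussed above
(choosing barriers from the negotiated VIRTIO_F_ORDER_PLATFORM bit
instead of from the PCI subsystem IDs) could look roughly like the
sketch below. This is a standalone sketch under stated assumptions,
not the patch: the sketch_* names and the trimmed struct are
hypothetical, bit 36 is the number the virtio 1.1 spec assigns to
VIRTIO_F_ORDER_PLATFORM, and C11 fences stand in for the
rte_smp_*/rte_cio_* barriers only so that it compiles without DPDK.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Feature bit number as assigned by the virtio 1.1 spec. */
#define VIRTIO_F_ORDER_PLATFORM 36

/* Hypothetical, trimmed-down stand-in for struct virtio_hw. */
struct sketch_hw {
	uint64_t guest_features;
	uint8_t weak_barriers;
};

/*
 * Hypothetical helper: derive 'weak_barriers' from the negotiated
 * features. If the device negotiated VIRTIO_F_ORDER_PLATFORM, SMP
 * barriers are not enough and the stronger barriers must be used.
 */
static inline void
sketch_init_barriers(struct sketch_hw *hw, uint64_t negotiated)
{
	hw->guest_features = negotiated;
	hw->weak_barriers =
		(negotiated & (1ULL << VIRTIO_F_ORDER_PLATFORM)) ? 0 : 1;
}

/*
 * Same shape as virtio_wmb() in the patch. The C11 fences are only
 * placeholders: real code would call rte_smp_wmb() in the weak case
 * and rte_cio_wmb() otherwise, and a C11 fence by itself makes no
 * promise about ordering as seen by a DMA capable device.
 */
static inline void
sketch_wmb(uint8_t weak_barriers)
{
	if (weak_barriers)
		atomic_thread_fence(memory_order_release);
	else
		atomic_thread_fence(memory_order_seq_cst);
}
```

With something like this, the second patch of the proposed split (the
subsystem ID check) would not be needed: 'weak_barriers' stays 1 for
software vhost and becomes 0 only when the device offers the feature
and the driver accepts it.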