From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jerin Jacob
To:
CC: , , , , , Jerin Jacob
Date: Fri, 1 Jul 2016 16:46:37 +0530
Message-ID: <1467371814-26754-3-git-send-email-jerin.jacob@caviumnetworks.com>
X-Mailer: git-send-email 2.5.5
In-Reply-To: <1467371814-26754-1-git-send-email-jerin.jacob@caviumnetworks.com>
References: <1467028448-8914-1-git-send-email-jerin.jacob@caviumnetworks.com> <1467371814-26754-1-git-send-email-jerin.jacob@caviumnetworks.com>
MIME-Version: 1.0
Content-Type: text/plain
Subject: [dpdk-dev] [PATCH v2 2/3] virtio: move SSE based Rx implementation to separate file
List-Id: patches and discussions about DPDK

* Introduced cpuflag based run-time detection to
  select the SSE based simple Rx handler
* Split out SSE instruction based virtio simple Rx
  implementation to a separate file

Signed-off-by: Jerin Jacob
---
 drivers/net/virtio/Makefile                 |   4 +
 drivers/net/virtio/virtio_rxtx.c            |  35 ++--
 drivers/net/virtio/virtio_rxtx_simple.c     | 273 ++--------------------------
 drivers/net/virtio/virtio_rxtx_simple.h     | 133 ++++++++++++++
 drivers/net/virtio/virtio_rxtx_simple_sse.c | 222 ++++++++++++++++++++++
 5 files changed, 394 insertions(+), 273 deletions(-)
 create mode 100644 drivers/net/virtio/virtio_rxtx_simple.h
 create mode 100644 drivers/net/virtio/virtio_rxtx_simple_sse.c

diff --git a/drivers/net/virtio/Makefile b/drivers/net/virtio/Makefile index b9b0d8d..c4103b7 100644 --- a/drivers/net/virtio/Makefile +++ b/drivers/net/virtio/Makefile @@ -52,6 +52,10 @@ SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx.c SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_ethdev.c SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple.c +ifeq ($(CONFIG_RTE_ARCH_X86),y) +SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple_sse.c +endif + ifeq ($(CONFIG_RTE_VIRTIO_USER),y) SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_user/vhost_user.c SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_user/virtio_user_dev.c diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c index 63b53f7..a4d4a57 100644 --- a/drivers/net/virtio/virtio_rxtx.c +++ b/drivers/net/virtio/virtio_rxtx.c @@ -50,6 +50,7 @@ #include #include #include +#include #include "virtio_logs.h" #include "virtio_ethdev.h" @@ -470,6 +471,28 @@
virtio_dev_rx_queue_release(void *rxq) rte_memzone_free(mz); } +static void +virtio_update_rxtx_handler(struct rte_eth_dev *dev, + const struct rte_eth_txconf *tx_conf) +{ + uint8_t use_simple_rxtx = 0; + struct virtio_hw *hw = dev->data->dev_private; + +#if defined RTE_ARCH_X86 + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE3)) + use_simple_rxtx = 1; +#endif + /* Use simple rx/tx func if single segment and no offloads */ + if (use_simple_rxtx && + (tx_conf->txq_flags & VIRTIO_SIMPLE_FLAGS) == VIRTIO_SIMPLE_FLAGS && + !vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) { + PMD_INIT_LOG(INFO, "Using simple rx/tx path"); + dev->tx_pkt_burst = virtio_xmit_pkts_simple; + dev->rx_pkt_burst = virtio_recv_pkts_vec; + hw->use_simple_rxtx = use_simple_rxtx; + } +} + /* * struct rte_eth_dev *dev: Used to update dev * uint16_t nb_desc: Defaults to values read from config space @@ -499,17 +522,7 @@ virtio_dev_tx_queue_setup(struct rte_eth_dev *dev, return -EINVAL; } -#ifdef RTE_MACHINE_CPUFLAG_SSSE3 - struct virtio_hw *hw = dev->data->dev_private; - /* Use simple rx/tx func if single segment and no offloads */ - if ((tx_conf->txq_flags & VIRTIO_SIMPLE_FLAGS) == VIRTIO_SIMPLE_FLAGS && - !vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) { - PMD_INIT_LOG(INFO, "Using simple rx/tx path"); - dev->tx_pkt_burst = virtio_xmit_pkts_simple; - dev->rx_pkt_burst = virtio_recv_pkts_vec; - hw->use_simple_rxtx = 1; - } -#endif + virtio_update_rxtx_handler(dev, tx_conf); ret = virtio_dev_queue_setup(dev, VTNET_TQ, queue_idx, vtpci_queue_idx, nb_desc, socket_id, (void **)&txvq); diff --git a/drivers/net/virtio/virtio_rxtx_simple.c b/drivers/net/virtio/virtio_rxtx_simple.c index 67430da..485ddce 100644 --- a/drivers/net/virtio/virtio_rxtx_simple.c +++ b/drivers/net/virtio/virtio_rxtx_simple.c @@ -51,14 +51,7 @@ #include #include -#include "virtio_logs.h" -#include "virtio_ethdev.h" -#include "virtqueue.h" -#include "virtio_rxtx.h" - -#define RTE_VIRTIO_VPMD_RX_BURST 32 -#define 
RTE_VIRTIO_DESC_PER_LOOP 8 -#define RTE_VIRTIO_VPMD_RX_REARM_THRESH RTE_VIRTIO_VPMD_RX_BURST +#include "virtio_rxtx_simple.h" #ifndef __INTEL_COMPILER #pragma GCC diagnostic ignored "-Wcast-qual" @@ -89,260 +82,6 @@ virtqueue_enqueue_recv_refill_simple(struct virtqueue *vq, return 0; } -static inline void -virtio_rxq_rearm_vec(struct virtnet_rx *rxvq) -{ - int i; - uint16_t desc_idx; - struct rte_mbuf **sw_ring; - struct vring_desc *start_dp; - int ret; - struct virtqueue *vq = rxvq->vq; - - desc_idx = vq->vq_avail_idx & (vq->vq_nentries - 1); - sw_ring = &vq->sw_ring[desc_idx]; - start_dp = &vq->vq_ring.desc[desc_idx]; - - ret = rte_mempool_get_bulk(rxvq->mpool, (void **)sw_ring, - RTE_VIRTIO_VPMD_RX_REARM_THRESH); - if (unlikely(ret)) { - rte_eth_devices[rxvq->port_id].data->rx_mbuf_alloc_failed += - RTE_VIRTIO_VPMD_RX_REARM_THRESH; - return; - } - - for (i = 0; i < RTE_VIRTIO_VPMD_RX_REARM_THRESH; i++) { - uintptr_t p; - - p = (uintptr_t)&sw_ring[i]->rearm_data; - *(uint64_t *)p = rxvq->mbuf_initializer; - - start_dp[i].addr = - MBUF_DATA_DMA_ADDR(sw_ring[i], vq->offset) - - vq->hw->vtnet_hdr_size; - start_dp[i].len = sw_ring[i]->buf_len - - RTE_PKTMBUF_HEADROOM + vq->hw->vtnet_hdr_size; - } - - vq->vq_avail_idx += RTE_VIRTIO_VPMD_RX_REARM_THRESH; - vq->vq_free_cnt -= RTE_VIRTIO_VPMD_RX_REARM_THRESH; - vq_update_avail_idx(vq); -} - -#ifdef RTE_MACHINE_CPUFLAG_SSSE3 - -#include - -/* virtio vPMD receive routine, only accept(nb_pkts >= RTE_VIRTIO_DESC_PER_LOOP) - * - * This routine is for non-mergeable RX, one desc for each guest buffer. - * This routine is based on the RX ring layout optimization. Each entry in the - * avail ring points to the desc with the same index in the desc ring and this - * will never be changed in the driver. 
- * - * - nb_pkts < RTE_VIRTIO_DESC_PER_LOOP, just return no packet - */ -uint16_t -virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts, - uint16_t nb_pkts) -{ - struct virtnet_rx *rxvq = rx_queue; - struct virtqueue *vq = rxvq->vq; - uint16_t nb_used; - uint16_t desc_idx; - struct vring_used_elem *rused; - struct rte_mbuf **sw_ring; - struct rte_mbuf **sw_ring_end; - uint16_t nb_pkts_received; - __m128i shuf_msk1, shuf_msk2, len_adjust; - - shuf_msk1 = _mm_set_epi8( - 0xFF, 0xFF, 0xFF, 0xFF, - 0xFF, 0xFF, /* vlan tci */ - 5, 4, /* dat len */ - 0xFF, 0xFF, 5, 4, /* pkt len */ - 0xFF, 0xFF, 0xFF, 0xFF /* packet type */ - - ); - - shuf_msk2 = _mm_set_epi8( - 0xFF, 0xFF, 0xFF, 0xFF, - 0xFF, 0xFF, /* vlan tci */ - 13, 12, /* dat len */ - 0xFF, 0xFF, 13, 12, /* pkt len */ - 0xFF, 0xFF, 0xFF, 0xFF /* packet type */ - ); - - /* Subtract the header length. - * In which case do we need the header length in used->len ? - */ - len_adjust = _mm_set_epi16( - 0, 0, - 0, - (uint16_t)-vq->hw->vtnet_hdr_size, - 0, (uint16_t)-vq->hw->vtnet_hdr_size, - 0, 0); - - if (unlikely(nb_pkts < RTE_VIRTIO_DESC_PER_LOOP)) - return 0; - - nb_used = VIRTQUEUE_NUSED(vq); - - rte_compiler_barrier(); - - if (unlikely(nb_used == 0)) - return 0; - - nb_pkts = RTE_ALIGN_FLOOR(nb_pkts, RTE_VIRTIO_DESC_PER_LOOP); - nb_used = RTE_MIN(nb_used, nb_pkts); - - desc_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1)); - rused = &vq->vq_ring.used->ring[desc_idx]; - sw_ring = &vq->sw_ring[desc_idx]; - sw_ring_end = &vq->sw_ring[vq->vq_nentries]; - - _mm_prefetch((const void *)rused, _MM_HINT_T0); - - if (vq->vq_free_cnt >= RTE_VIRTIO_VPMD_RX_REARM_THRESH) { - virtio_rxq_rearm_vec(rxvq); - if (unlikely(virtqueue_kick_prepare(vq))) - virtqueue_notify(vq); - } - - for (nb_pkts_received = 0; - nb_pkts_received < nb_used;) { - __m128i desc[RTE_VIRTIO_DESC_PER_LOOP / 2]; - __m128i mbp[RTE_VIRTIO_DESC_PER_LOOP / 2]; - __m128i pkt_mb[RTE_VIRTIO_DESC_PER_LOOP]; - - mbp[0] = _mm_loadu_si128((__m128i 
*)(sw_ring + 0)); - desc[0] = _mm_loadu_si128((__m128i *)(rused + 0)); - _mm_storeu_si128((__m128i *)&rx_pkts[0], mbp[0]); - - mbp[1] = _mm_loadu_si128((__m128i *)(sw_ring + 2)); - desc[1] = _mm_loadu_si128((__m128i *)(rused + 2)); - _mm_storeu_si128((__m128i *)&rx_pkts[2], mbp[1]); - - mbp[2] = _mm_loadu_si128((__m128i *)(sw_ring + 4)); - desc[2] = _mm_loadu_si128((__m128i *)(rused + 4)); - _mm_storeu_si128((__m128i *)&rx_pkts[4], mbp[2]); - - mbp[3] = _mm_loadu_si128((__m128i *)(sw_ring + 6)); - desc[3] = _mm_loadu_si128((__m128i *)(rused + 6)); - _mm_storeu_si128((__m128i *)&rx_pkts[6], mbp[3]); - - pkt_mb[1] = _mm_shuffle_epi8(desc[0], shuf_msk2); - pkt_mb[0] = _mm_shuffle_epi8(desc[0], shuf_msk1); - pkt_mb[1] = _mm_add_epi16(pkt_mb[1], len_adjust); - pkt_mb[0] = _mm_add_epi16(pkt_mb[0], len_adjust); - _mm_storeu_si128((void *)&rx_pkts[1]->rx_descriptor_fields1, - pkt_mb[1]); - _mm_storeu_si128((void *)&rx_pkts[0]->rx_descriptor_fields1, - pkt_mb[0]); - - pkt_mb[3] = _mm_shuffle_epi8(desc[1], shuf_msk2); - pkt_mb[2] = _mm_shuffle_epi8(desc[1], shuf_msk1); - pkt_mb[3] = _mm_add_epi16(pkt_mb[3], len_adjust); - pkt_mb[2] = _mm_add_epi16(pkt_mb[2], len_adjust); - _mm_storeu_si128((void *)&rx_pkts[3]->rx_descriptor_fields1, - pkt_mb[3]); - _mm_storeu_si128((void *)&rx_pkts[2]->rx_descriptor_fields1, - pkt_mb[2]); - - pkt_mb[5] = _mm_shuffle_epi8(desc[2], shuf_msk2); - pkt_mb[4] = _mm_shuffle_epi8(desc[2], shuf_msk1); - pkt_mb[5] = _mm_add_epi16(pkt_mb[5], len_adjust); - pkt_mb[4] = _mm_add_epi16(pkt_mb[4], len_adjust); - _mm_storeu_si128((void *)&rx_pkts[5]->rx_descriptor_fields1, - pkt_mb[5]); - _mm_storeu_si128((void *)&rx_pkts[4]->rx_descriptor_fields1, - pkt_mb[4]); - - pkt_mb[7] = _mm_shuffle_epi8(desc[3], shuf_msk2); - pkt_mb[6] = _mm_shuffle_epi8(desc[3], shuf_msk1); - pkt_mb[7] = _mm_add_epi16(pkt_mb[7], len_adjust); - pkt_mb[6] = _mm_add_epi16(pkt_mb[6], len_adjust); - _mm_storeu_si128((void *)&rx_pkts[7]->rx_descriptor_fields1, - pkt_mb[7]); - 
_mm_storeu_si128((void *)&rx_pkts[6]->rx_descriptor_fields1, - pkt_mb[6]); - - if (unlikely(nb_used <= RTE_VIRTIO_DESC_PER_LOOP)) { - if (sw_ring + nb_used <= sw_ring_end) - nb_pkts_received += nb_used; - else - nb_pkts_received += sw_ring_end - sw_ring; - break; - } else { - if (unlikely(sw_ring + RTE_VIRTIO_DESC_PER_LOOP >= - sw_ring_end)) { - nb_pkts_received += sw_ring_end - sw_ring; - break; - } else { - nb_pkts_received += RTE_VIRTIO_DESC_PER_LOOP; - - rx_pkts += RTE_VIRTIO_DESC_PER_LOOP; - sw_ring += RTE_VIRTIO_DESC_PER_LOOP; - rused += RTE_VIRTIO_DESC_PER_LOOP; - nb_used -= RTE_VIRTIO_DESC_PER_LOOP; - } - } - } - - vq->vq_used_cons_idx += nb_pkts_received; - vq->vq_free_cnt += nb_pkts_received; - rxvq->stats.packets += nb_pkts_received; - return nb_pkts_received; -} - -#endif - -#define VIRTIO_TX_FREE_THRESH 32 -#define VIRTIO_TX_MAX_FREE_BUF_SZ 32 -#define VIRTIO_TX_FREE_NR 32 -/* TODO: vq->tx_free_cnt could mean num of free slots so we could avoid shift */ -static inline void -virtio_xmit_cleanup(struct virtqueue *vq) -{ - uint16_t i, desc_idx; - int nb_free = 0; - struct rte_mbuf *m, *free[VIRTIO_TX_MAX_FREE_BUF_SZ]; - - desc_idx = (uint16_t)(vq->vq_used_cons_idx & - ((vq->vq_nentries >> 1) - 1)); - m = (struct rte_mbuf *)vq->vq_descx[desc_idx++].cookie; - m = __rte_pktmbuf_prefree_seg(m); - if (likely(m != NULL)) { - free[0] = m; - nb_free = 1; - for (i = 1; i < VIRTIO_TX_FREE_NR; i++) { - m = (struct rte_mbuf *)vq->vq_descx[desc_idx++].cookie; - m = __rte_pktmbuf_prefree_seg(m); - if (likely(m != NULL)) { - if (likely(m->pool == free[0]->pool)) - free[nb_free++] = m; - else { - rte_mempool_put_bulk(free[0]->pool, - (void **)free, nb_free); - free[0] = m; - nb_free = 1; - } - } - } - rte_mempool_put_bulk(free[0]->pool, (void **)free, nb_free); - } else { - for (i = 1; i < VIRTIO_TX_FREE_NR; i++) { - m = (struct rte_mbuf *)vq->vq_descx[desc_idx++].cookie; - m = __rte_pktmbuf_prefree_seg(m); - if (m != NULL) - rte_mempool_put(m->pool, m); - } - } - - 
vq->vq_used_cons_idx += VIRTIO_TX_FREE_NR; - vq->vq_free_cnt += (VIRTIO_TX_FREE_NR << 1); -} - uint16_t virtio_xmit_pkts_simple(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts) @@ -423,3 +162,13 @@ virtio_rxq_vec_setup(struct virtnet_rx *rxq) return 0; } + +/* Stub for linkage when arch specific implementation is not available */ +uint16_t __attribute__((weak)) +virtio_recv_pkts_vec(void *rx_queue __rte_unused, + struct rte_mbuf **rx_pkts __rte_unused, + uint16_t nb_pkts __rte_unused) +{ + rte_panic("Wrong weak function linked by linker\n"); + return 0; +} diff --git a/drivers/net/virtio/virtio_rxtx_simple.h b/drivers/net/virtio/virtio_rxtx_simple.h new file mode 100644 index 0000000..8cb43c0 --- /dev/null +++ b/drivers/net/virtio/virtio_rxtx_simple.h @@ -0,0 +1,133 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2010-2016 Intel Corporation. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _VIRTIO_RXTX_SIMPLE_H_ +#define _VIRTIO_RXTX_SIMPLE_H_ + +#include + +#include "virtio_logs.h" +#include "virtio_ethdev.h" +#include "virtqueue.h" +#include "virtio_rxtx.h" + +#define RTE_VIRTIO_VPMD_RX_BURST 32 +#define RTE_VIRTIO_VPMD_RX_REARM_THRESH RTE_VIRTIO_VPMD_RX_BURST + +static inline void +virtio_rxq_rearm_vec(struct virtnet_rx *rxvq) +{ + int i; + uint16_t desc_idx; + struct rte_mbuf **sw_ring; + struct vring_desc *start_dp; + int ret; + struct virtqueue *vq = rxvq->vq; + + desc_idx = vq->vq_avail_idx & (vq->vq_nentries - 1); + sw_ring = &vq->sw_ring[desc_idx]; + start_dp = &vq->vq_ring.desc[desc_idx]; + + ret = rte_mempool_get_bulk(rxvq->mpool, (void **)sw_ring, + RTE_VIRTIO_VPMD_RX_REARM_THRESH); + if (unlikely(ret)) { + rte_eth_devices[rxvq->port_id].data->rx_mbuf_alloc_failed += + RTE_VIRTIO_VPMD_RX_REARM_THRESH; + return; + } + + for (i = 0; i < RTE_VIRTIO_VPMD_RX_REARM_THRESH; i++) { + uintptr_t p; + + p = (uintptr_t)&sw_ring[i]->rearm_data; + *(uint64_t *)p = rxvq->mbuf_initializer; + + start_dp[i].addr = + MBUF_DATA_DMA_ADDR(sw_ring[i], vq->offset) - + vq->hw->vtnet_hdr_size; + start_dp[i].len = sw_ring[i]->buf_len - + RTE_PKTMBUF_HEADROOM + vq->hw->vtnet_hdr_size; + } + + vq->vq_avail_idx += RTE_VIRTIO_VPMD_RX_REARM_THRESH; + vq->vq_free_cnt -= RTE_VIRTIO_VPMD_RX_REARM_THRESH; + vq_update_avail_idx(vq); +} + +#define VIRTIO_TX_FREE_THRESH 32 +#define VIRTIO_TX_MAX_FREE_BUF_SZ 32 +#define 
VIRTIO_TX_FREE_NR 32 +/* TODO: vq->tx_free_cnt could mean num of free slots so we could avoid shift */ +static inline void +virtio_xmit_cleanup(struct virtqueue *vq) +{ + uint16_t i, desc_idx; + int nb_free = 0; + struct rte_mbuf *m, *free[VIRTIO_TX_MAX_FREE_BUF_SZ]; + + desc_idx = (uint16_t)(vq->vq_used_cons_idx & + ((vq->vq_nentries >> 1) - 1)); + m = (struct rte_mbuf *)vq->vq_descx[desc_idx++].cookie; + m = __rte_pktmbuf_prefree_seg(m); + if (likely(m != NULL)) { + free[0] = m; + nb_free = 1; + for (i = 1; i < VIRTIO_TX_FREE_NR; i++) { + m = (struct rte_mbuf *)vq->vq_descx[desc_idx++].cookie; + m = __rte_pktmbuf_prefree_seg(m); + if (likely(m != NULL)) { + if (likely(m->pool == free[0]->pool)) + free[nb_free++] = m; + else { + rte_mempool_put_bulk(free[0]->pool, + (void **)free, nb_free); + free[0] = m; + nb_free = 1; + } + } + } + rte_mempool_put_bulk(free[0]->pool, (void **)free, nb_free); + } else { + for (i = 1; i < VIRTIO_TX_FREE_NR; i++) { + m = (struct rte_mbuf *)vq->vq_descx[desc_idx++].cookie; + m = __rte_pktmbuf_prefree_seg(m); + if (m != NULL) + rte_mempool_put(m->pool, m); + } + } + + vq->vq_used_cons_idx += VIRTIO_TX_FREE_NR; + vq->vq_free_cnt += (VIRTIO_TX_FREE_NR << 1); +} + +#endif /* _VIRTIO_RXTX_SIMPLE_H_ */ diff --git a/drivers/net/virtio/virtio_rxtx_simple_sse.c b/drivers/net/virtio/virtio_rxtx_simple_sse.c new file mode 100644 index 0000000..39000e8 --- /dev/null +++ b/drivers/net/virtio/virtio_rxtx_simple_sse.c @@ -0,0 +1,222 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2010-2015 Intel Corporation. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. 
+ * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include +#include +#include +#include +#include + +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "virtio_rxtx_simple.h" + +#define RTE_VIRTIO_VPMD_RX_BURST 32 +#define RTE_VIRTIO_DESC_PER_LOOP 8 +#define RTE_VIRTIO_VPMD_RX_REARM_THRESH RTE_VIRTIO_VPMD_RX_BURST + +/* virtio vPMD receive routine, only accept(nb_pkts >= RTE_VIRTIO_DESC_PER_LOOP) + * + * This routine is for non-mergeable RX, one desc for each guest buffer. + * This routine is based on the RX ring layout optimization. Each entry in the + * avail ring points to the desc with the same index in the desc ring and this + * will never be changed in the driver. 
+ * + * - nb_pkts < RTE_VIRTIO_DESC_PER_LOOP, just return no packet + */ +uint16_t +virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts, + uint16_t nb_pkts) +{ + struct virtnet_rx *rxvq = rx_queue; + struct virtqueue *vq = rxvq->vq; + uint16_t nb_used; + uint16_t desc_idx; + struct vring_used_elem *rused; + struct rte_mbuf **sw_ring; + struct rte_mbuf **sw_ring_end; + uint16_t nb_pkts_received; + __m128i shuf_msk1, shuf_msk2, len_adjust; + + shuf_msk1 = _mm_set_epi8( + 0xFF, 0xFF, 0xFF, 0xFF, + 0xFF, 0xFF, /* vlan tci */ + 5, 4, /* dat len */ + 0xFF, 0xFF, 5, 4, /* pkt len */ + 0xFF, 0xFF, 0xFF, 0xFF /* packet type */ + + ); + + shuf_msk2 = _mm_set_epi8( + 0xFF, 0xFF, 0xFF, 0xFF, + 0xFF, 0xFF, /* vlan tci */ + 13, 12, /* dat len */ + 0xFF, 0xFF, 13, 12, /* pkt len */ + 0xFF, 0xFF, 0xFF, 0xFF /* packet type */ + ); + + /* Subtract the header length. + * In which case do we need the header length in used->len ? + */ + len_adjust = _mm_set_epi16( + 0, 0, + 0, + (uint16_t)-vq->hw->vtnet_hdr_size, + 0, (uint16_t)-vq->hw->vtnet_hdr_size, + 0, 0); + + if (unlikely(nb_pkts < RTE_VIRTIO_DESC_PER_LOOP)) + return 0; + + nb_used = VIRTQUEUE_NUSED(vq); + + rte_compiler_barrier(); + + if (unlikely(nb_used == 0)) + return 0; + + nb_pkts = RTE_ALIGN_FLOOR(nb_pkts, RTE_VIRTIO_DESC_PER_LOOP); + nb_used = RTE_MIN(nb_used, nb_pkts); + + desc_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1)); + rused = &vq->vq_ring.used->ring[desc_idx]; + sw_ring = &vq->sw_ring[desc_idx]; + sw_ring_end = &vq->sw_ring[vq->vq_nentries]; + + _mm_prefetch((const void *)rused, _MM_HINT_T0); + + if (vq->vq_free_cnt >= RTE_VIRTIO_VPMD_RX_REARM_THRESH) { + virtio_rxq_rearm_vec(rxvq); + if (unlikely(virtqueue_kick_prepare(vq))) + virtqueue_notify(vq); + } + + for (nb_pkts_received = 0; + nb_pkts_received < nb_used;) { + __m128i desc[RTE_VIRTIO_DESC_PER_LOOP / 2]; + __m128i mbp[RTE_VIRTIO_DESC_PER_LOOP / 2]; + __m128i pkt_mb[RTE_VIRTIO_DESC_PER_LOOP]; + + mbp[0] = _mm_loadu_si128((__m128i 
*)(sw_ring + 0)); + desc[0] = _mm_loadu_si128((__m128i *)(rused + 0)); + _mm_storeu_si128((__m128i *)&rx_pkts[0], mbp[0]); + + mbp[1] = _mm_loadu_si128((__m128i *)(sw_ring + 2)); + desc[1] = _mm_loadu_si128((__m128i *)(rused + 2)); + _mm_storeu_si128((__m128i *)&rx_pkts[2], mbp[1]); + + mbp[2] = _mm_loadu_si128((__m128i *)(sw_ring + 4)); + desc[2] = _mm_loadu_si128((__m128i *)(rused + 4)); + _mm_storeu_si128((__m128i *)&rx_pkts[4], mbp[2]); + + mbp[3] = _mm_loadu_si128((__m128i *)(sw_ring + 6)); + desc[3] = _mm_loadu_si128((__m128i *)(rused + 6)); + _mm_storeu_si128((__m128i *)&rx_pkts[6], mbp[3]); + + pkt_mb[1] = _mm_shuffle_epi8(desc[0], shuf_msk2); + pkt_mb[0] = _mm_shuffle_epi8(desc[0], shuf_msk1); + pkt_mb[1] = _mm_add_epi16(pkt_mb[1], len_adjust); + pkt_mb[0] = _mm_add_epi16(pkt_mb[0], len_adjust); + _mm_storeu_si128((void *)&rx_pkts[1]->rx_descriptor_fields1, + pkt_mb[1]); + _mm_storeu_si128((void *)&rx_pkts[0]->rx_descriptor_fields1, + pkt_mb[0]); + + pkt_mb[3] = _mm_shuffle_epi8(desc[1], shuf_msk2); + pkt_mb[2] = _mm_shuffle_epi8(desc[1], shuf_msk1); + pkt_mb[3] = _mm_add_epi16(pkt_mb[3], len_adjust); + pkt_mb[2] = _mm_add_epi16(pkt_mb[2], len_adjust); + _mm_storeu_si128((void *)&rx_pkts[3]->rx_descriptor_fields1, + pkt_mb[3]); + _mm_storeu_si128((void *)&rx_pkts[2]->rx_descriptor_fields1, + pkt_mb[2]); + + pkt_mb[5] = _mm_shuffle_epi8(desc[2], shuf_msk2); + pkt_mb[4] = _mm_shuffle_epi8(desc[2], shuf_msk1); + pkt_mb[5] = _mm_add_epi16(pkt_mb[5], len_adjust); + pkt_mb[4] = _mm_add_epi16(pkt_mb[4], len_adjust); + _mm_storeu_si128((void *)&rx_pkts[5]->rx_descriptor_fields1, + pkt_mb[5]); + _mm_storeu_si128((void *)&rx_pkts[4]->rx_descriptor_fields1, + pkt_mb[4]); + + pkt_mb[7] = _mm_shuffle_epi8(desc[3], shuf_msk2); + pkt_mb[6] = _mm_shuffle_epi8(desc[3], shuf_msk1); + pkt_mb[7] = _mm_add_epi16(pkt_mb[7], len_adjust); + pkt_mb[6] = _mm_add_epi16(pkt_mb[6], len_adjust); + _mm_storeu_si128((void *)&rx_pkts[7]->rx_descriptor_fields1, + pkt_mb[7]); + 
_mm_storeu_si128((void *)&rx_pkts[6]->rx_descriptor_fields1, + pkt_mb[6]); + + if (unlikely(nb_used <= RTE_VIRTIO_DESC_PER_LOOP)) { + if (sw_ring + nb_used <= sw_ring_end) + nb_pkts_received += nb_used; + else + nb_pkts_received += sw_ring_end - sw_ring; + break; + } else { + if (unlikely(sw_ring + RTE_VIRTIO_DESC_PER_LOOP >= + sw_ring_end)) { + nb_pkts_received += sw_ring_end - sw_ring; + break; + } else { + nb_pkts_received += RTE_VIRTIO_DESC_PER_LOOP; + + rx_pkts += RTE_VIRTIO_DESC_PER_LOOP; + sw_ring += RTE_VIRTIO_DESC_PER_LOOP; + rused += RTE_VIRTIO_DESC_PER_LOOP; + nb_used -= RTE_VIRTIO_DESC_PER_LOOP; + } + } + } + + vq->vq_used_cons_idx += nb_pkts_received; + vq->vq_free_cnt += nb_pkts_received; + rxvq->stats.packets += nb_pkts_received; + return nb_pkts_received; +} -- 2.5.5
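
For readers who want the idea of this patch outside the DPDK tree, the
run-time selection done in virtio_update_rxtx_handler() can be sketched in
plain C as below. This is only an illustration under stated assumptions, not
the driver code: struct port, rx_scalar(), rx_vector(), SIMPLE_FLAGS,
cpu_has_ssse3(), port_init() and port_rx() are hypothetical stand-ins, and the
CPU probe uses the GCC/clang builtin __builtin_cpu_supports() in place of
rte_cpu_get_flag_enabled(). The sketch probes for SSSE3 because the vector Rx
loop relies on _mm_shuffle_epi8; the patch itself tests RTE_CPUFLAG_SSE3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical burst-handler signature, loosely modelled on rx_pkt_burst. */
typedef uint16_t (*rx_burst_fn)(void *rxq, void **pkts, uint16_t nb_pkts);

/* Hypothetical stand-in for VIRTIO_SIMPLE_FLAGS. */
#define SIMPLE_FLAGS 0x3u

/* Hypothetical, minimal stand-in for the per-port device state. */
struct port {
	rx_burst_fn rx_burst;
	int use_simple_rxtx;
};

/* Placeholder for the generic Rx path; it receives nothing in this sketch. */
static uint16_t
rx_scalar(void *rxq, void **pkts, uint16_t nb_pkts)
{
	(void)rxq; (void)pkts; (void)nb_pkts;
	return 0;
}

/* Placeholder for the vector Rx path; it receives nothing in this sketch. */
static uint16_t
rx_vector(void *rxq, void **pkts, uint16_t nb_pkts)
{
	(void)rxq; (void)pkts; (void)nb_pkts;
	return 0;
}

/* Run-time CPU probe; reports "not available" on non-x86 builds. */
static int
cpu_has_ssse3(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
	return __builtin_cpu_supports("ssse3") != 0;
#else
	return 0;
#endif
}

/*
 * Pick the vector handler only if the CPU supports it, all "simple" flags
 * are set and mergeable Rx buffers are not negotiated. Resetting to the
 * scalar handler first is an assumption of this sketch; the patch leaves
 * the default handlers to be set elsewhere.
 */
static void
select_rx_handler(struct port *p, int has_vector, uint32_t txq_flags,
		  int mergeable_rx)
{
	p->rx_burst = rx_scalar;
	p->use_simple_rxtx = 0;

	if (has_vector &&
	    (txq_flags & SIMPLE_FLAGS) == SIMPLE_FLAGS &&
	    !mergeable_rx) {
		p->rx_burst = rx_vector;
		p->use_simple_rxtx = 1;
	}
}

/* Queue-setup time entry point: probe the CPU once, then select. */
static void
port_init(struct port *p, uint32_t txq_flags, int mergeable_rx)
{
	select_rx_handler(p, cpu_has_ssse3(), txq_flags, mergeable_rx);
}

/* Data-path call goes through the pointer chosen at setup time. */
static uint16_t
port_rx(struct port *p, void *rxq, void **pkts, uint16_t nb_pkts)
{
	assert(p->rx_burst != NULL);
	return p->rx_burst(rxq, pkts, nb_pkts);
}
```

Keeping the probe out of select_rx_handler() and passing its result in as an
argument makes the selection logic testable on any build machine. The
data path pays only for an indirect call, as with rx_pkt_burst.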