From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Loftus, Ciara"
To: Li RongQing
CC: "dev@dpdk.org"
Subject: Re: [dpdk-dev] [PATCH][v3] net/af_xdp: optimize RX path by removing the unneeded allocation mbuf
Date: Tue, 24 Nov 2020 14:21:14 +0000
References: <1605852868-25682-1-git-send-email-lirongqing@baidu.com>
In-Reply-To: <1605852868-25682-1-git-send-email-lirongqing@baidu.com>
List-Id: DPDK patches and discussions

> 
> When receiving packets, the maximum burst of mbufs is allocated up
> front; if the hardware delivers fewer packets than that maximum, the
> redundant mbufs have to be freed again, which hurts performance.
> 
> Optimize the RX path by allocating mbufs based on the result of
> xsk_ring_cons__peek, so that no redundant allocation and free happen
> when receiving packets.
> 
> The RX cached_cons must also be rolled back if allocating the mbufs
> fails, as found by Ciara Loftus.
> 
> Signed-off-by: Li RongQing
> Signed-off-by: Dongsheng Rong
> ---
> 
> V2: roll back rx cached_cons if the mbufs fail to be allocated
> V3: add a comment where rx cached_cons is rolled back.
>     We should create a function for the rollback, as suggested by Ciara
>     Loftus, something like xsk_ring_cons__cancel, but such a function
>     should live in the kernel, and I will send it to the kernel.
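For illustration only, a cancel helper along the lines suggested in the
changelog above might look like the sketch below. The name
xsk_ring_cons__cancel and its pairing with xsk_ring_cons__peek() are taken
from the changelog's suggestion; no such helper exists in the xsk headers at
the time of this mail, so treat this as a sketch, not an existing API.

  /* Hypothetical helper (would sit alongside xsk_ring_cons__peek() in the
   * xsk headers, where struct xsk_ring_cons is defined): undo the
   * cached_cons advance performed by xsk_ring_cons__peek() when the peeked
   * descriptors cannot be consumed after all, e.g. because the mbuf
   * allocation for the fill queue failed.
   */
  static inline void
  xsk_ring_cons__cancel(struct xsk_ring_cons *cons, __u32 nb)
  {
          cons->cached_cons -= nb;
  }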
> 
>  drivers/net/af_xdp/rte_eth_af_xdp.c | 73 ++++++++++++++---------------
>  1 file changed, 35 insertions(+), 38 deletions(-)
> 
> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
> index 2c7892bd7..69a4d54a3 100644
> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
> @@ -255,28 +255,32 @@ af_xdp_rx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>          struct xsk_umem_info *umem = rxq->umem;
>          uint32_t idx_rx = 0;
>          unsigned long rx_bytes = 0;
> -        int rcvd, i;
> +        int i;
>          struct rte_mbuf *fq_bufs[ETH_AF_XDP_RX_BATCH_SIZE];
> 
> -        /* allocate bufs for fill queue replenishment after rx */
> -        if (rte_pktmbuf_alloc_bulk(umem->mb_pool, fq_bufs, nb_pkts)) {
> -                AF_XDP_LOG(DEBUG,
> -                        "Failed to get enough buffers for fq.\n");
> -                return 0;
> -        }
> +        nb_pkts = xsk_ring_cons__peek(rx, nb_pkts, &idx_rx);
> 
> -        rcvd = xsk_ring_cons__peek(rx, nb_pkts, &idx_rx);
> -
> -        if (rcvd == 0) {
> +        if (nb_pkts == 0) {
>  #if defined(XDP_USE_NEED_WAKEUP)
>                  if (xsk_ring_prod__needs_wakeup(fq))
>                          (void)poll(rxq->fds, 1, 1000);
>  #endif
> 
> -                goto out;
> +                return 0;
> +        }
> +
> +        /* allocate bufs for fill queue replenishment after rx */
> +        if (rte_pktmbuf_alloc_bulk(umem->mb_pool, fq_bufs, nb_pkts)) {
> +                AF_XDP_LOG(DEBUG,
> +                        "Failed to get enough buffers for fq.\n");
> +                /* rollback cached_cons which is added by
> +                 * xsk_ring_prod__needs_wakeup
> +                 */

Thanks for adding the comment. There's a small mistake here. The function
in which cached_cons is added is xsk_ring_cons__peek. Could you please
submit a v4 with this change?
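In other words, only the comment text needs to change; the rollback itself
can stay as it is. Roughly like the snippet below (illustrative only, not
the submitted v4):

  		/* rollback cached_cons which is incremented by
  		 * xsk_ring_cons__peek()
  		 */
  		rx->cached_cons -= nb_pkts;
  		return 0;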
Thanks!
Ciara

> +                rx->cached_cons -= nb_pkts;
> +                return 0;
>          }
> 
> -        for (i = 0; i < rcvd; i++) {
> +        for (i = 0; i < nb_pkts; i++) {
>                  const struct xdp_desc *desc;
>                  uint64_t addr;
>                  uint32_t len;
> @@ -301,20 +305,14 @@ af_xdp_rx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>                  rx_bytes += len;
>          }
> 
> -        xsk_ring_cons__release(rx, rcvd);
> -
> -        (void)reserve_fill_queue(umem, rcvd, fq_bufs, fq);
> +        xsk_ring_cons__release(rx, nb_pkts);
> +        (void)reserve_fill_queue(umem, nb_pkts, fq_bufs, fq);
> 
>          /* statistics */
> -        rxq->stats.rx_pkts += rcvd;
> +        rxq->stats.rx_pkts += nb_pkts;
>          rxq->stats.rx_bytes += rx_bytes;
> 
> -out:
> -        if (rcvd != nb_pkts)
> -                rte_mempool_put_bulk(umem->mb_pool, (void **)&fq_bufs[rcvd],
> -                                     nb_pkts - rcvd);
> -
> -        return rcvd;
> +        return nb_pkts;
>  }
>  #else
>  static uint16_t
> @@ -326,7 +324,7 @@ af_xdp_rx_cp(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>          struct xsk_ring_prod *fq = &rxq->fq;
>          uint32_t idx_rx = 0;
>          unsigned long rx_bytes = 0;
> -        int rcvd, i;
> +        int i;
>          uint32_t free_thresh = fq->size >> 1;
>          struct rte_mbuf *mbufs[ETH_AF_XDP_RX_BATCH_SIZE];
> 
> @@ -334,20 +332,24 @@ af_xdp_rx_cp(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>                  (void)reserve_fill_queue(umem, ETH_AF_XDP_RX_BATCH_SIZE,
>                                           NULL, fq);
> 
> -        if (unlikely(rte_pktmbuf_alloc_bulk(rxq->mb_pool, mbufs, nb_pkts) != 0))
> -                return 0;
> -
> -        rcvd = xsk_ring_cons__peek(rx, nb_pkts, &idx_rx);
> -        if (rcvd == 0) {
> +        nb_pkts = xsk_ring_cons__peek(rx, nb_pkts, &idx_rx);
> +        if (nb_pkts == 0) {
>  #if defined(XDP_USE_NEED_WAKEUP)
>                  if (xsk_ring_prod__needs_wakeup(fq))
>                          (void)poll(rxq->fds, 1, 1000);
>  #endif
> +                return 0;
> +        }
> 
> -                goto out;
> +        if (unlikely(rte_pktmbuf_alloc_bulk(rxq->mb_pool, mbufs, nb_pkts))) {
> +                /* rollback cached_cons which is added by
> +                 * xsk_ring_prod__needs_wakeup
> +                 */
> +                rx->cached_cons -= nb_pkts;
> +                return 0;
>          }
> 
> -        for (i = 0; i < rcvd; i++) {
> +        for (i = 0; i < nb_pkts; i++) {
>                  const struct xdp_desc *desc;
>                  uint64_t addr;
>                  uint32_t len;
> @@ -366,18 +368,13 @@ af_xdp_rx_cp(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>                  bufs[i] = mbufs[i];
>          }
> 
> -        xsk_ring_cons__release(rx, rcvd);
> +        xsk_ring_cons__release(rx, nb_pkts);
> 
>          /* statistics */
> -        rxq->stats.rx_pkts += rcvd;
> +        rxq->stats.rx_pkts += nb_pkts;
>          rxq->stats.rx_bytes += rx_bytes;
> 
> -out:
> -        if (rcvd != nb_pkts)
> -                rte_mempool_put_bulk(rxq->mb_pool, (void **)&mbufs[rcvd],
> -                                     nb_pkts - rcvd);
> -
> -        return rcvd;
> +        return nb_pkts;
>  }
>  #endif
> 
> --
> 2.17.3
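For readers following the cached_cons discussion, the reason the rollback
matters can be seen in isolation with a small stand-alone model. The toy_*
names below are stand-ins for the real xsk_ring_cons API and are not driver
or libbpf code; they only mimic the peek/cancel bookkeeping this patch
relies on.

  #include <inttypes.h>
  #include <stdbool.h>
  #include <stdint.h>
  #include <stdio.h>

  /* Stand-in for struct xsk_ring_cons: only the cached indices matter here. */
  struct toy_ring_cons {
          uint32_t cached_prod;   /* descriptors known to be available */
          uint32_t cached_cons;   /* descriptors already claimed by peek */
  };

  /* Claim up to nb descriptors, mimicking xsk_ring_cons__peek(). */
  static uint32_t toy_peek(struct toy_ring_cons *r, uint32_t nb)
  {
          uint32_t avail = r->cached_prod - r->cached_cons;

          if (nb > avail)
                  nb = avail;
          r->cached_cons += nb;
          return nb;
  }

  /* Undo a peek, mimicking the proposed xsk_ring_cons__cancel(). */
  static void toy_cancel(struct toy_ring_cons *r, uint32_t nb)
  {
          r->cached_cons -= nb;
  }

  int main(void)
  {
          struct toy_ring_cons rx = { .cached_prod = 32, .cached_cons = 0 };
          bool alloc_failed = true;  /* pretend rte_pktmbuf_alloc_bulk() failed */
          uint32_t nb_pkts = toy_peek(&rx, 16);

          if (alloc_failed) {
                  /* Without this rollback the 16 peeked descriptors would be
                   * lost: claimed by peek, but never delivered nor released.
                   */
                  toy_cancel(&rx, nb_pkts);
                  nb_pkts = 0;
          }

          printf("received %" PRIu32 ", cached_cons now %" PRIu32 "\n",
                 nb_pkts, rx.cached_cons);
          return 0;
  }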