From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 80C2DA0567; Sun, 26 Jun 2022 22:44:52 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 2836541141; Sun, 26 Jun 2022 22:44:52 +0200 (CEST) Received: from mail-io1-f50.google.com (mail-io1-f50.google.com [209.85.166.50]) by mails.dpdk.org (Postfix) with ESMTP id 288B140E50 for ; Sun, 26 Jun 2022 22:44:51 +0200 (CEST) Received: by mail-io1-f50.google.com with SMTP id s17so7752166iob.7 for ; Sun, 26 Jun 2022 13:44:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=broadcom.com; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=6bZmM5XRmPA7YEqG8fdxf80woW0Mu0qdLMuF8VFo4aE=; b=F2XZxPOHxK3ls1bu9e31+1cVPVPTcpt8TD7kWTTVSve/clvGIX8bDN46qujLi/QpSG TNAMoaKwoO7bAAH0TQUvTCVOKfkocE4jEOic76jHDPO91B/5BTaHquvQ0fZVWGo6DWPd pNDGhqwmBybSNiTg/60uKbKC3y0m8qQQ0od1I= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=6bZmM5XRmPA7YEqG8fdxf80woW0Mu0qdLMuF8VFo4aE=; b=lQFimSulbmDYSU7KL2d7YPjUOG4Zi4bsi8qvF74IOIT/4+B6hQ6uolQVGTfptLQzCp wZzcxUm68cc4kfifRl+M4L2crXQQ6G7/rKE0h1rF8/jm8rr9qpQmYdTP1HFkO5Q5pAue 9FF3P+4E0ISaqpEnG80SovufNqXgp0P5hBrvyLq1qIghi3BntOlZvxJiOA7Wdedqn8WU /vomJKJg5DkZ/OsD/gDG9YXpI6Xucw45A//cA520YFNVFcvNCwlMk0uEVW++cc8ffB/s 6EOQTjKVAbw4BULi+7H8BBKkkDO/fqI40+00g0L+tRgGFLxJnYhCHRk+psyQ/JeMYj5I SHiw== X-Gm-Message-State: AJIora87nT2u3yQuAjqrJLIx8y0gBRl4Uu9XE14STrDHOoh7I83F74qB A/Kg39/83BmE+GFK+Seo83OJ4Z2FvVaKn8Ew/nCkRQ== X-Google-Smtp-Source: AGRyM1tV4aLrtV6tLQ8U8CudTcEYne/WONmJQ3T8ar4poVxBUCOZS3lPPuAtHuTw5vvNGmU/OA9Wryc7Ok4/qACg4hg= X-Received: by 2002:a02:a893:0:b0:339:dfee:2d24 with SMTP id l19-20020a02a893000000b00339dfee2d24mr5985193jam.69.1656276290328; Sun, 26 Jun 2022 13:44:50 -0700 (PDT) MIME-Version: 1.0 References: <20220613062225.2317537-1-ruifeng.wang@arm.com> In-Reply-To: <20220613062225.2317537-1-ruifeng.wang@arm.com> From: Ajit Khaparde Date: Sun, 26 Jun 2022 13:44:34 -0700 Message-ID: Subject: Re: [PATCH] net/bnxt: reduce barriers in NEON vector Rx To: Ruifeng Wang , Ferruh Yigit Cc: Somnath Kotur , dpdk-dev , Honnappa Nagarahalli , nd Content-Type: multipart/signed; protocol="application/pkcs7-signature"; micalg=sha-256; boundary="000000000000f406d805e25fdf5f" X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org --000000000000f406d805e25fdf5f Content-Type: text/plain; charset="UTF-8" On Sun, Jun 12, 2022 at 11:22 PM Ruifeng Wang wrote: > > To read descriptors in expected order, barriers are inserted after each > descriptor read. The excessive use of barriers is unnecessary and could > cause performance drop. > > Removed barriers between descriptor reads. And changed counting of valid > packets so as to handle discontinuous valid packets. Because out of > order read could lead to valid descriptors that fetched being > discontinuous. > > In VPP L3 routing test, 6% performance gain was observed. The test was > done on a platform with ThunderX2 CPU and Broadcom PS225 NIC. > > Signed-off-by: Ruifeng Wang Reviewed-by: Ajit Khaparde Patch applied to dpdk-next-net-brcm. Thanks > > --- > drivers/net/bnxt/bnxt_rxtx_vec_neon.c | 47 ++++++++++++++------------- > 1 file changed, 25 insertions(+), 22 deletions(-) > > diff --git a/drivers/net/bnxt/bnxt_rxtx_vec_neon.c b/drivers/net/bnxt/bnxt_rxtx_vec_neon.c > index 32f8e59b3a..6a4ece681b 100644 > --- a/drivers/net/bnxt/bnxt_rxtx_vec_neon.c > +++ b/drivers/net/bnxt/bnxt_rxtx_vec_neon.c > @@ -235,34 +235,32 @@ recv_burst_vec_neon(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts) > * IO barriers are used to ensure consistent state. > */ > rxcmp1[3] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 7]); > - rte_io_rmb(); > + rxcmp1[2] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 5]); > + rxcmp1[1] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 3]); > + rxcmp1[0] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 1]); > + > + /* Use acquire fence to order loads of descriptor words. */ > + rte_atomic_thread_fence(__ATOMIC_ACQUIRE); > /* Reload lower 64b of descriptors to make it ordered after info3_v. */ > rxcmp1[3] = vreinterpretq_u32_u64(vld1q_lane_u64 > ((void *)&cpr->cp_desc_ring[cons + 7], > vreinterpretq_u64_u32(rxcmp1[3]), 0)); > - rxcmp[3] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 6]); > - > - rxcmp1[2] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 5]); > - rte_io_rmb(); > rxcmp1[2] = vreinterpretq_u32_u64(vld1q_lane_u64 > ((void *)&cpr->cp_desc_ring[cons + 5], > vreinterpretq_u64_u32(rxcmp1[2]), 0)); > - rxcmp[2] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 4]); > - > - t1 = vreinterpretq_u64_u32(vzip2q_u32(rxcmp1[2], rxcmp1[3])); > - > - rxcmp1[1] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 3]); > - rte_io_rmb(); > rxcmp1[1] = vreinterpretq_u32_u64(vld1q_lane_u64 > ((void *)&cpr->cp_desc_ring[cons + 3], > vreinterpretq_u64_u32(rxcmp1[1]), 0)); > - rxcmp[1] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 2]); > - > - rxcmp1[0] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 1]); > - rte_io_rmb(); > rxcmp1[0] = vreinterpretq_u32_u64(vld1q_lane_u64 > ((void *)&cpr->cp_desc_ring[cons + 1], > vreinterpretq_u64_u32(rxcmp1[0]), 0)); > + > + rxcmp[3] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 6]); > + rxcmp[2] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 4]); > + > + t1 = vreinterpretq_u64_u32(vzip2q_u32(rxcmp1[2], rxcmp1[3])); > + > + rxcmp[1] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 2]); > rxcmp[0] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 0]); > > t0 = vreinterpretq_u64_u32(vzip2q_u32(rxcmp1[0], rxcmp1[1])); > @@ -278,16 +276,21 @@ recv_burst_vec_neon(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts) > * bits and count the number of set bits in order to determine > * the number of valid descriptors. > */ > - valid = vget_lane_u64(vreinterpret_u64_u16(vqmovn_u32(info3_v)), > - 0); > + valid = vget_lane_u64(vreinterpret_u64_s16(vshr_n_s16 > + (vreinterpret_s16_u16(vshl_n_u16 > + (vqmovn_u32(info3_v), 15)), 15)), 0); > + > /* > * At this point, 'valid' is a 64-bit value containing four > - * 16-bit fields, each of which is either 0x0001 or 0x0000. > - * Compute number of valid descriptors from the index of > - * the highest non-zero field. > + * 16-bit fields, each of which is either 0xffff or 0x0000. > + * Count the number of consecutive 1s from LSB in order to > + * determine the number of valid descriptors. > */ > - num_valid = (sizeof(uint64_t) / sizeof(uint16_t)) - > - (__builtin_clzl(valid & desc_valid_mask) / 16); > + valid = ~(valid & desc_valid_mask); > + if (valid == 0) > + num_valid = 4; > + else > + num_valid = __builtin_ctzl(valid) / 16; > > if (num_valid == 0) > break; > -- > 2.25.1 > --000000000000f406d805e25fdf5f Content-Type: application/pkcs7-signature; name="smime.p7s" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="smime.p7s" Content-Description: S/MIME Cryptographic Signature MIIQdgYJKoZIhvcNAQcCoIIQZzCCEGMCAQExDzANBglghkgBZQMEAgEFADALBgkqhkiG9w0BBwGg gg3NMIIFDTCCA/WgAwIBAgIQeEqpED+lv77edQixNJMdADANBgkqhkiG9w0BAQsFADBMMSAwHgYD VQQLExdHbG9iYWxTaWduIFJvb3QgQ0EgLSBSMzETMBEGA1UEChMKR2xvYmFsU2lnbjETMBEGA1UE AxMKR2xvYmFsU2lnbjAeFw0yMDA5MTYwMDAwMDBaFw0yODA5MTYwMDAwMDBaMFsxCzAJBgNVBAYT AkJFMRkwFwYDVQQKExBHbG9iYWxTaWduIG52LXNhMTEwLwYDVQQDEyhHbG9iYWxTaWduIEdDQyBS MyBQZXJzb25hbFNpZ24gMiBDQSAyMDIwMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA vbCmXCcsbZ/a0fRIQMBxp4gJnnyeneFYpEtNydrZZ+GeKSMdHiDgXD1UnRSIudKo+moQ6YlCOu4t rVWO/EiXfYnK7zeop26ry1RpKtogB7/O115zultAz64ydQYLe+a1e/czkALg3sgTcOOcFZTXk38e aqsXsipoX1vsNurqPtnC27TWsA7pk4uKXscFjkeUE8JZu9BDKaswZygxBOPBQBwrA5+20Wxlk6k1 e6EKaaNaNZUy30q3ArEf30ZDpXyfCtiXnupjSK8WU2cK4qsEtj09JS4+mhi0CTCrCnXAzum3tgcH cHRg0prcSzzEUDQWoFxyuqwiwhHu3sPQNmFOMwIDAQABo4IB2jCCAdYwDgYDVR0PAQH/BAQDAgGG MGAGA1UdJQRZMFcGCCsGAQUFBwMCBggrBgEFBQcDBAYKKwYBBAGCNxQCAgYKKwYBBAGCNwoDBAYJ KwYBBAGCNxUGBgorBgEEAYI3CgMMBggrBgEFBQcDBwYIKwYBBQUHAxEwEgYDVR0TAQH/BAgwBgEB /wIBADAdBgNVHQ4EFgQUljPR5lgXWzR1ioFWZNW+SN6hj88wHwYDVR0jBBgwFoAUj/BLf6guRSSu TVD6Y5qL3uLdG7wwegYIKwYBBQUHAQEEbjBsMC0GCCsGAQUFBzABhiFodHRwOi8vb2NzcC5nbG9i YWxzaWduLmNvbS9yb290cjMwOwYIKwYBBQUHMAKGL2h0dHA6Ly9zZWN1cmUuZ2xvYmFsc2lnbi5j b20vY2FjZXJ0L3Jvb3QtcjMuY3J0MDYGA1UdHwQvMC0wK6ApoCeGJWh0dHA6Ly9jcmwuZ2xvYmFs c2lnbi5jb20vcm9vdC1yMy5jcmwwWgYDVR0gBFMwUTALBgkrBgEEAaAyASgwQgYKKwYBBAGgMgEo CjA0MDIGCCsGAQUFBwIBFiZodHRwczovL3d3dy5nbG9iYWxzaWduLmNvbS9yZXBvc2l0b3J5LzAN BgkqhkiG9w0BAQsFAAOCAQEAdAXk/XCnDeAOd9nNEUvWPxblOQ/5o/q6OIeTYvoEvUUi2qHUOtbf jBGdTptFsXXe4RgjVF9b6DuizgYfy+cILmvi5hfk3Iq8MAZsgtW+A/otQsJvK2wRatLE61RbzkX8 9/OXEZ1zT7t/q2RiJqzpvV8NChxIj+P7WTtepPm9AIj0Keue+gS2qvzAZAY34ZZeRHgA7g5O4TPJ /oTd+4rgiU++wLDlcZYd/slFkaT3xg4qWDepEMjT4T1qFOQIL+ijUArYS4owpPg9NISTKa1qqKWJ jFoyms0d0GwOniIIbBvhI2MJ7BSY9MYtWVT5jJO3tsVHwj4cp92CSFuGwunFMzCCA18wggJHoAMC AQICCwQAAAAAASFYUwiiMA0GCSqGSIb3DQEBCwUAMEwxIDAeBgNVBAsTF0dsb2JhbFNpZ24gUm9v dCBDQSAtIFIzMRMwEQYDVQQKEwpHbG9iYWxTaWduMRMwEQYDVQQDEwpHbG9iYWxTaWduMB4XDTA5 MDMxODEwMDAwMFoXDTI5MDMxODEwMDAwMFowTDEgMB4GA1UECxMXR2xvYmFsU2lnbiBSb290IENB IC0gUjMxEzARBgNVBAoTCkdsb2JhbFNpZ24xEzARBgNVBAMTCkdsb2JhbFNpZ24wggEiMA0GCSqG SIb3DQEBAQUAA4IBDwAwggEKAoIBAQDMJXaQeQZ4Ihb1wIO2hMoonv0FdhHFrYhy/EYCQ8eyip0E XyTLLkvhYIJG4VKrDIFHcGzdZNHr9SyjD4I9DCuul9e2FIYQebs7E4B3jAjhSdJqYi8fXvqWaN+J J5U4nwbXPsnLJlkNc96wyOkmDoMVxu9bi9IEYMpJpij2aTv2y8gokeWdimFXN6x0FNx04Druci8u nPvQu7/1PQDhBjPogiuuU6Y6FnOM3UEOIDrAtKeh6bJPkC4yYOlXy7kEkmho5TgmYHWyn3f/kRTv riBJ/K1AFUjRAjFhGV64l++td7dkmnq/X8ET75ti+w1s4FRpFqkD2m7pg5NxdsZphYIXAgMBAAGj QjBAMA4GA1UdDwEB/wQEAwIBBjAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBSP8Et/qC5FJK5N UPpjmove4t0bvDANBgkqhkiG9w0BAQsFAAOCAQEAS0DbwFCq/sgM7/eWVEVJu5YACUGssxOGhigH M8pr5nS5ugAtrqQK0/Xx8Q+Kv3NnSoPHRHt44K9ubG8DKY4zOUXDjuS5V2yq/BKW7FPGLeQkbLmU Y/vcU2hnVj6DuM81IcPJaP7O2sJTqsyQiunwXUaMld16WCgaLx3ezQA3QY/tRG3XUyiXfvNnBB4V 14qWtNPeTCekTBtzc3b0F5nCH3oO4y0IrQocLP88q1UOD5F+NuvDV0m+4S4tfGCLw0FREyOdzvcy a5QBqJnnLDMfOjsl0oZAzjsshnjJYS8Uuu7bVW/fhO4FCU29KNhyztNiUGUe65KXgzHZs7XKR1g/ XzCCBVUwggQ9oAMCAQICDBCmE9BT7srhoNHDEDANBgkqhkiG9w0BAQsFADBbMQswCQYDVQQGEwJC RTEZMBcGA1UEChMQR2xvYmFsU2lnbiBudi1zYTExMC8GA1UEAxMoR2xvYmFsU2lnbiBHQ0MgUjMg UGVyc29uYWxTaWduIDIgQ0EgMjAyMDAeFw0yMTAyMjIxNDE4MjdaFw0yMjA5MjIxNDUxNDlaMIGW MQswCQYDVQQGEwJJTjESMBAGA1UECBMJS2FybmF0YWthMRIwEAYDVQQHEwlCYW5nYWxvcmUxFjAU BgNVBAoTDUJyb2FkY29tIEluYy4xHDAaBgNVBAMTE0FqaXQgS3VtYXIgS2hhcGFyZGUxKTAnBgkq hkiG9w0BCQEWGmFqaXQua2hhcGFyZGVAYnJvYWRjb20uY29tMIIBIjANBgkqhkiG9w0BAQEFAAOC AQ8AMIIBCgKCAQEAwXsxfYF9jpj9zve1vXxD491SrWDVlcmLMdnOS1c7POMC8lbbgvp1o2kIu/3n xCVFTai5H6rHZgrFItNNVZ+XaJW9Ob9eiSuXdnAu5gVdTb+IFAf4S/PT2LXzpP07M7vyvm/yvA+8 HtVfapzqqTNYdNVUpq28MYsKEWbnyK94x5+C3oCAV4bpNnMoPNtKrMhvOdpTREQRyew8hyy3/Mz7 RIaCW0xx+14NTQe17dkH6CEEpmCjejneq/FU0gmbuorwHoP9mOiqeh23/ZKVpmFO/eiDtvMNAMDW 6LzhOk/pMklUPTHu/gQNW3OQebyhyFUHiBSp8rDkfWZT57Asd0PtdQIDAQABo4IB2zCCAdcwDgYD VR0PAQH/BAQDAgWgMIGjBggrBgEFBQcBAQSBljCBkzBOBggrBgEFBQcwAoZCaHR0cDovL3NlY3Vy ZS5nbG9iYWxzaWduLmNvbS9jYWNlcnQvZ3NnY2NyM3BlcnNvbmFsc2lnbjJjYTIwMjAuY3J0MEEG CCsGAQUFBzABhjVodHRwOi8vb2NzcC5nbG9iYWxzaWduLmNvbS9nc2djY3IzcGVyc29uYWxzaWdu MmNhMjAyMDBNBgNVHSAERjBEMEIGCisGAQQBoDIBKAowNDAyBggrBgEFBQcCARYmaHR0cHM6Ly93 d3cuZ2xvYmFsc2lnbi5jb20vcmVwb3NpdG9yeS8wCQYDVR0TBAIwADBJBgNVHR8EQjBAMD6gPKA6 hjhodHRwOi8vY3JsLmdsb2JhbHNpZ24uY29tL2dzZ2NjcjNwZXJzb25hbHNpZ24yY2EyMDIwLmNy bDAlBgNVHREEHjAcgRphaml0LmtoYXBhcmRlQGJyb2FkY29tLmNvbTATBgNVHSUEDDAKBggrBgEF BQcDBDAfBgNVHSMEGDAWgBSWM9HmWBdbNHWKgVZk1b5I3qGPzzAdBgNVHQ4EFgQUPHif0ihgndR0 h7r3sANaOIu2yM8wDQYJKoZIhvcNAQELBQADggEBAAEuLXDnP0Xd2zAMpQobXLUyqbpqGMO6ycQc Xq4H2YYlSNKVwPA+ZAVdUOzbSimBKlx8mzAEHkI3Ll1yXlYeT4UwkfWV9fioyGuQelLN1sGzi5bm WEpaSIbR1eiJMtzxUPwpRTn19gHZVueIot2Gw0fEYgHiMJpUr6xBWv2QNXULu/E8qvbXIRh2iycq 5rWFggX/JHglO8nVqzb1ImzqzVMFnDN15h3j8ryy2MIvZ8VDQRP7l81IXaTvVwaKpWMgV6rfQOi6 aOQZuOKkad7qoCkS5N2oSsvxi+rZtDaJJNsDjs05y5JZZQtBlfAmdYS+mmvkPjZ1iaLTzk59o/Yo fNkxggJtMIICaQIBATBrMFsxCzAJBgNVBAYTAkJFMRkwFwYDVQQKExBHbG9iYWxTaWduIG52LXNh MTEwLwYDVQQDEyhHbG9iYWxTaWduIEdDQyBSMyBQZXJzb25hbFNpZ24gMiBDQSAyMDIwAgwQphPQ U+7K4aDRwxAwDQYJYIZIAWUDBAIBBQCggdQwLwYJKoZIhvcNAQkEMSIEIIA3RGawHEzOOWV608BE EV8U1GjCqbxdfIDlReKc7eyQMBgGCSqGSIb3DQEJAzELBgkqhkiG9w0BBwEwHAYJKoZIhvcNAQkF MQ8XDTIyMDYyNjIwNDQ1MFowaQYJKoZIhvcNAQkPMVwwWjALBglghkgBZQMEASowCwYJYIZIAWUD BAEWMAsGCWCGSAFlAwQBAjAKBggqhkiG9w0DBzALBgkqhkiG9w0BAQowCwYJKoZIhvcNAQEHMAsG CWCGSAFlAwQCATANBgkqhkiG9w0BAQEFAASCAQA6mJh6f2V+BJOx/q8BNgTF6DJRAc13ey7j8eEk Extj8EWUdyC9z6cWP+O18/pyOdLxjcXxp14RrQgvA19d/fBLJrm4RI/QOfkqCTsXsfE6C9hzPXLu FfOBSKXmsTMNKamZtAHMQabpE71LLd3fBe2sK3KapprOn6diqZmCOgirI8xhCRJfVnFgAtis/5YB WH+ScPBIhxEpn8cfeExoHtGRm1IgY15RUOi5RF9Ljnh93P3vgOUB5j2g5fVPHoEArPr3MfHhHJWB HfAUbvnCFqxcoHuveUEt20BGtxku65UizXEQgQl2SYPM/fp1RUI6g3lNKwnW+WGGA2tDI49iUhoi --000000000000f406d805e25fdf5f--