From: Suanming Mou
To: Andrzej Ostruszka, Slava Ovsiienko, cristian.dumitrescu@intel.com, dev@dpdk.org
Date: Tue, 7 Apr 2020 23:00:10 +0800
Subject: Re: [dpdk-dev] [PATCH 1/2] bitmap: add create bitmap with all bits set
References: <1583828479-204084-1-git-send-email-suanmingm@mellanox.com> <1583828479-204084-2-git-send-email-suanmingm@mellanox.com> <43d443c3-59e2-1b18-cc12-0827b1773c88@mellanox.com>
In-Reply-To: <43d443c3-59e2-1b18-cc12-0827b1773c88@mellanox.com>

Hi guys,

Since we are all curious about which implementation performs best, I ran some tests on my server.

There are three implementations:

1. Clear all the array1 and array2 bits first, then set the bits we need. (The current implementation in the patch.)

2. Set all the bits in array1 and array2 first, then clear the bits that are not needed.

3. Set the needed bits in array1 and array2, then clear the remaining bits that are not needed. (Because we allocate more memory than required for alignment, the unneeded bits have to be cleared anyway.)

Let's call the three implementations Cs, Sc and sc:
Capital 'C' means clear all bits.
Lowercase 'c' means clear the bits that are not needed.
Capital 'S' means set all bits.
Lowercase 's' means set the needed bits.

I added some test code to the bitmap_test code. Here are the cycle counts for different numbers of bits with the different implementations.

RTE>>bitmap_test
Set bits:63
Cs   Sc   sc
1018 1089 1078

Set bits:126
Cs   Sc   sc
972  1082 1048

Set bits:252
Cs   Sc   sc
918  1039 1029

Set bits:504
Cs   Sc   sc
861  986  957

Set bits:1008
Cs   Sc   sc
802  882  851

Set bits:2016
Cs   Sc   sc
618  646  625

Set bits:4032
Cs   Sc   sc
272  215  209

Set bits:8064
Cs   Sc   sc
537  392  391

Set bits:16128
Cs   Sc   sc
1083 786  798

As we can see, the Cs case is faster below 4K bits, but from about 4K bits onward it falls behind. Since the cycle counts below 4K bits do not differ significantly, it should be OK to use the Sc or sc case. It is probably better to choose the sc code.
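The three strategies described above (Cs, Sc, sc) can be illustrated on a plain flat array of 64-bit words, leaving out the two-level array1/array2 layout of rte_bitmap. This is only a minimal sketch: the names init_Cs, init_Sc, init_sc, set_range and strategies_agree are hypothetical and not part of the patch, and the Cs variant is an assumed shape, since the patch's Cs code is not shown in this mail.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WORD_BITS 64u
#define N_WORDS   256u  /* capacity of the demo bitmap: 16384 bits */

/* Set bits [from, to) one at a time. */
static void set_range(uint64_t *w, uint32_t from, uint32_t to)
{
	for (uint32_t i = from; i < to; i++)
		w[i / WORD_BITS] |= UINT64_C(1) << (i % WORD_BITS);
}

/* Cs (assumed shape): clear every word, then set the needed bits. */
static void init_Cs(uint64_t *w, uint32_t n_words, uint32_t n_bits)
{
	uint32_t full = n_bits / WORD_BITS;

	memset(w, 0, n_words * sizeof(w[0]));
	memset(w, 0xff, full * sizeof(w[0]));
	set_range(w, full * WORD_BITS, n_bits);
}

/* Sc: set every word, then clear the words that are not fully needed. */
static void init_Sc(uint64_t *w, uint32_t n_words, uint32_t n_bits)
{
	uint32_t full = n_bits / WORD_BITS;

	memset(w, 0xff, n_words * sizeof(w[0]));
	memset(&w[full], 0, (n_words - full) * sizeof(w[0]));
	set_range(w, full * WORD_BITS, n_bits);
}

/* sc: set only the fully needed words, clear the rest; each word is written once. */
static void init_sc(uint64_t *w, uint32_t n_words, uint32_t n_bits)
{
	uint32_t full = n_bits / WORD_BITS;

	memset(w, 0xff, full * sizeof(w[0]));
	memset(&w[full], 0, (n_words - full) * sizeof(w[0]));
	set_range(w, full * WORD_BITS, n_bits);
}

/* Return 1 if all three strategies produce identical words for n_bits. */
int strategies_agree(uint32_t n_bits)
{
	uint64_t a[N_WORDS], b[N_WORDS], c[N_WORDS];

	assert(n_bits <= N_WORDS * WORD_BITS);
	init_Cs(a, N_WORDS, n_bits);
	init_Sc(b, N_WORDS, n_bits);
	init_sc(c, N_WORDS, n_bits);
	return memcmp(a, b, sizeof(a)) == 0 && memcmp(a, c, sizeof(a)) == 0;
}
```

Calling strategies_agree() with the bit counts from the test (63, 126, ..., 16128) checks that the variants only differ in how many times each word is written, not in the resulting bitmap. In this sketch Cs and Sc write the fully set (or fully cleared) words twice, while sc writes each word once.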
===================================
Testing code as below:

#include <inttypes.h>

static void
test_tsc(uint32_t n_bits)
{
        void *mem;
        uint32_t i;
        uint64_t start, cost, cost2, cost3;
        uint32_t bmp_size;
        struct rte_bitmap *bmp;

        bmp_size = rte_bitmap_get_memory_footprint(n_bits);

        mem = rte_zmalloc("test_bmap", bmp_size, RTE_CACHE_LINE_SIZE);
        if (mem == NULL) {
                printf("Failed to allocate memory for bitmap\n");
                return;
        }

        /* Make the memory hot. */
        bmp = rte_bitmap_init_with_all_set(n_bits, mem, bmp_size);
        if (bmp == NULL) {
                printf("Failed to init bitmap\n");
                /* Do not leak the buffer on the error path. */
                rte_free(mem);
                return;
        }

        /* Cs: clear all bits first, then set the needed ones. */
        start = rte_rdtsc();
        for (i = 0; i < 1000; i++)
                rte_bitmap_init_with_all_set(n_bits, mem, bmp_size);
        cost = (rte_rdtsc() - start) / 1000;

        /* Sc: set all bits first, then clear the unneeded ones. */
        start = rte_rdtsc();
        for (i = 0; i < 1000; i++)
                rte_bitmap_init_with_all_set2(n_bits, mem, bmp_size);
        cost2 = (rte_rdtsc() - start) / 1000;

        /* sc: set the needed bits, then clear the rest. */
        start = rte_rdtsc();
        for (i = 0; i < 1000; i++)
                rte_bitmap_init_with_all_set3(n_bits, mem, bmp_size);
        cost3 = (rte_rdtsc() - start) / 1000;

        printf("Set bits:%" PRIu32 "\nCs   Sc   sc\n", n_bits);
        printf("%-4" PRIu64 " %-4" PRIu64 " %-4" PRIu64 "\n\n",
               cost, cost2, cost3);

        rte_free(mem);
}

        /* In the caller: run the test for 63, 126, ..., 16128 bits. */
        uint32_t i;

        for (i = 63; i < (63 << 9); i <<= 1)
                test_tsc(i);

===================================
Sc code as below:

static inline struct rte_bitmap *
rte_bitmap_init_with_all_set2(uint32_t n_bits, uint8_t *mem, uint32_t mem_size)
{
        uint32_t i;
        uint32_t slabs;
        struct rte_bitmap *bmp;

        bmp = __rte_bitmap_init(n_bits, mem, mem_size);
        if (!bmp)
                return NULL;

        /* Set all bits in array1 and array2. */
        memset(bmp->array1, 0xff, bmp->array1_size * sizeof(uint64_t));
        memset(bmp->array2, 0xff, bmp->array2_size * sizeof(uint64_t));

        /* Number of array1 slabs that stay fully set. */
        slabs = n_bits >> (RTE_BITMAP_SLAB_BIT_SIZE_LOG2 +
                           RTE_BITMAP_CL_BIT_SIZE_LOG2);
        /* Clear the remaining array1 slabs. */
        memset(&bmp->array1[slabs], 0, (bmp->array1_size - slabs) *
               sizeof(bmp->array1[0]));
        /* Fill the partially set array1 slab. */
        i = slabs << (RTE_BITMAP_SLAB_BIT_SIZE_LOG2 +
                      RTE_BITMAP_CL_BIT_SIZE_LOG2);
        for (; i < n_bits; i += RTE_BITMAP_CL_BIT_SIZE)
                rte_bitmap_set(bmp, i);

        /* Clear the remaining array2 slabs. */
        slabs = n_bits >> RTE_BITMAP_SLAB_BIT_SIZE_LOG2;
        memset(&bmp->array2[slabs], 0, (bmp->array2_size - slabs) *
               sizeof(bmp->array2[0]));
        /* Fill the partially set array2 slab. */
        for (i = slabs * RTE_BITMAP_SLAB_BIT_SIZE; i < n_bits; i++)
                rte_bitmap_set(bmp, i);

        return bmp;
}

===================================
sc code as below:

static inline struct rte_bitmap *
rte_bitmap_init_with_all_set3(uint32_t n_bits, uint8_t *mem, uint32_t mem_size)
{
        uint32_t i;
        uint32_t slabs;
        struct rte_bitmap *bmp;

        bmp = __rte_bitmap_init(n_bits, mem, mem_size);
        if (!bmp)
                return NULL;

        /* Fill the fully set array1 slabs. */
        slabs = n_bits >> (RTE_BITMAP_SLAB_BIT_SIZE_LOG2 +
                           RTE_BITMAP_CL_BIT_SIZE_LOG2);
        memset(bmp->array1, 0xff, slabs * sizeof(bmp->array1[0]));
        /* Clear the remaining array1 slabs. */
        memset(&bmp->array1[slabs], 0, (bmp->array1_size - slabs) *
               sizeof(bmp->array1[0]));
        /* Fill the partially set array1 slab. */
        i = slabs << (RTE_BITMAP_SLAB_BIT_SIZE_LOG2 +
                      RTE_BITMAP_CL_BIT_SIZE_LOG2);
        for (; i < n_bits; i += RTE_BITMAP_CL_BIT_SIZE)
                rte_bitmap_set(bmp, i);

        /* Fill the fully set array2 slabs. */
        slabs = n_bits >> RTE_BITMAP_SLAB_BIT_SIZE_LOG2;
        memset(bmp->array2, 0xff, slabs * sizeof(bmp->array2[0]));
        /* Clear the remaining array2 slabs. */
        memset(&bmp->array2[slabs], 0, (bmp->array2_size - slabs) *
               sizeof(bmp->array2[0]));
        /* Fill the partially set array2 slab. */
        for (i = slabs * RTE_BITMAP_SLAB_BIT_SIZE; i < n_bits; i++)
                rte_bitmap_set(bmp, i);

        return bmp;
}

Any comments or suggestions?

Thanks,
SuanmingMou