From: Honnappa Nagarahalli
To: "Ananyev, Konstantin", "olivier.matz@6wind.com", "sthemmin@microsoft.com", "jerinj@marvell.com", "Richardson, Bruce", "david.marchand@redhat.com", "pbhagavatula@marvell.com"
CC: "dev@dpdk.org", Dharmik Thakkar, Ruifeng Wang, Gavin Hu, nd, Honnappa Nagarahalli
Date: Thu, 9 Jan 2020 16:06:31 +0000
Subject: Re: [dpdk-dev] [PATCH v7 02/17] lib/ring: apis to support configurable element size
References: <20190906190510.11146-1-honnappa.nagarahalli@arm.com> <20191220044524.32910-1-honnappa.nagarahalli@arm.com> <20191220044524.32910-3-honnappa.nagarahalli@arm.com>

> > > > > > > > > > > +
> > > > > > > > > > > +static __rte_always_inline void
> > > > > > > > > > > +enqueue_elems_128(struct rte_ring *r, uint32_t prod_head,
> > > > > > > > > > > +		const void *obj_table, uint32_t n)
> > > > > > > > > > > +{
> > > > > > > > > > > +	unsigned int i;
> > > > > > > > > > > +	const uint32_t size = r->size;
> > > > > > > > > > > +	uint32_t idx = prod_head & r->mask;
> > > > > > > > > > > +	__uint128_t *ring = (__uint128_t *)&r[1];
> > > > > > > > > > > +	const __uint128_t *obj = (const __uint128_t *)obj_table;
> > > > > > > > > > > +	if (likely(idx + n < size)) {
> > > > > > > > > > > +		for (i = 0; i < (n & ~0x1); i += 2, idx += 2) {
> > > > > > > > > > > +			ring[idx] = obj[i];
> > > > > > > > > > > +			ring[idx + 1] = obj[i + 1];
> > > > > > > > > >
> > > > > > > > > > AFAIK, that implies 16B aligned obj_table...
> > > > > > > > > > Would it always be the case?
> > > > > > > > > I am not sure from the compiler perspective.
> > > > > > > > > At least on Arm architecture, unaligned access (address that is
> > > > > > > > > accessed is not aligned to the size of the data element being
> > > > > > > > > accessed) will result in faults or require additional cycles.
> > > > > > > > > So, aligning on 16B should be fine.
> > > > > > > > Further, I would be changing this to use 'rte_int128_t' as
> > > > > > > > '__uint128_t' is not defined on 32b systems.
> > > > > > >
> > > > > > > What I am trying to say: with this code we imply new requirement
> > > > > > > for elems
> > > > > > The only existing use case in DPDK for 16B is the event ring.
> > > > > > The event ring already does similar kind of copy (using
> > > > > > 'struct rte_event').
> > > > > > So, there is no change in expectations for event ring.
> > > > > > For future code, I think this expectation should be fine since
> > > > > > it allows for optimal code.
> > > > > >
> > > > > > > in the ring: when sizeof(elem)==16 its alignment also has to
> > > > > > > be at least 16.
> > > > > > > Which from my perspective is not ideal.
> > > > > > Any reasoning?
> > > > >
> > > > > New implicit requirement and inconsistency.
> > > > > Code like that:
> > > > >
> > > > > struct ring_elem {uint64_t a, b;};
> > > > > ....
> > > > > struct ring_elem elem;
> > > > > rte_ring_dequeue_elem(ring, &elem, sizeof(elem));
> > > > >
> > > > > might cause a crash.
> > > > The alignment here is 8B. Assuming that the instructions generated
> > > > require 16B alignment, it will result in a crash, if configured to
> > > > generate an exception.
> > > > But, these instructions are not atomic instructions. At least on
> > > > aarch64, unaligned access will not result in an exception for
> > > > non-atomic loads/stores. I believe it is the same behavior for x86
> > > > as well.
> > >
> > > On IA, there are 2 types of 16B load/store instructions: aligned and
> > > unaligned.
> > > Aligned are a bit faster, but will cause an exception if used on a
> > > non 16B aligned address.
> > > As you are using uint128_t *, the compiler will assume that both src
> > > and dst are 16B aligned and might generate code with aligned
> > > instructions.
> > Ok, looking at a few articles, I read that if the address is aligned,
> > the unaligned instructions do not incur the penalty. Is this
> > understanding correct?
>
> Yes, from my experience the difference is negligible.
>
> >
> > I see 2 solutions here:
> > 1) We can switch this copy to use uint32_t pointer. It would still
> > allow the compiler to generate (unaligned) instructions for up to 256b
> > load/store. The 2 multiplications (to normalize the index and the size
> > of copy) can use shifts. This should make it safer.
> > If one wants performance, they can align the obj table to 16B (the
> > ring itself is already aligned on the cache line boundary).
>
> Sounds good to me.
>
> >
> > 2) Considering that performance is paramount, we could document that
> > the obj table needs to be aligned on 16B boundary. This would affect
> > event dev (if we go ahead with replacing the event ring
> > implementation) significantly.
>
> I don't think the perf difference would be significant enough to justify
> such a constraint.
> I am in favor of #1.
Ok, will go with this.
Is it ok if I squash the intermediate commits for test cases? I can keep
one commit for functional tests and another for performance tests.

>
> > Note that we have to do the same thing for 64b elements as well.
>
> I don't mind having one unified copy procedure, which would always use
> 32bit elems, but AFAIK, on IA there is no such limitation for 64bit
> load/stores.
>
>
> > > > >
> > > > > While exactly the same code with:
> > > > >
> > > > > struct ring_elem {uint64_t a, b, c;}; OR struct ring_elem
> > > > > {uint64_t a, b, c, d;};
> > > > >
> > > > > will work ok.
> > > > The alignment for these structures is still 8B. Are you saying
> > > > this will work because these will be copied using pointer to
> > > > uint32_t (whose alignment is 4B)?
> > >
> > > Yes, as we are doing uint32_t copies, the compiler can't assume the
> > > data will be 16B aligned and will use unaligned instructions.
> > >
> > > > > > >
> > > > > > > Note that for elem sizes > 16 (24, 32), there is no such
> > > > > > > constraint.
> > > > > > The rest of them need to be aligned on 4B boundary. However,
> > > > > > this should not affect the existing code.
> > > > > > The code for 8B and 16B is kept as is to ensure the
> > > > > > performance is not affected for the existing code.
> > > >
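
For reference, option 1 above (copying through a uint32_t pointer) could look roughly like the sketch below. This is not the code from the patch: 'struct sketch_ring' and 'sketch_enqueue_elems' are hypothetical names made up for illustration, the storage is a plain array rather than the memory following 'struct rte_ring', and the index and count are normalized with a multiplication where the thread suggests shifts. It assumes the element size is a multiple of 4 bytes and that the object table is at least 4B aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the ring; NOT the real struct rte_ring. */
struct sketch_ring {
	uint32_t size;   /* number of element slots, a power of two */
	uint32_t mask;   /* size - 1 */
	uint32_t esize;  /* element size in bytes, a multiple of 4 */
	uint32_t *slots; /* backing storage of size * esize bytes */
};

/*
 * Copy n elements into the ring starting at prod_head, one 32-bit word
 * at a time. Because every access goes through a uint32_t pointer, the
 * compiler may only assume 4B alignment for obj_table.
 */
static void
sketch_enqueue_elems(struct sketch_ring *r, uint32_t prod_head,
		const void *obj_table, uint32_t n)
{
	assert(r->esize % sizeof(uint32_t) == 0);

	/* Number of 32-bit words per element. */
	const uint32_t scale = (uint32_t)(r->esize / sizeof(uint32_t));
	/* Ring size, start index and copy length normalized to words. */
	const uint32_t nr_size = r->size * scale;
	const uint32_t nr_n = n * scale;
	uint32_t idx = (prod_head & r->mask) * scale;
	const uint32_t *obj = (const uint32_t *)obj_table;
	uint32_t i;

	if (idx + nr_n <= nr_size) {
		/* No wrap-around: one contiguous copy. */
		for (i = 0; i < nr_n; i++)
			r->slots[idx + i] = obj[i];
	} else {
		/* Copy up to the end of the storage, then wrap to slot 0. */
		for (i = 0; idx < nr_size; i++, idx++)
			r->slots[idx] = obj[i];
		for (idx = 0; i < nr_n; i++, idx++)
			r->slots[idx] = obj[i];
	}
}
```

With this shape a caller holding 'struct ring_elem {uint64_t a, b;}' (8B aligned) would not be subject to a 16B alignment assumption, which is the concern raised earlier in the thread. Whether the compiler still merges the word copies into wider unaligned loads/stores depends on the compiler and target, so any performance expectation here would need to be measured.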