From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <3504aaf1-b4e0-44f5-883a-5e4bc0197283@intel.com>
Date: Fri, 27 Oct 2023 14:46:28 +0100
Subject: Re: [PATCH v1 1/3] dmadev: add inter-domain operations
To: Jerin Jacob, fengchengwen
CC: Anatoly Burakov, Kevin Laatz, Bruce Richardson
References: <8866a5c7ea36e476b2a92e3e4cea6c2c127ab82f.1691768110.git.anatoly.burakov@intel.com> <8ce1cf14-17e4-4092-1102-305f8fe25a36@huawei.com>
From: "Medvedkin, Vladimir"
List-Id: DPDK patches and discussions

Hi Satananda, Anoob, Chengwen, Jerin, all,

After a number of internal discussions we have decided to postpone this feature/patchset until the next release.

>[Satananda] Have you considered extending rte_dma_port_param and rte_dma_vchan_conf to represent interdomain memory transfer setup as a separate port type like RTE_DMA_PORT_INTER_DOMAIN ?
>[Anoob] Can we move this 'src_handle' and 'dst_handle' registration to rte_dma_vchan_setup so that the 'src_handle' and 'dst_handle' can be configured in control path and the existing datapath APIs can work as is.
>[Jerin] Or move src_handle/dst_hanel to vchan config

We've listened to the feedback on the implementation and have prototyped a vchan-based interface. It has a number of advantages and disadvantages, both in terms of API usage and in terms of our specific driver.

Setting up inter-domain operations as separate vchans allows us to store data inside the PMD and not duplicate any API paths, so having multiple vchans addresses that problem. However, this also means that any new vchans added while the PMD is active (such as when attaching to a new process) will have to be gated by start/stop. This is probably fine from an API point of view, but a hassle for the user (previously, we could have just started using the new inter-domain handle right away).

Another usability issue with the multiple-vchan approach is that each vchan will now have its own enqueue/submit/completion cycle, so any use case relying on one thread communicating with many processes will have to process each vchan separately, instead of everything going into one vchan. Again, this looks fine API-wise, but it is a hassle for the user, since it requires calling submit and completion for each vchan, and in some cases maintaining some kind of reordering queue. (On the other hand, it would be much easier to separate operations intended for different processes with this approach, so perhaps this is not such a big issue.)

Finally, there is also an IDXD-specific issue. Currently, IDXD HW acceleration is implemented in such a way that each work queue has a unique DMA device ID (rather than a unique vchan), and each device can technically process requests for both local and remote memory (local to remote, remote to local, remote to remote), all in one queue, as it was in our original implementation. By changing the implementation to use vchans, we're essentially bifurcating this single queue: all vchans would have their own rings etc., but the enqueue-to-hardware operation is still common to all vchans, because there's a single underlying queue as far as hardware is concerned.
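
To make the shared-queue concern a bit more concrete, here is a toy model in plain C. It is not DPDK or IDXD driver code: `hw_queue`, `vchan`, `vchan_enqueue`, `vchan_submit`, `hw_complete` and both size constants are made up purely for illustration. Each vchan keeps its own software ring (reduced here to a counter), while submit pushes into one common queue of fixed depth, so an enqueue on one vchan can succeed and the later submit can still be refused because another vchan has already filled the queue.

```c
#include <assert.h>
#include <errno.h>

#define HW_QUEUE_DEPTH 4   /* assumed depth of the single hardware queue */
#define VCHAN_RING_SIZE 8  /* assumed size of each vchan's software ring */

/* Hypothetical model of the one queue the hardware actually sees. */
struct hw_queue {
	unsigned int in_flight;
};

/* Hypothetical per-vchan state: jobs enqueued but not yet submitted. */
struct vchan {
	struct hw_queue *hw;
	unsigned int pending;
};

/* Enqueue only touches the vchan's own ring, so it cannot see the shared limit. */
static int
vchan_enqueue(struct vchan *vc)
{
	if (vc->pending == VCHAN_RING_SIZE)
		return -ENOSPC;
	vc->pending++;
	return 0;
}

/* Submit is the first point where the shared queue is touched. */
static int
vchan_submit(struct vchan *vc)
{
	if (vc->hw->in_flight + vc->pending > HW_QUEUE_DEPTH)
		return -EAGAIN; /* stands in for a "retry" status from the hardware */
	vc->hw->in_flight += vc->pending;
	vc->pending = 0;
	return 0;
}

/* Completion of n jobs frees n slots in the shared queue. */
static void
hw_complete(struct hw_queue *hw, unsigned int n)
{
	assert(n <= hw->in_flight);
	hw->in_flight -= n;
}
```

With two vchans on one `hw_queue`, three enqueues on each succeed, the first submit is accepted, and the second is refused until `hw_complete()` has freed enough slots. The vchan that lost the race only learns about it at submit time.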

The queue is atomic in hardware, and the ENQCMD instruction returns a status in case of enqueue failure (such as when too many requests are in flight), so technically we could ignore the number of in-flight operations and rely on ENQCMD returning failures to handle error/retry. The problem is that this failure only happens on submit, not on enqueue.

So, in essence, with the IDXD driver we have two choices: either we implement some kind of in-flight counter to prevent our driver from submitting too many requests (that is, vchans will have to cooperate, using atomics or similar), or every user will have to handle errors not just on enqueue, but also on submit (which I don't believe many people do currently, even though technically submit can return failure - all non-test usage in DPDK seems to assume submit realistically won't fail, and I'd like to keep it that way).

We're in the process of measuring the performance impact of different implementations. I should note, however, that while atomic operations on the data path are unfortunate, realistically these atomics are accessed only at the beginning/end of every 'enqueue-submit-complete' cycle, and not on every operation. At first glance, there is no observable performance penalty in the regular use case (assuming we are not calling submit for every enqueued job).

>[Satananda]Do you have usecases where a process from 3rd domain sets up transfer between memories from 2 domains? i.e process 1 is src, process 2 is dest and process 3 executes transfer.

This use case works with the proposed API on our hardware.

>[Chengwen]And last, Could you introduce the application scenarios of this feature?

We have used this feature to improve performance for the memif driver.
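
Coming back to the in-flight counter option for the IDXD driver: below is a rough sketch of how cooperating vchans might share such a counter using C11 atomics. This is not the prototype code, and every name in it (`shared_credits`, `vchan_credits`, `credits_reserve`, `credits_enqueue`, `credits_release`) is hypothetical. The assumption is that each vchan reserves a batch of slots from the shared pool at the start of a cycle and hands them back at the end, so the per-operation enqueue path touches no atomics and can report "no space" itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical state shared by every vchan that sits on one hardware queue. */
struct shared_credits {
	atomic_uint free_slots; /* queue slots not reserved by any vchan */
};

/* Hypothetical per-vchan state; assumed to be used by a single thread. */
struct vchan_credits {
	struct shared_credits *shared;
	unsigned int local; /* reserved, not yet consumed by an enqueue */
	unsigned int used;  /* consumed; handed back once the job completes */
};

/* Start of a cycle: reserve up to `want` slots, return how many were granted. */
static unsigned int
credits_reserve(struct vchan_credits *vc, unsigned int want)
{
	unsigned int avail = atomic_load_explicit(&vc->shared->free_slots,
			memory_order_relaxed);
	unsigned int take;

	do {
		take = want < avail ? want : avail;
		if (take == 0)
			return 0;
		/* On failure `avail` is refreshed and `take` is recomputed. */
	} while (!atomic_compare_exchange_weak_explicit(&vc->shared->free_slots,
			&avail, avail - take,
			memory_order_acquire, memory_order_relaxed));
	vc->local += take;
	return take;
}

/* Per-operation path: no atomics, and failure is visible at enqueue time. */
static int
credits_enqueue(struct vchan_credits *vc)
{
	if (vc->local == 0)
		return -ENOSPC;
	vc->local--;
	vc->used++;
	return 0;
}

/* End of a cycle: return slots of `done` completed jobs plus unused reservations. */
static void
credits_release(struct vchan_credits *vc, unsigned int done)
{
	unsigned int give;

	assert(done <= vc->used);
	give = done + vc->local;
	vc->used -= done;
	vc->local = 0;
	atomic_fetch_add_explicit(&vc->shared->free_slots, give,
			memory_order_release);
}
```

If the pool is initialised to the hardware queue depth, the sum of all reservations can never exceed what the queue can hold, so submit should not fail for lack of space, and the atomics are hit only twice per cycle rather than on every operation. Whether batch reservation like this is the right trade-off is exactly what the performance measurements are meant to show.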

On 09/10/2023 06:05, Jerin Jacob wrote:
> On Sun, Oct 8, 2023 at 8:03 AM fengchengwen wrote:
>> Hi Anatoly,
>>
>> On 2023/8/12 0:14, Anatoly Burakov wrote:
>>> Add a flag to indicate that a specific device supports inter-domain
>>> operations, and add an API for inter-domain copy and fill.
>>>
>>> Inter-domain operation is an operation that is very similar to regular
>>> DMA operation, except either source or destination addresses can be in a
>>> different process's address space, indicated by source and destination
>>> handle values. These values are currently meant to be provided by
>>> private drivers' API's.
>>>
>>> This commit also adds a controller ID field into the DMA device API.
>>> This is an arbitrary value that may not be implemented by hardware, but
>>> it is meant to represent some kind of device hierarchy.
>>>
>>> Signed-off-by: Vladimir Medvedkin
>>> Signed-off-by: Anatoly Burakov
>>> ---
>> ...
>>
>>> +__rte_experimental
>>> +static inline int
>>> +rte_dma_copy_inter_dom(int16_t dev_id, uint16_t vchan, rte_iova_t src,
>>> +		rte_iova_t dst, uint32_t length, uint16_t src_handle,
>>> +		uint16_t dst_handle, uint64_t flags)
>> I would suggest add more general extension:
>> rte_dma_copy*(int16_t dev_id, uint16_t vchan, rte_iova_t src, rte_iova_t dst,
>> uint32_t length, uint64_t flags, void *param)
>> The param only valid under some flags bits.
>> As for this inter-domain extension: we could define inter-domain param struct.
>>
>>
>> Whether add in current rte_dma_copy() API or add one new API, I think it mainly
>> depend on performance impact of parameter transfer. Suggest more discuss for
>> differnt platform and call specification.
> Or move src_handle/dst_hanel to vchan config to enable better performance.
> Application create N number of vchan based on the requirements.
>
>>
>> And last, Could you introduce the application scenarios of this feature?
> Looks like VM to VM or container to container copy.
>
>>
>> Thanks.
>>

-- 
Regards,
Vladimir