Date: Mon, 28 Jun 2021 11:00:54 +0100
From: Bruce Richardson
To: fengchengwen
Cc: Jerin Jacob, Jerin Jacob, Morten Brørup, Nipun Gupta, Thomas Monjalon,
    Ferruh Yigit, dpdk-dev, Hemant Agrawal, Maxime Coquelin,
    Honnappa Nagarahalli, David Marchand, Satananda Burla, Prasun Kapoor
Subject: Re: [dpdk-dev] dmadev discussion summary

On Sat, Jun 26, 2021 at 11:59:49AM +0800, fengchengwen wrote:
> Hi, all
>   I analyzed the current DPDK DMA drivers and drew up this summary in
> conjunction with the previous discussion, and it will serve as a basis for
> the V2 implementation. Feedback is welcome, thanks.
>

Fantastic review and summary, many thanks for the work. Some comments inline
in the API part below, but nothing too major, I hope.

/Bruce

>
> Summary:
>   1) The dpaa2/octeontx2/Kunpeng devices are all ARM SoCs which may act as
>      endpoints of an x86 host (e.g. a smart NIC), so multiple memory
>      transfer requirements may exist, e.g. local-to-local/local-to-host...
>      From the point of view of API design, I think we should adopt a
>      similar 'channel' or 'virt-queue' concept.
>   2) Whether to create a separate dmadev for each HW-queue? We previously
>      discussed this, and because HW-queues can be managed independently
>      (as in Kunpeng_dma and Intel DSA), we preferred creating a separate
>      dmadev for each HW-queue. But I'm not sure if that's the case with
>      dpaa. I think that can be left to the specific driver; no restriction
>      is imposed at the framework API layer.
>   3) I think we could set up the following abstraction at the dmadev
>      device level:
>
>        ------------  ------------
>        |virt-queue|  |virt-queue|
>        ------------  ------------
>              \            /
>               \          /
>                \        /
>        ------------  ------------
>        | HW-queue |  | HW-queue |
>        ------------  ------------
>              \            /
>               \          /
>                \        /
>                  dmadev
>
>   4) The driver's ops design (here we only list the key points):
>      [dev_info_get]: mainly returns the number of HW-queues
>      [dev_configure]: nothing important
>      [queue_setup]: creates one virt-queue, with the following main
>                     parameters:
>          HW-queue-index: the HW-queue index used
>          nb_desc: the number of HW descriptors
>          opaque: driver-specific info
>          Note1: this API returns the virt-queue index, which is used in
>                 the later APIs. If users want to create multiple
>                 virt-queues on the same HW-queue, they can do so by
>                 calling queue_setup with the same HW-queue-index.
>          Note2: I think it's hard to define the queue_setup config
>                 parameters, and this is also a control API, so I think
>                 it's OK to use an opaque pointer to implement it.

I'm not sure an opaque pointer will work in practice, so I think we should
try and standardize the parameters as much as possible. Since it's a control
plane API, using a struct with a superset of parameters may be workable.
Let's start with a minimum set and build up from there.
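For illustration only, a minimal sketch of what such a superset config
struct might look like; every name here is hypothetical, not part of the
proposal:

#include <stdint.h>

/* Hypothetical sketch of a standardized virt-queue config struct,
 * following the "superset of parameters" idea above. */
struct rte_dmadev_queue_conf {
	uint16_t hw_queue_index; /* HW-queue to create the virt-queue on */
	uint16_t nb_desc;        /* number of HW descriptors */
	uint64_t flags;          /* behaviour flags, e.g. requesting SG support */
	void *drv_private;       /* escape hatch for driver-specific extras */
};

New fields could be appended over time, with zero meaning "driver default",
keeping the minimum-set-first approach while leaving expansion room.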
>      [dma_copy/memset/sg]: all have a vq_id input parameter.
>          Note: I notice dpaa can't support single and sg in one virt-queue,
>                and I think that's maybe a software implementation policy
>                rather than a HW restriction, because virt-queues can share
>                the same HW-queue.

Presumably for queues which support sg, the single-enqueue APIs can use a
single sg list internally?

>          Here we use vq_id to handle different scenarios, like
>          local-to-local/local-to-host etc.
>   5) And the dmadev public data-plane API (just a prototype):
>      dma_cookie_t rte_dmadev_memset(dev, vq_id, pattern, dst, len, flags)
>        -- flags: used as an extended parameter; it could be uint32_t

Suggest uint64_t rather than uint32_t to ensure we have expansion room?
Otherwise +1

>      dma_cookie_t rte_dmadev_memcpy(dev, vq_id, src, dst, len, flags)

+1

>      dma_cookie_t rte_dmadev_memcpy_sg(dev, vq_id, sg, sg_len, flags)
>        -- sg: struct dma_scatterlist array

I don't think our drivers will be directly implementing this API, but so
long as SG support is listed as a capability flag I'm fine with this as an
API. [We can't fudge it as a bunch of single copies, because that would
cause us to have multiple cookies rather than one.]

>      uint16_t rte_dmadev_completed(dev, vq_id, dma_cookie_t *cookie,
>                                    uint16_t nb_cpls, bool *has_error)
>        -- nb_cpls: indicates the max number of operations to process
>        -- has_error: indicates if there is an error
>        -- return value: the number of successfully completed operations
>        -- examples:
>           1) If there are already 32 completed ops, the 4th is an error,
>              and nb_cpls is 32, then the return will be 3 (because the
>              1st/2nd/3rd are OK), and has_error will be true.
>           2) If there are already 32 completed ops, all successfully
>              completed, then the return will be min(32, nb_cpls), and
>              has_error will be false.
>           3) If there are already 32 completed ops, all failed, then the
>              return will be 0, and has_error will be true.

+1 for this
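To show how an app might consume these semantics, a rough sketch of a
polling loop; the dev type, the dma_cookie_t typedef and the handle_*
callbacks are invented here, since only the prototypes above exist:

#include <stdbool.h>
#include <stdint.h>

typedef int32_t dma_cookie_t;   /* assumed: signed type per point 6 below */

/* assumed declaration matching the proposed prototype above */
extern uint16_t rte_dmadev_completed(int dev, uint16_t vq_id,
		dma_cookie_t *cookie, uint16_t nb_cpls, bool *has_error);

/* hypothetical application callbacks */
extern void handle_success(void);
extern void handle_failure(void);

#define BURST 32

static void drain_completions(int dev, uint16_t vq_id)
{
	dma_cookie_t cookie;
	bool has_error = false;
	uint16_t i, n;

	n = rte_dmadev_completed(dev, vq_id, &cookie, BURST, &has_error);

	/* n ops completed successfully, ending at 'cookie' */
	for (i = 0; i < n; i++)
		handle_success();

	/* per the example semantics, has_error means the op after the n
	 * successes failed; the status-returning API would be used here
	 * to inspect it */
	if (has_error)
		handle_failure();
}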
>      uint16_t rte_dmadev_completed_status(dev_id, vq_id,
>                                           dma_cookie_t *cookie,
>                                           uint16_t nb_status,
>                                           uint32_t *status)
>        -- return value: the number of failed completed operations
>      And here I agree with Morten: we should design an API which adapts
>      to DPDK service scenarios. So we don't support some sound-card DMA,
>      nor the 2D memory copies mainly used in video scenarios.

Can I suggest a few adjustments here to the semantics of this API? In future
we may have operations which return a status value, e.g. our hardware can
support ops like compare equal/not-equal, which means that this API would be
meaningful even in the case of success. Therefore, I suggest that the return
value be changed to allow success also to be returned in the array, and that
the return value be not the number of failed ops, but the number of ops for
which status is being returned.

Also for consideration: when trying to implement this in a prototype in our
driver, it would be easier if we relaxed the restriction on the "completed"
API so that we can flag has_error as soon as an error is detected, rather
than guaranteeing to return all elements right up to the error. For example,
if we have a burst of packets and one is problematic, it may be easier to
flag the error at the start of the burst and then return a few successful
entries at the start of the completed_status array. [Separate from this] We
should also have a "has_error" or "more_errors" flag on this API too, to
indicate when the user can switch back to using the regular "completed" API.
This means that apps switch from one API to the other when "has_error" is
true, and only switch back when it becomes false again.

>   6) The dma_cookie_t is a signed int type; when < 0 it means error. It is
>      monotonically increasing per HW-queue (rather than per virt-queue).
>      The driver needs to guarantee this because the dmadev framework
>      doesn't manage the dma_cookie's creation.

+1 to this. I think we should also specify that the cookie is guaranteed to
wrap at a power of 2 value (UINT16_MAX??). This allows it to be used as an
index into a circular buffer just by masking; see the sketch at the end of
this mail.

>   7) Because the data-plane APIs are not thread-safe, and users can choose
>      the virt-queue to HW-queue mapping (at the queue-setup stage), it is
>      the user's duty to ensure thread-safety.
>   8) One example:
>        vq_id = rte_dmadev_queue_setup(dev, config.{HW-queue-index=x, opaque});
>        if (vq_id < 0) {
>            // creating the virt-queue failed
>            return;
>        }
>        // submit a memcpy task
>        cookie = rte_dmadev_memcpy(dev, vq_id, src, dst, len, flags);
>        if (cookie < 0) {
>            // submit failed
>            return;
>        }
>        // get the completed task
>        ret = rte_dmadev_completed(dev, vq_id, &cookie, 1, &has_error);
>        if (!has_error && ret == 1) {
>            // the memcpy completed successfully
>        }

+1

>   9) As octeontx2_dma supports an sg-list with many valid buffers in
>      dpi_dma_buf_ptr_s, it could call the rte_dmadev_memcpy_sg API.
>  10) As for ioat, it could declare support for one HW-queue at the
>      dev_configure stage, and only support creating one virt-queue.

+1

>  11) As for dpaa2_qdma, I think it could migrate to the new framework, but
>      we are still waiting for feedback from the dpaa2_qdma guys.
>  12) About the prototype src/dst parameters of the rte_dmadev_memcpy API:
>      we have two candidates, iova and void *; how about introducing a
>      dma_addr_t type which could be a va or an iova?
>
> Many thanks again.
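To illustrate the cookie-masking idea from the discussion of point 6 above,
a minimal sketch, assuming the cookie wraps at a power-of-2 boundary (e.g.
UINT16_MAX + 1); the ring size and all names are invented:

#include <stdint.h>

typedef int32_t dma_cookie_t;   /* per the proposal: signed, monotonic */

/* power of two that divides the assumed cookie wrap value */
#define CTX_RING_SIZE 1024
#define CTX_RING_MASK (CTX_RING_SIZE - 1)

static void *ctx_ring[CTX_RING_SIZE];

/* at submit time: stash per-op context in the slot the cookie maps to */
static void ctx_store(dma_cookie_t cookie, void *ctx)
{
	ctx_ring[cookie & CTX_RING_MASK] = ctx;
}

/* at completion time: the same masking recovers the context */
static void *ctx_load(dma_cookie_t cookie)
{
	return ctx_ring[cookie & CTX_RING_MASK];
}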