Date: Fri, 2 Jul 2021 08:07:35 +0100
From: Liang Ma
To: fengchengwen
Cc: Bruce Richardson, Jerin Jacob, Jerin Jacob, Morten Brørup, Nipun Gupta,
    Thomas Monjalon, Ferruh Yigit, dpdk-dev, Hemant Agrawal, Maxime Coquelin,
    Honnappa Nagarahalli, David Marchand, Satananda Burla, Prasun Kapoor
Subject: Re: [dpdk-dev] dmadev discussion summary

On Sat, Jun 26, 2021 at 11:59:49AM +0800, fengchengwen wrote:
> Hi, all
>   I analyzed the current DPDK DMA drivers and drew up this summary in
> conjunction with the previous discussion; it will serve as a basis for the
> V2 implementation. Feedback is welcome, thanks.
>
>
> dpaa2_qdma:
>   [probe]: mainly obtains the number of hardware queues.
>   [dev_configure]: has the following parameters:
>       max_hw_queues_per_core:
>       max_vqs: max number of virt-queues
>       fle_queue_pool_cnt: the size of the FLE pool
>   [queue_setup]: sets up one virt-queue, has the following parameters:
>       lcore_id:
>       flags: some control params, e.g. sg-list, long-format desc, exclusive
>              HW queue...
>       rbp: some misc fields which impact the descriptor
>       Note: this API returns the index of the virt-queue which was
>             successfully set up.
>   [enqueue_bufs]: data-plane API, the key fields:
>       vq_id: the index of the virt-queue
>       job: the pointer to the job array
>       nb_jobs:
>       Note: one job has src/dest/len/flag/cnxt/status/vq_id/use_elem fields;
>             the flag field indicates whether src/dst is a PHY addr.
>   [dequeue_bufs]: gets the pointers of the completed jobs
>
> [key point]:
>      ------------    ------------
>      |virt-queue|    |virt-queue|
>      ------------    ------------
>             \            /
>              \          /
>               \        /
>          ------------    ------------
>          | HW-queue |    | HW-queue |
>          ------------    ------------
>                 \            /
>                  \          /
>                   \        /
>                   core/rawdev
>   1) In the probe stage, the driver reports how many HW-queues can be used.
>   2) The user can specify the maximum number of HW-queues managed by a
>      single core in the dev_configure stage.
>   3) The user can create one virt-queue with the queue_setup API; the
>      virt-queue has two types: a) exclusive HW-queue, b) shared HW-queue
>      (as described above), selected by the corresponding bit of the flags
>      field.
>   4) In this mode, queue management is simplified. The user does not need
>      to specify which HW-queue to request and then create a virt-queue on
>      that HW-queue; all they need to do is say on which core the virt-queue
>      should be created.
>   5) Virt-queues can have different capabilities, e.g. virt-queue-0
>      supports the scatter-gather format while virt-queue-1 does not; this
>      is controlled by the flags and rbp fields at the queue_setup stage.
>   6) The data-plane API uses definitions similar to rte_mbuf and
>      rte_eth_rx/tx_burst().
>   PS: I still don't understand how the sg-list enqueue/dequeue works, or
>       how the user is supposed to use RTE_QDMA_VQ_NO_RESPONSE.
>
> Overall, I think it's a flexible and scalable design. In particular the
> queue resource pool architecture simplifies user invocations, although the
> 'core' concept is introduced a bit abruptly.
>
>
> octeontx2_dma:
>   [dev_configure]: has one parameter:
>       chunk_pool: it's strange that this is not managed internally by the
>                   driver but is passed in through the API.
>   [enqueue_bufs]: has three important parameters:
>       context: this is what Jerin referred to as the 'channel'; it can hold
>                the completion ring of the jobs.
>       buffers: holds the pointer array of dpi_dma_buf_ptr_s
>       count: how many dpi_dma_buf_ptr_s
>       Note: one dpi_dma_buf_ptr_s may have many src and dst pairs (it's a
>             scatter-gather list), and has one completed_ptr (when the HW
>             completes, it writes a value to this ptr); the current
>             completed_ptr struct is:
>             struct dpi_dma_req_compl_s {
>                 uint64_t cdata;  -- driver inits this, HW updates the result
>                 void (*compl_cb)(void *dev, void *arg);
>                 void *cb_data;
>             };
>   [dequeue_bufs]: has two important parameters:
>       context: the driver scans its completion ring to get completion info.
>       buffers: holds the pointer array of completed_ptr.
>
> [key point]:
>      -----------    -----------
>      | channel |    | channel |
>      -----------    -----------
>            \            /
>             \          /
>              \        /
>             ------------
>             | HW-queue |
>             ------------
>                   |
>                --------
>                |rawdev|
>                --------
>   1) The user creates one channel by initializing a context
>      (dpi_dma_queue_ctx_s); this interface is not standardized and needs to
>      be implemented by users.
>   2) Different channels can support different transfers, e.g. one for inner
>      m2m and another for inbound copy.
>
> Overall, I think the 'channel' is similar to the 'virt-queue' of
> dpaa2_qdma. The difference is that dpaa2_qdma supports multiple hardware
> queues. The 'channel' has the following properties:
>   1) A channel is an operable unit at the user level. The user can create a
>      channel for each transfer type, for example a local-to-local channel
>      and a local-to-host channel. The user can also get the completion
>      status of one channel.
>   2) Multiple channels can run on the same HW-queue. In terms of API
>      design, this reduces the number of data-plane API parameters: the
>      channel can hold context info which is referred to when the data-plane
>      APIs execute.
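
Just to make the 'channel'/'virt-queue' comparison a bit more concrete, the
kind of per-queue context both drivers seem to converge on looks roughly like
the sketch below. Every name here is purely illustrative (not a proposal for
an actual struct); it only shows that the data-plane APIs need nothing more
than the vq/channel id once the context holds the completion ring and the
HW-queue binding:

    #include <stdint.h>

    /* hypothetical per virt-queue / per channel context */
    struct dma_vq_ctx {
        uint16_t hw_queue_id;   /* HW-queue this virt-queue/channel maps to */
        uint16_t nb_desc;       /* depth of the completion ring */
        uint16_t head, tail;    /* SW indexes into the completion ring */
        uint64_t *compl_ring;   /* per-op completion words written by HW
                                 * (octeontx2's cdata plays this role) */
    };
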
>
>
> ioat:
>   [probe]: creates multiple rawdevs if it's a DSA device with multiple
>            HW-queues.
>   [dev_configure]: has three parameters:
>       ring_size: the HW descriptor ring size.
>       hdls_disable: whether to ignore the user-supplied handle params
>       no_prefetch_completions:
>   [rte_ioat_enqueue_copy]: has dev_id/src/dst/length/src_hdl/dst_hdl
>                            parameters.
>   [rte_ioat_completed_ops]: has dev_id/max_copies/status/num_unsuccessful/
>                             src_hdls/dst_hdls parameters.
>
> Overall, it is one rawdev per HW-queue, and there is no 'channel' concept
> like octeontx2_dma has.
>
>
> Kunpeng_dma:
>   1) The hardware supports multiple modes (e.g. local-to-local/
>      local-to-pciehost/pciehost-to-local/immediate-to-local copy).
>      Note: currently we only implement local-to-local copy.
>   2) The hardware supports multiple HW-queues.
>
>
> Summary:
>   1) dpaa2/octeontx2/Kunpeng are all ARM SoCs which may act as endpoints of
>      an x86 host (e.g. a smart NIC), so multiple memory transfer
>      requirements may exist, e.g. local-to-local/local-to-host...; from the
>      point of view of API design, I think we should adopt a similar
>      'channel' or 'virt-queue' concept.
>   2) Should we create a separate dmadev for each HW-queue? We discussed
>      this previously, and because HW-queues can be managed independently
>      (like Kunpeng_dma and Intel DSA), we preferred creating a separate
>      dmadev for each HW-queue. But I'm not sure if that's the case with
>      dpaa. I think that can be left to the specific driver; no restriction
>      is imposed at the framework API layer.
>   3) I think we could set up the following abstraction for a dmadev device:
>          ------------    ------------
>          |virt-queue|    |virt-queue|
>          ------------    ------------
>                 \            /
>                  \          /
>                   \        /
>              ------------    ------------
>              | HW-queue |    | HW-queue |
>              ------------    ------------
>                     \            /
>                      \          /
>                       \        /
>                          dmadev
>   4) The driver's ops design (here we only list the key points):
>      [dev_info_get]: mainly returns the number of HW-queues
>      [dev_configure]: nothing important
>      [queue_setup]: creates one virt-queue, has the following main
>                     parameters:
>          HW-queue-index: the HW-queue index to use
>          nb_desc: the number of HW descriptors
>          opaque: driver-specific info
>          Note1: this API returns the virt-queue index which will be used in
>                 later APIs. If the user wants to create multiple
>                 virt-queues on the same HW-queue, this can be achieved by
>                 calling queue_setup with the same HW-queue-index.
>          Note2: I think it's hard to define the queue_setup config
>                 parameters, and since this is a control-path API, I think
>                 it's OK to use an opaque pointer to implement it.
>      [dma_copy/memset/sg]: all have a vq_id input parameter.
>          Note: I notice dpaa can't support both single and sg in one
>                virt-queue; I think that's probably a software
>                implementation policy rather than a HW restriction, because
>                virt-queues can share the same HW-queue.
>      Here we use vq_id to handle different scenarios, like local-to-local,
>      local-to-host, etc.
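
To double-check that I read 4) correctly, the control path would end up
looking roughly like the sketch below. This is only to make the discussion
concrete; the struct name and fields are made up here and are not a proposal
for the final API:

    #include <stdint.h>

    struct rte_dmadev;  /* opaque device handle, just for illustration */

    /* hypothetical queue_setup config, mirroring the parameters in 4) */
    struct rte_dmadev_queue_conf {
        uint16_t hw_queue_index;  /* which HW-queue the virt-queue binds to */
        uint16_t nb_desc;         /* number of HW descriptors */
        void *opaque;             /* driver-specific info, e.g. whether this
                                   * virt-queue is local-to-local or
                                   * local-to-host */
    };

    /* returns the virt-queue index (>= 0) on success, < 0 on error */
    int rte_dmadev_queue_setup(struct rte_dmadev *dev,
                               const struct rte_dmadev_queue_conf *conf);

So two queue_setup calls with the same hw_queue_index but different opaque
data would give two virt-queues (e.g. one local-to-local, one local-to-host)
sharing one HW-queue, which matches Note1 above.
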
>   5) And the dmadev public data-plane API (just a prototype):
>      dma_cookie_t rte_dmadev_memset(dev, vq_id, pattern, dst, len, flags)
>        -- flags: used as an extended parameter, it could be uint32_t
>      dma_cookie_t rte_dmadev_memcpy(dev, vq_id, src, dst, len, flags)
>      dma_cookie_t rte_dmadev_memcpy_sg(dev, vq_id, sg, sg_len, flags)
>        -- sg: a struct dma_scatterlist array
>      uint16_t rte_dmadev_completed(dev, vq_id, dma_cookie_t *cookie,
>                                    uint16_t nb_cpls, bool *has_error)
>        -- nb_cpls: the maximum number of operations to process
>        -- has_error: indicates whether an error occurred
>        -- return value: the number of successfully completed operations
>        -- examples:
>           1) If there are already 32 completed ops, the 4th is an error,
>              and nb_cpls is 32, then ret will be 3 (because the 1st/2nd/3rd
>              are OK) and has_error will be true.
>           2) If there are already 32 completed ops and all completed
>              successfully, then ret will be min(32, nb_cpls) and has_error
>              will be false.
>           3) If there are already 32 completed ops and all of them failed,
>              then ret will be 0 and has_error will be true.
>      uint16_t rte_dmadev_completed_status(dev_id, vq_id, dma_cookie_t *cookie,
>                                           uint16_t nb_status, uint32_t *status)
>        -- return value: the number of failed completed operations
>      And here I agree with Morten: we should design an API which adapts to
>      DPDK service scenarios. So we don't support things like sound-card DMA
>      or the 2D memory copy which is mainly used in video scenarios.
>   6) dma_cookie_t is a signed int type; when < 0 it means error. It is
>      monotonically increasing per HW-queue (rather than per virt-queue).
>      The driver needs to guarantee this because the dmadev framework does
>      not manage the dma_cookie's creation.
>   7) Because the data-plane APIs are not thread-safe, and the user
>      determines the virt-queue to HW-queue mapping (at the queue_setup
>      stage), it is the user's duty to ensure thread safety.
>   8) One example:
>      vq_id = rte_dmadev_queue_setup(dev, config.{HW-queue-index=x, opaque});
>      if (vq_id < 0) {
>          // creating the virt-queue failed
>          return;
>      }
>      // submit memcpy task
>      cookie = rte_dmadev_memcpy(dev, vq_id, src, dst, len, flags);
>      if (cookie < 0) {
>          // submit failed
>          return;
>      }

IMO rte_dmadev_memcpy should return the number of ops successfully
submitted; that makes it easier to re-submit if the previous session was not
fully submitted.

>      // get completed task
>      ret = rte_dmadev_completed(dev, vq_id, &cookie, 1, &has_error);
>      if (!has_error && ret == 1) {
>          // the memcpy completed successfully
>      }
>   9) As octeontx2_dma supports an sg-list which has many valid buffers in
>      dpi_dma_buf_ptr_s, it could use the rte_dmadev_memcpy_sg API.
>   10) As for ioat, it could declare support for one HW-queue at the
>       dev_configure stage and only support creating one virt-queue.
>   11) As for dpaa2_qdma, I think it could migrate to the new framework, but
>       we are still waiting for feedback from the dpaa2_qdma guys.
>   12) About the src/dst parameters of the rte_dmadev_memcpy prototype, we
>       have two candidates, iova and void *; how about introducing a
>       dma_addr_t type which could be either a VA or an IOVA?
>
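
One more thought on the completion semantics in 5): on the caller side I
would expect the polling loop to look something like the pseudo-code below
(written against the prototypes above, nothing here is final; BURST and the
variable names are just for illustration):

    #define BURST 32

    dma_cookie_t cookie;
    uint32_t status[BURST];
    bool has_error = false;
    uint16_t ret, nb_err;

    /* returns how many ops completed successfully, stops at the first error */
    ret = rte_dmadev_completed(dev, vq_id, &cookie, BURST, &has_error);
    /* ... recycle the buffers of the 'ret' successful ops ... */

    if (has_error) {
        /* the following ops hit errors: fetch their per-op error codes */
        nb_err = rte_dmadev_completed_status(dev, vq_id, &cookie,
                                             BURST, status);
        /* ... handle or re-submit the 'nb_err' failed ops based on status[] ... */
    }

If that matches the intent, it also fits with my comment above about the
enqueue side reporting what was actually accepted, so re-submission after a
partially accepted burst stays simple.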