From: Liang Ma <liangma@liangbit.com>
To: fengchengwen <fengchengwen@huawei.com>
Cc: "Bruce Richardson" <bruce.richardson@intel.com>,
	"Jerin Jacob" <jerinjacobk@gmail.com>,
	"Jerin Jacob" <jerinj@marvell.com>,
	"Morten Brørup" <mb@smartsharesystems.com>,
	"Nipun Gupta" <nipun.gupta@nxp.com>,
	"Thomas Monjalon" <thomas@monjalon.net>,
	"Ferruh Yigit" <ferruh.yigit@intel.com>, dpdk-dev <dev@dpdk.org>,
	"Hemant Agrawal" <hemant.agrawal@nxp.com>,
	"Maxime Coquelin" <maxime.coquelin@redhat.com>,
	"Honnappa Nagarahalli" <honnappa.nagarahalli@arm.com>,
	"David Marchand" <david.marchand@redhat.com>,
	"Satananda Burla" <sburla@marvell.com>,
	"Prasun Kapoor" <pkapoor@marvell.com>
Subject: Re: [dpdk-dev] dmadev discussion summary
Date: Fri, 2 Jul 2021 08:07:35 +0100	[thread overview]
Message-ID: <YN67N8Jb5VGQmuw3@C02F33EJML85> (raw)
In-Reply-To: <c4a0ee30-f7b8-f8a1-463c-8eedaec82aea@huawei.com>

On Sat, Jun 26, 2021 at 11:59:49AM +0800, fengchengwen wrote:
> Hi, all
>   I analyzed the current DPDK DMA drivers and drew up this summary in
> conjunction with the previous discussion; it will serve as a basis for the V2
> implementation.
>   Feedback is welcome, thanks.
> 
> 
> dpaa2_qdma:
>   [probe]: mainly obtains the number of hardware queues.
>   [dev_configure]: has following parameters:
>       max_hw_queues_per_core:
>       max_vqs: max number of virt-queue
>       fle_queue_pool_cnt: the size of FLE pool
>   [queue_setup]: sets up one virt-queue, with the following parameters:
>       lcore_id:
>       flags: some control params, e.g. sg-list, long-format desc, exclusive HW
>              queue...
>       rbp: misc fields which affect the descriptor
>       Note: this API returns the index of the virt-queue that was
>             successfully set up.
>   [enqueue_bufs]: data-plane API, the key fields:
>       vq_id: the index of the virt-queue
>       job: pointer to the job array
>       nb_jobs:
>       Note: one job has src/dest/len/flag/cnxt/status/vq_id/use_elem fields;
>             the flag field indicates whether src/dst are PHY addrs.
>   [dequeue_bufs]: gets the completed jobs' pointers
> 
>   [key point]:
>       ------------    ------------
>       |virt-queue|    |virt-queue|
>       ------------    ------------
>              \           /
>               \         /
>                \       /
>              ------------     ------------
>              | HW-queue |     | HW-queue |
>              ------------     ------------
>                     \            /
>                      \          /
>                       \        /
>                       core/rawdev
>       1) In the probe stage, the driver reports how many HW-queues can be
>          used.
>       2) The user can specify the maximum number of HW-queues managed by a
>          single core in the dev_configure stage.
>       3) The user can create one virt-queue via the queue_setup API; the
>          virt-queue has two types: a) exclusive HW-queue, b) shared HW-queue
>          (as described above). This is selected by the corresponding bit of
>          the flags field.
>       4) In this mode, queue management is simplified. The user does not need
>          to specify which HW-queue to use and create a virt-queue on it; all
>          they need to say is on which core the virt-queue should be created.
>       5) Virt-queues can have different capabilities, e.g. virt-queue-0
>          supports scatter-gather format while virt-queue-1 does not; this is
>          controlled by the flags and rbp fields at the queue_setup stage.
>       6) The data-plane API uses definitions similar to rte_mbuf and
>          rte_eth_rx/tx_burst().
>       PS: I still don't understand how sg-list enqueue/dequeue works, or how
>           users are supposed to use RTE_QDMA_VQ_NO_RESPONSE.
> 
>       Overall, I think it's a flexible design with good scalability. In
>       particular the queue resource pool architecture simplifies user
>       invocations, although the 'core' concept is introduced a bit abruptly.
> 
> 
> octeontx2_dma:
>   [dev_configure]: has one parameter:
>       chunk_pool: it's strange that this is not managed internally by the
>                   driver, but passed in through the API.
>   [enqueue_bufs]: has three important parameters:
>       context: this is what Jerin referred to as the 'channel'; it can hold
>                the completion ring for the jobs.
>       buffers: holds the pointer array of dpi_dma_buf_ptr_s
>       count: how many dpi_dma_buf_ptr_s
>       Note: one dpi_dma_buf_ptr_s may have many src and dst pairs (it's a
>             scatter-gather list), and has one completed_ptr (when the HW
>             completes, it writes one value to this ptr); currently
>             completed_ptr points to the following struct:
>                 struct dpi_dma_req_compl_s {
>                     uint64_t cdata;  /* driver inits; HW writes the result */
>                     void (*compl_cb)(void *dev, void *arg);
>                     void *cb_data;
>                 };
>   [dequeue_bufs]: has two important parameters:
>       context: the driver scans its completion ring to get completion info.
>       buffers: holds the pointer array of completed_ptr.
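To make the completed_ptr mechanism above concrete, here is a minimal polling sketch. The sentinel value and the stripped-down struct are assumptions for illustration; this is not the octeontx2 driver code.

```c
#include <assert.h>
#include <stdint.h>

/* The driver pre-fills cdata with a sentinel; the HW overwrites it with
 * the actual result on completion. 0xFF as "pending" is an assumption. */
#define CDATA_PENDING 0xFFULL

struct compl_sketch {
	uint64_t cdata;		/* driver inits; HW writes the result */
};

/* returns 1 once the HW has written a result, 0 while still pending */
static int
compl_done(const struct compl_sketch *c)
{
	return c->cdata != CDATA_PENDING;
}
```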
> 
>   [key point]:
>       -----------    -----------
>       | channel |    | channel |
>       -----------    -----------
>              \           /
>               \         /
>                \       /
>              ------------
>              | HW-queue |
>              ------------
>                    |
>                 --------
>                 |rawdev|
>                 --------
>       1) The user can create one channel by initializing a context
>          (dpi_dma_queue_ctx_s); this interface is not standardized and needs
>          to be implemented by users.
>       2) Different channels can support different transfer types, e.g. one
>          for inner m2m and another for inbound copy.
> 
>       Overall, I think the 'channel' is similar to the 'virt-queue' of
>       dpaa2_qdma. The difference is that dpaa2_qdma supports multiple
>       hardware queues. The 'channel' has the following properties:
>       1) A channel is an operable unit at the user level. The user can create
>          a channel for each transfer type, for example a local-to-local
>          channel and a local-to-host channel, and can also get the completion
>          status of one channel.
>       2) Multiple channels can run on the same HW-queue. In terms of API
>          design, this reduces the number of data-plane API parameters: the
>          channel can hold context info which the data-plane APIs reference
>          when they execute.
> 
> 
> ioat:
>   [probe]: creates multiple rawdevs if it is a DSA device with multiple
>            HW-queues.
>   [dev_configure]: has three parameters:
>       ring_size: the size of the HW descriptor ring.
>       hdls_disable: whether to ignore user-supplied handle params
>       no_prefetch_completions:
>   [rte_ioat_enqueue_copy]: has dev_id/src/dst/length/src_hdl/dst_hdl parameters.
>   [rte_ioat_completed_ops]: has dev_id/max_copies/status/num_unsuccessful/
>                             src_hdls/dst_hdls parameters.
> 
>   Overall: one rawdev per HW-queue, without the multiple-'channel' concept
>   found in octeontx2_dma.
> 
> 
> Kunpeng_dma:
>   1) The hardware supports multiple modes (e.g. local-to-local/
>      local-to-pciehost/pciehost-to-local/immediate-to-local copy).
>      Note: Currently, we only implement local-to-local copy.
>   2) The hardware supports multiple HW-queues.
> 
> 
> Summary:
>   1) The dpaa2/octeontx2/Kunpeng devices are all ARM SoCs which may act as
>      endpoints of an x86 host (e.g. a smart NIC), so multiple memory
>      transfer requirements may exist, e.g. local-to-host/host-to-local, etc.
>      From the point of view of API design, I think we should adopt a similar
>      'channel' or 'virt-queue' concept.
>   2) Whether to create a separate dmadev for each HW-queue? We previously
>      discussed this, and because HW-queues can be managed independently (as
>      in Kunpeng_dma and Intel DSA), we preferred to create a separate dmadev
>      for each HW-queue. But I'm not sure if that's the case with dpaa. I
>      think that can be left to the specific driver; no restriction is imposed
>      at the framework API layer.
>   3) I think we could setup following abstraction at dmadev device:
>       ------------    ------------
>       |virt-queue|    |virt-queue|
>       ------------    ------------
>              \           /
>               \         /
>                \       /
>              ------------     ------------
>              | HW-queue |     | HW-queue |
>              ------------     ------------
>                     \            /
>                      \          /
>                       \        /
>                         dmadev
>   4) The driver's ops design (here we only list key points):
>      [dev_info_get]: mainly returns the number of HW-queues
>      [dev_configure]: nothing important
>      [queue_setup]: creates one virt-queue, with the following main
>                     parameters:
>          HW-queue-index: the HW-queue index to use
>          nb_desc: the number of HW descriptors
>          opaque: driver-specific info
>          Note1: this API returns the virt-queue index which is used in later
>                 APIs. If the user wants to create multiple virt-queues on the
>                 same HW-queue, that can be achieved by calling queue_setup
>                 with the same HW-queue-index.
>          Note2: I think it's hard to define a queue_setup config parameter,
>                 and since this is a control API, I think it's OK to use an
>                 opaque pointer to implement it.
>       [dma_copy/memset/sg]: all have a vq_id input parameter.
>          Note: I notice dpaa can't support single and sg copies in one
>                virt-queue, and I think that is maybe a software
>                implementation policy rather than a HW restriction, because
>                virt-queues can share the same HW-queue.
>       Here we use vq_id to handle different scenarios, like local-to-local/
>       local-to-host, etc.
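The queue_setup behaviour in Note1 (several virt-queues sharing one HW-queue, with the returned vq index keyed into later calls) can be sketched as a simple mapping table. The names here are illustrative stand-ins, not the proposed API:

```c
#include <assert.h>
#include <stdint.h>

/* Minimal sketch of the virt-queue -> HW-queue mapping implied by
 * queue_setup(): each call allocates a vq index bound to the given
 * HW-queue index; several virt-queues may share one HW-queue. */
#define MAX_VQS 8

static int8_t vq_to_hwq[MAX_VQS];	/* indexed by vq_id */
static int nb_vqs;

static int
queue_setup_sim(int hwq_index)
{
	if (nb_vqs >= MAX_VQS)
		return -1;		/* no free virt-queue slot */
	vq_to_hwq[nb_vqs] = (int8_t)hwq_index;
	return nb_vqs++;		/* returned vq_id is used by data-plane APIs */
}
```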
>   5) And the dmadev public data-plane API (just prototype):
>      dma_cookie_t rte_dmadev_memset(dev, vq_id, pattern, dst, len, flags)
>        -- flags: used as an extended parameter, it could be uint32_t
>      dma_cookie_t rte_dmadev_memcpy(dev, vq_id, src, dst, len, flags)
>      dma_cookie_t rte_dmadev_memcpy_sg(dev, vq_id, sg, sg_len, flags)
>        -- sg: struct dma_scatterlist array
>      uint16_t rte_dmadev_completed(dev, vq_id, dma_cookie_t *cookie,
>                                    uint16_t nb_cpls, bool *has_error)
>        -- nb_cpls: the maximum number of operations to process
>        -- has_error: indicates if there is an error
>        -- return value: the number of successfully completed operations.
>        -- example:
>           1) If there are already 32 completed ops, the 4th is an error, and
>              nb_cpls is 32, then the return will be 3 (because the
>              1st/2nd/3rd are OK), and has_error will be true.
>           2) If there are already 32 completed ops, all successfully
>              completed, then the return will be min(32, nb_cpls), and
>              has_error will be false.
>           3) If there are already 32 completed ops, and all of them failed,
>              then the return will be 0, and has_error will be true.
>      uint16_t rte_dmadev_completed_status(dev_id, vq_id, dma_cookie_t *cookie,
>                                           uint16_t nb_status, uint32_t *status)
>        -- return value: the number of failed completed operations.
>      And here I agree with Morten: we should design an API which adapts to
>      DPDK service scenarios. So we don't support things like sound-card DMA,
>      or the 2D memory copy which is mainly used in video scenarios.
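The three completion examples above pin down the semantics precisely enough to express in code. Here is a self-contained simulation of the proposed rte_dmadev_completed() return/has_error contract, operating on a plain status array rather than real hardware (the function name is illustrative):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return the number of leading successful completions, stopping at the
 * first failed op and reporting it via has_error, exactly as in the
 * three examples above. op_ok[i] is true if the i-th completed op
 * succeeded; nb_done is how many ops have completed so far. */
static uint16_t
sim_completed(const bool *op_ok, uint16_t nb_done, uint16_t nb_cpls,
	      bool *has_error)
{
	uint16_t n = nb_done < nb_cpls ? nb_done : nb_cpls;
	uint16_t i;

	*has_error = false;
	for (i = 0; i < n; i++) {
		if (!op_ok[i]) {
			*has_error = true;	/* stop at first failed op */
			break;
		}
	}
	return i;	/* count of leading successful completions */
}
```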
>   6) The dma_cookie_t is a signed int type; when <0 it means error. It is
>      monotonically increasing based on the HW-queue (rather than the
>      virt-queue). The driver needs to ensure this because the dmadev
>      framework doesn't manage the dma_cookie's creation.
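Point 6 implies a driver-side scheme like the following: a per-HW-queue counter that never goes negative (negative values are reserved for errors), masked down to a descriptor-ring slot. A sketch under the assumption that the ring size is a power of two:

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 1024	/* power of two, so a mask picks the ring slot */

static int32_t next_cookie;	/* per-HW-queue, driver-owned counter */

/* dma_cookie_t is signed and negative means error, so wrap back to 0
 * instead of overflowing into negative values. */
static int32_t
cookie_alloc(void)
{
	int32_t c = next_cookie;

	next_cookie = (int32_t)(((uint32_t)next_cookie + 1) & INT32_MAX);
	return c;
}

/* map a monotonically increasing cookie onto its descriptor-ring slot */
static uint32_t
cookie_to_slot(int32_t cookie)
{
	return (uint32_t)cookie & (RING_SIZE - 1);
}
```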
>   7) Because the data-plane APIs are not thread-safe, and the user
>      determines the virt-queue to HW-queue mapping (at the queue-setup
>      stage), it is the user's duty to ensure thread safety.
>   8) One example:
>      vq_id = rte_dmadev_queue_setup(dev, config.{HW-queue-index=x, opaque});
>      if (vq_id < 0) {
>         // create virt-queue failed
>         return;
>      }
>      // submit memcpy task
>      cookie = rte_dmadev_memcpy(dev, vq_id, src, dst, len, flags);
>      if (cookie < 0) {
>         // submit failed
>         return;
>      }
IMO, rte_dmadev_memcpy should return the number of ops successfully
submitted; that makes it easier to re-submit if the previous session was
not fully submitted.
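If the enqueue path returned the number of ops accepted, the caller could retry the unsubmitted tail in the usual rte_eth_tx_burst() style. A self-contained sketch, with fake_enqueue standing in for the real (not yet existing) API:

```c
#include <assert.h>
#include <stdint.h>

/* stand-in for a burst enqueue that may accept fewer jobs than asked */
static uint16_t
fake_enqueue(void **jobs, uint16_t nb_jobs)
{
	(void)jobs;
	return nb_jobs < 4 ? nb_jobs : 4;	/* pretend ring space is 4 */
}

/* retry-the-tail loop enabled by a "number submitted" return value */
static uint16_t
submit_all(void **jobs, uint16_t nb_jobs)
{
	uint16_t sent = 0;

	while (sent < nb_jobs)
		sent += fake_enqueue(jobs + sent, nb_jobs - sent);
	return sent;
}
```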
>      // get complete task
>      ret = rte_dmadev_completed(dev, vq_id, &cookie, 1, has_error);
>      if (!has_error && ret == 1) {
>         // the memcpy completed successfully
>      }
>   9) As octeontx2_dma supports sg-lists which have many valid buffers in
>      dpi_dma_buf_ptr_s, it could call the rte_dmadev_memcpy_sg API.
>   10) As for ioat, it could declare support for one HW-queue at the
>       dev_configure stage, and only support creating one virt-queue.
>   11) As for dpaa2_qdma, I think it could migrate to the new framework, but
>       we still await feedback from the dpaa2_qdma guys.
>   12) About the prototype src/dst parameters of the rte_dmadev_memcpy API,
>       we have two candidates, iova and void *; how about introducing a
>       dma_addr_t type which could be a va or an iova?
> 
