Date: Mon, 12 Jul 2021 18:00:19 +0100
From: Bruce Richardson
To: Jerin Jacob
Cc: Chengwen Feng, Thomas Monjalon, Ferruh Yigit, Jerin Jacob, dpdk-dev,
 Morten Brørup, Nipun Gupta, Hemant Agrawal, Maxime Coquelin,
 Honnappa Nagarahalli, David Marchand, Satananda Burla, Prasun Kapoor,
 "Ananyev, Konstantin", liangma@liangbit.com
References: <1625231891-2963-1-git-send-email-fengchengwen@huawei.com>
 <1625995556-41473-1-git-send-email-fengchengwen@huawei.com>
Subject: Re: [dpdk-dev] [PATCH v2] dmadev: introduce DMA device library
List-Id: DPDK patches and discussions

On Mon, Jul 12, 2021 at 10:04:07PM +0530, Jerin Jacob
wrote:
> On Mon, Jul 12, 2021 at 7:02 PM Bruce Richardson
> wrote:
> >
> > On Mon, Jul 12, 2021 at 03:29:27PM +0530, Jerin Jacob wrote:
> > > On Sun, Jul 11, 2021 at 2:59 PM Chengwen Feng wrote:
> > > >
> > > > This patch introduce 'dmadevice' which is a generic type of DMA
> > > > device.
> > > >
> > > > The APIs of dmadev library exposes some generic operations which can
> > > > enable configuration and I/O with the DMA devices.
> > > >
> > > > Signed-off-by: Chengwen Feng
> > > > ---
> > > > +/**
> > > > + * @warning
> > > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > > + *
> > > > + * Enqueue a fill operation onto the virtual DMA channel.
> > > > + *
> > > > + * This queues up a fill operation to be performed by hardware, but does not
> > > > + * trigger hardware to begin that operation.
> > > > + *
> > > > + * @param dev_id
> > > > + *   The identifier of the device.
> > > > + * @param vchan
> > > > + *   The identifier of virtual DMA channel.
> > > > + * @param pattern
> > > > + *   The pattern to populate the destination buffer with.
> > > > + * @param dst
> > > > + *   The address of the destination buffer.
> > > > + * @param length
> > > > + *   The length of the destination buffer.
> > > > + * @param flags
> > > > + *   An flags for this operation.
> > > > + *
> > > > + * @return
> > > > + *   - 0..UINT16_MAX: index of enqueued copy job.
> > >
> > > fill job
> > >
> > > > + *   - <0: Error code returned by the driver copy function.
> > > > + */
> > > > +__rte_experimental
> > > > +static inline int
> > > > +rte_dmadev_fill(uint16_t dev_id, uint16_t vchan, uint64_t pattern,
> > > > +		rte_iova_t dst, uint32_t length, uint64_t flags)
> > > > +{
> > > > +	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > > > +#ifdef RTE_DMADEV_DEBUG
> > > > +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> > > > +	RTE_FUNC_PTR_OR_ERR_RET(*dev->fill, -ENOTSUP);
> > > > +	if (vchan >= dev->data->dev_conf.max_vchans) {
> > > > +		RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> > > > +		return -EINVAL;
> > > > +	}
> > > > +#endif
> > > > +	return (*dev->fill)(dev, vchan, pattern, dst, length, flags);
> > > > +}
> > > > +
> > >
> > > > +/**
> > > > + * @warning
> > > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > > + *
> > > > + * Returns the number of operations that have been successfully completed.
> > > > + *
> > > > + * @param dev_id
> > > > + *   The identifier of the device.
> > > > + * @param vchan
> > > > + *   The identifier of virtual DMA channel.
> > > > + * @param nb_cpls
> > > > + *   The maximum number of completed operations that can be processed.
> > > > + * @param[out] last_idx
> > > > + *   The last completed operation's index.
> > > > + *   If not required, NULL can be passed in.
> > >
> > > This means the driver will be tracking the last index.
> > >
> >
> > Driver will be doing this anyway, no, since it needs to ensure we don't
>
> Yes.
>
> > wrap around?
> >
> >
> > > Is that mean, the application needs to call this API periodically to
> > > consume the completion slot.
> > > I.e up to 64K (UINT16_MAX) outstanding jobs are possible. If the
> > > application fails to call this
> > > >64K outstand job then the subsequence enqueue will fail.
> >
> > Well, given that there will be a regular enqueue ring which will probably
> > be <= 64k in size, the completion call will need to be called frequently
> > anyway.
> > I don't think we need to document this restriction as it's fairly
> > understood that you can't go beyond the size of the ring without cleanup.
>
>
> See below.
>
> >
> > >
> > > If so, we need to document this.
> > >
> > > One of the concerns of keeping UINT16_MAX as the limit is the
> > > completion memory will always not in cache.
> > > On the other hand, if we make this size programmable. it may introduce
> > > complexity in the application.
> > >
> > > Thoughts?
> >
> > The reason for using powers-of-2 sizes, e.g. 0 .. UINT16_MAX, is that the
> > ring can be any other power-of-2 size and we can index it just by masking.
> > In the sample app for dmadev, I expect the ring size used to be set the
> > same as the dmadev enqueue ring size, for simplicity.
>
> No question on not using power of 2. Aligned on that.
>
> At least in our HW, the size of the ring is rte_dmadev_vchan_conf::nb_desc.
> But completion happens in _different_ memory space. Currently, we are allocating
> UINT16_MAX entries to hold that. That's where cache miss aspects of
> completion aspects
> came.

Depending on HW, our completions can be written back to a separate memory
area - a completion ring, if you will - but I've generally found it works
as well to reuse the enqueue ring for that purpose. However, with a
separate memory area for completions, why do you need to allocate 64K
entries for the completions? Would nb_desc entries not be enough? Is that
to allow the user to have more than nb_desc jobs outstanding before calling
"get_completions" API?

>
> In your case, Is completion happens in the same ring memory(looks like
> one bit in job desc represents the job completed or not) ?
> And when application calls rte_dmadev_completed(), You are converting
> UINT16_MAX based index to
> rte_dmadev_vchan_conf::nb_desc. Right?

Yes, we are masking to do that. Actually, for simplicity and perf we should
only allow power-of-2 ring sizes. Having to use modulus instead of masking
could be a problem.
[Alternatively, I suppose we can allow drivers to round up the ring sizes
to the next power of 2, but I prefer just documenting it as a limitation].

/Bruce
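
To illustrate the masking point, here is a minimal C sketch. It is not taken
from the dmadev patch: the helper names ring_size_is_pow2(),
job_idx_to_slot() and ring_size_round_up() are hypothetical, and it assumes
the job index is a free-running uint16_t while the ring holds nb_desc
entries. It shows why a power-of-2 nb_desc matters: 65536 is a multiple of
every power of 2 that fits in 16 bits, so the slot sequence stays contiguous
when the index wraps from UINT16_MAX to 0. With nb_desc = 1000 and modulus
indexing, index 65535 would land in slot 535 and the next index (0) in slot
0, so the ring position would jump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when n is a non-zero power of two. */
static inline bool
ring_size_is_pow2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper: map a free-running 16-bit job index onto a slot
 * in a ring of nb_desc entries. Only valid for power-of-2 ring sizes,
 * where masking gives the same result as a modulus.
 */
static inline uint16_t
job_idx_to_slot(uint16_t job_idx, uint16_t nb_desc)
{
	assert(ring_size_is_pow2(nb_desc));
	return (uint16_t)(job_idx & (nb_desc - 1));
}

/*
 * Hypothetical helper for the "round up" alternative: smallest power of
 * two >= n. Assumes 1 <= n <= 2^31; other inputs are not handled.
 */
static inline uint32_t
ring_size_round_up(uint32_t n)
{
	n--;
	n |= n >> 1;
	n |= n >> 2;
	n |= n >> 4;
	n |= n >> 8;
	n |= n >> 16;
	return n + 1;
}
```

Under these assumptions a driver would either reject a nb_desc for which
ring_size_is_pow2() is false (the documented-limitation option) or
substitute ring_size_round_up(nb_desc) (the round-up option).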