From: Jerin Jacob
Date: Tue, 22 Jun 2021 22:55:24 +0530
To: fengchengwen
Cc: Bruce Richardson, Morten Brørup, Thomas Monjalon, Ferruh Yigit,
    dpdk-dev, Nipun Gupta, Hemant Agrawal, Maxime Coquelin,
    Honnappa Nagarahalli, Jerin Jacob, David Marchand, Satananda Burla,
    Prasun Kapoor
Subject: Re: [dpdk-dev] [RFC PATCH] dmadev: introduce DMA device library
References: <1623763327-30987-1-git-send-email-fengchengwen@huawei.com>
    <98CBD80474FA8B44BF855DF32C47DC35C61860@smartserver.smartshare.dk>
    <3cb0bd01-2b0d-cf96-d173-920947466041@huawei.com>
In-Reply-To: <3cb0bd01-2b0d-cf96-d173-920947466041@huawei.com>

On Fri, Jun 18, 2021 at 3:11 PM fengchengwen wrote:
>
> On 2021/6/18 13:52, Jerin Jacob wrote:
> > On Thu, Jun 17, 2021 at 2:46 PM Bruce Richardson
> > wrote:
> >>
> >> On Wed, Jun 16, 2021 at 08:07:26PM +0530, Jerin Jacob wrote:
> >>> On Wed, Jun 16, 2021 at 3:47 PM fengchengwen wrote:
> >>>>
> >>>> On 2021/6/16 15:09, Morten Brørup wrote:
> >>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> >>>>>> Sent: Tuesday, 15 June 2021 18.39
> >>>>>>
> >>>>>> On Tue, Jun 15, 2021 at 09:22:07PM +0800, Chengwen Feng wrote:
> >>>>>>> This patch introduces 'dmadevice' which is a generic type of DMA
> >>>>>>> device.
> >>>>>>>
> >>>>>>> The APIs of dmadev library exposes some generic operations which can
> >>>>>>> enable configuration and I/O with the DMA devices.
> >>>>>>>
> >>>>>>> Signed-off-by: Chengwen Feng
> >>>>>>> ---
> >>>>>> Thanks for sending this.
> >>>>>>
> >>>>>> Of most interest to me right now are the key data-plane APIs. While
> >>>>>> we are still in the prototyping phase, below is a draft of what we
> >>>>>> are thinking for the key enqueue/perform_ops/completed_ops APIs.
> >>>>>>
> >>>>>> Some key differences I note in below vs your original RFC:
> >>>>>> * Use of void pointers rather than iova addresses. While using iova's
> >>>>>>   makes sense in the general case when using hardware, in that it can
> >>>>>>   work with both physical addresses and virtual addresses, if we
> >>>>>>   change the APIs to use void pointers instead it will still work for
> >>>>>>   DPDK in VA mode, while at the same time allow use of software
> >>>>>>   fallbacks in error cases, and also a stub driver that uses memcpy
> >>>>>>   in the background. Finally, using iova's makes the APIs a lot more
> >>>>>>   awkward to use with anything but mbufs or similar buffers where we
> >>>>>>   already have a pre-computed physical address.
> >>>>>> * Use of id values rather than user-provided handles. Allowing the
> >>>>>>   user/app to manage the amount of data stored per operation is a
> >>>>>>   better solution, I feel, than prescribing a certain amount of
> >>>>>>   in-driver tracking. Some apps may not care about anything other
> >>>>>>   than a job being completed, while other apps may have significant
> >>>>>>   metadata to be tracked. Taking the user-context handles out of the
> >>>>>>   API also makes the driver code simpler.
> >>>>>> * I've kept a single combined API for completions, which differs from
> >>>>>>   the separate error handling completion API you propose. I need to
> >>>>>>   give the two function approach a bit of thought, but likely both
> >>>>>>   could work.
> >>>>>>   If we (likely) never expect failed ops, then the specifics of error
> >>>>>>   handling should not matter that much.
> >>>>>>
> >>>>>> For the rest, the control / setup APIs are likely to be rather
> >>>>>> uncontroversial, I suspect. However, I think that rather than xstats
> >>>>>> APIs, the library should first provide a set of standardized stats
> >>>>>> like ethdev does. If driver-specific stats are needed, we can add
> >>>>>> xstats later to the API.
> >>>>>>
> >>>>>> Appreciate your further thoughts on this, thanks.
> >>>>>>
> >>>>>> Regards,
> >>>>>> /Bruce
> >>>>>
> >>>>> I generally agree with Bruce's points above.
> >>>>>
> >>>>> I would like to share a couple of ideas for further discussion:
> >>>
> >>>
> >>> I believe some of the other requirements and comments for generic DMA
> >>> will be
> >>>
> >>> 1) Support for the _channel_. Each channel may have different
> >>> capabilities and functionalities.
> >>> Typical cases are that each channel has separate source and destination
> >>> devices, like DMA from PCIe EP to Host memory, Host memory to Host
> >>> memory, PCIe EP to PCIe EP.
> >>> So we need some notion of the channel in the specification.
> >>>
> >>
> >> Can you share a bit more detail on what constitutes a channel in this case?
> >> Is it equivalent to a device queue (which we are flattening to individual
> >> devices in this API), or to a specific configuration on a queue?
> >
> > It is not a queue. It is one of the attributes for transfer.
> > I.e., in the same queue, a given transfer can specify a
> > different "source" and "destination" device.
> > Like CPU to Sound card, CPU to network card etc.
> >
> >
> >>
> >>> 2) I assume current data plane APIs are not thread-safe. Is it right?
> >>>
> >> Yes.
> >>
> >>>
> >>> 3) Cookie scheme outlined earlier looks good to me.
> >>> Instead of having generic dequeue() API
> >>>
> >>> 4) Can split the rte_dmadev_enqueue_copy(uint16_t dev_id, void * src,
> >>> void * dst, unsigned int length);
> >>> into a two-stage API, where one will be used in fastpath and the other
> >>> one will be used in slowpath.
> >>>
> >>> - The slowpath API will take the channel and other attributes for the
> >>> transfer
> >>>
> >>> Example syntax will be:
> >>>
> >>> struct rte_dmadev_desc {
> >>>            channel id;
> >>>            ops ; // copy, xor, fill etc
> >>>            other arguments specific to dma transfer // it can be set
> >>>                                                     // based on capability.
> >>> };
> >>>
> >>> rte_dmadev_desc_t rte_dmadev_prepare(uint16_t dev_id, struct
> >>> rte_dmadev_desc *desc);
> >>>
> >>> - Fastpath takes arguments that need to change per transfer along with
> >>> slow-path handle.
> >>>
> >>> rte_dmadev_enqueue(uint16_t dev_id, void * src, void * dst, unsigned
> >>> int length, rte_dmadev_desc_t desc)
> >>>
> >>> This will help the driver as follows:
> >>> - The former API forms the device-specific descriptors in slow path for
> >>> a given channel and fixed attributes per transfer
> >>> - The latter API blends "variable" arguments such as src, dest address
> >>> with slow-path created descriptors
> >>>
> >>
> >> This seems like an API for a context-aware device, where the channel is the
> >> config data/context that is preserved across operations - is that correct?
> >> At least from the Intel DMA accelerators side, we have no concept of this
> >> context, and each operation is completely self-described. The location or
> >> type of memory for copies is irrelevant, you just pass the src/dst
> >> addresses to reference.
> >
> > It is not a context-aware device. Each HW JOB is self-described.
> > You can view it as different attributes of the transfer.
> >
> >
> >>
> >>> The above will give better performance and is the best trade-off
> >>> between performance and per transfer variables.
> >>
> >> We may need to have different APIs for context-aware and context-unaware
> >> processing, with which to use determined by the capabilities discovery.
> >> Given that for these DMA devices the offload cost is critical, more so than
> >> any other dev class I've looked at before, I'd like to avoid having APIs
> >> with extra parameters that need to be passed about since that just adds
> >> extra CPU cycles to the offload.
> >
> > If driver does not support additional attributes and/or the
> > application does not need it, rte_dmadev_desc_t can be NULL.
> > So that it won't have any cost in the datapath. I think we can go to
> > different API cases if we can not abstract problems without performance
> > impact. Otherwise, it will be too much pain for applications.
>
> Yes, currently we plan to use different API for different case, e.g.
>   rte_dmadev_memcpy()   -- deal with local to local memcopy
>   rte_dmadev_memset()   -- deal with fill of local memory with pattern
> maybe:
>   rte_dmadev_imm_data() -- deal with copy of very little data
>   rte_dmadev_p2pcopy()  -- deal with peer-to-peer copy of different PCIe addr
>
> These API capabilities will be reflected in the device capability set so that
> application could know by standard API.

There will be a lot of combinations of those; it becomes an M x N
cross product of cases. It won't scale.

>
> >
> > Just to understand, I think we need to look at the HW capabilities and
> > how to have a common API.
> > I assume HW will have some HW JOB descriptors which will be filled in
> > SW and submitted to HW.
> > In our HW, Job descriptor has the following main elements
> >
> > - Channel   // We don't expect the application to change per transfer
> > - Source address - It can be scatter-gather too - Will be changed per
> >   transfer
> > - Destination address - It can be scatter-gather too - Will be changed
> >   per transfer
> > - Transfer Length - It can be scatter-gather too - Will be changed
> >   per transfer
> > - IOVA address where HW posts Job completion status PER Job descriptor
> >   - Will be changed per transfer
> > - Other sideband information related to channel  // We don't expect
> >   the application to change per transfer
> > - As an option, Job completion can be posted as an event to
> >   rte_event_queue too  // We don't expect the application to change per
> >   transfer
>
> The 'option' field looks like a software interface field, but not HW descriptor.

It is in the HW descriptor.

>
> >
> > @Richardson, Bruce @fengchengwen @Hemant Agrawal
> >
> > Could you share the options for your HW descriptors which you are
> > planning to expose through API like above so that we can easily
> > converge on fastpath API
> >
>
> Kunpeng HW descriptor is self-describing, and doesn't need to refer to
> context info.
>
> Maybe the fields which are fixed for some transfer type could be set up by
> the driver, and not exposed to the application.

Yes, I agree. I think that is the reason why I thought to have an
rte_dmadev_prep() call to convert DPDK DMA transfer attributes to
HW-specific descriptors, and a single enq() operation that takes the
variable arguments (through enq parameters) and the fixed arguments
through the object returned by the rte_dmadev_prep() call.

>
> So that we could use more generic way to define the API.
>
> >
> >
> >>
> >> /Bruce
> >
> > .
> >
>
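
To make the prep()/enq() split discussed above a bit more concrete, here
is a minimal software-only sketch. Everything in it is an assumption for
illustration: the sketch_dma_* names, the struct layout, the fixed-size
handle pool and the memcpy/memset backing are hypothetical and are not
DPDK APIs or any driver's actual descriptor format. It only shows fixed
attributes being captured once in the slow path, with the fast path
taking the per-transfer arguments plus an optional handle.

```c
/* Hypothetical sketch only: none of these names are real DPDK APIs. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum sketch_dma_op { SKETCH_DMA_OP_COPY, SKETCH_DMA_OP_FILL };

/* Slow-path attributes: expected to stay fixed across many transfers. */
struct sketch_dma_desc {
    uint16_t channel;        /* e.g. host-to-host, PCIe EP to host */
    enum sketch_dma_op op;   /* copy or fill in this sketch */
    uint8_t fill_pattern;    /* used only by SKETCH_DMA_OP_FILL */
};

/* Prepared handle: here just a stored copy of the attributes. */
struct sketch_dma_prep {
    struct sketch_dma_desc attrs;
    int in_use;
};

#define SKETCH_MAX_PREP 4
static struct sketch_dma_prep sketch_prep_pool[SKETCH_MAX_PREP];

/* Slow path: validate the attributes once and hand back a handle. */
struct sketch_dma_prep *
sketch_dma_prep(uint16_t dev_id, const struct sketch_dma_desc *desc)
{
    (void)dev_id; /* single imaginary device in this sketch */
    if (desc == NULL || desc->op > SKETCH_DMA_OP_FILL)
        return NULL;
    for (size_t i = 0; i < SKETCH_MAX_PREP; i++) {
        if (!sketch_prep_pool[i].in_use) {
            sketch_prep_pool[i].attrs = *desc;
            sketch_prep_pool[i].in_use = 1;
            return &sketch_prep_pool[i];
        }
    }
    return NULL; /* pool exhausted */
}

/* Fast path: per-transfer arguments plus the optional prepared handle.
 * A NULL handle means "plain copy, no extra attributes". */
int
sketch_dma_enqueue(uint16_t dev_id, const void *src, void *dst,
                   unsigned int length, const struct sketch_dma_prep *prep)
{
    (void)dev_id;
    if (dst == NULL)
        return -1;
    if (prep == NULL || prep->attrs.op == SKETCH_DMA_OP_COPY) {
        if (src == NULL)
            return -1;
        memcpy(dst, src, length);   /* software stand-in for the HW job */
        return 0;
    }
    memset(dst, prep->attrs.fill_pattern, length);
    return 0;
}

/* Exercises both paths; returns 0 if the results are as expected. */
int
sketch_demo(void)
{
    char src[8] = "dmadev!";
    char dst[8] = {0};
    struct sketch_dma_desc fill = {
        .channel = 0, .op = SKETCH_DMA_OP_FILL, .fill_pattern = 0x5a
    };
    struct sketch_dma_prep *h = sketch_dma_prep(0, &fill);

    if (h == NULL)
        return -1;
    if (sketch_dma_enqueue(0, src, dst, sizeof(src), NULL) != 0 ||
        memcmp(src, dst, sizeof(src)) != 0)
        return -1;
    if (sketch_dma_enqueue(0, NULL, dst, sizeof(dst), h) != 0)
        return -1;
    for (size_t i = 0; i < sizeof(dst); i++)
        if ((unsigned char)dst[i] != 0x5a)
            return -1;
    return 0;
}
```

Calling sketch_demo() from a test harness runs one plain copy with a
NULL handle and one fill through a prepared handle. The NULL case
mirrors the point that the handle can be omitted when the driver or the
application does not need extra attributes; whether a real driver could
make that path truly free of cost is not something this sketch shows.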