From: Elena Agostini <eagostini@nvidia.com>
To: Jerin Jacob <jerinjacobk@gmail.com>,
NBU-Contact-Thomas Monjalon <thomas@monjalon.net>
Cc: "techboard@dpdk.org" <techboard@dpdk.org>, dpdk-dev <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v3 0/9] GPU library
Date: Tue, 19 Oct 2021 10:00:45 +0000
Message-ID: <DM6PR12MB41079FAE6B5DA35102B1BBFACDB79@DM6PR12MB4107.namprd12.prod.outlook.com>
In-Reply-To: <CALBAE1NSFdx3kdwBUGWQBVC7wqNDco8a=_i4_wZkZEROBNBcrw@mail.gmail.com>
From: Jerin Jacob <jerinjacobk@gmail.com>
Date: Monday, 11 October 2021 at 15:30
To: NBU-Contact-Thomas Monjalon <thomas@monjalon.net>
Cc: techboard@dpdk.org <techboard@dpdk.org>, Elena Agostini <eagostini@nvidia.com>, dpdk-dev <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v3 0/9] GPU library
> On Mon, Oct 11, 2021 at 6:14 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 11/10/2021 13:41, Jerin Jacob:
> > > On Mon, Oct 11, 2021 at 3:57 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > 11/10/2021 11:29, Jerin Jacob:
> > > > > On Mon, Oct 11, 2021 at 2:42 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > > 11/10/2021 10:43, Jerin Jacob:
> > > > > > > On Mon, Oct 11, 2021 at 1:48 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > > > > 10/10/2021 12:16, Jerin Jacob:
> > > > > > > > > On Fri, Oct 8, 2021 at 11:13 PM <eagostini@nvidia.com> wrote:
> > > > > > > > > >
> > > > > > > > > > From: eagostini <eagostini@nvidia.com>
> > > > > > > > > >
> > > > > > > > > > In heterogeneous computing system, processing is not only in the CPU.
> > > > > > > > > > Some tasks can be delegated to devices working in parallel.
> > > > > > > > > >
> > > > > > > > > > The goal of this new library is to enhance the collaboration between
> > > > > > > > > > DPDK, that's primarily a CPU framework, and GPU devices.
> > > > > > > > > >
> > > > > > > > > > When mixing network activity with task processing on a non-CPU device,
> > > > > > > > > > there may be the need to put in communication the CPU with the device
> > > > > > > > > > in order to manage the memory, synchronize operations, exchange info, etc..
> > > > > > > > > >
> > > > > > > > > > This library provides a number of new features:
> > > > > > > > > > - Interoperability with GPU-specific library with generic handlers
> > > > > > > > > > - Possibility to allocate and free memory on the GPU
> > > > > > > > > > - Possibility to allocate and free memory on the CPU but visible from the GPU
> > > > > > > > > > - Communication functions to enhance the dialog between the CPU and the GPU
> > > > > > > > >
> > > > > > > > > > In the RFC thread, there was one outstanding non-technical issue on this,
> > > > > > > > >
> > > > > > > > > i.e
> > > > > > > > > The above features are driver specific details. Does the DPDK
> > > > > > > > > _application_ need to be aware of this?
> > > > > > > >
> > > > > > > > I don't see these features as driver-specific.
> > > > > > >
> > > > > > > That is the disconnect. I see this as more driver-specific details
> > > > > > > which are not required to implement an "application" facing API.
> > > > > >
> > > > > > Indeed this is the disconnect.
> > > > > > I already answered but it seems you don't accept the answer.
> > > > >
> > > > > Same with you. That is why I requested, we need to get opinions from others.
> > > > > Some of them already provided opinions in RFC.
> > > >
> > > > This is why I Cc'ed techboard.
> > >
> > > Yes. Indeed.
> > >
> > > >
> > > > > > First, this is not driver-specific. It is a low-level API.
> > > > >
> > > > > What is the difference between low-level API and driver-level API.
> > > >
> > > > The low-level API provides tools to build a feature,
> > > > but no specific feature.
> > > >
> > > > > > > > For example, if we need to implement "application facing" subsystems like bbdev,
> > > > > > > If we make all this driver interface, you can still implement the
> > > > > > > bbdev API as a driver without
> > > > > > > exposing HW specific details like how devices communicate to CPU, how
> > > > > > > memory is allocated etc
> > > > > > > to "application".
> > > > > >
> > > > > > There are 2 things to understand here.
> > > > > >
> > > > > > First we want to allow the application using the GPU for needs which are
> > > > > > not exposed by any other DPDK API.
> > > > > >
> > > > > > Second, if we want to implement another DPDK API like bbdev,
> > > > > > then the GPU implementation would be exposed as a vdev in bbdev,
> > > > > > using the HW GPU device being a PCI in gpudev.
> > > > > > They are two different levels, got it?
> > > > >
> > > > > Exactly. So what is the point of exposing low-level driver API to
> > > > > "application",
> > > > > why not it is part of the internal driver API. My point is, why the
> > > > > application needs to worry
> > > > > about, How the CPU <-> Device communicated? CPU < -> Device memory
> > > > > visibility etc.
> > > >
> > > > There are two reasons.
> > > >
> > > > 1/ The application may want to use the GPU for some application-specific
> > > > needs which are not abstracted in DPDK API.
> > >
> > > Yes, exactly. That's where my concern is: if we take this path, what is
> > > the motivation to contribute to DPDK abstracted subsystem APIs which
> > > make sense for multiple vendors and every
> > > Similar stuff applicable for DPU,
> >
> > A feature-specific API is better of course, there is no loss of motivation.
> > But you cannot forbid applications to have their own features on GPU.
>
> it still can use it. We don't need DPDK APIs for that.
>
> >
> > > To put it another way, if the GPU is doing some ethdev offload, why not make it
> > > an ethdev offload in the ethdev spec, so that
> > > another type of device can be used and it makes sense for application writers.
> >
> > If we do ethdev offload, yes we'll implement it.
> > And we'll do it on top of gpudev, which is the only way to share the GPU.
> >
> > > For example, In the future, If someone needs to add ML(Machine
> > > learning) subsystem and enable a proper subsystem
> > > interface that is good for DPDK. If this path is open, there is no
> > > motivation for contribution and the application
> > > will not have a standard interface doing the ML job across multiple vendors.
> >
> > Wrong. It does not remove the motivation, it is a first step to build on top of it.
>
> IMO, there is no need to make the driver API public in order to build a feature API.
>
> >
> > > That is the only reason why I am saying it should not be an APPLICATION
> > > interface; it can be a DRIVER interface.
> > >
> > > >
> > > > 2/ This API may also be used by some feature implementation internally
> > > > in some DPDK libs or drivers.
> > > > We cannot skip the gpudev layer because this is what allows generic probing
> > > > of the HW, and gpudev allows to share the GPU with multiple features
> > > > implemented in different libs or drivers, thanks to the "child" concept.
> > >
> > > Again, why do applications need to know it? It is similar to a `bus`
> > > kind of thing, where it is sharing the physical resources.
> >
> > No it's not a bus, it is a device that we need to share.
> >
> > > > > > > > > aka DPDK device class has a fixed personality and it has API to deal
> > > > > > > > > with abstracting specific application specific
> > > > > > > > > end user functionality like ethdev, cryptodev, eventdev irrespective
> > > > > > > > > of underlying bus/device properties.
> > > > > > > >
> > > > > > > > The goal of the lib is to allow anyone to invent any feature
> > > > > > > > which is not already available in DPDK.
> > > > > > > >
> > > > > > > > > Even similar semantics are required for DPU(Smart NIC)
> > > > > > > > > > communication. I am planning to
> > > > > > > > > send RFC in coming days to address the issue without the application
> > > > > > > > > knowing the Bus/HW/Driver details.
> > > > > > > >
> > > > > > > > gpudev is not exposing bus/hw/driver details.
> > > > > > > > I don't understand what you mean.
> > > > > > >
> > > > > > > See above.
> >
> > We are going into circles.
>
> Yes.
>
> > In short, Jerin wants to forbid the generic use of GPU in DPDK.
>
> See below.
Honestly, I don’t see a real problem with releasing the library at the application level.
Doing so doesn’t prevent other DPDK libraries/drivers from using it internally if needed.
Applications can benefit from this library for a number of reasons:
* Enhance the interaction between a GPU-specific library and DPDK
* Hide GPU-specific implementation details from the “final” user who wants to build a GPU + DPDK application
* Measure network throughput with common tools like testpmd using GPU memory
Please be aware that this is just a starting point.
I’m planning to expose a number of features (at the memory and processing levels) that can be useful
to enhance the communication among the GPU, CPU and NIC, hiding the implementation
details within the library/driver.
>
> > He wants only feature-specific API.
>
> To re-reiterate, feature-specific "application" API. A device-specific
> bit can be
> driver API and accessible to the out-of-tree driver if needed.
>
> IMO, if we take this path, DPU, XPU, GPU, etc we need N different libraries to
> get the job done for a specific feature for the dataplane.
> Instead, Enabling public feature APIs will make the application
> portable and does not
> need to worry about which type of *PU it runs.
>
As I have stated multiple times, let’s start with something simple that works and
then think about how to enhance the library/driver.
IMHO it doesn’t make sense to address all the use cases now.
This is a completely new scenario we’re opening in the DPDK context; let’s start
from the basics.
>
> > It is like restricting the functions we can run on a CPU.
> >
> > And anyway we need this layer to share the GPU between multiple features.
>
> No disagreement there. Is that layer public application API or not is
> the question.
> it is like PCI device API calls over of the application and makes the
> application device specific.
>
> >
> > Techboard please vote.
>
> Yes.
>
> >
> >