DPDK patches and discussions
From: Jerin Jacob <jerinjacobk@gmail.com>
To: Shahaf Shuler <shahafs@mellanox.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	Thomas Monjalon <thomas@monjalon.net>,
	 "olivier.matz@6wind.com" <olivier.matz@6wind.com>,
	"wwasko@nvidia.com" <wwasko@nvidia.com>,
	 "spotluri@nvidia.com" <spotluri@nvidia.com>,
	Asaf Penso <asafp@mellanox.com>,
	 Slava Ovsiienko <viacheslavo@mellanox.com>
Subject: Re: [dpdk-dev] [RFC PATCH 20.02] mbuf: hint PMD not to inline packet
Date: Tue, 22 Oct 2019 20:47:06 +0530
Message-ID: <CALBAE1PK-hrOhTYuFKSK+GjDwGwj1kvBcf0eraJsvxsrUkZX3Q@mail.gmail.com>
In-Reply-To: <AM0PR0502MB3795A1E5B802160D91BEA0D7C3680@AM0PR0502MB3795.eurprd05.prod.outlook.com>

On Tue, Oct 22, 2019 at 11:56 AM Shahaf Shuler <shahafs@mellanox.com> wrote:
>
> Thursday, October 17, 2019 8:19 PM, Jerin Jacob:
> > Subject: Re: [dpdk-dev] [RFC PATCH 20.02] mbuf: hint PMD not to inline
> > packet
> >
> > On Thu, Oct 17, 2019 at 4:30 PM Shahaf Shuler <shahafs@mellanox.com>
> > wrote:
> > >
> > > Thursday, October 17, 2019 11:17 AM, Jerin Jacob:
> > > > Subject: Re: [dpdk-dev] [RFC PATCH 20.02] mbuf: hint PMD not to
> > > > inline packet
> > > >
> > > > On Thu, Oct 17, 2019 at 12:57 PM Shahaf Shuler
> > > > <shahafs@mellanox.com>
> > > > wrote:
> > > > >
> > > > > > Some PMDs inline the mbuf data buffer directly to device. This is
> > > > > > in order to save the overhead of the PCI headers involved when the
> > > > > > device DMA reads the buffer pointer. For some devices it is
> > > > > > essential in order to reach the peak BW.
> > > > > >
> > > > > > However, there are cases where such inlining is inefficient. For
> > > > > > example when the data buffer resides on other device memory (like
> > > > > > GPU or storage device). An attempt to inline such a buffer will result
> > > > > > in high PCI overhead for reading and copying the data from the remote
> > > > > > device.
> > > >
> > > > > Some questions to understand the use case
> > > > > # Is this a use case where CPU, local DRAM, NW card and GPU memory
> > > > > are connected on the coherent bus?
> > >
> > > > Yes. For example one can allocate GPU memory and map it to the GPU bar,
> > > > make it accessible from the host CPU through LD/ST.
> > >
> > > > > # Assuming the CPU needs to touch the buffer prior to Tx, will it
> > > > > be useful in that case?
> > >
> > > > If the CPU needs to modify the data then no. It will be more efficient to
> > > > copy the data to the CPU and then send it.
> > > > However, there are use cases where the data is DMA'd w/ zero copy to the
> > > > GPU (for example), the GPU performs the processing on the data, and then the
> > > > CPU sends the mbuf (w/o touching the data).
> >
> > OK. If I understand it correctly, it is for offloading the Network/Compute
> > functions to GPU from NW card and/or CPU.
>
> Mostly the compute. The networking in this model is expected to be done by the CPU.
> Note this is only one use case.
>
> >
> > >
> > > > > # How does the application know that the data buffer is in GPU memory,
> > > > > in order to use this flag efficiently?
> > >
> > > > Because it made it happen. For example, it attached the mbuf external
> > > > buffer from the other device memory.
> > >
> > > > > # Just a random thought: would it help if we create two different
> > > > > mempools, one from local DRAM and one from GPU memory, so that the
> > > > > application can work transparently?
> > >
> > > > But you will still need to teach the PMD which pool it can inline and which
> > > > it cannot.
> > > > IMO it is more generic to have it per mbuf. Moreover, the application has this
> > > > info.
> >
> > IMO, we cannot use the PKT_TX_DONT_INLINE_HINT flag for generic
> > applications. The application usage will be tightly coupled with the platform
> > and capabilities of GPU or Host CPU etc.
> >
> > I think pushing this logic to the application is a bad idea. But if you are writing
> > some custom application and you need per-packet control, then
> > this flag may be the only way.
>
> Yes. This flag is for custom applications that do unique acceleration (by doing zero copy for compute/compression/encryption accelerators) on specific platforms.
> Such an application is fully aware of the platform and the location where the data resides, hence it is very simple for it to know how to set this flag.

# If it is per packet, it will be an implicit requirement to add it to the mbuf.

If so,
# Does it make sense to add it through dynamic mbuf? Maybe it is not
worth it for a single bit.

Since we have only 17 bits (40 - 23) remaining for Rx and Tx and it is
a custom application requirement,
how about adding a PKT_PMD_CUSTOM1 flag so that similar requirements from other PMDs
can leverage the same bit for such custom applications? (We have a
similar use case for a smart NIC: it does not make much sense for generic
applications, but it is needed per packet.)
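
To illustrate the shared-bit idea, here is a rough sketch, not DPDK code.
Assumptions: struct sketch_mbuf is a stand-in for struct rte_mbuf,
PKT_PMD_CUSTOM1 is placed at bit 39 only because that is where the RFC puts
its hint, and the two pmd_* helpers are hypothetical. The point is that one
generic bit can carry a different per-packet meaning for each PMD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for struct rte_mbuf; only the offload flags matter here. */
struct sketch_mbuf {
	uint64_t ol_flags;
};

/* From the quoted patch: the lowest Tx flag defined at this point. */
#define PKT_TX_METADATA (1ULL << 40)

/* ASSUMPTION: hypothetical shared bit, placed where the RFC puts its hint. */
#define PKT_PMD_CUSTOM1 (1ULL << 39)

/* The custom bit has to stay below the Tx flags that already exist. */
static_assert(PKT_PMD_CUSTOM1 < PKT_TX_METADATA,
	      "custom bit overlaps existing Tx flags");

/* Hypothetical PMD A: reads the shared bit as "do not inline this packet". */
static inline bool
pmd_a_may_inline(const struct sketch_mbuf *m)
{
	return (m->ol_flags & PKT_PMD_CUSTOM1) == 0;
}

/* Hypothetical PMD B (smart NIC): reads the same bit as its own request. */
static inline bool
pmd_b_custom_requested(const struct sketch_mbuf *m)
{
	return (m->ol_flags & PKT_PMD_CUSTOM1) != 0;
}
```

With ol_flags left at 0, both helpers report today's default behaviour; the
application has to know which PMD it is talking to before setting the bit.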

>
> Note, this flag is 0 by default, meaning no hint, and generic applications work the same as today.

> > > > > >
> > > > > To support a mixed traffic pattern (some buffers from local DRAM,
> > > > > some buffers from other devices) with high BW, a hint flag is
> > > > > introduced in the mbuf.
> > > > > Application will hint the PMD whether or not it should try to
> > > > > inline the given mbuf data buffer. PMD should do best effort to
> > > > > act upon this request.
> > > > >
> > > > > Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
> > > > > ---
> > > > >  lib/librte_mbuf/rte_mbuf.h | 9 +++++++++
> > > > >  1 file changed, 9 insertions(+)
> > > > >
> > > > > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > > > > index 98225ec80b..5934532b7f 100644
> > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > @@ -203,6 +203,15 @@ extern "C" {
> > > > >  /* add new TX flags here */
> > > > >
> > > > >  /**
> > > > > > + * Hint to PMD to not inline the mbuf data buffer to device
> > > > > > + * rather let the device use its DMA engine to fetch the data with the
> > > > > > + * provided pointer.
> > > > > > + *
> > > > > > + * This flag is only a hint. PMD should enforce it as best effort.
> > > > > + */
> > > > > +#define PKT_TX_DONT_INLINE_HINT (1ULL << 39)
> > > > > +
> > > > > +/**
> > > > >   * Indicate that the metadata field in the mbuf is in use.
> > > > >   */
> > > > >  #define PKT_TX_METADATA        (1ULL << 40)
> > > > > --
> > > > > 2.12.0
> > > > >

Thread overview: 13+ messages
2019-10-17  7:27 Shahaf Shuler
2019-10-17  8:16 ` Jerin Jacob
2019-10-17 10:59   ` Shahaf Shuler
2019-10-17 17:18     ` Jerin Jacob
2019-10-22  6:26       ` Shahaf Shuler
2019-10-22 15:17         ` Jerin Jacob [this message]
2019-10-23 11:24           ` Shahaf Shuler
2019-10-25 11:17             ` Jerin Jacob
2019-10-17 15:14 ` Stephen Hemminger
2019-10-22  6:29   ` Shahaf Shuler
2019-12-11 17:01 ` [dpdk-dev] [RFC v2] mlx5/net: " Viacheslav Ovsiienko
2019-12-27  8:59   ` Olivier Matz
2020-01-14  7:57 ` [dpdk-dev] [PATCH] net/mlx5: update Tx datapath to support no inline hint Viacheslav Ovsiienko
