DPDK patches and discussions
 help / color / mirror / Atom feed
From: Konstantin Ananyev <konstantin.ananyev@huawei.com>
To: "Morten Brørup" <mb@smartsharesystems.com>,
	"Paul Szczepanek" <paul.szczepanek@arm.com>,
	"konstantin.v.ananyev@yandex.ru" <konstantin.v.ananyev@yandex.ru>
Cc: "nd@arm.com" <nd@arm.com>, "dev@dpdk.org" <dev@dpdk.org>,
	"Honnappa Nagarahalli" <Honnappa.Nagarahalli@arm.com>
Subject: RE: RE: [PATCH v5 0/4] add pointer compression API
Date: Thu, 16 May 2024 08:40:55 +0000	[thread overview]
Message-ID: <f12eee42f80a4f948f594c1a04e4c5a7@huawei.com> (raw)
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35E9F45C@smartserver.smartshare.dk>



> > From: Paul Szczepanek [mailto:paul.szczepanek@arm.com]
> > Sent: Wednesday, 15 May 2024 19.01
> >
> > On 04/03/2024 14:44, Konstantin Ananyev wrote:
> > >> This feature is targeted for pipeline mode of applications. We see
> > many customers using pipeline mode. This feature helps in reducing
> > >> the cost of transferring the packets between cores by reducing the
> > copies involved.
> > >
> > > I do understand the intention, and I am not arguing about usefulness
> > of the pipeline model.
> > > My point is you are introducing new API: compress/decompress pointers,
> > > but don't provide (or even describe) any proper way for the developer
> > to use it in a safe and predictable manner.
> > > Which from my perspective make it nearly useless and misleading.
> >
> > In the latest version there is an example in the docs showing how to use
> > it. There is an integration test that shows how to use it. The comments
> > in the header also provide detailed guidance.
> >
> > >> For an application with multiple pools, it depends on how the
> > applications are using multiple pools. But, if there is a bunch of
> > packets
> > >> belonging to multiple mempools, compressing those mbufs may not be
> > possible. But if those mbufs are grouped per mempool and
> > >> are transferred on different queues, then it is possible. Hence the
> > APIs are implemented very generically.
> > >
> > > Ok, let's consider even more simplistic scenario - all pointers belong
> > to one mempool.
> > > AFAIK, even one mempool can contain elements from different memzones,
> > > and these memzones are not guaranteed to have consecutive VAs.
> > > So even one mempool, with total size <=4GB can contain elements with
> > distances between them more than 4GB.
> > > Now let say at startup user created a mempool, how he can determine
> > programmatically
> > > can he apply your compress API safely on it or not?
> > > I presume that if you are serious about this API usage, then such
> > ability has to be provided.
> > > Something like:
> > >
> > > int compress_pointer_deduce_mempool_base(const struct rte_memepool
> > *mp[],
> > > 	uint32_t nb_mp, uint32_t compress_size, uintptr_t *base_ptr);
> > >
> > > Or probably even more generic one:
> > >
> > > struct mem_buf {uintptr_t base, size_t len;};
> > > int compress_pointer_deduce_base(const struct mem_buf *mem_buf[],
> > > 	uint32_t nb_membuf, uint32_t compress_size, uintptr_t *base_ptr);
> > >
> > > Even with these functions in-place, user has to be extra careful:
> > >  - he can't add new memory chunks to these mempools (or he'll need to
> > re-calcualte the new base_ptr)
> > >  - he needs to make sure that pointers from only these mempools will
> > be used by compress/decompress.
> > > But at least it provides some ability to use this feature in real
> > apps.
> > >
> > > With such API in place it should be possible to make the auto-test
> > more realistic:
> > > - allocate mempool
> > > - deduce base_pointer
> > > - then we can have a loop with producer/consumer to mimic realistic
> > workload.
> > >     As an example:
> > >      producer(s):  mempool_alloc(); <fill mbuf with some values>;
> > ring_enqueue();
> > >      consumer(s): ring_dequeue(); <read_and_check_mbuf_data>;
> > free_mbuf();
> > > - free mempool
> >
> > I understand your objections and agree that the proposed API is limited
> > in its applicability due to its strict requirements.
> >
> > AFAIK DPDK rte_mempool does require the addresses to be virtually
> > contiguous as the memory reservation is done during creation of the
> > mempool and a single memzone is reserved.
> 
> No, it does not.
> rte_pktmbuf_pool_create() first creates an empty mempool using rte_mempool_create_empty(), and then populates it using
> rte_mempool_populate_default(), which may use multiple memzones.

Yep. 

> As one possible solution to this, the application can call rte_pktmbuf_pool_create() as usual, and then check that mp-
> >nb_mem_chunks == 1 to confirm that all objects in the created mempool reside in one contiguous chunk of memory and thus
> compression can be used.

That's exactly what I am asking for: a new API call that would take number of bits to compress as a parameter,
check mempool layout (number of chunks, base addr and size of each chunk)and based on that either
return base ptr to use, or an error code.  
 
> Or even better, add a new mempool flag to the mempool library, e.g. RTE_MEMPOOL_F_COMPRESSIBLE, specifying that the
> mempool objects must be allocated as one chunk of memory with contiguous addresses.
> Unfortunately, rte_pktmbuf_pool_create() is missing the memzone flags parameter, so a new rte_pktmbuf_pool_create() API with
> the flags parameter added would also need to be added.

Hmm.., I didn't think about that, but yes, I think such API might be useful:
at mempool_create() define a memory layout constraints that it has to obey.
If we'll make it generic enough, I suppose might be helpful for different use-cases.  

> > However, I do not require
> > users to use the rte_mempool as the API accepts pointers so other
> > mempool implementations could indeed allow non-contiguous VAs.
> >
> > To help decide at compile time if compression is allowed I will add
> > extra macros to the header
> >
> > #define BITS_REQUIRED_TO_STORE_VALUE(x) \
> > 	((x) == 0 ? 1 : ((sizeof(x)) - __builtin_clzl(x)))
> >
> > #define BIT_SHIFT_FROM_ALIGNMENT(x) ((x) == 0 ? 0 : __builtin_clzl(x))
> >
> > #define CAN_USE_RTE_PTR_COMPRESS_16(memory_range, object_alignment) \
> > 	((BITS_REQUIRED_TO_STORE_VALUE(memory_range) - \
> > 	BIT_SHIFT_FROM_ALIGNMENT(object_alignment)) <= 16 ? 1 : 0)
> >
> > #define CAN_USE_RTE_PTR_COMPRESS_32(memory_range, object_alignment) \
> > 	((BITS_REQUIRED_TO_STORE_VALUE(memory_range) - \
> > 	BIT_SHIFT_FROM_ALIGNMENT(object_alignment)) <= 32 ? 1 : 0)
> >
> > And explain usage in the docs.
> >
> > The API is very generic and does not even require you to use a mempool.
> > There are no runtime checks to verify or calculate if the pointers can
> > be compressed. 

Yes, and that is the main problem, I believe.

>This is because doing this at runtime would remove any
> > gains achieved through compression. The code doing the compression needs
> > to remain limited in size, branching and execution time to remain fast.
> > This is IMHO the nature of C applications, they trade off runtime checks
> > for performance.

I think this is not correct and you are mixing few completely different things together.
I believe that well written C code (as any other) should do runtime checks.
Now, in DPDK we often do omit run-time checks for input parameters at
Performance critical path.
But we do encourage people to do as many checks as they can at control-path.
As an example - we don't do much checks at eth_rx_burst() itself,
but we do a lot of checks at setup phase: dev_configure(), rx_queue_setup(), etc.
With ptr compression API, I think it can done in the same way:
You can provide an API that would do check is ptr compression applicable for
his current memory layout or not.
It doesn't have to be fast, as user will call it just once at setup stage - before data-path is started.
See what Morten suggested above.   

>> Program correctness needs to be enforced through other
> > means, linting, valgrind, tests, peer review, etc. It is up to the
> > programmer to calculate and decide on the viability of compression as it
> > cannot be done at compile time automatically. There is no way for me to
> > programmatically verify the alignment and distance of the pointers being
> > passed in at compile time as I don't require the user to use any
> > particular mempool implementation.
> >
> > These limitations are clearly documented in the API and the guide.
> >
> > > Or probably you can go even further: take some existing pipeline
> > sample app and make it use compress/decompress API.
> > > That will provide people with some ability to test it and measure it's
> > perf impact.
> > > Again, it will provide an example of the amount of changes required to
> > enable it.
> > > My speculation here that majority of users will find the effort too
> > big,
> > > while the gain way too limited and fragile.
> > > But at least, there would be some realistic reference point for it and
> > users can decide themselves is it worth it or not.
> >
> > I have added a performance test that runs the compression and
> > decompression loop with different burst sizes so that you can easily
> > test if attempting compression is worth the the effort in your usecase.
> > This is documented in the guide.
> >
> > >
> > >>>
> > >>> I would like to add:
> > >>> If we want to offer optimizations specifically for applications with
> > a single mbuf pool, I think it should be considered in a system-wide
> > >> context to determine if performance could be improved in more areas.
> > >>> E.g. removing the pool field from the rte_mbuf structure might free
> > up space to move hot fields from the second cache line to the
> > >> first, so the second cache line rarely needs to be touched. (As an
> > alternative to removing the pool field, it could be moved to the
> > >> second cache line, only to be used if the global "single mbuf pool"
> > is NULL.)
> > >> Agree on this. The feedback I have received is on similar lines, many
> > are using simple features. I also received feedback that 90% of
> > >> the applications use less than 4GB of memory for mbuf and burst sizes
> > are up to 256.
> > >
> > > Well, from my perspective the story is completely different:
> > > Majority of real-world apps I am aware do use multiple mempools,
> > > it is also not uncommon to have a mempools with size bigger then 4GB
> > (8/16).
> > > Again, there are queries to make mempools growable/shrinkable on
> > demand.
> >
> > You can use this API with mempools even as big as 32GB as long as your
> > alignment allows for sufficient shift (as explained in the headers and
> > docs) and rte_mempool objects will have at least 8 bytes alignment so
> > can fit a 32GB mempool in. It is true that you cannot use it if you
> > cannot guarantee the address range at compile time. This utility is not
> > a golden bullet to use on every pointer.
> 
> For future proofing, please rename the compression functions to include the compression algorithm, i.e. "shift" or similar, in the
> function names.
> 
> Specifically I'm thinking about an alternative "multiply" compression algorithm based on division/multiplication by a constant
> "multiplier" parameter (instead of the "bit_shift" parameter).
> This "multiplier" would typically be the object size of the packet mbuf mempool.
> The "multiplier" could be constant at built time, e.g. 2368, or determined at runtime.
> I don't know the performance of division/multiplication compared to bit shifting for various CPUs, but it would make compression to
> 16 bit compressed pointers viable for more applications.
> 
> The perf test in this series could be used to determine compression/decompression performance of such an algorithm, and the
> application developer can determine which algorithm to use; "shift" with 32 bit compressed pointers, or "multiply" with 16 bit
> compressed pointers.


  parent reply	other threads:[~2024-05-16  8:40 UTC|newest]

Thread overview: 111+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-27 15:08 [RFC 0/2] " Paul Szczepanek
2023-09-27 15:08 ` [RFC 1/2] eal: add pointer compression functions Paul Szczepanek
2023-10-09 15:54   ` Thomas Monjalon
2023-10-11 13:36     ` Honnappa Nagarahalli
2023-10-11 16:43       ` Paul Szczepanek
2023-10-11 12:43   ` [RFC v2 0/2] add pointer compression API Paul Szczepanek
2023-10-11 12:43     ` [RFC v2 1/2] eal: add pointer compression functions Paul Szczepanek
2023-10-11 12:43     ` [RFC v2 2/2] test: add pointer compress tests to ring perf test Paul Szczepanek
2023-10-31 18:10   ` [PATCH v3 0/3] add pointer compression API Paul Szczepanek
2023-10-31 18:10     ` [PATCH v3 1/3] eal: add pointer compression functions Paul Szczepanek
2023-10-31 18:10     ` [PATCH v3 2/3] test: add pointer compress tests to ring perf test Paul Szczepanek
2023-10-31 18:10     ` [PATCH v3 3/3] docs: add pointer compression to the EAL guide Paul Szczepanek
2023-11-01  7:42     ` [PATCH v3 0/3] add pointer compression API Morten Brørup
2023-11-01 12:52       ` Paul Szczepanek
2023-11-01 12:46   ` [PATCH v4 0/4] " Paul Szczepanek
2023-11-01 12:46     ` [PATCH v4 1/4] eal: add pointer compression functions Paul Szczepanek
2023-11-01 12:46     ` [PATCH v4 2/4] test: add pointer compress tests to ring perf test Paul Szczepanek
2023-11-01 12:46     ` [PATCH v4 3/4] docs: add pointer compression to the EAL guide Paul Szczepanek
2023-11-01 12:46     ` [PATCH v4 4/4] test: add unit test for ptr compression Paul Szczepanek
2023-11-01 18:12   ` [PATCH v5 0/4] add pointer compression API Paul Szczepanek
2023-11-01 18:12     ` [PATCH v5 1/4] eal: add pointer compression functions Paul Szczepanek
2024-02-11 15:32       ` Konstantin Ananyev
2023-11-01 18:12     ` [PATCH v5 2/4] test: add pointer compress tests to ring perf test Paul Szczepanek
2023-11-01 18:13     ` [PATCH v5 3/4] docs: add pointer compression to the EAL guide Paul Szczepanek
2023-11-01 18:13     ` [PATCH v5 4/4] test: add unit test for ptr compression Paul Szczepanek
2024-02-22  8:15     ` [PATCH v5 0/4] add pointer compression API Paul Szczepanek
2024-02-22 16:16       ` Konstantin Ananyev
2024-03-01 11:16         ` Morten Brørup
2024-03-01 16:12           ` Patrick Robb
2024-03-01 19:57           ` Honnappa Nagarahalli
2024-03-02 10:33             ` Morten Brørup
2024-03-06 22:31               ` Paul Szczepanek
2024-03-07  2:13                 ` Honnappa Nagarahalli
2024-03-04 14:44             ` Konstantin Ananyev
2024-05-15 17:00               ` Paul Szczepanek
2024-05-15 22:34                 ` Morten Brørup
2024-05-16  8:25                   ` Paul Szczepanek
2024-05-16  8:40                   ` Konstantin Ananyev [this message]
2024-05-24  8:33                     ` Paul Szczepanek
2024-05-24  9:09                       ` Konstantin Ananyev
2024-05-28 19:29                         ` Paul Szczepanek
2024-05-29 10:28                           ` Paul Szczepanek
2024-02-29 16:03   ` [PATCH v6 " Paul Szczepanek
2024-02-29 16:03     ` [PATCH v6 1/4] eal: add pointer compression functions Paul Szczepanek
2024-02-29 16:03     ` [PATCH v6 2/4] test: add pointer compress tests to ring perf test Paul Szczepanek
2024-02-29 16:03     ` [PATCH v6 3/4] docs: add pointer compression to the EAL guide Paul Szczepanek
2024-02-29 16:03     ` [PATCH v6 4/4] test: add unit test for ptr compression Paul Szczepanek
2024-03-01 10:21   ` [PATCH v7 0/4] add pointer compression API Paul Szczepanek
2024-03-01 10:21     ` [PATCH v7 1/4] eal: add pointer compression functions Paul Szczepanek
2024-03-07 11:22       ` David Marchand
2024-03-01 10:21     ` [PATCH v7 2/4] test: add pointer compress tests to ring perf test Paul Szczepanek
2024-03-07 11:27       ` David Marchand
2024-03-01 10:21     ` [PATCH v7 3/4] docs: add pointer compression to the EAL guide Paul Szczepanek
2024-03-01 10:21     ` [PATCH v7 4/4] test: add unit test for ptr compression Paul Szczepanek
2024-03-07 11:30       ` David Marchand
2024-03-07 20:39   ` [PATCH v7 0/4] add pointer compression API Paul Szczepanek
2024-03-07 20:39     ` [PATCH v8 1/4] ptr_compress: add pointer compression library Paul Szczepanek
2024-03-07 20:39     ` [PATCH v8 2/4] test: add pointer compress tests to ring perf test Paul Szczepanek
2024-03-07 20:39     ` [PATCH v8 3/4] docs: add pointer compression guide Paul Szczepanek
2024-03-07 20:39     ` [PATCH v8 4/4] test: add unit test for ptr compression Paul Szczepanek
2024-03-08  8:27     ` [PATCH v7 0/4] add pointer compression API David Marchand
2024-03-10 19:34       ` Honnappa Nagarahalli
2024-03-11  7:44         ` David Marchand
2024-03-11 14:47   ` [PATCH v9 0/5] " Paul Szczepanek
2024-03-11 14:47     ` [PATCH v9 1/5] lib: allow libraries with no sources Paul Szczepanek
2024-03-11 15:23       ` Bruce Richardson
2024-03-15  8:33         ` Paul Szczepanek
2024-03-11 14:47     ` [PATCH v9 2/5] ptr_compress: add pointer compression library Paul Szczepanek
2024-03-11 14:47     ` [PATCH v9 3/5] test: add pointer compress tests to ring perf test Paul Szczepanek
2024-03-11 14:47     ` [PATCH v9 4/5] docs: add pointer compression guide Paul Szczepanek
2024-03-11 14:47     ` [PATCH v9 5/5] test: add unit test for ptr compression Paul Szczepanek
2024-03-11 20:31   ` [PATCH v10 0/5] add pointer compression API Paul Szczepanek
2024-03-11 20:31     ` [PATCH v10 1/5] lib: allow libraries with no sources Paul Szczepanek
2024-03-15  9:14       ` Bruce Richardson
2024-03-11 20:31     ` [PATCH v10 2/5] ptr_compress: add pointer compression library Paul Szczepanek
2024-03-11 20:31     ` [PATCH v10 3/5] test: add pointer compress tests to ring perf test Paul Szczepanek
2024-03-11 20:31     ` [PATCH v10 4/5] docs: add pointer compression guide Paul Szczepanek
2024-03-11 20:31     ` [PATCH v10 5/5] test: add unit test for ptr compression Paul Szczepanek
2024-05-24  8:36   ` [PATCH v11 0/6] add pointer compression API Paul Szczepanek
2024-05-24  8:36     ` [PATCH v11 1/6] lib: allow libraries with no sources Paul Szczepanek
2024-05-24  8:36     ` [PATCH v11 2/6] mempool: add functions to get extra mempool info Paul Szczepanek
2024-05-24 12:20       ` Morten Brørup
2024-05-28 19:33         ` Paul Szczepanek
2024-05-24  8:36     ` [PATCH v11 3/6] ptr_compress: add pointer compression library Paul Szczepanek
2024-05-24 12:50       ` Morten Brørup
2024-05-24  8:36     ` [PATCH v11 4/6] test: add pointer compress tests to ring perf test Paul Szczepanek
2024-05-24  8:36     ` [PATCH v11 5/6] docs: add pointer compression guide Paul Szczepanek
2024-05-24  8:36     ` [PATCH v11 6/6] test: add unit test for ptr compression Paul Szczepanek
2024-05-29 10:22   ` [PATCH v12 0/6] add pointer compression API Paul Szczepanek
2024-05-29 10:22     ` [PATCH v12 1/6] lib: allow libraries with no sources Paul Szczepanek
2024-05-29 10:22     ` [PATCH v12 2/6] mempool: add functions to get extra mempool info Paul Szczepanek
2024-05-29 11:47       ` Morten Brørup
2024-05-29 13:56       ` Morten Brørup
2024-05-29 16:18         ` Paul Szczepanek
2024-05-30  0:56           ` Du, Frank
2024-05-29 10:22     ` [PATCH v12 3/6] ptr_compress: add pointer compression library Paul Szczepanek
2024-05-29 11:52       ` Morten Brørup
2024-05-29 10:22     ` [PATCH v12 4/6] test: add pointer compress tests to ring perf test Paul Szczepanek
2024-05-29 10:22     ` [PATCH v12 5/6] docs: add pointer compression guide Paul Szczepanek
2024-05-29 10:22     ` [PATCH v12 6/6] test: add unit test for ptr compression Paul Szczepanek
2024-05-30  9:40   ` [PATCH v13 0/6] add pointer compression API Paul Szczepanek
2024-05-30  9:40     ` [PATCH v13 1/6] lib: allow libraries with no sources Paul Szczepanek
2024-05-30  9:40     ` [PATCH v13 2/6] mempool: add functions to get extra mempool info Paul Szczepanek
2024-05-31  9:32       ` Morten Brørup
2024-05-30  9:40     ` [PATCH v13 3/6] ptr_compress: add pointer compression library Paul Szczepanek
2024-05-30  9:40     ` [PATCH v13 4/6] test: add pointer compress tests to ring perf test Paul Szczepanek
2024-05-30  9:40     ` [PATCH v13 5/6] docs: add pointer compression guide Paul Szczepanek
2024-05-30  9:40     ` [PATCH v13 6/6] test: add unit test for ptr compression Paul Szczepanek
2024-05-30 13:35     ` [PATCH v13 0/6] add pointer compression API Paul Szczepanek
2023-09-27 15:08 ` [RFC 2/2] test: add pointer compress tests to ring perf test Paul Szczepanek
2023-10-09 15:48   ` Thomas Monjalon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f12eee42f80a4f948f594c1a04e4c5a7@huawei.com \
    --to=konstantin.ananyev@huawei.com \
    --cc=Honnappa.Nagarahalli@arm.com \
    --cc=dev@dpdk.org \
    --cc=konstantin.v.ananyev@yandex.ru \
    --cc=mb@smartsharesystems.com \
    --cc=nd@arm.com \
    --cc=paul.szczepanek@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).