From: "Varghese, Vipin" <Vipin.Varghese@amd.com>
To: "Bruce Richardson" <bruce.richardson@intel.com>,
	"Morten Brørup" <mb@smartsharesystems.com>
Cc: "Yigit, Ferruh" <Ferruh.Yigit@amd.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"stable@dpdk.org" <stable@dpdk.org>,
	"honest.jiang@foxmail.com" <honest.jiang@foxmail.com>,
	"P, Thiyagarajan" <Thiyagarajan.P@amd.com>
Subject: Re: [PATCH] app/dma-perf: replace pktmbuf with mempool objects
Date: Tue, 12 Dec 2023 17:13:42 +0000	[thread overview]
Message-ID: <MN2PR12MB3085AD6D53497A37C5FB638D828EA@MN2PR12MB3085.namprd12.prod.outlook.com> (raw)
In-Reply-To: <ZXh-ObnTwmu69WaF@bricha3-MOBL.ger.corp.intel.com>


Sharing a few critical points below, based on my exposure to the dma-perf application.

<Snipped>

On Tue, Dec 12, 2023 at 04:16:20PM +0100, Morten Brørup wrote:
> +TO: Bruce, please stop me if I'm completely off track here.
>
> > From: Ferruh Yigit [mailto:ferruh.yigit@amd.com] Sent: Tuesday, 12
> > December 2023 15.38
> >
> > On 12/12/2023 11:40 AM, Morten Brørup wrote:
> > >> From: Vipin Varghese [mailto:vipin.varghese@amd.com] Sent: Tuesday,
> > >> 12 December 2023 11.38
> > >>
> > >> Replace the pktmbuf pool with a plain mempool; this allows an
> > >> increase in MOPS, especially at lower buffer sizes. Using a mempool
> > >> reduces the extra CPU cycles.
> > >
> > > I get the point of this change: It tests the performance of copying
> > raw memory objects using respectively rte_memcpy and DMA, without the
> > mbuf indirection overhead.
> > >
> > > However, I still consider the existing test relevant: The performance
> > of copying packets using respectively rte_memcpy and DMA.
> > >
> >
> > This is a DMA performance test application and packets are not used;
> > using pktmbuf just introduces overhead to the main focus of the
> > application.
> >
> > I am not sure if pktmbuf was selected intentionally for this test
> > application, but I assume it is there for historical reasons.
>
> I think pktmbuf was selected intentionally, to provide more accurate
> results for application developers trying to determine when to use
> rte_memcpy and when to use DMA. Much like the "copy breakpoint" in Linux
> Ethernet drivers is used to determine which code path to take for each
> received packet.

Yes Ferruh, this is the right understanding. In the DPDK examples we already have
the dma-forward application, which copies a pktmbuf payload over to a new
pktmbuf payload area.

By moving to mempool, we are now focusing on the source and destination buffers.
This allows us to create mempool objects with 2MB and 1GB src-dst areas, keeping the
focus on the src-to-dst copy. With pktmbuf we were not able to achieve the same.
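
To make the contrast concrete, below is a minimal, hypothetical sketch (not the
actual patch; pool names and object counts are illustrative only):
rte_pktmbuf_pool_create() takes its data room size as a uint16_t, so a single
mbuf cannot back a 2MB area, whereas a plain rte_mempool_create() element size
is a 32-bit value and can.

#include <rte_mbuf.h>
#include <rte_mempool.h>
#include <rte_lcore.h>

#define NB_OBJS 64u

static struct rte_mempool *
create_pools_sketch(void)
{
	/* pktmbuf pool: data_room_size is a uint16_t, so each mbuf holds
	 * at most ~64KB of payload -- far short of a 2MB or 1GB buffer. */
	struct rte_mempool *pkt_pool = rte_pktmbuf_pool_create(
		"dma_pkt_pool", NB_OBJS, 0 /* cache */, 0 /* priv */,
		RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
	(void)pkt_pool;

	/* plain mempool: elt_size is 32-bit, so a flat 2MB element backed
	 * by hugepages is possible, giving plain src/dst areas to copy. */
	return rte_mempool_create("dma_raw_pool", NB_OBJS,
				  2u * 1024 * 1024 /* elt_size */,
				  0 /* cache */, 0 /* priv */,
				  NULL, NULL, NULL, NULL,
				  rte_socket_id(), 0 /* flags */);
}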


>
> Most applications will be working with pktmbufs, so these applications
> will also experience the pktmbuf overhead. Performance testing with the
> same overhead as the application will be better to help the application
> developer determine when to use rte_memcpy and when to use DMA when
> working with pktmbufs.

Morten, thank you for the input, but as shared above the DPDK example dma-fwd already does
justice to that scenario. In line with test-compress-perf and test-crypto-perf, IMHO test-dma-perf
should focus on getting the best values for the DMA engine and the memcpy comparison.
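
For reference, the comparison being described boils down to timing the two copy
paths below. This is a hedged, self-contained sketch using the public dmadev API,
not code taken from test-dma-perf; dev_id/vchan configuration and IOVA setup are
assumed to have been done elsewhere.

#include <stdbool.h>
#include <rte_dmadev.h>
#include <rte_memcpy.h>

/* SW path: plain CPU copy. */
static inline void
sw_copy(void *dst, const void *src, uint32_t len)
{
	rte_memcpy(dst, src, len);
}

/* HW path: enqueue one copy on a dmadev and poll for its completion.
 * dev_id and vchan are assumed to be configured and started already. */
static inline int
hw_copy(int16_t dev_id, uint16_t vchan,
	rte_iova_t src, rte_iova_t dst, uint32_t len)
{
	uint16_t last_idx;
	bool err = false;

	if (rte_dma_copy(dev_id, vchan, src, dst, len,
			 RTE_DMA_OP_FLAG_SUBMIT) < 0)
		return -1;
	while (rte_dma_completed(dev_id, vchan, 1, &last_idx, &err) == 0)
		; /* busy-poll; a real benchmark batches and amortizes this */
	return err ? -1 : 0;
}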

>
> (Furthermore, for the pktmbuf tests, I wonder if copying performance
> could also depend on IOVA mode and RTE_IOVA_IN_MBUF.)
>
> Nonetheless, there may also be use cases where raw mempool objects are
> being copied by rte_memcpy or DMA, so adding tests for these use cases
> is useful.
>
>
> @Bruce, you were also deeply involved in the DMA library, and probably
> have more up-to-date practical experience with it. Am I right that
> pktmbuf overhead in these tests provides more "real life use"-like
> results? Or am I completely off track with my thinking here, i.e. the
> pktmbuf overhead is only noise?
>
I'm actually not that familiar with the dma-test application, so can't
comment on the specific overhead involved here. In the general case, if we
are just talking about the overhead of dereferencing the mbufs then I would
expect the overhead to be negligible. However, if we are looking to include
the cost of allocation and freeing of buffers, I'd try to avoid that as it
is a cost that would have to be paid for both SW copies and HW copies, so
should not count when calculating offload cost.

/Bruce

Bruce, as per test-dma-perf there is no repeated pktmbuf alloc or free, so the pktmbuf
overhead discussed here is not related to alloc and free. As per my investigation, the
cost goes into fetching the mbuf cacheline and performing mtod on each iteration.
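
To illustrate the per-iteration cost being referred to, here is a hypothetical
sketch (function and variable names are mine, not from test-dma-perf): the mbuf
path touches each mbuf header and resolves mtod inside the copy loop, while the
raw-object path works on already-resolved pointers.

#include <stdint.h>
#include <rte_mbuf.h>
#include <rte_memcpy.h>

static void
copy_via_mbufs(struct rte_mbuf **src, struct rte_mbuf **dst,
	       uint32_t nb_bufs, uint32_t copy_len)
{
	for (uint32_t i = 0; i < nb_bufs; i++) {
		/* each iteration pulls the mbuf header cacheline and
		 * resolves the payload address via mtod */
		void *s = rte_pktmbuf_mtod(src[i], void *);
		void *d = rte_pktmbuf_mtod(dst[i], void *);
		rte_memcpy(d, s, copy_len);
	}
}

static void
copy_via_raw_objs(void **src, void **dst, uint32_t nb_bufs, uint32_t copy_len)
{
	for (uint32_t i = 0; i < nb_bufs; i++)
		rte_memcpy(dst[i], src[i], copy_len); /* addresses pre-resolved */
}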


I can rewrite the logic to make use of pktmbuf objects by passing the src and dst with pre-computed
mtod addresses to avoid the overhead. But this will not resolve the 2MB and 1GB huge-page copy
alloc failures. IMHO, along similar lines to the other perf applications, the dma-perf application
should focus on actual device performance over application performance.


Thread overview: 11+ messages
2023-12-12 10:37 Vipin Varghese
2023-12-12 11:40 ` Morten Brørup
2023-12-12 14:38   ` Ferruh Yigit
2023-12-12 15:16     ` Morten Brørup
2023-12-12 15:37       ` Bruce Richardson
2023-12-12 17:13         ` Varghese, Vipin [this message]
2023-12-12 18:09           ` Morten Brørup
2023-12-12 18:13             ` Varghese, Vipin
2023-12-20  9:17               ` Varghese, Vipin
2023-12-20  9:21                 ` David Marchand
2023-12-19 16:35 Vipin Varghese
