From: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
To: fengchengwen <fengchengwen@huawei.com>,
Amit Prakash Shukla <amitprakashs@marvell.com>,
Cheng Jiang <honest.jiang@foxmail.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, Jerin Jacob <jerinj@marvell.com>,
Anoob Joseph <anoobj@marvell.com>,
Kevin Laatz <kevin.laatz@intel.com>,
Bruce Richardson <bruce.richardson@intel.com>,
Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
Subject: RE: [EXT] Re: [PATCH v10 4/4] app/dma-perf: add SG copy support
Date: Thu, 29 Feb 2024 13:16:21 +0000 [thread overview]
Message-ID: <CO1PR18MB471463517CFD024EA9CD87F1CB5F2@CO1PR18MB4714.namprd18.prod.outlook.com> (raw)
In-Reply-To: <8148d7bc-f7a3-ad8c-71bf-c3f9c3c31a01@huawei.com>
Hi Fengcheng,
> -----Original Message-----
> From: fengchengwen <fengchengwen@huawei.com>
> Sent: Wednesday, February 28, 2024 3:02 PM
> To: Amit Prakash Shukla <amitprakashs@marvell.com>; Cheng Jiang
> <honest.jiang@foxmail.com>
> Cc: dev@dpdk.org; Jerin Jacob <jerinj@marvell.com>; Anoob Joseph
> <anoobj@marvell.com>; Kevin Laatz <kevin.laatz@intel.com>; Bruce
> Richardson <bruce.richardson@intel.com>; Pavan Nikhilesh Bhagavatula
> <pbhagavatula@marvell.com>; Gowrishankar Muthukrishnan
> <gmuthukrishn@marvell.com>
> Subject: [EXT] Re: [PATCH v10 4/4] app/dma-perf: add SG copy support
>
> External Email
>
> ----------------------------------------------------------------------
> Hi Gowrishankar,
>
> On 2024/2/28 2:56, Amit Prakash Shukla wrote:
> > From: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
> >
> > Add SG copy support.
> >
> > Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
> > Acked-by: Anoob Joseph <anoobj@marvell.com>
> > Acked-by: Chengwen Feng <fengchengwen@huawei.com>
> > ---
> > v10:
> > - SG config variables renamed.
> >
> > app/test-dma-perf/benchmark.c | 278
> > +++++++++++++++++++++++++++++-----
> > app/test-dma-perf/config.ini | 25 ++-
> > app/test-dma-perf/main.c | 34 ++++-
> > app/test-dma-perf/main.h | 5 +-
> > 4 files changed, 300 insertions(+), 42 deletions(-)
> >
> > diff --git a/app/test-dma-perf/benchmark.c
> > b/app/test-dma-perf/benchmark.c index 0047e2f4b8..25ed6fa6d0 100644
> > --- a/app/test-dma-perf/benchmark.c
> > +++ b/app/test-dma-perf/benchmark.c
> > @@ -46,6 +46,10 @@ struct lcore_params {
> > uint16_t test_secs;
> > struct rte_mbuf **srcs;
> > struct rte_mbuf **dsts;
> > + struct rte_dma_sge *src_sges;
> > + struct rte_dma_sge *dst_sges;
> > + uint8_t src_ptrs;
> > + uint8_t dst_ptrs;
>
> 1. src/dst_ptrs -> src/dst_nb_sge
Ack.
> 2. How about wrap these four fields as a struct?
Ack.
>
> > volatile struct worker_info worker_info; };
> >
> > @@ -86,21 +90,31 @@ calc_result(uint32_t buf_size, uint32_t nr_buf,
> > uint16_t nb_workers, uint16_t te }
> >
> > static void
> > -output_result(uint8_t scenario_id, uint32_t lcore_id, char *dma_name,
> uint16_t ring_size,
> > - uint16_t kick_batch, uint64_t ave_cycle, uint32_t
> buf_size, uint32_t nr_buf,
> > - float memory, float bandwidth, float mops, bool
> is_dma)
> > +output_result(struct test_configure *cfg, struct lcore_params *para,
> > + uint16_t kick_batch, uint64_t ave_cycle, uint32_t
> buf_size,
> > + uint32_t nr_buf, float memory, float bandwidth, float
> mops)
> > {
> > - if (is_dma)
> > - printf("lcore %u, DMA %s, DMA Ring Size: %u, Kick Batch Size:
> %u.\n",
> > - lcore_id, dma_name, ring_size, kick_batch);
> > - else
> > + uint16_t ring_size = cfg->ring_size.cur;
> > + uint8_t scenario_id = cfg->scenario_id;
> > + uint32_t lcore_id = para->lcore_id;
> > + char *dma_name = para->dma_name;
> > +
> > + if (cfg->is_dma) {
> > + printf("lcore %u, DMA %s, DMA Ring Size: %u, Kick Batch Size:
> %u", lcore_id,
> > + dma_name, ring_size, kick_batch);
> > + if (cfg->is_sg)
> > + printf(" DMA src ptrs: %u, dst ptrs: %u",
> > + para->src_ptrs, para->dst_ptrs);
>
> DMA src sges: %u DMA dst sges: %u
>
> I think we should add a column which title maybe misc, some like sg-src[4]-
> dst[1], and later we may add fill test, then this field could be pattern-
> 0x12345678
>
> And in "[PATCH v10 2/4] app/dma-perf: add PCI device support" commit, if
> the DMA was worked in non-mem2mem direction, we could add simple
> descriptor of direction and pcie.info in the above misc column.
>
I am sorry, I could not understand complete picture here. Do you mean we
reserve a column and use it as per test type.
For plain mem copy, nothing added.
For SG mem copy, instead of showing "DMA src sges: 1, dst sges: 4", print "sg-src[1]-dst[4]".
In future, when we add fill test in benchmark, this line instead be "pattern-0x12345678".
Is my understanding correct over here ?
> > + printf(".\n");
> > + } else {
> > printf("lcore %u\n", lcore_id);
> > + }
> >
> > printf("Average Cycles/op: %" PRIu64 ", Buffer Size: %u B, Buffer
> Number: %u, Memory: %.2lf MB, Frequency: %.3lf Ghz.\n",
> > ave_cycle, buf_size, nr_buf, memory,
> rte_get_timer_hz()/1000000000.0);
> > printf("Average Bandwidth: %.3lf Gbps, MOps: %.3lf\n", bandwidth,
> > mops);
> >
> > - if (is_dma)
> > + if (cfg->is_dma)
> > snprintf(output_str[lcore_id], MAX_OUTPUT_STR_LEN,
> CSV_LINE_DMA_FMT,
> > scenario_id, lcore_id, dma_name, ring_size,
> kick_batch, buf_size,
> > nr_buf, memory, ave_cycle, bandwidth, mops); @@ -
> 167,7 +181,7 @@
> > vchan_data_populate(uint32_t dev_id, struct rte_dma_vchan_conf *qconf,
> >
> > /* Configuration of device. */
> > static void
> > -configure_dmadev_queue(uint32_t dev_id, struct test_configure *cfg)
> > +configure_dmadev_queue(uint32_t dev_id, struct test_configure *cfg,
> > +uint8_t ptrs_max)
> > {
> > uint16_t vchan = 0;
> > struct rte_dma_info info;
> > @@ -190,6 +204,10 @@ configure_dmadev_queue(uint32_t dev_id, struct
> test_configure *cfg)
> > rte_exit(EXIT_FAILURE, "Error, no configured queues reported
> on device id. %u\n",
> > dev_id);
> >
> > + if (info.max_sges < ptrs_max)
> > + rte_exit(EXIT_FAILURE, "Error, DMA ptrs more than supported
> by
> > +device id %u.\n",
>
> "Error with unsupport max_sges on device id %u.\n"
Ack.
>
> > + dev_id);
> > +
> > if (rte_dma_start(dev_id) != 0)
> > rte_exit(EXIT_FAILURE, "Error with dma start.\n"); } @@ -
> 202,8
> > +220,12 @@ config_dmadevs(struct test_configure *cfg)
> > uint32_t i;
> > int dev_id;
> > uint16_t nb_dmadevs = 0;
> > + uint8_t ptrs_max = 0;
>
> It hard to understand, how about nb_sge?
Ack. Renamed it to nb_sges.
>
> > char *dma_name;
> >
> > + if (cfg->is_sg)
> > + ptrs_max = RTE_MAX(cfg->src_ptrs, cfg->dst_ptrs);
> > +
> > for (i = 0; i < ldm->cnt; i++) {
> > dma_name = ldm->dma_names[i];
> > dev_id = rte_dma_get_dev_id_by_name(dma_name);
> > @@ -213,7 +235,7 @@ config_dmadevs(struct test_configure *cfg)
> > }
> >
> > ldm->dma_ids[i] = dev_id;
> > - configure_dmadev_queue(dev_id, cfg);
> > + configure_dmadev_queue(dev_id, cfg, ptrs_max);
> > ++nb_dmadevs;
> > }
> >
> > @@ -253,7 +275,7 @@ do_dma_submit_and_poll(uint16_t dev_id,
> uint64_t *async_cnt,
> > }
> >
> > static inline int
> > -do_dma_mem_copy(void *p)
> > +do_dma_plain_mem_copy(void *p)
> > {
> > struct lcore_params *para = (struct lcore_params *)p;
> > volatile struct worker_info *worker_info = &(para->worker_info);
> > @@ -306,6 +328,65 @@ do_dma_mem_copy(void *p)
> > return 0;
> > }
> >
> > +static inline int
> > +do_dma_sg_mem_copy(void *p)
> > +{
> > + struct lcore_params *para = (struct lcore_params *)p;
> > + volatile struct worker_info *worker_info = &(para->worker_info);
> > + struct rte_dma_sge *src_sges = para->src_sges;
> > + struct rte_dma_sge *dst_sges = para->dst_sges;
> > + const uint16_t kick_batch = para->kick_batch;
> > + const uint8_t src_ptrs = para->src_ptrs;
> > + const uint8_t dst_ptrs = para->dst_ptrs;
> > + const uint16_t dev_id = para->dev_id;
> > + uint32_t nr_buf = para->nr_buf;
> > + uint64_t async_cnt = 0;
> > + uint32_t poll_cnt = 0;
> > + uint16_t nr_cpl;
> > + uint32_t i, j;
> > + int ret;
> > +
> > + nr_buf /= RTE_MAX(src_ptrs, dst_ptrs);
> > + worker_info->stop_flag = false;
> > + worker_info->ready_flag = true;
> > +
> > + while (!worker_info->start_flag)
> > + ;
> > +
> > + while (1) {
> > + j = 0;
> > + for (i = 0; i < nr_buf; i++) {
> > +dma_copy:
> > + ret = rte_dma_copy_sg(dev_id, 0,
> > + &src_sges[i * src_ptrs], &dst_sges[j *
> dst_ptrs],
> > + src_ptrs, dst_ptrs, 0);
> > + if (unlikely(ret < 0)) {
> > + if (ret == -ENOSPC) {
> > + do_dma_submit_and_poll(dev_id,
> &async_cnt, worker_info);
> > + goto dma_copy;
> > + } else
> > + error_exit(dev_id);
> > + }
> > + async_cnt++;
> > + j++;
> > +
> > + if ((async_cnt % kick_batch) == 0)
> > + do_dma_submit_and_poll(dev_id,
> &async_cnt, worker_info);
> > + }
> > +
> > + if (worker_info->stop_flag)
> > + break;
> > + }
> > +
> > + rte_dma_submit(dev_id, 0);
> > + while ((async_cnt > 0) && (poll_cnt++ < POLL_MAX)) {
> > + nr_cpl = rte_dma_completed(dev_id, 0, MAX_DMA_CPL_NB,
> NULL, NULL);
> > + async_cnt -= nr_cpl;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > static inline int
> > do_cpu_mem_copy(void *p)
> > {
> > @@ -347,8 +428,9 @@ dummy_free_ext_buf(void *addr, void *opaque)
> > }
> >
> > static int
> > -setup_memory_env(struct test_configure *cfg, struct rte_mbuf ***srcs,
> > - struct rte_mbuf ***dsts)
> > +setup_memory_env(struct test_configure *cfg,
> > + struct rte_mbuf ***srcs, struct rte_mbuf ***dsts,
> > + struct rte_dma_sge **src_sges, struct rte_dma_sge
> **dst_sges)
> > {
> > static struct rte_mbuf_ext_shared_info *ext_buf_info;
> > unsigned int cur_buf_size = cfg->buf_size.cur;
> > @@ -409,8 +491,8 @@ setup_memory_env(struct test_configure *cfg,
> struct rte_mbuf ***srcs,
> > }
> >
> > for (i = 0; i < nr_buf; i++) {
> > - memset(rte_pktmbuf_mtod((*srcs)[i], void *), rte_rand(),
> buf_size);
> > - memset(rte_pktmbuf_mtod((*dsts)[i], void *), 0, buf_size);
> > + memset(rte_pktmbuf_mtod((*srcs)[i], void *), rte_rand(),
> cur_buf_size);
> > + memset(rte_pktmbuf_mtod((*dsts)[i], void *), 0,
> cur_buf_size);
> > }
> >
> > if (cfg->transfer_dir == RTE_DMA_DIR_DEV_TO_MEM ||
> > @@ -446,20 +528,56 @@ setup_memory_env(struct test_configure *cfg,
> struct rte_mbuf ***srcs,
> > }
> > }
> >
> > + if (cfg->is_sg) {
> > + uint8_t src_ptrs = cfg->src_ptrs;
> > + uint8_t dst_ptrs = cfg->dst_ptrs;
> > + uint32_t sglen_src, sglen_dst;
> > +
> > + *src_sges = rte_zmalloc(NULL, nr_buf * sizeof(struct
> rte_dma_sge),
> > + RTE_CACHE_LINE_SIZE);
> > + if (*src_sges == NULL) {
> > + printf("Error: src_sges array malloc failed.\n");
> > + return -1;
> > + }
> > +
> > + *dst_sges = rte_zmalloc(NULL, nr_buf * sizeof(struct
> rte_dma_sge),
> > + RTE_CACHE_LINE_SIZE);
> > + if (*dst_sges == NULL) {
> > + printf("Error: dst_sges array malloc failed.\n");
> > + return -1;
> > + }
> > +
> > + sglen_src = cur_buf_size / src_ptrs;
> > + sglen_dst = cur_buf_size / dst_ptrs;
> > +
> > + for (i = 0; i < nr_buf; i++) {
> > + (*src_sges)[i].addr = rte_pktmbuf_iova((*srcs)[i]);
> > + (*src_sges)[i].length = sglen_src;
> > + if (!((i+1) % src_ptrs))
> > + (*src_sges)[i].length += (cur_buf_size %
> src_ptrs);
> > +
> > + (*dst_sges)[i].addr = rte_pktmbuf_iova((*dsts)[i]);
> > + (*dst_sges)[i].length = sglen_dst;
> > + if (!((i+1) % dst_ptrs))
> > + (*dst_sges)[i].length += (cur_buf_size %
> dst_ptrs);
> > + }
> > + }
> > +
> > return 0;
> > }
> >
> > int
> > -mem_copy_benchmark(struct test_configure *cfg, bool is_dma)
> > +mem_copy_benchmark(struct test_configure *cfg)
> > {
> > - uint32_t i;
> > + uint32_t i, j;
> > uint32_t offset;
> > unsigned int lcore_id = 0;
> > struct rte_mbuf **srcs = NULL, **dsts = NULL, **m = NULL;
> > + struct rte_dma_sge *src_sges = NULL, *dst_sges = NULL;
> > struct lcore_dma_map_t *ldm = &cfg->lcore_dma_map;
> > + const uint32_t mcore_id = rte_get_main_lcore();
> > unsigned int buf_size = cfg->buf_size.cur;
> > uint16_t kick_batch = cfg->kick_batch.cur;
> > - uint32_t nr_buf = cfg->nr_buf = (cfg->mem_size.cur * 1024 * 1024) /
> (cfg->buf_size.cur * 2);
> > uint16_t nb_workers = ldm->cnt;
> > uint16_t test_secs = cfg->test_secs;
> > float memory = 0;
> > @@ -467,12 +585,32 @@ mem_copy_benchmark(struct test_configure
> *cfg, bool is_dma)
> > uint32_t avg_cycles_total;
> > float mops, mops_total;
> > float bandwidth, bandwidth_total;
> > + uint32_t nr_sgsrc = 0, nr_sgdst = 0;
> > + uint32_t nr_buf;
> > int ret = 0;
> >
> > - if (setup_memory_env(cfg, &srcs, &dsts) < 0)
> > + /* Align number of buffers according to workers count */
> > + nr_buf = (cfg->mem_size.cur * 1024 * 1024) / (cfg->buf_size.cur * 2);
> > + nr_buf -= (nr_buf % nb_workers);
> > + if (cfg->is_sg) {
> > + nr_buf /= nb_workers;
> > + nr_buf -= nr_buf % (cfg->src_ptrs * cfg->dst_ptrs);
> > + nr_buf *= nb_workers;
> > +
> > + if (cfg->dst_ptrs > cfg->src_ptrs) {
> > + nr_sgsrc = (nr_buf / cfg->dst_ptrs * cfg->src_ptrs);
> > + nr_sgdst = nr_buf;
> > + } else {
> > + nr_sgsrc = nr_buf;
> > + nr_sgdst = (nr_buf / cfg->src_ptrs * cfg->dst_ptrs);
> > + }
> > + }
>
> pls move to a subfunction
Ack.
>
> > +
> > + cfg->nr_buf = nr_buf;
> > + if (setup_memory_env(cfg, &srcs, &dsts, &src_sges, &dst_sges) < 0)
> > goto out;
> >
> > - if (is_dma)
> > + if (cfg->is_dma)
> > if (config_dmadevs(cfg) < 0)
> > goto out;
> >
> > @@ -486,13 +624,23 @@ mem_copy_benchmark(struct test_configure
> *cfg, bool is_dma)
> >
> > for (i = 0; i < nb_workers; i++) {
> > lcore_id = ldm->lcores[i];
> > + if (lcore_id == mcore_id) {
> > + printf("lcore parameters can not use main core id
> %d\n", mcore_id);
> > + goto out;
> > + }
> > +
> > + if (rte_eal_lcore_role(lcore_id) == ROLE_OFF) {
> > + printf("lcore parameters can not use offline core id
> %d\n", lcore_id);
> > + goto out;
> > + }
>
> The above two judgement should in a seperate commit.
Sorry, somehow it got mixed from different patch I had in my local repo.
It will be in different commit.
>
> > +
> > offset = nr_buf / nb_workers * i;
> > lcores[i] = rte_malloc(NULL, sizeof(struct lcore_params), 0);
> > if (lcores[i] == NULL) {
> > printf("lcore parameters malloc failure for lcore %d\n",
> lcore_id);
> > break;
> > }
> > - if (is_dma) {
> > + if (cfg->is_dma) {
> > lcores[i]->dma_name = ldm->dma_names[i];
> > lcores[i]->dev_id = ldm->dma_ids[i];
> > lcores[i]->kick_batch = kick_batch;
> > @@ -506,10 +654,23 @@ mem_copy_benchmark(struct test_configure
> *cfg, bool is_dma)
> > lcores[i]->scenario_id = cfg->scenario_id;
> > lcores[i]->lcore_id = lcore_id;
> >
> > - if (is_dma)
> > - rte_eal_remote_launch(do_dma_mem_copy, (void
> *)(lcores[i]), lcore_id);
> > - else
> > + if (cfg->is_sg) {
> > + lcores[i]->src_ptrs = cfg->src_ptrs;
> > + lcores[i]->dst_ptrs = cfg->dst_ptrs;
> > + lcores[i]->src_sges = src_sges + (nr_sgsrc / nb_workers
> * i);
> > + lcores[i]->dst_sges = dst_sges + (nr_sgdst /
> nb_workers * i);
> > + }
> > +
> > + if (cfg->is_dma) {
> > + if (!cfg->is_sg)
> > +
> rte_eal_remote_launch(do_dma_plain_mem_copy, (void *)(lcores[i]),
> > + lcore_id);
> > + else
> > +
> rte_eal_remote_launch(do_dma_sg_mem_copy, (void *)(lcores[i]),
> > + lcore_id);
> > + } else {
> > rte_eal_remote_launch(do_cpu_mem_copy, (void
> *)(lcores[i]), lcore_id);
> > + }
>
> too many judgement for selecting target function, how about wrap it
> subfunction:
> lcore_function_t get_work_function(struct test_configure *cfg)
> then rte_eal_remote_launch(get_work_function(cfg), (void *)(lcores[i]),
> lcore_id);
>
Ack.
> > }
> >
> > while (1) {
> > @@ -541,13 +702,53 @@ mem_copy_benchmark(struct test_configure
> *cfg, bool is_dma)
> >
> > rte_eal_mp_wait_lcore();
> >
> > - for (i = 0; i < (nr_buf / nb_workers) * nb_workers; i++) {
> > - if (memcmp(rte_pktmbuf_mtod(srcs[i], void *),
> > - rte_pktmbuf_mtod(dsts[i], void *),
> > - cfg->buf_size.cur) != 0) {
> > - printf("Copy validation fails for buffer number %d\n",
> i);
> > - ret = -1;
> > - goto out;
> > + if (!cfg->is_sg && cfg->transfer_dir == RTE_DMA_DIR_MEM_TO_MEM)
> {
> > + for (i = 0; i < (nr_buf / nb_workers) * nb_workers; i++) {
> > + if (memcmp(rte_pktmbuf_mtod(srcs[i], void *),
> > + rte_pktmbuf_mtod(dsts[i], void *),
> > + cfg->buf_size.cur) != 0) {
> > + printf("Copy validation fails for buffer number
> %d\n", i);
> > + ret = -1;
> > + goto out;
> > + }
> > + }
> > + } else if (cfg->is_sg && cfg->transfer_dir ==
> RTE_DMA_DIR_MEM_TO_MEM) {
> > + size_t src_remsz = buf_size % cfg->src_ptrs;
> > + size_t dst_remsz = buf_size % cfg->dst_ptrs;
> > + size_t src_sz = buf_size / cfg->src_ptrs;
> > + size_t dst_sz = buf_size / cfg->dst_ptrs;
> > + uint8_t src[buf_size], dst[buf_size];
> > + uint8_t *sbuf, *dbuf, *ptr;
> > +
> > + for (i = 0; i < (nr_buf / RTE_MAX(cfg->src_ptrs, cfg->dst_ptrs));
> i++) {
> > + sbuf = src;
> > + dbuf = dst;
> > + ptr = NULL;
> > +
> > + for (j = 0; j < cfg->src_ptrs; j++) {
> > + ptr = rte_pktmbuf_mtod(srcs[i * cfg->src_ptrs
> + j], uint8_t *);
> > + memcpy(sbuf, ptr, src_sz);
> > + sbuf += src_sz;
> > + }
> > +
> > + if (src_remsz)
> > + memcpy(sbuf, ptr + src_sz, src_remsz);
> > +
> > + for (j = 0; j < cfg->dst_ptrs; j++) {
> > + ptr = rte_pktmbuf_mtod(dsts[i * cfg->dst_ptrs
> + j], uint8_t *);
> > + memcpy(dbuf, ptr, dst_sz);
> > + dbuf += dst_sz;
> > + }
> > +
> > + if (dst_remsz)
> > + memcpy(dbuf, ptr + dst_sz, dst_remsz);
> > +
> > + if (memcmp(src, dst, buf_size) != 0) {
> > + printf("SG Copy validation fails for buffer
> number %d\n",
> > + i * cfg->src_ptrs);
> > + ret = -1;
> > + goto out;
> > + }
>
> Now I doubt the value of verify, this verify can't find the middle round copy
> failure,
> because as long as the last round copy is successful, the validation will pass.
>
Validation is on entire buffer. If any middle copy is a failure, entire memcmp
would have failed. Or do I miss something ?
> And adding validatation in every round copy will impact performance.
>
This validation is just after worker function is stopped measuring perf.
How would this impact performance ?
> Also app/test_dmadev already verify data. so I think we should drop the
> validation commit.
Even in some corner cases or unknown issues, copy would have failed
and taking perf cycles then is meaningless. That is the reason, this validation
is added after perf function doing its job.
>
> > }
> > }
> >
> > @@ -558,10 +759,8 @@ mem_copy_benchmark(struct test_configure *cfg,
> bool is_dma)
> > calc_result(buf_size, nr_buf, nb_workers, test_secs,
> > lcores[i]->worker_info.test_cpl,
> > &memory, &avg_cycles, &bandwidth, &mops);
> > - output_result(cfg->scenario_id, lcores[i]->lcore_id,
> > - lcores[i]->dma_name, cfg-
> >ring_size.cur, kick_batch,
> > - avg_cycles, buf_size, nr_buf /
> nb_workers, memory,
> > - bandwidth, mops, is_dma);
> > + output_result(cfg, lcores[i], kick_batch, avg_cycles, buf_size,
> > + nr_buf / nb_workers, memory, bandwidth, mops);
> > mops_total += mops;
> > bandwidth_total += bandwidth;
> > avg_cycles_total += avg_cycles;
> > @@ -604,13 +803,20 @@ mem_copy_benchmark(struct test_configure
> *cfg, bool is_dma)
> > rte_mempool_free(dst_pool);
> > dst_pool = NULL;
> >
> > + /* free sges for mbufs */
> > + rte_free(src_sges);
> > + src_sges = NULL;
> > +
> > + rte_free(dst_sges);
> > + dst_sges = NULL;
> > +
> > /* free the worker parameters */
> > for (i = 0; i < nb_workers; i++) {
> > rte_free(lcores[i]);
> > lcores[i] = NULL;
> > }
> >
> > - if (is_dma) {
> > + if (cfg->is_dma) {
> > for (i = 0; i < nb_workers; i++) {
> > printf("Stopping dmadev %d\n", ldm->dma_ids[i]);
> > rte_dma_stop(ldm->dma_ids[i]);
> > diff --git a/app/test-dma-perf/config.ini b/app/test-dma-perf/config.ini
> > index 9c8221025e..28f6c9d1db 100644
> > --- a/app/test-dma-perf/config.ini
> > +++ b/app/test-dma-perf/config.ini
> > @@ -38,6 +38,14 @@
> >
> > ; "skip" To skip a test-case set skip to 1.
>
> Please place hese patchset new add entrys' descriptions above the
> "; To specify a configuration file, use the "--config" flag followed by the path to
> the file."
>
> because original config.ini, fist is parameters descriptor, and then program
> argment descriptor, and last was example.
>
Ack.
> >
> > +; Parameters to be configured for SG copy:
>
> Parameters for DMA scatter-gather memory copy:
>
Ack.
> > +; ========================================
>
> Please remove this line
>
Ack.
> > +; "dma_src_sge" denotes number of source segments.
> > +; "dma_dst_sge" denotes number of destination segments.
> > +;
> > +; For SG copy, both the parameters need to be configured and they are valid
> only
> > +; when type is DMA_MEM_COPY.
>
> For DMA scatter-gather memory copy, the parameters need to be configured
> and they are valid only
> when type is DMA_MEM_COPY.
>
Ack.
> > +;
> > ; Parameters to be configured for data transfers from "mem to dev" and
> "dev to mem":
> > ;
> ===================================================================
> ===============
>
> Please remove this line
>
> As another commit "Re: [PATCH v2] app/dma-perf: support bi-directional
> transfer"'s review feedback,
> these descriptor should place after
> "
> ; To use DMA for a test, please specify the "lcore_dma" parameter.
> ; If you have already set the "-l" and "-a" parameters using EAL,
> ; make sure that the value of "lcore_dma" falls within their range of the values.
> ; We have to ensure a 1:1 mapping between the core and DMA device.
> "
>
>
> > ; "direction" denotes the direction of data transfer. It can take 3 values:
> > @@ -69,6 +77,21 @@ lcore_dma=lcore10@0000:00:04.2,
> lcore11@0000:00:04.3
> > eal_args=--in-memory --file-prefix=test
> >
> > [case2]
> > +type=DMA_MEM_COPY
> > +mem_size=10
> > +buf_size=64,8192,2,MUL
> > +dma_ring_size=1024
> > +dma_src_sge=4
> > +dma_dst_sge=1
> > +kick_batch=32
> > +src_numa_node=0
> > +dst_numa_node=0
> > +cache_flush=0
> > +test_seconds=2
> > +lcore_dma=lcore10@0000:00:04.2, lcore11@0000:00:04.3
> > +eal_args=--in-memory --file-prefix=test
> > +
> > +[case3]
> > skip=1
> > type=DMA_MEM_COPY
> > direction=dev2mem
> > @@ -84,7 +107,7 @@ test_seconds=2
> > lcore_dma=lcore10@0000:00:04.2, lcore11@0000:00:04.3
> > eal_args=--in-memory --file-prefix=test
> >
> > -[case3]
> > +[case4]
> > type=CPU_MEM_COPY
> > mem_size=10
> > buf_size=64,8192,2,MUL
> > diff --git a/app/test-dma-perf/main.c b/app/test-dma-perf/main.c
> > index df05bcd7df..a27e4c9429 100644
> > --- a/app/test-dma-perf/main.c
> > +++ b/app/test-dma-perf/main.c
> > @@ -108,10 +108,8 @@ run_test_case(struct test_configure *case_cfg)
> >
> > switch (case_cfg->test_type) {
> > case TEST_TYPE_DMA_MEM_COPY:
> > - ret = mem_copy_benchmark(case_cfg, true);
> > - break;
> > case TEST_TYPE_CPU_MEM_COPY:
> > - ret = mem_copy_benchmark(case_cfg, false);
> > + ret = mem_copy_benchmark(case_cfg);
> > break;
> > default:
> > printf("Unknown test type. %s\n", case_cfg->test_type_str);
> > @@ -365,7 +363,8 @@ load_configs(const char *path)
> > const char *case_type;
> > const char *transfer_dir;
> > const char *lcore_dma;
> > - const char *mem_size_str, *buf_size_str, *ring_size_str,
> *kick_batch_str;
> > + const char *mem_size_str, *buf_size_str, *ring_size_str,
> *kick_batch_str,
> > + *src_ptrs_str, *dst_ptrs_str;
> > const char *skip;
> > struct rte_kvargs *kvlist;
> > const char *vchan_dev;
> > @@ -467,6 +466,7 @@ load_configs(const char *path)
> > rte_kvargs_free(kvlist);
> > }
> >
> > + test_case->is_dma = is_dma;
> > test_case->src_numa_node =
> (int)atoi(rte_cfgfile_get_entry(cfgfile,
> >
> section_name, "src_numa_node"));
> > test_case->dst_numa_node =
> (int)atoi(rte_cfgfile_get_entry(cfgfile,
> > @@ -501,6 +501,32 @@ load_configs(const char *path)
> > } else if (args_nr == 4)
> > nb_vp++;
> >
> > + src_ptrs_str = rte_cfgfile_get_entry(cfgfile,
> section_name,
> > +
> "dma_src_sge");
> > + if (src_ptrs_str != NULL) {
> > + test_case->src_ptrs =
> (int)atoi(rte_cfgfile_get_entry(cfgfile,
> > +
> section_name, "dma_src_sge"));
> > + }
> > +
> > + dst_ptrs_str = rte_cfgfile_get_entry(cfgfile,
> section_name,
> > +
> "dma_dst_sge");
> > + if (dst_ptrs_str != NULL) {
> > + test_case->dst_ptrs =
> (int)atoi(rte_cfgfile_get_entry(cfgfile,
> > +
> section_name, "dma_dst_sge"));
> > + }
> > +
> > + if ((src_ptrs_str != NULL && dst_ptrs_str == NULL) ||
> > + (src_ptrs_str == NULL && dst_ptrs_str != NULL)) {
>
> Please also check test_case->src_ptrs and test_case->dst_ptrs valid, make sure
> there are >1 and <=UINT16_MAX
At present, this is uint8_t. Do we need it more than UINT8_MAX ?
>
> > + printf("parse dma_src_sge, dma_dst_sge error
> in case %d.\n",
> > + i + 1);
> > + test_case->is_valid = false;
> > + continue;
> > + } else if (src_ptrs_str != NULL && dst_ptrs_str != NULL)
> {
> > + test_case->is_sg = true;
> > + } else {
> > + test_case->is_sg = false;
>
> the above could simple by: test_case->is_sg = (src_ptrs_str != NULL &&
> dst_ptrs_str != NULL);
>
Added check for nb_ validation here. Please check in next version of patch.
> > + }
> > +
> > kick_batch_str = rte_cfgfile_get_entry(cfgfile,
> section_name, "kick_batch");
> > args_nr = parse_entry(kick_batch_str, &test_case-
> >kick_batch);
> > if (args_nr < 0) {
> > diff --git a/app/test-dma-perf/main.h b/app/test-dma-perf/main.h
> > index 1123e7524a..baf149b72b 100644
> > --- a/app/test-dma-perf/main.h
> > +++ b/app/test-dma-perf/main.h
> > @@ -53,11 +53,14 @@ struct test_configure {
> > uint16_t dst_numa_node;
> > uint16_t opcode;
> > bool is_dma;
> > + bool is_sg;
> > struct lcore_dma_map_t lcore_dma_map;
> > struct test_configure_entry mem_size;
> > struct test_configure_entry buf_size;
> > struct test_configure_entry ring_size;
> > struct test_configure_entry kick_batch;
> > + uint8_t src_ptrs;
> > + uint8_t dst_ptrs;
> > uint8_t cache_flush;
> > uint32_t nr_buf;
> > uint16_t test_secs;
> > @@ -66,6 +69,6 @@ struct test_configure {
> > struct test_vchan_dev_config vchan_dev;
> > };
> >
> > -int mem_copy_benchmark(struct test_configure *cfg, bool is_dma);
> > +int mem_copy_benchmark(struct test_configure *cfg);
> >
> > #endif /* MAIN_H */
> >
Thank you for your review. Please confirm if there are any other changes
and I hope next version goes through 😊
Regards,
Gowrishankar
next prev parent reply other threads:[~2024-02-29 13:16 UTC|newest]
Thread overview: 79+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-10 10:57 [PATCH v2] " Gowrishankar Muthukrishnan
2023-08-10 13:01 ` [PATCH v3 0/2] " Gowrishankar Muthukrishnan
2023-08-10 13:01 ` [PATCH v3 1/2] app/dma-perf: validate copied memory Gowrishankar Muthukrishnan
2023-08-23 11:46 ` [EXT] " Pavan Nikhilesh Bhagavatula
2023-08-10 13:01 ` [PATCH v3 2/2] app/dma-perf: add SG copy support Gowrishankar Muthukrishnan
2023-09-21 3:02 ` [PATCH v3 0/2] " Jiang, Cheng1
2023-09-24 9:32 ` [PATCH v4 " Gowrishankar Muthukrishnan
2023-09-24 9:32 ` [PATCH v4 1/2] app/dma-perf: validate copied memory Gowrishankar Muthukrishnan
2023-09-24 9:32 ` [PATCH v4 2/2] app/dma-perf: add SG copy support Gowrishankar Muthukrishnan
2023-09-28 21:12 ` Pavan Nikhilesh Bhagavatula
2023-10-26 18:31 ` [PATCH v5 0/4] app/dma-perf: PCI Dev and " Gowrishankar Muthukrishnan
2023-10-26 18:31 ` [PATCH v5 1/4] app/dma-perf: add skip support Gowrishankar Muthukrishnan
2023-11-10 9:03 ` Anoob Joseph
2023-10-26 18:31 ` [PATCH v5 2/4] app/dma-perf: add PCI device support Gowrishankar Muthukrishnan
2023-11-10 9:04 ` Anoob Joseph
2023-10-26 18:31 ` [PATCH v5 3/4] app/dma-perf: validate copied memory Gowrishankar Muthukrishnan
2023-11-10 9:05 ` Anoob Joseph
2023-10-26 18:31 ` [PATCH v5 4/4] app/dma-perf: add SG copy support Gowrishankar Muthukrishnan
2023-11-10 9:07 ` Anoob Joseph
2023-11-13 4:41 ` [PATCH v6 0/4] PCI Dev and " Gowrishankar Muthukrishnan
2023-11-13 4:41 ` [PATCH v6 1/4] app/dma-perf: add skip support Gowrishankar Muthukrishnan
2023-11-13 4:41 ` [PATCH v6 2/4] app/dma-perf: add PCI device support Gowrishankar Muthukrishnan
2023-11-13 4:41 ` [PATCH v6 3/4] app/dma-perf: validate copied memory Gowrishankar Muthukrishnan
2023-11-13 4:41 ` [PATCH v6 4/4] app/dma-perf: add SG copy support Gowrishankar Muthukrishnan
2023-11-17 12:15 ` [PATCH v7 0/4] PCI Dev and " Gowrishankar Muthukrishnan
2023-11-17 12:15 ` [PATCH v7 1/4] app/dma-perf: add skip support Gowrishankar Muthukrishnan
2023-11-20 2:54 ` fengchengwen
2023-11-22 12:01 ` [EXT] " Amit Prakash Shukla
2023-11-17 12:15 ` [PATCH v7 2/4] app/dma-perf: add PCI device support Gowrishankar Muthukrishnan
2023-11-17 12:15 ` [PATCH v7 3/4] app/dma-perf: validate copied memory Gowrishankar Muthukrishnan
2023-11-17 12:15 ` [PATCH v7 4/4] app/dma-perf: add SG copy support Gowrishankar Muthukrishnan
2023-11-22 11:06 ` [PATCH v8 0/4] PCI Dev and " Gowrishankar Muthukrishnan
2023-11-22 11:06 ` [PATCH v8 1/4] app/dma-perf: add skip support Gowrishankar Muthukrishnan
2023-11-22 11:06 ` [PATCH v8 2/4] app/dma-perf: add PCI device support Gowrishankar Muthukrishnan
2023-11-23 1:12 ` fengchengwen
2024-02-21 3:26 ` fengchengwen
2024-02-27 9:27 ` [EXT] " Amit Prakash Shukla
2023-11-22 11:06 ` [PATCH v8 3/4] app/dma-perf: validate copied memory Gowrishankar Muthukrishnan
2023-11-23 1:14 ` fengchengwen
2023-11-22 11:06 ` [PATCH v8 4/4] app/dma-perf: add SG copy support Gowrishankar Muthukrishnan
2024-01-25 12:44 ` fengchengwen
2024-02-21 3:52 ` fengchengwen
2024-02-27 16:09 ` [EXT] " Gowrishankar Muthukrishnan
2023-12-07 10:11 ` [PATCH v8 0/4] PCI Dev and " Gowrishankar Muthukrishnan
2024-02-05 10:37 ` Gowrishankar Muthukrishnan
2024-02-27 16:00 ` [PATCH v9 " Amit Prakash Shukla
2024-02-27 16:00 ` [PATCH v9 1/4] app/dma-perf: add skip support Amit Prakash Shukla
2024-02-27 16:00 ` [PATCH v9 2/4] app/dma-perf: add PCI device support Amit Prakash Shukla
2024-02-27 16:00 ` [PATCH v9 3/4] app/dma-perf: validate copied memory Amit Prakash Shukla
2024-02-27 16:00 ` [PATCH v9 4/4] app/dma-perf: add SG copy support Amit Prakash Shukla
2024-02-27 18:35 ` [PATCH v10 0/4] PCI Dev and " Amit Prakash Shukla
2024-02-27 18:35 ` [PATCH v10 1/4] app/dma-perf: add skip support Amit Prakash Shukla
2024-02-27 18:35 ` [PATCH v10 2/4] app/dma-perf: add PCI device support Amit Prakash Shukla
2024-02-27 18:35 ` [PATCH v10 3/4] app/dma-perf: validate copied memory Amit Prakash Shukla
2024-02-28 8:10 ` fengchengwen
2024-02-28 9:09 ` [EXT] " Gowrishankar Muthukrishnan
2024-02-29 13:48 ` [v11 0/4] PCI Dev and SG copy support Gowrishankar Muthukrishnan
2024-02-29 13:48 ` [v11 1/4] app/dma-perf: add skip support Gowrishankar Muthukrishnan
2024-02-29 13:48 ` [v11 2/4] app/dma-perf: add PCI device support Gowrishankar Muthukrishnan
2024-02-29 13:48 ` [v11 3/4] app/dma-perf: validate copied memory Gowrishankar Muthukrishnan
2024-02-29 13:48 ` [v11 4/4] app/dma-perf: add SG copy support Gowrishankar Muthukrishnan
2024-03-06 19:50 ` [v11 0/4] PCI Dev and " Thomas Monjalon
2024-03-07 13:48 ` fengchengwen
2024-03-07 13:55 ` [EXTERNAL] " Gowrishankar Muthukrishnan
2024-03-12 9:15 ` Thomas Monjalon
2024-03-12 12:05 ` fengchengwen
2024-03-12 12:24 ` Gowrishankar Muthukrishnan
2024-03-13 7:26 ` fengchengwen
2024-03-13 8:22 ` Gowrishankar Muthukrishnan
2024-03-15 7:30 ` Gowrishankar Muthukrishnan
2024-03-15 13:09 ` Thomas Monjalon
2024-03-18 7:32 ` Gowrishankar Muthukrishnan
2024-03-07 13:48 ` Gowrishankar Muthukrishnan
2024-02-27 18:56 ` [PATCH v10 4/4] app/dma-perf: add " Amit Prakash Shukla
2024-02-28 9:31 ` fengchengwen
2024-02-29 13:16 ` Gowrishankar Muthukrishnan [this message]
2024-03-01 2:07 ` [EXT] " fengchengwen
2024-03-01 8:06 ` [EXTERNAL] " Gowrishankar Muthukrishnan
2024-03-01 9:45 ` fengchengwen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CO1PR18MB471463517CFD024EA9CD87F1CB5F2@CO1PR18MB4714.namprd18.prod.outlook.com \
--to=gmuthukrishn@marvell.com \
--cc=amitprakashs@marvell.com \
--cc=anoobj@marvell.com \
--cc=bruce.richardson@intel.com \
--cc=dev@dpdk.org \
--cc=fengchengwen@huawei.com \
--cc=honest.jiang@foxmail.com \
--cc=jerinj@marvell.com \
--cc=kevin.laatz@intel.com \
--cc=pbhagavatula@marvell.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).