From: <eagostini@nvidia.com>
To: <dev@dpdk.org>
Cc: Elena Agostini <eagostini@nvidia.com>
Subject: [dpdk-dev] [PATCH v5 9/9] doc: add CUDA example in GPU guide
Date: Mon, 8 Nov 2021 18:58:05 +0000
Message-ID: <20211108185805.3887-10-eagostini@nvidia.com>
In-Reply-To: <20211108185805.3887-1-eagostini@nvidia.com>
From: Elena Agostini <eagostini@nvidia.com>
Signed-off-by: Elena Agostini <eagostini@nvidia.com>
---
doc/guides/prog_guide/gpudev.rst | 122 +++++++++++++++++++++++++++++++
1 file changed, 122 insertions(+)
diff --git a/doc/guides/prog_guide/gpudev.rst b/doc/guides/prog_guide/gpudev.rst
index cbaec5a1e4..1baf0c6772 100644
--- a/doc/guides/prog_guide/gpudev.rst
+++ b/doc/guides/prog_guide/gpudev.rst
@@ -102,3 +102,125 @@ the list of mbuf payload addresses where received packet have been stored.
The ``rte_gpu_comm_*()`` functions are responsible to create a list of packets
that can be populated with receive mbuf payload addresses
and communicated to the task running on the GPU.
+
+
+CUDA Example
+------------
+
+The pseudo-code below shows how to use the functions of this library
+in a CUDA application.
+
+.. code-block:: c
+
+ //////////////////////////////////////////////////////////////////////////
+ ///// gpudev library + CUDA functions
+ //////////////////////////////////////////////////////////////////////////
+ #define GPU_PAGE_SHIFT 16
+ #define GPU_PAGE_SIZE (1UL << GPU_PAGE_SHIFT)
+
+ int main() {
+ struct rte_gpu_comm_flag quit_flag;
+ struct rte_gpu_comm_list *comm_list;
+ int nb_rx = 0;
+ int comm_list_entry = 0;
+ struct rte_mbuf * rx_mbufs[max_rx_mbufs];
+ cudaStream_t cstream;
+ struct rte_mempool *mpool_payload, *mpool_header;
+ struct rte_pktmbuf_extmem ext_mem;
+ int16_t dev_id;
+ int16_t port_id = 0;
+
+ /** Initialize CUDA objects (cstream, context, etc.). */
+ /** Use the gpudev library to register a new CUDA context, if any. */
+ /** Let's assume the application wants to use the default context of GPU device 0. */
+
+ dev_id = 0;
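+
+ /**
+ * Illustrative check, not part of the original example: before using device 0,
+ * the application could verify that a GPU is available and query its properties.
+ */
+ if (rte_gpu_count_avail() == 0)
+ rte_exit(EXIT_FAILURE, "No GPU device available\n");
+
+ struct rte_gpu_info dev_info;
+ rte_gpu_info_get(dev_id, &dev_info);
+ printf("Using GPU %s with %zu bytes of memory\n", dev_info.name, dev_info.total_memory);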
+
+ /**
+ * Create an external memory mempool using memory allocated on the GPU.
+ */
+ ext_mem.elt_size = mbufs_headroom_size;
+ ext_mem.buf_len = RTE_ALIGN_CEIL(mbufs_num * ext_mem.elt_size, GPU_PAGE_SIZE);
+ ext_mem.buf_iova = RTE_BAD_IOVA;
+ ext_mem.buf_ptr = rte_gpu_malloc(dev_id, ext_mem.buf_len, 0);
+ rte_extmem_register(ext_mem.buf_ptr, ext_mem.buf_len, NULL, ext_mem.buf_iova, GPU_PAGE_SIZE);
+ rte_dev_dma_map(rte_eth_devices[port_id].device, ext_mem.buf_ptr, ext_mem.buf_iova, ext_mem.buf_len);
+ mpool_payload = rte_pktmbuf_pool_create_extbuf("gpu_mempool", mbufs_num,
+ 0, 0, ext_mem.elt_size,
+ rte_socket_id(), &ext_mem, 1);
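+ /** Illustrative error check, not part of the original example. */
+ if (mpool_payload == NULL)
+ rte_exit(EXIT_FAILURE, "Failed to create mempool from GPU external memory\n");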
+
+ /**
+ * Create the CPU - device communication flag. With this flag, the CPU can tell the CUDA kernel
+ * to exit from its main loop.
+ */
+ rte_gpu_comm_create_flag(dev_id, &quit_flag, RTE_GPU_COMM_FLAG_CPU);
+ rte_gpu_comm_set_flag(&quit_flag, 0);
+
+ /**
+ * Create CPU - device communication list. Each entry of this list will be populated by the CPU
+ * with a new set of received mbufs that the CUDA kernel has to process.
+ */
+ comm_list = rte_gpu_comm_create_list(dev_id, num_entries);
+
+ /** A very simple CUDA kernel with just 1 CUDA block and RTE_GPU_COMM_LIST_PKTS_MAX CUDA threads. */
+ cuda_kernel_packet_processing<<<1, RTE_GPU_COMM_LIST_PKTS_MAX, 0, cstream>>>(quit_flag.ptr, comm_list, num_entries, ...);
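+
+ /**
+ * Illustrative check, not part of the original example: the kernel launch is
+ * asynchronous, so only launch-time errors can be detected here.
+ */
+ if (cudaGetLastError() != cudaSuccess)
+ rte_exit(EXIT_FAILURE, "CUDA kernel launch failed\n");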
+
+ /**
+ * For simplicity, the CPU here receives only 2 bursts of mbufs.
+ * In a real application, network activity and device processing should overlap;
+ * a sketch of such a receive loop follows this example.
+ */
+ nb_rx = rte_eth_rx_burst(port_id, queue_id, &(rx_mbufs[0]), max_rx_mbufs);
+ rte_gpu_comm_populate_list_pkts(&(comm_list[0]), rx_mbufs, nb_rx);
+ nb_rx = rte_eth_rx_burst(port_id, queue_id, &(rx_mbufs[0]), max_rx_mbufs);
+ rte_gpu_comm_populate_list_pkts(&(comm_list[1]), rx_mbufs, nb_rx);
+
+ /**
+ * The CPU waits for the CUDA kernel to complete the processing of both bursts
+ * and then cleans up the received mbufs.
+ */
+ while(rte_gpu_comm_cleanup_list(&(comm_list[0])));
+ while(rte_gpu_comm_cleanup_list(&(comm_list[1])));
+
+ /** CPU notifies the CUDA kernel that it has to terminate */
+ rte_gpu_comm_set_flag(&quit_flag, 1);
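+
+ /**
+ * Illustrative addition, not part of the original example: wait for the CUDA kernel
+ * to observe the quit flag and terminate before releasing any resource it may still use.
+ */
+ cudaStreamSynchronize(cstream);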
+
+ /** gpudev objects cleanup/destruction */
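+ /**
+ * Illustrative cleanup calls, not part of the original example, assuming the
+ * destroy counterparts of the communication helpers used above.
+ */
+ rte_gpu_comm_destroy_list(comm_list, num_entries);
+ rte_gpu_comm_destroy_flag(&quit_flag);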
+ /** CUDA cleanup */
+
+ rte_gpu_free(dev_id, ext_mem.buf_ptr);
+
+ /** DPDK cleanup */
+
+ return 0;
+ }
+
+ //////////////////////////////////////////////////////////////////////////
+ ///// CUDA kernel
+ //////////////////////////////////////////////////////////////////////////
+
+ __global__ void cuda_kernel_packet_processing(uint32_t *quit_flag_ptr, struct rte_gpu_comm_list *comm_list, int comm_list_entries) {
+ int comm_list_index = 0;
+ struct rte_gpu_comm_pkt *pkt_list = NULL;
+
+ /** Do some pre-processing operations. */
+
+ /** GPU kernel keeps checking this flag to know if it has to quit or wait for more packets. */
+ while(*quit_flag_ptr == 0)
+ {
+ if(comm_list[comm_list_index].status != RTE_GPU_COMM_LIST_READY)
+ continue;
+
+ pkt_list = comm_list[comm_list_index].pkt_list;
+ if(threadIdx.x < comm_list[comm_list_index].num_pkts)
+ {
+ /** Each CUDA thread processes a different packet of the current entry. */
+ packet_processing(pkt_list[threadIdx.x].addr, pkt_list[threadIdx.x].size, ..);
+ }
+ __threadfence();
+ __syncthreads();
+
+ /** Mark the entry as processed so that the CPU can clean it up and reuse it. */
+ if(threadIdx.x == 0)
+ comm_list[comm_list_index].status = RTE_GPU_COMM_LIST_DONE;
+ __threadfence();
+
+ /** Wait for new packets on the next communication list entry. */
+ comm_list_index = (comm_list_index+1) % comm_list_entries;
+ }
+
+ /** Do some post-processing operations. */
+ }
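+
+In a real application, instead of receiving only two bursts, the CPU keeps rotating
+over the communication list entries while the CUDA kernel is running.
+Below is a minimal sketch of such a receive loop, assuming the same variables
+as in the example above and a hypothetical ``keep_running`` condition
+controlling the loop.
+
+.. code-block:: c
+
+ while (keep_running) {
+ nb_rx = rte_eth_rx_burst(port_id, queue_id, &(rx_mbufs[0]), max_rx_mbufs);
+ if (nb_rx == 0)
+ continue;
+
+ /** Reuse an entry only after the CUDA kernel has marked it as processed. */
+ while (rte_gpu_comm_cleanup_list(&(comm_list[comm_list_entry])));
+
+ rte_gpu_comm_populate_list_pkts(&(comm_list[comm_list_entry]), rx_mbufs, nb_rx);
+ comm_list_entry = (comm_list_entry + 1) % num_entries;
+ }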
--
2.17.1