To improve throughput and latency, this patch allows Rx polling timer
delay to 0us.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
---
 doc/guides/vdpadevs/mlx5.rst  | 3 +--
 drivers/vdpa/mlx5/mlx5_vdpa.c | 9 +++------
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/doc/guides/vdpadevs/mlx5.rst b/doc/guides/vdpadevs/mlx5.rst
index 3a6d88362d..903fdb0e60 100644
--- a/doc/guides/vdpadevs/mlx5.rst
+++ b/doc/guides/vdpadevs/mlx5.rst
@@ -125,8 +125,7 @@ Driver options
   - 0, A nonzero value to set timer step in micro-seconds. The timer thread
     dynamic delay change steps according to this value. Default value is 1us.
 
-  - 1, A nonzero value to set fixed timer delay in micro-seconds. Default value
-    is 100us.
+  - 1, A value to set fixed timer delay in micro-seconds. Default value is 0us.
 
 - ``no_traffic_time`` parameter [int]
 
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c
index b64f364eb7..5020a99fae 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
@@ -651,12 +651,9 @@ mlx5_vdpa_config_get(struct rte_devargs *devargs, struct mlx5_vdpa_priv *priv)
 		return;
 	rte_kvargs_process(kvlist, NULL, mlx5_vdpa_args_check_handler, priv);
 	rte_kvargs_free(kvlist);
-	if (!priv->event_us) {
-		if (priv->event_mode == MLX5_VDPA_EVENT_MODE_DYNAMIC_TIMER)
-			priv->event_us = MLX5_VDPA_DEFAULT_TIMER_STEP_US;
-		else if (priv->event_mode == MLX5_VDPA_EVENT_MODE_FIXED_TIMER)
-			priv->event_us = MLX5_VDPA_DEFAULT_TIMER_DELAY_US;
-	}
+	if (!priv->event_us &&
+	    priv->event_mode == MLX5_VDPA_EVENT_MODE_DYNAMIC_TIMER)
+		priv->event_us = MLX5_VDPA_DEFAULT_TIMER_STEP_US;
 	DRV_LOG(DEBUG, "event mode is %d.", priv->event_mode);
 	DRV_LOG(DEBUG, "event_us is %u us.", priv->event_us);
 	DRV_LOG(DEBUG, "no traffic time is %u s.", priv->no_traffic_time_s);
-- 
2.25.1
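For readers following the first patch, the fallback that remains after this change can be sketched as a small pure function. This is only an illustration, assuming the mode numbering given in the driver guide (0 dynamic timer, 1 fixed timer, 2 interrupts). The names resolve_event_us, sketch_event_mode and SKETCH_DEFAULT_TIMER_STEP_US are hypothetical and are not driver symbols.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the driver's event modes; the numbering
 * follows the event_mode values described in mlx5.rst.
 */
enum sketch_event_mode {
	SKETCH_MODE_DYNAMIC_TIMER = 0,
	SKETCH_MODE_FIXED_TIMER = 1,
	SKETCH_MODE_ONLY_INTERRUPT = 2,
};

/* Mirrors the 1us default timer step kept by the patch. */
#define SKETCH_DEFAULT_TIMER_STEP_US 1u

/*
 * Return the effective event_us for a mode. Only the dynamic timer
 * mode replaces 0 with the default step; in fixed timer mode 0 is
 * kept as a valid value meaning "poll without delay".
 */
static uint32_t
resolve_event_us(enum sketch_event_mode mode, uint32_t event_us)
{
	assert(mode <= SKETCH_MODE_ONLY_INTERRUPT);
	if (event_us == 0 && mode == SKETCH_MODE_DYNAMIC_TIMER)
		return SKETCH_DEFAULT_TIMER_STEP_US;
	return event_us;
}
```

With this shape, resolve_event_us(SKETCH_MODE_FIXED_TIMER, 0) stays 0, which is the behavior the patch enables, while the dynamic timer mode still gets a nonzero step.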
To improve performance and latency, this patch set Rx polling mode
default delay time to zero.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
---
 drivers/vdpa/mlx5/mlx5_vdpa.h       | 2 +-
 drivers/vdpa/mlx5/mlx5_vdpa_event.c | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h
index d039ada65b..08e04a86c4 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa.h
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.h
@@ -36,7 +36,7 @@
 #define VIRTIO_F_RING_PACKED 34
 #endif
 
-#define MLX5_VDPA_DEFAULT_TIMER_DELAY_US 100u
+#define MLX5_VDPA_DEFAULT_TIMER_DELAY_US 0u
 #define MLX5_VDPA_DEFAULT_TIMER_STEP_US 1u
 
 struct mlx5_vdpa_cq {
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_event.c b/drivers/vdpa/mlx5/mlx5_vdpa_event.c
index 3aeaeb893f..5366937e03 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa_event.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa_event.c
@@ -265,7 +265,8 @@ mlx5_vdpa_timer_sleep(struct mlx5_vdpa_priv *priv, uint32_t max)
 			break;
 		}
 	}
-	usleep(priv->timer_delay_us);
+	if (priv->timer_delay_us)
+		usleep(priv->timer_delay_us);
 }
 
 static void *
-- 
2.25.1
This patch adds new device argument to specify cpu core affinity to
event polling thread for better latency and throughput. The thread
could be also located by name "vDPA-mlx5-<id>".

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
---
 doc/guides/vdpadevs/mlx5.rst        |  5 +++++
 drivers/vdpa/mlx5/mlx5_vdpa.c       |  7 +++++++
 drivers/vdpa/mlx5/mlx5_vdpa.h       |  1 +
 drivers/vdpa/mlx5/mlx5_vdpa_event.c | 23 ++++++++++++++++++++++-
 4 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/doc/guides/vdpadevs/mlx5.rst b/doc/guides/vdpadevs/mlx5.rst
index 903fdb0e60..20254257c9 100644
--- a/doc/guides/vdpadevs/mlx5.rst
+++ b/doc/guides/vdpadevs/mlx5.rst
@@ -134,6 +134,11 @@ Driver options
   interrupts are configured to the device in order to notify traffic for the
   driver. Default value is 2s.
 
+- ``event_core`` parameter [int]
+
+  CPU core number to set polling thread affinity to, default to control plane
+  cpu.
+
 Error handling
 ^^^^^^^^^^^^^^
 
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c
index 5020a99fae..1f92c529c9 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
@@ -612,6 +612,7 @@ mlx5_vdpa_args_check_handler(const char *key, const char *val, void *opaque)
 {
 	struct mlx5_vdpa_priv *priv = opaque;
 	unsigned long tmp;
+	int n_cores = sysconf(_SC_NPROCESSORS_ONLN);
 
 	if (strcmp(key, "class") == 0)
 		return 0;
@@ -630,6 +631,11 @@ mlx5_vdpa_args_check_handler(const char *key, const char *val, void *opaque)
 		priv->event_us = (uint32_t)tmp;
 	} else if (strcmp(key, "no_traffic_time") == 0) {
 		priv->no_traffic_time_s = (uint32_t)tmp;
+	} else if (strcmp(key, "event_core") == 0) {
+		if (tmp >= (unsigned long)n_cores)
+			DRV_LOG(WARNING, "Invalid event_core %s.", val);
+		else
+			priv->event_core = tmp;
 	} else {
 		DRV_LOG(WARNING, "Invalid key %s.", key);
 	}
@@ -643,6 +649,7 @@ mlx5_vdpa_config_get(struct rte_devargs *devargs, struct mlx5_vdpa_priv *priv)
 
 	priv->event_mode = MLX5_VDPA_EVENT_MODE_DYNAMIC_TIMER;
 	priv->event_us = 0;
+	priv->event_core = -1;
 	priv->no_traffic_time_s = MLX5_VDPA_DEFAULT_NO_TRAFFIC_TIME_S;
 	if (devargs == NULL)
 		return;
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h
index 08e04a86c4..b4dd3834aa 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa.h
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.h
@@ -131,6 +131,7 @@ struct mlx5_vdpa_priv {
 	pthread_cond_t timer_cond;
 	volatile uint8_t timer_on;
 	int event_mode;
+	int event_core; /* Event thread cpu affinity core. */
 	uint32_t event_us;
 	uint32_t timer_delay_us;
 	uint32_t no_traffic_time_s;
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_event.c b/drivers/vdpa/mlx5/mlx5_vdpa_event.c
index 5366937e03..f731c80004 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa_event.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa_event.c
@@ -532,6 +532,9 @@ int
 mlx5_vdpa_cqe_event_setup(struct mlx5_vdpa_priv *priv)
 {
 	int ret;
+	rte_cpuset_t cpuset;
+	pthread_attr_t attr;
+	char name[16];
 
 	if (!priv->eventc)
 		/* All virtqs are in poll mode. */
@@ -540,12 +543,30 @@ mlx5_vdpa_cqe_event_setup(struct mlx5_vdpa_priv *priv)
 		pthread_mutex_init(&priv->timer_lock, NULL);
 		pthread_cond_init(&priv->timer_cond, NULL);
 		priv->timer_on = 0;
-		ret = pthread_create(&priv->timer_tid, NULL,
+		pthread_attr_init(&attr);
+		CPU_ZERO(&cpuset);
+		if (priv->event_core != -1)
+			CPU_SET(priv->event_core, &cpuset);
+		else
+			cpuset = rte_lcore_cpuset(rte_get_main_lcore());
+		ret = pthread_attr_setaffinity_np(&attr, sizeof(cpuset),
+						  &cpuset);
+		if (ret) {
+			DRV_LOG(ERR, "Failed to set thread affinity.");
+			return -1;
+		}
+		ret = pthread_create(&priv->timer_tid, &attr,
 				     mlx5_vdpa_poll_handle, (void *)priv);
 		if (ret) {
 			DRV_LOG(ERR, "Failed to create timer thread.");
 			return -1;
 		}
+		snprintf(name, sizeof(name), "vDPA-mlx5-%d", priv->vid);
+		ret = pthread_setname_np(priv->timer_tid, name);
+		if (ret) {
+			DRV_LOG(ERR, "Failed to set timer thread name.");
+			return -1;
+		}
 	}
 	priv->intr_handle.fd = priv->eventc->fd;
 	priv->intr_handle.type = RTE_INTR_HANDLE_EXT;
-- 
2.25.1
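The range check that the third patch applies to the event_core argument can be sketched as a standalone function. This is a simplified illustration under stated assumptions: pick_event_core is a hypothetical name, and the CPU count is passed in as a parameter so the logic can be exercised without a particular machine, whereas the patch reads it from sysconf(_SC_NPROCESSORS_ONLN).

```c
#include <assert.h>

/*
 * Hypothetical helper: accept a requested core only if it is below the
 * number of online CPUs; otherwise keep the current setting, which in
 * the patch starts as -1 ("no core requested").
 */
static int
pick_event_core(unsigned long requested, long n_cores, int current)
{
	assert(n_cores > 0);
	if (requested >= (unsigned long)n_cores)
		return current; /* Out of range: the request is ignored. */
	return (int)requested;
}
```

An out-of-range value therefore leaves the -1 default in place, and in the patch that default selects the cpuset of the main lcore for the polling thread.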
For better performance and latency, this patch sets default event
handling mode to polling mode which uses dedicate thread per device to
poll and process event.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
---
 doc/guides/vdpadevs/mlx5.rst  | 2 +-
 drivers/vdpa/mlx5/mlx5_vdpa.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/guides/vdpadevs/mlx5.rst b/doc/guides/vdpadevs/mlx5.rst
index 20254257c9..730e171ba3 100644
--- a/doc/guides/vdpadevs/mlx5.rst
+++ b/doc/guides/vdpadevs/mlx5.rst
@@ -116,7 +116,7 @@ Driver options
   - 2, Completion queue scheduling will be managed by interrupts. Each CQ burst
     arms the CQ in order to get an interrupt event in the next traffic burst.
 
-  - Default mode is 0.
+  - Default mode is 1.
 
 - ``event_us`` parameter [int]
 
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c
index 1f92c529c9..5d954d48ce 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
@@ -647,7 +647,7 @@ mlx5_vdpa_config_get(struct rte_devargs *devargs, struct mlx5_vdpa_priv *priv)
 {
 	struct rte_kvargs *kvlist;
 
-	priv->event_mode = MLX5_VDPA_EVENT_MODE_DYNAMIC_TIMER;
+	priv->event_mode = MLX5_VDPA_EVENT_MODE_FIXED_TIMER;
 	priv->event_us = 0;
 	priv->event_core = -1;
 	priv->no_traffic_time_s = MLX5_VDPA_DEFAULT_NO_TRAFFIC_TIME_S;
-- 
2.25.1
On 12/3/20 12:36 AM, Xueming Li wrote:
> To improve throughput and latency, this patch allows Rx polling timer
> delay to 0us.
>
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> Acked-by: Matan Azrad <matan@nvidia.com>
> ---
> doc/guides/vdpadevs/mlx5.rst | 3 +--
> drivers/vdpa/mlx5/mlx5_vdpa.c | 9 +++------
> 2 files changed, 4 insertions(+), 8 deletions(-)
>
> diff --git a/doc/guides/vdpadevs/mlx5.rst b/doc/guides/vdpadevs/mlx5.rst
> index 3a6d88362d..903fdb0e60 100644
> --- a/doc/guides/vdpadevs/mlx5.rst
> +++ b/doc/guides/vdpadevs/mlx5.rst
> @@ -125,8 +125,7 @@ Driver options
>    - 0, A nonzero value to set timer step in micro-seconds. The timer thread
>      dynamic delay change steps according to this value. Default value is 1us.
>
> -  - 1, A nonzero value to set fixed timer delay in micro-seconds. Default value
> -    is 100us.
> +  - 1, A value to set fixed timer delay in micro-seconds. Default value is 0us.
>
>  - ``no_traffic_time`` parameter [int]
>
> diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c
> index b64f364eb7..5020a99fae 100644
> --- a/drivers/vdpa/mlx5/mlx5_vdpa.c
> +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
> @@ -651,12 +651,9 @@ mlx5_vdpa_config_get(struct rte_devargs *devargs, struct mlx5_vdpa_priv *priv)
>  		return;
>  	rte_kvargs_process(kvlist, NULL, mlx5_vdpa_args_check_handler, priv);
>  	rte_kvargs_free(kvlist);
> -	if (!priv->event_us) {
> -		if (priv->event_mode == MLX5_VDPA_EVENT_MODE_DYNAMIC_TIMER)
> -			priv->event_us = MLX5_VDPA_DEFAULT_TIMER_STEP_US;
> -		else if (priv->event_mode == MLX5_VDPA_EVENT_MODE_FIXED_TIMER)
> -			priv->event_us = MLX5_VDPA_DEFAULT_TIMER_DELAY_US;
> -	}
> +	if (!priv->event_us &&
> +	    priv->event_mode == MLX5_VDPA_EVENT_MODE_DYNAMIC_TIMER)
> +		priv->event_us = MLX5_VDPA_DEFAULT_TIMER_STEP_US;
>  	DRV_LOG(DEBUG, "event mode is %d.", priv->event_mode);
>  	DRV_LOG(DEBUG, "event_us is %u us.", priv->event_us);
>  	DRV_LOG(DEBUG, "no traffic time is %u s.", priv->no_traffic_time_s);
>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Thanks,
Maxime
On 12/3/20 12:36 AM, Xueming Li wrote:
> To improve performance and latency, this patch set Rx polling mode
s/set/sets/
I'll fix while applying.
> default delay time to zero.
>
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> Acked-by: Matan Azrad <matan@nvidia.com>
> ---
>  drivers/vdpa/mlx5/mlx5_vdpa.h       | 2 +-
>  drivers/vdpa/mlx5/mlx5_vdpa_event.c | 3 ++-
>  2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h
> index d039ada65b..08e04a86c4 100644
> --- a/drivers/vdpa/mlx5/mlx5_vdpa.h
> +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h
> @@ -36,7 +36,7 @@
>  #define VIRTIO_F_RING_PACKED 34
>  #endif
>
> -#define MLX5_VDPA_DEFAULT_TIMER_DELAY_US 100u
> +#define MLX5_VDPA_DEFAULT_TIMER_DELAY_US 0u
>  #define MLX5_VDPA_DEFAULT_TIMER_STEP_US 1u
>
>  struct mlx5_vdpa_cq {
> diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_event.c b/drivers/vdpa/mlx5/mlx5_vdpa_event.c
> index 3aeaeb893f..5366937e03 100644
> --- a/drivers/vdpa/mlx5/mlx5_vdpa_event.c
> +++ b/drivers/vdpa/mlx5/mlx5_vdpa_event.c
> @@ -265,7 +265,8 @@ mlx5_vdpa_timer_sleep(struct mlx5_vdpa_priv *priv, uint32_t max)
>  			break;
>  		}
>  	}
> -	usleep(priv->timer_delay_us);
> +	if (priv->timer_delay_us)
> +		usleep(priv->timer_delay_us);
>  }
>
>  static void *
>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Thanks,
Maxime
On 12/3/20 12:36 AM, Xueming Li wrote:
> This patch adds new device argument to specify cpu core affinity to
> event polling thread for better latency and throughput. The thread
> could be also located by name "vDPA-mlx5-<id>".
>
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> Acked-by: Matan Azrad <matan@nvidia.com>
> ---
> doc/guides/vdpadevs/mlx5.rst | 5 +++++
> drivers/vdpa/mlx5/mlx5_vdpa.c | 7 +++++++
> drivers/vdpa/mlx5/mlx5_vdpa.h | 1 +
> drivers/vdpa/mlx5/mlx5_vdpa_event.c | 23 ++++++++++++++++++++++-
> 4 files changed, 35 insertions(+), 1 deletion(-)
>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Thanks,
Maxime
On 12/3/20 12:36 AM, Xueming Li wrote:
> For better performance and latency, this patch sets default event
> handling mode to polling mode which uses dedicate thread per device to
> poll and process event.
>
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> Acked-by: Matan Azrad <matan@nvidia.com>
> ---
> doc/guides/vdpadevs/mlx5.rst | 2 +-
> drivers/vdpa/mlx5/mlx5_vdpa.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/doc/guides/vdpadevs/mlx5.rst b/doc/guides/vdpadevs/mlx5.rst
> index 20254257c9..730e171ba3 100644
> --- a/doc/guides/vdpadevs/mlx5.rst
> +++ b/doc/guides/vdpadevs/mlx5.rst
> @@ -116,7 +116,7 @@ Driver options
>    - 2, Completion queue scheduling will be managed by interrupts. Each CQ burst
>      arms the CQ in order to get an interrupt event in the next traffic burst.
>
> -  - Default mode is 0.
> +  - Default mode is 1.
>
>  - ``event_us`` parameter [int]
>
> diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c
> index 1f92c529c9..5d954d48ce 100644
> --- a/drivers/vdpa/mlx5/mlx5_vdpa.c
> +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
> @@ -647,7 +647,7 @@ mlx5_vdpa_config_get(struct rte_devargs *devargs, struct mlx5_vdpa_priv *priv)
>  {
>  	struct rte_kvargs *kvlist;
>
> -	priv->event_mode = MLX5_VDPA_EVENT_MODE_DYNAMIC_TIMER;
> +	priv->event_mode = MLX5_VDPA_EVENT_MODE_FIXED_TIMER;
>  	priv->event_us = 0;
>  	priv->event_core = -1;
>  	priv->no_traffic_time_s = MLX5_VDPA_DEFAULT_NO_TRAFFIC_TIME_S;
>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Thanks,
Maxime
On 12/3/20 12:36 AM, Xueming Li wrote:
> To improve throughput and latency, this patch allows Rx polling timer
> delay to 0us.
>
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> Acked-by: Matan Azrad <matan@nvidia.com>
> ---
> doc/guides/vdpadevs/mlx5.rst | 3 +--
> drivers/vdpa/mlx5/mlx5_vdpa.c | 9 +++------
> 2 files changed, 4 insertions(+), 8 deletions(-)
>
Series applied to dpdk-next-virtio/main.
Please when multiple patches, add a cover letter to avoid me having to
reply to every patches when applying your series.
Thanks,
Maxime
On 12/3/20 12:36 AM, Xueming Li wrote:
> To improve performance and latency, this patch set Rx polling mode
> default delay time to zero.
>
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> Acked-by: Matan Azrad <matan@nvidia.com>
> ---
> drivers/vdpa/mlx5/mlx5_vdpa.h | 2 +-
> drivers/vdpa/mlx5/mlx5_vdpa_event.c | 3 ++-
> 2 files changed, 3 insertions(+), 2 deletions(-)
Series applied to dpdk-next-virtio/main.
Thanks,
Maxime
On 12/3/20 12:36 AM, Xueming Li wrote:
> This patch adds new device argument to specify cpu core affinity to
> event polling thread for better latency and throughput. The thread
> could be also located by name "vDPA-mlx5-<id>".
>
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> Acked-by: Matan Azrad <matan@nvidia.com>
> ---
> doc/guides/vdpadevs/mlx5.rst | 5 +++++
> drivers/vdpa/mlx5/mlx5_vdpa.c | 7 +++++++
> drivers/vdpa/mlx5/mlx5_vdpa.h | 1 +
> drivers/vdpa/mlx5/mlx5_vdpa_event.c | 23 ++++++++++++++++++++++-
> 4 files changed, 35 insertions(+), 1 deletion(-)
Series applied to dpdk-next-virtio/main.
Thanks,
Maxime
On 12/3/20 12:36 AM, Xueming Li wrote:
> For better performance and latency, this patch sets default event
> handling mode to polling mode which uses dedicate thread per device to
> poll and process event.
>
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> Acked-by: Matan Azrad <matan@nvidia.com>
> ---
> doc/guides/vdpadevs/mlx5.rst | 2 +-
> drivers/vdpa/mlx5/mlx5_vdpa.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
Series applied to dpdk-next-virtio/main.
Thanks,
Maxime
Hi Maxime,

>-----Original Message-----
>From: Maxime Coquelin <maxime.coquelin@redhat.com>
>Sent: Friday, January 8, 2021 5:13 PM
>To: Xueming(Steven) Li <xuemingl@nvidia.com>; Matan Azrad
><matan@nvidia.com>; Slava Ovsiienko <viacheslavo@nvidia.com>
>Cc: dev@dpdk.org; Asaf Penso <asafp@nvidia.com>
>Subject: Re: [PATCH 1/4] vdpa/mlx5: set polling mode default delay to zero
>
>On 12/3/20 12:36 AM, Xueming Li wrote:
>> To improve throughput and latency, this patch allows Rx polling timer
>> delay to 0us.
>>
>> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
>> Acked-by: Matan Azrad <matan@nvidia.com>
>> ---
>>  doc/guides/vdpadevs/mlx5.rst  | 3 +--
>>  drivers/vdpa/mlx5/mlx5_vdpa.c | 9 +++------
>>  2 files changed, 4 insertions(+), 8 deletions(-)
>>
>
>Series applied to dpdk-next-virtio/main.
>
>Please when multiple patches, add a cover letter to avoid me having to reply
>to every patches when applying your series.

Got it, thanks!

>
>Thanks,
>Maxime