* [dpdk-dev] [PATCH 0/4] net/mlx5: add large packet size support to MPRQ @ 2020-03-31 21:52 Alexander Kozyrev 2020-03-31 21:52 ` [dpdk-dev] [PATCH 1/4] net/mlx5: add a devarg to specify MPRQ stride size Alexander Kozyrev ` (5 more replies) 0 siblings, 6 replies; 28+ messages in thread From: Alexander Kozyrev @ 2020-03-31 21:52 UTC (permalink / raw) To: dev; +Cc: rasland, matan, viacheslavo, ferruh.yigit, thomas In order to support the 9K MTU, the MPRQ feature should be updated to allow a packet to take more than one stride (single linear buffer), i.e. receiving a packet into multiple adjacent strides should be implemented. What prevents a packet from being received into multiple strides is that the data buffer must be preceded by some HEAD_ROOM space. In the current implementation the HEAD_ROOM space is borrowed by the PMD from the tail of the preceding stride. If a packet takes multiple strides, the tail of the stride may be overwritten with packet data and that memory can't be borrowed to provide the HEAD_ROOM space for the next packet. Special care is needed to prevent HEAD_ROOM corruption: - copy the whole packet into a separate memory buffer if scatter is off - copy only the overlapping data and craft a multi-segment mbuf otherwise After multi-stride packet reception is in place it is possible to reduce the stride size for more efficient memory utilization. Introduce the mprq_log_stride_size device parameter to configure a stride size for MPRQ. The default stride size is set to 2048 bytes. Alexander Kozyrev (4): net/mlx5: add a devarg to specify MPRQ stride size net/mlx5: enable MPRQ multi-stride operations doc: add a decsription for MPRQ stride size devarg net/mlx5: add multi-segment packets in MPRQ mode doc/guides/nics/mlx5.rst | 9 +++ doc/guides/rel_notes/release_20_05.rst | 1 + drivers/net/mlx5/mlx5.c | 34 ++++++++-- drivers/net/mlx5/mlx5.h | 1 + drivers/net/mlx5/mlx5_defs.h | 3 + drivers/net/mlx5/mlx5_rxq.c | 43 +++++-------- drivers/net/mlx5/mlx5_rxtx.c | 113 +++++++++++++++++++-------------- drivers/net/mlx5/mlx5_rxtx.h | 2 +- 8 files changed, 125 insertions(+), 81 deletions(-) -- 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
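For readers following the series, here is a minimal standalone sketch of the condition the cover letter describes. The function name and parameters are illustrative (only the headroom/stride-count/stride-size relationship is taken from the patches); it is not driver code.

#include <stdint.h>

/*
 * Illustrative only: number of bytes by which a packet spanning
 * "strd_cnt" strides of "strd_sz" bytes runs into the area that the
 * PMD would otherwise borrow as headroom for the next packet.  A
 * positive result is the HEAD_ROOM corruption case handled by the
 * multi-stride and multi-segment patches of this series.
 */
static inline int32_t
mprq_headroom_overlap(uint32_t pkt_len, uint32_t headroom,
		      uint16_t strd_cnt, uint32_t strd_sz)
{
	return (int32_t)(pkt_len + headroom) -
	       (int32_t)((uint32_t)strd_cnt * strd_sz);
}

For example, assuming 2048-byte strides, a 128-byte mbuf headroom, and a device that counts only packet data when consuming strides: an 8192-byte packet fills exactly four strides, so the next packet's 128-byte headroom would overlap the packet tail (overlap = 128), whereas a 9000-byte packet consumes five strides and leaves 1112 bytes spare, so no overlap occurs.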
* [dpdk-dev] [PATCH 1/4] net/mlx5: add a devarg to specify MPRQ stride size 2020-03-31 21:52 [dpdk-dev] [PATCH 0/4] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev @ 2020-03-31 21:52 ` Alexander Kozyrev 2020-04-02 10:00 ` Slava Ovsiienko 2020-03-31 21:52 ` [dpdk-dev] [PATCH 2/4] net/mlx5: enable MPRQ multi-stride operations Alexander Kozyrev ` (4 subsequent siblings) 5 siblings, 1 reply; 28+ messages in thread From: Alexander Kozyrev @ 2020-03-31 21:52 UTC (permalink / raw) To: dev; +Cc: rasland, matan, viacheslavo, ferruh.yigit, thomas Define a device parameter to configure log 2 of a stride size for MPRQ - mprq_log_stride_size. User is able to specify a stride size in a range allowed by an underlying hardware. The default stride size is defined as 2048 bytes to encompass most commonly used packet sizes in the Internet (MTU 1518 and less) and will be used in case a maximum configured packet size cannot fit into the largest possible stride size. Otherwise a stride size is set to a large enough value to encompass a whole packet. Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> --- drivers/net/mlx5/mlx5.c | 34 +++++++++++++++++++++++++++------- drivers/net/mlx5/mlx5.h | 1 + drivers/net/mlx5/mlx5_defs.h | 3 +++ drivers/net/mlx5/mlx5_rxq.c | 22 +++++++++++++--------- 4 files changed, 44 insertions(+), 16 deletions(-) diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index 6a11b14..a2ba6d3 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -63,6 +63,9 @@ /* Device parameter to configure log 2 of the number of strides for MPRQ. */ #define MLX5_RX_MPRQ_LOG_STRIDE_NUM "mprq_log_stride_num" +/* Device parameter to configure log 2 of the stride size for MPRQ. */ +#define MLX5_RX_MPRQ_LOG_STRIDE_SIZE "mprq_log_stride_size" + /* Device parameter to limit the size of memcpy'd packet for MPRQ. 
*/ #define MLX5_RX_MPRQ_MAX_MEMCPY_LEN "mprq_max_memcpy_len" @@ -1531,6 +1534,8 @@ struct mlx5_flow_id_pool * config->mprq.enabled = !!tmp; } else if (strcmp(MLX5_RX_MPRQ_LOG_STRIDE_NUM, key) == 0) { config->mprq.stride_num_n = tmp; + } else if (strcmp(MLX5_RX_MPRQ_LOG_STRIDE_SIZE, key) == 0) { + config->mprq.stride_size_n = tmp; } else if (strcmp(MLX5_RX_MPRQ_MAX_MEMCPY_LEN, key) == 0) { config->mprq.max_memcpy_len = tmp; } else if (strcmp(MLX5_RXQS_MIN_MPRQ, key) == 0) { @@ -1627,6 +1632,7 @@ struct mlx5_flow_id_pool * MLX5_RXQ_PKT_PAD_EN, MLX5_RX_MPRQ_EN, MLX5_RX_MPRQ_LOG_STRIDE_NUM, + MLX5_RX_MPRQ_LOG_STRIDE_SIZE, MLX5_RX_MPRQ_MAX_MEMCPY_LEN, MLX5_RXQS_MIN_MPRQ, MLX5_TXQ_INLINE, @@ -2302,8 +2308,6 @@ struct mlx5_flow_id_pool * mprq_caps.min_single_wqe_log_num_of_strides; mprq_max_stride_num_n = mprq_caps.max_single_wqe_log_num_of_strides; - config.mprq.stride_num_n = RTE_MAX(MLX5_MPRQ_STRIDE_NUM_N, - mprq_min_stride_num_n); } #endif if (RTE_CACHE_LINE_SIZE == 128 && @@ -2617,17 +2621,32 @@ struct mlx5_flow_id_pool * #endif } if (config.mprq.enabled && mprq) { - if (config.mprq.stride_num_n > mprq_max_stride_num_n || - config.mprq.stride_num_n < mprq_min_stride_num_n) { + if (config.mprq.stride_num_n && + (config.mprq.stride_num_n > mprq_max_stride_num_n || + config.mprq.stride_num_n < mprq_min_stride_num_n)) { config.mprq.stride_num_n = - RTE_MAX(MLX5_MPRQ_STRIDE_NUM_N, - mprq_min_stride_num_n); + RTE_MIN(RTE_MAX(MLX5_MPRQ_STRIDE_NUM_N, + mprq_min_stride_num_n), + mprq_max_stride_num_n); DRV_LOG(WARNING, "the number of strides" " for Multi-Packet RQ is out of range," " setting default value (%u)", 1 << config.mprq.stride_num_n); } + if (config.mprq.stride_size_n && + (config.mprq.stride_size_n > mprq_max_stride_size_n || + config.mprq.stride_size_n < mprq_min_stride_size_n)) { + config.mprq.stride_size_n = + RTE_MIN(RTE_MAX(MLX5_MPRQ_STRIDE_SIZE_N, + mprq_min_stride_size_n), + mprq_max_stride_size_n); + DRV_LOG(WARNING, + "the size of a stride" + " for Multi-Packet RQ is out of range," + " setting default value (%u)", + 1 << config.mprq.stride_size_n); + } config.mprq.min_stride_size_n = mprq_min_stride_size_n; config.mprq.max_stride_size_n = mprq_max_stride_size_n; } else if (config.mprq.enabled && !mprq) { @@ -3361,7 +3380,8 @@ struct mlx5_flow_id_pool * .mr_ext_memseg_en = 1, .mprq = { .enabled = 0, /* Disabled by default. */ - .stride_num_n = MLX5_MPRQ_STRIDE_NUM_N, + .stride_num_n = 0, + .stride_size_n = 0, .max_memcpy_len = MLX5_MPRQ_MEMCPY_DEFAULT_LEN, .min_rxqs_num = MLX5_MPRQ_MIN_RXQS, }, diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index 62b0810..c8e2454 100644 --- a/drivers/net/mlx5/mlx5.h +++ b/drivers/net/mlx5/mlx5.h @@ -176,6 +176,7 @@ struct mlx5_dev_config { struct { unsigned int enabled:1; /* Whether MPRQ is enabled. */ unsigned int stride_num_n; /* Number of strides. */ + unsigned int stride_size_n; /* Size of a stride. */ unsigned int min_stride_size_n; /* Min size of a stride. */ unsigned int max_stride_size_n; /* Max size of a stride. */ unsigned int max_memcpy_len; diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h index 19e8253..260f584 100644 --- a/drivers/net/mlx5/mlx5_defs.h +++ b/drivers/net/mlx5/mlx5_defs.h @@ -143,6 +143,9 @@ /* Log 2 of the default number of strides per WQE for Multi-Packet RQ. */ #define MLX5_MPRQ_STRIDE_NUM_N 6U +/* Log 2 of the default size of a stride per WQE for Multi-Packet RQ. */ +#define MLX5_MPRQ_STRIDE_SIZE_N 11U + /* Two-byte shift is disabled for Multi-Packet RQ. 
*/ #define MLX5_MPRQ_TWO_BYTE_SHIFT 0 diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c index 0a95e3c..85fcfe6 100644 --- a/drivers/net/mlx5/mlx5_rxq.c +++ b/drivers/net/mlx5/mlx5_rxq.c @@ -1856,25 +1856,26 @@ struct mlx5_rxq_ctrl * strd_headroom_en = 1; mprq_stride_size = non_scatter_min_mbuf_size; } + if (!config->mprq.stride_num_n) + config->mprq.stride_num_n = MLX5_MPRQ_STRIDE_NUM_N; + if (!config->mprq.stride_size_n) + config->mprq.stride_size_n = (mprq_stride_size <= + (1U << config->mprq.max_stride_size_n)) ? + log2above(mprq_stride_size) : MLX5_MPRQ_STRIDE_SIZE_N; /* * This Rx queue can be configured as a Multi-Packet RQ if all of the * following conditions are met: * - MPRQ is enabled. * - The number of descs is more than the number of strides. - * - max_rx_pkt_len plus overhead is less than the max size of a - * stride. * Otherwise, enable Rx scatter if necessary. */ - if (mprq_en && - desc > (1U << config->mprq.stride_num_n) && - mprq_stride_size <= (1U << config->mprq.max_stride_size_n)) { + if (mprq_en && desc > (1U << config->mprq.stride_num_n)) { /* TODO: Rx scatter isn't supported yet. */ tmpl->rxq.sges_n = 0; /* Trim the number of descs needed. */ desc >>= config->mprq.stride_num_n; tmpl->rxq.strd_num_n = config->mprq.stride_num_n; - tmpl->rxq.strd_sz_n = RTE_MAX(log2above(mprq_stride_size), - config->mprq.min_stride_size_n); + tmpl->rxq.strd_sz_n = config->mprq.stride_size_n; tmpl->rxq.strd_shift_en = MLX5_MPRQ_TWO_BYTE_SHIFT; tmpl->rxq.strd_headroom_en = strd_headroom_en; tmpl->rxq.mprq_max_memcpy_len = RTE_MIN(first_mb_free_size, @@ -1924,9 +1925,12 @@ struct mlx5_rxq_ctrl * DRV_LOG(WARNING, "port %u MPRQ is requested but cannot be enabled" " (requested: desc = %u, stride_sz = %u," - " supported: min_stride_num = %u, max_stride_sz = %u).", - dev->data->port_id, desc, mprq_stride_size, + " supported: min_stride_num = %u, min_stride_sz = %u," + "max_stride_sz = %u).", + dev->data->port_id, desc, + (1 << config->mprq.stride_size_n), (1 << config->mprq.stride_num_n), + (1 << config->mprq.min_stride_size_n), (1 << config->mprq.max_stride_size_n)); DRV_LOG(DEBUG, "port %u maximum number of segments per packet: %u", dev->data->port_id, 1 << tmpl->rxq.sges_n); -- 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
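As a side note on the out-of-range handling in the mlx5.c hunk above, here is a condensed sketch of the fallback rule: a user-supplied log2 stride size outside the device-reported range is replaced by the default, itself clamped to that range. The function and parameter names are illustrative, and the same rule is applied by the patch to mprq_log_stride_num; the "unset" (zero) case is resolved later in the mlx5_rxq.c hunk.

/*
 * Illustrative equivalent of the RTE_MIN(RTE_MAX(...)) fallback above.
 */
static unsigned int
mprq_resolve_log_stride_size(unsigned int user_n, unsigned int dev_min_n,
			     unsigned int dev_max_n, unsigned int def_n)
{
	if (user_n && (user_n < dev_min_n || user_n > dev_max_n)) {
		unsigned int n = def_n > dev_min_n ? def_n : dev_min_n;

		return n < dev_max_n ? n : dev_max_n;
	}
	return user_n;	/* 0 == unset, resolved at Rx queue creation */
}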
* Re: [dpdk-dev] [PATCH 1/4] net/mlx5: add a devarg to specify MPRQ stride size 2020-03-31 21:52 ` [dpdk-dev] [PATCH 1/4] net/mlx5: add a devarg to specify MPRQ stride size Alexander Kozyrev @ 2020-04-02 10:00 ` Slava Ovsiienko 0 siblings, 0 replies; 28+ messages in thread From: Slava Ovsiienko @ 2020-04-02 10:00 UTC (permalink / raw) To: Alexander Kozyrev, dev Cc: Raslan Darawsheh, Matan Azrad, ferruh.yigit, Thomas Monjalon > -----Original Message----- > From: Alexander Kozyrev <akozyrev@mellanox.com> > Sent: Wednesday, April 1, 2020 0:53 > To: dev@dpdk.org > Cc: Raslan Darawsheh <rasland@mellanox.com>; Matan Azrad > <matan@mellanox.com>; Slava Ovsiienko <viacheslavo@mellanox.com>; > ferruh.yigit@intel.com; Thomas Monjalon <thomas@monjalon.net> > Subject: [PATCH 1/4] net/mlx5: add a devarg to specify MPRQ stride size > > Define a device parameter to configure log 2 of a stride size for MPRQ > - mprq_log_stride_size. User is able to specify a stride size in a range allowed > by an underlying hardware. The default stride size is defined as > 2048 bytes to encompass most commonly used packet sizes in the Internet > (MTU 1518 and less) and will be used in case a maximum configured packet > size cannot fit into the largest possible stride size. Otherwise a stride size is > set to a large enough value to encompass a whole packet. > > Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> > --- > drivers/net/mlx5/mlx5.c | 34 +++++++++++++++++++++++++++------- > drivers/net/mlx5/mlx5.h | 1 + > drivers/net/mlx5/mlx5_defs.h | 3 +++ > drivers/net/mlx5/mlx5_rxq.c | 22 +++++++++++++--------- > 4 files changed, 44 insertions(+), 16 deletions(-) > > diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index > 6a11b14..a2ba6d3 100644 > --- a/drivers/net/mlx5/mlx5.c > +++ b/drivers/net/mlx5/mlx5.c > @@ -63,6 +63,9 @@ > /* Device parameter to configure log 2 of the number of strides for MPRQ. */ > #define MLX5_RX_MPRQ_LOG_STRIDE_NUM "mprq_log_stride_num" > > +/* Device parameter to configure log 2 of the stride size for MPRQ. */ > +#define MLX5_RX_MPRQ_LOG_STRIDE_SIZE "mprq_log_stride_size" > + > /* Device parameter to limit the size of memcpy'd packet for MPRQ. 
*/ > #define MLX5_RX_MPRQ_MAX_MEMCPY_LEN "mprq_max_memcpy_len" > > @@ -1531,6 +1534,8 @@ struct mlx5_flow_id_pool * > config->mprq.enabled = !!tmp; > } else if (strcmp(MLX5_RX_MPRQ_LOG_STRIDE_NUM, key) == 0) { > config->mprq.stride_num_n = tmp; > + } else if (strcmp(MLX5_RX_MPRQ_LOG_STRIDE_SIZE, key) == 0) { > + config->mprq.stride_size_n = tmp; > } else if (strcmp(MLX5_RX_MPRQ_MAX_MEMCPY_LEN, key) == 0) { > config->mprq.max_memcpy_len = tmp; > } else if (strcmp(MLX5_RXQS_MIN_MPRQ, key) == 0) { @@ -1627,6 > +1632,7 @@ struct mlx5_flow_id_pool * > MLX5_RXQ_PKT_PAD_EN, > MLX5_RX_MPRQ_EN, > MLX5_RX_MPRQ_LOG_STRIDE_NUM, > + MLX5_RX_MPRQ_LOG_STRIDE_SIZE, > MLX5_RX_MPRQ_MAX_MEMCPY_LEN, > MLX5_RXQS_MIN_MPRQ, > MLX5_TXQ_INLINE, > @@ -2302,8 +2308,6 @@ struct mlx5_flow_id_pool * > mprq_caps.min_single_wqe_log_num_of_strides; > mprq_max_stride_num_n = > mprq_caps.max_single_wqe_log_num_of_strides; > - config.mprq.stride_num_n = > RTE_MAX(MLX5_MPRQ_STRIDE_NUM_N, > - mprq_min_stride_num_n); > } > #endif > if (RTE_CACHE_LINE_SIZE == 128 && > @@ -2617,17 +2621,32 @@ struct mlx5_flow_id_pool * #endif > } > if (config.mprq.enabled && mprq) { > - if (config.mprq.stride_num_n > mprq_max_stride_num_n || > - config.mprq.stride_num_n < mprq_min_stride_num_n) { > + if (config.mprq.stride_num_n && > + (config.mprq.stride_num_n > mprq_max_stride_num_n || > + config.mprq.stride_num_n < mprq_min_stride_num_n)) { > config.mprq.stride_num_n = > - RTE_MAX(MLX5_MPRQ_STRIDE_NUM_N, > - mprq_min_stride_num_n); > + > RTE_MIN(RTE_MAX(MLX5_MPRQ_STRIDE_NUM_N, > + mprq_min_stride_num_n), > + mprq_max_stride_num_n); > DRV_LOG(WARNING, > "the number of strides" > " for Multi-Packet RQ is out of range," > " setting default value (%u)", > 1 << config.mprq.stride_num_n); > } > + if (config.mprq.stride_size_n && > + (config.mprq.stride_size_n > mprq_max_stride_size_n || > + config.mprq.stride_size_n < mprq_min_stride_size_n)) { > + config.mprq.stride_size_n = > + > RTE_MIN(RTE_MAX(MLX5_MPRQ_STRIDE_SIZE_N, > + mprq_min_stride_size_n), > + mprq_max_stride_size_n); > + DRV_LOG(WARNING, > + "the size of a stride" > + " for Multi-Packet RQ is out of range," > + " setting default value (%u)", > + 1 << config.mprq.stride_size_n); > + } > config.mprq.min_stride_size_n = mprq_min_stride_size_n; > config.mprq.max_stride_size_n = mprq_max_stride_size_n; > } else if (config.mprq.enabled && !mprq) { @@ -3361,7 +3380,8 @@ > struct mlx5_flow_id_pool * > .mr_ext_memseg_en = 1, > .mprq = { > .enabled = 0, /* Disabled by default. */ > - .stride_num_n = MLX5_MPRQ_STRIDE_NUM_N, > + .stride_num_n = 0, > + .stride_size_n = 0, > .max_memcpy_len = > MLX5_MPRQ_MEMCPY_DEFAULT_LEN, > .min_rxqs_num = MLX5_MPRQ_MIN_RXQS, > }, > diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index > 62b0810..c8e2454 100644 > --- a/drivers/net/mlx5/mlx5.h > +++ b/drivers/net/mlx5/mlx5.h > @@ -176,6 +176,7 @@ struct mlx5_dev_config { > struct { > unsigned int enabled:1; /* Whether MPRQ is enabled. */ > unsigned int stride_num_n; /* Number of strides. */ > + unsigned int stride_size_n; /* Size of a stride. */ > unsigned int min_stride_size_n; /* Min size of a stride. */ > unsigned int max_stride_size_n; /* Max size of a stride. */ > unsigned int max_memcpy_len; > diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h > index 19e8253..260f584 100644 > --- a/drivers/net/mlx5/mlx5_defs.h > +++ b/drivers/net/mlx5/mlx5_defs.h > @@ -143,6 +143,9 @@ > /* Log 2 of the default number of strides per WQE for Multi-Packet RQ. 
*/ > #define MLX5_MPRQ_STRIDE_NUM_N 6U > > +/* Log 2 of the default size of a stride per WQE for Multi-Packet RQ. > +*/ #define MLX5_MPRQ_STRIDE_SIZE_N 11U > + > /* Two-byte shift is disabled for Multi-Packet RQ. */ #define > MLX5_MPRQ_TWO_BYTE_SHIFT 0 > > diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c index > 0a95e3c..85fcfe6 100644 > --- a/drivers/net/mlx5/mlx5_rxq.c > +++ b/drivers/net/mlx5/mlx5_rxq.c > @@ -1856,25 +1856,26 @@ struct mlx5_rxq_ctrl * > strd_headroom_en = 1; > mprq_stride_size = non_scatter_min_mbuf_size; > } > + if (!config->mprq.stride_num_n) > + config->mprq.stride_num_n = MLX5_MPRQ_STRIDE_NUM_N; > + if (!config->mprq.stride_size_n) > + config->mprq.stride_size_n = (mprq_stride_size <= > + (1U << config->mprq.max_stride_size_n)) ? > + log2above(mprq_stride_size) : > MLX5_MPRQ_STRIDE_SIZE_N; > /* > * This Rx queue can be configured as a Multi-Packet RQ if all of the > * following conditions are met: > * - MPRQ is enabled. > * - The number of descs is more than the number of strides. > - * - max_rx_pkt_len plus overhead is less than the max size of a > - * stride. > * Otherwise, enable Rx scatter if necessary. > */ > - if (mprq_en && > - desc > (1U << config->mprq.stride_num_n) && > - mprq_stride_size <= (1U << config->mprq.max_stride_size_n)) { > + if (mprq_en && desc > (1U << config->mprq.stride_num_n)) { > /* TODO: Rx scatter isn't supported yet. */ > tmpl->rxq.sges_n = 0; > /* Trim the number of descs needed. */ > desc >>= config->mprq.stride_num_n; > tmpl->rxq.strd_num_n = config->mprq.stride_num_n; > - tmpl->rxq.strd_sz_n = RTE_MAX(log2above(mprq_stride_size), > - config->mprq.min_stride_size_n); > + tmpl->rxq.strd_sz_n = config->mprq.stride_size_n; > tmpl->rxq.strd_shift_en = MLX5_MPRQ_TWO_BYTE_SHIFT; > tmpl->rxq.strd_headroom_en = strd_headroom_en; > tmpl->rxq.mprq_max_memcpy_len = > RTE_MIN(first_mb_free_size, @@ -1924,9 +1925,12 @@ struct mlx5_rxq_ctrl * > DRV_LOG(WARNING, > "port %u MPRQ is requested but cannot be enabled" > " (requested: desc = %u, stride_sz = %u," > - " supported: min_stride_num = %u, max_stride_sz = > %u).", > - dev->data->port_id, desc, mprq_stride_size, > + " supported: min_stride_num = %u, min_stride_sz = > %u," > + "max_stride_sz = %u).", > + dev->data->port_id, desc, > + (1 << config->mprq.stride_size_n), > (1 << config->mprq.stride_num_n), > + (1 << config->mprq.min_stride_size_n), > (1 << config->mprq.max_stride_size_n)); > DRV_LOG(DEBUG, "port %u maximum number of segments per > packet: %u", > dev->data->port_id, 1 << tmpl->rxq.sges_n); > -- > 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
* [dpdk-dev] [PATCH 2/4] net/mlx5: enable MPRQ multi-stride operations 2020-03-31 21:52 [dpdk-dev] [PATCH 0/4] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev 2020-03-31 21:52 ` [dpdk-dev] [PATCH 1/4] net/mlx5: add a devarg to specify MPRQ stride size Alexander Kozyrev @ 2020-03-31 21:52 ` Alexander Kozyrev 2020-04-02 10:01 ` Slava Ovsiienko 2020-03-31 21:52 ` [dpdk-dev] [PATCH 3/4] doc: add a decsription for MPRQ stride size devarg Alexander Kozyrev ` (3 subsequent siblings) 5 siblings, 1 reply; 28+ messages in thread From: Alexander Kozyrev @ 2020-03-31 21:52 UTC (permalink / raw) To: dev; +Cc: rasland, matan, viacheslavo, ferruh.yigit, thomas MPRQ feature should be updated to allow a packet to be received into multiple strides in order to support the MTU exceeding 8KB. Special care is needed to prevent the headroom corruption in the multi-stride mode since the headroom space is borrowed by the PMD from the tail of the preceding stride. Copy the whole packet into a separate mbuf in this case or just the overlapping data if the Rx scattering is supported by an application. Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> --- drivers/net/mlx5/mlx5_rxq.c | 25 ++++------------ drivers/net/mlx5/mlx5_rxtx.c | 68 +++++++++++++++++++------------------------- drivers/net/mlx5/mlx5_rxtx.h | 2 +- 3 files changed, 35 insertions(+), 60 deletions(-) diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c index 85fcfe6..a64f536 100644 --- a/drivers/net/mlx5/mlx5_rxq.c +++ b/drivers/net/mlx5/mlx5_rxq.c @@ -1793,9 +1793,7 @@ struct mlx5_rxq_ctrl * struct mlx5_priv *priv = dev->data->dev_private; struct mlx5_rxq_ctrl *tmpl; unsigned int mb_len = rte_pktmbuf_data_room_size(mp); - unsigned int mprq_stride_size; struct mlx5_dev_config *config = &priv->config; - unsigned int strd_headroom_en; /* * Always allocate extra slots, even if eventually * the vector Rx will not be used. @@ -1841,27 +1839,13 @@ struct mlx5_rxq_ctrl * tmpl->socket = socket; if (dev->data->dev_conf.intr_conf.rxq) tmpl->irq = 1; - /* - * LRO packet may consume all the stride memory, hence we cannot - * guaranty head-room near the packet memory in the stride. - * In this case scatter is, for sure, enabled and an empty mbuf may be - * added in the start for the head-room. - */ - if (lro_on_queue && RTE_PKTMBUF_HEADROOM > 0 && - non_scatter_min_mbuf_size > mb_len) { - strd_headroom_en = 0; - mprq_stride_size = RTE_MIN(max_rx_pkt_len, - 1u << config->mprq.max_stride_size_n); - } else { - strd_headroom_en = 1; - mprq_stride_size = non_scatter_min_mbuf_size; - } if (!config->mprq.stride_num_n) config->mprq.stride_num_n = MLX5_MPRQ_STRIDE_NUM_N; if (!config->mprq.stride_size_n) - config->mprq.stride_size_n = (mprq_stride_size <= + config->mprq.stride_size_n = (non_scatter_min_mbuf_size <= (1U << config->mprq.max_stride_size_n)) ? 
- log2above(mprq_stride_size) : MLX5_MPRQ_STRIDE_SIZE_N; + log2above(non_scatter_min_mbuf_size) : + MLX5_MPRQ_STRIDE_SIZE_N; /* * This Rx queue can be configured as a Multi-Packet RQ if all of the * following conditions are met: @@ -1877,7 +1861,8 @@ struct mlx5_rxq_ctrl * tmpl->rxq.strd_num_n = config->mprq.stride_num_n; tmpl->rxq.strd_sz_n = config->mprq.stride_size_n; tmpl->rxq.strd_shift_en = MLX5_MPRQ_TWO_BYTE_SHIFT; - tmpl->rxq.strd_headroom_en = strd_headroom_en; + tmpl->rxq.strd_scatter_en = + !!(offloads & DEV_RX_OFFLOAD_SCATTER); tmpl->rxq.mprq_max_memcpy_len = RTE_MIN(first_mb_free_size, config->mprq.max_memcpy_len); max_lro_size = RTE_MIN(max_rx_pkt_len, diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c index f3bf763..4c27952 100644 --- a/drivers/net/mlx5/mlx5_rxtx.c +++ b/drivers/net/mlx5/mlx5_rxtx.c @@ -1658,21 +1658,20 @@ enum mlx5_txcmp_code { unsigned int i = 0; uint32_t rq_ci = rxq->rq_ci; uint16_t consumed_strd = rxq->consumed_strd; - uint16_t headroom_sz = rxq->strd_headroom_en * RTE_PKTMBUF_HEADROOM; struct mlx5_mprq_buf *buf = (*rxq->mprq_bufs)[rq_ci & wq_mask]; while (i < pkts_n) { struct rte_mbuf *pkt; void *addr; int ret; - unsigned int len; + uint32_t len; uint16_t strd_cnt; uint16_t strd_idx; uint32_t offset; uint32_t byte_cnt; + int32_t hdrm_overlap; volatile struct mlx5_mini_cqe8 *mcqe = NULL; uint32_t rss_hash_res = 0; - uint8_t lro_num_seg; if (consumed_strd == strd_n) { /* Replace WQE only if the buffer is still in use. */ @@ -1719,18 +1718,6 @@ enum mlx5_txcmp_code { MLX5_ASSERT(strd_idx < strd_n); MLX5_ASSERT(!((rte_be_to_cpu_16(cqe->wqe_id) ^ rq_ci) & wq_mask)); - lro_num_seg = cqe->lro_num_seg; - /* - * Currently configured to receive a packet per a stride. But if - * MTU is adjusted through kernel interface, device could - * consume multiple strides without raising an error. In this - * case, the packet should be dropped because it is bigger than - * the max_rx_pkt_len. - */ - if (unlikely(!lro_num_seg && strd_cnt > 1)) { - ++rxq->stats.idropped; - continue; - } pkt = rte_pktmbuf_alloc(rxq->mp); if (unlikely(pkt == NULL)) { ++rxq->stats.rx_nombuf; @@ -1742,12 +1729,16 @@ enum mlx5_txcmp_code { len -= RTE_ETHER_CRC_LEN; offset = strd_idx * strd_sz + strd_shift; addr = RTE_PTR_ADD(mlx5_mprq_buf_addr(buf, strd_n), offset); + hdrm_overlap = len + RTE_PKTMBUF_HEADROOM - strd_cnt * strd_sz; /* * Memcpy packets to the target mbuf if: * - The size of packet is smaller than mprq_max_memcpy_len. * - Out of buffer in the Mempool for Multi-Packet RQ. + * - There is no space for a headroom and scatter is disabled. */ - if (len <= rxq->mprq_max_memcpy_len || rxq->mprq_repl == NULL) { + if (len <= rxq->mprq_max_memcpy_len || + rxq->mprq_repl == NULL || + (hdrm_overlap > 0 && !rxq->strd_scatter_en)) { /* * When memcpy'ing packet due to out-of-buffer, the * packet must be smaller than the target mbuf. @@ -1769,7 +1760,7 @@ enum mlx5_txcmp_code { rte_atomic16_add_return(&buf->refcnt, 1); MLX5_ASSERT((uint16_t)rte_atomic16_read(&buf->refcnt) <= strd_n + 1); - buf_addr = RTE_PTR_SUB(addr, headroom_sz); + buf_addr = RTE_PTR_SUB(addr, RTE_PKTMBUF_HEADROOM); /* * MLX5 device doesn't use iova but it is necessary in a * case where the Rx packet is transmitted via a @@ -1788,43 +1779,42 @@ enum mlx5_txcmp_code { rte_pktmbuf_attach_extbuf(pkt, buf_addr, buf_iova, buf_len, shinfo); /* Set mbuf head-room. 
*/ - pkt->data_off = headroom_sz; + SET_DATA_OFF(pkt, RTE_PKTMBUF_HEADROOM); MLX5_ASSERT(pkt->ol_flags == EXT_ATTACHED_MBUF); - /* - * Prevent potential overflow due to MTU change through - * kernel interface. - */ - if (unlikely(rte_pktmbuf_tailroom(pkt) < len)) { - rte_pktmbuf_free_seg(pkt); - ++rxq->stats.idropped; - continue; - } + MLX5_ASSERT(rte_pktmbuf_tailroom(pkt) < + len - (hdrm_overlap > 0 ? hdrm_overlap : 0)); DATA_LEN(pkt) = len; /* - * LRO packet may consume all the stride memory, in this - * case packet head-room space is not guaranteed so must - * to add an empty mbuf for the head-room. + * Copy the last fragment of a packet (up to headroom + * size bytes) in case there is a stride overlap with + * a next packet's headroom. Allocate a separate mbuf + * to store this fragment and link it. Scatter is on. */ - if (!rxq->strd_headroom_en) { - struct rte_mbuf *headroom_mbuf = - rte_pktmbuf_alloc(rxq->mp); + if (hdrm_overlap > 0) { + MLX5_ASSERT(rxq->strd_scatter_en); + struct rte_mbuf *seg = + rte_pktmbuf_alloc(rxq->mp); - if (unlikely(headroom_mbuf == NULL)) { + if (unlikely(seg == NULL)) { rte_pktmbuf_free_seg(pkt); ++rxq->stats.rx_nombuf; break; } - PORT(pkt) = rxq->port_id; - NEXT(headroom_mbuf) = pkt; - pkt = headroom_mbuf; + SET_DATA_OFF(seg, 0); + rte_memcpy(rte_pktmbuf_mtod(seg, void *), + RTE_PTR_ADD(addr, len - hdrm_overlap), + hdrm_overlap); + DATA_LEN(seg) = hdrm_overlap; + DATA_LEN(pkt) = len - hdrm_overlap; + NEXT(pkt) = seg; NB_SEGS(pkt) = 2; } } rxq_cq_to_mbuf(rxq, pkt, cqe, rss_hash_res); - if (lro_num_seg > 1) { + if (cqe->lro_num_seg > 1) { mlx5_lro_update_hdr(addr, cqe, len); pkt->ol_flags |= PKT_RX_LRO; - pkt->tso_segsz = strd_sz; + pkt->tso_segsz = len / cqe->lro_num_seg; } PKT_LEN(pkt) = len; PORT(pkt) = rxq->port_id; diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h index 939778a..d155c24 100644 --- a/drivers/net/mlx5/mlx5_rxtx.h +++ b/drivers/net/mlx5/mlx5_rxtx.h @@ -119,7 +119,7 @@ struct mlx5_rxq_data { unsigned int strd_sz_n:4; /* Log 2 of stride size. */ unsigned int strd_shift_en:1; /* Enable 2bytes shift on a stride. */ unsigned int err_state:2; /* enum mlx5_rxq_err_state. */ - unsigned int strd_headroom_en:1; /* Enable mbuf headroom in MPRQ. */ + unsigned int strd_scatter_en:1; /* Scattered packets from a stride. */ unsigned int lro:1; /* Enable LRO. */ unsigned int :1; /* Remaining bits. */ volatile uint32_t *rq_db; -- 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
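To summarize the branch added above, here is a reduced sketch of how the receive path now picks between copying the packet and attaching the external stride buffer. The enum and helper names are illustrative, not the driver's actual symbols; only the condition mirrors the patch.

#include <stdint.h>

enum mprq_rx_path {
	MPRQ_PATH_MEMCPY,	/* copy into a regular mbuf */
	MPRQ_PATH_ATTACH,	/* attach the stride as an external buffer */
	MPRQ_PATH_ATTACH_SEG,	/* attach + copy the overlapping tail into a
				 * second, linked mbuf (scatter must be on) */
};

static enum mprq_rx_path
mprq_pick_rx_path(uint32_t len, uint32_t max_memcpy_len, int have_repl_buf,
		  int32_t hdrm_overlap, int scatter_en)
{
	if (len <= max_memcpy_len || !have_repl_buf ||
	    (hdrm_overlap > 0 && !scatter_en))
		return MPRQ_PATH_MEMCPY;
	if (hdrm_overlap > 0)
		return MPRQ_PATH_ATTACH_SEG;
	return MPRQ_PATH_ATTACH;
}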
* Re: [dpdk-dev] [PATCH 2/4] net/mlx5: enable MPRQ multi-stride operations 2020-03-31 21:52 ` [dpdk-dev] [PATCH 2/4] net/mlx5: enable MPRQ multi-stride operations Alexander Kozyrev @ 2020-04-02 10:01 ` Slava Ovsiienko 0 siblings, 0 replies; 28+ messages in thread From: Slava Ovsiienko @ 2020-04-02 10:01 UTC (permalink / raw) To: Alexander Kozyrev, dev Cc: Raslan Darawsheh, Matan Azrad, ferruh.yigit, Thomas Monjalon > -----Original Message----- > From: Alexander Kozyrev <akozyrev@mellanox.com> > Sent: Wednesday, April 1, 2020 0:53 > To: dev@dpdk.org > Cc: Raslan Darawsheh <rasland@mellanox.com>; Matan Azrad > <matan@mellanox.com>; Slava Ovsiienko <viacheslavo@mellanox.com>; > ferruh.yigit@intel.com; Thomas Monjalon <thomas@monjalon.net> > Subject: [PATCH 2/4] net/mlx5: enable MPRQ multi-stride operations > > MPRQ feature should be updated to allow a packet to be received into > multiple strides in order to support the MTU exceeding 8KB. > Special care is needed to prevent the headroom corruption in the multi-stride > mode since the headroom space is borrowed by the PMD from the tail of the > preceding stride. Copy the whole packet into a separate mbuf in this case or > just the overlapping data if the Rx scattering is supported by an application. > > Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> > --- > drivers/net/mlx5/mlx5_rxq.c | 25 ++++------------ > drivers/net/mlx5/mlx5_rxtx.c | 68 +++++++++++++++++++------------------------- > drivers/net/mlx5/mlx5_rxtx.h | 2 +- > 3 files changed, 35 insertions(+), 60 deletions(-) > > diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c index > 85fcfe6..a64f536 100644 > --- a/drivers/net/mlx5/mlx5_rxq.c > +++ b/drivers/net/mlx5/mlx5_rxq.c > @@ -1793,9 +1793,7 @@ struct mlx5_rxq_ctrl * > struct mlx5_priv *priv = dev->data->dev_private; > struct mlx5_rxq_ctrl *tmpl; > unsigned int mb_len = rte_pktmbuf_data_room_size(mp); > - unsigned int mprq_stride_size; > struct mlx5_dev_config *config = &priv->config; > - unsigned int strd_headroom_en; > /* > * Always allocate extra slots, even if eventually > * the vector Rx will not be used. > @@ -1841,27 +1839,13 @@ struct mlx5_rxq_ctrl * > tmpl->socket = socket; > if (dev->data->dev_conf.intr_conf.rxq) > tmpl->irq = 1; > - /* > - * LRO packet may consume all the stride memory, hence we cannot > - * guaranty head-room near the packet memory in the stride. > - * In this case scatter is, for sure, enabled and an empty mbuf may be > - * added in the start for the head-room. > - */ > - if (lro_on_queue && RTE_PKTMBUF_HEADROOM > 0 && > - non_scatter_min_mbuf_size > mb_len) { > - strd_headroom_en = 0; > - mprq_stride_size = RTE_MIN(max_rx_pkt_len, > - 1u << config- > >mprq.max_stride_size_n); > - } else { > - strd_headroom_en = 1; > - mprq_stride_size = non_scatter_min_mbuf_size; > - } > if (!config->mprq.stride_num_n) > config->mprq.stride_num_n = MLX5_MPRQ_STRIDE_NUM_N; > if (!config->mprq.stride_size_n) > - config->mprq.stride_size_n = (mprq_stride_size <= > + config->mprq.stride_size_n = (non_scatter_min_mbuf_size <= > (1U << config->mprq.max_stride_size_n)) ? 
> - log2above(mprq_stride_size) : > MLX5_MPRQ_STRIDE_SIZE_N; > + log2above(non_scatter_min_mbuf_size) : > + MLX5_MPRQ_STRIDE_SIZE_N; > /* > * This Rx queue can be configured as a Multi-Packet RQ if all of the > * following conditions are met: > @@ -1877,7 +1861,8 @@ struct mlx5_rxq_ctrl * > tmpl->rxq.strd_num_n = config->mprq.stride_num_n; > tmpl->rxq.strd_sz_n = config->mprq.stride_size_n; > tmpl->rxq.strd_shift_en = MLX5_MPRQ_TWO_BYTE_SHIFT; > - tmpl->rxq.strd_headroom_en = strd_headroom_en; > + tmpl->rxq.strd_scatter_en = > + !!(offloads & DEV_RX_OFFLOAD_SCATTER); > tmpl->rxq.mprq_max_memcpy_len = > RTE_MIN(first_mb_free_size, > config->mprq.max_memcpy_len); > max_lro_size = RTE_MIN(max_rx_pkt_len, diff --git > a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c index > f3bf763..4c27952 100644 > --- a/drivers/net/mlx5/mlx5_rxtx.c > +++ b/drivers/net/mlx5/mlx5_rxtx.c > @@ -1658,21 +1658,20 @@ enum mlx5_txcmp_code { > unsigned int i = 0; > uint32_t rq_ci = rxq->rq_ci; > uint16_t consumed_strd = rxq->consumed_strd; > - uint16_t headroom_sz = rxq->strd_headroom_en * > RTE_PKTMBUF_HEADROOM; > struct mlx5_mprq_buf *buf = (*rxq->mprq_bufs)[rq_ci & wq_mask]; > > while (i < pkts_n) { > struct rte_mbuf *pkt; > void *addr; > int ret; > - unsigned int len; > + uint32_t len; > uint16_t strd_cnt; > uint16_t strd_idx; > uint32_t offset; > uint32_t byte_cnt; > + int32_t hdrm_overlap; > volatile struct mlx5_mini_cqe8 *mcqe = NULL; > uint32_t rss_hash_res = 0; > - uint8_t lro_num_seg; > > if (consumed_strd == strd_n) { > /* Replace WQE only if the buffer is still in use. */ > @@ -1719,18 +1718,6 @@ enum mlx5_txcmp_code { > MLX5_ASSERT(strd_idx < strd_n); > MLX5_ASSERT(!((rte_be_to_cpu_16(cqe->wqe_id) ^ rq_ci) & > wq_mask)); > - lro_num_seg = cqe->lro_num_seg; > - /* > - * Currently configured to receive a packet per a stride. But if > - * MTU is adjusted through kernel interface, device could > - * consume multiple strides without raising an error. In this > - * case, the packet should be dropped because it is bigger > than > - * the max_rx_pkt_len. > - */ > - if (unlikely(!lro_num_seg && strd_cnt > 1)) { > - ++rxq->stats.idropped; > - continue; > - } > pkt = rte_pktmbuf_alloc(rxq->mp); > if (unlikely(pkt == NULL)) { > ++rxq->stats.rx_nombuf; > @@ -1742,12 +1729,16 @@ enum mlx5_txcmp_code { > len -= RTE_ETHER_CRC_LEN; > offset = strd_idx * strd_sz + strd_shift; > addr = RTE_PTR_ADD(mlx5_mprq_buf_addr(buf, strd_n), > offset); > + hdrm_overlap = len + RTE_PKTMBUF_HEADROOM - strd_cnt * > strd_sz; > /* > * Memcpy packets to the target mbuf if: > * - The size of packet is smaller than > mprq_max_memcpy_len. > * - Out of buffer in the Mempool for Multi-Packet RQ. > + * - There is no space for a headroom and scatter is disabled. > */ > - if (len <= rxq->mprq_max_memcpy_len || rxq->mprq_repl == > NULL) { > + if (len <= rxq->mprq_max_memcpy_len || > + rxq->mprq_repl == NULL || > + (hdrm_overlap > 0 && !rxq->strd_scatter_en)) { > /* > * When memcpy'ing packet due to out-of-buffer, the > * packet must be smaller than the target mbuf. 
> @@ -1769,7 +1760,7 @@ enum mlx5_txcmp_code { > rte_atomic16_add_return(&buf->refcnt, 1); > MLX5_ASSERT((uint16_t)rte_atomic16_read(&buf- > >refcnt) <= > strd_n + 1); > - buf_addr = RTE_PTR_SUB(addr, headroom_sz); > + buf_addr = RTE_PTR_SUB(addr, > RTE_PKTMBUF_HEADROOM); > /* > * MLX5 device doesn't use iova but it is necessary in a > * case where the Rx packet is transmitted via a @@ - > 1788,43 +1779,42 @@ enum mlx5_txcmp_code { > rte_pktmbuf_attach_extbuf(pkt, buf_addr, buf_iova, > buf_len, shinfo); > /* Set mbuf head-room. */ > - pkt->data_off = headroom_sz; > + SET_DATA_OFF(pkt, RTE_PKTMBUF_HEADROOM); > MLX5_ASSERT(pkt->ol_flags == > EXT_ATTACHED_MBUF); > - /* > - * Prevent potential overflow due to MTU change > through > - * kernel interface. > - */ > - if (unlikely(rte_pktmbuf_tailroom(pkt) < len)) { > - rte_pktmbuf_free_seg(pkt); > - ++rxq->stats.idropped; > - continue; > - } > + MLX5_ASSERT(rte_pktmbuf_tailroom(pkt) < > + len - (hdrm_overlap > 0 ? hdrm_overlap : 0)); > DATA_LEN(pkt) = len; > /* > - * LRO packet may consume all the stride memory, in > this > - * case packet head-room space is not guaranteed so > must > - * to add an empty mbuf for the head-room. > + * Copy the last fragment of a packet (up to > headroom > + * size bytes) in case there is a stride overlap with > + * a next packet's headroom. Allocate a separate > mbuf > + * to store this fragment and link it. Scatter is on. > */ > - if (!rxq->strd_headroom_en) { > - struct rte_mbuf *headroom_mbuf = > - rte_pktmbuf_alloc(rxq->mp); > + if (hdrm_overlap > 0) { > + MLX5_ASSERT(rxq->strd_scatter_en); > + struct rte_mbuf *seg = > + rte_pktmbuf_alloc(rxq->mp); > > - if (unlikely(headroom_mbuf == NULL)) { > + if (unlikely(seg == NULL)) { > rte_pktmbuf_free_seg(pkt); > ++rxq->stats.rx_nombuf; > break; > } > - PORT(pkt) = rxq->port_id; > - NEXT(headroom_mbuf) = pkt; > - pkt = headroom_mbuf; > + SET_DATA_OFF(seg, 0); > + rte_memcpy(rte_pktmbuf_mtod(seg, void *), > + RTE_PTR_ADD(addr, len - > hdrm_overlap), > + hdrm_overlap); > + DATA_LEN(seg) = hdrm_overlap; > + DATA_LEN(pkt) = len - hdrm_overlap; > + NEXT(pkt) = seg; > NB_SEGS(pkt) = 2; > } > } > rxq_cq_to_mbuf(rxq, pkt, cqe, rss_hash_res); > - if (lro_num_seg > 1) { > + if (cqe->lro_num_seg > 1) { > mlx5_lro_update_hdr(addr, cqe, len); > pkt->ol_flags |= PKT_RX_LRO; > - pkt->tso_segsz = strd_sz; > + pkt->tso_segsz = len / cqe->lro_num_seg; > } > PKT_LEN(pkt) = len; > PORT(pkt) = rxq->port_id; > diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h > index 939778a..d155c24 100644 > --- a/drivers/net/mlx5/mlx5_rxtx.h > +++ b/drivers/net/mlx5/mlx5_rxtx.h > @@ -119,7 +119,7 @@ struct mlx5_rxq_data { > unsigned int strd_sz_n:4; /* Log 2 of stride size. */ > unsigned int strd_shift_en:1; /* Enable 2bytes shift on a stride. */ > unsigned int err_state:2; /* enum mlx5_rxq_err_state. */ > - unsigned int strd_headroom_en:1; /* Enable mbuf headroom in > MPRQ. */ > + unsigned int strd_scatter_en:1; /* Scattered packets from a stride. */ > unsigned int lro:1; /* Enable LRO. */ > unsigned int :1; /* Remaining bits. */ > volatile uint32_t *rq_db; > -- > 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
* [dpdk-dev] [PATCH 3/4] doc: add a decsription for MPRQ stride size devarg 2020-03-31 21:52 [dpdk-dev] [PATCH 0/4] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev 2020-03-31 21:52 ` [dpdk-dev] [PATCH 1/4] net/mlx5: add a devarg to specify MPRQ stride size Alexander Kozyrev 2020-03-31 21:52 ` [dpdk-dev] [PATCH 2/4] net/mlx5: enable MPRQ multi-stride operations Alexander Kozyrev @ 2020-03-31 21:52 ` Alexander Kozyrev 2020-03-31 21:52 ` [dpdk-dev] [PATCH 4/4] net/mlx5: add multi-segment packets in MPRQ mode Alexander Kozyrev ` (2 subsequent siblings) 5 siblings, 0 replies; 28+ messages in thread From: Alexander Kozyrev @ 2020-03-31 21:52 UTC (permalink / raw) To: dev; +Cc: rasland, matan, viacheslavo, ferruh.yigit, thomas Provide a description of the newly added mprq_log_stride_size devarg parameter for specifying a stride size in case MPRQ Rx is on. Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> --- doc/guides/nics/mlx5.rst | 9 +++++++++ doc/guides/rel_notes/release_20_05.rst | 1 + 2 files changed, 10 insertions(+) diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index e13c07d..4e8c130 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -434,6 +434,15 @@ Run-time configuration The size of Rx queue should be bigger than the number of strides. +- ``mprq_log_stride_size`` parameter [int] + + Log 2 of the size of a stride for Multi-Packet Rx queue. Configuring a smaller + stride size can save some memory and reduce probability of a depletion of all + available strides due to unreleased packets by an application. If configured + value is not in the range of device capability, the default value will be set + with a warning message. The default value is 11 which is 2048 bytes per a + stride, valid only if ``mprq_en`` is set. + - ``mprq_max_memcpy_len`` parameter [int] The maximum length of packet to memcpy in case of Multi-Packet Rx queue. Rx diff --git a/doc/guides/rel_notes/release_20_05.rst b/doc/guides/rel_notes/release_20_05.rst index c960fd2..1459218 100644 --- a/doc/guides/rel_notes/release_20_05.rst +++ b/doc/guides/rel_notes/release_20_05.rst @@ -62,6 +62,7 @@ New Features * Added support for matching on IPv4 Time To Live and IPv6 Hop Limit. * Added support for creating Relaxed Ordering Memory Regions. + * Added support for 9000 MTU in Multi-Packet RQ mode. Removed Items ------------- -- 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
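To make the documented defaults concrete, here is a small self-contained example of the resulting per-queue buffer footprint. The 1024-descriptor queue size and the per-WQE buffer layout (number of strides times stride size) are assumptions for illustration, not text from the patch; the descriptor trimming (desc >> stride_num) follows the rxq setup code earlier in the series.

#include <inttypes.h>
#include <stdio.h>

int main(void)
{
	/* Defaults from the documentation above. */
	unsigned int log_strd_num = 6;   /* 64 strides per WQE */
	unsigned int log_strd_sz = 11;   /* 2048 bytes per stride */
	unsigned int desc = 1024;        /* assumed Rx queue size */
	uint64_t wqe_buf = (1ULL << log_strd_num) << log_strd_sz;
	unsigned int wqes = desc >> log_strd_num;

	/* 64 * 2048 B = 128 KiB per WQE; 16 WQEs -> 2 MiB of Rx buffers. */
	printf("per-WQE buffer: %" PRIu64 " B, WQEs: %u, total: %" PRIu64 " B\n",
	       wqe_buf, wqes, wqe_buf * wqes);
	return 0;
}

On the command line the parameter would be passed together with the other mlx5 devargs, e.g. something like mprq_en=1,mprq_log_stride_size=11 appended to the device's devargs string, assuming the usual syntax shown elsewhere in mlx5.rst.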
* [dpdk-dev] [PATCH 4/4] net/mlx5: add multi-segment packets in MPRQ mode 2020-03-31 21:52 [dpdk-dev] [PATCH 0/4] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev ` (2 preceding siblings ...) 2020-03-31 21:52 ` [dpdk-dev] [PATCH 3/4] doc: add a decsription for MPRQ stride size devarg Alexander Kozyrev @ 2020-03-31 21:52 ` Alexander Kozyrev 2020-04-02 10:02 ` Slava Ovsiienko 2020-04-02 18:11 ` [dpdk-dev] [PATCH 0/3] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev 2020-04-09 21:24 ` [dpdk-dev] [PATCH v3 " Alexander Kozyrev 5 siblings, 1 reply; 28+ messages in thread From: Alexander Kozyrev @ 2020-03-31 21:52 UTC (permalink / raw) To: dev; +Cc: rasland, matan, viacheslavo, ferruh.yigit, thomas The multi-stride operations now allow to reduce a stride size while supporting Jumbo frames. That means that it is possible to have mbufs configured with a size smaller than the whole packet received. It is not an issue during normal MPRQ operations since we attach external buffers instead of copying the data into the mbuf itself. But it is not the case in "emergency mode" when we have to copy every packet because of no more external mbufs are available. Assemble a multi-segment packet to overcome this issue in case scatter mode is enabled, drop a packet if not. Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> --- drivers/net/mlx5/mlx5_rxtx.c | 47 ++++++++++++++++++++++++++++++++++++-------- 1 file changed, 39 insertions(+), 8 deletions(-) diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c index 4c27952..7ce3732 100644 --- a/drivers/net/mlx5/mlx5_rxtx.c +++ b/drivers/net/mlx5/mlx5_rxtx.c @@ -1734,22 +1734,52 @@ enum mlx5_txcmp_code { * Memcpy packets to the target mbuf if: * - The size of packet is smaller than mprq_max_memcpy_len. * - Out of buffer in the Mempool for Multi-Packet RQ. - * - There is no space for a headroom and scatter is disabled. + * - The packet's stride overlaps a headroom and scatter is off. */ if (len <= rxq->mprq_max_memcpy_len || rxq->mprq_repl == NULL || (hdrm_overlap > 0 && !rxq->strd_scatter_en)) { - /* - * When memcpy'ing packet due to out-of-buffer, the - * packet must be smaller than the target mbuf. - */ - if (unlikely(rte_pktmbuf_tailroom(pkt) < len)) { + if (likely(rte_pktmbuf_tailroom(pkt) >= len)) { + rte_memcpy(rte_pktmbuf_mtod(pkt, void *), + addr, len); + DATA_LEN(pkt) = len; + } else if (rxq->strd_scatter_en) { + struct rte_mbuf *prev = pkt; + uint32_t seg_len = + RTE_MIN(rte_pktmbuf_tailroom(pkt), len); + uint32_t rem_len = len - seg_len; + + rte_memcpy(rte_pktmbuf_mtod(pkt, void *), + addr, seg_len); + DATA_LEN(pkt) = seg_len; + while (rem_len) { + struct rte_mbuf *next = + rte_pktmbuf_alloc(rxq->mp); + + if (unlikely(next == NULL)) { + rte_pktmbuf_free(pkt); + ++rxq->stats.rx_nombuf; + goto out; + } + NEXT(prev) = next; + SET_DATA_OFF(next, 0); + addr = RTE_PTR_ADD(addr, seg_len); + seg_len = RTE_MIN + (rte_pktmbuf_tailroom(next), + rem_len); + rte_memcpy + (rte_pktmbuf_mtod(next, void *), + addr, seg_len); + DATA_LEN(next) = seg_len; + rem_len -= seg_len; + prev = next; + ++NB_SEGS(pkt); + } + } else { rte_pktmbuf_free_seg(pkt); ++rxq->stats.idropped; continue; } - rte_memcpy(rte_pktmbuf_mtod(pkt, void *), addr, len); - DATA_LEN(pkt) = len; } else { rte_iova_t buf_iova; struct rte_mbuf_ext_shared_info *shinfo; @@ -1826,6 +1856,7 @@ enum mlx5_txcmp_code { *(pkts++) = pkt; ++i; } +out: /* Update the consumer indexes. 
*/ rxq->consumed_strd = consumed_strd; rte_cio_wmb(); -- 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
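For clarity, here is the copy-and-chain loop above reduced to plain C buffers. The struct seg type and the fixed 2048-byte segment size stand in for rte_mbuf and rte_pktmbuf_tailroom(), so this is only a sketch of the segmentation logic under those assumptions, not driver code.

#include <stdint.h>
#include <string.h>

struct seg {
	uint8_t data[2048];
	uint32_t data_len;
	struct seg *next;
};

/* Split "len" bytes at "src" across up to "nb_segs" chained segments,
 * as the emergency-copy path does with mbufs.  Returns the number of
 * segments used, or -1 when the chain runs out (packet is dropped). */
static int
copy_into_chain(struct seg *segs, unsigned int nb_segs,
		const uint8_t *src, uint32_t len)
{
	struct seg *prev = NULL;
	unsigned int used = 0;

	do {
		uint32_t seg_len;
		struct seg *cur;

		if (used == nb_segs)
			return -1;
		cur = &segs[used++];
		seg_len = len < sizeof(cur->data) ? len : sizeof(cur->data);
		memcpy(cur->data, src, seg_len);
		cur->data_len = seg_len;
		cur->next = NULL;
		if (prev != NULL)
			prev->next = cur;
		prev = cur;
		src += seg_len;
		len -= seg_len;
	} while (len > 0);
	return (int)used;
}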
* Re: [dpdk-dev] [PATCH 4/4] net/mlx5: add multi-segment packets in MPRQ mode 2020-03-31 21:52 ` [dpdk-dev] [PATCH 4/4] net/mlx5: add multi-segment packets in MPRQ mode Alexander Kozyrev @ 2020-04-02 10:02 ` Slava Ovsiienko 0 siblings, 0 replies; 28+ messages in thread From: Slava Ovsiienko @ 2020-04-02 10:02 UTC (permalink / raw) To: Alexander Kozyrev, dev Cc: Raslan Darawsheh, Matan Azrad, ferruh.yigit, Thomas Monjalon > -----Original Message----- > From: Alexander Kozyrev <akozyrev@mellanox.com> > Sent: Wednesday, April 1, 2020 0:53 > To: dev@dpdk.org > Cc: Raslan Darawsheh <rasland@mellanox.com>; Matan Azrad > <matan@mellanox.com>; Slava Ovsiienko <viacheslavo@mellanox.com>; > ferruh.yigit@intel.com; Thomas Monjalon <thomas@monjalon.net> > Subject: [PATCH 4/4] net/mlx5: add multi-segment packets in MPRQ mode > > The multi-stride operations now allow to reduce a stride size while supporting > Jumbo frames. That means that it is possible to have mbufs configured with a > size smaller than the whole packet received. It is not an issue during normal > MPRQ operations since we attach external buffers instead of copying the data > into the mbuf itself. But it is not the case in "emergency mode" > when we have to copy every packet because of no more external mbufs are > available. Assemble a multi-segment packet to overcome this issue in case > scatter mode is enabled, drop a packet if not. > > Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> > --- > drivers/net/mlx5/mlx5_rxtx.c | 47 > ++++++++++++++++++++++++++++++++++++-------- > 1 file changed, 39 insertions(+), 8 deletions(-) > > diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c index > 4c27952..7ce3732 100644 > --- a/drivers/net/mlx5/mlx5_rxtx.c > +++ b/drivers/net/mlx5/mlx5_rxtx.c > @@ -1734,22 +1734,52 @@ enum mlx5_txcmp_code { > * Memcpy packets to the target mbuf if: > * - The size of packet is smaller than > mprq_max_memcpy_len. > * - Out of buffer in the Mempool for Multi-Packet RQ. > - * - There is no space for a headroom and scatter is disabled. > + * - The packet's stride overlaps a headroom and scatter is off. > */ > if (len <= rxq->mprq_max_memcpy_len || > rxq->mprq_repl == NULL || > (hdrm_overlap > 0 && !rxq->strd_scatter_en)) { > - /* > - * When memcpy'ing packet due to out-of-buffer, the > - * packet must be smaller than the target mbuf. 
> - */ > - if (unlikely(rte_pktmbuf_tailroom(pkt) < len)) { > + if (likely(rte_pktmbuf_tailroom(pkt) >= len)) { > + rte_memcpy(rte_pktmbuf_mtod(pkt, void *), > + addr, len); > + DATA_LEN(pkt) = len; > + } else if (rxq->strd_scatter_en) { > + struct rte_mbuf *prev = pkt; > + uint32_t seg_len = > + RTE_MIN(rte_pktmbuf_tailroom(pkt), > len); > + uint32_t rem_len = len - seg_len; > + > + rte_memcpy(rte_pktmbuf_mtod(pkt, void *), > + addr, seg_len); > + DATA_LEN(pkt) = seg_len; > + while (rem_len) { > + struct rte_mbuf *next = > + rte_pktmbuf_alloc(rxq->mp); > + > + if (unlikely(next == NULL)) { > + rte_pktmbuf_free(pkt); > + ++rxq->stats.rx_nombuf; > + goto out; > + } > + NEXT(prev) = next; > + SET_DATA_OFF(next, 0); > + addr = RTE_PTR_ADD(addr, seg_len); > + seg_len = RTE_MIN > + (rte_pktmbuf_tailroom(next), > + rem_len); > + rte_memcpy > + (rte_pktmbuf_mtod(next, > void *), > + addr, seg_len); > + DATA_LEN(next) = seg_len; > + rem_len -= seg_len; > + prev = next; > + ++NB_SEGS(pkt); > + } > + } else { > rte_pktmbuf_free_seg(pkt); > ++rxq->stats.idropped; > continue; > } > - rte_memcpy(rte_pktmbuf_mtod(pkt, void *), addr, > len); > - DATA_LEN(pkt) = len; > } else { > rte_iova_t buf_iova; > struct rte_mbuf_ext_shared_info *shinfo; @@ - > 1826,6 +1856,7 @@ enum mlx5_txcmp_code { > *(pkts++) = pkt; > ++i; > } > +out: > /* Update the consumer indexes. */ > rxq->consumed_strd = consumed_strd; > rte_cio_wmb(); > -- > 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
* [dpdk-dev] [PATCH 0/3] net/mlx5: add large packet size support to MPRQ 2020-03-31 21:52 [dpdk-dev] [PATCH 0/4] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev ` (3 preceding siblings ...) 2020-03-31 21:52 ` [dpdk-dev] [PATCH 4/4] net/mlx5: add multi-segment packets in MPRQ mode Alexander Kozyrev @ 2020-04-02 18:11 ` Alexander Kozyrev 2020-04-02 18:11 ` [dpdk-dev] [PATCH 1/3] net/mlx5: add a devarg to specify MPRQ stride size Alexander Kozyrev ` (3 more replies) 2020-04-09 21:24 ` [dpdk-dev] [PATCH v3 " Alexander Kozyrev 5 siblings, 4 replies; 28+ messages in thread From: Alexander Kozyrev @ 2020-04-02 18:11 UTC (permalink / raw) To: dev; +Cc: rasland, matan, viacheslavo In order to support the 9K MTU, the MPRQ feature should be updated to allow a packet to take more than one stride (single linear buffer), i.e. receiving a packet into multiple adjacent strides should be implemented. What prevents a packet from being received into multiple strides is that the data buffer must be preceded by some HEAD_ROOM space. In the current implementation the HEAD_ROOM space is borrowed by the PMD from the tail of the preceding stride. If a packet takes multiple strides, the tail of the stride may be overwritten with packet data and that memory can't be borrowed to provide the HEAD_ROOM space for the next packet. Special care is needed to prevent HEAD_ROOM corruption: - copy the whole packet into a separate memory buffer if scatter is off - copy only the overlapping data and craft a multi-segment mbuf otherwise After multi-stride packet reception is in place it is possible to reduce the stride size for more efficient memory utilization. Introduce the mprq_log_stride_size device parameter to configure a stride size for MPRQ. The default stride size is set to 2048 bytes. Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> --- v1: https://patchwork.dpdk.org/cover/67558/ v2: merge documentation and implementation in one commit Alexander Kozyrev (3): net/mlx5: add a devarg to specify MPRQ stride size net/mlx5: enable MPRQ multi-stride operations net/mlx5: add multi-segment packets in MPRQ mode doc/guides/nics/mlx5.rst | 9 +++ doc/guides/rel_notes/release_20_05.rst | 1 + drivers/net/mlx5/mlx5.c | 34 ++++++++-- drivers/net/mlx5/mlx5.h | 1 + drivers/net/mlx5/mlx5_defs.h | 3 + drivers/net/mlx5/mlx5_rxq.c | 47 ++++++-------- drivers/net/mlx5/mlx5_rxtx.c | 113 +++++++++++++++++++-------------- drivers/net/mlx5/mlx5_rxtx.h | 2 +- 8 files changed, 128 insertions(+), 82 deletions(-) -- 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
* [dpdk-dev] [PATCH 1/3] net/mlx5: add a devarg to specify MPRQ stride size 2020-04-02 18:11 ` [dpdk-dev] [PATCH 0/3] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev @ 2020-04-02 18:11 ` Alexander Kozyrev 2020-04-02 18:11 ` [dpdk-dev] [PATCH 2/3] net/mlx5: enable MPRQ multi-stride operations Alexander Kozyrev ` (2 subsequent siblings) 3 siblings, 0 replies; 28+ messages in thread From: Alexander Kozyrev @ 2020-04-02 18:11 UTC (permalink / raw) To: dev; +Cc: rasland, matan, viacheslavo Define a device parameter to configure log 2 of a stride size for MPRQ - mprq_log_stride_size. User is able to specify a stride size in a range allowed by an underlying hardware. The default stride size is defined as 2048 bytes to encompass most commonly used packet sizes in the Internet (MTU 1518 and less) and will be used in case a maximum configured packet size cannot fit into the largest possible stride size. Otherwise a stride size is set to a large enough value to encompass a whole packet. Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> --- doc/guides/nics/mlx5.rst | 9 +++++++++ doc/guides/rel_notes/release_20_05.rst | 1 + drivers/net/mlx5/mlx5.c | 34 +++++++++++++++++++++++++++------- drivers/net/mlx5/mlx5.h | 1 + drivers/net/mlx5/mlx5_defs.h | 3 +++ drivers/net/mlx5/mlx5_rxq.c | 28 +++++++++++++++++----------- 6 files changed, 58 insertions(+), 18 deletions(-) diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index e13c07d..4e8c130 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -434,6 +434,15 @@ Run-time configuration The size of Rx queue should be bigger than the number of strides. +- ``mprq_log_stride_size`` parameter [int] + + Log 2 of the size of a stride for Multi-Packet Rx queue. Configuring a smaller + stride size can save some memory and reduce probability of a depletion of all + available strides due to unreleased packets by an application. If configured + value is not in the range of device capability, the default value will be set + with a warning message. The default value is 11 which is 2048 bytes per a + stride, valid only if ``mprq_en`` is set. + - ``mprq_max_memcpy_len`` parameter [int] The maximum length of packet to memcpy in case of Multi-Packet Rx queue. Rx diff --git a/doc/guides/rel_notes/release_20_05.rst b/doc/guides/rel_notes/release_20_05.rst index c960fd2..1459218 100644 --- a/doc/guides/rel_notes/release_20_05.rst +++ b/doc/guides/rel_notes/release_20_05.rst @@ -62,6 +62,7 @@ New Features * Added support for matching on IPv4 Time To Live and IPv6 Hop Limit. * Added support for creating Relaxed Ordering Memory Regions. + * Added support for 9000 MTU in Multi-Packet RQ mode. Removed Items ------------- diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index 6a11b14..a2ba6d3 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -63,6 +63,9 @@ /* Device parameter to configure log 2 of the number of strides for MPRQ. */ #define MLX5_RX_MPRQ_LOG_STRIDE_NUM "mprq_log_stride_num" +/* Device parameter to configure log 2 of the stride size for MPRQ. */ +#define MLX5_RX_MPRQ_LOG_STRIDE_SIZE "mprq_log_stride_size" + /* Device parameter to limit the size of memcpy'd packet for MPRQ. 
*/ #define MLX5_RX_MPRQ_MAX_MEMCPY_LEN "mprq_max_memcpy_len" @@ -1531,6 +1534,8 @@ struct mlx5_flow_id_pool * config->mprq.enabled = !!tmp; } else if (strcmp(MLX5_RX_MPRQ_LOG_STRIDE_NUM, key) == 0) { config->mprq.stride_num_n = tmp; + } else if (strcmp(MLX5_RX_MPRQ_LOG_STRIDE_SIZE, key) == 0) { + config->mprq.stride_size_n = tmp; } else if (strcmp(MLX5_RX_MPRQ_MAX_MEMCPY_LEN, key) == 0) { config->mprq.max_memcpy_len = tmp; } else if (strcmp(MLX5_RXQS_MIN_MPRQ, key) == 0) { @@ -1627,6 +1632,7 @@ struct mlx5_flow_id_pool * MLX5_RXQ_PKT_PAD_EN, MLX5_RX_MPRQ_EN, MLX5_RX_MPRQ_LOG_STRIDE_NUM, + MLX5_RX_MPRQ_LOG_STRIDE_SIZE, MLX5_RX_MPRQ_MAX_MEMCPY_LEN, MLX5_RXQS_MIN_MPRQ, MLX5_TXQ_INLINE, @@ -2302,8 +2308,6 @@ struct mlx5_flow_id_pool * mprq_caps.min_single_wqe_log_num_of_strides; mprq_max_stride_num_n = mprq_caps.max_single_wqe_log_num_of_strides; - config.mprq.stride_num_n = RTE_MAX(MLX5_MPRQ_STRIDE_NUM_N, - mprq_min_stride_num_n); } #endif if (RTE_CACHE_LINE_SIZE == 128 && @@ -2617,17 +2621,32 @@ struct mlx5_flow_id_pool * #endif } if (config.mprq.enabled && mprq) { - if (config.mprq.stride_num_n > mprq_max_stride_num_n || - config.mprq.stride_num_n < mprq_min_stride_num_n) { + if (config.mprq.stride_num_n && + (config.mprq.stride_num_n > mprq_max_stride_num_n || + config.mprq.stride_num_n < mprq_min_stride_num_n)) { config.mprq.stride_num_n = - RTE_MAX(MLX5_MPRQ_STRIDE_NUM_N, - mprq_min_stride_num_n); + RTE_MIN(RTE_MAX(MLX5_MPRQ_STRIDE_NUM_N, + mprq_min_stride_num_n), + mprq_max_stride_num_n); DRV_LOG(WARNING, "the number of strides" " for Multi-Packet RQ is out of range," " setting default value (%u)", 1 << config.mprq.stride_num_n); } + if (config.mprq.stride_size_n && + (config.mprq.stride_size_n > mprq_max_stride_size_n || + config.mprq.stride_size_n < mprq_min_stride_size_n)) { + config.mprq.stride_size_n = + RTE_MIN(RTE_MAX(MLX5_MPRQ_STRIDE_SIZE_N, + mprq_min_stride_size_n), + mprq_max_stride_size_n); + DRV_LOG(WARNING, + "the size of a stride" + " for Multi-Packet RQ is out of range," + " setting default value (%u)", + 1 << config.mprq.stride_size_n); + } config.mprq.min_stride_size_n = mprq_min_stride_size_n; config.mprq.max_stride_size_n = mprq_max_stride_size_n; } else if (config.mprq.enabled && !mprq) { @@ -3361,7 +3380,8 @@ struct mlx5_flow_id_pool * .mr_ext_memseg_en = 1, .mprq = { .enabled = 0, /* Disabled by default. */ - .stride_num_n = MLX5_MPRQ_STRIDE_NUM_N, + .stride_num_n = 0, + .stride_size_n = 0, .max_memcpy_len = MLX5_MPRQ_MEMCPY_DEFAULT_LEN, .min_rxqs_num = MLX5_MPRQ_MIN_RXQS, }, diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index 34ab475..65bc0dc 100644 --- a/drivers/net/mlx5/mlx5.h +++ b/drivers/net/mlx5/mlx5.h @@ -179,6 +179,7 @@ struct mlx5_dev_config { struct { unsigned int enabled:1; /* Whether MPRQ is enabled. */ unsigned int stride_num_n; /* Number of strides. */ + unsigned int stride_size_n; /* Size of a stride. */ unsigned int min_stride_size_n; /* Min size of a stride. */ unsigned int max_stride_size_n; /* Max size of a stride. */ unsigned int max_memcpy_len; diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h index 19e8253..260f584 100644 --- a/drivers/net/mlx5/mlx5_defs.h +++ b/drivers/net/mlx5/mlx5_defs.h @@ -143,6 +143,9 @@ /* Log 2 of the default number of strides per WQE for Multi-Packet RQ. */ #define MLX5_MPRQ_STRIDE_NUM_N 6U +/* Log 2 of the default size of a stride per WQE for Multi-Packet RQ. */ +#define MLX5_MPRQ_STRIDE_SIZE_N 11U + /* Two-byte shift is disabled for Multi-Packet RQ. 
*/ #define MLX5_MPRQ_TWO_BYTE_SHIFT 0 diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c index 0a95e3c..8f8c16b 100644 --- a/drivers/net/mlx5/mlx5_rxq.c +++ b/drivers/net/mlx5/mlx5_rxq.c @@ -1793,6 +1793,7 @@ struct mlx5_rxq_ctrl * struct mlx5_priv *priv = dev->data->dev_private; struct mlx5_rxq_ctrl *tmpl; unsigned int mb_len = rte_pktmbuf_data_room_size(mp); + unsigned int mprq_stride_nums; unsigned int mprq_stride_size; struct mlx5_dev_config *config = &priv->config; unsigned int strd_headroom_en; @@ -1856,25 +1857,27 @@ struct mlx5_rxq_ctrl * strd_headroom_en = 1; mprq_stride_size = non_scatter_min_mbuf_size; } + mprq_stride_nums = config->mprq.stride_num_n ? + config->mprq.stride_num_n : MLX5_MPRQ_STRIDE_NUM_N; + mprq_stride_size = (mprq_stride_size <= + (1U << config->mprq.max_stride_size_n)) ? + log2above(mprq_stride_size) : MLX5_MPRQ_STRIDE_SIZE_N; /* * This Rx queue can be configured as a Multi-Packet RQ if all of the * following conditions are met: * - MPRQ is enabled. * - The number of descs is more than the number of strides. - * - max_rx_pkt_len plus overhead is less than the max size of a - * stride. * Otherwise, enable Rx scatter if necessary. */ - if (mprq_en && - desc > (1U << config->mprq.stride_num_n) && - mprq_stride_size <= (1U << config->mprq.max_stride_size_n)) { + if (mprq_en && desc > (1U << mprq_stride_nums)) { /* TODO: Rx scatter isn't supported yet. */ tmpl->rxq.sges_n = 0; /* Trim the number of descs needed. */ - desc >>= config->mprq.stride_num_n; - tmpl->rxq.strd_num_n = config->mprq.stride_num_n; - tmpl->rxq.strd_sz_n = RTE_MAX(log2above(mprq_stride_size), - config->mprq.min_stride_size_n); + desc >>= mprq_stride_nums; + tmpl->rxq.strd_num_n = config->mprq.stride_num_n ? + config->mprq.stride_num_n : mprq_stride_nums; + tmpl->rxq.strd_sz_n = config->mprq.stride_size_n ? + config->mprq.stride_size_n : mprq_stride_size; tmpl->rxq.strd_shift_en = MLX5_MPRQ_TWO_BYTE_SHIFT; tmpl->rxq.strd_headroom_en = strd_headroom_en; tmpl->rxq.mprq_max_memcpy_len = RTE_MIN(first_mb_free_size, @@ -1924,9 +1927,12 @@ struct mlx5_rxq_ctrl * DRV_LOG(WARNING, "port %u MPRQ is requested but cannot be enabled" " (requested: desc = %u, stride_sz = %u," - " supported: min_stride_num = %u, max_stride_sz = %u).", - dev->data->port_id, desc, mprq_stride_size, + " supported: min_stride_num = %u, min_stride_sz = %u," + "max_stride_sz = %u).", + dev->data->port_id, desc, + (1 << config->mprq.stride_size_n), (1 << config->mprq.stride_num_n), + (1 << config->mprq.min_stride_size_n), (1 << config->mprq.max_stride_size_n)); DRV_LOG(DEBUG, "port %u maximum number of segments per packet: %u", dev->data->port_id, 1 << tmpl->rxq.sges_n); -- 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
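A note on the default derivation in the mlx5_rxq.c hunk above: when the devarg is unset, the log2 stride size is taken from log2above() of the minimal non-scatter mbuf size if that value fits the device maximum, otherwise MLX5_MPRQ_STRIDE_SIZE_N is used. Below is an illustrative stand-in for the driver's log2above() helper, assuming its usual ceiling-log2 semantics; it is not the PMD's actual implementation.

/*
 * Smallest n such that (1 << n) >= v, i.e. the log2 stride size needed
 * to hold a "v"-byte buffer in a single stride.
 */
static unsigned int
log2_above(unsigned int v)
{
	unsigned int n = 0;

	while ((1U << n) < v)
		n++;
	return n;
}

With a typical 1500-byte max_rx_pkt_len plus headroom and overhead the result is 11 (2048 bytes), which is presumably why 2048 is also the documented default.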
* [dpdk-dev] [PATCH 2/3] net/mlx5: enable MPRQ multi-stride operations 2020-04-02 18:11 ` [dpdk-dev] [PATCH 0/3] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev 2020-04-02 18:11 ` [dpdk-dev] [PATCH 1/3] net/mlx5: add a devarg to specify MPRQ stride size Alexander Kozyrev @ 2020-04-02 18:11 ` Alexander Kozyrev 2020-04-02 18:11 ` [dpdk-dev] [PATCH 3/3] net/mlx5: add multi-segment packets in MPRQ mode Alexander Kozyrev 2020-04-09 22:23 ` [dpdk-dev] [PATCH v4 0/3] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev 3 siblings, 0 replies; 28+ messages in thread From: Alexander Kozyrev @ 2020-04-02 18:11 UTC (permalink / raw) To: dev; +Cc: rasland, matan, viacheslavo MPRQ feature should be updated to allow a packet to be received into multiple strides in order to support the MTU exceeding 8KB. Special care is needed to prevent the headroom corruption in the multi-stride mode since the headroom space is borrowed by the PMD from the tail of the preceding stride. Copy the whole packet into a separate mbuf in this case or just the overlapping data if the Rx scattering is supported by an application. Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> --- drivers/net/mlx5/mlx5_rxq.c | 25 ++++------------ drivers/net/mlx5/mlx5_rxtx.c | 68 +++++++++++++++++++------------------------- drivers/net/mlx5/mlx5_rxtx.h | 2 +- 3 files changed, 35 insertions(+), 60 deletions(-) diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c index 8f8c16b..82efd32 100644 --- a/drivers/net/mlx5/mlx5_rxq.c +++ b/drivers/net/mlx5/mlx5_rxq.c @@ -1796,7 +1796,6 @@ struct mlx5_rxq_ctrl * unsigned int mprq_stride_nums; unsigned int mprq_stride_size; struct mlx5_dev_config *config = &priv->config; - unsigned int strd_headroom_en; /* * Always allocate extra slots, even if eventually * the vector Rx will not be used. @@ -1842,26 +1841,11 @@ struct mlx5_rxq_ctrl * tmpl->socket = socket; if (dev->data->dev_conf.intr_conf.rxq) tmpl->irq = 1; - /* - * LRO packet may consume all the stride memory, hence we cannot - * guaranty head-room near the packet memory in the stride. - * In this case scatter is, for sure, enabled and an empty mbuf may be - * added in the start for the head-room. - */ - if (lro_on_queue && RTE_PKTMBUF_HEADROOM > 0 && - non_scatter_min_mbuf_size > mb_len) { - strd_headroom_en = 0; - mprq_stride_size = RTE_MIN(max_rx_pkt_len, - 1u << config->mprq.max_stride_size_n); - } else { - strd_headroom_en = 1; - mprq_stride_size = non_scatter_min_mbuf_size; - } mprq_stride_nums = config->mprq.stride_num_n ? config->mprq.stride_num_n : MLX5_MPRQ_STRIDE_NUM_N; - mprq_stride_size = (mprq_stride_size <= - (1U << config->mprq.max_stride_size_n)) ? - log2above(mprq_stride_size) : MLX5_MPRQ_STRIDE_SIZE_N; + mprq_stride_size = non_scatter_min_mbuf_size <= + (1U << config->mprq.max_stride_size_n) ? + log2above(non_scatter_min_mbuf_size) : MLX5_MPRQ_STRIDE_SIZE_N; /* * This Rx queue can be configured as a Multi-Packet RQ if all of the * following conditions are met: @@ -1879,7 +1863,8 @@ struct mlx5_rxq_ctrl * tmpl->rxq.strd_sz_n = config->mprq.stride_size_n ? 
config->mprq.stride_size_n : mprq_stride_size; tmpl->rxq.strd_shift_en = MLX5_MPRQ_TWO_BYTE_SHIFT; - tmpl->rxq.strd_headroom_en = strd_headroom_en; + tmpl->rxq.strd_scatter_en = + !!(offloads & DEV_RX_OFFLOAD_SCATTER); tmpl->rxq.mprq_max_memcpy_len = RTE_MIN(first_mb_free_size, config->mprq.max_memcpy_len); max_lro_size = RTE_MIN(max_rx_pkt_len, diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c index f3bf763..4c27952 100644 --- a/drivers/net/mlx5/mlx5_rxtx.c +++ b/drivers/net/mlx5/mlx5_rxtx.c @@ -1658,21 +1658,20 @@ enum mlx5_txcmp_code { unsigned int i = 0; uint32_t rq_ci = rxq->rq_ci; uint16_t consumed_strd = rxq->consumed_strd; - uint16_t headroom_sz = rxq->strd_headroom_en * RTE_PKTMBUF_HEADROOM; struct mlx5_mprq_buf *buf = (*rxq->mprq_bufs)[rq_ci & wq_mask]; while (i < pkts_n) { struct rte_mbuf *pkt; void *addr; int ret; - unsigned int len; + uint32_t len; uint16_t strd_cnt; uint16_t strd_idx; uint32_t offset; uint32_t byte_cnt; + int32_t hdrm_overlap; volatile struct mlx5_mini_cqe8 *mcqe = NULL; uint32_t rss_hash_res = 0; - uint8_t lro_num_seg; if (consumed_strd == strd_n) { /* Replace WQE only if the buffer is still in use. */ @@ -1719,18 +1718,6 @@ enum mlx5_txcmp_code { MLX5_ASSERT(strd_idx < strd_n); MLX5_ASSERT(!((rte_be_to_cpu_16(cqe->wqe_id) ^ rq_ci) & wq_mask)); - lro_num_seg = cqe->lro_num_seg; - /* - * Currently configured to receive a packet per a stride. But if - * MTU is adjusted through kernel interface, device could - * consume multiple strides without raising an error. In this - * case, the packet should be dropped because it is bigger than - * the max_rx_pkt_len. - */ - if (unlikely(!lro_num_seg && strd_cnt > 1)) { - ++rxq->stats.idropped; - continue; - } pkt = rte_pktmbuf_alloc(rxq->mp); if (unlikely(pkt == NULL)) { ++rxq->stats.rx_nombuf; @@ -1742,12 +1729,16 @@ enum mlx5_txcmp_code { len -= RTE_ETHER_CRC_LEN; offset = strd_idx * strd_sz + strd_shift; addr = RTE_PTR_ADD(mlx5_mprq_buf_addr(buf, strd_n), offset); + hdrm_overlap = len + RTE_PKTMBUF_HEADROOM - strd_cnt * strd_sz; /* * Memcpy packets to the target mbuf if: * - The size of packet is smaller than mprq_max_memcpy_len. * - Out of buffer in the Mempool for Multi-Packet RQ. + * - There is no space for a headroom and scatter is disabled. */ - if (len <= rxq->mprq_max_memcpy_len || rxq->mprq_repl == NULL) { + if (len <= rxq->mprq_max_memcpy_len || + rxq->mprq_repl == NULL || + (hdrm_overlap > 0 && !rxq->strd_scatter_en)) { /* * When memcpy'ing packet due to out-of-buffer, the * packet must be smaller than the target mbuf. @@ -1769,7 +1760,7 @@ enum mlx5_txcmp_code { rte_atomic16_add_return(&buf->refcnt, 1); MLX5_ASSERT((uint16_t)rte_atomic16_read(&buf->refcnt) <= strd_n + 1); - buf_addr = RTE_PTR_SUB(addr, headroom_sz); + buf_addr = RTE_PTR_SUB(addr, RTE_PKTMBUF_HEADROOM); /* * MLX5 device doesn't use iova but it is necessary in a * case where the Rx packet is transmitted via a @@ -1788,43 +1779,42 @@ enum mlx5_txcmp_code { rte_pktmbuf_attach_extbuf(pkt, buf_addr, buf_iova, buf_len, shinfo); /* Set mbuf head-room. */ - pkt->data_off = headroom_sz; + SET_DATA_OFF(pkt, RTE_PKTMBUF_HEADROOM); MLX5_ASSERT(pkt->ol_flags == EXT_ATTACHED_MBUF); - /* - * Prevent potential overflow due to MTU change through - * kernel interface. - */ - if (unlikely(rte_pktmbuf_tailroom(pkt) < len)) { - rte_pktmbuf_free_seg(pkt); - ++rxq->stats.idropped; - continue; - } + MLX5_ASSERT(rte_pktmbuf_tailroom(pkt) < + len - (hdrm_overlap > 0 ? 
hdrm_overlap : 0)); DATA_LEN(pkt) = len; /* - * LRO packet may consume all the stride memory, in this - * case packet head-room space is not guaranteed so must - * to add an empty mbuf for the head-room. + * Copy the last fragment of a packet (up to headroom + * size bytes) in case there is a stride overlap with + * a next packet's headroom. Allocate a separate mbuf + * to store this fragment and link it. Scatter is on. */ - if (!rxq->strd_headroom_en) { - struct rte_mbuf *headroom_mbuf = - rte_pktmbuf_alloc(rxq->mp); + if (hdrm_overlap > 0) { + MLX5_ASSERT(rxq->strd_scatter_en); + struct rte_mbuf *seg = + rte_pktmbuf_alloc(rxq->mp); - if (unlikely(headroom_mbuf == NULL)) { + if (unlikely(seg == NULL)) { rte_pktmbuf_free_seg(pkt); ++rxq->stats.rx_nombuf; break; } - PORT(pkt) = rxq->port_id; - NEXT(headroom_mbuf) = pkt; - pkt = headroom_mbuf; + SET_DATA_OFF(seg, 0); + rte_memcpy(rte_pktmbuf_mtod(seg, void *), + RTE_PTR_ADD(addr, len - hdrm_overlap), + hdrm_overlap); + DATA_LEN(seg) = hdrm_overlap; + DATA_LEN(pkt) = len - hdrm_overlap; + NEXT(pkt) = seg; NB_SEGS(pkt) = 2; } } rxq_cq_to_mbuf(rxq, pkt, cqe, rss_hash_res); - if (lro_num_seg > 1) { + if (cqe->lro_num_seg > 1) { mlx5_lro_update_hdr(addr, cqe, len); pkt->ol_flags |= PKT_RX_LRO; - pkt->tso_segsz = strd_sz; + pkt->tso_segsz = len / cqe->lro_num_seg; } PKT_LEN(pkt) = len; PORT(pkt) = rxq->port_id; diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h index 939778a..d155c24 100644 --- a/drivers/net/mlx5/mlx5_rxtx.h +++ b/drivers/net/mlx5/mlx5_rxtx.h @@ -119,7 +119,7 @@ struct mlx5_rxq_data { unsigned int strd_sz_n:4; /* Log 2 of stride size. */ unsigned int strd_shift_en:1; /* Enable 2bytes shift on a stride. */ unsigned int err_state:2; /* enum mlx5_rxq_err_state. */ - unsigned int strd_headroom_en:1; /* Enable mbuf headroom in MPRQ. */ + unsigned int strd_scatter_en:1; /* Scattered packets from a stride. */ unsigned int lro:1; /* Enable LRO. */ unsigned int :1; /* Remaining bits. */ volatile uint32_t *rq_db; -- 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
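The central quantity in the headroom handling above is how far the packet plus RTE_PKTMBUF_HEADROOM overruns the strides it actually consumed. A compact, self-contained restatement of that decision follows; the names are local to this sketch, a fixed 128-byte headroom and memcpy threshold are assumed, and it is not the datapath code itself.

#include <stdint.h>
#include <stdio.h>

#define HEADROOM 128U /* stand-in for RTE_PKTMBUF_HEADROOM */

/* How an MPRQ packet would be delivered:
 *   0 - attach the stride memory as an external buffer (no copy),
 *   1 - attach, plus copy the tail overlapping the next headroom into a
 *       linked segment (requires Rx scatter),
 *   2 - copy the whole packet into a regular mbuf. */
static int
mprq_delivery_mode(uint32_t len, uint16_t strd_cnt, uint32_t strd_sz,
		   int scatter_en, int have_repl_buf, uint32_t max_memcpy_len)
{
	int32_t hdrm_overlap = (int32_t)(len + HEADROOM - strd_cnt * strd_sz);

	if (len <= max_memcpy_len || !have_repl_buf ||
	    (hdrm_overlap > 0 && !scatter_en))
		return 2;
	if (hdrm_overlap > 0)
		return 1;
	return 0;
}

int main(void)
{
	/* 1400B packet in one 2048B stride: headroom fits, plain attach (0). */
	printf("%d\n", mprq_delivery_mode(1400, 1, 2048, 1, 1, 128));
	/* 8150B packet over four 2048B strides: 8150+128 > 8192, tail copy (1). */
	printf("%d\n", mprq_delivery_mode(8150, 4, 2048, 1, 1, 128));
	/* Same packet with scatter off: fall back to a full copy (2). */
	printf("%d\n", mprq_delivery_mode(8150, 4, 2048, 0, 1, 128));
	return 0;
}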
* [dpdk-dev] [PATCH 3/3] net/mlx5: add multi-segment packets in MPRQ mode 2020-04-02 18:11 ` [dpdk-dev] [PATCH 0/3] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev 2020-04-02 18:11 ` [dpdk-dev] [PATCH 1/3] net/mlx5: add a devarg to specify MPRQ stride size Alexander Kozyrev 2020-04-02 18:11 ` [dpdk-dev] [PATCH 2/3] net/mlx5: enable MPRQ multi-stride operations Alexander Kozyrev @ 2020-04-02 18:11 ` Alexander Kozyrev 2020-04-09 22:23 ` [dpdk-dev] [PATCH v4 0/3] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev 3 siblings, 0 replies; 28+ messages in thread From: Alexander Kozyrev @ 2020-04-02 18:11 UTC (permalink / raw) To: dev; +Cc: rasland, matan, viacheslavo The multi-stride operations now allow to reduce a stride size while supporting Jumbo frames. That means that it is possible to have mbufs configured with a size smaller than the whole packet received. It is not an issue during normal MPRQ operations since we attach external buffers instead of copying the data into the mbuf itself. But it is not the case in "emergency mode" when we have to copy every packet because of no more external mbufs are available. Assemble a multi-segment packet to overcome this issue in case scatter mode is enabled, drop a packet if not. Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> --- drivers/net/mlx5/mlx5_rxtx.c | 47 ++++++++++++++++++++++++++++++++++++-------- 1 file changed, 39 insertions(+), 8 deletions(-) diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c index 4c27952..7ce3732 100644 --- a/drivers/net/mlx5/mlx5_rxtx.c +++ b/drivers/net/mlx5/mlx5_rxtx.c @@ -1734,22 +1734,52 @@ enum mlx5_txcmp_code { * Memcpy packets to the target mbuf if: * - The size of packet is smaller than mprq_max_memcpy_len. * - Out of buffer in the Mempool for Multi-Packet RQ. - * - There is no space for a headroom and scatter is disabled. + * - The packet's stride overlaps a headroom and scatter is off. */ if (len <= rxq->mprq_max_memcpy_len || rxq->mprq_repl == NULL || (hdrm_overlap > 0 && !rxq->strd_scatter_en)) { - /* - * When memcpy'ing packet due to out-of-buffer, the - * packet must be smaller than the target mbuf. - */ - if (unlikely(rte_pktmbuf_tailroom(pkt) < len)) { + if (likely(rte_pktmbuf_tailroom(pkt) >= len)) { + rte_memcpy(rte_pktmbuf_mtod(pkt, void *), + addr, len); + DATA_LEN(pkt) = len; + } else if (rxq->strd_scatter_en) { + struct rte_mbuf *prev = pkt; + uint32_t seg_len = + RTE_MIN(rte_pktmbuf_tailroom(pkt), len); + uint32_t rem_len = len - seg_len; + + rte_memcpy(rte_pktmbuf_mtod(pkt, void *), + addr, seg_len); + DATA_LEN(pkt) = seg_len; + while (rem_len) { + struct rte_mbuf *next = + rte_pktmbuf_alloc(rxq->mp); + + if (unlikely(next == NULL)) { + rte_pktmbuf_free(pkt); + ++rxq->stats.rx_nombuf; + goto out; + } + NEXT(prev) = next; + SET_DATA_OFF(next, 0); + addr = RTE_PTR_ADD(addr, seg_len); + seg_len = RTE_MIN + (rte_pktmbuf_tailroom(next), + rem_len); + rte_memcpy + (rte_pktmbuf_mtod(next, void *), + addr, seg_len); + DATA_LEN(next) = seg_len; + rem_len -= seg_len; + prev = next; + ++NB_SEGS(pkt); + } + } else { rte_pktmbuf_free_seg(pkt); ++rxq->stats.idropped; continue; } - rte_memcpy(rte_pktmbuf_mtod(pkt, void *), addr, len); - DATA_LEN(pkt) = len; } else { rte_iova_t buf_iova; struct rte_mbuf_ext_shared_info *shinfo; @@ -1826,6 +1856,7 @@ enum mlx5_txcmp_code { *(pkts++) = pkt; ++i; } +out: /* Update the consumer indexes. 
*/ rxq->consumed_strd = consumed_strd; rte_cio_wmb(); -- 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
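The "emergency" copy path added here is, at its core, a scatter-copy of one contiguous stride buffer into a chain of fixed-capacity buffers. The sketch below illustrates that loop with plain calloc'd segments standing in for chained rte_mbufs; all names are local to the example and the 512-byte segment capacity is arbitrary.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct seg {			/* stand-in for a chained rte_mbuf */
	struct seg *next;
	uint32_t data_len;
	uint8_t data[512];	/* per-segment capacity ("tailroom") */
};

/* Copy len bytes from a contiguous buffer into a chain of segments.
 * On allocation failure free what was built and return NULL, mirroring
 * the rx_nombuf handling in the patch. */
static struct seg *
copy_to_seg_chain(const uint8_t *addr, uint32_t len)
{
	struct seg *head = NULL, *prev = NULL;

	while (len) {
		struct seg *s = calloc(1, sizeof(*s));
		uint32_t chunk;

		if (s == NULL) {
			while (head != NULL) {
				struct seg *n = head->next;
				free(head);
				head = n;
			}
			return NULL;
		}
		chunk = len < sizeof(s->data) ? len : (uint32_t)sizeof(s->data);
		memcpy(s->data, addr, chunk);
		s->data_len = chunk;
		if (prev != NULL)
			prev->next = s;
		else
			head = s;
		prev = s;
		addr += chunk;
		len -= chunk;
	}
	return head;
}

int main(void)
{
	uint8_t pkt[1500];
	unsigned int nb_segs = 0;
	struct seg *chain;

	memset(pkt, 0xab, sizeof(pkt));
	chain = copy_to_seg_chain(pkt, sizeof(pkt));
	for (struct seg *it = chain; it != NULL; it = it->next)
		nb_segs++;
	printf("segments used: %u\n", nb_segs); /* 1500B over 512B segments -> 3 */
	return 0;
}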
* [dpdk-dev] [PATCH v4 0/3] net/mlx5: add large packet size support to MPRQ 2020-04-02 18:11 ` [dpdk-dev] [PATCH 0/3] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev ` (2 preceding siblings ...) 2020-04-02 18:11 ` [dpdk-dev] [PATCH 3/3] net/mlx5: add multi-segment packets in MPRQ mode Alexander Kozyrev @ 2020-04-09 22:23 ` Alexander Kozyrev 2020-04-09 22:23 ` [dpdk-dev] [PATCH v4 1/3] net/mlx5: add a devarg to specify MPRQ stride size Alexander Kozyrev ` (4 more replies) 3 siblings, 5 replies; 28+ messages in thread From: Alexander Kozyrev @ 2020-04-09 22:23 UTC (permalink / raw) To: dev; +Cc: rasland, matan, viacheslavo In order to support the 9K MTU the MPRQ feature should be updated to allow a packet to take more than one stride (single linear buffer). Receiving a packet into multiple adjacent strides should be implemented. The reason preventing the packet to be received into multiple strides is that the data buffer must be preceded with some HEAD_ROOM space. In the current implementation the HEAD_ROOM space is borrowed by the PMD from the tail of the preceding stride. If packet takes multiple strides the tail of stride may be overwritten with a packet data and the memory can't be borrowed to provide the HEAD_ROOM space for the next packet. Special care is needed to prevent the HEAD_ROOM corruption as such: - copy a whole packet into a separate memory buffer if scatter is off - copy an overlapping data only and craft a multi-segment mbuf otherwise After multi-stride support for packets receiving is in place it is possible to reduce the stride size for more efficient memory utilization. Introduce the mprq_log_stride_size device parameter to configure a stride size for MPRQ. Default stride size is set to 2048 bytes. Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> --- v1: https://patchwork.dpdk.org/cover/67558/ v2: https://patchwork.dpdk.org/cover/67670/ merge documentation and implementation in one commit v3: https://patchwork.dpdk.org/patch/68085/ rollback to simple burst Rx in case the packet size is too big to fit into the stride and the stride size is not configured v4: fix typo in code comments Alexander Kozyrev (3): net/mlx5: add a devarg to specify MPRQ stride size net/mlx5: enable MPRQ multi-stride operations net/mlx5: add multi-segment packets in MPRQ mode doc/guides/nics/mlx5.rst | 17 ++++- doc/guides/rel_notes/release_20_05.rst | 1 + drivers/net/mlx5/mlx5.c | 34 ++++++++-- drivers/net/mlx5/mlx5.h | 1 + drivers/net/mlx5/mlx5_defs.h | 3 + drivers/net/mlx5/mlx5_rxq.c | 70 +++++++++++--------- drivers/net/mlx5/mlx5_rxtx.c | 113 +++++++++++++++++++-------------- drivers/net/mlx5/mlx5_rxtx.h | 2 +- 8 files changed, 154 insertions(+), 87 deletions(-) -- 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
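As a usage illustration only (the PCI address below is made up and the option list is just one possible combination of the devargs discussed in this series): with these patches applied, 2048-byte strides with jumbo-capable MPRQ would be requested through the usual mlx5 devargs string on the EAL command line, as in this minimal init sketch.

#include <stdio.h>
#include <rte_eal.h>

int
main(int argc, char **argv)
{
	/*
	 * Hypothetical invocation (PCI address is an example, not from the patches):
	 *   ./app -w 0000:03:00.0,mprq_en=1,mprq_log_stride_num=6,mprq_log_stride_size=11
	 * mprq_log_stride_size=11 selects 2048-byte strides; packets larger than
	 * one stride then span several adjacent strides as described above.
	 */
	if (rte_eal_init(argc, argv) < 0) {
		printf("EAL initialization failed\n");
		return 1;
	}
	/* ... port/queue setup and the Rx burst loop would follow here ... */
	rte_eal_cleanup();
	return 0;
}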
* [dpdk-dev] [PATCH v4 1/3] net/mlx5: add a devarg to specify MPRQ stride size 2020-04-09 22:23 ` [dpdk-dev] [PATCH v4 0/3] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev @ 2020-04-09 22:23 ` Alexander Kozyrev 2020-04-14 11:42 ` Ferruh Yigit 2020-04-09 22:23 ` [dpdk-dev] [PATCH v4 2/3] net/mlx5: enable MPRQ multi-stride operations Alexander Kozyrev ` (3 subsequent siblings) 4 siblings, 1 reply; 28+ messages in thread From: Alexander Kozyrev @ 2020-04-09 22:23 UTC (permalink / raw) To: dev; +Cc: rasland, matan, viacheslavo, stable Define a device parameter to configure log 2 of a stride size for MPRQ - mprq_log_stride_size. User is able to specify a stride size in a range allowed by an underlying hardware. The default stride size is defined as 2048 bytes to encompass most commonly used packet sizes in the Internet (MTU 1518 and less) and will be used in case a maximum configured packet size cannot fit into the largest possible stride size. Otherwise a stride size is set to a large enough value to encompass a whole packet. Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> --- doc/guides/nics/mlx5.rst | 17 +++++++++-- doc/guides/rel_notes/release_20_05.rst | 1 + drivers/net/mlx5/mlx5.c | 34 +++++++++++++++++----- drivers/net/mlx5/mlx5.h | 1 + drivers/net/mlx5/mlx5_defs.h | 3 ++ drivers/net/mlx5/mlx5_rxq.c | 52 +++++++++++++++++++++++++--------- 6 files changed, 85 insertions(+), 23 deletions(-) diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index e13c07d..759d0ac 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -408,8 +408,7 @@ Run-time configuration A nonzero value enables configuring Multi-Packet Rx queues. Rx queue is configured as Multi-Packet RQ if the total number of Rx queues is - ``rxqs_min_mprq`` or more and Rx scatter isn't configured. Disabled by - default. + ``rxqs_min_mprq`` or more. Disabled by default. Multi-Packet Rx Queue (MPRQ a.k.a Striding RQ) can further save PCIe bandwidth by posting a single large buffer for multiple packets. Instead of posting a @@ -434,6 +433,20 @@ Run-time configuration The size of Rx queue should be bigger than the number of strides. +- ``mprq_log_stride_size`` parameter [int] + + Log 2 of the size of a stride for Multi-Packet Rx queue. Configuring a smaller + stride size can save some memory and reduce probability of a depletion of all + available strides due to unreleased packets by an application. If configured + value is not in the range of device capability, the default value will be set + with a warning message. The default value is 11 which is 2048 bytes per a + stride, valid only if ``mprq_en`` is set. With ``mprq_log_stride_size`` set + it is possible for a pcaket to span across multiple strides. This mode allows + support of jumbo frames (9K) with MPRQ. The memcopy of some packets (or part + of a packet if Rx scatter is configured) may be required in case there is no + space left for a head room at the end of a stride which incurs some + performance penalty. + - ``mprq_max_memcpy_len`` parameter [int] The maximum length of packet to memcpy in case of Multi-Packet Rx queue. Rx diff --git a/doc/guides/rel_notes/release_20_05.rst b/doc/guides/rel_notes/release_20_05.rst index 2596269..586a442 100644 --- a/doc/guides/rel_notes/release_20_05.rst +++ b/doc/guides/rel_notes/release_20_05.rst @@ -62,6 +62,7 @@ New Features * Added support for matching on IPv4 Time To Live and IPv6 Hop Limit. 
* Added support for creating Relaxed Ordering Memory Regions. + * Added support for jumbo frame size (9K MTU) in Multi-Packet RQ mode. * **Updated the Intel ice driver.** diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index efdd53c..293d316 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -63,6 +63,9 @@ /* Device parameter to configure log 2 of the number of strides for MPRQ. */ #define MLX5_RX_MPRQ_LOG_STRIDE_NUM "mprq_log_stride_num" +/* Device parameter to configure log 2 of the stride size for MPRQ. */ +#define MLX5_RX_MPRQ_LOG_STRIDE_SIZE "mprq_log_stride_size" + /* Device parameter to limit the size of memcpy'd packet for MPRQ. */ #define MLX5_RX_MPRQ_MAX_MEMCPY_LEN "mprq_max_memcpy_len" @@ -1533,6 +1536,8 @@ struct mlx5_flow_id_pool * config->mprq.enabled = !!tmp; } else if (strcmp(MLX5_RX_MPRQ_LOG_STRIDE_NUM, key) == 0) { config->mprq.stride_num_n = tmp; + } else if (strcmp(MLX5_RX_MPRQ_LOG_STRIDE_SIZE, key) == 0) { + config->mprq.stride_size_n = tmp; } else if (strcmp(MLX5_RX_MPRQ_MAX_MEMCPY_LEN, key) == 0) { config->mprq.max_memcpy_len = tmp; } else if (strcmp(MLX5_RXQS_MIN_MPRQ, key) == 0) { @@ -1629,6 +1634,7 @@ struct mlx5_flow_id_pool * MLX5_RXQ_PKT_PAD_EN, MLX5_RX_MPRQ_EN, MLX5_RX_MPRQ_LOG_STRIDE_NUM, + MLX5_RX_MPRQ_LOG_STRIDE_SIZE, MLX5_RX_MPRQ_MAX_MEMCPY_LEN, MLX5_RXQS_MIN_MPRQ, MLX5_TXQ_INLINE, @@ -2304,8 +2310,6 @@ struct mlx5_flow_id_pool * mprq_caps.min_single_wqe_log_num_of_strides; mprq_max_stride_num_n = mprq_caps.max_single_wqe_log_num_of_strides; - config.mprq.stride_num_n = RTE_MAX(MLX5_MPRQ_STRIDE_NUM_N, - mprq_min_stride_num_n); } #endif if (RTE_CACHE_LINE_SIZE == 128 && @@ -2619,17 +2623,32 @@ struct mlx5_flow_id_pool * #endif } if (config.mprq.enabled && mprq) { - if (config.mprq.stride_num_n > mprq_max_stride_num_n || - config.mprq.stride_num_n < mprq_min_stride_num_n) { + if (config.mprq.stride_num_n && + (config.mprq.stride_num_n > mprq_max_stride_num_n || + config.mprq.stride_num_n < mprq_min_stride_num_n)) { config.mprq.stride_num_n = - RTE_MAX(MLX5_MPRQ_STRIDE_NUM_N, - mprq_min_stride_num_n); + RTE_MIN(RTE_MAX(MLX5_MPRQ_STRIDE_NUM_N, + mprq_min_stride_num_n), + mprq_max_stride_num_n); DRV_LOG(WARNING, "the number of strides" " for Multi-Packet RQ is out of range," " setting default value (%u)", 1 << config.mprq.stride_num_n); } + if (config.mprq.stride_size_n && + (config.mprq.stride_size_n > mprq_max_stride_size_n || + config.mprq.stride_size_n < mprq_min_stride_size_n)) { + config.mprq.stride_size_n = + RTE_MIN(RTE_MAX(MLX5_MPRQ_STRIDE_SIZE_N, + mprq_min_stride_size_n), + mprq_max_stride_size_n); + DRV_LOG(WARNING, + "the size of a stride" + " for Multi-Packet RQ is out of range," + " setting default value (%u)", + 1 << config.mprq.stride_size_n); + } config.mprq.min_stride_size_n = mprq_min_stride_size_n; config.mprq.max_stride_size_n = mprq_max_stride_size_n; } else if (config.mprq.enabled && !mprq) { @@ -3363,7 +3382,8 @@ struct mlx5_flow_id_pool * .mr_ext_memseg_en = 1, .mprq = { .enabled = 0, /* Disabled by default. */ - .stride_num_n = MLX5_MPRQ_STRIDE_NUM_N, + .stride_num_n = 0, + .stride_size_n = 0, .max_memcpy_len = MLX5_MPRQ_MEMCPY_DEFAULT_LEN, .min_rxqs_num = MLX5_MPRQ_MIN_RXQS, }, diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index 396dba7..39af7ab 100644 --- a/drivers/net/mlx5/mlx5.h +++ b/drivers/net/mlx5/mlx5.h @@ -179,6 +179,7 @@ struct mlx5_dev_config { struct { unsigned int enabled:1; /* Whether MPRQ is enabled. */ unsigned int stride_num_n; /* Number of strides. 
*/ + unsigned int stride_size_n; /* Size of a stride. */ unsigned int min_stride_size_n; /* Min size of a stride. */ unsigned int max_stride_size_n; /* Max size of a stride. */ unsigned int max_memcpy_len; diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h index 19e8253..260f584 100644 --- a/drivers/net/mlx5/mlx5_defs.h +++ b/drivers/net/mlx5/mlx5_defs.h @@ -143,6 +143,9 @@ /* Log 2 of the default number of strides per WQE for Multi-Packet RQ. */ #define MLX5_MPRQ_STRIDE_NUM_N 6U +/* Log 2 of the default size of a stride per WQE for Multi-Packet RQ. */ +#define MLX5_MPRQ_STRIDE_SIZE_N 11U + /* Two-byte shift is disabled for Multi-Packet RQ. */ #define MLX5_MPRQ_TWO_BYTE_SHIFT 0 diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c index 0a95e3c..1b57f00 100644 --- a/drivers/net/mlx5/mlx5_rxq.c +++ b/drivers/net/mlx5/mlx5_rxq.c @@ -1793,7 +1793,9 @@ struct mlx5_rxq_ctrl * struct mlx5_priv *priv = dev->data->dev_private; struct mlx5_rxq_ctrl *tmpl; unsigned int mb_len = rte_pktmbuf_data_room_size(mp); + unsigned int mprq_stride_nums; unsigned int mprq_stride_size; + unsigned int mprq_stride_cap; struct mlx5_dev_config *config = &priv->config; unsigned int strd_headroom_en; /* @@ -1856,25 +1858,40 @@ struct mlx5_rxq_ctrl * strd_headroom_en = 1; mprq_stride_size = non_scatter_min_mbuf_size; } + mprq_stride_nums = config->mprq.stride_num_n ? + config->mprq.stride_num_n : MLX5_MPRQ_STRIDE_NUM_N; + mprq_stride_size = (mprq_stride_size <= + (1U << config->mprq.max_stride_size_n)) ? + log2above(mprq_stride_size) : MLX5_MPRQ_STRIDE_SIZE_N; + mprq_stride_cap = (config->mprq.stride_num_n ? + (1U << config->mprq.stride_num_n) : (1U << mprq_stride_nums)) * + (config->mprq.stride_size_n ? + (1U << config->mprq.stride_size_n) : (1U << mprq_stride_size)); /* * This Rx queue can be configured as a Multi-Packet RQ if all of the * following conditions are met: * - MPRQ is enabled. * - The number of descs is more than the number of strides. - * - max_rx_pkt_len plus overhead is less than the max size of a - * stride. + * - max_rx_pkt_len plus overhead is less than the max size + * of a stride or mprq_stride_size is specified by a user. + * Need to nake sure that there are enough stides to encap + * the maximum packet size in case mprq_stride_size is set. * Otherwise, enable Rx scatter if necessary. */ - if (mprq_en && - desc > (1U << config->mprq.stride_num_n) && - mprq_stride_size <= (1U << config->mprq.max_stride_size_n)) { + if (mprq_en && desc > (1U << mprq_stride_nums) && + (non_scatter_min_mbuf_size - + (lro_on_queue ? RTE_PKTMBUF_HEADROOM : 0) <= + (1U << config->mprq.max_stride_size_n) || + (config->mprq.stride_size_n && + non_scatter_min_mbuf_size <= mprq_stride_cap))) { /* TODO: Rx scatter isn't supported yet. */ tmpl->rxq.sges_n = 0; /* Trim the number of descs needed. */ - desc >>= config->mprq.stride_num_n; - tmpl->rxq.strd_num_n = config->mprq.stride_num_n; - tmpl->rxq.strd_sz_n = RTE_MAX(log2above(mprq_stride_size), - config->mprq.min_stride_size_n); + desc >>= mprq_stride_nums; + tmpl->rxq.strd_num_n = config->mprq.stride_num_n ? + config->mprq.stride_num_n : mprq_stride_nums; + tmpl->rxq.strd_sz_n = config->mprq.stride_size_n ? 
+ config->mprq.stride_size_n : mprq_stride_size; tmpl->rxq.strd_shift_en = MLX5_MPRQ_TWO_BYTE_SHIFT; tmpl->rxq.strd_headroom_en = strd_headroom_en; tmpl->rxq.mprq_max_memcpy_len = RTE_MIN(first_mb_free_size, @@ -1923,11 +1940,18 @@ struct mlx5_rxq_ctrl * if (mprq_en && !mlx5_rxq_mprq_enabled(&tmpl->rxq)) DRV_LOG(WARNING, "port %u MPRQ is requested but cannot be enabled" - " (requested: desc = %u, stride_sz = %u," - " supported: min_stride_num = %u, max_stride_sz = %u).", - dev->data->port_id, desc, mprq_stride_size, - (1 << config->mprq.stride_num_n), - (1 << config->mprq.max_stride_size_n)); + " (requested: packet size = %u, desc = %u," + " stride_sz = %u, stride_num = %u," + " supported: min_stride_sz = %u, max_stride_sz = %u).", + dev->data->port_id, non_scatter_min_mbuf_size, desc, + config->mprq.stride_size_n ? + (1U << config->mprq.stride_size_n) : + (1U << mprq_stride_size), + config->mprq.stride_num_n ? + (1U << config->mprq.stride_num_n) : + (1U << mprq_stride_nums), + (1U << config->mprq.min_stride_size_n), + (1U << config->mprq.max_stride_size_n)); DRV_LOG(DEBUG, "port %u maximum number of segments per packet: %u", dev->data->port_id, 1 << tmpl->rxq.sges_n); if (desc % (1 << tmpl->rxq.sges_n)) { -- 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
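To make the reworked eligibility test easier to follow, the condition can be restated on its own in plain C. The helper below is only a simplified sketch: the LRO headroom adjustment is omitted, the names are local to the example, and the capability values in main() are made up rather than real device limits.

#include <stdint.h>
#include <stdio.h>

/* Simplified "can this Rx queue use MPRQ?" test:
 * - MPRQ requested,
 * - more descriptors than strides per WQE,
 * - and either the largest packet fits a maximum-size stride, or the user
 *   fixed a stride size and the total stride capacity can hold the packet. */
static int
rxq_can_use_mprq(int mprq_en, unsigned int desc, unsigned int max_pkt_size,
		 unsigned int log_stride_num,	/* devarg, 0 = unset */
		 unsigned int log_stride_size,	/* devarg, 0 = unset */
		 unsigned int log_def_num, unsigned int log_def_size,
		 unsigned int log_max_stride_size)
{
	unsigned int num_n = log_stride_num ? log_stride_num : log_def_num;
	unsigned int size_n = log_stride_size ? log_stride_size : log_def_size;
	unsigned int stride_cap = (1U << num_n) * (1U << size_n);

	return mprq_en && desc > (1U << num_n) &&
	       (max_pkt_size <= (1U << log_max_stride_size) ||
		(log_stride_size && max_pkt_size <= stride_cap));
}

int main(void)
{
	/* 9216B packets, user-selected 2048B strides, 64 strides per WQE: OK. */
	printf("%d\n", rxq_can_use_mprq(1, 256, 9216, 0, 11, 6, 11, 13));
	/* Same packets, stride size left unset, 8KB max stride: fall back. */
	printf("%d\n", rxq_can_use_mprq(1, 256, 9216, 0, 0, 6, 11, 13));
	return 0;
}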
* Re: [dpdk-dev] [PATCH v4 1/3] net/mlx5: add a devarg to specify MPRQ stride size 2020-04-09 22:23 ` [dpdk-dev] [PATCH v4 1/3] net/mlx5: add a devarg to specify MPRQ stride size Alexander Kozyrev @ 2020-04-14 11:42 ` Ferruh Yigit 2020-04-14 12:52 ` Thomas Monjalon 0 siblings, 1 reply; 28+ messages in thread From: Ferruh Yigit @ 2020-04-14 11:42 UTC (permalink / raw) To: Alexander Kozyrev, dev Cc: rasland, matan, viacheslavo, stable, Kevin Traynor, Luca Boccassi, David Marchand, Thomas Monjalon On 4/9/2020 11:23 PM, Alexander Kozyrev wrote: > Define a device parameter to configure log 2 of a stride size for MPRQ > - mprq_log_stride_size. User is able to specify a stride size in a range > allowed by an underlying hardware. The default stride size is defined as > 2048 bytes to encompass most commonly used packet sizes in the Internet > (MTU 1518 and less) and will be used in case a maximum configured packet > size cannot fit into the largest possible stride size. Otherwise a > stride size is set to a large enough value to encompass a whole packet. > > Cc: stable@dpdk.org Hi Alexander, This is a new feature, and you are asking for it to be backported to the stable trees. There is no question on getting the fixes to the stable tree, but for backporting features I would like to get the comment of the stable tree maintainers first before merging the series. Or if the stable tags were added by mistake, please let me know and I can remove them while merging. Thanks, ferruh > > Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> > Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> <...> ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [dpdk-dev] [PATCH v4 1/3] net/mlx5: add a devarg to specify MPRQ stride size 2020-04-14 11:42 ` Ferruh Yigit @ 2020-04-14 12:52 ` Thomas Monjalon 2020-04-15 11:01 ` Ferruh Yigit 0 siblings, 1 reply; 28+ messages in thread From: Thomas Monjalon @ 2020-04-14 12:52 UTC (permalink / raw) To: Alexander Kozyrev, Ferruh Yigit, Kevin Traynor, Luca Boccassi Cc: dev, rasland, matan, viacheslavo, stable, David Marchand 14/04/2020 13:42, Ferruh Yigit: > On 4/9/2020 11:23 PM, Alexander Kozyrev wrote: > > Define a device parameter to configure log 2 of a stride size for MPRQ > > - mprq_log_stride_size. User is able to specify a stride size in a range > > allowed by an underlying hardware. The default stride size is defined as > > 2048 bytes to encompass most commonly used packet sizes in the Internet > > (MTU 1518 and less) and will be used in case a maximum configured packet > > size cannot fit into the largest possible stride size. Otherwise a > > stride size is set to a large enough value to encompass a whole packet. > > > > Cc: stable@dpdk.org > > Hi Alexander, > > This is a new feature, and you are asking it for to be backported to the stable > trees. > > There is no question on getting the fixes to the stable tree, but for > backporting features I would like to get the comment of the stable tree > maintainers first before merging the series. As far as I know, there is a fix hidden in this series, for the case of jumbo frames. In my understanding, jumbo frames cannot be fixed without a new option. I agree it's tricky deciding what is the limit with backports. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [dpdk-dev] [PATCH v4 1/3] net/mlx5: add a devarg to specify MPRQ stride size 2020-04-14 12:52 ` Thomas Monjalon @ 2020-04-15 11:01 ` Ferruh Yigit 2020-04-15 11:25 ` Luca Boccassi 0 siblings, 1 reply; 28+ messages in thread From: Ferruh Yigit @ 2020-04-15 11:01 UTC (permalink / raw) To: Thomas Monjalon, Alexander Kozyrev, Kevin Traynor, Luca Boccassi Cc: dev, rasland, matan, viacheslavo, stable, David Marchand On 4/14/2020 1:52 PM, Thomas Monjalon wrote: > 14/04/2020 13:42, Ferruh Yigit: >> On 4/9/2020 11:23 PM, Alexander Kozyrev wrote: >>> Define a device parameter to configure log 2 of a stride size for MPRQ >>> - mprq_log_stride_size. User is able to specify a stride size in a range >>> allowed by an underlying hardware. The default stride size is defined as >>> 2048 bytes to encompass most commonly used packet sizes in the Internet >>> (MTU 1518 and less) and will be used in case a maximum configured packet >>> size cannot fit into the largest possible stride size. Otherwise a >>> stride size is set to a large enough value to encompass a whole packet. >>> >>> Cc: stable@dpdk.org >> >> Hi Alexander, >> >> This is a new feature, and you are asking it for to be backported to the stable >> trees. >> >> There is no question on getting the fixes to the stable tree, but for >> backporting features I would like to get the comment of the stable tree >> maintainers first before merging the series. > > As far as I know, there is a fix hidden in this series, > for the case of jumbo frames. > In my understanding, jumbo frames cannot be fixed without a new option. > I agree it's tricky deciding what is the limit with backports. > I missed the fix bit, so if there is no objection from stable tree maintainers I will continue with the set keeping the stable tag. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [dpdk-dev] [PATCH v4 1/3] net/mlx5: add a devarg to specify MPRQ stride size 2020-04-15 11:01 ` Ferruh Yigit @ 2020-04-15 11:25 ` Luca Boccassi 2020-04-15 15:34 ` Alexander Kozyrev 0 siblings, 1 reply; 28+ messages in thread From: Luca Boccassi @ 2020-04-15 11:25 UTC (permalink / raw) To: Ferruh Yigit, Thomas Monjalon, Alexander Kozyrev, Kevin Traynor Cc: dev, rasland, matan, viacheslavo, stable, David Marchand On Wed, 2020-04-15 at 12:01 +0100, Ferruh Yigit wrote: > On 4/14/2020 1:52 PM, Thomas Monjalon wrote: > > 14/04/2020 13:42, Ferruh Yigit: > > > On 4/9/2020 11:23 PM, Alexander Kozyrev wrote: > > > > Define a device parameter to configure log 2 of a stride size for MPRQ > > > > - mprq_log_stride_size. User is able to specify a stride size in a range > > > > allowed by an underlying hardware. The default stride size is defined as > > > > 2048 bytes to encompass most commonly used packet sizes in the Internet > > > > (MTU 1518 and less) and will be used in case a maximum configured packet > > > > size cannot fit into the largest possible stride size. Otherwise a > > > > stride size is set to a large enough value to encompass a whole packet. > > > > > > > > Cc: stable@dpdk.org > > > > > > Hi Alexander, > > > > > > This is a new feature, and you are asking it for to be backported to the stable > > > trees. > > > > > > There is no question on getting the fixes to the stable tree, but for > > > backporting features I would like to get the comment of the stable tree > > > maintainers first before merging the series. > > > > As far as I know, there is a fix hidden in this series, > > for the case of jumbo frames. > > In my understanding, jumbo frames cannot be fixed without a new option. > > I agree it's tricky deciding what is the limit with backports. > > > > I missed the fix bit, so if there is no objection from stable tree maintainers I > will continue with the set keeping the stable tag. Given it's confined to a single PMD it's fine by me, provided that: 1) Backward compatibility is maintained 2) Forward compatibility is maintained (eg: going 19.11.x to 20.02 should still work and not cause any errors) -- Luca Boccassi <bluca@debian.org> ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [dpdk-dev] [PATCH v4 1/3] net/mlx5: add a devarg to specify MPRQ stride size 2020-04-15 11:25 ` Luca Boccassi @ 2020-04-15 15:34 ` Alexander Kozyrev 2020-04-15 15:52 ` [dpdk-dev] [dpdk-stable] " Luca Boccassi 0 siblings, 1 reply; 28+ messages in thread From: Alexander Kozyrev @ 2020-04-15 15:34 UTC (permalink / raw) To: Luca Boccassi, Ferruh Yigit, Thomas Monjalon, Kevin Traynor Cc: dev, Raslan Darawsheh, Matan Azrad, Slava Ovsiienko, stable, David Marchand On Wed, April 15, 2020 7:26 Luca Boccassi wrote: > On Wed, 2020-04-15 at 12:01 +0100, Ferruh Yigit wrote: > > On 4/14/2020 1:52 PM, Thomas Monjalon wrote: > > > 14/04/2020 13:42, Ferruh Yigit: > > > > On 4/9/2020 11:23 PM, Alexander Kozyrev wrote: > > > > > Define a device parameter to configure log 2 of a stride size > > > > > for MPRQ > > > > > - mprq_log_stride_size. User is able to specify a stride size in > > > > > a range allowed by an underlying hardware. The default stride > > > > > size is defined as > > > > > 2048 bytes to encompass most commonly used packet sizes in the > > > > > Internet (MTU 1518 and less) and will be used in case a maximum > > > > > configured packet size cannot fit into the largest possible > > > > > stride size. Otherwise a stride size is set to a large enough value to > encompass a whole packet. > > > > > > > > > > Cc: stable@dpdk.org > > > > > > > > Hi Alexander, > > > > > > > > This is a new feature, and you are asking it for to be backported > > > > to the stable trees. > > > > > > > > There is no question on getting the fixes to the stable tree, but > > > > for backporting features I would like to get the comment of the > > > > stable tree maintainers first before merging the series. > > > > > > As far as I know, there is a fix hidden in this series, for the case > > > of jumbo frames. > > > In my understanding, jumbo frames cannot be fixed without a new option. > > > I agree it's tricky deciding what is the limit with backports. > > > > > > > I missed the fix bit, so if there is no objection from stable tree > > maintainers I will continue with the set keeping the stable tag. > > Given it's confined to a single PMD it's fine by me, provided that: > > 1) Backward compatibility is maintained > 2) Forward compatibility is maintained (eg: going 19.11.x to 20.02 should still > work and not cause any errors) The whole point of these patches are to fix the inability to handle 9K jumbo frames. A new devarg is required in order not to break the backward compatibility. So I do not expect any issues with backward/forward compatibility with these patches. One thing I need to mention: backporting requires one small change. Conversion from MLX5_ASSERT to assert clauses since we reworked them in 20.02. I can provide a separate list of patches for 19.11 with this change if it works better for you. Thank you for you assistance and let me know how we should proceed further. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [dpdk-dev] [dpdk-stable] [PATCH v4 1/3] net/mlx5: add a devarg to specify MPRQ stride size 2020-04-15 15:34 ` Alexander Kozyrev @ 2020-04-15 15:52 ` Luca Boccassi 0 siblings, 0 replies; 28+ messages in thread From: Luca Boccassi @ 2020-04-15 15:52 UTC (permalink / raw) To: Alexander Kozyrev, Ferruh Yigit, Thomas Monjalon, Kevin Traynor Cc: dev, Raslan Darawsheh, Matan Azrad, Slava Ovsiienko, stable, David Marchand On Wed, 2020-04-15 at 15:34 +0000, Alexander Kozyrev wrote: > On Wed, April 15, 2020 7:26 Luca Boccassi wrote: > > On Wed, 2020-04-15 at 12:01 +0100, Ferruh Yigit wrote: > > > On 4/14/2020 1:52 PM, Thomas Monjalon wrote: > > > > 14/04/2020 13:42, Ferruh Yigit: > > > > > On 4/9/2020 11:23 PM, Alexander Kozyrev wrote: > > > > > > Define a device parameter to configure log 2 of a stride size > > > > > > for MPRQ > > > > > > - mprq_log_stride_size. User is able to specify a stride size in > > > > > > a range allowed by an underlying hardware. The default stride > > > > > > size is defined as > > > > > > 2048 bytes to encompass most commonly used packet sizes in the > > > > > > Internet (MTU 1518 and less) and will be used in case a maximum > > > > > > configured packet size cannot fit into the largest possible > > > > > > stride size. Otherwise a stride size is set to a large enough value to > > encompass a whole packet. > > > > > > Cc: stable@dpdk.org > > > > > > > > > > Hi Alexander, > > > > > > > > > > This is a new feature, and you are asking it for to be backported > > > > > to the stable trees. > > > > > > > > > > There is no question on getting the fixes to the stable tree, but > > > > > for backporting features I would like to get the comment of the > > > > > stable tree maintainers first before merging the series. > > > > > > > > As far as I know, there is a fix hidden in this series, for the case > > > > of jumbo frames. > > > > In my understanding, jumbo frames cannot be fixed without a new option. > > > > I agree it's tricky deciding what is the limit with backports. > > > > > > > > > > I missed the fix bit, so if there is no objection from stable tree > > > maintainers I will continue with the set keeping the stable tag. > > > > Given it's confined to a single PMD it's fine by me, provided that: > > > > 1) Backward compatibility is maintained > > 2) Forward compatibility is maintained (eg: going 19.11.x to 20.02 should still > > work and not cause any errors) > > The whole point of these patches are to fix the inability to handle 9K jumbo frames. > A new devarg is required in order not to break the backward compatibility. > So I do not expect any issues with backward/forward compatibility with these patches. > One thing I need to mention: backporting requires one small change. > Conversion from MLX5_ASSERT to assert clauses since we reworked them in 20.02. > I can provide a separate list of patches for 19.11 with this change if it works better for you. > Thank you for you assistance and let me know how we should proceed further. Ok, thanks. It's not necessary to send a series if the only change is the assert, I can fix it myself when backporting. ^ permalink raw reply [flat|nested] 28+ messages in thread
* [dpdk-dev] [PATCH v4 2/3] net/mlx5: enable MPRQ multi-stride operations 2020-04-09 22:23 ` [dpdk-dev] [PATCH v4 0/3] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev 2020-04-09 22:23 ` [dpdk-dev] [PATCH v4 1/3] net/mlx5: add a devarg to specify MPRQ stride size Alexander Kozyrev @ 2020-04-09 22:23 ` Alexander Kozyrev 2020-04-09 22:23 ` [dpdk-dev] [PATCH v4 3/3] net/mlx5: add multi-segment packets in MPRQ mode Alexander Kozyrev ` (2 subsequent siblings) 4 siblings, 0 replies; 28+ messages in thread From: Alexander Kozyrev @ 2020-04-09 22:23 UTC (permalink / raw) To: dev; +Cc: rasland, matan, viacheslavo, stable MPRQ feature should be updated to allow a packet to be received into multiple strides in order to support the MTU exceeding 8KB. Special care is needed to prevent the headroom corruption in the multi-stride mode since the headroom space is borrowed by the PMD from the tail of the preceding stride. Copy the whole packet into a separate mbuf in this case or just the overlapping data if the Rx scattering is supported by an application. Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> --- drivers/net/mlx5/mlx5_rxq.c | 28 ++++-------------- drivers/net/mlx5/mlx5_rxtx.c | 68 +++++++++++++++++++------------------------- drivers/net/mlx5/mlx5_rxtx.h | 2 +- 3 files changed, 36 insertions(+), 62 deletions(-) diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c index 1b57f00..1cc9f1d 100644 --- a/drivers/net/mlx5/mlx5_rxq.c +++ b/drivers/net/mlx5/mlx5_rxq.c @@ -1797,7 +1797,6 @@ struct mlx5_rxq_ctrl * unsigned int mprq_stride_size; unsigned int mprq_stride_cap; struct mlx5_dev_config *config = &priv->config; - unsigned int strd_headroom_en; /* * Always allocate extra slots, even if eventually * the vector Rx will not be used. @@ -1843,26 +1842,11 @@ struct mlx5_rxq_ctrl * tmpl->socket = socket; if (dev->data->dev_conf.intr_conf.rxq) tmpl->irq = 1; - /* - * LRO packet may consume all the stride memory, hence we cannot - * guaranty head-room near the packet memory in the stride. - * In this case scatter is, for sure, enabled and an empty mbuf may be - * added in the start for the head-room. - */ - if (lro_on_queue && RTE_PKTMBUF_HEADROOM > 0 && - non_scatter_min_mbuf_size > mb_len) { - strd_headroom_en = 0; - mprq_stride_size = RTE_MIN(max_rx_pkt_len, - 1u << config->mprq.max_stride_size_n); - } else { - strd_headroom_en = 1; - mprq_stride_size = non_scatter_min_mbuf_size; - } mprq_stride_nums = config->mprq.stride_num_n ? config->mprq.stride_num_n : MLX5_MPRQ_STRIDE_NUM_N; - mprq_stride_size = (mprq_stride_size <= - (1U << config->mprq.max_stride_size_n)) ? - log2above(mprq_stride_size) : MLX5_MPRQ_STRIDE_SIZE_N; + mprq_stride_size = non_scatter_min_mbuf_size <= + (1U << config->mprq.max_stride_size_n) ? + log2above(non_scatter_min_mbuf_size) : MLX5_MPRQ_STRIDE_SIZE_N; mprq_stride_cap = (config->mprq.stride_num_n ? (1U << config->mprq.stride_num_n) : (1U << mprq_stride_nums)) * (config->mprq.stride_size_n ? @@ -1879,8 +1863,7 @@ struct mlx5_rxq_ctrl * * Otherwise, enable Rx scatter if necessary. */ if (mprq_en && desc > (1U << mprq_stride_nums) && - (non_scatter_min_mbuf_size - - (lro_on_queue ? RTE_PKTMBUF_HEADROOM : 0) <= + (non_scatter_min_mbuf_size <= (1U << config->mprq.max_stride_size_n) || (config->mprq.stride_size_n && non_scatter_min_mbuf_size <= mprq_stride_cap))) { @@ -1893,7 +1876,8 @@ struct mlx5_rxq_ctrl * tmpl->rxq.strd_sz_n = config->mprq.stride_size_n ? 
config->mprq.stride_size_n : mprq_stride_size; tmpl->rxq.strd_shift_en = MLX5_MPRQ_TWO_BYTE_SHIFT; - tmpl->rxq.strd_headroom_en = strd_headroom_en; + tmpl->rxq.strd_scatter_en = + !!(offloads & DEV_RX_OFFLOAD_SCATTER); tmpl->rxq.mprq_max_memcpy_len = RTE_MIN(first_mb_free_size, config->mprq.max_memcpy_len); max_lro_size = RTE_MIN(max_rx_pkt_len, diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c index f3bf763..4c27952 100644 --- a/drivers/net/mlx5/mlx5_rxtx.c +++ b/drivers/net/mlx5/mlx5_rxtx.c @@ -1658,21 +1658,20 @@ enum mlx5_txcmp_code { unsigned int i = 0; uint32_t rq_ci = rxq->rq_ci; uint16_t consumed_strd = rxq->consumed_strd; - uint16_t headroom_sz = rxq->strd_headroom_en * RTE_PKTMBUF_HEADROOM; struct mlx5_mprq_buf *buf = (*rxq->mprq_bufs)[rq_ci & wq_mask]; while (i < pkts_n) { struct rte_mbuf *pkt; void *addr; int ret; - unsigned int len; + uint32_t len; uint16_t strd_cnt; uint16_t strd_idx; uint32_t offset; uint32_t byte_cnt; + int32_t hdrm_overlap; volatile struct mlx5_mini_cqe8 *mcqe = NULL; uint32_t rss_hash_res = 0; - uint8_t lro_num_seg; if (consumed_strd == strd_n) { /* Replace WQE only if the buffer is still in use. */ @@ -1719,18 +1718,6 @@ enum mlx5_txcmp_code { MLX5_ASSERT(strd_idx < strd_n); MLX5_ASSERT(!((rte_be_to_cpu_16(cqe->wqe_id) ^ rq_ci) & wq_mask)); - lro_num_seg = cqe->lro_num_seg; - /* - * Currently configured to receive a packet per a stride. But if - * MTU is adjusted through kernel interface, device could - * consume multiple strides without raising an error. In this - * case, the packet should be dropped because it is bigger than - * the max_rx_pkt_len. - */ - if (unlikely(!lro_num_seg && strd_cnt > 1)) { - ++rxq->stats.idropped; - continue; - } pkt = rte_pktmbuf_alloc(rxq->mp); if (unlikely(pkt == NULL)) { ++rxq->stats.rx_nombuf; @@ -1742,12 +1729,16 @@ enum mlx5_txcmp_code { len -= RTE_ETHER_CRC_LEN; offset = strd_idx * strd_sz + strd_shift; addr = RTE_PTR_ADD(mlx5_mprq_buf_addr(buf, strd_n), offset); + hdrm_overlap = len + RTE_PKTMBUF_HEADROOM - strd_cnt * strd_sz; /* * Memcpy packets to the target mbuf if: * - The size of packet is smaller than mprq_max_memcpy_len. * - Out of buffer in the Mempool for Multi-Packet RQ. + * - There is no space for a headroom and scatter is disabled. */ - if (len <= rxq->mprq_max_memcpy_len || rxq->mprq_repl == NULL) { + if (len <= rxq->mprq_max_memcpy_len || + rxq->mprq_repl == NULL || + (hdrm_overlap > 0 && !rxq->strd_scatter_en)) { /* * When memcpy'ing packet due to out-of-buffer, the * packet must be smaller than the target mbuf. @@ -1769,7 +1760,7 @@ enum mlx5_txcmp_code { rte_atomic16_add_return(&buf->refcnt, 1); MLX5_ASSERT((uint16_t)rte_atomic16_read(&buf->refcnt) <= strd_n + 1); - buf_addr = RTE_PTR_SUB(addr, headroom_sz); + buf_addr = RTE_PTR_SUB(addr, RTE_PKTMBUF_HEADROOM); /* * MLX5 device doesn't use iova but it is necessary in a * case where the Rx packet is transmitted via a @@ -1788,43 +1779,42 @@ enum mlx5_txcmp_code { rte_pktmbuf_attach_extbuf(pkt, buf_addr, buf_iova, buf_len, shinfo); /* Set mbuf head-room. */ - pkt->data_off = headroom_sz; + SET_DATA_OFF(pkt, RTE_PKTMBUF_HEADROOM); MLX5_ASSERT(pkt->ol_flags == EXT_ATTACHED_MBUF); - /* - * Prevent potential overflow due to MTU change through - * kernel interface. - */ - if (unlikely(rte_pktmbuf_tailroom(pkt) < len)) { - rte_pktmbuf_free_seg(pkt); - ++rxq->stats.idropped; - continue; - } + MLX5_ASSERT(rte_pktmbuf_tailroom(pkt) < + len - (hdrm_overlap > 0 ? 
hdrm_overlap : 0)); DATA_LEN(pkt) = len; /* - * LRO packet may consume all the stride memory, in this - * case packet head-room space is not guaranteed so must - * to add an empty mbuf for the head-room. + * Copy the last fragment of a packet (up to headroom + * size bytes) in case there is a stride overlap with + * a next packet's headroom. Allocate a separate mbuf + * to store this fragment and link it. Scatter is on. */ - if (!rxq->strd_headroom_en) { - struct rte_mbuf *headroom_mbuf = - rte_pktmbuf_alloc(rxq->mp); + if (hdrm_overlap > 0) { + MLX5_ASSERT(rxq->strd_scatter_en); + struct rte_mbuf *seg = + rte_pktmbuf_alloc(rxq->mp); - if (unlikely(headroom_mbuf == NULL)) { + if (unlikely(seg == NULL)) { rte_pktmbuf_free_seg(pkt); ++rxq->stats.rx_nombuf; break; } - PORT(pkt) = rxq->port_id; - NEXT(headroom_mbuf) = pkt; - pkt = headroom_mbuf; + SET_DATA_OFF(seg, 0); + rte_memcpy(rte_pktmbuf_mtod(seg, void *), + RTE_PTR_ADD(addr, len - hdrm_overlap), + hdrm_overlap); + DATA_LEN(seg) = hdrm_overlap; + DATA_LEN(pkt) = len - hdrm_overlap; + NEXT(pkt) = seg; NB_SEGS(pkt) = 2; } } rxq_cq_to_mbuf(rxq, pkt, cqe, rss_hash_res); - if (lro_num_seg > 1) { + if (cqe->lro_num_seg > 1) { mlx5_lro_update_hdr(addr, cqe, len); pkt->ol_flags |= PKT_RX_LRO; - pkt->tso_segsz = strd_sz; + pkt->tso_segsz = len / cqe->lro_num_seg; } PKT_LEN(pkt) = len; PORT(pkt) = rxq->port_id; diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h index 939778a..d155c24 100644 --- a/drivers/net/mlx5/mlx5_rxtx.h +++ b/drivers/net/mlx5/mlx5_rxtx.h @@ -119,7 +119,7 @@ struct mlx5_rxq_data { unsigned int strd_sz_n:4; /* Log 2 of stride size. */ unsigned int strd_shift_en:1; /* Enable 2bytes shift on a stride. */ unsigned int err_state:2; /* enum mlx5_rxq_err_state. */ - unsigned int strd_headroom_en:1; /* Enable mbuf headroom in MPRQ. */ + unsigned int strd_scatter_en:1; /* Scattered packets from a stride. */ unsigned int lro:1; /* Enable LRO. */ unsigned int :1; /* Remaining bits. */ volatile uint32_t *rq_db; -- 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
* [dpdk-dev] [PATCH v4 3/3] net/mlx5: add multi-segment packets in MPRQ mode 2020-04-09 22:23 ` [dpdk-dev] [PATCH v4 0/3] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev 2020-04-09 22:23 ` [dpdk-dev] [PATCH v4 1/3] net/mlx5: add a devarg to specify MPRQ stride size Alexander Kozyrev 2020-04-09 22:23 ` [dpdk-dev] [PATCH v4 2/3] net/mlx5: enable MPRQ multi-stride operations Alexander Kozyrev @ 2020-04-09 22:23 ` Alexander Kozyrev 2020-04-10 14:01 ` [dpdk-dev] [PATCH v4 0/3] net/mlx5: add large packet size support to MPRQ Matan Azrad 2020-04-13 10:57 ` Raslan Darawsheh 4 siblings, 0 replies; 28+ messages in thread From: Alexander Kozyrev @ 2020-04-09 22:23 UTC (permalink / raw) To: dev; +Cc: rasland, matan, viacheslavo, stable The multi-stride operations now allow to reduce a stride size while supporting Jumbo frames. That means that it is possible to have mbufs configured with a size smaller than the whole packet received. It is not an issue during normal MPRQ operations since we attach external buffers instead of copying the data into the mbuf itself. But it is not the case in "emergency mode" when we have to copy every packet because of no more external mbufs are available. Assemble a multi-segment packet to overcome this issue in case scatter mode is enabled, drop a packet if not. Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> --- drivers/net/mlx5/mlx5_rxtx.c | 47 ++++++++++++++++++++++++++++++++++++-------- 1 file changed, 39 insertions(+), 8 deletions(-) diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c index 4c27952..7ce3732 100644 --- a/drivers/net/mlx5/mlx5_rxtx.c +++ b/drivers/net/mlx5/mlx5_rxtx.c @@ -1734,22 +1734,52 @@ enum mlx5_txcmp_code { * Memcpy packets to the target mbuf if: * - The size of packet is smaller than mprq_max_memcpy_len. * - Out of buffer in the Mempool for Multi-Packet RQ. - * - There is no space for a headroom and scatter is disabled. + * - The packet's stride overlaps a headroom and scatter is off. */ if (len <= rxq->mprq_max_memcpy_len || rxq->mprq_repl == NULL || (hdrm_overlap > 0 && !rxq->strd_scatter_en)) { - /* - * When memcpy'ing packet due to out-of-buffer, the - * packet must be smaller than the target mbuf. 
- */ - if (unlikely(rte_pktmbuf_tailroom(pkt) < len)) { + if (likely(rte_pktmbuf_tailroom(pkt) >= len)) { + rte_memcpy(rte_pktmbuf_mtod(pkt, void *), + addr, len); + DATA_LEN(pkt) = len; + } else if (rxq->strd_scatter_en) { + struct rte_mbuf *prev = pkt; + uint32_t seg_len = + RTE_MIN(rte_pktmbuf_tailroom(pkt), len); + uint32_t rem_len = len - seg_len; + + rte_memcpy(rte_pktmbuf_mtod(pkt, void *), + addr, seg_len); + DATA_LEN(pkt) = seg_len; + while (rem_len) { + struct rte_mbuf *next = + rte_pktmbuf_alloc(rxq->mp); + + if (unlikely(next == NULL)) { + rte_pktmbuf_free(pkt); + ++rxq->stats.rx_nombuf; + goto out; + } + NEXT(prev) = next; + SET_DATA_OFF(next, 0); + addr = RTE_PTR_ADD(addr, seg_len); + seg_len = RTE_MIN + (rte_pktmbuf_tailroom(next), + rem_len); + rte_memcpy + (rte_pktmbuf_mtod(next, void *), + addr, seg_len); + DATA_LEN(next) = seg_len; + rem_len -= seg_len; + prev = next; + ++NB_SEGS(pkt); + } + } else { rte_pktmbuf_free_seg(pkt); ++rxq->stats.idropped; continue; } - rte_memcpy(rte_pktmbuf_mtod(pkt, void *), addr, len); - DATA_LEN(pkt) = len; } else { rte_iova_t buf_iova; struct rte_mbuf_ext_shared_info *shinfo; @@ -1826,6 +1856,7 @@ enum mlx5_txcmp_code { *(pkts++) = pkt; ++i; } +out: /* Update the consumer indexes. */ rxq->consumed_strd = consumed_strd; rte_cio_wmb(); -- 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [dpdk-dev] [PATCH v4 0/3] net/mlx5: add large packet size support to MPRQ 2020-04-09 22:23 ` [dpdk-dev] [PATCH v4 0/3] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev ` (2 preceding siblings ...) 2020-04-09 22:23 ` [dpdk-dev] [PATCH v4 3/3] net/mlx5: add multi-segment packets in MPRQ mode Alexander Kozyrev @ 2020-04-10 14:01 ` Matan Azrad 2020-04-13 10:57 ` Raslan Darawsheh 4 siblings, 0 replies; 28+ messages in thread From: Matan Azrad @ 2020-04-10 14:01 UTC (permalink / raw) To: Alexander Kozyrev, dev; +Cc: Raslan Darawsheh, Slava Ovsiienko From: Alexander Kozyrev > In order to support the 9K MTU the MPRQ feature should be updated to > allow a packet to take more than one stride (single linear buffer). > Receiving a packet into multiple adjacent strides should be implemented. > The reason preventing the packet to be received into multiple strides is that > the data buffer must be preceded with some HEAD_ROOM space. > In the current implementation the HEAD_ROOM space is borrowed by the > PMD from the tail of the preceding stride. If packet takes multiple strides the > tail of stride may be overwritten with a packet data and the memory can't be > borrowed to provide the HEAD_ROOM space for the next packet. > Special care is needed to prevent the HEAD_ROOM corruption as such: > - copy a whole packet into a separate memory buffer if scatter is off > - copy an overlapping data only and craft a multi-segment mbuf otherwise > After multi-stride support for packets receiving is in place it is possible to > reduce the stride size for more efficient memory utilization. > Introduce the mprq_log_stride_size device parameter to configure a stride > size for MPRQ. Default stride size is set to 2048 bytes. > > Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> > --- > v1: > https://patchwork.dpdk.org/cover/67558/ > v2: > https://patchwork.dpdk.org/cover/67670/ > merge documentation and implementation in one commit > v3: > https://patchwork.dpdk.org/patch/68085/ > rollback to simple burst Rx in case the packet size is too big to fit into the > stride and the stride size is not configured > v4: fix typo in code comments Series-acked-by: Matan Azrad <matan@mellanox.com> > Alexander Kozyrev (3): > net/mlx5: add a devarg to specify MPRQ stride size > net/mlx5: enable MPRQ multi-stride operations > net/mlx5: add multi-segment packets in MPRQ mode > > doc/guides/nics/mlx5.rst | 17 ++++- > doc/guides/rel_notes/release_20_05.rst | 1 + > drivers/net/mlx5/mlx5.c | 34 ++++++++-- > drivers/net/mlx5/mlx5.h | 1 + > drivers/net/mlx5/mlx5_defs.h | 3 + > drivers/net/mlx5/mlx5_rxq.c | 70 +++++++++++--------- > drivers/net/mlx5/mlx5_rxtx.c | 113
+++++++++++++++++++------------- > - > drivers/net/mlx5/mlx5_rxtx.h | 2 +- > 8 files changed, 154 insertions(+), 87 deletions(-) > > -- > 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [dpdk-dev] [PATCH v4 0/3] net/mlx5: add large packet size support to MPRQ 2020-04-09 22:23 ` [dpdk-dev] [PATCH v4 0/3] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev ` (3 preceding siblings ...) 2020-04-10 14:01 ` [dpdk-dev] [PATCH v4 0/3] net/mlx5: add large packet size support to MPRQ Matan Azrad @ 2020-04-13 10:57 ` Raslan Darawsheh 4 siblings, 0 replies; 28+ messages in thread From: Raslan Darawsheh @ 2020-04-13 10:57 UTC (permalink / raw) To: Alexander Kozyrev, dev; +Cc: Matan Azrad, Slava Ovsiienko Hi, > -----Original Message----- > From: Alexander Kozyrev <akozyrev@mellanox.com> > Sent: Friday, April 10, 2020 1:24 AM > To: dev@dpdk.org > Cc: Raslan Darawsheh <rasland@mellanox.com>; Matan Azrad > <matan@mellanox.com>; Slava Ovsiienko <viacheslavo@mellanox.com> > Subject: [PATCH v4 0/3] net/mlx5: add large packet size support to MPRQ > > In order to support the 9K MTU the MPRQ feature should be updated > to allow a packet to take more than one stride (single linear buffer). > Receiving a packet into multiple adjacent strides should be implemented. > The reason preventing the packet to be received into multiple strides is > that the data buffer must be preceded with some HEAD_ROOM space. > In the current implementation the HEAD_ROOM space is borrowed by the > PMD > from the tail of the preceding stride. If packet takes multiple strides > the tail of stride may be overwritten with a packet data and the memory > can't be borrowed to provide the HEAD_ROOM space for the next packet. > Special care is needed to prevent the HEAD_ROOM corruption as such: > - copy a whole packet into a separate memory buffer if scatter is off > - copy an overlapping data only and craft a multi-segment mbuf otherwise > After multi-stride support for packets receiving is in place it is > possible to reduce the stride size for more efficient memory utilization. > Introduce the mprq_log_stride_size device parameter to configure > a stride size for MPRQ. Default stride size is set to 2048 bytes. 
> > Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> > --- > v1: https://patchwork.dpdk.org/cover/67558/ > v2: https://patchwork.dpdk.org/cover/67670/ > merge documentation and implementation in one commit > v3: https://patchwork.dpdk.org/patch/68085/ > rollback to simple burst Rx in case the packet size is too big > to fit into the stride and the stride size is not configured > v4: fix typo in code comments > > Alexander Kozyrev (3): > net/mlx5: add a devarg to specify MPRQ stride size > net/mlx5: enable MPRQ multi-stride operations > net/mlx5: add multi-segment packets in MPRQ mode > > doc/guides/nics/mlx5.rst | 17 ++++- > doc/guides/rel_notes/release_20_05.rst | 1 + > drivers/net/mlx5/mlx5.c | 34 ++++++++-- > drivers/net/mlx5/mlx5.h | 1 + > drivers/net/mlx5/mlx5_defs.h | 3 + > drivers/net/mlx5/mlx5_rxq.c | 70 +++++++++++--------- > drivers/net/mlx5/mlx5_rxtx.c | 113 +++++++++++++++++++------------- > - > drivers/net/mlx5/mlx5_rxtx.h | 2 +- > 8 files changed, 154 insertions(+), 87 deletions(-) > > -- > 1.8.3.1 Series applied to next-net-mlx, Kindest regards, Raslan Darawsheh ^ permalink raw reply [flat|nested] 28+ messages in thread
* [dpdk-dev] [PATCH v3 0/3] net/mlx5: add large packet size support to MPRQ 2020-03-31 21:52 [dpdk-dev] [PATCH 0/4] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev ` (4 preceding siblings ...) 2020-04-02 18:11 ` [dpdk-dev] [PATCH 0/3] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev @ 2020-04-09 21:24 ` Alexander Kozyrev 2020-04-09 21:24 ` [dpdk-dev] [PATCH v3 1/3] net/mlx5: add a devarg to specify MPRQ stride size Alexander Kozyrev ` (2 more replies) 5 siblings, 3 replies; 28+ messages in thread From: Alexander Kozyrev @ 2020-04-09 21:24 UTC (permalink / raw) To: dev; +Cc: rasland, matan, viacheslavo In order to support the 9K MTU, the MPRQ feature should be updated to allow a packet to take more than one stride (single linear buffer); receiving a packet into multiple adjacent strides should be implemented. The reason preventing a packet from being received into multiple strides is that the data buffer must be preceded by some HEAD_ROOM space. In the current implementation the HEAD_ROOM space is borrowed by the PMD from the tail of the preceding stride. If a packet takes multiple strides, the tail of a stride may be overwritten with packet data and that memory can't be borrowed to provide the HEAD_ROOM space for the next packet. Special care is needed to prevent HEAD_ROOM corruption: - copy the whole packet into a separate memory buffer if scatter is off - copy only the overlapping data and craft a multi-segment mbuf otherwise Once multi-stride packet reception is in place, it is possible to reduce the stride size for more efficient memory utilization. Introduce the mprq_log_stride_size device parameter to configure a stride size for MPRQ. The default stride size is set to 2048 bytes. Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> --- v1: https://patchwork.dpdk.org/cover/67558/ v2: https://patchwork.dpdk.org/cover/67670/ merge documentation and implementation in one commit v3: roll back to the simple burst Rx in case the packet size is too big to fit into the stride and mprq_log_stride_size is not configured Alexander Kozyrev (3): net/mlx5: add a devarg to specify MPRQ stride size net/mlx5: enable MPRQ multi-stride operations net/mlx5: add multi-segment packets in MPRQ mode doc/guides/nics/mlx5.rst | 17 ++++- doc/guides/rel_notes/release_20_05.rst | 1 + drivers/net/mlx5/mlx5.c | 34 ++++++++-- drivers/net/mlx5/mlx5.h | 1 + drivers/net/mlx5/mlx5_defs.h | 3 + drivers/net/mlx5/mlx5_rxq.c | 70 +++++++++++--------- drivers/net/mlx5/mlx5_rxtx.c | 113 +++++++++++++++++++-------------- drivers/net/mlx5/mlx5_rxtx.h | 2 +- 8 files changed, 154 insertions(+), 87 deletions(-) -- 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
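For orientation, a minimal sketch of how an application could request the new stride size from the PMD once the series is applied is shown below. The PCI address is a placeholder, and passing mlx5 device arguments through the "-w <BDF>,key=value" EAL option is an assumption about this release's command line; only the mprq_en, mprq_log_stride_num and mprq_log_stride_size keys come from the driver documentation in the series.

#include <stdlib.h>
#include <rte_eal.h>
#include <rte_common.h>
#include <rte_debug.h>

int
main(int argc, char **argv)
{
	(void)argc;
	/*
	 * Hypothetical EAL arguments: enable Multi-Packet RQ and ask for
	 * log2 stride size 11 (2048-byte strides, the documented default).
	 */
	char *eal_argv[] = {
		argv[0],
		"-w", "0000:03:00.0,mprq_en=1,mprq_log_stride_num=6,mprq_log_stride_size=11",
	};

	if (rte_eal_init(RTE_DIM(eal_argv), eal_argv) < 0)
		rte_exit(EXIT_FAILURE, "cannot initialize EAL\n");
	/* ... regular rte_eth_dev_configure()/rx_queue_setup() follow ... */
	return 0;
}

Per the commit message, a requested log2 value outside the range reported by the device makes the PMD fall back to the default with a warning instead of failing to probe.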
* [dpdk-dev] [PATCH v3 1/3] net/mlx5: add a devarg to specify MPRQ stride size 2020-04-09 21:24 ` [dpdk-dev] [PATCH v3 " Alexander Kozyrev @ 2020-04-09 21:24 ` Alexander Kozyrev 2020-04-09 21:24 ` [dpdk-dev] [PATCH v3 2/3] net/mlx5: enable MPRQ multi-stride operations Alexander Kozyrev 2020-04-09 21:24 ` [dpdk-dev] [PATCH v3 3/3] net/mlx5: add multi-segment packets in MPRQ mode Alexander Kozyrev 2 siblings, 0 replies; 28+ messages in thread From: Alexander Kozyrev @ 2020-04-09 21:24 UTC (permalink / raw) To: dev; +Cc: rasland, matan, viacheslavo, stable Define a device parameter to configure log 2 of a stride size for MPRQ - mprq_log_stride_size. User is able to specify a stride size in a range allowed by an underlying hardware. The default stride size is defined as 2048 bytes to encompass most commonly used packet sizes in the Internet (MTU 1518 and less) and will be used in case a maximum configured packet size cannot fit into the largest possible stride size. Otherwise a stride size is set to a large enough value to encompass a whole packet. Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> --- doc/guides/nics/mlx5.rst | 17 +++++++++-- doc/guides/rel_notes/release_20_05.rst | 1 + drivers/net/mlx5/mlx5.c | 34 +++++++++++++++++----- drivers/net/mlx5/mlx5.h | 1 + drivers/net/mlx5/mlx5_defs.h | 3 ++ drivers/net/mlx5/mlx5_rxq.c | 52 +++++++++++++++++++++++++--------- 6 files changed, 85 insertions(+), 23 deletions(-) diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index e13c07d..759d0ac 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -408,8 +408,7 @@ Run-time configuration A nonzero value enables configuring Multi-Packet Rx queues. Rx queue is configured as Multi-Packet RQ if the total number of Rx queues is - ``rxqs_min_mprq`` or more and Rx scatter isn't configured. Disabled by - default. + ``rxqs_min_mprq`` or more. Disabled by default. Multi-Packet Rx Queue (MPRQ a.k.a Striding RQ) can further save PCIe bandwidth by posting a single large buffer for multiple packets. Instead of posting a @@ -434,6 +433,20 @@ Run-time configuration The size of Rx queue should be bigger than the number of strides. +- ``mprq_log_stride_size`` parameter [int] + + Log 2 of the size of a stride for Multi-Packet Rx queue. Configuring a smaller + stride size can save some memory and reduce probability of a depletion of all + available strides due to unreleased packets by an application. If configured + value is not in the range of device capability, the default value will be set + with a warning message. The default value is 11 which is 2048 bytes per a + stride, valid only if ``mprq_en`` is set. With ``mprq_log_stride_size`` set + it is possible for a pcaket to span across multiple strides. This mode allows + support of jumbo frames (9K) with MPRQ. The memcopy of some packets (or part + of a packet if Rx scatter is configured) may be required in case there is no + space left for a head room at the end of a stride which incurs some + performance penalty. + - ``mprq_max_memcpy_len`` parameter [int] The maximum length of packet to memcpy in case of Multi-Packet Rx queue. Rx diff --git a/doc/guides/rel_notes/release_20_05.rst b/doc/guides/rel_notes/release_20_05.rst index 2596269..586a442 100644 --- a/doc/guides/rel_notes/release_20_05.rst +++ b/doc/guides/rel_notes/release_20_05.rst @@ -62,6 +62,7 @@ New Features * Added support for matching on IPv4 Time To Live and IPv6 Hop Limit. 
* Added support for creating Relaxed Ordering Memory Regions. + * Added support for jumbo frame size (9K MTU) in Multi-Packet RQ mode. * **Updated the Intel ice driver.** diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index efdd53c..293d316 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -63,6 +63,9 @@ /* Device parameter to configure log 2 of the number of strides for MPRQ. */ #define MLX5_RX_MPRQ_LOG_STRIDE_NUM "mprq_log_stride_num" +/* Device parameter to configure log 2 of the stride size for MPRQ. */ +#define MLX5_RX_MPRQ_LOG_STRIDE_SIZE "mprq_log_stride_size" + /* Device parameter to limit the size of memcpy'd packet for MPRQ. */ #define MLX5_RX_MPRQ_MAX_MEMCPY_LEN "mprq_max_memcpy_len" @@ -1533,6 +1536,8 @@ struct mlx5_flow_id_pool * config->mprq.enabled = !!tmp; } else if (strcmp(MLX5_RX_MPRQ_LOG_STRIDE_NUM, key) == 0) { config->mprq.stride_num_n = tmp; + } else if (strcmp(MLX5_RX_MPRQ_LOG_STRIDE_SIZE, key) == 0) { + config->mprq.stride_size_n = tmp; } else if (strcmp(MLX5_RX_MPRQ_MAX_MEMCPY_LEN, key) == 0) { config->mprq.max_memcpy_len = tmp; } else if (strcmp(MLX5_RXQS_MIN_MPRQ, key) == 0) { @@ -1629,6 +1634,7 @@ struct mlx5_flow_id_pool * MLX5_RXQ_PKT_PAD_EN, MLX5_RX_MPRQ_EN, MLX5_RX_MPRQ_LOG_STRIDE_NUM, + MLX5_RX_MPRQ_LOG_STRIDE_SIZE, MLX5_RX_MPRQ_MAX_MEMCPY_LEN, MLX5_RXQS_MIN_MPRQ, MLX5_TXQ_INLINE, @@ -2304,8 +2310,6 @@ struct mlx5_flow_id_pool * mprq_caps.min_single_wqe_log_num_of_strides; mprq_max_stride_num_n = mprq_caps.max_single_wqe_log_num_of_strides; - config.mprq.stride_num_n = RTE_MAX(MLX5_MPRQ_STRIDE_NUM_N, - mprq_min_stride_num_n); } #endif if (RTE_CACHE_LINE_SIZE == 128 && @@ -2619,17 +2623,32 @@ struct mlx5_flow_id_pool * #endif } if (config.mprq.enabled && mprq) { - if (config.mprq.stride_num_n > mprq_max_stride_num_n || - config.mprq.stride_num_n < mprq_min_stride_num_n) { + if (config.mprq.stride_num_n && + (config.mprq.stride_num_n > mprq_max_stride_num_n || + config.mprq.stride_num_n < mprq_min_stride_num_n)) { config.mprq.stride_num_n = - RTE_MAX(MLX5_MPRQ_STRIDE_NUM_N, - mprq_min_stride_num_n); + RTE_MIN(RTE_MAX(MLX5_MPRQ_STRIDE_NUM_N, + mprq_min_stride_num_n), + mprq_max_stride_num_n); DRV_LOG(WARNING, "the number of strides" " for Multi-Packet RQ is out of range," " setting default value (%u)", 1 << config.mprq.stride_num_n); } + if (config.mprq.stride_size_n && + (config.mprq.stride_size_n > mprq_max_stride_size_n || + config.mprq.stride_size_n < mprq_min_stride_size_n)) { + config.mprq.stride_size_n = + RTE_MIN(RTE_MAX(MLX5_MPRQ_STRIDE_SIZE_N, + mprq_min_stride_size_n), + mprq_max_stride_size_n); + DRV_LOG(WARNING, + "the size of a stride" + " for Multi-Packet RQ is out of range," + " setting default value (%u)", + 1 << config.mprq.stride_size_n); + } config.mprq.min_stride_size_n = mprq_min_stride_size_n; config.mprq.max_stride_size_n = mprq_max_stride_size_n; } else if (config.mprq.enabled && !mprq) { @@ -3363,7 +3382,8 @@ struct mlx5_flow_id_pool * .mr_ext_memseg_en = 1, .mprq = { .enabled = 0, /* Disabled by default. */ - .stride_num_n = MLX5_MPRQ_STRIDE_NUM_N, + .stride_num_n = 0, + .stride_size_n = 0, .max_memcpy_len = MLX5_MPRQ_MEMCPY_DEFAULT_LEN, .min_rxqs_num = MLX5_MPRQ_MIN_RXQS, }, diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index 396dba7..39af7ab 100644 --- a/drivers/net/mlx5/mlx5.h +++ b/drivers/net/mlx5/mlx5.h @@ -179,6 +179,7 @@ struct mlx5_dev_config { struct { unsigned int enabled:1; /* Whether MPRQ is enabled. */ unsigned int stride_num_n; /* Number of strides. 
*/ + unsigned int stride_size_n; /* Size of a stride. */ unsigned int min_stride_size_n; /* Min size of a stride. */ unsigned int max_stride_size_n; /* Max size of a stride. */ unsigned int max_memcpy_len; diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h index 19e8253..260f584 100644 --- a/drivers/net/mlx5/mlx5_defs.h +++ b/drivers/net/mlx5/mlx5_defs.h @@ -143,6 +143,9 @@ /* Log 2 of the default number of strides per WQE for Multi-Packet RQ. */ #define MLX5_MPRQ_STRIDE_NUM_N 6U +/* Log 2 of the default size of a stride per WQE for Multi-Packet RQ. */ +#define MLX5_MPRQ_STRIDE_SIZE_N 11U + /* Two-byte shift is disabled for Multi-Packet RQ. */ #define MLX5_MPRQ_TWO_BYTE_SHIFT 0 diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c index 0a95e3c..7adeabc 100644 --- a/drivers/net/mlx5/mlx5_rxq.c +++ b/drivers/net/mlx5/mlx5_rxq.c @@ -1793,7 +1793,9 @@ struct mlx5_rxq_ctrl * struct mlx5_priv *priv = dev->data->dev_private; struct mlx5_rxq_ctrl *tmpl; unsigned int mb_len = rte_pktmbuf_data_room_size(mp); + unsigned int mprq_stride_nums; unsigned int mprq_stride_size; + unsigned int mprq_stride_cap; struct mlx5_dev_config *config = &priv->config; unsigned int strd_headroom_en; /* @@ -1856,25 +1858,40 @@ struct mlx5_rxq_ctrl * strd_headroom_en = 1; mprq_stride_size = non_scatter_min_mbuf_size; } + mprq_stride_nums = config->mprq.stride_num_n ? + config->mprq.stride_num_n : MLX5_MPRQ_STRIDE_NUM_N; + mprq_stride_size = (mprq_stride_size <= + (1U << config->mprq.max_stride_size_n)) ? + log2above(mprq_stride_size) : MLX5_MPRQ_STRIDE_SIZE_N; + mprq_stride_cap = (config->mprq.stride_num_n ? + (1U << config->mprq.stride_num_n) : (1U << mprq_stride_nums)) * + (config->mprq.stride_size_n ? + (1U << config->mprq.stride_size_n) : (1U << mprq_stride_size)); /* * This Rx queue can be configured as a Multi-Packet RQ if all of the * following conditions are met: * - MPRQ is enabled. * - The number of descs is more than the number of strides. - * - max_rx_pkt_len plus overhead is less than the max size of a - * stride. + * - max_rx_pkt_len plus overhead is less than the max size + * of a stride or mprq_stride_size is specified by an user. + * Need to nake sure that there are enough stides to encap + * the maximum packet size in case mprq_stride_size is set. * Otherwise, enable Rx scatter if necessary. */ - if (mprq_en && - desc > (1U << config->mprq.stride_num_n) && - mprq_stride_size <= (1U << config->mprq.max_stride_size_n)) { + if (mprq_en && desc > (1U << mprq_stride_nums) && + (non_scatter_min_mbuf_size - + (lro_on_queue ? RTE_PKTMBUF_HEADROOM : 0) <= + (1U << config->mprq.max_stride_size_n) || + (config->mprq.stride_size_n && + non_scatter_min_mbuf_size <= mprq_stride_cap))) { /* TODO: Rx scatter isn't supported yet. */ tmpl->rxq.sges_n = 0; /* Trim the number of descs needed. */ - desc >>= config->mprq.stride_num_n; - tmpl->rxq.strd_num_n = config->mprq.stride_num_n; - tmpl->rxq.strd_sz_n = RTE_MAX(log2above(mprq_stride_size), - config->mprq.min_stride_size_n); + desc >>= mprq_stride_nums; + tmpl->rxq.strd_num_n = config->mprq.stride_num_n ? + config->mprq.stride_num_n : mprq_stride_nums; + tmpl->rxq.strd_sz_n = config->mprq.stride_size_n ? 
+ config->mprq.stride_size_n : mprq_stride_size; tmpl->rxq.strd_shift_en = MLX5_MPRQ_TWO_BYTE_SHIFT; tmpl->rxq.strd_headroom_en = strd_headroom_en; tmpl->rxq.mprq_max_memcpy_len = RTE_MIN(first_mb_free_size, @@ -1923,11 +1940,18 @@ struct mlx5_rxq_ctrl * if (mprq_en && !mlx5_rxq_mprq_enabled(&tmpl->rxq)) DRV_LOG(WARNING, "port %u MPRQ is requested but cannot be enabled" - " (requested: desc = %u, stride_sz = %u," - " supported: min_stride_num = %u, max_stride_sz = %u).", - dev->data->port_id, desc, mprq_stride_size, - (1 << config->mprq.stride_num_n), - (1 << config->mprq.max_stride_size_n)); + " (requested: packet size = %u, desc = %u," + " stride_sz = %u, stride_num = %u," + " supported: min_stride_sz = %u, max_stride_sz = %u).", + dev->data->port_id, non_scatter_min_mbuf_size, desc, + config->mprq.stride_size_n ? + (1U << config->mprq.stride_size_n) : + (1U << mprq_stride_size), + config->mprq.stride_num_n ? + (1U << config->mprq.stride_num_n) : + (1U << mprq_stride_nums), + (1U << config->mprq.min_stride_size_n), + (1U << config->mprq.max_stride_size_n)); DRV_LOG(DEBUG, "port %u maximum number of segments per packet: %u", dev->data->port_id, 1 << tmpl->rxq.sges_n); if (desc % (1 << tmpl->rxq.sges_n)) { -- 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
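The devarg validation added to mlx5.c above boils down to clamping an out-of-range value to the default within the hardware limits. A condensed sketch, assuming invented helper and parameter names (only the RTE_MIN/RTE_MAX clamping mirrors the diff):

#include <stdio.h>
#include <rte_common.h>

/* Default log2 of the stride size (2048 bytes), as MLX5_MPRQ_STRIDE_SIZE_N. */
#define DEF_LOG_STRIDE_SIZE 11U

/*
 * Hypothetical helper: pick the log2 stride size to keep in the config,
 * given the devarg value (0 when mprq_log_stride_size was not set) and
 * the limits reported by the device capabilities.
 */
unsigned int
mprq_select_stride_size(unsigned int user_n, unsigned int hw_min_n,
			unsigned int hw_max_n)
{
	if (user_n && (user_n < hw_min_n || user_n > hw_max_n)) {
		unsigned int def_n = RTE_MIN(RTE_MAX(DEF_LOG_STRIDE_SIZE,
						     hw_min_n), hw_max_n);

		/* The driver emits DRV_LOG(WARNING, ...) here instead. */
		printf("stride size out of range, using default (%u bytes)\n",
		       1U << def_n);
		return def_n;
	}
	/* 0 means: derive the stride size from the max packet length later. */
	return user_n;
}

Leaving an unset devarg at 0 matches the default config in mlx5.c, so the Rx queue setup in mlx5_rxq.c can later derive the stride size from the maximum packet length when the user did not choose one.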
* [dpdk-dev] [PATCH v3 2/3] net/mlx5: enable MPRQ multi-stride operations 2020-04-09 21:24 ` [dpdk-dev] [PATCH v3 " Alexander Kozyrev 2020-04-09 21:24 ` [dpdk-dev] [PATCH v3 1/3] net/mlx5: add a devarg to specify MPRQ stride size Alexander Kozyrev @ 2020-04-09 21:24 ` Alexander Kozyrev 2020-04-09 21:24 ` [dpdk-dev] [PATCH v3 3/3] net/mlx5: add multi-segment packets in MPRQ mode Alexander Kozyrev 2 siblings, 0 replies; 28+ messages in thread From: Alexander Kozyrev @ 2020-04-09 21:24 UTC (permalink / raw) To: dev; +Cc: rasland, matan, viacheslavo, stable MPRQ feature should be updated to allow a packet to be received into multiple strides in order to support the MTU exceeding 8KB. Special care is needed to prevent the headroom corruption in the multi-stride mode since the headroom space is borrowed by the PMD from the tail of the preceding stride. Copy the whole packet into a separate mbuf in this case or just the overlapping data if the Rx scattering is supported by an application. Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> --- drivers/net/mlx5/mlx5_rxq.c | 28 ++++-------------- drivers/net/mlx5/mlx5_rxtx.c | 68 +++++++++++++++++++------------------------- drivers/net/mlx5/mlx5_rxtx.h | 2 +- 3 files changed, 36 insertions(+), 62 deletions(-) diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c index 7adeabc..08e4ccb 100644 --- a/drivers/net/mlx5/mlx5_rxq.c +++ b/drivers/net/mlx5/mlx5_rxq.c @@ -1797,7 +1797,6 @@ struct mlx5_rxq_ctrl * unsigned int mprq_stride_size; unsigned int mprq_stride_cap; struct mlx5_dev_config *config = &priv->config; - unsigned int strd_headroom_en; /* * Always allocate extra slots, even if eventually * the vector Rx will not be used. @@ -1843,26 +1842,11 @@ struct mlx5_rxq_ctrl * tmpl->socket = socket; if (dev->data->dev_conf.intr_conf.rxq) tmpl->irq = 1; - /* - * LRO packet may consume all the stride memory, hence we cannot - * guaranty head-room near the packet memory in the stride. - * In this case scatter is, for sure, enabled and an empty mbuf may be - * added in the start for the head-room. - */ - if (lro_on_queue && RTE_PKTMBUF_HEADROOM > 0 && - non_scatter_min_mbuf_size > mb_len) { - strd_headroom_en = 0; - mprq_stride_size = RTE_MIN(max_rx_pkt_len, - 1u << config->mprq.max_stride_size_n); - } else { - strd_headroom_en = 1; - mprq_stride_size = non_scatter_min_mbuf_size; - } mprq_stride_nums = config->mprq.stride_num_n ? config->mprq.stride_num_n : MLX5_MPRQ_STRIDE_NUM_N; - mprq_stride_size = (mprq_stride_size <= - (1U << config->mprq.max_stride_size_n)) ? - log2above(mprq_stride_size) : MLX5_MPRQ_STRIDE_SIZE_N; + mprq_stride_size = non_scatter_min_mbuf_size <= + (1U << config->mprq.max_stride_size_n) ? + log2above(non_scatter_min_mbuf_size) : MLX5_MPRQ_STRIDE_SIZE_N; mprq_stride_cap = (config->mprq.stride_num_n ? (1U << config->mprq.stride_num_n) : (1U << mprq_stride_nums)) * (config->mprq.stride_size_n ? @@ -1879,8 +1863,7 @@ struct mlx5_rxq_ctrl * * Otherwise, enable Rx scatter if necessary. */ if (mprq_en && desc > (1U << mprq_stride_nums) && - (non_scatter_min_mbuf_size - - (lro_on_queue ? RTE_PKTMBUF_HEADROOM : 0) <= + (non_scatter_min_mbuf_size <= (1U << config->mprq.max_stride_size_n) || (config->mprq.stride_size_n && non_scatter_min_mbuf_size <= mprq_stride_cap))) { @@ -1893,7 +1876,8 @@ struct mlx5_rxq_ctrl * tmpl->rxq.strd_sz_n = config->mprq.stride_size_n ? 
config->mprq.stride_size_n : mprq_stride_size; tmpl->rxq.strd_shift_en = MLX5_MPRQ_TWO_BYTE_SHIFT; - tmpl->rxq.strd_headroom_en = strd_headroom_en; + tmpl->rxq.strd_scatter_en = + !!(offloads & DEV_RX_OFFLOAD_SCATTER); tmpl->rxq.mprq_max_memcpy_len = RTE_MIN(first_mb_free_size, config->mprq.max_memcpy_len); max_lro_size = RTE_MIN(max_rx_pkt_len, diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c index f3bf763..4c27952 100644 --- a/drivers/net/mlx5/mlx5_rxtx.c +++ b/drivers/net/mlx5/mlx5_rxtx.c @@ -1658,21 +1658,20 @@ enum mlx5_txcmp_code { unsigned int i = 0; uint32_t rq_ci = rxq->rq_ci; uint16_t consumed_strd = rxq->consumed_strd; - uint16_t headroom_sz = rxq->strd_headroom_en * RTE_PKTMBUF_HEADROOM; struct mlx5_mprq_buf *buf = (*rxq->mprq_bufs)[rq_ci & wq_mask]; while (i < pkts_n) { struct rte_mbuf *pkt; void *addr; int ret; - unsigned int len; + uint32_t len; uint16_t strd_cnt; uint16_t strd_idx; uint32_t offset; uint32_t byte_cnt; + int32_t hdrm_overlap; volatile struct mlx5_mini_cqe8 *mcqe = NULL; uint32_t rss_hash_res = 0; - uint8_t lro_num_seg; if (consumed_strd == strd_n) { /* Replace WQE only if the buffer is still in use. */ @@ -1719,18 +1718,6 @@ enum mlx5_txcmp_code { MLX5_ASSERT(strd_idx < strd_n); MLX5_ASSERT(!((rte_be_to_cpu_16(cqe->wqe_id) ^ rq_ci) & wq_mask)); - lro_num_seg = cqe->lro_num_seg; - /* - * Currently configured to receive a packet per a stride. But if - * MTU is adjusted through kernel interface, device could - * consume multiple strides without raising an error. In this - * case, the packet should be dropped because it is bigger than - * the max_rx_pkt_len. - */ - if (unlikely(!lro_num_seg && strd_cnt > 1)) { - ++rxq->stats.idropped; - continue; - } pkt = rte_pktmbuf_alloc(rxq->mp); if (unlikely(pkt == NULL)) { ++rxq->stats.rx_nombuf; @@ -1742,12 +1729,16 @@ enum mlx5_txcmp_code { len -= RTE_ETHER_CRC_LEN; offset = strd_idx * strd_sz + strd_shift; addr = RTE_PTR_ADD(mlx5_mprq_buf_addr(buf, strd_n), offset); + hdrm_overlap = len + RTE_PKTMBUF_HEADROOM - strd_cnt * strd_sz; /* * Memcpy packets to the target mbuf if: * - The size of packet is smaller than mprq_max_memcpy_len. * - Out of buffer in the Mempool for Multi-Packet RQ. + * - There is no space for a headroom and scatter is disabled. */ - if (len <= rxq->mprq_max_memcpy_len || rxq->mprq_repl == NULL) { + if (len <= rxq->mprq_max_memcpy_len || + rxq->mprq_repl == NULL || + (hdrm_overlap > 0 && !rxq->strd_scatter_en)) { /* * When memcpy'ing packet due to out-of-buffer, the * packet must be smaller than the target mbuf. @@ -1769,7 +1760,7 @@ enum mlx5_txcmp_code { rte_atomic16_add_return(&buf->refcnt, 1); MLX5_ASSERT((uint16_t)rte_atomic16_read(&buf->refcnt) <= strd_n + 1); - buf_addr = RTE_PTR_SUB(addr, headroom_sz); + buf_addr = RTE_PTR_SUB(addr, RTE_PKTMBUF_HEADROOM); /* * MLX5 device doesn't use iova but it is necessary in a * case where the Rx packet is transmitted via a @@ -1788,43 +1779,42 @@ enum mlx5_txcmp_code { rte_pktmbuf_attach_extbuf(pkt, buf_addr, buf_iova, buf_len, shinfo); /* Set mbuf head-room. */ - pkt->data_off = headroom_sz; + SET_DATA_OFF(pkt, RTE_PKTMBUF_HEADROOM); MLX5_ASSERT(pkt->ol_flags == EXT_ATTACHED_MBUF); - /* - * Prevent potential overflow due to MTU change through - * kernel interface. - */ - if (unlikely(rte_pktmbuf_tailroom(pkt) < len)) { - rte_pktmbuf_free_seg(pkt); - ++rxq->stats.idropped; - continue; - } + MLX5_ASSERT(rte_pktmbuf_tailroom(pkt) < + len - (hdrm_overlap > 0 ? 
hdrm_overlap : 0)); DATA_LEN(pkt) = len; /* - * LRO packet may consume all the stride memory, in this - * case packet head-room space is not guaranteed so must - * to add an empty mbuf for the head-room. + * Copy the last fragment of a packet (up to headroom + * size bytes) in case there is a stride overlap with + * a next packet's headroom. Allocate a separate mbuf + * to store this fragment and link it. Scatter is on. */ - if (!rxq->strd_headroom_en) { - struct rte_mbuf *headroom_mbuf = - rte_pktmbuf_alloc(rxq->mp); + if (hdrm_overlap > 0) { + MLX5_ASSERT(rxq->strd_scatter_en); + struct rte_mbuf *seg = + rte_pktmbuf_alloc(rxq->mp); - if (unlikely(headroom_mbuf == NULL)) { + if (unlikely(seg == NULL)) { rte_pktmbuf_free_seg(pkt); ++rxq->stats.rx_nombuf; break; } - PORT(pkt) = rxq->port_id; - NEXT(headroom_mbuf) = pkt; - pkt = headroom_mbuf; + SET_DATA_OFF(seg, 0); + rte_memcpy(rte_pktmbuf_mtod(seg, void *), + RTE_PTR_ADD(addr, len - hdrm_overlap), + hdrm_overlap); + DATA_LEN(seg) = hdrm_overlap; + DATA_LEN(pkt) = len - hdrm_overlap; + NEXT(pkt) = seg; NB_SEGS(pkt) = 2; } } rxq_cq_to_mbuf(rxq, pkt, cqe, rss_hash_res); - if (lro_num_seg > 1) { + if (cqe->lro_num_seg > 1) { mlx5_lro_update_hdr(addr, cqe, len); pkt->ol_flags |= PKT_RX_LRO; - pkt->tso_segsz = strd_sz; + pkt->tso_segsz = len / cqe->lro_num_seg; } PKT_LEN(pkt) = len; PORT(pkt) = rxq->port_id; diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h index 939778a..d155c24 100644 --- a/drivers/net/mlx5/mlx5_rxtx.h +++ b/drivers/net/mlx5/mlx5_rxtx.h @@ -119,7 +119,7 @@ struct mlx5_rxq_data { unsigned int strd_sz_n:4; /* Log 2 of stride size. */ unsigned int strd_shift_en:1; /* Enable 2bytes shift on a stride. */ unsigned int err_state:2; /* enum mlx5_rxq_err_state. */ - unsigned int strd_headroom_en:1; /* Enable mbuf headroom in MPRQ. */ + unsigned int strd_scatter_en:1; /* Scattered packets from a stride. */ unsigned int lro:1; /* Enable LRO. */ unsigned int :1; /* Remaining bits. */ volatile uint32_t *rq_db; -- 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
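The key quantity in this patch is the overlap between a multi-stride packet and the headroom borrowed from the next stride. A minimal sketch of the two checks, with hypothetical helper names; the expressions follow the hdrm_overlap computation and the memcpy condition in the diff:

#include <stdint.h>
#include <stdbool.h>

/*
 * A packet of `pkt_len` bytes occupying `strd_cnt` strides of `strd_sz`
 * bytes overlaps the borrowed headroom area when the result is positive.
 */
static inline int32_t
mprq_headroom_overlap(uint32_t pkt_len, uint32_t headroom,
		      uint16_t strd_cnt, uint32_t strd_sz)
{
	return (int32_t)(pkt_len + headroom) - (int32_t)(strd_cnt * strd_sz);
}

/*
 * Copy into the mbuf (instead of attaching the stride as an external
 * buffer) when the packet is small, the replacement pool is exhausted, or
 * the headroom would be corrupted and Rx scatter is not enabled.
 */
static inline bool
mprq_must_copy(uint32_t pkt_len, uint32_t max_memcpy_len,
	       bool repl_pool_empty, int32_t hdrm_overlap, bool scatter_en)
{
	return pkt_len <= max_memcpy_len || repl_pool_empty ||
	       (hdrm_overlap > 0 && !scatter_en);
}

When the copy condition is false, the stride is attached as an external buffer as before, and if the overlap is positive only the overlapping tail is copied into a second, linked mbuf segment.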
* [dpdk-dev] [PATCH v3 3/3] net/mlx5: add multi-segment packets in MPRQ mode 2020-04-09 21:24 ` [dpdk-dev] [PATCH v3 " Alexander Kozyrev 2020-04-09 21:24 ` [dpdk-dev] [PATCH v3 1/3] net/mlx5: add a devarg to specify MPRQ stride size Alexander Kozyrev 2020-04-09 21:24 ` [dpdk-dev] [PATCH v3 2/3] net/mlx5: enable MPRQ multi-stride operations Alexander Kozyrev @ 2020-04-09 21:24 ` Alexander Kozyrev 2 siblings, 0 replies; 28+ messages in thread From: Alexander Kozyrev @ 2020-04-09 21:24 UTC (permalink / raw) To: dev; +Cc: rasland, matan, viacheslavo, stable The multi-stride operations now allow to reduce a stride size while supporting Jumbo frames. That means that it is possible to have mbufs configured with a size smaller than the whole packet received. It is not an issue during normal MPRQ operations since we attach external buffers instead of copying the data into the mbuf itself. But it is not the case in "emergency mode" when we have to copy every packet because of no more external mbufs are available. Assemble a multi-segment packet to overcome this issue in case scatter mode is enabled, drop a packet if not. Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> --- drivers/net/mlx5/mlx5_rxtx.c | 47 ++++++++++++++++++++++++++++++++++++-------- 1 file changed, 39 insertions(+), 8 deletions(-) diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c index 4c27952..7ce3732 100644 --- a/drivers/net/mlx5/mlx5_rxtx.c +++ b/drivers/net/mlx5/mlx5_rxtx.c @@ -1734,22 +1734,52 @@ enum mlx5_txcmp_code { * Memcpy packets to the target mbuf if: * - The size of packet is smaller than mprq_max_memcpy_len. * - Out of buffer in the Mempool for Multi-Packet RQ. - * - There is no space for a headroom and scatter is disabled. + * - The packet's stride overlaps a headroom and scatter is off. */ if (len <= rxq->mprq_max_memcpy_len || rxq->mprq_repl == NULL || (hdrm_overlap > 0 && !rxq->strd_scatter_en)) { - /* - * When memcpy'ing packet due to out-of-buffer, the - * packet must be smaller than the target mbuf. - */ - if (unlikely(rte_pktmbuf_tailroom(pkt) < len)) { + if (likely(rte_pktmbuf_tailroom(pkt) >= len)) { + rte_memcpy(rte_pktmbuf_mtod(pkt, void *), + addr, len); + DATA_LEN(pkt) = len; + } else if (rxq->strd_scatter_en) { + struct rte_mbuf *prev = pkt; + uint32_t seg_len = + RTE_MIN(rte_pktmbuf_tailroom(pkt), len); + uint32_t rem_len = len - seg_len; + + rte_memcpy(rte_pktmbuf_mtod(pkt, void *), + addr, seg_len); + DATA_LEN(pkt) = seg_len; + while (rem_len) { + struct rte_mbuf *next = + rte_pktmbuf_alloc(rxq->mp); + + if (unlikely(next == NULL)) { + rte_pktmbuf_free(pkt); + ++rxq->stats.rx_nombuf; + goto out; + } + NEXT(prev) = next; + SET_DATA_OFF(next, 0); + addr = RTE_PTR_ADD(addr, seg_len); + seg_len = RTE_MIN + (rte_pktmbuf_tailroom(next), + rem_len); + rte_memcpy + (rte_pktmbuf_mtod(next, void *), + addr, seg_len); + DATA_LEN(next) = seg_len; + rem_len -= seg_len; + prev = next; + ++NB_SEGS(pkt); + } + } else { rte_pktmbuf_free_seg(pkt); ++rxq->stats.idropped; continue; } - rte_memcpy(rte_pktmbuf_mtod(pkt, void *), addr, len); - DATA_LEN(pkt) = len; } else { rte_iova_t buf_iova; struct rte_mbuf_ext_shared_info *shinfo; @@ -1826,6 +1856,7 @@ enum mlx5_txcmp_code { *(pkts++) = pkt; ++i; } +out: /* Update the consumer indexes. */ rxq->consumed_strd = consumed_strd; rte_cio_wmb(); -- 1.8.3.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
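The emergency-copy path now builds an mbuf chain instead of dropping packets that exceed a single mbuf. An approximate, self-contained version of that loop is sketched below; the helper name is hypothetical and plain field accesses stand in for the DATA_LEN/NEXT/NB_SEGS macros used by the driver:

#include <stdint.h>
#include <string.h>
#include <rte_common.h>
#include <rte_mbuf.h>

/*
 * Copy `len` bytes starting at `addr` into `pkt`, chaining extra mbufs
 * from `mp` when the packet does not fit into a single buffer.
 * Returns 0 on success and -1 if an allocation fails (the caller then
 * frees `pkt` and counts the drop).
 */
int
mprq_copy_to_chain(struct rte_mbuf *pkt, struct rte_mempool *mp,
		   const uint8_t *addr, uint32_t len)
{
	struct rte_mbuf *prev = pkt;
	uint32_t seg_len = RTE_MIN((uint32_t)rte_pktmbuf_tailroom(pkt), len);
	uint32_t rem_len = len - seg_len;

	memcpy(rte_pktmbuf_mtod(pkt, void *), addr, seg_len);
	pkt->data_len = seg_len;
	pkt->nb_segs = 1;
	while (rem_len) {
		struct rte_mbuf *next = rte_pktmbuf_alloc(mp);

		if (next == NULL)
			return -1;
		prev->next = next;
		next->data_off = 0; /* no headroom needed in chained segments */
		addr += seg_len;
		seg_len = RTE_MIN((uint32_t)rte_pktmbuf_tailroom(next), rem_len);
		memcpy(rte_pktmbuf_mtod(next, void *), addr, seg_len);
		next->data_len = seg_len;
		rem_len -= seg_len;
		pkt->nb_segs++;
		prev = next;
	}
	pkt->pkt_len = len; /* total length lives in the first segment */
	return 0;
}

If any allocation in the chain fails, the whole packet is freed and rx_nombuf is incremented, which corresponds to the goto out path in the patch; when Rx scatter is not enabled the packet is simply dropped as before.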
Thread overview: 28+ messages
2020-03-31 21:52 [dpdk-dev] [PATCH 0/4] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev
2020-03-31 21:52 ` [dpdk-dev] [PATCH 1/4] net/mlx5: add a devarg to specify MPRQ stride size Alexander Kozyrev
2020-04-02 10:00 ` Slava Ovsiienko
2020-03-31 21:52 ` [dpdk-dev] [PATCH 2/4] net/mlx5: enable MPRQ multi-stride operations Alexander Kozyrev
2020-04-02 10:01 ` Slava Ovsiienko
2020-03-31 21:52 ` [dpdk-dev] [PATCH 3/4] doc: add a decsription for MPRQ stride size devarg Alexander Kozyrev
2020-03-31 21:52 ` [dpdk-dev] [PATCH 4/4] net/mlx5: add multi-segment packets in MPRQ mode Alexander Kozyrev
2020-04-02 10:02 ` Slava Ovsiienko
2020-04-02 18:11 ` [dpdk-dev] [PATCH 0/3] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev
2020-04-02 18:11 ` [dpdk-dev] [PATCH 1/3] net/mlx5: add a devarg to specify MPRQ stride size Alexander Kozyrev
2020-04-02 18:11 ` [dpdk-dev] [PATCH 2/3] net/mlx5: enable MPRQ multi-stride operations Alexander Kozyrev
2020-04-02 18:11 ` [dpdk-dev] [PATCH 3/3] net/mlx5: add multi-segment packets in MPRQ mode Alexander Kozyrev
2020-04-09 22:23 ` [dpdk-dev] [PATCH v4 0/3] net/mlx5: add large packet size support to MPRQ Alexander Kozyrev
2020-04-09 22:23 ` [dpdk-dev] [PATCH v4 1/3] net/mlx5: add a devarg to specify MPRQ stride size Alexander Kozyrev
2020-04-14 11:42 ` Ferruh Yigit
2020-04-14 12:52 ` Thomas Monjalon
2020-04-15 11:01 ` Ferruh Yigit
2020-04-15 11:25 ` Luca Boccassi
2020-04-15 15:34 ` Alexander Kozyrev
2020-04-15 15:52 ` [dpdk-dev] [dpdk-stable] " Luca Boccassi
2020-04-09 22:23 ` [dpdk-dev] [PATCH v4 2/3] net/mlx5: enable MPRQ multi-stride operations Alexander Kozyrev
2020-04-09 22:23 ` [dpdk-dev] [PATCH v4 3/3] net/mlx5: add multi-segment packets in MPRQ mode Alexander Kozyrev
2020-04-10 14:01 ` [dpdk-dev] [PATCH v4 0/3] net/mlx5: add large packet size support to MPRQ Matan Azrad
2020-04-13 10:57 ` Raslan Darawsheh
2020-04-09 21:24 ` [dpdk-dev] [PATCH v3 " Alexander Kozyrev
2020-04-09 21:24 ` [dpdk-dev] [PATCH v3 1/3] net/mlx5: add a devarg to specify MPRQ stride size Alexander Kozyrev
2020-04-09 21:24 ` [dpdk-dev] [PATCH v3 2/3] net/mlx5: enable MPRQ multi-stride operations Alexander Kozyrev
2020-04-09 21:24 ` [dpdk-dev] [PATCH v3 3/3] net/mlx5: add multi-segment packets in MPRQ mode Alexander Kozyrev