From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr0-f182.google.com (mail-wr0-f182.google.com [209.85.128.182]) by dpdk.org (Postfix) with ESMTP id E05742BB1 for ; Wed, 1 Mar 2017 15:33:40 +0100 (CET) Received: by mail-wr0-f182.google.com with SMTP id g10so31482039wrg.2 for ; Wed, 01 Mar 2017 06:33:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to :user-agent; bh=UHLdwKD5iLRptIkrTUffZ2NxCcQLEDrpgG8kJRrnCNc=; b=FzHNOMBb7IB7jdt8GKzjCb+ZgKxfkyL+Aw9J1D4j9ZRPulOtmGqi76ER6zhoQ2suGR IQXGctL5xD90HO9i2g8d1Aqrkf5EANXaakSZNqO9/EZ8J75O/HSTde/2X4ukNEsX5bjx RAjBgRucr/k79muyN6guCI7EBnVBQF7GvMlCxE9amlNOZi1XqR+AxfmWomAFT66i2zs1 tLltj6VSBk7bpl4rQjU1MFwRJwmfb6DMpTJGmb5dl1x8+l7OhW4KBg+k60B28DImJJzQ rd956qBEL3Nq4yG6S6EFSzlpI/5tR7HiKUREo4P3WSrPrFW3M/3w+OcIuYtoOBNCNlxd 1vhw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to:user-agent; bh=UHLdwKD5iLRptIkrTUffZ2NxCcQLEDrpgG8kJRrnCNc=; b=XU/oyKm12xd69ji8BL2ERzRpD4QHNoD2ykcwkvgygqHCi/aosAwkDcMxB13EATi/FU dA+X6FPHfWa9xsWJxz/WhDCYo+C0rKC9bMoRuOr+hTuTvce3phAbDfzmj9CYdN4cu1So 6pvYBfVzT0EFJCjIOxxTDuGNFVYy6xt9g3zG3p0AAnKFf6N0C7jMqHQ/uSoUcrDjxtvp GSHkK7gRLpDeNIldLDqxOn5RjJBRwQalbsiTLh2o0DzH6ubQLWKLFQaQMFAdmLyODHn/ IDUbjshxLjBBFMkzsiZGaIh3v3NBcSUp5rNhae31CT8jaJwlJd3gB52/WRyxzvEkV0Oz A6xg== X-Gm-Message-State: AMke39nQ5sspTiFyXsXV1I7ynX/E9lBaLWw5dBI+VWyKbpNFRRhs/Rf0zmUHTKXn2YG1ALv5 X-Received: by 10.223.177.142 with SMTP id q14mr8456395wra.194.1488378820129; Wed, 01 Mar 2017 06:33:40 -0800 (PST) Received: from autoinstall.dev.6wind.com (host.78.145.23.62.rev.coltfrance.com. [62.23.145.78]) by smtp.gmail.com with ESMTPSA id 36sm6873402wrz.8.2017.03.01.06.33.39 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 01 Mar 2017 06:33:39 -0800 (PST) Date: Wed, 1 Mar 2017 15:33:31 +0100 From: =?iso-8859-1?Q?N=E9lio?= Laranjeiro To: Shahaf Shuler Cc: adrien.mazarguil@6wind.com, dev@dpdk.org Message-ID: <20170301143331.GW22756@autoinstall.dev.6wind.com> References: <1487779800-46491-1-git-send-email-shahafs@mellanox.com> <1488366702-51373-1-git-send-email-shahafs@mellanox.com> <1488366702-51373-2-git-send-email-shahafs@mellanox.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <1488366702-51373-2-git-send-email-shahafs@mellanox.com> User-Agent: Mutt/1.5.23 (2014-03-12) Subject: Re: [dpdk-dev] [PATCH v2 1/1] net/mlx5: add hardware TSO support X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 01 Mar 2017 14:33:41 -0000 Sahaf, Some few remarks below. On Wed, Mar 01, 2017 at 01:11:42PM +0200, Shahaf Shuler wrote: > Implement support for hardware TSO. > > Signed-off-by: Shahaf Shuler > --- > on v2: > * Instead of exposing capability, TSO checks on data path. > * PMD specific parameter to enable TSO. > * different implementaion for the data path. > Performance impact ~0.1-0.2Mpps > --- > doc/guides/nics/features/mlx5.ini | 1 + > doc/guides/nics/mlx5.rst | 12 ++++ > drivers/net/mlx5/mlx5.c | 18 ++++++ > drivers/net/mlx5/mlx5.h | 2 + > drivers/net/mlx5/mlx5_defs.h | 3 + > drivers/net/mlx5/mlx5_ethdev.c | 2 + > drivers/net/mlx5/mlx5_rxtx.c | 120 +++++++++++++++++++++++++++++++++----- > drivers/net/mlx5/mlx5_rxtx.h | 2 + > drivers/net/mlx5/mlx5_txq.c | 13 +++++ > 9 files changed, 157 insertions(+), 16 deletions(-) > > diff --git a/doc/guides/nics/features/mlx5.ini b/doc/guides/nics/features/mlx5.ini > index f20d214..c6948cb 100644 > --- a/doc/guides/nics/features/mlx5.ini > +++ b/doc/guides/nics/features/mlx5.ini > @@ -19,6 +19,7 @@ RSS hash = Y > RSS key update = Y > RSS reta update = Y > SR-IOV = Y > +TSO = Y This file expects spaces to align the equal sign. > VLAN filter = Y > Flow director = Y > Flow API = Y > diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst > index 09922a0..8651456 100644 > --- a/doc/guides/nics/mlx5.rst > +++ b/doc/guides/nics/mlx5.rst > @@ -90,6 +90,7 @@ Features > - Secondary process TX is supported. > - KVM and VMware ESX SR-IOV modes are supported. > - RSS hash result is supported. > +- Hardware TSO. > > Limitations > ----------- > @@ -186,9 +187,20 @@ Run-time configuration > save PCI bandwidth and improve performance at the cost of a slightly > higher CPU usage. > > + This option cannot be used in conjunction with ``tso`` below. When ``tso`` > + is set, ``txq_mpw_en`` is disabled. > + > It is currently only supported on the ConnectX-4 Lx and ConnectX-5 > families of adapters. Enabled by default. > > +- ``tso`` parameter [int] > + > + A nonzero value enables hardware TSO. > + When hardware TSO is enabled, packets marked with TCP segmentation > + offload will be divided into segments by the hardware. > + > + Disabled by default. > + > Prerequisites > ------------- > > diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c > index d4bd469..3623fbe 100644 > --- a/drivers/net/mlx5/mlx5.c > +++ b/drivers/net/mlx5/mlx5.c > @@ -84,6 +84,9 @@ > /* Device parameter to enable multi-packet send WQEs. */ > #define MLX5_TXQ_MPW_EN "txq_mpw_en" > > +/* Device parameter to enable hardware TSO offload. */ > +#define MLX5_TSO "tso" > + > /** > * Retrieve integer value from environment variable. > * > @@ -290,6 +293,8 @@ > priv->txqs_inline = tmp; > } else if (strcmp(MLX5_TXQ_MPW_EN, key) == 0) { > priv->mps &= !!tmp; /* Enable MPW only if HW supports */ > + } else if (strcmp(MLX5_TSO, key) == 0) { > + priv->tso = !!tmp; > } else { > WARN("%s: unknown parameter", key); > return -EINVAL; > @@ -316,6 +321,7 @@ > MLX5_TXQ_INLINE, > MLX5_TXQS_MIN_INLINE, > MLX5_TXQ_MPW_EN, > + MLX5_TSO, > NULL, > }; > struct rte_kvargs *kvlist; > @@ -479,6 +485,7 @@ > IBV_EXP_DEVICE_ATTR_RX_HASH | > IBV_EXP_DEVICE_ATTR_VLAN_OFFLOADS | > IBV_EXP_DEVICE_ATTR_RX_PAD_END_ALIGN | > + IBV_EXP_DEVICE_ATTR_TSO_CAPS | > 0; > > DEBUG("using port %u (%08" PRIx32 ")", port, test); > @@ -580,11 +587,22 @@ > > priv_get_num_vfs(priv, &num_vfs); > priv->sriov = (num_vfs || sriov); > + priv->tso = ((priv->tso) && > + (exp_device_attr.tso_caps.max_tso > 0) && > + (exp_device_attr.tso_caps.supported_qpts & > + (1 << IBV_QPT_RAW_ETH))); > + if (priv->tso) > + priv->max_tso_payload_sz = > + exp_device_attr.tso_caps.max_tso; > if (priv->mps && !mps) { > ERROR("multi-packet send not supported on this device" > " (" MLX5_TXQ_MPW_EN ")"); > err = ENOTSUP; > goto port_error; > + } else if (priv->mps && priv->tso) { > + WARN("multi-paclet send not supported in conjunction " > + "with TSO. PMD overrides to 0"); > + priv->mps = 0; > } It is not so clear which of both features are disabled unless reading the code. Is not it better to replace "PMD overrides to 0" by "MPS disabled"? > /* Allocate and register default RSS hash keys. */ > priv->rss_conf = rte_calloc(__func__, hash_rxq_init_n, > diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h > index 2b4345a..d2bb835 100644 > --- a/drivers/net/mlx5/mlx5.h > +++ b/drivers/net/mlx5/mlx5.h > @@ -126,6 +126,8 @@ struct priv { > unsigned int mps:1; /* Whether multi-packet send is supported. */ > unsigned int cqe_comp:1; /* Whether CQE compression is enabled. */ > unsigned int pending_alarm:1; /* An alarm is pending. */ > + unsigned int tso:1; /* Whether TSO is supported. */ > + unsigned int max_tso_payload_sz; /* Maximum TCP payload for TSO. */ > unsigned int txq_inline; /* Maximum packet size for inlining. */ > unsigned int txqs_inline; /* Queue number threshold for inlining. */ > /* RX/TX queues. */ > diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h > index e91d245..eecb908 100644 > --- a/drivers/net/mlx5/mlx5_defs.h > +++ b/drivers/net/mlx5/mlx5_defs.h > @@ -79,4 +79,7 @@ > /* Maximum number of extended statistics counters. */ > #define MLX5_MAX_XSTATS 32 > > +/* Maximum Packet headers size (L2+L3+L4) for TSO. */ > +#define MLX5_MAX_TSO_HEADER 128 > + > #endif /* RTE_PMD_MLX5_DEFS_H_ */ > diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c > index 5677f03..5542193 100644 > --- a/drivers/net/mlx5/mlx5_ethdev.c > +++ b/drivers/net/mlx5/mlx5_ethdev.c > @@ -693,6 +693,8 @@ struct priv * > (DEV_TX_OFFLOAD_IPV4_CKSUM | > DEV_TX_OFFLOAD_UDP_CKSUM | > DEV_TX_OFFLOAD_TCP_CKSUM); > + if (priv->tso) > + info->tx_offload_capa |= DEV_TX_OFFLOAD_TCP_TSO; > if (priv_get_ifname(priv, &ifname) == 0) > info->if_index = if_nametoindex(ifname); > /* FIXME: RETA update/query API expects the callee to know the size of > diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c > index b2b7223..3589aae 100644 > --- a/drivers/net/mlx5/mlx5_rxtx.c > +++ b/drivers/net/mlx5/mlx5_rxtx.c > @@ -365,6 +365,7 @@ > const unsigned int elts_n = 1 << txq->elts_n; > unsigned int i = 0; > unsigned int j = 0; > + unsigned int k = 0; > unsigned int max; > uint16_t max_wqe; > unsigned int comp; > @@ -392,8 +393,10 @@ > uintptr_t addr; > uint64_t naddr; > uint16_t pkt_inline_sz = MLX5_WQE_DWORD_SIZE + 2; > + uint16_t tso_header_sz = 0; > uint16_t ehdr; > uint8_t cs_flags = 0; > + uint64_t tso = 0; > #ifdef MLX5_PMD_SOFT_COUNTERS > uint32_t total_length = 0; > #endif > @@ -465,14 +468,71 @@ > length -= pkt_inline_sz; > addr += pkt_inline_sz; > } > + if (txq->tso_en) { > + tso = buf->ol_flags & PKT_TX_TCP_SEG; > + if (tso) { > + uintptr_t end = (uintptr_t) > + (((uintptr_t)txq->wqes) + > + (1 << txq->wqe_n) * > + MLX5_WQE_SIZE); > + unsigned int copy_b; > + > + tso_header_sz = buf->l2_len + buf->l3_len + > + buf->l4_len; > + if (unlikely(tso_header_sz > > + MLX5_MAX_TSO_HEADER)) > + break; > + copy_b = tso_header_sz - pkt_inline_sz; > + /* First seg must contain all headers. */ > + assert(copy_b <= length); > + raw += MLX5_WQE_DWORD_SIZE; > + if (copy_b && > + ((end - (uintptr_t)raw) > copy_b)) { > + uint16_t n = (MLX5_WQE_DS(copy_b) - > + 1 + 3) / 4; > + > + if (unlikely(max_wqe < n)) > + break; > + max_wqe -= n; > + rte_memcpy((void *)raw, > + (void *)addr, copy_b); > + addr += copy_b; > + length -= copy_b; > + pkt_inline_sz += copy_b; > + /* > + * Another DWORD will be added > + * in the inline part. > + */ > + raw += MLX5_WQE_DS(copy_b) * > + MLX5_WQE_DWORD_SIZE - > + MLX5_WQE_DWORD_SIZE; > + } else { > + /* NOP WQE. */ > + wqe->ctrl = (rte_v128u32_t){ > + htonl(txq->wqe_ci << 8), > + htonl(txq->qp_num_8s | 1), > + 0, > + 0, > + }; > + ds = 1; > + total_length = 0; > + pkts--; > + pkts_n++; > + elts_head = (elts_head - 1) & > + (elts_n - 1); > + k++; > + goto next_wqe; > + } > + } > + } > /* Inline if enough room. */ > - if (txq->max_inline) { > + if (txq->inline_en || tso) { > uintptr_t end = (uintptr_t) > (((uintptr_t)txq->wqes) + > (1 << txq->wqe_n) * MLX5_WQE_SIZE); > unsigned int max_inline = txq->max_inline * > RTE_CACHE_LINE_SIZE - > - MLX5_WQE_DWORD_SIZE; > + (pkt_inline_sz - 2); > uintptr_t addr_end = (addr + max_inline) & > ~(RTE_CACHE_LINE_SIZE - 1); > unsigned int copy_b = (addr_end > addr) ? > @@ -491,6 +551,18 @@ > if (unlikely(max_wqe < n)) > break; > max_wqe -= n; > + if (tso) { > + uint32_t inl = > + htonl(copy_b | MLX5_INLINE_SEG); > + > + pkt_inline_sz = > + MLX5_WQE_DS(tso_header_sz) * > + MLX5_WQE_DWORD_SIZE; > + rte_memcpy((void *)raw, > + (void *)&inl, sizeof(inl)); > + raw += sizeof(inl); > + pkt_inline_sz += sizeof(inl); > + } > rte_memcpy((void *)raw, (void *)addr, copy_b); > addr += copy_b; > length -= copy_b; > @@ -591,18 +663,34 @@ > next_pkt: > ++i; > /* Initialize known and common part of the WQE structure. */ > - wqe->ctrl = (rte_v128u32_t){ > - htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND), > - htonl(txq->qp_num_8s | ds), > - 0, > - 0, > - }; > - wqe->eseg = (rte_v128u32_t){ > - 0, > - cs_flags, > - 0, > - (ehdr << 16) | htons(pkt_inline_sz), > - }; > + if (tso) { > + wqe->ctrl = (rte_v128u32_t){ > + htonl((txq->wqe_ci << 8) | MLX5_OPCODE_TSO), > + htonl(txq->qp_num_8s | ds), > + 0, > + 0, > + }; > + wqe->eseg = (rte_v128u32_t){ > + 0, > + cs_flags | (htons(buf->tso_segsz) << 16), > + 0, > + (ehdr << 16) | htons(tso_header_sz), > + }; > + } else { > + wqe->ctrl = (rte_v128u32_t){ > + htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND), > + htonl(txq->qp_num_8s | ds), > + 0, > + 0, > + }; > + wqe->eseg = (rte_v128u32_t){ > + 0, > + cs_flags, > + 0, > + (ehdr << 16) | htons(pkt_inline_sz), > + }; > + } > +next_wqe: > txq->wqe_ci += (ds + 3) / 4; > #ifdef MLX5_PMD_SOFT_COUNTERS > /* Increment sent bytes counter. */ > @@ -610,10 +698,10 @@ > #endif > } while (pkts_n); > /* Take a shortcut if nothing must be sent. */ > - if (unlikely(i == 0)) > + if (unlikely((i + k) == 0)) > return 0; > /* Check whether completion threshold has been reached. */ > - comp = txq->elts_comp + i + j; > + comp = txq->elts_comp + i + j + k; > if (comp >= MLX5_TX_COMP_THRESH) { > volatile struct mlx5_wqe_ctrl *w = > (volatile struct mlx5_wqe_ctrl *)wqe; > diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h > index 41a34d7..6b328cf 100644 > --- a/drivers/net/mlx5/mlx5_rxtx.h > +++ b/drivers/net/mlx5/mlx5_rxtx.h > @@ -254,6 +254,8 @@ struct txq { > uint16_t cqe_n:4; /* Number of CQ elements (in log2). */ > uint16_t wqe_n:4; /* Number of of WQ elements (in log2). */ > uint16_t max_inline; /* Multiple of RTE_CACHE_LINE_SIZE to inline. */ > + uint16_t inline_en:1; /* When set inline is enabled. */ > + uint16_t tso_en:1; /* When set hardware TSO is enabled. */ > uint32_t qp_num_8s; /* QP number shifted by 8. */ > volatile struct mlx5_cqe (*cqes)[]; /* Completion queue. */ > volatile void *wqes; /* Work queue (use volatile to write into). */ > diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c > index 949035b..995b763 100644 > --- a/drivers/net/mlx5/mlx5_txq.c > +++ b/drivers/net/mlx5/mlx5_txq.c > @@ -342,6 +342,19 @@ > RTE_CACHE_LINE_SIZE); > attr.init.cap.max_inline_data = > tmpl.txq.max_inline * RTE_CACHE_LINE_SIZE; > + tmpl.txq.inline_en = 1; > + } > + if (priv->tso) { > + uint16_t max_tso_inline = ((MLX5_MAX_TSO_HEADER + > + (RTE_CACHE_LINE_SIZE - 1)) / > + RTE_CACHE_LINE_SIZE); > + > + attr.init.max_tso_header = > + max_tso_inline * RTE_CACHE_LINE_SIZE; > + attr.init.comp_mask |= IBV_EXP_QP_INIT_ATTR_MAX_TSO_HEADER; > + tmpl.txq.max_inline = RTE_MAX(tmpl.txq.max_inline, > + max_tso_inline); > + tmpl.txq.tso_en = 1; > } > tmpl.qp = ibv_exp_create_qp(priv->ctx, &attr.init); > if (tmpl.qp == NULL) { > -- > 1.8.3.1 Thanks, -- Nélio Laranjeiro 6WIND