From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr0-f169.google.com (mail-wr0-f169.google.com [209.85.128.169]) by dpdk.org (Postfix) with ESMTP id 856938E01 for ; Tue, 13 Mar 2018 15:25:38 +0100 (CET) Received: by mail-wr0-f169.google.com with SMTP id f14so23266778wre.8 for ; Tue, 13 Mar 2018 07:25:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=8d1fQLoe9/hbs5xJm31xhmBX8IaYMDA+3IKqRZeegWs=; b=ESYznUTFVC3YWJVBpM5vfOqX0P+WiePWmM4qXbJf1KddNhH7IRtVeywioe6mYfVx43 T1W+t3KD3wmX9eXi/08BCXUrvdvKTXfP/Y43DpCio08PrCB5iDp7REIlu3EvwiYmad6J vYvC2lE1DKiB11SMczXeEUq8onYP2iOdKEifJeokGvpTy9p+3sl6FVgjPDHHHrG7vWE2 cIUsa1SpZzC95QspXKzVH3dalhd3Llg6JCEUPGxSnJGVf1HUpvpQn/Y6qChHOrVAqUxw xOscYWlX5YWaDyqsH5XRPTjEjhG5M9INiidE47nMGtzjRy+AWQ55A4VsaLb4kKossUyK nUEg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=8d1fQLoe9/hbs5xJm31xhmBX8IaYMDA+3IKqRZeegWs=; b=QGxlXHQNRdwL652HJAKprfq6CiY7i2YOgHkSXtr9/6AauGIehqOPNzOtRB1bzVMit0 d7bw/FNz9YnM079y9UA5qksGfMVObflVe+6pDYzlArKi9wuFGsTQRfhDhFFRRKG9PlrL DsXoBWM+XgaKmKWkT7HAnX/Ptu4daO+o0YeOkdqDHodBqvOr12/rdME0aA2O/a2+gd6y 9gsz0u3Z50Mdt+lxm7OMVUkOmSIf9dk/7zgsnizBOwozawTEj7RRjQTw3KFmEsDmuatH KaQ1CIq5+AwTEHtYBJyKuEFzsFAWWlW0rBM6653wHkISSvSfYk1FuE6xhDQ81u5nmt1y VcyQ== X-Gm-Message-State: AElRT7FmV2oud/HwGDi6gZl481LWhzBAKzKpewDPy1gaDFPCY2fOFOkD muGBYiiHZ2rLvkC+TImlh5cFL4BV X-Google-Smtp-Source: AG47ELvFwnLWt5ObdX3uHRSnAZSPE/hmfBR1vjRiISQ+FJz8D7Ypcl7W4rISREGkhrvjGskiTMeLgQ== X-Received: by 10.223.133.182 with SMTP id 51mr737625wrt.226.1520951137764; Tue, 13 Mar 2018 07:25:37 -0700 (PDT) Received: from 6wind.com (host.78.145.23.62.rev.coltfrance.com. [62.23.145.78]) by smtp.gmail.com with ESMTPSA id l14sm457934wrh.62.2018.03.13.07.25.36 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 13 Mar 2018 07:25:37 -0700 (PDT) Date: Tue, 13 Mar 2018 15:25:23 +0100 From: Adrien Mazarguil To: Ophir Munk Cc: dev@dpdk.org, Thomas Monjalon , Olga Shern , Moti Haimovsky Message-ID: <20180313142523.GJ3994@6wind.com> References: <1518072954-19082-1-git-send-email-ophirmu@mellanox.com> <1519511783-18928-1-git-send-email-ophirmu@mellanox.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1519511783-18928-1-git-send-email-ophirmu@mellanox.com> Subject: Re: [dpdk-dev] [PATCH v2] doc: update mlx4 flow limitations X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Mar 2018 14:25:38 -0000 On Sat, Feb 24, 2018 at 10:36:23PM +0000, Ophir Munk wrote: > This patch updates mlx4 documentation with flow > configuration limitations imposed by NIC hardware and > PMD implementation > > Signed-off-by: Moti Haimovsky > Signed-off-by: Ophir Munk > --- > v1: initial version (use testpmd examples) > v2: enhance and add rte_flow examples I still don't agree with this version (reasons below), however if anyone thinks it's production-ready and good enough for a start, please send your acked-by line and I won't oppose its inclusion. > > doc/guides/nics/mlx4.rst | 383 +++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 383 insertions(+) > > diff --git a/doc/guides/nics/mlx4.rst b/doc/guides/nics/mlx4.rst > index 98b9716..02cea79 100644 > --- a/doc/guides/nics/mlx4.rst > +++ b/doc/guides/nics/mlx4.rst > @@ -515,3 +515,386 @@ devices managed by librte_pmd_mlx4. > Port 3 Link Up - speed 40000 Mbps - full-duplex > Done > testpmd> > + > +Flow Limitations > +---------------- It's still announcing limitations instead of features, I remain unconvinced by this approach for reasons I already stated [1]. > + > +- Flows are specified by rte_flow API (defined in **rte_flow.h**) which provides > + a generic means to match specific traffic. Please refer to > + http://dpdk.org/doc/guides/prog_guide/rte_flow.html This should be a relative internal link to remain consistent when documentation is hosted on a different server or generated as a PDF file, see the various ":ref:" examples under doc/. > +- testpmd application **(app/test-pmd/)** has a command line interface where > + flows can be specified. These flows are translated by testpmd > + to rte_flow calls. Irrelevant unless you also describe the testpmd syntax will be used for subsequent examples and by providing a link to its definition. > +- **drivers/net/mlx4/mlx4_flow.c** can be used as a reference to flow > + limitations written in source code. It should be used as reference by the documentation writer, it's not really meant for readers of this documentation. PMD developers already know where to look. > + > +General > +~~~~~~~ > + > +.. code-block:: console > + > + struct rte_flow_attr { > + uint32_t group; /**< Priority group. */ > + uint32_t priority; /**< Priority level within group. */ > + uint32_t ingress:1; /**< Rule applies to ingress traffic. */ > + uint32_t egress:1; /**< Rule applies to egress traffic. */ > + uint32_t reserved:30; /**< Reserved, must be zero. */ > + } > + > + struct rte_flow_attr *attr; You shouldn't copy/paste the API nor PMD code in this documentation. It's overly verbose and makes it difficult to maintain. Use links if you want to refer to something documented elsewhere. > + > +- No support for mlx4 group rte_flow Did you mean "No support for group values other than 0"? Again, please focus on features instead of limitations, otherwise there's no end to this. > + > +.. code-block:: console > + > + if (attr->group) > + print_err("groups are not supported"); > + Besides the made-up print_err() definition, is this documentation expected to be synchronized every time error messages are updated in PMD code? > +- No support for mlx4 rte_flow priority above 0xfff > + > +.. code-block:: console > + > + if (attr->priority > 0xfff) > + print_err("maximum priority level is 0xfff"); You don't need to provide code proof to every statement. Moreover this one is inaccurate, the maximum priority level depends on whether isolated mode is enabled (0-4095 with rte_flow in isolated mode, 0-4094 otherwise). > + > +- No support for mlx4 rte_flow egress filters > + > +.. code-block:: console > + > + if (attr->egress) > + print_err("egress is not supported"); > + > +- Must specify mlx4 rte_flow ingress filters > + > +.. code-block:: console > + > + if (!attr->ingress) > + print_err("only ingress is supported"); > + Same comments as above regarding past and subsequent code-blocks... > +Flow rules filters > +~~~~~~~~~~~~~~~~~~ > + > +Flow rules filters can be validated using testpmd **flow validate** syntax. It's already part of testpmd documentation. > + > +L2 (ETH) > +^^^^^^^^ > + > +This section documents flow rules limitations related to RTE_FLOW_ITEM_TYPE_ETH. > +It does not apply to an inner ETH used after VXLAN since mlx4 can't match those > +yet. OK for VXLAN, now what about NVGRE and other tunnel protocols, I assume mlx4 can match them? (see how I'm trying to make this documentation only about supported features :) > + > +.. code-block:: console > + > + /** > + * RTE_FLOW_ITEM_TYPE_ETH > + * > + * Matches an Ethernet header. > + */ > + struct rte_flow_item_eth { > + struct ether_addr dst; /**< Destination MAC. */ > + struct ether_addr src; /**< Source MAC. */ > + rte_be16_t type; /**< EtherType. */ > + }; > + > + struct rte_flow_item_eth *mask; > + > +- Can only use real destination MAC. real => exact / fully specified? > +- EtherType is not taken into consideration. This reads like you can put whatever in this field and PMD just ignores it, while it's in fact not the case. Masking this field causes the PMD to explicitly reject the requested flow rule. > +- Source MAC is not taken into consideration and should be set to 0 Misleading since it's not the source MAC itself but its mask (like EtherType in fact). > + > +.. code-block:: console > + > + /** > + * Summarize all src adrresses adrresses => addresses > + **/ > + for (i = 0; i != sizeof(mask->src.addr_bytes); ++i) > + sum_src += mask->src.addr_bytes[i]; > + > + if (sum_src) > + print_err("mlx4 does not support source MAC matching"); > + > + > +Using testpmd application - src mask must be 00:00:00:00:00:00 > +otherwise the following command will fail. Please remove code proofs. Code is subject to change, what HW supports on the other hand isn't. A proper description should be enough. > + > +.. code-block:: console > + > + testpmd> flow validate 1 ingress pattern eth > + src spec 00:16:3e:2b:e6:47 src mask FF:FF:FF:FF:FF:FF > + / end actions drop / end > + > +- Supports only full mask. > +- No support for partial masks, except in the specific case of matching > + all multicast traffic (spec->dst and mask->dst equal to > + 01:00:00:00:00:00). > + > +.. code-block:: console > + > + /** > + * Summarize all dst adrresses > + */ > + for (i = 0; i != sizeof(mask->dst.addr_bytes); ++i) { > + sum_dst += mask->dst.addr_bytes[i]; > + > + if (sum_dst != (0xffui8 * ETHER_ADDR_LEN)) > + print_err("mlx4 does not support matching partial" > + " Ethernet fields"); > + } > + > +Using the following testpmd command with partial mask will fail. > + > +.. code-block:: console > + > + testpmd> flow validate 1 ingress pattern eth > + src spec 00:16:3e:2b:e6:47 > + dst spec 4A:11:6C:FA:60:D0 dst mask FF:00:FF:FF:FF:00 > + / end actions drop / end > + While I know the answer, an unsuspecting reader will wonder why "src spec" is specified and the comment doesn't fail on that basis? More generally, avoid testpmd examples, unless you use its syntax to describe what's supported, e.g.: eth [dst [(spec|is) {MAC-48_spec}] [last {MAC-48_last}] [mask {MAC-48_mask}]] MAC-48_spec: standard MAC address. MAC-48_last: must be equal to MAC-48_spec when "mask" or "is" are specified. MAX-48_mask: must be ether full or empty, not partial. It's just an example, I don't find it particularly clear. > +- When configured to run in promiscuous or all-multicast modes does > + not support additional rules No such modes at the rte_flow level. You have to describe in terms of full/empty/partial/specific field masks. > +.. code-block:: console > + > + if (flow->promisc || flow->allmulti) > + print_err("mlx4 does not support additional matching" > + " criteria combined with indiscriminate" > + " matching on Ethernet headers"); > + > +- Does not support the explicit exclusion of all multicast traffic Same comment. > +.. code-block:: console > + > + if (sum_dst == 1 && mask->dst.addr_bytes[0] == 1) > + if (!(spec->dst.addr_bytes[0] & 1)) > + print_err("mlx4 does not support the explicit" > + " exclusion of all multicast traffic"); > + > +VLAN > +^^^^ > + > +This section documents flow rules limitations related to > +RTE_FLOW_ITEM_TYPE_VLAN. > + > +.. code-block:: console > + > + /** > + * RTE_FLOW_ITEM_TYPE_VLAN > + * > + * Matches an 802.1Q/ad VLAN tag. > + * > + * This type normally follows either RTE_FLOW_ITEM_TYPE_ETH or > + * RTE_FLOW_ITEM_TYPE_VLAN. > + */ > + struct rte_flow_item_vlan { > + rte_be16_t tpid; /**< Tag protocol identifier. */ > + rte_be16_t tci; /**< Tag control information. */ > + }; > + > + struct rte_flow_item_vlan mask; > + > +- TCI VID must be specified > + > +.. code-block:: console > + > + if (!mask || !mask->tci) > + print_err("mlx4 cannot match all VLAN traffic while excluding" > + " non-VLAN traffic, TCI VID must be specified"); > + > +- Does not support partial VLAN TCI VID matching > + > +.. code-block:: console > + > + if (mask->tci != RTE_BE16(0x0fff)) > + print_err("mlx4 does not support partial TCI VID matching"); I'm sure all this could have been described in a single line. > +L3 (IPv4) > +^^^^^^^^^ > + > +This section documents flow rules limitations related to > +RTE_FLOW_ITEM_TYPE_IPV4. > + > +.. code-block:: console > + > + /** > + * RTE_FLOW_ITEM_TYPE_IPV4 > + * > + * Matches an IPv4 header. > + * > + * Note: IPv4 options are handled by dedicated pattern items. > + */ > + struct rte_flow_item_ipv4 { > + struct ipv4_hdr hdr; /**< IPv4 header definition. */ > + }; > + > + struct rte_flow_item_ipv4 *mask; > + > +- Prerequisites: must follow eth dst spec definition. > +- Supports only zero or full one's source and destinatin masks. destinatin => destination > + > +.. code-block:: console > + > + if (mask && > + (uint32_t)(mask->hdr.src_addr + 1) > 1U || > + (uint32_t)(mask->hdr.dst_addr + 1) > 1U)) > + print_err("mlx4 does not support matching partial IPv4 fields"); > + > +Using the following testpmd command with ipv4 prefix 16 will fail. > + > +.. code-block:: console > + > + testpmd> flow validate 0 ingress pattern eth > + src spec e4:1d:2d:2d:8d:22 > + dst spec 00:15:5D:10:8D:00 dst mask FF:FF:FF:FF:FF:FF > + / ipv4 src spec 144.144.92.0 src prefix 16 > + / end actions drop / end Same comments. Note mlx4 does not support partial masks. It's always either full or empty except for the allmulticast bit which is implemented through a kind of workaround. This may be mentioned once at the beginning to shorten documentation. > + > +L3 (IPv6) > +^^^^^^^^^ > + > +mlx4 does not support IPv6 filters Not much to say regarding this section except it shouldn't exist if only features were documented, however ending sentences with periods makes documentation look more professional and all. > + > +L4 UDP > +^^^^^^ > + > +This section documents flow rules limitations related to RTE_FLOW_ITEM_TYPE_UDP. > + > +.. code-block:: console > + > + /** > + * RTE_FLOW_ITEM_TYPE_UDP. > + * > + * Matches a UDP header. > + */ > + struct rte_flow_item_udp { > + struct udp_hdr hdr; /**< UDP header definition. */ > + }; > + > + struct rte_flow_item_udp mask; > + > +- Prerequisites - must follow eth dst followed by IPv4 specs True, however what do "dst" and "specs" mean in this context? > +- Supports only zero or full source and destination ports masks. > + > +.. code-block:: console > + > + if (mask && > + ((uint16_t)(mask->hdr.src_port + 1) > 1ui16 || > + (uint16_t)(mask->hdr.dst_port + 1) > 1ui16)) > + print_err("mlx4 does not support matching partial UDP fields"); > + Same comments as for IPv4. > +L4 TCP > +^^^^^^ > + > +This section documents flow rules limitations related to RTE_FLOW_ITEM_TYPE_TCP. > + > +.. code-block:: console > + > + /** > + * RTE_FLOW_ITEM_TYPE_TCP. > + * > + * Matches a TCP header. > + */ > + struct rte_flow_item_tcp { > + struct tcp_hdr hdr; /**< TCP header definition. */ > + }; > + > + struct rte_flow_item_tcp *mask; > + > +- Prerequisites - must follow eth dst spec followed by IPv4 spec > +- Supports only zero or full source and destination ports masks. > + > +.. code-block:: console > + > + if (mask && > + ((uint16_t)(mask->hdr.src_port + 1) > 1ui16 || > + (uint16_t)(mask->hdr.dst_port + 1) > 1ui16)) > + print_err("mlx4 does not support matching partial TCP fields"); > + Same comments as for TCP. > +Flow actions > +~~~~~~~~~~~~ > + > +RSS > +^^^ > + > +RSS is performed on packets to spread them among several queues based on hash > +function calculation and according to provided parameters. > + > +.. code-block:: console > + > + struct rte_eth_rss_conf { > + uint8_t *rss_key; /**< If not NULL, 40-byte hash key. */ > + uint8_t rss_key_len; /**< hash key length in bytes. */ > + uint64_t rss_hf; /**< Hash functions to apply - see below. */ > + }; > + > + struct rte_flow_action_rss { > + const struct rte_eth_rss_conf *rss_conf; /**< RSS parameters. */ > + uint16_t num; /**< Number of entries in queue[]. */ > + uint16_t queue[]; /**< Queues indices to use. */ > + }; > + > + struct rte_flow_action_rss *rss; > + > +- RSS hash is calculated on fixed packet fields including: L3 source and > + destination addresses (ipv4 or ipv6) and L4 source and destination addresses > + (upd or tcp ports) upd => UDP, also use caps for acronyms (e.g. IPv4, IPv6, TCP). This reads like there's more than what you listed. You should be more specific: - Combined IPv4 source and destination addresses. - Combined IPv6 source and destination addresses. - Combined UDP source and destination ports. - Combined TCP source and destination ports. You may even throw in the related ETH_RSS_* macros for good measure. > +- Every Rx queue can be specified only once in RSS action > + > +- Only power of two number of queues is supported > + > +.. code-block:: console > + > + if (!rte_is_power_of_2(rss->num)) > + print_err("for RSS, mlx4 requires the number of" > + " queues to be a power of two"); > + > +Using the following RSS action with three RSS ports (0 1 2) will fail. ports => queues? > + > +.. code-block:: console > + > + testpmd> flow create 0 ingress pattern eth dst is f4:52:14:7a:59:81 > + / ipv4 / tcp / end actions rss queues 0 1 2 end / end > + > +- RSS hash key must be 40 characters characters => bytes? > + > +.. code-block:: console > + > + if (rss_conf->rss_key_len != > + sizeof(flow->rss->key)) > + print_err("mlx4 supports exactly one RSS hash key" > + " length: 40" > + > +- Packets must be distributed over consecutive queue indeces only indeces => indices > + > +.. code-block:: console > + > + for (i = 1; i < rss->num; ++i) > + if (rss->queue[i] - rss->queue[i - 1] != 1) > + break; > + if (i != rss->num) > + "mlx4 requires RSS contexts to use" > + " consecutive queue indices only"); This code might be difficult to compile. You know, I'm starting to think error messages in code proofs are verbose enough on their own, no need for documentation. Just point readers to the source code :) > +Using the following RSS action with non-consecutive ports (0 2) will fail. > + > +.. code-block:: console > + > + testpmd> flow create 0 ingress pattern eth dst is f4:52:14:7a:59:81 > + / ipv4 / tcp / end actions rss queues 0 2 end / end > + > +- The first queue index specified in RSS context must be aligned > + to context size > + > +.. code-block:: console > + > + if (rss->queue[0] % rss->num) > + print_err("mlx4 requires the first queue of a RSS" > + " context to be aligned on a multiple" > + " of the context size"); > + > +Using the following RSS action with a fist queue index set as 1 - will fail. > + > +.. code-block:: console > + > + testpmd> flow create 0 ingress pattern eth dst is f4:52:14:7a:59:81 > + / ipv4 / tcp / end actions rss queues 1 2 end / end > + > -- > 2.7.4 No word on the remaining limitations? - RTE_FLOW_ITEM_TYPE_INVERT - RTE_FLOW_ITEM_TYPE_ANY - RTE_FLOW_ITEM_TYPE_PF - RTE_FLOW_ITEM_TYPE_VF - RTE_FLOW_ITEM_TYPE_PORT - RTE_FLOW_ITEM_TYPE_RAW - RTE_FLOW_ITEM_TYPE_SCTP - [...] (see rte_flow.h) - RTE_FLOW_ACTION_TYPE_PASSTHRU - RTE_FLOW_ACTION_TYPE_MARK - [...] (see rte_flow.h) All right, I think you got the point... [1] Message-ID: <20180209162124.GD4256@6wind.com> http://dpdk.org/ml/archives/dev/2018-February/090649.html -- Adrien Mazarguil 6WIND