From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id B8A65A0032; Wed, 14 Sep 2022 09:32:11 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id A8E8C40151; Wed, 14 Sep 2022 09:32:11 +0200 (CEST) Received: from shelob.oktetlabs.ru (shelob.oktetlabs.ru [91.220.146.113]) by mails.dpdk.org (Postfix) with ESMTP id 8DF2D40141 for ; Wed, 14 Sep 2022 09:32:10 +0200 (CEST) Received: by shelob.oktetlabs.ru (Postfix, from userid 115) id 03CE269; Wed, 14 Sep 2022 10:32:09 +0300 (MSK) X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on mail1.oktetlabs.ru X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=ALL_TRUSTED, DKIM_ADSP_DISCARD, URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.6 Received: from bree.oktetlabs.ru (bree.oktetlabs.ru [192.168.34.5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by shelob.oktetlabs.ru (Postfix) with ESMTPS id 925D262; Wed, 14 Sep 2022 10:32:08 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 shelob.oktetlabs.ru 925D262 Authentication-Results: shelob.oktetlabs.ru/925D262; dkim=none; dkim-atps=neutral Date: Wed, 14 Sep 2022 10:32:08 +0300 (MSK) From: Ivan Malov To: Rongwei Liu cc: Matan Azrad , Slava Ovsiienko , Ori Kam , "NBU-Contact-Thomas Monjalon (EXTERNAL)" , Aman Singh , Yuying Zhang , Andrew Rybchenko , "dev@dpdk.org" , Raslan Darawsheh Subject: RE: [PATCH v1] ethdev: add direction info when creating the transfer table In-Reply-To: Message-ID: <46841a9-37f1-29a8-ba86-ac5410723e2f@oktetlabs.ru> References: <20220907024020.2474860-1-rongweil@nvidia.com> <1be72d6-be5b-88b2-f15-16fd2c6d0c0@oktetlabs.ru> <5d8d42b2-7011-cb46-7f2c-1b1019c4151e@oktetlabs.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Hi, On Wed, 14 Sep 2022, Rongwei Liu wrote: > HI > > BR > Rongwei > >> -----Original Message----- >> From: Ivan Malov >> Sent: Tuesday, September 13, 2022 22:33 >> To: Rongwei Liu >> Cc: Matan Azrad ; Slava Ovsiienko >> ; Ori Kam ; NBU-Contact- >> Thomas Monjalon (EXTERNAL) ; Aman Singh >> ; Yuying Zhang ; >> Andrew Rybchenko ; dev@dpdk.org; Raslan >> Darawsheh >> Subject: RE: [PATCH v1] ethdev: add direction info when creating the transfer >> table >> >> External email: Use caution opening links or attachments >> >> >> Hi Rongwei, >> >> PSB >> >> On Tue, 13 Sep 2022, Rongwei Liu wrote: >> >>> Hi >>> >>> BR >>> Rongwei >>> >>>> -----Original Message----- >>>> From: Ivan Malov >>>> Sent: Tuesday, September 13, 2022 00:57 >>>> To: Rongwei Liu >>>> Cc: Matan Azrad ; Slava Ovsiienko >>>> ; Ori Kam ; NBU-Contact- >>>> Thomas Monjalon (EXTERNAL) ; Aman Singh >>>> ; Yuying Zhang ; >>>> Andrew Rybchenko ; dev@dpdk.org; >>>> Raslan Darawsheh >>>> Subject: Re: [PATCH v1] ethdev: add direction info when creating the >>>> transfer table >>>> >>>> External email: Use caution opening links or attachments >>>> >>>> >>>> Hi, >>>> >>>> On Wed, 7 Sep 2022, Rongwei Liu wrote: >>>> >>>>> The transfer domain rule is able to match traffic wire/vf origin and >>>>> it means two directions' underlayer resource. >>>> >>>> The point of fact is that matching traffic coming from some entity >>>> like wire / VF has been long generalised in the form of representors. >>>> So, a flow rule with attribute "transfer" is able to match traffic >>>> coming from either a REPRESENTED_PORT or from a PORT_REPRESENTOR >> (please find these items). >>>> >>>>> >>>>> In customer deployments, they usually match only one direction >>>>> traffic in single flow table: either from wire or from vf. >>>> >>>> Which customer deployments? Could you please provide detailed examples? >>>> >>>>> >>> >>> We saw a lot of customers' deployment like: >>> 1. Match overlay traffic from wire and do decap, then send to specific vport. >>> 2. Match specific 5-tuples and do encap, then send to wire. >>> The matching criteria has obvious direction preference. >> >> Thank you. My questions are as follows: >> >> In (1), when you say "from wire", do you mean the need to match packets >> arriving via whatever physical ports rather then matching packets arriving >> from some specific phys. port? ^^ Could you please find my question above? Based on your understanding of templates in async flow approach, an answer to this question may help us find the common ground. -- >> >> If, however, matching traffic "from wire" in fact means matching packets >> arriving from a *specific* physical port, then for sure item >> REPRESENTED_PORT should perfectly do the job, and the proposed attribute is >> unneeded. >> >> (BTW, in DPDK, it is customary to use term "physical port", not "wire") >> >> In (1), what are "vport"s? Please explain. Once again, I should remind that, in >> DPDK, folks prefer terms "represented entity" / "representor" >> over vendor-specific terms like "vport", etc. >> > Vport is virtual port for short such as VF. Thanks. As I say, term "vport" might be confusing to some readers, so it'd be better to provide this explanation (about VF) in the commit description next time. >> As for (2), imagine matching 5-tuple traffic emitted by a VF / guest. >> Could you please explain, why not just add a match item REPRESENTED_PORT >> pointing to that VF via its representor? Doing so should perfectly define the >> exact direction / traffic source. Isn't that sufficient? >> > Per my view, there is matching field and matching value difference. > Like IPv4 src_addr 1.1.1.1, 1.1.1.2. 1.1.1.3, will you treat it as same or different matching criteria? > I would like to call them same since it can be summarized like 1.1.1.0/30 > REPRESENTED_PORT is just another matching item, no essential differences and it can't stand for direction info. It looks like we're starting to run into disagreement here. There's no "direction" at all. There's an embedded switch inside the NIC, and there're (logical) switch ports that packets enter the switch from. When the user submits a "transfer" rule and does not provide neither REPRESENTED_PORT nor PORT_REPRESENTOR in the pattern, the embedded switch is supposed to match packets coming from ANY ports, be it VFs or physical (wire) ports. But when the user provides, in example, item REPRESENTED_PORT to point to the physical (wire) port, the embedded switch knows exactly which port the packets should enter it from. In this case, it is supposed to match only packets coming from that physical port. And this should be sufficient. This in fact replaces the need to know a "direction". It's just an exact specification of packet's origin. > Port id depends on the attach sequence. Unfortunately, this is hardly a good argument because flow rules are supposed to be inserted based on the run-time packet learning. Attach sequence is a don't care here. >> Also please mind that, although I appreciate your explanations here, on the >> mailing list, they should finally be added to the commit message, so that >> readers do not have to look for them elsewhere. >> > We have explained the high possibility of single-direction matching, right? Not quite. As I said, it is not correct to assume any "direction", like in geographical sense ("north", "south", etc.). Application has ethdevs, and they are representors of some "virtual ports" (in your terminology) belonging to the switch, for example, VFs, SFs or physical ports. The user adds an appropriate item to the pattern (REPRESENTED_PORT), and doing so specifies the packet path which it enters the switch. > It' hard to list all the possibilities of traffic matching preferences. And let's say more: one need never do this. That's exactly the reason why DPDK has abandoned the concept of "direction" in *transfer* rules and switched to the use of precise criteria (REPRESENTED_PORT, etc.). > The underlay is the one we have met for now. >>> >>>>> Introduce one new member transfer_mode into rte_flow_attr to >>>>> indicate the flow table direction property: from wire, from vf or >>>>> bi-direction(default). >>>> >>>> AFAIK, 'rte_flow_attr' serves both traditional flow rule insertion >>>> and asynchronous (table) approach. The patch adds the attributes to >>>> generic 'rte_flow_attr' but, for some reason, ignores non-table rules. >>>> >>>>> >>> Sync API uses one rule to contain everything. It' hard for PMD to determine >> if this rule has direction preference or not. >>> Image a situation, just for an example: >>> 1. Vport 1 VxLAN do decap send to vport 2. 1 million scale >>> 2. Vport 0 (wire) VxLAN do decap send to vport 3. 1 hundred scale. >>> 1 and 2 share the same matching conditions (eth / ipv4 / udp / vxlan /...), so >> sync API consider them share matching determination logic. >>> It means "2" have 1M scale capability too. Obviously, it wastes a lot of >> resources. >> >> Strictly speaking, they do not share the same match pattern. >> Your example clearly shows that, in (1), the pattern should request packets >> coming from "vport 1" and, in (2), packets coming from "vport 0". >> >> My point is simple: the "vport" from which packets enter the embedded switch >> is ALSO a match criterion. If you accept this, you'll see: the matching >> conditions differ. >> > See above. > In this case, I think the matching fields are both "port_id + ipv4_vxlan". They are same. > Only differs with values like vni 100 or 200 vice versa. Not quite. Look closer: you use *different* port IDs for (1) and (2). The value of "ethdev_id" field in item REPRESENTED_PORT differs. >>> >>> In async API, there is pattern_template introduced. We can mark "1" to use >> pattern_tempate id 1 and "2" to use pattern_template 2. >>> They will be separated from each other, don't share anymore. >> >> Consider an example. "Wire" is a physical port represented by PF0 which, in >> turn, is attached to DPDK via ethdev 0. "VF" (vport?) is attached to guest and is >> represented by a representor ethdev 1 in DPDK. >> >> So, some rules (template 1) are needed to deliver packets from "wire" >> to "VF" and also decapsulate them. And some rules (template 2) are needed to >> deliver packets in the opposite direction, from "VF" >> to "wire" and also encapsulate them. >> >> My question is, what prevents you from adding match item >> REPRESENTED_PORT[ethdev_id=0] to the pattern template 1 and >> REPRESENTED_PORT[ethdev_id=1] to the pattern template 2? >> >> As I said previously, if you insert such item before eth / ipv4 / etc to your >> match pattern, doing so defines an *exact* direction / source. >> > Could you check the async API guidance? I think pattern template focusing on the matching field (mask). > "REPRESENTED_PORT[ethdev_id=0] " and "REPRESENTED_PORT[ethdev_id=1] "are the same. > 1. pattern template: REPRESENTED_PORT mask 0xffff ... > 2. action template: action1 / actions2. / > 3. table create with pattern_template plus action template.. > REPRESENTED_PORT[ethdev_id=0] will be rule1: rule create REPRESENTED_PORT port_id is 0 / actions .... > REPRESENTED_PORT[ethdev_id=1] will be rule2: rule create REPRESENTED_PORT port_id is 1 / actions .... OK, so, based on this explanation, it appears that you might be looking to refer to: a) a *set* of any physical (wire) ports b) a *set* of any guest ports (VFs) You chose to achieve this using an attribute, but: 1) as I explained above, the use of term "direction" is wrong; please hear me out: I'm not saying that your use case and your optimisation is wrong: I'm saying that naming for it is wrong: it has nothing to do with "direction"; 2) while naming a *set* of wire ports as "wire_orig" might be OK, sticking with term "vf_orig" for a *set* of guest ports is clearly not, simply because the user may pass another PF to a guest instead of passing a VF; in other words, a better term is needed here; 3) since it is possible to plug multiple NICs to a DPDK application, even from different vendors, the user may end up having multiple physical ports belonging to different physical NICs attached to the application; if this is the case, then referring to a *set* of wire ports using the new attribute is ambiguous in the sense that it's unclear whether this applies only to wire ports of some specific physical NIC or to the physical ports of *all* NICs managed by the app; 4) adding an attribute instead of yet another pattern item type is not quite good because PMDs need to be updated separately to detect this attribute and throw an error if it's not supported, whilst with a new item type, the PMDs do not need to be updated = if a PMD sees an unsupported item while traversing the item with switch () { case }, it will anyway throw an error; 5) as in (4), a new attribute is not good from documentation standpoint; plase search for "represented_port = Y" in documentation = this way, all supported items are easily defined for various NIC vendors, but the same isn't true for attributes = there is no way to indicate supported attributes in doc. If points (1 - 5) make sense to you, then, if I may be so bold, I'd like to suggest that the idea of adding a new attribute be abandoned. Instead, I'd like to suggest adding new items: (the names are just sketch, for sure, it should be discussed) ANY_PHY_PORTS { switch_domain_id } = match packets entering the embedded switch from *whatever* physical ports belonging to the given switch domain ANY_GUEST_PORTS { switch_domain_id } = match packets entering the embedded switch from *whatever* guest ports (VFs, PFs, etc.) belonging to the given switch domain The field "switch_domain_id" is required to tell one physical board / vendor from another (as I explained in point (3)). The application can query this parameter from ethdev's switch info: please see "struct rte_eth_switch_info". What's your opinion? > >>> >>>> For example, the diff below adds the attributes to "table" commands >>>> in testpmd but does not add them to regular (non-table) commands like >>>> "flow create". Why? >>>> >>>>> >>> >>> "table" command limits pattern_template to single direction or bidirection >> per user specified attribute. >> >> As I say above, the same effect can be achieved by adding item >> REPRESENTED_PORT to the corresponding pattern template. > See above. >> >>> "rule" command must tight with one "table_id", so the rule will inherit the >> "table" direction property, no need to specify again. >> >> You migh've misunderstood. I do not talk about "rule" command coupled with >> some "table". What I talk about is regular, NON-async flow insertion >> commands. >> >> Please take a look at section "/* Validate/create attributes. */" in file >> "app/test-pmd/cmdline_flow.c". When one adds a new flow attribute, they >> should reflect it the same way as VC_INGRESS, VC_TRANSFER, etc. >> >> That's it. > We don't intend to pass this to sync API. The above code example is for sync API. So I understand. But there's one slight problem: in your patch, you add the new attributes to the structure which is *shared* between sync and async use case scenarios. If one adds an attribute to this structure, they have to provide accessors for it in all sync-related commands in testpmd, but your patch does not do that. In other words, it is wrong to assume that "struct rte_flow_attr" only applies to async approach. It had been introduced long before the async flow design was added to DPDK. That's it. >> >> But, as I say, I still believe that the new attributes aren't needed. > I think we are not at the same page for now. Can we reach agreement on the same > matching criteria first? >>> >>>>> It helps to save underlayer memory also on insertion rate. >>>> >>>> Which memory? Host memory? NIC memory? Term "underlayer" is vague. >>>> I suggest that the commit message be revised to first explain how >>>> such memory is spent currently, then explain why this is not optimal >>>> and, finally, which way the patch is supposed to improve that. I.e. be more >> specific. >>>> >>>>> >>> >>> For large scalable rules, HW (depends on implementation) always needs >> memory to hold the rules' patterns and actions, either from NIC or from host. >>> The memory footprint highly depends on "user rules' complexity", also diff >> between NICs. >>> ~50% memory saving is expected if one-direction is cut. >> >> Regardless of this talk, this explanation should probably be present in the >> commit description. >> > This number may differ with different NICs or implementation. We can't say it for sure. Not an exact number, of course, but a brief explanation of: a) what is wrong / not optimal in the current design; b) how it is observed in customer deployments; c) why the proposed patch is a good solution. >>> >>>>> By default, the transfer domain is bi-direction, and no behavior changes. >>>>> >>>>> 1. Match wire origin only >>>>> flow template_table 0 create group 0 priority 0 transfer wire_orig... >>>>> 2. Match vf origin only >>>>> flow template_table 0 create group 0 priority 0 transfer vf_orig... >>>>> >>>>> Signed-off-by: Rongwei Liu >>>>> --- >>>>> app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++++ >>>>> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++- >>>>> lib/ethdev/rte_flow.h | 9 ++++++- >>>>> 3 files changed, 36 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/app/test-pmd/cmdline_flow.c >>>>> b/app/test-pmd/cmdline_flow.c index 7f50028eb7..b25b595e82 100644 >>>>> --- a/app/test-pmd/cmdline_flow.c >>>>> +++ b/app/test-pmd/cmdline_flow.c >>>>> @@ -177,6 +177,8 @@ enum index { >>>>> TABLE_INGRESS, >>>>> TABLE_EGRESS, >>>>> TABLE_TRANSFER, >>>>> + TABLE_TRANSFER_WIRE_ORIG, >>>>> + TABLE_TRANSFER_VF_ORIG, >>>>> TABLE_RULES_NUMBER, >>>>> TABLE_PATTERN_TEMPLATE, >>>>> TABLE_ACTIONS_TEMPLATE, >>>>> @@ -1141,6 +1143,8 @@ static const enum index next_table_attr[] = { >>>>> TABLE_INGRESS, >>>>> TABLE_EGRESS, >>>>> TABLE_TRANSFER, >>>>> + TABLE_TRANSFER_WIRE_ORIG, >>>>> + TABLE_TRANSFER_VF_ORIG, >>>>> TABLE_RULES_NUMBER, >>>>> TABLE_PATTERN_TEMPLATE, >>>>> TABLE_ACTIONS_TEMPLATE, >>>>> @@ -2881,6 +2885,18 @@ static const struct token token_list[] = { >>>>> .next = NEXT(next_table_attr), >>>>> .call = parse_table, >>>>> }, >>>>> + [TABLE_TRANSFER_WIRE_ORIG] = { >>>>> + .name = "wire_orig", >>>>> + .help = "affect rule direction to transfer", >>>> >>>> This does not explain the "wire" aspect. It's too broad. >>>> >>>>> + .next = NEXT(next_table_attr), >>>>> + .call = parse_table, >>>>> + }, >>>>> + [TABLE_TRANSFER_VF_ORIG] = { >>>>> + .name = "vf_orig", >>>>> + .help = "affect rule direction to transfer", >>>> >>>> This explanation simply duplicates such of the "wire_orig". >>>> It does not explain the "vf" part. Should be more specific. >>>> >>>>> + .next = NEXT(next_table_attr), >>>>> + .call = parse_table, >>>>> + }, >>>>> [TABLE_RULES_NUMBER] = { >>>>> .name = "rules_number", >>>>> .help = "number of rules in table", @@ -8894,6 >>>>> +8910,16 @@ parse_table(struct context *ctx, const struct token *token, >>>>> case TABLE_TRANSFER: >>>>> out->args.table.attr.flow_attr.transfer = 1; >>>>> return len; >>>>> + case TABLE_TRANSFER_WIRE_ORIG: >>>>> + if (!out->args.table.attr.flow_attr.transfer) >>>>> + return -1; >>>>> + out->args.table.attr.flow_attr.transfer_mode = 1; >>>>> + return len; >>>>> + case TABLE_TRANSFER_VF_ORIG: >>>>> + if (!out->args.table.attr.flow_attr.transfer) >>>>> + return -1; >>>>> + out->args.table.attr.flow_attr.transfer_mode = 2; >>>>> + return len; >>>>> default: >>>>> return -1; >>>>> } >>>>> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst >>>>> b/doc/guides/testpmd_app_ug/testpmd_funcs.rst >>>>> index 330e34427d..603b7988dd 100644 >>>>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst >>>>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst >>>>> @@ -3332,7 +3332,8 @@ It is bound to >>>> ``rte_flow_template_table_create()``:: >>>>> >>>>> flow template_table {port_id} create >>>>> [table_id {id}] [group {group_id}] >>>>> - [priority {level}] [ingress] [egress] [transfer] >>>>> + [priority {level}] [ingress] [egress] >>>>> + [transfer [vf_orig] [wire_orig]] >>>> >>>> Is it correct? Shouldn't it rather be [transfer] [vf_orig] >>>> [wire_orig] ? >>>> >>>>> rules_number {number} >>>>> pattern_template {pattern_template_id} >>>>> actions_template {actions_template_id} diff --git >>>>> a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index >>>>> a79f1e7ef0..512b08d817 100644 >>>>> --- a/lib/ethdev/rte_flow.h >>>>> +++ b/lib/ethdev/rte_flow.h >>>>> @@ -130,7 +130,14 @@ struct rte_flow_attr { >>>>> * through a suitable port. @see rte_flow_pick_transfer_proxy(). >>>>> */ >>>>> uint32_t transfer:1; >>>>> - uint32_t reserved:29; /**< Reserved, must be zero. */ >>>>> + /** >>>>> + * 0 means bidirection, >>>>> + * 0x1 origin uplink, >>>> >>>> What does "uplink" mean? It's too vague. Hardly a good term. I believe this comment should be reworked, in case the idea of having an extra attribute persists. >>>> >>>>> + * 0x2 origin vport, >>>> >>>> What does "origin vport" mean? Hardly a good term as well. I still believe this explanation is way too brief and needs to be reworked to provide more details, to define the use case for the attribute more specifically. >>>> >>>>> + * N/A both set. >>>> >>>> What's this? The question stands. >>>> >>>>> + */ >>>>> + uint32_t transfer_mode:2; >>>>> + uint32_t reserved:27; /**< Reserved, must be zero. */ >>>>> }; >>>>> >>>>> /** >>>>> -- >>>>> 2.27.0 >>>>> >>>> >>>> Since the attributes are added to generic 'struct rte_flow_attr', >>>> non-table >>>> (synchronous) flow rules are supposed to support them, too. If that >>>> is indeed the case, then I'm afraid such proposal does not agree with >>>> the existing items PORT_REPRESENTOR and REPRESENTED_PORT. They do >>>> exactly the same thing, but they are designed to be way more generic. Why >> not use them? >> >> The question stands. >> >>>> >>>> Ivan >>> >> >> Ivan >