From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm1-f68.google.com (mail-wm1-f68.google.com [209.85.128.68]) by dpdk.org (Postfix) with ESMTP id 8E2361B4ED for ; Thu, 11 Oct 2018 15:12:40 +0200 (CEST) Received: by mail-wm1-f68.google.com with SMTP id a8-v6so9348766wmf.1 for ; Thu, 11 Oct 2018 06:12:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=t+CWtqmQPj9Px78vns8jFCPFwTs7ucAl0g/T6iQU880=; b=bCiEd81Mnh52gtFLHY/JMYs78AzwFV/yFuaukEXsVN+6/ibrzrqKmGdJkvTiTAA8uj GgbJrSKybSUbPTiqcylWOHDoET1zmeY+fRVThDu9kO8nfidMCl6CnhLdJwjECURpUX89 o9yjjts+fP5nFQDhUttan1ygVjOtly7185gE6uomVSmAgfkNe598qMoUcuqql2e4gD6J Qht0FZTppMLkOkd7IyEQ1ZRx3FzCKher0VjHUw0jD7FYnrTMOIrkorCpcrBwX9AWspQf QzkUs08CTMNvdbXmlnT+6A3ARnj16MFo7KB+X28MQY9CsNZF94hNHn6Rp1RGtdd24Xzv trlQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=t+CWtqmQPj9Px78vns8jFCPFwTs7ucAl0g/T6iQU880=; b=UIvHQUTN5ktjBKlR7Uuliq/fWEkgHyIab1aVp6STCkwU5/elNo48SDF+RT93FPw8dY kJCyv+wykKl/y3BBgX3nkWyX55aRRZacXzjRQfZ6JAFD6VvkkIH7hpNTekvUioG15gu2 EoMiJry2Fhx90BN1hDjk3J7apca+J8hwYT99F8k5HzmG/UDIoII4XK5Z9U8xHx3l6sG4 D2Fgwv1pzO2wRqYBfqfYPueVPsWGKJd+xXrf0uYZBx32NfJmMqU5XosqlYsQJvKzeKQr aXhk3pwhyz3AmAzuQ8xM5dPwccFwmiiBHgCun7sCRoWmguYjFRi5RHxLsH1U4tD6jLEo LQBQ== X-Gm-Message-State: ABuFfogJvTdoZ57qUtXch0F3CnhsyCaa9L5Z7Ev5wk/2TSXMw52GmvVl Gl/AAQdaevFB624JMF3EekCPMQ== X-Google-Smtp-Source: ACcGV617KUrdJexDu7vn9vi6o6I7UyBB8If1NfZUaq3bnHucnjSBhnGVvi9UbzVtIu6Y1WVngCPpGg== X-Received: by 2002:a1c:a708:: with SMTP id q8-v6mr1616917wme.133.1539263560190; Thu, 11 Oct 2018 06:12:40 -0700 (PDT) Received: from 6wind.com (host.78.145.23.62.rev.coltfrance.com. [62.23.145.78]) by smtp.gmail.com with ESMTPSA id c18-v6sm7363370wrr.74.2018.10.11.06.12.39 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 11 Oct 2018 06:12:39 -0700 (PDT) Date: Thu, 11 Oct 2018 15:12:21 +0200 From: Adrien Mazarguil To: Ori Kam Cc: Andrew Rybchenko , Ferruh Yigit , "stephen@networkplumber.org" , Declan Doherty , "dev@dpdk.org" , Dekel Peled , Thomas Monjalon , =?utf-8?B?TsOpbGlv?= Laranjeiro , Yongseok Koh , Shahaf Shuler Message-ID: <20181011131221.GQ18937@6wind.com> References: <1537995646-95260-1-git-send-email-orika@mellanox.com> <1538917054-68283-1-git-send-email-orika@mellanox.com> <9760f054-bbe9-2036-dd5d-d39edd906496@intel.com> <1165fc19-c68b-f13c-e2a6-eeb3f6937922@solarflare.com> <20181010120207.GM18937@6wind.com> <20181010161015.GN18937@6wind.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Subject: Re: [dpdk-dev] [PATCH v3 0/3] ethdev: add generic L2/L3 tunnel encapsulation actions X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Oct 2018 13:12:40 -0000 Hey Ori, (removing most of the discussion, I'll only reply to the summary) On Thu, Oct 11, 2018 at 08:48:05AM +0000, Ori Kam wrote: > Hi Adrian, > > Thanks for your comments please see my answer below and inline. > > Due to a very short time limit and the fact that we have more than > 4 patches that are based on this we need to close it fast. > > As I can see there are number of options: > * the old approach that neither of us like. And which mean that for > every tunnel we create a new command. Just to be sure, you mean that for each new tunnel *type* a new rte_flow action *type* must be added to DPDK right? Because the above reads like with your proposal, a single flow rule can manage any number of TEPs and flow rule creation for subsequent tunnels can be somehow bypassed. One flow *rule* is still needed per TEP or did I miss something? > * My proposed suggestion as is. Which is easier for at least number of application > to implement and faster in most cases. > * My suggestion with different name, but then we need to find also a name > for the decap and also a name for decap_l3. This approach is also problematic > since we have 2 API that are doing the same thig. For example in test-pmd encap > vxlan in which API shell we use? Since you're doing this for MPLSoUDP and MPLSoGRE, you could leave VXLAN/NVGRE encap as is, especially since (AFAIK) there are series still relying on their API floating on the ML. > * Combine between my suggestion and the current one by replacing the raw > buffer with list of items. Less code duplication easier on the validation ( that > don't think we need to validate the encap data) but we loss insertion rate. Already suggested in the past [1], this led to VXLAN and NVGRE encap as we know them. > * your suggestion of list of action that each action is one item. Main problem > is speed. Complexity form the application side and time to implement. Speed matters a lot to me also (go figure) but I still doubt this approach is measurably faster. On the usability side, compared to one action per protocol layer which better fits the rte_flow model, I'm also not convinced. If we put aside usability and performance on which we'll never agree, there is still one outstanding issue: the lack of mask. Users cannot tell which fields are relevant and to be kept as is, and which are not. How do applications know what blanks are filled in by HW? How do PMDs know what applications expect? There's a risk of sending incomplete or malformed packets depending on the implementation. One may expect PMDs and HW to just "do the sensible thing" but some applications won't know that some fields are not offloaded and will be emitted with an unexpected value, while others will attempt to force a normally offloaded field to some specific value and expect it to leave unmodified. This cannot be predicted by the PMD, something is needed. Assuming you add a mask pointer to address this, generic encap should be functionally complete but not all that different from what we currently have for VXLAN/NVGRE and from Declan's earlier proposal for generic encap [1]; PMD must parse the buffer (using a proper packet parser with your approach), collect relevant fields, see if anything's unsupported while doing so before proceeding with the flow rule. Anyway, if you add that mask and rename these actions (since they should work with pretty much anything, not necessarily tunnels, i.e. lazy applications could ask HW to prepend missing Ethernet headers to pure IP traffic), they can make sense. How about labeling this "raw" encap/decap? RTE_FLOW_ACTION_TYPE_RAW_(ENCAP|DECAP) struct rte_flow_action_raw_encap { uint8_t *data; /**< Encapsulation data. */ uint8_t *preserve; /**< Bit-mask of @p data to preserve on output. */ size_t size; /**< Size of @p data and @p preserve. */ }; I guess decap could use the same object. Since there is no way to define a sensible default behavior that works across multiple vendors when "preserve" is not provided, I think this field cannot be NULL. As for "L3 decap", well, can't one just provide a separate encap action? I mean a raw decap action, followed by another action doing raw encap of the intended L2? A separate set of actions seems unnecessary for that. [1] "[PATCH v3 2/4] ethdev: Add tunnel encap/decap actions" https://mails.dpdk.org/archives/dev/2018-April/095733.html -- Adrien Mazarguil 6WIND