Date: Fri, 27 Dec 2019 14:54:28 +0100
From: Olivier Matz
To: Andrew Rybchenko
Cc: Ferruh Yigit, Shahaf Shuler, dev@dpdk.org, "Ananyev, Konstantin",
 Thomas Monjalon, Bruce Richardson, Matan Azrad, Jerin Jacob Kollanukkaran
Subject: Re: [dpdk-dev] questions about new offload ethdev api
Message-ID: <20191227135428.GP22738@platinum>
References: <20180123135308.tr7nmuqsdeogm7bl@glumotte.dev.6wind.com>
 <65f5f247-15e7-ac0a-183e-8a66193f426f@intel.com>

Hi,

Few comments below.
On Mon, Dec 16, 2019 at 11:39:05AM +0300, Andrew Rybchenko wrote:
> On 12/10/19 9:07 PM, Ferruh Yigit wrote:
> > On 1/23/2018 2:34 PM, Shahaf Shuler wrote:
> >> Tuesday, January 23, 2018 3:53 PM, Olivier Matz:
> >
> > <...>
> >
> >>>
> >>> 2/ meaning of rxmode.jumbo_frame, rxmode.enable_scatter,
> >>>    rxmode.max_rx_pkt_len
> >>>
> >>> While it's not related to the new API, it is probably a good
> >>> opportunity to clarify the meaning of these flags. I'm not able to
> >>> find a good documentation about them.
> >>>
> >>> Here is my understanding, the configuration only depends on:
> >>> - the maximum rx frame length
> >>> - the amount of data available in a mbuf (minus headroom)
> >>>
> >>> Flags to set in rxmode (example):
> >>> +---------------+----------------+----------------+-----------------+
> >>> |               |mbuf_data_len=1K|mbuf_data_len=2K|mbuf_data_len=16K|
> >>> +---------------+----------------+----------------+-----------------+
> >>> |max_rx_len=1500|enable_scatter  |                |                 |
> >>> +---------------+----------------+----------------+-----------------+
> >>> |max_rx_len=9000|enable_scatter, |enable_scatter, |jumbo_frame      |
> >>> |               |jumbo_frame     |jumbo_frame     |                 |
> >>> +---------------+----------------+----------------+-----------------+

Due to successive quotes, the table was not readable in my mail client,
here it is again (narrower):

+------------+---------------+---------------+---------------+
|            |mbuf_data_len= |mbuf_data_len= |mbuf_data_len= |
|            |1K             |2K             |16K            |
+------------+---------------+---------------+---------------+
|max_rx_len= |enable_scatter |               |               |
|1500        |               |               |               |
+------------+---------------+---------------+---------------+
|max_rx_len= |enable_scatter,|enable_scatter,|jumbo_frame    |
|9000        |jumbo_frame    |jumbo_frame    |               |
+------------+---------------+---------------+---------------+

> >>> If this table is correct, the flag jumbo_frame would be equivalent
> >>> to check if max_rx_pkt_len is above a threshold.
> >>>
> >>> And enable_scatter could be deduced from the mbuf size of the given
> >>> rxq (which is a bit harder but maybe doable).
> >>
> >> I glad you raised this subject. We had a lot of discussion on it
> >> internally in Mellanox.
> >>
> >> I fully agree. All application needs is to specify the maximum packet
> >> size it wants to receive.
> >>
> >> I think also the lack of documentation is causing PMDs to use those
> >> flags wrongly. For example - some PMDs set the jumbo_frame flag
> >> internally without it being set by the application.
> >>
> >> I would like to add one more item : MTU. What is the relation (if
> >> any) between setting MTU and the max_rx_len ? I know MTU stands for
> >> Max Transmit Unit, however at least in Linux it is the same for the
> >> Send and the receive.
> >>
> >>
> >
> > (Resurrecting the thread after two years, I will reply again with
> > latest understanding.)
> >
> > Thanks Olivier for above summary and table, and unfortunately usage
> > still not consistent between PMDs. According my understanding:
> >
> > 'max_rx_pkt_len' is user configuration value, to limit the size packet
> > that is shared with host, but this doesn't limit the size of packet
> > that NIC receives.

When you say the size of packet shared with the host, do you mean for
instance that the NIC will receive a 1500B packet and will only write
128 bytes of data in the mbuf? If yes, this was not my understanding. I
suppose it could be used for monitoring. What should be the value for rx
offload infos like checksum or packet type if the packet (or the header)
is truncated?
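By the way, to make the table above concrete, here is roughly the
deduction I have in mind on the application side. This is only an
untested sketch written against the current offload flags; the helper
name and the use of RTE_ETHER_MAX_LEN as the "jumbo" threshold are my
assumptions:

#include <rte_ethdev.h>
#include <rte_ether.h>
#include <rte_mbuf.h>

/* Sketch: derive the Rx offload flags of the table from the maximum
 * Rx frame length and the mempool used by the Rx queues. */
static uint64_t
rx_offloads_for(uint32_t max_rx_len, struct rte_mempool *mb_pool)
{
        /* data that fits in one mbuf, excluding the headroom */
        uint32_t mbuf_data_len = rte_pktmbuf_data_room_size(mb_pool) -
                RTE_PKTMBUF_HEADROOM;
        uint64_t offloads = 0;

        /* jumbo_frame <=> max_rx_len above the standard Ethernet frame */
        if (max_rx_len > RTE_ETHER_MAX_LEN)
                offloads |= DEV_RX_OFFLOAD_JUMBO_FRAME;

        /* enable_scatter <=> a single mbuf cannot hold the largest frame */
        if (max_rx_len > mbuf_data_len)
                offloads |= DEV_RX_OFFLOAD_SCATTER;

        return offloads;
}

With mbuf_data_len=2K and max_rx_len=1500 this gives no flag, with
max_rx_len=9000 it gives both, as in the table.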
> Also comment in lib/librte_ethdev/rte_ethdev.h says that the
> rxmode field is used if (and I think only if) JUMBO_FRAME is
> enabled. So, if user wants to set it on device configure stage,
> device *must* support JUMBO_FRAME offload which mean that
> driver code handles rxmode.max_rx_pkt_len and either accept it
> and configures HW appropriately or return an error if specified
> value is wrong. Basically it is written in jumbo frame feature
> definition in features.rst. User has max_rx_pktlen in dev_info
> to find out maximum supported value for max_rx_pkt_len.
>
> > Like if the mbuf size of the mempool used by a queue is 1024 bytes, we
> > don't want packets bigger than buffer size, but if NIC supports it is
> > possible receive 6000 bytes packet and split data into multiple
> > buffers, and we can use multi segment packets to represent it.
> > So what we need is NIC ability to limit the size of data to share to
> > host and scattered Rx support (device + driver).
>
> It requires RX_SCATTER offload enabled and it must be
> controlled by the user only (not PMD) since it basically
> mean if the application is ready to handle multi-segment
> packets (have code which takes a look at the number of
> segments and next pointers etc). Moreover, application
> may disable MULTI_SEG Tx offload (and drivers may ignore
> number of segments and next pointer as well).

Agree, I think it is important that the application can control the
enabling of rx scatter, either by a flag, or simply by passing
max_rx_len <= mbuf_data_len.

> > But MTU limits the size of the packet that NIC receives.
>
> Personally I've never treated it this way. For me the only
> difference between max_rx_pkt_len and MTU is:
>  - max_rx_pkt_len is entire packet with all L2 headers and
>    even FCS (basically everything which could be provided
>    to application in mbuf)
>  - MTU does not cover L2 (and VLANs, I'm not sure about MPLS)
>
> > Assuming above are correct J,
> >
> > Using mbuf data size as 'max_rx_pkt_len' without asking from user is
> > an option, but perhaps user has different reason to limit packet size,
> > so I think better to keep as different config option.
> >
> > I think PMD itself enabling "jumbo frame" offload is not too bad, and
> > acceptable, since providing a large MTU already implies it.
>
> Yes +1
>
> > But not sure about PMD enabling scattered Rx, application may want to
> > force to receive single segment mbufs, for that case PMD enabling this
> > config on its own looks like a problem.
>
> Yes +1
>
> > But user really needs this when a packet doesn't fit to the mbuf, so
> > providing a MTU larger than 'max_rx_pkt_len' _may_ imply enabling
> > scattered Rx, I assume this is the logic in some PMDs which looks
> > acceptable.
>
> I don't think so. IMO auto enabling Rx scatter from PMD is a
> breakage of a contract between application and driver.
> As stated above the application may be simply not ready to
> handle multi-segment packets correctly.
>
> I think that providing an MTU larger than 'max_rx_pkt_len' is simply a
> change of max_rx_pkt_len = (MTU plus space for L2+).

As VLANs are not taken into account in the MTU, it means that if the MTU
is 1500, the max Ethernet frame length is 1500 + 14 (eth hdr) + 4 (vlan)
+ 4 (2nd vlan / qinq) + 4 (crc) = 1526. Shouldn't we only use L2 lengths
instead of MTU? I don't know what is usually expected by different
hardware (mtu or l2 len).
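For the record, the same arithmetic written as a small helper. Again
only a sketch: the helper name and the choice to always count two VLAN
tags (QinQ) are mine, this is not an existing ethdev API:

#include <stdint.h>
#include <rte_ether.h>

/* L2 overhead not counted in the MTU: Ethernet header (14), up to two
 * VLAN tags (2 * 4, to also cover QinQ) and the CRC (4) = 26 bytes. */
#define FRAME_L2_OVERHEAD \
        (RTE_ETHER_HDR_LEN + 2 * sizeof(struct rte_vlan_hdr) + RTE_ETHER_CRC_LEN)

/* Max Ethernet frame length on the wire for a given MTU,
 * e.g. mtu_to_frame_len(1500) = 1526. */
static inline uint32_t
mtu_to_frame_len(uint16_t mtu)
{
        return (uint32_t)mtu + FRAME_L2_OVERHEAD;
}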
> > And PMD behavior should be according for mentioned configs:
> >
> > 1) Don't change user provided 'max_rx_pkt_len' value
>
> I have no strong opinion. However, it is important to clarify
> which understanding of max_rx_pkt_len vs MTU is the right one.
>
> > 2) If jumbo frame is not enabled, don't limit the size of packets to
> > the host (I think this is based on assumption that mbuf size always
> > will be > 1514)
>
> I think that JUMBO_FRAME is not relevant here. It is just a
> promise to take a look at max_rx_pkt_len on configure or
> start stage.
>
> > 3) When user request to set the MTU bigger than ETH_MAX, PMD enable
> > jumbo frame support (if it is not enabled by user already and
> > supported by HW). If HW doesn't support if of course it should fail.
>
> I'm not sure which ETH_MAX is mentioned above.
> #define ETH_MAX_MTU     0xFFFFU     /* 65535, same as IP_MAX_MTU */
> or do you mean
> #define ETH_FRAME_LEN   1514        /* Max. octets in frame sans FCS */
> or even
> #define ETH_DATA_LEN    1500        /* Max. octets in payload */
>
> We should be careful when we talk about Ethernet lengths and
> MTU.
>
> > 4) When user request to set MTU bigger than 'max_rx_pkt_len'
>
> I think the second parameter to consider here is not
> max_rx_pkt_len, but amount of space for data in single
> mbuf (for all Rx queues).
>
> > 4a) if "scattered Rx" is enabled, configure the MTU and limit packet
> > size to host to 'max_rx_pkt_len'
>
> Yes and no. IMO configure the MTU and bump max_rx_pkt_len.
>
> > 4b) if "scattered Rx" is not enabled but HW supports it, enable
> > "scattered Rx" by PMD, configure the MTU and limit packet size to host
> > to 'max_rx_pkt_len'
>
> No, I think it is wrong to enable Rx scatter from PMD.
>
> > 4c) if "scattered Rx" is not enabled and not supported by HW, fail MTU
> > set.
>
> Yes, regardless support in HW.
>
> > 4d) if HW doesn't support to limit the packet size to host, but
> > requested MTU bigger than 'max_rx_pkt_len' it should fail.
>
> I would rephrase it as impossibility to disable Rx scatter.
> If so, it must be driver responsibility to drop scattered
> packets if Rx scatter offload is not enabled.
>
> > Btw, I am aware of that some PMDs have a larger MTU by default and
> > can't limit the packet size to host to 'max_rx_pkt_len' value, I don't
> > know what to do in that case, fail in configure? Or at least be sure
> > configured mempool's mbuf size is big enough?
>
> See above.
>
> Thanks for reminder about the topic.

I have the impression that what we want can be done with these 3 values:

- max_rx_pkt_size: maximum size of received packet, larger ones are
  dropped
- max_rx_data_size: maximum size of data copied in a mbuf chain, larger
  packets are truncated
- max_rx_seg_size: maximum size written in a segment (this can be
  retrieved from pool private info = rte_pktmbuf_data_room_size() -
  RTE_PKTMBUF_HEADROOM)

I think the first 2 values should be L2 lengths, including CRC if
CRC-receive is enabled.

if max_rx_data_size <= max_rx_seg_size, scatter is disabled
if max_rx_data_size < max_rx_pkt_size, packets can be truncated

In case a PMD is not able to limit a packet size, it can be advertised
by a capability, and it would be up to the application to do it by sw.
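To illustrate, assuming these 3 values existed (they do not today: the
struct and the helpers below are only a sketch of the proposal, not an
existing API):

#include <stdbool.h>
#include <stdint.h>
#include <rte_mbuf.h>

/* Hypothetical Rx limits, as proposed above (not part of rte_ethdev). */
struct rx_limits {
        uint32_t max_rx_pkt_size;   /* larger packets are dropped */
        uint32_t max_rx_data_size;  /* larger packets are truncated */
        uint32_t max_rx_seg_size;   /* max bytes written per segment */
};

/* max_rx_seg_size comes from the mempool private info. */
static uint32_t
rx_seg_size(struct rte_mempool *mb_pool)
{
        return rte_pktmbuf_data_room_size(mb_pool) - RTE_PKTMBUF_HEADROOM;
}

/* scatter is needed only when one segment cannot hold max_rx_data_size */
static bool
rx_scatter_needed(const struct rx_limits *lim)
{
        return lim->max_rx_data_size > lim->max_rx_seg_size;
}

/* packets may be truncated when the data limit is below the packet limit */
static bool
rx_may_truncate(const struct rx_limits *lim)
{
        return lim->max_rx_data_size < lim->max_rx_pkt_size;
}

Olivier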