From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f53.google.com (mail-wm0-f53.google.com [74.125.82.53]) by dpdk.org (Postfix) with ESMTP id 7E5742C39 for ; Wed, 30 Nov 2016 08:40:12 +0100 (CET) Received: by mail-wm0-f53.google.com with SMTP id u144so52059504wmu.1 for ; Tue, 29 Nov 2016 23:40:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=tRaHLVFQ4iBEwQca0tWZAIIBXzQk4l0SqH1JESJRbwc=; b=VzDAD4cxuERFwU2PxwfXv1pii54h7FwMd0hcKFsflZuS6uAUOEvDaVOL+Vdo/gcGgB clJxuN0s7DPjv6kwvkLDjceOtJB/St9Z8XC+RWna/ERxh4IX0ODA9Ren5zX1aIl2bVvw fA6hPtDYfb/LMIg3UE2L2j/YGGikZjp9dDymNqxLJJkmtlyhys1OLfwSdNA9Ldlqq+9c fmgwufflxbNysz1dp0GbgxPwCyGnax62zzCtVlNCFvK/dMcUqNqetiVvAjGKGmUgUlFg eTVWUiyaboV2I3UgKwq2Tk0CFBBpiEYHX4GA01RFcla9TSCN/5M2OTpNdYJiY9jrlkT9 VtQA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=tRaHLVFQ4iBEwQca0tWZAIIBXzQk4l0SqH1JESJRbwc=; b=igMYeuz73DHCoCePO488D+ADh2a7F//qujv/2LUinycmz+YLZbjmVANGtG0dvTneAO rMjMTO0pBa1IkZcvaEC6R62ITJ4bKxAwymXs5vKyE3uf8Agd7P+z6q289GReg3Vh2xx6 9Fd4HF1FRWZAEwV7enF3SMsRXrUnl0l/RihRms5fVEG6kBDvMciBU5Rf5iyqIJFAOcbh Ibqy+nkeoyVYeODah+lYDQ2DrVdA/bE0ADUtCsQH3AIkxUX41oxjKYmD279E8rXEZeWi +rszkiR8NR4ZqWPT3gUWabS4O3y/gLIQ/1aoyGPtIEfWmXX1nGqyTrgJRRRChdc8Vdsc xTTA== X-Gm-Message-State: AKaTC02UcYQofMpwWeB8B8mmm6adDCeH5twBySbFIGc4jneOt9vSbDbMQEzgsDs8fgnoa3OO X-Received: by 10.28.87.84 with SMTP id l81mr29265844wmb.48.1480491612174; Tue, 29 Nov 2016 23:40:12 -0800 (PST) Received: from 6wind.com (guy78-3-82-239-227-177.fbx.proxad.net. [82.239.227.177]) by smtp.gmail.com with ESMTPSA id vr9sm71523207wjc.35.2016.11.29.23.40.10 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 29 Nov 2016 23:40:11 -0800 (PST) Date: Wed, 30 Nov 2016 08:40:03 +0100 From: Adrien Mazarguil To: Thomas Monjalon Cc: dev@dpdk.org, Rahul Lakkireddy , Stephen Hurd , Jan Medala , Jakub Palider , John Daley , Alejandro Lucero , Harish Patil , Rasesh Mody , Jerin Jacob , Yuanhan Liu , Yong Wang , Tomasz Kulasek , konstantin.ananyev@intel.com, olivier.matz@6wind.com Message-ID: <20161130074003.GD10340@6wind.com> References: <1477486575-25148-1-git-send-email-tomaszx.kulasek@intel.com> <1479922585-8640-1-git-send-email-tomaszx.kulasek@intel.com> <8317180.L80Qf11uiu@xps13> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <8317180.L80Qf11uiu@xps13> Subject: Re: [dpdk-dev] [PATCH v12 0/6] add Tx preparation X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Nov 2016 07:40:12 -0000 On Mon, Nov 28, 2016 at 12:03:06PM +0100, Thomas Monjalon wrote: > We need attention of every PMD developers on this thread. I've been following this thread from the beginning while working on rte_flow and wanted to see where it was headed before replying. (I know, v11 was submitted about 1 month ago but still.) > Reminder of what Konstantin suggested: > " > - if the PMD supports TX offloads AND > - if to be able use any of these offloads the upper layer SW would have to: > * modify the contents of the packet OR > * obey HW specific restrictions > then it is a PMD developer responsibility to provide tx_prep() that would implement > expected modifications of the packet contents and restriction checks. > Otherwise, tx_prep() implementation is not required and can be safely set to NULL. > " > > I copy/paste also my previous conclusion: > > Before txprep, there is only one API: the application must prepare the > packets checksum itself (get_psd_sum in testpmd). > With txprep, the application have 2 choices: keep doing the job itself > or call txprep which calls a PMD-specific function. Something is definitely needed here, and only PMDs can provide it. I think applications should not have to clear checksum fields or initialize them to some magic value, same goes for any other offload or hardware limitation that needs to be worked around. tx_prep() is one possible answer to this issue, however as mentioned in the original patch it can be very expensive if exposed by the PMD. Another issue I'm more concerned about is the way limitations are managed (struct rte_eth_desc_lim). While not officially tied to tx_prep(), this structure contains new fields that are only relevant to a few devices, and I fear it will keep growing with each new hardware quirk to manage, breaking ABIs in the process. What are applications supposed to do, check each of them regardless before attempting to send a burst? I understand tx_prep() automates this process, however I'm wondering why isn't the TX burst function doing that itself. Using nb_mtu_seg_max as an example, tx_prep() has an extra check in case of TSO that the TX burst function does not perform. This ends up being much more expensive to applications due to the additional loop doing redundant testing on each mbuf. If, say as a performance improvement, we decided to leave the validation part to the TX burst function; what remains in tx_prep() is basically heavy "preparation" requiring mbuf changes (i.e. erasing checksums, for now). Following the same logic, why can't such a thing be made part of the TX burst function as well (through a direct call to rte_phdr_cksum_fix() whenever necessary). From an application standpoint, what are the advantages of having to: if (tx_prep()) // iterate and update mbufs as needed tx_burst(); // iterate and send Compared to: tx_burst(); // iterate, update as needed and send Note that PMDs could still provide different TX callbacks depending on the set of enabled offloads so performance is not unnecessarily impacted. In my opinion the second approach is both faster to applications and more friendly from a usability perspective, am I missing something obvious? > The question is: does non-Intel drivers need a checksum preparation for TSO? > Will it behave well if txprep does nothing in these drivers? > > When looking at the code, most of drivers handle the TSO flags. > But it is hard to know whether they rely on the pseudo checksum or not. > > git grep -l 'PKT_TX_UDP_CKSUM\|PKT_TX_TCP_CKSUM\|PKT_TX_TCP_SEG' drivers/net/ > > drivers/net/bnxt/bnxt_txr.c > drivers/net/cxgbe/sge.c > drivers/net/e1000/em_rxtx.c > drivers/net/e1000/igb_rxtx.c > drivers/net/ena/ena_ethdev.c > drivers/net/enic/enic_rxtx.c > drivers/net/fm10k/fm10k_rxtx.c > drivers/net/i40e/i40e_rxtx.c > drivers/net/ixgbe/ixgbe_rxtx.c > drivers/net/mlx4/mlx4.c > drivers/net/mlx5/mlx5_rxtx.c > drivers/net/nfp/nfp_net.c > drivers/net/qede/qede_rxtx.c > drivers/net/thunderx/nicvf_rxtx.c > drivers/net/virtio/virtio_rxtx.c > drivers/net/vmxnet3/vmxnet3_rxtx.c > > Please, we need a comment for each driver saying > "it is OK, we do not need any checksum preparation for TSO" > or > "yes we have to implement tx_prepare or TSO will not work in this mode" For both mlx4 and mlx5 then, "it is OK, we do not need any checksum preparation for TSO". Actually I do not think we'll ever need tx_prep() unless we add our own quirks to struct rte_eth_desc_lim (and friends) which are currently quietly handled by TX burst functions. -- Adrien Mazarguil 6WIND