From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
To: "Richardson, Bruce", Vladislav Zolotarov
Cc: "dev@dpdk.org"
Subject: Re: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh above 1 for all NICs but 82598
Date: Fri, 11 Sep 2015 20:44:54 +0300
Message-ID: <55F31316.2090807@cloudius-systems.com>
In-Reply-To: <59AF69C657FD0841A61C55336867B5B0359263BC@IRSMSX103.ger.corp.intel.com>
References: <1439489195-31553-1-git-send-email-vladz@cloudius-systems.com>
 <55F2E448.1070602@6wind.com>
 <55F2E997.5050009@cloudius-systems.com>
 <1762144.1LKiyImgC1@xps13>
 <55F2F6A9.6080405@cloudius-systems.com>
 <59AF69C657FD0841A61C55336867B5B0359263BC@IRSMSX103.ger.corp.intel.com>

On 09/11/2015 07:07 PM, Richardson, Bruce wrote:
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Vladislav Zolotarov
>> Sent: Friday, September 11, 2015 5:04 PM
>> To: Avi Kivity
>> Cc: dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh above 1
>> for all NICs but 82598
>>
>> On Sep 11, 2015 6:43 PM, "Avi Kivity" wrote:
>>> On 09/11/2015 06:12 PM, Vladislav Zolotarov wrote:
>>>>
>>>> On Sep 11, 2015 5:55 PM, "Thomas Monjalon" wrote:
>>>>> 2015-09-11 17:47, Avi Kivity:
>>>>>> On 09/11/2015 05:25 PM, didier.pallard wrote:
>>>>>>> On 08/25/2015 08:52 PM, Vlad Zolotarov wrote:
>>>>>>>> Helin, the issue has been seen on x540 devices.
>>>>>>>> Pls., see a chapter 7.2.1.1 of x540 devices spec:
>>>>>>>>
>>>>>>>> A packet (or multiple packets in transmit segmentation) can
>>>>>>>> span any number of buffers (and their descriptors) up to a
>>>>>>>> limit of 40 minus WTHRESH minus 2 (see Section 7.2.3.3 for
>>>>>>>> Tx Ring details and section Section 7.2.3.5.1 for WTHRESH
>>>>>>>> details). For best performance it is recommended to minimize
>>>>>>>> the number of buffers as possible.
>>>>>>>>
>>>>>>>> Could u, pls., clarify why do u think that the maximum number
>>>>>>>> of data buffers is limited by 8?
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>> vlad
>>>>>>> Hi vlad,
>>>>>>>
>>>>>>> Documentation states that a packet (or multiple packets in
>>>>>>> transmit segmentation) can span any number of buffers (and
>>>>>>> their descriptors) up to a limit of 40 minus WTHRESH minus 2.
>>>>>>>
>>>>>>> Shouldn't there be a test in transmit function that drops
>>>>>>> properly the mbufs with a too large number of segments, while
>>>>>>> incrementing a statistic; otherwise transmit function may be
>>>>>>> locked by the faulty packet without notification.
>>>>>>>
>>>>>> What we proposed is that the pmd expose to dpdk, and dpdk expose
>>>>>> to the application, an mbuf check function. This way applications
>>>>>> that can generate complex packets can verify that the device will
>>>>>> be able to process them, and applications that only generate
>>>>>> simple mbufs can avoid the overhead by not calling the function.
>>>>> More than a check, it should be exposed as a capability of the port.
>>>>> Anyway, if the application sends too much segments, the driver must
>>>>> drop it to avoid hang, and maintain a dedicated statistic counter
>>>>> to allow easy debugging.
>>>> I agree with Thomas - this should not be optional. Malformed packets
>>>> should be dropped.
>>>> In the ixgbe case it's a very simple test - it's a single branch
>>>> per packet so i doubt that it could impose any measurable
>>>> performance degradation.
>>>>
>>> A drop allows the application no chance to recover. The driver must
>>> either provide the ability for the application to know that it cannot
>>> accept the packet, or it must fix it up itself.
>>
>> An appropriate statistics counter would be a perfect tool to detect such
>> issues. Knowingly sending a packet that will cause a HW to hang is not
>> acceptable.
> I would agree. Drivers should provide a function to query the max number of
> segments they can accept and the driver should be able to discard any packets
> exceeding that number, and just track it via a stat.
>

There is no such max number of segments. The i40e card, as an extreme
example, allows 8 fragments per packet, but that is after TSO
segmentation. So if the header is in three fragments, that leaves 5
data fragments per packet. Another card (ixgbe) has a 38-fragment
pre-TSO limit.

With such a variety of limitations, the only generic way to expose them
is via a function.
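
To make the difference between the two kinds of limit concrete, here is a minimal sketch of a per-driver check function. Everything in it is hypothetical: struct seg, struct pkt, struct port, tx_check_fn and the two check functions are illustrative stand-ins, not DPDK types or API. The 40 - WTHRESH - 2 and 8-fragment figures are the ones quoted in this thread. The post-TSO check is a conservative approximation: a buffer that holds both header and payload bytes is counted once as header and once as data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for an mbuf chain; NOT the DPDK rte_mbuf layout. */
struct seg { uint32_t len; const struct seg *next; };

struct pkt {
    const struct seg *head;
    uint32_t hdr_len; /* header bytes repeated in every TSO output packet */
    uint32_t mss;     /* payload bytes per TSO output packet; 0 = no TSO */
};

/* Per-driver check: true if the device can transmit the packet as-is. */
typedef bool (*tx_check_fn)(const struct pkt *p, const void *cfg);
struct port { tx_check_fn check; const void *cfg; };

/* ixgbe-like rule: the whole chain, before TSO, may use at most
 * 40 - WTHRESH - 2 buffers. */
struct pre_tso_cfg { uint32_t wthresh; };

static bool check_pre_tso(const struct pkt *p, const void *cfg)
{
    const struct pre_tso_cfg *c = cfg;
    uint32_t limit = c->wthresh < 38 ? 40 - c->wthresh - 2 : 0;
    uint32_t n = 0;

    for (const struct seg *s = p->head; s != NULL; s = s->next)
        if (++n > limit)
            return false;
    return true;
}

/* i40e-like rule: every packet produced by TSO may use at most max_frags
 * fragments, header fragments included. */
struct post_tso_cfg { uint32_t max_frags; };

static bool check_post_tso(const struct pkt *p, const void *cfg)
{
    const struct post_tso_cfg *c = cfg;
    const uint32_t win = p->mss ? p->mss : UINT32_MAX; /* no TSO: one window */
    uint32_t hdr_left = p->hdr_len, hdr_frags = 0;
    uint32_t win_left = win, win_frags = 0;

    for (const struct seg *s = p->head; s != NULL; s = s->next) {
        uint32_t len = s->len;

        if (hdr_left > 0) {            /* this buffer carries header bytes */
            uint32_t take = len < hdr_left ? len : hdr_left;
            hdr_frags++;
            hdr_left -= take;
            len -= take;
            if (hdr_frags > c->max_frags)
                return false;
        }
        while (len > 0) {              /* payload, split at MSS boundaries */
            uint32_t take = len < win_left ? len : win_left;
            win_frags++;
            len -= take;
            win_left -= take;
            if (hdr_frags + win_frags > c->max_frags)
                return false;
            if (win_left == 0) {       /* the next output packet starts here */
                win_left = win;
                win_frags = 0;
            }
        }
    }
    return true;
}

/* What the application would call before handing the packet to the port. */
static bool port_tx_check(const struct port *port, const struct pkt *p)
{
    assert(port != NULL && port->check != NULL && p != NULL);
    return port->check(p, port->cfg);
}

/* Helper: link an array of segments into a chain and return its head. */
static const struct seg *link_segs(struct seg *segs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        segs[i].next = (i + 1 < n) ? &segs[i + 1] : NULL;
    return n ? segs : NULL;
}
```

An application that builds multi-segment or TSO packets would call port_tx_check() and linearize or re-segment on failure; one that only sends simple mbufs would skip the call and pay nothing. A single "max segments" number cannot express the second rule, because the answer depends on where the MSS boundaries fall relative to the buffers.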