From: Sagi Grimberg
To: Yongseok Koh
Cc: adrien.mazarguil@6wind.com, nelio.laranjeiro@6wind.com, dev@dpdk.org
Date: Sun, 23 Jul 2017 12:49:36 +0300
Message-ID: <918eb17f-269c-7102-41ab-69ceb95fdacf@grimberg.me>
In-Reply-To: <20170721151006.GA38779@yongseok-MBP.local>
Subject: Re: [dpdk-dev] [PATCH] net/mlx5: poll completion queue once per a call

>>> mlx5_tx_complete() polls the completion queue multiple times until it
>>> encounters an invalid entry. As Tx completions are suppressed by
>>> MLX5_TX_COMP_THRESH, it is a waste of cycles to expect multiple
>>> completions in a poll. And freeing too many buffers in one call can
>>> cause high jitter. This patch improves throughput a little.
>>
>> What if the device generates a burst of completions?
>
> The mlx5 PMD suppresses completions anyway. It requests a completion for
> every MLX5_TX_COMP_THRESH Tx mbufs, not for every single mbuf, so the
> completion queue is much smaller.

Yes, I realize that, but can't the device still complete in a burst (of
unsuppressed completions)? I mean, it's not guaranteed that for every
txq_complete a signaled completion is pending, right?

What happens if the device has inconsistent completion pacing? Can't the
sw grow a backlog of completions if txq_complete processes a single
completion unconditionally?
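To make sure we are discussing the same thing, here is a minimal sketch
of the two strategies as I understand them. All names (cq_poll_one,
free_completed_mbufs, struct txq) are made up for illustration; this is
not the actual mlx5 PMD code.

#include <stdbool.h>

struct txq;                       /* opaque TX queue, illustration only */
bool cq_poll_one(struct txq *q);  /* false when no valid CQE is pending */
void free_completed_mbufs(struct txq *q);

/* Before the patch: drain every valid CQE, freeing a potentially large
 * batch of mbufs in one call (the source of the jitter). */
static void tx_complete_drain(struct txq *q)
{
	while (cq_poll_one(q))
		free_completed_mbufs(q);
}

/* After the patch: consume at most one signaled completion per call,
 * i.e. at most MLX5_TX_COMP_THRESH mbufs are freed per TX burst. */
static void tx_complete_once(struct txq *q)
{
	if (cq_poll_one(q))
		free_completed_mbufs(q);
}

My concern is with tx_complete_once: if the device momentarily signals
completions faster than one per TX burst call, the CQ backlog grows and
mbufs stay outstanding longer.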
>> Holding these completions un-reaped can theoretically cause resource
>> stress on the corresponding mempool(s).
>
> Can you make your point clearer? Do you think the "stress" can impact
> performance? I think stress doesn't matter unless the pool is depleted,
> and the app is responsible for supplying enough mbufs considering the
> depth of all queues (max # of outstanding mbufs).

I might be missing something, but shouldn't the number of outstanding
mbufs be relatively small, given that the PMD reaps every
MLX5_TX_COMP_THRESH mbufs? Why should the pool account for the entire TX
queue depth (which can be very large)? Is there a hard requirement
documented somewhere that the application must account for the entire TX
queue depth when sizing its mbuf pool?

My question is: with the proposed change, doesn't the application
potentially need to allocate a bigger TX mbuf pool, because the PMD can
now consume completions more slowly (i.e., across multiple TX burst
calls)?

>> I totally get the need for a stopping condition, but is "loop once"
>> the best stop condition?
>
> Best for what?

Best condition for stopping the consumption of TX completions. As I
said, I think that leaving TX completions un-reaped can (at least in
theory) slow down mbuf reclamation, which impacts the application
(unless I'm misunderstanding something fundamental).

>> Perhaps an adaptive budget (based on online stats) would perform
>> better?
>
> Please bring up any suggestion or submit a patch if any.

I was simply providing a review of the patch. I don't have the time to
come up with a better patch unfortunately, but I still think it's fair
to raise the point.

> Does "budget" mean the threshold? If so, calculating stats for an
> adaptive threshold can impact single-core performance. With multiple
> cores, adjusting the threshold doesn't matter much.

If you look at the mlx5e driver in the kernel, it maintains online stats
on its RX and TX queues. It maintains these stats mostly for adaptive
interrupt moderation control (but not only). I was suggesting
maintaining per-TX-queue stats on the average number of completions
consumed per TX burst call, and adjusting the stopping condition
according to that calculated stat.
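Something along these lines is what I had in mind. This is a rough,
untested sketch reusing the hypothetical helpers from the sketch above;
the fixed-point EWMA is just one possible choice of online stat, not
anything taken from mlx5 or mlx5e.

#include <stdint.h>

/* EWMA of completions consumed per TX burst, scaled by 8 (fixed point). */
struct txq_stats {
	uint32_t avg_comp_x8;
};

/* avg <- avg * 7/8 + n/8, kept in the scale-by-8 fixed point:
 * avg_x8 <- avg_x8 - avg_x8/8 + n. */
static inline void txq_stats_update(struct txq_stats *s, uint32_t n)
{
	s->avg_comp_x8 = s->avg_comp_x8 - (s->avg_comp_x8 >> 3) + n;
}

/* Budget for the next call: at least one completion, at most the
 * recent average. */
static inline uint32_t txq_comp_budget(const struct txq_stats *s)
{
	uint32_t avg = s->avg_comp_x8 >> 3;

	return avg > 1 ? avg : 1;
}

/* Adaptive stopping condition: reap up to the average number of
 * completions seen recently, instead of exactly one. */
static void tx_complete_adaptive(struct txq *q, struct txq_stats *s)
{
	uint32_t done = 0;
	uint32_t budget = txq_comp_budget(s);

	while (done < budget && cq_poll_one(q)) {
		free_completed_mbufs(q);
		done++;
	}
	txq_stats_update(s, done);
}

The stats update is a shift and two adds per burst, so the single-core
cost should be negligible, but only a measurement can settle that.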