From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: <3chas3@gmail.com>
To: "Zhao1, Wei", Luca Boccassi, "dev@dpdk.org"
Cc: "Lu, Wenzhuo", "Ananyev, Konstantin", "stable@dpdk.org"
References: <20180815141430.13421-1-bluca@debian.org>
From: Chas Williams <3chas3@gmail.com>
Message-ID: <45b8b29e-6310-c906-5d22-e9797a81547a@gmail.com>
Date: Wed, 7 Nov 2018 13:54:56 -0500
Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: reduce PF mailbox interrupt rate
List-Id: DPDK patches and discussions

On 11/07/2018 04:17 AM, Zhao1, Wei wrote:
> Hi, Luca Boccassi
>
> The purpose of this patch is to reduce the mailbox interrupt from vf to pf, but there seem some point need for discussion in this patch.
>
> First, I do not know why do you change code of function ixgbe_check_mac_link_vf(), because in rte_eth_link_get_nowait() and rte_eth_link_get(),
> it will call ixgbe_dev_link_update()->ixgbe_dev_link_update_share()-> ixgbevf_check_link() for VF, NOT ixgbe_check_mac_link_vf() in your patch!
>
> Second, in function ixgbevf_check_link(), there is mailbox message read operation for vf,
> " if (mbx->ops.read(hw, &in_msg, 1, 0))", that is ixgbe_read_mbx_vf() ,
> This will cause interrupt from vf to pf, this is just the point of this patch, it is also the problem that you want to solve.
> So, you use autoneg_wait_to_complete flag to control this mailbox message read operation, maybe you will use rte_eth_link_get_nowait(), Which set autoneg_wait_to_complete = 0, then the interrupt from vf to pf can be reduced.
>
> But I do not think this patch is necessary, because in ixgbevf_check_link(), it has

I think you are right here. This patch predates the addition of the vf
argument to ixgbe_dev_link_update_share() and the split of .link_update
between ixgbe and ixgbevf. At one point it was especially beneficial if
you were running bonding, which tends to make quite a few link status
checks. If the VF path now goes through ixgbevf_check_link() as you
describe, this patch probably isn't helping anymore. I will try to
find some time to test this locally.

> "
>         bool no_pflink_check = wait_to_complete == 0;
>
>         ////////////////////////
>
>         if (no_pflink_check) {
>                 if (*speed == IXGBE_LINK_SPEED_UNKNOWN)
>                         mac->get_link_status = true;
>                 else
>                         mac->get_link_status = false;
>
>                 goto out;
>         }
> "
> Comment of "for a quick link status checking, wait_to_compelet == 0, skip PF link status checking " is clear.
>
> That means in rte_eth_link_get_nowait(), code will skip this mailbox read interrupt, only in
> rte_eth_link_get() there will be this interrupt, so I think what you need to do is just replace
> rte_eth_link_get() with rte_eth_link_get_nowait() in your APP,
> that will reduce interrupt from vf to pf in mailbox read.
>
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Luca Boccassi
>> Sent: Wednesday, August 15, 2018 10:15 PM
>> To: dev@dpdk.org
>> Cc: Lu, Wenzhuo; Ananyev, Konstantin; Luca Boccassi; stable@dpdk.org
>> Subject: [dpdk-dev] [PATCH] net/ixgbe: reduce PF mailbox interrupt rate
>>
>> We have observed high rate of NIC PF interrupts when VNF is using DPDK
>> APIs rte_eth_link_get_nowait() and rte_eth_link_get() functions, as they
>> are causing VF driver to send many MBOX ACK messages.
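
To make the fast path Wei quotes from ixgbevf_check_link() concrete,
here is a rough stand-alone model. It is not driver code: the struct and
the model_* functions are hypothetical names made up for this sketch.
Only the wait_to_complete == 0 shortcut mirrors the quoted snippet, and
the counter stands in for mailbox reads that would interrupt the PF.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for VF link state; not the real ixgbe structures. */
struct vf_link_model {
	bool link_up_in_reg;    /* what the VF's own link register reports */
	bool get_link_status;   /* true: a full link check is still pending */
	unsigned int mbx_reads; /* mailbox reads issued so far */
};

/* Stand-in for mbx->ops.read(): every call would interrupt the PF. */
static int model_mbx_read(struct vf_link_model *vf)
{
	vf->mbx_reads++;
	return 0; /* 0 means success in this model */
}

/*
 * Simplified model of the quoted logic: with wait_to_complete == 0 the
 * check relies on the VF register alone and returns before the mailbox
 * is touched. Returns true when the link is considered up.
 */
static bool model_check_link(struct vf_link_model *vf, int wait_to_complete)
{
	bool no_pflink_check = (wait_to_complete == 0);

	if (!vf->link_up_in_reg) {
		vf->get_link_status = true;
		return false;
	}

	if (no_pflink_check) {
		vf->get_link_status = false;
		return true;
	}

	/* Slow path only: confirm with the PF over the mailbox. */
	if (model_mbx_read(vf) != 0)
		return false;

	vf->get_link_status = false;
	return true;
}
```

In this model, 100 calls with wait_to_complete = 0 leave mbx_reads at 0,
while 100 calls with wait_to_complete = 1 raise it to 100, which is the
difference Wei describes between rte_eth_link_get_nowait() and
rte_eth_link_get().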
>>
>> With these changes, the interrupt rates go down significantly. Here's some
>> testing results:
>>
>> Without the patch:
>>
>> $ egrep 'CPU|ens1f' /proc/interrupts ; sleep 10; egrep 'CPU|ens1f' /proc/interrupts
>>      CPU0  CPU1  CPU2  CPU3  CPU4  CPU5  CPU6  CPU7  CPU8  CPU9 CPU10 CPU11 CPU12 CPU13 CPU14 CPU15
>> 34:    88     0     0     0     0    41    30   509     0     0   350    24    88   114   461   562  PCI-MSI 1572864-edge  ens1f0-TxRx-0
>> 35:    49    24     0     0    65   130    64    29    67     0    10     0     0    46    38   764  PCI-MSI 1572865-edge  ens1f0-TxRx-1
>> 36:    53     0     0    64    15    85   132    71   108     0    30     0   165   215   303   104  PCI-MSI 1572866-edge  ens1f0-TxRx-2
>> 37:    46   196     0     0    10    48    62    68    51     0     0     0   103    82    54   192  PCI-MSI 1572867-edge  ens1f0-TxRx-3
>> 38:   226     0     0     0   159   145   749   265     0     0   202     0 69229   166   450     0  PCI-MSI 1572868-edge  ens1f0
>> 52:    95   896     0     0     0    18    53     0   494     0     0     0     0   265    79   124  PCI-MSI 1574912-edge  ens1f1-TxRx-0
>> 53:    50     0    18     0    72    33     0   168   330     0     0     0   141    22    12    65  PCI-MSI 1574913-edge  ens1f1-TxRx-1
>> 54:    65     0     0     0   239   104   166    49   442     0     0     0   126    26   307     0  PCI-MSI 1574914-edge  ens1f1-TxRx-2
>> 55:    57     0     0     0   123    35    83    54   157   106     0     0    26    29   312    97  PCI-MSI 1574915-edge  ens1f1-TxRx-3
>> 56:   232     0 13910     0    16    21     0 54422     0     0     0    24    25     0    78     0  PCI-MSI 1574916-edge  ens1f1
>>      CPU0  CPU1  CPU2  CPU3  CPU4  CPU5  CPU6  CPU7  CPU8  CPU9 CPU10 CPU11 CPU12 CPU13 CPU14 CPU15
>> 34:    88     0     0     0     0    41    30   509     0     0   350    24    88   119   461   562  PCI-MSI 1572864-edge  ens1f0-TxRx-0
>> 35:    49    24     0     0    65   130    64    29    67     0    10     0     0    46    38   771  PCI-MSI 1572865-edge  ens1f0-TxRx-1
>> 36:    53     0     0    64    15    85   132    71   108     0    30     0   165   215   303   113  PCI-MSI 1572866-edge  ens1f0-TxRx-2
>> 37:    46   196     0     0    10    48    62    68    56     0     0     0   103    82    54   192  PCI-MSI 1572867-edge  ens1f0-TxRx-3
>> 38:   226     0     0     0   159   145   749   265     0     0   202     0 71281   166   450     0  PCI-MSI 1572868-edge  ens1f0
>> 52:    95   896     0     0     0    18    53     0   494     0     0     0     0   265    79   133  PCI-MSI 1574912-edge  ens1f1-TxRx-0
>> 53:    50     0    18     0    72    33     0   173   330     0     0     0   141    22    12    65  PCI-MSI 1574913-edge  ens1f1-TxRx-1
>> 54:    65     0     0     0   239   104   166    49   442     0     0     0   126    26   312     0  PCI-MSI 1574914-edge  ens1f1-TxRx-2
>> 55:    57     0     0     0   123    35    83    59   157   106     0     0    26    29   312    97  PCI-MSI 1574915-edge  ens1f1-TxRx-3
>> 56:   232     0 15910     0    16    21     0 54422     0     0     0    24    25     0    78     0  PCI-MSI 1574916-edge  ens1f1
>>
>> During the 10s interval, CPU2 jumped by 2000 interrupts, CPU12 by 2051
>> interrupts, for about 200 interrupts/second. That's on the order of what we
>> expect. I would have guessed 100/s but perhaps there are two mailbox
>> messages.
>>
>> With the patch:
>>
>> $ egrep 'CPU|ens1f' /proc/interrupts ; sleep 10; egrep 'CPU|ens1f' /proc/interrupts
>>      CPU0  CPU1  CPU2  CPU3  CPU4  CPU5  CPU6  CPU7  CPU8  CPU9 CPU10 CPU11 CPU12 CPU13 CPU14 CPU15
>> 34:    88     0     0     0     0    25    19   177     0     0   350    24    88   100   362   559  PCI-MSI 1572864-edge  ens1f0-TxRx-0
>> 35:    49    19     0     0    65   130    64    29    67     0    10     0     0    46    38   543  PCI-MSI 1572865-edge  ens1f0-TxRx-1
>> 36:    53     0     0    64    15    53    85    71   108     0    24     0    85   215   292    31  PCI-MSI 1572866-edge  ens1f0-TxRx-2
>> 37:    46   196     0     0    10    43    57    39    19     0     0     0    78    69    49   149  PCI-MSI 1572867-edge  ens1f0-TxRx-3
>> 38:   226     0     0     0   159   145   749   247     0     0   202     0 58250     0   450     0  PCI-MSI 1572868-edge  ens1f0
>> 52:    95   896     0     0     0    18    53     0   189     0     0     0     0   265    79    25  PCI-MSI 1574912-edge  ens1f1-TxRx-0
>> 53:    50     0    18     0    72    33     0    90   330     0     0     0   136     5    12     0  PCI-MSI 1574913-edge  ens1f1-TxRx-1
>> 54:    65     0     0     0    10   104   166    49   442     0     0     0   126    26   226     0  PCI-MSI 1574914-edge  ens1f1-TxRx-2
>> 55:    57     0     0     0    61    35    83    30   157   101     0     0    26    15   312     0  PCI-MSI 1574915-edge  ens1f1-TxRx-3
>> 56:   232     0  2062     0    16    21     0 54422     0     0     0    24    25     0    78     0  PCI-MSI 1574916-edge  ens1f1
>>      CPU0  CPU1  CPU2  CPU3  CPU4  CPU5  CPU6  CPU7  CPU8  CPU9 CPU10 CPU11 CPU12 CPU13 CPU14 CPU15
>> 34:    88     0     0     0     0    25    19   177     0     0   350    24    88   102   362   562  PCI-MSI 1572864-edge  ens1f0-TxRx-0
>> 35:    49    19     0     0    65   130    64    29    67     0    10     0     0    46    38   548  PCI-MSI 1572865-edge  ens1f0-TxRx-1
>> 36:    53     0     0    64    15    53    85    71   108     0    24     0    85   215   292    36  PCI-MSI 1572866-edge  ens1f0-TxRx-2
>> 37:    46   196     0     0    10    45    57    39    19     0     0     0    78    69    49   152  PCI-MSI 1572867-edge  ens1f0-TxRx-3
>> 38:   226     0     0     0   159   145   749   247     0     0   202     0 58259     0   450     0  PCI-MSI 1572868-edge  ens1f0
>> 52:    95   896     0     0     0    18    53     0   194     0     0     0     0   265    79    25  PCI-MSI 1574912-edge  ens1f1-TxRx-0
>> 53:    50     0    18     0    72    33     0    95   330     0     0     0   136     5    12     0  PCI-MSI 1574913-edge  ens1f1-TxRx-1
>> 54:    65     0     0     0    10   104   166    49   442     0     0     0   126    26   231     0  PCI-MSI 1574914-edge  ens1f1-TxRx-2
>> 55:    57     0     0     0    66    35    83    30   157   101     0     0    26    15   312     0  PCI-MSI 1574915-edge  ens1f1-TxRx-3
>> 56:   232     0  2071     0    16    21     0 54422     0     0     0    24    25     0    78     0  PCI-MSI 1574916-edge  ens1f1
>>
>> Note the interrupt rate has gone way down. During the 10s interval, we only
>> saw a handful of interrupts.
>>
>> Note that this patch was originally provided by Intel directly to AT&T and
>> Vyatta, but unfortunately I am unable to find records of the exact author.
>>
>> We have been using this in production for more than a year.
>>
>> Fixes: af75078fece3 ("first public release")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Luca Boccassi
>> ---
>>  drivers/net/ixgbe/base/ixgbe_vf.c | 33 ++++++++++++++++---------------
>>  1 file changed, 17 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/net/ixgbe/base/ixgbe_vf.c b/drivers/net/ixgbe/base/ixgbe_vf.c
>> index 5b25a6b4d4..16086670b1 100644
>> --- a/drivers/net/ixgbe/base/ixgbe_vf.c
>> +++ b/drivers/net/ixgbe/base/ixgbe_vf.c
>> @@ -586,7 +586,6 @@ s32 ixgbe_check_mac_link_vf(struct ixgbe_hw *hw, ixgbe_link_speed *speed,
>>  	s32 ret_val = IXGBE_SUCCESS;
>>  	u32 links_reg;
>>  	u32 in_msg = 0;
>> -	UNREFERENCED_1PARAMETER(autoneg_wait_to_complete);
>>
>>  	/* If we were hit with a reset drop the link */
>>  	if (!mbx->ops.check_for_rst(hw, 0) || !mbx->timeout)
>> @@ -643,23 +642,25 @@ s32 ixgbe_check_mac_link_vf(struct ixgbe_hw *hw, ixgbe_link_speed *speed,
>>  		*speed = IXGBE_LINK_SPEED_UNKNOWN;
>>  	}
>>
>> -	/* if the read failed it could just be a mailbox collision, best wait
>> -	 * until we are called again and don't report an error
>> -	 */
>> -	if (mbx->ops.read(hw, &in_msg, 1, 0))
>> -		goto out;
>> +	if (autoneg_wait_to_complete) {
>> +		/* if the read failed it could just be a mailbox collision, best wait
>> +		 * until we are called again and don't report an error
>> +		 */
>> +		if (mbx->ops.read(hw, &in_msg, 1, 0))
>> +			goto out;
>>
>> -	if (!(in_msg & IXGBE_VT_MSGTYPE_CTS)) {
>> -		/* msg is not CTS and is NACK we must have lost CTS status */
>> -		if (in_msg & IXGBE_VT_MSGTYPE_NACK)
>> +		if (!(in_msg & IXGBE_VT_MSGTYPE_CTS)) {
>> +			/* msg is not CTS and is NACK we must have lost CTS status */
>> +			if (in_msg & IXGBE_VT_MSGTYPE_NACK)
>> +				ret_val = -1;
>> +			goto out;
>> +		}
>> +
>> +		/* the pf is talking, if we timed out in the past we reinit */
>> +		if (!mbx->timeout) {
>>  			ret_val = -1;
>> -		goto out;
>> -	}
>> -
>> -	/* the pf is talking, if we timed out in the past we reinit */
>> -	if (!mbx->timeout) {
>> -		ret_val = -1;
>> -		goto out;
>> +			goto out;
>> +		}
>>  	}
>>
>>  	/* if we passed all the tests above then the link is up and we no
>> --
>> 2.18.0
>
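
For reference, the per-second rates implied by the counters quoted above
can be worked out as below. This is only a small arithmetic sketch; the
helper names irq_delta() and irq_rate() are made up here, and the inputs
are the ens1f1/CPU2 and ens1f0/CPU12 values from the two pairs of
/proc/interrupts samples taken 10 seconds apart.

```c
#include <assert.h>
#include <stdint.h>

/* Difference between two samples of one /proc/interrupts counter. */
static uint64_t irq_delta(uint64_t before, uint64_t after)
{
	assert(after >= before); /* these counters only increase */
	return after - before;
}

/* Average interrupts per second over the sampling interval. */
static double irq_rate(uint64_t before, uint64_t after, unsigned int interval_s)
{
	assert(interval_s > 0);
	return (double)irq_delta(before, after) / interval_s;
}
```

Without the patch, irq_delta(13910, 15910) is 2000 and
irq_delta(69229, 71281) is 2052 (the quoted text says 2051), so roughly
200 to 205 interrupts/second. With the patch, irq_delta(2062, 2071) and
irq_delta(58250, 58259) are both 9, or about 0.9 interrupts/second.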