From: "Sanford, Robert"
To: Bruce Richardson, Stephen Hemminger, "dev@dpdk.org"
Subject: Re: [dpdk-dev] IXGBE RX packet loss with 5+ cores
Date: Tue, 13 Oct 2015 14:47:36 +0000
In-Reply-To: <20151013135955.GA31844@bricha3-MOBL3>
References: <20151012221830.6f5f42af@xeon-e3> <20151013135955.GA31844@bricha3-MOBL3>
>>> [Robert:]
>>> 1. The 82599 device supports up to 128 queues. Why do we see trouble
>>> with as few as 5 queues? What could limit the system (and one port
>>> controlled by 5+ cores) from receiving at line rate without loss?
>>>
>>> 2. As far as we can tell, the RX path only touches the device
>>> registers when it updates a Receive Descriptor Tail register (RDT[n]),
>>> roughly every rx_free_thresh packets. Is there a big difference
>>> between one core doing this and N cores doing it 1/N as often?

>> [Stephen:]
>> As you add cores, there is more traffic on the PCI bus from each core
>> polling. There is a fixed number of PCI bus transactions per second
>> possible. Each core is increasing the number of useless (empty)
>> transactions.

> [Bruce:]
> The polling for packets by the core should not be using PCI bandwidth
> directly, as the ixgbe driver (and other drivers) check for the DD bit
> being set on the descriptor in memory/cache.

I was preparing to reply with the same point.

>> [Stephen:]
>> Why do you think adding more cores will help?

We're using run-to-completion and sometimes spend too many cycles per
packet. We realize that we need to move to an io+workers model, but
wanted a better understanding of the dynamics involved here.
> [Bruce:]
> However, using an increased number of queues can use PCI bandwidth in
> other ways; for instance, with more queues you reduce the amount of
> descriptor coalescing that can be done by the NICs, so that instead of
> having a single transaction of 4 descriptors to one queue, the NIC may
> instead have to do 4 transactions, each writing 1 descriptor to 4
> different queues. This is possibly why sending all traffic to a single
> queue works ok - the polling on the other queues is still being done,
> but has little effect.

Brilliant! This idea did not occur to me.

--
Thanks guys,
Robert