Date: Mon, 12 Oct 2015 22:18:30 -0700
From: Stephen Hemminger
To: "Sanford, Robert"
Cc: "dev@dpdk.org"
Subject: Re: [dpdk-dev] IXGBE RX packet loss with 5+ cores
Message-ID: <20151012221830.6f5f42af@xeon-e3>

On Tue, 13 Oct 2015 02:57:46 +0000
"Sanford, Robert" wrote:

> I'm hoping that someone (perhaps at Intel) can help us understand
> an IXGBE RX packet loss issue we're able to reproduce with testpmd.
>
> We run testpmd with various numbers of cores. We offer line-rate
> traffic (~14.88 Mpps) to one ethernet port, and forward all received
> packets via the second port.
>
> When we configure 1, 2, 3, or 4 cores (per port, with same number RX
> queues per port), there is no RX packet loss. When we configure 5 or
> more cores, we observe the following packet loss (approximate):
> 5 cores - 3% loss
> 6 cores - 7% loss
> 7 cores - 11% loss
> 8 cores - 15% loss
> 9 cores - 18% loss
>
> All of the "lost" packets are accounted for in the device's Rx Missed
> Packets Count register (RXMPC[0]). Quoting the datasheet:
> "Packets are missed when the receive FIFO has insufficient space to
> store the incoming packet. This might be caused due to insufficient
> buffers allocated, or because there is insufficient bandwidth on the
> IO bus."
>
> RXMPC, and our use of API rx_descriptor_done to verify that we don't
> run out of mbufs (discussed below), lead us to theorize that packet
> loss occurs because the device is unable to DMA all packets from its
> internal packet buffer (512 KB, reported by register RXPBSIZE[0])
> before overrun.
>
> Questions
> =========
> 1. The 82599 device supports up to 128 queues. Why do we see trouble
> with as few as 5 queues? What could limit the system (and one port
> controlled by 5+ cores) from receiving at line-rate without loss?
>
> 2. As far as we can tell, the RX path only touches the device
> registers when it updates a Receive Descriptor Tail register (RDT[n]),
> roughly every rx_free_thresh packets. Is there a big difference
> between one core doing this and N cores doing it 1/N as often?
>
> 3. Do CPU reads/writes from/to device registers have a higher priority
> than device reads/writes from/to memory? Could the former transactions
> (CPU <-> device) significantly impede the latter (device <-> RAM)?
>
> Thanks in advance for any help you can provide.

As you add cores, there is more traffic on the PCI bus from each core
polling. There is a fixed number of PCI bus transactions per second
possible. Each core is increasing the number of useless (empty)
transactions. Why do you think adding more cores will help?
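
For context, here is a minimal sketch of the kind of per-core RX/forwarding
loop and miss-counter check being discussed. It is illustrative only: the
1:1 lcore-to-queue mapping, BURST_SIZE, the port numbers (0 -> 1), and the
helper names rx_fwd_loop/port_missed are assumptions, not testpmd's actual
code. As far as I can tell, the imissed field filled in by
rte_eth_stats_get() on ixgbe is accumulated from the same RXMPC registers
mentioned above.

#include <stdint.h>
#include <string.h>

#include <rte_common.h>
#include <rte_ethdev.h>
#include <rte_lcore.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32  /* assumed burst size for illustration */

/*
 * Hypothetical per-lcore loop: lcore N polls RX queue N on port 0 and
 * forwards to queue N on port 1.  Every rte_eth_rx_burst() that returns 0
 * is one of the empty polls mentioned above; the PMD only writes the RDT
 * tail register after replenishing roughly rx_free_thresh descriptors.
 */
static int
rx_fwd_loop(void *arg __rte_unused)
{
	uint16_t queue = (uint16_t)rte_lcore_id(); /* assumed lcore->queue map */
	struct rte_mbuf *bufs[BURST_SIZE];
	uint16_t nb_rx, nb_tx;

	for (;;) {
		nb_rx = rte_eth_rx_burst(0, queue, bufs, BURST_SIZE);
		if (nb_rx == 0)
			continue;              /* empty poll, nothing received */
		nb_tx = rte_eth_tx_burst(1, queue, bufs, nb_rx);
		while (nb_tx < nb_rx)          /* free what TX would not take */
			rte_pktmbuf_free(bufs[nb_tx++]);
	}
	return 0;
}

/* Read the per-port missed counter (backed by RXMPC on ixgbe). */
static uint64_t
port_missed(uint16_t port)
{
	struct rte_eth_stats stats;

	memset(&stats, 0, sizeof(stats));
	rte_eth_stats_get(port, &stats);
	return stats.imissed;
}

Each such loop would be started on a worker lcore with something like
rte_eal_remote_launch(rx_fwd_loop, NULL, lcore_id), one lcore per RX queue,
which matches the 1-to-9 core configurations described in the report.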