Date: Thu, 26 Jul 2018 16:09:20 -0400
From: Mark Mason
To: users@dpdk.org
Message-ID: <20180726200920.GA18059@postdiluvian.org>
Subject: [dpdk-users] Mbuf pool/ring size question

Hi all,

I've got a question about mbuf pool and ring sizes - DPDK 17.02 PMD.

I've got a pipelined application running with RSS on a Cavium CN83XX: 40GbE, 4 RSS queues wide and a pipeline 3 deep, with isolcpus so that only DPDK runs on each of the 12 worker cores. There are two RTE SP/SC rings per RSS queue for communication between the pipeline stages - the rings are 1024 deep, the per-lcore mempool cache is 512, and the mbuf pool holds 16K-1 mbufs.

Performance is generally good - 40G in and 40G out with 1M flows of 512-byte packets - EXCEPT for intermittent drops on the order of a few dozen to a few hundred packets/second.

I did some timing measurements and found that a packet can sometimes take much longer to get through the pipeline - two to three orders of magnitude longer - despite being identical (except for destination address) and taking an identical(ish) code path. I tried measuring where the extra time was going, but pretty much everything I tried perturbed the system, so I wasn't easily able to get a clear answer.

One of my suspicions is the per-lcore mbuf cache flush/fill, since the rx and tx are being done by different cores. Is there a more efficient way to manage the mbuf pool in this case than rte_pktmbuf_pool_create? Some cores don't allocate or free mbufs at all, so I'm also curious whether I'm losing mbufs to the caches on those cores.

Since I have memory to burn, I figured I could absorb any glitches by increasing the RX/TX descriptor ring sizes, the mbuf pool, and the inter-stage ring sizes, allowing more packets to be buffered during the glitches. This didn't help, which I guess makes sense if my issue is lock contention on the mbuf cache, which I can't make any larger.

Almost all of the DPDK examples and applications I could find use roughly the same parameters - 128-512 RX/TX descriptors, a 4-16K mbuf pool, 1K ring sizes, etc. There seem to be diminishing returns for increasing much beyond these values - why is that?
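
For reference, this is roughly how the pool and the inter-stage rings are set up - a simplified sketch, not the actual code, and the names are placeholders:

    #include <stdio.h>
    #include <rte_lcore.h>
    #include <rte_mbuf.h>
    #include <rte_mempool.h>
    #include <rte_ring.h>

    #define NB_MBUF    ((16 * 1024) - 1)  /* 16K-1 mbufs */
    #define MBUF_CACHE 512                /* per-lcore cache, already at RTE_MEMPOOL_CACHE_MAX_SIZE */
    #define RING_SIZE  1024               /* inter-stage ring depth */

    static struct rte_mempool *pktmbuf_pool;

    /* Single pool shared by all RSS queues and pipeline stages. */
    static int setup_pool(void)
    {
        pktmbuf_pool = rte_pktmbuf_pool_create("pktmbuf_pool", NB_MBUF,
                                               MBUF_CACHE, 0,
                                               RTE_MBUF_DEFAULT_BUF_SIZE,
                                               rte_socket_id());
        return pktmbuf_pool ? 0 : -1;
    }

    /* Two SP/SC rings per RSS queue, one between each pair of pipeline stages. */
    static int setup_queue_rings(unsigned int q)
    {
        char name[RTE_RING_NAMESIZE];
        struct rte_ring *r1, *r2;

        snprintf(name, sizeof(name), "q%u_s1_to_s2", q);
        r1 = rte_ring_create(name, RING_SIZE, rte_socket_id(),
                             RING_F_SP_ENQ | RING_F_SC_DEQ);
        snprintf(name, sizeof(name), "q%u_s2_to_s3", q);
        r2 = rte_ring_create(name, RING_SIZE, rte_socket_id(),
                             RING_F_SP_ENQ | RING_F_SC_DEQ);
        return (r1 && r2) ? 0 : -1;
    }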
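And in case it matters, this is roughly the check I've been using to see whether mbufs are sitting in the per-lcore caches of cores that never allocate or free - again a sketch with placeholder names:

    #include <stdio.h>
    #include <rte_mempool.h>

    /* Rough check on where the mbufs are: rte_mempool_avail_count() counts
     * free objects including those sitting in per-lcore caches, while the
     * full dump breaks out common_pool_count and each lcore's cache_count
     * separately. */
    static void dump_pool_usage(struct rte_mempool *mp)
    {
        unsigned int avail  = rte_mempool_avail_count(mp);  /* free, incl. caches */
        unsigned int in_use = rte_mempool_in_use_count(mp); /* allocated */

        printf("pool %s: size=%u avail=%u in_use=%u\n",
               mp->name, mp->size, avail, in_use);

        rte_mempool_dump(stdout, mp);
    }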