From: Take Ceara
Date: Thu, 16 Jun 2016 18:20:35 +0200
To: "Wiles, Keith"
Cc: "dev@dpdk.org"
Subject: Re: [dpdk-dev] Performance hit - NICs on different CPU sockets

On Thu, Jun 16, 2016 at 5:29 PM, Wiles, Keith wrote:
>
> Right now I do not know what the issue is with the system. Could be too
> many Rx/Tx ring pairs per port and limiting the memory in the NICs, which
> is why you get better performance when you have 8 core per port. I am not
> really seeing the whole picture and how DPDK is configured to help more.
> Sorry.

I doubt that there is a limitation wrt running 16 cores per port vs 8
cores per port as I've tried with two different machines connected
back to back each with one X710 port and 16 cores on each of them
running on that port. In that case our performance doubled as
expected.

>
> Maybe seeing the DPDK command line would help.

The command line I use with ports 01:00.3 and 81:00.3 is:

./warp17 -c 0xFFFFFFFFF3 -m 32768 -w 0000:81:00.3 -w 0000:01:00.3 -- --qmap 0.0x003FF003F0 --qmap 1.0x0FC00FFC00

Our own qmap args allow the user to control exactly how cores are
split between ports.
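
As a sanity check, the masks on that command line can be decoded with a
few lines of Python. This is only a sketch and not WARP17 code: it
assumes each --qmap value has the form <port>.<hex core mask> with bit N
standing for core N (inferred from the command line, not taken from the
WARP17 sources), and the helper names cores_in_mask and parse_qmap are
made up for illustration.

```python
def cores_in_mask(mask):
    """Return the indices of the bits set in mask, lowest first."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def parse_qmap(arg):
    """Split an assumed '<port>.<hex core mask>' string into (port, cores)."""
    port, mask = arg.split(".", 1)
    return int(port), cores_in_mask(int(mask, 16))


# Cores handed to the EAL with -c 0xFFFFFFFFF3.
eal_cores = set(cores_in_mask(0xFFFFFFFFF3))

for arg in ("0.0x003FF003F0", "1.0x0FC00FFC00"):
    port, cores = parse_qmap(arg)
    # Every core assigned to a port should also be enabled in the EAL mask.
    missing = sorted(set(cores) - eal_cores)
    print("port %d: %d cores %s, not in EAL mask: %s"
          % (port, len(cores), cores, missing))
```

Under those assumptions the two masks decode to 16 cores each, with
cores 4-9 and 20-29 on port 0 and cores 10-19 and 30-35 on port 1, all
of which are enabled in the -c mask.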
In this case we end up with:

warp17> show port map
Port 0[socket: 0]:
   Core 4[socket:0] (Tx: 0, Rx: 0)
   Core 5[socket:0] (Tx: 1, Rx: 1)
   Core 6[socket:0] (Tx: 2, Rx: 2)
   Core 7[socket:0] (Tx: 3, Rx: 3)
   Core 8[socket:0] (Tx: 4, Rx: 4)
   Core 9[socket:0] (Tx: 5, Rx: 5)
   Core 20[socket:0] (Tx: 6, Rx: 6)
   Core 21[socket:0] (Tx: 7, Rx: 7)
   Core 22[socket:0] (Tx: 8, Rx: 8)
   Core 23[socket:0] (Tx: 9, Rx: 9)
   Core 24[socket:0] (Tx: 10, Rx: 10)
   Core 25[socket:0] (Tx: 11, Rx: 11)
   Core 26[socket:0] (Tx: 12, Rx: 12)
   Core 27[socket:0] (Tx: 13, Rx: 13)
   Core 28[socket:0] (Tx: 14, Rx: 14)
   Core 29[socket:0] (Tx: 15, Rx: 15)

Port 1[socket: 1]:
   Core 10[socket:1] (Tx: 0, Rx: 0)
   Core 11[socket:1] (Tx: 1, Rx: 1)
   Core 12[socket:1] (Tx: 2, Rx: 2)
   Core 13[socket:1] (Tx: 3, Rx: 3)
   Core 14[socket:1] (Tx: 4, Rx: 4)
   Core 15[socket:1] (Tx: 5, Rx: 5)
   Core 16[socket:1] (Tx: 6, Rx: 6)
   Core 17[socket:1] (Tx: 7, Rx: 7)
   Core 18[socket:1] (Tx: 8, Rx: 8)
   Core 19[socket:1] (Tx: 9, Rx: 9)
   Core 30[socket:1] (Tx: 10, Rx: 10)
   Core 31[socket:1] (Tx: 11, Rx: 11)
   Core 32[socket:1] (Tx: 12, Rx: 12)
   Core 33[socket:1] (Tx: 13, Rx: 13)
   Core 34[socket:1] (Tx: 14, Rx: 14)
   Core 35[socket:1] (Tx: 15, Rx: 15)

Just for reference, the cpu_layout script shows:

$ $RTE_SDK/tools/cpu_layout.py
============================================================
Core and Socket Information (as reported by '/proc/cpuinfo')
============================================================

cores = [0, 1, 2, 3, 4, 8, 9, 10, 11, 12]
sockets = [0, 1]

         Socket 0    Socket 1
         --------    --------
Core 0   [0, 20]     [10, 30]
Core 1   [1, 21]     [11, 31]
Core 2   [2, 22]     [12, 32]
Core 3   [3, 23]     [13, 33]
Core 4   [4, 24]     [14, 34]
Core 8   [5, 25]     [15, 35]
Core 9   [6, 26]     [16, 36]
Core 10  [7, 27]     [17, 37]
Core 11  [8, 28]     [18, 38]
Core 12  [9, 29]     [19, 39]

I know it might be complicated to figure out exactly what's happening
in our setup with our own code so please let me know if you need
additional information.

I appreciate the help!

Thanks,
Dumitru