From: Alexander Kiselev <kiselev99@gmail.com>
To: "Hu, Xuekun"
Cc: users@dpdk.org
Date: Fri, 15 Apr 2016 01:47:05 +0300
Subject: Re: [dpdk-users] Lcore impact

Yes, 31% is iTLB-load-misses. My CPU is an Intel(R) Core(TM) i5-2400 CPU @
3.10GHz.

There are no other big differences. I would say there are no other small
differences either; the numbers are about the same.

I also noticed another strange thing: once I start socket operations, the
perf context-switches counter increases in ALL the sibling threads
corresponding to DPDK lcores. But why? Only one thread performs socket
operations and invokes system calls, so I expected context switches to
occur only in that thread, not in all of them.

2016-04-15 1:09 GMT+03:00 Hu, Xuekun:

> I think 31.09% means iTLB-load-misses, right? To be straightforward, yes,
> this count means code misses are high, i.e. the code footprint is big. For
> example, function A calls function B while the code address of B is far
> away from A.
>
> Is there any other big difference, like L2/L3 cache misses? Actually I
> don't expect the iTLB-load-misses could have that big an impact (10%
> packet loss).
>
> BTW, what's your CPU?
>
> From: Alexander Kiselev [mailto:kiselev99@gmail.com]
> Sent: Friday, April 15, 2016 5:33 AM
> To: Hu, Xuekun
> Cc: users@dpdk.org
> Subject: Re: [dpdk-users] Lcore impact
>
> I've done my homework with perf and the results show that the
> iTLB-load-misses value is very high. In the tests without socket
> operations the processing lcore has 0.87% of all iTLB cache hits and there
> is no packet loss. In the test WITH socket operations the processing lcore
> has 31.09% of all iTLB cache hits and there is about 10% packet loss. How
> should I interpret these results? Google shows little about the iTLB. So
> far some web pages suggest the following:
>
> "Try to minimize the size of the code and improve locality so that
> instructions span a minimum number of pages, and so that the instruction
> span is less than the number of ITLB entries."
>
> Any ideas?
>
> 2016-04-14 23:43 GMT+03:00 Hu, Xuekun:
>
> > Perf could. Or PCM, that is also a good tool.
> > https://software.intel.com/en-us/articles/intel-performance-counter-monitor-a-better-way-to-measure-cpu-utilization
> >
> > From: Alexander Kiselev [mailto:kiselev99@gmail.com]
> > Sent: Friday, April 15, 2016 3:31 AM
> > To: Hu, Xuekun
> > Cc: Shawn Lewis; users@dpdk.org
> > Subject: Re: [dpdk-users] Lcore impact
> >
> > 2016-04-14 20:49 GMT+03:00 Hu, Xuekun:
> >
> > > Are the two lcores on one processor, or on two processors? What is the
> > > memory footprint of the system-call thread? If the memory footprint is
> > > big (> LLC size) and the two lcores are on the same processor, then it
> > > could have an impact on the packet-processing thread.
> >
> > Those two lcores belong to one processor, and it's a single-processor
> > machine.
>
> Both cores allocate a lot of memory and use the full DPDK arsenal: LPM,
> mempools, hashes, etc. But during the test the core doing the socket data
> transfer uses only a small 16K buffer for sending, and sending is all it
> does during the test. It doesn't touch any of the other allocated memory
> structures. The processing core in turn uses rte_lpm, which is big, but in
> my test there are only about 10 routes in it, so I think the amount of
> "hot" memory is not very big. But I can't say whether it is bigger than
> the CPU's L3 cache or not. Should I use a profiler to see if the socket
> operations cause a lot of cache misses in the processing lcore? Is there
> some tool that allows me to do that? perf maybe?
>
> -----Original Message-----
> From: users [mailto:users-bounces@dpdk.org] On Behalf Of Alexander Kiselev
> Sent: Friday, April 15, 2016 1:19 AM
> To: Shawn Lewis
> Cc: users@dpdk.org
> Subject: Re: [dpdk-users] Lcore impact
>
> I've already seen this document and have used these tricks many times.
> But this time I am sending data locally over localhost. There are not
> even any NICs bound to Linux on my machine, therefore there are no NIC
> interrupts I could pin to a CPU. So what do you propose?
>
> > On 14 Apr 2016, at 20:06, Shawn Lewis wrote:
> >
> > You have to work with IRQBalancer as well.
> >
> > http://www.intel.com/content/dam/doc/application-note/82575-82576-82598-82599-ethernet-controllers-interrupts-appl-note.pdf
> >
> > This is just an example document which discusses this (not so much DPDK
> > related)... But the OS will attempt to balance the interrupts when you
> > actually want to remove or pin them down...
> >
> >> On Thu, Apr 14, 2016 at 1:02 PM, Alexander Kiselev wrote:
> >>
> >>> On 14 Apr 2016, at 19:35, Shawn Lewis wrote:
> >>>
> >>> Lots of things...
> >>> One just because you have a process running on an lcore does not mean
> >>> that's all that runs on it. Unless you have told the kernel at boot
> >>> NOT to use those specific cores, those cores will be used for many
> >>> OS-related things.
> >>
> >> Generally yes, but unless I start sending data to the socket there is
> >> no packet loss. I did about 10 test runs in a row and everything was
> >> OK. And there is no other application running on that test machine
> >> that uses the CPU cores.
> >>
> >> So the question is: why do these socket operations influence the other
> >> lcore?
> >>
> >>> IRQBalance
> >>> System OS operations.
> >>> Other applications.
> >>>
> >>> So by doing file I/O you are generating interrupts, and where those
> >>> interrupts get serviced is up to IRQBalancer. So it could be any one
> >>> of your cores.
> >>
> >> That is a good point. I can use the CPU affinity feature to bind an
> >> interrupt handler to a core not used in my test. But I send data
> >> locally over localhost. Is it possible to use CPU affinity in that
> >> case?
> >>
> >>>> On Thu, Apr 14, 2016 at 12:31 PM, Alexander Kiselev wrote:
> >>>>
> >>>> Could someone give me any hints about what could cause performance
> >>>> issues in a situation where one lcore doing a lot of Linux system
> >>>> calls (read/write on a socket) slows down another lcore doing packet
> >>>> forwarding? In my test the forwarding lcore doesn't share any memory
> >>>> structures with the other lcore that sends test data to the socket.
> >>>> The two lcores are pinned to different processor cores, so
> >>>> theoretically they shouldn't have any impact on each other, but they
> >>>> do: once one lcore starts sending data to the socket, the other
> >>>> lcore starts dropping packets. Why?
--
Regards,
Alexander Kiselev