DPDK usage discussions
From: "Александр Киселев" <kiselev99@gmail.com>
To: "Hu, Xuekun" <xuekun.hu@intel.com>, Shawn Lewis <smlsr@tencara.com>
Cc: "users@dpdk.org" <users@dpdk.org>
Subject: Re: [dpdk-users] Lcore impact
Date: Fri, 15 Apr 2016 02:51:39 +0300	[thread overview]
Message-ID: <CAMKNYbxs6mXRVeXxKzz-4jBCuX2BT2BMDUUBeyM4aNEgdJJpAg@mail.gmail.com> (raw)
In-Reply-To: <CAMKNYby87SKfVty6q7HzdU0Y1AocWORro9CZThROVKfk+=JJqw@mail.gmail.com>

I found out the cause of the context switches and iTLB misses: I was running
the socket-reading application on the same host as the dpdk app. Sorry guys, I
was a fool :) Thank you for your help.
Shawn, you were right from the beginning; I should have thought more carefully
about other applications running on the same host.
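
For anyone who hits the same thing, a quick check would have been to list
which core every thread last ran on and, if necessary, pin the other
application away from the DPDK lcores. A minimal sketch (the core number and
pid are only placeholders):

  # show each thread and the CPU it last ran on
  ps -eLo pid,tid,psr,comm | sort -k3 -n

  # move the socket-reading app onto core 0, away from the DPDK lcores
  taskset -cp 0 <pid-of-socket-app>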

2016-04-15 1:47 GMT+03:00 Александр Киселев <kiselev99@gmail.com>:

> Yes. 31% is iTLB-load-misses. My CPU is an Intel(R) Core(TM) i5-2400 CPU @
> 3.10GHz.
> There are no other big differences. I would say there are no other small
> differences either; the numbers are about the same.
>
> I also noticed another strange thing: once I start socket operations, the
> perf context-switches counter increases in ALL the sibling threads
> corresponding to DPDK lcores. But why? Only one thread is doing socket
> operations and invoking system calls, so I expected context switches to
> occur only in that thread, not in all of them.
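>
> To see which threads actually do the switching, a per-thread view helps; a
> minimal sketch, assuming the sysstat package is installed and <pid> is the
> DPDK application's pid:
>
>   # -w prints context switches per second, -t breaks them out per thread
>   pidstat -w -t -p <pid> 1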
>
> 2016-04-15 1:09 GMT+03:00 Hu, Xuekun <xuekun.hu@intel.com>:
>
>> I think 31.09% means iTLB-load-misses, right? To be straightforward: yes,
>> that count means instruction misses are high, i.e. the code footprint is
>> big. For example, function A calls function B while the code address of B
>> is far away from A.
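>>
>> A rough way to confirm that on the processing core is to count the TLB
>> events directly with perf; a sketch, where the core number is only a
>> placeholder for the processing lcore:
>>
>>   perf stat -e iTLB-loads,iTLB-load-misses,dTLB-load-misses -C 2 -- sleep 10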
>>
>>
>>
>> Is there any other big difference, like L2/L3 cache misses? Actually I
>> don't expect the iTLB-load-misses to have that big an impact (10% packet
>> loss).
>>
>>
>>
>> BTW. What’s your CPU?
>>
>>
>>
>> *From:* Александр Киселев [mailto:kiselev99@gmail.com]
>> *Sent:* Friday, April 15, 2016 5:33 AM
>> *To:* Hu, Xuekun
>>
>> *Cc:* users@dpdk.org
>> *Subject:* Re: [dpdk-users] Lcore impact
>>
>>
>>
>> I've done my homework with perf and the results show that the
>> iTLB-load-misses value is very high. In the tests without socket operations
>> the processing lcore shows iTLB-load-misses at 0.87% of all iTLB cache hits
>> and there is no packet loss. In the test WITH socket operations it shows
>> 31.09% and there is about 10% packet loss. How should I interpret these
>> results? Google shows little about the iTLB. So far some web pages suggest
>> the following:
>>
>> "Try to minimize the size of the source code and locality so that
>> instructions span a minimum number of pages, and so that the instruction
>> span is less than the number of ITLB entries."
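>>
>> As a rough back-of-the-envelope check (assuming an ITLB with, say, 128
>> entries for 4 KB pages, which should be in the right ballpark for a desktop
>> core of this era):
>>
>>   128 entries x 4 KB/page = 512 KB of code reachable without ITLB misses
>>
>> so once the hot code touched on that core grows past a few hundred KB, ITLB
>> misses climb quickly.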
>>
>>
>>
>> Any ideas?
>>
>> 2016-04-14 23:43 GMT+03:00 Hu, Xuekun <xuekun.hu@intel.com>:
>>
>> Perf could. Or PCM, which is also a good tool:
>> https://software.intel.com/en-us/articles/intel-performance-counter-monitor-a-better-way-to-measure-cpu-utilization
>>
>>
>>
>> *From:* Александр Киселев [mailto:kiselev99@gmail.com]
>> *Sent:* Friday, April 15, 2016 3:31 AM
>> *To:* Hu, Xuekun
>> *Cc:* Shawn Lewis; users@dpdk.org
>>
>>
>> *Subject:* Re: [dpdk-users] Lcore impact
>>
>> 2016-04-14 20:49 GMT+03:00 Hu, Xuekun <xuekun.hu@intel.com>:
>>
>> Are the two lcores on one processor, or on two processors? What is the
>> memory footprint of the system-call thread? If the memory footprint is big
>> (> LLC size) and the two lcores are on the same processor, then it could
>> have an impact on the packet-processing thread.
>>
>>
>>
>> Those two lcores belong to one processor, and it's a single-processor
>> machine.
>>
>>
>>
>> Both cores allocate a lot of memory and use the full DPDK arsenal: LPM,
>> mempools, hashes, etc. But during the test the core doing the socket data
>> transfer uses only a small 16 KB buffer for sending, and sending is all it
>> does during the test; it doesn't touch any of the other allocated memory
>> structures. The processing core, in turn, uses rte_lpm, which is big, but in
>> my test there are only about 10 routes in it, so I think the amount of "hot"
>> memory is not very big. But I can't say whether it's bigger than the L3 CPU
>> cache or not. Should I use a profiler to see whether the socket operations
>> cause a lot of cache misses on the processing lcore? Is there a tool that
>> lets me do that? perf maybe?
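>>
>> For reference, a minimal perf invocation of that kind might look like the
>> following (the core number stands for the processing lcore and is only a
>> placeholder):
>>
>>   perf stat -e LLC-loads,LLC-load-misses,cache-misses -C 2 -- sleep 10
>>
>> Comparing the counts with and without the socket traffic running should
>> show whether the cache behaviour of the processing core really changes.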
>>
>> -----Original Message-----
>> From: users [mailto:users-bounces@dpdk.org] On Behalf Of Alexander
>> Kiselev
>> Sent: Friday, April 15, 2016 1:19 AM
>> To: Shawn Lewis
>> Cc: users@dpdk.org
>> Subject: Re: [dpdk-users] Lcore impact
>>
>> I've already seen this document and have used these tricks many times. But
>> this time I am sending data locally over localhost. There are not even any
>> NICs bound to Linux on my machine, so there are no NIC interrupts I could
>> pin to a CPU. So what do you propose?
>>
>> > On 14 Apr 2016, at 20:06, Shawn Lewis <smlsr@tencara.com> wrote:
>> >
>> > You have to work with IRQBalancer as well
>> >
>> >
>> http://www.intel.com/content/dam/doc/application-note/82575-82576-82598-82599-ethernet-controllers-interrupts-appl-note.pdf
>> >
>> > Is just an example document which discusses this (not so much DPDK
>> > related)...  But the OS will attempt to balance the interrupts when you
>> > actually want to remove them or pin them down...
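>> >
>> > A minimal sketch of pinning an interrupt by hand (the IRQ number and CPU
>> > mask are only placeholders; the mask is hexadecimal, so 1 means CPU 0):
>> >
>> >   # stop irqbalance so it does not rewrite the affinity masks
>> >   service irqbalance stop
>> >   # route IRQ 24 to CPU 0 only
>> >   echo 1 > /proc/irq/24/smp_affinity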
>> >
>> >> On Thu, Apr 14, 2016 at 1:02 PM, Alexander Kiselev <
>> kiselev99@gmail.com> wrote:
>> >>
>> >>
>> >>> On 14 Apr 2016, at 19:35, Shawn Lewis <smlsr@tencara.com> wrote:
>> >>>
>> >>> Lots of things...
>> >>>
>> >>> One: just because you have a process running on an lcore does not mean
>> >>> that's all that runs on it.  Unless you have told the kernel at boot to
>> >>> NOT use those specific cores, those cores will be used for many
>> >>> OS-related things.
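>> >>>
>> >>> For reference, the usual way to tell the kernel that is the isolcpus
>> >>> boot parameter (the core list is only an example):
>> >>>
>> >>>   # added to the kernel command line in the bootloader config
>> >>>   isolcpus=1,2,3
>> >>>
>> >>> after which only threads explicitly pinned to cores 1-3 run there.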
>> >>
>> >> Generally yes, but unless I start sending data to the socket there is
>> >> no packet loss.  I did about 10 test runs in a row and everything was
>> >> OK. And there is no other application running on that test machine that
>> >> uses CPU cores.
>> >>
>> >> So the question is: why do these socket operations influence the other
>> >> lcore?
>> >>
>> >>>
>> >>> IRQBalance
>> >>> System/OS operations.
>> >>> Other applications.
>> >>>
>> >>> So by doing file I/O you are generating interrupts, and where those
>> >>> interrupts get serviced is up to IRQBalancer.  So it could be any one
>> >>> of your cores.
>> >>
>> >> That is a good point. I can use the CPU affinity feature to bind the
>> >> interrupt handler to a core not used in my test. But I am sending data
>> >> locally over localhost. Is it possible to use CPU affinity in that case?
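>> >>
>> >> I guess that even with localhost traffic I could still pin the sending
>> >> process itself, e.g. (the core number and program name are only
>> >> placeholders):
>> >>
>> >>   taskset -c 0 ./socket_peer_app
>> >>
>> >> or call sched_setaffinity() from inside the program.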
>> >>
>> >>>
>> >>>
>> >>>
>> >>>> On Thu, Apr 14, 2016 at 12:31 PM, Alexander Kiselev <
>> kiselev99@gmail.com> wrote:
>> >>>> Could someone give me any hints about what could cause performance
>> >>>> issues in a situation where one lcore doing a lot of Linux system calls
>> >>>> (read/write on a socket) slows down the other lcore doing packet
>> >>>> forwarding? In my test the forwarding lcore doesn't share any memory
>> >>>> structures with the other lcore that sends test data to the socket.
>> >>>> Both lcores are pinned to different processor cores, so theoretically
>> >>>> they shouldn't have any impact on each other, but they do: once one
>> >>>> lcore starts sending data to the socket, the other lcore starts
>> >>>> dropping packets. Why?
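>> >>>>
>> >>>> For context, the lcores are pinned through the EAL core mask at
>> >>>> startup; a sketch, where the application name and mask are only
>> >>>> placeholders and 0x6 puts the two EAL threads on cores 1 and 2:
>> >>>>
>> >>>>   ./my_dpdk_app -c 0x6 -n 4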
>> >>>
>> >
>>
>>
>>
>>
>>
>> --
>>
>> Best regards,
>> Alexander Kiselev
>>
>>
>>
>>
>>
>> --
>>
>> Best regards,
>> Alexander Kiselev
>>
>
>
>
> --
> Best regards,
> Alexander Kiselev
>



-- 
Best regards,
Alexander Kiselev


Thread overview: 12+ messages
2016-04-14 16:31 Alexander Kiselev
2016-04-14 16:35 ` Shawn Lewis
2016-04-14 17:02   ` Alexander Kiselev
2016-04-14 17:06     ` Shawn Lewis
2016-04-14 17:18       ` Alexander Kiselev
2016-04-14 17:49         ` Hu, Xuekun
2016-04-14 19:31           ` Александр Киселев
2016-04-14 20:43             ` Hu, Xuekun
2016-04-14 21:32               ` Александр Киселев
2016-04-14 22:09                 ` Hu, Xuekun
2016-04-14 22:47                   ` Александр Киселев
2016-04-14 23:51                     ` Александр Киселев [this message]
