DPDK usage discussions
* [dpdk-users] movzbl in rte_eth_rx_burst
From: Dorsett, Michal @ 2017-08-19  8:45 UTC (permalink / raw)
  To: users

Hi,

We are running DPDK 16.07. Below is a snippet from a perf annotate report for a CPU running a thread that constantly reads packets.
As you can see, the hottest instruction is

movzbl 0x10(%rcx),%r15d

which, I believe, is referring to

struct rte_eth_dev *dev = &rte_eth_devices[port_id];


Can someone explain why this instruction is so costly, and how I can remedy this?

  0.66 x        lea    0xc8(%rsp),%rax
       x      _ZN8LBThread7executeEv():
       x              {
       x                  u32RetPkt = vecRXQ->at(u32Index)->receiveRawPackets(xPktArr, BURST_SIZE);
       x        movq   $0x0,0x50(%rsp)
       x      _ZNSt6vectorIP18ReceivePacketQueueSaIS1_EE2atEm():
       x        movq   $0x0,0x48(%rsp)
       x      __mempool_generic_put():
  0.09 x        mov    %rax,0x88(%rsp)
  0.42 x        mov    0x60(%rsp),%rax
       x        add    $0x18,%rax
  0.05 x        mov    %rax,0x40(%rsp)
  0.14 x        mov    0x50(%rsp),%rax
      x      _ZN8LBThread7executeEv():
  0.71 x 370:   mov    (%rdx,%rax,8),%rax
  0.57 x        mov    %rax,%rcx
  1.23 x        mov    %rax,0x80(%rsp)
       x      rte_rdtsc():
       x              }
       x      #endif
       x
       x              asm volatile("rdtsc" :
       x                           "=a" (tsc.lo_32),
       x                           "=d" (tsc.hi_32));
  0.47 x        rdtsc
       x      rte_eth_rx_burst():
       x       */
       x      static inline uint16_t
       x      rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id,
       x                       struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
       x      {
       x              struct rte_eth_dev *dev = &rte_eth_devices[port_id];
24.74 x        movzbl 0x10(%rcx),%r15d
       x      rte_rdtsc():
  0.09 x        mov    %eax,%r13d
       x      _ZN18ReceivePacketQueue17receiveRawPacketsEP6Packetj():
       x          uint64_t u64StartTick = CPUCycles::getTSCCycles();
       x          uint32_t u32PtksReceived;
       x          int32_t refcnt;
       x          int retCode;
       x
       x          u32PtksReceived = rte_eth_rx_burst(m_u8PortId, m_u16QueueIndexForNICPort, m_pArrPktsBurst, u32NumOfPkts);
  2.60 x        movzwl 0xe(%rcx),%r14d
       x      rte_rdtsc():
       x        shl    $0x20,%rdx
       x      _ZN18ReceivePacketQueue17receiveRawPacketsEP6Packetj():
       x        lea    0x18(%rcx),%r12
       x      rte_rdtsc():
  0.05 x        or     %rdx,%r13
       x      rte_eth_rx_burst():
       x                      RTE_PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id);
       x                      return 0;
       x              }
       x      #endif
       x              int16_t nb_rx = (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id],


Thanks,

Michal Dorsett
Developer, Strategic IP Group
Verint Cyber Intelligence
www.verint.com


* Re: [dpdk-users] movzbl in rte_eth_rx_burst
From: Stephen Hemminger @ 2017-08-20 17:56 UTC (permalink / raw)
  To: Dorsett, Michal; +Cc: users

On Sat, 19 Aug 2017 08:45:15 +0000
"Dorsett, Michal" <Michal.Dorsett@verint.com> wrote:

> Hi,
> 
> We are running DPDK 16.07. Below is a snippet from a perf annotate report for a CPU running a thread that constantly reads packets.
> As you can see, the hottest instruction is
> 
> movzbl 0x10(%rcx),%r15d
> 
> which, I believe, is referring to
> 
> struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> 
> 
> Can someone explain why this instruction is so costly, and how I can remedy this?

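Perf is not accurate down to the exact instruction: a sample is often attributed to an instruction shortly after the one that actually caused the delay.
Your problem is the rdtsc just before the movzbl. Reading the TSC causes a full pipeline stall on most x86 processors.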

https://archive.fosdem.org/2015/schedule/event/dpdk_performance/
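
One way to act on this, assuming the per-call timestamps are only needed for statistics, is to time one burst out of every N instead of every burst. The following is a minimal sketch, not code from the application or from DPDK: read_cycles(), timed_rx_burst(), struct rx_timing, SAMPLE_INTERVAL and fake_rx() are all hypothetical names, and fake_rx() merely stands in for the real receive call.

```c
#include <stdint.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical: time one receive call out of every 64. */
#define SAMPLE_INTERVAL 64u

/* Read a cycle counter. On x86 this compiles to RDTSC; on other
 * targets it falls back to clock() so the sketch still builds. */
static inline uint64_t read_cycles(void)
{
#if defined(__x86_64__) || defined(__i386__)
    return __builtin_ia32_rdtsc();
#else
    return (uint64_t)clock();
#endif
}

/* Counters kept per receive queue (hypothetical layout). */
struct rx_timing {
    uint64_t calls;        /* total receive calls */
    uint64_t timed_calls;  /* calls that were timed */
    uint64_t timed_cycles; /* cycles accumulated over timed calls */
};

/* Signature of the receive function being wrapped (assumption). */
typedef uint16_t (*rx_burst_fn)(void *queue, void **pkts, uint16_t nb_pkts);

/* Call rx(), reading the counter only on every SAMPLE_INTERVAL-th call. */
static inline uint16_t
timed_rx_burst(struct rx_timing *t, rx_burst_fn rx,
               void *queue, void **pkts, uint16_t nb_pkts)
{
    if ((t->calls++ % SAMPLE_INTERVAL) != 0)
        return rx(queue, pkts, nb_pkts); /* fast path: no timestamp */

    uint64_t start = read_cycles();
    uint16_t n = rx(queue, pkts, nb_pkts);
    t->timed_cycles += read_cycles() - start;
    t->timed_calls++;
    return n;
}

/* Stub standing in for the real receive call; reports a full burst. */
static uint16_t fake_rx(void *queue, void **pkts, uint16_t nb_pkts)
{
    (void)queue;
    (void)pkts;
    return nb_pkts;
}
```

The average cost per timed call is then timed_cycles / timed_calls. Whether sampling is acceptable depends on what the timestamps are used for; if every packet needs one, this does not apply.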
