From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from dpdk.org (dpdk.org [92.243.14.124])
  by dpdk.space (Postfix) with ESMTP id 0E4EFA00E6
  for ; Fri, 17 May 2019 15:04:41 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
  by dpdk.org (Postfix) with ESMTP id 58A725A44;
  Fri, 17 May 2019 15:04:40 +0200 (CEST)
Received: from mail-vk1-f194.google.com (mail-vk1-f194.google.com [209.85.221.194])
  by dpdk.org (Postfix) with ESMTP id 9F5842BCE
  for ; Fri, 17 May 2019 15:04:38 +0200 (CEST)
Received: by mail-vk1-f194.google.com with SMTP id p24so1990421vki.5
  for ; Fri, 17 May 2019 06:04:38 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
  d=1e100.net; s=20161025;
  h=x-gm-message-state:mime-version:references:in-reply-to:from:date
  :message-id:subject:to:cc;
  bh=tOC2iWCfT8Tz8DelVwmrloErmE9YhMinR5z3Pz3wpKc=;
  b=jPbOY8x+0RijYvLW/JXHSnDSPzTDgR6sNYrYi+0ct/WDYfgQfDFEz+1uSv9JY+22IA
  PwvE8HuG17Tq3yzMmEW3VAxe8A5FRRPn4pnsyA2i5GSfFmBFxb3lk/0hoTTsQpTUwc++
  BYCnCS6uIboCjimeHNbWR07pHnUSZD6bvEOFLvTbZlQqRoQHFt9TeyAMVCUvGXbEMxtN
  hCIps6uoirt3nfS6LJd5sZAnD0T4df4smbpsmHU5YfVCb0YSQ+iXChSnbQbg60O461+Q
  5RMZnLIpD987sOo48nJQPpersARuBWSC6BKzyvbbSu/m/v10gpWwCq+t3g/hEZlJ5sfy
  IbbA==
X-Gm-Message-State: APjAAAUm+wNp6DLX//1xaZltufoRsVzEOMCOUpXx8QEj5vscFw2oPsgp
  CIIzq67bShzFR2Yw/cmCiM1j1x14wB/Mc7gmdMrXtQ==
X-Google-Smtp-Source: APXvYqxxbcE60FTrdzZEjB4FnKOoRl0PRTaSC/o8yjrsYif5di/WutTG2AbyQVo13o0+dL4pdHl8ACnVDX/1GbOO/+s=
X-Received: by 2002:ac5:c219:: with SMTP id m25mr2221516vkk.53.1558098277692;
  Fri, 17 May 2019 06:04:37 -0700 (PDT)
MIME-Version: 1.0
References: <20190517122220.31283-1-maxime.coquelin@redhat.com>
In-Reply-To: <20190517122220.31283-1-maxime.coquelin@redhat.com>
From: David Marchand
Date: Fri, 17 May 2019 15:04:26 +0200
Message-ID:
To: Maxime Coquelin
Cc: dev, Tiwei Bie, Jens Freimann, Zhihong Wang, Bruce Richardson,
  "Ananyev, Konstantin"
Content-Type: text/plain; charset="UTF-8"
X-Content-Filtered-By: Mailman/MimeDel 2.1.15
Subject: Re:
 [dpdk-dev] [PATCH 0/5] vhost: I-cache pressure optimizations
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

On Fri, May 17, 2019 at 2:23 PM Maxime Coquelin wrote:

> Some OVS-DPDK PVP benchmarks show a performance drop
> when switching from DPDK v17.11 to v18.11.
>
> With the addition of packed ring layout support,
> rte_vhost_enqueue_burst and rte_vhost_dequeue_burst
> became very large, and only a part of the instructions
> are executed (either packed or split ring used).
>
> This series aims at reducing the I-cache pressure,
> first by un-inlining split and packed rings, but
> also by moving parts considered as cold in dedicated
> functions (dirty page logging, fragmented descriptors
> buffer management added for CVE-2018-1059).
>
> With the series applied, size of the enqueue and
> dequeue split paths is reduced significantly:
>
> +---------+--------------------+---------------------+
> | Version | Enqueue split path | Dequeue split path  |
> +---------+--------------------+---------------------+
> | v19.05  | 16461B             | 25521B              |
> | +series | 7286B              | 11285B              |
> +---------+--------------------+---------------------+
>
> Using perf tool to monitor iTLB-load-misses event
> while doing PVP benchmark with testpmd as vswitch,
> we can see the number of iTLB misses being reduced:
>
> - v19.05:
> # perf stat --repeat 10 -C 2,3 -e iTLB-load-miss -- sleep 10
>
>  Performance counter stats for 'CPU(s) 2,3' (10 runs):
>
>              2,438      iTLB-load-miss        ( +- 13.43% )
>
>        10.00058928 +- 0.00000336 seconds time elapsed  ( +- 0.00% )
>
> - +series:
> # perf stat --repeat 10 -C 2,3 -e iTLB-load-miss -- sleep 10
>
>  Performance counter stats for 'CPU(s) 2,3' (10 runs):
>
>                 55      iTLB-load-miss        ( +- 10.08% )
>
>        10.00059466 +- 0.00000283 seconds time elapsed  ( +- 0.00% )
>
> The series also forces the inlining of some
> rte_memcpy helpers, as by adding packed ring support, some of them
> were no longer inlined but embedded as functions in
> the virtio_net object file, which was not expected.
>
> Finally, the series simplifies the descriptor buffer
> prefetching, by doing it in the recently introduced
> descriptor buffer mapping function.
>
> Maxime Coquelin (4):
>   vhost: un-inline dirty pages logging functions
>   vhost: do not inline packed and split functions
>   vhost: do not inline unlikely fragmented buffers code
>   vhost: simplify descriptor's buffer prefetching
>
> root (1):
>   eal/x86: force inlining of all memcpy and mov helpers
>

root? "oops" :-)

--
David Marchand
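
For reference, the approach described in the quoted cover letter (keep
rarely taken code out of line, force small hot helpers inline) can be
sketched in a few lines of C. This is only an illustration and is not
taken from the series: copy_buf, copy_fragmented, copy_small and
struct copy_stats are hypothetical names, and the attributes are
GCC/Clang extensions (DPDK wraps them in macros such as __rte_noinline
and __rte_always_inline).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; this is not the vhost code. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

struct copy_stats {
    uint64_t fast; /* copies served by the inlined fast path */
    uint64_t slow; /* copies served by the out-of-line cold path */
};

/*
 * Cold path: copies in chunks of 'frag' bytes. Marked noinline so its
 * instructions are emitted once instead of being duplicated into every
 * caller, and cold so the compiler may place it away from hot code.
 */
static __attribute__((noinline, cold)) size_t
copy_fragmented(uint8_t *dst, const uint8_t *src, size_t len, size_t frag,
                struct copy_stats *st)
{
    size_t done = 0;

    while (done < len) {
        size_t chunk = (len - done < frag) ? len - done : frag;

        memcpy(dst + done, src + done, chunk);
        done += chunk;
    }
    st->slow++;
    return done;
}

/*
 * Hot helper: always_inline asks the compiler to expand it at the call
 * site even when its own heuristics would emit a separate function.
 */
static inline __attribute__((always_inline)) void
copy_small(uint8_t *dst, const uint8_t *src, size_t len)
{
    memcpy(dst, src, len);
}

/* Entry point: only the branch and the fast copy stay in the hot code. */
static size_t
copy_buf(uint8_t *dst, const uint8_t *src, size_t len, int fragmented,
         struct copy_stats *st)
{
    if (unlikely(fragmented))
        return copy_fragmented(dst, src, len, 16, st);

    copy_small(dst, src, len);
    st->fast++;
    return len;
}
```

Calling copy_buf() with fragmented == 0 takes the inlined fast path; a
non-zero value goes through the out-of-line helper, so the chunked loop
does not grow the code the common case has to fetch.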