From: Cliff Burdick
Date: Thu, 28 Feb 2019 19:07:25 -0800
To: Arvind Narayanan
Cc: users@dpdk.org
Subject: Re: [dpdk-users] rte_flow / hw-offloading is degrading performance when testing @ 100G

That's definitely interesting. Hopefully someone from Mellanox can comment
on the performance impact, since I haven't seen it qualified.

On Thu, Feb 28, 2019, 18:57 Arvind Narayanan wrote:

>
> On Thu, Feb 28, 2019, 8:23 PM Cliff Burdick wrote:
>
>> What size packets are you using? I've only steered to 2 rx queues by IP
>> dst match, and was able to hit 100Gbps. That's with a 4KB jumbo frame.
>>
>
> 64 bytes. Agreed this is small; what seems interesting is that l3fwd is
> able to handle 64B but rte_flow suffers (a lot), suggesting offloading is
> expensive?!
>
> I'm doing something similar, steering to different queues based on
> dst_ip. However, my tests have around 80 rules, each rule steering to one
> of the 20 rx_queues. I have a one-to-one rx_queue-to-core_id mapping.
>
> Arvind
>
>
>> On Thu, Feb 28, 2019, 17:42 Arvind Narayanan wrote:
>>
>>> Hi,
>>>
>>> I am using DPDK 18.11 on Ubuntu 18.04 with a Mellanox ConnectX-5 100G
>>> EN (MLNX_OFED_LINUX-4.5-1.0.1.0-ubuntu18.04-x86_64).
>>> Packet generator: t-rex 2.49 running on another machine.
>>>
>>> I am able to achieve 100G line rate with the l3fwd application (frame
>>> size 64B) using the parameters suggested in the Mellanox performance
>>> report:
>>> https://fast.dpdk.org/doc/perf/DPDK_18_11_Mellanox_NIC_performance_report.pdf
>>>
>>> However, as soon as I install rte_flow rules to steer packets to
>>> different queues and/or use rte_flow's mark action, the throughput
>>> drops to ~41G. I also modified DPDK's flow_filtering example
>>> application and am getting the same reduced throughput of around 41G
>>> out of 100G. Without rte_flow, it reaches 100G.
>>>
>>> I didn't change any OS/kernel parameters to test l3fwd or the
>>> application that uses rte_flow. I also made sure the application is
>>> NUMA-aware and uses 20 cores to handle the 100G traffic.
>>>
>>> Upon further investigation (using the Mellanox NIC counters), the drop
>>> in throughput is due to mbuf allocation errors.
>>>
>>> Is such performance degradation normal when performing hw offloading
>>> using rte_flow?
>>> Has anyone tested throughput performance using rte_flow @ 100G?
>>>
>>> It's surprising to see hardware offloading degrading the performance,
>>> unless I am doing something wrong.
>>>
>>> Thanks,
>>> Arvind
>>>
>>
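
A minimal sketch, against the DPDK 18.11 rte_flow API, of the kind of rule
discussed above: match the IPv4 destination address and steer the packet to
a single RX queue, with a MARK action. The function name and the port,
queue, mark, and address parameters are illustrative placeholders, not the
posters' actual code; treat it as a starting point only.

#include <rte_byteorder.h>
#include <rte_ethdev.h>
#include <rte_flow.h>

/* Hypothetical helper: steer packets whose IPv4 destination matches
 * dst_ip_be (network byte order) to rx queue queue_id, tagging them
 * with mark_id on the way. */
static struct rte_flow *
steer_dst_ip_to_queue(uint16_t port_id, uint32_t dst_ip_be,
                      uint16_t queue_id, uint32_t mark_id,
                      struct rte_flow_error *error)
{
        struct rte_flow_attr attr = { .ingress = 1 };

        /* Pattern: ETH / IPV4, matching the full destination address. */
        struct rte_flow_item_ipv4 ip_spec = {
                .hdr.dst_addr = dst_ip_be,
        };
        struct rte_flow_item_ipv4 ip_mask = {
                .hdr.dst_addr = RTE_BE32(0xffffffff),
        };
        struct rte_flow_item pattern[] = {
                { .type = RTE_FLOW_ITEM_TYPE_ETH },
                { .type = RTE_FLOW_ITEM_TYPE_IPV4,
                  .spec = &ip_spec, .mask = &ip_mask },
                { .type = RTE_FLOW_ITEM_TYPE_END },
        };

        /* Actions: tag the packet, then deliver it to one RX queue. */
        struct rte_flow_action_mark mark = { .id = mark_id };
        struct rte_flow_action_queue queue = { .index = queue_id };
        struct rte_flow_action actions[] = {
                { .type = RTE_FLOW_ACTION_TYPE_MARK, .conf = &mark },
                { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
                { .type = RTE_FLOW_ACTION_TYPE_END },
        };

        /* Validate first so an unsupported pattern/action fails loudly. */
        if (rte_flow_validate(port_id, &attr, pattern, actions, error) != 0)
                return NULL;

        return rte_flow_create(port_id, &attr, pattern, actions, error);
}

Calling a helper like this in a loop, one rule per destination address and
one queue per core, would approximate the ~80-rule, 20-queue setup described
in the thread. Since the reported drop is attributed to mbuf allocation
errors, the application-side counter to watch is rx_nombuf in struct
rte_eth_stats (retrieved with rte_eth_stats_get()), alongside the NIC's own
hardware counters.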