From: Eric Kinzie
To: Matan Azrad
Cc: Luca Boccassi, Chas Williams <3chas3@gmail.com>, "dev@dpdk.org", Declan Doherty, Chas Williams, "stable@dpdk.org"
Date: Wed, 22 Aug 2018 13:43:19 -0400
Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH v4] net/bonding: per-slave intermediate rx ring
Message-ID: <20180822174316.GA29821@roosta>
References: <20180816125202.15980-1-bluca@debian.org> <20180816133208.26566-1-bluca@debian.org> <1534933159.5764.107.camel@debian.org>
List-Id: DPDK patches and discussions

On Wed Aug 22 11:42:37 +0000 2018, Matan Azrad wrote:
> Hi Luca
>
> From: Luca Boccassi
> > On Wed, 2018-08-22 at 07:09 +0000, Matan Azrad wrote:
> > > Hi Chas
> > >
> > > From: Chas Williams
> > > > On Tue, Aug 21, 2018 at 11:43 AM Matan Azrad wrote:
> > > > Hi Chas
> > > >
> > > > From: Chas Williams
> > > > > On Tue, Aug 21, 2018 at 6:56 AM Matan Azrad wrote:
> > > > > Hi
> > > > >
> > > > > From: Chas Williams
> > > > > > This will need to be implemented for some of the other RX burst
> > > > > > methods at some point for other modes to see this performance
> > > > > > improvement (with the exception of active-backup).
> > > > >
> > > > > Yes, I think it should be done at least to
> > > > > bond_ethdev_rx_burst_8023ad_fast_queue (should be easy) for now.
> > > > >
> > > > > There is some duplicated code between the various RX paths.
> > > > > I would like to eliminate that as much as possible, so I was going
> > > > > to give that some thought first.
> > > >
> > > > There is no reason to stay this function as is while its twin is
> > > > changed.
> > > >
> > > > Unfortunately, this is all the patch I have at this time.
> > > >
> > > > >
> > > > > > On Thu, Aug 16, 2018 at 9:32 AM Luca Boccassi wrote:
> > > > > >
> > > > > > > During bond 802.3ad receive, a burst of packets is fetched
> > > > > > > from each slave into a local array and appended to per-slave
> > > > > > > ring buffer.
> > > > > > > Packets are taken from the head of the ring buffer and
> > > > > > > returned to the caller. The number of mbufs provided to each
> > > > > > > slave is sufficient to meet the requirements of the ixgbe
> > > > > > > vector receive.
> > > > >
> > > > > Luca,
> > > > >
> > > > > Can you explain these requirements of ixgbe?
> > > > >
> > > > > The ixgbe (and some other Intel PMDs) have vectorized RX routines
> > > > > that are more efficient (if not faster) taking advantage of some
> > > > > advanced CPU instructions. I think you need to be receiving at
> > > > > least 32 packets or more.
> > > >
> > > > So, why to do it in bond which is a generic driver for all the
> > > > vendors PMDs, If for ixgbe and other Intel nics it is better you can
> > > > force those PMDs to receive always 32 packets and to manage a ring
> > > > by themselves.
> > > >
> > > > The drawback of the ring is some additional latency on the receive
> > > > path.
> > > > In testing, the additional latency hasn't been an issue for bonding.
> > >
> > > When bonding does processing slower it may be a bottleneck for the
> > > packet processing for some application.
> > >
> > > > The bonding PMD has a fair bit of overhead associated with the RX
> > > > and TX path calculations.
> > > > Most applications can just arrange to
> > > > call the RX path with a sufficiently large receive. Bonding can't
> > > > do this.
> > >
> > > I didn't talk on application I talked on the slave PMDs, The slave PMD
> > > can manage a ring by itself if it helps for its own performance.
> > > The bonding should not be oriented to specific PMDs.
> >
> > The issue though is that the performance problem is not with the individual
> > PMDs - it's with bonding. There were no reports regarding the individual
> > PMDs.
> > This comes from reports from customers from real world production
> > deployments - the issue of bonding being too slow was raised multiple times.
> > This patch addresses those issues, again in production deployments, where
> > it's been used for years, to users and customers satisfaction.
>
> From Chas I understood that using burst of 32 helps for some slave PMDs performance which makes sense.
> I can't understand how the extra copy phases improves the bonding itself performance:
>
> You added 2 copy phases in the bonding RX function:
> 1. Get packets from the slave to a local array.
> 2. Copy packet pointers from a local array to the ring array.
> 3. Copy packet pointers from the ring array to the application array.
>
> Each packet arriving to the application must pass the above 3 phases(in a specific call or in previous calls).
>
> Without this patch we have only -
> Get packets from the slave to the application array.
>
> Can you explain how the extra copies improves the bonding performance?
>
> Looks like it improves the slaves PMDs and because of that the bonding PMD performance becomes better.

I'm not sure that adding more buffer management to the vector PMDs will
improve the drivers' performance; it's just that calling the rx function
in such a way that it returns no data wastes time.
The bonding driver is already an exercise in buffer management so adding
this layer of indirection here makes sense in my opinion, as does hiding
the details of the constituent interfaces where possible.

> > So I'd like to share this improvement rather than keeping it private - because
> > I'm nice that way :-P
> >
> > > > > Did you check for other vendor PMDs? It may hurt performance
> > > > > there..
> > > > >
> > > > > I don't know, but I suspect probably not. For the most part you
> > > > > are typically reading almost up to the vector requirement. But if
> > > > > one slave has just a single packet, then you can't vectorize on
> > > > > the next slave.
> > > > >
> > > >
> > > > I don't think that the ring overhead is better for PMDs which are
> > > > not using the vectorized instructions.
> > > >
> > > > The non-vectorized PMDs are usually quite slow. The additional
> > > > overhead doesn't make a difference in their performance.
> > >
> > > We should not do things worse than they are.
> >
> > There were no reports that this made things worse. The feedback from
> > production was that it made things better.
>
> Yes, It may be good for specific slaves drivers but hurt another slaves drivers,
> So maybe it should stay private to specific customers using specific nics.
>
> Again, I can understand how this patch improves performance of some PMDs
> therefore I think the bonding is not the place to add it but maybe some PMDs.
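
For reference, the three phases listed above (slave to local array, local
array to per-slave ring, ring to application array) can be sketched in
plain C. This is only an illustration under stated assumptions, not the
code from the patch: it makes no DPDK calls, and struct pkt, pkt_ring,
buffered_rx, fake_slave, RING_SIZE and SLAVE_BURST are all hypothetical
names. The burst of 32 is taken from the "at least 32 packets" remark
earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE   64u  /* hypothetical; must be a power of two */
#define SLAVE_BURST 32u  /* burst always requested from the slave */

struct pkt { uint32_t seq; };  /* stand-in for an mbuf, not rte_mbuf */

/* Per-slave ring of packet pointers; head/tail are free-running counters. */
struct pkt_ring {
	struct pkt *slots[RING_SIZE];
	uint32_t head;  /* next slot to dequeue */
	uint32_t tail;  /* next slot to enqueue */
};

static uint32_t ring_count(const struct pkt_ring *r)
{
	return r->tail - r->head;
}

static uint32_t ring_free(const struct pkt_ring *r)
{
	return RING_SIZE - ring_count(r);
}

/* Hypothetical slave receive callback: fills pkts[], returns the count. */
typedef uint16_t (*slave_rx_fn)(void *slave, struct pkt **pkts, uint16_t n);

static uint16_t buffered_rx(struct pkt_ring *r, slave_rx_fn rx, void *slave,
			    struct pkt **out, uint16_t want)
{
	struct pkt *burst[SLAVE_BURST];
	uint16_t i, n, got = 0;

	/* Poll the slave only if a full burst fits, so nothing is dropped. */
	if (ring_free(r) >= SLAVE_BURST) {
		/* Phase 1: slave -> local array, always a full-size request. */
		n = rx(slave, burst, SLAVE_BURST);
		assert(n <= SLAVE_BURST);
		/* Phase 2: local array -> tail of the per-slave ring. */
		for (i = 0; i < n; i++)
			r->slots[r->tail++ % RING_SIZE] = burst[i];
	}

	/* Phase 3: head of the ring -> caller's array, up to 'want'. */
	while (got < want && ring_count(r) > 0)
		out[got++] = r->slots[r->head++ % RING_SIZE];

	return got;
}

/* Hypothetical test slave: hands out up to 'avail' numbered packets. */
struct fake_slave {
	struct pkt pool[64];
	uint16_t next;
	uint16_t avail;
};

static uint16_t fake_slave_rx(void *slave, struct pkt **pkts, uint16_t n)
{
	struct fake_slave *s = slave;
	uint16_t i, give = s->avail < n ? s->avail : n;

	for (i = 0; i < give; i++) {
		s->pool[s->next].seq = s->next;
		pkts[i] = &s->pool[s->next++];
	}
	s->avail -= give;
	return give;
}
```

In this sketch the slave is always asked for SLAVE_BURST packets no matter
how few the caller wants, and whatever the caller does not take stays in
the ring for the next call. That is also where the extra latency
mentioned in the thread would come from.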