From: Alejandro Lucero
Date: Wed, 26 Oct 2016 10:28:10 +0200
To: Bruce Richardson
Cc: Shreyansh Jain, "dev@dpdk.org", Morten Brørup
Subject: Re: [dpdk-dev] mbuf changes

On Tue, Oct 25, 2016 at 2:05 PM, Bruce Richardson <bruce.richardson@intel.com> wrote:

> On Tue, Oct 25, 2016 at 05:24:28PM +0530, Shreyansh Jain wrote:
> > On Monday 24 October 2016 09:55 PM, Bruce Richardson wrote:
> > > On Mon, Oct 24, 2016 at 04:11:33PM +0000, Wiles, Keith wrote:
> > > >
> > > > > On Oct 24, 2016, at 10:49 AM, Morten Brørup <mb@smartsharesystems.com> wrote:
> > > > >
> > > > > First of all: Thanks for a great DPDK Userspace 2016!
> > > > >
> > > > > Continuing the Userspace discussion about Olivier Matz’s proposed
> > > > > mbuf changes...
> > >
> > > Thanks for keeping the discussion going!
> > >
> > > > >
> > > > > 1.
> > > > >
> > > > > Stephen Hemminger had a noteworthy general comment about keeping
> > > > > metadata for the NIC in the appropriate section of the mbuf: Metadata
> > > > > generated by the NIC’s RX handler belongs in the first cache line, and
> > > > > metadata required by the NIC’s TX handler belongs in the second cache line.
> > > > > This also means that touching the second cache line on ingress should be
> > > > > avoided if possible; and Bruce Richardson mentioned that for this reason
> > > > > m->next was zeroed on free().
> > >
> > > Thinking about it, I suspect there are more fields we can reset on free
> > > to save time on alloc. Refcnt, as discussed below is one of them, but so
> > > too could be the nb_segs field and possibly others.
> > >
> > > > >
> > > > > 2.
> > > > >
> > > > > There seemed to be consensus that the size of m->refcnt should
> > > > > match the size of m->port because a packet could be duplicated on all
> > > > > physical ports for L3 multicast and L2 flooding.
> > > > >
> > > > > Furthermore, although a single physical machine (i.e. a single
> > > > > server) with 255 physical ports probably doesn’t exist, it might contain
> > > > > more than 255 virtual machines with a virtual port each, so it makes sense
> > > > > extending these mbuf fields from 8 to 16 bits.
> > > >
> > > > I thought we also talked about removing the m->port from the mbuf as
> > > > it is not really needed.
> > > >
> > > Yes, this was mentioned, and also the option of moving the port value to
> > > the second cacheline, but it appears that NXP are using the port value
> > > in their NIC drivers for passing in metadata, so we'd need their
> > > agreement on any move (or removal).
> >
> > I am not sure where NXP's NIC came into picture on this, but now that it is
> > highlighted, this field is required for libevent implementation [1].
> >
> > A scheduler sending an event, which can be a packet, would only have
> > information of a flow_id. From this matching it back to a port, without
> > mbuf->port, would be very difficult (costly). There may be way around this
> > but at least in current proposal I think port would be important to have -
> > even if in second cache line.
> >
> > But, off the top of my head, as of now it is not being used for any specific
> > purpose in NXP's PMD implementation.
> >
> > Even the SoC patches don't necessarily rely on it except using it because it
> > is available.
> >
> > @Bruce: where did you get the NXP context here from?
> >
> Oh, I'm just mis-remembering. :-( It was someone else who was looking for
> this - Netronome, perhaps?
>
> CC'ing Alejandro in the hope I'm remembering correctly second time
> round!
>

Yes. Thanks Bruce!

So Netronome uses the port field and, as I commented at the user meeting,
we are happy with the field going from 8 to 16 bits.

In our case, this is something some clients have demanded, and if I'm not
wrong (I'll double check this asap), the port value is for knowing where
the packet is coming from. Think about a switch in the NIC, with ports
linked to VFs/VMs, and one or more physical ports. That port value is not
related to DPDK ports but to the switch ports.

Code in the host (DPDK or not) can receive packets from the wire or from
VFs through the NIC. This is also true for packets received by VMs, but I
guess the port value is only of interest to host code.

> /Bruce
>
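
For readers following the thread, here is a minimal sketch in C of the two points discussed above: why an 8-bit refcnt is too small when a packet is duplicated to more than 255 ports, and what keeping RX metadata in the first cache line and TX metadata in the second looks like. Everything here is an assumption for illustration: the names sketch_mbuf, refs_after_flood_u8 and refs_after_flood_u16 are hypothetical, the 64-byte cache line is assumed, and the layout is NOT the real struct rte_mbuf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumed cache line size, for illustration only */

/* Hypothetical, heavily simplified layout. This is NOT struct rte_mbuf. */
struct sketch_mbuf {
    /* First cache line: fields written by the RX path. */
    _Alignas(CACHE_LINE) void *buf_addr;
    uint16_t refcnt;   /* widened from 8 to 16 bits, as proposed in the thread */
    uint16_t port;     /* same width as refcnt */
    uint32_t pkt_len;

    /* Second cache line: fields only needed by the TX path. */
    _Alignas(CACHE_LINE) struct sketch_mbuf *next;
    uint64_t tx_offload;
};

/* Compile-time checks that the sketch matches the intent described above. */
static_assert(offsetof(struct sketch_mbuf, port) < CACHE_LINE,
              "port should sit in the first cache line");
static_assert(offsetof(struct sketch_mbuf, next) == CACHE_LINE,
              "next should start the second cache line");

/* Take one reference per output port using an 8-bit counter.
 * Unsigned arithmetic wraps modulo 256, so the count is lost
 * once nb_ports exceeds 255. */
uint8_t refs_after_flood_u8(unsigned nb_ports)
{
    uint8_t refcnt = 0;
    for (unsigned i = 0; i < nb_ports; i++)
        refcnt++;
    return refcnt;
}

/* Same loop with a 16-bit counter: correct for up to 65535 ports. */
uint16_t refs_after_flood_u16(unsigned nb_ports)
{
    uint16_t refcnt = 0;
    for (unsigned i = 0; i < nb_ports; i++)
        refcnt++;
    return refcnt;
}
```

With 300 virtual ports, refs_after_flood_u8(300) returns 44 (300 modulo 256), while refs_after_flood_u16(300) returns 300, which is the reason for matching the width of refcnt to the width of port.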