From: Nélio Laranjeiro
To: "Xueming(Steven) Li"
Cc: Shahaf Shuler, "dev@dpdk.org"
Date: Tue, 23 Jan 2018 14:31:08 +0100
Subject: Re: [dpdk-dev] [PATCH] net/mlx5: remmap UAR address for multiple process
Message-ID: <20180123133108.b2uhpfjaaalfhsqq@laranjeiro-vm.dev.6wind.com>
References: <20180119150854.89828-1-xuemingl@mellanox.com> <20180122145321.jpyepyvjjlktillp@laranjeiro-vm.dev.6wind.com>

Hi Xueming,

My only comments concern the commit log, which should be rewritten to
describe more accurately the issue you are trying to address, even if this
patch does not solve it.  Please see below.

On Tue, Jan 23, 2018 at 09:50:42AM +0000, Xueming(Steven) Li wrote:
> Hi Nelio,
>
> > -----Original Message-----
> > From: Nélio Laranjeiro [mailto:nelio.laranjeiro@6wind.com]
> > Sent: Monday, January 22, 2018 10:53 PM
> > To: Xueming(Steven) Li
> > Cc: Shahaf Shuler; dev@dpdk.org
> > Subject: Re: [PATCH] net/mlx5: remmap UAR address for multiple process
> >
> > Hi Xueming,
> >
> > On Fri, Jan 19, 2018 at 11:08:54PM +0800, Xueming Li wrote:
> > > UAR(doorbell) is hw resources that have to be same address between
> > > primary and secondary process, failed to mmap UAR will make TX packets
> > > invisible to HW.
> > > Today, UAR address returned from verbs api is mixed in heap and loaded
> > > library address space, prone to be occupied in secondary process.
> > > This patch reserves a dedicate UAR address space, both primary and
> > > secondary process re-mmap UAR pages into this space.
> > > Below is a brief picture of dpdk app address space allocation:
> > >         Before                  This patch
> > >         ------                  ----------
> > >         [stack]                 [stack]
> > >         [.so, uar, heap]        [.so, heap]
> > >         [(empty)]               [(empty)]
> > >         [hugepage]              [hugepage]
> > >         [? others]              [? others]
> > >         [(empty)]               [(empty)]
> > >                                 [uar]
> > >                                 [(empty)]
> > > To minimize conflicts, UAR address space comes after hugepage space
> > > with an offset to skip potential usage from other drivers.
> >
> > Seems it is not the case when the memory is contiguous, according to what
> > I see in my testpmd /proc//maps:
> >
> > PMD: mlx5.c:523: mlx5_uar_init_primary(): Reserved UAR address space:
> > 0x0x7f4da5800000
> >
> > And the fist huge page is at address 0x7f4fa5800000, new UAR space is
> > before and not after.
> >
> > With this patch I still have the situation described as "before".
> >
>
> Your observation is correct, system is allocating address in a high-to-low
> manner like stack. UAR address range 0x0x7f4da5800000 - 0x0x7f4ea5800000,
> 4GB size, With another 4G offset, hugepage range start is 0x7f4fa5800000.

From what I understand, remapping the UAR pages to an address before the
huge pages reduces the cases where the secondary processes cannot start.
This patch does not fix the fact that it may still fail.

Your small diagram of the memory mapping before and after does not seem
accurate, depending on the OS being run: on Linux v4.14 from Debian 9 sid,
I am still in the "before" situation no matter how many times I restart
the process.  For that reason I suggest you remove it.

> > > Once UAR space reserved successfully, UAR pages are re-mmapped into
> > > new area to keep UAR address aligned between primary and secondary
> > process.
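
To make the address arithmetic discussed above concrete, here is a small
sketch.  It assumes, from the figures you quote, a 4 GiB UAR area and a
4 GiB offset; UAR_SIZE, UAR_OFFSET and both helper names are placeholders
chosen for this illustration, not the definitions from mlx5_defs.h.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, taken from the sizes quoted in this thread. */
#define UAR_SIZE   (UINT64_C(1) << 32)	/* 4 GiB reserved for UAR pages */
#define UAR_OFFSET (UINT64_C(1) << 32)	/* 4 GiB gap below the huge pages */

/* Start of the UAR area, computed downwards from the lowest memseg. */
static uint64_t
uar_range_start(uint64_t hugepage_low)
{
	/* The subtraction must not wrap below zero. */
	assert(hugepage_low >= UAR_OFFSET + UAR_SIZE);
	return hugepage_low - (UAR_OFFSET + UAR_SIZE);
}

/* First address past the UAR area. */
static uint64_t
uar_range_end(uint64_t hugepage_low)
{
	return uar_range_start(hugepage_low) + UAR_SIZE;
}
```

With the lowest huge page at 0x7f4fa5800000 this gives the range
0x7f4da5800000 - 0x7f4ea5800000, i.e. entirely below the huge pages,
which matches what I see in my maps.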
> > >
> > > Signed-off-by: Xueming Li
> > > ---
> > >  drivers/net/mlx5/mlx5.c         | 107 ++++++++++++++++++++++++++++++++++++++++
> > >  drivers/net/mlx5/mlx5.h         |   1 +
> > >  drivers/net/mlx5/mlx5_defs.h    |  10 ++++
> > >  drivers/net/mlx5/mlx5_rxtx.h    |   3 +-
> > >  drivers/net/mlx5/mlx5_trigger.c |   7 ++-
> > >  drivers/net/mlx5/mlx5_txq.c     |  51 +++++++++++++------
> > >  6 files changed, 163 insertions(+), 16 deletions(-)
> > >
> > > diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> > > index fc2d59fee..1539ef608 100644
> > > --- a/drivers/net/mlx5/mlx5.c
> > > +++ b/drivers/net/mlx5/mlx5.c
> > > @@ -39,6 +39,7 @@
> > >  #include
> > >  #include
> > >  #include
> > > +#include
> > >
> > >  /* Verbs header. */
> > >  /* ISO C doesn't support unnamed structs/unions, disabling -pedantic. */
> > > @@ -56,6 +57,7 @@
> > >  #include
> > >  #include
> > >  #include
> > > +#include
> > >  #include
> > >
> > >  #include "mlx5.h"
> > > @@ -466,6 +468,101 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
> > >
> > >  static struct rte_pci_driver mlx5_driver;
> > >
> > > +/*
> > > + * Reserved UAR address space for TXQ UAR(hw doorbell) mapping, process
> > > + * local resource used by both primary and secondary to avoid duplicate
> > > + * reservation.
> > > + * The space has to be available on both primary and secondary process,
> > > + * TXQ UAR maps to this area using fixed mmap w/o double check.
> > > + */
> > > +static void *uar_base;
> > > +
> > > +/**
> > > + * Reserve UAR address space for primary process
> > > + *
> > > + * @param[in] priv
> > > + *   Pointer to private structure.
> > > + *
> > > + * @return
> > > + *   0 on success, negative errno value on failure.
> > > + */
> > > +static int
> > > +mlx5_uar_init_primary(struct priv *priv) {
> > > +	void *addr = (void *)0;
> > > +	int i;
> > > +	const struct rte_mem_config *mcfg;
> > > +
> > > +	if (uar_base) { /* UAR address space mapped */
> > > +		priv->uar_base = uar_base;
> > > +		return 0;
> > > +	}
> > > +	/* find out lower bound of hugepage segments */
> > > +	mcfg = rte_eal_get_configuration()->mem_config;
> > > +	for (i = 0; i < RTE_MAX_MEMSEG && mcfg->memseg[i].addr; i++) {
> > > +		if (addr)
> > > +			addr = RTE_MIN(addr, mcfg->memseg[i].addr);
> > > +		else
> > > +			addr = mcfg->memseg[i].addr;
> >
> > This if/else is useless as addr is already initialised with the smallest
> > possible value.
>
> That's my original code :-) and I always get addr zero then.
> Addr here is the lower bound of hugepage, we don't want addr to keep zero.

Indeed, I mixed up min and max.

> > > +	}
> > > +	/* offset down UAR area */
> > > +	addr = RTE_PTR_SUB(addr, MLX5_UAR_OFFSET + MLX5_UAR_SIZE);
> >
> > Seems the error is here, the loops get the address of the memseg with the
> > smallest address and then it subtract the UAR size, addr cannot be after
> > the huge pages unless if this subtraction overflows.
>
> Thanks, my word "after" is something like address alloction order, the UAR block
> under "hugepage" on the overall picture.

There is no guarantee that the system will allocate from the end to the
beginning.  "After" means having an address higher than the reference;
otherwise it is not after but before.

> > > +	/* anonymous mmap, no real memory consumption */
> > > +	addr = mmap(addr, MLX5_UAR_SIZE,
> > > +		    PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> > > +	if (addr == MAP_FAILED) {
> > > +		ERROR("Failed to reserve UAR address space, please adjust "
> > > +		      "MLX5_UAR_SIZE or try --base-virtaddr");
> >
> > How does a user knows the UAR memory space the NIC needs to adjust the
> > MLX5_UAR_SIZE?
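
For reference, without MAP_FIXED the address passed to mmap() is only a
hint, so the kernel is free to place the reservation elsewhere when the
requested range is occupied.  Below is a minimal sketch of such a
reservation; reserve_va() and its "moved" flag are hypothetical names used
for illustration only and are not part of the patch.

```c
#define _DEFAULT_SOURCE	/* for MAP_ANONYMOUS with strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: reserve `size` bytes of address space near `hint`.
 * PROT_NONE keeps the range inaccessible, so no memory is touched.
 * On success returns the address chosen by the kernel and sets *moved
 * to 1 when a non-NULL hint was not honoured; returns NULL on failure.
 */
static void *
reserve_va(void *hint, size_t size, int *moved)
{
	void *addr;

	assert(size > 0);
	addr = mmap(hint, size, PROT_NONE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED)
		return NULL;
	if (moved)
		*moved = (hint != NULL && addr != hint);
	return addr;
}
```

A caller that needs the same address in the primary and the secondary
processes would have to check the flag (or compare addresses itself)
rather than assume the hint was honoured.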
> >
> > > +		return -ENOMEM;
> > > +	}
> > > +	/* Accept either same addr or a new addr returned from mmap if target
> > > +	 * range occupied.
> > > +	 */
> > > +	INFO("Reserved UAR address space: 0x%p", addr);
> >
> > The '%p' already prefix the address with the 0x.
> >
> > > +	priv->uar_base = addr; /* for primary and secondary UAR re-mmap */
> > > +	uar_base = addr; /* process local, don't reserve again */
> > > +	return 0;
> > > +}
> > > +
> >
> >
> > Regards,
> >
> > --
> > Nélio Laranjeiro
> > 6WIND

Thanks,

-- 
Nélio Laranjeiro
6WIND
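
P.S. On the '%p' remark: the output of '%p' is implementation-defined in
C, but glibc prints the "0x" prefix itself, which is why the log line
shows "0x0x7f4da5800000".  A small sketch to illustrate, assuming a
glibc-like libc; format_addr() and has_hex_prefix() are hypothetical
helpers written for this example only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: render p once with "%p" and once with "0x%p". */
static void
format_addr(char *plain, char *prefixed, size_t len, void *p)
{
	assert(len > 0);
	snprintf(plain, len, "%p", p);
	snprintf(prefixed, len, "0x%p", p);
}

/* Return non-zero when s starts with "0x". */
static int
has_hex_prefix(const char *s)
{
	return strncmp(s, "0x", 2) == 0;
}
```

On such a libc the "plain" string already starts with "0x", and the
"prefixed" one carries it twice, so dropping the literal "0x" from the
format string is enough.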