Date: Thu, 20 Jul 2017 15:55:49 +0200
From: Nélio Laranjeiro
To: Sagi Grimberg, Shahaf Shuler
Cc: dev@dpdk.org, Yongseok Koh, Roy Shterman, Alexander Solganik, Leon Romanovsky
Subject: Re: [dpdk-dev] Question on mlx5 PMD txq memory registration

On Wed, Jul 19, 2017 at 09:21:39AM +0300, Sagi Grimberg wrote:
>
> > There is none; if you send a burst of 9 packets, each one coming from a
> > different mempool, the first one will be dropped.
>
> It's worse than just a drop: without debug enabled the error completion
> is ignored, and the wqe_pi is taken from an invalid field, which leads
> to bogus mbuf frees (elts_tail is not valid).
Right.

> > > AFAICT, it is the driver's responsibility to guarantee to never
> > > deregister a memory region that has in-flight send operations posted,
> > > otherwise the send operation *will* complete with a local protection
> > > error. Is that taken care of?
> >
> > Up to now we have assumed that the user knows his application's needs
> > and can increase this cache size accordingly through the configuration
> > item. This way the limit and the guarantee hold.
>
> That is an undocumented assumption.

I agree it should be documented; in reality, you are the first one we know
of to hit this issue.

> > > Another question: why is the MR cache maintained per TX queue and not
> > > per device? If the application starts N TX queues then a single
> > > mempool will be registered N times instead of just once. Having lots
> > > of MR instances will pollute the device ICMC pretty badly. Am I
> > > missing something?
> >
> > Having this cache per device needs a lock on the device structure while
> > threads are sending packets.
>
> Not sure why it needs a lock at all. It *may* need rcu protection or an
> rw_lock, if anything.

TX queues may run on several CPUs, so we must make sure this array cannot
be modified by two threads at the same time. Either way it is costly.

> > Having such locks costs cycles; that is why the cache is per queue.
> > Another point is that having several mempools per device is common,
> > whereas having several mempools per queue is not, so it seems logical
> > to have this cache per queue for those two reasons.
> >
> > I am currently re-working this part of the code to improve it using
> > reference counters instead. The cache will remain for performance
> > purposes. This will fix the issues you are pointing out.
>
> AFAICT, all this caching mechanism is just working around the fact
> that mlx5 allocates resources on top of the existing verbs interface.
> I think it should work like any other PMD driver, i.e. use the mbuf
> physical addresses.
>
> The mlx5 device (like all other RDMA devices) has a global DMA lkey that
> spans the entire physical address space. Just about all the kernel
> drivers heavily use this lkey. IMO, the mlx5_pmd driver should be able
> to query the kernel for this lkey and ask the kernel to create the QP
> with a privilege level that allows posting send/recv operations with
> that lkey.
>
> And then mlx5_pmd becomes like other drivers, working with physical
> addresses instead of working around memory registration sub-optimally.

It is one possibility that has also been discussed with the Mellanox guys;
the problem is that it gives up the protection memory registration
provides, which is also important. If this is added in the future it will
certainly be as an option, so that both modes remain possible and the
application can choose between security and performance. I don't know of
any planning on this from the Mellanox side; maybe Shahaf does.

> And while we're on the subject, what is the plan for detaching mlx5_pmd
> from its MLNX_OFED dependency? Mellanox has been doing a good job
> upstreaming the needed features (rdma-core). CC'ing Leon (who is
> co-maintaining the user-space rdma tree).

This is also in progress on the PMD side; it should be part of the next
DPDK release.

-- 
Nélio Laranjeiro
6WIND
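
For context, a minimal sketch of the per-TX-queue MR cache behaviour the
thread is debating, assuming a cache of 8 entries (as the "burst of 9
packets" example implies). The names MR_CACHE_SIZE, struct mr_cache_entry
and txq_mp2mr_lookup() are hypothetical, not the actual mlx5 PMD symbols;
the point is only that lookups are lock-free because each TX queue owns
its own small array, which is why a per-device cache would need the
rcu/rwlock protection discussed above.

/*
 * Hypothetical sketch -- not the actual mlx5 PMD code.  Each TX queue owns
 * a small array mapping a mempool to the lkey of the memory region (MR)
 * registered for it, so the fast path needs no lock.
 */
#include <stdint.h>
#include <stddef.h>

#define MR_CACHE_SIZE 8	/* assumed size; the 9th mempool in a burst misses */

struct mr_cache_entry {
	const void *mp;		/* mempool covered by this entry */
	uint32_t lkey;		/* lkey of the MR registered for that mempool */
};

struct txq_sketch {
	/* Owned by a single TX queue, hence no locking on lookup. */
	struct mr_cache_entry mr_cache[MR_CACHE_SIZE];
};

/* Return the lkey to put in the send WQE, or UINT32_MAX on a cache miss. */
static inline uint32_t
txq_mp2mr_lookup(struct txq_sketch *txq, const void *mp)
{
	unsigned int i;

	for (i = 0; i < MR_CACHE_SIZE; ++i)
		if (txq->mr_cache[i].mp == mp)
			return txq->mr_cache[i].lkey;
	/*
	 * Miss: the slow path would register the mempool (ibv_reg_mr()) and
	 * insert it here, possibly evicting an entry whose MR may still back
	 * in-flight send operations -- the hazard raised in this thread.
	 */
	return UINT32_MAX;
}

On a miss with a full cache, deregistering the evicted entry's MR while
sends that reference it are still in flight is what produces the local
protection error (and, without debug, the bogus mbuf frees) described at
the top of the thread.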