From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <rahul.lakkireddy@chelsio.com>
Received: from stargate.chelsio.com (stargate.chelsio.com [12.32.117.8])
 by dpdk.org (Postfix) with ESMTP id B9EF51B66E
 for <dev@dpdk.org>; Tue,  6 Feb 2018 10:21:24 +0100 (CET)
Received: from localhost (scalar.blr.asicdesigners.com [10.193.185.94])
 by stargate.chelsio.com (8.13.8/8.13.8) with ESMTP id w169L0jM008644;
 Tue, 6 Feb 2018 01:21:01 -0800
Date: Tue, 6 Feb 2018 14:50:47 +0530
From: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
To: Ferruh Yigit <ferruh.yigit@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, Kumar A S <kumaras@chelsio.com>,
 Surendra Mobiya <surendra@chelsio.com>,
 Nirranjan Kirubaharan <nirranjan@chelsio.com>,
 Indranil Choudhury <indranil@chelsio.com>
Message-ID: <20180206092046.GA19019@chelsio.com>
References: <cover.1517685185.git.rahul.lakkireddy@chelsio.com>
 <86269c17d76550025a5eb1a60202c7052f8e6ddf.1517685185.git.rahul.lakkireddy@chelsio.com>
 <58e0e745-bdba-e5a5-98a0-68bc1698e3c2@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <58e0e745-bdba-e5a5-98a0-68bc1698e3c2@intel.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Subject: Re: [dpdk-dev] [PATCH 1/7] cxgbe: fix secondary process
 initialization
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Tue, 06 Feb 2018 09:21:25 -0000

On Monday, February 02/05/18, 2018 at 22:53:36 +0530, Ferruh Yigit wrote:
> On 2/4/2018 6:06 AM, Rahul Lakkireddy wrote:
> > From: Kumar Sanghvi <kumaras@chelsio.com>
> > 
> > rte_eth_dev_allocate already assigns the eth_dev_data. So,
> > don't allocate it separately as part of probe function. And we
> > save this eth_dev_data as part of txq structure.
> > 
> > Attach to rte_eth_dev devices allocated by Primary process for
> > Ports other than Port-0 in the secondary process.
> > 
> > Fixes: 8318984927ff ("cxgbe: add pmd skeleton")
> > Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
> > Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
> > ---
> >  doc/guides/nics/features/cxgbe.ini |  1 +
> >  drivers/net/cxgbe/base/adapter.h   |  1 +
> >  drivers/net/cxgbe/cxgbe_ethdev.c   | 28 +++++++++++++++++++++++-----
> >  drivers/net/cxgbe/cxgbe_main.c     | 11 ++---------
> >  drivers/net/cxgbe/sge.c            |  5 +++--
> >  5 files changed, 30 insertions(+), 16 deletions(-)
> > 
> > diff --git a/doc/guides/nics/features/cxgbe.ini b/doc/guides/nics/features/cxgbe.ini
> > index 3d0fde2fd..1b9d81ffb 100644
> > --- a/doc/guides/nics/features/cxgbe.ini
> > +++ b/doc/guides/nics/features/cxgbe.ini
> > @@ -24,6 +24,7 @@ Basic stats          = Y
> >  Stats per queue      = Y
> >  EEPROM dump          = Y
> >  Registers dump       = Y
> > +Multiprocess aware   = Y
> >  BSD nic_uio          = Y
> >  Linux UIO            = Y
> >  Linux VFIO           = Y
> > diff --git a/drivers/net/cxgbe/base/adapter.h b/drivers/net/cxgbe/base/adapter.h
> > index f2057af14..0c1a1be25 100644
> > --- a/drivers/net/cxgbe/base/adapter.h
> > +++ b/drivers/net/cxgbe/base/adapter.h
> > @@ -267,6 +267,7 @@ struct sge_eth_tx_stats {	/* Ethernet tx queue statistics */
> >  struct sge_eth_txq {                   /* state for an SGE Ethernet Tx queue */
> >  	struct sge_txq q;
> >  	struct rte_eth_dev *eth_dev;   /* port that this queue belongs to */
> > +	struct rte_eth_dev_data *data;
> 
> Why you need to store a local copy of data? You have to store/know eth_dev, and
> I believe it is always better idea to access data through eth_dev. Perhaps we
> even should hide this access with macro etc. Not sure about motivation of this.
> 

Considering an example of Multiprocess Server and Client app, with
command-line like:
# mp_server -l 1-2 -n 4 -- -p 3 -n 2
# mp_client -l 3 -n 4 --proc-type=auto -- -n 0
# mp_client -l 4 -n 4 --proc-type=auto -- -n 1

Below is the flow...

For Primary process:
- For Port-0, eth_dev comes from stack.
- For Port-1 and onwards, we allocate eth_dev via
  rte_eth_dev_allocate().
Both will have its own eth_dev_data.
Also, in primary, we allocate TxQ, RxQ and associate them with
respective eth_dev available in Primary.

   
For Secondary process:
New Port-0 ethdev would be allocated. However, it will attach to
eth_dev_data of Primary's Port-0.
New Port-1 ethdev would be allocated. However, it will now attach
to eth_dev_data of Primary's Port-1.
    
When Secondary process transmits on Port-1, it transmits over Port-1
ethdev allocated in Secondary process.
Since this ethdev has same eth_dev_data of Primary's Port-1 - it
transmits over TxQ allocated in Primary process.
     
Now, when we are in t4_eth_xmit (via cxgbe_xmit_pkts), all we have is a
TxQ. And in Primary process, we always associate TxQ with respective
ethdev i.e. in this case, it would be Port-1 ethdev of Primary. So, for
retrieving max_rx_pkt_len in t4_eth_xmit, when we access txq->eth_dev -
it is NULL ---> because secondary process can't access eth_dev of
Primary.

And thats why, we need a pointer for directly accessing the eth_dev_data
from t4_eth_xmit.

> >  	struct sge_eth_tx_stats stats; /* queue statistics */
> >  	rte_spinlock_t txq_lock;
> >  
> > diff --git a/drivers/net/cxgbe/cxgbe_ethdev.c b/drivers/net/cxgbe/cxgbe_ethdev.c
> > index 5cd260f48..6d56f3c1b 100644
> > --- a/drivers/net/cxgbe/cxgbe_ethdev.c
> > +++ b/drivers/net/cxgbe/cxgbe_ethdev.c
> > @@ -1004,14 +1004,32 @@ static int eth_cxgbe_dev_init(struct rte_eth_dev *eth_dev)
> >  	eth_dev->dev_ops = &cxgbe_eth_dev_ops;
> >  	eth_dev->rx_pkt_burst = &cxgbe_recv_pkts;
> >  	eth_dev->tx_pkt_burst = &cxgbe_xmit_pkts;
> > +	pci_dev = RTE_ETH_DEV_TO_PCI(eth_dev);
> >  
> > -	/* for secondary processes, we don't initialise any further as primary
> > -	 * has already done this work.
> > +	/* for secondary processes, we attach to ethdevs allocated by primary
> > +	 * and do minimal initialization.
> >  	 */
> > -	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > +		int i;
> > +
> > +		for (i = 1; i < MAX_NPORTS; i++) {
> > +			struct rte_eth_dev *rest_eth_dev;
> > +			char namei[RTE_ETH_NAME_MAX_LEN];
> > +
> > +			snprintf(namei, sizeof(namei), "cxgbe%d", i);
> > +			rest_eth_dev = rte_eth_dev_attach_secondary(namei);
> > +			if (rest_eth_dev) {
> > +				rest_eth_dev->device = &pci_dev->device;
> > +				rest_eth_dev->dev_ops =
> > +					eth_dev->dev_ops;
> > +				rest_eth_dev->rx_pkt_burst =
> > +					eth_dev->rx_pkt_burst;
> > +				rest_eth_dev->tx_pkt_burst =
> > +					eth_dev->tx_pkt_burst;
> > +			}
> > +		}
> 
> You shouldn't need this update, old version seems better.
> 
> If a secondary process is running call stack is like:
> eth_cxgbe_pci_probe
>   rte_eth_dev_pci_generic_probe
>     rte_eth_dev_pci_allocate
>       rte_eth_dev_attach_secondary
>     eth_cxgbe_dev_init
> 
> rte_eth_dev_attach_secondary() already called in the patch, all need to be done
> in eth_cxgbe_dev_init() which already does.
> 
> Unless I am missing something this update shouldn't happen.
> 
> 

All physical ports on the device are exposed to DPDK via a single
physical function (PF).  We only bind our PF4 to DPDK stack.
Thus, eth_cxgbe_pci_probe() is called only for port 0, and we have
to handle the remaining ports manually.

> >  		return 0;
> > -
> > -	pci_dev = RTE_ETH_DEV_TO_PCI(eth_dev);
> > +	}
> >  
> >  	snprintf(name, sizeof(name), "cxgbeadapter%d", eth_dev->data->port_id);
> >  	adapter = rte_zmalloc(name, sizeof(*adapter), 0);
> > diff --git a/drivers/net/cxgbe/cxgbe_main.c b/drivers/net/cxgbe/cxgbe_main.c
> > index 28db6c061..c502fadf7 100644
> > --- a/drivers/net/cxgbe/cxgbe_main.c
> > +++ b/drivers/net/cxgbe/cxgbe_main.c
> > @@ -1266,8 +1266,6 @@ int cxgbe_probe(struct adapter *adapter)
> >  
> >  	for_each_port(adapter, i) {
> >  		char name[RTE_ETH_NAME_MAX_LEN];
> > -		struct rte_eth_dev_data *data = NULL;
> > -		const unsigned int numa_node = rte_socket_id();
> >  
> >  		pi = &adapter->port[i];
> >  		pi->adapter = adapter;
> > @@ -1293,13 +1291,8 @@ int cxgbe_probe(struct adapter *adapter)
> >  		if (!pi->eth_dev)
> >  			goto out_free;
> >  
> > -		data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
> > -		if (!data)
> > -			goto out_free;
> > -
> > -		data->port_id = adapter->eth_dev->data->port_id + i;
> > -
> > -		pi->eth_dev->data = data;
> 
> +1 to remove allocating "data" which is already comes with eth_dev.
> 
> We allocate them for virtual PMDs because they are local to the process, not
> shared between DPDK processes.
> 
> Overall I am not sure about cxgbe_probe(), which has been called by
> eth_cxgbe_dev_init() and eth_cxgbe_dev_init() already knows the allocated
> eth_dev, but cxgbe_probe() ignores it and allocates a new one, I am not able to
> understand why.
> 

Like I said, we only bind a single PF to DPDK.  Port 0 is already
handled by DPDK stack, but we need to handle exposing the remaining
ports on the device manually and hence the call to
rte_eth_dev_allocate() for remaining ports in cxgbe_probe().

> > +		pi->eth_dev->data->port_id =
> > +			adapter->eth_dev->data->port_id + i;
> 
> And this one, why updating internal port_id value, this looks wrong, how
> multi-process works with this and if there is another nic in the system now
> port_id is no more unique.
> 

Good catch!  I see that rte_eth_dev_allocate() already assigns unique
port_id and this seems unnecessary.  Will fix this.

> Perhaps you have a unique hardware/use case, it helps a lot if you can describe
> it a little.
> 

I hope my explanation above about multiple ports on single PF makes
sense. Let us know if unclear or more info is needed.

> >  
> >  allocate_mac:
> >  		pi->eth_dev->device = &adapter->pdev->device;
> > diff --git a/drivers/net/cxgbe/sge.c b/drivers/net/cxgbe/sge.c
> > index 3d5aa596f..6ff8bc46b 100644
> > --- a/drivers/net/cxgbe/sge.c
> > +++ b/drivers/net/cxgbe/sge.c
> > @@ -1095,7 +1095,7 @@ int t4_eth_xmit(struct sge_eth_txq *txq, struct rte_mbuf *mbuf,
> >  	u32 wr_mid;
> >  	u64 cntrl, *end;
> >  	bool v6;
> > -	u32 max_pkt_len = txq->eth_dev->data->dev_conf.rxmode.max_rx_pkt_len;
> > +	u32 max_pkt_len = txq->data->dev_conf.rxmode.max_rx_pkt_len;
> >  
> >  	/* Reject xmit if queue is stopped */
> >  	if (unlikely(txq->flags & EQ_STOPPED))
> > @@ -1115,7 +1115,7 @@ int t4_eth_xmit(struct sge_eth_txq *txq, struct rte_mbuf *mbuf,
> >  	    (unlikely(m->pkt_len > max_pkt_len)))
> >  		goto out_free;
> >  
> > -	pi = (struct port_info *)txq->eth_dev->data->dev_private;
> > +	pi = (struct port_info *)txq->data->dev_private;
> >  	adap = pi->adapter;
> >  
> >  	cntrl = F_TXPKT_L4CSUM_DIS | F_TXPKT_IPCSUM_DIS;
> > @@ -1997,6 +1997,7 @@ int t4_sge_alloc_eth_txq(struct adapter *adap, struct sge_eth_txq *txq,
> >  	txq->stats.mapping_err = 0;
> >  	txq->flags |= EQ_STOPPED;
> >  	txq->eth_dev = eth_dev;
> > +	txq->data = eth_dev->data;
> >  	t4_os_lock_init(&txq->txq_lock);
> >  	return 0;
> >  }
> > 
> 

Thanks,
Rahul