From: Alejandro Lucero <alejandro.lucero@netronome.com>
To: Ferruh Yigit <ferruh.yigit@intel.com>
Cc: dev <dev@dpdk.org>, Thomas Monjalon <thomas@monjalon.net>,
Andrew Rybchenko <arybchenko@solarflare.com>,
"Burakov, Anatoly" <anatoly.burakov@intel.com>,
Bruce Richardson <bruce.richardson@intel.com>,
John Daley <johndale@cisco.com>,
Shahaf Shuler <shahafs@mellanox.com>,
Adrien Mazarguil <adrien.mazarguil@6wind.com>,
"Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
Stephen Hemminger <stephen@networkplumber.org>,
Qi Zhang <qi.z.zhang@intel.com>,
Jerin Jacob <jerin.jacob@caviumnetworks.com>
Subject: Re: [dpdk-dev] [PATCH] net/nfp: add multiprocess support
Date: Wed, 28 Nov 2018 18:05:22 +0000 [thread overview]
Message-ID: <CAD+H991ypZ-RZBsVuXaD3-OtEvcXjiTuUKTGCp9pNQ=Y_9vjJw@mail.gmail.com> (raw)
In-Reply-To: <56a0dbab-20e1-9a46-bbfe-3081dea0744d@intel.com>
On Wed, Nov 28, 2018 at 5:28 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> On 11/28/2018 11:32 AM, Alejandro Lucero wrote:
> > This patch introduces changes for supporting multiprocess support.
> > This is trivial for VFs but comes with some limitations for the PF.
> >
> > Due to restrictions when using NFP CPP interface, just one secondary
> > process is supported by now for the PF.
>
> Hi Alejandro,
>
>
Hi Ferruh,
> I think we needs to clarify first what is the multiprocess support
> required in
> PMD level, there are a few different implementations. cc'ed a few more
> folks for
> helping clarification.
>
> According my understanding,
> Only *one* DPDK application (primary or secondary) should control a device.
> There is no restriction in DPDK for it, this responsibility is pushed to
> application, application should manage it.
>
>
Right.
> Device initialization (probe()) done only by primary application.
>
>
I think it could also be possible to have a secondary process where a
device is hotplugged, so the probe is done by the secondary and shared with
the primary.
> Because of above two items, we only need RTE_PROC_PRIMARY check in driver
> probe() but not in other eth_dev_ops.
> If we need eth_dev_ops should be protected, I believe that should be done
> in
> ethdev layer for all PMDs instead of each (some) PMD does it itself.
>
>
Right. But NFP requires to access the device through the NFP CPP interface
for things like setting the link up or down. If the secondary starts the
port, it will use the CPP interface by itself. Because the CPP interface is
not shared between processes by now but each one gets its own interface,
the way the link is set requires to know which process type, primary or
secondary, is doing it.
I think to differentiate between primary and secondaries just in the probe
function is fine ... except in our case. The way this NFP PMD multiprocess
support is implemented has some limitations what I will try to overcome in
the near future sharing the CPP interface (if possible). This will not be a
real sharing, because it will get complex easily for supporting unlimited
secondary processes, but a communication between primary and secondaries
where just the primary will use the CPP interface, for itself and on behalf
of the secondary ones. Implementing this is not trivial, and I want to have
basic multiprocess support by now for at least a primary and a secondary
because:
1) Apps like dpdk-procinfo are being used for getting information about
running DPDK apps.
2) I plan to submit another patch implementing a CPP bridge inside the PMD
using the service layer. This will allow user space tools accessing the NFP
for development and debugging. The CPP bridge will be handle by a secondary
process with the only purpose of giving this service when another DPDK app,
the primary process, is running.
> Following should be possible:
> 1)
> - Primary starts with 5 ports and configures them
> - Start 5 secondary apps in a way each control a specific port data path
>
>
With the path submitted, we will only support one secondary which could
control one or any port started by the primary.
Of course, I want to support the full case, but as I have said, doing this
is not trivial, and basic support of just one secondary is a priority now.
> 2)
> - Primary start with N ports
> - A secondary per port configure the port and control data path
>
>
We can support that.
>
> Thanks,
> ferruh
>
>
After your concerns, I think I should sent another version updating the NFP
guide with this support and the current limitations.
Thanks.
>
>
> >
> > Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
> > ---
> > doc/guides/nics/features/nfp.ini | 1 +
> > doc/guides/nics/features/nfp_vf.ini | 1 +
> > drivers/net/nfp/nfp_net.c | 133 +++++++++++++--------
> > drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c | 103 ++++++++++++++--
> > 4 files changed, 177 insertions(+), 61 deletions(-)
> >
> > diff --git a/doc/guides/nics/features/nfp.ini
> b/doc/guides/nics/features/nfp.ini
> > index d2899e7fb..70297b090 100644
> > --- a/doc/guides/nics/features/nfp.ini
> > +++ b/doc/guides/nics/features/nfp.ini
> > @@ -24,5 +24,6 @@ Basic stats = Y
> > Stats per queue = Y
> > Linux UIO = Y
> > Linux VFIO = Y
> > +Multiprocess aware = Y
> > x86-64 = Y
> > Usage doc = Y
> > diff --git a/doc/guides/nics/features/nfp_vf.ini
> b/doc/guides/nics/features/nfp_vf.ini
> > index d2899e7fb..70297b090 100644
> > --- a/doc/guides/nics/features/nfp_vf.ini
> > +++ b/doc/guides/nics/features/nfp_vf.ini
> > @@ -24,5 +24,6 @@ Basic stats = Y
> > Stats per queue = Y
> > Linux UIO = Y
> > Linux VFIO = Y
> > +Multiprocess aware = Y
> > x86-64 = Y
> > Usage doc = Y
> > diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
> > index 54c6da924..ffef97d80 100644
> > --- a/drivers/net/nfp/nfp_net.c
> > +++ b/drivers/net/nfp/nfp_net.c
> > @@ -766,9 +766,12 @@ nfp_net_start(struct rte_eth_dev *dev)
> > goto error;
> > }
> >
> > - if (hw->is_pf)
> > + if (hw->is_pf && rte_eal_process_type() == RTE_PROC_PRIMARY)
> > /* Configure the physical port up */
> > nfp_eth_set_configured(hw->cpp, hw->pf_port_idx, 1);
> > + else
> > + nfp_eth_set_configured(dev->process_private,
> > + hw->pf_port_idx, 1);
> >
> > hw->ctrl = new_ctrl;
> >
> > @@ -817,9 +820,12 @@ nfp_net_stop(struct rte_eth_dev *dev)
> > (struct nfp_net_rxq *)dev->data->rx_queues[i]);
> > }
> >
> > - if (hw->is_pf)
> > + if (hw->is_pf && rte_eal_process_type() == RTE_PROC_PRIMARY)
> > /* Configure the physical port down */
> > nfp_eth_set_configured(hw->cpp, hw->pf_port_idx, 0);
> > + else
> > + nfp_eth_set_configured(dev->process_private,
> > + hw->pf_port_idx, 0);
> > }
> >
> > /* Reset and stop device. The device can not be restarted. */
> > @@ -2918,16 +2924,16 @@ nfp_net_init(struct rte_eth_dev *eth_dev)
> > hw->mac_addr[0], hw->mac_addr[1], hw->mac_addr[2],
> > hw->mac_addr[3], hw->mac_addr[4], hw->mac_addr[5]);
> >
> > - /* Registering LSC interrupt handler */
> > - rte_intr_callback_register(&pci_dev->intr_handle,
> > - nfp_net_dev_interrupt_handler,
> > - (void *)eth_dev);
> > -
> > - /* Telling the firmware about the LSC interrupt entry */
> > - nn_cfg_writeb(hw, NFP_NET_CFG_LSC, NFP_NET_IRQ_LSC_IDX);
> > -
> > - /* Recording current stats counters values */
> > - nfp_net_stats_reset(eth_dev);
> > + if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > + /* Registering LSC interrupt handler */
> > + rte_intr_callback_register(&pci_dev->intr_handle,
> > + nfp_net_dev_interrupt_handler,
> > + (void *)eth_dev);
> > + /* Telling the firmware about the LSC interrupt entry */
> > + nn_cfg_writeb(hw, NFP_NET_CFG_LSC, NFP_NET_IRQ_LSC_IDX);
> > + /* Recording current stats counters values */
> > + nfp_net_stats_reset(eth_dev);
> > + }
> >
> > return 0;
> >
> > @@ -2947,7 +2953,7 @@ nfp_pf_create_dev(struct rte_pci_device *dev, int
> port, int ports,
> > struct rte_eth_dev *eth_dev;
> > struct nfp_net_hw *hw;
> > char *port_name;
> > - int ret;
> > + int retval;
> >
> > port_name = rte_zmalloc("nfp_pf_port_name", 100, 0);
> > if (!port_name)
> > @@ -2958,51 +2964,76 @@ nfp_pf_create_dev(struct rte_pci_device *dev,
> int port, int ports,
> > else
> > sprintf(port_name, "%s", dev->device.name);
> >
> > - eth_dev = rte_eth_dev_allocate(port_name);
> > - if (!eth_dev)
> > - return -ENOMEM;
> >
> > - if (port == 0) {
> > - *priv = rte_zmalloc(port_name,
> > - sizeof(struct nfp_net_adapter) * ports,
> > - RTE_CACHE_LINE_SIZE);
> > - if (!*priv) {
> > - rte_eth_dev_release_port(eth_dev);
> > - return -ENOMEM;
> > + if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > + eth_dev = rte_eth_dev_allocate(port_name);
> > + if (!eth_dev) {
> > + rte_free(port_name);
> > + return -ENODEV;
> > }
> > - }
> > -
> > - eth_dev->data->dev_private = *priv;
> > -
> > - /*
> > - * dev_private pointing to port0 dev_private because we need
> > - * to configure vNIC bars based on port0 at nfp_net_init.
> > - * Then dev_private is adjusted per port.
> > - */
> > - hw = (struct nfp_net_hw *)(eth_dev->data->dev_private) + port;
> > - hw->cpp = cpp;
> > - hw->hwinfo = hwinfo;
> > - hw->sym_tbl = sym_tbl;
> > - hw->pf_port_idx = phys_port;
> > - hw->is_pf = 1;
> > - if (ports > 1)
> > - hw->pf_multiport_enabled = 1;
> > + if (port == 0) {
> > + *priv = rte_zmalloc(port_name,
> > + sizeof(struct nfp_net_adapter)
> *
> > + ports, RTE_CACHE_LINE_SIZE);
> > + if (!*priv) {
> > + rte_free(port_name);
> > + rte_eth_dev_release_port(eth_dev);
> > + return -ENOMEM;
> > + }
> > + }
> > + eth_dev->data->dev_private = *priv;
> >
> > - hw->total_ports = ports;
> > + /*
> > + * dev_private pointing to port0 dev_private because we
> need
> > + * to configure vNIC bars based on port0 at nfp_net_init.
> > + * Then dev_private is adjusted per port.
> > + */
> > + hw = (struct nfp_net_hw *)(eth_dev->data->dev_private) +
> port;
> > + hw->cpp = cpp;
> > + hw->hwinfo = hwinfo;
> > + hw->sym_tbl = sym_tbl;
> > + hw->pf_port_idx = phys_port;
> > + hw->is_pf = 1;
> > + if (ports > 1)
> > + hw->pf_multiport_enabled = 1;
> > +
> > + hw->total_ports = ports;
> > + } else {
> > + eth_dev = rte_eth_dev_attach_secondary(port_name);
> > + if (!eth_dev) {
> > + RTE_LOG(ERR, EAL, "secondary process attach
> failed, "
> > + "ethdev doesn't exist");
> > + rte_free(port_name);
> > + return -ENODEV;
> > + }
> > + eth_dev->process_private = cpp;
> > + }
> >
> > eth_dev->device = &dev->device;
> > rte_eth_copy_pci_info(eth_dev, dev);
> >
> > - ret = nfp_net_init(eth_dev);
> > + retval = nfp_net_init(eth_dev);
> >
> > - if (ret)
> > - rte_eth_dev_release_port(eth_dev);
> > - else
> > + if (retval) {
> > + retval = -ENODEV;
> > + goto probe_failed;
> > + } else {
> > rte_eth_dev_probing_finish(eth_dev);
> > + }
> >
> > rte_free(port_name);
> >
> > - return ret;
> > + return retval;
> > +
> > +probe_failed:
> > + rte_free(port_name);
> > + /* free ports private data if primary process */
> > + if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> > + rte_free(eth_dev->data->dev_private);
> > +
> > + rte_eth_dev_release_port(eth_dev);
> > +
> > + return retval;
> > }
> >
> > #define DEFAULT_FW_PATH "/lib/firmware/netronome"
> > @@ -3180,10 +3211,12 @@ static int nfp_pf_pci_probe(struct
> rte_pci_driver *pci_drv __rte_unused,
> > return -EIO;
> > }
> >
> > - if (nfp_fw_setup(dev, cpp, nfp_eth_table, hwinfo)) {
> > - PMD_DRV_LOG(INFO, "Error when uploading firmware");
> > - ret = -EIO;
> > - goto error;
> > + if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > + if (nfp_fw_setup(dev, cpp, nfp_eth_table, hwinfo)) {
> > + PMD_DRV_LOG(INFO, "Error when uploading firmware");
> > + ret = -EIO;
> > + goto error;
> > + }
> > }
> >
> > /* Now the symbol table should be there */
> > diff --git a/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c
> b/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c
> > index c68d9400f..39bd48a83 100644
> > --- a/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c
> > +++ b/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c
> > @@ -112,6 +112,7 @@ struct nfp_pcie_user {
> >
> > int device;
> > int lock;
> > + int secondary_lock;
> > char busdev[BUSDEV_SZ];
> > int barsz;
> > char *cfg;
> > @@ -290,14 +291,32 @@ nfp_reconfigure_bar(struct nfp_pcie_user *nfp,
> struct nfp_bar *bar, int tgt,
> > * already mapped.
> > *
> > * BAR0.0: Reserved for General Mapping (for MSI-X access to PCIe SRAM)
> > + *
> > + * Halving PCItoCPPBars for primary and secondary processes.
> > + * NFP PMD just requires two fixed slots, one for configuration
> BAR,
> > + * and another for accessing the hw queues. Another slot is
> needed
> > + * for setting the link up or down. Secondary processes do not
> need
> > + * to map the first two slots again, but it requires one slot
> for
> > + * accessing the link, even if it is not likely the secondary
> process
> > + * starting the port. This implies a limit of secondary
> processes
> > + * supported. Due to this requirement and future extensions
> requiring
> > + * new slots per process, only one secondary process is
> supported by
> > + * now.
> > */
> > static int
> > nfp_enable_bars(struct nfp_pcie_user *nfp)
> > {
> > struct nfp_bar *bar;
> > - int x;
> > + int x, start, end;
> >
> > - for (x = ARRAY_SIZE(nfp->bar); x > 0; x--) {
> > + if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > + start = 4;
> > + end = 1;
> > + } else {
> > + start = 7;
> > + end = 4;
> > + }
> > + for (x = start; x > end; x--) {
> > bar = &nfp->bar[x - 1];
> > bar->barcfg = 0;
> > bar->nfp = nfp;
> > @@ -320,9 +339,16 @@ static struct nfp_bar *
> > nfp_alloc_bar(struct nfp_pcie_user *nfp)
> > {
> > struct nfp_bar *bar;
> > - int x;
> > + int x, start, end;
> >
> > - for (x = ARRAY_SIZE(nfp->bar); x > 0; x--) {
> > + if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > + start = 4;
> > + end = 1;
> > + } else {
> > + start = 7;
> > + end = 4;
> > + }
> > + for (x = start; x > end; x--) {
> > bar = &nfp->bar[x - 1];
> > if (!bar->lock) {
> > bar->lock = 1;
> > @@ -336,9 +362,17 @@ static void
> > nfp_disable_bars(struct nfp_pcie_user *nfp)
> > {
> > struct nfp_bar *bar;
> > - int x;
> > + int x, start, end;
> > +
> > + if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > + start = 4;
> > + end = 1;
> > + } else {
> > + start = 7;
> > + end = 4;
> > + }
> >
> > - for (x = ARRAY_SIZE(nfp->bar); x > 0; x--) {
> > + for (x = start; x > end; x--) {
> > bar = &nfp->bar[x - 1];
> > if (bar->iomem) {
> > bar->iomem = NULL;
> > @@ -633,6 +667,47 @@ nfp_acquire_process_lock(struct nfp_pcie_user *desc)
> > return 0;
> > }
> >
> > +static int
> > +nfp_acquire_secondary_process_lock(struct nfp_pcie_user *desc)
> > +{
> > + int rc;
> > + struct flock lock;
> > + const char *lockname = "/.lock_nfp_secondary";
> > + char *home_path;
> > + char *lockfile;
> > +
> > + memset(&lock, 0, sizeof(lock));
> > +
> > + /*
> > + * Using user's home directory. Note this can be called in a DPDK
> app
> > + * being executed as non-root. This is not the case for the
> previous
> > + * function nfp_acquire_process_lock which is invoked only when UIO
> > + * driver is used because that implies root user.
> > + */
> > + home_path = getenv("HOME");
> > + lockfile = calloc(strlen(home_path) + strlen(lockname) + 1,
> > + sizeof(char));
> > +
> > + strcat(lockfile, home_path);
> > + strcat(lockfile, "/.lock_nfp_secondary");
> > + desc->secondary_lock = open(lockfile, O_RDWR | O_CREAT |
> O_NONBLOCK,
> > + 0666);
> > + if (desc->secondary_lock < 0) {
> > + RTE_LOG(ERR, PMD, "NFP lock for secondary process
> failed\n");
> > + return desc->secondary_lock;
> > + }
> > +
> > + lock.l_type = F_WRLCK;
> > + lock.l_whence = SEEK_SET;
> > + rc = fcntl(desc->secondary_lock, F_SETLK, &lock);
> > + if (rc < 0) {
> > + RTE_LOG(ERR, PMD, "NFP lock for secondary process
> failed\n");
> > + close(desc->secondary_lock);
> > + }
> > +
> > + return rc;
> > +}
> > +
> > static int
> > nfp6000_set_model(struct rte_pci_device *dev, struct nfp_cpp *cpp)
> > {
> > @@ -774,7 +849,6 @@ static int
> > nfp6000_init(struct nfp_cpp *cpp, struct rte_pci_device *dev)
> > {
> > int ret = 0;
> > - uint32_t model;
> > struct nfp_pcie_user *desc;
> >
> > desc = malloc(sizeof(*desc));
> > @@ -785,12 +859,20 @@ nfp6000_init(struct nfp_cpp *cpp, struct
> rte_pci_device *dev)
> > memset(desc->busdev, 0, BUSDEV_SZ);
> > strlcpy(desc->busdev, dev->device.name, sizeof(desc->busdev));
> >
> > - if (cpp->driver_lock_needed) {
> > + if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
> > + cpp->driver_lock_needed) {
> > ret = nfp_acquire_process_lock(desc);
> > if (ret)
> > return -1;
> > }
> >
> > + /* Just support for one secondary process */
> > + if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > + ret = nfp_acquire_secondary_process_lock(desc);
> > + if (ret)
> > + return -1;
> > + }
> > +
> > if (nfp6000_set_model(dev, cpp) < 0)
> > return -1;
> > if (nfp6000_set_interface(dev, cpp) < 0)
> > @@ -806,9 +888,6 @@ nfp6000_init(struct nfp_cpp *cpp, struct
> rte_pci_device *dev)
> >
> > nfp_cpp_priv_set(cpp, desc);
> >
> > - model = __nfp_cpp_model_autodetect(cpp);
> > - nfp_cpp_model_set(cpp, model);
> > -
> > return ret;
> > }
> >
> > @@ -820,6 +899,8 @@ nfp6000_free(struct nfp_cpp *cpp)
> > nfp_disable_bars(desc);
> > if (cpp->driver_lock_needed)
> > close(desc->lock);
> > + if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> > + close(desc->secondary_lock);
> > close(desc->device);
> > free(desc);
> > }
> >
>
>
next prev parent reply other threads:[~2018-11-28 18:05 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-28 11:32 Alejandro Lucero
2018-11-28 17:28 ` Ferruh Yigit
2018-11-28 18:05 ` Alejandro Lucero [this message]
2018-11-29 14:05 ` Ferruh Yigit
2018-11-29 14:43 ` Tom Barbette
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAD+H991ypZ-RZBsVuXaD3-OtEvcXjiTuUKTGCp9pNQ=Y_9vjJw@mail.gmail.com' \
--to=alejandro.lucero@netronome.com \
--cc=adrien.mazarguil@6wind.com \
--cc=anatoly.burakov@intel.com \
--cc=arybchenko@solarflare.com \
--cc=bruce.richardson@intel.com \
--cc=dev@dpdk.org \
--cc=ferruh.yigit@intel.com \
--cc=jerin.jacob@caviumnetworks.com \
--cc=johndale@cisco.com \
--cc=konstantin.ananyev@intel.com \
--cc=qi.z.zhang@intel.com \
--cc=shahafs@mellanox.com \
--cc=stephen@networkplumber.org \
--cc=thomas@monjalon.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).