From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr0-f174.google.com (mail-wr0-f174.google.com [209.85.128.174]) by dpdk.org (Postfix) with ESMTP id BEACF1041 for ; Fri, 15 Sep 2017 09:31:21 +0200 (CEST) Received: by mail-wr0-f174.google.com with SMTP id r74so1058173wrb.13 for ; Fri, 15 Sep 2017 00:31:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-transfer-encoding:content-language; bh=BBhWUjX+WUnsJ/cH1VWBu9eAmDcYJnVKWPxAO+4s6RI=; b=tLwRLMtPBDYlGZA/wGloLxtIN8+lkRhmPvIBjHIG8tyVQqaPA8rJ84KzC3EkG3d7xg y+fNLlObkOEKC1XXPiYbci3I1fNyiVfFlj66fsiZ7O+7FkjFbGNEJZArzPThEswjrI44 zcnMO6S6+YKtfRvV9KeSyn1Firqwk6mPYveBs2nzsR3hJcxS6XKnp+Q5HMQp18uusr92 kkmfghrfLV8wRobkqyOlV/9/U2rg+zaUw1GLpb1UyNMJY/wvkhLs6g5M8p2R5psUBVoD IPWS6vQj+LRouhw4Coxw0BH7HgXnKl6CIudiXmxqYU8hEuqSpGv9F5URLcxhmPS3MCV9 MIog== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-transfer-encoding :content-language; bh=BBhWUjX+WUnsJ/cH1VWBu9eAmDcYJnVKWPxAO+4s6RI=; b=AipomI0dpYfEH+T9yxSmNILfzphKW3J9/FjiYgBVNKvTkxCXFwUlZLlS5Sy67pray9 fBZKleT0i1vdMWjkYOE7RF8xKcUNeZA/h3TsoWqCTNzBbjt9PIp0mRyjbdHAnXsweQVa g7AOWey4dRQRlf2jAaMoTCZdZJeA0sBEd3E3OwguqodiS2y0vJT30KgzNJ4oj+9yq5YV uLtXrIWEJS1M/7E9doQl8srJTQjmBLK9s2xY3NIg9H0j9gmSnM7SIFxjzikQxNk/xuCL Tk+/YxLMycIWlwnFd6Xxp9VFSnHp4A/qxRL6fNbio2XEUFG0WxJ3owBe0AGlIYzGpNR0 0KKQ== X-Gm-Message-State: AHPjjUhrSRWAc7YLgoMQAe7U21JLv41dqEGIwL/I62mLwTOTJMTd58Dm rtD0gwf5TYJOpA89 X-Google-Smtp-Source: ADKCNb5MLCxDn4YDEatDQHm6Oi1H2k1Gj/Ie5/4ZaRbDXe+bCdnmfn+3TUPwjTF+gahWf2ujXGPQ0Q== X-Received: by 10.223.151.139 with SMTP id s11mr22304490wrb.237.1505460680864; Fri, 15 Sep 2017 00:31:20 -0700 (PDT) Received: from [192.168.1.79] (82.107.69.91.rev.sfr.net. [91.69.107.82]) by smtp.gmail.com with ESMTPSA id h33sm173962wrh.70.2017.09.15.00.31.19 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 15 Sep 2017 00:31:19 -0700 (PDT) To: Ophir Munk Cc: Adrien Mazarguil , dev@dpdk.org, Thomas Monjalon , Olga Shern , stable@dpdk.org References: <1505386126-29546-1-git-send-email-ophirmu@mellanox.com> <1505401645-19008-1-git-send-email-ophirmu@mellanox.com> From: Pascal Mazon Message-ID: <0e18a611-3b35-6624-013b-59c72f71307b@6wind.com> Date: Fri, 15 Sep 2017 09:31:19 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0 MIME-Version: 1.0 In-Reply-To: <1505401645-19008-1-git-send-email-ophirmu@mellanox.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Content-Language: en-US Subject: Re: [dpdk-dev] [PATCH v2] net/tap: fix flow and port commands X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 Sep 2017 07:31:21 -0000 Hi, The patch looks good globally, but I have a few comments/requests, inline. Best regards, Pascal On 14/09/2017 17:07, Ophir Munk wrote: > This commit fixes two bugs related to tap devices. The first bug occurs > when executing in testpmd the following flow rule assuming tap device has > 4 rx and tx pair queues > "flow create 0 ingress pattern eth / end actions queue index 5 / end" > This command will report on success and will print ""Flow rule #0 created" > although it should have failed as queue index number 5 does not exist > > The second bug occurs when executing in testpmd "port start all" following > a port configuration. Assuming 1 pair of rx and tx queues an error is > reported: "Fail to start port 0" > > Before this commit a fixed max number (16) of rx and tx queue pairs were > created on startup where the file descriptors (fds) of rx and tx pairs were > identical. As a result in the first bug queue index 5 existed because the > tap device was created with 16 rx and tx queue pairs regardless of the > configured number of queues. In the second bug when tap device was started > tx fd was closed before opening it and executing ioctl() on it. However > closing the sole fd of the device caused ioctl to fail with "No such > device". > > This commit creates the configured number of rx and tx queue pairs (up to > max 16) and assigns a unique fd to each queue. It was written to solve the > first bug and was found as the right fix for the second bug as well. > > Fixes: 02f96a0a82d1 ("net/tap: add TUN/TAP device PMD") > Fixes: bf7b7f437b49 ("net/tap: create netdevice during probing") > Fixes: de96fe68ae95 ("net/tap: add basic flow API patterns and actions") > Cc: stable@dpdk.org > > Signed-off-by: Ophir Munk > --- > Comment 1: > Reason for V2: fixing coding style. Previous versions of this patch are supersseded > > Comment 2: > Pascal - I would appreciate your feedback on the following discussion: > >>> /* Immediately create the netdevice (this will create the 1st >>> queue). */ >>> - if (tap_setup_queue(dev, pmd, 0) == -1) >>> + // rx queue >>> + if (tap_setup_queue(dev, pmd, 0, 1) == -1) goto error_exit; // tx >>> + queue if (tap_setup_queue(dev, pmd, 0, 0) == -1) >>> goto error_exit; >>> >> Adrien Mazarguil wrote: >> Can this be removed entirely to instead wait until the application >> actually requests the creation of the first queue? > Ophir Munk wrote: > Removing this piece of code is causing failure during tap device initialization and testpmd as well. > The subsequent code following this removal contains hundreds of lines so I prefer getting Pasca advise why the immediate creation is required here and if possible - how to avoid it. I think it was required to have the netdevice present as soon as we left the create function, as some other function occurred just a little later (prior to queue_setup) that needed ioctl() on the netdevice. I would leave it like this for this patch, but it would be more elegant to create the netdevice only when really needed. > > drivers/net/tap/rte_eth_tap.c | 130 ++++++++++++++++++++++++++++++++---------- > drivers/net/tap/rte_eth_tap.h | 1 + > drivers/net/tap/tap_flow.c | 3 +- > 3 files changed, 104 insertions(+), 30 deletions(-) > > diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c > index 9acea83..f8cf330 100644 > --- a/drivers/net/tap/rte_eth_tap.c > +++ b/drivers/net/tap/rte_eth_tap.c > @@ -603,8 +603,31 @@ enum ioctl_mode { > } > > static int > -tap_dev_configure(struct rte_eth_dev *dev __rte_unused) > +tap_dev_configure(struct rte_eth_dev *dev) > { > + if (dev->data->nb_rx_queues > RTE_PMD_TAP_MAX_QUEUES) { > + RTE_LOG(ERR, PMD, > + "%s: number of rx queues %d exceeds max num of queues %d\n", > + dev->device->name, > + dev->data->nb_rx_queues, > + RTE_PMD_TAP_MAX_QUEUES); > + return -1; > + } > + if (dev->data->nb_tx_queues > RTE_PMD_TAP_MAX_QUEUES) { > + RTE_LOG(ERR, PMD, > + "%s: number of tx queues %d exceeds max num of queues %d\n", > + dev->device->name, > + dev->data->nb_tx_queues, > + RTE_PMD_TAP_MAX_QUEUES); > + return -1; > + } > + > + RTE_LOG(INFO, PMD, "%s: %p: TX queues number update: 0 -> %u\n", > + dev->device->name, (void *)dev, dev->data->nb_tx_queues); > + > + RTE_LOG(INFO, PMD, "%s: %p: RX queues number update: 0 -> %u\n", > + dev->device->name, (void *)dev, dev->data->nb_rx_queues); > + > return 0; > } tap_dev_configure() can be called when restarting a port, I'm not so sure the number update changes number of queues from "0". Why not a simple "TX/RX configured queues number: %u"? > @@ -673,9 +696,9 @@ enum ioctl_mode { > unsigned long rx_nombuf = 0, ierrors = 0; > const struct pmd_internals *pmd = dev->data->dev_private; > > - imax = (pmd->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? > - pmd->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS; > - > + /* rx queue statistics */ > + imax = (dev->data->nb_rx_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? > + dev->data->nb_rx_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS; > for (i = 0; i < imax; i++) { > tap_stats->q_ipackets[i] = pmd->rxq[i].stats.ipackets; > tap_stats->q_ibytes[i] = pmd->rxq[i].stats.ibytes; > @@ -683,7 +706,13 @@ enum ioctl_mode { > rx_bytes_total += tap_stats->q_ibytes[i]; > rx_nombuf += pmd->rxq[i].stats.rx_nombuf; > ierrors += pmd->rxq[i].stats.ierrors; > + } > + > + /* tx queue statistics */ > + imax = (dev->data->nb_tx_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? > + dev->data->nb_tx_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS; > > + for (i = 0; i < imax; i++) { > tap_stats->q_opackets[i] = pmd->txq[i].stats.opackets; > tap_stats->q_errors[i] = pmd->txq[i].stats.errs; > tap_stats->q_obytes[i] = pmd->txq[i].stats.obytes; > @@ -730,10 +759,14 @@ enum ioctl_mode { > tap_flow_implicit_flush(internals, NULL); > > for (i = 0; i < internals->nb_queues; i++) { > - if (internals->rxq[i].fd != -1) > + if (internals->rxq[i].fd != -1) { > close(internals->rxq[i].fd); > - internals->rxq[i].fd = -1; > - internals->txq[i].fd = -1; > + internals->rxq[i].fd = -1; > + } > + if (internals->txq[i].fd != -1) { > + close(internals->txq[i].fd); > + internals->txq[i].fd = -1; > + } > } > > if (internals->remote_if_index) { > @@ -887,30 +920,57 @@ enum ioctl_mode { > static int > tap_setup_queue(struct rte_eth_dev *dev, > struct pmd_internals *internals, > - uint16_t qid) > + uint16_t qid, > + int is_rx) > { > + int *fd; > + int *other_fd; > + const char *dir; > struct pmd_internals *pmd = dev->data->dev_private; > struct rx_queue *rx = &internals->rxq[qid]; > struct tx_queue *tx = &internals->txq[qid]; > - int fd = rx->fd == -1 ? tx->fd : rx->fd; > > - if (fd == -1) { > - RTE_LOG(INFO, PMD, "Add queue to TAP %s for qid %d\n", > - pmd->name, qid); > - fd = tun_alloc(pmd); > - if (fd < 0) { > + if (is_rx) { > + fd = &rx->fd; > + other_fd = &tx->fd; > + dir = "rx"; > + } else { > + fd = &tx->fd; > + other_fd = &rx->fd; > + dir = "tx"; > + } > + if (*fd != -1) { > + /* fd for this queue already exists */ > + RTE_LOG(DEBUG, PMD, "%s: fd %d for %s queue qid %d exists\n", > + pmd->name, *fd, dir, qid); > + } else if (*other_fd != -1) { > + /* Only other_fd exists. dup it */ > + *fd = dup(*other_fd); > + if (*fd < 0) { > + *fd = -1; > + RTE_LOG(ERR, PMD, "%s: dup() failed.\n", > + pmd->name); > + return -1; > + } > + RTE_LOG(DEBUG, PMD, "%s: dup fd %d for %s queue qid %d (%d)\n", > + pmd->name, *other_fd, dir, qid, *fd); > + } else { > + /* Both RX and TX fds do not exist (equal -1). Create fd */ > + *fd = tun_alloc(pmd); > + if (*fd < 0) { > + *fd = -1; /* restore original value */ > RTE_LOG(ERR, PMD, "%s: tun_alloc() failed.\n", > pmd->name); > return -1; > } > + RTE_LOG(DEBUG, PMD, "%s: add %s queue for qid %d fd %d\n", > + pmd->name, dir, qid, *fd); > } > > - rx->fd = fd; > - tx->fd = fd; > tx->mtu = &dev->data->mtu; > rx->rxmode = &dev->data->dev_conf.rxmode; > > - return fd; > + return *fd; > } > > static int > @@ -932,10 +992,10 @@ enum ioctl_mode { > int fd; > int i; > > - if ((rx_queue_id >= internals->nb_queues) || !mp) { > + if (rx_queue_id >= dev->data->nb_rx_queues || !mp) { > RTE_LOG(WARNING, PMD, > - "nb_queues %d too small or mempool NULL\n", > - internals->nb_queues); > + "nb_rx_queues %d too small or mempool NULL\n", > + dev->data->nb_rx_queues); > return -1; > } > > @@ -954,7 +1014,7 @@ enum ioctl_mode { > rxq->iovecs = iovecs; > > dev->data->rx_queues[rx_queue_id] = rxq; > - fd = tap_setup_queue(dev, internals, rx_queue_id); > + fd = tap_setup_queue(dev, internals, rx_queue_id, 1); > if (fd == -1) { > ret = fd; > goto error; > @@ -1002,11 +1062,11 @@ enum ioctl_mode { > struct pmd_internals *internals = dev->data->dev_private; > int ret; > > - if (tx_queue_id >= internals->nb_queues) > + if (tx_queue_id >= dev->data->nb_tx_queues) > return -1; > > dev->data->tx_queues[tx_queue_id] = &internals->txq[tx_queue_id]; > - ret = tap_setup_queue(dev, internals, tx_queue_id); > + ret = tap_setup_queue(dev, internals, tx_queue_id, 0); > if (ret == -1) > return -1; > > @@ -1166,7 +1226,6 @@ enum ioctl_mode { > .filter_ctrl = tap_dev_filter_ctrl, > }; > > - > static int > eth_dev_tap_create(struct rte_vdev_device *vdev, char *tap_name, > char *remote_iface, int fixed_mac_type) > @@ -1193,6 +1252,7 @@ enum ioctl_mode { > } > > pmd = dev->data->dev_private; > + pmd->dev = dev; You use pmd->dev only to access dev->nb_rx_queues. I would suggest instead to add *nb_rx_queues in struct pmd_internals, and remove *dev from it. You can even remove nb_queues from pmd_internals, and only use nb_rx_queues and RTE_PMD_TAP_MAX_QUEUES where relevant. > snprintf(pmd->name, sizeof(pmd->name), "%s", tap_name); > pmd->nb_queues = RTE_PMD_TAP_MAX_QUEUES; > > @@ -1212,8 +1272,9 @@ enum ioctl_mode { > > data->dev_link = pmd_link; > data->mac_addrs = &pmd->eth_addr; > - data->nb_rx_queues = pmd->nb_queues; > - data->nb_tx_queues = pmd->nb_queues; > + /* Set the number of RX and TX queues */ > + data->nb_rx_queues = 0; > + data->nb_tx_queues = 0; > > dev->data = data; > dev->dev_ops = &ops; > @@ -1241,7 +1302,11 @@ enum ioctl_mode { > } > > /* Immediately create the netdevice (this will create the 1st queue). */ > - if (tap_setup_queue(dev, pmd, 0) == -1) > + /* rx queue */ > + if (tap_setup_queue(dev, pmd, 0, 1) == -1) > + goto error_exit; > + /* tx queue */ > + if (tap_setup_queue(dev, pmd, 0, 0) == -1) > goto error_exit; > > ifr.ifr_mtu = dev->data->mtu; > @@ -1515,9 +1580,16 @@ enum ioctl_mode { > tap_flow_implicit_flush(internals, NULL); > nl_final(internals->nlsk_fd); > } > - for (i = 0; i < internals->nb_queues; i++) > - if (internals->rxq[i].fd != -1) > + for (i = 0; i < internals->nb_queues; i++) { > + if (internals->rxq[i].fd != -1) { > close(internals->rxq[i].fd); > + internals->rxq[i].fd = -1; > + } > + if (internals->txq[i].fd != -1) { > + close(internals->txq[i].fd); > + internals->txq[i].fd = -1; > + } > + } > > close(internals->ioctl_sock); > rte_free(eth_dev->data->dev_private); > diff --git a/drivers/net/tap/rte_eth_tap.h b/drivers/net/tap/rte_eth_tap.h > index 928a045..be57cc7 100644 > --- a/drivers/net/tap/rte_eth_tap.h > +++ b/drivers/net/tap/rte_eth_tap.h > @@ -80,6 +80,7 @@ struct tx_queue { > }; > > struct pmd_internals { > + struct rte_eth_dev *dev; /* Ethernet device. */ > char remote_iface[RTE_ETH_NAME_MAX_LEN]; /* Remote netdevice name */ > char name[RTE_ETH_NAME_MAX_LEN]; /* Internal Tap device name */ > uint16_t nb_queues; /* Number of queues supported */ See comment above. > diff --git a/drivers/net/tap/tap_flow.c b/drivers/net/tap/tap_flow.c > index 41f7345..eefa868 100644 > --- a/drivers/net/tap/tap_flow.c > +++ b/drivers/net/tap/tap_flow.c > @@ -1092,7 +1092,8 @@ struct tap_flow_items { > if (action) > goto exit_action_not_supported; > action = 1; > - if (!queue || (queue->index >= pmd->nb_queues)) > + if (!queue || > + (queue->index > pmd->dev->data->nb_rx_queues - 1)) > goto exit_action_not_supported; > if (flow) > err = add_action_skbedit(flow, queue->index);