DPDK patches and discussions
* [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports
@ 2016-01-15 16:18 Ferruh Yigit
  2016-01-15 16:18 ` [dpdk-dev] [RFC 1/3] rte_ctrl_if: add control interface library Ferruh Yigit
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Ferruh Yigit @ 2016-01-15 16:18 UTC
  To: dev

This work makes DPDK ports more visible and allows common Linux tools
to be used to configure them.

The patch is based on KNI but contains only its control functionality.
It does not include any Linux kernel network driver.

With the help of a kernel module (KCP), one virtual Linux network
interface named "dpdk$" is created per DPDK port (dpdk0 and dpdk1 in the
samples below). Control messages sent to these virtual interfaces are
forwarded to DPDK, and the response is sent back to the Linux
application.

The virtual interfaces are created when the DPDK application starts and
are destroyed automatically when it terminates.

Communication between kernel space and DPDK uses a netlink socket.
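
As an illustration of the netlink framing, here is a minimal sketch of
building one message: a netlink header followed by the request payload,
in the way control_interface_nl_send() in patch 1/3 fills NLMSG_DATA().
The names sketch_nl_msg_build() and SKETCH_MAX_PAYLOAD are hypothetical
and exist only for this example. The sketch only builds the buffer, it
does not open a socket or talk to the KCP module.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Assumed payload limit for this sketch only. */
#define SKETCH_MAX_PAYLOAD 1024

/*
 * Allocate a netlink message with room for SKETCH_MAX_PAYLOAD bytes and
 * copy len bytes of payload behind the header. Returns NULL on bad input
 * or allocation failure. The caller releases the result with free().
 */
static struct nlmsghdr *
sketch_nl_msg_build(const void *payload, size_t len, unsigned int pid)
{
	struct nlmsghdr *nlh;

	if (payload == NULL || len > SKETCH_MAX_PAYLOAD)
		return NULL;

	/* The fixed-size buffer always covers header plus payload. */
	assert(NLMSG_SPACE(SKETCH_MAX_PAYLOAD) >= NLMSG_LENGTH(len));

	nlh = calloc(1, NLMSG_SPACE(SKETCH_MAX_PAYLOAD));
	if (nlh == NULL)
		return NULL;

	nlh->nlmsg_len = NLMSG_LENGTH(len); /* header plus payload */
	nlh->nlmsg_pid = pid;               /* sender port ID */
	nlh->nlmsg_flags = 0;

	/* The payload starts right after the aligned header. */
	memcpy(NLMSG_DATA(nlh), payload, len);

	return nlh;
}
```

A caller would point an iovec at the returned buffer and pass it to
sendmsg() on a netlink socket, as the library does with its own static
buffers.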

The implementation is not yet complete. Sample support has been added
for this RFC, and more functionality can be added based on community
response.

This RFC patch supports getting and setting the MAC address and MTU of
DPDK devices, getting statistics from DPDK devices, and a subset of
ethtool commands.
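
In patch 1/3 both groups of requests travel over the same message path
and are told apart by command ID: values above RTE_KCP_REQ_UNKNOWN
(1 << 16) are handled as KCP control requests (MTU, MAC, statistics,
port start/stop), and everything else is treated as an ethtool command.
A minimal sketch of that selection follows. The sketch_* names are
hypothetical and used only for illustration.

```c
#include <assert.h>

/* Mirrors RTE_KCP_REQ_UNKNOWN from rte_ethtool.h in patch 1/3. */
#define SKETCH_REQ_UNKNOWN (1 << 16)

enum sketch_handler {
	SKETCH_HANDLER_ETHTOOL, /* plain ethtool command IDs */
	SKETCH_HANDLER_CONTROL, /* KCP control requests */
};

/* Pick the handler group for a command ID received from the kernel. */
static enum sketch_handler
sketch_select_handler(int cmd_id)
{
	if (cmd_id > SKETCH_REQ_UNKNOWN)
		return SKETCH_HANDLER_CONTROL;

	return SKETCH_HANDLER_ETHTOOL;
}
```

Starting the control request IDs at 1 << 16 keeps them clear of the
small ethtool command numbers, so one integer field can carry both.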

Samples:

$ ifconfig
dpdk0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether 90:e2:ba:0e:49:b8  txqueuelen 1000  (Ethernet)
        RX packets 33  bytes 2058 (2.0 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 33  bytes 2058 (2.0 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

dpdk1: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether 00:1b:21:76:fa:21  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

After some traffic on port 0:

$ ifconfig
dpdk0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether 90:e2:ba:0e:49:77  txqueuelen 1000  (Ethernet)
        RX packets 962  bytes 57798 (56.4 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 962  bytes 57798 (56.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


$ ethtool -i dpdk0
driver: rte_ixgbe_pmd
version: RTE 2.3.0-rc0
firmware-version: 
expansion-rom-version: 
bus-info: 0000:08:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no


$ ip l show dpdk0
25: dpdk0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 90:e2:ba:0e:49:b8 brd ff:ff:ff:ff:ff:ff

$ ip l set dpdk0 addr 90:e2:ba:0e:49:77

$ ip l show dpdk0
25: dpdk0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 90:e2:ba:0e:49:77 brd ff:ff:ff:ff:ff:ff


Ferruh Yigit (3):
  rte_ctrl_if: add control interface library
  kcp: add kernel control path kernel module
  examples/ethtool: add control interface support to the application

 config/common_linuxapp                     |   9 +-
 examples/ethtool/ethtool-app/main.c        |   8 +-
 lib/Makefile                               |   3 +-
 lib/librte_ctrl_if/Makefile                |  58 +++++
 lib/librte_ctrl_if/rte_ctrl_if.c           | 166 ++++++++++++++
 lib/librte_ctrl_if/rte_ctrl_if.h           |  54 +++++
 lib/librte_ctrl_if/rte_ctrl_if_version.map |   9 +
 lib/librte_ctrl_if/rte_ethtool.c           | 354 +++++++++++++++++++++++++++++
 lib/librte_ctrl_if/rte_ethtool.h           |  64 ++++++
 lib/librte_ctrl_if/rte_nl.c                | 263 +++++++++++++++++++++
 lib/librte_ctrl_if/rte_nl.h                |  60 +++++
 lib/librte_eal/common/include/rte_log.h    |   3 +-
 lib/librte_eal/linuxapp/Makefile           |   5 +-
 lib/librte_eal/linuxapp/kcp/Makefile       |  58 +++++
 lib/librte_eal/linuxapp/kcp/kcp_dev.h      |  81 +++++++
 lib/librte_eal/linuxapp/kcp/kcp_ethtool.c  | 261 +++++++++++++++++++++
 lib/librte_eal/linuxapp/kcp/kcp_misc.c     | 282 +++++++++++++++++++++++
 lib/librte_eal/linuxapp/kcp/kcp_net.c      | 209 +++++++++++++++++
 lib/librte_eal/linuxapp/kcp/kcp_nl.c       | 194 ++++++++++++++++
 mk/rte.app.mk                              |   3 +-
 20 files changed, 2138 insertions(+), 6 deletions(-)
 create mode 100644 lib/librte_ctrl_if/Makefile
 create mode 100644 lib/librte_ctrl_if/rte_ctrl_if.c
 create mode 100644 lib/librte_ctrl_if/rte_ctrl_if.h
 create mode 100644 lib/librte_ctrl_if/rte_ctrl_if_version.map
 create mode 100644 lib/librte_ctrl_if/rte_ethtool.c
 create mode 100644 lib/librte_ctrl_if/rte_ethtool.h
 create mode 100644 lib/librte_ctrl_if/rte_nl.c
 create mode 100644 lib/librte_ctrl_if/rte_nl.h
 create mode 100644 lib/librte_eal/linuxapp/kcp/Makefile
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_dev.h
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_ethtool.c
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_misc.c
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_net.c
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_nl.c

-- 
2.5.0


* [dpdk-dev] [RFC 1/3] rte_ctrl_if: add control interface library
  2016-01-15 16:18 [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports Ferruh Yigit
@ 2016-01-15 16:18 ` Ferruh Yigit
  2016-01-15 16:18 ` [dpdk-dev] [RFC 2/3] kcp: add kernel control path kernel module Ferruh Yigit
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Ferruh Yigit @ 2016-01-15 16:18 UTC
  To: dev

This library receives control messages from kernel space, forwards them
to librte_ether, and returns the response to kernel space.

The library:
1) Triggers creation of the Linux virtual interfaces
2) Initializes the netlink socket communication
3) Provides a process() API that the application calls to process the
received messages

This library requires the corresponding kernel module to be inserted.
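
A rough sketch of how an application might drive the three exported
calls: create the interfaces once, poll for messages, then destroy the
interfaces on exit. The rte_eth_control_interface_*() functions below
are local stand-ins so the sketch compiles without DPDK. A real
application would include rte_ctrl_if.h and link against the library
instead. sketch_run_control_path() and the sketch_* counters are
hypothetical names.

```c
#include <assert.h>

/* Mirrors RTE_ETHTOOL_CTRL_IF_PROCESS_MSG (first enum value) in rte_ctrl_if.h. */
enum { SKETCH_PROCESS_MSG = 0 };

/* Stand-in state so the sketch can be exercised without DPDK or KCP. */
static int sketch_created;
static int sketch_processed;

/* Stand-ins for the library API. They are not the real implementations. */
static int rte_eth_control_interface_create(void)
{
	sketch_created = 1;
	return 0;
}

static int rte_eth_control_interface_destroy(void)
{
	sketch_created = 0;
	return 0;
}

static int rte_eth_control_interface_process_msg(int flag, int timeout_sec)
{
	(void)timeout_sec;
	if (!sketch_created || flag != SKETCH_PROCESS_MSG)
		return -1;
	sketch_processed++;
	return 0;
}

/*
 * Create the control interfaces, poll for control messages a fixed
 * number of times, then tear the interfaces down again.
 */
static int
sketch_run_control_path(int iterations, int timeout_sec)
{
	int i;
	int ret;

	ret = rte_eth_control_interface_create();
	if (ret < 0)
		return ret;

	for (i = 0; i < iterations; i++) {
		ret = rte_eth_control_interface_process_msg(
				SKETCH_PROCESS_MSG, timeout_sec);
		if (ret < 0)
			break; /* stop polling on error */
	}

	rte_eth_control_interface_destroy();

	return ret;
}
```

In a real application the polling would typically run in the main loop
or on a dedicated control thread rather than for a fixed number of
iterations.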

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
---
 config/common_linuxapp                     |   7 +-
 lib/Makefile                               |   3 +-
 lib/librte_ctrl_if/Makefile                |  58 +++++
 lib/librte_ctrl_if/rte_ctrl_if.c           | 166 ++++++++++++++
 lib/librte_ctrl_if/rte_ctrl_if.h           |  54 +++++
 lib/librte_ctrl_if/rte_ctrl_if_version.map |   9 +
 lib/librte_ctrl_if/rte_ethtool.c           | 354 +++++++++++++++++++++++++++++
 lib/librte_ctrl_if/rte_ethtool.h           |  64 ++++++
 lib/librte_ctrl_if/rte_nl.c                | 274 ++++++++++++++++++++++
 lib/librte_ctrl_if/rte_nl.h                |  60 +++++
 lib/librte_eal/common/include/rte_log.h    |   3 +-
 mk/rte.app.mk                              |   3 +-
 12 files changed, 1051 insertions(+), 4 deletions(-)
 create mode 100644 lib/librte_ctrl_if/Makefile
 create mode 100644 lib/librte_ctrl_if/rte_ctrl_if.c
 create mode 100644 lib/librte_ctrl_if/rte_ctrl_if.h
 create mode 100644 lib/librte_ctrl_if/rte_ctrl_if_version.map
 create mode 100644 lib/librte_ctrl_if/rte_ethtool.c
 create mode 100644 lib/librte_ctrl_if/rte_ethtool.h
 create mode 100644 lib/librte_ctrl_if/rte_nl.c
 create mode 100644 lib/librte_ctrl_if/rte_nl.h

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 74bc515..de705d0 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -503,6 +503,11 @@ CONFIG_RTE_KNI_VHOST_DEBUG_RX=n
 CONFIG_RTE_KNI_VHOST_DEBUG_TX=n
 
 #
+# Compile librte_ctrl_if
+#
+CONFIG_RTE_LIBRTE_CTRL_IF=y
+
+#
 # Compile vhost library
 # fuse-devel is needed to run vhost-cuse.
 # fuse-devel enables user space char driver development
diff --git a/lib/Makefile b/lib/Makefile
index ef172ea..a50bc1e 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -58,6 +58,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_PORT) += librte_port
 DIRS-$(CONFIG_RTE_LIBRTE_TABLE) += librte_table
 DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += librte_pipeline
 DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
+DIRS-$(CONFIG_RTE_LIBRTE_CTRL_IF) += librte_ctrl_if
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_ctrl_if/Makefile b/lib/librte_ctrl_if/Makefile
new file mode 100644
index 0000000..9e1ed0d
--- /dev/null
+++ b/lib/librte_ctrl_if/Makefile
@@ -0,0 +1,58 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_ctrl_if.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+EXPORT_MAP := rte_ctrl_if_version.map
+
+LIBABIVER := 2
+
+SRCS-y += rte_ctrl_if.c
+SRCS-y += rte_nl.c
+SRCS-y += rte_ethtool.c
+
+#
+# Export include files
+#
+SYMLINK-y-include += rte_ctrl_if.h
+
+# this lib depends upon:
+DEPDIRS-y += lib/librte_ether
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_ctrl_if/rte_ctrl_if.c b/lib/librte_ctrl_if/rte_ctrl_if.c
new file mode 100644
index 0000000..acc578a
--- /dev/null
+++ b/lib/librte_ctrl_if/rte_ctrl_if.c
@@ -0,0 +1,166 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <fcntl.h>
+#include <unistd.h>
+
+#include <sys/ioctl.h>
+
+#include <rte_ethdev.h>
+#include "rte_ctrl_if.h"
+#include "rte_nl.h"
+
+static int kcp_fd = -1;
+static int kcp_fd_ref;
+
+#define RTE_KCP_IOCTL_TEST    _IOWR(0, 1, int)
+#define RTE_KCP_IOCTL_CREATE  _IOWR(0, 2, int)
+#define RTE_KCP_IOCTL_RELEASE _IOWR(0, 3, int)
+
+static int
+control_interface_init(void)
+{
+	int ret;
+	kcp_fd = open("/dev/kcp", O_RDWR);
+
+	if (kcp_fd < 0) {
+		RTE_LOG(ERR, CTRL_IF,
+				"Failed to initialize control interface.\n");
+		return -1;
+	}
+
+	ret = control_interface_nl_init();
+	if (ret < 0)
+		close(kcp_fd);
+
+	return ret;
+}
+
+static int
+control_interface_ref_get(void)
+{
+	int ret = 0;
+
+	if (kcp_fd_ref == 0)
+		ret = control_interface_init();
+
+	if (ret == 0)
+		kcp_fd_ref++;
+
+	return kcp_fd_ref;
+}
+
+static void
+control_interface_release(void)
+{
+	close(kcp_fd);
+	control_interface_nl_release();
+}
+
+static int
+control_interface_ref_put(void)
+{
+	if (kcp_fd_ref == 0)
+		return 0;
+
+	kcp_fd_ref--;
+
+	if (kcp_fd_ref == 0)
+		control_interface_release();
+
+	return kcp_fd_ref;
+}
+
+static int
+rte_eth_control_interface_create_one(uint8_t port_id)
+{
+	if (control_interface_ref_get() != 0) {
+		ioctl(kcp_fd, RTE_KCP_IOCTL_CREATE, port_id);
+		RTE_LOG(DEBUG, CTRL_IF,
+				"Control interface created for port:%u\n",
+			port_id);
+	}
+
+	return 0;
+}
+
+int
+rte_eth_control_interface_create(void)
+{
+	int i;
+	int ret = 0;
+
+	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
+		if (rte_eth_dev_is_valid_port(i)) {
+			ret = rte_eth_control_interface_create_one(i);
+			if (ret < 0)
+				return ret;
+		}
+	}
+
+	return ret;
+}
+
+static int
+rte_eth_control_interface_destroy_one(uint8_t port_id)
+{
+	ioctl(kcp_fd, RTE_KCP_IOCTL_RELEASE, port_id);
+	control_interface_ref_put();
+	RTE_LOG(DEBUG, CTRL_IF, "Control interface destroyed for port:%u\n",
+			port_id);
+
+	return 0;
+}
+
+int
+rte_eth_control_interface_destroy(void)
+{
+	int i;
+	int ret = 0;
+
+	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
+		if (rte_eth_dev_is_valid_port(i)) {
+			ret = rte_eth_control_interface_destroy_one(i);
+			if (ret < 0)
+				return ret;
+		}
+	}
+
+	return ret;
+}
+
+int
+rte_eth_control_interface_process_msg(int flag, int timeout_sec)
+{
+	return control_interface_process_msg(flag, timeout_sec);
+}
diff --git a/lib/librte_ctrl_if/rte_ctrl_if.h b/lib/librte_ctrl_if/rte_ctrl_if.h
new file mode 100644
index 0000000..77245ae
--- /dev/null
+++ b/lib/librte_ctrl_if/rte_ctrl_if.h
@@ -0,0 +1,54 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CTRL_IF_H_
+#define _RTE_CTRL_IF_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+enum control_interface_process_flag {
+	RTE_ETHTOOL_CTRL_IF_PROCESS_MSG,
+	RTE_ETHTOOL_CTRL_IF_DISCARD_MSG,
+};
+
+int rte_eth_control_interface_create(void);
+int rte_eth_control_interface_destroy(void);
+int rte_eth_control_interface_process_msg(int flag, int timeout_sec);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CTRL_IF_H_ */
diff --git a/lib/librte_ctrl_if/rte_ctrl_if_version.map b/lib/librte_ctrl_if/rte_ctrl_if_version.map
new file mode 100644
index 0000000..8b27e26
--- /dev/null
+++ b/lib/librte_ctrl_if/rte_ctrl_if_version.map
@@ -0,0 +1,9 @@
+DPDK_2.3 {
+	global:
+
+	rte_eth_control_interface_create;
+	rte_eth_control_interface_destroy;
+	rte_eth_control_interface_process_msg;
+
+	local: *;
+};
diff --git a/lib/librte_ctrl_if/rte_ethtool.c b/lib/librte_ctrl_if/rte_ethtool.c
new file mode 100644
index 0000000..7d6ccec
--- /dev/null
+++ b/lib/librte_ctrl_if/rte_ethtool.c
@@ -0,0 +1,354 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+
+#include <linux/if_link.h>
+
+#include <rte_version.h>
+#include <rte_ethdev.h>
+#include "rte_ethtool.h"
+
+#define ETHTOOL_GEEPROM_LEN 99
+#define ETHTOOL_GREGS_LEN 98
+#define ETHTOOL_GSSET_COUNT 97
+
+static int
+get_drvinfo(int port_id, void *data, int *data_len)
+{
+	struct ethtool_drvinfo *info = data;
+	struct rte_eth_dev_info dev_info;
+	int n;
+
+	memset(&dev_info, 0, sizeof(dev_info));
+	rte_eth_dev_info_get(port_id, &dev_info);
+
+	snprintf(info->driver, sizeof(info->driver), "%s",
+		dev_info.driver_name);
+	snprintf(info->version, sizeof(info->version), "%s",
+		rte_version());
+	snprintf(info->bus_info, sizeof(info->bus_info),
+		"%04x:%02x:%02x.%x",
+		dev_info.pci_dev->addr.domain, dev_info.pci_dev->addr.bus,
+		dev_info.pci_dev->addr.devid, dev_info.pci_dev->addr.function);
+
+	n = rte_eth_dev_get_reg_length(port_id);
+	info->regdump_len = n < 0 ? 0 : n;
+
+	n = rte_eth_dev_get_eeprom_length(port_id);
+	info->eedump_len = n < 0 ? 0 : n;
+
+	info->n_stats = sizeof(struct rte_eth_stats) / sizeof(uint64_t);
+	info->testinfo_len = 0;
+
+	*data_len = sizeof(struct ethtool_drvinfo);
+
+	return 0;
+}
+
+static int
+get_reg_len(int port_id, void *data, int *data_len)
+{
+	int reg_length = 0;
+
+	reg_length = rte_eth_dev_get_reg_length(port_id);
+	if (reg_length < 0)
+		return reg_length;
+
+	*(int *)data = reg_length * sizeof(uint32_t);
+	*data_len = sizeof(int);
+
+	return 0;
+}
+
+static int
+get_reg(int port_id, void *in_data, void *out_data, int *out_data_len)
+{
+	unsigned int reg_length;
+	int reg_length_out_len;
+	struct ethtool_regs *ethtool_regs = in_data;
+	struct rte_dev_reg_info regs = {
+		.data = out_data,
+		.length = 0,
+	};
+	int ret;
+
+	ret = get_reg_len(port_id, &reg_length, &reg_length_out_len);
+	if (ret < 0 || reg_length > ethtool_regs->len)
+		return -1;
+
+	ret = rte_eth_dev_get_reg_info(port_id, &regs);
+	if (ret < 0)
+		return ret;
+
+	ethtool_regs->version = regs.version;
+	*out_data_len = reg_length;
+
+	return 0;
+}
+
+static int
+get_link(int port_id, void *data, int *data_len)
+{
+	struct rte_eth_link link;
+
+	rte_eth_link_get(port_id, &link);
+
+	*(int *)data = link.link_status;
+	*data_len = sizeof(int);
+
+	return 0;
+}
+
+static int
+get_eeprom_length(int port_id, void *data, int *data_len)
+{
+	int eeprom_length = 0;
+
+	eeprom_length = rte_eth_dev_get_eeprom_length(port_id);
+	if (eeprom_length < 0)
+		return eeprom_length;
+
+	*(int *)data = eeprom_length;
+	*data_len = sizeof(int);
+
+	return 0;
+}
+
+static int
+get_eeprom(int port_id, void *in_data, void *out_data)
+{
+	struct ethtool_eeprom *eeprom = in_data;
+	struct rte_dev_eeprom_info eeprom_info = {
+		.data = out_data,
+		.offset = eeprom->offset,
+		.length = eeprom->len,
+	};
+	int ret;
+
+	ret = rte_eth_dev_get_eeprom(port_id, &eeprom_info);
+	if (ret < 0)
+		return ret;
+
+	eeprom->magic = eeprom_info.magic;
+
+	return 0;
+}
+
+static int
+set_eeprom(int port_id, void *in_data, void *out_data)
+{
+	struct ethtool_eeprom *eeprom = in_data;
+	struct rte_dev_eeprom_info eeprom_info = {
+		.data = out_data,
+		.offset = eeprom->offset,
+		.length = eeprom->len,
+	};
+	int ret;
+
+	ret = rte_eth_dev_set_eeprom(port_id, &eeprom_info);
+	if (ret < 0)
+		return ret;
+
+	eeprom->magic = eeprom_info.magic;
+
+	return 0;
+}
+
+static int
+get_pauseparam(int port_id, void *data, void *data_len)
+{
+	struct ethtool_pauseparam *pauseparam = data;
+	struct rte_eth_fc_conf fc_conf;
+	int ret;
+
+	ret = rte_eth_dev_flow_ctrl_get(port_id, &fc_conf);
+	if (ret)
+		return ret;
+
+	pauseparam->tx_pause = 0;
+	pauseparam->rx_pause = 0;
+
+	switch (fc_conf.mode) {
+	case RTE_FC_RX_PAUSE:
+		pauseparam->rx_pause = 1;
+		break;
+	case RTE_FC_TX_PAUSE:
+		pauseparam->tx_pause = 1;
+		break;
+	case RTE_FC_FULL:
+		pauseparam->rx_pause = 1;
+		pauseparam->tx_pause = 1;
+	default:
+		break;
+	}
+	pauseparam->autoneg = (uint32_t)fc_conf.autoneg;
+
+	*(int *)data_len = sizeof(struct ethtool_pauseparam);
+
+	return 0;
+}
+
+int
+rte_eth_dev_ethtool_process(int cmd_id, int port_id, void *in_data,
+		void *out_data, int *out_data_len)
+{
+	int ret = 0;
+
+	if (!rte_eth_dev_is_valid_port(port_id))
+		return -ENODEV;
+
+	switch (cmd_id) {
+	case ETHTOOL_GDRVINFO:
+		return get_drvinfo(port_id, out_data, out_data_len);
+	case ETHTOOL_GREGS_LEN:
+		return get_reg_len(port_id, out_data, out_data_len);
+	case ETHTOOL_GREGS:
+		return get_reg(port_id, in_data, out_data, out_data_len);
+	case ETHTOOL_GLINK:
+		return get_link(port_id, out_data, out_data_len);
+	case ETHTOOL_GEEPROM_LEN:
+		return get_eeprom_length(port_id, out_data, out_data_len);
+	case ETHTOOL_GEEPROM:
+		return get_eeprom(port_id, in_data, out_data);
+	case ETHTOOL_SEEPROM:
+		return set_eeprom(port_id, in_data, out_data);
+	case ETHTOOL_GPAUSEPARAM:
+		return get_pauseparam(port_id, out_data, out_data_len);
+	default:
+		ret = -95 /* EOPNOTSUPP */;
+		break;
+	}
+
+	return ret;
+}
+
+static int
+set_mtu(int port_id, void *in_data)
+{
+	int *mtu = in_data;
+
+	return rte_eth_dev_set_mtu(port_id, *mtu);
+}
+
+static int
+get_stats(int port_id, void *data, int *data_len)
+{
+	struct rte_eth_stats stats;
+	struct rtnl_link_stats64 *if_stats = data;
+	int ret;
+
+	ret = rte_eth_stats_get(port_id, &stats);
+	if (ret < 0)
+		return -EOPNOTSUPP;
+
+	if_stats->rx_packets = stats.ipackets;
+	if_stats->tx_packets = stats.opackets;
+	if_stats->rx_bytes = stats.ibytes;
+	if_stats->tx_bytes = stats.obytes;
+	if_stats->rx_errors = stats.ierrors;
+	if_stats->tx_errors = stats.oerrors;
+	if_stats->rx_dropped = stats.imissed;
+	if_stats->multicast = stats.imcasts;
+
+	*data_len = sizeof(struct rtnl_link_stats64);
+
+	return 0;
+}
+
+static int
+get_mac(int port_id, void *data, int *data_len)
+{
+	struct ether_addr addr;
+
+	rte_eth_macaddr_get(port_id, &addr);
+	memcpy(data, &addr, sizeof(struct ether_addr));
+
+	*data_len = sizeof(struct ether_addr);
+
+	return 0;
+}
+
+static int
+set_mac(int port_id, void *in_data)
+{
+	struct ether_addr addr;
+
+	memcpy(&addr, in_data, ETHER_ADDR_LEN);
+
+	return rte_eth_dev_default_mac_addr_set(port_id, &addr);
+}
+
+static int
+start_port(int port_id)
+{
+	rte_eth_dev_stop(port_id);
+	return rte_eth_dev_start(port_id);
+}
+
+static int
+stop_port(int port_id)
+{
+	rte_eth_dev_stop(port_id);
+	return 0;
+}
+
+int
+rte_eth_dev_control_process(int cmd_id, int port_id, void *in_data,
+		void *out_data, int *out_data_len)
+{
+	int ret = 0;
+
+	if (!rte_eth_dev_is_valid_port(port_id))
+		return -ENODEV;
+
+	switch (cmd_id) {
+	case RTE_KCP_REQ_CHANGE_MTU:
+		return set_mtu(port_id, in_data);
+	case RTE_KCP_REQ_GET_STATS:
+		return get_stats(port_id, out_data, out_data_len);
+	case RTE_KCP_REQ_GET_MAC:
+		return get_mac(port_id, out_data, out_data_len);
+	case RTE_KCP_REQ_SET_MAC:
+		return set_mac(port_id, in_data);
+	case RTE_KCP_REQ_START_PORT:
+		return start_port(port_id);
+	case RTE_KCP_REQ_STOP_PORT:
+		return stop_port(port_id);
+	default:
+		ret = -95 /* EOPNOTSUPP */;
+		break;
+	}
+
+	return ret;
+}
diff --git a/lib/librte_ctrl_if/rte_ethtool.h b/lib/librte_ctrl_if/rte_ethtool.h
new file mode 100644
index 0000000..af1abfe
--- /dev/null
+++ b/lib/librte_ctrl_if/rte_ethtool.h
@@ -0,0 +1,64 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ETHTOOL_H_
+#define _RTE_ETHTOOL_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <linux/ethtool.h>
+
+enum rte_kcp_req_id {
+	RTE_KCP_REQ_UNKNOWN = (1 << 16),
+	RTE_KCP_REQ_CHANGE_MTU,
+	RTE_KCP_REQ_CFG_NETWORK_IF,
+	RTE_KCP_REQ_GET_STATS,
+	RTE_KCP_REQ_GET_MAC,
+	RTE_KCP_REQ_SET_MAC,
+	RTE_KCP_REQ_START_PORT,
+	RTE_KCP_REQ_STOP_PORT,
+	RTE_KCP_REQ_MAX,
+};
+
+int rte_eth_dev_ethtool_process(int cmd_id, int port_id, void *in_data,
+		void *out_data, int *out_data_len);
+int rte_eth_dev_control_process(int cmd_id, int port_id, void *in_data,
+		void *out_data, int *out_data_len);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ETHTOOL_H_ */
diff --git a/lib/librte_ctrl_if/rte_nl.c b/lib/librte_ctrl_if/rte_nl.c
new file mode 100644
index 0000000..03b39cb
--- /dev/null
+++ b/lib/librte_ctrl_if/rte_nl.c
@@ -0,0 +1,274 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <unistd.h>
+
+#include <sys/socket.h>
+#include <linux/netlink.h>
+
+#include <rte_spinlock.h>
+#include <rte_ethdev.h>
+#include "rte_ethtool.h"
+#include "rte_nl.h"
+#include "rte_ctrl_if.h"
+
+#define KCP_NL_GRP 31
+#define MAX_PAYLOAD 1024
+
+static int sock_fd = -1;
+pthread_t thread_id;
+pthread_cond_t cond  = PTHREAD_COND_INITIALIZER;
+pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
+static struct nlmsghdr *nlh_s;
+static struct nlmsghdr *nlh_r;
+static struct msghdr msg_s;
+static struct msghdr msg_r;
+static struct iovec iov_s;
+static struct iovec iov_r;
+static struct sockaddr_nl dest_addr;
+static struct sockaddr_nl src_addr;
+static int terminate;
+
+static int kcp_ethtool_msg_count;
+static struct kcp_ethtool_msg head;
+
+static void
+control_interface_nl_send(void *buf, size_t len)
+{
+	int ret;
+
+	/* Fill in the netlink message payload */
+	memcpy(NLMSG_DATA(nlh_s), buf, len);
+
+	ret = sendmsg(sock_fd, &msg_s, 0);
+
+	if (ret < 0)
+		RTE_LOG(ERR, CTRL_IF, "Failed nl msg send. ret:%d, err:%d\n",
+				ret, errno);
+}
+
+static void
+control_interface_nl_process_msg(struct kcp_ethtool_msg *msg)
+{
+	if (msg->cmd_id > RTE_KCP_REQ_UNKNOWN) {
+		msg->err = rte_eth_dev_control_process(msg->cmd_id,
+				msg->port_id, msg->input_buffer,
+				msg->output_buffer, &msg->output_buffer_len);
+	} else {
+		msg->err = rte_eth_dev_ethtool_process(msg->cmd_id,
+				msg->port_id, msg->input_buffer,
+				msg->output_buffer, &msg->output_buffer_len);
+	}
+
+	control_interface_nl_send((void *)msg,
+			sizeof(struct kcp_ethtool_msg));
+}
+
+int
+control_interface_process_msg(int flag, int timeout_sec)
+{
+	int ret = 0;
+	struct timespec ts;
+
+	pthread_mutex_lock(&list_lock);
+	while (timeout_sec && !kcp_ethtool_msg_count && !ret) {
+		clock_gettime(CLOCK_REALTIME, &ts);
+		ts.tv_sec += timeout_sec;
+		ret = pthread_cond_timedwait(&cond, &list_lock, &ts);
+	}
+
+	switch (flag) {
+	case RTE_ETHTOOL_CTRL_IF_PROCESS_MSG:
+		if (kcp_ethtool_msg_count) {
+			control_interface_nl_process_msg(&head);
+			kcp_ethtool_msg_count = 0;
+		}
+		break;
+
+	case RTE_ETHTOOL_CTRL_IF_DISCARD_MSG:
+		if (kcp_ethtool_msg_count) {
+			head.err = -1;
+			control_interface_nl_send((void *)&head,
+					sizeof(struct kcp_ethtool_msg));
+			kcp_ethtool_msg_count = 0;
+		}
+		break;
+
+	default:
+		ret = -1;
+		break;
+	}
+	pthread_mutex_unlock(&list_lock);
+
+	return ret;
+}
+
+static int
+msg_list_add(struct nlmsghdr *nlh)
+{
+	pthread_mutex_lock(&list_lock);
+
+	memcpy(&head, NLMSG_DATA(nlh), sizeof(struct kcp_ethtool_msg));
+	kcp_ethtool_msg_count = 1;
+
+	pthread_mutex_unlock(&list_lock);
+
+	return 0;
+}
+
+static void *
+control_interface_nl_recv(void *arg)
+{
+	int ret;
+
+	for (;;) {
+		if (terminate == 1)
+			break;
+
+		ret = recvmsg(sock_fd, &msg_r, 0);
+		if (ret < 0)
+			continue;
+
+		if ((unsigned)ret < sizeof(struct kcp_ethtool_msg)) {
+			RTE_LOG(WARNING, CTRL_IF,
+					"Received %d bytes, payload %zu\n",
+					ret, sizeof(struct kcp_ethtool_msg));
+			continue;
+		}
+
+		ret = msg_list_add(nlh_r);
+
+		pthread_cond_signal(&cond);
+	}
+
+	return arg;
+}
+
+static void
+nl_setup_header(struct msghdr *msg, struct nlmsghdr **nlh, struct iovec *iov,
+		struct sockaddr_nl *daddr)
+{
+	struct nlmsghdr *nlh_tmp;
+
+	if (*nlh == NULL)
+		*nlh = malloc(NLMSG_SPACE(MAX_PAYLOAD));
+
+	nlh_tmp = *nlh;
+	memset(nlh_tmp, 0, NLMSG_SPACE(MAX_PAYLOAD));
+
+	/* Fill the netlink message header */
+	nlh_tmp->nlmsg_len = NLMSG_SPACE(MAX_PAYLOAD);
+	nlh_tmp->nlmsg_pid = getpid();  /* self pid */
+	nlh_tmp->nlmsg_flags = 0;
+
+	iov->iov_base = (void *)nlh_tmp;
+	iov->iov_len = nlh_tmp->nlmsg_len;
+	memset(msg, 0, sizeof(struct msghdr));
+	msg->msg_name = (void *)daddr;
+	msg->msg_namelen = sizeof(struct sockaddr_nl);
+	msg->msg_iov = iov;
+	msg->msg_iovlen = 1;
+}
+
+static int
+control_interface_nl_socket_init(void)
+{
+	int fd;
+	int ret;
+
+	fd = socket(PF_NETLINK, SOCK_RAW, KCP_NL_GRP);
+	if (fd < 0)
+		return -1;
+
+	src_addr.nl_family = AF_NETLINK;
+	src_addr.nl_pid = getpid();
+	ret = bind(fd, (struct sockaddr *)&src_addr, sizeof(src_addr));
+	if (ret) {
+		close(fd);
+		return -1;
+	}
+
+	dest_addr.nl_family = AF_NETLINK;
+	dest_addr.nl_pid = 0;   /*  For Linux Kernel */
+	dest_addr.nl_groups = 0;
+
+	nl_setup_header(&msg_s, &nlh_s, &iov_s, &dest_addr);
+	nl_setup_header(&msg_r, &nlh_r, &iov_r, &dest_addr);
+
+	return fd;
+}
+
+int
+control_interface_nl_init(void)
+{
+	int ret;
+	char buf[] = "pid";
+	sock_fd = control_interface_nl_socket_init();
+
+	if (sock_fd < 0) {
+		RTE_LOG(ERR, CTRL_IF,
+				"Failed to initialize control interface.\n");
+		return -1;
+	}
+
+	ret = pthread_create(&thread_id, NULL, control_interface_nl_recv,
+			NULL);
+	if (ret != 0)
+		return -1;
+	control_interface_nl_send((void *)buf, sizeof(buf));
+
+	return 0;
+}
+
+static void
+msg_list_destroy(void)
+{
+	pthread_mutex_lock(&list_lock);
+	kcp_ethtool_msg_count = 0;
+	pthread_cond_signal(&cond);
+	pthread_mutex_unlock(&list_lock);
+}
+
+void
+control_interface_nl_release(void)
+{
+	terminate = 1;
+	pthread_cancel(thread_id);
+	pthread_join(thread_id, NULL);
+	close(sock_fd);
+	msg_list_destroy();
+	free(nlh_r);
+	free(nlh_s);
+	nlh_r = NULL;
+	nlh_s = NULL;
+}
diff --git a/lib/librte_ctrl_if/rte_nl.h b/lib/librte_ctrl_if/rte_nl.h
new file mode 100644
index 0000000..bb600cc
--- /dev/null
+++ b/lib/librte_ctrl_if/rte_nl.h
@@ -0,0 +1,60 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_NL_H_
+#define _RTE_NL_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define KCP_ETHTOOL_MSG_LEN 500
+struct kcp_ethtool_msg {
+	int cmd_id;
+	int port_id;
+	char input_buffer[KCP_ETHTOOL_MSG_LEN];
+	char output_buffer[KCP_ETHTOOL_MSG_LEN];
+	int input_buffer_len;
+	int output_buffer_len;
+	int err;
+};
+
+int control_interface_nl_init(void);
+void control_interface_nl_release(void);
+int control_interface_process_msg(int flag, int timeout_sec);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_NL_H_ */
diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index 2e47e7f..a0a2c9f 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -79,6 +79,7 @@ extern struct rte_logs rte_logs;
 #define RTE_LOGTYPE_PIPELINE 0x00008000 /**< Log related to pipeline. */
 #define RTE_LOGTYPE_MBUF    0x00010000 /**< Log related to mbuf. */
 #define RTE_LOGTYPE_CRYPTODEV 0x00020000 /**< Log related to cryptodev. */
+#define RTE_LOGTYPE_CTRL_IF 0x00040000 /**< Log related to control interface. */
 
 /* these log types can be used in an application */
 #define RTE_LOGTYPE_USER1   0x01000000 /**< User-defined log type 1. */
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 8ecab41..e1638f0 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   Copyright(c) 2014-2015 6WIND S.A.
 #   All rights reserved.
 #
@@ -122,6 +122,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_MBUF)           += -lrte_mbuf
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MBUF_OFFLOAD)   += -lrte_mbuf_offload
 _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)        += -lrte_ip_frag
 _LDLIBS-$(CONFIG_RTE_LIBRTE_ETHER)          += -lethdev
+_LDLIBS-$(CONFIG_RTE_LIBRTE_CTRL_IF)        += -lrte_ctrl_if
 _LDLIBS-$(CONFIG_RTE_LIBRTE_CRYPTODEV)      += -lrte_cryptodev
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MEMPOOL)        += -lrte_mempool
 _LDLIBS-$(CONFIG_RTE_LIBRTE_RING)           += -lrte_ring
-- 
2.5.0

* [dpdk-dev] [RFC 2/3] kcp: add kernel control path kernel module
  2016-01-15 16:18 [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports Ferruh Yigit
  2016-01-15 16:18 ` [dpdk-dev] [RFC 1/3] rte_ctrl_if: add control interface library Ferruh Yigit
@ 2016-01-15 16:18 ` Ferruh Yigit
  2016-01-15 16:18 ` [dpdk-dev] [RFC 3/3] examples/ethtool: add control interface support to the application Ferruh Yigit
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Ferruh Yigit @ 2016-01-15 16:18 UTC (permalink / raw)
  To: dev

This kernel module is based on the KNI module, but it is a stripped-down
version that handles control messages only; it provides no data transfer
functionality.

The module lets a userspace application create virtual interfaces. When
a control command is issued on one of these virtual interfaces, the
module pushes the command to userspace and returns the response to the
calling application.

Linux tools such as ethtool/ifconfig/ip can be used on the virtual
interfaces, but tools that operate on data traffic, such as tcpdump,
cannot.
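
As a rough illustration of the round trip described above, the userspace
side can be pictured as a dispatcher keyed on cmd_id. This is only a
sketch, not code from the patch: struct kcp_ethtool_msg and the
RTE_KCP_REQ_* values are copied from rte_nl.h and kcp_dev.h, while
demo_mtu, demo_control_process(), demo_ethtool_process(),
demo_process_msg() and demo_exec() are hypothetical stand-ins, and the
netlink transport is left out entirely.

```c
#include <string.h>

#define KCP_ETHTOOL_MSG_LEN 500

/* Same layout as struct kcp_ethtool_msg in rte_nl.h. */
struct kcp_ethtool_msg {
	int cmd_id;
	int port_id;
	char input_buffer[KCP_ETHTOOL_MSG_LEN];
	char output_buffer[KCP_ETHTOOL_MSG_LEN];
	int input_buffer_len;
	int output_buffer_len;
	int err;
};

/* Subset of enum rte_kcp_req_id from kcp_dev.h. */
enum rte_kcp_req_id {
	RTE_KCP_REQ_UNKNOWN = (1 << 16),
	RTE_KCP_REQ_CHANGE_MTU,
};

/* Hypothetical stand-in for the state of a DPDK port. */
static int demo_mtu = 1500;

/* Hypothetical stand-in for rte_eth_dev_control_process(). */
static int
demo_control_process(struct kcp_ethtool_msg *msg)
{
	int new_mtu;

	if (msg->cmd_id != RTE_KCP_REQ_CHANGE_MTU ||
			msg->input_buffer_len != (int)sizeof(int))
		return -1;

	memcpy(&new_mtu, msg->input_buffer, sizeof(int));
	demo_mtu = new_mtu;
	msg->output_buffer_len = 0;
	return 0;
}

/* Hypothetical stand-in for rte_eth_dev_ethtool_process(). */
static int
demo_ethtool_process(struct kcp_ethtool_msg *msg)
{
	msg->output_buffer_len = 0;
	return -1; /* nothing is supported in this sketch */
}

/* Mirrors the branch in control_interface_nl_process_msg(). */
static void
demo_process_msg(struct kcp_ethtool_msg *msg)
{
	if (msg->cmd_id > RTE_KCP_REQ_UNKNOWN)
		msg->err = demo_control_process(msg);
	else
		msg->err = demo_ethtool_process(msg);
}

/*
 * Hypothetical request builder: fills a message, hands it to the
 * dispatcher directly (no netlink) and returns the err field.
 */
static int
demo_exec(int cmd, int port_id, const void *in, int in_len)
{
	struct kcp_ethtool_msg msg;

	if (in_len < 0 || in_len > KCP_ETHTOOL_MSG_LEN)
		return -1;

	memset(&msg, 0, sizeof(msg));
	msg.cmd_id = cmd;
	msg.port_id = port_id;
	if (in_len > 0)
		memcpy(msg.input_buffer, in, (size_t)in_len);
	msg.input_buffer_len = in_len;

	demo_process_msg(&msg);
	return msg.err;
}
```

Calling demo_exec(RTE_KCP_REQ_CHANGE_MTU, 0, &mtu, sizeof(mtu)) with mtu
set to 9000 takes the control branch and updates demo_mtu. Any cmd_id at
or below RTE_KCP_REQ_UNKNOWN (a plain ethtool command number, for
example) takes the ethtool branch instead.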

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
---
 config/common_linuxapp                    |   2 +
 lib/librte_eal/linuxapp/Makefile          |   5 +-
 lib/librte_eal/linuxapp/kcp/Makefile      |  58 ++++++
 lib/librte_eal/linuxapp/kcp/kcp_dev.h     |  81 +++++++++
 lib/librte_eal/linuxapp/kcp/kcp_ethtool.c | 261 +++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/kcp/kcp_misc.c    | 282 ++++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/kcp/kcp_net.c     | 209 ++++++++++++++++++++++
 lib/librte_eal/linuxapp/kcp/kcp_nl.c      | 194 ++++++++++++++++++++
 8 files changed, 1091 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_eal/linuxapp/kcp/Makefile
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_dev.h
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_ethtool.c
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_misc.c
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_net.c
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_nl.c

diff --git a/config/common_linuxapp b/config/common_linuxapp
index de705d0..ed32ca8 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -506,6 +506,8 @@ CONFIG_RTE_KNI_VHOST_DEBUG_TX=n
 # Compile librte_ctrl_if
 #
 CONFIG_RTE_LIBRTE_CTRL_IF=y
+CONFIG_RTE_KCP_KMOD=y
+CONFIG_RTE_KCP_KO_DEBUG=n
 
 #
 # Compile vhost library
diff --git a/lib/librte_eal/linuxapp/Makefile b/lib/librte_eal/linuxapp/Makefile
index d9c5233..d1fa3a3 100644
--- a/lib/librte_eal/linuxapp/Makefile
+++ b/lib/librte_eal/linuxapp/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -38,6 +38,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal
 ifeq ($(CONFIG_RTE_KNI_KMOD),y)
 DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += kni
 endif
+ifeq ($(CONFIG_RTE_KCP_KMOD),y)
+DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += kcp
+endif
 ifeq ($(CONFIG_RTE_LIBRTE_XEN_DOM0),y)
 DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += xen_dom0
 endif
diff --git a/lib/librte_eal/linuxapp/kcp/Makefile b/lib/librte_eal/linuxapp/kcp/Makefile
new file mode 100644
index 0000000..e7472f3
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kcp/Makefile
@@ -0,0 +1,58 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# module name and path
+#
+MODULE = rte_kcp
+
+#
+# CFLAGS
+#
+MODULE_CFLAGS += -I$(SRCDIR)
+MODULE_CFLAGS += -I$(RTE_OUTPUT)/include
+MODULE_CFLAGS += -include $(RTE_OUTPUT)/include/rte_config.h
+MODULE_CFLAGS += -Wall -Werror
+
+# this lib needs main eal
+DEPDIRS-y += lib/librte_eal/linuxapp/eal
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-y += kcp_misc.c
+SRCS-y += kcp_net.c
+SRCS-y += kcp_ethtool.c
+SRCS-y += kcp_nl.c
+
+include $(RTE_SDK)/mk/rte.module.mk
diff --git a/lib/librte_eal/linuxapp/kcp/kcp_dev.h b/lib/librte_eal/linuxapp/kcp/kcp_dev.h
new file mode 100644
index 0000000..1097cb4
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kcp/kcp_dev.h
@@ -0,0 +1,81 @@
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program; if not, write to the Free Software
+ *   Foundation.
+ *   The full GNU General Public License is included in this distribution
+ *   in the file called LICENSE.GPL.
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+#ifndef _KCP_DEV_H_
+#define _KCP_DEV_H_
+
+#include <linux/netdevice.h>
+
+#define RTE_KCP_NAMESIZE 32
+#define KCP_DEVICE "kcp"
+
+#define RTE_KCP_IOCTL_TEST    _IOWR(0, 1, int)
+#define RTE_KCP_IOCTL_CREATE  _IOWR(0, 2, int)
+#define RTE_KCP_IOCTL_RELEASE _IOWR(0, 3, int)
+
+enum rte_kcp_req_id {
+	RTE_KCP_REQ_UNKNOWN = (1 << 16),
+	RTE_KCP_REQ_CHANGE_MTU,
+	RTE_KCP_REQ_CFG_NETWORK_IF,
+	RTE_KCP_REQ_GET_STATS,
+	RTE_KCP_REQ_GET_MAC,
+	RTE_KCP_REQ_SET_MAC,
+	RTE_KCP_REQ_START_PORT,
+	RTE_KCP_REQ_STOP_PORT,
+	RTE_KCP_REQ_MAX,
+};
+
+struct kcp_dev {
+	/* kcp list */
+	struct list_head list;
+
+	char name[RTE_KCP_NAMESIZE]; /* Network device name */
+
+	/* kcp device */
+	struct net_device *net_dev;
+
+	int port_id;
+	struct completion msg_received;
+};
+
+void kcp_net_init(struct net_device *dev);
+
+void kcp_nl_init(void);
+void kcp_nl_release(void);
+int kcp_nl_exec(int cmd, struct net_device *dev, void *in_data, int in_len,
+		void *out_data, int out_len);
+
+void kcp_set_ethtool_ops(struct net_device *netdev);
+
+#define KCP_ERR(args...) printk(KERN_ERR "KCP: " args)
+#define KCP_INFO(args...) printk(KERN_INFO "KCP: " args)
+#define KCP_PRINT(args...) printk(KERN_DEBUG "KCP: " args)
+
+#ifdef RTE_KCP_KO_DEBUG
+#define KCP_DBG(args...) printk(KERN_DEBUG "KCP: " args)
+#else
+#define KCP_DBG(args...)
+#endif
+
+#endif /* _KCP_DEV_H_ */
diff --git a/lib/librte_eal/linuxapp/kcp/kcp_ethtool.c b/lib/librte_eal/linuxapp/kcp/kcp_ethtool.c
new file mode 100644
index 0000000..0f5b583
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kcp/kcp_ethtool.c
@@ -0,0 +1,261 @@
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program; if not, write to the Free Software
+ *   Foundation.
+ *   The full GNU General Public License is included in this distribution
+ *   in the file called LICENSE.GPL.
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+#include "kcp_dev.h"
+
+#define ETHTOOL_GEEPROM_LEN 99
+#define ETHTOOL_GREGS_LEN 98
+#define ETHTOOL_GSSET_COUNT 97
+
+static int
+kcp_check_if_running(struct net_device *dev)
+{
+	return 0;
+}
+
+static void
+kcp_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info)
+{
+	int ret;
+
+	ret = kcp_nl_exec(info->cmd, dev, NULL, 0,
+			info, sizeof(struct ethtool_drvinfo));
+	if (ret < 0)
+		memset(info, 0, sizeof(struct ethtool_drvinfo));
+}
+
+static int
+kcp_get_settings(struct net_device *dev, struct ethtool_cmd *ecmd)
+{
+	return kcp_nl_exec(ecmd->cmd, dev, NULL, 0,
+			ecmd, sizeof(struct ethtool_cmd));
+}
+
+static int
+kcp_set_settings(struct net_device *dev, struct ethtool_cmd *ecmd)
+{
+	return kcp_nl_exec(ecmd->cmd, dev, ecmd, sizeof(struct ethtool_cmd),
+			NULL, 0);
+}
+
+static void
+kcp_get_wol(struct net_device *dev, struct ethtool_wolinfo *wol)
+{
+	int ret;
+
+	ret = kcp_nl_exec(wol->cmd, dev, NULL, 0,
+			wol, sizeof(struct ethtool_wolinfo));
+	if (ret < 0)
+		memset(wol, 0, sizeof(struct ethtool_wolinfo));
+}
+
+static int
+kcp_set_wol(struct net_device *dev, struct ethtool_wolinfo *wol)
+{
+	return kcp_nl_exec(wol->cmd, dev, wol, sizeof(struct ethtool_wolinfo),
+			NULL, 0);
+}
+
+static int
+kcp_nway_reset(struct net_device *dev)
+{
+	return kcp_nl_exec(ETHTOOL_NWAY_RST, dev, NULL, 0, NULL, 0);
+}
+
+static int
+kcp_get_eeprom_len(struct net_device *dev)
+{
+	int data;
+	int ret;
+
+	ret = kcp_nl_exec(ETHTOOL_GEEPROM_LEN, dev, NULL, 0,
+			&data, sizeof(int));
+	if (ret < 0)
+		return ret;
+
+	return data;
+}
+
+static int
+kcp_get_eeprom(struct net_device *dev, struct ethtool_eeprom *eeprom,
+		u8 *bytes)
+{
+	int ret;
+
+	ret = kcp_nl_exec(eeprom->cmd, dev,
+			eeprom, sizeof(struct ethtool_eeprom),
+			bytes, eeprom->len);
+	*bytes = 0;
+	return ret;
+}
+
+static int
+kcp_set_eeprom(struct net_device *dev, struct ethtool_eeprom *eeprom,
+		u8 *bytes)
+{
+	int ret;
+
+	ret = kcp_nl_exec(eeprom->cmd, dev,
+			eeprom, sizeof(struct ethtool_eeprom),
+			bytes, eeprom->len);
+	*bytes = 0;
+	return ret;
+}
+
+static void
+kcp_get_ringparam(struct net_device *dev, struct ethtool_ringparam *ring)
+{
+
+	kcp_nl_exec(ring->cmd, dev, NULL, 0,
+			ring, sizeof(struct ethtool_ringparam));
+}
+
+static int
+kcp_set_ringparam(struct net_device *dev, struct ethtool_ringparam *ring)
+{
+	int ret;
+
+	ret = kcp_nl_exec(ring->cmd, dev,
+			ring, sizeof(struct ethtool_ringparam),
+			NULL, 0);
+	return ret;
+}
+
+static void
+kcp_get_pauseparam(struct net_device *dev, struct ethtool_pauseparam *pause)
+{
+
+	kcp_nl_exec(pause->cmd, dev, NULL, 0,
+			pause, sizeof(struct ethtool_pauseparam));
+}
+
+static int
+kcp_set_pauseparam(struct net_device *dev, struct ethtool_pauseparam *pause)
+{
+	return kcp_nl_exec(pause->cmd, dev,
+			pause, sizeof(struct ethtool_pauseparam),
+			NULL, 0);
+}
+
+static u32
+kcp_get_msglevel(struct net_device *dev)
+{
+	int data;
+	int ret;
+
+	ret = kcp_nl_exec(ETHTOOL_GMSGLVL, dev, NULL, 0, &data, sizeof(int));
+	if (ret < 0)
+		return ret;
+
+	return data;
+}
+
+static void
+kcp_set_msglevel(struct net_device *dev, u32 data)
+{
+
+	kcp_nl_exec(ETHTOOL_SMSGLVL, dev, &data, sizeof(int), NULL, 0);
+}
+
+static int
+kcp_get_regs_len(struct net_device *dev)
+{
+	int data;
+	int ret;
+
+	ret = kcp_nl_exec(ETHTOOL_GREGS_LEN, dev, NULL, 0, &data, sizeof(int));
+	if (ret < 0)
+		return ret;
+
+	return data;
+}
+
+static void
+kcp_get_regs(struct net_device *dev, struct ethtool_regs *regs, void *p)
+{
+
+	kcp_nl_exec(regs->cmd, dev, regs, sizeof(struct ethtool_regs),
+			p, regs->len);
+}
+
+static void
+kcp_get_strings(struct net_device *dev, u32 stringset, u8 *data)
+{
+
+	kcp_nl_exec(ETHTOOL_GSTRINGS, dev, &stringset, sizeof(u32), data, 0);
+}
+
+static int
+kcp_get_sset_count(struct net_device *dev, int sset)
+{
+	int data;
+	int ret;
+
+	ret = kcp_nl_exec(ETHTOOL_GSSET_COUNT, dev, &sset, sizeof(int),
+			&data, sizeof(int));
+	if (ret < 0)
+		return ret;
+
+	return data;
+}
+
+static void
+kcp_get_ethtool_stats(struct net_device *dev, struct ethtool_stats *stats,
+		u64 *data)
+{
+
+	kcp_nl_exec(stats->cmd, dev, stats, sizeof(struct ethtool_stats),
+			data, stats->n_stats);
+}
+
+static const struct ethtool_ops kcp_ethtool_ops = {
+	.begin			= kcp_check_if_running,
+	.get_drvinfo		= kcp_get_drvinfo,
+	.get_settings		= kcp_get_settings,
+	.set_settings		= kcp_set_settings,
+	.get_regs_len		= kcp_get_regs_len,
+	.get_regs		= kcp_get_regs,
+	.get_wol		= kcp_get_wol,
+	.set_wol		= kcp_set_wol,
+	.nway_reset		= kcp_nway_reset,
+	.get_link		= ethtool_op_get_link,
+	.get_eeprom_len		= kcp_get_eeprom_len,
+	.get_eeprom		= kcp_get_eeprom,
+	.set_eeprom		= kcp_set_eeprom,
+	.get_ringparam		= kcp_get_ringparam,
+	.set_ringparam		= kcp_set_ringparam,
+	.get_pauseparam		= kcp_get_pauseparam,
+	.set_pauseparam		= kcp_set_pauseparam,
+	.get_msglevel		= kcp_get_msglevel,
+	.set_msglevel		= kcp_set_msglevel,
+	.get_strings		= kcp_get_strings,
+	.get_sset_count		= kcp_get_sset_count,
+	.get_ethtool_stats	= kcp_get_ethtool_stats,
+};
+
+void
+kcp_set_ethtool_ops(struct net_device *netdev)
+{
+	netdev->ethtool_ops = &kcp_ethtool_ops;
+}
diff --git a/lib/librte_eal/linuxapp/kcp/kcp_misc.c b/lib/librte_eal/linuxapp/kcp/kcp_misc.c
new file mode 100644
index 0000000..eadd1d7
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kcp/kcp_misc.c
@@ -0,0 +1,282 @@
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program; if not, write to the Free Software
+ *   Foundation.
+ *   The full GNU General Public License is included in this distribution
+ *   in the file called LICENSE.GPL.
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+#include <linux/module.h>
+#include <linux/miscdevice.h>
+
+#include "kcp_dev.h"
+
+#define KCP_DEV_IN_USE_BIT_NUM 0 /* Bit number for device in use */
+
+static volatile unsigned long device_in_use; /* device in use flag */
+
+/* kcp list lock */
+static DECLARE_RWSEM(kcp_list_lock);
+
+/* kcp list */
+static struct list_head kcp_list_head = LIST_HEAD_INIT(kcp_list_head);
+
+static int
+kcp_open(struct inode *inode, struct file *file)
+{
+	/* kcp device can be opened by one user only, test and set bit */
+	if (test_and_set_bit(KCP_DEV_IN_USE_BIT_NUM, &device_in_use))
+		return -EBUSY;
+
+	KCP_PRINT("/dev/kcp opened\n");
+
+	kcp_nl_init();
+
+	return 0;
+}
+
+static int
+kcp_dev_remove(struct kcp_dev *dev)
+{
+	if (!dev)
+		return -ENODEV;
+
+	if (dev->net_dev) {
+		unregister_netdev(dev->net_dev);
+		free_netdev(dev->net_dev);
+	}
+
+	return 0;
+}
+
+static int
+kcp_release(struct inode *inode, struct file *file)
+{
+	struct kcp_dev *dev, *n;
+
+	down_write(&kcp_list_lock);
+	list_for_each_entry_safe(dev, n, &kcp_list_head, list) {
+		list_del(&dev->list);
+		kcp_dev_remove(dev);
+	}
+	up_write(&kcp_list_lock);
+
+	kcp_nl_release();
+
+	/* Clear the bit of device in use */
+	clear_bit(KCP_DEV_IN_USE_BIT_NUM, &device_in_use);
+
+	KCP_PRINT("/dev/kcp closed\n");
+
+	return 0;
+}
+
+static int
+kcp_check_param(struct kcp_dev *kcp, char *name)
+{
+	if (!kcp)
+		return -1;
+
+	/* Check if network name has been used */
+	if (!strncmp(kcp->name, name, RTE_KCP_NAMESIZE)) {
+		KCP_ERR("KCP interface name %s duplicated\n", name);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+kcp_ioctl_create(unsigned int ioctl_num, unsigned long ioctl_param)
+{
+	int ret;
+	struct net_device *net_dev = NULL;
+	struct kcp_dev *kcp, *dev, *n;
+	struct net *net;
+	char name[RTE_KCP_NAMESIZE];
+	unsigned int instance = ioctl_param;
+	char mac[ETH_ALEN];
+
+	KCP_PRINT("Creating kcp...\n");
+
+	snprintf(name, RTE_KCP_NAMESIZE, "dpdk%u", instance);
+
+	/* Check if it has been created */
+	down_read(&kcp_list_lock);
+	list_for_each_entry_safe(dev, n, &kcp_list_head, list) {
+		if (kcp_check_param(dev, name) < 0) {
+			up_read(&kcp_list_lock);
+			return -EINVAL;
+		}
+	}
+	up_read(&kcp_list_lock);
+
+	net_dev = alloc_netdev(sizeof(struct kcp_dev), name,
+#ifdef NET_NAME_UNKNOWN
+							NET_NAME_UNKNOWN,
+#endif
+							kcp_net_init);
+	if (net_dev == NULL) {
+		KCP_ERR("error allocating device \"%s\"\n", name);
+		return -EBUSY;
+	}
+
+	net = get_net_ns_by_pid(task_pid_vnr(current));
+	if (IS_ERR(net)) {
+		free_netdev(net_dev);
+		return PTR_ERR(net);
+	}
+	dev_net_set(net_dev, net);
+	put_net(net);
+
+	kcp = netdev_priv(net_dev);
+
+	kcp->net_dev = net_dev;
+	kcp->port_id = instance;
+	init_completion(&kcp->msg_received);
+	strncpy(kcp->name, name, RTE_KCP_NAMESIZE);
+
+	kcp_nl_exec(RTE_KCP_REQ_GET_MAC, net_dev, NULL, 0, mac, ETH_ALEN);
+	memcpy(net_dev->dev_addr, mac, net_dev->addr_len);
+
+	kcp_set_ethtool_ops(net_dev);
+	ret = register_netdev(net_dev);
+	if (ret) {
+		KCP_ERR("error %i registering device \"%s\"\n", ret, name);
+		kcp_dev_remove(kcp);
+		return -ENODEV;
+	}
+
+	down_write(&kcp_list_lock);
+	list_add(&kcp->list, &kcp_list_head);
+	up_write(&kcp_list_lock);
+
+	return 0;
+}
+
+static int
+kcp_ioctl_release(unsigned int ioctl_num, unsigned long ioctl_param)
+{
+	int ret = -EINVAL;
+	struct kcp_dev *dev;
+	struct kcp_dev *n;
+	char name[RTE_KCP_NAMESIZE];
+	unsigned int instance = ioctl_param;
+
+	snprintf(name, RTE_KCP_NAMESIZE, "dpdk%u", instance);
+
+	down_write(&kcp_list_lock);
+	list_for_each_entry_safe(dev, n, &kcp_list_head, list) {
+		if (strncmp(dev->name, name, RTE_KCP_NAMESIZE) != 0)
+			continue;
+		list_del(&dev->list);
+		kcp_dev_remove(dev);
+		ret = 0;
+		break;
+	}
+	up_write(&kcp_list_lock);
+	KCP_INFO("%s release kcp named %s\n",
+		(ret == 0 ? "Successfully" : "Unsuccessfully"), name);
+
+	return ret;
+}
+
+static int
+kcp_ioctl(struct inode *inode, unsigned int ioctl_num,
+	unsigned long ioctl_param)
+{
+	int ret = -EINVAL;
+
+	KCP_DBG("IOCTL num=0x%0x param=0x%0lx\n", ioctl_num, ioctl_param);
+
+	/*
+	 * Switch according to the ioctl called
+	 */
+	switch (_IOC_NR(ioctl_num)) {
+	case _IOC_NR(RTE_KCP_IOCTL_TEST):
+		/* For test only, not used */
+		break;
+	case _IOC_NR(RTE_KCP_IOCTL_CREATE):
+		ret = kcp_ioctl_create(ioctl_num, ioctl_param);
+		break;
+	case _IOC_NR(RTE_KCP_IOCTL_RELEASE):
+		ret = kcp_ioctl_release(ioctl_num, ioctl_param);
+		break;
+	default:
+		KCP_DBG("IOCTL default\n");
+		break;
+	}
+
+	return ret;
+}
+
+static int
+kcp_compat_ioctl(struct inode *inode, unsigned int ioctl_num,
+		unsigned long ioctl_param)
+{
+	/* 32 bits app on 64 bits OS to be supported later */
+	KCP_PRINT("Not implemented.\n");
+
+	return -EINVAL;
+}
+
+static const struct file_operations kcp_fops = {
+	.owner = THIS_MODULE,
+	.open = kcp_open,
+	.release = kcp_release,
+	.unlocked_ioctl = (void *)kcp_ioctl,
+	.compat_ioctl = (void *)kcp_compat_ioctl,
+};
+
+static struct miscdevice kcp_misc = {
+	.minor = MISC_DYNAMIC_MINOR,
+	.name = KCP_DEVICE,
+	.fops = &kcp_fops,
+};
+
+static int __init
+kcp_init(void)
+{
+	KCP_PRINT("DPDK kcp module loading\n");
+
+	if (misc_register(&kcp_misc) != 0) {
+		KCP_ERR("Misc registration failed\n");
+		return -EPERM;
+	}
+
+	/* Clear the bit of device in use */
+	clear_bit(KCP_DEV_IN_USE_BIT_NUM, &device_in_use);
+
+	KCP_PRINT("DPDK kcp module loaded\n");
+
+	return 0;
+}
+module_init(kcp_init);
+
+static void __exit
+kcp_exit(void)
+{
+	misc_deregister(&kcp_misc);
+	KCP_PRINT("DPDK kcp module unloaded\n");
+}
+module_exit(kcp_exit);
+
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("Kernel Module for managing kcp devices");
diff --git a/lib/librte_eal/linuxapp/kcp/kcp_net.c b/lib/librte_eal/linuxapp/kcp/kcp_net.c
new file mode 100644
index 0000000..8aba386
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kcp/kcp_net.c
@@ -0,0 +1,209 @@
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program; if not, write to the Free Software
+ *   Foundation.
+ *   The full GNU General Public License is included in this distribution
+ *   in the file called LICENSE.GPL.
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+/*
+ * This code is inspired by the book "Linux Device Drivers" by
+ * Alessandro Rubini and Jonathan Corbet, published by O'Reilly & Associates
+ */
+
+#include <linux/version.h>
+#include <linux/etherdevice.h> /* eth_type_trans */
+
+#include "kcp_dev.h"
+
+/*
+ * Open and close
+ */
+static int
+kcp_net_open(struct net_device *dev)
+{
+	kcp_nl_exec(RTE_KCP_REQ_START_PORT, dev, NULL, 0, NULL, 0);
+	netif_start_queue(dev);
+	return 0;
+}
+
+static int
+kcp_net_release(struct net_device *dev)
+{
+	kcp_nl_exec(RTE_KCP_REQ_STOP_PORT, dev, NULL, 0, NULL, 0);
+	netif_stop_queue(dev); /* can't transmit any more */
+	return 0;
+}
+
+/*
+ * Configuration changes (passed on by ifconfig)
+ */
+static int
+kcp_net_config(struct net_device *dev, struct ifmap *map)
+{
+	if (dev->flags & IFF_UP) /* can't act on a running interface */
+		return -EBUSY;
+
+	/* ignore other fields */
+	return 0;
+}
+
+static int
+kcp_net_change_mtu(struct net_device *dev, int new_mtu)
+{
+	int err;
+
+	KCP_DBG("kcp_net_change_mtu new mtu %d to be set\n", new_mtu);
+	err = kcp_nl_exec(RTE_KCP_REQ_CHANGE_MTU, dev, &new_mtu, sizeof(int),
+			NULL, 0);
+
+	if (err == 0)
+		dev->mtu = new_mtu;
+
+	return err;
+}
+
+/*
+ * Ioctl commands
+ */
+static int
+kcp_net_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
+{
+	KCP_DBG("kcp_net_ioctl\n");
+
+	return 0;
+}
+
+/*
+ * Return statistics to the caller
+ */
+static struct  rtnl_link_stats64 *
+kcp_net_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
+{
+	int err;
+
+	err = kcp_nl_exec(RTE_KCP_REQ_GET_STATS, dev, NULL, 0,
+			stats, sizeof(struct rtnl_link_stats64));
+
+	return stats;
+}
+
+/**
+ * kcp_net_set_mac - Change the Ethernet Address of the KCP NIC
+ * @dev: network interface device structure
+ * @p: pointer to an address structure
+ *
+ * Returns 0 on success, negative on failure
+ **/
+static int
+kcp_net_set_mac(struct net_device *dev, void *p)
+{
+	struct sockaddr *addr = p;
+	int err;
+
+	if (!is_valid_ether_addr((unsigned char *)(addr->sa_data)))
+		return -EADDRNOTAVAIL;
+
+	err = kcp_nl_exec(RTE_KCP_REQ_SET_MAC, dev, addr->sa_data,
+			dev->addr_len, NULL, 0);
+	if (err < 0)
+		return -EADDRNOTAVAIL;
+
+	memcpy(dev->dev_addr, addr->sa_data, dev->addr_len);
+	return 0;
+}
+
+#if (KERNEL_VERSION(3, 9, 0) <= LINUX_VERSION_CODE)
+static int
+kcp_net_change_carrier(struct net_device *dev, bool new_carrier)
+{
+	if (new_carrier)
+		netif_carrier_on(dev);
+	else
+		netif_carrier_off(dev);
+	return 0;
+}
+#endif
+
+static const struct net_device_ops kcp_net_netdev_ops = {
+	.ndo_open = kcp_net_open,
+	.ndo_stop = kcp_net_release,
+	.ndo_set_config = kcp_net_config,
+	.ndo_change_mtu = kcp_net_change_mtu,
+	.ndo_do_ioctl = kcp_net_ioctl,
+	.ndo_get_stats64 = kcp_net_stats64,
+	.ndo_set_mac_address = kcp_net_set_mac,
+#if (KERNEL_VERSION(3, 9, 0) <= LINUX_VERSION_CODE)
+	.ndo_change_carrier = kcp_net_change_carrier,
+#endif
+};
+
+/*
+ *  Fill the eth header
+ */
+static int
+kcp_net_header(struct sk_buff *skb, struct net_device *dev,
+		unsigned short type, const void *daddr,
+		const void *saddr, unsigned int len)
+{
+	struct ethhdr *eth = (struct ethhdr *) skb_push(skb, ETH_HLEN);
+
+	memcpy(eth->h_source, saddr ? saddr : dev->dev_addr, dev->addr_len);
+	memcpy(eth->h_dest,   daddr ? daddr : dev->dev_addr, dev->addr_len);
+	eth->h_proto = htons(type);
+
+	return dev->hard_header_len;
+}
+
+/*
+ * Re-fill the eth header
+ */
+#if (KERNEL_VERSION(4, 1, 0) > LINUX_VERSION_CODE)
+static int
+kcp_net_rebuild_header(struct sk_buff *skb)
+{
+	struct net_device *dev = skb->dev;
+	struct ethhdr *eth = (struct ethhdr *) skb->data;
+
+	memcpy(eth->h_source, dev->dev_addr, dev->addr_len);
+	memcpy(eth->h_dest, dev->dev_addr, dev->addr_len);
+
+	return 0;
+}
+#endif
+
+static const struct header_ops kcp_net_header_ops = {
+	.create  = kcp_net_header,
+#if (KERNEL_VERSION(4, 1, 0) > LINUX_VERSION_CODE)
+	.rebuild = kcp_net_rebuild_header,
+#endif
+	.cache   = NULL,  /* disable caching */
+};
+
+void
+kcp_net_init(struct net_device *dev)
+{
+	KCP_DBG("kcp_net_init\n");
+
+	ether_setup(dev); /* assign some of the fields */
+	dev->netdev_ops      = &kcp_net_netdev_ops;
+	dev->header_ops      = &kcp_net_header_ops;
+
+	dev->flags |= IFF_UP;
+}
diff --git a/lib/librte_eal/linuxapp/kcp/kcp_nl.c b/lib/librte_eal/linuxapp/kcp/kcp_nl.c
new file mode 100644
index 0000000..3c2ed5b
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kcp/kcp_nl.c
@@ -0,0 +1,194 @@
+/*-
+ * GPL LICENSE SUMMARY
+ *
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+ *
+ *   This program is free software; you can redistribute it and/or modify
+ *   it under the terms of version 2 of the GNU General Public License as
+ *   published by the Free Software Foundation.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *   General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program;
+ *   The full GNU General Public License is included in this distribution
+ *   in the file called LICENSE.GPL.
+ *
+ *   Contact Information:
+ *   Intel Corporation
+ */
+
+#include <net/sock.h>
+
+#include "kcp_dev.h"
+
+#define KCP_NL_GRP 31
+
+#define KCP_ETHTOOL_MSG_LEN 500
+struct kcp_ethtool_msg {
+	int cmd_id;
+	int port_id;
+	char input_buffer[KCP_ETHTOOL_MSG_LEN];
+	char output_buffer[KCP_ETHTOOL_MSG_LEN];
+	int input_buf_len;
+	int output_buf_len;
+	int err;
+};
+
+static struct ethtool_input_buffer {
+	int magic;
+	void *buffer;
+	int length;
+	struct completion *msg_received;
+	int *err;
+} ethtool_input_buffer;
+
+static struct sock *nl_sock;
+static int pid __read_mostly = -1;
+static struct mutex sync_lock;
+
+static int
+kcp_input_buffer_register(int magic, void *buffer, int length,
+		struct completion *msg_received, int *err)
+{
+	if (ethtool_input_buffer.buffer == NULL) {
+		ethtool_input_buffer.magic = magic;
+		ethtool_input_buffer.buffer = buffer;
+		ethtool_input_buffer.length = length;
+		ethtool_input_buffer.msg_received = msg_received;
+		ethtool_input_buffer.err = err;
+		return 0;
+	}
+
+	return 1;
+}
+
+static void
+kcp_input_buffer_unregister(int magic)
+{
+	if (ethtool_input_buffer.buffer != NULL) {
+		if (magic == ethtool_input_buffer.magic) {
+			ethtool_input_buffer.magic = -1;
+			ethtool_input_buffer.buffer = NULL;
+			ethtool_input_buffer.length = 0;
+			ethtool_input_buffer.msg_received = NULL;
+			ethtool_input_buffer.err = NULL;
+		}
+	}
+}
+
+static void
+nl_recv(struct sk_buff *skb)
+{
+	struct nlmsghdr *nlh;
+	struct kcp_ethtool_msg ethtool_msg;
+
+	nlh = (struct nlmsghdr *)skb->data;
+	if (pid < 0) {
+		pid = nlh->nlmsg_pid;
+		KCP_INFO("PID: %d\n", pid);
+		return;
+	} else if (pid != nlh->nlmsg_pid) {
+		KCP_INFO("Message from unexpected peer: %d\n", nlh->nlmsg_pid);
+		return;
+	}
+
+	memcpy(&ethtool_msg, NLMSG_DATA(nlh), sizeof(struct kcp_ethtool_msg));
+	KCP_DBG("CMD: %d\n", ethtool_msg.cmd_id);
+
+	if (ethtool_input_buffer.magic > 0) {
+		if (ethtool_input_buffer.buffer != NULL) {
+			memcpy(ethtool_input_buffer.buffer,
+					&ethtool_msg.output_buffer,
+					ethtool_input_buffer.length);
+		}
+		*ethtool_input_buffer.err = ethtool_msg.err;
+		complete(ethtool_input_buffer.msg_received);
+		kcp_input_buffer_unregister(ethtool_input_buffer.magic);
+	}
+}
+
+static int
+kcp_nl_send(int cmd_id, int port_id, void *input_buffer, int input_buf_len)
+{
+	struct sk_buff *skb;
+	struct nlmsghdr *nlh;
+	struct kcp_ethtool_msg ethtool_msg;
+
+	memset(&ethtool_msg, 0, sizeof(struct kcp_ethtool_msg));
+	ethtool_msg.cmd_id = cmd_id;
+	ethtool_msg.port_id = port_id;
+
+	if (input_buffer) {
+		if (input_buf_len == 0 || input_buf_len > KCP_ETHTOOL_MSG_LEN)
+			return -EINVAL;
+		ethtool_msg.input_buf_len = input_buf_len;
+		memcpy(ethtool_msg.input_buffer, input_buffer, input_buf_len);
+	}
+
+	skb = nlmsg_new(NLMSG_ALIGN(sizeof(struct kcp_ethtool_msg)),
+			GFP_ATOMIC);
+	nlh = nlmsg_put(skb, 0, 0, NLMSG_DONE, sizeof(struct kcp_ethtool_msg),
+			0);
+
+	NETLINK_CB(skb).dst_group = 0;
+
+	memcpy(nlmsg_data(nlh), &ethtool_msg, sizeof(struct kcp_ethtool_msg));
+
+	nlmsg_unicast(nl_sock, skb, pid);
+	KCP_DBG("Sent cmd:%d port:%d\n", cmd_id, port_id);
+
+	/*nlmsg_free(skb);*/
+
+	return 0;
+}
+
+int
+kcp_nl_exec(int cmd, struct net_device *dev, void *in_data, int in_len,
+		void *out_data, int out_len)
+{
+	struct kcp_dev *priv = netdev_priv(dev);
+	int err = -EINVAL;
+	int ret;
+
+	mutex_lock(&sync_lock);
+	ret = kcp_input_buffer_register(cmd, out_data, out_len,
+			&priv->msg_received, &err);
+	if (ret) {
+		mutex_unlock(&sync_lock);
+		return -EINVAL;
+	}
+
+	kcp_nl_send(cmd, priv->port_id, in_data, in_len);
+	ret = wait_for_completion_interruptible_timeout(&priv->msg_received,
+			 msecs_to_jiffies(10));
+	if (ret == 0 || err < 0) {
+		kcp_input_buffer_unregister(ethtool_input_buffer.magic);
+		mutex_unlock(&sync_lock);
+		return ret == 0 ? -EINVAL : err;
+	}
+	mutex_unlock(&sync_lock);
+
+	return 0;
+}
+
+static struct netlink_kernel_cfg cfg = {
+	.input = nl_recv,
+};
+
+void
+kcp_nl_init(void)
+{
+	nl_sock = netlink_kernel_create(&init_net, KCP_NL_GRP, &cfg);
+	mutex_init(&sync_lock);
+}
+
+void
+kcp_nl_release(void)
+{
+	netlink_kernel_release(nl_sock);
+	pid = -1;
+}
-- 
2.5.0

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [dpdk-dev] [RFC 3/3] examples/ethtool: add control interface support to the application
  2016-01-15 16:18 [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports Ferruh Yigit
  2016-01-15 16:18 ` [dpdk-dev] [RFC 1/3] rte_ctrl_if: add control interface library Ferruh Yigit
  2016-01-15 16:18 ` [dpdk-dev] [RFC 2/3] kcp: add kernel control path kernel module Ferruh Yigit
@ 2016-01-15 16:18 ` Ferruh Yigit
  2016-01-18 16:20 ` [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports Aaron Conole
  2016-01-18 23:12 ` Stephen Hemminger
  4 siblings, 0 replies; 15+ messages in thread
From: Ferruh Yigit @ 2016-01-15 16:18 UTC (permalink / raw)
  To: dev

Add control interface API calls to the sample application.

Control interface support requires the corresponding kernel module (KCP)
to be inserted. If the kernel module is not loaded, the application runs
as before, without kernel control path support.

When the KCP module is inserted, the running application creates a virtual
Linux network interface (dpdk$) per DPDK port. These interfaces can be used
by traditional Linux tools.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
---
 examples/ethtool/ethtool-app/main.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/examples/ethtool/ethtool-app/main.c b/examples/ethtool/ethtool-app/main.c
index e21abcd..bfa2128 100644
--- a/examples/ethtool/ethtool-app/main.c
+++ b/examples/ethtool/ethtool-app/main.c
@@ -44,6 +44,7 @@
 #include <rte_memory.h>
 #include <rte_mempool.h>
 #include <rte_mbuf.h>
+#include <rte_ctrl_if.h>
 
 #include "ethapp.h"
 
@@ -54,7 +55,6 @@
 #define PKTPOOL_EXTRA_SIZE 512
 #define PKTPOOL_CACHE 32
 
-
 struct txq_port {
 	uint16_t cnt_unsent;
 	struct rte_mbuf *buf_frames[MAX_BURST_LENGTH];
@@ -254,6 +254,8 @@ static int slave_main(__attribute__((unused)) void *ptr_data)
 			}
 			rte_spinlock_unlock(&ptr_port->lock);
 		} /* end for( idx_port ) */
+		rte_eth_control_interface_process_msg(
+				RTE_ETHTOOL_CTRL_IF_PROCESS_MSG, 0);
 	} /* end for(;;) */
 
 	return 0;
@@ -293,6 +295,8 @@ int main(int argc, char **argv)
 	id_core = rte_get_next_lcore(id_core, 1, 1);
 	rte_eal_remote_launch(slave_main, NULL, id_core);
 
+	rte_eth_control_interface_create();
+
 	ethapp_main();
 
 	app_cfg.exit_now = 1;
@@ -301,5 +305,7 @@ int main(int argc, char **argv)
 			return -1;
 	}
 
+	rte_eth_control_interface_destroy();
+
 	return 0;
 }
-- 
2.5.0

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports
  2016-01-15 16:18 [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports Ferruh Yigit
                   ` (2 preceding siblings ...)
  2016-01-15 16:18 ` [dpdk-dev] [RFC 3/3] examples/ethtool: add control interface support to the application Ferruh Yigit
@ 2016-01-18 16:20 ` Aaron Conole
  2016-01-19  9:59   ` Ferruh Yigit
  2016-01-18 23:12 ` Stephen Hemminger
  4 siblings, 1 reply; 15+ messages in thread
From: Aaron Conole @ 2016-01-18 16:20 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: dev

Ferruh Yigit <ferruh.yigit@intel.com> writes:
> This work is to make DPDK ports more visible and to enable using common
> Linux tools to configure DPDK ports.

This is a good goal. Only question - why use an additional kernel module
to do this? Is it _JUST_ for ethtool support? I think the other stuff
can be accomplished using netlink sockets + messages, no? The only
trepidation I would have with something like this is the support from
major vendors - out of tree modules are not generally supportable. Might
be good to get some of the ethtool commands as netlink messages as well,
then it is supportable with no 3rd party kernel modules.

Especially since (continued below)...

> Patch is based on KNI but contains only control functionality of it,
> also this patch does not include any Linux kernel network driver as
> part of it.
>
> Basically with the help of a kernel module (KCP), virtual Linux network
> interfaces named as "dpdk$" are created per DPDK port, control messages
> sent to these virtual interfaces are forwarded to DPDK, and response
> sent back to Linux application.
>
> Virtual interfaces created when DPDK application started and destroyed
> automatically when DPDK application terminated.
>
> Communication between kernel-space and DPDK done using netlink socket.

... you're already using a netlink socket here.

> Currently implementation is not complete, sample support added for the
> RFC, more functionality can be added based on community response.
>
> With this RFC Patch, supported: get/set mac address/mtu of DPDK devices,
> getting stats from DPDK devices and some set of ethtool commands.

I actually think there could be some additional features for
debuggability with this approach, so in general I like the goal - I just
have implementation nitpicks.

-Aaron

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports
  2016-01-15 16:18 [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports Ferruh Yigit
                   ` (3 preceding siblings ...)
  2016-01-18 16:20 ` [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports Aaron Conole
@ 2016-01-18 23:12 ` Stephen Hemminger
  2016-01-18 23:48   ` Jay Rolette
  4 siblings, 1 reply; 15+ messages in thread
From: Stephen Hemminger @ 2016-01-18 23:12 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: dev

On Fri, 15 Jan 2016 16:18:01 +0000
Ferruh Yigit <ferruh.yigit@intel.com> wrote:

> This work is to make DPDK ports more visible and to enable using common
> Linux tools to configure DPDK ports.
> 
> Patch is based on KNI but contains only control functionality of it,
> also this patch does not include any Linux kernel network driver as
> part of it.

I actually would like KNI to die and be replaced by something generic.
Right now with KNI it is driver and hardware specific. It is almost as if there
are three drivers for ixgbe, the Linux driver, the DPDK driver, and the KNI driver.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports
  2016-01-18 23:12 ` Stephen Hemminger
@ 2016-01-18 23:48   ` Jay Rolette
  2016-01-19  1:36     ` Stephen Hemminger
  2016-01-19 10:08     ` Ferruh Yigit
  0 siblings, 2 replies; 15+ messages in thread
From: Jay Rolette @ 2016-01-18 23:48 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: DPDK

On Mon, Jan 18, 2016 at 5:12 PM, Stephen Hemminger <
stephen@networkplumber.org> wrote:

> On Fri, 15 Jan 2016 16:18:01 +0000
> Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>
> > This work is to make DPDK ports more visible and to enable using common
> > Linux tools to configure DPDK ports.
> >
> > Patch is based on KNI but contains only control functionality of it,
> > also this patch does not include any Linux kernel network driver as
> > part of it.
>
> I actually would like KNI to die and be replaced by something generic.
> Right now with KNI it is driver and hardware specific. It is almost as if
> there
> are three drivers for ixgbe, the Linux driver, the DPDK driver, and the
> KNI driver.
>

Any ideas about what that would look like? Having the ability to send
traffic to/from DPDK-owned ports from control plane applications that live
outside of (and are ignorant of) DPDK is a platform requirement for our
product.

I'm assuming that isn't uncommon, but that could just be the nature of the
types of products I've built over the years.

That said, I'd love there to be something that performs better and plays
nicer with the system than KNI.

Jay

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports
  2016-01-18 23:48   ` Jay Rolette
@ 2016-01-19  1:36     ` Stephen Hemminger
  2016-01-19 10:08     ` Ferruh Yigit
  1 sibling, 0 replies; 15+ messages in thread
From: Stephen Hemminger @ 2016-01-19  1:36 UTC (permalink / raw)
  To: Jay Rolette; +Cc: DPDK

On Mon, 18 Jan 2016 17:48:51 -0600
Jay Rolette <rolette@infiniteio.com> wrote:

> On Mon, Jan 18, 2016 at 5:12 PM, Stephen Hemminger <
> stephen@networkplumber.org> wrote:
> 
> > On Fri, 15 Jan 2016 16:18:01 +0000
> > Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> >
> > > This work is to make DPDK ports more visible and to enable using common
> > > Linux tools to configure DPDK ports.
> > >
> > > Patch is based on KNI but contains only control functionality of it,
> > > also this patch does not include any Linux kernel network driver as
> > > part of it.
> >
> > I actually would like KNI to die and be replaced by something generic.
> > Right now with KNI it is driver and hardware specific. It is almost as if
> > there
> > are three drivers for ixgbe, the Linux driver, the DPDK driver, and the
> > KNI driver.
> >
> 
> Any ideas about what that would look like? Having the ability to send
> traffic to/from DPDK-owned ports from control plane applications that live
> outside of (and are ignorant of) DPDK is a platform requirement for our
> product.
> 
> I'm assuming that isn't uncommon, but that could just be the nature of the
> types of products I've built over the years.
> 
> That said, I'd love there to be something that performs better and plays
> nicer with the system than KNI.

Maybe something using switchdev API in kernel? Or making the bifurcated
driver model work? Or something more like netmap where actual driver code
is in kernel for controlling hardware and only ring buffers need to be
exposed.

The existing DPDK, although high performance, suffers from lots of
violations of DRY (https://en.wikipedia.org/wiki/Don%27t_repeat_yourself).
For a recent example, we discovered that VLANs don't work on I350
because the code to handle the workaround for byte swapping is not
there in DPDK (it is in the Linux driver).  Because DPDK causes
there to be two driver code bases, this kind of problem is bound
to occur.

I realize this is a very hard problem, and there is no quick solution.
Any long term solution will require work in both spaces, kernel and DPDK.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports
  2016-01-18 16:20 ` [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports Aaron Conole
@ 2016-01-19  9:59   ` Ferruh Yigit
  2016-01-19 11:29     ` Panu Matilainen
  0 siblings, 1 reply; 15+ messages in thread
From: Ferruh Yigit @ 2016-01-19  9:59 UTC (permalink / raw)
  To: Aaron Conole; +Cc: dev

On Mon, Jan 18, 2016 at 11:20:02AM -0500, Aaron Conole wrote:
> Ferruh Yigit <ferruh.yigit@intel.com> writes:
> > This work is to make DPDK ports more visible and to enable using common
> > Linux tools to configure DPDK ports.
> 
> This is a good goal. Only question - why use an additional kernel module
> to do this? Is it _JUST_ for ethtool support? 

The kernel module is used to create/destroy Linux net_devices, and it has a
simple driver for those devices which only handles control messages by passing
them to userspace.

To represent DPDK ports as Linux net_devices we need kernel support.

> I think the other stuff
> can be accomplished using netlink sockets + messages, no?

Netlink sockets are only used for communication between kernel space and user
space; that is not why we need a kernel module. For example, in the original
KNI this communication is implemented as part of the FIFO mechanism.

> The only
> trepidation I would have with something like this is the support from
> major vendors - out of tree modules are not generally supportable. Might
> be good to get some of the ethtool commands as netlink messages as well,
> then it is supportable with no 3rd party kernel modules.

Yes, there is an out-of-tree module problem for some distros, but unfortunately
we have not been able to find a solution for this case without an external
kernel module.

This patch is still an RFC, and if someone suggests a solution that does not
need a kernel module, we can work on it together.
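
As a rough illustration of what "passing control messages into userspace"
implies on the application side, here is a minimal sketch. It is not the
rte_ctrl_if code from this series. It only assumes the message layout of
struct kcp_ethtool_msg from kcp_nl.c; the command ID value, the port table,
the MTU limits and the function name are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MSG_LEN   500	/* mirrors KCP_ETHTOOL_MSG_LEN in kcp_nl.c */
#define MAX_PORTS 4	/* hypothetical number of ports */

/* Hypothetical command ID; the real RTE_KCP_REQ_* values are not shown. */
enum { REQ_CHANGE_MTU = 1 };

/* Same field order as struct kcp_ethtool_msg in kcp_nl.c. */
struct ctrl_msg {
	int cmd_id;
	int port_id;
	char input_buffer[MSG_LEN];
	char output_buffer[MSG_LEN];
	int input_buf_len;
	int output_buf_len;
	int err;
};

/* Stand-in for per-port state owned by the application. */
static int port_mtu[MAX_PORTS] = { 1500, 1500, 1500, 1500 };

/*
 * Validate one request received from the kernel and fill in the reply
 * fields in place. The caller would then send the message back over
 * the netlink socket.
 */
void
handle_request(struct ctrl_msg *msg)
{
	int mtu;

	msg->output_buf_len = 0;
	msg->err = -EINVAL;

	if (msg->port_id < 0 || msg->port_id >= MAX_PORTS)
		return;

	switch (msg->cmd_id) {
	case REQ_CHANGE_MTU:
		if (msg->input_buf_len != (int)sizeof(mtu))
			return;
		memcpy(&mtu, msg->input_buffer, sizeof(mtu));
		if (mtu < 68 || mtu > 9000)	/* illustrative limits */
			return;
		/*
		 * A real application would apply the change to the port
		 * here, for example with rte_eth_dev_set_mtu().
		 */
		port_mtu[msg->port_id] = mtu;
		msg->err = 0;
		break;
	default:
		msg->err = -EOPNOTSUPP;
		break;
	}
}
```

The point of the sketch is that the kernel side stays generic: it forwards a
command ID plus an opaque buffer, and all device knowledge lives in userspace.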

Thanks,
ferruh

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports
  2016-01-18 23:48   ` Jay Rolette
  2016-01-19  1:36     ` Stephen Hemminger
@ 2016-01-19 10:08     ` Ferruh Yigit
  1 sibling, 0 replies; 15+ messages in thread
From: Ferruh Yigit @ 2016-01-19 10:08 UTC (permalink / raw)
  To: Jay Rolette; +Cc: DPDK

On Mon, Jan 18, 2016 at 05:48:51PM -0600, Jay Rolette wrote:
> On Mon, Jan 18, 2016 at 5:12 PM, Stephen Hemminger <
> stephen@networkplumber.org> wrote:
> 
> > On Fri, 15 Jan 2016 16:18:01 +0000
> > Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> >
> > > This work is to make DPDK ports more visible and to enable using common
> > > Linux tools to configure DPDK ports.
> > >
> > > Patch is based on KNI but contains only control functionality of it,
> > > also this patch does not include any Linux kernel network driver as
> > > part of it.
> >
> > I actually would like KNI to die and be replaced by something generic.
> > Right now with KNI it is driver and hardware specific. It is almost as if
> > there
> > are three drivers for ixgbe, the Linux driver, the DPDK driver, and the
> > KNI driver.
> >
> 
> Any ideas about what that would look like? Having the ability to send
> traffic to/from DPDK-owned ports from control plane applications that live
> outside of (and are ignorant of) DPDK is a platform requirement for our
> product.
> 
> I'm assuming that isn't uncommon, but that could just be the nature of the
> types of products I've built over the years.
> 
> That said, I'd love there to be something that performs better and plays
> nicer with the system than KNI.
> 
There is also other work going on for slow path communication, which converts
KNI's slow path communication part into a PMD, to make it easier to use. An RFC
patch will be on the mailing list in the next few days.

Overall, the two responsibilities of KNI will be split into two different
pieces of code, with some enhancements.

If these new pieces are accepted by users and cover all KNI use cases, in the
long run KNI can be deprecated...

Thanks,
ferruh

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports
  2016-01-19  9:59   ` Ferruh Yigit
@ 2016-01-19 11:29     ` Panu Matilainen
  2016-02-04 13:30       ` Ferruh Yigit
  0 siblings, 1 reply; 15+ messages in thread
From: Panu Matilainen @ 2016-01-19 11:29 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: dev

On 01/19/2016 11:59 AM, Ferruh Yigit wrote:
> On Mon, Jan 18, 2016 at 11:20:02AM -0500, Aaron Conole wrote:
>> Ferruh Yigit <ferruh.yigit@intel.com> writes:
>>> This work is to make DPDK ports more visible and to enable using common
>>> Linux tools to configure DPDK ports.
>>
>> This is a good goal. Only question - why use an additional kernel module
>> to do this? Is it _JUST_ for ethtool support?
>
> Kernel module used to create/destroy Linux net_devices, and module has a simple
> driver for that device which only handles control messages by passing them into
> userspace.
>
> To represent DPDK ports as Linux net_devices we need kernel support.
>
>> I think the other stuff
>> can be accomplished using netlink sockets + messages, no?
>
> Netlink sockets just used to communicate kernel-space - user-space, this is not
> why we need a kernel module, for example this communication is implemented in
> original KNI as part of FIFO.
>
>> The only
>> trepidation I would have with something like this is the support from
>> major vendors - out of tree modules are not generally supportable. Might
>> be good to get some of the ethtool commands as netlink messages as well,
>> then it is supportable with no 3rd party kernel modules.
>
> Yes, there is a out of three module problem for some distros, but unfortunately
> we are not able to find a solution for this case without an external kernel module.
>
> This patch is still an RFC and if we receive suggested solution without a kernel
> module, we can work on it together.

If it has to be in the kernel then you need to find a design that is 
upstreamable. Out of tree kernel modules are not a solution, they're a 
problem that people are working on eliminating.

	- Panu -

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports
  2016-01-19 11:29     ` Panu Matilainen
@ 2016-02-04 13:30       ` Ferruh Yigit
  2016-02-04 13:38         ` Ferruh Yigit
  2016-02-04 14:40         ` Aaron Conole
  0 siblings, 2 replies; 15+ messages in thread
From: Ferruh Yigit @ 2016-02-04 13:30 UTC (permalink / raw)
  To: Panu Matilainen; +Cc: dev

On Tue, Jan 19, 2016 at 01:29:32PM +0200, Panu Matilainen wrote:
> On 01/19/2016 11:59 AM, Ferruh Yigit wrote:
>> On Mon, Jan 18, 2016 at 11:20:02AM -0500, Aaron Conole wrote:
>>> Ferruh Yigit <ferruh.yigit@intel.com> writes:
>>>> This work is to make DPDK ports more visible and to enable using common
>>>> Linux tools to configure DPDK ports.
>>>
>>> This is a good goal. Only question - why use an additional kernel module
>>> to do this? Is it _JUST_ for ethtool support?
>>
>> Kernel module used to create/destroy Linux net_devices, and module has a simple
>> driver for that device which only handles control messages by passing them into
>> userspace.
>>
>> To represent DPDK ports as Linux net_devices we need kernel support.
>>
>>> I think the other stuff
>>> can be accomplished using netlink sockets + messages, no?
>>
>> Netlink sockets just used to communicate kernel-space - user-space, this is not
>> why we need a kernel module, for example this communication is implemented in
>> original KNI as part of FIFO.
>>
>>> The only
>>> trepidation I would have with something like this is the support from
>>> major vendors - out of tree modules are not generally supportable. Might
>>> be good to get some of the ethtool commands as netlink messages as well,
>>> then it is supportable with no 3rd party kernel modules.
>>
>> Yes, there is a out of three module problem for some distros, but unfortunately
>> we are not able to find a solution for this case without an external kernel module.
>>
>> This patch is still an RFC and if we receive suggested solution without a kernel
>> module, we can work on it together.
>
> If it has to be in the kernel then you need to find a design that is 
> upstreamable. Out of tree kernel modules are not a solution, they're a 
> problem that people are working on eliminating.
>

Hi Stephen, and other Linux experts on the mailing list,

Can you please help find an upstreamable solution for the kernel control path?

Mainly what we are looking for is userspace network driver support in the kernel, similar to what FUSE does but in a much simpler form.

The KCP module above basically does this, by having a network driver which passes requests to a userspace network driver, but it is not generic enough.

I wonder if it is possible to make it more generic by extending rtnetlink support:
1- Add a new network driver to Linux (or update an existing one like tun) to forward requests and get responses.
2- Extend rtnetlink to support attaching any userspace driver to this device? (ip link set <device> uspace <?> ?)

Does this make sense?

rtnetlink already supports creating interfaces, and it provides kernel/user space communication;
with "attach" support the interface learns about its peer in userspace and can communicate with it.

A FUSE-like communication method could also be an alternative for transferring requests and responses, but since rtnetlink support exists, I think there is no need to create something new.

Thanks,
ferruh

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports
  2016-02-04 13:30       ` Ferruh Yigit
@ 2016-02-04 13:38         ` Ferruh Yigit
  2016-02-04 14:40         ` Aaron Conole
  1 sibling, 0 replies; 15+ messages in thread
From: Ferruh Yigit @ 2016-02-04 13:38 UTC (permalink / raw)
  To: Panu Matilainen; +Cc: dev

On Thu, Feb 04, 2016 at 01:30:35PM +0000, Ferruh Yigit wrote:
> On Tue, Jan 19, 2016 at 01:29:32PM +0200, Panu Matilainen wrote:
> > On 01/19/2016 11:59 AM, Ferruh Yigit wrote:
> >> On Mon, Jan 18, 2016 at 11:20:02AM -0500, Aaron Conole wrote:
> >>> Ferruh Yigit <ferruh.yigit@intel.com> writes:
> >>>> This work is to make DPDK ports more visible and to enable using common
> >>>> Linux tools to configure DPDK ports.
> >>>
> >>> This is a good goal. Only question - why use an additional kernel module
> >>> to do this? Is it _JUST_ for ethtool support?
> >>
> >> Kernel module used to create/destroy Linux net_devices, and module has a simple
> >> driver for that device which only handles control messages by passing them into
> >> userspace.
> >>
> >> To represent DPDK ports as Linux net_devices we need kernel support.
> >>
> >>> I think the other stuff
> >>> can be accomplished using netlink sockets + messages, no?
> >>
> >> Netlink sockets just used to communicate kernel-space - user-space, this is not
> >> why we need a kernel module, for example this communication is implemented in
> >> original KNI as part of FIFO.
> >>
> >>> The only
> >>> trepidation I would have with something like this is the support from
> >>> major vendors - out of tree modules are not generally supportable. Might
> >>> be good to get some of the ethtool commands as netlink messages as well,
> >>> then it is supportable with no 3rd party kernel modules.
> >>
> >> Yes, there is a out of three module problem for some distros, but unfortunately
> >> we are not able to find a solution for this case without an external kernel module.
> >>
> >> This patch is still an RFC and if we receive suggested solution without a kernel
> >> module, we can work on it together.
> >
> > If it has to be in the kernel then you need to find a design that is 
> > upstreamable. Out of tree kernel modules are not a solution, they're a 
> > problem that people are working on eliminating.
> >
> 
> Hi Stephen, and other Linux experts in the mail list,

Forgot to add Stephen Hemminger to Cc, doing so now.

> 
> Can you please help finding a upstreamable solution for kernel control path?
> 
> Mainly what we are looking for is userspace network driver support in kernel, similar to what FUSE does but a much simple version.
> 
> Above KCP module basically does this, by having a network driver which passing requests to userspace network driver, but it is not generic enough.
> 
> I wonder if it is possible make it more generic by extending rtnetlink support:
> 1- Add a new network driver to Linux (or update existing one like tun) to forward requests, get responses.
> 2- Extend rtnelink to support to attach any userspace driver to this device? (ip link set <device> uspace <?> ?)
> 
> Does this make sense?
> 
> rtnetlink already supports creating interfaces, and it provides kernel/user space communication,
> with "attach" support interface learns about it's peer in usersppace and can communicate.
> 
> FUSE like communication method also can be alternative to transfer request and responses, but since rtnelink support exists, no need to create something new think.
> 
> Thanks,
> ferruh
> 


* Re: [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports
  2016-02-04 13:30       ` Ferruh Yigit
  2016-02-04 13:38         ` Ferruh Yigit
@ 2016-02-04 14:40         ` Aaron Conole
  2016-02-04 16:28           ` Ferruh Yigit
  1 sibling, 1 reply; 15+ messages in thread
From: Aaron Conole @ 2016-02-04 14:40 UTC (permalink / raw)
  To: Panu Matilainen; +Cc: dev

Hi Ferruh,

I missed your original reply to me. Sorry.

Ferruh Yigit <ferruh.yigit@intel.com> writes:
> On Tue, Jan 19, 2016 at 01:29:32PM +0200, Panu Matilainen wrote:
>> On 01/19/2016 11:59 AM, Ferruh Yigit wrote:
>>> On Mon, Jan 18, 2016 at 11:20:02AM -0500, Aaron Conole wrote:
>>>> Ferruh Yigit <ferruh.yigit@intel.com> writes:
>>>>> This work is to make DPDK ports more visible and to enable using common
>>>>> Linux tools to configure DPDK ports.
>>>>
>>>> This is a good goal. Only question - why use an additional kernel module
>>>> to do this? Is it _JUST_ for ethtool support?
>>>
>>> The kernel module is used to create/destroy Linux net_devices, and the module has a simple
>>> driver for that device which only handles control messages by passing them to
>>> userspace.
>>>
>>> To represent DPDK ports as Linux net_devices we need kernel support.

Why? Just create tun/tap interface, no? Then you get a queue into the
network stack, as well. Subscribe to netlink, and you can get all of the
changes that happen in the system - just look for those messages that
relate to your tun device. At least, that's what I see right away (and I
have some private patches for this, and you can take them over if you want).

I think most of the stuff you are trying to solve already exists, but I
am probably misunderstanding something (apologies for that).
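
The tun/tap plus netlink idea above can be sketched roughly as follows. This
is only an illustration under assumptions, not the private patches mentioned:
the function names (open_link_monitor, is_link_event_for, count_link_events)
are made up for this example, and the interface index of the already-created
tun/tap device is assumed to be known to the caller. It subscribes to the
rtnetlink RTMGRP_LINK group and keeps only the notifications that concern
that one device.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Open an rtnetlink socket subscribed to link-change notifications.
 * Returns the socket fd, or -1 on failure. (Hypothetical helper name.) */
int open_link_monitor(void)
{
    struct sockaddr_nl addr;
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_groups = RTMGRP_LINK;      /* link up/down, MTU, MAC changes */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Return 1 if nh is an RTM_NEWLINK/RTM_DELLINK message for ifindex,
 * 0 otherwise. (Hypothetical helper name.) */
int is_link_event_for(const struct nlmsghdr *nh, int ifindex)
{
    const struct ifinfomsg *ifi;

    assert(nh != NULL);
    if (nh->nlmsg_type != RTM_NEWLINK && nh->nlmsg_type != RTM_DELLINK)
        return 0;
    if (nh->nlmsg_len < NLMSG_LENGTH(sizeof(*ifi)))
        return 0;                      /* too short to carry ifinfomsg */
    ifi = NLMSG_DATA(nh);
    return ifi->ifi_index == ifindex;
}

/* Read one datagram from the monitor socket and count the link events
 * that belong to ifindex. Returns -1 if recv() fails. */
int count_link_events(int fd, int ifindex)
{
    union { struct nlmsghdr hdr; char raw[8192]; } buf; /* aligned buffer */
    struct nlmsghdr *nh;
    int matches = 0;
    int len = (int)recv(fd, buf.raw, sizeof(buf.raw), 0);

    if (len < 0)
        return -1;
    for (nh = &buf.hdr; NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len))
        matches += is_link_event_for(nh, ifindex);
    return matches;
}
```

A caller would obtain the index with if_nametoindex() on the tun/tap device
name, call open_link_monitor() once, and then call count_link_events() in
its control loop. Note that this path only delivers notifications from the
kernel to userspace.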

>>>> I think the other stuff
>>>> can be accomplished using netlink sockets + messages, no?
>>>
>>> Netlink sockets are just used for kernel-space to user-space communication; this is not
>>> why we need a kernel module. For example, this communication is implemented in
>>> the original KNI as part of the FIFO.
>>>
>>>> The only
>>>> trepidation I would have with something like this is the support from
>>>> major vendors - out of tree modules are not generally supportable. Might
>>>> be good to get some of the ethtool commands as netlink messages as well,
>>>> then it is supportable with no 3rd party kernel modules.
>>>
>>> Yes, there is an out-of-tree module problem for some distros, but unfortunately
>>> we are not able to find a solution for this case without an
>>> external kernel module.
>>>
>>> This patch is still an RFC and if we receive a suggested solution without a kernel
>>> module, we can work on it together.
>>
>> If it has to be in the kernel then you need to find a design that is 
>> upstreamable. Out of tree kernel modules are not a solution, they're a 
>> problem that people are working on eliminating.
>>
>
> Hi Stephen, and other Linux experts on the mailing list,
>
> Can you please help find an upstreamable solution for the kernel control path?
>
> Mainly what we are looking for is userspace network driver support in
> the kernel, similar to what FUSE does but a much simpler version.
>
> The KCP module above basically does this, by having a network driver which
> passes requests to a userspace network driver, but it is not generic
> enough.
>
> I wonder if it is possible to make it more generic by extending rtnetlink support:
> 1- Add a new network driver to Linux (or update an existing one like tun)
> to forward requests and get responses.
> 2- Extend rtnetlink to support attaching any userspace driver to this
> device? (ip link set <device> uspace <?> ?)
>
> Does this make sense?
>
> rtnetlink already supports creating interfaces, and it provides
> kernel/user space communication;
> with "attach" support, the interface learns about its peer in userspace
> and can communicate.
>
> A FUSE-like communication method could also be an alternative for transferring
> requests and responses, but since rtnetlink support exists, there is no need to
> create something new.
>
> Thanks,
> ferruh


* Re: [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports
  2016-02-04 14:40         ` Aaron Conole
@ 2016-02-04 16:28           ` Ferruh Yigit
  0 siblings, 0 replies; 15+ messages in thread
From: Ferruh Yigit @ 2016-02-04 16:28 UTC (permalink / raw)
  To: Aaron Conole; +Cc: dev

On Thu, Feb 04, 2016 at 09:40:18AM -0500, Aaron Conole wrote:
> Hi Ferruh,
Hi Aaron,

> 
> I missed your original reply to me. Sorry.
> 
> Ferruh Yigit <ferruh.yigit@intel.com> writes:
> > On Tue, Jan 19, 2016 at 01:29:32PM +0200, Panu Matilainen wrote:
> >> On 01/19/2016 11:59 AM, Ferruh Yigit wrote:
> >>> On Mon, Jan 18, 2016 at 11:20:02AM -0500, Aaron Conole wrote:
> >>>> Ferruh Yigit <ferruh.yigit@intel.com> writes:
> >>>>> This work is to make DPDK ports more visible and to enable using common
> >>>>> Linux tools to configure DPDK ports.
> >>>>
> >>>> This is a good goal. Only question - why use an additional kernel module
> >>>> to do this? Is it _JUST_ for ethtool support?
> >>>
> >>> The kernel module is used to create/destroy Linux net_devices, and the module has a simple
> >>> driver for that device which only handles control messages by passing them to
> >>> userspace.
> >>>
> >>> To represent DPDK ports as Linux net_devices we need kernel support.
> 
> Why? Just create tun/tap interface, no? Then you get a queue into the
> network stack, as well. Subscribe to netlink, and you can get all of the
> changes that happen in the system - just look for those messages that
> relate to your tun device. At least, that's what I see right away (and I
> have some private patches for this, and you can take them over if you want).
> 
Do you mean subscribing to rtnl messages, like "ip monitor" does?
If so, the problem is that it is unidirectional, from kernel to userspace.

Some operations can be applied with this method, but it still lacks the ability to return the status of the action.

But many can't be applied, because the userspace driver can't send data to the kernel;
almost all ethtool commands, for example, require information from the userspace driver.

On the kernel side, an entity is required to listen for and handle userspace driver responses.
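
To illustrate why a notification-only channel is not enough, here is a small
sketch of what an ethtool query looks like from the calling application's
side. It is a generic example under assumptions, not code from the RFC: the
helper name query_drvinfo is made up. The request is a synchronous
SIOCETHTOOL ioctl, and the reply buffer has to be filled in by whatever
driver sits behind the net_device before the call returns, so something in
the kernel has to obtain that data from the userspace driver.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <linux/ethtool.h>
#include <linux/sockios.h>

/* Ask the driver behind ifname for its ethtool driver info.
 * Returns 0 on success and fills *info, or -1 on any failure.
 * (Hypothetical helper name, for illustration only.) */
int query_drvinfo(const char *ifname, struct ethtool_drvinfo *info)
{
    struct ifreq ifr;
    int fd, ret;

    assert(info != NULL);
    if (ifname == NULL || strlen(ifname) >= IFNAMSIZ)
        return -1;                     /* name would not fit in ifreq */

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof(ifr));
    memset(info, 0, sizeof(*info));
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
    info->cmd = ETHTOOL_GDRVINFO;      /* request: driver name, version... */
    ifr.ifr_data = (void *)info;       /* reply is written into *info */

    /* Blocks until the kernel-side driver has produced the answer. */
    ret = ioctl(fd, SIOCETHTOOL, &ifr);
    close(fd);
    return ret < 0 ? -1 : 0;
}
```

For a device such as "dpdk0" the answer would have to come from the DPDK
application, which is the request/response path the kernel-side entity
needs to provide.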

> I think most of the stuff you are trying to solve already exists, but I
> am probably misunderstanding something (apologies for that).
> 
> >>>> I think the other stuff
> >>>> can be accomplished using netlink sockets + messages, no?
> >>>
> >>> Netlink sockets are just used for kernel-space to user-space communication; this is not
> >>> why we need a kernel module. For example, this communication is implemented in
> >>> the original KNI as part of the FIFO.
> >>>
> >>>> The only
> >>>> trepidation I would have with something like this is the support from
> >>>> major vendors - out of tree modules are not generally supportable. Might
> >>>> be good to get some of the ethtool commands as netlink messages as well,
> >>>> then it is supportable with no 3rd party kernel modules.
> >>>
> >>> Yes, there is an out-of-tree module problem for some distros, but unfortunately
> >>> we are not able to find a solution for this case without an
> >>> external kernel module.
> >>>
> >>> This patch is still an RFC and if we receive a suggested solution without a kernel
> >>> module, we can work on it together.
> >>
> >> If it has to be in the kernel then you need to find a design that is 
> >> upstreamable. Out of tree kernel modules are not a solution, they're a 
> >> problem that people are working on eliminating.
> >>
> >
> > Hi Stephen, and other Linux experts on the mailing list,
> >
> > Can you please help find an upstreamable solution for the kernel control path?
> >
> > Mainly what we are looking for is userspace network driver support in
> > the kernel, similar to what FUSE does but a much simpler version.
> >
> > The KCP module above basically does this, by having a network driver which
> > passes requests to a userspace network driver, but it is not generic
> > enough.
> >
> > I wonder if it is possible to make it more generic by extending rtnetlink support:
> > 1- Add a new network driver to Linux (or update an existing one like tun)
> > to forward requests and get responses.
> > 2- Extend rtnetlink to support attaching any userspace driver to this
> > device? (ip link set <device> uspace <?> ?)
> >
> > Does this make sense?
> >
> > rtnetlink already supports creating interfaces, and it provides
> > kernel/user space communication;
> > with "attach" support, the interface learns about its peer in userspace
> > and can communicate.
> >
> > A FUSE-like communication method could also be an alternative for transferring
> > requests and responses, but since rtnetlink support exists, there is no need to
> > create something new.
> >
> > Thanks,
> > ferruh


Thread overview: 15+ messages
2016-01-15 16:18 [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports Ferruh Yigit
2016-01-15 16:18 ` [dpdk-dev] [RFC 1/3] rte_ctrl_if: add control interface library Ferruh Yigit
2016-01-15 16:18 ` [dpdk-dev] [RFC 2/3] kcp: add kernel control path kernel module Ferruh Yigit
2016-01-15 16:18 ` [dpdk-dev] [RFC 3/3] examples/ethtool: add control interface support to the application Ferruh Yigit
2016-01-18 16:20 ` [dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports Aaron Conole
2016-01-19  9:59   ` Ferruh Yigit
2016-01-19 11:29     ` Panu Matilainen
2016-02-04 13:30       ` Ferruh Yigit
2016-02-04 13:38         ` Ferruh Yigit
2016-02-04 14:40         ` Aaron Conole
2016-02-04 16:28           ` Ferruh Yigit
2016-01-18 23:12 ` Stephen Hemminger
2016-01-18 23:48   ` Jay Rolette
2016-01-19  1:36     ` Stephen Hemminger
2016-01-19 10:08     ` Ferruh Yigit
