DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [RFC v3] /net: memory interface (memif)
@ 2018-12-13 13:30 Jakub Grajciar
  2018-12-13 18:07 ` Stephen Hemminger
                   ` (5 more replies)
  0 siblings, 6 replies; 58+ messages in thread
From: Jakub Grajciar @ 2018-12-13 13:30 UTC (permalink / raw)
  To: dev; +Cc: Jakub Grajciar

Memory interface (memif), provides high performance
packet transfer over shared memory.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
---
 config/common_base                          |    5 +
 config/common_linuxapp                      |    1 +
 doc/guides/nics/memif.rst                   |   80 ++
 drivers/net/Makefile                        |    1 +
 drivers/net/memif/Makefile                  |   29 +
 drivers/net/memif/memif.h                   |  143 +++
 drivers/net/memif/memif_socket.c            | 1093 +++++++++++++++++
 drivers/net/memif/memif_socket.h            |  104 ++
 drivers/net/memif/meson.build               |   10 +
 drivers/net/memif/rte_eth_memif.c           | 1189 +++++++++++++++++++
 drivers/net/memif/rte_eth_memif.h           |  207 ++++
 drivers/net/memif/rte_pmd_memif_version.map |    4 +
 drivers/net/meson.build                     |    1 +
 mk/rte.app.mk                               |    1 +
 14 files changed, 2868 insertions(+)
 create mode 100644 doc/guides/nics/memif.rst
 create mode 100644 drivers/net/memif/Makefile
 create mode 100644 drivers/net/memif/memif.h
 create mode 100644 drivers/net/memif/memif_socket.c
 create mode 100644 drivers/net/memif/memif_socket.h
 create mode 100644 drivers/net/memif/meson.build
 create mode 100644 drivers/net/memif/rte_eth_memif.c
 create mode 100644 drivers/net/memif/rte_eth_memif.h
 create mode 100644 drivers/net/memif/rte_pmd_memif_version.map

v3:
- coding style fixes
- documentation
- doxygen comments
- use strlcpy() instead of strncpy()
- use RTE_BUILD_BUG_ON instead of _Static_assert()
- fix meson build deps

diff --git a/config/common_base b/config/common_base
index d12ae98bc..b8ed10ae5 100644
--- a/config/common_base
+++ b/config/common_base
@@ -403,6 +403,11 @@ CONFIG_RTE_LIBRTE_VMXNET3_DEBUG_TX_FREE=n
 #
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 
+#
+# Compile Memory Interface PMD driver (Linux only)
+#
+CONFIG_RTE_LIBRTE_PMD_MEMIF=n
+
 #
 # Compile link bonding PMD library
 #
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 6c1c8d0f4..42cbde8f5 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -18,6 +18,7 @@ CONFIG_RTE_LIBRTE_VHOST_POSTCOPY=n
 CONFIG_RTE_LIBRTE_PMD_VHOST=y
 CONFIG_RTE_LIBRTE_IFC_PMD=y
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
+CONFIG_RTE_LIBRTE_PMD_MEMIF=y
 CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
 CONFIG_RTE_LIBRTE_PMD_TAP=y
 CONFIG_RTE_LIBRTE_AVP_PMD=y
diff --git a/doc/guides/nics/memif.rst b/doc/guides/nics/memif.rst
new file mode 100644
index 000000000..55d477840
--- /dev/null
+++ b/doc/guides/nics/memif.rst
@@ -0,0 +1,80 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2018 Cisco Systems, Inc.
+
+Memif Poll Mode Driver
+======================
+
+Memif PMD allows for DPDK and any other client using memif (DPDK, VPP, libmemif) to
+communicate using shared memory.
+
+The created device, transmits packets in a raw format. It can be used with Ethernet
+mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is supported in
+DPDK memif implementation.
+
+The method to enable one or more interfaces is to use the
+``--vdev=net_memif0`` option on the DPDK application command line. Each
+``--vdev=net_memif1`` option given will create an interface named net_memif0, net_memif1,
+and so on.  Memif uses unix domain socket to transmit control messages.
+Each memif has a unique id per socket. If you are connecting multiple
+interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``, etc.
+Note that if you assign a socket to a master interface it becomes a listener socket.
+Listener socket can not be used by a slave interface on same client.
+
+Memif works in two roles: master and slave. Slave connects to master over an existing
+socket. It is also a producer of shared memory file and initializes the shared memory.
+Master creates the socket and listens for any slave connection requests. The socket may
+already exist on the system. Be sure to remove any such sockets, if you are
+creating a master interface, or you will see an "Address already in use" error. Function
+``rte_pmd_memif_remove()``, which removes memif interface, will also remove a listener socket,
+if  it is not beeing used by any other interface.
+
+.. csv-table:: Memif configuration options:
+   :header: "Option", "Description", "Default"
+
+   "id=0", "Each memif on same socket must be given a unique id", 0
+   "role=master", "Set memif role", slave
+   "bsize=1024", "Size of packet buffer", 2048
+   "rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", 10
+   "nrxq=2", "Number of RX queues", 1
+   "ntxq=2", "Number of TX queues", 1
+   "socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock"
+   "mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef"
+   "secret=abc123", "Secret is an optional security option, which if specified, must be matched by peer",
+    "zero-copy=yes", "Enable/disable zero-copy slave mode", no
+
+Example: testpmd and VPP
+------------------------
+
+For information on how to get and run VPP please see `<https://wiki.fd.io/view/VPP>`_.
+
+Start VPP in interactive mode (should be by default). Create memif master interface in VPP::
+
+    vpp# create interface memif id 0 master no-zero-copy
+    vpp# set interface state memif0/0 up
+    vpp# set interface ip address memif0/0 192.168.1.1/24
+
+To see socket filename use show memif command::
+
+    vpp# show memif
+    sockets
+     id  listener    filename
+      0   yes (1)     /run/vpp/memif.sock
+    ...
+
+Now create memif interface by running testpmd with these command line options::
+
+    #./testpmd --vdev=net_memif,socket=/run/vpp/memif.sock -- -i
+
+Testpmd should now create memif slave interface and try to connect to master.
+In testpmd set forward option to icmpecho and start forwarding::
+
+    testpmd> set fwd icmpecho
+    testpmd> start
+
+Send ping from VPP::
+
+    vpp# ping 192.168.1.2
+    64 bytes from 192.168.1.2: icmp_seq=2 ttl=254 time=36.2918 ms
+    64 bytes from 192.168.1.2: icmp_seq=3 ttl=254 time=23.3927 ms
+    64 bytes from 192.168.1.2: icmp_seq=4 ttl=254 time=24.2975 ms
+    64 bytes from 192.168.1.2: icmp_seq=5 ttl=254 time=17.7049 ms
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index c0386feb9..0feab5241 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -32,6 +32,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_FM10K_PMD) += fm10k
 DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e
 DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
 DIRS-$(CONFIG_RTE_LIBRTE_LIO_PMD) += liquidio
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif
 DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
 DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
 DIRS-$(CONFIG_RTE_LIBRTE_MVNETA_PMD) += mvneta
diff --git a/drivers/net/memif/Makefile b/drivers/net/memif/Makefile
new file mode 100644
index 000000000..a82448423
--- /dev/null
+++ b/drivers/net/memif/Makefile
@@ -0,0 +1,29 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018 Cisco Systems, Inc.  All rights reserved.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_memif.a
+
+EXPORT_MAP := rte_pmd_memif_version.map
+
+LIBABIVER := 1
+
+CFLAGS += -O3
+CFLAGS += -I$(SRCDIR)
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -Wno-pointer-arith
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
+LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs
+LDLIBS += -lrte_bus_vdev
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += rte_eth_memif.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif_socket.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/memif/memif.h b/drivers/net/memif/memif.h
new file mode 100644
index 000000000..8b759d74b
--- /dev/null
+++ b/drivers/net/memif/memif.h
@@ -0,0 +1,143 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_H_
+#define _MEMIF_H_
+
+#ifndef MEMIF_CACHELINE_SIZE
+#define MEMIF_CACHELINE_SIZE 64
+#endif
+
+#define MEMIF_COOKIE		0x3E31F20
+#define MEMIF_VERSION_MAJOR	2
+#define MEMIF_VERSION_MINOR	0
+#define MEMIF_VERSION		((MEMIF_VERSION_MAJOR << 8) | MEMIF_VERSION_MINOR)
+
+/*
+ *  Type definitions
+ */
+
+typedef enum memif_msg_type {
+	MEMIF_MSG_TYPE_NONE = 0,
+	MEMIF_MSG_TYPE_ACK = 1,
+	MEMIF_MSG_TYPE_HELLO = 2,
+	MEMIF_MSG_TYPE_INIT = 3,
+	MEMIF_MSG_TYPE_ADD_REGION = 4,
+	MEMIF_MSG_TYPE_ADD_RING = 5,
+	MEMIF_MSG_TYPE_CONNECT = 6,
+	MEMIF_MSG_TYPE_CONNECTED = 7,
+	MEMIF_MSG_TYPE_DISCONNECT = 8,
+} memif_msg_type_t;
+
+typedef enum {
+	MEMIF_RING_S2M = 0,
+	MEMIF_RING_M2S = 1
+} memif_ring_type_t;
+
+typedef enum {
+	MEMIF_INTERFACE_MODE_ETHERNET = 0,
+	MEMIF_INTERFACE_MODE_IP = 1,
+	MEMIF_INTERFACE_MODE_PUNT_INJECT = 2,
+} memif_interface_mode_t;
+
+typedef uint16_t memif_region_index_t;
+typedef uint32_t memif_region_offset_t;
+typedef uint64_t memif_region_size_t;
+typedef uint16_t memif_ring_index_t;
+typedef uint32_t memif_interface_id_t;
+typedef uint16_t memif_version_t;
+typedef uint8_t memif_log2_ring_size_t;
+
+/*
+ *  Socket messages
+ */
+
+typedef struct __attribute__ ((packed)) {
+	uint8_t name[32];
+	memif_version_t min_version;
+	memif_version_t max_version;
+	memif_region_index_t max_region;
+	memif_ring_index_t max_m2s_ring;
+	memif_ring_index_t max_s2m_ring;
+	memif_log2_ring_size_t max_log2_ring_size;
+} memif_msg_hello_t;
+
+typedef struct __attribute__ ((packed)) {
+	memif_version_t version;
+	memif_interface_id_t id;
+	memif_interface_mode_t mode:8;
+	uint8_t secret[24];
+	uint8_t name[32];
+} memif_msg_init_t;
+
+typedef struct __attribute__ ((packed)) {
+	memif_region_index_t index;
+	memif_region_size_t size;
+} memif_msg_add_region_t;
+
+typedef struct __attribute__ ((packed)) {
+	uint16_t flags;
+#define MEMIF_MSG_ADD_RING_FLAG_S2M	(1 << 0)
+	memif_ring_index_t index;
+	memif_region_index_t region;
+	memif_region_offset_t offset;
+	memif_log2_ring_size_t log2_ring_size;
+	uint16_t private_hdr_size;	/* used for private metadata */
+} memif_msg_add_ring_t;
+
+typedef struct __attribute__ ((packed)) {
+	uint8_t if_name[32];
+} memif_msg_connect_t;
+
+typedef struct __attribute__ ((packed)) {
+	uint8_t if_name[32];
+} memif_msg_connected_t;
+
+typedef struct __attribute__ ((packed)) {
+	uint32_t code;
+	uint8_t string[96];
+} memif_msg_disconnect_t;
+
+typedef struct __attribute__ ((packed, aligned(128))) {
+	memif_msg_type_t type:16;
+	union {
+		memif_msg_hello_t hello;
+		memif_msg_init_t init;
+		memif_msg_add_region_t add_region;
+		memif_msg_add_ring_t add_ring;
+		memif_msg_connect_t connect;
+		memif_msg_connected_t connected;
+		memif_msg_disconnect_t disconnect;
+	};
+} memif_msg_t;
+
+/*
+ *  Ring and Descriptor Layout
+ */
+
+typedef struct __attribute__ ((packed)) {
+	uint16_t flags;
+#define MEMIF_DESC_FLAG_NEXT (1 << 0)
+	memif_region_index_t region;
+	uint32_t length;
+	memif_region_offset_t offset;
+	uint32_t metadata;
+} memif_desc_t;
+
+#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
+	uint8_t mark[0] __attribute__((aligned(MEMIF_CACHELINE_SIZE)))
+
+typedef struct {
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline0);
+	uint32_t cookie;
+	uint16_t flags;
+#define MEMIF_RING_FLAG_MASK_INT 1
+	volatile uint16_t head;
+	 MEMIF_CACHELINE_ALIGN_MARK(cacheline1);
+	volatile uint16_t tail;
+	 MEMIF_CACHELINE_ALIGN_MARK(cacheline2);
+	memif_desc_t desc[0];
+} memif_ring_t;
+
+#endif				/* _MEMIF_H_ */
diff --git a/drivers/net/memif/memif_socket.c b/drivers/net/memif/memif_socket.c
new file mode 100644
index 000000000..a66b5741a
--- /dev/null
+++ b/drivers/net/memif/memif_socket.c
@@ -0,0 +1,1093 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <errno.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_hash.h>
+#include <rte_jhash.h>
+#include <rte_string_fns.h>
+
+#include <rte_eth_memif.h>
+#include <memif_socket.h>
+
+static void memif_intr_handler(void *arg);
+
+static ssize_t
+memif_msg_send(int fd, memif_msg_t *msg, int afd)
+{
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	char ctl[CMSG_SPACE(sizeof(int))];
+
+	iov[0].iov_base = (void *)msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+
+	if (afd > 0) {
+		struct cmsghdr *cmsg;
+		memset(&ctl, 0, sizeof(ctl));
+		mh.msg_control = ctl;
+		mh.msg_controllen = sizeof(ctl);
+		cmsg = CMSG_FIRSTHDR(&mh);
+		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+		cmsg->cmsg_level = SOL_SOCKET;
+		cmsg->cmsg_type = SCM_RIGHTS;
+		rte_memcpy(CMSG_DATA(cmsg), &afd, sizeof(int));
+	}
+
+	return sendmsg(fd, &mh, 0);
+}
+
+static int
+memif_msg_send_from_queue(struct memif_control_channel *cc)
+{
+	ssize_t size;
+	int ret = 0;
+	struct memif_msg_queue_elt *e;
+	e = TAILQ_FIRST(&cc->msg_queue);
+	if (e == NULL)
+		return 0;
+
+	size = memif_msg_send(cc->intr_handle.fd, &e->msg, e->fd);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(ERR, "sendmsg fail: %s.", strerror(errno));
+		ret = -1;
+	} else {
+		MIF_LOG(DEBUG, "%s: Sent msg type %u.",
+			(cc->pmd !=
+			 NULL) ? rte_vdev_device_name(cc->pmd->vdev) :
+				 "memif_driver",
+			e->msg.type);
+	}
+	TAILQ_REMOVE(&cc->msg_queue, e, next);
+	rte_free(e);
+
+	return ret;
+}
+
+static struct memif_msg_queue_elt *
+memif_msg_enq(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = rte_zmalloc("memif_msg",
+						    sizeof(struct
+							   memif_msg_queue_elt),
+						    0);
+	if (e == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control message.");
+		return NULL;
+	}
+
+	e->fd = -1;
+	TAILQ_INSERT_TAIL(&cc->msg_queue, e, next);
+
+	return e;
+}
+
+void
+memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			 int err_code)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	if (e == NULL) {
+		MIF_LOG(WARNING, "%s: Failed to enqueue disconnect message.",
+			(cc->pmd !=
+			 NULL) ? rte_vdev_device_name(cc->pmd->vdev) :
+				 "memif_driver");
+		return;
+	}
+
+	memif_msg_disconnect_t *d = &e->msg.disconnect;
+
+	e->msg.type = MEMIF_MSG_TYPE_DISCONNECT;
+	d->code = err_code;
+
+	if (reason != NULL) {
+		strlcpy((char *)d->string, reason, sizeof(d->string));
+		if (cc->pmd != NULL) {
+			strlcpy(cc->pmd->local_disc_string, reason,
+				sizeof(cc->pmd->local_disc_string));
+		}
+	}
+}
+
+static int
+memif_msg_enq_hello(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	if (e == NULL)
+		return -1;
+
+	memif_msg_hello_t *h = &e->msg.hello;
+
+	e->msg.type = MEMIF_MSG_TYPE_HELLO;
+	h->min_version = MEMIF_VERSION;
+	h->max_version = MEMIF_VERSION;
+	h->max_s2m_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_m2s_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_region = ETH_MEMIF_MAX_REGION_IDX;
+	h->max_log2_ring_size = ETH_MEMIF_MAX_LOG2_RING_SIZE;
+
+	strlcpy((char *)h->name, rte_version(), sizeof(h->name));
+
+	return 0;
+}
+
+static int
+memif_msg_receive_hello(struct pmd_internals *pmd, memif_msg_t *msg)
+{
+	memif_msg_hello_t *h = &msg->hello;
+
+	if (h->min_version > MEMIF_VERSION || h->max_version < MEMIF_VERSION) {
+		memif_msg_enq_disconnect(pmd->cc, "Incompatible memif version",
+					 0);
+		return -1;
+	}
+
+	/* Set parameters for active connection */
+	pmd->run.num_s2m_rings = RTE_MIN(h->max_s2m_ring + 1,
+					   pmd->cfg.num_s2m_rings);
+	pmd->run.num_m2s_rings = RTE_MIN(h->max_m2s_ring + 1,
+					   pmd->cfg.num_m2s_rings);
+	pmd->run.log2_ring_size = RTE_MIN(h->max_log2_ring_size,
+					    pmd->cfg.log2_ring_size);
+	pmd->run.buffer_size = pmd->cfg.buffer_size;
+
+	strlcpy(pmd->remote_name, (char *)h->name, sizeof(pmd->remote_name));
+
+	MIF_LOG(DEBUG, "%s: Connecting to %s.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_init(struct memif_control_channel *cc, memif_msg_t *msg)
+{
+	memif_msg_init_t *i = &msg->init;
+	struct memif_socket_pmd_list_elt *elt;
+	struct pmd_internals *pmd;
+
+	if (i->version != MEMIF_VERSION) {
+		memif_msg_enq_disconnect(cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	if (cc->socket == NULL) {
+		memif_msg_enq_disconnect(cc, "Device error", 0);
+		return -1;
+	}
+
+	/* Find device with requested ID */
+	TAILQ_FOREACH(elt, &cc->socket->pmd_queue, next) {
+		pmd = elt->pmd;
+		if (((pmd->flags & ETH_MEMIF_FLAG_DISABLED) == 0) &&
+		    pmd->id == i->id) {
+			/* assign control channel to device */
+			cc->pmd = pmd;
+			pmd->cc = cc;
+
+			if (i->mode != MEMIF_INTERFACE_MODE_ETHERNET) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Only ethernet mode supported",
+							 0);
+				return -1;
+			}
+
+			if (pmd->flags && (ETH_MEMIF_FLAG_CONNECTING |
+					   ETH_MEMIF_FLAG_CONNECTED)) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Already connected",
+							 0);
+				return -1;
+			}
+			strlcpy(pmd->remote_name, (char *)i->name,
+				sizeof(pmd->remote_name));
+
+			if (*pmd->secret != '\0') {
+				if (*i->secret == '\0') {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Secret required",
+								 0);
+					return -1;
+				}
+				if (strcmp(pmd->secret, (char *)i->secret) != 0) {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Incorrect secret",
+								 0);
+					return -1;
+				}
+			}
+
+			pmd->flags |= ETH_MEMIF_FLAG_CONNECTING;
+			return 0;
+		}
+	}
+
+	/* ID not found on this socket */
+	MIF_LOG(DEBUG, "ID %u not found.", i->id);
+	memif_msg_enq_disconnect(cc, "ID not found", 0);
+	return -1;
+}
+
+static int
+memif_msg_receive_add_region(struct pmd_internals *pmd, memif_msg_t *msg,
+			     int fd)
+{
+	memif_msg_add_region_t *ar = &msg->add_region;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing region fd", 0);
+		return -1;
+	}
+
+	struct memif_region *mr;
+
+	if (ar->index > ETH_MEMIF_MAX_REGION_IDX) {
+		memif_msg_enq_disconnect(pmd->cc, "Invalid region index", 0);
+		return -1;
+	}
+
+	mr = rte_realloc(pmd->regions, sizeof(struct memif_region) *
+			 (ar->index + 1), 0);
+	if (mr == NULL) {
+		memif_msg_enq_disconnect(pmd->cc, "Device error", 0);
+		return -1;
+	}
+
+	pmd->regions = mr;
+	pmd->regions[ar->index].fd = fd;
+	pmd->regions[ar->index].region_size = ar->size;
+	pmd->regions[ar->index].addr = NULL;
+	pmd->regions_num++;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_add_ring(struct pmd_internals *pmd, memif_msg_t *msg, int fd)
+{
+	memif_msg_add_ring_t *ar = &msg->add_ring;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing interrupt fd", 0);
+		return -1;
+	}
+
+	struct memif_queue *mq;
+
+	/* check if we have enough queues */
+	if (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) {
+		if (ar->index >= pmd->cfg.num_s2m_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index",
+						 0);
+			return -1;
+		}
+		pmd->run.num_s2m_rings++;
+	} else {
+		if (ar->index >= pmd->cfg.num_m2s_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index",
+						 0);
+			return -1;
+		}
+		pmd->run.num_m2s_rings++;
+	}
+
+	mq = (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) ?
+	    &pmd->rx_queues[ar->index] : &pmd->tx_queues[ar->index];
+
+	mq->intr_handle.fd = fd;
+	mq->log2_ring_size = ar->log2_ring_size;
+	mq->region = ar->region;
+	mq->offset = ar->offset;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connect(struct pmd_internals *pmd, memif_msg_t *msg)
+{
+	memif_msg_connect_t *c = &msg->connect;
+	int ret;
+
+	ret = memif_connect(pmd);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connected(struct pmd_internals *pmd, memif_msg_t *msg)
+{
+	memif_msg_connected_t *c = &msg->connected;
+	int ret;
+
+	ret = memif_connect(pmd);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_disconnect(struct pmd_internals *pmd, memif_msg_t *msg)
+{
+	memif_msg_disconnect_t *d = &msg->disconnect;
+
+	memset(pmd->remote_disc_string, 0, sizeof(pmd->remote_disc_string));
+	strlcpy(pmd->remote_disc_string, (char *)d->string,
+		sizeof(pmd->remote_disc_string));
+
+	MIF_LOG(INFO, "%s: Disconnect received: %s",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_disc_string);
+
+	memset(pmd->local_disc_string, 0, 96);
+	memif_disconnect(rte_eth_dev_allocated
+			 (rte_vdev_device_name(pmd->vdev)));
+	return 0;
+}
+
+static int
+memif_msg_enq_ack(struct pmd_internals *pmd)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	e->msg.type = MEMIF_MSG_TYPE_ACK;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_init(struct pmd_internals *pmd)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	memif_msg_init_t *i = &e->msg.init;
+
+	e->msg.type = MEMIF_MSG_TYPE_INIT;
+	i->version = MEMIF_VERSION;
+	i->id = pmd->id;
+	i->mode = MEMIF_INTERFACE_MODE_ETHERNET;
+
+	strlcpy((char *)i->name, rte_version(), sizeof(i->name));
+
+	if (pmd->secret)
+		strlcpy((char *)i->secret, pmd->secret, sizeof(i->secret));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_region(struct pmd_internals *pmd, uint8_t idx)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	memif_msg_add_region_t *ar = &e->msg.add_region;
+	struct memif_region *mr = &pmd->regions[idx];
+
+	e->msg.type = MEMIF_MSG_TYPE_ADD_REGION;
+	e->fd = mr->fd;
+	ar->index = idx;
+	ar->size = mr->region_size;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_ring(struct pmd_internals *pmd, uint8_t idx,
+		       memif_ring_type_t type)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	memif_msg_add_ring_t *ar = &e->msg.add_ring;
+	struct memif_queue *mq;
+
+	mq = (type == MEMIF_RING_S2M) ? &pmd->tx_queues[idx] :
+	    &pmd->rx_queues[idx];
+
+	e->msg.type = MEMIF_MSG_TYPE_ADD_RING;
+	e->fd = mq->intr_handle.fd;
+	ar->index = idx;
+	ar->offset = mq->offset;
+	ar->region = mq->region;
+	ar->log2_ring_size = mq->log2_ring_size;
+	ar->flags = (type == MEMIF_RING_S2M) ? MEMIF_MSG_ADD_RING_FLAG_S2M : 0;
+	ar->private_hdr_size = 0;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connect(struct pmd_internals *pmd)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	memif_msg_connect_t *c = &e->msg.connect;
+	const char *name = rte_vdev_device_name(pmd->vdev);
+
+	e->msg.type = MEMIF_MSG_TYPE_CONNECT;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connected(struct pmd_internals *pmd)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	memif_msg_connected_t *c = &e->msg.connected;
+
+	const char *name = rte_vdev_device_name(pmd->vdev);
+
+	e->msg.type = MEMIF_MSG_TYPE_CONNECTED;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static void
+memif_intr_unregister_handler(struct rte_intr_handle *intr_handle, void *arg)
+{
+	struct memif_msg_queue_elt *elt;
+	struct memif_control_channel *cc = arg;
+	/* close control channel fd */
+	close(intr_handle->fd);
+	/* clear message queue */
+	while ((elt = TAILQ_FIRST(&cc->msg_queue)) != NULL) {
+		TAILQ_REMOVE(&cc->msg_queue, elt, next);
+		free(elt);
+	}
+	/* free control channel */
+	rte_free(cc);
+}
+
+void
+memif_disconnect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *elt;
+	int i;
+	int ret;
+
+	if (pmd->cc != NULL) {
+		/* Clear control message queue (except disconnect message if any). */
+		while ((elt = TAILQ_FIRST(&pmd->cc->msg_queue)) != NULL) {
+			if (elt->msg.type != MEMIF_MSG_TYPE_DISCONNECT) {
+				TAILQ_REMOVE(&pmd->cc->msg_queue, elt, next);
+				free(elt);
+			}
+		}
+		/* send disconnect message (if there is any in queue) */
+		memif_msg_send_from_queue(pmd->cc);
+
+		/* at this point, there should be no more messages in queue */
+		if (TAILQ_FIRST(&pmd->cc->msg_queue) != NULL) {
+			MIF_LOG(WARNING,
+				"%s: Unexpected message(s) in message queue.",
+				rte_vdev_device_name(pmd->vdev));
+		}
+
+		if (pmd->cc->intr_handle.fd > 0) {
+			ret =
+			    rte_intr_callback_unregister(&pmd->cc->intr_handle,
+							 memif_intr_handler,
+							 pmd->cc);
+			/*
+			 * If callback is active (disconnecting based on
+			 * received control message).
+			 */
+			if (ret == -EAGAIN) {
+				ret =
+				    rte_intr_callback_unregister_pending(&pmd->
+							cc->intr_handle,
+							memif_intr_handler,
+							pmd->cc,
+							memif_intr_unregister_handler);
+			} else if (ret > 0) {
+				close(pmd->cc->intr_handle.fd);
+				rte_free(pmd->cc);
+			}
+			if (ret <= 0)
+				MIF_LOG(WARNING,
+					"%s: Failed to unregister control channel callback.",
+					rte_vdev_device_name(pmd->vdev));
+		}
+	}
+
+	/* unconfig interrupts */
+	struct memif_queue *mq;
+	for (i = 0; i < pmd->cfg.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    &pmd->tx_queues[i] : &pmd->rx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			rte_intr_disable(&mq->intr_handle);
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+	for (i = 0; i < pmd->cfg.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    &pmd->rx_queues[i] : &pmd->tx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			rte_intr_disable(&mq->intr_handle);
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+
+	memif_free_regions(pmd);
+
+	dev->data->dev_link.link_status = ETH_LINK_DOWN;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTED;
+	MIF_LOG(DEBUG, "%s: Disconnected.", rte_vdev_device_name(pmd->vdev));
+}
+
+static int
+memif_msg_receive(struct memif_control_channel *cc)
+{
+	char ctl[CMSG_SPACE(sizeof(int)) +
+		 CMSG_SPACE(sizeof(struct ucred))] = { 0 };
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	memif_msg_t msg = { 0 };
+	ssize_t size;
+	int ret = 0;
+	struct ucred *cr __rte_unused;
+	cr = 0;
+	struct cmsghdr *cmsg;
+	int afd = -1;
+	int i;
+
+	iov[0].iov_base = (void *)&msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+	mh.msg_control = ctl;
+	mh.msg_controllen = sizeof(ctl);
+
+	size = recvmsg(cc->intr_handle.fd, &mh, 0);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(DEBUG, "Invalid message size.");
+		memif_msg_enq_disconnect(cc, "Invalid message size", 0);
+		return -1;
+	}
+	MIF_LOG(DEBUG, "Received msg type: %u.", msg.type);
+
+	cmsg = CMSG_FIRSTHDR(&mh);
+	while (cmsg) {
+		if (cmsg->cmsg_level == SOL_SOCKET) {
+			if (cmsg->cmsg_type == SCM_CREDENTIALS)
+				cr = (struct ucred *)CMSG_DATA(cmsg);
+			else if (cmsg->cmsg_type == SCM_RIGHTS)
+				afd = *(int *)CMSG_DATA(cmsg);
+		}
+		cmsg = CMSG_NXTHDR(&mh, cmsg);
+	}
+
+	if (cc->pmd == NULL && msg.type != MEMIF_MSG_TYPE_INIT) {
+		MIF_LOG(DEBUG, "Unexpected message.");
+		memif_msg_enq_disconnect(cc, "Unexpected message", 0);
+		return -1;
+	}
+
+	/* get device from hash data */
+	switch (msg.type) {
+	case MEMIF_MSG_TYPE_ACK:
+		break;
+	case MEMIF_MSG_TYPE_HELLO:
+		ret = memif_msg_receive_hello(cc->pmd, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_init_regions_and_queues(cc->pmd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_init(cc->pmd);
+		if (ret < 0)
+			goto exit;
+		for (i = 0; i < cc->pmd->regions_num; i++) {
+			ret = memif_msg_enq_add_region(cc->pmd, i);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < cc->pmd->run.num_s2m_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->pmd, i,
+						     MEMIF_RING_S2M);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < cc->pmd->run.num_m2s_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->pmd, i,
+						     MEMIF_RING_M2S);
+			if (ret < 0)
+				goto exit;
+		}
+		ret = memif_msg_enq_connect(cc->pmd);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_INIT:
+		ret = memif_msg_receive_init(cc, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->pmd);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_REGION:
+		ret = memif_msg_receive_add_region(cc->pmd, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->pmd);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_RING:
+		ret = memif_msg_receive_add_ring(cc->pmd, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->pmd);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECT:
+		ret = memif_msg_receive_connect(cc->pmd, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_connected(cc->pmd);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECTED:
+		ret = memif_msg_receive_connected(cc->pmd, &msg);
+		break;
+	case MEMIF_MSG_TYPE_DISCONNECT:
+		ret = memif_msg_receive_disconnect(cc->pmd, &msg);
+		if (ret < 0)
+			goto exit;
+		break;
+	default:
+		memif_msg_enq_disconnect(cc, "Unknown message type", 0);
+		ret = -1;
+		goto exit;
+	}
+
+ exit:
+	return ret;
+}
+
+static void
+memif_intr_handler(void *arg)
+{
+	struct memif_control_channel *cc = arg;
+	struct rte_eth_dev *dev;
+	int ret;
+
+	ret = memif_msg_receive(cc);
+	/* if driver failed to assign device */
+	if (cc->pmd == NULL) {
+		ret = rte_intr_callback_unregister_pending(&cc->intr_handle,
+							   memif_intr_handler,
+							   cc,
+							   memif_intr_unregister_handler);
+		if (ret < 0)
+			MIF_LOG(WARNING,
+				"Failed to unregister control channel callback.");
+		return;
+	}
+	/* if memif_msg_receive failed */
+	if (ret < 0)
+		goto disconnect;
+
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto disconnect;
+
+	return;
+
+ disconnect:
+	dev = rte_eth_dev_allocated(rte_vdev_device_name(cc->pmd->vdev));
+	if (dev == NULL) {
+		MIF_LOG(WARNING, "%s: eth dev not allocated",
+			rte_vdev_device_name(cc->pmd->vdev));
+		return;
+	}
+	memif_disconnect(dev);
+}
+
+static void
+memif_listener_handler(void *arg)
+{
+	struct memif_socket *socket = arg;
+	int sockfd;
+	int addr_len;
+	struct sockaddr_un client;
+	struct memif_control_channel *cc;
+	int ret;
+
+	addr_len = sizeof(client);
+	sockfd = accept(socket->intr_handle.fd, (struct sockaddr *)&client,
+			(socklen_t *)&addr_len);
+	if (sockfd < 0) {
+		MIF_LOG(ERR,
+			"Failed to accept connection request on socket fd %d",
+			socket->intr_handle.fd);
+		return;
+	}
+
+	MIF_LOG(DEBUG, "%s: Connection request accepted.", socket->filename);
+
+	cc = rte_zmalloc("memif-cc", sizeof(struct memif_control_channel), 0);
+	if (cc == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control channel.");
+		goto error;
+	}
+
+	cc->intr_handle.fd = sockfd;
+	cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	cc->socket = socket;
+	cc->pmd = NULL;
+	TAILQ_INIT(&cc->msg_queue);
+
+	ret =
+	    rte_intr_callback_register(&cc->intr_handle, memif_intr_handler,
+				       cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to register control channel callback.");
+		goto error;
+	}
+
+	ret = memif_msg_enq_hello(cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to enqueue hello message.");
+		goto error;
+	}
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto error;
+
+	return;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (cc != NULL) {
+		rte_free(cc);
+		cc = NULL;
+	}
+}
+
+static struct memif_socket *
+memif_socket_create(struct pmd_internals *pmd, char *key, uint8_t listener)
+{
+	struct memif_socket *sock;
+	struct sockaddr_un un;
+	int sockfd;
+	int ret;
+	int on = 1;
+
+	sock = rte_zmalloc("memif-socket", sizeof(struct memif_socket), 0);
+	if (sock == NULL) {
+		MIF_LOG(ERR, "Failed to allocate memory for memif socket");
+		return NULL;
+	}
+
+	sock->listener = listener;
+	rte_memcpy(sock->filename, key, 256);
+	TAILQ_INIT(&sock->pmd_queue);
+
+	if (listener != 0) {
+		sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+		if (sockfd < 0)
+			goto error;
+
+		un.sun_family = AF_UNIX;
+		memcpy(un.sun_path, sock->filename,
+			sizeof(un.sun_path) - 1);
+
+		ret = setsockopt(sockfd, SOL_SOCKET, SO_PASSCRED, &on,
+				 sizeof(on));
+		if (ret < 0)
+			goto error;
+		ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
+		if (ret < 0)
+			goto error;
+		ret = listen(sockfd, 1);
+		if (ret < 0)
+			goto error;
+
+		MIF_LOG(DEBUG, "%s: Memif listener socket %s created.",
+			rte_vdev_device_name(pmd->vdev), sock->filename);
+
+		sock->intr_handle.fd = sockfd;
+		sock->intr_handle.type = RTE_INTR_HANDLE_EXT;
+		ret = rte_intr_callback_register(&sock->intr_handle,
+						 memif_listener_handler, sock);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to register interrupt "
+				"callback for listener socket",
+				rte_vdev_device_name(pmd->vdev));
+			return NULL;
+		}
+	}
+
+	return sock;
+
+ error:
+	MIF_LOG(ERR, "%s: Failed to setup socket %s: %s",
+		rte_vdev_device_name(pmd->vdev), key, strerror(errno));
+	if (sock != NULL)
+		rte_free(sock);
+	return NULL;
+}
+
+static struct rte_hash *
+memif_create_socket_hash(void)
+{
+	struct rte_hash_parameters params = { 0 };
+	params.name = MEMIF_SOCKET_HASH_NAME;
+	params.entries = 256;
+	params.key_len = 256;
+	params.hash_func = rte_jhash;
+	params.hash_func_init_val = 0;
+	return rte_hash_create(&params);
+}
+
+int
+memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_pmd_list_elt *elt;
+	int ret;
+	char key[256];
+
+	struct rte_hash *hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL) {
+		hash = memif_create_socket_hash();
+		if (hash == NULL) {
+			MIF_LOG(ERR, "Failed to create memif socket hash.");
+			return -1;
+		}
+	}
+
+	memset(key, 0, 256);
+	rte_memcpy(key, socket_filename, strlen(socket_filename));
+	ret = rte_hash_lookup_data(hash, key, (void **)&socket);
+	if (ret < 0) {
+		socket = memif_socket_create(pmd, key,
+					     (pmd->role ==
+					      MEMIF_ROLE_SLAVE) ? 0 : 1);
+		if (socket == NULL)
+			return -1;
+		ret = rte_hash_add_key_data(hash, key, socket);
+		if (ret < 0) {
+			MIF_LOG(ERR, "Failed to add socket to socket hash.");
+			return ret;
+		}
+	}
+	pmd->socket_filename = socket->filename;
+
+	if (socket->listener != 0 && pmd->role == MEMIF_ROLE_SLAVE) {
+		MIF_LOG(ERR, "Socket is a listener.");
+		return -1;
+	} else if ((socket->listener == 0) && (pmd->role == MEMIF_ROLE_MASTER)) {
+		MIF_LOG(ERR, "Socket is not a listener.");
+		return -1;
+	}
+
+	TAILQ_FOREACH(elt, &socket->pmd_queue, next) {
+		if (elt->pmd->id == pmd->id) {
+			MIF_LOG(ERR, "Memif device with id %d already "
+				"exists on socket %s",
+				pmd->id, socket->filename);
+			return -1;
+		}
+	}
+
+	elt =
+	    rte_malloc("pmd-queue", sizeof(struct memif_socket_pmd_list_elt),
+		       0);
+	if (elt == NULL) {
+		MIF_LOG(ERR, "%s: Failed to add device to socket device list.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+	elt->pmd = pmd;
+	TAILQ_INSERT_TAIL(&socket->pmd_queue, elt, next);
+
+	return 0;
+}
+
+void
+memif_socket_remove_device(struct pmd_internals *pmd)
+{
+	struct memif_socket *socket = NULL;
+	struct memif_socket_pmd_list_elt *elt, *next;
+
+	struct rte_hash *hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL)
+		return;
+
+	if (rte_hash_lookup_data(hash, pmd->socket_filename, (void **)&socket) <
+	    0)
+		return;
+
+	for (elt = TAILQ_FIRST(&socket->pmd_queue); elt != NULL; elt = next) {
+		next = TAILQ_NEXT(elt, next);
+		if (elt->pmd == pmd) {
+			TAILQ_REMOVE(&socket->pmd_queue, elt, next);
+			free(elt);
+			pmd->socket_filename = NULL;
+		}
+	}
+
+	/* remove socket, if this was the last device using it */
+	if (TAILQ_EMPTY(&socket->pmd_queue)) {
+		rte_hash_del_key(hash, socket->filename);
+		if (socket->listener) {
+			/* remove listener socket file,
+			 * so we can create new one later.
+			 */
+			remove(socket->filename);
+		}
+		rte_free(socket);
+	}
+}
+
+int
+memif_connect_master(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	if (pmd->rx_queues == NULL || pmd->tx_queues == NULL ||
+	    pmd->socket_filename == NULL) {
+		MIF_LOG(ERR, "%s: Device not configured!",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+	return 0;
+}
+
+int
+memif_connect_slave(struct rte_eth_dev *dev)
+{
+	int sockfd;
+	int ret;
+	struct sockaddr_un sun;
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	if (pmd->rx_queues == NULL || pmd->tx_queues == NULL ||
+	    pmd->socket_filename == NULL) {
+		MIF_LOG(ERR, "%s: Device not configured!",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+
+	sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+	if (sockfd < 0) {
+		MIF_LOG(ERR, "%s: Failed to open socket.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	sun.sun_family = AF_UNIX;
+
+	memcpy(sun.sun_path, pmd->socket_filename, sizeof(sun.sun_path) - 1);
+
+	ret = connect(sockfd, (struct sockaddr *)&sun,
+		      sizeof(struct sockaddr_un));
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to connect socket: %s.",
+			rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+		goto error;
+	}
+
+	MIF_LOG(DEBUG, "%s: Memif socket: %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+
+	pmd->cc = rte_zmalloc("memif-cc",
+			      sizeof(struct memif_control_channel), 0);
+	if (pmd->cc == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate control channel.",
+			rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	pmd->cc->intr_handle.fd = sockfd;
+	pmd->cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	pmd->cc->socket = NULL;
+	pmd->cc->pmd = pmd;
+	TAILQ_INIT(&pmd->cc->msg_queue);
+
+	ret = rte_intr_callback_register(&pmd->cc->intr_handle,
+					 memif_intr_handler, pmd->cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to register interrupt callback "
+			"for controll fd", rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	return 0;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (pmd->cc != NULL) {
+		rte_free(pmd->cc);
+		pmd->cc = NULL;
+	}
+	return -1;
+}
diff --git a/drivers/net/memif/memif_socket.h b/drivers/net/memif/memif_socket.h
new file mode 100644
index 000000000..e67009404
--- /dev/null
+++ b/drivers/net/memif/memif_socket.h
@@ -0,0 +1,104 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_SOCKET_H_
+#define _MEMIF_SOCKET_H_
+
+#include <sys/queue.h>
+
+/**
+ * Remove device from socket device list. If no device is left on the socket,
+ * remove the socket as well.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_socket_remove_device(struct pmd_internals *pmd);
+
+/**
+ * Enqueue disconnect message to control channel message queue.
+ *
+ * @param cc
+ *   control channel
+ * @param reason
+ *   const string stating disconnect reason (96 characters)
+ * @param err_code
+ *   error code
+ */
+void memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			      int err_code);
+
+/**
+ * Initialize memif socket for specified device. If socket doesn't exist, create socket.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @param socket_filename
+ *   socket filename
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename);
+
+/**
+ * Disconnect memif device. Close control channel and shared memory.
+ *
+ * @param dev
+ *   ethernet device
+ */
+void memif_disconnect(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, enable connection establishment.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_master(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, send connection request.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_slave(struct rte_eth_dev *dev);
+
+struct memif_socket_pmd_list_elt {
+	TAILQ_ENTRY(memif_socket_pmd_list_elt) next;
+	struct pmd_internals *pmd;		/**< pointer to device internals */
+};
+
+#define MEMIF_SOCKET_HASH_NAME			"memif-sh"
+struct memif_socket {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	uint8_t listener;			/**< if not zero socket is listener */
+	char filename[256];			/**< socket filename */
+
+	TAILQ_HEAD(, memif_socket_pmd_list_elt) pmd_queue;
+	/**< Queue of devices using this socket */
+};
+
+/* Control message queue. */
+struct memif_msg_queue_elt {
+	TAILQ_ENTRY(memif_msg_queue_elt) next;
+	memif_msg_t msg;			/**< control message */
+	int fd;					/**< fd to be sent to peer */
+};
+
+struct memif_control_channel {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	 TAILQ_HEAD(, memif_msg_queue_elt) msg_queue; /**< control message queue */
+	struct memif_socket *socket;		/**< pointer to socket */
+	struct pmd_internals *pmd;		/**< pointer to device internals */
+};
+
+#endif				/* MEMIF_SOCKET_H */
diff --git a/drivers/net/memif/meson.build b/drivers/net/memif/meson.build
new file mode 100644
index 000000000..a6f6ea8bc
--- /dev/null
+++ b/drivers/net/memif/meson.build
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018 Cisco Systems, Inc.  All rights reserved.
+
+if host_machine.system() != 'linux'
+        build = false
+endif
+sources = files('rte_eth_memif.c',
+		'memif_socket.c')
+
+deps += ['hash']
diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
new file mode 100644
index 000000000..5540b10cc
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.c
@@ -0,0 +1,1189 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <linux/if_ether.h>
+#include <errno.h>
+#include <sys/eventfd.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_string_fns.h>
+
+#include <rte_eth_memif.h>
+#include <memif_socket.h>
+
+#define ETH_MEMIF_ID_ARG		"id"
+#define ETH_MEMIF_ROLE_ARG		"role"
+#define ETH_MEMIF_BUFFER_SIZE_ARG	"bsize"
+#define ETH_MEMIF_RING_SIZE_ARG		"rsize"
+#define ETH_MEMIF_NRXQ_ARG		"nrxq"
+#define ETH_MEMIF_NTXQ_ARG		"ntxq"
+#define ETH_MEMIF_SOCKET_ARG		"socket"
+#define ETH_MEMIF_MAC_ARG		"mac"
+#define ETH_MEMIF_ZC_ARG		"zero-copy"
+#define ETH_MEMIF_SECRET_ARG		"secret"
+
+static const char *valid_arguments[] = {
+	ETH_MEMIF_ID_ARG,
+	ETH_MEMIF_ROLE_ARG,
+	ETH_MEMIF_BUFFER_SIZE_ARG,
+	ETH_MEMIF_RING_SIZE_ARG,
+	ETH_MEMIF_NRXQ_ARG,
+	ETH_MEMIF_NTXQ_ARG,
+	ETH_MEMIF_SOCKET_ARG,
+	ETH_MEMIF_MAC_ARG,
+	ETH_MEMIF_ZC_ARG,
+	ETH_MEMIF_SECRET_ARG,
+	NULL
+};
+
+static struct rte_vdev_driver pmd_memif_drv;
+
+const char *
+memif_version(void)
+{
+#define STR_HELP(s)	#s
+#define STR(s)		STR_HELP(s)
+	return ("memif-" STR(MEMIF_VERSION_MAJOR) "." STR(MEMIF_VERSION_MINOR));
+#undef STR
+#undef STR_HELP
+}
+
+static void
+memif_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	dev_info->if_index = pmd->if_index;
+	dev_info->max_mac_addrs = 1;
+	dev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;
+	dev_info->max_rx_queues = (pmd->role == MEMIF_ROLE_SLAVE) ?
+	    pmd->cfg.num_m2s_rings : pmd->cfg.num_s2m_rings;
+	dev_info->max_tx_queues = (pmd->role == MEMIF_ROLE_SLAVE) ?
+	    pmd->cfg.num_s2m_rings : pmd->cfg.num_m2s_rings;
+	dev_info->min_rx_bufsize = 0;
+}
+
+static memif_ring_t *
+memif_get_ring(struct pmd_internals *pmd, memif_ring_type_t type, uint16_t ring_num)
+{
+	/* rings only in region 0 */
+	void *p = pmd->regions[0].addr;
+	int ring_size = sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size);
+	p += (ring_num + type * pmd->run.num_s2m_rings) * ring_size;
+
+	return (memif_ring_t *)p;
+}
+
+static void *
+memif_get_buffer(struct pmd_internals *pmd, memif_desc_t *d)
+{
+	return (pmd->regions[d->region].addr + d->offset);
+}
+
+static uint16_t
+eth_memif_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	memif_ring_t *ring = mq->ring;
+	if (unlikely(ring == NULL))
+		return 0;
+	uint16_t cur_slot, last_slot, n_slots, ring_size, mask, s0;
+	uint16_t n_rx_pkts = 0;
+	uint16_t mbuf_size = rte_pktmbuf_data_room_size(mq->mempool) -
+	    RTE_PKTMBUF_HEADROOM;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head = NULL;
+
+	/* consume interrupt */
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
+		uint64_t b;
+		ssize_t size __rte_unused;
+		size = read(mq->intr_handle.fd, &b, sizeof(b));
+	}
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	cur_slot = (type == MEMIF_RING_S2M) ? mq->last_head : mq->last_tail;
+	last_slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+	if (cur_slot == last_slot)
+		goto refill;
+	n_slots = last_slot - cur_slot;
+
+	while (n_slots && n_rx_pkts < nb_pkts) {
+		mbuf_head = rte_pktmbuf_alloc(mq->mempool);
+		if (unlikely(mbuf_head == NULL))
+			goto no_free_bufs;
+		mbuf = mbuf_head;
+		mbuf->port = mq->in_port;
+
+ next_slot:
+		s0 = cur_slot & mask;
+		d0 = &ring->desc[s0];
+
+		src_len = d0->length;
+		dst_off = 0;
+		src_off = 0;
+
+		do {
+			dst_len = mbuf_size - dst_off;
+			if (dst_len == 0) {
+				dst_off = 0;
+				dst_len = mbuf_size + RTE_PKTMBUF_HEADROOM;
+
+				mbuf = rte_pktmbuf_alloc(mq->mempool);
+				if (unlikely(mbuf == NULL))
+					goto no_free_bufs;
+				mbuf->port = mq->in_port;
+				rte_pktmbuf_chain(mbuf_head, mbuf);
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			rte_pktmbuf_pkt_len(mbuf) =
+			    rte_pktmbuf_data_len(mbuf) += cp_len;
+
+			memcpy(rte_pktmbuf_mtod_offset(mbuf, void *, dst_off),
+			       memif_get_buffer(pmd, d0) + src_off, cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+		} while (src_len);
+
+		cur_slot++;
+		n_slots--;
+		if (d0->flags & MEMIF_DESC_FLAG_NEXT)
+			goto next_slot;
+
+		*bufs++ = mbuf_head;
+		n_rx_pkts++;
+	}
+
+ no_free_bufs:
+	if (type == MEMIF_RING_S2M) {
+		rte_mb();
+		ring->tail = cur_slot;
+		mq->last_head = cur_slot;
+	} else {
+		mq->last_tail = cur_slot;
+	}
+
+ refill:
+	if (type == MEMIF_RING_M2S) {
+		uint16_t head = ring->head;
+		n_slots = ring_size - head + mq->last_tail;
+
+		while (n_slots--) {
+			s0 = head++ & mask;
+			d0 = &ring->desc[s0];
+			d0->length = pmd->run.buffer_size;
+		}
+		rte_mb();
+		ring->head = head;
+	}
+
+	mq->n_pkts += n_rx_pkts;
+	return n_rx_pkts;
+}
+
+static uint16_t
+eth_memif_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	memif_ring_t *ring = mq->ring;
+	if (unlikely(ring == NULL))
+		return 0;
+	uint16_t slot, saved_slot, n_free, ring_size, mask, n_tx_pkts = 0;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head;
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	n_free = ring->tail - mq->last_tail;
+	mq->last_tail += n_free;
+	slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+
+	if (type == MEMIF_RING_S2M)
+		n_free = ring_size - ring->head + mq->last_tail;
+	else
+		n_free = ring->head - ring->tail;
+
+	while (n_free && n_tx_pkts < nb_pkts) {
+		mbuf_head = *bufs++;
+		mbuf = mbuf_head;
+
+		saved_slot = slot;
+		d0 = &ring->desc[slot & mask];
+		dst_off = 0;
+		dst_len =
+		    (type ==
+		     MEMIF_RING_S2M) ? pmd->run.buffer_size : d0->length;
+
+ next_in_chain:
+		src_off = 0;
+		src_len = rte_pktmbuf_data_len(mbuf);
+
+		while (src_len) {
+			if (dst_len == 0) {
+				if (n_free) {
+					slot++;
+					n_free--;
+					d0->flags |= MEMIF_DESC_FLAG_NEXT;
+					d0 = &ring->desc[slot & mask];
+					dst_off = 0;
+					dst_len = (type == MEMIF_RING_S2M) ?
+					    pmd->run.buffer_size : d0->length;
+					d0->flags = 0;
+				} else {
+					slot = saved_slot;
+					goto no_free_slots;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			memcpy(memif_get_buffer(pmd, d0) + dst_off,
+			       rte_pktmbuf_mtod_offset(mbuf, void *, src_off),
+			       cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+			dst_len -= cp_len;
+
+			d0->length = dst_off;
+		}
+
+		if (rte_pktmbuf_is_contiguous(mbuf) == 0) {
+			mbuf = mbuf->next;
+			goto next_in_chain;
+		}
+
+		n_tx_pkts++;
+		slot++;
+		n_free--;
+		rte_pktmbuf_free(mbuf_head);
+	}
+
+ no_free_slots:
+	rte_mb();
+	if (type == MEMIF_RING_S2M)
+		ring->head = slot;
+	else
+		ring->tail = slot;
+
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
+		uint64_t a = 1;
+		ssize_t size = write(mq->intr_handle.fd, &a, sizeof(a));
+		if (unlikely(size < 0)) {
+			MIF_LOG(WARNING,
+				"%s: Failed to send interrupt on qid %ld: %s",
+				rte_vdev_device_name(pmd->vdev),
+				mq - pmd->tx_queues, strerror(errno));
+		}
+	}
+
+	mq->n_err += nb_pkts - n_tx_pkts;
+	mq->n_pkts += n_tx_pkts;
+	return n_tx_pkts;
+}
+
+void
+memif_free_regions(struct pmd_internals *pmd)
+{
+	int i;
+	struct memif_region *r;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		r = pmd->regions + i;
+		if (r == NULL)
+			return;
+		if (r->addr == NULL)
+			return;
+		munmap(r->addr, r->region_size);
+		if (r->fd > 0) {
+			close(r->fd);
+			r->fd = -1;
+		}
+	}
+	rte_free(pmd->regions);
+}
+
+static int
+memif_alloc_regions(struct pmd_internals *pmd, uint8_t brn)
+{
+	struct memif_region *r;
+	char shm_name[32];
+	int i;
+	int ret = 0;
+
+	r = rte_zmalloc("memif_region", sizeof(struct memif_region) * (brn + 1), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate regions.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	pmd->regions = r;
+	pmd->regions_num = brn + 1;
+
+	/*
+	 * Create shm for every region. Region 0 is reserved for descriptors.
+	 * Other regions contain buffers.
+	 */
+	for (i = 0; i < (brn + 1); i++) {
+		r = &pmd->regions[i];
+
+		r->buffer_offset = (i == 0) ? (pmd->run.num_s2m_rings +
+					       pmd->run.num_m2s_rings) *
+		    (sizeof(memif_ring_t) +
+		     sizeof(memif_desc_t) * (1 << pmd->run.log2_ring_size)) : 0;
+		r->region_size = (i == 0) ? r->buffer_offset :
+		    (uint32_t)(pmd->run.buffer_size *
+				(1 << pmd->run.log2_ring_size) *
+				(pmd->run.num_s2m_rings +
+				 pmd->run.num_m2s_rings));
+
+		memset(shm_name, 0, sizeof(char) * 32);
+		sprintf(shm_name, "memif region %d", i);
+
+		r->fd = memfd_create(shm_name, MFD_ALLOW_SEALING);
+		if (r->fd < 0) {
+			MIF_LOG(ERR, "%s: Failed to create shm file: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+
+		ret = fcntl(r->fd, F_ADD_SEALS, F_SEAL_SHRINK);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to add seals to shm file: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+
+		ret = ftruncate(r->fd, r->region_size);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to truncate shm file: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+
+		r->addr = mmap(NULL, r->region_size, PROT_READ |
+			       PROT_WRITE, MAP_SHARED, r->fd, 0);
+		if (r->addr == NULL) {
+			MIF_LOG(ERR, "%s: Failed to mmap shm region: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static void
+memif_init_rings(struct pmd_internals *pmd)
+{
+	memif_ring_t *ring;
+	int i, j;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			uint16_t slot = i * (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 1;
+			ring->desc[j].offset = pmd->regions[1].buffer_offset +
+			    (uint32_t)(slot * pmd->run.buffer_size);
+			ring->desc[j].length = pmd->run.buffer_size;
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			uint16_t slot = (i + pmd->run.num_s2m_rings) *
+			    (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 1;
+			ring->desc[j].offset = pmd->regions[1].buffer_offset +
+			    (uint32_t)(slot * pmd->run.buffer_size);
+			ring->desc[j].length = pmd->run.buffer_size;
+		}
+	}
+}
+
+static void
+memif_init_queues(struct pmd_internals *pmd)
+{
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = &pmd->tx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (void *)mq->ring - (void *)pmd->regions[0].addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for tx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = &pmd->rx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (void *)mq->ring - (void *)pmd->regions[0].addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for rx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+}
+
+int
+memif_init_regions_and_queues(struct pmd_internals *pmd)
+{
+	int ret;
+
+	ret = memif_alloc_regions(pmd, /* num of buffer regions */ 1);
+	if (ret < 0)
+		return ret;
+
+	memif_init_rings(pmd);
+
+	memif_init_queues(pmd);
+
+	return 0;
+}
+
+int
+memif_connect(struct pmd_internals *pmd)
+{
+	struct rte_eth_dev *eth_dev =
+	    rte_eth_dev_allocated(rte_vdev_device_name(pmd->vdev));
+	struct memif_region *mr;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		mr = pmd->regions + i;
+		if (mr != NULL) {
+			if (mr->addr == NULL) {
+				if (mr->fd < 0)
+					return -1;
+				mr->addr = mmap(NULL, mr->region_size,
+						PROT_READ | PROT_WRITE,
+						MAP_SHARED, mr->fd, 0);
+				if (mr->addr == NULL)
+					return -1;
+			}
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    &pmd->tx_queues[i] : &pmd->rx_queues[i];
+		mq->ring = pmd->regions[mq->region].addr + mq->offset;
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* polling mode by default */
+		if (pmd->role == MEMIF_ROLE_MASTER)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    &pmd->rx_queues[i] : &pmd->tx_queues[i];
+		mq->ring = pmd->regions[mq->region].addr + mq->offset;
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* polling mode by default */
+		if (pmd->role == MEMIF_ROLE_SLAVE)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags |= ETH_MEMIF_FLAG_CONNECTED;
+	eth_dev->data->dev_link.link_status = ETH_LINK_UP;
+	MIF_LOG(INFO, "%s: Connected.", rte_vdev_device_name(pmd->vdev));
+	return 0;
+}
+
+static int
+memif_dev_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int ret = 0;
+
+	switch (pmd->role) {
+	case MEMIF_ROLE_SLAVE:
+		ret = memif_connect_slave(dev);
+		break;
+	case MEMIF_ROLE_MASTER:
+		ret = memif_connect_master(dev);
+		break;
+	default:
+		MIF_LOG(ERR, "%s: Unknown role: %d.",
+			rte_vdev_device_name(pmd->vdev), pmd->role);
+		ret = -1;
+		break;
+	}
+
+	return ret;
+}
+
+static int
+memif_dev_configure(struct rte_eth_dev *dev __rte_unused)
+{
+	return 0;
+}
+
+static int
+memif_tx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_tx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_realloc(pmd->tx_queues, sizeof(struct memif_queue) * (qid + 1),
+			 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc tx queue %u.",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	pmd->tx_queues = mq;
+
+	mq->type =
+	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_S2M : MEMIF_RING_M2S;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->pmd = pmd;
+	dev->data->tx_queues[qid] = mq;
+
+	return 0;
+}
+
+static int
+memif_rx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_rx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_rxconf *rx_conf __rte_unused,
+		     struct rte_mempool *mb_pool)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_realloc(pmd->rx_queues, sizeof(struct memif_queue) * (qid + 1),
+			 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc rx queue %u.",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	pmd->rx_queues = mq;
+
+	mq->type =
+	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_M2S : MEMIF_RING_S2M;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->mempool = mb_pool;
+	mq->in_port = dev->data->port_id;
+	mq->pmd = pmd;
+	dev->data->rx_queues[qid] = mq;
+
+	return 0;
+}
+
+static int
+memif_link_update(struct rte_eth_dev *dev __rte_unused,
+		  int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
+static int
+memif_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+
+	stats->ipackets = 0;
+	stats->ibytes = 0;
+	stats->opackets = 0;
+	stats->obytes = 0;
+	stats->oerrors = 0;
+
+	uint8_t tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_s2m_rings :
+	    pmd->run.num_m2s_rings;
+	uint8_t nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* RX stats */
+	for (i = 0; i < nq; i++) {
+		mq = &pmd->rx_queues[i];
+		stats->q_ipackets[i] = mq->n_pkts;
+		stats->q_ibytes[i] = mq->n_bytes;
+		stats->ipackets += mq->n_pkts;
+		stats->ibytes += mq->n_bytes;
+	}
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_m2s_rings :
+	    pmd->run.num_s2m_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* TX stats */
+	for (i = 0; i < nq; i++) {
+		mq = &pmd->tx_queues[i];
+		stats->q_opackets[i] = mq->n_pkts;
+		stats->q_obytes[i] = mq->n_bytes;
+		stats->q_errors[i] = mq->n_err;
+		stats->opackets += mq->n_pkts;
+		stats->obytes += mq->n_bytes;
+		stats->oerrors += mq->n_err;
+	}
+	return 0;
+}
+
+static void
+memif_stats_reset(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int i;
+	struct memif_queue *mq;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? &pmd->tx_queues[i] :
+		    &pmd->rx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? &pmd->rx_queues[i] :
+		    &pmd->tx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+}
+
+static int
+memif_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	MIF_LOG(WARNING, "%s: Interrupt mode not supported.",
+		rte_vdev_device_name(pmd->vdev));
+
+	/* Enable MEMIF interrupts. */
+	/* pmd->rx_queues[qid].ring->flags  &= ~MEMIF_RING_FLAG_MASK_INT; */
+
+	/*
+	 * TODO: Tell dpdk to use interrupt mode.
+	 *
+	 * return rte_intr_enable(&pmd->rx_queues[qid].intr_handle);
+	 */
+	return -1;
+}
+
+static int
+memif_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd __rte_unused = dev->data->dev_private;
+
+	/* Disable MEMIF interrupts. */
+	/* pmd->rx_queues[qid].ring->flags |= MEMIF_RING_FLAG_MASK_INT; */
+
+	/*
+	 * TODO: Tell dpdk to use polling mode.
+	 *
+	 * return rte_intr_disable(&pmd->rx_queues[qid].intr_handle);
+	 */
+	return 0;
+}
+
+static const struct eth_dev_ops ops = {
+	.dev_start = memif_dev_start,
+	.dev_infos_get = memif_dev_info,
+	.dev_configure = memif_dev_configure,
+	.tx_queue_setup = memif_tx_queue_setup,
+	.rx_queue_setup = memif_rx_queue_setup,
+	.rx_queue_intr_enable = memif_rx_queue_intr_enable,
+	.rx_queue_intr_disable = memif_rx_queue_intr_disable,
+	.link_update = memif_link_update,
+	.stats_get = memif_stats_get,
+	.stats_reset = memif_stats_reset,
+};
+
+static int
+memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
+	     memif_interface_id_t id, uint32_t flags,
+	     const char *socket_filename,
+	     memif_log2_ring_size_t log2_ring_size, uint8_t nrxq,
+	     uint8_t ntxq, uint16_t buffer_size, const char *secret,
+	     const char *eth_addr)
+{
+	int ret = 0;
+	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev_data *data;
+	struct pmd_internals *pmd;
+	const unsigned int numa_node = vdev->device.numa_node;
+	const char *name = rte_vdev_device_name(vdev);
+
+	if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
+		MIF_LOG(ERR, "Zero-copy not supported.");
+		return -1;
+	}
+
+	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
+	if (eth_dev == NULL) {
+		MIF_LOG(ERR, "%s: Unable to allocate device struct.", name);
+		return -1;
+	}
+
+	pmd = eth_dev->data->dev_private;
+	memset(pmd, 0, sizeof(*pmd));
+
+	pmd->if_index = id;
+	pmd->vdev = vdev;
+	pmd->id = id;
+	pmd->flags = flags;
+	pmd->flags |= ETH_MEMIF_FLAG_DISABLED;
+	pmd->role = role;
+	ret = memif_socket_init(eth_dev, socket_filename);
+	if (ret < 0)
+		return ret;
+
+	memset(pmd->secret, 0, sizeof(char) * 24);
+	if (secret != NULL)
+		strlcpy(pmd->secret, secret, sizeof(pmd->secret));
+
+	pmd->cfg.log2_ring_size = ETH_MEMIF_DEFAULT_RING_SIZE;
+	if (log2_ring_size != 0)
+		pmd->cfg.log2_ring_size = log2_ring_size;
+	pmd->cfg.num_s2m_rings = ETH_MEMIF_DEFAULT_NRXQ;
+	pmd->cfg.num_m2s_rings = ETH_MEMIF_DEFAULT_NTXQ;
+
+	if (nrxq != 0) {
+		if (role == MEMIF_ROLE_SLAVE)
+			pmd->cfg.num_m2s_rings = nrxq;
+		else
+			pmd->cfg.num_s2m_rings = nrxq;
+	}
+	if (ntxq != 0) {
+		if (role == MEMIF_ROLE_SLAVE)
+			pmd->cfg.num_s2m_rings = ntxq;
+		else
+			pmd->cfg.num_m2s_rings = ntxq;
+	}
+
+	pmd->cfg.buffer_size = ETH_MEMIF_DEFAULT_BUFFER_SIZE;
+	if (buffer_size != 0)
+		pmd->cfg.buffer_size = buffer_size;
+
+	/* FIXME: generate mac? */
+	if (eth_addr == NULL)
+		eth_addr = ETH_MEMIF_DEFAULT_ETH_ADDR;
+
+	ret = sscanf(eth_addr, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
+	       &pmd->eth_addr.addr_bytes[0], &pmd->eth_addr.addr_bytes[1],
+	       &pmd->eth_addr.addr_bytes[2], &pmd->eth_addr.addr_bytes[3],
+	       &pmd->eth_addr.addr_bytes[4], &pmd->eth_addr.addr_bytes[5]);
+	if (ret != 6) {
+		MIF_LOG(WARNING, "%s: Failed to parse mac '%s'.",
+			rte_vdev_device_name(vdev), eth_addr);
+	}
+
+	data = eth_dev->data;
+	data->dev_private = pmd;
+	data->numa_node = numa_node;
+	data->mac_addrs = &pmd->eth_addr;
+
+	eth_dev->dev_ops = &ops;
+	eth_dev->device = &vdev->device;
+	eth_dev->rx_pkt_burst = eth_memif_rx;
+	eth_dev->tx_pkt_burst = eth_memif_tx;
+
+	rte_eth_dev_probing_finish(eth_dev);
+
+	return 0;
+}
+
+static int
+memif_set_role(const char *key __rte_unused, const char *value,
+	       void *extra_args)
+{
+	enum memif_role_t *role = (enum memif_role_t *)extra_args;
+	if (strstr(value, "master") != NULL) {
+		*role = MEMIF_ROLE_MASTER;
+	} else if (strstr(value, "slave") != NULL) {
+		*role = MEMIF_ROLE_SLAVE;
+	} else {
+		MIF_LOG(ERR, "Unknown role: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_zc(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	uint32_t *flags = (uint32_t *)extra_args;
+
+	if (strstr(value, "yes") != NULL) {
+		*flags |= ETH_MEMIF_FLAG_ZERO_COPY;
+	} else if (strstr(value, "no") != NULL) {
+		*flags &= ~ETH_MEMIF_FLAG_ZERO_COPY;
+	} else {
+		MIF_LOG(ERR, "Failed to parse zero-copy param: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_id(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	memif_interface_id_t *id = (memif_interface_id_t *)extra_args;
+	/* even if parsing fails, 0 is a valid id */
+	*id = strtoul(value, NULL, 10);
+	return 0;
+}
+
+static int
+memif_set_bs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	uint16_t *buffer_size = (uint16_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > 0xFFFF) {
+		MIF_LOG(ERR, "Invalid buffer size: %s.", value);
+		return -EINVAL;
+	}
+	*buffer_size = tmp;
+	return 0;
+}
+
+static int
+memif_set_rs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	memif_log2_ring_size_t *log2_ring_size =
+	    (memif_log2_ring_size_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > ETH_MEMIF_MAX_LOG2_RING_SIZE) {
+		MIF_LOG(ERR, "Invalid ring size: %s (max %u).",
+			value, ETH_MEMIF_MAX_LOG2_RING_SIZE);
+		return -EINVAL;
+	}
+	*log2_ring_size = tmp;
+	return 0;
+}
+
+static int
+memif_set_nq(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	uint16_t *nq = (uint16_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > 0xFF) {
+		MIF_LOG(ERR, "Invalid number of queues: %s.", value);
+		return -EINVAL;
+	}
+	*nq = tmp;
+	return 0;
+}
+
+/* check if directory exists and if we have permission to read/write */
+static int
+memif_check_socket_filename(const char *filename)
+{
+	char *dir = NULL, *tmp;
+	uint32_t idx;
+	int ret = 0;
+
+	tmp = strrchr(filename, '/');
+	if (tmp != NULL) {
+		idx = tmp - filename;
+		dir = rte_zmalloc("memif_tmp", sizeof(char) * (idx + 1), 0);
+		if (dir == NULL) {
+			MIF_LOG(ERR, "Failed to allocate memory.");
+			return -1;
+		}
+		strlcpy(dir, filename, sizeof(char) * (idx + 1));
+	}
+
+	if (dir == NULL || (faccessat(-1, dir, F_OK | R_OK |
+					W_OK, AT_EACCESS) < 0)) {
+		MIF_LOG(ERR, "Invalid directory: '%s'.", dir);
+		ret = -EINVAL;
+	}
+
+	if (dir != NULL)
+		rte_free(dir);
+
+	return ret;
+}
+
+static int
+rte_pmd_memif_probe(struct rte_vdev_device *vdev)
+{
+	RTE_BUILD_BUG_ON(sizeof(memif_msg_t) != 128);
+	RTE_BUILD_BUG_ON(sizeof(memif_desc_t) != 16);
+	int ret = 0;
+	unsigned int i;
+	struct rte_kvargs *kvlist;
+	const struct rte_kvargs_pair *pair;
+
+	const char *name = rte_vdev_device_name(vdev);
+
+	enum memif_role_t role;
+	memif_interface_id_t id;
+
+	uint16_t buffer_size;
+	memif_log2_ring_size_t log2_ring_size;
+	uint8_t nrxq, ntxq;
+	const char *socket_filename;
+	const char *eth_addr;
+	uint32_t flags;
+	const char *secret;
+
+	MIF_LOG(INFO, "Initialize MEMIF: %s.", name);
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_arguments);
+
+	/* set default values */
+	role = MEMIF_ROLE_SLAVE;
+	flags = 0;
+	id = 0;
+	buffer_size = 2048;
+	log2_ring_size = 10;
+	nrxq = 1;
+	ntxq = 1;
+	socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME;
+	secret = NULL;
+	eth_addr = NULL;
+
+	/* parse parameters */
+	if (kvlist != NULL) {
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_ROLE_ARG) == 1) {
+			ret = rte_kvargs_process(kvlist, ETH_MEMIF_ROLE_ARG,
+						 &memif_set_role, &role);
+			if (ret < 0)
+				goto exit;
+		}
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_ID_ARG) == 1) {
+			ret = rte_kvargs_process(kvlist, ETH_MEMIF_ID_ARG,
+						 &memif_set_id, &id);
+			if (ret < 0)
+				goto exit;
+		}
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_BUFFER_SIZE_ARG) == 1) {
+			ret =
+			    rte_kvargs_process(kvlist,
+					       ETH_MEMIF_BUFFER_SIZE_ARG,
+					       &memif_set_bs, &buffer_size);
+			if (ret < 0)
+				goto exit;
+		}
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_RING_SIZE_ARG) == 1) {
+			ret =
+			    rte_kvargs_process(kvlist, ETH_MEMIF_RING_SIZE_ARG,
+					       &memif_set_rs, &log2_ring_size);
+			if (ret < 0)
+				goto exit;
+		}
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_NRXQ_ARG) == 1) {
+			ret = rte_kvargs_process(kvlist, ETH_MEMIF_NRXQ_ARG,
+						 &memif_set_nq, &nrxq);
+			if (ret < 0)
+				goto exit;
+		}
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_NTXQ_ARG) == 1) {
+			ret = rte_kvargs_process(kvlist, ETH_MEMIF_NTXQ_ARG,
+						 &memif_set_nq, &ntxq);
+			if (ret < 0)
+				goto exit;
+		}
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_SOCKET_ARG) == 1) {
+			for (i = 0; i < kvlist->count; i++) {
+				pair = &kvlist->pairs[i];
+				if (strcmp(pair->key, ETH_MEMIF_SOCKET_ARG) == 0) {
+					socket_filename = pair->value;
+					ret = memif_check_socket_filename(
+						socket_filename);
+					if (ret < 0)
+						goto exit;
+				}
+			}
+		}
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_MAC_ARG) == 1) {
+			for (i = 0; i < kvlist->count; i++) {
+				pair = &kvlist->pairs[i];
+				if (strcmp(pair->key, ETH_MEMIF_MAC_ARG) == 0)
+					eth_addr = pair->value;
+			}
+		}
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_ZC_ARG) == 1) {
+			ret = rte_kvargs_process(kvlist, ETH_MEMIF_ZC_ARG,
+						 &memif_set_zc, &flags);
+			if (ret < 0)
+				goto exit;
+		}
+		if (rte_kvargs_count(kvlist, ETH_MEMIF_SECRET_ARG) == 1) {
+			for (i = 0; i < kvlist->count; i++) {
+				pair = &kvlist->pairs[i];
+				if (strcmp(pair->key, ETH_MEMIF_SECRET_ARG) == 0)
+					secret = pair->value;
+			}
+		}
+	}
+
+	/* create interface */
+	ret =
+	    memif_create(vdev, role, id, flags, socket_filename, log2_ring_size,
+			 nrxq, ntxq, buffer_size, secret, eth_addr);
+
+ exit:
+	if (kvlist != NULL)
+		rte_kvargs_free(kvlist);
+	return ret;
+}
+
+static int
+rte_pmd_memif_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_eth_dev *eth_dev;
+
+	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+	if (eth_dev == NULL)
+		return 0;
+
+	struct pmd_internals *pmd = eth_dev->data->dev_private;
+
+	memif_msg_enq_disconnect(pmd->cc, "Invalid message size", 0);
+	memif_disconnect(eth_dev);
+
+	memif_socket_remove_device(pmd);
+
+	pmd->vdev = NULL;
+
+	rte_free(eth_dev->data->dev_private);
+
+	rte_eth_dev_release_port(eth_dev);
+
+	return 0;
+}
+
+static struct rte_vdev_driver pmd_memif_drv = {
+	.probe = rte_pmd_memif_probe,
+	.remove = rte_pmd_memif_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_memif, pmd_memif_drv);
+RTE_PMD_REGISTER_ALIAS(net_memif, eth_memif);
+RTE_PMD_REGISTER_PARAM_STRING(net_memif,
+			      ETH_MEMIF_ID_ARG "=<int>"
+			      ETH_MEMIF_ROLE_ARG "=<string>"
+			      ETH_MEMIF_BUFFER_SIZE_ARG "=<int>"
+			      ETH_MEMIF_RING_SIZE_ARG "=<int>"
+			      ETH_MEMIF_NRXQ_ARG "=<int>"
+			      ETH_MEMIF_NTXQ_ARG "=<int>"
+			      ETH_MEMIF_SOCKET_ARG "=<string>"
+			      ETH_MEMIF_MAC_ARG "=xx:xx:xx:xx:xx:xx"
+			      ETH_MEMIF_ZC_ARG "=<string>"
+			      ETH_MEMIF_SECRET_ARG "=<string>");
+
+RTE_INIT(memif_init_log)
+{
+	memif_logtype = rte_log_register("pmd.net.memif");
+	if (memif_logtype >= 0)
+		rte_log_set_level(memif_logtype, RTE_LOG_NOTICE);
+}
diff --git a/drivers/net/memif/rte_eth_memif.h b/drivers/net/memif/rte_eth_memif.h
new file mode 100644
index 000000000..e122b2bb7
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.h
@@ -0,0 +1,207 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _RTE_ETH_MEMIF_H_
+#define _RTE_ETH_MEMIF_H_
+
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE
+#endif				/* GNU_SOURCE */
+
+#include <sys/queue.h>
+
+#include <rte_ethdev_driver.h>
+#include <rte_ether.h>
+#include <rte_interrupts.h>
+
+#include <memif.h>
+
+/* generate mac? */
+#define ETH_MEMIF_DEFAULT_ETH_ADDR		"01:ab:23:cd:45:ef"
+
+#define ETH_MEMIF_DEFAULT_SOCKET_FILENAME	"/tmp/memif.sock"
+#define ETH_MEMIF_DEFAULT_RING_SIZE		10
+#define ETH_MEMIF_DEFAULT_NRXQ			1
+#define ETH_MEMIF_DEFAULT_NTXQ			1
+#define ETH_MEMIF_DEFAULT_BUFFER_SIZE		2048
+
+#define ETH_MEMIF_MAX_NUM_Q_PAIRS		256
+#define ETH_MEMIF_MAX_LOG2_RING_SIZE		14
+#define ETH_MEMIF_MAX_REGION_IDX		255
+
+int memif_logtype;
+
+#define MIF_LOG(level, fmt, args...) \
+	rte_log(RTE_LOG_ ## level, memif_logtype, \
+		"%s(): " fmt "\n", __func__, ##args)
+
+enum memif_role_t {
+	MEMIF_ROLE_MASTER = 0,
+	MEMIF_ROLE_SLAVE = 1,
+};
+
+struct memif_region {
+	void *addr;				/**< shared memory address */
+	memif_region_size_t region_size;	/**< shared memory size */
+	int fd;					/**< shared memory file descriptor */
+	uint32_t buffer_offset;			/**< offset at which buffers start */
+};
+
+struct memif_queue {
+	struct rte_mempool *mempool;		/**< mempool for RX packets */
+	uint16_t in_port;			/**< port id */
+
+	struct pmd_internals *pmd;		/**< device internals */
+
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+
+	/* ring info */
+	memif_ring_type_t type;			/**< ring type */
+	memif_ring_t *ring;			/**< pointer to ring */
+	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
+
+	memif_region_index_t region;		/**< shared memory region index */
+	memif_region_offset_t offset;		/**< offset at which the queue begins */
+
+	uint16_t last_head;			/**< last ring head */
+	uint16_t last_tail;			/**< last ring tail */
+	/*uint32_t *buffers;*/
+
+	/* rx/tx info */
+	uint64_t n_pkts;			/**< number of rx/tx packets */
+	uint64_t n_bytes;			/**< number of rx/tx bytes */
+	uint64_t n_err;				/**< number of tx errors */
+};
+
+struct pmd_internals {
+	int if_index;				/**< index */
+	memif_interface_id_t id;		/**< unique id */
+	enum memif_role_t role;			/**< device role */
+	uint32_t flags;				/**< device status flags */
+#define ETH_MEMIF_FLAG_CONNECTING	(1 << 0)
+/**< device is connecting */
+#define ETH_MEMIF_FLAG_CONNECTED	(1 << 1)
+/**< device is connected */
+#define ETH_MEMIF_FLAG_ZERO_COPY	(1 << 2)
+/**< device is zero-copy enabled */
+#define ETH_MEMIF_FLAG_DISABLED		(1 << 3)
+/**< device has not been configured and can not accept connection requests */
+
+	struct ether_addr eth_addr;		/**< mac address */
+	char *socket_filename;			/**< pointer to socket filename */
+	char secret[24]; /**< secret (optional security parameter) */
+
+	struct memif_control_channel *cc;	/**< control channel */
+
+	struct memif_region *regions;		/**< shared memory regions */
+	uint8_t regions_num;			/**< number of regions */
+
+	struct memif_queue *rx_queues;		/**< RX queues */
+	struct memif_queue *tx_queues;		/**< TX queues */
+
+	/* remote info */
+	char remote_name[64];			/**< remote app name */
+	char remote_if_name[64];		/**< remote peer name */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t buffer_size;		/**< buffer size */
+	} cfg;					/**< Configured parameters (max values) */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t buffer_size;		/**< buffer size */
+	} run;
+	/**< Parameters used in active connection */
+
+	char local_disc_string[96];		/**< local disconnect reason */
+	char remote_disc_string[96];		/**< remote disconnect reason */
+
+	struct rte_vdev_device *vdev;		/**< vdev handle */
+};
+
+/**
+ * Unmap shared memory and free regions from memory.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_free_regions(struct pmd_internals *pmd);
+
+/**
+ * Finalize connection establishment process. Map shared memory file
+ * (master role), initialize ring queue, set link status up.
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect(struct pmd_internals *pmd);
+
+/**
+ * Create shared memory file and initialize ring queue.
+ * Only called by slave when establishing connection
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_init_regions_and_queues(struct pmd_internals *pmd);
+
+/**
+ * Get memif version string.
+ *
+ * @return
+ *   - memif version string
+ */
+const char *memif_version(void);
+
+#ifndef MFD_HUGETLB
+#ifndef __NR_memfd_create
+
+#if defined __x86_64__
+#define __NR_memfd_create 319
+#elif defined __arm__
+#define __NR_memfd_create 385
+#elif defined __aarch64__
+#define __NR_memfd_create 279
+#else
+#error "__NR_memfd_create unknown for this architecture"
+#endif
+
+#endif				/* __NR_memfd_create */
+
+static inline int memfd_create(const char *name, unsigned int flags)
+{
+	return syscall(__NR_memfd_create, name, flags);
+}
+#endif				/* MFD_HUGETLB */
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef MFD_ALLOW_SEALING
+#define MFD_ALLOW_SEALING       0x0002U
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL     0x0001	/* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002	/* prevent file from shrinking */
+#define F_SEAL_GROW     0x0004	/* prevent file from growing */
+#define F_SEAL_WRITE    0x0008	/* prevent writes */
+#endif
+
+#endif				/* RTE_ETH_MEMIF_H */
diff --git a/drivers/net/memif/rte_pmd_memif_version.map b/drivers/net/memif/rte_pmd_memif_version.map
new file mode 100644
index 000000000..97fd251df
--- /dev/null
+++ b/drivers/net/memif/rte_pmd_memif_version.map
@@ -0,0 +1,4 @@
+DPDK_19.02 {
+
+        local: *;
+};
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index 980eec233..b0becbf31 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -21,6 +21,7 @@ drivers = ['af_packet',
 	'ixgbe',
 	'kni',
 	'liquidio',
+	'memif',
 	'mlx4',
 	'mlx5',
 	'mvneta',
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 5699d979d..f236c5ebc 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -168,6 +168,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KNI)        += -lrte_pmd_kni
 endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD)        += -lrte_pmd_lio
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF)      += -lrte_pmd_memif
 ifeq ($(CONFIG_RTE_LIBRTE_MLX4_DLOPEN_DEPS),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -lrte_pmd_mlx4 -ldl
 else
-- 
2.17.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v3] /net: memory interface (memif)
  2018-12-13 13:30 [dpdk-dev] [RFC v3] /net: memory interface (memif) Jakub Grajciar
@ 2018-12-13 18:07 ` Stephen Hemminger
  2018-12-14  9:39   ` Bruce Richardson
  2019-01-04 17:16 ` Ferruh Yigit
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 58+ messages in thread
From: Stephen Hemminger @ 2018-12-13 18:07 UTC (permalink / raw)
  To: Jakub Grajciar; +Cc: dev

On Thu, 13 Dec 2018 14:30:51 +0100
Jakub Grajciar <jgrajcia@cisco.com> wrote:

> +
> +typedef uint16_t memif_region_index_t;
> +typedef uint32_t memif_region_offset_t;
> +typedef uint64_t memif_region_size_t;
> +typedef uint16_t memif_ring_index_t;
> +typedef uint32_t memif_interface_id_t;
> +typedef uint16_t memif_version_t;
> +typedef uint8_t memif_log2_ring_size_t;
> +

Seems very typedef heavy to me. Having more typedefs
does not improve the readability.


> +typedef struct __attribute__ ((packed)) {

Use __rte_packed rather than attributes directly.

> +typedef struct __attribute__ ((packed)) {
> +	uint8_t if_name[32];
> +} memif_msg_connect_t;
> +

Why magic constant 32? Better to use something like

#define MEMIF_NAMESZ 32

Also, I am confused about how this relates
to DPDK device names and Linux network device names (if at all).

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v3] /net: memory interface (memif)
  2018-12-13 18:07 ` Stephen Hemminger
@ 2018-12-14  9:39   ` Bruce Richardson
  2018-12-14 16:12     ` Wiles, Keith
  0 siblings, 1 reply; 58+ messages in thread
From: Bruce Richardson @ 2018-12-14  9:39 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Jakub Grajciar, dev

On Thu, Dec 13, 2018 at 10:07:09AM -0800, Stephen Hemminger wrote:
> On Thu, 13 Dec 2018 14:30:51 +0100
> Jakub Grajciar <jgrajcia@cisco.com> wrote:
> 
> > +
> > +typedef uint16_t memif_region_index_t;
> > +typedef uint32_t memif_region_offset_t;
> > +typedef uint64_t memif_region_size_t;
> > +typedef uint16_t memif_ring_index_t;
> > +typedef uint32_t memif_interface_id_t;
> > +typedef uint16_t memif_version_t;
> > +typedef uint8_t memif_log2_ring_size_t;
> > +
> 
> Seems very typedef heavy to me. Having more typedefs
> does not improve the readability.
> 

+1
Our coding guidelines generally recommend against using typedefs, though
they generally refer to structure typedefs rather than typedefs for basic
types.

/Bruce

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v3] /net: memory interface (memif)
  2018-12-14  9:39   ` Bruce Richardson
@ 2018-12-14 16:12     ` Wiles, Keith
  0 siblings, 0 replies; 58+ messages in thread
From: Wiles, Keith @ 2018-12-14 16:12 UTC (permalink / raw)
  To: Richardson, Bruce; +Cc: Stephen Hemminger, Jakub Grajciar, dev



> On Dec 14, 2018, at 3:39 AM, Bruce Richardson <bruce.richardson@intel.com> wrote:
> 
> On Thu, Dec 13, 2018 at 10:07:09AM -0800, Stephen Hemminger wrote:
>> On Thu, 13 Dec 2018 14:30:51 +0100
>> Jakub Grajciar <jgrajcia@cisco.com> wrote:
>> 
>>> +
>>> +typedef uint16_t memif_region_index_t;
>>> +typedef uint32_t memif_region_offset_t;
>>> +typedef uint64_t memif_region_size_t;
>>> +typedef uint16_t memif_ring_index_t;
>>> +typedef uint32_t memif_interface_id_t;
>>> +typedef uint16_t memif_version_t;
>>> +typedef uint8_t memif_log2_ring_size_t;
>>> +
>> 
>> Seems very typedef heavy to me. Having more typedefs
>> does not improve the readability.
>> 
> 
> +1
> Our coding guidelines generally recommend against using typedefs, though
> they generally refer to structure typedefs rather than typedefs for basic
> types.

The guide lines do suggest not to use typedefs, but here is a PMD which is self contained in that the headers are not normally used outside of the PMD. This means to me that typedefs in this give case to reasonable. Plus it is a suggestion in the guide lines and the cases talked about in the guide seem to be all related to headers that are more globally used.

I suggest he can keep these typedefs in this case, as we have done a something much more wide scope with port id typedef.
> 
> /Bruce

Regards,
Keith

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v3] /net: memory interface (memif)
  2018-12-13 13:30 [dpdk-dev] [RFC v3] /net: memory interface (memif) Jakub Grajciar
  2018-12-13 18:07 ` Stephen Hemminger
@ 2019-01-04 17:16 ` Ferruh Yigit
  2019-01-04 19:23 ` Stephen Hemminger
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 58+ messages in thread
From: Ferruh Yigit @ 2019-01-04 17:16 UTC (permalink / raw)
  To: Jakub Grajciar, dev

On 12/13/2018 1:30 PM, Jakub Grajciar wrote:
> Memory interface (memif), provides high performance
> packet transfer over shared memory.
> 
> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>

Hi Jakub,

I put some comments but overall it is a long patch, I am almost sure there are
some details missed, it can make easier to review if you can split the patch
into multiple patches.

> ---
>  config/common_base                          |    5 +
>  config/common_linuxapp                      |    1 +
>  doc/guides/nics/memif.rst                   |   80 ++

"doc/guides/nics/index.rst" needs to be updated to add the new doc into index

Also there is a driver feature documentation, "doc/guides/nics/features/*.ini",
I admit it is not very informative for virtual devices, but having it will add
the driver to the overview table, so please add it too, with as much data as
possible.

>  drivers/net/Makefile                        |    1 +
>  drivers/net/memif/Makefile                  |   29 +
>  drivers/net/memif/memif.h                   |  143 +++
>  drivers/net/memif/memif_socket.c            | 1093 +++++++++++++++++
>  drivers/net/memif/memif_socket.h            |  104 ++
>  drivers/net/memif/meson.build               |   10 +
>  drivers/net/memif/rte_eth_memif.c           | 1189 +++++++++++++++++++
>  drivers/net/memif/rte_eth_memif.h           |  207 ++++
>  drivers/net/memif/rte_pmd_memif_version.map |    4 +
>  drivers/net/meson.build                     |    1 +
>  mk/rte.app.mk                               |    1 +

Can you please update "MAINTAINERS" file too, to add driver and its owner.

<...>

> @@ -0,0 +1,80 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +    Copyright(c) 2018 Cisco Systems, Inc.
> +
> +Memif Poll Mode Driver
> +======================
> +
> +Memif PMD allows for DPDK and any other client using memif (DPDK, VPP, libmemif) to
> +communicate using shared memory.

It can be good to describe what 'memif' is briefly here and perhaps a link for
more detailed documentation.

And some more low level detail can be good, like message type, messaging
protocol, descriptor format etc.. A link also good enough if it is already
documented somewhere.

Also memif PMD is only for Linux, what do you think documenting that limitation
here?

<...>

> +Example: testpmd and VPP
> +------------------------
> +
> +For information on how to get and run VPP please see `<https://wiki.fd.io/view/VPP>`_.

Can memif PMD used between two DPDK application, or as two interfaces to
testpmd? If so what do you think adding that as a main sample, which would be
easier for everyone to test?

<...>

> @@ -0,0 +1,29 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright 2018 Cisco Systems, Inc.  All rights reserved.
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +#
> +# library name
> +#
> +LIB = librte_pmd_memif.a
> +
> +EXPORT_MAP := rte_pmd_memif_version.map
> +
> +LIBABIVER := 1
> +
> +CFLAGS += -O3
> +CFLAGS += -I$(SRCDIR)

Why need to include $(SRCDIR) ?

> +CFLAGS += $(WERROR_FLAGS)
> +CFLAGS += -Wno-pointer-arith

Is this warning disabled explicitly?

<...>

> @@ -0,0 +1,143 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2018 Cisco Systems, Inc.  All rights reserved.
> + */
> +
> +#ifndef _MEMIF_H_
> +#define _MEMIF_H_
> +
> +#ifndef MEMIF_CACHELINE_SIZE
> +#define MEMIF_CACHELINE_SIZE 64
> +#endif

How much this is related to processor cache size? If same thing it is better to
use RTE_CACHE_LINE_SIZE since it can be set properly for various architectures.

> +
> +#define MEMIF_COOKIE		0x3E31F20
> +#define MEMIF_VERSION_MAJOR	2
> +#define MEMIF_VERSION_MINOR	0
> +#define MEMIF_VERSION		((MEMIF_VERSION_MAJOR << 8) | MEMIF_VERSION_MINOR)
> +
> +/*
> + *  Type definitions
> + */
> +
> +typedef enum memif_msg_type {
> +	MEMIF_MSG_TYPE_NONE = 0,
> +	MEMIF_MSG_TYPE_ACK = 1,
> +	MEMIF_MSG_TYPE_HELLO = 2,
> +	MEMIF_MSG_TYPE_INIT = 3,
> +	MEMIF_MSG_TYPE_ADD_REGION = 4,
> +	MEMIF_MSG_TYPE_ADD_RING = 5,
> +	MEMIF_MSG_TYPE_CONNECT = 6,
> +	MEMIF_MSG_TYPE_CONNECTED = 7,
> +	MEMIF_MSG_TYPE_DISCONNECT = 8,

No need to assign numbers to enums, they will be already provided by compiler.
Valid for all enums below.

> +} memif_msg_type_t;
> +
> +typedef enum {
> +	MEMIF_RING_S2M = 0,
> +	MEMIF_RING_M2S = 1
> +} memif_ring_type_t;
> +
> +typedef enum {
> +	MEMIF_INTERFACE_MODE_ETHERNET = 0,
> +	MEMIF_INTERFACE_MODE_IP = 1,
> +	MEMIF_INTERFACE_MODE_PUNT_INJECT = 2,
> +} memif_interface_mode_t;
> +
> +typedef uint16_t memif_region_index_t;
> +typedef uint32_t memif_region_offset_t;
> +typedef uint64_t memif_region_size_t;
> +typedef uint16_t memif_ring_index_t;
> +typedef uint32_t memif_interface_id_t;
> +typedef uint16_t memif_version_t;
> +typedef uint8_t memif_log2_ring_size_t;
> +
> +/*
> + *  Socket messages
> + */
> +
> +typedef struct __attribute__ ((packed)) {

can use "__rte_packed" instead of "__attribute__ ((packed))"

> +	uint8_t name[32];

It can be good to define a name size macro instead of 32

> +	memif_version_t min_version;
> +	memif_version_t max_version;
> +	memif_region_index_t max_region;
> +	memif_ring_index_t max_m2s_ring;
> +	memif_ring_index_t max_s2m_ring;
> +	memif_log2_ring_size_t max_log2_ring_size;
> +} memif_msg_hello_t;
> +
> +typedef struct __attribute__ ((packed)) {
> +	memif_version_t version;
> +	memif_interface_id_t id;
> +	memif_interface_mode_t mode:8;

What will be the size of the enum when combined bitfield and "__attribute__
((packed))", I guess it will be 1byte above, does it worth commenting to clarify?
And I wonder if this behavior is part of standard?

> +	uint8_t secret[24];
> +	uint8_t name[32];
> +} memif_msg_init_t;
> +
> +typedef struct __attribute__ ((packed)) {
> +	memif_region_index_t index;
> +	memif_region_size_t size;
> +} memif_msg_add_region_t;
> +
> +typedef struct __attribute__ ((packed)) {
> +	uint16_t flags;
> +#define MEMIF_MSG_ADD_RING_FLAG_S2M	(1 << 0)

Why not just "1" ?

> +	memif_ring_index_t index;
> +	memif_region_index_t region;
> +	memif_region_offset_t offset;
> +	memif_log2_ring_size_t log2_ring_size;
> +	uint16_t private_hdr_size;	/* used for private metadata */
> +} memif_msg_add_ring_t;
> +
> +typedef struct __attribute__ ((packed)) {
> +	uint8_t if_name[32];
> +} memif_msg_connect_t;
> +
> +typedef struct __attribute__ ((packed)) {
> +	uint8_t if_name[32];
> +} memif_msg_connected_t;
> +
> +typedef struct __attribute__ ((packed)) {
> +	uint32_t code;
> +	uint8_t string[96];

Why not 97? Or 42?

> +} memif_msg_disconnect_t;
> +
> +typedef struct __attribute__ ((packed, aligned(128))) {
> +	memif_msg_type_t type:16;
> +	union {
> +		memif_msg_hello_t hello;
> +		memif_msg_init_t init;
> +		memif_msg_add_region_t add_region;
> +		memif_msg_add_ring_t add_ring;
> +		memif_msg_connect_t connect;
> +		memif_msg_connected_t connected;
> +		memif_msg_disconnect_t disconnect;
> +	};
> +} memif_msg_t;
> +
> +/*
> + *  Ring and Descriptor Layout
> + */
> +
> +typedef struct __attribute__ ((packed)) {
> +	uint16_t flags;
> +#define MEMIF_DESC_FLAG_NEXT (1 << 0)
> +	memif_region_index_t region;
> +	uint32_t length;
> +	memif_region_offset_t offset;
> +	uint32_t metadata;
> +} memif_desc_t;
> +
> +#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
> +	uint8_t mark[0] __attribute__((aligned(MEMIF_CACHELINE_SIZE)))

can use "__rte_aligned()" instead. And if this is for processors cache size
'__rte_cache_aligned' can be better to use.

<...>

> +
> +#include <rte_eth_memif.h>
> +#include <memif_socket.h>

These are private headers, not installed to system path, why not access them as
local #include "memif_socket.h"

<...>

> +
> +			if (pmd->flags && (ETH_MEMIF_FLAG_CONNECTING |
> +					   ETH_MEMIF_FLAG_CONNECTED)) {

I assume intention is '&' here, instead of '&&'

<...>

> +	strlcpy((char *)i->name, rte_version(), sizeof(i->name));
> +
> +	if (pmd->secret)
> +		strlcpy((char *)i->secret, pmd->secret, sizeof(i->secret));

This will always evaluate to true, is the intention '*pmd->secret' ?

<...>

> +			if (ret == -EAGAIN) {
> +				ret =
> +				    rte_intr_callback_unregister_pending(&pmd->
> +							cc->intr_handle,
> +							memif_intr_handler,
> +							pmd->cc,
> +							memif_intr_unregister_handler);

Is 'rte_intr_callback_unregister_pending()' defined in DPDK, I am getting build
error for this.

<...>

> +static void
> +memif_intr_handler(void *arg)
> +{
> +	struct memif_control_channel *cc = arg;
> +	struct rte_eth_dev *dev;
> +	int ret;
> +
> +	ret = memif_msg_receive(cc);
> +	/* if driver failed to assign device */
> +	if (cc->pmd == NULL) {

Isn't this already set when slave connected, When/How this can happen?

<...>

> @@ -0,0 +1,10 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright 2018 Cisco Systems, Inc.  All rights reserved.
> +
> +if host_machine.system() != 'linux'
> +        build = false
> +endif
> +sources = files('rte_eth_memif.c',
> +		'memif_socket.c')
> +
> +deps += ['hash']

This dependency is not in Makefile, and since librte_hash is used, dependency
should be added into makefile too.

<...>

> +
> +static struct rte_vdev_driver pmd_memif_drv;

This deceleration may not be required.

> +
> +const char *
> +memif_version(void)
> +{
> +#define STR_HELP(s)	#s
> +#define STR(s)		STR_HELP(s)

Can use RTE_STR(x) instead.

> +	return ("memif-" STR(MEMIF_VERSION_MAJOR) "." STR(MEMIF_VERSION_MINOR));
> +#undef STR
> +#undef STR_HELP
> +}
> +
> +static void
> +memif_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +
> +	dev_info->if_index = pmd->if_index;

if_index is for host interfaces, that something you should be able to provide to
if_indextoname() as parameter, make sense for af_packet or tap, but not for here.

> +	dev_info->max_mac_addrs = 1;
> +	dev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;
> +	dev_info->max_rx_queues = (pmd->role == MEMIF_ROLE_SLAVE) ?
> +	    pmd->cfg.num_m2s_rings : pmd->cfg.num_s2m_rings;
> +	dev_info->max_tx_queues = (pmd->role == MEMIF_ROLE_SLAVE) ?
> +	    pmd->cfg.num_s2m_rings : pmd->cfg.num_m2s_rings;

max_rx_queues & max_tx_queues are for maximum available queue number, not
current number, setting them to current number prevents app capability to
increase the queue number.
For this case ETH_MEMIF_MAX_NUM_Q_PAIRS seems the max queue number.

<...>

> +	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
> +		uint64_t a = 1;
> +		ssize_t size = write(mq->intr_handle.fd, &a, sizeof(a));
> +		if (unlikely(size < 0)) {
> +			MIF_LOG(WARNING,
> +				"%s: Failed to send interrupt on qid %ld: %s",
> +				rte_vdev_device_name(pmd->vdev),
> +				mq - pmd->tx_queues, strerror(errno));

giving build error because of %ld,
error: format ‘%ld’ expects argument of type ‘long int’, but argument 6 has type
‘int’ [-Werror=format=]
   "%s(): " fmt "\n", __func__, ##args)
   ^~~~~~~~

<...>

> +static int
> +memif_tx_queue_setup(struct rte_eth_dev *dev,
> +		     uint16_t qid,
> +		     uint16_t nb_tx_desc __rte_unused,
> +		     unsigned int socket_id __rte_unused,
> +		     const struct rte_eth_txconf *tx_conf __rte_unused)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_queue *mq;
> +
> +	mq = rte_realloc(pmd->tx_queues, sizeof(struct memif_queue) * (qid + 1),
> +			 0);

There is an assumption in this code that whenever this fucntion called, it will
be called with a 'qid' bigger that previous one, this assumption is not correct,
there is no forced order of calling setup queue. Reverse order may cause you
loosing all the memory.

You should know the number of the queues already, instead of trying to allocate
dynamiccaly in setup function, why don't you allocate it in advance?

> +	if (mq == NULL) {
> +		MIF_LOG(ERR, "%s: Failed to alloc tx queue %u.",
> +			rte_vdev_device_name(pmd->vdev), qid);
> +		return -ENOMEM;
> +	}
> +
> +	pmd->tx_queues = mq;

It may be better to have a varible pointer to pointer,
"struct memif_queue **tx_queues", as 'data->tx_queues' does, because that is
what you are storing, harder to keep this with "struct memif_queue *tx_queues"
although can be done.

> +
> +	mq->type =
> +	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_S2M : MEMIF_RING_M2S;
> +	mq->n_pkts = 0;
> +	mq->n_bytes = 0;
> +	mq->n_err = 0;
> +	mq->intr_handle.fd = -1;
> +	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
> +	mq->pmd = pmd;

Something looks wrong in this code or I am missing something.

When "mq" expanded with "rte_realloc()" context preserved, but the new chunk of
memory it at the end, but you are writing to the beggining, isn't this overwrite
the existing data?
Have you test this code with multi-queue?

> +	dev->data->tx_queues[qid] = mq;

Also this part seems wrong, "data->tx_queues[qid]" should preserve the pointer
to that specific queue with 'qid', here it point to the beginning of the memorry
that has all queues.
Also after rte_realloc() previous pointer may become invalid, and you may be
storing/accessing invalid pointer.

> +
> +	return 0;
> +}
> +
> +static int
> +memif_rx_queue_setup(struct rte_eth_dev *dev,
> +		     uint16_t qid,
> +		     uint16_t nb_rx_desc __rte_unused,
> +		     unsigned int socket_id __rte_unused,
> +		     const struct rte_eth_rxconf *rx_conf __rte_unused,
> +		     struct rte_mempool *mb_pool)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_queue *mq;
> +
> +	mq = rte_realloc(pmd->rx_queues, sizeof(struct memif_queue) * (qid + 1),
> +			 0);
> +	if (mq == NULL) {
> +		MIF_LOG(ERR, "%s: Failed to alloc rx queue %u.",
> +			rte_vdev_device_name(pmd->vdev), qid);
> +		return -ENOMEM;
> +	}
> +
> +	pmd->rx_queues = mq;
> +
> +	mq->type =
> +	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_M2S : MEMIF_RING_S2M;
> +	mq->n_pkts = 0;
> +	mq->n_bytes = 0;
> +	mq->n_err = 0;
> +	mq->intr_handle.fd = -1;
> +	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
> +	mq->mempool = mb_pool;
> +	mq->in_port = dev->data->port_id;
> +	mq->pmd = pmd;
> +	dev->data->rx_queues[qid] = mq;

Same problems with 'memif_tx_queue_setup' exists here, I will assume multi queue
not tested at all.

<...>

> +static void
> +memif_stats_reset(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	int i;
> +	struct memif_queue *mq;
> +
> +	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
> +		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? &pmd->tx_queues[i] :
> +		    &pmd->rx_queues[i];
> +		mq->n_pkts = 0;
> +		mq->n_bytes = 0;
> +		mq->n_err = 0;
> +	}
> +	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
> +		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? &pmd->rx_queues[i] :
> +		    &pmd->tx_queues[i];
> +		mq->n_pkts = 0;
> +		mq->n_bytes = 0;
> +		mq->n_err = 0;
> +	}

Is this logic correct? If "role == MEMIF_ROLE_SLAVE",
foreach pmd->run.num_s2m_rings: reset tx_queues
foreach pmd->run.num_m2s_rings: reset rx_queues

Not sure if "pmd->run.num_s2m_rings" & "pmd->run.num_s2m_rings" can be different
values, but from rest of the code what I get, one is number of queues for SLAVE
and other is number of queues for MASTER. The logic on 'memif_stats_get()' looks
better.

> +}
> +
> +static int
> +memif_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +
> +	MIF_LOG(WARNING, "%s: Interrupt mode not supported.",
> +		rte_vdev_device_name(pmd->vdev));
> +
> +	/* Enable MEMIF interrupts. */
> +	/* pmd->rx_queues[qid].ring->flags  &= ~MEMIF_RING_FLAG_MASK_INT; */
> +
> +	/*
> +	 * TODO: Tell dpdk to use interrupt mode.
> +	 *
> +	 * return rte_intr_enable(&pmd->rx_queues[qid].intr_handle);
> +	 */

Please remove commented code while upstreaming.

> +	return -1;
> +}
> +
> +static int
> +memif_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
> +{
> +	struct pmd_internals *pmd __rte_unused = dev->data->dev_private;
> +
> +	/* Disable MEMIF interrupts. */
> +	/* pmd->rx_queues[qid].ring->flags |= MEMIF_RING_FLAG_MASK_INT; */
> +
> +	/*
> +	 * TODO: Tell dpdk to use polling mode.
> +	 *
> +	 * return rte_intr_disable(&pmd->rx_queues[qid].intr_handle);
> +	 */

Same here.

<...>

> +
> +	/* FIXME: generate mac? */
> +	if (eth_addr == NULL)
> +		eth_addr = ETH_MEMIF_DEFAULT_ETH_ADDR;

It is easy to generate random MAC in DPDK via eth_random_addr() API, it worth
doing it and removing the FIXME.

<...>

> +static int
> +memif_set_role(const char *key __rte_unused, const char *value,
> +	       void *extra_args)
> +{
> +	enum memif_role_t *role = (enum memif_role_t *)extra_args;
> +	if (strstr(value, "master") != NULL) {

Can 'value' be something containing 'master' or should it be exact 'master', if
expectaion is exact match, strcmp() can be used. Same for below ones.

<...>

> +static int
> +memif_set_bs(const char *key __rte_unused, const char *value, void *extra_args)
> +{
> +	unsigned long tmp;
> +	uint16_t *buffer_size = (uint16_t *)extra_args;
> +
> +	tmp = strtoul(value, NULL, 10);
> +	if (tmp == 0 || tmp > 0xFFFF) {

This 0xFFFF size limit is very hidden in kvargs function, can be good to add
this to documentation and header comment.

> +		MIF_LOG(ERR, "Invalid buffer size: %s.", value);
> +		return -EINVAL;
> +	}
> +	*buffer_size = tmp;
> +	return 0;
> +}
> +
> +static int
> +memif_set_rs(const char *key __rte_unused, const char *value, void *extra_args)
> +{
> +	unsigned long tmp;
> +	memif_log2_ring_size_t *log2_ring_size =
> +	    (memif_log2_ring_size_t *)extra_args;
> +
> +	tmp = strtoul(value, NULL, 10);
> +	if (tmp == 0 || tmp > ETH_MEMIF_MAX_LOG2_RING_SIZE) {

Same here for ETH_MEMIF_MAX_LOG2_RING_SIZE, can you document it? Same for below
ones.

<...>

> +	/* set default values */
> +	role = MEMIF_ROLE_SLAVE;
> +	flags = 0;
> +	id = 0;
> +	buffer_size = 2048;
> +	log2_ring_size = 10;
> +	nrxq = 1;
> +	ntxq = 1;

There are already some ETH_MEMIF_DEFAULT_* macros are defined same of these
values, why not use them? And can be good to have default data information in
single place.

> +	socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME;
> +	secret = NULL;
> +	eth_addr = NULL;
> +
> +	/* parse parameters */
> +	if (kvlist != NULL) {
> +		if (rte_kvargs_count(kvlist, ETH_MEMIF_ROLE_ARG) == 1) {
> +			ret = rte_kvargs_process(kvlist, ETH_MEMIF_ROLE_ARG,
> +						 &memif_set_role, &role);
> +			if (ret < 0)
> +				goto exit;
> +		}
Do you need rte_kvargs_count() checks?
If you are not really interested in the count, can call rte_kvargs_process()
directly, if key exists the callback will be called, if not exists it will
return 0, so safe to call without count check.

> +		if (rte_kvargs_count(kvlist, ETH_MEMIF_ID_ARG) == 1) {
> +			ret = rte_kvargs_process(kvlist, ETH_MEMIF_ID_ARG,
> +						 &memif_set_id, &id);
> +			if (ret < 0)
> +				goto exit;
> +		}
> +		if (rte_kvargs_count(kvlist, ETH_MEMIF_BUFFER_SIZE_ARG) == 1) {
> +			ret =
> +			    rte_kvargs_process(kvlist,
> +					       ETH_MEMIF_BUFFER_SIZE_ARG,
> +					       &memif_set_bs, &buffer_size);
> +			if (ret < 0)
> +				goto exit;
> +		}
> +		if (rte_kvargs_count(kvlist, ETH_MEMIF_RING_SIZE_ARG) == 1) {
> +			ret =
> +			    rte_kvargs_process(kvlist, ETH_MEMIF_RING_SIZE_ARG,
> +					       &memif_set_rs, &log2_ring_size);
> +			if (ret < 0)
> +				goto exit;
> +		}
> +		if (rte_kvargs_count(kvlist, ETH_MEMIF_NRXQ_ARG) == 1) {
> +			ret = rte_kvargs_process(kvlist, ETH_MEMIF_NRXQ_ARG,
> +						 &memif_set_nq, &nrxq);
> +			if (ret < 0)
> +				goto exit;
> +		}
> +		if (rte_kvargs_count(kvlist, ETH_MEMIF_NTXQ_ARG) == 1) {
> +			ret = rte_kvargs_process(kvlist, ETH_MEMIF_NTXQ_ARG,
> +						 &memif_set_nq, &ntxq);
> +			if (ret < 0)
> +				goto exit;
> +		}

Nuber of Rx/Tx queues can be configured dynamically by applications. This
information provided by ".dev_configure" which seems you are not using at all.
Why getting number of queue information statically during PMD probe instead of
getting dynamic by app?

Won't it be possible app disconnect one memif PMD, reconfigure it with different
queue numbers and connect to other master?

> +		if (rte_kvargs_count(kvlist, ETH_MEMIF_SOCKET_ARG) == 1) {
> +			for (i = 0; i < kvlist->count; i++) {
> +				pair = &kvlist->pairs[i];
> +				if (strcmp(pair->key, ETH_MEMIF_SOCKET_ARG) == 0) {
> +					socket_filename = pair->value;
> +					ret = memif_check_socket_filename(
> +						socket_filename);
> +					if (ret < 0)
> +						goto exit;
> +				}

Instead of accessing the kvarg details and 'pair->value' why not implement
'memif_check_socket_filename()' as callback funtion to the 'rte_kvargs_process()' ?

> +			}
> +		}
> +		if (rte_kvargs_count(kvlist, ETH_MEMIF_MAC_ARG) == 1) {
> +			for (i = 0; i < kvlist->count; i++) {
> +				pair = &kvlist->pairs[i];
> +				if (strcmp(pair->key, ETH_MEMIF_MAC_ARG) == 0)
> +					eth_addr = pair->value;

Similar comment here, instead of keeping 'eth_addr' as string and posponing task
of process it to 'memif_create()', you can process it in a callback and store as
'struct ether_addr'

Also you can store the default mac address as 'struct ether_addr' instead of string.

> +			}
> +		}
> +		if (rte_kvargs_count(kvlist, ETH_MEMIF_ZC_ARG) == 1) {
> +			ret = rte_kvargs_process(kvlist, ETH_MEMIF_ZC_ARG,
> +						 &memif_set_zc, &flags);
> +			if (ret < 0)
> +				goto exit;
> +		}
> +		if (rte_kvargs_count(kvlist, ETH_MEMIF_SECRET_ARG) == 1) {
> +			for (i = 0; i < kvlist->count; i++) {
> +				pair = &kvlist->pairs[i];
> +				if (strcmp(pair->key, ETH_MEMIF_SECRET_ARG) == 0)
> +					secret = pair->value;
> +			}

Same here, I think better to implement as rte_kvargs_process();

> +		}
> +	}
> +
> +	/* create interface */
> +	ret =
> +	    memif_create(vdev, role, id, flags, socket_filename, log2_ring_size,
> +			 nrxq, ntxq, buffer_size, secret, eth_addr);

Just syntax, but no need to break after "ret = " line.

<...>

> +RTE_PMD_REGISTER_VDEV(net_memif, pmd_memif_drv);
> +RTE_PMD_REGISTER_ALIAS(net_memif, eth_memif);

No need to provide alias for new PMDs, it is for backward compatibility.

> +RTE_PMD_REGISTER_PARAM_STRING(net_memif,
> +			      ETH_MEMIF_ID_ARG "=<int>"
> +			      ETH_MEMIF_ROLE_ARG "=<string>"

It can be more helpful to describe the valid set, since it is limited:
ETH_MEMIF_ROLE_ARG "=master|slave"

<...>
> +int memif_logtype;

This will work but wouldn't it be better to define variable to .c file and have
an extern here.

<...>

> +	uint16_t last_head;			/**< last ring head */
> +	uint16_t last_tail;			/**< last ring tail */
> +	/*uint32_t *buffers;*/

Please remove commented code.

<...>

> +
> +	struct memif_queue *rx_queues;		/**< RX queues */
> +	struct memif_queue *tx_queues;		/**< TX queues */

Aren't these are duplicates of 'data->rx_queues' & 'data->tx_queues'? Why not
using existing ones but creating anouther copy here?

> +
> +	/* remote info */
> +	char remote_name[64];			/**< remote app name */
> +	char remote_if_name[64];		/**< remote peer name */
> +
> +	struct {
> +		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
> +		uint8_t num_s2m_rings;		/**< number of slave to master rings */
> +		uint8_t num_m2s_rings;		/**< number of master to slave rings */
> +		uint16_t buffer_size;		/**< buffer size */
> +	} cfg;					/**< Configured parameters (max values) */
> +
> +	struct {
> +		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
> +		uint8_t num_s2m_rings;		/**< number of slave to master rings */
> +		uint8_t num_m2s_rings;		/**< number of master to slave rings */
> +		uint16_t buffer_size;		/**< buffer size */
> +	} run;

What about defining the struct out, and reuse it for both 'cfg' & 'run'?

<...>
> +#ifndef MFD_HUGETLB
> +#ifndef __NR_memfd_create

Is there are a kernel version requirement for these? If so what it is?

> +
> +#if defined __x86_64__
> +#define __NR_memfd_create 319
> +#elif defined __arm__
> +#define __NR_memfd_create 385
> +#elif defined __aarch64__
> +#define __NR_memfd_create 279
> +#else
> +#error "__NR_memfd_create unknown for this architecture"
> +#endif
> +
> +#endif				/* __NR_memfd_create */
> +
> +static inline int memfd_create(const char *name, unsigned int flags)
> +{
> +	return syscall(__NR_memfd_create, name, flags);
> +}

Isn't there a glibc wrapper for this, do you really need to call syscall?
And why i686 and powerpc are not supported?

btw, is memif PMD supports 32-bits?
(.ini file, requested above, good for document these kind of questions)

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v3] /net: memory interface (memif)
  2018-12-13 13:30 [dpdk-dev] [RFC v3] /net: memory interface (memif) Jakub Grajciar
  2018-12-13 18:07 ` Stephen Hemminger
  2019-01-04 17:16 ` Ferruh Yigit
@ 2019-01-04 19:23 ` Stephen Hemminger
  2019-01-04 19:27 ` Stephen Hemminger
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 58+ messages in thread
From: Stephen Hemminger @ 2019-01-04 19:23 UTC (permalink / raw)
  To: Jakub Grajciar; +Cc: dev

On Thu, 13 Dec 2018 14:30:51 +0100
Jakub Grajciar <jgrajcia@cisco.com> wrote:

> +CFLAGS += -O3
> +CFLAGS += -I$(SRCDIR)
> +CFLAGS += $(WERROR_FLAGS)
> +CFLAGS += -Wno-pointer-arith

Why this additional compiler flag?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v3] /net: memory interface (memif)
  2018-12-13 13:30 [dpdk-dev] [RFC v3] /net: memory interface (memif) Jakub Grajciar
                   ` (2 preceding siblings ...)
  2019-01-04 19:23 ` Stephen Hemminger
@ 2019-01-04 19:27 ` Stephen Hemminger
  2019-01-04 19:32 ` Stephen Hemminger
  2019-02-20 11:52 ` [dpdk-dev] [RFC v4] " Jakub Grajciar
  5 siblings, 0 replies; 58+ messages in thread
From: Stephen Hemminger @ 2019-01-04 19:27 UTC (permalink / raw)
  To: Jakub Grajciar; +Cc: dev

On Thu, 13 Dec 2018 14:30:51 +0100
Jakub Grajciar <jgrajcia@cisco.com> wrote:

> +static ssize_t
> +memif_msg_send(int fd, memif_msg_t *msg, int afd)
> +{
> +	struct msghdr mh = { 0 };
> +	struct iovec iov[1];
> +	char ctl[CMSG_SPACE(sizeof(int))];
> +
> +	iov[0].iov_base = (void *)msg;

Since iov_base is of type void * the cast is unnecessary in C.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v3] /net: memory interface (memif)
  2018-12-13 13:30 [dpdk-dev] [RFC v3] /net: memory interface (memif) Jakub Grajciar
                   ` (3 preceding siblings ...)
  2019-01-04 19:27 ` Stephen Hemminger
@ 2019-01-04 19:32 ` Stephen Hemminger
  2019-02-20 11:52 ` [dpdk-dev] [RFC v4] " Jakub Grajciar
  5 siblings, 0 replies; 58+ messages in thread
From: Stephen Hemminger @ 2019-01-04 19:32 UTC (permalink / raw)
  To: Jakub Grajciar; +Cc: dev

On Thu, 13 Dec 2018 14:30:51 +0100
Jakub Grajciar <jgrajcia@cisco.com> wrote:

> +	role = MEMIF_ROLE_SLAVE;
> +	flags = 0;
> +	id = 0;
> +	buffer_size = 2048;
> +	log2_ring_size = 10;
> +	nrxq = 1;
> +	ntxq = 1;
> +	socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME;
> +	secret = NULL;
> +	eth_addr = NULL;

Minor nit, since you initialize some of the values when declaring them.
Why not move all these initializations up to where the declaration is.
Just makes everything more compact.


> +	/* create interface */
> +	ret =
> +	    memif_create(vdev, role, id, flags, socket_filename, log2_ring_size,
> +			 nrxq, ntxq, buffer_size, secret, eth_addr);
> +

Another nit, why split the line like this? Instead of:
	ret = memif_create(vdev, role, id, flags, socket_filename,
			   log2_ring_size, nrxq, ntxq, buffer_size, secret, eth_addr);

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [dpdk-dev] [RFC v4] /net: memory interface (memif)
  2018-12-13 13:30 [dpdk-dev] [RFC v3] /net: memory interface (memif) Jakub Grajciar
                   ` (4 preceding siblings ...)
  2019-01-04 19:32 ` Stephen Hemminger
@ 2019-02-20 11:52 ` Jakub Grajciar
  2019-02-20 15:46   ` Stephen Hemminger
                     ` (4 more replies)
  5 siblings, 5 replies; 58+ messages in thread
From: Jakub Grajciar @ 2019-02-20 11:52 UTC (permalink / raw)
  To: dev; +Cc: Jakub Grajciar

Memory interface (memif), provides high performance
packet transfer over shared memory.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
---
 MAINTAINERS                                 |    6 +
 config/common_base                          |    5 +
 config/common_linuxapp                      |    1 +
 doc/guides/nics/features/memif.ini          |   14 +
 doc/guides/nics/index.rst                   |    1 +
 doc/guides/nics/memif.rst                   |  194 ++++
 drivers/net/Makefile                        |    1 +
 drivers/net/memif/Makefile                  |   28 +
 drivers/net/memif/memif.h                   |  178 +++
 drivers/net/memif/memif_socket.c            | 1092 ++++++++++++++++++
 drivers/net/memif/memif_socket.h            |  104 ++
 drivers/net/memif/meson.build               |   13 +
 drivers/net/memif/rte_eth_memif.c           | 1124 +++++++++++++++++++
 drivers/net/memif/rte_eth_memif.h           |  203 ++++
 drivers/net/memif/rte_pmd_memif_version.map |    4 +
 drivers/net/meson.build                     |    1 +
 mk/rte.app.mk                               |    1 +
 17 files changed, 2970 insertions(+)
 create mode 100644 doc/guides/nics/features/memif.ini
 create mode 100644 doc/guides/nics/memif.rst
 create mode 100644 drivers/net/memif/Makefile
 create mode 100644 drivers/net/memif/memif.h
 create mode 100644 drivers/net/memif/memif_socket.c
 create mode 100644 drivers/net/memif/memif_socket.h
 create mode 100644 drivers/net/memif/meson.build
 create mode 100644 drivers/net/memif/rte_eth_memif.c
 create mode 100644 drivers/net/memif/rte_eth_memif.h
 create mode 100644 drivers/net/memif/rte_pmd_memif_version.map

requires patch: http://patchwork.dpdk.org/patch/49009/

v3:
- coding style fixes
- documentation
- doxygen comments
- use strlcpy() instead of strncpy()
- use RTE_BUILD_BUG_ON instead of _Static_assert()
- fix meson build deps

v4:
- coding style fixes
- doc update (desc format, messaging, features, ...)
- pointer arithmetic fix
- __rte_packed and __rte_aligned instead of __attribute__()
- fixed multi-queue
- support additional archs (memfd_create syscall)

diff --git a/MAINTAINERS b/MAINTAINERS
index 71ba31208..12df181c6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -786,6 +786,12 @@ F: drivers/net/softnic/
 F: doc/guides/nics/features/softnic.ini
 F: doc/guides/nics/softnic.rst
 
+Memif PMD
+M: Jakub Grajciar <jgrajcia@cisco.com>
+F: drivers/net/memif/
+F: doc/guides/nics/memif.rst
+F: doc/guides/nics/features/memif.ini
+
 
 Crypto Drivers
 --------------
diff --git a/config/common_base b/config/common_base
index d12ae98bc..b8ed10ae5 100644
--- a/config/common_base
+++ b/config/common_base
@@ -403,6 +403,11 @@ CONFIG_RTE_LIBRTE_VMXNET3_DEBUG_TX_FREE=n
 #
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 
+#
+# Compile Memory Interface PMD driver (Linux only)
+#
+CONFIG_RTE_LIBRTE_PMD_MEMIF=n
+
 #
 # Compile link bonding PMD library
 #
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 6c1c8d0f4..42cbde8f5 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -18,6 +18,7 @@ CONFIG_RTE_LIBRTE_VHOST_POSTCOPY=n
 CONFIG_RTE_LIBRTE_PMD_VHOST=y
 CONFIG_RTE_LIBRTE_IFC_PMD=y
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
+CONFIG_RTE_LIBRTE_PMD_MEMIF=y
 CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
 CONFIG_RTE_LIBRTE_PMD_TAP=y
 CONFIG_RTE_LIBRTE_AVP_PMD=y
diff --git a/doc/guides/nics/features/memif.ini b/doc/guides/nics/features/memif.ini
new file mode 100644
index 000000000..807d9ecdc
--- /dev/null
+++ b/doc/guides/nics/features/memif.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'memif' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Link status          = Y
+Basic stats          = Y
+Jumbo frame          = Y
+ARMv8                = Y
+Power8               = Y
+x86-32               = Y
+x86-64               = Y
+Usage doc            = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 1e4670501..3cf34b52d 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -32,6 +32,7 @@ Network Interface Controller Drivers
     intel_vf
     kni
     liquidio
+    memif
     mlx4
     mlx5
     mvneta
diff --git a/doc/guides/nics/memif.rst b/doc/guides/nics/memif.rst
new file mode 100644
index 000000000..f4be39dab
--- /dev/null
+++ b/doc/guides/nics/memif.rst
@@ -0,0 +1,194 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2018-2019 Cisco Systems, Inc.
+
+======================
+Memif Poll Mode Driver
+======================
+Shared memory packet interface (memif) PMD allows for DPDK and any other client
+using memif (DPDK, VPP, libmemif) toc ommunicate using shared memory. Memif is
+Linux only.s
+
+The created device transmits packets in a raw format. It can be used with
+Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
+supported in DPDK memif implementation.
+
+Memif works in two roles: master and slave. Slave connects to master over an
+existing socket. It is also a producer of shared memory file and initializes
+the shared memory. Master creates the socket and listens for any slave
+connection requests. The socket may already exist on the system. Be sure to
+remove any such sockets, if you are creating a master interface, or you will
+see an "Address already in use" error. Function ``rte_pmd_memif_remove()``,
+which removes memif interface, will also remove a listener socket, if it is
+not being used by any other interface.
+
+The method to enable one or more interfaces is to use the
+``--vdev=net_memif0`` option on the DPDK application command line. Each
+``--vdev=net_memif1`` option given will create an interface named net_memif0,
+net_memif1, and so on. Memif uses unix domain socket to transmit control
+messages. Each memif has a unique id per socket. If you are connecting multiple
+interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``,
+etc. Note that if you assign a socket to a master interface it becomes a
+listener socket. Listener socket can not be used by a slave interface on same
+client.
+
+.. csv-table:: **Memif configuration options**
+   :header: "Option", "Description", "Default", "Valid value"
+
+   "id=0", "Each memif on same socket must be given a unique id", "0", "uint32_t"
+   "role=master", "Set memif role", "slave", "master|slave"
+   "bsize=1024", "Size of packet buffer", "2048", "uint16_t"
+   "rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", "10", "1-14"
+   "nrxq=2", "Number of RX queues", "1", "255"
+   "ntxq=2", "Number of TX queues", "1", "255"
+   "socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 256"
+   "mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef", ""
+   "secret=abc123", "Secret is an optional security option, which if specified, must be matched by peer", "", "string len 24"
+   "zero-copy=yes", "Enable/disable zero-copy slave mode", "no", "yes|no"
+
+
+**Connection establishment**
+
+Memif driver uses unix domain socket to exchange required information between
+memif interfaces. Socket file path is specified at interface creation see
+*Memif configuration options* table above. If socket is used by master
+interface, it's marked as listener socket (in scope of current process) and
+listens to connection requests from other processes. One socket can be used by
+multiple interfaces. One process can have slave and master interfaces at the
+same time, provided each role assigned unique socket.
+
+For detailed information on memif control messages, see: net/memif/memif.h.
+
+Slave interface attempts to make a connection on assigned socket. Process
+listening on this socket will extract the connection request and create a new
+connected socket (control channel). Then it sends the 'hello' message
+(MEMIF_MSG_TYPE_HELLO), containing configuration boundaries. Slave interface
+adjusts its configuration accordingly, and sends 'init' message
+(MEMIF_MSG_TYPE_INIT). This message among others contains interface id. Driver
+uses this id to find master interface, and assigns the control channel to this
+interface. If such interface is found, 'ack' message (MEMIF_MSG_TYPE_ACK) is
+sent. Slave interface sends 'add region' message (MEMIF_MSG_TYPE_ADD_REGION) for
+every region allocated. Master responds to each of these messages with 'ack'
+message. Same behavior applies to rings. Slave sends 'add ring' message
+(MEMIF_MSG_TYPE_ADD_RING) for every initialized ring. Master again responds to
+each message with 'ack' message. To finalize the connection, slave interface
+sends 'connect' message (MEMIF_MSG_TYPE_CONNECT). Upon receiving this message
+master maps regions to its address space, initializes rings and responds with
+'connected' message (MEMIF_MSG_TYPE_CONNECTED). Disconnect
+(MEMIF_MSG_TYPE_DISCONNECT) can be sent by both master and slave interfaces at
+any time, due to driver error or if the interface is being deleted.
+
+Files
+
+- net/memif/memif.h *- control messages definitions*
+- net/memif/memif_socket.h
+- net/memif/memif_socket.c
+
+
+
+**Descriptor format**
+
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Quad|6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | |1|1| | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Word|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | |6|5| | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|0   |length                                                         |region                         |flags                          |
++----+---------------------------------------------------------------+-------------------------------+-------------------------------+
+|1   |metadata                                                       |offset                                                         |
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+
+**Flags field - flags (Quad Word 0, bits 0:15)**
+
++-----+--------------------+------------------------------------------------------------------------------------------------+
+|Bits |Name                |Functionality                                                                                   |
++=====+====================+================================================================================================+
+|0    |MEMIF_DESC_FLAG_NEXT|Is chained buffer. When set, the packet is divided into multiple buffers. May not be contiguous.|
++-----+--------------------+------------------------------------------------------------------------------------------------+
+
+**Region index - region (Quad Word 0, 16:31)**
+
+Index of memory region, the buffer is located in.
+
+**Data length - length (Quad Word 0, 32:63)**
+
+Length of transmitted/received data.
+
+**Data Offset - offset (Quad Word 1, 0:31)**
+
+Data start offset from memory region address. *.regions[desc->region].addr + desc->offset*
+
+**Metadata - metadata (Quad Word 1, 32:63)**
+
+Buffer metadata.
+
+Files
+
+- net/memif/memif.h *- descriptor and ring definitions*
+- net/memif/rte_eth_memif.c *- eth_memif_rx() eth_memif_tx()*
+
+
+
+Example: testpmd and testpmd
+----------------------------
+In this example we run two instances of testpmd application and transmit packets over memif.
+
+First create master interface::
+
+    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
+
+Now create slave interace (master must be already running so the slave will connect)::
+
+    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
+
+Set forwarding mode in one of the instances to 'rx only' and the other to 'tx only'::
+
+    testpmd> set fwd rxonly
+    testpmd> start
+
+    testpmd> set fwd txonly
+    testpmd> start
+
+Show status::
+
+    testpmd> show port stats 0
+
+Example: testpmd and VPP
+------------------------
+For information on how to get and run VPP please see `<https://wiki.fd.io/view/VPP>`_.
+
+Start VPP in interactive mode (should be by default). Create memif master interface in VPP::
+
+    vpp# create interface memif id 0 master no-zero-copy
+    vpp# set interface state memif0/0 up
+    vpp# set interface ip address memif0/0 192.168.1.1/24
+
+To see socket filename use show memif command::
+
+    vpp# show memif
+    sockets
+     id  listener    filename
+      0   yes (1)     /run/vpp/memif.sock
+    ...
+
+Now create memif interface by running testpmd with these command line options::
+
+    #./testpmd --vdev=net_memif,socket=/run/vpp/memif.sock -- -i
+
+Testpmd should now create memif slave interface and try to connect to master.
+In testpmd set forward option to icmpecho and start forwarding::
+
+    testpmd> set fwd icmpecho
+    testpmd> start
+
+Send ping from VPP::
+
+    vpp# ping 192.168.1.2
+    64 bytes from 192.168.1.2: icmp_seq=2 ttl=254 time=36.2918 ms
+    64 bytes from 192.168.1.2: icmp_seq=3 ttl=254 time=23.3927 ms
+    64 bytes from 192.168.1.2: icmp_seq=4 ttl=254 time=24.2975 ms
+    64 bytes from 192.168.1.2: icmp_seq=5 ttl=254 time=17.7049 ms
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index c0386feb9..0feab5241 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -32,6 +32,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_FM10K_PMD) += fm10k
 DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e
 DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
 DIRS-$(CONFIG_RTE_LIBRTE_LIO_PMD) += liquidio
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif
 DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
 DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
 DIRS-$(CONFIG_RTE_LIBRTE_MVNETA_PMD) += mvneta
diff --git a/drivers/net/memif/Makefile b/drivers/net/memif/Makefile
new file mode 100644
index 000000000..0a9445017
--- /dev/null
+++ b/drivers/net/memif/Makefile
@@ -0,0 +1,28 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_memif.a
+
+EXPORT_MAP := rte_pmd_memif_version.map
+
+LIBABIVER := 1
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
+LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs
+LDLIBS += -lrte_bus_vdev -lrte_hash
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += rte_eth_memif.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif_socket.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/memif/memif.h b/drivers/net/memif/memif.h
new file mode 100644
index 000000000..04ba14f88
--- /dev/null
+++ b/drivers/net/memif/memif.h
@@ -0,0 +1,178 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_H_
+#define _MEMIF_H_
+
+#define MEMIF_COOKIE		0x3E31F20
+#define MEMIF_VERSION_MAJOR	2
+#define MEMIF_VERSION_MINOR	0
+#define MEMIF_VERSION		((MEMIF_VERSION_MAJOR << 8) | MEMIF_VERSION_MINOR)
+#define MEMIF_NAME_SZ		32
+
+/*
+ * S2M: direction slave -> master
+ * M2S: direction master -> slave
+ */
+
+/*
+ *  Type definitions
+ */
+
+typedef enum memif_msg_type {
+	MEMIF_MSG_TYPE_NONE,
+	MEMIF_MSG_TYPE_ACK,
+	MEMIF_MSG_TYPE_HELLO,
+	MEMIF_MSG_TYPE_INIT,
+	MEMIF_MSG_TYPE_ADD_REGION,
+	MEMIF_MSG_TYPE_ADD_RING,
+	MEMIF_MSG_TYPE_CONNECT,
+	MEMIF_MSG_TYPE_CONNECTED,
+	MEMIF_MSG_TYPE_DISCONNECT,
+} memif_msg_type_t;
+
+typedef enum {
+	MEMIF_RING_S2M, /**< buffer ring in direction slave -> master */
+	MEMIF_RING_M2S, /**< buffer ring in direction master -> slave */
+} memif_ring_type_t;
+
+typedef enum {
+	MEMIF_INTERFACE_MODE_ETHERNET,
+	MEMIF_INTERFACE_MODE_IP,
+	MEMIF_INTERFACE_MODE_PUNT_INJECT,
+} memif_interface_mode_t;
+
+typedef uint16_t memif_region_index_t;
+typedef uint32_t memif_region_offset_t;
+typedef uint64_t memif_region_size_t;
+typedef uint16_t memif_ring_index_t;
+typedef uint32_t memif_interface_id_t;
+typedef uint16_t memif_version_t;
+typedef uint8_t memif_log2_ring_size_t;
+
+/*
+ *  Socket messages
+ */
+
+ /**
+  * M2S
+  * Contains master interfaces configuration.
+  */
+typedef struct __rte_packed {
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+	memif_version_t min_version; /**< lowest supported memif version */
+	memif_version_t max_version; /**< highest supported memif version */
+	memif_region_index_t max_region; /**< maximum num of regions */
+	memif_ring_index_t max_m2s_ring; /**< maximum num of M2S ring */
+	memif_ring_index_t max_s2m_ring; /**< maximum num of S2M rings */
+	memif_log2_ring_size_t max_log2_ring_size; /**< maximum ring size (as log2) */
+} memif_msg_hello_t;
+
+/**
+ * S2M
+ * Contains information required to identify interface
+ * to which the slave wants to connect.
+ */
+typedef struct __rte_packed {
+	memif_version_t version;		/**< memif version */
+	memif_interface_id_t id;		/**< interface id */
+	memif_interface_mode_t mode:8;		/**< interface mode */
+	uint8_t secret[24];			/**< optional security parameter */
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+} memif_msg_init_t;
+
+/**
+ * S2M
+ * Request master to add new shared memory region to master interface.
+ * Shared files file descriptor is passed in cmsghdr.
+ */
+typedef struct __rte_packed {
+	memif_region_index_t index;		/**< shm regions index */
+	memif_region_size_t size;		/**< shm region size */
+} memif_msg_add_region_t;
+
+/**
+ * S2M
+ * Request master to add new ring to master interface.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_MSG_ADD_RING_FLAG_S2M 1		/**< ring is in S2M direction */
+	memif_ring_index_t index;		/**< ring index */
+	memif_region_index_t region; /**< region index on which this ring is located */
+	memif_region_offset_t offset;		/**< buffer start offset */
+	memif_log2_ring_size_t log2_ring_size;	/**< ring size (log2) */
+	uint16_t private_hdr_size;		/**< used for private metadata */
+} memif_msg_add_ring_t;
+
+/**
+ * S2M
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< slave interface name */
+} memif_msg_connect_t;
+
+/**
+ * M2S
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< master interface name */
+} memif_msg_connected_t;
+
+/**
+ * S2M & M2S
+ * Disconnect interfaces.
+ */
+typedef struct __rte_packed {
+	uint32_t code;				/**< error code */
+	uint8_t string[96];			/**< disconnect reason */
+} memif_msg_disconnect_t;
+
+typedef struct __rte_packed __rte_aligned(128) {
+	memif_msg_type_t type:16;
+	union {
+		memif_msg_hello_t hello;
+		memif_msg_init_t init;
+		memif_msg_add_region_t add_region;
+		memif_msg_add_ring_t add_ring;
+		memif_msg_connect_t connect;
+		memif_msg_connected_t connected;
+		memif_msg_disconnect_t disconnect;
+	};
+} memif_msg_t;
+
+/*
+ *  Ring and Descriptor Layout
+ */
+
+/**
+ * Buffer descriptor.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_DESC_FLAG_NEXT 1			/**< is chained buffer */
+	memif_region_index_t region; /**< region index on which the buffer is located */
+	uint32_t length;			/**< buffer length */
+	memif_region_offset_t offset;		/**< buffer offset */
+	uint32_t metadata;
+} memif_desc_t;
+
+#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
+	uint8_t mark[0] __rte_aligned(RTE_CACHE_LINE_SIZE)
+
+typedef struct {
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline0);
+	uint32_t cookie;			/**< MEMIF_COOKIE */
+	uint16_t flags;				/**< flags */
+#define MEMIF_RING_FLAG_MASK_INT 1		/**< disable interrupt mode */
+	volatile uint16_t head;			/**< pointer to ring buffer head */
+	 MEMIF_CACHELINE_ALIGN_MARK(cacheline1);
+	volatile uint16_t tail;			/**< pointer to ring buffer tail */
+	 MEMIF_CACHELINE_ALIGN_MARK(cacheline2);
+	memif_desc_t desc[0];			/**< buffer descriptors */
+} memif_ring_t;
+
+#endif				/* _MEMIF_H_ */
diff --git a/drivers/net/memif/memif_socket.c b/drivers/net/memif/memif_socket.c
new file mode 100644
index 000000000..48d60ee29
--- /dev/null
+++ b/drivers/net/memif/memif_socket.c
@@ -0,0 +1,1092 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <errno.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_hash.h>
+#include <rte_jhash.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+static void memif_intr_handler(void *arg);
+
+static ssize_t
+memif_msg_send(int fd, memif_msg_t *msg, int afd)
+{
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	char ctl[CMSG_SPACE(sizeof(int))];
+
+	iov[0].iov_base = msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+
+	if (afd > 0) {
+		struct cmsghdr *cmsg;
+		memset(&ctl, 0, sizeof(ctl));
+		mh.msg_control = ctl;
+		mh.msg_controllen = sizeof(ctl);
+		cmsg = CMSG_FIRSTHDR(&mh);
+		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+		cmsg->cmsg_level = SOL_SOCKET;
+		cmsg->cmsg_type = SCM_RIGHTS;
+		rte_memcpy(CMSG_DATA(cmsg), &afd, sizeof(int));
+	}
+
+	return sendmsg(fd, &mh, 0);
+}
+
+static int
+memif_msg_send_from_queue(struct memif_control_channel *cc)
+{
+	ssize_t size;
+	int ret = 0;
+	struct memif_msg_queue_elt *e;
+	e = TAILQ_FIRST(&cc->msg_queue);
+	if (e == NULL)
+		return 0;
+
+	size = memif_msg_send(cc->intr_handle.fd, &e->msg, e->fd);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(ERR, "sendmsg fail: %s.", strerror(errno));
+		ret = -1;
+	} else {
+		MIF_LOG(DEBUG, "Sent msg type %u.", e->msg.type);
+	}
+	TAILQ_REMOVE(&cc->msg_queue, e, next);
+	rte_free(e);
+
+	return ret;
+}
+
+static struct memif_msg_queue_elt *
+memif_msg_enq(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = rte_zmalloc("memif_msg",
+						    sizeof(struct
+							   memif_msg_queue_elt), 0);
+	if (e == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control message.");
+		return NULL;
+	}
+
+	e->fd = -1;
+	TAILQ_INSERT_TAIL(&cc->msg_queue, e, next);
+
+	return e;
+}
+
+void
+memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			 int err_code)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	if (e == NULL) {
+		MIF_LOG(WARNING, "Failed to enqueue disconnect message.");
+		return;
+	}
+
+	struct pmd_internals *pmd = cc->dev->data->dev_private;
+	memif_msg_disconnect_t *d = &e->msg.disconnect;
+
+	e->msg.type = MEMIF_MSG_TYPE_DISCONNECT;
+	d->code = err_code;
+
+	if (reason != NULL) {
+		strlcpy((char *)d->string, reason, sizeof(d->string));
+		if (cc->dev != NULL) {
+			strlcpy(pmd->local_disc_string, reason,
+				sizeof(pmd->local_disc_string));
+		}
+	}
+}
+
+static int
+memif_msg_enq_hello(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	if (e == NULL)
+		return -1;
+
+	memif_msg_hello_t *h = &e->msg.hello;
+
+	e->msg.type = MEMIF_MSG_TYPE_HELLO;
+	h->min_version = MEMIF_VERSION;
+	h->max_version = MEMIF_VERSION;
+	h->max_s2m_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_m2s_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_region = ETH_MEMIF_MAX_REGION_IDX;
+	h->max_log2_ring_size = ETH_MEMIF_MAX_LOG2_RING_SIZE;
+
+	strlcpy((char *)h->name, rte_version(), sizeof(h->name));
+
+	return 0;
+}
+
+static int
+memif_msg_receive_hello(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_hello_t *h = &msg->hello;
+
+	if (h->min_version > MEMIF_VERSION || h->max_version < MEMIF_VERSION) {
+		memif_msg_enq_disconnect(pmd->cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	/* Set parameters for active connection */
+	pmd->run.num_s2m_rings = RTE_MIN(h->max_s2m_ring + 1,
+					   pmd->cfg.num_s2m_rings);
+	pmd->run.num_m2s_rings = RTE_MIN(h->max_m2s_ring + 1,
+					   pmd->cfg.num_m2s_rings);
+	pmd->run.log2_ring_size = RTE_MIN(h->max_log2_ring_size,
+					    pmd->cfg.log2_ring_size);
+	pmd->run.buffer_size = pmd->cfg.buffer_size;
+
+	strlcpy(pmd->remote_name, (char *)h->name, sizeof(pmd->remote_name));
+
+	MIF_LOG(DEBUG, "%s: Connecting to %s.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_init(struct memif_control_channel *cc, memif_msg_t *msg)
+{
+	memif_msg_init_t *i = &msg->init;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *pmd;
+	struct rte_eth_dev *dev;
+
+	if (i->version != MEMIF_VERSION) {
+		memif_msg_enq_disconnect(cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	if (cc->socket == NULL) {
+		memif_msg_enq_disconnect(cc, "Device error", 0);
+		return -1;
+	}
+
+	/* Find device with requested ID */
+	TAILQ_FOREACH(elt, &cc->socket->dev_queue, next) {
+		dev = elt->dev;
+		pmd = dev->data->dev_private;
+		if (((pmd->flags & ETH_MEMIF_FLAG_DISABLED) == 0) &&
+		    pmd->id == i->id) {
+			/* assign control channel to device */
+			cc->dev = dev;
+			pmd->cc = cc;
+
+			if (i->mode != MEMIF_INTERFACE_MODE_ETHERNET) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Only ethernet mode supported",
+							 0);
+				return -1;
+			}
+
+			if (pmd->flags & (ETH_MEMIF_FLAG_CONNECTING |
+					   ETH_MEMIF_FLAG_CONNECTED)) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Already connected", 0);
+				return -1;
+			}
+			strlcpy(pmd->remote_name, (char *)i->name,
+				sizeof(pmd->remote_name));
+
+			if (*pmd->secret != '\0') {
+				if (*i->secret == '\0') {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Secret required", 0);
+					return -1;
+				}
+				if (strcmp(pmd->secret, (char *)i->secret) != 0) {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Incorrect secret", 0);
+					return -1;
+				}
+			}
+
+			pmd->flags |= ETH_MEMIF_FLAG_CONNECTING;
+			return 0;
+		}
+	}
+
+	/* ID not found on this socket */
+	MIF_LOG(DEBUG, "ID %u not found.", i->id);
+	memif_msg_enq_disconnect(cc, "ID not found", 0);
+	return -1;
+}
+
+static int
+memif_msg_receive_add_region(struct rte_eth_dev *dev, memif_msg_t *msg,
+			     int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_region_t *ar = &msg->add_region;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing region fd", 0);
+		return -1;
+	}
+
+	struct memif_region *mr;
+
+	if (ar->index > ETH_MEMIF_MAX_REGION_IDX) {
+		memif_msg_enq_disconnect(pmd->cc, "Invalid region index", 0);
+		return -1;
+	}
+
+	mr = rte_realloc(pmd->regions, sizeof(struct memif_region) *
+			 (ar->index + 1), 0);
+	if (mr == NULL) {
+		memif_msg_enq_disconnect(pmd->cc, "Device error", 0);
+		return -1;
+	}
+
+	pmd->regions = mr;
+	pmd->regions[ar->index].fd = fd;
+	pmd->regions[ar->index].region_size = ar->size;
+	pmd->regions[ar->index].addr = NULL;
+	pmd->regions_num++;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_add_ring(struct rte_eth_dev *dev, memif_msg_t *msg, int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_ring_t *ar = &msg->add_ring;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing interrupt fd", 0);
+		return -1;
+	}
+
+	struct memif_queue *mq;
+
+	/* check if we have enough queues */
+	if (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) {
+		if (ar->index >= pmd->cfg.num_s2m_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_s2m_rings++;
+	} else {
+		if (ar->index >= pmd->cfg.num_m2s_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_m2s_rings++;
+	}
+
+	mq = (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) ?
+	    dev->data->rx_queues[ar->index] : dev->data->tx_queues[ar->index];
+
+	mq->intr_handle.fd = fd;
+	mq->log2_ring_size = ar->log2_ring_size;
+	mq->region = ar->region;
+	mq->offset = ar->offset;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connect_t *c = &msg->connect;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connected(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connected_t *c = &msg->connected;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_disconnect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_disconnect_t *d = &msg->disconnect;
+
+	memset(pmd->remote_disc_string, 0, sizeof(pmd->remote_disc_string));
+	strlcpy(pmd->remote_disc_string, (char *)d->string,
+		sizeof(pmd->remote_disc_string));
+
+	MIF_LOG(INFO, "%s: Disconnect received: %s",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_disc_string);
+
+	memset(pmd->local_disc_string, 0, 96);
+	memif_disconnect(rte_eth_dev_allocated
+			 (rte_vdev_device_name(pmd->vdev)));
+	return 0;
+}
+
+static int
+memif_msg_enq_ack(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	e->msg.type = MEMIF_MSG_TYPE_ACK;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_init(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	memif_msg_init_t *i = &e->msg.init;
+
+	e->msg.type = MEMIF_MSG_TYPE_INIT;
+	i->version = MEMIF_VERSION;
+	i->id = pmd->id;
+	i->mode = MEMIF_INTERFACE_MODE_ETHERNET;
+
+	strlcpy((char *)i->name, rte_version(), sizeof(i->name));
+
+	if (*pmd->secret != '\0')
+		strlcpy((char *)i->secret, pmd->secret, sizeof(i->secret));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_region(struct rte_eth_dev *dev, uint8_t idx)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	memif_msg_add_region_t *ar = &e->msg.add_region;
+	struct memif_region *mr = &pmd->regions[idx];
+
+	e->msg.type = MEMIF_MSG_TYPE_ADD_REGION;
+	e->fd = mr->fd;
+	ar->index = idx;
+	ar->size = mr->region_size;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_ring(struct rte_eth_dev *dev, uint8_t idx,
+		       memif_ring_type_t type)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	memif_msg_add_ring_t *ar = &e->msg.add_ring;
+	struct memif_queue *mq;
+
+	mq = (type == MEMIF_RING_S2M) ? dev->data->tx_queues[idx] :
+	    dev->data->rx_queues[idx];
+
+	e->msg.type = MEMIF_MSG_TYPE_ADD_RING;
+	e->fd = mq->intr_handle.fd;
+	ar->index = idx;
+	ar->offset = mq->offset;
+	ar->region = mq->region;
+	ar->log2_ring_size = mq->log2_ring_size;
+	ar->flags = (type == MEMIF_RING_S2M) ? MEMIF_MSG_ADD_RING_FLAG_S2M : 0;
+	ar->private_hdr_size = 0;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	memif_msg_connect_t *c = &e->msg.connect;
+	const char *name = rte_vdev_device_name(pmd->vdev);
+
+	e->msg.type = MEMIF_MSG_TYPE_CONNECT;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connected(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	memif_msg_connected_t *c = &e->msg.connected;
+
+	const char *name = rte_vdev_device_name(pmd->vdev);
+
+	e->msg.type = MEMIF_MSG_TYPE_CONNECTED;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static void
+memif_intr_unregister_handler(struct rte_intr_handle *intr_handle, void *arg)
+{
+	struct memif_msg_queue_elt *elt;
+	struct memif_control_channel *cc = arg;
+	/* close control channel fd */
+	close(intr_handle->fd);
+	/* clear message queue */
+	while ((elt = TAILQ_FIRST(&cc->msg_queue)) != NULL) {
+		TAILQ_REMOVE(&cc->msg_queue, elt, next);
+		free(elt);
+	}
+	/* free control channel */
+	rte_free(cc);
+}
+
+void
+memif_disconnect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *elt;
+	int i;
+	int ret;
+
+	if (pmd->cc != NULL) {
+		/* Clear control message queue (except disconnect message if any). */
+		while ((elt = TAILQ_FIRST(&pmd->cc->msg_queue)) != NULL) {
+			if (elt->msg.type != MEMIF_MSG_TYPE_DISCONNECT) {
+				TAILQ_REMOVE(&pmd->cc->msg_queue, elt, next);
+				free(elt);
+			}
+		}
+		/* send disconnect message (if there is any in queue) */
+		memif_msg_send_from_queue(pmd->cc);
+
+		/* at this point, there should be no more messages in queue */
+		if (TAILQ_FIRST(&pmd->cc->msg_queue) != NULL) {
+			MIF_LOG(WARNING,
+				"%s: Unexpected message(s) in message queue.",
+				rte_vdev_device_name(pmd->vdev));
+		}
+
+		if (pmd->cc->intr_handle.fd > 0) {
+			ret =
+			    rte_intr_callback_unregister(&pmd->cc->intr_handle,
+							 memif_intr_handler,
+							 pmd->cc);
+			/*
+			 * If callback is active (disconnecting based on
+			 * received control message).
+			 */
+			if (ret == -EAGAIN) {
+				/* *INDENT-OFF* */
+				ret = rte_intr_callback_unregister_pending(
+							&pmd->cc->intr_handle,
+							memif_intr_handler,
+							pmd->cc,
+							memif_intr_unregister_handler);
+				/* *INDENT-ON* */
+			} else if (ret > 0) {
+				close(pmd->cc->intr_handle.fd);
+				rte_free(pmd->cc);
+			}
+			if (ret <= 0)
+				MIF_LOG(WARNING,
+					"%s: Failed to unregister control channel callback.",
+					rte_vdev_device_name(pmd->vdev));
+		}
+	}
+
+	/* unconfig interrupts */
+	struct memif_queue *mq;
+	for (i = 0; i < pmd->cfg.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			rte_intr_disable(&mq->intr_handle);
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+	for (i = 0; i < pmd->cfg.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			rte_intr_disable(&mq->intr_handle);
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+
+	memif_free_regions(pmd);
+
+	dev->data->dev_link.link_status = ETH_LINK_DOWN;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTED;
+	MIF_LOG(DEBUG, "%s: Disconnected.", rte_vdev_device_name(pmd->vdev));
+}
+
+static int
+memif_msg_receive(struct memif_control_channel *cc)
+{
+	char ctl[CMSG_SPACE(sizeof(int)) +
+		 CMSG_SPACE(sizeof(struct ucred))] = { 0 };
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	memif_msg_t msg = { 0 };
+	ssize_t size;
+	int ret = 0;
+	struct ucred *cr __rte_unused;
+	cr = 0;
+	struct cmsghdr *cmsg;
+	int afd = -1;
+	int i;
+	struct pmd_internals *pmd;
+
+	iov[0].iov_base = (void *)&msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+	mh.msg_control = ctl;
+	mh.msg_controllen = sizeof(ctl);
+
+	size = recvmsg(cc->intr_handle.fd, &mh, 0);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(DEBUG, "Invalid message size.");
+		memif_msg_enq_disconnect(cc, "Invalid message size", 0);
+		return -1;
+	}
+	MIF_LOG(DEBUG, "Received msg type: %u.", msg.type);
+
+	cmsg = CMSG_FIRSTHDR(&mh);
+	while (cmsg) {
+		if (cmsg->cmsg_level == SOL_SOCKET) {
+			if (cmsg->cmsg_type == SCM_CREDENTIALS)
+				cr = (struct ucred *)CMSG_DATA(cmsg);
+			else if (cmsg->cmsg_type == SCM_RIGHTS)
+				afd = *(int *)CMSG_DATA(cmsg);
+		}
+		cmsg = CMSG_NXTHDR(&mh, cmsg);
+	}
+
+	if (cc->dev == NULL && msg.type != MEMIF_MSG_TYPE_INIT) {
+		MIF_LOG(DEBUG, "Unexpected message.");
+		memif_msg_enq_disconnect(cc, "Unexpected message", 0);
+		return -1;
+	}
+
+	/* get device from hash data */
+	switch (msg.type) {
+	case MEMIF_MSG_TYPE_ACK:
+		break;
+	case MEMIF_MSG_TYPE_HELLO:
+		ret = memif_msg_receive_hello(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_init_regions_and_queues(cc->dev);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_init(cc->dev);
+		if (ret < 0)
+			goto exit;
+		pmd = cc->dev->data->dev_private;
+		for (i = 0; i < pmd->regions_num; i++) {
+			ret = memif_msg_enq_add_region(cc->dev, i);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_S2M);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_M2S);
+			if (ret < 0)
+				goto exit;
+		}
+		ret = memif_msg_enq_connect(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_INIT:
+		/*
+		 * This cc does not have an interface asociated with it.
+		 * If suitable interface is found it will be assigned here.
+		 */
+		ret = memif_msg_receive_init(cc, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_REGION:
+		ret = memif_msg_receive_add_region(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_RING:
+		ret = memif_msg_receive_add_ring(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECT:
+		ret = memif_msg_receive_connect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_connected(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECTED:
+		ret = memif_msg_receive_connected(cc->dev, &msg);
+		break;
+	case MEMIF_MSG_TYPE_DISCONNECT:
+		ret = memif_msg_receive_disconnect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		break;
+	default:
+		memif_msg_enq_disconnect(cc, "Unknown message type", 0);
+		ret = -1;
+		goto exit;
+	}
+
+ exit:
+	return ret;
+}
+
+static void
+memif_intr_handler(void *arg)
+{
+	struct memif_control_channel *cc = arg;
+	struct rte_eth_dev *dev;
+	int ret;
+
+	ret = memif_msg_receive(cc);
+	/* if driver failed to assign device */
+	if (cc->dev == NULL) {
+		ret = rte_intr_callback_unregister_pending(&cc->intr_handle,
+							   memif_intr_handler,
+							   cc,
+							   memif_intr_unregister_handler);
+		if (ret < 0)
+			MIF_LOG(WARNING,
+				"Failed to unregister control channel callback.");
+		return;
+	}
+	/* if memif_msg_receive failed */
+	if (ret < 0)
+		goto disconnect;
+
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto disconnect;
+
+	return;
+
+ disconnect:
+	dev = rte_eth_dev_allocated(rte_vdev_device_name(
+		((struct pmd_internals *)cc->dev->data->dev_private)->vdev));
+	if (dev == NULL) {
+		MIF_LOG(WARNING, "eth dev not allocated");
+		return;
+	}
+	memif_disconnect(dev);
+}
+
+static void
+memif_listener_handler(void *arg)
+{
+	struct memif_socket *socket = arg;
+	int sockfd;
+	int addr_len;
+	struct sockaddr_un client;
+	struct memif_control_channel *cc;
+	int ret;
+
+	addr_len = sizeof(client);
+	sockfd = accept(socket->intr_handle.fd, (struct sockaddr *)&client,
+			(socklen_t *)&addr_len);
+	if (sockfd < 0) {
+		MIF_LOG(ERR,
+			"Failed to accept connection request on socket fd %d",
+			socket->intr_handle.fd);
+		return;
+	}
+
+	MIF_LOG(DEBUG, "%s: Connection request accepted.", socket->filename);
+
+	cc = rte_zmalloc("memif-cc", sizeof(struct memif_control_channel), 0);
+	if (cc == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control channel.");
+		goto error;
+	}
+
+	cc->intr_handle.fd = sockfd;
+	cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	cc->socket = socket;
+	cc->dev = NULL;
+	TAILQ_INIT(&cc->msg_queue);
+
+	ret =
+	    rte_intr_callback_register(&cc->intr_handle, memif_intr_handler,
+				       cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to register control channel callback.");
+		goto error;
+	}
+
+	ret = memif_msg_enq_hello(cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to enqueue hello message.");
+		goto error;
+	}
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto error;
+
+	return;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (cc != NULL) {
+		rte_free(cc);
+		cc = NULL;
+	}
+}
+
+static struct memif_socket *
+memif_socket_create(struct pmd_internals *pmd, char *key, uint8_t listener)
+{
+	struct memif_socket *sock;
+	struct sockaddr_un un;
+	int sockfd;
+	int ret;
+	int on = 1;
+
+	sock = rte_zmalloc("memif-socket", sizeof(struct memif_socket), 0);
+	if (sock == NULL) {
+		MIF_LOG(ERR, "Failed to allocate memory for memif socket");
+		return NULL;
+	}
+
+	sock->listener = listener;
+	rte_memcpy(sock->filename, key, 256);
+	TAILQ_INIT(&sock->dev_queue);
+
+	if (listener != 0) {
+		sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+		if (sockfd < 0)
+			goto error;
+
+		un.sun_family = AF_UNIX;
+		memcpy(un.sun_path, sock->filename,
+			sizeof(un.sun_path) - 1);
+
+		ret = setsockopt(sockfd, SOL_SOCKET, SO_PASSCRED, &on,
+				 sizeof(on));
+		if (ret < 0)
+			goto error;
+		ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
+		if (ret < 0)
+			goto error;
+		ret = listen(sockfd, 1);
+		if (ret < 0)
+			goto error;
+
+		MIF_LOG(DEBUG, "%s: Memif listener socket %s created.",
+			rte_vdev_device_name(pmd->vdev), sock->filename);
+
+		sock->intr_handle.fd = sockfd;
+		sock->intr_handle.type = RTE_INTR_HANDLE_EXT;
+		ret = rte_intr_callback_register(&sock->intr_handle,
+						 memif_listener_handler, sock);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to register interrupt "
+				"callback for listener socket",
+				rte_vdev_device_name(pmd->vdev));
+			return NULL;
+		}
+	}
+
+	return sock;
+
+ error:
+	MIF_LOG(ERR, "%s: Failed to setup socket %s: %s",
+		rte_vdev_device_name(pmd->vdev), key, strerror(errno));
+	if (sock != NULL)
+		rte_free(sock);
+	return NULL;
+}
+
+static struct rte_hash *
+memif_create_socket_hash(void)
+{
+	struct rte_hash_parameters params = { 0 };
+	params.name = MEMIF_SOCKET_HASH_NAME;
+	params.entries = 256;
+	params.key_len = 256;
+	params.hash_func = rte_jhash;
+	params.hash_func_init_val = 0;
+	return rte_hash_create(&params);
+}
+
+int
+memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *tmp_pmd;
+	int ret;
+	char key[256];
+
+	struct rte_hash *hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL) {
+		hash = memif_create_socket_hash();
+		if (hash == NULL) {
+			MIF_LOG(ERR, "Failed to create memif socket hash.");
+			return -1;
+		}
+	}
+
+	memset(key, 0, 256);
+	rte_memcpy(key, socket_filename, strlen(socket_filename));
+	ret = rte_hash_lookup_data(hash, key, (void **)&socket);
+	if (ret < 0) {
+		socket = memif_socket_create(pmd, key,
+					     (pmd->role ==
+					      MEMIF_ROLE_SLAVE) ? 0 : 1);
+		if (socket == NULL)
+			return -1;
+		ret = rte_hash_add_key_data(hash, key, socket);
+		if (ret < 0) {
+			MIF_LOG(ERR, "Failed to add socket to socket hash.");
+			return ret;
+		}
+	}
+	pmd->socket_filename = socket->filename;
+
+	if (socket->listener != 0 && pmd->role == MEMIF_ROLE_SLAVE) {
+		MIF_LOG(ERR, "Socket is a listener.");
+		return -1;
+	} else if ((socket->listener == 0) && (pmd->role == MEMIF_ROLE_MASTER)) {
+		MIF_LOG(ERR, "Socket is not a listener.");
+		return -1;
+	}
+
+	TAILQ_FOREACH(elt, &socket->dev_queue, next) {
+		tmp_pmd = elt->dev->data->dev_private;
+		if (tmp_pmd->id == pmd->id) {
+			MIF_LOG(ERR, "Memif device with id %d already "
+				"exists on socket %s",
+				pmd->id, socket->filename);
+			return -1;
+		}
+	}
+
+	elt =
+	    rte_malloc("pmd-queue", sizeof(struct memif_socket_dev_list_elt),
+		       0);
+	if (elt == NULL) {
+		MIF_LOG(ERR, "%s: Failed to add device to socket device list.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+	elt->dev = dev;
+	TAILQ_INSERT_TAIL(&socket->dev_queue, elt, next);
+
+	return 0;
+}
+
+void
+memif_socket_remove_device(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt, *next;
+
+	struct rte_hash *hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL)
+		return;
+
+	if (rte_hash_lookup_data(hash, pmd->socket_filename, (void **)&socket) <
+	    0)
+		return;
+
+	for (elt = TAILQ_FIRST(&socket->dev_queue); elt != NULL; elt = next) {
+		next = TAILQ_NEXT(elt, next);
+		if (elt->dev == dev) {
+			TAILQ_REMOVE(&socket->dev_queue, elt, next);
+			free(elt);
+			pmd->socket_filename = NULL;
+		}
+	}
+
+	/* remove socket, if this was the last device using it */
+	if (TAILQ_EMPTY(&socket->dev_queue)) {
+		rte_hash_del_key(hash, socket->filename);
+		if (socket->listener) {
+			/* remove listener socket file,
+			 * so we can create new one later.
+			 */
+			remove(socket->filename);
+		}
+		rte_free(socket);
+	}
+}
+
+int
+memif_connect_master(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+	return 0;
+}
+
+int
+memif_connect_slave(struct rte_eth_dev *dev)
+{
+	int sockfd;
+	int ret;
+	struct sockaddr_un sun;
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+
+	sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+	if (sockfd < 0) {
+		MIF_LOG(ERR, "%s: Failed to open socket.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	sun.sun_family = AF_UNIX;
+
+	memcpy(sun.sun_path, pmd->socket_filename, sizeof(sun.sun_path) - 1);
+
+	ret = connect(sockfd, (struct sockaddr *)&sun,
+		      sizeof(struct sockaddr_un));
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to connect socket: %s.",
+			rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+		goto error;
+	}
+
+	MIF_LOG(DEBUG, "%s: Memif socket: %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+
+	pmd->cc = rte_zmalloc("memif-cc",
+			      sizeof(struct memif_control_channel), 0);
+	if (pmd->cc == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate control channel.",
+			rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	pmd->cc->intr_handle.fd = sockfd;
+	pmd->cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	pmd->cc->socket = NULL;
+	pmd->cc->dev = dev;
+	TAILQ_INIT(&pmd->cc->msg_queue);
+
+	ret = rte_intr_callback_register(&pmd->cc->intr_handle,
+					 memif_intr_handler, pmd->cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to register interrupt callback "
+			"for control fd", rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	return 0;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (pmd->cc != NULL) {
+		rte_free(pmd->cc);
+		pmd->cc = NULL;
+	}
+	return -1;
+}
diff --git a/drivers/net/memif/memif_socket.h b/drivers/net/memif/memif_socket.h
new file mode 100644
index 000000000..8caea270b
--- /dev/null
+++ b/drivers/net/memif/memif_socket.h
@@ -0,0 +1,104 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_SOCKET_H_
+#define _MEMIF_SOCKET_H_
+
+#include <sys/queue.h>
+
+/**
+ * Remove device from socket device list. If no device is left on the socket,
+ * remove the socket as well.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_socket_remove_device(struct rte_eth_dev *dev);
+
+/**
+ * Enqueue disconnect message to control channel message queue.
+ *
+ * @param cc
+ *   control channel
+ * @param reason
+ *   const string stating disconnect reason (96 characters)
+ * @param err_code
+ *   error code
+ */
+void memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			      int err_code);
+
+/**
+ * Initialize memif socket for specified device. If socket doesn't exist, create socket.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @param socket_filename
+ *   socket filename
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename);
+
+/**
+ * Disconnect memif device. Close control channel and shared memory.
+ *
+ * @param dev
+ *   ethernet device
+ */
+void memif_disconnect(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, enable connection establishment.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_master(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, send connection request.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_slave(struct rte_eth_dev *dev);
+
+struct memif_socket_dev_list_elt {
+	TAILQ_ENTRY(memif_socket_dev_list_elt) next;
+	struct rte_eth_dev *dev;		/**< pointer to device internals */
+};
+
+#define MEMIF_SOCKET_HASH_NAME			"memif-sh"
+struct memif_socket {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	uint8_t listener;			/**< if not zero socket is listener */
+	char filename[256];			/**< socket filename */
+
+	TAILQ_HEAD(, memif_socket_dev_list_elt) dev_queue;
+	/**< Queue of devices using this socket */
+};
+
+/* Control message queue. */
+struct memif_msg_queue_elt {
+	TAILQ_ENTRY(memif_msg_queue_elt) next;
+	memif_msg_t msg;			/**< control message */
+	int fd;					/**< fd to be sent to peer */
+};
+
+struct memif_control_channel {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	TAILQ_HEAD(, memif_msg_queue_elt) msg_queue; /**< control message queue */
+	struct memif_socket *socket;		/**< pointer to socket */
+	struct rte_eth_dev *dev;		/**< pointer to device */
+};
+
+#endif				/* MEMIF_SOCKET_H */
diff --git a/drivers/net/memif/meson.build b/drivers/net/memif/meson.build
new file mode 100644
index 000000000..4dfe37d3a
--- /dev/null
+++ b/drivers/net/memif/meson.build
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+if host_machine.system() != 'linux'
+        build = false
+endif
+
+sources = files('rte_eth_memif.c',
+		'memif_socket.c')
+
+allow_experimental_apis = true
+
+deps += ['hash']
diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
new file mode 100644
index 000000000..be0281a2e
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.c
@@ -0,0 +1,1124 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <linux/if_ether.h>
+#include <errno.h>
+#include <sys/eventfd.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+#define ETH_MEMIF_ID_ARG		"id"
+#define ETH_MEMIF_ROLE_ARG		"role"
+#define ETH_MEMIF_BUFFER_SIZE_ARG	"bsize"
+#define ETH_MEMIF_RING_SIZE_ARG		"rsize"
+#define ETH_MEMIF_SOCKET_ARG		"socket"
+#define ETH_MEMIF_MAC_ARG		"mac"
+#define ETH_MEMIF_ZC_ARG		"zero-copy"
+#define ETH_MEMIF_SECRET_ARG		"secret"
+
+static const char *valid_arguments[] = {
+	ETH_MEMIF_ID_ARG,
+	ETH_MEMIF_ROLE_ARG,
+	ETH_MEMIF_BUFFER_SIZE_ARG,
+	ETH_MEMIF_RING_SIZE_ARG,
+	ETH_MEMIF_SOCKET_ARG,
+	ETH_MEMIF_MAC_ARG,
+	ETH_MEMIF_ZC_ARG,
+	ETH_MEMIF_SECRET_ARG,
+	NULL
+};
+
+static struct rte_vdev_driver pmd_memif_drv;
+
+const char *
+memif_version(void)
+{
+	return ("memif-" RTE_STR(MEMIF_VERSION_MAJOR) "." RTE_STR(MEMIF_VERSION_MINOR));
+}
+
+static void
+memif_dev_info(struct rte_eth_dev *dev __rte_unused, struct rte_eth_dev_info *dev_info)
+{
+	dev_info->max_mac_addrs = 1;
+	dev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;
+	dev_info->max_rx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->max_tx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->min_rx_bufsize = 0;
+}
+
+static memif_ring_t *
+memif_get_ring(struct pmd_internals *pmd, memif_ring_type_t type, uint16_t ring_num)
+{
+	/* rings only in region 0 */
+	void *p = pmd->regions[0].addr;
+	int ring_size = sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size);
+	p = (uint8_t *)p + (ring_num + type * pmd->run.num_s2m_rings) * ring_size;
+
+	return (memif_ring_t *)p;
+}
+
+static void *
+memif_get_buffer(struct pmd_internals *pmd, memif_desc_t *d)
+{
+	return ((uint8_t *)pmd->regions[d->region].addr + d->offset);
+}
+
+static uint16_t
+eth_memif_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	memif_ring_t *ring = mq->ring;
+	if (unlikely(ring == NULL))
+		return 0;
+	uint16_t cur_slot, last_slot, n_slots, ring_size, mask, s0;
+	uint16_t n_rx_pkts = 0;
+	uint16_t mbuf_size = rte_pktmbuf_data_room_size(mq->mempool) -
+	    RTE_PKTMBUF_HEADROOM;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head = NULL;
+
+	/* consume interrupt */
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
+		uint64_t b;
+		ssize_t size __rte_unused;
+		size = read(mq->intr_handle.fd, &b, sizeof(b));
+	}
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	cur_slot = (type == MEMIF_RING_S2M) ? mq->last_head : mq->last_tail;
+	last_slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+	if (cur_slot == last_slot)
+		goto refill;
+	n_slots = last_slot - cur_slot;
+
+	while (n_slots && n_rx_pkts < nb_pkts) {
+		mbuf_head = rte_pktmbuf_alloc(mq->mempool);
+		if (unlikely(mbuf_head == NULL))
+			goto no_free_bufs;
+		mbuf = mbuf_head;
+		mbuf->port = mq->in_port;
+
+ next_slot:
+		s0 = cur_slot & mask;
+		d0 = &ring->desc[s0];
+
+		src_len = d0->length;
+		dst_off = 0;
+		src_off = 0;
+
+		do {
+			dst_len = mbuf_size - dst_off;
+			if (dst_len == 0) {
+				dst_off = 0;
+				dst_len = mbuf_size + RTE_PKTMBUF_HEADROOM;
+
+				mbuf = rte_pktmbuf_alloc(mq->mempool);
+				if (unlikely(mbuf == NULL))
+					goto no_free_bufs;
+				mbuf->port = mq->in_port;
+				rte_pktmbuf_chain(mbuf_head, mbuf);
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			rte_pktmbuf_pkt_len(mbuf) =
+			    rte_pktmbuf_data_len(mbuf) += cp_len;
+
+			memcpy(rte_pktmbuf_mtod_offset(mbuf, void *, dst_off),
+			       (uint8_t *)memif_get_buffer(pmd, d0) + src_off, cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+		} while (src_len);
+
+		cur_slot++;
+		n_slots--;
+		if (d0->flags & MEMIF_DESC_FLAG_NEXT)
+			goto next_slot;
+
+		*bufs++ = mbuf_head;
+		n_rx_pkts++;
+	}
+
+ no_free_bufs:
+	if (type == MEMIF_RING_S2M) {
+		rte_mb();
+		ring->tail = cur_slot;
+		mq->last_head = cur_slot;
+	} else {
+		mq->last_tail = cur_slot;
+	}
+
+ refill:
+	if (type == MEMIF_RING_M2S) {
+		uint16_t head = ring->head;
+		n_slots = ring_size - head + mq->last_tail;
+
+		while (n_slots--) {
+			s0 = head++ & mask;
+			d0 = &ring->desc[s0];
+			d0->length = pmd->run.buffer_size;
+		}
+		rte_mb();
+		ring->head = head;
+	}
+
+	mq->n_pkts += n_rx_pkts;
+	return n_rx_pkts;
+}
+
+static uint16_t
+eth_memif_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	memif_ring_t *ring = mq->ring;
+	if (unlikely(ring == NULL))
+		return 0;
+	uint16_t slot, saved_slot, n_free, ring_size, mask, n_tx_pkts = 0;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head;
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	n_free = ring->tail - mq->last_tail;
+	mq->last_tail += n_free;
+	slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+
+	if (type == MEMIF_RING_S2M)
+		n_free = ring_size - ring->head + mq->last_tail;
+	else
+		n_free = ring->head - ring->tail;
+
+	while (n_free && n_tx_pkts < nb_pkts) {
+		mbuf_head = *bufs++;
+		mbuf = mbuf_head;
+
+		saved_slot = slot;
+		d0 = &ring->desc[slot & mask];
+		dst_off = 0;
+		dst_len =
+		    (type ==
+		     MEMIF_RING_S2M) ? pmd->run.buffer_size : d0->length;
+
+ next_in_chain:
+		src_off = 0;
+		src_len = rte_pktmbuf_data_len(mbuf);
+
+		while (src_len) {
+			if (dst_len == 0) {
+				if (n_free) {
+					slot++;
+					n_free--;
+					d0->flags |= MEMIF_DESC_FLAG_NEXT;
+					d0 = &ring->desc[slot & mask];
+					dst_off = 0;
+					dst_len = (type == MEMIF_RING_S2M) ?
+					    pmd->run.buffer_size : d0->length;
+					d0->flags = 0;
+				} else {
+					slot = saved_slot;
+					goto no_free_slots;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			memcpy((uint8_t *)memif_get_buffer(pmd, d0) + dst_off,
+			       rte_pktmbuf_mtod_offset(mbuf, void *, src_off),
+			       cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+			dst_len -= cp_len;
+
+			d0->length = dst_off;
+		}
+
+		if (rte_pktmbuf_is_contiguous(mbuf) == 0) {
+			mbuf = mbuf->next;
+			goto next_in_chain;
+		}
+
+		n_tx_pkts++;
+		slot++;
+		n_free--;
+		rte_pktmbuf_free(mbuf_head);
+	}
+
+ no_free_slots:
+	rte_mb();
+	if (type == MEMIF_RING_S2M)
+		ring->head = slot;
+	else
+		ring->tail = slot;
+
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
+		uint64_t a = 1;
+		ssize_t size = write(mq->intr_handle.fd, &a, sizeof(a));
+		if (unlikely(size < 0)) {
+			MIF_LOG(WARNING,
+				"%s: Failed to send interrupt. %s",
+				rte_vdev_device_name(pmd->vdev), strerror(errno));
+		}
+	}
+
+	mq->n_err += nb_pkts - n_tx_pkts;
+	mq->n_pkts += n_tx_pkts;
+	return n_tx_pkts;
+}
+
+void
+memif_free_regions(struct pmd_internals *pmd)
+{
+	int i;
+	struct memif_region *r;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		r = pmd->regions + i;
+		if (r == NULL)
+			return;
+		if (r->addr == NULL)
+			return;
+		munmap(r->addr, r->region_size);
+		if (r->fd > 0) {
+			close(r->fd);
+			r->fd = -1;
+		}
+	}
+	rte_free(pmd->regions);
+}
+
+static int
+memif_alloc_regions(struct pmd_internals *pmd, uint8_t brn)
+{
+	struct memif_region *r;
+	char shm_name[32];
+	int i;
+	int ret = 0;
+
+	r = rte_zmalloc("memif_region", sizeof(struct memif_region) * (brn + 1), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate regions.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	pmd->regions = r;
+	pmd->regions_num = brn + 1;
+
+	/*
+	 * Create shm for every region. Region 0 is reserved for descriptors.
+	 * Other regions contain buffers.
+	 */
+	for (i = 0; i < (brn + 1); i++) {
+		r = &pmd->regions[i];
+
+		r->buffer_offset = (i == 0) ? (pmd->run.num_s2m_rings +
+					       pmd->run.num_m2s_rings) *
+		    (sizeof(memif_ring_t) +
+		     sizeof(memif_desc_t) * (1 << pmd->run.log2_ring_size)) : 0;
+		r->region_size = (i == 0) ? r->buffer_offset :
+		    (uint32_t)(pmd->run.buffer_size *
+				(1 << pmd->run.log2_ring_size) *
+				(pmd->run.num_s2m_rings +
+				 pmd->run.num_m2s_rings));
+
+		memset(shm_name, 0, sizeof(char) * 32);
+		sprintf(shm_name, "memif region %d", i);
+
+		r->fd = memfd_create(shm_name, MFD_ALLOW_SEALING);
+		if (r->fd < 0) {
+			MIF_LOG(ERR, "%s: Failed to create shm file: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+
+		ret = fcntl(r->fd, F_ADD_SEALS, F_SEAL_SHRINK);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to add seals to shm file: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+
+		ret = ftruncate(r->fd, r->region_size);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to truncate shm file: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+
+		r->addr = mmap(NULL, r->region_size, PROT_READ |
+			       PROT_WRITE, MAP_SHARED, r->fd, 0);
+		if (r->addr == NULL) {
+			MIF_LOG(ERR, "%s: Failed to mmap shm region: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static void
+memif_init_rings(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_ring_t *ring;
+	int i, j;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			uint16_t slot = i * (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 1;
+			ring->desc[j].offset = pmd->regions[1].buffer_offset +
+			    (uint32_t)(slot * pmd->run.buffer_size);
+			ring->desc[j].length = pmd->run.buffer_size;
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			uint16_t slot = (i + pmd->run.num_s2m_rings) *
+			    (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 1;
+			ring->desc[j].offset = pmd->regions[1].buffer_offset +
+			    (uint32_t)(slot * pmd->run.buffer_size);
+			ring->desc[j].length = pmd->run.buffer_size;
+		}
+	}
+}
+
+/* called only by slave */
+static void
+memif_init_queues(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = dev->data->tx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0].addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for tx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = dev->data->rx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0].addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for rx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+}
+
+int
+memif_init_regions_and_queues(struct rte_eth_dev *dev)
+{
+	int ret;
+
+	ret = memif_alloc_regions(dev->data->dev_private, /* num of buffer regions */ 1);
+	if (ret < 0)
+		return ret;
+
+	memif_init_rings(dev);
+
+	memif_init_queues(dev);
+
+	return 0;
+}
+
+int
+memif_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_region *mr;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		mr = pmd->regions + i;
+		if (mr != NULL) {
+			if (mr->addr == NULL) {
+				if (mr->fd < 0)
+					return -1;
+				mr->addr = mmap(NULL, mr->region_size,
+						PROT_READ | PROT_WRITE,
+						MAP_SHARED, mr->fd, 0);
+				if (mr->addr == NULL)
+					return -1;
+			}
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region].addr +
+			    mq->offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_MASTER)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region].addr +
+			    mq->offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_SLAVE)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags |= ETH_MEMIF_FLAG_CONNECTED;
+	dev->data->dev_link.link_status = ETH_LINK_UP;
+	MIF_LOG(INFO, "%s: Connected.", rte_vdev_device_name(pmd->vdev));
+	return 0;
+}
+
+static int
+memif_dev_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int ret = 0;
+
+	switch (pmd->role) {
+	case MEMIF_ROLE_SLAVE:
+		ret = memif_connect_slave(dev);
+		break;
+	case MEMIF_ROLE_MASTER:
+		ret = memif_connect_master(dev);
+		break;
+	default:
+		MIF_LOG(ERR, "%s: Unknown role: %d.",
+			rte_vdev_device_name(pmd->vdev), pmd->role);
+		ret = -1;
+		break;
+	}
+
+	return ret;
+}
+
+static int
+memif_dev_configure(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	/*
+	 * SLAVE - TXQ
+	 * MASTER - RXQ
+	 */
+	pmd->cfg.num_s2m_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_tx_queues : dev->data->nb_rx_queues;
+
+	/*
+	 * SLAVE - RXQ
+	 * MASTER - TXQ
+	 */
+	pmd->cfg.num_m2s_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_rx_queues : dev->data->nb_tx_queues;
+
+	return 0;
+}
+
+static int
+memif_tx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_tx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("tx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate tx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type =
+	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_S2M : MEMIF_RING_M2S;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->pmd = pmd;
+	dev->data->tx_queues[qid] = mq;
+
+	return 0;
+}
+
+static int
+memif_rx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_rx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_rxconf *rx_conf __rte_unused,
+		     struct rte_mempool *mb_pool)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("rx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate rx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type = (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_M2S : MEMIF_RING_S2M;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->mempool = mb_pool;
+	mq->in_port = dev->data->port_id;
+	mq->pmd = pmd;
+	dev->data->rx_queues[qid] = mq;
+
+	return 0;
+}
+
+static void
+memif_queue_release(void *queue)
+{
+	struct memif_queue *q = (struct memif_queue *)queue;
+
+	if (!q)
+		return;
+
+	rte_free(q);
+}
+
+static int
+memif_link_update(struct rte_eth_dev *dev __rte_unused,
+		  int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
+static int
+memif_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+
+	stats->ipackets = 0;
+	stats->ibytes = 0;
+	stats->opackets = 0;
+	stats->obytes = 0;
+	stats->oerrors = 0;
+
+	uint8_t tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_s2m_rings :
+	    pmd->run.num_m2s_rings;
+	uint8_t nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* RX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->rx_queues[i];
+		stats->q_ipackets[i] = mq->n_pkts;
+		stats->q_ibytes[i] = mq->n_bytes;
+		stats->ipackets += mq->n_pkts;
+		stats->ibytes += mq->n_bytes;
+	}
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_m2s_rings :
+	    pmd->run.num_s2m_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* TX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->tx_queues[i];
+		stats->q_opackets[i] = mq->n_pkts;
+		stats->q_obytes[i] = mq->n_bytes;
+		stats->q_errors[i] = mq->n_err;
+		stats->opackets += mq->n_pkts;
+		stats->obytes += mq->n_bytes;
+		stats->oerrors += mq->n_err;
+	}
+	return 0;
+}
+
+static void
+memif_stats_reset(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int i;
+	struct memif_queue *mq;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->tx_queues[i] :
+		    dev->data->rx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->rx_queues[i] :
+		    dev->data->tx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+}
+
+static int
+memif_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	MIF_LOG(WARNING, "%s: Interrupt mode not supported.",
+		rte_vdev_device_name(pmd->vdev));
+
+	return -1;
+}
+
+static int
+memif_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd __rte_unused = dev->data->dev_private;
+
+	return 0;
+}
+
+static const struct eth_dev_ops ops = {
+	.dev_start = memif_dev_start,
+	.dev_infos_get = memif_dev_info,
+	.dev_configure = memif_dev_configure,
+	.tx_queue_setup = memif_tx_queue_setup,
+	.rx_queue_setup = memif_rx_queue_setup,
+	.rx_queue_release = memif_queue_release,
+	.tx_queue_release = memif_queue_release,
+	.rx_queue_intr_enable = memif_rx_queue_intr_enable,
+	.rx_queue_intr_disable = memif_rx_queue_intr_disable,
+	.link_update = memif_link_update,
+	.stats_get = memif_stats_get,
+	.stats_reset = memif_stats_reset,
+};
+
+static int
+memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
+	     memif_interface_id_t id, uint32_t flags,
+	     const char *socket_filename,
+	     memif_log2_ring_size_t log2_ring_size,
+	     uint16_t buffer_size, const char *secret,
+	     struct ether_addr *eth_addr)
+{
+	int ret = 0;
+	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev_data *data;
+	struct pmd_internals *pmd;
+	const unsigned int numa_node = vdev->device.numa_node;
+	const char *name = rte_vdev_device_name(vdev);
+
+	if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
+		MIF_LOG(ERR, "Zero-copy not supported.");
+		return -1;
+	}
+
+	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
+	if (eth_dev == NULL) {
+		MIF_LOG(ERR, "%s: Unable to allocate device struct.", name);
+		return -1;
+	}
+
+	pmd = eth_dev->data->dev_private;
+	memset(pmd, 0, sizeof(*pmd));
+
+	pmd->vdev = vdev;
+	pmd->id = id;
+	pmd->flags = flags;
+	pmd->flags |= ETH_MEMIF_FLAG_DISABLED;
+	pmd->role = role;
+	ret = memif_socket_init(eth_dev, socket_filename);
+	if (ret < 0)
+		return ret;
+
+	memset(pmd->secret, 0, sizeof(char) * 24);
+	if (secret != NULL)
+		strlcpy(pmd->secret, secret, sizeof(pmd->secret));
+
+	pmd->cfg.log2_ring_size = log2_ring_size;
+	/* set in .dev_configure() */
+	pmd->cfg.num_s2m_rings = 0;
+	pmd->cfg.num_m2s_rings = 0;
+
+	pmd->cfg.buffer_size = buffer_size;
+
+	rte_memcpy(&pmd->eth_addr, eth_addr, sizeof(struct ether_addr));
+
+	data = eth_dev->data;
+	data->dev_private = pmd;
+	data->numa_node = numa_node;
+	data->mac_addrs = &pmd->eth_addr;
+
+	eth_dev->dev_ops = &ops;
+	eth_dev->device = &vdev->device;
+	eth_dev->rx_pkt_burst = eth_memif_rx;
+	eth_dev->tx_pkt_burst = eth_memif_tx;
+
+	rte_eth_dev_probing_finish(eth_dev);
+
+	return 0;
+}
+
+static int
+memif_set_role(const char *key __rte_unused, const char *value,
+	       void *extra_args)
+{
+	enum memif_role_t *role = (enum memif_role_t *)extra_args;
+	if (strstr(value, "master") != NULL) {
+		*role = MEMIF_ROLE_MASTER;
+	} else if (strstr(value, "slave") != NULL) {
+		*role = MEMIF_ROLE_SLAVE;
+	} else {
+		MIF_LOG(ERR, "Unknown role: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_zc(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	uint32_t *flags = (uint32_t *)extra_args;
+
+	if (strstr(value, "yes") != NULL) {
+		*flags |= ETH_MEMIF_FLAG_ZERO_COPY;
+	} else if (strstr(value, "no") != NULL) {
+		*flags &= ~ETH_MEMIF_FLAG_ZERO_COPY;
+	} else {
+		MIF_LOG(ERR, "Failed to parse zero-copy param: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_id(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	memif_interface_id_t *id = (memif_interface_id_t *)extra_args;
+	/* even if parsing fails, 0 is a valid id */
+	*id = strtoul(value, NULL, 10);
+	return 0;
+}
+
+static int
+memif_set_bs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	uint16_t *buffer_size = (uint16_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > 0xFFFF) {
+		MIF_LOG(ERR, "Invalid buffer size: %s.", value);
+		return -EINVAL;
+	}
+	*buffer_size = tmp;
+	return 0;
+}
+
+static int
+memif_set_rs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	memif_log2_ring_size_t *log2_ring_size =
+	    (memif_log2_ring_size_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > ETH_MEMIF_MAX_LOG2_RING_SIZE) {
+		MIF_LOG(ERR, "Invalid ring size: %s (max %u).",
+			value, ETH_MEMIF_MAX_LOG2_RING_SIZE);
+		return -EINVAL;
+	}
+	*log2_ring_size = tmp;
+	return 0;
+}
+
+/* check if directory exists and if we have permission to read/write */
+static int
+memif_check_socket_filename(const char *filename)
+{
+	char *dir = NULL, *tmp;
+	uint32_t idx;
+	int ret = 0;
+
+	tmp = strrchr(filename, '/');
+	if (tmp != NULL) {
+		idx = tmp - filename;
+		dir = rte_zmalloc("memif_tmp", sizeof(char) * (idx + 1), 0);
+		if (dir == NULL) {
+			MIF_LOG(ERR, "Failed to allocate memory.");
+			return -1;
+		}
+		strlcpy(dir, filename, sizeof(char) * (idx + 1));
+	}
+
+	if (dir == NULL || (faccessat(-1, dir, F_OK | R_OK |
+					W_OK, AT_EACCESS) < 0)) {
+		MIF_LOG(ERR, "Invalid directory: '%s'.", dir);
+		ret = -EINVAL;
+	}
+
+	if (dir != NULL)
+		rte_free(dir);
+
+	return ret;
+}
+
+static int
+memif_set_socket_filename(const char *key __rte_unused, const char *value,
+			  void *extra_args)
+{
+	const char **socket_filename = (const char **)extra_args;
+
+	*socket_filename = value;
+	return memif_check_socket_filename(*socket_filename);
+}
+
+static int
+memif_set_mac(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	struct ether_addr *eth_addr = (struct ether_addr *)extra_args;
+	int ret = 0;
+
+	ret = sscanf(value, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
+	       &eth_addr->addr_bytes[0], &eth_addr->addr_bytes[1],
+	       &eth_addr->addr_bytes[2], &eth_addr->addr_bytes[3],
+	       &eth_addr->addr_bytes[4], &eth_addr->addr_bytes[5]);
+	if (ret != 6) {
+		MIF_LOG(WARNING, "Failed to parse mac '%s'.", value);
+	}
+	return 0;
+}
+
+static int
+memif_set_secret(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	const char **secret = (const char **)extra_args;
+
+	*secret = value;
+	return 0;
+}
+
+static int
+rte_pmd_memif_probe(struct rte_vdev_device *vdev)
+{
+	RTE_BUILD_BUG_ON(sizeof(memif_msg_t) != 128);
+	RTE_BUILD_BUG_ON(sizeof(memif_desc_t) != 16);
+	int ret = 0;
+	struct rte_kvargs *kvlist;
+
+	const char *name = rte_vdev_device_name(vdev);
+
+	enum memif_role_t role = MEMIF_ROLE_SLAVE;
+	memif_interface_id_t id = 0;
+	uint16_t buffer_size = ETH_MEMIF_DEFAULT_BUFFER_SIZE;
+	memif_log2_ring_size_t log2_ring_size = ETH_MEMIF_DEFAULT_RING_SIZE;
+	const char *socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME;
+	uint32_t flags = 0;
+	const char *secret = NULL;
+	struct ether_addr eth_addr;
+
+	eth_random_addr(eth_addr.addr_bytes);
+
+	MIF_LOG(INFO, "Initialize MEMIF: %s.", name);
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_arguments);
+
+	/* parse parameters */
+	if (kvlist != NULL) {
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ROLE_ARG,
+					 &memif_set_role, &role);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ID_ARG,
+					 &memif_set_id, &id);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_BUFFER_SIZE_ARG,
+					 &memif_set_bs, &buffer_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_RING_SIZE_ARG,
+					 &memif_set_rs, &log2_ring_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SOCKET_ARG,
+					 &memif_set_socket_filename,
+					 (void *)(&socket_filename));
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_MAC_ARG,
+					 &memif_set_mac, &eth_addr);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ZC_ARG,
+					 &memif_set_zc, &flags);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SECRET_ARG,
+					 &memif_set_secret, (void *)(&secret));
+		if (ret < 0)
+			goto exit;
+	}
+
+	/* create interface */
+	ret = memif_create(vdev, role, id, flags, socket_filename,
+			   log2_ring_size, buffer_size, secret, &eth_addr);
+
+ exit:
+	if (kvlist != NULL)
+		rte_kvargs_free(kvlist);
+	return ret;
+}
+
+static int
+rte_pmd_memif_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_eth_dev *eth_dev;
+
+	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+	if (eth_dev == NULL)
+		return 0;
+
+	struct pmd_internals *pmd = eth_dev->data->dev_private;
+
+	memif_msg_enq_disconnect(pmd->cc, "Invalid message size", 0);
+	memif_disconnect(eth_dev);
+
+	memif_socket_remove_device(eth_dev);
+
+	pmd->vdev = NULL;
+
+	rte_free(eth_dev->data->dev_private);
+
+	rte_eth_dev_release_port(eth_dev);
+
+	return 0;
+}
+
+static struct rte_vdev_driver pmd_memif_drv = {
+	.probe = rte_pmd_memif_probe,
+	.remove = rte_pmd_memif_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_memif, pmd_memif_drv);
+
+RTE_PMD_REGISTER_PARAM_STRING(net_memif,
+			      ETH_MEMIF_ID_ARG "=<int>"
+			      ETH_MEMIF_ROLE_ARG "=master|slave"
+			      ETH_MEMIF_BUFFER_SIZE_ARG "=<int>"
+			      ETH_MEMIF_RING_SIZE_ARG "=<int>"
+			      ETH_MEMIF_SOCKET_ARG "=<string>"
+			      ETH_MEMIF_MAC_ARG "=xx:xx:xx:xx:xx:xx"
+			      ETH_MEMIF_ZC_ARG "=yes|no"
+			      ETH_MEMIF_SECRET_ARG "=<string>");
+
+int memif_logtype;
+
+RTE_INIT(memif_init_log)
+{
+	memif_logtype = rte_log_register("pmd.net.memif");
+	if (memif_logtype >= 0)
+		rte_log_set_level(memif_logtype, RTE_LOG_NOTICE);
+}
diff --git a/drivers/net/memif/rte_eth_memif.h b/drivers/net/memif/rte_eth_memif.h
new file mode 100644
index 000000000..930de38ed
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.h
@@ -0,0 +1,203 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _RTE_ETH_MEMIF_H_
+#define _RTE_ETH_MEMIF_H_
+
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE
+#endif				/* GNU_SOURCE */
+
+#include <sys/queue.h>
+
+#include <rte_ethdev_driver.h>
+#include <rte_ether.h>
+#include <rte_interrupts.h>
+
+#include "memif.h"
+
+#define ETH_MEMIF_DEFAULT_SOCKET_FILENAME	"/tmp/memif.sock"
+#define ETH_MEMIF_DEFAULT_RING_SIZE		10
+#define ETH_MEMIF_DEFAULT_BUFFER_SIZE		2048
+
+#define ETH_MEMIF_MAX_NUM_Q_PAIRS		255
+#define ETH_MEMIF_MAX_LOG2_RING_SIZE		14
+#define ETH_MEMIF_MAX_REGION_IDX		255
+
+extern int memif_logtype;
+
+#define MIF_LOG(level, fmt, args...) \
+	rte_log(RTE_LOG_ ## level, memif_logtype, \
+		"%s(): " fmt "\n", __func__, ##args)
+
+enum memif_role_t {
+	MEMIF_ROLE_MASTER,
+	MEMIF_ROLE_SLAVE,
+};
+
+struct memif_region {
+	void *addr;				/**< shared memory address */
+	memif_region_size_t region_size;	/**< shared memory size */
+	int fd;					/**< shared memory file descriptor */
+	uint32_t buffer_offset;			/**< offset at which buffers start */
+};
+
+struct memif_queue {
+	struct rte_mempool *mempool;		/**< mempool for RX packets */
+	uint16_t in_port;			/**< port id */
+
+	struct pmd_internals *pmd;		/**< device internals */
+
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+
+	/* ring info */
+	memif_ring_type_t type;			/**< ring type */
+	memif_ring_t *ring;			/**< pointer to ring */
+	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
+
+	memif_region_index_t region;		/**< shared memory region index */
+	memif_region_offset_t offset;		/**< offset at which the queue begins */
+
+	uint16_t last_head;			/**< last ring head */
+	uint16_t last_tail;			/**< last ring tail */
+
+	/* rx/tx info */
+	uint64_t n_pkts;			/**< number of rx/tx packets */
+	uint64_t n_bytes;			/**< number of rx/tx bytes */
+	uint64_t n_err;				/**< number of tx errors */
+};
+
+struct pmd_internals {
+	memif_interface_id_t id;		/**< unique id */
+	enum memif_role_t role;			/**< device role */
+	uint32_t flags;				/**< device status flags */
+#define ETH_MEMIF_FLAG_CONNECTING	(1 << 0)
+/**< device is connecting */
+#define ETH_MEMIF_FLAG_CONNECTED	(1 << 1)
+/**< device is connected */
+#define ETH_MEMIF_FLAG_ZERO_COPY	(1 << 2)
+/**< device is zero-copy enabled */
+#define ETH_MEMIF_FLAG_DISABLED		(1 << 3)
+/**< device has not been configured and can not accept connection requests */
+
+	struct ether_addr eth_addr;		/**< mac address */
+	char *socket_filename;			/**< pointer to socket filename */
+	char secret[24]; /**< secret (optional security parameter) */
+
+	struct memif_control_channel *cc;	/**< control channel */
+
+	struct memif_region *regions;		/**< shared memory regions */
+	uint8_t regions_num;			/**< number of regions */
+
+	/* remote info */
+	char remote_name[64];			/**< remote app name */
+	char remote_if_name[64];		/**< remote peer name */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t buffer_size;		/**< buffer size */
+	} cfg;					/**< Configured parameters (max values) */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t buffer_size;		/**< buffer size */
+	} run;
+	/**< Parameters used in active connection */
+
+	char local_disc_string[96];		/**< local disconnect reason */
+	char remote_disc_string[96];		/**< remote disconnect reason */
+
+	struct rte_vdev_device *vdev;		/**< vdev handle */
+};
+
+/**
+ * Unmap shared memory and free regions from memory.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_free_regions(struct pmd_internals *pmd);
+
+/**
+ * Finalize connection establishment process. Map shared memory file
+ * (master role), initialize ring queue, set link status up.
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect(struct rte_eth_dev *dev);
+
+/**
+ * Create shared memory file and initialize ring queue.
+ * Only called by slave when establishing connection
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_init_regions_and_queues(struct rte_eth_dev *dev);
+
+/**
+ * Get memif version string.
+ *
+ * @return
+ *   - memif version string
+ */
+const char *memif_version(void);
+
+#ifndef MFD_HUGETLB
+#ifndef __NR_memfd_create
+
+#if defined __x86_64__
+#define __NR_memfd_create 319
+#elif defined __x86_32__
+#define __NR_memfd_create 1073742143
+#elif defined __arm__
+#define __NR_memfd_create 385
+#elif defined __aarch64__
+#define __NR_memfd_create 279
+#elif defined __powerpc__
+#define __NR_memfd_create 360
+#elif defined __i386__
+#define __NR_memfd_create 356
+#else
+#error "__NR_memfd_create unknown for this architecture"
+#endif
+
+#endif				/* __NR_memfd_create */
+
+static inline int memfd_create(const char *name, unsigned int flags)
+{
+	return syscall(__NR_memfd_create, name, flags);
+}
+#endif				/* MFD_HUGETLB */
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef MFD_ALLOW_SEALING
+#define MFD_ALLOW_SEALING       0x0002U
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL     0x0001	/* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002	/* prevent file from shrinking */
+#define F_SEAL_GROW     0x0004	/* prevent file from growing */
+#define F_SEAL_WRITE    0x0008	/* prevent writes */
+#endif
+
+#endif				/* RTE_ETH_MEMIF_H */
diff --git a/drivers/net/memif/rte_pmd_memif_version.map b/drivers/net/memif/rte_pmd_memif_version.map
new file mode 100644
index 000000000..66748c008
--- /dev/null
+++ b/drivers/net/memif/rte_pmd_memif_version.map
@@ -0,0 +1,4 @@
+EXPERIMENTAL {
+
+        local: *;
+};
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index 980eec233..b0becbf31 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -21,6 +21,7 @@ drivers = ['af_packet',
 	'ixgbe',
 	'kni',
 	'liquidio',
+	'memif',
 	'mlx4',
 	'mlx5',
 	'mvneta',
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 5699d979d..f236c5ebc 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -168,6 +168,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KNI)        += -lrte_pmd_kni
 endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD)        += -lrte_pmd_lio
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF)      += -lrte_pmd_memif
 ifeq ($(CONFIG_RTE_LIBRTE_MLX4_DLOPEN_DEPS),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -lrte_pmd_mlx4 -ldl
 else
-- 
2.17.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v4] /net: memory interface (memif)
  2019-02-20 11:52 ` [dpdk-dev] [RFC v4] " Jakub Grajciar
@ 2019-02-20 15:46   ` Stephen Hemminger
  2019-02-20 16:17   ` Stephen Hemminger
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 58+ messages in thread
From: Stephen Hemminger @ 2019-02-20 15:46 UTC (permalink / raw)
  To: Jakub Grajciar; +Cc: dev

On Wed, 20 Feb 2019 12:52:54 +0100
Jakub Grajciar <jgrajcia@cisco.com> wrote:

> +static int
> +memif_msg_receive_add_ring(struct rte_eth_dev *dev, memif_msg_t *msg, int fd)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	memif_msg_add_ring_t *ar = &msg->add_ring;
> +
> +	if (fd < 0) {
> +		memif_msg_enq_disconnect(pmd->cc, "Missing interrupt fd", 0);
> +		return -1;
> +	}
> +
> +	struct memif_queue *mq;

Minor nit. Please avoid doing c99 style declarations and move this line up to
start of the function.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v4] /net: memory interface (memif)
  2019-02-20 11:52 ` [dpdk-dev] [RFC v4] " Jakub Grajciar
  2019-02-20 15:46   ` Stephen Hemminger
@ 2019-02-20 16:17   ` Stephen Hemminger
  2019-02-21 10:50   ` Rami Rosen
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 58+ messages in thread
From: Stephen Hemminger @ 2019-02-20 16:17 UTC (permalink / raw)
  To: Jakub Grajciar; +Cc: dev

On Wed, 20 Feb 2019 12:52:54 +0100
Jakub Grajciar <jgrajcia@cisco.com> wrote:

> +static int
> +memif_msg_enq_init(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
<insert blank line here>
> +	if (e == NULL)
> +		return -1

Checkpatch prefers a blank line after declaration and before code.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v4] /net: memory interface (memif)
  2019-02-20 11:52 ` [dpdk-dev] [RFC v4] " Jakub Grajciar
  2019-02-20 15:46   ` Stephen Hemminger
  2019-02-20 16:17   ` Stephen Hemminger
@ 2019-02-21 10:50   ` Rami Rosen
  2019-02-27 17:04   ` Ferruh Yigit
  2019-03-22 11:57   ` [dpdk-dev] [RFC v5] " Jakub Grajciar
  4 siblings, 0 replies; 58+ messages in thread
From: Rami Rosen @ 2019-02-21 10:50 UTC (permalink / raw)
  To: Jakub Grajciar; +Cc: dev

Hi,

+Shared memory packet interface (memif) PMD allows for DPDK and any other
client
+using memif (DPDK, VPP, libmemif) toc ommunicate using shared memory.
Memif is
+Linux only.s

Minor nit.
should be "to communicate"
And Also
Linux only.s => Linux only.

Reviewed-by: Rami Rosen <ramirose at gmail.com>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v4] /net: memory interface (memif)
  2019-02-20 11:52 ` [dpdk-dev] [RFC v4] " Jakub Grajciar
                     ` (2 preceding siblings ...)
  2019-02-21 10:50   ` Rami Rosen
@ 2019-02-27 17:04   ` Ferruh Yigit
  2019-03-22 11:57   ` [dpdk-dev] [RFC v5] " Jakub Grajciar
  4 siblings, 0 replies; 58+ messages in thread
From: Ferruh Yigit @ 2019-02-27 17:04 UTC (permalink / raw)
  To: Jakub Grajciar, dev

On 2/20/2019 11:52 AM, Jakub Grajciar wrote:
> Memory interface (memif), provides high performance
> packet transfer over shared memory.
> 
> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
> ---
>  MAINTAINERS                                 |    6 +
>  config/common_base                          |    5 +
>  config/common_linuxapp                      |    1 +
>  doc/guides/nics/features/memif.ini          |   14 +
>  doc/guides/nics/index.rst                   |    1 +
>  doc/guides/nics/memif.rst                   |  194 ++++
>  drivers/net/Makefile                        |    1 +
>  drivers/net/memif/Makefile                  |   28 +
>  drivers/net/memif/memif.h                   |  178 +++
>  drivers/net/memif/memif_socket.c            | 1092 ++++++++++++++++++
>  drivers/net/memif/memif_socket.h            |  104 ++
>  drivers/net/memif/meson.build               |   13 +
>  drivers/net/memif/rte_eth_memif.c           | 1124 +++++++++++++++++++
>  drivers/net/memif/rte_eth_memif.h           |  203 ++++
>  drivers/net/memif/rte_pmd_memif_version.map |    4 +
>  drivers/net/meson.build                     |    1 +
>  mk/rte.app.mk                               |    1 +

Can you please update release notes to document new PMD.

<...>

> 
> requires patch: http://patchwork.dpdk.org/patch/49009/

Thanks for highlighting this dependency, can you please elaborate the relation
with interrupt and memif PMD?

<...>

> +Example: testpmd and testpmd
> +----------------------------
> +In this example we run two instances of testpmd application and transmit packets over memif.
> +
> +First create master interface::
> +
> +    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
> +
> +Now create slave interace (master must be already running so the slave will connect)::

s/interace/interface/

> +
> +    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
> +
> +Set forwarding mode in one of the instances to 'rx only' and the other to 'tx only'::
> +
> +    testpmd> set fwd rxonly
> +    testpmd> start
> +
> +    testpmd> set fwd txonly
> +    testpmd> start

Is shared mem used as a ring, with single producer, single consumer? Is there a
way to use memif PMD for both sending and receiving packets.

It is possible to create two ring PMDs in a single testpmd application, and loop
packets between them continuously via tx_first param [1], can same be done via
memif?

[1]
./build/app/testpmd -w 0:0.0 --vdev net_ring0 --vdev net_ring1 -- -i
testpmd> start tx_first
testpmd> show port stats all
  ######################## NIC statistics for port 0  ########################
  RX-packets: 1365649088 RX-missed: 0          RX-bytes:  0
  RX-errors: 0
  RX-nombuf:  0
  TX-packets: 1365649120 TX-errors: 0          TX-bytes:  0
.....


<...>

> +
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +CFLAGS += -DALLOW_EXPERIMENTAL_API
> +LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring

Is rte_ring library used?

<...>

> @@ -0,0 +1,4 @@
> +EXPERIMENTAL {
> +
> +        local: *;
> +};

Please use the version name instead of "EXPERIMENTAL" which is for experimental
APIs.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [dpdk-dev] [RFC v5] /net: memory interface (memif)
  2019-02-20 11:52 ` [dpdk-dev] [RFC v4] " Jakub Grajciar
                     ` (3 preceding siblings ...)
  2019-02-27 17:04   ` Ferruh Yigit
@ 2019-03-22 11:57   ` Jakub Grajciar
  2019-03-22 11:57     ` Jakub Grajciar
                       ` (2 more replies)
  4 siblings, 3 replies; 58+ messages in thread
From: Jakub Grajciar @ 2019-03-22 11:57 UTC (permalink / raw)
  To: dev; +Cc: Jakub Grajciar

Memory interface (memif), provides high performance
packet transfer over shared memory.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
---
 MAINTAINERS                                 |    6 +
 config/common_base                          |    5 +
 config/common_linux                         |    1 +
 doc/guides/nics/features/memif.ini          |   14 +
 doc/guides/nics/index.rst                   |    1 +
 doc/guides/nics/memif.rst                   |  200 ++++
 doc/guides/rel_notes/release_19_05.rst      |    4 +
 drivers/net/Makefile                        |    1 +
 drivers/net/memif/Makefile                  |   28 +
 drivers/net/memif/memif.h                   |  179 +++
 drivers/net/memif/memif_socket.c            | 1104 ++++++++++++++++++
 drivers/net/memif/memif_socket.h            |  104 ++
 drivers/net/memif/meson.build               |   13 +
 drivers/net/memif/rte_eth_memif.c           | 1132 +++++++++++++++++++
 drivers/net/memif/rte_eth_memif.h           |  203 ++++
 drivers/net/memif/rte_pmd_memif_version.map |    4 +
 drivers/net/meson.build                     |    1 +
 mk/rte.app.mk                               |    1 +
 18 files changed, 3001 insertions(+)
 create mode 100644 doc/guides/nics/features/memif.ini
 create mode 100644 doc/guides/nics/memif.rst
 create mode 100644 drivers/net/memif/Makefile
 create mode 100644 drivers/net/memif/memif.h
 create mode 100644 drivers/net/memif/memif_socket.c
 create mode 100644 drivers/net/memif/memif_socket.h
 create mode 100644 drivers/net/memif/meson.build
 create mode 100644 drivers/net/memif/rte_eth_memif.c
 create mode 100644 drivers/net/memif/rte_eth_memif.h
 create mode 100644 drivers/net/memif/rte_pmd_memif_version.map

requires patch: http://patchwork.dpdk.org/patch/51511/
Memif uses a unix domain socket as control channel (cc).
rte_interrupts.h is used to handle interrupts for this cc.
This patch is required, because the message received can
be a reason for disconnecting.

v3:
- coding style fixes
- documentation
- doxygen comments
- use strlcpy() instead of strncpy()
- use RTE_BUILD_BUG_ON instead of _Static_assert()
- fix meson build deps

v4:
- coding style fixes
- doc update (desc format, messaging, features, ...)
- pointer arithmetic fix
- __rte_packed and __rte_aligned instead of __attribute__()
- fixed multi-queue
- support additional archs (memfd_create syscall)

v5:
- coding style fixes
- add release notes
- remove unused dependencies
- doc update

diff --git a/MAINTAINERS b/MAINTAINERS
index 452b8eb82..145b5282a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -790,6 +790,12 @@ F: drivers/net/softnic/
 F: doc/guides/nics/features/softnic.ini
 F: doc/guides/nics/softnic.rst
 
+Memif PMD
+M: Jakub Grajciar <jgrajcia@cisco.com>
+F: drivers/net/memif/
+F: doc/guides/nics/memif.rst
+F: doc/guides/nics/features/memif.ini
+
 
 Crypto Drivers
 --------------
diff --git a/config/common_base b/config/common_base
index 0b09a9348..eb6055ea8 100644
--- a/config/common_base
+++ b/config/common_base
@@ -416,6 +416,11 @@ CONFIG_RTE_LIBRTE_VMXNET3_DEBUG_TX_FREE=n
 #
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 
+#
+# Compile Memory Interface PMD driver (Linux only)
+#
+CONFIG_RTE_LIBRTE_PMD_MEMIF=n
+
 #
 # Compile link bonding PMD library
 #
diff --git a/config/common_linux b/config/common_linux
index 75334273d..87514fe4f 100644
--- a/config/common_linux
+++ b/config/common_linux
@@ -19,6 +19,7 @@ CONFIG_RTE_LIBRTE_VHOST_POSTCOPY=n
 CONFIG_RTE_LIBRTE_PMD_VHOST=y
 CONFIG_RTE_LIBRTE_IFC_PMD=y
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
+CONFIG_RTE_LIBRTE_PMD_MEMIF=y
 CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
 CONFIG_RTE_LIBRTE_PMD_TAP=y
 CONFIG_RTE_LIBRTE_AVP_PMD=y
diff --git a/doc/guides/nics/features/memif.ini b/doc/guides/nics/features/memif.ini
new file mode 100644
index 000000000..807d9ecdc
--- /dev/null
+++ b/doc/guides/nics/features/memif.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'memif' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Link status          = Y
+Basic stats          = Y
+Jumbo frame          = Y
+ARMv8                = Y
+Power8               = Y
+x86-32               = Y
+x86-64               = Y
+Usage doc            = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 5c80e3baa..167e0dacb 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -34,6 +34,7 @@ Network Interface Controller Drivers
     intel_vf
     kni
     liquidio
+    memif
     mlx4
     mlx5
     mvneta
diff --git a/doc/guides/nics/memif.rst b/doc/guides/nics/memif.rst
new file mode 100644
index 000000000..7cc547105
--- /dev/null
+++ b/doc/guides/nics/memif.rst
@@ -0,0 +1,200 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2018-2019 Cisco Systems, Inc.
+
+======================
+Memif Poll Mode Driver
+======================
+Shared memory packet interface (memif) PMD allows for DPDK and any other client
+using memif (DPDK, VPP, libmemif) to communicate using shared memory. Memif is
+Linux only.
+
+The created device transmits packets in a raw format. It can be used with
+Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
+supported in DPDK memif implementation.
+
+Memif works in two roles: master and slave. Slave connects to master over an
+existing socket. It is also a producer of shared memory file and initializes
+the shared memory. Master creates the socket and listens for any slave
+connection requests. The socket may already exist on the system. Be sure to
+remove any such sockets, if you are creating a master interface, or you will
+see an "Address already in use" error. Function ``rte_pmd_memif_remove()``,
+which removes memif interface, will also remove a listener socket, if it is
+not being used by any other interface.
+
+The method to enable one or more interfaces is to use the
+``--vdev=net_memif0`` option on the DPDK application command line. Each
+``--vdev=net_memif1`` option given will create an interface named net_memif0,
+net_memif1, and so on. Memif uses unix domain socket to transmit control
+messages. Each memif has a unique id per socket. If you are connecting multiple
+interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``,
+etc. Note that if you assign a socket to a master interface it becomes a
+listener socket. Listener socket can not be used by a slave interface on same
+client.
+
+.. csv-table:: **Memif configuration options**
+   :header: "Option", "Description", "Default", "Valid value"
+
+   "id=0", "Each memif on same socket must be given a unique id", "0", "uint32_t"
+   "role=master", "Set memif role", "slave", "master|slave"
+   "bsize=1024", "Size of packet buffer", "2048", "uint16_t"
+   "rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", "10", "1-14"
+   "nrxq=2", "Number of RX queues", "1", "255"
+   "ntxq=2", "Number of TX queues", "1", "255"
+   "socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 256"
+   "mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef", ""
+   "secret=abc123", "Secret is an optional security option, which if specified, must be matched by peer", "", "string len 24"
+   "zero-copy=yes", "Enable/disable zero-copy slave mode", "no", "yes|no"
+
+
+**Connection establishment**
+
+In order to create memif connection, two memif interfaces, each in separate
+process, are needed. Once interface in ``master`` role and other in
+``slave`` role. It is not possible to connect two interfaces in a single
+process.
+
+Memif driver uses unix domain socket to exchange required information between
+memif interfaces. Socket file path is specified at interface creation see
+*Memif configuration options* table above. If socket is used by ``master``
+interface, it's marked as listener socket (in scope of current process) and
+listens to connection requests from other processes. One socket can be used by
+multiple interfaces. One process can have ``slave`` and ``master`` interfaces 
+at the same time, provided each role is assigned unique socket.
+
+For detailed information on memif control messages, see: net/memif/memif.h.
+
+Slave interface attempts to make a connection on assigned socket. Process
+listening on this socket will extract the connection request and create a new
+connected socket (control channel). Then it sends the 'hello' message
+(``MEMIF_MSG_TYPE_HELLO``), containing configuration boundaries. Slave interface
+adjusts its configuration accordingly, and sends 'init' message
+(``MEMIF_MSG_TYPE_INIT``). This message among others contains interface id. Driver
+uses this id to find master interface, and assigns the control channel to this
+interface. If such interface is found, 'ack' message (``MEMIF_MSG_TYPE_ACK``) is
+sent. Slave interface sends 'add region' message (``MEMIF_MSG_TYPE_ADD_REGION``) for
+every region allocated. Master responds to each of these messages with 'ack'
+message. Same behavior applies to rings. Slave sends 'add ring' message
+(``MEMIF_MSG_TYPE_ADD_RING``) for every initialized ring. Master again responds to
+each message with 'ack' message. To finalize the connection, slave interface
+sends 'connect' message (``MEMIF_MSG_TYPE_CONNECT``). Upon receiving this message
+master maps regions to its address space, initializes rings and responds with
+'connected' message (``MEMIF_MSG_TYPE_CONNECTED``). Disconnect
+(``MEMIF_MSG_TYPE_DISCONNECT``) can be sent by both master and slave interfaces at
+any time, due to driver error or if the interface is being deleted.
+
+
+Files
+
+- net/memif/memif.h *- control messages definitions*
+- net/memif/memif_socket.h
+- net/memif/memif_socket.c
+
+
+
+**Descriptor format**
+
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Quad|6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | |1|1| | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Word|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | |6|5| | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|0   |length                                                         |region                         |flags                          |
++----+---------------------------------------------------------------+-------------------------------+-------------------------------+
+|1   |metadata                                                       |offset                                                         |
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+
+**Flags field - flags (Quad Word 0, bits 0:15)**
+
++-----+--------------------+------------------------------------------------------------------------------------------------+
+|Bits |Name                |Functionality                                                                                   |
++=====+====================+================================================================================================+
+|0    |MEMIF_DESC_FLAG_NEXT|Is chained buffer. When set, the packet is divided into multiple buffers. May not be contiguous.|
++-----+--------------------+------------------------------------------------------------------------------------------------+
+
+**Region index - region (Quad Word 0, 16:31)**
+
+Index of memory region, the buffer is located in.
+
+**Data length - length (Quad Word 0, 32:63)**
+
+Length of transmitted/received data.
+
+**Data Offset - offset (Quad Word 1, 0:31)**
+
+Data start offset from memory region address. *.regions[desc->region].addr + desc->offset*
+
+**Metadata - metadata (Quad Word 1, 32:63)**
+
+Buffer metadata.
+
+Files
+
+- net/memif/memif.h *- descriptor and ring definitions*
+- net/memif/rte_eth_memif.c *- eth_memif_rx() eth_memif_tx()*
+
+
+
+Example: testpmd and testpmd
+----------------------------
+In this example we run two instances of testpmd application and transmit packets over memif.
+
+First create ``master`` interface::
+
+    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
+
+Now create ``slave`` interface (master must be already running so the slave will connect)::
+
+    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
+
+Set forwarding mode in one of the instances to 'rx only' and the other to 'tx only'::
+
+    testpmd> set fwd rxonly
+    testpmd> start
+
+    testpmd> set fwd txonly
+    testpmd> start
+
+Show status::
+
+    testpmd> show port stats 0
+
+Example: testpmd and VPP
+------------------------
+For information on how to get and run VPP please see `<https://wiki.fd.io/view/VPP>`_.
+
+Start VPP in interactive mode (should be by default). Create memif master interface in VPP::
+
+    vpp# create interface memif id 0 master no-zero-copy
+    vpp# set interface state memif0/0 up
+    vpp# set interface ip address memif0/0 192.168.1.1/24
+
+To see socket filename use show memif command::
+
+    vpp# show memif
+    sockets
+     id  listener    filename
+      0   yes (1)     /run/vpp/memif.sock
+    ...
+
+Now create memif interface by running testpmd with these command line options::
+
+    #./testpmd --vdev=net_memif,socket=/run/vpp/memif.sock -- -i
+
+Testpmd should now create memif slave interface and try to connect to master.
+In testpmd set forward option to icmpecho and start forwarding::
+
+    testpmd> set fwd icmpecho
+    testpmd> start
+
+Send ping from VPP::
+
+    vpp# ping 192.168.1.2
+    64 bytes from 192.168.1.2: icmp_seq=2 ttl=254 time=36.2918 ms
+    64 bytes from 192.168.1.2: icmp_seq=3 ttl=254 time=23.3927 ms
+    64 bytes from 192.168.1.2: icmp_seq=4 ttl=254 time=24.2975 ms
+    64 bytes from 192.168.1.2: icmp_seq=5 ttl=254 time=17.7049 ms
diff --git a/doc/guides/rel_notes/release_19_05.rst b/doc/guides/rel_notes/release_19_05.rst
index 61a2c7383..be55faacc 100644
--- a/doc/guides/rel_notes/release_19_05.rst
+++ b/doc/guides/rel_notes/release_19_05.rst
@@ -91,6 +91,10 @@ New Features
 
   * Added promiscuous mode support.
 
+* **Added memif PMD.**
+
+  Added the new Shared Memory Packet Interface (``memif``) PMD.
+  See the :doc:`../nics/memif` guide for more details on this new driver.
 
 Removed Items
 -------------
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 502869a87..1ecbf5155 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -33,6 +33,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_IAVF_PMD) += iavf
 DIRS-$(CONFIG_RTE_LIBRTE_ICE_PMD) += ice
 DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
 DIRS-$(CONFIG_RTE_LIBRTE_LIO_PMD) += liquidio
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif
 DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
 DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
 DIRS-$(CONFIG_RTE_LIBRTE_MVNETA_PMD) += mvneta
diff --git a/drivers/net/memif/Makefile b/drivers/net/memif/Makefile
new file mode 100644
index 000000000..346cf6473
--- /dev/null
+++ b/drivers/net/memif/Makefile
@@ -0,0 +1,28 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_memif.a
+
+EXPORT_MAP := rte_pmd_memif_version.map
+
+LIBABIVER := 1
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool
+LDLIBS += -lrte_ethdev -lrte_kvargs
+LDLIBS += -lrte_hash
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += rte_eth_memif.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif_socket.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/memif/memif.h b/drivers/net/memif/memif.h
new file mode 100644
index 000000000..15a56888b
--- /dev/null
+++ b/drivers/net/memif/memif.h
@@ -0,0 +1,179 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_H_
+#define _MEMIF_H_
+
+#define MEMIF_COOKIE		0x3E31F20
+#define MEMIF_VERSION_MAJOR	2
+#define MEMIF_VERSION_MINOR	0
+#define MEMIF_VERSION		((MEMIF_VERSION_MAJOR << 8) | MEMIF_VERSION_MINOR)
+#define MEMIF_NAME_SZ		32
+
+/*
+ * S2M: direction slave -> master
+ * M2S: direction master -> slave
+ */
+
+/*
+ *  Type definitions
+ */
+
+typedef enum memif_msg_type {
+	MEMIF_MSG_TYPE_NONE,
+	MEMIF_MSG_TYPE_ACK,
+	MEMIF_MSG_TYPE_HELLO,
+	MEMIF_MSG_TYPE_INIT,
+	MEMIF_MSG_TYPE_ADD_REGION,
+	MEMIF_MSG_TYPE_ADD_RING,
+	MEMIF_MSG_TYPE_CONNECT,
+	MEMIF_MSG_TYPE_CONNECTED,
+	MEMIF_MSG_TYPE_DISCONNECT,
+} memif_msg_type_t;
+
+typedef enum {
+	MEMIF_RING_S2M, /**< buffer ring in direction slave -> master */
+	MEMIF_RING_M2S, /**< buffer ring in direction master -> slave */
+} memif_ring_type_t;
+
+typedef enum {
+	MEMIF_INTERFACE_MODE_ETHERNET,
+	MEMIF_INTERFACE_MODE_IP,
+	MEMIF_INTERFACE_MODE_PUNT_INJECT,
+} memif_interface_mode_t;
+
+typedef uint16_t memif_region_index_t;
+typedef uint32_t memif_region_offset_t;
+typedef uint64_t memif_region_size_t;
+typedef uint16_t memif_ring_index_t;
+typedef uint32_t memif_interface_id_t;
+typedef uint16_t memif_version_t;
+typedef uint8_t memif_log2_ring_size_t;
+
+/*
+ *  Socket messages
+ */
+
+ /**
+  * M2S
+  * Contains master interfaces configuration.
+  */
+typedef struct __rte_packed {
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+	memif_version_t min_version; /**< lowest supported memif version */
+	memif_version_t max_version; /**< highest supported memif version */
+	memif_region_index_t max_region; /**< maximum num of regions */
+	memif_ring_index_t max_m2s_ring; /**< maximum num of M2S ring */
+	memif_ring_index_t max_s2m_ring; /**< maximum num of S2M rings */
+	memif_log2_ring_size_t max_log2_ring_size; /**< maximum ring size (as log2) */
+} memif_msg_hello_t;
+
+/**
+ * S2M
+ * Contains information required to identify interface
+ * to which the slave wants to connect.
+ */
+typedef struct __rte_packed {
+	memif_version_t version;		/**< memif version */
+	memif_interface_id_t id;		/**< interface id */
+	memif_interface_mode_t mode:8;		/**< interface mode */
+	uint8_t secret[24];			/**< optional security parameter */
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+} memif_msg_init_t;
+
+/**
+ * S2M
+ * Request master to add new shared memory region to master interface.
+ * Shared files file descriptor is passed in cmsghdr.
+ */
+typedef struct __rte_packed {
+	memif_region_index_t index;		/**< shm regions index */
+	memif_region_size_t size;		/**< shm region size */
+} memif_msg_add_region_t;
+
+/**
+ * S2M
+ * Request master to add new ring to master interface.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_MSG_ADD_RING_FLAG_S2M 1		/**< ring is in S2M direction */
+	memif_ring_index_t index;		/**< ring index */
+	memif_region_index_t region; /**< region index on which this ring is located */
+	memif_region_offset_t offset;		/**< buffer start offset */
+	memif_log2_ring_size_t log2_ring_size;	/**< ring size (log2) */
+	uint16_t private_hdr_size;		/**< used for private metadata */
+} memif_msg_add_ring_t;
+
+/**
+ * S2M
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< slave interface name */
+} memif_msg_connect_t;
+
+/**
+ * M2S
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< master interface name */
+} memif_msg_connected_t;
+
+/**
+ * S2M & M2S
+ * Disconnect interfaces.
+ */
+typedef struct __rte_packed {
+	uint32_t code;				/**< error code */
+	uint8_t string[96];			/**< disconnect reason */
+} memif_msg_disconnect_t;
+
+typedef struct __rte_packed __rte_aligned(128)
+{
+	memif_msg_type_t type:16;
+	union {
+		memif_msg_hello_t hello;
+		memif_msg_init_t init;
+		memif_msg_add_region_t add_region;
+		memif_msg_add_ring_t add_ring;
+		memif_msg_connect_t connect;
+		memif_msg_connected_t connected;
+		memif_msg_disconnect_t disconnect;
+	};
+} memif_msg_t;
+
+/*
+ *  Ring and Descriptor Layout
+ */
+
+/**
+ * Buffer descriptor.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_DESC_FLAG_NEXT 1			/**< is chained buffer */
+	memif_region_index_t region; /**< region index on which the buffer is located */
+	uint32_t length;			/**< buffer length */
+	memif_region_offset_t offset;		/**< buffer offset */
+	uint32_t metadata;
+} memif_desc_t;
+
+#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
+	uint8_t mark[0] __rte_aligned(RTE_CACHE_LINE_SIZE)
+
+typedef struct {
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline0);
+	uint32_t cookie;			/**< MEMIF_COOKIE */
+	uint16_t flags;				/**< flags */
+#define MEMIF_RING_FLAG_MASK_INT 1		/**< disable interrupt mode */
+	volatile uint16_t head;			/**< pointer to ring buffer head */
+	 MEMIF_CACHELINE_ALIGN_MARK(cacheline1);
+	volatile uint16_t tail;			/**< pointer to ring buffer tail */
+	 MEMIF_CACHELINE_ALIGN_MARK(cacheline2);
+	memif_desc_t desc[0];			/**< buffer descriptors */
+} memif_ring_t;
+
+#endif				/* _MEMIF_H_ */
diff --git a/drivers/net/memif/memif_socket.c b/drivers/net/memif/memif_socket.c
new file mode 100644
index 000000000..ad6a5e483
--- /dev/null
+++ b/drivers/net/memif/memif_socket.c
@@ -0,0 +1,1104 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <errno.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_hash.h>
+#include <rte_jhash.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+static void memif_intr_handler(void *arg);
+
+static ssize_t
+memif_msg_send(int fd, memif_msg_t *msg, int afd)
+{
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	struct cmsghdr *cmsg;
+	char ctl[CMSG_SPACE(sizeof(int))];
+
+	iov[0].iov_base = msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+
+	if (afd > 0) {
+		memset(&ctl, 0, sizeof(ctl));
+		mh.msg_control = ctl;
+		mh.msg_controllen = sizeof(ctl);
+		cmsg = CMSG_FIRSTHDR(&mh);
+		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+		cmsg->cmsg_level = SOL_SOCKET;
+		cmsg->cmsg_type = SCM_RIGHTS;
+		rte_memcpy(CMSG_DATA(cmsg), &afd, sizeof(int));
+	}
+
+	return sendmsg(fd, &mh, 0);
+}
+
+static int
+memif_msg_send_from_queue(struct memif_control_channel *cc)
+{
+	ssize_t size;
+	int ret = 0;
+	struct memif_msg_queue_elt *e;
+
+	e = TAILQ_FIRST(&cc->msg_queue);
+	if (e == NULL)
+		return 0;
+
+	size = memif_msg_send(cc->intr_handle.fd, &e->msg, e->fd);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(ERR, "sendmsg fail: %s.", strerror(errno));
+		ret = -1;
+	} else {
+		MIF_LOG(DEBUG, "Sent msg type %u.", e->msg.type);
+	}
+	TAILQ_REMOVE(&cc->msg_queue, e, next);
+	rte_free(e);
+
+	return ret;
+}
+
+static struct memif_msg_queue_elt *
+memif_msg_enq(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = rte_zmalloc("memif_msg",
+						    sizeof(struct
+							   memif_msg_queue_elt), 0);
+
+	if (e == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control message.");
+		return NULL;
+	}
+
+	e->fd = -1;
+	TAILQ_INSERT_TAIL(&cc->msg_queue, e, next);
+
+	return e;
+}
+
+void
+memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			 int err_code)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	struct pmd_internals *pmd;
+	memif_msg_disconnect_t *d;
+
+	if (e == NULL) {
+		MIF_LOG(WARNING, "Failed to enqueue disconnect message.");
+		return;
+	}
+
+	pmd = cc->dev->data->dev_private;
+	d = &e->msg.disconnect;
+
+	e->msg.type = MEMIF_MSG_TYPE_DISCONNECT;
+	d->code = err_code;
+
+	if (reason != NULL) {
+		strlcpy((char *)d->string, reason, sizeof(d->string));
+		if (cc->dev != NULL) {
+			strlcpy(pmd->local_disc_string, reason,
+				sizeof(pmd->local_disc_string));
+		}
+	}
+}
+
+static int
+memif_msg_enq_hello(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	memif_msg_hello_t *h;
+
+	if (e == NULL)
+		return -1;
+
+	h = &e->msg.hello;
+
+	e->msg.type = MEMIF_MSG_TYPE_HELLO;
+	h->min_version = MEMIF_VERSION;
+	h->max_version = MEMIF_VERSION;
+	h->max_s2m_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_m2s_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_region = ETH_MEMIF_MAX_REGION_IDX;
+	h->max_log2_ring_size = ETH_MEMIF_MAX_LOG2_RING_SIZE;
+
+	strlcpy((char *)h->name, rte_version(), sizeof(h->name));
+
+	return 0;
+}
+
+static int
+memif_msg_receive_hello(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_hello_t *h = &msg->hello;
+
+	if (h->min_version > MEMIF_VERSION || h->max_version < MEMIF_VERSION) {
+		memif_msg_enq_disconnect(pmd->cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	/* Set parameters for active connection */
+	pmd->run.num_s2m_rings = RTE_MIN(h->max_s2m_ring + 1,
+					   pmd->cfg.num_s2m_rings);
+	pmd->run.num_m2s_rings = RTE_MIN(h->max_m2s_ring + 1,
+					   pmd->cfg.num_m2s_rings);
+	pmd->run.log2_ring_size = RTE_MIN(h->max_log2_ring_size,
+					    pmd->cfg.log2_ring_size);
+	pmd->run.buffer_size = pmd->cfg.buffer_size;
+
+	strlcpy(pmd->remote_name, (char *)h->name, sizeof(pmd->remote_name));
+
+	MIF_LOG(DEBUG, "%s: Connecting to %s.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_init(struct memif_control_channel *cc, memif_msg_t *msg)
+{
+	memif_msg_init_t *i = &msg->init;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *pmd;
+	struct rte_eth_dev *dev;
+
+	if (i->version != MEMIF_VERSION) {
+		memif_msg_enq_disconnect(cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	if (cc->socket == NULL) {
+		memif_msg_enq_disconnect(cc, "Device error", 0);
+		return -1;
+	}
+
+	/* Find device with requested ID */
+	TAILQ_FOREACH(elt, &cc->socket->dev_queue, next) {
+		dev = elt->dev;
+		pmd = dev->data->dev_private;
+		if (((pmd->flags & ETH_MEMIF_FLAG_DISABLED) == 0) &&
+		    pmd->id == i->id) {
+			/* assign control channel to device */
+			cc->dev = dev;
+			pmd->cc = cc;
+
+			if (i->mode != MEMIF_INTERFACE_MODE_ETHERNET) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Only ethernet mode supported",
+							 0);
+				return -1;
+			}
+
+			if (pmd->flags & (ETH_MEMIF_FLAG_CONNECTING |
+					   ETH_MEMIF_FLAG_CONNECTED)) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Already connected", 0);
+				return -1;
+			}
+			strlcpy(pmd->remote_name, (char *)i->name,
+				sizeof(pmd->remote_name));
+
+			if (*pmd->secret != '\0') {
+				if (*i->secret == '\0') {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Secret required", 0);
+					return -1;
+				}
+				if (strcmp(pmd->secret, (char *)i->secret) != 0) {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Incorrect secret", 0);
+					return -1;
+				}
+			}
+
+			pmd->flags |= ETH_MEMIF_FLAG_CONNECTING;
+			return 0;
+		}
+	}
+
+	/* ID not found on this socket */
+	MIF_LOG(DEBUG, "ID %u not found.", i->id);
+	memif_msg_enq_disconnect(cc, "ID not found", 0);
+	return -1;
+}
+
+static int
+memif_msg_receive_add_region(struct rte_eth_dev *dev, memif_msg_t *msg,
+			     int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_region_t *ar = &msg->add_region;
+	struct memif_region *mr;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing region fd", 0);
+		return -1;
+	}
+
+	if (ar->index > ETH_MEMIF_MAX_REGION_IDX) {
+		memif_msg_enq_disconnect(pmd->cc, "Invalid region index", 0);
+		return -1;
+	}
+
+	mr = rte_realloc(pmd->regions, sizeof(struct memif_region) *
+			 (ar->index + 1), 0);
+	if (mr == NULL) {
+		memif_msg_enq_disconnect(pmd->cc, "Device error", 0);
+		return -1;
+	}
+
+	pmd->regions = mr;
+	pmd->regions[ar->index].fd = fd;
+	pmd->regions[ar->index].region_size = ar->size;
+	pmd->regions[ar->index].addr = NULL;
+	pmd->regions_num++;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_add_ring(struct rte_eth_dev *dev, memif_msg_t *msg, int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_ring_t *ar = &msg->add_ring;
+	struct memif_queue *mq;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing interrupt fd", 0);
+		return -1;
+	}
+
+	/* check if we have enough queues */
+	if (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) {
+		if (ar->index >= pmd->cfg.num_s2m_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_s2m_rings++;
+	} else {
+		if (ar->index >= pmd->cfg.num_m2s_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_m2s_rings++;
+	}
+
+	mq = (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) ?
+	    dev->data->rx_queues[ar->index] : dev->data->tx_queues[ar->index];
+
+	mq->intr_handle.fd = fd;
+	mq->log2_ring_size = ar->log2_ring_size;
+	mq->region = ar->region;
+	mq->offset = ar->offset;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connect_t *c = &msg->connect;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connected(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connected_t *c = &msg->connected;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_disconnect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_disconnect_t *d = &msg->disconnect;
+
+	memset(pmd->remote_disc_string, 0, sizeof(pmd->remote_disc_string));
+	strlcpy(pmd->remote_disc_string, (char *)d->string,
+		sizeof(pmd->remote_disc_string));
+
+	MIF_LOG(INFO, "%s: Disconnect received: %s",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_disc_string);
+
+	memset(pmd->local_disc_string, 0, 96);
+	memif_disconnect(rte_eth_dev_allocated
+			 (rte_vdev_device_name(pmd->vdev)));
+	return 0;
+}
+
+static int
+memif_msg_enq_ack(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	e->msg.type = MEMIF_MSG_TYPE_ACK;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_init(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_init_t *i = &e->msg.init;
+
+	if (e == NULL)
+		return -1;
+
+	i = &e->msg.init;
+	e->msg.type = MEMIF_MSG_TYPE_INIT;
+	i->version = MEMIF_VERSION;
+	i->id = pmd->id;
+	i->mode = MEMIF_INTERFACE_MODE_ETHERNET;
+
+	strlcpy((char *)i->name, rte_version(), sizeof(i->name));
+
+	if (*pmd->secret != '\0')
+		strlcpy((char *)i->secret, pmd->secret, sizeof(i->secret));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_region(struct rte_eth_dev *dev, uint8_t idx)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_add_region_t *ar;
+	struct memif_region *mr = &pmd->regions[idx];
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_region;
+	e->msg.type = MEMIF_MSG_TYPE_ADD_REGION;
+	e->fd = mr->fd;
+	ar->index = idx;
+	ar->size = mr->region_size;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_ring(struct rte_eth_dev *dev, uint8_t idx,
+		       memif_ring_type_t type)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	struct memif_queue *mq;
+	memif_msg_add_ring_t *ar;
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_ring;
+	mq = (type == MEMIF_RING_S2M) ? dev->data->tx_queues[idx] :
+	    dev->data->rx_queues[idx];
+
+	e->msg.type = MEMIF_MSG_TYPE_ADD_RING;
+	e->fd = mq->intr_handle.fd;
+	ar->index = idx;
+	ar->offset = mq->offset;
+	ar->region = mq->region;
+	ar->log2_ring_size = mq->log2_ring_size;
+	ar->flags = (type == MEMIF_RING_S2M) ? MEMIF_MSG_ADD_RING_FLAG_S2M : 0;
+	ar->private_hdr_size = 0;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connect_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connect;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECT;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connected(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connected_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connected;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECTED;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static void
+memif_intr_unregister_handler(struct rte_intr_handle *intr_handle, void *arg)
+{
+	struct memif_msg_queue_elt *elt;
+	struct memif_control_channel *cc = arg;
+
+	/* close control channel fd */
+	close(intr_handle->fd);
+	/* clear message queue */
+	while ((elt = TAILQ_FIRST(&cc->msg_queue)) != NULL) {
+		TAILQ_REMOVE(&cc->msg_queue, elt, next);
+		free(elt);
+	}
+	/* free control channel */
+	rte_free(cc);
+}
+
+void
+memif_disconnect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *elt;
+	struct memif_queue *mq;
+	int i;
+	int ret;
+
+	if (pmd->cc != NULL) {
+		/* Clear control message queue (except disconnect message if any). */
+		while ((elt = TAILQ_FIRST(&pmd->cc->msg_queue)) != NULL) {
+			if (elt->msg.type != MEMIF_MSG_TYPE_DISCONNECT) {
+				TAILQ_REMOVE(&pmd->cc->msg_queue, elt, next);
+				free(elt);
+			}
+		}
+		/* send disconnect message (if there is any in queue) */
+		memif_msg_send_from_queue(pmd->cc);
+
+		/* at this point, there should be no more messages in queue */
+		if (TAILQ_FIRST(&pmd->cc->msg_queue) != NULL) {
+			MIF_LOG(WARNING,
+				"%s: Unexpected message(s) in message queue.",
+				rte_vdev_device_name(pmd->vdev));
+		}
+
+		if (pmd->cc->intr_handle.fd > 0) {
+			ret =
+			    rte_intr_callback_unregister(&pmd->cc->intr_handle,
+							 memif_intr_handler,
+							 pmd->cc);
+			/*
+			 * If callback is active (disconnecting based on
+			 * received control message).
+			 */
+			if (ret == -EAGAIN) {
+				/* *INDENT-OFF* */
+				ret = rte_intr_callback_unregister_pending(
+							&pmd->cc->intr_handle,
+							memif_intr_handler,
+							pmd->cc,
+							memif_intr_unregister_handler);
+				/* *INDENT-ON* */
+			} else if (ret > 0) {
+				close(pmd->cc->intr_handle.fd);
+				rte_free(pmd->cc);
+			}
+			if (ret <= 0)
+				MIF_LOG(WARNING,
+					"%s: Failed to unregister control channel callback.",
+					rte_vdev_device_name(pmd->vdev));
+		}
+	}
+
+	/* unconfig interrupts */
+	for (i = 0; i < pmd->cfg.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			rte_intr_disable(&mq->intr_handle);
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+	for (i = 0; i < pmd->cfg.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			rte_intr_disable(&mq->intr_handle);
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+
+	memif_free_regions(pmd);
+
+	dev->data->dev_link.link_status = ETH_LINK_DOWN;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTED;
+	MIF_LOG(DEBUG, "%s: Disconnected.", rte_vdev_device_name(pmd->vdev));
+}
+
+static int
+memif_msg_receive(struct memif_control_channel *cc)
+{
+	char ctl[CMSG_SPACE(sizeof(int)) +
+		 CMSG_SPACE(sizeof(struct ucred))] = { 0 };
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	memif_msg_t msg = { 0 };
+	ssize_t size;
+	int ret = 0;
+	struct ucred *cr __rte_unused;
+	cr = 0;
+	struct cmsghdr *cmsg;
+	int afd = -1;
+	int i;
+	struct pmd_internals *pmd;
+
+	iov[0].iov_base = (void *)&msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+	mh.msg_control = ctl;
+	mh.msg_controllen = sizeof(ctl);
+
+	size = recvmsg(cc->intr_handle.fd, &mh, 0);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(DEBUG, "Invalid message size.");
+		memif_msg_enq_disconnect(cc, "Invalid message size", 0);
+		return -1;
+	}
+	MIF_LOG(DEBUG, "Received msg type: %u.", msg.type);
+
+	cmsg = CMSG_FIRSTHDR(&mh);
+	while (cmsg) {
+		if (cmsg->cmsg_level == SOL_SOCKET) {
+			if (cmsg->cmsg_type == SCM_CREDENTIALS)
+				cr = (struct ucred *)CMSG_DATA(cmsg);
+			else if (cmsg->cmsg_type == SCM_RIGHTS)
+				afd = *(int *)CMSG_DATA(cmsg);
+		}
+		cmsg = CMSG_NXTHDR(&mh, cmsg);
+	}
+
+	if (cc->dev == NULL && msg.type != MEMIF_MSG_TYPE_INIT) {
+		MIF_LOG(DEBUG, "Unexpected message.");
+		memif_msg_enq_disconnect(cc, "Unexpected message", 0);
+		return -1;
+	}
+
+	/* get device from hash data */
+	switch (msg.type) {
+	case MEMIF_MSG_TYPE_ACK:
+		break;
+	case MEMIF_MSG_TYPE_HELLO:
+		ret = memif_msg_receive_hello(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_init_regions_and_queues(cc->dev);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_init(cc->dev);
+		if (ret < 0)
+			goto exit;
+		pmd = cc->dev->data->dev_private;
+		for (i = 0; i < pmd->regions_num; i++) {
+			ret = memif_msg_enq_add_region(cc->dev, i);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_S2M);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_M2S);
+			if (ret < 0)
+				goto exit;
+		}
+		ret = memif_msg_enq_connect(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_INIT:
+		/*
+		 * This cc does not have an interface asociated with it.
+		 * If suitable interface is found it will be assigned here.
+		 */
+		ret = memif_msg_receive_init(cc, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_REGION:
+		ret = memif_msg_receive_add_region(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_RING:
+		ret = memif_msg_receive_add_ring(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECT:
+		ret = memif_msg_receive_connect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_connected(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECTED:
+		ret = memif_msg_receive_connected(cc->dev, &msg);
+		break;
+	case MEMIF_MSG_TYPE_DISCONNECT:
+		ret = memif_msg_receive_disconnect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		break;
+	default:
+		memif_msg_enq_disconnect(cc, "Unknown message type", 0);
+		ret = -1;
+		goto exit;
+	}
+
+ exit:
+	return ret;
+}
+
+static void
+memif_intr_handler(void *arg)
+{
+	struct memif_control_channel *cc = arg;
+	struct rte_eth_dev *dev;
+	int ret;
+
+	ret = memif_msg_receive(cc);
+	/* if driver failed to assign device */
+	if (cc->dev == NULL) {
+		ret = rte_intr_callback_unregister_pending(&cc->intr_handle,
+							   memif_intr_handler,
+							   cc,
+							   memif_intr_unregister_handler);
+		if (ret < 0)
+			MIF_LOG(WARNING,
+				"Failed to unregister control channel callback.");
+		return;
+	}
+	/* if memif_msg_receive failed */
+	if (ret < 0)
+		goto disconnect;
+
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto disconnect;
+
+	return;
+
+ disconnect:
+	dev = rte_eth_dev_allocated(rte_vdev_device_name(
+		((struct pmd_internals *)cc->dev->data->dev_private)->vdev));
+	if (dev == NULL) {
+		MIF_LOG(WARNING, "eth dev not allocated");
+		return;
+	}
+	memif_disconnect(dev);
+}
+
+static void
+memif_listener_handler(void *arg)
+{
+	struct memif_socket *socket = arg;
+	int sockfd;
+	int addr_len;
+	struct sockaddr_un client;
+	struct memif_control_channel *cc;
+	int ret;
+
+	addr_len = sizeof(client);
+	sockfd = accept(socket->intr_handle.fd, (struct sockaddr *)&client,
+			(socklen_t *)&addr_len);
+	if (sockfd < 0) {
+		MIF_LOG(ERR,
+			"Failed to accept connection request on socket fd %d",
+			socket->intr_handle.fd);
+		return;
+	}
+
+	MIF_LOG(DEBUG, "%s: Connection request accepted.", socket->filename);
+
+	cc = rte_zmalloc("memif-cc", sizeof(struct memif_control_channel), 0);
+	if (cc == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control channel.");
+		goto error;
+	}
+
+	cc->intr_handle.fd = sockfd;
+	cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	cc->socket = socket;
+	cc->dev = NULL;
+	TAILQ_INIT(&cc->msg_queue);
+
+	ret =
+	    rte_intr_callback_register(&cc->intr_handle, memif_intr_handler,
+				       cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to register control channel callback.");
+		goto error;
+	}
+
+	ret = memif_msg_enq_hello(cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to enqueue hello message.");
+		goto error;
+	}
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto error;
+
+	return;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (cc != NULL) {
+		rte_free(cc);
+		cc = NULL;
+	}
+}
+
+static struct memif_socket *
+memif_socket_create(struct pmd_internals *pmd, char *key, uint8_t listener)
+{
+	struct memif_socket *sock;
+	struct sockaddr_un un;
+	int sockfd;
+	int ret;
+	int on = 1;
+
+	sock = rte_zmalloc("memif-socket", sizeof(struct memif_socket), 0);
+	if (sock == NULL) {
+		MIF_LOG(ERR, "Failed to allocate memory for memif socket");
+		return NULL;
+	}
+
+	sock->listener = listener;
+	rte_memcpy(sock->filename, key, 256);
+	TAILQ_INIT(&sock->dev_queue);
+
+	if (listener != 0) {
+		sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+		if (sockfd < 0)
+			goto error;
+
+		un.sun_family = AF_UNIX;
+		memcpy(un.sun_path, sock->filename,
+			sizeof(un.sun_path) - 1);
+
+		ret = setsockopt(sockfd, SOL_SOCKET, SO_PASSCRED, &on,
+				 sizeof(on));
+		if (ret < 0)
+			goto error;
+		ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
+		if (ret < 0)
+			goto error;
+		ret = listen(sockfd, 1);
+		if (ret < 0)
+			goto error;
+
+		MIF_LOG(DEBUG, "%s: Memif listener socket %s created.",
+			rte_vdev_device_name(pmd->vdev), sock->filename);
+
+		sock->intr_handle.fd = sockfd;
+		sock->intr_handle.type = RTE_INTR_HANDLE_EXT;
+		ret = rte_intr_callback_register(&sock->intr_handle,
+						 memif_listener_handler, sock);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to register interrupt "
+				"callback for listener socket",
+				rte_vdev_device_name(pmd->vdev));
+			return NULL;
+		}
+	}
+
+	return sock;
+
+ error:
+	MIF_LOG(ERR, "%s: Failed to setup socket %s: %s",
+		rte_vdev_device_name(pmd->vdev), key, strerror(errno));
+	if (sock != NULL)
+		rte_free(sock);
+	return NULL;
+}
+
+static struct rte_hash *
+memif_create_socket_hash(void)
+{
+	struct rte_hash_parameters params = { 0 };
+	params.name = MEMIF_SOCKET_HASH_NAME;
+	params.entries = 256;
+	params.key_len = 256;
+	params.hash_func = rte_jhash;
+	params.hash_func_init_val = 0;
+	return rte_hash_create(&params);
+}
+
+int
+memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *tmp_pmd;
+	struct rte_hash *hash;
+	int ret;
+	char key[256];
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL) {
+		hash = memif_create_socket_hash();
+		if (hash == NULL) {
+			MIF_LOG(ERR, "Failed to create memif socket hash.");
+			return -1;
+		}
+	}
+
+	memset(key, 0, 256);
+	rte_memcpy(key, socket_filename, strlen(socket_filename));
+	ret = rte_hash_lookup_data(hash, key, (void **)&socket);
+	if (ret < 0) {
+		socket = memif_socket_create(pmd, key,
+					     (pmd->role ==
+					      MEMIF_ROLE_SLAVE) ? 0 : 1);
+		if (socket == NULL)
+			return -1;
+		ret = rte_hash_add_key_data(hash, key, socket);
+		if (ret < 0) {
+			MIF_LOG(ERR, "Failed to add socket to socket hash.");
+			return ret;
+		}
+	}
+	pmd->socket_filename = socket->filename;
+
+	if (socket->listener != 0 && pmd->role == MEMIF_ROLE_SLAVE) {
+		MIF_LOG(ERR, "Socket is a listener.");
+		return -1;
+	} else if ((socket->listener == 0) && (pmd->role == MEMIF_ROLE_MASTER)) {
+		MIF_LOG(ERR, "Socket is not a listener.");
+		return -1;
+	}
+
+	TAILQ_FOREACH(elt, &socket->dev_queue, next) {
+		tmp_pmd = elt->dev->data->dev_private;
+		if (tmp_pmd->id == pmd->id) {
+			MIF_LOG(ERR, "Memif device with id %d already "
+				"exists on socket %s",
+				pmd->id, socket->filename);
+			return -1;
+		}
+	}
+
+	elt =
+	    rte_malloc("pmd-queue", sizeof(struct memif_socket_dev_list_elt),
+		       0);
+	if (elt == NULL) {
+		MIF_LOG(ERR, "%s: Failed to add device to socket device list.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+	elt->dev = dev;
+	TAILQ_INSERT_TAIL(&socket->dev_queue, elt, next);
+
+	return 0;
+}
+
+void
+memif_socket_remove_device(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt, *next;
+	struct rte_hash *hash;
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL)
+		return;
+
+	if (rte_hash_lookup_data(hash, pmd->socket_filename, (void **)&socket) <
+	    0)
+		return;
+
+	for (elt = TAILQ_FIRST(&socket->dev_queue); elt != NULL; elt = next) {
+		next = TAILQ_NEXT(elt, next);
+		if (elt->dev == dev) {
+			TAILQ_REMOVE(&socket->dev_queue, elt, next);
+			free(elt);
+			pmd->socket_filename = NULL;
+		}
+	}
+
+	/* remove socket, if this was the last device using it */
+	if (TAILQ_EMPTY(&socket->dev_queue)) {
+		rte_hash_del_key(hash, socket->filename);
+		if (socket->listener) {
+			/* remove listener socket file,
+			 * so we can create new one later.
+			 */
+			remove(socket->filename);
+		}
+		rte_free(socket);
+	}
+}
+
+int
+memif_connect_master(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+	return 0;
+}
+
+int
+memif_connect_slave(struct rte_eth_dev *dev)
+{
+	int sockfd;
+	int ret;
+	struct sockaddr_un sun;
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+
+	sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+	if (sockfd < 0) {
+		MIF_LOG(ERR, "%s: Failed to open socket.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	sun.sun_family = AF_UNIX;
+
+	memcpy(sun.sun_path, pmd->socket_filename, sizeof(sun.sun_path) - 1);
+
+	ret = connect(sockfd, (struct sockaddr *)&sun,
+		      sizeof(struct sockaddr_un));
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to connect socket: %s.",
+			rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+		goto error;
+	}
+
+	MIF_LOG(DEBUG, "%s: Memif socket: %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+
+	pmd->cc = rte_zmalloc("memif-cc",
+			      sizeof(struct memif_control_channel), 0);
+	if (pmd->cc == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate control channel.",
+			rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	pmd->cc->intr_handle.fd = sockfd;
+	pmd->cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	pmd->cc->socket = NULL;
+	pmd->cc->dev = dev;
+	TAILQ_INIT(&pmd->cc->msg_queue);
+
+	ret = rte_intr_callback_register(&pmd->cc->intr_handle,
+					 memif_intr_handler, pmd->cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to register interrupt callback "
+			"for control fd", rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	return 0;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (pmd->cc != NULL) {
+		rte_free(pmd->cc);
+		pmd->cc = NULL;
+	}
+	return -1;
+}
diff --git a/drivers/net/memif/memif_socket.h b/drivers/net/memif/memif_socket.h
new file mode 100644
index 000000000..8caea270b
--- /dev/null
+++ b/drivers/net/memif/memif_socket.h
@@ -0,0 +1,104 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_SOCKET_H_
+#define _MEMIF_SOCKET_H_
+
+#include <sys/queue.h>
+
+/**
+ * Remove device from socket device list. If no device is left on the socket,
+ * remove the socket as well.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_socket_remove_device(struct rte_eth_dev *dev);
+
+/**
+ * Enqueue disconnect message to control channel message queue.
+ *
+ * @param cc
+ *   control channel
+ * @param reason
+ *   const string stating disconnect reason (96 characters)
+ * @param err_code
+ *   error code
+ */
+void memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			      int err_code);
+
+/**
+ * Initialize memif socket for specified device. If socket doesn't exist, create socket.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @param socket_filename
+ *   socket filename
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename);
+
+/**
+ * Disconnect memif device. Close control channel and shared memory.
+ *
+ * @param dev
+ *   ethernet device
+ */
+void memif_disconnect(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, enable connection establishment.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_master(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, send connection request.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_slave(struct rte_eth_dev *dev);
+
+struct memif_socket_dev_list_elt {
+	TAILQ_ENTRY(memif_socket_dev_list_elt) next;
+	struct rte_eth_dev *dev;		/**< pointer to device internals */
+};
+
+#define MEMIF_SOCKET_HASH_NAME			"memif-sh"
+struct memif_socket {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	uint8_t listener;			/**< if not zero socket is listener */
+	char filename[256];			/**< socket filename */
+
+	TAILQ_HEAD(, memif_socket_dev_list_elt) dev_queue;
+	/**< Queue of devices using this socket */
+};
+
+/* Control message queue. */
+struct memif_msg_queue_elt {
+	TAILQ_ENTRY(memif_msg_queue_elt) next;
+	memif_msg_t msg;			/**< control message */
+	int fd;					/**< fd to be sent to peer */
+};
+
+struct memif_control_channel {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	TAILQ_HEAD(, memif_msg_queue_elt) msg_queue; /**< control message queue */
+	struct memif_socket *socket;		/**< pointer to socket */
+	struct rte_eth_dev *dev;		/**< pointer to device */
+};
+
+#endif				/* MEMIF_SOCKET_H */
diff --git a/drivers/net/memif/meson.build b/drivers/net/memif/meson.build
new file mode 100644
index 000000000..4dfe37d3a
--- /dev/null
+++ b/drivers/net/memif/meson.build
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+if host_machine.system() != 'linux'
+        build = false
+endif
+
+sources = files('rte_eth_memif.c',
+		'memif_socket.c')
+
+allow_experimental_apis = true
+
+deps += ['hash']
diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
new file mode 100644
index 000000000..7ef1bca93
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.c
@@ -0,0 +1,1132 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <linux/if_ether.h>
+#include <errno.h>
+#include <sys/eventfd.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+#define ETH_MEMIF_ID_ARG		"id"
+#define ETH_MEMIF_ROLE_ARG		"role"
+#define ETH_MEMIF_BUFFER_SIZE_ARG	"bsize"
+#define ETH_MEMIF_RING_SIZE_ARG		"rsize"
+#define ETH_MEMIF_SOCKET_ARG		"socket"
+#define ETH_MEMIF_MAC_ARG		"mac"
+#define ETH_MEMIF_ZC_ARG		"zero-copy"
+#define ETH_MEMIF_SECRET_ARG		"secret"
+
+static const char *valid_arguments[] = {
+	ETH_MEMIF_ID_ARG,
+	ETH_MEMIF_ROLE_ARG,
+	ETH_MEMIF_BUFFER_SIZE_ARG,
+	ETH_MEMIF_RING_SIZE_ARG,
+	ETH_MEMIF_SOCKET_ARG,
+	ETH_MEMIF_MAC_ARG,
+	ETH_MEMIF_ZC_ARG,
+	ETH_MEMIF_SECRET_ARG,
+	NULL
+};
+
+static struct rte_vdev_driver pmd_memif_drv;
+
+const char *
+memif_version(void)
+{
+	return ("memif-" RTE_STR(MEMIF_VERSION_MAJOR) "." RTE_STR(MEMIF_VERSION_MINOR));
+}
+
+static void
+memif_dev_info(struct rte_eth_dev *dev __rte_unused, struct rte_eth_dev_info *dev_info)
+{
+	dev_info->max_mac_addrs = 1;
+	dev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;
+	dev_info->max_rx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->max_tx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->min_rx_bufsize = 0;
+}
+
+static memif_ring_t *
+memif_get_ring(struct pmd_internals *pmd, memif_ring_type_t type, uint16_t ring_num)
+{
+	/* rings only in region 0 */
+	void *p = pmd->regions[0].addr;
+	int ring_size = sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size);
+
+	p = (uint8_t *)p + (ring_num + type * pmd->run.num_s2m_rings) * ring_size;
+
+	return (memif_ring_t *)p;
+}
+
+static void *
+memif_get_buffer(struct pmd_internals *pmd, memif_desc_t *d)
+{
+	return ((uint8_t *)pmd->regions[d->region].addr + d->offset);
+}
+
+static uint16_t
+eth_memif_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t cur_slot, last_slot, n_slots, ring_size, mask, s0;
+	uint16_t n_rx_pkts = 0;
+	uint16_t mbuf_size = rte_pktmbuf_data_room_size(mq->mempool) -
+		RTE_PKTMBUF_HEADROOM;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head = NULL;
+	uint64_t b;
+	ssize_t size __rte_unused;
+	uint16_t head;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+
+	/* consume interrupt */
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0)
+		size = read(mq->intr_handle.fd, &b, sizeof(b));
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	cur_slot = (type == MEMIF_RING_S2M) ? mq->last_head : mq->last_tail;
+	last_slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+	if (cur_slot == last_slot)
+		goto refill;
+	n_slots = last_slot - cur_slot;
+
+	while (n_slots && n_rx_pkts < nb_pkts) {
+		mbuf_head = rte_pktmbuf_alloc(mq->mempool);
+		if (unlikely(mbuf_head == NULL))
+			goto no_free_bufs;
+		mbuf = mbuf_head;
+		mbuf->port = mq->in_port;
+
+ next_slot:
+		s0 = cur_slot & mask;
+		d0 = &ring->desc[s0];
+
+		src_len = d0->length;
+		dst_off = 0;
+		src_off = 0;
+
+		do {
+			dst_len = mbuf_size - dst_off;
+			if (dst_len == 0) {
+				dst_off = 0;
+				dst_len = mbuf_size + RTE_PKTMBUF_HEADROOM;
+
+				mbuf = rte_pktmbuf_alloc(mq->mempool);
+				if (unlikely(mbuf == NULL))
+					goto no_free_bufs;
+				mbuf->port = mq->in_port;
+				rte_pktmbuf_chain(mbuf_head, mbuf);
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			rte_pktmbuf_pkt_len(mbuf) =
+			    rte_pktmbuf_data_len(mbuf) += cp_len;
+
+			memcpy(rte_pktmbuf_mtod_offset(mbuf, void *, dst_off),
+			       (uint8_t *)memif_get_buffer(pmd, d0) + src_off, cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+		} while (src_len);
+
+		cur_slot++;
+		n_slots--;
+		if (d0->flags & MEMIF_DESC_FLAG_NEXT)
+			goto next_slot;
+
+		*bufs++ = mbuf_head;
+		n_rx_pkts++;
+	}
+
+ no_free_bufs:
+	if (type == MEMIF_RING_S2M) {
+		rte_mb();
+		ring->tail = cur_slot;
+		mq->last_head = cur_slot;
+	} else {
+		mq->last_tail = cur_slot;
+	}
+
+ refill:
+	if (type == MEMIF_RING_M2S) {
+		head = ring->head;
+		n_slots = ring_size - head + mq->last_tail;
+
+		while (n_slots--) {
+			s0 = head++ & mask;
+			d0 = &ring->desc[s0];
+			d0->length = pmd->run.buffer_size;
+		}
+		rte_mb();
+		ring->head = head;
+	}
+
+	mq->n_pkts += n_rx_pkts;
+	return n_rx_pkts;
+}
+
+static uint16_t
+eth_memif_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t slot, saved_slot, n_free, ring_size, mask, n_tx_pkts = 0;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head;
+	uint64_t a;
+	ssize_t size;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	n_free = ring->tail - mq->last_tail;
+	mq->last_tail += n_free;
+	slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+
+	if (type == MEMIF_RING_S2M)
+		n_free = ring_size - ring->head + mq->last_tail;
+	else
+		n_free = ring->head - ring->tail;
+
+	while (n_free && n_tx_pkts < nb_pkts) {
+		mbuf_head = *bufs++;
+		mbuf = mbuf_head;
+
+		saved_slot = slot;
+		d0 = &ring->desc[slot & mask];
+		dst_off = 0;
+		dst_len =
+		    (type ==
+		     MEMIF_RING_S2M) ? pmd->run.buffer_size : d0->length;
+
+ next_in_chain:
+		src_off = 0;
+		src_len = rte_pktmbuf_data_len(mbuf);
+
+		while (src_len) {
+			if (dst_len == 0) {
+				if (n_free) {
+					slot++;
+					n_free--;
+					d0->flags |= MEMIF_DESC_FLAG_NEXT;
+					d0 = &ring->desc[slot & mask];
+					dst_off = 0;
+					dst_len = (type == MEMIF_RING_S2M) ?
+					    pmd->run.buffer_size : d0->length;
+					d0->flags = 0;
+				} else {
+					slot = saved_slot;
+					goto no_free_slots;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			memcpy((uint8_t *)memif_get_buffer(pmd, d0) + dst_off,
+			       rte_pktmbuf_mtod_offset(mbuf, void *, src_off),
+			       cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+			dst_len -= cp_len;
+
+			d0->length = dst_off;
+		}
+
+		if (rte_pktmbuf_is_contiguous(mbuf) == 0) {
+			mbuf = mbuf->next;
+			goto next_in_chain;
+		}
+
+		n_tx_pkts++;
+		slot++;
+		n_free--;
+		rte_pktmbuf_free(mbuf_head);
+	}
+
+ no_free_slots:
+	rte_mb();
+	if (type == MEMIF_RING_S2M)
+		ring->head = slot;
+	else
+		ring->tail = slot;
+
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
+		a = 1;
+		size = write(mq->intr_handle.fd, &a, sizeof(a));
+		if (unlikely(size < 0)) {
+			MIF_LOG(WARNING,
+				"%s: Failed to send interrupt. %s",
+				rte_vdev_device_name(pmd->vdev), strerror(errno));
+		}
+	}
+
+	mq->n_err += nb_pkts - n_tx_pkts;
+	mq->n_pkts += n_tx_pkts;
+	return n_tx_pkts;
+}
+
+void
+memif_free_regions(struct pmd_internals *pmd)
+{
+	int i;
+	struct memif_region *r;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		r = pmd->regions + i;
+		if (r == NULL)
+			return;
+		if (r->addr == NULL)
+			return;
+		munmap(r->addr, r->region_size);
+		if (r->fd > 0) {
+			close(r->fd);
+			r->fd = -1;
+		}
+	}
+	rte_free(pmd->regions);
+}
+
+static int
+memif_alloc_regions(struct pmd_internals *pmd, uint8_t brn)
+{
+	struct memif_region *r;
+	char shm_name[32];
+	int i;
+	int ret = 0;
+
+	r = rte_zmalloc("memif_region", sizeof(struct memif_region) * (brn + 1), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate regions.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	pmd->regions = r;
+	pmd->regions_num = brn + 1;
+
+	/*
+	 * Create shm for every region. Region 0 is reserved for descriptors.
+	 * Other regions contain buffers.
+	 */
+	for (i = 0; i < (brn + 1); i++) {
+		r = &pmd->regions[i];
+
+		r->buffer_offset = (i == 0) ? (pmd->run.num_s2m_rings +
+					       pmd->run.num_m2s_rings) *
+		    (sizeof(memif_ring_t) +
+		     sizeof(memif_desc_t) * (1 << pmd->run.log2_ring_size)) : 0;
+		r->region_size = (i == 0) ? r->buffer_offset :
+		    (uint32_t)(pmd->run.buffer_size *
+				(1 << pmd->run.log2_ring_size) *
+				(pmd->run.num_s2m_rings +
+				 pmd->run.num_m2s_rings));
+
+		memset(shm_name, 0, sizeof(char) * 32);
+		sprintf(shm_name, "memif region %d", i);
+
+		r->fd = memfd_create(shm_name, MFD_ALLOW_SEALING);
+		if (r->fd < 0) {
+			MIF_LOG(ERR, "%s: Failed to create shm file: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+
+		ret = fcntl(r->fd, F_ADD_SEALS, F_SEAL_SHRINK);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to add seals to shm file: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+
+		ret = ftruncate(r->fd, r->region_size);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to truncate shm file: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+
+		r->addr = mmap(NULL, r->region_size, PROT_READ |
+			       PROT_WRITE, MAP_SHARED, r->fd, 0);
+		if (r->addr == NULL) {
+			MIF_LOG(ERR, "%s: Failed to mmap shm region: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static void
+memif_init_rings(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_ring_t *ring;
+	int i, j;
+	uint16_t slot;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = i * (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 1;
+			ring->desc[j].offset = pmd->regions[1].buffer_offset +
+			    (uint32_t)(slot * pmd->run.buffer_size);
+			ring->desc[j].length = pmd->run.buffer_size;
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = (i + pmd->run.num_s2m_rings) *
+			    (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 1;
+			ring->desc[j].offset = pmd->regions[1].buffer_offset +
+			    (uint32_t)(slot * pmd->run.buffer_size);
+			ring->desc[j].length = pmd->run.buffer_size;
+		}
+	}
+}
+
+/* called only by slave */
+static void
+memif_init_queues(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = dev->data->tx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0].addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for tx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = dev->data->rx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0].addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for rx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+}
+
+int
+memif_init_regions_and_queues(struct rte_eth_dev *dev)
+{
+	int ret;
+
+	ret = memif_alloc_regions(dev->data->dev_private, /* num of buffer regions */ 1);
+	if (ret < 0)
+		return ret;
+
+	memif_init_rings(dev);
+
+	memif_init_queues(dev);
+
+	return 0;
+}
+
+int
+memif_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_region *mr;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		mr = pmd->regions + i;
+		if (mr != NULL) {
+			if (mr->addr == NULL) {
+				if (mr->fd < 0)
+					return -1;
+				mr->addr = mmap(NULL, mr->region_size,
+						PROT_READ | PROT_WRITE,
+						MAP_SHARED, mr->fd, 0);
+				if (mr->addr == NULL)
+					return -1;
+			}
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region].addr +
+			    mq->offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_MASTER)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region].addr +
+			    mq->offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_SLAVE)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags |= ETH_MEMIF_FLAG_CONNECTED;
+	dev->data->dev_link.link_status = ETH_LINK_UP;
+	MIF_LOG(INFO, "%s: Connected.", rte_vdev_device_name(pmd->vdev));
+	return 0;
+}
+
+static int
+memif_dev_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int ret = 0;
+
+	switch (pmd->role) {
+	case MEMIF_ROLE_SLAVE:
+		ret = memif_connect_slave(dev);
+		break;
+	case MEMIF_ROLE_MASTER:
+		ret = memif_connect_master(dev);
+		break;
+	default:
+		MIF_LOG(ERR, "%s: Unknown role: %d.",
+			rte_vdev_device_name(pmd->vdev), pmd->role);
+		ret = -1;
+		break;
+	}
+
+	return ret;
+}
+
+static int
+memif_dev_configure(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	/*
+	 * SLAVE - TXQ
+	 * MASTER - RXQ
+	 */
+	pmd->cfg.num_s2m_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_tx_queues : dev->data->nb_rx_queues;
+
+	/*
+	 * SLAVE - RXQ
+	 * MASTER - TXQ
+	 */
+	pmd->cfg.num_m2s_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_rx_queues : dev->data->nb_tx_queues;
+
+	return 0;
+}
+
+static int
+memif_tx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_tx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("tx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate tx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type =
+	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_S2M : MEMIF_RING_M2S;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->pmd = pmd;
+	dev->data->tx_queues[qid] = mq;
+
+	return 0;
+}
+
+static int
+memif_rx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_rx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_rxconf *rx_conf __rte_unused,
+		     struct rte_mempool *mb_pool)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("rx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate rx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type = (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_M2S : MEMIF_RING_S2M;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->mempool = mb_pool;
+	mq->in_port = dev->data->port_id;
+	mq->pmd = pmd;
+	dev->data->rx_queues[qid] = mq;
+
+	return 0;
+}
+
+static void
+memif_queue_release(void *queue)
+{
+	struct memif_queue *q = (struct memif_queue *)queue;
+
+	if (!q)
+		return;
+
+	rte_free(q);
+}
+
+static int
+memif_link_update(struct rte_eth_dev *dev __rte_unused,
+		  int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
+static int
+memif_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+	uint8_t tmp, nq;
+
+	stats->ipackets = 0;
+	stats->ibytes = 0;
+	stats->opackets = 0;
+	stats->obytes = 0;
+	stats->oerrors = 0;
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_s2m_rings :
+	    pmd->run.num_m2s_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* RX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->rx_queues[i];
+		stats->q_ipackets[i] = mq->n_pkts;
+		stats->q_ibytes[i] = mq->n_bytes;
+		stats->ipackets += mq->n_pkts;
+		stats->ibytes += mq->n_bytes;
+	}
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_m2s_rings :
+	    pmd->run.num_s2m_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* TX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->tx_queues[i];
+		stats->q_opackets[i] = mq->n_pkts;
+		stats->q_obytes[i] = mq->n_bytes;
+		stats->q_errors[i] = mq->n_err;
+		stats->opackets += mq->n_pkts;
+		stats->obytes += mq->n_bytes;
+		stats->oerrors += mq->n_err;
+	}
+	return 0;
+}
+
+static void
+memif_stats_reset(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int i;
+	struct memif_queue *mq;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->tx_queues[i] :
+		    dev->data->rx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->rx_queues[i] :
+		    dev->data->tx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+}
+
+static int
+memif_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	MIF_LOG(WARNING, "%s: Interrupt mode not supported.",
+		rte_vdev_device_name(pmd->vdev));
+
+	return -1;
+}
+
+static int
+memif_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd __rte_unused = dev->data->dev_private;
+
+	return 0;
+}
+
+static const struct eth_dev_ops ops = {
+	.dev_start = memif_dev_start,
+	.dev_infos_get = memif_dev_info,
+	.dev_configure = memif_dev_configure,
+	.tx_queue_setup = memif_tx_queue_setup,
+	.rx_queue_setup = memif_rx_queue_setup,
+	.rx_queue_release = memif_queue_release,
+	.tx_queue_release = memif_queue_release,
+	.rx_queue_intr_enable = memif_rx_queue_intr_enable,
+	.rx_queue_intr_disable = memif_rx_queue_intr_disable,
+	.link_update = memif_link_update,
+	.stats_get = memif_stats_get,
+	.stats_reset = memif_stats_reset,
+};
+
+static int
+memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
+	     memif_interface_id_t id, uint32_t flags,
+	     const char *socket_filename,
+	     memif_log2_ring_size_t log2_ring_size,
+	     uint16_t buffer_size, const char *secret,
+	     struct ether_addr *eth_addr)
+{
+	int ret = 0;
+	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev_data *data;
+	struct pmd_internals *pmd;
+	const unsigned int numa_node = vdev->device.numa_node;
+	const char *name = rte_vdev_device_name(vdev);
+
+	if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
+		MIF_LOG(ERR, "Zero-copy not supported.");
+		return -1;
+	}
+
+	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
+	if (eth_dev == NULL) {
+		MIF_LOG(ERR, "%s: Unable to allocate device struct.", name);
+		return -1;
+	}
+
+	pmd = eth_dev->data->dev_private;
+	memset(pmd, 0, sizeof(*pmd));
+
+	pmd->vdev = vdev;
+	pmd->id = id;
+	pmd->flags = flags;
+	pmd->flags |= ETH_MEMIF_FLAG_DISABLED;
+	pmd->role = role;
+	ret = memif_socket_init(eth_dev, socket_filename);
+	if (ret < 0)
+		return ret;
+
+	memset(pmd->secret, 0, sizeof(char) * 24);
+	if (secret != NULL)
+		strlcpy(pmd->secret, secret, sizeof(pmd->secret));
+
+	pmd->cfg.log2_ring_size = log2_ring_size;
+	/* set in .dev_configure() */
+	pmd->cfg.num_s2m_rings = 0;
+	pmd->cfg.num_m2s_rings = 0;
+
+	pmd->cfg.buffer_size = buffer_size;
+
+	rte_memcpy(&pmd->eth_addr, eth_addr, sizeof(struct ether_addr));
+
+	data = eth_dev->data;
+	data->dev_private = pmd;
+	data->numa_node = numa_node;
+	data->mac_addrs = &pmd->eth_addr;
+
+	eth_dev->dev_ops = &ops;
+	eth_dev->device = &vdev->device;
+	eth_dev->rx_pkt_burst = eth_memif_rx;
+	eth_dev->tx_pkt_burst = eth_memif_tx;
+
+	rte_eth_dev_probing_finish(eth_dev);
+
+	return 0;
+}
+
+static int
+memif_set_role(const char *key __rte_unused, const char *value,
+	       void *extra_args)
+{
+	enum memif_role_t *role = (enum memif_role_t *)extra_args;
+
+	if (strstr(value, "master") != NULL) {
+		*role = MEMIF_ROLE_MASTER;
+	} else if (strstr(value, "slave") != NULL) {
+		*role = MEMIF_ROLE_SLAVE;
+	} else {
+		MIF_LOG(ERR, "Unknown role: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_zc(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	uint32_t *flags = (uint32_t *)extra_args;
+
+	if (strstr(value, "yes") != NULL) {
+		*flags |= ETH_MEMIF_FLAG_ZERO_COPY;
+	} else if (strstr(value, "no") != NULL) {
+		*flags &= ~ETH_MEMIF_FLAG_ZERO_COPY;
+	} else {
+		MIF_LOG(ERR, "Failed to parse zero-copy param: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_id(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	memif_interface_id_t *id = (memif_interface_id_t *)extra_args;
+
+	/* even if parsing fails, 0 is a valid id */
+	*id = strtoul(value, NULL, 10);
+	return 0;
+}
+
+static int
+memif_set_bs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	uint16_t *buffer_size = (uint16_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > 0xFFFF) {
+		MIF_LOG(ERR, "Invalid buffer size: %s.", value);
+		return -EINVAL;
+	}
+	*buffer_size = tmp;
+	return 0;
+}
+
+static int
+memif_set_rs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	memif_log2_ring_size_t *log2_ring_size =
+	    (memif_log2_ring_size_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > ETH_MEMIF_MAX_LOG2_RING_SIZE) {
+		MIF_LOG(ERR, "Invalid ring size: %s (max %u).",
+			value, ETH_MEMIF_MAX_LOG2_RING_SIZE);
+		return -EINVAL;
+	}
+	*log2_ring_size = tmp;
+	return 0;
+}
+
+/* check if directory exists and if we have permission to read/write */
+static int
+memif_check_socket_filename(const char *filename)
+{
+	char *dir = NULL, *tmp;
+	uint32_t idx;
+	int ret = 0;
+
+	tmp = strrchr(filename, '/');
+	if (tmp != NULL) {
+		idx = tmp - filename;
+		dir = rte_zmalloc("memif_tmp", sizeof(char) * (idx + 1), 0);
+		if (dir == NULL) {
+			MIF_LOG(ERR, "Failed to allocate memory.");
+			return -1;
+		}
+		strlcpy(dir, filename, sizeof(char) * (idx + 1));
+	}
+
+	if (dir == NULL || (faccessat(-1, dir, F_OK | R_OK |
+					W_OK, AT_EACCESS) < 0)) {
+		MIF_LOG(ERR, "Invalid directory: '%s'.", dir);
+		ret = -EINVAL;
+	}
+
+	if (dir != NULL)
+		rte_free(dir);
+
+	return ret;
+}
+
+static int
+memif_set_socket_filename(const char *key __rte_unused, const char *value,
+			  void *extra_args)
+{
+	const char **socket_filename = (const char **)extra_args;
+
+	*socket_filename = value;
+	return memif_check_socket_filename(*socket_filename);
+}
+
+static int
+memif_set_mac(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	struct ether_addr *eth_addr = (struct ether_addr *)extra_args;
+	int ret = 0;
+
+	ret = sscanf(value, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
+	       &eth_addr->addr_bytes[0], &eth_addr->addr_bytes[1],
+	       &eth_addr->addr_bytes[2], &eth_addr->addr_bytes[3],
+	       &eth_addr->addr_bytes[4], &eth_addr->addr_bytes[5]);
+	if (ret != 6)
+		MIF_LOG(WARNING, "Failed to parse mac '%s'.", value);
+	return 0;
+}
+
+static int
+memif_set_secret(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	const char **secret = (const char **)extra_args;
+
+	*secret = value;
+	return 0;
+}
+
+static int
+rte_pmd_memif_probe(struct rte_vdev_device *vdev)
+{
+	RTE_BUILD_BUG_ON(sizeof(memif_msg_t) != 128);
+	RTE_BUILD_BUG_ON(sizeof(memif_desc_t) != 16);
+	int ret = 0;
+	struct rte_kvargs *kvlist;
+	const char *name = rte_vdev_device_name(vdev);
+	enum memif_role_t role = MEMIF_ROLE_SLAVE;
+	memif_interface_id_t id = 0;
+	uint16_t buffer_size = ETH_MEMIF_DEFAULT_BUFFER_SIZE;
+	memif_log2_ring_size_t log2_ring_size = ETH_MEMIF_DEFAULT_RING_SIZE;
+	const char *socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME;
+	uint32_t flags = 0;
+	const char *secret = NULL;
+	struct ether_addr eth_addr;
+
+	eth_random_addr(eth_addr.addr_bytes);
+
+	MIF_LOG(INFO, "Initialize MEMIF: %s.", name);
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_arguments);
+
+	/* parse parameters */
+	if (kvlist != NULL) {
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ROLE_ARG,
+					 &memif_set_role, &role);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ID_ARG,
+					 &memif_set_id, &id);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_BUFFER_SIZE_ARG,
+					 &memif_set_bs, &buffer_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_RING_SIZE_ARG,
+					 &memif_set_rs, &log2_ring_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SOCKET_ARG,
+					 &memif_set_socket_filename,
+					 (void *)(&socket_filename));
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_MAC_ARG,
+					 &memif_set_mac, &eth_addr);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ZC_ARG,
+					 &memif_set_zc, &flags);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SECRET_ARG,
+					 &memif_set_secret, (void *)(&secret));
+		if (ret < 0)
+			goto exit;
+	}
+
+	/* create interface */
+	ret = memif_create(vdev, role, id, flags, socket_filename,
+			   log2_ring_size, buffer_size, secret, &eth_addr);
+
+ exit:
+	if (kvlist != NULL)
+		rte_kvargs_free(kvlist);
+	return ret;
+}
+
+static int
+rte_pmd_memif_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_eth_dev *eth_dev;
+	struct pmd_internals *pmd;
+
+	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+	if (eth_dev == NULL)
+		return 0;
+
+	pmd = eth_dev->data->dev_private;
+
+	memif_msg_enq_disconnect(pmd->cc, "Invalid message size", 0);
+	memif_disconnect(eth_dev);
+
+	memif_socket_remove_device(eth_dev);
+
+	pmd->vdev = NULL;
+
+	rte_free(eth_dev->data->dev_private);
+
+	rte_eth_dev_release_port(eth_dev);
+
+	return 0;
+}
+
+static struct rte_vdev_driver pmd_memif_drv = {
+	.probe = rte_pmd_memif_probe,
+	.remove = rte_pmd_memif_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_memif, pmd_memif_drv);
+
+RTE_PMD_REGISTER_PARAM_STRING(net_memif,
+			      ETH_MEMIF_ID_ARG "=<int>"
+			      ETH_MEMIF_ROLE_ARG "=master|slave"
+			      ETH_MEMIF_BUFFER_SIZE_ARG "=<int>"
+			      ETH_MEMIF_RING_SIZE_ARG "=<int>"
+			      ETH_MEMIF_SOCKET_ARG "=<string>"
+			      ETH_MEMIF_MAC_ARG "=xx:xx:xx:xx:xx:xx"
+			      ETH_MEMIF_ZC_ARG "=yes|no"
+			      ETH_MEMIF_SECRET_ARG "=<string>");
+
+int memif_logtype;
+
+RTE_INIT(memif_init_log)
+{
+	memif_logtype = rte_log_register("pmd.net.memif");
+	if (memif_logtype >= 0)
+		rte_log_set_level(memif_logtype, RTE_LOG_NOTICE);
+}
diff --git a/drivers/net/memif/rte_eth_memif.h b/drivers/net/memif/rte_eth_memif.h
new file mode 100644
index 000000000..930de38ed
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.h
@@ -0,0 +1,203 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _RTE_ETH_MEMIF_H_
+#define _RTE_ETH_MEMIF_H_
+
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE
+#endif				/* GNU_SOURCE */
+
+#include <sys/queue.h>
+
+#include <rte_ethdev_driver.h>
+#include <rte_ether.h>
+#include <rte_interrupts.h>
+
+#include "memif.h"
+
+#define ETH_MEMIF_DEFAULT_SOCKET_FILENAME	"/tmp/memif.sock"
+#define ETH_MEMIF_DEFAULT_RING_SIZE		10
+#define ETH_MEMIF_DEFAULT_BUFFER_SIZE		2048
+
+#define ETH_MEMIF_MAX_NUM_Q_PAIRS		255
+#define ETH_MEMIF_MAX_LOG2_RING_SIZE		14
+#define ETH_MEMIF_MAX_REGION_IDX		255
+
+extern int memif_logtype;
+
+#define MIF_LOG(level, fmt, args...) \
+	rte_log(RTE_LOG_ ## level, memif_logtype, \
+		"%s(): " fmt "\n", __func__, ##args)
+
+enum memif_role_t {
+	MEMIF_ROLE_MASTER,
+	MEMIF_ROLE_SLAVE,
+};
+
+struct memif_region {
+	void *addr;				/**< shared memory address */
+	memif_region_size_t region_size;	/**< shared memory size */
+	int fd;					/**< shared memory file descriptor */
+	uint32_t buffer_offset;			/**< offset at which buffers start */
+};
+
+struct memif_queue {
+	struct rte_mempool *mempool;		/**< mempool for RX packets */
+	uint16_t in_port;			/**< port id */
+
+	struct pmd_internals *pmd;		/**< device internals */
+
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+
+	/* ring info */
+	memif_ring_type_t type;			/**< ring type */
+	memif_ring_t *ring;			/**< pointer to ring */
+	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
+
+	memif_region_index_t region;		/**< shared memory region index */
+	memif_region_offset_t offset;		/**< offset at which the queue begins */
+
+	uint16_t last_head;			/**< last ring head */
+	uint16_t last_tail;			/**< last ring tail */
+
+	/* rx/tx info */
+	uint64_t n_pkts;			/**< number of rx/tx packets */
+	uint64_t n_bytes;			/**< number of rx/tx bytes */
+	uint64_t n_err;				/**< number of tx errors */
+};
+
+struct pmd_internals {
+	memif_interface_id_t id;		/**< unique id */
+	enum memif_role_t role;			/**< device role */
+	uint32_t flags;				/**< device status flags */
+#define ETH_MEMIF_FLAG_CONNECTING	(1 << 0)
+/**< device is connecting */
+#define ETH_MEMIF_FLAG_CONNECTED	(1 << 1)
+/**< device is connected */
+#define ETH_MEMIF_FLAG_ZERO_COPY	(1 << 2)
+/**< device is zero-copy enabled */
+#define ETH_MEMIF_FLAG_DISABLED		(1 << 3)
+/**< device has not been configured and can not accept connection requests */
+
+	struct ether_addr eth_addr;		/**< mac address */
+	char *socket_filename;			/**< pointer to socket filename */
+	char secret[24]; /**< secret (optional security parameter) */
+
+	struct memif_control_channel *cc;	/**< control channel */
+
+	struct memif_region *regions;		/**< shared memory regions */
+	uint8_t regions_num;			/**< number of regions */
+
+	/* remote info */
+	char remote_name[64];			/**< remote app name */
+	char remote_if_name[64];		/**< remote peer name */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t buffer_size;		/**< buffer size */
+	} cfg;					/**< Configured parameters (max values) */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t buffer_size;		/**< buffer size */
+	} run;
+	/**< Parameters used in active connection */
+
+	char local_disc_string[96];		/**< local disconnect reason */
+	char remote_disc_string[96];		/**< remote disconnect reason */
+
+	struct rte_vdev_device *vdev;		/**< vdev handle */
+};
+
+/**
+ * Unmap shared memory and free regions from memory.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_free_regions(struct pmd_internals *pmd);
+
+/**
+ * Finalize connection establishment process. Map shared memory file
+ * (master role), initialize ring queue, set link status up.
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect(struct rte_eth_dev *dev);
+
+/**
+ * Create shared memory file and initialize ring queue.
+ * Only called by slave when establishing connection
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_init_regions_and_queues(struct rte_eth_dev *dev);
+
+/**
+ * Get memif version string.
+ *
+ * @return
+ *   - memif version string
+ */
+const char *memif_version(void);
+
+#ifndef MFD_HUGETLB
+#ifndef __NR_memfd_create
+
+#if defined __x86_64__
+#define __NR_memfd_create 319
+#elif defined __x86_32__
+#define __NR_memfd_create 1073742143
+#elif defined __arm__
+#define __NR_memfd_create 385
+#elif defined __aarch64__
+#define __NR_memfd_create 279
+#elif defined __powerpc__
+#define __NR_memfd_create 360
+#elif defined __i386__
+#define __NR_memfd_create 356
+#else
+#error "__NR_memfd_create unknown for this architecture"
+#endif
+
+#endif				/* __NR_memfd_create */
+
+static inline int memfd_create(const char *name, unsigned int flags)
+{
+	return syscall(__NR_memfd_create, name, flags);
+}
+#endif				/* MFD_HUGETLB */
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef MFD_ALLOW_SEALING
+#define MFD_ALLOW_SEALING       0x0002U
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL     0x0001	/* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002	/* prevent file from shrinking */
+#define F_SEAL_GROW     0x0004	/* prevent file from growing */
+#define F_SEAL_WRITE    0x0008	/* prevent writes */
+#endif
+
+#endif				/* RTE_ETH_MEMIF_H */
diff --git a/drivers/net/memif/rte_pmd_memif_version.map b/drivers/net/memif/rte_pmd_memif_version.map
new file mode 100644
index 000000000..4b2e62193
--- /dev/null
+++ b/drivers/net/memif/rte_pmd_memif_version.map
@@ -0,0 +1,4 @@
+DPDK_19.05 {
+
+        local: *;
+};
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index 3ecc78cee..86ad1ec9e 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -22,6 +22,7 @@ drivers = ['af_packet',
 	'ixgbe',
 	'kni',
 	'liquidio',
+	'memif',
 	'mlx4',
 	'mlx5',
 	'mvneta',
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 262132fc6..d82cb0fb5 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -171,6 +171,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KNI)        += -lrte_pmd_kni
 endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD)        += -lrte_pmd_lio
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF)      += -lrte_pmd_memif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -lrte_pmd_mlx4
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += -lrte_pmd_mlx5 -lmnl
 ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
-- 
2.17.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [dpdk-dev] [RFC v5] /net: memory interface (memif)
  2019-03-22 11:57   ` [dpdk-dev] [RFC v5] " Jakub Grajciar
@ 2019-03-22 11:57     ` Jakub Grajciar
  2019-03-25 20:58     ` Ferruh Yigit
  2019-05-09  8:30     ` [dpdk-dev] [RFC v6] " Jakub Grajciar
  2 siblings, 0 replies; 58+ messages in thread
From: Jakub Grajciar @ 2019-03-22 11:57 UTC (permalink / raw)
  To: dev; +Cc: Jakub Grajciar

Memory interface (memif), provides high performance
packet transfer over shared memory.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
---
 MAINTAINERS                                 |    6 +
 config/common_base                          |    5 +
 config/common_linux                         |    1 +
 doc/guides/nics/features/memif.ini          |   14 +
 doc/guides/nics/index.rst                   |    1 +
 doc/guides/nics/memif.rst                   |  200 ++++
 doc/guides/rel_notes/release_19_05.rst      |    4 +
 drivers/net/Makefile                        |    1 +
 drivers/net/memif/Makefile                  |   28 +
 drivers/net/memif/memif.h                   |  179 +++
 drivers/net/memif/memif_socket.c            | 1104 ++++++++++++++++++
 drivers/net/memif/memif_socket.h            |  104 ++
 drivers/net/memif/meson.build               |   13 +
 drivers/net/memif/rte_eth_memif.c           | 1132 +++++++++++++++++++
 drivers/net/memif/rte_eth_memif.h           |  203 ++++
 drivers/net/memif/rte_pmd_memif_version.map |    4 +
 drivers/net/meson.build                     |    1 +
 mk/rte.app.mk                               |    1 +
 18 files changed, 3001 insertions(+)
 create mode 100644 doc/guides/nics/features/memif.ini
 create mode 100644 doc/guides/nics/memif.rst
 create mode 100644 drivers/net/memif/Makefile
 create mode 100644 drivers/net/memif/memif.h
 create mode 100644 drivers/net/memif/memif_socket.c
 create mode 100644 drivers/net/memif/memif_socket.h
 create mode 100644 drivers/net/memif/meson.build
 create mode 100644 drivers/net/memif/rte_eth_memif.c
 create mode 100644 drivers/net/memif/rte_eth_memif.h
 create mode 100644 drivers/net/memif/rte_pmd_memif_version.map

requires patch: http://patchwork.dpdk.org/patch/51511/
Memif uses a unix domain socket as control channel (cc).
rte_interrupts.h is used to handle interrupts for this cc.
This patch is required, because the message received can
be a reason for disconnecting.

v3:
- coding style fixes
- documentation
- doxygen comments
- use strlcpy() instead of strncpy()
- use RTE_BUILD_BUG_ON instead of _Static_assert()
- fix meson build deps

v4:
- coding style fixes
- doc update (desc format, messaging, features, ...)
- pointer arithmetic fix
- __rte_packed and __rte_aligned instead of __attribute__()
- fixed multi-queue
- support additional archs (memfd_create syscall)

v5:
- coding style fixes
- add release notes
- remove unused dependencies
- doc update

diff --git a/MAINTAINERS b/MAINTAINERS
index 452b8eb82..145b5282a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -790,6 +790,12 @@ F: drivers/net/softnic/
 F: doc/guides/nics/features/softnic.ini
 F: doc/guides/nics/softnic.rst
 
+Memif PMD
+M: Jakub Grajciar <jgrajcia@cisco.com>
+F: drivers/net/memif/
+F: doc/guides/nics/memif.rst
+F: doc/guides/nics/features/memif.ini
+
 
 Crypto Drivers
 --------------
diff --git a/config/common_base b/config/common_base
index 0b09a9348..eb6055ea8 100644
--- a/config/common_base
+++ b/config/common_base
@@ -416,6 +416,11 @@ CONFIG_RTE_LIBRTE_VMXNET3_DEBUG_TX_FREE=n
 #
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 
+#
+# Compile Memory Interface PMD driver (Linux only)
+#
+CONFIG_RTE_LIBRTE_PMD_MEMIF=n
+
 #
 # Compile link bonding PMD library
 #
diff --git a/config/common_linux b/config/common_linux
index 75334273d..87514fe4f 100644
--- a/config/common_linux
+++ b/config/common_linux
@@ -19,6 +19,7 @@ CONFIG_RTE_LIBRTE_VHOST_POSTCOPY=n
 CONFIG_RTE_LIBRTE_PMD_VHOST=y
 CONFIG_RTE_LIBRTE_IFC_PMD=y
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
+CONFIG_RTE_LIBRTE_PMD_MEMIF=y
 CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
 CONFIG_RTE_LIBRTE_PMD_TAP=y
 CONFIG_RTE_LIBRTE_AVP_PMD=y
diff --git a/doc/guides/nics/features/memif.ini b/doc/guides/nics/features/memif.ini
new file mode 100644
index 000000000..807d9ecdc
--- /dev/null
+++ b/doc/guides/nics/features/memif.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'memif' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Link status          = Y
+Basic stats          = Y
+Jumbo frame          = Y
+ARMv8                = Y
+Power8               = Y
+x86-32               = Y
+x86-64               = Y
+Usage doc            = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 5c80e3baa..167e0dacb 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -34,6 +34,7 @@ Network Interface Controller Drivers
     intel_vf
     kni
     liquidio
+    memif
     mlx4
     mlx5
     mvneta
diff --git a/doc/guides/nics/memif.rst b/doc/guides/nics/memif.rst
new file mode 100644
index 000000000..7cc547105
--- /dev/null
+++ b/doc/guides/nics/memif.rst
@@ -0,0 +1,200 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2018-2019 Cisco Systems, Inc.
+
+======================
+Memif Poll Mode Driver
+======================
+Shared memory packet interface (memif) PMD allows for DPDK and any other client
+using memif (DPDK, VPP, libmemif) to communicate using shared memory. Memif is
+Linux only.
+
+The created device transmits packets in a raw format. It can be used with
+Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
+supported in DPDK memif implementation.
+
+Memif works in two roles: master and slave. Slave connects to master over an
+existing socket. It is also a producer of shared memory file and initializes
+the shared memory. Master creates the socket and listens for any slave
+connection requests. The socket may already exist on the system. Be sure to
+remove any such sockets, if you are creating a master interface, or you will
+see an "Address already in use" error. Function ``rte_pmd_memif_remove()``,
+which removes memif interface, will also remove a listener socket, if it is
+not being used by any other interface.
+
+The method to enable one or more interfaces is to use the
+``--vdev=net_memif0`` option on the DPDK application command line. Each
+``--vdev=net_memif1`` option given will create an interface named net_memif0,
+net_memif1, and so on. Memif uses unix domain socket to transmit control
+messages. Each memif has a unique id per socket. If you are connecting multiple
+interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``,
+etc. Note that if you assign a socket to a master interface it becomes a
+listener socket. Listener socket can not be used by a slave interface on same
+client.
+
+.. csv-table:: **Memif configuration options**
+   :header: "Option", "Description", "Default", "Valid value"
+
+   "id=0", "Each memif on same socket must be given a unique id", "0", "uint32_t"
+   "role=master", "Set memif role", "slave", "master|slave"
+   "bsize=1024", "Size of packet buffer", "2048", "uint16_t"
+   "rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", "10", "1-14"
+   "nrxq=2", "Number of RX queues", "1", "255"
+   "ntxq=2", "Number of TX queues", "1", "255"
+   "socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 256"
+   "mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef", ""
+   "secret=abc123", "Secret is an optional security option, which if specified, must be matched by peer", "", "string len 24"
+   "zero-copy=yes", "Enable/disable zero-copy slave mode", "no", "yes|no"
+
+
+**Connection establishment**
+
+In order to create memif connection, two memif interfaces, each in separate
+process, are needed. Once interface in ``master`` role and other in
+``slave`` role. It is not possible to connect two interfaces in a single
+process.
+
+Memif driver uses unix domain socket to exchange required information between
+memif interfaces. Socket file path is specified at interface creation see
+*Memif configuration options* table above. If socket is used by ``master``
+interface, it's marked as listener socket (in scope of current process) and
+listens to connection requests from other processes. One socket can be used by
+multiple interfaces. One process can have ``slave`` and ``master`` interfaces 
+at the same time, provided each role is assigned unique socket.
+
+For detailed information on memif control messages, see: net/memif/memif.h.
+
+Slave interface attempts to make a connection on assigned socket. Process
+listening on this socket will extract the connection request and create a new
+connected socket (control channel). Then it sends the 'hello' message
+(``MEMIF_MSG_TYPE_HELLO``), containing configuration boundaries. Slave interface
+adjusts its configuration accordingly, and sends 'init' message
+(``MEMIF_MSG_TYPE_INIT``). This message among others contains interface id. Driver
+uses this id to find master interface, and assigns the control channel to this
+interface. If such interface is found, 'ack' message (``MEMIF_MSG_TYPE_ACK``) is
+sent. Slave interface sends 'add region' message (``MEMIF_MSG_TYPE_ADD_REGION``) for
+every region allocated. Master responds to each of these messages with 'ack'
+message. Same behavior applies to rings. Slave sends 'add ring' message
+(``MEMIF_MSG_TYPE_ADD_RING``) for every initialized ring. Master again responds to
+each message with 'ack' message. To finalize the connection, slave interface
+sends 'connect' message (``MEMIF_MSG_TYPE_CONNECT``). Upon receiving this message
+master maps regions to its address space, initializes rings and responds with
+'connected' message (``MEMIF_MSG_TYPE_CONNECTED``). Disconnect
+(``MEMIF_MSG_TYPE_DISCONNECT``) can be sent by both master and slave interfaces at
+any time, due to driver error or if the interface is being deleted.
+
+
+Files
+
+- net/memif/memif.h *- control messages definitions*
+- net/memif/memif_socket.h
+- net/memif/memif_socket.c
+
+
+
+**Descriptor format**
+
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Quad|6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | |1|1| | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Word|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | |6|5| | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|0   |length                                                         |region                         |flags                          |
++----+---------------------------------------------------------------+-------------------------------+-------------------------------+
+|1   |metadata                                                       |offset                                                         |
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+
+**Flags field - flags (Quad Word 0, bits 0:15)**
+
++-----+--------------------+------------------------------------------------------------------------------------------------+
+|Bits |Name                |Functionality                                                                                   |
++=====+====================+================================================================================================+
+|0    |MEMIF_DESC_FLAG_NEXT|Is chained buffer. When set, the packet is divided into multiple buffers. May not be contiguous.|
++-----+--------------------+------------------------------------------------------------------------------------------------+
+
+**Region index - region (Quad Word 0, 16:31)**
+
+Index of memory region, the buffer is located in.
+
+**Data length - length (Quad Word 0, 32:63)**
+
+Length of transmitted/received data.
+
+**Data Offset - offset (Quad Word 1, 0:31)**
+
+Data start offset from memory region address. *.regions[desc->region].addr + desc->offset*
+
+**Metadata - metadata (Quad Word 1, 32:63)**
+
+Buffer metadata.
+
+Files
+
+- net/memif/memif.h *- descriptor and ring definitions*
+- net/memif/rte_eth_memif.c *- eth_memif_rx() eth_memif_tx()*
+
+
+
+Example: testpmd and testpmd
+----------------------------
+In this example we run two instances of testpmd application and transmit packets over memif.
+
+First create ``master`` interface::
+
+    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
+
+Now create ``slave`` interface (master must be already running so the slave will connect)::
+
+    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
+
+Set forwarding mode in one of the instances to 'rx only' and the other to 'tx only'::
+
+    testpmd> set fwd rxonly
+    testpmd> start
+
+    testpmd> set fwd txonly
+    testpmd> start
+
+Show status::
+
+    testpmd> show port stats 0
+
+Example: testpmd and VPP
+------------------------
+For information on how to get and run VPP please see `<https://wiki.fd.io/view/VPP>`_.
+
+Start VPP in interactive mode (should be by default). Create memif master interface in VPP::
+
+    vpp# create interface memif id 0 master no-zero-copy
+    vpp# set interface state memif0/0 up
+    vpp# set interface ip address memif0/0 192.168.1.1/24
+
+To see socket filename use show memif command::
+
+    vpp# show memif
+    sockets
+     id  listener    filename
+      0   yes (1)     /run/vpp/memif.sock
+    ...
+
+Now create memif interface by running testpmd with these command line options::
+
+    #./testpmd --vdev=net_memif,socket=/run/vpp/memif.sock -- -i
+
+Testpmd should now create memif slave interface and try to connect to master.
+In testpmd set forward option to icmpecho and start forwarding::
+
+    testpmd> set fwd icmpecho
+    testpmd> start
+
+Send ping from VPP::
+
+    vpp# ping 192.168.1.2
+    64 bytes from 192.168.1.2: icmp_seq=2 ttl=254 time=36.2918 ms
+    64 bytes from 192.168.1.2: icmp_seq=3 ttl=254 time=23.3927 ms
+    64 bytes from 192.168.1.2: icmp_seq=4 ttl=254 time=24.2975 ms
+    64 bytes from 192.168.1.2: icmp_seq=5 ttl=254 time=17.7049 ms
diff --git a/doc/guides/rel_notes/release_19_05.rst b/doc/guides/rel_notes/release_19_05.rst
index 61a2c7383..be55faacc 100644
--- a/doc/guides/rel_notes/release_19_05.rst
+++ b/doc/guides/rel_notes/release_19_05.rst
@@ -91,6 +91,10 @@ New Features
 
   * Added promiscuous mode support.
 
+* **Added memif PMD.**
+
+  Added the new Shared Memory Packet Interface (``memif``) PMD.
+  See the :doc:`../nics/memif` guide for more details on this new driver.
 
 Removed Items
 -------------
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 502869a87..1ecbf5155 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -33,6 +33,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_IAVF_PMD) += iavf
 DIRS-$(CONFIG_RTE_LIBRTE_ICE_PMD) += ice
 DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
 DIRS-$(CONFIG_RTE_LIBRTE_LIO_PMD) += liquidio
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif
 DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
 DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
 DIRS-$(CONFIG_RTE_LIBRTE_MVNETA_PMD) += mvneta
diff --git a/drivers/net/memif/Makefile b/drivers/net/memif/Makefile
new file mode 100644
index 000000000..346cf6473
--- /dev/null
+++ b/drivers/net/memif/Makefile
@@ -0,0 +1,28 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_memif.a
+
+EXPORT_MAP := rte_pmd_memif_version.map
+
+LIBABIVER := 1
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool
+LDLIBS += -lrte_ethdev -lrte_kvargs
+LDLIBS += -lrte_hash
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += rte_eth_memif.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif_socket.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/memif/memif.h b/drivers/net/memif/memif.h
new file mode 100644
index 000000000..15a56888b
--- /dev/null
+++ b/drivers/net/memif/memif.h
@@ -0,0 +1,179 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_H_
+#define _MEMIF_H_
+
+#define MEMIF_COOKIE		0x3E31F20
+#define MEMIF_VERSION_MAJOR	2
+#define MEMIF_VERSION_MINOR	0
+#define MEMIF_VERSION		((MEMIF_VERSION_MAJOR << 8) | MEMIF_VERSION_MINOR)
+#define MEMIF_NAME_SZ		32
+
+/*
+ * S2M: direction slave -> master
+ * M2S: direction master -> slave
+ */
+
+/*
+ *  Type definitions
+ */
+
+typedef enum memif_msg_type {
+	MEMIF_MSG_TYPE_NONE,
+	MEMIF_MSG_TYPE_ACK,
+	MEMIF_MSG_TYPE_HELLO,
+	MEMIF_MSG_TYPE_INIT,
+	MEMIF_MSG_TYPE_ADD_REGION,
+	MEMIF_MSG_TYPE_ADD_RING,
+	MEMIF_MSG_TYPE_CONNECT,
+	MEMIF_MSG_TYPE_CONNECTED,
+	MEMIF_MSG_TYPE_DISCONNECT,
+} memif_msg_type_t;
+
+typedef enum {
+	MEMIF_RING_S2M, /**< buffer ring in direction slave -> master */
+	MEMIF_RING_M2S, /**< buffer ring in direction master -> slave */
+} memif_ring_type_t;
+
+typedef enum {
+	MEMIF_INTERFACE_MODE_ETHERNET,
+	MEMIF_INTERFACE_MODE_IP,
+	MEMIF_INTERFACE_MODE_PUNT_INJECT,
+} memif_interface_mode_t;
+
+typedef uint16_t memif_region_index_t;
+typedef uint32_t memif_region_offset_t;
+typedef uint64_t memif_region_size_t;
+typedef uint16_t memif_ring_index_t;
+typedef uint32_t memif_interface_id_t;
+typedef uint16_t memif_version_t;
+typedef uint8_t memif_log2_ring_size_t;
+
+/*
+ *  Socket messages
+ */
+
+ /**
+  * M2S
+  * Contains master interfaces configuration.
+  */
+typedef struct __rte_packed {
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+	memif_version_t min_version; /**< lowest supported memif version */
+	memif_version_t max_version; /**< highest supported memif version */
+	memif_region_index_t max_region; /**< maximum num of regions */
+	memif_ring_index_t max_m2s_ring; /**< maximum num of M2S ring */
+	memif_ring_index_t max_s2m_ring; /**< maximum num of S2M rings */
+	memif_log2_ring_size_t max_log2_ring_size; /**< maximum ring size (as log2) */
+} memif_msg_hello_t;
+
+/**
+ * S2M
+ * Contains information required to identify interface
+ * to which the slave wants to connect.
+ */
+typedef struct __rte_packed {
+	memif_version_t version;		/**< memif version */
+	memif_interface_id_t id;		/**< interface id */
+	memif_interface_mode_t mode:8;		/**< interface mode */
+	uint8_t secret[24];			/**< optional security parameter */
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+} memif_msg_init_t;
+
+/**
+ * S2M
+ * Request master to add new shared memory region to master interface.
+ * Shared files file descriptor is passed in cmsghdr.
+ */
+typedef struct __rte_packed {
+	memif_region_index_t index;		/**< shm regions index */
+	memif_region_size_t size;		/**< shm region size */
+} memif_msg_add_region_t;
+
+/**
+ * S2M
+ * Request master to add new ring to master interface.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_MSG_ADD_RING_FLAG_S2M 1		/**< ring is in S2M direction */
+	memif_ring_index_t index;		/**< ring index */
+	memif_region_index_t region; /**< region index on which this ring is located */
+	memif_region_offset_t offset;		/**< buffer start offset */
+	memif_log2_ring_size_t log2_ring_size;	/**< ring size (log2) */
+	uint16_t private_hdr_size;		/**< used for private metadata */
+} memif_msg_add_ring_t;
+
+/**
+ * S2M
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< slave interface name */
+} memif_msg_connect_t;
+
+/**
+ * M2S
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< master interface name */
+} memif_msg_connected_t;
+
+/**
+ * S2M & M2S
+ * Disconnect interfaces.
+ */
+typedef struct __rte_packed {
+	uint32_t code;				/**< error code */
+	uint8_t string[96];			/**< disconnect reason */
+} memif_msg_disconnect_t;
+
+typedef struct __rte_packed __rte_aligned(128)
+{
+	memif_msg_type_t type:16;
+	union {
+		memif_msg_hello_t hello;
+		memif_msg_init_t init;
+		memif_msg_add_region_t add_region;
+		memif_msg_add_ring_t add_ring;
+		memif_msg_connect_t connect;
+		memif_msg_connected_t connected;
+		memif_msg_disconnect_t disconnect;
+	};
+} memif_msg_t;
+
+/*
+ *  Ring and Descriptor Layout
+ */
+
+/**
+ * Buffer descriptor.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_DESC_FLAG_NEXT 1			/**< is chained buffer */
+	memif_region_index_t region; /**< region index on which the buffer is located */
+	uint32_t length;			/**< buffer length */
+	memif_region_offset_t offset;		/**< buffer offset */
+	uint32_t metadata;
+} memif_desc_t;
+
+#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
+	uint8_t mark[0] __rte_aligned(RTE_CACHE_LINE_SIZE)
+
+typedef struct {
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline0);
+	uint32_t cookie;			/**< MEMIF_COOKIE */
+	uint16_t flags;				/**< flags */
+#define MEMIF_RING_FLAG_MASK_INT 1		/**< disable interrupt mode */
+	volatile uint16_t head;			/**< pointer to ring buffer head */
+	 MEMIF_CACHELINE_ALIGN_MARK(cacheline1);
+	volatile uint16_t tail;			/**< pointer to ring buffer tail */
+	 MEMIF_CACHELINE_ALIGN_MARK(cacheline2);
+	memif_desc_t desc[0];			/**< buffer descriptors */
+} memif_ring_t;
+
+#endif				/* _MEMIF_H_ */
diff --git a/drivers/net/memif/memif_socket.c b/drivers/net/memif/memif_socket.c
new file mode 100644
index 000000000..ad6a5e483
--- /dev/null
+++ b/drivers/net/memif/memif_socket.c
@@ -0,0 +1,1104 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <errno.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_hash.h>
+#include <rte_jhash.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+static void memif_intr_handler(void *arg);
+
+static ssize_t
+memif_msg_send(int fd, memif_msg_t *msg, int afd)
+{
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	struct cmsghdr *cmsg;
+	char ctl[CMSG_SPACE(sizeof(int))];
+
+	iov[0].iov_base = msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+
+	if (afd > 0) {
+		memset(&ctl, 0, sizeof(ctl));
+		mh.msg_control = ctl;
+		mh.msg_controllen = sizeof(ctl);
+		cmsg = CMSG_FIRSTHDR(&mh);
+		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+		cmsg->cmsg_level = SOL_SOCKET;
+		cmsg->cmsg_type = SCM_RIGHTS;
+		rte_memcpy(CMSG_DATA(cmsg), &afd, sizeof(int));
+	}
+
+	return sendmsg(fd, &mh, 0);
+}
+
+static int
+memif_msg_send_from_queue(struct memif_control_channel *cc)
+{
+	ssize_t size;
+	int ret = 0;
+	struct memif_msg_queue_elt *e;
+
+	e = TAILQ_FIRST(&cc->msg_queue);
+	if (e == NULL)
+		return 0;
+
+	size = memif_msg_send(cc->intr_handle.fd, &e->msg, e->fd);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(ERR, "sendmsg fail: %s.", strerror(errno));
+		ret = -1;
+	} else {
+		MIF_LOG(DEBUG, "Sent msg type %u.", e->msg.type);
+	}
+	TAILQ_REMOVE(&cc->msg_queue, e, next);
+	rte_free(e);
+
+	return ret;
+}
+
+static struct memif_msg_queue_elt *
+memif_msg_enq(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = rte_zmalloc("memif_msg",
+						    sizeof(struct
+							   memif_msg_queue_elt), 0);
+
+	if (e == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control message.");
+		return NULL;
+	}
+
+	e->fd = -1;
+	TAILQ_INSERT_TAIL(&cc->msg_queue, e, next);
+
+	return e;
+}
+
+void
+memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			 int err_code)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	struct pmd_internals *pmd;
+	memif_msg_disconnect_t *d;
+
+	if (e == NULL) {
+		MIF_LOG(WARNING, "Failed to enqueue disconnect message.");
+		return;
+	}
+
+	pmd = cc->dev->data->dev_private;
+	d = &e->msg.disconnect;
+
+	e->msg.type = MEMIF_MSG_TYPE_DISCONNECT;
+	d->code = err_code;
+
+	if (reason != NULL) {
+		strlcpy((char *)d->string, reason, sizeof(d->string));
+		if (cc->dev != NULL) {
+			strlcpy(pmd->local_disc_string, reason,
+				sizeof(pmd->local_disc_string));
+		}
+	}
+}
+
+static int
+memif_msg_enq_hello(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	memif_msg_hello_t *h;
+
+	if (e == NULL)
+		return -1;
+
+	h = &e->msg.hello;
+
+	e->msg.type = MEMIF_MSG_TYPE_HELLO;
+	h->min_version = MEMIF_VERSION;
+	h->max_version = MEMIF_VERSION;
+	h->max_s2m_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_m2s_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_region = ETH_MEMIF_MAX_REGION_IDX;
+	h->max_log2_ring_size = ETH_MEMIF_MAX_LOG2_RING_SIZE;
+
+	strlcpy((char *)h->name, rte_version(), sizeof(h->name));
+
+	return 0;
+}
+
+static int
+memif_msg_receive_hello(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_hello_t *h = &msg->hello;
+
+	if (h->min_version > MEMIF_VERSION || h->max_version < MEMIF_VERSION) {
+		memif_msg_enq_disconnect(pmd->cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	/* Set parameters for active connection */
+	pmd->run.num_s2m_rings = RTE_MIN(h->max_s2m_ring + 1,
+					   pmd->cfg.num_s2m_rings);
+	pmd->run.num_m2s_rings = RTE_MIN(h->max_m2s_ring + 1,
+					   pmd->cfg.num_m2s_rings);
+	pmd->run.log2_ring_size = RTE_MIN(h->max_log2_ring_size,
+					    pmd->cfg.log2_ring_size);
+	pmd->run.buffer_size = pmd->cfg.buffer_size;
+
+	strlcpy(pmd->remote_name, (char *)h->name, sizeof(pmd->remote_name));
+
+	MIF_LOG(DEBUG, "%s: Connecting to %s.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_init(struct memif_control_channel *cc, memif_msg_t *msg)
+{
+	memif_msg_init_t *i = &msg->init;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *pmd;
+	struct rte_eth_dev *dev;
+
+	if (i->version != MEMIF_VERSION) {
+		memif_msg_enq_disconnect(cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	if (cc->socket == NULL) {
+		memif_msg_enq_disconnect(cc, "Device error", 0);
+		return -1;
+	}
+
+	/* Find device with requested ID */
+	TAILQ_FOREACH(elt, &cc->socket->dev_queue, next) {
+		dev = elt->dev;
+		pmd = dev->data->dev_private;
+		if (((pmd->flags & ETH_MEMIF_FLAG_DISABLED) == 0) &&
+		    pmd->id == i->id) {
+			/* assign control channel to device */
+			cc->dev = dev;
+			pmd->cc = cc;
+
+			if (i->mode != MEMIF_INTERFACE_MODE_ETHERNET) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Only ethernet mode supported",
+							 0);
+				return -1;
+			}
+
+			if (pmd->flags & (ETH_MEMIF_FLAG_CONNECTING |
+					   ETH_MEMIF_FLAG_CONNECTED)) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Already connected", 0);
+				return -1;
+			}
+			strlcpy(pmd->remote_name, (char *)i->name,
+				sizeof(pmd->remote_name));
+
+			if (*pmd->secret != '\0') {
+				if (*i->secret == '\0') {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Secret required", 0);
+					return -1;
+				}
+				if (strcmp(pmd->secret, (char *)i->secret) != 0) {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Incorrect secret", 0);
+					return -1;
+				}
+			}
+
+			pmd->flags |= ETH_MEMIF_FLAG_CONNECTING;
+			return 0;
+		}
+	}
+
+	/* ID not found on this socket */
+	MIF_LOG(DEBUG, "ID %u not found.", i->id);
+	memif_msg_enq_disconnect(cc, "ID not found", 0);
+	return -1;
+}
+
+static int
+memif_msg_receive_add_region(struct rte_eth_dev *dev, memif_msg_t *msg,
+			     int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_region_t *ar = &msg->add_region;
+	struct memif_region *mr;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing region fd", 0);
+		return -1;
+	}
+
+	if (ar->index > ETH_MEMIF_MAX_REGION_IDX) {
+		memif_msg_enq_disconnect(pmd->cc, "Invalid region index", 0);
+		return -1;
+	}
+
+	mr = rte_realloc(pmd->regions, sizeof(struct memif_region) *
+			 (ar->index + 1), 0);
+	if (mr == NULL) {
+		memif_msg_enq_disconnect(pmd->cc, "Device error", 0);
+		return -1;
+	}
+
+	pmd->regions = mr;
+	pmd->regions[ar->index].fd = fd;
+	pmd->regions[ar->index].region_size = ar->size;
+	pmd->regions[ar->index].addr = NULL;
+	pmd->regions_num++;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_add_ring(struct rte_eth_dev *dev, memif_msg_t *msg, int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_ring_t *ar = &msg->add_ring;
+	struct memif_queue *mq;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing interrupt fd", 0);
+		return -1;
+	}
+
+	/* check if we have enough queues */
+	if (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) {
+		if (ar->index >= pmd->cfg.num_s2m_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_s2m_rings++;
+	} else {
+		if (ar->index >= pmd->cfg.num_m2s_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_m2s_rings++;
+	}
+
+	mq = (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) ?
+	    dev->data->rx_queues[ar->index] : dev->data->tx_queues[ar->index];
+
+	mq->intr_handle.fd = fd;
+	mq->log2_ring_size = ar->log2_ring_size;
+	mq->region = ar->region;
+	mq->offset = ar->offset;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connect_t *c = &msg->connect;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connected(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connected_t *c = &msg->connected;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_disconnect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_disconnect_t *d = &msg->disconnect;
+
+	memset(pmd->remote_disc_string, 0, sizeof(pmd->remote_disc_string));
+	strlcpy(pmd->remote_disc_string, (char *)d->string,
+		sizeof(pmd->remote_disc_string));
+
+	MIF_LOG(INFO, "%s: Disconnect received: %s",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_disc_string);
+
+	memset(pmd->local_disc_string, 0, 96);
+	memif_disconnect(rte_eth_dev_allocated
+			 (rte_vdev_device_name(pmd->vdev)));
+	return 0;
+}
+
+static int
+memif_msg_enq_ack(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	e->msg.type = MEMIF_MSG_TYPE_ACK;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_init(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_init_t *i = &e->msg.init;
+
+	if (e == NULL)
+		return -1;
+
+	i = &e->msg.init;
+	e->msg.type = MEMIF_MSG_TYPE_INIT;
+	i->version = MEMIF_VERSION;
+	i->id = pmd->id;
+	i->mode = MEMIF_INTERFACE_MODE_ETHERNET;
+
+	strlcpy((char *)i->name, rte_version(), sizeof(i->name));
+
+	if (*pmd->secret != '\0')
+		strlcpy((char *)i->secret, pmd->secret, sizeof(i->secret));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_region(struct rte_eth_dev *dev, uint8_t idx)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_add_region_t *ar;
+	struct memif_region *mr = &pmd->regions[idx];
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_region;
+	e->msg.type = MEMIF_MSG_TYPE_ADD_REGION;
+	e->fd = mr->fd;
+	ar->index = idx;
+	ar->size = mr->region_size;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_ring(struct rte_eth_dev *dev, uint8_t idx,
+		       memif_ring_type_t type)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	struct memif_queue *mq;
+	memif_msg_add_ring_t *ar;
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_ring;
+	mq = (type == MEMIF_RING_S2M) ? dev->data->tx_queues[idx] :
+	    dev->data->rx_queues[idx];
+
+	e->msg.type = MEMIF_MSG_TYPE_ADD_RING;
+	e->fd = mq->intr_handle.fd;
+	ar->index = idx;
+	ar->offset = mq->offset;
+	ar->region = mq->region;
+	ar->log2_ring_size = mq->log2_ring_size;
+	ar->flags = (type == MEMIF_RING_S2M) ? MEMIF_MSG_ADD_RING_FLAG_S2M : 0;
+	ar->private_hdr_size = 0;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connect_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connect;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECT;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connected(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connected_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connected;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECTED;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static void
+memif_intr_unregister_handler(struct rte_intr_handle *intr_handle, void *arg)
+{
+	struct memif_msg_queue_elt *elt;
+	struct memif_control_channel *cc = arg;
+
+	/* close control channel fd */
+	close(intr_handle->fd);
+	/* clear message queue */
+	while ((elt = TAILQ_FIRST(&cc->msg_queue)) != NULL) {
+		TAILQ_REMOVE(&cc->msg_queue, elt, next);
+		free(elt);
+	}
+	/* free control channel */
+	rte_free(cc);
+}
+
+void
+memif_disconnect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *elt;
+	struct memif_queue *mq;
+	int i;
+	int ret;
+
+	if (pmd->cc != NULL) {
+		/* Clear control message queue (except disconnect message if any). */
+		while ((elt = TAILQ_FIRST(&pmd->cc->msg_queue)) != NULL) {
+			if (elt->msg.type != MEMIF_MSG_TYPE_DISCONNECT) {
+				TAILQ_REMOVE(&pmd->cc->msg_queue, elt, next);
+				free(elt);
+			}
+		}
+		/* send disconnect message (if there is any in queue) */
+		memif_msg_send_from_queue(pmd->cc);
+
+		/* at this point, there should be no more messages in queue */
+		if (TAILQ_FIRST(&pmd->cc->msg_queue) != NULL) {
+			MIF_LOG(WARNING,
+				"%s: Unexpected message(s) in message queue.",
+				rte_vdev_device_name(pmd->vdev));
+		}
+
+		if (pmd->cc->intr_handle.fd > 0) {
+			ret =
+			    rte_intr_callback_unregister(&pmd->cc->intr_handle,
+							 memif_intr_handler,
+							 pmd->cc);
+			/*
+			 * If callback is active (disconnecting based on
+			 * received control message).
+			 */
+			if (ret == -EAGAIN) {
+				/* *INDENT-OFF* */
+				ret = rte_intr_callback_unregister_pending(
+							&pmd->cc->intr_handle,
+							memif_intr_handler,
+							pmd->cc,
+							memif_intr_unregister_handler);
+				/* *INDENT-ON* */
+			} else if (ret > 0) {
+				close(pmd->cc->intr_handle.fd);
+				rte_free(pmd->cc);
+			}
+			if (ret <= 0)
+				MIF_LOG(WARNING,
+					"%s: Failed to unregister control channel callback.",
+					rte_vdev_device_name(pmd->vdev));
+		}
+	}
+
+	/* unconfig interrupts */
+	for (i = 0; i < pmd->cfg.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			rte_intr_disable(&mq->intr_handle);
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+	for (i = 0; i < pmd->cfg.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			rte_intr_disable(&mq->intr_handle);
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+
+	memif_free_regions(pmd);
+
+	dev->data->dev_link.link_status = ETH_LINK_DOWN;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTED;
+	MIF_LOG(DEBUG, "%s: Disconnected.", rte_vdev_device_name(pmd->vdev));
+}
+
+static int
+memif_msg_receive(struct memif_control_channel *cc)
+{
+	char ctl[CMSG_SPACE(sizeof(int)) +
+		 CMSG_SPACE(sizeof(struct ucred))] = { 0 };
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	memif_msg_t msg = { 0 };
+	ssize_t size;
+	int ret = 0;
+	struct ucred *cr __rte_unused;
+	cr = 0;
+	struct cmsghdr *cmsg;
+	int afd = -1;
+	int i;
+	struct pmd_internals *pmd;
+
+	iov[0].iov_base = (void *)&msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+	mh.msg_control = ctl;
+	mh.msg_controllen = sizeof(ctl);
+
+	size = recvmsg(cc->intr_handle.fd, &mh, 0);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(DEBUG, "Invalid message size.");
+		memif_msg_enq_disconnect(cc, "Invalid message size", 0);
+		return -1;
+	}
+	MIF_LOG(DEBUG, "Received msg type: %u.", msg.type);
+
+	cmsg = CMSG_FIRSTHDR(&mh);
+	while (cmsg) {
+		if (cmsg->cmsg_level == SOL_SOCKET) {
+			if (cmsg->cmsg_type == SCM_CREDENTIALS)
+				cr = (struct ucred *)CMSG_DATA(cmsg);
+			else if (cmsg->cmsg_type == SCM_RIGHTS)
+				afd = *(int *)CMSG_DATA(cmsg);
+		}
+		cmsg = CMSG_NXTHDR(&mh, cmsg);
+	}
+
+	if (cc->dev == NULL && msg.type != MEMIF_MSG_TYPE_INIT) {
+		MIF_LOG(DEBUG, "Unexpected message.");
+		memif_msg_enq_disconnect(cc, "Unexpected message", 0);
+		return -1;
+	}
+
+	/* get device from hash data */
+	switch (msg.type) {
+	case MEMIF_MSG_TYPE_ACK:
+		break;
+	case MEMIF_MSG_TYPE_HELLO:
+		ret = memif_msg_receive_hello(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_init_regions_and_queues(cc->dev);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_init(cc->dev);
+		if (ret < 0)
+			goto exit;
+		pmd = cc->dev->data->dev_private;
+		for (i = 0; i < pmd->regions_num; i++) {
+			ret = memif_msg_enq_add_region(cc->dev, i);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_S2M);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_M2S);
+			if (ret < 0)
+				goto exit;
+		}
+		ret = memif_msg_enq_connect(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_INIT:
+		/*
+		 * This cc does not have an interface asociated with it.
+		 * If suitable interface is found it will be assigned here.
+		 */
+		ret = memif_msg_receive_init(cc, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_REGION:
+		ret = memif_msg_receive_add_region(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_RING:
+		ret = memif_msg_receive_add_ring(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECT:
+		ret = memif_msg_receive_connect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_connected(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECTED:
+		ret = memif_msg_receive_connected(cc->dev, &msg);
+		break;
+	case MEMIF_MSG_TYPE_DISCONNECT:
+		ret = memif_msg_receive_disconnect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		break;
+	default:
+		memif_msg_enq_disconnect(cc, "Unknown message type", 0);
+		ret = -1;
+		goto exit;
+	}
+
+ exit:
+	return ret;
+}
+
+static void
+memif_intr_handler(void *arg)
+{
+	struct memif_control_channel *cc = arg;
+	struct rte_eth_dev *dev;
+	int ret;
+
+	ret = memif_msg_receive(cc);
+	/* if driver failed to assign device */
+	if (cc->dev == NULL) {
+		ret = rte_intr_callback_unregister_pending(&cc->intr_handle,
+							   memif_intr_handler,
+							   cc,
+							   memif_intr_unregister_handler);
+		if (ret < 0)
+			MIF_LOG(WARNING,
+				"Failed to unregister control channel callback.");
+		return;
+	}
+	/* if memif_msg_receive failed */
+	if (ret < 0)
+		goto disconnect;
+
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto disconnect;
+
+	return;
+
+ disconnect:
+	dev = rte_eth_dev_allocated(rte_vdev_device_name(
+		((struct pmd_internals *)cc->dev->data->dev_private)->vdev));
+	if (dev == NULL) {
+		MIF_LOG(WARNING, "eth dev not allocated");
+		return;
+	}
+	memif_disconnect(dev);
+}
+
+static void
+memif_listener_handler(void *arg)
+{
+	struct memif_socket *socket = arg;
+	int sockfd;
+	int addr_len;
+	struct sockaddr_un client;
+	struct memif_control_channel *cc;
+	int ret;
+
+	addr_len = sizeof(client);
+	sockfd = accept(socket->intr_handle.fd, (struct sockaddr *)&client,
+			(socklen_t *)&addr_len);
+	if (sockfd < 0) {
+		MIF_LOG(ERR,
+			"Failed to accept connection request on socket fd %d",
+			socket->intr_handle.fd);
+		return;
+	}
+
+	MIF_LOG(DEBUG, "%s: Connection request accepted.", socket->filename);
+
+	cc = rte_zmalloc("memif-cc", sizeof(struct memif_control_channel), 0);
+	if (cc == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control channel.");
+		goto error;
+	}
+
+	cc->intr_handle.fd = sockfd;
+	cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	cc->socket = socket;
+	cc->dev = NULL;
+	TAILQ_INIT(&cc->msg_queue);
+
+	ret =
+	    rte_intr_callback_register(&cc->intr_handle, memif_intr_handler,
+				       cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to register control channel callback.");
+		goto error;
+	}
+
+	ret = memif_msg_enq_hello(cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to enqueue hello message.");
+		goto error;
+	}
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto error;
+
+	return;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (cc != NULL) {
+		rte_free(cc);
+		cc = NULL;
+	}
+}
+
+static struct memif_socket *
+memif_socket_create(struct pmd_internals *pmd, char *key, uint8_t listener)
+{
+	struct memif_socket *sock;
+	struct sockaddr_un un;
+	int sockfd;
+	int ret;
+	int on = 1;
+
+	sock = rte_zmalloc("memif-socket", sizeof(struct memif_socket), 0);
+	if (sock == NULL) {
+		MIF_LOG(ERR, "Failed to allocate memory for memif socket");
+		return NULL;
+	}
+
+	sock->listener = listener;
+	rte_memcpy(sock->filename, key, 256);
+	TAILQ_INIT(&sock->dev_queue);
+
+	if (listener != 0) {
+		sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+		if (sockfd < 0)
+			goto error;
+
+		un.sun_family = AF_UNIX;
+		memcpy(un.sun_path, sock->filename,
+			sizeof(un.sun_path) - 1);
+
+		ret = setsockopt(sockfd, SOL_SOCKET, SO_PASSCRED, &on,
+				 sizeof(on));
+		if (ret < 0)
+			goto error;
+		ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
+		if (ret < 0)
+			goto error;
+		ret = listen(sockfd, 1);
+		if (ret < 0)
+			goto error;
+
+		MIF_LOG(DEBUG, "%s: Memif listener socket %s created.",
+			rte_vdev_device_name(pmd->vdev), sock->filename);
+
+		sock->intr_handle.fd = sockfd;
+		sock->intr_handle.type = RTE_INTR_HANDLE_EXT;
+		ret = rte_intr_callback_register(&sock->intr_handle,
+						 memif_listener_handler, sock);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to register interrupt "
+				"callback for listener socket",
+				rte_vdev_device_name(pmd->vdev));
+			return NULL;
+		}
+	}
+
+	return sock;
+
+ error:
+	MIF_LOG(ERR, "%s: Failed to setup socket %s: %s",
+		rte_vdev_device_name(pmd->vdev), key, strerror(errno));
+	if (sock != NULL)
+		rte_free(sock);
+	return NULL;
+}
+
+static struct rte_hash *
+memif_create_socket_hash(void)
+{
+	struct rte_hash_parameters params = { 0 };
+	params.name = MEMIF_SOCKET_HASH_NAME;
+	params.entries = 256;
+	params.key_len = 256;
+	params.hash_func = rte_jhash;
+	params.hash_func_init_val = 0;
+	return rte_hash_create(&params);
+}
+
+int
+memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *tmp_pmd;
+	struct rte_hash *hash;
+	int ret;
+	char key[256];
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL) {
+		hash = memif_create_socket_hash();
+		if (hash == NULL) {
+			MIF_LOG(ERR, "Failed to create memif socket hash.");
+			return -1;
+		}
+	}
+
+	memset(key, 0, 256);
+	rte_memcpy(key, socket_filename, strlen(socket_filename));
+	ret = rte_hash_lookup_data(hash, key, (void **)&socket);
+	if (ret < 0) {
+		socket = memif_socket_create(pmd, key,
+					     (pmd->role ==
+					      MEMIF_ROLE_SLAVE) ? 0 : 1);
+		if (socket == NULL)
+			return -1;
+		ret = rte_hash_add_key_data(hash, key, socket);
+		if (ret < 0) {
+			MIF_LOG(ERR, "Failed to add socket to socket hash.");
+			return ret;
+		}
+	}
+	pmd->socket_filename = socket->filename;
+
+	if (socket->listener != 0 && pmd->role == MEMIF_ROLE_SLAVE) {
+		MIF_LOG(ERR, "Socket is a listener.");
+		return -1;
+	} else if ((socket->listener == 0) && (pmd->role == MEMIF_ROLE_MASTER)) {
+		MIF_LOG(ERR, "Socket is not a listener.");
+		return -1;
+	}
+
+	TAILQ_FOREACH(elt, &socket->dev_queue, next) {
+		tmp_pmd = elt->dev->data->dev_private;
+		if (tmp_pmd->id == pmd->id) {
+			MIF_LOG(ERR, "Memif device with id %d already "
+				"exists on socket %s",
+				pmd->id, socket->filename);
+			return -1;
+		}
+	}
+
+	elt =
+	    rte_malloc("pmd-queue", sizeof(struct memif_socket_dev_list_elt),
+		       0);
+	if (elt == NULL) {
+		MIF_LOG(ERR, "%s: Failed to add device to socket device list.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+	elt->dev = dev;
+	TAILQ_INSERT_TAIL(&socket->dev_queue, elt, next);
+
+	return 0;
+}
+
+void
+memif_socket_remove_device(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt, *next;
+	struct rte_hash *hash;
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL)
+		return;
+
+	if (rte_hash_lookup_data(hash, pmd->socket_filename, (void **)&socket) <
+	    0)
+		return;
+
+	for (elt = TAILQ_FIRST(&socket->dev_queue); elt != NULL; elt = next) {
+		next = TAILQ_NEXT(elt, next);
+		if (elt->dev == dev) {
+			TAILQ_REMOVE(&socket->dev_queue, elt, next);
+			free(elt);
+			pmd->socket_filename = NULL;
+		}
+	}
+
+	/* remove socket, if this was the last device using it */
+	if (TAILQ_EMPTY(&socket->dev_queue)) {
+		rte_hash_del_key(hash, socket->filename);
+		if (socket->listener) {
+			/* remove listener socket file,
+			 * so we can create new one later.
+			 */
+			remove(socket->filename);
+		}
+		rte_free(socket);
+	}
+}
+
+int
+memif_connect_master(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+	return 0;
+}
+
+int
+memif_connect_slave(struct rte_eth_dev *dev)
+{
+	int sockfd;
+	int ret;
+	struct sockaddr_un sun;
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+
+	sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+	if (sockfd < 0) {
+		MIF_LOG(ERR, "%s: Failed to open socket.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	sun.sun_family = AF_UNIX;
+
+	memcpy(sun.sun_path, pmd->socket_filename, sizeof(sun.sun_path) - 1);
+
+	ret = connect(sockfd, (struct sockaddr *)&sun,
+		      sizeof(struct sockaddr_un));
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to connect socket: %s.",
+			rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+		goto error;
+	}
+
+	MIF_LOG(DEBUG, "%s: Memif socket: %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+
+	pmd->cc = rte_zmalloc("memif-cc",
+			      sizeof(struct memif_control_channel), 0);
+	if (pmd->cc == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate control channel.",
+			rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	pmd->cc->intr_handle.fd = sockfd;
+	pmd->cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	pmd->cc->socket = NULL;
+	pmd->cc->dev = dev;
+	TAILQ_INIT(&pmd->cc->msg_queue);
+
+	ret = rte_intr_callback_register(&pmd->cc->intr_handle,
+					 memif_intr_handler, pmd->cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to register interrupt callback "
+			"for control fd", rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	return 0;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (pmd->cc != NULL) {
+		rte_free(pmd->cc);
+		pmd->cc = NULL;
+	}
+	return -1;
+}
diff --git a/drivers/net/memif/memif_socket.h b/drivers/net/memif/memif_socket.h
new file mode 100644
index 000000000..8caea270b
--- /dev/null
+++ b/drivers/net/memif/memif_socket.h
@@ -0,0 +1,104 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_SOCKET_H_
+#define _MEMIF_SOCKET_H_
+
+#include <sys/queue.h>
+
+/**
+ * Remove device from socket device list. If no device is left on the socket,
+ * remove the socket as well.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_socket_remove_device(struct rte_eth_dev *dev);
+
+/**
+ * Enqueue disconnect message to control channel message queue.
+ *
+ * @param cc
+ *   control channel
+ * @param reason
+ *   const string stating disconnect reason (96 characters)
+ * @param err_code
+ *   error code
+ */
+void memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			      int err_code);
+
+/**
+ * Initialize memif socket for specified device. If socket doesn't exist, create socket.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @param socket_filename
+ *   socket filename
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename);
+
+/**
+ * Disconnect memif device. Close control channel and shared memory.
+ *
+ * @param dev
+ *   ethernet device
+ */
+void memif_disconnect(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, enable connection establishment.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_master(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, send connection request.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_slave(struct rte_eth_dev *dev);
+
+struct memif_socket_dev_list_elt {
+	TAILQ_ENTRY(memif_socket_dev_list_elt) next;
+	struct rte_eth_dev *dev;		/**< pointer to device internals */
+};
+
+#define MEMIF_SOCKET_HASH_NAME			"memif-sh"
+struct memif_socket {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	uint8_t listener;			/**< if not zero socket is listener */
+	char filename[256];			/**< socket filename */
+
+	TAILQ_HEAD(, memif_socket_dev_list_elt) dev_queue;
+	/**< Queue of devices using this socket */
+};
+
+/* Control message queue. */
+struct memif_msg_queue_elt {
+	TAILQ_ENTRY(memif_msg_queue_elt) next;
+	memif_msg_t msg;			/**< control message */
+	int fd;					/**< fd to be sent to peer */
+};
+
+struct memif_control_channel {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	TAILQ_HEAD(, memif_msg_queue_elt) msg_queue; /**< control message queue */
+	struct memif_socket *socket;		/**< pointer to socket */
+	struct rte_eth_dev *dev;		/**< pointer to device */
+};
+
+#endif				/* MEMIF_SOCKET_H */
diff --git a/drivers/net/memif/meson.build b/drivers/net/memif/meson.build
new file mode 100644
index 000000000..4dfe37d3a
--- /dev/null
+++ b/drivers/net/memif/meson.build
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+if host_machine.system() != 'linux'
+        build = false
+endif
+
+sources = files('rte_eth_memif.c',
+		'memif_socket.c')
+
+allow_experimental_apis = true
+
+deps += ['hash']
diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
new file mode 100644
index 000000000..7ef1bca93
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.c
@@ -0,0 +1,1132 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <linux/if_ether.h>
+#include <errno.h>
+#include <sys/eventfd.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+#define ETH_MEMIF_ID_ARG		"id"
+#define ETH_MEMIF_ROLE_ARG		"role"
+#define ETH_MEMIF_BUFFER_SIZE_ARG	"bsize"
+#define ETH_MEMIF_RING_SIZE_ARG		"rsize"
+#define ETH_MEMIF_SOCKET_ARG		"socket"
+#define ETH_MEMIF_MAC_ARG		"mac"
+#define ETH_MEMIF_ZC_ARG		"zero-copy"
+#define ETH_MEMIF_SECRET_ARG		"secret"
+
+static const char *valid_arguments[] = {
+	ETH_MEMIF_ID_ARG,
+	ETH_MEMIF_ROLE_ARG,
+	ETH_MEMIF_BUFFER_SIZE_ARG,
+	ETH_MEMIF_RING_SIZE_ARG,
+	ETH_MEMIF_SOCKET_ARG,
+	ETH_MEMIF_MAC_ARG,
+	ETH_MEMIF_ZC_ARG,
+	ETH_MEMIF_SECRET_ARG,
+	NULL
+};
+
+static struct rte_vdev_driver pmd_memif_drv;
+
+const char *
+memif_version(void)
+{
+	return ("memif-" RTE_STR(MEMIF_VERSION_MAJOR) "." RTE_STR(MEMIF_VERSION_MINOR));
+}
+
+static void
+memif_dev_info(struct rte_eth_dev *dev __rte_unused, struct rte_eth_dev_info *dev_info)
+{
+	dev_info->max_mac_addrs = 1;
+	dev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;
+	dev_info->max_rx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->max_tx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->min_rx_bufsize = 0;
+}
+
+static memif_ring_t *
+memif_get_ring(struct pmd_internals *pmd, memif_ring_type_t type, uint16_t ring_num)
+{
+	/* rings only in region 0 */
+	void *p = pmd->regions[0].addr;
+	int ring_size = sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size);
+
+	p = (uint8_t *)p + (ring_num + type * pmd->run.num_s2m_rings) * ring_size;
+
+	return (memif_ring_t *)p;
+}
+
+static void *
+memif_get_buffer(struct pmd_internals *pmd, memif_desc_t *d)
+{
+	return ((uint8_t *)pmd->regions[d->region].addr + d->offset);
+}
+
+static uint16_t
+eth_memif_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t cur_slot, last_slot, n_slots, ring_size, mask, s0;
+	uint16_t n_rx_pkts = 0;
+	uint16_t mbuf_size = rte_pktmbuf_data_room_size(mq->mempool) -
+		RTE_PKTMBUF_HEADROOM;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head = NULL;
+	uint64_t b;
+	ssize_t size __rte_unused;
+	uint16_t head;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+
+	/* consume interrupt */
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0)
+		size = read(mq->intr_handle.fd, &b, sizeof(b));
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	cur_slot = (type == MEMIF_RING_S2M) ? mq->last_head : mq->last_tail;
+	last_slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+	if (cur_slot == last_slot)
+		goto refill;
+	n_slots = last_slot - cur_slot;
+
+	while (n_slots && n_rx_pkts < nb_pkts) {
+		mbuf_head = rte_pktmbuf_alloc(mq->mempool);
+		if (unlikely(mbuf_head == NULL))
+			goto no_free_bufs;
+		mbuf = mbuf_head;
+		mbuf->port = mq->in_port;
+
+ next_slot:
+		s0 = cur_slot & mask;
+		d0 = &ring->desc[s0];
+
+		src_len = d0->length;
+		dst_off = 0;
+		src_off = 0;
+
+		do {
+			dst_len = mbuf_size - dst_off;
+			if (dst_len == 0) {
+				dst_off = 0;
+				dst_len = mbuf_size + RTE_PKTMBUF_HEADROOM;
+
+				mbuf = rte_pktmbuf_alloc(mq->mempool);
+				if (unlikely(mbuf == NULL))
+					goto no_free_bufs;
+				mbuf->port = mq->in_port;
+				rte_pktmbuf_chain(mbuf_head, mbuf);
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			rte_pktmbuf_pkt_len(mbuf) =
+			    rte_pktmbuf_data_len(mbuf) += cp_len;
+
+			memcpy(rte_pktmbuf_mtod_offset(mbuf, void *, dst_off),
+			       (uint8_t *)memif_get_buffer(pmd, d0) + src_off, cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+		} while (src_len);
+
+		cur_slot++;
+		n_slots--;
+		if (d0->flags & MEMIF_DESC_FLAG_NEXT)
+			goto next_slot;
+
+		*bufs++ = mbuf_head;
+		n_rx_pkts++;
+	}
+
+ no_free_bufs:
+	if (type == MEMIF_RING_S2M) {
+		rte_mb();
+		ring->tail = cur_slot;
+		mq->last_head = cur_slot;
+	} else {
+		mq->last_tail = cur_slot;
+	}
+
+ refill:
+	if (type == MEMIF_RING_M2S) {
+		head = ring->head;
+		n_slots = ring_size - head + mq->last_tail;
+
+		while (n_slots--) {
+			s0 = head++ & mask;
+			d0 = &ring->desc[s0];
+			d0->length = pmd->run.buffer_size;
+		}
+		rte_mb();
+		ring->head = head;
+	}
+
+	mq->n_pkts += n_rx_pkts;
+	return n_rx_pkts;
+}
+
+static uint16_t
+eth_memif_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t slot, saved_slot, n_free, ring_size, mask, n_tx_pkts = 0;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head;
+	uint64_t a;
+	ssize_t size;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	n_free = ring->tail - mq->last_tail;
+	mq->last_tail += n_free;
+	slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+
+	if (type == MEMIF_RING_S2M)
+		n_free = ring_size - ring->head + mq->last_tail;
+	else
+		n_free = ring->head - ring->tail;
+
+	while (n_free && n_tx_pkts < nb_pkts) {
+		mbuf_head = *bufs++;
+		mbuf = mbuf_head;
+
+		saved_slot = slot;
+		d0 = &ring->desc[slot & mask];
+		dst_off = 0;
+		dst_len =
+		    (type ==
+		     MEMIF_RING_S2M) ? pmd->run.buffer_size : d0->length;
+
+ next_in_chain:
+		src_off = 0;
+		src_len = rte_pktmbuf_data_len(mbuf);
+
+		while (src_len) {
+			if (dst_len == 0) {
+				if (n_free) {
+					slot++;
+					n_free--;
+					d0->flags |= MEMIF_DESC_FLAG_NEXT;
+					d0 = &ring->desc[slot & mask];
+					dst_off = 0;
+					dst_len = (type == MEMIF_RING_S2M) ?
+					    pmd->run.buffer_size : d0->length;
+					d0->flags = 0;
+				} else {
+					slot = saved_slot;
+					goto no_free_slots;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			memcpy((uint8_t *)memif_get_buffer(pmd, d0) + dst_off,
+			       rte_pktmbuf_mtod_offset(mbuf, void *, src_off),
+			       cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+			dst_len -= cp_len;
+
+			d0->length = dst_off;
+		}
+
+		if (rte_pktmbuf_is_contiguous(mbuf) == 0) {
+			mbuf = mbuf->next;
+			goto next_in_chain;
+		}
+
+		n_tx_pkts++;
+		slot++;
+		n_free--;
+		rte_pktmbuf_free(mbuf_head);
+	}
+
+ no_free_slots:
+	rte_mb();
+	if (type == MEMIF_RING_S2M)
+		ring->head = slot;
+	else
+		ring->tail = slot;
+
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
+		a = 1;
+		size = write(mq->intr_handle.fd, &a, sizeof(a));
+		if (unlikely(size < 0)) {
+			MIF_LOG(WARNING,
+				"%s: Failed to send interrupt. %s",
+				rte_vdev_device_name(pmd->vdev), strerror(errno));
+		}
+	}
+
+	mq->n_err += nb_pkts - n_tx_pkts;
+	mq->n_pkts += n_tx_pkts;
+	return n_tx_pkts;
+}
+
+void
+memif_free_regions(struct pmd_internals *pmd)
+{
+	int i;
+	struct memif_region *r;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		r = pmd->regions + i;
+		if (r == NULL)
+			return;
+		if (r->addr == NULL)
+			return;
+		munmap(r->addr, r->region_size);
+		if (r->fd > 0) {
+			close(r->fd);
+			r->fd = -1;
+		}
+	}
+	rte_free(pmd->regions);
+}
+
+static int
+memif_alloc_regions(struct pmd_internals *pmd, uint8_t brn)
+{
+	struct memif_region *r;
+	char shm_name[32];
+	int i;
+	int ret = 0;
+
+	r = rte_zmalloc("memif_region", sizeof(struct memif_region) * (brn + 1), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate regions.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	pmd->regions = r;
+	pmd->regions_num = brn + 1;
+
+	/*
+	 * Create shm for every region. Region 0 is reserved for descriptors.
+	 * Other regions contain buffers.
+	 */
+	for (i = 0; i < (brn + 1); i++) {
+		r = &pmd->regions[i];
+
+		r->buffer_offset = (i == 0) ? (pmd->run.num_s2m_rings +
+					       pmd->run.num_m2s_rings) *
+		    (sizeof(memif_ring_t) +
+		     sizeof(memif_desc_t) * (1 << pmd->run.log2_ring_size)) : 0;
+		r->region_size = (i == 0) ? r->buffer_offset :
+		    (uint32_t)(pmd->run.buffer_size *
+				(1 << pmd->run.log2_ring_size) *
+				(pmd->run.num_s2m_rings +
+				 pmd->run.num_m2s_rings));
+
+		memset(shm_name, 0, sizeof(char) * 32);
+		sprintf(shm_name, "memif region %d", i);
+
+		r->fd = memfd_create(shm_name, MFD_ALLOW_SEALING);
+		if (r->fd < 0) {
+			MIF_LOG(ERR, "%s: Failed to create shm file: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+
+		ret = fcntl(r->fd, F_ADD_SEALS, F_SEAL_SHRINK);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to add seals to shm file: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+
+		ret = ftruncate(r->fd, r->region_size);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to truncate shm file: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+
+		r->addr = mmap(NULL, r->region_size, PROT_READ |
+			       PROT_WRITE, MAP_SHARED, r->fd, 0);
+		if (r->addr == NULL) {
+			MIF_LOG(ERR, "%s: Failed to mmap shm region: %s.",
+				rte_vdev_device_name(pmd->vdev),
+				strerror(errno));
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static void
+memif_init_rings(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_ring_t *ring;
+	int i, j;
+	uint16_t slot;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = i * (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 1;
+			ring->desc[j].offset = pmd->regions[1].buffer_offset +
+			    (uint32_t)(slot * pmd->run.buffer_size);
+			ring->desc[j].length = pmd->run.buffer_size;
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = (i + pmd->run.num_s2m_rings) *
+			    (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 1;
+			ring->desc[j].offset = pmd->regions[1].buffer_offset +
+			    (uint32_t)(slot * pmd->run.buffer_size);
+			ring->desc[j].length = pmd->run.buffer_size;
+		}
+	}
+}
+
+/* called only by slave */
+static void
+memif_init_queues(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = dev->data->tx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0].addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for tx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = dev->data->rx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0].addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for rx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+}
+
+int
+memif_init_regions_and_queues(struct rte_eth_dev *dev)
+{
+	int ret;
+
+	ret = memif_alloc_regions(dev->data->dev_private, /* num of buffer regions */ 1);
+	if (ret < 0)
+		return ret;
+
+	memif_init_rings(dev);
+
+	memif_init_queues(dev);
+
+	return 0;
+}
+
+int
+memif_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_region *mr;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		mr = pmd->regions + i;
+		if (mr != NULL) {
+			if (mr->addr == NULL) {
+				if (mr->fd < 0)
+					return -1;
+				mr->addr = mmap(NULL, mr->region_size,
+						PROT_READ | PROT_WRITE,
+						MAP_SHARED, mr->fd, 0);
+				if (mr->addr == NULL)
+					return -1;
+			}
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region].addr +
+			    mq->offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_MASTER)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region].addr +
+			    mq->offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_SLAVE)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags |= ETH_MEMIF_FLAG_CONNECTED;
+	dev->data->dev_link.link_status = ETH_LINK_UP;
+	MIF_LOG(INFO, "%s: Connected.", rte_vdev_device_name(pmd->vdev));
+	return 0;
+}
+
+static int
+memif_dev_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int ret = 0;
+
+	switch (pmd->role) {
+	case MEMIF_ROLE_SLAVE:
+		ret = memif_connect_slave(dev);
+		break;
+	case MEMIF_ROLE_MASTER:
+		ret = memif_connect_master(dev);
+		break;
+	default:
+		MIF_LOG(ERR, "%s: Unknown role: %d.",
+			rte_vdev_device_name(pmd->vdev), pmd->role);
+		ret = -1;
+		break;
+	}
+
+	return ret;
+}
+
+static int
+memif_dev_configure(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	/*
+	 * SLAVE - TXQ
+	 * MASTER - RXQ
+	 */
+	pmd->cfg.num_s2m_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_tx_queues : dev->data->nb_rx_queues;
+
+	/*
+	 * SLAVE - RXQ
+	 * MASTER - TXQ
+	 */
+	pmd->cfg.num_m2s_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_rx_queues : dev->data->nb_tx_queues;
+
+	return 0;
+}
+
+static int
+memif_tx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_tx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("tx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate tx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type =
+	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_S2M : MEMIF_RING_M2S;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->pmd = pmd;
+	dev->data->tx_queues[qid] = mq;
+
+	return 0;
+}
+
+static int
+memif_rx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_rx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_rxconf *rx_conf __rte_unused,
+		     struct rte_mempool *mb_pool)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("rx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate rx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type = (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_M2S : MEMIF_RING_S2M;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->mempool = mb_pool;
+	mq->in_port = dev->data->port_id;
+	mq->pmd = pmd;
+	dev->data->rx_queues[qid] = mq;
+
+	return 0;
+}
+
+static void
+memif_queue_release(void *queue)
+{
+	struct memif_queue *q = (struct memif_queue *)queue;
+
+	if (!q)
+		return;
+
+	rte_free(q);
+}
+
+static int
+memif_link_update(struct rte_eth_dev *dev __rte_unused,
+		  int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
+static int
+memif_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+	uint8_t tmp, nq;
+
+	stats->ipackets = 0;
+	stats->ibytes = 0;
+	stats->opackets = 0;
+	stats->obytes = 0;
+	stats->oerrors = 0;
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_s2m_rings :
+	    pmd->run.num_m2s_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* RX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->rx_queues[i];
+		stats->q_ipackets[i] = mq->n_pkts;
+		stats->q_ibytes[i] = mq->n_bytes;
+		stats->ipackets += mq->n_pkts;
+		stats->ibytes += mq->n_bytes;
+	}
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_m2s_rings :
+	    pmd->run.num_s2m_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* TX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->tx_queues[i];
+		stats->q_opackets[i] = mq->n_pkts;
+		stats->q_obytes[i] = mq->n_bytes;
+		stats->q_errors[i] = mq->n_err;
+		stats->opackets += mq->n_pkts;
+		stats->obytes += mq->n_bytes;
+		stats->oerrors += mq->n_err;
+	}
+	return 0;
+}
+
+static void
+memif_stats_reset(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int i;
+	struct memif_queue *mq;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->tx_queues[i] :
+		    dev->data->rx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->rx_queues[i] :
+		    dev->data->tx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+}
+
+static int
+memif_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	MIF_LOG(WARNING, "%s: Interrupt mode not supported.",
+		rte_vdev_device_name(pmd->vdev));
+
+	return -1;
+}
+
+static int
+memif_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd __rte_unused = dev->data->dev_private;
+
+	return 0;
+}
+
+static const struct eth_dev_ops ops = {
+	.dev_start = memif_dev_start,
+	.dev_infos_get = memif_dev_info,
+	.dev_configure = memif_dev_configure,
+	.tx_queue_setup = memif_tx_queue_setup,
+	.rx_queue_setup = memif_rx_queue_setup,
+	.rx_queue_release = memif_queue_release,
+	.tx_queue_release = memif_queue_release,
+	.rx_queue_intr_enable = memif_rx_queue_intr_enable,
+	.rx_queue_intr_disable = memif_rx_queue_intr_disable,
+	.link_update = memif_link_update,
+	.stats_get = memif_stats_get,
+	.stats_reset = memif_stats_reset,
+};
+
+static int
+memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
+	     memif_interface_id_t id, uint32_t flags,
+	     const char *socket_filename,
+	     memif_log2_ring_size_t log2_ring_size,
+	     uint16_t buffer_size, const char *secret,
+	     struct ether_addr *eth_addr)
+{
+	int ret = 0;
+	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev_data *data;
+	struct pmd_internals *pmd;
+	const unsigned int numa_node = vdev->device.numa_node;
+	const char *name = rte_vdev_device_name(vdev);
+
+	if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
+		MIF_LOG(ERR, "Zero-copy not supported.");
+		return -1;
+	}
+
+	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
+	if (eth_dev == NULL) {
+		MIF_LOG(ERR, "%s: Unable to allocate device struct.", name);
+		return -1;
+	}
+
+	pmd = eth_dev->data->dev_private;
+	memset(pmd, 0, sizeof(*pmd));
+
+	pmd->vdev = vdev;
+	pmd->id = id;
+	pmd->flags = flags;
+	pmd->flags |= ETH_MEMIF_FLAG_DISABLED;
+	pmd->role = role;
+	ret = memif_socket_init(eth_dev, socket_filename);
+	if (ret < 0)
+		return ret;
+
+	memset(pmd->secret, 0, sizeof(char) * 24);
+	if (secret != NULL)
+		strlcpy(pmd->secret, secret, sizeof(pmd->secret));
+
+	pmd->cfg.log2_ring_size = log2_ring_size;
+	/* set in .dev_configure() */
+	pmd->cfg.num_s2m_rings = 0;
+	pmd->cfg.num_m2s_rings = 0;
+
+	pmd->cfg.buffer_size = buffer_size;
+
+	rte_memcpy(&pmd->eth_addr, eth_addr, sizeof(struct ether_addr));
+
+	data = eth_dev->data;
+	data->dev_private = pmd;
+	data->numa_node = numa_node;
+	data->mac_addrs = &pmd->eth_addr;
+
+	eth_dev->dev_ops = &ops;
+	eth_dev->device = &vdev->device;
+	eth_dev->rx_pkt_burst = eth_memif_rx;
+	eth_dev->tx_pkt_burst = eth_memif_tx;
+
+	rte_eth_dev_probing_finish(eth_dev);
+
+	return 0;
+}
+
+static int
+memif_set_role(const char *key __rte_unused, const char *value,
+	       void *extra_args)
+{
+	enum memif_role_t *role = (enum memif_role_t *)extra_args;
+
+	if (strstr(value, "master") != NULL) {
+		*role = MEMIF_ROLE_MASTER;
+	} else if (strstr(value, "slave") != NULL) {
+		*role = MEMIF_ROLE_SLAVE;
+	} else {
+		MIF_LOG(ERR, "Unknown role: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_zc(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	uint32_t *flags = (uint32_t *)extra_args;
+
+	if (strstr(value, "yes") != NULL) {
+		*flags |= ETH_MEMIF_FLAG_ZERO_COPY;
+	} else if (strstr(value, "no") != NULL) {
+		*flags &= ~ETH_MEMIF_FLAG_ZERO_COPY;
+	} else {
+		MIF_LOG(ERR, "Failed to parse zero-copy param: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_id(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	memif_interface_id_t *id = (memif_interface_id_t *)extra_args;
+
+	/* even if parsing fails, 0 is a valid id */
+	*id = strtoul(value, NULL, 10);
+	return 0;
+}
+
+static int
+memif_set_bs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	uint16_t *buffer_size = (uint16_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > 0xFFFF) {
+		MIF_LOG(ERR, "Invalid buffer size: %s.", value);
+		return -EINVAL;
+	}
+	*buffer_size = tmp;
+	return 0;
+}
+
+static int
+memif_set_rs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	memif_log2_ring_size_t *log2_ring_size =
+	    (memif_log2_ring_size_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > ETH_MEMIF_MAX_LOG2_RING_SIZE) {
+		MIF_LOG(ERR, "Invalid ring size: %s (max %u).",
+			value, ETH_MEMIF_MAX_LOG2_RING_SIZE);
+		return -EINVAL;
+	}
+	*log2_ring_size = tmp;
+	return 0;
+}
+
+/* check if directory exists and if we have permission to read/write */
+static int
+memif_check_socket_filename(const char *filename)
+{
+	char *dir = NULL, *tmp;
+	uint32_t idx;
+	int ret = 0;
+
+	tmp = strrchr(filename, '/');
+	if (tmp != NULL) {
+		idx = tmp - filename;
+		dir = rte_zmalloc("memif_tmp", sizeof(char) * (idx + 1), 0);
+		if (dir == NULL) {
+			MIF_LOG(ERR, "Failed to allocate memory.");
+			return -1;
+		}
+		strlcpy(dir, filename, sizeof(char) * (idx + 1));
+	}
+
+	if (dir == NULL || (faccessat(-1, dir, F_OK | R_OK |
+					W_OK, AT_EACCESS) < 0)) {
+		MIF_LOG(ERR, "Invalid directory: '%s'.", dir);
+		ret = -EINVAL;
+	}
+
+	if (dir != NULL)
+		rte_free(dir);
+
+	return ret;
+}
+
+static int
+memif_set_socket_filename(const char *key __rte_unused, const char *value,
+			  void *extra_args)
+{
+	const char **socket_filename = (const char **)extra_args;
+
+	*socket_filename = value;
+	return memif_check_socket_filename(*socket_filename);
+}
+
+static int
+memif_set_mac(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	struct ether_addr *eth_addr = (struct ether_addr *)extra_args;
+	int ret = 0;
+
+	ret = sscanf(value, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
+	       &eth_addr->addr_bytes[0], &eth_addr->addr_bytes[1],
+	       &eth_addr->addr_bytes[2], &eth_addr->addr_bytes[3],
+	       &eth_addr->addr_bytes[4], &eth_addr->addr_bytes[5]);
+	if (ret != 6)
+		MIF_LOG(WARNING, "Failed to parse mac '%s'.", value);
+	return 0;
+}
+
+static int
+memif_set_secret(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	const char **secret = (const char **)extra_args;
+
+	*secret = value;
+	return 0;
+}
+
+static int
+rte_pmd_memif_probe(struct rte_vdev_device *vdev)
+{
+	RTE_BUILD_BUG_ON(sizeof(memif_msg_t) != 128);
+	RTE_BUILD_BUG_ON(sizeof(memif_desc_t) != 16);
+	int ret = 0;
+	struct rte_kvargs *kvlist;
+	const char *name = rte_vdev_device_name(vdev);
+	enum memif_role_t role = MEMIF_ROLE_SLAVE;
+	memif_interface_id_t id = 0;
+	uint16_t buffer_size = ETH_MEMIF_DEFAULT_BUFFER_SIZE;
+	memif_log2_ring_size_t log2_ring_size = ETH_MEMIF_DEFAULT_RING_SIZE;
+	const char *socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME;
+	uint32_t flags = 0;
+	const char *secret = NULL;
+	struct ether_addr eth_addr;
+
+	eth_random_addr(eth_addr.addr_bytes);
+
+	MIF_LOG(INFO, "Initialize MEMIF: %s.", name);
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_arguments);
+
+	/* parse parameters */
+	if (kvlist != NULL) {
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ROLE_ARG,
+					 &memif_set_role, &role);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ID_ARG,
+					 &memif_set_id, &id);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_BUFFER_SIZE_ARG,
+					 &memif_set_bs, &buffer_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_RING_SIZE_ARG,
+					 &memif_set_rs, &log2_ring_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SOCKET_ARG,
+					 &memif_set_socket_filename,
+					 (void *)(&socket_filename));
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_MAC_ARG,
+					 &memif_set_mac, &eth_addr);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ZC_ARG,
+					 &memif_set_zc, &flags);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SECRET_ARG,
+					 &memif_set_secret, (void *)(&secret));
+		if (ret < 0)
+			goto exit;
+	}
+
+	/* create interface */
+	ret = memif_create(vdev, role, id, flags, socket_filename,
+			   log2_ring_size, buffer_size, secret, &eth_addr);
+
+ exit:
+	if (kvlist != NULL)
+		rte_kvargs_free(kvlist);
+	return ret;
+}
+
+static int
+rte_pmd_memif_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_eth_dev *eth_dev;
+	struct pmd_internals *pmd;
+
+	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+	if (eth_dev == NULL)
+		return 0;
+
+	pmd = eth_dev->data->dev_private;
+
+	memif_msg_enq_disconnect(pmd->cc, "Invalid message size", 0);
+	memif_disconnect(eth_dev);
+
+	memif_socket_remove_device(eth_dev);
+
+	pmd->vdev = NULL;
+
+	rte_free(eth_dev->data->dev_private);
+
+	rte_eth_dev_release_port(eth_dev);
+
+	return 0;
+}
+
+static struct rte_vdev_driver pmd_memif_drv = {
+	.probe = rte_pmd_memif_probe,
+	.remove = rte_pmd_memif_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_memif, pmd_memif_drv);
+
+RTE_PMD_REGISTER_PARAM_STRING(net_memif,
+			      ETH_MEMIF_ID_ARG "=<int>"
+			      ETH_MEMIF_ROLE_ARG "=master|slave"
+			      ETH_MEMIF_BUFFER_SIZE_ARG "=<int>"
+			      ETH_MEMIF_RING_SIZE_ARG "=<int>"
+			      ETH_MEMIF_SOCKET_ARG "=<string>"
+			      ETH_MEMIF_MAC_ARG "=xx:xx:xx:xx:xx:xx"
+			      ETH_MEMIF_ZC_ARG "=yes|no"
+			      ETH_MEMIF_SECRET_ARG "=<string>");
+
+int memif_logtype;
+
+RTE_INIT(memif_init_log)
+{
+	memif_logtype = rte_log_register("pmd.net.memif");
+	if (memif_logtype >= 0)
+		rte_log_set_level(memif_logtype, RTE_LOG_NOTICE);
+}
diff --git a/drivers/net/memif/rte_eth_memif.h b/drivers/net/memif/rte_eth_memif.h
new file mode 100644
index 000000000..930de38ed
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.h
@@ -0,0 +1,203 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _RTE_ETH_MEMIF_H_
+#define _RTE_ETH_MEMIF_H_
+
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE
+#endif				/* GNU_SOURCE */
+
+#include <sys/queue.h>
+
+#include <rte_ethdev_driver.h>
+#include <rte_ether.h>
+#include <rte_interrupts.h>
+
+#include "memif.h"
+
+#define ETH_MEMIF_DEFAULT_SOCKET_FILENAME	"/tmp/memif.sock"
+#define ETH_MEMIF_DEFAULT_RING_SIZE		10
+#define ETH_MEMIF_DEFAULT_BUFFER_SIZE		2048
+
+#define ETH_MEMIF_MAX_NUM_Q_PAIRS		255
+#define ETH_MEMIF_MAX_LOG2_RING_SIZE		14
+#define ETH_MEMIF_MAX_REGION_IDX		255
+
+extern int memif_logtype;
+
+#define MIF_LOG(level, fmt, args...) \
+	rte_log(RTE_LOG_ ## level, memif_logtype, \
+		"%s(): " fmt "\n", __func__, ##args)
+
+enum memif_role_t {
+	MEMIF_ROLE_MASTER,
+	MEMIF_ROLE_SLAVE,
+};
+
+struct memif_region {
+	void *addr;				/**< shared memory address */
+	memif_region_size_t region_size;	/**< shared memory size */
+	int fd;					/**< shared memory file descriptor */
+	uint32_t buffer_offset;			/**< offset at which buffers start */
+};
+
+struct memif_queue {
+	struct rte_mempool *mempool;		/**< mempool for RX packets */
+	uint16_t in_port;			/**< port id */
+
+	struct pmd_internals *pmd;		/**< device internals */
+
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+
+	/* ring info */
+	memif_ring_type_t type;			/**< ring type */
+	memif_ring_t *ring;			/**< pointer to ring */
+	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
+
+	memif_region_index_t region;		/**< shared memory region index */
+	memif_region_offset_t offset;		/**< offset at which the queue begins */
+
+	uint16_t last_head;			/**< last ring head */
+	uint16_t last_tail;			/**< last ring tail */
+
+	/* rx/tx info */
+	uint64_t n_pkts;			/**< number of rx/tx packets */
+	uint64_t n_bytes;			/**< number of rx/tx bytes */
+	uint64_t n_err;				/**< number of tx errors */
+};
+
+struct pmd_internals {
+	memif_interface_id_t id;		/**< unique id */
+	enum memif_role_t role;			/**< device role */
+	uint32_t flags;				/**< device status flags */
+#define ETH_MEMIF_FLAG_CONNECTING	(1 << 0)
+/**< device is connecting */
+#define ETH_MEMIF_FLAG_CONNECTED	(1 << 1)
+/**< device is connected */
+#define ETH_MEMIF_FLAG_ZERO_COPY	(1 << 2)
+/**< device is zero-copy enabled */
+#define ETH_MEMIF_FLAG_DISABLED		(1 << 3)
+/**< device has not been configured and can not accept connection requests */
+
+	struct ether_addr eth_addr;		/**< mac address */
+	char *socket_filename;			/**< pointer to socket filename */
+	char secret[24]; /**< secret (optional security parameter) */
+
+	struct memif_control_channel *cc;	/**< control channel */
+
+	struct memif_region *regions;		/**< shared memory regions */
+	uint8_t regions_num;			/**< number of regions */
+
+	/* remote info */
+	char remote_name[64];			/**< remote app name */
+	char remote_if_name[64];		/**< remote peer name */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t buffer_size;		/**< buffer size */
+	} cfg;					/**< Configured parameters (max values) */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t buffer_size;		/**< buffer size */
+	} run;
+	/**< Parameters used in active connection */
+
+	char local_disc_string[96];		/**< local disconnect reason */
+	char remote_disc_string[96];		/**< remote disconnect reason */
+
+	struct rte_vdev_device *vdev;		/**< vdev handle */
+};
+
+/**
+ * Unmap shared memory and free regions from memory.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_free_regions(struct pmd_internals *pmd);
+
+/**
+ * Finalize connection establishment process. Map shared memory file
+ * (master role), initialize ring queue, set link status up.
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect(struct rte_eth_dev *dev);
+
+/**
+ * Create shared memory file and initialize ring queue.
+ * Only called by slave when establishing connection
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_init_regions_and_queues(struct rte_eth_dev *dev);
+
+/**
+ * Get memif version string.
+ *
+ * @return
+ *   - memif version string
+ */
+const char *memif_version(void);
+
+#ifndef MFD_HUGETLB
+#ifndef __NR_memfd_create
+
+#if defined __x86_64__
+#define __NR_memfd_create 319
+#elif defined __x86_32__
+#define __NR_memfd_create 1073742143
+#elif defined __arm__
+#define __NR_memfd_create 385
+#elif defined __aarch64__
+#define __NR_memfd_create 279
+#elif defined __powerpc__
+#define __NR_memfd_create 360
+#elif defined __i386__
+#define __NR_memfd_create 356
+#else
+#error "__NR_memfd_create unknown for this architecture"
+#endif
+
+#endif				/* __NR_memfd_create */
+
+static inline int memfd_create(const char *name, unsigned int flags)
+{
+	return syscall(__NR_memfd_create, name, flags);
+}
+#endif				/* MFD_HUGETLB */
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef MFD_ALLOW_SEALING
+#define MFD_ALLOW_SEALING       0x0002U
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL     0x0001	/* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002	/* prevent file from shrinking */
+#define F_SEAL_GROW     0x0004	/* prevent file from growing */
+#define F_SEAL_WRITE    0x0008	/* prevent writes */
+#endif
+
+#endif				/* RTE_ETH_MEMIF_H */
diff --git a/drivers/net/memif/rte_pmd_memif_version.map b/drivers/net/memif/rte_pmd_memif_version.map
new file mode 100644
index 000000000..4b2e62193
--- /dev/null
+++ b/drivers/net/memif/rte_pmd_memif_version.map
@@ -0,0 +1,4 @@
+DPDK_19.05 {
+
+        local: *;
+};
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index 3ecc78cee..86ad1ec9e 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -22,6 +22,7 @@ drivers = ['af_packet',
 	'ixgbe',
 	'kni',
 	'liquidio',
+	'memif',
 	'mlx4',
 	'mlx5',
 	'mvneta',
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 262132fc6..d82cb0fb5 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -171,6 +171,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KNI)        += -lrte_pmd_kni
 endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD)        += -lrte_pmd_lio
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF)      += -lrte_pmd_memif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -lrte_pmd_mlx4
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += -lrte_pmd_mlx5 -lmnl
 ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
-- 
2.17.1


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v5] /net: memory interface (memif)
  2019-03-22 11:57   ` [dpdk-dev] [RFC v5] " Jakub Grajciar
  2019-03-22 11:57     ` Jakub Grajciar
@ 2019-03-25 20:58     ` Ferruh Yigit
  2019-03-25 20:58       ` Ferruh Yigit
  2019-05-02 12:35       ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  2019-05-09  8:30     ` [dpdk-dev] [RFC v6] " Jakub Grajciar
  2 siblings, 2 replies; 58+ messages in thread
From: Ferruh Yigit @ 2019-03-25 20:58 UTC (permalink / raw)
  To: Jakub Grajciar, dev

On 3/22/2019 11:57 AM, Jakub Grajciar wrote:
> Memory interface (memif), provides high performance
> packet transfer over shared memory.
> 
> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>

<...>

> @@ -0,0 +1,200 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +    Copyright(c) 2018-2019 Cisco Systems, Inc.
> +
> +======================
> +Memif Poll Mode Driver
> +======================
> +Shared memory packet interface (memif) PMD allows for DPDK and any other client
> +using memif (DPDK, VPP, libmemif) to communicate using shared memory. Memif is
> +Linux only.
> +
> +The created device transmits packets in a raw format. It can be used with
> +Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
> +supported in DPDK memif implementation.
> +
> +Memif works in two roles: master and slave. Slave connects to master over an
> +existing socket. It is also a producer of shared memory file and initializes
> +the shared memory. Master creates the socket and listens for any slave
> +connection requests. The socket may already exist on the system. Be sure to
> +remove any such sockets, if you are creating a master interface, or you will
> +see an "Address already in use" error. Function ``rte_pmd_memif_remove()``,

Can it be possible to remove this existing socket on successfully termination of
the dpdk application with 'master' role?
Otherwise each time to run a dpdk app, requires to delete the socket file first.

> +which removes memif interface, will also remove a listener socket, if it is
> +not being used by any other interface.
> +
> +The method to enable one or more interfaces is to use the
> +``--vdev=net_memif0`` option on the DPDK application command line. Each
> +``--vdev=net_memif1`` option given will create an interface named net_memif0,
> +net_memif1, and so on. Memif uses unix domain socket to transmit control
> +messages. Each memif has a unique id per socket. If you are connecting multiple
> +interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``,
> +etc. Note that if you assign a socket to a master interface it becomes a
> +listener socket. Listener socket can not be used by a slave interface on same
> +client.

When I connect two slaves, id=0 & id=1 to the master, both master and second
slave crashed as soon as second slave connected, is this a know issue? Can you
please check this?

master: ./build/app/testpmd -w0:0.0 -l20,21 --vdev net_memif0,role=master -- -i
slave1: ./build/app/testpmd -w0:0.0 -l20,22 --file-prefix=slave --vdev
net_memif0,role=slave,id=0 -- -i
slave2: ./build/app/testpmd -w0:0.0 -l20,23 --file-prefix=slave2 --vdev
net_memif0,role=slave,id=1 -- -i

<...>

> +Example: testpmd and testpmd
> +----------------------------
> +In this example we run two instances of testpmd application and transmit packets over memif.

How this play with multi process support? When I run a secondary app when memif
PMD is enabled in primary, secondary process crashes.
Can you please check and ensure at least nothing crashes when a secondary app run?

> +
> +First create ``master`` interface::
> +
> +    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
> +
> +Now create ``slave`` interface (master must be already running so the slave will connect)::
> +
> +    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
> +
> +Set forwarding mode in one of the instances to 'rx only' and the other to 'tx only'::
> +
> +    testpmd> set fwd rxonly
> +    testpmd> start
> +
> +    testpmd> set fwd txonly
> +    testpmd> start

Would it be useful to add loopback option to the PMD, for testing/debug ?

Also I am getting low performance numbers above, comparing the ring pmd for
example, is there any performance target for the pmd?
Same forwarding core for both testpmds, this is terrible, either something is
wrong or I am doing something wrong, it is ~40Kpps
Different forwarding cores for each testpmd, still low, !18Mpps

<...>

> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +CFLAGS += -DALLOW_EXPERIMENTAL_API

Can you please add here which experimental APIs are called (as a comment), this
help to keep trace of them in long term?

<...>

> +/**
> + * Buffer descriptor.
> + */
> +typedef struct __rte_packed {
> +	uint16_t flags;				/**< flags */
> +#define MEMIF_DESC_FLAG_NEXT 1			/**< is chained buffer */

Is this define used?

<...>

> +
> +static struct rte_vdev_driver pmd_memif_drv;

Same comment from previous review, is this forward deceleration required?

<...>

> +static memif_ring_t *
> +memif_get_ring(struct pmd_internals *pmd, memif_ring_type_t type, uint16_t ring_num)
> +{
> +	/* rings only in region 0 */
> +	void *p = pmd->regions[0].addr;
> +	int ring_size = sizeof(memif_ring_t) + sizeof(memif_desc_t) *
> +	    (1 << pmd->run.log2_ring_size);
> +
> +	p = (uint8_t *)p + (ring_num + type * pmd->run.num_s2m_rings) * ring_size;

According above code, I guess layout is as following [1], can you please correct
if it is wrong?

Can you please put this information somewhere, possibly the function comment
that allocates it, so that everyone doesn't need to figure out.


[1]

region 0:
+-----------------------+
| S2M rings | M2S rings |
+-----------------------+

S2M OR M2S Rings:
+-----------------------------------------+
| ring 0 | ring 1 | ring num_s2m_rings - 1|
+-----------------------------------------+

ring 0:
+-----------------------------------------------------+
| Ring Header | (1 << pmd->run.log2_ring_size) * desc |
+-----------------------------------------------------+


region 1:
+-------------------------------------------------------------------------+
| packet buffer 0 | . | pb ((1 << pmd->run.log2_ring_size)*(s2m + m2s))-1 |
+-------------------------------------------------------------------------+
Is there any order on packet buffers?

<...>

> +	while (n_slots && n_rx_pkts < nb_pkts) {
> +		mbuf_head = rte_pktmbuf_alloc(mq->mempool);
> +		if (unlikely(mbuf_head == NULL))
> +			goto no_free_bufs;
> +		mbuf = mbuf_head;
> +		mbuf->port = mq->in_port;
> +
> + next_slot:
> +		s0 = cur_slot & mask;
> +		d0 = &ring->desc[s0];
> +
> +		src_len = d0->length;
> +		dst_off = 0;
> +		src_off = 0;
> +
> +		do {
> +			dst_len = mbuf_size - dst_off;
> +			if (dst_len == 0) {
> +				dst_off = 0;
> +				dst_len = mbuf_size + RTE_PKTMBUF_HEADROOM;
> +
> +				mbuf = rte_pktmbuf_alloc(mq->mempool);
> +				if (unlikely(mbuf == NULL))
> +					goto no_free_bufs;
> +				mbuf->port = mq->in_port;
> +				rte_pktmbuf_chain(mbuf_head, mbuf);
> +			}
> +			cp_len = RTE_MIN(dst_len, src_len);
> +
> +			rte_pktmbuf_pkt_len(mbuf) =
> +			    rte_pktmbuf_data_len(mbuf) += cp_len;

Can you please make this two lines to prevent confusion?

Also shouldn't need to add 'cp_len' to 'mbuf_head->pkt_len'?
'rte_pktmbuf_chain' updates 'mbuf_head' but at that stage 'mbuf->pkt_len' is not
set to 'cp_len'...

<...>

> +void
> +memif_free_regions(struct pmd_internals *pmd)
> +{
> +	int i;
> +	struct memif_region *r;
> +
> +	for (i = 0; i < pmd->regions_num; i++) {
> +		r = pmd->regions + i;
> +		if (r == NULL)
> +			return;

'r' can be NULL only when 'pmd->region' is NULL and 'i == 0', so is it better to
check 'pmd->region' is NULL check before the loop?

> +		if (r->addr == NULL)
> +			return;
> +		munmap(r->addr, r->region_size);
> +		if (r->fd > 0) {
> +			close(r->fd);
> +			r->fd = -1;
> +		}
> +	}
> +	rte_free(pmd->regions);
> +}
> +
> +static int
> +memif_alloc_regions(struct pmd_internals *pmd, uint8_t brn)
> +{
> +	struct memif_region *r;
> +	char shm_name[32];
> +	int i;
> +	int ret = 0;
> +
> +	r = rte_zmalloc("memif_region", sizeof(struct memif_region) * (brn + 1), 0);
> +	if (r == NULL) {
> +		MIF_LOG(ERR, "%s: Failed to allocate regions.",
> +			rte_vdev_device_name(pmd->vdev));
> +		return -ENOMEM;
> +	}
> +
> +	pmd->regions = r;
> +	pmd->regions_num = brn + 1;
> +
> +	/*
> +	 * Create shm for every region. Region 0 is reserved for descriptors.
> +	 * Other regions contain buffers.
> +	 */
> +	for (i = 0; i < (brn + 1); i++) {
> +		r = &pmd->regions[i];
> +
> +		r->buffer_offset = (i == 0) ? (pmd->run.num_s2m_rings +
> +					       pmd->run.num_m2s_rings) *
> +		    (sizeof(memif_ring_t) +
> +		     sizeof(memif_desc_t) * (1 << pmd->run.log2_ring_size)) : 0;

For complex operations can you please prefer regular if check against ternary,
with the help of the formatting, it is hard to follow this.

what exactly "buffer_offset" is? For 'region 0' it calculates the size of the
size of the 'region 0', otherwise 0. This is offset starting from where? And it
seems only used for below size assignment.

> +		r->region_size = (i == 0) ? r->buffer_offset :
> +		    (uint32_t)(pmd->run.buffer_size *

I guess 'buffer_size' is buffer size per packet, to make this clear, what do you
think to rename it 'packet_buffer_size' ?

> +				(1 << pmd->run.log2_ring_size) *
> +				(pmd->run.num_s2m_rings +
> +				 pmd->run.num_m2s_rings));

There is an illusion of packet buffers can be in multiple regions (above comment
implies it) but this logic assumes all packet buffer are in same region other
than region 0, which gives us 'region 1' and this is already hardcoded in a few
places. Is there a benefit of assuming there will be more regions, will it make
simple to accept 'region 0' for rings and 'region 1' is for packet buffer?

> +
> +		memset(shm_name, 0, sizeof(char) * 32);
> +		sprintf(shm_name, "memif region %d", i);

snprintf please, and can you please add a define for shm_name size?

<...>

> +static void
> +memif_init_rings(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	memif_ring_t *ring;
> +	int i, j;
> +	uint16_t slot;
> +
> +	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
> +		ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
> +		ring->head = 0;
> +		ring->tail = 0;
> +		ring->cookie = MEMIF_COOKIE;
> +		ring->flags = 0;
> +		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
> +			slot = i * (1 << pmd->run.log2_ring_size) + j;
> +			ring->desc[j].region = 1;

Why 'region' 1 is hardcoded for buffer, can it be possible to use multiple
regions for buffers?

<...>

> +int
> +memif_connect(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_region *mr;
> +	struct memif_queue *mq;
> +	int i;
> +
> +	for (i = 0; i < pmd->regions_num; i++) {
> +		mr = pmd->regions + i;
> +		if (mr != NULL) {
> +			if (mr->addr == NULL) {
> +				if (mr->fd < 0)
> +					return -1;
> +				mr->addr = mmap(NULL, mr->region_size,
> +						PROT_READ | PROT_WRITE,
> +						MAP_SHARED, mr->fd, 0);
> +				if (mr->addr == NULL)
> +					return -1;
> +			}
> +		}
> +	}

For one master work with multiple slaves, there should be an array of regions,
right, one for for each slave.
Also same concern is valid for Rx/Tx queues, master and slave uses same
dev->data->tx_queues / dev->data->rx_queue, but crossover. I think a more
complex logic required for multiple slave support.

If multiple slave is not supported, is 'id' devarg still valid, or is it used at
all?

<...>

> +static int
> +memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
> +	     memif_interface_id_t id, uint32_t flags,
> +	     const char *socket_filename,
> +	     memif_log2_ring_size_t log2_ring_size,
> +	     uint16_t buffer_size, const char *secret,
> +	     struct ether_addr *eth_addr)
> +{
> +	int ret = 0;
> +	struct rte_eth_dev *eth_dev;
> +	struct rte_eth_dev_data *data;
> +	struct pmd_internals *pmd;
> +	const unsigned int numa_node = vdev->device.numa_node;
> +	const char *name = rte_vdev_device_name(vdev);
> +
> +	if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
> +		MIF_LOG(ERR, "Zero-copy not supported.");
> +		return -1;
> +	}

What is the plan for the zero copy?
Can you please put the status & plan to the documentation?

<...>

> +struct memif_queue {
> +	struct rte_mempool *mempool;		/**< mempool for RX packets */
> +	uint16_t in_port;			/**< port id */
> +
> +	struct pmd_internals *pmd;		/**< device internals */
> +
> +	struct rte_intr_handle intr_handle;	/**< interrupt handle */
> +
> +	/* ring info */
> +	memif_ring_type_t type;			/**< ring type */
> +	memif_ring_t *ring;			/**< pointer to ring */
> +	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
> +
> +	memif_region_index_t region;		/**< shared memory region index */
> +	memif_region_offset_t offset;		/**< offset at which the queue begins */

Offset from where, it is start of the region but I think better to detail in the
comment.

> +
> +	uint16_t last_head;			/**< last ring head */
> +	uint16_t last_tail;			/**< last ring tail */

ring already has head, tail variables, what is the difference in queue ones?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v5] /net: memory interface (memif)
  2019-03-25 20:58     ` Ferruh Yigit
@ 2019-03-25 20:58       ` Ferruh Yigit
  2019-05-02 12:35       ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  1 sibling, 0 replies; 58+ messages in thread
From: Ferruh Yigit @ 2019-03-25 20:58 UTC (permalink / raw)
  To: Jakub Grajciar, dev

On 3/22/2019 11:57 AM, Jakub Grajciar wrote:
> Memory interface (memif), provides high performance
> packet transfer over shared memory.
> 
> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>

<...>

> @@ -0,0 +1,200 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +    Copyright(c) 2018-2019 Cisco Systems, Inc.
> +
> +======================
> +Memif Poll Mode Driver
> +======================
> +Shared memory packet interface (memif) PMD allows for DPDK and any other client
> +using memif (DPDK, VPP, libmemif) to communicate using shared memory. Memif is
> +Linux only.
> +
> +The created device transmits packets in a raw format. It can be used with
> +Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
> +supported in DPDK memif implementation.
> +
> +Memif works in two roles: master and slave. Slave connects to master over an
> +existing socket. It is also a producer of shared memory file and initializes
> +the shared memory. Master creates the socket and listens for any slave
> +connection requests. The socket may already exist on the system. Be sure to
> +remove any such sockets, if you are creating a master interface, or you will
> +see an "Address already in use" error. Function ``rte_pmd_memif_remove()``,

Can it be possible to remove this existing socket on successfully termination of
the dpdk application with 'master' role?
Otherwise each time to run a dpdk app, requires to delete the socket file first.

> +which removes memif interface, will also remove a listener socket, if it is
> +not being used by any other interface.
> +
> +The method to enable one or more interfaces is to use the
> +``--vdev=net_memif0`` option on the DPDK application command line. Each
> +``--vdev=net_memif1`` option given will create an interface named net_memif0,
> +net_memif1, and so on. Memif uses unix domain socket to transmit control
> +messages. Each memif has a unique id per socket. If you are connecting multiple
> +interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``,
> +etc. Note that if you assign a socket to a master interface it becomes a
> +listener socket. Listener socket can not be used by a slave interface on same
> +client.

When I connect two slaves, id=0 & id=1 to the master, both master and second
slave crashed as soon as second slave connected, is this a know issue? Can you
please check this?

master: ./build/app/testpmd -w0:0.0 -l20,21 --vdev net_memif0,role=master -- -i
slave1: ./build/app/testpmd -w0:0.0 -l20,22 --file-prefix=slave --vdev
net_memif0,role=slave,id=0 -- -i
slave2: ./build/app/testpmd -w0:0.0 -l20,23 --file-prefix=slave2 --vdev
net_memif0,role=slave,id=1 -- -i

<...>

> +Example: testpmd and testpmd
> +----------------------------
> +In this example we run two instances of testpmd application and transmit packets over memif.

How this play with multi process support? When I run a secondary app when memif
PMD is enabled in primary, secondary process crashes.
Can you please check and ensure at least nothing crashes when a secondary app run?

> +
> +First create ``master`` interface::
> +
> +    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
> +
> +Now create ``slave`` interface (master must be already running so the slave will connect)::
> +
> +    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
> +
> +Set forwarding mode in one of the instances to 'rx only' and the other to 'tx only'::
> +
> +    testpmd> set fwd rxonly
> +    testpmd> start
> +
> +    testpmd> set fwd txonly
> +    testpmd> start

Would it be useful to add loopback option to the PMD, for testing/debug ?

Also I am getting low performance numbers above, comparing the ring pmd for
example, is there any performance target for the pmd?
Same forwarding core for both testpmds, this is terrible, either something is
wrong or I am doing something wrong, it is ~40Kpps
Different forwarding cores for each testpmd, still low, !18Mpps

<...>

> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +CFLAGS += -DALLOW_EXPERIMENTAL_API

Can you please add here which experimental APIs are called (as a comment), this
help to keep trace of them in long term?

<...>

> +/**
> + * Buffer descriptor.
> + */
> +typedef struct __rte_packed {
> +	uint16_t flags;				/**< flags */
> +#define MEMIF_DESC_FLAG_NEXT 1			/**< is chained buffer */

Is this define used?

<...>

> +
> +static struct rte_vdev_driver pmd_memif_drv;

Same comment from previous review, is this forward deceleration required?

<...>

> +static memif_ring_t *
> +memif_get_ring(struct pmd_internals *pmd, memif_ring_type_t type, uint16_t ring_num)
> +{
> +	/* rings only in region 0 */
> +	void *p = pmd->regions[0].addr;
> +	int ring_size = sizeof(memif_ring_t) + sizeof(memif_desc_t) *
> +	    (1 << pmd->run.log2_ring_size);
> +
> +	p = (uint8_t *)p + (ring_num + type * pmd->run.num_s2m_rings) * ring_size;

According above code, I guess layout is as following [1], can you please correct
if it is wrong?

Can you please put this information somewhere, possibly the function comment
that allocates it, so that everyone doesn't need to figure out.


[1]

region 0:
+-----------------------+
| S2M rings | M2S rings |
+-----------------------+

S2M OR M2S Rings:
+-----------------------------------------+
| ring 0 | ring 1 | ring num_s2m_rings - 1|
+-----------------------------------------+

ring 0:
+-----------------------------------------------------+
| Ring Header | (1 << pmd->run.log2_ring_size) * desc |
+-----------------------------------------------------+


region 1:
+-------------------------------------------------------------------------+
| packet buffer 0 | . | pb ((1 << pmd->run.log2_ring_size)*(s2m + m2s))-1 |
+-------------------------------------------------------------------------+
Is there any order on packet buffers?

<...>

> +	while (n_slots && n_rx_pkts < nb_pkts) {
> +		mbuf_head = rte_pktmbuf_alloc(mq->mempool);
> +		if (unlikely(mbuf_head == NULL))
> +			goto no_free_bufs;
> +		mbuf = mbuf_head;
> +		mbuf->port = mq->in_port;
> +
> + next_slot:
> +		s0 = cur_slot & mask;
> +		d0 = &ring->desc[s0];
> +
> +		src_len = d0->length;
> +		dst_off = 0;
> +		src_off = 0;
> +
> +		do {
> +			dst_len = mbuf_size - dst_off;
> +			if (dst_len == 0) {
> +				dst_off = 0;
> +				dst_len = mbuf_size + RTE_PKTMBUF_HEADROOM;
> +
> +				mbuf = rte_pktmbuf_alloc(mq->mempool);
> +				if (unlikely(mbuf == NULL))
> +					goto no_free_bufs;
> +				mbuf->port = mq->in_port;
> +				rte_pktmbuf_chain(mbuf_head, mbuf);
> +			}
> +			cp_len = RTE_MIN(dst_len, src_len);
> +
> +			rte_pktmbuf_pkt_len(mbuf) =
> +			    rte_pktmbuf_data_len(mbuf) += cp_len;

Can you please make this two lines to prevent confusion?

Also shouldn't need to add 'cp_len' to 'mbuf_head->pkt_len'?
'rte_pktmbuf_chain' updates 'mbuf_head' but at that stage 'mbuf->pkt_len' is not
set to 'cp_len'...

<...>

> +void
> +memif_free_regions(struct pmd_internals *pmd)
> +{
> +	int i;
> +	struct memif_region *r;
> +
> +	for (i = 0; i < pmd->regions_num; i++) {
> +		r = pmd->regions + i;
> +		if (r == NULL)
> +			return;

'r' can be NULL only when 'pmd->region' is NULL and 'i == 0', so is it better to
check 'pmd->region' is NULL check before the loop?

> +		if (r->addr == NULL)
> +			return;
> +		munmap(r->addr, r->region_size);
> +		if (r->fd > 0) {
> +			close(r->fd);
> +			r->fd = -1;
> +		}
> +	}
> +	rte_free(pmd->regions);
> +}
> +
> +static int
> +memif_alloc_regions(struct pmd_internals *pmd, uint8_t brn)
> +{
> +	struct memif_region *r;
> +	char shm_name[32];
> +	int i;
> +	int ret = 0;
> +
> +	r = rte_zmalloc("memif_region", sizeof(struct memif_region) * (brn + 1), 0);
> +	if (r == NULL) {
> +		MIF_LOG(ERR, "%s: Failed to allocate regions.",
> +			rte_vdev_device_name(pmd->vdev));
> +		return -ENOMEM;
> +	}
> +
> +	pmd->regions = r;
> +	pmd->regions_num = brn + 1;
> +
> +	/*
> +	 * Create shm for every region. Region 0 is reserved for descriptors.
> +	 * Other regions contain buffers.
> +	 */
> +	for (i = 0; i < (brn + 1); i++) {
> +		r = &pmd->regions[i];
> +
> +		r->buffer_offset = (i == 0) ? (pmd->run.num_s2m_rings +
> +					       pmd->run.num_m2s_rings) *
> +		    (sizeof(memif_ring_t) +
> +		     sizeof(memif_desc_t) * (1 << pmd->run.log2_ring_size)) : 0;

For complex operations can you please prefer regular if check against ternary,
with the help of the formatting, it is hard to follow this.

what exactly "buffer_offset" is? For 'region 0' it calculates the size of the
size of the 'region 0', otherwise 0. This is offset starting from where? And it
seems only used for below size assignment.

> +		r->region_size = (i == 0) ? r->buffer_offset :
> +		    (uint32_t)(pmd->run.buffer_size *

I guess 'buffer_size' is buffer size per packet, to make this clear, what do you
think to rename it 'packet_buffer_size' ?

> +				(1 << pmd->run.log2_ring_size) *
> +				(pmd->run.num_s2m_rings +
> +				 pmd->run.num_m2s_rings));

There is an illusion of packet buffers can be in multiple regions (above comment
implies it) but this logic assumes all packet buffer are in same region other
than region 0, which gives us 'region 1' and this is already hardcoded in a few
places. Is there a benefit of assuming there will be more regions, will it make
simple to accept 'region 0' for rings and 'region 1' is for packet buffer?

> +
> +		memset(shm_name, 0, sizeof(char) * 32);
> +		sprintf(shm_name, "memif region %d", i);

snprintf please, and can you please add a define for shm_name size?

<...>

> +static void
> +memif_init_rings(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	memif_ring_t *ring;
> +	int i, j;
> +	uint16_t slot;
> +
> +	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
> +		ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
> +		ring->head = 0;
> +		ring->tail = 0;
> +		ring->cookie = MEMIF_COOKIE;
> +		ring->flags = 0;
> +		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
> +			slot = i * (1 << pmd->run.log2_ring_size) + j;
> +			ring->desc[j].region = 1;

Why 'region' 1 is hardcoded for buffer, can it be possible to use multiple
regions for buffers?

<...>

> +int
> +memif_connect(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_region *mr;
> +	struct memif_queue *mq;
> +	int i;
> +
> +	for (i = 0; i < pmd->regions_num; i++) {
> +		mr = pmd->regions + i;
> +		if (mr != NULL) {
> +			if (mr->addr == NULL) {
> +				if (mr->fd < 0)
> +					return -1;
> +				mr->addr = mmap(NULL, mr->region_size,
> +						PROT_READ | PROT_WRITE,
> +						MAP_SHARED, mr->fd, 0);
> +				if (mr->addr == NULL)
> +					return -1;
> +			}
> +		}
> +	}

For one master work with multiple slaves, there should be an array of regions,
right, one for for each slave.
Also same concern is valid for Rx/Tx queues, master and slave uses same
dev->data->tx_queues / dev->data->rx_queue, but crossover. I think a more
complex logic required for multiple slave support.

If multiple slave is not supported, is 'id' devarg still valid, or is it used at
all?

<...>

> +static int
> +memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
> +	     memif_interface_id_t id, uint32_t flags,
> +	     const char *socket_filename,
> +	     memif_log2_ring_size_t log2_ring_size,
> +	     uint16_t buffer_size, const char *secret,
> +	     struct ether_addr *eth_addr)
> +{
> +	int ret = 0;
> +	struct rte_eth_dev *eth_dev;
> +	struct rte_eth_dev_data *data;
> +	struct pmd_internals *pmd;
> +	const unsigned int numa_node = vdev->device.numa_node;
> +	const char *name = rte_vdev_device_name(vdev);
> +
> +	if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
> +		MIF_LOG(ERR, "Zero-copy not supported.");
> +		return -1;
> +	}

What is the plan for the zero copy?
Can you please put the status & plan to the documentation?

<...>

> +struct memif_queue {
> +	struct rte_mempool *mempool;		/**< mempool for RX packets */
> +	uint16_t in_port;			/**< port id */
> +
> +	struct pmd_internals *pmd;		/**< device internals */
> +
> +	struct rte_intr_handle intr_handle;	/**< interrupt handle */
> +
> +	/* ring info */
> +	memif_ring_type_t type;			/**< ring type */
> +	memif_ring_t *ring;			/**< pointer to ring */
> +	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
> +
> +	memif_region_index_t region;		/**< shared memory region index */
> +	memif_region_offset_t offset;		/**< offset at which the queue begins */

Offset from where, it is start of the region but I think better to detail in the
comment.

> +
> +	uint16_t last_head;			/**< last ring head */
> +	uint16_t last_tail;			/**< last ring tail */

ring already has head, tail variables, what is the difference in queue ones?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v5] /net: memory interface (memif)
  2019-03-25 20:58     ` Ferruh Yigit
  2019-03-25 20:58       ` Ferruh Yigit
@ 2019-05-02 12:35       ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  2019-05-02 12:35         ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  2019-05-03  4:27         ` Honnappa Nagarahalli
  1 sibling, 2 replies; 58+ messages in thread
From: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco) @ 2019-05-02 12:35 UTC (permalink / raw)
  To: Ferruh Yigit, dev


________________________________________
From: Ferruh Yigit <ferruh.yigit@intel.com>
Sent: Monday, March 25, 2019 9:58 PM
To: Jakub Grajciar; dev@dpdk.org
Subject: Re: [dpdk-dev] [RFC v5] /net: memory interface (memif)

On 3/22/2019 11:57 AM, Jakub Grajciar wrote:
> Memory interface (memif), provides high performance
> packet transfer over shared memory.
>
> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>

<...>

> @@ -0,0 +1,200 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +    Copyright(c) 2018-2019 Cisco Systems, Inc.
> +
> +======================
> +Memif Poll Mode Driver
> +======================
> +Shared memory packet interface (memif) PMD allows for DPDK and any other client
> +using memif (DPDK, VPP, libmemif) to communicate using shared memory. Memif is
> +Linux only.
> +
> +The created device transmits packets in a raw format. It can be used with
> +Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
> +supported in DPDK memif implementation.
> +
> +Memif works in two roles: master and slave. Slave connects to master over an
> +existing socket. It is also a producer of shared memory file and initializes
> +the shared memory. Master creates the socket and listens for any slave
> +connection requests. The socket may already exist on the system. Be sure to
> +remove any such sockets, if you are creating a master interface, or you will
> +see an "Address already in use" error. Function ``rte_pmd_memif_remove()``,

Can it be possible to remove this existing socket on successfully termination of
the dpdk application with 'master' role?
Otherwise each time to run a dpdk app, requires to delete the socket file first.


    net_memif is the same case as net_virtio_user, I'd use the same workaround.
    testpmd.c:pmd_test_exit() this, however, is only valid for testpmd.


> +which removes memif interface, will also remove a listener socket, if it is
> +not being used by any other interface.
> +
> +The method to enable one or more interfaces is to use the
> +``--vdev=net_memif0`` option on the DPDK application command line. Each
> +``--vdev=net_memif1`` option given will create an interface named net_memif0,
> +net_memif1, and so on. Memif uses unix domain socket to transmit control
> +messages. Each memif has a unique id per socket. If you are connecting multiple
> +interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``,
> +etc. Note that if you assign a socket to a master interface it becomes a
> +listener socket. Listener socket can not be used by a slave interface on same
> +client.

When I connect two slaves, id=0 & id=1 to the master, both master and second
slave crashed as soon as second slave connected, is this a know issue? Can you
please check this?

master: ./build/app/testpmd -w0:0.0 -l20,21 --vdev net_memif0,role=master -- -i
slave1: ./build/app/testpmd -w0:0.0 -l20,22 --file-prefix=slave --vdev
net_memif0,role=slave,id=0 -- -i
slave2: ./build/app/testpmd -w0:0.0 -l20,23 --file-prefix=slave2 --vdev
net_memif0,role=slave,id=1 -- -i

<...>


    Each interface can be connected to one peer interface at the same time, I'll
    add more details to the documentation.


> +Example: testpmd and testpmd
> +----------------------------
> +In this example we run two instances of testpmd application and transmit packets over memif.

How this play with multi process support? When I run a secondary app when memif
PMD is enabled in primary, secondary process crashes.
Can you please check and ensure at least nothing crashes when a secondary app run?


    For now I'd disable multi-process support, so we can get the patch applied, then
    provide multi-process support in a separate patch.


> +
> +First create ``master`` interface::
> +
> +    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
> +
> +Now create ``slave`` interface (master must be already running so the slave will connect)::
> +
> +    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
> +
> +Set forwarding mode in one of the instances to 'rx only' and the other to 'tx only'::
> +
> +    testpmd> set fwd rxonly
> +    testpmd> start
> +
> +    testpmd> set fwd txonly
> +    testpmd> start

Would it be useful to add loopback option to the PMD, for testing/debug ?

Also I am getting low performance numbers above, comparing the ring pmd for
example, is there any performance target for the pmd?
Same forwarding core for both testpmds, this is terrible, either something is
wrong or I am doing something wrong, it is ~40Kpps
Different forwarding cores for each testpmd, still low, !18Mpps

<...>


    The difference between ring pmd and memif pmd is that while memif transfers packets,
    ring transfers whole buffers. (Also ring can not be used process to process)
    That means, it does not have to alloc/free buffers.
    I did a simple test where I modified the code so the tx function will only free given buffers
    and rx allocates new buffers. On my machine, this can only handle 45-50Mpps.

    With that in mind, I believe that 23Mpps is fine performance. No performance target is
    defined, the goal is to be as fast as possible.

    The cause of the issue, where traffic drops to 40Kpps while using the same core for both applications,
    is that within the timeslice given to testpmd process, memif driver fills its queues,
    but keeps polling, not giving the other process a chance to receive the packets.
    Same applies to rx side, where an empty queue is polled over and over again. In such configuration,
    interrupt rx-mode should be used instead of polling. Or the application could suspend
    the port.


> +/**
> + * Buffer descriptor.
> + */
> +typedef struct __rte_packed {
> +     uint16_t flags;                         /**< flags */
> +#define MEMIF_DESC_FLAG_NEXT 1                       /**< is chained buffer */

Is this define used?

<...>


     Yes, see eth_memif_tx() and eth_memif_rx()


> +     while (n_slots && n_rx_pkts < nb_pkts) {
> +             mbuf_head = rte_pktmbuf_alloc(mq->mempool);
> +             if (unlikely(mbuf_head == NULL))
> +                     goto no_free_bufs;
> +             mbuf = mbuf_head;
> +             mbuf->port = mq->in_port;
> +
> + next_slot:
> +             s0 = cur_slot & mask;
> +             d0 = &ring->desc[s0];
> +
> +             src_len = d0->length;
> +             dst_off = 0;
> +             src_off = 0;
> +
> +             do {
> +                     dst_len = mbuf_size - dst_off;
> +                     if (dst_len == 0) {
> +                             dst_off = 0;
> +                             dst_len = mbuf_size + RTE_PKTMBUF_HEADROOM;
> +
> +                             mbuf = rte_pktmbuf_alloc(mq->mempool);
> +                             if (unlikely(mbuf == NULL))
> +                                     goto no_free_bufs;
> +                             mbuf->port = mq->in_port;
> +                             rte_pktmbuf_chain(mbuf_head, mbuf);
> +                     }
> +                     cp_len = RTE_MIN(dst_len, src_len);
> +
> +                     rte_pktmbuf_pkt_len(mbuf) =
> +                         rte_pktmbuf_data_len(mbuf) += cp_len;

Can you please make this two lines to prevent confusion?

Also shouldn't need to add 'cp_len' to 'mbuf_head->pkt_len'?
'rte_pktmbuf_chain' updates 'mbuf_head' but at that stage 'mbuf->pkt_len' is not
set to 'cp_len'...

<...>


    Fix in next patch.


> +             if (r->addr == NULL)
> +                     return;
> +             munmap(r->addr, r->region_size);
> +             if (r->fd > 0) {
> +                     close(r->fd);
> +                     r->fd = -1;
> +             }
> +     }
> +     rte_free(pmd->regions);
> +}
> +
> +static int
> +memif_alloc_regions(struct pmd_internals *pmd, uint8_t brn)
> +{
> +     struct memif_region *r;
> +     char shm_name[32];
> +     int i;
> +     int ret = 0;
> +
> +     r = rte_zmalloc("memif_region", sizeof(struct memif_region) * (brn + 1), 0);
> +     if (r == NULL) {
> +             MIF_LOG(ERR, "%s: Failed to allocate regions.",
> +                     rte_vdev_device_name(pmd->vdev));
> +             return -ENOMEM;
> +     }
> +
> +     pmd->regions = r;
> +     pmd->regions_num = brn + 1;
> +
> +     /*
> +      * Create shm for every region. Region 0 is reserved for descriptors.
> +      * Other regions contain buffers.
> +      */
> +     for (i = 0; i < (brn + 1); i++) {
> +             r = &pmd->regions[i];
> +
> +             r->buffer_offset = (i == 0) ? (pmd->run.num_s2m_rings +
> +                                            pmd->run.num_m2s_rings) *
> +                 (sizeof(memif_ring_t) +
> +                  sizeof(memif_desc_t) * (1 << pmd->run.log2_ring_size)) : 0;

what exactly "buffer_offset" is? For 'region 0' it calculates the size of the
size of the 'region 0', otherwise 0. This is offset starting from where? And it
seems only used for below size assignment.


    It is possible to use single region for one connection (non-zero-copy) to reduce
    the number of files opened. It will be implemented in the next patch.
    'buffer_offset' is offset from the start of shared memory file to the first packet buffer.


> +                             (1 << pmd->run.log2_ring_size) *
> +                             (pmd->run.num_s2m_rings +
> +                              pmd->run.num_m2s_rings));

There is an illusion of packet buffers can be in multiple regions (above comment
implies it) but this logic assumes all packet buffer are in same region other
than region 0, which gives us 'region 1' and this is already hardcoded in a few
places. Is there a benefit of assuming there will be more regions, will it make
simple to accept 'region 0' for rings and 'region 1' is for packet buffer?


    Multiple regions are required in case of zero-copy slave. The zero-copy will
    expose dpdk shared memory, where each memseg/memseg_list will be
    represended by memif_region.


> +int
> +memif_connect(struct rte_eth_dev *dev)
> +{
> +     struct pmd_internals *pmd = dev->data->dev_private;
> +     struct memif_region *mr;
> +     struct memif_queue *mq;
> +     int i;
> +
> +     for (i = 0; i < pmd->regions_num; i++) {
> +             mr = pmd->regions + i;
> +             if (mr != NULL) {
> +                     if (mr->addr == NULL) {
> +                             if (mr->fd < 0)
> +                                     return -1;
> +                             mr->addr = mmap(NULL, mr->region_size,
> +                                             PROT_READ | PROT_WRITE,
> +                                             MAP_SHARED, mr->fd, 0);
> +                             if (mr->addr == NULL)
> +                                     return -1;
> +                     }
> +             }
> +     }

If multiple slave is not supported, is 'id' devarg still valid, or is it used at
all?


    'id' is used to identify peer interface. Interface with id=0 will only connect to an
    interface with id=0 and so on...


<...>

> +static int
> +memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
> +          memif_interface_id_t id, uint32_t flags,
> +          const char *socket_filename,
> +          memif_log2_ring_size_t log2_ring_size,
> +          uint16_t buffer_size, const char *secret,
> +          struct ether_addr *eth_addr)
> +{
> +     int ret = 0;
> +     struct rte_eth_dev *eth_dev;
> +     struct rte_eth_dev_data *data;
> +     struct pmd_internals *pmd;
> +     const unsigned int numa_node = vdev->device.numa_node;
> +     const char *name = rte_vdev_device_name(vdev);
> +
> +     if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
> +             MIF_LOG(ERR, "Zero-copy not supported.");
> +             return -1;
> +     }

What is the plan for the zero copy?
Can you please put the status & plan to the documentation?

<...>


    I don't think that putting the status in the docs is needed. Zero-copy slave is in testing stage
    and I should be able to submit a patch once we get this one applied.


> +
> +     uint16_t last_head;                     /**< last ring head */
> +     uint16_t last_tail;                     /**< last ring tail */

ring already has head, tail variables, what is the difference in queue ones?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v5] /net: memory interface (memif)
  2019-05-02 12:35       ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
@ 2019-05-02 12:35         ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  2019-05-03  4:27         ` Honnappa Nagarahalli
  1 sibling, 0 replies; 58+ messages in thread
From: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco) @ 2019-05-02 12:35 UTC (permalink / raw)
  To: Ferruh Yigit, dev


________________________________________
From: Ferruh Yigit <ferruh.yigit@intel.com>
Sent: Monday, March 25, 2019 9:58 PM
To: Jakub Grajciar; dev@dpdk.org
Subject: Re: [dpdk-dev] [RFC v5] /net: memory interface (memif)

On 3/22/2019 11:57 AM, Jakub Grajciar wrote:
> Memory interface (memif), provides high performance
> packet transfer over shared memory.
>
> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>

<...>

> @@ -0,0 +1,200 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +    Copyright(c) 2018-2019 Cisco Systems, Inc.
> +
> +======================
> +Memif Poll Mode Driver
> +======================
> +Shared memory packet interface (memif) PMD allows for DPDK and any other client
> +using memif (DPDK, VPP, libmemif) to communicate using shared memory. Memif is
> +Linux only.
> +
> +The created device transmits packets in a raw format. It can be used with
> +Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
> +supported in DPDK memif implementation.
> +
> +Memif works in two roles: master and slave. Slave connects to master over an
> +existing socket. It is also a producer of shared memory file and initializes
> +the shared memory. Master creates the socket and listens for any slave
> +connection requests. The socket may already exist on the system. Be sure to
> +remove any such sockets, if you are creating a master interface, or you will
> +see an "Address already in use" error. Function ``rte_pmd_memif_remove()``,

Can it be possible to remove this existing socket on successfully termination of
the dpdk application with 'master' role?
Otherwise each time to run a dpdk app, requires to delete the socket file first.


    net_memif is the same case as net_virtio_user, I'd use the same workaround.
    testpmd.c:pmd_test_exit() this, however, is only valid for testpmd.


> +which removes memif interface, will also remove a listener socket, if it is
> +not being used by any other interface.
> +
> +The method to enable one or more interfaces is to use the
> +``--vdev=net_memif0`` option on the DPDK application command line. Each
> +``--vdev=net_memif1`` option given will create an interface named net_memif0,
> +net_memif1, and so on. Memif uses unix domain socket to transmit control
> +messages. Each memif has a unique id per socket. If you are connecting multiple
> +interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``,
> +etc. Note that if you assign a socket to a master interface it becomes a
> +listener socket. Listener socket can not be used by a slave interface on same
> +client.

When I connect two slaves, id=0 & id=1 to the master, both master and second
slave crashed as soon as second slave connected, is this a know issue? Can you
please check this?

master: ./build/app/testpmd -w0:0.0 -l20,21 --vdev net_memif0,role=master -- -i
slave1: ./build/app/testpmd -w0:0.0 -l20,22 --file-prefix=slave --vdev
net_memif0,role=slave,id=0 -- -i
slave2: ./build/app/testpmd -w0:0.0 -l20,23 --file-prefix=slave2 --vdev
net_memif0,role=slave,id=1 -- -i

<...>


    Each interface can be connected to one peer interface at the same time, I'll
    add more details to the documentation.


> +Example: testpmd and testpmd
> +----------------------------
> +In this example we run two instances of testpmd application and transmit packets over memif.

How this play with multi process support? When I run a secondary app when memif
PMD is enabled in primary, secondary process crashes.
Can you please check and ensure at least nothing crashes when a secondary app run?


    For now I'd disable multi-process support, so we can get the patch applied, then
    provide multi-process support in a separate patch.


> +
> +First create ``master`` interface::
> +
> +    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
> +
> +Now create ``slave`` interface (master must be already running so the slave will connect)::
> +
> +    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
> +
> +Set forwarding mode in one of the instances to 'rx only' and the other to 'tx only'::
> +
> +    testpmd> set fwd rxonly
> +    testpmd> start
> +
> +    testpmd> set fwd txonly
> +    testpmd> start

Would it be useful to add loopback option to the PMD, for testing/debug ?

Also I am getting low performance numbers above, comparing the ring pmd for
example, is there any performance target for the pmd?
Same forwarding core for both testpmds, this is terrible, either something is
wrong or I am doing something wrong, it is ~40Kpps
Different forwarding cores for each testpmd, still low, !18Mpps

<...>


    The difference between ring pmd and memif pmd is that while memif transfers packets,
    ring transfers whole buffers. (Also ring can not be used process to process)
    That means, it does not have to alloc/free buffers.
    I did a simple test where I modified the code so the tx function will only free given buffers
    and rx allocates new buffers. On my machine, this can only handle 45-50Mpps.

    With that in mind, I believe that 23Mpps is fine performance. No performance target is
    defined, the goal is to be as fast as possible.

    The cause of the issue, where traffic drops to 40Kpps while using the same core for both applications,
    is that within the timeslice given to testpmd process, memif driver fills its queues,
    but keeps polling, not giving the other process a chance to receive the packets.
    Same applies to rx side, where an empty queue is polled over and over again. In such configuration,
    interrupt rx-mode should be used instead of polling. Or the application could suspend
    the port.


> +/**
> + * Buffer descriptor.
> + */
> +typedef struct __rte_packed {
> +     uint16_t flags;                         /**< flags */
> +#define MEMIF_DESC_FLAG_NEXT 1                       /**< is chained buffer */

Is this define used?

<...>


     Yes, see eth_memif_tx() and eth_memif_rx()


> +     while (n_slots && n_rx_pkts < nb_pkts) {
> +             mbuf_head = rte_pktmbuf_alloc(mq->mempool);
> +             if (unlikely(mbuf_head == NULL))
> +                     goto no_free_bufs;
> +             mbuf = mbuf_head;
> +             mbuf->port = mq->in_port;
> +
> + next_slot:
> +             s0 = cur_slot & mask;
> +             d0 = &ring->desc[s0];
> +
> +             src_len = d0->length;
> +             dst_off = 0;
> +             src_off = 0;
> +
> +             do {
> +                     dst_len = mbuf_size - dst_off;
> +                     if (dst_len == 0) {
> +                             dst_off = 0;
> +                             dst_len = mbuf_size + RTE_PKTMBUF_HEADROOM;
> +
> +                             mbuf = rte_pktmbuf_alloc(mq->mempool);
> +                             if (unlikely(mbuf == NULL))
> +                                     goto no_free_bufs;
> +                             mbuf->port = mq->in_port;
> +                             rte_pktmbuf_chain(mbuf_head, mbuf);
> +                     }
> +                     cp_len = RTE_MIN(dst_len, src_len);
> +
> +                     rte_pktmbuf_pkt_len(mbuf) =
> +                         rte_pktmbuf_data_len(mbuf) += cp_len;

Can you please make this two lines to prevent confusion?

Also shouldn't need to add 'cp_len' to 'mbuf_head->pkt_len'?
'rte_pktmbuf_chain' updates 'mbuf_head' but at that stage 'mbuf->pkt_len' is not
set to 'cp_len'...

<...>


    Fix in next patch.


> +             if (r->addr == NULL)
> +                     return;
> +             munmap(r->addr, r->region_size);
> +             if (r->fd > 0) {
> +                     close(r->fd);
> +                     r->fd = -1;
> +             }
> +     }
> +     rte_free(pmd->regions);
> +}
> +
> +static int
> +memif_alloc_regions(struct pmd_internals *pmd, uint8_t brn)
> +{
> +     struct memif_region *r;
> +     char shm_name[32];
> +     int i;
> +     int ret = 0;
> +
> +     r = rte_zmalloc("memif_region", sizeof(struct memif_region) * (brn + 1), 0);
> +     if (r == NULL) {
> +             MIF_LOG(ERR, "%s: Failed to allocate regions.",
> +                     rte_vdev_device_name(pmd->vdev));
> +             return -ENOMEM;
> +     }
> +
> +     pmd->regions = r;
> +     pmd->regions_num = brn + 1;
> +
> +     /*
> +      * Create shm for every region. Region 0 is reserved for descriptors.
> +      * Other regions contain buffers.
> +      */
> +     for (i = 0; i < (brn + 1); i++) {
> +             r = &pmd->regions[i];
> +
> +             r->buffer_offset = (i == 0) ? (pmd->run.num_s2m_rings +
> +                                            pmd->run.num_m2s_rings) *
> +                 (sizeof(memif_ring_t) +
> +                  sizeof(memif_desc_t) * (1 << pmd->run.log2_ring_size)) : 0;

what exactly "buffer_offset" is? For 'region 0' it calculates the size of the
size of the 'region 0', otherwise 0. This is offset starting from where? And it
seems only used for below size assignment.


    It is possible to use single region for one connection (non-zero-copy) to reduce
    the number of files opened. It will be implemented in the next patch.
    'buffer_offset' is offset from the start of shared memory file to the first packet buffer.


> +                             (1 << pmd->run.log2_ring_size) *
> +                             (pmd->run.num_s2m_rings +
> +                              pmd->run.num_m2s_rings));

There is an illusion of packet buffers can be in multiple regions (above comment
implies it) but this logic assumes all packet buffer are in same region other
than region 0, which gives us 'region 1' and this is already hardcoded in a few
places. Is there a benefit of assuming there will be more regions, will it make
simple to accept 'region 0' for rings and 'region 1' is for packet buffer?


    Multiple regions are required in case of zero-copy slave. The zero-copy will
    expose dpdk shared memory, where each memseg/memseg_list will be
    represended by memif_region.


> +int
> +memif_connect(struct rte_eth_dev *dev)
> +{
> +     struct pmd_internals *pmd = dev->data->dev_private;
> +     struct memif_region *mr;
> +     struct memif_queue *mq;
> +     int i;
> +
> +     for (i = 0; i < pmd->regions_num; i++) {
> +             mr = pmd->regions + i;
> +             if (mr != NULL) {
> +                     if (mr->addr == NULL) {
> +                             if (mr->fd < 0)
> +                                     return -1;
> +                             mr->addr = mmap(NULL, mr->region_size,
> +                                             PROT_READ | PROT_WRITE,
> +                                             MAP_SHARED, mr->fd, 0);
> +                             if (mr->addr == NULL)
> +                                     return -1;
> +                     }
> +             }
> +     }

If multiple slave is not supported, is 'id' devarg still valid, or is it used at
all?


    'id' is used to identify peer interface. Interface with id=0 will only connect to an
    interface with id=0 and so on...


<...>

> +static int
> +memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
> +          memif_interface_id_t id, uint32_t flags,
> +          const char *socket_filename,
> +          memif_log2_ring_size_t log2_ring_size,
> +          uint16_t buffer_size, const char *secret,
> +          struct ether_addr *eth_addr)
> +{
> +     int ret = 0;
> +     struct rte_eth_dev *eth_dev;
> +     struct rte_eth_dev_data *data;
> +     struct pmd_internals *pmd;
> +     const unsigned int numa_node = vdev->device.numa_node;
> +     const char *name = rte_vdev_device_name(vdev);
> +
> +     if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
> +             MIF_LOG(ERR, "Zero-copy not supported.");
> +             return -1;
> +     }

What is the plan for the zero copy?
Can you please put the status & plan to the documentation?

<...>


    I don't think that putting the status in the docs is needed. Zero-copy slave is in testing stage
    and I should be able to submit a patch once we get this one applied.


> +
> +     uint16_t last_head;                     /**< last ring head */
> +     uint16_t last_tail;                     /**< last ring tail */

ring already has head, tail variables, what is the difference in queue ones?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v5] /net: memory interface (memif)
  2019-05-02 12:35       ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  2019-05-02 12:35         ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
@ 2019-05-03  4:27         ` Honnappa Nagarahalli
  2019-05-03  4:27           ` Honnappa Nagarahalli
  2019-05-06 11:00           ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  1 sibling, 2 replies; 58+ messages in thread
From: Honnappa Nagarahalli @ 2019-05-03  4:27 UTC (permalink / raw)
  To: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco),
	Ferruh Yigit, dev, Honnappa Nagarahalli
  Cc: nd, nd

> On 3/22/2019 11:57 AM, Jakub Grajciar wrote:
> > Memory interface (memif), provides high performance packet transfer
> > over shared memory.
> >
> > Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
> 
> <...>
> 
> > @@ -0,0 +1,200 @@
> > +..  SPDX-License-Identifier: BSD-3-Clause
> > +    Copyright(c) 2018-2019 Cisco Systems, Inc.
> > +
> > +======================
> > +Memif Poll Mode Driver
> > +======================
> > +Shared memory packet interface (memif) PMD allows for DPDK and any
> > +other client using memif (DPDK, VPP, libmemif) to communicate using
> > +shared memory. Memif is Linux only.
> > +
> > +The created device transmits packets in a raw format. It can be used
> > +with Ethernet mode, IP mode, or Punt/Inject. At this moment, only
> > +Ethernet mode is supported in DPDK memif implementation.
> > +
> > +Memif works in two roles: master and slave. Slave connects to master
> > +over an existing socket. It is also a producer of shared memory file
> > +and initializes the shared memory. Master creates the socket and
> > +listens for any slave connection requests. The socket may already
> > +exist on the system. Be sure to remove any such sockets, if you are
> > +creating a master interface, or you will see an "Address already in
> > +use" error. Function ``rte_pmd_memif_remove()``,
> 
> Can it be possible to remove this existing socket on successfully termination
> of the dpdk application with 'master' role?
> Otherwise each time to run a dpdk app, requires to delete the socket file first.
> 
> 
>     net_memif is the same case as net_virtio_user, I'd use the same
> workaround.
>     testpmd.c:pmd_test_exit() this, however, is only valid for testpmd.
> 
> 
> > +which removes memif interface, will also remove a listener socket, if
> > +it is not being used by any other interface.
> > +
> > +The method to enable one or more interfaces is to use the
> > +``--vdev=net_memif0`` option on the DPDK application command line.
> > +Each ``--vdev=net_memif1`` option given will create an interface
> > +named net_memif0, net_memif1, and so on. Memif uses unix domain
> > +socket to transmit control messages. Each memif has a unique id per
> > +socket. If you are connecting multiple interfaces using same socket,
> > +be sure to specify unique ids ``id=0``, ``id=1``, etc. Note that if
> > +you assign a socket to a master interface it becomes a listener
> > +socket. Listener socket can not be used by a slave interface on same client.
> 
> When I connect two slaves, id=0 & id=1 to the master, both master and
> second slave crashed as soon as second slave connected, is this a know issue?
> Can you please check this?
> 
> master: ./build/app/testpmd -w0:0.0 -l20,21 --vdev net_memif0,role=master
> -- -i
> slave1: ./build/app/testpmd -w0:0.0 -l20,22 --file-prefix=slave --vdev
> net_memif0,role=slave,id=0 -- -i
> slave2: ./build/app/testpmd -w0:0.0 -l20,23 --file-prefix=slave2 --vdev
> net_memif0,role=slave,id=1 -- -i
> 
> <...>
> 
> 
>     Each interface can be connected to one peer interface at the same time, I'll
>     add more details to the documentation.
> 
> 
> > +Example: testpmd and testpmd
> > +----------------------------
> > +In this example we run two instances of testpmd application and transmit
> packets over memif.
> 
> How this play with multi process support? When I run a secondary app when
> memif PMD is enabled in primary, secondary process crashes.
> Can you please check and ensure at least nothing crashes when a secondary
> app run?
> 
> 
>     For now I'd disable multi-process support, so we can get the patch applied,
> then
>     provide multi-process support in a separate patch.
> 
> 
> > +
> > +First create ``master`` interface::
> > +
> > +    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1
> > + --vdev=net_memif,role=master -- -i
> > +
> > +Now create ``slave`` interface (master must be already running so the
> slave will connect)::
> > +
> > +    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2
> > + --vdev=net_memif -- -i
> > +
> > +Set forwarding mode in one of the instances to 'rx only' and the other to
> 'tx only'::
> > +
> > +    testpmd> set fwd rxonly
> > +    testpmd> start
> > +
> > +    testpmd> set fwd txonly
> > +    testpmd> start
> 
> Would it be useful to add loopback option to the PMD, for testing/debug ?
> 
> Also I am getting low performance numbers above, comparing the ring pmd
> for example, is there any performance target for the pmd?
> Same forwarding core for both testpmds, this is terrible, either something is
> wrong or I am doing something wrong, it is ~40Kpps Different forwarding
> cores for each testpmd, still low, !18Mpps
> 
> <...>
> 
> 
>     The difference between ring pmd and memif pmd is that while memif
> transfers packets,
>     ring transfers whole buffers. (Also ring can not be used process to process)
>     That means, it does not have to alloc/free buffers.
>     I did a simple test where I modified the code so the tx function will only
> free given buffers
>     and rx allocates new buffers. On my machine, this can only handle 45-
> 50Mpps.
> 
>     With that in mind, I believe that 23Mpps is fine performance. No
> performance target is
>     defined, the goal is to be as fast as possible.
Use of C11 atomics have proven to provide better performance on weakly ordered architectures (at least on Arm). IMO, C11 atomics should be used to implement the fast path functions at least. This ensures optimal performance on all supported architectures in DPDK.

> 
>     The cause of the issue, where traffic drops to 40Kpps while using the same
> core for both applications,
>     is that within the timeslice given to testpmd process, memif driver fills its
> queues,
>     but keeps polling, not giving the other process a chance to receive the
> packets.
>     Same applies to rx side, where an empty queue is polled over and over
> again. In such configuration,
>     interrupt rx-mode should be used instead of polling. Or the application
> could suspend
>     the port.
> 
> 
> > +/**
> > + * Buffer descriptor.
> > + */
> > +typedef struct __rte_packed {
> > +     uint16_t flags;                         /**< flags */
> > +#define MEMIF_DESC_FLAG_NEXT 1                       /**< is chained buffer */
> 
> Is this define used?
> 
> <...>
> 
> 
>      Yes, see eth_memif_tx() and eth_memif_rx()
> 
> 
> > +     while (n_slots && n_rx_pkts < nb_pkts) {
> > +             mbuf_head = rte_pktmbuf_alloc(mq->mempool);
> > +             if (unlikely(mbuf_head == NULL))
> > +                     goto no_free_bufs;
> > +             mbuf = mbuf_head;
> > +             mbuf->port = mq->in_port;
> > +
> > + next_slot:
> > +             s0 = cur_slot & mask;
> > +             d0 = &ring->desc[s0];
> > +
> > +             src_len = d0->length;
> > +             dst_off = 0;
> > +             src_off = 0;
> > +
> > +             do {
> > +                     dst_len = mbuf_size - dst_off;
> > +                     if (dst_len == 0) {
> > +                             dst_off = 0;
> > +                             dst_len = mbuf_size +
> > + RTE_PKTMBUF_HEADROOM;
> > +
> > +                             mbuf = rte_pktmbuf_alloc(mq->mempool);
> > +                             if (unlikely(mbuf == NULL))
> > +                                     goto no_free_bufs;
> > +                             mbuf->port = mq->in_port;
> > +                             rte_pktmbuf_chain(mbuf_head, mbuf);
> > +                     }
> > +                     cp_len = RTE_MIN(dst_len, src_len);
> > +
> > +                     rte_pktmbuf_pkt_len(mbuf) =
> > +                         rte_pktmbuf_data_len(mbuf) += cp_len;
> 
> Can you please make this two lines to prevent confusion?
> 
> Also shouldn't need to add 'cp_len' to 'mbuf_head->pkt_len'?
> 'rte_pktmbuf_chain' updates 'mbuf_head' but at that stage 'mbuf->pkt_len' is
> not set to 'cp_len'...
> 
> <...>
> 
> 
>     Fix in next patch.
> 
> 
> > +             if (r->addr == NULL)
> > +                     return;
> > +             munmap(r->addr, r->region_size);
> > +             if (r->fd > 0) {
> > +                     close(r->fd);
> > +                     r->fd = -1;
> > +             }
> > +     }
> > +     rte_free(pmd->regions);
> > +}
> > +
> > +static int
> > +memif_alloc_regions(struct pmd_internals *pmd, uint8_t brn) {
> > +     struct memif_region *r;
> > +     char shm_name[32];
> > +     int i;
> > +     int ret = 0;
> > +
> > +     r = rte_zmalloc("memif_region", sizeof(struct memif_region) * (brn + 1),
> 0);
> > +     if (r == NULL) {
> > +             MIF_LOG(ERR, "%s: Failed to allocate regions.",
> > +                     rte_vdev_device_name(pmd->vdev));
> > +             return -ENOMEM;
> > +     }
> > +
> > +     pmd->regions = r;
> > +     pmd->regions_num = brn + 1;
> > +
> > +     /*
> > +      * Create shm for every region. Region 0 is reserved for descriptors.
> > +      * Other regions contain buffers.
> > +      */
> > +     for (i = 0; i < (brn + 1); i++) {
> > +             r = &pmd->regions[i];
> > +
> > +             r->buffer_offset = (i == 0) ? (pmd->run.num_s2m_rings +
> > +                                            pmd->run.num_m2s_rings) *
> > +                 (sizeof(memif_ring_t) +
> > +                  sizeof(memif_desc_t) * (1 <<
> > + pmd->run.log2_ring_size)) : 0;
> 
> what exactly "buffer_offset" is? For 'region 0' it calculates the size of the size
> of the 'region 0', otherwise 0. This is offset starting from where? And it seems
> only used for below size assignment.
> 
> 
>     It is possible to use single region for one connection (non-zero-copy) to
> reduce
>     the number of files opened. It will be implemented in the next patch.
>     'buffer_offset' is offset from the start of shared memory file to the first
> packet buffer.
> 
> 
> > +                             (1 << pmd->run.log2_ring_size) *
> > +                             (pmd->run.num_s2m_rings +
> > +                              pmd->run.num_m2s_rings));
> 
> There is an illusion of packet buffers can be in multiple regions (above
> comment implies it) but this logic assumes all packet buffer are in same
> region other than region 0, which gives us 'region 1' and this is already
> hardcoded in a few places. Is there a benefit of assuming there will be more
> regions, will it make simple to accept 'region 0' for rings and 'region 1' is for
> packet buffer?
> 
> 
>     Multiple regions are required in case of zero-copy slave. The zero-copy will
>     expose dpdk shared memory, where each memseg/memseg_list will be
>     represended by memif_region.
> 
> 
> > +int
> > +memif_connect(struct rte_eth_dev *dev) {
> > +     struct pmd_internals *pmd = dev->data->dev_private;
> > +     struct memif_region *mr;
> > +     struct memif_queue *mq;
> > +     int i;
> > +
> > +     for (i = 0; i < pmd->regions_num; i++) {
> > +             mr = pmd->regions + i;
> > +             if (mr != NULL) {
> > +                     if (mr->addr == NULL) {
> > +                             if (mr->fd < 0)
> > +                                     return -1;
> > +                             mr->addr = mmap(NULL, mr->region_size,
> > +                                             PROT_READ | PROT_WRITE,
> > +                                             MAP_SHARED, mr->fd, 0);
> > +                             if (mr->addr == NULL)
> > +                                     return -1;
> > +                     }
> > +             }
> > +     }
> 
> If multiple slave is not supported, is 'id' devarg still valid, or is it used at all?
> 
> 
>     'id' is used to identify peer interface. Interface with id=0 will only connect
> to an
>     interface with id=0 and so on...
> 
> 
> <...>
> 
> > +static int
> > +memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
> > +          memif_interface_id_t id, uint32_t flags,
> > +          const char *socket_filename,
> > +          memif_log2_ring_size_t log2_ring_size,
> > +          uint16_t buffer_size, const char *secret,
> > +          struct ether_addr *eth_addr) {
> > +     int ret = 0;
> > +     struct rte_eth_dev *eth_dev;
> > +     struct rte_eth_dev_data *data;
> > +     struct pmd_internals *pmd;
> > +     const unsigned int numa_node = vdev->device.numa_node;
> > +     const char *name = rte_vdev_device_name(vdev);
> > +
> > +     if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
> > +             MIF_LOG(ERR, "Zero-copy not supported.");
> > +             return -1;
> > +     }
> 
> What is the plan for the zero copy?
> Can you please put the status & plan to the documentation?
> 
> <...>
> 
> 
>     I don't think that putting the status in the docs is needed. Zero-copy slave
> is in testing stage
>     and I should be able to submit a patch once we get this one applied.
> 
> 
> > +
> > +     uint16_t last_head;                     /**< last ring head */
> > +     uint16_t last_tail;                     /**< last ring tail */
> 
> ring already has head, tail variables, what is the difference in queue ones?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v5] /net: memory interface (memif)
  2019-05-03  4:27         ` Honnappa Nagarahalli
@ 2019-05-03  4:27           ` Honnappa Nagarahalli
  2019-05-06 11:00           ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  1 sibling, 0 replies; 58+ messages in thread
From: Honnappa Nagarahalli @ 2019-05-03  4:27 UTC (permalink / raw)
  To: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco),
	Ferruh Yigit, dev, Honnappa Nagarahalli
  Cc: nd, nd

> On 3/22/2019 11:57 AM, Jakub Grajciar wrote:
> > Memory interface (memif), provides high performance packet transfer
> > over shared memory.
> >
> > Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
> 
> <...>
> 
> > @@ -0,0 +1,200 @@
> > +..  SPDX-License-Identifier: BSD-3-Clause
> > +    Copyright(c) 2018-2019 Cisco Systems, Inc.
> > +
> > +======================
> > +Memif Poll Mode Driver
> > +======================
> > +Shared memory packet interface (memif) PMD allows for DPDK and any
> > +other client using memif (DPDK, VPP, libmemif) to communicate using
> > +shared memory. Memif is Linux only.
> > +
> > +The created device transmits packets in a raw format. It can be used
> > +with Ethernet mode, IP mode, or Punt/Inject. At this moment, only
> > +Ethernet mode is supported in DPDK memif implementation.
> > +
> > +Memif works in two roles: master and slave. Slave connects to master
> > +over an existing socket. It is also a producer of shared memory file
> > +and initializes the shared memory. Master creates the socket and
> > +listens for any slave connection requests. The socket may already
> > +exist on the system. Be sure to remove any such sockets, if you are
> > +creating a master interface, or you will see an "Address already in
> > +use" error. Function ``rte_pmd_memif_remove()``,
> 
> Can it be possible to remove this existing socket on successfully termination
> of the dpdk application with 'master' role?
> Otherwise each time to run a dpdk app, requires to delete the socket file first.
> 
> 
>     net_memif is the same case as net_virtio_user, I'd use the same
> workaround.
>     testpmd.c:pmd_test_exit() this, however, is only valid for testpmd.
> 
> 
> > +which removes memif interface, will also remove a listener socket, if
> > +it is not being used by any other interface.
> > +
> > +The method to enable one or more interfaces is to use the
> > +``--vdev=net_memif0`` option on the DPDK application command line.
> > +Each ``--vdev=net_memif1`` option given will create an interface
> > +named net_memif0, net_memif1, and so on. Memif uses unix domain
> > +socket to transmit control messages. Each memif has a unique id per
> > +socket. If you are connecting multiple interfaces using same socket,
> > +be sure to specify unique ids ``id=0``, ``id=1``, etc. Note that if
> > +you assign a socket to a master interface it becomes a listener
> > +socket. Listener socket can not be used by a slave interface on same client.
> 
> When I connect two slaves, id=0 & id=1 to the master, both master and
> second slave crashed as soon as second slave connected, is this a know issue?
> Can you please check this?
> 
> master: ./build/app/testpmd -w0:0.0 -l20,21 --vdev net_memif0,role=master
> -- -i
> slave1: ./build/app/testpmd -w0:0.0 -l20,22 --file-prefix=slave --vdev
> net_memif0,role=slave,id=0 -- -i
> slave2: ./build/app/testpmd -w0:0.0 -l20,23 --file-prefix=slave2 --vdev
> net_memif0,role=slave,id=1 -- -i
> 
> <...>
> 
> 
>     Each interface can be connected to one peer interface at the same time, I'll
>     add more details to the documentation.
> 
> 
> > +Example: testpmd and testpmd
> > +----------------------------
> > +In this example we run two instances of testpmd application and transmit
> packets over memif.
> 
> How this play with multi process support? When I run a secondary app when
> memif PMD is enabled in primary, secondary process crashes.
> Can you please check and ensure at least nothing crashes when a secondary
> app run?
> 
> 
>     For now I'd disable multi-process support, so we can get the patch applied,
> then
>     provide multi-process support in a separate patch.
> 
> 
> > +
> > +First create ``master`` interface::
> > +
> > +    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1
> > + --vdev=net_memif,role=master -- -i
> > +
> > +Now create ``slave`` interface (master must be already running so the
> slave will connect)::
> > +
> > +    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2
> > + --vdev=net_memif -- -i
> > +
> > +Set forwarding mode in one of the instances to 'rx only' and the other to
> 'tx only'::
> > +
> > +    testpmd> set fwd rxonly
> > +    testpmd> start
> > +
> > +    testpmd> set fwd txonly
> > +    testpmd> start
> 
> Would it be useful to add loopback option to the PMD, for testing/debug ?
> 
> Also I am getting low performance numbers above, comparing the ring pmd
> for example, is there any performance target for the pmd?
> Same forwarding core for both testpmds, this is terrible, either something is
> wrong or I am doing something wrong, it is ~40Kpps Different forwarding
> cores for each testpmd, still low, !18Mpps
> 
> <...>
> 
> 
>     The difference between ring pmd and memif pmd is that while memif
> transfers packets,
>     ring transfers whole buffers. (Also ring can not be used process to process)
>     That means, it does not have to alloc/free buffers.
>     I did a simple test where I modified the code so the tx function will only
> free given buffers
>     and rx allocates new buffers. On my machine, this can only handle 45-
> 50Mpps.
> 
>     With that in mind, I believe that 23Mpps is fine performance. No
> performance target is
>     defined, the goal is to be as fast as possible.
Use of C11 atomics have proven to provide better performance on weakly ordered architectures (at least on Arm). IMO, C11 atomics should be used to implement the fast path functions at least. This ensures optimal performance on all supported architectures in DPDK.

> 
>     The cause of the issue, where traffic drops to 40Kpps while using the same
> core for both applications,
>     is that within the timeslice given to testpmd process, memif driver fills its
> queues,
>     but keeps polling, not giving the other process a chance to receive the
> packets.
>     Same applies to rx side, where an empty queue is polled over and over
> again. In such configuration,
>     interrupt rx-mode should be used instead of polling. Or the application
> could suspend
>     the port.
> 
> 
> > +/**
> > + * Buffer descriptor.
> > + */
> > +typedef struct __rte_packed {
> > +     uint16_t flags;                         /**< flags */
> > +#define MEMIF_DESC_FLAG_NEXT 1                       /**< is chained buffer */
> 
> Is this define used?
> 
> <...>
> 
> 
>      Yes, see eth_memif_tx() and eth_memif_rx()
> 
> 
> > +     while (n_slots && n_rx_pkts < nb_pkts) {
> > +             mbuf_head = rte_pktmbuf_alloc(mq->mempool);
> > +             if (unlikely(mbuf_head == NULL))
> > +                     goto no_free_bufs;
> > +             mbuf = mbuf_head;
> > +             mbuf->port = mq->in_port;
> > +
> > + next_slot:
> > +             s0 = cur_slot & mask;
> > +             d0 = &ring->desc[s0];
> > +
> > +             src_len = d0->length;
> > +             dst_off = 0;
> > +             src_off = 0;
> > +
> > +             do {
> > +                     dst_len = mbuf_size - dst_off;
> > +                     if (dst_len == 0) {
> > +                             dst_off = 0;
> > +                             dst_len = mbuf_size +
> > + RTE_PKTMBUF_HEADROOM;
> > +
> > +                             mbuf = rte_pktmbuf_alloc(mq->mempool);
> > +                             if (unlikely(mbuf == NULL))
> > +                                     goto no_free_bufs;
> > +                             mbuf->port = mq->in_port;
> > +                             rte_pktmbuf_chain(mbuf_head, mbuf);
> > +                     }
> > +                     cp_len = RTE_MIN(dst_len, src_len);
> > +
> > +                     rte_pktmbuf_pkt_len(mbuf) =
> > +                         rte_pktmbuf_data_len(mbuf) += cp_len;
> 
> Can you please make this two lines to prevent confusion?
> 
> Also shouldn't need to add 'cp_len' to 'mbuf_head->pkt_len'?
> 'rte_pktmbuf_chain' updates 'mbuf_head' but at that stage 'mbuf->pkt_len' is
> not set to 'cp_len'...
> 
> <...>
> 
> 
>     Fix in next patch.
> 
> 
> > +             if (r->addr == NULL)
> > +                     return;
> > +             munmap(r->addr, r->region_size);
> > +             if (r->fd > 0) {
> > +                     close(r->fd);
> > +                     r->fd = -1;
> > +             }
> > +     }
> > +     rte_free(pmd->regions);
> > +}
> > +
> > +static int
> > +memif_alloc_regions(struct pmd_internals *pmd, uint8_t brn) {
> > +     struct memif_region *r;
> > +     char shm_name[32];
> > +     int i;
> > +     int ret = 0;
> > +
> > +     r = rte_zmalloc("memif_region", sizeof(struct memif_region) * (brn + 1),
> 0);
> > +     if (r == NULL) {
> > +             MIF_LOG(ERR, "%s: Failed to allocate regions.",
> > +                     rte_vdev_device_name(pmd->vdev));
> > +             return -ENOMEM;
> > +     }
> > +
> > +     pmd->regions = r;
> > +     pmd->regions_num = brn + 1;
> > +
> > +     /*
> > +      * Create shm for every region. Region 0 is reserved for descriptors.
> > +      * Other regions contain buffers.
> > +      */
> > +     for (i = 0; i < (brn + 1); i++) {
> > +             r = &pmd->regions[i];
> > +
> > +             r->buffer_offset = (i == 0) ? (pmd->run.num_s2m_rings +
> > +                                            pmd->run.num_m2s_rings) *
> > +                 (sizeof(memif_ring_t) +
> > +                  sizeof(memif_desc_t) * (1 <<
> > + pmd->run.log2_ring_size)) : 0;
> 
> what exactly "buffer_offset" is? For 'region 0' it calculates the size of the size
> of the 'region 0', otherwise 0. This is offset starting from where? And it seems
> only used for below size assignment.
> 
> 
>     It is possible to use single region for one connection (non-zero-copy) to
> reduce
>     the number of files opened. It will be implemented in the next patch.
>     'buffer_offset' is offset from the start of shared memory file to the first
> packet buffer.
> 
> 
> > +                             (1 << pmd->run.log2_ring_size) *
> > +                             (pmd->run.num_s2m_rings +
> > +                              pmd->run.num_m2s_rings));
> 
> There is an illusion of packet buffers can be in multiple regions (above
> comment implies it) but this logic assumes all packet buffer are in same
> region other than region 0, which gives us 'region 1' and this is already
> hardcoded in a few places. Is there a benefit of assuming there will be more
> regions, will it make simple to accept 'region 0' for rings and 'region 1' is for
> packet buffer?
> 
> 
>     Multiple regions are required in case of zero-copy slave. The zero-copy will
>     expose dpdk shared memory, where each memseg/memseg_list will be
>     represended by memif_region.
> 
> 
> > +int
> > +memif_connect(struct rte_eth_dev *dev) {
> > +     struct pmd_internals *pmd = dev->data->dev_private;
> > +     struct memif_region *mr;
> > +     struct memif_queue *mq;
> > +     int i;
> > +
> > +     for (i = 0; i < pmd->regions_num; i++) {
> > +             mr = pmd->regions + i;
> > +             if (mr != NULL) {
> > +                     if (mr->addr == NULL) {
> > +                             if (mr->fd < 0)
> > +                                     return -1;
> > +                             mr->addr = mmap(NULL, mr->region_size,
> > +                                             PROT_READ | PROT_WRITE,
> > +                                             MAP_SHARED, mr->fd, 0);
> > +                             if (mr->addr == NULL)
> > +                                     return -1;
> > +                     }
> > +             }
> > +     }
> 
> If multiple slave is not supported, is 'id' devarg still valid, or is it used at all?
> 
> 
>     'id' is used to identify peer interface. Interface with id=0 will only connect
> to an
>     interface with id=0 and so on...
> 
> 
> <...>
> 
> > +static int
> > +memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
> > +          memif_interface_id_t id, uint32_t flags,
> > +          const char *socket_filename,
> > +          memif_log2_ring_size_t log2_ring_size,
> > +          uint16_t buffer_size, const char *secret,
> > +          struct ether_addr *eth_addr) {
> > +     int ret = 0;
> > +     struct rte_eth_dev *eth_dev;
> > +     struct rte_eth_dev_data *data;
> > +     struct pmd_internals *pmd;
> > +     const unsigned int numa_node = vdev->device.numa_node;
> > +     const char *name = rte_vdev_device_name(vdev);
> > +
> > +     if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
> > +             MIF_LOG(ERR, "Zero-copy not supported.");
> > +             return -1;
> > +     }
> 
> What is the plan for the zero copy?
> Can you please put the status & plan to the documentation?
> 
> <...>
> 
> 
>     I don't think that putting the status in the docs is needed. Zero-copy slave
> is in testing stage
>     and I should be able to submit a patch once we get this one applied.
> 
> 
> > +
> > +     uint16_t last_head;                     /**< last ring head */
> > +     uint16_t last_tail;                     /**< last ring tail */
> 
> ring already has head, tail variables, what is the difference in queue ones?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v5] /net: memory interface (memif)
  2019-05-03  4:27         ` Honnappa Nagarahalli
  2019-05-03  4:27           ` Honnappa Nagarahalli
@ 2019-05-06 11:00           ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  2019-05-06 11:00             ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  2019-05-06 11:04             ` Damjan Marion (damarion)
  1 sibling, 2 replies; 58+ messages in thread
From: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco) @ 2019-05-06 11:00 UTC (permalink / raw)
  To: Honnappa Nagarahalli, Ferruh Yigit, dev; +Cc: nd, Damjan Marion (damarion)


________________________________________
From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Sent: Friday, May 3, 2019 6:27 AM
To: Jakub Grajciar; Ferruh Yigit; dev@dpdk.org; Honnappa Nagarahalli
Cc: nd; nd
Subject: RE: [dpdk-dev] [RFC v5] /net: memory interface (memif)

> On 3/22/2019 11:57 AM, Jakub Grajciar wrote:
> > Memory interface (memif), provides high performance packet transfer
> > over shared memory.
> >
> > Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
>

<...>

>     With that in mind, I believe that 23Mpps is fine performance. No
> performance target is
>     defined, the goal is to be as fast as possible.
Use of C11 atomics have proven to provide better performance on weakly ordered architectures (at least on Arm). IMO, C11 atomics should be used to implement the fast path functions at least. This ensures optimal performance on all supported architectures in DPDK.

    Atomics are not required by memif driver.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v5] /net: memory interface (memif)
  2019-05-06 11:00           ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
@ 2019-05-06 11:00             ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  2019-05-06 11:04             ` Damjan Marion (damarion)
  1 sibling, 0 replies; 58+ messages in thread
From: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco) @ 2019-05-06 11:00 UTC (permalink / raw)
  To: Honnappa Nagarahalli, Ferruh Yigit, dev; +Cc: nd, Damjan Marion (damarion)


________________________________________
From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Sent: Friday, May 3, 2019 6:27 AM
To: Jakub Grajciar; Ferruh Yigit; dev@dpdk.org; Honnappa Nagarahalli
Cc: nd; nd
Subject: RE: [dpdk-dev] [RFC v5] /net: memory interface (memif)

> On 3/22/2019 11:57 AM, Jakub Grajciar wrote:
> > Memory interface (memif), provides high performance packet transfer
> > over shared memory.
> >
> > Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
>

<...>

>     With that in mind, I believe that 23Mpps is fine performance. No
> performance target is
>     defined, the goal is to be as fast as possible.
Use of C11 atomics have proven to provide better performance on weakly ordered architectures (at least on Arm). IMO, C11 atomics should be used to implement the fast path functions at least. This ensures optimal performance on all supported architectures in DPDK.

    Atomics are not required by memif driver.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v5] /net: memory interface (memif)
  2019-05-06 11:00           ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  2019-05-06 11:00             ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
@ 2019-05-06 11:04             ` Damjan Marion (damarion)
  2019-05-06 11:04               ` Damjan Marion (damarion)
  2019-05-07 11:29               ` Honnappa Nagarahalli
  1 sibling, 2 replies; 58+ messages in thread
From: Damjan Marion (damarion) @ 2019-05-06 11:04 UTC (permalink / raw)
  To: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  Cc: Honnappa Nagarahalli, Ferruh Yigit, dev, nd


> On 6 May 2019, at 13:00, Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco) <jgrajcia@cisco.com> wrote:
> 
> 
> ________________________________________
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Sent: Friday, May 3, 2019 6:27 AM
> To: Jakub Grajciar; Ferruh Yigit; dev@dpdk.org; Honnappa Nagarahalli
> Cc: nd; nd
> Subject: RE: [dpdk-dev] [RFC v5] /net: memory interface (memif)
> 
>> On 3/22/2019 11:57 AM, Jakub Grajciar wrote:
>>> Memory interface (memif), provides high performance packet transfer
>>> over shared memory.
>>> 
>>> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
>> 
> 
> <...>
> 
>>    With that in mind, I believe that 23Mpps is fine performance. No
>> performance target is
>>    defined, the goal is to be as fast as possible.
> Use of C11 atomics have proven to provide better performance on weakly ordered architectures (at least on Arm). IMO, C11 atomics should be used to implement the fast path functions at least. This ensures optimal performance on all supported architectures in DPDK.
> 
>    Atomics are not required by memif driver.

Correct, only thing we need is store barrier once per batch of packets,
to make sure that descriptor changes are globally visible before we bump head pointer.

-- 
Damjan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v5] /net: memory interface (memif)
  2019-05-06 11:04             ` Damjan Marion (damarion)
@ 2019-05-06 11:04               ` Damjan Marion (damarion)
  2019-05-07 11:29               ` Honnappa Nagarahalli
  1 sibling, 0 replies; 58+ messages in thread
From: Damjan Marion (damarion) @ 2019-05-06 11:04 UTC (permalink / raw)
  To: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  Cc: Honnappa Nagarahalli, Ferruh Yigit, dev, nd


> On 6 May 2019, at 13:00, Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco) <jgrajcia@cisco.com> wrote:
> 
> 
> ________________________________________
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Sent: Friday, May 3, 2019 6:27 AM
> To: Jakub Grajciar; Ferruh Yigit; dev@dpdk.org; Honnappa Nagarahalli
> Cc: nd; nd
> Subject: RE: [dpdk-dev] [RFC v5] /net: memory interface (memif)
> 
>> On 3/22/2019 11:57 AM, Jakub Grajciar wrote:
>>> Memory interface (memif), provides high performance packet transfer
>>> over shared memory.
>>> 
>>> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
>> 
> 
> <...>
> 
>>    With that in mind, I believe that 23Mpps is fine performance. No
>> performance target is
>>    defined, the goal is to be as fast as possible.
> Use of C11 atomics have proven to provide better performance on weakly ordered architectures (at least on Arm). IMO, C11 atomics should be used to implement the fast path functions at least. This ensures optimal performance on all supported architectures in DPDK.
> 
>    Atomics are not required by memif driver.

Correct, only thing we need is store barrier once per batch of packets,
to make sure that descriptor changes are globally visible before we bump head pointer.

-- 
Damjan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v5] /net: memory interface (memif)
  2019-05-06 11:04             ` Damjan Marion (damarion)
  2019-05-06 11:04               ` Damjan Marion (damarion)
@ 2019-05-07 11:29               ` Honnappa Nagarahalli
  2019-05-07 11:29                 ` Honnappa Nagarahalli
  2019-05-07 11:37                 ` Damjan Marion (damarion)
  1 sibling, 2 replies; 58+ messages in thread
From: Honnappa Nagarahalli @ 2019-05-07 11:29 UTC (permalink / raw)
  To: Damjan Marion (damarion),
	Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  Cc: Ferruh Yigit, dev, nd, nd

> >
> >> On 3/22/2019 11:57 AM, Jakub Grajciar wrote:
> >>> Memory interface (memif), provides high performance packet transfer
> >>> over shared memory.
> >>>
> >>> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
> >>
> >
> > <...>
> >
> >>    With that in mind, I believe that 23Mpps is fine performance. No
> >> performance target is
> >>    defined, the goal is to be as fast as possible.
> > Use of C11 atomics have proven to provide better performance on weakly
> ordered architectures (at least on Arm). IMO, C11 atomics should be used to
> implement the fast path functions at least. This ensures optimal performance
> on all supported architectures in DPDK.
> >
> >    Atomics are not required by memif driver.
> 
> Correct, only thing we need is store barrier once per batch of packets, to
> make sure that descriptor changes are globally visible before we bump head
> pointer.
May be I was not clear in my comments, I meant that the use of GCC C++11 memory model aware atomic operations [1] show better performance. So, instead of using full memory barriers you can use store-release and load-acquire semantics. A similar change was done to svm_fifo data structure in VPP [2] (though the original algorithm used was different from the one used in this memif patch).

[1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
[2] https://gerrit.fd.io/r/#/c/18223/

> 
> --
> Damjan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v5] /net: memory interface (memif)
  2019-05-07 11:29               ` Honnappa Nagarahalli
@ 2019-05-07 11:29                 ` Honnappa Nagarahalli
  2019-05-07 11:37                 ` Damjan Marion (damarion)
  1 sibling, 0 replies; 58+ messages in thread
From: Honnappa Nagarahalli @ 2019-05-07 11:29 UTC (permalink / raw)
  To: Damjan Marion (damarion),
	Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  Cc: Ferruh Yigit, dev, nd, nd

> >
> >> On 3/22/2019 11:57 AM, Jakub Grajciar wrote:
> >>> Memory interface (memif), provides high performance packet transfer
> >>> over shared memory.
> >>>
> >>> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
> >>
> >
> > <...>
> >
> >>    With that in mind, I believe that 23Mpps is fine performance. No
> >> performance target is
> >>    defined, the goal is to be as fast as possible.
> > Use of C11 atomics have proven to provide better performance on weakly
> ordered architectures (at least on Arm). IMO, C11 atomics should be used to
> implement the fast path functions at least. This ensures optimal performance
> on all supported architectures in DPDK.
> >
> >    Atomics are not required by memif driver.
> 
> Correct, only thing we need is store barrier once per batch of packets, to
> make sure that descriptor changes are globally visible before we bump head
> pointer.
May be I was not clear in my comments, I meant that the use of GCC C++11 memory model aware atomic operations [1] show better performance. So, instead of using full memory barriers you can use store-release and load-acquire semantics. A similar change was done to svm_fifo data structure in VPP [2] (though the original algorithm used was different from the one used in this memif patch).

[1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
[2] https://gerrit.fd.io/r/#/c/18223/

> 
> --
> Damjan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v5] /net: memory interface (memif)
  2019-05-07 11:29               ` Honnappa Nagarahalli
  2019-05-07 11:29                 ` Honnappa Nagarahalli
@ 2019-05-07 11:37                 ` Damjan Marion (damarion)
  2019-05-07 11:37                   ` Damjan Marion (damarion)
  2019-05-08  7:53                   ` Honnappa Nagarahalli
  1 sibling, 2 replies; 58+ messages in thread
From: Damjan Marion (damarion) @ 2019-05-07 11:37 UTC (permalink / raw)
  To: Honnappa Nagarahalli
  Cc: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco),
	Ferruh Yigit, dev, nd


--
Damjan

On 7 May 2019, at 13:29, Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com<mailto:Honnappa.Nagarahalli@arm.com>> wrote:


On 3/22/2019 11:57 AM, Jakub Grajciar wrote:
Memory interface (memif), provides high performance packet transfer
over shared memory.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com<mailto:jgrajcia@cisco.com>>


<...>

  With that in mind, I believe that 23Mpps is fine performance. No
performance target is
  defined, the goal is to be as fast as possible.
Use of C11 atomics have proven to provide better performance on weakly
ordered architectures (at least on Arm). IMO, C11 atomics should be used to
implement the fast path functions at least. This ensures optimal performance
on all supported architectures in DPDK.

  Atomics are not required by memif driver.

Correct, only thing we need is store barrier once per batch of packets, to
make sure that descriptor changes are globally visible before we bump head
pointer.
May be I was not clear in my comments, I meant that the use of GCC C++11 memory model aware atomic operations [1] show better performance. So, instead of using full memory barriers you can use store-release and load-acquire semantics. A similar change was done to svm_fifo data structure in VPP [2] (though the original algorithm used was different from the one used in this memif patch).

[1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
[2] https://gerrit.fd.io/r/#/c/18223/

Typically we have hundreds of normal memory stores into the packet ring, then single store fence and then finally one more store to bump head pointer.
Sorry. I’m not getting what are you suggesting here, and how that can be faster...

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v5] /net: memory interface (memif)
  2019-05-07 11:37                 ` Damjan Marion (damarion)
@ 2019-05-07 11:37                   ` Damjan Marion (damarion)
  2019-05-08  7:53                   ` Honnappa Nagarahalli
  1 sibling, 0 replies; 58+ messages in thread
From: Damjan Marion (damarion) @ 2019-05-07 11:37 UTC (permalink / raw)
  To: Honnappa Nagarahalli
  Cc: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco),
	Ferruh Yigit, dev, nd


--
Damjan

On 7 May 2019, at 13:29, Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com<mailto:Honnappa.Nagarahalli@arm.com>> wrote:


On 3/22/2019 11:57 AM, Jakub Grajciar wrote:
Memory interface (memif), provides high performance packet transfer
over shared memory.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com<mailto:jgrajcia@cisco.com>>


<...>

  With that in mind, I believe that 23Mpps is fine performance. No
performance target is
  defined, the goal is to be as fast as possible.
Use of C11 atomics have proven to provide better performance on weakly
ordered architectures (at least on Arm). IMO, C11 atomics should be used to
implement the fast path functions at least. This ensures optimal performance
on all supported architectures in DPDK.

  Atomics are not required by memif driver.

Correct, only thing we need is store barrier once per batch of packets, to
make sure that descriptor changes are globally visible before we bump head
pointer.
May be I was not clear in my comments, I meant that the use of GCC C++11 memory model aware atomic operations [1] show better performance. So, instead of using full memory barriers you can use store-release and load-acquire semantics. A similar change was done to svm_fifo data structure in VPP [2] (though the original algorithm used was different from the one used in this memif patch).

[1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
[2] https://gerrit.fd.io/r/#/c/18223/

Typically we have hundreds of normal memory stores into the packet ring, then single store fence and then finally one more store to bump head pointer.
Sorry. I’m not getting what are you suggesting here, and how that can be faster...

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v5] /net: memory interface (memif)
  2019-05-07 11:37                 ` Damjan Marion (damarion)
  2019-05-07 11:37                   ` Damjan Marion (damarion)
@ 2019-05-08  7:53                   ` Honnappa Nagarahalli
  2019-05-08  7:53                     ` Honnappa Nagarahalli
  1 sibling, 1 reply; 58+ messages in thread
From: Honnappa Nagarahalli @ 2019-05-08  7:53 UTC (permalink / raw)
  To: Damjan Marion (damarion)
  Cc: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco),
	Ferruh Yigit, dev, nd, nd


--
Damjan


On 7 May 2019, at 13:29, Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com<mailto:Honnappa.Nagarahalli@arm.com>> wrote:



On 3/22/2019 11:57 AM, Jakub Grajciar wrote:

Memory interface (memif), provides high performance packet transfer
over shared memory.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com<mailto:jgrajcia@cisco.com>>


<...>


  With that in mind, I believe that 23Mpps is fine performance. No
performance target is
  defined, the goal is to be as fast as possible.
Use of C11 atomics have proven to provide better performance on weakly
ordered architectures (at least on Arm). IMO, C11 atomics should be used to
implement the fast path functions at least. This ensures optimal performance
on all supported architectures in DPDK.


  Atomics are not required by memif driver.

Correct, only thing we need is store barrier once per batch of packets, to
make sure that descriptor changes are globally visible before we bump head
pointer.
May be I was not clear in my comments, I meant that the use of GCC C++11 memory model aware atomic operations [1] show better performance. So, instead of using full memory barriers you can use store-release and load-acquire semantics. A similar change was done to svm_fifo data structure in VPP [2] (though the original algorithm used was different from the one used in this memif patch).

[1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
[2] https://gerrit.fd.io/r/#/c/18223/

Typically we have hundreds of normal memory stores into the packet ring, then single store fence and then finally one more store to bump head pointer.
Sorry. I’m not getting what are you suggesting here, and how that can be faster...
[Honnappa] I will get back with a patch and we can go from there

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v5] /net: memory interface (memif)
  2019-05-08  7:53                   ` Honnappa Nagarahalli
@ 2019-05-08  7:53                     ` Honnappa Nagarahalli
  0 siblings, 0 replies; 58+ messages in thread
From: Honnappa Nagarahalli @ 2019-05-08  7:53 UTC (permalink / raw)
  To: Damjan Marion (damarion)
  Cc: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco),
	Ferruh Yigit, dev, nd, nd


--
Damjan


On 7 May 2019, at 13:29, Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com<mailto:Honnappa.Nagarahalli@arm.com>> wrote:



On 3/22/2019 11:57 AM, Jakub Grajciar wrote:

Memory interface (memif), provides high performance packet transfer
over shared memory.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com<mailto:jgrajcia@cisco.com>>


<...>


  With that in mind, I believe that 23Mpps is fine performance. No
performance target is
  defined, the goal is to be as fast as possible.
Use of C11 atomics have proven to provide better performance on weakly
ordered architectures (at least on Arm). IMO, C11 atomics should be used to
implement the fast path functions at least. This ensures optimal performance
on all supported architectures in DPDK.


  Atomics are not required by memif driver.

Correct, only thing we need is store barrier once per batch of packets, to
make sure that descriptor changes are globally visible before we bump head
pointer.
May be I was not clear in my comments, I meant that the use of GCC C++11 memory model aware atomic operations [1] show better performance. So, instead of using full memory barriers you can use store-release and load-acquire semantics. A similar change was done to svm_fifo data structure in VPP [2] (though the original algorithm used was different from the one used in this memif patch).

[1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
[2] https://gerrit.fd.io/r/#/c/18223/

Typically we have hundreds of normal memory stores into the packet ring, then single store fence and then finally one more store to bump head pointer.
Sorry. I’m not getting what are you suggesting here, and how that can be faster...
[Honnappa] I will get back with a patch and we can go from there

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [dpdk-dev] [RFC v6] /net: memory interface (memif)
  2019-03-22 11:57   ` [dpdk-dev] [RFC v5] " Jakub Grajciar
  2019-03-22 11:57     ` Jakub Grajciar
  2019-03-25 20:58     ` Ferruh Yigit
@ 2019-05-09  8:30     ` Jakub Grajciar
  2019-05-09  8:30       ` Jakub Grajciar
  2019-05-13 10:45       ` [dpdk-dev] [RFC v7] " Jakub Grajciar
  2 siblings, 2 replies; 58+ messages in thread
From: Jakub Grajciar @ 2019-05-09  8:30 UTC (permalink / raw)
  To: dev; +Cc: Jakub Grajciar

Memory interface (memif), provides high performance
packet transfer over shared memory.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
---
 MAINTAINERS                                 |    6 +
 app/test-pmd/testpmd.c                      |    2 +
 config/common_base                          |    5 +
 config/common_linux                         |    1 +
 doc/guides/nics/features/memif.ini          |   14 +
 doc/guides/nics/index.rst                   |    1 +
 doc/guides/nics/memif.rst                   |  234 ++++
 doc/guides/rel_notes/release_19_05.rst      |    4 +
 drivers/net/Makefile                        |    1 +
 drivers/net/memif/Makefile                  |   30 +
 drivers/net/memif/memif.h                   |  179 +++
 drivers/net/memif/memif_socket.c            | 1105 +++++++++++++++++
 drivers/net/memif/memif_socket.h            |  105 ++
 drivers/net/memif/meson.build               |   13 +
 drivers/net/memif/rte_eth_memif.c           | 1211 +++++++++++++++++++
 drivers/net/memif/rte_eth_memif.h           |  207 ++++
 drivers/net/memif/rte_pmd_memif_version.map |    4 +
 drivers/net/meson.build                     |    1 +
 mk/rte.app.mk                               |    1 +
 19 files changed, 3124 insertions(+)
 create mode 100644 doc/guides/nics/features/memif.ini
 create mode 100644 doc/guides/nics/memif.rst
 create mode 100644 drivers/net/memif/Makefile
 create mode 100644 drivers/net/memif/memif.h
 create mode 100644 drivers/net/memif/memif_socket.c
 create mode 100644 drivers/net/memif/memif_socket.h
 create mode 100644 drivers/net/memif/meson.build
 create mode 100644 drivers/net/memif/rte_eth_memif.c
 create mode 100644 drivers/net/memif/rte_eth_memif.h
 create mode 100644 drivers/net/memif/rte_pmd_memif_version.map

requires patch: http://patchwork.dpdk.org/patch/51511/
Memif uses a unix domain socket as control channel (cc).
rte_interrupts.h is used to handle interrupts for this cc.
This patch is required, because the message received can
be a reason for disconnecting.

v3:
- coding style fixes
- documentation
- doxygen comments
- use strlcpy() instead of strncpy()
- use RTE_BUILD_BUG_ON instead of _Static_assert()
- fix meson build deps

v4:
- coding style fixes
- doc update (desc format, messaging, features, ...)
- pointer arithmetic fix
- __rte_packed and __rte_aligned instead of __attribute__()
- fixed multi-queue
- support additional archs (memfd_create syscall)

v5:
- coding style fixes
- add release notes
- remove unused dependencies
- doc update

v6:
- remove socket on termination of testpmd application
- disallow memif probing in secondary process (return error), multi-process support will be implemented in separate patch
- fix chained buffer packet length
- use single region for rings and buffers (non-zero-copy)
- doc update

diff --git a/MAINTAINERS b/MAINTAINERS
index 15d0829c5..3698c146b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -841,6 +841,12 @@ F: drivers/net/softnic/
 F: doc/guides/nics/features/softnic.ini
 F: doc/guides/nics/softnic.rst

+Memif PMD
+M: Jakub Grajciar <jgrajcia@cisco.com>
+F: drivers/net/memif/
+F: doc/guides/nics/memif.rst
+F: doc/guides/nics/features/memif.ini
+

 Crypto Drivers
 --------------
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 6fbfd29dd..9eec4c19d 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2493,6 +2493,8 @@ pmd_test_exit(void)
 			device = rte_eth_devices[pt_id].device;
 			if (device && !strcmp(device->driver->name, "net_virtio_user"))
 				detach_port_device(pt_id);
+			else if (device && !strcmp(device->driver->name, "net_memif"))
+				detach_port_device(pt_id);
 		}
 	}

diff --git a/config/common_base b/config/common_base
index 6b96e0e80..dca119a65 100644
--- a/config/common_base
+++ b/config/common_base
@@ -444,6 +444,11 @@ CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 #
 CONFIG_RTE_LIBRTE_PMD_AF_XDP=n

+#
+# Compile Memory Interface PMD driver (Linux only)
+#
+CONFIG_RTE_LIBRTE_PMD_MEMIF=n
+
 #
 # Compile link bonding PMD library
 #
diff --git a/config/common_linux b/config/common_linux
index 75334273d..87514fe4f 100644
--- a/config/common_linux
+++ b/config/common_linux
@@ -19,6 +19,7 @@ CONFIG_RTE_LIBRTE_VHOST_POSTCOPY=n
 CONFIG_RTE_LIBRTE_PMD_VHOST=y
 CONFIG_RTE_LIBRTE_IFC_PMD=y
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
+CONFIG_RTE_LIBRTE_PMD_MEMIF=y
 CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
 CONFIG_RTE_LIBRTE_PMD_TAP=y
 CONFIG_RTE_LIBRTE_AVP_PMD=y
diff --git a/doc/guides/nics/features/memif.ini b/doc/guides/nics/features/memif.ini
new file mode 100644
index 000000000..807d9ecdc
--- /dev/null
+++ b/doc/guides/nics/features/memif.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'memif' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Link status          = Y
+Basic stats          = Y
+Jumbo frame          = Y
+ARMv8                = Y
+Power8               = Y
+x86-32               = Y
+x86-64               = Y
+Usage doc            = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 2221c35f2..691e7209f 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -36,6 +36,7 @@ Network Interface Controller Drivers
     intel_vf
     kni
     liquidio
+    memif
     mlx4
     mlx5
     mvneta
diff --git a/doc/guides/nics/memif.rst b/doc/guides/nics/memif.rst
new file mode 100644
index 000000000..0d0312442
--- /dev/null
+++ b/doc/guides/nics/memif.rst
@@ -0,0 +1,234 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2018-2019 Cisco Systems, Inc.
+
+======================
+Memif Poll Mode Driver
+======================
+
+Shared memory packet interface (memif) PMD allows for DPDK and any other client
+using memif (DPDK, VPP, libmemif) to communicate using shared memory. Memif is
+Linux only.
+
+The created device transmits packets in a raw format. It can be used with
+Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
+supported in DPDK memif implementation.
+
+Memif works in two roles: master and slave. Slave connects to master over an
+existing socket. It is also a producer of shared memory file and initializes
+the shared memory. Each interface can be connected to one peer interface
+at same time. The peer interface is identified by id parameter. Master
+creates the socket and listens for any slave connection requests. The socket
+may already exist on the system. Be sure to remove any such sockets, if you
+are creating a master interface, or you will see an "Address already in use"
+error. Function ``rte_pmd_memif_remove()``, which removes memif interface,
+will also remove a listener socket, if it is not being used by any other
+interface.
+
+The method to enable one or more interfaces is to use the
+``--vdev=net_memif0`` option on the DPDK application command line. Each
+``--vdev=net_memif1`` option given will create an interface named net_memif0,
+net_memif1, and so on. Memif uses unix domain socket to transmit control
+messages. Each memif has a unique id per socket. This id is used to identify
+peer interface. If you are connecting multiple
+interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``,
+etc. Note that if you assign a socket to a master interface it becomes a
+listener socket. Listener socket can not be used by a slave interface on same
+client.
+
+.. csv-table:: **Memif configuration options**
+   :header: "Option", "Description", "Default", "Valid value"
+
+   "id=0", "Used to identify peer interface", "0", "uint32_t"
+   "role=master", "Set memif role", "slave", "master|slave"
+   "bsize=1024", "Size of single packet buffer", "2048", "uint16_t"
+   "rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", "10", "1-14"
+   "nrxq=2", "Number of RX queues", "1", "255"
+   "ntxq=2", "Number of TX queues", "1", "255"
+   "socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 256"
+   "mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef", ""
+   "secret=abc123", "Secret is an optional security option, which if specified, must be matched by peer", "", "string len 24"
+   "zero-copy=yes", "Enable/disable zero-copy slave mode", "no", "yes|no"
+
+**Connection establishment**
+
+In order to create memif connection, two memif interfaces, each in separate
+process, are needed. One interface in ``master`` role and other in
+``slave`` role. It is not possible to connect two interfaces in a single
+process. Each interface can be connected to one interface at same time,
+identified by matching id parameter.
+
+Memif driver uses unix domain socket to exchange required information between
+memif interfaces. Socket file path is specified at interface creation see
+*Memif configuration options* table above. If socket is used by ``master``
+interface, it's marked as listener socket (in scope of current process) and
+listens to connection requests from other processes. One socket can be used by
+multiple interfaces. One process can have ``slave`` and ``master`` interfaces
+at the same time, provided each role is assigned unique socket.
+
+For detailed information on memif control messages, see: net/memif/memif.h.
+
+Slave interface attempts to make a connection on assigned socket. Process
+listening on this socket will extract the connection request and create a new
+connected socket (control channel). Then it sends the 'hello' message
+(``MEMIF_MSG_TYPE_HELLO``), containing configuration boundaries. Slave interface
+adjusts its configuration accordingly, and sends 'init' message
+(``MEMIF_MSG_TYPE_INIT``). This message among others contains interface id. Driver
+uses this id to find master interface, and assigns the control channel to this
+interface. If such interface is found, 'ack' message (``MEMIF_MSG_TYPE_ACK``) is
+sent. Slave interface sends 'add region' message (``MEMIF_MSG_TYPE_ADD_REGION``) for
+every region allocated. Master responds to each of these messages with 'ack'
+message. Same behavior applies to rings. Slave sends 'add ring' message
+(``MEMIF_MSG_TYPE_ADD_RING``) for every initialized ring. Master again responds to
+each message with 'ack' message. To finalize the connection, slave interface
+sends 'connect' message (``MEMIF_MSG_TYPE_CONNECT``). Upon receiving this message
+master maps regions to its address space, initializes rings and responds with
+'connected' message (``MEMIF_MSG_TYPE_CONNECTED``). Disconnect
+(``MEMIF_MSG_TYPE_DISCONNECT``) can be sent by both master and slave interfaces at
+any time, due to driver error or if the interface is being deleted.
+
+Files
+
+- net/memif/memif.h *- control messages definitions*
+- net/memif/memif_socket.h
+- net/memif/memif_socket.c
+
+Shared memory
+~~~~~~~~~~~~~
+
+**Shared memory format**
+
+Slave is producer and master is consumer. Memory regions, are mapped shared memory files,
+created by memif slave and provided to master at connection establishment.
+Regions contain rings and buffers. Rings and buffers can also be separated into multiple
+regions. For no-zero-copy, rings and buffers are stored inside single memory
+region to reduce the number of opened files.
+
+region n (no-zero-copy):
+
++-----------------------+-------------------------------------------------------------------------+
+| Rings                 | Buffers                                                                 |
++-----------+-----------+-----------------+---+---------------------------------------------------+
+| S2M rings | M2S rings | packet buffer 0 | . | pb ((1 << pmd->run.log2_ring_size)*(s2m + m2s))-1 |
++-----------+-----------+-----------------+---+---------------------------------------------------+
+
+S2M OR M2S Rings:
+
++--------+--------+-----------------------+
+| ring 0 | ring 1 | ring num_s2m_rings - 1|
++--------+--------+-----------------------+
+
+ring 0:
+
++-------------+---------------------------------------+
+| ring header | (1 << pmd->run.log2_ring_size) * desc |
++-------------+---------------------------------------+
+
+Descriptors are assigned packet buffers in order of rings creation. If we have one ring
+in each direction and ring size is 1024, then first 1024 buffers will belong to S2M ring and
+last 1024 will belong to M2S ring. In case of zero-copy, buffers are dequeued and
+enqueued as needed.
+
+**Descriptor format**
+
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Quad|6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | |1|1| | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Word|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | |6|5| | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|0   |length                                                         |region                         |flags                          |
++----+---------------------------------------------------------------+-------------------------------+-------------------------------+
+|1   |metadata                                                       |offset                                                         |
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+**Flags field - flags (Quad Word 0, bits 0:15)**
+
++-----+--------------------+------------------------------------------------------------------------------------------------+
+|Bits |Name                |Functionality                                                                                   |
++=====+====================+================================================================================================+
+|0    |MEMIF_DESC_FLAG_NEXT|Is chained buffer. When set, the packet is divided into multiple buffers. May not be contiguous.|
++-----+--------------------+------------------------------------------------------------------------------------------------+
+
+**Region index - region (Quad Word 0, 16:31)**
+
+Index of memory region, the buffer is located in.
+
+**Data length - length (Quad Word 0, 32:63)**
+
+Length of transmitted/received data.
+
+**Data Offset - offset (Quad Word 1, 0:31)**
+
+Data start offset from memory region address. *.regions[desc->region].addr + desc->offset*
+
+**Metadata - metadata (Quad Word 1, 32:63)**
+
+Buffer metadata.
+
+Files
+
+- net/memif/memif.h *- descriptor and ring definitions*
+- net/memif/rte_eth_memif.c *- eth_memif_rx() eth_memif_tx()*
+
+Example: testpmd and testpmd
+----------------------------
+In this example we run two instances of testpmd application and transmit packets over memif.
+
+First create ``master`` interface::
+
+    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
+
+Now create ``slave`` interface (master must be already running so the slave will connect)::
+
+    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
+
+Set forwarding mode in one of the instances to 'rx only' and the other to 'tx only'::
+
+    testpmd> set fwd rxonly
+    testpmd> start
+
+    testpmd> set fwd txonly
+    testpmd> start
+
+Show status::
+
+    testpmd> show port stats 0
+
+Example: testpmd and VPP
+------------------------
+For information on how to get and run VPP please see `<https://wiki.fd.io/view/VPP>`_.
+
+Start VPP in interactive mode (should be by default). Create memif master interface in VPP::
+
+    vpp# create interface memif id 0 master no-zero-copy
+    vpp# set interface state memif0/0 up
+    vpp# set interface ip address memif0/0 192.168.1.1/24
+
+To see socket filename use show memif command::
+
+    vpp# show memif
+    sockets
+     id  listener    filename
+      0   yes (1)     /run/vpp/memif.sock
+    ...
+
+Now create memif interface by running testpmd with these command line options::
+
+    #./testpmd --vdev=net_memif,socket=/run/vpp/memif.sock -- -i
+
+Testpmd should now create memif slave interface and try to connect to master.
+In testpmd set forward option to icmpecho and start forwarding::
+
+    testpmd> set fwd icmpecho
+    testpmd> start
+
+Send ping from VPP::
+
+    vpp# ping 192.168.1.2
+    64 bytes from 192.168.1.2: icmp_seq=2 ttl=254 time=36.2918 ms
+    64 bytes from 192.168.1.2: icmp_seq=3 ttl=254 time=23.3927 ms
+    64 bytes from 192.168.1.2: icmp_seq=4 ttl=254 time=24.2975 ms
+    64 bytes from 192.168.1.2: icmp_seq=5 ttl=254 time=17.7049 ms
diff --git a/doc/guides/rel_notes/release_19_05.rst b/doc/guides/rel_notes/release_19_05.rst
index 5044ac7df..c77001cd5 100644
--- a/doc/guides/rel_notes/release_19_05.rst
+++ b/doc/guides/rel_notes/release_19_05.rst
@@ -221,6 +221,10 @@ New Features
   Added support for Intel SST-BF feature so that the distributor core is
   pinned to a high frequency core if available.

+* **Added memif PMD.**
+
+  Added the new Shared Memory Packet Interface (``memif``) PMD.
+  See the :doc:`../nics/memif` guide for more details on this new driver.

 Removed Items
 -------------
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 3a72cf38c..78cb10fc6 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -35,6 +35,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_ICE_PMD) += ice
 DIRS-$(CONFIG_RTE_LIBRTE_IPN3KE_PMD) += ipn3ke
 DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
 DIRS-$(CONFIG_RTE_LIBRTE_LIO_PMD) += liquidio
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif
 DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
 DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
 DIRS-$(CONFIG_RTE_LIBRTE_MVNETA_PMD) += mvneta
diff --git a/drivers/net/memif/Makefile b/drivers/net/memif/Makefile
new file mode 100644
index 000000000..2ae3c37a2
--- /dev/null
+++ b/drivers/net/memif/Makefile
@@ -0,0 +1,30 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_memif.a
+
+EXPORT_MAP := rte_pmd_memif_version.map
+
+LIBABIVER := 1
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+# Experimantal APIs:
+# - rte_intr_callback_unregister_pending
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool
+LDLIBS += -lrte_ethdev -lrte_kvargs
+LDLIBS += -lrte_hash
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += rte_eth_memif.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif_socket.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/memif/memif.h b/drivers/net/memif/memif.h
new file mode 100644
index 000000000..3948b1f50
--- /dev/null
+++ b/drivers/net/memif/memif.h
@@ -0,0 +1,179 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_H_
+#define _MEMIF_H_
+
+#define MEMIF_COOKIE		0x3E31F20
+#define MEMIF_VERSION_MAJOR	2
+#define MEMIF_VERSION_MINOR	0
+#define MEMIF_VERSION		((MEMIF_VERSION_MAJOR << 8) | MEMIF_VERSION_MINOR)
+#define MEMIF_NAME_SZ		32
+
+/*
+ * S2M: direction slave -> master
+ * M2S: direction master -> slave
+ */
+
+/*
+ *  Type definitions
+ */
+
+typedef enum memif_msg_type {
+	MEMIF_MSG_TYPE_NONE,
+	MEMIF_MSG_TYPE_ACK,
+	MEMIF_MSG_TYPE_HELLO,
+	MEMIF_MSG_TYPE_INIT,
+	MEMIF_MSG_TYPE_ADD_REGION,
+	MEMIF_MSG_TYPE_ADD_RING,
+	MEMIF_MSG_TYPE_CONNECT,
+	MEMIF_MSG_TYPE_CONNECTED,
+	MEMIF_MSG_TYPE_DISCONNECT,
+} memif_msg_type_t;
+
+typedef enum {
+	MEMIF_RING_S2M, /**< buffer ring in direction slave -> master */
+	MEMIF_RING_M2S, /**< buffer ring in direction master -> slave */
+} memif_ring_type_t;
+
+typedef enum {
+	MEMIF_INTERFACE_MODE_ETHERNET,
+	MEMIF_INTERFACE_MODE_IP,
+	MEMIF_INTERFACE_MODE_PUNT_INJECT,
+} memif_interface_mode_t;
+
+typedef uint16_t memif_region_index_t;
+typedef uint32_t memif_region_offset_t;
+typedef uint64_t memif_region_size_t;
+typedef uint16_t memif_ring_index_t;
+typedef uint32_t memif_interface_id_t;
+typedef uint16_t memif_version_t;
+typedef uint8_t memif_log2_ring_size_t;
+
+/*
+ *  Socket messages
+ */
+
+ /**
+  * M2S
+  * Contains master interfaces configuration.
+  */
+typedef struct __rte_packed {
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+	memif_version_t min_version; /**< lowest supported memif version */
+	memif_version_t max_version; /**< highest supported memif version */
+	memif_region_index_t max_region; /**< maximum num of regions */
+	memif_ring_index_t max_m2s_ring; /**< maximum num of M2S ring */
+	memif_ring_index_t max_s2m_ring; /**< maximum num of S2M rings */
+	memif_log2_ring_size_t max_log2_ring_size; /**< maximum ring size (as log2) */
+} memif_msg_hello_t;
+
+/**
+ * S2M
+ * Contains information required to identify interface
+ * to which the slave wants to connect.
+ */
+typedef struct __rte_packed {
+	memif_version_t version;		/**< memif version */
+	memif_interface_id_t id;		/**< interface id */
+	memif_interface_mode_t mode:8;		/**< interface mode */
+	uint8_t secret[24];			/**< optional security parameter */
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+} memif_msg_init_t;
+
+/**
+ * S2M
+ * Request master to add new shared memory region to master interface.
+ * Shared files file descriptor is passed in cmsghdr.
+ */
+typedef struct __rte_packed {
+	memif_region_index_t index;		/**< shm regions index */
+	memif_region_size_t size;		/**< shm region size */
+} memif_msg_add_region_t;
+
+/**
+ * S2M
+ * Request master to add new ring to master interface.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_MSG_ADD_RING_FLAG_S2M 1		/**< ring is in S2M direction */
+	memif_ring_index_t index;		/**< ring index */
+	memif_region_index_t region; /**< region index on which this ring is located */
+	memif_region_offset_t offset;		/**< buffer start offset */
+	memif_log2_ring_size_t log2_ring_size;	/**< ring size (log2) */
+	uint16_t private_hdr_size;		/**< used for private metadata */
+} memif_msg_add_ring_t;
+
+/**
+ * S2M
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< slave interface name */
+} memif_msg_connect_t;
+
+/**
+ * M2S
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< master interface name */
+} memif_msg_connected_t;
+
+/**
+ * S2M & M2S
+ * Disconnect interfaces.
+ */
+typedef struct __rte_packed {
+	uint32_t code;				/**< error code */
+	uint8_t string[96];			/**< disconnect reason */
+} memif_msg_disconnect_t;
+
+typedef struct __rte_packed __rte_aligned(128)
+{
+	memif_msg_type_t type:16;
+	union {
+		memif_msg_hello_t hello;
+		memif_msg_init_t init;
+		memif_msg_add_region_t add_region;
+		memif_msg_add_ring_t add_ring;
+		memif_msg_connect_t connect;
+		memif_msg_connected_t connected;
+		memif_msg_disconnect_t disconnect;
+	};
+} memif_msg_t;
+
+/*
+ *  Ring and Descriptor Layout
+ */
+
+/**
+ * Buffer descriptor.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_DESC_FLAG_NEXT 1			/**< is chained buffer */
+	memif_region_index_t region; /**< region index on which the buffer is located */
+	uint32_t length;			/**< buffer length */
+	memif_region_offset_t offset;		/**< buffer offset */
+	uint32_t metadata;
+} memif_desc_t;
+
+#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
+	uint8_t mark[0] __rte_aligned(RTE_CACHE_LINE_SIZE)
+
+typedef struct {
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline0);
+	uint32_t cookie;			/**< MEMIF_COOKIE */
+	uint16_t flags;				/**< flags */
+#define MEMIF_RING_FLAG_MASK_INT 1		/**< disable interrupt mode */
+	volatile uint16_t head;			/**< pointer to ring buffer head */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline1);
+	volatile uint16_t tail;			/**< pointer to ring buffer tail */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline2);
+	memif_desc_t desc[0];			/**< buffer descriptors */
+} memif_ring_t;
+
+#endif				/* _MEMIF_H_ */
diff --git a/drivers/net/memif/memif_socket.c b/drivers/net/memif/memif_socket.c
new file mode 100644
index 000000000..92dd34ca7
--- /dev/null
+++ b/drivers/net/memif/memif_socket.c
@@ -0,0 +1,1105 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <errno.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_hash.h>
+#include <rte_jhash.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+static void memif_intr_handler(void *arg);
+
+static ssize_t
+memif_msg_send(int fd, memif_msg_t *msg, int afd)
+{
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	struct cmsghdr *cmsg;
+	char ctl[CMSG_SPACE(sizeof(int))];
+
+	iov[0].iov_base = msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+
+	if (afd > 0) {
+		memset(&ctl, 0, sizeof(ctl));
+		mh.msg_control = ctl;
+		mh.msg_controllen = sizeof(ctl);
+		cmsg = CMSG_FIRSTHDR(&mh);
+		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+		cmsg->cmsg_level = SOL_SOCKET;
+		cmsg->cmsg_type = SCM_RIGHTS;
+		rte_memcpy(CMSG_DATA(cmsg), &afd, sizeof(int));
+	}
+
+	return sendmsg(fd, &mh, 0);
+}
+
+static int
+memif_msg_send_from_queue(struct memif_control_channel *cc)
+{
+	ssize_t size;
+	int ret = 0;
+	struct memif_msg_queue_elt *e;
+
+	e = TAILQ_FIRST(&cc->msg_queue);
+	if (e == NULL)
+		return 0;
+
+	size = memif_msg_send(cc->intr_handle.fd, &e->msg, e->fd);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(ERR, "sendmsg fail: %s.", strerror(errno));
+		ret = -1;
+	} else {
+		MIF_LOG(DEBUG, "Sent msg type %u.", e->msg.type);
+	}
+	TAILQ_REMOVE(&cc->msg_queue, e, next);
+	rte_free(e);
+
+	return ret;
+}
+
+static struct memif_msg_queue_elt *
+memif_msg_enq(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e;
+
+	e = rte_zmalloc("memif_msg", sizeof(struct memif_msg_queue_elt), 0);
+	if (e == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control message.");
+		return NULL;
+	}
+
+	e->fd = -1;
+	TAILQ_INSERT_TAIL(&cc->msg_queue, e, next);
+
+	return e;
+}
+
+void
+memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			 int err_code)
+{
+	struct memif_msg_queue_elt *e;
+	struct pmd_internals *pmd;
+	memif_msg_disconnect_t *d;
+
+	if (cc == NULL) {
+		MIF_LOG(DEBUG, "Missing control channel.");
+		return;
+	}
+
+	e = memif_msg_enq(cc);
+	if (e == NULL) {
+		MIF_LOG(WARNING, "Failed to enqueue disconnect message.");
+		return;
+	}
+
+	d = &e->msg.disconnect;
+
+	e->msg.type = MEMIF_MSG_TYPE_DISCONNECT;
+	d->code = err_code;
+
+	if (reason != NULL) {
+		strlcpy((char *)d->string, reason, sizeof(d->string));
+		if (cc->dev != NULL) {
+			pmd = cc->dev->data->dev_private;
+			strlcpy(pmd->local_disc_string, reason,
+				sizeof(pmd->local_disc_string));
+		}
+	}
+}
+
+static int
+memif_msg_enq_hello(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	memif_msg_hello_t *h;
+
+	if (e == NULL)
+		return -1;
+
+	h = &e->msg.hello;
+
+	e->msg.type = MEMIF_MSG_TYPE_HELLO;
+	h->min_version = MEMIF_VERSION;
+	h->max_version = MEMIF_VERSION;
+	h->max_s2m_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_m2s_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_region = ETH_MEMIF_MAX_REGION_NUM - 1;
+	h->max_log2_ring_size = ETH_MEMIF_MAX_LOG2_RING_SIZE;
+
+	strlcpy((char *)h->name, rte_version(), sizeof(h->name));
+
+	return 0;
+}
+
+static int
+memif_msg_receive_hello(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_hello_t *h = &msg->hello;
+
+	if (h->min_version > MEMIF_VERSION || h->max_version < MEMIF_VERSION) {
+		memif_msg_enq_disconnect(pmd->cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	/* Set parameters for active connection */
+	pmd->run.num_s2m_rings = RTE_MIN(h->max_s2m_ring + 1,
+					   pmd->cfg.num_s2m_rings);
+	pmd->run.num_m2s_rings = RTE_MIN(h->max_m2s_ring + 1,
+					   pmd->cfg.num_m2s_rings);
+	pmd->run.log2_ring_size = RTE_MIN(h->max_log2_ring_size,
+					    pmd->cfg.log2_ring_size);
+	pmd->run.pkt_buffer_size = pmd->cfg.pkt_buffer_size;
+
+	strlcpy(pmd->remote_name, (char *)h->name, sizeof(pmd->remote_name));
+
+	MIF_LOG(DEBUG, "%s: Connecting to %s.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_init(struct memif_control_channel *cc, memif_msg_t *msg)
+{
+	memif_msg_init_t *i = &msg->init;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *pmd;
+	struct rte_eth_dev *dev;
+
+	if (i->version != MEMIF_VERSION) {
+		memif_msg_enq_disconnect(cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	if (cc->socket == NULL) {
+		memif_msg_enq_disconnect(cc, "Device error", 0);
+		return -1;
+	}
+
+	/* Find device with requested ID */
+	TAILQ_FOREACH(elt, &cc->socket->dev_queue, next) {
+		dev = elt->dev;
+		pmd = dev->data->dev_private;
+		if (((pmd->flags & ETH_MEMIF_FLAG_DISABLED) == 0) &&
+		    pmd->id == i->id) {
+			/* assign control channel to device */
+			cc->dev = dev;
+			pmd->cc = cc;
+
+			if (i->mode != MEMIF_INTERFACE_MODE_ETHERNET) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Only ethernet mode supported",
+							 0);
+				return -1;
+			}
+
+			if (pmd->flags & (ETH_MEMIF_FLAG_CONNECTING |
+					   ETH_MEMIF_FLAG_CONNECTED)) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Already connected", 0);
+				return -1;
+			}
+			strlcpy(pmd->remote_name, (char *)i->name,
+				sizeof(pmd->remote_name));
+
+			if (*pmd->secret != '\0') {
+				if (*i->secret == '\0') {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Secret required", 0);
+					return -1;
+				}
+				if (strcmp(pmd->secret, (char *)i->secret) != 0) {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Incorrect secret", 0);
+					return -1;
+				}
+			}
+
+			pmd->flags |= ETH_MEMIF_FLAG_CONNECTING;
+			return 0;
+		}
+	}
+
+	/* ID not found on this socket */
+	MIF_LOG(DEBUG, "ID %u not found.", i->id);
+	memif_msg_enq_disconnect(cc, "ID not found", 0);
+	return -1;
+}
+
+static int
+memif_msg_receive_add_region(struct rte_eth_dev *dev, memif_msg_t *msg,
+			     int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_region_t *ar = &msg->add_region;
+	struct memif_region *r;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing region fd", 0);
+		return -1;
+	}
+
+	if (ar->index >= ETH_MEMIF_MAX_REGION_NUM || ar->index != pmd->regions_num ||
+			pmd->regions[ar->index] != NULL) {
+		memif_msg_enq_disconnect(pmd->cc, "Invalid region index", 0);
+		return -1;
+	}
+
+	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	r->fd = fd;
+	r->region_size = ar->size;
+	r->addr = NULL;
+
+	pmd->regions[ar->index] = r;
+	pmd->regions_num++;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_add_ring(struct rte_eth_dev *dev, memif_msg_t *msg, int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_ring_t *ar = &msg->add_ring;
+	struct memif_queue *mq;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing interrupt fd", 0);
+		return -1;
+	}
+
+	/* check if we have enough queues */
+	if (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) {
+		if (ar->index >= pmd->cfg.num_s2m_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_s2m_rings++;
+	} else {
+		if (ar->index >= pmd->cfg.num_m2s_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_m2s_rings++;
+	}
+
+	mq = (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) ?
+	    dev->data->rx_queues[ar->index] : dev->data->tx_queues[ar->index];
+
+	mq->intr_handle.fd = fd;
+	mq->log2_ring_size = ar->log2_ring_size;
+	mq->region = ar->region;
+	mq->offset = ar->offset;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connect_t *c = &msg->connect;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connected(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connected_t *c = &msg->connected;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_disconnect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_disconnect_t *d = &msg->disconnect;
+
+	memset(pmd->remote_disc_string, 0, sizeof(pmd->remote_disc_string));
+	strlcpy(pmd->remote_disc_string, (char *)d->string,
+		sizeof(pmd->remote_disc_string));
+
+	MIF_LOG(INFO, "%s: Disconnect received: %s",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_disc_string);
+
+	memset(pmd->local_disc_string, 0, 96);
+	memif_disconnect(rte_eth_dev_allocated
+			 (rte_vdev_device_name(pmd->vdev)));
+	return 0;
+}
+
+static int
+memif_msg_enq_ack(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	e->msg.type = MEMIF_MSG_TYPE_ACK;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_init(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_init_t *i = &e->msg.init;
+
+	if (e == NULL)
+		return -1;
+
+	i = &e->msg.init;
+	e->msg.type = MEMIF_MSG_TYPE_INIT;
+	i->version = MEMIF_VERSION;
+	i->id = pmd->id;
+	i->mode = MEMIF_INTERFACE_MODE_ETHERNET;
+
+	strlcpy((char *)i->name, rte_version(), sizeof(i->name));
+
+	if (*pmd->secret != '\0')
+		strlcpy((char *)i->secret, pmd->secret, sizeof(i->secret));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_region(struct rte_eth_dev *dev, uint8_t idx)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_add_region_t *ar;
+	struct memif_region *mr = pmd->regions[idx];
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_region;
+	e->msg.type = MEMIF_MSG_TYPE_ADD_REGION;
+	e->fd = mr->fd;
+	ar->index = idx;
+	ar->size = mr->region_size;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_ring(struct rte_eth_dev *dev, uint8_t idx,
+		       memif_ring_type_t type)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	struct memif_queue *mq;
+	memif_msg_add_ring_t *ar;
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_ring;
+	mq = (type == MEMIF_RING_S2M) ? dev->data->tx_queues[idx] :
+	    dev->data->rx_queues[idx];
+
+	e->msg.type = MEMIF_MSG_TYPE_ADD_RING;
+	e->fd = mq->intr_handle.fd;
+	ar->index = idx;
+	ar->offset = mq->offset;
+	ar->region = mq->region;
+	ar->log2_ring_size = mq->log2_ring_size;
+	ar->flags = (type == MEMIF_RING_S2M) ? MEMIF_MSG_ADD_RING_FLAG_S2M : 0;
+	ar->private_hdr_size = 0;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connect_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connect;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECT;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connected(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connected_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connected;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECTED;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static void
+memif_intr_unregister_handler(struct rte_intr_handle *intr_handle, void *arg)
+{
+	struct memif_msg_queue_elt *elt;
+	struct memif_control_channel *cc = arg;
+
+	/* close control channel fd */
+	close(intr_handle->fd);
+	/* clear message queue */
+	while ((elt = TAILQ_FIRST(&cc->msg_queue)) != NULL) {
+		TAILQ_REMOVE(&cc->msg_queue, elt, next);
+		rte_free(elt);
+	}
+	/* free control channel */
+	rte_free(cc);
+}
+
+void
+memif_disconnect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *elt, *next;
+	struct memif_queue *mq;
+	struct rte_intr_handle *ih;
+	int i;
+	int ret;
+
+	if (pmd->cc != NULL) {
+		/* Clear control message queue (except disconnect message if any). */
+		for (elt = TAILQ_FIRST(&pmd->cc->msg_queue); elt != NULL; elt = next) {
+			next = TAILQ_NEXT(elt, next);
+			if (elt->msg.type != MEMIF_MSG_TYPE_DISCONNECT) {
+				TAILQ_REMOVE(&pmd->cc->msg_queue, elt, next);
+				rte_free(elt);
+			}
+		}
+		/* send disconnect message (if there is any in queue) */
+		memif_msg_send_from_queue(pmd->cc);
+
+		/* at this point, there should be no more messages in queue */
+		if (TAILQ_FIRST(&pmd->cc->msg_queue) != NULL) {
+			MIF_LOG(WARNING,
+				"%s: Unexpected message(s) in message queue.",
+				rte_vdev_device_name(pmd->vdev));
+		}
+
+		ih = &pmd->cc->intr_handle;
+		if (ih->fd > 0) {
+			ret = rte_intr_callback_unregister(ih,
+							memif_intr_handler,
+							pmd->cc);
+			/*
+			 * If callback is active (disconnecting based on
+			 * received control message).
+			 */
+			if (ret == -EAGAIN) {
+				ret = rte_intr_callback_unregister_pending(ih,
+							memif_intr_handler,
+							pmd->cc,
+							memif_intr_unregister_handler);
+			} else if (ret > 0) {
+				close(ih->fd);
+				rte_free(pmd->cc);
+			}
+			pmd->cc = NULL;
+			if (ret <= 0)
+				MIF_LOG(WARNING,
+					"%s: Failed to unregister control channel callback.",
+					rte_vdev_device_name(pmd->vdev));
+		}
+	}
+
+	/* unconfig interrupts */
+	for (i = 0; i < pmd->cfg.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+	for (i = 0; i < pmd->cfg.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+
+	memif_free_regions(pmd);
+
+	/* reset connection configuration */
+	memset(&pmd->run, 0, sizeof(pmd->run));
+
+	dev->data->dev_link.link_status = ETH_LINK_DOWN;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTED;
+	MIF_LOG(DEBUG, "%s: Disconnected.", rte_vdev_device_name(pmd->vdev));
+}
+
+static int
+memif_msg_receive(struct memif_control_channel *cc)
+{
+	char ctl[CMSG_SPACE(sizeof(int)) +
+		 CMSG_SPACE(sizeof(struct ucred))] = { 0 };
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	memif_msg_t msg = { 0 };
+	ssize_t size;
+	int ret = 0;
+	struct ucred *cr __rte_unused;
+	cr = 0;
+	struct cmsghdr *cmsg;
+	int afd = -1;
+	int i;
+	struct pmd_internals *pmd;
+
+	iov[0].iov_base = (void *)&msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+	mh.msg_control = ctl;
+	mh.msg_controllen = sizeof(ctl);
+
+	size = recvmsg(cc->intr_handle.fd, &mh, 0);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(DEBUG, "Invalid message size.");
+		memif_msg_enq_disconnect(cc, "Invalid message size", 0);
+		return -1;
+	}
+	MIF_LOG(DEBUG, "Received msg type: %u.", msg.type);
+
+	cmsg = CMSG_FIRSTHDR(&mh);
+	while (cmsg) {
+		if (cmsg->cmsg_level == SOL_SOCKET) {
+			if (cmsg->cmsg_type == SCM_CREDENTIALS)
+				cr = (struct ucred *)CMSG_DATA(cmsg);
+			else if (cmsg->cmsg_type == SCM_RIGHTS)
+				afd = *(int *)CMSG_DATA(cmsg);
+		}
+		cmsg = CMSG_NXTHDR(&mh, cmsg);
+	}
+
+	if (cc->dev == NULL && msg.type != MEMIF_MSG_TYPE_INIT) {
+		MIF_LOG(DEBUG, "Unexpected message.");
+		memif_msg_enq_disconnect(cc, "Unexpected message", 0);
+		return -1;
+	}
+
+	/* get device from hash data */
+	switch (msg.type) {
+	case MEMIF_MSG_TYPE_ACK:
+		break;
+	case MEMIF_MSG_TYPE_HELLO:
+		ret = memif_msg_receive_hello(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_init_regions_and_queues(cc->dev);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_init(cc->dev);
+		if (ret < 0)
+			goto exit;
+		pmd = cc->dev->data->dev_private;
+		for (i = 0; i < pmd->regions_num; i++) {
+			ret = memif_msg_enq_add_region(cc->dev, i);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_S2M);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_M2S);
+			if (ret < 0)
+				goto exit;
+		}
+		ret = memif_msg_enq_connect(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_INIT:
+		/*
+		 * This cc does not have an interface asociated with it.
+		 * If suitable interface is found it will be assigned here.
+		 */
+		ret = memif_msg_receive_init(cc, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_REGION:
+		ret = memif_msg_receive_add_region(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_RING:
+		ret = memif_msg_receive_add_ring(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECT:
+		ret = memif_msg_receive_connect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_connected(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECTED:
+		ret = memif_msg_receive_connected(cc->dev, &msg);
+		break;
+	case MEMIF_MSG_TYPE_DISCONNECT:
+		ret = memif_msg_receive_disconnect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		break;
+	default:
+		memif_msg_enq_disconnect(cc, "Unknown message type", 0);
+		ret = -1;
+		goto exit;
+	}
+
+ exit:
+	return ret;
+}
+
+static void
+memif_intr_handler(void *arg)
+{
+	struct memif_control_channel *cc = arg;
+	int ret;
+
+	ret = memif_msg_receive(cc);
+	/* if driver failed to assign device */
+	if (cc->dev == NULL) {
+		ret = rte_intr_callback_unregister_pending(&cc->intr_handle,
+							   memif_intr_handler,
+							   cc,
+							   memif_intr_unregister_handler);
+		if (ret < 0)
+			MIF_LOG(WARNING,
+				"Failed to unregister control channel callback.");
+		return;
+	}
+	/* if memif_msg_receive failed */
+	if (ret < 0)
+		goto disconnect;
+
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto disconnect;
+
+	return;
+
+ disconnect:
+	if (cc->dev == NULL) {
+		MIF_LOG(WARNING, "eth dev not allocated");
+		return;
+	}
+	memif_disconnect(cc->dev);
+}
+
+static void
+memif_listener_handler(void *arg)
+{
+	struct memif_socket *socket = arg;
+	int sockfd;
+	int addr_len;
+	struct sockaddr_un client;
+	struct memif_control_channel *cc;
+	int ret;
+
+	addr_len = sizeof(client);
+	sockfd = accept(socket->intr_handle.fd, (struct sockaddr *)&client,
+			(socklen_t *)&addr_len);
+	if (sockfd < 0) {
+		MIF_LOG(ERR,
+			"Failed to accept connection request on socket fd %d",
+			socket->intr_handle.fd);
+		return;
+	}
+
+	MIF_LOG(DEBUG, "%s: Connection request accepted.", socket->filename);
+
+	cc = rte_zmalloc("memif-cc", sizeof(struct memif_control_channel), 0);
+	if (cc == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control channel.");
+		goto error;
+	}
+
+	cc->intr_handle.fd = sockfd;
+	cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	cc->socket = socket;
+	cc->dev = NULL;
+	TAILQ_INIT(&cc->msg_queue);
+
+	ret = rte_intr_callback_register(&cc->intr_handle, memif_intr_handler, cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to register control channel callback.");
+		goto error;
+	}
+
+	ret = memif_msg_enq_hello(cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to enqueue hello message.");
+		goto error;
+	}
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto error;
+
+	return;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (cc != NULL)
+		rte_free(cc);
+}
+
+static struct memif_socket *
+memif_socket_create(struct pmd_internals *pmd, char *key, uint8_t listener)
+{
+	struct memif_socket *sock;
+	struct sockaddr_un un;
+	int sockfd;
+	int ret;
+	int on = 1;
+
+	sock = rte_zmalloc("memif-socket", sizeof(struct memif_socket), 0);
+	if (sock == NULL) {
+		MIF_LOG(ERR, "Failed to allocate memory for memif socket");
+		return NULL;
+	}
+
+	sock->listener = listener;
+	rte_memcpy(sock->filename, key, 256);
+	TAILQ_INIT(&sock->dev_queue);
+
+	if (listener != 0) {
+		sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+		if (sockfd < 0)
+			goto error;
+
+		un.sun_family = AF_UNIX;
+		memcpy(un.sun_path, sock->filename,
+			sizeof(un.sun_path) - 1);
+
+		ret = setsockopt(sockfd, SOL_SOCKET, SO_PASSCRED, &on,
+				 sizeof(on));
+		if (ret < 0)
+			goto error;
+		ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
+		if (ret < 0)
+			goto error;
+		ret = listen(sockfd, 1);
+		if (ret < 0)
+			goto error;
+
+		MIF_LOG(DEBUG, "%s: Memif listener socket %s created.",
+			rte_vdev_device_name(pmd->vdev), sock->filename);
+
+		sock->intr_handle.fd = sockfd;
+		sock->intr_handle.type = RTE_INTR_HANDLE_EXT;
+		ret = rte_intr_callback_register(&sock->intr_handle,
+						 memif_listener_handler, sock);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to register interrupt "
+				"callback for listener socket",
+				rte_vdev_device_name(pmd->vdev));
+			return NULL;
+		}
+	}
+
+	return sock;
+
+ error:
+	MIF_LOG(ERR, "%s: Failed to setup socket %s: %s",
+		rte_vdev_device_name(pmd->vdev), key, strerror(errno));
+	if (sock != NULL)
+		rte_free(sock);
+	return NULL;
+}
+
+static struct rte_hash *
+memif_create_socket_hash(void)
+{
+	struct rte_hash_parameters params = { 0 };
+	params.name = MEMIF_SOCKET_HASH_NAME;
+	params.entries = 256;
+	params.key_len = 256;
+	params.hash_func = rte_jhash;
+	params.hash_func_init_val = 0;
+	return rte_hash_create(&params);
+}
+
+int
+memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *tmp_pmd;
+	struct rte_hash *hash;
+	int ret;
+	char key[256];
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL) {
+		hash = memif_create_socket_hash();
+		if (hash == NULL) {
+			MIF_LOG(ERR, "Failed to create memif socket hash.");
+			return -1;
+		}
+	}
+
+	memset(key, 0, 256);
+	rte_memcpy(key, socket_filename, strlen(socket_filename));
+	ret = rte_hash_lookup_data(hash, key, (void **)&socket);
+	if (ret < 0) {
+		socket = memif_socket_create(pmd, key,
+					     (pmd->role ==
+					      MEMIF_ROLE_SLAVE) ? 0 : 1);
+		if (socket == NULL)
+			return -1;
+		ret = rte_hash_add_key_data(hash, key, socket);
+		if (ret < 0) {
+			MIF_LOG(ERR, "Failed to add socket to socket hash.");
+			return ret;
+		}
+	}
+	pmd->socket_filename = socket->filename;
+
+	if (socket->listener != 0 && pmd->role == MEMIF_ROLE_SLAVE) {
+		MIF_LOG(ERR, "Socket is a listener.");
+		return -1;
+	} else if ((socket->listener == 0) && (pmd->role == MEMIF_ROLE_MASTER)) {
+		MIF_LOG(ERR, "Socket is not a listener.");
+		return -1;
+	}
+
+	TAILQ_FOREACH(elt, &socket->dev_queue, next) {
+		tmp_pmd = elt->dev->data->dev_private;
+		if (tmp_pmd->id == pmd->id) {
+			MIF_LOG(ERR, "Memif device with id %d already "
+				"exists on socket %s",
+				pmd->id, socket->filename);
+			return -1;
+		}
+	}
+
+	elt =
+	    rte_malloc("pmd-queue", sizeof(struct memif_socket_dev_list_elt),
+		       0);
+	if (elt == NULL) {
+		MIF_LOG(ERR, "%s: Failed to add device to socket device list.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+	elt->dev = dev;
+	TAILQ_INSERT_TAIL(&socket->dev_queue, elt, next);
+
+	return 0;
+}
+
+void
+memif_socket_remove_device(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt, *next;
+	struct rte_hash *hash;
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL)
+		return;
+
+	if (rte_hash_lookup_data(hash, pmd->socket_filename, (void **)&socket) <
+	    0)
+		return;
+
+	for (elt = TAILQ_FIRST(&socket->dev_queue); elt != NULL; elt = next) {
+		next = TAILQ_NEXT(elt, next);
+		if (elt->dev == dev) {
+			TAILQ_REMOVE(&socket->dev_queue, elt, next);
+			rte_free(elt);
+			pmd->socket_filename = NULL;
+		}
+	}
+
+	/* remove socket, if this was the last device using it */
+	if (TAILQ_EMPTY(&socket->dev_queue)) {
+		rte_hash_del_key(hash, socket->filename);
+		if (socket->listener) {
+			/* remove listener socket file,
+			 * so we can create new one later.
+			 */
+			remove(socket->filename);
+		}
+		rte_free(socket);
+	}
+}
+
+int
+memif_connect_master(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+	return 0;
+}
+
+int
+memif_connect_slave(struct rte_eth_dev *dev)
+{
+	int sockfd;
+	int ret;
+	struct sockaddr_un sun;
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+
+	sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+	if (sockfd < 0) {
+		MIF_LOG(ERR, "%s: Failed to open socket.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	sun.sun_family = AF_UNIX;
+
+	memcpy(sun.sun_path, pmd->socket_filename, sizeof(sun.sun_path) - 1);
+
+	ret = connect(sockfd, (struct sockaddr *)&sun,
+		      sizeof(struct sockaddr_un));
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to connect socket: %s.",
+			rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+		goto error;
+	}
+
+	MIF_LOG(DEBUG, "%s: Memif socket: %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+
+	pmd->cc = rte_zmalloc("memif-cc",
+			      sizeof(struct memif_control_channel), 0);
+	if (pmd->cc == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate control channel.",
+			rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	pmd->cc->intr_handle.fd = sockfd;
+	pmd->cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	pmd->cc->socket = NULL;
+	pmd->cc->dev = dev;
+	TAILQ_INIT(&pmd->cc->msg_queue);
+
+	ret = rte_intr_callback_register(&pmd->cc->intr_handle,
+					 memif_intr_handler, pmd->cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to register interrupt callback "
+			"for control fd", rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	return 0;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (pmd->cc != NULL) {
+		rte_free(pmd->cc);
+		pmd->cc = NULL;
+	}
+	return -1;
+}
diff --git a/drivers/net/memif/memif_socket.h b/drivers/net/memif/memif_socket.h
new file mode 100644
index 000000000..2e8970712
--- /dev/null
+++ b/drivers/net/memif/memif_socket.h
@@ -0,0 +1,105 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_SOCKET_H_
+#define _MEMIF_SOCKET_H_
+
+#include <sys/queue.h>
+
+/**
+ * Remove device from socket device list. If no device is left on the socket,
+ * remove the socket as well.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_socket_remove_device(struct rte_eth_dev *dev);
+
+/**
+ * Enqueue disconnect message to control channel message queue.
+ *
+ * @param cc
+ *   control channel
+ * @param reason
+ *   const string stating disconnect reason (96 characters)
+ * @param err_code
+ *   error code
+ */
+void memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			      int err_code);
+
+/**
+ * Initialize memif socket for specified device. If socket doesn't exist, create socket.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @param socket_filename
+ *   socket filename
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename);
+
+/**
+ * Disconnect memif device. Close control channel and shared memory.
+ *
+ * @param dev
+ *   ethernet device
+ */
+void memif_disconnect(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, enable connection establishment.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_master(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, send connection request.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_slave(struct rte_eth_dev *dev);
+
+struct memif_socket_dev_list_elt {
+	TAILQ_ENTRY(memif_socket_dev_list_elt) next;
+	struct rte_eth_dev *dev;		/**< pointer to device internals */
+	char dev_name[RTE_ETH_NAME_MAX_LEN];
+};
+
+#define MEMIF_SOCKET_HASH_NAME			"memif-sh"
+struct memif_socket {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	uint8_t listener;			/**< if not zero socket is listener */
+	char filename[256];			/**< socket filename */
+
+	TAILQ_HEAD(, memif_socket_dev_list_elt) dev_queue;
+	/**< Queue of devices using this socket */
+};
+
+/* Control message queue. */
+struct memif_msg_queue_elt {
+	TAILQ_ENTRY(memif_msg_queue_elt) next;
+	memif_msg_t msg;			/**< control message */
+	int fd;					/**< fd to be sent to peer */
+};
+
+struct memif_control_channel {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	TAILQ_HEAD(, memif_msg_queue_elt) msg_queue; /**< control message queue */
+	struct memif_socket *socket;		/**< pointer to socket */
+	struct rte_eth_dev *dev;		/**< pointer to device */
+};
+
+#endif				/* MEMIF_SOCKET_H */
diff --git a/drivers/net/memif/meson.build b/drivers/net/memif/meson.build
new file mode 100644
index 000000000..4dfe37d3a
--- /dev/null
+++ b/drivers/net/memif/meson.build
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+if host_machine.system() != 'linux'
+        build = false
+endif
+
+sources = files('rte_eth_memif.c',
+		'memif_socket.c')
+
+allow_experimental_apis = true
+
+deps += ['hash']
diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
new file mode 100644
index 000000000..c50f14f08
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.c
@@ -0,0 +1,1211 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <linux/if_ether.h>
+#include <errno.h>
+#include <sys/eventfd.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+#define ETH_MEMIF_ID_ARG		"id"
+#define ETH_MEMIF_ROLE_ARG		"role"
+#define ETH_MEMIF_PKT_BUFFER_SIZE_ARG	"bsize"
+#define ETH_MEMIF_RING_SIZE_ARG		"rsize"
+#define ETH_MEMIF_SOCKET_ARG		"socket"
+#define ETH_MEMIF_MAC_ARG		"mac"
+#define ETH_MEMIF_ZC_ARG		"zero-copy"
+#define ETH_MEMIF_SECRET_ARG		"secret"
+
+static const char *valid_arguments[] = {
+	ETH_MEMIF_ID_ARG,
+	ETH_MEMIF_ROLE_ARG,
+	ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
+	ETH_MEMIF_RING_SIZE_ARG,
+	ETH_MEMIF_SOCKET_ARG,
+	ETH_MEMIF_MAC_ARG,
+	ETH_MEMIF_ZC_ARG,
+	ETH_MEMIF_SECRET_ARG,
+	NULL
+};
+
+/* Message header to synchronize queues via IPC */
+struct ipc_queues {
+	char port_name[RTE_DEV_NAME_MAX_LEN];
+	int rxq_count;
+	int txq_count;
+	/*
+	 * The file descriptors are in the dedicated part
+	 * of the Unix message to be translated by the kernel.
+	 */
+};
+
+const char *
+memif_version(void)
+{
+	return ("memif-" RTE_STR(MEMIF_VERSION_MAJOR) "." RTE_STR(MEMIF_VERSION_MINOR));
+}
+
+static void
+memif_dev_info(struct rte_eth_dev *dev __rte_unused, struct rte_eth_dev_info *dev_info)
+{
+	dev_info->max_mac_addrs = 1;
+	dev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;
+	dev_info->max_rx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->max_tx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->min_rx_bufsize = 0;
+}
+
+static memif_ring_t *
+memif_get_ring(struct pmd_internals *pmd, memif_ring_type_t type, uint16_t ring_num)
+{
+	/* rings only in region 0 */
+	void *p = pmd->regions[0]->addr;
+	int ring_size = sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size);
+
+	p = (uint8_t *)p + (ring_num + type * pmd->run.num_s2m_rings) * ring_size;
+
+	return (memif_ring_t *)p;
+}
+
+static void *
+memif_get_buffer(struct pmd_internals *pmd, memif_desc_t *d)
+{
+	return ((uint8_t *)pmd->regions[d->region]->addr +
+		pmd->regions[d->region]->pkt_buffer_offset + d->offset);
+}
+
+static int
+memif_pktmbuf_chain(struct rte_mbuf *head, struct rte_mbuf *cur_tail,
+		    struct rte_mbuf *tail)
+{
+	/* Check for number-of-segments-overflow */
+	if (unlikely(head->nb_segs + tail->nb_segs > RTE_MBUF_MAX_NB_SEGS))
+		return -EOVERFLOW;
+
+	/* Chain 'tail' onto the old tail */
+	cur_tail->next = tail;
+
+	/* accumulate number of segments and total length. */
+	head->nb_segs = (uint16_t)(head->nb_segs + tail->nb_segs);
+
+	tail->pkt_len = tail->data_len;
+	head->pkt_len += tail->pkt_len;
+
+	return 0;
+}
+
+static uint16_t
+eth_memif_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t cur_slot, last_slot, n_slots, ring_size, mask, s0;
+	uint16_t n_rx_pkts = 0;
+	uint16_t mbuf_size = rte_pktmbuf_data_room_size(mq->mempool) -
+		RTE_PKTMBUF_HEADROOM;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf, *mbuf_head, *mbuf_tail;
+	uint64_t b;
+	ssize_t size __rte_unused;
+	uint16_t head;
+	int ret;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	/* consume interrupt */
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0)
+		size = read(mq->intr_handle.fd, &b, sizeof(b));
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	cur_slot = (type == MEMIF_RING_S2M) ? mq->last_head : mq->last_tail;
+	last_slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+	if (cur_slot == last_slot)
+		goto refill;
+	n_slots = last_slot - cur_slot;
+
+	while (n_slots && n_rx_pkts < nb_pkts) {
+		mbuf_head = rte_pktmbuf_alloc(mq->mempool);
+		if (unlikely(mbuf_head == NULL))
+			goto no_free_bufs;
+		mbuf = mbuf_head;
+		mbuf->port = mq->in_port;
+
+next_slot:
+		s0 = cur_slot & mask;
+		d0 = &ring->desc[s0];
+
+		src_len = d0->length;
+		dst_off = 0;
+		src_off = 0;
+
+		do {
+			dst_len = mbuf_size - dst_off;
+			if (dst_len == 0) {
+				dst_off = 0;
+				dst_len = mbuf_size;
+
+				/* store pointer to tail */
+				mbuf_tail = mbuf;
+				mbuf = rte_pktmbuf_alloc(mq->mempool);
+				if (unlikely(mbuf == NULL))
+					goto no_free_bufs;
+				mbuf->port = mq->in_port;
+				ret = memif_pktmbuf_chain(mbuf_head, mbuf_tail, mbuf);
+				if (unlikely(ret < 0)) {
+					MIF_LOG(ERR, "%s: number-of-segments-overflow",
+						rte_vdev_device_name(pmd->vdev));
+					rte_pktmbuf_free(mbuf);
+					goto no_free_bufs;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			rte_pktmbuf_data_len(mbuf) += cp_len;
+			rte_pktmbuf_pkt_len(mbuf) = rte_pktmbuf_data_len(mbuf);
+			if (mbuf != mbuf_head)
+				rte_pktmbuf_pkt_len(mbuf_head) += cp_len;
+
+			memcpy(rte_pktmbuf_mtod_offset(mbuf, void *, dst_off),
+			       (uint8_t *)memif_get_buffer(pmd, d0) + src_off, cp_len);
+
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+		} while (src_len);
+
+		cur_slot++;
+		n_slots--;
+
+		if (d0->flags & MEMIF_DESC_FLAG_NEXT)
+			goto next_slot;
+
+		mq->n_bytes += rte_pktmbuf_pkt_len(mbuf_head);
+		*bufs++ = mbuf_head;
+		n_rx_pkts++;
+	}
+
+no_free_bufs:
+	if (type == MEMIF_RING_S2M) {
+		rte_mb();
+		ring->tail = cur_slot;
+		mq->last_head = cur_slot;
+	} else {
+		mq->last_tail = cur_slot;
+	}
+
+refill:
+	if (type == MEMIF_RING_M2S) {
+		head = ring->head;
+		n_slots = ring_size - head + mq->last_tail;
+
+		while (n_slots--) {
+			s0 = head++ & mask;
+			d0 = &ring->desc[s0];
+			d0->length = pmd->run.pkt_buffer_size;
+		}
+		rte_mb();
+		ring->head = head;
+	}
+
+	mq->n_pkts += n_rx_pkts;
+	return n_rx_pkts;
+}
+
+static uint16_t
+eth_memif_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t slot, saved_slot, n_free, ring_size, mask, n_tx_pkts = 0;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head;
+	uint64_t a;
+	ssize_t size;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	n_free = ring->tail - mq->last_tail;
+	mq->last_tail += n_free;
+	slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+
+	if (type == MEMIF_RING_S2M)
+		n_free = ring_size - ring->head + mq->last_tail;
+	else
+		n_free = ring->head - ring->tail;
+
+	while (n_tx_pkts < nb_pkts && n_free) {
+		mbuf_head = *bufs++;
+		mbuf = mbuf_head;
+
+		saved_slot = slot;
+		d0 = &ring->desc[slot & mask];
+		dst_off = 0;
+		dst_len = (type == MEMIF_RING_S2M) ?
+			pmd->run.pkt_buffer_size : d0->length;
+
+next_in_chain:
+		src_off = 0;
+		src_len = rte_pktmbuf_data_len(mbuf);
+
+		while (src_len) {
+			if (dst_len == 0) {
+				if (n_free) {
+					slot++;
+					n_free--;
+					d0->flags |= MEMIF_DESC_FLAG_NEXT;
+					d0 = &ring->desc[slot & mask];
+					dst_off = 0;
+					dst_len = (type == MEMIF_RING_S2M) ?
+					    pmd->run.pkt_buffer_size : d0->length;
+					d0->flags = 0;
+				} else {
+					slot = saved_slot;
+					goto no_free_slots;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			memcpy((uint8_t *)memif_get_buffer(pmd, d0) + dst_off,
+			       rte_pktmbuf_mtod_offset(mbuf, void *, src_off),
+			       cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+			dst_len -= cp_len;
+
+			d0->length = dst_off;
+		}
+
+		if (rte_pktmbuf_is_contiguous(mbuf) == 0) {
+			mbuf = mbuf->next;
+			goto next_in_chain;
+		}
+
+		n_tx_pkts++;
+		slot++;
+		n_free--;
+		rte_pktmbuf_free(mbuf_head);
+	}
+
+no_free_slots:
+	rte_mb();
+	if (type == MEMIF_RING_S2M)
+		ring->head = slot;
+	else
+		ring->tail = slot;
+
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
+		a = 1;
+		size = write(mq->intr_handle.fd, &a, sizeof(a));
+		if (unlikely(size < 0)) {
+			MIF_LOG(WARNING,
+				"%s: Failed to send interrupt. %s",
+				rte_vdev_device_name(pmd->vdev), strerror(errno));
+		}
+	}
+
+	mq->n_err += nb_pkts - n_tx_pkts;
+	mq->n_pkts += n_tx_pkts;
+	return n_tx_pkts;
+}
+
+void
+memif_free_regions(struct pmd_internals *pmd)
+{
+	int i;
+	struct memif_region *r;
+
+	/* regions are allocated contiguously, so it's
+	 * enough to loop until 'pmd->regions_num'
+	 */
+	for (i = 0; i < pmd->regions_num; i++) {
+		r = pmd->regions[i];
+		if (r != NULL) {
+			if (r->addr != NULL) {
+				munmap(r->addr, r->region_size);
+				if (r->fd > 0) {
+					close(r->fd);
+					r->fd = -1;
+				}
+			}
+			rte_free(r);
+			pmd->regions[i] = NULL;
+		}
+	}
+	pmd->regions_num = 0;
+}
+
+static int
+memif_region_init_shm(struct pmd_internals *pmd, uint8_t has_buffers)
+{
+	char shm_name[ETH_MEMIF_SHM_NAME_SIZE];
+	int ret = 0;
+	struct memif_region *r;
+
+	if (pmd->regions_num >= ETH_MEMIF_MAX_REGION_NUM) {
+		MIF_LOG(ERR, "%s: Too many regions.", rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	/* calculate buffer offset */
+	r->pkt_buffer_offset = (pmd->run.num_s2m_rings + pmd->run.num_m2s_rings) *
+	    (sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size));
+
+	r->region_size = r->pkt_buffer_offset;
+	/* if region has buffers, add buffers size to region_size */
+	if (has_buffers == 1)
+		r->region_size += (uint32_t)(pmd->run.pkt_buffer_size *
+			(1 << pmd->run.log2_ring_size) *
+			(pmd->run.num_s2m_rings +
+			 pmd->run.num_m2s_rings));
+
+	memset(shm_name, 0, sizeof(char) * ETH_MEMIF_SHM_NAME_SIZE);
+	snprintf(shm_name, ETH_MEMIF_SHM_NAME_SIZE, "memif_region_%d",
+		 pmd->regions_num);
+
+	r->fd = memfd_create(shm_name, MFD_ALLOW_SEALING);
+	if (r->fd < 0) {
+		MIF_LOG(ERR, "%s: Failed to create shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		ret = -1;
+		goto error;
+	}
+
+	ret = fcntl(r->fd, F_ADD_SEALS, F_SEAL_SHRINK);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to add seals to shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		goto error;
+	}
+
+	ret = ftruncate(r->fd, r->region_size);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to truncate shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		goto error;
+	}
+
+	r->addr = mmap(NULL, r->region_size, PROT_READ |
+		       PROT_WRITE, MAP_SHARED, r->fd, 0);
+	if (r->addr == MAP_FAILED) {
+		MIF_LOG(ERR, "%s: Failed to mmap shm region: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(ret));
+		ret = -1;
+		goto error;
+	}
+
+	pmd->regions[pmd->regions_num] = r;
+	pmd->regions_num++;
+
+	return ret;
+
+error:
+	if (r->fd > 0)
+		close(r->fd);
+	r->fd = -1;
+
+	return ret;
+}
+
+static int
+memif_regions_init(struct pmd_internals *pmd)
+{
+	int ret;
+
+	/* create one buffer region */
+	ret = memif_region_init_shm(pmd, /* has buffer */ 1);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+static void
+memif_init_rings(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_ring_t *ring;
+	int i, j;
+	uint16_t slot;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = i * (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 0;
+			ring->desc[j].offset = (uint32_t)(slot * pmd->run.pkt_buffer_size);
+			ring->desc[j].length = pmd->run.pkt_buffer_size;
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = (i + pmd->run.num_s2m_rings) *
+			    (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 0;
+			ring->desc[j].offset = (uint32_t)(slot * pmd->run.pkt_buffer_size);
+			ring->desc[j].length = pmd->run.pkt_buffer_size;
+		}
+	}
+}
+
+/* called only by slave */
+static void
+memif_init_queues(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = dev->data->tx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for tx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = dev->data->rx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for rx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+}
+
+int
+memif_init_regions_and_queues(struct rte_eth_dev *dev)
+{
+	int ret;
+
+	ret = memif_regions_init(dev->data->dev_private);
+	if (ret < 0)
+		return ret;
+
+	memif_init_rings(dev);
+
+	memif_init_queues(dev);
+
+	return 0;
+}
+
+int
+memif_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_region *mr;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		mr = pmd->regions[i];
+		if (mr != NULL) {
+			if (mr->addr == NULL) {
+				if (mr->fd < 0)
+					return -1;
+				mr->addr = mmap(NULL, mr->region_size,
+						PROT_READ | PROT_WRITE,
+						MAP_SHARED, mr->fd, 0);
+				if (mr->addr == NULL)
+					return -1;
+				/* calculate packet buffer offset */
+				mr->pkt_buffer_offset = (pmd->run.num_s2m_rings +
+					pmd->run.num_m2s_rings) *
+					(sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+					(1 << pmd->run.log2_ring_size));
+			}
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
+			    mq->offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_MASTER)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
+			    mq->offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_SLAVE)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags |= ETH_MEMIF_FLAG_CONNECTED;
+	dev->data->dev_link.link_status = ETH_LINK_UP;
+	MIF_LOG(INFO, "%s: Connected.", rte_vdev_device_name(pmd->vdev));
+	return 0;
+}
+
+static int
+memif_dev_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int ret = 0;
+
+	switch (pmd->role) {
+	case MEMIF_ROLE_SLAVE:
+		ret = memif_connect_slave(dev);
+		break;
+	case MEMIF_ROLE_MASTER:
+		ret = memif_connect_master(dev);
+		break;
+	default:
+		MIF_LOG(ERR, "%s: Unknown role: %d.",
+			rte_vdev_device_name(pmd->vdev), pmd->role);
+		ret = -1;
+		break;
+	}
+
+	return ret;
+}
+
+static int
+memif_dev_configure(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	/*
+	 * SLAVE - TXQ
+	 * MASTER - RXQ
+	 */
+	pmd->cfg.num_s2m_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_tx_queues : dev->data->nb_rx_queues;
+
+	/*
+	 * SLAVE - RXQ
+	 * MASTER - TXQ
+	 */
+	pmd->cfg.num_m2s_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_rx_queues : dev->data->nb_tx_queues;
+
+	return 0;
+}
+
+static int
+memif_tx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_tx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("tx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate tx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type =
+	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_S2M : MEMIF_RING_M2S;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->pmd = pmd;
+	dev->data->tx_queues[qid] = mq;
+
+	return 0;
+}
+
+static int
+memif_rx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_rx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_rxconf *rx_conf __rte_unused,
+		     struct rte_mempool *mb_pool)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	printf("Setting up rx queue...\n");
+
+	mq = rte_zmalloc("rx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate rx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type = (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_M2S : MEMIF_RING_S2M;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->mempool = mb_pool;
+	mq->in_port = dev->data->port_id;
+	mq->pmd = pmd;
+	dev->data->rx_queues[qid] = mq;
+
+	return 0;
+}
+
+static void
+memif_queue_release(void *queue)
+{
+	struct memif_queue *q = (struct memif_queue *)queue;
+
+	if (!q)
+		return;
+
+	rte_free(q);
+}
+
+static int
+memif_link_update(struct rte_eth_dev *dev __rte_unused,
+		  int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
+static int
+memif_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+	uint8_t tmp, nq;
+
+	stats->ipackets = 0;
+	stats->ibytes = 0;
+	stats->opackets = 0;
+	stats->obytes = 0;
+	stats->oerrors = 0;
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_s2m_rings :
+	    pmd->run.num_m2s_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* RX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->rx_queues[i];
+		stats->q_ipackets[i] = mq->n_pkts;
+		stats->q_ibytes[i] = mq->n_bytes;
+		stats->ipackets += mq->n_pkts;
+		stats->ibytes += mq->n_bytes;
+	}
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_m2s_rings :
+	    pmd->run.num_s2m_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* TX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->tx_queues[i];
+		stats->q_opackets[i] = mq->n_pkts;
+		stats->q_obytes[i] = mq->n_bytes;
+		stats->q_errors[i] = mq->n_err;
+		stats->opackets += mq->n_pkts;
+		stats->obytes += mq->n_bytes;
+		stats->oerrors += mq->n_err;
+	}
+	return 0;
+}
+
+static void
+memif_stats_reset(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int i;
+	struct memif_queue *mq;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->tx_queues[i] :
+		    dev->data->rx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->rx_queues[i] :
+		    dev->data->tx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+}
+
+static int
+memif_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	MIF_LOG(WARNING, "%s: Interrupt mode not supported.",
+		rte_vdev_device_name(pmd->vdev));
+
+	return -1;
+}
+
+static int
+memif_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd __rte_unused = dev->data->dev_private;
+
+	return 0;
+}
+
+static const struct eth_dev_ops ops = {
+	.dev_start = memif_dev_start,
+	.dev_infos_get = memif_dev_info,
+	.dev_configure = memif_dev_configure,
+	.tx_queue_setup = memif_tx_queue_setup,
+	.rx_queue_setup = memif_rx_queue_setup,
+	.rx_queue_release = memif_queue_release,
+	.tx_queue_release = memif_queue_release,
+	.rx_queue_intr_enable = memif_rx_queue_intr_enable,
+	.rx_queue_intr_disable = memif_rx_queue_intr_disable,
+	.link_update = memif_link_update,
+	.stats_get = memif_stats_get,
+	.stats_reset = memif_stats_reset,
+};
+
+static int
+memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
+	     memif_interface_id_t id, uint32_t flags,
+	     const char *socket_filename,
+	     memif_log2_ring_size_t log2_ring_size,
+	     uint16_t pkt_buffer_size, const char *secret,
+	     struct ether_addr *ether_addr)
+{
+	int ret = 0;
+	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev_data *data;
+	struct pmd_internals *pmd;
+	const unsigned int numa_node = vdev->device.numa_node;
+	const char *name = rte_vdev_device_name(vdev);
+
+	if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
+		MIF_LOG(ERR, "Zero-copy slave not supported.");
+		return -1;
+	}
+
+	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
+	if (eth_dev == NULL) {
+		MIF_LOG(ERR, "%s: Unable to allocate device struct.", name);
+		return -1;
+	}
+
+	pmd = eth_dev->data->dev_private;
+	memset(pmd, 0, sizeof(*pmd));
+
+	pmd->vdev = vdev;
+	pmd->id = id;
+	pmd->flags = flags;
+	pmd->flags |= ETH_MEMIF_FLAG_DISABLED;
+	pmd->role = role;
+	printf("%s\n", role == MEMIF_ROLE_SLAVE ? "SLAVE" : "MASTER");
+	ret = memif_socket_init(eth_dev, socket_filename);
+	if (ret < 0)
+		return ret;
+
+	memset(pmd->secret, 0, sizeof(char) * 24);
+	if (secret != NULL)
+		strlcpy(pmd->secret, secret, sizeof(pmd->secret));
+
+	pmd->cfg.log2_ring_size = log2_ring_size;
+	/* set in .dev_configure() */
+	pmd->cfg.num_s2m_rings = 0;
+	pmd->cfg.num_m2s_rings = 0;
+
+	pmd->cfg.pkt_buffer_size = pkt_buffer_size;
+
+	data = eth_dev->data;
+	data->dev_private = pmd;
+	data->numa_node = numa_node;
+	data->mac_addrs = ether_addr;
+
+	eth_dev->dev_ops = &ops;
+	eth_dev->device = &vdev->device;
+	eth_dev->rx_pkt_burst = eth_memif_rx;
+	eth_dev->tx_pkt_burst = eth_memif_tx;
+
+	rte_eth_dev_probing_finish(eth_dev);
+
+	return 0;
+}
+
+static int
+memif_set_role(const char *key __rte_unused, const char *value,
+	       void *extra_args)
+{
+	enum memif_role_t *role = (enum memif_role_t *)extra_args;
+
+	if (strstr(value, "master") != NULL) {
+		*role = MEMIF_ROLE_MASTER;
+	} else if (strstr(value, "slave") != NULL) {
+		*role = MEMIF_ROLE_SLAVE;
+	} else {
+		MIF_LOG(ERR, "Unknown role: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_zc(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	uint32_t *flags = (uint32_t *)extra_args;
+
+	if (strstr(value, "yes") != NULL) {
+		*flags |= ETH_MEMIF_FLAG_ZERO_COPY;
+	} else if (strstr(value, "no") != NULL) {
+		*flags &= ~ETH_MEMIF_FLAG_ZERO_COPY;
+	} else {
+		MIF_LOG(ERR, "Failed to parse zero-copy param: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_id(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	memif_interface_id_t *id = (memif_interface_id_t *)extra_args;
+
+	/* even if parsing fails, 0 is a valid id */
+	*id = strtoul(value, NULL, 10);
+	return 0;
+}
+
+static int
+memif_set_bs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	uint16_t *pkt_buffer_size = (uint16_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > 0xFFFF) {
+		MIF_LOG(ERR, "Invalid buffer size: %s.", value);
+		return -EINVAL;
+	}
+	*pkt_buffer_size = tmp;
+	return 0;
+}
+
+static int
+memif_set_rs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	memif_log2_ring_size_t *log2_ring_size =
+	    (memif_log2_ring_size_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > ETH_MEMIF_MAX_LOG2_RING_SIZE) {
+		MIF_LOG(ERR, "Invalid ring size: %s (max %u).",
+			value, ETH_MEMIF_MAX_LOG2_RING_SIZE);
+		return -EINVAL;
+	}
+	*log2_ring_size = tmp;
+	return 0;
+}
+
+/* check if directory exists and if we have permission to read/write */
+static int
+memif_check_socket_filename(const char *filename)
+{
+	char *dir = NULL, *tmp;
+	uint32_t idx;
+	int ret = 0;
+
+	tmp = strrchr(filename, '/');
+	if (tmp != NULL) {
+		idx = tmp - filename;
+		dir = rte_zmalloc("memif_tmp", sizeof(char) * (idx + 1), 0);
+		if (dir == NULL) {
+			MIF_LOG(ERR, "Failed to allocate memory.");
+			return -1;
+		}
+		strlcpy(dir, filename, sizeof(char) * (idx + 1));
+	}
+
+	if (dir == NULL || (faccessat(-1, dir, F_OK | R_OK |
+					W_OK, AT_EACCESS) < 0)) {
+		MIF_LOG(ERR, "Invalid directory: '%s'.", dir);
+		ret = -EINVAL;
+	}
+
+	if (dir != NULL)
+		rte_free(dir);
+
+	return ret;
+}
+
+static int
+memif_set_socket_filename(const char *key __rte_unused, const char *value,
+			  void *extra_args)
+{
+	const char **socket_filename = (const char **)extra_args;
+
+	*socket_filename = value;
+	return memif_check_socket_filename(*socket_filename);
+}
+
+static int
+memif_set_mac(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	struct ether_addr *ether_addr = (struct ether_addr *)extra_args;
+	int ret = 0;
+
+	ret = sscanf(value, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
+	       &ether_addr->addr_bytes[0], &ether_addr->addr_bytes[1],
+	       &ether_addr->addr_bytes[2], &ether_addr->addr_bytes[3],
+	       &ether_addr->addr_bytes[4], &ether_addr->addr_bytes[5]);
+	if (ret != 6)
+		MIF_LOG(WARNING, "Failed to parse mac '%s'.", value);
+	return 0;
+}
+
+static int
+memif_set_secret(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	const char **secret = (const char **)extra_args;
+
+	*secret = value;
+	return 0;
+}
+
+static int
+rte_pmd_memif_probe(struct rte_vdev_device *vdev)
+{
+	RTE_BUILD_BUG_ON(sizeof(memif_msg_t) != 128);
+	RTE_BUILD_BUG_ON(sizeof(memif_desc_t) != 16);
+	int ret = 0;
+	struct rte_kvargs *kvlist;
+	const char *name = rte_vdev_device_name(vdev);
+	enum memif_role_t role = MEMIF_ROLE_SLAVE;
+	memif_interface_id_t id = 0;
+	uint16_t pkt_buffer_size = ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE;
+	memif_log2_ring_size_t log2_ring_size = ETH_MEMIF_DEFAULT_RING_SIZE;
+	const char *socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME;
+	uint32_t flags = 0;
+	const char *secret = NULL;
+	struct ether_addr *ether_addr = rte_zmalloc("", sizeof(struct ether_addr), 0);
+
+	eth_random_addr(ether_addr->addr_bytes);
+
+	MIF_LOG(INFO, "Initialize MEMIF: %s.", name);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		MIF_LOG(ERR, "Multi-processing not supported for memif.");
+		/* TODO:
+		 * Request connection information.
+		 *
+		 * Once memif in the primary process is connected,
+		 * broadcast connection information.
+		 */
+		return -1;
+	}
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_arguments);
+
+	/* parse parameters */
+	if (kvlist != NULL) {
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ROLE_ARG,
+					 &memif_set_role, &role);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ID_ARG,
+					 &memif_set_id, &id);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
+					 &memif_set_bs, &pkt_buffer_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_RING_SIZE_ARG,
+					 &memif_set_rs, &log2_ring_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SOCKET_ARG,
+					 &memif_set_socket_filename,
+					 (void *)(&socket_filename));
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_MAC_ARG,
+					 &memif_set_mac, ether_addr);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ZC_ARG,
+					 &memif_set_zc, &flags);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SECRET_ARG,
+					 &memif_set_secret, (void *)(&secret));
+		if (ret < 0)
+			goto exit;
+	}
+
+	/* create interface */
+	ret = memif_create(vdev, role, id, flags, socket_filename,
+			   log2_ring_size, pkt_buffer_size, secret, ether_addr);
+
+exit:
+	if (kvlist != NULL)
+		rte_kvargs_free(kvlist);
+	return ret;
+}
+
+static int
+rte_pmd_memif_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_eth_dev *eth_dev;
+	struct pmd_internals *pmd;
+
+	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+	if (eth_dev == NULL)
+		return 0;
+
+	pmd = eth_dev->data->dev_private;
+
+	memif_msg_enq_disconnect(pmd->cc, "Device removed", 0);
+	memif_disconnect(eth_dev);
+
+	memif_socket_remove_device(eth_dev);
+
+	pmd->vdev = NULL;
+
+	rte_eth_dev_release_port(eth_dev);
+
+	return 0;
+}
+
+static struct rte_vdev_driver pmd_memif_drv = {
+	.probe = rte_pmd_memif_probe,
+	.remove = rte_pmd_memif_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_memif, pmd_memif_drv);
+
+RTE_PMD_REGISTER_PARAM_STRING(net_memif,
+			      ETH_MEMIF_ID_ARG "=<int>"
+			      ETH_MEMIF_ROLE_ARG "=master|slave"
+			      ETH_MEMIF_PKT_BUFFER_SIZE_ARG "=<int>"
+			      ETH_MEMIF_RING_SIZE_ARG "=<int>"
+			      ETH_MEMIF_SOCKET_ARG "=<string>"
+			      ETH_MEMIF_MAC_ARG "=xx:xx:xx:xx:xx:xx"
+			      ETH_MEMIF_ZC_ARG "=yes|no"
+			      ETH_MEMIF_SECRET_ARG "=<string>");
+
+int memif_logtype;
+
+RTE_INIT(memif_init_log)
+{
+	memif_logtype = rte_log_register("pmd.net.memif");
+	if (memif_logtype >= 0)
+		rte_log_set_level(memif_logtype, RTE_LOG_NOTICE);
+}
diff --git a/drivers/net/memif/rte_eth_memif.h b/drivers/net/memif/rte_eth_memif.h
new file mode 100644
index 000000000..e640c975f
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.h
@@ -0,0 +1,207 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _RTE_ETH_MEMIF_H_
+#define _RTE_ETH_MEMIF_H_
+
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE
+#endif				/* GNU_SOURCE */
+
+#include <sys/queue.h>
+
+#include <rte_ethdev_driver.h>
+#include <rte_ether.h>
+#include <rte_interrupts.h>
+
+#include "memif.h"
+
+#define ETH_MEMIF_DEFAULT_SOCKET_FILENAME	"/tmp/memif.sock"
+#define ETH_MEMIF_DEFAULT_RING_SIZE		10
+#define ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE	2048
+
+#define ETH_MEMIF_MAX_NUM_Q_PAIRS		255
+#define ETH_MEMIF_MAX_LOG2_RING_SIZE		14
+#define ETH_MEMIF_MAX_REGION_NUM		256
+
+#define ETH_MEMIF_SHM_NAME_SIZE			32
+
+extern int memif_logtype;
+
+#define MIF_LOG(level, fmt, args...) \
+	rte_log(RTE_LOG_ ## level, memif_logtype, \
+		"%s(): " fmt "\n", __func__, ##args)
+
+enum memif_role_t {
+	MEMIF_ROLE_MASTER,
+	MEMIF_ROLE_SLAVE,
+};
+
+struct memif_region {
+	void *addr;				/**< shared memory address */
+	memif_region_size_t region_size;	/**< shared memory size */
+	int fd;					/**< shared memory file descriptor */
+	uint32_t pkt_buffer_offset;
+	/**< offset from 'addr' to first packet buffer */
+};
+
+struct memif_queue {
+	struct rte_mempool *mempool;		/**< mempool for RX packets */
+	uint16_t in_port;			/**< port id */
+
+	struct pmd_internals *pmd;		/**< device internals */
+
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+
+	/* ring info */
+	memif_ring_type_t type;			/**< ring type */
+	memif_ring_t *ring;			/**< pointer to ring */
+	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
+
+	memif_region_index_t region;		/**< shared memory region index */
+	memif_region_offset_t offset;
+	/**< ring offset from start of shm region (ring - memif_region.addr) */
+
+	uint16_t last_head;			/**< last ring head */
+	uint16_t last_tail;			/**< last ring tail */
+
+	/* rx/tx info */
+	uint64_t n_pkts;			/**< number of rx/tx packets */
+	uint64_t n_bytes;			/**< number of rx/tx bytes */
+	uint64_t n_err;				/**< number of tx errors */
+};
+
+struct pmd_internals {
+	memif_interface_id_t id;		/**< unique id */
+	enum memif_role_t role;			/**< device role */
+	uint32_t flags;				/**< device status flags */
+#define ETH_MEMIF_FLAG_CONNECTING	(1 << 0)
+/**< device is connecting */
+#define ETH_MEMIF_FLAG_CONNECTED	(1 << 1)
+/**< device is connected */
+#define ETH_MEMIF_FLAG_ZERO_COPY	(1 << 2)
+/**< device is zero-copy enabled */
+#define ETH_MEMIF_FLAG_DISABLED		(1 << 3)
+/**< device has not been configured and can not accept connection requests */
+
+	char *socket_filename;			/**< pointer to socket filename */
+	char secret[24]; /**< secret (optional security parameter) */
+
+	struct memif_control_channel *cc;	/**< control channel */
+
+	struct memif_region *regions[ETH_MEMIF_MAX_REGION_NUM];
+	/**< shared memory regions */
+	memif_region_index_t regions_num;	/**< number of regions */
+
+	/* remote info */
+	char remote_name[64];			/**< remote app name */
+	char remote_if_name[64];		/**< remote peer name */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t pkt_buffer_size;	/**< buffer size */
+	} cfg;					/**< Configured parameters (max values) */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t pkt_buffer_size;	/**< buffer size */
+	} run;
+	/**< Parameters used in active connection */
+
+	char local_disc_string[96];		/**< local disconnect reason */
+	char remote_disc_string[96];		/**< remote disconnect reason */
+
+	struct rte_vdev_device *vdev;		/**< vdev handle */
+};
+
+/**
+ * Unmap shared memory and free regions from memory.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_free_regions(struct pmd_internals *pmd);
+
+/**
+ * Finalize connection establishment process. Map shared memory file
+ * (master role), initialize ring queue, set link status up.
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect(struct rte_eth_dev *dev);
+
+/**
+ * Create shared memory file and initialize ring queue.
+ * Only called by slave when establishing connection
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_init_regions_and_queues(struct rte_eth_dev *dev);
+
+/**
+ * Get memif version string.
+ *
+ * @return
+ *   - memif version string
+ */
+const char *memif_version(void);
+
+#ifndef MFD_HUGETLB
+#ifndef __NR_memfd_create
+
+#if defined __x86_64__
+#define __NR_memfd_create 319
+#elif defined __x86_32__
+#define __NR_memfd_create 1073742143
+#elif defined __arm__
+#define __NR_memfd_create 385
+#elif defined __aarch64__
+#define __NR_memfd_create 279
+#elif defined __powerpc__
+#define __NR_memfd_create 360
+#elif defined __i386__
+#define __NR_memfd_create 356
+#else
+#error "__NR_memfd_create unknown for this architecture"
+#endif
+
+#endif				/* __NR_memfd_create */
+
+static inline int memfd_create(const char *name, unsigned int flags)
+{
+	return syscall(__NR_memfd_create, name, flags);
+}
+#endif				/* MFD_HUGETLB */
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef MFD_ALLOW_SEALING
+#define MFD_ALLOW_SEALING       0x0002U
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL     0x0001	/* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002	/* prevent file from shrinking */
+#define F_SEAL_GROW     0x0004	/* prevent file from growing */
+#define F_SEAL_WRITE    0x0008	/* prevent writes */
+#endif
+
+#endif				/* RTE_ETH_MEMIF_H */
diff --git a/drivers/net/memif/rte_pmd_memif_version.map b/drivers/net/memif/rte_pmd_memif_version.map
new file mode 100644
index 000000000..4b2e62193
--- /dev/null
+++ b/drivers/net/memif/rte_pmd_memif_version.map
@@ -0,0 +1,4 @@
+DPDK_19.05 {
+
+        local: *;
+};
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index ed99896c3..b57073483 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -24,6 +24,7 @@ drivers = ['af_packet',
 	'ixgbe',
 	'kni',
 	'liquidio',
+	'memif',
 	'mlx4',
 	'mlx5',
 	'mvneta',
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 7c9b4b538..326b2a30b 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -174,6 +174,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KNI)        += -lrte_pmd_kni
 endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD)        += -lrte_pmd_lio
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF)      += -lrte_pmd_memif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -lrte_pmd_mlx4
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += -lrte_pmd_mlx5 -lmnl
 ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
--
2.17.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [dpdk-dev] [RFC v6] /net: memory interface (memif)
  2019-05-09  8:30     ` [dpdk-dev] [RFC v6] " Jakub Grajciar
@ 2019-05-09  8:30       ` Jakub Grajciar
  2019-05-13 10:45       ` [dpdk-dev] [RFC v7] " Jakub Grajciar
  1 sibling, 0 replies; 58+ messages in thread
From: Jakub Grajciar @ 2019-05-09  8:30 UTC (permalink / raw)
  To: dev; +Cc: Jakub Grajciar

Memory interface (memif), provides high performance
packet transfer over shared memory.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
---
 MAINTAINERS                                 |    6 +
 app/test-pmd/testpmd.c                      |    2 +
 config/common_base                          |    5 +
 config/common_linux                         |    1 +
 doc/guides/nics/features/memif.ini          |   14 +
 doc/guides/nics/index.rst                   |    1 +
 doc/guides/nics/memif.rst                   |  234 ++++
 doc/guides/rel_notes/release_19_05.rst      |    4 +
 drivers/net/Makefile                        |    1 +
 drivers/net/memif/Makefile                  |   30 +
 drivers/net/memif/memif.h                   |  179 +++
 drivers/net/memif/memif_socket.c            | 1105 +++++++++++++++++
 drivers/net/memif/memif_socket.h            |  105 ++
 drivers/net/memif/meson.build               |   13 +
 drivers/net/memif/rte_eth_memif.c           | 1211 +++++++++++++++++++
 drivers/net/memif/rte_eth_memif.h           |  207 ++++
 drivers/net/memif/rte_pmd_memif_version.map |    4 +
 drivers/net/meson.build                     |    1 +
 mk/rte.app.mk                               |    1 +
 19 files changed, 3124 insertions(+)
 create mode 100644 doc/guides/nics/features/memif.ini
 create mode 100644 doc/guides/nics/memif.rst
 create mode 100644 drivers/net/memif/Makefile
 create mode 100644 drivers/net/memif/memif.h
 create mode 100644 drivers/net/memif/memif_socket.c
 create mode 100644 drivers/net/memif/memif_socket.h
 create mode 100644 drivers/net/memif/meson.build
 create mode 100644 drivers/net/memif/rte_eth_memif.c
 create mode 100644 drivers/net/memif/rte_eth_memif.h
 create mode 100644 drivers/net/memif/rte_pmd_memif_version.map

requires patch: http://patchwork.dpdk.org/patch/51511/
Memif uses a unix domain socket as control channel (cc).
rte_interrupts.h is used to handle interrupts for this cc.
This patch is required, because the message received can
be a reason for disconnecting.

v3:
- coding style fixes
- documentation
- doxygen comments
- use strlcpy() instead of strncpy()
- use RTE_BUILD_BUG_ON instead of _Static_assert()
- fix meson build deps

v4:
- coding style fixes
- doc update (desc format, messaging, features, ...)
- pointer arithmetic fix
- __rte_packed and __rte_aligned instead of __attribute__()
- fixed multi-queue
- support additional archs (memfd_create syscall)

v5:
- coding style fixes
- add release notes
- remove unused dependencies
- doc update

v6:
- remove socket on termination of testpmd application
- disallow memif probing in secondary process (return error), multi-process support will be implemented in separate patch
- fix chained buffer packet length
- use single region for rings and buffers (non-zero-copy)
- doc update

diff --git a/MAINTAINERS b/MAINTAINERS
index 15d0829c5..3698c146b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -841,6 +841,12 @@ F: drivers/net/softnic/
 F: doc/guides/nics/features/softnic.ini
 F: doc/guides/nics/softnic.rst

+Memif PMD
+M: Jakub Grajciar <jgrajcia@cisco.com>
+F: drivers/net/memif/
+F: doc/guides/nics/memif.rst
+F: doc/guides/nics/features/memif.ini
+

 Crypto Drivers
 --------------
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 6fbfd29dd..9eec4c19d 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2493,6 +2493,8 @@ pmd_test_exit(void)
 			device = rte_eth_devices[pt_id].device;
 			if (device && !strcmp(device->driver->name, "net_virtio_user"))
 				detach_port_device(pt_id);
+			else if (device && !strcmp(device->driver->name, "net_memif"))
+				detach_port_device(pt_id);
 		}
 	}

diff --git a/config/common_base b/config/common_base
index 6b96e0e80..dca119a65 100644
--- a/config/common_base
+++ b/config/common_base
@@ -444,6 +444,11 @@ CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 #
 CONFIG_RTE_LIBRTE_PMD_AF_XDP=n

+#
+# Compile Memory Interface PMD driver (Linux only)
+#
+CONFIG_RTE_LIBRTE_PMD_MEMIF=n
+
 #
 # Compile link bonding PMD library
 #
diff --git a/config/common_linux b/config/common_linux
index 75334273d..87514fe4f 100644
--- a/config/common_linux
+++ b/config/common_linux
@@ -19,6 +19,7 @@ CONFIG_RTE_LIBRTE_VHOST_POSTCOPY=n
 CONFIG_RTE_LIBRTE_PMD_VHOST=y
 CONFIG_RTE_LIBRTE_IFC_PMD=y
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
+CONFIG_RTE_LIBRTE_PMD_MEMIF=y
 CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
 CONFIG_RTE_LIBRTE_PMD_TAP=y
 CONFIG_RTE_LIBRTE_AVP_PMD=y
diff --git a/doc/guides/nics/features/memif.ini b/doc/guides/nics/features/memif.ini
new file mode 100644
index 000000000..807d9ecdc
--- /dev/null
+++ b/doc/guides/nics/features/memif.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'memif' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Link status          = Y
+Basic stats          = Y
+Jumbo frame          = Y
+ARMv8                = Y
+Power8               = Y
+x86-32               = Y
+x86-64               = Y
+Usage doc            = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 2221c35f2..691e7209f 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -36,6 +36,7 @@ Network Interface Controller Drivers
     intel_vf
     kni
     liquidio
+    memif
     mlx4
     mlx5
     mvneta
diff --git a/doc/guides/nics/memif.rst b/doc/guides/nics/memif.rst
new file mode 100644
index 000000000..0d0312442
--- /dev/null
+++ b/doc/guides/nics/memif.rst
@@ -0,0 +1,234 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2018-2019 Cisco Systems, Inc.
+
+======================
+Memif Poll Mode Driver
+======================
+
+Shared memory packet interface (memif) PMD allows for DPDK and any other client
+using memif (DPDK, VPP, libmemif) to communicate using shared memory. Memif is
+Linux only.
+
+The created device transmits packets in a raw format. It can be used with
+Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
+supported in DPDK memif implementation.
+
+Memif works in two roles: master and slave. Slave connects to master over an
+existing socket. It is also a producer of shared memory file and initializes
+the shared memory. Each interface can be connected to one peer interface
+at same time. The peer interface is identified by id parameter. Master
+creates the socket and listens for any slave connection requests. The socket
+may already exist on the system. Be sure to remove any such sockets, if you
+are creating a master interface, or you will see an "Address already in use"
+error. Function ``rte_pmd_memif_remove()``, which removes memif interface,
+will also remove a listener socket, if it is not being used by any other
+interface.
+
+The method to enable one or more interfaces is to use the
+``--vdev=net_memif0`` option on the DPDK application command line. Each
+``--vdev=net_memif1`` option given will create an interface named net_memif0,
+net_memif1, and so on. Memif uses unix domain socket to transmit control
+messages. Each memif has a unique id per socket. This id is used to identify
+peer interface. If you are connecting multiple
+interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``,
+etc. Note that if you assign a socket to a master interface it becomes a
+listener socket. Listener socket can not be used by a slave interface on same
+client.
+
+.. csv-table:: **Memif configuration options**
+   :header: "Option", "Description", "Default", "Valid value"
+
+   "id=0", "Used to identify peer interface", "0", "uint32_t"
+   "role=master", "Set memif role", "slave", "master|slave"
+   "bsize=1024", "Size of single packet buffer", "2048", "uint16_t"
+   "rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", "10", "1-14"
+   "nrxq=2", "Number of RX queues", "1", "255"
+   "ntxq=2", "Number of TX queues", "1", "255"
+   "socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 256"
+   "mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef", ""
+   "secret=abc123", "Secret is an optional security option, which if specified, must be matched by peer", "", "string len 24"
+   "zero-copy=yes", "Enable/disable zero-copy slave mode", "no", "yes|no"
+
+**Connection establishment**
+
+In order to create memif connection, two memif interfaces, each in separate
+process, are needed. One interface in ``master`` role and other in
+``slave`` role. It is not possible to connect two interfaces in a single
+process. Each interface can be connected to one interface at same time,
+identified by matching id parameter.
+
+Memif driver uses unix domain socket to exchange required information between
+memif interfaces. Socket file path is specified at interface creation see
+*Memif configuration options* table above. If socket is used by ``master``
+interface, it's marked as listener socket (in scope of current process) and
+listens to connection requests from other processes. One socket can be used by
+multiple interfaces. One process can have ``slave`` and ``master`` interfaces
+at the same time, provided each role is assigned unique socket.
+
+For detailed information on memif control messages, see: net/memif/memif.h.
+
+Slave interface attempts to make a connection on assigned socket. Process
+listening on this socket will extract the connection request and create a new
+connected socket (control channel). Then it sends the 'hello' message
+(``MEMIF_MSG_TYPE_HELLO``), containing configuration boundaries. Slave interface
+adjusts its configuration accordingly, and sends 'init' message
+(``MEMIF_MSG_TYPE_INIT``). This message among others contains interface id. Driver
+uses this id to find master interface, and assigns the control channel to this
+interface. If such interface is found, 'ack' message (``MEMIF_MSG_TYPE_ACK``) is
+sent. Slave interface sends 'add region' message (``MEMIF_MSG_TYPE_ADD_REGION``) for
+every region allocated. Master responds to each of these messages with 'ack'
+message. Same behavior applies to rings. Slave sends 'add ring' message
+(``MEMIF_MSG_TYPE_ADD_RING``) for every initialized ring. Master again responds to
+each message with 'ack' message. To finalize the connection, slave interface
+sends 'connect' message (``MEMIF_MSG_TYPE_CONNECT``). Upon receiving this message
+master maps regions to its address space, initializes rings and responds with
+'connected' message (``MEMIF_MSG_TYPE_CONNECTED``). Disconnect
+(``MEMIF_MSG_TYPE_DISCONNECT``) can be sent by both master and slave interfaces at
+any time, due to driver error or if the interface is being deleted.
+
+Files
+
+- net/memif/memif.h *- control messages definitions*
+- net/memif/memif_socket.h
+- net/memif/memif_socket.c
+
+Shared memory
+~~~~~~~~~~~~~
+
+**Shared memory format**
+
+Slave is producer and master is consumer. Memory regions, are mapped shared memory files,
+created by memif slave and provided to master at connection establishment.
+Regions contain rings and buffers. Rings and buffers can also be separated into multiple
+regions. For no-zero-copy, rings and buffers are stored inside single memory
+region to reduce the number of opened files.
+
+region n (no-zero-copy):
+
++-----------------------+-------------------------------------------------------------------------+
+| Rings                 | Buffers                                                                 |
++-----------+-----------+-----------------+---+---------------------------------------------------+
+| S2M rings | M2S rings | packet buffer 0 | . | pb ((1 << pmd->run.log2_ring_size)*(s2m + m2s))-1 |
++-----------+-----------+-----------------+---+---------------------------------------------------+
+
+S2M OR M2S Rings:
+
++--------+--------+-----------------------+
+| ring 0 | ring 1 | ring num_s2m_rings - 1|
++--------+--------+-----------------------+
+
+ring 0:
+
++-------------+---------------------------------------+
+| ring header | (1 << pmd->run.log2_ring_size) * desc |
++-------------+---------------------------------------+
+
+Descriptors are assigned packet buffers in order of rings creation. If we have one ring
+in each direction and ring size is 1024, then first 1024 buffers will belong to S2M ring and
+last 1024 will belong to M2S ring. In case of zero-copy, buffers are dequeued and
+enqueued as needed.
+
+**Descriptor format**
+
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Quad|6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | |1|1| | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Word|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | |6|5| | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|0   |length                                                         |region                         |flags                          |
++----+---------------------------------------------------------------+-------------------------------+-------------------------------+
+|1   |metadata                                                       |offset                                                         |
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+**Flags field - flags (Quad Word 0, bits 0:15)**
+
++-----+--------------------+------------------------------------------------------------------------------------------------+
+|Bits |Name                |Functionality                                                                                   |
++=====+====================+================================================================================================+
+|0    |MEMIF_DESC_FLAG_NEXT|Is chained buffer. When set, the packet is divided into multiple buffers. May not be contiguous.|
++-----+--------------------+------------------------------------------------------------------------------------------------+
+
+**Region index - region (Quad Word 0, 16:31)**
+
+Index of memory region, the buffer is located in.
+
+**Data length - length (Quad Word 0, 32:63)**
+
+Length of transmitted/received data.
+
+**Data Offset - offset (Quad Word 1, 0:31)**
+
+Data start offset from memory region address. *.regions[desc->region].addr + desc->offset*
+
+**Metadata - metadata (Quad Word 1, 32:63)**
+
+Buffer metadata.
+
+Files
+
+- net/memif/memif.h *- descriptor and ring definitions*
+- net/memif/rte_eth_memif.c *- eth_memif_rx() eth_memif_tx()*
+
+Example: testpmd and testpmd
+----------------------------
+In this example we run two instances of testpmd application and transmit packets over memif.
+
+First create ``master`` interface::
+
+    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
+
+Now create ``slave`` interface (master must be already running so the slave will connect)::
+
+    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
+
+Set forwarding mode in one of the instances to 'rx only' and the other to 'tx only'::
+
+    testpmd> set fwd rxonly
+    testpmd> start
+
+    testpmd> set fwd txonly
+    testpmd> start
+
+Show status::
+
+    testpmd> show port stats 0
+
+Example: testpmd and VPP
+------------------------
+For information on how to get and run VPP please see `<https://wiki.fd.io/view/VPP>`_.
+
+Start VPP in interactive mode (should be by default). Create memif master interface in VPP::
+
+    vpp# create interface memif id 0 master no-zero-copy
+    vpp# set interface state memif0/0 up
+    vpp# set interface ip address memif0/0 192.168.1.1/24
+
+To see socket filename use show memif command::
+
+    vpp# show memif
+    sockets
+     id  listener    filename
+      0   yes (1)     /run/vpp/memif.sock
+    ...
+
+Now create memif interface by running testpmd with these command line options::
+
+    #./testpmd --vdev=net_memif,socket=/run/vpp/memif.sock -- -i
+
+Testpmd should now create memif slave interface and try to connect to master.
+In testpmd set forward option to icmpecho and start forwarding::
+
+    testpmd> set fwd icmpecho
+    testpmd> start
+
+Send ping from VPP::
+
+    vpp# ping 192.168.1.2
+    64 bytes from 192.168.1.2: icmp_seq=2 ttl=254 time=36.2918 ms
+    64 bytes from 192.168.1.2: icmp_seq=3 ttl=254 time=23.3927 ms
+    64 bytes from 192.168.1.2: icmp_seq=4 ttl=254 time=24.2975 ms
+    64 bytes from 192.168.1.2: icmp_seq=5 ttl=254 time=17.7049 ms
diff --git a/doc/guides/rel_notes/release_19_05.rst b/doc/guides/rel_notes/release_19_05.rst
index 5044ac7df..c77001cd5 100644
--- a/doc/guides/rel_notes/release_19_05.rst
+++ b/doc/guides/rel_notes/release_19_05.rst
@@ -221,6 +221,10 @@ New Features
   Added support for Intel SST-BF feature so that the distributor core is
   pinned to a high frequency core if available.

+* **Added memif PMD.**
+
+  Added the new Shared Memory Packet Interface (``memif``) PMD.
+  See the :doc:`../nics/memif` guide for more details on this new driver.

 Removed Items
 -------------
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 3a72cf38c..78cb10fc6 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -35,6 +35,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_ICE_PMD) += ice
 DIRS-$(CONFIG_RTE_LIBRTE_IPN3KE_PMD) += ipn3ke
 DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
 DIRS-$(CONFIG_RTE_LIBRTE_LIO_PMD) += liquidio
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif
 DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
 DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
 DIRS-$(CONFIG_RTE_LIBRTE_MVNETA_PMD) += mvneta
diff --git a/drivers/net/memif/Makefile b/drivers/net/memif/Makefile
new file mode 100644
index 000000000..2ae3c37a2
--- /dev/null
+++ b/drivers/net/memif/Makefile
@@ -0,0 +1,30 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_memif.a
+
+EXPORT_MAP := rte_pmd_memif_version.map
+
+LIBABIVER := 1
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+# Experimantal APIs:
+# - rte_intr_callback_unregister_pending
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool
+LDLIBS += -lrte_ethdev -lrte_kvargs
+LDLIBS += -lrte_hash
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += rte_eth_memif.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif_socket.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/memif/memif.h b/drivers/net/memif/memif.h
new file mode 100644
index 000000000..3948b1f50
--- /dev/null
+++ b/drivers/net/memif/memif.h
@@ -0,0 +1,179 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_H_
+#define _MEMIF_H_
+
+#define MEMIF_COOKIE		0x3E31F20
+#define MEMIF_VERSION_MAJOR	2
+#define MEMIF_VERSION_MINOR	0
+#define MEMIF_VERSION		((MEMIF_VERSION_MAJOR << 8) | MEMIF_VERSION_MINOR)
+#define MEMIF_NAME_SZ		32
+
+/*
+ * S2M: direction slave -> master
+ * M2S: direction master -> slave
+ */
+
+/*
+ *  Type definitions
+ */
+
+typedef enum memif_msg_type {
+	MEMIF_MSG_TYPE_NONE,
+	MEMIF_MSG_TYPE_ACK,
+	MEMIF_MSG_TYPE_HELLO,
+	MEMIF_MSG_TYPE_INIT,
+	MEMIF_MSG_TYPE_ADD_REGION,
+	MEMIF_MSG_TYPE_ADD_RING,
+	MEMIF_MSG_TYPE_CONNECT,
+	MEMIF_MSG_TYPE_CONNECTED,
+	MEMIF_MSG_TYPE_DISCONNECT,
+} memif_msg_type_t;
+
+typedef enum {
+	MEMIF_RING_S2M, /**< buffer ring in direction slave -> master */
+	MEMIF_RING_M2S, /**< buffer ring in direction master -> slave */
+} memif_ring_type_t;
+
+typedef enum {
+	MEMIF_INTERFACE_MODE_ETHERNET,
+	MEMIF_INTERFACE_MODE_IP,
+	MEMIF_INTERFACE_MODE_PUNT_INJECT,
+} memif_interface_mode_t;
+
+typedef uint16_t memif_region_index_t;
+typedef uint32_t memif_region_offset_t;
+typedef uint64_t memif_region_size_t;
+typedef uint16_t memif_ring_index_t;
+typedef uint32_t memif_interface_id_t;
+typedef uint16_t memif_version_t;
+typedef uint8_t memif_log2_ring_size_t;
+
+/*
+ *  Socket messages
+ */
+
+ /**
+  * M2S
+  * Contains master interfaces configuration.
+  */
+typedef struct __rte_packed {
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+	memif_version_t min_version; /**< lowest supported memif version */
+	memif_version_t max_version; /**< highest supported memif version */
+	memif_region_index_t max_region; /**< maximum num of regions */
+	memif_ring_index_t max_m2s_ring; /**< maximum num of M2S ring */
+	memif_ring_index_t max_s2m_ring; /**< maximum num of S2M rings */
+	memif_log2_ring_size_t max_log2_ring_size; /**< maximum ring size (as log2) */
+} memif_msg_hello_t;
+
+/**
+ * S2M
+ * Contains information required to identify interface
+ * to which the slave wants to connect.
+ */
+typedef struct __rte_packed {
+	memif_version_t version;		/**< memif version */
+	memif_interface_id_t id;		/**< interface id */
+	memif_interface_mode_t mode:8;		/**< interface mode */
+	uint8_t secret[24];			/**< optional security parameter */
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+} memif_msg_init_t;
+
+/**
+ * S2M
+ * Request master to add new shared memory region to master interface.
+ * Shared files file descriptor is passed in cmsghdr.
+ */
+typedef struct __rte_packed {
+	memif_region_index_t index;		/**< shm regions index */
+	memif_region_size_t size;		/**< shm region size */
+} memif_msg_add_region_t;
+
+/**
+ * S2M
+ * Request master to add new ring to master interface.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_MSG_ADD_RING_FLAG_S2M 1		/**< ring is in S2M direction */
+	memif_ring_index_t index;		/**< ring index */
+	memif_region_index_t region; /**< region index on which this ring is located */
+	memif_region_offset_t offset;		/**< buffer start offset */
+	memif_log2_ring_size_t log2_ring_size;	/**< ring size (log2) */
+	uint16_t private_hdr_size;		/**< used for private metadata */
+} memif_msg_add_ring_t;
+
+/**
+ * S2M
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< slave interface name */
+} memif_msg_connect_t;
+
+/**
+ * M2S
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< master interface name */
+} memif_msg_connected_t;
+
+/**
+ * S2M & M2S
+ * Disconnect interfaces.
+ */
+typedef struct __rte_packed {
+	uint32_t code;				/**< error code */
+	uint8_t string[96];			/**< disconnect reason */
+} memif_msg_disconnect_t;
+
+typedef struct __rte_packed __rte_aligned(128)
+{
+	memif_msg_type_t type:16;
+	union {
+		memif_msg_hello_t hello;
+		memif_msg_init_t init;
+		memif_msg_add_region_t add_region;
+		memif_msg_add_ring_t add_ring;
+		memif_msg_connect_t connect;
+		memif_msg_connected_t connected;
+		memif_msg_disconnect_t disconnect;
+	};
+} memif_msg_t;
+
+/*
+ *  Ring and Descriptor Layout
+ */
+
+/**
+ * Buffer descriptor.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_DESC_FLAG_NEXT 1			/**< is chained buffer */
+	memif_region_index_t region; /**< region index on which the buffer is located */
+	uint32_t length;			/**< buffer length */
+	memif_region_offset_t offset;		/**< buffer offset */
+	uint32_t metadata;
+} memif_desc_t;
+
+#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
+	uint8_t mark[0] __rte_aligned(RTE_CACHE_LINE_SIZE)
+
+typedef struct {
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline0);
+	uint32_t cookie;			/**< MEMIF_COOKIE */
+	uint16_t flags;				/**< flags */
+#define MEMIF_RING_FLAG_MASK_INT 1		/**< disable interrupt mode */
+	volatile uint16_t head;			/**< pointer to ring buffer head */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline1);
+	volatile uint16_t tail;			/**< pointer to ring buffer tail */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline2);
+	memif_desc_t desc[0];			/**< buffer descriptors */
+} memif_ring_t;
+
+#endif				/* _MEMIF_H_ */
diff --git a/drivers/net/memif/memif_socket.c b/drivers/net/memif/memif_socket.c
new file mode 100644
index 000000000..92dd34ca7
--- /dev/null
+++ b/drivers/net/memif/memif_socket.c
@@ -0,0 +1,1105 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <errno.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_hash.h>
+#include <rte_jhash.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+static void memif_intr_handler(void *arg);
+
+static ssize_t
+memif_msg_send(int fd, memif_msg_t *msg, int afd)
+{
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	struct cmsghdr *cmsg;
+	char ctl[CMSG_SPACE(sizeof(int))];
+
+	iov[0].iov_base = msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+
+	if (afd > 0) {
+		memset(&ctl, 0, sizeof(ctl));
+		mh.msg_control = ctl;
+		mh.msg_controllen = sizeof(ctl);
+		cmsg = CMSG_FIRSTHDR(&mh);
+		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+		cmsg->cmsg_level = SOL_SOCKET;
+		cmsg->cmsg_type = SCM_RIGHTS;
+		rte_memcpy(CMSG_DATA(cmsg), &afd, sizeof(int));
+	}
+
+	return sendmsg(fd, &mh, 0);
+}
+
+static int
+memif_msg_send_from_queue(struct memif_control_channel *cc)
+{
+	ssize_t size;
+	int ret = 0;
+	struct memif_msg_queue_elt *e;
+
+	e = TAILQ_FIRST(&cc->msg_queue);
+	if (e == NULL)
+		return 0;
+
+	size = memif_msg_send(cc->intr_handle.fd, &e->msg, e->fd);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(ERR, "sendmsg fail: %s.", strerror(errno));
+		ret = -1;
+	} else {
+		MIF_LOG(DEBUG, "Sent msg type %u.", e->msg.type);
+	}
+	TAILQ_REMOVE(&cc->msg_queue, e, next);
+	rte_free(e);
+
+	return ret;
+}
+
+static struct memif_msg_queue_elt *
+memif_msg_enq(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e;
+
+	e = rte_zmalloc("memif_msg", sizeof(struct memif_msg_queue_elt), 0);
+	if (e == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control message.");
+		return NULL;
+	}
+
+	e->fd = -1;
+	TAILQ_INSERT_TAIL(&cc->msg_queue, e, next);
+
+	return e;
+}
+
+void
+memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			 int err_code)
+{
+	struct memif_msg_queue_elt *e;
+	struct pmd_internals *pmd;
+	memif_msg_disconnect_t *d;
+
+	if (cc == NULL) {
+		MIF_LOG(DEBUG, "Missing control channel.");
+		return;
+	}
+
+	e = memif_msg_enq(cc);
+	if (e == NULL) {
+		MIF_LOG(WARNING, "Failed to enqueue disconnect message.");
+		return;
+	}
+
+	d = &e->msg.disconnect;
+
+	e->msg.type = MEMIF_MSG_TYPE_DISCONNECT;
+	d->code = err_code;
+
+	if (reason != NULL) {
+		strlcpy((char *)d->string, reason, sizeof(d->string));
+		if (cc->dev != NULL) {
+			pmd = cc->dev->data->dev_private;
+			strlcpy(pmd->local_disc_string, reason,
+				sizeof(pmd->local_disc_string));
+		}
+	}
+}
+
+static int
+memif_msg_enq_hello(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	memif_msg_hello_t *h;
+
+	if (e == NULL)
+		return -1;
+
+	h = &e->msg.hello;
+
+	e->msg.type = MEMIF_MSG_TYPE_HELLO;
+	h->min_version = MEMIF_VERSION;
+	h->max_version = MEMIF_VERSION;
+	h->max_s2m_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_m2s_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_region = ETH_MEMIF_MAX_REGION_NUM - 1;
+	h->max_log2_ring_size = ETH_MEMIF_MAX_LOG2_RING_SIZE;
+
+	strlcpy((char *)h->name, rte_version(), sizeof(h->name));
+
+	return 0;
+}
+
+static int
+memif_msg_receive_hello(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_hello_t *h = &msg->hello;
+
+	if (h->min_version > MEMIF_VERSION || h->max_version < MEMIF_VERSION) {
+		memif_msg_enq_disconnect(pmd->cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	/* Set parameters for active connection */
+	pmd->run.num_s2m_rings = RTE_MIN(h->max_s2m_ring + 1,
+					   pmd->cfg.num_s2m_rings);
+	pmd->run.num_m2s_rings = RTE_MIN(h->max_m2s_ring + 1,
+					   pmd->cfg.num_m2s_rings);
+	pmd->run.log2_ring_size = RTE_MIN(h->max_log2_ring_size,
+					    pmd->cfg.log2_ring_size);
+	pmd->run.pkt_buffer_size = pmd->cfg.pkt_buffer_size;
+
+	strlcpy(pmd->remote_name, (char *)h->name, sizeof(pmd->remote_name));
+
+	MIF_LOG(DEBUG, "%s: Connecting to %s.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_init(struct memif_control_channel *cc, memif_msg_t *msg)
+{
+	memif_msg_init_t *i = &msg->init;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *pmd;
+	struct rte_eth_dev *dev;
+
+	if (i->version != MEMIF_VERSION) {
+		memif_msg_enq_disconnect(cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	if (cc->socket == NULL) {
+		memif_msg_enq_disconnect(cc, "Device error", 0);
+		return -1;
+	}
+
+	/* Find device with requested ID */
+	TAILQ_FOREACH(elt, &cc->socket->dev_queue, next) {
+		dev = elt->dev;
+		pmd = dev->data->dev_private;
+		if (((pmd->flags & ETH_MEMIF_FLAG_DISABLED) == 0) &&
+		    pmd->id == i->id) {
+			/* assign control channel to device */
+			cc->dev = dev;
+			pmd->cc = cc;
+
+			if (i->mode != MEMIF_INTERFACE_MODE_ETHERNET) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Only ethernet mode supported",
+							 0);
+				return -1;
+			}
+
+			if (pmd->flags & (ETH_MEMIF_FLAG_CONNECTING |
+					   ETH_MEMIF_FLAG_CONNECTED)) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Already connected", 0);
+				return -1;
+			}
+			strlcpy(pmd->remote_name, (char *)i->name,
+				sizeof(pmd->remote_name));
+
+			if (*pmd->secret != '\0') {
+				if (*i->secret == '\0') {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Secret required", 0);
+					return -1;
+				}
+				if (strcmp(pmd->secret, (char *)i->secret) != 0) {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Incorrect secret", 0);
+					return -1;
+				}
+			}
+
+			pmd->flags |= ETH_MEMIF_FLAG_CONNECTING;
+			return 0;
+		}
+	}
+
+	/* ID not found on this socket */
+	MIF_LOG(DEBUG, "ID %u not found.", i->id);
+	memif_msg_enq_disconnect(cc, "ID not found", 0);
+	return -1;
+}
+
+static int
+memif_msg_receive_add_region(struct rte_eth_dev *dev, memif_msg_t *msg,
+			     int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_region_t *ar = &msg->add_region;
+	struct memif_region *r;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing region fd", 0);
+		return -1;
+	}
+
+	if (ar->index >= ETH_MEMIF_MAX_REGION_NUM || ar->index != pmd->regions_num ||
+			pmd->regions[ar->index] != NULL) {
+		memif_msg_enq_disconnect(pmd->cc, "Invalid region index", 0);
+		return -1;
+	}
+
+	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	r->fd = fd;
+	r->region_size = ar->size;
+	r->addr = NULL;
+
+	pmd->regions[ar->index] = r;
+	pmd->regions_num++;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_add_ring(struct rte_eth_dev *dev, memif_msg_t *msg, int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_ring_t *ar = &msg->add_ring;
+	struct memif_queue *mq;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing interrupt fd", 0);
+		return -1;
+	}
+
+	/* check if we have enough queues */
+	if (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) {
+		if (ar->index >= pmd->cfg.num_s2m_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_s2m_rings++;
+	} else {
+		if (ar->index >= pmd->cfg.num_m2s_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_m2s_rings++;
+	}
+
+	mq = (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) ?
+	    dev->data->rx_queues[ar->index] : dev->data->tx_queues[ar->index];
+
+	mq->intr_handle.fd = fd;
+	mq->log2_ring_size = ar->log2_ring_size;
+	mq->region = ar->region;
+	mq->offset = ar->offset;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connect_t *c = &msg->connect;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connected(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connected_t *c = &msg->connected;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_disconnect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_disconnect_t *d = &msg->disconnect;
+
+	memset(pmd->remote_disc_string, 0, sizeof(pmd->remote_disc_string));
+	strlcpy(pmd->remote_disc_string, (char *)d->string,
+		sizeof(pmd->remote_disc_string));
+
+	MIF_LOG(INFO, "%s: Disconnect received: %s",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_disc_string);
+
+	memset(pmd->local_disc_string, 0, 96);
+	memif_disconnect(rte_eth_dev_allocated
+			 (rte_vdev_device_name(pmd->vdev)));
+	return 0;
+}
+
+static int
+memif_msg_enq_ack(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	e->msg.type = MEMIF_MSG_TYPE_ACK;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_init(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_init_t *i = &e->msg.init;
+
+	if (e == NULL)
+		return -1;
+
+	i = &e->msg.init;
+	e->msg.type = MEMIF_MSG_TYPE_INIT;
+	i->version = MEMIF_VERSION;
+	i->id = pmd->id;
+	i->mode = MEMIF_INTERFACE_MODE_ETHERNET;
+
+	strlcpy((char *)i->name, rte_version(), sizeof(i->name));
+
+	if (*pmd->secret != '\0')
+		strlcpy((char *)i->secret, pmd->secret, sizeof(i->secret));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_region(struct rte_eth_dev *dev, uint8_t idx)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_add_region_t *ar;
+	struct memif_region *mr = pmd->regions[idx];
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_region;
+	e->msg.type = MEMIF_MSG_TYPE_ADD_REGION;
+	e->fd = mr->fd;
+	ar->index = idx;
+	ar->size = mr->region_size;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_ring(struct rte_eth_dev *dev, uint8_t idx,
+		       memif_ring_type_t type)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	struct memif_queue *mq;
+	memif_msg_add_ring_t *ar;
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_ring;
+	mq = (type == MEMIF_RING_S2M) ? dev->data->tx_queues[idx] :
+	    dev->data->rx_queues[idx];
+
+	e->msg.type = MEMIF_MSG_TYPE_ADD_RING;
+	e->fd = mq->intr_handle.fd;
+	ar->index = idx;
+	ar->offset = mq->offset;
+	ar->region = mq->region;
+	ar->log2_ring_size = mq->log2_ring_size;
+	ar->flags = (type == MEMIF_RING_S2M) ? MEMIF_MSG_ADD_RING_FLAG_S2M : 0;
+	ar->private_hdr_size = 0;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connect_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connect;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECT;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connected(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connected_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connected;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECTED;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static void
+memif_intr_unregister_handler(struct rte_intr_handle *intr_handle, void *arg)
+{
+	struct memif_msg_queue_elt *elt;
+	struct memif_control_channel *cc = arg;
+
+	/* close control channel fd */
+	close(intr_handle->fd);
+	/* clear message queue */
+	while ((elt = TAILQ_FIRST(&cc->msg_queue)) != NULL) {
+		TAILQ_REMOVE(&cc->msg_queue, elt, next);
+		rte_free(elt);
+	}
+	/* free control channel */
+	rte_free(cc);
+}
+
+void
+memif_disconnect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *elt, *next;
+	struct memif_queue *mq;
+	struct rte_intr_handle *ih;
+	int i;
+	int ret;
+
+	if (pmd->cc != NULL) {
+		/* Clear control message queue (except disconnect message if any). */
+		for (elt = TAILQ_FIRST(&pmd->cc->msg_queue); elt != NULL; elt = next) {
+			next = TAILQ_NEXT(elt, next);
+			if (elt->msg.type != MEMIF_MSG_TYPE_DISCONNECT) {
+				TAILQ_REMOVE(&pmd->cc->msg_queue, elt, next);
+				rte_free(elt);
+			}
+		}
+		/* send disconnect message (if there is any in queue) */
+		memif_msg_send_from_queue(pmd->cc);
+
+		/* at this point, there should be no more messages in queue */
+		if (TAILQ_FIRST(&pmd->cc->msg_queue) != NULL) {
+			MIF_LOG(WARNING,
+				"%s: Unexpected message(s) in message queue.",
+				rte_vdev_device_name(pmd->vdev));
+		}
+
+		ih = &pmd->cc->intr_handle;
+		if (ih->fd > 0) {
+			ret = rte_intr_callback_unregister(ih,
+							memif_intr_handler,
+							pmd->cc);
+			/*
+			 * If callback is active (disconnecting based on
+			 * received control message).
+			 */
+			if (ret == -EAGAIN) {
+				ret = rte_intr_callback_unregister_pending(ih,
+							memif_intr_handler,
+							pmd->cc,
+							memif_intr_unregister_handler);
+			} else if (ret > 0) {
+				close(ih->fd);
+				rte_free(pmd->cc);
+			}
+			pmd->cc = NULL;
+			if (ret <= 0)
+				MIF_LOG(WARNING,
+					"%s: Failed to unregister control channel callback.",
+					rte_vdev_device_name(pmd->vdev));
+		}
+	}
+
+	/* unconfig interrupts */
+	for (i = 0; i < pmd->cfg.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+	for (i = 0; i < pmd->cfg.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+
+	memif_free_regions(pmd);
+
+	/* reset connection configuration */
+	memset(&pmd->run, 0, sizeof(pmd->run));
+
+	dev->data->dev_link.link_status = ETH_LINK_DOWN;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTED;
+	MIF_LOG(DEBUG, "%s: Disconnected.", rte_vdev_device_name(pmd->vdev));
+}
+
+static int
+memif_msg_receive(struct memif_control_channel *cc)
+{
+	char ctl[CMSG_SPACE(sizeof(int)) +
+		 CMSG_SPACE(sizeof(struct ucred))] = { 0 };
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	memif_msg_t msg = { 0 };
+	ssize_t size;
+	int ret = 0;
+	struct ucred *cr __rte_unused;
+	cr = 0;
+	struct cmsghdr *cmsg;
+	int afd = -1;
+	int i;
+	struct pmd_internals *pmd;
+
+	iov[0].iov_base = (void *)&msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+	mh.msg_control = ctl;
+	mh.msg_controllen = sizeof(ctl);
+
+	size = recvmsg(cc->intr_handle.fd, &mh, 0);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(DEBUG, "Invalid message size.");
+		memif_msg_enq_disconnect(cc, "Invalid message size", 0);
+		return -1;
+	}
+	MIF_LOG(DEBUG, "Received msg type: %u.", msg.type);
+
+	cmsg = CMSG_FIRSTHDR(&mh);
+	while (cmsg) {
+		if (cmsg->cmsg_level == SOL_SOCKET) {
+			if (cmsg->cmsg_type == SCM_CREDENTIALS)
+				cr = (struct ucred *)CMSG_DATA(cmsg);
+			else if (cmsg->cmsg_type == SCM_RIGHTS)
+				afd = *(int *)CMSG_DATA(cmsg);
+		}
+		cmsg = CMSG_NXTHDR(&mh, cmsg);
+	}
+
+	if (cc->dev == NULL && msg.type != MEMIF_MSG_TYPE_INIT) {
+		MIF_LOG(DEBUG, "Unexpected message.");
+		memif_msg_enq_disconnect(cc, "Unexpected message", 0);
+		return -1;
+	}
+
+	/* get device from hash data */
+	switch (msg.type) {
+	case MEMIF_MSG_TYPE_ACK:
+		break;
+	case MEMIF_MSG_TYPE_HELLO:
+		ret = memif_msg_receive_hello(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_init_regions_and_queues(cc->dev);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_init(cc->dev);
+		if (ret < 0)
+			goto exit;
+		pmd = cc->dev->data->dev_private;
+		for (i = 0; i < pmd->regions_num; i++) {
+			ret = memif_msg_enq_add_region(cc->dev, i);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_S2M);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_M2S);
+			if (ret < 0)
+				goto exit;
+		}
+		ret = memif_msg_enq_connect(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_INIT:
+		/*
+		 * This cc does not have an interface asociated with it.
+		 * If suitable interface is found it will be assigned here.
+		 */
+		ret = memif_msg_receive_init(cc, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_REGION:
+		ret = memif_msg_receive_add_region(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_RING:
+		ret = memif_msg_receive_add_ring(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECT:
+		ret = memif_msg_receive_connect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_connected(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECTED:
+		ret = memif_msg_receive_connected(cc->dev, &msg);
+		break;
+	case MEMIF_MSG_TYPE_DISCONNECT:
+		ret = memif_msg_receive_disconnect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		break;
+	default:
+		memif_msg_enq_disconnect(cc, "Unknown message type", 0);
+		ret = -1;
+		goto exit;
+	}
+
+ exit:
+	return ret;
+}
+
+static void
+memif_intr_handler(void *arg)
+{
+	struct memif_control_channel *cc = arg;
+	int ret;
+
+	ret = memif_msg_receive(cc);
+	/* if driver failed to assign device */
+	if (cc->dev == NULL) {
+		ret = rte_intr_callback_unregister_pending(&cc->intr_handle,
+							   memif_intr_handler,
+							   cc,
+							   memif_intr_unregister_handler);
+		if (ret < 0)
+			MIF_LOG(WARNING,
+				"Failed to unregister control channel callback.");
+		return;
+	}
+	/* if memif_msg_receive failed */
+	if (ret < 0)
+		goto disconnect;
+
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto disconnect;
+
+	return;
+
+ disconnect:
+	if (cc->dev == NULL) {
+		MIF_LOG(WARNING, "eth dev not allocated");
+		return;
+	}
+	memif_disconnect(cc->dev);
+}
+
+static void
+memif_listener_handler(void *arg)
+{
+	struct memif_socket *socket = arg;
+	int sockfd;
+	int addr_len;
+	struct sockaddr_un client;
+	struct memif_control_channel *cc;
+	int ret;
+
+	addr_len = sizeof(client);
+	sockfd = accept(socket->intr_handle.fd, (struct sockaddr *)&client,
+			(socklen_t *)&addr_len);
+	if (sockfd < 0) {
+		MIF_LOG(ERR,
+			"Failed to accept connection request on socket fd %d",
+			socket->intr_handle.fd);
+		return;
+	}
+
+	MIF_LOG(DEBUG, "%s: Connection request accepted.", socket->filename);
+
+	cc = rte_zmalloc("memif-cc", sizeof(struct memif_control_channel), 0);
+	if (cc == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control channel.");
+		goto error;
+	}
+
+	cc->intr_handle.fd = sockfd;
+	cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	cc->socket = socket;
+	cc->dev = NULL;
+	TAILQ_INIT(&cc->msg_queue);
+
+	ret = rte_intr_callback_register(&cc->intr_handle, memif_intr_handler, cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to register control channel callback.");
+		goto error;
+	}
+
+	ret = memif_msg_enq_hello(cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to enqueue hello message.");
+		goto error;
+	}
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto error;
+
+	return;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (cc != NULL)
+		rte_free(cc);
+}
+
+static struct memif_socket *
+memif_socket_create(struct pmd_internals *pmd, char *key, uint8_t listener)
+{
+	struct memif_socket *sock;
+	struct sockaddr_un un;
+	int sockfd;
+	int ret;
+	int on = 1;
+
+	sock = rte_zmalloc("memif-socket", sizeof(struct memif_socket), 0);
+	if (sock == NULL) {
+		MIF_LOG(ERR, "Failed to allocate memory for memif socket");
+		return NULL;
+	}
+
+	sock->listener = listener;
+	rte_memcpy(sock->filename, key, 256);
+	TAILQ_INIT(&sock->dev_queue);
+
+	if (listener != 0) {
+		sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+		if (sockfd < 0)
+			goto error;
+
+		un.sun_family = AF_UNIX;
+		memcpy(un.sun_path, sock->filename,
+			sizeof(un.sun_path) - 1);
+
+		ret = setsockopt(sockfd, SOL_SOCKET, SO_PASSCRED, &on,
+				 sizeof(on));
+		if (ret < 0)
+			goto error;
+		ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
+		if (ret < 0)
+			goto error;
+		ret = listen(sockfd, 1);
+		if (ret < 0)
+			goto error;
+
+		MIF_LOG(DEBUG, "%s: Memif listener socket %s created.",
+			rte_vdev_device_name(pmd->vdev), sock->filename);
+
+		sock->intr_handle.fd = sockfd;
+		sock->intr_handle.type = RTE_INTR_HANDLE_EXT;
+		ret = rte_intr_callback_register(&sock->intr_handle,
+						 memif_listener_handler, sock);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to register interrupt "
+				"callback for listener socket",
+				rte_vdev_device_name(pmd->vdev));
+			return NULL;
+		}
+	}
+
+	return sock;
+
+ error:
+	MIF_LOG(ERR, "%s: Failed to setup socket %s: %s",
+		rte_vdev_device_name(pmd->vdev), key, strerror(errno));
+	if (sock != NULL)
+		rte_free(sock);
+	return NULL;
+}
+
+static struct rte_hash *
+memif_create_socket_hash(void)
+{
+	struct rte_hash_parameters params = { 0 };
+	params.name = MEMIF_SOCKET_HASH_NAME;
+	params.entries = 256;
+	params.key_len = 256;
+	params.hash_func = rte_jhash;
+	params.hash_func_init_val = 0;
+	return rte_hash_create(&params);
+}
+
+int
+memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *tmp_pmd;
+	struct rte_hash *hash;
+	int ret;
+	char key[256];
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL) {
+		hash = memif_create_socket_hash();
+		if (hash == NULL) {
+			MIF_LOG(ERR, "Failed to create memif socket hash.");
+			return -1;
+		}
+	}
+
+	memset(key, 0, 256);
+	rte_memcpy(key, socket_filename, strlen(socket_filename));
+	ret = rte_hash_lookup_data(hash, key, (void **)&socket);
+	if (ret < 0) {
+		socket = memif_socket_create(pmd, key,
+					     (pmd->role ==
+					      MEMIF_ROLE_SLAVE) ? 0 : 1);
+		if (socket == NULL)
+			return -1;
+		ret = rte_hash_add_key_data(hash, key, socket);
+		if (ret < 0) {
+			MIF_LOG(ERR, "Failed to add socket to socket hash.");
+			return ret;
+		}
+	}
+	pmd->socket_filename = socket->filename;
+
+	if (socket->listener != 0 && pmd->role == MEMIF_ROLE_SLAVE) {
+		MIF_LOG(ERR, "Socket is a listener.");
+		return -1;
+	} else if ((socket->listener == 0) && (pmd->role == MEMIF_ROLE_MASTER)) {
+		MIF_LOG(ERR, "Socket is not a listener.");
+		return -1;
+	}
+
+	TAILQ_FOREACH(elt, &socket->dev_queue, next) {
+		tmp_pmd = elt->dev->data->dev_private;
+		if (tmp_pmd->id == pmd->id) {
+			MIF_LOG(ERR, "Memif device with id %d already "
+				"exists on socket %s",
+				pmd->id, socket->filename);
+			return -1;
+		}
+	}
+
+	elt =
+	    rte_malloc("pmd-queue", sizeof(struct memif_socket_dev_list_elt),
+		       0);
+	if (elt == NULL) {
+		MIF_LOG(ERR, "%s: Failed to add device to socket device list.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+	elt->dev = dev;
+	TAILQ_INSERT_TAIL(&socket->dev_queue, elt, next);
+
+	return 0;
+}
+
+void
+memif_socket_remove_device(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt, *next;
+	struct rte_hash *hash;
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL)
+		return;
+
+	if (rte_hash_lookup_data(hash, pmd->socket_filename, (void **)&socket) <
+	    0)
+		return;
+
+	for (elt = TAILQ_FIRST(&socket->dev_queue); elt != NULL; elt = next) {
+		next = TAILQ_NEXT(elt, next);
+		if (elt->dev == dev) {
+			TAILQ_REMOVE(&socket->dev_queue, elt, next);
+			rte_free(elt);
+			pmd->socket_filename = NULL;
+		}
+	}
+
+	/* remove socket, if this was the last device using it */
+	if (TAILQ_EMPTY(&socket->dev_queue)) {
+		rte_hash_del_key(hash, socket->filename);
+		if (socket->listener) {
+			/* remove listener socket file,
+			 * so we can create new one later.
+			 */
+			remove(socket->filename);
+		}
+		rte_free(socket);
+	}
+}
+
+int
+memif_connect_master(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+	return 0;
+}
+
+int
+memif_connect_slave(struct rte_eth_dev *dev)
+{
+	int sockfd;
+	int ret;
+	struct sockaddr_un sun;
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+
+	sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+	if (sockfd < 0) {
+		MIF_LOG(ERR, "%s: Failed to open socket.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	sun.sun_family = AF_UNIX;
+
+	memcpy(sun.sun_path, pmd->socket_filename, sizeof(sun.sun_path) - 1);
+
+	ret = connect(sockfd, (struct sockaddr *)&sun,
+		      sizeof(struct sockaddr_un));
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to connect socket: %s.",
+			rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+		goto error;
+	}
+
+	MIF_LOG(DEBUG, "%s: Memif socket: %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+
+	pmd->cc = rte_zmalloc("memif-cc",
+			      sizeof(struct memif_control_channel), 0);
+	if (pmd->cc == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate control channel.",
+			rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	pmd->cc->intr_handle.fd = sockfd;
+	pmd->cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	pmd->cc->socket = NULL;
+	pmd->cc->dev = dev;
+	TAILQ_INIT(&pmd->cc->msg_queue);
+
+	ret = rte_intr_callback_register(&pmd->cc->intr_handle,
+					 memif_intr_handler, pmd->cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to register interrupt callback "
+			"for control fd", rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	return 0;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (pmd->cc != NULL) {
+		rte_free(pmd->cc);
+		pmd->cc = NULL;
+	}
+	return -1;
+}
diff --git a/drivers/net/memif/memif_socket.h b/drivers/net/memif/memif_socket.h
new file mode 100644
index 000000000..2e8970712
--- /dev/null
+++ b/drivers/net/memif/memif_socket.h
@@ -0,0 +1,105 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_SOCKET_H_
+#define _MEMIF_SOCKET_H_
+
+#include <sys/queue.h>
+
+/**
+ * Remove device from socket device list. If no device is left on the socket,
+ * remove the socket as well.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_socket_remove_device(struct rte_eth_dev *dev);
+
+/**
+ * Enqueue disconnect message to control channel message queue.
+ *
+ * @param cc
+ *   control channel
+ * @param reason
+ *   const string stating disconnect reason (96 characters)
+ * @param err_code
+ *   error code
+ */
+void memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			      int err_code);
+
+/**
+ * Initialize memif socket for specified device. If socket doesn't exist, create socket.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @param socket_filename
+ *   socket filename
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename);
+
+/**
+ * Disconnect memif device. Close control channel and shared memory.
+ *
+ * @param dev
+ *   ethernet device
+ */
+void memif_disconnect(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, enable connection establishment.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_master(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, send connection request.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_slave(struct rte_eth_dev *dev);
+
+struct memif_socket_dev_list_elt {
+	TAILQ_ENTRY(memif_socket_dev_list_elt) next;
+	struct rte_eth_dev *dev;		/**< pointer to device internals */
+	char dev_name[RTE_ETH_NAME_MAX_LEN];
+};
+
+#define MEMIF_SOCKET_HASH_NAME			"memif-sh"
+struct memif_socket {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	uint8_t listener;			/**< if not zero socket is listener */
+	char filename[256];			/**< socket filename */
+
+	TAILQ_HEAD(, memif_socket_dev_list_elt) dev_queue;
+	/**< Queue of devices using this socket */
+};
+
+/* Control message queue. */
+struct memif_msg_queue_elt {
+	TAILQ_ENTRY(memif_msg_queue_elt) next;
+	memif_msg_t msg;			/**< control message */
+	int fd;					/**< fd to be sent to peer */
+};
+
+struct memif_control_channel {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	TAILQ_HEAD(, memif_msg_queue_elt) msg_queue; /**< control message queue */
+	struct memif_socket *socket;		/**< pointer to socket */
+	struct rte_eth_dev *dev;		/**< pointer to device */
+};
+
+#endif				/* MEMIF_SOCKET_H */
diff --git a/drivers/net/memif/meson.build b/drivers/net/memif/meson.build
new file mode 100644
index 000000000..4dfe37d3a
--- /dev/null
+++ b/drivers/net/memif/meson.build
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+if host_machine.system() != 'linux'
+        build = false
+endif
+
+sources = files('rte_eth_memif.c',
+		'memif_socket.c')
+
+allow_experimental_apis = true
+
+deps += ['hash']
diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
new file mode 100644
index 000000000..c50f14f08
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.c
@@ -0,0 +1,1211 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <linux/if_ether.h>
+#include <errno.h>
+#include <sys/eventfd.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+#define ETH_MEMIF_ID_ARG		"id"
+#define ETH_MEMIF_ROLE_ARG		"role"
+#define ETH_MEMIF_PKT_BUFFER_SIZE_ARG	"bsize"
+#define ETH_MEMIF_RING_SIZE_ARG		"rsize"
+#define ETH_MEMIF_SOCKET_ARG		"socket"
+#define ETH_MEMIF_MAC_ARG		"mac"
+#define ETH_MEMIF_ZC_ARG		"zero-copy"
+#define ETH_MEMIF_SECRET_ARG		"secret"
+
+static const char *valid_arguments[] = {
+	ETH_MEMIF_ID_ARG,
+	ETH_MEMIF_ROLE_ARG,
+	ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
+	ETH_MEMIF_RING_SIZE_ARG,
+	ETH_MEMIF_SOCKET_ARG,
+	ETH_MEMIF_MAC_ARG,
+	ETH_MEMIF_ZC_ARG,
+	ETH_MEMIF_SECRET_ARG,
+	NULL
+};
+
+/* Message header to synchronize queues via IPC */
+struct ipc_queues {
+	char port_name[RTE_DEV_NAME_MAX_LEN];
+	int rxq_count;
+	int txq_count;
+	/*
+	 * The file descriptors are in the dedicated part
+	 * of the Unix message to be translated by the kernel.
+	 */
+};
+
+const char *
+memif_version(void)
+{
+	return ("memif-" RTE_STR(MEMIF_VERSION_MAJOR) "." RTE_STR(MEMIF_VERSION_MINOR));
+}
+
+static void
+memif_dev_info(struct rte_eth_dev *dev __rte_unused, struct rte_eth_dev_info *dev_info)
+{
+	dev_info->max_mac_addrs = 1;
+	dev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;
+	dev_info->max_rx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->max_tx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->min_rx_bufsize = 0;
+}
+
+static memif_ring_t *
+memif_get_ring(struct pmd_internals *pmd, memif_ring_type_t type, uint16_t ring_num)
+{
+	/* rings only in region 0 */
+	void *p = pmd->regions[0]->addr;
+	int ring_size = sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size);
+
+	p = (uint8_t *)p + (ring_num + type * pmd->run.num_s2m_rings) * ring_size;
+
+	return (memif_ring_t *)p;
+}
+
+static void *
+memif_get_buffer(struct pmd_internals *pmd, memif_desc_t *d)
+{
+	return ((uint8_t *)pmd->regions[d->region]->addr +
+		pmd->regions[d->region]->pkt_buffer_offset + d->offset);
+}
+
+static int
+memif_pktmbuf_chain(struct rte_mbuf *head, struct rte_mbuf *cur_tail,
+		    struct rte_mbuf *tail)
+{
+	/* Check for number-of-segments-overflow */
+	if (unlikely(head->nb_segs + tail->nb_segs > RTE_MBUF_MAX_NB_SEGS))
+		return -EOVERFLOW;
+
+	/* Chain 'tail' onto the old tail */
+	cur_tail->next = tail;
+
+	/* accumulate number of segments and total length. */
+	head->nb_segs = (uint16_t)(head->nb_segs + tail->nb_segs);
+
+	tail->pkt_len = tail->data_len;
+	head->pkt_len += tail->pkt_len;
+
+	return 0;
+}
+
+static uint16_t
+eth_memif_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t cur_slot, last_slot, n_slots, ring_size, mask, s0;
+	uint16_t n_rx_pkts = 0;
+	uint16_t mbuf_size = rte_pktmbuf_data_room_size(mq->mempool) -
+		RTE_PKTMBUF_HEADROOM;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf, *mbuf_head, *mbuf_tail;
+	uint64_t b;
+	ssize_t size __rte_unused;
+	uint16_t head;
+	int ret;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	/* consume interrupt */
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0)
+		size = read(mq->intr_handle.fd, &b, sizeof(b));
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	cur_slot = (type == MEMIF_RING_S2M) ? mq->last_head : mq->last_tail;
+	last_slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+	if (cur_slot == last_slot)
+		goto refill;
+	n_slots = last_slot - cur_slot;
+
+	while (n_slots && n_rx_pkts < nb_pkts) {
+		mbuf_head = rte_pktmbuf_alloc(mq->mempool);
+		if (unlikely(mbuf_head == NULL))
+			goto no_free_bufs;
+		mbuf = mbuf_head;
+		mbuf->port = mq->in_port;
+
+next_slot:
+		s0 = cur_slot & mask;
+		d0 = &ring->desc[s0];
+
+		src_len = d0->length;
+		dst_off = 0;
+		src_off = 0;
+
+		do {
+			dst_len = mbuf_size - dst_off;
+			if (dst_len == 0) {
+				dst_off = 0;
+				dst_len = mbuf_size;
+
+				/* store pointer to tail */
+				mbuf_tail = mbuf;
+				mbuf = rte_pktmbuf_alloc(mq->mempool);
+				if (unlikely(mbuf == NULL))
+					goto no_free_bufs;
+				mbuf->port = mq->in_port;
+				ret = memif_pktmbuf_chain(mbuf_head, mbuf_tail, mbuf);
+				if (unlikely(ret < 0)) {
+					MIF_LOG(ERR, "%s: number-of-segments-overflow",
+						rte_vdev_device_name(pmd->vdev));
+					rte_pktmbuf_free(mbuf);
+					goto no_free_bufs;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			rte_pktmbuf_data_len(mbuf) += cp_len;
+			rte_pktmbuf_pkt_len(mbuf) = rte_pktmbuf_data_len(mbuf);
+			if (mbuf != mbuf_head)
+				rte_pktmbuf_pkt_len(mbuf_head) += cp_len;
+
+			memcpy(rte_pktmbuf_mtod_offset(mbuf, void *, dst_off),
+			       (uint8_t *)memif_get_buffer(pmd, d0) + src_off, cp_len);
+
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+		} while (src_len);
+
+		cur_slot++;
+		n_slots--;
+
+		if (d0->flags & MEMIF_DESC_FLAG_NEXT)
+			goto next_slot;
+
+		mq->n_bytes += rte_pktmbuf_pkt_len(mbuf_head);
+		*bufs++ = mbuf_head;
+		n_rx_pkts++;
+	}
+
+no_free_bufs:
+	if (type == MEMIF_RING_S2M) {
+		rte_mb();
+		ring->tail = cur_slot;
+		mq->last_head = cur_slot;
+	} else {
+		mq->last_tail = cur_slot;
+	}
+
+refill:
+	if (type == MEMIF_RING_M2S) {
+		head = ring->head;
+		n_slots = ring_size - head + mq->last_tail;
+
+		while (n_slots--) {
+			s0 = head++ & mask;
+			d0 = &ring->desc[s0];
+			d0->length = pmd->run.pkt_buffer_size;
+		}
+		rte_mb();
+		ring->head = head;
+	}
+
+	mq->n_pkts += n_rx_pkts;
+	return n_rx_pkts;
+}
+
+static uint16_t
+eth_memif_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t slot, saved_slot, n_free, ring_size, mask, n_tx_pkts = 0;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head;
+	uint64_t a;
+	ssize_t size;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	n_free = ring->tail - mq->last_tail;
+	mq->last_tail += n_free;
+	slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+
+	if (type == MEMIF_RING_S2M)
+		n_free = ring_size - ring->head + mq->last_tail;
+	else
+		n_free = ring->head - ring->tail;
+
+	while (n_tx_pkts < nb_pkts && n_free) {
+		mbuf_head = *bufs++;
+		mbuf = mbuf_head;
+
+		saved_slot = slot;
+		d0 = &ring->desc[slot & mask];
+		dst_off = 0;
+		dst_len = (type == MEMIF_RING_S2M) ?
+			pmd->run.pkt_buffer_size : d0->length;
+
+next_in_chain:
+		src_off = 0;
+		src_len = rte_pktmbuf_data_len(mbuf);
+
+		while (src_len) {
+			if (dst_len == 0) {
+				if (n_free) {
+					slot++;
+					n_free--;
+					d0->flags |= MEMIF_DESC_FLAG_NEXT;
+					d0 = &ring->desc[slot & mask];
+					dst_off = 0;
+					dst_len = (type == MEMIF_RING_S2M) ?
+					    pmd->run.pkt_buffer_size : d0->length;
+					d0->flags = 0;
+				} else {
+					slot = saved_slot;
+					goto no_free_slots;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			memcpy((uint8_t *)memif_get_buffer(pmd, d0) + dst_off,
+			       rte_pktmbuf_mtod_offset(mbuf, void *, src_off),
+			       cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+			dst_len -= cp_len;
+
+			d0->length = dst_off;
+		}
+
+		if (rte_pktmbuf_is_contiguous(mbuf) == 0) {
+			mbuf = mbuf->next;
+			goto next_in_chain;
+		}
+
+		n_tx_pkts++;
+		slot++;
+		n_free--;
+		rte_pktmbuf_free(mbuf_head);
+	}
+
+no_free_slots:
+	rte_mb();
+	if (type == MEMIF_RING_S2M)
+		ring->head = slot;
+	else
+		ring->tail = slot;
+
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
+		a = 1;
+		size = write(mq->intr_handle.fd, &a, sizeof(a));
+		if (unlikely(size < 0)) {
+			MIF_LOG(WARNING,
+				"%s: Failed to send interrupt. %s",
+				rte_vdev_device_name(pmd->vdev), strerror(errno));
+		}
+	}
+
+	mq->n_err += nb_pkts - n_tx_pkts;
+	mq->n_pkts += n_tx_pkts;
+	return n_tx_pkts;
+}
+
+void
+memif_free_regions(struct pmd_internals *pmd)
+{
+	int i;
+	struct memif_region *r;
+
+	/* regions are allocated contiguously, so it's
+	 * enough to loop until 'pmd->regions_num'
+	 */
+	for (i = 0; i < pmd->regions_num; i++) {
+		r = pmd->regions[i];
+		if (r != NULL) {
+			if (r->addr != NULL) {
+				munmap(r->addr, r->region_size);
+				if (r->fd > 0) {
+					close(r->fd);
+					r->fd = -1;
+				}
+			}
+			rte_free(r);
+			pmd->regions[i] = NULL;
+		}
+	}
+	pmd->regions_num = 0;
+}
+
+static int
+memif_region_init_shm(struct pmd_internals *pmd, uint8_t has_buffers)
+{
+	char shm_name[ETH_MEMIF_SHM_NAME_SIZE];
+	int ret = 0;
+	struct memif_region *r;
+
+	if (pmd->regions_num >= ETH_MEMIF_MAX_REGION_NUM) {
+		MIF_LOG(ERR, "%s: Too many regions.", rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	/* calculate buffer offset */
+	r->pkt_buffer_offset = (pmd->run.num_s2m_rings + pmd->run.num_m2s_rings) *
+	    (sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size));
+
+	r->region_size = r->pkt_buffer_offset;
+	/* if region has buffers, add buffers size to region_size */
+	if (has_buffers == 1)
+		r->region_size += (uint32_t)(pmd->run.pkt_buffer_size *
+			(1 << pmd->run.log2_ring_size) *
+			(pmd->run.num_s2m_rings +
+			 pmd->run.num_m2s_rings));
+
+	memset(shm_name, 0, sizeof(char) * ETH_MEMIF_SHM_NAME_SIZE);
+	snprintf(shm_name, ETH_MEMIF_SHM_NAME_SIZE, "memif_region_%d",
+		 pmd->regions_num);
+
+	r->fd = memfd_create(shm_name, MFD_ALLOW_SEALING);
+	if (r->fd < 0) {
+		MIF_LOG(ERR, "%s: Failed to create shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		ret = -1;
+		goto error;
+	}
+
+	ret = fcntl(r->fd, F_ADD_SEALS, F_SEAL_SHRINK);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to add seals to shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		goto error;
+	}
+
+	ret = ftruncate(r->fd, r->region_size);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to truncate shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		goto error;
+	}
+
+	r->addr = mmap(NULL, r->region_size, PROT_READ |
+		       PROT_WRITE, MAP_SHARED, r->fd, 0);
+	if (r->addr == MAP_FAILED) {
+		MIF_LOG(ERR, "%s: Failed to mmap shm region: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(ret));
+		ret = -1;
+		goto error;
+	}
+
+	pmd->regions[pmd->regions_num] = r;
+	pmd->regions_num++;
+
+	return ret;
+
+error:
+	if (r->fd > 0)
+		close(r->fd);
+	r->fd = -1;
+
+	return ret;
+}
+
+static int
+memif_regions_init(struct pmd_internals *pmd)
+{
+	int ret;
+
+	/* create one buffer region */
+	ret = memif_region_init_shm(pmd, /* has buffer */ 1);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+static void
+memif_init_rings(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_ring_t *ring;
+	int i, j;
+	uint16_t slot;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = i * (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 0;
+			ring->desc[j].offset = (uint32_t)(slot * pmd->run.pkt_buffer_size);
+			ring->desc[j].length = pmd->run.pkt_buffer_size;
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = (i + pmd->run.num_s2m_rings) *
+			    (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 0;
+			ring->desc[j].offset = (uint32_t)(slot * pmd->run.pkt_buffer_size);
+			ring->desc[j].length = pmd->run.pkt_buffer_size;
+		}
+	}
+}
+
+/* called only by slave */
+static void
+memif_init_queues(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = dev->data->tx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for tx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = dev->data->rx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for rx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+}
+
+int
+memif_init_regions_and_queues(struct rte_eth_dev *dev)
+{
+	int ret;
+
+	ret = memif_regions_init(dev->data->dev_private);
+	if (ret < 0)
+		return ret;
+
+	memif_init_rings(dev);
+
+	memif_init_queues(dev);
+
+	return 0;
+}
+
+int
+memif_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_region *mr;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		mr = pmd->regions[i];
+		if (mr != NULL) {
+			if (mr->addr == NULL) {
+				if (mr->fd < 0)
+					return -1;
+				mr->addr = mmap(NULL, mr->region_size,
+						PROT_READ | PROT_WRITE,
+						MAP_SHARED, mr->fd, 0);
+				if (mr->addr == NULL)
+					return -1;
+				/* calculate packet buffer offset */
+				mr->pkt_buffer_offset = (pmd->run.num_s2m_rings +
+					pmd->run.num_m2s_rings) *
+					(sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+					(1 << pmd->run.log2_ring_size));
+			}
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
+			    mq->offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_MASTER)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
+			    mq->offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_SLAVE)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags |= ETH_MEMIF_FLAG_CONNECTED;
+	dev->data->dev_link.link_status = ETH_LINK_UP;
+	MIF_LOG(INFO, "%s: Connected.", rte_vdev_device_name(pmd->vdev));
+	return 0;
+}
+
+static int
+memif_dev_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int ret = 0;
+
+	switch (pmd->role) {
+	case MEMIF_ROLE_SLAVE:
+		ret = memif_connect_slave(dev);
+		break;
+	case MEMIF_ROLE_MASTER:
+		ret = memif_connect_master(dev);
+		break;
+	default:
+		MIF_LOG(ERR, "%s: Unknown role: %d.",
+			rte_vdev_device_name(pmd->vdev), pmd->role);
+		ret = -1;
+		break;
+	}
+
+	return ret;
+}
+
+static int
+memif_dev_configure(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	/*
+	 * SLAVE - TXQ
+	 * MASTER - RXQ
+	 */
+	pmd->cfg.num_s2m_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_tx_queues : dev->data->nb_rx_queues;
+
+	/*
+	 * SLAVE - RXQ
+	 * MASTER - TXQ
+	 */
+	pmd->cfg.num_m2s_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_rx_queues : dev->data->nb_tx_queues;
+
+	return 0;
+}
+
+static int
+memif_tx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_tx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("tx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate tx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type =
+	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_S2M : MEMIF_RING_M2S;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->pmd = pmd;
+	dev->data->tx_queues[qid] = mq;
+
+	return 0;
+}
+
+static int
+memif_rx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_rx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_rxconf *rx_conf __rte_unused,
+		     struct rte_mempool *mb_pool)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	printf("Setting up rx queue...\n");
+
+	mq = rte_zmalloc("rx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate rx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type = (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_M2S : MEMIF_RING_S2M;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->mempool = mb_pool;
+	mq->in_port = dev->data->port_id;
+	mq->pmd = pmd;
+	dev->data->rx_queues[qid] = mq;
+
+	return 0;
+}
+
+static void
+memif_queue_release(void *queue)
+{
+	struct memif_queue *q = (struct memif_queue *)queue;
+
+	if (!q)
+		return;
+
+	rte_free(q);
+}
+
+static int
+memif_link_update(struct rte_eth_dev *dev __rte_unused,
+		  int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
+static int
+memif_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+	uint8_t tmp, nq;
+
+	stats->ipackets = 0;
+	stats->ibytes = 0;
+	stats->opackets = 0;
+	stats->obytes = 0;
+	stats->oerrors = 0;
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_s2m_rings :
+	    pmd->run.num_m2s_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* RX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->rx_queues[i];
+		stats->q_ipackets[i] = mq->n_pkts;
+		stats->q_ibytes[i] = mq->n_bytes;
+		stats->ipackets += mq->n_pkts;
+		stats->ibytes += mq->n_bytes;
+	}
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_m2s_rings :
+	    pmd->run.num_s2m_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* TX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->tx_queues[i];
+		stats->q_opackets[i] = mq->n_pkts;
+		stats->q_obytes[i] = mq->n_bytes;
+		stats->q_errors[i] = mq->n_err;
+		stats->opackets += mq->n_pkts;
+		stats->obytes += mq->n_bytes;
+		stats->oerrors += mq->n_err;
+	}
+	return 0;
+}
+
+static void
+memif_stats_reset(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int i;
+	struct memif_queue *mq;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->tx_queues[i] :
+		    dev->data->rx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->rx_queues[i] :
+		    dev->data->tx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+}
+
+static int
+memif_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	MIF_LOG(WARNING, "%s: Interrupt mode not supported.",
+		rte_vdev_device_name(pmd->vdev));
+
+	return -1;
+}
+
+static int
+memif_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd __rte_unused = dev->data->dev_private;
+
+	return 0;
+}
+
+static const struct eth_dev_ops ops = {
+	.dev_start = memif_dev_start,
+	.dev_infos_get = memif_dev_info,
+	.dev_configure = memif_dev_configure,
+	.tx_queue_setup = memif_tx_queue_setup,
+	.rx_queue_setup = memif_rx_queue_setup,
+	.rx_queue_release = memif_queue_release,
+	.tx_queue_release = memif_queue_release,
+	.rx_queue_intr_enable = memif_rx_queue_intr_enable,
+	.rx_queue_intr_disable = memif_rx_queue_intr_disable,
+	.link_update = memif_link_update,
+	.stats_get = memif_stats_get,
+	.stats_reset = memif_stats_reset,
+};
+
+static int
+memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
+	     memif_interface_id_t id, uint32_t flags,
+	     const char *socket_filename,
+	     memif_log2_ring_size_t log2_ring_size,
+	     uint16_t pkt_buffer_size, const char *secret,
+	     struct ether_addr *ether_addr)
+{
+	int ret = 0;
+	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev_data *data;
+	struct pmd_internals *pmd;
+	const unsigned int numa_node = vdev->device.numa_node;
+	const char *name = rte_vdev_device_name(vdev);
+
+	if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
+		MIF_LOG(ERR, "Zero-copy slave not supported.");
+		return -1;
+	}
+
+	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
+	if (eth_dev == NULL) {
+		MIF_LOG(ERR, "%s: Unable to allocate device struct.", name);
+		return -1;
+	}
+
+	pmd = eth_dev->data->dev_private;
+	memset(pmd, 0, sizeof(*pmd));
+
+	pmd->vdev = vdev;
+	pmd->id = id;
+	pmd->flags = flags;
+	pmd->flags |= ETH_MEMIF_FLAG_DISABLED;
+	pmd->role = role;
+	printf("%s\n", role == MEMIF_ROLE_SLAVE ? "SLAVE" : "MASTER");
+	ret = memif_socket_init(eth_dev, socket_filename);
+	if (ret < 0)
+		return ret;
+
+	memset(pmd->secret, 0, sizeof(char) * 24);
+	if (secret != NULL)
+		strlcpy(pmd->secret, secret, sizeof(pmd->secret));
+
+	pmd->cfg.log2_ring_size = log2_ring_size;
+	/* set in .dev_configure() */
+	pmd->cfg.num_s2m_rings = 0;
+	pmd->cfg.num_m2s_rings = 0;
+
+	pmd->cfg.pkt_buffer_size = pkt_buffer_size;
+
+	data = eth_dev->data;
+	data->dev_private = pmd;
+	data->numa_node = numa_node;
+	data->mac_addrs = ether_addr;
+
+	eth_dev->dev_ops = &ops;
+	eth_dev->device = &vdev->device;
+	eth_dev->rx_pkt_burst = eth_memif_rx;
+	eth_dev->tx_pkt_burst = eth_memif_tx;
+
+	rte_eth_dev_probing_finish(eth_dev);
+
+	return 0;
+}
+
+static int
+memif_set_role(const char *key __rte_unused, const char *value,
+	       void *extra_args)
+{
+	enum memif_role_t *role = (enum memif_role_t *)extra_args;
+
+	if (strstr(value, "master") != NULL) {
+		*role = MEMIF_ROLE_MASTER;
+	} else if (strstr(value, "slave") != NULL) {
+		*role = MEMIF_ROLE_SLAVE;
+	} else {
+		MIF_LOG(ERR, "Unknown role: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_zc(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	uint32_t *flags = (uint32_t *)extra_args;
+
+	if (strstr(value, "yes") != NULL) {
+		*flags |= ETH_MEMIF_FLAG_ZERO_COPY;
+	} else if (strstr(value, "no") != NULL) {
+		*flags &= ~ETH_MEMIF_FLAG_ZERO_COPY;
+	} else {
+		MIF_LOG(ERR, "Failed to parse zero-copy param: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_id(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	memif_interface_id_t *id = (memif_interface_id_t *)extra_args;
+
+	/* even if parsing fails, 0 is a valid id */
+	*id = strtoul(value, NULL, 10);
+	return 0;
+}
+
+static int
+memif_set_bs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	uint16_t *pkt_buffer_size = (uint16_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > 0xFFFF) {
+		MIF_LOG(ERR, "Invalid buffer size: %s.", value);
+		return -EINVAL;
+	}
+	*pkt_buffer_size = tmp;
+	return 0;
+}
+
+static int
+memif_set_rs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	memif_log2_ring_size_t *log2_ring_size =
+	    (memif_log2_ring_size_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > ETH_MEMIF_MAX_LOG2_RING_SIZE) {
+		MIF_LOG(ERR, "Invalid ring size: %s (max %u).",
+			value, ETH_MEMIF_MAX_LOG2_RING_SIZE);
+		return -EINVAL;
+	}
+	*log2_ring_size = tmp;
+	return 0;
+}
+
+/* check if directory exists and if we have permission to read/write */
+static int
+memif_check_socket_filename(const char *filename)
+{
+	char *dir = NULL, *tmp;
+	uint32_t idx;
+	int ret = 0;
+
+	tmp = strrchr(filename, '/');
+	if (tmp != NULL) {
+		idx = tmp - filename;
+		dir = rte_zmalloc("memif_tmp", sizeof(char) * (idx + 1), 0);
+		if (dir == NULL) {
+			MIF_LOG(ERR, "Failed to allocate memory.");
+			return -1;
+		}
+		strlcpy(dir, filename, sizeof(char) * (idx + 1));
+	}
+
+	if (dir == NULL || (faccessat(-1, dir, F_OK | R_OK |
+					W_OK, AT_EACCESS) < 0)) {
+		MIF_LOG(ERR, "Invalid directory: '%s'.", dir);
+		ret = -EINVAL;
+	}
+
+	if (dir != NULL)
+		rte_free(dir);
+
+	return ret;
+}
+
+static int
+memif_set_socket_filename(const char *key __rte_unused, const char *value,
+			  void *extra_args)
+{
+	const char **socket_filename = (const char **)extra_args;
+
+	*socket_filename = value;
+	return memif_check_socket_filename(*socket_filename);
+}
+
+static int
+memif_set_mac(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	struct ether_addr *ether_addr = (struct ether_addr *)extra_args;
+	int ret = 0;
+
+	ret = sscanf(value, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
+	       &ether_addr->addr_bytes[0], &ether_addr->addr_bytes[1],
+	       &ether_addr->addr_bytes[2], &ether_addr->addr_bytes[3],
+	       &ether_addr->addr_bytes[4], &ether_addr->addr_bytes[5]);
+	if (ret != 6)
+		MIF_LOG(WARNING, "Failed to parse mac '%s'.", value);
+	return 0;
+}
+
+static int
+memif_set_secret(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	const char **secret = (const char **)extra_args;
+
+	*secret = value;
+	return 0;
+}
+
+static int
+rte_pmd_memif_probe(struct rte_vdev_device *vdev)
+{
+	RTE_BUILD_BUG_ON(sizeof(memif_msg_t) != 128);
+	RTE_BUILD_BUG_ON(sizeof(memif_desc_t) != 16);
+	int ret = 0;
+	struct rte_kvargs *kvlist;
+	const char *name = rte_vdev_device_name(vdev);
+	enum memif_role_t role = MEMIF_ROLE_SLAVE;
+	memif_interface_id_t id = 0;
+	uint16_t pkt_buffer_size = ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE;
+	memif_log2_ring_size_t log2_ring_size = ETH_MEMIF_DEFAULT_RING_SIZE;
+	const char *socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME;
+	uint32_t flags = 0;
+	const char *secret = NULL;
+	struct ether_addr *ether_addr = rte_zmalloc("", sizeof(struct ether_addr), 0);
+
+	eth_random_addr(ether_addr->addr_bytes);
+
+	MIF_LOG(INFO, "Initialize MEMIF: %s.", name);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		MIF_LOG(ERR, "Multi-processing not supported for memif.");
+		/* TODO:
+		 * Request connection information.
+		 *
+		 * Once memif in the primary process is connected,
+		 * broadcast connection information.
+		 */
+		return -1;
+	}
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_arguments);
+
+	/* parse parameters */
+	if (kvlist != NULL) {
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ROLE_ARG,
+					 &memif_set_role, &role);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ID_ARG,
+					 &memif_set_id, &id);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
+					 &memif_set_bs, &pkt_buffer_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_RING_SIZE_ARG,
+					 &memif_set_rs, &log2_ring_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SOCKET_ARG,
+					 &memif_set_socket_filename,
+					 (void *)(&socket_filename));
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_MAC_ARG,
+					 &memif_set_mac, ether_addr);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ZC_ARG,
+					 &memif_set_zc, &flags);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SECRET_ARG,
+					 &memif_set_secret, (void *)(&secret));
+		if (ret < 0)
+			goto exit;
+	}
+
+	/* create interface */
+	ret = memif_create(vdev, role, id, flags, socket_filename,
+			   log2_ring_size, pkt_buffer_size, secret, ether_addr);
+
+exit:
+	if (kvlist != NULL)
+		rte_kvargs_free(kvlist);
+	return ret;
+}
+
+static int
+rte_pmd_memif_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_eth_dev *eth_dev;
+	struct pmd_internals *pmd;
+
+	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+	if (eth_dev == NULL)
+		return 0;
+
+	pmd = eth_dev->data->dev_private;
+
+	memif_msg_enq_disconnect(pmd->cc, "Device removed", 0);
+	memif_disconnect(eth_dev);
+
+	memif_socket_remove_device(eth_dev);
+
+	pmd->vdev = NULL;
+
+	rte_eth_dev_release_port(eth_dev);
+
+	return 0;
+}
+
+static struct rte_vdev_driver pmd_memif_drv = {
+	.probe = rte_pmd_memif_probe,
+	.remove = rte_pmd_memif_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_memif, pmd_memif_drv);
+
+RTE_PMD_REGISTER_PARAM_STRING(net_memif,
+			      ETH_MEMIF_ID_ARG "=<int>"
+			      ETH_MEMIF_ROLE_ARG "=master|slave"
+			      ETH_MEMIF_PKT_BUFFER_SIZE_ARG "=<int>"
+			      ETH_MEMIF_RING_SIZE_ARG "=<int>"
+			      ETH_MEMIF_SOCKET_ARG "=<string>"
+			      ETH_MEMIF_MAC_ARG "=xx:xx:xx:xx:xx:xx"
+			      ETH_MEMIF_ZC_ARG "=yes|no"
+			      ETH_MEMIF_SECRET_ARG "=<string>");
+
+int memif_logtype;
+
+RTE_INIT(memif_init_log)
+{
+	memif_logtype = rte_log_register("pmd.net.memif");
+	if (memif_logtype >= 0)
+		rte_log_set_level(memif_logtype, RTE_LOG_NOTICE);
+}
diff --git a/drivers/net/memif/rte_eth_memif.h b/drivers/net/memif/rte_eth_memif.h
new file mode 100644
index 000000000..e640c975f
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.h
@@ -0,0 +1,207 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _RTE_ETH_MEMIF_H_
+#define _RTE_ETH_MEMIF_H_
+
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE
+#endif				/* GNU_SOURCE */
+
+#include <sys/queue.h>
+
+#include <rte_ethdev_driver.h>
+#include <rte_ether.h>
+#include <rte_interrupts.h>
+
+#include "memif.h"
+
+#define ETH_MEMIF_DEFAULT_SOCKET_FILENAME	"/tmp/memif.sock"
+#define ETH_MEMIF_DEFAULT_RING_SIZE		10
+#define ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE	2048
+
+#define ETH_MEMIF_MAX_NUM_Q_PAIRS		255
+#define ETH_MEMIF_MAX_LOG2_RING_SIZE		14
+#define ETH_MEMIF_MAX_REGION_NUM		256
+
+#define ETH_MEMIF_SHM_NAME_SIZE			32
+
+extern int memif_logtype;
+
+#define MIF_LOG(level, fmt, args...) \
+	rte_log(RTE_LOG_ ## level, memif_logtype, \
+		"%s(): " fmt "\n", __func__, ##args)
+
+enum memif_role_t {
+	MEMIF_ROLE_MASTER,
+	MEMIF_ROLE_SLAVE,
+};
+
+struct memif_region {
+	void *addr;				/**< shared memory address */
+	memif_region_size_t region_size;	/**< shared memory size */
+	int fd;					/**< shared memory file descriptor */
+	uint32_t pkt_buffer_offset;
+	/**< offset from 'addr' to first packet buffer */
+};
+
+struct memif_queue {
+	struct rte_mempool *mempool;		/**< mempool for RX packets */
+	uint16_t in_port;			/**< port id */
+
+	struct pmd_internals *pmd;		/**< device internals */
+
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+
+	/* ring info */
+	memif_ring_type_t type;			/**< ring type */
+	memif_ring_t *ring;			/**< pointer to ring */
+	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
+
+	memif_region_index_t region;		/**< shared memory region index */
+	memif_region_offset_t offset;
+	/**< ring offset from start of shm region (ring - memif_region.addr) */
+
+	uint16_t last_head;			/**< last ring head */
+	uint16_t last_tail;			/**< last ring tail */
+
+	/* rx/tx info */
+	uint64_t n_pkts;			/**< number of rx/tx packets */
+	uint64_t n_bytes;			/**< number of rx/tx bytes */
+	uint64_t n_err;				/**< number of tx errors */
+};
+
+struct pmd_internals {
+	memif_interface_id_t id;		/**< unique id */
+	enum memif_role_t role;			/**< device role */
+	uint32_t flags;				/**< device status flags */
+#define ETH_MEMIF_FLAG_CONNECTING	(1 << 0)
+/**< device is connecting */
+#define ETH_MEMIF_FLAG_CONNECTED	(1 << 1)
+/**< device is connected */
+#define ETH_MEMIF_FLAG_ZERO_COPY	(1 << 2)
+/**< device is zero-copy enabled */
+#define ETH_MEMIF_FLAG_DISABLED		(1 << 3)
+/**< device has not been configured and can not accept connection requests */
+
+	char *socket_filename;			/**< pointer to socket filename */
+	char secret[24]; /**< secret (optional security parameter) */
+
+	struct memif_control_channel *cc;	/**< control channel */
+
+	struct memif_region *regions[ETH_MEMIF_MAX_REGION_NUM];
+	/**< shared memory regions */
+	memif_region_index_t regions_num;	/**< number of regions */
+
+	/* remote info */
+	char remote_name[64];			/**< remote app name */
+	char remote_if_name[64];		/**< remote peer name */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t pkt_buffer_size;	/**< buffer size */
+	} cfg;					/**< Configured parameters (max values) */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t pkt_buffer_size;	/**< buffer size */
+	} run;
+	/**< Parameters used in active connection */
+
+	char local_disc_string[96];		/**< local disconnect reason */
+	char remote_disc_string[96];		/**< remote disconnect reason */
+
+	struct rte_vdev_device *vdev;		/**< vdev handle */
+};
+
+/**
+ * Unmap shared memory and free regions from memory.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_free_regions(struct pmd_internals *pmd);
+
+/**
+ * Finalize connection establishment process. Map shared memory file
+ * (master role), initialize ring queue, set link status up.
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect(struct rte_eth_dev *dev);
+
+/**
+ * Create shared memory file and initialize ring queue.
+ * Only called by slave when establishing connection
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_init_regions_and_queues(struct rte_eth_dev *dev);
+
+/**
+ * Get memif version string.
+ *
+ * @return
+ *   - memif version string
+ */
+const char *memif_version(void);
+
+#ifndef MFD_HUGETLB
+#ifndef __NR_memfd_create
+
+#if defined __x86_64__
+#define __NR_memfd_create 319
+#elif defined __x86_32__
+#define __NR_memfd_create 1073742143
+#elif defined __arm__
+#define __NR_memfd_create 385
+#elif defined __aarch64__
+#define __NR_memfd_create 279
+#elif defined __powerpc__
+#define __NR_memfd_create 360
+#elif defined __i386__
+#define __NR_memfd_create 356
+#else
+#error "__NR_memfd_create unknown for this architecture"
+#endif
+
+#endif				/* __NR_memfd_create */
+
+static inline int memfd_create(const char *name, unsigned int flags)
+{
+	return syscall(__NR_memfd_create, name, flags);
+}
+#endif				/* MFD_HUGETLB */
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef MFD_ALLOW_SEALING
+#define MFD_ALLOW_SEALING       0x0002U
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL     0x0001	/* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002	/* prevent file from shrinking */
+#define F_SEAL_GROW     0x0004	/* prevent file from growing */
+#define F_SEAL_WRITE    0x0008	/* prevent writes */
+#endif
+
+#endif				/* RTE_ETH_MEMIF_H */
diff --git a/drivers/net/memif/rte_pmd_memif_version.map b/drivers/net/memif/rte_pmd_memif_version.map
new file mode 100644
index 000000000..4b2e62193
--- /dev/null
+++ b/drivers/net/memif/rte_pmd_memif_version.map
@@ -0,0 +1,4 @@
+DPDK_19.05 {
+
+        local: *;
+};
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index ed99896c3..b57073483 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -24,6 +24,7 @@ drivers = ['af_packet',
 	'ixgbe',
 	'kni',
 	'liquidio',
+	'memif',
 	'mlx4',
 	'mlx5',
 	'mvneta',
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 7c9b4b538..326b2a30b 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -174,6 +174,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KNI)        += -lrte_pmd_kni
 endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD)        += -lrte_pmd_lio
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF)      += -lrte_pmd_memif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -lrte_pmd_mlx4
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += -lrte_pmd_mlx5 -lmnl
 ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
--
2.17.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [dpdk-dev] [RFC v7] /net: memory interface (memif)
  2019-05-09  8:30     ` [dpdk-dev] [RFC v6] " Jakub Grajciar
  2019-05-09  8:30       ` Jakub Grajciar
@ 2019-05-13 10:45       ` Jakub Grajciar
  2019-05-13 10:45         ` Jakub Grajciar
  2019-05-16 11:46         ` [dpdk-dev] [RFC v8] " Jakub Grajciar
  1 sibling, 2 replies; 58+ messages in thread
From: Jakub Grajciar @ 2019-05-13 10:45 UTC (permalink / raw)
  To: dev; +Cc: Jakub Grajciar

Memory interface (memif), provides high performance
packet transfer over shared memory.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
---
 MAINTAINERS                                 |    6 +
 app/test-pmd/testpmd.c                      |    2 +
 config/common_base                          |    5 +
 config/common_linux                         |    1 +
 doc/guides/nics/features/memif.ini          |   14 +
 doc/guides/nics/index.rst                   |    1 +
 doc/guides/nics/memif.rst                   |  234 ++++
 doc/guides/rel_notes/release_19_05.rst      |    4 +
 drivers/net/Makefile                        |    1 +
 drivers/net/memif/Makefile                  |   30 +
 drivers/net/memif/memif.h                   |  179 +++
 drivers/net/memif/memif_socket.c            | 1103 +++++++++++++++++
 drivers/net/memif/memif_socket.h            |  105 ++
 drivers/net/memif/meson.build               |   13 +
 drivers/net/memif/rte_eth_memif.c           | 1202 +++++++++++++++++++
 drivers/net/memif/rte_eth_memif.h           |  207 ++++
 drivers/net/memif/rte_pmd_memif_version.map |    4 +
 drivers/net/meson.build                     |    1 +
 mk/rte.app.mk                               |    1 +
 19 files changed, 3113 insertions(+)
 create mode 100644 doc/guides/nics/features/memif.ini
 create mode 100644 doc/guides/nics/memif.rst
 create mode 100644 drivers/net/memif/Makefile
 create mode 100644 drivers/net/memif/memif.h
 create mode 100644 drivers/net/memif/memif_socket.c
 create mode 100644 drivers/net/memif/memif_socket.h
 create mode 100644 drivers/net/memif/meson.build
 create mode 100644 drivers/net/memif/rte_eth_memif.c
 create mode 100644 drivers/net/memif/rte_eth_memif.h
 create mode 100644 drivers/net/memif/rte_pmd_memif_version.map

requires patch: http://patchwork.dpdk.org/patch/51511/
Memif uses a unix domain socket as control channel (cc).
rte_interrupts.h is used to handle interrupts for this cc.
This patch is required, because the message received can
be a reason for disconnecting.

v3:
- coding style fixes
- documentation
- doxygen comments
- use strlcpy() instead of strncpy()
- use RTE_BUILD_BUG_ON instead of _Static_assert()
- fix meson build deps

v4:
- coding style fixes
- doc update (desc format, messaging, features, ...)
- pointer arithmetic fix
- __rte_packed and __rte_aligned instead of __attribute__()
- fixed multi-queue
- support additional archs (memfd_create syscall)

v5:
- coding style fixes
- add release notes
- remove unused dependencies
- doc update

v6:
- remove socket on termination of testpmd application
- disallow memif probing in secondary process (return error), multi-process support will be implemented in separate patch
- fix chained buffer packet length
- use single region for rings and buffers (non-zero-copy)
- doc update

v7:
- coding style fixes

diff --git a/MAINTAINERS b/MAINTAINERS
index 15d0829c5..3698c146b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -841,6 +841,12 @@ F: drivers/net/softnic/
 F: doc/guides/nics/features/softnic.ini
 F: doc/guides/nics/softnic.rst

+Memif PMD
+M: Jakub Grajciar <jgrajcia@cisco.com>
+F: drivers/net/memif/
+F: doc/guides/nics/memif.rst
+F: doc/guides/nics/features/memif.ini
+

 Crypto Drivers
 --------------
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 6fbfd29dd..9eec4c19d 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2493,6 +2493,8 @@ pmd_test_exit(void)
 			device = rte_eth_devices[pt_id].device;
 			if (device && !strcmp(device->driver->name, "net_virtio_user"))
 				detach_port_device(pt_id);
+			else if (device && !strcmp(device->driver->name, "net_memif"))
+				detach_port_device(pt_id);
 		}
 	}

diff --git a/config/common_base b/config/common_base
index 6b96e0e80..dca119a65 100644
--- a/config/common_base
+++ b/config/common_base
@@ -444,6 +444,11 @@ CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 #
 CONFIG_RTE_LIBRTE_PMD_AF_XDP=n

+#
+# Compile Memory Interface PMD driver (Linux only)
+#
+CONFIG_RTE_LIBRTE_PMD_MEMIF=n
+
 #
 # Compile link bonding PMD library
 #
diff --git a/config/common_linux b/config/common_linux
index 75334273d..87514fe4f 100644
--- a/config/common_linux
+++ b/config/common_linux
@@ -19,6 +19,7 @@ CONFIG_RTE_LIBRTE_VHOST_POSTCOPY=n
 CONFIG_RTE_LIBRTE_PMD_VHOST=y
 CONFIG_RTE_LIBRTE_IFC_PMD=y
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
+CONFIG_RTE_LIBRTE_PMD_MEMIF=y
 CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
 CONFIG_RTE_LIBRTE_PMD_TAP=y
 CONFIG_RTE_LIBRTE_AVP_PMD=y
diff --git a/doc/guides/nics/features/memif.ini b/doc/guides/nics/features/memif.ini
new file mode 100644
index 000000000..807d9ecdc
--- /dev/null
+++ b/doc/guides/nics/features/memif.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'memif' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Link status          = Y
+Basic stats          = Y
+Jumbo frame          = Y
+ARMv8                = Y
+Power8               = Y
+x86-32               = Y
+x86-64               = Y
+Usage doc            = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 2221c35f2..691e7209f 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -36,6 +36,7 @@ Network Interface Controller Drivers
     intel_vf
     kni
     liquidio
+    memif
     mlx4
     mlx5
     mvneta
diff --git a/doc/guides/nics/memif.rst b/doc/guides/nics/memif.rst
new file mode 100644
index 000000000..0d0312442
--- /dev/null
+++ b/doc/guides/nics/memif.rst
@@ -0,0 +1,234 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2018-2019 Cisco Systems, Inc.
+
+======================
+Memif Poll Mode Driver
+======================
+
+Shared memory packet interface (memif) PMD allows for DPDK and any other client
+using memif (DPDK, VPP, libmemif) to communicate using shared memory. Memif is
+Linux only.
+
+The created device transmits packets in a raw format. It can be used with
+Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
+supported in DPDK memif implementation.
+
+Memif works in two roles: master and slave. Slave connects to master over an
+existing socket. It is also a producer of shared memory file and initializes
+the shared memory. Each interface can be connected to one peer interface
+at same time. The peer interface is identified by id parameter. Master
+creates the socket and listens for any slave connection requests. The socket
+may already exist on the system. Be sure to remove any such sockets, if you
+are creating a master interface, or you will see an "Address already in use"
+error. Function ``rte_pmd_memif_remove()``, which removes memif interface,
+will also remove a listener socket, if it is not being used by any other
+interface.
+
+The method to enable one or more interfaces is to use the
+``--vdev=net_memif0`` option on the DPDK application command line. Each
+``--vdev=net_memif1`` option given will create an interface named net_memif0,
+net_memif1, and so on. Memif uses unix domain socket to transmit control
+messages. Each memif has a unique id per socket. This id is used to identify
+peer interface. If you are connecting multiple
+interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``,
+etc. Note that if you assign a socket to a master interface it becomes a
+listener socket. Listener socket can not be used by a slave interface on same
+client.
+
+.. csv-table:: **Memif configuration options**
+   :header: "Option", "Description", "Default", "Valid value"
+
+   "id=0", "Used to identify peer interface", "0", "uint32_t"
+   "role=master", "Set memif role", "slave", "master|slave"
+   "bsize=1024", "Size of single packet buffer", "2048", "uint16_t"
+   "rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", "10", "1-14"
+   "nrxq=2", "Number of RX queues", "1", "255"
+   "ntxq=2", "Number of TX queues", "1", "255"
+   "socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 256"
+   "mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef", ""
+   "secret=abc123", "Secret is an optional security option, which if specified, must be matched by peer", "", "string len 24"
+   "zero-copy=yes", "Enable/disable zero-copy slave mode", "no", "yes|no"
+
+**Connection establishment**
+
+In order to create memif connection, two memif interfaces, each in separate
+process, are needed. One interface in ``master`` role and other in
+``slave`` role. It is not possible to connect two interfaces in a single
+process. Each interface can be connected to one interface at same time,
+identified by matching id parameter.
+
+Memif driver uses unix domain socket to exchange required information between
+memif interfaces. Socket file path is specified at interface creation see
+*Memif configuration options* table above. If socket is used by ``master``
+interface, it's marked as listener socket (in scope of current process) and
+listens to connection requests from other processes. One socket can be used by
+multiple interfaces. One process can have ``slave`` and ``master`` interfaces
+at the same time, provided each role is assigned unique socket.
+
+For detailed information on memif control messages, see: net/memif/memif.h.
+
+Slave interface attempts to make a connection on assigned socket. Process
+listening on this socket will extract the connection request and create a new
+connected socket (control channel). Then it sends the 'hello' message
+(``MEMIF_MSG_TYPE_HELLO``), containing configuration boundaries. Slave interface
+adjusts its configuration accordingly, and sends 'init' message
+(``MEMIF_MSG_TYPE_INIT``). This message among others contains interface id. Driver
+uses this id to find master interface, and assigns the control channel to this
+interface. If such interface is found, 'ack' message (``MEMIF_MSG_TYPE_ACK``) is
+sent. Slave interface sends 'add region' message (``MEMIF_MSG_TYPE_ADD_REGION``) for
+every region allocated. Master responds to each of these messages with 'ack'
+message. Same behavior applies to rings. Slave sends 'add ring' message
+(``MEMIF_MSG_TYPE_ADD_RING``) for every initialized ring. Master again responds to
+each message with 'ack' message. To finalize the connection, slave interface
+sends 'connect' message (``MEMIF_MSG_TYPE_CONNECT``). Upon receiving this message
+master maps regions to its address space, initializes rings and responds with
+'connected' message (``MEMIF_MSG_TYPE_CONNECTED``). Disconnect
+(``MEMIF_MSG_TYPE_DISCONNECT``) can be sent by both master and slave interfaces at
+any time, due to driver error or if the interface is being deleted.
+
+Files
+
+- net/memif/memif.h *- control messages definitions*
+- net/memif/memif_socket.h
+- net/memif/memif_socket.c
+
+Shared memory
+~~~~~~~~~~~~~
+
+**Shared memory format**
+
+Slave is producer and master is consumer. Memory regions, are mapped shared memory files,
+created by memif slave and provided to master at connection establishment.
+Regions contain rings and buffers. Rings and buffers can also be separated into multiple
+regions. For no-zero-copy, rings and buffers are stored inside single memory
+region to reduce the number of opened files.
+
+region n (no-zero-copy):
+
++-----------------------+-------------------------------------------------------------------------+
+| Rings                 | Buffers                                                                 |
++-----------+-----------+-----------------+---+---------------------------------------------------+
+| S2M rings | M2S rings | packet buffer 0 | . | pb ((1 << pmd->run.log2_ring_size)*(s2m + m2s))-1 |
++-----------+-----------+-----------------+---+---------------------------------------------------+
+
+S2M OR M2S Rings:
+
++--------+--------+-----------------------+
+| ring 0 | ring 1 | ring num_s2m_rings - 1|
++--------+--------+-----------------------+
+
+ring 0:
+
++-------------+---------------------------------------+
+| ring header | (1 << pmd->run.log2_ring_size) * desc |
++-------------+---------------------------------------+
+
+Descriptors are assigned packet buffers in order of rings creation. If we have one ring
+in each direction and ring size is 1024, then first 1024 buffers will belong to S2M ring and
+last 1024 will belong to M2S ring. In case of zero-copy, buffers are dequeued and
+enqueued as needed.
+
+**Descriptor format**
+
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Quad|6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | |1|1| | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Word|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | |6|5| | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|0   |length                                                         |region                         |flags                          |
++----+---------------------------------------------------------------+-------------------------------+-------------------------------+
+|1   |metadata                                                       |offset                                                         |
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+**Flags field - flags (Quad Word 0, bits 0:15)**
+
++-----+--------------------+------------------------------------------------------------------------------------------------+
+|Bits |Name                |Functionality                                                                                   |
++=====+====================+================================================================================================+
+|0    |MEMIF_DESC_FLAG_NEXT|Is chained buffer. When set, the packet is divided into multiple buffers. May not be contiguous.|
++-----+--------------------+------------------------------------------------------------------------------------------------+
+
+**Region index - region (Quad Word 0, 16:31)**
+
+Index of memory region, the buffer is located in.
+
+**Data length - length (Quad Word 0, 32:63)**
+
+Length of transmitted/received data.
+
+**Data Offset - offset (Quad Word 1, 0:31)**
+
+Data start offset from memory region address. *.regions[desc->region].addr + desc->offset*
+
+**Metadata - metadata (Quad Word 1, 32:63)**
+
+Buffer metadata.
+
+Files
+
+- net/memif/memif.h *- descriptor and ring definitions*
+- net/memif/rte_eth_memif.c *- eth_memif_rx() eth_memif_tx()*
+
+Example: testpmd and testpmd
+----------------------------
+In this example we run two instances of testpmd application and transmit packets over memif.
+
+First create ``master`` interface::
+
+    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
+
+Now create ``slave`` interface (master must be already running so the slave will connect)::
+
+    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
+
+Set forwarding mode in one of the instances to 'rx only' and the other to 'tx only'::
+
+    testpmd> set fwd rxonly
+    testpmd> start
+
+    testpmd> set fwd txonly
+    testpmd> start
+
+Show status::
+
+    testpmd> show port stats 0
+
+Example: testpmd and VPP
+------------------------
+For information on how to get and run VPP please see `<https://wiki.fd.io/view/VPP>`_.
+
+Start VPP in interactive mode (should be by default). Create memif master interface in VPP::
+
+    vpp# create interface memif id 0 master no-zero-copy
+    vpp# set interface state memif0/0 up
+    vpp# set interface ip address memif0/0 192.168.1.1/24
+
+To see socket filename use show memif command::
+
+    vpp# show memif
+    sockets
+     id  listener    filename
+      0   yes (1)     /run/vpp/memif.sock
+    ...
+
+Now create memif interface by running testpmd with these command line options::
+
+    #./testpmd --vdev=net_memif,socket=/run/vpp/memif.sock -- -i
+
+Testpmd should now create memif slave interface and try to connect to master.
+In testpmd set forward option to icmpecho and start forwarding::
+
+    testpmd> set fwd icmpecho
+    testpmd> start
+
+Send ping from VPP::
+
+    vpp# ping 192.168.1.2
+    64 bytes from 192.168.1.2: icmp_seq=2 ttl=254 time=36.2918 ms
+    64 bytes from 192.168.1.2: icmp_seq=3 ttl=254 time=23.3927 ms
+    64 bytes from 192.168.1.2: icmp_seq=4 ttl=254 time=24.2975 ms
+    64 bytes from 192.168.1.2: icmp_seq=5 ttl=254 time=17.7049 ms
diff --git a/doc/guides/rel_notes/release_19_05.rst b/doc/guides/rel_notes/release_19_05.rst
index 5044ac7df..c77001cd5 100644
--- a/doc/guides/rel_notes/release_19_05.rst
+++ b/doc/guides/rel_notes/release_19_05.rst
@@ -221,6 +221,10 @@ New Features
   Added support for Intel SST-BF feature so that the distributor core is
   pinned to a high frequency core if available.

+* **Added memif PMD.**
+
+  Added the new Shared Memory Packet Interface (``memif``) PMD.
+  See the :doc:`../nics/memif` guide for more details on this new driver.

 Removed Items
 -------------
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 3a72cf38c..78cb10fc6 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -35,6 +35,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_ICE_PMD) += ice
 DIRS-$(CONFIG_RTE_LIBRTE_IPN3KE_PMD) += ipn3ke
 DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
 DIRS-$(CONFIG_RTE_LIBRTE_LIO_PMD) += liquidio
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif
 DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
 DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
 DIRS-$(CONFIG_RTE_LIBRTE_MVNETA_PMD) += mvneta
diff --git a/drivers/net/memif/Makefile b/drivers/net/memif/Makefile
new file mode 100644
index 000000000..2ae3c37a2
--- /dev/null
+++ b/drivers/net/memif/Makefile
@@ -0,0 +1,30 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_memif.a
+
+EXPORT_MAP := rte_pmd_memif_version.map
+
+LIBABIVER := 1
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+# Experimantal APIs:
+# - rte_intr_callback_unregister_pending
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool
+LDLIBS += -lrte_ethdev -lrte_kvargs
+LDLIBS += -lrte_hash
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += rte_eth_memif.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif_socket.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/memif/memif.h b/drivers/net/memif/memif.h
new file mode 100644
index 000000000..3948b1f50
--- /dev/null
+++ b/drivers/net/memif/memif.h
@@ -0,0 +1,179 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_H_
+#define _MEMIF_H_
+
+#define MEMIF_COOKIE		0x3E31F20
+#define MEMIF_VERSION_MAJOR	2
+#define MEMIF_VERSION_MINOR	0
+#define MEMIF_VERSION		((MEMIF_VERSION_MAJOR << 8) | MEMIF_VERSION_MINOR)
+#define MEMIF_NAME_SZ		32
+
+/*
+ * S2M: direction slave -> master
+ * M2S: direction master -> slave
+ */
+
+/*
+ *  Type definitions
+ */
+
+typedef enum memif_msg_type {
+	MEMIF_MSG_TYPE_NONE,
+	MEMIF_MSG_TYPE_ACK,
+	MEMIF_MSG_TYPE_HELLO,
+	MEMIF_MSG_TYPE_INIT,
+	MEMIF_MSG_TYPE_ADD_REGION,
+	MEMIF_MSG_TYPE_ADD_RING,
+	MEMIF_MSG_TYPE_CONNECT,
+	MEMIF_MSG_TYPE_CONNECTED,
+	MEMIF_MSG_TYPE_DISCONNECT,
+} memif_msg_type_t;
+
+typedef enum {
+	MEMIF_RING_S2M, /**< buffer ring in direction slave -> master */
+	MEMIF_RING_M2S, /**< buffer ring in direction master -> slave */
+} memif_ring_type_t;
+
+typedef enum {
+	MEMIF_INTERFACE_MODE_ETHERNET,
+	MEMIF_INTERFACE_MODE_IP,
+	MEMIF_INTERFACE_MODE_PUNT_INJECT,
+} memif_interface_mode_t;
+
+typedef uint16_t memif_region_index_t;
+typedef uint32_t memif_region_offset_t;
+typedef uint64_t memif_region_size_t;
+typedef uint16_t memif_ring_index_t;
+typedef uint32_t memif_interface_id_t;
+typedef uint16_t memif_version_t;
+typedef uint8_t memif_log2_ring_size_t;
+
+/*
+ *  Socket messages
+ */
+
+ /**
+  * M2S
+  * Contains master interfaces configuration.
+  */
+typedef struct __rte_packed {
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+	memif_version_t min_version; /**< lowest supported memif version */
+	memif_version_t max_version; /**< highest supported memif version */
+	memif_region_index_t max_region; /**< maximum num of regions */
+	memif_ring_index_t max_m2s_ring; /**< maximum num of M2S ring */
+	memif_ring_index_t max_s2m_ring; /**< maximum num of S2M rings */
+	memif_log2_ring_size_t max_log2_ring_size; /**< maximum ring size (as log2) */
+} memif_msg_hello_t;
+
+/**
+ * S2M
+ * Contains information required to identify interface
+ * to which the slave wants to connect.
+ */
+typedef struct __rte_packed {
+	memif_version_t version;		/**< memif version */
+	memif_interface_id_t id;		/**< interface id */
+	memif_interface_mode_t mode:8;		/**< interface mode */
+	uint8_t secret[24];			/**< optional security parameter */
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+} memif_msg_init_t;
+
+/**
+ * S2M
+ * Request master to add new shared memory region to master interface.
+ * Shared files file descriptor is passed in cmsghdr.
+ */
+typedef struct __rte_packed {
+	memif_region_index_t index;		/**< shm regions index */
+	memif_region_size_t size;		/**< shm region size */
+} memif_msg_add_region_t;
+
+/**
+ * S2M
+ * Request master to add new ring to master interface.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_MSG_ADD_RING_FLAG_S2M 1		/**< ring is in S2M direction */
+	memif_ring_index_t index;		/**< ring index */
+	memif_region_index_t region; /**< region index on which this ring is located */
+	memif_region_offset_t offset;		/**< buffer start offset */
+	memif_log2_ring_size_t log2_ring_size;	/**< ring size (log2) */
+	uint16_t private_hdr_size;		/**< used for private metadata */
+} memif_msg_add_ring_t;
+
+/**
+ * S2M
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< slave interface name */
+} memif_msg_connect_t;
+
+/**
+ * M2S
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< master interface name */
+} memif_msg_connected_t;
+
+/**
+ * S2M & M2S
+ * Disconnect interfaces.
+ */
+typedef struct __rte_packed {
+	uint32_t code;				/**< error code */
+	uint8_t string[96];			/**< disconnect reason */
+} memif_msg_disconnect_t;
+
+typedef struct __rte_packed __rte_aligned(128)
+{
+	memif_msg_type_t type:16;
+	union {
+		memif_msg_hello_t hello;
+		memif_msg_init_t init;
+		memif_msg_add_region_t add_region;
+		memif_msg_add_ring_t add_ring;
+		memif_msg_connect_t connect;
+		memif_msg_connected_t connected;
+		memif_msg_disconnect_t disconnect;
+	};
+} memif_msg_t;
+
+/*
+ *  Ring and Descriptor Layout
+ */
+
+/**
+ * Buffer descriptor.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_DESC_FLAG_NEXT 1			/**< is chained buffer */
+	memif_region_index_t region; /**< region index on which the buffer is located */
+	uint32_t length;			/**< buffer length */
+	memif_region_offset_t offset;		/**< buffer offset */
+	uint32_t metadata;
+} memif_desc_t;
+
+#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
+	uint8_t mark[0] __rte_aligned(RTE_CACHE_LINE_SIZE)
+
+typedef struct {
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline0);
+	uint32_t cookie;			/**< MEMIF_COOKIE */
+	uint16_t flags;				/**< flags */
+#define MEMIF_RING_FLAG_MASK_INT 1		/**< disable interrupt mode */
+	volatile uint16_t head;			/**< pointer to ring buffer head */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline1);
+	volatile uint16_t tail;			/**< pointer to ring buffer tail */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline2);
+	memif_desc_t desc[0];			/**< buffer descriptors */
+} memif_ring_t;
+
+#endif				/* _MEMIF_H_ */
diff --git a/drivers/net/memif/memif_socket.c b/drivers/net/memif/memif_socket.c
new file mode 100644
index 000000000..3596eb686
--- /dev/null
+++ b/drivers/net/memif/memif_socket.c
@@ -0,0 +1,1103 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <errno.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_hash.h>
+#include <rte_jhash.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+static void memif_intr_handler(void *arg);
+
+static ssize_t
+memif_msg_send(int fd, memif_msg_t *msg, int afd)
+{
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	struct cmsghdr *cmsg;
+	char ctl[CMSG_SPACE(sizeof(int))];
+
+	iov[0].iov_base = msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+
+	if (afd > 0) {
+		memset(&ctl, 0, sizeof(ctl));
+		mh.msg_control = ctl;
+		mh.msg_controllen = sizeof(ctl);
+		cmsg = CMSG_FIRSTHDR(&mh);
+		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+		cmsg->cmsg_level = SOL_SOCKET;
+		cmsg->cmsg_type = SCM_RIGHTS;
+		rte_memcpy(CMSG_DATA(cmsg), &afd, sizeof(int));
+	}
+
+	return sendmsg(fd, &mh, 0);
+}
+
+static int
+memif_msg_send_from_queue(struct memif_control_channel *cc)
+{
+	ssize_t size;
+	int ret = 0;
+	struct memif_msg_queue_elt *e;
+
+	e = TAILQ_FIRST(&cc->msg_queue);
+	if (e == NULL)
+		return 0;
+
+	size = memif_msg_send(cc->intr_handle.fd, &e->msg, e->fd);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(ERR, "sendmsg fail: %s.", strerror(errno));
+		ret = -1;
+	} else {
+		MIF_LOG(DEBUG, "Sent msg type %u.", e->msg.type);
+	}
+	TAILQ_REMOVE(&cc->msg_queue, e, next);
+	rte_free(e);
+
+	return ret;
+}
+
+static struct memif_msg_queue_elt *
+memif_msg_enq(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e;
+
+	e = rte_zmalloc("memif_msg", sizeof(struct memif_msg_queue_elt), 0);
+	if (e == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control message.");
+		return NULL;
+	}
+
+	e->fd = -1;
+	TAILQ_INSERT_TAIL(&cc->msg_queue, e, next);
+
+	return e;
+}
+
+void
+memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			 int err_code)
+{
+	struct memif_msg_queue_elt *e;
+	struct pmd_internals *pmd;
+	memif_msg_disconnect_t *d;
+
+	if (cc == NULL) {
+		MIF_LOG(DEBUG, "Missing control channel.");
+		return;
+	}
+
+	e = memif_msg_enq(cc);
+	if (e == NULL) {
+		MIF_LOG(WARNING, "Failed to enqueue disconnect message.");
+		return;
+	}
+
+	d = &e->msg.disconnect;
+
+	e->msg.type = MEMIF_MSG_TYPE_DISCONNECT;
+	d->code = err_code;
+
+	if (reason != NULL) {
+		strlcpy((char *)d->string, reason, sizeof(d->string));
+		if (cc->dev != NULL) {
+			pmd = cc->dev->data->dev_private;
+			strlcpy(pmd->local_disc_string, reason,
+				sizeof(pmd->local_disc_string));
+		}
+	}
+}
+
+static int
+memif_msg_enq_hello(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	memif_msg_hello_t *h;
+
+	if (e == NULL)
+		return -1;
+
+	h = &e->msg.hello;
+
+	e->msg.type = MEMIF_MSG_TYPE_HELLO;
+	h->min_version = MEMIF_VERSION;
+	h->max_version = MEMIF_VERSION;
+	h->max_s2m_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_m2s_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_region = ETH_MEMIF_MAX_REGION_NUM - 1;
+	h->max_log2_ring_size = ETH_MEMIF_MAX_LOG2_RING_SIZE;
+
+	strlcpy((char *)h->name, rte_version(), sizeof(h->name));
+
+	return 0;
+}
+
+static int
+memif_msg_receive_hello(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_hello_t *h = &msg->hello;
+
+	if (h->min_version > MEMIF_VERSION || h->max_version < MEMIF_VERSION) {
+		memif_msg_enq_disconnect(pmd->cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	/* Set parameters for active connection */
+	pmd->run.num_s2m_rings = RTE_MIN(h->max_s2m_ring + 1,
+					   pmd->cfg.num_s2m_rings);
+	pmd->run.num_m2s_rings = RTE_MIN(h->max_m2s_ring + 1,
+					   pmd->cfg.num_m2s_rings);
+	pmd->run.log2_ring_size = RTE_MIN(h->max_log2_ring_size,
+					    pmd->cfg.log2_ring_size);
+	pmd->run.pkt_buffer_size = pmd->cfg.pkt_buffer_size;
+
+	strlcpy(pmd->remote_name, (char *)h->name, sizeof(pmd->remote_name));
+
+	MIF_LOG(DEBUG, "%s: Connecting to %s.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_init(struct memif_control_channel *cc, memif_msg_t *msg)
+{
+	memif_msg_init_t *i = &msg->init;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *pmd;
+	struct rte_eth_dev *dev;
+
+	if (i->version != MEMIF_VERSION) {
+		memif_msg_enq_disconnect(cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	if (cc->socket == NULL) {
+		memif_msg_enq_disconnect(cc, "Device error", 0);
+		return -1;
+	}
+
+	/* Find device with requested ID */
+	TAILQ_FOREACH(elt, &cc->socket->dev_queue, next) {
+		dev = elt->dev;
+		pmd = dev->data->dev_private;
+		if (((pmd->flags & ETH_MEMIF_FLAG_DISABLED) == 0) &&
+		    pmd->id == i->id) {
+			/* assign control channel to device */
+			cc->dev = dev;
+			pmd->cc = cc;
+
+			if (i->mode != MEMIF_INTERFACE_MODE_ETHERNET) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Only ethernet mode supported",
+							 0);
+				return -1;
+			}
+
+			if (pmd->flags & (ETH_MEMIF_FLAG_CONNECTING |
+					   ETH_MEMIF_FLAG_CONNECTED)) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Already connected", 0);
+				return -1;
+			}
+			strlcpy(pmd->remote_name, (char *)i->name,
+				sizeof(pmd->remote_name));
+
+			if (*pmd->secret != '\0') {
+				if (*i->secret == '\0') {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Secret required", 0);
+					return -1;
+				}
+				if (strcmp(pmd->secret, (char *)i->secret) != 0) {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Incorrect secret", 0);
+					return -1;
+				}
+			}
+
+			pmd->flags |= ETH_MEMIF_FLAG_CONNECTING;
+			return 0;
+		}
+	}
+
+	/* ID not found on this socket */
+	MIF_LOG(DEBUG, "ID %u not found.", i->id);
+	memif_msg_enq_disconnect(cc, "ID not found", 0);
+	return -1;
+}
+
+static int
+memif_msg_receive_add_region(struct rte_eth_dev *dev, memif_msg_t *msg,
+			     int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_region_t *ar = &msg->add_region;
+	struct memif_region *r;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing region fd", 0);
+		return -1;
+	}
+
+	if (ar->index >= ETH_MEMIF_MAX_REGION_NUM || ar->index != pmd->regions_num ||
+			pmd->regions[ar->index] != NULL) {
+		memif_msg_enq_disconnect(pmd->cc, "Invalid region index", 0);
+		return -1;
+	}
+
+	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	r->fd = fd;
+	r->region_size = ar->size;
+	r->addr = NULL;
+
+	pmd->regions[ar->index] = r;
+	pmd->regions_num++;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_add_ring(struct rte_eth_dev *dev, memif_msg_t *msg, int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_ring_t *ar = &msg->add_ring;
+	struct memif_queue *mq;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing interrupt fd", 0);
+		return -1;
+	}
+
+	/* check if we have enough queues */
+	if (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) {
+		if (ar->index >= pmd->cfg.num_s2m_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_s2m_rings++;
+	} else {
+		if (ar->index >= pmd->cfg.num_m2s_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_m2s_rings++;
+	}
+
+	mq = (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) ?
+	    dev->data->rx_queues[ar->index] : dev->data->tx_queues[ar->index];
+
+	mq->intr_handle.fd = fd;
+	mq->log2_ring_size = ar->log2_ring_size;
+	mq->region = ar->region;
+	mq->offset = ar->offset;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connect_t *c = &msg->connect;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connected(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connected_t *c = &msg->connected;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_disconnect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_disconnect_t *d = &msg->disconnect;
+
+	memset(pmd->remote_disc_string, 0, sizeof(pmd->remote_disc_string));
+	strlcpy(pmd->remote_disc_string, (char *)d->string,
+		sizeof(pmd->remote_disc_string));
+
+	MIF_LOG(INFO, "%s: Disconnect received: %s",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_disc_string);
+
+	memset(pmd->local_disc_string, 0, 96);
+	memif_disconnect(rte_eth_dev_allocated
+			 (rte_vdev_device_name(pmd->vdev)));
+	return 0;
+}
+
+static int
+memif_msg_enq_ack(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	e->msg.type = MEMIF_MSG_TYPE_ACK;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_init(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_init_t *i = &e->msg.init;
+
+	if (e == NULL)
+		return -1;
+
+	i = &e->msg.init;
+	e->msg.type = MEMIF_MSG_TYPE_INIT;
+	i->version = MEMIF_VERSION;
+	i->id = pmd->id;
+	i->mode = MEMIF_INTERFACE_MODE_ETHERNET;
+
+	strlcpy((char *)i->name, rte_version(), sizeof(i->name));
+
+	if (*pmd->secret != '\0')
+		strlcpy((char *)i->secret, pmd->secret, sizeof(i->secret));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_region(struct rte_eth_dev *dev, uint8_t idx)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_add_region_t *ar;
+	struct memif_region *mr = pmd->regions[idx];
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_region;
+	e->msg.type = MEMIF_MSG_TYPE_ADD_REGION;
+	e->fd = mr->fd;
+	ar->index = idx;
+	ar->size = mr->region_size;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_ring(struct rte_eth_dev *dev, uint8_t idx,
+		       memif_ring_type_t type)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	struct memif_queue *mq;
+	memif_msg_add_ring_t *ar;
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_ring;
+	mq = (type == MEMIF_RING_S2M) ? dev->data->tx_queues[idx] :
+	    dev->data->rx_queues[idx];
+
+	e->msg.type = MEMIF_MSG_TYPE_ADD_RING;
+	e->fd = mq->intr_handle.fd;
+	ar->index = idx;
+	ar->offset = mq->offset;
+	ar->region = mq->region;
+	ar->log2_ring_size = mq->log2_ring_size;
+	ar->flags = (type == MEMIF_RING_S2M) ? MEMIF_MSG_ADD_RING_FLAG_S2M : 0;
+	ar->private_hdr_size = 0;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connect_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connect;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECT;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connected(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connected_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connected;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECTED;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static void
+memif_intr_unregister_handler(struct rte_intr_handle *intr_handle, void *arg)
+{
+	struct memif_msg_queue_elt *elt;
+	struct memif_control_channel *cc = arg;
+
+	/* close control channel fd */
+	close(intr_handle->fd);
+	/* clear message queue */
+	while ((elt = TAILQ_FIRST(&cc->msg_queue)) != NULL) {
+		TAILQ_REMOVE(&cc->msg_queue, elt, next);
+		rte_free(elt);
+	}
+	/* free control channel */
+	rte_free(cc);
+}
+
+void
+memif_disconnect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *elt, *next;
+	struct memif_queue *mq;
+	struct rte_intr_handle *ih;
+	int i;
+	int ret;
+
+	if (pmd->cc != NULL) {
+		/* Clear control message queue (except disconnect message if any). */
+		for (elt = TAILQ_FIRST(&pmd->cc->msg_queue); elt != NULL; elt = next) {
+			next = TAILQ_NEXT(elt, next);
+			if (elt->msg.type != MEMIF_MSG_TYPE_DISCONNECT) {
+				TAILQ_REMOVE(&pmd->cc->msg_queue, elt, next);
+				rte_free(elt);
+			}
+		}
+		/* send disconnect message (if there is any in queue) */
+		memif_msg_send_from_queue(pmd->cc);
+
+		/* at this point, there should be no more messages in queue */
+		if (TAILQ_FIRST(&pmd->cc->msg_queue) != NULL) {
+			MIF_LOG(WARNING,
+				"%s: Unexpected message(s) in message queue.",
+				rte_vdev_device_name(pmd->vdev));
+		}
+
+		ih = &pmd->cc->intr_handle;
+		if (ih->fd > 0) {
+			ret = rte_intr_callback_unregister(ih,
+							memif_intr_handler,
+							pmd->cc);
+			/*
+			 * If callback is active (disconnecting based on
+			 * received control message).
+			 */
+			if (ret == -EAGAIN) {
+				ret = rte_intr_callback_unregister_pending(ih,
+							memif_intr_handler,
+							pmd->cc,
+							memif_intr_unregister_handler);
+			} else if (ret > 0) {
+				close(ih->fd);
+				rte_free(pmd->cc);
+			}
+			pmd->cc = NULL;
+			if (ret <= 0)
+				MIF_LOG(WARNING,
+					"%s: Failed to unregister control channel callback.",
+					rte_vdev_device_name(pmd->vdev));
+		}
+	}
+
+	/* unconfig interrupts */
+	for (i = 0; i < pmd->cfg.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+	for (i = 0; i < pmd->cfg.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+
+	memif_free_regions(pmd);
+
+	/* reset connection configuration */
+	memset(&pmd->run, 0, sizeof(pmd->run));
+
+	dev->data->dev_link.link_status = ETH_LINK_DOWN;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTED;
+	MIF_LOG(DEBUG, "%s: Disconnected.", rte_vdev_device_name(pmd->vdev));
+}
+
+static int
+memif_msg_receive(struct memif_control_channel *cc)
+{
+	char ctl[CMSG_SPACE(sizeof(int)) +
+		 CMSG_SPACE(sizeof(struct ucred))] = { 0 };
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	memif_msg_t msg = { 0 };
+	ssize_t size;
+	int ret = 0;
+	struct ucred *cr __rte_unused;
+	cr = 0;
+	struct cmsghdr *cmsg;
+	int afd = -1;
+	int i;
+	struct pmd_internals *pmd;
+
+	iov[0].iov_base = (void *)&msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+	mh.msg_control = ctl;
+	mh.msg_controllen = sizeof(ctl);
+
+	size = recvmsg(cc->intr_handle.fd, &mh, 0);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(DEBUG, "Invalid message size.");
+		memif_msg_enq_disconnect(cc, "Invalid message size", 0);
+		return -1;
+	}
+	MIF_LOG(DEBUG, "Received msg type: %u.", msg.type);
+
+	cmsg = CMSG_FIRSTHDR(&mh);
+	while (cmsg) {
+		if (cmsg->cmsg_level == SOL_SOCKET) {
+			if (cmsg->cmsg_type == SCM_CREDENTIALS)
+				cr = (struct ucred *)CMSG_DATA(cmsg);
+			else if (cmsg->cmsg_type == SCM_RIGHTS)
+				afd = *(int *)CMSG_DATA(cmsg);
+		}
+		cmsg = CMSG_NXTHDR(&mh, cmsg);
+	}
+
+	if (cc->dev == NULL && msg.type != MEMIF_MSG_TYPE_INIT) {
+		MIF_LOG(DEBUG, "Unexpected message.");
+		memif_msg_enq_disconnect(cc, "Unexpected message", 0);
+		return -1;
+	}
+
+	/* get device from hash data */
+	switch (msg.type) {
+	case MEMIF_MSG_TYPE_ACK:
+		break;
+	case MEMIF_MSG_TYPE_HELLO:
+		ret = memif_msg_receive_hello(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_init_regions_and_queues(cc->dev);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_init(cc->dev);
+		if (ret < 0)
+			goto exit;
+		pmd = cc->dev->data->dev_private;
+		for (i = 0; i < pmd->regions_num; i++) {
+			ret = memif_msg_enq_add_region(cc->dev, i);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_S2M);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_M2S);
+			if (ret < 0)
+				goto exit;
+		}
+		ret = memif_msg_enq_connect(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_INIT:
+		/*
+		 * This cc does not have an interface asociated with it.
+		 * If suitable interface is found it will be assigned here.
+		 */
+		ret = memif_msg_receive_init(cc, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_REGION:
+		ret = memif_msg_receive_add_region(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_RING:
+		ret = memif_msg_receive_add_ring(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECT:
+		ret = memif_msg_receive_connect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_connected(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECTED:
+		ret = memif_msg_receive_connected(cc->dev, &msg);
+		break;
+	case MEMIF_MSG_TYPE_DISCONNECT:
+		ret = memif_msg_receive_disconnect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		break;
+	default:
+		memif_msg_enq_disconnect(cc, "Unknown message type", 0);
+		ret = -1;
+		goto exit;
+	}
+
+ exit:
+	return ret;
+}
+
+static void
+memif_intr_handler(void *arg)
+{
+	struct memif_control_channel *cc = arg;
+	int ret;
+
+	ret = memif_msg_receive(cc);
+	/* if driver failed to assign device */
+	if (cc->dev == NULL) {
+		ret = rte_intr_callback_unregister_pending(&cc->intr_handle,
+							   memif_intr_handler,
+							   cc,
+							   memif_intr_unregister_handler);
+		if (ret < 0)
+			MIF_LOG(WARNING,
+				"Failed to unregister control channel callback.");
+		return;
+	}
+	/* if memif_msg_receive failed */
+	if (ret < 0)
+		goto disconnect;
+
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto disconnect;
+
+	return;
+
+ disconnect:
+	if (cc->dev == NULL) {
+		MIF_LOG(WARNING, "eth dev not allocated");
+		return;
+	}
+	memif_disconnect(cc->dev);
+}
+
+static void
+memif_listener_handler(void *arg)
+{
+	struct memif_socket *socket = arg;
+	int sockfd;
+	int addr_len;
+	struct sockaddr_un client;
+	struct memif_control_channel *cc;
+	int ret;
+
+	addr_len = sizeof(client);
+	sockfd = accept(socket->intr_handle.fd, (struct sockaddr *)&client,
+			(socklen_t *)&addr_len);
+	if (sockfd < 0) {
+		MIF_LOG(ERR,
+			"Failed to accept connection request on socket fd %d",
+			socket->intr_handle.fd);
+		return;
+	}
+
+	MIF_LOG(DEBUG, "%s: Connection request accepted.", socket->filename);
+
+	cc = rte_zmalloc("memif-cc", sizeof(struct memif_control_channel), 0);
+	if (cc == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control channel.");
+		goto error;
+	}
+
+	cc->intr_handle.fd = sockfd;
+	cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	cc->socket = socket;
+	cc->dev = NULL;
+	TAILQ_INIT(&cc->msg_queue);
+
+	ret = rte_intr_callback_register(&cc->intr_handle, memif_intr_handler, cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to register control channel callback.");
+		goto error;
+	}
+
+	ret = memif_msg_enq_hello(cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to enqueue hello message.");
+		goto error;
+	}
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto error;
+
+	return;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (cc != NULL)
+		rte_free(cc);
+}
+
+static struct memif_socket *
+memif_socket_create(struct pmd_internals *pmd, char *key, uint8_t listener)
+{
+	struct memif_socket *sock;
+	struct sockaddr_un un;
+	int sockfd;
+	int ret;
+	int on = 1;
+
+	sock = rte_zmalloc("memif-socket", sizeof(struct memif_socket), 0);
+	if (sock == NULL) {
+		MIF_LOG(ERR, "Failed to allocate memory for memif socket");
+		return NULL;
+	}
+
+	sock->listener = listener;
+	rte_memcpy(sock->filename, key, 256);
+	TAILQ_INIT(&sock->dev_queue);
+
+	if (listener != 0) {
+		sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+		if (sockfd < 0)
+			goto error;
+
+		un.sun_family = AF_UNIX;
+		memcpy(un.sun_path, sock->filename,
+			sizeof(un.sun_path) - 1);
+
+		ret = setsockopt(sockfd, SOL_SOCKET, SO_PASSCRED, &on,
+				 sizeof(on));
+		if (ret < 0)
+			goto error;
+		ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
+		if (ret < 0)
+			goto error;
+		ret = listen(sockfd, 1);
+		if (ret < 0)
+			goto error;
+
+		MIF_LOG(DEBUG, "%s: Memif listener socket %s created.",
+			rte_vdev_device_name(pmd->vdev), sock->filename);
+
+		sock->intr_handle.fd = sockfd;
+		sock->intr_handle.type = RTE_INTR_HANDLE_EXT;
+		ret = rte_intr_callback_register(&sock->intr_handle,
+						 memif_listener_handler, sock);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to register interrupt "
+				"callback for listener socket",
+				rte_vdev_device_name(pmd->vdev));
+			return NULL;
+		}
+	}
+
+	return sock;
+
+ error:
+	MIF_LOG(ERR, "%s: Failed to setup socket %s: %s",
+		rte_vdev_device_name(pmd->vdev), key, strerror(errno));
+	if (sock != NULL)
+		rte_free(sock);
+	return NULL;
+}
+
+static struct rte_hash *
+memif_create_socket_hash(void)
+{
+	struct rte_hash_parameters params = { 0 };
+	params.name = MEMIF_SOCKET_HASH_NAME;
+	params.entries = 256;
+	params.key_len = 256;
+	params.hash_func = rte_jhash;
+	params.hash_func_init_val = 0;
+	return rte_hash_create(&params);
+}
+
+int
+memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *tmp_pmd;
+	struct rte_hash *hash;
+	int ret;
+	char key[256];
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL) {
+		hash = memif_create_socket_hash();
+		if (hash == NULL) {
+			MIF_LOG(ERR, "Failed to create memif socket hash.");
+			return -1;
+		}
+	}
+
+	memset(key, 0, 256);
+	rte_memcpy(key, socket_filename, strlen(socket_filename));
+	ret = rte_hash_lookup_data(hash, key, (void **)&socket);
+	if (ret < 0) {
+		socket = memif_socket_create(pmd, key,
+					     (pmd->role ==
+					      MEMIF_ROLE_SLAVE) ? 0 : 1);
+		if (socket == NULL)
+			return -1;
+		ret = rte_hash_add_key_data(hash, key, socket);
+		if (ret < 0) {
+			MIF_LOG(ERR, "Failed to add socket to socket hash.");
+			return ret;
+		}
+	}
+	pmd->socket_filename = socket->filename;
+
+	if (socket->listener != 0 && pmd->role == MEMIF_ROLE_SLAVE) {
+		MIF_LOG(ERR, "Socket is a listener.");
+		return -1;
+	} else if ((socket->listener == 0) && (pmd->role == MEMIF_ROLE_MASTER)) {
+		MIF_LOG(ERR, "Socket is not a listener.");
+		return -1;
+	}
+
+	TAILQ_FOREACH(elt, &socket->dev_queue, next) {
+		tmp_pmd = elt->dev->data->dev_private;
+		if (tmp_pmd->id == pmd->id) {
+			MIF_LOG(ERR, "Memif device with id %d already "
+				"exists on socket %s",
+				pmd->id, socket->filename);
+			return -1;
+		}
+	}
+
+	elt = rte_malloc("pmd-queue", sizeof(struct memif_socket_dev_list_elt), 0);
+	if (elt == NULL) {
+		MIF_LOG(ERR, "%s: Failed to add device to socket device list.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+	elt->dev = dev;
+	TAILQ_INSERT_TAIL(&socket->dev_queue, elt, next);
+
+	return 0;
+}
+
+void
+memif_socket_remove_device(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt, *next;
+	struct rte_hash *hash;
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL)
+		return;
+
+	if (rte_hash_lookup_data(hash, pmd->socket_filename, (void **)&socket) <
+	    0)
+		return;
+
+	for (elt = TAILQ_FIRST(&socket->dev_queue); elt != NULL; elt = next) {
+		next = TAILQ_NEXT(elt, next);
+		if (elt->dev == dev) {
+			TAILQ_REMOVE(&socket->dev_queue, elt, next);
+			rte_free(elt);
+			pmd->socket_filename = NULL;
+		}
+	}
+
+	/* remove socket, if this was the last device using it */
+	if (TAILQ_EMPTY(&socket->dev_queue)) {
+		rte_hash_del_key(hash, socket->filename);
+		if (socket->listener) {
+			/* remove listener socket file,
+			 * so we can create new one later.
+			 */
+			remove(socket->filename);
+		}
+		rte_free(socket);
+	}
+}
+
+int
+memif_connect_master(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+	return 0;
+}
+
+int
+memif_connect_slave(struct rte_eth_dev *dev)
+{
+	int sockfd;
+	int ret;
+	struct sockaddr_un sun;
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+
+	sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+	if (sockfd < 0) {
+		MIF_LOG(ERR, "%s: Failed to open socket.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	sun.sun_family = AF_UNIX;
+
+	memcpy(sun.sun_path, pmd->socket_filename, sizeof(sun.sun_path) - 1);
+
+	ret = connect(sockfd, (struct sockaddr *)&sun,
+		      sizeof(struct sockaddr_un));
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to connect socket: %s.",
+			rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+		goto error;
+	}
+
+	MIF_LOG(DEBUG, "%s: Memif socket: %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+
+	pmd->cc = rte_zmalloc("memif-cc",
+			      sizeof(struct memif_control_channel), 0);
+	if (pmd->cc == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate control channel.",
+			rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	pmd->cc->intr_handle.fd = sockfd;
+	pmd->cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	pmd->cc->socket = NULL;
+	pmd->cc->dev = dev;
+	TAILQ_INIT(&pmd->cc->msg_queue);
+
+	ret = rte_intr_callback_register(&pmd->cc->intr_handle,
+					 memif_intr_handler, pmd->cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to register interrupt callback "
+			"for control fd", rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	return 0;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (pmd->cc != NULL) {
+		rte_free(pmd->cc);
+		pmd->cc = NULL;
+	}
+	return -1;
+}
diff --git a/drivers/net/memif/memif_socket.h b/drivers/net/memif/memif_socket.h
new file mode 100644
index 000000000..2e8970712
--- /dev/null
+++ b/drivers/net/memif/memif_socket.h
@@ -0,0 +1,105 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_SOCKET_H_
+#define _MEMIF_SOCKET_H_
+
+#include <sys/queue.h>
+
+/**
+ * Remove device from socket device list. If no device is left on the socket,
+ * remove the socket as well.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_socket_remove_device(struct rte_eth_dev *dev);
+
+/**
+ * Enqueue disconnect message to control channel message queue.
+ *
+ * @param cc
+ *   control channel
+ * @param reason
+ *   const string stating disconnect reason (96 characters)
+ * @param err_code
+ *   error code
+ */
+void memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			      int err_code);
+
+/**
+ * Initialize memif socket for specified device. If socket doesn't exist, create socket.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @param socket_filename
+ *   socket filename
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename);
+
+/**
+ * Disconnect memif device. Close control channel and shared memory.
+ *
+ * @param dev
+ *   ethernet device
+ */
+void memif_disconnect(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, enable connection establishment.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_master(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, send connection request.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_slave(struct rte_eth_dev *dev);
+
+struct memif_socket_dev_list_elt {
+	TAILQ_ENTRY(memif_socket_dev_list_elt) next;
+	struct rte_eth_dev *dev;		/**< pointer to device internals */
+	char dev_name[RTE_ETH_NAME_MAX_LEN];
+};
+
+#define MEMIF_SOCKET_HASH_NAME			"memif-sh"
+struct memif_socket {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	uint8_t listener;			/**< if not zero socket is listener */
+	char filename[256];			/**< socket filename */
+
+	TAILQ_HEAD(, memif_socket_dev_list_elt) dev_queue;
+	/**< Queue of devices using this socket */
+};
+
+/* Control message queue. */
+struct memif_msg_queue_elt {
+	TAILQ_ENTRY(memif_msg_queue_elt) next;
+	memif_msg_t msg;			/**< control message */
+	int fd;					/**< fd to be sent to peer */
+};
+
+struct memif_control_channel {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	TAILQ_HEAD(, memif_msg_queue_elt) msg_queue; /**< control message queue */
+	struct memif_socket *socket;		/**< pointer to socket */
+	struct rte_eth_dev *dev;		/**< pointer to device */
+};
+
+#endif				/* MEMIF_SOCKET_H */
diff --git a/drivers/net/memif/meson.build b/drivers/net/memif/meson.build
new file mode 100644
index 000000000..4dfe37d3a
--- /dev/null
+++ b/drivers/net/memif/meson.build
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+if host_machine.system() != 'linux'
+        build = false
+endif
+
+sources = files('rte_eth_memif.c',
+		'memif_socket.c')
+
+allow_experimental_apis = true
+
+deps += ['hash']
diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
new file mode 100644
index 000000000..f42c39635
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.c
@@ -0,0 +1,1202 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <linux/if_ether.h>
+#include <errno.h>
+#include <sys/eventfd.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+#define ETH_MEMIF_ID_ARG		"id"
+#define ETH_MEMIF_ROLE_ARG		"role"
+#define ETH_MEMIF_PKT_BUFFER_SIZE_ARG	"bsize"
+#define ETH_MEMIF_RING_SIZE_ARG		"rsize"
+#define ETH_MEMIF_SOCKET_ARG		"socket"
+#define ETH_MEMIF_MAC_ARG		"mac"
+#define ETH_MEMIF_ZC_ARG		"zero-copy"
+#define ETH_MEMIF_SECRET_ARG		"secret"
+
+static const char *valid_arguments[] = {
+	ETH_MEMIF_ID_ARG,
+	ETH_MEMIF_ROLE_ARG,
+	ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
+	ETH_MEMIF_RING_SIZE_ARG,
+	ETH_MEMIF_SOCKET_ARG,
+	ETH_MEMIF_MAC_ARG,
+	ETH_MEMIF_ZC_ARG,
+	ETH_MEMIF_SECRET_ARG,
+	NULL
+};
+
+const char *
+memif_version(void)
+{
+	return ("memif-" RTE_STR(MEMIF_VERSION_MAJOR) "." RTE_STR(MEMIF_VERSION_MINOR));
+}
+
+static void
+memif_dev_info(struct rte_eth_dev *dev __rte_unused, struct rte_eth_dev_info *dev_info)
+{
+	dev_info->max_mac_addrs = 1;
+	dev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;
+	dev_info->max_rx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->max_tx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->min_rx_bufsize = 0;
+}
+
+static memif_ring_t *
+memif_get_ring(struct pmd_internals *pmd, memif_ring_type_t type, uint16_t ring_num)
+{
+	/* rings only in region 0 */
+	void *p = pmd->regions[0]->addr;
+	int ring_size = sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size);
+
+	p = (uint8_t *)p + (ring_num + type * pmd->run.num_s2m_rings) * ring_size;
+
+	return (memif_ring_t *)p;
+}
+
+static void *
+memif_get_buffer(struct pmd_internals *pmd, memif_desc_t *d)
+{
+	return ((uint8_t *)pmd->regions[d->region]->addr +
+		pmd->regions[d->region]->pkt_buffer_offset + d->offset);
+}
+
+static int
+memif_pktmbuf_chain(struct rte_mbuf *head, struct rte_mbuf *cur_tail,
+		    struct rte_mbuf *tail)
+{
+	/* Check for number-of-segments-overflow */
+	if (unlikely(head->nb_segs + tail->nb_segs > RTE_MBUF_MAX_NB_SEGS))
+		return -EOVERFLOW;
+
+	/* Chain 'tail' onto the old tail */
+	cur_tail->next = tail;
+
+	/* accumulate number of segments and total length. */
+	head->nb_segs = (uint16_t)(head->nb_segs + tail->nb_segs);
+
+	tail->pkt_len = tail->data_len;
+	head->pkt_len += tail->pkt_len;
+
+	return 0;
+}
+
+static uint16_t
+eth_memif_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t cur_slot, last_slot, n_slots, ring_size, mask, s0;
+	uint16_t n_rx_pkts = 0;
+	uint16_t mbuf_size = rte_pktmbuf_data_room_size(mq->mempool) -
+		RTE_PKTMBUF_HEADROOM;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf, *mbuf_head, *mbuf_tail;
+	uint64_t b;
+	ssize_t size __rte_unused;
+	uint16_t head;
+	int ret;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	/* consume interrupt */
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0)
+		size = read(mq->intr_handle.fd, &b, sizeof(b));
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	cur_slot = (type == MEMIF_RING_S2M) ? mq->last_head : mq->last_tail;
+	last_slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+	if (cur_slot == last_slot)
+		goto refill;
+	n_slots = last_slot - cur_slot;
+
+	while (n_slots && n_rx_pkts < nb_pkts) {
+		mbuf_head = rte_pktmbuf_alloc(mq->mempool);
+		if (unlikely(mbuf_head == NULL))
+			goto no_free_bufs;
+		mbuf = mbuf_head;
+		mbuf->port = mq->in_port;
+
+next_slot:
+		s0 = cur_slot & mask;
+		d0 = &ring->desc[s0];
+
+		src_len = d0->length;
+		dst_off = 0;
+		src_off = 0;
+
+		do {
+			dst_len = mbuf_size - dst_off;
+			if (dst_len == 0) {
+				dst_off = 0;
+				dst_len = mbuf_size;
+
+				/* store pointer to tail */
+				mbuf_tail = mbuf;
+				mbuf = rte_pktmbuf_alloc(mq->mempool);
+				if (unlikely(mbuf == NULL))
+					goto no_free_bufs;
+				mbuf->port = mq->in_port;
+				ret = memif_pktmbuf_chain(mbuf_head, mbuf_tail, mbuf);
+				if (unlikely(ret < 0)) {
+					MIF_LOG(ERR, "%s: number-of-segments-overflow",
+						rte_vdev_device_name(pmd->vdev));
+					rte_pktmbuf_free(mbuf);
+					goto no_free_bufs;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			rte_pktmbuf_data_len(mbuf) += cp_len;
+			rte_pktmbuf_pkt_len(mbuf) = rte_pktmbuf_data_len(mbuf);
+			if (mbuf != mbuf_head)
+				rte_pktmbuf_pkt_len(mbuf_head) += cp_len;
+
+			memcpy(rte_pktmbuf_mtod_offset(mbuf, void *, dst_off),
+			       (uint8_t *)memif_get_buffer(pmd, d0) + src_off, cp_len);
+
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+		} while (src_len);
+
+		cur_slot++;
+		n_slots--;
+
+		if (d0->flags & MEMIF_DESC_FLAG_NEXT)
+			goto next_slot;
+
+		mq->n_bytes += rte_pktmbuf_pkt_len(mbuf_head);
+		*bufs++ = mbuf_head;
+		n_rx_pkts++;
+	}
+
+no_free_bufs:
+	if (type == MEMIF_RING_S2M) {
+		rte_mb();
+		ring->tail = cur_slot;
+		mq->last_head = cur_slot;
+	} else {
+		mq->last_tail = cur_slot;
+	}
+
+refill:
+	if (type == MEMIF_RING_M2S) {
+		head = ring->head;
+		n_slots = ring_size - head + mq->last_tail;
+
+		while (n_slots--) {
+			s0 = head++ & mask;
+			d0 = &ring->desc[s0];
+			d0->length = pmd->run.pkt_buffer_size;
+		}
+		rte_mb();
+		ring->head = head;
+	}
+
+	mq->n_pkts += n_rx_pkts;
+	return n_rx_pkts;
+}
+
+static uint16_t
+eth_memif_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t slot, saved_slot, n_free, ring_size, mask, n_tx_pkts = 0;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head;
+	uint64_t a;
+	ssize_t size;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	n_free = ring->tail - mq->last_tail;
+	mq->last_tail += n_free;
+	slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+
+	if (type == MEMIF_RING_S2M)
+		n_free = ring_size - ring->head + mq->last_tail;
+	else
+		n_free = ring->head - ring->tail;
+
+	while (n_tx_pkts < nb_pkts && n_free) {
+		mbuf_head = *bufs++;
+		mbuf = mbuf_head;
+
+		saved_slot = slot;
+		d0 = &ring->desc[slot & mask];
+		dst_off = 0;
+		dst_len = (type == MEMIF_RING_S2M) ?
+			pmd->run.pkt_buffer_size : d0->length;
+
+next_in_chain:
+		src_off = 0;
+		src_len = rte_pktmbuf_data_len(mbuf);
+
+		while (src_len) {
+			if (dst_len == 0) {
+				if (n_free) {
+					slot++;
+					n_free--;
+					d0->flags |= MEMIF_DESC_FLAG_NEXT;
+					d0 = &ring->desc[slot & mask];
+					dst_off = 0;
+					dst_len = (type == MEMIF_RING_S2M) ?
+					    pmd->run.pkt_buffer_size : d0->length;
+					d0->flags = 0;
+				} else {
+					slot = saved_slot;
+					goto no_free_slots;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			memcpy((uint8_t *)memif_get_buffer(pmd, d0) + dst_off,
+			       rte_pktmbuf_mtod_offset(mbuf, void *, src_off),
+			       cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+			dst_len -= cp_len;
+
+			d0->length = dst_off;
+		}
+
+		if (rte_pktmbuf_is_contiguous(mbuf) == 0) {
+			mbuf = mbuf->next;
+			goto next_in_chain;
+		}
+
+		n_tx_pkts++;
+		slot++;
+		n_free--;
+		rte_pktmbuf_free(mbuf_head);
+	}
+
+no_free_slots:
+	rte_mb();
+	if (type == MEMIF_RING_S2M)
+		ring->head = slot;
+	else
+		ring->tail = slot;
+
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
+		a = 1;
+		size = write(mq->intr_handle.fd, &a, sizeof(a));
+		if (unlikely(size < 0)) {
+			MIF_LOG(WARNING,
+				"%s: Failed to send interrupt. %s",
+				rte_vdev_device_name(pmd->vdev), strerror(errno));
+		}
+	}
+
+	mq->n_err += nb_pkts - n_tx_pkts;
+	mq->n_pkts += n_tx_pkts;
+	return n_tx_pkts;
+}
+
+void
+memif_free_regions(struct pmd_internals *pmd)
+{
+	int i;
+	struct memif_region *r;
+
+	/* regions are allocated contiguously, so it's
+	 * enough to loop until 'pmd->regions_num'
+	 */
+	for (i = 0; i < pmd->regions_num; i++) {
+		r = pmd->regions[i];
+		if (r != NULL) {
+			if (r->addr != NULL) {
+				munmap(r->addr, r->region_size);
+				if (r->fd > 0) {
+					close(r->fd);
+					r->fd = -1;
+				}
+			}
+			rte_free(r);
+			pmd->regions[i] = NULL;
+		}
+	}
+	pmd->regions_num = 0;
+}
+
+static int
+memif_region_init_shm(struct pmd_internals *pmd, uint8_t has_buffers)
+{
+	char shm_name[ETH_MEMIF_SHM_NAME_SIZE];
+	int ret = 0;
+	struct memif_region *r;
+
+	if (pmd->regions_num >= ETH_MEMIF_MAX_REGION_NUM) {
+		MIF_LOG(ERR, "%s: Too many regions.", rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	/* calculate buffer offset */
+	r->pkt_buffer_offset = (pmd->run.num_s2m_rings + pmd->run.num_m2s_rings) *
+	    (sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size));
+
+	r->region_size = r->pkt_buffer_offset;
+	/* if region has buffers, add buffers size to region_size */
+	if (has_buffers == 1)
+		r->region_size += (uint32_t)(pmd->run.pkt_buffer_size *
+			(1 << pmd->run.log2_ring_size) *
+			(pmd->run.num_s2m_rings +
+			 pmd->run.num_m2s_rings));
+
+	memset(shm_name, 0, sizeof(char) * ETH_MEMIF_SHM_NAME_SIZE);
+	snprintf(shm_name, ETH_MEMIF_SHM_NAME_SIZE, "memif_region_%d",
+		 pmd->regions_num);
+
+	r->fd = memfd_create(shm_name, MFD_ALLOW_SEALING);
+	if (r->fd < 0) {
+		MIF_LOG(ERR, "%s: Failed to create shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		ret = -1;
+		goto error;
+	}
+
+	ret = fcntl(r->fd, F_ADD_SEALS, F_SEAL_SHRINK);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to add seals to shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		goto error;
+	}
+
+	ret = ftruncate(r->fd, r->region_size);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to truncate shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		goto error;
+	}
+
+	r->addr = mmap(NULL, r->region_size, PROT_READ |
+		       PROT_WRITE, MAP_SHARED, r->fd, 0);
+	if (r->addr == MAP_FAILED) {
+		MIF_LOG(ERR, "%s: Failed to mmap shm region: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(ret));
+		ret = -1;
+		goto error;
+	}
+
+	pmd->regions[pmd->regions_num] = r;
+	pmd->regions_num++;
+
+	return ret;
+
+error:
+	if (r->fd > 0)
+		close(r->fd);
+	r->fd = -1;
+
+	return ret;
+}
+
+static int
+memif_regions_init(struct pmd_internals *pmd)
+{
+	int ret;
+
+	/* create one buffer region */
+	ret = memif_region_init_shm(pmd, /* has buffer */ 1);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+static void
+memif_init_rings(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_ring_t *ring;
+	int i, j;
+	uint16_t slot;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = i * (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 0;
+			ring->desc[j].offset = (uint32_t)(slot *
+				pmd->run.pkt_buffer_size);
+			ring->desc[j].length = pmd->run.pkt_buffer_size;
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = (i + pmd->run.num_s2m_rings) *
+			    (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 0;
+			ring->desc[j].offset = (uint32_t)(slot *
+				pmd->run.pkt_buffer_size);
+			ring->desc[j].length = pmd->run.pkt_buffer_size;
+		}
+	}
+}
+
+/* called only by slave */
+static void
+memif_init_queues(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = dev->data->tx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for tx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = dev->data->rx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for rx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+}
+
+int
+memif_init_regions_and_queues(struct rte_eth_dev *dev)
+{
+	int ret;
+
+	ret = memif_regions_init(dev->data->dev_private);
+	if (ret < 0)
+		return ret;
+
+	memif_init_rings(dev);
+
+	memif_init_queues(dev);
+
+	return 0;
+}
+
+int
+memif_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_region *mr;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		mr = pmd->regions[i];
+		if (mr != NULL) {
+			if (mr->addr == NULL) {
+				if (mr->fd < 0)
+					return -1;
+				mr->addr = mmap(NULL, mr->region_size,
+						PROT_READ | PROT_WRITE,
+						MAP_SHARED, mr->fd, 0);
+				if (mr->addr == NULL)
+					return -1;
+				/* calculate packet buffer offset */
+				mr->pkt_buffer_offset = (pmd->run.num_s2m_rings +
+					pmd->run.num_m2s_rings) *
+					(sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+					(1 << pmd->run.log2_ring_size));
+			}
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
+			    mq->offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_MASTER)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
+			    mq->offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_SLAVE)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags |= ETH_MEMIF_FLAG_CONNECTED;
+	dev->data->dev_link.link_status = ETH_LINK_UP;
+	MIF_LOG(INFO, "%s: Connected.", rte_vdev_device_name(pmd->vdev));
+	return 0;
+}
+
+static int
+memif_dev_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int ret = 0;
+
+	switch (pmd->role) {
+	case MEMIF_ROLE_SLAVE:
+		ret = memif_connect_slave(dev);
+		break;
+	case MEMIF_ROLE_MASTER:
+		ret = memif_connect_master(dev);
+		break;
+	default:
+		MIF_LOG(ERR, "%s: Unknown role: %d.",
+			rte_vdev_device_name(pmd->vdev), pmd->role);
+		ret = -1;
+		break;
+	}
+
+	return ret;
+}
+
+static int
+memif_dev_configure(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	/*
+	 * SLAVE - TXQ
+	 * MASTER - RXQ
+	 */
+	pmd->cfg.num_s2m_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_tx_queues : dev->data->nb_rx_queues;
+
+	/*
+	 * SLAVE - RXQ
+	 * MASTER - TXQ
+	 */
+	pmd->cfg.num_m2s_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_rx_queues : dev->data->nb_tx_queues;
+
+	return 0;
+}
+
+static int
+memif_tx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_tx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("tx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate tx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type =
+	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_S2M : MEMIF_RING_M2S;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->pmd = pmd;
+	dev->data->tx_queues[qid] = mq;
+
+	return 0;
+}
+
+static int
+memif_rx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_rx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_rxconf *rx_conf __rte_unused,
+		     struct rte_mempool *mb_pool)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	printf("Setting up rx queue...\n");
+
+	mq = rte_zmalloc("rx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate rx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type = (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_M2S : MEMIF_RING_S2M;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->mempool = mb_pool;
+	mq->in_port = dev->data->port_id;
+	mq->pmd = pmd;
+	dev->data->rx_queues[qid] = mq;
+
+	return 0;
+}
+
+static void
+memif_queue_release(void *queue)
+{
+	struct memif_queue *q = (struct memif_queue *)queue;
+
+	if (!q)
+		return;
+
+	rte_free(q);
+}
+
+static int
+memif_link_update(struct rte_eth_dev *dev __rte_unused,
+		  int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
+static int
+memif_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+	uint8_t tmp, nq;
+
+	stats->ipackets = 0;
+	stats->ibytes = 0;
+	stats->opackets = 0;
+	stats->obytes = 0;
+	stats->oerrors = 0;
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_s2m_rings :
+	    pmd->run.num_m2s_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* RX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->rx_queues[i];
+		stats->q_ipackets[i] = mq->n_pkts;
+		stats->q_ibytes[i] = mq->n_bytes;
+		stats->ipackets += mq->n_pkts;
+		stats->ibytes += mq->n_bytes;
+	}
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_m2s_rings :
+	    pmd->run.num_s2m_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* TX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->tx_queues[i];
+		stats->q_opackets[i] = mq->n_pkts;
+		stats->q_obytes[i] = mq->n_bytes;
+		stats->q_errors[i] = mq->n_err;
+		stats->opackets += mq->n_pkts;
+		stats->obytes += mq->n_bytes;
+		stats->oerrors += mq->n_err;
+	}
+	return 0;
+}
+
+static void
+memif_stats_reset(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int i;
+	struct memif_queue *mq;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->tx_queues[i] :
+		    dev->data->rx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->rx_queues[i] :
+		    dev->data->tx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+}
+
+static int
+memif_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	MIF_LOG(WARNING, "%s: Interrupt mode not supported.",
+		rte_vdev_device_name(pmd->vdev));
+
+	return -1;
+}
+
+static int
+memif_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd __rte_unused = dev->data->dev_private;
+
+	return 0;
+}
+
+static const struct eth_dev_ops ops = {
+	.dev_start = memif_dev_start,
+	.dev_infos_get = memif_dev_info,
+	.dev_configure = memif_dev_configure,
+	.tx_queue_setup = memif_tx_queue_setup,
+	.rx_queue_setup = memif_rx_queue_setup,
+	.rx_queue_release = memif_queue_release,
+	.tx_queue_release = memif_queue_release,
+	.rx_queue_intr_enable = memif_rx_queue_intr_enable,
+	.rx_queue_intr_disable = memif_rx_queue_intr_disable,
+	.link_update = memif_link_update,
+	.stats_get = memif_stats_get,
+	.stats_reset = memif_stats_reset,
+};
+
+static int
+memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
+	     memif_interface_id_t id, uint32_t flags,
+	     const char *socket_filename,
+	     memif_log2_ring_size_t log2_ring_size,
+	     uint16_t pkt_buffer_size, const char *secret,
+	     struct ether_addr *ether_addr)
+{
+	int ret = 0;
+	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev_data *data;
+	struct pmd_internals *pmd;
+	const unsigned int numa_node = vdev->device.numa_node;
+	const char *name = rte_vdev_device_name(vdev);
+
+	if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
+		MIF_LOG(ERR, "Zero-copy slave not supported.");
+		return -1;
+	}
+
+	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
+	if (eth_dev == NULL) {
+		MIF_LOG(ERR, "%s: Unable to allocate device struct.", name);
+		return -1;
+	}
+
+	pmd = eth_dev->data->dev_private;
+	memset(pmd, 0, sizeof(*pmd));
+
+	pmd->vdev = vdev;
+	pmd->id = id;
+	pmd->flags = flags;
+	pmd->flags |= ETH_MEMIF_FLAG_DISABLED;
+	pmd->role = role;
+	printf("%s\n", role == MEMIF_ROLE_SLAVE ? "SLAVE" : "MASTER");
+	ret = memif_socket_init(eth_dev, socket_filename);
+	if (ret < 0)
+		return ret;
+
+	memset(pmd->secret, 0, sizeof(char) * 24);
+	if (secret != NULL)
+		strlcpy(pmd->secret, secret, sizeof(pmd->secret));
+
+	pmd->cfg.log2_ring_size = log2_ring_size;
+	/* set in .dev_configure() */
+	pmd->cfg.num_s2m_rings = 0;
+	pmd->cfg.num_m2s_rings = 0;
+
+	pmd->cfg.pkt_buffer_size = pkt_buffer_size;
+
+	data = eth_dev->data;
+	data->dev_private = pmd;
+	data->numa_node = numa_node;
+	data->mac_addrs = ether_addr;
+
+	eth_dev->dev_ops = &ops;
+	eth_dev->device = &vdev->device;
+	eth_dev->rx_pkt_burst = eth_memif_rx;
+	eth_dev->tx_pkt_burst = eth_memif_tx;
+
+	rte_eth_dev_probing_finish(eth_dev);
+
+	return 0;
+}
+
+static int
+memif_set_role(const char *key __rte_unused, const char *value,
+	       void *extra_args)
+{
+	enum memif_role_t *role = (enum memif_role_t *)extra_args;
+
+	if (strstr(value, "master") != NULL) {
+		*role = MEMIF_ROLE_MASTER;
+	} else if (strstr(value, "slave") != NULL) {
+		*role = MEMIF_ROLE_SLAVE;
+	} else {
+		MIF_LOG(ERR, "Unknown role: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_zc(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	uint32_t *flags = (uint32_t *)extra_args;
+
+	if (strstr(value, "yes") != NULL) {
+		*flags |= ETH_MEMIF_FLAG_ZERO_COPY;
+	} else if (strstr(value, "no") != NULL) {
+		*flags &= ~ETH_MEMIF_FLAG_ZERO_COPY;
+	} else {
+		MIF_LOG(ERR, "Failed to parse zero-copy param: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_id(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	memif_interface_id_t *id = (memif_interface_id_t *)extra_args;
+
+	/* even if parsing fails, 0 is a valid id */
+	*id = strtoul(value, NULL, 10);
+	return 0;
+}
+
+static int
+memif_set_bs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	uint16_t *pkt_buffer_size = (uint16_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > 0xFFFF) {
+		MIF_LOG(ERR, "Invalid buffer size: %s.", value);
+		return -EINVAL;
+	}
+	*pkt_buffer_size = tmp;
+	return 0;
+}
+
+static int
+memif_set_rs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	memif_log2_ring_size_t *log2_ring_size =
+	    (memif_log2_ring_size_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > ETH_MEMIF_MAX_LOG2_RING_SIZE) {
+		MIF_LOG(ERR, "Invalid ring size: %s (max %u).",
+			value, ETH_MEMIF_MAX_LOG2_RING_SIZE);
+		return -EINVAL;
+	}
+	*log2_ring_size = tmp;
+	return 0;
+}
+
+/* check if directory exists and if we have permission to read/write */
+static int
+memif_check_socket_filename(const char *filename)
+{
+	char *dir = NULL, *tmp;
+	uint32_t idx;
+	int ret = 0;
+
+	tmp = strrchr(filename, '/');
+	if (tmp != NULL) {
+		idx = tmp - filename;
+		dir = rte_zmalloc("memif_tmp", sizeof(char) * (idx + 1), 0);
+		if (dir == NULL) {
+			MIF_LOG(ERR, "Failed to allocate memory.");
+			return -1;
+		}
+		strlcpy(dir, filename, sizeof(char) * (idx + 1));
+	}
+
+	if (dir == NULL || (faccessat(-1, dir, F_OK | R_OK |
+					W_OK, AT_EACCESS) < 0)) {
+		MIF_LOG(ERR, "Invalid directory: '%s'.", dir);
+		ret = -EINVAL;
+	}
+
+	if (dir != NULL)
+		rte_free(dir);
+
+	return ret;
+}
+
+static int
+memif_set_socket_filename(const char *key __rte_unused, const char *value,
+			  void *extra_args)
+{
+	const char **socket_filename = (const char **)extra_args;
+
+	*socket_filename = value;
+	return memif_check_socket_filename(*socket_filename);
+}
+
+static int
+memif_set_mac(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	struct ether_addr *ether_addr = (struct ether_addr *)extra_args;
+	int ret = 0;
+
+	ret = sscanf(value, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
+	       &ether_addr->addr_bytes[0], &ether_addr->addr_bytes[1],
+	       &ether_addr->addr_bytes[2], &ether_addr->addr_bytes[3],
+	       &ether_addr->addr_bytes[4], &ether_addr->addr_bytes[5]);
+	if (ret != 6)
+		MIF_LOG(WARNING, "Failed to parse mac '%s'.", value);
+	return 0;
+}
+
+static int
+memif_set_secret(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	const char **secret = (const char **)extra_args;
+
+	*secret = value;
+	return 0;
+}
+
+static int
+rte_pmd_memif_probe(struct rte_vdev_device *vdev)
+{
+	RTE_BUILD_BUG_ON(sizeof(memif_msg_t) != 128);
+	RTE_BUILD_BUG_ON(sizeof(memif_desc_t) != 16);
+	int ret = 0;
+	struct rte_kvargs *kvlist;
+	const char *name = rte_vdev_device_name(vdev);
+	enum memif_role_t role = MEMIF_ROLE_SLAVE;
+	memif_interface_id_t id = 0;
+	uint16_t pkt_buffer_size = ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE;
+	memif_log2_ring_size_t log2_ring_size = ETH_MEMIF_DEFAULT_RING_SIZE;
+	const char *socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME;
+	uint32_t flags = 0;
+	const char *secret = NULL;
+	struct ether_addr *ether_addr = rte_zmalloc("", sizeof(struct ether_addr), 0);
+
+	eth_random_addr(ether_addr->addr_bytes);
+
+	MIF_LOG(INFO, "Initialize MEMIF: %s.", name);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		MIF_LOG(ERR, "Multi-processing not supported for memif.");
+		/* TODO:
+		 * Request connection information.
+		 *
+		 * Once memif in the primary process is connected,
+		 * broadcast connection information.
+		 */
+		return -1;
+	}
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_arguments);
+
+	/* parse parameters */
+	if (kvlist != NULL) {
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ROLE_ARG,
+					 &memif_set_role, &role);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ID_ARG,
+					 &memif_set_id, &id);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
+					 &memif_set_bs, &pkt_buffer_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_RING_SIZE_ARG,
+					 &memif_set_rs, &log2_ring_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SOCKET_ARG,
+					 &memif_set_socket_filename,
+					 (void *)(&socket_filename));
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_MAC_ARG,
+					 &memif_set_mac, ether_addr);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ZC_ARG,
+					 &memif_set_zc, &flags);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SECRET_ARG,
+					 &memif_set_secret, (void *)(&secret));
+		if (ret < 0)
+			goto exit;
+	}
+
+	/* create interface */
+	ret = memif_create(vdev, role, id, flags, socket_filename,
+			   log2_ring_size, pkt_buffer_size, secret, ether_addr);
+
+exit:
+	if (kvlist != NULL)
+		rte_kvargs_free(kvlist);
+	return ret;
+}
+
+static int
+rte_pmd_memif_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_eth_dev *eth_dev;
+	struct pmd_internals *pmd;
+
+	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+	if (eth_dev == NULL)
+		return 0;
+
+	pmd = eth_dev->data->dev_private;
+
+	memif_msg_enq_disconnect(pmd->cc, "Device removed", 0);
+	memif_disconnect(eth_dev);
+
+	memif_socket_remove_device(eth_dev);
+
+	pmd->vdev = NULL;
+
+	rte_eth_dev_release_port(eth_dev);
+
+	return 0;
+}
+
+static struct rte_vdev_driver pmd_memif_drv = {
+	.probe = rte_pmd_memif_probe,
+	.remove = rte_pmd_memif_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_memif, pmd_memif_drv);
+
+RTE_PMD_REGISTER_PARAM_STRING(net_memif,
+			      ETH_MEMIF_ID_ARG "=<int>"
+			      ETH_MEMIF_ROLE_ARG "=master|slave"
+			      ETH_MEMIF_PKT_BUFFER_SIZE_ARG "=<int>"
+			      ETH_MEMIF_RING_SIZE_ARG "=<int>"
+			      ETH_MEMIF_SOCKET_ARG "=<string>"
+			      ETH_MEMIF_MAC_ARG "=xx:xx:xx:xx:xx:xx"
+			      ETH_MEMIF_ZC_ARG "=yes|no"
+			      ETH_MEMIF_SECRET_ARG "=<string>");
+
+int memif_logtype;
+
+RTE_INIT(memif_init_log)
+{
+	memif_logtype = rte_log_register("pmd.net.memif");
+	if (memif_logtype >= 0)
+		rte_log_set_level(memif_logtype, RTE_LOG_NOTICE);
+}
diff --git a/drivers/net/memif/rte_eth_memif.h b/drivers/net/memif/rte_eth_memif.h
new file mode 100644
index 000000000..e640c975f
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.h
@@ -0,0 +1,207 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _RTE_ETH_MEMIF_H_
+#define _RTE_ETH_MEMIF_H_
+
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE
+#endif				/* GNU_SOURCE */
+
+#include <sys/queue.h>
+
+#include <rte_ethdev_driver.h>
+#include <rte_ether.h>
+#include <rte_interrupts.h>
+
+#include "memif.h"
+
+#define ETH_MEMIF_DEFAULT_SOCKET_FILENAME	"/tmp/memif.sock"
+#define ETH_MEMIF_DEFAULT_RING_SIZE		10
+#define ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE	2048
+
+#define ETH_MEMIF_MAX_NUM_Q_PAIRS		255
+#define ETH_MEMIF_MAX_LOG2_RING_SIZE		14
+#define ETH_MEMIF_MAX_REGION_NUM		256
+
+#define ETH_MEMIF_SHM_NAME_SIZE			32
+
+extern int memif_logtype;
+
+#define MIF_LOG(level, fmt, args...) \
+	rte_log(RTE_LOG_ ## level, memif_logtype, \
+		"%s(): " fmt "\n", __func__, ##args)
+
+enum memif_role_t {
+	MEMIF_ROLE_MASTER,
+	MEMIF_ROLE_SLAVE,
+};
+
+struct memif_region {
+	void *addr;				/**< shared memory address */
+	memif_region_size_t region_size;	/**< shared memory size */
+	int fd;					/**< shared memory file descriptor */
+	uint32_t pkt_buffer_offset;
+	/**< offset from 'addr' to first packet buffer */
+};
+
+struct memif_queue {
+	struct rte_mempool *mempool;		/**< mempool for RX packets */
+	uint16_t in_port;			/**< port id */
+
+	struct pmd_internals *pmd;		/**< device internals */
+
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+
+	/* ring info */
+	memif_ring_type_t type;			/**< ring type */
+	memif_ring_t *ring;			/**< pointer to ring */
+	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
+
+	memif_region_index_t region;		/**< shared memory region index */
+	memif_region_offset_t offset;
+	/**< ring offset from start of shm region (ring - memif_region.addr) */
+
+	uint16_t last_head;			/**< last ring head */
+	uint16_t last_tail;			/**< last ring tail */
+
+	/* rx/tx info */
+	uint64_t n_pkts;			/**< number of rx/tx packets */
+	uint64_t n_bytes;			/**< number of rx/tx bytes */
+	uint64_t n_err;				/**< number of tx errors */
+};
+
+struct pmd_internals {
+	memif_interface_id_t id;		/**< unique id */
+	enum memif_role_t role;			/**< device role */
+	uint32_t flags;				/**< device status flags */
+#define ETH_MEMIF_FLAG_CONNECTING	(1 << 0)
+/**< device is connecting */
+#define ETH_MEMIF_FLAG_CONNECTED	(1 << 1)
+/**< device is connected */
+#define ETH_MEMIF_FLAG_ZERO_COPY	(1 << 2)
+/**< device is zero-copy enabled */
+#define ETH_MEMIF_FLAG_DISABLED		(1 << 3)
+/**< device has not been configured and can not accept connection requests */
+
+	char *socket_filename;			/**< pointer to socket filename */
+	char secret[24]; /**< secret (optional security parameter) */
+
+	struct memif_control_channel *cc;	/**< control channel */
+
+	struct memif_region *regions[ETH_MEMIF_MAX_REGION_NUM];
+	/**< shared memory regions */
+	memif_region_index_t regions_num;	/**< number of regions */
+
+	/* remote info */
+	char remote_name[64];			/**< remote app name */
+	char remote_if_name[64];		/**< remote peer name */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t pkt_buffer_size;	/**< buffer size */
+	} cfg;					/**< Configured parameters (max values) */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t pkt_buffer_size;	/**< buffer size */
+	} run;
+	/**< Parameters used in active connection */
+
+	char local_disc_string[96];		/**< local disconnect reason */
+	char remote_disc_string[96];		/**< remote disconnect reason */
+
+	struct rte_vdev_device *vdev;		/**< vdev handle */
+};
+
+/**
+ * Unmap shared memory and free regions from memory.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_free_regions(struct pmd_internals *pmd);
+
+/**
+ * Finalize connection establishment process. Map shared memory file
+ * (master role), initialize ring queue, set link status up.
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect(struct rte_eth_dev *dev);
+
+/**
+ * Create shared memory file and initialize ring queue.
+ * Only called by slave when establishing connection
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_init_regions_and_queues(struct rte_eth_dev *dev);
+
+/**
+ * Get memif version string.
+ *
+ * @return
+ *   - memif version string
+ */
+const char *memif_version(void);
+
+#ifndef MFD_HUGETLB
+#ifndef __NR_memfd_create
+
+#if defined __x86_64__
+#define __NR_memfd_create 319
+#elif defined __x86_32__
+#define __NR_memfd_create 1073742143
+#elif defined __arm__
+#define __NR_memfd_create 385
+#elif defined __aarch64__
+#define __NR_memfd_create 279
+#elif defined __powerpc__
+#define __NR_memfd_create 360
+#elif defined __i386__
+#define __NR_memfd_create 356
+#else
+#error "__NR_memfd_create unknown for this architecture"
+#endif
+
+#endif				/* __NR_memfd_create */
+
+static inline int memfd_create(const char *name, unsigned int flags)
+{
+	return syscall(__NR_memfd_create, name, flags);
+}
+#endif				/* MFD_HUGETLB */
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef MFD_ALLOW_SEALING
+#define MFD_ALLOW_SEALING       0x0002U
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL     0x0001	/* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002	/* prevent file from shrinking */
+#define F_SEAL_GROW     0x0004	/* prevent file from growing */
+#define F_SEAL_WRITE    0x0008	/* prevent writes */
+#endif
+
+#endif				/* RTE_ETH_MEMIF_H */
diff --git a/drivers/net/memif/rte_pmd_memif_version.map b/drivers/net/memif/rte_pmd_memif_version.map
new file mode 100644
index 000000000..4b2e62193
--- /dev/null
+++ b/drivers/net/memif/rte_pmd_memif_version.map
@@ -0,0 +1,4 @@
+DPDK_19.05 {
+
+        local: *;
+};
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index ed99896c3..b57073483 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -24,6 +24,7 @@ drivers = ['af_packet',
 	'ixgbe',
 	'kni',
 	'liquidio',
+	'memif',
 	'mlx4',
 	'mlx5',
 	'mvneta',
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 7c9b4b538..326b2a30b 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -174,6 +174,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KNI)        += -lrte_pmd_kni
 endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD)        += -lrte_pmd_lio
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF)      += -lrte_pmd_memif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -lrte_pmd_mlx4
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += -lrte_pmd_mlx5 -lmnl
 ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
--
2.17.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [dpdk-dev] [RFC v7] /net: memory interface (memif)
  2019-05-13 10:45       ` [dpdk-dev] [RFC v7] " Jakub Grajciar
@ 2019-05-13 10:45         ` Jakub Grajciar
  2019-05-16 11:46         ` [dpdk-dev] [RFC v8] " Jakub Grajciar
  1 sibling, 0 replies; 58+ messages in thread
From: Jakub Grajciar @ 2019-05-13 10:45 UTC (permalink / raw)
  To: dev; +Cc: Jakub Grajciar

Memory interface (memif), provides high performance
packet transfer over shared memory.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
---
 MAINTAINERS                                 |    6 +
 app/test-pmd/testpmd.c                      |    2 +
 config/common_base                          |    5 +
 config/common_linux                         |    1 +
 doc/guides/nics/features/memif.ini          |   14 +
 doc/guides/nics/index.rst                   |    1 +
 doc/guides/nics/memif.rst                   |  234 ++++
 doc/guides/rel_notes/release_19_05.rst      |    4 +
 drivers/net/Makefile                        |    1 +
 drivers/net/memif/Makefile                  |   30 +
 drivers/net/memif/memif.h                   |  179 +++
 drivers/net/memif/memif_socket.c            | 1103 +++++++++++++++++
 drivers/net/memif/memif_socket.h            |  105 ++
 drivers/net/memif/meson.build               |   13 +
 drivers/net/memif/rte_eth_memif.c           | 1202 +++++++++++++++++++
 drivers/net/memif/rte_eth_memif.h           |  207 ++++
 drivers/net/memif/rte_pmd_memif_version.map |    4 +
 drivers/net/meson.build                     |    1 +
 mk/rte.app.mk                               |    1 +
 19 files changed, 3113 insertions(+)
 create mode 100644 doc/guides/nics/features/memif.ini
 create mode 100644 doc/guides/nics/memif.rst
 create mode 100644 drivers/net/memif/Makefile
 create mode 100644 drivers/net/memif/memif.h
 create mode 100644 drivers/net/memif/memif_socket.c
 create mode 100644 drivers/net/memif/memif_socket.h
 create mode 100644 drivers/net/memif/meson.build
 create mode 100644 drivers/net/memif/rte_eth_memif.c
 create mode 100644 drivers/net/memif/rte_eth_memif.h
 create mode 100644 drivers/net/memif/rte_pmd_memif_version.map

requires patch: http://patchwork.dpdk.org/patch/51511/
Memif uses a unix domain socket as control channel (cc).
rte_interrupts.h is used to handle interrupts for this cc.
This patch is required, because the message received can
be a reason for disconnecting.

v3:
- coding style fixes
- documentation
- doxygen comments
- use strlcpy() instead of strncpy()
- use RTE_BUILD_BUG_ON instead of _Static_assert()
- fix meson build deps

v4:
- coding style fixes
- doc update (desc format, messaging, features, ...)
- pointer arithmetic fix
- __rte_packed and __rte_aligned instead of __attribute__()
- fixed multi-queue
- support additional archs (memfd_create syscall)

v5:
- coding style fixes
- add release notes
- remove unused dependencies
- doc update

v6:
- remove socket on termination of testpmd application
- disallow memif probing in secondary process (return error), multi-process support will be implemented in separate patch
- fix chained buffer packet length
- use single region for rings and buffers (non-zero-copy)
- doc update

v7:
- coding style fixes

diff --git a/MAINTAINERS b/MAINTAINERS
index 15d0829c5..3698c146b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -841,6 +841,12 @@ F: drivers/net/softnic/
 F: doc/guides/nics/features/softnic.ini
 F: doc/guides/nics/softnic.rst

+Memif PMD
+M: Jakub Grajciar <jgrajcia@cisco.com>
+F: drivers/net/memif/
+F: doc/guides/nics/memif.rst
+F: doc/guides/nics/features/memif.ini
+

 Crypto Drivers
 --------------
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 6fbfd29dd..9eec4c19d 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2493,6 +2493,8 @@ pmd_test_exit(void)
 			device = rte_eth_devices[pt_id].device;
 			if (device && !strcmp(device->driver->name, "net_virtio_user"))
 				detach_port_device(pt_id);
+			else if (device && !strcmp(device->driver->name, "net_memif"))
+				detach_port_device(pt_id);
 		}
 	}

diff --git a/config/common_base b/config/common_base
index 6b96e0e80..dca119a65 100644
--- a/config/common_base
+++ b/config/common_base
@@ -444,6 +444,11 @@ CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 #
 CONFIG_RTE_LIBRTE_PMD_AF_XDP=n

+#
+# Compile Memory Interface PMD driver (Linux only)
+#
+CONFIG_RTE_LIBRTE_PMD_MEMIF=n
+
 #
 # Compile link bonding PMD library
 #
diff --git a/config/common_linux b/config/common_linux
index 75334273d..87514fe4f 100644
--- a/config/common_linux
+++ b/config/common_linux
@@ -19,6 +19,7 @@ CONFIG_RTE_LIBRTE_VHOST_POSTCOPY=n
 CONFIG_RTE_LIBRTE_PMD_VHOST=y
 CONFIG_RTE_LIBRTE_IFC_PMD=y
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
+CONFIG_RTE_LIBRTE_PMD_MEMIF=y
 CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
 CONFIG_RTE_LIBRTE_PMD_TAP=y
 CONFIG_RTE_LIBRTE_AVP_PMD=y
diff --git a/doc/guides/nics/features/memif.ini b/doc/guides/nics/features/memif.ini
new file mode 100644
index 000000000..807d9ecdc
--- /dev/null
+++ b/doc/guides/nics/features/memif.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'memif' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Link status          = Y
+Basic stats          = Y
+Jumbo frame          = Y
+ARMv8                = Y
+Power8               = Y
+x86-32               = Y
+x86-64               = Y
+Usage doc            = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 2221c35f2..691e7209f 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -36,6 +36,7 @@ Network Interface Controller Drivers
     intel_vf
     kni
     liquidio
+    memif
     mlx4
     mlx5
     mvneta
diff --git a/doc/guides/nics/memif.rst b/doc/guides/nics/memif.rst
new file mode 100644
index 000000000..0d0312442
--- /dev/null
+++ b/doc/guides/nics/memif.rst
@@ -0,0 +1,234 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2018-2019 Cisco Systems, Inc.
+
+======================
+Memif Poll Mode Driver
+======================
+
+Shared memory packet interface (memif) PMD allows for DPDK and any other client
+using memif (DPDK, VPP, libmemif) to communicate using shared memory. Memif is
+Linux only.
+
+The created device transmits packets in a raw format. It can be used with
+Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
+supported in DPDK memif implementation.
+
+Memif works in two roles: master and slave. Slave connects to master over an
+existing socket. It is also a producer of shared memory file and initializes
+the shared memory. Each interface can be connected to one peer interface
+at same time. The peer interface is identified by id parameter. Master
+creates the socket and listens for any slave connection requests. The socket
+may already exist on the system. Be sure to remove any such sockets, if you
+are creating a master interface, or you will see an "Address already in use"
+error. Function ``rte_pmd_memif_remove()``, which removes memif interface,
+will also remove a listener socket, if it is not being used by any other
+interface.
+
+The method to enable one or more interfaces is to use the
+``--vdev=net_memif0`` option on the DPDK application command line. Each
+``--vdev=net_memif1`` option given will create an interface named net_memif0,
+net_memif1, and so on. Memif uses unix domain socket to transmit control
+messages. Each memif has a unique id per socket. This id is used to identify
+peer interface. If you are connecting multiple
+interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``,
+etc. Note that if you assign a socket to a master interface it becomes a
+listener socket. Listener socket can not be used by a slave interface on same
+client.
+
+.. csv-table:: **Memif configuration options**
+   :header: "Option", "Description", "Default", "Valid value"
+
+   "id=0", "Used to identify peer interface", "0", "uint32_t"
+   "role=master", "Set memif role", "slave", "master|slave"
+   "bsize=1024", "Size of single packet buffer", "2048", "uint16_t"
+   "rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", "10", "1-14"
+   "nrxq=2", "Number of RX queues", "1", "255"
+   "ntxq=2", "Number of TX queues", "1", "255"
+   "socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 256"
+   "mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef", ""
+   "secret=abc123", "Secret is an optional security option, which if specified, must be matched by peer", "", "string len 24"
+   "zero-copy=yes", "Enable/disable zero-copy slave mode", "no", "yes|no"
+
+**Connection establishment**
+
+In order to create memif connection, two memif interfaces, each in separate
+process, are needed. One interface in ``master`` role and other in
+``slave`` role. It is not possible to connect two interfaces in a single
+process. Each interface can be connected to one interface at same time,
+identified by matching id parameter.
+
+Memif driver uses unix domain socket to exchange required information between
+memif interfaces. Socket file path is specified at interface creation see
+*Memif configuration options* table above. If socket is used by ``master``
+interface, it's marked as listener socket (in scope of current process) and
+listens to connection requests from other processes. One socket can be used by
+multiple interfaces. One process can have ``slave`` and ``master`` interfaces
+at the same time, provided each role is assigned unique socket.
+
+For detailed information on memif control messages, see: net/memif/memif.h.
+
+Slave interface attempts to make a connection on assigned socket. Process
+listening on this socket will extract the connection request and create a new
+connected socket (control channel). Then it sends the 'hello' message
+(``MEMIF_MSG_TYPE_HELLO``), containing configuration boundaries. Slave interface
+adjusts its configuration accordingly, and sends 'init' message
+(``MEMIF_MSG_TYPE_INIT``). This message among others contains interface id. Driver
+uses this id to find master interface, and assigns the control channel to this
+interface. If such interface is found, 'ack' message (``MEMIF_MSG_TYPE_ACK``) is
+sent. Slave interface sends 'add region' message (``MEMIF_MSG_TYPE_ADD_REGION``) for
+every region allocated. Master responds to each of these messages with 'ack'
+message. Same behavior applies to rings. Slave sends 'add ring' message
+(``MEMIF_MSG_TYPE_ADD_RING``) for every initialized ring. Master again responds to
+each message with 'ack' message. To finalize the connection, slave interface
+sends 'connect' message (``MEMIF_MSG_TYPE_CONNECT``). Upon receiving this message
+master maps regions to its address space, initializes rings and responds with
+'connected' message (``MEMIF_MSG_TYPE_CONNECTED``). Disconnect
+(``MEMIF_MSG_TYPE_DISCONNECT``) can be sent by both master and slave interfaces at
+any time, due to driver error or if the interface is being deleted.
+
+Files
+
+- net/memif/memif.h *- control messages definitions*
+- net/memif/memif_socket.h
+- net/memif/memif_socket.c
+
+Shared memory
+~~~~~~~~~~~~~
+
+**Shared memory format**
+
+Slave is producer and master is consumer. Memory regions, are mapped shared memory files,
+created by memif slave and provided to master at connection establishment.
+Regions contain rings and buffers. Rings and buffers can also be separated into multiple
+regions. For no-zero-copy, rings and buffers are stored inside single memory
+region to reduce the number of opened files.
+
+region n (no-zero-copy):
+
++-----------------------+-------------------------------------------------------------------------+
+| Rings                 | Buffers                                                                 |
++-----------+-----------+-----------------+---+---------------------------------------------------+
+| S2M rings | M2S rings | packet buffer 0 | . | pb ((1 << pmd->run.log2_ring_size)*(s2m + m2s))-1 |
++-----------+-----------+-----------------+---+---------------------------------------------------+
+
+S2M OR M2S Rings:
+
++--------+--------+-----------------------+
+| ring 0 | ring 1 | ring num_s2m_rings - 1|
++--------+--------+-----------------------+
+
+ring 0:
+
++-------------+---------------------------------------+
+| ring header | (1 << pmd->run.log2_ring_size) * desc |
++-------------+---------------------------------------+
+
+Descriptors are assigned packet buffers in order of rings creation. If we have one ring
+in each direction and ring size is 1024, then first 1024 buffers will belong to S2M ring and
+last 1024 will belong to M2S ring. In case of zero-copy, buffers are dequeued and
+enqueued as needed.
+
+**Descriptor format**
+
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Quad|6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | |1|1| | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Word|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | |6|5| | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|0   |length                                                         |region                         |flags                          |
++----+---------------------------------------------------------------+-------------------------------+-------------------------------+
+|1   |metadata                                                       |offset                                                         |
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+**Flags field - flags (Quad Word 0, bits 0:15)**
+
++-----+--------------------+------------------------------------------------------------------------------------------------+
+|Bits |Name                |Functionality                                                                                   |
++=====+====================+================================================================================================+
+|0    |MEMIF_DESC_FLAG_NEXT|Is chained buffer. When set, the packet is divided into multiple buffers. May not be contiguous.|
++-----+--------------------+------------------------------------------------------------------------------------------------+
+
+**Region index - region (Quad Word 0, 16:31)**
+
+Index of memory region, the buffer is located in.
+
+**Data length - length (Quad Word 0, 32:63)**
+
+Length of transmitted/received data.
+
+**Data Offset - offset (Quad Word 1, 0:31)**
+
+Data start offset from memory region address. *.regions[desc->region].addr + desc->offset*
+
+**Metadata - metadata (Quad Word 1, 32:63)**
+
+Buffer metadata.
+
+Files
+
+- net/memif/memif.h *- descriptor and ring definitions*
+- net/memif/rte_eth_memif.c *- eth_memif_rx() eth_memif_tx()*
+
+Example: testpmd and testpmd
+----------------------------
+In this example we run two instances of testpmd application and transmit packets over memif.
+
+First create ``master`` interface::
+
+    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
+
+Now create ``slave`` interface (master must be already running so the slave will connect)::
+
+    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
+
+Set forwarding mode in one of the instances to 'rx only' and the other to 'tx only'::
+
+    testpmd> set fwd rxonly
+    testpmd> start
+
+    testpmd> set fwd txonly
+    testpmd> start
+
+Show status::
+
+    testpmd> show port stats 0
+
+Example: testpmd and VPP
+------------------------
+For information on how to get and run VPP please see `<https://wiki.fd.io/view/VPP>`_.
+
+Start VPP in interactive mode (should be by default). Create memif master interface in VPP::
+
+    vpp# create interface memif id 0 master no-zero-copy
+    vpp# set interface state memif0/0 up
+    vpp# set interface ip address memif0/0 192.168.1.1/24
+
+To see socket filename use show memif command::
+
+    vpp# show memif
+    sockets
+     id  listener    filename
+      0   yes (1)     /run/vpp/memif.sock
+    ...
+
+Now create memif interface by running testpmd with these command line options::
+
+    #./testpmd --vdev=net_memif,socket=/run/vpp/memif.sock -- -i
+
+Testpmd should now create memif slave interface and try to connect to master.
+In testpmd set forward option to icmpecho and start forwarding::
+
+    testpmd> set fwd icmpecho
+    testpmd> start
+
+Send ping from VPP::
+
+    vpp# ping 192.168.1.2
+    64 bytes from 192.168.1.2: icmp_seq=2 ttl=254 time=36.2918 ms
+    64 bytes from 192.168.1.2: icmp_seq=3 ttl=254 time=23.3927 ms
+    64 bytes from 192.168.1.2: icmp_seq=4 ttl=254 time=24.2975 ms
+    64 bytes from 192.168.1.2: icmp_seq=5 ttl=254 time=17.7049 ms
diff --git a/doc/guides/rel_notes/release_19_05.rst b/doc/guides/rel_notes/release_19_05.rst
index 5044ac7df..c77001cd5 100644
--- a/doc/guides/rel_notes/release_19_05.rst
+++ b/doc/guides/rel_notes/release_19_05.rst
@@ -221,6 +221,10 @@ New Features
   Added support for Intel SST-BF feature so that the distributor core is
   pinned to a high frequency core if available.

+* **Added memif PMD.**
+
+  Added the new Shared Memory Packet Interface (``memif``) PMD.
+  See the :doc:`../nics/memif` guide for more details on this new driver.

 Removed Items
 -------------
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 3a72cf38c..78cb10fc6 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -35,6 +35,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_ICE_PMD) += ice
 DIRS-$(CONFIG_RTE_LIBRTE_IPN3KE_PMD) += ipn3ke
 DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
 DIRS-$(CONFIG_RTE_LIBRTE_LIO_PMD) += liquidio
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif
 DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
 DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
 DIRS-$(CONFIG_RTE_LIBRTE_MVNETA_PMD) += mvneta
diff --git a/drivers/net/memif/Makefile b/drivers/net/memif/Makefile
new file mode 100644
index 000000000..2ae3c37a2
--- /dev/null
+++ b/drivers/net/memif/Makefile
@@ -0,0 +1,30 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_memif.a
+
+EXPORT_MAP := rte_pmd_memif_version.map
+
+LIBABIVER := 1
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+# Experimantal APIs:
+# - rte_intr_callback_unregister_pending
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool
+LDLIBS += -lrte_ethdev -lrte_kvargs
+LDLIBS += -lrte_hash
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += rte_eth_memif.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif_socket.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/memif/memif.h b/drivers/net/memif/memif.h
new file mode 100644
index 000000000..3948b1f50
--- /dev/null
+++ b/drivers/net/memif/memif.h
@@ -0,0 +1,179 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_H_
+#define _MEMIF_H_
+
+#define MEMIF_COOKIE		0x3E31F20
+#define MEMIF_VERSION_MAJOR	2
+#define MEMIF_VERSION_MINOR	0
+#define MEMIF_VERSION		((MEMIF_VERSION_MAJOR << 8) | MEMIF_VERSION_MINOR)
+#define MEMIF_NAME_SZ		32
+
+/*
+ * S2M: direction slave -> master
+ * M2S: direction master -> slave
+ */
+
+/*
+ *  Type definitions
+ */
+
+typedef enum memif_msg_type {
+	MEMIF_MSG_TYPE_NONE,
+	MEMIF_MSG_TYPE_ACK,
+	MEMIF_MSG_TYPE_HELLO,
+	MEMIF_MSG_TYPE_INIT,
+	MEMIF_MSG_TYPE_ADD_REGION,
+	MEMIF_MSG_TYPE_ADD_RING,
+	MEMIF_MSG_TYPE_CONNECT,
+	MEMIF_MSG_TYPE_CONNECTED,
+	MEMIF_MSG_TYPE_DISCONNECT,
+} memif_msg_type_t;
+
+typedef enum {
+	MEMIF_RING_S2M, /**< buffer ring in direction slave -> master */
+	MEMIF_RING_M2S, /**< buffer ring in direction master -> slave */
+} memif_ring_type_t;
+
+typedef enum {
+	MEMIF_INTERFACE_MODE_ETHERNET,
+	MEMIF_INTERFACE_MODE_IP,
+	MEMIF_INTERFACE_MODE_PUNT_INJECT,
+} memif_interface_mode_t;
+
+typedef uint16_t memif_region_index_t;
+typedef uint32_t memif_region_offset_t;
+typedef uint64_t memif_region_size_t;
+typedef uint16_t memif_ring_index_t;
+typedef uint32_t memif_interface_id_t;
+typedef uint16_t memif_version_t;
+typedef uint8_t memif_log2_ring_size_t;
+
+/*
+ *  Socket messages
+ */
+
+ /**
+  * M2S
+  * Contains master interfaces configuration.
+  */
+typedef struct __rte_packed {
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+	memif_version_t min_version; /**< lowest supported memif version */
+	memif_version_t max_version; /**< highest supported memif version */
+	memif_region_index_t max_region; /**< maximum num of regions */
+	memif_ring_index_t max_m2s_ring; /**< maximum num of M2S ring */
+	memif_ring_index_t max_s2m_ring; /**< maximum num of S2M rings */
+	memif_log2_ring_size_t max_log2_ring_size; /**< maximum ring size (as log2) */
+} memif_msg_hello_t;
+
+/**
+ * S2M
+ * Contains information required to identify interface
+ * to which the slave wants to connect.
+ */
+typedef struct __rte_packed {
+	memif_version_t version;		/**< memif version */
+	memif_interface_id_t id;		/**< interface id */
+	memif_interface_mode_t mode:8;		/**< interface mode */
+	uint8_t secret[24];			/**< optional security parameter */
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+} memif_msg_init_t;
+
+/**
+ * S2M
+ * Request master to add new shared memory region to master interface.
+ * Shared files file descriptor is passed in cmsghdr.
+ */
+typedef struct __rte_packed {
+	memif_region_index_t index;		/**< shm regions index */
+	memif_region_size_t size;		/**< shm region size */
+} memif_msg_add_region_t;
+
+/**
+ * S2M
+ * Request master to add new ring to master interface.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_MSG_ADD_RING_FLAG_S2M 1		/**< ring is in S2M direction */
+	memif_ring_index_t index;		/**< ring index */
+	memif_region_index_t region; /**< region index on which this ring is located */
+	memif_region_offset_t offset;		/**< buffer start offset */
+	memif_log2_ring_size_t log2_ring_size;	/**< ring size (log2) */
+	uint16_t private_hdr_size;		/**< used for private metadata */
+} memif_msg_add_ring_t;
+
+/**
+ * S2M
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< slave interface name */
+} memif_msg_connect_t;
+
+/**
+ * M2S
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< master interface name */
+} memif_msg_connected_t;
+
+/**
+ * S2M & M2S
+ * Disconnect interfaces.
+ */
+typedef struct __rte_packed {
+	uint32_t code;				/**< error code */
+	uint8_t string[96];			/**< disconnect reason */
+} memif_msg_disconnect_t;
+
+typedef struct __rte_packed __rte_aligned(128)
+{
+	memif_msg_type_t type:16;
+	union {
+		memif_msg_hello_t hello;
+		memif_msg_init_t init;
+		memif_msg_add_region_t add_region;
+		memif_msg_add_ring_t add_ring;
+		memif_msg_connect_t connect;
+		memif_msg_connected_t connected;
+		memif_msg_disconnect_t disconnect;
+	};
+} memif_msg_t;
+
+/*
+ *  Ring and Descriptor Layout
+ */
+
+/**
+ * Buffer descriptor.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_DESC_FLAG_NEXT 1			/**< is chained buffer */
+	memif_region_index_t region; /**< region index on which the buffer is located */
+	uint32_t length;			/**< buffer length */
+	memif_region_offset_t offset;		/**< buffer offset */
+	uint32_t metadata;
+} memif_desc_t;
+
+#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
+	uint8_t mark[0] __rte_aligned(RTE_CACHE_LINE_SIZE)
+
+typedef struct {
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline0);
+	uint32_t cookie;			/**< MEMIF_COOKIE */
+	uint16_t flags;				/**< flags */
+#define MEMIF_RING_FLAG_MASK_INT 1		/**< disable interrupt mode */
+	volatile uint16_t head;			/**< pointer to ring buffer head */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline1);
+	volatile uint16_t tail;			/**< pointer to ring buffer tail */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline2);
+	memif_desc_t desc[0];			/**< buffer descriptors */
+} memif_ring_t;
+
+#endif				/* _MEMIF_H_ */
diff --git a/drivers/net/memif/memif_socket.c b/drivers/net/memif/memif_socket.c
new file mode 100644
index 000000000..3596eb686
--- /dev/null
+++ b/drivers/net/memif/memif_socket.c
@@ -0,0 +1,1103 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <errno.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_hash.h>
+#include <rte_jhash.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+static void memif_intr_handler(void *arg);
+
+static ssize_t
+memif_msg_send(int fd, memif_msg_t *msg, int afd)
+{
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	struct cmsghdr *cmsg;
+	char ctl[CMSG_SPACE(sizeof(int))];
+
+	iov[0].iov_base = msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+
+	if (afd > 0) {
+		memset(&ctl, 0, sizeof(ctl));
+		mh.msg_control = ctl;
+		mh.msg_controllen = sizeof(ctl);
+		cmsg = CMSG_FIRSTHDR(&mh);
+		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+		cmsg->cmsg_level = SOL_SOCKET;
+		cmsg->cmsg_type = SCM_RIGHTS;
+		rte_memcpy(CMSG_DATA(cmsg), &afd, sizeof(int));
+	}
+
+	return sendmsg(fd, &mh, 0);
+}
+
+static int
+memif_msg_send_from_queue(struct memif_control_channel *cc)
+{
+	ssize_t size;
+	int ret = 0;
+	struct memif_msg_queue_elt *e;
+
+	e = TAILQ_FIRST(&cc->msg_queue);
+	if (e == NULL)
+		return 0;
+
+	size = memif_msg_send(cc->intr_handle.fd, &e->msg, e->fd);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(ERR, "sendmsg fail: %s.", strerror(errno));
+		ret = -1;
+	} else {
+		MIF_LOG(DEBUG, "Sent msg type %u.", e->msg.type);
+	}
+	TAILQ_REMOVE(&cc->msg_queue, e, next);
+	rte_free(e);
+
+	return ret;
+}
+
+static struct memif_msg_queue_elt *
+memif_msg_enq(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e;
+
+	e = rte_zmalloc("memif_msg", sizeof(struct memif_msg_queue_elt), 0);
+	if (e == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control message.");
+		return NULL;
+	}
+
+	e->fd = -1;
+	TAILQ_INSERT_TAIL(&cc->msg_queue, e, next);
+
+	return e;
+}
+
+void
+memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			 int err_code)
+{
+	struct memif_msg_queue_elt *e;
+	struct pmd_internals *pmd;
+	memif_msg_disconnect_t *d;
+
+	if (cc == NULL) {
+		MIF_LOG(DEBUG, "Missing control channel.");
+		return;
+	}
+
+	e = memif_msg_enq(cc);
+	if (e == NULL) {
+		MIF_LOG(WARNING, "Failed to enqueue disconnect message.");
+		return;
+	}
+
+	d = &e->msg.disconnect;
+
+	e->msg.type = MEMIF_MSG_TYPE_DISCONNECT;
+	d->code = err_code;
+
+	if (reason != NULL) {
+		strlcpy((char *)d->string, reason, sizeof(d->string));
+		if (cc->dev != NULL) {
+			pmd = cc->dev->data->dev_private;
+			strlcpy(pmd->local_disc_string, reason,
+				sizeof(pmd->local_disc_string));
+		}
+	}
+}
+
+static int
+memif_msg_enq_hello(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	memif_msg_hello_t *h;
+
+	if (e == NULL)
+		return -1;
+
+	h = &e->msg.hello;
+
+	e->msg.type = MEMIF_MSG_TYPE_HELLO;
+	h->min_version = MEMIF_VERSION;
+	h->max_version = MEMIF_VERSION;
+	h->max_s2m_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_m2s_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_region = ETH_MEMIF_MAX_REGION_NUM - 1;
+	h->max_log2_ring_size = ETH_MEMIF_MAX_LOG2_RING_SIZE;
+
+	strlcpy((char *)h->name, rte_version(), sizeof(h->name));
+
+	return 0;
+}
+
+static int
+memif_msg_receive_hello(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_hello_t *h = &msg->hello;
+
+	if (h->min_version > MEMIF_VERSION || h->max_version < MEMIF_VERSION) {
+		memif_msg_enq_disconnect(pmd->cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	/* Set parameters for active connection */
+	pmd->run.num_s2m_rings = RTE_MIN(h->max_s2m_ring + 1,
+					   pmd->cfg.num_s2m_rings);
+	pmd->run.num_m2s_rings = RTE_MIN(h->max_m2s_ring + 1,
+					   pmd->cfg.num_m2s_rings);
+	pmd->run.log2_ring_size = RTE_MIN(h->max_log2_ring_size,
+					    pmd->cfg.log2_ring_size);
+	pmd->run.pkt_buffer_size = pmd->cfg.pkt_buffer_size;
+
+	strlcpy(pmd->remote_name, (char *)h->name, sizeof(pmd->remote_name));
+
+	MIF_LOG(DEBUG, "%s: Connecting to %s.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_init(struct memif_control_channel *cc, memif_msg_t *msg)
+{
+	memif_msg_init_t *i = &msg->init;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *pmd;
+	struct rte_eth_dev *dev;
+
+	if (i->version != MEMIF_VERSION) {
+		memif_msg_enq_disconnect(cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	if (cc->socket == NULL) {
+		memif_msg_enq_disconnect(cc, "Device error", 0);
+		return -1;
+	}
+
+	/* Find device with requested ID */
+	TAILQ_FOREACH(elt, &cc->socket->dev_queue, next) {
+		dev = elt->dev;
+		pmd = dev->data->dev_private;
+		if (((pmd->flags & ETH_MEMIF_FLAG_DISABLED) == 0) &&
+		    pmd->id == i->id) {
+			/* assign control channel to device */
+			cc->dev = dev;
+			pmd->cc = cc;
+
+			if (i->mode != MEMIF_INTERFACE_MODE_ETHERNET) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Only ethernet mode supported",
+							 0);
+				return -1;
+			}
+
+			if (pmd->flags & (ETH_MEMIF_FLAG_CONNECTING |
+					   ETH_MEMIF_FLAG_CONNECTED)) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Already connected", 0);
+				return -1;
+			}
+			strlcpy(pmd->remote_name, (char *)i->name,
+				sizeof(pmd->remote_name));
+
+			if (*pmd->secret != '\0') {
+				if (*i->secret == '\0') {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Secret required", 0);
+					return -1;
+				}
+				if (strcmp(pmd->secret, (char *)i->secret) != 0) {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Incorrect secret", 0);
+					return -1;
+				}
+			}
+
+			pmd->flags |= ETH_MEMIF_FLAG_CONNECTING;
+			return 0;
+		}
+	}
+
+	/* ID not found on this socket */
+	MIF_LOG(DEBUG, "ID %u not found.", i->id);
+	memif_msg_enq_disconnect(cc, "ID not found", 0);
+	return -1;
+}
+
+static int
+memif_msg_receive_add_region(struct rte_eth_dev *dev, memif_msg_t *msg,
+			     int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_region_t *ar = &msg->add_region;
+	struct memif_region *r;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing region fd", 0);
+		return -1;
+	}
+
+	if (ar->index >= ETH_MEMIF_MAX_REGION_NUM || ar->index != pmd->regions_num ||
+			pmd->regions[ar->index] != NULL) {
+		memif_msg_enq_disconnect(pmd->cc, "Invalid region index", 0);
+		return -1;
+	}
+
+	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	r->fd = fd;
+	r->region_size = ar->size;
+	r->addr = NULL;
+
+	pmd->regions[ar->index] = r;
+	pmd->regions_num++;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_add_ring(struct rte_eth_dev *dev, memif_msg_t *msg, int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_ring_t *ar = &msg->add_ring;
+	struct memif_queue *mq;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing interrupt fd", 0);
+		return -1;
+	}
+
+	/* check if we have enough queues */
+	if (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) {
+		if (ar->index >= pmd->cfg.num_s2m_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_s2m_rings++;
+	} else {
+		if (ar->index >= pmd->cfg.num_m2s_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_m2s_rings++;
+	}
+
+	mq = (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) ?
+	    dev->data->rx_queues[ar->index] : dev->data->tx_queues[ar->index];
+
+	mq->intr_handle.fd = fd;
+	mq->log2_ring_size = ar->log2_ring_size;
+	mq->region = ar->region;
+	mq->offset = ar->offset;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connect_t *c = &msg->connect;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connected(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connected_t *c = &msg->connected;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_disconnect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_disconnect_t *d = &msg->disconnect;
+
+	memset(pmd->remote_disc_string, 0, sizeof(pmd->remote_disc_string));
+	strlcpy(pmd->remote_disc_string, (char *)d->string,
+		sizeof(pmd->remote_disc_string));
+
+	MIF_LOG(INFO, "%s: Disconnect received: %s",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_disc_string);
+
+	memset(pmd->local_disc_string, 0, 96);
+	memif_disconnect(rte_eth_dev_allocated
+			 (rte_vdev_device_name(pmd->vdev)));
+	return 0;
+}
+
+static int
+memif_msg_enq_ack(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	e->msg.type = MEMIF_MSG_TYPE_ACK;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_init(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_init_t *i = &e->msg.init;
+
+	if (e == NULL)
+		return -1;
+
+	i = &e->msg.init;
+	e->msg.type = MEMIF_MSG_TYPE_INIT;
+	i->version = MEMIF_VERSION;
+	i->id = pmd->id;
+	i->mode = MEMIF_INTERFACE_MODE_ETHERNET;
+
+	strlcpy((char *)i->name, rte_version(), sizeof(i->name));
+
+	if (*pmd->secret != '\0')
+		strlcpy((char *)i->secret, pmd->secret, sizeof(i->secret));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_region(struct rte_eth_dev *dev, uint8_t idx)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_add_region_t *ar;
+	struct memif_region *mr = pmd->regions[idx];
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_region;
+	e->msg.type = MEMIF_MSG_TYPE_ADD_REGION;
+	e->fd = mr->fd;
+	ar->index = idx;
+	ar->size = mr->region_size;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_ring(struct rte_eth_dev *dev, uint8_t idx,
+		       memif_ring_type_t type)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	struct memif_queue *mq;
+	memif_msg_add_ring_t *ar;
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_ring;
+	mq = (type == MEMIF_RING_S2M) ? dev->data->tx_queues[idx] :
+	    dev->data->rx_queues[idx];
+
+	e->msg.type = MEMIF_MSG_TYPE_ADD_RING;
+	e->fd = mq->intr_handle.fd;
+	ar->index = idx;
+	ar->offset = mq->offset;
+	ar->region = mq->region;
+	ar->log2_ring_size = mq->log2_ring_size;
+	ar->flags = (type == MEMIF_RING_S2M) ? MEMIF_MSG_ADD_RING_FLAG_S2M : 0;
+	ar->private_hdr_size = 0;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connect_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connect;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECT;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connected(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connected_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connected;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECTED;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static void
+memif_intr_unregister_handler(struct rte_intr_handle *intr_handle, void *arg)
+{
+	struct memif_msg_queue_elt *elt;
+	struct memif_control_channel *cc = arg;
+
+	/* close control channel fd */
+	close(intr_handle->fd);
+	/* clear message queue */
+	while ((elt = TAILQ_FIRST(&cc->msg_queue)) != NULL) {
+		TAILQ_REMOVE(&cc->msg_queue, elt, next);
+		rte_free(elt);
+	}
+	/* free control channel */
+	rte_free(cc);
+}
+
+void
+memif_disconnect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *elt, *next;
+	struct memif_queue *mq;
+	struct rte_intr_handle *ih;
+	int i;
+	int ret;
+
+	if (pmd->cc != NULL) {
+		/* Clear control message queue (except disconnect message if any). */
+		for (elt = TAILQ_FIRST(&pmd->cc->msg_queue); elt != NULL; elt = next) {
+			next = TAILQ_NEXT(elt, next);
+			if (elt->msg.type != MEMIF_MSG_TYPE_DISCONNECT) {
+				TAILQ_REMOVE(&pmd->cc->msg_queue, elt, next);
+				rte_free(elt);
+			}
+		}
+		/* send disconnect message (if there is any in queue) */
+		memif_msg_send_from_queue(pmd->cc);
+
+		/* at this point, there should be no more messages in queue */
+		if (TAILQ_FIRST(&pmd->cc->msg_queue) != NULL) {
+			MIF_LOG(WARNING,
+				"%s: Unexpected message(s) in message queue.",
+				rte_vdev_device_name(pmd->vdev));
+		}
+
+		ih = &pmd->cc->intr_handle;
+		if (ih->fd > 0) {
+			ret = rte_intr_callback_unregister(ih,
+							memif_intr_handler,
+							pmd->cc);
+			/*
+			 * If callback is active (disconnecting based on
+			 * received control message).
+			 */
+			if (ret == -EAGAIN) {
+				ret = rte_intr_callback_unregister_pending(ih,
+							memif_intr_handler,
+							pmd->cc,
+							memif_intr_unregister_handler);
+			} else if (ret > 0) {
+				close(ih->fd);
+				rte_free(pmd->cc);
+			}
+			pmd->cc = NULL;
+			if (ret <= 0)
+				MIF_LOG(WARNING,
+					"%s: Failed to unregister control channel callback.",
+					rte_vdev_device_name(pmd->vdev));
+		}
+	}
+
+	/* unconfig interrupts */
+	for (i = 0; i < pmd->cfg.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+	for (i = 0; i < pmd->cfg.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+
+	memif_free_regions(pmd);
+
+	/* reset connection configuration */
+	memset(&pmd->run, 0, sizeof(pmd->run));
+
+	dev->data->dev_link.link_status = ETH_LINK_DOWN;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTED;
+	MIF_LOG(DEBUG, "%s: Disconnected.", rte_vdev_device_name(pmd->vdev));
+}
+
+static int
+memif_msg_receive(struct memif_control_channel *cc)
+{
+	char ctl[CMSG_SPACE(sizeof(int)) +
+		 CMSG_SPACE(sizeof(struct ucred))] = { 0 };
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	memif_msg_t msg = { 0 };
+	ssize_t size;
+	int ret = 0;
+	struct ucred *cr __rte_unused;
+	cr = 0;
+	struct cmsghdr *cmsg;
+	int afd = -1;
+	int i;
+	struct pmd_internals *pmd;
+
+	iov[0].iov_base = (void *)&msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+	mh.msg_control = ctl;
+	mh.msg_controllen = sizeof(ctl);
+
+	size = recvmsg(cc->intr_handle.fd, &mh, 0);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(DEBUG, "Invalid message size.");
+		memif_msg_enq_disconnect(cc, "Invalid message size", 0);
+		return -1;
+	}
+	MIF_LOG(DEBUG, "Received msg type: %u.", msg.type);
+
+	cmsg = CMSG_FIRSTHDR(&mh);
+	while (cmsg) {
+		if (cmsg->cmsg_level == SOL_SOCKET) {
+			if (cmsg->cmsg_type == SCM_CREDENTIALS)
+				cr = (struct ucred *)CMSG_DATA(cmsg);
+			else if (cmsg->cmsg_type == SCM_RIGHTS)
+				afd = *(int *)CMSG_DATA(cmsg);
+		}
+		cmsg = CMSG_NXTHDR(&mh, cmsg);
+	}
+
+	if (cc->dev == NULL && msg.type != MEMIF_MSG_TYPE_INIT) {
+		MIF_LOG(DEBUG, "Unexpected message.");
+		memif_msg_enq_disconnect(cc, "Unexpected message", 0);
+		return -1;
+	}
+
+	/* get device from hash data */
+	switch (msg.type) {
+	case MEMIF_MSG_TYPE_ACK:
+		break;
+	case MEMIF_MSG_TYPE_HELLO:
+		ret = memif_msg_receive_hello(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_init_regions_and_queues(cc->dev);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_init(cc->dev);
+		if (ret < 0)
+			goto exit;
+		pmd = cc->dev->data->dev_private;
+		for (i = 0; i < pmd->regions_num; i++) {
+			ret = memif_msg_enq_add_region(cc->dev, i);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_S2M);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_M2S);
+			if (ret < 0)
+				goto exit;
+		}
+		ret = memif_msg_enq_connect(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_INIT:
+		/*
+		 * This cc does not have an interface asociated with it.
+		 * If suitable interface is found it will be assigned here.
+		 */
+		ret = memif_msg_receive_init(cc, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_REGION:
+		ret = memif_msg_receive_add_region(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_RING:
+		ret = memif_msg_receive_add_ring(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECT:
+		ret = memif_msg_receive_connect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_connected(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECTED:
+		ret = memif_msg_receive_connected(cc->dev, &msg);
+		break;
+	case MEMIF_MSG_TYPE_DISCONNECT:
+		ret = memif_msg_receive_disconnect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		break;
+	default:
+		memif_msg_enq_disconnect(cc, "Unknown message type", 0);
+		ret = -1;
+		goto exit;
+	}
+
+ exit:
+	return ret;
+}
+
+static void
+memif_intr_handler(void *arg)
+{
+	struct memif_control_channel *cc = arg;
+	int ret;
+
+	ret = memif_msg_receive(cc);
+	/* if driver failed to assign device */
+	if (cc->dev == NULL) {
+		ret = rte_intr_callback_unregister_pending(&cc->intr_handle,
+							   memif_intr_handler,
+							   cc,
+							   memif_intr_unregister_handler);
+		if (ret < 0)
+			MIF_LOG(WARNING,
+				"Failed to unregister control channel callback.");
+		return;
+	}
+	/* if memif_msg_receive failed */
+	if (ret < 0)
+		goto disconnect;
+
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto disconnect;
+
+	return;
+
+ disconnect:
+	if (cc->dev == NULL) {
+		MIF_LOG(WARNING, "eth dev not allocated");
+		return;
+	}
+	memif_disconnect(cc->dev);
+}
+
+static void
+memif_listener_handler(void *arg)
+{
+	struct memif_socket *socket = arg;
+	int sockfd;
+	int addr_len;
+	struct sockaddr_un client;
+	struct memif_control_channel *cc;
+	int ret;
+
+	addr_len = sizeof(client);
+	sockfd = accept(socket->intr_handle.fd, (struct sockaddr *)&client,
+			(socklen_t *)&addr_len);
+	if (sockfd < 0) {
+		MIF_LOG(ERR,
+			"Failed to accept connection request on socket fd %d",
+			socket->intr_handle.fd);
+		return;
+	}
+
+	MIF_LOG(DEBUG, "%s: Connection request accepted.", socket->filename);
+
+	cc = rte_zmalloc("memif-cc", sizeof(struct memif_control_channel), 0);
+	if (cc == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control channel.");
+		goto error;
+	}
+
+	cc->intr_handle.fd = sockfd;
+	cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	cc->socket = socket;
+	cc->dev = NULL;
+	TAILQ_INIT(&cc->msg_queue);
+
+	ret = rte_intr_callback_register(&cc->intr_handle, memif_intr_handler, cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to register control channel callback.");
+		goto error;
+	}
+
+	ret = memif_msg_enq_hello(cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to enqueue hello message.");
+		goto error;
+	}
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto error;
+
+	return;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (cc != NULL)
+		rte_free(cc);
+}
+
+static struct memif_socket *
+memif_socket_create(struct pmd_internals *pmd, char *key, uint8_t listener)
+{
+	struct memif_socket *sock;
+	struct sockaddr_un un;
+	int sockfd;
+	int ret;
+	int on = 1;
+
+	sock = rte_zmalloc("memif-socket", sizeof(struct memif_socket), 0);
+	if (sock == NULL) {
+		MIF_LOG(ERR, "Failed to allocate memory for memif socket");
+		return NULL;
+	}
+
+	sock->listener = listener;
+	rte_memcpy(sock->filename, key, 256);
+	TAILQ_INIT(&sock->dev_queue);
+
+	if (listener != 0) {
+		sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+		if (sockfd < 0)
+			goto error;
+
+		un.sun_family = AF_UNIX;
+		memcpy(un.sun_path, sock->filename,
+			sizeof(un.sun_path) - 1);
+
+		ret = setsockopt(sockfd, SOL_SOCKET, SO_PASSCRED, &on,
+				 sizeof(on));
+		if (ret < 0)
+			goto error;
+		ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
+		if (ret < 0)
+			goto error;
+		ret = listen(sockfd, 1);
+		if (ret < 0)
+			goto error;
+
+		MIF_LOG(DEBUG, "%s: Memif listener socket %s created.",
+			rte_vdev_device_name(pmd->vdev), sock->filename);
+
+		sock->intr_handle.fd = sockfd;
+		sock->intr_handle.type = RTE_INTR_HANDLE_EXT;
+		ret = rte_intr_callback_register(&sock->intr_handle,
+						 memif_listener_handler, sock);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to register interrupt "
+				"callback for listener socket",
+				rte_vdev_device_name(pmd->vdev));
+			return NULL;
+		}
+	}
+
+	return sock;
+
+ error:
+	MIF_LOG(ERR, "%s: Failed to setup socket %s: %s",
+		rte_vdev_device_name(pmd->vdev), key, strerror(errno));
+	if (sock != NULL)
+		rte_free(sock);
+	return NULL;
+}
+
+static struct rte_hash *
+memif_create_socket_hash(void)
+{
+	struct rte_hash_parameters params = { 0 };
+	params.name = MEMIF_SOCKET_HASH_NAME;
+	params.entries = 256;
+	params.key_len = 256;
+	params.hash_func = rte_jhash;
+	params.hash_func_init_val = 0;
+	return rte_hash_create(&params);
+}
+
+int
+memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *tmp_pmd;
+	struct rte_hash *hash;
+	int ret;
+	char key[256];
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL) {
+		hash = memif_create_socket_hash();
+		if (hash == NULL) {
+			MIF_LOG(ERR, "Failed to create memif socket hash.");
+			return -1;
+		}
+	}
+
+	memset(key, 0, 256);
+	rte_memcpy(key, socket_filename, strlen(socket_filename));
+	ret = rte_hash_lookup_data(hash, key, (void **)&socket);
+	if (ret < 0) {
+		socket = memif_socket_create(pmd, key,
+					     (pmd->role ==
+					      MEMIF_ROLE_SLAVE) ? 0 : 1);
+		if (socket == NULL)
+			return -1;
+		ret = rte_hash_add_key_data(hash, key, socket);
+		if (ret < 0) {
+			MIF_LOG(ERR, "Failed to add socket to socket hash.");
+			return ret;
+		}
+	}
+	pmd->socket_filename = socket->filename;
+
+	if (socket->listener != 0 && pmd->role == MEMIF_ROLE_SLAVE) {
+		MIF_LOG(ERR, "Socket is a listener.");
+		return -1;
+	} else if ((socket->listener == 0) && (pmd->role == MEMIF_ROLE_MASTER)) {
+		MIF_LOG(ERR, "Socket is not a listener.");
+		return -1;
+	}
+
+	TAILQ_FOREACH(elt, &socket->dev_queue, next) {
+		tmp_pmd = elt->dev->data->dev_private;
+		if (tmp_pmd->id == pmd->id) {
+			MIF_LOG(ERR, "Memif device with id %d already "
+				"exists on socket %s",
+				pmd->id, socket->filename);
+			return -1;
+		}
+	}
+
+	elt = rte_malloc("pmd-queue", sizeof(struct memif_socket_dev_list_elt), 0);
+	if (elt == NULL) {
+		MIF_LOG(ERR, "%s: Failed to add device to socket device list.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+	elt->dev = dev;
+	TAILQ_INSERT_TAIL(&socket->dev_queue, elt, next);
+
+	return 0;
+}
+
+void
+memif_socket_remove_device(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt, *next;
+	struct rte_hash *hash;
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL)
+		return;
+
+	if (rte_hash_lookup_data(hash, pmd->socket_filename, (void **)&socket) <
+	    0)
+		return;
+
+	for (elt = TAILQ_FIRST(&socket->dev_queue); elt != NULL; elt = next) {
+		next = TAILQ_NEXT(elt, next);
+		if (elt->dev == dev) {
+			TAILQ_REMOVE(&socket->dev_queue, elt, next);
+			rte_free(elt);
+			pmd->socket_filename = NULL;
+		}
+	}
+
+	/* remove socket, if this was the last device using it */
+	if (TAILQ_EMPTY(&socket->dev_queue)) {
+		rte_hash_del_key(hash, socket->filename);
+		if (socket->listener) {
+			/* remove listener socket file,
+			 * so we can create new one later.
+			 */
+			remove(socket->filename);
+		}
+		rte_free(socket);
+	}
+}
+
+int
+memif_connect_master(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+	return 0;
+}
+
+int
+memif_connect_slave(struct rte_eth_dev *dev)
+{
+	int sockfd;
+	int ret;
+	struct sockaddr_un sun;
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+
+	sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+	if (sockfd < 0) {
+		MIF_LOG(ERR, "%s: Failed to open socket.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	sun.sun_family = AF_UNIX;
+
+	memcpy(sun.sun_path, pmd->socket_filename, sizeof(sun.sun_path) - 1);
+
+	ret = connect(sockfd, (struct sockaddr *)&sun,
+		      sizeof(struct sockaddr_un));
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to connect socket: %s.",
+			rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+		goto error;
+	}
+
+	MIF_LOG(DEBUG, "%s: Memif socket: %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+
+	pmd->cc = rte_zmalloc("memif-cc",
+			      sizeof(struct memif_control_channel), 0);
+	if (pmd->cc == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate control channel.",
+			rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	pmd->cc->intr_handle.fd = sockfd;
+	pmd->cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	pmd->cc->socket = NULL;
+	pmd->cc->dev = dev;
+	TAILQ_INIT(&pmd->cc->msg_queue);
+
+	ret = rte_intr_callback_register(&pmd->cc->intr_handle,
+					 memif_intr_handler, pmd->cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to register interrupt callback "
+			"for control fd", rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	return 0;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (pmd->cc != NULL) {
+		rte_free(pmd->cc);
+		pmd->cc = NULL;
+	}
+	return -1;
+}
diff --git a/drivers/net/memif/memif_socket.h b/drivers/net/memif/memif_socket.h
new file mode 100644
index 000000000..2e8970712
--- /dev/null
+++ b/drivers/net/memif/memif_socket.h
@@ -0,0 +1,105 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_SOCKET_H_
+#define _MEMIF_SOCKET_H_
+
+#include <sys/queue.h>
+
+/**
+ * Remove device from socket device list. If no device is left on the socket,
+ * remove the socket as well.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_socket_remove_device(struct rte_eth_dev *dev);
+
+/**
+ * Enqueue disconnect message to control channel message queue.
+ *
+ * @param cc
+ *   control channel
+ * @param reason
+ *   const string stating disconnect reason (96 characters)
+ * @param err_code
+ *   error code
+ */
+void memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			      int err_code);
+
+/**
+ * Initialize memif socket for specified device. If socket doesn't exist, create socket.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @param socket_filename
+ *   socket filename
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename);
+
+/**
+ * Disconnect memif device. Close control channel and shared memory.
+ *
+ * @param dev
+ *   ethernet device
+ */
+void memif_disconnect(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, enable connection establishment.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_master(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, send connection request.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_slave(struct rte_eth_dev *dev);
+
+struct memif_socket_dev_list_elt {
+	TAILQ_ENTRY(memif_socket_dev_list_elt) next;
+	struct rte_eth_dev *dev;		/**< pointer to device internals */
+	char dev_name[RTE_ETH_NAME_MAX_LEN];
+};
+
+#define MEMIF_SOCKET_HASH_NAME			"memif-sh"
+struct memif_socket {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	uint8_t listener;			/**< if not zero socket is listener */
+	char filename[256];			/**< socket filename */
+
+	TAILQ_HEAD(, memif_socket_dev_list_elt) dev_queue;
+	/**< Queue of devices using this socket */
+};
+
+/* Control message queue. */
+struct memif_msg_queue_elt {
+	TAILQ_ENTRY(memif_msg_queue_elt) next;
+	memif_msg_t msg;			/**< control message */
+	int fd;					/**< fd to be sent to peer */
+};
+
+struct memif_control_channel {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	TAILQ_HEAD(, memif_msg_queue_elt) msg_queue; /**< control message queue */
+	struct memif_socket *socket;		/**< pointer to socket */
+	struct rte_eth_dev *dev;		/**< pointer to device */
+};
+
+#endif				/* MEMIF_SOCKET_H */
diff --git a/drivers/net/memif/meson.build b/drivers/net/memif/meson.build
new file mode 100644
index 000000000..4dfe37d3a
--- /dev/null
+++ b/drivers/net/memif/meson.build
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+if host_machine.system() != 'linux'
+        build = false
+endif
+
+sources = files('rte_eth_memif.c',
+		'memif_socket.c')
+
+allow_experimental_apis = true
+
+deps += ['hash']
diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
new file mode 100644
index 000000000..f42c39635
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.c
@@ -0,0 +1,1202 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <linux/if_ether.h>
+#include <errno.h>
+#include <sys/eventfd.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+#define ETH_MEMIF_ID_ARG		"id"
+#define ETH_MEMIF_ROLE_ARG		"role"
+#define ETH_MEMIF_PKT_BUFFER_SIZE_ARG	"bsize"
+#define ETH_MEMIF_RING_SIZE_ARG		"rsize"
+#define ETH_MEMIF_SOCKET_ARG		"socket"
+#define ETH_MEMIF_MAC_ARG		"mac"
+#define ETH_MEMIF_ZC_ARG		"zero-copy"
+#define ETH_MEMIF_SECRET_ARG		"secret"
+
+static const char *valid_arguments[] = {
+	ETH_MEMIF_ID_ARG,
+	ETH_MEMIF_ROLE_ARG,
+	ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
+	ETH_MEMIF_RING_SIZE_ARG,
+	ETH_MEMIF_SOCKET_ARG,
+	ETH_MEMIF_MAC_ARG,
+	ETH_MEMIF_ZC_ARG,
+	ETH_MEMIF_SECRET_ARG,
+	NULL
+};
+
+const char *
+memif_version(void)
+{
+	return ("memif-" RTE_STR(MEMIF_VERSION_MAJOR) "." RTE_STR(MEMIF_VERSION_MINOR));
+}
+
+static void
+memif_dev_info(struct rte_eth_dev *dev __rte_unused, struct rte_eth_dev_info *dev_info)
+{
+	dev_info->max_mac_addrs = 1;
+	dev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;
+	dev_info->max_rx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->max_tx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->min_rx_bufsize = 0;
+}
+
+static memif_ring_t *
+memif_get_ring(struct pmd_internals *pmd, memif_ring_type_t type, uint16_t ring_num)
+{
+	/* rings only in region 0 */
+	void *p = pmd->regions[0]->addr;
+	int ring_size = sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size);
+
+	p = (uint8_t *)p + (ring_num + type * pmd->run.num_s2m_rings) * ring_size;
+
+	return (memif_ring_t *)p;
+}
+
+static void *
+memif_get_buffer(struct pmd_internals *pmd, memif_desc_t *d)
+{
+	return ((uint8_t *)pmd->regions[d->region]->addr +
+		pmd->regions[d->region]->pkt_buffer_offset + d->offset);
+}
+
+static int
+memif_pktmbuf_chain(struct rte_mbuf *head, struct rte_mbuf *cur_tail,
+		    struct rte_mbuf *tail)
+{
+	/* Check for number-of-segments-overflow */
+	if (unlikely(head->nb_segs + tail->nb_segs > RTE_MBUF_MAX_NB_SEGS))
+		return -EOVERFLOW;
+
+	/* Chain 'tail' onto the old tail */
+	cur_tail->next = tail;
+
+	/* accumulate number of segments and total length. */
+	head->nb_segs = (uint16_t)(head->nb_segs + tail->nb_segs);
+
+	tail->pkt_len = tail->data_len;
+	head->pkt_len += tail->pkt_len;
+
+	return 0;
+}
+
+static uint16_t
+eth_memif_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t cur_slot, last_slot, n_slots, ring_size, mask, s0;
+	uint16_t n_rx_pkts = 0;
+	uint16_t mbuf_size = rte_pktmbuf_data_room_size(mq->mempool) -
+		RTE_PKTMBUF_HEADROOM;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf, *mbuf_head, *mbuf_tail;
+	uint64_t b;
+	ssize_t size __rte_unused;
+	uint16_t head;
+	int ret;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	/* consume interrupt */
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0)
+		size = read(mq->intr_handle.fd, &b, sizeof(b));
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	cur_slot = (type == MEMIF_RING_S2M) ? mq->last_head : mq->last_tail;
+	last_slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+	if (cur_slot == last_slot)
+		goto refill;
+	n_slots = last_slot - cur_slot;
+
+	while (n_slots && n_rx_pkts < nb_pkts) {
+		mbuf_head = rte_pktmbuf_alloc(mq->mempool);
+		if (unlikely(mbuf_head == NULL))
+			goto no_free_bufs;
+		mbuf = mbuf_head;
+		mbuf->port = mq->in_port;
+
+next_slot:
+		s0 = cur_slot & mask;
+		d0 = &ring->desc[s0];
+
+		src_len = d0->length;
+		dst_off = 0;
+		src_off = 0;
+
+		do {
+			dst_len = mbuf_size - dst_off;
+			if (dst_len == 0) {
+				dst_off = 0;
+				dst_len = mbuf_size;
+
+				/* store pointer to tail */
+				mbuf_tail = mbuf;
+				mbuf = rte_pktmbuf_alloc(mq->mempool);
+				if (unlikely(mbuf == NULL))
+					goto no_free_bufs;
+				mbuf->port = mq->in_port;
+				ret = memif_pktmbuf_chain(mbuf_head, mbuf_tail, mbuf);
+				if (unlikely(ret < 0)) {
+					MIF_LOG(ERR, "%s: number-of-segments-overflow",
+						rte_vdev_device_name(pmd->vdev));
+					rte_pktmbuf_free(mbuf);
+					goto no_free_bufs;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			rte_pktmbuf_data_len(mbuf) += cp_len;
+			rte_pktmbuf_pkt_len(mbuf) = rte_pktmbuf_data_len(mbuf);
+			if (mbuf != mbuf_head)
+				rte_pktmbuf_pkt_len(mbuf_head) += cp_len;
+
+			memcpy(rte_pktmbuf_mtod_offset(mbuf, void *, dst_off),
+			       (uint8_t *)memif_get_buffer(pmd, d0) + src_off, cp_len);
+
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+		} while (src_len);
+
+		cur_slot++;
+		n_slots--;
+
+		if (d0->flags & MEMIF_DESC_FLAG_NEXT)
+			goto next_slot;
+
+		mq->n_bytes += rte_pktmbuf_pkt_len(mbuf_head);
+		*bufs++ = mbuf_head;
+		n_rx_pkts++;
+	}
+
+no_free_bufs:
+	if (type == MEMIF_RING_S2M) {
+		rte_mb();
+		ring->tail = cur_slot;
+		mq->last_head = cur_slot;
+	} else {
+		mq->last_tail = cur_slot;
+	}
+
+refill:
+	if (type == MEMIF_RING_M2S) {
+		head = ring->head;
+		n_slots = ring_size - head + mq->last_tail;
+
+		while (n_slots--) {
+			s0 = head++ & mask;
+			d0 = &ring->desc[s0];
+			d0->length = pmd->run.pkt_buffer_size;
+		}
+		rte_mb();
+		ring->head = head;
+	}
+
+	mq->n_pkts += n_rx_pkts;
+	return n_rx_pkts;
+}
+
+static uint16_t
+eth_memif_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t slot, saved_slot, n_free, ring_size, mask, n_tx_pkts = 0;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head;
+	uint64_t a;
+	ssize_t size;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	n_free = ring->tail - mq->last_tail;
+	mq->last_tail += n_free;
+	slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+
+	if (type == MEMIF_RING_S2M)
+		n_free = ring_size - ring->head + mq->last_tail;
+	else
+		n_free = ring->head - ring->tail;
+
+	while (n_tx_pkts < nb_pkts && n_free) {
+		mbuf_head = *bufs++;
+		mbuf = mbuf_head;
+
+		saved_slot = slot;
+		d0 = &ring->desc[slot & mask];
+		dst_off = 0;
+		dst_len = (type == MEMIF_RING_S2M) ?
+			pmd->run.pkt_buffer_size : d0->length;
+
+next_in_chain:
+		src_off = 0;
+		src_len = rte_pktmbuf_data_len(mbuf);
+
+		while (src_len) {
+			if (dst_len == 0) {
+				if (n_free) {
+					slot++;
+					n_free--;
+					d0->flags |= MEMIF_DESC_FLAG_NEXT;
+					d0 = &ring->desc[slot & mask];
+					dst_off = 0;
+					dst_len = (type == MEMIF_RING_S2M) ?
+					    pmd->run.pkt_buffer_size : d0->length;
+					d0->flags = 0;
+				} else {
+					slot = saved_slot;
+					goto no_free_slots;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			memcpy((uint8_t *)memif_get_buffer(pmd, d0) + dst_off,
+			       rte_pktmbuf_mtod_offset(mbuf, void *, src_off),
+			       cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+			dst_len -= cp_len;
+
+			d0->length = dst_off;
+		}
+
+		if (rte_pktmbuf_is_contiguous(mbuf) == 0) {
+			mbuf = mbuf->next;
+			goto next_in_chain;
+		}
+
+		n_tx_pkts++;
+		slot++;
+		n_free--;
+		rte_pktmbuf_free(mbuf_head);
+	}
+
+no_free_slots:
+	rte_mb();
+	if (type == MEMIF_RING_S2M)
+		ring->head = slot;
+	else
+		ring->tail = slot;
+
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
+		a = 1;
+		size = write(mq->intr_handle.fd, &a, sizeof(a));
+		if (unlikely(size < 0)) {
+			MIF_LOG(WARNING,
+				"%s: Failed to send interrupt. %s",
+				rte_vdev_device_name(pmd->vdev), strerror(errno));
+		}
+	}
+
+	mq->n_err += nb_pkts - n_tx_pkts;
+	mq->n_pkts += n_tx_pkts;
+	return n_tx_pkts;
+}
+
+void
+memif_free_regions(struct pmd_internals *pmd)
+{
+	int i;
+	struct memif_region *r;
+
+	/* regions are allocated contiguously, so it's
+	 * enough to loop until 'pmd->regions_num'
+	 */
+	for (i = 0; i < pmd->regions_num; i++) {
+		r = pmd->regions[i];
+		if (r != NULL) {
+			if (r->addr != NULL) {
+				munmap(r->addr, r->region_size);
+				if (r->fd > 0) {
+					close(r->fd);
+					r->fd = -1;
+				}
+			}
+			rte_free(r);
+			pmd->regions[i] = NULL;
+		}
+	}
+	pmd->regions_num = 0;
+}
+
+static int
+memif_region_init_shm(struct pmd_internals *pmd, uint8_t has_buffers)
+{
+	char shm_name[ETH_MEMIF_SHM_NAME_SIZE];
+	int ret = 0;
+	struct memif_region *r;
+
+	if (pmd->regions_num >= ETH_MEMIF_MAX_REGION_NUM) {
+		MIF_LOG(ERR, "%s: Too many regions.", rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	/* calculate buffer offset */
+	r->pkt_buffer_offset = (pmd->run.num_s2m_rings + pmd->run.num_m2s_rings) *
+	    (sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size));
+
+	r->region_size = r->pkt_buffer_offset;
+	/* if region has buffers, add buffers size to region_size */
+	if (has_buffers == 1)
+		r->region_size += (uint32_t)(pmd->run.pkt_buffer_size *
+			(1 << pmd->run.log2_ring_size) *
+			(pmd->run.num_s2m_rings +
+			 pmd->run.num_m2s_rings));
+
+	memset(shm_name, 0, sizeof(char) * ETH_MEMIF_SHM_NAME_SIZE);
+	snprintf(shm_name, ETH_MEMIF_SHM_NAME_SIZE, "memif_region_%d",
+		 pmd->regions_num);
+
+	r->fd = memfd_create(shm_name, MFD_ALLOW_SEALING);
+	if (r->fd < 0) {
+		MIF_LOG(ERR, "%s: Failed to create shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		ret = -1;
+		goto error;
+	}
+
+	ret = fcntl(r->fd, F_ADD_SEALS, F_SEAL_SHRINK);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to add seals to shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		goto error;
+	}
+
+	ret = ftruncate(r->fd, r->region_size);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to truncate shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		goto error;
+	}
+
+	r->addr = mmap(NULL, r->region_size, PROT_READ |
+		       PROT_WRITE, MAP_SHARED, r->fd, 0);
+	if (r->addr == MAP_FAILED) {
+		MIF_LOG(ERR, "%s: Failed to mmap shm region: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(ret));
+		ret = -1;
+		goto error;
+	}
+
+	pmd->regions[pmd->regions_num] = r;
+	pmd->regions_num++;
+
+	return ret;
+
+error:
+	if (r->fd > 0)
+		close(r->fd);
+	r->fd = -1;
+
+	return ret;
+}
+
+static int
+memif_regions_init(struct pmd_internals *pmd)
+{
+	int ret;
+
+	/* create one buffer region */
+	ret = memif_region_init_shm(pmd, /* has buffer */ 1);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+static void
+memif_init_rings(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_ring_t *ring;
+	int i, j;
+	uint16_t slot;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = i * (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 0;
+			ring->desc[j].offset = (uint32_t)(slot *
+				pmd->run.pkt_buffer_size);
+			ring->desc[j].length = pmd->run.pkt_buffer_size;
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = (i + pmd->run.num_s2m_rings) *
+			    (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 0;
+			ring->desc[j].offset = (uint32_t)(slot *
+				pmd->run.pkt_buffer_size);
+			ring->desc[j].length = pmd->run.pkt_buffer_size;
+		}
+	}
+}
+
+/* called only by slave */
+static void
+memif_init_queues(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = dev->data->tx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for tx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = dev->data->rx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for rx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+}
+
+int
+memif_init_regions_and_queues(struct rte_eth_dev *dev)
+{
+	int ret;
+
+	ret = memif_regions_init(dev->data->dev_private);
+	if (ret < 0)
+		return ret;
+
+	memif_init_rings(dev);
+
+	memif_init_queues(dev);
+
+	return 0;
+}
+
+int
+memif_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_region *mr;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		mr = pmd->regions[i];
+		if (mr != NULL) {
+			if (mr->addr == NULL) {
+				if (mr->fd < 0)
+					return -1;
+				mr->addr = mmap(NULL, mr->region_size,
+						PROT_READ | PROT_WRITE,
+						MAP_SHARED, mr->fd, 0);
+				if (mr->addr == NULL)
+					return -1;
+				/* calculate packet buffer offset */
+				mr->pkt_buffer_offset = (pmd->run.num_s2m_rings +
+					pmd->run.num_m2s_rings) *
+					(sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+					(1 << pmd->run.log2_ring_size));
+			}
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
+			    mq->offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_MASTER)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
+			    mq->offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_SLAVE)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags |= ETH_MEMIF_FLAG_CONNECTED;
+	dev->data->dev_link.link_status = ETH_LINK_UP;
+	MIF_LOG(INFO, "%s: Connected.", rte_vdev_device_name(pmd->vdev));
+	return 0;
+}
+
+static int
+memif_dev_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int ret = 0;
+
+	switch (pmd->role) {
+	case MEMIF_ROLE_SLAVE:
+		ret = memif_connect_slave(dev);
+		break;
+	case MEMIF_ROLE_MASTER:
+		ret = memif_connect_master(dev);
+		break;
+	default:
+		MIF_LOG(ERR, "%s: Unknown role: %d.",
+			rte_vdev_device_name(pmd->vdev), pmd->role);
+		ret = -1;
+		break;
+	}
+
+	return ret;
+}
+
+static int
+memif_dev_configure(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	/*
+	 * SLAVE - TXQ
+	 * MASTER - RXQ
+	 */
+	pmd->cfg.num_s2m_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_tx_queues : dev->data->nb_rx_queues;
+
+	/*
+	 * SLAVE - RXQ
+	 * MASTER - TXQ
+	 */
+	pmd->cfg.num_m2s_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_rx_queues : dev->data->nb_tx_queues;
+
+	return 0;
+}
+
+static int
+memif_tx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_tx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("tx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate tx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type =
+	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_S2M : MEMIF_RING_M2S;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->pmd = pmd;
+	dev->data->tx_queues[qid] = mq;
+
+	return 0;
+}
+
+static int
+memif_rx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_rx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_rxconf *rx_conf __rte_unused,
+		     struct rte_mempool *mb_pool)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	printf("Setting up rx queue...\n");
+
+	mq = rte_zmalloc("rx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate rx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type = (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_M2S : MEMIF_RING_S2M;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->mempool = mb_pool;
+	mq->in_port = dev->data->port_id;
+	mq->pmd = pmd;
+	dev->data->rx_queues[qid] = mq;
+
+	return 0;
+}
+
+static void
+memif_queue_release(void *queue)
+{
+	struct memif_queue *q = (struct memif_queue *)queue;
+
+	if (!q)
+		return;
+
+	rte_free(q);
+}
+
+static int
+memif_link_update(struct rte_eth_dev *dev __rte_unused,
+		  int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
+static int
+memif_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+	uint8_t tmp, nq;
+
+	stats->ipackets = 0;
+	stats->ibytes = 0;
+	stats->opackets = 0;
+	stats->obytes = 0;
+	stats->oerrors = 0;
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_s2m_rings :
+	    pmd->run.num_m2s_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* RX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->rx_queues[i];
+		stats->q_ipackets[i] = mq->n_pkts;
+		stats->q_ibytes[i] = mq->n_bytes;
+		stats->ipackets += mq->n_pkts;
+		stats->ibytes += mq->n_bytes;
+	}
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_m2s_rings :
+	    pmd->run.num_s2m_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* TX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->tx_queues[i];
+		stats->q_opackets[i] = mq->n_pkts;
+		stats->q_obytes[i] = mq->n_bytes;
+		stats->q_errors[i] = mq->n_err;
+		stats->opackets += mq->n_pkts;
+		stats->obytes += mq->n_bytes;
+		stats->oerrors += mq->n_err;
+	}
+	return 0;
+}
+
+static void
+memif_stats_reset(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int i;
+	struct memif_queue *mq;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->tx_queues[i] :
+		    dev->data->rx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->rx_queues[i] :
+		    dev->data->tx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+}
+
+static int
+memif_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	MIF_LOG(WARNING, "%s: Interrupt mode not supported.",
+		rte_vdev_device_name(pmd->vdev));
+
+	return -1;
+}
+
+static int
+memif_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd __rte_unused = dev->data->dev_private;
+
+	return 0;
+}
+
+static const struct eth_dev_ops ops = {
+	.dev_start = memif_dev_start,
+	.dev_infos_get = memif_dev_info,
+	.dev_configure = memif_dev_configure,
+	.tx_queue_setup = memif_tx_queue_setup,
+	.rx_queue_setup = memif_rx_queue_setup,
+	.rx_queue_release = memif_queue_release,
+	.tx_queue_release = memif_queue_release,
+	.rx_queue_intr_enable = memif_rx_queue_intr_enable,
+	.rx_queue_intr_disable = memif_rx_queue_intr_disable,
+	.link_update = memif_link_update,
+	.stats_get = memif_stats_get,
+	.stats_reset = memif_stats_reset,
+};
+
+static int
+memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
+	     memif_interface_id_t id, uint32_t flags,
+	     const char *socket_filename,
+	     memif_log2_ring_size_t log2_ring_size,
+	     uint16_t pkt_buffer_size, const char *secret,
+	     struct ether_addr *ether_addr)
+{
+	int ret = 0;
+	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev_data *data;
+	struct pmd_internals *pmd;
+	const unsigned int numa_node = vdev->device.numa_node;
+	const char *name = rte_vdev_device_name(vdev);
+
+	if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
+		MIF_LOG(ERR, "Zero-copy slave not supported.");
+		return -1;
+	}
+
+	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
+	if (eth_dev == NULL) {
+		MIF_LOG(ERR, "%s: Unable to allocate device struct.", name);
+		return -1;
+	}
+
+	pmd = eth_dev->data->dev_private;
+	memset(pmd, 0, sizeof(*pmd));
+
+	pmd->vdev = vdev;
+	pmd->id = id;
+	pmd->flags = flags;
+	pmd->flags |= ETH_MEMIF_FLAG_DISABLED;
+	pmd->role = role;
+	printf("%s\n", role == MEMIF_ROLE_SLAVE ? "SLAVE" : "MASTER");
+	ret = memif_socket_init(eth_dev, socket_filename);
+	if (ret < 0)
+		return ret;
+
+	memset(pmd->secret, 0, sizeof(char) * 24);
+	if (secret != NULL)
+		strlcpy(pmd->secret, secret, sizeof(pmd->secret));
+
+	pmd->cfg.log2_ring_size = log2_ring_size;
+	/* set in .dev_configure() */
+	pmd->cfg.num_s2m_rings = 0;
+	pmd->cfg.num_m2s_rings = 0;
+
+	pmd->cfg.pkt_buffer_size = pkt_buffer_size;
+
+	data = eth_dev->data;
+	data->dev_private = pmd;
+	data->numa_node = numa_node;
+	data->mac_addrs = ether_addr;
+
+	eth_dev->dev_ops = &ops;
+	eth_dev->device = &vdev->device;
+	eth_dev->rx_pkt_burst = eth_memif_rx;
+	eth_dev->tx_pkt_burst = eth_memif_tx;
+
+	rte_eth_dev_probing_finish(eth_dev);
+
+	return 0;
+}
+
+static int
+memif_set_role(const char *key __rte_unused, const char *value,
+	       void *extra_args)
+{
+	enum memif_role_t *role = (enum memif_role_t *)extra_args;
+
+	if (strstr(value, "master") != NULL) {
+		*role = MEMIF_ROLE_MASTER;
+	} else if (strstr(value, "slave") != NULL) {
+		*role = MEMIF_ROLE_SLAVE;
+	} else {
+		MIF_LOG(ERR, "Unknown role: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_zc(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	uint32_t *flags = (uint32_t *)extra_args;
+
+	if (strstr(value, "yes") != NULL) {
+		*flags |= ETH_MEMIF_FLAG_ZERO_COPY;
+	} else if (strstr(value, "no") != NULL) {
+		*flags &= ~ETH_MEMIF_FLAG_ZERO_COPY;
+	} else {
+		MIF_LOG(ERR, "Failed to parse zero-copy param: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_id(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	memif_interface_id_t *id = (memif_interface_id_t *)extra_args;
+
+	/* even if parsing fails, 0 is a valid id */
+	*id = strtoul(value, NULL, 10);
+	return 0;
+}
+
+static int
+memif_set_bs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	uint16_t *pkt_buffer_size = (uint16_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > 0xFFFF) {
+		MIF_LOG(ERR, "Invalid buffer size: %s.", value);
+		return -EINVAL;
+	}
+	*pkt_buffer_size = tmp;
+	return 0;
+}
+
+static int
+memif_set_rs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	memif_log2_ring_size_t *log2_ring_size =
+	    (memif_log2_ring_size_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > ETH_MEMIF_MAX_LOG2_RING_SIZE) {
+		MIF_LOG(ERR, "Invalid ring size: %s (max %u).",
+			value, ETH_MEMIF_MAX_LOG2_RING_SIZE);
+		return -EINVAL;
+	}
+	*log2_ring_size = tmp;
+	return 0;
+}
+
+/* check if directory exists and if we have permission to read/write */
+static int
+memif_check_socket_filename(const char *filename)
+{
+	char *dir = NULL, *tmp;
+	uint32_t idx;
+	int ret = 0;
+
+	tmp = strrchr(filename, '/');
+	if (tmp != NULL) {
+		idx = tmp - filename;
+		dir = rte_zmalloc("memif_tmp", sizeof(char) * (idx + 1), 0);
+		if (dir == NULL) {
+			MIF_LOG(ERR, "Failed to allocate memory.");
+			return -1;
+		}
+		strlcpy(dir, filename, sizeof(char) * (idx + 1));
+	}
+
+	if (dir == NULL || (faccessat(-1, dir, F_OK | R_OK |
+					W_OK, AT_EACCESS) < 0)) {
+		MIF_LOG(ERR, "Invalid directory: '%s'.", dir);
+		ret = -EINVAL;
+	}
+
+	if (dir != NULL)
+		rte_free(dir);
+
+	return ret;
+}
+
+static int
+memif_set_socket_filename(const char *key __rte_unused, const char *value,
+			  void *extra_args)
+{
+	const char **socket_filename = (const char **)extra_args;
+
+	*socket_filename = value;
+	return memif_check_socket_filename(*socket_filename);
+}
+
+static int
+memif_set_mac(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	struct ether_addr *ether_addr = (struct ether_addr *)extra_args;
+	int ret = 0;
+
+	ret = sscanf(value, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
+	       &ether_addr->addr_bytes[0], &ether_addr->addr_bytes[1],
+	       &ether_addr->addr_bytes[2], &ether_addr->addr_bytes[3],
+	       &ether_addr->addr_bytes[4], &ether_addr->addr_bytes[5]);
+	if (ret != 6)
+		MIF_LOG(WARNING, "Failed to parse mac '%s'.", value);
+	return 0;
+}
+
+static int
+memif_set_secret(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	const char **secret = (const char **)extra_args;
+
+	*secret = value;
+	return 0;
+}
+
+static int
+rte_pmd_memif_probe(struct rte_vdev_device *vdev)
+{
+	RTE_BUILD_BUG_ON(sizeof(memif_msg_t) != 128);
+	RTE_BUILD_BUG_ON(sizeof(memif_desc_t) != 16);
+	int ret = 0;
+	struct rte_kvargs *kvlist;
+	const char *name = rte_vdev_device_name(vdev);
+	enum memif_role_t role = MEMIF_ROLE_SLAVE;
+	memif_interface_id_t id = 0;
+	uint16_t pkt_buffer_size = ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE;
+	memif_log2_ring_size_t log2_ring_size = ETH_MEMIF_DEFAULT_RING_SIZE;
+	const char *socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME;
+	uint32_t flags = 0;
+	const char *secret = NULL;
+	struct ether_addr *ether_addr = rte_zmalloc("", sizeof(struct ether_addr), 0);
+
+	eth_random_addr(ether_addr->addr_bytes);
+
+	MIF_LOG(INFO, "Initialize MEMIF: %s.", name);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		MIF_LOG(ERR, "Multi-processing not supported for memif.");
+		/* TODO:
+		 * Request connection information.
+		 *
+		 * Once memif in the primary process is connected,
+		 * broadcast connection information.
+		 */
+		return -1;
+	}
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_arguments);
+
+	/* parse parameters */
+	if (kvlist != NULL) {
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ROLE_ARG,
+					 &memif_set_role, &role);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ID_ARG,
+					 &memif_set_id, &id);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
+					 &memif_set_bs, &pkt_buffer_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_RING_SIZE_ARG,
+					 &memif_set_rs, &log2_ring_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SOCKET_ARG,
+					 &memif_set_socket_filename,
+					 (void *)(&socket_filename));
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_MAC_ARG,
+					 &memif_set_mac, ether_addr);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ZC_ARG,
+					 &memif_set_zc, &flags);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SECRET_ARG,
+					 &memif_set_secret, (void *)(&secret));
+		if (ret < 0)
+			goto exit;
+	}
+
+	/* create interface */
+	ret = memif_create(vdev, role, id, flags, socket_filename,
+			   log2_ring_size, pkt_buffer_size, secret, ether_addr);
+
+exit:
+	if (kvlist != NULL)
+		rte_kvargs_free(kvlist);
+	return ret;
+}
+
+static int
+rte_pmd_memif_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_eth_dev *eth_dev;
+	struct pmd_internals *pmd;
+
+	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+	if (eth_dev == NULL)
+		return 0;
+
+	pmd = eth_dev->data->dev_private;
+
+	memif_msg_enq_disconnect(pmd->cc, "Device removed", 0);
+	memif_disconnect(eth_dev);
+
+	memif_socket_remove_device(eth_dev);
+
+	pmd->vdev = NULL;
+
+	rte_eth_dev_release_port(eth_dev);
+
+	return 0;
+}
+
+static struct rte_vdev_driver pmd_memif_drv = {
+	.probe = rte_pmd_memif_probe,
+	.remove = rte_pmd_memif_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_memif, pmd_memif_drv);
+
+RTE_PMD_REGISTER_PARAM_STRING(net_memif,
+			      ETH_MEMIF_ID_ARG "=<int>"
+			      ETH_MEMIF_ROLE_ARG "=master|slave"
+			      ETH_MEMIF_PKT_BUFFER_SIZE_ARG "=<int>"
+			      ETH_MEMIF_RING_SIZE_ARG "=<int>"
+			      ETH_MEMIF_SOCKET_ARG "=<string>"
+			      ETH_MEMIF_MAC_ARG "=xx:xx:xx:xx:xx:xx"
+			      ETH_MEMIF_ZC_ARG "=yes|no"
+			      ETH_MEMIF_SECRET_ARG "=<string>");
+
+int memif_logtype;
+
+RTE_INIT(memif_init_log)
+{
+	memif_logtype = rte_log_register("pmd.net.memif");
+	if (memif_logtype >= 0)
+		rte_log_set_level(memif_logtype, RTE_LOG_NOTICE);
+}
diff --git a/drivers/net/memif/rte_eth_memif.h b/drivers/net/memif/rte_eth_memif.h
new file mode 100644
index 000000000..e640c975f
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.h
@@ -0,0 +1,207 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _RTE_ETH_MEMIF_H_
+#define _RTE_ETH_MEMIF_H_
+
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE
+#endif				/* GNU_SOURCE */
+
+#include <sys/queue.h>
+
+#include <rte_ethdev_driver.h>
+#include <rte_ether.h>
+#include <rte_interrupts.h>
+
+#include "memif.h"
+
+#define ETH_MEMIF_DEFAULT_SOCKET_FILENAME	"/tmp/memif.sock"
+#define ETH_MEMIF_DEFAULT_RING_SIZE		10
+#define ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE	2048
+
+#define ETH_MEMIF_MAX_NUM_Q_PAIRS		255
+#define ETH_MEMIF_MAX_LOG2_RING_SIZE		14
+#define ETH_MEMIF_MAX_REGION_NUM		256
+
+#define ETH_MEMIF_SHM_NAME_SIZE			32
+
+extern int memif_logtype;
+
+#define MIF_LOG(level, fmt, args...) \
+	rte_log(RTE_LOG_ ## level, memif_logtype, \
+		"%s(): " fmt "\n", __func__, ##args)
+
+enum memif_role_t {
+	MEMIF_ROLE_MASTER,
+	MEMIF_ROLE_SLAVE,
+};
+
+struct memif_region {
+	void *addr;				/**< shared memory address */
+	memif_region_size_t region_size;	/**< shared memory size */
+	int fd;					/**< shared memory file descriptor */
+	uint32_t pkt_buffer_offset;
+	/**< offset from 'addr' to first packet buffer */
+};
+
+struct memif_queue {
+	struct rte_mempool *mempool;		/**< mempool for RX packets */
+	uint16_t in_port;			/**< port id */
+
+	struct pmd_internals *pmd;		/**< device internals */
+
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+
+	/* ring info */
+	memif_ring_type_t type;			/**< ring type */
+	memif_ring_t *ring;			/**< pointer to ring */
+	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
+
+	memif_region_index_t region;		/**< shared memory region index */
+	memif_region_offset_t offset;
+	/**< ring offset from start of shm region (ring - memif_region.addr) */
+
+	uint16_t last_head;			/**< last ring head */
+	uint16_t last_tail;			/**< last ring tail */
+
+	/* rx/tx info */
+	uint64_t n_pkts;			/**< number of rx/tx packets */
+	uint64_t n_bytes;			/**< number of rx/tx bytes */
+	uint64_t n_err;				/**< number of tx errors */
+};
+
+struct pmd_internals {
+	memif_interface_id_t id;		/**< unique id */
+	enum memif_role_t role;			/**< device role */
+	uint32_t flags;				/**< device status flags */
+#define ETH_MEMIF_FLAG_CONNECTING	(1 << 0)
+/**< device is connecting */
+#define ETH_MEMIF_FLAG_CONNECTED	(1 << 1)
+/**< device is connected */
+#define ETH_MEMIF_FLAG_ZERO_COPY	(1 << 2)
+/**< device is zero-copy enabled */
+#define ETH_MEMIF_FLAG_DISABLED		(1 << 3)
+/**< device has not been configured and can not accept connection requests */
+
+	char *socket_filename;			/**< pointer to socket filename */
+	char secret[24]; /**< secret (optional security parameter) */
+
+	struct memif_control_channel *cc;	/**< control channel */
+
+	struct memif_region *regions[ETH_MEMIF_MAX_REGION_NUM];
+	/**< shared memory regions */
+	memif_region_index_t regions_num;	/**< number of regions */
+
+	/* remote info */
+	char remote_name[64];			/**< remote app name */
+	char remote_if_name[64];		/**< remote peer name */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t pkt_buffer_size;	/**< buffer size */
+	} cfg;					/**< Configured parameters (max values) */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t pkt_buffer_size;	/**< buffer size */
+	} run;
+	/**< Parameters used in active connection */
+
+	char local_disc_string[96];		/**< local disconnect reason */
+	char remote_disc_string[96];		/**< remote disconnect reason */
+
+	struct rte_vdev_device *vdev;		/**< vdev handle */
+};
+
+/**
+ * Unmap shared memory and free regions from memory.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_free_regions(struct pmd_internals *pmd);
+
+/**
+ * Finalize connection establishment process. Map shared memory file
+ * (master role), initialize ring queue, set link status up.
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect(struct rte_eth_dev *dev);
+
+/**
+ * Create shared memory file and initialize ring queue.
+ * Only called by slave when establishing connection
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_init_regions_and_queues(struct rte_eth_dev *dev);
+
+/**
+ * Get memif version string.
+ *
+ * @return
+ *   - memif version string
+ */
+const char *memif_version(void);
+
+#ifndef MFD_HUGETLB
+#ifndef __NR_memfd_create
+
+#if defined __x86_64__
+#define __NR_memfd_create 319
+#elif defined __x86_32__
+#define __NR_memfd_create 1073742143
+#elif defined __arm__
+#define __NR_memfd_create 385
+#elif defined __aarch64__
+#define __NR_memfd_create 279
+#elif defined __powerpc__
+#define __NR_memfd_create 360
+#elif defined __i386__
+#define __NR_memfd_create 356
+#else
+#error "__NR_memfd_create unknown for this architecture"
+#endif
+
+#endif				/* __NR_memfd_create */
+
+static inline int memfd_create(const char *name, unsigned int flags)
+{
+	return syscall(__NR_memfd_create, name, flags);
+}
+#endif				/* MFD_HUGETLB */
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef MFD_ALLOW_SEALING
+#define MFD_ALLOW_SEALING       0x0002U
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL     0x0001	/* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002	/* prevent file from shrinking */
+#define F_SEAL_GROW     0x0004	/* prevent file from growing */
+#define F_SEAL_WRITE    0x0008	/* prevent writes */
+#endif
+
+#endif				/* RTE_ETH_MEMIF_H */
diff --git a/drivers/net/memif/rte_pmd_memif_version.map b/drivers/net/memif/rte_pmd_memif_version.map
new file mode 100644
index 000000000..4b2e62193
--- /dev/null
+++ b/drivers/net/memif/rte_pmd_memif_version.map
@@ -0,0 +1,4 @@
+DPDK_19.05 {
+
+        local: *;
+};
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index ed99896c3..b57073483 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -24,6 +24,7 @@ drivers = ['af_packet',
 	'ixgbe',
 	'kni',
 	'liquidio',
+	'memif',
 	'mlx4',
 	'mlx5',
 	'mvneta',
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 7c9b4b538..326b2a30b 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -174,6 +174,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KNI)        += -lrte_pmd_kni
 endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD)        += -lrte_pmd_lio
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF)      += -lrte_pmd_memif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -lrte_pmd_mlx4
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += -lrte_pmd_mlx5 -lmnl
 ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
--
2.17.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [dpdk-dev] [RFC v8] /net: memory interface (memif)
  2019-05-13 10:45       ` [dpdk-dev] [RFC v7] " Jakub Grajciar
  2019-05-13 10:45         ` Jakub Grajciar
@ 2019-05-16 11:46         ` Jakub Grajciar
  2019-05-16 15:18           ` Stephen Hemminger
                             ` (5 more replies)
  1 sibling, 6 replies; 58+ messages in thread
From: Jakub Grajciar @ 2019-05-16 11:46 UTC (permalink / raw)
  To: dev; +Cc: Jakub Grajciar

Memory interface (memif), provides high performance
packet transfer over shared memory.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
---
 MAINTAINERS                                 |    6 +
 app/test-pmd/testpmd.c                      |    2 +
 config/common_base                          |    5 +
 config/common_linux                         |    1 +
 doc/guides/nics/features/memif.ini          |   14 +
 doc/guides/nics/index.rst                   |    1 +
 doc/guides/nics/memif.rst                   |  234 ++++
 doc/guides/rel_notes/release_19_08.rst      |    5 +
 drivers/net/Makefile                        |    1 +
 drivers/net/memif/Makefile                  |   30 +
 drivers/net/memif/memif.h                   |  179 +++
 drivers/net/memif/memif_socket.c            | 1103 +++++++++++++++++
 drivers/net/memif/memif_socket.h            |  105 ++
 drivers/net/memif/meson.build               |   13 +
 drivers/net/memif/rte_eth_memif.c           | 1194 +++++++++++++++++++
 drivers/net/memif/rte_eth_memif.h           |  207 ++++
 drivers/net/memif/rte_pmd_memif_version.map |    4 +
 drivers/net/meson.build                     |    1 +
 mk/rte.app.mk                               |    1 +
 19 files changed, 3106 insertions(+)
 create mode 100644 doc/guides/nics/features/memif.ini
 create mode 100644 doc/guides/nics/memif.rst
 create mode 100644 drivers/net/memif/Makefile
 create mode 100644 drivers/net/memif/memif.h
 create mode 100644 drivers/net/memif/memif_socket.c
 create mode 100644 drivers/net/memif/memif_socket.h
 create mode 100644 drivers/net/memif/meson.build
 create mode 100644 drivers/net/memif/rte_eth_memif.c
 create mode 100644 drivers/net/memif/rte_eth_memif.h
 create mode 100644 drivers/net/memif/rte_pmd_memif_version.map

requires patch: http://patchwork.dpdk.org/patch/51511/
Memif uses a unix domain socket as control channel (cc).
rte_interrupts.h is used to handle interrupts for this cc.
This patch is required, because the message received can
be a reason for disconnecting.

v3:
- coding style fixes
- documentation
- doxygen comments
- use strlcpy() instead of strncpy()
- use RTE_BUILD_BUG_ON instead of _Static_assert()
- fix meson build deps

v4:
- coding style fixes
- doc update (desc format, messaging, features, ...)
- pointer arithmetic fix
- __rte_packed and __rte_aligned instead of __attribute__()
- fixed multi-queue
- support additional archs (memfd_create syscall)

v5:
- coding style fixes
- add release notes
- remove unused dependencies
- doc update

v6:
- remove socket on termination of testpmd application
- disallow memif probing in secondary process (return error), multi-process support will be implemented in separate patch
- fix chained buffer packet length
- use single region for rings and buffers (non-zero-copy)
- doc update

v7:
- coding style fixes

v8:
- release notes: 19.05 -> 19.08

diff --git a/MAINTAINERS b/MAINTAINERS
index 15d0829c5..3698c146b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -841,6 +841,12 @@ F: drivers/net/softnic/
 F: doc/guides/nics/features/softnic.ini
 F: doc/guides/nics/softnic.rst

+Memif PMD
+M: Jakub Grajciar <jgrajcia@cisco.com>
+F: drivers/net/memif/
+F: doc/guides/nics/memif.rst
+F: doc/guides/nics/features/memif.ini
+

 Crypto Drivers
 --------------
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index f0061d99f..b05f1462d 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2493,6 +2493,8 @@ pmd_test_exit(void)
 			device = rte_eth_devices[pt_id].device;
 			if (device && !strcmp(device->driver->name, "net_virtio_user"))
 				detach_port_device(pt_id);
+			else if (device && !strcmp(device->driver->name, "net_memif"))
+				detach_port_device(pt_id);
 		}
 	}

diff --git a/config/common_base b/config/common_base
index 6b96e0e80..dca119a65 100644
--- a/config/common_base
+++ b/config/common_base
@@ -444,6 +444,11 @@ CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 #
 CONFIG_RTE_LIBRTE_PMD_AF_XDP=n

+#
+# Compile Memory Interface PMD driver (Linux only)
+#
+CONFIG_RTE_LIBRTE_PMD_MEMIF=n
+
 #
 # Compile link bonding PMD library
 #
diff --git a/config/common_linux b/config/common_linux
index 75334273d..87514fe4f 100644
--- a/config/common_linux
+++ b/config/common_linux
@@ -19,6 +19,7 @@ CONFIG_RTE_LIBRTE_VHOST_POSTCOPY=n
 CONFIG_RTE_LIBRTE_PMD_VHOST=y
 CONFIG_RTE_LIBRTE_IFC_PMD=y
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
+CONFIG_RTE_LIBRTE_PMD_MEMIF=y
 CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
 CONFIG_RTE_LIBRTE_PMD_TAP=y
 CONFIG_RTE_LIBRTE_AVP_PMD=y
diff --git a/doc/guides/nics/features/memif.ini b/doc/guides/nics/features/memif.ini
new file mode 100644
index 000000000..807d9ecdc
--- /dev/null
+++ b/doc/guides/nics/features/memif.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'memif' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Link status          = Y
+Basic stats          = Y
+Jumbo frame          = Y
+ARMv8                = Y
+Power8               = Y
+x86-32               = Y
+x86-64               = Y
+Usage doc            = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 2221c35f2..691e7209f 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -36,6 +36,7 @@ Network Interface Controller Drivers
     intel_vf
     kni
     liquidio
+    memif
     mlx4
     mlx5
     mvneta
diff --git a/doc/guides/nics/memif.rst b/doc/guides/nics/memif.rst
new file mode 100644
index 000000000..0d0312442
--- /dev/null
+++ b/doc/guides/nics/memif.rst
@@ -0,0 +1,234 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2018-2019 Cisco Systems, Inc.
+
+======================
+Memif Poll Mode Driver
+======================
+
+Shared memory packet interface (memif) PMD allows for DPDK and any other client
+using memif (DPDK, VPP, libmemif) to communicate using shared memory. Memif is
+Linux only.
+
+The created device transmits packets in a raw format. It can be used with
+Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
+supported in DPDK memif implementation.
+
+Memif works in two roles: master and slave. Slave connects to master over an
+existing socket. It is also a producer of shared memory file and initializes
+the shared memory. Each interface can be connected to one peer interface
+at same time. The peer interface is identified by id parameter. Master
+creates the socket and listens for any slave connection requests. The socket
+may already exist on the system. Be sure to remove any such sockets, if you
+are creating a master interface, or you will see an "Address already in use"
+error. Function ``rte_pmd_memif_remove()``, which removes memif interface,
+will also remove a listener socket, if it is not being used by any other
+interface.
+
+The method to enable one or more interfaces is to use the
+``--vdev=net_memif0`` option on the DPDK application command line. Each
+``--vdev=net_memif1`` option given will create an interface named net_memif0,
+net_memif1, and so on. Memif uses unix domain socket to transmit control
+messages. Each memif has a unique id per socket. This id is used to identify
+peer interface. If you are connecting multiple
+interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``,
+etc. Note that if you assign a socket to a master interface it becomes a
+listener socket. Listener socket can not be used by a slave interface on same
+client.
+
+.. csv-table:: **Memif configuration options**
+   :header: "Option", "Description", "Default", "Valid value"
+
+   "id=0", "Used to identify peer interface", "0", "uint32_t"
+   "role=master", "Set memif role", "slave", "master|slave"
+   "bsize=1024", "Size of single packet buffer", "2048", "uint16_t"
+   "rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", "10", "1-14"
+   "nrxq=2", "Number of RX queues", "1", "255"
+   "ntxq=2", "Number of TX queues", "1", "255"
+   "socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 256"
+   "mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef", ""
+   "secret=abc123", "Secret is an optional security option, which if specified, must be matched by peer", "", "string len 24"
+   "zero-copy=yes", "Enable/disable zero-copy slave mode", "no", "yes|no"
+
+**Connection establishment**
+
+In order to create memif connection, two memif interfaces, each in separate
+process, are needed. One interface in ``master`` role and other in
+``slave`` role. It is not possible to connect two interfaces in a single
+process. Each interface can be connected to one interface at same time,
+identified by matching id parameter.
+
+Memif driver uses unix domain socket to exchange required information between
+memif interfaces. Socket file path is specified at interface creation see
+*Memif configuration options* table above. If socket is used by ``master``
+interface, it's marked as listener socket (in scope of current process) and
+listens to connection requests from other processes. One socket can be used by
+multiple interfaces. One process can have ``slave`` and ``master`` interfaces
+at the same time, provided each role is assigned unique socket.
+
+For detailed information on memif control messages, see: net/memif/memif.h.
+
+Slave interface attempts to make a connection on assigned socket. Process
+listening on this socket will extract the connection request and create a new
+connected socket (control channel). Then it sends the 'hello' message
+(``MEMIF_MSG_TYPE_HELLO``), containing configuration boundaries. Slave interface
+adjusts its configuration accordingly, and sends 'init' message
+(``MEMIF_MSG_TYPE_INIT``). This message among others contains interface id. Driver
+uses this id to find master interface, and assigns the control channel to this
+interface. If such interface is found, 'ack' message (``MEMIF_MSG_TYPE_ACK``) is
+sent. Slave interface sends 'add region' message (``MEMIF_MSG_TYPE_ADD_REGION``) for
+every region allocated. Master responds to each of these messages with 'ack'
+message. Same behavior applies to rings. Slave sends 'add ring' message
+(``MEMIF_MSG_TYPE_ADD_RING``) for every initialized ring. Master again responds to
+each message with 'ack' message. To finalize the connection, slave interface
+sends 'connect' message (``MEMIF_MSG_TYPE_CONNECT``). Upon receiving this message
+master maps regions to its address space, initializes rings and responds with
+'connected' message (``MEMIF_MSG_TYPE_CONNECTED``). Disconnect
+(``MEMIF_MSG_TYPE_DISCONNECT``) can be sent by both master and slave interfaces at
+any time, due to driver error or if the interface is being deleted.
+
+Files
+
+- net/memif/memif.h *- control messages definitions*
+- net/memif/memif_socket.h
+- net/memif/memif_socket.c
+
+Shared memory
+~~~~~~~~~~~~~
+
+**Shared memory format**
+
+Slave is producer and master is consumer. Memory regions, are mapped shared memory files,
+created by memif slave and provided to master at connection establishment.
+Regions contain rings and buffers. Rings and buffers can also be separated into multiple
+regions. For no-zero-copy, rings and buffers are stored inside single memory
+region to reduce the number of opened files.
+
+region n (no-zero-copy):
+
++-----------------------+-------------------------------------------------------------------------+
+| Rings                 | Buffers                                                                 |
++-----------+-----------+-----------------+---+---------------------------------------------------+
+| S2M rings | M2S rings | packet buffer 0 | . | pb ((1 << pmd->run.log2_ring_size)*(s2m + m2s))-1 |
++-----------+-----------+-----------------+---+---------------------------------------------------+
+
+S2M OR M2S Rings:
+
++--------+--------+-----------------------+
+| ring 0 | ring 1 | ring num_s2m_rings - 1|
++--------+--------+-----------------------+
+
+ring 0:
+
++-------------+---------------------------------------+
+| ring header | (1 << pmd->run.log2_ring_size) * desc |
++-------------+---------------------------------------+
+
+Descriptors are assigned packet buffers in order of rings creation. If we have one ring
+in each direction and ring size is 1024, then first 1024 buffers will belong to S2M ring and
+last 1024 will belong to M2S ring. In case of zero-copy, buffers are dequeued and
+enqueued as needed.
+
+**Descriptor format**
+
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Quad|6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | |1|1| | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Word|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | |6|5| | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|0   |length                                                         |region                         |flags                          |
++----+---------------------------------------------------------------+-------------------------------+-------------------------------+
+|1   |metadata                                                       |offset                                                         |
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+**Flags field - flags (Quad Word 0, bits 0:15)**
+
++-----+--------------------+------------------------------------------------------------------------------------------------+
+|Bits |Name                |Functionality                                                                                   |
++=====+====================+================================================================================================+
+|0    |MEMIF_DESC_FLAG_NEXT|Is chained buffer. When set, the packet is divided into multiple buffers. May not be contiguous.|
++-----+--------------------+------------------------------------------------------------------------------------------------+
+
+**Region index - region (Quad Word 0, 16:31)**
+
+Index of memory region, the buffer is located in.
+
+**Data length - length (Quad Word 0, 32:63)**
+
+Length of transmitted/received data.
+
+**Data Offset - offset (Quad Word 1, 0:31)**
+
+Data start offset from memory region address. *.regions[desc->region].addr + desc->offset*
+
+**Metadata - metadata (Quad Word 1, 32:63)**
+
+Buffer metadata.
+
+Files
+
+- net/memif/memif.h *- descriptor and ring definitions*
+- net/memif/rte_eth_memif.c *- eth_memif_rx() eth_memif_tx()*
+
+Example: testpmd and testpmd
+----------------------------
+In this example we run two instances of testpmd application and transmit packets over memif.
+
+First create ``master`` interface::
+
+    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
+
+Now create ``slave`` interface (master must be already running so the slave will connect)::
+
+    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
+
+Set forwarding mode in one of the instances to 'rx only' and the other to 'tx only'::
+
+    testpmd> set fwd rxonly
+    testpmd> start
+
+    testpmd> set fwd txonly
+    testpmd> start
+
+Show status::
+
+    testpmd> show port stats 0
+
+Example: testpmd and VPP
+------------------------
+For information on how to get and run VPP please see `<https://wiki.fd.io/view/VPP>`_.
+
+Start VPP in interactive mode (should be by default). Create memif master interface in VPP::
+
+    vpp# create interface memif id 0 master no-zero-copy
+    vpp# set interface state memif0/0 up
+    vpp# set interface ip address memif0/0 192.168.1.1/24
+
+To see socket filename use show memif command::
+
+    vpp# show memif
+    sockets
+     id  listener    filename
+      0   yes (1)     /run/vpp/memif.sock
+    ...
+
+Now create memif interface by running testpmd with these command line options::
+
+    #./testpmd --vdev=net_memif,socket=/run/vpp/memif.sock -- -i
+
+Testpmd should now create memif slave interface and try to connect to master.
+In testpmd set forward option to icmpecho and start forwarding::
+
+    testpmd> set fwd icmpecho
+    testpmd> start
+
+Send ping from VPP::
+
+    vpp# ping 192.168.1.2
+    64 bytes from 192.168.1.2: icmp_seq=2 ttl=254 time=36.2918 ms
+    64 bytes from 192.168.1.2: icmp_seq=3 ttl=254 time=23.3927 ms
+    64 bytes from 192.168.1.2: icmp_seq=4 ttl=254 time=24.2975 ms
+    64 bytes from 192.168.1.2: icmp_seq=5 ttl=254 time=17.7049 ms
diff --git a/doc/guides/rel_notes/release_19_08.rst b/doc/guides/rel_notes/release_19_08.rst
index b9510f93a..42a7922b3 100644
--- a/doc/guides/rel_notes/release_19_08.rst
+++ b/doc/guides/rel_notes/release_19_08.rst
@@ -54,6 +54,11 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================

+* **Added memif PMD.**
+
+  Added the new Shared Memory Packet Interface (``memif``) PMD.
+  See the :doc:`../nics/memif` guide for more details on this new driver.
+

 Removed Items
 -------------
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 3a72cf38c..78cb10fc6 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -35,6 +35,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_ICE_PMD) += ice
 DIRS-$(CONFIG_RTE_LIBRTE_IPN3KE_PMD) += ipn3ke
 DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
 DIRS-$(CONFIG_RTE_LIBRTE_LIO_PMD) += liquidio
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif
 DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
 DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
 DIRS-$(CONFIG_RTE_LIBRTE_MVNETA_PMD) += mvneta
diff --git a/drivers/net/memif/Makefile b/drivers/net/memif/Makefile
new file mode 100644
index 000000000..2ae3c37a2
--- /dev/null
+++ b/drivers/net/memif/Makefile
@@ -0,0 +1,30 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_memif.a
+
+EXPORT_MAP := rte_pmd_memif_version.map
+
+LIBABIVER := 1
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+# Experimantal APIs:
+# - rte_intr_callback_unregister_pending
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool
+LDLIBS += -lrte_ethdev -lrte_kvargs
+LDLIBS += -lrte_hash
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += rte_eth_memif.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif_socket.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/memif/memif.h b/drivers/net/memif/memif.h
new file mode 100644
index 000000000..3948b1f50
--- /dev/null
+++ b/drivers/net/memif/memif.h
@@ -0,0 +1,179 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_H_
+#define _MEMIF_H_
+
+#define MEMIF_COOKIE		0x3E31F20
+#define MEMIF_VERSION_MAJOR	2
+#define MEMIF_VERSION_MINOR	0
+#define MEMIF_VERSION		((MEMIF_VERSION_MAJOR << 8) | MEMIF_VERSION_MINOR)
+#define MEMIF_NAME_SZ		32
+
+/*
+ * S2M: direction slave -> master
+ * M2S: direction master -> slave
+ */
+
+/*
+ *  Type definitions
+ */
+
+typedef enum memif_msg_type {
+	MEMIF_MSG_TYPE_NONE,
+	MEMIF_MSG_TYPE_ACK,
+	MEMIF_MSG_TYPE_HELLO,
+	MEMIF_MSG_TYPE_INIT,
+	MEMIF_MSG_TYPE_ADD_REGION,
+	MEMIF_MSG_TYPE_ADD_RING,
+	MEMIF_MSG_TYPE_CONNECT,
+	MEMIF_MSG_TYPE_CONNECTED,
+	MEMIF_MSG_TYPE_DISCONNECT,
+} memif_msg_type_t;
+
+typedef enum {
+	MEMIF_RING_S2M, /**< buffer ring in direction slave -> master */
+	MEMIF_RING_M2S, /**< buffer ring in direction master -> slave */
+} memif_ring_type_t;
+
+typedef enum {
+	MEMIF_INTERFACE_MODE_ETHERNET,
+	MEMIF_INTERFACE_MODE_IP,
+	MEMIF_INTERFACE_MODE_PUNT_INJECT,
+} memif_interface_mode_t;
+
+typedef uint16_t memif_region_index_t;
+typedef uint32_t memif_region_offset_t;
+typedef uint64_t memif_region_size_t;
+typedef uint16_t memif_ring_index_t;
+typedef uint32_t memif_interface_id_t;
+typedef uint16_t memif_version_t;
+typedef uint8_t memif_log2_ring_size_t;
+
+/*
+ *  Socket messages
+ */
+
+ /**
+  * M2S
+  * Contains master interfaces configuration.
+  */
+typedef struct __rte_packed {
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+	memif_version_t min_version; /**< lowest supported memif version */
+	memif_version_t max_version; /**< highest supported memif version */
+	memif_region_index_t max_region; /**< maximum num of regions */
+	memif_ring_index_t max_m2s_ring; /**< maximum num of M2S ring */
+	memif_ring_index_t max_s2m_ring; /**< maximum num of S2M rings */
+	memif_log2_ring_size_t max_log2_ring_size; /**< maximum ring size (as log2) */
+} memif_msg_hello_t;
+
+/**
+ * S2M
+ * Contains information required to identify interface
+ * to which the slave wants to connect.
+ */
+typedef struct __rte_packed {
+	memif_version_t version;		/**< memif version */
+	memif_interface_id_t id;		/**< interface id */
+	memif_interface_mode_t mode:8;		/**< interface mode */
+	uint8_t secret[24];			/**< optional security parameter */
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+} memif_msg_init_t;
+
+/**
+ * S2M
+ * Request master to add new shared memory region to master interface.
+ * Shared files file descriptor is passed in cmsghdr.
+ */
+typedef struct __rte_packed {
+	memif_region_index_t index;		/**< shm regions index */
+	memif_region_size_t size;		/**< shm region size */
+} memif_msg_add_region_t;
+
+/**
+ * S2M
+ * Request master to add new ring to master interface.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_MSG_ADD_RING_FLAG_S2M 1		/**< ring is in S2M direction */
+	memif_ring_index_t index;		/**< ring index */
+	memif_region_index_t region; /**< region index on which this ring is located */
+	memif_region_offset_t offset;		/**< buffer start offset */
+	memif_log2_ring_size_t log2_ring_size;	/**< ring size (log2) */
+	uint16_t private_hdr_size;		/**< used for private metadata */
+} memif_msg_add_ring_t;
+
+/**
+ * S2M
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< slave interface name */
+} memif_msg_connect_t;
+
+/**
+ * M2S
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< master interface name */
+} memif_msg_connected_t;
+
+/**
+ * S2M & M2S
+ * Disconnect interfaces.
+ */
+typedef struct __rte_packed {
+	uint32_t code;				/**< error code */
+	uint8_t string[96];			/**< disconnect reason */
+} memif_msg_disconnect_t;
+
+typedef struct __rte_packed __rte_aligned(128)
+{
+	memif_msg_type_t type:16;
+	union {
+		memif_msg_hello_t hello;
+		memif_msg_init_t init;
+		memif_msg_add_region_t add_region;
+		memif_msg_add_ring_t add_ring;
+		memif_msg_connect_t connect;
+		memif_msg_connected_t connected;
+		memif_msg_disconnect_t disconnect;
+	};
+} memif_msg_t;
+
+/*
+ *  Ring and Descriptor Layout
+ */
+
+/**
+ * Buffer descriptor.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_DESC_FLAG_NEXT 1			/**< is chained buffer */
+	memif_region_index_t region; /**< region index on which the buffer is located */
+	uint32_t length;			/**< buffer length */
+	memif_region_offset_t offset;		/**< buffer offset */
+	uint32_t metadata;
+} memif_desc_t;
+
+#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
+	uint8_t mark[0] __rte_aligned(RTE_CACHE_LINE_SIZE)
+
+typedef struct {
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline0);
+	uint32_t cookie;			/**< MEMIF_COOKIE */
+	uint16_t flags;				/**< flags */
+#define MEMIF_RING_FLAG_MASK_INT 1		/**< disable interrupt mode */
+	volatile uint16_t head;			/**< pointer to ring buffer head */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline1);
+	volatile uint16_t tail;			/**< pointer to ring buffer tail */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline2);
+	memif_desc_t desc[0];			/**< buffer descriptors */
+} memif_ring_t;
+
+#endif				/* _MEMIF_H_ */
diff --git a/drivers/net/memif/memif_socket.c b/drivers/net/memif/memif_socket.c
new file mode 100644
index 000000000..81bb17224
--- /dev/null
+++ b/drivers/net/memif/memif_socket.c
@@ -0,0 +1,1103 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <errno.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_hash.h>
+#include <rte_jhash.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+static void memif_intr_handler(void *arg);
+
+static ssize_t
+memif_msg_send(int fd, memif_msg_t *msg, int afd)
+{
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	struct cmsghdr *cmsg;
+	char ctl[CMSG_SPACE(sizeof(int))];
+
+	iov[0].iov_base = msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+
+	if (afd > 0) {
+		memset(&ctl, 0, sizeof(ctl));
+		mh.msg_control = ctl;
+		mh.msg_controllen = sizeof(ctl);
+		cmsg = CMSG_FIRSTHDR(&mh);
+		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+		cmsg->cmsg_level = SOL_SOCKET;
+		cmsg->cmsg_type = SCM_RIGHTS;
+		rte_memcpy(CMSG_DATA(cmsg), &afd, sizeof(int));
+	}
+
+	return sendmsg(fd, &mh, 0);
+}
+
+static int
+memif_msg_send_from_queue(struct memif_control_channel *cc)
+{
+	ssize_t size;
+	int ret = 0;
+	struct memif_msg_queue_elt *e;
+
+	e = TAILQ_FIRST(&cc->msg_queue);
+	if (e == NULL)
+		return 0;
+
+	size = memif_msg_send(cc->intr_handle.fd, &e->msg, e->fd);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(ERR, "sendmsg fail: %s.", strerror(errno));
+		ret = -1;
+	} else {
+		MIF_LOG(DEBUG, "Sent msg type %u.", e->msg.type);
+	}
+	TAILQ_REMOVE(&cc->msg_queue, e, next);
+	rte_free(e);
+
+	return ret;
+}
+
+static struct memif_msg_queue_elt *
+memif_msg_enq(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e;
+
+	e = rte_zmalloc("memif_msg", sizeof(struct memif_msg_queue_elt), 0);
+	if (e == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control message.");
+		return NULL;
+	}
+
+	e->fd = -1;
+	TAILQ_INSERT_TAIL(&cc->msg_queue, e, next);
+
+	return e;
+}
+
+void
+memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			 int err_code)
+{
+	struct memif_msg_queue_elt *e;
+	struct pmd_internals *pmd;
+	memif_msg_disconnect_t *d;
+
+	if (cc == NULL) {
+		MIF_LOG(DEBUG, "Missing control channel.");
+		return;
+	}
+
+	e = memif_msg_enq(cc);
+	if (e == NULL) {
+		MIF_LOG(WARNING, "Failed to enqueue disconnect message.");
+		return;
+	}
+
+	d = &e->msg.disconnect;
+
+	e->msg.type = MEMIF_MSG_TYPE_DISCONNECT;
+	d->code = err_code;
+
+	if (reason != NULL) {
+		strlcpy((char *)d->string, reason, sizeof(d->string));
+		if (cc->dev != NULL) {
+			pmd = cc->dev->data->dev_private;
+			strlcpy(pmd->local_disc_string, reason,
+				sizeof(pmd->local_disc_string));
+		}
+	}
+}
+
+static int
+memif_msg_enq_hello(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	memif_msg_hello_t *h;
+
+	if (e == NULL)
+		return -1;
+
+	h = &e->msg.hello;
+
+	e->msg.type = MEMIF_MSG_TYPE_HELLO;
+	h->min_version = MEMIF_VERSION;
+	h->max_version = MEMIF_VERSION;
+	h->max_s2m_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_m2s_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_region = ETH_MEMIF_MAX_REGION_NUM - 1;
+	h->max_log2_ring_size = ETH_MEMIF_MAX_LOG2_RING_SIZE;
+
+	strlcpy((char *)h->name, rte_version(), sizeof(h->name));
+
+	return 0;
+}
+
+static int
+memif_msg_receive_hello(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_hello_t *h = &msg->hello;
+
+	if (h->min_version > MEMIF_VERSION || h->max_version < MEMIF_VERSION) {
+		memif_msg_enq_disconnect(pmd->cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	/* Set parameters for active connection */
+	pmd->run.num_s2m_rings = RTE_MIN(h->max_s2m_ring + 1,
+					   pmd->cfg.num_s2m_rings);
+	pmd->run.num_m2s_rings = RTE_MIN(h->max_m2s_ring + 1,
+					   pmd->cfg.num_m2s_rings);
+	pmd->run.log2_ring_size = RTE_MIN(h->max_log2_ring_size,
+					    pmd->cfg.log2_ring_size);
+	pmd->run.pkt_buffer_size = pmd->cfg.pkt_buffer_size;
+
+	strlcpy(pmd->remote_name, (char *)h->name, sizeof(pmd->remote_name));
+
+	MIF_LOG(DEBUG, "%s: Connecting to %s.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_init(struct memif_control_channel *cc, memif_msg_t *msg)
+{
+	memif_msg_init_t *i = &msg->init;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *pmd;
+	struct rte_eth_dev *dev;
+
+	if (i->version != MEMIF_VERSION) {
+		memif_msg_enq_disconnect(cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	if (cc->socket == NULL) {
+		memif_msg_enq_disconnect(cc, "Device error", 0);
+		return -1;
+	}
+
+	/* Find device with requested ID */
+	TAILQ_FOREACH(elt, &cc->socket->dev_queue, next) {
+		dev = elt->dev;
+		pmd = dev->data->dev_private;
+		if (((pmd->flags & ETH_MEMIF_FLAG_DISABLED) == 0) &&
+		    pmd->id == i->id) {
+			/* assign control channel to device */
+			cc->dev = dev;
+			pmd->cc = cc;
+
+			if (i->mode != MEMIF_INTERFACE_MODE_ETHERNET) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Only ethernet mode supported",
+							 0);
+				return -1;
+			}
+
+			if (pmd->flags & (ETH_MEMIF_FLAG_CONNECTING |
+					   ETH_MEMIF_FLAG_CONNECTED)) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Already connected", 0);
+				return -1;
+			}
+			strlcpy(pmd->remote_name, (char *)i->name,
+				sizeof(pmd->remote_name));
+
+			if (*pmd->secret != '\0') {
+				if (*i->secret == '\0') {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Secret required", 0);
+					return -1;
+				}
+				if (strcmp(pmd->secret, (char *)i->secret) != 0) {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Incorrect secret", 0);
+					return -1;
+				}
+			}
+
+			pmd->flags |= ETH_MEMIF_FLAG_CONNECTING;
+			return 0;
+		}
+	}
+
+	/* ID not found on this socket */
+	MIF_LOG(DEBUG, "ID %u not found.", i->id);
+	memif_msg_enq_disconnect(cc, "ID not found", 0);
+	return -1;
+}
+
+static int
+memif_msg_receive_add_region(struct rte_eth_dev *dev, memif_msg_t *msg,
+			     int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_region_t *ar = &msg->add_region;
+	struct memif_region *r;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing region fd", 0);
+		return -1;
+	}
+
+	if (ar->index >= ETH_MEMIF_MAX_REGION_NUM || ar->index != pmd->regions_num ||
+			pmd->regions[ar->index] != NULL) {
+		memif_msg_enq_disconnect(pmd->cc, "Invalid region index", 0);
+		return -1;
+	}
+
+	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	r->fd = fd;
+	r->region_size = ar->size;
+	r->addr = NULL;
+
+	pmd->regions[ar->index] = r;
+	pmd->regions_num++;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_add_ring(struct rte_eth_dev *dev, memif_msg_t *msg, int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_ring_t *ar = &msg->add_ring;
+	struct memif_queue *mq;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing interrupt fd", 0);
+		return -1;
+	}
+
+	/* check if we have enough queues */
+	if (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) {
+		if (ar->index >= pmd->cfg.num_s2m_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_s2m_rings++;
+	} else {
+		if (ar->index >= pmd->cfg.num_m2s_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_m2s_rings++;
+	}
+
+	mq = (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) ?
+	    dev->data->rx_queues[ar->index] : dev->data->tx_queues[ar->index];
+
+	mq->intr_handle.fd = fd;
+	mq->log2_ring_size = ar->log2_ring_size;
+	mq->region = ar->region;
+	mq->ring_offset = ar->offset;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connect_t *c = &msg->connect;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connected(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connected_t *c = &msg->connected;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_disconnect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_disconnect_t *d = &msg->disconnect;
+
+	memset(pmd->remote_disc_string, 0, sizeof(pmd->remote_disc_string));
+	strlcpy(pmd->remote_disc_string, (char *)d->string,
+		sizeof(pmd->remote_disc_string));
+
+	MIF_LOG(INFO, "%s: Disconnect received: %s",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_disc_string);
+
+	memset(pmd->local_disc_string, 0, 96);
+	memif_disconnect(rte_eth_dev_allocated
+			 (rte_vdev_device_name(pmd->vdev)));
+	return 0;
+}
+
+static int
+memif_msg_enq_ack(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	e->msg.type = MEMIF_MSG_TYPE_ACK;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_init(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_init_t *i = &e->msg.init;
+
+	if (e == NULL)
+		return -1;
+
+	i = &e->msg.init;
+	e->msg.type = MEMIF_MSG_TYPE_INIT;
+	i->version = MEMIF_VERSION;
+	i->id = pmd->id;
+	i->mode = MEMIF_INTERFACE_MODE_ETHERNET;
+
+	strlcpy((char *)i->name, rte_version(), sizeof(i->name));
+
+	if (*pmd->secret != '\0')
+		strlcpy((char *)i->secret, pmd->secret, sizeof(i->secret));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_region(struct rte_eth_dev *dev, uint8_t idx)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_add_region_t *ar;
+	struct memif_region *mr = pmd->regions[idx];
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_region;
+	e->msg.type = MEMIF_MSG_TYPE_ADD_REGION;
+	e->fd = mr->fd;
+	ar->index = idx;
+	ar->size = mr->region_size;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_ring(struct rte_eth_dev *dev, uint8_t idx,
+		       memif_ring_type_t type)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	struct memif_queue *mq;
+	memif_msg_add_ring_t *ar;
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_ring;
+	mq = (type == MEMIF_RING_S2M) ? dev->data->tx_queues[idx] :
+	    dev->data->rx_queues[idx];
+
+	e->msg.type = MEMIF_MSG_TYPE_ADD_RING;
+	e->fd = mq->intr_handle.fd;
+	ar->index = idx;
+	ar->offset = mq->ring_offset;
+	ar->region = mq->region;
+	ar->log2_ring_size = mq->log2_ring_size;
+	ar->flags = (type == MEMIF_RING_S2M) ? MEMIF_MSG_ADD_RING_FLAG_S2M : 0;
+	ar->private_hdr_size = 0;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connect_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connect;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECT;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connected(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connected_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connected;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECTED;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static void
+memif_intr_unregister_handler(struct rte_intr_handle *intr_handle, void *arg)
+{
+	struct memif_msg_queue_elt *elt;
+	struct memif_control_channel *cc = arg;
+
+	/* close control channel fd */
+	close(intr_handle->fd);
+	/* clear message queue */
+	while ((elt = TAILQ_FIRST(&cc->msg_queue)) != NULL) {
+		TAILQ_REMOVE(&cc->msg_queue, elt, next);
+		rte_free(elt);
+	}
+	/* free control channel */
+	rte_free(cc);
+}
+
+void
+memif_disconnect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *elt, *next;
+	struct memif_queue *mq;
+	struct rte_intr_handle *ih;
+	int i;
+	int ret;
+
+	if (pmd->cc != NULL) {
+		/* Clear control message queue (except disconnect message if any). */
+		for (elt = TAILQ_FIRST(&pmd->cc->msg_queue); elt != NULL; elt = next) {
+			next = TAILQ_NEXT(elt, next);
+			if (elt->msg.type != MEMIF_MSG_TYPE_DISCONNECT) {
+				TAILQ_REMOVE(&pmd->cc->msg_queue, elt, next);
+				rte_free(elt);
+			}
+		}
+		/* send disconnect message (if there is any in queue) */
+		memif_msg_send_from_queue(pmd->cc);
+
+		/* at this point, there should be no more messages in queue */
+		if (TAILQ_FIRST(&pmd->cc->msg_queue) != NULL) {
+			MIF_LOG(WARNING,
+				"%s: Unexpected message(s) in message queue.",
+				rte_vdev_device_name(pmd->vdev));
+		}
+
+		ih = &pmd->cc->intr_handle;
+		if (ih->fd > 0) {
+			ret = rte_intr_callback_unregister(ih,
+							memif_intr_handler,
+							pmd->cc);
+			/*
+			 * If callback is active (disconnecting based on
+			 * received control message).
+			 */
+			if (ret == -EAGAIN) {
+				ret = rte_intr_callback_unregister_pending(ih,
+							memif_intr_handler,
+							pmd->cc,
+							memif_intr_unregister_handler);
+			} else if (ret > 0) {
+				close(ih->fd);
+				rte_free(pmd->cc);
+			}
+			pmd->cc = NULL;
+			if (ret <= 0)
+				MIF_LOG(WARNING, "%s: Failed to unregister "
+					"control channel callback.",
+					rte_vdev_device_name(pmd->vdev));
+		}
+	}
+
+	/* unconfig interrupts */
+	for (i = 0; i < pmd->cfg.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+	for (i = 0; i < pmd->cfg.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		if (mq->intr_handle.fd > 0) {
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+
+	memif_free_regions(pmd);
+
+	/* reset connection configuration */
+	memset(&pmd->run, 0, sizeof(pmd->run));
+
+	dev->data->dev_link.link_status = ETH_LINK_DOWN;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTED;
+	MIF_LOG(DEBUG, "%s: Disconnected.", rte_vdev_device_name(pmd->vdev));
+}
+
+static int
+memif_msg_receive(struct memif_control_channel *cc)
+{
+	char ctl[CMSG_SPACE(sizeof(int)) +
+		 CMSG_SPACE(sizeof(struct ucred))] = { 0 };
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	memif_msg_t msg = { 0 };
+	ssize_t size;
+	int ret = 0;
+	struct ucred *cr __rte_unused;
+	cr = 0;
+	struct cmsghdr *cmsg;
+	int afd = -1;
+	int i;
+	struct pmd_internals *pmd;
+
+	iov[0].iov_base = (void *)&msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+	mh.msg_control = ctl;
+	mh.msg_controllen = sizeof(ctl);
+
+	size = recvmsg(cc->intr_handle.fd, &mh, 0);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(DEBUG, "Invalid message size.");
+		memif_msg_enq_disconnect(cc, "Invalid message size", 0);
+		return -1;
+	}
+	MIF_LOG(DEBUG, "Received msg type: %u.", msg.type);
+
+	cmsg = CMSG_FIRSTHDR(&mh);
+	while (cmsg) {
+		if (cmsg->cmsg_level == SOL_SOCKET) {
+			if (cmsg->cmsg_type == SCM_CREDENTIALS)
+				cr = (struct ucred *)CMSG_DATA(cmsg);
+			else if (cmsg->cmsg_type == SCM_RIGHTS)
+				afd = *(int *)CMSG_DATA(cmsg);
+		}
+		cmsg = CMSG_NXTHDR(&mh, cmsg);
+	}
+
+	if (cc->dev == NULL && msg.type != MEMIF_MSG_TYPE_INIT) {
+		MIF_LOG(DEBUG, "Unexpected message.");
+		memif_msg_enq_disconnect(cc, "Unexpected message", 0);
+		return -1;
+	}
+
+	/* get device from hash data */
+	switch (msg.type) {
+	case MEMIF_MSG_TYPE_ACK:
+		break;
+	case MEMIF_MSG_TYPE_HELLO:
+		ret = memif_msg_receive_hello(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_init_regions_and_queues(cc->dev);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_init(cc->dev);
+		if (ret < 0)
+			goto exit;
+		pmd = cc->dev->data->dev_private;
+		for (i = 0; i < pmd->regions_num; i++) {
+			ret = memif_msg_enq_add_region(cc->dev, i);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_S2M);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_M2S);
+			if (ret < 0)
+				goto exit;
+		}
+		ret = memif_msg_enq_connect(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_INIT:
+		/*
+		 * This cc does not have an interface asociated with it.
+		 * If suitable interface is found it will be assigned here.
+		 */
+		ret = memif_msg_receive_init(cc, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_REGION:
+		ret = memif_msg_receive_add_region(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_RING:
+		ret = memif_msg_receive_add_ring(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECT:
+		ret = memif_msg_receive_connect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_connected(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECTED:
+		ret = memif_msg_receive_connected(cc->dev, &msg);
+		break;
+	case MEMIF_MSG_TYPE_DISCONNECT:
+		ret = memif_msg_receive_disconnect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		break;
+	default:
+		memif_msg_enq_disconnect(cc, "Unknown message type", 0);
+		ret = -1;
+		goto exit;
+	}
+
+ exit:
+	return ret;
+}
+
+static void
+memif_intr_handler(void *arg)
+{
+	struct memif_control_channel *cc = arg;
+	int ret;
+
+	ret = memif_msg_receive(cc);
+	/* if driver failed to assign device */
+	if (cc->dev == NULL) {
+		ret = rte_intr_callback_unregister_pending(&cc->intr_handle,
+							   memif_intr_handler,
+							   cc,
+							   memif_intr_unregister_handler);
+		if (ret < 0)
+			MIF_LOG(WARNING,
+				"Failed to unregister control channel callback.");
+		return;
+	}
+	/* if memif_msg_receive failed */
+	if (ret < 0)
+		goto disconnect;
+
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto disconnect;
+
+	return;
+
+ disconnect:
+	if (cc->dev == NULL) {
+		MIF_LOG(WARNING, "eth dev not allocated");
+		return;
+	}
+	memif_disconnect(cc->dev);
+}
+
+static void
+memif_listener_handler(void *arg)
+{
+	struct memif_socket *socket = arg;
+	int sockfd;
+	int addr_len;
+	struct sockaddr_un client;
+	struct memif_control_channel *cc;
+	int ret;
+
+	addr_len = sizeof(client);
+	sockfd = accept(socket->intr_handle.fd, (struct sockaddr *)&client,
+			(socklen_t *)&addr_len);
+	if (sockfd < 0) {
+		MIF_LOG(ERR,
+			"Failed to accept connection request on socket fd %d",
+			socket->intr_handle.fd);
+		return;
+	}
+
+	MIF_LOG(DEBUG, "%s: Connection request accepted.", socket->filename);
+
+	cc = rte_zmalloc("memif-cc", sizeof(struct memif_control_channel), 0);
+	if (cc == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control channel.");
+		goto error;
+	}
+
+	cc->intr_handle.fd = sockfd;
+	cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	cc->socket = socket;
+	cc->dev = NULL;
+	TAILQ_INIT(&cc->msg_queue);
+
+	ret = rte_intr_callback_register(&cc->intr_handle, memif_intr_handler, cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to register control channel callback.");
+		goto error;
+	}
+
+	ret = memif_msg_enq_hello(cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to enqueue hello message.");
+		goto error;
+	}
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto error;
+
+	return;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (cc != NULL)
+		rte_free(cc);
+}
+
+static struct memif_socket *
+memif_socket_create(struct pmd_internals *pmd, char *key, uint8_t listener)
+{
+	struct memif_socket *sock;
+	struct sockaddr_un un;
+	int sockfd;
+	int ret;
+	int on = 1;
+
+	sock = rte_zmalloc("memif-socket", sizeof(struct memif_socket), 0);
+	if (sock == NULL) {
+		MIF_LOG(ERR, "Failed to allocate memory for memif socket");
+		return NULL;
+	}
+
+	sock->listener = listener;
+	rte_memcpy(sock->filename, key, 256);
+	TAILQ_INIT(&sock->dev_queue);
+
+	if (listener != 0) {
+		sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+		if (sockfd < 0)
+			goto error;
+
+		un.sun_family = AF_UNIX;
+		memcpy(un.sun_path, sock->filename,
+			sizeof(un.sun_path) - 1);
+
+		ret = setsockopt(sockfd, SOL_SOCKET, SO_PASSCRED, &on,
+				 sizeof(on));
+		if (ret < 0)
+			goto error;
+		ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
+		if (ret < 0)
+			goto error;
+		ret = listen(sockfd, 1);
+		if (ret < 0)
+			goto error;
+
+		MIF_LOG(DEBUG, "%s: Memif listener socket %s created.",
+			rte_vdev_device_name(pmd->vdev), sock->filename);
+
+		sock->intr_handle.fd = sockfd;
+		sock->intr_handle.type = RTE_INTR_HANDLE_EXT;
+		ret = rte_intr_callback_register(&sock->intr_handle,
+						 memif_listener_handler, sock);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to register interrupt "
+				"callback for listener socket",
+				rte_vdev_device_name(pmd->vdev));
+			return NULL;
+		}
+	}
+
+	return sock;
+
+ error:
+	MIF_LOG(ERR, "%s: Failed to setup socket %s: %s",
+		rte_vdev_device_name(pmd->vdev), key, strerror(errno));
+	if (sock != NULL)
+		rte_free(sock);
+	return NULL;
+}
+
+static struct rte_hash *
+memif_create_socket_hash(void)
+{
+	struct rte_hash_parameters params = { 0 };
+	params.name = MEMIF_SOCKET_HASH_NAME;
+	params.entries = 256;
+	params.key_len = 256;
+	params.hash_func = rte_jhash;
+	params.hash_func_init_val = 0;
+	return rte_hash_create(&params);
+}
+
+int
+memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *tmp_pmd;
+	struct rte_hash *hash;
+	int ret;
+	char key[256];
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL) {
+		hash = memif_create_socket_hash();
+		if (hash == NULL) {
+			MIF_LOG(ERR, "Failed to create memif socket hash.");
+			return -1;
+		}
+	}
+
+	memset(key, 0, 256);
+	rte_memcpy(key, socket_filename, strlen(socket_filename));
+	ret = rte_hash_lookup_data(hash, key, (void **)&socket);
+	if (ret < 0) {
+		socket = memif_socket_create(pmd, key,
+					     (pmd->role ==
+					      MEMIF_ROLE_SLAVE) ? 0 : 1);
+		if (socket == NULL)
+			return -1;
+		ret = rte_hash_add_key_data(hash, key, socket);
+		if (ret < 0) {
+			MIF_LOG(ERR, "Failed to add socket to socket hash.");
+			return ret;
+		}
+	}
+	pmd->socket_filename = socket->filename;
+
+	if (socket->listener != 0 && pmd->role == MEMIF_ROLE_SLAVE) {
+		MIF_LOG(ERR, "Socket is a listener.");
+		return -1;
+	} else if ((socket->listener == 0) && (pmd->role == MEMIF_ROLE_MASTER)) {
+		MIF_LOG(ERR, "Socket is not a listener.");
+		return -1;
+	}
+
+	TAILQ_FOREACH(elt, &socket->dev_queue, next) {
+		tmp_pmd = elt->dev->data->dev_private;
+		if (tmp_pmd->id == pmd->id) {
+			MIF_LOG(ERR, "Memif device with id %d already "
+				"exists on socket %s",
+				pmd->id, socket->filename);
+			return -1;
+		}
+	}
+
+	elt = rte_malloc("pmd-queue", sizeof(struct memif_socket_dev_list_elt), 0);
+	if (elt == NULL) {
+		MIF_LOG(ERR, "%s: Failed to add device to socket device list.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+	elt->dev = dev;
+	TAILQ_INSERT_TAIL(&socket->dev_queue, elt, next);
+
+	return 0;
+}
+
+void
+memif_socket_remove_device(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt, *next;
+	struct rte_hash *hash;
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL)
+		return;
+
+	if (rte_hash_lookup_data(hash, pmd->socket_filename, (void **)&socket) <
+	    0)
+		return;
+
+	for (elt = TAILQ_FIRST(&socket->dev_queue); elt != NULL; elt = next) {
+		next = TAILQ_NEXT(elt, next);
+		if (elt->dev == dev) {
+			TAILQ_REMOVE(&socket->dev_queue, elt, next);
+			rte_free(elt);
+			pmd->socket_filename = NULL;
+		}
+	}
+
+	/* remove socket, if this was the last device using it */
+	if (TAILQ_EMPTY(&socket->dev_queue)) {
+		rte_hash_del_key(hash, socket->filename);
+		if (socket->listener) {
+			/* remove listener socket file,
+			 * so we can create new one later.
+			 */
+			remove(socket->filename);
+		}
+		rte_free(socket);
+	}
+}
+
+int
+memif_connect_master(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+	return 0;
+}
+
+int
+memif_connect_slave(struct rte_eth_dev *dev)
+{
+	int sockfd;
+	int ret;
+	struct sockaddr_un sun;
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, 96);
+	memset(pmd->remote_disc_string, 0, 96);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+
+	sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+	if (sockfd < 0) {
+		MIF_LOG(ERR, "%s: Failed to open socket.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	sun.sun_family = AF_UNIX;
+
+	memcpy(sun.sun_path, pmd->socket_filename, sizeof(sun.sun_path) - 1);
+
+	ret = connect(sockfd, (struct sockaddr *)&sun,
+		      sizeof(struct sockaddr_un));
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to connect socket: %s.",
+			rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+		goto error;
+	}
+
+	MIF_LOG(DEBUG, "%s: Memif socket: %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+
+	pmd->cc = rte_zmalloc("memif-cc",
+			      sizeof(struct memif_control_channel), 0);
+	if (pmd->cc == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate control channel.",
+			rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	pmd->cc->intr_handle.fd = sockfd;
+	pmd->cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	pmd->cc->socket = NULL;
+	pmd->cc->dev = dev;
+	TAILQ_INIT(&pmd->cc->msg_queue);
+
+	ret = rte_intr_callback_register(&pmd->cc->intr_handle,
+					 memif_intr_handler, pmd->cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to register interrupt callback "
+			"for control fd", rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	return 0;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (pmd->cc != NULL) {
+		rte_free(pmd->cc);
+		pmd->cc = NULL;
+	}
+	return -1;
+}
diff --git a/drivers/net/memif/memif_socket.h b/drivers/net/memif/memif_socket.h
new file mode 100644
index 000000000..2e8970712
--- /dev/null
+++ b/drivers/net/memif/memif_socket.h
@@ -0,0 +1,105 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_SOCKET_H_
+#define _MEMIF_SOCKET_H_
+
+#include <sys/queue.h>
+
+/**
+ * Remove device from socket device list. If no device is left on the socket,
+ * remove the socket as well.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_socket_remove_device(struct rte_eth_dev *dev);
+
+/**
+ * Enqueue disconnect message to control channel message queue.
+ *
+ * @param cc
+ *   control channel
+ * @param reason
+ *   const string stating disconnect reason (96 characters)
+ * @param err_code
+ *   error code
+ */
+void memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			      int err_code);
+
+/**
+ * Initialize memif socket for specified device. If socket doesn't exist, create socket.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @param socket_filename
+ *   socket filename
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename);
+
+/**
+ * Disconnect memif device. Close control channel and shared memory.
+ *
+ * @param dev
+ *   ethernet device
+ */
+void memif_disconnect(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, enable connection establishment.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_master(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, send connection request.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_slave(struct rte_eth_dev *dev);
+
+struct memif_socket_dev_list_elt {
+	TAILQ_ENTRY(memif_socket_dev_list_elt) next;
+	struct rte_eth_dev *dev;		/**< pointer to device internals */
+	char dev_name[RTE_ETH_NAME_MAX_LEN];
+};
+
+#define MEMIF_SOCKET_HASH_NAME			"memif-sh"
+struct memif_socket {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	uint8_t listener;			/**< if not zero socket is listener */
+	char filename[256];			/**< socket filename */
+
+	TAILQ_HEAD(, memif_socket_dev_list_elt) dev_queue;
+	/**< Queue of devices using this socket */
+};
+
+/* Control message queue. */
+struct memif_msg_queue_elt {
+	TAILQ_ENTRY(memif_msg_queue_elt) next;
+	memif_msg_t msg;			/**< control message */
+	int fd;					/**< fd to be sent to peer */
+};
+
+struct memif_control_channel {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	TAILQ_HEAD(, memif_msg_queue_elt) msg_queue; /**< control message queue */
+	struct memif_socket *socket;		/**< pointer to socket */
+	struct rte_eth_dev *dev;		/**< pointer to device */
+};
+
+#endif				/* MEMIF_SOCKET_H */
diff --git a/drivers/net/memif/meson.build b/drivers/net/memif/meson.build
new file mode 100644
index 000000000..4dfe37d3a
--- /dev/null
+++ b/drivers/net/memif/meson.build
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+if host_machine.system() != 'linux'
+        build = false
+endif
+
+sources = files('rte_eth_memif.c',
+		'memif_socket.c')
+
+allow_experimental_apis = true
+
+deps += ['hash']
diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
new file mode 100644
index 000000000..7e1950b4d
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.c
@@ -0,0 +1,1194 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <linux/if_ether.h>
+#include <errno.h>
+#include <sys/eventfd.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+#define ETH_MEMIF_ID_ARG		"id"
+#define ETH_MEMIF_ROLE_ARG		"role"
+#define ETH_MEMIF_PKT_BUFFER_SIZE_ARG	"bsize"
+#define ETH_MEMIF_RING_SIZE_ARG		"rsize"
+#define ETH_MEMIF_SOCKET_ARG		"socket"
+#define ETH_MEMIF_MAC_ARG		"mac"
+#define ETH_MEMIF_ZC_ARG		"zero-copy"
+#define ETH_MEMIF_SECRET_ARG		"secret"
+
+static const char *valid_arguments[] = {
+	ETH_MEMIF_ID_ARG,
+	ETH_MEMIF_ROLE_ARG,
+	ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
+	ETH_MEMIF_RING_SIZE_ARG,
+	ETH_MEMIF_SOCKET_ARG,
+	ETH_MEMIF_MAC_ARG,
+	ETH_MEMIF_ZC_ARG,
+	ETH_MEMIF_SECRET_ARG,
+	NULL
+};
+
+const char *
+memif_version(void)
+{
+	return ("memif-" RTE_STR(MEMIF_VERSION_MAJOR) "." RTE_STR(MEMIF_VERSION_MINOR));
+}
+
+static void
+memif_dev_info(struct rte_eth_dev *dev __rte_unused, struct rte_eth_dev_info *dev_info)
+{
+	dev_info->max_mac_addrs = 1;
+	dev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;
+	dev_info->max_rx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->max_tx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->min_rx_bufsize = 0;
+}
+
+static memif_ring_t *
+memif_get_ring(struct pmd_internals *pmd, memif_ring_type_t type, uint16_t ring_num)
+{
+	/* rings only in region 0 */
+	void *p = pmd->regions[0]->addr;
+	int ring_size = sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size);
+
+	p = (uint8_t *)p + (ring_num + type * pmd->run.num_s2m_rings) * ring_size;
+
+	return (memif_ring_t *)p;
+}
+
+static void *
+memif_get_buffer(struct pmd_internals *pmd, memif_desc_t *d)
+{
+	return ((uint8_t *)pmd->regions[d->region]->addr + d->offset);
+}
+
+static int
+memif_pktmbuf_chain(struct rte_mbuf *head, struct rte_mbuf *cur_tail,
+		    struct rte_mbuf *tail)
+{
+	/* Check for number-of-segments-overflow */
+	if (unlikely(head->nb_segs + tail->nb_segs > RTE_MBUF_MAX_NB_SEGS))
+		return -EOVERFLOW;
+
+	/* Chain 'tail' onto the old tail */
+	cur_tail->next = tail;
+
+	/* accumulate number of segments and total length. */
+	head->nb_segs = (uint16_t)(head->nb_segs + tail->nb_segs);
+
+	tail->pkt_len = tail->data_len;
+	head->pkt_len += tail->pkt_len;
+
+	return 0;
+}
+
+static uint16_t
+eth_memif_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t cur_slot, last_slot, n_slots, ring_size, mask, s0;
+	uint16_t n_rx_pkts = 0;
+	uint16_t mbuf_size = rte_pktmbuf_data_room_size(mq->mempool) -
+		RTE_PKTMBUF_HEADROOM;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf, *mbuf_head, *mbuf_tail;
+	uint64_t b;
+	ssize_t size __rte_unused;
+	uint16_t head;
+	int ret;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	/* consume interrupt */
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0)
+		size = read(mq->intr_handle.fd, &b, sizeof(b));
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	cur_slot = (type == MEMIF_RING_S2M) ? mq->last_head : mq->last_tail;
+	last_slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+	if (cur_slot == last_slot)
+		goto refill;
+	n_slots = last_slot - cur_slot;
+
+	while (n_slots && n_rx_pkts < nb_pkts) {
+		mbuf_head = rte_pktmbuf_alloc(mq->mempool);
+		if (unlikely(mbuf_head == NULL))
+			goto no_free_bufs;
+		mbuf = mbuf_head;
+		mbuf->port = mq->in_port;
+
+next_slot:
+		s0 = cur_slot & mask;
+		d0 = &ring->desc[s0];
+
+		src_len = d0->length;
+		dst_off = 0;
+		src_off = 0;
+
+		do {
+			dst_len = mbuf_size - dst_off;
+			if (dst_len == 0) {
+				dst_off = 0;
+				dst_len = mbuf_size;
+
+				/* store pointer to tail */
+				mbuf_tail = mbuf;
+				mbuf = rte_pktmbuf_alloc(mq->mempool);
+				if (unlikely(mbuf == NULL))
+					goto no_free_bufs;
+				mbuf->port = mq->in_port;
+				ret = memif_pktmbuf_chain(mbuf_head, mbuf_tail, mbuf);
+				if (unlikely(ret < 0)) {
+					MIF_LOG(ERR, "%s: number-of-segments-overflow",
+						rte_vdev_device_name(pmd->vdev));
+					rte_pktmbuf_free(mbuf);
+					goto no_free_bufs;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			rte_pktmbuf_data_len(mbuf) += cp_len;
+			rte_pktmbuf_pkt_len(mbuf) = rte_pktmbuf_data_len(mbuf);
+			if (mbuf != mbuf_head)
+				rte_pktmbuf_pkt_len(mbuf_head) += cp_len;
+
+			memcpy(rte_pktmbuf_mtod_offset(mbuf, void *, dst_off),
+			       (uint8_t *)memif_get_buffer(pmd, d0) + src_off, cp_len);
+
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+		} while (src_len);
+
+		cur_slot++;
+		n_slots--;
+
+		if (d0->flags & MEMIF_DESC_FLAG_NEXT)
+			goto next_slot;
+
+		mq->n_bytes += rte_pktmbuf_pkt_len(mbuf_head);
+		*bufs++ = mbuf_head;
+		n_rx_pkts++;
+	}
+
+no_free_bufs:
+	if (type == MEMIF_RING_S2M) {
+		rte_mb();
+		ring->tail = cur_slot;
+		mq->last_head = cur_slot;
+	} else {
+		mq->last_tail = cur_slot;
+	}
+
+refill:
+	if (type == MEMIF_RING_M2S) {
+		head = ring->head;
+		n_slots = ring_size - head + mq->last_tail;
+
+		while (n_slots--) {
+			s0 = head++ & mask;
+			d0 = &ring->desc[s0];
+			d0->length = pmd->run.pkt_buffer_size;
+		}
+		rte_mb();
+		ring->head = head;
+	}
+
+	mq->n_pkts += n_rx_pkts;
+	return n_rx_pkts;
+}
+
+static uint16_t
+eth_memif_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t slot, saved_slot, n_free, ring_size, mask, n_tx_pkts = 0;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head;
+	uint64_t a;
+	ssize_t size;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	n_free = ring->tail - mq->last_tail;
+	mq->last_tail += n_free;
+	slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+
+	if (type == MEMIF_RING_S2M)
+		n_free = ring_size - ring->head + mq->last_tail;
+	else
+		n_free = ring->head - ring->tail;
+
+	while (n_tx_pkts < nb_pkts && n_free) {
+		mbuf_head = *bufs++;
+		mbuf = mbuf_head;
+
+		saved_slot = slot;
+		d0 = &ring->desc[slot & mask];
+		dst_off = 0;
+		dst_len = (type == MEMIF_RING_S2M) ?
+			pmd->run.pkt_buffer_size : d0->length;
+
+next_in_chain:
+		src_off = 0;
+		src_len = rte_pktmbuf_data_len(mbuf);
+
+		while (src_len) {
+			if (dst_len == 0) {
+				if (n_free) {
+					slot++;
+					n_free--;
+					d0->flags |= MEMIF_DESC_FLAG_NEXT;
+					d0 = &ring->desc[slot & mask];
+					dst_off = 0;
+					dst_len = (type == MEMIF_RING_S2M) ?
+					    pmd->run.pkt_buffer_size : d0->length;
+					d0->flags = 0;
+				} else {
+					slot = saved_slot;
+					goto no_free_slots;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			memcpy((uint8_t *)memif_get_buffer(pmd, d0) + dst_off,
+			       rte_pktmbuf_mtod_offset(mbuf, void *, src_off),
+			       cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+			dst_len -= cp_len;
+
+			d0->length = dst_off;
+		}
+
+		if (rte_pktmbuf_is_contiguous(mbuf) == 0) {
+			mbuf = mbuf->next;
+			goto next_in_chain;
+		}
+
+		n_tx_pkts++;
+		slot++;
+		n_free--;
+		rte_pktmbuf_free(mbuf_head);
+	}
+
+no_free_slots:
+	rte_mb();
+	if (type == MEMIF_RING_S2M)
+		ring->head = slot;
+	else
+		ring->tail = slot;
+
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
+		a = 1;
+		size = write(mq->intr_handle.fd, &a, sizeof(a));
+		if (unlikely(size < 0)) {
+			MIF_LOG(WARNING,
+				"%s: Failed to send interrupt. %s",
+				rte_vdev_device_name(pmd->vdev), strerror(errno));
+		}
+	}
+
+	mq->n_err += nb_pkts - n_tx_pkts;
+	mq->n_pkts += n_tx_pkts;
+	return n_tx_pkts;
+}
+
+void
+memif_free_regions(struct pmd_internals *pmd)
+{
+	int i;
+	struct memif_region *r;
+
+	/* regions are allocated contiguously, so it's
+	 * enough to loop until 'pmd->regions_num'
+	 */
+	for (i = 0; i < pmd->regions_num; i++) {
+		r = pmd->regions[i];
+		if (r != NULL) {
+			if (r->addr != NULL) {
+				munmap(r->addr, r->region_size);
+				if (r->fd > 0) {
+					close(r->fd);
+					r->fd = -1;
+				}
+			}
+			rte_free(r);
+			pmd->regions[i] = NULL;
+		}
+	}
+	pmd->regions_num = 0;
+}
+
+static int
+memif_region_init_shm(struct pmd_internals *pmd, uint8_t has_buffers)
+{
+	char shm_name[ETH_MEMIF_SHM_NAME_SIZE];
+	int ret = 0;
+	struct memif_region *r;
+
+	if (pmd->regions_num >= ETH_MEMIF_MAX_REGION_NUM) {
+		MIF_LOG(ERR, "%s: Too many regions.", rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	/* calculate buffer offset */
+	r->pkt_buffer_offset = (pmd->run.num_s2m_rings + pmd->run.num_m2s_rings) *
+	    (sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size));
+
+	r->region_size = r->pkt_buffer_offset;
+	/* if region has buffers, add buffers size to region_size */
+	if (has_buffers == 1)
+		r->region_size += (uint32_t)(pmd->run.pkt_buffer_size *
+			(1 << pmd->run.log2_ring_size) *
+			(pmd->run.num_s2m_rings +
+			 pmd->run.num_m2s_rings));
+
+	memset(shm_name, 0, sizeof(char) * ETH_MEMIF_SHM_NAME_SIZE);
+	snprintf(shm_name, ETH_MEMIF_SHM_NAME_SIZE, "memif_region_%d",
+		 pmd->regions_num);
+
+	r->fd = memfd_create(shm_name, MFD_ALLOW_SEALING);
+	if (r->fd < 0) {
+		MIF_LOG(ERR, "%s: Failed to create shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		ret = -1;
+		goto error;
+	}
+
+	ret = fcntl(r->fd, F_ADD_SEALS, F_SEAL_SHRINK);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to add seals to shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		goto error;
+	}
+
+	ret = ftruncate(r->fd, r->region_size);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to truncate shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		goto error;
+	}
+
+	r->addr = mmap(NULL, r->region_size, PROT_READ |
+		       PROT_WRITE, MAP_SHARED, r->fd, 0);
+	if (r->addr == MAP_FAILED) {
+		MIF_LOG(ERR, "%s: Failed to mmap shm region: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(ret));
+		ret = -1;
+		goto error;
+	}
+
+	pmd->regions[pmd->regions_num] = r;
+	pmd->regions_num++;
+
+	return ret;
+
+error:
+	if (r->fd > 0)
+		close(r->fd);
+	r->fd = -1;
+
+	return ret;
+}
+
+static int
+memif_regions_init(struct pmd_internals *pmd)
+{
+	int ret;
+
+	/* create one buffer region */
+	ret = memif_region_init_shm(pmd, /* has buffer */ 1);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+static void
+memif_init_rings(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_ring_t *ring;
+	int i, j;
+	uint16_t slot;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = i * (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 0;
+			ring->desc[j].offset = pmd->regions[0]->pkt_buffer_offset +
+				(uint32_t)(slot * pmd->run.pkt_buffer_size);
+			ring->desc[j].length = pmd->run.pkt_buffer_size;
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = (i + pmd->run.num_s2m_rings) *
+			    (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 0;
+			ring->desc[j].offset = pmd->regions[0]->pkt_buffer_offset +
+				(uint32_t)(slot * pmd->run.pkt_buffer_size);
+			ring->desc[j].length = pmd->run.pkt_buffer_size;
+		}
+	}
+}
+
+/* called only by slave */
+static void
+memif_init_queues(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = dev->data->tx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->ring_offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for tx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = dev->data->rx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->ring_offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for rx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+}
+
+int
+memif_init_regions_and_queues(struct rte_eth_dev *dev)
+{
+	int ret;
+
+	ret = memif_regions_init(dev->data->dev_private);
+	if (ret < 0)
+		return ret;
+
+	memif_init_rings(dev);
+
+	memif_init_queues(dev);
+
+	return 0;
+}
+
+int
+memif_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_region *mr;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		mr = pmd->regions[i];
+		if (mr != NULL) {
+			if (mr->addr == NULL) {
+				if (mr->fd < 0)
+					return -1;
+				mr->addr = mmap(NULL, mr->region_size,
+						PROT_READ | PROT_WRITE,
+						MAP_SHARED, mr->fd, 0);
+				if (mr->addr == NULL)
+					return -1;
+			}
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
+			    mq->ring_offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_MASTER)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
+			    mq->ring_offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_SLAVE)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags |= ETH_MEMIF_FLAG_CONNECTED;
+	dev->data->dev_link.link_status = ETH_LINK_UP;
+	MIF_LOG(INFO, "%s: Connected.", rte_vdev_device_name(pmd->vdev));
+	return 0;
+}
+
+static int
+memif_dev_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int ret = 0;
+
+	switch (pmd->role) {
+	case MEMIF_ROLE_SLAVE:
+		ret = memif_connect_slave(dev);
+		break;
+	case MEMIF_ROLE_MASTER:
+		ret = memif_connect_master(dev);
+		break;
+	default:
+		MIF_LOG(ERR, "%s: Unknown role: %d.",
+			rte_vdev_device_name(pmd->vdev), pmd->role);
+		ret = -1;
+		break;
+	}
+
+	return ret;
+}
+
+static int
+memif_dev_configure(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	/*
+	 * SLAVE - TXQ
+	 * MASTER - RXQ
+	 */
+	pmd->cfg.num_s2m_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_tx_queues : dev->data->nb_rx_queues;
+
+	/*
+	 * SLAVE - RXQ
+	 * MASTER - TXQ
+	 */
+	pmd->cfg.num_m2s_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_rx_queues : dev->data->nb_tx_queues;
+
+	return 0;
+}
+
+static int
+memif_tx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_tx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("tx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate tx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type =
+	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_S2M : MEMIF_RING_M2S;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->pmd = pmd;
+	dev->data->tx_queues[qid] = mq;
+
+	return 0;
+}
+
+static int
+memif_rx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_rx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_rxconf *rx_conf __rte_unused,
+		     struct rte_mempool *mb_pool)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("rx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate rx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type = (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_M2S : MEMIF_RING_S2M;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->mempool = mb_pool;
+	mq->in_port = dev->data->port_id;
+	mq->pmd = pmd;
+	dev->data->rx_queues[qid] = mq;
+
+	return 0;
+}
+
+static void
+memif_queue_release(void *queue)
+{
+	struct memif_queue *q = (struct memif_queue *)queue;
+
+	if (!q)
+		return;
+
+	rte_free(q);
+}
+
+static int
+memif_link_update(struct rte_eth_dev *dev __rte_unused,
+		  int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
+static int
+memif_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+	uint8_t tmp, nq;
+
+	stats->ipackets = 0;
+	stats->ibytes = 0;
+	stats->opackets = 0;
+	stats->obytes = 0;
+	stats->oerrors = 0;
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_s2m_rings :
+	    pmd->run.num_m2s_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* RX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->rx_queues[i];
+		stats->q_ipackets[i] = mq->n_pkts;
+		stats->q_ibytes[i] = mq->n_bytes;
+		stats->ipackets += mq->n_pkts;
+		stats->ibytes += mq->n_bytes;
+	}
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_m2s_rings :
+	    pmd->run.num_s2m_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* TX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->tx_queues[i];
+		stats->q_opackets[i] = mq->n_pkts;
+		stats->q_obytes[i] = mq->n_bytes;
+		stats->q_errors[i] = mq->n_err;
+		stats->opackets += mq->n_pkts;
+		stats->obytes += mq->n_bytes;
+		stats->oerrors += mq->n_err;
+	}
+	return 0;
+}
+
+static void
+memif_stats_reset(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int i;
+	struct memif_queue *mq;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->tx_queues[i] :
+		    dev->data->rx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->rx_queues[i] :
+		    dev->data->tx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+}
+
+static int
+memif_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	MIF_LOG(WARNING, "%s: Interrupt mode not supported.",
+		rte_vdev_device_name(pmd->vdev));
+
+	return -1;
+}
+
+static int
+memif_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd __rte_unused = dev->data->dev_private;
+
+	return 0;
+}
+
+static const struct eth_dev_ops ops = {
+	.dev_start = memif_dev_start,
+	.dev_infos_get = memif_dev_info,
+	.dev_configure = memif_dev_configure,
+	.tx_queue_setup = memif_tx_queue_setup,
+	.rx_queue_setup = memif_rx_queue_setup,
+	.rx_queue_release = memif_queue_release,
+	.tx_queue_release = memif_queue_release,
+	.rx_queue_intr_enable = memif_rx_queue_intr_enable,
+	.rx_queue_intr_disable = memif_rx_queue_intr_disable,
+	.link_update = memif_link_update,
+	.stats_get = memif_stats_get,
+	.stats_reset = memif_stats_reset,
+};
+
+static int
+memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
+	     memif_interface_id_t id, uint32_t flags,
+	     const char *socket_filename,
+	     memif_log2_ring_size_t log2_ring_size,
+	     uint16_t pkt_buffer_size, const char *secret,
+	     struct ether_addr *ether_addr)
+{
+	int ret = 0;
+	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev_data *data;
+	struct pmd_internals *pmd;
+	const unsigned int numa_node = vdev->device.numa_node;
+	const char *name = rte_vdev_device_name(vdev);
+
+	if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
+		MIF_LOG(ERR, "Zero-copy slave not supported.");
+		return -1;
+	}
+
+	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
+	if (eth_dev == NULL) {
+		MIF_LOG(ERR, "%s: Unable to allocate device struct.", name);
+		return -1;
+	}
+
+	pmd = eth_dev->data->dev_private;
+	memset(pmd, 0, sizeof(*pmd));
+
+	pmd->vdev = vdev;
+	pmd->id = id;
+	pmd->flags = flags;
+	pmd->flags |= ETH_MEMIF_FLAG_DISABLED;
+	pmd->role = role;
+
+	ret = memif_socket_init(eth_dev, socket_filename);
+	if (ret < 0)
+		return ret;
+
+	memset(pmd->secret, 0, sizeof(char) * 24);
+	if (secret != NULL)
+		strlcpy(pmd->secret, secret, sizeof(pmd->secret));
+
+	pmd->cfg.log2_ring_size = log2_ring_size;
+	/* set in .dev_configure() */
+	pmd->cfg.num_s2m_rings = 0;
+	pmd->cfg.num_m2s_rings = 0;
+
+	pmd->cfg.pkt_buffer_size = pkt_buffer_size;
+
+	data = eth_dev->data;
+	data->dev_private = pmd;
+	data->numa_node = numa_node;
+	data->mac_addrs = ether_addr;
+
+	eth_dev->dev_ops = &ops;
+	eth_dev->device = &vdev->device;
+	eth_dev->rx_pkt_burst = eth_memif_rx;
+	eth_dev->tx_pkt_burst = eth_memif_tx;
+
+	rte_eth_dev_probing_finish(eth_dev);
+
+	return 0;
+}
+
+static int
+memif_set_role(const char *key __rte_unused, const char *value,
+	       void *extra_args)
+{
+	enum memif_role_t *role = (enum memif_role_t *)extra_args;
+
+	if (strstr(value, "master") != NULL) {
+		*role = MEMIF_ROLE_MASTER;
+	} else if (strstr(value, "slave") != NULL) {
+		*role = MEMIF_ROLE_SLAVE;
+	} else {
+		MIF_LOG(ERR, "Unknown role: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_zc(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	uint32_t *flags = (uint32_t *)extra_args;
+
+	if (strstr(value, "yes") != NULL) {
+		*flags |= ETH_MEMIF_FLAG_ZERO_COPY;
+	} else if (strstr(value, "no") != NULL) {
+		*flags &= ~ETH_MEMIF_FLAG_ZERO_COPY;
+	} else {
+		MIF_LOG(ERR, "Failed to parse zero-copy param: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_id(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	memif_interface_id_t *id = (memif_interface_id_t *)extra_args;
+
+	/* even if parsing fails, 0 is a valid id */
+	*id = strtoul(value, NULL, 10);
+	return 0;
+}
+
+static int
+memif_set_bs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	uint16_t *pkt_buffer_size = (uint16_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > 0xFFFF) {
+		MIF_LOG(ERR, "Invalid buffer size: %s.", value);
+		return -EINVAL;
+	}
+	*pkt_buffer_size = tmp;
+	return 0;
+}
+
+static int
+memif_set_rs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	memif_log2_ring_size_t *log2_ring_size =
+	    (memif_log2_ring_size_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > ETH_MEMIF_MAX_LOG2_RING_SIZE) {
+		MIF_LOG(ERR, "Invalid ring size: %s (max %u).",
+			value, ETH_MEMIF_MAX_LOG2_RING_SIZE);
+		return -EINVAL;
+	}
+	*log2_ring_size = tmp;
+	return 0;
+}
+
+/* check if directory exists and if we have permission to read/write */
+static int
+memif_check_socket_filename(const char *filename)
+{
+	char *dir = NULL, *tmp;
+	uint32_t idx;
+	int ret = 0;
+
+	tmp = strrchr(filename, '/');
+	if (tmp != NULL) {
+		idx = tmp - filename;
+		dir = rte_zmalloc("memif_tmp", sizeof(char) * (idx + 1), 0);
+		if (dir == NULL) {
+			MIF_LOG(ERR, "Failed to allocate memory.");
+			return -1;
+		}
+		strlcpy(dir, filename, sizeof(char) * (idx + 1));
+	}
+
+	if (dir == NULL || (faccessat(-1, dir, F_OK | R_OK |
+					W_OK, AT_EACCESS) < 0)) {
+		MIF_LOG(ERR, "Invalid directory: '%s'.", dir);
+		ret = -EINVAL;
+	}
+
+	if (dir != NULL)
+		rte_free(dir);
+
+	return ret;
+}
+
+static int
+memif_set_socket_filename(const char *key __rte_unused, const char *value,
+			  void *extra_args)
+{
+	const char **socket_filename = (const char **)extra_args;
+
+	*socket_filename = value;
+	return memif_check_socket_filename(*socket_filename);
+}
+
+static int
+memif_set_mac(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	struct ether_addr *ether_addr = (struct ether_addr *)extra_args;
+	int ret = 0;
+
+	ret = sscanf(value, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
+	       &ether_addr->addr_bytes[0], &ether_addr->addr_bytes[1],
+	       &ether_addr->addr_bytes[2], &ether_addr->addr_bytes[3],
+	       &ether_addr->addr_bytes[4], &ether_addr->addr_bytes[5]);
+	if (ret != 6)
+		MIF_LOG(WARNING, "Failed to parse mac '%s'.", value);
+	return 0;
+}
+
+static int
+memif_set_secret(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	const char **secret = (const char **)extra_args;
+
+	*secret = value;
+	return 0;
+}
+
+static int
+rte_pmd_memif_probe(struct rte_vdev_device *vdev)
+{
+	RTE_BUILD_BUG_ON(sizeof(memif_msg_t) != 128);
+	RTE_BUILD_BUG_ON(sizeof(memif_desc_t) != 16);
+	int ret = 0;
+	struct rte_kvargs *kvlist;
+	const char *name = rte_vdev_device_name(vdev);
+	enum memif_role_t role = MEMIF_ROLE_SLAVE;
+	memif_interface_id_t id = 0;
+	uint16_t pkt_buffer_size = ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE;
+	memif_log2_ring_size_t log2_ring_size = ETH_MEMIF_DEFAULT_RING_SIZE;
+	const char *socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME;
+	uint32_t flags = 0;
+	const char *secret = NULL;
+	struct ether_addr *ether_addr = rte_zmalloc("", sizeof(struct ether_addr), 0);
+
+	eth_random_addr(ether_addr->addr_bytes);
+
+	MIF_LOG(INFO, "Initialize MEMIF: %s.", name);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		MIF_LOG(ERR, "Multi-processing not supported for memif.");
+		/* TODO:
+		 * Request connection information.
+		 *
+		 * Once memif in the primary process is connected,
+		 * broadcast connection information.
+		 */
+		return -1;
+	}
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_arguments);
+
+	/* parse parameters */
+	if (kvlist != NULL) {
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ROLE_ARG,
+					 &memif_set_role, &role);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ID_ARG,
+					 &memif_set_id, &id);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
+					 &memif_set_bs, &pkt_buffer_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_RING_SIZE_ARG,
+					 &memif_set_rs, &log2_ring_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SOCKET_ARG,
+					 &memif_set_socket_filename,
+					 (void *)(&socket_filename));
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_MAC_ARG,
+					 &memif_set_mac, ether_addr);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ZC_ARG,
+					 &memif_set_zc, &flags);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SECRET_ARG,
+					 &memif_set_secret, (void *)(&secret));
+		if (ret < 0)
+			goto exit;
+	}
+
+	/* create interface */
+	ret = memif_create(vdev, role, id, flags, socket_filename,
+			   log2_ring_size, pkt_buffer_size, secret, ether_addr);
+
+exit:
+	if (kvlist != NULL)
+		rte_kvargs_free(kvlist);
+	return ret;
+}
+
+static int
+rte_pmd_memif_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_eth_dev *eth_dev;
+	struct pmd_internals *pmd;
+
+	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+	if (eth_dev == NULL)
+		return 0;
+
+	pmd = eth_dev->data->dev_private;
+
+	memif_msg_enq_disconnect(pmd->cc, "Device removed", 0);
+	memif_disconnect(eth_dev);
+
+	memif_socket_remove_device(eth_dev);
+
+	pmd->vdev = NULL;
+
+	rte_eth_dev_release_port(eth_dev);
+
+	return 0;
+}
+
+static struct rte_vdev_driver pmd_memif_drv = {
+	.probe = rte_pmd_memif_probe,
+	.remove = rte_pmd_memif_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_memif, pmd_memif_drv);
+
+RTE_PMD_REGISTER_PARAM_STRING(net_memif,
+			      ETH_MEMIF_ID_ARG "=<int>"
+			      ETH_MEMIF_ROLE_ARG "=master|slave"
+			      ETH_MEMIF_PKT_BUFFER_SIZE_ARG "=<int>"
+			      ETH_MEMIF_RING_SIZE_ARG "=<int>"
+			      ETH_MEMIF_SOCKET_ARG "=<string>"
+			      ETH_MEMIF_MAC_ARG "=xx:xx:xx:xx:xx:xx"
+			      ETH_MEMIF_ZC_ARG "=yes|no"
+			      ETH_MEMIF_SECRET_ARG "=<string>");
+
+int memif_logtype;
+
+RTE_INIT(memif_init_log)
+{
+	memif_logtype = rte_log_register("pmd.net.memif");
+	if (memif_logtype >= 0)
+		rte_log_set_level(memif_logtype, RTE_LOG_NOTICE);
+}
diff --git a/drivers/net/memif/rte_eth_memif.h b/drivers/net/memif/rte_eth_memif.h
new file mode 100644
index 000000000..175eef667
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.h
@@ -0,0 +1,207 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _RTE_ETH_MEMIF_H_
+#define _RTE_ETH_MEMIF_H_
+
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE
+#endif				/* GNU_SOURCE */
+
+#include <sys/queue.h>
+
+#include <rte_ethdev_driver.h>
+#include <rte_ether.h>
+#include <rte_interrupts.h>
+
+#include "memif.h"
+
+#define ETH_MEMIF_DEFAULT_SOCKET_FILENAME	"/tmp/memif.sock"
+#define ETH_MEMIF_DEFAULT_RING_SIZE		10
+#define ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE	2048
+
+#define ETH_MEMIF_MAX_NUM_Q_PAIRS		255
+#define ETH_MEMIF_MAX_LOG2_RING_SIZE		14
+#define ETH_MEMIF_MAX_REGION_NUM		256
+
+#define ETH_MEMIF_SHM_NAME_SIZE			32
+
+extern int memif_logtype;
+
+#define MIF_LOG(level, fmt, args...) \
+	rte_log(RTE_LOG_ ## level, memif_logtype, \
+		"%s(): " fmt "\n", __func__, ##args)
+
+enum memif_role_t {
+	MEMIF_ROLE_MASTER,
+	MEMIF_ROLE_SLAVE,
+};
+
+struct memif_region {
+	void *addr;				/**< shared memory address */
+	memif_region_size_t region_size;	/**< shared memory size */
+	int fd;					/**< shared memory file descriptor */
+	uint32_t pkt_buffer_offset;
+	/**< offset from 'addr' to first packet buffer */
+};
+
+struct memif_queue {
+	struct rte_mempool *mempool;		/**< mempool for RX packets */
+	uint16_t in_port;			/**< port id */
+
+	struct pmd_internals *pmd;		/**< device internals */
+
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+
+	/* ring info */
+	memif_ring_type_t type;			/**< ring type */
+	memif_ring_t *ring;			/**< pointer to ring */
+	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
+
+	memif_region_index_t region;		/**< shared memory region index */
+	memif_region_offset_t ring_offset;
+	/**< ring offset from start of shm region (ring - memif_region.addr) */
+
+	uint16_t last_head;			/**< last ring head */
+	uint16_t last_tail;			/**< last ring tail */
+
+	/* rx/tx info */
+	uint64_t n_pkts;			/**< number of rx/tx packets */
+	uint64_t n_bytes;			/**< number of rx/tx bytes */
+	uint64_t n_err;				/**< number of tx errors */
+};
+
+struct pmd_internals {
+	memif_interface_id_t id;		/**< unique id */
+	enum memif_role_t role;			/**< device role */
+	uint32_t flags;				/**< device status flags */
+#define ETH_MEMIF_FLAG_CONNECTING	(1 << 0)
+/**< device is connecting */
+#define ETH_MEMIF_FLAG_CONNECTED	(1 << 1)
+/**< device is connected */
+#define ETH_MEMIF_FLAG_ZERO_COPY	(1 << 2)
+/**< device is zero-copy enabled */
+#define ETH_MEMIF_FLAG_DISABLED		(1 << 3)
+/**< device has not been configured and can not accept connection requests */
+
+	char *socket_filename;			/**< pointer to socket filename */
+	char secret[24]; /**< secret (optional security parameter) */
+
+	struct memif_control_channel *cc;	/**< control channel */
+
+	struct memif_region *regions[ETH_MEMIF_MAX_REGION_NUM];
+	/**< shared memory regions */
+	memif_region_index_t regions_num;	/**< number of regions */
+
+	/* remote info */
+	char remote_name[64];			/**< remote app name */
+	char remote_if_name[64];		/**< remote peer name */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t pkt_buffer_size;	/**< buffer size */
+	} cfg;					/**< Configured parameters (max values) */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t pkt_buffer_size;	/**< buffer size */
+	} run;
+	/**< Parameters used in active connection */
+
+	char local_disc_string[96];		/**< local disconnect reason */
+	char remote_disc_string[96];		/**< remote disconnect reason */
+
+	struct rte_vdev_device *vdev;		/**< vdev handle */
+};
+
+/**
+ * Unmap shared memory and free regions from memory.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_free_regions(struct pmd_internals *pmd);
+
+/**
+ * Finalize connection establishment process. Map shared memory file
+ * (master role), initialize ring queue, set link status up.
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect(struct rte_eth_dev *dev);
+
+/**
+ * Create shared memory file and initialize ring queue.
+ * Only called by slave when establishing connection
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_init_regions_and_queues(struct rte_eth_dev *dev);
+
+/**
+ * Get memif version string.
+ *
+ * @return
+ *   - memif version string
+ */
+const char *memif_version(void);
+
+#ifndef MFD_HUGETLB
+#ifndef __NR_memfd_create
+
+#if defined __x86_64__
+#define __NR_memfd_create 319
+#elif defined __x86_32__
+#define __NR_memfd_create 1073742143
+#elif defined __arm__
+#define __NR_memfd_create 385
+#elif defined __aarch64__
+#define __NR_memfd_create 279
+#elif defined __powerpc__
+#define __NR_memfd_create 360
+#elif defined __i386__
+#define __NR_memfd_create 356
+#else
+#error "__NR_memfd_create unknown for this architecture"
+#endif
+
+#endif				/* __NR_memfd_create */
+
+static inline int memfd_create(const char *name, unsigned int flags)
+{
+	return syscall(__NR_memfd_create, name, flags);
+}
+#endif				/* MFD_HUGETLB */
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef MFD_ALLOW_SEALING
+#define MFD_ALLOW_SEALING       0x0002U
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL     0x0001	/* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002	/* prevent file from shrinking */
+#define F_SEAL_GROW     0x0004	/* prevent file from growing */
+#define F_SEAL_WRITE    0x0008	/* prevent writes */
+#endif
+
+#endif				/* RTE_ETH_MEMIF_H */
diff --git a/drivers/net/memif/rte_pmd_memif_version.map b/drivers/net/memif/rte_pmd_memif_version.map
new file mode 100644
index 000000000..8861484fb
--- /dev/null
+++ b/drivers/net/memif/rte_pmd_memif_version.map
@@ -0,0 +1,4 @@
+DPDK_19.08 {
+
+        local: *;
+};
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index ed99896c3..b57073483 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -24,6 +24,7 @@ drivers = ['af_packet',
 	'ixgbe',
 	'kni',
 	'liquidio',
+	'memif',
 	'mlx4',
 	'mlx5',
 	'mvneta',
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 7c9b4b538..326b2a30b 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -174,6 +174,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KNI)        += -lrte_pmd_kni
 endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD)        += -lrte_pmd_lio
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF)      += -lrte_pmd_memif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -lrte_pmd_mlx4
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += -lrte_pmd_mlx5 -lmnl
 ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
--
2.17.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v8] /net: memory interface (memif)
  2019-05-16 11:46         ` [dpdk-dev] [RFC v8] " Jakub Grajciar
@ 2019-05-16 15:18           ` Stephen Hemminger
  2019-05-16 15:19           ` Stephen Hemminger
                             ` (4 subsequent siblings)
  5 siblings, 0 replies; 58+ messages in thread
From: Stephen Hemminger @ 2019-05-16 15:18 UTC (permalink / raw)
  To: Jakub Grajciar; +Cc: dev

On Thu, 16 May 2019 13:46:58 +0200
Jakub Grajciar <jgrajcia@cisco.com> wrote:

> +struct memif_queue {
> +	struct rte_mempool *mempool;		/**< mempool for RX packets */
> +	uint16_t in_port;			/**< port id */
> +
> +	struct pmd_internals *pmd;		/**< device internals */
> +
> +	struct rte_intr_handle intr_handle;	/**< interrupt handle */
> +
> +	/* ring info */
> +	memif_ring_type_t type;			/**< ring type */
> +	memif_ring_t *ring;			/**< pointer to ring */
> +	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
> +
> +	memif_region_index_t region;		/**< shared memory region index */
> +	memif_region_offset_t ring_offset;
> +	/**< ring offset from start of shm region (ring - memif_region.addr) */
> +
> +	uint16_t last_head;			/**< last ring head */
> +	uint16_t last_tail;			/**< last ring tail */
> +
> +	/* rx/tx info */
> +	uint64_t n_pkts;			/**< number of rx/tx packets */
> +	uint64_t n_bytes;			/**< number of rx/tx bytes */
> +	uint64_t n_err;				/**< number of tx errors */
> +};
> +

The layout of this structure has lots of holes, you might want to rearrange elements.
struct memif_queue {
	struct rte_mempool *       mempool;              /*     0     8 */
	uint16_t                   in_port;              /*     8     2 */

	/* XXX 6 bytes hole, try to pack */

	struct pmd_internals *     pmd;                  /*    16     8 */
	struct rte_intr_handle     intr_handle;          /*    24 26656 */
	/* --- cacheline 416 boundary (26624 bytes) was 56 bytes ago --- */
	memif_ring_type_t          type;                 /* 26680     4 */

	/* XXX 4 bytes hole, try to pack */

	/* --- cacheline 417 boundary (26688 bytes) --- */
	memif_ring_t *             ring;                 /* 26688     8 */
	memif_log2_ring_size_t     log2_ring_size;       /* 26696     1 */

	/* XXX 1 byte hole, try to pack */

	memif_region_index_t       region;               /* 26698     2 */
	memif_region_offset_t      ring_offset;          /* 26700     4 */
	uint16_t                   last_head;            /* 26704     2 */
	uint16_t                   last_tail;            /* 26706     2 */

	/* XXX 4 bytes hole, try to pack */

	uint64_t                   n_pkts;               /* 26712     8 */
	uint64_t                   n_bytes;              /* 26720     8 */
	uint64_t                   n_err;                /* 26728     8 */

	/* size: 26736, cachelines: 418, members: 14 */
	/* sum members: 26721, holes: 4, sum holes: 15 */
	/* last cacheline: 48 bytes */
};

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v8] /net: memory interface (memif)
  2019-05-16 11:46         ` [dpdk-dev] [RFC v8] " Jakub Grajciar
  2019-05-16 15:18           ` Stephen Hemminger
@ 2019-05-16 15:19           ` Stephen Hemminger
  2019-05-16 15:21           ` Stephen Hemminger
                             ` (3 subsequent siblings)
  5 siblings, 0 replies; 58+ messages in thread
From: Stephen Hemminger @ 2019-05-16 15:19 UTC (permalink / raw)
  To: Jakub Grajciar; +Cc: dev

On Thu, 16 May 2019 13:46:58 +0200
Jakub Grajciar <jgrajcia@cisco.com> wrote:

> +
> +#define ETH_MEMIF_DEFAULT_SOCKET_FILENAME	"/tmp/memif.sock"
> +#define ETH_MEMIF_DEFAULT_RING_SIZE		10
> +#define ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE	2048

The current best practice on Linux is to use /run for sockets.
It is even in the latest Filesystem layout standard.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v8] /net: memory interface (memif)
  2019-05-16 11:46         ` [dpdk-dev] [RFC v8] " Jakub Grajciar
  2019-05-16 15:18           ` Stephen Hemminger
  2019-05-16 15:19           ` Stephen Hemminger
@ 2019-05-16 15:21           ` Stephen Hemminger
  2019-05-20  9:22             ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  2019-05-16 15:25           ` Stephen Hemminger
                             ` (2 subsequent siblings)
  5 siblings, 1 reply; 58+ messages in thread
From: Stephen Hemminger @ 2019-05-16 15:21 UTC (permalink / raw)
  To: Jakub Grajciar; +Cc: dev

On Thu, 16 May 2019 13:46:58 +0200
Jakub Grajciar <jgrajcia@cisco.com> wrote:

> +enum memif_role_t {
> +	MEMIF_ROLE_MASTER,
> +	MEMIF_ROLE_SLAVE,
> +};

Because master/slave terminology is potentially culturally offensive it
is flagged by many corporate source scanning tools.

Could you use primary/secondary in memif instead?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v8] /net: memory interface (memif)
  2019-05-16 11:46         ` [dpdk-dev] [RFC v8] " Jakub Grajciar
                             ` (2 preceding siblings ...)
  2019-05-16 15:21           ` Stephen Hemminger
@ 2019-05-16 15:25           ` Stephen Hemminger
  2019-05-16 15:28           ` Stephen Hemminger
  2019-05-20 10:18           ` [dpdk-dev] [RFC v9] " Jakub Grajciar
  5 siblings, 0 replies; 58+ messages in thread
From: Stephen Hemminger @ 2019-05-16 15:25 UTC (permalink / raw)
  To: Jakub Grajciar; +Cc: dev

On Thu, 16 May 2019 13:46:58 +0200
Jakub Grajciar <jgrajcia@cisco.com> wrote:

> +	/* remote info */
> +	char remote_name[64];			/**< remote app name */
> +	char remote_if_name[64];

Hard coding magic string sizes has future potential for disaster.
Could you at least add a #define.

> +typedef struct __rte_packed {
> +	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
> +	memif_version_t min_version; /**< lowest supported memif version */
> +	memif_version_t max_version; /**< highest supported memif version */
> +	memif_region_index_t max_region; /**< maximum num of regions */
> +	memif_ring_index_t max_m2s_ring; /**< maximum num of M2S ring */
> +	memif_ring_index_t max_s2m_ring; /**< maximum num of S2M rings */
> +	memif_log2_ring_size_t max_log2_ring_size; /**< maximum ring size (as log2) */
> +} memif_msg_hello_t;

Why is name a uint8_t not char? Are end up having to cast it.
Maybe it is because it UTF-8 or you have some subsystem where sizeof(char) != sizeof(uint8_t)?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v8] /net: memory interface (memif)
  2019-05-16 11:46         ` [dpdk-dev] [RFC v8] " Jakub Grajciar
                             ` (3 preceding siblings ...)
  2019-05-16 15:25           ` Stephen Hemminger
@ 2019-05-16 15:28           ` Stephen Hemminger
  2019-05-20 10:18           ` [dpdk-dev] [RFC v9] " Jakub Grajciar
  5 siblings, 0 replies; 58+ messages in thread
From: Stephen Hemminger @ 2019-05-16 15:28 UTC (permalink / raw)
  To: Jakub Grajciar; +Cc: dev

On Thu, 16 May 2019 13:46:58 +0200
Jakub Grajciar <jgrajcia@cisco.com> wrote:

> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index f0061d99f..b05f1462d 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -2493,6 +2493,8 @@ pmd_test_exit(void)
>  			device = rte_eth_devices[pt_id].device;
>  			if (device && !strcmp(device->driver->name, "net_virtio_user"))
>  				detach_port_device(pt_id);
> +			else if (device && !strcmp(device->driver->name, "net_memif"))
> +				detach_port_device(pt_id);
>  		}
>  	}


Why not do a proper implementation of dev_close instead of pushing
the problem onto the application.

The existing virtio_user hack is a bug and it should be fixed.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v8] /net: memory interface (memif)
  2019-05-16 15:21           ` Stephen Hemminger
@ 2019-05-20  9:22             ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  0 siblings, 0 replies; 58+ messages in thread
From: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco) @ 2019-05-20  9:22 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Damjan Marion (damarion)



> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Thursday, May 16, 2019 5:22 PM
> To: Jakub Grajciar
> <jgrajcia@cisco.com>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [RFC v8] /net: memory interface (memif)
> 
> On Thu, 16 May 2019 13:46:58 +0200
> Jakub Grajciar <jgrajcia@cisco.com> wrote:
> 
> > +enum memif_role_t {
> > +	MEMIF_ROLE_MASTER,
> > +	MEMIF_ROLE_SLAVE,
> > +};
> 
> Because master/slave terminology is potentially culturally offensive it is
> flagged by many corporate source scanning tools.
> 
> Could you use primary/secondary in memif instead?

Other implementations also use master/slave terminology, so changing it would be confusing. However, we will consider using primary/secondary in next protocol version.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [dpdk-dev] [RFC v9] /net: memory interface (memif)
  2019-05-16 11:46         ` [dpdk-dev] [RFC v8] " Jakub Grajciar
                             ` (4 preceding siblings ...)
  2019-05-16 15:28           ` Stephen Hemminger
@ 2019-05-20 10:18           ` Jakub Grajciar
  2019-05-29 17:29             ` Ferruh Yigit
  2019-05-31  6:22             ` [dpdk-dev] [PATCH v10] net/memif: introduce memory interface (memif) PMD Jakub Grajciar
  5 siblings, 2 replies; 58+ messages in thread
From: Jakub Grajciar @ 2019-05-20 10:18 UTC (permalink / raw)
  To: dev; +Cc: Jakub Grajciar

Memory interface (memif), provides high performance
packet transfer over shared memory.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
---
 MAINTAINERS                                 |    6 +
 config/common_base                          |    5 +
 config/common_linux                         |    1 +
 doc/guides/nics/features/memif.ini          |   14 +
 doc/guides/nics/index.rst                   |    1 +
 doc/guides/nics/memif.rst                   |  234 ++++
 doc/guides/rel_notes/release_19_08.rst      |    5 +
 drivers/net/Makefile                        |    1 +
 drivers/net/memif/Makefile                  |   30 +
 drivers/net/memif/memif.h                   |  179 +++
 drivers/net/memif/memif_socket.c            | 1124 +++++++++++++++++
 drivers/net/memif/memif_socket.h            |  105 ++
 drivers/net/memif/meson.build               |   13 +
 drivers/net/memif/rte_eth_memif.c           | 1198 +++++++++++++++++++
 drivers/net/memif/rte_eth_memif.h           |  212 ++++
 drivers/net/memif/rte_pmd_memif_version.map |    4 +
 drivers/net/meson.build                     |    1 +
 mk/rte.app.mk                               |    1 +
 18 files changed, 3134 insertions(+)
 create mode 100644 doc/guides/nics/features/memif.ini
 create mode 100644 doc/guides/nics/memif.rst
 create mode 100644 drivers/net/memif/Makefile
 create mode 100644 drivers/net/memif/memif.h
 create mode 100644 drivers/net/memif/memif_socket.c
 create mode 100644 drivers/net/memif/memif_socket.h
 create mode 100644 drivers/net/memif/meson.build
 create mode 100644 drivers/net/memif/rte_eth_memif.c
 create mode 100644 drivers/net/memif/rte_eth_memif.h
 create mode 100644 drivers/net/memif/rte_pmd_memif_version.map

requires patch: http://patchwork.dpdk.org/patch/51511/
Memif uses a unix domain socket as control channel (cc).
rte_interrupts.h is used to handle interrupts for this cc.
This patch is required, because the message received can
be a reason for disconnecting.

v3:
- coding style fixes
- documentation
- doxygen comments
- use strlcpy() instead of strncpy()
- use RTE_BUILD_BUG_ON instead of _Static_assert()
- fix meson build deps

v4:
- coding style fixes
- doc update (desc format, messaging, features, ...)
- pointer arithmetic fix
- __rte_packed and __rte_aligned instead of __attribute__()
- fixed multi-queue
- support additional archs (memfd_create syscall)

v5:
- coding style fixes
- add release notes
- remove unused dependencies
- doc update

v6:
- remove socket on termination of testpmd application
- disallow memif probing in secondary process (return error), multi-process support will be implemented in separate patch
- fix chained buffer packet length
- use single region for rings and buffers (non-zero-copy)
- doc update

v7:
- coding style fixes

v8:
- release notes: 19.05 -> 19.08

v9:
- rearrange elements in structures to remove holes
- default socket file in /run
- #define name sizes
- implement dev_close op

diff --git a/MAINTAINERS b/MAINTAINERS
index 15d0829c5..3698c146b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -841,6 +841,12 @@ F: drivers/net/softnic/
 F: doc/guides/nics/features/softnic.ini
 F: doc/guides/nics/softnic.rst

+Memif PMD
+M: Jakub Grajciar <jgrajcia@cisco.com>
+F: drivers/net/memif/
+F: doc/guides/nics/memif.rst
+F: doc/guides/nics/features/memif.ini
+

 Crypto Drivers
 --------------
diff --git a/config/common_base b/config/common_base
index 6b96e0e80..dca119a65 100644
--- a/config/common_base
+++ b/config/common_base
@@ -444,6 +444,11 @@ CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 #
 CONFIG_RTE_LIBRTE_PMD_AF_XDP=n

+#
+# Compile Memory Interface PMD driver (Linux only)
+#
+CONFIG_RTE_LIBRTE_PMD_MEMIF=n
+
 #
 # Compile link bonding PMD library
 #
diff --git a/config/common_linux b/config/common_linux
index 75334273d..87514fe4f 100644
--- a/config/common_linux
+++ b/config/common_linux
@@ -19,6 +19,7 @@ CONFIG_RTE_LIBRTE_VHOST_POSTCOPY=n
 CONFIG_RTE_LIBRTE_PMD_VHOST=y
 CONFIG_RTE_LIBRTE_IFC_PMD=y
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
+CONFIG_RTE_LIBRTE_PMD_MEMIF=y
 CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
 CONFIG_RTE_LIBRTE_PMD_TAP=y
 CONFIG_RTE_LIBRTE_AVP_PMD=y
diff --git a/doc/guides/nics/features/memif.ini b/doc/guides/nics/features/memif.ini
new file mode 100644
index 000000000..807d9ecdc
--- /dev/null
+++ b/doc/guides/nics/features/memif.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'memif' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Link status          = Y
+Basic stats          = Y
+Jumbo frame          = Y
+ARMv8                = Y
+Power8               = Y
+x86-32               = Y
+x86-64               = Y
+Usage doc            = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 2221c35f2..691e7209f 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -36,6 +36,7 @@ Network Interface Controller Drivers
     intel_vf
     kni
     liquidio
+    memif
     mlx4
     mlx5
     mvneta
diff --git a/doc/guides/nics/memif.rst b/doc/guides/nics/memif.rst
new file mode 100644
index 000000000..0d0312442
--- /dev/null
+++ b/doc/guides/nics/memif.rst
@@ -0,0 +1,234 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2018-2019 Cisco Systems, Inc.
+
+======================
+Memif Poll Mode Driver
+======================
+
+Shared memory packet interface (memif) PMD allows for DPDK and any other client
+using memif (DPDK, VPP, libmemif) to communicate using shared memory. Memif is
+Linux only.
+
+The created device transmits packets in a raw format. It can be used with
+Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
+supported in DPDK memif implementation.
+
+Memif works in two roles: master and slave. Slave connects to master over an
+existing socket. It is also a producer of shared memory file and initializes
+the shared memory. Each interface can be connected to one peer interface
+at same time. The peer interface is identified by id parameter. Master
+creates the socket and listens for any slave connection requests. The socket
+may already exist on the system. Be sure to remove any such sockets, if you
+are creating a master interface, or you will see an "Address already in use"
+error. Function ``rte_pmd_memif_remove()``, which removes memif interface,
+will also remove a listener socket, if it is not being used by any other
+interface.
+
+The method to enable one or more interfaces is to use the
+``--vdev=net_memif0`` option on the DPDK application command line. Each
+``--vdev=net_memif1`` option given will create an interface named net_memif0,
+net_memif1, and so on. Memif uses unix domain socket to transmit control
+messages. Each memif has a unique id per socket. This id is used to identify
+peer interface. If you are connecting multiple
+interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``,
+etc. Note that if you assign a socket to a master interface it becomes a
+listener socket. Listener socket can not be used by a slave interface on same
+client.
+
+.. csv-table:: **Memif configuration options**
+   :header: "Option", "Description", "Default", "Valid value"
+
+   "id=0", "Used to identify peer interface", "0", "uint32_t"
+   "role=master", "Set memif role", "slave", "master|slave"
+   "bsize=1024", "Size of single packet buffer", "2048", "uint16_t"
+   "rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", "10", "1-14"
+   "nrxq=2", "Number of RX queues", "1", "255"
+   "ntxq=2", "Number of TX queues", "1", "255"
+   "socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 256"
+   "mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef", ""
+   "secret=abc123", "Secret is an optional security option, which if specified, must be matched by peer", "", "string len 24"
+   "zero-copy=yes", "Enable/disable zero-copy slave mode", "no", "yes|no"
+
+**Connection establishment**
+
+In order to create memif connection, two memif interfaces, each in separate
+process, are needed. One interface in ``master`` role and other in
+``slave`` role. It is not possible to connect two interfaces in a single
+process. Each interface can be connected to one interface at same time,
+identified by matching id parameter.
+
+Memif driver uses unix domain socket to exchange required information between
+memif interfaces. Socket file path is specified at interface creation see
+*Memif configuration options* table above. If socket is used by ``master``
+interface, it's marked as listener socket (in scope of current process) and
+listens to connection requests from other processes. One socket can be used by
+multiple interfaces. One process can have ``slave`` and ``master`` interfaces
+at the same time, provided each role is assigned unique socket.
+
+For detailed information on memif control messages, see: net/memif/memif.h.
+
+Slave interface attempts to make a connection on assigned socket. Process
+listening on this socket will extract the connection request and create a new
+connected socket (control channel). Then it sends the 'hello' message
+(``MEMIF_MSG_TYPE_HELLO``), containing configuration boundaries. Slave interface
+adjusts its configuration accordingly, and sends 'init' message
+(``MEMIF_MSG_TYPE_INIT``). This message among others contains interface id. Driver
+uses this id to find master interface, and assigns the control channel to this
+interface. If such interface is found, 'ack' message (``MEMIF_MSG_TYPE_ACK``) is
+sent. Slave interface sends 'add region' message (``MEMIF_MSG_TYPE_ADD_REGION``) for
+every region allocated. Master responds to each of these messages with 'ack'
+message. Same behavior applies to rings. Slave sends 'add ring' message
+(``MEMIF_MSG_TYPE_ADD_RING``) for every initialized ring. Master again responds to
+each message with 'ack' message. To finalize the connection, slave interface
+sends 'connect' message (``MEMIF_MSG_TYPE_CONNECT``). Upon receiving this message
+master maps regions to its address space, initializes rings and responds with
+'connected' message (``MEMIF_MSG_TYPE_CONNECTED``). Disconnect
+(``MEMIF_MSG_TYPE_DISCONNECT``) can be sent by both master and slave interfaces at
+any time, due to driver error or if the interface is being deleted.
+
+Files
+
+- net/memif/memif.h *- control messages definitions*
+- net/memif/memif_socket.h
+- net/memif/memif_socket.c
+
+Shared memory
+~~~~~~~~~~~~~
+
+**Shared memory format**
+
+Slave is producer and master is consumer. Memory regions, are mapped shared memory files,
+created by memif slave and provided to master at connection establishment.
+Regions contain rings and buffers. Rings and buffers can also be separated into multiple
+regions. For no-zero-copy, rings and buffers are stored inside single memory
+region to reduce the number of opened files.
+
+region n (no-zero-copy):
+
++-----------------------+-------------------------------------------------------------------------+
+| Rings                 | Buffers                                                                 |
++-----------+-----------+-----------------+---+---------------------------------------------------+
+| S2M rings | M2S rings | packet buffer 0 | . | pb ((1 << pmd->run.log2_ring_size)*(s2m + m2s))-1 |
++-----------+-----------+-----------------+---+---------------------------------------------------+
+
+S2M OR M2S Rings:
+
++--------+--------+-----------------------+
+| ring 0 | ring 1 | ring num_s2m_rings - 1|
++--------+--------+-----------------------+
+
+ring 0:
+
++-------------+---------------------------------------+
+| ring header | (1 << pmd->run.log2_ring_size) * desc |
++-------------+---------------------------------------+
+
+Descriptors are assigned packet buffers in order of rings creation. If we have one ring
+in each direction and ring size is 1024, then first 1024 buffers will belong to S2M ring and
+last 1024 will belong to M2S ring. In case of zero-copy, buffers are dequeued and
+enqueued as needed.
+
+**Descriptor format**
+
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Quad|6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | |1|1| | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Word|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | |6|5| | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|0   |length                                                         |region                         |flags                          |
++----+---------------------------------------------------------------+-------------------------------+-------------------------------+
+|1   |metadata                                                       |offset                                                         |
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+**Flags field - flags (Quad Word 0, bits 0:15)**
+
++-----+--------------------+------------------------------------------------------------------------------------------------+
+|Bits |Name                |Functionality                                                                                   |
++=====+====================+================================================================================================+
+|0    |MEMIF_DESC_FLAG_NEXT|Is chained buffer. When set, the packet is divided into multiple buffers. May not be contiguous.|
++-----+--------------------+------------------------------------------------------------------------------------------------+
+
+**Region index - region (Quad Word 0, 16:31)**
+
+Index of memory region, the buffer is located in.
+
+**Data length - length (Quad Word 0, 32:63)**
+
+Length of transmitted/received data.
+
+**Data Offset - offset (Quad Word 1, 0:31)**
+
+Data start offset from memory region address. *.regions[desc->region].addr + desc->offset*
+
+**Metadata - metadata (Quad Word 1, 32:63)**
+
+Buffer metadata.
+
+Files
+
+- net/memif/memif.h *- descriptor and ring definitions*
+- net/memif/rte_eth_memif.c *- eth_memif_rx() eth_memif_tx()*
+
+Example: testpmd and testpmd
+----------------------------
+In this example we run two instances of testpmd application and transmit packets over memif.
+
+First create ``master`` interface::
+
+    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
+
+Now create ``slave`` interface (master must be already running so the slave will connect)::
+
+    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
+
+Set forwarding mode in one of the instances to 'rx only' and the other to 'tx only'::
+
+    testpmd> set fwd rxonly
+    testpmd> start
+
+    testpmd> set fwd txonly
+    testpmd> start
+
+Show status::
+
+    testpmd> show port stats 0
+
+Example: testpmd and VPP
+------------------------
+For information on how to get and run VPP please see `<https://wiki.fd.io/view/VPP>`_.
+
+Start VPP in interactive mode (should be by default). Create memif master interface in VPP::
+
+    vpp# create interface memif id 0 master no-zero-copy
+    vpp# set interface state memif0/0 up
+    vpp# set interface ip address memif0/0 192.168.1.1/24
+
+To see socket filename use show memif command::
+
+    vpp# show memif
+    sockets
+     id  listener    filename
+      0   yes (1)     /run/vpp/memif.sock
+    ...
+
+Now create memif interface by running testpmd with these command line options::
+
+    #./testpmd --vdev=net_memif,socket=/run/vpp/memif.sock -- -i
+
+Testpmd should now create memif slave interface and try to connect to master.
+In testpmd set forward option to icmpecho and start forwarding::
+
+    testpmd> set fwd icmpecho
+    testpmd> start
+
+Send ping from VPP::
+
+    vpp# ping 192.168.1.2
+    64 bytes from 192.168.1.2: icmp_seq=2 ttl=254 time=36.2918 ms
+    64 bytes from 192.168.1.2: icmp_seq=3 ttl=254 time=23.3927 ms
+    64 bytes from 192.168.1.2: icmp_seq=4 ttl=254 time=24.2975 ms
+    64 bytes from 192.168.1.2: icmp_seq=5 ttl=254 time=17.7049 ms
diff --git a/doc/guides/rel_notes/release_19_08.rst b/doc/guides/rel_notes/release_19_08.rst
index b9510f93a..42a7922b3 100644
--- a/doc/guides/rel_notes/release_19_08.rst
+++ b/doc/guides/rel_notes/release_19_08.rst
@@ -54,6 +54,11 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================

+* **Added memif PMD.**
+
+  Added the new Shared Memory Packet Interface (``memif``) PMD.
+  See the :doc:`../nics/memif` guide for more details on this new driver.
+

 Removed Items
 -------------
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 3a72cf38c..78cb10fc6 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -35,6 +35,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_ICE_PMD) += ice
 DIRS-$(CONFIG_RTE_LIBRTE_IPN3KE_PMD) += ipn3ke
 DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
 DIRS-$(CONFIG_RTE_LIBRTE_LIO_PMD) += liquidio
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif
 DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
 DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
 DIRS-$(CONFIG_RTE_LIBRTE_MVNETA_PMD) += mvneta
diff --git a/drivers/net/memif/Makefile b/drivers/net/memif/Makefile
new file mode 100644
index 000000000..2ae3c37a2
--- /dev/null
+++ b/drivers/net/memif/Makefile
@@ -0,0 +1,30 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_memif.a
+
+EXPORT_MAP := rte_pmd_memif_version.map
+
+LIBABIVER := 1
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+# Experimantal APIs:
+# - rte_intr_callback_unregister_pending
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool
+LDLIBS += -lrte_ethdev -lrte_kvargs
+LDLIBS += -lrte_hash
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += rte_eth_memif.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif_socket.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/memif/memif.h b/drivers/net/memif/memif.h
new file mode 100644
index 000000000..3948b1f50
--- /dev/null
+++ b/drivers/net/memif/memif.h
@@ -0,0 +1,179 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_H_
+#define _MEMIF_H_
+
+#define MEMIF_COOKIE		0x3E31F20
+#define MEMIF_VERSION_MAJOR	2
+#define MEMIF_VERSION_MINOR	0
+#define MEMIF_VERSION		((MEMIF_VERSION_MAJOR << 8) | MEMIF_VERSION_MINOR)
+#define MEMIF_NAME_SZ		32
+
+/*
+ * S2M: direction slave -> master
+ * M2S: direction master -> slave
+ */
+
+/*
+ *  Type definitions
+ */
+
+typedef enum memif_msg_type {
+	MEMIF_MSG_TYPE_NONE,
+	MEMIF_MSG_TYPE_ACK,
+	MEMIF_MSG_TYPE_HELLO,
+	MEMIF_MSG_TYPE_INIT,
+	MEMIF_MSG_TYPE_ADD_REGION,
+	MEMIF_MSG_TYPE_ADD_RING,
+	MEMIF_MSG_TYPE_CONNECT,
+	MEMIF_MSG_TYPE_CONNECTED,
+	MEMIF_MSG_TYPE_DISCONNECT,
+} memif_msg_type_t;
+
+typedef enum {
+	MEMIF_RING_S2M, /**< buffer ring in direction slave -> master */
+	MEMIF_RING_M2S, /**< buffer ring in direction master -> slave */
+} memif_ring_type_t;
+
+typedef enum {
+	MEMIF_INTERFACE_MODE_ETHERNET,
+	MEMIF_INTERFACE_MODE_IP,
+	MEMIF_INTERFACE_MODE_PUNT_INJECT,
+} memif_interface_mode_t;
+
+typedef uint16_t memif_region_index_t;
+typedef uint32_t memif_region_offset_t;
+typedef uint64_t memif_region_size_t;
+typedef uint16_t memif_ring_index_t;
+typedef uint32_t memif_interface_id_t;
+typedef uint16_t memif_version_t;
+typedef uint8_t memif_log2_ring_size_t;
+
+/*
+ *  Socket messages
+ */
+
+ /**
+  * M2S
+  * Contains master interfaces configuration.
+  */
+typedef struct __rte_packed {
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+	memif_version_t min_version; /**< lowest supported memif version */
+	memif_version_t max_version; /**< highest supported memif version */
+	memif_region_index_t max_region; /**< maximum num of regions */
+	memif_ring_index_t max_m2s_ring; /**< maximum num of M2S ring */
+	memif_ring_index_t max_s2m_ring; /**< maximum num of S2M rings */
+	memif_log2_ring_size_t max_log2_ring_size; /**< maximum ring size (as log2) */
+} memif_msg_hello_t;
+
+/**
+ * S2M
+ * Contains information required to identify interface
+ * to which the slave wants to connect.
+ */
+typedef struct __rte_packed {
+	memif_version_t version;		/**< memif version */
+	memif_interface_id_t id;		/**< interface id */
+	memif_interface_mode_t mode:8;		/**< interface mode */
+	uint8_t secret[24];			/**< optional security parameter */
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+} memif_msg_init_t;
+
+/**
+ * S2M
+ * Request master to add new shared memory region to master interface.
+ * Shared files file descriptor is passed in cmsghdr.
+ */
+typedef struct __rte_packed {
+	memif_region_index_t index;		/**< shm regions index */
+	memif_region_size_t size;		/**< shm region size */
+} memif_msg_add_region_t;
+
+/**
+ * S2M
+ * Request master to add new ring to master interface.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_MSG_ADD_RING_FLAG_S2M 1		/**< ring is in S2M direction */
+	memif_ring_index_t index;		/**< ring index */
+	memif_region_index_t region; /**< region index on which this ring is located */
+	memif_region_offset_t offset;		/**< buffer start offset */
+	memif_log2_ring_size_t log2_ring_size;	/**< ring size (log2) */
+	uint16_t private_hdr_size;		/**< used for private metadata */
+} memif_msg_add_ring_t;
+
+/**
+ * S2M
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< slave interface name */
+} memif_msg_connect_t;
+
+/**
+ * M2S
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< master interface name */
+} memif_msg_connected_t;
+
+/**
+ * S2M & M2S
+ * Disconnect interfaces.
+ */
+typedef struct __rte_packed {
+	uint32_t code;				/**< error code */
+	uint8_t string[96];			/**< disconnect reason */
+} memif_msg_disconnect_t;
+
+typedef struct __rte_packed __rte_aligned(128)
+{
+	memif_msg_type_t type:16;
+	union {
+		memif_msg_hello_t hello;
+		memif_msg_init_t init;
+		memif_msg_add_region_t add_region;
+		memif_msg_add_ring_t add_ring;
+		memif_msg_connect_t connect;
+		memif_msg_connected_t connected;
+		memif_msg_disconnect_t disconnect;
+	};
+} memif_msg_t;
+
+/*
+ *  Ring and Descriptor Layout
+ */
+
+/**
+ * Buffer descriptor.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_DESC_FLAG_NEXT 1			/**< is chained buffer */
+	memif_region_index_t region; /**< region index on which the buffer is located */
+	uint32_t length;			/**< buffer length */
+	memif_region_offset_t offset;		/**< buffer offset */
+	uint32_t metadata;
+} memif_desc_t;
+
+#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
+	uint8_t mark[0] __rte_aligned(RTE_CACHE_LINE_SIZE)
+
+typedef struct {
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline0);
+	uint32_t cookie;			/**< MEMIF_COOKIE */
+	uint16_t flags;				/**< flags */
+#define MEMIF_RING_FLAG_MASK_INT 1		/**< disable interrupt mode */
+	volatile uint16_t head;			/**< pointer to ring buffer head */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline1);
+	volatile uint16_t tail;			/**< pointer to ring buffer tail */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline2);
+	memif_desc_t desc[0];			/**< buffer descriptors */
+} memif_ring_t;
+
+#endif				/* _MEMIF_H_ */
diff --git a/drivers/net/memif/memif_socket.c b/drivers/net/memif/memif_socket.c
new file mode 100644
index 000000000..98e8ceefd
--- /dev/null
+++ b/drivers/net/memif/memif_socket.c
@@ -0,0 +1,1124 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <errno.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_hash.h>
+#include <rte_jhash.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+static void memif_intr_handler(void *arg);
+
+static ssize_t
+memif_msg_send(int fd, memif_msg_t *msg, int afd)
+{
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	struct cmsghdr *cmsg;
+	char ctl[CMSG_SPACE(sizeof(int))];
+
+	iov[0].iov_base = msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+
+	if (afd > 0) {
+		memset(&ctl, 0, sizeof(ctl));
+		mh.msg_control = ctl;
+		mh.msg_controllen = sizeof(ctl);
+		cmsg = CMSG_FIRSTHDR(&mh);
+		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+		cmsg->cmsg_level = SOL_SOCKET;
+		cmsg->cmsg_type = SCM_RIGHTS;
+		rte_memcpy(CMSG_DATA(cmsg), &afd, sizeof(int));
+	}
+
+	return sendmsg(fd, &mh, 0);
+}
+
+static int
+memif_msg_send_from_queue(struct memif_control_channel *cc)
+{
+	ssize_t size;
+	int ret = 0;
+	struct memif_msg_queue_elt *e;
+
+	e = TAILQ_FIRST(&cc->msg_queue);
+	if (e == NULL)
+		return 0;
+
+	size = memif_msg_send(cc->intr_handle.fd, &e->msg, e->fd);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(ERR, "sendmsg fail: %s.", strerror(errno));
+		ret = -1;
+	} else {
+		MIF_LOG(DEBUG, "Sent msg type %u.", e->msg.type);
+	}
+	TAILQ_REMOVE(&cc->msg_queue, e, next);
+	rte_free(e);
+
+	return ret;
+}
+
+static struct memif_msg_queue_elt *
+memif_msg_enq(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e;
+
+	e = rte_zmalloc("memif_msg", sizeof(struct memif_msg_queue_elt), 0);
+	if (e == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control message.");
+		return NULL;
+	}
+
+	e->fd = -1;
+	TAILQ_INSERT_TAIL(&cc->msg_queue, e, next);
+
+	return e;
+}
+
+void
+memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			 int err_code)
+{
+	struct memif_msg_queue_elt *e;
+	struct pmd_internals *pmd;
+	memif_msg_disconnect_t *d;
+
+	if (cc == NULL) {
+		MIF_LOG(DEBUG, "Missing control channel.");
+		return;
+	}
+
+	e = memif_msg_enq(cc);
+	if (e == NULL) {
+		MIF_LOG(WARNING, "Failed to enqueue disconnect message.");
+		return;
+	}
+
+	d = &e->msg.disconnect;
+
+	e->msg.type = MEMIF_MSG_TYPE_DISCONNECT;
+	d->code = err_code;
+
+	if (reason != NULL) {
+		strlcpy((char *)d->string, reason, sizeof(d->string));
+		if (cc->dev != NULL) {
+			pmd = cc->dev->data->dev_private;
+			strlcpy(pmd->local_disc_string, reason,
+				sizeof(pmd->local_disc_string));
+		}
+	}
+}
+
+static int
+memif_msg_enq_hello(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	memif_msg_hello_t *h;
+
+	if (e == NULL)
+		return -1;
+
+	h = &e->msg.hello;
+
+	e->msg.type = MEMIF_MSG_TYPE_HELLO;
+	h->min_version = MEMIF_VERSION;
+	h->max_version = MEMIF_VERSION;
+	h->max_s2m_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_m2s_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_region = ETH_MEMIF_MAX_REGION_NUM - 1;
+	h->max_log2_ring_size = ETH_MEMIF_MAX_LOG2_RING_SIZE;
+
+	strlcpy((char *)h->name, rte_version(), sizeof(h->name));
+
+	return 0;
+}
+
+static int
+memif_msg_receive_hello(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_hello_t *h = &msg->hello;
+
+	if (h->min_version > MEMIF_VERSION || h->max_version < MEMIF_VERSION) {
+		memif_msg_enq_disconnect(pmd->cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	/* Set parameters for active connection */
+	pmd->run.num_s2m_rings = RTE_MIN(h->max_s2m_ring + 1,
+					   pmd->cfg.num_s2m_rings);
+	pmd->run.num_m2s_rings = RTE_MIN(h->max_m2s_ring + 1,
+					   pmd->cfg.num_m2s_rings);
+	pmd->run.log2_ring_size = RTE_MIN(h->max_log2_ring_size,
+					    pmd->cfg.log2_ring_size);
+	pmd->run.pkt_buffer_size = pmd->cfg.pkt_buffer_size;
+
+	strlcpy(pmd->remote_name, (char *)h->name, sizeof(pmd->remote_name));
+
+	MIF_LOG(DEBUG, "%s: Connecting to %s.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_init(struct memif_control_channel *cc, memif_msg_t *msg)
+{
+	memif_msg_init_t *i = &msg->init;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *pmd;
+	struct rte_eth_dev *dev;
+
+	if (i->version != MEMIF_VERSION) {
+		memif_msg_enq_disconnect(cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	if (cc->socket == NULL) {
+		memif_msg_enq_disconnect(cc, "Device error", 0);
+		return -1;
+	}
+
+	/* Find device with requested ID */
+	TAILQ_FOREACH(elt, &cc->socket->dev_queue, next) {
+		dev = elt->dev;
+		pmd = dev->data->dev_private;
+		if (((pmd->flags & ETH_MEMIF_FLAG_DISABLED) == 0) &&
+		    pmd->id == i->id) {
+			/* assign control channel to device */
+			cc->dev = dev;
+			pmd->cc = cc;
+
+			if (i->mode != MEMIF_INTERFACE_MODE_ETHERNET) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Only ethernet mode supported",
+							 0);
+				return -1;
+			}
+
+			if (pmd->flags & (ETH_MEMIF_FLAG_CONNECTING |
+					   ETH_MEMIF_FLAG_CONNECTED)) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Already connected", 0);
+				return -1;
+			}
+			strlcpy(pmd->remote_name, (char *)i->name,
+				sizeof(pmd->remote_name));
+
+			if (*pmd->secret != '\0') {
+				if (*i->secret == '\0') {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Secret required", 0);
+					return -1;
+				}
+				if (strncmp(pmd->secret, (char *)i->secret,
+						ETH_MEMIF_SECRET_SIZE) != 0) {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Incorrect secret", 0);
+					return -1;
+				}
+			}
+
+			pmd->flags |= ETH_MEMIF_FLAG_CONNECTING;
+			return 0;
+		}
+	}
+
+	/* ID not found on this socket */
+	MIF_LOG(DEBUG, "ID %u not found.", i->id);
+	memif_msg_enq_disconnect(cc, "ID not found", 0);
+	return -1;
+}
+
+static int
+memif_msg_receive_add_region(struct rte_eth_dev *dev, memif_msg_t *msg,
+			     int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_region_t *ar = &msg->add_region;
+	struct memif_region *r;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing region fd", 0);
+		return -1;
+	}
+
+	if (ar->index >= ETH_MEMIF_MAX_REGION_NUM || ar->index != pmd->regions_num ||
+			pmd->regions[ar->index] != NULL) {
+		memif_msg_enq_disconnect(pmd->cc, "Invalid region index", 0);
+		return -1;
+	}
+
+	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	r->fd = fd;
+	r->region_size = ar->size;
+	r->addr = NULL;
+
+	pmd->regions[ar->index] = r;
+	pmd->regions_num++;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_add_ring(struct rte_eth_dev *dev, memif_msg_t *msg, int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_ring_t *ar = &msg->add_ring;
+	struct memif_queue *mq;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing interrupt fd", 0);
+		return -1;
+	}
+
+	/* check if we have enough queues */
+	if (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) {
+		if (ar->index >= pmd->cfg.num_s2m_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_s2m_rings++;
+	} else {
+		if (ar->index >= pmd->cfg.num_m2s_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_m2s_rings++;
+	}
+
+	mq = (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) ?
+	    dev->data->rx_queues[ar->index] : dev->data->tx_queues[ar->index];
+
+	mq->intr_handle.fd = fd;
+	mq->log2_ring_size = ar->log2_ring_size;
+	mq->region = ar->region;
+	mq->ring_offset = ar->offset;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connect_t *c = &msg->connect;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connected(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connected_t *c = &msg->connected;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_disconnect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_disconnect_t *d = &msg->disconnect;
+
+	memset(pmd->remote_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	strlcpy(pmd->remote_disc_string, (char *)d->string,
+		sizeof(pmd->remote_disc_string));
+
+	MIF_LOG(INFO, "%s: Disconnect received: %s",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_disc_string);
+
+	memset(pmd->local_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	memif_disconnect(rte_eth_dev_allocated
+			 (rte_vdev_device_name(pmd->vdev)));
+	return 0;
+}
+
+static int
+memif_msg_enq_ack(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	e->msg.type = MEMIF_MSG_TYPE_ACK;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_init(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_init_t *i = &e->msg.init;
+
+	if (e == NULL)
+		return -1;
+
+	i = &e->msg.init;
+	e->msg.type = MEMIF_MSG_TYPE_INIT;
+	i->version = MEMIF_VERSION;
+	i->id = pmd->id;
+	i->mode = MEMIF_INTERFACE_MODE_ETHERNET;
+
+	strlcpy((char *)i->name, rte_version(), sizeof(i->name));
+
+	if (*pmd->secret != '\0')
+		strlcpy((char *)i->secret, pmd->secret, sizeof(i->secret));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_region(struct rte_eth_dev *dev, uint8_t idx)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_add_region_t *ar;
+	struct memif_region *mr = pmd->regions[idx];
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_region;
+	e->msg.type = MEMIF_MSG_TYPE_ADD_REGION;
+	e->fd = mr->fd;
+	ar->index = idx;
+	ar->size = mr->region_size;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_ring(struct rte_eth_dev *dev, uint8_t idx,
+		       memif_ring_type_t type)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	struct memif_queue *mq;
+	memif_msg_add_ring_t *ar;
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_ring;
+	mq = (type == MEMIF_RING_S2M) ? dev->data->tx_queues[idx] :
+	    dev->data->rx_queues[idx];
+
+	e->msg.type = MEMIF_MSG_TYPE_ADD_RING;
+	e->fd = mq->intr_handle.fd;
+	ar->index = idx;
+	ar->offset = mq->ring_offset;
+	ar->region = mq->region;
+	ar->log2_ring_size = mq->log2_ring_size;
+	ar->flags = (type == MEMIF_RING_S2M) ? MEMIF_MSG_ADD_RING_FLAG_S2M : 0;
+	ar->private_hdr_size = 0;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connect_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connect;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECT;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connected(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connected_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connected;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECTED;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static void
+memif_intr_unregister_handler(struct rte_intr_handle *intr_handle, void *arg)
+{
+	struct memif_msg_queue_elt *elt;
+	struct memif_control_channel *cc = arg;
+
+	/* close control channel fd */
+	close(intr_handle->fd);
+	/* clear message queue */
+	while ((elt = TAILQ_FIRST(&cc->msg_queue)) != NULL) {
+		TAILQ_REMOVE(&cc->msg_queue, elt, next);
+		rte_free(elt);
+	}
+	/* free control channel */
+	rte_free(cc);
+}
+
+void
+memif_disconnect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *elt, *next;
+	struct memif_queue *mq;
+	struct rte_intr_handle *ih;
+	int i;
+	int ret;
+
+	if (pmd->cc != NULL) {
+		/* Clear control message queue (except disconnect message if any). */
+		for (elt = TAILQ_FIRST(&pmd->cc->msg_queue); elt != NULL; elt = next) {
+			next = TAILQ_NEXT(elt, next);
+			if (elt->msg.type != MEMIF_MSG_TYPE_DISCONNECT) {
+				TAILQ_REMOVE(&pmd->cc->msg_queue, elt, next);
+				rte_free(elt);
+			}
+		}
+		/* send disconnect message (if there is any in queue) */
+		memif_msg_send_from_queue(pmd->cc);
+
+		/* at this point, there should be no more messages in queue */
+		if (TAILQ_FIRST(&pmd->cc->msg_queue) != NULL) {
+			MIF_LOG(WARNING,
+				"%s: Unexpected message(s) in message queue.",
+				rte_vdev_device_name(pmd->vdev));
+		}
+
+		ih = &pmd->cc->intr_handle;
+		if (ih->fd > 0) {
+			ret = rte_intr_callback_unregister(ih,
+							memif_intr_handler,
+							pmd->cc);
+			/*
+			 * If callback is active (disconnecting based on
+			 * received control message).
+			 */
+			if (ret == -EAGAIN) {
+				ret = rte_intr_callback_unregister_pending(ih,
+							memif_intr_handler,
+							pmd->cc,
+							memif_intr_unregister_handler);
+			} else if (ret > 0) {
+				close(ih->fd);
+				rte_free(pmd->cc);
+			}
+			pmd->cc = NULL;
+			if (ret <= 0)
+				MIF_LOG(WARNING, "%s: Failed to unregister "
+					"control channel callback.",
+					rte_vdev_device_name(pmd->vdev));
+		}
+	}
+
+	/* unconfig interrupts */
+	for (i = 0; i < pmd->cfg.num_s2m_rings; i++) {
+		if (pmd->role == MEMIF_ROLE_SLAVE) {
+			if (dev->data->tx_queues != NULL)
+				mq = dev->data->tx_queues[i];
+			else
+				continue;
+		} else {
+			if (dev->data->rx_queues != NULL)
+				mq = dev->data->rx_queues[i];
+			else
+				continue;
+		}
+		if (mq->intr_handle.fd > 0) {
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+	for (i = 0; i < pmd->cfg.num_m2s_rings; i++) {
+		if (pmd->role == MEMIF_ROLE_MASTER) {
+			if (dev->data->tx_queues != NULL)
+				mq = dev->data->tx_queues[i];
+			else
+				continue;
+		} else {
+			if (dev->data->rx_queues != NULL)
+				mq = dev->data->rx_queues[i];
+			else
+				continue;
+		}
+		if (mq->intr_handle.fd > 0) {
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+
+	memif_free_regions(pmd);
+
+	/* reset connection configuration */
+	memset(&pmd->run, 0, sizeof(pmd->run));
+
+	dev->data->dev_link.link_status = ETH_LINK_DOWN;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTED;
+	MIF_LOG(DEBUG, "%s: Disconnected.", rte_vdev_device_name(pmd->vdev));
+}
+
+static int
+memif_msg_receive(struct memif_control_channel *cc)
+{
+	char ctl[CMSG_SPACE(sizeof(int)) +
+		 CMSG_SPACE(sizeof(struct ucred))] = { 0 };
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	memif_msg_t msg = { 0 };
+	ssize_t size;
+	int ret = 0;
+	struct ucred *cr __rte_unused;
+	cr = 0;
+	struct cmsghdr *cmsg;
+	int afd = -1;
+	int i;
+	struct pmd_internals *pmd;
+
+	iov[0].iov_base = (void *)&msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+	mh.msg_control = ctl;
+	mh.msg_controllen = sizeof(ctl);
+
+	size = recvmsg(cc->intr_handle.fd, &mh, 0);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(DEBUG, "Invalid message size.");
+		memif_msg_enq_disconnect(cc, "Invalid message size", 0);
+		return -1;
+	}
+	MIF_LOG(DEBUG, "Received msg type: %u.", msg.type);
+
+	cmsg = CMSG_FIRSTHDR(&mh);
+	while (cmsg) {
+		if (cmsg->cmsg_level == SOL_SOCKET) {
+			if (cmsg->cmsg_type == SCM_CREDENTIALS)
+				cr = (struct ucred *)CMSG_DATA(cmsg);
+			else if (cmsg->cmsg_type == SCM_RIGHTS)
+				afd = *(int *)CMSG_DATA(cmsg);
+		}
+		cmsg = CMSG_NXTHDR(&mh, cmsg);
+	}
+
+	if (cc->dev == NULL && msg.type != MEMIF_MSG_TYPE_INIT) {
+		MIF_LOG(DEBUG, "Unexpected message.");
+		memif_msg_enq_disconnect(cc, "Unexpected message", 0);
+		return -1;
+	}
+
+	/* get device from hash data */
+	switch (msg.type) {
+	case MEMIF_MSG_TYPE_ACK:
+		break;
+	case MEMIF_MSG_TYPE_HELLO:
+		ret = memif_msg_receive_hello(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_init_regions_and_queues(cc->dev);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_init(cc->dev);
+		if (ret < 0)
+			goto exit;
+		pmd = cc->dev->data->dev_private;
+		for (i = 0; i < pmd->regions_num; i++) {
+			ret = memif_msg_enq_add_region(cc->dev, i);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_S2M);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_M2S);
+			if (ret < 0)
+				goto exit;
+		}
+		ret = memif_msg_enq_connect(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_INIT:
+		/*
+		 * This cc does not have an interface asociated with it.
+		 * If suitable interface is found it will be assigned here.
+		 */
+		ret = memif_msg_receive_init(cc, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_REGION:
+		ret = memif_msg_receive_add_region(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_RING:
+		ret = memif_msg_receive_add_ring(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECT:
+		ret = memif_msg_receive_connect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_connected(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECTED:
+		ret = memif_msg_receive_connected(cc->dev, &msg);
+		break;
+	case MEMIF_MSG_TYPE_DISCONNECT:
+		ret = memif_msg_receive_disconnect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		break;
+	default:
+		memif_msg_enq_disconnect(cc, "Unknown message type", 0);
+		ret = -1;
+		goto exit;
+	}
+
+ exit:
+	return ret;
+}
+
+static void
+memif_intr_handler(void *arg)
+{
+	struct memif_control_channel *cc = arg;
+	int ret;
+
+	ret = memif_msg_receive(cc);
+	/* if driver failed to assign device */
+	if (cc->dev == NULL) {
+		ret = rte_intr_callback_unregister_pending(&cc->intr_handle,
+							   memif_intr_handler,
+							   cc,
+							   memif_intr_unregister_handler);
+		if (ret < 0)
+			MIF_LOG(WARNING,
+				"Failed to unregister control channel callback.");
+		return;
+	}
+	/* if memif_msg_receive failed */
+	if (ret < 0)
+		goto disconnect;
+
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto disconnect;
+
+	return;
+
+ disconnect:
+	if (cc->dev == NULL) {
+		MIF_LOG(WARNING, "eth dev not allocated");
+		return;
+	}
+	memif_disconnect(cc->dev);
+}
+
+static void
+memif_listener_handler(void *arg)
+{
+	struct memif_socket *socket = arg;
+	int sockfd;
+	int addr_len;
+	struct sockaddr_un client;
+	struct memif_control_channel *cc;
+	int ret;
+
+	addr_len = sizeof(client);
+	sockfd = accept(socket->intr_handle.fd, (struct sockaddr *)&client,
+			(socklen_t *)&addr_len);
+	if (sockfd < 0) {
+		MIF_LOG(ERR,
+			"Failed to accept connection request on socket fd %d",
+			socket->intr_handle.fd);
+		return;
+	}
+
+	MIF_LOG(DEBUG, "%s: Connection request accepted.", socket->filename);
+
+	cc = rte_zmalloc("memif-cc", sizeof(struct memif_control_channel), 0);
+	if (cc == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control channel.");
+		goto error;
+	}
+
+	cc->intr_handle.fd = sockfd;
+	cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	cc->socket = socket;
+	cc->dev = NULL;
+	TAILQ_INIT(&cc->msg_queue);
+
+	ret = rte_intr_callback_register(&cc->intr_handle, memif_intr_handler, cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to register control channel callback.");
+		goto error;
+	}
+
+	ret = memif_msg_enq_hello(cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to enqueue hello message.");
+		goto error;
+	}
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto error;
+
+	return;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (cc != NULL)
+		rte_free(cc);
+}
+
+static struct memif_socket *
+memif_socket_create(struct pmd_internals *pmd, char *key, uint8_t listener)
+{
+	struct memif_socket *sock;
+	struct sockaddr_un un;
+	int sockfd;
+	int ret;
+	int on = 1;
+
+	sock = rte_zmalloc("memif-socket", sizeof(struct memif_socket), 0);
+	if (sock == NULL) {
+		MIF_LOG(ERR, "Failed to allocate memory for memif socket");
+		return NULL;
+	}
+
+	sock->listener = listener;
+	rte_memcpy(sock->filename, key, 256);
+	TAILQ_INIT(&sock->dev_queue);
+
+	if (listener != 0) {
+		sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+		if (sockfd < 0)
+			goto error;
+
+		un.sun_family = AF_UNIX;
+		memcpy(un.sun_path, sock->filename,
+			sizeof(un.sun_path) - 1);
+
+		ret = setsockopt(sockfd, SOL_SOCKET, SO_PASSCRED, &on,
+				 sizeof(on));
+		if (ret < 0)
+			goto error;
+		ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
+		if (ret < 0)
+			goto error;
+		ret = listen(sockfd, 1);
+		if (ret < 0)
+			goto error;
+
+		MIF_LOG(DEBUG, "%s: Memif listener socket %s created.",
+			rte_vdev_device_name(pmd->vdev), sock->filename);
+
+		sock->intr_handle.fd = sockfd;
+		sock->intr_handle.type = RTE_INTR_HANDLE_EXT;
+		ret = rte_intr_callback_register(&sock->intr_handle,
+						 memif_listener_handler, sock);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to register interrupt "
+				"callback for listener socket",
+				rte_vdev_device_name(pmd->vdev));
+			return NULL;
+		}
+	}
+
+	return sock;
+
+ error:
+	MIF_LOG(ERR, "%s: Failed to setup socket %s: %s",
+		rte_vdev_device_name(pmd->vdev), key, strerror(errno));
+	if (sock != NULL)
+		rte_free(sock);
+	return NULL;
+}
+
+static struct rte_hash *
+memif_create_socket_hash(void)
+{
+	struct rte_hash_parameters params = { 0 };
+	params.name = MEMIF_SOCKET_HASH_NAME;
+	params.entries = 256;
+	params.key_len = 256;
+	params.hash_func = rte_jhash;
+	params.hash_func_init_val = 0;
+	return rte_hash_create(&params);
+}
+
+int
+memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *tmp_pmd;
+	struct rte_hash *hash;
+	int ret;
+	char key[256];
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL) {
+		hash = memif_create_socket_hash();
+		if (hash == NULL) {
+			MIF_LOG(ERR, "Failed to create memif socket hash.");
+			return -1;
+		}
+	}
+
+	memset(key, 0, 256);
+	rte_memcpy(key, socket_filename, strlen(socket_filename));
+	ret = rte_hash_lookup_data(hash, key, (void **)&socket);
+	if (ret < 0) {
+		socket = memif_socket_create(pmd, key,
+					     (pmd->role ==
+					      MEMIF_ROLE_SLAVE) ? 0 : 1);
+		if (socket == NULL)
+			return -1;
+		ret = rte_hash_add_key_data(hash, key, socket);
+		if (ret < 0) {
+			MIF_LOG(ERR, "Failed to add socket to socket hash.");
+			return ret;
+		}
+	}
+	pmd->socket_filename = socket->filename;
+
+	if (socket->listener != 0 && pmd->role == MEMIF_ROLE_SLAVE) {
+		MIF_LOG(ERR, "Socket is a listener.");
+		return -1;
+	} else if ((socket->listener == 0) && (pmd->role == MEMIF_ROLE_MASTER)) {
+		MIF_LOG(ERR, "Socket is not a listener.");
+		return -1;
+	}
+
+	TAILQ_FOREACH(elt, &socket->dev_queue, next) {
+		tmp_pmd = elt->dev->data->dev_private;
+		if (tmp_pmd->id == pmd->id) {
+			MIF_LOG(ERR, "Memif device with id %d already "
+				"exists on socket %s",
+				pmd->id, socket->filename);
+			return -1;
+		}
+	}
+
+	elt = rte_malloc("pmd-queue", sizeof(struct memif_socket_dev_list_elt), 0);
+	if (elt == NULL) {
+		MIF_LOG(ERR, "%s: Failed to add device to socket device list.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+	elt->dev = dev;
+	TAILQ_INSERT_TAIL(&socket->dev_queue, elt, next);
+
+	return 0;
+}
+
+void
+memif_socket_remove_device(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt, *next;
+	struct rte_hash *hash;
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL)
+		return;
+
+	if (pmd->socket_filename == NULL)
+		return;
+
+	if (rte_hash_lookup_data(hash, pmd->socket_filename, (void **)&socket) < 0)
+		return;
+
+	for (elt = TAILQ_FIRST(&socket->dev_queue); elt != NULL; elt = next) {
+		next = TAILQ_NEXT(elt, next);
+		if (elt->dev == dev) {
+			TAILQ_REMOVE(&socket->dev_queue, elt, next);
+			rte_free(elt);
+			pmd->socket_filename = NULL;
+		}
+	}
+
+	/* remove socket, if this was the last device using it */
+	if (TAILQ_EMPTY(&socket->dev_queue)) {
+		rte_hash_del_key(hash, socket->filename);
+		if (socket->listener) {
+			/* remove listener socket file,
+			 * so we can create new one later.
+			 */
+			remove(socket->filename);
+		}
+		rte_free(socket);
+	}
+}
+
+int
+memif_connect_master(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	memset(pmd->remote_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+	return 0;
+}
+
+int
+memif_connect_slave(struct rte_eth_dev *dev)
+{
+	int sockfd;
+	int ret;
+	struct sockaddr_un sun;
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	memset(pmd->remote_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+
+	sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+	if (sockfd < 0) {
+		MIF_LOG(ERR, "%s: Failed to open socket.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	sun.sun_family = AF_UNIX;
+
+	memcpy(sun.sun_path, pmd->socket_filename, sizeof(sun.sun_path) - 1);
+
+	ret = connect(sockfd, (struct sockaddr *)&sun,
+		      sizeof(struct sockaddr_un));
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to connect socket: %s.",
+			rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+		goto error;
+	}
+
+	MIF_LOG(DEBUG, "%s: Memif socket: %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+
+	pmd->cc = rte_zmalloc("memif-cc",
+			      sizeof(struct memif_control_channel), 0);
+	if (pmd->cc == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate control channel.",
+			rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	pmd->cc->intr_handle.fd = sockfd;
+	pmd->cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	pmd->cc->socket = NULL;
+	pmd->cc->dev = dev;
+	TAILQ_INIT(&pmd->cc->msg_queue);
+
+	ret = rte_intr_callback_register(&pmd->cc->intr_handle,
+					 memif_intr_handler, pmd->cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to register interrupt callback "
+			"for control fd", rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	return 0;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (pmd->cc != NULL) {
+		rte_free(pmd->cc);
+		pmd->cc = NULL;
+	}
+	return -1;
+}
diff --git a/drivers/net/memif/memif_socket.h b/drivers/net/memif/memif_socket.h
new file mode 100644
index 000000000..c60614721
--- /dev/null
+++ b/drivers/net/memif/memif_socket.h
@@ -0,0 +1,105 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_SOCKET_H_
+#define _MEMIF_SOCKET_H_
+
+#include <sys/queue.h>
+
+/**
+ * Remove device from socket device list. If no device is left on the socket,
+ * remove the socket as well.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_socket_remove_device(struct rte_eth_dev *dev);
+
+/**
+ * Enqueue disconnect message to control channel message queue.
+ *
+ * @param cc
+ *   control channel
+ * @param reason
+ *   const string stating disconnect reason (96 characters)
+ * @param err_code
+ *   error code
+ */
+void memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			      int err_code);
+
+/**
+ * Initialize memif socket for specified device. If socket doesn't exist, create socket.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @param socket_filename
+ *   socket filename
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename);
+
+/**
+ * Disconnect memif device. Close control channel and shared memory.
+ *
+ * @param dev
+ *   ethernet device
+ */
+void memif_disconnect(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, enable connection establishment.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_master(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, send connection request.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_slave(struct rte_eth_dev *dev);
+
+struct memif_socket_dev_list_elt {
+	TAILQ_ENTRY(memif_socket_dev_list_elt) next;
+	struct rte_eth_dev *dev;		/**< pointer to device internals */
+	char dev_name[RTE_ETH_NAME_MAX_LEN];
+};
+
+#define MEMIF_SOCKET_HASH_NAME			"memif-sh"
+struct memif_socket {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	char filename[256];			/**< socket filename */
+
+	TAILQ_HEAD(, memif_socket_dev_list_elt) dev_queue;
+	/**< Queue of devices using this socket */
+	uint8_t listener;			/**< if not zero socket is listener */
+};
+
+/* Control message queue. */
+struct memif_msg_queue_elt {
+	memif_msg_t msg;			/**< control message */
+	TAILQ_ENTRY(memif_msg_queue_elt) next;
+	int fd;					/**< fd to be sent to peer */
+};
+
+struct memif_control_channel {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	TAILQ_HEAD(, memif_msg_queue_elt) msg_queue; /**< control message queue */
+	struct memif_socket *socket;		/**< pointer to socket */
+	struct rte_eth_dev *dev;		/**< pointer to device */
+};
+
+#endif				/* MEMIF_SOCKET_H */
diff --git a/drivers/net/memif/meson.build b/drivers/net/memif/meson.build
new file mode 100644
index 000000000..4dfe37d3a
--- /dev/null
+++ b/drivers/net/memif/meson.build
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+if host_machine.system() != 'linux'
+        build = false
+endif
+
+sources = files('rte_eth_memif.c',
+		'memif_socket.c')
+
+allow_experimental_apis = true
+
+deps += ['hash']
diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
new file mode 100644
index 000000000..8700e7589
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.c
@@ -0,0 +1,1198 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <linux/if_ether.h>
+#include <errno.h>
+#include <sys/eventfd.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+#define ETH_MEMIF_ID_ARG		"id"
+#define ETH_MEMIF_ROLE_ARG		"role"
+#define ETH_MEMIF_PKT_BUFFER_SIZE_ARG	"bsize"
+#define ETH_MEMIF_RING_SIZE_ARG		"rsize"
+#define ETH_MEMIF_SOCKET_ARG		"socket"
+#define ETH_MEMIF_MAC_ARG		"mac"
+#define ETH_MEMIF_ZC_ARG		"zero-copy"
+#define ETH_MEMIF_SECRET_ARG		"secret"
+
+static const char *valid_arguments[] = {
+	ETH_MEMIF_ID_ARG,
+	ETH_MEMIF_ROLE_ARG,
+	ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
+	ETH_MEMIF_RING_SIZE_ARG,
+	ETH_MEMIF_SOCKET_ARG,
+	ETH_MEMIF_MAC_ARG,
+	ETH_MEMIF_ZC_ARG,
+	ETH_MEMIF_SECRET_ARG,
+	NULL
+};
+
+const char *
+memif_version(void)
+{
+	return ("memif-" RTE_STR(MEMIF_VERSION_MAJOR) "." RTE_STR(MEMIF_VERSION_MINOR));
+}
+
+static void
+memif_dev_info(struct rte_eth_dev *dev __rte_unused, struct rte_eth_dev_info *dev_info)
+{
+	dev_info->max_mac_addrs = 1;
+	dev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;
+	dev_info->max_rx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->max_tx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->min_rx_bufsize = 0;
+}
+
+static memif_ring_t *
+memif_get_ring(struct pmd_internals *pmd, memif_ring_type_t type, uint16_t ring_num)
+{
+	/* rings only in region 0 */
+	void *p = pmd->regions[0]->addr;
+	int ring_size = sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size);
+
+	p = (uint8_t *)p + (ring_num + type * pmd->run.num_s2m_rings) * ring_size;
+
+	return (memif_ring_t *)p;
+}
+
+static void *
+memif_get_buffer(struct pmd_internals *pmd, memif_desc_t *d)
+{
+	return ((uint8_t *)pmd->regions[d->region]->addr + d->offset);
+}
+
+static int
+memif_pktmbuf_chain(struct rte_mbuf *head, struct rte_mbuf *cur_tail,
+		    struct rte_mbuf *tail)
+{
+	/* Check for number-of-segments-overflow */
+	if (unlikely(head->nb_segs + tail->nb_segs > RTE_MBUF_MAX_NB_SEGS))
+		return -EOVERFLOW;
+
+	/* Chain 'tail' onto the old tail */
+	cur_tail->next = tail;
+
+	/* accumulate number of segments and total length. */
+	head->nb_segs = (uint16_t)(head->nb_segs + tail->nb_segs);
+
+	tail->pkt_len = tail->data_len;
+	head->pkt_len += tail->pkt_len;
+
+	return 0;
+}
+
+static uint16_t
+eth_memif_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t cur_slot, last_slot, n_slots, ring_size, mask, s0;
+	uint16_t n_rx_pkts = 0;
+	uint16_t mbuf_size = rte_pktmbuf_data_room_size(mq->mempool) -
+		RTE_PKTMBUF_HEADROOM;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf, *mbuf_head, *mbuf_tail;
+	uint64_t b;
+	ssize_t size __rte_unused;
+	uint16_t head;
+	int ret;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	/* consume interrupt */
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0)
+		size = read(mq->intr_handle.fd, &b, sizeof(b));
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	cur_slot = (type == MEMIF_RING_S2M) ? mq->last_head : mq->last_tail;
+	last_slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+	if (cur_slot == last_slot)
+		goto refill;
+	n_slots = last_slot - cur_slot;
+
+	while (n_slots && n_rx_pkts < nb_pkts) {
+		mbuf_head = rte_pktmbuf_alloc(mq->mempool);
+		if (unlikely(mbuf_head == NULL))
+			goto no_free_bufs;
+		mbuf = mbuf_head;
+		mbuf->port = mq->in_port;
+
+next_slot:
+		s0 = cur_slot & mask;
+		d0 = &ring->desc[s0];
+
+		src_len = d0->length;
+		dst_off = 0;
+		src_off = 0;
+
+		do {
+			dst_len = mbuf_size - dst_off;
+			if (dst_len == 0) {
+				dst_off = 0;
+				dst_len = mbuf_size;
+
+				/* store pointer to tail */
+				mbuf_tail = mbuf;
+				mbuf = rte_pktmbuf_alloc(mq->mempool);
+				if (unlikely(mbuf == NULL))
+					goto no_free_bufs;
+				mbuf->port = mq->in_port;
+				ret = memif_pktmbuf_chain(mbuf_head, mbuf_tail, mbuf);
+				if (unlikely(ret < 0)) {
+					MIF_LOG(ERR, "%s: number-of-segments-overflow",
+						rte_vdev_device_name(pmd->vdev));
+					rte_pktmbuf_free(mbuf);
+					goto no_free_bufs;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			rte_pktmbuf_data_len(mbuf) += cp_len;
+			rte_pktmbuf_pkt_len(mbuf) = rte_pktmbuf_data_len(mbuf);
+			if (mbuf != mbuf_head)
+				rte_pktmbuf_pkt_len(mbuf_head) += cp_len;
+
+			memcpy(rte_pktmbuf_mtod_offset(mbuf, void *, dst_off),
+			       (uint8_t *)memif_get_buffer(pmd, d0) + src_off, cp_len);
+
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+		} while (src_len);
+
+		cur_slot++;
+		n_slots--;
+
+		if (d0->flags & MEMIF_DESC_FLAG_NEXT)
+			goto next_slot;
+
+		mq->n_bytes += rte_pktmbuf_pkt_len(mbuf_head);
+		*bufs++ = mbuf_head;
+		n_rx_pkts++;
+	}
+
+no_free_bufs:
+	if (type == MEMIF_RING_S2M) {
+		rte_mb();
+		ring->tail = cur_slot;
+		mq->last_head = cur_slot;
+	} else {
+		mq->last_tail = cur_slot;
+	}
+
+refill:
+	if (type == MEMIF_RING_M2S) {
+		head = ring->head;
+		n_slots = ring_size - head + mq->last_tail;
+
+		while (n_slots--) {
+			s0 = head++ & mask;
+			d0 = &ring->desc[s0];
+			d0->length = pmd->run.pkt_buffer_size;
+		}
+		rte_mb();
+		ring->head = head;
+	}
+
+	mq->n_pkts += n_rx_pkts;
+	return n_rx_pkts;
+}
+
+static uint16_t
+eth_memif_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t slot, saved_slot, n_free, ring_size, mask, n_tx_pkts = 0;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head;
+	uint64_t a;
+	ssize_t size;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	n_free = ring->tail - mq->last_tail;
+	mq->last_tail += n_free;
+	slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+
+	if (type == MEMIF_RING_S2M)
+		n_free = ring_size - ring->head + mq->last_tail;
+	else
+		n_free = ring->head - ring->tail;
+
+	while (n_tx_pkts < nb_pkts && n_free) {
+		mbuf_head = *bufs++;
+		mbuf = mbuf_head;
+
+		saved_slot = slot;
+		d0 = &ring->desc[slot & mask];
+		dst_off = 0;
+		dst_len = (type == MEMIF_RING_S2M) ?
+			pmd->run.pkt_buffer_size : d0->length;
+
+next_in_chain:
+		src_off = 0;
+		src_len = rte_pktmbuf_data_len(mbuf);
+
+		while (src_len) {
+			if (dst_len == 0) {
+				if (n_free) {
+					slot++;
+					n_free--;
+					d0->flags |= MEMIF_DESC_FLAG_NEXT;
+					d0 = &ring->desc[slot & mask];
+					dst_off = 0;
+					dst_len = (type == MEMIF_RING_S2M) ?
+					    pmd->run.pkt_buffer_size : d0->length;
+					d0->flags = 0;
+				} else {
+					slot = saved_slot;
+					goto no_free_slots;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			memcpy((uint8_t *)memif_get_buffer(pmd, d0) + dst_off,
+			       rte_pktmbuf_mtod_offset(mbuf, void *, src_off),
+			       cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+			dst_len -= cp_len;
+
+			d0->length = dst_off;
+		}
+
+		if (rte_pktmbuf_is_contiguous(mbuf) == 0) {
+			mbuf = mbuf->next;
+			goto next_in_chain;
+		}
+
+		n_tx_pkts++;
+		slot++;
+		n_free--;
+		rte_pktmbuf_free(mbuf_head);
+	}
+
+no_free_slots:
+	rte_mb();
+	if (type == MEMIF_RING_S2M)
+		ring->head = slot;
+	else
+		ring->tail = slot;
+
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
+		a = 1;
+		size = write(mq->intr_handle.fd, &a, sizeof(a));
+		if (unlikely(size < 0)) {
+			MIF_LOG(WARNING,
+				"%s: Failed to send interrupt. %s",
+				rte_vdev_device_name(pmd->vdev), strerror(errno));
+		}
+	}
+
+	mq->n_err += nb_pkts - n_tx_pkts;
+	mq->n_pkts += n_tx_pkts;
+	return n_tx_pkts;
+}
+
+void
+memif_free_regions(struct pmd_internals *pmd)
+{
+	int i;
+	struct memif_region *r;
+
+	/* regions are allocated contiguously, so it's
+	 * enough to loop until 'pmd->regions_num'
+	 */
+	for (i = 0; i < pmd->regions_num; i++) {
+		r = pmd->regions[i];
+		if (r != NULL) {
+			if (r->addr != NULL) {
+				munmap(r->addr, r->region_size);
+				if (r->fd > 0) {
+					close(r->fd);
+					r->fd = -1;
+				}
+			}
+			rte_free(r);
+			pmd->regions[i] = NULL;
+		}
+	}
+	pmd->regions_num = 0;
+}
+
+static int
+memif_region_init_shm(struct pmd_internals *pmd, uint8_t has_buffers)
+{
+	char shm_name[ETH_MEMIF_SHM_NAME_SIZE];
+	int ret = 0;
+	struct memif_region *r;
+
+	if (pmd->regions_num >= ETH_MEMIF_MAX_REGION_NUM) {
+		MIF_LOG(ERR, "%s: Too many regions.", rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	/* calculate buffer offset */
+	r->pkt_buffer_offset = (pmd->run.num_s2m_rings + pmd->run.num_m2s_rings) *
+	    (sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size));
+
+	r->region_size = r->pkt_buffer_offset;
+	/* if region has buffers, add buffers size to region_size */
+	if (has_buffers == 1)
+		r->region_size += (uint32_t)(pmd->run.pkt_buffer_size *
+			(1 << pmd->run.log2_ring_size) *
+			(pmd->run.num_s2m_rings +
+			 pmd->run.num_m2s_rings));
+
+	memset(shm_name, 0, sizeof(char) * ETH_MEMIF_SHM_NAME_SIZE);
+	snprintf(shm_name, ETH_MEMIF_SHM_NAME_SIZE, "memif_region_%d",
+		 pmd->regions_num);
+
+	r->fd = memfd_create(shm_name, MFD_ALLOW_SEALING);
+	if (r->fd < 0) {
+		MIF_LOG(ERR, "%s: Failed to create shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		ret = -1;
+		goto error;
+	}
+
+	ret = fcntl(r->fd, F_ADD_SEALS, F_SEAL_SHRINK);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to add seals to shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		goto error;
+	}
+
+	ret = ftruncate(r->fd, r->region_size);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to truncate shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		goto error;
+	}
+
+	r->addr = mmap(NULL, r->region_size, PROT_READ |
+		       PROT_WRITE, MAP_SHARED, r->fd, 0);
+	if (r->addr == MAP_FAILED) {
+		MIF_LOG(ERR, "%s: Failed to mmap shm region: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(ret));
+		ret = -1;
+		goto error;
+	}
+
+	pmd->regions[pmd->regions_num] = r;
+	pmd->regions_num++;
+
+	return ret;
+
+error:
+	if (r->fd > 0)
+		close(r->fd);
+	r->fd = -1;
+
+	return ret;
+}
+
+static int
+memif_regions_init(struct pmd_internals *pmd)
+{
+	int ret;
+
+	/* create one buffer region */
+	ret = memif_region_init_shm(pmd, /* has buffer */ 1);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+static void
+memif_init_rings(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_ring_t *ring;
+	int i, j;
+	uint16_t slot;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = i * (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 0;
+			ring->desc[j].offset = pmd->regions[0]->pkt_buffer_offset +
+				(uint32_t)(slot * pmd->run.pkt_buffer_size);
+			ring->desc[j].length = pmd->run.pkt_buffer_size;
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = (i + pmd->run.num_s2m_rings) *
+			    (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 0;
+			ring->desc[j].offset = pmd->regions[0]->pkt_buffer_offset +
+				(uint32_t)(slot * pmd->run.pkt_buffer_size);
+			ring->desc[j].length = pmd->run.pkt_buffer_size;
+		}
+	}
+}
+
+/* called only by slave */
+static void
+memif_init_queues(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = dev->data->tx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->ring_offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for tx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = dev->data->rx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->ring_offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for rx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+}
+
+int
+memif_init_regions_and_queues(struct rte_eth_dev *dev)
+{
+	int ret;
+
+	ret = memif_regions_init(dev->data->dev_private);
+	if (ret < 0)
+		return ret;
+
+	memif_init_rings(dev);
+
+	memif_init_queues(dev);
+
+	return 0;
+}
+
+int
+memif_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_region *mr;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		mr = pmd->regions[i];
+		if (mr != NULL) {
+			if (mr->addr == NULL) {
+				if (mr->fd < 0)
+					return -1;
+				mr->addr = mmap(NULL, mr->region_size,
+						PROT_READ | PROT_WRITE,
+						MAP_SHARED, mr->fd, 0);
+				if (mr->addr == NULL)
+					return -1;
+			}
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
+			    mq->ring_offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_MASTER)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
+			    mq->ring_offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_SLAVE)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags |= ETH_MEMIF_FLAG_CONNECTED;
+	dev->data->dev_link.link_status = ETH_LINK_UP;
+	MIF_LOG(INFO, "%s: Connected.", rte_vdev_device_name(pmd->vdev));
+	return 0;
+}
+
+static int
+memif_dev_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int ret = 0;
+
+	switch (pmd->role) {
+	case MEMIF_ROLE_SLAVE:
+		ret = memif_connect_slave(dev);
+		break;
+	case MEMIF_ROLE_MASTER:
+		ret = memif_connect_master(dev);
+		break;
+	default:
+		MIF_LOG(ERR, "%s: Unknown role: %d.",
+			rte_vdev_device_name(pmd->vdev), pmd->role);
+		ret = -1;
+		break;
+	}
+
+	return ret;
+}
+
+static void
+memif_dev_close(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memif_msg_enq_disconnect(pmd->cc, "Device closed", 0);
+	memif_disconnect(dev);
+
+	memif_socket_remove_device(dev);
+}
+
+static int
+memif_dev_configure(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	/*
+	 * SLAVE - TXQ
+	 * MASTER - RXQ
+	 */
+	pmd->cfg.num_s2m_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_tx_queues : dev->data->nb_rx_queues;
+
+	/*
+	 * SLAVE - RXQ
+	 * MASTER - TXQ
+	 */
+	pmd->cfg.num_m2s_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_rx_queues : dev->data->nb_tx_queues;
+
+	return 0;
+}
+
+static int
+memif_tx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_tx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("tx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate tx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type =
+	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_S2M : MEMIF_RING_M2S;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->pmd = pmd;
+	dev->data->tx_queues[qid] = mq;
+
+	return 0;
+}
+
+static int
+memif_rx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_rx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_rxconf *rx_conf __rte_unused,
+		     struct rte_mempool *mb_pool)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("rx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate rx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type = (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_M2S : MEMIF_RING_S2M;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->mempool = mb_pool;
+	mq->in_port = dev->data->port_id;
+	mq->pmd = pmd;
+	dev->data->rx_queues[qid] = mq;
+
+	return 0;
+}
+
+static void
+memif_queue_release(void *queue)
+{
+	struct memif_queue *mq = (struct memif_queue *)queue;
+
+	if (!mq)
+		return;
+
+	rte_free(mq);
+}
+
+static int
+memif_link_update(struct rte_eth_dev *dev __rte_unused,
+		  int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
+static int
+memif_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+	uint8_t tmp, nq;
+
+	stats->ipackets = 0;
+	stats->ibytes = 0;
+	stats->opackets = 0;
+	stats->obytes = 0;
+	stats->oerrors = 0;
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_s2m_rings :
+	    pmd->run.num_m2s_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* RX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->rx_queues[i];
+		stats->q_ipackets[i] = mq->n_pkts;
+		stats->q_ibytes[i] = mq->n_bytes;
+		stats->ipackets += mq->n_pkts;
+		stats->ibytes += mq->n_bytes;
+	}
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_m2s_rings :
+	    pmd->run.num_s2m_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* TX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->tx_queues[i];
+		stats->q_opackets[i] = mq->n_pkts;
+		stats->q_obytes[i] = mq->n_bytes;
+		stats->q_errors[i] = mq->n_err;
+		stats->opackets += mq->n_pkts;
+		stats->obytes += mq->n_bytes;
+		stats->oerrors += mq->n_err;
+	}
+	return 0;
+}
+
+static void
+memif_stats_reset(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int i;
+	struct memif_queue *mq;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->tx_queues[i] :
+		    dev->data->rx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->rx_queues[i] :
+		    dev->data->tx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+}
+
+static int
+memif_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	MIF_LOG(WARNING, "%s: Interrupt mode not supported.",
+		rte_vdev_device_name(pmd->vdev));
+
+	return -1;
+}
+
+static int
+memif_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd __rte_unused = dev->data->dev_private;
+
+	return 0;
+}
+
+static const struct eth_dev_ops ops = {
+	.dev_start = memif_dev_start,
+	.dev_close = memif_dev_close,
+	.dev_infos_get = memif_dev_info,
+	.dev_configure = memif_dev_configure,
+	.tx_queue_setup = memif_tx_queue_setup,
+	.rx_queue_setup = memif_rx_queue_setup,
+	.rx_queue_release = memif_queue_release,
+	.tx_queue_release = memif_queue_release,
+	.rx_queue_intr_enable = memif_rx_queue_intr_enable,
+	.rx_queue_intr_disable = memif_rx_queue_intr_disable,
+	.link_update = memif_link_update,
+	.stats_get = memif_stats_get,
+	.stats_reset = memif_stats_reset,
+};
+
+static int
+memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
+	     memif_interface_id_t id, uint32_t flags,
+	     const char *socket_filename,
+	     memif_log2_ring_size_t log2_ring_size,
+	     uint16_t pkt_buffer_size, const char *secret,
+	     struct ether_addr *ether_addr)
+{
+	int ret = 0;
+	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev_data *data;
+	struct pmd_internals *pmd;
+	const unsigned int numa_node = vdev->device.numa_node;
+	const char *name = rte_vdev_device_name(vdev);
+
+	if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
+		MIF_LOG(ERR, "Zero-copy slave not supported.");
+		return -1;
+	}
+
+	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
+	if (eth_dev == NULL) {
+		MIF_LOG(ERR, "%s: Unable to allocate device struct.", name);
+		return -1;
+	}
+
+	pmd = eth_dev->data->dev_private;
+	memset(pmd, 0, sizeof(*pmd));
+
+	pmd->vdev = vdev;
+	pmd->id = id;
+	pmd->flags = flags;
+	pmd->flags |= ETH_MEMIF_FLAG_DISABLED;
+	pmd->role = role;
+
+	ret = memif_socket_init(eth_dev, socket_filename);
+	if (ret < 0)
+		return ret;
+
+	memset(pmd->secret, 0, sizeof(char) * ETH_MEMIF_SECRET_SIZE);
+	if (secret != NULL)
+		strlcpy(pmd->secret, secret, sizeof(pmd->secret));
+
+	pmd->cfg.log2_ring_size = log2_ring_size;
+	/* set in .dev_configure() */
+	pmd->cfg.num_s2m_rings = 0;
+	pmd->cfg.num_m2s_rings = 0;
+
+	pmd->cfg.pkt_buffer_size = pkt_buffer_size;
+
+	data = eth_dev->data;
+	data->dev_private = pmd;
+	data->numa_node = numa_node;
+	data->mac_addrs = ether_addr;
+
+	eth_dev->dev_ops = &ops;
+	eth_dev->device = &vdev->device;
+	eth_dev->rx_pkt_burst = eth_memif_rx;
+	eth_dev->tx_pkt_burst = eth_memif_tx;
+
+	eth_dev->data->dev_flags &= RTE_ETH_DEV_CLOSE_REMOVE;
+
+	rte_eth_dev_probing_finish(eth_dev);
+
+	return 0;
+}
+
+static int
+memif_set_role(const char *key __rte_unused, const char *value,
+	       void *extra_args)
+{
+	enum memif_role_t *role = (enum memif_role_t *)extra_args;
+
+	if (strstr(value, "master") != NULL) {
+		*role = MEMIF_ROLE_MASTER;
+	} else if (strstr(value, "slave") != NULL) {
+		*role = MEMIF_ROLE_SLAVE;
+	} else {
+		MIF_LOG(ERR, "Unknown role: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_zc(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	uint32_t *flags = (uint32_t *)extra_args;
+
+	if (strstr(value, "yes") != NULL) {
+		*flags |= ETH_MEMIF_FLAG_ZERO_COPY;
+	} else if (strstr(value, "no") != NULL) {
+		*flags &= ~ETH_MEMIF_FLAG_ZERO_COPY;
+	} else {
+		MIF_LOG(ERR, "Failed to parse zero-copy param: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_id(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	memif_interface_id_t *id = (memif_interface_id_t *)extra_args;
+
+	/* even if parsing fails, 0 is a valid id */
+	*id = strtoul(value, NULL, 10);
+	return 0;
+}
+
+static int
+memif_set_bs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	uint16_t *pkt_buffer_size = (uint16_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > 0xFFFF) {
+		MIF_LOG(ERR, "Invalid buffer size: %s.", value);
+		return -EINVAL;
+	}
+	*pkt_buffer_size = tmp;
+	return 0;
+}
+
+static int
+memif_set_rs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	memif_log2_ring_size_t *log2_ring_size =
+	    (memif_log2_ring_size_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > ETH_MEMIF_MAX_LOG2_RING_SIZE) {
+		MIF_LOG(ERR, "Invalid ring size: %s (max %u).",
+			value, ETH_MEMIF_MAX_LOG2_RING_SIZE);
+		return -EINVAL;
+	}
+	*log2_ring_size = tmp;
+	return 0;
+}
+
+/* check if directory exists and if we have permission to read/write */
+static int
+memif_check_socket_filename(const char *filename)
+{
+	char *dir = NULL, *tmp;
+	uint32_t idx;
+	int ret = 0;
+
+	tmp = strrchr(filename, '/');
+	if (tmp != NULL) {
+		idx = tmp - filename;
+		dir = rte_zmalloc("memif_tmp", sizeof(char) * (idx + 1), 0);
+		if (dir == NULL) {
+			MIF_LOG(ERR, "Failed to allocate memory.");
+			return -1;
+		}
+		strlcpy(dir, filename, sizeof(char) * (idx + 1));
+	}
+
+	if (dir == NULL || (faccessat(-1, dir, F_OK | R_OK |
+					W_OK, AT_EACCESS) < 0)) {
+		MIF_LOG(ERR, "Invalid directory: '%s'.", dir);
+		ret = -EINVAL;
+	}
+
+	if (dir != NULL)
+		rte_free(dir);
+
+	return ret;
+}
+
+static int
+memif_set_socket_filename(const char *key __rte_unused, const char *value,
+			  void *extra_args)
+{
+	const char **socket_filename = (const char **)extra_args;
+
+	*socket_filename = value;
+	return memif_check_socket_filename(*socket_filename);
+}
+
+static int
+memif_set_mac(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	struct ether_addr *ether_addr = (struct ether_addr *)extra_args;
+	int ret = 0;
+
+	ret = sscanf(value, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
+	       &ether_addr->addr_bytes[0], &ether_addr->addr_bytes[1],
+	       &ether_addr->addr_bytes[2], &ether_addr->addr_bytes[3],
+	       &ether_addr->addr_bytes[4], &ether_addr->addr_bytes[5]);
+	if (ret != 6)
+		MIF_LOG(WARNING, "Failed to parse mac '%s'.", value);
+	return 0;
+}
+
+static int
+memif_set_secret(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	const char **secret = (const char **)extra_args;
+
+	*secret = value;
+	return 0;
+}
+
+static int
+rte_pmd_memif_probe(struct rte_vdev_device *vdev)
+{
+	RTE_BUILD_BUG_ON(sizeof(memif_msg_t) != 128);
+	RTE_BUILD_BUG_ON(sizeof(memif_desc_t) != 16);
+	int ret = 0;
+	struct rte_kvargs *kvlist;
+	const char *name = rte_vdev_device_name(vdev);
+	enum memif_role_t role = MEMIF_ROLE_SLAVE;
+	memif_interface_id_t id = 0;
+	uint16_t pkt_buffer_size = ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE;
+	memif_log2_ring_size_t log2_ring_size = ETH_MEMIF_DEFAULT_RING_SIZE;
+	const char *socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME;
+	uint32_t flags = 0;
+	const char *secret = NULL;
+	struct ether_addr *ether_addr = rte_zmalloc("", sizeof(struct ether_addr), 0);
+
+	eth_random_addr(ether_addr->addr_bytes);
+
+	MIF_LOG(INFO, "Initialize MEMIF: %s.", name);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		MIF_LOG(ERR, "Multi-processing not supported for memif.");
+		/* TODO:
+		 * Request connection information.
+		 *
+		 * Once memif in the primary process is connected,
+		 * broadcast connection information.
+		 */
+		return -1;
+	}
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_arguments);
+
+	/* parse parameters */
+	if (kvlist != NULL) {
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ROLE_ARG,
+					 &memif_set_role, &role);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ID_ARG,
+					 &memif_set_id, &id);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
+					 &memif_set_bs, &pkt_buffer_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_RING_SIZE_ARG,
+					 &memif_set_rs, &log2_ring_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SOCKET_ARG,
+					 &memif_set_socket_filename,
+					 (void *)(&socket_filename));
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_MAC_ARG,
+					 &memif_set_mac, ether_addr);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ZC_ARG,
+					 &memif_set_zc, &flags);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SECRET_ARG,
+					 &memif_set_secret, (void *)(&secret));
+		if (ret < 0)
+			goto exit;
+	}
+
+	/* create interface */
+	ret = memif_create(vdev, role, id, flags, socket_filename,
+			   log2_ring_size, pkt_buffer_size, secret, ether_addr);
+
+exit:
+	if (kvlist != NULL)
+		rte_kvargs_free(kvlist);
+	return ret;
+}
+
+static int
+rte_pmd_memif_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_eth_dev *eth_dev;
+
+	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+	if (eth_dev == NULL)
+		return 0;
+
+	rte_eth_dev_close(eth_dev->data->port_id);
+
+	return 0;
+}
+
+static struct rte_vdev_driver pmd_memif_drv = {
+	.probe = rte_pmd_memif_probe,
+	.remove = rte_pmd_memif_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_memif, pmd_memif_drv);
+
+RTE_PMD_REGISTER_PARAM_STRING(net_memif,
+			      ETH_MEMIF_ID_ARG "=<int>"
+			      ETH_MEMIF_ROLE_ARG "=master|slave"
+			      ETH_MEMIF_PKT_BUFFER_SIZE_ARG "=<int>"
+			      ETH_MEMIF_RING_SIZE_ARG "=<int>"
+			      ETH_MEMIF_SOCKET_ARG "=<string>"
+			      ETH_MEMIF_MAC_ARG "=xx:xx:xx:xx:xx:xx"
+			      ETH_MEMIF_ZC_ARG "=yes|no"
+			      ETH_MEMIF_SECRET_ARG "=<string>");
+
+int memif_logtype;
+
+RTE_INIT(memif_init_log)
+{
+	memif_logtype = rte_log_register("pmd.net.memif");
+	if (memif_logtype >= 0)
+		rte_log_set_level(memif_logtype, RTE_LOG_NOTICE);
+}
diff --git a/drivers/net/memif/rte_eth_memif.h b/drivers/net/memif/rte_eth_memif.h
new file mode 100644
index 000000000..8cee2643c
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.h
@@ -0,0 +1,212 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _RTE_ETH_MEMIF_H_
+#define _RTE_ETH_MEMIF_H_
+
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE
+#endif				/* GNU_SOURCE */
+
+#include <sys/queue.h>
+
+#include <rte_ethdev_driver.h>
+#include <rte_ether.h>
+#include <rte_interrupts.h>
+
+#include "memif.h"
+
+#define ETH_MEMIF_DEFAULT_SOCKET_FILENAME	"/run/memif.sock"
+#define ETH_MEMIF_DEFAULT_RING_SIZE		10
+#define ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE	2048
+
+#define ETH_MEMIF_MAX_NUM_Q_PAIRS		255
+#define ETH_MEMIF_MAX_LOG2_RING_SIZE		14
+#define ETH_MEMIF_MAX_REGION_NUM		256
+
+#define ETH_MEMIF_SHM_NAME_SIZE			32
+#define ETH_MEMIF_DISC_STRING_SIZE		96
+#define ETH_MEMIF_SECRET_SIZE			24
+
+extern int memif_logtype;
+
+#define MIF_LOG(level, fmt, args...) \
+	rte_log(RTE_LOG_ ## level, memif_logtype, \
+		"%s(): " fmt "\n", __func__, ##args)
+
+enum memif_role_t {
+	MEMIF_ROLE_MASTER,
+	MEMIF_ROLE_SLAVE,
+};
+
+struct memif_region {
+	void *addr;				/**< shared memory address */
+	memif_region_size_t region_size;	/**< shared memory size */
+	int fd;					/**< shared memory file descriptor */
+	uint32_t pkt_buffer_offset;
+	/**< offset from 'addr' to first packet buffer */
+};
+
+struct memif_queue {
+	struct rte_mempool *mempool;		/**< mempool for RX packets */
+	struct pmd_internals *pmd;		/**< device internals */
+
+	memif_ring_type_t type;			/**< ring type */
+	memif_region_index_t region;		/**< shared memory region index */
+
+	uint16_t in_port;			/**< port id */
+
+	memif_region_offset_t ring_offset;
+	/**< ring offset from start of shm region (ring - memif_region.addr) */
+
+	uint16_t last_head;			/**< last ring head */
+	uint16_t last_tail;			/**< last ring tail */
+
+	/* rx/tx info */
+	uint64_t n_pkts;			/**< number of rx/tx packets */
+	uint64_t n_bytes;			/**< number of rx/tx bytes */
+	uint64_t n_err;				/**< number of tx errors */
+
+	memif_ring_t *ring;			/**< pointer to ring */
+
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+
+	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
+};
+
+struct pmd_internals {
+	memif_interface_id_t id;		/**< unique id */
+	enum memif_role_t role;			/**< device role */
+	uint32_t flags;				/**< device status flags */
+#define ETH_MEMIF_FLAG_CONNECTING	(1 << 0)
+/**< device is connecting */
+#define ETH_MEMIF_FLAG_CONNECTED	(1 << 1)
+/**< device is connected */
+#define ETH_MEMIF_FLAG_ZERO_COPY	(1 << 2)
+/**< device is zero-copy enabled */
+#define ETH_MEMIF_FLAG_DISABLED		(1 << 3)
+/**< device has not been configured and can not accept connection requests */
+
+	char *socket_filename;			/**< pointer to socket filename */
+	char secret[ETH_MEMIF_SECRET_SIZE]; /**< secret (optional security parameter) */
+
+	struct memif_control_channel *cc;	/**< control channel */
+
+	struct memif_region *regions[ETH_MEMIF_MAX_REGION_NUM];
+	/**< shared memory regions */
+	memif_region_index_t regions_num;	/**< number of regions */
+
+	/* remote info */
+	char remote_name[RTE_DEV_NAME_MAX_LEN];		/**< remote app name */
+	char remote_if_name[RTE_DEV_NAME_MAX_LEN];	/**< remote peer name */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t pkt_buffer_size;	/**< buffer size */
+	} cfg;					/**< Configured parameters (max values) */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t pkt_buffer_size;	/**< buffer size */
+	} run;
+	/**< Parameters used in active connection */
+
+	char local_disc_string[ETH_MEMIF_DISC_STRING_SIZE];
+	/**< local disconnect reason */
+	char remote_disc_string[ETH_MEMIF_DISC_STRING_SIZE];
+	/**< remote disconnect reason */
+
+	struct rte_vdev_device *vdev;		/**< vdev handle */
+};
+
+/**
+ * Unmap shared memory and free regions from memory.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_free_regions(struct pmd_internals *pmd);
+
+/**
+ * Finalize connection establishment process. Map shared memory file
+ * (master role), initialize ring queue, set link status up.
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect(struct rte_eth_dev *dev);
+
+/**
+ * Create shared memory file and initialize ring queue.
+ * Only called by slave when establishing connection
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_init_regions_and_queues(struct rte_eth_dev *dev);
+
+/**
+ * Get memif version string.
+ *
+ * @return
+ *   - memif version string
+ */
+const char *memif_version(void);
+
+#ifndef MFD_HUGETLB
+#ifndef __NR_memfd_create
+
+#if defined __x86_64__
+#define __NR_memfd_create 319
+#elif defined __x86_32__
+#define __NR_memfd_create 1073742143
+#elif defined __arm__
+#define __NR_memfd_create 385
+#elif defined __aarch64__
+#define __NR_memfd_create 279
+#elif defined __powerpc__
+#define __NR_memfd_create 360
+#elif defined __i386__
+#define __NR_memfd_create 356
+#else
+#error "__NR_memfd_create unknown for this architecture"
+#endif
+
+#endif				/* __NR_memfd_create */
+
+static inline int memfd_create(const char *name, unsigned int flags)
+{
+	return syscall(__NR_memfd_create, name, flags);
+}
+#endif				/* MFD_HUGETLB */
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef MFD_ALLOW_SEALING
+#define MFD_ALLOW_SEALING       0x0002U
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL     0x0001	/* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002	/* prevent file from shrinking */
+#define F_SEAL_GROW     0x0004	/* prevent file from growing */
+#define F_SEAL_WRITE    0x0008	/* prevent writes */
+#endif
+
+#endif				/* RTE_ETH_MEMIF_H */
diff --git a/drivers/net/memif/rte_pmd_memif_version.map b/drivers/net/memif/rte_pmd_memif_version.map
new file mode 100644
index 000000000..8861484fb
--- /dev/null
+++ b/drivers/net/memif/rte_pmd_memif_version.map
@@ -0,0 +1,4 @@
+DPDK_19.08 {
+
+        local: *;
+};
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index ed99896c3..b57073483 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -24,6 +24,7 @@ drivers = ['af_packet',
 	'ixgbe',
 	'kni',
 	'liquidio',
+	'memif',
 	'mlx4',
 	'mlx5',
 	'mvneta',
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 7c9b4b538..326b2a30b 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -174,6 +174,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KNI)        += -lrte_pmd_kni
 endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD)        += -lrte_pmd_lio
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF)      += -lrte_pmd_memif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -lrte_pmd_mlx4
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += -lrte_pmd_mlx5 -lmnl
 ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
--
2.17.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v9] /net: memory interface (memif)
  2019-05-20 10:18           ` [dpdk-dev] [RFC v9] " Jakub Grajciar
@ 2019-05-29 17:29             ` Ferruh Yigit
  2019-05-30 12:38               ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  2019-05-31  6:22             ` [dpdk-dev] [PATCH v10] net/memif: introduce memory interface (memif) PMD Jakub Grajciar
  1 sibling, 1 reply; 58+ messages in thread
From: Ferruh Yigit @ 2019-05-29 17:29 UTC (permalink / raw)
  To: Jakub Grajciar, dev

On 5/20/2019 11:18 AM, Jakub Grajciar wrote:
> Memory interface (memif), provides high performance
> packet transfer over shared memory.

Can you please update the patch title as following in next version:
net/memif: introduce memory interface (memif) PMD

Can you please drop the RFC from next version, the patch is mature enough and we
can start regular versioning on it (without RFC), and RFC tag prefent DPDK
community lab to process the patch, so dropping the RFC will help there.

> 
> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
<...>

> +
> +.. csv-table:: **Memif configuration options**
> +   :header: "Option", "Description", "Default", "Valid value"
> +
> +   "id=0", "Used to identify peer interface", "0", "uint32_t"
> +   "role=master", "Set memif role", "slave", "master|slave"
> +   "bsize=1024", "Size of single packet buffer", "2048", "uint16_t"

What happens is 'bsize < mbuf size'? I didn't see any check in the code but is
there any assumption around this?
Or any assumption that slave and master packet should be same? Or any other
relation?
If there is any assumption it may be good to add checks to the code and document
here.

> +   "rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", "10", "1-14"
> +   "nrxq=2", "Number of RX queues", "1", "255"
> +   "ntxq=2", "Number of TX queues", "1", "255"
> +   "socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 256"
> +   "mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef", ""
> +   "secret=abc123", "Secret is an optional security option, which if specified, must be matched by peer", "", "string len 24"
> +   "zero-copy=yes", "Enable/disable zero-copy slave mode", "no", "yes|no"
> +
<...>

> +Example: testpmd and testpmd
> +----------------------------
> +In this example we run two instances of testpmd application and transmit packets over memif.
> +
> +First create ``master`` interface::
> +
> +    #./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
> +
> +Now create ``slave`` interface (master must be already running so the slave will connect)::
> +
> +    #./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
> +
> +Set forwarding mode in one of the instances to 'rx only' and the other to 'tx only'::
> +
> +    testpmd> set fwd rxonly
> +    testpmd> start
> +
> +    testpmd> set fwd txonly
> +    testpmd> start

Also following works, above sample was not clear for me if the shared buffer is
only for one direction communication, but it is not the case, you can consider
adding below as well to clarify this:

Slave:
    testpmd> start

Master:
    testpmd> start tx_first


And if you want you can add multiple queue sample too, which gives better
performance:

master:
./testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1
--vdev=net_memif,role=master -- -i --rxq=4 --txq=4

slave:
./testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2
--vdev=net_memif,role=slave -- -i --rxq=4 --txq=4

<...>

> +LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool
> +LDLIBS += -lrte_ethdev -lrte_kvargs
> +LDLIBS += -lrte_hash

Requires to be linked against vdev bus library for shared build:
LDLIBS += -lrte_bus_vdev

Can you please be sure to that PMD can be compiled as shared library?

<...>

> @@ -0,0 +1,13 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
> +
> +if host_machine.system() != 'linux'
> +        build = false
> +endif
> +
> +sources = files('rte_eth_memif.c',
> +		'memif_socket.c')
> +
> +allow_experimental_apis = true

Can you please list the used experimental APIs here in a commented way, as it is
done in Makefile?

<...>

> +static void
> +memif_dev_close(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +
> +	memif_msg_enq_disconnect(pmd->cc, "Device closed", 0);
> +	memif_disconnect(dev);
> +
> +	memif_socket_remove_device(dev);

On latest agreement, .dev_close should free all internal data structures, and
since 'RTE_ETH_DEV_CLOSE_REMOVE' flag is set, generic ones will be cleaned by API.
Please check what is clean by API from 'rte_eth_dev_release_port()' remaining
memmory allocated by PMD should be freed here (rx, tx queues for example)

<...>

> +	/* TX stats */
> +	for (i = 0; i < nq; i++) {
> +		mq = dev->data->tx_queues[i];
> +		stats->q_opackets[i] = mq->n_pkts;
> +		stats->q_obytes[i] = mq->n_bytes;
> +		stats->q_errors[i] = mq->n_err;

'q_errors' is for Rx, there is a patchset on top of the next-net head, right now
this value only incorporate to overall Tx error stats (oerrors) not in this field.

<...>

> +
> +	eth_dev->data->dev_flags &= RTE_ETH_DEV_CLOSE_REMOVE;

Are you sure you want to '&' this flag, shouldn't it be "|="?

<...>

> +/* check if directory exists and if we have permission to read/write */
> +static int
> +memif_check_socket_filename(const char *filename)
> +{
> +	char *dir = NULL, *tmp;
> +	uint32_t idx;
> +	int ret = 0;
> +
> +	tmp = strrchr(filename, '/');
> +	if (tmp != NULL) {
> +		idx = tmp - filename;
> +		dir = rte_zmalloc("memif_tmp", sizeof(char) * (idx + 1), 0);
> +		if (dir == NULL) {
> +			MIF_LOG(ERR, "Failed to allocate memory.");
> +			return -1;
> +		}
> +		strlcpy(dir, filename, sizeof(char) * (idx + 1));
> +	}
> +
> +	if (dir == NULL || (faccessat(-1, dir, F_OK | R_OK |
> +					W_OK, AT_EACCESS) < 0)) {
> +		MIF_LOG(ERR, "Invalid directory: '%s'.", dir);

Getting following build error from "gcc (GCC) 9.1.1 20190503 (Red Hat 9.1.1-1)":

In file included from .../dpdk/drivers/net/memif/rte_eth_memif.c:27:
In function ‘memif_check_socket_filename’,
    inlined from ‘memif_set_socket_filename’ at
.../dpdk/drivers/net/memif/rte_eth_memif.c:1052:9:
.../dpdk/drivers/net/memif/rte_eth_memif.h:35:2: error: ‘%s’ directive argument
is null [-Werror=format-overflow=]
   35 |  rte_log(RTE_LOG_ ## level, memif_logtype, \
      |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   36 |   "%s(): " fmt "\n", __func__, ##args)
      |   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.../dpdk/drivers/net/memif/rte_eth_memif.c:1035:3: note: in expansion of macro
‘MIF_LOG’
 1035 |   MIF_LOG(ERR, "Invalid directory: '%s'.", dir);
      |   ^~~~~~~
.../dpdk/drivers/net/memif/rte_eth_memif.c: In function ‘memif_set_socket_filename’:
.../dpdk/drivers/net/memif/rte_eth_memif.c:1098:37: note: format string is
defined here
 1098 |  MIF_LOG(INFO, "Initialize MEMIF: %s.", name);


<...>

> +static int
> +memif_set_mac(const char *key __rte_unused, const char *value, void *extra_args)
> +{
> +	struct ether_addr *ether_addr = (struct ether_addr *)extra_args;

Can you please rebase next version on top of latest head, which got some patches
that are adding rte_ prefix to these structs.

<...>

> +#ifndef _RTE_ETH_MEMIF_H_
> +#define _RTE_ETH_MEMIF_H_
> +
> +#ifndef _GNU_SOURCE
> +#define _GNU_SOURCE
> +#endif				/* GNU_SOURCE */

Why this was required?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [RFC v9] /net: memory interface (memif)
  2019-05-29 17:29             ` Ferruh Yigit
@ 2019-05-30 12:38               ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  0 siblings, 0 replies; 58+ messages in thread
From: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco) @ 2019-05-30 12:38 UTC (permalink / raw)
  To: Ferruh Yigit, dev



> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@intel.com>
> Sent: Wednesday, May 29, 2019 7:29 PM
> To: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
> <jgrajcia@cisco.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [RFC v9] /net: memory interface (memif)

> > +
> > +.. csv-table:: **Memif configuration options**
> > +   :header: "Option", "Description", "Default", "Valid value"
> > +
> > +   "id=0", "Used to identify peer interface", "0", "uint32_t"
> > +   "role=master", "Set memif role", "slave", "master|slave"
> > +   "bsize=1024", "Size of single packet buffer", "2048", "uint16_t"
> 
> What happens is 'bsize < mbuf size'? I didn't see any check in the code but is
> there any assumption around this?
> Or any assumption that slave and master packet should be same? Or any
> other relation?
> If there is any assumption it may be good to add checks to the code and
> document here.

There is no relation between bsize and mbuf size. Memif driver will consume as many buffers as it needs (chaining them). 

> > +#ifndef _RTE_ETH_MEMIF_H_
> > +#define _RTE_ETH_MEMIF_H_
> > +
> > +#ifndef _GNU_SOURCE
> > +#define _GNU_SOURCE
> > +#endif				/* GNU_SOURCE */
> 
> Why this was required?

_GNU_SOURCE is required by memfd_create().

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [dpdk-dev] [PATCH v10] net/memif: introduce memory interface (memif) PMD
  2019-05-20 10:18           ` [dpdk-dev] [RFC v9] " Jakub Grajciar
  2019-05-29 17:29             ` Ferruh Yigit
@ 2019-05-31  6:22             ` Jakub Grajciar
  2019-05-31  7:43               ` Ye Xiaolong
                                 ` (3 more replies)
  1 sibling, 4 replies; 58+ messages in thread
From: Jakub Grajciar @ 2019-05-31  6:22 UTC (permalink / raw)
  To: dev; +Cc: Jakub Grajciar

Memory interface (memif), provides high performance
packet transfer over shared memory.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
---
 MAINTAINERS                                 |    6 +
 config/common_base                          |    5 +
 config/common_linux                         |    1 +
 doc/guides/nics/features/memif.ini          |   14 +
 doc/guides/nics/index.rst                   |    1 +
 doc/guides/nics/memif.rst                   |  234 ++++
 doc/guides/rel_notes/release_19_08.rst      |    5 +
 drivers/net/Makefile                        |    1 +
 drivers/net/memif/Makefile                  |   31 +
 drivers/net/memif/memif.h                   |  179 +++
 drivers/net/memif/memif_socket.c            | 1124 +++++++++++++++++
 drivers/net/memif/memif_socket.h            |  105 ++
 drivers/net/memif/meson.build               |   15 +
 drivers/net/memif/rte_eth_memif.c           | 1206 +++++++++++++++++++
 drivers/net/memif/rte_eth_memif.h           |  212 ++++
 drivers/net/memif/rte_pmd_memif_version.map |    4 +
 drivers/net/meson.build                     |    1 +
 mk/rte.app.mk                               |    1 +
 18 files changed, 3145 insertions(+)
 create mode 100644 doc/guides/nics/features/memif.ini
 create mode 100644 doc/guides/nics/memif.rst
 create mode 100644 drivers/net/memif/Makefile
 create mode 100644 drivers/net/memif/memif.h
 create mode 100644 drivers/net/memif/memif_socket.c
 create mode 100644 drivers/net/memif/memif_socket.h
 create mode 100644 drivers/net/memif/meson.build
 create mode 100644 drivers/net/memif/rte_eth_memif.c
 create mode 100644 drivers/net/memif/rte_eth_memif.h
 create mode 100644 drivers/net/memif/rte_pmd_memif_version.map

requires patch: http://patchwork.dpdk.org/patch/51511/
Memif uses a unix domain socket as control channel (cc).
rte_interrupts.h is used to handle interrupts for this cc.
This patch is required, because the message received can
be a reason for disconnecting.

v3:
- coding style fixes
- documentation
- doxygen comments
- use strlcpy() instead of strncpy()
- use RTE_BUILD_BUG_ON instead of _Static_assert()
- fix meson build deps

v4:
- coding style fixes
- doc update (desc format, messaging, features, ...)
- pointer arithmetic fix
- __rte_packed and __rte_aligned instead of __attribute__()
- fixed multi-queue
- support additional archs (memfd_create syscall)

v5:
- coding style fixes
- add release notes
- remove unused dependencies
- doc update

v6:
- remove socket on termination of testpmd application
- disallow memif probing in secondary process (return error), multi-process support will be implemented in separate patch
- fix chained buffer packet length
- use single region for rings and buffers (non-zero-copy)
- doc update

v7:
- coding style fixes

v8:
- release notes: 19.05 -> 19.08

v9:
- rearrange elements in structures to remove holes
- default socket file in /run
- #define name sizes
- implement dev_close op

v10:
- fix shared lib build
- clear memory allocated by PMD in .dev_close
- fix NULL pointer string 'error: ‘%s’ directive argument is null'

diff --git a/MAINTAINERS b/MAINTAINERS
index 15d0829c5..3698c146b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -841,6 +841,12 @@ F: drivers/net/softnic/
 F: doc/guides/nics/features/softnic.ini
 F: doc/guides/nics/softnic.rst

+Memif PMD
+M: Jakub Grajciar <jgrajcia@cisco.com>
+F: drivers/net/memif/
+F: doc/guides/nics/memif.rst
+F: doc/guides/nics/features/memif.ini
+

 Crypto Drivers
 --------------
diff --git a/config/common_base b/config/common_base
index 6f19ad5d2..e406e7836 100644
--- a/config/common_base
+++ b/config/common_base
@@ -444,6 +444,11 @@ CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 #
 CONFIG_RTE_LIBRTE_PMD_AF_XDP=n

+#
+# Compile Memory Interface PMD driver (Linux only)
+#
+CONFIG_RTE_LIBRTE_PMD_MEMIF=n
+
 #
 # Compile link bonding PMD library
 #
diff --git a/config/common_linux b/config/common_linux
index 75334273d..87514fe4f 100644
--- a/config/common_linux
+++ b/config/common_linux
@@ -19,6 +19,7 @@ CONFIG_RTE_LIBRTE_VHOST_POSTCOPY=n
 CONFIG_RTE_LIBRTE_PMD_VHOST=y
 CONFIG_RTE_LIBRTE_IFC_PMD=y
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
+CONFIG_RTE_LIBRTE_PMD_MEMIF=y
 CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
 CONFIG_RTE_LIBRTE_PMD_TAP=y
 CONFIG_RTE_LIBRTE_AVP_PMD=y
diff --git a/doc/guides/nics/features/memif.ini b/doc/guides/nics/features/memif.ini
new file mode 100644
index 000000000..807d9ecdc
--- /dev/null
+++ b/doc/guides/nics/features/memif.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'memif' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Link status          = Y
+Basic stats          = Y
+Jumbo frame          = Y
+ARMv8                = Y
+Power8               = Y
+x86-32               = Y
+x86-64               = Y
+Usage doc            = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 2221c35f2..691e7209f 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -36,6 +36,7 @@ Network Interface Controller Drivers
     intel_vf
     kni
     liquidio
+    memif
     mlx4
     mlx5
     mvneta
diff --git a/doc/guides/nics/memif.rst b/doc/guides/nics/memif.rst
new file mode 100644
index 000000000..de2d481eb
--- /dev/null
+++ b/doc/guides/nics/memif.rst
@@ -0,0 +1,234 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2018-2019 Cisco Systems, Inc.
+
+======================
+Memif Poll Mode Driver
+======================
+
+Shared memory packet interface (memif) PMD allows for DPDK and any other client
+using memif (DPDK, VPP, libmemif) to communicate using shared memory. Memif is
+Linux only.
+
+The created device transmits packets in a raw format. It can be used with
+Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
+supported in DPDK memif implementation.
+
+Memif works in two roles: master and slave. Slave connects to master over an
+existing socket. It is also a producer of shared memory file and initializes
+the shared memory. Each interface can be connected to one peer interface
+at same time. The peer interface is identified by id parameter. Master
+creates the socket and listens for any slave connection requests. The socket
+may already exist on the system. Be sure to remove any such sockets, if you
+are creating a master interface, or you will see an "Address already in use"
+error. Function ``rte_pmd_memif_remove()``, which removes memif interface,
+will also remove a listener socket, if it is not being used by any other
+interface.
+
+The method to enable one or more interfaces is to use the
+``--vdev=net_memif0`` option on the DPDK application command line. Each
+``--vdev=net_memif1`` option given will create an interface named net_memif0,
+net_memif1, and so on. Memif uses unix domain socket to transmit control
+messages. Each memif has a unique id per socket. This id is used to identify
+peer interface. If you are connecting multiple
+interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``,
+etc. Note that if you assign a socket to a master interface it becomes a
+listener socket. Listener socket can not be used by a slave interface on same
+client.
+
+.. csv-table:: **Memif configuration options**
+   :header: "Option", "Description", "Default", "Valid value"
+
+   "id=0", "Used to identify peer interface", "0", "uint32_t"
+   "role=master", "Set memif role", "slave", "master|slave"
+   "bsize=1024", "Size of single packet buffer", "2048", "uint16_t"
+   "rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", "10", "1-14"
+   "socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 256"
+   "mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef", ""
+   "secret=abc123", "Secret is an optional security option, which if specified, must be matched by peer", "", "string len 24"
+   "zero-copy=yes", "Enable/disable zero-copy slave mode", "no", "yes|no"
+
+**Connection establishment**
+
+In order to create memif connection, two memif interfaces, each in separate
+process, are needed. One interface in ``master`` role and other in
+``slave`` role. It is not possible to connect two interfaces in a single
+process. Each interface can be connected to one interface at same time,
+identified by matching id parameter.
+
+Memif driver uses unix domain socket to exchange required information between
+memif interfaces. Socket file path is specified at interface creation see
+*Memif configuration options* table above. If socket is used by ``master``
+interface, it's marked as listener socket (in scope of current process) and
+listens to connection requests from other processes. One socket can be used by
+multiple interfaces. One process can have ``slave`` and ``master`` interfaces
+at the same time, provided each role is assigned unique socket.
+
+For detailed information on memif control messages, see: net/memif/memif.h.
+
+Slave interface attempts to make a connection on assigned socket. Process
+listening on this socket will extract the connection request and create a new
+connected socket (control channel). Then it sends the 'hello' message
+(``MEMIF_MSG_TYPE_HELLO``), containing configuration boundaries. Slave interface
+adjusts its configuration accordingly, and sends 'init' message
+(``MEMIF_MSG_TYPE_INIT``). This message among others contains interface id. Driver
+uses this id to find master interface, and assigns the control channel to this
+interface. If such interface is found, 'ack' message (``MEMIF_MSG_TYPE_ACK``) is
+sent. Slave interface sends 'add region' message (``MEMIF_MSG_TYPE_ADD_REGION``) for
+every region allocated. Master responds to each of these messages with 'ack'
+message. Same behavior applies to rings. Slave sends 'add ring' message
+(``MEMIF_MSG_TYPE_ADD_RING``) for every initialized ring. Master again responds to
+each message with 'ack' message. To finalize the connection, slave interface
+sends 'connect' message (``MEMIF_MSG_TYPE_CONNECT``). Upon receiving this message
+master maps regions to its address space, initializes rings and responds with
+'connected' message (``MEMIF_MSG_TYPE_CONNECTED``). Disconnect
+(``MEMIF_MSG_TYPE_DISCONNECT``) can be sent by both master and slave interfaces at
+any time, due to driver error or if the interface is being deleted.
+
+Files
+
+- net/memif/memif.h *- control messages definitions*
+- net/memif/memif_socket.h
+- net/memif/memif_socket.c
+
+Shared memory
+~~~~~~~~~~~~~
+
+**Shared memory format**
+
+Slave is producer and master is consumer. Memory regions, are mapped shared memory files,
+created by memif slave and provided to master at connection establishment.
+Regions contain rings and buffers. Rings and buffers can also be separated into multiple
+regions. For no-zero-copy, rings and buffers are stored inside single memory
+region to reduce the number of opened files.
+
+region n (no-zero-copy):
+
++-----------------------+-------------------------------------------------------------------------+
+| Rings                 | Buffers                                                                 |
++-----------+-----------+-----------------+---+---------------------------------------------------+
+| S2M rings | M2S rings | packet buffer 0 | . | pb ((1 << pmd->run.log2_ring_size)*(s2m + m2s))-1 |
++-----------+-----------+-----------------+---+---------------------------------------------------+
+
+S2M OR M2S Rings:
+
++--------+--------+-----------------------+
+| ring 0 | ring 1 | ring num_s2m_rings - 1|
++--------+--------+-----------------------+
+
+ring 0:
+
++-------------+---------------------------------------+
+| ring header | (1 << pmd->run.log2_ring_size) * desc |
++-------------+---------------------------------------+
+
+Descriptors are assigned packet buffers in order of rings creation. If we have one ring
+in each direction and ring size is 1024, then first 1024 buffers will belong to S2M ring and
+last 1024 will belong to M2S ring. In case of zero-copy, buffers are dequeued and
+enqueued as needed.
+
+**Descriptor format**
+
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Quad|6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | |1|1| | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Word|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | |6|5| | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|0   |length                                                         |region                         |flags                          |
++----+---------------------------------------------------------------+-------------------------------+-------------------------------+
+|1   |metadata                                                       |offset                                                         |
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+**Flags field - flags (Quad Word 0, bits 0:15)**
+
++-----+--------------------+------------------------------------------------------------------------------------------------+
+|Bits |Name                |Functionality                                                                                   |
++=====+====================+================================================================================================+
+|0    |MEMIF_DESC_FLAG_NEXT|Is chained buffer. When set, the packet is divided into multiple buffers. May not be contiguous.|
++-----+--------------------+------------------------------------------------------------------------------------------------+
+
+**Region index - region (Quad Word 0, 16:31)**
+
+Index of memory region, the buffer is located in.
+
+**Data length - length (Quad Word 0, 32:63)**
+
+Length of transmitted/received data.
+
+**Data Offset - offset (Quad Word 1, 0:31)**
+
+Data start offset from memory region address. *.regions[desc->region].addr + desc->offset*
+
+**Metadata - metadata (Quad Word 1, 32:63)**
+
+Buffer metadata.
+
+Files
+
+- net/memif/memif.h *- descriptor and ring definitions*
+- net/memif/rte_eth_memif.c *- eth_memif_rx() eth_memif_tx()*
+
+Example: testpmd
+----------------------------
+In this example we run two instances of testpmd application and transmit packets over memif.
+
+First create ``master`` interface::
+
+    #./build/app/testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
+
+Now create ``slave`` interface (master must be already running so the slave will connect)::
+
+    #./build/app/testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
+
+Start forwarding packets::
+
+    Slave:
+        testpmd> start
+
+    Master:
+        testpmd> start tx_first
+
+Show status::
+
+    testpmd> show port stats 0
+
+For more details on testpmd please refer to :doc:`../testpmd_app_ug/index`.
+
+Example: testpmd and VPP
+------------------------
+For information on how to get and run VPP please see `<https://wiki.fd.io/view/VPP>`_.
+
+Start VPP in interactive mode (should be by default). Create memif master interface in VPP::
+
+    vpp# create interface memif id 0 master no-zero-copy
+    vpp# set interface state memif0/0 up
+    vpp# set interface ip address memif0/0 192.168.1.1/24
+
+To see socket filename use show memif command::
+
+    vpp# show memif
+    sockets
+     id  listener    filename
+      0   yes (1)     /run/vpp/memif.sock
+    ...
+
+Now create memif interface by running testpmd with these command line options::
+
+    #./testpmd --vdev=net_memif,socket=/run/vpp/memif.sock -- -i
+
+Testpmd should now create memif slave interface and try to connect to master.
+In testpmd set forward option to icmpecho and start forwarding::
+
+    testpmd> set fwd icmpecho
+    testpmd> start
+
+Send ping from VPP::
+
+    vpp# ping 192.168.1.2
+    64 bytes from 192.168.1.2: icmp_seq=2 ttl=254 time=36.2918 ms
+    64 bytes from 192.168.1.2: icmp_seq=3 ttl=254 time=23.3927 ms
+    64 bytes from 192.168.1.2: icmp_seq=4 ttl=254 time=24.2975 ms
+    64 bytes from 192.168.1.2: icmp_seq=5 ttl=254 time=17.7049 ms
diff --git a/doc/guides/rel_notes/release_19_08.rst b/doc/guides/rel_notes/release_19_08.rst
index a17e7dea5..46efd13f0 100644
--- a/doc/guides/rel_notes/release_19_08.rst
+++ b/doc/guides/rel_notes/release_19_08.rst
@@ -54,6 +54,11 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================

+* **Added memif PMD.**
+
+  Added the new Shared Memory Packet Interface (``memif``) PMD.
+  See the :doc:`../nics/memif` guide for more details on this new driver.
+

 Removed Items
 -------------
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 3a72cf38c..78cb10fc6 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -35,6 +35,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_ICE_PMD) += ice
 DIRS-$(CONFIG_RTE_LIBRTE_IPN3KE_PMD) += ipn3ke
 DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
 DIRS-$(CONFIG_RTE_LIBRTE_LIO_PMD) += liquidio
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif
 DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
 DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
 DIRS-$(CONFIG_RTE_LIBRTE_MVNETA_PMD) += mvneta
diff --git a/drivers/net/memif/Makefile b/drivers/net/memif/Makefile
new file mode 100644
index 000000000..c3119625c
--- /dev/null
+++ b/drivers/net/memif/Makefile
@@ -0,0 +1,31 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_memif.a
+
+EXPORT_MAP := rte_pmd_memif_version.map
+
+LIBABIVER := 1
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+# Experimantal APIs:
+# - rte_intr_callback_unregister_pending
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool
+LDLIBS += -lrte_ethdev -lrte_kvargs
+LDLIBS += -lrte_hash
+LDLIBS += -lrte_bus_vdev
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += rte_eth_memif.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif_socket.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/memif/memif.h b/drivers/net/memif/memif.h
new file mode 100644
index 000000000..3948b1f50
--- /dev/null
+++ b/drivers/net/memif/memif.h
@@ -0,0 +1,179 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_H_
+#define _MEMIF_H_
+
+#define MEMIF_COOKIE		0x3E31F20
+#define MEMIF_VERSION_MAJOR	2
+#define MEMIF_VERSION_MINOR	0
+#define MEMIF_VERSION		((MEMIF_VERSION_MAJOR << 8) | MEMIF_VERSION_MINOR)
+#define MEMIF_NAME_SZ		32
+
+/*
+ * S2M: direction slave -> master
+ * M2S: direction master -> slave
+ */
+
+/*
+ *  Type definitions
+ */
+
+typedef enum memif_msg_type {
+	MEMIF_MSG_TYPE_NONE,
+	MEMIF_MSG_TYPE_ACK,
+	MEMIF_MSG_TYPE_HELLO,
+	MEMIF_MSG_TYPE_INIT,
+	MEMIF_MSG_TYPE_ADD_REGION,
+	MEMIF_MSG_TYPE_ADD_RING,
+	MEMIF_MSG_TYPE_CONNECT,
+	MEMIF_MSG_TYPE_CONNECTED,
+	MEMIF_MSG_TYPE_DISCONNECT,
+} memif_msg_type_t;
+
+typedef enum {
+	MEMIF_RING_S2M, /**< buffer ring in direction slave -> master */
+	MEMIF_RING_M2S, /**< buffer ring in direction master -> slave */
+} memif_ring_type_t;
+
+typedef enum {
+	MEMIF_INTERFACE_MODE_ETHERNET,
+	MEMIF_INTERFACE_MODE_IP,
+	MEMIF_INTERFACE_MODE_PUNT_INJECT,
+} memif_interface_mode_t;
+
+typedef uint16_t memif_region_index_t;
+typedef uint32_t memif_region_offset_t;
+typedef uint64_t memif_region_size_t;
+typedef uint16_t memif_ring_index_t;
+typedef uint32_t memif_interface_id_t;
+typedef uint16_t memif_version_t;
+typedef uint8_t memif_log2_ring_size_t;
+
+/*
+ *  Socket messages
+ */
+
+ /**
+  * M2S
+  * Contains master interfaces configuration.
+  */
+typedef struct __rte_packed {
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+	memif_version_t min_version; /**< lowest supported memif version */
+	memif_version_t max_version; /**< highest supported memif version */
+	memif_region_index_t max_region; /**< maximum num of regions */
+	memif_ring_index_t max_m2s_ring; /**< maximum num of M2S ring */
+	memif_ring_index_t max_s2m_ring; /**< maximum num of S2M rings */
+	memif_log2_ring_size_t max_log2_ring_size; /**< maximum ring size (as log2) */
+} memif_msg_hello_t;
+
+/**
+ * S2M
+ * Contains information required to identify interface
+ * to which the slave wants to connect.
+ */
+typedef struct __rte_packed {
+	memif_version_t version;		/**< memif version */
+	memif_interface_id_t id;		/**< interface id */
+	memif_interface_mode_t mode:8;		/**< interface mode */
+	uint8_t secret[24];			/**< optional security parameter */
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+} memif_msg_init_t;
+
+/**
+ * S2M
+ * Request master to add new shared memory region to master interface.
+ * Shared files file descriptor is passed in cmsghdr.
+ */
+typedef struct __rte_packed {
+	memif_region_index_t index;		/**< shm regions index */
+	memif_region_size_t size;		/**< shm region size */
+} memif_msg_add_region_t;
+
+/**
+ * S2M
+ * Request master to add new ring to master interface.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_MSG_ADD_RING_FLAG_S2M 1		/**< ring is in S2M direction */
+	memif_ring_index_t index;		/**< ring index */
+	memif_region_index_t region; /**< region index on which this ring is located */
+	memif_region_offset_t offset;		/**< buffer start offset */
+	memif_log2_ring_size_t log2_ring_size;	/**< ring size (log2) */
+	uint16_t private_hdr_size;		/**< used for private metadata */
+} memif_msg_add_ring_t;
+
+/**
+ * S2M
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< slave interface name */
+} memif_msg_connect_t;
+
+/**
+ * M2S
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< master interface name */
+} memif_msg_connected_t;
+
+/**
+ * S2M & M2S
+ * Disconnect interfaces.
+ */
+typedef struct __rte_packed {
+	uint32_t code;				/**< error code */
+	uint8_t string[96];			/**< disconnect reason */
+} memif_msg_disconnect_t;
+
+typedef struct __rte_packed __rte_aligned(128)
+{
+	memif_msg_type_t type:16;
+	union {
+		memif_msg_hello_t hello;
+		memif_msg_init_t init;
+		memif_msg_add_region_t add_region;
+		memif_msg_add_ring_t add_ring;
+		memif_msg_connect_t connect;
+		memif_msg_connected_t connected;
+		memif_msg_disconnect_t disconnect;
+	};
+} memif_msg_t;
+
+/*
+ *  Ring and Descriptor Layout
+ */
+
+/**
+ * Buffer descriptor.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_DESC_FLAG_NEXT 1			/**< is chained buffer */
+	memif_region_index_t region; /**< region index on which the buffer is located */
+	uint32_t length;			/**< buffer length */
+	memif_region_offset_t offset;		/**< buffer offset */
+	uint32_t metadata;
+} memif_desc_t;
+
+#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
+	uint8_t mark[0] __rte_aligned(RTE_CACHE_LINE_SIZE)
+
+typedef struct {
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline0);
+	uint32_t cookie;			/**< MEMIF_COOKIE */
+	uint16_t flags;				/**< flags */
+#define MEMIF_RING_FLAG_MASK_INT 1		/**< disable interrupt mode */
+	volatile uint16_t head;			/**< pointer to ring buffer head */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline1);
+	volatile uint16_t tail;			/**< pointer to ring buffer tail */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline2);
+	memif_desc_t desc[0];			/**< buffer descriptors */
+} memif_ring_t;
+
+#endif				/* _MEMIF_H_ */
diff --git a/drivers/net/memif/memif_socket.c b/drivers/net/memif/memif_socket.c
new file mode 100644
index 000000000..98e8ceefd
--- /dev/null
+++ b/drivers/net/memif/memif_socket.c
@@ -0,0 +1,1124 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <errno.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_hash.h>
+#include <rte_jhash.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+static void memif_intr_handler(void *arg);
+
+static ssize_t
+memif_msg_send(int fd, memif_msg_t *msg, int afd)
+{
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	struct cmsghdr *cmsg;
+	char ctl[CMSG_SPACE(sizeof(int))];
+
+	iov[0].iov_base = msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+
+	if (afd > 0) {
+		memset(&ctl, 0, sizeof(ctl));
+		mh.msg_control = ctl;
+		mh.msg_controllen = sizeof(ctl);
+		cmsg = CMSG_FIRSTHDR(&mh);
+		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+		cmsg->cmsg_level = SOL_SOCKET;
+		cmsg->cmsg_type = SCM_RIGHTS;
+		rte_memcpy(CMSG_DATA(cmsg), &afd, sizeof(int));
+	}
+
+	return sendmsg(fd, &mh, 0);
+}
+
+static int
+memif_msg_send_from_queue(struct memif_control_channel *cc)
+{
+	ssize_t size;
+	int ret = 0;
+	struct memif_msg_queue_elt *e;
+
+	e = TAILQ_FIRST(&cc->msg_queue);
+	if (e == NULL)
+		return 0;
+
+	size = memif_msg_send(cc->intr_handle.fd, &e->msg, e->fd);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(ERR, "sendmsg fail: %s.", strerror(errno));
+		ret = -1;
+	} else {
+		MIF_LOG(DEBUG, "Sent msg type %u.", e->msg.type);
+	}
+	TAILQ_REMOVE(&cc->msg_queue, e, next);
+	rte_free(e);
+
+	return ret;
+}
+
+static struct memif_msg_queue_elt *
+memif_msg_enq(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e;
+
+	e = rte_zmalloc("memif_msg", sizeof(struct memif_msg_queue_elt), 0);
+	if (e == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control message.");
+		return NULL;
+	}
+
+	e->fd = -1;
+	TAILQ_INSERT_TAIL(&cc->msg_queue, e, next);
+
+	return e;
+}
+
+void
+memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			 int err_code)
+{
+	struct memif_msg_queue_elt *e;
+	struct pmd_internals *pmd;
+	memif_msg_disconnect_t *d;
+
+	if (cc == NULL) {
+		MIF_LOG(DEBUG, "Missing control channel.");
+		return;
+	}
+
+	e = memif_msg_enq(cc);
+	if (e == NULL) {
+		MIF_LOG(WARNING, "Failed to enqueue disconnect message.");
+		return;
+	}
+
+	d = &e->msg.disconnect;
+
+	e->msg.type = MEMIF_MSG_TYPE_DISCONNECT;
+	d->code = err_code;
+
+	if (reason != NULL) {
+		strlcpy((char *)d->string, reason, sizeof(d->string));
+		if (cc->dev != NULL) {
+			pmd = cc->dev->data->dev_private;
+			strlcpy(pmd->local_disc_string, reason,
+				sizeof(pmd->local_disc_string));
+		}
+	}
+}
+
+static int
+memif_msg_enq_hello(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	memif_msg_hello_t *h;
+
+	if (e == NULL)
+		return -1;
+
+	h = &e->msg.hello;
+
+	e->msg.type = MEMIF_MSG_TYPE_HELLO;
+	h->min_version = MEMIF_VERSION;
+	h->max_version = MEMIF_VERSION;
+	h->max_s2m_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_m2s_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_region = ETH_MEMIF_MAX_REGION_NUM - 1;
+	h->max_log2_ring_size = ETH_MEMIF_MAX_LOG2_RING_SIZE;
+
+	strlcpy((char *)h->name, rte_version(), sizeof(h->name));
+
+	return 0;
+}
+
+static int
+memif_msg_receive_hello(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_hello_t *h = &msg->hello;
+
+	if (h->min_version > MEMIF_VERSION || h->max_version < MEMIF_VERSION) {
+		memif_msg_enq_disconnect(pmd->cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	/* Set parameters for active connection */
+	pmd->run.num_s2m_rings = RTE_MIN(h->max_s2m_ring + 1,
+					   pmd->cfg.num_s2m_rings);
+	pmd->run.num_m2s_rings = RTE_MIN(h->max_m2s_ring + 1,
+					   pmd->cfg.num_m2s_rings);
+	pmd->run.log2_ring_size = RTE_MIN(h->max_log2_ring_size,
+					    pmd->cfg.log2_ring_size);
+	pmd->run.pkt_buffer_size = pmd->cfg.pkt_buffer_size;
+
+	strlcpy(pmd->remote_name, (char *)h->name, sizeof(pmd->remote_name));
+
+	MIF_LOG(DEBUG, "%s: Connecting to %s.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_init(struct memif_control_channel *cc, memif_msg_t *msg)
+{
+	memif_msg_init_t *i = &msg->init;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *pmd;
+	struct rte_eth_dev *dev;
+
+	if (i->version != MEMIF_VERSION) {
+		memif_msg_enq_disconnect(cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	if (cc->socket == NULL) {
+		memif_msg_enq_disconnect(cc, "Device error", 0);
+		return -1;
+	}
+
+	/* Find device with requested ID */
+	TAILQ_FOREACH(elt, &cc->socket->dev_queue, next) {
+		dev = elt->dev;
+		pmd = dev->data->dev_private;
+		if (((pmd->flags & ETH_MEMIF_FLAG_DISABLED) == 0) &&
+		    pmd->id == i->id) {
+			/* assign control channel to device */
+			cc->dev = dev;
+			pmd->cc = cc;
+
+			if (i->mode != MEMIF_INTERFACE_MODE_ETHERNET) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Only ethernet mode supported",
+							 0);
+				return -1;
+			}
+
+			if (pmd->flags & (ETH_MEMIF_FLAG_CONNECTING |
+					   ETH_MEMIF_FLAG_CONNECTED)) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Already connected", 0);
+				return -1;
+			}
+			strlcpy(pmd->remote_name, (char *)i->name,
+				sizeof(pmd->remote_name));
+
+			if (*pmd->secret != '\0') {
+				if (*i->secret == '\0') {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Secret required", 0);
+					return -1;
+				}
+				if (strncmp(pmd->secret, (char *)i->secret,
+						ETH_MEMIF_SECRET_SIZE) != 0) {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Incorrect secret", 0);
+					return -1;
+				}
+			}
+
+			pmd->flags |= ETH_MEMIF_FLAG_CONNECTING;
+			return 0;
+		}
+	}
+
+	/* ID not found on this socket */
+	MIF_LOG(DEBUG, "ID %u not found.", i->id);
+	memif_msg_enq_disconnect(cc, "ID not found", 0);
+	return -1;
+}
+
+static int
+memif_msg_receive_add_region(struct rte_eth_dev *dev, memif_msg_t *msg,
+			     int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_region_t *ar = &msg->add_region;
+	struct memif_region *r;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing region fd", 0);
+		return -1;
+	}
+
+	if (ar->index >= ETH_MEMIF_MAX_REGION_NUM || ar->index != pmd->regions_num ||
+			pmd->regions[ar->index] != NULL) {
+		memif_msg_enq_disconnect(pmd->cc, "Invalid region index", 0);
+		return -1;
+	}
+
+	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	r->fd = fd;
+	r->region_size = ar->size;
+	r->addr = NULL;
+
+	pmd->regions[ar->index] = r;
+	pmd->regions_num++;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_add_ring(struct rte_eth_dev *dev, memif_msg_t *msg, int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_ring_t *ar = &msg->add_ring;
+	struct memif_queue *mq;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing interrupt fd", 0);
+		return -1;
+	}
+
+	/* check if we have enough queues */
+	if (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) {
+		if (ar->index >= pmd->cfg.num_s2m_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_s2m_rings++;
+	} else {
+		if (ar->index >= pmd->cfg.num_m2s_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_m2s_rings++;
+	}
+
+	mq = (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) ?
+	    dev->data->rx_queues[ar->index] : dev->data->tx_queues[ar->index];
+
+	mq->intr_handle.fd = fd;
+	mq->log2_ring_size = ar->log2_ring_size;
+	mq->region = ar->region;
+	mq->ring_offset = ar->offset;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connect_t *c = &msg->connect;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connected(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connected_t *c = &msg->connected;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_disconnect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_disconnect_t *d = &msg->disconnect;
+
+	memset(pmd->remote_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	strlcpy(pmd->remote_disc_string, (char *)d->string,
+		sizeof(pmd->remote_disc_string));
+
+	MIF_LOG(INFO, "%s: Disconnect received: %s",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_disc_string);
+
+	memset(pmd->local_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	memif_disconnect(rte_eth_dev_allocated
+			 (rte_vdev_device_name(pmd->vdev)));
+	return 0;
+}
+
+static int
+memif_msg_enq_ack(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	e->msg.type = MEMIF_MSG_TYPE_ACK;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_init(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_init_t *i = &e->msg.init;
+
+	if (e == NULL)
+		return -1;
+
+	i = &e->msg.init;
+	e->msg.type = MEMIF_MSG_TYPE_INIT;
+	i->version = MEMIF_VERSION;
+	i->id = pmd->id;
+	i->mode = MEMIF_INTERFACE_MODE_ETHERNET;
+
+	strlcpy((char *)i->name, rte_version(), sizeof(i->name));
+
+	if (*pmd->secret != '\0')
+		strlcpy((char *)i->secret, pmd->secret, sizeof(i->secret));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_region(struct rte_eth_dev *dev, uint8_t idx)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_add_region_t *ar;
+	struct memif_region *mr = pmd->regions[idx];
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_region;
+	e->msg.type = MEMIF_MSG_TYPE_ADD_REGION;
+	e->fd = mr->fd;
+	ar->index = idx;
+	ar->size = mr->region_size;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_ring(struct rte_eth_dev *dev, uint8_t idx,
+		       memif_ring_type_t type)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	struct memif_queue *mq;
+	memif_msg_add_ring_t *ar;
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_ring;
+	mq = (type == MEMIF_RING_S2M) ? dev->data->tx_queues[idx] :
+	    dev->data->rx_queues[idx];
+
+	e->msg.type = MEMIF_MSG_TYPE_ADD_RING;
+	e->fd = mq->intr_handle.fd;
+	ar->index = idx;
+	ar->offset = mq->ring_offset;
+	ar->region = mq->region;
+	ar->log2_ring_size = mq->log2_ring_size;
+	ar->flags = (type == MEMIF_RING_S2M) ? MEMIF_MSG_ADD_RING_FLAG_S2M : 0;
+	ar->private_hdr_size = 0;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connect_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connect;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECT;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connected(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connected_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connected;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECTED;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static void
+memif_intr_unregister_handler(struct rte_intr_handle *intr_handle, void *arg)
+{
+	struct memif_msg_queue_elt *elt;
+	struct memif_control_channel *cc = arg;
+
+	/* close control channel fd */
+	close(intr_handle->fd);
+	/* clear message queue */
+	while ((elt = TAILQ_FIRST(&cc->msg_queue)) != NULL) {
+		TAILQ_REMOVE(&cc->msg_queue, elt, next);
+		rte_free(elt);
+	}
+	/* free control channel */
+	rte_free(cc);
+}
+
+void
+memif_disconnect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *elt, *next;
+	struct memif_queue *mq;
+	struct rte_intr_handle *ih;
+	int i;
+	int ret;
+
+	if (pmd->cc != NULL) {
+		/* Clear control message queue (except disconnect message if any). */
+		for (elt = TAILQ_FIRST(&pmd->cc->msg_queue); elt != NULL; elt = next) {
+			next = TAILQ_NEXT(elt, next);
+			if (elt->msg.type != MEMIF_MSG_TYPE_DISCONNECT) {
+				TAILQ_REMOVE(&pmd->cc->msg_queue, elt, next);
+				rte_free(elt);
+			}
+		}
+		/* send disconnect message (if there is any in queue) */
+		memif_msg_send_from_queue(pmd->cc);
+
+		/* at this point, there should be no more messages in queue */
+		if (TAILQ_FIRST(&pmd->cc->msg_queue) != NULL) {
+			MIF_LOG(WARNING,
+				"%s: Unexpected message(s) in message queue.",
+				rte_vdev_device_name(pmd->vdev));
+		}
+
+		ih = &pmd->cc->intr_handle;
+		if (ih->fd > 0) {
+			ret = rte_intr_callback_unregister(ih,
+							memif_intr_handler,
+							pmd->cc);
+			/*
+			 * If callback is active (disconnecting based on
+			 * received control message).
+			 */
+			if (ret == -EAGAIN) {
+				ret = rte_intr_callback_unregister_pending(ih,
+							memif_intr_handler,
+							pmd->cc,
+							memif_intr_unregister_handler);
+			} else if (ret > 0) {
+				close(ih->fd);
+				rte_free(pmd->cc);
+			}
+			pmd->cc = NULL;
+			if (ret <= 0)
+				MIF_LOG(WARNING, "%s: Failed to unregister "
+					"control channel callback.",
+					rte_vdev_device_name(pmd->vdev));
+		}
+	}
+
+	/* unconfig interrupts */
+	for (i = 0; i < pmd->cfg.num_s2m_rings; i++) {
+		if (pmd->role == MEMIF_ROLE_SLAVE) {
+			if (dev->data->tx_queues != NULL)
+				mq = dev->data->tx_queues[i];
+			else
+				continue;
+		} else {
+			if (dev->data->rx_queues != NULL)
+				mq = dev->data->rx_queues[i];
+			else
+				continue;
+		}
+		if (mq->intr_handle.fd > 0) {
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+	for (i = 0; i < pmd->cfg.num_m2s_rings; i++) {
+		if (pmd->role == MEMIF_ROLE_MASTER) {
+			if (dev->data->tx_queues != NULL)
+				mq = dev->data->tx_queues[i];
+			else
+				continue;
+		} else {
+			if (dev->data->rx_queues != NULL)
+				mq = dev->data->rx_queues[i];
+			else
+				continue;
+		}
+		if (mq->intr_handle.fd > 0) {
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+
+	memif_free_regions(pmd);
+
+	/* reset connection configuration */
+	memset(&pmd->run, 0, sizeof(pmd->run));
+
+	dev->data->dev_link.link_status = ETH_LINK_DOWN;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTED;
+	MIF_LOG(DEBUG, "%s: Disconnected.", rte_vdev_device_name(pmd->vdev));
+}
+
+static int
+memif_msg_receive(struct memif_control_channel *cc)
+{
+	char ctl[CMSG_SPACE(sizeof(int)) +
+		 CMSG_SPACE(sizeof(struct ucred))] = { 0 };
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	memif_msg_t msg = { 0 };
+	ssize_t size;
+	int ret = 0;
+	struct ucred *cr __rte_unused;
+	cr = 0;
+	struct cmsghdr *cmsg;
+	int afd = -1;
+	int i;
+	struct pmd_internals *pmd;
+
+	iov[0].iov_base = (void *)&msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+	mh.msg_control = ctl;
+	mh.msg_controllen = sizeof(ctl);
+
+	size = recvmsg(cc->intr_handle.fd, &mh, 0);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(DEBUG, "Invalid message size.");
+		memif_msg_enq_disconnect(cc, "Invalid message size", 0);
+		return -1;
+	}
+	MIF_LOG(DEBUG, "Received msg type: %u.", msg.type);
+
+	cmsg = CMSG_FIRSTHDR(&mh);
+	while (cmsg) {
+		if (cmsg->cmsg_level == SOL_SOCKET) {
+			if (cmsg->cmsg_type == SCM_CREDENTIALS)
+				cr = (struct ucred *)CMSG_DATA(cmsg);
+			else if (cmsg->cmsg_type == SCM_RIGHTS)
+				afd = *(int *)CMSG_DATA(cmsg);
+		}
+		cmsg = CMSG_NXTHDR(&mh, cmsg);
+	}
+
+	if (cc->dev == NULL && msg.type != MEMIF_MSG_TYPE_INIT) {
+		MIF_LOG(DEBUG, "Unexpected message.");
+		memif_msg_enq_disconnect(cc, "Unexpected message", 0);
+		return -1;
+	}
+
+	/* get device from hash data */
+	switch (msg.type) {
+	case MEMIF_MSG_TYPE_ACK:
+		break;
+	case MEMIF_MSG_TYPE_HELLO:
+		ret = memif_msg_receive_hello(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_init_regions_and_queues(cc->dev);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_init(cc->dev);
+		if (ret < 0)
+			goto exit;
+		pmd = cc->dev->data->dev_private;
+		for (i = 0; i < pmd->regions_num; i++) {
+			ret = memif_msg_enq_add_region(cc->dev, i);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_S2M);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_M2S);
+			if (ret < 0)
+				goto exit;
+		}
+		ret = memif_msg_enq_connect(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_INIT:
+		/*
+		 * This cc does not have an interface asociated with it.
+		 * If suitable interface is found it will be assigned here.
+		 */
+		ret = memif_msg_receive_init(cc, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_REGION:
+		ret = memif_msg_receive_add_region(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_RING:
+		ret = memif_msg_receive_add_ring(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECT:
+		ret = memif_msg_receive_connect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_connected(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECTED:
+		ret = memif_msg_receive_connected(cc->dev, &msg);
+		break;
+	case MEMIF_MSG_TYPE_DISCONNECT:
+		ret = memif_msg_receive_disconnect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		break;
+	default:
+		memif_msg_enq_disconnect(cc, "Unknown message type", 0);
+		ret = -1;
+		goto exit;
+	}
+
+ exit:
+	return ret;
+}
+
+static void
+memif_intr_handler(void *arg)
+{
+	struct memif_control_channel *cc = arg;
+	int ret;
+
+	ret = memif_msg_receive(cc);
+	/* if driver failed to assign device */
+	if (cc->dev == NULL) {
+		ret = rte_intr_callback_unregister_pending(&cc->intr_handle,
+							   memif_intr_handler,
+							   cc,
+							   memif_intr_unregister_handler);
+		if (ret < 0)
+			MIF_LOG(WARNING,
+				"Failed to unregister control channel callback.");
+		return;
+	}
+	/* if memif_msg_receive failed */
+	if (ret < 0)
+		goto disconnect;
+
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto disconnect;
+
+	return;
+
+ disconnect:
+	if (cc->dev == NULL) {
+		MIF_LOG(WARNING, "eth dev not allocated");
+		return;
+	}
+	memif_disconnect(cc->dev);
+}
+
+static void
+memif_listener_handler(void *arg)
+{
+	struct memif_socket *socket = arg;
+	int sockfd;
+	int addr_len;
+	struct sockaddr_un client;
+	struct memif_control_channel *cc;
+	int ret;
+
+	addr_len = sizeof(client);
+	sockfd = accept(socket->intr_handle.fd, (struct sockaddr *)&client,
+			(socklen_t *)&addr_len);
+	if (sockfd < 0) {
+		MIF_LOG(ERR,
+			"Failed to accept connection request on socket fd %d",
+			socket->intr_handle.fd);
+		return;
+	}
+
+	MIF_LOG(DEBUG, "%s: Connection request accepted.", socket->filename);
+
+	cc = rte_zmalloc("memif-cc", sizeof(struct memif_control_channel), 0);
+	if (cc == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control channel.");
+		goto error;
+	}
+
+	cc->intr_handle.fd = sockfd;
+	cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	cc->socket = socket;
+	cc->dev = NULL;
+	TAILQ_INIT(&cc->msg_queue);
+
+	ret = rte_intr_callback_register(&cc->intr_handle, memif_intr_handler, cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to register control channel callback.");
+		goto error;
+	}
+
+	ret = memif_msg_enq_hello(cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to enqueue hello message.");
+		goto error;
+	}
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto error;
+
+	return;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (cc != NULL)
+		rte_free(cc);
+}
+
+static struct memif_socket *
+memif_socket_create(struct pmd_internals *pmd, char *key, uint8_t listener)
+{
+	struct memif_socket *sock;
+	struct sockaddr_un un;
+	int sockfd;
+	int ret;
+	int on = 1;
+
+	sock = rte_zmalloc("memif-socket", sizeof(struct memif_socket), 0);
+	if (sock == NULL) {
+		MIF_LOG(ERR, "Failed to allocate memory for memif socket");
+		return NULL;
+	}
+
+	sock->listener = listener;
+	rte_memcpy(sock->filename, key, 256);
+	TAILQ_INIT(&sock->dev_queue);
+
+	if (listener != 0) {
+		sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+		if (sockfd < 0)
+			goto error;
+
+		un.sun_family = AF_UNIX;
+		memcpy(un.sun_path, sock->filename,
+			sizeof(un.sun_path) - 1);
+
+		ret = setsockopt(sockfd, SOL_SOCKET, SO_PASSCRED, &on,
+				 sizeof(on));
+		if (ret < 0)
+			goto error;
+		ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
+		if (ret < 0)
+			goto error;
+		ret = listen(sockfd, 1);
+		if (ret < 0)
+			goto error;
+
+		MIF_LOG(DEBUG, "%s: Memif listener socket %s created.",
+			rte_vdev_device_name(pmd->vdev), sock->filename);
+
+		sock->intr_handle.fd = sockfd;
+		sock->intr_handle.type = RTE_INTR_HANDLE_EXT;
+		ret = rte_intr_callback_register(&sock->intr_handle,
+						 memif_listener_handler, sock);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to register interrupt "
+				"callback for listener socket",
+				rte_vdev_device_name(pmd->vdev));
+			return NULL;
+		}
+	}
+
+	return sock;
+
+ error:
+	MIF_LOG(ERR, "%s: Failed to setup socket %s: %s",
+		rte_vdev_device_name(pmd->vdev), key, strerror(errno));
+	if (sock != NULL)
+		rte_free(sock);
+	return NULL;
+}
+
+static struct rte_hash *
+memif_create_socket_hash(void)
+{
+	struct rte_hash_parameters params = { 0 };
+	params.name = MEMIF_SOCKET_HASH_NAME;
+	params.entries = 256;
+	params.key_len = 256;
+	params.hash_func = rte_jhash;
+	params.hash_func_init_val = 0;
+	return rte_hash_create(&params);
+}
+
+int
+memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *tmp_pmd;
+	struct rte_hash *hash;
+	int ret;
+	char key[256];
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL) {
+		hash = memif_create_socket_hash();
+		if (hash == NULL) {
+			MIF_LOG(ERR, "Failed to create memif socket hash.");
+			return -1;
+		}
+	}
+
+	memset(key, 0, 256);
+	rte_memcpy(key, socket_filename, strlen(socket_filename));
+	ret = rte_hash_lookup_data(hash, key, (void **)&socket);
+	if (ret < 0) {
+		socket = memif_socket_create(pmd, key,
+					     (pmd->role ==
+					      MEMIF_ROLE_SLAVE) ? 0 : 1);
+		if (socket == NULL)
+			return -1;
+		ret = rte_hash_add_key_data(hash, key, socket);
+		if (ret < 0) {
+			MIF_LOG(ERR, "Failed to add socket to socket hash.");
+			return ret;
+		}
+	}
+	pmd->socket_filename = socket->filename;
+
+	if (socket->listener != 0 && pmd->role == MEMIF_ROLE_SLAVE) {
+		MIF_LOG(ERR, "Socket is a listener.");
+		return -1;
+	} else if ((socket->listener == 0) && (pmd->role == MEMIF_ROLE_MASTER)) {
+		MIF_LOG(ERR, "Socket is not a listener.");
+		return -1;
+	}
+
+	TAILQ_FOREACH(elt, &socket->dev_queue, next) {
+		tmp_pmd = elt->dev->data->dev_private;
+		if (tmp_pmd->id == pmd->id) {
+			MIF_LOG(ERR, "Memif device with id %d already "
+				"exists on socket %s",
+				pmd->id, socket->filename);
+			return -1;
+		}
+	}
+
+	elt = rte_malloc("pmd-queue", sizeof(struct memif_socket_dev_list_elt), 0);
+	if (elt == NULL) {
+		MIF_LOG(ERR, "%s: Failed to add device to socket device list.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+	elt->dev = dev;
+	TAILQ_INSERT_TAIL(&socket->dev_queue, elt, next);
+
+	return 0;
+}
+
+void
+memif_socket_remove_device(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt, *next;
+	struct rte_hash *hash;
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL)
+		return;
+
+	if (pmd->socket_filename == NULL)
+		return;
+
+	if (rte_hash_lookup_data(hash, pmd->socket_filename, (void **)&socket) < 0)
+		return;
+
+	for (elt = TAILQ_FIRST(&socket->dev_queue); elt != NULL; elt = next) {
+		next = TAILQ_NEXT(elt, next);
+		if (elt->dev == dev) {
+			TAILQ_REMOVE(&socket->dev_queue, elt, next);
+			rte_free(elt);
+			pmd->socket_filename = NULL;
+		}
+	}
+
+	/* remove socket, if this was the last device using it */
+	if (TAILQ_EMPTY(&socket->dev_queue)) {
+		rte_hash_del_key(hash, socket->filename);
+		if (socket->listener) {
+			/* remove listener socket file,
+			 * so we can create new one later.
+			 */
+			remove(socket->filename);
+		}
+		rte_free(socket);
+	}
+}
+
+int
+memif_connect_master(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	memset(pmd->remote_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+	return 0;
+}
+
+int
+memif_connect_slave(struct rte_eth_dev *dev)
+{
+	int sockfd;
+	int ret;
+	struct sockaddr_un sun;
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	memset(pmd->remote_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+
+	sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+	if (sockfd < 0) {
+		MIF_LOG(ERR, "%s: Failed to open socket.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	sun.sun_family = AF_UNIX;
+
+	memcpy(sun.sun_path, pmd->socket_filename, sizeof(sun.sun_path) - 1);
+
+	ret = connect(sockfd, (struct sockaddr *)&sun,
+		      sizeof(struct sockaddr_un));
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to connect socket: %s.",
+			rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+		goto error;
+	}
+
+	MIF_LOG(DEBUG, "%s: Memif socket: %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+
+	pmd->cc = rte_zmalloc("memif-cc",
+			      sizeof(struct memif_control_channel), 0);
+	if (pmd->cc == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate control channel.",
+			rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	pmd->cc->intr_handle.fd = sockfd;
+	pmd->cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	pmd->cc->socket = NULL;
+	pmd->cc->dev = dev;
+	TAILQ_INIT(&pmd->cc->msg_queue);
+
+	ret = rte_intr_callback_register(&pmd->cc->intr_handle,
+					 memif_intr_handler, pmd->cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to register interrupt callback "
+			"for control fd", rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	return 0;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (pmd->cc != NULL) {
+		rte_free(pmd->cc);
+		pmd->cc = NULL;
+	}
+	return -1;
+}
diff --git a/drivers/net/memif/memif_socket.h b/drivers/net/memif/memif_socket.h
new file mode 100644
index 000000000..c60614721
--- /dev/null
+++ b/drivers/net/memif/memif_socket.h
@@ -0,0 +1,105 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_SOCKET_H_
+#define _MEMIF_SOCKET_H_
+
+#include <sys/queue.h>
+
+/**
+ * Remove device from socket device list. If no device is left on the socket,
+ * remove the socket as well.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_socket_remove_device(struct rte_eth_dev *dev);
+
+/**
+ * Enqueue disconnect message to control channel message queue.
+ *
+ * @param cc
+ *   control channel
+ * @param reason
+ *   const string stating disconnect reason (96 characters)
+ * @param err_code
+ *   error code
+ */
+void memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			      int err_code);
+
+/**
+ * Initialize memif socket for specified device. If socket doesn't exist, create socket.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @param socket_filename
+ *   socket filename
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename);
+
+/**
+ * Disconnect memif device. Close control channel and shared memory.
+ *
+ * @param dev
+ *   ethernet device
+ */
+void memif_disconnect(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, enable connection establishment.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_master(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, send connection request.
+ *
+ * @param dev
+ *   memif ethernet device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_slave(struct rte_eth_dev *dev);
+
+struct memif_socket_dev_list_elt {
+	TAILQ_ENTRY(memif_socket_dev_list_elt) next;
+	struct rte_eth_dev *dev;		/**< pointer to device internals */
+	char dev_name[RTE_ETH_NAME_MAX_LEN];
+};
+
+#define MEMIF_SOCKET_HASH_NAME			"memif-sh"
+struct memif_socket {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	char filename[256];			/**< socket filename */
+
+	TAILQ_HEAD(, memif_socket_dev_list_elt) dev_queue;
+	/**< Queue of devices using this socket */
+	uint8_t listener;			/**< if not zero socket is listener */
+};
+
+/* Control message queue. */
+struct memif_msg_queue_elt {
+	memif_msg_t msg;			/**< control message */
+	TAILQ_ENTRY(memif_msg_queue_elt) next;
+	int fd;					/**< fd to be sent to peer */
+};
+
+struct memif_control_channel {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	TAILQ_HEAD(, memif_msg_queue_elt) msg_queue; /**< control message queue */
+	struct memif_socket *socket;		/**< pointer to socket */
+	struct rte_eth_dev *dev;		/**< pointer to device */
+};
+
+#endif				/* MEMIF_SOCKET_H */
diff --git a/drivers/net/memif/meson.build b/drivers/net/memif/meson.build
new file mode 100644
index 000000000..287a30ef2
--- /dev/null
+++ b/drivers/net/memif/meson.build
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+if host_machine.system() != 'linux'
+        build = false
+endif
+
+sources = files('rte_eth_memif.c',
+		'memif_socket.c')
+
+allow_experimental_apis = true
+# Experimantal APIs:
+# - rte_intr_callback_unregister_pending
+
+deps += ['hash']
diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
new file mode 100644
index 000000000..3b9c1ce66
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.c
@@ -0,0 +1,1206 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <linux/if_ether.h>
+#include <errno.h>
+#include <sys/eventfd.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+#define ETH_MEMIF_ID_ARG		"id"
+#define ETH_MEMIF_ROLE_ARG		"role"
+#define ETH_MEMIF_PKT_BUFFER_SIZE_ARG	"bsize"
+#define ETH_MEMIF_RING_SIZE_ARG		"rsize"
+#define ETH_MEMIF_SOCKET_ARG		"socket"
+#define ETH_MEMIF_MAC_ARG		"mac"
+#define ETH_MEMIF_ZC_ARG		"zero-copy"
+#define ETH_MEMIF_SECRET_ARG		"secret"
+
+static const char *valid_arguments[] = {
+	ETH_MEMIF_ID_ARG,
+	ETH_MEMIF_ROLE_ARG,
+	ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
+	ETH_MEMIF_RING_SIZE_ARG,
+	ETH_MEMIF_SOCKET_ARG,
+	ETH_MEMIF_MAC_ARG,
+	ETH_MEMIF_ZC_ARG,
+	ETH_MEMIF_SECRET_ARG,
+	NULL
+};
+
+const char *
+memif_version(void)
+{
+	return ("memif-" RTE_STR(MEMIF_VERSION_MAJOR) "." RTE_STR(MEMIF_VERSION_MINOR));
+}
+
+static void
+memif_dev_info(struct rte_eth_dev *dev __rte_unused, struct rte_eth_dev_info *dev_info)
+{
+	dev_info->max_mac_addrs = 1;
+	dev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;
+	dev_info->max_rx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->max_tx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->min_rx_bufsize = 0;
+}
+
+static memif_ring_t *
+memif_get_ring(struct pmd_internals *pmd, memif_ring_type_t type, uint16_t ring_num)
+{
+	/* rings only in region 0 */
+	void *p = pmd->regions[0]->addr;
+	int ring_size = sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size);
+
+	p = (uint8_t *)p + (ring_num + type * pmd->run.num_s2m_rings) * ring_size;
+
+	return (memif_ring_t *)p;
+}
+
+static void *
+memif_get_buffer(struct pmd_internals *pmd, memif_desc_t *d)
+{
+	return ((uint8_t *)pmd->regions[d->region]->addr + d->offset);
+}
+
+static int
+memif_pktmbuf_chain(struct rte_mbuf *head, struct rte_mbuf *cur_tail,
+		    struct rte_mbuf *tail)
+{
+	/* Check for number-of-segments-overflow */
+	if (unlikely(head->nb_segs + tail->nb_segs > RTE_MBUF_MAX_NB_SEGS))
+		return -EOVERFLOW;
+
+	/* Chain 'tail' onto the old tail */
+	cur_tail->next = tail;
+
+	/* accumulate number of segments and total length. */
+	head->nb_segs = (uint16_t)(head->nb_segs + tail->nb_segs);
+
+	tail->pkt_len = tail->data_len;
+	head->pkt_len += tail->pkt_len;
+
+	return 0;
+}
+
+static uint16_t
+eth_memif_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t cur_slot, last_slot, n_slots, ring_size, mask, s0;
+	uint16_t n_rx_pkts = 0;
+	uint16_t mbuf_size = rte_pktmbuf_data_room_size(mq->mempool) -
+		RTE_PKTMBUF_HEADROOM;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf, *mbuf_head, *mbuf_tail;
+	uint64_t b;
+	ssize_t size __rte_unused;
+	uint16_t head;
+	int ret;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	/* consume interrupt */
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0)
+		size = read(mq->intr_handle.fd, &b, sizeof(b));
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	cur_slot = (type == MEMIF_RING_S2M) ? mq->last_head : mq->last_tail;
+	last_slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+	if (cur_slot == last_slot)
+		goto refill;
+	n_slots = last_slot - cur_slot;
+
+	while (n_slots && n_rx_pkts < nb_pkts) {
+		mbuf_head = rte_pktmbuf_alloc(mq->mempool);
+		if (unlikely(mbuf_head == NULL))
+			goto no_free_bufs;
+		mbuf = mbuf_head;
+		mbuf->port = mq->in_port;
+
+next_slot:
+		s0 = cur_slot & mask;
+		d0 = &ring->desc[s0];
+
+		src_len = d0->length;
+		dst_off = 0;
+		src_off = 0;
+
+		do {
+			dst_len = mbuf_size - dst_off;
+			if (dst_len == 0) {
+				dst_off = 0;
+				dst_len = mbuf_size;
+
+				/* store pointer to tail */
+				mbuf_tail = mbuf;
+				mbuf = rte_pktmbuf_alloc(mq->mempool);
+				if (unlikely(mbuf == NULL))
+					goto no_free_bufs;
+				mbuf->port = mq->in_port;
+				ret = memif_pktmbuf_chain(mbuf_head, mbuf_tail, mbuf);
+				if (unlikely(ret < 0)) {
+					MIF_LOG(ERR, "%s: number-of-segments-overflow",
+						rte_vdev_device_name(pmd->vdev));
+					rte_pktmbuf_free(mbuf);
+					goto no_free_bufs;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			rte_pktmbuf_data_len(mbuf) += cp_len;
+			rte_pktmbuf_pkt_len(mbuf) = rte_pktmbuf_data_len(mbuf);
+			if (mbuf != mbuf_head)
+				rte_pktmbuf_pkt_len(mbuf_head) += cp_len;
+
+			memcpy(rte_pktmbuf_mtod_offset(mbuf, void *, dst_off),
+			       (uint8_t *)memif_get_buffer(pmd, d0) + src_off, cp_len);
+
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+		} while (src_len);
+
+		cur_slot++;
+		n_slots--;
+
+		if (d0->flags & MEMIF_DESC_FLAG_NEXT)
+			goto next_slot;
+
+		mq->n_bytes += rte_pktmbuf_pkt_len(mbuf_head);
+		*bufs++ = mbuf_head;
+		n_rx_pkts++;
+	}
+
+no_free_bufs:
+	if (type == MEMIF_RING_S2M) {
+		rte_mb();
+		ring->tail = cur_slot;
+		mq->last_head = cur_slot;
+	} else {
+		mq->last_tail = cur_slot;
+	}
+
+refill:
+	if (type == MEMIF_RING_M2S) {
+		head = ring->head;
+		n_slots = ring_size - head + mq->last_tail;
+
+		while (n_slots--) {
+			s0 = head++ & mask;
+			d0 = &ring->desc[s0];
+			d0->length = pmd->run.pkt_buffer_size;
+		}
+		rte_mb();
+		ring->head = head;
+	}
+
+	mq->n_pkts += n_rx_pkts;
+	return n_rx_pkts;
+}
+
+static uint16_t
+eth_memif_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t slot, saved_slot, n_free, ring_size, mask, n_tx_pkts = 0;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head;
+	uint64_t a;
+	ssize_t size;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	n_free = ring->tail - mq->last_tail;
+	mq->last_tail += n_free;
+	slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+
+	if (type == MEMIF_RING_S2M)
+		n_free = ring_size - ring->head + mq->last_tail;
+	else
+		n_free = ring->head - ring->tail;
+
+	while (n_tx_pkts < nb_pkts && n_free) {
+		mbuf_head = *bufs++;
+		mbuf = mbuf_head;
+
+		saved_slot = slot;
+		d0 = &ring->desc[slot & mask];
+		dst_off = 0;
+		dst_len = (type == MEMIF_RING_S2M) ?
+			pmd->run.pkt_buffer_size : d0->length;
+
+next_in_chain:
+		src_off = 0;
+		src_len = rte_pktmbuf_data_len(mbuf);
+
+		while (src_len) {
+			if (dst_len == 0) {
+				if (n_free) {
+					slot++;
+					n_free--;
+					d0->flags |= MEMIF_DESC_FLAG_NEXT;
+					d0 = &ring->desc[slot & mask];
+					dst_off = 0;
+					dst_len = (type == MEMIF_RING_S2M) ?
+					    pmd->run.pkt_buffer_size : d0->length;
+					d0->flags = 0;
+				} else {
+					slot = saved_slot;
+					goto no_free_slots;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			memcpy((uint8_t *)memif_get_buffer(pmd, d0) + dst_off,
+			       rte_pktmbuf_mtod_offset(mbuf, void *, src_off),
+			       cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+			dst_len -= cp_len;
+
+			d0->length = dst_off;
+		}
+
+		if (rte_pktmbuf_is_contiguous(mbuf) == 0) {
+			mbuf = mbuf->next;
+			goto next_in_chain;
+		}
+
+		n_tx_pkts++;
+		slot++;
+		n_free--;
+		rte_pktmbuf_free(mbuf_head);
+	}
+
+no_free_slots:
+	rte_mb();
+	if (type == MEMIF_RING_S2M)
+		ring->head = slot;
+	else
+		ring->tail = slot;
+
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
+		a = 1;
+		size = write(mq->intr_handle.fd, &a, sizeof(a));
+		if (unlikely(size < 0)) {
+			MIF_LOG(WARNING,
+				"%s: Failed to send interrupt. %s",
+				rte_vdev_device_name(pmd->vdev), strerror(errno));
+		}
+	}
+
+	mq->n_err += nb_pkts - n_tx_pkts;
+	mq->n_pkts += n_tx_pkts;
+	return n_tx_pkts;
+}
+
+void
+memif_free_regions(struct pmd_internals *pmd)
+{
+	int i;
+	struct memif_region *r;
+
+	/* regions are allocated contiguously, so it's
+	 * enough to loop until 'pmd->regions_num'
+	 */
+	for (i = 0; i < pmd->regions_num; i++) {
+		r = pmd->regions[i];
+		if (r != NULL) {
+			if (r->addr != NULL) {
+				munmap(r->addr, r->region_size);
+				if (r->fd > 0) {
+					close(r->fd);
+					r->fd = -1;
+				}
+			}
+			rte_free(r);
+			pmd->regions[i] = NULL;
+		}
+	}
+	pmd->regions_num = 0;
+}
+
+static int
+memif_region_init_shm(struct pmd_internals *pmd, uint8_t has_buffers)
+{
+	char shm_name[ETH_MEMIF_SHM_NAME_SIZE];
+	int ret = 0;
+	struct memif_region *r;
+
+	if (pmd->regions_num >= ETH_MEMIF_MAX_REGION_NUM) {
+		MIF_LOG(ERR, "%s: Too many regions.", rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	/* calculate buffer offset */
+	r->pkt_buffer_offset = (pmd->run.num_s2m_rings + pmd->run.num_m2s_rings) *
+	    (sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size));
+
+	r->region_size = r->pkt_buffer_offset;
+	/* if region has buffers, add buffers size to region_size */
+	if (has_buffers == 1)
+		r->region_size += (uint32_t)(pmd->run.pkt_buffer_size *
+			(1 << pmd->run.log2_ring_size) *
+			(pmd->run.num_s2m_rings +
+			 pmd->run.num_m2s_rings));
+
+	memset(shm_name, 0, sizeof(char) * ETH_MEMIF_SHM_NAME_SIZE);
+	snprintf(shm_name, ETH_MEMIF_SHM_NAME_SIZE, "memif_region_%d",
+		 pmd->regions_num);
+
+	r->fd = memfd_create(shm_name, MFD_ALLOW_SEALING);
+	if (r->fd < 0) {
+		MIF_LOG(ERR, "%s: Failed to create shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		ret = -1;
+		goto error;
+	}
+
+	ret = fcntl(r->fd, F_ADD_SEALS, F_SEAL_SHRINK);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to add seals to shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		goto error;
+	}
+
+	ret = ftruncate(r->fd, r->region_size);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to truncate shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		goto error;
+	}
+
+	r->addr = mmap(NULL, r->region_size, PROT_READ |
+		       PROT_WRITE, MAP_SHARED, r->fd, 0);
+	if (r->addr == MAP_FAILED) {
+		MIF_LOG(ERR, "%s: Failed to mmap shm region: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(ret));
+		ret = -1;
+		goto error;
+	}
+
+	pmd->regions[pmd->regions_num] = r;
+	pmd->regions_num++;
+
+	return ret;
+
+error:
+	if (r->fd > 0)
+		close(r->fd);
+	r->fd = -1;
+
+	return ret;
+}
+
+static int
+memif_regions_init(struct pmd_internals *pmd)
+{
+	int ret;
+
+	/* create one buffer region */
+	ret = memif_region_init_shm(pmd, /* has buffer */ 1);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+static void
+memif_init_rings(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_ring_t *ring;
+	int i, j;
+	uint16_t slot;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = i * (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 0;
+			ring->desc[j].offset = pmd->regions[0]->pkt_buffer_offset +
+				(uint32_t)(slot * pmd->run.pkt_buffer_size);
+			ring->desc[j].length = pmd->run.pkt_buffer_size;
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = (i + pmd->run.num_s2m_rings) *
+			    (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 0;
+			ring->desc[j].offset = pmd->regions[0]->pkt_buffer_offset +
+				(uint32_t)(slot * pmd->run.pkt_buffer_size);
+			ring->desc[j].length = pmd->run.pkt_buffer_size;
+		}
+	}
+}
+
+/* called only by slave */
+static void
+memif_init_queues(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = dev->data->tx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->ring_offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for tx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = dev->data->rx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->ring_offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for rx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+}
+
+int
+memif_init_regions_and_queues(struct rte_eth_dev *dev)
+{
+	int ret;
+
+	ret = memif_regions_init(dev->data->dev_private);
+	if (ret < 0)
+		return ret;
+
+	memif_init_rings(dev);
+
+	memif_init_queues(dev);
+
+	return 0;
+}
+
+int
+memif_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_region *mr;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		mr = pmd->regions[i];
+		if (mr != NULL) {
+			if (mr->addr == NULL) {
+				if (mr->fd < 0)
+					return -1;
+				mr->addr = mmap(NULL, mr->region_size,
+						PROT_READ | PROT_WRITE,
+						MAP_SHARED, mr->fd, 0);
+				if (mr->addr == NULL)
+					return -1;
+			}
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
+			    mq->ring_offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_MASTER)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
+			    mq->ring_offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_SLAVE)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags |= ETH_MEMIF_FLAG_CONNECTED;
+	dev->data->dev_link.link_status = ETH_LINK_UP;
+	MIF_LOG(INFO, "%s: Connected.", rte_vdev_device_name(pmd->vdev));
+	return 0;
+}
+
+static int
+memif_dev_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int ret = 0;
+
+	switch (pmd->role) {
+	case MEMIF_ROLE_SLAVE:
+		ret = memif_connect_slave(dev);
+		break;
+	case MEMIF_ROLE_MASTER:
+		ret = memif_connect_master(dev);
+		break;
+	default:
+		MIF_LOG(ERR, "%s: Unknown role: %d.",
+			rte_vdev_device_name(pmd->vdev), pmd->role);
+		ret = -1;
+		break;
+	}
+
+	return ret;
+}
+
+static void
+memif_dev_close(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memif_msg_enq_disconnect(pmd->cc, "Device closed", 0);
+	memif_disconnect(dev);
+
+	memif_socket_remove_device(dev);
+}
+
+static int
+memif_dev_configure(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	/*
+	 * SLAVE - TXQ
+	 * MASTER - RXQ
+	 */
+	pmd->cfg.num_s2m_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_tx_queues : dev->data->nb_rx_queues;
+
+	/*
+	 * SLAVE - RXQ
+	 * MASTER - TXQ
+	 */
+	pmd->cfg.num_m2s_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_rx_queues : dev->data->nb_tx_queues;
+
+	return 0;
+}
+
+static int
+memif_tx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_tx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("tx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate tx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type =
+	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_S2M : MEMIF_RING_M2S;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->pmd = pmd;
+	dev->data->tx_queues[qid] = mq;
+
+	return 0;
+}
+
+static int
+memif_rx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_rx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_rxconf *rx_conf __rte_unused,
+		     struct rte_mempool *mb_pool)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("rx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate rx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type = (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_M2S : MEMIF_RING_S2M;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->mempool = mb_pool;
+	mq->in_port = dev->data->port_id;
+	mq->pmd = pmd;
+	dev->data->rx_queues[qid] = mq;
+
+	return 0;
+}
+
+static void
+memif_queue_release(void *queue)
+{
+	struct memif_queue *mq = (struct memif_queue *)queue;
+
+	if (!mq)
+		return;
+
+	rte_free(mq);
+}
+
+static int
+memif_link_update(struct rte_eth_dev *dev __rte_unused,
+		  int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
+static int
+memif_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+	uint8_t tmp, nq;
+
+	stats->ipackets = 0;
+	stats->ibytes = 0;
+	stats->opackets = 0;
+	stats->obytes = 0;
+	stats->oerrors = 0;
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_s2m_rings :
+	    pmd->run.num_m2s_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* RX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->rx_queues[i];
+		stats->q_ipackets[i] = mq->n_pkts;
+		stats->q_ibytes[i] = mq->n_bytes;
+		stats->ipackets += mq->n_pkts;
+		stats->ibytes += mq->n_bytes;
+	}
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_m2s_rings :
+	    pmd->run.num_s2m_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* TX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->tx_queues[i];
+		stats->q_opackets[i] = mq->n_pkts;
+		stats->q_obytes[i] = mq->n_bytes;
+		stats->opackets += mq->n_pkts;
+		stats->obytes += mq->n_bytes;
+		stats->oerrors += mq->n_err;
+	}
+	return 0;
+}
+
+static void
+memif_stats_reset(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int i;
+	struct memif_queue *mq;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->tx_queues[i] :
+		    dev->data->rx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->rx_queues[i] :
+		    dev->data->tx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+}
+
+static int
+memif_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	MIF_LOG(WARNING, "%s: Interrupt mode not supported.",
+		rte_vdev_device_name(pmd->vdev));
+
+	return -1;
+}
+
+static int
+memif_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd __rte_unused = dev->data->dev_private;
+
+	return 0;
+}
+
+static const struct eth_dev_ops ops = {
+	.dev_start = memif_dev_start,
+	.dev_close = memif_dev_close,
+	.dev_infos_get = memif_dev_info,
+	.dev_configure = memif_dev_configure,
+	.tx_queue_setup = memif_tx_queue_setup,
+	.rx_queue_setup = memif_rx_queue_setup,
+	.rx_queue_release = memif_queue_release,
+	.tx_queue_release = memif_queue_release,
+	.rx_queue_intr_enable = memif_rx_queue_intr_enable,
+	.rx_queue_intr_disable = memif_rx_queue_intr_disable,
+	.link_update = memif_link_update,
+	.stats_get = memif_stats_get,
+	.stats_reset = memif_stats_reset,
+};
+
+static int
+memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
+	     memif_interface_id_t id, uint32_t flags,
+	     const char *socket_filename,
+	     memif_log2_ring_size_t log2_ring_size,
+	     uint16_t pkt_buffer_size, const char *secret,
+	     struct rte_ether_addr *ether_addr)
+{
+	int ret = 0;
+	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev_data *data;
+	struct pmd_internals *pmd;
+	const unsigned int numa_node = vdev->device.numa_node;
+	const char *name = rte_vdev_device_name(vdev);
+
+	if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
+		MIF_LOG(ERR, "Zero-copy slave not supported.");
+		return -1;
+	}
+
+	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
+	if (eth_dev == NULL) {
+		MIF_LOG(ERR, "%s: Unable to allocate device struct.", name);
+		return -1;
+	}
+
+	pmd = eth_dev->data->dev_private;
+	memset(pmd, 0, sizeof(*pmd));
+
+	pmd->vdev = vdev;
+	pmd->id = id;
+	pmd->flags = flags;
+	pmd->flags |= ETH_MEMIF_FLAG_DISABLED;
+	pmd->role = role;
+
+	ret = memif_socket_init(eth_dev, socket_filename);
+	if (ret < 0)
+		return ret;
+
+	memset(pmd->secret, 0, sizeof(char) * ETH_MEMIF_SECRET_SIZE);
+	if (secret != NULL)
+		strlcpy(pmd->secret, secret, sizeof(pmd->secret));
+
+	pmd->cfg.log2_ring_size = log2_ring_size;
+	/* set in .dev_configure() */
+	pmd->cfg.num_s2m_rings = 0;
+	pmd->cfg.num_m2s_rings = 0;
+
+	pmd->cfg.pkt_buffer_size = pkt_buffer_size;
+
+	data = eth_dev->data;
+	data->dev_private = pmd;
+	data->numa_node = numa_node;
+	data->mac_addrs = ether_addr;
+
+	eth_dev->dev_ops = &ops;
+	eth_dev->device = &vdev->device;
+	eth_dev->rx_pkt_burst = eth_memif_rx;
+	eth_dev->tx_pkt_burst = eth_memif_tx;
+
+	eth_dev->data->dev_flags |= RTE_ETH_DEV_CLOSE_REMOVE;
+
+	rte_eth_dev_probing_finish(eth_dev);
+
+	return 0;
+}
+
+static int
+memif_set_role(const char *key __rte_unused, const char *value,
+	       void *extra_args)
+{
+	enum memif_role_t *role = (enum memif_role_t *)extra_args;
+
+	if (strstr(value, "master") != NULL) {
+		*role = MEMIF_ROLE_MASTER;
+	} else if (strstr(value, "slave") != NULL) {
+		*role = MEMIF_ROLE_SLAVE;
+	} else {
+		MIF_LOG(ERR, "Unknown role: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_zc(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	uint32_t *flags = (uint32_t *)extra_args;
+
+	if (strstr(value, "yes") != NULL) {
+		*flags |= ETH_MEMIF_FLAG_ZERO_COPY;
+	} else if (strstr(value, "no") != NULL) {
+		*flags &= ~ETH_MEMIF_FLAG_ZERO_COPY;
+	} else {
+		MIF_LOG(ERR, "Failed to parse zero-copy param: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_id(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	memif_interface_id_t *id = (memif_interface_id_t *)extra_args;
+
+	/* even if parsing fails, 0 is a valid id */
+	*id = strtoul(value, NULL, 10);
+	return 0;
+}
+
+static int
+memif_set_bs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	uint16_t *pkt_buffer_size = (uint16_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > 0xFFFF) {
+		MIF_LOG(ERR, "Invalid buffer size: %s.", value);
+		return -EINVAL;
+	}
+	*pkt_buffer_size = tmp;
+	return 0;
+}
+
+static int
+memif_set_rs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	memif_log2_ring_size_t *log2_ring_size =
+	    (memif_log2_ring_size_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > ETH_MEMIF_MAX_LOG2_RING_SIZE) {
+		MIF_LOG(ERR, "Invalid ring size: %s (max %u).",
+			value, ETH_MEMIF_MAX_LOG2_RING_SIZE);
+		return -EINVAL;
+	}
+	*log2_ring_size = tmp;
+	return 0;
+}
+
+/* check if directory exists and if we have permission to read/write */
+static int
+memif_check_socket_filename(const char *filename)
+{
+	char *dir = NULL, *tmp;
+	uint32_t idx;
+	int ret = 0;
+
+	tmp = strrchr(filename, '/');
+	if (tmp != NULL) {
+		idx = tmp - filename;
+		dir = rte_zmalloc("memif_tmp", sizeof(char) * (idx + 1), 0);
+		if (dir == NULL) {
+			MIF_LOG(ERR, "Failed to allocate memory.");
+			return -1;
+		}
+		strlcpy(dir, filename, sizeof(char) * (idx + 1));
+	}
+
+	if (dir == NULL || (faccessat(-1, dir, F_OK | R_OK |
+					W_OK, AT_EACCESS) < 0)) {
+		MIF_LOG(ERR, "Invalid socket directory.");
+		ret = -EINVAL;
+	}
+
+	if (dir != NULL)
+		rte_free(dir);
+
+	return ret;
+}
+
+static int
+memif_set_socket_filename(const char *key __rte_unused, const char *value,
+			  void *extra_args)
+{
+	const char **socket_filename = (const char **)extra_args;
+
+	*socket_filename = value;
+	return memif_check_socket_filename(*socket_filename);
+}
+
+static int
+memif_set_mac(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	struct rte_ether_addr *ether_addr = (struct rte_ether_addr *)extra_args;
+	int ret = 0;
+
+	ret = sscanf(value, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
+	       &ether_addr->addr_bytes[0], &ether_addr->addr_bytes[1],
+	       &ether_addr->addr_bytes[2], &ether_addr->addr_bytes[3],
+	       &ether_addr->addr_bytes[4], &ether_addr->addr_bytes[5]);
+	if (ret != 6)
+		MIF_LOG(WARNING, "Failed to parse mac '%s'.", value);
+	return 0;
+}
+
+static int
+memif_set_secret(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	const char **secret = (const char **)extra_args;
+
+	*secret = value;
+	return 0;
+}
+
+static int
+rte_pmd_memif_probe(struct rte_vdev_device *vdev)
+{
+	RTE_BUILD_BUG_ON(sizeof(memif_msg_t) != 128);
+	RTE_BUILD_BUG_ON(sizeof(memif_desc_t) != 16);
+	int ret = 0;
+	struct rte_kvargs *kvlist;
+	const char *name = rte_vdev_device_name(vdev);
+	enum memif_role_t role = MEMIF_ROLE_SLAVE;
+	memif_interface_id_t id = 0;
+	uint16_t pkt_buffer_size = ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE;
+	memif_log2_ring_size_t log2_ring_size = ETH_MEMIF_DEFAULT_RING_SIZE;
+	const char *socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME;
+	uint32_t flags = 0;
+	const char *secret = NULL;
+	struct rte_ether_addr *ether_addr = rte_zmalloc("", sizeof(struct rte_ether_addr), 0);
+
+	rte_eth_random_addr(ether_addr->addr_bytes);
+
+	MIF_LOG(INFO, "Initialize MEMIF: %s.", name);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		MIF_LOG(ERR, "Multi-processing not supported for memif.");
+		/* TODO:
+		 * Request connection information.
+		 *
+		 * Once memif in the primary process is connected,
+		 * broadcast connection information.
+		 */
+		return -1;
+	}
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_arguments);
+
+	/* parse parameters */
+	if (kvlist != NULL) {
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ROLE_ARG,
+					 &memif_set_role, &role);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ID_ARG,
+					 &memif_set_id, &id);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
+					 &memif_set_bs, &pkt_buffer_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_RING_SIZE_ARG,
+					 &memif_set_rs, &log2_ring_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SOCKET_ARG,
+					 &memif_set_socket_filename,
+					 (void *)(&socket_filename));
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_MAC_ARG,
+					 &memif_set_mac, ether_addr);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ZC_ARG,
+					 &memif_set_zc, &flags);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SECRET_ARG,
+					 &memif_set_secret, (void *)(&secret));
+		if (ret < 0)
+			goto exit;
+	}
+
+	/* create interface */
+	ret = memif_create(vdev, role, id, flags, socket_filename,
+			   log2_ring_size, pkt_buffer_size, secret, ether_addr);
+
+exit:
+	if (kvlist != NULL)
+		rte_kvargs_free(kvlist);
+	return ret;
+}
+
+static int
+rte_pmd_memif_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_eth_dev *eth_dev;
+	int i;
+
+	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+	if (eth_dev == NULL)
+		return 0;
+
+	for (i = 0; i < eth_dev->data->nb_rx_queues; i++)
+		(*eth_dev->dev_ops->rx_queue_release)(eth_dev->data->rx_queues[i]);
+	for (i = 0; i < eth_dev->data->nb_tx_queues; i++)
+		(*eth_dev->dev_ops->rx_queue_release)(eth_dev->data->tx_queues[i]);
+
+	rte_free(eth_dev->process_private);
+	eth_dev->process_private = NULL;
+
+	rte_eth_dev_close(eth_dev->data->port_id);
+
+	return 0;
+}
+
+static struct rte_vdev_driver pmd_memif_drv = {
+	.probe = rte_pmd_memif_probe,
+	.remove = rte_pmd_memif_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_memif, pmd_memif_drv);
+
+RTE_PMD_REGISTER_PARAM_STRING(net_memif,
+			      ETH_MEMIF_ID_ARG "=<int>"
+			      ETH_MEMIF_ROLE_ARG "=master|slave"
+			      ETH_MEMIF_PKT_BUFFER_SIZE_ARG "=<int>"
+			      ETH_MEMIF_RING_SIZE_ARG "=<int>"
+			      ETH_MEMIF_SOCKET_ARG "=<string>"
+			      ETH_MEMIF_MAC_ARG "=xx:xx:xx:xx:xx:xx"
+			      ETH_MEMIF_ZC_ARG "=yes|no"
+			      ETH_MEMIF_SECRET_ARG "=<string>");
+
+int memif_logtype;
+
+RTE_INIT(memif_init_log)
+{
+	memif_logtype = rte_log_register("pmd.net.memif");
+	if (memif_logtype >= 0)
+		rte_log_set_level(memif_logtype, RTE_LOG_NOTICE);
+}
diff --git a/drivers/net/memif/rte_eth_memif.h b/drivers/net/memif/rte_eth_memif.h
new file mode 100644
index 000000000..8cee2643c
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.h
@@ -0,0 +1,212 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _RTE_ETH_MEMIF_H_
+#define _RTE_ETH_MEMIF_H_
+
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE
+#endif				/* GNU_SOURCE */
+
+#include <sys/queue.h>
+
+#include <rte_ethdev_driver.h>
+#include <rte_ether.h>
+#include <rte_interrupts.h>
+
+#include "memif.h"
+
+#define ETH_MEMIF_DEFAULT_SOCKET_FILENAME	"/run/memif.sock"
+#define ETH_MEMIF_DEFAULT_RING_SIZE		10
+#define ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE	2048
+
+#define ETH_MEMIF_MAX_NUM_Q_PAIRS		255
+#define ETH_MEMIF_MAX_LOG2_RING_SIZE		14
+#define ETH_MEMIF_MAX_REGION_NUM		256
+
+#define ETH_MEMIF_SHM_NAME_SIZE			32
+#define ETH_MEMIF_DISC_STRING_SIZE		96
+#define ETH_MEMIF_SECRET_SIZE			24
+
+extern int memif_logtype;
+
+#define MIF_LOG(level, fmt, args...) \
+	rte_log(RTE_LOG_ ## level, memif_logtype, \
+		"%s(): " fmt "\n", __func__, ##args)
+
+enum memif_role_t {
+	MEMIF_ROLE_MASTER,
+	MEMIF_ROLE_SLAVE,
+};
+
+struct memif_region {
+	void *addr;				/**< shared memory address */
+	memif_region_size_t region_size;	/**< shared memory size */
+	int fd;					/**< shared memory file descriptor */
+	uint32_t pkt_buffer_offset;
+	/**< offset from 'addr' to first packet buffer */
+};
+
+struct memif_queue {
+	struct rte_mempool *mempool;		/**< mempool for RX packets */
+	struct pmd_internals *pmd;		/**< device internals */
+
+	memif_ring_type_t type;			/**< ring type */
+	memif_region_index_t region;		/**< shared memory region index */
+
+	uint16_t in_port;			/**< port id */
+
+	memif_region_offset_t ring_offset;
+	/**< ring offset from start of shm region (ring - memif_region.addr) */
+
+	uint16_t last_head;			/**< last ring head */
+	uint16_t last_tail;			/**< last ring tail */
+
+	/* rx/tx info */
+	uint64_t n_pkts;			/**< number of rx/tx packets */
+	uint64_t n_bytes;			/**< number of rx/tx bytes */
+	uint64_t n_err;				/**< number of tx errors */
+
+	memif_ring_t *ring;			/**< pointer to ring */
+
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+
+	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
+};
+
+struct pmd_internals {
+	memif_interface_id_t id;		/**< unique id */
+	enum memif_role_t role;			/**< device role */
+	uint32_t flags;				/**< device status flags */
+#define ETH_MEMIF_FLAG_CONNECTING	(1 << 0)
+/**< device is connecting */
+#define ETH_MEMIF_FLAG_CONNECTED	(1 << 1)
+/**< device is connected */
+#define ETH_MEMIF_FLAG_ZERO_COPY	(1 << 2)
+/**< device is zero-copy enabled */
+#define ETH_MEMIF_FLAG_DISABLED		(1 << 3)
+/**< device has not been configured and can not accept connection requests */
+
+	char *socket_filename;			/**< pointer to socket filename */
+	char secret[ETH_MEMIF_SECRET_SIZE]; /**< secret (optional security parameter) */
+
+	struct memif_control_channel *cc;	/**< control channel */
+
+	struct memif_region *regions[ETH_MEMIF_MAX_REGION_NUM];
+	/**< shared memory regions */
+	memif_region_index_t regions_num;	/**< number of regions */
+
+	/* remote info */
+	char remote_name[RTE_DEV_NAME_MAX_LEN];		/**< remote app name */
+	char remote_if_name[RTE_DEV_NAME_MAX_LEN];	/**< remote peer name */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t pkt_buffer_size;	/**< buffer size */
+	} cfg;					/**< Configured parameters (max values) */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t pkt_buffer_size;	/**< buffer size */
+	} run;
+	/**< Parameters used in active connection */
+
+	char local_disc_string[ETH_MEMIF_DISC_STRING_SIZE];
+	/**< local disconnect reason */
+	char remote_disc_string[ETH_MEMIF_DISC_STRING_SIZE];
+	/**< remote disconnect reason */
+
+	struct rte_vdev_device *vdev;		/**< vdev handle */
+};
+
+/**
+ * Unmap shared memory and free regions from memory.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_free_regions(struct pmd_internals *pmd);
+
+/**
+ * Finalize connection establishment process. Map shared memory file
+ * (master role), initialize ring queue, set link status up.
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect(struct rte_eth_dev *dev);
+
+/**
+ * Create shared memory file and initialize ring queue.
+ * Only called by slave when establishing connection
+ *
+ * @param pmd
+ *   device internals
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_init_regions_and_queues(struct rte_eth_dev *dev);
+
+/**
+ * Get memif version string.
+ *
+ * @return
+ *   - memif version string
+ */
+const char *memif_version(void);
+
+#ifndef MFD_HUGETLB
+#ifndef __NR_memfd_create
+
+#if defined __x86_64__
+#define __NR_memfd_create 319
+#elif defined __x86_32__
+#define __NR_memfd_create 1073742143
+#elif defined __arm__
+#define __NR_memfd_create 385
+#elif defined __aarch64__
+#define __NR_memfd_create 279
+#elif defined __powerpc__
+#define __NR_memfd_create 360
+#elif defined __i386__
+#define __NR_memfd_create 356
+#else
+#error "__NR_memfd_create unknown for this architecture"
+#endif
+
+#endif				/* __NR_memfd_create */
+
+static inline int memfd_create(const char *name, unsigned int flags)
+{
+	return syscall(__NR_memfd_create, name, flags);
+}
+#endif				/* MFD_HUGETLB */
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef MFD_ALLOW_SEALING
+#define MFD_ALLOW_SEALING       0x0002U
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL     0x0001	/* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002	/* prevent file from shrinking */
+#define F_SEAL_GROW     0x0004	/* prevent file from growing */
+#define F_SEAL_WRITE    0x0008	/* prevent writes */
+#endif
+
+#endif				/* RTE_ETH_MEMIF_H */
diff --git a/drivers/net/memif/rte_pmd_memif_version.map b/drivers/net/memif/rte_pmd_memif_version.map
new file mode 100644
index 000000000..8861484fb
--- /dev/null
+++ b/drivers/net/memif/rte_pmd_memif_version.map
@@ -0,0 +1,4 @@
+DPDK_19.08 {
+
+        local: *;
+};
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index ed99896c3..b57073483 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -24,6 +24,7 @@ drivers = ['af_packet',
 	'ixgbe',
 	'kni',
 	'liquidio',
+	'memif',
 	'mlx4',
 	'mlx5',
 	'mvneta',
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 7c9b4b538..326b2a30b 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -174,6 +174,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KNI)        += -lrte_pmd_kni
 endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD)        += -lrte_pmd_lio
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF)      += -lrte_pmd_memif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -lrte_pmd_mlx4
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += -lrte_pmd_mlx5 -lmnl
 ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
--
2.17.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [PATCH v10] net/memif: introduce memory interface (memif) PMD
  2019-05-31  6:22             ` [dpdk-dev] [PATCH v10] net/memif: introduce memory interface (memif) PMD Jakub Grajciar
@ 2019-05-31  7:43               ` Ye Xiaolong
  2019-06-03 11:28                 ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  2019-06-05 12:01                 ` Ferruh Yigit
  2019-06-03 13:37               ` Aaron Conole
                                 ` (2 subsequent siblings)
  3 siblings, 2 replies; 58+ messages in thread
From: Ye Xiaolong @ 2019-05-31  7:43 UTC (permalink / raw)
  To: Jakub Grajciar; +Cc: dev

Minor nit, this should be a V1 instead of V10 after your RFC.

On 05/31, Jakub Grajciar wrote:
>Memory interface (memif), provides high performance
>packet transfer over shared memory.

It'd be better to have more descrition of this new PMD in the commit log, something 
like you have in memif.rst

Thanks,
Xiaolong

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [PATCH v10] net/memif: introduce memory interface (memif) PMD
  2019-05-31  7:43               ` Ye Xiaolong
@ 2019-06-03 11:28                 ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  2019-06-03 14:25                   ` Ye Xiaolong
  2019-06-05 12:01                 ` Ferruh Yigit
  1 sibling, 1 reply; 58+ messages in thread
From: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco) @ 2019-06-03 11:28 UTC (permalink / raw)
  To: Ye Xiaolong; +Cc: dev



> -----Original Message-----
> From: Ye Xiaolong <xiaolong.ye@intel.com>
> Sent: Friday, May 31, 2019 9:43 AM
> To: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
> <jgrajcia@cisco.com>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v10] net/memif: introduce memory interface
> (memif) PMD
> 
> Minor nit, this should be a V1 instead of V10 after your RFC.

I'll keep that in mind, but what version number should I use in the next patch, since I did use V10 in this one?

Thanks,
Jakub

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [PATCH v10] net/memif: introduce memory interface (memif) PMD
  2019-05-31  6:22             ` [dpdk-dev] [PATCH v10] net/memif: introduce memory interface (memif) PMD Jakub Grajciar
  2019-05-31  7:43               ` Ye Xiaolong
@ 2019-06-03 13:37               ` Aaron Conole
  2019-06-05 11:55               ` Ferruh Yigit
  2019-06-06  8:24               ` [dpdk-dev] [PATCH v11] " Jakub Grajciar
  3 siblings, 0 replies; 58+ messages in thread
From: Aaron Conole @ 2019-06-03 13:37 UTC (permalink / raw)
  To: Jakub Grajciar; +Cc: dev

Jakub Grajciar <jgrajcia@cisco.com> writes:

> Memory interface (memif), provides high performance
> packet transfer over shared memory.
>
> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
> ---
>  MAINTAINERS                                 |    6 +
>  config/common_base                          |    5 +
>  config/common_linux                         |    1 +
>  doc/guides/nics/features/memif.ini          |   14 +
>  doc/guides/nics/index.rst                   |    1 +
>  doc/guides/nics/memif.rst                   |  234 ++++
>  doc/guides/rel_notes/release_19_08.rst      |    5 +
>  drivers/net/Makefile                        |    1 +
>  drivers/net/memif/Makefile                  |   31 +
>  drivers/net/memif/memif.h                   |  179 +++
>  drivers/net/memif/memif_socket.c            | 1124 +++++++++++++++++
>  drivers/net/memif/memif_socket.h            |  105 ++
>  drivers/net/memif/meson.build               |   15 +
>  drivers/net/memif/rte_eth_memif.c           | 1206 +++++++++++++++++++
>  drivers/net/memif/rte_eth_memif.h           |  212 ++++
>  drivers/net/memif/rte_pmd_memif_version.map |    4 +
>  drivers/net/meson.build                     |    1 +
>  mk/rte.app.mk                               |    1 +
>  18 files changed, 3145 insertions(+)
>  create mode 100644 doc/guides/nics/features/memif.ini
>  create mode 100644 doc/guides/nics/memif.rst
>  create mode 100644 drivers/net/memif/Makefile
>  create mode 100644 drivers/net/memif/memif.h
>  create mode 100644 drivers/net/memif/memif_socket.c
>  create mode 100644 drivers/net/memif/memif_socket.h
>  create mode 100644 drivers/net/memif/meson.build
>  create mode 100644 drivers/net/memif/rte_eth_memif.c
>  create mode 100644 drivers/net/memif/rte_eth_memif.h
>  create mode 100644 drivers/net/memif/rte_pmd_memif_version.map
>
> requires patch: http://patchwork.dpdk.org/patch/51511/
> Memif uses a unix domain socket as control channel (cc).
> rte_interrupts.h is used to handle interrupts for this cc.
> This patch is required, because the message received can
> be a reason for disconnecting.
>
> v3:
> - coding style fixes
> - documentation
> - doxygen comments
> - use strlcpy() instead of strncpy()
> - use RTE_BUILD_BUG_ON instead of _Static_assert()
> - fix meson build deps
>
> v4:
> - coding style fixes
> - doc update (desc format, messaging, features, ...)
> - pointer arithmetic fix
> - __rte_packed and __rte_aligned instead of __attribute__()
> - fixed multi-queue
> - support additional archs (memfd_create syscall)
>
> v5:
> - coding style fixes
> - add release notes
> - remove unused dependencies
> - doc update
>
> v6:
> - remove socket on termination of testpmd application
> - disallow memif probing in secondary process (return error), multi-process support will be implemented in separate patch
> - fix chained buffer packet length
> - use single region for rings and buffers (non-zero-copy)
> - doc update
>
> v7:
> - coding style fixes
>
> v8:
> - release notes: 19.05 -> 19.08
>
> v9:
> - rearrange elements in structures to remove holes
> - default socket file in /run
> - #define name sizes
> - implement dev_close op
>
> v10:
> - fix shared lib build
> - clear memory allocated by PMD in .dev_close
> - fix NULL pointer string 'error: ‘%s’ directive argument is null'
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 15d0829c5..3698c146b 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -841,6 +841,12 @@ F: drivers/net/softnic/
>  F: doc/guides/nics/features/softnic.ini
>  F: doc/guides/nics/softnic.rst
>
> +Memif PMD
> +M: Jakub Grajciar <jgrajcia@cisco.com>
> +F: drivers/net/memif/
> +F: doc/guides/nics/memif.rst
> +F: doc/guides/nics/features/memif.ini
> +
>
>  Crypto Drivers
>  --------------
> diff --git a/config/common_base b/config/common_base
> index 6f19ad5d2..e406e7836 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -444,6 +444,11 @@ CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
>  #
>  CONFIG_RTE_LIBRTE_PMD_AF_XDP=n
>
> +#
> +# Compile Memory Interface PMD driver (Linux only)
> +#
> +CONFIG_RTE_LIBRTE_PMD_MEMIF=n
> +
>  #
>  # Compile link bonding PMD library
>  #
> diff --git a/config/common_linux b/config/common_linux
> index 75334273d..87514fe4f 100644
> --- a/config/common_linux
> +++ b/config/common_linux
> @@ -19,6 +19,7 @@ CONFIG_RTE_LIBRTE_VHOST_POSTCOPY=n
>  CONFIG_RTE_LIBRTE_PMD_VHOST=y
>  CONFIG_RTE_LIBRTE_IFC_PMD=y
>  CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
> +CONFIG_RTE_LIBRTE_PMD_MEMIF=y
>  CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
>  CONFIG_RTE_LIBRTE_PMD_TAP=y
>  CONFIG_RTE_LIBRTE_AVP_PMD=y
> diff --git a/doc/guides/nics/features/memif.ini b/doc/guides/nics/features/memif.ini
> new file mode 100644
> index 000000000..807d9ecdc
> --- /dev/null
> +++ b/doc/guides/nics/features/memif.ini
> @@ -0,0 +1,14 @@
> +;
> +; Supported features of the 'memif' network poll mode driver.
> +;
> +; Refer to default.ini for the full list of available PMD features.
> +;
> +[Features]
> +Link status          = Y
> +Basic stats          = Y
> +Jumbo frame          = Y
> +ARMv8                = Y
> +Power8               = Y
> +x86-32               = Y
> +x86-64               = Y
> +Usage doc            = Y
> diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
> index 2221c35f2..691e7209f 100644
> --- a/doc/guides/nics/index.rst
> +++ b/doc/guides/nics/index.rst
> @@ -36,6 +36,7 @@ Network Interface Controller Drivers
>      intel_vf
>      kni
>      liquidio
> +    memif
>      mlx4
>      mlx5
>      mvneta
> diff --git a/doc/guides/nics/memif.rst b/doc/guides/nics/memif.rst
> new file mode 100644
> index 000000000..de2d481eb
> --- /dev/null
> +++ b/doc/guides/nics/memif.rst
> @@ -0,0 +1,234 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +    Copyright(c) 2018-2019 Cisco Systems, Inc.
> +
> +======================
> +Memif Poll Mode Driver
> +======================
> +
> +Shared memory packet interface (memif) PMD allows for DPDK and any other client
> +using memif (DPDK, VPP, libmemif) to communicate using shared memory. Memif is
> +Linux only.
> +
> +The created device transmits packets in a raw format. It can be used with
> +Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
> +supported in DPDK memif implementation.
> +
> +Memif works in two roles: master and slave. Slave connects to master over an
> +existing socket. It is also a producer of shared memory file and initializes
> +the shared memory. Each interface can be connected to one peer interface
> +at same time. The peer interface is identified by id parameter. Master
> +creates the socket and listens for any slave connection requests. The socket
> +may already exist on the system. Be sure to remove any such sockets, if you
> +are creating a master interface, or you will see an "Address already in use"
> +error. Function ``rte_pmd_memif_remove()``, which removes memif interface,
> +will also remove a listener socket, if it is not being used by any other
> +interface.
> +
> +The method to enable one or more interfaces is to use the
> +``--vdev=net_memif0`` option on the DPDK application command line. Each
> +``--vdev=net_memif1`` option given will create an interface named net_memif0,
> +net_memif1, and so on. Memif uses unix domain socket to transmit control
> +messages. Each memif has a unique id per socket. This id is used to identify
> +peer interface. If you are connecting multiple
> +interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``,
> +etc. Note that if you assign a socket to a master interface it becomes a
> +listener socket. Listener socket can not be used by a slave interface on same
> +client.
> +
> +.. csv-table:: **Memif configuration options**
> +   :header: "Option", "Description", "Default", "Valid value"
> +
> +   "id=0", "Used to identify peer interface", "0", "uint32_t"
> +   "role=master", "Set memif role", "slave", "master|slave"
> +   "bsize=1024", "Size of single packet buffer", "2048", "uint16_t"
> +   "rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", "10", "1-14"
> +   "socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 256"
> +   "mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef", ""
> +   "secret=abc123", "Secret is an optional security option, which if specified, must be matched by peer", "", "string len 24"
> +   "zero-copy=yes", "Enable/disable zero-copy slave mode", "no", "yes|no"
> +
> +**Connection establishment**
> +
> +In order to create memif connection, two memif interfaces, each in separate
> +process, are needed. One interface in ``master`` role and other in
> +``slave`` role. It is not possible to connect two interfaces in a single
> +process. Each interface can be connected to one interface at same time,
> +identified by matching id parameter.
> +
> +Memif driver uses unix domain socket to exchange required information between
> +memif interfaces. Socket file path is specified at interface creation see
> +*Memif configuration options* table above. If socket is used by ``master``
> +interface, it's marked as listener socket (in scope of current process) and
> +listens to connection requests from other processes. One socket can be used by
> +multiple interfaces. One process can have ``slave`` and ``master`` interfaces
> +at the same time, provided each role is assigned unique socket.
> +
> +For detailed information on memif control messages, see: net/memif/memif.h.
> +
> +Slave interface attempts to make a connection on assigned socket. Process
> +listening on this socket will extract the connection request and create a new
> +connected socket (control channel). Then it sends the 'hello' message
> +(``MEMIF_MSG_TYPE_HELLO``), containing configuration boundaries. Slave interface
> +adjusts its configuration accordingly, and sends 'init' message
> +(``MEMIF_MSG_TYPE_INIT``). This message among others contains interface id. Driver
> +uses this id to find master interface, and assigns the control channel to this
> +interface. If such interface is found, 'ack' message (``MEMIF_MSG_TYPE_ACK``) is
> +sent. Slave interface sends 'add region' message (``MEMIF_MSG_TYPE_ADD_REGION``) for
> +every region allocated. Master responds to each of these messages with 'ack'
> +message. Same behavior applies to rings. Slave sends 'add ring' message
> +(``MEMIF_MSG_TYPE_ADD_RING``) for every initialized ring. Master again responds to
> +each message with 'ack' message. To finalize the connection, slave interface
> +sends 'connect' message (``MEMIF_MSG_TYPE_CONNECT``). Upon receiving this message
> +master maps regions to its address space, initializes rings and responds with
> +'connected' message (``MEMIF_MSG_TYPE_CONNECTED``). Disconnect
> +(``MEMIF_MSG_TYPE_DISCONNECT``) can be sent by both master and slave interfaces at
> +any time, due to driver error or if the interface is being deleted.
> +
> +Files
> +
> +- net/memif/memif.h *- control messages definitions*
> +- net/memif/memif_socket.h
> +- net/memif/memif_socket.c
> +
> +Shared memory
> +~~~~~~~~~~~~~
> +
> +**Shared memory format**
> +
> +Slave is producer and master is consumer. Memory regions, are mapped shared memory files,
> +created by memif slave and provided to master at connection establishment.
> +Regions contain rings and buffers. Rings and buffers can also be separated into multiple
> +regions. For no-zero-copy, rings and buffers are stored inside single memory
> +region to reduce the number of opened files.
> +
> +region n (no-zero-copy):
> +
> ++-----------------------+-------------------------------------------------------------------------+
> +| Rings                 | Buffers                                                                 |
> ++-----------+-----------+-----------------+---+---------------------------------------------------+
> +| S2M rings | M2S rings | packet buffer 0 | . | pb ((1 << pmd->run.log2_ring_size)*(s2m + m2s))-1 |
> ++-----------+-----------+-----------------+---+---------------------------------------------------+
> +
> +S2M OR M2S Rings:
> +
> ++--------+--------+-----------------------+
> +| ring 0 | ring 1 | ring num_s2m_rings - 1|
> ++--------+--------+-----------------------+
> +
> +ring 0:
> +
> ++-------------+---------------------------------------+
> +| ring header | (1 << pmd->run.log2_ring_size) * desc |
> ++-------------+---------------------------------------+
> +
> +Descriptors are assigned packet buffers in order of rings creation. If we have one ring
> +in each direction and ring size is 1024, then first 1024 buffers will belong to S2M ring and
> +last 1024 will belong to M2S ring. In case of zero-copy, buffers are dequeued and
> +enqueued as needed.
> +
> +**Descriptor format**
> +
> ++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> +|Quad|6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | |1|1| | | | | | | | | | | | | | | |
> +|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> +|Word|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | |6|5| | | | | | | | | | | | | | |0|
> ++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> +|0   |length                                                         |region                         |flags                          |
> ++----+---------------------------------------------------------------+-------------------------------+-------------------------------+
> +|1   |metadata                                                       |offset                                                         |
> ++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> +|    |6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
> +|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> +|    |3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |0|
> ++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> +
> +**Flags field - flags (Quad Word 0, bits 0:15)**
> +
> ++-----+--------------------+------------------------------------------------------------------------------------------------+
> +|Bits |Name                |Functionality                                                                                   |
> ++=====+====================+================================================================================================+
> +|0    |MEMIF_DESC_FLAG_NEXT|Is chained buffer. When set, the packet is divided into multiple buffers. May not be contiguous.|
> ++-----+--------------------+------------------------------------------------------------------------------------------------+
> +
> +**Region index - region (Quad Word 0, 16:31)**
> +
> +Index of memory region, the buffer is located in.
> +
> +**Data length - length (Quad Word 0, 32:63)**
> +
> +Length of transmitted/received data.
> +
> +**Data Offset - offset (Quad Word 1, 0:31)**
> +
> +Data start offset from memory region address. *.regions[desc->region].addr + desc->offset*
> +
> +**Metadata - metadata (Quad Word 1, 32:63)**
> +
> +Buffer metadata.
> +
> +Files
> +
> +- net/memif/memif.h *- descriptor and ring definitions*
> +- net/memif/rte_eth_memif.c *- eth_memif_rx() eth_memif_tx()*
> +
> +Example: testpmd
> +----------------------------
> +In this example we run two instances of testpmd application and transmit packets over memif.
> +
> +First create ``master`` interface::
> +
> +    #./build/app/testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
> +
> +Now create ``slave`` interface (master must be already running so the slave will connect)::
> +
> +    #./build/app/testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
> +
> +Start forwarding packets::
> +
> +    Slave:
> +        testpmd> start
> +
> +    Master:
> +        testpmd> start tx_first
> +
> +Show status::
> +
> +    testpmd> show port stats 0
> +
> +For more details on testpmd please refer to :doc:`../testpmd_app_ug/index`.
> +
> +Example: testpmd and VPP
> +------------------------
> +For information on how to get and run VPP please see `<https://wiki.fd.io/view/VPP>`_.
> +
> +Start VPP in interactive mode (should be by default). Create memif master interface in VPP::
> +
> +    vpp# create interface memif id 0 master no-zero-copy
> +    vpp# set interface state memif0/0 up
> +    vpp# set interface ip address memif0/0 192.168.1.1/24
> +
> +To see socket filename use show memif command::
> +
> +    vpp# show memif
> +    sockets
> +     id  listener    filename
> +      0   yes (1)     /run/vpp/memif.sock
> +    ...
> +
> +Now create memif interface by running testpmd with these command line options::
> +
> +    #./testpmd --vdev=net_memif,socket=/run/vpp/memif.sock -- -i
> +
> +Testpmd should now create memif slave interface and try to connect to master.
> +In testpmd set forward option to icmpecho and start forwarding::
> +
> +    testpmd> set fwd icmpecho
> +    testpmd> start
> +
> +Send ping from VPP::
> +
> +    vpp# ping 192.168.1.2
> +    64 bytes from 192.168.1.2: icmp_seq=2 ttl=254 time=36.2918 ms
> +    64 bytes from 192.168.1.2: icmp_seq=3 ttl=254 time=23.3927 ms
> +    64 bytes from 192.168.1.2: icmp_seq=4 ttl=254 time=24.2975 ms
> +    64 bytes from 192.168.1.2: icmp_seq=5 ttl=254 time=17.7049 ms
> diff --git a/doc/guides/rel_notes/release_19_08.rst b/doc/guides/rel_notes/release_19_08.rst
> index a17e7dea5..46efd13f0 100644
> --- a/doc/guides/rel_notes/release_19_08.rst
> +++ b/doc/guides/rel_notes/release_19_08.rst
> @@ -54,6 +54,11 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =========================================================
>
> +* **Added memif PMD.**
> +
> +  Added the new Shared Memory Packet Interface (``memif``) PMD.
> +  See the :doc:`../nics/memif` guide for more details on this new driver.
> +
>
>  Removed Items
>  -------------
> diff --git a/drivers/net/Makefile b/drivers/net/Makefile
> index 3a72cf38c..78cb10fc6 100644
> --- a/drivers/net/Makefile
> +++ b/drivers/net/Makefile
> @@ -35,6 +35,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_ICE_PMD) += ice
>  DIRS-$(CONFIG_RTE_LIBRTE_IPN3KE_PMD) += ipn3ke
>  DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
>  DIRS-$(CONFIG_RTE_LIBRTE_LIO_PMD) += liquidio
> +DIRS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif
>  DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
>  DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
>  DIRS-$(CONFIG_RTE_LIBRTE_MVNETA_PMD) += mvneta
> diff --git a/drivers/net/memif/Makefile b/drivers/net/memif/Makefile
> new file mode 100644
> index 000000000..c3119625c
> --- /dev/null
> +++ b/drivers/net/memif/Makefile
> @@ -0,0 +1,31 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +#
> +# library name
> +#
> +LIB = librte_pmd_memif.a
> +
> +EXPORT_MAP := rte_pmd_memif_version.map
> +
> +LIBABIVER := 1
> +
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +CFLAGS += -DALLOW_EXPERIMENTAL_API
> +# Experimantal APIs:
> +# - rte_intr_callback_unregister_pending
> +LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool
> +LDLIBS += -lrte_ethdev -lrte_kvargs
> +LDLIBS += -lrte_hash
> +LDLIBS += -lrte_bus_vdev
> +
> +#
> +# all source are stored in SRCS-y
> +#
> +SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += rte_eth_memif.c
> +SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif_socket.c
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/drivers/net/memif/memif.h b/drivers/net/memif/memif.h
> new file mode 100644
> index 000000000..3948b1f50
> --- /dev/null
> +++ b/drivers/net/memif/memif.h
> @@ -0,0 +1,179 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
> + */
> +
> +#ifndef _MEMIF_H_
> +#define _MEMIF_H_
> +
> +#define MEMIF_COOKIE		0x3E31F20
> +#define MEMIF_VERSION_MAJOR	2
> +#define MEMIF_VERSION_MINOR	0
> +#define MEMIF_VERSION		((MEMIF_VERSION_MAJOR << 8) | MEMIF_VERSION_MINOR)
> +#define MEMIF_NAME_SZ		32
> +
> +/*
> + * S2M: direction slave -> master
> + * M2S: direction master -> slave
> + */
> +
> +/*
> + *  Type definitions
> + */
> +
> +typedef enum memif_msg_type {
> +	MEMIF_MSG_TYPE_NONE,
> +	MEMIF_MSG_TYPE_ACK,
> +	MEMIF_MSG_TYPE_HELLO,
> +	MEMIF_MSG_TYPE_INIT,
> +	MEMIF_MSG_TYPE_ADD_REGION,
> +	MEMIF_MSG_TYPE_ADD_RING,
> +	MEMIF_MSG_TYPE_CONNECT,
> +	MEMIF_MSG_TYPE_CONNECTED,
> +	MEMIF_MSG_TYPE_DISCONNECT,
> +} memif_msg_type_t;
> +
> +typedef enum {
> +	MEMIF_RING_S2M, /**< buffer ring in direction slave -> master */
> +	MEMIF_RING_M2S, /**< buffer ring in direction master -> slave */
> +} memif_ring_type_t;
> +
> +typedef enum {
> +	MEMIF_INTERFACE_MODE_ETHERNET,
> +	MEMIF_INTERFACE_MODE_IP,
> +	MEMIF_INTERFACE_MODE_PUNT_INJECT,
> +} memif_interface_mode_t;
> +
> +typedef uint16_t memif_region_index_t;
> +typedef uint32_t memif_region_offset_t;
> +typedef uint64_t memif_region_size_t;
> +typedef uint16_t memif_ring_index_t;
> +typedef uint32_t memif_interface_id_t;
> +typedef uint16_t memif_version_t;
> +typedef uint8_t memif_log2_ring_size_t;
> +
> +/*
> + *  Socket messages
> + */
> +
> + /**
> +  * M2S
> +  * Contains master interfaces configuration.
> +  */
> +typedef struct __rte_packed {
> +	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
> +	memif_version_t min_version; /**< lowest supported memif version */
> +	memif_version_t max_version; /**< highest supported memif version */
> +	memif_region_index_t max_region; /**< maximum num of regions */
> +	memif_ring_index_t max_m2s_ring; /**< maximum num of M2S ring */
> +	memif_ring_index_t max_s2m_ring; /**< maximum num of S2M rings */
> +	memif_log2_ring_size_t max_log2_ring_size; /**< maximum ring size (as log2) */
> +} memif_msg_hello_t;
> +
> +/**
> + * S2M
> + * Contains information required to identify interface
> + * to which the slave wants to connect.
> + */
> +typedef struct __rte_packed {
> +	memif_version_t version;		/**< memif version */
> +	memif_interface_id_t id;		/**< interface id */
> +	memif_interface_mode_t mode:8;		/**< interface mode */
> +	uint8_t secret[24];			/**< optional security parameter */
> +	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
> +} memif_msg_init_t;
> +
> +/**
> + * S2M
> + * Request master to add new shared memory region to master interface.
> + * Shared files file descriptor is passed in cmsghdr.
> + */
> +typedef struct __rte_packed {
> +	memif_region_index_t index;		/**< shm regions index */
> +	memif_region_size_t size;		/**< shm region size */
> +} memif_msg_add_region_t;
> +
> +/**
> + * S2M
> + * Request master to add new ring to master interface.
> + */
> +typedef struct __rte_packed {
> +	uint16_t flags;				/**< flags */
> +#define MEMIF_MSG_ADD_RING_FLAG_S2M 1		/**< ring is in S2M direction */
> +	memif_ring_index_t index;		/**< ring index */
> +	memif_region_index_t region; /**< region index on which this ring is located */
> +	memif_region_offset_t offset;		/**< buffer start offset */
> +	memif_log2_ring_size_t log2_ring_size;	/**< ring size (log2) */
> +	uint16_t private_hdr_size;		/**< used for private metadata */
> +} memif_msg_add_ring_t;
> +
> +/**
> + * S2M
> + * Finalize connection establishment.
> + */
> +typedef struct __rte_packed {
> +	uint8_t if_name[MEMIF_NAME_SZ];		/**< slave interface name */
> +} memif_msg_connect_t;
> +
> +/**
> + * M2S
> + * Finalize connection establishment.
> + */
> +typedef struct __rte_packed {
> +	uint8_t if_name[MEMIF_NAME_SZ];		/**< master interface name */
> +} memif_msg_connected_t;
> +
> +/**
> + * S2M & M2S
> + * Disconnect interfaces.
> + */
> +typedef struct __rte_packed {
> +	uint32_t code;				/**< error code */
> +	uint8_t string[96];			/**< disconnect reason */
> +} memif_msg_disconnect_t;
> +
> +typedef struct __rte_packed __rte_aligned(128)
> +{
> +	memif_msg_type_t type:16;
> +	union {
> +		memif_msg_hello_t hello;
> +		memif_msg_init_t init;
> +		memif_msg_add_region_t add_region;
> +		memif_msg_add_ring_t add_ring;
> +		memif_msg_connect_t connect;
> +		memif_msg_connected_t connected;
> +		memif_msg_disconnect_t disconnect;
> +	};
> +} memif_msg_t;
> +
> +/*
> + *  Ring and Descriptor Layout
> + */
> +
> +/**
> + * Buffer descriptor.
> + */
> +typedef struct __rte_packed {
> +	uint16_t flags;				/**< flags */
> +#define MEMIF_DESC_FLAG_NEXT 1			/**< is chained buffer */
> +	memif_region_index_t region; /**< region index on which the buffer is located */
> +	uint32_t length;			/**< buffer length */
> +	memif_region_offset_t offset;		/**< buffer offset */
> +	uint32_t metadata;
> +} memif_desc_t;
> +
> +#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
> +	uint8_t mark[0] __rte_aligned(RTE_CACHE_LINE_SIZE)
> +
> +typedef struct {
> +	MEMIF_CACHELINE_ALIGN_MARK(cacheline0);
> +	uint32_t cookie;			/**< MEMIF_COOKIE */
> +	uint16_t flags;				/**< flags */
> +#define MEMIF_RING_FLAG_MASK_INT 1		/**< disable interrupt mode */
> +	volatile uint16_t head;			/**< pointer to ring buffer head */
> +	MEMIF_CACHELINE_ALIGN_MARK(cacheline1);
> +	volatile uint16_t tail;			/**< pointer to ring buffer tail */
> +	MEMIF_CACHELINE_ALIGN_MARK(cacheline2);
> +	memif_desc_t desc[0];			/**< buffer descriptors */
> +} memif_ring_t;
> +
> +#endif				/* _MEMIF_H_ */
> diff --git a/drivers/net/memif/memif_socket.c b/drivers/net/memif/memif_socket.c
> new file mode 100644
> index 000000000..98e8ceefd
> --- /dev/null
> +++ b/drivers/net/memif/memif_socket.c
> @@ -0,0 +1,1124 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
> + */
> +
> +#include <stdlib.h>
> +#include <fcntl.h>
> +#include <unistd.h>
> +#include <sys/types.h>
> +#include <sys/socket.h>
> +#include <sys/un.h>
> +#include <sys/ioctl.h>
> +#include <errno.h>
> +
> +#include <rte_version.h>
> +#include <rte_mbuf.h>
> +#include <rte_ether.h>
> +#include <rte_ethdev_driver.h>
> +#include <rte_ethdev_vdev.h>
> +#include <rte_malloc.h>
> +#include <rte_kvargs.h>
> +#include <rte_bus_vdev.h>
> +#include <rte_hash.h>
> +#include <rte_jhash.h>
> +#include <rte_string_fns.h>
> +
> +#include "rte_eth_memif.h"
> +#include "memif_socket.h"
> +
> +static void memif_intr_handler(void *arg);
> +
> +static ssize_t
> +memif_msg_send(int fd, memif_msg_t *msg, int afd)
> +{
> +	struct msghdr mh = { 0 };
> +	struct iovec iov[1];
> +	struct cmsghdr *cmsg;
> +	char ctl[CMSG_SPACE(sizeof(int))];
> +
> +	iov[0].iov_base = msg;
> +	iov[0].iov_len = sizeof(memif_msg_t);
> +	mh.msg_iov = iov;
> +	mh.msg_iovlen = 1;
> +
> +	if (afd > 0) {
> +		memset(&ctl, 0, sizeof(ctl));
> +		mh.msg_control = ctl;
> +		mh.msg_controllen = sizeof(ctl);
> +		cmsg = CMSG_FIRSTHDR(&mh);
> +		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
> +		cmsg->cmsg_level = SOL_SOCKET;
> +		cmsg->cmsg_type = SCM_RIGHTS;
> +		rte_memcpy(CMSG_DATA(cmsg), &afd, sizeof(int));
> +	}
> +
> +	return sendmsg(fd, &mh, 0);
> +}
> +
> +static int
> +memif_msg_send_from_queue(struct memif_control_channel *cc)
> +{
> +	ssize_t size;
> +	int ret = 0;
> +	struct memif_msg_queue_elt *e;
> +
> +	e = TAILQ_FIRST(&cc->msg_queue);
> +	if (e == NULL)
> +		return 0;
> +
> +	size = memif_msg_send(cc->intr_handle.fd, &e->msg, e->fd);
> +	if (size != sizeof(memif_msg_t)) {
> +		MIF_LOG(ERR, "sendmsg fail: %s.", strerror(errno));
> +		ret = -1;
> +	} else {
> +		MIF_LOG(DEBUG, "Sent msg type %u.", e->msg.type);
> +	}
> +	TAILQ_REMOVE(&cc->msg_queue, e, next);
> +	rte_free(e);
> +
> +	return ret;
> +}
> +
> +static struct memif_msg_queue_elt *
> +memif_msg_enq(struct memif_control_channel *cc)
> +{
> +	struct memif_msg_queue_elt *e;
> +
> +	e = rte_zmalloc("memif_msg", sizeof(struct memif_msg_queue_elt), 0);
> +	if (e == NULL) {
> +		MIF_LOG(ERR, "Failed to allocate control message.");
> +		return NULL;
> +	}
> +
> +	e->fd = -1;
> +	TAILQ_INSERT_TAIL(&cc->msg_queue, e, next);
> +
> +	return e;
> +}
> +
> +void
> +memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
> +			 int err_code)
> +{
> +	struct memif_msg_queue_elt *e;
> +	struct pmd_internals *pmd;
> +	memif_msg_disconnect_t *d;
> +
> +	if (cc == NULL) {
> +		MIF_LOG(DEBUG, "Missing control channel.");
> +		return;
> +	}
> +
> +	e = memif_msg_enq(cc);
> +	if (e == NULL) {
> +		MIF_LOG(WARNING, "Failed to enqueue disconnect message.");
> +		return;
> +	}
> +
> +	d = &e->msg.disconnect;
> +
> +	e->msg.type = MEMIF_MSG_TYPE_DISCONNECT;
> +	d->code = err_code;
> +
> +	if (reason != NULL) {
> +		strlcpy((char *)d->string, reason, sizeof(d->string));
> +		if (cc->dev != NULL) {
> +			pmd = cc->dev->data->dev_private;
> +			strlcpy(pmd->local_disc_string, reason,
> +				sizeof(pmd->local_disc_string));
> +		}
> +	}
> +}
> +
> +static int
> +memif_msg_enq_hello(struct memif_control_channel *cc)
> +{
> +	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
> +	memif_msg_hello_t *h;
> +
> +	if (e == NULL)
> +		return -1;
> +
> +	h = &e->msg.hello;
> +
> +	e->msg.type = MEMIF_MSG_TYPE_HELLO;
> +	h->min_version = MEMIF_VERSION;
> +	h->max_version = MEMIF_VERSION;
> +	h->max_s2m_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
> +	h->max_m2s_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
> +	h->max_region = ETH_MEMIF_MAX_REGION_NUM - 1;
> +	h->max_log2_ring_size = ETH_MEMIF_MAX_LOG2_RING_SIZE;
> +
> +	strlcpy((char *)h->name, rte_version(), sizeof(h->name));
> +
> +	return 0;
> +}
> +
> +static int
> +memif_msg_receive_hello(struct rte_eth_dev *dev, memif_msg_t *msg)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	memif_msg_hello_t *h = &msg->hello;
> +
> +	if (h->min_version > MEMIF_VERSION || h->max_version < MEMIF_VERSION) {
> +		memif_msg_enq_disconnect(pmd->cc, "Incompatible memif version", 0);
> +		return -1;
> +	}
> +
> +	/* Set parameters for active connection */
> +	pmd->run.num_s2m_rings = RTE_MIN(h->max_s2m_ring + 1,
> +					   pmd->cfg.num_s2m_rings);
> +	pmd->run.num_m2s_rings = RTE_MIN(h->max_m2s_ring + 1,
> +					   pmd->cfg.num_m2s_rings);
> +	pmd->run.log2_ring_size = RTE_MIN(h->max_log2_ring_size,
> +					    pmd->cfg.log2_ring_size);
> +	pmd->run.pkt_buffer_size = pmd->cfg.pkt_buffer_size;
> +
> +	strlcpy(pmd->remote_name, (char *)h->name, sizeof(pmd->remote_name));
> +
> +	MIF_LOG(DEBUG, "%s: Connecting to %s.",
> +		rte_vdev_device_name(pmd->vdev), pmd->remote_name);
> +
> +	return 0;
> +}
> +
> +static int
> +memif_msg_receive_init(struct memif_control_channel *cc, memif_msg_t *msg)
> +{
> +	memif_msg_init_t *i = &msg->init;
> +	struct memif_socket_dev_list_elt *elt;
> +	struct pmd_internals *pmd;
> +	struct rte_eth_dev *dev;
> +
> +	if (i->version != MEMIF_VERSION) {
> +		memif_msg_enq_disconnect(cc, "Incompatible memif version", 0);
> +		return -1;
> +	}
> +
> +	if (cc->socket == NULL) {
> +		memif_msg_enq_disconnect(cc, "Device error", 0);
> +		return -1;
> +	}
> +
> +	/* Find device with requested ID */
> +	TAILQ_FOREACH(elt, &cc->socket->dev_queue, next) {
> +		dev = elt->dev;
> +		pmd = dev->data->dev_private;
> +		if (((pmd->flags & ETH_MEMIF_FLAG_DISABLED) == 0) &&
> +		    pmd->id == i->id) {
> +			/* assign control channel to device */
> +			cc->dev = dev;
> +			pmd->cc = cc;
> +
> +			if (i->mode != MEMIF_INTERFACE_MODE_ETHERNET) {
> +				memif_msg_enq_disconnect(pmd->cc,
> +							 "Only ethernet mode supported",
> +							 0);
> +				return -1;
> +			}
> +
> +			if (pmd->flags & (ETH_MEMIF_FLAG_CONNECTING |
> +					   ETH_MEMIF_FLAG_CONNECTED)) {
> +				memif_msg_enq_disconnect(pmd->cc,
> +							 "Already connected", 0);
> +				return -1;
> +			}
> +			strlcpy(pmd->remote_name, (char *)i->name,
> +				sizeof(pmd->remote_name));
> +
> +			if (*pmd->secret != '\0') {
> +				if (*i->secret == '\0') {
> +					memif_msg_enq_disconnect(pmd->cc,
> +								 "Secret required", 0);
> +					return -1;
> +				}
> +				if (strncmp(pmd->secret, (char *)i->secret,
> +						ETH_MEMIF_SECRET_SIZE) != 0) {
> +					memif_msg_enq_disconnect(pmd->cc,
> +								 "Incorrect secret", 0);
> +					return -1;
> +				}
> +			}
> +
> +			pmd->flags |= ETH_MEMIF_FLAG_CONNECTING;
> +			return 0;
> +		}
> +	}
> +
> +	/* ID not found on this socket */
> +	MIF_LOG(DEBUG, "ID %u not found.", i->id);
> +	memif_msg_enq_disconnect(cc, "ID not found", 0);
> +	return -1;
> +}
> +
> +static int
> +memif_msg_receive_add_region(struct rte_eth_dev *dev, memif_msg_t *msg,
> +			     int fd)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	memif_msg_add_region_t *ar = &msg->add_region;
> +	struct memif_region *r;
> +
> +	if (fd < 0) {
> +		memif_msg_enq_disconnect(pmd->cc, "Missing region fd", 0);
> +		return -1;
> +	}
> +
> +	if (ar->index >= ETH_MEMIF_MAX_REGION_NUM || ar->index != pmd->regions_num ||
> +			pmd->regions[ar->index] != NULL) {
> +		memif_msg_enq_disconnect(pmd->cc, "Invalid region index", 0);
> +		return -1;
> +	}
> +
> +	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
> +	if (r == NULL) {
> +		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
> +			rte_vdev_device_name(pmd->vdev));
> +		return -ENOMEM;
> +	}
> +
> +	r->fd = fd;
> +	r->region_size = ar->size;
> +	r->addr = NULL;
> +
> +	pmd->regions[ar->index] = r;
> +	pmd->regions_num++;
> +
> +	return 0;
> +}
> +
> +static int
> +memif_msg_receive_add_ring(struct rte_eth_dev *dev, memif_msg_t *msg, int fd)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	memif_msg_add_ring_t *ar = &msg->add_ring;
> +	struct memif_queue *mq;
> +
> +	if (fd < 0) {
> +		memif_msg_enq_disconnect(pmd->cc, "Missing interrupt fd", 0);
> +		return -1;
> +	}
> +
> +	/* check if we have enough queues */
> +	if (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) {
> +		if (ar->index >= pmd->cfg.num_s2m_rings) {
> +			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
> +			return -1;
> +		}
> +		pmd->run.num_s2m_rings++;
> +	} else {
> +		if (ar->index >= pmd->cfg.num_m2s_rings) {
> +			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
> +			return -1;
> +		}
> +		pmd->run.num_m2s_rings++;
> +	}
> +
> +	mq = (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) ?
> +	    dev->data->rx_queues[ar->index] : dev->data->tx_queues[ar->index];
> +
> +	mq->intr_handle.fd = fd;
> +	mq->log2_ring_size = ar->log2_ring_size;
> +	mq->region = ar->region;
> +	mq->ring_offset = ar->offset;
> +
> +	return 0;
> +}
> +
> +static int
> +memif_msg_receive_connect(struct rte_eth_dev *dev, memif_msg_t *msg)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	memif_msg_connect_t *c = &msg->connect;
> +	int ret;
> +
> +	ret = memif_connect(dev);
> +	if (ret < 0)
> +		return ret;
> +
> +	strlcpy(pmd->remote_if_name, (char *)c->if_name,
> +		sizeof(pmd->remote_if_name));
> +	MIF_LOG(INFO, "%s: Remote interface %s connected.",
> +		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
> +
> +	return 0;
> +}
> +
> +static int
> +memif_msg_receive_connected(struct rte_eth_dev *dev, memif_msg_t *msg)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	memif_msg_connected_t *c = &msg->connected;
> +	int ret;
> +
> +	ret = memif_connect(dev);
> +	if (ret < 0)
> +		return ret;
> +
> +	strlcpy(pmd->remote_if_name, (char *)c->if_name,
> +		sizeof(pmd->remote_if_name));
> +	MIF_LOG(INFO, "%s: Remote interface %s connected.",
> +		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
> +
> +	return 0;
> +}
> +
> +static int
> +memif_msg_receive_disconnect(struct rte_eth_dev *dev, memif_msg_t *msg)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	memif_msg_disconnect_t *d = &msg->disconnect;
> +
> +	memset(pmd->remote_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
> +	strlcpy(pmd->remote_disc_string, (char *)d->string,
> +		sizeof(pmd->remote_disc_string));
> +
> +	MIF_LOG(INFO, "%s: Disconnect received: %s",
> +		rte_vdev_device_name(pmd->vdev), pmd->remote_disc_string);
> +
> +	memset(pmd->local_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
> +	memif_disconnect(rte_eth_dev_allocated
> +			 (rte_vdev_device_name(pmd->vdev)));
> +	return 0;
> +}
> +
> +static int
> +memif_msg_enq_ack(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
> +	if (e == NULL)
> +		return -1;
> +
> +	e->msg.type = MEMIF_MSG_TYPE_ACK;
> +
> +	return 0;
> +}
> +
> +static int
> +memif_msg_enq_init(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
> +	memif_msg_init_t *i = &e->msg.init;
> +
> +	if (e == NULL)
> +		return -1;
> +
> +	i = &e->msg.init;
> +	e->msg.type = MEMIF_MSG_TYPE_INIT;
> +	i->version = MEMIF_VERSION;
> +	i->id = pmd->id;
> +	i->mode = MEMIF_INTERFACE_MODE_ETHERNET;
> +
> +	strlcpy((char *)i->name, rte_version(), sizeof(i->name));
> +
> +	if (*pmd->secret != '\0')
> +		strlcpy((char *)i->secret, pmd->secret, sizeof(i->secret));
> +
> +	return 0;
> +}
> +
> +static int
> +memif_msg_enq_add_region(struct rte_eth_dev *dev, uint8_t idx)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
> +	memif_msg_add_region_t *ar;
> +	struct memif_region *mr = pmd->regions[idx];
> +
> +	if (e == NULL)
> +		return -1;
> +
> +	ar = &e->msg.add_region;
> +	e->msg.type = MEMIF_MSG_TYPE_ADD_REGION;
> +	e->fd = mr->fd;
> +	ar->index = idx;
> +	ar->size = mr->region_size;
> +
> +	return 0;
> +}
> +
> +static int
> +memif_msg_enq_add_ring(struct rte_eth_dev *dev, uint8_t idx,
> +		       memif_ring_type_t type)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
> +	struct memif_queue *mq;
> +	memif_msg_add_ring_t *ar;
> +
> +	if (e == NULL)
> +		return -1;
> +
> +	ar = &e->msg.add_ring;
> +	mq = (type == MEMIF_RING_S2M) ? dev->data->tx_queues[idx] :
> +	    dev->data->rx_queues[idx];
> +
> +	e->msg.type = MEMIF_MSG_TYPE_ADD_RING;
> +	e->fd = mq->intr_handle.fd;
> +	ar->index = idx;
> +	ar->offset = mq->ring_offset;
> +	ar->region = mq->region;
> +	ar->log2_ring_size = mq->log2_ring_size;
> +	ar->flags = (type == MEMIF_RING_S2M) ? MEMIF_MSG_ADD_RING_FLAG_S2M : 0;
> +	ar->private_hdr_size = 0;
> +
> +	return 0;
> +}
> +
> +static int
> +memif_msg_enq_connect(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
> +	const char *name = rte_vdev_device_name(pmd->vdev);
> +	memif_msg_connect_t *c;
> +
> +	if (e == NULL)
> +		return -1;
> +
> +	c = &e->msg.connect;
> +	e->msg.type = MEMIF_MSG_TYPE_CONNECT;
> +	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
> +
> +	return 0;
> +}
> +
> +static int
> +memif_msg_enq_connected(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
> +	const char *name = rte_vdev_device_name(pmd->vdev);
> +	memif_msg_connected_t *c;
> +
> +	if (e == NULL)
> +		return -1;
> +
> +	c = &e->msg.connected;
> +	e->msg.type = MEMIF_MSG_TYPE_CONNECTED;
> +	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
> +
> +	return 0;
> +}
> +
> +static void
> +memif_intr_unregister_handler(struct rte_intr_handle *intr_handle, void *arg)
> +{
> +	struct memif_msg_queue_elt *elt;
> +	struct memif_control_channel *cc = arg;
> +
> +	/* close control channel fd */
> +	close(intr_handle->fd);
> +	/* clear message queue */
> +	while ((elt = TAILQ_FIRST(&cc->msg_queue)) != NULL) {
> +		TAILQ_REMOVE(&cc->msg_queue, elt, next);
> +		rte_free(elt);
> +	}
> +	/* free control channel */
> +	rte_free(cc);
> +}
> +
> +void
> +memif_disconnect(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_msg_queue_elt *elt, *next;
> +	struct memif_queue *mq;
> +	struct rte_intr_handle *ih;
> +	int i;
> +	int ret;
> +
> +	if (pmd->cc != NULL) {
> +		/* Clear control message queue (except disconnect message if any). */
> +		for (elt = TAILQ_FIRST(&pmd->cc->msg_queue); elt != NULL; elt = next) {
> +			next = TAILQ_NEXT(elt, next);
> +			if (elt->msg.type != MEMIF_MSG_TYPE_DISCONNECT) {
> +				TAILQ_REMOVE(&pmd->cc->msg_queue, elt, next);
> +				rte_free(elt);
> +			}
> +		}
> +		/* send disconnect message (if there is any in queue) */
> +		memif_msg_send_from_queue(pmd->cc);
> +
> +		/* at this point, there should be no more messages in queue */
> +		if (TAILQ_FIRST(&pmd->cc->msg_queue) != NULL) {
> +			MIF_LOG(WARNING,
> +				"%s: Unexpected message(s) in message queue.",
> +				rte_vdev_device_name(pmd->vdev));
> +		}
> +
> +		ih = &pmd->cc->intr_handle;
> +		if (ih->fd > 0) {
> +			ret = rte_intr_callback_unregister(ih,
> +							memif_intr_handler,
> +							pmd->cc);
> +			/*
> +			 * If callback is active (disconnecting based on
> +			 * received control message).
> +			 */
> +			if (ret == -EAGAIN) {
> +				ret = rte_intr_callback_unregister_pending(ih,
> +							memif_intr_handler,
> +							pmd->cc,
> +							memif_intr_unregister_handler);
> +			} else if (ret > 0) {
> +				close(ih->fd);
> +				rte_free(pmd->cc);
> +			}
> +			pmd->cc = NULL;
> +			if (ret <= 0)
> +				MIF_LOG(WARNING, "%s: Failed to unregister "
> +					"control channel callback.",
> +					rte_vdev_device_name(pmd->vdev));
> +		}
> +	}
> +
> +	/* unconfig interrupts */
> +	for (i = 0; i < pmd->cfg.num_s2m_rings; i++) {
> +		if (pmd->role == MEMIF_ROLE_SLAVE) {
> +			if (dev->data->tx_queues != NULL)
> +				mq = dev->data->tx_queues[i];
> +			else
> +				continue;
> +		} else {
> +			if (dev->data->rx_queues != NULL)
> +				mq = dev->data->rx_queues[i];
> +			else
> +				continue;
> +		}
> +		if (mq->intr_handle.fd > 0) {
> +			close(mq->intr_handle.fd);
> +			mq->intr_handle.fd = -1;
> +		}
> +		mq->ring = NULL;
> +	}
> +	for (i = 0; i < pmd->cfg.num_m2s_rings; i++) {
> +		if (pmd->role == MEMIF_ROLE_MASTER) {
> +			if (dev->data->tx_queues != NULL)
> +				mq = dev->data->tx_queues[i];
> +			else
> +				continue;
> +		} else {
> +			if (dev->data->rx_queues != NULL)
> +				mq = dev->data->rx_queues[i];
> +			else
> +				continue;
> +		}
> +		if (mq->intr_handle.fd > 0) {
> +			close(mq->intr_handle.fd);
> +			mq->intr_handle.fd = -1;
> +		}
> +		mq->ring = NULL;
> +	}
> +
> +	memif_free_regions(pmd);
> +
> +	/* reset connection configuration */
> +	memset(&pmd->run, 0, sizeof(pmd->run));
> +
> +	dev->data->dev_link.link_status = ETH_LINK_DOWN;
> +	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
> +	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTED;
> +	MIF_LOG(DEBUG, "%s: Disconnected.", rte_vdev_device_name(pmd->vdev));
> +}
> +
> +static int
> +memif_msg_receive(struct memif_control_channel *cc)
> +{
> +	char ctl[CMSG_SPACE(sizeof(int)) +
> +		 CMSG_SPACE(sizeof(struct ucred))] = { 0 };
> +	struct msghdr mh = { 0 };
> +	struct iovec iov[1];
> +	memif_msg_t msg = { 0 };
> +	ssize_t size;
> +	int ret = 0;
> +	struct ucred *cr __rte_unused;
> +	cr = 0;
> +	struct cmsghdr *cmsg;
> +	int afd = -1;
> +	int i;
> +	struct pmd_internals *pmd;
> +
> +	iov[0].iov_base = (void *)&msg;
> +	iov[0].iov_len = sizeof(memif_msg_t);
> +	mh.msg_iov = iov;
> +	mh.msg_iovlen = 1;
> +	mh.msg_control = ctl;
> +	mh.msg_controllen = sizeof(ctl);
> +
> +	size = recvmsg(cc->intr_handle.fd, &mh, 0);
> +	if (size != sizeof(memif_msg_t)) {
> +		MIF_LOG(DEBUG, "Invalid message size.");
> +		memif_msg_enq_disconnect(cc, "Invalid message size", 0);
> +		return -1;
> +	}
> +	MIF_LOG(DEBUG, "Received msg type: %u.", msg.type);
> +
> +	cmsg = CMSG_FIRSTHDR(&mh);
> +	while (cmsg) {
> +		if (cmsg->cmsg_level == SOL_SOCKET) {
> +			if (cmsg->cmsg_type == SCM_CREDENTIALS)
> +				cr = (struct ucred *)CMSG_DATA(cmsg);
> +			else if (cmsg->cmsg_type == SCM_RIGHTS)
> +				afd = *(int *)CMSG_DATA(cmsg);

I see that GCC from Xenial:

  gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609

Complains as follows:

../drivers/net/memif/memif_socket.c: In function ‘memif_msg_receive’:
../drivers/net/memif/memif_socket.c:665:33: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing]
     afd = *(int *)CMSG_DATA(cmsg);
                                 ^

Here's the travis build:

https://travis-ci.com/ovsrobot/dpdk/builds/113842433

> +		}
> +		cmsg = CMSG_NXTHDR(&mh, cmsg);
> +	}
> +
> +	if (cc->dev == NULL && msg.type != MEMIF_MSG_TYPE_INIT) {
> +		MIF_LOG(DEBUG, "Unexpected message.");
> +		memif_msg_enq_disconnect(cc, "Unexpected message", 0);
> +		return -1;
> +	}
> +
> +	/* get device from hash data */
> +	switch (msg.type) {
> +	case MEMIF_MSG_TYPE_ACK:
> +		break;
> +	case MEMIF_MSG_TYPE_HELLO:
> +		ret = memif_msg_receive_hello(cc->dev, &msg);
> +		if (ret < 0)
> +			goto exit;
> +		ret = memif_init_regions_and_queues(cc->dev);
> +		if (ret < 0)
> +			goto exit;
> +		ret = memif_msg_enq_init(cc->dev);
> +		if (ret < 0)
> +			goto exit;
> +		pmd = cc->dev->data->dev_private;
> +		for (i = 0; i < pmd->regions_num; i++) {
> +			ret = memif_msg_enq_add_region(cc->dev, i);
> +			if (ret < 0)
> +				goto exit;
> +		}
> +		for (i = 0; i < pmd->run.num_s2m_rings; i++) {
> +			ret = memif_msg_enq_add_ring(cc->dev, i,
> +						     MEMIF_RING_S2M);
> +			if (ret < 0)
> +				goto exit;
> +		}
> +		for (i = 0; i < pmd->run.num_m2s_rings; i++) {
> +			ret = memif_msg_enq_add_ring(cc->dev, i,
> +						     MEMIF_RING_M2S);
> +			if (ret < 0)
> +				goto exit;
> +		}
> +		ret = memif_msg_enq_connect(cc->dev);
> +		if (ret < 0)
> +			goto exit;
> +		break;
> +	case MEMIF_MSG_TYPE_INIT:
> +		/*
> +		 * This cc does not have an interface asociated with it.
> +		 * If suitable interface is found it will be assigned here.
> +		 */
> +		ret = memif_msg_receive_init(cc, &msg);
> +		if (ret < 0)
> +			goto exit;
> +		ret = memif_msg_enq_ack(cc->dev);
> +		if (ret < 0)
> +			goto exit;
> +		break;
> +	case MEMIF_MSG_TYPE_ADD_REGION:
> +		ret = memif_msg_receive_add_region(cc->dev, &msg, afd);
> +		if (ret < 0)
> +			goto exit;
> +		ret = memif_msg_enq_ack(cc->dev);
> +		if (ret < 0)
> +			goto exit;
> +		break;
> +	case MEMIF_MSG_TYPE_ADD_RING:
> +		ret = memif_msg_receive_add_ring(cc->dev, &msg, afd);
> +		if (ret < 0)
> +			goto exit;
> +		ret = memif_msg_enq_ack(cc->dev);
> +		if (ret < 0)
> +			goto exit;
> +		break;
> +	case MEMIF_MSG_TYPE_CONNECT:
> +		ret = memif_msg_receive_connect(cc->dev, &msg);
> +		if (ret < 0)
> +			goto exit;
> +		ret = memif_msg_enq_connected(cc->dev);
> +		if (ret < 0)
> +			goto exit;
> +		break;
> +	case MEMIF_MSG_TYPE_CONNECTED:
> +		ret = memif_msg_receive_connected(cc->dev, &msg);
> +		break;
> +	case MEMIF_MSG_TYPE_DISCONNECT:
> +		ret = memif_msg_receive_disconnect(cc->dev, &msg);
> +		if (ret < 0)
> +			goto exit;
> +		break;
> +	default:
> +		memif_msg_enq_disconnect(cc, "Unknown message type", 0);
> +		ret = -1;
> +		goto exit;
> +	}
> +
> + exit:
> +	return ret;
> +}
> +
> +static void
> +memif_intr_handler(void *arg)
> +{
> +	struct memif_control_channel *cc = arg;
> +	int ret;
> +
> +	ret = memif_msg_receive(cc);
> +	/* if driver failed to assign device */
> +	if (cc->dev == NULL) {
> +		ret = rte_intr_callback_unregister_pending(&cc->intr_handle,
> +							   memif_intr_handler,
> +							   cc,
> +							   memif_intr_unregister_handler);
> +		if (ret < 0)
> +			MIF_LOG(WARNING,
> +				"Failed to unregister control channel callback.");
> +		return;
> +	}
> +	/* if memif_msg_receive failed */
> +	if (ret < 0)
> +		goto disconnect;
> +
> +	ret = memif_msg_send_from_queue(cc);
> +	if (ret < 0)
> +		goto disconnect;
> +
> +	return;
> +
> + disconnect:
> +	if (cc->dev == NULL) {
> +		MIF_LOG(WARNING, "eth dev not allocated");
> +		return;
> +	}
> +	memif_disconnect(cc->dev);
> +}
> +
> +static void
> +memif_listener_handler(void *arg)
> +{
> +	struct memif_socket *socket = arg;
> +	int sockfd;
> +	int addr_len;
> +	struct sockaddr_un client;
> +	struct memif_control_channel *cc;
> +	int ret;
> +
> +	addr_len = sizeof(client);
> +	sockfd = accept(socket->intr_handle.fd, (struct sockaddr *)&client,
> +			(socklen_t *)&addr_len);
> +	if (sockfd < 0) {
> +		MIF_LOG(ERR,
> +			"Failed to accept connection request on socket fd %d",
> +			socket->intr_handle.fd);
> +		return;
> +	}
> +
> +	MIF_LOG(DEBUG, "%s: Connection request accepted.", socket->filename);
> +
> +	cc = rte_zmalloc("memif-cc", sizeof(struct memif_control_channel), 0);
> +	if (cc == NULL) {
> +		MIF_LOG(ERR, "Failed to allocate control channel.");
> +		goto error;
> +	}
> +
> +	cc->intr_handle.fd = sockfd;
> +	cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
> +	cc->socket = socket;
> +	cc->dev = NULL;
> +	TAILQ_INIT(&cc->msg_queue);
> +
> +	ret = rte_intr_callback_register(&cc->intr_handle, memif_intr_handler, cc);
> +	if (ret < 0) {
> +		MIF_LOG(ERR, "Failed to register control channel callback.");
> +		goto error;
> +	}
> +
> +	ret = memif_msg_enq_hello(cc);
> +	if (ret < 0) {
> +		MIF_LOG(ERR, "Failed to enqueue hello message.");
> +		goto error;
> +	}
> +	ret = memif_msg_send_from_queue(cc);
> +	if (ret < 0)
> +		goto error;
> +
> +	return;
> +
> + error:
> +	if (sockfd > 0) {
> +		close(sockfd);
> +		sockfd = -1;
> +	}
> +	if (cc != NULL)
> +		rte_free(cc);
> +}
> +
> +static struct memif_socket *
> +memif_socket_create(struct pmd_internals *pmd, char *key, uint8_t listener)
> +{
> +	struct memif_socket *sock;
> +	struct sockaddr_un un;
> +	int sockfd;
> +	int ret;
> +	int on = 1;
> +
> +	sock = rte_zmalloc("memif-socket", sizeof(struct memif_socket), 0);
> +	if (sock == NULL) {
> +		MIF_LOG(ERR, "Failed to allocate memory for memif socket");
> +		return NULL;
> +	}
> +
> +	sock->listener = listener;
> +	rte_memcpy(sock->filename, key, 256);
> +	TAILQ_INIT(&sock->dev_queue);
> +
> +	if (listener != 0) {
> +		sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
> +		if (sockfd < 0)
> +			goto error;
> +
> +		un.sun_family = AF_UNIX;
> +		memcpy(un.sun_path, sock->filename,
> +			sizeof(un.sun_path) - 1);
> +
> +		ret = setsockopt(sockfd, SOL_SOCKET, SO_PASSCRED, &on,
> +				 sizeof(on));
> +		if (ret < 0)
> +			goto error;
> +		ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
> +		if (ret < 0)
> +			goto error;
> +		ret = listen(sockfd, 1);
> +		if (ret < 0)
> +			goto error;
> +
> +		MIF_LOG(DEBUG, "%s: Memif listener socket %s created.",
> +			rte_vdev_device_name(pmd->vdev), sock->filename);
> +
> +		sock->intr_handle.fd = sockfd;
> +		sock->intr_handle.type = RTE_INTR_HANDLE_EXT;
> +		ret = rte_intr_callback_register(&sock->intr_handle,
> +						 memif_listener_handler, sock);
> +		if (ret < 0) {
> +			MIF_LOG(ERR, "%s: Failed to register interrupt "
> +				"callback for listener socket",
> +				rte_vdev_device_name(pmd->vdev));
> +			return NULL;
> +		}
> +	}
> +
> +	return sock;
> +
> + error:
> +	MIF_LOG(ERR, "%s: Failed to setup socket %s: %s",
> +		rte_vdev_device_name(pmd->vdev), key, strerror(errno));
> +	if (sock != NULL)
> +		rte_free(sock);
> +	return NULL;
> +}
> +
> +static struct rte_hash *
> +memif_create_socket_hash(void)
> +{
> +	struct rte_hash_parameters params = { 0 };
> +	params.name = MEMIF_SOCKET_HASH_NAME;
> +	params.entries = 256;
> +	params.key_len = 256;
> +	params.hash_func = rte_jhash;
> +	params.hash_func_init_val = 0;
> +	return rte_hash_create(&params);
> +}
> +
> +int
> +memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_socket *socket = NULL;
> +	struct memif_socket_dev_list_elt *elt;
> +	struct pmd_internals *tmp_pmd;
> +	struct rte_hash *hash;
> +	int ret;
> +	char key[256];
> +
> +	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
> +	if (hash == NULL) {
> +		hash = memif_create_socket_hash();
> +		if (hash == NULL) {
> +			MIF_LOG(ERR, "Failed to create memif socket hash.");
> +			return -1;
> +		}
> +	}
> +
> +	memset(key, 0, 256);
> +	rte_memcpy(key, socket_filename, strlen(socket_filename));
> +	ret = rte_hash_lookup_data(hash, key, (void **)&socket);
> +	if (ret < 0) {
> +		socket = memif_socket_create(pmd, key,
> +					     (pmd->role ==
> +					      MEMIF_ROLE_SLAVE) ? 0 : 1);
> +		if (socket == NULL)
> +			return -1;
> +		ret = rte_hash_add_key_data(hash, key, socket);
> +		if (ret < 0) {
> +			MIF_LOG(ERR, "Failed to add socket to socket hash.");
> +			return ret;
> +		}
> +	}
> +	pmd->socket_filename = socket->filename;
> +
> +	if (socket->listener != 0 && pmd->role == MEMIF_ROLE_SLAVE) {
> +		MIF_LOG(ERR, "Socket is a listener.");
> +		return -1;
> +	} else if ((socket->listener == 0) && (pmd->role == MEMIF_ROLE_MASTER)) {
> +		MIF_LOG(ERR, "Socket is not a listener.");
> +		return -1;
> +	}
> +
> +	TAILQ_FOREACH(elt, &socket->dev_queue, next) {
> +		tmp_pmd = elt->dev->data->dev_private;
> +		if (tmp_pmd->id == pmd->id) {
> +			MIF_LOG(ERR, "Memif device with id %d already "
> +				"exists on socket %s",
> +				pmd->id, socket->filename);
> +			return -1;
> +		}
> +	}
> +
> +	elt = rte_malloc("pmd-queue", sizeof(struct memif_socket_dev_list_elt), 0);
> +	if (elt == NULL) {
> +		MIF_LOG(ERR, "%s: Failed to add device to socket device list.",
> +			rte_vdev_device_name(pmd->vdev));
> +		return -1;
> +	}
> +	elt->dev = dev;
> +	TAILQ_INSERT_TAIL(&socket->dev_queue, elt, next);
> +
> +	return 0;
> +}
> +
> +void
> +memif_socket_remove_device(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_socket *socket = NULL;
> +	struct memif_socket_dev_list_elt *elt, *next;
> +	struct rte_hash *hash;
> +
> +	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
> +	if (hash == NULL)
> +		return;
> +
> +	if (pmd->socket_filename == NULL)
> +		return;
> +
> +	if (rte_hash_lookup_data(hash, pmd->socket_filename, (void **)&socket) < 0)
> +		return;
> +
> +	for (elt = TAILQ_FIRST(&socket->dev_queue); elt != NULL; elt = next) {
> +		next = TAILQ_NEXT(elt, next);
> +		if (elt->dev == dev) {
> +			TAILQ_REMOVE(&socket->dev_queue, elt, next);
> +			rte_free(elt);
> +			pmd->socket_filename = NULL;
> +		}
> +	}
> +
> +	/* remove socket, if this was the last device using it */
> +	if (TAILQ_EMPTY(&socket->dev_queue)) {
> +		rte_hash_del_key(hash, socket->filename);
> +		if (socket->listener) {
> +			/* remove listener socket file,
> +			 * so we can create new one later.
> +			 */
> +			remove(socket->filename);
> +		}
> +		rte_free(socket);
> +	}
> +}
> +
> +int
> +memif_connect_master(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +
> +	memset(pmd->local_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
> +	memset(pmd->remote_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
> +	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
> +	return 0;
> +}
> +
> +int
> +memif_connect_slave(struct rte_eth_dev *dev)
> +{
> +	int sockfd;
> +	int ret;
> +	struct sockaddr_un sun;
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +
> +	memset(pmd->local_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
> +	memset(pmd->remote_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
> +	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
> +
> +	sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
> +	if (sockfd < 0) {
> +		MIF_LOG(ERR, "%s: Failed to open socket.",
> +			rte_vdev_device_name(pmd->vdev));
> +		return -1;
> +	}
> +
> +	sun.sun_family = AF_UNIX;
> +
> +	memcpy(sun.sun_path, pmd->socket_filename, sizeof(sun.sun_path) - 1);
> +
> +	ret = connect(sockfd, (struct sockaddr *)&sun,
> +		      sizeof(struct sockaddr_un));
> +	if (ret < 0) {
> +		MIF_LOG(ERR, "%s: Failed to connect socket: %s.",
> +			rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
> +		goto error;
> +	}
> +
> +	MIF_LOG(DEBUG, "%s: Memif socket: %s connected.",
> +		rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
> +
> +	pmd->cc = rte_zmalloc("memif-cc",
> +			      sizeof(struct memif_control_channel), 0);
> +	if (pmd->cc == NULL) {
> +		MIF_LOG(ERR, "%s: Failed to allocate control channel.",
> +			rte_vdev_device_name(pmd->vdev));
> +		goto error;
> +	}
> +
> +	pmd->cc->intr_handle.fd = sockfd;
> +	pmd->cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
> +	pmd->cc->socket = NULL;
> +	pmd->cc->dev = dev;
> +	TAILQ_INIT(&pmd->cc->msg_queue);
> +
> +	ret = rte_intr_callback_register(&pmd->cc->intr_handle,
> +					 memif_intr_handler, pmd->cc);
> +	if (ret < 0) {
> +		MIF_LOG(ERR, "%s: Failed to register interrupt callback "
> +			"for control fd", rte_vdev_device_name(pmd->vdev));
> +		goto error;
> +	}
> +
> +	return 0;
> +
> + error:
> +	if (sockfd > 0) {
> +		close(sockfd);
> +		sockfd = -1;
> +	}
> +	if (pmd->cc != NULL) {
> +		rte_free(pmd->cc);
> +		pmd->cc = NULL;
> +	}
> +	return -1;
> +}
> diff --git a/drivers/net/memif/memif_socket.h b/drivers/net/memif/memif_socket.h
> new file mode 100644
> index 000000000..c60614721
> --- /dev/null
> +++ b/drivers/net/memif/memif_socket.h
> @@ -0,0 +1,105 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
> + */
> +
> +#ifndef _MEMIF_SOCKET_H_
> +#define _MEMIF_SOCKET_H_
> +
> +#include <sys/queue.h>
> +
> +/**
> + * Remove device from socket device list. If no device is left on the socket,
> + * remove the socket as well.
> + *
> + * @param pmd
> + *   device internals
> + */
> +void memif_socket_remove_device(struct rte_eth_dev *dev);
> +
> +/**
> + * Enqueue disconnect message to control channel message queue.
> + *
> + * @param cc
> + *   control channel
> + * @param reason
> + *   const string stating disconnect reason (96 characters)
> + * @param err_code
> + *   error code
> + */
> +void memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
> +			      int err_code);
> +
> +/**
> + * Initialize memif socket for specified device. If socket doesn't exist, create socket.
> + *
> + * @param dev
> + *   memif ethernet device
> + * @param socket_filename
> + *   socket filename
> + * @return
> + *   - On success, zero.
> + *   - On failure, a negative value.
> + */
> +int memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename);
> +
> +/**
> + * Disconnect memif device. Close control channel and shared memory.
> + *
> + * @param dev
> + *   ethernet device
> + */
> +void memif_disconnect(struct rte_eth_dev *dev);
> +
> +/**
> + * If device is properly configured, enable connection establishment.
> + *
> + * @param dev
> + *   memif ethernet device
> + * @return
> + *   - On success, zero.
> + *   - On failure, a negative value.
> + */
> +int memif_connect_master(struct rte_eth_dev *dev);
> +
> +/**
> + * If device is properly configured, send connection request.
> + *
> + * @param dev
> + *   memif ethernet device
> + * @return
> + *   - On success, zero.
> + *   - On failure, a negative value.
> + */
> +int memif_connect_slave(struct rte_eth_dev *dev);
> +
> +struct memif_socket_dev_list_elt {
> +	TAILQ_ENTRY(memif_socket_dev_list_elt) next;
> +	struct rte_eth_dev *dev;		/**< pointer to device internals */
> +	char dev_name[RTE_ETH_NAME_MAX_LEN];
> +};
> +
> +#define MEMIF_SOCKET_HASH_NAME			"memif-sh"
> +struct memif_socket {
> +	struct rte_intr_handle intr_handle;	/**< interrupt handle */
> +	char filename[256];			/**< socket filename */
> +
> +	TAILQ_HEAD(, memif_socket_dev_list_elt) dev_queue;
> +	/**< Queue of devices using this socket */
> +	uint8_t listener;			/**< if not zero socket is listener */
> +};
> +
> +/* Control message queue. */
> +struct memif_msg_queue_elt {
> +	memif_msg_t msg;			/**< control message */
> +	TAILQ_ENTRY(memif_msg_queue_elt) next;
> +	int fd;					/**< fd to be sent to peer */
> +};
> +
> +struct memif_control_channel {
> +	struct rte_intr_handle intr_handle;	/**< interrupt handle */
> +	TAILQ_HEAD(, memif_msg_queue_elt) msg_queue; /**< control message queue */
> +	struct memif_socket *socket;		/**< pointer to socket */
> +	struct rte_eth_dev *dev;		/**< pointer to device */
> +};
> +
> +#endif				/* MEMIF_SOCKET_H */
> diff --git a/drivers/net/memif/meson.build b/drivers/net/memif/meson.build
> new file mode 100644
> index 000000000..287a30ef2
> --- /dev/null
> +++ b/drivers/net/memif/meson.build
> @@ -0,0 +1,15 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
> +
> +if host_machine.system() != 'linux'
> +        build = false
> +endif
> +
> +sources = files('rte_eth_memif.c',
> +		'memif_socket.c')
> +
> +allow_experimental_apis = true
> +# Experimantal APIs:
> +# - rte_intr_callback_unregister_pending
> +
> +deps += ['hash']
> diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
> new file mode 100644
> index 000000000..3b9c1ce66
> --- /dev/null
> +++ b/drivers/net/memif/rte_eth_memif.c
> @@ -0,0 +1,1206 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
> + */
> +
> +#include <stdlib.h>
> +#include <fcntl.h>
> +#include <unistd.h>
> +#include <sys/types.h>
> +#include <sys/socket.h>
> +#include <sys/un.h>
> +#include <sys/ioctl.h>
> +#include <sys/mman.h>
> +#include <linux/if_ether.h>
> +#include <errno.h>
> +#include <sys/eventfd.h>
> +
> +#include <rte_version.h>
> +#include <rte_mbuf.h>
> +#include <rte_ether.h>
> +#include <rte_ethdev_driver.h>
> +#include <rte_ethdev_vdev.h>
> +#include <rte_malloc.h>
> +#include <rte_kvargs.h>
> +#include <rte_bus_vdev.h>
> +#include <rte_string_fns.h>
> +
> +#include "rte_eth_memif.h"
> +#include "memif_socket.h"
> +
> +#define ETH_MEMIF_ID_ARG		"id"
> +#define ETH_MEMIF_ROLE_ARG		"role"
> +#define ETH_MEMIF_PKT_BUFFER_SIZE_ARG	"bsize"
> +#define ETH_MEMIF_RING_SIZE_ARG		"rsize"
> +#define ETH_MEMIF_SOCKET_ARG		"socket"
> +#define ETH_MEMIF_MAC_ARG		"mac"
> +#define ETH_MEMIF_ZC_ARG		"zero-copy"
> +#define ETH_MEMIF_SECRET_ARG		"secret"
> +
> +static const char *valid_arguments[] = {
> +	ETH_MEMIF_ID_ARG,
> +	ETH_MEMIF_ROLE_ARG,
> +	ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
> +	ETH_MEMIF_RING_SIZE_ARG,
> +	ETH_MEMIF_SOCKET_ARG,
> +	ETH_MEMIF_MAC_ARG,
> +	ETH_MEMIF_ZC_ARG,
> +	ETH_MEMIF_SECRET_ARG,
> +	NULL
> +};
> +
> +const char *
> +memif_version(void)
> +{
> +	return ("memif-" RTE_STR(MEMIF_VERSION_MAJOR) "." RTE_STR(MEMIF_VERSION_MINOR));
> +}
> +
> +static void
> +memif_dev_info(struct rte_eth_dev *dev __rte_unused, struct rte_eth_dev_info *dev_info)
> +{
> +	dev_info->max_mac_addrs = 1;
> +	dev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;
> +	dev_info->max_rx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
> +	dev_info->max_tx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
> +	dev_info->min_rx_bufsize = 0;
> +}
> +
> +static memif_ring_t *
> +memif_get_ring(struct pmd_internals *pmd, memif_ring_type_t type, uint16_t ring_num)
> +{
> +	/* rings only in region 0 */
> +	void *p = pmd->regions[0]->addr;
> +	int ring_size = sizeof(memif_ring_t) + sizeof(memif_desc_t) *
> +	    (1 << pmd->run.log2_ring_size);
> +
> +	p = (uint8_t *)p + (ring_num + type * pmd->run.num_s2m_rings) * ring_size;
> +
> +	return (memif_ring_t *)p;
> +}
> +
> +static void *
> +memif_get_buffer(struct pmd_internals *pmd, memif_desc_t *d)
> +{
> +	return ((uint8_t *)pmd->regions[d->region]->addr + d->offset);
> +}
> +
> +static int
> +memif_pktmbuf_chain(struct rte_mbuf *head, struct rte_mbuf *cur_tail,
> +		    struct rte_mbuf *tail)
> +{
> +	/* Check for number-of-segments-overflow */
> +	if (unlikely(head->nb_segs + tail->nb_segs > RTE_MBUF_MAX_NB_SEGS))
> +		return -EOVERFLOW;
> +
> +	/* Chain 'tail' onto the old tail */
> +	cur_tail->next = tail;
> +
> +	/* accumulate number of segments and total length. */
> +	head->nb_segs = (uint16_t)(head->nb_segs + tail->nb_segs);
> +
> +	tail->pkt_len = tail->data_len;
> +	head->pkt_len += tail->pkt_len;
> +
> +	return 0;
> +}
> +
> +static uint16_t
> +eth_memif_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
> +{
> +	struct memif_queue *mq = queue;
> +	struct pmd_internals *pmd = mq->pmd;
> +	memif_ring_t *ring = mq->ring;
> +	uint16_t cur_slot, last_slot, n_slots, ring_size, mask, s0;
> +	uint16_t n_rx_pkts = 0;
> +	uint16_t mbuf_size = rte_pktmbuf_data_room_size(mq->mempool) -
> +		RTE_PKTMBUF_HEADROOM;
> +	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
> +	memif_ring_type_t type = mq->type;
> +	memif_desc_t *d0;
> +	struct rte_mbuf *mbuf, *mbuf_head, *mbuf_tail;
> +	uint64_t b;
> +	ssize_t size __rte_unused;
> +	uint16_t head;
> +	int ret;
> +
> +	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
> +		return 0;
> +	if (unlikely(ring == NULL))
> +		return 0;
> +
> +	/* consume interrupt */
> +	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0)
> +		size = read(mq->intr_handle.fd, &b, sizeof(b));
> +
> +	ring_size = 1 << mq->log2_ring_size;
> +	mask = ring_size - 1;
> +
> +	cur_slot = (type == MEMIF_RING_S2M) ? mq->last_head : mq->last_tail;
> +	last_slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
> +	if (cur_slot == last_slot)
> +		goto refill;
> +	n_slots = last_slot - cur_slot;
> +
> +	while (n_slots && n_rx_pkts < nb_pkts) {
> +		mbuf_head = rte_pktmbuf_alloc(mq->mempool);
> +		if (unlikely(mbuf_head == NULL))
> +			goto no_free_bufs;
> +		mbuf = mbuf_head;
> +		mbuf->port = mq->in_port;
> +
> +next_slot:
> +		s0 = cur_slot & mask;
> +		d0 = &ring->desc[s0];
> +
> +		src_len = d0->length;
> +		dst_off = 0;
> +		src_off = 0;
> +
> +		do {
> +			dst_len = mbuf_size - dst_off;
> +			if (dst_len == 0) {
> +				dst_off = 0;
> +				dst_len = mbuf_size;
> +
> +				/* store pointer to tail */
> +				mbuf_tail = mbuf;
> +				mbuf = rte_pktmbuf_alloc(mq->mempool);
> +				if (unlikely(mbuf == NULL))
> +					goto no_free_bufs;
> +				mbuf->port = mq->in_port;
> +				ret = memif_pktmbuf_chain(mbuf_head, mbuf_tail, mbuf);
> +				if (unlikely(ret < 0)) {
> +					MIF_LOG(ERR, "%s: number-of-segments-overflow",
> +						rte_vdev_device_name(pmd->vdev));
> +					rte_pktmbuf_free(mbuf);
> +					goto no_free_bufs;
> +				}
> +			}
> +			cp_len = RTE_MIN(dst_len, src_len);
> +
> +			rte_pktmbuf_data_len(mbuf) += cp_len;
> +			rte_pktmbuf_pkt_len(mbuf) = rte_pktmbuf_data_len(mbuf);
> +			if (mbuf != mbuf_head)
> +				rte_pktmbuf_pkt_len(mbuf_head) += cp_len;
> +
> +			memcpy(rte_pktmbuf_mtod_offset(mbuf, void *, dst_off),
> +			       (uint8_t *)memif_get_buffer(pmd, d0) + src_off, cp_len);
> +
> +			src_off += cp_len;
> +			dst_off += cp_len;
> +			src_len -= cp_len;
> +		} while (src_len);
> +
> +		cur_slot++;
> +		n_slots--;
> +
> +		if (d0->flags & MEMIF_DESC_FLAG_NEXT)
> +			goto next_slot;
> +
> +		mq->n_bytes += rte_pktmbuf_pkt_len(mbuf_head);
> +		*bufs++ = mbuf_head;
> +		n_rx_pkts++;
> +	}
> +
> +no_free_bufs:
> +	if (type == MEMIF_RING_S2M) {
> +		rte_mb();
> +		ring->tail = cur_slot;
> +		mq->last_head = cur_slot;
> +	} else {
> +		mq->last_tail = cur_slot;
> +	}
> +
> +refill:
> +	if (type == MEMIF_RING_M2S) {
> +		head = ring->head;
> +		n_slots = ring_size - head + mq->last_tail;
> +
> +		while (n_slots--) {
> +			s0 = head++ & mask;
> +			d0 = &ring->desc[s0];
> +			d0->length = pmd->run.pkt_buffer_size;
> +		}
> +		rte_mb();
> +		ring->head = head;
> +	}
> +
> +	mq->n_pkts += n_rx_pkts;
> +	return n_rx_pkts;
> +}
> +
> +static uint16_t
> +eth_memif_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
> +{
> +	struct memif_queue *mq = queue;
> +	struct pmd_internals *pmd = mq->pmd;
> +	memif_ring_t *ring = mq->ring;
> +	uint16_t slot, saved_slot, n_free, ring_size, mask, n_tx_pkts = 0;
> +	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
> +	memif_ring_type_t type = mq->type;
> +	memif_desc_t *d0;
> +	struct rte_mbuf *mbuf;
> +	struct rte_mbuf *mbuf_head;
> +	uint64_t a;
> +	ssize_t size;
> +
> +	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
> +		return 0;
> +	if (unlikely(ring == NULL))
> +		return 0;
> +
> +	ring_size = 1 << mq->log2_ring_size;
> +	mask = ring_size - 1;
> +
> +	n_free = ring->tail - mq->last_tail;
> +	mq->last_tail += n_free;
> +	slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
> +
> +	if (type == MEMIF_RING_S2M)
> +		n_free = ring_size - ring->head + mq->last_tail;
> +	else
> +		n_free = ring->head - ring->tail;
> +
> +	while (n_tx_pkts < nb_pkts && n_free) {
> +		mbuf_head = *bufs++;
> +		mbuf = mbuf_head;
> +
> +		saved_slot = slot;
> +		d0 = &ring->desc[slot & mask];
> +		dst_off = 0;
> +		dst_len = (type == MEMIF_RING_S2M) ?
> +			pmd->run.pkt_buffer_size : d0->length;
> +
> +next_in_chain:
> +		src_off = 0;
> +		src_len = rte_pktmbuf_data_len(mbuf);
> +
> +		while (src_len) {
> +			if (dst_len == 0) {
> +				if (n_free) {
> +					slot++;
> +					n_free--;
> +					d0->flags |= MEMIF_DESC_FLAG_NEXT;
> +					d0 = &ring->desc[slot & mask];
> +					dst_off = 0;
> +					dst_len = (type == MEMIF_RING_S2M) ?
> +					    pmd->run.pkt_buffer_size : d0->length;
> +					d0->flags = 0;
> +				} else {
> +					slot = saved_slot;
> +					goto no_free_slots;
> +				}
> +			}
> +			cp_len = RTE_MIN(dst_len, src_len);
> +
> +			memcpy((uint8_t *)memif_get_buffer(pmd, d0) + dst_off,
> +			       rte_pktmbuf_mtod_offset(mbuf, void *, src_off),
> +			       cp_len);
> +
> +			mq->n_bytes += cp_len;
> +			src_off += cp_len;
> +			dst_off += cp_len;
> +			src_len -= cp_len;
> +			dst_len -= cp_len;
> +
> +			d0->length = dst_off;
> +		}
> +
> +		if (rte_pktmbuf_is_contiguous(mbuf) == 0) {
> +			mbuf = mbuf->next;
> +			goto next_in_chain;
> +		}
> +
> +		n_tx_pkts++;
> +		slot++;
> +		n_free--;
> +		rte_pktmbuf_free(mbuf_head);
> +	}
> +
> +no_free_slots:
> +	rte_mb();
> +	if (type == MEMIF_RING_S2M)
> +		ring->head = slot;
> +	else
> +		ring->tail = slot;
> +
> +	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
> +		a = 1;
> +		size = write(mq->intr_handle.fd, &a, sizeof(a));
> +		if (unlikely(size < 0)) {
> +			MIF_LOG(WARNING,
> +				"%s: Failed to send interrupt. %s",
> +				rte_vdev_device_name(pmd->vdev), strerror(errno));
> +		}
> +	}
> +
> +	mq->n_err += nb_pkts - n_tx_pkts;
> +	mq->n_pkts += n_tx_pkts;
> +	return n_tx_pkts;
> +}
> +
> +void
> +memif_free_regions(struct pmd_internals *pmd)
> +{
> +	int i;
> +	struct memif_region *r;
> +
> +	/* regions are allocated contiguously, so it's
> +	 * enough to loop until 'pmd->regions_num'
> +	 */
> +	for (i = 0; i < pmd->regions_num; i++) {
> +		r = pmd->regions[i];
> +		if (r != NULL) {
> +			if (r->addr != NULL) {
> +				munmap(r->addr, r->region_size);
> +				if (r->fd > 0) {
> +					close(r->fd);
> +					r->fd = -1;
> +				}
> +			}
> +			rte_free(r);
> +			pmd->regions[i] = NULL;
> +		}
> +	}
> +	pmd->regions_num = 0;
> +}
> +
> +static int
> +memif_region_init_shm(struct pmd_internals *pmd, uint8_t has_buffers)
> +{
> +	char shm_name[ETH_MEMIF_SHM_NAME_SIZE];
> +	int ret = 0;
> +	struct memif_region *r;
> +
> +	if (pmd->regions_num >= ETH_MEMIF_MAX_REGION_NUM) {
> +		MIF_LOG(ERR, "%s: Too many regions.", rte_vdev_device_name(pmd->vdev));
> +		return -1;
> +	}
> +
> +	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
> +	if (r == NULL) {
> +		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
> +			rte_vdev_device_name(pmd->vdev));
> +		return -ENOMEM;
> +	}
> +
> +	/* calculate buffer offset */
> +	r->pkt_buffer_offset = (pmd->run.num_s2m_rings + pmd->run.num_m2s_rings) *
> +	    (sizeof(memif_ring_t) + sizeof(memif_desc_t) *
> +	    (1 << pmd->run.log2_ring_size));
> +
> +	r->region_size = r->pkt_buffer_offset;
> +	/* if region has buffers, add buffers size to region_size */
> +	if (has_buffers == 1)
> +		r->region_size += (uint32_t)(pmd->run.pkt_buffer_size *
> +			(1 << pmd->run.log2_ring_size) *
> +			(pmd->run.num_s2m_rings +
> +			 pmd->run.num_m2s_rings));
> +
> +	memset(shm_name, 0, sizeof(char) * ETH_MEMIF_SHM_NAME_SIZE);
> +	snprintf(shm_name, ETH_MEMIF_SHM_NAME_SIZE, "memif_region_%d",
> +		 pmd->regions_num);
> +
> +	r->fd = memfd_create(shm_name, MFD_ALLOW_SEALING);
> +	if (r->fd < 0) {
> +		MIF_LOG(ERR, "%s: Failed to create shm file: %s.",
> +			rte_vdev_device_name(pmd->vdev),
> +			strerror(errno));
> +		ret = -1;
> +		goto error;
> +	}
> +
> +	ret = fcntl(r->fd, F_ADD_SEALS, F_SEAL_SHRINK);
> +	if (ret < 0) {
> +		MIF_LOG(ERR, "%s: Failed to add seals to shm file: %s.",
> +			rte_vdev_device_name(pmd->vdev),
> +			strerror(errno));
> +		goto error;
> +	}
> +
> +	ret = ftruncate(r->fd, r->region_size);
> +	if (ret < 0) {
> +		MIF_LOG(ERR, "%s: Failed to truncate shm file: %s.",
> +			rte_vdev_device_name(pmd->vdev),
> +			strerror(errno));
> +		goto error;
> +	}
> +
> +	r->addr = mmap(NULL, r->region_size, PROT_READ |
> +		       PROT_WRITE, MAP_SHARED, r->fd, 0);
> +	if (r->addr == MAP_FAILED) {
> +		MIF_LOG(ERR, "%s: Failed to mmap shm region: %s.",
> +			rte_vdev_device_name(pmd->vdev),
> +			strerror(ret));
> +		ret = -1;
> +		goto error;
> +	}
> +
> +	pmd->regions[pmd->regions_num] = r;
> +	pmd->regions_num++;
> +
> +	return ret;
> +
> +error:
> +	if (r->fd > 0)
> +		close(r->fd);
> +	r->fd = -1;
> +
> +	return ret;
> +}
> +
> +static int
> +memif_regions_init(struct pmd_internals *pmd)
> +{
> +	int ret;
> +
> +	/* create one buffer region */
> +	ret = memif_region_init_shm(pmd, /* has buffer */ 1);
> +	if (ret < 0)
> +		return ret;
> +
> +	return 0;
> +}
> +
> +static void
> +memif_init_rings(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	memif_ring_t *ring;
> +	int i, j;
> +	uint16_t slot;
> +
> +	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
> +		ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
> +		ring->head = 0;
> +		ring->tail = 0;
> +		ring->cookie = MEMIF_COOKIE;
> +		ring->flags = 0;
> +		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
> +			slot = i * (1 << pmd->run.log2_ring_size) + j;
> +			ring->desc[j].region = 0;
> +			ring->desc[j].offset = pmd->regions[0]->pkt_buffer_offset +
> +				(uint32_t)(slot * pmd->run.pkt_buffer_size);
> +			ring->desc[j].length = pmd->run.pkt_buffer_size;
> +		}
> +	}
> +
> +	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
> +		ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
> +		ring->head = 0;
> +		ring->tail = 0;
> +		ring->cookie = MEMIF_COOKIE;
> +		ring->flags = 0;
> +		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
> +			slot = (i + pmd->run.num_s2m_rings) *
> +			    (1 << pmd->run.log2_ring_size) + j;
> +			ring->desc[j].region = 0;
> +			ring->desc[j].offset = pmd->regions[0]->pkt_buffer_offset +
> +				(uint32_t)(slot * pmd->run.pkt_buffer_size);
> +			ring->desc[j].length = pmd->run.pkt_buffer_size;
> +		}
> +	}
> +}
> +
> +/* called only by slave */
> +static void
> +memif_init_queues(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_queue *mq;
> +	int i;
> +
> +	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
> +		mq = dev->data->tx_queues[i];
> +		mq->ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
> +		mq->log2_ring_size = pmd->run.log2_ring_size;
> +		/* queues located only in region 0 */
> +		mq->region = 0;
> +		mq->ring_offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
> +		mq->last_head = 0;
> +		mq->last_tail = 0;
> +		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
> +		if (mq->intr_handle.fd < 0) {
> +			MIF_LOG(WARNING,
> +				"%s: Failed to create eventfd for tx queue %d: %s.",
> +				rte_vdev_device_name(pmd->vdev), i,
> +				strerror(errno));
> +		}
> +	}
> +
> +	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
> +		mq = dev->data->rx_queues[i];
> +		mq->ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
> +		mq->log2_ring_size = pmd->run.log2_ring_size;
> +		/* queues located only in region 0 */
> +		mq->region = 0;
> +		mq->ring_offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
> +		mq->last_head = 0;
> +		mq->last_tail = 0;
> +		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
> +		if (mq->intr_handle.fd < 0) {
> +			MIF_LOG(WARNING,
> +				"%s: Failed to create eventfd for rx queue %d: %s.",
> +				rte_vdev_device_name(pmd->vdev), i,
> +				strerror(errno));
> +		}
> +	}
> +}
> +
> +int
> +memif_init_regions_and_queues(struct rte_eth_dev *dev)
> +{
> +	int ret;
> +
> +	ret = memif_regions_init(dev->data->dev_private);
> +	if (ret < 0)
> +		return ret;
> +
> +	memif_init_rings(dev);
> +
> +	memif_init_queues(dev);
> +
> +	return 0;
> +}
> +
> +int
> +memif_connect(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_region *mr;
> +	struct memif_queue *mq;
> +	int i;
> +
> +	for (i = 0; i < pmd->regions_num; i++) {
> +		mr = pmd->regions[i];
> +		if (mr != NULL) {
> +			if (mr->addr == NULL) {
> +				if (mr->fd < 0)
> +					return -1;
> +				mr->addr = mmap(NULL, mr->region_size,
> +						PROT_READ | PROT_WRITE,
> +						MAP_SHARED, mr->fd, 0);
> +				if (mr->addr == NULL)
> +					return -1;
> +			}
> +		}
> +	}
> +
> +	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
> +		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
> +		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
> +		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
> +			    mq->ring_offset);
> +		if (mq->ring->cookie != MEMIF_COOKIE) {
> +			MIF_LOG(ERR, "%s: Wrong cookie",
> +				rte_vdev_device_name(pmd->vdev));
> +			return -1;
> +		}
> +		mq->ring->head = 0;
> +		mq->ring->tail = 0;
> +		mq->last_head = 0;
> +		mq->last_tail = 0;
> +		/* enable polling mode */
> +		if (pmd->role == MEMIF_ROLE_MASTER)
> +			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
> +	}
> +	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
> +		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
> +		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
> +		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
> +			    mq->ring_offset);
> +		if (mq->ring->cookie != MEMIF_COOKIE) {
> +			MIF_LOG(ERR, "%s: Wrong cookie",
> +				rte_vdev_device_name(pmd->vdev));
> +			return -1;
> +		}
> +		mq->ring->head = 0;
> +		mq->ring->tail = 0;
> +		mq->last_head = 0;
> +		mq->last_tail = 0;
> +		/* enable polling mode */
> +		if (pmd->role == MEMIF_ROLE_SLAVE)
> +			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
> +	}
> +
> +	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
> +	pmd->flags |= ETH_MEMIF_FLAG_CONNECTED;
> +	dev->data->dev_link.link_status = ETH_LINK_UP;
> +	MIF_LOG(INFO, "%s: Connected.", rte_vdev_device_name(pmd->vdev));
> +	return 0;
> +}
> +
> +static int
> +memif_dev_start(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	int ret = 0;
> +
> +	switch (pmd->role) {
> +	case MEMIF_ROLE_SLAVE:
> +		ret = memif_connect_slave(dev);
> +		break;
> +	case MEMIF_ROLE_MASTER:
> +		ret = memif_connect_master(dev);
> +		break;
> +	default:
> +		MIF_LOG(ERR, "%s: Unknown role: %d.",
> +			rte_vdev_device_name(pmd->vdev), pmd->role);
> +		ret = -1;
> +		break;
> +	}
> +
> +	return ret;
> +}
> +
> +static void
> +memif_dev_close(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +
> +	memif_msg_enq_disconnect(pmd->cc, "Device closed", 0);
> +	memif_disconnect(dev);
> +
> +	memif_socket_remove_device(dev);
> +}
> +
> +static int
> +memif_dev_configure(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +
> +	/*
> +	 * SLAVE - TXQ
> +	 * MASTER - RXQ
> +	 */
> +	pmd->cfg.num_s2m_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
> +				  dev->data->nb_tx_queues : dev->data->nb_rx_queues;
> +
> +	/*
> +	 * SLAVE - RXQ
> +	 * MASTER - TXQ
> +	 */
> +	pmd->cfg.num_m2s_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
> +				  dev->data->nb_rx_queues : dev->data->nb_tx_queues;
> +
> +	return 0;
> +}
> +
> +static int
> +memif_tx_queue_setup(struct rte_eth_dev *dev,
> +		     uint16_t qid,
> +		     uint16_t nb_tx_desc __rte_unused,
> +		     unsigned int socket_id __rte_unused,
> +		     const struct rte_eth_txconf *tx_conf __rte_unused)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_queue *mq;
> +
> +	mq = rte_zmalloc("tx-queue", sizeof(struct memif_queue), 0);
> +	if (mq == NULL) {
> +		MIF_LOG(ERR, "%s: Failed to allocate tx queue id: %u",
> +			rte_vdev_device_name(pmd->vdev), qid);
> +		return -ENOMEM;
> +	}
> +
> +	mq->type =
> +	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_S2M : MEMIF_RING_M2S;
> +	mq->n_pkts = 0;
> +	mq->n_bytes = 0;
> +	mq->n_err = 0;
> +	mq->intr_handle.fd = -1;
> +	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
> +	mq->pmd = pmd;
> +	dev->data->tx_queues[qid] = mq;
> +
> +	return 0;
> +}
> +
> +static int
> +memif_rx_queue_setup(struct rte_eth_dev *dev,
> +		     uint16_t qid,
> +		     uint16_t nb_rx_desc __rte_unused,
> +		     unsigned int socket_id __rte_unused,
> +		     const struct rte_eth_rxconf *rx_conf __rte_unused,
> +		     struct rte_mempool *mb_pool)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_queue *mq;
> +
> +	mq = rte_zmalloc("rx-queue", sizeof(struct memif_queue), 0);
> +	if (mq == NULL) {
> +		MIF_LOG(ERR, "%s: Failed to allocate rx queue id: %u",
> +			rte_vdev_device_name(pmd->vdev), qid);
> +		return -ENOMEM;
> +	}
> +
> +	mq->type = (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_M2S : MEMIF_RING_S2M;
> +	mq->n_pkts = 0;
> +	mq->n_bytes = 0;
> +	mq->n_err = 0;
> +	mq->intr_handle.fd = -1;
> +	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
> +	mq->mempool = mb_pool;
> +	mq->in_port = dev->data->port_id;
> +	mq->pmd = pmd;
> +	dev->data->rx_queues[qid] = mq;
> +
> +	return 0;
> +}
> +
> +static void
> +memif_queue_release(void *queue)
> +{
> +	struct memif_queue *mq = (struct memif_queue *)queue;
> +
> +	if (!mq)
> +		return;
> +
> +	rte_free(mq);
> +}
> +
> +static int
> +memif_link_update(struct rte_eth_dev *dev __rte_unused,
> +		  int wait_to_complete __rte_unused)
> +{
> +	return 0;
> +}
> +
> +static int
> +memif_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	struct memif_queue *mq;
> +	int i;
> +	uint8_t tmp, nq;
> +
> +	stats->ipackets = 0;
> +	stats->ibytes = 0;
> +	stats->opackets = 0;
> +	stats->obytes = 0;
> +	stats->oerrors = 0;
> +
> +	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_s2m_rings :
> +	    pmd->run.num_m2s_rings;
> +	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
> +	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
> +
> +	/* RX stats */
> +	for (i = 0; i < nq; i++) {
> +		mq = dev->data->rx_queues[i];
> +		stats->q_ipackets[i] = mq->n_pkts;
> +		stats->q_ibytes[i] = mq->n_bytes;
> +		stats->ipackets += mq->n_pkts;
> +		stats->ibytes += mq->n_bytes;
> +	}
> +
> +	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_m2s_rings :
> +	    pmd->run.num_s2m_rings;
> +	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
> +	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
> +
> +	/* TX stats */
> +	for (i = 0; i < nq; i++) {
> +		mq = dev->data->tx_queues[i];
> +		stats->q_opackets[i] = mq->n_pkts;
> +		stats->q_obytes[i] = mq->n_bytes;
> +		stats->opackets += mq->n_pkts;
> +		stats->obytes += mq->n_bytes;
> +		stats->oerrors += mq->n_err;
> +	}
> +	return 0;
> +}
> +
> +static void
> +memif_stats_reset(struct rte_eth_dev *dev)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +	int i;
> +	struct memif_queue *mq;
> +
> +	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
> +		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->tx_queues[i] :
> +		    dev->data->rx_queues[i];
> +		mq->n_pkts = 0;
> +		mq->n_bytes = 0;
> +		mq->n_err = 0;
> +	}
> +	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
> +		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->rx_queues[i] :
> +		    dev->data->tx_queues[i];
> +		mq->n_pkts = 0;
> +		mq->n_bytes = 0;
> +		mq->n_err = 0;
> +	}
> +}
> +
> +static int
> +memif_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
> +{
> +	struct pmd_internals *pmd = dev->data->dev_private;
> +
> +	MIF_LOG(WARNING, "%s: Interrupt mode not supported.",
> +		rte_vdev_device_name(pmd->vdev));
> +
> +	return -1;
> +}
> +
> +static int
> +memif_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
> +{
> +	struct pmd_internals *pmd __rte_unused = dev->data->dev_private;
> +
> +	return 0;
> +}
> +
> +static const struct eth_dev_ops ops = {
> +	.dev_start = memif_dev_start,
> +	.dev_close = memif_dev_close,
> +	.dev_infos_get = memif_dev_info,
> +	.dev_configure = memif_dev_configure,
> +	.tx_queue_setup = memif_tx_queue_setup,
> +	.rx_queue_setup = memif_rx_queue_setup,
> +	.rx_queue_release = memif_queue_release,
> +	.tx_queue_release = memif_queue_release,
> +	.rx_queue_intr_enable = memif_rx_queue_intr_enable,
> +	.rx_queue_intr_disable = memif_rx_queue_intr_disable,
> +	.link_update = memif_link_update,
> +	.stats_get = memif_stats_get,
> +	.stats_reset = memif_stats_reset,
> +};
> +
> +static int
> +memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
> +	     memif_interface_id_t id, uint32_t flags,
> +	     const char *socket_filename,
> +	     memif_log2_ring_size_t log2_ring_size,
> +	     uint16_t pkt_buffer_size, const char *secret,
> +	     struct rte_ether_addr *ether_addr)
> +{
> +	int ret = 0;
> +	struct rte_eth_dev *eth_dev;
> +	struct rte_eth_dev_data *data;
> +	struct pmd_internals *pmd;
> +	const unsigned int numa_node = vdev->device.numa_node;
> +	const char *name = rte_vdev_device_name(vdev);
> +
> +	if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
> +		MIF_LOG(ERR, "Zero-copy slave not supported.");
> +		return -1;
> +	}
> +
> +	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
> +	if (eth_dev == NULL) {
> +		MIF_LOG(ERR, "%s: Unable to allocate device struct.", name);
> +		return -1;
> +	}
> +
> +	pmd = eth_dev->data->dev_private;
> +	memset(pmd, 0, sizeof(*pmd));
> +
> +	pmd->vdev = vdev;
> +	pmd->id = id;
> +	pmd->flags = flags;
> +	pmd->flags |= ETH_MEMIF_FLAG_DISABLED;
> +	pmd->role = role;
> +
> +	ret = memif_socket_init(eth_dev, socket_filename);
> +	if (ret < 0)
> +		return ret;
> +
> +	memset(pmd->secret, 0, sizeof(char) * ETH_MEMIF_SECRET_SIZE);
> +	if (secret != NULL)
> +		strlcpy(pmd->secret, secret, sizeof(pmd->secret));
> +
> +	pmd->cfg.log2_ring_size = log2_ring_size;
> +	/* set in .dev_configure() */
> +	pmd->cfg.num_s2m_rings = 0;
> +	pmd->cfg.num_m2s_rings = 0;
> +
> +	pmd->cfg.pkt_buffer_size = pkt_buffer_size;
> +
> +	data = eth_dev->data;
> +	data->dev_private = pmd;
> +	data->numa_node = numa_node;
> +	data->mac_addrs = ether_addr;
> +
> +	eth_dev->dev_ops = &ops;
> +	eth_dev->device = &vdev->device;
> +	eth_dev->rx_pkt_burst = eth_memif_rx;
> +	eth_dev->tx_pkt_burst = eth_memif_tx;
> +
> +	eth_dev->data->dev_flags |= RTE_ETH_DEV_CLOSE_REMOVE;
> +
> +	rte_eth_dev_probing_finish(eth_dev);
> +
> +	return 0;
> +}
> +
> +static int
> +memif_set_role(const char *key __rte_unused, const char *value,
> +	       void *extra_args)
> +{
> +	enum memif_role_t *role = (enum memif_role_t *)extra_args;
> +
> +	if (strstr(value, "master") != NULL) {
> +		*role = MEMIF_ROLE_MASTER;
> +	} else if (strstr(value, "slave") != NULL) {
> +		*role = MEMIF_ROLE_SLAVE;
> +	} else {
> +		MIF_LOG(ERR, "Unknown role: %s.", value);
> +		return -EINVAL;
> +	}
> +	return 0;
> +}
> +
> +static int
> +memif_set_zc(const char *key __rte_unused, const char *value, void *extra_args)
> +{
> +	uint32_t *flags = (uint32_t *)extra_args;
> +
> +	if (strstr(value, "yes") != NULL) {
> +		*flags |= ETH_MEMIF_FLAG_ZERO_COPY;
> +	} else if (strstr(value, "no") != NULL) {
> +		*flags &= ~ETH_MEMIF_FLAG_ZERO_COPY;
> +	} else {
> +		MIF_LOG(ERR, "Failed to parse zero-copy param: %s.", value);
> +		return -EINVAL;
> +	}
> +	return 0;
> +}
> +
> +static int
> +memif_set_id(const char *key __rte_unused, const char *value, void *extra_args)
> +{
> +	memif_interface_id_t *id = (memif_interface_id_t *)extra_args;
> +
> +	/* even if parsing fails, 0 is a valid id */
> +	*id = strtoul(value, NULL, 10);
> +	return 0;
> +}
> +
> +static int
> +memif_set_bs(const char *key __rte_unused, const char *value, void *extra_args)
> +{
> +	unsigned long tmp;
> +	uint16_t *pkt_buffer_size = (uint16_t *)extra_args;
> +
> +	tmp = strtoul(value, NULL, 10);
> +	if (tmp == 0 || tmp > 0xFFFF) {
> +		MIF_LOG(ERR, "Invalid buffer size: %s.", value);
> +		return -EINVAL;
> +	}
> +	*pkt_buffer_size = tmp;
> +	return 0;
> +}
> +
> +static int
> +memif_set_rs(const char *key __rte_unused, const char *value, void *extra_args)
> +{
> +	unsigned long tmp;
> +	memif_log2_ring_size_t *log2_ring_size =
> +	    (memif_log2_ring_size_t *)extra_args;
> +
> +	tmp = strtoul(value, NULL, 10);
> +	if (tmp == 0 || tmp > ETH_MEMIF_MAX_LOG2_RING_SIZE) {
> +		MIF_LOG(ERR, "Invalid ring size: %s (max %u).",
> +			value, ETH_MEMIF_MAX_LOG2_RING_SIZE);
> +		return -EINVAL;
> +	}
> +	*log2_ring_size = tmp;
> +	return 0;
> +}
> +
> +/* check if directory exists and if we have permission to read/write */
> +static int
> +memif_check_socket_filename(const char *filename)
> +{
> +	char *dir = NULL, *tmp;
> +	uint32_t idx;
> +	int ret = 0;
> +
> +	tmp = strrchr(filename, '/');
> +	if (tmp != NULL) {
> +		idx = tmp - filename;
> +		dir = rte_zmalloc("memif_tmp", sizeof(char) * (idx + 1), 0);
> +		if (dir == NULL) {
> +			MIF_LOG(ERR, "Failed to allocate memory.");
> +			return -1;
> +		}
> +		strlcpy(dir, filename, sizeof(char) * (idx + 1));
> +	}
> +
> +	if (dir == NULL || (faccessat(-1, dir, F_OK | R_OK |
> +					W_OK, AT_EACCESS) < 0)) {
> +		MIF_LOG(ERR, "Invalid socket directory.");
> +		ret = -EINVAL;
> +	}
> +
> +	if (dir != NULL)
> +		rte_free(dir);
> +
> +	return ret;
> +}
> +
> +static int
> +memif_set_socket_filename(const char *key __rte_unused, const char *value,
> +			  void *extra_args)
> +{
> +	const char **socket_filename = (const char **)extra_args;
> +
> +	*socket_filename = value;
> +	return memif_check_socket_filename(*socket_filename);
> +}
> +
> +static int
> +memif_set_mac(const char *key __rte_unused, const char *value, void *extra_args)
> +{
> +	struct rte_ether_addr *ether_addr = (struct rte_ether_addr *)extra_args;
> +	int ret = 0;
> +
> +	ret = sscanf(value, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
> +	       &ether_addr->addr_bytes[0], &ether_addr->addr_bytes[1],
> +	       &ether_addr->addr_bytes[2], &ether_addr->addr_bytes[3],
> +	       &ether_addr->addr_bytes[4], &ether_addr->addr_bytes[5]);
> +	if (ret != 6)
> +		MIF_LOG(WARNING, "Failed to parse mac '%s'.", value);
> +	return 0;
> +}
> +
> +static int
> +memif_set_secret(const char *key __rte_unused, const char *value, void *extra_args)
> +{
> +	const char **secret = (const char **)extra_args;
> +
> +	*secret = value;
> +	return 0;
> +}
> +
> +static int
> +rte_pmd_memif_probe(struct rte_vdev_device *vdev)
> +{
> +	RTE_BUILD_BUG_ON(sizeof(memif_msg_t) != 128);
> +	RTE_BUILD_BUG_ON(sizeof(memif_desc_t) != 16);
> +	int ret = 0;
> +	struct rte_kvargs *kvlist;
> +	const char *name = rte_vdev_device_name(vdev);
> +	enum memif_role_t role = MEMIF_ROLE_SLAVE;
> +	memif_interface_id_t id = 0;
> +	uint16_t pkt_buffer_size = ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE;
> +	memif_log2_ring_size_t log2_ring_size = ETH_MEMIF_DEFAULT_RING_SIZE;
> +	const char *socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME;
> +	uint32_t flags = 0;
> +	const char *secret = NULL;
> +	struct rte_ether_addr *ether_addr = rte_zmalloc("", sizeof(struct rte_ether_addr), 0);
> +
> +	rte_eth_random_addr(ether_addr->addr_bytes);
> +
> +	MIF_LOG(INFO, "Initialize MEMIF: %s.", name);
> +
> +	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
> +		MIF_LOG(ERR, "Multi-processing not supported for memif.");
> +		/* TODO:
> +		 * Request connection information.
> +		 *
> +		 * Once memif in the primary process is connected,
> +		 * broadcast connection information.
> +		 */
> +		return -1;
> +	}
> +
> +	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_arguments);
> +
> +	/* parse parameters */
> +	if (kvlist != NULL) {
> +		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ROLE_ARG,
> +					 &memif_set_role, &role);
> +		if (ret < 0)
> +			goto exit;
> +		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ID_ARG,
> +					 &memif_set_id, &id);
> +		if (ret < 0)
> +			goto exit;
> +		ret = rte_kvargs_process(kvlist, ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
> +					 &memif_set_bs, &pkt_buffer_size);
> +		if (ret < 0)
> +			goto exit;
> +		ret = rte_kvargs_process(kvlist, ETH_MEMIF_RING_SIZE_ARG,
> +					 &memif_set_rs, &log2_ring_size);
> +		if (ret < 0)
> +			goto exit;
> +		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SOCKET_ARG,
> +					 &memif_set_socket_filename,
> +					 (void *)(&socket_filename));
> +		if (ret < 0)
> +			goto exit;
> +		ret = rte_kvargs_process(kvlist, ETH_MEMIF_MAC_ARG,
> +					 &memif_set_mac, ether_addr);
> +		if (ret < 0)
> +			goto exit;
> +		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ZC_ARG,
> +					 &memif_set_zc, &flags);
> +		if (ret < 0)
> +			goto exit;
> +		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SECRET_ARG,
> +					 &memif_set_secret, (void *)(&secret));
> +		if (ret < 0)
> +			goto exit;
> +	}
> +
> +	/* create interface */
> +	ret = memif_create(vdev, role, id, flags, socket_filename,
> +			   log2_ring_size, pkt_buffer_size, secret, ether_addr);
> +
> +exit:
> +	if (kvlist != NULL)
> +		rte_kvargs_free(kvlist);
> +	return ret;
> +}
> +
> +static int
> +rte_pmd_memif_remove(struct rte_vdev_device *vdev)
> +{
> +	struct rte_eth_dev *eth_dev;
> +	int i;
> +
> +	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
> +	if (eth_dev == NULL)
> +		return 0;
> +
> +	for (i = 0; i < eth_dev->data->nb_rx_queues; i++)
> +		(*eth_dev->dev_ops->rx_queue_release)(eth_dev->data->rx_queues[i]);
> +	for (i = 0; i < eth_dev->data->nb_tx_queues; i++)
> +		(*eth_dev->dev_ops->rx_queue_release)(eth_dev->data->tx_queues[i]);
> +
> +	rte_free(eth_dev->process_private);
> +	eth_dev->process_private = NULL;
> +
> +	rte_eth_dev_close(eth_dev->data->port_id);
> +
> +	return 0;
> +}
> +
> +static struct rte_vdev_driver pmd_memif_drv = {
> +	.probe = rte_pmd_memif_probe,
> +	.remove = rte_pmd_memif_remove,
> +};
> +
> +RTE_PMD_REGISTER_VDEV(net_memif, pmd_memif_drv);
> +
> +RTE_PMD_REGISTER_PARAM_STRING(net_memif,
> +			      ETH_MEMIF_ID_ARG "=<int>"
> +			      ETH_MEMIF_ROLE_ARG "=master|slave"
> +			      ETH_MEMIF_PKT_BUFFER_SIZE_ARG "=<int>"
> +			      ETH_MEMIF_RING_SIZE_ARG "=<int>"
> +			      ETH_MEMIF_SOCKET_ARG "=<string>"
> +			      ETH_MEMIF_MAC_ARG "=xx:xx:xx:xx:xx:xx"
> +			      ETH_MEMIF_ZC_ARG "=yes|no"
> +			      ETH_MEMIF_SECRET_ARG "=<string>");
> +
> +int memif_logtype;
> +
> +RTE_INIT(memif_init_log)
> +{
> +	memif_logtype = rte_log_register("pmd.net.memif");
> +	if (memif_logtype >= 0)
> +		rte_log_set_level(memif_logtype, RTE_LOG_NOTICE);
> +}
> diff --git a/drivers/net/memif/rte_eth_memif.h b/drivers/net/memif/rte_eth_memif.h
> new file mode 100644
> index 000000000..8cee2643c
> --- /dev/null
> +++ b/drivers/net/memif/rte_eth_memif.h
> @@ -0,0 +1,212 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
> + */
> +
> +#ifndef _RTE_ETH_MEMIF_H_
> +#define _RTE_ETH_MEMIF_H_
> +
> +#ifndef _GNU_SOURCE
> +#define _GNU_SOURCE
> +#endif				/* GNU_SOURCE */
> +
> +#include <sys/queue.h>
> +
> +#include <rte_ethdev_driver.h>
> +#include <rte_ether.h>
> +#include <rte_interrupts.h>
> +
> +#include "memif.h"
> +
> +#define ETH_MEMIF_DEFAULT_SOCKET_FILENAME	"/run/memif.sock"
> +#define ETH_MEMIF_DEFAULT_RING_SIZE		10
> +#define ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE	2048
> +
> +#define ETH_MEMIF_MAX_NUM_Q_PAIRS		255
> +#define ETH_MEMIF_MAX_LOG2_RING_SIZE		14
> +#define ETH_MEMIF_MAX_REGION_NUM		256
> +
> +#define ETH_MEMIF_SHM_NAME_SIZE			32
> +#define ETH_MEMIF_DISC_STRING_SIZE		96
> +#define ETH_MEMIF_SECRET_SIZE			24
> +
> +extern int memif_logtype;
> +
> +#define MIF_LOG(level, fmt, args...) \
> +	rte_log(RTE_LOG_ ## level, memif_logtype, \
> +		"%s(): " fmt "\n", __func__, ##args)
> +
> +enum memif_role_t {
> +	MEMIF_ROLE_MASTER,
> +	MEMIF_ROLE_SLAVE,
> +};
> +
> +struct memif_region {
> +	void *addr;				/**< shared memory address */
> +	memif_region_size_t region_size;	/**< shared memory size */
> +	int fd;					/**< shared memory file descriptor */
> +	uint32_t pkt_buffer_offset;
> +	/**< offset from 'addr' to first packet buffer */
> +};
> +
> +struct memif_queue {
> +	struct rte_mempool *mempool;		/**< mempool for RX packets */
> +	struct pmd_internals *pmd;		/**< device internals */
> +
> +	memif_ring_type_t type;			/**< ring type */
> +	memif_region_index_t region;		/**< shared memory region index */
> +
> +	uint16_t in_port;			/**< port id */
> +
> +	memif_region_offset_t ring_offset;
> +	/**< ring offset from start of shm region (ring - memif_region.addr) */
> +
> +	uint16_t last_head;			/**< last ring head */
> +	uint16_t last_tail;			/**< last ring tail */
> +
> +	/* rx/tx info */
> +	uint64_t n_pkts;			/**< number of rx/tx packets */
> +	uint64_t n_bytes;			/**< number of rx/tx bytes */
> +	uint64_t n_err;				/**< number of tx errors */
> +
> +	memif_ring_t *ring;			/**< pointer to ring */
> +
> +	struct rte_intr_handle intr_handle;	/**< interrupt handle */
> +
> +	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
> +};
> +
> +struct pmd_internals {
> +	memif_interface_id_t id;		/**< unique id */
> +	enum memif_role_t role;			/**< device role */
> +	uint32_t flags;				/**< device status flags */
> +#define ETH_MEMIF_FLAG_CONNECTING	(1 << 0)
> +/**< device is connecting */
> +#define ETH_MEMIF_FLAG_CONNECTED	(1 << 1)
> +/**< device is connected */
> +#define ETH_MEMIF_FLAG_ZERO_COPY	(1 << 2)
> +/**< device is zero-copy enabled */
> +#define ETH_MEMIF_FLAG_DISABLED		(1 << 3)
> +/**< device has not been configured and can not accept connection requests */
> +
> +	char *socket_filename;			/**< pointer to socket filename */
> +	char secret[ETH_MEMIF_SECRET_SIZE]; /**< secret (optional security parameter) */
> +
> +	struct memif_control_channel *cc;	/**< control channel */
> +
> +	struct memif_region *regions[ETH_MEMIF_MAX_REGION_NUM];
> +	/**< shared memory regions */
> +	memif_region_index_t regions_num;	/**< number of regions */
> +
> +	/* remote info */
> +	char remote_name[RTE_DEV_NAME_MAX_LEN];		/**< remote app name */
> +	char remote_if_name[RTE_DEV_NAME_MAX_LEN];	/**< remote peer name */
> +
> +	struct {
> +		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
> +		uint8_t num_s2m_rings;		/**< number of slave to master rings */
> +		uint8_t num_m2s_rings;		/**< number of master to slave rings */
> +		uint16_t pkt_buffer_size;	/**< buffer size */
> +	} cfg;					/**< Configured parameters (max values) */
> +
> +	struct {
> +		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
> +		uint8_t num_s2m_rings;		/**< number of slave to master rings */
> +		uint8_t num_m2s_rings;		/**< number of master to slave rings */
> +		uint16_t pkt_buffer_size;	/**< buffer size */
> +	} run;
> +	/**< Parameters used in active connection */
> +
> +	char local_disc_string[ETH_MEMIF_DISC_STRING_SIZE];
> +	/**< local disconnect reason */
> +	char remote_disc_string[ETH_MEMIF_DISC_STRING_SIZE];
> +	/**< remote disconnect reason */
> +
> +	struct rte_vdev_device *vdev;		/**< vdev handle */
> +};
> +
> +/**
> + * Unmap shared memory and free regions from memory.
> + *
> + * @param pmd
> + *   device internals
> + */
> +void memif_free_regions(struct pmd_internals *pmd);
> +
> +/**
> + * Finalize connection establishment process. Map shared memory file
> + * (master role), initialize ring queue, set link status up.
> + *
> + * @param pmd
> + *   device internals
> + * @return
> + *   - On success, zero.
> + *   - On failure, a negative value.
> + */
> +int memif_connect(struct rte_eth_dev *dev);
> +
> +/**
> + * Create shared memory file and initialize ring queue.
> + * Only called by slave when establishing connection
> + *
> + * @param pmd
> + *   device internals
> + * @return
> + *   - On success, zero.
> + *   - On failure, a negative value.
> + */
> +int memif_init_regions_and_queues(struct rte_eth_dev *dev);
> +
> +/**
> + * Get memif version string.
> + *
> + * @return
> + *   - memif version string
> + */
> +const char *memif_version(void);
> +
> +#ifndef MFD_HUGETLB
> +#ifndef __NR_memfd_create
> +
> +#if defined __x86_64__
> +#define __NR_memfd_create 319
> +#elif defined __x86_32__
> +#define __NR_memfd_create 1073742143
> +#elif defined __arm__
> +#define __NR_memfd_create 385
> +#elif defined __aarch64__
> +#define __NR_memfd_create 279
> +#elif defined __powerpc__
> +#define __NR_memfd_create 360
> +#elif defined __i386__
> +#define __NR_memfd_create 356
> +#else
> +#error "__NR_memfd_create unknown for this architecture"
> +#endif
> +
> +#endif				/* __NR_memfd_create */
> +
> +static inline int memfd_create(const char *name, unsigned int flags)
> +{
> +	return syscall(__NR_memfd_create, name, flags);
> +}
> +#endif				/* MFD_HUGETLB */
> +
> +#ifndef F_LINUX_SPECIFIC_BASE
> +#define F_LINUX_SPECIFIC_BASE 1024
> +#endif
> +
> +#ifndef MFD_ALLOW_SEALING
> +#define MFD_ALLOW_SEALING       0x0002U
> +#endif
> +
> +#ifndef F_ADD_SEALS
> +#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
> +#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
> +
> +#define F_SEAL_SEAL     0x0001	/* prevent further seals from being set */
> +#define F_SEAL_SHRINK   0x0002	/* prevent file from shrinking */
> +#define F_SEAL_GROW     0x0004	/* prevent file from growing */
> +#define F_SEAL_WRITE    0x0008	/* prevent writes */
> +#endif
> +
> +#endif				/* RTE_ETH_MEMIF_H */
> diff --git a/drivers/net/memif/rte_pmd_memif_version.map b/drivers/net/memif/rte_pmd_memif_version.map
> new file mode 100644
> index 000000000..8861484fb
> --- /dev/null
> +++ b/drivers/net/memif/rte_pmd_memif_version.map
> @@ -0,0 +1,4 @@
> +DPDK_19.08 {
> +
> +        local: *;
> +};
> diff --git a/drivers/net/meson.build b/drivers/net/meson.build
> index ed99896c3..b57073483 100644
> --- a/drivers/net/meson.build
> +++ b/drivers/net/meson.build
> @@ -24,6 +24,7 @@ drivers = ['af_packet',
>  	'ixgbe',
>  	'kni',
>  	'liquidio',
> +	'memif',
>  	'mlx4',
>  	'mlx5',
>  	'mvneta',
> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> index 7c9b4b538..326b2a30b 100644
> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -174,6 +174,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KNI)        += -lrte_pmd_kni
>  endif
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD)        += -lrte_pmd_lio
> +_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF)      += -lrte_pmd_memif
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -lrte_pmd_mlx4
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += -lrte_pmd_mlx5 -lmnl
>  ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
> --
> 2.17.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [PATCH v10] net/memif: introduce memory interface (memif) PMD
  2019-06-03 11:28                 ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
@ 2019-06-03 14:25                   ` Ye Xiaolong
  0 siblings, 0 replies; 58+ messages in thread
From: Ye Xiaolong @ 2019-06-03 14:25 UTC (permalink / raw)
  To: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco),
	Ferruh Yigit
  Cc: dev

On 06/03, Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco) wrote:
>
>
>> -----Original Message-----
>> From: Ye Xiaolong <xiaolong.ye@intel.com>
>> Sent: Friday, May 31, 2019 9:43 AM
>> To: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
>> <jgrajcia@cisco.com>
>> Cc: dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v10] net/memif: introduce memory interface
>> (memif) PMD
>> 
>> Minor nit, this should be a V1 instead of V10 after your RFC.
>
>I'll keep that in mind, but what version number should I use in the next patch, since I did use V10 in this one?
>

I am not sure what's DPDK convention for this case, Maybe Ferruh can suggest
on this.

Thanks,
Xiaolong

>Thanks,
>Jakub

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [PATCH v10] net/memif: introduce memory interface (memif) PMD
  2019-05-31  6:22             ` [dpdk-dev] [PATCH v10] net/memif: introduce memory interface (memif) PMD Jakub Grajciar
  2019-05-31  7:43               ` Ye Xiaolong
  2019-06-03 13:37               ` Aaron Conole
@ 2019-06-05 11:55               ` Ferruh Yigit
  2019-06-06  9:24                 ` Ferruh Yigit
  2019-06-06  8:24               ` [dpdk-dev] [PATCH v11] " Jakub Grajciar
  3 siblings, 1 reply; 58+ messages in thread
From: Ferruh Yigit @ 2019-06-05 11:55 UTC (permalink / raw)
  To: Jakub Grajciar, dev

On 5/31/2019 7:22 AM, Jakub Grajciar wrote:
> Memory interface (memif), provides high performance
> packet transfer over shared memory.

Almost there, can you please check below comments? I am hoping to merge next
version of the patch.

Thanks,
ferruh

> 
> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>

<...>

> +static const char *valid_arguments[] = {
> +	ETH_MEMIF_ID_ARG,
> +	ETH_MEMIF_ROLE_ARG,
> +	ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
> +	ETH_MEMIF_RING_SIZE_ARG,
> +	ETH_MEMIF_SOCKET_ARG,
> +	ETH_MEMIF_MAC_ARG,
> +	ETH_MEMIF_ZC_ARG,
> +	ETH_MEMIF_SECRET_ARG,
> +	NULL
> +};

Checkpatch is giving following warning:

WARNING:STATIC_CONST_CHAR_ARRAY: static const char * array should probably be
static const char * const
#1885: FILE: drivers/net/memif/rte_eth_memif.c:39:
+static const char *valid_arguments[] = {

<...>

> +static int
> +rte_pmd_memif_probe(struct rte_vdev_device *vdev)
> +{
> +	RTE_BUILD_BUG_ON(sizeof(memif_msg_t) != 128);
> +	RTE_BUILD_BUG_ON(sizeof(memif_desc_t) != 16);
> +	int ret = 0;
> +	struct rte_kvargs *kvlist;
> +	const char *name = rte_vdev_device_name(vdev);
> +	enum memif_role_t role = MEMIF_ROLE_SLAVE;
> +	memif_interface_id_t id = 0;
> +	uint16_t pkt_buffer_size = ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE;
> +	memif_log2_ring_size_t log2_ring_size = ETH_MEMIF_DEFAULT_RING_SIZE;
> +	const char *socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME;
> +	uint32_t flags = 0;
> +	const char *secret = NULL;
> +	struct rte_ether_addr *ether_addr = rte_zmalloc("", sizeof(struct rte_ether_addr), 0);

This is a long line, and breaking it won't reduce the readability, can you
please break it. Checkpatch warning:

WARNING:LONG_LINE: line over 80 characters
#2939: FILE: drivers/net/memif/rte_eth_memif.c:1093:
+       struct rte_ether_addr *ether_addr = rte_zmalloc("", sizeof(struct
rte_ether_addr), 0);

<...>

> +static int
> +rte_pmd_memif_remove(struct rte_vdev_device *vdev)
> +{
> +	struct rte_eth_dev *eth_dev;
> +	int i;
> +
> +	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
> +	if (eth_dev == NULL)
> +		return 0;
> +
> +	for (i = 0; i < eth_dev->data->nb_rx_queues; i++)
> +		(*eth_dev->dev_ops->rx_queue_release)(eth_dev->data->rx_queues[i]);
> +	for (i = 0; i < eth_dev->data->nb_tx_queues; i++)
> +		(*eth_dev->dev_ops->rx_queue_release)(eth_dev->data->tx_queues[i]);

Although they point same function, better to use 'dev_ops->tx_queue_release' for
Tx queues.

> +
> +	rte_free(eth_dev->process_private);
> +	eth_dev->process_private = NULL;

"process_private" is not used in this PMD at all, no need to free it I think.

> +
> +	rte_eth_dev_close(eth_dev->data->port_id);

There are two exit path from a PMD:
1) rte_eth_dev_close() API
2) rte_vdev_driver->remove() called by hotplug remove APIs ('rte_dev_remove()'
or 'rte_eal_hotplug_remove()')

Both should clear all PMD allocated resources. Since you are calling
'rte_eth_dev_close() from this .remove() function, it makes sense to move all
resource cleanup to .dev_close (like queue cleanup calls above).

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [PATCH v10] net/memif: introduce memory interface (memif) PMD
  2019-05-31  7:43               ` Ye Xiaolong
  2019-06-03 11:28                 ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
@ 2019-06-05 12:01                 ` Ferruh Yigit
  1 sibling, 0 replies; 58+ messages in thread
From: Ferruh Yigit @ 2019-06-05 12:01 UTC (permalink / raw)
  To: Ye Xiaolong, Jakub Grajciar; +Cc: dev

On 5/31/2019 8:43 AM, Ye Xiaolong wrote:
> Minor nit, this should be a V1 instead of V10 after your RFC.

Normally Xiaolong is right, after RFC the actual patches numbering resets, but
for this case I think it is OK as long as people can follow the order, I am OK
to continue as v11 for next version, up to you.

> 
> On 05/31, Jakub Grajciar wrote:
>> Memory interface (memif), provides high performance
>> packet transfer over shared memory.
> 
> It'd be better to have more descrition of this new PMD in the commit log, something 
> like you have in memif.rst

+1, some more detail in commit log is better.

> 
> Thanks,
> Xiaolong
> 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* [dpdk-dev] [PATCH v11] net/memif: introduce memory interface (memif) PMD
  2019-05-31  6:22             ` [dpdk-dev] [PATCH v10] net/memif: introduce memory interface (memif) PMD Jakub Grajciar
                                 ` (2 preceding siblings ...)
  2019-06-05 11:55               ` Ferruh Yigit
@ 2019-06-06  8:24               ` Jakub Grajciar
  2019-06-06 11:38                 ` [dpdk-dev] [PATCH v12] " Jakub Grajciar
  3 siblings, 1 reply; 58+ messages in thread
From: Jakub Grajciar @ 2019-06-06  8:24 UTC (permalink / raw)
  To: dev; +Cc: Jakub Grajciar

Shared memory packet interface (memif) PMD allows for DPDK and any other client
using memif (DPDK, VPP, libmemif) to communicate using shared memory.
The created device transmits packets in a raw format. It can be used with
Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
supported in DPDK memif implementation. Memif is Linux only.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
---
 MAINTAINERS                                 |    6 +
 config/common_base                          |    5 +
 config/common_linux                         |    1 +
 doc/guides/nics/features/memif.ini          |   14 +
 doc/guides/nics/index.rst                   |    1 +
 doc/guides/nics/memif.rst                   |  234 ++++
 doc/guides/rel_notes/release_19_08.rst      |    5 +
 drivers/net/Makefile                        |    1 +
 drivers/net/memif/Makefile                  |   31 +
 drivers/net/memif/memif.h                   |  179 +++
 drivers/net/memif/memif_socket.c            | 1124 +++++++++++++++++
 drivers/net/memif/memif_socket.h            |  105 ++
 drivers/net/memif/meson.build               |   15 +
 drivers/net/memif/rte_eth_memif.c           | 1207 +++++++++++++++++++
 drivers/net/memif/rte_eth_memif.h           |  212 ++++
 drivers/net/memif/rte_pmd_memif_version.map |    4 +
 drivers/net/meson.build                     |    1 +
 mk/rte.app.mk                               |    1 +
 18 files changed, 3146 insertions(+)
 create mode 100644 doc/guides/nics/features/memif.ini
 create mode 100644 doc/guides/nics/memif.rst
 create mode 100644 drivers/net/memif/Makefile
 create mode 100644 drivers/net/memif/memif.h
 create mode 100644 drivers/net/memif/memif_socket.c
 create mode 100644 drivers/net/memif/memif_socket.h
 create mode 100644 drivers/net/memif/meson.build
 create mode 100644 drivers/net/memif/rte_eth_memif.c
 create mode 100644 drivers/net/memif/rte_eth_memif.h
 create mode 100644 drivers/net/memif/rte_pmd_memif_version.map

requires patch: http://patchwork.dpdk.org/patch/51511/
Memif uses a unix domain socket as control channel (cc).
rte_interrupts.h is used to handle interrupts for this cc.
This patch is required, because the message received can
be a reason for disconnecting.

v3:
- coding style fixes
- documentation
- doxygen comments
- use strlcpy() instead of strncpy()
- use RTE_BUILD_BUG_ON instead of _Static_assert()
- fix meson build deps

v4:
- coding style fixes
- doc update (desc format, messaging, features, ...)
- pointer arithmetic fix
- __rte_packed and __rte_aligned instead of __attribute__()
- fixed multi-queue
- support additional archs (memfd_create syscall)

v5:
- coding style fixes
- add release notes
- remove unused dependencies
- doc update

v6:
- remove socket on termination of testpmd application
- disallow memif probing in secondary process (return error), multi-process support will be implemented in separate patch
- fix chained buffer packet length
- use single region for rings and buffers (non-zero-copy)
- doc update

v7:
- coding style fixes

v8:
- release notes: 19.05 -> 19.08

v9:
- rearrange elements in structures to remove holes
- default socket file in /run
- #define name sizes
- implement dev_close op

v10:
- fix shared lib build
- clear memory allocated by PMD in .dev_close
- fix NULL pointer string 'error: ‘%s’ directive argument is null'

v11:
- fix strict-aliasing
- correct parameter names in doxygen comments

diff --git a/MAINTAINERS b/MAINTAINERS
index 15d0829c5..3698c146b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -841,6 +841,12 @@ F: drivers/net/softnic/
 F: doc/guides/nics/features/softnic.ini
 F: doc/guides/nics/softnic.rst

+Memif PMD
+M: Jakub Grajciar <jgrajcia@cisco.com>
+F: drivers/net/memif/
+F: doc/guides/nics/memif.rst
+F: doc/guides/nics/features/memif.ini
+

 Crypto Drivers
 --------------
diff --git a/config/common_base b/config/common_base
index 6f19ad5d2..e406e7836 100644
--- a/config/common_base
+++ b/config/common_base
@@ -444,6 +444,11 @@ CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 #
 CONFIG_RTE_LIBRTE_PMD_AF_XDP=n

+#
+# Compile Memory Interface PMD driver (Linux only)
+#
+CONFIG_RTE_LIBRTE_PMD_MEMIF=n
+
 #
 # Compile link bonding PMD library
 #
diff --git a/config/common_linux b/config/common_linux
index 75334273d..87514fe4f 100644
--- a/config/common_linux
+++ b/config/common_linux
@@ -19,6 +19,7 @@ CONFIG_RTE_LIBRTE_VHOST_POSTCOPY=n
 CONFIG_RTE_LIBRTE_PMD_VHOST=y
 CONFIG_RTE_LIBRTE_IFC_PMD=y
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
+CONFIG_RTE_LIBRTE_PMD_MEMIF=y
 CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
 CONFIG_RTE_LIBRTE_PMD_TAP=y
 CONFIG_RTE_LIBRTE_AVP_PMD=y
diff --git a/doc/guides/nics/features/memif.ini b/doc/guides/nics/features/memif.ini
new file mode 100644
index 000000000..807d9ecdc
--- /dev/null
+++ b/doc/guides/nics/features/memif.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'memif' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Link status          = Y
+Basic stats          = Y
+Jumbo frame          = Y
+ARMv8                = Y
+Power8               = Y
+x86-32               = Y
+x86-64               = Y
+Usage doc            = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 2221c35f2..691e7209f 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -36,6 +36,7 @@ Network Interface Controller Drivers
     intel_vf
     kni
     liquidio
+    memif
     mlx4
     mlx5
     mvneta
diff --git a/doc/guides/nics/memif.rst b/doc/guides/nics/memif.rst
new file mode 100644
index 000000000..de2d481eb
--- /dev/null
+++ b/doc/guides/nics/memif.rst
@@ -0,0 +1,234 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2018-2019 Cisco Systems, Inc.
+
+======================
+Memif Poll Mode Driver
+======================
+
+Shared memory packet interface (memif) PMD allows for DPDK and any other client
+using memif (DPDK, VPP, libmemif) to communicate using shared memory. Memif is
+Linux only.
+
+The created device transmits packets in a raw format. It can be used with
+Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
+supported in DPDK memif implementation.
+
+Memif works in two roles: master and slave. Slave connects to master over an
+existing socket. It is also a producer of shared memory file and initializes
+the shared memory. Each interface can be connected to one peer interface
+at same time. The peer interface is identified by id parameter. Master
+creates the socket and listens for any slave connection requests. The socket
+may already exist on the system. Be sure to remove any such sockets, if you
+are creating a master interface, or you will see an "Address already in use"
+error. Function ``rte_pmd_memif_remove()``, which removes memif interface,
+will also remove a listener socket, if it is not being used by any other
+interface.
+
+The method to enable one or more interfaces is to use the
+``--vdev=net_memif0`` option on the DPDK application command line. Each
+``--vdev=net_memif1`` option given will create an interface named net_memif0,
+net_memif1, and so on. Memif uses unix domain socket to transmit control
+messages. Each memif has a unique id per socket. This id is used to identify
+peer interface. If you are connecting multiple
+interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``,
+etc. Note that if you assign a socket to a master interface it becomes a
+listener socket. Listener socket can not be used by a slave interface on same
+client.
+
+.. csv-table:: **Memif configuration options**
+   :header: "Option", "Description", "Default", "Valid value"
+
+   "id=0", "Used to identify peer interface", "0", "uint32_t"
+   "role=master", "Set memif role", "slave", "master|slave"
+   "bsize=1024", "Size of single packet buffer", "2048", "uint16_t"
+   "rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", "10", "1-14"
+   "socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 256"
+   "mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef", ""
+   "secret=abc123", "Secret is an optional security option, which if specified, must be matched by peer", "", "string len 24"
+   "zero-copy=yes", "Enable/disable zero-copy slave mode", "no", "yes|no"
+
+**Connection establishment**
+
+In order to create memif connection, two memif interfaces, each in separate
+process, are needed. One interface in ``master`` role and other in
+``slave`` role. It is not possible to connect two interfaces in a single
+process. Each interface can be connected to one interface at same time,
+identified by matching id parameter.
+
+Memif driver uses unix domain socket to exchange required information between
+memif interfaces. Socket file path is specified at interface creation see
+*Memif configuration options* table above. If socket is used by ``master``
+interface, it's marked as listener socket (in scope of current process) and
+listens to connection requests from other processes. One socket can be used by
+multiple interfaces. One process can have ``slave`` and ``master`` interfaces
+at the same time, provided each role is assigned unique socket.
+
+For detailed information on memif control messages, see: net/memif/memif.h.
+
+Slave interface attempts to make a connection on assigned socket. Process
+listening on this socket will extract the connection request and create a new
+connected socket (control channel). Then it sends the 'hello' message
+(``MEMIF_MSG_TYPE_HELLO``), containing configuration boundaries. Slave interface
+adjusts its configuration accordingly, and sends 'init' message
+(``MEMIF_MSG_TYPE_INIT``). This message among others contains interface id. Driver
+uses this id to find master interface, and assigns the control channel to this
+interface. If such interface is found, 'ack' message (``MEMIF_MSG_TYPE_ACK``) is
+sent. Slave interface sends 'add region' message (``MEMIF_MSG_TYPE_ADD_REGION``) for
+every region allocated. Master responds to each of these messages with 'ack'
+message. Same behavior applies to rings. Slave sends 'add ring' message
+(``MEMIF_MSG_TYPE_ADD_RING``) for every initialized ring. Master again responds to
+each message with 'ack' message. To finalize the connection, slave interface
+sends 'connect' message (``MEMIF_MSG_TYPE_CONNECT``). Upon receiving this message
+master maps regions to its address space, initializes rings and responds with
+'connected' message (``MEMIF_MSG_TYPE_CONNECTED``). Disconnect
+(``MEMIF_MSG_TYPE_DISCONNECT``) can be sent by both master and slave interfaces at
+any time, due to driver error or if the interface is being deleted.
+
+Files
+
+- net/memif/memif.h *- control messages definitions*
+- net/memif/memif_socket.h
+- net/memif/memif_socket.c
+
+Shared memory
+~~~~~~~~~~~~~
+
+**Shared memory format**
+
+Slave is producer and master is consumer. Memory regions, are mapped shared memory files,
+created by memif slave and provided to master at connection establishment.
+Regions contain rings and buffers. Rings and buffers can also be separated into multiple
+regions. For no-zero-copy, rings and buffers are stored inside single memory
+region to reduce the number of opened files.
+
+region n (no-zero-copy):
+
++-----------------------+-------------------------------------------------------------------------+
+| Rings                 | Buffers                                                                 |
++-----------+-----------+-----------------+---+---------------------------------------------------+
+| S2M rings | M2S rings | packet buffer 0 | . | pb ((1 << pmd->run.log2_ring_size)*(s2m + m2s))-1 |
++-----------+-----------+-----------------+---+---------------------------------------------------+
+
+S2M OR M2S Rings:
+
++--------+--------+-----------------------+
+| ring 0 | ring 1 | ring num_s2m_rings - 1|
++--------+--------+-----------------------+
+
+ring 0:
+
++-------------+---------------------------------------+
+| ring header | (1 << pmd->run.log2_ring_size) * desc |
++-------------+---------------------------------------+
+
+Descriptors are assigned packet buffers in order of rings creation. If we have one ring
+in each direction and ring size is 1024, then first 1024 buffers will belong to S2M ring and
+last 1024 will belong to M2S ring. In case of zero-copy, buffers are dequeued and
+enqueued as needed.
+
+**Descriptor format**
+
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Quad|6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | |1|1| | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Word|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | |6|5| | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|0   |length                                                         |region                         |flags                          |
++----+---------------------------------------------------------------+-------------------------------+-------------------------------+
+|1   |metadata                                                       |offset                                                         |
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+**Flags field - flags (Quad Word 0, bits 0:15)**
+
++-----+--------------------+------------------------------------------------------------------------------------------------+
+|Bits |Name                |Functionality                                                                                   |
++=====+====================+================================================================================================+
+|0    |MEMIF_DESC_FLAG_NEXT|Is chained buffer. When set, the packet is divided into multiple buffers. May not be contiguous.|
++-----+--------------------+------------------------------------------------------------------------------------------------+
+
+**Region index - region (Quad Word 0, 16:31)**
+
+Index of memory region, the buffer is located in.
+
+**Data length - length (Quad Word 0, 32:63)**
+
+Length of transmitted/received data.
+
+**Data Offset - offset (Quad Word 1, 0:31)**
+
+Data start offset from memory region address. *.regions[desc->region].addr + desc->offset*
+
+**Metadata - metadata (Quad Word 1, 32:63)**
+
+Buffer metadata.
+
+Files
+
+- net/memif/memif.h *- descriptor and ring definitions*
+- net/memif/rte_eth_memif.c *- eth_memif_rx() eth_memif_tx()*
+
+Example: testpmd
+----------------------------
+In this example we run two instances of testpmd application and transmit packets over memif.
+
+First create ``master`` interface::
+
+    #./build/app/testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
+
+Now create ``slave`` interface (master must be already running so the slave will connect)::
+
+    #./build/app/testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
+
+Start forwarding packets::
+
+    Slave:
+        testpmd> start
+
+    Master:
+        testpmd> start tx_first
+
+Show status::
+
+    testpmd> show port stats 0
+
+For more details on testpmd please refer to :doc:`../testpmd_app_ug/index`.
+
+Example: testpmd and VPP
+------------------------
+For information on how to get and run VPP please see `<https://wiki.fd.io/view/VPP>`_.
+
+Start VPP in interactive mode (should be by default). Create memif master interface in VPP::
+
+    vpp# create interface memif id 0 master no-zero-copy
+    vpp# set interface state memif0/0 up
+    vpp# set interface ip address memif0/0 192.168.1.1/24
+
+To see socket filename use show memif command::
+
+    vpp# show memif
+    sockets
+     id  listener    filename
+      0   yes (1)     /run/vpp/memif.sock
+    ...
+
+Now create memif interface by running testpmd with these command line options::
+
+    #./testpmd --vdev=net_memif,socket=/run/vpp/memif.sock -- -i
+
+Testpmd should now create memif slave interface and try to connect to master.
+In testpmd set forward option to icmpecho and start forwarding::
+
+    testpmd> set fwd icmpecho
+    testpmd> start
+
+Send ping from VPP::
+
+    vpp# ping 192.168.1.2
+    64 bytes from 192.168.1.2: icmp_seq=2 ttl=254 time=36.2918 ms
+    64 bytes from 192.168.1.2: icmp_seq=3 ttl=254 time=23.3927 ms
+    64 bytes from 192.168.1.2: icmp_seq=4 ttl=254 time=24.2975 ms
+    64 bytes from 192.168.1.2: icmp_seq=5 ttl=254 time=17.7049 ms
diff --git a/doc/guides/rel_notes/release_19_08.rst b/doc/guides/rel_notes/release_19_08.rst
index a17e7dea5..46efd13f0 100644
--- a/doc/guides/rel_notes/release_19_08.rst
+++ b/doc/guides/rel_notes/release_19_08.rst
@@ -54,6 +54,11 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================

+* **Added memif PMD.**
+
+  Added the new Shared Memory Packet Interface (``memif``) PMD.
+  See the :doc:`../nics/memif` guide for more details on this new driver.
+

 Removed Items
 -------------
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 3a72cf38c..78cb10fc6 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -35,6 +35,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_ICE_PMD) += ice
 DIRS-$(CONFIG_RTE_LIBRTE_IPN3KE_PMD) += ipn3ke
 DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
 DIRS-$(CONFIG_RTE_LIBRTE_LIO_PMD) += liquidio
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif
 DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
 DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
 DIRS-$(CONFIG_RTE_LIBRTE_MVNETA_PMD) += mvneta
diff --git a/drivers/net/memif/Makefile b/drivers/net/memif/Makefile
new file mode 100644
index 000000000..c3119625c
--- /dev/null
+++ b/drivers/net/memif/Makefile
@@ -0,0 +1,31 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_memif.a
+
+EXPORT_MAP := rte_pmd_memif_version.map
+
+LIBABIVER := 1
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+# Experimantal APIs:
+# - rte_intr_callback_unregister_pending
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool
+LDLIBS += -lrte_ethdev -lrte_kvargs
+LDLIBS += -lrte_hash
+LDLIBS += -lrte_bus_vdev
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += rte_eth_memif.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif_socket.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/memif/memif.h b/drivers/net/memif/memif.h
new file mode 100644
index 000000000..3948b1f50
--- /dev/null
+++ b/drivers/net/memif/memif.h
@@ -0,0 +1,179 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_H_
+#define _MEMIF_H_
+
+#define MEMIF_COOKIE		0x3E31F20
+#define MEMIF_VERSION_MAJOR	2
+#define MEMIF_VERSION_MINOR	0
+#define MEMIF_VERSION		((MEMIF_VERSION_MAJOR << 8) | MEMIF_VERSION_MINOR)
+#define MEMIF_NAME_SZ		32
+
+/*
+ * S2M: direction slave -> master
+ * M2S: direction master -> slave
+ */
+
+/*
+ *  Type definitions
+ */
+
+typedef enum memif_msg_type {
+	MEMIF_MSG_TYPE_NONE,
+	MEMIF_MSG_TYPE_ACK,
+	MEMIF_MSG_TYPE_HELLO,
+	MEMIF_MSG_TYPE_INIT,
+	MEMIF_MSG_TYPE_ADD_REGION,
+	MEMIF_MSG_TYPE_ADD_RING,
+	MEMIF_MSG_TYPE_CONNECT,
+	MEMIF_MSG_TYPE_CONNECTED,
+	MEMIF_MSG_TYPE_DISCONNECT,
+} memif_msg_type_t;
+
+typedef enum {
+	MEMIF_RING_S2M, /**< buffer ring in direction slave -> master */
+	MEMIF_RING_M2S, /**< buffer ring in direction master -> slave */
+} memif_ring_type_t;
+
+typedef enum {
+	MEMIF_INTERFACE_MODE_ETHERNET,
+	MEMIF_INTERFACE_MODE_IP,
+	MEMIF_INTERFACE_MODE_PUNT_INJECT,
+} memif_interface_mode_t;
+
+typedef uint16_t memif_region_index_t;
+typedef uint32_t memif_region_offset_t;
+typedef uint64_t memif_region_size_t;
+typedef uint16_t memif_ring_index_t;
+typedef uint32_t memif_interface_id_t;
+typedef uint16_t memif_version_t;
+typedef uint8_t memif_log2_ring_size_t;
+
+/*
+ *  Socket messages
+ */
+
+ /**
+  * M2S
+  * Contains master interfaces configuration.
+  */
+typedef struct __rte_packed {
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+	memif_version_t min_version; /**< lowest supported memif version */
+	memif_version_t max_version; /**< highest supported memif version */
+	memif_region_index_t max_region; /**< maximum num of regions */
+	memif_ring_index_t max_m2s_ring; /**< maximum num of M2S ring */
+	memif_ring_index_t max_s2m_ring; /**< maximum num of S2M rings */
+	memif_log2_ring_size_t max_log2_ring_size; /**< maximum ring size (as log2) */
+} memif_msg_hello_t;
+
+/**
+ * S2M
+ * Contains information required to identify interface
+ * to which the slave wants to connect.
+ */
+typedef struct __rte_packed {
+	memif_version_t version;		/**< memif version */
+	memif_interface_id_t id;		/**< interface id */
+	memif_interface_mode_t mode:8;		/**< interface mode */
+	uint8_t secret[24];			/**< optional security parameter */
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+} memif_msg_init_t;
+
+/**
+ * S2M
+ * Request master to add new shared memory region to master interface.
+ * Shared files file descriptor is passed in cmsghdr.
+ */
+typedef struct __rte_packed {
+	memif_region_index_t index;		/**< shm regions index */
+	memif_region_size_t size;		/**< shm region size */
+} memif_msg_add_region_t;
+
+/**
+ * S2M
+ * Request master to add new ring to master interface.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_MSG_ADD_RING_FLAG_S2M 1		/**< ring is in S2M direction */
+	memif_ring_index_t index;		/**< ring index */
+	memif_region_index_t region; /**< region index on which this ring is located */
+	memif_region_offset_t offset;		/**< buffer start offset */
+	memif_log2_ring_size_t log2_ring_size;	/**< ring size (log2) */
+	uint16_t private_hdr_size;		/**< used for private metadata */
+} memif_msg_add_ring_t;
+
+/**
+ * S2M
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< slave interface name */
+} memif_msg_connect_t;
+
+/**
+ * M2S
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< master interface name */
+} memif_msg_connected_t;
+
+/**
+ * S2M & M2S
+ * Disconnect interfaces.
+ */
+typedef struct __rte_packed {
+	uint32_t code;				/**< error code */
+	uint8_t string[96];			/**< disconnect reason */
+} memif_msg_disconnect_t;
+
+typedef struct __rte_packed __rte_aligned(128)
+{
+	memif_msg_type_t type:16;
+	union {
+		memif_msg_hello_t hello;
+		memif_msg_init_t init;
+		memif_msg_add_region_t add_region;
+		memif_msg_add_ring_t add_ring;
+		memif_msg_connect_t connect;
+		memif_msg_connected_t connected;
+		memif_msg_disconnect_t disconnect;
+	};
+} memif_msg_t;
+
+/*
+ *  Ring and Descriptor Layout
+ */
+
+/**
+ * Buffer descriptor.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_DESC_FLAG_NEXT 1			/**< is chained buffer */
+	memif_region_index_t region; /**< region index on which the buffer is located */
+	uint32_t length;			/**< buffer length */
+	memif_region_offset_t offset;		/**< buffer offset */
+	uint32_t metadata;
+} memif_desc_t;
+
+#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
+	uint8_t mark[0] __rte_aligned(RTE_CACHE_LINE_SIZE)
+
+typedef struct {
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline0);
+	uint32_t cookie;			/**< MEMIF_COOKIE */
+	uint16_t flags;				/**< flags */
+#define MEMIF_RING_FLAG_MASK_INT 1		/**< disable interrupt mode */
+	volatile uint16_t head;			/**< pointer to ring buffer head */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline1);
+	volatile uint16_t tail;			/**< pointer to ring buffer tail */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline2);
+	memif_desc_t desc[0];			/**< buffer descriptors */
+} memif_ring_t;
+
+#endif				/* _MEMIF_H_ */
diff --git a/drivers/net/memif/memif_socket.c b/drivers/net/memif/memif_socket.c
new file mode 100644
index 000000000..1e046b6b1
--- /dev/null
+++ b/drivers/net/memif/memif_socket.c
@@ -0,0 +1,1124 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <errno.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_hash.h>
+#include <rte_jhash.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+static void memif_intr_handler(void *arg);
+
+static ssize_t
+memif_msg_send(int fd, memif_msg_t *msg, int afd)
+{
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	struct cmsghdr *cmsg;
+	char ctl[CMSG_SPACE(sizeof(int))];
+
+	iov[0].iov_base = msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+
+	if (afd > 0) {
+		memset(&ctl, 0, sizeof(ctl));
+		mh.msg_control = ctl;
+		mh.msg_controllen = sizeof(ctl);
+		cmsg = CMSG_FIRSTHDR(&mh);
+		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+		cmsg->cmsg_level = SOL_SOCKET;
+		cmsg->cmsg_type = SCM_RIGHTS;
+		rte_memcpy(CMSG_DATA(cmsg), &afd, sizeof(int));
+	}
+
+	return sendmsg(fd, &mh, 0);
+}
+
+static int
+memif_msg_send_from_queue(struct memif_control_channel *cc)
+{
+	ssize_t size;
+	int ret = 0;
+	struct memif_msg_queue_elt *e;
+
+	e = TAILQ_FIRST(&cc->msg_queue);
+	if (e == NULL)
+		return 0;
+
+	size = memif_msg_send(cc->intr_handle.fd, &e->msg, e->fd);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(ERR, "sendmsg fail: %s.", strerror(errno));
+		ret = -1;
+	} else {
+		MIF_LOG(DEBUG, "Sent msg type %u.", e->msg.type);
+	}
+	TAILQ_REMOVE(&cc->msg_queue, e, next);
+	rte_free(e);
+
+	return ret;
+}
+
+static struct memif_msg_queue_elt *
+memif_msg_enq(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e;
+
+	e = rte_zmalloc("memif_msg", sizeof(struct memif_msg_queue_elt), 0);
+	if (e == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control message.");
+		return NULL;
+	}
+
+	e->fd = -1;
+	TAILQ_INSERT_TAIL(&cc->msg_queue, e, next);
+
+	return e;
+}
+
+void
+memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			 int err_code)
+{
+	struct memif_msg_queue_elt *e;
+	struct pmd_internals *pmd;
+	memif_msg_disconnect_t *d;
+
+	if (cc == NULL) {
+		MIF_LOG(DEBUG, "Missing control channel.");
+		return;
+	}
+
+	e = memif_msg_enq(cc);
+	if (e == NULL) {
+		MIF_LOG(WARNING, "Failed to enqueue disconnect message.");
+		return;
+	}
+
+	d = &e->msg.disconnect;
+
+	e->msg.type = MEMIF_MSG_TYPE_DISCONNECT;
+	d->code = err_code;
+
+	if (reason != NULL) {
+		strlcpy((char *)d->string, reason, sizeof(d->string));
+		if (cc->dev != NULL) {
+			pmd = cc->dev->data->dev_private;
+			strlcpy(pmd->local_disc_string, reason,
+				sizeof(pmd->local_disc_string));
+		}
+	}
+}
+
+static int
+memif_msg_enq_hello(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	memif_msg_hello_t *h;
+
+	if (e == NULL)
+		return -1;
+
+	h = &e->msg.hello;
+
+	e->msg.type = MEMIF_MSG_TYPE_HELLO;
+	h->min_version = MEMIF_VERSION;
+	h->max_version = MEMIF_VERSION;
+	h->max_s2m_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_m2s_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_region = ETH_MEMIF_MAX_REGION_NUM - 1;
+	h->max_log2_ring_size = ETH_MEMIF_MAX_LOG2_RING_SIZE;
+
+	strlcpy((char *)h->name, rte_version(), sizeof(h->name));
+
+	return 0;
+}
+
+static int
+memif_msg_receive_hello(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_hello_t *h = &msg->hello;
+
+	if (h->min_version > MEMIF_VERSION || h->max_version < MEMIF_VERSION) {
+		memif_msg_enq_disconnect(pmd->cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	/* Set parameters for active connection */
+	pmd->run.num_s2m_rings = RTE_MIN(h->max_s2m_ring + 1,
+					   pmd->cfg.num_s2m_rings);
+	pmd->run.num_m2s_rings = RTE_MIN(h->max_m2s_ring + 1,
+					   pmd->cfg.num_m2s_rings);
+	pmd->run.log2_ring_size = RTE_MIN(h->max_log2_ring_size,
+					    pmd->cfg.log2_ring_size);
+	pmd->run.pkt_buffer_size = pmd->cfg.pkt_buffer_size;
+
+	strlcpy(pmd->remote_name, (char *)h->name, sizeof(pmd->remote_name));
+
+	MIF_LOG(DEBUG, "%s: Connecting to %s.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_init(struct memif_control_channel *cc, memif_msg_t *msg)
+{
+	memif_msg_init_t *i = &msg->init;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *pmd;
+	struct rte_eth_dev *dev;
+
+	if (i->version != MEMIF_VERSION) {
+		memif_msg_enq_disconnect(cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	if (cc->socket == NULL) {
+		memif_msg_enq_disconnect(cc, "Device error", 0);
+		return -1;
+	}
+
+	/* Find device with requested ID */
+	TAILQ_FOREACH(elt, &cc->socket->dev_queue, next) {
+		dev = elt->dev;
+		pmd = dev->data->dev_private;
+		if (((pmd->flags & ETH_MEMIF_FLAG_DISABLED) == 0) &&
+		    pmd->id == i->id) {
+			/* assign control channel to device */
+			cc->dev = dev;
+			pmd->cc = cc;
+
+			if (i->mode != MEMIF_INTERFACE_MODE_ETHERNET) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Only ethernet mode supported",
+							 0);
+				return -1;
+			}
+
+			if (pmd->flags & (ETH_MEMIF_FLAG_CONNECTING |
+					   ETH_MEMIF_FLAG_CONNECTED)) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Already connected", 0);
+				return -1;
+			}
+			strlcpy(pmd->remote_name, (char *)i->name,
+				sizeof(pmd->remote_name));
+
+			if (*pmd->secret != '\0') {
+				if (*i->secret == '\0') {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Secret required", 0);
+					return -1;
+				}
+				if (strncmp(pmd->secret, (char *)i->secret,
+						ETH_MEMIF_SECRET_SIZE) != 0) {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Incorrect secret", 0);
+					return -1;
+				}
+			}
+
+			pmd->flags |= ETH_MEMIF_FLAG_CONNECTING;
+			return 0;
+		}
+	}
+
+	/* ID not found on this socket */
+	MIF_LOG(DEBUG, "ID %u not found.", i->id);
+	memif_msg_enq_disconnect(cc, "ID not found", 0);
+	return -1;
+}
+
+static int
+memif_msg_receive_add_region(struct rte_eth_dev *dev, memif_msg_t *msg,
+			     int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_region_t *ar = &msg->add_region;
+	struct memif_region *r;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing region fd", 0);
+		return -1;
+	}
+
+	if (ar->index >= ETH_MEMIF_MAX_REGION_NUM || ar->index != pmd->regions_num ||
+			pmd->regions[ar->index] != NULL) {
+		memif_msg_enq_disconnect(pmd->cc, "Invalid region index", 0);
+		return -1;
+	}
+
+	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	r->fd = fd;
+	r->region_size = ar->size;
+	r->addr = NULL;
+
+	pmd->regions[ar->index] = r;
+	pmd->regions_num++;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_add_ring(struct rte_eth_dev *dev, memif_msg_t *msg, int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_ring_t *ar = &msg->add_ring;
+	struct memif_queue *mq;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing interrupt fd", 0);
+		return -1;
+	}
+
+	/* check if we have enough queues */
+	if (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) {
+		if (ar->index >= pmd->cfg.num_s2m_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_s2m_rings++;
+	} else {
+		if (ar->index >= pmd->cfg.num_m2s_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_m2s_rings++;
+	}
+
+	mq = (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) ?
+	    dev->data->rx_queues[ar->index] : dev->data->tx_queues[ar->index];
+
+	mq->intr_handle.fd = fd;
+	mq->log2_ring_size = ar->log2_ring_size;
+	mq->region = ar->region;
+	mq->ring_offset = ar->offset;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connect_t *c = &msg->connect;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connected(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connected_t *c = &msg->connected;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_disconnect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_disconnect_t *d = &msg->disconnect;
+
+	memset(pmd->remote_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	strlcpy(pmd->remote_disc_string, (char *)d->string,
+		sizeof(pmd->remote_disc_string));
+
+	MIF_LOG(INFO, "%s: Disconnect received: %s",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_disc_string);
+
+	memset(pmd->local_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	memif_disconnect(rte_eth_dev_allocated
+			 (rte_vdev_device_name(pmd->vdev)));
+	return 0;
+}
+
+static int
+memif_msg_enq_ack(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	e->msg.type = MEMIF_MSG_TYPE_ACK;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_init(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_init_t *i = &e->msg.init;
+
+	if (e == NULL)
+		return -1;
+
+	i = &e->msg.init;
+	e->msg.type = MEMIF_MSG_TYPE_INIT;
+	i->version = MEMIF_VERSION;
+	i->id = pmd->id;
+	i->mode = MEMIF_INTERFACE_MODE_ETHERNET;
+
+	strlcpy((char *)i->name, rte_version(), sizeof(i->name));
+
+	if (*pmd->secret != '\0')
+		strlcpy((char *)i->secret, pmd->secret, sizeof(i->secret));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_region(struct rte_eth_dev *dev, uint8_t idx)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_add_region_t *ar;
+	struct memif_region *mr = pmd->regions[idx];
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_region;
+	e->msg.type = MEMIF_MSG_TYPE_ADD_REGION;
+	e->fd = mr->fd;
+	ar->index = idx;
+	ar->size = mr->region_size;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_ring(struct rte_eth_dev *dev, uint8_t idx,
+		       memif_ring_type_t type)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	struct memif_queue *mq;
+	memif_msg_add_ring_t *ar;
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_ring;
+	mq = (type == MEMIF_RING_S2M) ? dev->data->tx_queues[idx] :
+	    dev->data->rx_queues[idx];
+
+	e->msg.type = MEMIF_MSG_TYPE_ADD_RING;
+	e->fd = mq->intr_handle.fd;
+	ar->index = idx;
+	ar->offset = mq->ring_offset;
+	ar->region = mq->region;
+	ar->log2_ring_size = mq->log2_ring_size;
+	ar->flags = (type == MEMIF_RING_S2M) ? MEMIF_MSG_ADD_RING_FLAG_S2M : 0;
+	ar->private_hdr_size = 0;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connect_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connect;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECT;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connected(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connected_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connected;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECTED;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static void
+memif_intr_unregister_handler(struct rte_intr_handle *intr_handle, void *arg)
+{
+	struct memif_msg_queue_elt *elt;
+	struct memif_control_channel *cc = arg;
+
+	/* close control channel fd */
+	close(intr_handle->fd);
+	/* clear message queue */
+	while ((elt = TAILQ_FIRST(&cc->msg_queue)) != NULL) {
+		TAILQ_REMOVE(&cc->msg_queue, elt, next);
+		rte_free(elt);
+	}
+	/* free control channel */
+	rte_free(cc);
+}
+
+void
+memif_disconnect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *elt, *next;
+	struct memif_queue *mq;
+	struct rte_intr_handle *ih;
+	int i;
+	int ret;
+
+	if (pmd->cc != NULL) {
+		/* Clear control message queue (except disconnect message if any). */
+		for (elt = TAILQ_FIRST(&pmd->cc->msg_queue); elt != NULL; elt = next) {
+			next = TAILQ_NEXT(elt, next);
+			if (elt->msg.type != MEMIF_MSG_TYPE_DISCONNECT) {
+				TAILQ_REMOVE(&pmd->cc->msg_queue, elt, next);
+				rte_free(elt);
+			}
+		}
+		/* send disconnect message (if there is any in queue) */
+		memif_msg_send_from_queue(pmd->cc);
+
+		/* at this point, there should be no more messages in queue */
+		if (TAILQ_FIRST(&pmd->cc->msg_queue) != NULL) {
+			MIF_LOG(WARNING,
+				"%s: Unexpected message(s) in message queue.",
+				rte_vdev_device_name(pmd->vdev));
+		}
+
+		ih = &pmd->cc->intr_handle;
+		if (ih->fd > 0) {
+			ret = rte_intr_callback_unregister(ih,
+							memif_intr_handler,
+							pmd->cc);
+			/*
+			 * If callback is active (disconnecting based on
+			 * received control message).
+			 */
+			if (ret == -EAGAIN) {
+				ret = rte_intr_callback_unregister_pending(ih,
+							memif_intr_handler,
+							pmd->cc,
+							memif_intr_unregister_handler);
+			} else if (ret > 0) {
+				close(ih->fd);
+				rte_free(pmd->cc);
+			}
+			pmd->cc = NULL;
+			if (ret <= 0)
+				MIF_LOG(WARNING, "%s: Failed to unregister "
+					"control channel callback.",
+					rte_vdev_device_name(pmd->vdev));
+		}
+	}
+
+	/* unconfig interrupts */
+	for (i = 0; i < pmd->cfg.num_s2m_rings; i++) {
+		if (pmd->role == MEMIF_ROLE_SLAVE) {
+			if (dev->data->tx_queues != NULL)
+				mq = dev->data->tx_queues[i];
+			else
+				continue;
+		} else {
+			if (dev->data->rx_queues != NULL)
+				mq = dev->data->rx_queues[i];
+			else
+				continue;
+		}
+		if (mq->intr_handle.fd > 0) {
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+	for (i = 0; i < pmd->cfg.num_m2s_rings; i++) {
+		if (pmd->role == MEMIF_ROLE_MASTER) {
+			if (dev->data->tx_queues != NULL)
+				mq = dev->data->tx_queues[i];
+			else
+				continue;
+		} else {
+			if (dev->data->rx_queues != NULL)
+				mq = dev->data->rx_queues[i];
+			else
+				continue;
+		}
+		if (mq->intr_handle.fd > 0) {
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+
+	memif_free_regions(pmd);
+
+	/* reset connection configuration */
+	memset(&pmd->run, 0, sizeof(pmd->run));
+
+	dev->data->dev_link.link_status = ETH_LINK_DOWN;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTED;
+	MIF_LOG(DEBUG, "%s: Disconnected.", rte_vdev_device_name(pmd->vdev));
+}
+
+static int
+memif_msg_receive(struct memif_control_channel *cc)
+{
+	char ctl[CMSG_SPACE(sizeof(int)) +
+		 CMSG_SPACE(sizeof(struct ucred))] = { 0 };
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	memif_msg_t msg = { 0 };
+	ssize_t size;
+	int ret = 0;
+	struct ucred *cr __rte_unused;
+	cr = 0;
+	struct cmsghdr *cmsg;
+	int afd = -1;
+	int i;
+	struct pmd_internals *pmd;
+
+	iov[0].iov_base = (void *)&msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+	mh.msg_control = ctl;
+	mh.msg_controllen = sizeof(ctl);
+
+	size = recvmsg(cc->intr_handle.fd, &mh, 0);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(DEBUG, "Invalid message size.");
+		memif_msg_enq_disconnect(cc, "Invalid message size", 0);
+		return -1;
+	}
+	MIF_LOG(DEBUG, "Received msg type: %u.", msg.type);
+
+	cmsg = CMSG_FIRSTHDR(&mh);
+	while (cmsg) {
+		if (cmsg->cmsg_level == SOL_SOCKET) {
+			if (cmsg->cmsg_type == SCM_CREDENTIALS)
+				cr = (struct ucred *)CMSG_DATA(cmsg);
+			else if (cmsg->cmsg_type == SCM_RIGHTS)
+				memcpy(&afd, CMSG_DATA(cmsg), sizeof(int));
+		}
+		cmsg = CMSG_NXTHDR(&mh, cmsg);
+	}
+
+	if (cc->dev == NULL && msg.type != MEMIF_MSG_TYPE_INIT) {
+		MIF_LOG(DEBUG, "Unexpected message.");
+		memif_msg_enq_disconnect(cc, "Unexpected message", 0);
+		return -1;
+	}
+
+	/* get device from hash data */
+	switch (msg.type) {
+	case MEMIF_MSG_TYPE_ACK:
+		break;
+	case MEMIF_MSG_TYPE_HELLO:
+		ret = memif_msg_receive_hello(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_init_regions_and_queues(cc->dev);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_init(cc->dev);
+		if (ret < 0)
+			goto exit;
+		pmd = cc->dev->data->dev_private;
+		for (i = 0; i < pmd->regions_num; i++) {
+			ret = memif_msg_enq_add_region(cc->dev, i);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_S2M);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_M2S);
+			if (ret < 0)
+				goto exit;
+		}
+		ret = memif_msg_enq_connect(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_INIT:
+		/*
+		 * This cc does not have an interface asociated with it.
+		 * If suitable interface is found it will be assigned here.
+		 */
+		ret = memif_msg_receive_init(cc, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_REGION:
+		ret = memif_msg_receive_add_region(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_RING:
+		ret = memif_msg_receive_add_ring(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECT:
+		ret = memif_msg_receive_connect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_connected(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECTED:
+		ret = memif_msg_receive_connected(cc->dev, &msg);
+		break;
+	case MEMIF_MSG_TYPE_DISCONNECT:
+		ret = memif_msg_receive_disconnect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		break;
+	default:
+		memif_msg_enq_disconnect(cc, "Unknown message type", 0);
+		ret = -1;
+		goto exit;
+	}
+
+ exit:
+	return ret;
+}
+
+static void
+memif_intr_handler(void *arg)
+{
+	struct memif_control_channel *cc = arg;
+	int ret;
+
+	ret = memif_msg_receive(cc);
+	/* if driver failed to assign device */
+	if (cc->dev == NULL) {
+		ret = rte_intr_callback_unregister_pending(&cc->intr_handle,
+							   memif_intr_handler,
+							   cc,
+							   memif_intr_unregister_handler);
+		if (ret < 0)
+			MIF_LOG(WARNING,
+				"Failed to unregister control channel callback.");
+		return;
+	}
+	/* if memif_msg_receive failed */
+	if (ret < 0)
+		goto disconnect;
+
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto disconnect;
+
+	return;
+
+ disconnect:
+	if (cc->dev == NULL) {
+		MIF_LOG(WARNING, "eth dev not allocated");
+		return;
+	}
+	memif_disconnect(cc->dev);
+}
+
+static void
+memif_listener_handler(void *arg)
+{
+	struct memif_socket *socket = arg;
+	int sockfd;
+	int addr_len;
+	struct sockaddr_un client;
+	struct memif_control_channel *cc;
+	int ret;
+
+	addr_len = sizeof(client);
+	sockfd = accept(socket->intr_handle.fd, (struct sockaddr *)&client,
+			(socklen_t *)&addr_len);
+	if (sockfd < 0) {
+		MIF_LOG(ERR,
+			"Failed to accept connection request on socket fd %d",
+			socket->intr_handle.fd);
+		return;
+	}
+
+	MIF_LOG(DEBUG, "%s: Connection request accepted.", socket->filename);
+
+	cc = rte_zmalloc("memif-cc", sizeof(struct memif_control_channel), 0);
+	if (cc == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control channel.");
+		goto error;
+	}
+
+	cc->intr_handle.fd = sockfd;
+	cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	cc->socket = socket;
+	cc->dev = NULL;
+	TAILQ_INIT(&cc->msg_queue);
+
+	ret = rte_intr_callback_register(&cc->intr_handle, memif_intr_handler, cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to register control channel callback.");
+		goto error;
+	}
+
+	ret = memif_msg_enq_hello(cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to enqueue hello message.");
+		goto error;
+	}
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto error;
+
+	return;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (cc != NULL)
+		rte_free(cc);
+}
+
+static struct memif_socket *
+memif_socket_create(struct pmd_internals *pmd, char *key, uint8_t listener)
+{
+	struct memif_socket *sock;
+	struct sockaddr_un un;
+	int sockfd;
+	int ret;
+	int on = 1;
+
+	sock = rte_zmalloc("memif-socket", sizeof(struct memif_socket), 0);
+	if (sock == NULL) {
+		MIF_LOG(ERR, "Failed to allocate memory for memif socket");
+		return NULL;
+	}
+
+	sock->listener = listener;
+	rte_memcpy(sock->filename, key, 256);
+	TAILQ_INIT(&sock->dev_queue);
+
+	if (listener != 0) {
+		sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+		if (sockfd < 0)
+			goto error;
+
+		un.sun_family = AF_UNIX;
+		memcpy(un.sun_path, sock->filename,
+			sizeof(un.sun_path) - 1);
+
+		ret = setsockopt(sockfd, SOL_SOCKET, SO_PASSCRED, &on,
+				 sizeof(on));
+		if (ret < 0)
+			goto error;
+		ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
+		if (ret < 0)
+			goto error;
+		ret = listen(sockfd, 1);
+		if (ret < 0)
+			goto error;
+
+		MIF_LOG(DEBUG, "%s: Memif listener socket %s created.",
+			rte_vdev_device_name(pmd->vdev), sock->filename);
+
+		sock->intr_handle.fd = sockfd;
+		sock->intr_handle.type = RTE_INTR_HANDLE_EXT;
+		ret = rte_intr_callback_register(&sock->intr_handle,
+						 memif_listener_handler, sock);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to register interrupt "
+				"callback for listener socket",
+				rte_vdev_device_name(pmd->vdev));
+			return NULL;
+		}
+	}
+
+	return sock;
+
+ error:
+	MIF_LOG(ERR, "%s: Failed to setup socket %s: %s",
+		rte_vdev_device_name(pmd->vdev), key, strerror(errno));
+	if (sock != NULL)
+		rte_free(sock);
+	return NULL;
+}
+
+static struct rte_hash *
+memif_create_socket_hash(void)
+{
+	struct rte_hash_parameters params = { 0 };
+	params.name = MEMIF_SOCKET_HASH_NAME;
+	params.entries = 256;
+	params.key_len = 256;
+	params.hash_func = rte_jhash;
+	params.hash_func_init_val = 0;
+	return rte_hash_create(&params);
+}
+
+int
+memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *tmp_pmd;
+	struct rte_hash *hash;
+	int ret;
+	char key[256];
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL) {
+		hash = memif_create_socket_hash();
+		if (hash == NULL) {
+			MIF_LOG(ERR, "Failed to create memif socket hash.");
+			return -1;
+		}
+	}
+
+	memset(key, 0, 256);
+	rte_memcpy(key, socket_filename, strlen(socket_filename));
+	ret = rte_hash_lookup_data(hash, key, (void **)&socket);
+	if (ret < 0) {
+		socket = memif_socket_create(pmd, key,
+					     (pmd->role ==
+					      MEMIF_ROLE_SLAVE) ? 0 : 1);
+		if (socket == NULL)
+			return -1;
+		ret = rte_hash_add_key_data(hash, key, socket);
+		if (ret < 0) {
+			MIF_LOG(ERR, "Failed to add socket to socket hash.");
+			return ret;
+		}
+	}
+	pmd->socket_filename = socket->filename;
+
+	if (socket->listener != 0 && pmd->role == MEMIF_ROLE_SLAVE) {
+		MIF_LOG(ERR, "Socket is a listener.");
+		return -1;
+	} else if ((socket->listener == 0) && (pmd->role == MEMIF_ROLE_MASTER)) {
+		MIF_LOG(ERR, "Socket is not a listener.");
+		return -1;
+	}
+
+	TAILQ_FOREACH(elt, &socket->dev_queue, next) {
+		tmp_pmd = elt->dev->data->dev_private;
+		if (tmp_pmd->id == pmd->id) {
+			MIF_LOG(ERR, "Memif device with id %d already "
+				"exists on socket %s",
+				pmd->id, socket->filename);
+			return -1;
+		}
+	}
+
+	elt = rte_malloc("pmd-queue", sizeof(struct memif_socket_dev_list_elt), 0);
+	if (elt == NULL) {
+		MIF_LOG(ERR, "%s: Failed to add device to socket device list.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+	elt->dev = dev;
+	TAILQ_INSERT_TAIL(&socket->dev_queue, elt, next);
+
+	return 0;
+}
+
+void
+memif_socket_remove_device(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt, *next;
+	struct rte_hash *hash;
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL)
+		return;
+
+	if (pmd->socket_filename == NULL)
+		return;
+
+	if (rte_hash_lookup_data(hash, pmd->socket_filename, (void **)&socket) < 0)
+		return;
+
+	for (elt = TAILQ_FIRST(&socket->dev_queue); elt != NULL; elt = next) {
+		next = TAILQ_NEXT(elt, next);
+		if (elt->dev == dev) {
+			TAILQ_REMOVE(&socket->dev_queue, elt, next);
+			rte_free(elt);
+			pmd->socket_filename = NULL;
+		}
+	}
+
+	/* remove socket, if this was the last device using it */
+	if (TAILQ_EMPTY(&socket->dev_queue)) {
+		rte_hash_del_key(hash, socket->filename);
+		if (socket->listener) {
+			/* remove listener socket file,
+			 * so we can create new one later.
+			 */
+			remove(socket->filename);
+		}
+		rte_free(socket);
+	}
+}
+
+int
+memif_connect_master(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	memset(pmd->remote_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+	return 0;
+}
+
+int
+memif_connect_slave(struct rte_eth_dev *dev)
+{
+	int sockfd;
+	int ret;
+	struct sockaddr_un sun;
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	memset(pmd->remote_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+
+	sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+	if (sockfd < 0) {
+		MIF_LOG(ERR, "%s: Failed to open socket.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	sun.sun_family = AF_UNIX;
+
+	memcpy(sun.sun_path, pmd->socket_filename, sizeof(sun.sun_path) - 1);
+
+	ret = connect(sockfd, (struct sockaddr *)&sun,
+		      sizeof(struct sockaddr_un));
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to connect socket: %s.",
+			rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+		goto error;
+	}
+
+	MIF_LOG(DEBUG, "%s: Memif socket: %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+
+	pmd->cc = rte_zmalloc("memif-cc",
+			      sizeof(struct memif_control_channel), 0);
+	if (pmd->cc == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate control channel.",
+			rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	pmd->cc->intr_handle.fd = sockfd;
+	pmd->cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	pmd->cc->socket = NULL;
+	pmd->cc->dev = dev;
+	TAILQ_INIT(&pmd->cc->msg_queue);
+
+	ret = rte_intr_callback_register(&pmd->cc->intr_handle,
+					 memif_intr_handler, pmd->cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to register interrupt callback "
+			"for control fd", rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	return 0;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (pmd->cc != NULL) {
+		rte_free(pmd->cc);
+		pmd->cc = NULL;
+	}
+	return -1;
+}
diff --git a/drivers/net/memif/memif_socket.h b/drivers/net/memif/memif_socket.h
new file mode 100644
index 000000000..db293e200
--- /dev/null
+++ b/drivers/net/memif/memif_socket.h
@@ -0,0 +1,105 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_SOCKET_H_
+#define _MEMIF_SOCKET_H_
+
+#include <sys/queue.h>
+
+/**
+ * Remove device from socket device list. If no device is left on the socket,
+ * remove the socket as well.
+ *
+ * @param dev
+ *   memif device
+ */
+void memif_socket_remove_device(struct rte_eth_dev *dev);
+
+/**
+ * Enqueue disconnect message to control channel message queue.
+ *
+ * @param cc
+ *   control channel
+ * @param reason
+ *   const string stating disconnect reason (96 characters)
+ * @param err_code
+ *   error code
+ */
+void memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			      int err_code);
+
+/**
+ * Initialize memif socket for specified device. If socket doesn't exist, create socket.
+ *
+ * @param dev
+ *   memif device
+ * @param socket_filename
+ *   socket filename
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename);
+
+/**
+ * Disconnect memif device. Close control channel and shared memory.
+ *
+ * @param dev
+ *   memif device
+ */
+void memif_disconnect(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, enable connection establishment.
+ *
+ * @param dev
+ *   memif device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_master(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, send connection request.
+ *
+ * @param dev
+ *   memif device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_slave(struct rte_eth_dev *dev);
+
+struct memif_socket_dev_list_elt {
+	TAILQ_ENTRY(memif_socket_dev_list_elt) next;
+	struct rte_eth_dev *dev;		/**< pointer to device internals */
+	char dev_name[RTE_ETH_NAME_MAX_LEN];
+};
+
+#define MEMIF_SOCKET_HASH_NAME			"memif-sh"
+struct memif_socket {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	char filename[256];			/**< socket filename */
+
+	TAILQ_HEAD(, memif_socket_dev_list_elt) dev_queue;
+	/**< Queue of devices using this socket */
+	uint8_t listener;			/**< if not zero socket is listener */
+};
+
+/* Control message queue. */
+struct memif_msg_queue_elt {
+	memif_msg_t msg;			/**< control message */
+	TAILQ_ENTRY(memif_msg_queue_elt) next;
+	int fd;					/**< fd to be sent to peer */
+};
+
+struct memif_control_channel {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	TAILQ_HEAD(, memif_msg_queue_elt) msg_queue; /**< control message queue */
+	struct memif_socket *socket;		/**< pointer to socket */
+	struct rte_eth_dev *dev;		/**< pointer to device */
+};
+
+#endif				/* MEMIF_SOCKET_H */
diff --git a/drivers/net/memif/meson.build b/drivers/net/memif/meson.build
new file mode 100644
index 000000000..287a30ef2
--- /dev/null
+++ b/drivers/net/memif/meson.build
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+if host_machine.system() != 'linux'
+        build = false
+endif
+
+sources = files('rte_eth_memif.c',
+		'memif_socket.c')
+
+allow_experimental_apis = true
+# Experimantal APIs:
+# - rte_intr_callback_unregister_pending
+
+deps += ['hash']
diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
new file mode 100644
index 000000000..4e6eeb7b6
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.c
@@ -0,0 +1,1207 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <linux/if_ether.h>
+#include <errno.h>
+#include <sys/eventfd.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+#define ETH_MEMIF_ID_ARG		"id"
+#define ETH_MEMIF_ROLE_ARG		"role"
+#define ETH_MEMIF_PKT_BUFFER_SIZE_ARG	"bsize"
+#define ETH_MEMIF_RING_SIZE_ARG		"rsize"
+#define ETH_MEMIF_SOCKET_ARG		"socket"
+#define ETH_MEMIF_MAC_ARG		"mac"
+#define ETH_MEMIF_ZC_ARG		"zero-copy"
+#define ETH_MEMIF_SECRET_ARG		"secret"
+
+static const char *valid_arguments[] = {
+	ETH_MEMIF_ID_ARG,
+	ETH_MEMIF_ROLE_ARG,
+	ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
+	ETH_MEMIF_RING_SIZE_ARG,
+	ETH_MEMIF_SOCKET_ARG,
+	ETH_MEMIF_MAC_ARG,
+	ETH_MEMIF_ZC_ARG,
+	ETH_MEMIF_SECRET_ARG,
+	NULL
+};
+
+const char *
+memif_version(void)
+{
+	return ("memif-" RTE_STR(MEMIF_VERSION_MAJOR) "." RTE_STR(MEMIF_VERSION_MINOR));
+}
+
+static void
+memif_dev_info(struct rte_eth_dev *dev __rte_unused, struct rte_eth_dev_info *dev_info)
+{
+	dev_info->max_mac_addrs = 1;
+	dev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;
+	dev_info->max_rx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->max_tx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->min_rx_bufsize = 0;
+}
+
+static memif_ring_t *
+memif_get_ring(struct pmd_internals *pmd, memif_ring_type_t type, uint16_t ring_num)
+{
+	/* rings only in region 0 */
+	void *p = pmd->regions[0]->addr;
+	int ring_size = sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size);
+
+	p = (uint8_t *)p + (ring_num + type * pmd->run.num_s2m_rings) * ring_size;
+
+	return (memif_ring_t *)p;
+}
+
+static void *
+memif_get_buffer(struct pmd_internals *pmd, memif_desc_t *d)
+{
+	return ((uint8_t *)pmd->regions[d->region]->addr + d->offset);
+}
+
+static int
+memif_pktmbuf_chain(struct rte_mbuf *head, struct rte_mbuf *cur_tail,
+		    struct rte_mbuf *tail)
+{
+	/* Check for number-of-segments-overflow */
+	if (unlikely(head->nb_segs + tail->nb_segs > RTE_MBUF_MAX_NB_SEGS))
+		return -EOVERFLOW;
+
+	/* Chain 'tail' onto the old tail */
+	cur_tail->next = tail;
+
+	/* accumulate number of segments and total length. */
+	head->nb_segs = (uint16_t)(head->nb_segs + tail->nb_segs);
+
+	tail->pkt_len = tail->data_len;
+	head->pkt_len += tail->pkt_len;
+
+	return 0;
+}
+
+static uint16_t
+eth_memif_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t cur_slot, last_slot, n_slots, ring_size, mask, s0;
+	uint16_t n_rx_pkts = 0;
+	uint16_t mbuf_size = rte_pktmbuf_data_room_size(mq->mempool) -
+		RTE_PKTMBUF_HEADROOM;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf, *mbuf_head, *mbuf_tail;
+	uint64_t b;
+	ssize_t size __rte_unused;
+	uint16_t head;
+	int ret;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	/* consume interrupt */
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0)
+		size = read(mq->intr_handle.fd, &b, sizeof(b));
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	cur_slot = (type == MEMIF_RING_S2M) ? mq->last_head : mq->last_tail;
+	last_slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+	if (cur_slot == last_slot)
+		goto refill;
+	n_slots = last_slot - cur_slot;
+
+	while (n_slots && n_rx_pkts < nb_pkts) {
+		mbuf_head = rte_pktmbuf_alloc(mq->mempool);
+		if (unlikely(mbuf_head == NULL))
+			goto no_free_bufs;
+		mbuf = mbuf_head;
+		mbuf->port = mq->in_port;
+
+next_slot:
+		s0 = cur_slot & mask;
+		d0 = &ring->desc[s0];
+
+		src_len = d0->length;
+		dst_off = 0;
+		src_off = 0;
+
+		do {
+			dst_len = mbuf_size - dst_off;
+			if (dst_len == 0) {
+				dst_off = 0;
+				dst_len = mbuf_size;
+
+				/* store pointer to tail */
+				mbuf_tail = mbuf;
+				mbuf = rte_pktmbuf_alloc(mq->mempool);
+				if (unlikely(mbuf == NULL))
+					goto no_free_bufs;
+				mbuf->port = mq->in_port;
+				ret = memif_pktmbuf_chain(mbuf_head, mbuf_tail, mbuf);
+				if (unlikely(ret < 0)) {
+					MIF_LOG(ERR, "%s: number-of-segments-overflow",
+						rte_vdev_device_name(pmd->vdev));
+					rte_pktmbuf_free(mbuf);
+					goto no_free_bufs;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			rte_pktmbuf_data_len(mbuf) += cp_len;
+			rte_pktmbuf_pkt_len(mbuf) = rte_pktmbuf_data_len(mbuf);
+			if (mbuf != mbuf_head)
+				rte_pktmbuf_pkt_len(mbuf_head) += cp_len;
+
+			memcpy(rte_pktmbuf_mtod_offset(mbuf, void *, dst_off),
+			       (uint8_t *)memif_get_buffer(pmd, d0) + src_off, cp_len);
+
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+		} while (src_len);
+
+		cur_slot++;
+		n_slots--;
+
+		if (d0->flags & MEMIF_DESC_FLAG_NEXT)
+			goto next_slot;
+
+		mq->n_bytes += rte_pktmbuf_pkt_len(mbuf_head);
+		*bufs++ = mbuf_head;
+		n_rx_pkts++;
+	}
+
+no_free_bufs:
+	if (type == MEMIF_RING_S2M) {
+		rte_mb();
+		ring->tail = cur_slot;
+		mq->last_head = cur_slot;
+	} else {
+		mq->last_tail = cur_slot;
+	}
+
+refill:
+	if (type == MEMIF_RING_M2S) {
+		head = ring->head;
+		n_slots = ring_size - head + mq->last_tail;
+
+		while (n_slots--) {
+			s0 = head++ & mask;
+			d0 = &ring->desc[s0];
+			d0->length = pmd->run.pkt_buffer_size;
+		}
+		rte_mb();
+		ring->head = head;
+	}
+
+	mq->n_pkts += n_rx_pkts;
+	return n_rx_pkts;
+}
+
+static uint16_t
+eth_memif_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t slot, saved_slot, n_free, ring_size, mask, n_tx_pkts = 0;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head;
+	uint64_t a;
+	ssize_t size;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	n_free = ring->tail - mq->last_tail;
+	mq->last_tail += n_free;
+	slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+
+	if (type == MEMIF_RING_S2M)
+		n_free = ring_size - ring->head + mq->last_tail;
+	else
+		n_free = ring->head - ring->tail;
+
+	while (n_tx_pkts < nb_pkts && n_free) {
+		mbuf_head = *bufs++;
+		mbuf = mbuf_head;
+
+		saved_slot = slot;
+		d0 = &ring->desc[slot & mask];
+		dst_off = 0;
+		dst_len = (type == MEMIF_RING_S2M) ?
+			pmd->run.pkt_buffer_size : d0->length;
+
+next_in_chain:
+		src_off = 0;
+		src_len = rte_pktmbuf_data_len(mbuf);
+
+		while (src_len) {
+			if (dst_len == 0) {
+				if (n_free) {
+					slot++;
+					n_free--;
+					d0->flags |= MEMIF_DESC_FLAG_NEXT;
+					d0 = &ring->desc[slot & mask];
+					dst_off = 0;
+					dst_len = (type == MEMIF_RING_S2M) ?
+					    pmd->run.pkt_buffer_size : d0->length;
+					d0->flags = 0;
+				} else {
+					slot = saved_slot;
+					goto no_free_slots;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			memcpy((uint8_t *)memif_get_buffer(pmd, d0) + dst_off,
+			       rte_pktmbuf_mtod_offset(mbuf, void *, src_off),
+			       cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+			dst_len -= cp_len;
+
+			d0->length = dst_off;
+		}
+
+		if (rte_pktmbuf_is_contiguous(mbuf) == 0) {
+			mbuf = mbuf->next;
+			goto next_in_chain;
+		}
+
+		n_tx_pkts++;
+		slot++;
+		n_free--;
+		rte_pktmbuf_free(mbuf_head);
+	}
+
+no_free_slots:
+	rte_mb();
+	if (type == MEMIF_RING_S2M)
+		ring->head = slot;
+	else
+		ring->tail = slot;
+
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
+		a = 1;
+		size = write(mq->intr_handle.fd, &a, sizeof(a));
+		if (unlikely(size < 0)) {
+			MIF_LOG(WARNING,
+				"%s: Failed to send interrupt. %s",
+				rte_vdev_device_name(pmd->vdev), strerror(errno));
+		}
+	}
+
+	mq->n_err += nb_pkts - n_tx_pkts;
+	mq->n_pkts += n_tx_pkts;
+	return n_tx_pkts;
+}
+
+void
+memif_free_regions(struct pmd_internals *pmd)
+{
+	int i;
+	struct memif_region *r;
+
+	/* regions are allocated contiguously, so it's
+	 * enough to loop until 'pmd->regions_num'
+	 */
+	for (i = 0; i < pmd->regions_num; i++) {
+		r = pmd->regions[i];
+		if (r != NULL) {
+			if (r->addr != NULL) {
+				munmap(r->addr, r->region_size);
+				if (r->fd > 0) {
+					close(r->fd);
+					r->fd = -1;
+				}
+			}
+			rte_free(r);
+			pmd->regions[i] = NULL;
+		}
+	}
+	pmd->regions_num = 0;
+}
+
+static int
+memif_region_init_shm(struct pmd_internals *pmd, uint8_t has_buffers)
+{
+	char shm_name[ETH_MEMIF_SHM_NAME_SIZE];
+	int ret = 0;
+	struct memif_region *r;
+
+	if (pmd->regions_num >= ETH_MEMIF_MAX_REGION_NUM) {
+		MIF_LOG(ERR, "%s: Too many regions.", rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	/* calculate buffer offset */
+	r->pkt_buffer_offset = (pmd->run.num_s2m_rings + pmd->run.num_m2s_rings) *
+	    (sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size));
+
+	r->region_size = r->pkt_buffer_offset;
+	/* if region has buffers, add buffers size to region_size */
+	if (has_buffers == 1)
+		r->region_size += (uint32_t)(pmd->run.pkt_buffer_size *
+			(1 << pmd->run.log2_ring_size) *
+			(pmd->run.num_s2m_rings +
+			 pmd->run.num_m2s_rings));
+
+	memset(shm_name, 0, sizeof(char) * ETH_MEMIF_SHM_NAME_SIZE);
+	snprintf(shm_name, ETH_MEMIF_SHM_NAME_SIZE, "memif_region_%d",
+		 pmd->regions_num);
+
+	r->fd = memfd_create(shm_name, MFD_ALLOW_SEALING);
+	if (r->fd < 0) {
+		MIF_LOG(ERR, "%s: Failed to create shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		ret = -1;
+		goto error;
+	}
+
+	ret = fcntl(r->fd, F_ADD_SEALS, F_SEAL_SHRINK);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to add seals to shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		goto error;
+	}
+
+	ret = ftruncate(r->fd, r->region_size);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to truncate shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		goto error;
+	}
+
+	r->addr = mmap(NULL, r->region_size, PROT_READ |
+		       PROT_WRITE, MAP_SHARED, r->fd, 0);
+	if (r->addr == MAP_FAILED) {
+		MIF_LOG(ERR, "%s: Failed to mmap shm region: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(ret));
+		ret = -1;
+		goto error;
+	}
+
+	pmd->regions[pmd->regions_num] = r;
+	pmd->regions_num++;
+
+	return ret;
+
+error:
+	if (r->fd > 0)
+		close(r->fd);
+	r->fd = -1;
+
+	return ret;
+}
+
+static int
+memif_regions_init(struct pmd_internals *pmd)
+{
+	int ret;
+
+	/* create one buffer region */
+	ret = memif_region_init_shm(pmd, /* has buffer */ 1);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+static void
+memif_init_rings(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_ring_t *ring;
+	int i, j;
+	uint16_t slot;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = i * (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 0;
+			ring->desc[j].offset = pmd->regions[0]->pkt_buffer_offset +
+				(uint32_t)(slot * pmd->run.pkt_buffer_size);
+			ring->desc[j].length = pmd->run.pkt_buffer_size;
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = (i + pmd->run.num_s2m_rings) *
+			    (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 0;
+			ring->desc[j].offset = pmd->regions[0]->pkt_buffer_offset +
+				(uint32_t)(slot * pmd->run.pkt_buffer_size);
+			ring->desc[j].length = pmd->run.pkt_buffer_size;
+		}
+	}
+}
+
+/* called only by slave */
+static void
+memif_init_queues(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = dev->data->tx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->ring_offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for tx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = dev->data->rx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->ring_offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for rx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+}
+
+int
+memif_init_regions_and_queues(struct rte_eth_dev *dev)
+{
+	int ret;
+
+	ret = memif_regions_init(dev->data->dev_private);
+	if (ret < 0)
+		return ret;
+
+	memif_init_rings(dev);
+
+	memif_init_queues(dev);
+
+	return 0;
+}
+
+int
+memif_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_region *mr;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		mr = pmd->regions[i];
+		if (mr != NULL) {
+			if (mr->addr == NULL) {
+				if (mr->fd < 0)
+					return -1;
+				mr->addr = mmap(NULL, mr->region_size,
+						PROT_READ | PROT_WRITE,
+						MAP_SHARED, mr->fd, 0);
+				if (mr->addr == NULL)
+					return -1;
+			}
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
+			    mq->ring_offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_MASTER)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
+			    mq->ring_offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_SLAVE)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags |= ETH_MEMIF_FLAG_CONNECTED;
+	dev->data->dev_link.link_status = ETH_LINK_UP;
+	MIF_LOG(INFO, "%s: Connected.", rte_vdev_device_name(pmd->vdev));
+	return 0;
+}
+
+static int
+memif_dev_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int ret = 0;
+
+	switch (pmd->role) {
+	case MEMIF_ROLE_SLAVE:
+		ret = memif_connect_slave(dev);
+		break;
+	case MEMIF_ROLE_MASTER:
+		ret = memif_connect_master(dev);
+		break;
+	default:
+		MIF_LOG(ERR, "%s: Unknown role: %d.",
+			rte_vdev_device_name(pmd->vdev), pmd->role);
+		ret = -1;
+		break;
+	}
+
+	return ret;
+}
+
+static void
+memif_dev_close(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memif_msg_enq_disconnect(pmd->cc, "Device closed", 0);
+	memif_disconnect(dev);
+
+	memif_socket_remove_device(dev);
+}
+
+static int
+memif_dev_configure(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	/*
+	 * SLAVE - TXQ
+	 * MASTER - RXQ
+	 */
+	pmd->cfg.num_s2m_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_tx_queues : dev->data->nb_rx_queues;
+
+	/*
+	 * SLAVE - RXQ
+	 * MASTER - TXQ
+	 */
+	pmd->cfg.num_m2s_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_rx_queues : dev->data->nb_tx_queues;
+
+	return 0;
+}
+
+static int
+memif_tx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_tx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("tx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate tx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type =
+	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_S2M : MEMIF_RING_M2S;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->pmd = pmd;
+	dev->data->tx_queues[qid] = mq;
+
+	return 0;
+}
+
+static int
+memif_rx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_rx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_rxconf *rx_conf __rte_unused,
+		     struct rte_mempool *mb_pool)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("rx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate rx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type = (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_M2S : MEMIF_RING_S2M;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->mempool = mb_pool;
+	mq->in_port = dev->data->port_id;
+	mq->pmd = pmd;
+	dev->data->rx_queues[qid] = mq;
+
+	return 0;
+}
+
+static void
+memif_queue_release(void *queue)
+{
+	struct memif_queue *mq = (struct memif_queue *)queue;
+
+	if (!mq)
+		return;
+
+	rte_free(mq);
+}
+
+static int
+memif_link_update(struct rte_eth_dev *dev __rte_unused,
+		  int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
+static int
+memif_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+	uint8_t tmp, nq;
+
+	stats->ipackets = 0;
+	stats->ibytes = 0;
+	stats->opackets = 0;
+	stats->obytes = 0;
+	stats->oerrors = 0;
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_s2m_rings :
+	    pmd->run.num_m2s_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* RX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->rx_queues[i];
+		stats->q_ipackets[i] = mq->n_pkts;
+		stats->q_ibytes[i] = mq->n_bytes;
+		stats->ipackets += mq->n_pkts;
+		stats->ibytes += mq->n_bytes;
+	}
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_m2s_rings :
+	    pmd->run.num_s2m_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* TX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->tx_queues[i];
+		stats->q_opackets[i] = mq->n_pkts;
+		stats->q_obytes[i] = mq->n_bytes;
+		stats->opackets += mq->n_pkts;
+		stats->obytes += mq->n_bytes;
+		stats->oerrors += mq->n_err;
+	}
+	return 0;
+}
+
+static void
+memif_stats_reset(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int i;
+	struct memif_queue *mq;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->tx_queues[i] :
+		    dev->data->rx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->rx_queues[i] :
+		    dev->data->tx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+}
+
+static int
+memif_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	MIF_LOG(WARNING, "%s: Interrupt mode not supported.",
+		rte_vdev_device_name(pmd->vdev));
+
+	return -1;
+}
+
+static int
+memif_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd __rte_unused = dev->data->dev_private;
+
+	return 0;
+}
+
+static const struct eth_dev_ops ops = {
+	.dev_start = memif_dev_start,
+	.dev_close = memif_dev_close,
+	.dev_infos_get = memif_dev_info,
+	.dev_configure = memif_dev_configure,
+	.tx_queue_setup = memif_tx_queue_setup,
+	.rx_queue_setup = memif_rx_queue_setup,
+	.rx_queue_release = memif_queue_release,
+	.tx_queue_release = memif_queue_release,
+	.rx_queue_intr_enable = memif_rx_queue_intr_enable,
+	.rx_queue_intr_disable = memif_rx_queue_intr_disable,
+	.link_update = memif_link_update,
+	.stats_get = memif_stats_get,
+	.stats_reset = memif_stats_reset,
+};
+
+static int
+memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
+	     memif_interface_id_t id, uint32_t flags,
+	     const char *socket_filename,
+	     memif_log2_ring_size_t log2_ring_size,
+	     uint16_t pkt_buffer_size, const char *secret,
+	     struct rte_ether_addr *ether_addr)
+{
+	int ret = 0;
+	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev_data *data;
+	struct pmd_internals *pmd;
+	const unsigned int numa_node = vdev->device.numa_node;
+	const char *name = rte_vdev_device_name(vdev);
+
+	if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
+		MIF_LOG(ERR, "Zero-copy slave not supported.");
+		return -1;
+	}
+
+	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
+	if (eth_dev == NULL) {
+		MIF_LOG(ERR, "%s: Unable to allocate device struct.", name);
+		return -1;
+	}
+
+	pmd = eth_dev->data->dev_private;
+	memset(pmd, 0, sizeof(*pmd));
+
+	pmd->vdev = vdev;
+	pmd->id = id;
+	pmd->flags = flags;
+	pmd->flags |= ETH_MEMIF_FLAG_DISABLED;
+	pmd->role = role;
+
+	ret = memif_socket_init(eth_dev, socket_filename);
+	if (ret < 0)
+		return ret;
+
+	memset(pmd->secret, 0, sizeof(char) * ETH_MEMIF_SECRET_SIZE);
+	if (secret != NULL)
+		strlcpy(pmd->secret, secret, sizeof(pmd->secret));
+
+	pmd->cfg.log2_ring_size = log2_ring_size;
+	/* set in .dev_configure() */
+	pmd->cfg.num_s2m_rings = 0;
+	pmd->cfg.num_m2s_rings = 0;
+
+	pmd->cfg.pkt_buffer_size = pkt_buffer_size;
+
+	data = eth_dev->data;
+	data->dev_private = pmd;
+	data->numa_node = numa_node;
+	data->mac_addrs = ether_addr;
+
+	eth_dev->dev_ops = &ops;
+	eth_dev->device = &vdev->device;
+	eth_dev->rx_pkt_burst = eth_memif_rx;
+	eth_dev->tx_pkt_burst = eth_memif_tx;
+
+	eth_dev->data->dev_flags |= RTE_ETH_DEV_CLOSE_REMOVE;
+
+	rte_eth_dev_probing_finish(eth_dev);
+
+	return 0;
+}
+
+static int
+memif_set_role(const char *key __rte_unused, const char *value,
+	       void *extra_args)
+{
+	enum memif_role_t *role = (enum memif_role_t *)extra_args;
+
+	if (strstr(value, "master") != NULL) {
+		*role = MEMIF_ROLE_MASTER;
+	} else if (strstr(value, "slave") != NULL) {
+		*role = MEMIF_ROLE_SLAVE;
+	} else {
+		MIF_LOG(ERR, "Unknown role: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_zc(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	uint32_t *flags = (uint32_t *)extra_args;
+
+	if (strstr(value, "yes") != NULL) {
+		*flags |= ETH_MEMIF_FLAG_ZERO_COPY;
+	} else if (strstr(value, "no") != NULL) {
+		*flags &= ~ETH_MEMIF_FLAG_ZERO_COPY;
+	} else {
+		MIF_LOG(ERR, "Failed to parse zero-copy param: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_id(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	memif_interface_id_t *id = (memif_interface_id_t *)extra_args;
+
+	/* even if parsing fails, 0 is a valid id */
+	*id = strtoul(value, NULL, 10);
+	return 0;
+}
+
+static int
+memif_set_bs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	uint16_t *pkt_buffer_size = (uint16_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > 0xFFFF) {
+		MIF_LOG(ERR, "Invalid buffer size: %s.", value);
+		return -EINVAL;
+	}
+	*pkt_buffer_size = tmp;
+	return 0;
+}
+
+static int
+memif_set_rs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	memif_log2_ring_size_t *log2_ring_size =
+	    (memif_log2_ring_size_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > ETH_MEMIF_MAX_LOG2_RING_SIZE) {
+		MIF_LOG(ERR, "Invalid ring size: %s (max %u).",
+			value, ETH_MEMIF_MAX_LOG2_RING_SIZE);
+		return -EINVAL;
+	}
+	*log2_ring_size = tmp;
+	return 0;
+}
+
+/* check if directory exists and if we have permission to read/write */
+static int
+memif_check_socket_filename(const char *filename)
+{
+	char *dir = NULL, *tmp;
+	uint32_t idx;
+	int ret = 0;
+
+	tmp = strrchr(filename, '/');
+	if (tmp != NULL) {
+		idx = tmp - filename;
+		dir = rte_zmalloc("memif_tmp", sizeof(char) * (idx + 1), 0);
+		if (dir == NULL) {
+			MIF_LOG(ERR, "Failed to allocate memory.");
+			return -1;
+		}
+		strlcpy(dir, filename, sizeof(char) * (idx + 1));
+	}
+
+	if (dir == NULL || (faccessat(-1, dir, F_OK | R_OK |
+					W_OK, AT_EACCESS) < 0)) {
+		MIF_LOG(ERR, "Invalid socket directory.");
+		ret = -EINVAL;
+	}
+
+	if (dir != NULL)
+		rte_free(dir);
+
+	return ret;
+}
+
+static int
+memif_set_socket_filename(const char *key __rte_unused, const char *value,
+			  void *extra_args)
+{
+	const char **socket_filename = (const char **)extra_args;
+
+	*socket_filename = value;
+	return memif_check_socket_filename(*socket_filename);
+}
+
+static int
+memif_set_mac(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	struct rte_ether_addr *ether_addr = (struct rte_ether_addr *)extra_args;
+	int ret = 0;
+
+	ret = sscanf(value, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
+	       &ether_addr->addr_bytes[0], &ether_addr->addr_bytes[1],
+	       &ether_addr->addr_bytes[2], &ether_addr->addr_bytes[3],
+	       &ether_addr->addr_bytes[4], &ether_addr->addr_bytes[5]);
+	if (ret != 6)
+		MIF_LOG(WARNING, "Failed to parse mac '%s'.", value);
+	return 0;
+}
+
+static int
+memif_set_secret(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	const char **secret = (const char **)extra_args;
+
+	*secret = value;
+	return 0;
+}
+
+static int
+rte_pmd_memif_probe(struct rte_vdev_device *vdev)
+{
+	RTE_BUILD_BUG_ON(sizeof(memif_msg_t) != 128);
+	RTE_BUILD_BUG_ON(sizeof(memif_desc_t) != 16);
+	int ret = 0;
+	struct rte_kvargs *kvlist;
+	const char *name = rte_vdev_device_name(vdev);
+	enum memif_role_t role = MEMIF_ROLE_SLAVE;
+	memif_interface_id_t id = 0;
+	uint16_t pkt_buffer_size = ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE;
+	memif_log2_ring_size_t log2_ring_size = ETH_MEMIF_DEFAULT_RING_SIZE;
+	const char *socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME;
+	uint32_t flags = 0;
+	const char *secret = NULL;
+	struct rte_ether_addr *ether_addr = rte_zmalloc("",
+		sizeof(struct rte_ether_addr), 0);
+
+	rte_eth_random_addr(ether_addr->addr_bytes);
+
+	MIF_LOG(INFO, "Initialize MEMIF: %s.", name);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		MIF_LOG(ERR, "Multi-processing not supported for memif.");
+		/* TODO:
+		 * Request connection information.
+		 *
+		 * Once memif in the primary process is connected,
+		 * broadcast connection information.
+		 */
+		return -1;
+	}
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_arguments);
+
+	/* parse parameters */
+	if (kvlist != NULL) {
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ROLE_ARG,
+					 &memif_set_role, &role);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ID_ARG,
+					 &memif_set_id, &id);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
+					 &memif_set_bs, &pkt_buffer_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_RING_SIZE_ARG,
+					 &memif_set_rs, &log2_ring_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SOCKET_ARG,
+					 &memif_set_socket_filename,
+					 (void *)(&socket_filename));
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_MAC_ARG,
+					 &memif_set_mac, ether_addr);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ZC_ARG,
+					 &memif_set_zc, &flags);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SECRET_ARG,
+					 &memif_set_secret, (void *)(&secret));
+		if (ret < 0)
+			goto exit;
+	}
+
+	/* create interface */
+	ret = memif_create(vdev, role, id, flags, socket_filename,
+			   log2_ring_size, pkt_buffer_size, secret, ether_addr);
+
+exit:
+	if (kvlist != NULL)
+		rte_kvargs_free(kvlist);
+	return ret;
+}
+
+static int
+rte_pmd_memif_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_eth_dev *eth_dev;
+	int i;
+
+	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+	if (eth_dev == NULL)
+		return 0;
+
+	for (i = 0; i < eth_dev->data->nb_rx_queues; i++)
+		(*eth_dev->dev_ops->rx_queue_release)(eth_dev->data->rx_queues[i]);
+	for (i = 0; i < eth_dev->data->nb_tx_queues; i++)
+		(*eth_dev->dev_ops->rx_queue_release)(eth_dev->data->tx_queues[i]);
+
+	rte_free(eth_dev->process_private);
+	eth_dev->process_private = NULL;
+
+	rte_eth_dev_close(eth_dev->data->port_id);
+
+	return 0;
+}
+
+static struct rte_vdev_driver pmd_memif_drv = {
+	.probe = rte_pmd_memif_probe,
+	.remove = rte_pmd_memif_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_memif, pmd_memif_drv);
+
+RTE_PMD_REGISTER_PARAM_STRING(net_memif,
+			      ETH_MEMIF_ID_ARG "=<int>"
+			      ETH_MEMIF_ROLE_ARG "=master|slave"
+			      ETH_MEMIF_PKT_BUFFER_SIZE_ARG "=<int>"
+			      ETH_MEMIF_RING_SIZE_ARG "=<int>"
+			      ETH_MEMIF_SOCKET_ARG "=<string>"
+			      ETH_MEMIF_MAC_ARG "=xx:xx:xx:xx:xx:xx"
+			      ETH_MEMIF_ZC_ARG "=yes|no"
+			      ETH_MEMIF_SECRET_ARG "=<string>");
+
+int memif_logtype;
+
+RTE_INIT(memif_init_log)
+{
+	memif_logtype = rte_log_register("pmd.net.memif");
+	if (memif_logtype >= 0)
+		rte_log_set_level(memif_logtype, RTE_LOG_NOTICE);
+}
diff --git a/drivers/net/memif/rte_eth_memif.h b/drivers/net/memif/rte_eth_memif.h
new file mode 100644
index 000000000..5f631e925
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.h
@@ -0,0 +1,212 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _RTE_ETH_MEMIF_H_
+#define _RTE_ETH_MEMIF_H_
+
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE
+#endif				/* GNU_SOURCE */
+
+#include <sys/queue.h>
+
+#include <rte_ethdev_driver.h>
+#include <rte_ether.h>
+#include <rte_interrupts.h>
+
+#include "memif.h"
+
+#define ETH_MEMIF_DEFAULT_SOCKET_FILENAME	"/run/memif.sock"
+#define ETH_MEMIF_DEFAULT_RING_SIZE		10
+#define ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE	2048
+
+#define ETH_MEMIF_MAX_NUM_Q_PAIRS		255
+#define ETH_MEMIF_MAX_LOG2_RING_SIZE		14
+#define ETH_MEMIF_MAX_REGION_NUM		256
+
+#define ETH_MEMIF_SHM_NAME_SIZE			32
+#define ETH_MEMIF_DISC_STRING_SIZE		96
+#define ETH_MEMIF_SECRET_SIZE			24
+
+extern int memif_logtype;
+
+#define MIF_LOG(level, fmt, args...) \
+	rte_log(RTE_LOG_ ## level, memif_logtype, \
+		"%s(): " fmt "\n", __func__, ##args)
+
+enum memif_role_t {
+	MEMIF_ROLE_MASTER,
+	MEMIF_ROLE_SLAVE,
+};
+
+struct memif_region {
+	void *addr;				/**< shared memory address */
+	memif_region_size_t region_size;	/**< shared memory size */
+	int fd;					/**< shared memory file descriptor */
+	uint32_t pkt_buffer_offset;
+	/**< offset from 'addr' to first packet buffer */
+};
+
+struct memif_queue {
+	struct rte_mempool *mempool;		/**< mempool for RX packets */
+	struct pmd_internals *pmd;		/**< device internals */
+
+	memif_ring_type_t type;			/**< ring type */
+	memif_region_index_t region;		/**< shared memory region index */
+
+	uint16_t in_port;			/**< port id */
+
+	memif_region_offset_t ring_offset;
+	/**< ring offset from start of shm region (ring - memif_region.addr) */
+
+	uint16_t last_head;			/**< last ring head */
+	uint16_t last_tail;			/**< last ring tail */
+
+	/* rx/tx info */
+	uint64_t n_pkts;			/**< number of rx/tx packets */
+	uint64_t n_bytes;			/**< number of rx/tx bytes */
+	uint64_t n_err;				/**< number of tx errors */
+
+	memif_ring_t *ring;			/**< pointer to ring */
+
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+
+	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
+};
+
+struct pmd_internals {
+	memif_interface_id_t id;		/**< unique id */
+	enum memif_role_t role;			/**< device role */
+	uint32_t flags;				/**< device status flags */
+#define ETH_MEMIF_FLAG_CONNECTING	(1 << 0)
+/**< device is connecting */
+#define ETH_MEMIF_FLAG_CONNECTED	(1 << 1)
+/**< device is connected */
+#define ETH_MEMIF_FLAG_ZERO_COPY	(1 << 2)
+/**< device is zero-copy enabled */
+#define ETH_MEMIF_FLAG_DISABLED		(1 << 3)
+/**< device has not been configured and can not accept connection requests */
+
+	char *socket_filename;			/**< pointer to socket filename */
+	char secret[ETH_MEMIF_SECRET_SIZE]; /**< secret (optional security parameter) */
+
+	struct memif_control_channel *cc;	/**< control channel */
+
+	struct memif_region *regions[ETH_MEMIF_MAX_REGION_NUM];
+	/**< shared memory regions */
+	memif_region_index_t regions_num;	/**< number of regions */
+
+	/* remote info */
+	char remote_name[RTE_DEV_NAME_MAX_LEN];		/**< remote app name */
+	char remote_if_name[RTE_DEV_NAME_MAX_LEN];	/**< remote peer name */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t pkt_buffer_size;	/**< buffer size */
+	} cfg;					/**< Configured parameters (max values) */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t pkt_buffer_size;	/**< buffer size */
+	} run;
+	/**< Parameters used in active connection */
+
+	char local_disc_string[ETH_MEMIF_DISC_STRING_SIZE];
+	/**< local disconnect reason */
+	char remote_disc_string[ETH_MEMIF_DISC_STRING_SIZE];
+	/**< remote disconnect reason */
+
+	struct rte_vdev_device *vdev;		/**< vdev handle */
+};
+
+/**
+ * Unmap shared memory and free regions from memory.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_free_regions(struct pmd_internals *pmd);
+
+/**
+ * Finalize connection establishment process. Map shared memory file
+ * (master role), initialize ring queue, set link status up.
+ *
+ * @param dev
+ *   memif device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect(struct rte_eth_dev *dev);
+
+/**
+ * Create shared memory file and initialize ring queue.
+ * Only called by slave when establishing connection
+ *
+ * @param dev
+ *   memif device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_init_regions_and_queues(struct rte_eth_dev *dev);
+
+/**
+ * Get memif version string.
+ *
+ * @return
+ *   - memif version string
+ */
+const char *memif_version(void);
+
+#ifndef MFD_HUGETLB
+#ifndef __NR_memfd_create
+
+#if defined __x86_64__
+#define __NR_memfd_create 319
+#elif defined __x86_32__
+#define __NR_memfd_create 1073742143
+#elif defined __arm__
+#define __NR_memfd_create 385
+#elif defined __aarch64__
+#define __NR_memfd_create 279
+#elif defined __powerpc__
+#define __NR_memfd_create 360
+#elif defined __i386__
+#define __NR_memfd_create 356
+#else
+#error "__NR_memfd_create unknown for this architecture"
+#endif
+
+#endif				/* __NR_memfd_create */
+
+static inline int memfd_create(const char *name, unsigned int flags)
+{
+	return syscall(__NR_memfd_create, name, flags);
+}
+#endif				/* MFD_HUGETLB */
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef MFD_ALLOW_SEALING
+#define MFD_ALLOW_SEALING       0x0002U
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL     0x0001	/* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002	/* prevent file from shrinking */
+#define F_SEAL_GROW     0x0004	/* prevent file from growing */
+#define F_SEAL_WRITE    0x0008	/* prevent writes */
+#endif
+
+#endif				/* RTE_ETH_MEMIF_H */
diff --git a/drivers/net/memif/rte_pmd_memif_version.map b/drivers/net/memif/rte_pmd_memif_version.map
new file mode 100644
index 000000000..8861484fb
--- /dev/null
+++ b/drivers/net/memif/rte_pmd_memif_version.map
@@ -0,0 +1,4 @@
+DPDK_19.08 {
+
+        local: *;
+};
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index ed99896c3..b57073483 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -24,6 +24,7 @@ drivers = ['af_packet',
 	'ixgbe',
 	'kni',
 	'liquidio',
+	'memif',
 	'mlx4',
 	'mlx5',
 	'mvneta',
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 7c9b4b538..326b2a30b 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -174,6 +174,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KNI)        += -lrte_pmd_kni
 endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD)        += -lrte_pmd_lio
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF)      += -lrte_pmd_memif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -lrte_pmd_mlx4
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += -lrte_pmd_mlx5 -lmnl
 ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
--
2.17.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [PATCH v10] net/memif: introduce memory interface (memif) PMD
  2019-06-05 11:55               ` Ferruh Yigit
@ 2019-06-06  9:24                 ` Ferruh Yigit
  2019-06-06 10:25                   ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  0 siblings, 1 reply; 58+ messages in thread
From: Ferruh Yigit @ 2019-06-06  9:24 UTC (permalink / raw)
  To: Jakub Grajciar, dev

On 6/5/2019 12:55 PM, Ferruh Yigit wrote:
> On 5/31/2019 7:22 AM, Jakub Grajciar wrote:
>> Memory interface (memif), provides high performance
>> packet transfer over shared memory.
> 
> Almost there, can you please check below comments? I am hoping to merge next
> version of the patch.
> 
> Thanks,
> ferruh
> 
>>
>> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
> 
> <...>
> 
>> +static const char *valid_arguments[] = {
>> +	ETH_MEMIF_ID_ARG,
>> +	ETH_MEMIF_ROLE_ARG,
>> +	ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
>> +	ETH_MEMIF_RING_SIZE_ARG,
>> +	ETH_MEMIF_SOCKET_ARG,
>> +	ETH_MEMIF_MAC_ARG,
>> +	ETH_MEMIF_ZC_ARG,
>> +	ETH_MEMIF_SECRET_ARG,
>> +	NULL
>> +};
> 
> Checkpatch is giving following warning:
> 
> WARNING:STATIC_CONST_CHAR_ARRAY: static const char * array should probably be
> static const char * const
> #1885: FILE: drivers/net/memif/rte_eth_memif.c:39:
> +static const char *valid_arguments[] = {
> 
> <...>
> 
>> +static int
>> +rte_pmd_memif_remove(struct rte_vdev_device *vdev)
>> +{
>> +	struct rte_eth_dev *eth_dev;
>> +	int i;
>> +
>> +	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
>> +	if (eth_dev == NULL)
>> +		return 0;
>> +
>> +	for (i = 0; i < eth_dev->data->nb_rx_queues; i++)
>> +		(*eth_dev->dev_ops->rx_queue_release)(eth_dev->data->rx_queues[i]);
>> +	for (i = 0; i < eth_dev->data->nb_tx_queues; i++)
>> +		(*eth_dev->dev_ops->rx_queue_release)(eth_dev->data->tx_queues[i]);
> 
> Although they point same function, better to use 'dev_ops->tx_queue_release' for
> Tx queues.
> 
>> +
>> +	rte_free(eth_dev->process_private);
>> +	eth_dev->process_private = NULL;
> 
> "process_private" is not used in this PMD at all, no need to free it I think.
> 
>> +
>> +	rte_eth_dev_close(eth_dev->data->port_id);
> 
> There are two exit path from a PMD:
> 1) rte_eth_dev_close() API
> 2) rte_vdev_driver->remove() called by hotplug remove APIs ('rte_dev_remove()'
> or 'rte_eal_hotplug_remove()')
> 
> Both should clear all PMD allocated resources. Since you are calling
> 'rte_eth_dev_close() from this .remove() function, it makes sense to move all
> resource cleanup to .dev_close (like queue cleanup calls above).
> 

Hi Jakup,

Above comments seems not implemented in v11, can you please check them?
If anything is not clear feel free to reach me on the irc.

Thanks,
ferruh

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [PATCH v10] net/memif: introduce memory interface (memif) PMD
  2019-06-06  9:24                 ` Ferruh Yigit
@ 2019-06-06 10:25                   ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
  2019-06-06 11:18                     ` Ferruh Yigit
  0 siblings, 1 reply; 58+ messages in thread
From: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco) @ 2019-06-06 10:25 UTC (permalink / raw)
  To: Ferruh Yigit, dev



> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@intel.com>
> Sent: Thursday, June 6, 2019 11:24 AM
> To: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
> <jgrajcia@cisco.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v10] net/memif: introduce memory interface
> (memif) PMD
> 
> On 6/5/2019 12:55 PM, Ferruh Yigit wrote:
> > On 5/31/2019 7:22 AM, Jakub Grajciar wrote:
> >> Memory interface (memif), provides high performance packet transfer
> >> over shared memory.
> >
> > Almost there, can you please check below comments? I am hoping to
> > merge next version of the patch.
> >
> > Thanks,
> > ferruh
> >
> >>
> >> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
> >
> > <...>
> >
> >> +static const char *valid_arguments[] = {
> >> +	ETH_MEMIF_ID_ARG,
> >> +	ETH_MEMIF_ROLE_ARG,
> >> +	ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
> >> +	ETH_MEMIF_RING_SIZE_ARG,
> >> +	ETH_MEMIF_SOCKET_ARG,
> >> +	ETH_MEMIF_MAC_ARG,
> >> +	ETH_MEMIF_ZC_ARG,
> >> +	ETH_MEMIF_SECRET_ARG,
> >> +	NULL
> >> +};
> >
> > Checkpatch is giving following warning:
> >
> > WARNING:STATIC_CONST_CHAR_ARRAY: static const char * array should
> > probably be static const char * const
> > #1885: FILE: drivers/net/memif/rte_eth_memif.c:39:
> > +static const char *valid_arguments[] = {
> >
> > <...>
> >
> >> +static int
> >> +rte_pmd_memif_remove(struct rte_vdev_device *vdev) {
> >> +	struct rte_eth_dev *eth_dev;
> >> +	int i;
> >> +
> >> +	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
> >> +	if (eth_dev == NULL)
> >> +		return 0;
> >> +
> >> +	for (i = 0; i < eth_dev->data->nb_rx_queues; i++)
> >> +		(*eth_dev->dev_ops->rx_queue_release)(eth_dev->data-
> >rx_queues[i]);
> >> +	for (i = 0; i < eth_dev->data->nb_tx_queues; i++)
> >> +
> >> +(*eth_dev->dev_ops->rx_queue_release)(eth_dev->data->tx_queues[i]);
> >
> > Although they point same function, better to use
> > 'dev_ops->tx_queue_release' for Tx queues.
> >
> >> +
> >> +	rte_free(eth_dev->process_private);
> >> +	eth_dev->process_private = NULL;
> >
> > "process_private" is not used in this PMD at all, no need to free it I think.
> >
> >> +
> >> +	rte_eth_dev_close(eth_dev->data->port_id);
> >
> > There are two exit path from a PMD:
> > 1) rte_eth_dev_close() API
> > 2) rte_vdev_driver->remove() called by hotplug remove APIs
> ('rte_dev_remove()'
> > or 'rte_eal_hotplug_remove()')
> >
> > Both should clear all PMD allocated resources. Since you are calling
> > 'rte_eth_dev_close() from this .remove() function, it makes sense to
> > move all resource cleanup to .dev_close (like queue cleanup calls above).
> >
> 
> Hi Jakup,
> 
> Above comments seems not implemented in v11, can you please check them?
> If anything is not clear feel free to reach me on the irc.
> 


Sorry, I missed that mail. I'll get it fixed right away, but I don't understand what's wrong with 'static const char *valid_arguments[]...' other PMDs handle args the same way, can you please give me a hint?

Thanks

> Thanks,
> ferruh

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [PATCH v10] net/memif: introduce memory interface (memif) PMD
  2019-06-06 10:25                   ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
@ 2019-06-06 11:18                     ` Ferruh Yigit
  0 siblings, 0 replies; 58+ messages in thread
From: Ferruh Yigit @ 2019-06-06 11:18 UTC (permalink / raw)
  To: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco), dev

On 6/6/2019 11:25 AM, Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at
Cisco) wrote:
> 
> 
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yigit@intel.com>
>> Sent: Thursday, June 6, 2019 11:24 AM
>> To: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
>> <jgrajcia@cisco.com>; dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v10] net/memif: introduce memory interface
>> (memif) PMD
>>
>> On 6/5/2019 12:55 PM, Ferruh Yigit wrote:
>>> On 5/31/2019 7:22 AM, Jakub Grajciar wrote:
>>>> Memory interface (memif), provides high performance packet transfer
>>>> over shared memory.
>>>
>>> Almost there, can you please check below comments? I am hoping to
>>> merge next version of the patch.
>>>
>>> Thanks,
>>> ferruh
>>>
>>>>
>>>> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
>>>
>>> <...>
>>>
>>>> +static const char *valid_arguments[] = {
>>>> +	ETH_MEMIF_ID_ARG,
>>>> +	ETH_MEMIF_ROLE_ARG,
>>>> +	ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
>>>> +	ETH_MEMIF_RING_SIZE_ARG,
>>>> +	ETH_MEMIF_SOCKET_ARG,
>>>> +	ETH_MEMIF_MAC_ARG,
>>>> +	ETH_MEMIF_ZC_ARG,
>>>> +	ETH_MEMIF_SECRET_ARG,
>>>> +	NULL
>>>> +};
>>>
>>> Checkpatch is giving following warning:
>>>
>>> WARNING:STATIC_CONST_CHAR_ARRAY: static const char * array should
>>> probably be static const char * const
>>> #1885: FILE: drivers/net/memif/rte_eth_memif.c:39:
>>> +static const char *valid_arguments[] = {
>>>
>>> <...>
>>>
>>>> +static int
>>>> +rte_pmd_memif_remove(struct rte_vdev_device *vdev) {
>>>> +	struct rte_eth_dev *eth_dev;
>>>> +	int i;
>>>> +
>>>> +	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
>>>> +	if (eth_dev == NULL)
>>>> +		return 0;
>>>> +
>>>> +	for (i = 0; i < eth_dev->data->nb_rx_queues; i++)
>>>> +		(*eth_dev->dev_ops->rx_queue_release)(eth_dev->data-
>>> rx_queues[i]);
>>>> +	for (i = 0; i < eth_dev->data->nb_tx_queues; i++)
>>>> +
>>>> +(*eth_dev->dev_ops->rx_queue_release)(eth_dev->data->tx_queues[i]);
>>>
>>> Although they point same function, better to use
>>> 'dev_ops->tx_queue_release' for Tx queues.
>>>
>>>> +
>>>> +	rte_free(eth_dev->process_private);
>>>> +	eth_dev->process_private = NULL;
>>>
>>> "process_private" is not used in this PMD at all, no need to free it I think.
>>>
>>>> +
>>>> +	rte_eth_dev_close(eth_dev->data->port_id);
>>>
>>> There are two exit path from a PMD:
>>> 1) rte_eth_dev_close() API
>>> 2) rte_vdev_driver->remove() called by hotplug remove APIs
>> ('rte_dev_remove()'
>>> or 'rte_eal_hotplug_remove()')
>>>
>>> Both should clear all PMD allocated resources. Since you are calling
>>> 'rte_eth_dev_close() from this .remove() function, it makes sense to
>>> move all resource cleanup to .dev_close (like queue cleanup calls above).
>>>
>>
>> Hi Jakup,
>>
>> Above comments seems not implemented in v11, can you please check them?
>> If anything is not clear feel free to reach me on the irc.
>>
> 
> 
> Sorry, I missed that mail. I'll get it fixed right away, but I don't understand what's wrong with 'static const char *valid_arguments[]...' other PMDs handle args the same way, can you please give me a hint?

It is not wrong, but good to have that second 'const', it prevents you update
the valid_arguments[] by mistake, like following will work without that 'const':

 char p = "test";
 valid_arguments[0] = p;

Since we don't support dynamically changing device arguments in runtime, good to
have the protection, it won't hurt.


> 
> Thanks
> 
>> Thanks,
>> ferruh


^ permalink raw reply	[flat|nested] 58+ messages in thread

* [dpdk-dev] [PATCH v12] net/memif: introduce memory interface (memif) PMD
  2019-06-06  8:24               ` [dpdk-dev] [PATCH v11] " Jakub Grajciar
@ 2019-06-06 11:38                 ` Jakub Grajciar
  2019-06-06 14:07                   ` Ferruh Yigit
  0 siblings, 1 reply; 58+ messages in thread
From: Jakub Grajciar @ 2019-06-06 11:38 UTC (permalink / raw)
  To: dev; +Cc: Jakub Grajciar

Shared memory packet interface (memif) PMD allows for DPDK and any other
client using memif (DPDK, VPP, libmemif) to communicate using shared
memory. The created device transmits packets in a raw format. It can be
used with Ethernet mode, IP mode, or Punt/Inject. At this moment, only
Ethernet mode is supported in DPDK memif implementation. Memif is Linux
only.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
---
 MAINTAINERS                                 |    6 +
 config/common_base                          |    5 +
 config/common_linux                         |    1 +
 doc/guides/nics/features/memif.ini          |   14 +
 doc/guides/nics/index.rst                   |    1 +
 doc/guides/nics/memif.rst                   |  234 ++++
 doc/guides/rel_notes/release_19_08.rst      |    5 +
 drivers/net/Makefile                        |    1 +
 drivers/net/memif/Makefile                  |   31 +
 drivers/net/memif/memif.h                   |  179 +++
 drivers/net/memif/memif_socket.c            | 1124 +++++++++++++++++
 drivers/net/memif/memif_socket.h            |  105 ++
 drivers/net/memif/meson.build               |   15 +
 drivers/net/memif/rte_eth_memif.c           | 1204 +++++++++++++++++++
 drivers/net/memif/rte_eth_memif.h           |  212 ++++
 drivers/net/memif/rte_pmd_memif_version.map |    4 +
 drivers/net/meson.build                     |    1 +
 mk/rte.app.mk                               |    1 +
 18 files changed, 3143 insertions(+)
 create mode 100644 doc/guides/nics/features/memif.ini
 create mode 100644 doc/guides/nics/memif.rst
 create mode 100644 drivers/net/memif/Makefile
 create mode 100644 drivers/net/memif/memif.h
 create mode 100644 drivers/net/memif/memif_socket.c
 create mode 100644 drivers/net/memif/memif_socket.h
 create mode 100644 drivers/net/memif/meson.build
 create mode 100644 drivers/net/memif/rte_eth_memif.c
 create mode 100644 drivers/net/memif/rte_eth_memif.h
 create mode 100644 drivers/net/memif/rte_pmd_memif_version.map

requires patch: http://patchwork.dpdk.org/patch/51511/
Memif uses a unix domain socket as control channel (cc).
rte_interrupts.h is used to handle interrupts for this cc.
This patch is required, because the message received can
be a reason for disconnecting.

v3:
- coding style fixes
- documentation
- doxygen comments
- use strlcpy() instead of strncpy()
- use RTE_BUILD_BUG_ON instead of _Static_assert()
- fix meson build deps

v4:
- coding style fixes
- doc update (desc format, messaging, features, ...)
- pointer arithmetic fix
- __rte_packed and __rte_aligned instead of __attribute__()
- fixed multi-queue
- support additional archs (memfd_create syscall)

v5:
- coding style fixes
- add release notes
- remove unused dependencies
- doc update

v6:
- remove socket on termination of testpmd application
- disallow memif probing in secondary process (return error), multi-process support will be implemented in separate patch
- fix chained buffer packet length
- use single region for rings and buffers (non-zero-copy)
- doc update

v7:
- coding style fixes

v8:
- release notes: 19.05 -> 19.08

v9:
- rearrange elements in structures to remove holes
- default socket file in /run
- #define name sizes
- implement dev_close op

v10:
- fix shared lib build
- clear memory allocated by PMD in .dev_close
- fix NULL pointer string 'error: ‘%s’ directive argument is null'

v11:
- fix strict-aliasing
- correct parameter names in doxygen comments

v12:
- fix coding style issues
- free resources in .dev_close

diff --git a/MAINTAINERS b/MAINTAINERS
index 15d0829c5..3698c146b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -841,6 +841,12 @@ F: drivers/net/softnic/
 F: doc/guides/nics/features/softnic.ini
 F: doc/guides/nics/softnic.rst

+Memif PMD
+M: Jakub Grajciar <jgrajcia@cisco.com>
+F: drivers/net/memif/
+F: doc/guides/nics/memif.rst
+F: doc/guides/nics/features/memif.ini
+

 Crypto Drivers
 --------------
diff --git a/config/common_base b/config/common_base
index 6f19ad5d2..e406e7836 100644
--- a/config/common_base
+++ b/config/common_base
@@ -444,6 +444,11 @@ CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 #
 CONFIG_RTE_LIBRTE_PMD_AF_XDP=n

+#
+# Compile Memory Interface PMD driver (Linux only)
+#
+CONFIG_RTE_LIBRTE_PMD_MEMIF=n
+
 #
 # Compile link bonding PMD library
 #
diff --git a/config/common_linux b/config/common_linux
index 75334273d..87514fe4f 100644
--- a/config/common_linux
+++ b/config/common_linux
@@ -19,6 +19,7 @@ CONFIG_RTE_LIBRTE_VHOST_POSTCOPY=n
 CONFIG_RTE_LIBRTE_PMD_VHOST=y
 CONFIG_RTE_LIBRTE_IFC_PMD=y
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
+CONFIG_RTE_LIBRTE_PMD_MEMIF=y
 CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
 CONFIG_RTE_LIBRTE_PMD_TAP=y
 CONFIG_RTE_LIBRTE_AVP_PMD=y
diff --git a/doc/guides/nics/features/memif.ini b/doc/guides/nics/features/memif.ini
new file mode 100644
index 000000000..807d9ecdc
--- /dev/null
+++ b/doc/guides/nics/features/memif.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'memif' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Link status          = Y
+Basic stats          = Y
+Jumbo frame          = Y
+ARMv8                = Y
+Power8               = Y
+x86-32               = Y
+x86-64               = Y
+Usage doc            = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 2221c35f2..691e7209f 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -36,6 +36,7 @@ Network Interface Controller Drivers
     intel_vf
     kni
     liquidio
+    memif
     mlx4
     mlx5
     mvneta
diff --git a/doc/guides/nics/memif.rst b/doc/guides/nics/memif.rst
new file mode 100644
index 000000000..de2d481eb
--- /dev/null
+++ b/doc/guides/nics/memif.rst
@@ -0,0 +1,234 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2018-2019 Cisco Systems, Inc.
+
+======================
+Memif Poll Mode Driver
+======================
+
+Shared memory packet interface (memif) PMD allows for DPDK and any other client
+using memif (DPDK, VPP, libmemif) to communicate using shared memory. Memif is
+Linux only.
+
+The created device transmits packets in a raw format. It can be used with
+Ethernet mode, IP mode, or Punt/Inject. At this moment, only Ethernet mode is
+supported in DPDK memif implementation.
+
+Memif works in two roles: master and slave. Slave connects to master over an
+existing socket. It is also a producer of shared memory file and initializes
+the shared memory. Each interface can be connected to one peer interface
+at same time. The peer interface is identified by id parameter. Master
+creates the socket and listens for any slave connection requests. The socket
+may already exist on the system. Be sure to remove any such sockets, if you
+are creating a master interface, or you will see an "Address already in use"
+error. Function ``rte_pmd_memif_remove()``, which removes memif interface,
+will also remove a listener socket, if it is not being used by any other
+interface.
+
+The method to enable one or more interfaces is to use the
+``--vdev=net_memif0`` option on the DPDK application command line. Each
+``--vdev=net_memif1`` option given will create an interface named net_memif0,
+net_memif1, and so on. Memif uses unix domain socket to transmit control
+messages. Each memif has a unique id per socket. This id is used to identify
+peer interface. If you are connecting multiple
+interfaces using same socket, be sure to specify unique ids ``id=0``, ``id=1``,
+etc. Note that if you assign a socket to a master interface it becomes a
+listener socket. Listener socket can not be used by a slave interface on same
+client.
+
+.. csv-table:: **Memif configuration options**
+   :header: "Option", "Description", "Default", "Valid value"
+
+   "id=0", "Used to identify peer interface", "0", "uint32_t"
+   "role=master", "Set memif role", "slave", "master|slave"
+   "bsize=1024", "Size of single packet buffer", "2048", "uint16_t"
+   "rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", "10", "1-14"
+   "socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 256"
+   "mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef", ""
+   "secret=abc123", "Secret is an optional security option, which if specified, must be matched by peer", "", "string len 24"
+   "zero-copy=yes", "Enable/disable zero-copy slave mode", "no", "yes|no"
+
+**Connection establishment**
+
+In order to create memif connection, two memif interfaces, each in separate
+process, are needed. One interface in ``master`` role and other in
+``slave`` role. It is not possible to connect two interfaces in a single
+process. Each interface can be connected to one interface at same time,
+identified by matching id parameter.
+
+Memif driver uses unix domain socket to exchange required information between
+memif interfaces. Socket file path is specified at interface creation see
+*Memif configuration options* table above. If socket is used by ``master``
+interface, it's marked as listener socket (in scope of current process) and
+listens to connection requests from other processes. One socket can be used by
+multiple interfaces. One process can have ``slave`` and ``master`` interfaces
+at the same time, provided each role is assigned unique socket.
+
+For detailed information on memif control messages, see: net/memif/memif.h.
+
+Slave interface attempts to make a connection on assigned socket. Process
+listening on this socket will extract the connection request and create a new
+connected socket (control channel). Then it sends the 'hello' message
+(``MEMIF_MSG_TYPE_HELLO``), containing configuration boundaries. Slave interface
+adjusts its configuration accordingly, and sends 'init' message
+(``MEMIF_MSG_TYPE_INIT``). This message among others contains interface id. Driver
+uses this id to find master interface, and assigns the control channel to this
+interface. If such interface is found, 'ack' message (``MEMIF_MSG_TYPE_ACK``) is
+sent. Slave interface sends 'add region' message (``MEMIF_MSG_TYPE_ADD_REGION``) for
+every region allocated. Master responds to each of these messages with 'ack'
+message. Same behavior applies to rings. Slave sends 'add ring' message
+(``MEMIF_MSG_TYPE_ADD_RING``) for every initialized ring. Master again responds to
+each message with 'ack' message. To finalize the connection, slave interface
+sends 'connect' message (``MEMIF_MSG_TYPE_CONNECT``). Upon receiving this message
+master maps regions to its address space, initializes rings and responds with
+'connected' message (``MEMIF_MSG_TYPE_CONNECTED``). Disconnect
+(``MEMIF_MSG_TYPE_DISCONNECT``) can be sent by both master and slave interfaces at
+any time, due to driver error or if the interface is being deleted.
+
+Files
+
+- net/memif/memif.h *- control messages definitions*
+- net/memif/memif_socket.h
+- net/memif/memif_socket.c
+
+Shared memory
+~~~~~~~~~~~~~
+
+**Shared memory format**
+
+Slave is producer and master is consumer. Memory regions, are mapped shared memory files,
+created by memif slave and provided to master at connection establishment.
+Regions contain rings and buffers. Rings and buffers can also be separated into multiple
+regions. For no-zero-copy, rings and buffers are stored inside single memory
+region to reduce the number of opened files.
+
+region n (no-zero-copy):
+
++-----------------------+-------------------------------------------------------------------------+
+| Rings                 | Buffers                                                                 |
++-----------+-----------+-----------------+---+---------------------------------------------------+
+| S2M rings | M2S rings | packet buffer 0 | . | pb ((1 << pmd->run.log2_ring_size)*(s2m + m2s))-1 |
++-----------+-----------+-----------------+---+---------------------------------------------------+
+
+S2M OR M2S Rings:
+
++--------+--------+-----------------------+
+| ring 0 | ring 1 | ring num_s2m_rings - 1|
++--------+--------+-----------------------+
+
+ring 0:
+
++-------------+---------------------------------------+
+| ring header | (1 << pmd->run.log2_ring_size) * desc |
++-------------+---------------------------------------+
+
+Descriptors are assigned packet buffers in order of rings creation. If we have one ring
+in each direction and ring size is 1024, then first 1024 buffers will belong to S2M ring and
+last 1024 will belong to M2S ring. In case of zero-copy, buffers are dequeued and
+enqueued as needed.
+
+**Descriptor format**
+
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Quad|6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | |1|1| | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|Word|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | |6|5| | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|0   |length                                                         |region                         |flags                          |
++----+---------------------------------------------------------------+-------------------------------+-------------------------------+
+|1   |metadata                                                       |offset                                                         |
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |6| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |3|3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
+|    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+|    |3| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |2|1| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |0|
++----+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+**Flags field - flags (Quad Word 0, bits 0:15)**
+
++-----+--------------------+------------------------------------------------------------------------------------------------+
+|Bits |Name                |Functionality                                                                                   |
++=====+====================+================================================================================================+
+|0    |MEMIF_DESC_FLAG_NEXT|Is chained buffer. When set, the packet is divided into multiple buffers. May not be contiguous.|
++-----+--------------------+------------------------------------------------------------------------------------------------+
+
+**Region index - region (Quad Word 0, 16:31)**
+
+Index of memory region, the buffer is located in.
+
+**Data length - length (Quad Word 0, 32:63)**
+
+Length of transmitted/received data.
+
+**Data Offset - offset (Quad Word 1, 0:31)**
+
+Data start offset from memory region address. *.regions[desc->region].addr + desc->offset*
+
+**Metadata - metadata (Quad Word 1, 32:63)**
+
+Buffer metadata.
+
+Files
+
+- net/memif/memif.h *- descriptor and ring definitions*
+- net/memif/rte_eth_memif.c *- eth_memif_rx() eth_memif_tx()*
+
+Example: testpmd
+----------------------------
+In this example we run two instances of testpmd application and transmit packets over memif.
+
+First create ``master`` interface::
+
+    #./build/app/testpmd -l 0-1 --proc-type=primary --file-prefix=pmd1 --vdev=net_memif,role=master -- -i
+
+Now create ``slave`` interface (master must be already running so the slave will connect)::
+
+    #./build/app/testpmd -l 2-3 --proc-type=primary --file-prefix=pmd2 --vdev=net_memif -- -i
+
+Start forwarding packets::
+
+    Slave:
+        testpmd> start
+
+    Master:
+        testpmd> start tx_first
+
+Show status::
+
+    testpmd> show port stats 0
+
+For more details on testpmd please refer to :doc:`../testpmd_app_ug/index`.
+
+Example: testpmd and VPP
+------------------------
+For information on how to get and run VPP please see `<https://wiki.fd.io/view/VPP>`_.
+
+Start VPP in interactive mode (should be by default). Create memif master interface in VPP::
+
+    vpp# create interface memif id 0 master no-zero-copy
+    vpp# set interface state memif0/0 up
+    vpp# set interface ip address memif0/0 192.168.1.1/24
+
+To see socket filename use show memif command::
+
+    vpp# show memif
+    sockets
+     id  listener    filename
+      0   yes (1)     /run/vpp/memif.sock
+    ...
+
+Now create memif interface by running testpmd with these command line options::
+
+    #./testpmd --vdev=net_memif,socket=/run/vpp/memif.sock -- -i
+
+Testpmd should now create memif slave interface and try to connect to master.
+In testpmd set forward option to icmpecho and start forwarding::
+
+    testpmd> set fwd icmpecho
+    testpmd> start
+
+Send ping from VPP::
+
+    vpp# ping 192.168.1.2
+    64 bytes from 192.168.1.2: icmp_seq=2 ttl=254 time=36.2918 ms
+    64 bytes from 192.168.1.2: icmp_seq=3 ttl=254 time=23.3927 ms
+    64 bytes from 192.168.1.2: icmp_seq=4 ttl=254 time=24.2975 ms
+    64 bytes from 192.168.1.2: icmp_seq=5 ttl=254 time=17.7049 ms
diff --git a/doc/guides/rel_notes/release_19_08.rst b/doc/guides/rel_notes/release_19_08.rst
index a17e7dea5..46efd13f0 100644
--- a/doc/guides/rel_notes/release_19_08.rst
+++ b/doc/guides/rel_notes/release_19_08.rst
@@ -54,6 +54,11 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================

+* **Added memif PMD.**
+
+  Added the new Shared Memory Packet Interface (``memif``) PMD.
+  See the :doc:`../nics/memif` guide for more details on this new driver.
+

 Removed Items
 -------------
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 3a72cf38c..78cb10fc6 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -35,6 +35,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_ICE_PMD) += ice
 DIRS-$(CONFIG_RTE_LIBRTE_IPN3KE_PMD) += ipn3ke
 DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
 DIRS-$(CONFIG_RTE_LIBRTE_LIO_PMD) += liquidio
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif
 DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
 DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
 DIRS-$(CONFIG_RTE_LIBRTE_MVNETA_PMD) += mvneta
diff --git a/drivers/net/memif/Makefile b/drivers/net/memif/Makefile
new file mode 100644
index 000000000..c3119625c
--- /dev/null
+++ b/drivers/net/memif/Makefile
@@ -0,0 +1,31 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_memif.a
+
+EXPORT_MAP := rte_pmd_memif_version.map
+
+LIBABIVER := 1
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+# Experimantal APIs:
+# - rte_intr_callback_unregister_pending
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool
+LDLIBS += -lrte_ethdev -lrte_kvargs
+LDLIBS += -lrte_hash
+LDLIBS += -lrte_bus_vdev
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += rte_eth_memif.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF) += memif_socket.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/memif/memif.h b/drivers/net/memif/memif.h
new file mode 100644
index 000000000..3948b1f50
--- /dev/null
+++ b/drivers/net/memif/memif.h
@@ -0,0 +1,179 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_H_
+#define _MEMIF_H_
+
+#define MEMIF_COOKIE		0x3E31F20
+#define MEMIF_VERSION_MAJOR	2
+#define MEMIF_VERSION_MINOR	0
+#define MEMIF_VERSION		((MEMIF_VERSION_MAJOR << 8) | MEMIF_VERSION_MINOR)
+#define MEMIF_NAME_SZ		32
+
+/*
+ * S2M: direction slave -> master
+ * M2S: direction master -> slave
+ */
+
+/*
+ *  Type definitions
+ */
+
+typedef enum memif_msg_type {
+	MEMIF_MSG_TYPE_NONE,
+	MEMIF_MSG_TYPE_ACK,
+	MEMIF_MSG_TYPE_HELLO,
+	MEMIF_MSG_TYPE_INIT,
+	MEMIF_MSG_TYPE_ADD_REGION,
+	MEMIF_MSG_TYPE_ADD_RING,
+	MEMIF_MSG_TYPE_CONNECT,
+	MEMIF_MSG_TYPE_CONNECTED,
+	MEMIF_MSG_TYPE_DISCONNECT,
+} memif_msg_type_t;
+
+typedef enum {
+	MEMIF_RING_S2M, /**< buffer ring in direction slave -> master */
+	MEMIF_RING_M2S, /**< buffer ring in direction master -> slave */
+} memif_ring_type_t;
+
+typedef enum {
+	MEMIF_INTERFACE_MODE_ETHERNET,
+	MEMIF_INTERFACE_MODE_IP,
+	MEMIF_INTERFACE_MODE_PUNT_INJECT,
+} memif_interface_mode_t;
+
+typedef uint16_t memif_region_index_t;
+typedef uint32_t memif_region_offset_t;
+typedef uint64_t memif_region_size_t;
+typedef uint16_t memif_ring_index_t;
+typedef uint32_t memif_interface_id_t;
+typedef uint16_t memif_version_t;
+typedef uint8_t memif_log2_ring_size_t;
+
+/*
+ *  Socket messages
+ */
+
+ /**
+  * M2S
+  * Contains master interfaces configuration.
+  */
+typedef struct __rte_packed {
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+	memif_version_t min_version; /**< lowest supported memif version */
+	memif_version_t max_version; /**< highest supported memif version */
+	memif_region_index_t max_region; /**< maximum num of regions */
+	memif_ring_index_t max_m2s_ring; /**< maximum num of M2S ring */
+	memif_ring_index_t max_s2m_ring; /**< maximum num of S2M rings */
+	memif_log2_ring_size_t max_log2_ring_size; /**< maximum ring size (as log2) */
+} memif_msg_hello_t;
+
+/**
+ * S2M
+ * Contains information required to identify interface
+ * to which the slave wants to connect.
+ */
+typedef struct __rte_packed {
+	memif_version_t version;		/**< memif version */
+	memif_interface_id_t id;		/**< interface id */
+	memif_interface_mode_t mode:8;		/**< interface mode */
+	uint8_t secret[24];			/**< optional security parameter */
+	uint8_t name[MEMIF_NAME_SZ]; /**< Client app name. In this case DPDK version */
+} memif_msg_init_t;
+
+/**
+ * S2M
+ * Request master to add new shared memory region to master interface.
+ * Shared files file descriptor is passed in cmsghdr.
+ */
+typedef struct __rte_packed {
+	memif_region_index_t index;		/**< shm regions index */
+	memif_region_size_t size;		/**< shm region size */
+} memif_msg_add_region_t;
+
+/**
+ * S2M
+ * Request master to add new ring to master interface.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_MSG_ADD_RING_FLAG_S2M 1		/**< ring is in S2M direction */
+	memif_ring_index_t index;		/**< ring index */
+	memif_region_index_t region; /**< region index on which this ring is located */
+	memif_region_offset_t offset;		/**< buffer start offset */
+	memif_log2_ring_size_t log2_ring_size;	/**< ring size (log2) */
+	uint16_t private_hdr_size;		/**< used for private metadata */
+} memif_msg_add_ring_t;
+
+/**
+ * S2M
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< slave interface name */
+} memif_msg_connect_t;
+
+/**
+ * M2S
+ * Finalize connection establishment.
+ */
+typedef struct __rte_packed {
+	uint8_t if_name[MEMIF_NAME_SZ];		/**< master interface name */
+} memif_msg_connected_t;
+
+/**
+ * S2M & M2S
+ * Disconnect interfaces.
+ */
+typedef struct __rte_packed {
+	uint32_t code;				/**< error code */
+	uint8_t string[96];			/**< disconnect reason */
+} memif_msg_disconnect_t;
+
+typedef struct __rte_packed __rte_aligned(128)
+{
+	memif_msg_type_t type:16;
+	union {
+		memif_msg_hello_t hello;
+		memif_msg_init_t init;
+		memif_msg_add_region_t add_region;
+		memif_msg_add_ring_t add_ring;
+		memif_msg_connect_t connect;
+		memif_msg_connected_t connected;
+		memif_msg_disconnect_t disconnect;
+	};
+} memif_msg_t;
+
+/*
+ *  Ring and Descriptor Layout
+ */
+
+/**
+ * Buffer descriptor.
+ */
+typedef struct __rte_packed {
+	uint16_t flags;				/**< flags */
+#define MEMIF_DESC_FLAG_NEXT 1			/**< is chained buffer */
+	memif_region_index_t region; /**< region index on which the buffer is located */
+	uint32_t length;			/**< buffer length */
+	memif_region_offset_t offset;		/**< buffer offset */
+	uint32_t metadata;
+} memif_desc_t;
+
+#define MEMIF_CACHELINE_ALIGN_MARK(mark) \
+	uint8_t mark[0] __rte_aligned(RTE_CACHE_LINE_SIZE)
+
+typedef struct {
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline0);
+	uint32_t cookie;			/**< MEMIF_COOKIE */
+	uint16_t flags;				/**< flags */
+#define MEMIF_RING_FLAG_MASK_INT 1		/**< disable interrupt mode */
+	volatile uint16_t head;			/**< pointer to ring buffer head */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline1);
+	volatile uint16_t tail;			/**< pointer to ring buffer tail */
+	MEMIF_CACHELINE_ALIGN_MARK(cacheline2);
+	memif_desc_t desc[0];			/**< buffer descriptors */
+} memif_ring_t;
+
+#endif				/* _MEMIF_H_ */
diff --git a/drivers/net/memif/memif_socket.c b/drivers/net/memif/memif_socket.c
new file mode 100644
index 000000000..1e046b6b1
--- /dev/null
+++ b/drivers/net/memif/memif_socket.c
@@ -0,0 +1,1124 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <errno.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_hash.h>
+#include <rte_jhash.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+static void memif_intr_handler(void *arg);
+
+static ssize_t
+memif_msg_send(int fd, memif_msg_t *msg, int afd)
+{
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	struct cmsghdr *cmsg;
+	char ctl[CMSG_SPACE(sizeof(int))];
+
+	iov[0].iov_base = msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+
+	if (afd > 0) {
+		memset(&ctl, 0, sizeof(ctl));
+		mh.msg_control = ctl;
+		mh.msg_controllen = sizeof(ctl);
+		cmsg = CMSG_FIRSTHDR(&mh);
+		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+		cmsg->cmsg_level = SOL_SOCKET;
+		cmsg->cmsg_type = SCM_RIGHTS;
+		rte_memcpy(CMSG_DATA(cmsg), &afd, sizeof(int));
+	}
+
+	return sendmsg(fd, &mh, 0);
+}
+
+static int
+memif_msg_send_from_queue(struct memif_control_channel *cc)
+{
+	ssize_t size;
+	int ret = 0;
+	struct memif_msg_queue_elt *e;
+
+	e = TAILQ_FIRST(&cc->msg_queue);
+	if (e == NULL)
+		return 0;
+
+	size = memif_msg_send(cc->intr_handle.fd, &e->msg, e->fd);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(ERR, "sendmsg fail: %s.", strerror(errno));
+		ret = -1;
+	} else {
+		MIF_LOG(DEBUG, "Sent msg type %u.", e->msg.type);
+	}
+	TAILQ_REMOVE(&cc->msg_queue, e, next);
+	rte_free(e);
+
+	return ret;
+}
+
+static struct memif_msg_queue_elt *
+memif_msg_enq(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e;
+
+	e = rte_zmalloc("memif_msg", sizeof(struct memif_msg_queue_elt), 0);
+	if (e == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control message.");
+		return NULL;
+	}
+
+	e->fd = -1;
+	TAILQ_INSERT_TAIL(&cc->msg_queue, e, next);
+
+	return e;
+}
+
+void
+memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			 int err_code)
+{
+	struct memif_msg_queue_elt *e;
+	struct pmd_internals *pmd;
+	memif_msg_disconnect_t *d;
+
+	if (cc == NULL) {
+		MIF_LOG(DEBUG, "Missing control channel.");
+		return;
+	}
+
+	e = memif_msg_enq(cc);
+	if (e == NULL) {
+		MIF_LOG(WARNING, "Failed to enqueue disconnect message.");
+		return;
+	}
+
+	d = &e->msg.disconnect;
+
+	e->msg.type = MEMIF_MSG_TYPE_DISCONNECT;
+	d->code = err_code;
+
+	if (reason != NULL) {
+		strlcpy((char *)d->string, reason, sizeof(d->string));
+		if (cc->dev != NULL) {
+			pmd = cc->dev->data->dev_private;
+			strlcpy(pmd->local_disc_string, reason,
+				sizeof(pmd->local_disc_string));
+		}
+	}
+}
+
+static int
+memif_msg_enq_hello(struct memif_control_channel *cc)
+{
+	struct memif_msg_queue_elt *e = memif_msg_enq(cc);
+	memif_msg_hello_t *h;
+
+	if (e == NULL)
+		return -1;
+
+	h = &e->msg.hello;
+
+	e->msg.type = MEMIF_MSG_TYPE_HELLO;
+	h->min_version = MEMIF_VERSION;
+	h->max_version = MEMIF_VERSION;
+	h->max_s2m_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_m2s_ring = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	h->max_region = ETH_MEMIF_MAX_REGION_NUM - 1;
+	h->max_log2_ring_size = ETH_MEMIF_MAX_LOG2_RING_SIZE;
+
+	strlcpy((char *)h->name, rte_version(), sizeof(h->name));
+
+	return 0;
+}
+
+static int
+memif_msg_receive_hello(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_hello_t *h = &msg->hello;
+
+	if (h->min_version > MEMIF_VERSION || h->max_version < MEMIF_VERSION) {
+		memif_msg_enq_disconnect(pmd->cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	/* Set parameters for active connection */
+	pmd->run.num_s2m_rings = RTE_MIN(h->max_s2m_ring + 1,
+					   pmd->cfg.num_s2m_rings);
+	pmd->run.num_m2s_rings = RTE_MIN(h->max_m2s_ring + 1,
+					   pmd->cfg.num_m2s_rings);
+	pmd->run.log2_ring_size = RTE_MIN(h->max_log2_ring_size,
+					    pmd->cfg.log2_ring_size);
+	pmd->run.pkt_buffer_size = pmd->cfg.pkt_buffer_size;
+
+	strlcpy(pmd->remote_name, (char *)h->name, sizeof(pmd->remote_name));
+
+	MIF_LOG(DEBUG, "%s: Connecting to %s.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_init(struct memif_control_channel *cc, memif_msg_t *msg)
+{
+	memif_msg_init_t *i = &msg->init;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *pmd;
+	struct rte_eth_dev *dev;
+
+	if (i->version != MEMIF_VERSION) {
+		memif_msg_enq_disconnect(cc, "Incompatible memif version", 0);
+		return -1;
+	}
+
+	if (cc->socket == NULL) {
+		memif_msg_enq_disconnect(cc, "Device error", 0);
+		return -1;
+	}
+
+	/* Find device with requested ID */
+	TAILQ_FOREACH(elt, &cc->socket->dev_queue, next) {
+		dev = elt->dev;
+		pmd = dev->data->dev_private;
+		if (((pmd->flags & ETH_MEMIF_FLAG_DISABLED) == 0) &&
+		    pmd->id == i->id) {
+			/* assign control channel to device */
+			cc->dev = dev;
+			pmd->cc = cc;
+
+			if (i->mode != MEMIF_INTERFACE_MODE_ETHERNET) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Only ethernet mode supported",
+							 0);
+				return -1;
+			}
+
+			if (pmd->flags & (ETH_MEMIF_FLAG_CONNECTING |
+					   ETH_MEMIF_FLAG_CONNECTED)) {
+				memif_msg_enq_disconnect(pmd->cc,
+							 "Already connected", 0);
+				return -1;
+			}
+			strlcpy(pmd->remote_name, (char *)i->name,
+				sizeof(pmd->remote_name));
+
+			if (*pmd->secret != '\0') {
+				if (*i->secret == '\0') {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Secret required", 0);
+					return -1;
+				}
+				if (strncmp(pmd->secret, (char *)i->secret,
+						ETH_MEMIF_SECRET_SIZE) != 0) {
+					memif_msg_enq_disconnect(pmd->cc,
+								 "Incorrect secret", 0);
+					return -1;
+				}
+			}
+
+			pmd->flags |= ETH_MEMIF_FLAG_CONNECTING;
+			return 0;
+		}
+	}
+
+	/* ID not found on this socket */
+	MIF_LOG(DEBUG, "ID %u not found.", i->id);
+	memif_msg_enq_disconnect(cc, "ID not found", 0);
+	return -1;
+}
+
+static int
+memif_msg_receive_add_region(struct rte_eth_dev *dev, memif_msg_t *msg,
+			     int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_region_t *ar = &msg->add_region;
+	struct memif_region *r;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing region fd", 0);
+		return -1;
+	}
+
+	if (ar->index >= ETH_MEMIF_MAX_REGION_NUM || ar->index != pmd->regions_num ||
+			pmd->regions[ar->index] != NULL) {
+		memif_msg_enq_disconnect(pmd->cc, "Invalid region index", 0);
+		return -1;
+	}
+
+	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	r->fd = fd;
+	r->region_size = ar->size;
+	r->addr = NULL;
+
+	pmd->regions[ar->index] = r;
+	pmd->regions_num++;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_add_ring(struct rte_eth_dev *dev, memif_msg_t *msg, int fd)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_add_ring_t *ar = &msg->add_ring;
+	struct memif_queue *mq;
+
+	if (fd < 0) {
+		memif_msg_enq_disconnect(pmd->cc, "Missing interrupt fd", 0);
+		return -1;
+	}
+
+	/* check if we have enough queues */
+	if (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) {
+		if (ar->index >= pmd->cfg.num_s2m_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_s2m_rings++;
+	} else {
+		if (ar->index >= pmd->cfg.num_m2s_rings) {
+			memif_msg_enq_disconnect(pmd->cc, "Invalid ring index", 0);
+			return -1;
+		}
+		pmd->run.num_m2s_rings++;
+	}
+
+	mq = (ar->flags & MEMIF_MSG_ADD_RING_FLAG_S2M) ?
+	    dev->data->rx_queues[ar->index] : dev->data->tx_queues[ar->index];
+
+	mq->intr_handle.fd = fd;
+	mq->log2_ring_size = ar->log2_ring_size;
+	mq->region = ar->region;
+	mq->ring_offset = ar->offset;
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connect_t *c = &msg->connect;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_connected(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_connected_t *c = &msg->connected;
+	int ret;
+
+	ret = memif_connect(dev);
+	if (ret < 0)
+		return ret;
+
+	strlcpy(pmd->remote_if_name, (char *)c->if_name,
+		sizeof(pmd->remote_if_name));
+	MIF_LOG(INFO, "%s: Remote interface %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_if_name);
+
+	return 0;
+}
+
+static int
+memif_msg_receive_disconnect(struct rte_eth_dev *dev, memif_msg_t *msg)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_msg_disconnect_t *d = &msg->disconnect;
+
+	memset(pmd->remote_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	strlcpy(pmd->remote_disc_string, (char *)d->string,
+		sizeof(pmd->remote_disc_string));
+
+	MIF_LOG(INFO, "%s: Disconnect received: %s",
+		rte_vdev_device_name(pmd->vdev), pmd->remote_disc_string);
+
+	memset(pmd->local_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	memif_disconnect(rte_eth_dev_allocated
+			 (rte_vdev_device_name(pmd->vdev)));
+	return 0;
+}
+
+static int
+memif_msg_enq_ack(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	if (e == NULL)
+		return -1;
+
+	e->msg.type = MEMIF_MSG_TYPE_ACK;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_init(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_init_t *i = &e->msg.init;
+
+	if (e == NULL)
+		return -1;
+
+	i = &e->msg.init;
+	e->msg.type = MEMIF_MSG_TYPE_INIT;
+	i->version = MEMIF_VERSION;
+	i->id = pmd->id;
+	i->mode = MEMIF_INTERFACE_MODE_ETHERNET;
+
+	strlcpy((char *)i->name, rte_version(), sizeof(i->name));
+
+	if (*pmd->secret != '\0')
+		strlcpy((char *)i->secret, pmd->secret, sizeof(i->secret));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_region(struct rte_eth_dev *dev, uint8_t idx)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	memif_msg_add_region_t *ar;
+	struct memif_region *mr = pmd->regions[idx];
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_region;
+	e->msg.type = MEMIF_MSG_TYPE_ADD_REGION;
+	e->fd = mr->fd;
+	ar->index = idx;
+	ar->size = mr->region_size;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_add_ring(struct rte_eth_dev *dev, uint8_t idx,
+		       memif_ring_type_t type)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	struct memif_queue *mq;
+	memif_msg_add_ring_t *ar;
+
+	if (e == NULL)
+		return -1;
+
+	ar = &e->msg.add_ring;
+	mq = (type == MEMIF_RING_S2M) ? dev->data->tx_queues[idx] :
+	    dev->data->rx_queues[idx];
+
+	e->msg.type = MEMIF_MSG_TYPE_ADD_RING;
+	e->fd = mq->intr_handle.fd;
+	ar->index = idx;
+	ar->offset = mq->ring_offset;
+	ar->region = mq->region;
+	ar->log2_ring_size = mq->log2_ring_size;
+	ar->flags = (type == MEMIF_RING_S2M) ? MEMIF_MSG_ADD_RING_FLAG_S2M : 0;
+	ar->private_hdr_size = 0;
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connect_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connect;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECT;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static int
+memif_msg_enq_connected(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *e = memif_msg_enq(pmd->cc);
+	const char *name = rte_vdev_device_name(pmd->vdev);
+	memif_msg_connected_t *c;
+
+	if (e == NULL)
+		return -1;
+
+	c = &e->msg.connected;
+	e->msg.type = MEMIF_MSG_TYPE_CONNECTED;
+	strlcpy((char *)c->if_name, name, sizeof(c->if_name));
+
+	return 0;
+}
+
+static void
+memif_intr_unregister_handler(struct rte_intr_handle *intr_handle, void *arg)
+{
+	struct memif_msg_queue_elt *elt;
+	struct memif_control_channel *cc = arg;
+
+	/* close control channel fd */
+	close(intr_handle->fd);
+	/* clear message queue */
+	while ((elt = TAILQ_FIRST(&cc->msg_queue)) != NULL) {
+		TAILQ_REMOVE(&cc->msg_queue, elt, next);
+		rte_free(elt);
+	}
+	/* free control channel */
+	rte_free(cc);
+}
+
+void
+memif_disconnect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_msg_queue_elt *elt, *next;
+	struct memif_queue *mq;
+	struct rte_intr_handle *ih;
+	int i;
+	int ret;
+
+	if (pmd->cc != NULL) {
+		/* Clear control message queue (except disconnect message if any). */
+		for (elt = TAILQ_FIRST(&pmd->cc->msg_queue); elt != NULL; elt = next) {
+			next = TAILQ_NEXT(elt, next);
+			if (elt->msg.type != MEMIF_MSG_TYPE_DISCONNECT) {
+				TAILQ_REMOVE(&pmd->cc->msg_queue, elt, next);
+				rte_free(elt);
+			}
+		}
+		/* send disconnect message (if there is any in queue) */
+		memif_msg_send_from_queue(pmd->cc);
+
+		/* at this point, there should be no more messages in queue */
+		if (TAILQ_FIRST(&pmd->cc->msg_queue) != NULL) {
+			MIF_LOG(WARNING,
+				"%s: Unexpected message(s) in message queue.",
+				rte_vdev_device_name(pmd->vdev));
+		}
+
+		ih = &pmd->cc->intr_handle;
+		if (ih->fd > 0) {
+			ret = rte_intr_callback_unregister(ih,
+							memif_intr_handler,
+							pmd->cc);
+			/*
+			 * If callback is active (disconnecting based on
+			 * received control message).
+			 */
+			if (ret == -EAGAIN) {
+				ret = rte_intr_callback_unregister_pending(ih,
+							memif_intr_handler,
+							pmd->cc,
+							memif_intr_unregister_handler);
+			} else if (ret > 0) {
+				close(ih->fd);
+				rte_free(pmd->cc);
+			}
+			pmd->cc = NULL;
+			if (ret <= 0)
+				MIF_LOG(WARNING, "%s: Failed to unregister "
+					"control channel callback.",
+					rte_vdev_device_name(pmd->vdev));
+		}
+	}
+
+	/* unconfig interrupts */
+	for (i = 0; i < pmd->cfg.num_s2m_rings; i++) {
+		if (pmd->role == MEMIF_ROLE_SLAVE) {
+			if (dev->data->tx_queues != NULL)
+				mq = dev->data->tx_queues[i];
+			else
+				continue;
+		} else {
+			if (dev->data->rx_queues != NULL)
+				mq = dev->data->rx_queues[i];
+			else
+				continue;
+		}
+		if (mq->intr_handle.fd > 0) {
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+	for (i = 0; i < pmd->cfg.num_m2s_rings; i++) {
+		if (pmd->role == MEMIF_ROLE_MASTER) {
+			if (dev->data->tx_queues != NULL)
+				mq = dev->data->tx_queues[i];
+			else
+				continue;
+		} else {
+			if (dev->data->rx_queues != NULL)
+				mq = dev->data->rx_queues[i];
+			else
+				continue;
+		}
+		if (mq->intr_handle.fd > 0) {
+			close(mq->intr_handle.fd);
+			mq->intr_handle.fd = -1;
+		}
+		mq->ring = NULL;
+	}
+
+	memif_free_regions(pmd);
+
+	/* reset connection configuration */
+	memset(&pmd->run, 0, sizeof(pmd->run));
+
+	dev->data->dev_link.link_status = ETH_LINK_DOWN;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTED;
+	MIF_LOG(DEBUG, "%s: Disconnected.", rte_vdev_device_name(pmd->vdev));
+}
+
+static int
+memif_msg_receive(struct memif_control_channel *cc)
+{
+	char ctl[CMSG_SPACE(sizeof(int)) +
+		 CMSG_SPACE(sizeof(struct ucred))] = { 0 };
+	struct msghdr mh = { 0 };
+	struct iovec iov[1];
+	memif_msg_t msg = { 0 };
+	ssize_t size;
+	int ret = 0;
+	struct ucred *cr __rte_unused;
+	cr = 0;
+	struct cmsghdr *cmsg;
+	int afd = -1;
+	int i;
+	struct pmd_internals *pmd;
+
+	iov[0].iov_base = (void *)&msg;
+	iov[0].iov_len = sizeof(memif_msg_t);
+	mh.msg_iov = iov;
+	mh.msg_iovlen = 1;
+	mh.msg_control = ctl;
+	mh.msg_controllen = sizeof(ctl);
+
+	size = recvmsg(cc->intr_handle.fd, &mh, 0);
+	if (size != sizeof(memif_msg_t)) {
+		MIF_LOG(DEBUG, "Invalid message size.");
+		memif_msg_enq_disconnect(cc, "Invalid message size", 0);
+		return -1;
+	}
+	MIF_LOG(DEBUG, "Received msg type: %u.", msg.type);
+
+	cmsg = CMSG_FIRSTHDR(&mh);
+	while (cmsg) {
+		if (cmsg->cmsg_level == SOL_SOCKET) {
+			if (cmsg->cmsg_type == SCM_CREDENTIALS)
+				cr = (struct ucred *)CMSG_DATA(cmsg);
+			else if (cmsg->cmsg_type == SCM_RIGHTS)
+				memcpy(&afd, CMSG_DATA(cmsg), sizeof(int));
+		}
+		cmsg = CMSG_NXTHDR(&mh, cmsg);
+	}
+
+	if (cc->dev == NULL && msg.type != MEMIF_MSG_TYPE_INIT) {
+		MIF_LOG(DEBUG, "Unexpected message.");
+		memif_msg_enq_disconnect(cc, "Unexpected message", 0);
+		return -1;
+	}
+
+	/* get device from hash data */
+	switch (msg.type) {
+	case MEMIF_MSG_TYPE_ACK:
+		break;
+	case MEMIF_MSG_TYPE_HELLO:
+		ret = memif_msg_receive_hello(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_init_regions_and_queues(cc->dev);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_init(cc->dev);
+		if (ret < 0)
+			goto exit;
+		pmd = cc->dev->data->dev_private;
+		for (i = 0; i < pmd->regions_num; i++) {
+			ret = memif_msg_enq_add_region(cc->dev, i);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_S2M);
+			if (ret < 0)
+				goto exit;
+		}
+		for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+			ret = memif_msg_enq_add_ring(cc->dev, i,
+						     MEMIF_RING_M2S);
+			if (ret < 0)
+				goto exit;
+		}
+		ret = memif_msg_enq_connect(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_INIT:
+		/*
+		 * This cc does not have an interface asociated with it.
+		 * If suitable interface is found it will be assigned here.
+		 */
+		ret = memif_msg_receive_init(cc, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_REGION:
+		ret = memif_msg_receive_add_region(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_ADD_RING:
+		ret = memif_msg_receive_add_ring(cc->dev, &msg, afd);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_ack(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECT:
+		ret = memif_msg_receive_connect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		ret = memif_msg_enq_connected(cc->dev);
+		if (ret < 0)
+			goto exit;
+		break;
+	case MEMIF_MSG_TYPE_CONNECTED:
+		ret = memif_msg_receive_connected(cc->dev, &msg);
+		break;
+	case MEMIF_MSG_TYPE_DISCONNECT:
+		ret = memif_msg_receive_disconnect(cc->dev, &msg);
+		if (ret < 0)
+			goto exit;
+		break;
+	default:
+		memif_msg_enq_disconnect(cc, "Unknown message type", 0);
+		ret = -1;
+		goto exit;
+	}
+
+ exit:
+	return ret;
+}
+
+static void
+memif_intr_handler(void *arg)
+{
+	struct memif_control_channel *cc = arg;
+	int ret;
+
+	ret = memif_msg_receive(cc);
+	/* if driver failed to assign device */
+	if (cc->dev == NULL) {
+		ret = rte_intr_callback_unregister_pending(&cc->intr_handle,
+							   memif_intr_handler,
+							   cc,
+							   memif_intr_unregister_handler);
+		if (ret < 0)
+			MIF_LOG(WARNING,
+				"Failed to unregister control channel callback.");
+		return;
+	}
+	/* if memif_msg_receive failed */
+	if (ret < 0)
+		goto disconnect;
+
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto disconnect;
+
+	return;
+
+ disconnect:
+	if (cc->dev == NULL) {
+		MIF_LOG(WARNING, "eth dev not allocated");
+		return;
+	}
+	memif_disconnect(cc->dev);
+}
+
+static void
+memif_listener_handler(void *arg)
+{
+	struct memif_socket *socket = arg;
+	int sockfd;
+	int addr_len;
+	struct sockaddr_un client;
+	struct memif_control_channel *cc;
+	int ret;
+
+	addr_len = sizeof(client);
+	sockfd = accept(socket->intr_handle.fd, (struct sockaddr *)&client,
+			(socklen_t *)&addr_len);
+	if (sockfd < 0) {
+		MIF_LOG(ERR,
+			"Failed to accept connection request on socket fd %d",
+			socket->intr_handle.fd);
+		return;
+	}
+
+	MIF_LOG(DEBUG, "%s: Connection request accepted.", socket->filename);
+
+	cc = rte_zmalloc("memif-cc", sizeof(struct memif_control_channel), 0);
+	if (cc == NULL) {
+		MIF_LOG(ERR, "Failed to allocate control channel.");
+		goto error;
+	}
+
+	cc->intr_handle.fd = sockfd;
+	cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	cc->socket = socket;
+	cc->dev = NULL;
+	TAILQ_INIT(&cc->msg_queue);
+
+	ret = rte_intr_callback_register(&cc->intr_handle, memif_intr_handler, cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to register control channel callback.");
+		goto error;
+	}
+
+	ret = memif_msg_enq_hello(cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "Failed to enqueue hello message.");
+		goto error;
+	}
+	ret = memif_msg_send_from_queue(cc);
+	if (ret < 0)
+		goto error;
+
+	return;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (cc != NULL)
+		rte_free(cc);
+}
+
+static struct memif_socket *
+memif_socket_create(struct pmd_internals *pmd, char *key, uint8_t listener)
+{
+	struct memif_socket *sock;
+	struct sockaddr_un un;
+	int sockfd;
+	int ret;
+	int on = 1;
+
+	sock = rte_zmalloc("memif-socket", sizeof(struct memif_socket), 0);
+	if (sock == NULL) {
+		MIF_LOG(ERR, "Failed to allocate memory for memif socket");
+		return NULL;
+	}
+
+	sock->listener = listener;
+	rte_memcpy(sock->filename, key, 256);
+	TAILQ_INIT(&sock->dev_queue);
+
+	if (listener != 0) {
+		sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+		if (sockfd < 0)
+			goto error;
+
+		un.sun_family = AF_UNIX;
+		memcpy(un.sun_path, sock->filename,
+			sizeof(un.sun_path) - 1);
+
+		ret = setsockopt(sockfd, SOL_SOCKET, SO_PASSCRED, &on,
+				 sizeof(on));
+		if (ret < 0)
+			goto error;
+		ret = bind(sockfd, (struct sockaddr *)&un, sizeof(un));
+		if (ret < 0)
+			goto error;
+		ret = listen(sockfd, 1);
+		if (ret < 0)
+			goto error;
+
+		MIF_LOG(DEBUG, "%s: Memif listener socket %s created.",
+			rte_vdev_device_name(pmd->vdev), sock->filename);
+
+		sock->intr_handle.fd = sockfd;
+		sock->intr_handle.type = RTE_INTR_HANDLE_EXT;
+		ret = rte_intr_callback_register(&sock->intr_handle,
+						 memif_listener_handler, sock);
+		if (ret < 0) {
+			MIF_LOG(ERR, "%s: Failed to register interrupt "
+				"callback for listener socket",
+				rte_vdev_device_name(pmd->vdev));
+			return NULL;
+		}
+	}
+
+	return sock;
+
+ error:
+	MIF_LOG(ERR, "%s: Failed to setup socket %s: %s",
+		rte_vdev_device_name(pmd->vdev), key, strerror(errno));
+	if (sock != NULL)
+		rte_free(sock);
+	return NULL;
+}
+
+static struct rte_hash *
+memif_create_socket_hash(void)
+{
+	struct rte_hash_parameters params = { 0 };
+	params.name = MEMIF_SOCKET_HASH_NAME;
+	params.entries = 256;
+	params.key_len = 256;
+	params.hash_func = rte_jhash;
+	params.hash_func_init_val = 0;
+	return rte_hash_create(&params);
+}
+
+int
+memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt;
+	struct pmd_internals *tmp_pmd;
+	struct rte_hash *hash;
+	int ret;
+	char key[256];
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL) {
+		hash = memif_create_socket_hash();
+		if (hash == NULL) {
+			MIF_LOG(ERR, "Failed to create memif socket hash.");
+			return -1;
+		}
+	}
+
+	memset(key, 0, 256);
+	rte_memcpy(key, socket_filename, strlen(socket_filename));
+	ret = rte_hash_lookup_data(hash, key, (void **)&socket);
+	if (ret < 0) {
+		socket = memif_socket_create(pmd, key,
+					     (pmd->role ==
+					      MEMIF_ROLE_SLAVE) ? 0 : 1);
+		if (socket == NULL)
+			return -1;
+		ret = rte_hash_add_key_data(hash, key, socket);
+		if (ret < 0) {
+			MIF_LOG(ERR, "Failed to add socket to socket hash.");
+			return ret;
+		}
+	}
+	pmd->socket_filename = socket->filename;
+
+	if (socket->listener != 0 && pmd->role == MEMIF_ROLE_SLAVE) {
+		MIF_LOG(ERR, "Socket is a listener.");
+		return -1;
+	} else if ((socket->listener == 0) && (pmd->role == MEMIF_ROLE_MASTER)) {
+		MIF_LOG(ERR, "Socket is not a listener.");
+		return -1;
+	}
+
+	TAILQ_FOREACH(elt, &socket->dev_queue, next) {
+		tmp_pmd = elt->dev->data->dev_private;
+		if (tmp_pmd->id == pmd->id) {
+			MIF_LOG(ERR, "Memif device with id %d already "
+				"exists on socket %s",
+				pmd->id, socket->filename);
+			return -1;
+		}
+	}
+
+	elt = rte_malloc("pmd-queue", sizeof(struct memif_socket_dev_list_elt), 0);
+	if (elt == NULL) {
+		MIF_LOG(ERR, "%s: Failed to add device to socket device list.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+	elt->dev = dev;
+	TAILQ_INSERT_TAIL(&socket->dev_queue, elt, next);
+
+	return 0;
+}
+
+void
+memif_socket_remove_device(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_socket *socket = NULL;
+	struct memif_socket_dev_list_elt *elt, *next;
+	struct rte_hash *hash;
+
+	hash = rte_hash_find_existing(MEMIF_SOCKET_HASH_NAME);
+	if (hash == NULL)
+		return;
+
+	if (pmd->socket_filename == NULL)
+		return;
+
+	if (rte_hash_lookup_data(hash, pmd->socket_filename, (void **)&socket) < 0)
+		return;
+
+	for (elt = TAILQ_FIRST(&socket->dev_queue); elt != NULL; elt = next) {
+		next = TAILQ_NEXT(elt, next);
+		if (elt->dev == dev) {
+			TAILQ_REMOVE(&socket->dev_queue, elt, next);
+			rte_free(elt);
+			pmd->socket_filename = NULL;
+		}
+	}
+
+	/* remove socket, if this was the last device using it */
+	if (TAILQ_EMPTY(&socket->dev_queue)) {
+		rte_hash_del_key(hash, socket->filename);
+		if (socket->listener) {
+			/* remove listener socket file,
+			 * so we can create new one later.
+			 */
+			remove(socket->filename);
+		}
+		rte_free(socket);
+	}
+}
+
+int
+memif_connect_master(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	memset(pmd->remote_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+	return 0;
+}
+
+int
+memif_connect_slave(struct rte_eth_dev *dev)
+{
+	int sockfd;
+	int ret;
+	struct sockaddr_un sun;
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	memset(pmd->local_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	memset(pmd->remote_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
+	pmd->flags &= ~ETH_MEMIF_FLAG_DISABLED;
+
+	sockfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+	if (sockfd < 0) {
+		MIF_LOG(ERR, "%s: Failed to open socket.",
+			rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	sun.sun_family = AF_UNIX;
+
+	memcpy(sun.sun_path, pmd->socket_filename, sizeof(sun.sun_path) - 1);
+
+	ret = connect(sockfd, (struct sockaddr *)&sun,
+		      sizeof(struct sockaddr_un));
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to connect socket: %s.",
+			rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+		goto error;
+	}
+
+	MIF_LOG(DEBUG, "%s: Memif socket: %s connected.",
+		rte_vdev_device_name(pmd->vdev), pmd->socket_filename);
+
+	pmd->cc = rte_zmalloc("memif-cc",
+			      sizeof(struct memif_control_channel), 0);
+	if (pmd->cc == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate control channel.",
+			rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	pmd->cc->intr_handle.fd = sockfd;
+	pmd->cc->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	pmd->cc->socket = NULL;
+	pmd->cc->dev = dev;
+	TAILQ_INIT(&pmd->cc->msg_queue);
+
+	ret = rte_intr_callback_register(&pmd->cc->intr_handle,
+					 memif_intr_handler, pmd->cc);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to register interrupt callback "
+			"for control fd", rte_vdev_device_name(pmd->vdev));
+		goto error;
+	}
+
+	return 0;
+
+ error:
+	if (sockfd > 0) {
+		close(sockfd);
+		sockfd = -1;
+	}
+	if (pmd->cc != NULL) {
+		rte_free(pmd->cc);
+		pmd->cc = NULL;
+	}
+	return -1;
+}
diff --git a/drivers/net/memif/memif_socket.h b/drivers/net/memif/memif_socket.h
new file mode 100644
index 000000000..db293e200
--- /dev/null
+++ b/drivers/net/memif/memif_socket.h
@@ -0,0 +1,105 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _MEMIF_SOCKET_H_
+#define _MEMIF_SOCKET_H_
+
+#include <sys/queue.h>
+
+/**
+ * Remove device from socket device list. If no device is left on the socket,
+ * remove the socket as well.
+ *
+ * @param dev
+ *   memif device
+ */
+void memif_socket_remove_device(struct rte_eth_dev *dev);
+
+/**
+ * Enqueue disconnect message to control channel message queue.
+ *
+ * @param cc
+ *   control channel
+ * @param reason
+ *   const string stating disconnect reason (96 characters)
+ * @param err_code
+ *   error code
+ */
+void memif_msg_enq_disconnect(struct memif_control_channel *cc, const char *reason,
+			      int err_code);
+
+/**
+ * Initialize memif socket for specified device. If socket doesn't exist, create socket.
+ *
+ * @param dev
+ *   memif device
+ * @param socket_filename
+ *   socket filename
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_socket_init(struct rte_eth_dev *dev, const char *socket_filename);
+
+/**
+ * Disconnect memif device. Close control channel and shared memory.
+ *
+ * @param dev
+ *   memif device
+ */
+void memif_disconnect(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, enable connection establishment.
+ *
+ * @param dev
+ *   memif device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_master(struct rte_eth_dev *dev);
+
+/**
+ * If device is properly configured, send connection request.
+ *
+ * @param dev
+ *   memif device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect_slave(struct rte_eth_dev *dev);
+
+struct memif_socket_dev_list_elt {
+	TAILQ_ENTRY(memif_socket_dev_list_elt) next;
+	struct rte_eth_dev *dev;		/**< pointer to device internals */
+	char dev_name[RTE_ETH_NAME_MAX_LEN];
+};
+
+#define MEMIF_SOCKET_HASH_NAME			"memif-sh"
+struct memif_socket {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	char filename[256];			/**< socket filename */
+
+	TAILQ_HEAD(, memif_socket_dev_list_elt) dev_queue;
+	/**< Queue of devices using this socket */
+	uint8_t listener;			/**< if not zero socket is listener */
+};
+
+/* Control message queue. */
+struct memif_msg_queue_elt {
+	memif_msg_t msg;			/**< control message */
+	TAILQ_ENTRY(memif_msg_queue_elt) next;
+	int fd;					/**< fd to be sent to peer */
+};
+
+struct memif_control_channel {
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+	TAILQ_HEAD(, memif_msg_queue_elt) msg_queue; /**< control message queue */
+	struct memif_socket *socket;		/**< pointer to socket */
+	struct rte_eth_dev *dev;		/**< pointer to device */
+};
+
+#endif				/* MEMIF_SOCKET_H */
diff --git a/drivers/net/memif/meson.build b/drivers/net/memif/meson.build
new file mode 100644
index 000000000..287a30ef2
--- /dev/null
+++ b/drivers/net/memif/meson.build
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+
+if host_machine.system() != 'linux'
+        build = false
+endif
+
+sources = files('rte_eth_memif.c',
+		'memif_socket.c')
+
+allow_experimental_apis = true
+# Experimantal APIs:
+# - rte_intr_callback_unregister_pending
+
+deps += ['hash']
diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
new file mode 100644
index 000000000..b9f05a624
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.c
@@ -0,0 +1,1204 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#include <stdlib.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <linux/if_ether.h>
+#include <errno.h>
+#include <sys/eventfd.h>
+
+#include <rte_version.h>
+#include <rte_mbuf.h>
+#include <rte_ether.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_string_fns.h>
+
+#include "rte_eth_memif.h"
+#include "memif_socket.h"
+
+#define ETH_MEMIF_ID_ARG		"id"
+#define ETH_MEMIF_ROLE_ARG		"role"
+#define ETH_MEMIF_PKT_BUFFER_SIZE_ARG	"bsize"
+#define ETH_MEMIF_RING_SIZE_ARG		"rsize"
+#define ETH_MEMIF_SOCKET_ARG		"socket"
+#define ETH_MEMIF_MAC_ARG		"mac"
+#define ETH_MEMIF_ZC_ARG		"zero-copy"
+#define ETH_MEMIF_SECRET_ARG		"secret"
+
+static const char * const valid_arguments[] = {
+	ETH_MEMIF_ID_ARG,
+	ETH_MEMIF_ROLE_ARG,
+	ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
+	ETH_MEMIF_RING_SIZE_ARG,
+	ETH_MEMIF_SOCKET_ARG,
+	ETH_MEMIF_MAC_ARG,
+	ETH_MEMIF_ZC_ARG,
+	ETH_MEMIF_SECRET_ARG,
+	NULL
+};
+
+const char *
+memif_version(void)
+{
+	return ("memif-" RTE_STR(MEMIF_VERSION_MAJOR) "." RTE_STR(MEMIF_VERSION_MINOR));
+}
+
+static void
+memif_dev_info(struct rte_eth_dev *dev __rte_unused, struct rte_eth_dev_info *dev_info)
+{
+	dev_info->max_mac_addrs = 1;
+	dev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN;
+	dev_info->max_rx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->max_tx_queues = ETH_MEMIF_MAX_NUM_Q_PAIRS;
+	dev_info->min_rx_bufsize = 0;
+}
+
+static memif_ring_t *
+memif_get_ring(struct pmd_internals *pmd, memif_ring_type_t type, uint16_t ring_num)
+{
+	/* rings only in region 0 */
+	void *p = pmd->regions[0]->addr;
+	int ring_size = sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size);
+
+	p = (uint8_t *)p + (ring_num + type * pmd->run.num_s2m_rings) * ring_size;
+
+	return (memif_ring_t *)p;
+}
+
+static void *
+memif_get_buffer(struct pmd_internals *pmd, memif_desc_t *d)
+{
+	return ((uint8_t *)pmd->regions[d->region]->addr + d->offset);
+}
+
+static int
+memif_pktmbuf_chain(struct rte_mbuf *head, struct rte_mbuf *cur_tail,
+		    struct rte_mbuf *tail)
+{
+	/* Check for number-of-segments-overflow */
+	if (unlikely(head->nb_segs + tail->nb_segs > RTE_MBUF_MAX_NB_SEGS))
+		return -EOVERFLOW;
+
+	/* Chain 'tail' onto the old tail */
+	cur_tail->next = tail;
+
+	/* accumulate number of segments and total length. */
+	head->nb_segs = (uint16_t)(head->nb_segs + tail->nb_segs);
+
+	tail->pkt_len = tail->data_len;
+	head->pkt_len += tail->pkt_len;
+
+	return 0;
+}
+
+static uint16_t
+eth_memif_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t cur_slot, last_slot, n_slots, ring_size, mask, s0;
+	uint16_t n_rx_pkts = 0;
+	uint16_t mbuf_size = rte_pktmbuf_data_room_size(mq->mempool) -
+		RTE_PKTMBUF_HEADROOM;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf, *mbuf_head, *mbuf_tail;
+	uint64_t b;
+	ssize_t size __rte_unused;
+	uint16_t head;
+	int ret;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	/* consume interrupt */
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0)
+		size = read(mq->intr_handle.fd, &b, sizeof(b));
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	cur_slot = (type == MEMIF_RING_S2M) ? mq->last_head : mq->last_tail;
+	last_slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+	if (cur_slot == last_slot)
+		goto refill;
+	n_slots = last_slot - cur_slot;
+
+	while (n_slots && n_rx_pkts < nb_pkts) {
+		mbuf_head = rte_pktmbuf_alloc(mq->mempool);
+		if (unlikely(mbuf_head == NULL))
+			goto no_free_bufs;
+		mbuf = mbuf_head;
+		mbuf->port = mq->in_port;
+
+next_slot:
+		s0 = cur_slot & mask;
+		d0 = &ring->desc[s0];
+
+		src_len = d0->length;
+		dst_off = 0;
+		src_off = 0;
+
+		do {
+			dst_len = mbuf_size - dst_off;
+			if (dst_len == 0) {
+				dst_off = 0;
+				dst_len = mbuf_size;
+
+				/* store pointer to tail */
+				mbuf_tail = mbuf;
+				mbuf = rte_pktmbuf_alloc(mq->mempool);
+				if (unlikely(mbuf == NULL))
+					goto no_free_bufs;
+				mbuf->port = mq->in_port;
+				ret = memif_pktmbuf_chain(mbuf_head, mbuf_tail, mbuf);
+				if (unlikely(ret < 0)) {
+					MIF_LOG(ERR, "%s: number-of-segments-overflow",
+						rte_vdev_device_name(pmd->vdev));
+					rte_pktmbuf_free(mbuf);
+					goto no_free_bufs;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			rte_pktmbuf_data_len(mbuf) += cp_len;
+			rte_pktmbuf_pkt_len(mbuf) = rte_pktmbuf_data_len(mbuf);
+			if (mbuf != mbuf_head)
+				rte_pktmbuf_pkt_len(mbuf_head) += cp_len;
+
+			memcpy(rte_pktmbuf_mtod_offset(mbuf, void *, dst_off),
+			       (uint8_t *)memif_get_buffer(pmd, d0) + src_off, cp_len);
+
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+		} while (src_len);
+
+		cur_slot++;
+		n_slots--;
+
+		if (d0->flags & MEMIF_DESC_FLAG_NEXT)
+			goto next_slot;
+
+		mq->n_bytes += rte_pktmbuf_pkt_len(mbuf_head);
+		*bufs++ = mbuf_head;
+		n_rx_pkts++;
+	}
+
+no_free_bufs:
+	if (type == MEMIF_RING_S2M) {
+		rte_mb();
+		ring->tail = cur_slot;
+		mq->last_head = cur_slot;
+	} else {
+		mq->last_tail = cur_slot;
+	}
+
+refill:
+	if (type == MEMIF_RING_M2S) {
+		head = ring->head;
+		n_slots = ring_size - head + mq->last_tail;
+
+		while (n_slots--) {
+			s0 = head++ & mask;
+			d0 = &ring->desc[s0];
+			d0->length = pmd->run.pkt_buffer_size;
+		}
+		rte_mb();
+		ring->head = head;
+	}
+
+	mq->n_pkts += n_rx_pkts;
+	return n_rx_pkts;
+}
+
+static uint16_t
+eth_memif_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct memif_queue *mq = queue;
+	struct pmd_internals *pmd = mq->pmd;
+	memif_ring_t *ring = mq->ring;
+	uint16_t slot, saved_slot, n_free, ring_size, mask, n_tx_pkts = 0;
+	uint16_t src_len, src_off, dst_len, dst_off, cp_len;
+	memif_ring_type_t type = mq->type;
+	memif_desc_t *d0;
+	struct rte_mbuf *mbuf;
+	struct rte_mbuf *mbuf_head;
+	uint64_t a;
+	ssize_t size;
+
+	if (unlikely((pmd->flags & ETH_MEMIF_FLAG_CONNECTED) == 0))
+		return 0;
+	if (unlikely(ring == NULL))
+		return 0;
+
+	ring_size = 1 << mq->log2_ring_size;
+	mask = ring_size - 1;
+
+	n_free = ring->tail - mq->last_tail;
+	mq->last_tail += n_free;
+	slot = (type == MEMIF_RING_S2M) ? ring->head : ring->tail;
+
+	if (type == MEMIF_RING_S2M)
+		n_free = ring_size - ring->head + mq->last_tail;
+	else
+		n_free = ring->head - ring->tail;
+
+	while (n_tx_pkts < nb_pkts && n_free) {
+		mbuf_head = *bufs++;
+		mbuf = mbuf_head;
+
+		saved_slot = slot;
+		d0 = &ring->desc[slot & mask];
+		dst_off = 0;
+		dst_len = (type == MEMIF_RING_S2M) ?
+			pmd->run.pkt_buffer_size : d0->length;
+
+next_in_chain:
+		src_off = 0;
+		src_len = rte_pktmbuf_data_len(mbuf);
+
+		while (src_len) {
+			if (dst_len == 0) {
+				if (n_free) {
+					slot++;
+					n_free--;
+					d0->flags |= MEMIF_DESC_FLAG_NEXT;
+					d0 = &ring->desc[slot & mask];
+					dst_off = 0;
+					dst_len = (type == MEMIF_RING_S2M) ?
+					    pmd->run.pkt_buffer_size : d0->length;
+					d0->flags = 0;
+				} else {
+					slot = saved_slot;
+					goto no_free_slots;
+				}
+			}
+			cp_len = RTE_MIN(dst_len, src_len);
+
+			memcpy((uint8_t *)memif_get_buffer(pmd, d0) + dst_off,
+			       rte_pktmbuf_mtod_offset(mbuf, void *, src_off),
+			       cp_len);
+
+			mq->n_bytes += cp_len;
+			src_off += cp_len;
+			dst_off += cp_len;
+			src_len -= cp_len;
+			dst_len -= cp_len;
+
+			d0->length = dst_off;
+		}
+
+		if (rte_pktmbuf_is_contiguous(mbuf) == 0) {
+			mbuf = mbuf->next;
+			goto next_in_chain;
+		}
+
+		n_tx_pkts++;
+		slot++;
+		n_free--;
+		rte_pktmbuf_free(mbuf_head);
+	}
+
+no_free_slots:
+	rte_mb();
+	if (type == MEMIF_RING_S2M)
+		ring->head = slot;
+	else
+		ring->tail = slot;
+
+	if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) {
+		a = 1;
+		size = write(mq->intr_handle.fd, &a, sizeof(a));
+		if (unlikely(size < 0)) {
+			MIF_LOG(WARNING,
+				"%s: Failed to send interrupt. %s",
+				rte_vdev_device_name(pmd->vdev), strerror(errno));
+		}
+	}
+
+	mq->n_err += nb_pkts - n_tx_pkts;
+	mq->n_pkts += n_tx_pkts;
+	return n_tx_pkts;
+}
+
+void
+memif_free_regions(struct pmd_internals *pmd)
+{
+	int i;
+	struct memif_region *r;
+
+	/* regions are allocated contiguously, so it's
+	 * enough to loop until 'pmd->regions_num'
+	 */
+	for (i = 0; i < pmd->regions_num; i++) {
+		r = pmd->regions[i];
+		if (r != NULL) {
+			if (r->addr != NULL) {
+				munmap(r->addr, r->region_size);
+				if (r->fd > 0) {
+					close(r->fd);
+					r->fd = -1;
+				}
+			}
+			rte_free(r);
+			pmd->regions[i] = NULL;
+		}
+	}
+	pmd->regions_num = 0;
+}
+
+static int
+memif_region_init_shm(struct pmd_internals *pmd, uint8_t has_buffers)
+{
+	char shm_name[ETH_MEMIF_SHM_NAME_SIZE];
+	int ret = 0;
+	struct memif_region *r;
+
+	if (pmd->regions_num >= ETH_MEMIF_MAX_REGION_NUM) {
+		MIF_LOG(ERR, "%s: Too many regions.", rte_vdev_device_name(pmd->vdev));
+		return -1;
+	}
+
+	r = rte_zmalloc("region", sizeof(struct memif_region), 0);
+	if (r == NULL) {
+		MIF_LOG(ERR, "%s: Failed to alloc memif region.",
+			rte_vdev_device_name(pmd->vdev));
+		return -ENOMEM;
+	}
+
+	/* calculate buffer offset */
+	r->pkt_buffer_offset = (pmd->run.num_s2m_rings + pmd->run.num_m2s_rings) *
+	    (sizeof(memif_ring_t) + sizeof(memif_desc_t) *
+	    (1 << pmd->run.log2_ring_size));
+
+	r->region_size = r->pkt_buffer_offset;
+	/* if region has buffers, add buffers size to region_size */
+	if (has_buffers == 1)
+		r->region_size += (uint32_t)(pmd->run.pkt_buffer_size *
+			(1 << pmd->run.log2_ring_size) *
+			(pmd->run.num_s2m_rings +
+			 pmd->run.num_m2s_rings));
+
+	memset(shm_name, 0, sizeof(char) * ETH_MEMIF_SHM_NAME_SIZE);
+	snprintf(shm_name, ETH_MEMIF_SHM_NAME_SIZE, "memif_region_%d",
+		 pmd->regions_num);
+
+	r->fd = memfd_create(shm_name, MFD_ALLOW_SEALING);
+	if (r->fd < 0) {
+		MIF_LOG(ERR, "%s: Failed to create shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		ret = -1;
+		goto error;
+	}
+
+	ret = fcntl(r->fd, F_ADD_SEALS, F_SEAL_SHRINK);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to add seals to shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		goto error;
+	}
+
+	ret = ftruncate(r->fd, r->region_size);
+	if (ret < 0) {
+		MIF_LOG(ERR, "%s: Failed to truncate shm file: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(errno));
+		goto error;
+	}
+
+	r->addr = mmap(NULL, r->region_size, PROT_READ |
+		       PROT_WRITE, MAP_SHARED, r->fd, 0);
+	if (r->addr == MAP_FAILED) {
+		MIF_LOG(ERR, "%s: Failed to mmap shm region: %s.",
+			rte_vdev_device_name(pmd->vdev),
+			strerror(ret));
+		ret = -1;
+		goto error;
+	}
+
+	pmd->regions[pmd->regions_num] = r;
+	pmd->regions_num++;
+
+	return ret;
+
+error:
+	if (r->fd > 0)
+		close(r->fd);
+	r->fd = -1;
+
+	return ret;
+}
+
+static int
+memif_regions_init(struct pmd_internals *pmd)
+{
+	int ret;
+
+	/* create one buffer region */
+	ret = memif_region_init_shm(pmd, /* has buffer */ 1);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+static void
+memif_init_rings(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	memif_ring_t *ring;
+	int i, j;
+	uint16_t slot;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = i * (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 0;
+			ring->desc[j].offset = pmd->regions[0]->pkt_buffer_offset +
+				(uint32_t)(slot * pmd->run.pkt_buffer_size);
+			ring->desc[j].length = pmd->run.pkt_buffer_size;
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		ring->head = 0;
+		ring->tail = 0;
+		ring->cookie = MEMIF_COOKIE;
+		ring->flags = 0;
+		for (j = 0; j < (1 << pmd->run.log2_ring_size); j++) {
+			slot = (i + pmd->run.num_s2m_rings) *
+			    (1 << pmd->run.log2_ring_size) + j;
+			ring->desc[j].region = 0;
+			ring->desc[j].offset = pmd->regions[0]->pkt_buffer_offset +
+				(uint32_t)(slot * pmd->run.pkt_buffer_size);
+			ring->desc[j].length = pmd->run.pkt_buffer_size;
+		}
+	}
+}
+
+/* called only by slave */
+static void
+memif_init_queues(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = dev->data->tx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_S2M, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->ring_offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for tx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = dev->data->rx_queues[i];
+		mq->ring = memif_get_ring(pmd, MEMIF_RING_M2S, i);
+		mq->log2_ring_size = pmd->run.log2_ring_size;
+		/* queues located only in region 0 */
+		mq->region = 0;
+		mq->ring_offset = (uint8_t *)mq->ring - (uint8_t *)pmd->regions[0]->addr;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		mq->intr_handle.fd = eventfd(0, EFD_NONBLOCK);
+		if (mq->intr_handle.fd < 0) {
+			MIF_LOG(WARNING,
+				"%s: Failed to create eventfd for rx queue %d: %s.",
+				rte_vdev_device_name(pmd->vdev), i,
+				strerror(errno));
+		}
+	}
+}
+
+int
+memif_init_regions_and_queues(struct rte_eth_dev *dev)
+{
+	int ret;
+
+	ret = memif_regions_init(dev->data->dev_private);
+	if (ret < 0)
+		return ret;
+
+	memif_init_rings(dev);
+
+	memif_init_queues(dev);
+
+	return 0;
+}
+
+int
+memif_connect(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_region *mr;
+	struct memif_queue *mq;
+	int i;
+
+	for (i = 0; i < pmd->regions_num; i++) {
+		mr = pmd->regions[i];
+		if (mr != NULL) {
+			if (mr->addr == NULL) {
+				if (mr->fd < 0)
+					return -1;
+				mr->addr = mmap(NULL, mr->region_size,
+						PROT_READ | PROT_WRITE,
+						MAP_SHARED, mr->fd, 0);
+				if (mr->addr == NULL)
+					return -1;
+			}
+		}
+	}
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->tx_queues[i] : dev->data->rx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
+			    mq->ring_offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_MASTER)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ?
+		    dev->data->rx_queues[i] : dev->data->tx_queues[i];
+		mq->ring = (memif_ring_t *)((uint8_t *)pmd->regions[mq->region]->addr +
+			    mq->ring_offset);
+		if (mq->ring->cookie != MEMIF_COOKIE) {
+			MIF_LOG(ERR, "%s: Wrong cookie",
+				rte_vdev_device_name(pmd->vdev));
+			return -1;
+		}
+		mq->ring->head = 0;
+		mq->ring->tail = 0;
+		mq->last_head = 0;
+		mq->last_tail = 0;
+		/* enable polling mode */
+		if (pmd->role == MEMIF_ROLE_SLAVE)
+			mq->ring->flags = MEMIF_RING_FLAG_MASK_INT;
+	}
+
+	pmd->flags &= ~ETH_MEMIF_FLAG_CONNECTING;
+	pmd->flags |= ETH_MEMIF_FLAG_CONNECTED;
+	dev->data->dev_link.link_status = ETH_LINK_UP;
+	MIF_LOG(INFO, "%s: Connected.", rte_vdev_device_name(pmd->vdev));
+	return 0;
+}
+
+static int
+memif_dev_start(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int ret = 0;
+
+	switch (pmd->role) {
+	case MEMIF_ROLE_SLAVE:
+		ret = memif_connect_slave(dev);
+		break;
+	case MEMIF_ROLE_MASTER:
+		ret = memif_connect_master(dev);
+		break;
+	default:
+		MIF_LOG(ERR, "%s: Unknown role: %d.",
+			rte_vdev_device_name(pmd->vdev), pmd->role);
+		ret = -1;
+		break;
+	}
+
+	return ret;
+}
+
+static void
+memif_dev_close(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int i;
+
+	memif_msg_enq_disconnect(pmd->cc, "Device closed", 0);
+	memif_disconnect(dev);
+
+	for (i = 0; i < dev->data->nb_rx_queues; i++)
+		(*dev->dev_ops->rx_queue_release)(dev->data->rx_queues[i]);
+	for (i = 0; i < dev->data->nb_tx_queues; i++)
+		(*dev->dev_ops->tx_queue_release)(dev->data->tx_queues[i]);
+
+	memif_socket_remove_device(dev);
+}
+
+static int
+memif_dev_configure(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	/*
+	 * SLAVE - TXQ
+	 * MASTER - RXQ
+	 */
+	pmd->cfg.num_s2m_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_tx_queues : dev->data->nb_rx_queues;
+
+	/*
+	 * SLAVE - RXQ
+	 * MASTER - TXQ
+	 */
+	pmd->cfg.num_m2s_rings = (pmd->role == MEMIF_ROLE_SLAVE) ?
+				  dev->data->nb_rx_queues : dev->data->nb_tx_queues;
+
+	return 0;
+}
+
+static int
+memif_tx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_tx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("tx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate tx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type =
+	    (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_S2M : MEMIF_RING_M2S;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->pmd = pmd;
+	dev->data->tx_queues[qid] = mq;
+
+	return 0;
+}
+
+static int
+memif_rx_queue_setup(struct rte_eth_dev *dev,
+		     uint16_t qid,
+		     uint16_t nb_rx_desc __rte_unused,
+		     unsigned int socket_id __rte_unused,
+		     const struct rte_eth_rxconf *rx_conf __rte_unused,
+		     struct rte_mempool *mb_pool)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+
+	mq = rte_zmalloc("rx-queue", sizeof(struct memif_queue), 0);
+	if (mq == NULL) {
+		MIF_LOG(ERR, "%s: Failed to allocate rx queue id: %u",
+			rte_vdev_device_name(pmd->vdev), qid);
+		return -ENOMEM;
+	}
+
+	mq->type = (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_M2S : MEMIF_RING_S2M;
+	mq->n_pkts = 0;
+	mq->n_bytes = 0;
+	mq->n_err = 0;
+	mq->intr_handle.fd = -1;
+	mq->intr_handle.type = RTE_INTR_HANDLE_EXT;
+	mq->mempool = mb_pool;
+	mq->in_port = dev->data->port_id;
+	mq->pmd = pmd;
+	dev->data->rx_queues[qid] = mq;
+
+	return 0;
+}
+
+static void
+memif_queue_release(void *queue)
+{
+	struct memif_queue *mq = (struct memif_queue *)queue;
+
+	if (!mq)
+		return;
+
+	rte_free(mq);
+}
+
+static int
+memif_link_update(struct rte_eth_dev *dev __rte_unused,
+		  int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
+static int
+memif_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	struct memif_queue *mq;
+	int i;
+	uint8_t tmp, nq;
+
+	stats->ipackets = 0;
+	stats->ibytes = 0;
+	stats->opackets = 0;
+	stats->obytes = 0;
+	stats->oerrors = 0;
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_s2m_rings :
+	    pmd->run.num_m2s_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* RX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->rx_queues[i];
+		stats->q_ipackets[i] = mq->n_pkts;
+		stats->q_ibytes[i] = mq->n_bytes;
+		stats->ipackets += mq->n_pkts;
+		stats->ibytes += mq->n_bytes;
+	}
+
+	tmp = (pmd->role == MEMIF_ROLE_SLAVE) ? pmd->run.num_m2s_rings :
+	    pmd->run.num_s2m_rings;
+	nq = (tmp < RTE_ETHDEV_QUEUE_STAT_CNTRS) ? tmp :
+	    RTE_ETHDEV_QUEUE_STAT_CNTRS;
+
+	/* TX stats */
+	for (i = 0; i < nq; i++) {
+		mq = dev->data->tx_queues[i];
+		stats->q_opackets[i] = mq->n_pkts;
+		stats->q_obytes[i] = mq->n_bytes;
+		stats->opackets += mq->n_pkts;
+		stats->obytes += mq->n_bytes;
+		stats->oerrors += mq->n_err;
+	}
+	return 0;
+}
+
+static void
+memif_stats_reset(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+	int i;
+	struct memif_queue *mq;
+
+	for (i = 0; i < pmd->run.num_s2m_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->tx_queues[i] :
+		    dev->data->rx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+	for (i = 0; i < pmd->run.num_m2s_rings; i++) {
+		mq = (pmd->role == MEMIF_ROLE_SLAVE) ? dev->data->rx_queues[i] :
+		    dev->data->tx_queues[i];
+		mq->n_pkts = 0;
+		mq->n_bytes = 0;
+		mq->n_err = 0;
+	}
+}
+
+static int
+memif_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd = dev->data->dev_private;
+
+	MIF_LOG(WARNING, "%s: Interrupt mode not supported.",
+		rte_vdev_device_name(pmd->vdev));
+
+	return -1;
+}
+
+static int
+memif_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t qid __rte_unused)
+{
+	struct pmd_internals *pmd __rte_unused = dev->data->dev_private;
+
+	return 0;
+}
+
+static const struct eth_dev_ops ops = {
+	.dev_start = memif_dev_start,
+	.dev_close = memif_dev_close,
+	.dev_infos_get = memif_dev_info,
+	.dev_configure = memif_dev_configure,
+	.tx_queue_setup = memif_tx_queue_setup,
+	.rx_queue_setup = memif_rx_queue_setup,
+	.rx_queue_release = memif_queue_release,
+	.tx_queue_release = memif_queue_release,
+	.rx_queue_intr_enable = memif_rx_queue_intr_enable,
+	.rx_queue_intr_disable = memif_rx_queue_intr_disable,
+	.link_update = memif_link_update,
+	.stats_get = memif_stats_get,
+	.stats_reset = memif_stats_reset,
+};
+
+static int
+memif_create(struct rte_vdev_device *vdev, enum memif_role_t role,
+	     memif_interface_id_t id, uint32_t flags,
+	     const char *socket_filename,
+	     memif_log2_ring_size_t log2_ring_size,
+	     uint16_t pkt_buffer_size, const char *secret,
+	     struct rte_ether_addr *ether_addr)
+{
+	int ret = 0;
+	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev_data *data;
+	struct pmd_internals *pmd;
+	const unsigned int numa_node = vdev->device.numa_node;
+	const char *name = rte_vdev_device_name(vdev);
+
+	if (flags & ETH_MEMIF_FLAG_ZERO_COPY) {
+		MIF_LOG(ERR, "Zero-copy slave not supported.");
+		return -1;
+	}
+
+	eth_dev = rte_eth_vdev_allocate(vdev, sizeof(*pmd));
+	if (eth_dev == NULL) {
+		MIF_LOG(ERR, "%s: Unable to allocate device struct.", name);
+		return -1;
+	}
+
+	pmd = eth_dev->data->dev_private;
+	memset(pmd, 0, sizeof(*pmd));
+
+	pmd->vdev = vdev;
+	pmd->id = id;
+	pmd->flags = flags;
+	pmd->flags |= ETH_MEMIF_FLAG_DISABLED;
+	pmd->role = role;
+
+	ret = memif_socket_init(eth_dev, socket_filename);
+	if (ret < 0)
+		return ret;
+
+	memset(pmd->secret, 0, sizeof(char) * ETH_MEMIF_SECRET_SIZE);
+	if (secret != NULL)
+		strlcpy(pmd->secret, secret, sizeof(pmd->secret));
+
+	pmd->cfg.log2_ring_size = log2_ring_size;
+	/* set in .dev_configure() */
+	pmd->cfg.num_s2m_rings = 0;
+	pmd->cfg.num_m2s_rings = 0;
+
+	pmd->cfg.pkt_buffer_size = pkt_buffer_size;
+
+	data = eth_dev->data;
+	data->dev_private = pmd;
+	data->numa_node = numa_node;
+	data->mac_addrs = ether_addr;
+
+	eth_dev->dev_ops = &ops;
+	eth_dev->device = &vdev->device;
+	eth_dev->rx_pkt_burst = eth_memif_rx;
+	eth_dev->tx_pkt_burst = eth_memif_tx;
+
+	eth_dev->data->dev_flags |= RTE_ETH_DEV_CLOSE_REMOVE;
+
+	rte_eth_dev_probing_finish(eth_dev);
+
+	return 0;
+}
+
+static int
+memif_set_role(const char *key __rte_unused, const char *value,
+	       void *extra_args)
+{
+	enum memif_role_t *role = (enum memif_role_t *)extra_args;
+
+	if (strstr(value, "master") != NULL) {
+		*role = MEMIF_ROLE_MASTER;
+	} else if (strstr(value, "slave") != NULL) {
+		*role = MEMIF_ROLE_SLAVE;
+	} else {
+		MIF_LOG(ERR, "Unknown role: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_zc(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	uint32_t *flags = (uint32_t *)extra_args;
+
+	if (strstr(value, "yes") != NULL) {
+		*flags |= ETH_MEMIF_FLAG_ZERO_COPY;
+	} else if (strstr(value, "no") != NULL) {
+		*flags &= ~ETH_MEMIF_FLAG_ZERO_COPY;
+	} else {
+		MIF_LOG(ERR, "Failed to parse zero-copy param: %s.", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int
+memif_set_id(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	memif_interface_id_t *id = (memif_interface_id_t *)extra_args;
+
+	/* even if parsing fails, 0 is a valid id */
+	*id = strtoul(value, NULL, 10);
+	return 0;
+}
+
+static int
+memif_set_bs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	uint16_t *pkt_buffer_size = (uint16_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > 0xFFFF) {
+		MIF_LOG(ERR, "Invalid buffer size: %s.", value);
+		return -EINVAL;
+	}
+	*pkt_buffer_size = tmp;
+	return 0;
+}
+
+static int
+memif_set_rs(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	unsigned long tmp;
+	memif_log2_ring_size_t *log2_ring_size =
+	    (memif_log2_ring_size_t *)extra_args;
+
+	tmp = strtoul(value, NULL, 10);
+	if (tmp == 0 || tmp > ETH_MEMIF_MAX_LOG2_RING_SIZE) {
+		MIF_LOG(ERR, "Invalid ring size: %s (max %u).",
+			value, ETH_MEMIF_MAX_LOG2_RING_SIZE);
+		return -EINVAL;
+	}
+	*log2_ring_size = tmp;
+	return 0;
+}
+
+/* check if directory exists and if we have permission to read/write */
+static int
+memif_check_socket_filename(const char *filename)
+{
+	char *dir = NULL, *tmp;
+	uint32_t idx;
+	int ret = 0;
+
+	tmp = strrchr(filename, '/');
+	if (tmp != NULL) {
+		idx = tmp - filename;
+		dir = rte_zmalloc("memif_tmp", sizeof(char) * (idx + 1), 0);
+		if (dir == NULL) {
+			MIF_LOG(ERR, "Failed to allocate memory.");
+			return -1;
+		}
+		strlcpy(dir, filename, sizeof(char) * (idx + 1));
+	}
+
+	if (dir == NULL || (faccessat(-1, dir, F_OK | R_OK |
+					W_OK, AT_EACCESS) < 0)) {
+		MIF_LOG(ERR, "Invalid socket directory.");
+		ret = -EINVAL;
+	}
+
+	if (dir != NULL)
+		rte_free(dir);
+
+	return ret;
+}
+
+static int
+memif_set_socket_filename(const char *key __rte_unused, const char *value,
+			  void *extra_args)
+{
+	const char **socket_filename = (const char **)extra_args;
+
+	*socket_filename = value;
+	return memif_check_socket_filename(*socket_filename);
+}
+
+static int
+memif_set_mac(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	struct rte_ether_addr *ether_addr = (struct rte_ether_addr *)extra_args;
+	int ret = 0;
+
+	ret = sscanf(value, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
+	       &ether_addr->addr_bytes[0], &ether_addr->addr_bytes[1],
+	       &ether_addr->addr_bytes[2], &ether_addr->addr_bytes[3],
+	       &ether_addr->addr_bytes[4], &ether_addr->addr_bytes[5]);
+	if (ret != 6)
+		MIF_LOG(WARNING, "Failed to parse mac '%s'.", value);
+	return 0;
+}
+
+static int
+memif_set_secret(const char *key __rte_unused, const char *value, void *extra_args)
+{
+	const char **secret = (const char **)extra_args;
+
+	*secret = value;
+	return 0;
+}
+
+static int
+rte_pmd_memif_probe(struct rte_vdev_device *vdev)
+{
+	RTE_BUILD_BUG_ON(sizeof(memif_msg_t) != 128);
+	RTE_BUILD_BUG_ON(sizeof(memif_desc_t) != 16);
+	int ret = 0;
+	struct rte_kvargs *kvlist;
+	const char *name = rte_vdev_device_name(vdev);
+	enum memif_role_t role = MEMIF_ROLE_SLAVE;
+	memif_interface_id_t id = 0;
+	uint16_t pkt_buffer_size = ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE;
+	memif_log2_ring_size_t log2_ring_size = ETH_MEMIF_DEFAULT_RING_SIZE;
+	const char *socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME;
+	uint32_t flags = 0;
+	const char *secret = NULL;
+	struct rte_ether_addr *ether_addr = rte_zmalloc("",
+		sizeof(struct rte_ether_addr), 0);
+
+	rte_eth_random_addr(ether_addr->addr_bytes);
+
+	MIF_LOG(INFO, "Initialize MEMIF: %s.", name);
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		MIF_LOG(ERR, "Multi-processing not supported for memif.");
+		/* TODO:
+		 * Request connection information.
+		 *
+		 * Once memif in the primary process is connected,
+		 * broadcast connection information.
+		 */
+		return -1;
+	}
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(vdev), valid_arguments);
+
+	/* parse parameters */
+	if (kvlist != NULL) {
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ROLE_ARG,
+					 &memif_set_role, &role);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ID_ARG,
+					 &memif_set_id, &id);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_PKT_BUFFER_SIZE_ARG,
+					 &memif_set_bs, &pkt_buffer_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_RING_SIZE_ARG,
+					 &memif_set_rs, &log2_ring_size);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SOCKET_ARG,
+					 &memif_set_socket_filename,
+					 (void *)(&socket_filename));
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_MAC_ARG,
+					 &memif_set_mac, ether_addr);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_ZC_ARG,
+					 &memif_set_zc, &flags);
+		if (ret < 0)
+			goto exit;
+		ret = rte_kvargs_process(kvlist, ETH_MEMIF_SECRET_ARG,
+					 &memif_set_secret, (void *)(&secret));
+		if (ret < 0)
+			goto exit;
+	}
+
+	/* create interface */
+	ret = memif_create(vdev, role, id, flags, socket_filename,
+			   log2_ring_size, pkt_buffer_size, secret, ether_addr);
+
+exit:
+	if (kvlist != NULL)
+		rte_kvargs_free(kvlist);
+	return ret;
+}
+
+static int
+rte_pmd_memif_remove(struct rte_vdev_device *vdev)
+{
+	struct rte_eth_dev *eth_dev;
+
+	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+	if (eth_dev == NULL)
+		return 0;
+
+	rte_eth_dev_close(eth_dev->data->port_id);
+
+	return 0;
+}
+
+static struct rte_vdev_driver pmd_memif_drv = {
+	.probe = rte_pmd_memif_probe,
+	.remove = rte_pmd_memif_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_memif, pmd_memif_drv);
+
+RTE_PMD_REGISTER_PARAM_STRING(net_memif,
+			      ETH_MEMIF_ID_ARG "=<int>"
+			      ETH_MEMIF_ROLE_ARG "=master|slave"
+			      ETH_MEMIF_PKT_BUFFER_SIZE_ARG "=<int>"
+			      ETH_MEMIF_RING_SIZE_ARG "=<int>"
+			      ETH_MEMIF_SOCKET_ARG "=<string>"
+			      ETH_MEMIF_MAC_ARG "=xx:xx:xx:xx:xx:xx"
+			      ETH_MEMIF_ZC_ARG "=yes|no"
+			      ETH_MEMIF_SECRET_ARG "=<string>");
+
+int memif_logtype;
+
+RTE_INIT(memif_init_log)
+{
+	memif_logtype = rte_log_register("pmd.net.memif");
+	if (memif_logtype >= 0)
+		rte_log_set_level(memif_logtype, RTE_LOG_NOTICE);
+}
diff --git a/drivers/net/memif/rte_eth_memif.h b/drivers/net/memif/rte_eth_memif.h
new file mode 100644
index 000000000..5f631e925
--- /dev/null
+++ b/drivers/net/memif/rte_eth_memif.h
@@ -0,0 +1,212 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
+ */
+
+#ifndef _RTE_ETH_MEMIF_H_
+#define _RTE_ETH_MEMIF_H_
+
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE
+#endif				/* GNU_SOURCE */
+
+#include <sys/queue.h>
+
+#include <rte_ethdev_driver.h>
+#include <rte_ether.h>
+#include <rte_interrupts.h>
+
+#include "memif.h"
+
+#define ETH_MEMIF_DEFAULT_SOCKET_FILENAME	"/run/memif.sock"
+#define ETH_MEMIF_DEFAULT_RING_SIZE		10
+#define ETH_MEMIF_DEFAULT_PKT_BUFFER_SIZE	2048
+
+#define ETH_MEMIF_MAX_NUM_Q_PAIRS		255
+#define ETH_MEMIF_MAX_LOG2_RING_SIZE		14
+#define ETH_MEMIF_MAX_REGION_NUM		256
+
+#define ETH_MEMIF_SHM_NAME_SIZE			32
+#define ETH_MEMIF_DISC_STRING_SIZE		96
+#define ETH_MEMIF_SECRET_SIZE			24
+
+extern int memif_logtype;
+
+#define MIF_LOG(level, fmt, args...) \
+	rte_log(RTE_LOG_ ## level, memif_logtype, \
+		"%s(): " fmt "\n", __func__, ##args)
+
+enum memif_role_t {
+	MEMIF_ROLE_MASTER,
+	MEMIF_ROLE_SLAVE,
+};
+
+struct memif_region {
+	void *addr;				/**< shared memory address */
+	memif_region_size_t region_size;	/**< shared memory size */
+	int fd;					/**< shared memory file descriptor */
+	uint32_t pkt_buffer_offset;
+	/**< offset from 'addr' to first packet buffer */
+};
+
+struct memif_queue {
+	struct rte_mempool *mempool;		/**< mempool for RX packets */
+	struct pmd_internals *pmd;		/**< device internals */
+
+	memif_ring_type_t type;			/**< ring type */
+	memif_region_index_t region;		/**< shared memory region index */
+
+	uint16_t in_port;			/**< port id */
+
+	memif_region_offset_t ring_offset;
+	/**< ring offset from start of shm region (ring - memif_region.addr) */
+
+	uint16_t last_head;			/**< last ring head */
+	uint16_t last_tail;			/**< last ring tail */
+
+	/* rx/tx info */
+	uint64_t n_pkts;			/**< number of rx/tx packets */
+	uint64_t n_bytes;			/**< number of rx/tx bytes */
+	uint64_t n_err;				/**< number of tx errors */
+
+	memif_ring_t *ring;			/**< pointer to ring */
+
+	struct rte_intr_handle intr_handle;	/**< interrupt handle */
+
+	memif_log2_ring_size_t log2_ring_size;	/**< log2 of ring size */
+};
+
+struct pmd_internals {
+	memif_interface_id_t id;		/**< unique id */
+	enum memif_role_t role;			/**< device role */
+	uint32_t flags;				/**< device status flags */
+#define ETH_MEMIF_FLAG_CONNECTING	(1 << 0)
+/**< device is connecting */
+#define ETH_MEMIF_FLAG_CONNECTED	(1 << 1)
+/**< device is connected */
+#define ETH_MEMIF_FLAG_ZERO_COPY	(1 << 2)
+/**< device is zero-copy enabled */
+#define ETH_MEMIF_FLAG_DISABLED		(1 << 3)
+/**< device has not been configured and can not accept connection requests */
+
+	char *socket_filename;			/**< pointer to socket filename */
+	char secret[ETH_MEMIF_SECRET_SIZE]; /**< secret (optional security parameter) */
+
+	struct memif_control_channel *cc;	/**< control channel */
+
+	struct memif_region *regions[ETH_MEMIF_MAX_REGION_NUM];
+	/**< shared memory regions */
+	memif_region_index_t regions_num;	/**< number of regions */
+
+	/* remote info */
+	char remote_name[RTE_DEV_NAME_MAX_LEN];		/**< remote app name */
+	char remote_if_name[RTE_DEV_NAME_MAX_LEN];	/**< remote peer name */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t pkt_buffer_size;	/**< buffer size */
+	} cfg;					/**< Configured parameters (max values) */
+
+	struct {
+		memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */
+		uint8_t num_s2m_rings;		/**< number of slave to master rings */
+		uint8_t num_m2s_rings;		/**< number of master to slave rings */
+		uint16_t pkt_buffer_size;	/**< buffer size */
+	} run;
+	/**< Parameters used in active connection */
+
+	char local_disc_string[ETH_MEMIF_DISC_STRING_SIZE];
+	/**< local disconnect reason */
+	char remote_disc_string[ETH_MEMIF_DISC_STRING_SIZE];
+	/**< remote disconnect reason */
+
+	struct rte_vdev_device *vdev;		/**< vdev handle */
+};
+
+/**
+ * Unmap shared memory and free regions from memory.
+ *
+ * @param pmd
+ *   device internals
+ */
+void memif_free_regions(struct pmd_internals *pmd);
+
+/**
+ * Finalize connection establishment process. Map shared memory file
+ * (master role), initialize ring queue, set link status up.
+ *
+ * @param dev
+ *   memif device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_connect(struct rte_eth_dev *dev);
+
+/**
+ * Create shared memory file and initialize ring queue.
+ * Only called by slave when establishing connection
+ *
+ * @param dev
+ *   memif device
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int memif_init_regions_and_queues(struct rte_eth_dev *dev);
+
+/**
+ * Get memif version string.
+ *
+ * @return
+ *   - memif version string
+ */
+const char *memif_version(void);
+
+#ifndef MFD_HUGETLB
+#ifndef __NR_memfd_create
+
+#if defined __x86_64__
+#define __NR_memfd_create 319
+#elif defined __x86_32__
+#define __NR_memfd_create 1073742143
+#elif defined __arm__
+#define __NR_memfd_create 385
+#elif defined __aarch64__
+#define __NR_memfd_create 279
+#elif defined __powerpc__
+#define __NR_memfd_create 360
+#elif defined __i386__
+#define __NR_memfd_create 356
+#else
+#error "__NR_memfd_create unknown for this architecture"
+#endif
+
+#endif				/* __NR_memfd_create */
+
+static inline int memfd_create(const char *name, unsigned int flags)
+{
+	return syscall(__NR_memfd_create, name, flags);
+}
+#endif				/* MFD_HUGETLB */
+
+#ifndef F_LINUX_SPECIFIC_BASE
+#define F_LINUX_SPECIFIC_BASE 1024
+#endif
+
+#ifndef MFD_ALLOW_SEALING
+#define MFD_ALLOW_SEALING       0x0002U
+#endif
+
+#ifndef F_ADD_SEALS
+#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL     0x0001	/* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002	/* prevent file from shrinking */
+#define F_SEAL_GROW     0x0004	/* prevent file from growing */
+#define F_SEAL_WRITE    0x0008	/* prevent writes */
+#endif
+
+#endif				/* RTE_ETH_MEMIF_H */
diff --git a/drivers/net/memif/rte_pmd_memif_version.map b/drivers/net/memif/rte_pmd_memif_version.map
new file mode 100644
index 000000000..8861484fb
--- /dev/null
+++ b/drivers/net/memif/rte_pmd_memif_version.map
@@ -0,0 +1,4 @@
+DPDK_19.08 {
+
+        local: *;
+};
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index ed99896c3..b57073483 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -24,6 +24,7 @@ drivers = ['af_packet',
 	'ixgbe',
 	'kni',
 	'liquidio',
+	'memif',
 	'mlx4',
 	'mlx5',
 	'mvneta',
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 7c9b4b538..326b2a30b 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -174,6 +174,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_KNI),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KNI)        += -lrte_pmd_kni
 endif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LIO_PMD)        += -lrte_pmd_lio
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_MEMIF)      += -lrte_pmd_memif
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD)       += -lrte_pmd_mlx4
 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD)       += -lrte_pmd_mlx5 -lmnl
 ifeq ($(CONFIG_RTE_IBVERBS_LINK_DLOPEN),y)
--
2.17.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [dpdk-dev] [PATCH v12] net/memif: introduce memory interface (memif) PMD
  2019-06-06 11:38                 ` [dpdk-dev] [PATCH v12] " Jakub Grajciar
@ 2019-06-06 14:07                   ` Ferruh Yigit
  0 siblings, 0 replies; 58+ messages in thread
From: Ferruh Yigit @ 2019-06-06 14:07 UTC (permalink / raw)
  To: Jakub Grajciar, dev

On 6/6/2019 12:38 PM, Jakub Grajciar wrote:
> Shared memory packet interface (memif) PMD allows for DPDK and any other
> client using memif (DPDK, VPP, libmemif) to communicate using shared
> memory. The created device transmits packets in a raw format. It can be
> used with Ethernet mode, IP mode, or Punt/Inject. At this moment, only
> Ethernet mode is supported in DPDK memif implementation. Memif is Linux
> only.
> 
> Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
> ---
>  MAINTAINERS                                 |    6 +
>  config/common_base                          |    5 +
>  config/common_linux                         |    1 +
>  doc/guides/nics/features/memif.ini          |   14 +
>  doc/guides/nics/index.rst                   |    1 +
>  doc/guides/nics/memif.rst                   |  234 ++++
>  doc/guides/rel_notes/release_19_08.rst      |    5 +
>  drivers/net/Makefile                        |    1 +
>  drivers/net/memif/Makefile                  |   31 +
>  drivers/net/memif/memif.h                   |  179 +++
>  drivers/net/memif/memif_socket.c            | 1124 +++++++++++++++++
>  drivers/net/memif/memif_socket.h            |  105 ++
>  drivers/net/memif/meson.build               |   15 +
>  drivers/net/memif/rte_eth_memif.c           | 1204 +++++++++++++++++++
>  drivers/net/memif/rte_eth_memif.h           |  212 ++++
>  drivers/net/memif/rte_pmd_memif_version.map |    4 +
>  drivers/net/meson.build                     |    1 +
>  mk/rte.app.mk                               |    1 +

Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

Applied to dpdk-next-net/master, thanks.

^ permalink raw reply	[flat|nested] 58+ messages in thread

end of thread, other threads:[~2019-06-06 14:07 UTC | newest]

Thread overview: 58+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-13 13:30 [dpdk-dev] [RFC v3] /net: memory interface (memif) Jakub Grajciar
2018-12-13 18:07 ` Stephen Hemminger
2018-12-14  9:39   ` Bruce Richardson
2018-12-14 16:12     ` Wiles, Keith
2019-01-04 17:16 ` Ferruh Yigit
2019-01-04 19:23 ` Stephen Hemminger
2019-01-04 19:27 ` Stephen Hemminger
2019-01-04 19:32 ` Stephen Hemminger
2019-02-20 11:52 ` [dpdk-dev] [RFC v4] " Jakub Grajciar
2019-02-20 15:46   ` Stephen Hemminger
2019-02-20 16:17   ` Stephen Hemminger
2019-02-21 10:50   ` Rami Rosen
2019-02-27 17:04   ` Ferruh Yigit
2019-03-22 11:57   ` [dpdk-dev] [RFC v5] " Jakub Grajciar
2019-03-22 11:57     ` Jakub Grajciar
2019-03-25 20:58     ` Ferruh Yigit
2019-03-25 20:58       ` Ferruh Yigit
2019-05-02 12:35       ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
2019-05-02 12:35         ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
2019-05-03  4:27         ` Honnappa Nagarahalli
2019-05-03  4:27           ` Honnappa Nagarahalli
2019-05-06 11:00           ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
2019-05-06 11:00             ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
2019-05-06 11:04             ` Damjan Marion (damarion)
2019-05-06 11:04               ` Damjan Marion (damarion)
2019-05-07 11:29               ` Honnappa Nagarahalli
2019-05-07 11:29                 ` Honnappa Nagarahalli
2019-05-07 11:37                 ` Damjan Marion (damarion)
2019-05-07 11:37                   ` Damjan Marion (damarion)
2019-05-08  7:53                   ` Honnappa Nagarahalli
2019-05-08  7:53                     ` Honnappa Nagarahalli
2019-05-09  8:30     ` [dpdk-dev] [RFC v6] " Jakub Grajciar
2019-05-09  8:30       ` Jakub Grajciar
2019-05-13 10:45       ` [dpdk-dev] [RFC v7] " Jakub Grajciar
2019-05-13 10:45         ` Jakub Grajciar
2019-05-16 11:46         ` [dpdk-dev] [RFC v8] " Jakub Grajciar
2019-05-16 15:18           ` Stephen Hemminger
2019-05-16 15:19           ` Stephen Hemminger
2019-05-16 15:21           ` Stephen Hemminger
2019-05-20  9:22             ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
2019-05-16 15:25           ` Stephen Hemminger
2019-05-16 15:28           ` Stephen Hemminger
2019-05-20 10:18           ` [dpdk-dev] [RFC v9] " Jakub Grajciar
2019-05-29 17:29             ` Ferruh Yigit
2019-05-30 12:38               ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
2019-05-31  6:22             ` [dpdk-dev] [PATCH v10] net/memif: introduce memory interface (memif) PMD Jakub Grajciar
2019-05-31  7:43               ` Ye Xiaolong
2019-06-03 11:28                 ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
2019-06-03 14:25                   ` Ye Xiaolong
2019-06-05 12:01                 ` Ferruh Yigit
2019-06-03 13:37               ` Aaron Conole
2019-06-05 11:55               ` Ferruh Yigit
2019-06-06  9:24                 ` Ferruh Yigit
2019-06-06 10:25                   ` Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
2019-06-06 11:18                     ` Ferruh Yigit
2019-06-06  8:24               ` [dpdk-dev] [PATCH v11] " Jakub Grajciar
2019-06-06 11:38                 ` [dpdk-dev] [PATCH v12] " Jakub Grajciar
2019-06-06 14:07                   ` Ferruh Yigit

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).