From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id 4311E1B497 for ; Fri, 4 Jan 2019 18:16:49 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 04 Jan 2019 09:16:48 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,439,1539673200"; d="scan'208";a="308996362" Received: from fyigit-mobl.ger.corp.intel.com (HELO [10.237.221.22]) ([10.237.221.22]) by fmsmga005.fm.intel.com with ESMTP; 04 Jan 2019 09:16:46 -0800 To: Jakub Grajciar , dev@dpdk.org References: <20181213133051.18779-1-jgrajcia@cisco.com> From: Ferruh Yigit Openpgp: preference=signencrypt Autocrypt: addr=ferruh.yigit@intel.com; prefer-encrypt=mutual; keydata= mQINBFXZCFABEADCujshBOAaqPZpwShdkzkyGpJ15lmxiSr3jVMqOtQS/sB3FYLT0/d3+bvy qbL9YnlbPyRvZfnP3pXiKwkRoR1RJwEo2BOf6hxdzTmLRtGtwWzI9MwrUPj6n/ldiD58VAGQ +iR1I/z9UBUN/ZMksElA2D7Jgg7vZ78iKwNnd+vLBD6I61kVrZ45Vjo3r+pPOByUBXOUlxp9 GWEKKIrJ4eogqkVNSixN16VYK7xR+5OUkBYUO+sE6etSxCr7BahMPKxH+XPlZZjKrxciaWQb +dElz3Ab4Opl+ZT/bK2huX+W+NJBEBVzjTkhjSTjcyRdxvS1gwWRuXqAml/sh+KQjPV1PPHF YK5LcqLkle+OKTCa82OvUb7cr+ALxATIZXQkgmn+zFT8UzSS3aiBBohg3BtbTIWy51jNlYdy ezUZ4UxKSsFuUTPt+JjHQBvF7WKbmNGS3fCid5Iag4tWOfZoqiCNzxApkVugltxoc6rG2TyX CmI2rP0mQ0GOsGXA3+3c1MCdQFzdIn/5tLBZyKy4F54UFo35eOX8/g7OaE+xrgY/4bZjpxC1 1pd66AAtKb3aNXpHvIfkVV6NYloo52H+FUE5ZDPNCGD0/btFGPWmWRmkPybzColTy7fmPaGz cBcEEqHK4T0aY4UJmE7Ylvg255Kz7s6wGZe6IR3N0cKNv++O7QARAQABtCVGZXJydWggWWln aXQgPGZlcnJ1aC55aWdpdEBpbnRlbC5jb20+iQJVBBMBAgA/AhsDBgsJCAcDAgYVCAIJCgsE FgIDAQIeAQIXgBYhBNI2U4dCLsKE45mBx/kz60PfE2EfBQJbughWBQkHwjOGAAoJEPkz60Pf E2Eft84QAIbKWqhgqRfoiw/BbXbA1+qm2o4UgkCRQ0yJgt9QsnbpOmPKydHH0ixCliNz1J8e mRXCkMini1bTpnzp7spOjQGLeAFkNFz6BMq8YF2mVWbGEDE9WgnAxZdi0eLY7ZQnHbE6AxKL SXmpe9INb6z3ztseFt7mqje/W/6DWYIMnH3Yz9KzxujFWDcq8UCAvPkxVQXLTMpauhFgYeEx Nub5HbvhxTfUkapLwRQsSd/HbywzqZ3s/bbYMjj5JO3tgMiM9g9HOjv1G2f1dQjHi5YQiTZl 1eIIqQ3pTic6ROaiZqNmQFXPsoOOFfXF8nN2zg8kl/sSdoXWHhama5hbwwtl1vdaygQYlmdK H2ueiFh/UvT3WG3waNv2eZiEbHV8Rk52Xyn2w1G90lV0fYC6Ket1Xjoch7kjwbx793Kz/RfQ rmBY8/S4DTGn3oq3dMdQY+b6+7VMUeLMMh2CXYO9ErkOq+qNTD1IY+cBAkXnaDbQfz0zbste ZGWH74FAZ9nCpDOqbRTrBL42aMGhfOWEyeA1x7+hl6JZfabBWAuf4nnCXuorKHzBXTrf7u7p fXsKQClWRW77PF1VmzrtKNVSytQAmlCWApQIw20AarFipXmVdIjHmJPU611WoyxZPb4JTOxx 5cv9B+nr/RIB+v5dcStyHCCwO1be7nBDdCgd4F6kTQPLuQINBFfWTL4BEACnNA29e8TarUsB L5n6eLZHXcFvVwNLVlirWOClHXf44o2KnN3ww+eBEmKVfEFo9MSuGDNHS8Zw1NiGMYxLIUgd U6gGrVVs/VrQWL82pbMk6jCj98N+BXIri+6K1z+AImz7ax7iF1kDgRAnFWU0znWWBgM2mM8Y gDjcxfXk4sCKnvf6Gjo08Ey5zmqx7dekAKU2EEp8Q1EJY3jbymLdZWRP4AFFMTS1rGMk0/tt v71NBg1GobCcbNfn9chK/jhqxYhAJqq86RdJQkt3/9x1U1Oq0vXCt4JVVHmkxePtUiuWTTt+ aYlUAsKYZsWvncExvw77x2ArYDmaK0yfjh37wp0lY7DOJHFxoyT8tyWZlLci/VMRG2Ja33xj 0CN4C1yBg+QDeV3QFxQo42iA/ykdXPUR3ezmsND3XKvVLTC4DNb3V/EZQ7jBj64+bEK0VW4G B31VP00ApNQvSoczsIOAKdk97RNbpmPw6q10ILIB+9T1xbnFYzshzGF17oC0/GENIHATx8vZ masOZoDiOZQpeneLgnFE9JfzhLTxv6wNZcc/HLXRQVTkDsQr8ERtkAoHCf1E5+b5Yr7pfnE4 YuhET746o25S53ELUYPIs49qoJsEJL34/oexMfPGyPIlrbufiNyty5jc/1MRwUlhJlJ5IOHy ZUa+6CLR7GdImusFkPJUJwARAQABiQI8BBgBAgAmAhsMFiEE0jZTh0IuwoTjmYHH+TPrQ98T YR8FAlu6CHAFCQXE7zIACgkQ+TPrQ98TYR9nXxAAqNBgkYNyGuWUuy0GwDQCbu3iiMyH1+D7 llafPcK4NYy1Z4AYuVwC9nmLaoj+ozdqS3ncRo57ncRsKEJC46nDJJZYZ5LSJVn63Y3NBF86 lxQAgjj2oyZEwaLKtKbAFsXL43jv1pUGgSvWwYtDwHITXXFQto9rZEuUDRFSx4sg9OR+Q6/6 LY+nQQ3OdHlBkflzYMPcWgDcvcTAO6yasLEUf7UcYoSWTyMYjLB4QuNlXzTswzGVMssJF/vo V8lD1eqqaSUWG3STF6GVLQOr1NLvN5+kUBiEStHFxBpgSCvYY9sNV8FS6N24CAWMBl+10W+D 2h1yiiP5dOdPcBDYKsgqDD91/sP0WdyMJkwdQJtD49f9f+lYloxHnSAxMleOpyscg1pldw+i mPaUY1bmIknLhhkqfMmjywQOXpac5LRMibAAYkcB8v7y3kwELnt8mhqqZy6LUsqcWygNbH/W K3GGt5tRpeIXeJ25x8gg5EBQ0Jnvp/IbBYQfPLtXH0Myq2QuAhk/1q2yEIbVjS+7iowEZNyE 56K63WBJxsJPB2mvmLgn98GqB4G6GufP1ndS0XDti/2K0o8rep9xoY/JDGi0n0L0tk9BHyoP Y7kaEpu7UyY3nVdRLe5H1/MnFG8hdJ97WqnPS0buYZlrbTV0nRFL/NI2VABl18vEEXvNQiO+ vM8= Message-ID: Date: Fri, 4 Jan 2019 17:16:46 +0000 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0 MIME-Version: 1.0 In-Reply-To: <20181213133051.18779-1-jgrajcia@cisco.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [dpdk-dev] [RFC v3] /net: memory interface (memif) X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 Jan 2019 17:16:50 -0000 On 12/13/2018 1:30 PM, Jakub Grajciar wrote: > Memory interface (memif), provides high performance > packet transfer over shared memory. > > Signed-off-by: Jakub Grajciar Hi Jakub, I put some comments but overall it is a long patch, I am almost sure there are some details missed, it can make easier to review if you can split the patch into multiple patches. > --- > config/common_base | 5 + > config/common_linuxapp | 1 + > doc/guides/nics/memif.rst | 80 ++ "doc/guides/nics/index.rst" needs to be updated to add the new doc into index Also there is a driver feature documentation, "doc/guides/nics/features/*.ini", I admit it is not very informative for virtual devices, but having it will add the driver to the overview table, so please add it too, with as much data as possible. > drivers/net/Makefile | 1 + > drivers/net/memif/Makefile | 29 + > drivers/net/memif/memif.h | 143 +++ > drivers/net/memif/memif_socket.c | 1093 +++++++++++++++++ > drivers/net/memif/memif_socket.h | 104 ++ > drivers/net/memif/meson.build | 10 + > drivers/net/memif/rte_eth_memif.c | 1189 +++++++++++++++++++ > drivers/net/memif/rte_eth_memif.h | 207 ++++ > drivers/net/memif/rte_pmd_memif_version.map | 4 + > drivers/net/meson.build | 1 + > mk/rte.app.mk | 1 + Can you please update "MAINTAINERS" file too, to add driver and its owner. <...> > @@ -0,0 +1,80 @@ > +.. SPDX-License-Identifier: BSD-3-Clause > + Copyright(c) 2018 Cisco Systems, Inc. > + > +Memif Poll Mode Driver > +====================== > + > +Memif PMD allows for DPDK and any other client using memif (DPDK, VPP, libmemif) to > +communicate using shared memory. It can be good to describe what 'memif' is briefly here and perhaps a link for more detailed documentation. And some more low level detail can be good, like message type, messaging protocol, descriptor format etc.. A link also good enough if it is already documented somewhere. Also memif PMD is only for Linux, what do you think documenting that limitation here? <...> > +Example: testpmd and VPP > +------------------------ > + > +For information on how to get and run VPP please see ``_. Can memif PMD used between two DPDK application, or as two interfaces to testpmd? If so what do you think adding that as a main sample, which would be easier for everyone to test? <...> > @@ -0,0 +1,29 @@ > +# SPDX-License-Identifier: BSD-3-Clause > +# Copyright 2018 Cisco Systems, Inc. All rights reserved. > + > +include $(RTE_SDK)/mk/rte.vars.mk > + > +# > +# library name > +# > +LIB = librte_pmd_memif.a > + > +EXPORT_MAP := rte_pmd_memif_version.map > + > +LIBABIVER := 1 > + > +CFLAGS += -O3 > +CFLAGS += -I$(SRCDIR) Why need to include $(SRCDIR) ? > +CFLAGS += $(WERROR_FLAGS) > +CFLAGS += -Wno-pointer-arith Is this warning disabled explicitly? <...> > @@ -0,0 +1,143 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright 2018 Cisco Systems, Inc. All rights reserved. > + */ > + > +#ifndef _MEMIF_H_ > +#define _MEMIF_H_ > + > +#ifndef MEMIF_CACHELINE_SIZE > +#define MEMIF_CACHELINE_SIZE 64 > +#endif How much this is related to processor cache size? If same thing it is better to use RTE_CACHE_LINE_SIZE since it can be set properly for various architectures. > + > +#define MEMIF_COOKIE 0x3E31F20 > +#define MEMIF_VERSION_MAJOR 2 > +#define MEMIF_VERSION_MINOR 0 > +#define MEMIF_VERSION ((MEMIF_VERSION_MAJOR << 8) | MEMIF_VERSION_MINOR) > + > +/* > + * Type definitions > + */ > + > +typedef enum memif_msg_type { > + MEMIF_MSG_TYPE_NONE = 0, > + MEMIF_MSG_TYPE_ACK = 1, > + MEMIF_MSG_TYPE_HELLO = 2, > + MEMIF_MSG_TYPE_INIT = 3, > + MEMIF_MSG_TYPE_ADD_REGION = 4, > + MEMIF_MSG_TYPE_ADD_RING = 5, > + MEMIF_MSG_TYPE_CONNECT = 6, > + MEMIF_MSG_TYPE_CONNECTED = 7, > + MEMIF_MSG_TYPE_DISCONNECT = 8, No need to assign numbers to enums, they will be already provided by compiler. Valid for all enums below. > +} memif_msg_type_t; > + > +typedef enum { > + MEMIF_RING_S2M = 0, > + MEMIF_RING_M2S = 1 > +} memif_ring_type_t; > + > +typedef enum { > + MEMIF_INTERFACE_MODE_ETHERNET = 0, > + MEMIF_INTERFACE_MODE_IP = 1, > + MEMIF_INTERFACE_MODE_PUNT_INJECT = 2, > +} memif_interface_mode_t; > + > +typedef uint16_t memif_region_index_t; > +typedef uint32_t memif_region_offset_t; > +typedef uint64_t memif_region_size_t; > +typedef uint16_t memif_ring_index_t; > +typedef uint32_t memif_interface_id_t; > +typedef uint16_t memif_version_t; > +typedef uint8_t memif_log2_ring_size_t; > + > +/* > + * Socket messages > + */ > + > +typedef struct __attribute__ ((packed)) { can use "__rte_packed" instead of "__attribute__ ((packed))" > + uint8_t name[32]; It can be good to define a name size macro instead of 32 > + memif_version_t min_version; > + memif_version_t max_version; > + memif_region_index_t max_region; > + memif_ring_index_t max_m2s_ring; > + memif_ring_index_t max_s2m_ring; > + memif_log2_ring_size_t max_log2_ring_size; > +} memif_msg_hello_t; > + > +typedef struct __attribute__ ((packed)) { > + memif_version_t version; > + memif_interface_id_t id; > + memif_interface_mode_t mode:8; What will be the size of the enum when combined bitfield and "__attribute__ ((packed))", I guess it will be 1byte above, does it worth commenting to clarify? And I wonder if this behavior is part of standard? > + uint8_t secret[24]; > + uint8_t name[32]; > +} memif_msg_init_t; > + > +typedef struct __attribute__ ((packed)) { > + memif_region_index_t index; > + memif_region_size_t size; > +} memif_msg_add_region_t; > + > +typedef struct __attribute__ ((packed)) { > + uint16_t flags; > +#define MEMIF_MSG_ADD_RING_FLAG_S2M (1 << 0) Why not just "1" ? > + memif_ring_index_t index; > + memif_region_index_t region; > + memif_region_offset_t offset; > + memif_log2_ring_size_t log2_ring_size; > + uint16_t private_hdr_size; /* used for private metadata */ > +} memif_msg_add_ring_t; > + > +typedef struct __attribute__ ((packed)) { > + uint8_t if_name[32]; > +} memif_msg_connect_t; > + > +typedef struct __attribute__ ((packed)) { > + uint8_t if_name[32]; > +} memif_msg_connected_t; > + > +typedef struct __attribute__ ((packed)) { > + uint32_t code; > + uint8_t string[96]; Why not 97? Or 42? > +} memif_msg_disconnect_t; > + > +typedef struct __attribute__ ((packed, aligned(128))) { > + memif_msg_type_t type:16; > + union { > + memif_msg_hello_t hello; > + memif_msg_init_t init; > + memif_msg_add_region_t add_region; > + memif_msg_add_ring_t add_ring; > + memif_msg_connect_t connect; > + memif_msg_connected_t connected; > + memif_msg_disconnect_t disconnect; > + }; > +} memif_msg_t; > + > +/* > + * Ring and Descriptor Layout > + */ > + > +typedef struct __attribute__ ((packed)) { > + uint16_t flags; > +#define MEMIF_DESC_FLAG_NEXT (1 << 0) > + memif_region_index_t region; > + uint32_t length; > + memif_region_offset_t offset; > + uint32_t metadata; > +} memif_desc_t; > + > +#define MEMIF_CACHELINE_ALIGN_MARK(mark) \ > + uint8_t mark[0] __attribute__((aligned(MEMIF_CACHELINE_SIZE))) can use "__rte_aligned()" instead. And if this is for processors cache size '__rte_cache_aligned' can be better to use. <...> > + > +#include > +#include These are private headers, not installed to system path, why not access them as local #include "memif_socket.h" <...> > + > + if (pmd->flags && (ETH_MEMIF_FLAG_CONNECTING | > + ETH_MEMIF_FLAG_CONNECTED)) { I assume intention is '&' here, instead of '&&' <...> > + strlcpy((char *)i->name, rte_version(), sizeof(i->name)); > + > + if (pmd->secret) > + strlcpy((char *)i->secret, pmd->secret, sizeof(i->secret)); This will always evaluate to true, is the intention '*pmd->secret' ? <...> > + if (ret == -EAGAIN) { > + ret = > + rte_intr_callback_unregister_pending(&pmd-> > + cc->intr_handle, > + memif_intr_handler, > + pmd->cc, > + memif_intr_unregister_handler); Is 'rte_intr_callback_unregister_pending()' defined in DPDK, I am getting build error for this. <...> > +static void > +memif_intr_handler(void *arg) > +{ > + struct memif_control_channel *cc = arg; > + struct rte_eth_dev *dev; > + int ret; > + > + ret = memif_msg_receive(cc); > + /* if driver failed to assign device */ > + if (cc->pmd == NULL) { Isn't this already set when slave connected, When/How this can happen? <...> > @@ -0,0 +1,10 @@ > +# SPDX-License-Identifier: BSD-3-Clause > +# Copyright 2018 Cisco Systems, Inc. All rights reserved. > + > +if host_machine.system() != 'linux' > + build = false > +endif > +sources = files('rte_eth_memif.c', > + 'memif_socket.c') > + > +deps += ['hash'] This dependency is not in Makefile, and since librte_hash is used, dependency should be added into makefile too. <...> > + > +static struct rte_vdev_driver pmd_memif_drv; This deceleration may not be required. > + > +const char * > +memif_version(void) > +{ > +#define STR_HELP(s) #s > +#define STR(s) STR_HELP(s) Can use RTE_STR(x) instead. > + return ("memif-" STR(MEMIF_VERSION_MAJOR) "." STR(MEMIF_VERSION_MINOR)); > +#undef STR > +#undef STR_HELP > +} > + > +static void > +memif_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info) > +{ > + struct pmd_internals *pmd = dev->data->dev_private; > + > + dev_info->if_index = pmd->if_index; if_index is for host interfaces, that something you should be able to provide to if_indextoname() as parameter, make sense for af_packet or tap, but not for here. > + dev_info->max_mac_addrs = 1; > + dev_info->max_rx_pktlen = (uint32_t)ETH_FRAME_LEN; > + dev_info->max_rx_queues = (pmd->role == MEMIF_ROLE_SLAVE) ? > + pmd->cfg.num_m2s_rings : pmd->cfg.num_s2m_rings; > + dev_info->max_tx_queues = (pmd->role == MEMIF_ROLE_SLAVE) ? > + pmd->cfg.num_s2m_rings : pmd->cfg.num_m2s_rings; max_rx_queues & max_tx_queues are for maximum available queue number, not current number, setting them to current number prevents app capability to increase the queue number. For this case ETH_MEMIF_MAX_NUM_Q_PAIRS seems the max queue number. <...> > + if ((ring->flags & MEMIF_RING_FLAG_MASK_INT) == 0) { > + uint64_t a = 1; > + ssize_t size = write(mq->intr_handle.fd, &a, sizeof(a)); > + if (unlikely(size < 0)) { > + MIF_LOG(WARNING, > + "%s: Failed to send interrupt on qid %ld: %s", > + rte_vdev_device_name(pmd->vdev), > + mq - pmd->tx_queues, strerror(errno)); giving build error because of %ld, error: format ‘%ld’ expects argument of type ‘long int’, but argument 6 has type ‘int’ [-Werror=format=] "%s(): " fmt "\n", __func__, ##args) ^~~~~~~~ <...> > +static int > +memif_tx_queue_setup(struct rte_eth_dev *dev, > + uint16_t qid, > + uint16_t nb_tx_desc __rte_unused, > + unsigned int socket_id __rte_unused, > + const struct rte_eth_txconf *tx_conf __rte_unused) > +{ > + struct pmd_internals *pmd = dev->data->dev_private; > + struct memif_queue *mq; > + > + mq = rte_realloc(pmd->tx_queues, sizeof(struct memif_queue) * (qid + 1), > + 0); There is an assumption in this code that whenever this fucntion called, it will be called with a 'qid' bigger that previous one, this assumption is not correct, there is no forced order of calling setup queue. Reverse order may cause you loosing all the memory. You should know the number of the queues already, instead of trying to allocate dynamiccaly in setup function, why don't you allocate it in advance? > + if (mq == NULL) { > + MIF_LOG(ERR, "%s: Failed to alloc tx queue %u.", > + rte_vdev_device_name(pmd->vdev), qid); > + return -ENOMEM; > + } > + > + pmd->tx_queues = mq; It may be better to have a varible pointer to pointer, "struct memif_queue **tx_queues", as 'data->tx_queues' does, because that is what you are storing, harder to keep this with "struct memif_queue *tx_queues" although can be done. > + > + mq->type = > + (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_S2M : MEMIF_RING_M2S; > + mq->n_pkts = 0; > + mq->n_bytes = 0; > + mq->n_err = 0; > + mq->intr_handle.fd = -1; > + mq->intr_handle.type = RTE_INTR_HANDLE_EXT; > + mq->pmd = pmd; Something looks wrong in this code or I am missing something. When "mq" expanded with "rte_realloc()" context preserved, but the new chunk of memory it at the end, but you are writing to the beggining, isn't this overwrite the existing data? Have you test this code with multi-queue? > + dev->data->tx_queues[qid] = mq; Also this part seems wrong, "data->tx_queues[qid]" should preserve the pointer to that specific queue with 'qid', here it point to the beginning of the memorry that has all queues. Also after rte_realloc() previous pointer may become invalid, and you may be storing/accessing invalid pointer. > + > + return 0; > +} > + > +static int > +memif_rx_queue_setup(struct rte_eth_dev *dev, > + uint16_t qid, > + uint16_t nb_rx_desc __rte_unused, > + unsigned int socket_id __rte_unused, > + const struct rte_eth_rxconf *rx_conf __rte_unused, > + struct rte_mempool *mb_pool) > +{ > + struct pmd_internals *pmd = dev->data->dev_private; > + struct memif_queue *mq; > + > + mq = rte_realloc(pmd->rx_queues, sizeof(struct memif_queue) * (qid + 1), > + 0); > + if (mq == NULL) { > + MIF_LOG(ERR, "%s: Failed to alloc rx queue %u.", > + rte_vdev_device_name(pmd->vdev), qid); > + return -ENOMEM; > + } > + > + pmd->rx_queues = mq; > + > + mq->type = > + (pmd->role == MEMIF_ROLE_SLAVE) ? MEMIF_RING_M2S : MEMIF_RING_S2M; > + mq->n_pkts = 0; > + mq->n_bytes = 0; > + mq->n_err = 0; > + mq->intr_handle.fd = -1; > + mq->intr_handle.type = RTE_INTR_HANDLE_EXT; > + mq->mempool = mb_pool; > + mq->in_port = dev->data->port_id; > + mq->pmd = pmd; > + dev->data->rx_queues[qid] = mq; Same problems with 'memif_tx_queue_setup' exists here, I will assume multi queue not tested at all. <...> > +static void > +memif_stats_reset(struct rte_eth_dev *dev) > +{ > + struct pmd_internals *pmd = dev->data->dev_private; > + int i; > + struct memif_queue *mq; > + > + for (i = 0; i < pmd->run.num_s2m_rings; i++) { > + mq = (pmd->role == MEMIF_ROLE_SLAVE) ? &pmd->tx_queues[i] : > + &pmd->rx_queues[i]; > + mq->n_pkts = 0; > + mq->n_bytes = 0; > + mq->n_err = 0; > + } > + for (i = 0; i < pmd->run.num_m2s_rings; i++) { > + mq = (pmd->role == MEMIF_ROLE_SLAVE) ? &pmd->rx_queues[i] : > + &pmd->tx_queues[i]; > + mq->n_pkts = 0; > + mq->n_bytes = 0; > + mq->n_err = 0; > + } Is this logic correct? If "role == MEMIF_ROLE_SLAVE", foreach pmd->run.num_s2m_rings: reset tx_queues foreach pmd->run.num_m2s_rings: reset rx_queues Not sure if "pmd->run.num_s2m_rings" & "pmd->run.num_s2m_rings" can be different values, but from rest of the code what I get, one is number of queues for SLAVE and other is number of queues for MASTER. The logic on 'memif_stats_get()' looks better. > +} > + > +static int > +memif_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t qid __rte_unused) > +{ > + struct pmd_internals *pmd = dev->data->dev_private; > + > + MIF_LOG(WARNING, "%s: Interrupt mode not supported.", > + rte_vdev_device_name(pmd->vdev)); > + > + /* Enable MEMIF interrupts. */ > + /* pmd->rx_queues[qid].ring->flags &= ~MEMIF_RING_FLAG_MASK_INT; */ > + > + /* > + * TODO: Tell dpdk to use interrupt mode. > + * > + * return rte_intr_enable(&pmd->rx_queues[qid].intr_handle); > + */ Please remove commented code while upstreaming. > + return -1; > +} > + > +static int > +memif_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t qid __rte_unused) > +{ > + struct pmd_internals *pmd __rte_unused = dev->data->dev_private; > + > + /* Disable MEMIF interrupts. */ > + /* pmd->rx_queues[qid].ring->flags |= MEMIF_RING_FLAG_MASK_INT; */ > + > + /* > + * TODO: Tell dpdk to use polling mode. > + * > + * return rte_intr_disable(&pmd->rx_queues[qid].intr_handle); > + */ Same here. <...> > + > + /* FIXME: generate mac? */ > + if (eth_addr == NULL) > + eth_addr = ETH_MEMIF_DEFAULT_ETH_ADDR; It is easy to generate random MAC in DPDK via eth_random_addr() API, it worth doing it and removing the FIXME. <...> > +static int > +memif_set_role(const char *key __rte_unused, const char *value, > + void *extra_args) > +{ > + enum memif_role_t *role = (enum memif_role_t *)extra_args; > + if (strstr(value, "master") != NULL) { Can 'value' be something containing 'master' or should it be exact 'master', if expectaion is exact match, strcmp() can be used. Same for below ones. <...> > +static int > +memif_set_bs(const char *key __rte_unused, const char *value, void *extra_args) > +{ > + unsigned long tmp; > + uint16_t *buffer_size = (uint16_t *)extra_args; > + > + tmp = strtoul(value, NULL, 10); > + if (tmp == 0 || tmp > 0xFFFF) { This 0xFFFF size limit is very hidden in kvargs function, can be good to add this to documentation and header comment. > + MIF_LOG(ERR, "Invalid buffer size: %s.", value); > + return -EINVAL; > + } > + *buffer_size = tmp; > + return 0; > +} > + > +static int > +memif_set_rs(const char *key __rte_unused, const char *value, void *extra_args) > +{ > + unsigned long tmp; > + memif_log2_ring_size_t *log2_ring_size = > + (memif_log2_ring_size_t *)extra_args; > + > + tmp = strtoul(value, NULL, 10); > + if (tmp == 0 || tmp > ETH_MEMIF_MAX_LOG2_RING_SIZE) { Same here for ETH_MEMIF_MAX_LOG2_RING_SIZE, can you document it? Same for below ones. <...> > + /* set default values */ > + role = MEMIF_ROLE_SLAVE; > + flags = 0; > + id = 0; > + buffer_size = 2048; > + log2_ring_size = 10; > + nrxq = 1; > + ntxq = 1; There are already some ETH_MEMIF_DEFAULT_* macros are defined same of these values, why not use them? And can be good to have default data information in single place. > + socket_filename = ETH_MEMIF_DEFAULT_SOCKET_FILENAME; > + secret = NULL; > + eth_addr = NULL; > + > + /* parse parameters */ > + if (kvlist != NULL) { > + if (rte_kvargs_count(kvlist, ETH_MEMIF_ROLE_ARG) == 1) { > + ret = rte_kvargs_process(kvlist, ETH_MEMIF_ROLE_ARG, > + &memif_set_role, &role); > + if (ret < 0) > + goto exit; > + } Do you need rte_kvargs_count() checks? If you are not really interested in the count, can call rte_kvargs_process() directly, if key exists the callback will be called, if not exists it will return 0, so safe to call without count check. > + if (rte_kvargs_count(kvlist, ETH_MEMIF_ID_ARG) == 1) { > + ret = rte_kvargs_process(kvlist, ETH_MEMIF_ID_ARG, > + &memif_set_id, &id); > + if (ret < 0) > + goto exit; > + } > + if (rte_kvargs_count(kvlist, ETH_MEMIF_BUFFER_SIZE_ARG) == 1) { > + ret = > + rte_kvargs_process(kvlist, > + ETH_MEMIF_BUFFER_SIZE_ARG, > + &memif_set_bs, &buffer_size); > + if (ret < 0) > + goto exit; > + } > + if (rte_kvargs_count(kvlist, ETH_MEMIF_RING_SIZE_ARG) == 1) { > + ret = > + rte_kvargs_process(kvlist, ETH_MEMIF_RING_SIZE_ARG, > + &memif_set_rs, &log2_ring_size); > + if (ret < 0) > + goto exit; > + } > + if (rte_kvargs_count(kvlist, ETH_MEMIF_NRXQ_ARG) == 1) { > + ret = rte_kvargs_process(kvlist, ETH_MEMIF_NRXQ_ARG, > + &memif_set_nq, &nrxq); > + if (ret < 0) > + goto exit; > + } > + if (rte_kvargs_count(kvlist, ETH_MEMIF_NTXQ_ARG) == 1) { > + ret = rte_kvargs_process(kvlist, ETH_MEMIF_NTXQ_ARG, > + &memif_set_nq, &ntxq); > + if (ret < 0) > + goto exit; > + } Nuber of Rx/Tx queues can be configured dynamically by applications. This information provided by ".dev_configure" which seems you are not using at all. Why getting number of queue information statically during PMD probe instead of getting dynamic by app? Won't it be possible app disconnect one memif PMD, reconfigure it with different queue numbers and connect to other master? > + if (rte_kvargs_count(kvlist, ETH_MEMIF_SOCKET_ARG) == 1) { > + for (i = 0; i < kvlist->count; i++) { > + pair = &kvlist->pairs[i]; > + if (strcmp(pair->key, ETH_MEMIF_SOCKET_ARG) == 0) { > + socket_filename = pair->value; > + ret = memif_check_socket_filename( > + socket_filename); > + if (ret < 0) > + goto exit; > + } Instead of accessing the kvarg details and 'pair->value' why not implement 'memif_check_socket_filename()' as callback funtion to the 'rte_kvargs_process()' ? > + } > + } > + if (rte_kvargs_count(kvlist, ETH_MEMIF_MAC_ARG) == 1) { > + for (i = 0; i < kvlist->count; i++) { > + pair = &kvlist->pairs[i]; > + if (strcmp(pair->key, ETH_MEMIF_MAC_ARG) == 0) > + eth_addr = pair->value; Similar comment here, instead of keeping 'eth_addr' as string and posponing task of process it to 'memif_create()', you can process it in a callback and store as 'struct ether_addr' Also you can store the default mac address as 'struct ether_addr' instead of string. > + } > + } > + if (rte_kvargs_count(kvlist, ETH_MEMIF_ZC_ARG) == 1) { > + ret = rte_kvargs_process(kvlist, ETH_MEMIF_ZC_ARG, > + &memif_set_zc, &flags); > + if (ret < 0) > + goto exit; > + } > + if (rte_kvargs_count(kvlist, ETH_MEMIF_SECRET_ARG) == 1) { > + for (i = 0; i < kvlist->count; i++) { > + pair = &kvlist->pairs[i]; > + if (strcmp(pair->key, ETH_MEMIF_SECRET_ARG) == 0) > + secret = pair->value; > + } Same here, I think better to implement as rte_kvargs_process(); > + } > + } > + > + /* create interface */ > + ret = > + memif_create(vdev, role, id, flags, socket_filename, log2_ring_size, > + nrxq, ntxq, buffer_size, secret, eth_addr); Just syntax, but no need to break after "ret = " line. <...> > +RTE_PMD_REGISTER_VDEV(net_memif, pmd_memif_drv); > +RTE_PMD_REGISTER_ALIAS(net_memif, eth_memif); No need to provide alias for new PMDs, it is for backward compatibility. > +RTE_PMD_REGISTER_PARAM_STRING(net_memif, > + ETH_MEMIF_ID_ARG "=" > + ETH_MEMIF_ROLE_ARG "=" It can be more helpful to describe the valid set, since it is limited: ETH_MEMIF_ROLE_ARG "=master|slave" <...> > +int memif_logtype; This will work but wouldn't it be better to define variable to .c file and have an extern here. <...> > + uint16_t last_head; /**< last ring head */ > + uint16_t last_tail; /**< last ring tail */ > + /*uint32_t *buffers;*/ Please remove commented code. <...> > + > + struct memif_queue *rx_queues; /**< RX queues */ > + struct memif_queue *tx_queues; /**< TX queues */ Aren't these are duplicates of 'data->rx_queues' & 'data->tx_queues'? Why not using existing ones but creating anouther copy here? > + > + /* remote info */ > + char remote_name[64]; /**< remote app name */ > + char remote_if_name[64]; /**< remote peer name */ > + > + struct { > + memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */ > + uint8_t num_s2m_rings; /**< number of slave to master rings */ > + uint8_t num_m2s_rings; /**< number of master to slave rings */ > + uint16_t buffer_size; /**< buffer size */ > + } cfg; /**< Configured parameters (max values) */ > + > + struct { > + memif_log2_ring_size_t log2_ring_size; /**< log2 of ring size */ > + uint8_t num_s2m_rings; /**< number of slave to master rings */ > + uint8_t num_m2s_rings; /**< number of master to slave rings */ > + uint16_t buffer_size; /**< buffer size */ > + } run; What about defining the struct out, and reuse it for both 'cfg' & 'run'? <...> > +#ifndef MFD_HUGETLB > +#ifndef __NR_memfd_create Is there are a kernel version requirement for these? If so what it is? > + > +#if defined __x86_64__ > +#define __NR_memfd_create 319 > +#elif defined __arm__ > +#define __NR_memfd_create 385 > +#elif defined __aarch64__ > +#define __NR_memfd_create 279 > +#else > +#error "__NR_memfd_create unknown for this architecture" > +#endif > + > +#endif /* __NR_memfd_create */ > + > +static inline int memfd_create(const char *name, unsigned int flags) > +{ > + return syscall(__NR_memfd_create, name, flags); > +} Isn't there a glibc wrapper for this, do you really need to call syscall? And why i686 and powerpc are not supported? btw, is memif PMD supports 32-bits? (.ini file, requested above, good for document these kind of questions)