DPDK patches and discussions
 help / color / mirror / Atom feed
From: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
To: Narcisa Ana Maria Vasile <navasile@linux.microsoft.com>
Cc: dev@dpdk.org, thomas@monjalon.net, ocardona@microsoft.com,
	haramakr@linux.microsoft.com, pallavi.kadam@intel.com,
	ranjit.menon@intel.com, dmitrym@microsoft.com,
	Narcisa Vasile <navasile@microsoft.com>,
	Harini Ramakrishnan <Harini.Ramakrishnan@microsoft.com>
Subject: Re: [dpdk-dev] [PATCH v2] windows/netuio: add Windows NetUIO kernel driver
Date: Mon, 24 Aug 2020 23:53:44 +0300	[thread overview]
Message-ID: <20200824235344.5994cd13@sovereign> (raw)
In-Reply-To: <1597962235-4787-1-git-send-email-navasile@linux.microsoft.com>

On Thu, 20 Aug 2020 15:23:55 -0700, Narcisa Ana Maria Vasile wrote:
> From: Narcisa Vasile <navasile@microsoft.com>
> 
> The Windows NetUIO kernel driver allows the DPDK userspace
> application to directly access the hardware.
> 
> Cc: Harini Ramakrishnan <Harini.Ramakrishnan@microsoft.com>
> Cc: Omar Cardona <ocardona@microsoft.com>
> Signed-off-by: Narcisa Vasile <navasile@microsoft.com>
> ---

Major questions:

1. Does NetUIO still need to allocate and map a contiguous memory segment
now, when DPDK has user-mode memory management?

2. IOCTLs require to specify PCI address on each call. This is very
inconvenient for DPDK consumers and also seems to serve no purpose.

3. There is a need to document driver's design, preferably in commit message,
specifically:

3.1) DMA remapping capability in INF (AFAIK, vendors are notified);
3.2) manual BAR probing instead of using resource lists;
3.3) reason to use EvtIoInCallerContext and IO queues;
3.4) IOCTL format.

Also, I agree with everything Ranjit has noted already.

General suggestions to cleanup the code a bit. We can do it later if you
wish. This also brings up the question, which code style should Windows
kernel code for DPDK follow (off-topic for now).

* Remove boilerplate code and comments generated by VS wizard.
* Place `_Use_decl_annotations` on definitions to make them simpler.
* Limit line length, try using shorted variable names (e.g.
  "netuio_contextdata" may be "context" or "ctx" with no loss).

More specific comments inline.


>  create mode 100644 .gitattributes
>  create mode 100644 .gitignore

Both of these files should be in windows/ directory.

>  create mode 100644 windows/netuio/kernel/windows/netuio/resource.h

This file is not needed.


> diff --git a/windows/netuio/kernel/README_NetUIO.rst b/windows/netuio/kernel/README_NetUIO.rst
> new file mode 100644
> index 000000000..a290fcf20
> --- /dev/null
> +++ b/windows/netuio/kernel/README_NetUIO.rst
> @@ -0,0 +1,64 @@
[snip]
> +Installing netuio.sys Driver for development
> +--------------------------------------------
> +.. note::
> +
> +In a development environment, NetUIO driver can be loaded by enabling test-signing.
> +Please implement adequate precautionary measures before installing a test-signed driver, replacing a signed-driver.

Paragraph under "note" must be indented for RST to peek it up as note
content. Like so:

.. note::

   This line has 3 spaces in the beginning.

[snip]
> diff --git a/windows/netuio/kernel/windows/netuio/netuio.inf b/windows/netuio/kernel/windows/netuio/netuio.inf
> new file mode 100644
> index 000000000..88e90b365
> --- /dev/null
> +++ b/windows/netuio/kernel/windows/netuio/netuio.inf
> @@ -0,0 +1,78 @@
> +; SPDX-License-Identifier: BSD-3-Clause
> +; Copyright (c) Microsoft Corporation. All rights reserved

This copyright differs from the one in README.
"All rights reserved" is probably unnecessary.

[snip]
> +[Strings]
> +SPSVCINST_ASSOCSERVICE= 0x00000002
> +Intel = "Intel"
> +Broadcom = "Broadcom Corporation"

IHVs are supposed to add this gradually.

> +ClassName = "Windows UIO"
> +DiskName = "DPDK netUIO Installation Disk"
> +netuio.DeviceDesc = "netuio Device"
> +netuio.SVCDESC = "netuio Service"
> +
> +[DMAr.reg]
> +HKR,Parameters,DmaRemappingCompatible,0x00010001,1
> diff --git a/windows/netuio/kernel/windows/netuio/netuio_dev.c b/windows/netuio/kernel/windows/netuio/netuio_dev.c
> new file mode 100644
> index 000000000..6394bb5d1
> --- /dev/null
> +++ b/windows/netuio/kernel/windows/netuio/netuio_dev.c
[snip]
> +NTSTATUS
> +netuio_create_device(_Inout_ PWDFDEVICE_INIT DeviceInit)
> +{
[snip]
> +
> +    status = WdfDeviceCreate(&DeviceInit, &deviceAttributes, &device);
> +
> +	if (NT_SUCCESS(status)) {
> +		// Create a device interface so that applications can find and talk to us.
> +		status = WdfDeviceCreateDeviceInterface(device, &GUID_DEVINTERFACE_netUIO, NULL);
> +	}

Mixed tabs and spaces for indent.

[snip]
> +/*
> +Routine Description:
> +    Free all the resources allocated in AdfEvtDeviceAdd.

Typo: Adf -> Wdf.

[snip]
> +static NTSTATUS
> +get_pci_device_info(_In_ WDFOBJECT device)
> +{
[snip]
> +        // Also, retrieve the NUMA node of the device
> +        USHORT numaNode;
> +        status = IoGetDeviceNumaNode(pdo, &numaNode);
> +        if (NT_SUCCESS(status)) {
> +            netuio_contextdata->dev_numa_node = numaNode;
> +        }

Why is this needed? Userspace has access to this info.

> +    }
> +
> +    return status;
> +}
> +
> +static NTSTATUS
> +create_device_specific_symbolic_link(_In_ WDFOBJECT device)
> +{
> +    NTSTATUS status = STATUS_UNSUCCESSFUL;
> +    UNICODE_STRING netuio_symbolic_link;
> +
> +    PNETUIO_CONTEXT_DATA  netuio_contextdata;
> +    netuio_contextdata = netuio_get_context_data(device);
> +
> +    if (!netuio_contextdata)
> +        return status;
> +
> +    // Build symbolic link name as <netuio_symbolic_link>_BDF  (bus/device/func)
> +    CHAR  symbolic_link[64] = { 0 };
> +    sprintf_s(symbolic_link, sizeof(symbolic_link), "%s_%04d%02d%02d",
> +                            NETUIO_DEVICE_SYMBOLIC_LINK_ANSI, netuio_contextdata->addr.bus_num,
> +                            netuio_contextdata->addr.dev_num, netuio_contextdata->addr.func_num);
> +
> +    ANSI_STRING ansi_symbolic_link;
> +    RtlInitAnsiString(&ansi_symbolic_link, symbolic_link);
> +
> +    status = RtlAnsiStringToUnicodeString(&netuio_symbolic_link, &ansi_symbolic_link, TRUE);
> +    if (!NT_SUCCESS(status))
> +        return status;

Why not use Unicode directly?

> +
> +    status = WdfDeviceCreateSymbolicLink(device, &netuio_symbolic_link);
> +
> +    RtlFreeUnicodeString(&netuio_symbolic_link);
> +
> +    return status;
> +}
[snip]

> +
> +// Local function protoyypes
> +static NTSTATUS get_pci_device_info(_In_ WDFOBJECT device);
> +static NTSTATUS create_device_specific_symbolic_link(_In_ WDFOBJECT device);
> +static NTSTATUS allocate_usermemory_segment(_In_ WDFOBJECT device);
> +static VOID free_usermemory_segment(_In_ WDFOBJECT device);

Nit: local functions usually don't appear in headers.

[snip]
> diff --git a/windows/netuio/kernel/windows/netuio/netuio_interface.h b/windows/netuio/kernel/windows/netuio/netuio_interface.h
> new file mode 100644
> index 000000000..6674931d2
> --- /dev/null
> +++ b/windows/netuio/kernel/windows/netuio/netuio_interface.h
> @@ -0,0 +1,73 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright (c) Microsoft Corporation. All rights reserved
> + */
> +
> +#ifndef NETUIO_INTERFACE_H
> +#define NETUIO_INTERFACE_H
> +
> +// All structures in this file are packed on an 8B boundary. 
> +#pragma pack(push)
> +#pragma pack(8)
> +
> +// Define an Interface Guid so that any app can find the device and talk to it.
> +DEFINE_GUID (GUID_DEVINTERFACE_netUIO, 0x08336f60,0x0679,0x4c6c,0x85,0xd2,0xae,0x7c,0xed,0x65,0xff,0xf7); // {08336f60-0679-4c6c-85d2-ae7ced65fff7}
> +
> +// Device name definitions
> +#define NETUIO_DEVICE_SYMBOLIC_LINK_ANSI    "\\DosDevices\\netuio"
> +
> +// netUIO driver symbolic name (prefix)
> +#define NETUIO_DRIVER_NAME  _T("netuio")
> +
> +// IOCTL code definitions
> +#define IOCTL_NETUIO_MAP_HW_INTO_USERMODE CTL_CODE(FILE_DEVICE_NETWORK, 51, METHOD_BUFFERED, FILE_READ_ACCESS | FILE_WRITE_ACCESS)
> +#define IOCTL_NETUIO_PCI_CONFIG_IO        CTL_CODE(FILE_DEVICE_NETWORK, 52, METHOD_BUFFERED, FILE_READ_ACCESS | FILE_WRITE_ACCESS)

Input and output of these interfaces has to be documented somewhere.

> +
> +struct mem_region {
> +    UINT64           size;       // memory region size
> +    PHYSICAL_ADDRESS phys_addr;  // physical address of the memory region
> +    PVOID            virt_addr;  // virtual address of the memory region
> +    PVOID            user_mapped_virt_addr;  // virtual address of the region mapped into user process context
> +};
> +
> +struct dev_addr {
> +    ULONG   bus_num;
> +    USHORT  dev_num;
> +    USHORT  func_num;
> +};
> +
> +enum pci_io {
> +    PCI_IO_READ = 0,
> +    PCI_IO_WRITE = 1
> +};
> +
> +#define PCI_MAX_BAR 6
> +
> +struct dpdk_private_info

Nit: "private" usually means something hidden, not exposed like this. How
about "device_info", "device_data", or "device_resources"?

> +{
> +    struct mem_region   hw[PCI_MAX_BAR];
> +    struct mem_region   ms;
> +    struct dev_addr     dev_addr;

Why is the address needed? Each device has its own symlink and handle.
Such interface is harder to use with no visible benefit. Same below.

> +    UINT16              dev_id;
> +    UINT16              sub_dev_id;
> +    USHORT              dev_numa_node;
> +    USHORT              reserved;
> +};
> +
> +struct dpdk_pci_config_io
> +{
> +    struct dev_addr     dev_addr;
> +    UINT32              offset;
> +    enum pci_io         op;

Driver API should use fixed-size types. Enum size is unspecified.

> +    UINT32              access_size; // 1, 2, 4, or 8 bytes
> +
> +    union dpdk_pci_config_io_data {
> +        UINT8			u8;
> +        UINT16			u16;
> +        UINT32			u32;
> +        UINT64			u64;
> +    } data;
> +};
> +
> +#pragma pack(pop)
> +
> +#endif // NETUIO_INTERFACE_H

[snip]

> +static NTSTATUS
> +netuio_map_address_into_user_process(_In_ PNETUIO_CONTEXT_DATA netuio_contextdata)
> +{
> +    NTSTATUS status = STATUS_SUCCESS;
> +
> +    // Map the scratch memory regions to the user's process context
> +    MmBuildMdlForNonPagedPool(netuio_contextdata->dpdk_seg.mdl);
> +    __try {
> +        netuio_contextdata->dpdk_seg.mem.user_mapped_virt_addr =
> +            MmMapLockedPagesSpecifyCache(netuio_contextdata->dpdk_seg.mdl, UserMode,
> +                                         MmCached, NULL, FALSE, NormalPagePriority);

If user does IOCTL twice, they'll get the same memory mapped twice to
different addresses, but the driver will only remember the last mapping. After
it is removed, the first mapping will remain, exposing kernel memory for
user-after-free from userspace. Or do I miss something here?

DPDK doesn't need multiple mappings, so you could either return an error or
just an already mapped address on the second call.

[snip]

> +NTSTATUS
> +netuio_queue_initialize(_In_ WDFDEVICE Device)
> +{
> +    WDFQUEUE queue;
> +    NTSTATUS status;
> +    WDF_IO_QUEUE_CONFIG    queueConfig;
> +
> +    PAGED_CODE();
> +
> +    // Configure a default queue so that requests that are not
> +    // configure-fowarded using WdfDeviceConfigureRequestDispatching to goto
> +    // other queues get dispatched here.
> +    WDF_IO_QUEUE_CONFIG_INIT_DEFAULT_QUEUE(&queueConfig, WdfIoQueueDispatchParallel);
> +
> +    queueConfig.EvtIoDeviceControl = netuio_evt_IO_device_control;
> +    queueConfig.EvtIoStop = netuio_evt_IO_stop;

EvtIoStop is not needed.

[snip]

> +_Use_decl_annotations_
> +VOID
> +netuio_evt_IO_in_caller_context(
> +    IN WDFDEVICE  Device,
> +    IN WDFREQUEST Request
> +)
> +{
> +    WDF_REQUEST_PARAMETERS params = { 0 };
> +    NTSTATUS status = STATUS_SUCCESS;
> +    PVOID    input_buf = NULL, output_buf = NULL;
> +    size_t   input_buf_size, output_buf_size;
> +    size_t  bytes_returned = 0;
> +    PNETUIO_CONTEXT_DATA  netuio_contextdata = NULL;
> +
> +    netuio_contextdata = netuio_get_context_data(Device);
> +
> +    WDF_REQUEST_PARAMETERS_INIT(&params);
> +    WdfRequestGetParameters(Request, &params);
> +
> +    // We only need to be in the context of the process that initiated the request
> +    //when we need to map memory to userspace. Otherwise, send the request back to framework.
> +    if (!((params.Type == WdfRequestTypeDeviceControl) &&
> +        (params.Parameters.DeviceIoControl.IoControlCode == IOCTL_NETUIO_MAP_HW_INTO_USERMODE)))
> +    {
> +        status = WdfDeviceEnqueueRequest(Device, Request);
> +
> +        if (!NT_SUCCESS(status))
> +        {
> +            WdfRequestCompleteWithInformation(Request, status, bytes_returned);

Nit: WdfRequestComplete(Request, status) would be sufficient.

> +        }
> +        return;
> +    }
> +
[snip]

> diff --git a/windows/netuio/kernel/windows/netuio/netuio_queue.h b/windows/netuio/kernel/windows/netuio/netuio_queue.h
> new file mode 100644
> index 000000000..9173a77b9
> --- /dev/null
> +++ b/windows/netuio/kernel/windows/netuio/netuio_queue.h
> @@ -0,0 +1,31 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright (c) Microsoft Corporation. All rights reserved
> + */
> +
> +#ifndef NETUIO_QUEUE_H
> +#define NETUIO_QUEUE_H
> +
> +EXTERN_C_START

Not needed in C code.

> +
> +// This is the context that can be placed per queue and would contain per queue information.
> +typedef struct _QUEUE_CONTEXT {
> +    ULONG PrivateDeviceData;  // just a placeholder
> +} QUEUE_CONTEXT, *PQUEUE_CONTEXT;
> +
> +WDF_DECLARE_CONTEXT_TYPE_WITH_NAME(QUEUE_CONTEXT, QueueGetContext)

Boilerplate, not needed.

[snip]


  parent reply	other threads:[~2020-08-24 20:53 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-20 22:23 Narcisa Ana Maria Vasile
2020-08-21  1:32 ` Ranjit Menon
2020-09-09 18:58   ` Narcisa Ana Maria Vasile
2020-08-24 20:53 ` Dmitry Kozlyuk [this message]
2020-09-09 18:53   ` Narcisa Ana Maria Vasile
2020-09-13 21:39     ` Dmitry Kozlyuk
2020-09-09 18:41 ` [dpdk-dev] [PATCH v3] " Narcisa Ana Maria Vasile
2020-09-13 21:39   ` Dmitry Kozlyuk
2020-09-19  2:52   ` [dpdk-dev] [PATCH v4] " Narcisa Ana Maria Vasile
2020-09-22 21:25     ` Dmitry Kozlyuk
2020-09-22 21:36     ` Ranjit Menon
2020-10-01 22:55     ` [dpdk-dev] [PATCH v5] " Narcisa Ana Maria Vasile
2020-10-02 18:21       ` Ranjit Menon
2020-10-05 19:34         ` Narcisa Ana Maria Vasile
2020-10-14  9:29         ` Thomas Monjalon
2020-10-02 18:33       ` Dmitry Kozlyuk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200824235344.5994cd13@sovereign \
    --to=dmitry.kozliuk@gmail.com \
    --cc=Harini.Ramakrishnan@microsoft.com \
    --cc=dev@dpdk.org \
    --cc=dmitrym@microsoft.com \
    --cc=haramakr@linux.microsoft.com \
    --cc=navasile@linux.microsoft.com \
    --cc=navasile@microsoft.com \
    --cc=ocardona@microsoft.com \
    --cc=pallavi.kadam@intel.com \
    --cc=ranjit.menon@intel.com \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).