DPDK patches and discussions
 help / color / mirror / Atom feed
From: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
To: Narcisa Ana Maria Vasile <navasile@linux.microsoft.com>
Cc: dev@dpdk.org, thomas@monjalon.net, ocardona@microsoft.com,
	haramakr@linux.microsoft.com, pallavi.kadam@intel.com,
	ranjit.menon@intel.com, dmitrym@microsoft.com,
	Narcisa Vasile <navasile@microsoft.com>,
	Harini Ramakrishnan <Harini.Ramakrishnan@microsoft.com>
Subject: Re: [dpdk-dev] [PATCH v3] windows/netuio: add Windows NetUIO kernel driver
Date: Mon, 14 Sep 2020 00:39:05 +0300	[thread overview]
Message-ID: <20200914003905.4a27324f@sovereign> (raw)
In-Reply-To: <1599676883-11991-1-git-send-email-navasile@linux.microsoft.com>

On Wed,  9 Sep 2020 11:41:23 -0700, Narcisa Ana Maria Vasile wrote:
> From: Narcisa Vasile <navasile@microsoft.com>

Thanks for the hard work! A few comments inline worth checking, but this
version improved much anyway.

> The Windows netuio kernel driver provides the DPDK userspace application
> with direct access to hardware, by mapping the HW registers in userspace
> and allowing read/write operations from/to the device
> configuration space.
> 
> Two IOCTLs are defined by the netuio interface:
>   * IOCTL_NETUIO_MAP_HW_INTO_USERMODE
>       - used for mapping the device registers into userspace
> 
>       Description of output:
>       -	the physical address, virtual address and the size of the
>         memory region where the BARs were mapped.
> 
>   * IOCTL_NETUIO_PCI_CONFIG_IO
>       - used to read/write from/into the device configuration space
> 
>       Description of input:
>       -	the operation type (read/write)
>       -	the offset into the device data where the operation begins
>       -	the length of data to read/write.
>       -	in case of a write operation, the data to be written to
>         the device configuration space.
>       Description of output:
>       -	in case of a read operation, the output buffer is filled
>         with the data read from the configuration space.

Detailed parameter descriptions would better suit netuio_interface.h, where
the IOCTLs are defined. Short ones are useful here, though.

>  windows/netuio/README.rst             |  58 +++++

Nit: "git am" complains about spaces at end of a few lines in this file.

[snip]
> +static NTSTATUS
> +get_pci_device_info(_In_ WDFOBJECT device)
> +{
[snip]
> +    // Retrieve the B:D:F details of our device
> +    PDEVICE_OBJECT pdo = NULL;
> +    pdo = WdfDeviceWdmGetPhysicalDevice(device);
> +    if (pdo) {
> +        ULONG prop = 0, length = 0;
> +        status = IoGetDeviceProperty(pdo, DevicePropertyBusNumber, sizeof(ULONG), (PVOID)&ctx->addr.bus_num, &length);

Status is not checked and is immediately overwritten.

> +        status = IoGetDeviceProperty(pdo, DevicePropertyAddress, sizeof(ULONG), (PVOID)&prop, &length);
> +
> +        if (NT_SUCCESS(status)) {
> +            ctx->addr.func_num = prop & 0x0000FFFF;
> +            ctx->addr.dev_num = ((prop >> 16) & 0x0000FFFF);
> +        }
> +    }
> +
> +    return status;
> +}
> +
> +static NTSTATUS
> +create_device_specific_symbolic_link(_In_ WDFOBJECT device)
> +{
> +    NTSTATUS status = STATUS_UNSUCCESSFUL;
> +    UNICODE_STRING netuio_symbolic_link;
> +
> +    PNETUIO_CONTEXT_DATA  ctx;
> +    ctx = netuio_get_context_data(device);
> +
> +    if (!ctx)
> +        return status;

Can "ctx" really be NULL? Same applies to file object context. If I
get WdfObjectGetTypedContext() description right, it cannot, so you could
save a lot of checks.

> +
> +    // Build symbolic link name as <netuio_symbolic_link>_BDF  (bus/device/func)
> +    CHAR  symbolic_link[64] = { 0 };
> +    sprintf_s(symbolic_link, sizeof(symbolic_link), "%s_%04d%02d%02d",
> +        NETUIO_DEVICE_SYMBOLIC_LINK_ANSI, ctx->addr.bus_num,
> +        ctx->addr.dev_num, ctx->addr.func_num);

You probably want to use %x, and not %d for PCI addresses, because decimal
device numbers may not fit 2 digits and it will be hard to recognize. The
other place is DbgPrintEx() below. No strong opinion, however, because Device
Manager seems to use decimal.

> +
> +    ANSI_STRING ansi_symbolic_link;
> +    RtlInitAnsiString(&ansi_symbolic_link, symbolic_link);
> +
> +    status = RtlAnsiStringToUnicodeString(&netuio_symbolic_link, &ansi_symbolic_link, TRUE);
> +    if (!NT_SUCCESS(status))
> +        return status;
> +
> +    status = WdfDeviceCreateSymbolicLink(device, &netuio_symbolic_link);
> +
> +    RtlFreeUnicodeString(&netuio_symbolic_link);
> +
> +    return status;
> +}

[snip]
> +
> +_Use_decl_annotations_
> +NTSTATUS
> +netuio_map_hw_resources(WDFDEVICE Device, WDFCMRESLIST Resources, WDFCMRESLIST ResourcesTranslated)
> +{

Could you please document the solution here? Something along these lines:

	ResourcesTranslated report MMIO BARs in the correct order, but their
	indices differ from physical ones. Traverse PCI configuration
	manually to maintain physical index, searching for the next MMIO
	resource each time.

When I first implemented a netuio-like driver, I fell into this trap with
indices. Worse, I did work sometimes, e.g. with virtio-pci (legacy),that's
why I consider this non-obvious.

[snip]
> +    for (INT bar_index = 0; bar_index < PCI_MAX_BAR; bar_index++) {
> +        prev_bar = curr_bar;
> +        curr_bar = pci_config.u.type0.BaseAddresses[bar_index];
> +        if (curr_bar == 0 || (prev_bar & PCI_TYPE_64BIT)) {
> +            // Skip this bar
> +            ctx->bar[bar_index].base_addr.QuadPart = 0;
> +            ctx->bar[bar_index].size = 0;
> +            ctx->bar[bar_index].virt_addr = 0;
> +
> +            ctx->dpdk_hw[bar_index].mdl = NULL;
> +            ctx->dpdk_hw[bar_index].mem.size = 0;
> +
> +            continue;
> +        }

Maybe RtlZeroMmeory() the whole array beforehand and just continue here? This
would protect from forgetting to clear new fields, should they be added.

> +
> +        // Find next CmResourceTypeMemory
> +        do {
> +            descriptor = WdfCmResourceListGetDescriptor(ResourcesTranslated, next_descriptor);
> +            next_descriptor++;
> +
> +            if (descriptor == NULL) {
> +                status = STATUS_DEVICE_CONFIGURATION_ERROR;
> +                goto end;
> +            }
> +        } while (descriptor->Type != CmResourceTypeMemory);
> +
> +        // Assert that we have the correct descriptor
> +        ASSERT((descriptor->Flags & CM_RESOURCE_MEMORY_BAR) != 0);

Shouldn't this assert be a check? Or incorporated in loop condition maybe.

> +
> +        if (curr_bar & PCI_TYPE_64BIT) {
> +            ASSERT(bar_index != PCI_TYPE0_ADDRESSES - 1);
> +            bar_addr = ((ULONGLONG)pci_config.u.type0.BaseAddresses[bar_index + 1] << 32) | (curr_bar & PCI_ADDRESS_MEMORY_ADDRESS_MASK);
> +        }
> +        else
> +        {
> +            bar_addr = curr_bar & PCI_ADDRESS_MEMORY_ADDRESS_MASK;
> +        }
> +
> +        ASSERT((ULONGLONG)descriptor->u.Memory.Start.QuadPart == bar_addr);

Are these calculations really needed if you end up using address from the
resource descriptor?

> +
> +/*
> +Routine Description:
> +    Free all the resources allocated in DriverEntry.
> +
> +Return Value:
> +    None
> +-*/
> +_Use_decl_annotations_
> +VOID
> +netuio_evt_driver_context_cleanup(WDFOBJECT DriverObject)
> +{
> +    UNREFERENCED_PARAMETER(DriverObject);
> +    DbgPrintEx(DPFLTR_IHVNETWORK_ID, DPFLTR_NETUIO_INFO_LEVEL, "netUIO Driver unloaded.\n");
> +    PAGED_CODE();
> +}

The comment is misleading, no resources are freed here. You could've done
without this function, driver loading/unloading is logged anyway.
Also, PAGED_CODE() logically must come before any IRQL-constrained calls.

[snip]

> +#endif // NETUIO_DRV_H
> diff --git a/windows/netuio/netuio_interface.h b/windows/netuio/netuio_interface.h
> new file mode 100644
> index 000000000..6fc75ad8e
> --- /dev/null
> +++ b/windows/netuio/netuio_interface.h
> @@ -0,0 +1,67 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Microsoft Corporation.
> + */
> +
> +#ifndef NETUIO_INTERFACE_H
> +#define NETUIO_INTERFACE_H
> +
> +// All structures in this file are packed on an 8B boundary. 
> +#pragma pack(push)
> +#pragma pack(8)
> +
> +// Define an Interface Guid so that any app can find the device and talk to it.
> +DEFINE_GUID (GUID_DEVINTERFACE_netUIO, 0x08336f60,0x0679,0x4c6c,0x85,0xd2,0xae,0x7c,0xed,0x65,0xff,0xf7); // {08336f60-0679-4c6c-85d2-ae7ced65fff7}
> +
> +// Device name definitions
> +#define NETUIO_DEVICE_SYMBOLIC_LINK_ANSI    "\\DosDevices\\netuio"
> +
> +// netUIO driver symbolic name (prefix)
> +#define NETUIO_DRIVER_NAME  _T("netuio")

This string is unused and is not needed for interface.

> +
> +// IOCTL code definitions
> +#define IOCTL_NETUIO_MAP_HW_INTO_USERMODE CTL_CODE(FILE_DEVICE_NETWORK, 51, METHOD_BUFFERED, FILE_READ_ACCESS | FILE_WRITE_ACCESS)
> +#define IOCTL_NETUIO_PCI_CONFIG_IO        CTL_CODE(FILE_DEVICE_NETWORK, 52, METHOD_BUFFERED, FILE_READ_ACCESS | FILE_WRITE_ACCESS)
> +
> +struct mem_region {
> +    UINT64           size;       // memory region size
> +    LARGE_INTEGER    phys_addr;  // physical address of the memory region
> +    PVOID            virt_addr;  // virtual address of the memory region
> +    PVOID            user_mapped_virt_addr;  // virtual address of the region mapped into user process context
> +};
> +
> +struct dev_addr {
> +    ULONG   bus_num;
> +    USHORT  dev_num;
> +    USHORT  func_num;
> +};

This structure is not needed for interface.

> +
> +enum pci_io {
> +    PCI_IO_READ = 0,
> +    PCI_IO_WRITE = 1
> +};
> +
> +#define PCI_MAX_BAR 6
> +
> +struct device_info
> +{
> +    struct mem_region   hw[PCI_MAX_BAR];
> +    USHORT              reserved;

"Reserved" is not used and it is not documented whether the caller must
initialize it or not. I suggest removing it.

> +};
> +
> +struct dpdk_pci_config_io
> +{
> +    UINT32              offset;
> +    UINT8               op;
> +    UINT32              access_size; // 1, 2, 4, or 8 bytes
> +
> +    union dpdk_pci_config_io_data {
> +        UINT8			u8;
> +        UINT16			u16;
> +        UINT32			u32;
> +        UINT64			u64;
> +    } data;
> +};
> +
> +#pragma pack(pop)
> +
> +#endif // NETUIO_INTERFACE_H
> diff --git a/windows/netuio/netuio_queue.c b/windows/netuio/netuio_queue.c
> new file mode 100644
> index 000000000..cea94e630
> --- /dev/null
> +++ b/windows/netuio/netuio_queue.c
> @@ -0,0 +1,333 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Microsoft Corporation.
> + */
> +
> +#include "netuio_drv.h"
> +
> +#ifdef ALLOC_PRAGMA
> +#pragma alloc_text (PAGE, netuio_queue_initialize)
> +#endif
> +
> +static
> +BOOLEAN
> +netuio_get_usermode_mapping_flag(WDFREQUEST Request)
> +{
[snip]
> +}

Blank line is missing here.

[snip]
> +
> +/*
> +Routine Description:
> +    Maps address ranges into the usermode process's address space.  The following
> +    ranges are mapped:
> +
> +        * Any PCI BARs that our device was assigned
> +        * The scratch buffer of contiguous pages
> +
> +Return Value:
> +    NTSTATUS
> +*/
> +static NTSTATUS
> +netuio_map_address_into_user_process(_In_ PNETUIO_CONTEXT_DATA ctx, WDFREQUEST Request)
[snip]

> +_Use_decl_annotations_
> +VOID
> +netuio_evt_IO_in_caller_context(WDFDEVICE  Device,
> +                                WDFREQUEST Request)
> +{
> +    WDF_REQUEST_PARAMETERS params = { 0 };
> +    NTSTATUS status = STATUS_SUCCESS;
> +    PVOID    output_buf = NULL;
> +    size_t   output_buf_size;
> +    size_t  bytes_returned = 0;
> +    PNETUIO_CONTEXT_DATA  ctx = NULL;
> +
> +    ctx = netuio_get_context_data(Device);
> +
> +    WDF_REQUEST_PARAMETERS_INIT(&params);
> +    WdfRequestGetParameters(Request, &params);
> +
> +    // We only need to be in the context of the process that initiated the request
> +    //when we need to map memory to userspace. Otherwise, send the request back to framework.
> +    if (!((params.Type == WdfRequestTypeDeviceControl) &&
> +        (params.Parameters.DeviceIoControl.IoControlCode == IOCTL_NETUIO_MAP_HW_INTO_USERMODE)))
> +    {
> +        status = WdfDeviceEnqueueRequest(Device, Request);
> +
> +        if (!NT_SUCCESS(status))
> +        {
> +            WdfRequestCompleteWithInformation(Request, status, bytes_returned);
> +        }
> +        return;
> +    }
> +
> +    // Return relevant data to the caller
> +    status = WdfRequestRetrieveOutputBuffer(Request, sizeof(struct device_info), &output_buf, &output_buf_size);
> +    if (!NT_SUCCESS(status)) {
> +        status = STATUS_INVALID_BUFFER_SIZE;

Better propagate original error to help debugging
(two more similar places in netuio_evt_IO_device_control).

> +        goto end;
> +    }
> +
> +    status = netuio_map_address_into_user_process(ctx, Request);
> +    if (status != STATUS_SUCCESS) {
> +        goto end;
> +    }
> +
> +    netuio_set_usermode_mapping_flag(Request);
> +
> +    netuio_handle_get_hw_data_request(ctx, output_buf, output_buf_size);
> +    bytes_returned = output_buf_size;
> +
> +end:
> +    WdfRequestCompleteWithInformation(Request, status, bytes_returned);
> +
> +    return;
> +}
> +
> +/*
> +Routine Description:
> +    This event is invoked when the framework receives IRP_MJ_DEVICE_CONTROL request.
> +
> +Return Value:
> +    None
> + */
> +_Use_decl_annotations_
> +VOID
> +netuio_evt_IO_device_control(WDFQUEUE Queue, WDFREQUEST Request,
> +                             size_t OutputBufferLength, size_t InputBufferLength,
> +                             ULONG IoControlCode)
[snip]
> +
> +    if (dpdk_pci_io_input->op == PCI_IO_READ) {
> +        *(UINT64 *)output_buf = 0;
> +        bytes_returned = ctx->bus_interface.GetBusData(
> +            ctx->bus_interface.Context,
> +            PCI_WHICHSPACE_CONFIG,
> +            output_buf,
> +            dpdk_pci_io_input->offset,
> +            dpdk_pci_io_input->access_size);
> +    }
> +    else if (dpdk_pci_io_input->op == PCI_IO_WRITE) {
> +        // returns bytes written
> +        bytes_returned = ctx->bus_interface.SetBusData(
> +            ctx->bus_interface.Context,
> +            PCI_WHICHSPACE_CONFIG,
> +            (PVOID)&dpdk_pci_io_input->data,
> +            dpdk_pci_io_input->offset,
> +            dpdk_pci_io_input->access_size);

The "offset" is not checked, so GetBusData() and SetBusData() can fail and
return 0. You have to either check it above to ensure success or to check if
bytes_returned != dpdk_pci_io_input->access_size and complete with error.

> +    }
> +    else {
> +        status = STATUS_INVALID_PARAMETER;
> +        goto end;
> +    }
> +
> +end:
> +    WdfRequestCompleteWithInformation(Request, status, bytes_returned);
> +
> +    return;
> +}


  reply	other threads:[~2020-09-13 21:39 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-20 22:23 [dpdk-dev] [PATCH v2] " Narcisa Ana Maria Vasile
2020-08-21  1:32 ` Ranjit Menon
2020-09-09 18:58   ` Narcisa Ana Maria Vasile
2020-08-24 20:53 ` Dmitry Kozlyuk
2020-09-09 18:53   ` Narcisa Ana Maria Vasile
2020-09-13 21:39     ` Dmitry Kozlyuk
2020-09-09 18:41 ` [dpdk-dev] [PATCH v3] " Narcisa Ana Maria Vasile
2020-09-13 21:39   ` Dmitry Kozlyuk [this message]
2020-09-19  2:52   ` [dpdk-dev] [PATCH v4] " Narcisa Ana Maria Vasile
2020-09-22 21:25     ` Dmitry Kozlyuk
2020-09-22 21:36     ` Ranjit Menon
2020-10-01 22:55     ` [dpdk-dev] [PATCH v5] " Narcisa Ana Maria Vasile
2020-10-02 18:21       ` Ranjit Menon
2020-10-05 19:34         ` Narcisa Ana Maria Vasile
2020-10-14  9:29         ` Thomas Monjalon
2020-10-02 18:33       ` Dmitry Kozlyuk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200914003905.4a27324f@sovereign \
    --to=dmitry.kozliuk@gmail.com \
    --cc=Harini.Ramakrishnan@microsoft.com \
    --cc=dev@dpdk.org \
    --cc=dmitrym@microsoft.com \
    --cc=haramakr@linux.microsoft.com \
    --cc=navasile@linux.microsoft.com \
    --cc=navasile@microsoft.com \
    --cc=ocardona@microsoft.com \
    --cc=pallavi.kadam@intel.com \
    --cc=ranjit.menon@intel.com \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).