From: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
To: Narcisa Ana Maria Vasile <navasile@linux.microsoft.com>
Cc: dev@dpdk.org, thomas@monjalon.net, ocardona@microsoft.com,
haramakr@linux.microsoft.com, pallavi.kadam@intel.com,
ranjit.menon@intel.com, dmitrym@microsoft.com,
Narcisa Vasile <navasile@microsoft.com>,
Harini Ramakrishnan <Harini.Ramakrishnan@microsoft.com>
Subject: Re: [dpdk-dev] [PATCH v3] windows/netuio: add Windows NetUIO kernel driver
Date: Mon, 14 Sep 2020 00:39:05 +0300 [thread overview]
Message-ID: <20200914003905.4a27324f@sovereign> (raw)
In-Reply-To: <1599676883-11991-1-git-send-email-navasile@linux.microsoft.com>
On Wed, 9 Sep 2020 11:41:23 -0700, Narcisa Ana Maria Vasile wrote:
> From: Narcisa Vasile <navasile@microsoft.com>
Thanks for the hard work! A few comments inline worth checking, but this
version improved much anyway.
> The Windows netuio kernel driver provides the DPDK userspace application
> with direct access to hardware, by mapping the HW registers in userspace
> and allowing read/write operations from/to the device
> configuration space.
>
> Two IOCTLs are defined by the netuio interface:
> * IOCTL_NETUIO_MAP_HW_INTO_USERMODE
> - used for mapping the device registers into userspace
>
> Description of output:
> - the physical address, virtual address and the size of the
> memory region where the BARs were mapped.
>
> * IOCTL_NETUIO_PCI_CONFIG_IO
> - used to read/write from/into the device configuration space
>
> Description of input:
> - the operation type (read/write)
> - the offset into the device data where the operation begins
> - the length of data to read/write.
> - in case of a write operation, the data to be written to
> the device configuration space.
> Description of output:
> - in case of a read operation, the output buffer is filled
> with the data read from the configuration space.
Detailed parameter descriptions would better suit netuio_interface.h, where
the IOCTLs are defined. Short ones are useful here, though.
> windows/netuio/README.rst | 58 +++++
Nit: "git am" complains about spaces at end of a few lines in this file.
[snip]
> +static NTSTATUS
> +get_pci_device_info(_In_ WDFOBJECT device)
> +{
[snip]
> + // Retrieve the B:D:F details of our device
> + PDEVICE_OBJECT pdo = NULL;
> + pdo = WdfDeviceWdmGetPhysicalDevice(device);
> + if (pdo) {
> + ULONG prop = 0, length = 0;
> + status = IoGetDeviceProperty(pdo, DevicePropertyBusNumber, sizeof(ULONG), (PVOID)&ctx->addr.bus_num, &length);
Status is not checked and is immediately overwritten.
> + status = IoGetDeviceProperty(pdo, DevicePropertyAddress, sizeof(ULONG), (PVOID)&prop, &length);
> +
> + if (NT_SUCCESS(status)) {
> + ctx->addr.func_num = prop & 0x0000FFFF;
> + ctx->addr.dev_num = ((prop >> 16) & 0x0000FFFF);
> + }
> + }
> +
> + return status;
> +}
> +
> +static NTSTATUS
> +create_device_specific_symbolic_link(_In_ WDFOBJECT device)
> +{
> + NTSTATUS status = STATUS_UNSUCCESSFUL;
> + UNICODE_STRING netuio_symbolic_link;
> +
> + PNETUIO_CONTEXT_DATA ctx;
> + ctx = netuio_get_context_data(device);
> +
> + if (!ctx)
> + return status;
Can "ctx" really be NULL? Same applies to file object context. If I
get WdfObjectGetTypedContext() description right, it cannot, so you could
save a lot of checks.
> +
> + // Build symbolic link name as <netuio_symbolic_link>_BDF (bus/device/func)
> + CHAR symbolic_link[64] = { 0 };
> + sprintf_s(symbolic_link, sizeof(symbolic_link), "%s_%04d%02d%02d",
> + NETUIO_DEVICE_SYMBOLIC_LINK_ANSI, ctx->addr.bus_num,
> + ctx->addr.dev_num, ctx->addr.func_num);
You probably want to use %x, and not %d for PCI addresses, because decimal
device numbers may not fit 2 digits and it will be hard to recognize. The
other place is DbgPrintEx() below. No strong opinion, however, because Device
Manager seems to use decimal.
> +
> + ANSI_STRING ansi_symbolic_link;
> + RtlInitAnsiString(&ansi_symbolic_link, symbolic_link);
> +
> + status = RtlAnsiStringToUnicodeString(&netuio_symbolic_link, &ansi_symbolic_link, TRUE);
> + if (!NT_SUCCESS(status))
> + return status;
> +
> + status = WdfDeviceCreateSymbolicLink(device, &netuio_symbolic_link);
> +
> + RtlFreeUnicodeString(&netuio_symbolic_link);
> +
> + return status;
> +}
[snip]
> +
> +_Use_decl_annotations_
> +NTSTATUS
> +netuio_map_hw_resources(WDFDEVICE Device, WDFCMRESLIST Resources, WDFCMRESLIST ResourcesTranslated)
> +{
Could you please document the solution here? Something along these lines:
ResourcesTranslated report MMIO BARs in the correct order, but their
indices differ from physical ones. Traverse PCI configuration
manually to maintain physical index, searching for the next MMIO
resource each time.
When I first implemented a netuio-like driver, I fell into this trap with
indices. Worse, I did work sometimes, e.g. with virtio-pci (legacy),that's
why I consider this non-obvious.
[snip]
> + for (INT bar_index = 0; bar_index < PCI_MAX_BAR; bar_index++) {
> + prev_bar = curr_bar;
> + curr_bar = pci_config.u.type0.BaseAddresses[bar_index];
> + if (curr_bar == 0 || (prev_bar & PCI_TYPE_64BIT)) {
> + // Skip this bar
> + ctx->bar[bar_index].base_addr.QuadPart = 0;
> + ctx->bar[bar_index].size = 0;
> + ctx->bar[bar_index].virt_addr = 0;
> +
> + ctx->dpdk_hw[bar_index].mdl = NULL;
> + ctx->dpdk_hw[bar_index].mem.size = 0;
> +
> + continue;
> + }
Maybe RtlZeroMmeory() the whole array beforehand and just continue here? This
would protect from forgetting to clear new fields, should they be added.
> +
> + // Find next CmResourceTypeMemory
> + do {
> + descriptor = WdfCmResourceListGetDescriptor(ResourcesTranslated, next_descriptor);
> + next_descriptor++;
> +
> + if (descriptor == NULL) {
> + status = STATUS_DEVICE_CONFIGURATION_ERROR;
> + goto end;
> + }
> + } while (descriptor->Type != CmResourceTypeMemory);
> +
> + // Assert that we have the correct descriptor
> + ASSERT((descriptor->Flags & CM_RESOURCE_MEMORY_BAR) != 0);
Shouldn't this assert be a check? Or incorporated in loop condition maybe.
> +
> + if (curr_bar & PCI_TYPE_64BIT) {
> + ASSERT(bar_index != PCI_TYPE0_ADDRESSES - 1);
> + bar_addr = ((ULONGLONG)pci_config.u.type0.BaseAddresses[bar_index + 1] << 32) | (curr_bar & PCI_ADDRESS_MEMORY_ADDRESS_MASK);
> + }
> + else
> + {
> + bar_addr = curr_bar & PCI_ADDRESS_MEMORY_ADDRESS_MASK;
> + }
> +
> + ASSERT((ULONGLONG)descriptor->u.Memory.Start.QuadPart == bar_addr);
Are these calculations really needed if you end up using address from the
resource descriptor?
> +
> +/*
> +Routine Description:
> + Free all the resources allocated in DriverEntry.
> +
> +Return Value:
> + None
> +-*/
> +_Use_decl_annotations_
> +VOID
> +netuio_evt_driver_context_cleanup(WDFOBJECT DriverObject)
> +{
> + UNREFERENCED_PARAMETER(DriverObject);
> + DbgPrintEx(DPFLTR_IHVNETWORK_ID, DPFLTR_NETUIO_INFO_LEVEL, "netUIO Driver unloaded.\n");
> + PAGED_CODE();
> +}
The comment is misleading, no resources are freed here. You could've done
without this function, driver loading/unloading is logged anyway.
Also, PAGED_CODE() logically must come before any IRQL-constrained calls.
[snip]
> +#endif // NETUIO_DRV_H
> diff --git a/windows/netuio/netuio_interface.h b/windows/netuio/netuio_interface.h
> new file mode 100644
> index 000000000..6fc75ad8e
> --- /dev/null
> +++ b/windows/netuio/netuio_interface.h
> @@ -0,0 +1,67 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Microsoft Corporation.
> + */
> +
> +#ifndef NETUIO_INTERFACE_H
> +#define NETUIO_INTERFACE_H
> +
> +// All structures in this file are packed on an 8B boundary.
> +#pragma pack(push)
> +#pragma pack(8)
> +
> +// Define an Interface Guid so that any app can find the device and talk to it.
> +DEFINE_GUID (GUID_DEVINTERFACE_netUIO, 0x08336f60,0x0679,0x4c6c,0x85,0xd2,0xae,0x7c,0xed,0x65,0xff,0xf7); // {08336f60-0679-4c6c-85d2-ae7ced65fff7}
> +
> +// Device name definitions
> +#define NETUIO_DEVICE_SYMBOLIC_LINK_ANSI "\\DosDevices\\netuio"
> +
> +// netUIO driver symbolic name (prefix)
> +#define NETUIO_DRIVER_NAME _T("netuio")
This string is unused and is not needed for interface.
> +
> +// IOCTL code definitions
> +#define IOCTL_NETUIO_MAP_HW_INTO_USERMODE CTL_CODE(FILE_DEVICE_NETWORK, 51, METHOD_BUFFERED, FILE_READ_ACCESS | FILE_WRITE_ACCESS)
> +#define IOCTL_NETUIO_PCI_CONFIG_IO CTL_CODE(FILE_DEVICE_NETWORK, 52, METHOD_BUFFERED, FILE_READ_ACCESS | FILE_WRITE_ACCESS)
> +
> +struct mem_region {
> + UINT64 size; // memory region size
> + LARGE_INTEGER phys_addr; // physical address of the memory region
> + PVOID virt_addr; // virtual address of the memory region
> + PVOID user_mapped_virt_addr; // virtual address of the region mapped into user process context
> +};
> +
> +struct dev_addr {
> + ULONG bus_num;
> + USHORT dev_num;
> + USHORT func_num;
> +};
This structure is not needed for interface.
> +
> +enum pci_io {
> + PCI_IO_READ = 0,
> + PCI_IO_WRITE = 1
> +};
> +
> +#define PCI_MAX_BAR 6
> +
> +struct device_info
> +{
> + struct mem_region hw[PCI_MAX_BAR];
> + USHORT reserved;
"Reserved" is not used and it is not documented whether the caller must
initialize it or not. I suggest removing it.
> +};
> +
> +struct dpdk_pci_config_io
> +{
> + UINT32 offset;
> + UINT8 op;
> + UINT32 access_size; // 1, 2, 4, or 8 bytes
> +
> + union dpdk_pci_config_io_data {
> + UINT8 u8;
> + UINT16 u16;
> + UINT32 u32;
> + UINT64 u64;
> + } data;
> +};
> +
> +#pragma pack(pop)
> +
> +#endif // NETUIO_INTERFACE_H
> diff --git a/windows/netuio/netuio_queue.c b/windows/netuio/netuio_queue.c
> new file mode 100644
> index 000000000..cea94e630
> --- /dev/null
> +++ b/windows/netuio/netuio_queue.c
> @@ -0,0 +1,333 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Microsoft Corporation.
> + */
> +
> +#include "netuio_drv.h"
> +
> +#ifdef ALLOC_PRAGMA
> +#pragma alloc_text (PAGE, netuio_queue_initialize)
> +#endif
> +
> +static
> +BOOLEAN
> +netuio_get_usermode_mapping_flag(WDFREQUEST Request)
> +{
[snip]
> +}
Blank line is missing here.
[snip]
> +
> +/*
> +Routine Description:
> + Maps address ranges into the usermode process's address space. The following
> + ranges are mapped:
> +
> + * Any PCI BARs that our device was assigned
> + * The scratch buffer of contiguous pages
> +
> +Return Value:
> + NTSTATUS
> +*/
> +static NTSTATUS
> +netuio_map_address_into_user_process(_In_ PNETUIO_CONTEXT_DATA ctx, WDFREQUEST Request)
[snip]
> +_Use_decl_annotations_
> +VOID
> +netuio_evt_IO_in_caller_context(WDFDEVICE Device,
> + WDFREQUEST Request)
> +{
> + WDF_REQUEST_PARAMETERS params = { 0 };
> + NTSTATUS status = STATUS_SUCCESS;
> + PVOID output_buf = NULL;
> + size_t output_buf_size;
> + size_t bytes_returned = 0;
> + PNETUIO_CONTEXT_DATA ctx = NULL;
> +
> + ctx = netuio_get_context_data(Device);
> +
> + WDF_REQUEST_PARAMETERS_INIT(¶ms);
> + WdfRequestGetParameters(Request, ¶ms);
> +
> + // We only need to be in the context of the process that initiated the request
> + //when we need to map memory to userspace. Otherwise, send the request back to framework.
> + if (!((params.Type == WdfRequestTypeDeviceControl) &&
> + (params.Parameters.DeviceIoControl.IoControlCode == IOCTL_NETUIO_MAP_HW_INTO_USERMODE)))
> + {
> + status = WdfDeviceEnqueueRequest(Device, Request);
> +
> + if (!NT_SUCCESS(status))
> + {
> + WdfRequestCompleteWithInformation(Request, status, bytes_returned);
> + }
> + return;
> + }
> +
> + // Return relevant data to the caller
> + status = WdfRequestRetrieveOutputBuffer(Request, sizeof(struct device_info), &output_buf, &output_buf_size);
> + if (!NT_SUCCESS(status)) {
> + status = STATUS_INVALID_BUFFER_SIZE;
Better propagate original error to help debugging
(two more similar places in netuio_evt_IO_device_control).
> + goto end;
> + }
> +
> + status = netuio_map_address_into_user_process(ctx, Request);
> + if (status != STATUS_SUCCESS) {
> + goto end;
> + }
> +
> + netuio_set_usermode_mapping_flag(Request);
> +
> + netuio_handle_get_hw_data_request(ctx, output_buf, output_buf_size);
> + bytes_returned = output_buf_size;
> +
> +end:
> + WdfRequestCompleteWithInformation(Request, status, bytes_returned);
> +
> + return;
> +}
> +
> +/*
> +Routine Description:
> + This event is invoked when the framework receives IRP_MJ_DEVICE_CONTROL request.
> +
> +Return Value:
> + None
> + */
> +_Use_decl_annotations_
> +VOID
> +netuio_evt_IO_device_control(WDFQUEUE Queue, WDFREQUEST Request,
> + size_t OutputBufferLength, size_t InputBufferLength,
> + ULONG IoControlCode)
[snip]
> +
> + if (dpdk_pci_io_input->op == PCI_IO_READ) {
> + *(UINT64 *)output_buf = 0;
> + bytes_returned = ctx->bus_interface.GetBusData(
> + ctx->bus_interface.Context,
> + PCI_WHICHSPACE_CONFIG,
> + output_buf,
> + dpdk_pci_io_input->offset,
> + dpdk_pci_io_input->access_size);
> + }
> + else if (dpdk_pci_io_input->op == PCI_IO_WRITE) {
> + // returns bytes written
> + bytes_returned = ctx->bus_interface.SetBusData(
> + ctx->bus_interface.Context,
> + PCI_WHICHSPACE_CONFIG,
> + (PVOID)&dpdk_pci_io_input->data,
> + dpdk_pci_io_input->offset,
> + dpdk_pci_io_input->access_size);
The "offset" is not checked, so GetBusData() and SetBusData() can fail and
return 0. You have to either check it above to ensure success or to check if
bytes_returned != dpdk_pci_io_input->access_size and complete with error.
> + }
> + else {
> + status = STATUS_INVALID_PARAMETER;
> + goto end;
> + }
> +
> +end:
> + WdfRequestCompleteWithInformation(Request, status, bytes_returned);
> +
> + return;
> +}
next prev parent reply other threads:[~2020-09-13 21:39 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-08-20 22:23 [dpdk-dev] [PATCH v2] " Narcisa Ana Maria Vasile
2020-08-21 1:32 ` Ranjit Menon
2020-09-09 18:58 ` Narcisa Ana Maria Vasile
2020-08-24 20:53 ` Dmitry Kozlyuk
2020-09-09 18:53 ` Narcisa Ana Maria Vasile
2020-09-13 21:39 ` Dmitry Kozlyuk
2020-09-09 18:41 ` [dpdk-dev] [PATCH v3] " Narcisa Ana Maria Vasile
2020-09-13 21:39 ` Dmitry Kozlyuk [this message]
2020-09-19 2:52 ` [dpdk-dev] [PATCH v4] " Narcisa Ana Maria Vasile
2020-09-22 21:25 ` Dmitry Kozlyuk
2020-09-22 21:36 ` Ranjit Menon
2020-10-01 22:55 ` [dpdk-dev] [PATCH v5] " Narcisa Ana Maria Vasile
2020-10-02 18:21 ` Ranjit Menon
2020-10-05 19:34 ` Narcisa Ana Maria Vasile
2020-10-14 9:29 ` Thomas Monjalon
2020-10-02 18:33 ` Dmitry Kozlyuk
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200914003905.4a27324f@sovereign \
--to=dmitry.kozliuk@gmail.com \
--cc=Harini.Ramakrishnan@microsoft.com \
--cc=dev@dpdk.org \
--cc=dmitrym@microsoft.com \
--cc=haramakr@linux.microsoft.com \
--cc=navasile@linux.microsoft.com \
--cc=navasile@microsoft.com \
--cc=ocardona@microsoft.com \
--cc=pallavi.kadam@intel.com \
--cc=ranjit.menon@intel.com \
--cc=thomas@monjalon.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).