From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from youngberry.canonical.com (youngberry.canonical.com [91.189.89.112]) by dpdk.org (Postfix) with ESMTP id 44D961396 for ; Thu, 11 Feb 2016 04:05:43 +0100 (CET) Received: from c-71-237-198-100.hsd1.or.comcast.net ([71.237.198.100] helo=localhost) by youngberry.canonical.com with esmtpsa (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:16) (Exim 4.76) (envelope-from ) id 1aThZe-00055t-MJ for dev@dpdk.org; Thu, 11 Feb 2016 03:05:43 +0000 Date: Wed, 10 Feb 2016 19:05:40 -0800 From: Seth Arnold To: dev@dpdk.org Message-ID: <20160211030540.GB25680@hunt> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) Subject: [dpdk-dev] thoughts on DPDK after a few days of reading sources X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Feb 2016 03:05:43 -0000 Hello, The Ubuntu distribution is looking at supporting DPDK in the 'main' component of the archive. As part of this process I spent a few days reading the DPDK sources to gauge if we can support it or not. I've taken some notes while reading the sources; I'm sharing them in the hopes that it's useful: on the one hand my fresh eyes may spot things that you've overlooked, on the other hand your familiarity with the code means that you're better suited to judge what I've found. Most of the code was very good; I am however concerned about the frequent memory allocations that use simple integer arithmetic when deciding how much to allocate without checking for integer overflows. Here's the bug tracking the Main Inclusion Request: https://bugs.launchpad.net/ubuntu/+source/dpdk/+bug/1492186 Here's the portions of the notes that I think may be most interesting to DPDK devs. It's in rough priority order so feel free to stop reading when you're bored: === Affects main codebase and should be investigated quickly: - shellcheck reports extensive cases of forgotten quotes to prevent word splitting or globbing, potentially unused variables, error-prone printf formatting. The scripts that are going to be used at runtime should be fixed: - ./debian/dpdk-init - ./debian/dpdk.init - ./tools/setup.sh ? (Hard to tell) - ./drivers/net/cxgbe/cxgbe_ethdev.c eth_cxgbe_dev_init() memory leak in out_free_adapter: that doesn't free adapter - ./drivers/net/virtio/virtio_ethdev.c virtio_set_multiple_queues() calls virtio_send_command(), which performs: memcpy(vq->virtio_net_hdr_mz->addr, ctrl, sizeof(struct virtio_pmd_ctrl)); This copies a potentially huge amount of uninitialized data into ->addr because the struct virtio_pmd_ctrl ctrl was not zeroed before being passed. How much of this data leaves the system? Does this require a CVE? - ./lib/librte_eal/common/rte_malloc.c memory allocation routines don't check for integer overflow errors: - rte_calloc_socket() - rte_calloc() - ./lib/librte_vhost/vhost_user/virtio-net-user.c user_set_mem_table() doesn't perform integer overflow checks before calling calloc() - ./lib/librte_vhost/vhost_cuse/virtio-net-cdev.c cuse_set_mem_table() doesn't perform integer overflow checks before calling calloc() - ./lib/librte_eal/linuxapp/eal/eal_xen_memory.c rte_xen_dom0_memory_init() If vma_addr = xen_get_virtual_area(&vma_len, RTE_PGSIZE_2M); fails, vma_len is reset to RTE_PGSIZE_2M instead of seginfo[memseg_idx].size -- is this a bug? - ./lib/librte_eal/linuxapp/eal/eal_memory.c create_shared_memory() creates the hugetlb file with mode 0666 rather than 0600 -- is this a bug? Does this require a CVE? - ./lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h rte_cpu_get_features() leaks auxv_fd - ./lib/librte_eal/common/include/arch/arm/rte_cpuflags_32.h rte_cpu_get_features() leaks auxv_fd - ./lib/librte_eal/common/include/arch/arm/rte_cpuflags_64.h rte_cpu_get_features() leaks auxv_fd Affects main codebase: - Assorted false-positives and style issues reported by cppcheck - Slightly dangerous convention of memcpy(dest, source, sizeof(source)) is used extensively; while all the instances I investigated were correct, it's still more prone to mistakes under maintenance than memcpy(dest, source, sizeof(dest)). - extensive use of *malloc() wrappers that perform multiplication to determine the size to allocate; all the cases I've seen used values that should be constrained by system configurations but the habit is dangerous compared to use of *calloc() wrappers that handle integer overflow safely. - ./app/test/test_malloc.c test_reordered_free_per_lcore() has incorrect calls to is_memory_overlap() -- p2 is 16000 bytes long, not 1000 bytes long, and this size difference is not reflected in the calls. - ./app/test/test_malloc.c test_realloc() leaks ptr1, may leak ptr9 via error path - ./app/test/test_malloc.c test_realloc() the test with error "Unexpected - ptr4 != ptr3" doesn't feel like it tests an actual promise from the API - ./lib/librte_pipeline/rte_pipeline.c rte_pipeline_table_create(), rte_pipeline_port_in_create(), rte_pipeline_port_out_create(), duplicate the array of function pointers via memcpy() rather than copying a pointer to static tables -- this may represent an easy way to save memory and improve cache hit ratios as well as potentially allow storing the tables in static memory rather than on the heap, reducing the value of these structs in potential exploits. - ixgbe driver in the package is very different from the driver in the Linux kernel -- when bugs in one are found, who is in charge of copying the fixes between the two code bases? - ./lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_ethtool.c ixgbe_get_strings() takes a buffer to write into but not the buffer length; sprintf() calls may overflow the buffer if it isn't large enough. It looks like ethtool_get_strings() may use fixed-size buffer but the amount of data that gets written into it is based on both static and dynamic data. Can this overflow the buffer? - ./lib/librte_eal/linuxapp/eal/eal.c rte_eal_config_create() uses mode 0660 rather than 0600 to create /home/username/._config files, which may be too open on distributions with "staff" or "users" default user groups rather than user-specific groups. Affects examples but should still be investigated quickly: - print_stats() in examples/l2fwd-ivshmem/host/host.c uses variable total_vm_packets_dropped without initialization Affects examples: - app_config_preproc() from ./examples/ip_pipeline/config_parse.c builds a string to execute with system(). While the inputs come from the command line arguments, they might have been supplied via a safe mechanism from untrusted users, rendering them unsafe here. The snprintf() error return is ignored. The access(app->config_file) check isn't as helpful as an actual error message if something does go wrong. - main() from ./examples/kni/main.c ignores kni_alloc() return value - main() from ./examples/kni/main.c may allow port to grow beyond RTE_MAX_ETHPORTS and then aborts rather than capping the upper end of the for() loop. - ./examples/vhost/main.c txmbuf_clean_zcp(): uint32_t used_idx = vq->last_used_idx & (vq->size - 1); Use of & rather than % for modular arithmetic -- are virtuqueue sizes guaranteed to be powers of two? - Multiple places omits errno in the case the file open fails: - ./examples/ip_pipeline/config_parse.c app_config_save() - ./examples/ip_pipeline/pipeline/pipeline_common_fe.c app_run_file() - ./examples/ethtool/ethtool-app/ethapp.c pcmd_eeprom_callback() - other locations of this aren't mentioned. It's nearly impossible to solve issues without error reporting. Good error reporting saves admins time and money. === Thanks