From: Stephen Hemminger
To: dev@dpdk.org
Cc: Stephen Hemminger
Subject: [PATCH v3] tap: fix build of TAP BPF program
Date: Thu, 20 Jul 2023 16:25:49 -0700
Message-Id: <20230720232549.63619-1-stephen@networkplumber.org>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230716212544.5625-1-stephen@networkplumber.org>
References: <20230716212544.5625-1-stephen@networkplumber.org>

Move the BPF program related code into a subdirectory and add a
Makefile for building it.

The code depended on old versions of headers from iproute2.
Include those headers here so that the build works.

The standalone build was also broken by commit ef5baf3486e0
("replace packed attributes"), which introduced __rte_packed
into this code.

Add a Python program to extract the resulting BPF into
a format that can be consumed by the TAP driver.

Update the documentation.

Signed-off-by: Stephen Hemminger
---
 doc/guides/nics/tap.rst                     |  11 +-
 drivers/net/tap/bpf/.gitignore              |   1 +
 drivers/net/tap/bpf/Makefile                |  18 ++
 drivers/net/tap/bpf/bpf_api.h               | 261 ++++++++++++++++++++
 drivers/net/tap/bpf/bpf_elf.h               |  43 ++++
 drivers/net/tap/bpf/bpf_extract.py          |  80 ++++++
 drivers/net/tap/{ => bpf}/tap_bpf_program.c |   9 +-
 drivers/net/tap/tap_rss.h                   |   2 +-
 8 files changed, 413 insertions(+), 12 deletions(-)
 create mode 100644 drivers/net/tap/bpf/.gitignore
 create mode 100644 drivers/net/tap/bpf/Makefile
 create mode 100644 drivers/net/tap/bpf/bpf_api.h
 create mode 100644 drivers/net/tap/bpf/bpf_elf.h
 create mode 100644 drivers/net/tap/bpf/bpf_extract.py
 rename drivers/net/tap/{ => bpf}/tap_bpf_program.c (97%)

diff --git a/doc/guides/nics/tap.rst b/doc/guides/nics/tap.rst
index 07df0d35a2ec..449e747994bd 100644
--- a/doc/guides/nics/tap.rst
+++ b/doc/guides/nics/tap.rst
@@ -256,15 +256,12 @@ C functions under different ELF sections.
 
 2. Install ``LLVM`` library and ``clang`` compiler versions 3.7 and above
 
-3. Compile ``tap_bpf_program.c`` via ``LLVM`` into an object file::
+3. Use make to compile ``tap_bpf_program.c`` via ``LLVM`` into an object file
+   and extract the resulting instructions into ``tap_bpf_insns.h``::
 
-    clang -O2 -emit-llvm -c tap_bpf_program.c -o - | llc -march=bpf \
-filetype=obj -o
+    cd bpf; make
 
-
-4. Use a tool that receives two parameters: an eBPF object file and a section
-name, and prints out the section as a C array of eBPF instructions.
-Embed the C array in your TAP PMD tree.
+4. Recompile the TAP PMD.
 
 The C arrays are uploaded to the kernel using BPF system calls.
 
diff --git a/drivers/net/tap/bpf/.gitignore b/drivers/net/tap/bpf/.gitignore
new file mode 100644
index 000000000000..30a258f1af3b
--- /dev/null
+++ b/drivers/net/tap/bpf/.gitignore
@@ -0,0 +1 @@
+tap_bpf_program.o
diff --git a/drivers/net/tap/bpf/Makefile b/drivers/net/tap/bpf/Makefile
new file mode 100644
index 000000000000..e5ae4e1f5adc
--- /dev/null
+++ b/drivers/net/tap/bpf/Makefile
@@ -0,0 +1,18 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# This file is not built as part of normal DPDK build.
+# It is used to generate the eBPF code for TAP RSS.
+CLANG=clang
+CLANG_OPTS=-O2
+TARGET=../tap_bpf_insns.h
+
+all: $(TARGET)
+
+clean:
+	rm tap_bpf_program.o $(TARGET)
+
+tap_bpf_program.o: tap_bpf_program.c
+	$(CLANG) $(CLANG_OPTS) -emit-llvm -c $< -o - | \
+	llc -march=bpf -filetype=obj -o $@
+
+$(TARGET): bpf_extract.py tap_bpf_program.o
+	python3 bpf_extract.py tap_bpf_program.o $@
diff --git a/drivers/net/tap/bpf/bpf_api.h b/drivers/net/tap/bpf/bpf_api.h
new file mode 100644
index 000000000000..d13247199c9a
--- /dev/null
+++ b/drivers/net/tap/bpf/bpf_api.h
@@ -0,0 +1,261 @@
+#ifndef __BPF_API__
+#define __BPF_API__
+
+/* Note:
+ *
+ * This file can be included into eBPF kernel programs. It contains
+ * a couple of useful helper functions, map/section ABI (bpf_elf.h),
+ * misc macros and some eBPF specific LLVM built-ins.
+ */
+
+#include
+
+#include
+#include
+#include
+
+#include
+
+#include "bpf_elf.h"
+
+/** Misc macros. */
+
+#ifndef __stringify
+# define __stringify(X)	#X
+#endif
+
+#ifndef __maybe_unused
+# define __maybe_unused	__attribute__((__unused__))
+#endif
+
+#ifndef offsetof
+# define offsetof(TYPE, MEMBER)	__builtin_offsetof(TYPE, MEMBER)
+#endif
+
+#ifndef likely
+# define likely(X)	__builtin_expect(!!(X), 1)
+#endif
+
+#ifndef unlikely
+# define unlikely(X)	__builtin_expect(!!(X), 0)
+#endif
+
+#ifndef htons
+# define htons(X)	__constant_htons((X))
+#endif
+
+#ifndef ntohs
+# define ntohs(X)	__constant_ntohs((X))
+#endif
+
+#ifndef htonl
+# define htonl(X)	__constant_htonl((X))
+#endif
+
+#ifndef ntohl
+# define ntohl(X)	__constant_ntohl((X))
+#endif
+
+#ifndef __inline__
+# define __inline__	__attribute__((always_inline))
+#endif
+
+/** Section helper macros. */
+
+#ifndef __section
+# define __section(NAME)	\
+	__attribute__((section(NAME), used))
+#endif
+
+#ifndef __section_tail
+# define __section_tail(ID, KEY)	\
+	__section(__stringify(ID) "/" __stringify(KEY))
+#endif
+
+#ifndef __section_xdp_entry
+# define __section_xdp_entry	\
+	__section(ELF_SECTION_PROG)
+#endif
+
+#ifndef __section_cls_entry
+# define __section_cls_entry	\
+	__section(ELF_SECTION_CLASSIFIER)
+#endif
+
+#ifndef __section_act_entry
+# define __section_act_entry	\
+	__section(ELF_SECTION_ACTION)
+#endif
+
+#ifndef __section_lwt_entry
+# define __section_lwt_entry	\
+	__section(ELF_SECTION_PROG)
+#endif
+
+#ifndef __section_license
+# define __section_license	\
+	__section(ELF_SECTION_LICENSE)
+#endif
+
+#ifndef __section_maps
+# define __section_maps	\
+	__section(ELF_SECTION_MAPS)
+#endif
+
+/** Declaration helper macros. */
+
+#ifndef BPF_LICENSE
+# define BPF_LICENSE(NAME)	\
+	char ____license[] __section_license = NAME
+#endif
+
+/** Classifier helper */
+
+#ifndef BPF_H_DEFAULT
+# define BPF_H_DEFAULT	-1
+#endif
+
+/** BPF helper functions for tc. Individual flags are in linux/bpf.h */
+
+#ifndef __BPF_FUNC
+# define __BPF_FUNC(NAME, ...)	\
+	(* NAME)(__VA_ARGS__) __maybe_unused
+#endif
+
+#ifndef BPF_FUNC
+# define BPF_FUNC(NAME, ...)	\
+	__BPF_FUNC(NAME, __VA_ARGS__) = (void *) BPF_FUNC_##NAME
+#endif
+
+/* Map access/manipulation */
+static void *BPF_FUNC(map_lookup_elem, void *map, const void *key);
+static int BPF_FUNC(map_update_elem, void *map, const void *key,
+		    const void *value, uint32_t flags);
+static int BPF_FUNC(map_delete_elem, void *map, const void *key);
+
+/* Time access */
+static uint64_t BPF_FUNC(ktime_get_ns);
+
+/* Debugging */
+
+/* FIXME: __attribute__ ((format(printf, 1, 3))) not possible unless
+ * llvm bug https://llvm.org/bugs/show_bug.cgi?id=26243 gets resolved.
+ * It would require ____fmt to be made const, which generates a reloc
+ * entry (non-map).
+ */
+static void BPF_FUNC(trace_printk, const char *fmt, int fmt_size, ...);
+
+#ifndef printt
+# define printt(fmt, ...)	\
+	({	\
+		char ____fmt[] = fmt;	\
+		trace_printk(____fmt, sizeof(____fmt), ##__VA_ARGS__);	\
+	})
+#endif
+
+/* Random numbers */
+static uint32_t BPF_FUNC(get_prandom_u32);
+
+/* Tail calls */
+static void BPF_FUNC(tail_call, struct __sk_buff *skb, void *map,
+		     uint32_t index);
+
+/* System helpers */
+static uint32_t BPF_FUNC(get_smp_processor_id);
+static uint32_t BPF_FUNC(get_numa_node_id);
+
+/* Packet misc meta data */
+static uint32_t BPF_FUNC(get_cgroup_classid, struct __sk_buff *skb);
+static int BPF_FUNC(skb_under_cgroup, void *map, uint32_t index);
+
+static uint32_t BPF_FUNC(get_route_realm, struct __sk_buff *skb);
+static uint32_t BPF_FUNC(get_hash_recalc, struct __sk_buff *skb);
+static uint32_t BPF_FUNC(set_hash_invalid, struct __sk_buff *skb);
+
+/* Packet redirection */
+static int BPF_FUNC(redirect, int ifindex, uint32_t flags);
+static int BPF_FUNC(clone_redirect, struct __sk_buff *skb, int ifindex,
+		    uint32_t flags);
+
+/* Packet manipulation */
+static int BPF_FUNC(skb_load_bytes, struct __sk_buff *skb, uint32_t off,
+		    void *to, uint32_t len);
+static int BPF_FUNC(skb_store_bytes, struct __sk_buff *skb, uint32_t off,
+		    const void *from, uint32_t len, uint32_t flags);
+
+static int BPF_FUNC(l3_csum_replace, struct __sk_buff *skb, uint32_t off,
+		    uint32_t from, uint32_t to, uint32_t flags);
+static int BPF_FUNC(l4_csum_replace, struct __sk_buff *skb, uint32_t off,
+		    uint32_t from, uint32_t to, uint32_t flags);
+static int BPF_FUNC(csum_diff, const void *from, uint32_t from_size,
+		    const void *to, uint32_t to_size, uint32_t seed);
+static int BPF_FUNC(csum_update, struct __sk_buff *skb, uint32_t wsum);
+
+static int BPF_FUNC(skb_change_type, struct __sk_buff *skb, uint32_t type);
+static int BPF_FUNC(skb_change_proto, struct __sk_buff *skb, uint32_t proto,
+		    uint32_t flags);
+static int BPF_FUNC(skb_change_tail, struct __sk_buff *skb, uint32_t nlen,
+		    uint32_t flags);
+
+static int BPF_FUNC(skb_pull_data, struct __sk_buff *skb, uint32_t len);
+
+/* Event notification */
+static int __BPF_FUNC(skb_event_output, struct __sk_buff *skb, void *map,
+		      uint64_t index, const void *data, uint32_t size) =
+		      (void *) BPF_FUNC_perf_event_output;
+
+/* Packet vlan encap/decap */
+static int BPF_FUNC(skb_vlan_push, struct __sk_buff *skb, uint16_t proto,
+		    uint16_t vlan_tci);
+static int BPF_FUNC(skb_vlan_pop, struct __sk_buff *skb);
+
+/* Packet tunnel encap/decap */
+static int BPF_FUNC(skb_get_tunnel_key, struct __sk_buff *skb,
+		    struct bpf_tunnel_key *to, uint32_t size, uint32_t flags);
+static int BPF_FUNC(skb_set_tunnel_key, struct __sk_buff *skb,
+		    const struct bpf_tunnel_key *from, uint32_t size,
+		    uint32_t flags);
+
+static int BPF_FUNC(skb_get_tunnel_opt, struct __sk_buff *skb,
+		    void *to, uint32_t size);
+static int BPF_FUNC(skb_set_tunnel_opt, struct __sk_buff *skb,
+		    const void *from, uint32_t size);
+
+/** LLVM built-ins, mem*() routines work for constant size */
+
+#ifndef lock_xadd
+# define lock_xadd(ptr, val)	((void) __sync_fetch_and_add(ptr, val))
+#endif
+
+#ifndef memset
+# define memset(s, c, n)	__builtin_memset((s), (c), (n))
+#endif
+
+#ifndef memcpy
+# define memcpy(d, s, n)	__builtin_memcpy((d), (s), (n))
+#endif
+
+#ifndef memmove
+# define memmove(d, s, n)	__builtin_memmove((d), (s), (n))
+#endif
+
+/* FIXME: __builtin_memcmp() is not yet fully useable unless llvm bug
+ * https://llvm.org/bugs/show_bug.cgi?id=26218 gets resolved. Also
+ * this one would generate a reloc entry (non-map), otherwise.
+ */
+#if 0
+#ifndef memcmp
+# define memcmp(a, b, n)	__builtin_memcmp((a), (b), (n))
+#endif
+#endif
+
+unsigned long long load_byte(void *skb, unsigned long long off)
+	asm ("llvm.bpf.load.byte");
+
+unsigned long long load_half(void *skb, unsigned long long off)
+	asm ("llvm.bpf.load.half");
+
+unsigned long long load_word(void *skb, unsigned long long off)
+	asm ("llvm.bpf.load.word");
+
+#endif /* __BPF_API__ */
diff --git a/drivers/net/tap/bpf/bpf_elf.h b/drivers/net/tap/bpf/bpf_elf.h
new file mode 100644
index 000000000000..406c30874ac3
--- /dev/null
+++ b/drivers/net/tap/bpf/bpf_elf.h
@@ -0,0 +1,43 @@
+#ifndef __BPF_ELF__
+#define __BPF_ELF__
+
+#include
+
+/* Note:
+ *
+ * Below ELF section names and bpf_elf_map structure definition
+ * are not (!) kernel ABI. It's rather a "contract" between the
+ * application and the BPF loader in tc. For compatibility, the
+ * section names should stay as-is. Introduction of aliases, if
+ * needed, are a possibility, though.
+ */
+
+/* ELF section names, etc */
+#define ELF_SECTION_LICENSE	"license"
+#define ELF_SECTION_MAPS	"maps"
+#define ELF_SECTION_PROG	"prog"
+#define ELF_SECTION_CLASSIFIER	"classifier"
+#define ELF_SECTION_ACTION	"action"
+
+#define ELF_MAX_MAPS	64
+#define ELF_MAX_LICENSE_LEN	128
+
+/* Object pinning settings */
+#define PIN_NONE	0
+#define PIN_OBJECT_NS	1
+#define PIN_GLOBAL_NS	2
+
+/* ELF map definition */
+struct bpf_elf_map {
+	__u32 type;
+	__u32 size_key;
+	__u32 size_value;
+	__u32 max_elem;
+	__u32 flags;
+	__u32 id;
+	__u32 pinning;
+	__u32 inner_id;
+	__u32 inner_idx;
+};
+
+#endif /* __BPF_ELF__ */
diff --git a/drivers/net/tap/bpf/bpf_extract.py b/drivers/net/tap/bpf/bpf_extract.py
new file mode 100644
index 000000000000..d79fc61020b3
--- /dev/null
+++ b/drivers/net/tap/bpf/bpf_extract.py
@@ -0,0 +1,80 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright (c) 2023 Stephen Hemminger
+
+import argparse
+import sys
+import struct
+from tempfile import TemporaryFile
+from elftools.elf.elffile import ELFFile
+
+
+def load_sections(elffile):
+    result = []
+    DATA = [("cls_q", "cls_q_insns"), ("l3_l4", "l3_l4_hash_insns")]
+    for name, tag in DATA:
+        section = elffile.get_section_by_name(name)
+        if section:
+            insns = struct.iter_unpack('> 4
+            dst = bpf[1] & 0xf
+            off = bpf[2]
+            imm = bpf[3]
+            print('\t{{{:#02x}, {:4d}, {:4d}, {:8d}, {:#010x}}},'.format(
+                code, dst, src, off, imm),
+                  file=out)
+        print('};', file=out)
+
+
+def parse_args():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("input",
+                        nargs='+',
+                        help="input object file path or '-' for stdin")
+    parser.add_argument("output", help="output C file path or '-' for stdout")
+    return parser.parse_args()
+
+
+def open_input(path):
+    if path == "-":
+        temp = TemporaryFile()
+        temp.write(sys.stdin.buffer.read())
+        return temp
+    return open(path, "rb")
+
+
+def open_output(path):
+    if path == "-":
+        return sys.stdout
+    return open(path, "w")
+
+
+def write_header(output):
+    print("/* SPDX-License-Identifier: BSD-3-Clause", file=output)
+    print(" * Compiled BPF instructions do not edit", file=output)
+    print(" */\n", file=output)
+    print("#include ", file=output)
+
+
+def main():
+    args = parse_args()
+
+    output = open_output(args.output)
+    write_header(output)
+    for path in args.input:
+        elffile = ELFFile(open_input(path))
+        sections = load_sections(elffile)
+        dump_sections(sections, output)
+
+
+if __name__ == "__main__":
+    main()
diff --git a/drivers/net/tap/tap_bpf_program.c b/drivers/net/tap/bpf/tap_bpf_program.c
similarity index 97%
rename from drivers/net/tap/tap_bpf_program.c
rename to drivers/net/tap/bpf/tap_bpf_program.c
index 20c310e5e7ba..ff6f1606fb38 100644
--- a/drivers/net/tap/tap_bpf_program.c
+++ b/drivers/net/tap/bpf/tap_bpf_program.c
@@ -14,9 +14,10 @@
 #include
 #include
 #include
-#include
 
-#include "tap_rss.h"
+#include "bpf_api.h"
+#include "bpf_elf.h"
+#include "../tap_rss.h"
 
 /** Create IPv4 address */
 #define IPv4(a, b, c, d) ((__u32)(((a) & 0xff) << 24) | \
@@ -75,14 +76,14 @@ struct ipv4_l3_l4_tuple {
 	__u32 dst_addr;
 	__u16 dport;
 	__u16 sport;
-} __rte_packed;
+} __attribute__((packed));
 
 struct ipv6_l3_l4_tuple {
 	__u8 src_addr[16];
 	__u8 dst_addr[16];
 	__u16 dport;
 	__u16 sport;
-} __rte_packed;
+} __attribute__((packed));
 
 static const __u8 def_rss_key[TAP_RSS_HASH_KEY_SIZE] = {
 	0xd1, 0x81, 0xc6, 0x2c,
diff --git a/drivers/net/tap/tap_rss.h b/drivers/net/tap/tap_rss.h
index 48c151cf6b68..dff46a012f94 100644
--- a/drivers/net/tap/tap_rss.h
+++ b/drivers/net/tap/tap_rss.h
@@ -35,6 +35,6 @@ struct rss_key {
 	__u32 key_size;
 	__u32 queues[TAP_MAX_QUEUES];
 	__u32 nb_queues;
-} __rte_packed;
+} __attribute__((packed));
 
 #endif /* _TAP_RSS_H_ */
-- 
2.39.2
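
Note for readers of the extraction step: the sketch below illustrates the kind of decoding bpf_extract.py performs, without needing an ELF file or pyelftools. It is an illustration only and not necessarily what the script in this patch does. It assumes the standard 8-byte eBPF instruction layout on a little-endian target. The names decode(), as_c_rows() and SAMPLE are hypothetical, and the sample bytes are hand-assembled rather than taken from tap_bpf_program.o.

```python
import struct

# Assumption: standard 8-byte eBPF instruction layout (struct bpf_insn in
# linux/bpf.h) on a little-endian target: u8 opcode, u8 packed registers
# (dst_reg in the low nibble, src_reg in the high nibble), s16 offset,
# s32 immediate.
INSN = struct.Struct("<BBhi")


def decode(raw):
    """Yield (code, dst, src, off, imm) for each 8-byte instruction in raw."""
    for code, regs, off, imm in INSN.iter_unpack(raw):
        yield code, regs & 0xF, regs >> 4, off, imm


def as_c_rows(raw):
    """Format decoded instructions as C initializer rows for struct bpf_insn."""
    return [
        "\t{{{:#04x}, {:4d}, {:4d}, {:8d}, {:#010x}}},".format(
            code, dst, src, off, imm & 0xFFFFFFFF)  # show imm as unsigned hex
        for code, dst, src, off, imm in decode(raw)
    ]


# Hypothetical hand-assembled sample: "r0 = 1" (opcode 0xb7) then "exit" (0x95).
SAMPLE = bytes.fromhex("b700000001000000" "9500000000000000")

if __name__ == "__main__":
    for row in as_c_rows(SAMPLE):
        print(row)
```

The rows list dst before src because that is the field order of struct bpf_insn (code, dst_reg, src_reg, off, imm), so each row can be pasted into a C array initializer as-is.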