DPDK patches and discussions
* [dpdk-dev]  [PATCH 0/8] eBPF arm64 JIT support
@ 2019-09-03 10:59 jerinj
  2019-09-03 10:59 ` [dpdk-dev] [PATCH 1/8] bpf/arm64: add build infrastructure jerinj
                   ` (9 more replies)
  0 siblings, 10 replies; 32+ messages in thread
From: jerinj @ 2019-09-03 10:59 UTC
  To: dev
  Cc: konstantin.ananyev, honnappa.nagarahalli, thomas, gavin.hu, Jerin Jacob

From: Jerin Jacob <jerinj@marvell.com>

Add eBPF arm64 JIT support to improve eBPF program performance
on arm64.

The dpdk.org/examples/bpf/t1.c application shows around a 50% improvement
on the OCTEON TX2 platform in JIT mode compared with interpreter mode.

Verified the implementation using the existing bpf_autotest application.

# echo "bpf_autotest" | sudo ./build/app/test -c 0x3

RTE>>bpf_autotest
run_test(test_store1) start
run_test(test_store2) start
run_test(test_load1) start
run_test(test_ldimm1) start
run_test(test_mul1) start
run_test(test_shift1) start
run_test(test_jump1) start
run_test(test_alu1) start
run_test(test_bele1) start
run_test(test_xadd1) start
run_test(test_div1) start
bpf_exec(0xffffa37c0000): division by 0 at pc: 0x68;
run_test(test_call1) start
run_test(test_call2) start
run_test(test_call3) start
Test OK

Jerin Jacob (8):
  bpf/arm64: add build infrastructure
  bpf/arm64: add prologue and epilogue
  bpf/arm64: add basic arithmetic operations
  bpf/arm64: add logical operations
  bpf/arm64: add byte swap operations
  bpf/arm64: add load and store operations
  bpf/arm64: add atomic-exchange-and-add operation
  bpf/arm64: add branch operation

 MAINTAINERS                            |    1 +
 doc/guides/prog_guide/bpf_lib.rst      |    2 +-
 doc/guides/rel_notes/release_19_11.rst |    5 +
 lib/librte_bpf/Makefile                |    2 +
 lib/librte_bpf/bpf.c                   |    4 +-
 lib/librte_bpf/bpf_impl.h              |    3 +-
 lib/librte_bpf/bpf_jit_arm64.c         | 1451 ++++++++++++++++++++++++
 lib/librte_bpf/meson.build             |    2 +
 8 files changed, 1466 insertions(+), 4 deletions(-)
 create mode 100644 lib/librte_bpf/bpf_jit_arm64.c

-- 
2.23.0



* [dpdk-dev]  [PATCH 1/8] bpf/arm64: add build infrastructure
  2019-09-03 10:59 [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support jerinj
@ 2019-09-03 10:59 ` jerinj
  2019-09-03 10:59 ` [dpdk-dev] [PATCH 2/8] bpf/arm64: add prologue and epilogue jerinj
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 32+ messages in thread
From: jerinj @ 2019-09-03 10:59 UTC
  To: dev
  Cc: konstantin.ananyev, honnappa.nagarahalli, thomas, gavin.hu, Jerin Jacob

From: Jerin Jacob <jerinj@marvell.com>

Add build infrastructure and documentation
updates for arm64 JIT support.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
 MAINTAINERS                            |  1 +
 doc/guides/prog_guide/bpf_lib.rst      |  2 +-
 doc/guides/rel_notes/release_19_11.rst |  5 +++++
 lib/librte_bpf/Makefile                |  2 ++
 lib/librte_bpf/bpf.c                   |  4 +++-
 lib/librte_bpf/bpf_impl.h              |  3 +--
 lib/librte_bpf/bpf_jit_arm64.c         | 19 +++++++++++++++++++
 lib/librte_bpf/meson.build             |  2 ++
 8 files changed, 34 insertions(+), 4 deletions(-)
 create mode 100644 lib/librte_bpf/bpf_jit_arm64.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 410026086..c2e91343c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -252,6 +252,7 @@ M: Gavin Hu <gavin.hu@arm.com>
 F: lib/librte_eal/common/include/arch/arm/*_64.h
 F: lib/librte_net/net_crc_neon.h
 F: lib/librte_acl/acl_run_neon.*
+F: lib/librte_bpf/bpf_jit_arm64.c
 F: lib/librte_lpm/rte_lpm_neon.h
 F: lib/librte_hash/rte*_arm64.h
 F: lib/librte_efd/rte*_arm64.h
diff --git a/doc/guides/prog_guide/bpf_lib.rst b/doc/guides/prog_guide/bpf_lib.rst
index 7c08e6b2d..9c728da7b 100644
--- a/doc/guides/prog_guide/bpf_lib.rst
+++ b/doc/guides/prog_guide/bpf_lib.rst
@@ -30,7 +30,7 @@ The library API provides the following basic operations:
 Not currently supported eBPF features
 -------------------------------------
 
- - JIT for non X86_64 platforms
+ - JIT support only available for X86_64 and arm64 platforms
  - cBPF
  - tail-pointer call
  - eBPF MAP
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index 8490d897c..a0a92b8ae 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -56,6 +56,11 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Added eBPF JIT support for arm64.**
+
+  Added eBPF JIT support for arm64 architecture to improve the eBPF program
+  performance.
+
 
 Removed Items
 -------------
diff --git a/lib/librte_bpf/Makefile b/lib/librte_bpf/Makefile
index c0e8aaa68..419a5162e 100644
--- a/lib/librte_bpf/Makefile
+++ b/lib/librte_bpf/Makefile
@@ -31,6 +31,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_BPF) += bpf_load_elf.c
 endif
 ifeq ($(CONFIG_RTE_ARCH_X86_64),y)
 SRCS-$(CONFIG_RTE_LIBRTE_BPF) += bpf_jit_x86.c
+else ifeq ($(CONFIG_RTE_ARCH_ARM64),y)
+SRCS-$(CONFIG_RTE_LIBRTE_BPF) += bpf_jit_arm64.c
 endif
 
 # install header files
diff --git a/lib/librte_bpf/bpf.c b/lib/librte_bpf/bpf.c
index cc963d52e..7e1879ffa 100644
--- a/lib/librte_bpf/bpf.c
+++ b/lib/librte_bpf/bpf.c
@@ -41,8 +41,10 @@ bpf_jit(struct rte_bpf *bpf)
 {
 	int32_t rc;
 
-#ifdef RTE_ARCH_X86_64
+#if defined(RTE_ARCH_X86_64)
 	rc = bpf_jit_x86(bpf);
+#elif defined(RTE_ARCH_ARM64)
+	rc = bpf_jit_arm64(bpf);
 #else
 	rc = -ENOTSUP;
 #endif
diff --git a/lib/librte_bpf/bpf_impl.h b/lib/librte_bpf/bpf_impl.h
index b577e2cbe..03ba0ae11 100644
--- a/lib/librte_bpf/bpf_impl.h
+++ b/lib/librte_bpf/bpf_impl.h
@@ -25,9 +25,8 @@ extern int bpf_validate(struct rte_bpf *bpf);
 
 extern int bpf_jit(struct rte_bpf *bpf);
 
-#ifdef RTE_ARCH_X86_64
 extern int bpf_jit_x86(struct rte_bpf *);
-#endif
+extern int bpf_jit_arm64(struct rte_bpf *);
 
 extern int rte_bpf_logtype;
 
diff --git a/lib/librte_bpf/bpf_jit_arm64.c b/lib/librte_bpf/bpf_jit_arm64.c
new file mode 100644
index 000000000..621bb7f46
--- /dev/null
+++ b/lib/librte_bpf/bpf_jit_arm64.c
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ */
+
+#include <errno.h>
+
+#include <rte_common.h>
+
+#include "bpf_impl.h"
+/*
+ * Produce a native ISA version of the given BPF code.
+ */
+int
+bpf_jit_arm64(struct rte_bpf *bpf)
+{
+	RTE_SET_USED(bpf);
+
+	return -ENOTSUP;
+}
diff --git a/lib/librte_bpf/meson.build b/lib/librte_bpf/meson.build
index 11c1fb558..13fc02db3 100644
--- a/lib/librte_bpf/meson.build
+++ b/lib/librte_bpf/meson.build
@@ -10,6 +10,8 @@ sources = files('bpf.c',
 
 if arch_subdir == 'x86' and dpdk_conf.get('RTE_ARCH_64')
 	sources += files('bpf_jit_x86.c')
+elif dpdk_conf.has('RTE_ARCH_ARM64')
+	sources += files('bpf_jit_arm64.c')
 endif
 
 install_headers = files('bpf_def.h',
-- 
2.23.0



* [dpdk-dev]  [PATCH 2/8] bpf/arm64: add prologue and epilogue
  2019-09-03 10:59 [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support jerinj
  2019-09-03 10:59 ` [dpdk-dev] [PATCH 1/8] bpf/arm64: add build infrastructure jerinj
@ 2019-09-03 10:59 ` jerinj
  2019-09-03 10:59 ` [dpdk-dev] [PATCH 3/8] bpf/arm64: add basic arithmetic operations jerinj
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 32+ messages in thread
From: jerinj @ 2019-09-03 10:59 UTC
  To: dev
  Cc: konstantin.ananyev, honnappa.nagarahalli, thomas, gavin.hu, Jerin Jacob

From: Jerin Jacob <jerinj@marvell.com>

Add prologue and epilogue as per the arm64 procedure call standard.

As an optimization, the generated instructions depend on
whether the eBPF program uses the stack and/or contains a
CALL class instruction.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
 lib/librte_bpf/bpf_jit_arm64.c | 466 ++++++++++++++++++++++++++++++++-
 1 file changed, 464 insertions(+), 2 deletions(-)
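
Note (not part of the patch): a minimal sketch of how two of the emit
helpers in this patch pack their fields, assuming the same bit layout as
emit_add_sub_imm() and emit_ls_pair_64(). The sk_* names are hypothetical
and exist only for this illustration; they return the instruction word
instead of writing it into a JIT context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_FP 29 /* x29, frame pointer */
#define SK_LR 30 /* x30, link register */
#define SK_SP 31 /* register number 31 selects SP in these encodings */

/* ADD/SUB (immediate), shift == 0: sf | op | 0x11 << 24 | imm12 | Rn | Rd */
static uint32_t
sk_add_sub_imm(bool is64, bool sub, uint8_t rd, uint8_t rn, uint16_t imm12)
{
	assert(rd < 32 && rn < 32 && imm12 < (1u << 12));

	return ((uint32_t)is64 << 31) | ((uint32_t)sub << 30) | 0x11000000u |
	       ((uint32_t)imm12 << 10) | ((uint32_t)rn << 5) | rd;
}

/* 64-bit STP/LDP of a register pair, moving the base by -16 or +16 bytes */
static uint32_t
sk_ls_pair_64(uint8_t rt, uint8_t rt2, uint8_t rn, bool push, bool load,
	      bool pre_index)
{
	/* imm7 is scaled by 8: 0x7e is -2 (-16 bytes), 0x2 is +2 (+16 bytes) */
	uint32_t imm7 = push ? 0x7e : 0x2;

	assert(rt < 32 && rt2 < 32 && rn < 32);

	return 0xa8800000u | ((uint32_t)load << 22) |
	       ((uint32_t)pre_index << 24) | (imm7 << 15) |
	       ((uint32_t)rt2 << 10) | ((uint32_t)rn << 5) | rt;
}
```

Under these assumptions, sk_ls_pair_64(SK_FP, SK_LR, SK_SP, true, false,
true) yields 0xa9bf7bfd (stp x29, x30, [sp, #-16]!) and
sk_add_sub_imm(true, false, SK_FP, SK_SP, 0) yields 0x910003fd
(mov x29, sp), the first two instructions of the has-call prologue.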

diff --git a/lib/librte_bpf/bpf_jit_arm64.c b/lib/librte_bpf/bpf_jit_arm64.c
index 621bb7f46..548408a61 100644
--- a/lib/librte_bpf/bpf_jit_arm64.c
+++ b/lib/librte_bpf/bpf_jit_arm64.c
@@ -3,17 +3,479 @@
  */
 
 #include <errno.h>
+#include <stdbool.h>
 
 #include <rte_common.h>
 
 #include "bpf_impl.h"
+
+#define A64_REG_MASK(r)		((r) & 0x1f)
+#define A64_INVALID_OP_CODE	(0xffffffff)
+
+#define TMP_REG_1		(EBPF_REG_10 + 1)
+#define TMP_REG_2		(EBPF_REG_10 + 2)
+#define TMP_REG_3		(EBPF_REG_10 + 3)
+
+#define EBPF_FP			(EBPF_REG_10)
+#define EBPF_OP_GET(op)		(BPF_OP(op) >> 4)
+
+#define A64_R(x)		x
+#define A64_FP			29
+#define A64_LR			30
+#define A64_SP			31
+#define A64_ZR			31
+
+#define check_imm(n, val) (((val) >= 0) ? !!((val) >> (n)) : !!((~val) >> (n)))
+#define mask_imm(n, val) ((val) & ((1 << (n)) - 1))
+
+struct ebpf_a64_map {
+	uint32_t off; /* eBPF to arm64 insn offset mapping for jump */
+	uint8_t off_to_b; /* Offset to branch instruction delta */
+};
+
+struct a64_jit_ctx {
+	size_t stack_sz;          /* Stack size */
+	uint32_t *ins;            /* ARM64 instructions. NULL if first pass */
+	struct ebpf_a64_map *map; /* eBPF to arm64 insn mapping for jump */
+	uint32_t idx;             /* Current instruction index */
+	uint32_t program_start;   /* Program index, Just after prologue */
+	uint32_t program_sz;      /* Program size. Found in first pass */
+	uint8_t foundcall;        /* Found EBPF_CALL class code in eBPF pgm */
+};
+
+static int
+check_reg(uint8_t r)
+{
+	return (r > 31) ? 1 : 0;
+}
+
+static int
+is_first_pass(struct a64_jit_ctx *ctx)
+{
+	return (ctx->ins == NULL);
+}
+
+static int
+check_invalid_args(struct a64_jit_ctx *ctx, uint32_t limit)
+{
+	uint32_t idx;
+
+	if (is_first_pass(ctx))
+		return 0;
+
+	for (idx = 0; idx < limit; idx++) {
+		if (rte_le_to_cpu_32(ctx->ins[idx]) == A64_INVALID_OP_CODE) {
+			RTE_BPF_LOG(ERR,
+				"%s: invalid opcode at %u;\n", __func__, idx);
+			return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+/* Emit an instruction */
+static inline void
+emit_insn(struct a64_jit_ctx *ctx, uint32_t insn, int error)
+{
+	if (error)
+		insn = A64_INVALID_OP_CODE;
+
+	if (ctx->ins)
+		ctx->ins[ctx->idx] = rte_cpu_to_le_32(insn);
+
+	ctx->idx++;
+}
+
+static void
+emit_ret(struct a64_jit_ctx *ctx)
+{
+	emit_insn(ctx, 0xd65f03c0, 0);
+}
+
+static void
+emit_add_sub_imm(struct a64_jit_ctx *ctx, bool is64, bool sub, uint8_t rd,
+		 uint8_t rn, int16_t imm12)
+{
+	uint32_t insn, imm;
+
+	imm = mask_imm(12, imm12);
+	insn = (!!is64) << 31;
+	insn |= (!!sub) << 30;
+	insn |= 0x11000000;
+	insn |= rd;
+	insn |= rn << 5;
+	insn |= imm << 10;
+
+	emit_insn(ctx, insn,
+		  check_reg(rd) || check_reg(rn) || check_imm(12, imm12));
+}
+
+static void
+emit_add_imm_64(struct a64_jit_ctx *ctx, uint8_t rd, uint8_t rn, uint16_t imm12)
+{
+	emit_add_sub_imm(ctx, 1, 0, rd, rn, imm12);
+}
+
+static void
+emit_sub_imm_64(struct a64_jit_ctx *ctx, uint8_t rd, uint8_t rn, uint16_t imm12)
+{
+	emit_add_sub_imm(ctx, 1, 1, rd, rn, imm12);
+}
+
+static void
+emit_mov(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rn)
+{
+	emit_add_sub_imm(ctx, is64, 0, rd, rn, 0);
+}
+
+static void
+emit_mov_64(struct a64_jit_ctx *ctx, uint8_t rd, uint8_t rn)
+{
+	emit_mov(ctx, 1, rd, rn);
+}
+
+static void
+emit_ls_pair_64(struct a64_jit_ctx *ctx, uint8_t rt, uint8_t rt2, uint8_t rn,
+		bool push, bool load, bool pre_index)
+{
+	uint32_t insn;
+
+	insn = (!!load) << 22;
+	insn |= (!!pre_index) << 24;
+	insn |= 0xa8800000;
+	insn |= rt;
+	insn |= rn << 5;
+	insn |= rt2 << 10;
+	if (push)
+		insn |= 0x7e << 15; /* 0x7e means -2 with imm7 */
+	else
+		insn |= 0x2 << 15;
+
+	emit_insn(ctx, insn, check_reg(rn) || check_reg(rt) || check_reg(rt2));
+
+}
+
+/* Emit stp rt, rt2, [sp, #-16]! */
+static void
+emit_stack_push(struct a64_jit_ctx *ctx, uint8_t rt, uint8_t rt2)
+{
+	emit_ls_pair_64(ctx, rt, rt2, A64_SP, 1, 0, 1);
+}
+
+/* Emit ldp rt, rt2, [sp], #16 */
+static void
+emit_stack_pop(struct a64_jit_ctx *ctx, uint8_t rt, uint8_t rt2)
+{
+	emit_ls_pair_64(ctx, rt, rt2, A64_SP, 0, 1, 0);
+}
+
+static uint8_t
+ebpf_to_a64_reg(struct a64_jit_ctx *ctx, uint8_t reg)
+{
+	const uint32_t ebpf2a64_has_call[] = {
+		/* Map A64 R7 register as EBPF return register */
+		[EBPF_REG_0] = A64_R(7),
+		/* Map A64 arguments register as EBPF arguments register */
+		[EBPF_REG_1] = A64_R(0),
+		[EBPF_REG_2] = A64_R(1),
+		[EBPF_REG_3] = A64_R(2),
+		[EBPF_REG_4] = A64_R(3),
+		[EBPF_REG_5] = A64_R(4),
+		/* Map A64 callee save register as EBPF callee save register */
+		[EBPF_REG_6] = A64_R(19),
+		[EBPF_REG_7] = A64_R(20),
+		[EBPF_REG_8] = A64_R(21),
+		[EBPF_REG_9] = A64_R(22),
+		[EBPF_FP]    = A64_R(25),
+		/* Map A64 scratch registers as temporary storage */
+		[TMP_REG_1] = A64_R(9),
+		[TMP_REG_2] = A64_R(10),
+		[TMP_REG_3] = A64_R(11),
+	};
+
+	const uint32_t ebpf2a64_no_call[] = {
+		/* Map A64 R7 register as EBPF return register */
+		[EBPF_REG_0] = A64_R(7),
+		/* Map A64 arguments register as EBPF arguments register */
+		[EBPF_REG_1] = A64_R(0),
+		[EBPF_REG_2] = A64_R(1),
+		[EBPF_REG_3] = A64_R(2),
+		[EBPF_REG_4] = A64_R(3),
+		[EBPF_REG_5] = A64_R(4),
+		/*
+		 * EBPF program does not have EBPF_CALL op code,
+		 * Map A64 scratch registers as EBPF callee save registers.
+		 */
+		[EBPF_REG_6] = A64_R(9),
+		[EBPF_REG_7] = A64_R(10),
+		[EBPF_REG_8] = A64_R(11),
+		[EBPF_REG_9] = A64_R(12),
+		/* Map A64 FP register as EBPF FP register */
+		[EBPF_FP]    = A64_FP,
+		/* Map remaining A64 scratch registers as temporary storage */
+		[TMP_REG_1] = A64_R(13),
+		[TMP_REG_2] = A64_R(14),
+		[TMP_REG_3] = A64_R(15),
+	};
+
+	if (ctx->foundcall)
+		return ebpf2a64_has_call[reg];
+	else
+		return ebpf2a64_no_call[reg];
+}
+
+/*
+ * Procedure call standard for the arm64
+ * -------------------------------------
+ * R0..R7  - Parameter/result registers
+ * R8      - Indirect result location register
+ * R9..R15 - Scratch registers
+ * R16     - First intra-procedure-call scratch register
+ * R17     - Second intra-procedure-call temporary register
+ * R18     - Platform Register
+ * R19-R28 - Callee saved registers
+ * R29     - Frame pointer
+ * R30     - Link register
+ * R31     - Stack pointer
+ */
+static void
+emit_prologue_has_call(struct a64_jit_ctx *ctx)
+{
+	uint8_t r6, r7, r8, r9, fp;
+
+	r6 = ebpf_to_a64_reg(ctx, EBPF_REG_6);
+	r7 = ebpf_to_a64_reg(ctx, EBPF_REG_7);
+	r8 = ebpf_to_a64_reg(ctx, EBPF_REG_8);
+	r9 = ebpf_to_a64_reg(ctx, EBPF_REG_9);
+	fp = ebpf_to_a64_reg(ctx, EBPF_FP);
+
+	/*
+	 * eBPF prog stack layout
+	 *
+	 *                               high
+	 *       eBPF prologue       0:+-----+ <= original A64_SP
+	 *                             |FP/LR|
+	 *                         -16:+-----+ <= current A64_FP
+	 *    Callee saved registers   | ... |
+	 *             EBPF_FP =>  -64:+-----+
+	 *                             |     |
+	 *       eBPF prog stack       | ... |
+	 *                             |     |
+	 * (EBPF_FP - bpf->stack_sz)=> +-----+
+	 * Pad for A64_SP 16B alignment| PAD |
+	 * (EBPF_FP - ctx->stack_sz)=> +-----+ <= current A64_SP
+	 *                             |     |
+	 *                             | ... | Function call stack
+	 *                             |     |
+	 *                             +-----+
+	 *                              low
+	 */
+	emit_stack_push(ctx, A64_FP, A64_LR);
+	emit_mov_64(ctx, A64_FP, A64_SP);
+	emit_stack_push(ctx, r6, r7);
+	emit_stack_push(ctx, r8, r9);
+	/*
+	 * There is no requirement to save A64_R(28) on the stack. It is done
+	 * here because A64_SP needs to be 16B aligned and STR vs STP
+	 * takes the same number of cycles (typically).
+	 */
+	emit_stack_push(ctx, fp, A64_R(28));
+	emit_mov_64(ctx, fp, A64_SP);
+	if (ctx->stack_sz)
+		emit_sub_imm_64(ctx, A64_SP, A64_SP, ctx->stack_sz);
+}
+
+static void
+emit_epilogue_has_call(struct a64_jit_ctx *ctx)
+{
+	uint8_t r6, r7, r8, r9, fp, r0;
+
+	r6 = ebpf_to_a64_reg(ctx, EBPF_REG_6);
+	r7 = ebpf_to_a64_reg(ctx, EBPF_REG_7);
+	r8 = ebpf_to_a64_reg(ctx, EBPF_REG_8);
+	r9 = ebpf_to_a64_reg(ctx, EBPF_REG_9);
+	fp = ebpf_to_a64_reg(ctx, EBPF_FP);
+	r0 = ebpf_to_a64_reg(ctx, EBPF_REG_0);
+
+	if (ctx->stack_sz)
+		emit_add_imm_64(ctx, A64_SP, A64_SP, ctx->stack_sz);
+	emit_stack_pop(ctx, fp, A64_R(28));
+	emit_stack_pop(ctx, r8, r9);
+	emit_stack_pop(ctx, r6, r7);
+	emit_stack_pop(ctx, A64_FP, A64_LR);
+	emit_mov_64(ctx, A64_R(0), r0);
+	emit_ret(ctx);
+}
+
+static void
+emit_prologue_no_call(struct a64_jit_ctx *ctx)
+{
+	/*
+	 * eBPF prog stack layout without EBPF_CALL opcode
+	 *
+	 *                               high
+	 *    eBPF prologue(EBPF_FP) 0:+-----+ <= original A64_SP/current A64_FP
+	 *                             |     |
+	 *                             | ... |
+	 *            eBPF prog stack  |     |
+	 *                             |     |
+	 * (EBPF_FP - bpf->stack_sz)=> +-----+
+	 * Pad for A64_SP 16B alignment| PAD |
+	 * (EBPF_FP - ctx->stack_sz)=> +-----+ <= current A64_SP
+	 *                             |     |
+	 *                             | ... | Function call stack
+	 *                             |     |
+	 *                             +-----+
+	 *                              low
+	 */
+	if (ctx->stack_sz) {
+		emit_mov_64(ctx, A64_FP, A64_SP);
+		emit_sub_imm_64(ctx, A64_SP, A64_SP, ctx->stack_sz);
+	}
+}
+
+static void
+emit_epilogue_no_call(struct a64_jit_ctx *ctx)
+{
+	if (ctx->stack_sz)
+		emit_add_imm_64(ctx, A64_SP, A64_SP, ctx->stack_sz);
+	emit_mov_64(ctx, A64_R(0), ebpf_to_a64_reg(ctx, EBPF_REG_0));
+	emit_ret(ctx);
+}
+
+static void
+emit_prologue(struct a64_jit_ctx *ctx)
+{
+	if (ctx->foundcall)
+		emit_prologue_has_call(ctx);
+	else
+		emit_prologue_no_call(ctx);
+
+	ctx->program_start = ctx->idx;
+}
+
+static void
+emit_epilogue(struct a64_jit_ctx *ctx)
+{
+	ctx->program_sz = ctx->idx - ctx->program_start;
+
+	if (ctx->foundcall)
+		emit_epilogue_has_call(ctx);
+	else
+		emit_epilogue_no_call(ctx);
+}
+
+static void
+check_program_has_call(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
+{
+	const struct ebpf_insn *ins;
+	uint8_t op;
+	uint32_t i;
+
+	for (i = 0; i != bpf->prm.nb_ins; i++) {
+		ins = bpf->prm.ins + i;
+		op = ins->code;
+
+		switch (op) {
+		/* Call imm */
+		case (BPF_JMP | EBPF_CALL):
+			ctx->foundcall = 1;
+			return;
+		}
+	}
+}
+
+/*
+ * Walk through the eBPF code and translate it to arm64 code.
+ */
+static int
+emit(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
+{
+	uint8_t op;
+	const struct ebpf_insn *ins;
+	uint32_t i;
+	int rc;
+
+	/* Reset context fields */
+	ctx->idx = 0;
+	/* arm64 SP must be aligned to 16 */
+	ctx->stack_sz = RTE_ALIGN_MUL_CEIL(bpf->stack_sz, 16);
+
+	emit_prologue(ctx);
+
+	for (i = 0; i != bpf->prm.nb_ins; i++) {
+
+		ins = bpf->prm.ins + i;
+		op = ins->code;
+
+		switch (op) {
+		/* Return r0 */
+		case (BPF_JMP | EBPF_EXIT):
+			emit_epilogue(ctx);
+			break;
+		default:
+			RTE_BPF_LOG(ERR,
+				"%s(%p): invalid opcode %#x at pc: %u;\n",
+				__func__, bpf, ins->code, i);
+			return -EINVAL;
+		}
+	}
+	rc = check_invalid_args(ctx, ctx->idx);
+
+	return rc;
+}
+
 /*
  * Produce a native ISA version of the given BPF code.
  */
 int
 bpf_jit_arm64(struct rte_bpf *bpf)
 {
-	RTE_SET_USED(bpf);
+	struct a64_jit_ctx ctx;
+	size_t size;
+	int rc;
+
+	/* Init JIT context */
+	memset(&ctx, 0, sizeof(ctx));
+
+	/* Check whether the eBPF program has a CALL class instruction */
+	check_program_has_call(&ctx, bpf);
+
+	/* First pass to calculate total code size and valid jump offsets */
+	rc = emit(&ctx, bpf);
+	if (rc)
+		goto finish;
+
+	size = ctx.idx * sizeof(uint32_t);
+	/* Allocate JIT program memory */
+	ctx.ins = mmap(NULL, size, PROT_READ | PROT_WRITE,
+			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	if (ctx.ins == MAP_FAILED) {
+		rc = -ENOMEM;
+		goto finish;
+	}
+
+	/* Second pass to generate code */
+	rc = emit(&ctx, bpf);
+	if (rc)
+		goto munmap;
+
+	rc = mprotect(ctx.ins, size, PROT_READ | PROT_EXEC) != 0;
+	if (rc) {
+		rc = -errno;
+		goto munmap;
+	}
+
+	/* Flush the icache */
+	__builtin___clear_cache(ctx.ins, ctx.ins + ctx.idx);
+
+	bpf->jit.func = (void *)ctx.ins;
+	bpf->jit.sz = size;
+
+	goto finish;
 
-	return -ENOTSUP;
+munmap:
+	munmap(ctx.ins, size);
+finish:
+	return rc;
 }
-- 
2.23.0
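
Note (not part of the patch): bpf_jit_arm64() runs emit() twice, first
with ctx.ins == NULL to learn the code size and then again with a buffer
attached. A minimal sketch of that two-pass idea follows. The sk_* names
are hypothetical, and a caller-supplied array stands in for the
mmap()/mprotect() handling of the real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of the JIT context: only what sizing needs */
struct sk_ctx {
	uint32_t *ins; /* NULL during the first (sizing) pass */
	uint32_t idx;  /* next instruction slot */
};

/* Always count; store only when a buffer is attached */
static void
sk_emit(struct sk_ctx *ctx, uint32_t insn)
{
	if (ctx->ins != NULL)
		ctx->ins[ctx->idx] = insn;
	ctx->idx++;
}

/* Stand-in for the translation loop: add x0, x7, #0 ; ret */
static void
sk_translate(struct sk_ctx *ctx)
{
	ctx->idx = 0;
	sk_emit(ctx, 0x910000e0u); /* add x0, x7, #0 */
	sk_emit(ctx, 0xd65f03c0u); /* ret */
}

/*
 * Run both passes. Returns the number of instructions written,
 * or 0 when the buffer is too small for the size found in pass 1.
 */
static uint32_t
sk_two_pass(uint32_t *buf, uint32_t cap)
{
	struct sk_ctx ctx = { .ins = NULL, .idx = 0 };

	sk_translate(&ctx); /* pass 1: size only, nothing is stored */
	if (ctx.idx > cap)
		return 0;

	ctx.ins = buf;
	sk_translate(&ctx); /* pass 2: same walk, now writing */
	return ctx.idx;
}
```

Because both passes walk the same code path, the count from the first
pass is exactly the number of words the second pass stores, which is why
the patch can size its allocation before generating any code.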



* [dpdk-dev] [PATCH 3/8] bpf/arm64: add basic arithmetic operations
  2019-09-03 10:59 [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support jerinj
  2019-09-03 10:59 ` [dpdk-dev] [PATCH 1/8] bpf/arm64: add build infrastructure jerinj
  2019-09-03 10:59 ` [dpdk-dev] [PATCH 2/8] bpf/arm64: add prologue and epilogue jerinj
@ 2019-09-03 10:59 ` jerinj
  2019-09-03 10:59 ` [dpdk-dev] [PATCH 4/8] bpf/arm64: add logical operations jerinj
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 32+ messages in thread
From: jerinj @ 2019-09-03 10:59 UTC
  To: dev
  Cc: konstantin.ananyev, honnappa.nagarahalli, thomas, gavin.hu, Jerin Jacob

From: Jerin Jacob <jerinj@marvell.com>

Add mov, add, sub, mul, div and mod arithmetic
operations for immediate and source register variants.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
 lib/librte_bpf/bpf_jit_arm64.c | 300 ++++++++++++++++++++++++++++++++-
 1 file changed, 299 insertions(+), 1 deletion(-)
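
Note (not part of the patch): a minimal sketch of the selection logic in
emit_mov_imm(), assuming the approach shown in this patch: start with MOVN
when more 16-bit blocks are 0xffff than 0x0000, otherwise with MOVZ, then
patch the remaining blocks with MOVK. The sk_* names are hypothetical. The
sketch models the destination register in plain C and counts instructions
rather than encoding them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of 16-bit blocks of val equal to 0xffff (ones) or 0x0000 */
static int
sk_blocks_weight(uint64_t val, bool ones)
{
	const uint64_t want = ones ? 0xffff : 0x0000;
	int sh, n = 0;

	for (sh = 0; sh < 64; sh += 16)
		n += ((val >> sh) & 0xffff) == want;
	return n;
}

/* Shift (0, 16, 32 or 48) of the highest 16-bit block with a set bit */
static int
sk_top_block_shift(uint64_t v)
{
	int sh = 48;

	while (sh > 0 && ((v >> sh) & 0xffff) == 0)
		sh -= 16;
	return sh;
}

/*
 * Model the register after MOVZ/MOVN followed by MOVKs and report
 * how many instructions the sequence needs.
 */
static uint64_t
sk_mov_imm64(uint64_t val, int *count)
{
	bool movn = sk_blocks_weight(val, true) > sk_blocks_weight(val, false);
	int sr = sk_top_block_shift(movn ? ~val : val);
	const uint64_t skip = movn ? 0xffff : 0x0000;
	uint64_t reg, blk;

	blk = ((movn ? ~val : val) >> sr) & 0xffff;
	reg = movn ? ~(blk << sr) : (blk << sr); /* MOVN or MOVZ */
	*count = 1;

	for (sr -= 16; sr >= 0; sr -= 16) {
		blk = (val >> sr) & 0xffff;
		if (blk == skip)
			continue; /* block is already correct */
		reg = (reg & ~(0xffffULL << sr)) | (blk << sr); /* MOVK */
		(*count)++;
	}
	return reg;
}
```

For example, 0xffff1234ffff5678 takes two instructions in this model
(MOVN for bits 32..47, MOVK for bits 0..15), while a value with no
all-zero or all-one block, such as 0x0123456789abcdef, takes four.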

diff --git a/lib/librte_bpf/bpf_jit_arm64.c b/lib/librte_bpf/bpf_jit_arm64.c
index 548408a61..5d2ce378c 100644
--- a/lib/librte_bpf/bpf_jit_arm64.c
+++ b/lib/librte_bpf/bpf_jit_arm64.c
@@ -43,6 +43,17 @@ struct a64_jit_ctx {
 	uint8_t foundcall;        /* Found EBPF_CALL class code in eBPF pgm */
 };
 
+static int
+check_mov_hw(bool is64, const uint8_t val)
+{
+	if (val == 16 || val == 0)
+		return 0;
+	else if (is64 && val != 64 && val != 48 && val != 32)
+		return 1;
+
+	return 0;
+}
+
 static int
 check_reg(uint8_t r)
 {
@@ -169,6 +180,179 @@ emit_stack_pop(struct a64_jit_ctx *ctx, uint8_t rt, uint8_t rt2)
 	emit_ls_pair_64(ctx, rt, rt2, A64_SP, 0, 1, 0);
 }
 
+#define A64_MOVN 0
+#define A64_MOVZ 2
+#define A64_MOVK 3
+static void
+mov_imm(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t type,
+	uint16_t imm16, uint8_t shift)
+{
+	uint32_t insn;
+
+	insn = (!!is64) << 31;
+	insn |= type << 29;
+	insn |= 0x25 << 23;
+	insn |= (shift/16) << 21;
+	insn |= imm16 << 5;
+	insn |= rd;
+
+	emit_insn(ctx, insn, check_reg(rd) || check_mov_hw(is64, shift));
+}
+
+static void
+emit_mov_imm32(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint32_t val)
+{
+	uint16_t upper = val >> 16;
+	uint16_t lower = val & 0xffff;
+
+	/* Positive number */
+	if ((val & 1UL << 31) == 0) {
+		mov_imm(ctx, is64, rd, A64_MOVZ, lower, 0);
+		if (upper)
+			mov_imm(ctx, is64, rd, A64_MOVK, upper, 16);
+	} else { /* Negative number */
+		if (upper == 0xffff) {
+			mov_imm(ctx, is64, rd, A64_MOVN, ~lower, 0);
+		} else {
+			mov_imm(ctx, is64, rd, A64_MOVN, ~upper, 16);
+			if (lower != 0xffff)
+				mov_imm(ctx, is64, rd, A64_MOVK, lower, 0);
+		}
+	}
+}
+
+static int
+u16_blocks_weight(const uint64_t val, bool one)
+{
+	return (((val >>  0) & 0xffff) == (one ? 0xffff : 0x0000)) +
+	       (((val >> 16) & 0xffff) == (one ? 0xffff : 0x0000)) +
+	       (((val >> 32) & 0xffff) == (one ? 0xffff : 0x0000)) +
+	       (((val >> 48) & 0xffff) == (one ? 0xffff : 0x0000));
+}
+
+static void
+emit_mov_imm(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint64_t val)
+{
+	uint64_t nval = ~val;
+	int movn, sr;
+
+	if (is64 == 0)
+		return emit_mov_imm32(ctx, 0, rd, (uint32_t)(val & 0xffffffff));
+
+	/* Find MOVN or MOVZ first */
+	movn = u16_blocks_weight(val, true) > u16_blocks_weight(val, false);
+	/* Find shift right value */
+	sr = movn ? rte_fls_u64(nval) - 1 : rte_fls_u64(val) - 1;
+	sr = RTE_ALIGN_FLOOR(sr, 16);
+	sr = RTE_MAX(sr, 0);
+
+	if (movn)
+		mov_imm(ctx, 1, rd, A64_MOVN, (nval >> sr) & 0xffff, sr);
+	else
+		mov_imm(ctx, 1, rd, A64_MOVZ, (val >> sr) & 0xffff, sr);
+
+	sr -= 16;
+	while (sr >= 0) {
+		if (((val >> sr) & 0xffff) != (movn ? 0xffff : 0x0000))
+			mov_imm(ctx, 1, rd, A64_MOVK, (val >> sr) & 0xffff, sr);
+		sr -= 16;
+	}
+}
+
+#define A64_ADD 0x58
+#define A64_SUB 0x258
+static void
+emit_add_sub(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rn,
+	     uint8_t rm, uint16_t op)
+{
+	uint32_t insn;
+
+	insn = (!!is64) << 31;
+	insn |= op << 21; /* shift == 0 */
+	insn |= rm << 16;
+	insn |= rn << 5;
+	insn |= rd;
+
+	emit_insn(ctx, insn, check_reg(rd) || check_reg(rm));
+}
+
+static void
+emit_add(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rm)
+{
+	emit_add_sub(ctx, is64, rd, rd, rm, A64_ADD);
+}
+
+static void
+emit_sub(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rm)
+{
+	emit_add_sub(ctx, is64, rd, rd, rm, A64_SUB);
+}
+
+static void
+emit_mul(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rm)
+{
+	uint32_t insn;
+
+	insn = (!!is64) << 31;
+	insn |= 0xd8 << 21;
+	insn |= rm << 16;
+	insn |= A64_ZR << 10;
+	insn |= rd << 5;
+	insn |= rd;
+
+	emit_insn(ctx, insn, check_reg(rd) || check_reg(rm));
+}
+
+#define A64_UDIV 0x2
+static void
+emit_data_process_two_src(struct a64_jit_ctx *ctx, bool is64, uint8_t rd,
+			  uint8_t rn, uint8_t rm, uint16_t op)
+
+{
+	uint32_t insn;
+
+	insn = (!!is64) << 31;
+	insn |= 0xd6 << 21;
+	insn |= rm << 16;
+	insn |= op << 10;
+	insn |= rn << 5;
+	insn |= rd;
+
+	emit_insn(ctx, insn, check_reg(rd) || check_reg(rm));
+}
+
+static void
+emit_div(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rm)
+{
+	emit_data_process_two_src(ctx, is64, rd, rd, rm, A64_UDIV);
+}
+
+static void
+emit_msub(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rn,
+	  uint8_t rm, uint8_t ra)
+{
+	uint32_t insn;
+
+	insn = (!!is64) << 31;
+	insn |= 0xd8 << 21;
+	insn |= rm << 16;
+	insn |= 0x1 << 15;
+	insn |= ra << 10;
+	insn |= rn << 5;
+	insn |= rd;
+
+	emit_insn(ctx, insn, check_reg(rd) || check_reg(rn) || check_reg(rm) ||
+		  check_reg(ra));
+}
+
+static void
+emit_mod(struct a64_jit_ctx *ctx, bool is64, uint8_t tmp, uint8_t rd,
+	 uint8_t rm)
+{
+	emit_data_process_two_src(ctx, is64, tmp, rd, rm, A64_UDIV);
+	emit_msub(ctx, is64, rd, tmp, rm, rd);
+}
+
 static uint8_t
 ebpf_to_a64_reg(struct a64_jit_ctx *ctx, uint8_t reg)
 {
@@ -365,6 +549,44 @@ emit_epilogue(struct a64_jit_ctx *ctx)
 		emit_epilogue_no_call(ctx);
 }
 
+static void
+emit_cbnz(struct a64_jit_ctx *ctx, bool is64, uint8_t rt, int32_t imm19)
+{
+	uint32_t insn, imm;
+
+	imm = mask_imm(19, imm19);
+	insn = (!!is64) << 31;
+	insn |= 0x35 << 24;
+	insn |= imm << 5;
+	insn |= rt;
+
+	emit_insn(ctx, insn, check_reg(rt) || check_imm(19, imm19));
+}
+
+static void
+emit_b(struct a64_jit_ctx *ctx, int32_t imm26)
+{
+	uint32_t insn, imm;
+
+	imm = mask_imm(26, imm26);
+	insn = 0x5 << 26;
+	insn |= imm;
+
+	emit_insn(ctx, insn, check_imm(26, imm26));
+}
+
+static void
+emit_return_zero_if_src_zero(struct a64_jit_ctx *ctx, bool is64, uint8_t src)
+{
+	uint8_t r0 = ebpf_to_a64_reg(ctx, EBPF_REG_0);
+	uint16_t jump_to_epilogue;
+
+	emit_cbnz(ctx, is64, src, 3);
+	emit_mov_imm(ctx, is64, r0, 0);
+	jump_to_epilogue = (ctx->program_start + ctx->program_sz) - ctx->idx;
+	emit_b(ctx, jump_to_epilogue);
+}
+
 static void
 check_program_has_call(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
 {
@@ -391,15 +613,19 @@ check_program_has_call(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
 static int
 emit(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
 {
-	uint8_t op;
+	uint8_t op, dst, src, tmp1, tmp2;
 	const struct ebpf_insn *ins;
+	int32_t imm;
 	uint32_t i;
+	bool is64;
 	int rc;
 
 	/* Reset context fields */
 	ctx->idx = 0;
 	/* arm64 SP must be aligned to 16 */
 	ctx->stack_sz = RTE_ALIGN_MUL_CEIL(bpf->stack_sz, 16);
+	tmp1 = ebpf_to_a64_reg(ctx, TMP_REG_1);
+	tmp2 = ebpf_to_a64_reg(ctx, TMP_REG_2);
 
 	emit_prologue(ctx);
 
@@ -407,8 +633,80 @@ emit(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
 
 		ins = bpf->prm.ins + i;
 		op = ins->code;
+		imm = ins->imm;
+
+		dst = ebpf_to_a64_reg(ctx, ins->dst_reg);
+		src = ebpf_to_a64_reg(ctx, ins->src_reg);
+		is64 = (BPF_CLASS(op) == EBPF_ALU64);
 
 		switch (op) {
+		/* dst = src */
+		case (BPF_ALU | EBPF_MOV | BPF_X):
+		case (EBPF_ALU64 | EBPF_MOV | BPF_X):
+			emit_mov(ctx, is64, dst, src);
+			break;
+		/* dst = imm */
+		case (BPF_ALU | EBPF_MOV | BPF_K):
+		case (EBPF_ALU64 | EBPF_MOV | BPF_K):
+			emit_mov_imm(ctx, is64, dst, imm);
+			break;
+		/* dst += src */
+		case (BPF_ALU | BPF_ADD | BPF_X):
+		case (EBPF_ALU64 | BPF_ADD | BPF_X):
+			emit_add(ctx, is64, dst, src);
+			break;
+		/* dst += imm */
+		case (BPF_ALU | BPF_ADD | BPF_K):
+		case (EBPF_ALU64 | BPF_ADD | BPF_K):
+			emit_mov_imm(ctx, is64, tmp1, imm);
+			emit_add(ctx, is64, dst, tmp1);
+			break;
+		/* dst -= src */
+		case (BPF_ALU | BPF_SUB | BPF_X):
+		case (EBPF_ALU64 | BPF_SUB | BPF_X):
+			emit_sub(ctx, is64, dst, src);
+			break;
+		/* dst -= imm */
+		case (BPF_ALU | BPF_SUB | BPF_K):
+		case (EBPF_ALU64 | BPF_SUB | BPF_K):
+			emit_mov_imm(ctx, is64, tmp1, imm);
+			emit_sub(ctx, is64, dst, tmp1);
+			break;
+		/* dst *= src */
+		case (BPF_ALU | BPF_MUL | BPF_X):
+		case (EBPF_ALU64 | BPF_MUL | BPF_X):
+			emit_mul(ctx, is64, dst, src);
+			break;
+		/* dst *= imm */
+		case (BPF_ALU | BPF_MUL | BPF_K):
+		case (EBPF_ALU64 | BPF_MUL | BPF_K):
+			emit_mov_imm(ctx, is64, tmp1, imm);
+			emit_mul(ctx, is64, dst, tmp1);
+			break;
+		/* dst /= src */
+		case (BPF_ALU | BPF_DIV | BPF_X):
+		case (EBPF_ALU64 | BPF_DIV | BPF_X):
+			emit_return_zero_if_src_zero(ctx, is64, src);
+			emit_div(ctx, is64, dst, src);
+			break;
+		/* dst /= imm */
+		case (BPF_ALU | BPF_DIV | BPF_K):
+		case (EBPF_ALU64 | BPF_DIV | BPF_K):
+			emit_mov_imm(ctx, is64, tmp1, imm);
+			emit_div(ctx, is64, dst, tmp1);
+			break;
+		/* dst %= src */
+		case (BPF_ALU | BPF_MOD | BPF_X):
+		case (EBPF_ALU64 | BPF_MOD | BPF_X):
+			emit_return_zero_if_src_zero(ctx, is64, src);
+			emit_mod(ctx, is64, tmp1, dst, src);
+			break;
+		/* dst %= imm */
+		case (BPF_ALU | BPF_MOD | BPF_K):
+		case (EBPF_ALU64 | BPF_MOD | BPF_K):
+			emit_mov_imm(ctx, is64, tmp1, imm);
+			emit_mod(ctx, is64, tmp2, dst, tmp1);
+			break;
 		/* Return r0 */
 		case (BPF_JMP | EBPF_EXIT):
 			emit_epilogue(ctx);
-- 
2.23.0
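
Note (not part of the patch): emit_mod() builds the remainder from UDIV
followed by MSUB, since A64 has no remainder instruction. A minimal sketch
of that arithmetic in plain C follows, with hypothetical sk_* helpers
standing in for the two instructions.

```c
#include <assert.h>
#include <stdint.h>

/* A64 UDIV: unsigned divide; a zero divisor yields 0 instead of trapping */
static uint64_t
sk_udiv(uint64_t rn, uint64_t rm)
{
	return rm ? rn / rm : 0;
}

/* A64 MSUB: rd = ra - rn * rm */
static uint64_t
sk_msub(uint64_t rn, uint64_t rm, uint64_t ra)
{
	return ra - rn * rm;
}

/* dst %= src as the UDIV + MSUB pair: dst - (dst / src) * src */
static uint64_t
sk_mod(uint64_t dst, uint64_t src)
{
	uint64_t tmp = sk_udiv(dst, src);

	return sk_msub(tmp, src, dst);
}
```

Since UDIV does not fault on a zero divisor, sk_mod(7, 0) simply returns 7
in this model. That appears to be why the register variants in the patch
call emit_return_zero_if_src_zero() first, so that a zero divisor ends the
program with r0 = 0 instead of continuing with an arbitrary result.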



* [dpdk-dev]  [PATCH 4/8] bpf/arm64: add logical operations
  2019-09-03 10:59 [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support jerinj
                   ` (2 preceding siblings ...)
  2019-09-03 10:59 ` [dpdk-dev] [PATCH 3/8] bpf/arm64: add basic arithmetic operations jerinj
@ 2019-09-03 10:59 ` jerinj
  2019-09-03 10:59 ` [dpdk-dev] [PATCH 5/8] bpf/arm64: add byte swap operations jerinj
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 32+ messages in thread
From: jerinj @ 2019-09-03 10:59 UTC
  To: dev
  Cc: konstantin.ananyev, honnappa.nagarahalli, thomas, gavin.hu, Jerin Jacob

From: Jerin Jacob <jerinj@marvell.com>

Add OR, AND, NEG, XOR and shift operations for immediate
and source register variants.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
 lib/librte_bpf/bpf_jit_arm64.c | 189 +++++++++++++++++++++++++++++++++
 1 file changed, 189 insertions(+)
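
Note (not part of the patch): the immediate shifts are emitted as UBFM
aliases. A minimal sketch of the field computation used by emit_lsl() and
emit_lsr(), and of the word layout used by emit_bitfield(), follows. The
sk_* names are hypothetical and the functions return values instead of
writing into a JIT context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sk_bitfield {
	uint8_t immr;
	uint8_t imms;
};

/* LSL #imm is UBFM rd, rn, #(-imm mod width), #(width - 1 - imm) */
static struct sk_bitfield
sk_lsl_fields(bool is64, uint8_t imm)
{
	const unsigned int width = is64 ? 64 : 32;
	struct sk_bitfield f;

	assert(imm < width);
	f.immr = (width - imm) & (width - 1);
	f.imms = width - 1 - imm;
	return f;
}

/* LSR #imm is UBFM rd, rn, #imm, #(width - 1) */
static struct sk_bitfield
sk_lsr_fields(bool is64, uint8_t imm)
{
	const unsigned int width = is64 ? 64 : 32;
	struct sk_bitfield f;

	assert(imm < width);
	f.immr = imm;
	f.imms = width - 1;
	return f;
}

/* UBFM word: sf | opc = 2 | 0x26 << 23 | N (64-bit only) | immr | imms | Rn | Rd */
static uint32_t
sk_ubfm(bool is64, uint8_t rd, uint8_t rn, struct sk_bitfield f)
{
	assert(rd < 32 && rn < 32);

	return ((uint32_t)is64 << 31) | (0x2u << 29) | (0x26u << 23) |
	       ((uint32_t)is64 << 22) | ((uint32_t)f.immr << 16) |
	       ((uint32_t)f.imms << 10) | ((uint32_t)rn << 5) | rd;
}
```

With these assumptions, a 64-bit left shift by 4 gives immr = 60 and
imms = 59, and sk_ubfm(true, 0, 0, ...) produces 0xd37cec00, which
disassembles as lsl x0, x0, #4.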

diff --git a/lib/librte_bpf/bpf_jit_arm64.c b/lib/librte_bpf/bpf_jit_arm64.c
index 5d2ce378c..fcde3583b 100644
--- a/lib/librte_bpf/bpf_jit_arm64.c
+++ b/lib/librte_bpf/bpf_jit_arm64.c
@@ -43,6 +43,17 @@ struct a64_jit_ctx {
 	uint8_t foundcall;        /* Found EBPF_CALL class code in eBPF pgm */
 };
 
+static int
+check_immr_imms(bool is64, uint8_t immr, uint8_t imms)
+{
+	const unsigned int width = is64 ? 64 : 32;
+
+	if (immr >= width || imms >= width)
+		return 1;
+
+	return 0;
+}
+
 static int
 check_mov_hw(bool is64, const uint8_t val)
 {
@@ -288,6 +299,12 @@ emit_sub(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rm)
 	emit_add_sub(ctx, is64, rd, rd, rm, A64_SUB);
 }
 
+static void
+emit_neg(struct a64_jit_ctx *ctx, bool is64, uint8_t rd)
+{
+	emit_add_sub(ctx, is64, rd, A64_ZR, rd, A64_SUB);
+}
+
 static void
 emit_mul(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rm)
 {
@@ -304,6 +321,9 @@ emit_mul(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rm)
 }
 
 #define A64_UDIV 0x2
+#define A64_LSLV 0x8
+#define A64_LSRV 0x9
+#define A64_ASRV 0xA
 static void
 emit_data_process_two_src(struct a64_jit_ctx *ctx, bool is64, uint8_t rd,
 			  uint8_t rn, uint8_t rm, uint16_t op)
@@ -327,6 +347,107 @@ emit_div(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rm)
 	emit_data_process_two_src(ctx, is64, rd, rd, rm, A64_UDIV);
 }
 
+static void
+emit_lslv(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rm)
+{
+	emit_data_process_two_src(ctx, is64, rd, rd, rm, A64_LSLV);
+}
+
+static void
+emit_lsrv(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rm)
+{
+	emit_data_process_two_src(ctx, is64, rd, rd, rm, A64_LSRV);
+}
+
+static void
+emit_asrv(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rm)
+{
+	emit_data_process_two_src(ctx, is64, rd, rd, rm, A64_ASRV);
+}
+
+#define A64_UBFM 0x2
+#define A64_SBFM 0x0
+static void
+emit_bitfield(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rn,
+	      uint8_t immr, uint8_t imms, uint16_t op)
+
+{
+	uint32_t insn;
+
+	insn = (!!is64) << 31;
+	if (insn)
+		insn |= 1 << 22; /* Set N bit when is64 is set */
+	insn |= op << 29;
+	insn |= 0x26 << 23;
+	insn |= immr << 16;
+	insn |= imms << 10;
+	insn |= rn << 5;
+	insn |= rd;
+
+	emit_insn(ctx, insn, check_reg(rd) || check_reg(rn) ||
+		  check_immr_imms(is64, immr, imms));
+}
+static void
+emit_lsl(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t imm)
+{
+	const unsigned int width = is64 ? 64 : 32;
+	uint8_t imms, immr;
+
+	immr = (width - imm) & (width - 1);
+	imms = width - 1 - imm;
+
+	emit_bitfield(ctx, is64, rd, rd, immr, imms, A64_UBFM);
+}
+
+static void
+emit_lsr(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t imm)
+{
+	emit_bitfield(ctx, is64, rd, rd, imm, is64 ? 63 : 31, A64_UBFM);
+}
+
+static void
+emit_asr(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t imm)
+{
+	emit_bitfield(ctx, is64, rd, rd, imm, is64 ? 63 : 31, A64_SBFM);
+}
+
+#define A64_AND 0
+#define A64_OR 1
+#define A64_XOR 2
+static void
+emit_logical(struct a64_jit_ctx *ctx, bool is64, uint8_t rd,
+	     uint8_t rm, uint16_t op)
+{
+	uint32_t insn;
+
+	insn = (!!is64) << 31;
+	insn |= op << 29;
+	insn |= 0x50 << 21;
+	insn |= rm << 16;
+	insn |= rd << 5;
+	insn |= rd;
+
+	emit_insn(ctx, insn, check_reg(rd) || check_reg(rm));
+}
+
+static void
+emit_or(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rm)
+{
+	emit_logical(ctx, is64, rd, rm, A64_OR);
+}
+
+static void
+emit_and(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rm)
+{
+	emit_logical(ctx, is64, rd, rm, A64_AND);
+}
+
+static void
+emit_xor(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rm)
+{
+	emit_logical(ctx, is64, rd, rm, A64_XOR);
+}
+
 static void
 emit_msub(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint8_t rn,
 	  uint8_t rm, uint8_t ra)
@@ -707,6 +828,74 @@ emit(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
 			emit_mov_imm(ctx, is64, tmp1, imm);
 			emit_mod(ctx, is64, tmp2, dst, tmp1);
 			break;
+		/* dst |= src */
+		case (BPF_ALU | BPF_OR | BPF_X):
+		case (EBPF_ALU64 | BPF_OR | BPF_X):
+			emit_or(ctx, is64, dst, src);
+			break;
+		/* dst |= imm */
+		case (BPF_ALU | BPF_OR | BPF_K):
+		case (EBPF_ALU64 | BPF_OR | BPF_K):
+			emit_mov_imm(ctx, is64, tmp1, imm);
+			emit_or(ctx, is64, dst, tmp1);
+			break;
+		/* dst &= src */
+		case (BPF_ALU | BPF_AND | BPF_X):
+		case (EBPF_ALU64 | BPF_AND | BPF_X):
+			emit_and(ctx, is64, dst, src);
+			break;
+		/* dst &= imm */
+		case (BPF_ALU | BPF_AND | BPF_K):
+		case (EBPF_ALU64 | BPF_AND | BPF_K):
+			emit_mov_imm(ctx, is64, tmp1, imm);
+			emit_and(ctx, is64, dst, tmp1);
+			break;
+		/* dst ^= src */
+		case (BPF_ALU | BPF_XOR | BPF_X):
+		case (EBPF_ALU64 | BPF_XOR | BPF_X):
+			emit_xor(ctx, is64, dst, src);
+			break;
+		/* dst ^= imm */
+		case (BPF_ALU | BPF_XOR | BPF_K):
+		case (EBPF_ALU64 | BPF_XOR | BPF_K):
+			emit_mov_imm(ctx, is64, tmp1, imm);
+			emit_xor(ctx, is64, dst, tmp1);
+			break;
+		/* dst = -dst */
+		case (BPF_ALU | BPF_NEG):
+		case (EBPF_ALU64 | BPF_NEG):
+			emit_neg(ctx, is64, dst);
+			break;
+		/* dst <<= src */
+		case BPF_ALU | BPF_LSH | BPF_X:
+		case EBPF_ALU64 | BPF_LSH | BPF_X:
+			emit_lslv(ctx, is64, dst, src);
+			break;
+		/* dst <<= imm */
+		case BPF_ALU | BPF_LSH | BPF_K:
+		case EBPF_ALU64 | BPF_LSH | BPF_K:
+			emit_lsl(ctx, is64, dst, imm);
+			break;
+		/* dst >>= src */
+		case BPF_ALU | BPF_RSH | BPF_X:
+		case EBPF_ALU64 | BPF_RSH | BPF_X:
+			emit_lsrv(ctx, is64, dst, src);
+			break;
+		/* dst >>= imm */
+		case BPF_ALU | BPF_RSH | BPF_K:
+		case EBPF_ALU64 | BPF_RSH | BPF_K:
+			emit_lsr(ctx, is64, dst, imm);
+			break;
+		/* dst >>= src (arithmetic) */
+		case BPF_ALU | EBPF_ARSH | BPF_X:
+		case EBPF_ALU64 | EBPF_ARSH | BPF_X:
+			emit_asrv(ctx, is64, dst, src);
+			break;
+		/* dst >>= imm (arithmetic) */
+		case BPF_ALU | EBPF_ARSH | BPF_K:
+		case EBPF_ALU64 | EBPF_ARSH | BPF_K:
+			emit_asr(ctx, is64, dst, imm);
+			break;
 		/* Return r0 */
 		case (BPF_JMP | EBPF_EXIT):
 			emit_epilogue(ctx);
-- 
2.23.0



* [dpdk-dev]  [PATCH 5/8] bpf/arm64: add byte swap operations
  2019-09-03 10:59 [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support jerinj
                   ` (3 preceding siblings ...)
  2019-09-03 10:59 ` [dpdk-dev] [PATCH 4/8] bpf/arm64: add logical operations jerinj
@ 2019-09-03 10:59 ` jerinj
  2019-09-03 10:59 ` [dpdk-dev] [PATCH 6/8] bpf/arm64: add load and store operations jerinj
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 32+ messages in thread
From: jerinj @ 2019-09-03 10:59 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, honnappa.nagarahalli, thomas, gavin.hu, Jerin Jacob

From: Jerin Jacob <jerinj@marvell.com>

Add le16, le32, le64, be16, be32 and be64 operations.
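
As a rough reference model of the intended semantics (an assumption
based on the usual eBPF definition of these instructions, not code from
this patch): converting to the host's own byte order only truncates and
zero-extends the value, while converting to the other byte order
reverses the low 16/32/64 bits byte by byte. The names model_to_be()
and model_to_le() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Keep the low "width" bits and clear everything above them. */
static uint64_t
zext_width(uint64_t v, unsigned int width)
{
	assert(width == 16 || width == 32 || width == 64);
	return width == 64 ? v : v & ((UINT64_C(1) << width) - 1);
}

/* Reverse the bytes of the low "width" bits; upper bits end up zero. */
static uint64_t
bswap_width(uint64_t v, unsigned int width)
{
	uint64_t out = 0;
	unsigned int i;

	assert(width == 16 || width == 32 || width == 64);
	for (i = 0; i < width; i += 8)
		out = (out << 8) | ((v >> i) & 0xff);

	return out;
}

/* Hypothetical model of "dst = be##width(dst)". */
static uint64_t
model_to_be(uint64_t v, unsigned int width, bool host_is_be)
{
	return host_is_be ? zext_width(v, width) : bswap_width(v, width);
}

/* Hypothetical model of "dst = le##width(dst)". */
static uint64_t
model_to_le(uint64_t v, unsigned int width, bool host_is_be)
{
	return host_is_be ? bswap_width(v, width) : zext_width(v, width);
}
```

On a little-endian host, model_to_be(0x1122334455667788, 16, false)
gives 0x8877, and model_to_le() with the same arguments gives 0x7788.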

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
 lib/librte_bpf/bpf_jit_arm64.c | 87 ++++++++++++++++++++++++++++++++++
 1 file changed, 87 insertions(+)

diff --git a/lib/librte_bpf/bpf_jit_arm64.c b/lib/librte_bpf/bpf_jit_arm64.c
index fcde3583b..ec165d627 100644
--- a/lib/librte_bpf/bpf_jit_arm64.c
+++ b/lib/librte_bpf/bpf_jit_arm64.c
@@ -6,6 +6,7 @@
 #include <stdbool.h>
 
 #include <rte_common.h>
+#include <rte_byteorder.h>
 
 #include "bpf_impl.h"
 
@@ -474,6 +475,84 @@ emit_mod(struct a64_jit_ctx *ctx, bool is64, uint8_t tmp, uint8_t rd,
 	emit_msub(ctx, is64, rd, tmp, rm, rd);
 }
 
+static void
+emit_zero_extend(struct a64_jit_ctx *ctx, uint8_t rd, int32_t imm)
+{
+	switch (imm) {
+	case 16:
+		/* Zero-extend 16 bits into 64 bits */
+		emit_bitfield(ctx, 1, rd, rd, 0, 15, A64_UBFM);
+		break;
+	case 32:
+		/* Zero-extend 32 bits into 64 bits */
+		emit_bitfield(ctx, 1, rd, rd, 0, 31, A64_UBFM);
+		break;
+	case 64:
+		break;
+	default:
+		/* Generate error */
+		emit_insn(ctx, 0, 1);
+	}
+}
+
+static void
+emit_rev(struct a64_jit_ctx *ctx, uint8_t rd, int32_t imm)
+{
+	uint32_t insn;
+
+	insn = 0xdac00000;
+	insn |= rd << 5;
+	insn |= rd;
+
+	switch (imm) {
+	case 16:
+		insn |= 1 << 10;
+		emit_insn(ctx, insn, check_reg(rd));
+		emit_zero_extend(ctx, rd, 16);
+		break;
+	case 32:
+		insn |= 2 << 10;
+		emit_insn(ctx, insn, check_reg(rd));
+		/* Upper 32 bits already cleared */
+		break;
+	case 64:
+		insn |= 3 << 10;
+		emit_insn(ctx, insn, check_reg(rd));
+		break;
+	default:
+		/* Generate error */
+		emit_insn(ctx, insn, 1);
+	}
+}
+
+static int
+is_be(void)
+{
+#if RTE_BYTE_ORDER == RTE_BIG_ENDIAN
+	return 1;
+#else
+	return 0;
+#endif
+}
+
+static void
+emit_be(struct a64_jit_ctx *ctx, uint8_t rd, int32_t imm)
+{
+	if (is_be())
+		emit_zero_extend(ctx, rd, imm);
+	else
+		emit_rev(ctx, rd, imm);
+}
+
+static void
+emit_le(struct a64_jit_ctx *ctx, uint8_t rd, int32_t imm)
+{
+	if (is_be())
+		emit_rev(ctx, rd, imm);
+	else
+		emit_zero_extend(ctx, rd, imm);
+}
+
 static uint8_t
 ebpf_to_a64_reg(struct a64_jit_ctx *ctx, uint8_t reg)
 {
@@ -896,6 +975,14 @@ emit(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
 		case EBPF_ALU64 | EBPF_ARSH | BPF_K:
 			emit_asr(ctx, is64, dst, imm);
 			break;
+		/* dst = be##imm(dst) */
+		case (BPF_ALU | EBPF_END | EBPF_TO_BE):
+			emit_be(ctx, dst, imm);
+			break;
+		/* dst = le##imm(dst) */
+		case (BPF_ALU | EBPF_END | EBPF_TO_LE):
+			emit_le(ctx, dst, imm);
+			break;
 		/* Return r0 */
 		case (BPF_JMP | EBPF_EXIT):
 			emit_epilogue(ctx);
-- 
2.23.0



* [dpdk-dev]  [PATCH 6/8] bpf/arm64: add load and store operations
  2019-09-03 10:59 [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support jerinj
                   ` (4 preceding siblings ...)
  2019-09-03 10:59 ` [dpdk-dev] [PATCH 5/8] bpf/arm64: add byte swap operations jerinj
@ 2019-09-03 10:59 ` jerinj
  2019-09-03 10:59 ` [dpdk-dev] [PATCH 7/8] bpf/arm64: add atomic-exchange-and-add operation jerinj
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 32+ messages in thread
From: jerinj @ 2019-09-03 10:59 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, honnappa.nagarahalli, thomas, gavin.hu, Jerin Jacob

From: Jerin Jacob <jerinj@marvell.com>

Add load and store operations.
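
A minimal sketch of the register-offset load/store encoding that
emit_ls() builds in this patch, shown as a standalone function for
illustration. The enum a64_ls_size and the function a64_encode_ls_reg()
are hypothetical names; the patch itself takes the eBPF size constants
(BPF_B, BPF_H, BPF_W, EBPF_DW) and maps them to the same 2-bit field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: value of the 2-bit size field, i.e. log2(bytes). */
enum a64_ls_size {
	A64_SZ_B = 0,  /* 1 byte */
	A64_SZ_H = 1,  /* 2 bytes */
	A64_SZ_W = 2,  /* 4 bytes */
	A64_SZ_DW = 3, /* 8 bytes */
};

/*
 * Hypothetical helper: encode LDR/STR (register offset),
 * "ldr/str rt, [rn, rm]" with option = LSL and S = 0.
 */
static uint32_t
a64_encode_ls_reg(enum a64_ls_size sz, uint8_t rt, uint8_t rn, uint8_t rm,
		  bool load)
{
	uint32_t insn;

	assert(rt < 32 && rn < 32 && rm < 32);

	insn = 0x1c1u << 21;           /* load/store register-offset class */
	if (load)
		insn |= 1u << 22;      /* opc: load instead of store */
	insn |= (uint32_t)sz << 30;    /* access size */
	insn |= (uint32_t)rm << 16;    /* offset register */
	insn |= 0x1au << 10;           /* option = LSL, S = 0 */
	insn |= (uint32_t)rn << 5;     /* base register */
	insn |= rt;                    /* data register */

	return insn;
}
```

For example, a64_encode_ls_reg(A64_SZ_DW, 0, 1, 2, true) yields
0xf8626820, which is the encoding of "ldr x0, [x1, x2]".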

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
 lib/librte_bpf/bpf_jit_arm64.c | 84 ++++++++++++++++++++++++++++++++++
 1 file changed, 84 insertions(+)

diff --git a/lib/librte_bpf/bpf_jit_arm64.c b/lib/librte_bpf/bpf_jit_arm64.c
index ec165d627..c797c9c62 100644
--- a/lib/librte_bpf/bpf_jit_arm64.c
+++ b/lib/librte_bpf/bpf_jit_arm64.c
@@ -66,6 +66,15 @@ check_mov_hw(bool is64, const uint8_t val)
 	return 0;
 }
 
+static int
+check_ls_sz(uint8_t sz)
+{
+	if (sz == BPF_B || sz == BPF_H || sz == BPF_W || sz == EBPF_DW)
+		return 0;
+
+	return 1;
+}
+
 static int
 check_reg(uint8_t r)
 {
@@ -271,6 +280,47 @@ emit_mov_imm(struct a64_jit_ctx *ctx, bool is64, uint8_t rd, uint64_t val)
 	}
 }
 
+static void
+emit_ls(struct a64_jit_ctx *ctx, uint8_t sz, uint8_t rt, uint8_t rn, uint8_t rm,
+	bool load)
+{
+	uint32_t insn;
+
+	insn = 0x1c1 << 21;
+	if (load)
+		insn |= 1 << 22;
+	if (sz == BPF_B)
+		insn |= 0 << 30;
+	else if (sz == BPF_H)
+		insn |= 1 << 30;
+	else if (sz == BPF_W)
+		insn |= 2 << 30;
+	else if (sz == EBPF_DW)
+		insn |= 3 << 30;
+
+	insn |= rm << 16;
+	insn |= 0x1a << 10; /* LSL and S = 0 */
+	insn |= rn << 5;
+	insn |= rt;
+
+	emit_insn(ctx, insn, check_reg(rt) || check_reg(rn) || check_reg(rm) ||
+		  check_ls_sz(sz));
+}
+
+static void
+emit_str(struct a64_jit_ctx *ctx, uint8_t sz, uint8_t rt, uint8_t rn,
+	 uint8_t rm)
+{
+	emit_ls(ctx, sz, rt, rn, rm, 0);
+}
+
+static void
+emit_ldr(struct a64_jit_ctx *ctx, uint8_t sz, uint8_t rt, uint8_t rn,
+	 uint8_t rm)
+{
+	emit_ls(ctx, sz, rt, rn, rm, 1);
+}
+
 #define A64_ADD 0x58
 #define A64_SUB 0x258
 static void
@@ -815,6 +865,8 @@ emit(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
 {
 	uint8_t op, dst, src, tmp1, tmp2;
 	const struct ebpf_insn *ins;
+	uint64_t u64;
+	int16_t off;
 	int32_t imm;
 	uint32_t i;
 	bool is64;
@@ -833,6 +885,7 @@ emit(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
 
 		ins = bpf->prm.ins + i;
 		op = ins->code;
+		off = ins->off;
 		imm = ins->imm;
 
 		dst = ebpf_to_a64_reg(ctx, ins->dst_reg);
@@ -983,6 +1036,37 @@ emit(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
 		case (BPF_ALU | EBPF_END | EBPF_TO_LE):
 			emit_le(ctx, dst, imm);
 			break;
+		/* dst = *(size *) (src + off) */
+		case (BPF_LDX | BPF_MEM | BPF_B):
+		case (BPF_LDX | BPF_MEM | BPF_H):
+		case (BPF_LDX | BPF_MEM | BPF_W):
+		case (BPF_LDX | BPF_MEM | EBPF_DW):
+			emit_mov_imm(ctx, 1, tmp1, off);
+			emit_ldr(ctx, BPF_SIZE(op), dst, src, tmp1);
+			break;
+		/* dst = imm64 */
+		case (BPF_LD | BPF_IMM | EBPF_DW):
+			u64 = ((uint64_t)ins[1].imm << 32) | (uint32_t)imm;
+			emit_mov_imm(ctx, 1, dst, u64);
+			i++;
+			break;
+		/* *(size *)(dst + off) = src */
+		case (BPF_STX | BPF_MEM | BPF_B):
+		case (BPF_STX | BPF_MEM | BPF_H):
+		case (BPF_STX | BPF_MEM | BPF_W):
+		case (BPF_STX | BPF_MEM | EBPF_DW):
+			emit_mov_imm(ctx, 1, tmp1, off);
+			emit_str(ctx, BPF_SIZE(op), src, dst, tmp1);
+			break;
+		/* *(size *)(dst + off) = imm */
+		case (BPF_ST | BPF_MEM | BPF_B):
+		case (BPF_ST | BPF_MEM | BPF_H):
+		case (BPF_ST | BPF_MEM | BPF_W):
+		case (BPF_ST | BPF_MEM | EBPF_DW):
+			emit_mov_imm(ctx, 1, tmp1, imm);
+			emit_mov_imm(ctx, 1, tmp2, off);
+			emit_str(ctx, BPF_SIZE(op), tmp1, dst, tmp2);
+			break;
 		/* Return r0 */
 		case (BPF_JMP | EBPF_EXIT):
 			emit_epilogue(ctx);
-- 
2.23.0



* [dpdk-dev] [PATCH 7/8] bpf/arm64: add atomic-exchange-and-add operation
  2019-09-03 10:59 [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support jerinj
                   ` (5 preceding siblings ...)
  2019-09-03 10:59 ` [dpdk-dev] [PATCH 6/8] bpf/arm64: add load and store operations jerinj
@ 2019-09-03 10:59 ` jerinj
  2019-10-18 13:16   ` David Marchand
  2019-09-03 10:59 ` [dpdk-dev] [PATCH 8/8] bpf/arm64: add branch operation jerinj
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 32+ messages in thread
From: jerinj @ 2019-09-03 10:59 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, honnappa.nagarahalli, thomas, gavin.hu, Jerin Jacob

From: Jerin Jacob <jerinj@marvell.com>

Implement the eBPF XADD instruction using the arm64 STADD instruction.
If the given platform does not have atomics support,
use an LDXR/STXR retry loop for the critical section instead of STADD.
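
The three encodings involved can be sketched as standalone functions.
The a64_encode_*() names are hypothetical and only mirror what
emit_stadd(), emit_ldxr() and emit_stxr() in this patch compute; the
base opcodes are the ones used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: "stadd rs, [rn]", atomic add with the result discarded. */
static uint32_t
a64_encode_stadd(bool is64, uint8_t rs, uint8_t rn)
{
	assert(rs < 32 && rn < 32);
	return 0xb820001fu | ((uint32_t)is64 << 30) |
	       ((uint32_t)rs << 16) | ((uint32_t)rn << 5);
}

/* Hypothetical: "ldxr rt, [rn]", load-exclusive. */
static uint32_t
a64_encode_ldxr(bool is64, uint8_t rt, uint8_t rn)
{
	assert(rt < 32 && rn < 32);
	return 0x885f7c00u | ((uint32_t)is64 << 30) |
	       ((uint32_t)rn << 5) | rt;
}

/* Hypothetical: "stxr ws, rt, [rn]", store-exclusive; ws gets the status. */
static uint32_t
a64_encode_stxr(bool is64, uint8_t rs, uint8_t rt, uint8_t rn)
{
	assert(rs < 32 && rt < 32 && rn < 32);
	return 0x88007c00u | ((uint32_t)is64 << 30) |
	       ((uint32_t)rs << 16) | ((uint32_t)rn << 5) | rt;
}

/*
 * Fallback sequence emitted when atomics are not available:
 *
 *   ldxr  tmp2, [rn]
 *   add   tmp2, tmp2, src
 *   stxr  tmp3, tmp2, [rn]
 *   cbnz  tmp3, -3          ; status != 0: retry from the ldxr
 */
```

The cbnz offset of -3 is counted in instructions relative to the cbnz
itself, so a failed store-exclusive branches back to the ldxr.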

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
 lib/librte_bpf/bpf_jit_arm64.c | 85 +++++++++++++++++++++++++++++++++-
 1 file changed, 84 insertions(+), 1 deletion(-)

diff --git a/lib/librte_bpf/bpf_jit_arm64.c b/lib/librte_bpf/bpf_jit_arm64.c
index c797c9c62..62fa6a505 100644
--- a/lib/librte_bpf/bpf_jit_arm64.c
+++ b/lib/librte_bpf/bpf_jit_arm64.c
@@ -837,6 +837,83 @@ emit_return_zero_if_src_zero(struct a64_jit_ctx *ctx, bool is64, uint8_t src)
 	emit_b(ctx, jump_to_epilogue);
 }
 
+static void
+emit_stadd(struct a64_jit_ctx *ctx, bool is64, uint8_t rs, uint8_t rn)
+{
+	uint32_t insn;
+
+	insn = 0xb820001f;
+	insn |= (!!is64) << 30;
+	insn |= rs << 16;
+	insn |= rn << 5;
+
+	emit_insn(ctx, insn, check_reg(rs) || check_reg(rn));
+}
+
+static void
+emit_ldxr(struct a64_jit_ctx *ctx, bool is64, uint8_t rt, uint8_t rn)
+{
+	uint32_t insn;
+
+	insn = 0x885f7c00;
+	insn |= (!!is64) << 30;
+	insn |= rn << 5;
+	insn |= rt;
+
+	emit_insn(ctx, insn, check_reg(rt) || check_reg(rn));
+}
+
+static void
+emit_stxr(struct a64_jit_ctx *ctx, bool is64, uint8_t rs, uint8_t rt,
+	  uint8_t rn)
+{
+	uint32_t insn;
+
+	insn = 0x88007c00;
+	insn |= (!!is64) << 30;
+	insn |= rs << 16;
+	insn |= rn << 5;
+	insn |= rt;
+
+	emit_insn(ctx, insn, check_reg(rs) || check_reg(rt) || check_reg(rn));
+}
+
+static int
+has_atomics(void)
+{
+	int rc = 0;
+
+#if defined(__ARM_FEATURE_ATOMICS) || defined(RTE_ARM_FEATURE_ATOMICS)
+	rc = 1;
+#endif
+	return rc;
+}
+
+static void
+emit_xadd(struct a64_jit_ctx *ctx, uint8_t op, uint8_t tmp1, uint8_t tmp2,
+	  uint8_t tmp3, uint8_t dst, int16_t off, uint8_t src)
+{
+	bool is64 = (BPF_SIZE(op) == EBPF_DW);
+	uint8_t rn;
+
+	if (off) {
+		emit_mov_imm(ctx, 1, tmp1, off);
+		emit_add(ctx, 1, tmp1, dst);
+		rn = tmp1;
+	} else {
+		rn = dst;
+	}
+
+	if (has_atomics()) {
+		emit_stadd(ctx, is64, src, rn);
+	} else {
+		emit_ldxr(ctx, is64, tmp2, rn);
+		emit_add(ctx, is64, tmp2, src);
+		emit_stxr(ctx, is64, tmp3, tmp2, rn);
+		emit_cbnz(ctx, is64, tmp3, -3);
+	}
+}
+
 static void
 check_program_has_call(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
 {
@@ -863,7 +940,7 @@ check_program_has_call(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
 static int
 emit(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
 {
-	uint8_t op, dst, src, tmp1, tmp2;
+	uint8_t op, dst, src, tmp1, tmp2, tmp3;
 	const struct ebpf_insn *ins;
 	uint64_t u64;
 	int16_t off;
@@ -878,6 +955,7 @@ emit(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
 	ctx->stack_sz = RTE_ALIGN_MUL_CEIL(bpf->stack_sz, 16);
 	tmp1 = ebpf_to_a64_reg(ctx, TMP_REG_1);
 	tmp2 = ebpf_to_a64_reg(ctx, TMP_REG_2);
+	tmp3 = ebpf_to_a64_reg(ctx, TMP_REG_3);
 
 	emit_prologue(ctx);
 
@@ -1067,6 +1145,11 @@ emit(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
 			emit_mov_imm(ctx, 1, tmp2, off);
 			emit_str(ctx, BPF_SIZE(op), tmp1, dst, tmp2);
 			break;
+		/* STX XADD: lock *(size *)(dst + off) += src */
+		case (BPF_STX | EBPF_XADD | BPF_W):
+		case (BPF_STX | EBPF_XADD | EBPF_DW):
+			emit_xadd(ctx, op, tmp1, tmp2, tmp3, dst, off, src);
+			break;
 		/* Return r0 */
 		case (BPF_JMP | EBPF_EXIT):
 			emit_epilogue(ctx);
-- 
2.23.0



* [dpdk-dev]  [PATCH 8/8] bpf/arm64: add branch operation
  2019-09-03 10:59 [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support jerinj
                   ` (6 preceding siblings ...)
  2019-09-03 10:59 ` [dpdk-dev] [PATCH 7/8] bpf/arm64: add atomic-exchange-and-add operation jerinj
@ 2019-09-03 10:59 ` jerinj
  2019-09-24 17:03 ` [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support Ananyev, Konstantin
  2019-10-03 12:51 ` Thomas Monjalon
  9 siblings, 0 replies; 32+ messages in thread
From: jerinj @ 2019-09-03 10:59 UTC (permalink / raw)
  To: dev
  Cc: konstantin.ananyev, honnappa.nagarahalli, thomas, gavin.hu, Jerin Jacob

From: Jerin Jacob <jerinj@marvell.com>

Add branch and call operations.
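
As an illustration of the conditional branch encoding, here is a
minimal standalone sketch of what emit_b_cond() in this patch computes.
The function name a64_encode_b_cond() is hypothetical; the offset is
counted in instructions, not bytes, and must fit in a signed 19-bit
field.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode "b.<cond> <off>", where cond is an arm64
 * condition code (0x0 = EQ, 0x1 = NE, ...) and off is a signed
 * instruction offset relative to the branch itself.
 */
static uint32_t
a64_encode_b_cond(uint8_t cond, int32_t off)
{
	assert(cond < 0xe);                            /* AL is rejected */
	assert(off >= -(1 << 18) && off < (1 << 18));  /* signed 19 bits */

	return (0x15u << 26) |
	       (((uint32_t)off & 0x7ffffu) << 5) |     /* imm19 */
	       cond;
}
```

With these assumptions, a64_encode_b_cond(0x0, 2) gives 0x54000040
("b.eq" two instructions forward) and a64_encode_b_cond(0x1, -3) gives
0x54ffffa1 ("b.ne" three instructions back).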

The jump_offset_* APIs are used to find the relative jump offset
with respect to the current eBPF program PC.
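
The offset lookup can be sketched with a small table. This is a
simplified model of jump_offset_get() from this patch: struct off_map,
rel_jump() and example_map are hypothetical names, and the table
contents are made up for the example. Each entry records the arm64
instruction index at which an eBPF instruction starts, plus the
distance from that start to the branch instruction inside its
expansion.

```c
#include <assert.h>
#include <stdint.h>

struct off_map {
	int32_t off;      /* arm64 index where this eBPF insn starts */
	int32_t off_to_b; /* distance from "off" to the emitted branch */
};

/* Relative arm64 offset for a jump from eBPF insn "from" by "offset". */
static int32_t
rel_jump(const struct off_map *map, uint32_t from, int16_t offset)
{
	int32_t a64_from = map[from].off + map[from].off_to_b;
	int32_t a64_to = map[from + offset + 1].off;

	/* Target not emitted yet: return the placeholder unchanged. */
	if (a64_to == INT32_MAX)
		return a64_to;

	return a64_to - a64_from;
}

/*
 * Made-up program layout:
 *   insn 0: conditional jump, expands to mov + cmp + b.cond (branch at +2)
 *   insn 1: one arm64 instruction
 *   insn 2: unconditional jump, one arm64 instruction
 *   insn 3: not emitted yet (still holds the fake offset)
 */
static const struct off_map example_map[] = {
	{ .off = 0, .off_to_b = 2 },
	{ .off = 3, .off_to_b = 0 },
	{ .off = 4, .off_to_b = 0 },
	{ .off = INT32_MAX, .off_to_b = 0 },
};
```

Here rel_jump(example_map, 0, 1) targets eBPF instruction 2 and returns
4 - 2 = 2, while rel_jump(example_map, 2, -3) targets instruction 0 and
returns 0 - 4 = -4. A target that still holds the fake offset returns
INT32_MAX, which is why a second pass is needed for forward jumps.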

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
 lib/librte_bpf/bpf_jit_arm64.c | 229 +++++++++++++++++++++++++++++++++
 1 file changed, 229 insertions(+)

diff --git a/lib/librte_bpf/bpf_jit_arm64.c b/lib/librte_bpf/bpf_jit_arm64.c
index 62fa6a505..8882fee67 100644
--- a/lib/librte_bpf/bpf_jit_arm64.c
+++ b/lib/librte_bpf/bpf_jit_arm64.c
@@ -105,6 +105,112 @@ check_invalid_args(struct a64_jit_ctx *ctx, uint32_t limit)
 	return 0;
 }
 
+static int
+jump_offset_init(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
+{
+	uint32_t i;
+
+	ctx->map = malloc(bpf->prm.nb_ins * sizeof(ctx->map[0]));
+	if (ctx->map == NULL)
+		return -ENOMEM;
+
+	/* Fill with fake offsets */
+	for (i = 0; i != bpf->prm.nb_ins; i++) {
+		ctx->map[i].off = INT32_MAX;
+		ctx->map[i].off_to_b = 0;
+	}
+	return 0;
+}
+
+static void
+jump_offset_fini(struct a64_jit_ctx *ctx)
+{
+	free(ctx->map);
+}
+
+static void
+jump_offset_update(struct a64_jit_ctx *ctx, uint32_t ebpf_idx)
+{
+	if (is_first_pass(ctx))
+		ctx->map[ebpf_idx].off = ctx->idx;
+}
+
+static void
+jump_offset_to_branch_update(struct a64_jit_ctx *ctx, uint32_t ebpf_idx)
+{
+	if (is_first_pass(ctx))
+		ctx->map[ebpf_idx].off_to_b = ctx->idx - ctx->map[ebpf_idx].off;
+
+}
+
+static int32_t
+jump_offset_get(struct a64_jit_ctx *ctx, uint32_t from, int16_t offset)
+{
+	int32_t a64_from, a64_to;
+
+	a64_from = ctx->map[from].off +  ctx->map[from].off_to_b;
+	a64_to = ctx->map[from + offset + 1].off;
+
+	if (a64_to == INT32_MAX)
+		return a64_to;
+
+	return a64_to - a64_from;
+}
+
+enum a64_cond_e {
+	A64_EQ = 0x0, /* == */
+	A64_NE = 0x1, /* != */
+	A64_CS = 0x2, /* Unsigned >= */
+	A64_CC = 0x3, /* Unsigned < */
+	A64_MI = 0x4, /* < 0 */
+	A64_PL = 0x5, /* >= 0 */
+	A64_VS = 0x6, /* Overflow */
+	A64_VC = 0x7, /* No overflow */
+	A64_HI = 0x8, /* Unsigned > */
+	A64_LS = 0x9, /* Unsigned <= */
+	A64_GE = 0xa, /* Signed >= */
+	A64_LT = 0xb, /* Signed < */
+	A64_GT = 0xc, /* Signed > */
+	A64_LE = 0xd, /* Signed <= */
+	A64_AL = 0xe, /* Always */
+};
+
+static int
+check_cond(uint8_t cond)
+{
+	return (cond >= A64_AL) ? 1 : 0;
+}
+
+static uint8_t
+ebpf_to_a64_cond(uint8_t op)
+{
+	switch (BPF_OP(op)) {
+	case BPF_JEQ:
+		return A64_EQ;
+	case BPF_JGT:
+		return A64_HI;
+	case EBPF_JLT:
+		return A64_CC;
+	case BPF_JGE:
+		return A64_CS;
+	case EBPF_JLE:
+		return A64_LS;
+	case BPF_JSET:
+	case EBPF_JNE:
+		return A64_NE;
+	case EBPF_JSGT:
+		return A64_GT;
+	case EBPF_JSLT:
+		return A64_LT;
+	case EBPF_JSGE:
+		return A64_GE;
+	case EBPF_JSLE:
+		return A64_LE;
+	default:
+		return UINT8_MAX;
+	}
+}
+
 /* Emit an instruction */
 static inline void
 emit_insn(struct a64_jit_ctx *ctx, uint32_t insn, int error)
@@ -525,6 +631,17 @@ emit_mod(struct a64_jit_ctx *ctx, bool is64, uint8_t tmp, uint8_t rd,
 	emit_msub(ctx, is64, rd, tmp, rm, rd);
 }
 
+static void
+emit_blr(struct a64_jit_ctx *ctx, uint8_t rn)
+{
+	uint32_t insn;
+
+	insn = 0xd63f0000;
+	insn |= rn << 5;
+
+	emit_insn(ctx, insn, check_reg(rn));
+}
+
 static void
 emit_zero_extend(struct a64_jit_ctx *ctx, uint8_t rd, int32_t imm)
 {
@@ -799,6 +916,16 @@ emit_epilogue(struct a64_jit_ctx *ctx)
 		emit_epilogue_no_call(ctx);
 }
 
+static void
+emit_call(struct a64_jit_ctx *ctx, uint8_t tmp, void *func)
+{
+	uint8_t r0 = ebpf_to_a64_reg(ctx, EBPF_REG_0);
+
+	emit_mov_imm(ctx, 1, tmp, (uint64_t)func);
+	emit_blr(ctx, tmp);
+	emit_mov_64(ctx, r0, A64_R(0));
+}
+
 static void
 emit_cbnz(struct a64_jit_ctx *ctx, bool is64, uint8_t rt, int32_t imm19)
 {
@@ -914,6 +1041,54 @@ emit_xadd(struct a64_jit_ctx *ctx, uint8_t op, uint8_t tmp1, uint8_t tmp2,
 	}
 }
 
+#define A64_CMP 0x6b00000f
+#define A64_TST 0x6a00000f
+static void
+emit_cmp_tst(struct a64_jit_ctx *ctx, bool is64, uint8_t rn, uint8_t rm,
+	     uint32_t opc)
+{
+	uint32_t insn;
+
+	insn = opc;
+	insn |= (!!is64) << 31;
+	insn |= rm << 16;
+	insn |= rn << 5;
+
+	emit_insn(ctx, insn, check_reg(rn) || check_reg(rm));
+}
+
+static void
+emit_cmp(struct a64_jit_ctx *ctx, bool is64, uint8_t rn, uint8_t rm)
+{
+	emit_cmp_tst(ctx, is64, rn, rm, A64_CMP);
+}
+
+static void
+emit_tst(struct a64_jit_ctx *ctx, bool is64, uint8_t rn, uint8_t rm)
+{
+	emit_cmp_tst(ctx, is64, rn, rm, A64_TST);
+}
+
+static void
+emit_b_cond(struct a64_jit_ctx *ctx, uint8_t cond, int32_t imm19)
+{
+	uint32_t insn, imm;
+
+	imm = mask_imm(19, imm19);
+	insn = 0x15 << 26;
+	insn |= imm << 5;
+	insn |= cond;
+
+	emit_insn(ctx, insn, check_cond(cond) || check_imm(19, imm19));
+}
+
+static void
+emit_branch(struct a64_jit_ctx *ctx, uint8_t op, uint32_t i, int16_t off)
+{
+	jump_offset_to_branch_update(ctx, i);
+	emit_b_cond(ctx, ebpf_to_a64_cond(op), jump_offset_get(ctx, i, off));
+}
+
 static void
 check_program_has_call(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
 {
@@ -961,6 +1136,7 @@ emit(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
 
 	for (i = 0; i != bpf->prm.nb_ins; i++) {
 
+		jump_offset_update(ctx, i);
 		ins = bpf->prm.ins + i;
 		op = ins->code;
 		off = ins->off;
@@ -1150,6 +1326,52 @@ emit(struct a64_jit_ctx *ctx, struct rte_bpf *bpf)
 		case (BPF_STX | EBPF_XADD | EBPF_DW):
 			emit_xadd(ctx, op, tmp1, tmp2, tmp3, dst, off, src);
 			break;
+		/* PC += off */
+		case (BPF_JMP | BPF_JA):
+			emit_b(ctx, jump_offset_get(ctx, i, off));
+			break;
+		/* PC += off if dst COND imm */
+		case (BPF_JMP | BPF_JEQ | BPF_K):
+		case (BPF_JMP | EBPF_JNE | BPF_K):
+		case (BPF_JMP | BPF_JGT | BPF_K):
+		case (BPF_JMP | EBPF_JLT | BPF_K):
+		case (BPF_JMP | BPF_JGE | BPF_K):
+		case (BPF_JMP | EBPF_JLE | BPF_K):
+		case (BPF_JMP | EBPF_JSGT | BPF_K):
+		case (BPF_JMP | EBPF_JSLT | BPF_K):
+		case (BPF_JMP | EBPF_JSGE | BPF_K):
+		case (BPF_JMP | EBPF_JSLE | BPF_K):
+			emit_mov_imm(ctx, 1, tmp1, imm);
+			emit_cmp(ctx, 1, dst, tmp1);
+			emit_branch(ctx, op, i, off);
+			break;
+		case (BPF_JMP | BPF_JSET | BPF_K):
+			emit_mov_imm(ctx, 1, tmp1, imm);
+			emit_tst(ctx, 1, dst, tmp1);
+			emit_branch(ctx, op, i, off);
+			break;
+		/* PC += off if dst COND src */
+		case (BPF_JMP | BPF_JEQ | BPF_X):
+		case (BPF_JMP | EBPF_JNE | BPF_X):
+		case (BPF_JMP | BPF_JGT | BPF_X):
+		case (BPF_JMP | EBPF_JLT | BPF_X):
+		case (BPF_JMP | BPF_JGE | BPF_X):
+		case (BPF_JMP | EBPF_JLE | BPF_X):
+		case (BPF_JMP | EBPF_JSGT | BPF_X):
+		case (BPF_JMP | EBPF_JSLT | BPF_X):
+		case (BPF_JMP | EBPF_JSGE | BPF_X):
+		case (BPF_JMP | EBPF_JSLE | BPF_X):
+			emit_cmp(ctx, 1, dst, src);
+			emit_branch(ctx, op, i, off);
+			break;
+		case (BPF_JMP | BPF_JSET | BPF_X):
+			emit_tst(ctx, 1, dst, src);
+			emit_branch(ctx, op, i, off);
+			break;
+		/* Call imm */
+		case (BPF_JMP | EBPF_CALL):
+			emit_call(ctx, tmp1, bpf->prm.xsym[ins->imm].func.val);
+			break;
 		/* Return r0 */
 		case (BPF_JMP | EBPF_EXIT):
 			emit_epilogue(ctx);
@@ -1179,6 +1401,11 @@ bpf_jit_arm64(struct rte_bpf *bpf)
 	/* Init JIT context */
 	memset(&ctx, 0, sizeof(ctx));
 
+	/* Initialize the memory for eBPF to a64 insn offset map for jump */
+	rc = jump_offset_init(&ctx, bpf);
+	if (rc)
+		goto error;
+
 	/* Find eBPF program has call class or not */
 	check_program_has_call(&ctx, bpf);
 
@@ -1218,5 +1445,7 @@ bpf_jit_arm64(struct rte_bpf *bpf)
 munmap:
 	munmap(ctx.ins, size);
 finish:
+	jump_offset_fini(&ctx);
+error:
 	return rc;
 }
-- 
2.23.0



* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-09-03 10:59 [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support jerinj
                   ` (7 preceding siblings ...)
  2019-09-03 10:59 ` [dpdk-dev] [PATCH 8/8] bpf/arm64: add branch operation jerinj
@ 2019-09-24 17:03 ` Ananyev, Konstantin
  2019-10-12 12:22   ` Thomas Monjalon
  2019-10-03 12:51 ` Thomas Monjalon
  9 siblings, 1 reply; 32+ messages in thread
From: Ananyev, Konstantin @ 2019-09-24 17:03 UTC (permalink / raw)
  To: jerinj, dev; +Cc: honnappa.nagarahalli, thomas, gavin.hu



> 
> Added eBPF arm64 JIT support to improve the eBPF program performance
> on arm64.
> 
> dpdk.org/examples/bpf/t1.c application shows around 50% improvement
> on OCTEON TX2 platform in JIT vs interpreter mode.
> 
> Verified the implementation using existing bpf_autotest application.
> 
> # echo "bpf_autotest" | sudo ./build/app/test -c 0x3
> 
> RTE>>bpf_autotest
> run_test(test_store1) start
> run_test(test_store2) start
> run_test(test_load1) start
> run_test(test_ldimm1) start
> run_test(test_mul1) start
> run_test(test_shift1) start
> run_test(test_jump1) start
> run_test(test_alu1) start
> run_test(test_bele1) start
> run_test(test_xadd1) start
> run_test(test_div1) start
> bpf_exec(0xffffa37c0000): division by 0 at pc: 0x68;
> run_test(test_call1) start
> run_test(test_call2) start
> run_test(test_call3) start
> Test OK
> 
> Jerin Jacob (8):
>   bpf/arm64: add build infrastructure
>   bpf/arm64: add prologue and epilogue
>   bpf/arm64: add basic arithmetic operations
>   bpf/arm64: add logical operations
>   bpf/arm64: add byte swap operations
>   bpf/arm64: add load and store operations
>   bpf/arm64: add atomic-exchange-and-add operation
>   bpf/arm64: add branch operation
> 
>  MAINTAINERS                            |    1 +
>  doc/guides/prog_guide/bpf_lib.rst      |    2 +-
>  doc/guides/rel_notes/release_19_11.rst |    5 +
>  lib/librte_bpf/Makefile                |    2 +
>  lib/librte_bpf/bpf.c                   |    4 +-
>  lib/librte_bpf/bpf_impl.h              |    3 +-
>  lib/librte_bpf/bpf_jit_arm64.c         | 1451 ++++++++++++++++++++++++
>  lib/librte_bpf/meson.build             |    2 +
>  8 files changed, 1466 insertions(+), 4 deletions(-)
>  create mode 100644 lib/librte_bpf/bpf_jit_arm64.c
> 
> --

Series Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

> 2.23.0



* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-09-03 10:59 [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support jerinj
                   ` (8 preceding siblings ...)
  2019-09-24 17:03 ` [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support Ananyev, Konstantin
@ 2019-10-03 12:51 ` Thomas Monjalon
  2019-10-03 13:07   ` Jerin Jacob
  9 siblings, 1 reply; 32+ messages in thread
From: Thomas Monjalon @ 2019-10-03 12:51 UTC (permalink / raw)
  To: jerinj, konstantin.ananyev; +Cc: dev, honnappa.nagarahalli, gavin.hu

03/09/2019 12:59, jerinj@marvell.com:
> Added eBPF arm64 JIT support to improve the eBPF program performance
> on arm64.
> 
>  lib/librte_bpf/bpf_jit_arm64.c         | 1451 ++++++++++++++++++++++++

I am concerned about duplicating the BPF JIT effort in DPDK and Linux.
Could we try to pull the Linux JIT?
Is the license the only issue?

After a quick discussion, it seems the Linux authors are OK to arrange
their JIT code for sharing with userspace projects.




* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-10-03 12:51 ` Thomas Monjalon
@ 2019-10-03 13:07   ` Jerin Jacob
  2019-10-03 15:05     ` Ananyev, Konstantin
  0 siblings, 1 reply; 32+ messages in thread
From: Jerin Jacob @ 2019-10-03 13:07 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Jerin Jacob, Ananyev, Konstantin, dpdk-dev, Honnappa Nagarahalli,
	Gavin Hu

On Thu, Oct 3, 2019 at 6:21 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 03/09/2019 12:59, jerinj@marvell.com:
> > Added eBPF arm64 JIT support to improve the eBPF program performance
> > on arm64.
> >
> >  lib/librte_bpf/bpf_jit_arm64.c         | 1451 ++++++++++++++++++++++++
>
> I am concerned about duplicating the BPF JIT effort in DPDK and Linux.
> Could we try to pull the Linux JIT?
> Is the license the only issue?

That's one issue.

>
> After a quick discussion, it seems the Linux authors are OK to arrange
> their JIT code for sharing with userspace projects.

I did a clean-room implementation with some DPDK-specific optimizations
(for example, if the stack is not used, then nothing is pushed to it),
and wherever Linux could be improved, I have submitted patches to
Linux as well (some more are pending).

https://github.com/torvalds/linux/commit/504792e07a44844f24e9d79913e4a2f8373cd332

Linux also has its own framework for instruction generation, debugging,
etc., so we cannot copy and paste the code
from Linux as is.

My view is to keep a separate code base optimized for DPDK use cases and
library requirements (for example, tail calls are not supported in DPDK).
For the arm64/x86 case the code is already done, so it is not worth
syncing with Linux. For a new architecture, it can be shared if possible.

Konstantin,
Your thoughts?

>
>


* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-10-03 13:07   ` Jerin Jacob
@ 2019-10-03 15:05     ` Ananyev, Konstantin
  2019-10-04  4:55       ` Honnappa Nagarahalli
  2019-10-04 15:39       ` Jerin Jacob
  0 siblings, 2 replies; 32+ messages in thread
From: Ananyev, Konstantin @ 2019-10-03 15:05 UTC (permalink / raw)
  To: Jerin Jacob, Thomas Monjalon
  Cc: Jerin Jacob, dpdk-dev, Honnappa Nagarahalli, Gavin Hu

Hi everyone,

> 
> On Thu, Oct 3, 2019 at 6:21 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 03/09/2019 12:59, jerinj@marvell.com:
> > > Added eBPF arm64 JIT support to improve the eBPF program performance
> > > on arm64.
> > >
> > >  lib/librte_bpf/bpf_jit_arm64.c         | 1451 ++++++++++++++++++++++++
> >
> > I am concerned about duplicating the BPF JIT effort in DPDK and Linux.
> > Could we try to pull the Linux JIT?
> > Is the license the only issue?
> 
> That's one issue.
> 
> >
> > After a quick discussion, it seems the Linux authors are OK to arrange
> > their JIT code for sharing with userspace projects.
> 
> I did a clean room implementation considering some optimization for
> DPDK etc(Like if stack is not used then don't push stack etc)
> and wherever Linux can be improved, I have submitted the patch also to
> Linux as well.(Some more pending as well)
> 
> https://github.com/torvalds/linux/commit/504792e07a44844f24e9d79913e4a2f8373cd332
> 
> And Linux has a framework for instruction generation for debugging
> etc. So We can not copy and paste the code
> from Linux as is.
> 
> My view to keep a different code base optimize for DPDK use cases and
> library requirements(for example, tail call is not supported in DPDK).
> For arm64/x86 case the code is done so it is not worth sync with
> Linux. For new architecture, it can be if possible.
> 
> Konstantin,
> Your thoughts?
> 

My thought is that since we already have a JIT eBPF compiler in DPDK
for one arch (x86), there is absolutely no reason why we shouldn't allow it for a different arch (arm).
About having a common code base with the Linux eBPF JIT implementation:
I think it is a very good idea,
but I don't think it could be achieved without significant effort.
The DPDK and Linux JIT code generators differ quite a bit.
So my suggestion: let's go ahead and integrate Jerin's patch into 19.11,
and meanwhile start talking with the Linux guys about how a common JIT code base could be achieved.
Konstantin






* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-10-03 15:05     ` Ananyev, Konstantin
@ 2019-10-04  4:55       ` Honnappa Nagarahalli
  2019-10-04  9:54         ` Steve Capper
  2019-10-04 15:39       ` Jerin Jacob
  1 sibling, 1 reply; 32+ messages in thread
From: Honnappa Nagarahalli @ 2019-10-04  4:55 UTC (permalink / raw)
  To: Ananyev, Konstantin, Jerin Jacob, thomas, Rodolph Perfetta, Steve Capper
  Cc: jerinj, dpdk-dev, Gavin Hu (Arm Technology China),
	Honnappa Nagarahalli, nd, nd

Adding Arm JIT and Kernel experts

> -----Original Message-----
> From: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Sent: Thursday, October 3, 2019 10:06 AM
> To: Jerin Jacob <jerinjacobk@gmail.com>; thomas@monjalon.net
> Cc: jerinj@marvell.com; dpdk-dev <dev@dpdk.org>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Gavin Hu (Arm Technology China)
> <Gavin.Hu@arm.com>
> Subject: RE: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
> 
> Hi everyone,
> 
> >
> > On Thu, Oct 3, 2019 at 6:21 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 03/09/2019 12:59, jerinj@marvell.com:
> > > > Added eBPF arm64 JIT support to improve the eBPF program
> > > > performance on arm64.
> > > >
> > > >  lib/librte_bpf/bpf_jit_arm64.c         | 1451 ++++++++++++++++++++++++
> > >
> > > I am concerned about duplicating the BPF JIT effort in DPDK and Linux.
> > > Could we try to pull the Linux JIT?
> > > Is the license the only issue?
> >
> > That's one issue.
> >
> > >
> > > After a quick discussion, it seems the Linux authors are OK to
> > > arrange their JIT code for sharing with userspace projects.
> >
> > I did a clean room implementation considering some optimization for
> > DPDK etc(Like if stack is not used then don't push stack etc) and
> > wherever Linux can be improved, I have submitted the patch also to
> > Linux as well.(Some more pending as well)
> >
> >
> > https://github.com/torvalds/linux/commit/504792e07a44844f24e9d79913e4a2f8373cd332
> >
> > And Linux has a framework for instruction generation for debugging
> > etc. So We can not copy and paste the code from Linux as is.
> >
> > My view to keep a different code base optimize for DPDK use cases and
> > library requirements(for example, tail call is not supported in DPDK).
> > For arm64/x86 case the code is done so it is not worth sync with
> > Linux. For new architecture, it can be if possible.
> >
> > Konstantin,
> > Your thoughts?
> >
> 
> My thought would be that if we have JIT eBPF compiler already in DPDK for
> one arch (x86) there is absolutely no reason why we shouldn't allow it for
> different arch (arm).
> About having a common code-base with Linux eBPF JITs implementation - I
> think it is a very good idea, but I don't think it could be achieved without
> significant effort.
> DPDK and Linux JIT code-generators differ quite a bit.
> So my suggestion - let's go ahead and integrate Jerin patch into 19.11,
> meanwhile start talking with linux guys how common JIT code-base could be
> achieved.
> Konstantin
> 
> 
> 



* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-10-04  4:55       ` Honnappa Nagarahalli
@ 2019-10-04  9:54         ` Steve Capper
  2019-10-04 10:53           ` Thomas Monjalon
  0 siblings, 1 reply; 32+ messages in thread
From: Steve Capper @ 2019-10-04  9:54 UTC (permalink / raw)
  To: Honnappa Nagarahalli
  Cc: Ananyev, Konstantin, Jerin Jacob, thomas, Rodolph Perfetta,
	jerinj, dpdk-dev, Gavin Hu (Arm Technology China),
	nd

On Fri, Oct 04, 2019 at 05:55:18AM +0100, Honnappa Nagarahalli wrote:
> Adding Arm JIT and Kernel experts

Hi Honnappa,
I'd recommend also reaching out to the BPF maintainers:
BPF JIT for ARM64
M:	Daniel Borkmann <daniel@iogearbox.net>
M:	Alexei Starovoitov <ast@kernel.org>
M:	Zi Shen Lim <zlim.lnx@gmail.com>
L:	netdev@vger.kernel.org
L:	bpf@vger.kernel.org
S:	Supported
F:	arch/arm64/net/

As they will have much better knowledge of the state of play and will be
better able to advise.

Cheers,
-- 
Steve


* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-10-04  9:54         ` Steve Capper
@ 2019-10-04 10:53           ` Thomas Monjalon
  2019-10-04 14:09             ` Daniel Borkmann
  0 siblings, 1 reply; 32+ messages in thread
From: Thomas Monjalon @ 2019-10-04 10:53 UTC (permalink / raw)
  To: Steve Capper, Ananyev, Konstantin, Jerin Jacob
  Cc: Honnappa Nagarahalli, Rodolph Perfetta, jerinj, dev,
	Gavin Hu (Arm Technology China),
	nd, Alexei Starovoitov, Daniel Borkmann, Quentin Monnet

04/10/2019 11:54, Steve Capper:
> I'd recommend also reaching out the BPF maintainers:
> BPF JIT for ARM64
> M:	Daniel Borkmann <daniel@iogearbox.net>
> M:	Alexei Starovoitov <ast@kernel.org>
> M:	Zi Shen Lim <zlim.lnx@gmail.com>
> L:	netdev@vger.kernel.org
> L:	bpf@vger.kernel.org
> S:	Supported
> F:	arch/arm64/net/
> 
> As they will have much better knowledge of the state of play and will be
> better able to advise.

As far as I know Alexei and Daniel are OK with the idea.
But better to let them reply here.

I suggest we think about a way to package the kernel BPF JIT
for userspace usage (not only DPDK) as a library.
I don't understand why the DPDK JIT should be different
or optimized differently.
The only real issue I see is the need for a dual licensing BSD-GPL.




* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-10-04 10:53           ` Thomas Monjalon
@ 2019-10-04 14:09             ` Daniel Borkmann
  2019-10-04 14:43               ` Jerin Jacob
  0 siblings, 1 reply; 32+ messages in thread
From: Daniel Borkmann @ 2019-10-04 14:09 UTC (permalink / raw)
  To: Thomas Monjalon, Steve Capper, Ananyev, Konstantin, Jerin Jacob
  Cc: Honnappa Nagarahalli, Rodolph Perfetta, jerinj, dev,
	Gavin Hu (Arm Technology China),
	nd, Alexei Starovoitov, Quentin Monnet, john.fastabend

On 10/4/19 12:53 PM, Thomas Monjalon wrote:
> 04/10/2019 11:54, Steve Capper:
>> I'd recommend also reaching out the BPF maintainers:
>> BPF JIT for ARM64
>> M:	Daniel Borkmann <daniel@iogearbox.net>
>> M:	Alexei Starovoitov <ast@kernel.org>
>> M:	Zi Shen Lim <zlim.lnx@gmail.com>
>> L:	netdev@vger.kernel.org
>> L:	bpf@vger.kernel.org
>> S:	Supported
>> F:	arch/arm64/net/
>>
>> As they will have much better knowledge of the state of play and will be
>> better able to advise.
> 
> As far as I know Alexei and Daniel are OK with the idea.
> But better to let them reply here.
> 
> I suggest we think about a way to package the kernel BPF JIT
> for userspace usage (not only DPDK) as a library.
> I don't understand why the DPDK JIT should be different
> or optimized differently.

That would be great indeed as both projects would benefit from a shared
JIT instead of reimplementing everything twice. I never looked into DPDK
too much, but I presume the idea would be as well to take the LLVM (or
bpf-gcc) generated object file and load it into a BPF 'engine' that sits
in user space on top of DPDK? Presumably loader could be libbpf here as
well since it already knows how to parse the ELF, perform the relocations
etc. The only difference would be that you have a different context and
different helpers? Is that the goal eventually?

> The only real issue I see is the need for a dual licensing BSD-GPL.

This might be one avenue if all kernel JIT contributors would be on board.
Another option I'm wondering could be to extend the bpf() syscall in order
to pass down a description of context and helper mappings e.g. via BTF and
let everything go through the verifier in the kernel the usual way (I presume
one goal might be that you want to assure that the generated BPF code passes
the safety checks before running the prog), then have it JITed and extract
the generated image in order to use it from user space. Kernel would have
to make sure it never actually allows attaching this program in the kernel.
Generated opcodes can already be retrieved today (see below). Such infra
could potentially help bpf-gcc folks as well as they expressed desire to
have some sort of a simulator for their gcc BPF test suite.. and it would
allow for consistent behavior of the BPF runtime. Just a thought.

# bpftool prog
2: cgroup_skb  tag 7be49e3934a125ba  gpl
         loaded_at 2019-10-03T12:53:11+0200  uid 0
         xlated 296B  jited 229B  memlock 4096B  map_ids 2,3
[...]

# bpftool prog dump xlated id 2
    0: (bf) r6 = r1
    1: (69) r7 = *(u16 *)(r6 +192)
    2: (b4) w8 = 0
    3: (55) if r7 != 0x8 goto pc+14
    4: (bf) r1 = r6
    5: (b4) w2 = 16
    6: (bf) r3 = r10
    7: (07) r3 += -4
    8: (b4) w4 = 4
    9: (85) call bpf_skb_load_bytes#7484768
   10: (18) r1 = map[id:2]
   12: (bf) r2 = r10
   13: (07) r2 += -8
   14: (62) *(u32 *)(r2 +0) = 32
   15: (85) call trie_lookup_elem#90800
   16: (15) if r0 == 0x0 goto pc+1
   17: (44) w8 |= 2
   18: (55) if r7 != 0xdd86 goto pc+14
   19: (bf) r1 = r6
   20: (b4) w2 = 24
   21: (bf) r3 = r10
   22: (07) r3 += -16
   23: (b4) w4 = 16
   24: (85) call bpf_skb_load_bytes#7484768
   25: (18) r1 = map[id:3]
   27: (bf) r2 = r10
   28: (07) r2 += -20
   29: (62) *(u32 *)(r2 +0) = 128
   30: (85) call trie_lookup_elem#90800
   31: (15) if r0 == 0x0 goto pc+1
   32: (44) w8 |= 2
   33: (b7) r0 = 1
   34: (55) if r8 != 0x2 goto pc+1
   35: (b7) r0 = 0
   36: (95) exit
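
For anyone reading the xlated dump by hand: the hex value in parentheses is
the eBPF opcode byte. Its low three bits give the instruction class, and
bits 3-4 give the access size for loads and stores. Jump offsets are relative
to the next instruction. A minimal sketch, assuming the standard eBPF
encoding; the helper and enum names below are made up for illustration and
are not part of bpftool or the kernel:

```c
#include <stdint.h>

/* Instruction classes, taken from the low three bits of the opcode byte. */
enum {
	CLS_LD = 0, CLS_LDX = 1, CLS_ST = 2, CLS_STX = 3,
	CLS_ALU = 4, CLS_JMP = 5, CLS_JMP32 = 6, CLS_ALU64 = 7
};

/* Hypothetical helper: extract the instruction class. */
static unsigned int insn_class(uint8_t opcode)
{
	return opcode & 0x07;
}

/* Hypothetical helper: access size in bytes for load/store opcodes
 * (bits 3-4: W=0x00, H=0x08, B=0x10, DW=0x18). */
static unsigned int mem_size_bytes(uint8_t opcode)
{
	switch (opcode & 0x18) {
	case 0x00: return 4;
	case 0x08: return 2;
	case 0x10: return 1;
	default:   return 8;
	}
}

/* Jump offsets count instruction slots from the instruction after the jump. */
static int jump_target(int pc, int16_t off)
{
	return pc + 1 + off;
}
```

With these, (69) decodes as a 2-byte LDX load, which matches the u16 read at
insn 1, and "goto pc+14" at insn 3 lands on insn 18. The (18) opcode is an
8-byte immediate load that occupies two slots, which is why the numbering
skips from 10 to 12.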

# bpftool prog dump jited id 2 opcodes
    0:   push   %rbp
         55
    1:   mov    %rsp,%rbp
         48 89 e5
    4:   sub    $0x40,%rsp
         48 81 ec 40 00 00 00
    b:   sub    $0x28,%rbp
         48 83 ed 28
    f:   mov    %rbx,0x0(%rbp)
         48 89 5d 00
   13:   mov    %r13,0x8(%rbp)
         4c 89 6d 08
   17:   mov    %r14,0x10(%rbp)
         4c 89 75 10
   1b:   mov    %r15,0x18(%rbp)
         4c 89 7d 18
   1f:   xor    %eax,%eax
         31 c0
   21:   mov    %rax,0x20(%rbp)
         48 89 45 20
   25:   mov    %rdi,%rbx
         48 89 fb
   28:   movzwq 0xc0(%rbx),%r13
         4c 0f b7 ab c0 00 00 00
   30:   xor    %r14d,%r14d
         45 31 f6
   33:   cmp    $0x8,%r13
         49 83 fd 08
   37:   jne    0x0000000000000079
         75 40
   39:   mov    %rbx,%rdi
         48 89 df
   3c:   mov    $0x10,%esi
         be 10 00 00 00
[...]
   cb:   jne    0x00000000000000cf
         75 02
   cd:   xor    %eax,%eax
         31 c0
   cf:   mov    0x0(%rbp),%rbx
         48 8b 5d 00
   d3:   mov    0x8(%rbp),%r13
         4c 8b 6d 08
   d7:   mov    0x10(%rbp),%r14
         4c 8b 75 10
   db:   mov    0x18(%rbp),%r15
         4c 8b 7d 18
   df:   add    $0x28,%rbp
         48 83 c5 28
   e3:   leaveq
         c9
   e4:   retq
         c3
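
As a cross-check on the jited dump: the two-byte "75 xx" encodings are short
conditional jumps, where the second byte is a signed displacement from the end
of the instruction. A small sketch of that arithmetic (the function name is
only illustrative):

```c
#include <stdint.h>

/* Target of a 2-byte x86-64 short conditional jump (opcode byte + rel8).
 * The displacement is signed and relative to the next instruction. */
static uint64_t short_jcc_target(uint64_t insn_addr, uint8_t rel8)
{
	return insn_addr + 2 + (uint64_t)(int64_t)(int8_t)rel8;
}
```

For the "75 40" at offset 0x37 this gives 0x79, and for the "75 02" at 0xcb it
gives 0xcf, which agrees with the targets bpftool prints.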

Thanks,
Daniel


* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-10-04 14:09             ` Daniel Borkmann
@ 2019-10-04 14:43               ` Jerin Jacob
  2019-10-05  0:00                 ` Daniel Borkmann
  2020-04-06 11:05                 ` Ananyev, Konstantin
  0 siblings, 2 replies; 32+ messages in thread
From: Jerin Jacob @ 2019-10-04 14:43 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Thomas Monjalon, Steve Capper, Ananyev, Konstantin,
	Honnappa Nagarahalli, Rodolph Perfetta, jerinj, dpdk-dev,
	Gavin Hu (Arm Technology China),
	nd, Alexei Starovoitov, Quentin Monnet, john.fastabend

On Fri, Oct 4, 2019 at 7:39 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 10/4/19 12:53 PM, Thomas Monjalon wrote:
> > 04/10/2019 11:54, Steve Capper:
> >> I'd recommend also reaching out the BPF maintainers:
> >> BPF JIT for ARM64
> >> M:   Daniel Borkmann <daniel@iogearbox.net>
> >> M:   Alexei Starovoitov <ast@kernel.org>
> >> M:   Zi Shen Lim <zlim.lnx@gmail.com>
> >> L:   netdev@vger.kernel.org
> >> L:   bpf@vger.kernel.org
> >> S:   Supported
> >> F:   arch/arm64/net/
> >>
> >> As they will have much better knowledge of the state of play and will be
> >> better able to advise.
> >
> > As far as I know Alexei and Daniel are OK with the idea.
> > But better to let them reply here.
> >
> > I suggest we think about a way to package the kernel BPF JIT
> > for userspace usage (not only DPDK) as a library.
> > I don't understand why the DPDK JIT should be different
> > or optimized differently.
>
> That would be great indeed as both projects would benefit from a shared
> JIT instead of reimplementing everything twice. I never looked into DPDK
> too much, but I presume the idea would be as well to take the LLVM (or
> bpf-gcc) generated object file and load it into a BPF 'engine' that sits
> in user space on top of DPDK? Presumably loader could be libbpf here as
> well since it already knows how to parse the ELF, perform the relocations
> etc. The only difference would be that you have a different context and
> different helpers? Is that the goal eventually?
>
> > The only real issue I see is the need for a dual licensing BSD-GPL.
>
> This might be one avenue if all kernel JIT contributors would be on board.
> Another option I'm wondering could be to extend the bpf() syscall in order
> to pass down a description of context and helper mappings e.g. via BTF and
> let everything go through the verifier in the kernel the usual way (I presume
> one goal might be that you want to assure that the generated BPF code passes
> the safety checks before running the prog), then have it JITed and extract
> the generated image in order to use it from user space. Kernel would have
> to make sure it never actually allows attaching this program in the kernel.
> Generated opcodes can already be retrieved today (see below). Such infra
> could potentially help bpf-gcc folks as well as they expressed desire to
> have some sort of a simulator for their gcc BPF test suite.. and it would
> allow for consistent behavior of the BPF runtime. Just a thought.

This idea looks good. It could also remove the verifier code from DPDK.
A couple of downsides I can think of:

# We may need to extend the kernel verifier to understand user-space addresses
and their symbols for CALL and MEM access operations.
# DPDK supports FreeBSD and Windows as well.
# Old Linux kernels would need different treatment.




>
>


* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-10-03 15:05     ` Ananyev, Konstantin
  2019-10-04  4:55       ` Honnappa Nagarahalli
@ 2019-10-04 15:39       ` Jerin Jacob
  2019-10-07 12:33         ` Thomas Monjalon
  1 sibling, 1 reply; 32+ messages in thread
From: Jerin Jacob @ 2019-10-04 15:39 UTC (permalink / raw)
  To: Ananyev, Konstantin
  Cc: Thomas Monjalon, Jerin Jacob, dpdk-dev, Honnappa Nagarahalli, Gavin Hu

On Thu, Oct 3, 2019 at 8:35 PM Ananyev, Konstantin
<konstantin.ananyev@intel.com> wrote:
>
> Hi everyone,
>
> >
> > On Thu, Oct 3, 2019 at 6:21 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 03/09/2019 12:59, jerinj@marvell.com:
> > > > Added eBPF arm64 JIT support to improve the eBPF program performance
> > > > on arm64.
> > > >
> > > >  lib/librte_bpf/bpf_jit_arm64.c         | 1451 ++++++++++++++++++++++++
> > >
> > > I am concerned about duplicating the BPF JIT effort in DPDK and Linux.
> > > Could we try to pull the Linux JIT?
> > > Is the license the only issue?
> >
> > That's one issue.
> >
> > >
> > > After a quick discussion, it seems the Linux authors are OK to arrange
> > > their JIT code for sharing with userspace projects.
> >
> > I did a clean room implementation considering some optimization for
> > DPDK etc(Like if stack is not used then don't push stack etc)
> > and wherever Linux can be improved, I have submitted the patch also to
> > Linux as well.(Some more pending as well)
> >
> > https://github.com/torvalds/linux/commit/504792e07a44844f24e9d79913e4a2f8373cd332
> >
> > And Linux has a framework for instruction generation for debugging
> > etc. So We can not copy and paste the code
> > from Linux as is.
> >
> > My view to keep a different code base optimize for DPDK use cases and
> > library requirements(for example, tail call is not supported in DPDK).
> > For arm64/x86 case the code is done so it is not worth sync with
> > Linux. For new architecture, it can be if possible.
> >
> > Konstantin,
> > Your thoughts?
> >
>
> My thought would be that if we have JIT eBPF compiler already in DPDK
> for one arch (x86) there is absolutely no reason why we shouldn't allow it for different arch (arm).
> About having a common code-base with Linux eBPF JITs implementation -
> I think it is a very good idea,
> but I don't think it could be achieved without significant effort.
> DPDK and Linux JIT code-generators differ quite a bit.
> So my suggestion - let's go ahead and integrate Jerin patch into 19.11,
> meanwhile start talking with linux guys how common JIT code-base could be achieved.

I agree with Konstantin here.

Thomas,

Please confirm the following:

We will continue the 'advanced' discussion on avoiding code duplication etc.,
and it will take a couple of months to converge (if it happens at all).

Just to be clear, I assume you are OK to merge this code for 19.11 (if
there are no more technical comments on the patch).

I am only afraid of our typical last-minute surprise pattern,
followed by back-and-forth open-ended discussions.

i.e.

# Code submitted before the proposal window
# Gets ACK from Maintainer
# New non-technical concerns start just before RC1










> Konstantin
>
>
>
>


* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-10-04 14:43               ` Jerin Jacob
@ 2019-10-05  0:00                 ` Daniel Borkmann
  2019-10-05 14:39                   ` Jerin Jacob
  2020-04-06 11:05                 ` Ananyev, Konstantin
  1 sibling, 1 reply; 32+ messages in thread
From: Daniel Borkmann @ 2019-10-05  0:00 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Thomas Monjalon, Steve Capper, Ananyev, Konstantin,
	Honnappa Nagarahalli, Rodolph Perfetta, jerinj, dpdk-dev,
	Gavin Hu (Arm Technology China),
	nd, Alexei Starovoitov, Quentin Monnet, john.fastabend

On 10/4/19 4:43 PM, Jerin Jacob wrote:
> On Fri, Oct 4, 2019 at 7:39 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 10/4/19 12:53 PM, Thomas Monjalon wrote:
>>> 04/10/2019 11:54, Steve Capper:
>>>> I'd recommend also reaching out the BPF maintainers:
>>>> BPF JIT for ARM64
>>>> M:   Daniel Borkmann <daniel@iogearbox.net>
>>>> M:   Alexei Starovoitov <ast@kernel.org>
>>>> M:   Zi Shen Lim <zlim.lnx@gmail.com>
>>>> L:   netdev@vger.kernel.org
>>>> L:   bpf@vger.kernel.org
>>>> S:   Supported
>>>> F:   arch/arm64/net/
>>>>
>>>> As they will have much better knowledge of the state of play and will be
>>>> better able to advise.
>>>
>>> As far as I know Alexei and Daniel are OK with the idea.
>>> But better to let them reply here.
>>>
>>> I suggest we think about a way to package the kernel BPF JIT
>>> for userspace usage (not only DPDK) as a library.
>>> I don't understand why the DPDK JIT should be different
>>> or optimized differently.
>>
>> That would be great indeed as both projects would benefit from a shared
>> JIT instead of reimplementing everything twice. I never looked into DPDK
>> too much, but I presume the idea would be as well to take the LLVM (or
>> bpf-gcc) generated object file and load it into a BPF 'engine' that sits
>> in user space on top of DPDK? Presumably loader could be libbpf here as
>> well since it already knows how to parse the ELF, perform the relocations
>> etc. The only difference would be that you have a different context and
>> different helpers? Is that the goal eventually?
>>
>>> The only real issue I see is the need for a dual licensing BSD-GPL.
>>
>> This might be one avenue if all kernel JIT contributors would be on board.
>> Another option I'm wondering could be to extend the bpf() syscall in order
>> to pass down a description of context and helper mappings e.g. via BTF and
>> let everything go through the verifier in the kernel the usual way (I presume
>> one goal might be that you want to assure that the generated BPF code passes
>> the safety checks before running the prog), then have it JITed and extract
>> the generated image in order to use it from user space. Kernel would have
>> to make sure it never actually allows attaching this program in the kernel.
>> Generated opcodes can already be retrieved today (see below). Such infra
>> could potentially help bpf-gcc folks as well as they expressed desire to
>> have some sort of a simulator for their gcc BPF test suite.. and it would
>> allow for consistent behavior of the BPF runtime. Just a thought.
> 
> This idea looks good. This can remove the verifier code also from DPDK.

Right, definitely makes sense to have consolidation also on this one as well
aside from the JITs, and pushing to the kernel and receiving back the JITed
image seems quite nice and would be generic to open up many other use cases
outside of networking. From app pov, it's just an implementation detail where
to get that BPF opcode image from anyway.

>   A couple of downsides I can think of,
> 
> # We may need to extend the kernel verifier to understand the user-space address
> and its symbols for CALL and MEM access operations.

Yep, that part would be needed, potentially BTF could be of help here as well
for the description of the user space runtime environment like context, helpers
etc, so that JIT knows how to handle this.

> # DPDK supports FreeBSD and Windows OS as well

I'm not too familiar with the state on BPF for the latter two, but afaik FreeBSD
at least had some effort to implement a BPF runtime into their kernel as well,
so similar interface could be provided, but presumably as starting point vast
majority of DPDK users are running Linux underneath anyway?

> # Need a different treatment for old Linux kernels.

Maybe, though I have little insight from DPDK angle here. Wrt BPF and kernel from
what we see major cloud providers usually offer quite recent kernels as well as
most mainstream distros that are run there, but again, environments where DPDK is
typically deployed may differ (?), so cannot really comment.

Thanks,
Daniel


* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-10-05  0:00                 ` Daniel Borkmann
@ 2019-10-05 14:39                   ` Jerin Jacob
  2019-10-07 11:57                     ` Ananyev, Konstantin
  2019-10-24  4:22                     ` Jerin Jacob
  0 siblings, 2 replies; 32+ messages in thread
From: Jerin Jacob @ 2019-10-05 14:39 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Thomas Monjalon, Steve Capper, Ananyev, Konstantin,
	Honnappa Nagarahalli, Rodolph Perfetta, jerinj, dpdk-dev,
	Gavin Hu (Arm Technology China),
	nd, Alexei Starovoitov, Quentin Monnet, john.fastabend

> >>
> >> This might be one avenue if all kernel JIT contributors would be on board.
> >> Another option I'm wondering could be to extend the bpf() syscall in order
> >> to pass down a description of context and helper mappings e.g. via BTF and
> >> let everything go through the verifier in the kernel the usual way (I presume
> >> one goal might be that you want to assure that the generated BPF code passes
> >> the safety checks before running the prog), then have it JITed and extract
> >> the generated image in order to use it from user space. Kernel would have
> >> to make sure it never actually allows attaching this program in the kernel.
> >> Generated opcodes can already be retrieved today (see below). Such infra
> >> could potentially help bpf-gcc folks as well as they expressed desire to
> >> have some sort of a simulator for their gcc BPF test suite.. and it would
> >> allow for consistent behavior of the BPF runtime. Just a thought.
> >
> > This idea looks good. This can remove the verifier code also from DPDK.
>
> Right, definitely makes sense to have consolidation also on this one as well
> aside from the JITs, and pushing to the kernel and receiving back the JITed
> image seems quite nice and would be generic to open up many other use cases
> outside of networking. From app pov, it's just an implementation detail where
> to get that BPF opcode image from anyway.
>
> >   A couple of downsides I can think of,
> >
> > # We may need to extend the kernel verifier to understand the user-space address
> > and its symbols for CALL and MEM access operations.
>
> Yep, that part would be needed, potentially BTF could be of help here as well
> for the description of the user space runtime environment like context, helpers
> etc, so that JIT knows how to handle this.

However, we cannot settle the following non-technical aspects in
this forum:
# Dual licensing the Linux kernel BPF code as GPL-BSD in a separate library.
# DPDK BPF support for FreeBSD and Windows; treatment for other OSes?
# Different treatment needed for old Linux kernels.
This would call for the immediate DPDK release to follow the existing semantics.

Yes, for the long term, using the kernel-JITed eBPF program from
userspace will be helpful. At least DPDK can use it in the future.
A couple of other things to consider when someone does this:
# https://github.com/iovisor/ubpf can also benefit from this.
# We need to think about how to support tail calls in userspace.
# It is possible to have burst mode support in userspace as an
improvement (dealing with 4 packets at a time with SIMD).
Dealing with SIMD in kernel space, or with such improvements in general,
will be an issue.
# The kernel verifier and its handling of addresses have a lot of security
requirements that may not apply to userspace, so there needs to be
some way to allow userspace-specific optimizations.
# Based on my understanding of the Linux and DPDK JITed code, the following
optimizations may need/have a different path:

a) Userspace JIT has to deal with a 64-bit address space. Kernel BPF
code can assume the kernel virtual address space range and optimize.
b) I see that the kernel JIT always makes the stack size non-zero even when
the BPF application does not use the stack. I am not sure why
that is (could be some security issue). There could be an
optimization to not push the stack pointer in the prologue if the eBPF
program does not use the stack.
c) In the DPDK JIT compiler, in the first pass, we check which
registers are actually in use and, based on that, craft the prologue and
epilogue at runtime to improve performance. It could be just an
implementation detail, but I am not sure why the Linux kernel does not do
such an optimization.
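
To make (b) and (c) concrete, here is a rough sketch of such a first pass. It
is not taken from bpf_jit_arm64.c; the struct and function names are
hypothetical, and it only assumes the standard eBPF encoding (register byte
with dst in the low nibble and src in the high nibble, r6-r9 callee-saved,
r10 as the frame pointer):

```c
#include <stddef.h>
#include <stdint.h>

/* One 8-byte eBPF instruction. The register byte holds dst in the low
 * nibble and src in the high nibble (layout seen on little-endian hosts). */
struct ebpf_insn {
	uint8_t code;
	uint8_t regs;
	int16_t off;
	int32_t imm;
};

#define REG_FP       10      /* r10: frame pointer */
#define CALLEE_SAVED 0x3c0u  /* bit mask for r6..r9 */

struct reg_usage {
	uint32_t used_mask;  /* bit n set if rn appears in any instruction */
	int uses_stack;      /* non-zero if r10 is referenced */
};

/* First pass: record every register named by the program. This is
 * conservative: immediate forms encode src as 0, so r0 may be
 * over-reported, which is safe for deciding what to save. */
static struct reg_usage scan_regs(const struct ebpf_insn *ins, size_t n)
{
	struct reg_usage u = { 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		u.used_mask |= 1u << (ins[i].regs & 0x0f);
		u.used_mask |= 1u << (ins[i].regs >> 4);
	}
	u.uses_stack = (u.used_mask >> REG_FP) & 1u;
	return u;
}

/* Number of callee-saved eBPF registers the prologue would have to save. */
static unsigned int callee_saved_in_use(uint32_t used_mask)
{
	uint32_t m = used_mask & CALLEE_SAVED;
	unsigned int n = 0;

	for (; m != 0; m &= m - 1)
		n++;
	return n;
}
```

A prologue emitter could then skip the stack setup when uses_stack is zero and
save only as many callee-saved registers as callee_saved_in_use() reports.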

Konstantin is the author of DPDK eBPF support; I just added arm64 JIT
support. Maybe he has more data for this direction.

>
> > # DPDK supports FreeBSD and Windows OS as well
>
> I'm not too familiar with the state on BPF for the latter two, but afaik FreeBSD
> at least had some effort to implement a BPF runtime into their kernel as well,
> so similar interface could be provided, but presumably as starting point vast
> majority of DPDK users are running Linux underneath anyway?
>
> > # Need a different treatment for old Linux kernels.
>
> Maybe, though I have little insight from DPDK angle here. Wrt BPF and kernel from
> what we see major cloud providers usually offer quite recent kernels as well as
> most mainstream distros that are run there, but again, environments where DPDK is
> typically deployed may differ (?), so cannot really comment.







>
> Thanks,
> Daniel


* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-10-05 14:39                   ` Jerin Jacob
@ 2019-10-07 11:57                     ` Ananyev, Konstantin
  2019-10-24  4:22                     ` Jerin Jacob
  1 sibling, 0 replies; 32+ messages in thread
From: Ananyev, Konstantin @ 2019-10-07 11:57 UTC (permalink / raw)
  To: Jerin Jacob, Daniel Borkmann
  Cc: Thomas Monjalon, Steve Capper, Honnappa Nagarahalli,
	Rodolph Perfetta, jerinj, dpdk-dev,
	Gavin Hu (Arm Technology China),
	nd, Alexei Starovoitov, Quentin Monnet, john.fastabend


Hi everyone,

 
> > >>
> > >> This might be one avenue if all kernel JIT contributors would be on board.
> > >> Another option I'm wondering could be to extend the bpf() syscall in order
> > >> to pass down a description of context and helper mappings e.g. via BTF and
> > >> let everything go through the verifier in the kernel the usual way (I presume
> > >> one goal might be that you want to assure that the generated BPF code passes
> > >> the safety checks before running the prog), then have it JITed and extract
> > >> the generated image in order to use it from user space. Kernel would have
> > >> to make sure it never actually allows attaching this program in the kernel.
> > >> Generated opcodes can already be retrieved today (see below). Such infra
> > >> could potentially help bpf-gcc folks as well as they expressed desire to
> > >> have some sort of a simulator for their gcc BPF test suite.. and it would
> > >> allow for consistent behavior of the BPF runtime. Just a thought.
> > >
> > > This idea looks good. This can remove the verifier code also from DPDK.

Yes, on the one hand that idea looks very tempting.
As I understand it, in that case we wouldn't need to worry about licensing,
plus we would get the JIT, the verifier and maybe even cBPF support for free...
On the downside, as Jerin also outlined below - no support for other OSes,
plus in the future users might need to upgrade to the latest kernel to get new features.

Just exploring the alternate approach - if we put aside this licensing hassle for now,
how difficult do you think it would be to make the current BPF kernel code a sort of
platform-independent entity that could be built both as part of the kernel and as a
standalone user-space lib?
Also, would it be useful for current eBPF users to be able to build/run the verifier/JIT
in user-space too (might be easier to debug, faster prototyping, etc.),
or would it just be extra pain for maintainers?

Konstantin

> >
> > Right, definitely makes sense to have consolidation also on this one as well
> > aside from the JITs, and pushing to the kernel and receiving back the JITed
> > image seems quite nice and would be generic to open up many other use cases
> > outside of networking. From app pov, it's just an implementation detail where
> > to get that BPF opcode image from anyway.
> >
> > >   A couple of downsides I can think of,
> > >
> > > # We may need to extend the kernel verifier to understand the user-space address
> > > and its symbols for CALL and MEM access operations.
> >
> > Yep, that part would be needed, potentially BTF could be of help here as well
> > for the description of the user space runtime environment like context, helpers
> > etc, so that JIT knows how to handle this.
> 
> Though, We can not conclude the following non-technical aspects in
> this forum like
>  # Dual license Linux Kernel BPF code as GPL-BSD as a separate library.
> #  DPDK BPF support for FreeBSD and Windows OS, Treatment for Other OS?
> # Need a different treatment for old Linux kernels.
> This would call for immediate DPDK release to follow the existing semantics.
> 
> Yes, For the long term, Using the Kernel JITed EBPF program for
> userspace will be helpful. At least DPDK can use it in the future.
> A couple of other things to consider when someone does this
> # https://github.com/iovisor/ubpf can also benefit from this.
> # We need to think about how to support tail call in userspace
> # It is possible to have burst mode support in userspace as an
> improvement (dealing with 4 packets at the time with SIMD).
> Dealing SIMD in kernel space will be an issue or dealing with such
> improvement in general.
> # Kernel verifier and dealing with address has a lot of security
> requirements that may not apply for userspace, and therefore some of
> the
> optimization specific to userspace needs to consider some way
> # Based on my understanding the Linux and DPDK JITed code, following
> optimizations may need/have a different path
> 
> a) Userspace JIT has to deal with a 64bit address space. Kernel BPF
> code can assume the Kernel virtual address space range and optimize.
> b) I see, Kernel JIT always makes stack size as non zero even though
> BPF applications are not using the stack. I am not sure why it
> is that(Could be some security issue). There could be some
> optimization not to push the stack pointer in the prologue if EBPF
> program does
> not use stack
> c) In DPDK JIT compiler, In this first pass, we are checking the
> actual registers really in use and based on that we are crafting
> prologue and
> epilogue at runtime to improve the performance.  It could be just
> implementation detail, but not sure why Linux kernel is not doing such
> optimization.
> 
> Konstantin is the author DPDK EBPF support, I just added arm64 JIT
> support. Maybe he has more data for this direction.
> 
> >
> > > # DPDK supports FreeBSD and Windows OS as well
> >
> > I'm not too familiar with the state on BPF for the latter two, but afaik FreeBSD
> > at least had some effort to implement a BPF runtime into their kernel as well,
> > so similar interface could be provided, but presumably as starting point vast
> > majority of DPDK users are running Linux underneath anyway?
> >
> > > # Need a different treatment for old Linux kernels.
> >
> > Maybe, though I have little insight from DPDK angle here. Wrt BPF and kernel from
> > what we see major cloud providers usually offer quite recent kernels as well as
> > most mainstream distros that are run there, but again, environments where DPDK is
> > typically deployed may differ (?), so cannot really comment.
> 
> >
> > Thanks,
> > Daniel
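
Point c) in the quoted text above describes a first pass that records which
registers a program really uses, so that the prologue and epilogue can be
trimmed. A minimal sketch of such a pass follows, assuming the standard 8-byte
eBPF instruction layout. The names ebpf_insn_sketch, usage_sketch and
scan_usage are hypothetical, and this is not the code from bpf_jit_arm64.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the 8-byte eBPF instruction encoding. */
struct ebpf_insn_sketch {
	uint8_t code;       /* opcode */
	uint8_t dst_reg:4;  /* destination register, R0..R10 */
	uint8_t src_reg:4;  /* source register, R0..R10 */
	int16_t off;        /* signed offset */
	int32_t imm;        /* signed immediate */
};

#define EBPF_REG_FP 10  /* R10 is the read-only frame pointer */

struct usage_sketch {
	uint16_t reg_mask;  /* bit n set => Rn appears in some instruction */
	bool uses_stack;    /* true if any instruction references R10 */
};

/*
 * First pass over the program: collect every register field that appears.
 * This is a conservative over-approximation: immediate-form instructions
 * leave src_reg as 0, so R0 may be reported as used even when it is not.
 * A stack frame is only needed if the frame pointer R10 shows up.
 */
static struct usage_sketch
scan_usage(const struct ebpf_insn_sketch *ins, size_t n)
{
	struct usage_sketch u = { 0, false };
	size_t i;

	assert(ins != NULL || n == 0);
	for (i = 0; i < n; i++) {
		u.reg_mask |= (uint16_t)(1u << ins[i].dst_reg);
		u.reg_mask |= (uint16_t)(1u << ins[i].src_reg);
	}
	u.uses_stack = (u.reg_mask & (1u << EBPF_REG_FP)) != 0;
	return u;
}
```

A code generator could consult reg_mask to decide which callee-saved
registers to push, and skip stack-pointer setup when uses_stack is false.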

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-10-04 15:39       ` Jerin Jacob
@ 2019-10-07 12:33         ` Thomas Monjalon
  2019-10-07 13:00           ` Jerin Jacob
  0 siblings, 1 reply; 32+ messages in thread
From: Thomas Monjalon @ 2019-10-07 12:33 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Ananyev, Konstantin, Jerin Jacob, dpdk-dev, Honnappa Nagarahalli,
	Gavin Hu

04/10/2019 17:39, Jerin Jacob:
> On Thu, Oct 3, 2019 at 8:35 PM Ananyev, Konstantin
> <konstantin.ananyev@intel.com> wrote:
> >
> > Hi everyone,
> >
> > >
> > > On Thu, Oct 3, 2019 at 6:21 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > >
> > > > 03/09/2019 12:59, jerinj@marvell.com:
> > > > > Added eBPF arm64 JIT support to improve the eBPF program performance
> > > > > on arm64.
> > > > >
> > > > >  lib/librte_bpf/bpf_jit_arm64.c         | 1451 ++++++++++++++++++++++++
> > > >
> > > > I am concerned about duplicating the BPF JIT effort in DPDK and Linux.
> > > > Could we try to pull the Linux JIT?
> > > > Is the license the only issue?
> > >
> > > That's one issue.
> > >
> > > >
> > > > After a quick discussion, it seems the Linux authors are OK to arrange
> > > > their JIT code for sharing with userspace projects.
> > >
> > > I did a clean room implementation considering some optimization for
> > > DPDK etc(Like if stack is not used then don't push stack etc)
> > > and wherever Linux can be improved, I have submitted the patch also to
> > > Linux as well.(Some more pending as well)
> > >
> > > https://github.com/torvalds/linux/commit/504792e07a44844f24e9d79913e4a2f8373cd332
> > >
> > > And Linux has a framework for instruction generation for debugging
> > > etc. So We can not copy and paste the code
> > > from Linux as is.
> > >
> > > My view to keep a different code base optimize for DPDK use cases and
> > > library requirements(for example, tail call is not supported in DPDK).
> > > For arm64/x86 case the code is done so it is not worth sync with
> > > Linux. For new architecture, it can be if possible.
> > >
> > > Konstantin,
> > > Your thoughts?
> > >
> >
> > My thought would be that if we have JIT eBPF compiler already in DPDK
> > for one arch (x86) there is absolutely no reason why we shouldn't allow it for different arch (arm).
> > About having a common code-base with Linux eBPF JITs implementation -
> > I think it is a very good idea,
> > but I don’t' think it could be achieved without significant effort.
> > DPDK and Linux JIT code-generators differ quite a bit.
> > So my suggestion - let's go ahead and integrate Jerin patch into 19.11,
> > meanwhile start talking with linux guys how common JIT code-base could be achieved.
> 
> I agree with Konstantin here.
> 
> Thomas,
> 
> Just confirm the following:
> 
> While we continue to have 'advanced' discussion on avoiding code duplication etc
> and it will take a couple of months to converge(if at all it happens)
> 
> Just to be clear, I assume, you are OK to merge this code for 19.11(If
> no more technical comment on the patch).
> 
> I am only afraid of, our typical last-minute surprise pattern and
> followed by back and forth open ended discussions.
> 
> i.e
> 
> # Code submitted before the proposal window
> # Gets ACK from Maintainer
> # New non-technical concerns start just before RC1

I hope you are not against discussing the real good questions,
even if they come a month after the first submission.

I don't mind merging such a patch in 19.11,
but I would have preferred that such questions had been raised
when this new library was introduced (for x86).

About your urgency to have this code merged,
can you please explain what your usage is?



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-10-07 12:33         ` Thomas Monjalon
@ 2019-10-07 13:00           ` Jerin Jacob
  2019-10-07 18:04             ` Thomas Monjalon
  0 siblings, 1 reply; 32+ messages in thread
From: Jerin Jacob @ 2019-10-07 13:00 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Ananyev, Konstantin, Jerin Jacob, dpdk-dev, Honnappa Nagarahalli,
	Gavin Hu

On Mon, 7 Oct, 2019, 6:03 PM Thomas Monjalon, <thomas@monjalon.net> wrote:

> 04/10/2019 17:39, Jerin Jacob:
> > On Thu, Oct 3, 2019 at 8:35 PM Ananyev, Konstantin
> > <konstantin.ananyev@intel.com> wrote:
> > >
> > > Hi everyone,
> > >
> > > >
> > > > On Thu, Oct 3, 2019 at 6:21 PM Thomas Monjalon <thomas@monjalon.net>
> wrote:
> > > > >
> > > > > 03/09/2019 12:59, jerinj@marvell.com:
> > > > > > Added eBPF arm64 JIT support to improve the eBPF program
> performance
> > > > > > on arm64.
> > > > > >
> > > > > >  lib/librte_bpf/bpf_jit_arm64.c         | 1451
> ++++++++++++++++++++++++
> > > > >
> > > > > I am concerned about duplicating the BPF JIT effort in DPDK and
> Linux.
> > > > > Could we try to pull the Linux JIT?
> > > > > Is the license the only issue?
> > > >
> > > > That's one issue.
> > > >
> > > > >
> > > > > After a quick discussion, it seems the Linux authors are OK to
> arrange
> > > > > their JIT code for sharing with userspace projects.
> > > >
> > > > I did a clean room implementation considering some optimization for
> > > > DPDK etc(Like if stack is not used then don't push stack etc)
> > > > and wherever Linux can be improved, I have submitted the patch also
> to
> > > > Linux as well.(Some more pending as well)
> > > >
> > > >
> https://github.com/torvalds/linux/commit/504792e07a44844f24e9d79913e4a2f8373cd332
> > > >
> > > > And Linux has a framework for instruction generation for debugging
> > > > etc. So We can not copy and paste the code
> > > > from Linux as is.
> > > >
> > > > My view to keep a different code base optimize for DPDK use cases and
> > > > library requirements(for example, tail call is not supported in
> DPDK).
> > > > For arm64/x86 case the code is done so it is not worth sync with
> > > > Linux. For new architecture, it can be if possible.
> > > >
> > > > Konstantin,
> > > > Your thoughts?
> > > >
> > >
> > > My thought would be that if we have JIT eBPF compiler already in DPDK
> > > for one arch (x86) there is absolutely no reason why we shouldn't
> allow it for different arch (arm).
> > > About having a common code-base with Linux eBPF JITs implementation -
> > > I think it is a very good idea,
> > > but I don’t' think it could be achieved without significant effort.
> > > DPDK and Linux JIT code-generators differ quite a bit.
> > > So my suggestion - let's go ahead and integrate Jerin patch into 19.11,
> > > meanwhile start talking with linux guys how common JIT code-base could
> be achieved.
> >
> > I agree with Konstantin here.
> >
> > Thomas,
> >
> > Just confirm the following:
> >
> > While we continue to have 'advanced' discussion on avoiding code
> duplication etc
> > and it will take a couple of months to converge(if at all it happens)
> >
> > Just to be clear, I assume, you are OK to merge this code for 19.11(If
> > no more technical comment on the patch).
> >
> > I am only afraid of, our typical last-minute surprise pattern and
> > followed by back and forth open ended discussions.
> >
> > i.e
> >
> > # Code submitted before the proposal window
> > # Gets ACK from Maintainer
> > # New non-technical concerns start just before RC1
>
> I hope you are not against discussing the real good questions,
> even if they come a month after the first submission.
>

I am not against discussing technical details of the patch and reviewing
it. If there is a review of the content of the patch, that is very
good, and I am happy to address it. But I am not comfortable taking on, at the
last minute, things I don't have any control over (changing the licence, etc.).
I shared the eBPF arm64 JIT support in the roadmap a month
before implementing it. No questions were asked at that time. I spent almost a
month adding support for it, and it is not simple C code. I am not
comfortable with fundamental questions, like why eBPF itself
is required, or code duplication (the code was already duplicated when x86 support
was added), being raised now and stalling the patch at this point, when
this library and its x86 support were added a year ago.


>
>
>
> I don't care merging such patch in 19.11,
> but I would have preferred such questions were open
> when introducing this new library (for x86).
>

Konstantin provided enough data on the ml, in replies to different users,
when this library was added.


> About your urge of having this code merged,
> please can you explain what is your usage?
>

As an ARM64 maintainer, I would like to fix any feature disparity
with respect to x86, and I have been doing so for the last 3 years. If
someone is using eBPF on x86, I want to make sure it runs with similar
"performance" on arm64 from an architecture perspective.



>
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-10-07 13:00           ` Jerin Jacob
@ 2019-10-07 18:04             ` Thomas Monjalon
  2019-10-07 19:29               ` Jerin Jacob
  0 siblings, 1 reply; 32+ messages in thread
From: Thomas Monjalon @ 2019-10-07 18:04 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Ananyev, Konstantin, Jerin Jacob, dpdk-dev, Honnappa Nagarahalli,
	Gavin Hu

07/10/2019 15:00, Jerin Jacob:
> On Mon, 7 Oct, 2019, 6:03 PM Thomas Monjalon wrote:
> > 04/10/2019 17:39, Jerin Jacob:
> > > On Thu, Oct 3, 2019 at 8:35 PM Ananyev, Konstantin wrote:
> > > > > On Thu, Oct 3, 2019 at 6:21 PM Thomas Monjalon wrote:
> > > > > > 03/09/2019 12:59, jerinj@marvell.com:
> > > > > > > Added eBPF arm64 JIT support to improve the eBPF
> > > > > > > program performance on arm64.
> > > > > > >
> > > > > > >  lib/librte_bpf/bpf_jit_arm64.c
> > > > > > >  | 1451 ++++++++++++++++++++++++
> > > > > >
> > > > > > I am concerned about duplicating the BPF JIT effort in DPDK and Linux.
> > > > > > Could we try to pull the Linux JIT?
> > > > > > Is the license the only issue?
> > > > >
> > > > > That's one issue.
> > > > >
> > > > > > After a quick discussion, it seems the Linux authors are OK to
> > > > > > arrange their JIT code for sharing with userspace projects.
> > > > >
> > > > > I did a clean room implementation considering some optimization for
> > > > > DPDK etc(Like if stack is not used then don't push stack etc)
> > > > > and wherever Linux can be improved, I have submitted the patch also
> > > > > to Linux as well.(Some more pending as well)
> > > > > https://github.com/torvalds/linux/commit/504792e07a44844f24e9d79913e4a2f8373cd332
> > > > >
> > > > > And Linux has a framework for instruction generation for debugging
> > > > > etc. So We can not copy and paste the code from Linux as is.
> > > > >
> > > > > My view to keep a different code base optimize for DPDK use cases and
> > > > > library requirements(for example, tail call is not supported in DPDK).
> > > > > For arm64/x86 case the code is done so it is not worth sync with
> > > > > Linux. For new architecture, it can be if possible.
> > > > >
> > > > > Konstantin,
> > > > > Your thoughts?
> > > > >
> > > >
> > > > My thought would be that if we have JIT eBPF compiler already in DPDK
> > > > for one arch (x86) there is absolutely no reason why we shouldn't
> > > > allow it for different arch (arm).
> > > > About having a common code-base with Linux eBPF JITs implementation -
> > > > I think it is a very good idea,
> > > > but I don’t' think it could be achieved without significant effort.
> > > > DPDK and Linux JIT code-generators differ quite a bit.
> > > > So my suggestion - let's go ahead and integrate Jerin patch into 19.11,
> > > > meanwhile start talking with linux guys how common JIT code-base could
> > > > be achieved.
> > >
> > > I agree with Konstantin here.
> > >
> > > Thomas,
> > >
> > > Just confirm the following:
> > >
> > > While we continue to have 'advanced' discussion on avoiding code
> > > duplication etc and it will take a couple of months to converge
> > > (if at all it happens)
> > > Just to be clear, I assume, you are OK to merge this code for 19.11
> > > (If no more technical comment on the patch).
> > >
> > > I am only afraid of, our typical last-minute surprise pattern and
> > > followed by back and forth open ended discussions.
> > >
> > > i.e
> > >
> > > # Code submitted before the proposal window
> > > # Gets ACK from Maintainer
> > > # New non-technical concerns start just before RC1
> >
> > I hope you are not against discussing the real good questions,
> > even if they come a month after the first submission.
> 
> I am not against discussing the technical data about the 'patch' and review
> it. If there is a review with respect to content of the patch it is very
> good, I am happy to address it. Stuff like I don't have any control (
> changing the licence) etc, I have am not comfortable to take  in last
> minute. I have already shared the eBPF ARM64 JIT support in roadmap a month
> ago before implementing it. No question asked that time. Spend a almost
> month to add support for it and It is not a simple C code. Now I am not
> comfortable in asking the fundamental questions like why this eBPF it self
> is required and code duplication  ( code was duplicated when x86 support
> has been added) and therefore stall the patch at this point of time, where
> this library and x86 support added a year back.

I really don't like this reaction.
First, I never said this discussion was blocking the patch.
Second, why am I the only one asking such obvious questions
as not duplicating work?

> > I don't care merging such patch in 19.11,
> > but I would have preferred such questions were open
> > when introducing this new library (for x86).

You, Jerin and Konstantin, should have answered these questions
a long time ago, before starting such development.
Is it so hard to require a bit of thought before starting something new?

> Konstantin added enough data on ml this when this library gets added on
> reply to different users.

Really? which data?

> > About your urge of having this code merged,
> > please can you explain what is your usage?
> 
> As an ARM64 maintainter, I would  like to fix any disparity in terms of the
> features with respect to x86 and I have been doing for last 3 years. If
> some one using eBPF on x86, I want to make sure it run in similar
> "performance" on arm64 on architecture perspective.

So we are debating about a library which is probably not used by anybody.
That's not how I plan to spend my time on DPDK.

Sorry Jerin, I really like working with you,
but I think you are putting too much pressure here,
instead of calmly discussing the future of DPDK.

Please forget the deadline (we will agree on merging anyway)
and let's restart from the beginning by answering simple questions:
- what are the use cases of BPF in DPDK?
- how much will we benefit from sharing code with Linux?
- what can we lose in a single JIT implementation?



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-10-07 18:04             ` Thomas Monjalon
@ 2019-10-07 19:29               ` Jerin Jacob
  2019-10-07 20:15                 ` Thomas Monjalon
  0 siblings, 1 reply; 32+ messages in thread
From: Jerin Jacob @ 2019-10-07 19:29 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Ananyev, Konstantin, Jerin Jacob, dpdk-dev, Honnappa Nagarahalli,
	Gavin Hu

On Mon, 7 Oct, 2019, 11:35 PM Thomas Monjalon, <thomas@monjalon.net> wrote:

> 07/10/2019 15:00, Jerin Jacob:
> > On Mon, 7 Oct, 2019, 6:03 PM Thomas Monjalon wrote:
> > > 04/10/2019 17:39, Jerin Jacob:
> > > > On Thu, Oct 3, 2019 at 8:35 PM Ananyev, Konstantin wrote:
> > > > > > On Thu, Oct 3, 2019 at 6:21 PM Thomas Monjalon wrote:
> > > > > > > 03/09/2019 12:59, jerinj@marvell.com:
> > > > > > > > Added eBPF arm64 JIT support to improve the eBPF
> > > > > > > > program performance on arm64.
> > > > > > > >
> > > > > > > >  lib/librte_bpf/bpf_jit_arm64.c
> > > > > > > >  | 1451 ++++++++++++++++++++++++
> > > > > > >
> > > > > > > I am concerned about duplicating the BPF JIT effort in DPDK
> and Linux.
> > > > > > > Could we try to pull the Linux JIT?
> > > > > > > Is the license the only issue?
> > > > > >
> > > > > > That's one issue.
> > > > > >
> > > > > > > After a quick discussion, it seems the Linux authors are OK to
> > > > > > > arrange their JIT code for sharing with userspace projects.
> > > > > >
> > > > > > I did a clean room implementation considering some optimization
> for
> > > > > > DPDK etc(Like if stack is not used then don't push stack etc)
> > > > > > and wherever Linux can be improved, I have submitted the patch
> also
> > > > > > to Linux as well.(Some more pending as well)
> > > > > >
> https://github.com/torvalds/linux/commit/504792e07a44844f24e9d79913e4a2f8373cd332
> > > > > >
> > > > > > And Linux has a framework for instruction generation for
> debugging
> > > > > > etc. So We can not copy and paste the code from Linux as is.
> > > > > >
> > > > > > My view to keep a different code base optimize for DPDK use
> cases and
> > > > > > library requirements(for example, tail call is not supported in
> DPDK).
> > > > > > For arm64/x86 case the code is done so it is not worth sync with
> > > > > > Linux. For new architecture, it can be if possible.
> > > > > >
> > > > > > Konstantin,
> > > > > > Your thoughts?
> > > > > >
> > > > >
> > > > > My thought would be that if we have JIT eBPF compiler already in
> DPDK
> > > > > for one arch (x86) there is absolutely no reason why we shouldn't
> > > > > allow it for different arch (arm).
> > > > > About having a common code-base with Linux eBPF JITs
> implementation -
> > > > > I think it is a very good idea,
> > > > > but I don’t' think it could be achieved without significant effort.
> > > > > DPDK and Linux JIT code-generators differ quite a bit.
> > > > > So my suggestion - let's go ahead and integrate Jerin patch into
> 19.11,
> > > > > meanwhile start talking with linux guys how common JIT code-base
> could
> > > > > be achieved.
> > > >
> > > > I agree with Konstantin here.
> > > >
> > > > Thomas,
> > > >
> > > > Just confirm the following:
> > > >
> > > > While we continue to have 'advanced' discussion on avoiding code
> > > > duplication etc and it will take a couple of months to converge
> > > > (if at all it happens)
> > > > Just to be clear, I assume, you are OK to merge this code for 19.11
> > > > (If no more technical comment on the patch).
> > > >
> > > > I am only afraid of, our typical last-minute surprise pattern and
> > > > followed by back and forth open ended discussions.
> > > >
> > > > i.e
> > > >
> > > > # Code submitted before the proposal window
> > > > # Gets ACK from Maintainer
> > > > # New non-technical concerns start just before RC1
> > >
> > > I hope you are not against discussing the real good questions,
> > > even if they come a month after the first submission.
> >
> > I am not against discussing the technical data about the 'patch' and
> review
> > it. If there is a review with respect to content of the patch it is very
> > good, I am happy to address it. Stuff like I don't have any control (
> > changing the licence) etc, I have am not comfortable to take  in last
> > minute. I have already shared the eBPF ARM64 JIT support in roadmap a
> month
> > ago before implementing it. No question asked that time. Spend a almost
> > month to add support for it and It is not a simple C code. Now I am not
> > comfortable in asking the fundamental questions like why this eBPF it
> self
> > is required and code duplication  ( code was duplicated when x86 support
> > has been added) and therefore stall the patch at this point of time,
> where
> > this library and x86 support added a year back.
>
> I really don't like this reaction.
>

If it hurts you in some way then I am sorry about that.

> First, I never said this discussion was blocking the patch.
>

You said you have a concern with this patch. Sorry,
I am not sure how else to interpret that, and if I don't jump in it will be
stalled for sure. That's my experience. Sorry if you disagree.


> Second, why am I the only one asking such obvious questions
> as not duplicating work?
>

Some things do not converge at all, especially relicensing code
from Linux. A lot of developers (me included) have contributed to that code
base. Why would everyone agree? The list would include, for example, a recent
RISC-V JIT contributor with a gmail.com address.

Duplicating the semantics sometimes gives more control. We already did
that for rte_flow, rcu, etc. I have also mentioned the performance reasons
for the JIT in the other thread.



>
> > > I don't care merging such patch in 19.11,
> > > but I would have preferred such questions were open
> > > when introducing this new library (for x86).
>
> You Jerin and Konstantin should have answered these questions
> a long time ago before starting such development.
> Is it so hard to require a bit of thoughts before starting something new?
>

As for me, I don't see any better approach for a user-space eBPF that
supports all the OSes in DPDK.


> > Konstantin added enough data on ml this when this library gets added on
> > reply to different users.
>
> Really? which data?
>

I am talking about the discussion with
the Netronome developer. I don't have the exact email thread; Konstantin
probably has it.


>
> > > About your urge of having this code merged,
> > > please can you explain what is your usage?
> >
> > As an ARM64 maintainter, I would  like to fix any disparity in terms of
> the
> > features with respect to x86 and I have been doing for last 3 years. If
> > some one using eBPF on x86, I want to make sure it run in similar
> > "performance" on arm64 on architecture perspective.
>
> So we are debating about a library which is probably not used by anybody.
> That's not how I plan to spend my time on DPDK.
>

How does anyone know that the library is not used by anyone in the community,
given that it is part of dpdk.org? And a customer asked whether arm64 has JIT
support too.

If something needs to be dynamically controlled then eBPF can be used.
A couple of use cases:

# packet filtering
# debugging
# function call tracing
# There are some Lua JIT based dataplane implementations, which can be
replaced with eBPF with JIT.




>
>
> Sorry Jerin, I really like working with you,
>

Me too.

> but I think you forward too much pressure here,
> instead of quietly discussing the future of DPDK.
>
> Please forget the deadline (we will agree on merging anyway)
>

Ok.

> and let's restart from the beginning by answering simple questions:
> - what are the use cases of BPF in DPDK?
>

I mentioned what I know.

> - how much we'll benefit from sharing code with Linux?
>

I have mentioned some of the performance constraints in the other thread.
Moreover, I don't believe it is an easy task to make Linux eBPF run in
userspace, and I am not sure who is going to do that.

> - what can we lose in a single JIT implementation?
>

Sorry, I didn't understand this question.




>
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-10-07 19:29               ` Jerin Jacob
@ 2019-10-07 20:15                 ` Thomas Monjalon
  2019-10-08  6:57                   ` Jerin Jacob
  0 siblings, 1 reply; 32+ messages in thread
From: Thomas Monjalon @ 2019-10-07 20:15 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Ananyev, Konstantin, Jerin Jacob, dpdk-dev, Honnappa Nagarahalli,
	Gavin Hu

07/10/2019 21:29, Jerin Jacob:
> On Mon, 7 Oct, 2019, 11:35 PM Thomas Monjalon, <thomas@monjalon.net> wrote:
[...] 
> let's restart from the beginning by answering simple questions:
> > - what are the use cases of BPF in DPDK?
> 
> If something needs to be dynamically controlled then eBPF can be used,
> couple of use cases
> 
> # packet filtering
> # debugging
> # function call tracing
> # There are some Lua JIT based dataplane implementations. Which can be
> replaced with eBPF with JIT.
> 
> - how much we'll benefit from sharing code with Linux?
> 
> I have mentioned some of the performance constraint in the other thread.
> Moreover I don't believe it is not easy task for Linux eBPF to run as
> userspace and I not sure who is going to do that

I was asking about the benefits here:
- sharing optimizations in both projects
- get verifier support
What else?

> - what can we lose in a single JIT implementation?
> 
> Sorry, I didn't understood this question?

I mean what are the drawbacks of using a Linux implementation?
How do the performance constraints differ, etc.?

Note: like a lot of people, I don't really know BPF,
so these are real questions to help understand the challenge.



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-10-07 20:15                 ` Thomas Monjalon
@ 2019-10-08  6:57                   ` Jerin Jacob
  0 siblings, 0 replies; 32+ messages in thread
From: Jerin Jacob @ 2019-10-08  6:57 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Ananyev, Konstantin, Jerin Jacob, dpdk-dev, Honnappa Nagarahalli,
	Gavin Hu

On Tue, 8 Oct, 2019, 1:45 AM Thomas Monjalon, <thomas@monjalon.net> wrote:

> 07/10/2019 21:29, Jerin Jacob:
> > On Mon, 7 Oct, 2019, 11:35 PM Thomas Monjalon, <thomas@monjalon.net>
> wrote:
> [...]
> > let's restart from the beginning by answering simple questions:
> > > - what are the use cases of BPF in DPDK?
> >
> > If something needs to be dynamically controlled then eBPF can be used,
> > couple of use cases
> >
> > # packet filtering
> > # debugging
> > # function call tracing
> > # There are some Lua JIT based dataplane implementations. Which can be
> > replaced with eBPF with JIT.
> >
> > - how much we'll benefit from sharing code with Linux?
> >
> > I have mentioned some of the performance constraint in the other thread.
> > Moreover I don't believe it is not easy task for Linux eBPF to run as
> > userspace and I not sure who is going to do that
>
> I was asking the benefits here:
> - sharing optimizations in both projects
>

Yes. But even with a different code base it is possible to share the
optimizations.

> - get verifier support
>

Verifier support is already available in the library.

> What else?
>

I see only avoiding code duplication and getting new features like cBPF.


> > - what can we lose in a single JIT implementation?
> >
> > Sorry, I didn't understood this question?
>
> I mean what are the drawbacks of using a Linux implementation?
> How performance constraints are differents, etc?
>

I mentioned the details in the threads below. Waiting for feedback from the
kernel maintainer.

http://mails.dpdk.org/archives/dev/2019-October/146004.html

http://mails.dpdk.org/archives/dev/2019-October/146063.html


>
>
> Note: as a lot of people, I don't really know BPF,
> so these are real questions to help understanding the challenge.


>
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-09-24 17:03 ` [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support Ananyev, Konstantin
@ 2019-10-12 12:22   ` Thomas Monjalon
  0 siblings, 0 replies; 32+ messages in thread
From: Thomas Monjalon @ 2019-10-12 12:22 UTC (permalink / raw)
  To: jerinj; +Cc: dev, Ananyev, Konstantin, honnappa.nagarahalli, gavin.hu

24/09/2019 19:03, Ananyev, Konstantin:
> > Jerin Jacob (8):
> >   bpf/arm64: add build infrastructure
> >   bpf/arm64: add prologue and epilogue
> >   bpf/arm64: add basic arithmetic operations
> >   bpf/arm64: add logical operations
> >   bpf/arm64: add byte swap operations
> >   bpf/arm64: add load and store operations
> >   bpf/arm64: add atomic-exchange-and-add operation
> >   bpf/arm64: add branch operation
> > 
> 
> Series Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

Applied

Please continue the discussions started with the Linux maintainers
in order to try reducing the code duplication between the two projects.
Thanks




^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH 7/8] bpf/arm64: add atomic-exchange-and-add operation
  2019-09-03 10:59 ` [dpdk-dev] [PATCH 7/8] bpf/arm64: add atomic-exchange-and-add operation jerinj
@ 2019-10-18 13:16   ` David Marchand
  0 siblings, 0 replies; 32+ messages in thread
From: David Marchand @ 2019-10-18 13:16 UTC (permalink / raw)
  To: Jerin Jacob Kollanukkaran, Thomas Monjalon
  Cc: dev, Ananyev, Konstantin, Honnappa Nagarahalli, Gavin Hu

On Tue, Sep 3, 2019 at 1:00 PM <jerinj@marvell.com> wrote:
>
> From: Jerin Jacob <jerinj@marvell.com>
>
> Implement XADD eBPF instruction using STADD arm64 instruction.
> If the given platform does not have atomics support,
> use LDXR and STXR pair for critical section instead of STADD.

For the record, this patch had a missing dependency on the "128-bit
atomic for arm64" patch, because it uses RTE_ARM_FEATURE_ATOMICS.
This will be resolved once the latter is merged.


> ---
>  lib/librte_bpf/bpf_jit_arm64.c | 85 +++++++++++++++++++++++++++++++++-
>  1 file changed, 84 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_bpf/bpf_jit_arm64.c b/lib/librte_bpf/bpf_jit_arm64.c
> index c797c9c62..62fa6a505 100644
> --- a/lib/librte_bpf/bpf_jit_arm64.c
> +++ b/lib/librte_bpf/bpf_jit_arm64.c

[snip]

> +static int
> +has_atomics(void)
> +{
> +       int rc = 0;
> +
> +#if defined(__ARM_FEATURE_ATOMICS) || defined(RTE_ARM_FEATURE_ATOMICS)
> +       rc = 1;
> +#endif
> +       return rc;
> +}

[snip]
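
For readers following the two code paths the commit message describes, here is
a rough sketch in portable C11 of what the emitted code has to achieve. It is
not the JIT itself: the function names are hypothetical, the relaxed memory
ordering is an assumption, and a compare-exchange retry loop stands in for the
LDXR/STXR pair.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Single-operation path, analogous to what STADD offers with atomics support. */
static void
xadd64_single(_Atomic uint64_t *dst, uint64_t val)
{
	atomic_fetch_add_explicit(dst, val, memory_order_relaxed);
}

/*
 * Retry-loop path, analogous to an LDXR/STXR critical section:
 * load the old value, try to store old + val, and retry if another
 * writer got in between. On failure, 'old' is refreshed automatically.
 */
static void
xadd64_retry(_Atomic uint64_t *dst, uint64_t val)
{
	uint64_t old = atomic_load_explicit(dst, memory_order_relaxed);

	while (!atomic_compare_exchange_weak_explicit(dst, &old, old + val,
			memory_order_relaxed, memory_order_relaxed))
		;
}

/* Pick a path from a has_atomics()-style flag (passed in here for clarity). */
static void
xadd64(_Atomic uint64_t *dst, uint64_t val, int has_atomics)
{
	assert(dst != NULL);
	if (has_atomics)
		xadd64_single(dst, val);
	else
		xadd64_retry(dst, val);
}
```

Both paths must leave the same value in memory; only the instruction
sequence differs, which is why the choice can be made once at JIT time.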


-- 
David Marchand


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-10-05 14:39                   ` Jerin Jacob
  2019-10-07 11:57                     ` Ananyev, Konstantin
@ 2019-10-24  4:22                     ` Jerin Jacob
  1 sibling, 0 replies; 32+ messages in thread
From: Jerin Jacob @ 2019-10-24  4:22 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Thomas Monjalon, Steve Capper, Ananyev, Konstantin,
	Honnappa Nagarahalli, Rodolph Perfetta, jerinj, dpdk-dev,
	Gavin Hu (Arm Technology China),
	nd, Alexei Starovoitov, Quentin Monnet, john.fastabend

On Sat, Oct 5, 2019 at 8:09 PM Jerin Jacob <jerinjacobk@gmail.com> wrote:
>
> > >>
> > >> This might be one avenue if all kernel JIT contributors would be on board.
> > >> Another option I'm wondering could be to extend the bpf() syscall in order
> > >> to pass down a description of context and helper mappings e.g. via BTF and
> > >> let everything go through the verifier in the kernel the usual way (I presume
> > >> one goal might be that you want to assure that the generated BPF code passes
> > >> the safety checks before running the prog), then have it JITed and extract
> > >> the generated image in order to use it from user space. Kernel would have
> > >> to make sure it never actually allows attaching this program in the kernel.
> > >> Generated opcodes can already be retrieved today (see below). Such infra
> > >> could potentially help bpf-gcc folks as well as they expressed desire to
> > >> have some sort of a simulator for their gcc BPF test suite.. and it would
> > >> allow for consistent behavior of the BPF runtime. Just a thought.
> > >
> > > This idea looks good. This can remove the verifier code also from DPDK.
> >
> > Right, definitely makes sense to have consolidation also on this one as well
> > aside from the JITs, and pushing to the kernel and receiving back the JITed
> > image seems quite nice and would be generic to open up many other use cases
> > outside of networking. From app pov, it's just an implementation detail where
> > to get that BPF opcode image from anyway.
> >
> > >   A couple of downsides I can think of,
> > >
> > > # We may need to extend the kernel verifier to understand the user-space address
> > > and its symbols for CALL and MEM access operations.
> >
> > Yep, that part would be needed, potentially BTF could be of help here as well
> > for the description of the user space runtime environment like context, helpers
> > etc, so that JIT knows how to handle this.
>
> However, we cannot settle the following non-technical aspects in
> this forum:
> # Dual licensing the Linux kernel BPF code as GPL-BSD in a separate library.
> # DPDK BPF support for FreeBSD and Windows; treatment of other OSes?
> # Old Linux kernels need a different treatment.
> This calls for the immediate DPDK release to follow the existing semantics.
>
> Yes, in the long term, using the kernel-JITed eBPF program in
> userspace will be helpful. At least DPDK can use it in the future.
> A couple of other things to consider when someone does this:
> # https://github.com/iovisor/ubpf can also benefit from this.
> # We need to think about how to support tail calls in userspace.
> # It is possible to have burst mode support in userspace as an
> improvement (dealing with 4 packets at a time with SIMD).
> Dealing with SIMD in kernel space, or with such improvements in
> general, will be an issue.
> # The kernel verifier and its address handling have a lot of security
> requirements that may not apply to userspace, so there needs to be
> some way to allow optimizations specific to userspace.
> # Based on my understanding of the Linux and DPDK JITed code, the
> following optimizations may need/have a different path:
>
> a) The userspace JIT has to deal with a 64-bit address space. Kernel BPF
> code can assume the kernel virtual address space range and optimize.
> b) I see that the kernel JIT always makes the stack size non-zero even
> when the BPF application is not using the stack. I am not sure why
> that is (it could be a security issue). There could be an
> optimization to not push the stack pointer in the prologue if the eBPF
> program does not use the stack.
> c) In the DPDK JIT compiler, in the first pass, we check which
> registers are really in use and, based on that, craft the prologue
> and epilogue at runtime to improve performance. It could be just an
> implementation detail, but I am not sure why the Linux kernel is not
> doing such an optimization.
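
To make point (c) a bit more concrete, here is a rough sketch of such a
first pass. It is not the code from bpf_jit_arm64.c: struct insn,
struct usage, scan_usage() and the field layout are simplified stand-ins
made up for this example. It walks the program once, records which eBPF
registers appear, and treats any reference to r10 (the eBPF frame
pointer) as a sign that the stack is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for an eBPF instruction (illustrative layout). */
struct insn {
	uint8_t code;
	uint8_t dst_reg;
	uint8_t src_reg;
	int16_t off;
	int32_t imm;
};

#define EBPF_REG_FP 10 /* r10 is the read-only eBPF frame pointer */

/* Result of the scan: one bit per eBPF register, plus a stack flag. */
struct usage {
	uint32_t reg_mask;
	int uses_stack;
};

/* Hypothetical first pass: collect every register named by the program.
 * Immediate-form instructions carry src_reg == 0, so r0 may be
 * over-reported; that only makes the result more conservative. */
static struct usage
scan_usage(const struct insn *ins, size_t n)
{
	struct usage u = { 0, 0 };
	size_t i;

	assert(ins != NULL || n == 0);
	for (i = 0; i < n; i++) {
		u.reg_mask |= 1u << (ins[i].dst_reg & 0xf);
		u.reg_mask |= 1u << (ins[i].src_reg & 0xf);
	}
	/* Any use of the frame pointer implies stack access. */
	u.uses_stack = (u.reg_mask >> EBPF_REG_FP) & 1;
	return u;
}
```

A prologue generator could then save only the callee-saved registers
whose bits are set in reg_mask, and skip the stack setup entirely when
uses_stack is 0.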


Just pinging to see if anyone is still interested in the common library.
Answers to the above questions, and mainly to the thread below, would
help us make progress on a common library if there is still interest in
collaboration.

http://mails.dpdk.org/archives/dev/2019-October/146063.html


* Re: [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support
  2019-10-04 14:43               ` Jerin Jacob
  2019-10-05  0:00                 ` Daniel Borkmann
@ 2020-04-06 11:05                 ` Ananyev, Konstantin
  1 sibling, 0 replies; 32+ messages in thread
From: Ananyev, Konstantin @ 2020-04-06 11:05 UTC (permalink / raw)
  To: Jerin Jacob, Daniel Borkmann
  Cc: Thomas Monjalon, Steve Capper, Honnappa Nagarahalli,
	Rodolph Perfetta, jerinj, dpdk-dev,
	Gavin Hu (Arm Technology China),
	nd, Alexei Starovoitov, Quentin Monnet, john.fastabend

Hi guys,
 
> On Fri, Oct 4, 2019 at 7:39 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >
> > On 10/4/19 12:53 PM, Thomas Monjalon wrote:
> > > 04/10/2019 11:54, Steve Capper:
> > >> I'd recommend also reaching out the BPF maintainers:
> > >> BPF JIT for ARM64
> > >> M:   Daniel Borkmann <daniel@iogearbox.net>
> > >> M:   Alexei Starovoitov <ast@kernel.org>
> > >> M:   Zi Shen Lim <zlim.lnx@gmail.com>
> > >> L:   netdev@vger.kernel.org
> > >> L:   bpf@vger.kernel.org
> > >> S:   Supported
> > >> F:   arch/arm64/net/
> > >>
> > >> As they will have much better knowledge of the state of play and will be
> > >> better able to advise.
> > >
> > > As far as I know Alexei and Daniel are OK with the idea.
> > > But better to let them reply here.
> > >
> > > I suggest we think about a way to package the kernel BPF JIT
> > > for userspace usage (not only DPDK) as a library.
> > > I don't understand why the DPDK JIT should be different
> > > or optimized differently.
> >
> > That would be great indeed as both projects would benefit from a shared
> > JIT instead of reimplementing everything twice. I never looked into DPDK
> > too much, but I presume the idea would be as well to take the LLVM (or
> > bpf-gcc) generated object file and load it into a BPF 'engine' that sits
> > in user space on top of DPDK? Presumably loader could be libbpf here as
> > well since it already knows how to parse the ELF, perform the relocations
> > etc. The only difference would be that you have a different context and
> > different helpers? Is that the goal eventually?
> >
> > > The only real issue I see is the need for a dual licensing BSD-GPL.
> >
> > This might be one avenue if all kernel JIT contributors would be on board.
> > Another option I'm wondering could be to extend the bpf() syscall in order
> > to pass down a description of context and helper mappings e.g. via BTF and
> > let everything go through the verifier in the kernel the usual way (I presume
> > one goal might be that you want to assure that the generated BPF code passes
> > the safety checks before running the prog), then have it JITed and extract
> > the generated image in order to use it from user space. Kernel would have
> > to make sure it never actually allows attaching this program in the kernel.
> > Generated opcodes can already be retrieved today (see below). Such infra
> > could potentially help bpf-gcc folks as well as they expressed desire to
> > have some sort of a simulator for their gcc BPF test suite.. and it would
> > allow for consistent behavior of the BPF runtime. Just a thought.
> 
> This idea looks good. This can remove the verifier code also from DPDK.
>  A couple of downsides I can think of,
> 
> # We may need to extend the kernel verifier to understand the user-space address
> and its symbols for CALL and MEM access operations.
> # DPDK supports FreeBSD and Windows OS as well
> # Need a different treatment for old Linux kernels.

It seems the discussion eventually died out.
Pinging to check whether there is still any interest in this subject
from the kernel community.
Thanks
Konstantin



Thread overview: 32+ messages
2019-09-03 10:59 [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support jerinj
2019-09-03 10:59 ` [dpdk-dev] [PATCH 1/8] bpf/arm64: add build infrastructure jerinj
2019-09-03 10:59 ` [dpdk-dev] [PATCH 2/8] bpf/arm64: add prologue and epilogue jerinj
2019-09-03 10:59 ` [dpdk-dev] [PATCH 3/8] bpf/arm64: add basic arithmetic operations jerinj
2019-09-03 10:59 ` [dpdk-dev] [PATCH 4/8] bpf/arm64: add logical operations jerinj
2019-09-03 10:59 ` [dpdk-dev] [PATCH 5/8] bpf/arm64: add byte swap operations jerinj
2019-09-03 10:59 ` [dpdk-dev] [PATCH 6/8] bpf/arm64: add load and store operations jerinj
2019-09-03 10:59 ` [dpdk-dev] [PATCH 7/8] bpf/arm64: add atomic-exchange-and-add operation jerinj
2019-10-18 13:16   ` David Marchand
2019-09-03 10:59 ` [dpdk-dev] [PATCH 8/8] bpf/arm64: add branch operation jerinj
2019-09-24 17:03 ` [dpdk-dev] [PATCH 0/8] eBPF arm64 JIT support Ananyev, Konstantin
2019-10-12 12:22   ` Thomas Monjalon
2019-10-03 12:51 ` Thomas Monjalon
2019-10-03 13:07   ` Jerin Jacob
2019-10-03 15:05     ` Ananyev, Konstantin
2019-10-04  4:55       ` Honnappa Nagarahalli
2019-10-04  9:54         ` Steve Capper
2019-10-04 10:53           ` Thomas Monjalon
2019-10-04 14:09             ` Daniel Borkmann
2019-10-04 14:43               ` Jerin Jacob
2019-10-05  0:00                 ` Daniel Borkmann
2019-10-05 14:39                   ` Jerin Jacob
2019-10-07 11:57                     ` Ananyev, Konstantin
2019-10-24  4:22                     ` Jerin Jacob
2020-04-06 11:05                 ` Ananyev, Konstantin
2019-10-04 15:39       ` Jerin Jacob
2019-10-07 12:33         ` Thomas Monjalon
2019-10-07 13:00           ` Jerin Jacob
2019-10-07 18:04             ` Thomas Monjalon
2019-10-07 19:29               ` Jerin Jacob
2019-10-07 20:15                 ` Thomas Monjalon
2019-10-08  6:57                   ` Jerin Jacob
