From: Stephen Hemminger
Date: Thu, 6 Feb 2020 09:11:27 +0000
To: Morten Brørup
Cc: "Ananyev, Konstantin", dev@dpdk.org, Jerin Jacob, "Varghese, Vipin"
Subject: Re: [dpdk-dev] BUG: eBPF missing BPF_ABS

I agree that fixing offsets in the cBPF-to-eBPF converter and passing
the mbuf is easier. There is still the pathological case of a
multi-segment mbuf, but Linux XDP doesn't handle it either.

Let me put an early version of filter2rteebf on GitHub.

On Thu, Feb 6, 2020, 8:54 AM Morten Brørup wrote:

> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev,
> > Konstantin
> > Sent: Wednesday, February 5, 2020 10:16 PM
> >
> > >
> > > As I mentioned in my FOSDEM talk the current DPDK eBPF handling is
> > > not usable for packet filters. I have ported the classic BPF to
> > > eBPF code and the generated code is not usable by DPDK.
> > >
> > > The problem is that DPDK eBPF does not implement all the opcodes.
> > > BPF_ABS is not implemented and must be. It is in the Linux kernel.
> >
> > Yep, it doesn't.
> > This is not a generic eBPF instruction, but sort of an implicit
> > function call to access skb data.
> > At the initial stage of librte_bpf development,
> > I didn't think much about cBPF conversion, and to simplify things
> > decided to put all special skb features aside.
> > But sure, if that will enable DPDK cBPF support, let's add it in.
> > Please file a Bugzilla ticket to track that issue, and I'll try to
> > find some time within 20.05 to look at it.
> > Unless of course, you or someone else would like to volunteer for it.
> >
> > Though as a first step, we probably need to decide what our
> > requirements for it should be in terms of DPDK.
> > From https://www.kernel.org/doc/Documentation/networking/filter.txt:
> > "eBPF has two non-generic instructions: (BPF_ABS | <size> | BPF_LD) and
> > (BPF_IND | <size> | BPF_LD) which are used to access packet data.
> > They had to be carried over from classic to have strong performance of
> > socket filters running in eBPF interpreter. These instructions can only
> > be used when interpreter context is a pointer to 'struct sk_buff' and
> > have seven implicit operands. Register R6 is an implicit input that
> > must contain pointer to sk_buff. Register R0 is an implicit output
> > which contains the data fetched from the packet. Registers R1-R5 are
> > scratch registers and must not be used to store the data across
> > BPF_ABS | BPF_LD or BPF_IND | BPF_LD instructions.
> > These instructions have implicit program exit condition as well. When
> > eBPF program is trying to access the data beyond the packet boundary,
> > the interpreter will abort the execution of the program. JIT compilers
> > therefore must preserve this property. src_reg and imm32 fields are
> > explicit inputs to these instructions.
> > For example:
> > BPF_IND | BPF_W | BPF_LD means:
> >   R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + src_reg + imm32))
> >   and R1 - R5 were scratched."
> >
> > For RTE_BPF_ARG_PTR_MBUF context we probably want behavior similar
> > to Linux, i.e.
> > BPF_IND | BPF_W | BPF_LD would mean something like:
> >
> > 1) uint32_t tmp;
> >    R0 = &tmp;
> >    R0 = rte_pktmbuf_read((const struct rte_mbuf *)R6,
> >                          src_reg + imm32, sizeof(uint32_t), R0);
> >    if (R0 == NULL) return FAILED;
> >    R0 = ntohl(*(uint32_t *)R0);
> >
> > But what to do when ctx is a raw data buffer (RTE_BPF_ARG_PTR)?
> > Should it be just:
> > 2) R0 = ntohl(*(uint32_t *)(R6 + src_reg + imm32));
> > 3) not allow LD_ABS/LD_IND in this mode at all.
> >
>
> I think that 1) is the correct choice for the "cBPF filter" use case.
>
> In that context I consider both 2) and 3) irrelevant because the
> RTE_BPF_ARG_PTR_MBUF type should be used for cBPF filtering. I have
> argued for this before.
>
> Others have argued for using the RTE_BPF_ARG_PTR type instead. Let's
> consider using the RTE_BPF_ARG_PTR for a moment... Is there an implicit
> understanding that the data points to packet data? Then a range check
> in 2) might be relevant.
>
> However, if the RTE_BPF_ARG_PTR_MBUF type is supported and used for
> cBPF->eBPF conversion, we would not need to support LD_ABS/LD_IND for
> the RTE_BPF_ARG_PTR type. So between 2) and 3), I support 3).
>
> > Second question is implementation.
> > I can see two main options here:
> > a) if we plan to have our own cBPF->eBPF converter and support only
> >    it, we can add the generation of these extra instructions to the
> >    converter itself, i.e. cBPF->eBPF conversion for LD_ABS/LD_IND
> >    will generate a series of generic eBPF instructions.
> > b) support eBPF LD_ABS/LD_IND in the eBPF interpreter/jit
> >
> > (a) is probably the simpler way (eBPF interpreter/jit/verifier would
> > remain unchanged), but seems way too limited. So I think (b) is a
> > better choice, even though more work is implied (interpreter seems
> > more or less straightforward, jit would probably need some effort).
> >
>
> This is going to be used in the fast path, probably on all packets on
> an interface. So clearly b).
>
> > Any thoughts/opinions?
> > Konstantin
>
> One more piece of information: Linux cBPF supports auxiliary data (VLAN
> ID, interface index, etc.), i.e. metadata that is not part of the packet
> data, but can be found in the sk_buff/mbuf. Going for 1) and b) might
> make it easier to add support for these later.
>
>
> Med venlig hilsen / kind regards
> - Morten Brørup
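
For illustration, option 1) above can be sketched without DPDK headers.
This is only a rough model under stated assumptions, not librte_bpf code:
`struct seg`, `seg_read()` and `ld_ind_w()` are hypothetical names, with
`seg_read()` imitating the contract of rte_pktmbuf_read() (pointer into
the packet when the range is contiguous, copy into the caller's buffer
when it crosses a segment boundary, NULL when out of range). The
big-endian word is assembled byte by byte rather than with ntohl() so
the sketch does not depend on alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a (possibly multi-segment) packet.
 * This is NOT struct rte_mbuf; it only models a chain of segments. */
struct seg {
	const uint8_t *data;
	uint32_t len;
	const struct seg *next;
};

/* Models the contract of rte_pktmbuf_read(): return a pointer to `len`
 * contiguous bytes at offset `off`, copying into `buf` only when the
 * range crosses a segment boundary; return NULL if out of range. */
static const void *
seg_read(const struct seg *s, uint32_t off, uint32_t len, void *buf)
{
	assert(buf != NULL);

	/* Skip whole segments that lie before the requested offset. */
	while (s != NULL && off >= s->len) {
		off -= s->len;
		s = s->next;
	}
	if (s == NULL)
		return NULL;

	/* Fast path: the range is contiguous within one segment. */
	if (len <= s->len - off)
		return s->data + off;

	/* Slow path: gather bytes from consecutive segments into buf. */
	uint8_t *dst = buf;
	uint32_t left = len;

	while (s != NULL && left > 0) {
		uint32_t n = s->len - off;

		if (n > left)
			n = left;
		memcpy(dst, s->data + off, n);
		dst += n;
		left -= n;
		off = 0;
		s = s->next;
	}
	return left == 0 ? buf : NULL;
}

/* Sketch of BPF_LD | BPF_IND | BPF_W: load the big-endian 32-bit word
 * at offset src_reg + imm32. Returns 0 and stores the value in *r0, or
 * -1 on an out-of-bounds access (where the eBPF program would abort). */
static int
ld_ind_w(const struct seg *pkt, uint32_t src_reg, int32_t imm32,
	uint32_t *r0)
{
	uint8_t tmp[4];
	int64_t off = (int64_t)src_reg + imm32;

	if (off < 0 || off > UINT32_MAX)
		return -1;

	const uint8_t *p = seg_read(pkt, (uint32_t)off, sizeof(tmp), tmp);

	if (p == NULL)
		return -1;

	*r0 = (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
		(uint32_t)p[2] << 8 | (uint32_t)p[3];
	return 0;
}
```

In this model a load that straddles two segments still succeeds through
the copy path, which is the multi-segment case mentioned at the top of
this mail, while any load past the end of the chain fails instead of
reading stray memory.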