From mboxrd@z Thu Jan  1 00:00:00 1970
From: Stephen Hemminger
To: dev@dpdk.org
Cc: Madhuker Mythri, Stephen Hemminger
Subject: [PATCH v5 2/3] net/tap: Fixed RSS algorithm to support fragmented packets
Date: Tue, 31 Oct 2023 15:42:23 -0700
Message-ID: <20231031224429.150002-3-stephen@networkplumber.org>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20231031224429.150002-1-stephen@networkplumber.org>
References: <20230716212544.5625-1-stephen@networkplumber.org>
 <20231031224429.150002-1-stephen@networkplumber.org>

From: Madhuker Mythri

Analysis of the TAP PMD shows that the existing RSS algorithm hashes on
the 4-tuple (Src-IP, Dst-IP, Src-port and Dst-port) and does not
identify fragmented packets. As a result, the fragments of a single
packet get different RSS hash values and are distributed across
multiple queues.

The RSS algorithm assumes that every incoming IP packet carries an L4
(UDP/TCP) header and tries to fetch the L4 fields (Src-port and
Dst-port) from each incoming packet. In a fragmented packet, however,
the L4 header is present only in the first fragment, so the L4 header
fields must not be included in the RSS hash for fragments. This is a
bug in the RSS algorithm implemented in the BPF functionality of the
TAP PMD.

So, modified the RSS eBPF C program and generated the C array in the
'tap_bpf_insns.h' file, which holds the program in eBPF byte-code
instruction format.

Bugzilla Id: 870
Signed-off-by: Madhuker Mythri
Signed-off-by: Stephen Hemminger
---
 drivers/net/tap/bpf/tap_bpf_program.c | 47 ++++++++++++++++++++++-----
 1 file changed, 39 insertions(+), 8 deletions(-)

diff --git a/drivers/net/tap/bpf/tap_bpf_program.c b/drivers/net/tap/bpf/tap_bpf_program.c
index d65021d8a1..369c7b107f 100644
--- a/drivers/net/tap/bpf/tap_bpf_program.c
+++ b/drivers/net/tap/bpf/tap_bpf_program.c
@@ -19,6 +19,8 @@
 #include "bpf_elf.h"
 #include "../tap_rss.h"
 
+#include "bpf_api.h"
+
 /** Create IPv4 address */
 #define IPv4(a, b, c, d) ((__u32)(((a) & 0xff) << 24) | \
 		(((b) & 0xff) << 16) | \
@@ -133,6 +135,8 @@ rss_l3_l4(struct __sk_buff *skb)
 	__u8 *key = 0;
 	__u32 len;
 	__u32 queue = 0;
+	bool mf = 0;
+	__u16 frag_off = 0;
 
 	rsskey = map_lookup_elem(&map_keys, &key_idx);
 	if (!rsskey) {
@@ -157,6 +161,8 @@ rss_l3_l4(struct __sk_buff *skb)
 			return TC_ACT_OK;
 
 		__u8 *src_dst_addr = data + off + offsetof(struct iphdr, saddr);
+		__u8 *frag_off_addr = data + off + offsetof(struct iphdr, frag_off);
+		__u8 *prot_addr = data + off + offsetof(struct iphdr, protocol);
 		__u8 *src_dst_port = data + off + sizeof(struct iphdr);
 		struct ipv4_l3_l4_tuple v4_tuple = {
 			.src_addr = IPv4(*(src_dst_addr + 0),
@@ -167,11 +173,25 @@ rss_l3_l4(struct __sk_buff *skb)
 					*(src_dst_addr + 5),
 					*(src_dst_addr + 6),
 					*(src_dst_addr + 7)),
-			.sport = PORT(*(src_dst_port + 0),
-					*(src_dst_port + 1)),
-			.dport = PORT(*(src_dst_port + 2),
-					*(src_dst_port + 3)),
+			.sport = 0,
+			.dport = 0,
 		};
+		/** Fetch the L4-layer port numbers only in-case of TCP/UDP
+		 ** and also if the packet is not fragmented. Since fragmented
+		 ** chunks do not have L4 TCP/UDP header.
+		 **/
+		if (*prot_addr == IPPROTO_UDP || *prot_addr == IPPROTO_TCP) {
+			frag_off = PORT(*(frag_off_addr + 0),
+					*(frag_off_addr + 1));
+			mf = frag_off & 0x2000;
+			frag_off = frag_off & 0x1fff;
+			if (mf == 0 && frag_off == 0) {
+				v4_tuple.sport = PORT(*(src_dst_port + 0),
+						*(src_dst_port + 1));
+				v4_tuple.dport = PORT(*(src_dst_port + 2),
+						*(src_dst_port + 3));
+			}
+		}
 		__u8 input_len = sizeof(v4_tuple) / sizeof(__u32);
 		if (rsskey->hash_fields & (1 << HASH_FIELD_IPV4_L3))
 			input_len--;
@@ -184,6 +204,9 @@ rss_l3_l4(struct __sk_buff *skb)
 					offsetof(struct ipv6hdr, saddr);
 		__u8 *src_dst_port = data + off +
 					sizeof(struct ipv6hdr);
+		__u8 *next_hdr = data + off +
+					offsetof(struct ipv6hdr, nexthdr);
+
 		struct ipv6_l3_l4_tuple v6_tuple;
 		for (j = 0; j < 4; j++)
 			*((uint32_t *)&v6_tuple.src_addr + j) =
@@ -193,10 +216,18 @@ rss_l3_l4(struct __sk_buff *skb)
 			*((uint32_t *)&v6_tuple.dst_addr + j) =
 				__builtin_bswap32(*((uint32_t *)
 						src_dst_addr + 4 + j));
-		v6_tuple.sport = PORT(*(src_dst_port + 0),
-			      *(src_dst_port + 1));
-		v6_tuple.dport = PORT(*(src_dst_port + 2),
-			      *(src_dst_port + 3));
+
+		/** Fetch the L4 header port-numbers only if next-header
+		 * is TCP/UDP **/
+		if (*next_hdr == IPPROTO_UDP || *next_hdr == IPPROTO_TCP) {
+			v6_tuple.sport = PORT(*(src_dst_port + 0),
+				      *(src_dst_port + 1));
+			v6_tuple.dport = PORT(*(src_dst_port + 2),
+				      *(src_dst_port + 3));
+		} else {
+			v6_tuple.sport = 0;
+			v6_tuple.dport = 0;
+		}
 
 		__u8 input_len = sizeof(v6_tuple) / sizeof(__u32);
 		if (rsskey->hash_fields & (1 << HASH_FIELD_IPV6_L3))
-- 
2.41.0