From: Stephen Hemminger
To: dev@dpdk.org
Cc: Madhuker Mythri, Stephen Hemminger
Subject: [PATCH 2/3] net/tap: Fixed RSS algorithm to support fragmented packets
Date: Tue, 31 Oct 2023 15:08:12 -0700
Message-ID: <20231031220921.96023-3-stephen@networkplumber.org>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20231031220921.96023-1-stephen@networkplumber.org>
References: <20230716212544.5625-1-stephen@networkplumber.org> <20231031220921.96023-1-stephen@networkplumber.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id: DPDK patches and discussions

From: Madhuker Mythri

Analysis of the TAP PMD shows that the existing RSS algorithm hashes on
the 4-tuple (Src-IP, Dst-IP, Src-port and Dst-port) and does not identify
fragmented packets. As a result, the fragments of a single packet get
different RSS hash values and are distributed across multiple queues.

The RSS algorithm assumes that all incoming IP packets carry an L4
protocol (UDP/TCP) and tries to fetch the L4 fields (Src-port and
Dst-port) from every incoming packet. For fragmented packets, however,
the L4 header is present only in the first fragment, so the L4 header
fields should not be included in the RSS hash for fragments. This is a
bug in the RSS algorithm implemented in the BPF functionality of the
TAP PMD.

So, modified the RSS eBPF C program and generated the C array in the
'tap_bpf_insns.h' file, which holds the eBPF byte-code instructions.

Bugzilla Id: 870
Signed-off-by: Madhuker Mythri
Signed-off-by: Stephen Hemminger
---
 drivers/net/tap/bpf/tap_bpf_program.c | 47 ++++++++++++++++++++++-----
 1 file changed, 39 insertions(+), 8 deletions(-)

diff --git a/drivers/net/tap/bpf/tap_bpf_program.c b/drivers/net/tap/bpf/tap_bpf_program.c
index ff6f1606fb..3d431dfa43 100644
--- a/drivers/net/tap/bpf/tap_bpf_program.c
+++ b/drivers/net/tap/bpf/tap_bpf_program.c
@@ -19,6 +19,8 @@
 #include "bpf_elf.h"
 #include "../tap_rss.h"
 
+#include "bpf_api.h"
+
 /** Create IPv4 address */
 #define IPv4(a, b, c, d) ((__u32)(((a) & 0xff) << 24) | \
 		(((b) & 0xff) << 16) | \
@@ -132,6 +134,8 @@ rss_l3_l4(struct __sk_buff *skb)
 	__u8 *key = 0;
 	__u32 len;
 	__u32 queue = 0;
+	bool mf = 0;
+	__u16 frag_off = 0;
 
 	rsskey = map_lookup_elem(&map_keys, &key_idx);
 	if (!rsskey) {
@@ -156,6 +160,8 @@ rss_l3_l4(struct __sk_buff *skb)
 			return TC_ACT_OK;
 
 		__u8 *src_dst_addr = data + off + offsetof(struct iphdr, saddr);
+		__u8 *frag_off_addr = data + off + offsetof(struct iphdr, frag_off);
+		__u8 *prot_addr = data + off + offsetof(struct iphdr, protocol);
 		__u8 *src_dst_port = data + off + sizeof(struct iphdr);
 		struct ipv4_l3_l4_tuple v4_tuple = {
 			.src_addr = IPv4(*(src_dst_addr + 0),
@@ -166,11 +172,25 @@ rss_l3_l4(struct __sk_buff *skb)
 					*(src_dst_addr + 5),
 					*(src_dst_addr + 6),
 					*(src_dst_addr + 7)),
-			.sport = PORT(*(src_dst_port + 0),
-					*(src_dst_port + 1)),
-			.dport = PORT(*(src_dst_port + 2),
-					*(src_dst_port + 3)),
+			.sport = 0,
+			.dport = 0,
 		};
+		/** Fetch the L4-layer port numbers only in-case of TCP/UDP
+		 ** and also if the packet is not fragmented. Since fragmented
+		 ** chunks do not have L4 TCP/UDP header.
+		 **/
+		if (*prot_addr == IPPROTO_UDP || *prot_addr == IPPROTO_TCP) {
+			frag_off = PORT(*(frag_off_addr + 0),
+					*(frag_off_addr + 1));
+			mf = frag_off & 0x2000;
+			frag_off = frag_off & 0x1fff;
+			if (mf == 0 && frag_off == 0) {
+				v4_tuple.sport = PORT(*(src_dst_port + 0),
+						*(src_dst_port + 1));
+				v4_tuple.dport = PORT(*(src_dst_port + 2),
+						*(src_dst_port + 3));
+			}
+		}
 		__u8 input_len = sizeof(v4_tuple) / sizeof(__u32);
 		if (rsskey->hash_fields & (1 << HASH_FIELD_IPV4_L3))
 			input_len--;
@@ -183,6 +203,9 @@ rss_l3_l4(struct __sk_buff *skb)
 					offsetof(struct ipv6hdr, saddr);
 		__u8 *src_dst_port = data + off +
 					sizeof(struct ipv6hdr);
+		__u8 *next_hdr = data + off +
+					offsetof(struct ipv6hdr, nexthdr);
+
 		struct ipv6_l3_l4_tuple v6_tuple;
 		for (j = 0; j < 4; j++)
 			*((uint32_t *)&v6_tuple.src_addr + j) =
@@ -192,10 +215,18 @@ rss_l3_l4(struct __sk_buff *skb)
 			*((uint32_t *)&v6_tuple.dst_addr + j) =
 				__builtin_bswap32(*((uint32_t *)
 						src_dst_addr + 4 + j));
-		v6_tuple.sport = PORT(*(src_dst_port + 0),
-			      *(src_dst_port + 1));
-		v6_tuple.dport = PORT(*(src_dst_port + 2),
-			      *(src_dst_port + 3));
+
+		/** Fetch the L4 header port-numbers only if next-header
+		 * is TCP/UDP **/
+		if (*next_hdr == IPPROTO_UDP || *next_hdr == IPPROTO_TCP) {
+			v6_tuple.sport = PORT(*(src_dst_port + 0),
+				      *(src_dst_port + 1));
+			v6_tuple.dport = PORT(*(src_dst_port + 2),
+				      *(src_dst_port + 3));
+		} else {
+			v6_tuple.sport = 0;
+			v6_tuple.dport = 0;
+		}
 
 		__u8 input_len = sizeof(v6_tuple) / sizeof(__u32);
 		if (rsskey->hash_fields & (1 << HASH_FIELD_IPV6_L3))
-- 
2.41.0
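
For anyone who wants to check the IPv4 fragment test outside of BPF, the
rule in the patch (use the ports only for TCP/UDP, and only when both the
MF flag and the fragment offset are zero) can be sketched in plain
userspace C roughly as follows. This is only an illustration, not the
code from tap_bpf_program.c: the names load_be16(), ipv4_is_fragment(),
ipv4_rss_ports() and struct rss_ports are made up for this example, and,
like the patch, it assumes a 20-byte IPv4 header without options.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical, chosen for this sketch only. */
#define SKETCH_IPPROTO_TCP   6       /* IANA protocol number for TCP */
#define SKETCH_IPPROTO_UDP   17      /* IANA protocol number for UDP */
#define SKETCH_IP_MF         0x2000  /* "more fragments" flag */
#define SKETCH_IP_OFFMASK    0x1fff  /* 13-bit fragment offset */
#define SKETCH_IPV4_HDR_LEN  20      /* assumption: no IPv4 options */

struct rss_ports {
	uint16_t sport;
	uint16_t dport;
};

/* Combine two bytes in network order, like the PORT() macro in the patch. */
uint16_t load_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* A packet is a fragment if MF is set or the fragment offset is non-zero. */
bool ipv4_is_fragment(const uint8_t *ip)
{
	uint16_t frag = load_be16(ip + 6); /* flags + fragment offset field */

	return (frag & SKETCH_IP_MF) != 0 || (frag & SKETCH_IP_OFFMASK) != 0;
}

/*
 * Return the L4 ports to feed into the RSS hash, or zeros when the
 * packet is not TCP/UDP, is a fragment, or is too short to hold ports.
 */
struct rss_ports ipv4_rss_ports(const uint8_t *ip, size_t len)
{
	struct rss_ports ports = { 0, 0 };
	uint8_t proto;

	if (len < SKETCH_IPV4_HDR_LEN + 4)
		return ports;

	proto = ip[9]; /* protocol field of the IPv4 header */
	if ((proto == SKETCH_IPPROTO_TCP || proto == SKETCH_IPPROTO_UDP) &&
	    !ipv4_is_fragment(ip)) {
		ports.sport = load_be16(ip + SKETCH_IPV4_HDR_LEN);
		ports.dport = load_be16(ip + SKETCH_IPV4_HDR_LEN + 2);
	}
	return ports;
}
```

With this rule every fragment of a datagram, including the first one
(which has MF set), hashes on addresses only, so all fragments should
land on the same queue. A packet with only the DF bit set is not treated
as a fragment and keeps its ports in the hash.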