From: Joshua Washington
Date: Mon, 24 Apr 2023 10:55:49 -0700
Subject: Re: [PATCH v5] app/testpmd: txonly multiflow port change support
To: Aman Singh, Yuying Zhang
Cc: dev@dpdk.org, Rushil Gupta
References: <20230412181619.496342-1-joshwash@google.com> <20230421232022.342081-1-joshwash@google.com>
In-Reply-To: <20230421232022.342081-1-joshwash@google.com>
List-Id: DPDK patches and discussions

After updating the patch, it seems that the `lcores_autotest` unit test now
times out on Windows Server 2019. I looked at the test logs, but they were
identical as far as I could tell, with the timed out test even printing
"Test OK" to stdout. Is this a flake? Or is there any other way to get extra
information about why the test timed out or run the test with extra
debugging information?

Thanks,
Josh

On Fri, Apr 21, 2023 at 4:20 PM Joshua Washington <joshwash@google.com> wrote:
Google cloud routes traffic using IP addresses without the support of MAC
addresses, so changing source IP address for txonly-multi-flow can have
negative performance implications for net/gve when using testpmd. This
patch updates txonly multiflow mode to modify source ports instead of
source IP addresses.

The change can be tested with the following command:
dpdk-testpmd -- --forward-mode=txonly --txonly-multi-flow \
    --tx-ip=<SRC>,<DST>

Signed-off-by: Joshua Washington <joshwash@google.com>
Reviewed-by: Rushil Gupta <rushilg@google.com>
---
 app/test-pmd/txonly.c | 39 +++++++++++++++++++++++----------------
 1 file changed, 23 insertions(+), 16 deletions(-)

diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
index b3d6873104..f79e0e5d0b 100644
--- a/app/test-pmd/txonly.c
+++ b/app/test-pmd/txonly.c
@@ -56,7 +56,7 @@ uint32_t tx_ip_dst_addr = (198U << 24) | (18 << 16) | (0 << 8) | 2;
 #define IP_DEFTTL  64   /* from RFC 1340. */

 static struct rte_ipv4_hdr pkt_ip_hdr; /**< IP header of transmitted packets. */
-RTE_DEFINE_PER_LCORE(uint8_t, _ip_var); /**< IP address variation */
+RTE_DEFINE_PER_LCORE(uint8_t, _src_var); /**< Source port variation */
 static struct rte_udp_hdr pkt_udp_hdr; /**< UDP header of tx packets. */

 static uint64_t timestamp_mask; /**< Timestamp dynamic flag mask */
@@ -230,28 +230,35 @@ pkt_burst_prepare(struct rte_mbuf *pkt, struct rte_mempool *mbp,
        copy_buf_to_pkt(eth_hdr, sizeof(*eth_hdr), pkt, 0);
        copy_buf_to_pkt(&pkt_ip_hdr, sizeof(pkt_ip_hdr), pkt,
                        sizeof(struct rte_ether_hdr));
+       copy_buf_to_pkt(&pkt_udp_hdr, sizeof(pkt_udp_hdr), pkt,
+                       sizeof(struct rte_ether_hdr) +
+                       sizeof(struct rte_ipv4_hdr));
        if (txonly_multi_flow) {
-               uint8_t  ip_var = RTE_PER_LCORE(_ip_var);
-               struct rte_ipv4_hdr *ip_hdr;
-               uint32_t addr;
+               uint16_t src_var = RTE_PER_LCORE(_src_var);
+               struct rte_udp_hdr *udp_hdr;
+               uint16_t port;

-               ip_hdr = rte_pktmbuf_mtod_offset(pkt,
-                               struct rte_ipv4_hdr *,
-                               sizeof(struct rte_ether_hdr));
+               udp_hdr = rte_pktmbuf_mtod_offset(pkt,
+                               struct rte_udp_hdr *,
+                               sizeof(struct rte_ether_hdr) +
+                               sizeof(struct rte_ipv4_hdr));
                /*
-                * Generate multiple flows by varying IP src addr. This
-                * enables packets are well distributed by RSS in
+                * Generate multiple flows by varying UDP source port.
+                * This enables packets are well distributed by RSS in
                 * receiver side if any and txonly mode can be a decent
                 * packet generator for developer's quick performance
                 * regression test.
+                *
+                * Only ports in the range 49152 (0xC000) and 65535 (0xFFFF)
+                * will be used, with the least significant byte representing
+                * the lcore ID. As such, the most significant byte will cycle
+                * through 0xC0 and 0xFF.
                 */
-               addr = (tx_ip_dst_addr | (ip_var++ << 8)) + rte_lcore_id();
-               ip_hdr->src_addr = rte_cpu_to_be_32(addr);
-               RTE_PER_LCORE(_ip_var) = ip_var;
+               port = ((((src_var++) % (0xFF - 0xC0) + 0xC0) & 0xFF) << 8)
+                               + rte_lcore_id();
+               udp_hdr->src_port = rte_cpu_to_be_16(port);
+               RTE_PER_LCORE(_src_var) = src_var;
        }
-       copy_buf_to_pkt(&pkt_udp_hdr, sizeof(pkt_udp_hdr), pkt,
-                       sizeof(struct rte_ether_hdr) +
-                       sizeof(struct rte_ipv4_hdr));

        if (unlikely(tx_pkt_split == TX_PKT_SPLIT_RND) || txonly_multi_flow)
                update_pkt_header(pkt, pkt_len);
@@ -393,7 +400,7 @@ pkt_burst_transmit(struct fwd_stream *fs)
        nb_tx = common_fwd_stream_transmit(fs, pkts_burst, nb_pkt);

        if (txonly_multi_flow)
-               RTE_PER_LCORE(_ip_var) -= nb_pkt - nb_tx;
+               RTE_PER_LCORE(_src_var) -= nb_pkt - nb_tx;

        if (unlikely(nb_tx < nb_pkt)) {
                if (verbose_level > 0 && fs->fwd_dropped == 0)
--
2.40.0.634.g4ca3ef3211-goog



--

Joshua Washington | Software Engineer | joshwash@google.com | (414) 366-4423