From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stephen Hemminger
To: dev@dpdk.org
Cc: Stephen Hemminger, Reshma Pattan, Quentin Armitage
Subject: [PATCH 4/4] pcapng: move timestamp calculation into pdump
Date: Wed, 20 Sep 2023 21:23:49 -0700
Message-Id: <20230921042349.104150-5-stephen@networkplumber.org>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230921042349.104150-1-stephen@networkplumber.org>
References: <20230921042349.104150-1-stephen@networkplumber.org>

Computing the timestamp is easier in pdump than in pcapng.
Initialization is simpler, and the pcapng library no longer
has any global state. It also makes it easier to add HW
timestamp support later.

Simplify the conversion from TSC cycles to nanoseconds into a
two-step process that avoids numeric overflow. The previous
code was also not thread-safe.

Fixes: c882eb544842 ("pcapng: fix timestamp wrapping in output files")
Signed-off-by: Stephen Hemminger
---
 lib/pcapng/rte_pcapng.c | 71 ++---------------------------------------
 lib/pcapng/rte_pcapng.h |  2 +-
 lib/pdump/rte_pdump.c   | 56 +++++++++++++++++++++++++++++---
 3 files changed, 55 insertions(+), 74 deletions(-)

diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c
index ddce7bc87141..f6b3bd0ca718 100644
--- a/lib/pcapng/rte_pcapng.c
+++ b/lib/pcapng/rte_pcapng.c
@@ -25,7 +25,6 @@
 #include
 #include
 #include
-#include
 #include
 
 #include "pcapng_proto.h"
@@ -43,15 +42,6 @@ struct rte_pcapng {
 	uint32_t port_index[RTE_MAX_ETHPORTS];
 };
 
-/* For converting TSC cycles to PCAPNG ns format */
-static struct pcapng_time {
-	uint64_t ns;
-	uint64_t cycles;
-	uint64_t tsc_hz;
-	struct rte_reciprocal_u64 tsc_hz_inverse;
-} pcapng_time;
-
-
 #ifdef RTE_EXEC_ENV_WINDOWS
 /*
  * Windows does not have writev() call.
@@ -102,58 +92,6 @@ static ssize_t writev(int fd, const struct iovec *iov, int iovcnt)
 #define if_indextoname(ifindex, ifname) NULL
 #endif
 
-static inline void
-pcapng_init(void)
-{
-	struct timespec ts;
-
-	pcapng_time.cycles = rte_get_tsc_cycles();
-	clock_gettime(CLOCK_REALTIME, &ts);
-	pcapng_time.cycles = (pcapng_time.cycles + rte_get_tsc_cycles()) / 2;
-	pcapng_time.ns = rte_timespec_to_ns(&ts);
-
-	pcapng_time.tsc_hz = rte_get_tsc_hz();
-	pcapng_time.tsc_hz_inverse = rte_reciprocal_value_u64(pcapng_time.tsc_hz);
-}
-
-/* PCAPNG timestamps are in nanoseconds */
-static uint64_t pcapng_tsc_to_ns(uint64_t cycles)
-{
-	uint64_t delta, secs;
-
-	if (!pcapng_time.tsc_hz)
-		pcapng_init();
-
-	/* In essence the calculation is:
-	 *   delta = (cycles - pcapng_time.cycles) * NSEC_PRE_SEC / rte_get_tsc_hz()
-	 * but this overflows within 4 to 8 seconds depending on TSC frequency.
-	 * Instead, if delta >= pcapng_time.tsc_hz:
-	 *   Increase pcapng_time.ns and pcapng_time.cycles by the number of
-	 *   whole seconds in delta and reduce delta accordingly.
-	 * delta will therefore always lie in the interval [0, pcapng_time.tsc_hz),
-	 * which will not overflow when multiplied by NSEC_PER_SEC provided the
-	 * TSC frequency < approx 18.4GHz.
-	 *
-	 * Currently all TSCs operate below 5GHz.
-	 */
-	delta = cycles - pcapng_time.cycles;
-	if (unlikely(delta >= pcapng_time.tsc_hz)) {
-		if (likely(delta < pcapng_time.tsc_hz * 2)) {
-			delta -= pcapng_time.tsc_hz;
-			pcapng_time.cycles += pcapng_time.tsc_hz;
-			pcapng_time.ns += NSEC_PER_SEC;
-		} else {
-			secs = rte_reciprocal_divide_u64(delta, &pcapng_time.tsc_hz_inverse);
-			delta -= secs * pcapng_time.tsc_hz;
-			pcapng_time.cycles += secs * pcapng_time.tsc_hz;
-			pcapng_time.ns += secs * NSEC_PER_SEC;
-		}
-	}
-
-	return pcapng_time.ns + rte_reciprocal_divide_u64(delta * NSEC_PER_SEC,
-							  &pcapng_time.tsc_hz_inverse);
-}
-
 /* length of option including padding */
 static uint16_t pcapng_optlen(uint16_t len)
 {
@@ -518,7 +456,7 @@ struct rte_mbuf *
 rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 		const struct rte_mbuf *md,
 		struct rte_mempool *mp,
-		uint32_t length, uint64_t cycles,
+		uint32_t length, uint64_t timestamp,
 		enum rte_pcapng_direction direction,
 		const char *comment)
 {
@@ -527,14 +465,11 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 	struct pcapng_option *opt;
 	uint16_t optlen;
 	struct rte_mbuf *mc;
-	uint64_t ns;
 	bool rss_hash;
 
 #ifdef RTE_LIBRTE_ETHDEV_DEBUG
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
 #endif
-	ns = pcapng_tsc_to_ns(cycles);
-
 	orig_len = rte_pktmbuf_pkt_len(md);
 
 	/* Take snapshot of the data */
@@ -639,8 +574,8 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 	/* Interface index is filled in later during write */
 	mc->port = port_id;
 
-	epb->timestamp_hi = ns >> 32;
-	epb->timestamp_lo = (uint32_t)ns;
+	epb->timestamp_hi = timestamp >> 32;
+	epb->timestamp_lo = (uint32_t)timestamp;
 	epb->capture_length = data_len;
 	epb->original_length = orig_len;
 
diff --git a/lib/pcapng/rte_pcapng.h b/lib/pcapng/rte_pcapng.h
index 1225ed5536ff..b9a9ee23ad1d 100644
--- a/lib/pcapng/rte_pcapng.h
+++ b/lib/pcapng/rte_pcapng.h
@@ -122,7 +122,7 @@ enum rte_pcapng_direction {
  *   The upper limit on bytes to copy.  Passing UINT32_MAX
  *   means all data (after offset).
  * @param timestamp
- *   The timestamp in TSC cycles.
+ *   The timestamp in nanoseconds since 1/1/1970.
  * @param direction
  *   The direction of the packer: receive, transmit or unknown.
  * @param comment
diff --git a/lib/pdump/rte_pdump.c b/lib/pdump/rte_pdump.c
index a70085bd0211..384abf5e27ad 100644
--- a/lib/pdump/rte_pdump.c
+++ b/lib/pdump/rte_pdump.c
@@ -10,7 +10,9 @@
 #include
 #include
 #include
+#include
 #include
+#include
 #include
 
 #include "rte_pdump.h"
@@ -78,6 +80,33 @@ static struct {
 	const struct rte_memzone *mz;
 } *pdump_stats;
 
+/* Time conversion values */
+static struct {
+	uint64_t offset_ns;	/* ns since 1/1/1970 when initialized */
+	uint64_t tsc_base;	/* TSC when initialized */
+	uint64_t tsc_hz;	/* copy of rte_tsc_hz() */
+	struct rte_reciprocal_u64 tsc_hz_inverse; /* inverse of tsc_hz */
+} pdump_time;
+
+/* Convert from TSC (CPU cycles) to nanoseconds */
+static uint64_t pdump_timestamp(void)
+{
+	uint64_t delta, secs, ns;
+
+	delta = rte_get_tsc_cycles() - pdump_time.tsc_base;
+
+	/* Avoid numeric wraparound by computing seconds first */
+	secs = rte_reciprocal_divide_u64(delta, &pdump_time.tsc_hz_inverse);
+
+	/* Remove the seconds portion */
+	delta -= secs * pdump_time.tsc_hz;
+	ns = rte_reciprocal_divide_u64(delta * NS_PER_S,
+				       &pdump_time.tsc_hz_inverse);
+
+	return secs * NS_PER_S + ns + pdump_time.offset_ns;
+}
+
+
 /* Create a clone of mbuf to be placed into ring. */
 static void
 pdump_copy(uint16_t port_id, uint16_t queue,
@@ -90,7 +119,7 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 	int ring_enq;
 	uint16_t d_pkts = 0;
 	struct rte_mbuf *dup_bufs[nb_pkts];
-	uint64_t ts;
+	uint64_t timestamp = 0;
 	struct rte_ring *ring;
 	struct rte_mempool *mp;
 	struct rte_mbuf *p;
@@ -99,7 +128,6 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 	if (cbs->filter)
 		rte_bpf_exec_burst(cbs->filter, (void **)pkts, rcs, nb_pkts);
 
-	ts = rte_get_tsc_cycles();
 	ring = cbs->ring;
 	mp = cbs->mp;
 	for (i = 0; i < nb_pkts; i++) {
@@ -119,12 +147,17 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 		 * If using pcapng then want to wrap packets
 		 * otherwise a simple copy.
 		 */
-		if (cbs->ver == V2)
+		if (cbs->ver == V2) {
+			/* calculate timestamp on first packet */
+			if (timestamp == 0)
+				timestamp = pdump_timestamp();
+
 			p = rte_pcapng_copy(port_id, queue,
 					    pkts[i], mp, cbs->snaplen,
-					    ts, direction, NULL);
-		else
+					    timestamp, direction, NULL);
+		} else {
 			p = rte_pktmbuf_copy(pkts[i], mp, 0, cbs->snaplen);
+		}
 
 		if (unlikely(p == NULL))
 			__atomic_fetch_add(&stats->nombuf, 1, __ATOMIC_RELAXED);
@@ -421,8 +454,21 @@ int
 rte_pdump_init(void)
 {
 	const struct rte_memzone *mz;
+	struct timespec ts;
+	uint64_t cycles;
 	int ret;
 
+	/* Compute time base offsets */
+	cycles = rte_get_tsc_cycles();
+	clock_gettime(CLOCK_REALTIME, &ts);
+
+	/* put initial TSC value in middle of clock_gettime() call */
+	pdump_time.tsc_base = (cycles + rte_get_tsc_cycles()) / 2;
+	pdump_time.offset_ns = rte_timespec_to_ns(&ts);
+
+	pdump_time.tsc_hz = rte_get_tsc_hz();
+	pdump_time.tsc_hz_inverse = rte_reciprocal_value_u64(pdump_time.tsc_hz);
+
 	mz = rte_memzone_reserve(MZ_RTE_PDUMP_STATS, sizeof(*pdump_stats),
 				 rte_socket_id(), 0);
 	if (mz == NULL) {
-- 
2.39.2
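
For readers following the two-step conversion described in the commit message, here is a minimal standalone sketch of the arithmetic. It is not part of the patch. The helper names tsc_delta_to_ns() and tsc_delta_to_ns_naive() are hypothetical, and plain 64-bit division stands in for rte_reciprocal_divide_u64() so the example builds without DPDK.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_S 1000000000ULL

/*
 * Hypothetical helper (not in the patch): convert a TSC cycle delta
 * to nanoseconds in two steps. Plain '/' is an assumption standing in
 * for rte_reciprocal_divide_u64().
 */
static uint64_t tsc_delta_to_ns(uint64_t delta, uint64_t tsc_hz)
{
	uint64_t secs, rem;

	assert(tsc_hz != 0);

	/* Step 1: whole seconds contained in the delta */
	secs = delta / tsc_hz;

	/* Step 2: leftover cycles, always smaller than tsc_hz */
	rem = delta - secs * tsc_hz;

	/* rem * NS_PER_S fits in 64 bits while tsc_hz is below about 18.4 GHz */
	return secs * NS_PER_S + rem * NS_PER_S / tsc_hz;
}

/*
 * One-step form, shown only for comparison: the multiplication wraps
 * once delta * NS_PER_S exceeds 2^64.
 */
static uint64_t tsc_delta_to_ns_naive(uint64_t delta, uint64_t tsc_hz)
{
	assert(tsc_hz != 0);
	return delta * NS_PER_S / tsc_hz;
}
```

With an assumed 3 GHz TSC, a delta of 31500000000 cycles (10.5 seconds) converts correctly with the two-step helper, while the one-step product exceeds 2^64 and wraps. That matches the "4 to 8 seconds" overflow window noted in the removed comment.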