Subject: Re: [RFC v2] net/af_packet: make stats reset reliable
From: Mattias Rönnblom
To: Ferruh Yigit, John W. Linville
Cc: Thomas Monjalon, dev@dpdk.org, Mattias Rönnblom, Stephen Hemminger, Morten Brørup
Date: Tue, 7 May 2024 09:23:42 +0200
Message-ID: <238675d1-b0bb-41df-8338-a1052c1a88c1@lysator.liu.se>
In-Reply-To: <108e0c40-33e7-4eed-83de-eaedee454480@lysator.liu.se>
References: <20240425174617.2126159-1-ferruh.yigit@amd.com> <20240426143848.2280689-1-ferruh.yigit@amd.com> <108e0c40-33e7-4eed-83de-eaedee454480@lysator.liu.se>

On 2024-04-28 17:11, Mattias Rönnblom wrote:
> On 2024-04-26 16:38, Ferruh Yigit wrote:
>> For stats reset, use an offset instead of zeroing out the actual
>> stats values; get_stats() reports the difference between the stats
>> and the offset. This way, the stats are only updated in the datapath
>> and the offset is only updated in the stats reset function, which
>> makes the stats reset function more reliable.
>>
>> As the stats are written only by a single thread, we can remove the
>> 'volatile' qualifier, which should improve performance in the
>> datapath.
>>
>
> volatile wouldn't help you if you had multiple writers, so that can't
> be the reason for its removal. It would be more accurate to say it
> should be replaced with atomic updates. If you use neither volatile
> nor atomics, you have to consider whether the compiler can reach the
> conclusion that it does not need to store the counter value for future
> use *for that thread*. If it can, the store does not actually need to
> occur. Since DPDK statistics tend to work, it's pretty obvious that
> current compilers tend not to reach this conclusion.
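To make the concern concrete, here is a sketch of the failure mode
described above. rx_count() is a hypothetical helper, not part of the
driver:

static void
rx_count(struct pkt_rx_queue *rxq, uint16_t num_rx, uint64_t nb_bytes)
{
	uint16_t i;

	/*
	 * With neither volatile nor atomics, nothing forces a store on
	 * every iteration: the compiler may legally keep rx_pkts in a
	 * register across the loop and write it back once, since a
	 * concurrent reader would constitute a data race in the C11
	 * memory model.
	 */
	for (i = 0; i < num_rx; i++)
		rxq->rx_pkts++;

	rxq->rx_bytes += nb_bytes;
}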
>
> If this should be done 100% properly, the update operation should be a
> non-atomic load, a non-atomic add, and an atomic store. Similarly, for
> the reset, the offset store should be atomic.
>
> Considering the state of the rest of the DPDK code base, I think a
> non-atomic, non-volatile solution is also fine.
>
> (That said, I think we're better off just deprecating stats reset
> altogether, and returning -ENOTSUP here meanwhile.)
>
>> Signed-off-by: Ferruh Yigit
>> ---
>> Cc: Mattias Rönnblom
>> Cc: Stephen Hemminger
>> Cc: Morten Brørup
>>
>> This update was triggered by the mailing list discussion [1].
>>
>> [1] https://inbox.dpdk.org/dev/3b2cf48e-2293-4226-b6cd-5f4dd3969f99@lysator.liu.se/
>>
>> v2:
>> * Remove wrapping check for stats
>> ---
>>  drivers/net/af_packet/rte_eth_af_packet.c | 66 ++++++++++++++---------
>>  1 file changed, 41 insertions(+), 25 deletions(-)
>>
>> diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
>> index 397a32db5886..10c8e1e50139 100644
>> --- a/drivers/net/af_packet/rte_eth_af_packet.c
>> +++ b/drivers/net/af_packet/rte_eth_af_packet.c
>> @@ -51,8 +51,10 @@ struct pkt_rx_queue {
>>      uint16_t in_port;
>>      uint8_t vlan_strip;
>>
>> -    volatile unsigned long rx_pkts;
>> -    volatile unsigned long rx_bytes;
>> +    uint64_t rx_pkts;
>> +    uint64_t rx_bytes;
>> +    uint64_t rx_pkts_offset;
>> +    uint64_t rx_bytes_offset;
>
> I suggest you introduce a separate struct for resettable counters. It'll
> make things cleaner, and you can sneak in atomics without too much
> atomics-related bloat.
>
> struct counter
> {
>     uint64_t count;
>     uint64_t offset;
> };
>
> /../
>     struct counter rx_pkts;
>     struct counter rx_bytes;
> /../
>
> static uint64_t
> counter_value(struct counter *counter)
> {
>     uint64_t count = __atomic_load_n(&counter->count, __ATOMIC_RELAXED);
>     uint64_t offset = __atomic_load_n(&counter->offset, __ATOMIC_RELAXED);
>

Since the count and the offset are written independently, without any
ordering restrictions, an update and a reset in quick succession may
cause the offset store to become globally visible before the new count.
In such a scenario, a reader could see an offset > count. Thus, unless
I'm missing something, one should add

    if (unlikely(offset > count))
        return 0;

here, with an appropriate comment explaining why this can happen.

Another approach would be to think about what memory barriers would be
required to make sure one sees the count update before the offset
update, but, intuitively, that seems both more complex and more costly
(performance-wise).

>     return count - offset;
> }
>
> static void
> counter_reset(struct counter *counter)
> {
>     uint64_t count = __atomic_load_n(&counter->count, __ATOMIC_RELAXED);
>
>     __atomic_store_n(&counter->offset, count, __ATOMIC_RELAXED);
> }
>
> static void
> counter_add(struct counter *counter, uint64_t operand)
> {
>     /* plain read of count is fine; the datapath thread is the only
>      * writer, but the store must be atomic */
>     __atomic_store_n(&counter->count, counter->count + operand,
>         __ATOMIC_RELAXED);
> }
>
> You'd have to port this to <rte_stdatomic.h> calls, which prevents
> non-atomic loads from RTE_ATOMIC()s. The plain (non-atomic) reads above
> must then be replaced with explicit relaxed loads. Otherwise, if you
> just use "counter->count", that would be an atomic load with sequential
> consistency memory order on C11 atomics-based builds, which would
> result in a barrier, at least on weakly ordered machines (e.g., ARM).
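For reference, a sketch of what the helpers might look like ported to
<rte_stdatomic.h>, with the offset > count guard discussed above folded
in. This assumes DPDK's stdatomic wrappers (RTE_ATOMIC(),
rte_atomic_load_explicit(), rte_atomic_store_explicit()) and is
untested:

#include <rte_branch_prediction.h>
#include <rte_stdatomic.h>

struct counter {
	RTE_ATOMIC(uint64_t) count;
	RTE_ATOMIC(uint64_t) offset;
};

static uint64_t
counter_value(struct counter *counter)
{
	uint64_t count = rte_atomic_load_explicit(&counter->count,
	    rte_memory_order_relaxed);
	uint64_t offset = rte_atomic_load_explicit(&counter->offset,
	    rte_memory_order_relaxed);

	/* A reset racing with an update may make the offset store
	 * visible before the corresponding count store. */
	if (unlikely(offset > count))
		return 0;

	return count - offset;
}

static void
counter_reset(struct counter *counter)
{
	uint64_t count = rte_atomic_load_explicit(&counter->count,
	    rte_memory_order_relaxed);

	rte_atomic_store_explicit(&counter->offset, count,
	    rte_memory_order_relaxed);
}

static void
counter_add(struct counter *counter, uint64_t operand)
{
	/* The datapath thread is the only writer, so splitting the
	 * read-modify-write into a relaxed load and a relaxed store
	 * is sufficient; only the store needs to be atomic. */
	uint64_t count = rte_atomic_load_explicit(&counter->count,
	    rte_memory_order_relaxed);

	rte_atomic_store_explicit(&counter->count, count + operand,
	    rte_memory_order_relaxed);
}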
>
> I would still use a struct and some helper functions even for the less
> ambitious, non-atomic variant.
>
> The only drawback of using GCC built-in atomics here, versus an
> atomics- and volatile-free approach, is that current compilers seem to
> refuse to merge atomic stores. It's beyond me why this is the case. If
> you store to a variable twice in quick succession, it'll be two store
> machine instructions, even in cases where the compiler *knows* the
> value is identical. So you get volatile semantics, even though you
> didn't ask for them. Weird.
>
> So if you have a loop, you may want to accumulate into a temporary and
> make a single counter_add() call at the end, to get the final 0.001%
> of performance.
>
> If the tech board thinks MT-safe, resettable, software-managed
> statistics are the future (as opposed to dropping reset support, for
> example), I think this stuff should go into a separate header file, so
> other PMDs can reuse it. Maybe that's out of scope for this patch.
>
>>  };
>>
>>  struct pkt_tx_queue {
>> @@ -64,9 +66,12 @@ struct pkt_tx_queue {
>>      unsigned int framecount;
>>      unsigned int framenum;
>>
>> -    volatile unsigned long tx_pkts;
>> -    volatile unsigned long err_pkts;
>> -    volatile unsigned long tx_bytes;
>> +    uint64_t tx_pkts;
>> +    uint64_t err_pkts;
>> +    uint64_t tx_bytes;
>> +    uint64_t tx_pkts_offset;
>> +    uint64_t err_pkts_offset;
>> +    uint64_t tx_bytes_offset;
>>  };
>>
>>  struct pmd_internals {
>> @@ -385,8 +390,15 @@ eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
>>      return 0;
>>  }
>>
>> +
>> +static uint64_t
>> +stats_get_diff(uint64_t stats, uint64_t offset)
>> +{
>> +    return stats - offset;
>> +}
>> +
>>  static int
>> -eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *igb_stats)
>> +eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
>>  {
>>      unsigned i, imax;
>>      unsigned long rx_total = 0, tx_total = 0, tx_err_total = 0;
>> @@ -396,27 +408,29 @@ eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *igb_stats)
>>      imax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?
>>              internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);
>>      for (i = 0; i < imax; i++) {
>> -        igb_stats->q_ipackets[i] = internal->rx_queue[i].rx_pkts;
>> -        igb_stats->q_ibytes[i] = internal->rx_queue[i].rx_bytes;
>> -        rx_total += igb_stats->q_ipackets[i];
>> -        rx_bytes_total += igb_stats->q_ibytes[i];
>> +        struct pkt_rx_queue *rxq = &internal->rx_queue[i];
>> +        stats->q_ipackets[i] = stats_get_diff(rxq->rx_pkts, rxq->rx_pkts_offset);
>> +        stats->q_ibytes[i] = stats_get_diff(rxq->rx_bytes, rxq->rx_bytes_offset);
>> +        rx_total += stats->q_ipackets[i];
>> +        rx_bytes_total += stats->q_ibytes[i];
>>      }
>>
>>      imax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?
>>              internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);
>>      for (i = 0; i < imax; i++) {
>> -        igb_stats->q_opackets[i] = internal->tx_queue[i].tx_pkts;
>> -        igb_stats->q_obytes[i] = internal->tx_queue[i].tx_bytes;
>> -        tx_total += igb_stats->q_opackets[i];
>> -        tx_err_total += internal->tx_queue[i].err_pkts;
>> -        tx_bytes_total += igb_stats->q_obytes[i];
>> +        struct pkt_tx_queue *txq = &internal->tx_queue[i];
>> +        stats->q_opackets[i] = stats_get_diff(txq->tx_pkts, txq->tx_pkts_offset);
>> +        stats->q_obytes[i] = stats_get_diff(txq->tx_bytes, txq->tx_bytes_offset);
>> +        tx_total += stats->q_opackets[i];
>> +        tx_err_total += stats_get_diff(txq->err_pkts, txq->err_pkts_offset);
>> +        tx_bytes_total += stats->q_obytes[i];
>>      }
>>
>> -    igb_stats->ipackets = rx_total;
>> -    igb_stats->ibytes = rx_bytes_total;
>> -    igb_stats->opackets = tx_total;
>> -    igb_stats->oerrors = tx_err_total;
>> -    igb_stats->obytes = tx_bytes_total;
>> +    stats->ipackets = rx_total;
>> +    stats->ibytes = rx_bytes_total;
>> +    stats->opackets = tx_total;
>> +    stats->oerrors = tx_err_total;
>> +    stats->obytes = tx_bytes_total;
>>      return 0;
>>  }
>>
>> @@ -427,14 +441,16 @@ eth_stats_reset(struct rte_eth_dev *dev)
>>      struct pmd_internals *internal = dev->data->dev_private;
>>
>>      for (i = 0; i < internal->nb_queues; i++) {
>> -        internal->rx_queue[i].rx_pkts = 0;
>> -        internal->rx_queue[i].rx_bytes = 0;
>> +        struct pkt_rx_queue *rxq = &internal->rx_queue[i];
>> +        rxq->rx_pkts_offset = rxq->rx_pkts;
>> +        rxq->rx_bytes_offset = rxq->rx_bytes;
>>      }
>>
>>      for (i = 0; i < internal->nb_queues; i++) {
>> -        internal->tx_queue[i].tx_pkts = 0;
>> -        internal->tx_queue[i].err_pkts = 0;
>> -        internal->tx_queue[i].tx_bytes = 0;
>> +        struct pkt_tx_queue *txq = &internal->tx_queue[i];
>> +        txq->tx_pkts_offset = txq->tx_pkts;
>> +        txq->err_pkts_offset = txq->err_pkts;
>> +        txq->tx_bytes_offset = txq->tx_bytes;
>>      }
>>
>>      return 0;
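
If the struct counter suggestion were adopted, the reset path could
collapse to something like the following (hypothetical: assumes the
queue structs hold struct counter fields instead of raw uint64_t pairs,
and the counter_reset() helper sketched earlier):

static int
eth_stats_reset(struct rte_eth_dev *dev)
{
	unsigned int i;
	struct pmd_internals *internal = dev->data->dev_private;

	for (i = 0; i < internal->nb_queues; i++) {
		struct pkt_rx_queue *rxq = &internal->rx_queue[i];

		counter_reset(&rxq->rx_pkts);
		counter_reset(&rxq->rx_bytes);
	}

	for (i = 0; i < internal->nb_queues; i++) {
		struct pkt_tx_queue *txq = &internal->tx_queue[i];

		counter_reset(&txq->tx_pkts);
		counter_reset(&txq->err_pkts);
		counter_reset(&txq->tx_bytes);
	}

	return 0;
}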