From: Ferruh Yigit <ferruh.yigit@amd.com>
To: "Mattias Rönnblom" <hofors@lysator.liu.se>,
"John W. Linville" <linville@tuxdriver.com>
Cc: "Thomas Monjalon" <thomas@monjalon.net>,
dev@dpdk.org, "Mattias Rönnblom" <mattias.ronnblom@ericsson.com>,
"Stephen Hemminger" <stephen@networkplumber.org>,
"Morten Brørup" <mb@smartsharesystems.com>
Subject: Re: [RFC v2] net/af_packet: make stats reset reliable
Date: Tue, 7 May 2024 14:49:19 +0100 [thread overview]
Message-ID: <c71d1994-0b1f-4923-a672-2b964e79d7f3@amd.com> (raw)
In-Reply-To: <238675d1-b0bb-41df-8338-a1052c1a88c1@lysator.liu.se>
On 5/7/2024 8:23 AM, Mattias Rönnblom wrote:
> On 2024-04-28 17:11, Mattias Rönnblom wrote:
>> On 2024-04-26 16:38, Ferruh Yigit wrote:
>>> For stats reset, use an offset instead of zeroing out actual stats
>>> values,
>>> get_stats() displays diff between stats and offset.
>>> This way stats only updated in datapath and offset only updated in stats
>>> reset function. This makes stats reset function more reliable.
>>>
>>> As stats only written by single thread, we can remove 'volatile'
>>> qualifier
>>> which should improve the performance in datapath.
>>>
>>
>> volatile wouldn't help you if you had multiple writers, so that can't
>> be the reason for its removal. It would be more accurate to say it
>> should be replaced with atomic updates. If you don't use volatile and
>> don't use atomics, you have to consider if the compiler can reach the
>> conclusion that it does not need to store the counter value for future
>> use *for that thread*. Since otherwise, I don't think the store
>> actually needs to occur. Since DPDK statistics tend to work, it's
>> pretty obvious that current compilers tend not to reach this conclusion.
>>
>> If this should be done 100% properly, the update operation should be a
>> non-atomic load, non-atomic add, and an atomic store. Similarly, for
>> the reset, the offset store should be atomic.
>>
>> Considered the state of the rest of the DPDK code base, I think a
>> non-atomic, non-volatile solution is also fine.
>>
>> (That said, I think we're better off just deprecating stats reset
>> altogether, and returning -ENOTSUP here meanwhile.)
>>
>>> Signed-off-by: Ferruh Yigit <ferruh.yigit@amd.com>
>>> ---
>>> Cc: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>>> Cc: Stephen Hemminger <stephen@networkplumber.org>
>>> Cc: Morten Brørup <mb@smartsharesystems.com>
>>>
>>> This update triggered by mail list discussion [1].
>>>
>>> [1]
>>> https://inbox.dpdk.org/dev/3b2cf48e-2293-4226-b6cd-5f4dd3969f99@lysator.liu.se/
>>>
>>> v2:
>>> * Remove wrapping check for stats
>>> ---
>>> drivers/net/af_packet/rte_eth_af_packet.c | 66 ++++++++++++++---------
>>> 1 file changed, 41 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/drivers/net/af_packet/rte_eth_af_packet.c
>>> b/drivers/net/af_packet/rte_eth_af_packet.c
>>> index 397a32db5886..10c8e1e50139 100644
>>> --- a/drivers/net/af_packet/rte_eth_af_packet.c
>>> +++ b/drivers/net/af_packet/rte_eth_af_packet.c
>>> @@ -51,8 +51,10 @@ struct pkt_rx_queue {
>>> uint16_t in_port;
>>> uint8_t vlan_strip;
>>> - volatile unsigned long rx_pkts;
>>> - volatile unsigned long rx_bytes;
>>> + uint64_t rx_pkts;
>>> + uint64_t rx_bytes;
>>> + uint64_t rx_pkts_offset;
>>> + uint64_t rx_bytes_offset;
>>
>> I suggest you introduce a separate struct for reset-able counters.
>> It'll make things cleaner, and you can sneak in atomics without too
>> much atomics-related bloat.
>>
>> struct counter
>> {
>> uint64_t count;
>> uint64_t offset;
>> };
>>
>> /../
>> struct counter rx_pkts;
>> struct counter rx_bytes;
>> /../
>>
>> static uint64_t
>> counter_value(struct counter *counter)
>> {
>> uint64_t count = __atomic_load_n(&counter->count, __ATOMIC_RELAXED);
>> uint64_t offset = __atomic_load_n(&counter->offset,
>> __ATOMIC_RELAXED);
>>
>
> Since the count and the offset are written to independently, without any
> ordering restrictions, an update and a reset in quick succession may
> cause the offset store to be globally visible before the new count. In
> such a scenario, a reader could see an offset > count.
>
> Thus, unless I'm missing something, one should add a
>
> if (unlikely(offset > count))
> return 0;
>
> here. With the appropriate comment explaining why this might be.
>
> Another approach would be to think about what memory barriers may be
> required to make sure one sees the count update before the offset
> update, but, intuitively, that seems like both more complex and more
> costly (performance-wise).
>
We are going with lazy alternative and requesting to stop forwarding
before stats reset, this should prevent 'count' and 'offset' being
updated simultaneously.
>> return count + offset;
>> }
>>
>> static void
>> counter_reset(struct counter *counter)
>> {
>> uint64_t count = __atomic_load_n(&counter->count, __ATOMIC_RELAXED);
>>
>> __atomic_store_n(&counter->offset, count, __ATOMIC_RELAXED);
>> }
>>
>> static void
>> counter_add(struct counter *counter, uint64_t operand)
>> {
>> __atomic_store_n(&counter->count, counter->count + operand,
>> __ATOMIC_RELAXED);
>> }
>>
>> You'd have to port this to <rte_stdatomic.h> calls, which prevents
>> non-atomic loads from RTE_ATOMIC()s. The non-atomic reads above must
>> be replaced with explicit relaxed non-atomic load. Otherwise, if you
>> just use "counter->count", that would be an atomic load with
>> sequential consistency memory order on C11 atomics-based builds, which
>> would result in a barrier, at least on weakly ordered machines (e.g.,
>> ARM).
>>
>> I would still use a struct and some helper-functions even for the less
>> ambitious, non-atomic variant.
>>
>> The only drawback of using GCC built-ins type atomics here, versus an
>> atomic- and volatile-free approach, is that current compilers seems to
>> refuse merging atomic stores. It's beyond me why this is the case. If
>> you store to a variable twice in quick succession, it'll be two store
>> machine instructions, even in cases where the compiler *knows* the
>> value is identical. So volatile, even though you didn't ask for it.
>> Weird.
>>
>> So if you have a loop, you may want to make an "counter_add()" in the
>> end from a temporary, to get the final 0.001% of performance.
>>
>> If the tech board thinks MT-safe reset-able software-manage statistics
>> is the future (as opposed to dropping reset support, for example), I
>> think this stuff should go into a separate header file, so other PMDs
>> can reuse it. Maybe out of scope for this patch.
>>
>>> };
>>> struct pkt_tx_queue {
>>> @@ -64,9 +66,12 @@ struct pkt_tx_queue {
>>> unsigned int framecount;
>>> unsigned int framenum;
>>> - volatile unsigned long tx_pkts;
>>> - volatile unsigned long err_pkts;
>>> - volatile unsigned long tx_bytes;
>>> + uint64_t tx_pkts;
>>> + uint64_t err_pkts;
>>> + uint64_t tx_bytes;
>>> + uint64_t tx_pkts_offset;
>>> + uint64_t err_pkts_offset;
>>> + uint64_t tx_bytes_offset;
>>> };
>>> struct pmd_internals {
>>> @@ -385,8 +390,15 @@ eth_dev_info(struct rte_eth_dev *dev, struct
>>> rte_eth_dev_info *dev_info)
>>> return 0;
>>> }
>>> +
>>> +static uint64_t
>>> +stats_get_diff(uint64_t stats, uint64_t offset)
>>> +{
>>> + return stats - offset;
>>> +}
>>> +
>>> static int
>>> -eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *igb_stats)
>>> +eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
>>> {
>>> unsigned i, imax;
>>> unsigned long rx_total = 0, tx_total = 0, tx_err_total = 0;
>>> @@ -396,27 +408,29 @@ eth_stats_get(struct rte_eth_dev *dev, struct
>>> rte_eth_stats *igb_stats)
>>> imax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?
>>> internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);
>>> for (i = 0; i < imax; i++) {
>>> - igb_stats->q_ipackets[i] = internal->rx_queue[i].rx_pkts;
>>> - igb_stats->q_ibytes[i] = internal->rx_queue[i].rx_bytes;
>>> - rx_total += igb_stats->q_ipackets[i];
>>> - rx_bytes_total += igb_stats->q_ibytes[i];
>>> + struct pkt_rx_queue *rxq = &internal->rx_queue[i];
>>> + stats->q_ipackets[i] = stats_get_diff(rxq->rx_pkts,
>>> rxq->rx_pkts_offset);
>>> + stats->q_ibytes[i] = stats_get_diff(rxq->rx_bytes,
>>> rxq->rx_bytes_offset);
>>> + rx_total += stats->q_ipackets[i];
>>> + rx_bytes_total += stats->q_ibytes[i];
>>> }
>>> imax = (internal->nb_queues < RTE_ETHDEV_QUEUE_STAT_CNTRS ?
>>> internal->nb_queues : RTE_ETHDEV_QUEUE_STAT_CNTRS);
>>> for (i = 0; i < imax; i++) {
>>> - igb_stats->q_opackets[i] = internal->tx_queue[i].tx_pkts;
>>> - igb_stats->q_obytes[i] = internal->tx_queue[i].tx_bytes;
>>> - tx_total += igb_stats->q_opackets[i];
>>> - tx_err_total += internal->tx_queue[i].err_pkts;
>>> - tx_bytes_total += igb_stats->q_obytes[i];
>>> + struct pkt_tx_queue *txq = &internal->tx_queue[i];
>>> + stats->q_opackets[i] = stats_get_diff(txq->tx_pkts,
>>> txq->tx_pkts_offset);
>>> + stats->q_obytes[i] = stats_get_diff(txq->tx_bytes,
>>> txq->tx_bytes_offset);
>>> + tx_total += stats->q_opackets[i];
>>> + tx_err_total += stats_get_diff(txq->err_pkts,
>>> txq->err_pkts_offset);
>>> + tx_bytes_total += stats->q_obytes[i];
>>> }
>>> - igb_stats->ipackets = rx_total;
>>> - igb_stats->ibytes = rx_bytes_total;
>>> - igb_stats->opackets = tx_total;
>>> - igb_stats->oerrors = tx_err_total;
>>> - igb_stats->obytes = tx_bytes_total;
>>> + stats->ipackets = rx_total;
>>> + stats->ibytes = rx_bytes_total;
>>> + stats->opackets = tx_total;
>>> + stats->oerrors = tx_err_total;
>>> + stats->obytes = tx_bytes_total;
>>> return 0;
>>> }
>>> @@ -427,14 +441,16 @@ eth_stats_reset(struct rte_eth_dev *dev)
>>> struct pmd_internals *internal = dev->data->dev_private;
>>> for (i = 0; i < internal->nb_queues; i++) {
>>> - internal->rx_queue[i].rx_pkts = 0;
>>> - internal->rx_queue[i].rx_bytes = 0;
>>> + struct pkt_rx_queue *rxq = &internal->rx_queue[i];
>>> + rxq->rx_pkts_offset = rxq->rx_pkts;
>>> + rxq->rx_bytes_offset = rxq->rx_bytes;
>>> }
>>> for (i = 0; i < internal->nb_queues; i++) {
>>> - internal->tx_queue[i].tx_pkts = 0;
>>> - internal->tx_queue[i].err_pkts = 0;
>>> - internal->tx_queue[i].tx_bytes = 0;
>>> + struct pkt_tx_queue *txq = &internal->tx_queue[i];
>>> + txq->tx_pkts_offset = txq->tx_pkts;
>>> + txq->err_pkts_offset = txq->err_pkts;
>>> + txq->tx_bytes_offset = txq->tx_bytes;
>>> }
>>> return 0;
next prev parent reply other threads:[~2024-05-07 13:49 UTC|newest]
Thread overview: 134+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-25 17:46 [RFC] " Ferruh Yigit
2024-04-26 11:33 ` Morten Brørup
2024-04-26 13:37 ` Ferruh Yigit
2024-04-26 14:56 ` Morten Brørup
2024-04-28 15:42 ` Mattias Rönnblom
2024-04-26 14:38 ` [RFC v2] " Ferruh Yigit
2024-04-26 14:47 ` Morten Brørup
2024-04-28 15:11 ` Mattias Rönnblom
2024-05-01 16:19 ` Ferruh Yigit
2024-05-02 5:51 ` Mattias Rönnblom
2024-05-02 14:22 ` Ferruh Yigit
2024-05-02 15:59 ` Stephen Hemminger
2024-05-02 18:20 ` Ferruh Yigit
2024-05-02 17:37 ` Mattias Rönnblom
2024-05-02 18:26 ` Stephen Hemminger
2024-05-02 21:26 ` Mattias Rönnblom
2024-05-02 21:46 ` Stephen Hemminger
2024-05-07 7:23 ` Mattias Rönnblom
2024-05-07 13:49 ` Ferruh Yigit [this message]
2024-05-07 14:51 ` Stephen Hemminger
2024-05-07 16:00 ` Morten Brørup
2024-05-07 16:54 ` Ferruh Yigit
2024-05-07 18:47 ` Stephen Hemminger
2024-05-08 7:48 ` Mattias Rönnblom
2024-05-08 6:28 ` Mattias Rönnblom
2024-05-08 6:25 ` Mattias Rönnblom
2024-05-07 19:19 ` Morten Brørup
2024-05-08 6:34 ` Mattias Rönnblom
2024-05-08 7:10 ` Morten Brørup
2024-05-08 7:23 ` Mattias Rönnblom
2024-04-26 21:28 ` [RFC] " Patrick Robb
2024-05-03 15:45 ` [RFC v3] " Ferruh Yigit
2024-05-03 22:00 ` Stephen Hemminger
2024-05-07 13:48 ` Ferruh Yigit
2024-05-07 14:52 ` Stephen Hemminger
2024-05-07 17:27 ` Ferruh Yigit
2024-05-08 7:19 ` Mattias Rönnblom
2024-05-08 15:23 ` Stephen Hemminger
2024-05-08 19:48 ` Ferruh Yigit
2024-05-08 20:54 ` Stephen Hemminger
2024-05-09 7:43 ` Morten Brørup
2024-05-09 9:29 ` Bruce Richardson
2024-05-09 11:37 ` Morten Brørup
2024-05-09 14:19 ` Morten Brørup
2024-05-10 4:56 ` Stephen Hemminger
2024-05-10 9:14 ` Morten Brørup
2024-05-07 15:27 ` Morten Brørup
2024-05-07 17:40 ` Ferruh Yigit
2024-05-10 5:01 ` [RFC 0/3] generic sw counters Stephen Hemminger
2024-05-10 5:01 ` [RFC 1/3] ethdev: add internal helper of SW driver statistics Stephen Hemminger
2024-05-10 5:01 ` [RFC 2/3] net/af_packet: use SW stats helper Stephen Hemminger
2024-05-10 5:01 ` [RFC 3/3] net/tap: use generic SW stats Stephen Hemminger
2024-05-10 17:29 ` [RFC 0/3] generic sw counters Morten Brørup
2024-05-10 19:30 ` Stephen Hemminger
2024-05-13 18:52 ` [RFC v2 0/7] generic SW counters Stephen Hemminger
2024-05-13 18:52 ` [RFC v2 1/7] eal: generic 64 bit counter Stephen Hemminger
2024-05-13 19:36 ` Morten Brørup
2024-05-13 18:52 ` [RFC v2 2/7] ethdev: add internal helper of SW driver statistics Stephen Hemminger
2024-05-13 18:52 ` [RFC v2 3/7] net/af_packet: use SW stats helper Stephen Hemminger
2024-05-13 18:52 ` [RFC v2 4/7] net/tap: use generic SW stats Stephen Hemminger
2024-05-13 18:52 ` [RFC v2 5/7] net/pcap: " Stephen Hemminger
2024-05-13 18:52 ` [RFC v2 6/7] net/af_xdp: " Stephen Hemminger
2024-05-13 18:52 ` [RFC v2 7/7] net/ring: " Stephen Hemminger
2024-05-14 15:35 ` [PATCH v3 0/7] Generic SW counters Stephen Hemminger
2024-05-14 15:35 ` [PATCH v3 1/7] eal: generic 64 bit counter Stephen Hemminger
2024-05-15 9:30 ` Morten Brørup
2024-05-15 15:03 ` Stephen Hemminger
2024-05-15 16:18 ` Morten Brørup
2024-05-14 15:35 ` [PATCH v3 2/7] ethdev: add internal helper of SW driver statistics Stephen Hemminger
2024-05-14 15:35 ` [PATCH v3 3/7] net/af_packet: use SW stats helper Stephen Hemminger
2024-05-14 15:35 ` [PATCH v3 4/7] net/af_xdp: use generic SW stats Stephen Hemminger
2024-05-14 15:35 ` [PATCH v3 5/7] net/pcap: " Stephen Hemminger
2024-05-14 15:35 ` [PATCH v3 6/7] net/ring: " Stephen Hemminger
2024-05-14 15:35 ` [PATCH v3 7/7] net/tap: " Stephen Hemminger
2024-05-15 23:40 ` [PATCH v4 0/8] Generic 64 bit counters for SW PMD's Stephen Hemminger
2024-05-15 23:40 ` [PATCH v4 1/8] eal: generic 64 bit counter Stephen Hemminger
2024-05-15 23:40 ` [PATCH v4 2/8] ethdev: add common counters for statistics Stephen Hemminger
2024-05-15 23:40 ` [PATCH v4 3/8] net/af_packet: use generic SW stats Stephen Hemminger
2024-05-15 23:40 ` [PATCH v4 4/8] net/af_xdp: " Stephen Hemminger
2024-05-15 23:40 ` [PATCH v4 5/8] net/pcap: " Stephen Hemminger
2024-05-15 23:40 ` [PATCH v4 6/8] net/ring: " Stephen Hemminger
2024-05-15 23:40 ` [PATCH v4 7/8] net/tap: " Stephen Hemminger
2024-05-15 23:41 ` [PATCH v4 8/8] net/null: " Stephen Hemminger
2024-05-16 15:40 ` [PATCH v5 0/9] Generic 64 bit counters Stephen Hemminger
2024-05-16 15:40 ` [PATCH v5 1/9] eal: generic 64 bit counter Stephen Hemminger
2024-05-16 18:22 ` Wathsala Wathawana Vithanage
2024-05-16 21:42 ` Stephen Hemminger
2024-05-17 2:39 ` Honnappa Nagarahalli
2024-05-17 3:29 ` Stephen Hemminger
2024-05-17 4:39 ` Honnappa Nagarahalli
2024-05-16 15:40 ` [PATCH v5 2/9] ethdev: add common counters for statistics Stephen Hemminger
2024-05-16 18:30 ` Wathsala Wathawana Vithanage
2024-05-17 0:19 ` Stephen Hemminger
2024-05-16 15:40 ` [PATCH v5 3/9] net/af_packet: use generic SW stats Stephen Hemminger
2024-05-16 15:40 ` [PATCH v5 4/9] net/af_xdp: " Stephen Hemminger
2024-05-16 15:40 ` [PATCH v5 5/9] net/pcap: " Stephen Hemminger
2024-05-16 15:40 ` [PATCH v5 6/9] test/pmd_ring: initialize mbufs Stephen Hemminger
2024-05-16 15:40 ` [PATCH v5 7/9] net/ring: use generic SW stats Stephen Hemminger
2024-05-16 15:40 ` [PATCH v5 8/9] net/tap: " Stephen Hemminger
2024-05-16 15:40 ` [PATCH v5 9/9] net/null: " Stephen Hemminger
2024-05-17 0:12 ` [PATCH v6 0/9] Generic 64 bit counters for SW drivers Stephen Hemminger
2024-05-17 0:12 ` [PATCH v6 1/9] eal: generic 64 bit counter Stephen Hemminger
2024-05-17 2:45 ` Honnappa Nagarahalli
2024-05-17 3:30 ` Stephen Hemminger
2024-05-17 4:26 ` Honnappa Nagarahalli
2024-05-17 6:44 ` Morten Brørup
2024-05-17 15:05 ` Stephen Hemminger
2024-05-17 16:18 ` Stephen Hemminger
2024-05-18 14:00 ` Morten Brørup
2024-05-19 15:13 ` Stephen Hemminger
2024-05-19 17:10 ` Morten Brørup
2024-05-19 22:49 ` Stephen Hemminger
2024-05-20 7:57 ` Morten Brørup
2024-05-17 15:07 ` Stephen Hemminger
2024-05-17 0:12 ` [PATCH v6 2/9] ethdev: add common counters for statistics Stephen Hemminger
2024-05-17 0:12 ` [PATCH v6 3/9] net/af_packet: use generic SW stats Stephen Hemminger
2024-05-17 0:12 ` [PATCH v6 4/9] net/af_xdp: " Stephen Hemminger
2024-05-17 13:34 ` Loftus, Ciara
2024-05-17 14:54 ` Stephen Hemminger
2024-05-17 0:12 ` [PATCH v6 5/9] net/pcap: " Stephen Hemminger
2024-05-17 0:12 ` [PATCH v6 6/9] test/pmd_ring: initialize mbufs Stephen Hemminger
2024-05-17 0:12 ` [PATCH v6 7/9] net/ring: use generic SW stats Stephen Hemminger
2024-05-17 0:12 ` [PATCH v6 8/9] net/tap: " Stephen Hemminger
2024-05-17 0:12 ` [PATCH v6 9/9] net/null: " Stephen Hemminger
2024-05-17 17:35 ` [PATCH v7 0/9] Use weak atomic operations for SW PMD counters Stephen Hemminger
2024-05-17 17:35 ` [PATCH v7 1/9] eal: generic 64 bit counter Stephen Hemminger
2024-05-17 17:35 ` [PATCH v7 2/9] ethdev: add common counters for statistics Stephen Hemminger
2024-05-17 17:35 ` [PATCH v7 3/9] net/af_packet: use generic SW stats Stephen Hemminger
2024-05-17 17:35 ` [PATCH v7 4/9] net/af_xdp: " Stephen Hemminger
2024-05-17 17:35 ` [PATCH v7 5/9] net/pcap: " Stephen Hemminger
2024-05-17 17:35 ` [PATCH v7 6/9] test/pmd_ring: initialize mbufs Stephen Hemminger
2024-05-17 17:35 ` [PATCH v7 7/9] net/ring: use generic SW stats Stephen Hemminger
2024-05-17 17:35 ` [PATCH v7 8/9] net/tap: " Stephen Hemminger
2024-05-17 17:35 ` [PATCH v7 9/9] net/null: " Stephen Hemminger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c71d1994-0b1f-4923-a672-2b964e79d7f3@amd.com \
--to=ferruh.yigit@amd.com \
--cc=dev@dpdk.org \
--cc=hofors@lysator.liu.se \
--cc=linville@tuxdriver.com \
--cc=mattias.ronnblom@ericsson.com \
--cc=mb@smartsharesystems.com \
--cc=stephen@networkplumber.org \
--cc=thomas@monjalon.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).