From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: "Di, ChenxuX" <chenxux.di@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: "Yang, Qiming" <qiming.yang@intel.com>
Subject: Re: [dpdk-dev] [PATCH v6 3/4] net/ixgbe: cleanup Tx buffers
Date: Mon, 6 Jan 2020 13:26:29 +0000 [thread overview]
Message-ID: <SN6PR11MB25585FDADFE1F159829B305D9A3C0@SN6PR11MB2558.namprd11.prod.outlook.com> (raw)
In-Reply-To: <3B926E44943CB04AA3A39AC16328CE39B93D08@SHSMSX101.ccr.corp.intel.com>
> > > > > + * tx_tail is the last sent packet on the sw_ring. Goto the end
> > > > > + * of that packet (the last segment in the packet chain) and
> > > > > + * then the next segment will be the start of the oldest segment
> > > > > + * in the sw_ring.
> > > >
> > > > Not sure I understand the sentence above.
> > > > tx_tail is the value of TDT HW register (most recently armed by SW TD).
> > > > last_id is the index of last descriptor for multi-seg packet.
> > > > next_id is just the index of next descriptor in HW TD ring.
> > > > How do you conclude that it will be the ' oldest segment in the sw_ring'?
> > > >
> > >
> > > The tx_tail is the last sent packet on the sw_ring. While the
> > > xmit_cleanup or Tx_free_bufs will be call when the nb_tx_free <
> > tx_free_thresh .
> > > So the sw_ring[tx_tail].next_id must be the begin of mbufs which are
> > > not used or Already freed . then begin the loop until the mbuf is used and
> > begin to free them.
> > >
> > >
> > >
> > > > Another question why do you need to write your own functions?
> > > > Why can't you reuse existing ixgbe_xmit_cleanup() for full(offload)
> > > > path and
> > > > ixgbe_tx_free_bufs() for simple path?
> > > > Yes, ixgbe_xmit_cleanup() doesn't free mbufs, but at least it could
> > > > be used to determine finished TX descriptors.
> > > > Based on that you can you can free appropriate sw_ring[] entries.
> > > >
> > >
> > > The reason why I don't reuse existing function is that they all free
> > > several mbufs While the free_cnt of the API rte_eth_tx_done_cleanup() is the
> > number of packets.
> > > It also need to be done that check which mbuffs are from the same packet.
> >
> > At first, I don't see anything bad if tx_done_cleanup() will free only some
> > segments from the packet. As long as it is safe - there is no problem with that.
> > I think rte_eth_tx_done_cleanup() operates on mbuf, not packet quantities.
> > But in our case I think it doesn't matter, as ixgbe_xmit_cleanup() mark TXDs as
> > free only when HW is done with all TXDs for that packet.
> > As long as there is a way to reuse existing code and avoid duplication (without
> > introducing any degradation) - we should use it.
> > And I think there is a very good opportunity here to reuse existing
> > ixgbe_xmit_cleanup() for tx_done_cleanup() implementation.
> > Moreover because your code doesn’t follow
> > ixgbe_xmit_pkts()/ixgbe_xmit_cleanup()
> > logic and infrastructure, it introduces unnecessary scans over TXD ring, and in
> > some cases doesn't work as expected:
> >
> > +while (1) {
> > +tx_last = sw_ring[tx_id].last_id;
> > +
> > +if (sw_ring[tx_last].mbuf) {
> > +if (txr[tx_last].wb.status &
> > +IXGBE_TXD_STAT_DD) {
> > ...
> > +} else {
> > +/*
> > + * mbuf still in use, nothing left to
> > + * free.
> > + */
> > +break;
> >
> > It is not correct to expect that IXGBE_TXD_STAT_DD will be set on last TXD for
> > *every* packet.
> > We set IXGBE_TXD_CMD_RS bit only on threshold packet last descriptor.
> > Plus ixgbe_xmit_cleanup() can cleanup TXD wb.status.
> >
> > So I strongly recommend to reuse ixgbe_xmit_cleanup() here.
> > It would be much less error prone and will help to avoid code duplication.
> >
> > Konstantin
> >
>
> At first.
> The function ixgbe_xmit_cleanup(struct ixgbe_tx_queue *txq) will cleanup TXD wb.status.
> the number of status cleanuped is always txq->tx_rs_thresh.
Yes, and what's wrong with it?
>
> The API rte_eth_tx_done_cleanup() in rte_eth_dev.h show that
> @param free_cnt
> * Maximum number of packets to free. Use 0 to indicate all possible packets
> * should be freed. Note that a packet may be using multiple mbufs.
I don't think it is a good approach, better would be to report number of freed mbufs,
but ok, as it is a public API, we probably need to keep it as it is.
> a number must be set while ixgbe_xmit_cleanup and ixgbe_tx_free_bufs only have one parameter txq.
Yes, ixgbe_xmit_cleanup() cleans up at least txq->tx_rs_thresh TXDs.
So if user requested more packets to be freed we can call ixgbe_xmit_cleanup()
in a loop.
> And what should do is not only free buffers and status but also check which bufs are from
> One packet and count the packet freed.
ixgbe_xmit_cleanup() itself doesn't free mbufs itself.
It only cleans up TXDs.
So in tx_done_cleanup() after calling ixgbe_xmit_cleanup()
you'll still need to go through sw_ring[]
entries that correspond to free TXDs and call mbuf_seg_free().
You can count number of full packets here.
> So I think it can't be implemented that reuse function xmit_cleanup without change it.
> And create a new function with the code of xmit_cleanup will cause many duplication.
I don't think it would.
I think all we need here is something like that
(note it is schematic one, doesn't take into accounf that TXD ring is circular):
tx_done_cleanup(..., uint32_t cnt)
{
/* we have txq->nb_tx_free TXDs starting from txq->tx_tail.
Scan them first and free as many mbufs as we can.
If we need more mbufs to free call ixgbe_xmit_cleanup()
to free more TXDs. */
swr = txq->sw_ring;
txr = txq->tx_ring;
id = txq->tx_tail;
free = txq->nb_tx_free;
for (n = 0; n < cnt && free != 0; ) {
for (j = 0; j != free && n < cnt; j++) {
swe = &swr[id + j];
if (swe->mbuf != NULL) {
rte_pktmbuf_free_seg(swe->mbuf);
swe->mbuf = NULL;
}
n += (swe->last_id == id + j)
}
if (n < cnt) {
ixgbe_xmit_cleanup(txq);
free = txq->nb_tx_free - free;
}
}
return n;
}
>
> Above all , it seem not a perfect idea to reuse ixgbe_xmit_cleanup().
Totally disagree, see above.
>
> Second.
> The function in patch is copy from code in igb_rxtx.c. it already updated in 2017,
> The commit id is 8d907d2b79f7a54c809f1c44970ff455fa2865e1.
I realized that.
But I think it as a problem, not a positive thing.
While they do have some similarities, igb abd ixgbe are PMDs for different devices,
and their TX code differs quite a lot. Let say igb doesn't use tx_rs_threshold,
but instead set RS bit for each last TXD.
So, just blindly copying tx_done_cleanup() from igb to ixgbe doesn't look like a good
idea to me.
> I trust the logic of code is right.
> Actually it don't complete for ixgbe, i40e and ice, while it don't change the value of
> last_desc_cleaned and tx_next_dd. And it's beginning prefer last_desc_cleaned or
> tx_next_dd(for offload or simple) to tx_tail.
>
> So, I suggest to use the old function and fix the issue.
>
> > >
> > >
> > > > >This is the first packet that will be
> > > > > + * attempted to be freed.
> > > > > + */
> > > > > +
> > > > > +/* Get last segment in most recently added packet. */ tx_last =
> > > > > +sw_ring[txq->tx_tail].last_id;
> > > > > +
> > > > > +/* Get the next segment, which is the oldest segment in ring. */
> > > > > +tx_first = sw_ring[tx_last].next_id;
> > > > > +
> > > > > +/* Set the current index to the first. */ tx_id = tx_first;
> > > > > +
> > > > > +/*
> > > > > + * Loop through each packet. For each packet, verify that an
> > > > > + * mbuf exists and that the last segment is free. If so, free
> > > > > + * it and move on.
> > > > > + */
> > > > > +while (1) {
> > > > > +tx_last = sw_ring[tx_id].last_id;
> > > > > +
> > > > > +if (sw_ring[tx_last].mbuf) {
> > > > > +if (!(txr[tx_last].wb.status &
> > > > > +IXGBE_TXD_STAT_DD))
> > > > > +break;
> > > > > +
> > > > > +/* Get the start of the next packet. */ tx_next =
> > > > > +sw_ring[tx_last].next_id;
> > > > > +
> > > > > +/*
> > > > > + * Loop through all segments in a
> > > > > + * packet.
> > > > > + */
> > > > > +do {
> > > > > +rte_pktmbuf_free_seg(sw_ring[tx_id].mbuf);
> > > > > +sw_ring[tx_id].mbuf = NULL;
> > > > > +sw_ring[tx_id].last_id = tx_id;
> > > > > +
> > > > > +/* Move to next segment. */
> > > > > +tx_id = sw_ring[tx_id].next_id;
> > > > > +
> > > > > +} while (tx_id != tx_next);
> > > > > +
> > > > > +/*
> > > > > + * Increment the number of packets
> > > > > + * freed.
> > > > > + */
> > > > > +count++;
> > > > > +
> > > > > +if (unlikely(count == (int)free_cnt)) break; } else {
> > > > > +/*
> > > > > + * There are multiple reasons to be here:
> > > > > + * 1) All the packets on the ring have been
> > > > > + * freed - tx_id is equal to tx_first
> > > > > + * and some packets have been freed.
> > > > > + * - Done, exit
> > > > > + * 2) Interfaces has not sent a rings worth of
> > > > > + * packets yet, so the segment after tail is
> > > > > + * still empty. Or a previous call to this
> > > > > + * function freed some of the segments but
> > > > > + * not all so there is a hole in the list.
> > > > > + * Hopefully this is a rare case.
> > > > > + * - Walk the list and find the next mbuf. If
> > > > > + * there isn't one, then done.
> > > > > + */
> > > > > +if (likely(tx_id == tx_first && count != 0)) break;
> > > > > +
> > > > > +/*
> > > > > + * Walk the list and find the next mbuf, if any.
> > > > > + */
> > > > > +do {
> > > > > +/* Move to next segment. */
> > > > > +tx_id = sw_ring[tx_id].next_id;
> > > > > +
> > > > > +if (sw_ring[tx_id].mbuf)
> > > > > +break;
> > > > > +
> > > > > +} while (tx_id != tx_first);
> > > > > +
> > > > > +/*
> > > > > + * Determine why previous loop bailed. If there
> > > > > + * is not an mbuf, done.
> > > > > + */
> > > > > +if (sw_ring[tx_id].mbuf == NULL)
> > > > > +break;
> > > > > +}
> > > > > +}
> > > > > +
> > > > > +return count;
> > > > > +}
> > > > > +
> > > > > static void __attribute__((cold)) ixgbe_tx_free_swring(struct
> > > > > ixgbe_tx_queue *txq) { diff --git
> > > > > a/drivers/net/ixgbe/ixgbe_rxtx.h b/drivers/net/ixgbe/ixgbe_rxtx.h
> > > > > index 505d344b9..2c3770af6 100644
> > > > > --- a/drivers/net/ixgbe/ixgbe_rxtx.h
> > > > > +++ b/drivers/net/ixgbe/ixgbe_rxtx.h
> > > > > @@ -285,6 +285,8 @@ int
> > > > > ixgbe_rx_vec_dev_conf_condition_check(struct
> > > > > rte_eth_dev *dev); int ixgbe_rxq_vec_setup(struct ixgbe_rx_queue
> > > > > *rxq); void ixgbe_rx_queue_release_mbufs_vec(struct
> > > > > ixgbe_rx_queue *rxq);
> > > > >
> > > > > +int ixgbe_tx_done_cleanup(void *txq, uint32_t free_cnt);
> > > > > +
> > > > > extern const uint32_t ptype_table[IXGBE_PACKET_TYPE_MAX];
> > > > > extern const uint32_t ptype_table_tn[IXGBE_PACKET_TYPE_TN_MAX];
> > > > >
> > > > > --
> > > > > 2.17.1
> > > >
> > >
> >
>
next prev parent reply other threads:[~2020-01-06 13:26 UTC|newest]
Thread overview: 74+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-12-03 5:51 [dpdk-dev] [PATCH 0/4] drivers/net: " Chenxu Di
2019-12-03 5:51 ` [dpdk-dev] [PATCH 1/4] net/fm10k: " Chenxu Di
2019-12-03 5:51 ` [dpdk-dev] [PATCH 2/4] net/i40e: " Chenxu Di
2019-12-03 5:51 ` [dpdk-dev] [PATCH 3/4] net/ice: " Chenxu Di
2019-12-03 5:51 ` [dpdk-dev] [PATCH 4/4] net/ixgbe: " Chenxu Di
2019-12-20 3:02 ` [dpdk-dev] [PATCH v2 0/5] drivers/net: " Chenxu Di
2019-12-20 3:02 ` [dpdk-dev] [PATCH v2 1/5] net/fm10k: " Chenxu Di
2019-12-20 3:02 ` [dpdk-dev] [PATCH v2 2/5] net/i40e: " Chenxu Di
2019-12-20 3:02 ` [dpdk-dev] [PATCH v2 3/5] net/ice: " Chenxu Di
2019-12-20 3:02 ` [dpdk-dev] [PATCH v2 4/5] net/ixgbe: " Chenxu Di
2019-12-20 3:02 ` [dpdk-dev] [PATCH v2 5/5] net/e1000: " Chenxu Di
2019-12-20 3:15 ` [dpdk-dev] [PATCH v3 0/5] drivers/net: " Chenxu Di
2019-12-20 3:15 ` [dpdk-dev] [PATCH v3 1/5] net/fm10k: " Chenxu Di
2019-12-20 3:15 ` [dpdk-dev] [PATCH v3 2/5] net/i40e: " Chenxu Di
2019-12-20 3:15 ` [dpdk-dev] [PATCH v3 3/5] net/ice: " Chenxu Di
2019-12-20 3:15 ` [dpdk-dev] [PATCH v3 4/5] net/ixgbe: " Chenxu Di
2019-12-20 3:15 ` [dpdk-dev] [PATCH v3 5/5] net/e1000: " Chenxu Di
2019-12-24 2:39 ` [dpdk-dev] [PATCH v4 0/5] drivers/net: " Chenxu Di
2019-12-24 2:39 ` [dpdk-dev] [PATCH v4 1/5] net/fm10k: " Chenxu Di
2019-12-24 2:39 ` [dpdk-dev] [PATCH v4 2/5] net/i40e: " Chenxu Di
2019-12-26 8:24 ` Xing, Beilei
2019-12-24 2:39 ` [dpdk-dev] [PATCH v4 3/5] net/ice: " Chenxu Di
2019-12-24 2:39 ` [dpdk-dev] [PATCH v4 4/5] net/ixgbe: " Chenxu Di
2019-12-24 2:39 ` [dpdk-dev] [PATCH v4 5/5] net/e1000: " Chenxu Di
2019-12-30 9:38 ` [dpdk-dev] [PATCH v6 0/4] drivers/net: " Chenxu Di
2019-12-30 9:38 ` [dpdk-dev] [PATCH v6 1/4] net/i40e: " Chenxu Di
2019-12-30 13:01 ` Ananyev, Konstantin
2019-12-30 9:38 ` [dpdk-dev] [PATCH v6 2/4] net/ice: " Chenxu Di
2019-12-30 9:38 ` [dpdk-dev] [PATCH v6 3/4] net/ixgbe: " Chenxu Di
2019-12-30 12:53 ` Ananyev, Konstantin
2020-01-03 9:01 ` Di, ChenxuX
2020-01-05 23:36 ` Ananyev, Konstantin
2020-01-06 9:03 ` Di, ChenxuX
2020-01-06 13:26 ` Ananyev, Konstantin [this message]
2020-01-07 10:46 ` Di, ChenxuX
2020-01-07 14:09 ` Ananyev, Konstantin
2020-01-08 10:15 ` Di, ChenxuX
2020-01-08 15:12 ` Ananyev, Konstantin
2019-12-30 9:38 ` [dpdk-dev] [PATCH v6 4/4] net/e1000: " Chenxu Di
2020-01-09 10:38 ` [dpdk-dev] [PATCH v7 0/4] drivers/net: " Chenxu Di
2020-01-09 10:38 ` [dpdk-dev] [PATCH v7 1/4] net/i40e: " Chenxu Di
2020-01-09 10:38 ` [dpdk-dev] [PATCH v7 2/4] net/ice: " Chenxu Di
2020-01-09 10:38 ` [dpdk-dev] [PATCH v7 3/4] net/ixgbe: " Chenxu Di
2020-01-09 14:01 ` Ananyev, Konstantin
2020-01-10 10:08 ` Di, ChenxuX
2020-01-10 12:46 ` Ananyev, Konstantin
2020-01-09 10:38 ` [dpdk-dev] [PATCH v7 4/4] net/e1000: " Chenxu Di
2020-01-10 9:58 ` [dpdk-dev] [PATCH v8 0/4] drivers/net: " Chenxu Di
2020-01-10 9:58 ` [dpdk-dev] [PATCH v8 1/4] net/i40e: " Chenxu Di
2020-01-10 9:58 ` [dpdk-dev] [PATCH v8 2/4] net/ice: " Chenxu Di
2020-01-10 9:58 ` [dpdk-dev] [PATCH v8 3/4] net/ixgbe: " Chenxu Di
2020-01-10 13:49 ` Ananyev, Konstantin
2020-01-10 9:59 ` [dpdk-dev] [PATCH v8 4/4] net/e1000: " Chenxu Di
2020-01-13 9:57 ` [dpdk-dev] [PATCH v9 0/4] drivers/net: " Chenxu Di
2020-01-13 9:57 ` [dpdk-dev] [PATCH v9 1/4] net/i40e: " Chenxu Di
2020-01-13 11:08 ` Ananyev, Konstantin
2020-01-13 9:57 ` [dpdk-dev] [PATCH v9 2/4] net/ice: " Chenxu Di
2020-01-14 1:55 ` Yang, Qiming
2020-01-14 12:40 ` Ferruh Yigit
2020-01-15 14:34 ` Ferruh Yigit
2020-01-16 1:40 ` Di, ChenxuX
2020-01-16 7:09 ` [dpdk-dev] [PATCH] net/ice: cleanup for vec path check Xiaolong Ye
2020-01-16 10:19 ` Ferruh Yigit
2020-01-17 2:21 ` Yang, Qiming
2020-01-16 8:43 ` [dpdk-dev] [PATCH v9 2/4] net/ice: cleanup Tx buffers Ferruh Yigit
2020-01-13 9:57 ` [dpdk-dev] [PATCH v9 3/4] net/ixgbe: " Chenxu Di
2020-01-13 11:07 ` Ananyev, Konstantin
2020-01-16 8:44 ` Ferruh Yigit
2020-01-16 14:47 ` Ferruh Yigit
2020-01-16 15:23 ` Ferruh Yigit
2020-01-13 9:57 ` [dpdk-dev] [PATCH v9 4/4] net/e1000: " Chenxu Di
2020-01-13 11:08 ` Ananyev, Konstantin
2020-01-14 2:49 ` Ye Xiaolong
2020-01-14 2:22 ` [dpdk-dev] [PATCH 0/4] drivers/net: " Ye Xiaolong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=SN6PR11MB25585FDADFE1F159829B305D9A3C0@SN6PR11MB2558.namprd11.prod.outlook.com \
--to=konstantin.ananyev@intel.com \
--cc=chenxux.di@intel.com \
--cc=dev@dpdk.org \
--cc=qiming.yang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).