From: Zoltan Kiss <zoltan.kiss@linaro.org>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh
Date: Wed, 03 Jun 2015 18:46:40 +0100 [thread overview]
Message-ID: <556F3D80.3070904@linaro.org> (raw)
In-Reply-To: <2601191342CEEE43887BDE71AB977258214348AA@irsmsx105.ger.corp.intel.com>
On 02/06/15 18:35, Ananyev, Konstantin wrote:
>
>
>> -----Original Message-----
>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>> Sent: Tuesday, June 02, 2015 4:08 PM
>> To: Ananyev, Konstantin; dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh
>>
>>
>>
>> On 02/06/15 14:31, Ananyev, Konstantin wrote:
>>> Hi Zoltan,
>>>
>>>> -----Original Message-----
>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Zoltan Kiss
>>>> Sent: Monday, June 01, 2015 5:16 PM
>>>> To: dev@dpdk.org
>>>> Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh
>>>>
>>>> Hi,
>>>>
>>>> Anyone would like to review this patch? Venky sent a NAK, but I've
>>>> explained to him why it is a bug.
>>>
>>>
>>> Well, I think Venky is right here.
>> I think the comments above rte_eth_tx_burst() definition are quite clear
>> about what tx_free_thresh means, e1000 and i40e use it that way, but not
>> ixgbe.
>>
>>> Indeed that fix, will cause more often unsuccessful checks for DD bits and might cause a
>>> slowdown for TX fast-path.
>> Not if the applications set tx_free_thresh according to the definition
>> of this value. But we can change the default value from 32 to something
>> higher, e.g I'm using nb_desc/2, and it works out well.
>
> Sure we can, as I said below, we can unify it one way or another.
> One way would be to make fast-path TX to free TXDs when number of occupied TXDs raises above tx_free_thresh
> (what rte_ethdev.h comments say and what full-featured TX is doing).
> Though in that case we have to change default value for tx_free_thresh, and all existing apps that
> using tx_free_thresh==32 and fast-path TX will probably experience a slowdown.
They are in trouble already, because i40e and e1000 uses it as defined.
But I guess most apps are going with 0, which sets the drivers default.
Others have to change the value to nb_txd - curr_value to have the same
behaviour
> Another way would be to make all TX functions to treat tx_conf->tx_free_thresh as fast-path TX functions do
> (free TXDs when number of free TXDs drops below tx_free_thresh) and update rte_ethdev.h comments.
And i40e and e1000e code as well. I don't see what difference it makes
which way of definition you use, what I care is that it should be used
consistently.
>
> Though, I am not sure that it really worth all these changes.
> From one side, whatever tx_free_thresh would be,
> the app should still assume that the worst case might happen,
> and up to nb_tx_desc mbufs can be consumed by the queue.
> From other side, I think the default value should work well for most cases.
> So I am still for graceful deprecation of that config parameter, see below.
>
>>
>>> Anyway, with current PMD implementation, you can't guarantee that at any moment
>>> TX queue wouldn't use more than tx_free_thresh mbufs.
>>
>>
>>> There could be situations (low speed, or link is down for some short period, etc), when
>>> much more than tx_free_thresh TXDs are in use and none of them could be freed by HW right now.
>>> So your app better be prepared, that up to (nb_tx_desc * num_of_TX_queues) could be in use
>>> by TX path at any given moment.
>>>
>>> Though yes, there is an inconsistency how different ixgbe TX functions treat tx_conf->tx_free_thresh parameter.
>>> That probably creates wrong expectations and confusion.
>> Yes, ixgbe_xmit_pkts() use it the way it's defined, this two function
>> doesn't.
>>
>>> We might try to unify it's usage one way or another, but I personally don't see much point in it.
>>> After all, tx_free_tresh seems like a driver internal choice (based on the nb_tx_desc and other parameters).
>>> So I think a better way would be:
>>> 1. Deprecate tx_conf->tx_free_thresh (and remove it in later releases) and make
>>> each driver to use what it thinks would be the best value.
>> But how does the driver knows what's the best for the applications
>> traffic pattern? I think it's better to leave the possibility for the
>> app to fine tune it.
>
> My understanding is that for most cases the default value should do pretty well.
> That default value, shouldn't be too small, so we avoid unnecessary & unsuccessful checks,
> and probably shouldn't be too big, to prevent unnecessary mbufs consumption
> (something between nb_tx_desc / 2 and 3 * nb_tx_desc / 4 probably).
I agree
>
> But might be you have a good example, when such tuning is needed?
> For what traffic patterns you would set tx_free_thresh to some different values,
> and how will it impact performance?
I don't have an actual example, but I think it's worth to keep this
tuning option if we already have it. Most people probably wouldn't use
it, but I can imagine that the very enthusiastic wants to try out
different settings to find the best.
E.g. I was testing odp_l2fwd when I came across the problem, and I found
it useful to have this option. With its traffic pattern (receive a batch
of packets then send them out on an another interface) it can happen
that with different clock speeds you can find different optimums.
>
> Again, if there would be tx_free_pkts(), why someone would also need a tx_conf->tx_free_thresh?
I think about tx_free_pkts as a rainy day option, when you want ALL TX
completed packets to be released, because you are out of buffers. While
tx_free_thresh is the fast path way of TX completion, when you have the
room to wait for more packets to be gathered.
>
> Konstantin
>
>> In the meantime we can improve the default selection as well, as I
>> suggested above.
>>
>>> 2. As you suggested in another mail, introduce an new function:
>>> uint16_t rte_eth_tx_free_pkts(port_id, queue_id, nb_to_free).
>>> That would give upper layer a better control of memory usage, and might be called by the upper layer at idle time,
>>> so further tx_burst, don't need to spend time on freeing TXDs/packets.
>> I agree.
>>
>>>
>>> Konstantin
>>>
>>>
>>>>
>>>> Regards,
>>>>
>>>> Zoltan
>>>>
>>>> On 27/05/15 21:12, Zoltan Kiss wrote:
>>>>> This check doesn't do what's required by rte_eth_tx_burst:
>>>>> "When the number of previously sent packets reached the "minimum transmit
>>>>> packets to free" threshold"
>>>>>
>>>>> This can cause problems when txq->tx_free_thresh + [number of elements in the
>>>>> pool] < txq->nb_tx_desc.
>>>>>
>>>>> Signed-off-by: Zoltan Kiss <zoltan.kiss@linaro.org>
>>>>> ---
>>>>> drivers/net/ixgbe/ixgbe_rxtx.c | 4 ++--
>>>>> drivers/net/ixgbe/ixgbe_rxtx_vec.c | 2 +-
>>>>> 2 files changed, 3 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
>>>>> index 4f9ab22..b70ed8c 100644
>>>>> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
>>>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
>>>>> @@ -250,10 +250,10 @@ tx_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
>>>>>
>>>>> /*
>>>>> * Begin scanning the H/W ring for done descriptors when the
>>>>> - * number of available descriptors drops below tx_free_thresh. For
>>>>> + * number of in flight descriptors reaches tx_free_thresh. For
>>>>> * each done descriptor, free the associated buffer.
>>>>> */
>>>>> - if (txq->nb_tx_free < txq->tx_free_thresh)
>>>>> + if ((txq->nb_tx_desc - txq->nb_tx_free) > txq->tx_free_thresh)
>>>>> ixgbe_tx_free_bufs(txq);
>>>>>
>>>>> /* Only use descriptors that are available */
>>>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>>> index abd10f6..f91c698 100644
>>>>> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>>> @@ -598,7 +598,7 @@ ixgbe_xmit_pkts_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
>>>>> if (unlikely(nb_pkts > RTE_IXGBE_VPMD_TX_BURST))
>>>>> nb_pkts = RTE_IXGBE_VPMD_TX_BURST;
>>>>>
>>>>> - if (txq->nb_tx_free < txq->tx_free_thresh)
>>>>> + if ((txq->nb_tx_desc - txq->nb_tx_free) > txq->tx_free_thresh)
>>>>> ixgbe_tx_free_bufs(txq);
>>>>>
>>>>> nb_commit = nb_pkts = (uint16_t)RTE_MIN(txq->nb_tx_free, nb_pkts);
>>>>>
next prev parent reply other threads:[~2015-06-03 17:46 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-05-27 20:12 Zoltan Kiss
2015-05-28 10:50 ` Venkatesan, Venky
2015-05-28 11:12 ` Zoltan Kiss
2015-06-01 16:15 ` Zoltan Kiss
2015-06-02 13:31 ` Ananyev, Konstantin
2015-06-02 15:08 ` Zoltan Kiss
2015-06-02 17:35 ` Ananyev, Konstantin
2015-06-03 17:46 ` Zoltan Kiss [this message]
2015-06-09 11:18 ` Ananyev, Konstantin
2015-06-09 15:08 ` Zoltan Kiss
2015-06-09 15:44 ` Ananyev, Konstantin
2015-06-09 17:46 ` Zoltan Kiss
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=556F3D80.3070904@linaro.org \
--to=zoltan.kiss@linaro.org \
--cc=dev@dpdk.org \
--cc=konstantin.ananyev@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).