Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for rte_eth_dev structure

DPDK patches and discussions
 help / color / mirror / Atom feed

From: Avi Kivity <avi@scylladb.com>
To: Jerin Jacob <jerin.jacob@caviumnetworks.com>,
	"Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Cc: Thomas Monjalon <thomas.monjalon@6wind.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for rte_eth_dev structure
Date: Thu, 28 Jul 2016 15:07:27 +0300	[thread overview]
Message-ID: <ea9624c7-4a90-c014-1815-749be6604c05@scylladb.com> (raw)
In-Reply-To: <20160728113853.GA14755@localhost.localdomain>



On 07/28/2016 02:38 PM, Jerin Jacob wrote:
> On Thu, Jul 28, 2016 at 10:36:07AM +0000, Ananyev, Konstantin wrote:
>>> If it does not cope up then it can skip tx'ing in the actual tx burst
>>> itself and move the "skipped" tx packets to end of the list in the tx
>>> burst so that application can take the action on "skipped" packet after
>>> the tx burst
>> Sorry, that's too cryptic for me.
>> Can you reword it somehow?
> OK.
> 1) lets say application requests 32 packets to send it using tx_burst.
> 2) packets are from p0 to p31
> 3) in driver due to some reason, it is not able to send the packets due to some
> constraints in the driver(say expect p2 and p16 everything else sent
> successfully by the driver)
> 4) driver can move p2 and p16 at pkt[0] and pkt[1] on tx_burst and
> return 30
> 5) application can take action on p2 and p16 based the return value of
> 30(ie 32-30 = 2 packets needs to handle at pkt[0] and pkt[1]

That can cause reordering; while it is legal, it reduces tcp performance.

Better to preserve the application-provided order.


>
>>>
>>>> Instead it just setups the ol_flags, fills tx_offload fields and calls tx_prep().
>>>> Please read the original Tomasz's patch, I think he explained possible use-cases
>>>> with lot of details.
>>> Sorry, it is not very clear in terms of use cases.
>> Ok, what I meant to say:
>> Right now, if user wants to use HW TX cksum/TSO offloads he might have to:
>> - setup ipv4 header cksum field.
>> - calculate the pseudo header checksum
>> - setup tcp/udp cksum field.
>>
>> Rules how these calculations need to be done and which fields need to be updated,
>> may vary depending on HW underneath and requested offloads.
>> tx_prep() - supposed to hide all these nuances from user and allow him to use TX HW offloads
>> in a transparent way.
> Not sure I understand it completely. Bit contradicting with below
> statement
> |We would document what tx_prep() supposed to do, and in what cases user
> |don't need it.
>
> How about introducing a new ethdev generic eal command-line mode OR
> new ethdev_configure hint that PMD driver is in "tx_prep->tx_burst" mode
> instead of just tx_burst? That way no fast-path performance degradation
> for the PMD that does not need it
>
>
>> Another main purpose of tx_prep(): for multi-segment packets is to check
>> that number of segments doesn't exceed  HW limit.
>> Again right now users have to do that on their own.
>>
>>> In HW perspective, It it tries to avoid the illegal state. But not sure
>>> calling "back to back" tx prepare and then tx burst how does it improve the
>>> situation as the check illegal state check introduce in actual tx burst
>>> it self.
>>>
>>> In SW perspective, its try to avoid sending malformed packets. In my
>>> view the same can achieved with existing tx burst it self as PMD is the
>>> one finally send the packets on the wire.
>> Ok, so your question is: why not to put that functionality into
>> tx_burst() itself, right?
>> For few reasons:
>> 1. putting that functionality into tx_burst() would introduce unnecessary
>>      slowdown for cases when that functionality is not needed
>>      (one segment per packet, no HW offloads).
> These parameters can be configured on init time
>
>> 2. User might don't want to use tx_prep() - he/she might have its
>>      own analog, which he/she belives is faster/smarter,etc.
> That's the current mode. Right?
>> 3.  Having it a s separate function would allow user control when/where
>>        to call it, let say only for some packets, or probably call tx_prep()
>>        on one core, and do actual tx_burst() for these packets on the other.
> Why to process it under tx_prep() as application can always process the
> packet in one core
>
>>> proposal quote:
>>>
>>> 1. Introduce rte_eth_tx_prep() function to do necessary preparations of
>>>     packet burst to be safely transmitted on device for desired HW
>>>     offloads (set/reset checksum field according to the hardware
>>>     requirements) and check HW constraints (number of segments per
>>>     packet, etc).
>>>
>>>     While the limitations and requirements may differ for devices, it
>>>     requires to extend rte_eth_dev structure with new function pointer
>>>     "tx_pkt_prep" which can be implemented in the driver to prepare and
>>>     verify packets, in devices specific way, before burst, what should to
>>>     prevent application to send malformed packets.
>>>
>>>
>>>>> and what if the PMD does not implement that callback then it is of waste cycles. Right?
>>>> If you refer as lost cycles here something like:
>>>> RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_prep, -ENOTSUP);
>>>> then yes.
>>>> Though comparing to actual work need to be done for most HW TX offloads,
>>>> I think it is neglectable.
>>> Not sure.
>>>
>>>> Again, as I said before, it is totally voluntary for the application.
>>> Not according to proposal. It can't be too as application has no idea
>>> what PMD driver does with "prep" what is the implication on a HW if
>>> application does not
>> Why application writer wouldn't have an idea?
>> We would document what tx_prep() supposed to do, and in what cases user don't need it.
> But how he/she detect that on that run-time ?
>
>> Then it would be up to the user:
>> - not to use it at all (one segment per packet, no HW TX offloads)
> We already have TX flags for that
>
>> - not to use tx_prep(), and make necessary preparations himself,
>>    that what people have to do now.
>> - use tx_prep()
>>
>> Konstantin
>>

next prev parent reply	other threads:[~2016-07-28 12:07 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-20 14:24 [dpdk-dev] [PATCH] " Tomasz Kulasek
2016-07-20 15:01 ` Thomas Monjalon
2016-07-20 15:13   ` Ananyev, Konstantin
2016-07-20 15:22     ` Thomas Monjalon
2016-07-20 15:42       ` Kulasek, TomaszX
2016-07-21 15:24 ` [dpdk-dev] [PATCH v2] " Tomasz Kulasek
2016-07-21 22:48   ` Ananyev, Konstantin
2016-07-27  8:59     ` Thomas Monjalon
2016-07-27 17:10       ` Jerin Jacob
2016-07-27 17:33         ` Ananyev, Konstantin
2016-07-27 17:41           ` Jerin Jacob
2016-07-27 20:51             ` Ananyev, Konstantin
2016-07-28  2:13               ` Jerin Jacob
2016-07-28 10:36                 ` Ananyev, Konstantin
2016-07-28 11:38                   ` Jerin Jacob
2016-07-28 12:07                     ` Avi Kivity [this message]
2016-07-28 13:01                     ` Ananyev, Konstantin
2016-07-28 13:58                       ` Olivier MATZ
2016-07-28 14:21                         ` Ananyev, Konstantin
2016-07-28 13:59                       ` Jerin Jacob
2016-07-28 14:52                         ` Thomas Monjalon
2016-07-28 16:25                           ` Jerin Jacob
2016-07-28 17:07                             ` Thomas Monjalon
2016-07-31  7:50     ` Vlad Zolotarov
2016-07-28 12:04   ` Avi Kivity
2016-07-31  7:46 ` [dpdk-dev] [PATCH] " Vlad Zolotarov
2016-07-31  8:10   ` Vlad Zolotarov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ea9624c7-4a90-c014-1815-749be6604c05@scylladb.com \
    --to=avi@scylladb.com \
    --cc=dev@dpdk.org \
    --cc=jerin.jacob@caviumnetworks.com \
    --cc=konstantin.ananyev@intel.com \
    --cc=thomas.monjalon@6wind.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).