DPDK patches and discussions
 help / color / mirror / Atom feed
From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
To: Stephen Hemminger <stephen@networkplumber.org>,
	Jie Hai <haijie1@huawei.com>
Cc: Ferruh Yigit <ferruh.yigit@amd.com>,
	huangdengdui <huangdengdui@huawei.com>,
	dev@dpdk.org, "John W. Linville" <linville@tuxdriver.com>,
	Ciara Loftus <ciara.loftus@intel.com>,
	Shepard Siegel <shepard.siegel@atomicrules.com>,
	Ed Czeck <ed.czeck@atomicrules.com>,
	John Miller <john.miller@atomicrules.com>,
	Igor Russkikh <irusskikh@marvell.com>,
	Steven Webster <steven.webster@windriver.com>,
	Matt Peters <matt.peters@windriver.com>,
	Selwin Sebastian <selwin.sebastian@amd.com>,
	Julien Aube <julien_dpdk@jaube.fr>,
	Ajit Khaparde <ajit.khaparde@broadcom.com>,
	Somnath Kotur <somnath.kotur@broadcom.com>,
	Chas Williams <chas3@att.com>,
	"Min Hu (Connor)" <humin29@huawei.com>,
	Nithin Dabilpuram <ndabilpuram@marvell.com>,
	Kiran Kumar K <kirankumark@marvell.com>,
	Sunil Kumar Kori <skori@marvell.com>,
	Satha Rao <skoteshwar@marvell.com>,
	Harman Kalra <hkalra@marvell.com>,
	Yuying Zhang <yuying.zhang@intel.com>,
	Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>,
	Hemant Agrawal <hemant.agrawal@nxp.com>,
	Sachin Saxena <sachin.saxena@nxp.com>,
	Shai Brandes <shaibran@amazon.com>,
	Evgeny Schemeilin <evgenys@amazon.com>,
	Ron Beider <rbeider@amazon.com>,
	Amit Bernstein <amitbern@amazon.com>,
	Wajeeh Atrash <atrwajee@amazon.com>,
	Gagandeep Singh <g.singh@nxp.com>,
	Apeksha Gupta <apeksha.gupta@nxp.com>,
	John Daley <johndale@cisco.com>,
	Hyong Youb Kim <hyonkim@cisco.com>, Gaetan Rivet <grive@u256.net>,
	Jeroen de Borst <jeroendb@google.com>,
	Rushil Gupta <rushilg@google.com>,
	Joshua Washington <joshwash@google.com>,
	Ziyang Xuan <xuanziyang2@huawei.com>,
	Xiaoyun Wang <cloud.wangxiaoyun@huawei.com>,
	Guoyang Zhou <zhouguoyang@huawei.com>,
	Yisen Zhuang <yisen.zhuang@huawei.com>,
	Jingjing Wu <jingjing.wu@intel.com>,
	Andrew Boyer <andrew.boyer@amd.com>,
	Rosen Xu <rosen.xu@intel.com>, Long Li <longli@microsoft.com>,
	Jakub Grajciar <jgrajcia@cisco.com>,
	Matan Azrad <matan@nvidia.com>,
	Viacheslav Ovsiienko <viacheslavo@nvidia.com>,
	Dariusz Sosnowski <dsosnowski@nvidia.com>,
	Ori Kam <orika@nvidia.com>, Suanming Mou <suanmingm@nvidia.com>,
	Zyta Szpak <zr@semihalf.com>, Liron Himi <lironh@marvell.com>,
	Martin Spinler <spinler@cesnet.cz>,
	Chaoyong He <chaoyong.he@corigine.com>,
	Jiawen Wu <jiawenwu@trustnetic.com>,
	Tetsuya Mukawa <mtetsuyah@gmail.com>,
	Vamsi Attunuru <vattunuru@marvell.com>,
	Devendra Singh Rawat <dsinghrawat@marvell.com>,
	Alok Prasad <palok@marvell.com>,
	Bruce Richardson <bruce.richardson@intel.com>,
	Cristian Dumitrescu <cristian.dumitrescu@intel.com>,
	Jerin Jacob <jerinj@marvell.com>,
	Maciej Czekaj <mczekaj@marvell.com>,
	Jian Wang <jianwang@trustnetic.com>,
	Maxime Coquelin <maxime.coquelin@redhat.com>,
	Chenbo Xia <chenbox@nvidia.com>,
	Jochen Behrens <jbehrens@vmware.com>,
	lihuisong@huawei.com, fengchengwen@huawei.com,
	liuyonglong@huawei.com
Subject: Re: [PATCH] net/hns3: fix Rx packet truncation when KEEP CRC enabled
Date: Mon, 3 Jun 2024 10:07:01 +0300	[thread overview]
Message-ID: <4e15c607-9aef-4c33-91ff-4d6784ea6b4d@oktetlabs.ru> (raw)
In-Reply-To: <20240602193304.75716520@hermes.local>

On 6/3/24 05:33, Stephen Hemminger wrote:
> On Mon, 3 Jun 2024 09:38:19 +0800
> Jie Hai <haijie1@huawei.com> wrote:
> 
>> On 2024/3/1 19:10, Ferruh Yigit wrote:
>>> On 3/1/2024 6:55 AM, huangdengdui wrote:
>>>>
>>>>
>>>> On 2024/2/29 17:25, Ferruh Yigit wrote:
>>>>> On 2/29/2024 3:58 AM, huangdengdui wrote:
>>>>>>
>>>>>>
>>>>>> On 2024/2/28 21:07, Ferruh Yigit wrote:
>>>>>>> On 2/28/2024 2:27 AM, huangdengdui wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2024/2/27 0:43, Ferruh Yigit wrote:
>>>>>>>>> On 2/26/2024 3:16 AM, Jie Hai wrote:
>>>>>>>>>> On 2024/2/23 21:53, Ferruh Yigit wrote:
>>>>>>>>>>> On 2/20/2024 3:58 AM, Jie Hai wrote:
>>>>>>>>>>>> Hi, Ferruh,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for your review.
>>>>>>>>>>>>
>>>>>>>>>>>> On 2024/2/7 22:15, Ferruh Yigit wrote:
>>>>>>>>>>>>> On 2/6/2024 1:10 AM, Jie Hai wrote:
>>>>>>>>>>>>>> From: Dengdui Huang <huangdengdui@huawei.com>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> When KEEP_CRC offload is enabled, some packets will be truncated and
>>>>>>>>>>>>>> the CRC is still be stripped in following cases:
>>>>>>>>>>>>>> 1. For HIP08 hardware, the packet type is TCP and the length
>>>>>>>>>>>>>>        is less than or equal to 60B.
>>>>>>>>>>>>>> 2. For other hardwares, the packet type is IP and the length
>>>>>>>>>>>>>>        is less than or equal to 60B.
>>>>>>>>>>>>>>   
>>>>>>>>>>>>>
>>>>>>>>>>>>> If a device doesn't support the offload by some packets, it can be
>>>>>>>>>>>>> option to disable offload for that device, instead of calculating it in
>>>>>>>>>>>>> software and append it.
>>>>>>>>>>>>
>>>>>>>>>>>> The KEEP CRC feature of hns3 is faulty only in the specific packet
>>>>>>>>>>>> type and small packet(<60B) case.
>>>>>>>>>>>> What's more, the small ethernet packet is not common.
>>>>>>>>>>>>   
>>>>>>>>>>>>> Unless you have a specific usecase, or requirement to support the
>>>>>>>>>>>>> offload.
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, some users of hns3 are already using this feature.
>>>>>>>>>>>> So we cannot drop this offload
>>>>>>>>>>>>   
>>>>>>>>>>>>> <...>
>>>>>>>>>>>>>   
>>>>>>>>>>>>>> @@ -2492,10 +2544,16 @@ hns3_recv_pkts_simple(void *rx_queue,
>>>>>>>>>>>>>>                  goto pkt_err;
>>>>>>>>>>>>>>                rxm->packet_type = hns3_rx_calc_ptype(rxq, l234_info,
>>>>>>>>>>>>>> ol_info);
>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>              if (rxm->packet_type == RTE_PTYPE_L2_ETHER_TIMESYNC)
>>>>>>>>>>>>>>                  rxm->ol_flags |= RTE_MBUF_F_RX_IEEE1588_PTP;
>>>>>>>>>>>>>>      +        if (unlikely(rxq->crc_len > 0)) {
>>>>>>>>>>>>>> +            if (hns3_need_recalculate_crc(rxq, rxm))
>>>>>>>>>>>>>> +                hns3_recalculate_crc(rxq, rxm);
>>>>>>>>>>>>>> +            rxm->pkt_len -= rxq->crc_len;
>>>>>>>>>>>>>> +            rxm->data_len -= rxq->crc_len;
>>>>>>>>>>>>>>   
>>>>>>>>>>>>>
>>>>>>>>>>>>> Removing 'crc_len' from 'mbuf->pkt_len' & 'mbuf->data_len' is
>>>>>>>>>>>>> practically same as stripping CRC.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We don't count CRC length in the statistics, but it should be
>>>>>>>>>>>>> accessible
>>>>>>>>>>>>> in the payload by the user.
>>>>>>>>>>>> Our drivers are behaving exactly as you say.
>>>>>>>>>>>>   
>>>>>>>>>>>
>>>>>>>>>>> If so I missed why mbuf 'pkt_len' and 'data_len' reduced by
>>>>>>>>>>> 'rxq->crc_len', can you please explain what above lines does?
>>>>>>>>>>>
>>>>>>>>>>>   
>>>>>>>>>> @@ -2470,8 +2523,7 @@ hns3_recv_pkts_simple(void *rx_queue,
>>>>>>>>>>            rxdp->rx.bd_base_info = 0;
>>>>>>>>>>
>>>>>>>>>>            rxm->data_off = RTE_PKTMBUF_HEADROOM;
>>>>>>>>>> -        rxm->pkt_len = (uint16_t)(rte_le_to_cpu_16(rxd.rx.pkt_len)) -
>>>>>>>>>> -                rxq->crc_len;
>>>>>>>>>> +        rxm->pkt_len = rte_le_to_cpu_16(rxd.rx.pkt_len);
>>>>>>>>>>
>>>>>>>>>> In the previous code above, the 'pkt_len' is set to the length obtained
>>>>>>>>>> from the BD. the length obtained from the BD already contains CRC length.
>>>>>>>>>> But as you said above, the DPDK requires that the length of the mbuf
>>>>>>>>>> does not contain CRC length . So we subtract 'rxq->crc_len' from
>>>>>>>>>> mbuf'pkt_len' and 'data_len'. This patch doesn't change the logic, it
>>>>>>>>>> just moves the code around.
>>>>>>>>>>   
>>>>>>>>>
>>>>>>>>> Nope, I am not saying mbuf length shouldn't contain CRC length, indeed
>>>>>>>>> it is other way around and this is our confusion.
>>>>>>>>>
>>>>>>>>> CRC length shouldn't be in the statistics, I mean in received bytes stats.
>>>>>>>>> Assume that received packet is 128 bytes and we know it has the CRC,
>>>>>>>>> Rx received bytes stat should be 124 (rx_bytes = 128 - CRC = 124)
>>>>>>>>>
>>>>>>>>> But mbuf->data_len & mbuf->pkt_len should have full frame length,
>>>>>>>>> including CRC.
>>>>>>>>>
>>>>>>>>> As application explicitly requested to KEEP CRC, it will know last 4
>>>>>>>>> bytes are CRC.
>>>>>>>>> Anything after 'mbuf->data_len' in the mbuf buffer is not valid, so if
>>>>>>>>> you reduce 'mbuf->data_len' by CRC size, application can't know if 4
>>>>>>>>> bytes after 'mbuf->data_len' is valid CRC or not.
>>>>>>>>>   
>>>>>>>> I agree with you.
>>>>>>>>
>>>>>>>> But the implementation of other PMDs supported KEEP_CRC is like this.
>>>>>>>> In addition, there are probably many users that are already using it.
>>>>>>>> If we modify it, it may cause applications incompatible.
>>>>>>>>
>>>>>>>> what do you think?
>>>>>>>>   
>>>>>>> This is documented in the ethdev [1], better to follow the documentation
>>>>>>> for all PMDs, can you please highlight the relevant driver code, we can
>>>>>>> discuss it with their maintainers.
>>>>>>>
>>>>>>> Alternatively we can document this additionally in the KEEP_CRC feature
>>>>>>> document if it helps for the applications.
>>>>>>>
>>>>>>>
>>>>>>> [1]
>>>>>>> https://git.dpdk.org/dpdk/tree/lib/ethdev/rte_ethdev.h?h=v23.11#n257
>>>>>>
>>>>>> Currently,this documentation does not describe whether pkt_len and data_len should contain crc_len.
>>>>>>   
>>>>>
>>>>> I think it is clear that pkt_len and data_len should contain crc_len, we
>>>>> can ask for more comments.
>>>> This patch doesn't change the logic for hns3 PMD and the implementation of
>>>> other PMDs supported KEEP_CRC is like hns3 PMD. Can we merge this patch first?
>>>>   
>>>
>>> If hns3 behaving against the documented behavior, I don't understand why
>>> you are pushing for merging this patch, instead of fixing it.
>>>    
>>
>>>
>>> Other drivers behavior is something else, not directly related to this
>>> patch, but again if you can provide references we can discuss with their
>>> maintainers.
>>>    
>> Hi, all maintainers,
>> We need your opinions on whether pkt_len and data_len should contain CRC
>> len. The KEEP CRC feature is related. As if it is enabled, most drivers
>> will substract CRC len from pkt_len and data_len. which means users
>> cannot read the CRC data through the DPDK framework interface.
>>
>> Among the drivers that support keeping CRC, only the bnxt, cfpl, idpf,
>> qede and sfc get the pkt_len and data_len from the descriptor and not
>> subtract CRC len by drivers. I don't know if the length of these drivers
>> includes the CRC len or not, please confirm that, thanks.
>>
>>
>> Back to the current patch.
>> Hi, Ferruh.
>> Obviously, if we need to give users access to the CRC data, we'll have
>> to modify the defination in ethdev and usage in most drivers.
>>
>> I don't think this change will be backported. Am I wrong?
>>
>> But this patch for hns3 bugfix, need to be backported.
>>
>> That's why we can separate this patch from the confirmation of the
>> meaning of pkt_len and data_len.
>> So can this patch merge first?
>>
>> Thanks,
>> Jie Hai
>>
>>>    
>>>>>   
>>>>>> Do you mean that we add this description in the KEEP_CRC feature document
>>>>>> and notify all drivers that support KEEP_CRC to follow this documentation?
>>>>>>
>>>>>> If so, can you merge this patch first?
>>>>>> Then we send a RFC to disscuss it with all PMDs maintainer.
>>>>>>   
>>>>>
>>>>> Not for drivers, just a suggestion that if we should update feature
>>>>> documentation with above information for users. So there is no
>>>>> dependency to features document update.
>>>>>
>>>>>   
>>>> Sorry I'm more confused. What should we do next?
>>>
>>> There is already API documentation about KEEP_CRC, I think that is
>>> already sufficient for driver developers.
>>>
>>> I am just brainstorming if updating './doc/guides/nics/features.rst' can
>>> help end user, but it is not an action or blocker for this patch.
>>>
>>> Next step is to update this path.
> 
> IMHO the only sane thing is:
>    -if keep crc is enabled then pkt_len and data_len include the extra bytes for the CRC.
>    -if keep crc is disabled, then pkt_len and data_len match the length of the packet without the CRC.
> 
> 
> Other than driver testing, never saw much point to using keep crc.

+1

      parent reply	other threads:[~2024-06-03  7:07 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-06  1:10 Jie Hai
2024-02-07 14:15 ` Ferruh Yigit
2024-02-20  3:58   ` Jie Hai
2024-02-23 13:53     ` Ferruh Yigit
2024-02-26  3:16       ` Jie Hai
2024-02-26 16:43         ` Ferruh Yigit
2024-02-28  2:27           ` huangdengdui
2024-02-28 13:07             ` Ferruh Yigit
2024-02-29  3:58               ` huangdengdui
2024-02-29  9:25                 ` Ferruh Yigit
2024-03-01  6:55                   ` huangdengdui
2024-03-01 11:10                     ` Ferruh Yigit
2024-03-08 11:36                       ` Jie Hai
2024-03-22  6:28                         ` Jie Hai
2024-06-03  1:38                       ` Jie Hai
2024-06-03  2:33                         ` Stephen Hemminger
2024-06-03  5:24                           ` Morten Brørup
2024-06-03  7:07                           ` Andrew Rybchenko [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4e15c607-9aef-4c33-91ff-4d6784ea6b4d@oktetlabs.ru \
    --to=andrew.rybchenko@oktetlabs.ru \
    --cc=ajit.khaparde@broadcom.com \
    --cc=amitbern@amazon.com \
    --cc=andrew.boyer@amd.com \
    --cc=apeksha.gupta@nxp.com \
    --cc=atrwajee@amazon.com \
    --cc=bruce.richardson@intel.com \
    --cc=chaoyong.he@corigine.com \
    --cc=chas3@att.com \
    --cc=chenbox@nvidia.com \
    --cc=ciara.loftus@intel.com \
    --cc=cloud.wangxiaoyun@huawei.com \
    --cc=cristian.dumitrescu@intel.com \
    --cc=dev@dpdk.org \
    --cc=dsinghrawat@marvell.com \
    --cc=dsosnowski@nvidia.com \
    --cc=ed.czeck@atomicrules.com \
    --cc=evgenys@amazon.com \
    --cc=fengchengwen@huawei.com \
    --cc=ferruh.yigit@amd.com \
    --cc=g.singh@nxp.com \
    --cc=grive@u256.net \
    --cc=haijie1@huawei.com \
    --cc=hemant.agrawal@nxp.com \
    --cc=hkalra@marvell.com \
    --cc=huangdengdui@huawei.com \
    --cc=humin29@huawei.com \
    --cc=hyonkim@cisco.com \
    --cc=irusskikh@marvell.com \
    --cc=jbehrens@vmware.com \
    --cc=jerinj@marvell.com \
    --cc=jeroendb@google.com \
    --cc=jgrajcia@cisco.com \
    --cc=jianwang@trustnetic.com \
    --cc=jiawenwu@trustnetic.com \
    --cc=jingjing.wu@intel.com \
    --cc=john.miller@atomicrules.com \
    --cc=johndale@cisco.com \
    --cc=joshwash@google.com \
    --cc=julien_dpdk@jaube.fr \
    --cc=kirankumark@marvell.com \
    --cc=lihuisong@huawei.com \
    --cc=linville@tuxdriver.com \
    --cc=lironh@marvell.com \
    --cc=liuyonglong@huawei.com \
    --cc=longli@microsoft.com \
    --cc=matan@nvidia.com \
    --cc=matt.peters@windriver.com \
    --cc=maxime.coquelin@redhat.com \
    --cc=mczekaj@marvell.com \
    --cc=mtetsuyah@gmail.com \
    --cc=ndabilpuram@marvell.com \
    --cc=orika@nvidia.com \
    --cc=palok@marvell.com \
    --cc=rahul.lakkireddy@chelsio.com \
    --cc=rbeider@amazon.com \
    --cc=rosen.xu@intel.com \
    --cc=rushilg@google.com \
    --cc=sachin.saxena@nxp.com \
    --cc=selwin.sebastian@amd.com \
    --cc=shaibran@amazon.com \
    --cc=shepard.siegel@atomicrules.com \
    --cc=skori@marvell.com \
    --cc=skoteshwar@marvell.com \
    --cc=somnath.kotur@broadcom.com \
    --cc=spinler@cesnet.cz \
    --cc=stephen@networkplumber.org \
    --cc=steven.webster@windriver.com \
    --cc=suanmingm@nvidia.com \
    --cc=vattunuru@marvell.com \
    --cc=viacheslavo@nvidia.com \
    --cc=xuanziyang2@huawei.com \
    --cc=yisen.zhuang@huawei.com \
    --cc=yuying.zhang@intel.com \
    --cc=zhouguoyang@huawei.com \
    --cc=zr@semihalf.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).