DPDK patches and discussions
 help / color / mirror / Atom feed
From: Elad Nachman <eladv6@gmail.com>
To: Eric Christian <erclists@gmail.com>
Cc: Ferruh Yigit <ferruh.yigit@intel.com>, dev <dev@dpdk.org>,
	 Igor Ryzhov <iryzhov@nfware.com>
Subject: Re: [dpdk-dev] [PATCH v2] kni: Fix request overwritten
Date: Mon, 4 Oct 2021 21:27:35 +0300	[thread overview]
Message-ID: <CACXF7qkoC-8EkYY7WrAqAK7pUDUYZV4yONc-VZVUr-zXtnfdFg@mail.gmail.com> (raw)
In-Reply-To: <CAFbSC55tJhKD0KzMbpbxeDwAj=SKj=vZJcm6Zf0dWCQL7AY=pQ@mail.gmail.com>

בתאריך יום ב׳, 4 באוק׳ 2021, 20:00, מאת Eric Christian ‏<erclists@gmail.com
>:

> I am not sure that only we can recreate the KNI request overwrite.  We may
> be the only ones with a current use case that exposes the vulnerability.
>  It is possible for any KNI operation to encounter this issue with the new
> async mechanism.  As long as the call to kni_net_process_request() is a
> separate thread from rte_kni_handle_request() this has the potential to
> occur with the use of async requests.
>
> All you need is one async KNI request followed closely by a second KNI
> request before the rte_kni_handle_request() has had a chance to process the
> first request.
>
> The kernel dev driver simply returns the error value back to the caller if
> it is less than zero.
>
>
> Eric
>

What I did in order to attempt to recreate this issue was:

Create a qemu VM with four cores.
Run the KNI sample application.

Then:

1. Run ifconfig vEth0_0 down followed immediately by ifconfig vEth0_0 up

2. Same as above but in a sample userspace C application which calls ioctl
twice in a row to achieve the same.


Both did not recreate the problem.

The only way to recreate it IMHO is to either:
 run the ioctl from two parallel threads or perhaps assign RT fifo priority
to the application calling ioctl in a row.

Elad


>
> On Mon, Oct 4, 2021 at 12:19 PM Elad Nachman <eladv6@gmail.com> wrote:
>
>>
>>
>> On Mon, Oct 4, 2021 at 7:05 PM Ferruh Yigit <ferruh.yigit@intel.com>
>> wrote:
>>
>>> On 10/4/2021 3:58 PM, Elad Nachman wrote:
>>> > בתאריך יום ב׳, 4 באוק׳ 2021, 17:51, מאת Ferruh Yigit ‏<
>>> > ferruh.yigit@intel.com>:
>>> >
>>> >> On 10/4/2021 3:25 PM, Elad Nachman wrote:
>>> >>
>>> >> Can you please try to not top post, it will make impossible to follow
>>> this
>>> >> discussion later from the mail archives.
>>> >>
>>> >>> 1. Userspace will get an error
>>> >>
>>> >> So there is nothing special with returning '-EAGAIN', user will only
>>> >> observe an
>>> >> error.
>>> >> Wasn't initial intention to use '-EAGAIN' to try request again?
>>> >>
>>> > To signal user-space to retry the operation.
>>> >
>>>
>>> Not sure if it will reach to the end user. If user is calling "ifconfig
>>> <iface>
>>> down", it will just fail right, it won't recognize the error type.
>>>
>>> Unless this is common usage by the Linux network drivers, having this
>>> usage in
>>> KNI won't help much. I am for handling this in the kernel side if we can.
>>>
>>>
>> If user calls ifconfig <iface> down it will not happen. It requires some
>> multi-core race condition only Eric can recreate.
>>
>>
>>> >>
>>> >>> 2. Waiting with rtnl locked causes a deadlock; waiting with rtnl
>>> unlocked
>>> >>> for interface down command causes a crash because of a race
>>> condition in
>>> >>> the device delete/unregister list in the kernel.
>>> >>>
>>> >>
>>> >> Why waiting with rthnl lock causes a deadlock? As said below we are
>>> already
>>> >> doing it, why it is different with retry logic?
>>> >>
>>> > Because it can be interface down request.
>>> >
>>>
>>> (sure you like short answers)
>>>
>>> Please help me to see why "interface down" is special. Isn't it point of
>>> your
>>> patch to wait the request execution in the userspace even it is an async
>>> request?
>>>
>>> And yet again, number of retry can be limited.
>>>
>>>
>> No, it is not. Please look again:
>> https://patches.dpdk.org/project/dpdk/patch/20210924105409.21711-1-eladv6@gmail.com/
>>
>>
>>
>>>
>>> >
>>> >> I agree to not wait with rtnl unlocked.
>>> >>
>>> >>> FYI,
>>> >>>
>>> >>> Elad.
>>> >>>
>>> >>> בתאריך יום ב׳, 4 באוק׳ 2021, 17:13, מאת Ferruh Yigit ‏<
>>> >>> ferruh.yigit@intel.com>:
>>> >>>
>>> >>>> On 10/4/2021 2:09 PM, Elad Nachman wrote:
>>> >>>>> Hi,
>>> >>>>>
>>> >>>>> EAGAIN is propogated back to the kernel and to the caller.
>>> >>>>>
>>> >>>>
>>> >>>> So will the user get an error, or it will be handled by the kernel
>>> and
>>> >>>> retried?
>>> >>>>
>>> >>>>> We cannot retry from the kni kernel module since we hold the rtnl
>>> lock.
>>> >>>>>
>>> >>>>
>>> >>>> Why not? We are already waiting until a command time out, like
>>> >>>> 'kni_net_open()'
>>> >>>> can retry if 'kni_net_process_request()' returns '-EAGAIN'. And we
>>> can
>>> >>>> limit the
>>> >>>> number of retry for safety.
>>> >>>>
>>> >>>>> FYI,
>>> >>>>>
>>> >>>>> Elad
>>> >>>>>
>>> >>>>> בתאריך יום ב׳, 4 באוק׳ 2021, 16:05, מאת Ferruh Yigit ‏<
>>> >>>>> ferruh.yigit@intel.com>:
>>> >>>>>
>>> >>>>>> On 9/24/2021 11:54 AM, Elad Nachman wrote:
>>> >>>>>>> Fix lack of multiple KNI requests handling support by
>>> introducing a
>>> >>>>>>> request in progress flag which will fail additional requests with
>>> >>>>>>> EAGAIN return code if the original request has not been processed
>>> >>>>>>> by user-space.
>>> >>>>>>>
>>> >>>>>>> Bugzilla ID: 809
>>> >>>>>>
>>> >>>>>> Hi Eric,
>>> >>>>>>
>>> >>>>>> Can you please test this patch, if it solves the issue you
>>> reported?
>>> >>>>>>
>>> >>>>>>>
>>> >>>>>>> Signed-off-by: Elad Nachman <eladv6@gmail.com>
>>> >>>>>>> ---
>>> >>>>>>>  kernel/linux/kni/kni_net.c | 9 +++++++++
>>> >>>>>>>  lib/kni/rte_kni.c          | 2 ++
>>> >>>>>>>  lib/kni/rte_kni_common.h   | 1 +
>>> >>>>>>>  3 files changed, 12 insertions(+)
>>> >>>>>>>
>>> >>>>>>
>>> >>>>>> <...>
>>> >>>>>>
>>> >>>>>>> @@ -123,7 +124,15 @@ kni_net_process_request(struct net_device
>>> *dev,
>>> >>>>>> struct rte_kni_request *req)
>>> >>>>>>>
>>> >>>>>>>       mutex_lock(&kni->sync_lock);
>>> >>>>>>>
>>> >>>>>>> +     /* Check that existing request has been processed: */
>>> >>>>>>> +     cur_req = (struct rte_kni_request *)kni->sync_kva;
>>> >>>>>>> +     if (cur_req->req_in_progress) {
>>> >>>>>>> +             ret = -EAGAIN;
>>> >>>>>>
>>> >>>>>> Overall logic in the KNI looks good to me, this helps to
>>> serialize the
>>> >>>>>> requests
>>> >>>>>> even for async ones.
>>> >>>>>>
>>> >>>>>> But can you please clarify how it behaves in the kernel side with
>>> >>>> '-EAGAIN'
>>> >>>>>> return type? Will linux call the ndo again, or will it just fail.
>>> >>>>>>
>>> >>>>>> If it just fails should we handle the re-try on '-EAGAIN' within
>>> the
>>> >> kni
>>> >>>>>> module?
>>> >>>>>>
>>> >>>>>>
>>> >>>>
>>> >>>>
>>> >>
>>> >> Elad.
>>>
>>>

  reply	other threads:[~2021-10-04 18:27 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-24 10:54 Elad Nachman
2021-10-04 13:01 ` Ferruh Yigit
2021-10-04 13:09   ` Elad Nachman
2021-10-04 14:03     ` Ferruh Yigit
2021-10-04 14:25       ` Elad Nachman
2021-10-04 14:51         ` Ferruh Yigit
2021-10-04 14:58           ` Elad Nachman
2021-10-04 15:48             ` Ferruh Yigit
2021-10-04 16:18               ` Elad Nachman
2021-10-04 16:59                 ` Eric Christian
2021-10-04 18:27                   ` Elad Nachman [this message]
2021-10-08 20:23                     ` Ferruh Yigit
2021-10-08 21:11                 ` Ferruh Yigit
2021-10-04 14:14   ` Eric Christian
2021-10-04 14:56     ` Ferruh Yigit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACXF7qkoC-8EkYY7WrAqAK7pUDUYZV4yONc-VZVUr-zXtnfdFg@mail.gmail.com \
    --to=eladv6@gmail.com \
    --cc=dev@dpdk.org \
    --cc=erclists@gmail.com \
    --cc=ferruh.yigit@intel.com \
    --cc=iryzhov@nfware.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).