From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 9B4D142B22; Tue, 16 May 2023 14:07:59 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 35464410EE; Tue, 16 May 2023 14:07:59 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id 06D9240A8B for ; Tue, 16 May 2023 14:07:57 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1684238877; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zb8f11arzYLc5cu32SnRa1ahetSf1bFEP46Lv4B8LtE=; b=f81RV0Mvz99bNYQ+IfJbVtXAvXv4yTrHVZJxZyLAxlZr7kU9azZXCNi5k4pnRFwtlzLSze OHDCnnc7M45m0S/gy40HJb5XNWQKzKHTxMvf8/N0wnnEUQDsLZvgSRSnCecEwOulVKpoeU K0AR+mlRuc61FneuvW500XJDIWYfWAQ= Received: from mail-ed1-f71.google.com (mail-ed1-f71.google.com [209.85.208.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-489-eJyjRiuFOwa0LLMmH4uu-g-1; Tue, 16 May 2023 08:07:56 -0400 X-MC-Unique: eJyjRiuFOwa0LLMmH4uu-g-1 Received: by mail-ed1-f71.google.com with SMTP id 4fb4d7f45d1cf-510526d2a5fso5053384a12.0 for ; Tue, 16 May 2023 05:07:56 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684238874; x=1686830874; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=XyAXi6x8aszqh12UrKORWcGpdd4ooyyAm/BCDR10PMo=; b=OQyPmyjn3msmoOlBm/o9/7CRyPQDirl/tU7vUcGRlggjafdXnwEsjJkQOEnE60lCOG ZFJIV8PMtbznKZ94G/J95IDA82ypGcNkJ6KnzxziTm/ChQsCHjDW96S7v+yc+lUdmVHi yGrk9aktDYDZT0MDHl4cTQSQiiaXr9YEOROlTB0vfuHDTqnrHsIKXzGD15zuwUo1/b3+ Dt2oa+1eOWN2IUGn+Ykn5aVeJFY+UO3me/RwCGl7vc9EsqJJC3BWJo0V84tPwv7BuAG9 p9ejmf9L+EDW+j4TUBKbNON/rK30843O0ZUY4sxsKq+STXYQiJq4loBeSZGgsB5iCOzd S0Cw== X-Gm-Message-State: AC+VfDwqp4hIc2hvnyM9jo4RcnlwlURwNwPIWR0Kr1SE7xUPbhkIOQE5 vGLI1C06M9EHKQRdy9LNEaO8/chKvJ4qEkqczcwCPQV2GFMwsJOpcK/cYmLeKqMH33uTdjQ8Gti 5m/w= X-Received: by 2002:a17:907:9306:b0:932:f88c:c2ff with SMTP id bu6-20020a170907930600b00932f88cc2ffmr33240666ejc.34.1684238873924; Tue, 16 May 2023 05:07:53 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7qJHlVmn1wHTsjysR6zW5kmvptWa/6TO8gxyaKVf84Bantn5YRd+b77BCx4+mxp9BnWIJMKA== X-Received: by 2002:a17:907:9306:b0:932:f88c:c2ff with SMTP id bu6-20020a170907930600b00932f88cc2ffmr33240637ejc.34.1684238873469; Tue, 16 May 2023 05:07:53 -0700 (PDT) Received: from [10.39.192.202] (5920ab7b.static.cust.trined.nl. [89.32.171.123]) by smtp.gmail.com with ESMTPSA id gv15-20020a1709072bcf00b0095337c5da35sm11128680ejc.15.2023.05.16.05.07.52 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 16 May 2023 05:07:52 -0700 (PDT) From: Eelco Chaudron To: Maxime Coquelin Cc: David Marchand , chenbo.xia@intel.com, dev@dpdk.org Subject: Re: [PATCH v2 3/3] vhost: add device op to offload the interrupt kick Date: Tue, 16 May 2023 14:07:51 +0200 X-Mailer: MailMate (1.14r5964) Message-ID: <3AC41F01-31B6-4BE0-98E4-265ACD17D4BF@redhat.com> In-Reply-To: References: <168069838578.833254.4856666346839028593.stgit@ebuild.local> <168069850278.833254.17836553051894685658.stgit@ebuild.local> <683D0767-DCE8-45F4-B904-CD2DD574DE53@redhat.com> <0E248797-C6C2-4C5A-B96F-1693B9B2091C@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On 16 May 2023, at 13:45, Maxime Coquelin wrote: > On 5/16/23 13:36, Eelco Chaudron wrote: >> >> >> On 16 May 2023, at 12:12, David Marchand wrote: >> >>> On Tue, May 16, 2023 at 10:53=E2=80=AFAM Eelco Chaudron wrote: >>>> On 10 May 2023, at 13:44, David Marchand wrote: >>> >>> [snip] >>> >>>>>> @@ -846,6 +848,14 @@ vhost_user_socket_mem_free(struct vhost_user_so= cket *vsocket) >>>>>> vsocket->path =3D NULL; >>>>>> } >>>>>> >>>>>> + if (vsocket && vsocket->alloc_notify_ops) { >>>>>> +#pragma GCC diagnostic push >>>>>> +#pragma GCC diagnostic ignored "-Wcast-qual" >>>>>> + free((struct rte_vhost_device_ops *)vsocket->notify_= ops); >>>>>> +#pragma GCC diagnostic pop >>>>>> + vsocket->notify_ops =3D NULL; >>>>>> + } >>>>> >>>>> Rather than select the behavior based on a boolean (and here force th= e >>>>> compiler to close its eyes), I would instead add a non const pointer >>>>> to ops (let's say alloc_notify_ops) in vhost_user_socket. >>>>> The code can then unconditionnally call free(vsocket->alloc_notify_op= s); >>>> >>>> Good idea, I will make the change in v3. >>> >>> Feel free to use a better name for this field :-). >>> >>>> >>>>>> + >>>>>> if (vsocket) { >>>>>> free(vsocket); >>>>>> vsocket =3D NULL; >>> >>> [snip] >>> >>>>>> + /* >>>>>> + * Although the ops structure is a const structure, we do ne= ed to >>>>>> + * override the guest_notify operation. This is because with= the >>>>>> + * previous APIs it was "reserved" and if any garbage value = was passed, >>>>>> + * it could crash the application. >>>>>> + */ >>>>>> + if (ops && !ops->guest_notify) { >>>>> >>>>> Hum, as described in the comment above, I don't think we should look >>>>> at ops->guest_notify value at all. >>>>> Checking ops !=3D NULL should be enough. >>>> >>>> Not sure I get you here. If the guest_notify passed by the user is NUL= L, it means the previously =E2=80=98reserved[1]=E2=80=99 field is NULL, so = we do not need to use a new structure. >>>> >>>> I guess your comment would be true if we would introduce a new field i= n the data structure, not replacing a reserved one. >>> >>> Hum, I don't understand my comment either o_O'. >>> Too many days off... or maybe my evil twin took over the keyboard. >>> >>> >>>> >>>>>> + struct rte_vhost_device_ops *new_ops; >>>>>> + >>>>>> + new_ops =3D malloc(sizeof(*new_ops)); >>>>> >>>>> Strictly speaking, we lose the numa affinity of "ops" by calling mall= oc. >>>>> I am unclear of the impact though. >>>> >>>> Don=E2=80=99t think there is a portable API that we can use to determi= ne the NUMA for the ops memory and then allocate this on the same numa? >>>> >>>> Any thoughts or ideas on how to solve this? I hope most people will me= mset() the ops structure and the reserved[1] part is zero, but it might be = a problem in the future if more extensions get added. >>> >>> Determinining current numa is doable, via 'ops' >>> get_mempolicy(MPOL_F_NODE | MPOL_F_ADDR), like what is done for vq in >>> numa_realloc(). >>> The problem is how to allocate on this numa with the libc allocator >>> for which I have no idea... >>> We could go with the dpdk allocator (again, like numa_realloc()). >>> >>> >>> In practice, the passed ops will be probably from a const variable in >>> the program .data section (for which I think fields are set to 0 >>> unless explicitly initialised), or a memset() will be called for a >>> dynamic allocation from good citizens. >>> So we can probably live with the current proposal. >>> Plus, this is only for one release, since in 23.11 with the ABI bump, >>> we will drop this compat code. >>> >>> Maxime, Chenbo, what do you think? >> >> Wait for their response, but for now I assume we can just keep the numa = unaware malloc(). > > Let's keep it as is as we'll get rid of it in 23.11. Thanks for confirming. >>> >>> [snip] >>> >>>>> >>>>> But putting indentation aside, is this change equivalent? >>>>> - if ((vhost_need_event(vhost_used_event(vq), new, old)= && >>>>> - (vq->callfd >=3D 0)) || >>>>> - unlikely(!signalled_used_valid)) { >>>>> + if ((vhost_need_event(vhost_used_event(vq), new, old)= || >>>>> + unlikely(!signalled_used_valid)) && >>>>> + vq->callfd >=3D 0) { >>>> >>>> They are not equal, but in the past eventfd_write() should also not ha= ve been called with callfd < 0, guess this was an existing bug ;) >>> >>> I think this should be a separate fix. >> >> ACK, will add a separate patch in this series to fix it. > > I also caught & fixed it while implementing my VDUSE series [0]. > You can pick it in your series, and I will rebase my series on top of > it. Thanks for the details I=E2=80=99ll include your patch in my series. I will send out a new revision soon (after testing the changes with OVS). Thanks, Eelco > Thanks, > Maxime > > [0]: https://gitlab.com/mcoquelin/dpdk-next-virtio/-/commit/b976e1f226db5= c09834148847d994045eb89be93 > > >> >>> >>>> >>>>>> + vhost_vring_inject_irq(dev, vq); >>> >>> >>> --=20 >>> David Marchand >>