From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 0ED4A42B2A; Wed, 17 May 2023 11:18:50 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id DE4B84114A; Wed, 17 May 2023 11:18:49 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id C8E3D4067B for ; Wed, 17 May 2023 11:18:48 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1684315128; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vQ/oWwXuT62SAt6NmzcJA/HCBiGdLkBfDPQWwlOkREs=; b=T9EU03DoY3MvCLZ/V9ThtFDj3jv1T+gTKlpkEoKpoOZ1d7DSStUcOzkKrZz4puVkJBNffd mIoeqF3rhpiPO9ClVh9il84CIctR2kU55F8H6q895WUzOPLs4n/gwC27g1piGozgxjRShJ Ei8SeY8kNpCrUD/AyAdI3fTNQPWwQbQ= Received: from mail-ed1-f70.google.com (mail-ed1-f70.google.com [209.85.208.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-145-YcpjHO18Ox6fyR9XkHo_5A-1; Wed, 17 May 2023 05:18:47 -0400 X-MC-Unique: YcpjHO18Ox6fyR9XkHo_5A-1 Received: by mail-ed1-f70.google.com with SMTP id 4fb4d7f45d1cf-50bc1a01cffso571410a12.3 for ; Wed, 17 May 2023 02:18:46 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684315126; x=1686907126; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=vNuks3pXPrcqt8yKxsxOa0+aeSVxQ7jz/2u1Ht3gkxw=; b=ANtg+/L4hjtTL9E1jFq8aFtI6Oh/JJsenOKG6LxGFisM7i/ENBQ/VZ2/TJh1P3nc0+ YDfqBletXCxRoov5V2W3ZuCeAP5YQbWr/cw47u2X0NJ2uT7PzsdvDdjZFEhTi8NuoU1g q8js0qEsINeNz+ntpNW+ZhWKBwmgUWPAwNxdKIRZ/rYKXdT+5BRpylLN5OmJuh01cwIC tDnBp3SEAUMtG9U94jSdiXjN/Kc+Cnm2fMf6xzK1bVMsM3I9ybw/pq2LR9WmyRtW8BOP WcWX0zsJ7TdR0lMaesDr/HWrR8QfHrGl8QUXHBlV5NKXK7G8fo8iEm0wScAeVdHUrBlD B+Yw== X-Gm-Message-State: AC+VfDwGiaRu7wC+rpDgIkm4gm0RdlplVog8XD0+KEYuw4QFp29pFhgB 1rf0f9UlBxpplGa6cM6B59fz8wsjP0gXdv9ErzCYATPvHt1e9TAdTnmHrcGPM7crqQdaWMNMjxq 2CBM= X-Received: by 2002:a17:906:974b:b0:965:ff38:2fb3 with SMTP id o11-20020a170906974b00b00965ff382fb3mr38150107ejy.74.1684315125920; Wed, 17 May 2023 02:18:45 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6yQKq4Xhwt9g7SUKWWk080QANGEeZKeQ0wH08aACcfD2xDLH7koHBCN9COmX4gDqzGwnbFVw== X-Received: by 2002:a17:906:974b:b0:965:ff38:2fb3 with SMTP id o11-20020a170906974b00b00965ff382fb3mr38150088ejy.74.1684315125536; Wed, 17 May 2023 02:18:45 -0700 (PDT) Received: from [10.39.192.205] (5920ab7b.static.cust.trined.nl. [89.32.171.123]) by smtp.gmail.com with ESMTPSA id ty20-20020a170907c71400b0096621340285sm11938930ejc.198.2023.05.17.02.18.44 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 17 May 2023 02:18:44 -0700 (PDT) From: Eelco Chaudron To: David Marchand Cc: maxime.coquelin@redhat.com, chenbo.xia@intel.com, dev@dpdk.org Subject: Re: [PATCH v2 3/3] vhost: add device op to offload the interrupt kick Date: Wed, 17 May 2023 11:18:43 +0200 X-Mailer: MailMate (1.14r5964) Message-ID: <3063BCE0-6063-4C93-A5A5-B66061C373B6@redhat.com> In-Reply-To: <0E248797-C6C2-4C5A-B96F-1693B9B2091C@redhat.com> References: <168069838578.833254.4856666346839028593.stgit@ebuild.local> <168069850278.833254.17836553051894685658.stgit@ebuild.local> <683D0767-DCE8-45F4-B904-CD2DD574DE53@redhat.com> <0E248797-C6C2-4C5A-B96F-1693B9B2091C@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On 16 May 2023, at 13:36, Eelco Chaudron wrote: > On 16 May 2023, at 12:12, David Marchand wrote: > >> On Tue, May 16, 2023 at 10:53=E2=80=AFAM Eelco Chaudron wrote: >>> On 10 May 2023, at 13:44, David Marchand wrote: >> >> [snip] >> >>>>> @@ -846,6 +848,14 @@ vhost_user_socket_mem_free(struct vhost_user_soc= ket *vsocket) >>>>> vsocket->path =3D NULL; >>>>> } >>>>> >>>>> + if (vsocket && vsocket->alloc_notify_ops) { >>>>> +#pragma GCC diagnostic push >>>>> +#pragma GCC diagnostic ignored "-Wcast-qual" >>>>> + free((struct rte_vhost_device_ops *)vsocket->notify_o= ps); >>>>> +#pragma GCC diagnostic pop >>>>> + vsocket->notify_ops =3D NULL; >>>>> + } >>>> >>>> Rather than select the behavior based on a boolean (and here force the >>>> compiler to close its eyes), I would instead add a non const pointer >>>> to ops (let's say alloc_notify_ops) in vhost_user_socket. >>>> The code can then unconditionnally call free(vsocket->alloc_notify_ops= ); >>> >>> Good idea, I will make the change in v3. >> >> Feel free to use a better name for this field :-). >> >>> >>>>> + >>>>> if (vsocket) { >>>>> free(vsocket); >>>>> vsocket =3D NULL; >> >> [snip] >> >>>>> + /* >>>>> + * Although the ops structure is a const structure, we do nee= d to >>>>> + * override the guest_notify operation. This is because with = the >>>>> + * previous APIs it was "reserved" and if any garbage value w= as passed, >>>>> + * it could crash the application. >>>>> + */ >>>>> + if (ops && !ops->guest_notify) { >>>> >>>> Hum, as described in the comment above, I don't think we should look >>>> at ops->guest_notify value at all. >>>> Checking ops !=3D NULL should be enough. >>> >>> Not sure I get you here. If the guest_notify passed by the user is NULL= , it means the previously =E2=80=98reserved[1]=E2=80=99 field is NULL, so w= e do not need to use a new structure. >>> >>> I guess your comment would be true if we would introduce a new field in= the data structure, not replacing a reserved one. >> >> Hum, I don't understand my comment either o_O'. >> Too many days off... or maybe my evil twin took over the keyboard. >> >> >>> >>>>> + struct rte_vhost_device_ops *new_ops; >>>>> + >>>>> + new_ops =3D malloc(sizeof(*new_ops)); >>>> >>>> Strictly speaking, we lose the numa affinity of "ops" by calling mallo= c. >>>> I am unclear of the impact though. >>> >>> Don=E2=80=99t think there is a portable API that we can use to determin= e the NUMA for the ops memory and then allocate this on the same numa? >>> >>> Any thoughts or ideas on how to solve this? I hope most people will mem= set() the ops structure and the reserved[1] part is zero, but it might be a= problem in the future if more extensions get added. >> >> Determinining current numa is doable, via 'ops' >> get_mempolicy(MPOL_F_NODE | MPOL_F_ADDR), like what is done for vq in >> numa_realloc(). >> The problem is how to allocate on this numa with the libc allocator >> for which I have no idea... >> We could go with the dpdk allocator (again, like numa_realloc()). >> >> >> In practice, the passed ops will be probably from a const variable in >> the program .data section (for which I think fields are set to 0 >> unless explicitly initialised), or a memset() will be called for a >> dynamic allocation from good citizens. >> So we can probably live with the current proposal. >> Plus, this is only for one release, since in 23.11 with the ABI bump, >> we will drop this compat code. >> >> Maxime, Chenbo, what do you think? > > Wait for their response, but for now I assume we can just keep the numa u= naware malloc(). > >> >> [snip] >> >>>> >>>> But putting indentation aside, is this change equivalent? >>>> - if ((vhost_need_event(vhost_used_event(vq), new, old) = && >>>> - (vq->callfd >=3D 0)) || >>>> - unlikely(!signalled_used_valid)) { >>>> + if ((vhost_need_event(vhost_used_event(vq), new, old) = || >>>> + unlikely(!signalled_used_valid)) && >>>> + vq->callfd >=3D 0) { >>> >>> They are not equal, but in the past eventfd_write() should also not hav= e been called with callfd < 0, guess this was an existing bug ;) >> >> I think this should be a separate fix. > > ACK, will add a separate patch in this series to fix it. FYI I sent out the v3 patch. //Eelco