From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id D8F1142B22; Tue, 16 May 2023 12:12:31 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 67DE940C35; Tue, 16 May 2023 12:12:31 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id 068024068E for ; Tue, 16 May 2023 12:12:29 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1684231949; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WoPtkO4hPMHw1LDp1JdSpTPH5C3W0c3YWly9tYlg9RM=; b=TBl47Z/MRLRQahKZVh7DBWKKVF5qTMH0WGgW5iSxsXWbeqVeHtZ7+VP+YP8Fr6xhDGUUgQ m+EVrX5Tn4BzTymxtOVHNdIMH/e2aMs9mEpAaHeGsx+7Zs+7EDhRcUZtdqFegeRt4fDSZ6 N5VMAneCNpzDit4rih/ynkOTqVAm64M= Received: from mail-pl1-f200.google.com (mail-pl1-f200.google.com [209.85.214.200]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-456-a1Bpa0k0OuaZnDKgrDzLog-1; Tue, 16 May 2023 06:12:28 -0400 X-MC-Unique: a1Bpa0k0OuaZnDKgrDzLog-1 Received: by mail-pl1-f200.google.com with SMTP id d9443c01a7336-1ae3f6df2bfso2399425ad.1 for ; Tue, 16 May 2023 03:12:28 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684231947; x=1686823947; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=WoPtkO4hPMHw1LDp1JdSpTPH5C3W0c3YWly9tYlg9RM=; b=gApu/yR0QsjXYs7rMOS1kccsgsIqoqwrq4bsUU7y8+SMekBnoHMTuhWK0Jb4vFVtbc cgVe23KnfgCBe8DWNZ3y2VOJHnZVsvzBD+ae3HdWEtvqJjCEQRbGPNziUO5TPQBqi3BA dOxfwRA1/t6/z8mTQvSpzDA1hHJUKNQwnuRMiBtcalitubRCXoSE0UctpsZdREY0qjVp Ws3bDOW2ngShikOI3169FQAzv8PFaYxFuyGD4y9S/DgFFu5DIJuiGt6UXcAW5jyusFyY 3hNhOX6EQG+gsMhVeLcpI4VCzDpDrxRq5h5f4XQdUEPRoxuR4HzFqsdffo7a8U/tgBBb GjXg== X-Gm-Message-State: AC+VfDym0IhX7NohxaerzVgeqVVI9flQyHXLb7upAq5yGRvRO3lHDbpF LD1i0nkWJMaACNY7fj9/3RLfyZj5jKGiODhuLBBk9c5pKETF7fPTIKd229WcTkOh0sqmbHn/Ktg ooC97/D+RxlHUxVgL56o= X-Received: by 2002:a17:903:249:b0:1ae:3214:1629 with SMTP id j9-20020a170903024900b001ae32141629mr2907480plh.45.1684231947238; Tue, 16 May 2023 03:12:27 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5tYAvQSVjfRWrU9KmCLeh8vm2dBTRPpIPB1eIw1yTb2lLNzYzznRdK1C2Ft+bMYk1IQ5xABho6N05CkYXuAJ8= X-Received: by 2002:a17:903:249:b0:1ae:3214:1629 with SMTP id j9-20020a170903024900b001ae32141629mr2907456plh.45.1684231946882; Tue, 16 May 2023 03:12:26 -0700 (PDT) MIME-Version: 1.0 References: <168069838578.833254.4856666346839028593.stgit@ebuild.local> <168069850278.833254.17836553051894685658.stgit@ebuild.local> <683D0767-DCE8-45F4-B904-CD2DD574DE53@redhat.com> In-Reply-To: <683D0767-DCE8-45F4-B904-CD2DD574DE53@redhat.com> From: David Marchand Date: Tue, 16 May 2023 12:12:15 +0200 Message-ID: Subject: Re: [PATCH v2 3/3] vhost: add device op to offload the interrupt kick To: Eelco Chaudron , maxime.coquelin@redhat.com, chenbo.xia@intel.com Cc: dev@dpdk.org X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Tue, May 16, 2023 at 10:53=E2=80=AFAM Eelco Chaudron wrote: > On 10 May 2023, at 13:44, David Marchand wrote: [snip] > >> @@ -846,6 +848,14 @@ vhost_user_socket_mem_free(struct vhost_user_sock= et *vsocket) > >> vsocket->path =3D NULL; > >> } > >> > >> + if (vsocket && vsocket->alloc_notify_ops) { > >> +#pragma GCC diagnostic push > >> +#pragma GCC diagnostic ignored "-Wcast-qual" > >> + free((struct rte_vhost_device_ops *)vsocket->notify_op= s); > >> +#pragma GCC diagnostic pop > >> + vsocket->notify_ops =3D NULL; > >> + } > > > > Rather than select the behavior based on a boolean (and here force the > > compiler to close its eyes), I would instead add a non const pointer > > to ops (let's say alloc_notify_ops) in vhost_user_socket. > > The code can then unconditionnally call free(vsocket->alloc_notify_ops)= ; > > Good idea, I will make the change in v3. Feel free to use a better name for this field :-). > > >> + > >> if (vsocket) { > >> free(vsocket); > >> vsocket =3D NULL; [snip] > >> + /* > >> + * Although the ops structure is a const structure, we do need= to > >> + * override the guest_notify operation. This is because with t= he > >> + * previous APIs it was "reserved" and if any garbage value wa= s passed, > >> + * it could crash the application. > >> + */ > >> + if (ops && !ops->guest_notify) { > > > > Hum, as described in the comment above, I don't think we should look > > at ops->guest_notify value at all. > > Checking ops !=3D NULL should be enough. > > Not sure I get you here. If the guest_notify passed by the user is NULL, = it means the previously =E2=80=98reserved[1]=E2=80=99 field is NULL, so we = do not need to use a new structure. > > I guess your comment would be true if we would introduce a new field in t= he data structure, not replacing a reserved one. Hum, I don't understand my comment either o_O'. Too many days off... or maybe my evil twin took over the keyboard. > > >> + struct rte_vhost_device_ops *new_ops; > >> + > >> + new_ops =3D malloc(sizeof(*new_ops)); > > > > Strictly speaking, we lose the numa affinity of "ops" by calling malloc= . > > I am unclear of the impact though. > > Don=E2=80=99t think there is a portable API that we can use to determine = the NUMA for the ops memory and then allocate this on the same numa? > > Any thoughts or ideas on how to solve this? I hope most people will memse= t() the ops structure and the reserved[1] part is zero, but it might be a p= roblem in the future if more extensions get added. Determinining current numa is doable, via 'ops' get_mempolicy(MPOL_F_NODE | MPOL_F_ADDR), like what is done for vq in numa_realloc(). The problem is how to allocate on this numa with the libc allocator for which I have no idea... We could go with the dpdk allocator (again, like numa_realloc()). In practice, the passed ops will be probably from a const variable in the program .data section (for which I think fields are set to 0 unless explicitly initialised), or a memset() will be called for a dynamic allocation from good citizens. So we can probably live with the current proposal. Plus, this is only for one release, since in 23.11 with the ABI bump, we will drop this compat code. Maxime, Chenbo, what do you think? [snip] > > > > But putting indentation aside, is this change equivalent? > > - if ((vhost_need_event(vhost_used_event(vq), new, old) &= & > > - (vq->callfd >=3D 0)) || > > - unlikely(!signalled_used_valid)) { > > + if ((vhost_need_event(vhost_used_event(vq), new, old) |= | > > + unlikely(!signalled_used_valid)) && > > + vq->callfd >=3D 0) { > > They are not equal, but in the past eventfd_write() should also not have = been called with callfd < 0, guess this was an existing bug ;) I think this should be a separate fix. > > >> + vhost_vring_inject_irq(dev, vq); --=20 David Marchand