From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 9B42E42B22; Tue, 16 May 2023 13:36:31 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 793EC40A8B; Tue, 16 May 2023 13:36:31 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id AD31F4068E for ; Tue, 16 May 2023 13:36:30 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1684236990; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WhsMbsrRyYXM9zfX+8LsKzJQHXUT7cEDhdM3/6/5U8E=; b=X6QrlPQyf7xjIWuQ6b2Xeg2okdvCnXAIIz8kuOcRmd2ccgI80xYY8wZ/EZq5ej3OQ8jjUs 6Oqe6bxqL4Bf1wb5gvra9PbpIy3M8hrve3Ly0Kw9VW2PsCFH2+BneEmSdcTp1XnMopuVsg 1TWIi7pJXBA50T47MEak2FoFfeUW+Cs= Received: from mail-ej1-f72.google.com (mail-ej1-f72.google.com [209.85.218.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-121-nB5GTJJ5Ppah99hzIqoI4A-1; Tue, 16 May 2023 07:36:29 -0400 X-MC-Unique: nB5GTJJ5Ppah99hzIqoI4A-1 Received: by mail-ej1-f72.google.com with SMTP id a640c23a62f3a-965fcb08b69so1456681966b.0 for ; Tue, 16 May 2023 04:36:29 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684236987; x=1686828987; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=NdQsNfPZznOBOaVakJBBg9BRkxKZtN26mCfA6NDjsjk=; b=U3Ts5nUXy2z2WJZDGyVSogWZDUzUY9exVHa77b34e8Km0z0KLI7OAjrb9WapHMx2E3 lt/6uXTvrWF56d2+7WeTOFgzbJCOAIYKVxE2XUOPcuR0wE/q+CRz/LMtVQpIVghn7okK qV7TlL33FHClGQ2eXHvuZ8e6kLa2MAZquZCsJwXAavdt5QyIAZfHEL4OnhJNEIAimKrd G2DbSMCpbNtmOEyz/ldqq5J0QCxEITPeR/fJ3HXEgXlLxb8j8wyO/YRL9MZOQ+wvOpAo QC0HnBW++tZHrQao96bKoyhCQaSa9AfCts2Q+369Dr/7dlt3ozcFaRPVGWJxGpJqXR4c ssMw== X-Gm-Message-State: AC+VfDxKowejxCQx/UGq9PqFN60R7sDiBFs2ledMiC82ucLkdTaWST9y T4OKY6kT1uEgZkGnrFfC0Tc+//23FckqUd1GYBBDMuNu8y18ZslMTbBh78bbHTE7UEwRe+1csml 8iP5yWbxRbdg= X-Received: by 2002:a17:907:a4d:b0:95e:de94:5bea with SMTP id be13-20020a1709070a4d00b0095ede945beamr34027164ejc.56.1684236987397; Tue, 16 May 2023 04:36:27 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6g8JnuDmdHx4YzZeFljeqwLk2oyWjnKad7lzQoC9+kP2YNiPWpwVoIKBWGQQW6UUySAVVLBQ== X-Received: by 2002:a17:907:a4d:b0:95e:de94:5bea with SMTP id be13-20020a1709070a4d00b0095ede945beamr34027145ejc.56.1684236987022; Tue, 16 May 2023 04:36:27 -0700 (PDT) Received: from [10.39.192.202] (5920ab7b.static.cust.trined.nl. [89.32.171.123]) by smtp.gmail.com with ESMTPSA id j2-20020a170906534200b0096b4c3489e6sm2344583ejo.177.2023.05.16.04.36.25 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 16 May 2023 04:36:26 -0700 (PDT) From: Eelco Chaudron To: David Marchand Cc: maxime.coquelin@redhat.com, chenbo.xia@intel.com, dev@dpdk.org Subject: Re: [PATCH v2 3/3] vhost: add device op to offload the interrupt kick Date: Tue, 16 May 2023 13:36:25 +0200 X-Mailer: MailMate (1.14r5964) Message-ID: <0E248797-C6C2-4C5A-B96F-1693B9B2091C@redhat.com> In-Reply-To: References: <168069838578.833254.4856666346839028593.stgit@ebuild.local> <168069850278.833254.17836553051894685658.stgit@ebuild.local> <683D0767-DCE8-45F4-B904-CD2DD574DE53@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On 16 May 2023, at 12:12, David Marchand wrote: > On Tue, May 16, 2023 at 10:53=E2=80=AFAM Eelco Chaudron wrote: >> On 10 May 2023, at 13:44, David Marchand wrote: > > [snip] > >>>> @@ -846,6 +848,14 @@ vhost_user_socket_mem_free(struct vhost_user_sock= et *vsocket) >>>> vsocket->path =3D NULL; >>>> } >>>> >>>> + if (vsocket && vsocket->alloc_notify_ops) { >>>> +#pragma GCC diagnostic push >>>> +#pragma GCC diagnostic ignored "-Wcast-qual" >>>> + free((struct rte_vhost_device_ops *)vsocket->notify_op= s); >>>> +#pragma GCC diagnostic pop >>>> + vsocket->notify_ops =3D NULL; >>>> + } >>> >>> Rather than select the behavior based on a boolean (and here force the >>> compiler to close its eyes), I would instead add a non const pointer >>> to ops (let's say alloc_notify_ops) in vhost_user_socket. >>> The code can then unconditionnally call free(vsocket->alloc_notify_ops)= ; >> >> Good idea, I will make the change in v3. > > Feel free to use a better name for this field :-). > >> >>>> + >>>> if (vsocket) { >>>> free(vsocket); >>>> vsocket =3D NULL; > > [snip] > >>>> + /* >>>> + * Although the ops structure is a const structure, we do need= to >>>> + * override the guest_notify operation. This is because with t= he >>>> + * previous APIs it was "reserved" and if any garbage value wa= s passed, >>>> + * it could crash the application. >>>> + */ >>>> + if (ops && !ops->guest_notify) { >>> >>> Hum, as described in the comment above, I don't think we should look >>> at ops->guest_notify value at all. >>> Checking ops !=3D NULL should be enough. >> >> Not sure I get you here. If the guest_notify passed by the user is NULL,= it means the previously =E2=80=98reserved[1]=E2=80=99 field is NULL, so we= do not need to use a new structure. >> >> I guess your comment would be true if we would introduce a new field in = the data structure, not replacing a reserved one. > > Hum, I don't understand my comment either o_O'. > Too many days off... or maybe my evil twin took over the keyboard. > > >> >>>> + struct rte_vhost_device_ops *new_ops; >>>> + >>>> + new_ops =3D malloc(sizeof(*new_ops)); >>> >>> Strictly speaking, we lose the numa affinity of "ops" by calling malloc= . >>> I am unclear of the impact though. >> >> Don=E2=80=99t think there is a portable API that we can use to determine= the NUMA for the ops memory and then allocate this on the same numa? >> >> Any thoughts or ideas on how to solve this? I hope most people will mems= et() the ops structure and the reserved[1] part is zero, but it might be a = problem in the future if more extensions get added. > > Determinining current numa is doable, via 'ops' > get_mempolicy(MPOL_F_NODE | MPOL_F_ADDR), like what is done for vq in > numa_realloc(). > The problem is how to allocate on this numa with the libc allocator > for which I have no idea... > We could go with the dpdk allocator (again, like numa_realloc()). > > > In practice, the passed ops will be probably from a const variable in > the program .data section (for which I think fields are set to 0 > unless explicitly initialised), or a memset() will be called for a > dynamic allocation from good citizens. > So we can probably live with the current proposal. > Plus, this is only for one release, since in 23.11 with the ABI bump, > we will drop this compat code. > > Maxime, Chenbo, what do you think? Wait for their response, but for now I assume we can just keep the numa una= ware malloc(). > > [snip] > >>> >>> But putting indentation aside, is this change equivalent? >>> - if ((vhost_need_event(vhost_used_event(vq), new, old) &= & >>> - (vq->callfd >=3D 0)) || >>> - unlikely(!signalled_used_valid)) { >>> + if ((vhost_need_event(vhost_used_event(vq), new, old) |= | >>> + unlikely(!signalled_used_valid)) && >>> + vq->callfd >=3D 0) { >> >> They are not equal, but in the past eventfd_write() should also not have= been called with callfd < 0, guess this was an existing bug ;) > > I think this should be a separate fix. ACK, will add a separate patch in this series to fix it. > >> >>>> + vhost_vring_inject_irq(dev, vq); > > > --=20 > David Marchand