From: Antonio Di Bacco
Date: Mon, 18 Apr 2022 19:53:38 +0200
Subject: Re: Shared memory between two primary DPDK processes
To: Dmitry Kozlyuk
Cc: users@dpdk.org
List-Id: DPDK usage discussions

One more piece of information:

The process that allocates the 1 GB page has this mapping:
antodib@Ubuntu-20.04-5:: /proc> sudo cat /proc/27812/maps | grep huge
140000000-180000000 rw-s 00000000 00:46 97193    /dev/huge1G/rtemap_0

while the process that maps the 1 GB page (started with --file-prefix p2) has these mappings. Is it stealing a new page?
antodib@Ubuntu-20.04-5:: /proc> sudo cat /proc/27906/maps | grep huge
140000000-180000000 rw-s 00000000 00:46 113170    /dev/huge1G/p2map_0
7f7bc0000000-7f7c00000000 rw-s 00000000 00:46 97193    /dev/huge1G/rtemap_0

On Mon, 18 Apr 2022 at 19:34, Antonio Di Bacco <a.dibacco.ks@gmail.com> wrote:
In the end I tried the pidfd_getfd syscall, which works really well and gives me back a "clone" of an fd that was opened by another process. I tested it by opening a text file in the first process; after cloning the fd, I could read the file in the second process as well.
Now the weird thing:
1) In the first process I allocate a huge page, then get its fd.
2) In the second process I get my "clone" fd and mmap it. This works, but if I write to that memory, the first process cannot see what I wrote.

int second_process(int remote_pid, int remote_mem_fd) {

        printf("remote_pid %d remote_mem_fd %d\n", remote_pid, remote_mem_fd);
        int pidfd = syscall(__NR_pidfd_open, remote_pid, 0);

        /* 438 is the syscall number of pidfd_getfd */
        int my_mem_fd = syscall(438, pidfd, remote_mem_fd, 0);
        printf("my_mem_fd %d\n", my_mem_fd);   // This is nice

        int flags = MAP_SHARED | MAP_HUGETLB | (30 << MAP_HUGE_SHIFT);
        uint64_t* addr = (uint64_t*) mmap(NULL, 1024 * 1024 * 1024, PROT_READ|PROT_WRITE, flags, my_mem_fd, 0);
        if (addr == MAP_FAILED) {
                perror("mmap");
                return -1;      /* do not write through a failed mapping */
        }
        *addr = 0x0101010102020202;
        return 0;
}
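
For comparison, the behaviour expected from MAP_SHARED can be sketched without hugepages: two shared mappings of the same file should observe each other's writes. The helper name shared_write_visible() is hypothetical, and using an ordinary file descriptor in place of the hugepage fd is an assumption made so that the sketch runs on any machine:

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical helper: maps fd twice with MAP_SHARED, writes through the
 * first mapping and reads back through the second.
 * Returns 1 if the write is visible, 0 if it is not, -1 if mmap() fails. */
static int shared_write_visible(int fd, size_t len)
{
    const uint64_t pattern = 0x0101010102020202ULL;

    volatile uint64_t *a = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                MAP_SHARED, fd, 0);
    if (a == MAP_FAILED)
        return -1;

    volatile uint64_t *b = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                MAP_SHARED, fd, 0);
    if (b == MAP_FAILED) {
        munmap((void *)a, len);
        return -1;
    }

    *a = pattern;                       /* write through the first mapping */
    int visible = (*b == pattern);      /* read through the second mapping */

    munmap((void *)a, len);
    munmap((void *)b, len);
    return visible;
}
```

The sketch keeps both mappings in one process for brevity; MAP_SHARED mappings of the same file and offset in two processes share pages in the same way.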


On Thu, 14 Apr 2022 at 21:51, Antonio Di Bacco <a.dibacco.ks@gmail.com> wrote:


On Thu, 14 Apr 2022 at 21:01, Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> wrote:
2022-04-14 10:20 (UTC+0200), Antonio Di Bacco:
[...]
> OK, after having a look at memif I managed to exchange the fd between the
> two processes and it works.
> Anyway, the procedure seems a little bit clunky and I think I'm going to use
> the new pidfd_getfd syscall
> to achieve the same result. In your opinion, could this method (pidfd_getfd)
> also work if the two DPDK processes
> are inside different Docker containers?

Honestly, I've just learned about pidfd_getfd() from you.
But I know that containers use PID namespaces, so there's the question of how you will obtain the pidfd of a process in another container.

In general, any method of sharing FD will work.
Remember that you also need offset and size.
Given that some channel is required to share those,
I think a Unix domain socket is still the preferred way.
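
A minimal sketch of that approach, passing the FD together with the offset and size over a Unix domain socket using SCM_RIGHTS ancillary data. The struct region_info layout and the send_region()/recv_region() names are hypothetical and not part of DPDK or memif:

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical message payload: where the shared region lives in the file. */
struct region_info {
    uint64_t offset;
    uint64_t size;
};

/* Properly aligned buffer for one SCM_RIGHTS control message with one fd. */
union fd_ctrl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send fd plus offset and size in one message. Returns 0 on success. */
static int send_region(int sock, int fd, uint64_t offset, uint64_t size)
{
    struct region_info info = { .offset = offset, .size = size };
    struct iovec iov = { .iov_base = &info, .iov_len = sizeof(info) };
    union fd_ctrl ctrl;
    struct msghdr msg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == (ssize_t)sizeof(info) ? 0 : -1;
}

/* Receive the message; *fd is a new descriptor valid in this process. */
static int recv_region(int sock, int *fd, uint64_t *offset, uint64_t *size)
{
    struct region_info info;
    struct iovec iov = { .iov_base = &info, .iov_len = sizeof(info) };
    union fd_ctrl ctrl;
    struct msghdr msg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != (ssize_t)sizeof(info))
        return -1;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(fd, CMSG_DATA(cmsg), sizeof(int));
    *offset = info.offset;
    *size = info.size;
    return 0;
}
```

The receiver can then call mmap() on the received descriptor with the offset and size taken from the message.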

> Or is there another mechanism, like using handles to hugepages present in
> the filesystem, to share between two
> different containers?

FD is needed for mmap().
You need to either pass the FD or open() the same hugepage file by path. I advise against using paths because they are not a part of the DPDK API contract.

Thank you very much Dmitry, your answers are always enlightening.
I'm going to ask a separate question on dpdk.org about the best practice for sharing memory between two DPDK processes running in different containers.