From: Baruch Even
Date: Sun, 28 May 2023 23:07:40 +0300
Subject: Hugepage migration
To: dpdk-dev
List-Id: DPDK patches and discussions
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="00000000000055de6a05fcc68681"

--00000000000055de6a05fcc68681
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hi,

We found an issue with newer kernels (5.13+), which ship in newer OSes (Ubuntu22, Rocky9, Ubuntu20 with kernel 5.15): a 2M page that was allocated for DPDK was migrated (moved to another physical page) when a 1G page was allocated.

From our reading of the kernel commits, this started with commit ae37c7ff79f1f030e28ec76c46ee032f8fd07607
    mm: make alloc_contig_range handle in-use hugetlb pages

This caused what looked to us like memory corruption, and cases where the rings were moved from their physical location so that communication was no longer possible.

Has anyone else hit this issue, and what mitigations are available?

We are currently looking at using a kernel driver to pin the pages, but I expect that this issue will affect others and that a more general approach is needed.

Thanks,
Baruch

-- 
Baruch Even
Platform Technical Lead, WEKA
E baruch@weka.io  W www.weka.io

--00000000000055de6a05fcc68681
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Hi,

We found an issue with newer kernels (5.13+), which ship in newer OSes (Ubuntu22, Rocky9, Ubuntu20 with kernel 5.15): a 2M page that was allocated for DPDK was migrated (moved to another physical page) when a 1G page was allocated.

From our reading of the kernel commits, this started with commit ae37c7ff79f1f030e28ec76c46ee032f8fd07607
    mm: make alloc_contig_range handle in-use hugetlb pages

This caused what looked to us like memory corruption, and cases where the rings were moved from their physical location so that communication was no longer possible.
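
For anyone who wants to check whether they are affected, here is a minimal sketch of one way to notice such a move from userspace: record the physical address behind a virtual address via /proc/self/pagemap and compare it again later. The helper names (pagemap_entry_pfn, virt_to_phys, page_has_moved) are made up for illustration and are not DPDK APIs. It assumes the process has CAP_SYS_ADMIN, since the kernel otherwise reports the PFN as zero, in which case the sketch returns 0 rather than guessing.

```c
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Layout of a /proc/<pid>/pagemap entry: bit 63 = page present,
 * bits 0-54 = page frame number (PFN) when present. */
#define PAGEMAP_PRESENT  (1ULL << 63)
#define PAGEMAP_PFN_MASK ((1ULL << 55) - 1)

/* Decode the PFN from one pagemap entry; 0 if the page is not present. */
static uint64_t pagemap_entry_pfn(uint64_t entry)
{
    if (!(entry & PAGEMAP_PRESENT))
        return 0;
    return entry & PAGEMAP_PFN_MASK;
}

/* Return the physical address currently backing vaddr, or 0 if it cannot
 * be determined (pagemap unreadable, page not present, or PFN hidden). */
static uint64_t virt_to_phys(const void *vaddr)
{
    uint64_t page_size = (uint64_t)sysconf(_SC_PAGESIZE);
    uint64_t vpn = (uint64_t)(uintptr_t)vaddr / page_size;
    uint64_t entry = 0;
    ssize_t n;
    int fd;

    fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return 0;
    /* pagemap holds one 64-bit entry per base-size virtual page. */
    n = pread(fd, &entry, sizeof(entry), (off_t)(vpn * sizeof(entry)));
    close(fd);
    if (n != (ssize_t)sizeof(entry))
        return 0;

    uint64_t pfn = pagemap_entry_pfn(entry);
    if (pfn == 0)
        return 0;
    return pfn * page_size + (uint64_t)(uintptr_t)vaddr % page_size;
}

/* Non-zero if vaddr is now backed by a different physical address than
 * the one recorded earlier; 0 if unchanged or if either value is unknown. */
static int page_has_moved(const void *vaddr, uint64_t recorded_phys)
{
    uint64_t now = virt_to_phys(vaddr);

    return now != 0 && recorded_phys != 0 && now != recorded_phys;
}
```

Calling virt_to_phys() on a ring's base address right after allocation and then calling page_has_moved() periodically would flag a migration after the fact. This only detects the move, it does not prevent it.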

Has anyone else hit this issue, and what mitigations are available?

We are currently looking at using a kernel driver to pin the pages, but I expect that this issue will affect others and that a more general approach is needed.

Thanks,
Baruch

--
Baruch Even
Platform Technical Lead, WEKA
E baruch@weka.io  W www.weka.io
--00000000000055de6a05fcc68681--