From mboxrd@z Thu Jan 1 00:00:00 1970
Envelope-to: dev@dpdk.org, chenbo.xia@intel.com, maxime.coquelin@redhat.com,
	andrew.rybchenko@oktetlabs.ru, absaini@amd.com
Subject: [PATCH v2 5/5] vdpa/sfc: Add support for SW assisted live migration
Date: Thu, 14 Jul 2022 14:14:51 +0530
Message-ID: <20220714084451.38375-6-asaini@xilinx.com>
X-Mailer: git-send-email 2.25.0
In-Reply-To: <20220714084451.38375-1-asaini@xilinx.com>
References: <20220714084451.38375-1-asaini@xilinx.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain
X-BeenThere: dev@dpdk.org
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org

From: Abhimanyu Saini

In SW assisted live migration, the vDPA driver stops all virtqueues and
sets up SW vrings to relay communication between the virtio driver and
the vDPA device through an event-driven relay thread. This allows the
vDPA driver to assist with guest dirty page logging for live migration.
Signed-off-by: Abhimanyu Saini
---
v2:
* Fix checkpatch warnings
* Add a cover letter

 drivers/vdpa/sfc/sfc_vdpa.h     |   1 +
 drivers/vdpa/sfc/sfc_vdpa_ops.c | 337 ++++++++++++++++++++++++++++++--
 drivers/vdpa/sfc/sfc_vdpa_ops.h |  15 +-
 3 files changed, 330 insertions(+), 23 deletions(-)

diff --git a/drivers/vdpa/sfc/sfc_vdpa.h b/drivers/vdpa/sfc/sfc_vdpa.h
index daeb27d4cd..ae522caebe 100644
--- a/drivers/vdpa/sfc/sfc_vdpa.h
+++ b/drivers/vdpa/sfc/sfc_vdpa.h
@@ -18,6 +18,7 @@
 #define SFC_VDPA_MAC_ADDR			"mac"
 
 #define SFC_VDPA_DEFAULT_MCDI_IOVA		0x200000000000
+#define SFC_SW_VRING_IOVA			0x300000000000
 
 /* Broadcast & Unicast MAC filters are supported */
 #define SFC_MAX_SUPPORTED_FILTERS		3
diff --git a/drivers/vdpa/sfc/sfc_vdpa_ops.c b/drivers/vdpa/sfc/sfc_vdpa_ops.c
index 6401d4e16f..1d29ee7187 100644
--- a/drivers/vdpa/sfc/sfc_vdpa_ops.c
+++ b/drivers/vdpa/sfc/sfc_vdpa_ops.c
@@ -4,10 +4,13 @@
 #include
 #include
+#include
 #include
+#include
 
 #include
 #include
+#include
 #include
 #include
 #include
@@ -33,7 +36,9 @@
  */
 #define SFC_VDPA_DEFAULT_FEATURES \
 	((1ULL << VHOST_USER_F_PROTOCOL_FEATURES) | \
-	 (1ULL << VIRTIO_NET_F_MQ))
+	 (1ULL << VIRTIO_NET_F_MQ) | \
+	 (1ULL << VHOST_F_LOG_ALL) | \
+	 (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE))
 
 #define SFC_VDPA_MSIX_IRQ_SET_BUF_LEN \
 	(sizeof(struct vfio_irq_set) + \
@@ -42,6 +47,142 @@
 /* It will be used for target VF when calling function is not PF */
 #define SFC_VDPA_VF_NULL		0xFFFF
 
+#define SFC_VDPA_DECODE_FD(data)	(data.u64 >> 32)
+#define SFC_VDPA_DECODE_QID(data)	(data.u32 >> 1)
+#define SFC_VDPA_DECODE_EV_TYPE(data)	(data.u32 & 1)
+
+/*
+ * Create q_num number of epoll events for kickfd interrupts
+ * and q_num/2 events for callfd interrupts. Round up the
+ * total to (q_num * 2) number of events.
+ */
+#define SFC_VDPA_SW_RELAY_EVENT_NUM(q_num)	(q_num * 2)
+
+static inline uint64_t
+sfc_vdpa_encode_ev_data(int type, uint32_t qid, int fd)
+{
+	SFC_VDPA_ASSERT(fd > UINT32_MAX || qid > UINT32_MAX / 2);
+	return type | (qid << 1) | (uint64_t)fd << 32;
+}
+
+static inline void
+sfc_vdpa_queue_relay(struct sfc_vdpa_ops_data *ops_data, uint32_t qid)
+{
+	rte_vdpa_relay_vring_used(ops_data->vid, qid, &ops_data->sw_vq[qid]);
+	rte_vhost_vring_call(ops_data->vid, qid);
+}
+
+static void*
+sfc_vdpa_sw_relay(void *data)
+{
+	uint64_t buf;
+	uint32_t qid, q_num;
+	struct epoll_event ev;
+	struct rte_vhost_vring vring;
+	int nbytes, i, ret, fd, epfd, nfds = 0;
+	struct epoll_event events[SFC_VDPA_MAX_QUEUE_PAIRS * 2];
+	struct sfc_vdpa_ops_data *ops_data = (struct sfc_vdpa_ops_data *)data;
+
+	q_num = rte_vhost_get_vring_num(ops_data->vid);
+	epfd = epoll_create(SFC_VDPA_SW_RELAY_EVENT_NUM(q_num));
+	if (epfd < 0) {
+		sfc_vdpa_log_init(ops_data->dev_handle,
+				  "failed to create epoll instance");
+		goto fail_epoll;
+	}
+	ops_data->epfd = epfd;
+
+	vring.kickfd = -1;
+	for (qid = 0; qid < q_num; qid++) {
+		ev.events = EPOLLIN | EPOLLPRI;
+		ret = rte_vhost_get_vhost_vring(ops_data->vid, qid, &vring);
+		if (ret != 0) {
+			sfc_vdpa_log_init(ops_data->dev_handle,
+					  "rte_vhost_get_vhost_vring error %s",
+					  strerror(errno));
+			goto fail_vring;
+		}
+
+		ev.data.u64 = sfc_vdpa_encode_ev_data(0, qid, vring.kickfd);
+		if (epoll_ctl(epfd, EPOLL_CTL_ADD, vring.kickfd, &ev) < 0) {
+			sfc_vdpa_log_init(ops_data->dev_handle,
+					  "epoll add error: %s",
+					  strerror(errno));
+			goto fail_epoll_add;
+		}
+	}
+
+	/*
+	 * Register intr_fd created by vDPA driver in lieu of qemu's callfd
+	 * to intercept rx queue notification. So that we can monitor rx
+	 * notifications and issue rte_vdpa_relay_vring_used()
+	 */
+	for (qid = 0; qid < q_num; qid += 2) {
+		fd = ops_data->intr_fd[qid];
+		ev.events = EPOLLIN | EPOLLPRI;
+		ev.data.u64 = sfc_vdpa_encode_ev_data(1, qid, fd);
+		if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
+			sfc_vdpa_log_init(ops_data->dev_handle,
+					  "epoll add error: %s",
+					  strerror(errno));
+			goto fail_epoll_add;
+		}
+		sfc_vdpa_queue_relay(ops_data, qid);
+	}
+
+	/*
+	 * virtio driver in VM was continuously sending queue notifications
+	 * while were setting up software vrings and hence the HW misses
+	 * these doorbell notifications. Since, it is safe to send duplicate
+	 * doorbell, send another doorbell from vDPA driver.
+	 */
+	for (qid = 0; qid < q_num; qid++)
+		rte_write16(qid, ops_data->vq_cxt[qid].doorbell);
+
+	for (;;) {
+		nfds = epoll_wait(epfd, events,
+				  SFC_VDPA_SW_RELAY_EVENT_NUM(q_num), -1);
+		if (nfds < 0) {
+			if (errno == EINTR)
+				continue;
+			sfc_vdpa_log_init(ops_data->dev_handle,
+					  "epoll_wait return fail\n");
+			goto fail_epoll_wait;
+		}
+
+		for (i = 0; i < nfds; i++) {
+			fd = SFC_VDPA_DECODE_FD(events[i].data);
+			/* Ensure kickfd is not busy before proceeding */
+			for (;;) {
+				nbytes = read(fd, &buf, 8);
+				if (nbytes < 0) {
+					if (errno == EINTR ||
+					    errno == EWOULDBLOCK ||
+					    errno == EAGAIN)
+						continue;
+				}
+				break;
+			}
+
+			qid = SFC_VDPA_DECODE_QID(events[i].data);
+			if (SFC_VDPA_DECODE_EV_TYPE(events[i].data))
+				sfc_vdpa_queue_relay(ops_data, qid);
+			else
+				rte_write16(qid, ops_data->vq_cxt[qid].doorbell);
+		}
+	}
+
+	return NULL;
+
+fail_epoll:
+fail_vring:
+fail_epoll_add:
+fail_epoll_wait:
+	close(epfd);
+	ops_data->epfd = -1;
+	return NULL;
+}
+
 static int
 sfc_vdpa_get_device_features(struct sfc_vdpa_ops_data *ops_data)
 {
@@ -99,7 +240,7 @@ hva_to_gpa(int vid, uint64_t hva)
 static int
 sfc_vdpa_enable_vfio_intr(struct sfc_vdpa_ops_data *ops_data)
 {
-	int rc;
+	int rc, fd;
 	int *irq_fd_ptr;
 	int vfio_dev_fd;
 	uint32_t i, num_vring;
@@ -131,6 +272,17 @@ sfc_vdpa_enable_vfio_intr(struct sfc_vdpa_ops_data *ops_data)
 			return -1;
 
 		irq_fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = vring.callfd;
+		if (ops_data->sw_fallback_mode && !(i & 1)) {
+			fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
+			if (fd < 0) {
+				sfc_vdpa_err(ops_data->dev_handle,
+					     "failed to create eventfd");
+				goto fail_eventfd;
+			}
+			ops_data->intr_fd[i] = fd;
+			irq_fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = fd;
+		} else
+			ops_data->intr_fd[i] = -1;
 	}
 
 	rc = ioctl(vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
@@ -138,16 +290,26 @@ sfc_vdpa_enable_vfio_intr(struct sfc_vdpa_ops_data *ops_data)
 		sfc_vdpa_err(ops_data->dev_handle,
 			     "error enabling MSI-X interrupts: %s",
 			     strerror(errno));
-		return -1;
+		goto fail_ioctl;
 	}
 
 	return 0;
+
+fail_ioctl:
+fail_eventfd:
+	for (i = 0; i < num_vring; i++) {
+		if (ops_data->intr_fd[i] != -1) {
+			close(ops_data->intr_fd[i]);
+			ops_data->intr_fd[i] = -1;
+		}
+	}
+	return -1;
 }
 
 static int
 sfc_vdpa_disable_vfio_intr(struct sfc_vdpa_ops_data *ops_data)
 {
-	int rc;
+	int rc, i;
 	int vfio_dev_fd;
 	struct vfio_irq_set irq_set;
 	void *dev;
@@ -161,6 +323,12 @@ sfc_vdpa_disable_vfio_intr(struct sfc_vdpa_ops_data *ops_data)
 	irq_set.index = VFIO_PCI_MSIX_IRQ_INDEX;
 	irq_set.start = 0;
 
+	for (i = 0; i < ops_data->vq_count; i++) {
+		if (ops_data->intr_fd[i] >= 0)
+			close(ops_data->intr_fd[i]);
+		ops_data->intr_fd[i] = -1;
+	}
+
 	rc = ioctl(vfio_dev_fd, VFIO_DEVICE_SET_IRQS, &irq_set);
 	if (rc) {
 		sfc_vdpa_err(ops_data->dev_handle,
@@ -223,12 +391,15 @@ sfc_vdpa_get_vring_info(struct sfc_vdpa_ops_data *ops_data,
 static int
 sfc_vdpa_virtq_start(struct sfc_vdpa_ops_data *ops_data, int vq_num)
 {
-	int rc;
+	int rc, fd;
+	uint64_t size;
 	uint32_t doorbell;
 	efx_virtio_vq_t *vq;
+	void *vring_buf, *dev;
 	struct sfc_vdpa_vring_info vring;
 	efx_virtio_vq_cfg_t vq_cfg;
 	efx_virtio_vq_dyncfg_t vq_dyncfg;
+	uint64_t sw_vq_iova = ops_data->sw_vq_iova;
 
 	vq = ops_data->vq_cxt[vq_num].vq;
 	if (vq == NULL)
@@ -241,6 +412,33 @@ sfc_vdpa_virtq_start(struct sfc_vdpa_ops_data *ops_data, int vq_num)
 		goto fail_vring_info;
 	}
 
+	if (ops_data->sw_fallback_mode) {
+		size = vring_size(vring.size, rte_mem_page_size());
+		size = RTE_ALIGN_CEIL(size, rte_mem_page_size());
+		vring_buf = rte_zmalloc("vdpa", size, rte_mem_page_size());
+		vring_init(&ops_data->sw_vq[vq_num], vring.size, vring_buf,
+			   rte_mem_page_size());
+
+		dev = ops_data->dev_handle;
+		fd = sfc_vdpa_adapter_by_dev_handle(dev)->vfio_container_fd;
+		rc = rte_vfio_container_dma_map(fd,
+						(uint64_t)(uintptr_t)vring_buf,
+						sw_vq_iova, size);
+
+		/* Direct I/O for Tx queue, relay for Rx queue */
+		if (!(vq_num & 1))
+			vring.used = sw_vq_iova +
+				(char *)ops_data->sw_vq[vq_num].used -
+				(char *)ops_data->sw_vq[vq_num].desc;
+
+		ops_data->sw_vq[vq_num].used->idx = vring.last_used_idx;
+		ops_data->sw_vq[vq_num].avail->idx = vring.last_avail_idx;
+
+		ops_data->vq_cxt[vq_num].sw_vq_iova = sw_vq_iova;
+		ops_data->vq_cxt[vq_num].sw_vq_size = size;
+		ops_data->sw_vq_iova += size;
+	}
+
 	vq_cfg.evvc_target_vf = SFC_VDPA_VF_NULL;
 
 	/* even virtqueue for RX and odd for TX */
@@ -309,9 +507,12 @@ sfc_vdpa_virtq_start(struct sfc_vdpa_ops_data *ops_data, int vq_num)
 static int
 sfc_vdpa_virtq_stop(struct sfc_vdpa_ops_data *ops_data, int vq_num)
 {
-	int rc;
+	int rc, fd;
+	void *dev, *buf;
+	uint64_t size, len, iova;
 	efx_virtio_vq_dyncfg_t vq_idx;
 	efx_virtio_vq_t *vq;
+	struct rte_vhost_vring vring;
 
 	if (ops_data->vq_cxt[vq_num].enable != B_TRUE)
 		return -1;
@@ -320,12 +521,34 @@ sfc_vdpa_virtq_stop(struct sfc_vdpa_ops_data *ops_data, int vq_num)
 	if (vq == NULL)
 		return -1;
 
+	if (ops_data->sw_fallback_mode) {
+		dev = ops_data->dev_handle;
+		fd = sfc_vdpa_adapter_by_dev_handle(dev)->vfio_container_fd;
+		/* synchronize remaining new used entries if any */
+		if (!(vq_num & 1))
+			sfc_vdpa_queue_relay(ops_data, vq_num);
+
+		rte_vhost_get_vhost_vring(ops_data->vid, vq_num, &vring);
+		len = SFC_VDPA_USED_RING_LEN(vring.size);
+		rte_vhost_log_used_vring(ops_data->vid, vq_num, 0, len);
+
+		buf = ops_data->sw_vq[vq_num].desc;
+		size = ops_data->vq_cxt[vq_num].sw_vq_size;
+		iova = ops_data->vq_cxt[vq_num].sw_vq_iova;
+		rte_vfio_container_dma_unmap(fd, (uint64_t)(uintptr_t)buf,
+					     iova, size);
+	}
+
 	/* stop the vq */
 	rc = efx_virtio_qstop(vq, &vq_idx);
 	if (rc == 0) {
-		ops_data->vq_cxt[vq_num].cidx = vq_idx.evvd_vq_used_idx;
-		ops_data->vq_cxt[vq_num].pidx = vq_idx.evvd_vq_avail_idx;
+		if (ops_data->sw_fallback_mode)
+			vq_idx.evvd_vq_avail_idx = vq_idx.evvd_vq_used_idx;
+		rte_vhost_set_vring_base(ops_data->vid, vq_num,
+					 vq_idx.evvd_vq_avail_idx,
+					 vq_idx.evvd_vq_used_idx);
 	}
+
 	ops_data->vq_cxt[vq_num].enable = B_FALSE;
 
 	return rc;
@@ -450,7 +673,11 @@ sfc_vdpa_start(struct sfc_vdpa_ops_data *ops_data)
 
 	SFC_EFX_ASSERT(ops_data->state == SFC_VDPA_STATE_CONFIGURED);
 
-	sfc_vdpa_log_init(ops_data->dev_handle, "entry");
+	if (ops_data->sw_fallback_mode) {
+		sfc_vdpa_log_init(ops_data->dev_handle,
+				  "Trying to start VDPA with SW I/O relay");
+		ops_data->sw_vq_iova = SFC_SW_VRING_IOVA;
+	}
 
 	ops_data->state = SFC_VDPA_STATE_STARTING;
 
@@ -675,6 +902,7 @@ static int
 sfc_vdpa_dev_close(int vid)
 {
 	int ret;
+	void *status;
 	struct rte_vdpa_device *vdpa_dev;
 	struct sfc_vdpa_ops_data *ops_data;
 
@@ -707,7 +935,23 @@ sfc_vdpa_dev_close(int vid)
 	}
 	ops_data->is_notify_thread_started = false;
 
+	if (ops_data->sw_fallback_mode) {
+		ret = pthread_cancel(ops_data->sw_relay_thread_id);
+		if (ret != 0)
+			sfc_vdpa_err(ops_data->dev_handle,
+				     "failed to cancel LM relay thread: %s",
+				     rte_strerror(ret));
+
+		ret = pthread_join(ops_data->sw_relay_thread_id, &status);
+		if (ret != 0)
+			sfc_vdpa_err(ops_data->dev_handle,
+				     "failed to join LM relay thread: %s",
+				     rte_strerror(ret));
+	}
+
 	sfc_vdpa_stop(ops_data);
+	ops_data->sw_fallback_mode = false;
+
 	sfc_vdpa_close(ops_data);
 
 	sfc_vdpa_adapter_unlock(ops_data->dev_handle);
@@ -774,9 +1018,49 @@ sfc_vdpa_set_vring_state(int vid, int vring, int state)
 static int
 sfc_vdpa_set_features(int vid)
 {
-	RTE_SET_USED(vid);
+	int ret;
+	uint64_t features = 0;
+	struct rte_vdpa_device *vdpa_dev;
+	struct sfc_vdpa_ops_data *ops_data;
 
-	return -1;
+	vdpa_dev = rte_vhost_get_vdpa_device(vid);
+	ops_data = sfc_vdpa_get_data_by_dev(vdpa_dev);
+	if (ops_data == NULL)
+		return -1;
+
+	rte_vhost_get_negotiated_features(vid, &features);
+
+	if (!RTE_VHOST_NEED_LOG(features))
+		return -1;
+
+	sfc_vdpa_info(ops_data->dev_handle, "live-migration triggered");
+
+	sfc_vdpa_adapter_lock(ops_data->dev_handle);
+
+	/* Stop HW Offload and unset host notifier */
+	sfc_vdpa_stop(ops_data);
+	if (rte_vhost_host_notifier_ctrl(vid, RTE_VHOST_QUEUE_ALL, false) != 0)
+		sfc_vdpa_info(ops_data->dev_handle,
+			      "vDPA (%s): Failed to clear host notifier",
+			      ops_data->vdpa_dev->device->name);
+
+	/* Restart vDPA with SW relay on RX queue */
+	ops_data->sw_fallback_mode = true;
+	sfc_vdpa_start(ops_data);
+	ret = pthread_create(&ops_data->sw_relay_thread_id, NULL,
+			     sfc_vdpa_sw_relay, (void *)ops_data);
+	if (ret != 0)
+		sfc_vdpa_err(ops_data->dev_handle,
+			     "failed to create rx_relay thread: %s",
+			     rte_strerror(ret));
+
+	if (rte_vhost_host_notifier_ctrl(vid, RTE_VHOST_QUEUE_ALL, true) != 0)
+		sfc_vdpa_info(ops_data->dev_handle, "notifier setup failed!");
+
+	sfc_vdpa_adapter_unlock(ops_data->dev_handle);
+	sfc_vdpa_info(ops_data->dev_handle, "SW fallback setup done!");
+
+	return 0;
 }
 
 static int
@@ -860,17 +1144,28 @@ sfc_vdpa_get_notify_area(int vid, int qid, uint64_t *offset, uint64_t *size)
 	sfc_vdpa_info(dev, "vDPA ops get_notify_area :: offset : 0x%" PRIx64,
		      *offset);
 
-	pci_dev = sfc_vdpa_adapter_by_dev_handle(dev)->pdev;
-	doorbell = (uint8_t *)pci_dev->mem_resource[reg.index].addr + *offset;
+	if (!ops_data->sw_fallback_mode) {
+		pci_dev = sfc_vdpa_adapter_by_dev_handle(dev)->pdev;
+		doorbell = (uint8_t *)pci_dev->mem_resource[reg.index].addr +
+			   *offset;
+		/*
+		 * virtio-net driver in VM sends queue notifications before
+		 * vDPA has a chance to setup the queues and notification area,
+		 * and hence the HW misses these doorbell notifications.
+		 * Since, it is safe to send duplicate doorbell, send another
+		 * doorbell from vDPA driver as workaround for this timing issue
+		 */
+		rte_write16(qid, doorbell);
+
+		/*
+		 * Update doorbell address, it will come in handy during
+		 * live-migration.
+		 */
+		ops_data->vq_cxt[qid].doorbell = doorbell;
+	}
-	/*
-	 * virtio-net driver in VM sends queue notifications before
-	 * vDPA has a chance to setup the queues and notification area,
-	 * and hence the HW misses these doorbell notifications.
-	 * Since, it is safe to send duplicate doorbell, send another
-	 * doorbell from vDPA driver as workaround for this timing issue.
-	 */
-	rte_write16(qid, doorbell);
+	sfc_vdpa_info(dev, "vDPA ops get_notify_area :: offset : 0x%" PRIx64,
+		      *offset);
 
 	return 0;
 }
diff --git a/drivers/vdpa/sfc/sfc_vdpa_ops.h b/drivers/vdpa/sfc/sfc_vdpa_ops.h
index 5c8e352de3..dd301bae86 100644
--- a/drivers/vdpa/sfc/sfc_vdpa_ops.h
+++ b/drivers/vdpa/sfc/sfc_vdpa_ops.h
@@ -6,8 +6,11 @@
 #define _SFC_VDPA_OPS_H
 
 #include
+#include
 
 #define SFC_VDPA_MAX_QUEUE_PAIRS		8
+#define SFC_VDPA_USED_RING_LEN(size) \
+	((size) * sizeof(struct vring_used_elem) + sizeof(uint16_t) * 3)
 
 enum sfc_vdpa_context {
 	SFC_VDPA_AS_VF
@@ -37,9 +40,10 @@ struct sfc_vdpa_vring_info {
 typedef struct sfc_vdpa_vq_context_s {
 	volatile void *doorbell;
 	uint8_t enable;
-	uint32_t pidx;
-	uint32_t cidx;
 	efx_virtio_vq_t *vq;
+
+	uint64_t sw_vq_iova;
+	uint64_t sw_vq_size;
 } sfc_vdpa_vq_context_t;
 
 struct sfc_vdpa_ops_data {
@@ -57,6 +61,13 @@ struct sfc_vdpa_ops_data {
 	uint16_t vq_count;
 	struct sfc_vdpa_vq_context_s vq_cxt[SFC_VDPA_MAX_QUEUE_PAIRS * 2];
+
+	int epfd;
+	uint64_t sw_vq_iova;
+	bool sw_fallback_mode;
+	pthread_t sw_relay_thread_id;
+	struct vring sw_vq[SFC_VDPA_MAX_QUEUE_PAIRS * 2];
+	int intr_fd[SFC_VDPA_MAX_QUEUE_PAIRS * 2];
 };
 
 struct sfc_vdpa_ops_data *
-- 
2.18.2