From: Ferruh Yigit
To: Tudor Cornea
CC: "Mihai Pogonaru"
Date: Mon, 20 Sep 2021 18:11:32 +0100
Message-ID: <924853ed-d3fa-f4e5-fde7-96147de4ad83@intel.com>
References: <1629466761-127333-1-git-send-email-tudor.cornea@gmail.com> <3154f4e8-9f0e-b04f-6e8f-096dc39f489c@intel.com>
Subject: Re: [dpdk-dev] [PATCH] net/af_packet: fix ignoring full ring on tx
List-Id: DPDK patches and discussions

On 9/6/2021 11:23 AM, Tudor Cornea wrote:
> Hi Ferruh,
>
>> Would you mind separate timestamp status fix to its own patch? I think
>> better to fix 'ignoring full Tx ring' first, to not make it dependent to
>> timestamp patch.
>
> Agreed. There are two issues solved by this patch. We will break it in two
> different patches.
>
>> I can see 'TP_STATUS_TS_SYS_HARDWARE' is deprecated, and I assume in the
>> kernel versions the bug exists, this flag is not set, but can you please
>> confirm?
>
>> And does it only seen with veth, if so I wonder if we can ignore it, not
>> sure how common to use af_packet PMD over veth interface, do you have this
>> usecase?
>
> We've seen the timestamping issue only when running af_packet over
> veth interfaces. We have a particular use-case internally, in which we need
> to run inside a Kubernetes cluster.
> We've found the following resources [1] , [2] related to this behavior in
> the kernel.
>
> We believe that issue #2 (the ring getting full), can theoretically occur
> on any type of NIC.
> We managed to reproduce the bursty behavior on af_packet PMD over vmxnet3
> interface, by Tx-ing packets at a low rate (e.g. ~340 pps), and toggling the
> interface on / off
> ifconfig $iface_name down; sleep 10; ifconfig $iface_name up
>
> We will attempt to give more context on the issue below, about what we
> think happens:
> - we have a 2048-entry queue shared between the kernel and the dpdk socket,
>   there's an index into the queue in both the kernel and the dpdk driver
> - the dpdk driver writes a packet or a burst, advances its idx and tells
>   the kernel to send the packets via a call to sendto() and the kernel sends
>   the packets and advances its idx
> - once the interface is down the kernel can no longer send packets, but it
>   doesn't drop them, it just doesn't advance its idx
> - for each packet there is a header and in the header there is a status
>   integer which, among others, indicates the owner of the packet: the
>   userspace or the kernel - the userspace (dpdk driver) sets the status as
>   owned by the kernel when it adds another packet ; the kernel sets the
>   status back to owned by the userspace once it sends a packet
> - the dpdk driver was ignoring this status bit and, even after the queue
>   was full, it would continue to put packets in the queue - its idx would be
>   "after" that of the kernel
> - once the interface is brought up, the kernel would send all the packets
>   in the queue (since they have the status of being owned by the kernel) on
>   the next call to sendto() and the idx would be back to where it was before
>   the interface was brought up (let's call it k1)
> - the dpdk driver idx into the queue would point somewhere in the queue
>   (let's call it d1) and would continue to add packets at that point, but the
>   kernel wouldn't send any packet anymore since there is now a gap of packets
>   owned by the userspace between the kernel index (k1) and the dpdk driver
>   idx (d1)
> - the dpdk idx would eventually reach k1 and packets would be transferred
>   at a normal rate until both the dpdk idx and the kernel idx would reach d1
>   again
> - between d1 and k1 there are only packets with the status as owned by the
>   kernel - which were added by the dpdk driver while its index was between
>   d1 and k1 ; thus the kernel would burst all the packets till k1, while the
>   dpdk idx is at d1
> - the cycle repeats
>
> If a new traffic config comes (in our application) while this cycle is
> happening, it could be that some of the packets of the old config are still
> in queue (between d1 and k1) and will be bursted when the dpdk and kernel
> idx reach d1 ; this would explain seeing packets from an old config, but
> only in the first 2048 packets (which is the queue size)
>

Hi Tudor,

If there is a use case for af_packet over veth, it is OK to fix the
timestamp issue.

What you described above looks like a ring buffer with a single producer and
a single consumer, in which the producer overwrites items that have not been
consumed yet. I assume this happens because af_packet (the consumer) can't
send the packets because of the timestamp defect. (Also, the producer (the
dpdk app) should have checks to prevent the overwrite, but that is a
different issue.)

I will comment on the new versions of the patches.

Out of curiosity, are you using a modified af_packet implementation in the
kernel for the usage described above?

> [1] https://www.spinics.net/lists/kernel/msg3959391.html
> [2] https://www.spinics.net/lists/netdev/msg739372.html
>
> On Wed, 1 Sept 2021 at 19:34, Ferruh Yigit wrote:
>
>> On 8/20/2021 2:39 PM, Tudor Cornea wrote:
>>> The poll call can return POLLERR which is ignored, or it can return
>>> POLLOUT, even if there are no free frames in the mmap-ed area.
>>>
>>> We can account for both of these cases by re-checking if the next
>>> frame is empty before writing into it.
>>>
>>> We also now eliminate the timestamp status from the frame status.
>>>
>>
>> Hi Tudor,
>>
>> Would you mind separate timestamp status fix to its own patch?
>> I think better to fix 'ignoring full Tx ring' first, to not make it
>> dependent to timestamp patch.
>>
>>> Signed-off-by: Mihai Pogonaru
>>> Signed-off-by: Tudor Cornea
>>> ---
>>>  drivers/net/af_packet/rte_eth_af_packet.c | 47 +++++++++++++++++++++++++++++--
>>>  1 file changed, 45 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
>>> index b73b211..3845df5 100644
>>> --- a/drivers/net/af_packet/rte_eth_af_packet.c
>>> +++ b/drivers/net/af_packet/rte_eth_af_packet.c
>>> @@ -167,6 +167,12 @@ eth_af_packet_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>>>      return num_rx;
>>>  }
>>>
>>> +static inline __u32 tx_ring_status_remove_ts(volatile __u32 *tp_status)
>>> +{
>>> +    return *tp_status &
>>> +        ~(TP_STATUS_TS_SOFTWARE | TP_STATUS_TS_RAW_HARDWARE);
>>
>> I can see 'TP_STATUS_TS_SYS_HARDWARE' is deprecated, and I assume in the
>> kernel versions the bug exists, this flag is not set, but can you please
>> confirm?
>>
>>> +}
>>> +
>>>  /*
>>>   * Callback to handle sending packets through a real NIC.
>>>   */
>>> @@ -211,9 +217,41 @@ eth_af_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>>>          }
>>>      }
>>>
>>> +        /*
>>> +         * We must eliminate the timestamp status from the packet
>>> +         * status. This should only matter if timestamping is enabled
>>> +         * on the socket, but there is a BUG in the kernel which is
>>> +         * fixed in newer releases.
>>> +
>>> +         * For interfaces of type 'veth', the sent skb is forwarded
>>> +         * to the peer and back into the network stack which timestamps
>>> +         * it on the RX path if timestamping is enabled globally
>>> +         * (which happens if any socket enables timestamping).
>>> +
>>> +         * When the skb is destructed, tpacket_destruct_skb() is called
>>> +         * and it calls __packet_set_timestamp() which doesn't check
>>> +         * the flags on the socket and returns the timestamp if it is
>>> +         * set in the skb (and for veth it is, as mentioned above).
>>> +         */
>>> +
>>
>> Can you give some more details on this bug, any link etc..
>>
>> And does it only seen with veth, if so I wonder if we can ignore it, not
>> sure how common to use af_packet PMD over veth interface, do you have this
>> usecase?
>>
>> And if only specific kernel versions impacted from this bug, what do you
>> think about adding kernel version check to reduce the scope of the fix in
>> af_packet PMD.
>>
>>>          /* point at the next incoming frame */
>>> -        if ((ppd->tp_status != TP_STATUS_AVAILABLE) &&
>>> -            (poll(&pfd, 1, -1) < 0))
>>> +        if ((tx_ring_status_remove_ts(&ppd->tp_status)
>>> +                != TP_STATUS_AVAILABLE) && (poll(&pfd, 1, -1) < 0))
>>> +            break;
>>> +
>>> +        /*
>>> +         * Poll can return POLLERR if the interface is down or POLLOUT,
>>> +         * even if there are no extra buffers available.
>>> +         * This happens, because packet_poll() calls datagram_poll()
>>> +         * which checks the space left in the socket buffer and in the
>>> +         * case of packet_mmap the default socket buffer length
>>> +         * doesn't match the requested size for the tx_ring so there
>>> +         * is always space left in socket buffer, which doesn't seem
>>> +         * to be correlated to the requested size for the tx_ring
>>> +         * in packet_mmap.
>>> +         */
>>> +        if (tx_ring_status_remove_ts(&ppd->tp_status)
>>> +                != TP_STATUS_AVAILABLE)
>>>              break;
>>
>> In this case should we break or poll again?
>>
>> If 'POLLERR' is received when interface is down, it makes sense to break.
>> Do you know if is there any other case that 'POLLERR' is returned?
>>
>> And for 'POLLOUT', when exactly this event sent? If 'POLLOUT' received,
>> can we poll again to wait Tx ring available to send more packets?
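
As a concrete reading of the question above, the decision taken after poll()
returns could look roughly like the sketch below. It is an illustration only
and not the code from the patch: tx_status_without_ts(), tx_after_poll() and
enum tx_next_step are hypothetical names, and stopping the burst on a frame
that is still busy simply mirrors what the patch does. Polling again would be
the alternative asked about above.

```c
#include <linux/if_packet.h>  /* TP_STATUS_* constants */
#include <poll.h>             /* POLLERR, POLLOUT */
#include <stdint.h>

/* Hypothetical result codes, used only for this illustration. */
enum tx_next_step { TX_WRITE_FRAME, TX_STOP_BURST };

/*
 * Drop the timestamp bits so that only the ownership state remains.
 * Like the patch, this does not mask the deprecated
 * TP_STATUS_TS_SYS_HARDWARE bit.
 */
static inline uint32_t tx_status_without_ts(uint32_t tp_status)
{
    return tp_status & ~(uint32_t)(TP_STATUS_TS_SOFTWARE |
                                   TP_STATUS_TS_RAW_HARDWARE);
}

/*
 * Decide what to do with the next Tx frame once poll() has returned.
 * revents is pollfd.revents, tp_status is the frame header status.
 */
static enum tx_next_step tx_after_poll(short revents, uint32_t tp_status)
{
    /* Error on the socket, e.g. the interface is down: give up. */
    if (revents & POLLERR)
        return TX_STOP_BURST;

    /* POLLOUT alone is not trusted: re-check who owns the frame. */
    if (tx_status_without_ts(tp_status) != TP_STATUS_AVAILABLE)
        return TX_STOP_BURST;

    return TX_WRITE_FRAME;
}
```

The point of the second check is that POLLOUT by itself says nothing about
the frame, so the status word stays the only source of truth for ownership.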
>>
>>>
>>>          /* copy the tx frame data */
>>> @@ -242,6 +280,11 @@ eth_af_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>>>          rte_pktmbuf_free(mbuf);
>>>      }
>>>
>>> +    /*
>>> +     * We might have to ignore a few more errnos here since the packets
>>> +     * remain in the mmap-ed queue and will be sent later, presumably.
>>> +     */
>>> +
>>
>> Can you please describe above comment more? What errors ignored?
>> And won't all packets sent will below sendto() call?
>>
>>>      /* kick-off transmits */
>>>      if (sendto(pkt_q->sockfd, NULL, 0, MSG_DONTWAIT, NULL, 0) == -1 &&
>>>              errno != ENOBUFS && errno != EAGAIN) {
>>>
>>
>>
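
For reference, the overwrite scenario discussed earlier in this thread can be
modelled with a toy ring. This is a sketch under stated assumptions, not
driver code: RING_SIZE, struct toy_ring, toy_ring_init() and toy_ring_tx()
are made-up names, the kernel side is assumed to be stalled (interface down)
for the whole run, and frame ownership is reduced to a two-value flag.

```c
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8u  /* toy size; the thread mentions a 2048-entry queue */

enum frame_owner { OWNER_USER = 0, OWNER_KERNEL = 1 };

struct toy_ring {
    enum frame_owner owner[RING_SIZE];
    unsigned int drv_idx;      /* next frame the driver writes */
    unsigned int overwritten;  /* frames clobbered while kernel-owned */
};

static void toy_ring_init(struct toy_ring *r)
{
    for (size_t i = 0; i < RING_SIZE; i++)
        r->owner[i] = OWNER_USER;
    r->drv_idx = 0;
    r->overwritten = 0;
}

/*
 * Try to queue n packets while the consumer never releases a frame.
 * With check_owner == false the producer ignores the ownership flag,
 * which is the behaviour described as the bug. With check_owner == true
 * it stops at the first frame still owned by the kernel.
 * Returns the number of packets the producer believes it queued.
 */
static unsigned int toy_ring_tx(struct toy_ring *r, unsigned int n,
                                bool check_owner)
{
    unsigned int queued = 0;

    for (unsigned int i = 0; i < n; i++) {
        if (r->owner[r->drv_idx] == OWNER_KERNEL) {
            if (check_owner)
                break;          /* ring is full: end the burst */
            r->overwritten++;   /* pending frame gets clobbered */
        }
        r->owner[r->drv_idx] = OWNER_KERNEL;
        r->drv_idx = (r->drv_idx + 1) % RING_SIZE;
        queued++;
    }
    return queued;
}
```

With the 8-frame ring and 12 packets, the unchecked producer reports all 12
as queued and clobbers 4 pending frames, while the checked producer stops
after 8 and overwrites nothing.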