From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <klarose@sandvine.com>
Received: from mail1.sandvine.com (Mail1.sandvine.com [64.7.137.134])
 by dpdk.org (Postfix) with ESMTP id B14DE1B253
 for <dev@dpdk.org>; Wed,  8 Nov 2017 20:33:38 +0100 (CET)
Received: from BLR-EXCHP-2.sandvine.com (192.168.196.172) by
 wtl-exchp-1.sandvine.com (192.168.194.176) with Microsoft SMTP Server (TLS)
 id 14.3.319.2; Wed, 8 Nov 2017 14:33:37 -0500
Received: from WTL-EXCHP-1.sandvine.com ([fe80::ac6b:cc1e:f2ff:93aa]) by
 blr-exchp-2.sandvine.com ([::1]) with mapi id 14.03.0319.002; Wed, 8 Nov 2017
 14:33:37 -0500
From: Kyle Larose <klarose@sandvine.com>
To: "dev@dpdk.org" <dev@dpdk.org>
CC: Declan Doherty <declan.doherty@intel.com>
Thread-Topic: rte_eth_bond 8023ad behaviour under congestion
Thread-Index: AdNYxtR66ZkILf8VTnGulDu1ehVBGw==
Date: Wed, 8 Nov 2017 19:33:35 +0000
Message-ID: <D76BBBCF97F57144BB5FCF08007244A7EDB0B712@wtl-exchp-1.sandvine.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [192.168.200.51]
x-c2processedorg: b2f06e69-072f-40ee-90c5-80a34e700794
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: [dpdk-dev] rte_eth_bond 8023ad behaviour under congestion
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Wed, 08 Nov 2017 19:33:39 -0000

Hello,

I've been doing some testing using the 8023ad link bonding driver on a
system with 4 10G i40e interfaces in the link bond. One thing I've
noticed is that if any of the links are overloaded when I don't have
dedicated control queues enabled, it starts dropping LACPDUs on
transmit. I quickly realized that it's because of the following code in
bond_ethdev_tx_burst_8023ad:


		num_tx_slave = rte_eth_tx_burst(slaves[i], bd_tx_q->queue_id,
				slave_bufs[i], slave_nb_pkts[i]);

		/* If tx burst fails drop slow packets */
		for ( ; num_tx_slave < slave_slow_nb_pkts[i]; num_tx_slave++)
			rte_pktmbuf_free(slave_bufs[i][num_tx_slave]);

This chunk of code basically treats LACPDUs as very low priority, since
they are generated infrequently. I'd like to ensure that LACPDUs are
still transmitted when there's congestion in the case where dedicated
queues are not supported.

I can think of a few options to resolve this:
 1) Store the LACPDUs for later sending in a per-slave buffer if tx
    fails, and make sure these are always at the front of the send
    buffer, so that when there's room, they're sent (I'm not quite sure
    what the best way to do this is).
 2) Allow enabling the dedicated tx queue without enabling the dedicated
    rx queue.
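
To make option 1 a bit more concrete, here is a rough, standalone sketch
of the idea. It is not driver code: struct pkt stands in for struct
rte_mbuf, the tx_fn callback stands in for rte_eth_tx_burst(), and the
names slow_pending, tx_slow_first, SLOW_PENDING_MAX, BURST_MAX and
mock_tx are all made up for illustration. The cap on held packets is
likewise an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOW_PENDING_MAX 4   /* hypothetical cap on held LACPDUs per slave */
#define BURST_MAX        32  /* hypothetical upper bound on one tx burst */

struct pkt { int id; };      /* stand-in for struct rte_mbuf */

/* Hypothetical per-slave holding area for LACPDUs that failed to transmit. */
struct slow_pending {
	struct pkt *pkts[SLOW_PENDING_MAX];
	uint16_t count;
};

/* Stand-in for rte_eth_tx_burst(): sends a prefix of pkts, returns its length. */
typedef uint16_t (*tx_fn)(void *ctx, struct pkt **pkts, uint16_t n);

/*
 * Build one burst as [held LACPDUs][new LACPDUs][data] and transmit it.
 * Unsent LACPDUs are held for the next call instead of being dropped;
 * only those that do not fit in the holding area are counted in
 * *nb_dropped (a real driver would free them there).
 * Returns the number of data packets that were sent.
 */
static uint16_t
tx_slow_first(struct slow_pending *sp, tx_fn tx, void *ctx,
		struct pkt **slow, uint16_t nb_slow,
		struct pkt **data, uint16_t nb_data, uint16_t *nb_dropped)
{
	struct pkt *burst[BURST_MAX];
	uint16_t n = 0, nb_ctrl, sent, i;

	assert(sp->count + nb_slow + nb_data <= BURST_MAX);

	for (i = 0; i < sp->count; i++)   /* oldest control packets first */
		burst[n++] = sp->pkts[i];
	for (i = 0; i < nb_slow; i++)
		burst[n++] = slow[i];
	nb_ctrl = n;
	for (i = 0; i < nb_data; i++)
		burst[n++] = data[i];

	sent = tx(ctx, burst, n);

	/* Re-queue, in order, every control packet that was not sent. */
	sp->count = 0;
	*nb_dropped = 0;
	for (i = sent; i < nb_ctrl; i++) {
		if (sp->count < SLOW_PENDING_MAX)
			sp->pkts[sp->count++] = burst[i];
		else
			(*nb_dropped)++;
	}

	return sent > nb_ctrl ? (uint16_t)(sent - nb_ctrl) : 0;
}

/* Mock port for experimenting: accepts at most 'room' packets per burst. */
struct mock_port { uint16_t room; int first_id; };

static uint16_t
mock_tx(void *ctx, struct pkt **pkts, uint16_t n)
{
	struct mock_port *p = ctx;
	uint16_t sent = n < p->room ? n : p->room;

	if (sent > 0)
		p->first_id = pkts[0]->id;  /* remember what led the burst */
	return sent;
}
```

With a mock port that has no room, a LACPDU handed to tx_slow_first()
ends up in the holding area; on the next call it leads the burst ahead
of any data packets. What to do when the holding area itself fills up is
left open here, as is how this would interact with the existing
slave_bufs layout.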

I think both 1 & 2 are good solutions on their own, and should probably
both be implemented. #2 is ideal, but doesn't cover all cases (like if
there are insufficient tx queues to dedicate one to this).

How do people feel about these proposals?

Note: I understand that this is not ideal at all, since the lack of a
dedicated rx queue means that LACPDUs could be dropped on rx. But, in my
use-case that's less likely than link congestion, so I'd like to at
least be resilient here.

Thanks,

Kyle