From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 1 Jul 2017 20:01:16 +0800
From: "lilongfei1110"
To: "users"
Message-ID: <246f07ef.325b4.15cfe06edd6.Coremail.lilongfei1110@163.com>
Subject: [dpdk-users] Buffer leak and RSS performance problem

Hi,

Thanks for reading.

I have run into two problems while using l2fwd and l3fwd:

P1. In the l2fwd source code I changed the values below, and the application stops receiving packets after it has received about 1K packets.

    #define NB_MBUF   81920
    #define RTE_TEST_RX_DESC_DEFAULT 4032

Is this related to the i40e driver code below?

    rxq->rx_tail = (uint16_t)(rxq->rx_tail & (rxq->nb_rx_desc - 1));

P2. I use l3fwd to process about 35 Gbps of IP traffic with 64-byte packets, and the process drops many packets. I then used 4 RX queues and 4 lcores, but performance improved only a little. The result is the same with 6 queues and 6 lcores. How can I locate where the problem is? Are there any suggestions?

My environment:
OS: Red Hat 6.5 x86_64
CPU: Intel Xeon E5-2650 v3 @ 2.3 GHz
NIC: Intel Corporation Device [8086:1580]
DPDK: 16.11

Thanks & Best Regards.

>From kiselev99@gmail.com Sat Jul 1 20:23:11 2017
From: Alex Kiselev
Date: Sat, 1 Jul 2017 21:23:08 +0300
Cc: declan.doherty@intel.com
To: users@dpdk.org
Subject: [dpdk-users] bonding driver LACP mode issues

Hi!

While working with the bonding driver in mode 4 (LACP), I have several times ended up in a situation where the link aggregation port stopped forwarding packets after some time of normal operation. Recreating the aggregation group on the switch didn't help in those situations. The only way out was to restart my application.

I started investigating the source code of the bonding driver and discovered that the rx_machine() function doesn't follow the IEEE Std 802.1AX-2008 standard.

It looks like the following part of the rx_machine() code implements the recordPDU function described in section "5.4.9 Functions" of the standard:

    bool match = port->actor.system_priority ==
            lacp->partner.port_params.system_priority &&
        is_same_ether_addr(&agg->actor.system,
            &lacp->partner.port_params.system) &&
        port->actor.port_priority ==
            lacp->partner.port_params.port_priority &&
        port->actor.port_number ==
            lacp->partner.port_params.port_number;

    ...

    /* If LACP partner params match this port actor params */
    if (match == true && ACTOR_STATE(port, AGGREGATION) ==
            PARTNER_STATE(port, AGGREGATION))
        PARTNER_STATE_SET(port, SYNCHRONIZATION);
    else if (!PARTNER_STATE(port, AGGREGATION) &&
            ACTOR_STATE(port, AGGREGATION))
        PARTNER_STATE_SET(port, SYNCHRONIZATION);
    else
        PARTNER_STATE_CLR(port, SYNCHRONIZATION);

Problem #1:
According to the recordPDU function, the "Partner_Key" parameter carried in the received PDU should be compared to Actor_Oper_Port_Key. The bonding driver doesn't do this. It only compares system_priority, system, port_priority and port_number when evaluating the match variable.

Problem #2:
The standard also states:
"Partner_Oper_Port_State.Synchronization is set to TRUE if all of these parameters match, Actor_State.Synchronization in the received PDU is set to TRUE, and LACP will actively maintain the link in the aggregation."

But the bonding driver doesn't check that Actor_State.Synchronization in the received PDU is set to TRUE.

Problem #3:
The standard also states:
"Partner_Oper_Port_State.Synchronization is also set to TRUE if the value of Actor_State.Aggregation in the received PDU is set to FALSE (i.e., indicates an Individual link), Actor_State.Synchronization in the received PDU is set to TRUE, and LACP will actively maintain the link."

The bonding driver only partly follows that rule: it doesn't check that Actor_State.Synchronization in the received PDU is set to TRUE. It also checks ACTOR_STATE(port, AGGREGATION), although the standard doesn't say anything about this.

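For illustration, the two quoted rules can be folded into a single predicate. The following is only a minimal standalone sketch under stated assumptions, not the bonding driver's code: the types struct port_params and struct rx_pdu, their field names, and the functions params_match() and partner_in_sync() are all hypothetical, and the "LACP will actively maintain the link" condition is simply passed in as a flag by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified parameter set compared by recordPDU. */
struct port_params {
    uint16_t system_priority;
    uint8_t  system[6];      /* system MAC address */
    uint16_t key;            /* the field Problem #1 says is not compared */
    uint16_t port_priority;
    uint16_t port_number;
    bool     aggregation;    /* the Aggregation state bit */
};

/* Hypothetical view of the fields of a received LACPDU used here. */
struct rx_pdu {
    struct port_params partner_view; /* Partner_* fields: sender's view of us */
    bool actor_aggregation;          /* Actor_State.Aggregation in the PDU */
    bool actor_synchronization;      /* Actor_State.Synchronization in the PDU */
};

/* True when every compared parameter, including the key, is equal. */
static bool params_match(const struct port_params *a,
                         const struct port_params *b)
{
    return a->system_priority == b->system_priority &&
           memcmp(a->system, b->system, sizeof(a->system)) == 0 &&
           a->key == b->key &&
           a->port_priority == b->port_priority &&
           a->port_number == b->port_number &&
           a->aggregation == b->aggregation;
}

/*
 * Value that Partner_Oper_Port_State.Synchronization would take under the
 * two quoted rules: the sender must report Synchronization, LACP must be
 * maintaining the link, and either the sender is an Individual link or
 * its view of us matches our own parameters.
 */
static bool partner_in_sync(const struct port_params *actor,
                            const struct rx_pdu *pdu,
                            bool lacp_maintains_link)
{
    if (!pdu->actor_synchronization || !lacp_maintains_link)
        return false;
    if (!pdu->actor_aggregation)
        return true;                 /* Individual link */
    return params_match(actor, &pdu->partner_view);
}

/* Small usage example covering the three problems described above. */
static void partner_in_sync_examples(void)
{
    struct port_params me = {
        .system_priority = 0xffff, .system = {0, 1, 2, 3, 4, 5},
        .key = 0x21, .port_priority = 0xff, .port_number = 1,
        .aggregation = true,
    };
    struct rx_pdu pdu = {
        .actor_aggregation = true, .actor_synchronization = true,
    };
    pdu.partner_view = me;

    assert(partner_in_sync(&me, &pdu, true));   /* everything matches */

    pdu.partner_view.key = 0x22;                /* Problem #1: key differs */
    assert(!partner_in_sync(&me, &pdu, true));

    pdu.partner_view.key = me.key;
    pdu.actor_synchronization = false;          /* Problem #2: no sync bit */
    assert(!partner_in_sync(&me, &pdu, true));

    pdu.actor_aggregation = false;              /* Problem #3: still no sync */
    assert(!partner_in_sync(&me, &pdu, true));

    pdu.actor_synchronization = true;           /* Individual link, in sync */
    assert(partner_in_sync(&me, &pdu, true));
}
```

Writing the rule this way makes each of the three problems a single, separately testable condition, which may help when comparing it against the behaviour seen on the wire.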
My proposal is to replace the partner state sync flag evaluation block with the following one in order to follow the standard more strictly:

    /* If LACP partner params match this port actor params */
    if ((match == true &&
            lacp->partner.port_params.key == port->actor.key &&
            ACTOR_STATE(port, AGGREGATION) ==
                PARTNER_STATE(port, AGGREGATION) &&
            STATE_FLAG(lacp->actor.state, SYNCHRONIZATION) == true)
        ||
        (STATE_FLAG(lacp->actor.state, AGGREGATION) == false &&
            STATE_FLAG(lacp->actor.state, SYNCHRONIZATION) == true))
        PARTNER_STATE_SET(port, SYNCHRONIZATION);
    else
        PARTNER_STATE_CLR(port, SYNCHRONIZATION);

    ...

    #define STATE_FLAG(_p, _f) (!!CHECK_FLAGS(_p, STATE_ ## _f))

I am not sure yet whether the problems described above are what causes the driver to get stuck in a kind of deadlock in my application, but I think they might be the source of my problem.

Could someone take a look at my suggestions and help me find out why my LACP bonding port doesn't work correctly?

Thank you.

--
Alex Kiselev