From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from EUR01-HE1-obe.outbound.protection.outlook.com (mail-he1eur01on0066.outbound.protection.outlook.com [104.47.0.66]) by dpdk.org (Postfix) with ESMTP id 3720199F5; Mon, 31 Jul 2017 19:06:38 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Mellanox.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=edi7Y40z+dB/efzjPyeI0YqC+AqnJd2XjaKLQObUOPc=; b=hjBAPrc9xe00V4xB/WJyVi+/aBI1MYDQWLdkJVCWYdu/lFJNGSQIjec/vaZ8fNo9X3PZTGaUbNMUMb3KzNb4wgFTB24o40Oxwo5sLxN57IDGPmmE/4fr57Fu8e3sLpCkZ6QN/YJ1Qs+kO/4q1oUCUt9cSLOmmm6ny467qRKmbSY= Received: from DB6PR0502MB3048.eurprd05.prod.outlook.com (10.172.250.136) by DB6PR0501MB2760.eurprd05.prod.outlook.com (10.172.226.12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P256) id 15.1.1304.22; Mon, 31 Jul 2017 17:06:36 +0000 Received: from DB6PR0502MB3048.eurprd05.prod.outlook.com ([fe80::447d:390a:8e4b:5287]) by DB6PR0502MB3048.eurprd05.prod.outlook.com ([fe80::447d:390a:8e4b:5287%15]) with mapi id 15.01.1304.023; Mon, 31 Jul 2017 16:56:33 +0000 From: Matan Azrad To: Adrien Mazarguil CC: "dev@dpdk.org" , Thomas Monjalon , "Olga Shern" , "stable@dpdk.org" Thread-Topic: [PATCH] net/mlx4: workaround to verbs wrong error return Thread-Index: AQHTCgfEoeAeTg2ISkS844MN8M/g76JuAFFA Date: Mon, 31 Jul 2017 16:56:33 +0000 Message-ID: References: <1501499709-19873-1-git-send-email-matan@mellanox.com> <20170731141728.GO19852@6wind.com> In-Reply-To: <20170731141728.GO19852@6wind.com> Accept-Language: en-US, he-IL Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=none (sender IP is ) smtp.mailfrom=matan@mellanox.com; x-originating-ip: [85.64.136.190] x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1; DB6PR0501MB2760; 7:5+Ue15hr/62klhEo1IFlgBCLnGtghfFKhq6drH2Ut/yI0443gGqCe7YfdNgSiV1wTXJsEI+800/Lt1p1SS8+wSPChXeG3n5y9xuiv88jYpRMptDk/8GcGklRL8gnz38I1FZkBR+/N4fSBDz1hb7PKAN1q92w9BsyNklIlvm95imIvGaptwXW7QwGz+FcHD/Kzj+ctyBivPU9sy1y2i+GXZBpo8V4CbZvrHS6P9iyE/w2Jhb03SW07RNVGCCUXZUQ1FtZuFl/54jJB6t2giJKg8TwNAasdZ2IGlVZ15pwrygP78tf7dmPkxkQmkPqo2IWVAAskeF0ldj1mZZRGwvCrkmwbMZmYb2HcGMIT47EwWz+AhzjjDxmTuyUinRlgfkezV8XEfGFNuC19JkNJ9bJKP8qF06Uxf+byOBY6DkJXExCM+KGey8jMg7105ZYFdzFS1wrNUaRgec07NATFrRiN266EhChy2pV8BL3AcnRn9Acf1zsVViKa1EUiWwNdGxosUXBqxzR7v4SkpHDAD+Fjoy4XYscrrKvSOB8CJaE3lL0zafnvO4vsR2drcZb360MPMcQH4dGsfnb3+UfzJtyEY6QDHeheVdK+9nnS5MoDdq2I1DK9fKpL5Xe6dqLuwBAx9nmMCJOoXSA+5Rxzt086x0t37Q2+VOJuUTSa9FKSs188rrh2PWvNN8UrA/6SjLiB92zMU4zUQL56yToqp9UuPbQmwDGt83c7Cwmy/smw7flkkzSqONwOruLK2mnDx0aYCDCbWDzkD56/GWm5jGs2W2ubVYUEXT5SxPRSoZuNJc= x-ld-processed: a652971c-7d2e-4d9b-a6a4-d149256f461b,ExtAddr x-ms-office365-filtering-correlation-id: c1c13aba-ecd0-4391-bbe1-08d4d8351951 x-ms-office365-filtering-ht: Tenant x-microsoft-antispam: UriScan:; BCL:0; PCL:0; RULEID:(300000500095)(300135000095)(300000501095)(300135300095)(22001)(300000502095)(300135100095)(2017030254152)(48565401081)(300000503095)(300135400095)(2017052603031)(201703131423075)(201703031133081)(201702281549075)(300000504095)(300135200095)(300000505095)(300135600095)(300000506095)(300135500095); SRVR:DB6PR0501MB2760; x-ms-traffictypediagnostic: DB6PR0501MB2760: x-exchange-antispam-report-test: UriScan:; x-microsoft-antispam-prvs: x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(100000700101)(100105000095)(100000701101)(100105300095)(100000702101)(100105100095)(6040450)(601004)(2401047)(8121501046)(5005006)(100000703101)(100105400095)(93006095)(93001095)(10201501046)(3002001)(6055026)(6041248)(201703131423075)(201702281528075)(201703061421075)(201703061406153)(20161123562025)(20161123558100)(20161123564025)(20161123555025)(20161123560025)(6072148)(100000704101)(100105200095)(100000705101)(100105500095); SRVR:DB6PR0501MB2760; BCL:0; PCL:0; RULEID:(100000800101)(100110000095)(100000801101)(100110300095)(100000802101)(100110100095)(100000803101)(100110400095)(100000804101)(100110200095)(100000805101)(100110500095); SRVR:DB6PR0501MB2760; x-forefront-prvs: 03853D523D x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(6009001)(39400400002)(39850400002)(39450400003)(39860400002)(39410400002)(39840400002)(199003)(377454003)(13464003)(24454002)(189002)(6506006)(229853002)(53546010)(6916009)(2950100002)(2900100001)(97736004)(3280700002)(3660700001)(38730400002)(86362001)(7696004)(5660300001)(14454004)(110136004)(7736002)(68736007)(305945005)(8676002)(54356999)(5250100002)(66066001)(25786009)(9686003)(53936002)(81156014)(189998001)(2906002)(50986999)(81166006)(8936002)(4326008)(6246003)(76176999)(101416001)(99286003)(33656002)(55016002)(6436002)(102836003)(6116002)(478600001)(3846002)(54906002)(105586002)(74316002)(106356001); DIR:OUT; SFP:1101; SCL:1; SRVR:DB6PR0501MB2760; H:DB6PR0502MB3048.eurprd05.prod.outlook.com; FPR:; SPF:None; PTR:InfoNoRecords; A:1; MX:1; LANG:en; received-spf: None (protection.outlook.com: mellanox.com does not designate permitted sender hosts) spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: Mellanox.com X-MS-Exchange-CrossTenant-originalarrivaltime: 31 Jul 2017 16:56:33.3392 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: a652971c-7d2e-4d9b-a6a4-d149256f461b X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB6PR0501MB2760 Subject: Re: [dpdk-dev] [PATCH] net/mlx4: workaround to verbs wrong error return X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 31 Jul 2017 17:06:38 -0000 Hi Adrien > -----Original Message----- > From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com] > Sent: Monday, July 31, 2017 5:17 PM > To: Matan Azrad > Cc: dev@dpdk.org; Thomas Monjalon ; Olga Shern > ; stable@dpdk.org > Subject: Re: [PATCH] net/mlx4: workaround to verbs wrong error return >=20 > Hi Matan, >=20 > On Mon, Jul 31, 2017 at 02:15:09PM +0300, Matan Azrad wrote: > > Current mlx4 OFED version has bug which returns error to ibv destroy > > functions when the device was plugged out, in spite of the resources > > were destroyed correctly. > > > > Hence, failsafe PMD was aborted, only in debug mode, when it tries to > > remove the device in plug-out process. > > > > The workaround removed the ibv destroy assertions. > > > > DPDK 18.02 release should work with OFED-4.2 which will include the > > verbs fix to this bug, then, this patch will be removed. > > > > Signed-off-by: Matan Azrad > > Cc: stable@dpdk.org >=20 > Since this workaround is needed in order to validate hot-plug with mlx4 > compiled in debug mode due to a problem in Verbs, I don't think > stable@dpdk.org should be involved. >=20 Ok I'll remove it.=20 > What will be back-ported, once fixed, is the minimum OFED version to inst= all > to properly benefit from hot-plug functionality. >=20 > More comments about the patch below. >=20 > > --- > > drivers/net/mlx4/mlx4.c | 70 > +++++++++++++++++++++++++++++++++++--------- > > drivers/net/mlx4/mlx4_flow.c | 22 ++++++++++---- > > 2 files changed, 73 insertions(+), 19 deletions(-) > > > > diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c index > > 8451f5b..94782c2 100644 > > --- a/drivers/net/mlx4/mlx4.c > > +++ b/drivers/net/mlx4/mlx4.c > > @@ -955,7 +955,10 @@ struct rxq * > > return 0; > > error: > > if (mr_linear !=3D NULL) > > - claim_zero(ibv_dereg_mr(mr_linear)); > > + /* Current verbs does not allow to check real > > + * errors when the device was plugged out. > > + */ > > + ibv_dereg_mr(mr_linear); > > > > rte_free(elts_linear); > > rte_free(elts); > > @@ -992,7 +995,10 @@ struct rxq * > > txq->elts_linear =3D NULL; > > txq->mr_linear =3D NULL; > > if (mr_linear !=3D NULL) > > - claim_zero(ibv_dereg_mr(mr_linear)); > > + /* Current verbs does not allow to check real > > + * errors when the device was plugged out. > > + */ > > + ibv_dereg_mr(mr_linear); > > > > rte_free(elts_linear); > > if (elts =3D=3D NULL) > > @@ -1052,9 +1058,15 @@ struct rxq * > > ¶ms)); > > } > > if (txq->qp !=3D NULL) > > - claim_zero(ibv_destroy_qp(txq->qp)); > > + /* Current verbs does not allow to check real > > + * errors when the device was plugged out. > > + */ > > + ibv_destroy_qp(txq->qp); > > if (txq->cq !=3D NULL) > > - claim_zero(ibv_destroy_cq(txq->cq)); > > + /* Current verbs does not allow to check real > > + * errors when the device was plugged out. > > + */ > > + ibv_destroy_cq(txq->cq); > > if (txq->rd !=3D NULL) { > > struct ibv_exp_destroy_res_domain_attr attr =3D { > > .comp_mask =3D 0, > > @@ -1070,7 +1082,10 @@ struct rxq * > > if (txq->mp2mr[i].mp =3D=3D NULL) > > break; > > assert(txq->mp2mr[i].mr !=3D NULL); > > - claim_zero(ibv_dereg_mr(txq->mp2mr[i].mr)); > > + /* Current verbs does not allow to check real > > + * errors when the device was plugged out. > > + */ > > + ibv_dereg_mr(txq->mp2mr[i].mr); > > } > > memset(txq, 0, sizeof(*txq)); > > } > > @@ -1302,7 +1317,10 @@ static struct ibv_mr *mlx4_mp2mr(struct ibv_pd > *, struct rte_mempool *) > > DEBUG("%p: MR <-> MP table full, dropping oldest entry.", > > (void *)txq); > > --i; > > - claim_zero(ibv_dereg_mr(txq->mp2mr[0].mr)); > > + /* Current verbs does not allow to check real > > + * errors when the device was plugged out. > > + */ > > + ibv_dereg_mr(txq->mp2mr[0].mr); > > memmove(&txq->mp2mr[0], &txq->mp2mr[1], > > (sizeof(txq->mp2mr) - sizeof(txq->mp2mr[0]))); > > } > > @@ -2355,7 +2373,10 @@ struct txq_mp2mr_mbuf_check_data { > > (void *)rxq, > > (*mac)[0], (*mac)[1], (*mac)[2], (*mac)[3], (*mac)[4], (*mac)[5= ], > > mac_index, priv->vlan_filter[vlan_index].id); > > - claim_zero(ibv_destroy_flow(rxq- > >mac_flow[mac_index][vlan_index])); > > + /* Current verbs does not allow to check real > > + * errors when the device was plugged out. > > + */ > > + ibv_destroy_flow(rxq->mac_flow[mac_index][vlan_index]); > > rxq->mac_flow[mac_index][vlan_index] =3D NULL; } > > > > @@ -2736,7 +2757,10 @@ struct txq_mp2mr_mbuf_check_data { > > DEBUG("%p: disabling allmulticast mode", (void *)rxq); > > if (rxq->allmulti_flow =3D=3D NULL) > > return; > > - claim_zero(ibv_destroy_flow(rxq->allmulti_flow)); > > + /* Current verbs does not allow to check real > > + * errors when the device was plugged out. > > + */ > > + ibv_destroy_flow(rxq->allmulti_flow); > > rxq->allmulti_flow =3D NULL; > > DEBUG("%p: allmulticast mode disabled", (void *)rxq); } @@ -2796,7 > > +2820,10 @@ struct txq_mp2mr_mbuf_check_data { > > DEBUG("%p: disabling promiscuous mode", (void *)rxq); > > if (rxq->promisc_flow =3D=3D NULL) > > return; > > - claim_zero(ibv_destroy_flow(rxq->promisc_flow)); > > + /* Current verbs does not allow to check real > > + * errors when the device was plugged out. > > + */ > > + ibv_destroy_flow(rxq->promisc_flow); > > rxq->promisc_flow =3D NULL; > > DEBUG("%p: promiscuous mode disabled", (void *)rxq); } @@ - > 2847,9 > > +2874,15 @@ struct txq_mp2mr_mbuf_check_data { > > rxq_mac_addrs_del(rxq); > > } > > if (rxq->qp !=3D NULL) > > - claim_zero(ibv_destroy_qp(rxq->qp)); > > + /* Current verbs does not allow to check real > > + * errors when the device was plugged out. > > + */ > > + ibv_destroy_qp(rxq->qp); > > if (rxq->cq !=3D NULL) > > - claim_zero(ibv_destroy_cq(rxq->cq)); > > + /* Current verbs does not allow to check real > > + * errors when the device was plugged out. > > + */ > > + ibv_destroy_cq(rxq->cq); > > if (rxq->channel !=3D NULL) > > claim_zero(ibv_destroy_comp_channel(rxq->channel)); > > if (rxq->rd !=3D NULL) { > > @@ -2864,7 +2897,10 @@ struct txq_mp2mr_mbuf_check_data { > > &attr)); > > } > > if (rxq->mr !=3D NULL) > > - claim_zero(ibv_dereg_mr(rxq->mr)); > > + /* Current verbs does not allow to check real > > + * errors when the device was plugged out. > > + */ > > + ibv_dereg_mr(rxq->mr); > > memset(rxq, 0, sizeof(*rxq)); > > } > > > > @@ -4374,7 +4410,10 @@ struct txq_mp2mr_mbuf_check_data { > > priv_parent_list_cleanup(priv); > > if (priv->pd !=3D NULL) { > > assert(priv->ctx !=3D NULL); > > - claim_zero(ibv_dealloc_pd(priv->pd)); > > + /* Current verbs does not allow to check real > > + * errors when the device was plugged out. > > + */ > > + ibv_dealloc_pd(priv->pd); > > claim_zero(ibv_close_device(priv->ctx)); > > } else > > assert(priv->ctx =3D=3D NULL); > > @@ -6389,7 +6428,10 @@ struct txq_mp2mr_mbuf_check_data { > > port_error: > > rte_free(priv); > > if (pd) > > - claim_zero(ibv_dealloc_pd(pd)); > > + /* Current verbs does not allow to check real > > + * errors when the device was plugged out. > > + */ > > + ibv_dealloc_pd(pd); > > if (ctx) > > claim_zero(ibv_close_device(ctx)); > > if (eth_dev) > > diff --git a/drivers/net/mlx4/mlx4_flow.c > > b/drivers/net/mlx4/mlx4_flow.c index 925c89c..daa62e3 100644 > > --- a/drivers/net/mlx4/mlx4_flow.c > > +++ b/drivers/net/mlx4/mlx4_flow.c > > @@ -799,8 +799,11 @@ struct rte_flow_drop { > > struct rte_flow_drop *fdq =3D priv->flow_drop_queue; > > > > priv->flow_drop_queue =3D NULL; > > - claim_zero(ibv_destroy_qp(fdq->qp)); > > - claim_zero(ibv_destroy_cq(fdq->cq)); > > + /* Current verbs does not allow to check real > > + * errors when the device was plugged out. > > + */ > > + ibv_destroy_qp(fdq->qp); > > + ibv_destroy_cq(fdq->cq); > > rte_free(fdq); > > } > > } > > @@ -860,7 +863,10 @@ struct rte_flow_drop { > > priv->flow_drop_queue =3D fdq; > > return 0; > > err_create_qp: > > - claim_zero(ibv_destroy_cq(cq)); > > + /* Current verbs does not allow to check real > > + * errors when the device was plugged out. > > + */ > > + ibv_destroy_cq(cq); > > err_create_cq: > > rte_free(fdq); > > err: > > @@ -1200,7 +1206,10 @@ struct rte_flow * > > (void)priv; > > LIST_REMOVE(flow, next); > > if (flow->ibv_flow) > > - claim_zero(ibv_destroy_flow(flow->ibv_flow)); > > + /* Current verbs does not allow to check real > > + * errors when the device was plugged out. > > + */ > > + ibv_destroy_flow(flow->ibv_flow); > > rte_free(flow->ibv_attr); > > DEBUG("Flow destroyed %p", (void *)flow); > > rte_free(flow); > > @@ -1278,7 +1287,10 @@ struct rte_flow * > > for (flow =3D LIST_FIRST(&priv->flows); > > flow; > > flow =3D LIST_NEXT(flow, next)) { > > - claim_zero(ibv_destroy_flow(flow->ibv_flow)); > > + /* Current verbs does not allow to check real > > + * errors when the device was plugged out. > > + */ > > + ibv_destroy_flow(flow->ibv_flow); > > flow->ibv_flow =3D NULL; > > DEBUG("Flow %p removed", (void *)flow); > > } > > -- > > 1.8.3.1 > > >=20 > This approach looks way too intrusive. How about making the claim_zero() > definition not fail but still complain when compiled against a broken Ver= bs > version instead? >=20 > #include "mlx4_autoconf.h" >=20 > [...] >=20 > #ifndef HAVE_BROKEN_VERBS > #define claim_zero(...) assert((__VA_ARGS__) =3D=3D 0) > #else /* HAVE_BROKEN_VERBS */ > #define claim_zero(...) \ > (void)(((__VA_ARGS__) =3D=3D 0) || \ > DEBUG("Assertion `" # __VA_ARGS__ "' failed (IGNORED)")) > #endif /* HAVE_BROKEN_VERBS */ >=20 > You could use auto-config-h.sh to generate the HAVE_BROKEN_VERBS > definition > in mlx4_autoconf.h (see mlx4 Makefile) based on some symbol, macro or > type > that only exists or doesn't exist yet in problematic releases for instanc= e. >=20 I agree with the dependence on broken verbs but=20 there are other places in mlx4 code which use claim_zero assertion, So this suggestion will hurt other validations. What's about to create new define depend on broken verbs for the specific a= ssertions? It will be still intrusive but more accurate. > -- > Adrien Mazarguil > 6WIND