From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f51.google.com (mail-wm0-f51.google.com [74.125.82.51]) by dpdk.org (Postfix) with ESMTP id 4DABF7CA9 for ; Mon, 31 Jul 2017 16:17:38 +0200 (CEST) Received: by mail-wm0-f51.google.com with SMTP id m85so77261437wma.0 for ; Mon, 31 Jul 2017 07:17:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=phDaadzblm83ntDfeKdMEtQkQmgYb4Jmx68pS/COyVs=; b=u7lAwomZrq9ldv+t27qUl5V1HHN51bgSx/1oojqPQoM+adQlHXcDOe2vdjcRfWYr3l 6d+LpzsqrsjMJNAfWG89YoEbb4YrwdaVVHevucRTiBk7cO8y/3dKa5rm97VLpHJTXzLz 9yBE7aLsYlICdbDfsigAw8NJywI0TLwafML29haKHmUQWHkgl/lePZ8eswR3rDONUUH3 SqPXBCU3lVrqPnSlEkiAl1L1wi8wv3/lLXkOyvHc41V2D16XaQMKdg2GugHjHU35oPmQ bW3FrzM2GEaK5xFMTw8eWZvoY4J+ZVhBjPAbDPkuiyDcgFSCco4Jn5nP5x9j0d4/+VXS FFbQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=phDaadzblm83ntDfeKdMEtQkQmgYb4Jmx68pS/COyVs=; b=rU10e8uRqTygqS7pi9CEEK3npcXXEq7KrHNAYZlrCrtnryZjb0UzGi23T9KA9ZALiH cLr6S3cJiLHuwxz9QSm/+6s2nrz2g3SoA4lRXl0DJRlUK4DyU3vU29kzgvvMV0ULEHo4 C4A2TIJV0/xMDw1kHwBK6zjgz6tiiQh51dxSsm1Hw6IOXpynVJ+huzSptSawOoQR80+T +8bTsyqjm/PlRmfZ2t7na5BT+kBZJsprx4wRVTgw6nebS21IUl2PbHxHvjVlX4ALXZko izG1ke3/0+qd3opaytf4t02jhYVJEfCC/3rYRBMyQlXcpycnzalA+5eei9QmaoQdUNmy tm5Q== X-Gm-Message-State: AIVw110Pz27nZXFmpQWRGBxNAOeBKdS4beSnZ0XAWG/mhgjRkwYGFF34 QudGVkZcLfX1e8ZL X-Received: by 10.28.38.133 with SMTP id m127mr6871755wmm.6.1501510657793; Mon, 31 Jul 2017 07:17:37 -0700 (PDT) Received: from 6wind.com (host.78.145.23.62.rev.coltfrance.com. [62.23.145.78]) by smtp.gmail.com with ESMTPSA id g93sm16638269wrd.11.2017.07.31.07.17.36 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 31 Jul 2017 07:17:36 -0700 (PDT) Date: Mon, 31 Jul 2017 16:17:28 +0200 From: Adrien Mazarguil To: Matan Azrad Cc: dev@dpdk.org, Thomas Monjalon , Olga Shern , stable@dpdk.org Message-ID: <20170731141728.GO19852@6wind.com> References: <1501499709-19873-1-git-send-email-matan@mellanox.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1501499709-19873-1-git-send-email-matan@mellanox.com> Subject: Re: [dpdk-dev] [PATCH] net/mlx4: workaround to verbs wrong error return X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 31 Jul 2017 14:17:38 -0000 Hi Matan, On Mon, Jul 31, 2017 at 02:15:09PM +0300, Matan Azrad wrote: > Current mlx4 OFED version has bug which returns error to > ibv destroy functions when the device was plugged out, in > spite of the resources were destroyed correctly. > > Hence, failsafe PMD was aborted, only in debug mode, when > it tries to remove the device in plug-out process. > > The workaround removed the ibv destroy assertions. > > DPDK 18.02 release should work with OFED-4.2 which will > include the verbs fix to this bug, then, this patch will > be removed. > > Signed-off-by: Matan Azrad > Cc: stable@dpdk.org Since this workaround is needed in order to validate hot-plug with mlx4 compiled in debug mode due to a problem in Verbs, I don't think stable@dpdk.org should be involved. What will be back-ported, once fixed, is the minimum OFED version to install to properly benefit from hot-plug functionality. More comments about the patch below. > --- > drivers/net/mlx4/mlx4.c | 70 +++++++++++++++++++++++++++++++++++--------- > drivers/net/mlx4/mlx4_flow.c | 22 ++++++++++---- > 2 files changed, 73 insertions(+), 19 deletions(-) > > diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c > index 8451f5b..94782c2 100644 > --- a/drivers/net/mlx4/mlx4.c > +++ b/drivers/net/mlx4/mlx4.c > @@ -955,7 +955,10 @@ struct rxq * > return 0; > error: > if (mr_linear != NULL) > - claim_zero(ibv_dereg_mr(mr_linear)); > + /* Current verbs does not allow to check real > + * errors when the device was plugged out. > + */ > + ibv_dereg_mr(mr_linear); > > rte_free(elts_linear); > rte_free(elts); > @@ -992,7 +995,10 @@ struct rxq * > txq->elts_linear = NULL; > txq->mr_linear = NULL; > if (mr_linear != NULL) > - claim_zero(ibv_dereg_mr(mr_linear)); > + /* Current verbs does not allow to check real > + * errors when the device was plugged out. > + */ > + ibv_dereg_mr(mr_linear); > > rte_free(elts_linear); > if (elts == NULL) > @@ -1052,9 +1058,15 @@ struct rxq * > ¶ms)); > } > if (txq->qp != NULL) > - claim_zero(ibv_destroy_qp(txq->qp)); > + /* Current verbs does not allow to check real > + * errors when the device was plugged out. > + */ > + ibv_destroy_qp(txq->qp); > if (txq->cq != NULL) > - claim_zero(ibv_destroy_cq(txq->cq)); > + /* Current verbs does not allow to check real > + * errors when the device was plugged out. > + */ > + ibv_destroy_cq(txq->cq); > if (txq->rd != NULL) { > struct ibv_exp_destroy_res_domain_attr attr = { > .comp_mask = 0, > @@ -1070,7 +1082,10 @@ struct rxq * > if (txq->mp2mr[i].mp == NULL) > break; > assert(txq->mp2mr[i].mr != NULL); > - claim_zero(ibv_dereg_mr(txq->mp2mr[i].mr)); > + /* Current verbs does not allow to check real > + * errors when the device was plugged out. > + */ > + ibv_dereg_mr(txq->mp2mr[i].mr); > } > memset(txq, 0, sizeof(*txq)); > } > @@ -1302,7 +1317,10 @@ static struct ibv_mr *mlx4_mp2mr(struct ibv_pd *, struct rte_mempool *) > DEBUG("%p: MR <-> MP table full, dropping oldest entry.", > (void *)txq); > --i; > - claim_zero(ibv_dereg_mr(txq->mp2mr[0].mr)); > + /* Current verbs does not allow to check real > + * errors when the device was plugged out. > + */ > + ibv_dereg_mr(txq->mp2mr[0].mr); > memmove(&txq->mp2mr[0], &txq->mp2mr[1], > (sizeof(txq->mp2mr) - sizeof(txq->mp2mr[0]))); > } > @@ -2355,7 +2373,10 @@ struct txq_mp2mr_mbuf_check_data { > (void *)rxq, > (*mac)[0], (*mac)[1], (*mac)[2], (*mac)[3], (*mac)[4], (*mac)[5], > mac_index, priv->vlan_filter[vlan_index].id); > - claim_zero(ibv_destroy_flow(rxq->mac_flow[mac_index][vlan_index])); > + /* Current verbs does not allow to check real > + * errors when the device was plugged out. > + */ > + ibv_destroy_flow(rxq->mac_flow[mac_index][vlan_index]); > rxq->mac_flow[mac_index][vlan_index] = NULL; > } > > @@ -2736,7 +2757,10 @@ struct txq_mp2mr_mbuf_check_data { > DEBUG("%p: disabling allmulticast mode", (void *)rxq); > if (rxq->allmulti_flow == NULL) > return; > - claim_zero(ibv_destroy_flow(rxq->allmulti_flow)); > + /* Current verbs does not allow to check real > + * errors when the device was plugged out. > + */ > + ibv_destroy_flow(rxq->allmulti_flow); > rxq->allmulti_flow = NULL; > DEBUG("%p: allmulticast mode disabled", (void *)rxq); > } > @@ -2796,7 +2820,10 @@ struct txq_mp2mr_mbuf_check_data { > DEBUG("%p: disabling promiscuous mode", (void *)rxq); > if (rxq->promisc_flow == NULL) > return; > - claim_zero(ibv_destroy_flow(rxq->promisc_flow)); > + /* Current verbs does not allow to check real > + * errors when the device was plugged out. > + */ > + ibv_destroy_flow(rxq->promisc_flow); > rxq->promisc_flow = NULL; > DEBUG("%p: promiscuous mode disabled", (void *)rxq); > } > @@ -2847,9 +2874,15 @@ struct txq_mp2mr_mbuf_check_data { > rxq_mac_addrs_del(rxq); > } > if (rxq->qp != NULL) > - claim_zero(ibv_destroy_qp(rxq->qp)); > + /* Current verbs does not allow to check real > + * errors when the device was plugged out. > + */ > + ibv_destroy_qp(rxq->qp); > if (rxq->cq != NULL) > - claim_zero(ibv_destroy_cq(rxq->cq)); > + /* Current verbs does not allow to check real > + * errors when the device was plugged out. > + */ > + ibv_destroy_cq(rxq->cq); > if (rxq->channel != NULL) > claim_zero(ibv_destroy_comp_channel(rxq->channel)); > if (rxq->rd != NULL) { > @@ -2864,7 +2897,10 @@ struct txq_mp2mr_mbuf_check_data { > &attr)); > } > if (rxq->mr != NULL) > - claim_zero(ibv_dereg_mr(rxq->mr)); > + /* Current verbs does not allow to check real > + * errors when the device was plugged out. > + */ > + ibv_dereg_mr(rxq->mr); > memset(rxq, 0, sizeof(*rxq)); > } > > @@ -4374,7 +4410,10 @@ struct txq_mp2mr_mbuf_check_data { > priv_parent_list_cleanup(priv); > if (priv->pd != NULL) { > assert(priv->ctx != NULL); > - claim_zero(ibv_dealloc_pd(priv->pd)); > + /* Current verbs does not allow to check real > + * errors when the device was plugged out. > + */ > + ibv_dealloc_pd(priv->pd); > claim_zero(ibv_close_device(priv->ctx)); > } else > assert(priv->ctx == NULL); > @@ -6389,7 +6428,10 @@ struct txq_mp2mr_mbuf_check_data { > port_error: > rte_free(priv); > if (pd) > - claim_zero(ibv_dealloc_pd(pd)); > + /* Current verbs does not allow to check real > + * errors when the device was plugged out. > + */ > + ibv_dealloc_pd(pd); > if (ctx) > claim_zero(ibv_close_device(ctx)); > if (eth_dev) > diff --git a/drivers/net/mlx4/mlx4_flow.c b/drivers/net/mlx4/mlx4_flow.c > index 925c89c..daa62e3 100644 > --- a/drivers/net/mlx4/mlx4_flow.c > +++ b/drivers/net/mlx4/mlx4_flow.c > @@ -799,8 +799,11 @@ struct rte_flow_drop { > struct rte_flow_drop *fdq = priv->flow_drop_queue; > > priv->flow_drop_queue = NULL; > - claim_zero(ibv_destroy_qp(fdq->qp)); > - claim_zero(ibv_destroy_cq(fdq->cq)); > + /* Current verbs does not allow to check real > + * errors when the device was plugged out. > + */ > + ibv_destroy_qp(fdq->qp); > + ibv_destroy_cq(fdq->cq); > rte_free(fdq); > } > } > @@ -860,7 +863,10 @@ struct rte_flow_drop { > priv->flow_drop_queue = fdq; > return 0; > err_create_qp: > - claim_zero(ibv_destroy_cq(cq)); > + /* Current verbs does not allow to check real > + * errors when the device was plugged out. > + */ > + ibv_destroy_cq(cq); > err_create_cq: > rte_free(fdq); > err: > @@ -1200,7 +1206,10 @@ struct rte_flow * > (void)priv; > LIST_REMOVE(flow, next); > if (flow->ibv_flow) > - claim_zero(ibv_destroy_flow(flow->ibv_flow)); > + /* Current verbs does not allow to check real > + * errors when the device was plugged out. > + */ > + ibv_destroy_flow(flow->ibv_flow); > rte_free(flow->ibv_attr); > DEBUG("Flow destroyed %p", (void *)flow); > rte_free(flow); > @@ -1278,7 +1287,10 @@ struct rte_flow * > for (flow = LIST_FIRST(&priv->flows); > flow; > flow = LIST_NEXT(flow, next)) { > - claim_zero(ibv_destroy_flow(flow->ibv_flow)); > + /* Current verbs does not allow to check real > + * errors when the device was plugged out. > + */ > + ibv_destroy_flow(flow->ibv_flow); > flow->ibv_flow = NULL; > DEBUG("Flow %p removed", (void *)flow); > } > -- > 1.8.3.1 > This approach looks way too intrusive. How about making the claim_zero() definition not fail but still complain when compiled against a broken Verbs version instead? #include "mlx4_autoconf.h" [...] #ifndef HAVE_BROKEN_VERBS #define claim_zero(...) assert((__VA_ARGS__) == 0) #else /* HAVE_BROKEN_VERBS */ #define claim_zero(...) \ (void)(((__VA_ARGS__) == 0) || \ DEBUG("Assertion `" # __VA_ARGS__ "' failed (IGNORED)")) #endif /* HAVE_BROKEN_VERBS */ You could use auto-config-h.sh to generate the HAVE_BROKEN_VERBS definition in mlx4_autoconf.h (see mlx4 Makefile) based on some symbol, macro or type that only exists or doesn't exist yet in problematic releases for instance. -- Adrien Mazarguil 6WIND