From mboxrd@z Thu Jan 1 00:00:00 1970
From: Slava Ovsiienko
To: Yongseok Koh
CC: Shahaf Shuler, "dev@dpdk.org"
Date: Thu, 25 Oct 2018 20:32:23 +0000
References: <1538461807-37507-1-git-send-email-viacheslavo@mellanox.com>
 <1539612815-47199-1-git-send-email-viacheslavo@mellanox.com>
 <1539612815-47199-8-git-send-email-viacheslavo@mellanox.com>
 <20181025003636.GC26874@mtidpdk.mti.labs.mlnx>
In-Reply-To: <20181025003636.GC26874@mtidpdk.mti.labs.mlnx>
Subject: Re: [dpdk-dev] [PATCH v2 7/7] net/mlx5: e-switch VXLAN rule cleanup routines
List-Id: DPDK patches and
discussions

> -----Original Message-----
> From: Yongseok Koh
> Sent: Thursday, October 25, 2018 3:37
> To: Slava Ovsiienko
> Cc: Shahaf Shuler; dev@dpdk.org
> Subject: Re: [PATCH v2 7/7] net/mlx5: e-switch VXLAN rule cleanup routines
>
> On Mon, Oct 15, 2018 at 02:13:35PM +0000, Viacheslav Ovsiienko wrote:
> > The last part of the patchset contains the rule cleanup routines.
> > These are part of the outer interface initialization at the moment
> > of VXLAN VTEP attaching. These routines query the list of attached
> > VXLAN devices, the list of local IP addresses with peer and link scope
> > attributes, and the list of permanent neigh rules; then all such items
> > found on the specified outer device are flushed.
> >
> > Suggested-by: Adrien Mazarguil
> > Signed-off-by: Viacheslav Ovsiienko
> > ---
> >  drivers/net/mlx5/mlx5_flow_tcf.c | 505 ++++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 499 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5_flow_tcf.c b/drivers/net/mlx5/mlx5_flow_tcf.c
> > index a1d7733..a3348ea 100644
> > --- a/drivers/net/mlx5/mlx5_flow_tcf.c
> > +++ b/drivers/net/mlx5/mlx5_flow_tcf.c
> > @@ -4012,6 +4012,502 @@ static LIST_HEAD(, mlx5_flow_tcf_vtep)
> >  }
> >  #endif /* HAVE_IFLA_VXLAN_COLLECT_METADATA */
> >
> > +#define MNL_REQUEST_SIZE_MIN 256
> > +#define MNL_REQUEST_SIZE_MAX 2048
> > +#define MNL_REQUEST_SIZE RTE_MIN(RTE_MAX(sysconf(_SC_PAGESIZE), \
> > +				 MNL_REQUEST_SIZE_MIN), MNL_REQUEST_SIZE_MAX)
> > +
> > +/* Data structures used by flow_tcf_xxx_cb() routines. */
> > +struct tcf_nlcb_buf {
> > +	LIST_ENTRY(tcf_nlcb_buf) next;
> > +	uint32_t size;
> > +	alignas(struct nlmsghdr)
> > +	uint8_t msg[]; /**< Netlink message data. */
> > +};
> > +
> > +struct tcf_nlcb_context {
> > +	unsigned int ifindex; /**< Base interface index. */
> > +	uint32_t bufsize;
> > +	LIST_HEAD(, tcf_nlcb_buf) nlbuf;
> > +};
> > +
> > +/**
> > + * Allocate space for netlink command in buffer list
> > + *
> > + * @param[in, out] ctx
> > + *   Pointer to callback context with command buffers list.
> > + * @param[in] size
> > + *   Required size of data buffer to be allocated.
> > + *
> > + * @return
> > + *   Pointer to allocated memory, aligned as message header.
> > + *   NULL if some error occurred.
> > + */
> > +static struct nlmsghdr *
> > +flow_tcf_alloc_nlcmd(struct tcf_nlcb_context *ctx, uint32_t size)
> > +{
> > +	struct tcf_nlcb_buf *buf;
> > +	struct nlmsghdr *nlh;
> > +
> > +	size = NLMSG_ALIGN(size);
> > +	buf = LIST_FIRST(&ctx->nlbuf);
> > +	if (buf && (buf->size + size) <= ctx->bufsize) {
> > +		nlh = (struct nlmsghdr *)&buf->msg[buf->size];
> > +		buf->size += size;
> > +		return nlh;
> > +	}
> > +	if (size > ctx->bufsize) {
> > +		DRV_LOG(WARNING, "netlink: too long command buffer requested");
> > +		return NULL;
> > +	}
> > +	buf = rte_malloc(__func__,
> > +			ctx->bufsize + sizeof(struct tcf_nlcb_buf),
> > +			alignof(struct tcf_nlcb_buf));
> > +	if (!buf) {
> > +		DRV_LOG(WARNING, "netlink: no memory for command buffer");
> > +		return NULL;
> > +	}
> > +	LIST_INSERT_HEAD(&ctx->nlbuf, buf, next);
> > +	buf->size = size;
> > +	nlh = (struct nlmsghdr *)&buf->msg[0];
> > +	return nlh;
> > +}
> > +
> > +/**
> > + * Set NLM_F_ACK flags in the last netlink command in buffer.
> > + * Only last command in the buffer will be acked by system.
> > + *
> > + * @param[in, out] buf
> > + *   Pointer to buffer with netlink commands.
> > + */
> > +static void
> > +flow_tcf_setack_nlcmd(struct tcf_nlcb_buf *buf)
> > +{
> > +	struct nlmsghdr *nlh;
> > +	uint32_t size = 0;
> > +
> > +	assert(buf->size);
> > +	do {
> > +		nlh = (struct nlmsghdr *)&buf->msg[size];
> > +		size += NLMSG_ALIGN(nlh->nlmsg_len);
> > +		if (size >= buf->size) {
> > +			nlh->nlmsg_flags |= NLM_F_ACK;
> > +			break;
> > +		}
> > +	} while (true);
> > +}
> > +
> > +/**
> > + * Send the buffers with prepared netlink commands. Scans the list and
> > + * sends all found buffers. Buffers are sent and freed anyway in order
> > + * to prevent memory leakage if some error occurs.
> > + *
> > + * @param[in] tcf
> > + *   Context object initialized by mlx5_flow_tcf_context_create().
> > + * @param[in, out] ctx
> > + *   Pointer to callback context with command buffers list.
> > + *
> > + * @return
> > + *   Zero value on success, negative errno value otherwise
> > + *   and rte_errno is set.
> > + */
> > +static int
> > +flow_tcf_send_nlcmd(struct mlx5_flow_tcf_context *tcf,
> > +		    struct tcf_nlcb_context *ctx)
> > +{
> > +	struct tcf_nlcb_buf *bc, *bn;
> > +	struct nlmsghdr *nlh;
> > +	int ret = 0;
> > +
> > +	bc = LIST_FIRST(&ctx->nlbuf);
> > +	while (bc) {
> > +		int rc;
> > +
> > +		bn = LIST_NEXT(bc, next);
> > +		if (bc->size) {
> > +			flow_tcf_setack_nlcmd(bc);
> > +			nlh = (struct nlmsghdr *)&bc->msg;
> > +			rc = flow_tcf_nl_ack(tcf, nlh, bc->size, NULL, NULL);
> > +			if (rc && !ret)
> > +				ret = rc;
> > +		}
> > +		rte_free(bc);
> > +		bc = bn;
> > +	}
> > +	LIST_INIT(&ctx->nlbuf);
> > +	return ret;
> > +}
> > +
> > +/**
> > + * Collect local IP address rules with scope link attribute on specified
> > + * network device. This is callback routine called by libmnl mnl_cb_run()
> > + * in loop for every message in received packet.
> > + *
> > + * @param[in] nlh
> > + *   Pointer to reply header.
> > + * @param[in, out] arg
> > + *   Opaque data pointer for this callback.
> > + *
> > + * @return
> > + *   A positive, nonzero value on success, negative errno value otherwise
> > + *   and rte_errno is set.
> > + */
> > +static int
> > +flow_tcf_collect_local_cb(const struct nlmsghdr *nlh, void *arg)
> > +{
> > +	struct tcf_nlcb_context *ctx = arg;
> > +	struct nlmsghdr *cmd;
> > +	struct ifaddrmsg *ifa;
> > +	struct nlattr *na;
> > +	struct nlattr *na_local = NULL;
> > +	struct nlattr *na_peer = NULL;
> > +	unsigned char family;
> > +
> > +	if (nlh->nlmsg_type != RTM_NEWADDR) {
> > +		rte_errno = EINVAL;
> > +		return -rte_errno;
> > +	}
> > +	ifa = mnl_nlmsg_get_payload(nlh);
> > +	family = ifa->ifa_family;
> > +	if (ifa->ifa_index != ctx->ifindex ||
> > +	    ifa->ifa_scope != RT_SCOPE_LINK ||
> > +	    !(ifa->ifa_flags & IFA_F_PERMANENT) ||
> > +	    (family != AF_INET && family != AF_INET6))
> > +		return 1;
> > +	mnl_attr_for_each(na, nlh, sizeof(*ifa)) {
> > +		switch (mnl_attr_get_type(na)) {
> > +		case IFA_LOCAL:
> > +			na_local = na;
> > +			break;
> > +		case IFA_ADDRESS:
> > +			na_peer = na;
> > +			break;
> > +		}
> > +		if (na_local && na_peer)
> > +			break;
> > +	}
> > +	if (!na_local || !na_peer)
> > +		return 1;
> > +	/* Local rule found with scope link, permanent and assigned peer. */
> > +	cmd = flow_tcf_alloc_nlcmd(ctx, MNL_ALIGN(sizeof(struct nlmsghdr)) +
> > +					MNL_ALIGN(sizeof(struct ifaddrmsg)) +
> > +					(family == AF_INET6
> > +					? 2 * SZ_NLATTR_DATA_OF(IPV6_ADDR_LEN)
> > +					: 2 * SZ_NLATTR_TYPE_OF(uint32_t)));
>
> Better to use IPV4_ADDR_LEN instead?
>

OK.
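
For reference, the size computation with an explicit IPv4 length might look like the sketch below. It is only an illustration under stated assumptions: ATTR_SIZE() and deladdr_cmd_size() are hypothetical names (the former standing in for the driver's SZ_NLATTR_DATA_OF()), the values 4 and 16 for IPV4_ADDR_LEN and IPV6_ADDR_LEN are assumed rather than taken from the patch, and the kernel's NLA_ALIGN()/NLMSG_ALIGN() macros are used in place of the libmnl ones.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/if_addr.h>

/* Assumed values; the thread only names these constants. */
#define IPV4_ADDR_LEN 4
#define IPV6_ADDR_LEN 16

/* Hypothetical stand-in for the driver's SZ_NLATTR_DATA_OF() macro:
 * attribute header plus payload, rounded up to netlink alignment. */
#define ATTR_SIZE(len) NLA_ALIGN(NLA_HDRLEN + (len))

/* Bytes needed for one RTM_DELADDR command carrying IFA_LOCAL and
 * IFA_ADDRESS attributes of the given address family. */
static size_t
deladdr_cmd_size(unsigned char family)
{
	size_t addr_len = (family == AF_INET6) ? IPV6_ADDR_LEN
					       : IPV4_ADDR_LEN;

	return NLMSG_ALIGN(sizeof(struct nlmsghdr)) +
	       NLMSG_ALIGN(sizeof(struct ifaddrmsg)) +
	       2 * ATTR_SIZE(addr_len);
}
```

Under these assumptions an IPv4 command takes 40 bytes and an IPv6 command 64 bytes, so several commands fit even into the minimum 256-byte request buffer.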
> > +	if (!cmd) {
> > +		rte_errno = ENOMEM;
> > +		return -rte_errno;
> > +	}
> > +	cmd = mnl_nlmsg_put_header(cmd);
> > +	cmd->nlmsg_type = RTM_DELADDR;
> > +	cmd->nlmsg_flags = NLM_F_REQUEST;
> > +	ifa = mnl_nlmsg_put_extra_header(cmd, sizeof(*ifa));
> > +	ifa->ifa_flags = IFA_F_PERMANENT;
> > +	ifa->ifa_scope = RT_SCOPE_LINK;
> > +	ifa->ifa_index = ctx->ifindex;
> > +	if (family == AF_INET) {
> > +		ifa->ifa_family = AF_INET;
> > +		ifa->ifa_prefixlen = 32;
> > +		mnl_attr_put_u32(cmd, IFA_LOCAL, mnl_attr_get_u32(na_local));
> > +		mnl_attr_put_u32(cmd, IFA_ADDRESS, mnl_attr_get_u32(na_peer));
> > +	} else {
> > +		ifa->ifa_family = AF_INET6;
> > +		ifa->ifa_prefixlen = 128;
> > +		mnl_attr_put(cmd, IFA_LOCAL, IPV6_ADDR_LEN,
> > +			     mnl_attr_get_payload(na_local));
> > +		mnl_attr_put(cmd, IFA_ADDRESS, IPV6_ADDR_LEN,
> > +			     mnl_attr_get_payload(na_peer));
> > +	}
> > +	return 1;
> > +}
> > +
> > +/**
> > + * Cleanup the local IP addresses on outer interface.
> > + *
> > + * @param[in] tcf
> > + *   Context object initialized by mlx5_flow_tcf_context_create().
> > + * @param[in] ifindex
> > + *   Network interface index to perform cleanup.
> > + */
> > +static void
> > +flow_tcf_encap_local_cleanup(struct mlx5_flow_tcf_context *tcf,
> > +			     unsigned int ifindex)
> > +{
> > +	struct nlmsghdr *nlh;
> > +	struct ifaddrmsg *ifa;
> > +	struct tcf_nlcb_context ctx = {
> > +		.ifindex = ifindex,
> > +		.bufsize = MNL_REQUEST_SIZE,
> > +		.nlbuf = LIST_HEAD_INITIALIZER(),
> > +	};
> > +	int ret;
> > +
> > +	assert(ifindex);
> > +	/*
> > +	 * Seek and destroy leftovers of local IP addresses with
> > +	 * matching properties "scope link".
> > +	 */
> > +	nlh = mnl_nlmsg_put_header(tcf->buf);
> > +	nlh->nlmsg_type = RTM_GETADDR;
> > +	nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
> > +	ifa = mnl_nlmsg_put_extra_header(nlh, sizeof(*ifa));
> > +	ifa->ifa_family = AF_UNSPEC;
> > +	ifa->ifa_index = ifindex;
> > +	ifa->ifa_scope = RT_SCOPE_LINK;
> > +	ret = flow_tcf_nl_ack(tcf, nlh, 0, flow_tcf_collect_local_cb, &ctx);
> > +	if (ret)
> > +		DRV_LOG(WARNING, "netlink: query device list error %d", ret);
> > +	ret = flow_tcf_send_nlcmd(tcf, &ctx);
> > +	if (ret)
> > +		DRV_LOG(WARNING, "netlink: device delete error %d", ret);
> > +}
> > +
> > +/**
> > + * Collect permanent neigh rules on specified network device.
> > + * This is callback routine called by libmnl mnl_cb_run() in loop for
> > + * every message in received packet.
> > + *
> > + * @param[in] nlh
> > + *   Pointer to reply header.
> > + * @param[in, out] arg
> > + *   Opaque data pointer for this callback.
> > + *
> > + * @return
> > + *   A positive, nonzero value on success, negative errno value otherwise
> > + *   and rte_errno is set.
> > + */
> > +static int
> > +flow_tcf_collect_neigh_cb(const struct nlmsghdr *nlh, void *arg)
> > +{
> > +	struct tcf_nlcb_context *ctx = arg;
> > +	struct nlmsghdr *cmd;
> > +	struct ndmsg *ndm;
> > +	struct nlattr *na;
> > +	struct nlattr *na_ip = NULL;
> > +	struct nlattr *na_mac = NULL;
> > +	unsigned char family;
> > +
> > +	if (nlh->nlmsg_type != RTM_NEWNEIGH) {
> > +		rte_errno = EINVAL;
> > +		return -rte_errno;
> > +	}
> > +	ndm = mnl_nlmsg_get_payload(nlh);
> > +	family = ndm->ndm_family;
> > +	if (ndm->ndm_ifindex != (int)ctx->ifindex ||
> > +	    !(ndm->ndm_state & NUD_PERMANENT) ||
> > +	    (family != AF_INET && family != AF_INET6))
> > +		return 1;
> > +	mnl_attr_for_each(na, nlh, sizeof(*ndm)) {
> > +		switch (mnl_attr_get_type(na)) {
> > +		case NDA_DST:
> > +			na_ip = na;
> > +			break;
> > +		case NDA_LLADDR:
> > +			na_mac = na;
> > +			break;
> > +		}
> > +		if (na_mac && na_ip)
> > +			break;
> > +	}
> > +	if (!na_mac || !na_ip)
> > +		return 1;
> > +	/* Neigh rule with permanent attribute found. */
> > +	cmd = flow_tcf_alloc_nlcmd(ctx, MNL_ALIGN(sizeof(struct nlmsghdr)) +
> > +					MNL_ALIGN(sizeof(struct ndmsg)) +
> > +					SZ_NLATTR_DATA_OF(ETHER_ADDR_LEN) +
> > +					(family == AF_INET6
> > +					? SZ_NLATTR_DATA_OF(IPV6_ADDR_LEN)
> > +					: SZ_NLATTR_TYPE_OF(uint32_t)));
>
> Better to use IPV4_ADDR_LEN instead?
>
> > +	if (!cmd) {
> > +		rte_errno = ENOMEM;
> > +		return -rte_errno;
> > +	}
> > +	cmd = mnl_nlmsg_put_header(cmd);
> > +	cmd->nlmsg_type = RTM_DELNEIGH;
> > +	cmd->nlmsg_flags = NLM_F_REQUEST;
> > +	ndm = mnl_nlmsg_put_extra_header(cmd, sizeof(*ndm));
> > +	ndm->ndm_ifindex = ctx->ifindex;
> > +	ndm->ndm_state = NUD_PERMANENT;
> > +	ndm->ndm_flags = 0;
> > +	ndm->ndm_type = 0;
> > +	if (family == AF_INET) {
> > +		ndm->ndm_family = AF_INET;
> > +		mnl_attr_put_u32(cmd, NDA_DST, mnl_attr_get_u32(na_ip));
> > +	} else {
> > +		ndm->ndm_family = AF_INET6;
> > +		mnl_attr_put(cmd, NDA_DST, IPV6_ADDR_LEN,
> > +			     mnl_attr_get_payload(na_ip));
> > +	}
> > +	mnl_attr_put(cmd, NDA_LLADDR, ETHER_ADDR_LEN,
> > +		     mnl_attr_get_payload(na_mac));
> > +	return 1;
> > +}
> > +
> > +/**
> > + * Cleanup the neigh rules on outer interface.
> > + *
> > + * @param[in] tcf
> > + *   Context object initialized by mlx5_flow_tcf_context_create().
> > + * @param[in] ifindex
> > + *   Network interface index to perform cleanup.
> > + */
> > +static void
> > +flow_tcf_encap_neigh_cleanup(struct mlx5_flow_tcf_context *tcf,
> > +			     unsigned int ifindex)
> > +{
> > +	struct nlmsghdr *nlh;
> > +	struct ndmsg *ndm;
> > +	struct tcf_nlcb_context ctx = {
> > +		.ifindex = ifindex,
> > +		.bufsize = MNL_REQUEST_SIZE,
> > +		.nlbuf = LIST_HEAD_INITIALIZER(),
> > +	};
> > +	int ret;
> > +
> > +	assert(ifindex);
> > +	/* Seek and destroy leftovers of neigh rules. */
> > +	nlh = mnl_nlmsg_put_header(tcf->buf);
> > +	nlh->nlmsg_type = RTM_GETNEIGH;
> > +	nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
> > +	ndm = mnl_nlmsg_put_extra_header(nlh, sizeof(*ndm));
> > +	ndm->ndm_family = AF_UNSPEC;
> > +	ndm->ndm_ifindex = ifindex;
> > +	ndm->ndm_state = NUD_PERMANENT;
> > +	ret = flow_tcf_nl_ack(tcf, nlh, 0, flow_tcf_collect_neigh_cb, &ctx);
> > +	if (ret)
> > +		DRV_LOG(WARNING, "netlink: query device list error %d", ret);
> > +	ret = flow_tcf_send_nlcmd(tcf, &ctx);
> > +	if (ret)
> > +		DRV_LOG(WARNING, "netlink: device delete error %d", ret);
> > +}
> > +
> > +/**
> > + * Collect indices of VXLAN encap/decap interfaces associated with device.
> > + * This is callback routine called by libmnl mnl_cb_run() in loop for
> > + * every message in received packet.
> > + *
> > + * @param[in] nlh
> > + *   Pointer to reply header.
> > + * @param[in, out] arg
> > + *   Opaque data pointer for this callback.
> > + *
> > + * @return
> > + *   A positive, nonzero value on success, negative errno value otherwise
> > + *   and rte_errno is set.
> > + */
> > +static int
> > +flow_tcf_collect_vxlan_cb(const struct nlmsghdr *nlh, void *arg)
> > +{
> > +	struct tcf_nlcb_context *ctx = arg;
> > +	struct nlmsghdr *cmd;
> > +	struct ifinfomsg *ifm;
> > +	struct nlattr *na;
> > +	struct nlattr *na_info = NULL;
> > +	struct nlattr *na_vxlan = NULL;
> > +	bool found = false;
> > +	unsigned int vxindex;
> > +
> > +	if (nlh->nlmsg_type != RTM_NEWLINK) {
> > +		rte_errno = EINVAL;
> > +		return -rte_errno;
> > +	}
> > +	ifm = mnl_nlmsg_get_payload(nlh);
> > +	if (!ifm->ifi_index) {
> > +		rte_errno = EINVAL;
> > +		return -rte_errno;
> > +	}
> > +	mnl_attr_for_each(na, nlh, sizeof(*ifm))
> > +		if (mnl_attr_get_type(na) == IFLA_LINKINFO) {
> > +			na_info = na;
> > +			break;
> > +		}
> > +	if (!na_info)
> > +		return 1;
> > +	mnl_attr_for_each_nested(na, na_info) {
> > +		switch (mnl_attr_get_type(na)) {
> > +		case IFLA_INFO_KIND:
> > +			if (!strncmp("vxlan", mnl_attr_get_str(na),
> > +				     mnl_attr_get_len(na)))
> > +				found = true;
> > +			break;
> > +		case IFLA_INFO_DATA:
> > +			na_vxlan = na;
> > +			break;
> > +		}
> > +		if (found && na_vxlan)
> > +			break;
> > +	}
> > +	if (!found || !na_vxlan)
> > +		return 1;
> > +	found = false;
> > +	mnl_attr_for_each_nested(na, na_vxlan) {
> > +		if (mnl_attr_get_type(na) == IFLA_VXLAN_LINK &&
> > +		    mnl_attr_get_u32(na) == ctx->ifindex) {
> > +			found = true;
> > +			break;
> > +		}
> > +	}
> > +	if (!found)
> > +		return 1;
> > +	/* Attached VXLAN device found, store the command to delete. */
> > +	vxindex = ifm->ifi_index;
> > +	cmd = flow_tcf_alloc_nlcmd(ctx, MNL_ALIGN(sizeof(struct nlmsghdr)) +
> > +					MNL_ALIGN(sizeof(struct ifinfomsg)));
> > +	if (!cmd) {
> > +		rte_errno = ENOMEM;
> > +		return -rte_errno;
> > +	}
> > +	cmd = mnl_nlmsg_put_header(cmd);
> > +	cmd->nlmsg_type = RTM_DELLINK;
> > +	cmd->nlmsg_flags = NLM_F_REQUEST;
> > +	ifm = mnl_nlmsg_put_extra_header(cmd, sizeof(*ifm));
> > +	ifm->ifi_family = AF_UNSPEC;
> > +	ifm->ifi_index = vxindex;
> > +	return 1;
> > +}
> > +
> > +/**
> > + * Cleanup the outer interface. Removes all found vxlan devices
> > + * attached to specified index, flushes the neigh and local IP
> > + * database.
> > + *
> > + * @param[in] tcf
> > + *   Context object initialized by mlx5_flow_tcf_context_create().
> > + * @param[in] ifindex
> > + *   Network interface index to perform cleanup.
> > + */
> > +static void
> > +flow_tcf_encap_iface_cleanup(struct mlx5_flow_tcf_context *tcf,
> > +			     unsigned int ifindex)
> > +{
> > +	struct nlmsghdr *nlh;
> > +	struct ifinfomsg *ifm;
> > +	struct tcf_nlcb_context ctx = {
> > +		.ifindex = ifindex,
> > +		.bufsize = MNL_REQUEST_SIZE,
> > +		.nlbuf = LIST_HEAD_INITIALIZER(),
> > +	};
> > +	int ret;
> > +
> > +	assert(ifindex);
> > +	/*
> > +	 * Seek and destroy leftover VXLAN encap/decap interfaces with
> > +	 * matching properties.
> > +	 */
> > +	nlh = mnl_nlmsg_put_header(tcf->buf);
> > +	nlh->nlmsg_type = RTM_GETLINK;
> > +	nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
> > +	ifm = mnl_nlmsg_put_extra_header(nlh, sizeof(*ifm));
> > +	ifm->ifi_family = AF_UNSPEC;
> > +	ret = flow_tcf_nl_ack(tcf, nlh, 0, flow_tcf_collect_vxlan_cb, &ctx);
> > +	if (ret)
> > +		DRV_LOG(WARNING, "netlink: query device list error %d", ret);
> > +	ret = flow_tcf_send_nlcmd(tcf, &ctx);
> > +	if (ret)
> > +		DRV_LOG(WARNING, "netlink: device delete error %d", ret);
> > +}
> > +
> > +
> >  /**
> >   * Create target interface index for VXLAN tunneling decapsulation.
> >   * In order to share the UDP port within the other interfaces the
> > @@ -4100,12 +4596,9 @@ static LIST_HEAD(, mlx5_flow_tcf_vtep)
> >  	uint16_t pcnt;
> >
> >  	/* Not found, we should create the new attached VTEP. */
> > -/*
> > - * TODO: not implemented yet
> > - * flow_tcf_encap_iface_cleanup(tcf, ifouter);
> > - * flow_tcf_encap_local_cleanup(tcf, ifouter);
> > - * flow_tcf_encap_neigh_cleanup(tcf, ifouter);
> > - */
> > +	flow_tcf_encap_iface_cleanup(tcf, ifouter);
> > +	flow_tcf_encap_local_cleanup(tcf, ifouter);
> > +	flow_tcf_encap_neigh_cleanup(tcf, ifouter);
>
> I have a fundamental question. Why are these cleanups needed? If I read the
> code correctly, it looks like this cleans up VTEP, IP assignment and neigh
> entries which are not created/set by the PMD. The reason why we have to
> clean things up is that the PMD exclusively owns the interface (ifouter).
> Is my understanding correct?

Because this is the simplest approach. I do not see how to coexist
with unknown pre-created rules, or how to take all their properties and
side effects into account.

While debugging I saw situations where the application crashes and
leaves "leftovers" such as VXLAN devices, neigh and local rules. If we run
the application again, these leftovers become sources of errors (EEXIST on
rule creation and so on).

With best regards,
Slava

>
> Thanks,
> Yongseok
>
> >  		for (pcnt = 0; pcnt <= (MLX5_VXLAN_PORT_RANGE_MAX
> >  					- MLX5_VXLAN_PORT_RANGE_MIN); pcnt++) {
> >  			encap_port++;
> >
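
To illustrate the buffering scheme the cleanup routines rely on (delete commands are collected into fixed-size buffers while the dump is parsed, then flushed in one pass), here is a minimal standalone sketch. The names cmd_buf, cmd_ctx, cmd_alloc() and cmd_flush() are hypothetical, malloc()/free() stand in for rte_malloc()/rte_free(), and the flush only frees the buffers at the point where the patch would first send each one with flow_tcf_nl_ack().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical names throughout; this is not the driver code. */
struct cmd_buf {
	LIST_ENTRY(cmd_buf) next;
	uint32_t used;   /* Bytes already handed out from data[]. */
	uint8_t data[];  /* Command storage, bufsize bytes long. */
};

LIST_HEAD(cmd_list, cmd_buf);

struct cmd_ctx {
	uint32_t bufsize;     /* Capacity of every buffer. */
	struct cmd_list bufs; /* Newest buffer is kept first. */
};

/* Reserve size bytes (rounded up to 4) in the newest buffer,
 * starting a new buffer when the current one cannot hold them. */
static void *
cmd_alloc(struct cmd_ctx *ctx, uint32_t size)
{
	struct cmd_buf *buf = LIST_FIRST(&ctx->bufs);
	void *p;

	size = (size + 3u) & ~3u;
	if (size > ctx->bufsize)
		return NULL; /* Can never fit, even in an empty buffer. */
	if (buf == NULL || buf->used + size > ctx->bufsize) {
		buf = malloc(sizeof(*buf) + ctx->bufsize);
		if (buf == NULL)
			return NULL;
		buf->used = 0;
		LIST_INSERT_HEAD(&ctx->bufs, buf, next);
	}
	p = &buf->data[buf->used];
	buf->used += size;
	return p;
}

/* Release every buffer and return how many there were. */
static unsigned int
cmd_flush(struct cmd_ctx *ctx)
{
	struct cmd_buf *buf;
	unsigned int n = 0;

	while ((buf = LIST_FIRST(&ctx->bufs)) != NULL) {
		/* A real flush would send data[0..used) here. */
		LIST_REMOVE(buf, next);
		free(buf);
		n++;
	}
	return n;
}
```

A caller would initialise the list with LIST_INIT(), call cmd_alloc() once per command from the dump callback, and call cmd_flush() afterwards whether or not the dump succeeded, so that no buffer is leaked.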