From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from EUR03-VE1-obe.outbound.protection.outlook.com (mail-eopbgr50043.outbound.protection.outlook.com [40.107.5.43]) by dpdk.org (Postfix) with ESMTP id F3E72DE3 for ; Sun, 14 Oct 2018 13:07:24 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Mellanox.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=jprrs0tb03S5oczDvBTeJ//TWs5D+84k4gzpeujDwJ4=; b=nJXTWxwYxSerWbiOikKoavshAj6BegoFBnAzwzI2bRiH49fbUDxjGlo37KlF8uf4XFHBjjsVLtLUNZEWHSvHNwftGQi8IhHP3l1CIxyORo690VmOkdjFnxMu6ewmLVKsNIRZpSzKKac5z6j1/95KL7NWLHH9c1ZDZ1Z8QTFUMK8= Received: from DB7PR05MB4426.eurprd05.prod.outlook.com (52.134.109.15) by DB7PR05MB5383.eurprd05.prod.outlook.com (20.177.122.212) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.1228.21; Sun, 14 Oct 2018 11:07:23 +0000 Received: from DB7PR05MB4426.eurprd05.prod.outlook.com ([fe80::80e:e6b:baf2:d973]) by DB7PR05MB4426.eurprd05.prod.outlook.com ([fe80::80e:e6b:baf2:d973%3]) with mapi id 15.20.1228.020; Sun, 14 Oct 2018 11:07:23 +0000 From: Shahaf Shuler To: Mordechay Haimovsky CC: "dev@dpdk.org" Thread-Topic: [PATCH v2 2/2] net/mlx5: support e-switch flow count action Thread-Index: AQHUYWL2PAceajz0XUS1hGe6z4E3o6Ueb1tg Date: Sun, 14 Oct 2018 11:07:22 +0000 Message-ID: References: <1539263057-16678-1-git-send-email-motih@mellanox.com> <1539263057-16678-3-git-send-email-motih@mellanox.com> In-Reply-To: <1539263057-16678-3-git-send-email-motih@mellanox.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=none (sender IP is ) smtp.mailfrom=shahafs@mellanox.com; x-originating-ip: [31.154.10.105] x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1; DB7PR05MB5383; 6:z3ejSkXO1OHz7wpEimI6Bc84SHmLr+/DzCEuYzgVA8rG6uMhBvMjPc2nLgiQ19jO3qv4y1iWa2LQfFPUTG+AO3H/3NODAqoTCy/dTKVPJ37ZK4FxVyEcKK4U5k0SW2vE3op8/tZqX/lOlxh54IVSYsqGKqK+HaQuBMKv+4idqZwm5oD+xJ8SbKrAAGxlhIzJ6UqFAra8rV9Zc6yIr4scnHz1z8xy7m58d7qbKWv2srIMKyfjlJxYbmzorXwXsz1nFuGab9vsl26ppquK3ewPq8LJhvzi8oueQCsN3hKJ+0d7O9Z9Eqv8wirROF3UJrtH9PRVLx3bY3zaOhtQcQp8tHYv6lp2/fovZctm20kUrOeHoMLA5CNbFui4NHAMX7o51K3fI5PK/nAP2xTUuC1zIcVxg2P4efEsYiwjfPQz+4RkieGK69OTPmYV51uKYXwvEx0Rn6Mxo4G9IMMZsgSmtA==; 5:0HWQ5BqyVnm452uBv7WCbW4ifxzwDklpzmmULMOoetSfyErdkzSokv0RfalDK4UvtMydXETJUGrimRb2GdgGbW72nd7Yoz/eaCPpdVA55979dqYSXsBBYy3S/g0rlxLxtrDvk+n7PXuG0CGOomi0Tbf/4IjNabzDrIQnPDxppH4=; 7:DaR6iQZR+uAdsVlf80RfiR5pjuunEYq5k4Zjt5z+k3UOflCNIIbLwzxywF64xPSbxtlM5nAqShpTkJhbOC3k9CI8aqu++Q2OsalBFhfer03hRXcvBWjqk7QLwmGCYrVfME74oOa1v517dJSQPVsF1ZuyS0X3YDPLWvT0PWZ6M+Hh++PUVp4abCVeGIZZa44q1NIjVs7xaGecUAxwwdJzaW1e6Nhq7M3f8zWqcIaTf/1VDtXEQKil2cyVrChvDD2m x-ms-exchange-antispam-srfa-diagnostics: SOS; x-ms-office365-filtering-correlation-id: cf327b18-c543-46eb-bc40-08d631c537a2 x-ms-office365-filtering-ht: Tenant x-microsoft-antispam: BCL:0; PCL:0; RULEID:(7020095)(4652040)(8989299)(5600074)(711020)(4618075)(4534185)(4627221)(201703031133081)(201702281549075)(8990200)(2017052603328)(7153060)(7193020); SRVR:DB7PR05MB5383; x-ms-traffictypediagnostic: DB7PR05MB5383: x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:(211171220733660); x-ms-exchange-senderadcheck: 1 x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(8211001083)(6040522)(2401047)(5005006)(8121501046)(93006095)(93001095)(10201501046)(3002001)(3231355)(944501410)(52105095)(6055026)(149066)(150057)(6041310)(20161123562045)(20161123558120)(20161123564045)(201703131423095)(201702281528075)(20161123555045)(201703061421075)(201703061406153)(20161123560045)(201708071742011)(7699051)(76991067); SRVR:DB7PR05MB5383; BCL:0; PCL:0; RULEID:; SRVR:DB7PR05MB5383; x-forefront-prvs: 08252193F3 x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(366004)(396003)(376002)(346002)(136003)(39860400002)(199004)(189003)(478600001)(66066001)(486006)(26005)(7696005)(99286004)(68736007)(476003)(76176011)(11346002)(186003)(446003)(8936002)(8676002)(2900100001)(102836004)(6506007)(256004)(14444005)(81156014)(81166006)(71190400001)(6862004)(4744004)(71200400001)(5250100002)(4326008)(14454004)(33656002)(25786009)(97736004)(305945005)(6436002)(53936002)(9686003)(2906002)(5660300001)(106356001)(55016002)(6246003)(316002)(6636002)(86362001)(7736002)(74316002)(229853002)(105586002)(3846002)(6116002)(53946003)(579004); DIR:OUT; SFP:1101; SCL:1; SRVR:DB7PR05MB5383; H:DB7PR05MB4426.eurprd05.prod.outlook.com; FPR:; SPF:None; LANG:en; PTR:InfoNoRecords; A:1; MX:1; received-spf: None (protection.outlook.com: mellanox.com does not designate permitted sender hosts) x-microsoft-antispam-message-info: dqjM5Yd3aJ0cZ1tB9il1lTshEZie4Vg+SXJ7LbNtRredO9M1oNj8DRmZnhO6+Zcd7qCyevVw2/ZF+XIHJLkTvtLd9XfIsKBdtkpuH4rmA6N5Q1W3FqVdH8BzFkCFzW1z23q1z9JXb4JcFp6b1WUw5HyJtjzq89iXv3CvVC9CsIWtjkv+pos/tbjDrsBR98O6ERQGSa9v4X0ZXvoa8LmtcJeAM7z6dotp0oOg9lZG2VlnG1i8rhtA4uNjgbHHK45EZrR2u1BDDfJYQQBgI1nG7QFOrWkmBztZEUp7PkSMbuqdJ9wKeISSIijKQq9ZtGe1VPujB13pqBX66zaflBlBxUjFKj7gl12mck9lJMVnlA0= spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: Mellanox.com X-MS-Exchange-CrossTenant-Network-Message-Id: cf327b18-c543-46eb-bc40-08d631c537a2 X-MS-Exchange-CrossTenant-originalarrivaltime: 14 Oct 2018 11:07:22.9359 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: a652971c-7d2e-4d9b-a6a4-d149256f461b X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB7PR05MB5383 Subject: Re: [dpdk-dev] [PATCH v2 2/2] net/mlx5: support e-switch flow count action X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Oct 2018 11:07:25 -0000 Hi Moty,=20 In addition to below, the prefix for each tc flow function should be flow_t= cf_*. Make sure to update your functions.=20 Thursday, October 11, 2018 4:05 PM, Mordechay Haimovsky: > Subject: [PATCH v2 2/2] net/mlx5: support e-switch flow count action >=20 > This commit adds support for configuring flows destined to the mlx5 eswit= ch > with 'count' action and for querying these counts at runtime. >=20 Log from here... > It is possible to offload an interface flow rules to the hardware using D= PDK > flow commands. > With mlx5 it is also possible to offload a limited set of flow rules to t= he mlxsw > (or e-switch) using the same DPDK flow commands using the 'transfer' > attribute in the flow rule creation command. > The commands destined for the switch are transposed to TC flower rules an= d > are sent, as Netlink messages, to the mlx5 driver (or more precisely to t= he > netdev which represent the mlxsw port). Till here is not needed.=20 > Each flow rule configured by the mlx5 driver is also assigned with a set = of flow With TC, each flow rule ... Also a set or a single counter? > counters implicitly. These counters can be retrieved when querying the fl= ow > rule via Netlink, they can be found in each flow action section of the re= ply. No need for log from here... > Currently the limited set of eswitch flow rules does not contain the 'cou= nt' > action but since every rule contains a count we can still retrieve these = values > as if we configured a 'count' action. Till here.. >=20 > Supporting the 'count' action in the flow configuration command is straig= ht- > forward.=20 Hence, supporting the 'count' action... When transposing the command to a tc flower Netlink message we > just ignore it instead of rejecting it. > So the following two commands will have the same affect and behavior: > testpmd> flow create 0 transfer ingress pattern eth src is > 11:22:33:44:55:77 dst is 11:22:33:44:55:88 / end > actions drop / end > testpmd> flow create 0 transfer ingress pattern eth src is > 11:22:33:44:55:77 dst is 11:22:33:44:55:88 / end > actions count / drop / end Right, but it is better the user will explicitly provide count action anywa= y.=20 > In the flow query side, the command now also returns the counts the above Rephrase "the counts the above", maybe the counters or the above counters.= =20 > flow via using tc Netlink query command. > Special care was taken in order to prevent Netlink messages truncation du= e > to short buffers by using MNL_SOCKET_BUFFER_SIZE buffers which are pre- > allocate per port instance. I don't understand this part.=20 Isn't it a fixed size of response when querying the packet and bytes counte= rs of a single flow?=20 What is the size?=20 >=20 > Signed-off-by: Moti Haimovsky > --- > v2: > * Rebase on top of 3f4722ee01e7 ("net/mlx5: refactor TC-flow > infrastructure") > --- > drivers/net/mlx5/mlx5_flow.c | 29 +++- > drivers/net/mlx5/mlx5_flow.h | 14 +- > drivers/net/mlx5/mlx5_flow_dv.c | 1 + > drivers/net/mlx5/mlx5_flow_tcf.c | 286 > ++++++++++++++++++++++++++++++++++++- > drivers/net/mlx5/mlx5_flow_verbs.c | 1 + > 5 files changed, 325 insertions(+), 6 deletions(-) >=20 > diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c > index 6b2698a..7c6ece1 100644 > --- a/drivers/net/mlx5/mlx5_flow.c > +++ b/drivers/net/mlx5/mlx5_flow.c > @@ -1649,6 +1649,17 @@ uint32_t mlx5_flow_adjust_priority(struct > rte_eth_dev *dev, int32_t priority, { } >=20 > +int > +flow_null_query(struct rte_eth_dev *dev __rte_unused, > + struct rte_flow *flow __rte_unused, > + enum rte_flow_action_type type __rte_unused, > + void *data __rte_unused, > + struct rte_flow_error *error __rte_unused) { > + rte_errno =3D ENOTSUP; > + return -rte_errno; > +} > + > /* Void driver to protect from null pointer reference. */ const struct > mlx5_flow_driver_ops mlx5_flow_null_drv_ops =3D { > .validate =3D flow_null_validate, > @@ -1657,6 +1668,7 @@ uint32_t mlx5_flow_adjust_priority(struct > rte_eth_dev *dev, int32_t priority, > .apply =3D flow_null_apply, > .remove =3D flow_null_remove, > .destroy =3D flow_null_destroy, > + .query =3D flow_null_query, > }; >=20 You added the query function pointer to the driver ops which is good.=20 > /** > @@ -2349,10 +2361,19 @@ struct rte_flow * > * 0 on success, a negative errno value otherwise and rte_errno is set= . > */ > static int > -mlx5_flow_query_count(struct rte_flow *flow __rte_unused, > - void *data __rte_unused, > +mlx5_flow_query_count(struct rte_eth_dev *dev, Why not calling it flow_drv_query, and to have the function body just like = the other function pointers: Get the driver type from dev and then call its query op.=20 You will probably need to move the current implementation to the query func= tion of the verbs.=20 > + struct rte_flow *flow, > + void *data, > struct rte_flow_error *error) > { > + const struct mlx5_flow_driver_ops *fops; > + enum mlx5_flow_drv_type ftype =3D flow->drv_type; > + > + assert(ftype > MLX5_FLOW_TYPE_MIN && ftype < > MLX5_FLOW_TYPE_MAX); We don't have this assert on any other function pointer. Not sure why to st= art now.=20 > + fops =3D flow_get_drv_ops(ftype); > + if (ftype =3D=3D MLX5_FLOW_TYPE_TCF) > + return fops->query(dev, flow, > + RTE_FLOW_ACTION_TYPE_COUNT, data, > error); > #ifdef HAVE_IBV_DEVICE_COUNTERS_SET_SUPPORT > if (flow->actions & MLX5_FLOW_ACTION_COUNT) { > struct rte_flow_query_count *qc =3D data; @@ -2402,7 +2423,7 > @@ struct rte_flow * > * @see rte_flow_ops > */ > int > -mlx5_flow_query(struct rte_eth_dev *dev __rte_unused, > +mlx5_flow_query(struct rte_eth_dev *dev, > struct rte_flow *flow, > const struct rte_flow_action *actions, > void *data, > @@ -2415,7 +2436,7 @@ struct rte_flow * > case RTE_FLOW_ACTION_TYPE_VOID: > break; > case RTE_FLOW_ACTION_TYPE_COUNT: > - ret =3D mlx5_flow_query_count(flow, data, error); > + ret =3D mlx5_flow_query_count(dev, flow, data, error); > break; > default: > return rte_flow_error_set(error, ENOTSUP, diff --git > a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h index > 41d55e8..3e1e9a0 100644 > --- a/drivers/net/mlx5/mlx5_flow.h > +++ b/drivers/net/mlx5/mlx5_flow.h > @@ -175,6 +175,8 @@ struct mlx5_flow_dv { struct mlx5_flow_tcf { > struct nlmsghdr *nlh; > struct tcmsg *tcm; > + uint64_t hits; > + uint64_t bytes; You have the "last before reset" flow query counters as part of the struct = rte_flow. Do you really need to add those here?=20 > }; >=20 > /* Verbs specification header. */ > @@ -232,7 +234,6 @@ struct rte_flow { > struct rte_flow_action_rss rss;/**< RSS context. */ > uint8_t key[MLX5_RSS_HASH_KEY_LEN]; /**< RSS hash key. */ > uint16_t (*queue)[]; /**< Destination queues to redirect traffic to. > */ > - void *nl_flow; /**< Netlink flow buffer if relevant. */ > LIST_HEAD(dev_flows, mlx5_flow) dev_flows; > /**< Device flows that are part of the flow. */ > uint32_t actions; /**< Bit-fields which mark all detected actions. */ > @@ -258,6 +259,11 @@ typedef void (*mlx5_flow_remove_t)(struct > rte_eth_dev *dev, > struct rte_flow *flow); > typedef void (*mlx5_flow_destroy_t)(struct rte_eth_dev *dev, > struct rte_flow *flow); > +typedef int (*mlx5_flow_query_t)(struct rte_eth_dev *dev, > + struct rte_flow *flow, > + enum rte_flow_action_type type, Do you believe you will have other type than count? mlx5_flow_query should = protect against it.=20 > + void *data, > + struct rte_flow_error *error); > struct mlx5_flow_driver_ops { > mlx5_flow_validate_t validate; > mlx5_flow_prepare_t prepare; > @@ -265,10 +271,16 @@ struct mlx5_flow_driver_ops { > mlx5_flow_apply_t apply; > mlx5_flow_remove_t remove; > mlx5_flow_destroy_t destroy; > + mlx5_flow_query_t query; > }; >=20 > /* mlx5_flow.c */ >=20 > +int flow_null_query(struct rte_eth_dev *dev, > + struct rte_flow *flow, > + enum rte_flow_action_type type, Ditto.=20 > + void *data, > + struct rte_flow_error *error); > uint64_t mlx5_flow_hashfields_adjust(struct mlx5_flow *dev_flow, int > tunnel, > uint32_t layer_types, > uint64_t hash_fields); > diff --git a/drivers/net/mlx5/mlx5_flow_dv.c > b/drivers/net/mlx5/mlx5_flow_dv.c index c9aa50f..9c8d074 100644 > --- a/drivers/net/mlx5/mlx5_flow_dv.c > +++ b/drivers/net/mlx5/mlx5_flow_dv.c > @@ -1365,6 +1365,7 @@ > .apply =3D flow_dv_apply, > .remove =3D flow_dv_remove, > .destroy =3D flow_dv_destroy, > + .query =3D flow_null_query, > }; >=20 > #endif /* HAVE_IBV_FLOW_DV_SUPPORT */ > diff --git a/drivers/net/mlx5/mlx5_flow_tcf.c > b/drivers/net/mlx5/mlx5_flow_tcf.c > index 8535a15..d12a566 100644 > --- a/drivers/net/mlx5/mlx5_flow_tcf.c > +++ b/drivers/net/mlx5/mlx5_flow_tcf.c > @@ -6,6 +6,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -163,7 +164,8 @@ struct mlx5_flow_tcf_context { > struct mnl_socket *nl; /* NETLINK_ROUTE libmnl socket. */ > uint32_t seq; /* Message sequence number. */ > uint32_t buf_size; /* Message buffer size. */ > - uint8_t *buf; /* Message buffer. */ > + uint8_t *buf; > + /* Message buffer (used for receiving large netlink messages). */ > }; >=20 > /** Empty masks for known item types. */ @@ -696,6 +698,9 @@ struct > flow_tcf_ptoi { > "can't have multiple fate actions"); > action_flags |=3D MLX5_FLOW_ACTION_DROP; > break; > + case RTE_FLOW_ACTION_TYPE_COUNT: > + action_flags |=3D MLX5_FLOW_ACTION_COUNT; > + break; Is there any special validation you do for count?=20 If not, why having this logic?=20 > case RTE_FLOW_ACTION_TYPE_OF_POP_VLAN: > action_flags |=3D > MLX5_FLOW_ACTION_OF_POP_VLAN; > break; > @@ -875,6 +880,9 @@ struct flow_tcf_ptoi { > SZ_NLATTR_TYPE_OF(struct tc_gact); > flags |=3D MLX5_FLOW_ACTION_DROP; > break; > + case RTE_FLOW_ACTION_TYPE_COUNT: > + flags |=3D MLX5_FLOW_ACTION_COUNT; > + break; Ditto, is there a real use for this flag considering count action doesn't a= dd any bytes to the size of the netlink command?=20 > case RTE_FLOW_ACTION_TYPE_OF_POP_VLAN: > flags |=3D MLX5_FLOW_ACTION_OF_POP_VLAN; > goto action_of_vlan; > @@ -1360,6 +1368,12 @@ struct flow_tcf_ptoi { > mnl_attr_nest_end(nlh, na_act); > mnl_attr_nest_end(nlh, na_act_index); > break; > + case RTE_FLOW_ACTION_TYPE_COUNT: > + /* > + * Driver adds the count action implicitly for > + * each rule it creates. > + */ > + break; > case RTE_FLOW_ACTION_TYPE_OF_POP_VLAN: > conf.of_push_vlan =3D NULL; > vlan_act =3D TCA_VLAN_ACT_POP; > @@ -1564,6 +1578,275 @@ struct flow_tcf_ptoi { > rte_free(dev_flow); > } >=20 > +/** > + * Parse rtnetlink message attributes filling the attribute table with > +the info > + * being retrieved. > + * > + * @param tb > + * Attribute table to be filled. > + * @param[out] max > + * Maxinum entry in the attribute table. > + * @param rte > + * The attributes section in the message to be parsed. > + * @param len > + * The length of the attributes section in the message. > + * @return > + * 0 on successful extraction of action counts, -1 otherwise. Really? Looks like it returns void.=20 > + */ > +static void > +tc_parse_rtattr(struct rtattr *tb[], int max, struct rtattr *rta, int > +len) { > + unsigned short type; > + memset(tb, 0, sizeof(struct rtattr *) * (max + 1)); > + while (RTA_OK(rta, len)) { > + type =3D rta->rta_type; > + if (type <=3D max && !tb[type]) > + tb[type] =3D rta; > + rta =3D RTA_NEXT(rta, len); > + } > +} > + > + /** > + * Extract action counters from flower action. Extract flow counters from flower query.=20 > + * > + * @param rta > + * flower action stats properties in the Netlink message received. > + * @param[out] qc > + * Count statistics retrieved from the message query. Isn't it the count statistics of rte_flow?=20 > + * @return > + * 0 on successful extraction of action counts, -1 otherwise. > + */ > +static int > +tc_flow_extract_stats_attr(struct rtattr *rta, struct > +rte_flow_query_count *qc) { > + struct rtattr *tbs[TCA_STATS_MAX + 1]; > + > + tc_parse_rtattr(tbs, TCA_STATS_MAX, RTA_DATA(rta), > RTA_PAYLOAD(rta)); > + if (tbs[TCA_STATS_BASIC]) { > + struct gnet_stats_basic bs =3D {0}; > + > + memcpy(&bs, RTA_DATA(tbs[TCA_STATS_BASIC]), > + RTE_MIN(RTA_PAYLOAD(tbs[TCA_STATS_BASIC]), > + sizeof(bs))); > + qc->bytes =3D bs.bytes; > + qc->hits =3D bs.packets; > + qc->bytes_set =3D 1; > + qc->hits_set =3D 1; > + return 0; > + } > + return -1; > +} > + > + /** > + * Parse flower single action retrieving the flow counters from it if p= resent. How about making it more generic? Do not restrict to only flow counters.=20 Any subsequent flow query operation can re-use your function with small ext= ension.=20 > + * > + * @param arg > + * flower action properties in the Netlink message received. > + * @param[out] qc > + * Count statistics retrieved from the message query. If you agree with above then it should be void * data > + * @return > + * 0 on successful retrieval of action counts, -1 otherwise. > + */ > +static int > +tc_flow_parse_one_action(struct rtattr *arg, struct > +rte_flow_query_count *qc) { > + struct rtattr *tb[TCA_ACT_MAX + 1]; > + > + if (arg =3D=3D NULL) > + return -1; > + tc_parse_rtattr(tb, TCA_ACT_MAX, RTA_DATA(arg), > RTA_PAYLOAD(arg)); > + if (tb[TCA_ACT_KIND] =3D=3D NULL) > + return -1; > + if (tb[TCA_ACT_STATS]) > + return tc_flow_extract_stats_attr(tb[TCA_ACT_STATS], qc); > + return -1; > +} > + > + /** > + * Parse flower action section in the message, retrieving the flow > +counters Again - more generic. Not only counters.=20 > + * from the first action that contains them. > + * flow counters are stored in the actions defined by the flow and not > +in the > + * flow itself, therefore we need to traverse the flower action in > +search for > + * them. > + * > + * @param opt > + * flower section in the Netlink message received. > + * @param[out] qc > + * Count statistics retrieved from the message query. Void *data > + */ > +static void > +tc_flow_parse_action(const struct rtattr *arg, struct > +rte_flow_query_count *qc) { If this parse multiple actions then the name should be flow_tcf_parse_actio= ns > + struct rtattr *tb[TCA_ACT_MAX_PRIO + 1]; > + int i; > + > + if (arg =3D=3D NULL) > + return; > + tc_parse_rtattr(tb, TCA_ACT_MAX_PRIO, RTA_DATA(arg), > RTA_PAYLOAD(arg)); > + for (i =3D 0; i <=3D TCA_ACT_MAX_PRIO; i++) > + if (tb[i]) > + if (tc_flow_parse_one_action(tb[i], qc) =3D=3D 0) > + break; > +} > + > + /** > + * Parse Netlink reply on flower type of filters, retrieving the flow > +counters Not only > + * from it. > + * > + * @param opt > + * flower section in the Netlink message received. > + * @param[out] qc > + * Count statistics retrieved from the message query. Void *data > + */ > +static void > +tc_flower_parse_opt(struct rtattr *opt, > + struct rte_flow_query_count *qc) > +{ > + struct rtattr *tb[TCA_FLOWER_MAX + 1]; > + > + if (!opt) > + return; > + tc_parse_rtattr(tb, TCA_FLOWER_MAX, RTA_DATA(opt), > RTA_PAYLOAD(opt)); > + if (tb[TCA_FLOWER_ACT]) > + tc_flow_parse_action(tb[TCA_FLOWER_ACT], qc); } > + > + /** > + * Parse Netlink reply on filter query, retrieving the flow counters. Not only counters > + * > + * @param nlh > + * Message received from Netlink. > + * @param[out] qc > + * Count statistics retrieved from the message query. Void * data > + * > + * @return > + * MNL_CB_ERROR on error, MNL_CB_OK value otherwise. > + */ > +static int > +mlx5_nl_flow_parse_filter(const struct nlmsghdr *nlh, > + struct rte_flow_query_count *qc) > +{ > + struct tcmsg *t =3D NLMSG_DATA(nlh); > + int len =3D nlh->nlmsg_len; > + struct rtattr *tb[TCA_MAX + 1] =3D { }; > + > + if (nlh->nlmsg_type !=3D RTM_NEWTFILTER && > + nlh->nlmsg_type !=3D RTM_GETTFILTER && > + nlh->nlmsg_type !=3D RTM_DELTFILTER) > + return MNL_CB_OK; > + len -=3D NLMSG_LENGTH(sizeof(*t)); > + if (len < 0) > + return MNL_CB_ERROR; > + tc_parse_rtattr(tb, TCA_MAX, TCA_RTA(t), len); > + if (tb[TCA_KIND]) > + if (strcmp(RTA_DATA(tb[TCA_KIND]), "flower") =3D=3D 0) > + tc_flower_parse_opt(tb[TCA_OPTIONS], qc); > + return MNL_CB_OK; > +} > + > +/** > + * A callback to parse Netlink reply on filter query attempting to > +retrieve the > + * flow counters if present. > + * > + * @param nlh > + * Message received from Netlink. > + * @param[out] data > + * pointer to the count statistics to be filled by the routine. > + * > + * @return > + * MNL_CB_ERROR on error, MNL_CB_OK value otherwise. > + */ > +static int > +mlx5_nl_flow_parse_message(const struct nlmsghdr *nlh, void *data) { > + struct rte_flow_query_count *qc =3D (struct rte_flow_query_count > *)data; > + > + switch (nlh->nlmsg_type) { > + case NLMSG_NOOP: > + return MNL_CB_OK; > + case NLMSG_ERROR: > + case NLMSG_OVERRUN: > + return MNL_CB_ERROR; > + default: > + break; > + } > + return mlx5_nl_flow_parse_filter(nlh, qc); } > + > + /** > + * Query a tcf rule for its statistics via netlink. > + * > + * @param[in] dev > + * Pointer to Ethernet device. > + * @param[in] flow > + * Pointer to the sub flow. > + * @param[out] data > + * data retrieved by the query. > + * @param[out] error > + * Perform verbose error reporting if not NULL. > + * > + * @return > + * 0 on success, a negative errno value otherwise and rte_errno is se= t. > + */ > +static int > +mlx5_flow_tcf_query(struct rte_eth_dev *dev, > + struct rte_flow *flow, > + enum rte_flow_action_type type, > + void *data, > + struct rte_flow_error *error) > +{ > + struct rte_flow_query_count *qc =3D data; > + struct priv *priv =3D dev->data->dev_private; > + struct mlx5_flow_tcf_context *ctx =3D priv->tcf_context; > + struct mnl_socket *nl =3D ctx->nl; > + struct mlx5_flow *dev_flow; > + struct nlmsghdr *nlh; > + uint32_t seq =3D priv->tcf_context->seq++; Maybe better to have it random. See previous patch.=20 > + ssize_t ret; > + assert(qc); > + > + dev_flow =3D LIST_FIRST(&flow->dev_flows); I would expect that dev_flow would be the mlx5 format of the rte_flow provi= ded. It is not always the first in the flows list.=20 > + /* E-Switch flow can't be expanded. */ > + assert(!LIST_NEXT(dev_flow, next)); > + /* Currently only query count is supported. */ > + if (type !=3D RTE_FLOW_ACTION_TYPE_COUNT) > + goto error_nosup; This check is not needed. mlx5_flow_query should protect against it.=20 > + nlh =3D dev_flow->tcf.nlh; > + nlh->nlmsg_type =3D RTM_GETTFILTER; > + nlh->nlmsg_flags =3D NLM_F_REQUEST | NLM_F_ECHO; > + nlh->nlmsg_seq =3D seq; > + if (mnl_socket_sendto(nl, nlh, nlh->nlmsg_len) =3D=3D -1) > + goto error_exit; > + ret =3D mnl_socket_recvfrom(nl, priv->tcf_context->buf, > + priv->tcf_context->buf_size); Why not allocating the buffer along with its needed size on the stack? Isn'= t is simpler?=20 > + if (ret =3D=3D -1) > + goto error_exit; > + while (ret > 0) { > + ret =3D mnl_cb_run(ctx->buf, ret, seq, > + mnl_socket_get_portid(nl), > + mlx5_nl_flow_parse_message, qc); I don't understand why you need this callback and not simply call mlx5_nl_f= low_parse_message.=20 > + if (ret <=3D MNL_CB_STOP) > + break; > + ret =3D mnl_socket_recvfrom(nl, ctx->buf, ctx->buf_size); > + } > + /* Return the delta from last reset. */ > + qc->hits -=3D dev_flow->tcf.hits; > + qc->bytes -=3D dev_flow->tcf.bytes; I don't get the logic. After the mlx5_nl_flow_parse_message callbacks, on qc you have the current = count value. After the above two lines you override it with the old value. I think you meant to provide a different qc struct toe the mlx5_nl_flow_par= se_message function and subtract?=20 > + if (qc->reset) { > + dev_flow->tcf.hits =3D qc->hits; > + dev_flow->tcf.bytes =3D qc->bytes; > + } > + return 0; > +error_nosup: > + return rte_flow_error_set > + (error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ACTION, > + NULL, "tcf: unsupported query"); The error_nosup can be removed.=20 > +error_exit: > + return rte_flow_error_set > + (error, errno, > RTE_FLOW_ERROR_TYPE_UNSPECIFIED, > + NULL, "netlink: failed to read flow rule statistics"); } > + > const struct mlx5_flow_driver_ops mlx5_flow_tcf_drv_ops =3D { > .validate =3D flow_tcf_validate, > .prepare =3D flow_tcf_prepare, > @@ -1571,6 +1854,7 @@ struct flow_tcf_ptoi { > .apply =3D flow_tcf_apply, > .remove =3D flow_tcf_remove, > .destroy =3D flow_tcf_destroy, > + .query =3D mlx5_flow_tcf_query, > }; >=20 > /** > diff --git a/drivers/net/mlx5/mlx5_flow_verbs.c > b/drivers/net/mlx5/mlx5_flow_verbs.c > index ad8f7ac..e377b3b 100644 > --- a/drivers/net/mlx5/mlx5_flow_verbs.c > +++ b/drivers/net/mlx5/mlx5_flow_verbs.c > @@ -1655,4 +1655,5 @@ > .apply =3D flow_verbs_apply, > .remove =3D flow_verbs_remove, > .destroy =3D flow_verbs_destroy, > + .query =3D flow_null_query, You will need implementation here.=20 > }; > -- > 1.8.3.1