* [dpdk-dev] [PATCH 0/2] Native uio-based PMD for Mellanox ConnectX-3 devices
From: leeopop @ 2015-07-06 13:28 UTC
To: dev
This is a native UIO-based PMD for Mellanox ConnectX-3 devices.
It uses a persistent memory library to provide a persistent
scratch area for the mlx4 HCA driver.
We release the driver itself under the BSD license, but to use it in
commercial products you may have to re-implement the separate GPL sources.
The GPL-licensed source code resides in the mlnx_uio/kernel directory.
leeopop (2):
eal/persistent: new library to hold memory region after program exit
mlnx_uio: new poll mode driver
config/common_linuxapp | 10 +
drivers/net/Makefile | 1 +
drivers/net/mlnx_uio/.gitignore | 1 +
drivers/net/mlnx_uio/LICENSE | 30 +
drivers/net/mlnx_uio/Makefile | 139 +
drivers/net/mlnx_uio/convert.py | 50 +
drivers/net/mlnx_uio/include/autoconf.h | 10 +
drivers/net/mlnx_uio/include/bitmap.h | 314 +
drivers/net/mlnx_uio/include/bitops.h | 558 ++
drivers/net/mlnx_uio/include/dcbnl.h | 751 +++
drivers/net/mlnx_uio/include/etherdevice.h | 189 +
drivers/net/mlnx_uio/include/ib_mad.h | 664 ++
drivers/net/mlnx_uio/include/ib_smi.h | 128 +
drivers/net/mlnx_uio/include/ib_verbs.h | 806 +++
drivers/net/mlnx_uio/include/inline_functions.h | 307 +
drivers/net/mlnx_uio/include/kcompat.h | 36 +
drivers/net/mlnx_uio/include/kmod.h | 768 +++
drivers/net/mlnx_uio/include/list.h | 780 +++
drivers/net/mlnx_uio/include/log2.h | 229 +
drivers/net/mlnx_uio/include/mlx4_dpdk.h | 17 +
drivers/net/mlnx_uio/include/mlx4_uio.h | 24 +
drivers/net/mlnx_uio/include/mlx4_uio_helper.h | 800 +++
drivers/net/mlnx_uio/include/module.h | 12 +
drivers/net/mlnx_uio/include/netdev_features.h | 166 +
drivers/net/mlnx_uio/include/post_kmod.h | 13 +
drivers/net/mlnx_uio/include/radix-tree.h | 48 +
drivers/net/mlnx_uio/include/rbtree.h | 105 +
drivers/net/mlnx_uio/include/rbtree_augmented.h | 230 +
drivers/net/mlnx_uio/kernel/LICENSE | 339 +
drivers/net/mlnx_uio/kernel/bitmap.c | 831 +++
drivers/net/mlnx_uio/kernel/kcompat.c | 96 +
drivers/net/mlnx_uio/kernel/radix-tree.c | 78 +
drivers/net/mlnx_uio/kernel/rbtree.c | 561 ++
drivers/net/mlnx_uio/mlnx/include/mlx4/cmd.h | 309 +
drivers/net/mlnx_uio/mlnx/include/mlx4/cq.h | 195 +
drivers/net/mlnx_uio/mlnx/include/mlx4/device.h | 1744 +++++
drivers/net/mlnx_uio/mlnx/include/mlx4/doorbell.h | 90 +
drivers/net/mlnx_uio/mlnx/include/mlx4/driver.h | 175 +
drivers/net/mlnx_uio/mlnx/include/mlx4/qp.h | 540 ++
drivers/net/mlnx_uio/mlnx/include/mlx4/srq.h | 50 +
drivers/net/mlnx_uio/mlnx/include/mlx5/cmd.h | 56 +
drivers/net/mlnx_uio/mlnx/include/mlx5/cq.h | 182 +
drivers/net/mlnx_uio/mlnx/include/mlx5/device.h | 1204 ++++
drivers/net/mlnx_uio/mlnx/include/mlx5/doorbell.h | 85 +
drivers/net/mlnx_uio/mlnx/include/mlx5/driver.h | 1063 +++
.../net/mlnx_uio/mlnx/include/mlx5/flow_table.h | 59 +
drivers/net/mlnx_uio/mlnx/include/mlx5/mlx5_ifc.h | 6892 ++++++++++++++++++++
drivers/net/mlnx_uio/mlnx/include/mlx5/qp.h | 804 +++
drivers/net/mlnx_uio/mlnx/include/mlx5/srq.h | 46 +
drivers/net/mlnx_uio/mlnx/include/mlx5/vport.h | 52 +
drivers/net/mlnx_uio/mlnx/mlx4/Kconfig | 46 +
drivers/net/mlnx_uio/mlnx/mlx4/Makefile | 19 +
drivers/net/mlnx_uio/mlnx/mlx4/alloc.c | 872 +++
drivers/net/mlnx_uio/mlnx/mlx4/catas.c | 350 +
drivers/net/mlnx_uio/mlnx/mlx4/cmd.c | 3456 ++++++++++
drivers/net/mlnx_uio/mlnx/mlx4/cq.c | 443 ++
drivers/net/mlnx_uio/mlnx/mlx4/en_clock.c | 330 +
drivers/net/mlnx_uio/mlnx/mlx4/en_cq.c | 257 +
drivers/net/mlnx_uio/mlnx/mlx4/en_dcb_nl.c | 613 ++
drivers/net/mlnx_uio/mlnx/mlx4/en_ethtool.c | 2582 ++++++++
drivers/net/mlnx_uio/mlnx/mlx4/en_main.c | 493 ++
drivers/net/mlnx_uio/mlnx/mlx4/en_netdev.c | 3786 +++++++++++
drivers/net/mlnx_uio/mlnx/mlx4/en_port.c | 493 ++
drivers/net/mlnx_uio/mlnx/mlx4/en_port.h | 593 ++
drivers/net/mlnx_uio/mlnx/mlx4/en_resources.c | 184 +
drivers/net/mlnx_uio/mlnx/mlx4/en_rx.c | 1565 +++++
drivers/net/mlnx_uio/mlnx/mlx4/en_rx_uio.c | 187 +
drivers/net/mlnx_uio/mlnx/mlx4/en_selftest.c | 194 +
drivers/net/mlnx_uio/mlnx/mlx4/en_sysfs.c | 623 ++
drivers/net/mlnx_uio/mlnx/mlx4/en_tx.c | 1143 ++++
drivers/net/mlnx_uio/mlnx/mlx4/en_tx_uio.c | 47 +
drivers/net/mlnx_uio/mlnx/mlx4/eq.c | 1777 +++++
drivers/net/mlnx_uio/mlnx/mlx4/fw.c | 3005 +++++++++
drivers/net/mlnx_uio/mlnx/mlx4/fw.h | 270 +
drivers/net/mlnx_uio/mlnx/mlx4/fw_qos.c | 292 +
drivers/net/mlnx_uio/mlnx/mlx4/fw_qos.h | 150 +
drivers/net/mlnx_uio/mlnx/mlx4/icm.c | 522 ++
drivers/net/mlnx_uio/mlnx/mlx4/icm.h | 133 +
drivers/net/mlnx_uio/mlnx/mlx4/intf.c | 246 +
drivers/net/mlnx_uio/mlnx/mlx4/main.c | 5485 ++++++++++++++++
drivers/net/mlnx_uio/mlnx/mlx4/main.c.orig | 5335 +++++++++++++++
drivers/net/mlnx_uio/mlnx/mlx4/mcg.c | 1665 +++++
drivers/net/mlnx_uio/mlnx/mlx4/mlx4.h | 1514 +++++
drivers/net/mlnx_uio/mlnx/mlx4/mlx4_en.h | 1188 ++++
drivers/net/mlnx_uio/mlnx/mlx4/mlx4_stats.h | 153 +
drivers/net/mlnx_uio/mlnx/mlx4/mr.c | 1178 ++++
drivers/net/mlnx_uio/mlnx/mlx4/pd.c | 310 +
drivers/net/mlnx_uio/mlnx/mlx4/port.c | 1636 +++++
drivers/net/mlnx_uio/mlnx/mlx4/profile.c | 259 +
drivers/net/mlnx_uio/mlnx/mlx4/qp.c | 956 +++
drivers/net/mlnx_uio/mlnx/mlx4/reset.c | 202 +
drivers/net/mlnx_uio/mlnx/mlx4/resource_tracker.c | 5052 ++++++++++++++
drivers/net/mlnx_uio/mlnx/mlx4/sense.c | 153 +
drivers/net/mlnx_uio/mlnx/mlx4/srq.c | 314 +
drivers/net/mlnx_uio/mlnx/mlx5/core/Kconfig | 8 +
drivers/net/mlnx_uio/mlnx/mlx5/core/Makefile | 9 +
drivers/net/mlnx_uio/mlnx/mlx5/core/alloc.c | 273 +
drivers/net/mlnx_uio/mlnx/mlx5/core/cmd.c | 2069 ++++++
drivers/net/mlnx_uio/mlnx/mlx5/core/cq.c | 236 +
drivers/net/mlnx_uio/mlnx/mlx5/core/debugfs.c | 718 ++
drivers/net/mlnx_uio/mlnx/mlx5/core/en.h | 695 ++
drivers/net/mlnx_uio/mlnx/mlx5/core/en_debugfs.c | 115 +
drivers/net/mlnx_uio/mlnx/mlx5/core/en_ethtool.c | 816 +++
.../net/mlnx_uio/mlnx/mlx5/core/en_flow_table.c | 1014 +++
drivers/net/mlnx_uio/mlnx/mlx5/core/en_main.c | 2265 +++++++
drivers/net/mlnx_uio/mlnx/mlx5/core/en_rx.c | 310 +
drivers/net/mlnx_uio/mlnx/mlx5/core/en_tx.c | 392 ++
drivers/net/mlnx_uio/mlnx/mlx5/core/en_txrx.c | 118 +
drivers/net/mlnx_uio/mlnx/mlx5/core/eq.c | 566 ++
drivers/net/mlnx_uio/mlnx/mlx5/core/flow_table.c | 422 ++
drivers/net/mlnx_uio/mlnx/mlx5/core/fw.c | 199 +
drivers/net/mlnx_uio/mlnx/mlx5/core/health.c | 229 +
drivers/net/mlnx_uio/mlnx/mlx5/core/mad.c | 78 +
drivers/net/mlnx_uio/mlnx/mlx5/core/main.c | 1583 +++++
drivers/net/mlnx_uio/mlnx/mlx5/core/mcg.c | 105 +
drivers/net/mlnx_uio/mlnx/mlx5/core/mlx5_core.h | 105 +
drivers/net/mlnx_uio/mlnx/mlx5/core/mr.c | 251 +
drivers/net/mlnx_uio/mlnx/mlx5/core/pagealloc.c | 533 ++
drivers/net/mlnx_uio/mlnx/mlx5/core/params.c | 198 +
drivers/net/mlnx_uio/mlnx/mlx5/core/pd.c | 73 +
drivers/net/mlnx_uio/mlnx/mlx5/core/port.c | 869 +++
drivers/net/mlnx_uio/mlnx/mlx5/core/qp.c | 639 ++
drivers/net/mlnx_uio/mlnx/mlx5/core/sriov.c | 525 ++
drivers/net/mlnx_uio/mlnx/mlx5/core/srq.c | 524 ++
drivers/net/mlnx_uio/mlnx/mlx5/core/transobj.c | 361 +
drivers/net/mlnx_uio/mlnx/mlx5/core/transobj.h | 68 +
drivers/net/mlnx_uio/mlnx/mlx5/core/uar.c | 235 +
drivers/net/mlnx_uio/mlnx/mlx5/core/vport.c | 216 +
drivers/net/mlnx_uio/mlnx/mlx5/core/wq.c | 195 +
drivers/net/mlnx_uio/mlnx/mlx5/core/wq.h | 177 +
drivers/net/mlnx_uio/mlx4_en_special.h | 21 +
drivers/net/mlnx_uio/mlx4_uio.c | 1026 +++
drivers/net/mlnx_uio/prepare.py | 28 +
drivers/net/mlnx_uio/rte_pmd_mlnx_uio_version.map | 4 +
lib/Makefile | 1 +
lib/librte_eal/common/Makefile | 3 +
lib/librte_eal/common/include/rte_pci.h | 1 +
lib/librte_eal/common/include/rte_persistent_mem.h | 26 +
lib/librte_eal/linuxapp/eal/Makefile | 6 +
lib/librte_eal/linuxapp/eal/eal.c | 9 +
lib/librte_eal/linuxapp/eal/eal_persistent_mem.c | 148 +
.../eal/include/exec-env/rte_persistent_mem.h | 15 +
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 2 +
lib/librte_persistent/Makefile | 55 +
lib/librte_persistent/rte_persistent.c | 198 +
lib/librte_persistent/rte_persistent.h | 20 +
lib/librte_persistent/rte_persistent_version.map | 11 +
mk/rte.app.mk | 6 +
148 files changed, 91477 insertions(+)
create mode 100644 drivers/net/mlnx_uio/.gitignore
create mode 100644 drivers/net/mlnx_uio/LICENSE
create mode 100644 drivers/net/mlnx_uio/Makefile
create mode 100755 drivers/net/mlnx_uio/convert.py
create mode 100644 drivers/net/mlnx_uio/include/autoconf.h
create mode 100644 drivers/net/mlnx_uio/include/bitmap.h
create mode 100644 drivers/net/mlnx_uio/include/bitops.h
create mode 100644 drivers/net/mlnx_uio/include/dcbnl.h
create mode 100644 drivers/net/mlnx_uio/include/etherdevice.h
create mode 100644 drivers/net/mlnx_uio/include/ib_mad.h
create mode 100644 drivers/net/mlnx_uio/include/ib_smi.h
create mode 100644 drivers/net/mlnx_uio/include/ib_verbs.h
create mode 100644 drivers/net/mlnx_uio/include/inline_functions.h
create mode 100644 drivers/net/mlnx_uio/include/kcompat.h
create mode 100644 drivers/net/mlnx_uio/include/kmod.h
create mode 100644 drivers/net/mlnx_uio/include/list.h
create mode 100644 drivers/net/mlnx_uio/include/log2.h
create mode 100644 drivers/net/mlnx_uio/include/mlx4_dpdk.h
create mode 100644 drivers/net/mlnx_uio/include/mlx4_uio.h
create mode 100644 drivers/net/mlnx_uio/include/mlx4_uio_helper.h
create mode 100644 drivers/net/mlnx_uio/include/module.h
create mode 100644 drivers/net/mlnx_uio/include/netdev_features.h
create mode 100644 drivers/net/mlnx_uio/include/post_kmod.h
create mode 100644 drivers/net/mlnx_uio/include/radix-tree.h
create mode 100644 drivers/net/mlnx_uio/include/rbtree.h
create mode 100644 drivers/net/mlnx_uio/include/rbtree_augmented.h
create mode 100644 drivers/net/mlnx_uio/kernel/LICENSE
create mode 100644 drivers/net/mlnx_uio/kernel/bitmap.c
create mode 100644 drivers/net/mlnx_uio/kernel/kcompat.c
create mode 100644 drivers/net/mlnx_uio/kernel/radix-tree.c
create mode 100644 drivers/net/mlnx_uio/kernel/rbtree.c
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx4/cmd.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx4/cq.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx4/device.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx4/doorbell.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx4/driver.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx4/qp.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx4/srq.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/cmd.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/cq.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/device.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/doorbell.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/driver.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/flow_table.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/mlx5_ifc.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/qp.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/srq.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/vport.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/Kconfig
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/Makefile
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/alloc.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/catas.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/cmd.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/cq.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_clock.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_cq.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_dcb_nl.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_ethtool.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_main.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_netdev.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_port.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_port.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_resources.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_rx.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_rx_uio.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_selftest.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_sysfs.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_tx.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_tx_uio.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/eq.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/fw.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/fw.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/fw_qos.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/fw_qos.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/icm.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/icm.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/intf.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/main.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/main.c.orig
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/mcg.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/mlx4.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/mlx4_en.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/mlx4_stats.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/mr.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/pd.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/port.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/profile.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/qp.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/reset.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/resource_tracker.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/sense.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/srq.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/Kconfig
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/Makefile
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/alloc.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/cmd.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/cq.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/debugfs.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/en.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/en_debugfs.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/en_ethtool.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/en_flow_table.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/en_main.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/en_rx.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/en_tx.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/en_txrx.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/eq.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/flow_table.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/fw.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/health.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/mad.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/main.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/mcg.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/mlx5_core.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/mr.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/pagealloc.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/params.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/pd.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/port.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/qp.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/sriov.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/srq.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/transobj.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/transobj.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/uar.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/vport.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/wq.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/wq.h
create mode 100644 drivers/net/mlnx_uio/mlx4_en_special.h
create mode 100644 drivers/net/mlnx_uio/mlx4_uio.c
create mode 100755 drivers/net/mlnx_uio/prepare.py
create mode 100644 drivers/net/mlnx_uio/rte_pmd_mlnx_uio_version.map
create mode 100644 lib/librte_eal/common/include/rte_persistent_mem.h
create mode 100644 lib/librte_eal/linuxapp/eal/eal_persistent_mem.c
create mode 100644 lib/librte_eal/linuxapp/eal/include/exec-env/rte_persistent_mem.h
create mode 100644 lib/librte_persistent/Makefile
create mode 100644 lib/librte_persistent/rte_persistent.c
create mode 100644 lib/librte_persistent/rte_persistent.h
create mode 100644 lib/librte_persistent/rte_persistent_version.map
--
2.1.4
* [dpdk-dev] [PATCH 1/2] eal/persistent: new library to hold memory region after program exit
From: leeopop @ 2015-07-06 13:28 UTC
To: dev
Some NICs use a host memory region as their scratch area.
When a DPDK user application terminates, its memory regions are lost and
re-initialized (memzone), which causes HW faults.
This library maintains shared memory regions that persist across
repeated execution and termination of user-level applications.
It also manages physically contiguous memory regions.
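
As a rough illustration of the persistence idea only (not the code in this
patch), the sketch below creates a System V shared memory segment on first use
and reuses it on later runs, so data written by one process is still there
after that process exits. The helper names persistent_attach and
persistent_destroy, the 0600 permission bits, and the example key are
assumptions made for this sketch; hugepage backing, NUMA binding and page
locking are left out.

```c
#include <stddef.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical helper: attach to the System V segment identified by key,
 * creating it if it does not exist yet. *created is set to 1 when this call
 * created the segment and to 0 when an existing one was reused.
 * Returns NULL on failure.
 */
static void *persistent_attach(key_t key, size_t len, int *created)
{
	/* IPC_EXCL makes creation fail if the segment already exists. */
	int shmid = shmget(key, len, IPC_CREAT | IPC_EXCL | 0600);

	*created = (shmid >= 0);
	if (shmid < 0)
		shmid = shmget(key, len, 0600); /* reuse the existing segment */
	if (shmid < 0)
		return NULL;

	void *addr = shmat(shmid, NULL, 0);
	if (addr == (void *)-1) /* shmat reports failure as (void *)-1 */
		return NULL;

	if (*created)
		memset(addr, 0, len); /* only a brand-new segment is cleared */
	return addr;
}

/* Hypothetical helper: detach the mapping and mark the segment for removal. */
static int persistent_destroy(key_t key, void *addr)
{
	int shmid = shmget(key, 0, 0600); /* look up the existing segment */

	if (shmdt(addr) < 0 || shmid < 0)
		return -1;
	return shmctl(shmid, IPC_RMID, NULL);
}
```

A caller would run persistent_attach(0x0D9D2015, 4096, &created) at startup
and skip device scratch-area initialization when created is 0. The segment
stays in the kernel until it is removed explicitly (persistent_destroy here,
or ipcrm from a shell), which is what lets it outlive the application.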
Signed-off-by: leeopop <dlrmsghd@gmail.com>
---
drivers/net/mlnx_uio/LICENSE | 30 ++++
lib/Makefile | 1 +
lib/librte_eal/common/Makefile | 3 +
lib/librte_eal/common/include/rte_pci.h | 1 +
lib/librte_eal/common/include/rte_persistent_mem.h | 26 +++
lib/librte_eal/linuxapp/eal/Makefile | 6 +
lib/librte_eal/linuxapp/eal/eal.c | 9 +
lib/librte_eal/linuxapp/eal/eal_persistent_mem.c | 148 +++++++++++++++
.../eal/include/exec-env/rte_persistent_mem.h | 15 ++
lib/librte_eal/linuxapp/eal/rte_eal_version.map | 2 +
lib/librte_persistent/Makefile | 55 ++++++
lib/librte_persistent/rte_persistent.c | 198 +++++++++++++++++++++
lib/librte_persistent/rte_persistent.h | 20 +++
lib/librte_persistent/rte_persistent_version.map | 11 ++
14 files changed, 525 insertions(+)
create mode 100644 drivers/net/mlnx_uio/LICENSE
create mode 100644 lib/librte_eal/common/include/rte_persistent_mem.h
create mode 100644 lib/librte_eal/linuxapp/eal/eal_persistent_mem.c
create mode 100644 lib/librte_eal/linuxapp/eal/include/exec-env/rte_persistent_mem.h
create mode 100644 lib/librte_persistent/Makefile
create mode 100644 lib/librte_persistent/rte_persistent.c
create mode 100644 lib/librte_persistent/rte_persistent.h
create mode 100644 lib/librte_persistent/rte_persistent_version.map
diff --git a/drivers/net/mlnx_uio/LICENSE b/drivers/net/mlnx_uio/LICENSE
new file mode 100644
index 0000000..7ef5b4b
--- /dev/null
+++ b/drivers/net/mlnx_uio/LICENSE
@@ -0,0 +1,30 @@
+* Source code in kernel/ directory follows GPLv2 license.
+
+
+Copyright (c) 2015, Keunhong Lee
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+* Redistributions of source code must retain the above copyright notice, this
+ list of conditions and the following disclaimer.
+
+* Redistributions in binary form must reproduce the above copyright notice,
+ this list of conditions and the following disclaimer in the documentation
+ and/or other materials provided with the distribution.
+
+* Neither the name of bsd nor the names of its
+ contributors may be used to endorse or promote products derived from
+ this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
diff --git a/lib/Makefile b/lib/Makefile
index 5f480f9..7a491d3 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -57,6 +57,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_PORT) += librte_port
DIRS-$(CONFIG_RTE_LIBRTE_TABLE) += librte_table
DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += librte_pipeline
DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
+DIRS-$(CONFIG_RTE_LIBRTE_PERSISTENT) += librte_persistent
ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index 38772d4..ce4b0a7 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -40,6 +40,9 @@ INC += rte_string_fns.h rte_version.h
INC += rte_eal_memconfig.h rte_malloc_heap.h
INC += rte_hexdump.h rte_devargs.h rte_dev.h
INC += rte_pci_dev_feature_defs.h rte_pci_dev_features.h
+ifeq ($(CONFIG_RTE_EAL_PERSISTENT_MEM),y)
+INC += rte_persistent_mem.h
+endif
ifeq ($(CONFIG_RTE_INSECURE_FUNCTION_WARNING),y)
INC += rte_warnings.h
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 7801fa0..a323e74 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -207,6 +207,7 @@ struct rte_pci_driver {
pci_devuninit_t *devuninit; /**< Device uninit function. */
const struct rte_pci_id *id_table; /**< ID table, NULL terminated. */
uint32_t drv_flags; /**< Flags contolling handling of device. */
+ void* priv; /**< Private data. */
};
/** Device needs PCI BAR mapping (done with either IGB_UIO or VFIO) */
diff --git a/lib/librte_eal/common/include/rte_persistent_mem.h b/lib/librte_eal/common/include/rte_persistent_mem.h
new file mode 100644
index 0000000..3a8ff23
--- /dev/null
+++ b/lib/librte_eal/common/include/rte_persistent_mem.h
@@ -0,0 +1,26 @@
+/*
+ * rte_persistent_memory.h
+ *
+ * Created on: Jun 22, 2015
+ * Author: leeopop
+ */
+
+#ifndef LIBRTE_EAL_COMMON_INCLUDE_RTE_PERSISTENT_MEM_H_
+#define LIBRTE_EAL_COMMON_INCLUDE_RTE_PERSISTENT_MEM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <exec-env/rte_persistent_mem.h>
+
+int rte_persistent_memory_init(void);
+int rte_persistent_memory_num_numa(void);
+
+extern void* persistent_allocated_memory[RTE_MAX_NUMA_NODES][RTE_EAL_PERSISTENT_MEM_COUNT];
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* LIBRTE_EAL_COMMON_INCLUDE_RTE_PERSISTENT_MEM_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index e99d7a3..139b608 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -74,6 +74,9 @@ SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_alarm.c
ifeq ($(CONFIG_RTE_LIBRTE_IVSHMEM),y)
SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_ivshmem.c
endif
+ifeq ($(CONFIG_RTE_EAL_PERSISTENT_MEM),y)
+SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_persistent_mem.c
+endif
# from common dir
SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_common_memzone.c
@@ -112,6 +115,9 @@ CFLAGS_eal_thread.o += -Wno-return-type
endif
INC := rte_interrupts.h rte_kni_common.h rte_dom0_common.h
+ifeq ($(CONFIG_RTE_EAL_PERSISTENT_MEM),y)
+INC += rte_persistent_mem.h
+endif
SYMLINK-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP)-include/exec-env := \
$(addprefix include/exec-env/,$(INC))
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 8809f57..b3f05a8 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -77,6 +77,10 @@
#include <malloc_heap.h>
#include <rte_eth_ring.h>
+#ifdef RTE_EAL_PERSISTENT_MEM
+#include <rte_persistent_mem.h>
+#endif
+
#include "eal_private.h"
#include "eal_thread.h"
#include "eal_internal_cfg.h"
@@ -759,6 +763,11 @@ rte_eal_init(int argc, char **argv)
if (fctret < 0)
exit(1);
+#ifdef RTE_EAL_PERSISTENT_MEM
+ if (rte_persistent_memory_init() < 0)
+ rte_panic("Cannot init persistent memory\n");
+#endif
+
if (internal_config.no_hugetlbfs == 0 &&
internal_config.process_type != RTE_PROC_SECONDARY &&
internal_config.xen_dom0_support == 0 &&
diff --git a/lib/librte_eal/linuxapp/eal/eal_persistent_mem.c b/lib/librte_eal/linuxapp/eal/eal_persistent_mem.c
new file mode 100644
index 0000000..f72c148
--- /dev/null
+++ b/lib/librte_eal/linuxapp/eal/eal_persistent_mem.c
@@ -0,0 +1,148 @@
+/*
+ * eal_persistent_mem.c
+ *
+ * Created on: Jun 22, 2015
+ * Author: leeopop
+ */
+
+
+/*
+ * dma_memory.c
+ *
+ * Created on: Oct 4, 2014
+ * Author: leeopop
+ */
+
+
+#include <rte_persistent_mem.h>
+
+#include <sys/io.h>
+#include <sys/ipc.h>
+#include <sys/shm.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <assert.h>
+#include <unistd.h>
+#include <numa.h>
+#include <numaif.h>
+#include <sys/time.h>
+#include <sys/resource.h>
+
+#include <rte_log.h>
+#include <rte_eal.h>
+#include <rte_memory.h>
+#include <rte_common.h>
+#include <rte_atomic.h>
+
+#define SHM_SIZE RTE_EAL_PERSISTENT_MEM_UNIT
+#define SHM_COUNT RTE_EAL_PERSISTENT_MEM_COUNT
+
+#define SHM_KEY_BASE (0x861591B)
+#define SHM_KEY ((SHM_KEY_BASE / SHM_COUNT)*SHM_COUNT)
+
+static void* reserve_shared_zone(int subindex, uint32_t len, int socket_id)
+{
+ assert(subindex < SHM_COUNT);
+ uint32_t shared_key = SHM_KEY_BASE + subindex;
+
+ int shmget_flag = IPC_CREAT | SHM_R | SHM_W | IPC_EXCL; // | SHM_LOCKED;
+ int shmid = -1;
+ int err;
+ if((len / RTE_PGSIZE_4K) > 1)
+ {
+ shmget_flag |= SHM_HUGETLB;
+ }
+
+ shmid = shmget(shared_key, len, shmget_flag);
+ void* addr = 0;
+ int clear = 1;
+ if(shmid < 0)
+ {
+ //Reuse existing
+ shmid = shmget(shared_key, len, shmget_flag &= ~IPC_EXCL);
+ assert(shmid >= 0);
+ clear = 0;
+ }
+ addr = shmat(shmid, 0, SHM_RND);
+ assert(addr);
+
+ if(socket_id != SOCKET_ID_ANY)
+ {
+ struct bitmask * mask = numa_bitmask_alloc(RTE_MAX_NUMA_NODES);
+ mask = numa_bitmask_clearall(mask);
+ mask = numa_bitmask_setbit(mask, socket_id);
+ long ret = mbind(addr, len, MPOL_BIND,
+ mask->maskp, RTE_MAX_NUMA_NODES,
+ MPOL_MF_MOVE_ALL | MPOL_MF_STRICT);
+ if(ret < 0)
+ {
+ RTE_LOG(WARNING, EAL, "Cannot mbind memory. Are you running with root?\n");
+ }
+ numa_bitmask_free(mask);
+ }
+ rte_mb();
+
+ if(clear)
+ {
+ memset(addr, 0, len);
+ }
+
+ size_t size;
+ volatile uint8_t reader = 0; //this prevents from being optimized out
+ volatile uint8_t* readp = (uint8_t*)addr;
+ for(size = 0; size < len; size++)
+ {
+ reader += *readp;
+ readp++;
+ }
+
+ rte_mb();
+ err = shmctl(shmid, SHM_LOCK, 0);
+ assert(err == 0);
+ return addr;
+}
+
+void* persistent_allocated_memory[RTE_MAX_NUMA_NODES][SHM_COUNT];
+
+static int numa_count = 0;
+
+int rte_persistent_memory_num_numa(void)
+{
+ return numa_count;
+}
+
+int rte_persistent_memory_init(void)
+{
+ assert(SHM_SIZE == RTE_PGSIZE_2M); //XXX considering only 2MB pages.
+ int num_numa = numa_num_configured_nodes();
+ if(num_numa == 0)
+ num_numa = 1;
+ numa_count = num_numa;
+ int node;
+ int k;
+ for(node = 0; node < RTE_MAX_NUMA_NODES; node++)
+ for(k=0; k<SHM_COUNT; k++)
+ persistent_allocated_memory[node][k] = 0;
+
+ for(node = 0; node < num_numa; node++)
+ {
+ int cur_socket = num_numa > 1 ? node : SOCKET_ID_ANY;
+ for(k=0; k<SHM_COUNT/num_numa; k++)
+ {
+ int zone_index = ((SHM_COUNT/num_numa)*node + k);
+ persistent_allocated_memory[node][k] = reserve_shared_zone(zone_index,
+ SHM_SIZE, cur_socket);
+ if(persistent_allocated_memory[node][k] == 0)
+ {
+ RTE_LOG(ERR, EAL, "Cannot allocate shared zone index %d."
+ "node: %d, local index: %d\n", zone_index, node, k);
+ return -1;
+ }
+ }
+ RTE_LOG(INFO, EAL, "Initialized %lu bytes shared zone on socket %d.\n",
+ ((uint64_t)(SHM_COUNT/num_numa)) * ((uint64_t)(SHM_SIZE)),
+ cur_socket);
+ }
+ return 0;
+}
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_persistent_mem.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_persistent_mem.h
new file mode 100644
index 0000000..4038cd5
--- /dev/null
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_persistent_mem.h
@@ -0,0 +1,15 @@
+/*
+ * rte_persistent_mem.h
+ *
+ * Created on: Jun 22, 2015
+ * Author: leeopop
+ */
+
+#ifndef LIBRTE_EAL_LINUXAPP_EAL_INCLUDE_EXEC_ENV_RTE_PERSISTENT_MEM_H_
+#define LIBRTE_EAL_LINUXAPP_EAL_INCLUDE_EXEC_ENV_RTE_PERSISTENT_MEM_H_
+
+#ifndef LIBRTE_EAL_COMMON_INCLUDE_RTE_PERSISTENT_MEM_H_
+#error "don't include this file directly, please include generic <rte_persistent_mem.h>"
+#endif
+
+#endif /* LIBRTE_EAL_LINUXAPP_EAL_INCLUDE_EXEC_ENV_RTE_PERSISTENT_MEM_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 7e850a9..4382a01 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -95,6 +95,8 @@ DPDK_2.0 {
rte_xen_dom0_memory_attach;
rte_xen_dom0_memory_init;
test_mp_secondary;
+ rte_persistent_memory_init;
+ rte_persistent_memory_num_numa;
local: *;
};
diff --git a/lib/librte_persistent/Makefile b/lib/librte_persistent/Makefile
new file mode 100644
index 0000000..a233d95
--- /dev/null
+++ b/lib/librte_persistent/Makefile
@@ -0,0 +1,55 @@
+# BSD LICENSE
+#
+# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_persistent.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
+
+EXPORT_MAP := rte_persistent_version.map
+
+LIBABIVER := 1
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_PERSISTENT) := rte_persistent.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_PERSISTENT)-include := rte_persistent.h
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PERSISTENT) += lib/librte_hash
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PERSISTENT) += lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PERSISTENT) += lib/librte_eal
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_persistent/rte_persistent.c b/lib/librte_persistent/rte_persistent.c
new file mode 100644
index 0000000..a21f9dc
--- /dev/null
+++ b/lib/librte_persistent/rte_persistent.c
@@ -0,0 +1,198 @@
+/*
+ * rte_persistent.c
+ *
+ * Created on: Jun 23, 2015
+ * Author: leeopop
+ */
+
+#include <rte_persistent_mem.h>
+#include <rte_persistent.h>
+#include <rte_hash.h>
+#include <rte_memory.h>
+
+#include <memory.h>
+#include <string.h>
+#include <rte_common.h>
+#include <rte_random.h>
+#include <rte_log.h>
+#include <assert.h>
+
+#define ALLOC_UNIT RTE_PGSIZE_4K
+#define MAX_CONT_MEMORY RTE_EAL_PERSISTENT_MEM_UNIT
+#define MAX_ALLOC_COUNT (RTE_EAL_PERSISTENT_MEM_COUNT*(RTE_EAL_PERSISTENT_MEM_UNIT/ALLOC_UNIT))
+#define SEGMENT_COUNT (RTE_EAL_PERSISTENT_MEM_COUNT)
+#define SUBSEGMENT_COUNT (RTE_EAL_PERSISTENT_MEM_UNIT/ALLOC_UNIT)
+
+static struct rte_hash* allocated_segments = 0;
+
+struct alloc_info
+{
+ void* addr; //0 if not allocated
+ phys_addr_t hw_addr;
+ int seg_index;
+ int sub_index;
+ int seg_count;
+};
+
+struct alloc_info info_array[MAX_ALLOC_COUNT];
+char alloc_array[SEGMENT_COUNT][SUBSEGMENT_COUNT+1];
+
+#define ALLOCATED 'a'
+#define FREE 'f'
+
+static int __initialized = 0;
+
+int rte_persistent_init(void)
+{
+ if(!__initialized)
+ {
+ struct rte_hash_parameters hash_param =
+ {
+ .name = "Persistent memory segments",
+ .entries = MAX_ALLOC_COUNT,
+ .bucket_entries = RTE_HASH_BUCKET_ENTRIES_MAX,
+ .key_len = sizeof(void*),
+ .hash_func = 0, //DEFAULT_HASH_FUNC,
+ .hash_func_init_val = 0,
+ .socket_id = SOCKET_ID_ANY,
+ };
+ allocated_segments = rte_hash_create(&hash_param);
+ memset(info_array, 0, sizeof(info_array));
+ memset(alloc_array, (int)FREE, sizeof(alloc_array));
+
+ int k;
+ for(k=0; k<SEGMENT_COUNT; k++)
+ alloc_array[k][SUBSEGMENT_COUNT] = 0;
+ __initialized = 1;
+ }
+ return 0;
+}
+
+static int global_to_local_start(int total_numa, int numa)
+{
+ return ((RTE_EAL_PERSISTENT_MEM_COUNT/total_numa)*numa);
+}
+
+static int global_to_local_range(int total_numa)
+{
+ return ((RTE_EAL_PERSISTENT_MEM_COUNT/total_numa));
+}
+
+void* rte_persistent_alloc(size_t size, int socket)
+{
+ int num_numa = rte_persistent_memory_num_numa();
+ if(socket == SOCKET_ID_ANY)
+ {
+ socket = rte_rand() % num_numa;
+ }
+
+ int l_start = global_to_local_start(num_numa, socket);
+ int l_range = global_to_local_range(num_numa);
+
+ int num_page = (size / ALLOC_UNIT);
+ if(size % ALLOC_UNIT)
+ num_page++;
+
+ char find_str[SUBSEGMENT_COUNT+1];
+ int k;
+ for(k=0; k<num_page; k++)
+ {
+ find_str[k] = FREE;
+ }
+ find_str[k] = 0;
+
+ void* found_buffer = 0;
+ for(k=l_start; k<(l_start + l_range); k++)
+ {
+ char* start = alloc_array[k];
+ char* found = strstr(start, find_str);
+
+ if(found)
+ {
+ int offset = found - start;
+ found_buffer = persistent_allocated_memory[socket][k];
+ assert(found_buffer);
+ found_buffer = RTE_PTR_ADD(found_buffer, ALLOC_UNIT*offset);
+ int j;
+ for(j=0; j<num_page; j++)
+ {
+ found[j] = ALLOCATED;
+ }
+ int index = rte_hash_add_key(allocated_segments, &found_buffer);
+ assert(index >= 0);
+ assert(info_array[index].addr == 0);
+ info_array[index].addr = found_buffer;
+ info_array[index].hw_addr = rte_mem_virt2phy(found_buffer);
+ info_array[index].seg_count = num_page;
+ info_array[index].seg_index = k;
+ info_array[index].sub_index = offset;
+ memset(found_buffer, 0, num_page*ALLOC_UNIT);
+
+
+ void* user = found_buffer;
+ uint64_t hw = rte_mem_virt2phy(user);
+ size_t diff = RTE_MAX((uint64_t)user, hw) - RTE_MIN((uint64_t)user, hw);
+ for(j = 0; j < num_page; j++)
+ {
+ size_t shift = ALLOC_UNIT * j;
+ void* cur_user = ((char*)user + shift);
+ uint64_t cur_hw = rte_mem_virt2phy(cur_user);
+ size_t cur_diff = RTE_MAX((uint64_t)cur_user, cur_hw) - RTE_MIN((uint64_t)cur_user, cur_hw);
+
+ if(cur_diff != diff)
+ {
+ RTE_LOG(ERR, EAL, "Hugepage is not contiguous, curdiff: %zX, expected: %zX\n", cur_diff, diff);
+ assert(0);
+ }
+ }
+ break;
+ }
+ }
+ if(!found_buffer)
+ RTE_LOG(ERR, EAL, "Cannot allocate persistent memory, size: %zu, socket: %d\n", size, socket);
+ return found_buffer;
+}
+
+phys_addr_t rte_persistent_hw_addr(const void* addr)
+{
+ if(addr == 0)
+ return 0;
+ int index = rte_hash_lookup(allocated_segments, (const void*)&addr);
+ assert(index >= 0);
+ assert(info_array[index].addr);
+ assert(info_array[index].addr == addr);
+ return info_array[index].hw_addr;
+}
+
+size_t rte_persistent_mem_length(const void* addr)
+{
+ int index = rte_hash_lookup(allocated_segments, (const void*)&addr);
+ assert(index >= 0);
+ assert(info_array[index].addr);
+ assert(info_array[index].addr == addr);
+ return info_array[index].seg_count * ALLOC_UNIT;
+}
+
+void rte_persistent_free(void* addr)
+{
+ int index = rte_hash_lookup(allocated_segments, (const void*)&addr);
+ assert(index >= 0);
+ assert(info_array[index].addr);
+ assert(info_array[index].addr == addr);
+
+ int seg_index = info_array[index].seg_index;
+ int sub_index = info_array[index].sub_index;
+ int len = info_array[index].seg_count;
+
+ info_array[index].seg_index = 0;
+ info_array[index].sub_index = 0;
+ info_array[index].seg_count = 0;
+ info_array[index].addr = 0;
+ info_array[index].hw_addr = 0;
+
+ rte_hash_del_key(allocated_segments, (const void*)&addr);
+
+ int k;
+ for(k=0; k<len; k++)
+ alloc_array[seg_index][sub_index+k] = FREE;
+}
diff --git a/lib/librte_persistent/rte_persistent.h b/lib/librte_persistent/rte_persistent.h
new file mode 100644
index 0000000..b59bd86
--- /dev/null
+++ b/lib/librte_persistent/rte_persistent.h
@@ -0,0 +1,20 @@
+/*
+ * rte_persistent.h
+ *
+ * Created on: Jun 23, 2015
+ * Author: leeopop
+ */
+
+#ifndef LIBRTE_PERSISTENT_RTE_PERSISTENT_H_
+#define LIBRTE_PERSISTENT_RTE_PERSISTENT_H_
+
+#include <rte_common.h>
+#include <rte_memory.h>
+
+int rte_persistent_init(void);
+void* rte_persistent_alloc(size_t size, int socket);
+phys_addr_t rte_persistent_hw_addr(const void* addr);
+void rte_persistent_free(void* addr);
+size_t rte_persistent_mem_length(const void* addr);
+
+#endif /* LIBRTE_PERSISTENT_RTE_PERSISTENT_H_ */
diff --git a/lib/librte_persistent/rte_persistent_version.map b/lib/librte_persistent/rte_persistent_version.map
new file mode 100644
index 0000000..f81d505
--- /dev/null
+++ b/lib/librte_persistent/rte_persistent_version.map
@@ -0,0 +1,11 @@
+DPDK_2.0 {
+ global:
+
+ rte_persistent_init;
+ rte_persistent_alloc;
+ rte_persistent_hw_addr;
+ rte_persistent_free;
+ rte_persistent_mem_length;
+
+ local: *;
+};
--
2.1.4
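
For readers skimming the allocator in rte_persistent.c above: it tracks 4 KB pages with one character per page ('f' free, 'a' allocated), finds a run of free pages with strstr(), and restricts the search to the segments assigned to the requested NUMA node. The following is only a minimal standalone sketch of that idea, not the patch code. SEGMENTS, PAGES_PER_SEG, page_map, map_init, map_alloc, map_free and demo are hypothetical names chosen for illustration, the sizes are deliberately tiny, and no hugepage or rte_hash handling is included. The sketch also rejects requests larger than one segment, a bounds check that the patch as posted does not perform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SEGMENTS      4      /* hypothetical; stands in for RTE_EAL_PERSISTENT_MEM_COUNT */
#define PAGES_PER_SEG 8      /* hypothetical; stands in for SUBSEGMENT_COUNT */
#define PAGE_SIZE     4096u  /* same unit as ALLOC_UNIT (4 KB) */

/* One NUL-terminated string per segment: 'f' = free page, 'a' = allocated page. */
static char page_map[SEGMENTS][PAGES_PER_SEG + 1];

static void map_init(void)
{
    for (int s = 0; s < SEGMENTS; s++) {
        memset(page_map[s], 'f', PAGES_PER_SEG);
        page_map[s][PAGES_PER_SEG] = '\0';
    }
}

/*
 * First-fit search for a contiguous run of free pages.
 * Only the segments belonging to NUMA node `numa` are searched.
 * Returns 0 and stores the segment and page index, or -1 if nothing fits.
 */
static int map_alloc(size_t size, int total_numa, int numa, int *seg, int *page)
{
    size_t num_page = (size + PAGE_SIZE - 1) / PAGE_SIZE; /* round up */
    char pattern[PAGES_PER_SEG + 1];

    /* Assumption of this sketch: zero-sized and oversized requests fail. */
    if (num_page == 0 || num_page > PAGES_PER_SEG)
        return -1;

    memset(pattern, 'f', num_page);
    pattern[num_page] = '\0';

    int range = SEGMENTS / total_numa; /* segments per NUMA node */
    int start = range * numa;          /* first segment of this node */

    for (int s = start; s < start + range; s++) {
        char *found = strstr(page_map[s], pattern);
        if (found != NULL) {
            memset(found, 'a', num_page); /* mark the run as allocated */
            *seg = s;
            *page = (int)(found - page_map[s]);
            return 0;
        }
    }
    return -1;
}

static void map_free(int seg, int page, size_t num_page)
{
    memset(&page_map[seg][page], 'f', num_page);
}

/* Small usage scenario; returns 0 if every step behaves as described. */
static int demo(void)
{
    int seg, page;

    map_init();
    /* Node 1 of 2 owns segments 2 and 3. */
    if (map_alloc(3 * PAGE_SIZE, 2, 1, &seg, &page) != 0 || seg != 2 || page != 0)
        return 1;
    /* 4097 bytes round up to two pages, placed right after the first run. */
    if (map_alloc(PAGE_SIZE + 1, 2, 1, &seg, &page) != 0 || seg != 2 || page != 3)
        return 2;
    /* Four pages no longer fit in segment 2, so the search moves to segment 3. */
    if (map_alloc(4 * PAGE_SIZE, 2, 1, &seg, &page) != 0 || seg != 3 || page != 0)
        return 3;
    /* Freeing the first run makes it reusable by the next request. */
    map_free(2, 0, 3);
    if (map_alloc(3 * PAGE_SIZE, 2, 1, &seg, &page) != 0 || seg != 2 || page != 0)
        return 4;
    return 0;
}
```

The character map keeps the search a single strstr() call per segment, at the cost of one byte per page and a linear scan. A run can never span two segments, because each segment's string is terminated separately.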
* [dpdk-dev] [PATCH 2/2] mlnx_uio: new poll mode driver
2015-07-06 13:28 [dpdk-dev] [PATCH 0/2] Native uio-based PMD for Mellanox ConnectX-3 devices leeopop
2015-07-06 13:28 ` [dpdk-dev] [PATCH 1/2] eal/persistent: new library to hold memory region after program exit leeopop
@ 2015-07-06 13:28 ` leeopop
2015-07-06 14:17 ` [dpdk-dev] [PATCH 0/2] Native uio-based PMD for Mellanox ConnectX-3 devices Thomas Monjalon
2 siblings, 0 replies; 13+ messages in thread
From: leeopop @ 2015-07-06 13:28 UTC (permalink / raw)
To: dev
This PMD offers direct access to Mellanox ConnectX-3 NICs
instead of using IB Verbs like the mlx4 driver.
Currently it supports a limited set of features:
only physical functions (PF) and basic
RX/TX functionality such as RSS and scatter-gather
I/O of mbufs.
We are working on the missing features such as
virtual functions, VLAN stripping, and support for
mlx5 NICs.
We enable this PMD by default, as it relies only on
the new eal/persistent library and has no external
dependencies.
Signed-off-by: leeopop <dlrmsghd@gmail.com>
---
config/common_linuxapp | 10 +
drivers/net/Makefile | 1 +
drivers/net/mlnx_uio/.gitignore | 1 +
drivers/net/mlnx_uio/Makefile | 139 +
drivers/net/mlnx_uio/convert.py | 50 +
drivers/net/mlnx_uio/include/autoconf.h | 10 +
drivers/net/mlnx_uio/include/bitmap.h | 314 +
drivers/net/mlnx_uio/include/bitops.h | 558 ++
drivers/net/mlnx_uio/include/dcbnl.h | 751 +++
drivers/net/mlnx_uio/include/etherdevice.h | 189 +
drivers/net/mlnx_uio/include/ib_mad.h | 664 ++
drivers/net/mlnx_uio/include/ib_smi.h | 128 +
drivers/net/mlnx_uio/include/ib_verbs.h | 806 +++
drivers/net/mlnx_uio/include/inline_functions.h | 307 +
drivers/net/mlnx_uio/include/kcompat.h | 36 +
drivers/net/mlnx_uio/include/kmod.h | 768 +++
drivers/net/mlnx_uio/include/list.h | 780 +++
drivers/net/mlnx_uio/include/log2.h | 229 +
drivers/net/mlnx_uio/include/mlx4_dpdk.h | 17 +
drivers/net/mlnx_uio/include/mlx4_uio.h | 24 +
drivers/net/mlnx_uio/include/mlx4_uio_helper.h | 800 +++
drivers/net/mlnx_uio/include/module.h | 12 +
drivers/net/mlnx_uio/include/netdev_features.h | 166 +
drivers/net/mlnx_uio/include/post_kmod.h | 13 +
drivers/net/mlnx_uio/include/radix-tree.h | 48 +
drivers/net/mlnx_uio/include/rbtree.h | 105 +
drivers/net/mlnx_uio/include/rbtree_augmented.h | 230 +
drivers/net/mlnx_uio/kernel/LICENSE | 339 +
drivers/net/mlnx_uio/kernel/bitmap.c | 831 +++
drivers/net/mlnx_uio/kernel/kcompat.c | 96 +
drivers/net/mlnx_uio/kernel/radix-tree.c | 78 +
drivers/net/mlnx_uio/kernel/rbtree.c | 561 ++
drivers/net/mlnx_uio/mlnx/include/mlx4/cmd.h | 309 +
drivers/net/mlnx_uio/mlnx/include/mlx4/cq.h | 195 +
drivers/net/mlnx_uio/mlnx/include/mlx4/device.h | 1744 +++++
drivers/net/mlnx_uio/mlnx/include/mlx4/doorbell.h | 90 +
drivers/net/mlnx_uio/mlnx/include/mlx4/driver.h | 175 +
drivers/net/mlnx_uio/mlnx/include/mlx4/qp.h | 540 ++
drivers/net/mlnx_uio/mlnx/include/mlx4/srq.h | 50 +
drivers/net/mlnx_uio/mlnx/include/mlx5/cmd.h | 56 +
drivers/net/mlnx_uio/mlnx/include/mlx5/cq.h | 182 +
drivers/net/mlnx_uio/mlnx/include/mlx5/device.h | 1204 ++++
drivers/net/mlnx_uio/mlnx/include/mlx5/doorbell.h | 85 +
drivers/net/mlnx_uio/mlnx/include/mlx5/driver.h | 1063 +++
.../net/mlnx_uio/mlnx/include/mlx5/flow_table.h | 59 +
drivers/net/mlnx_uio/mlnx/include/mlx5/mlx5_ifc.h | 6892 ++++++++++++++++++++
drivers/net/mlnx_uio/mlnx/include/mlx5/qp.h | 804 +++
drivers/net/mlnx_uio/mlnx/include/mlx5/srq.h | 46 +
drivers/net/mlnx_uio/mlnx/include/mlx5/vport.h | 52 +
drivers/net/mlnx_uio/mlnx/mlx4/Kconfig | 46 +
drivers/net/mlnx_uio/mlnx/mlx4/Makefile | 19 +
drivers/net/mlnx_uio/mlnx/mlx4/alloc.c | 872 +++
drivers/net/mlnx_uio/mlnx/mlx4/catas.c | 350 +
drivers/net/mlnx_uio/mlnx/mlx4/cmd.c | 3456 ++++++++++
drivers/net/mlnx_uio/mlnx/mlx4/cq.c | 443 ++
drivers/net/mlnx_uio/mlnx/mlx4/en_clock.c | 330 +
drivers/net/mlnx_uio/mlnx/mlx4/en_cq.c | 257 +
drivers/net/mlnx_uio/mlnx/mlx4/en_dcb_nl.c | 613 ++
drivers/net/mlnx_uio/mlnx/mlx4/en_ethtool.c | 2582 ++++++++
drivers/net/mlnx_uio/mlnx/mlx4/en_main.c | 493 ++
drivers/net/mlnx_uio/mlnx/mlx4/en_netdev.c | 3786 +++++++++++
drivers/net/mlnx_uio/mlnx/mlx4/en_port.c | 493 ++
drivers/net/mlnx_uio/mlnx/mlx4/en_port.h | 593 ++
drivers/net/mlnx_uio/mlnx/mlx4/en_resources.c | 184 +
drivers/net/mlnx_uio/mlnx/mlx4/en_rx.c | 1565 +++++
drivers/net/mlnx_uio/mlnx/mlx4/en_rx_uio.c | 187 +
drivers/net/mlnx_uio/mlnx/mlx4/en_selftest.c | 194 +
drivers/net/mlnx_uio/mlnx/mlx4/en_sysfs.c | 623 ++
drivers/net/mlnx_uio/mlnx/mlx4/en_tx.c | 1143 ++++
drivers/net/mlnx_uio/mlnx/mlx4/en_tx_uio.c | 47 +
drivers/net/mlnx_uio/mlnx/mlx4/eq.c | 1777 +++++
drivers/net/mlnx_uio/mlnx/mlx4/fw.c | 3005 +++++++++
drivers/net/mlnx_uio/mlnx/mlx4/fw.h | 270 +
drivers/net/mlnx_uio/mlnx/mlx4/fw_qos.c | 292 +
drivers/net/mlnx_uio/mlnx/mlx4/fw_qos.h | 150 +
drivers/net/mlnx_uio/mlnx/mlx4/icm.c | 522 ++
drivers/net/mlnx_uio/mlnx/mlx4/icm.h | 133 +
drivers/net/mlnx_uio/mlnx/mlx4/intf.c | 246 +
drivers/net/mlnx_uio/mlnx/mlx4/main.c | 5485 ++++++++++++++++
drivers/net/mlnx_uio/mlnx/mlx4/main.c.orig | 5335 +++++++++++++++
drivers/net/mlnx_uio/mlnx/mlx4/mcg.c | 1665 +++++
drivers/net/mlnx_uio/mlnx/mlx4/mlx4.h | 1514 +++++
drivers/net/mlnx_uio/mlnx/mlx4/mlx4_en.h | 1188 ++++
drivers/net/mlnx_uio/mlnx/mlx4/mlx4_stats.h | 153 +
drivers/net/mlnx_uio/mlnx/mlx4/mr.c | 1178 ++++
drivers/net/mlnx_uio/mlnx/mlx4/pd.c | 310 +
drivers/net/mlnx_uio/mlnx/mlx4/port.c | 1636 +++++
drivers/net/mlnx_uio/mlnx/mlx4/profile.c | 259 +
drivers/net/mlnx_uio/mlnx/mlx4/qp.c | 956 +++
drivers/net/mlnx_uio/mlnx/mlx4/reset.c | 202 +
drivers/net/mlnx_uio/mlnx/mlx4/resource_tracker.c | 5052 ++++++++++++++
drivers/net/mlnx_uio/mlnx/mlx4/sense.c | 153 +
drivers/net/mlnx_uio/mlnx/mlx4/srq.c | 314 +
drivers/net/mlnx_uio/mlnx/mlx5/core/Kconfig | 8 +
drivers/net/mlnx_uio/mlnx/mlx5/core/Makefile | 9 +
drivers/net/mlnx_uio/mlnx/mlx5/core/alloc.c | 273 +
drivers/net/mlnx_uio/mlnx/mlx5/core/cmd.c | 2069 ++++++
drivers/net/mlnx_uio/mlnx/mlx5/core/cq.c | 236 +
drivers/net/mlnx_uio/mlnx/mlx5/core/debugfs.c | 718 ++
drivers/net/mlnx_uio/mlnx/mlx5/core/en.h | 695 ++
drivers/net/mlnx_uio/mlnx/mlx5/core/en_debugfs.c | 115 +
drivers/net/mlnx_uio/mlnx/mlx5/core/en_ethtool.c | 816 +++
.../net/mlnx_uio/mlnx/mlx5/core/en_flow_table.c | 1014 +++
drivers/net/mlnx_uio/mlnx/mlx5/core/en_main.c | 2265 +++++++
drivers/net/mlnx_uio/mlnx/mlx5/core/en_rx.c | 310 +
drivers/net/mlnx_uio/mlnx/mlx5/core/en_tx.c | 392 ++
drivers/net/mlnx_uio/mlnx/mlx5/core/en_txrx.c | 118 +
drivers/net/mlnx_uio/mlnx/mlx5/core/eq.c | 566 ++
drivers/net/mlnx_uio/mlnx/mlx5/core/flow_table.c | 422 ++
drivers/net/mlnx_uio/mlnx/mlx5/core/fw.c | 199 +
drivers/net/mlnx_uio/mlnx/mlx5/core/health.c | 229 +
drivers/net/mlnx_uio/mlnx/mlx5/core/mad.c | 78 +
drivers/net/mlnx_uio/mlnx/mlx5/core/main.c | 1583 +++++
drivers/net/mlnx_uio/mlnx/mlx5/core/mcg.c | 105 +
drivers/net/mlnx_uio/mlnx/mlx5/core/mlx5_core.h | 105 +
drivers/net/mlnx_uio/mlnx/mlx5/core/mr.c | 251 +
drivers/net/mlnx_uio/mlnx/mlx5/core/pagealloc.c | 533 ++
drivers/net/mlnx_uio/mlnx/mlx5/core/params.c | 198 +
drivers/net/mlnx_uio/mlnx/mlx5/core/pd.c | 73 +
drivers/net/mlnx_uio/mlnx/mlx5/core/port.c | 869 +++
drivers/net/mlnx_uio/mlnx/mlx5/core/qp.c | 639 ++
drivers/net/mlnx_uio/mlnx/mlx5/core/sriov.c | 525 ++
drivers/net/mlnx_uio/mlnx/mlx5/core/srq.c | 524 ++
drivers/net/mlnx_uio/mlnx/mlx5/core/transobj.c | 361 +
drivers/net/mlnx_uio/mlnx/mlx5/core/transobj.h | 68 +
drivers/net/mlnx_uio/mlnx/mlx5/core/uar.c | 235 +
drivers/net/mlnx_uio/mlnx/mlx5/core/vport.c | 216 +
drivers/net/mlnx_uio/mlnx/mlx5/core/wq.c | 195 +
drivers/net/mlnx_uio/mlnx/mlx5/core/wq.h | 177 +
drivers/net/mlnx_uio/mlx4_en_special.h | 21 +
drivers/net/mlnx_uio/mlx4_uio.c | 1026 +++
drivers/net/mlnx_uio/prepare.py | 28 +
drivers/net/mlnx_uio/rte_pmd_mlnx_uio_version.map | 4 +
mk/rte.app.mk | 6 +
134 files changed, 90952 insertions(+)
create mode 100644 drivers/net/mlnx_uio/.gitignore
create mode 100644 drivers/net/mlnx_uio/Makefile
create mode 100755 drivers/net/mlnx_uio/convert.py
create mode 100644 drivers/net/mlnx_uio/include/autoconf.h
create mode 100644 drivers/net/mlnx_uio/include/bitmap.h
create mode 100644 drivers/net/mlnx_uio/include/bitops.h
create mode 100644 drivers/net/mlnx_uio/include/dcbnl.h
create mode 100644 drivers/net/mlnx_uio/include/etherdevice.h
create mode 100644 drivers/net/mlnx_uio/include/ib_mad.h
create mode 100644 drivers/net/mlnx_uio/include/ib_smi.h
create mode 100644 drivers/net/mlnx_uio/include/ib_verbs.h
create mode 100644 drivers/net/mlnx_uio/include/inline_functions.h
create mode 100644 drivers/net/mlnx_uio/include/kcompat.h
create mode 100644 drivers/net/mlnx_uio/include/kmod.h
create mode 100644 drivers/net/mlnx_uio/include/list.h
create mode 100644 drivers/net/mlnx_uio/include/log2.h
create mode 100644 drivers/net/mlnx_uio/include/mlx4_dpdk.h
create mode 100644 drivers/net/mlnx_uio/include/mlx4_uio.h
create mode 100644 drivers/net/mlnx_uio/include/mlx4_uio_helper.h
create mode 100644 drivers/net/mlnx_uio/include/module.h
create mode 100644 drivers/net/mlnx_uio/include/netdev_features.h
create mode 100644 drivers/net/mlnx_uio/include/post_kmod.h
create mode 100644 drivers/net/mlnx_uio/include/radix-tree.h
create mode 100644 drivers/net/mlnx_uio/include/rbtree.h
create mode 100644 drivers/net/mlnx_uio/include/rbtree_augmented.h
create mode 100644 drivers/net/mlnx_uio/kernel/LICENSE
create mode 100644 drivers/net/mlnx_uio/kernel/bitmap.c
create mode 100644 drivers/net/mlnx_uio/kernel/kcompat.c
create mode 100644 drivers/net/mlnx_uio/kernel/radix-tree.c
create mode 100644 drivers/net/mlnx_uio/kernel/rbtree.c
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx4/cmd.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx4/cq.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx4/device.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx4/doorbell.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx4/driver.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx4/qp.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx4/srq.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/cmd.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/cq.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/device.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/doorbell.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/driver.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/flow_table.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/mlx5_ifc.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/qp.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/srq.h
create mode 100644 drivers/net/mlnx_uio/mlnx/include/mlx5/vport.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/Kconfig
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/Makefile
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/alloc.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/catas.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/cmd.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/cq.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_clock.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_cq.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_dcb_nl.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_ethtool.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_main.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_netdev.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_port.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_port.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_resources.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_rx.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_rx_uio.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_selftest.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_sysfs.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_tx.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/en_tx_uio.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/eq.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/fw.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/fw.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/fw_qos.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/fw_qos.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/icm.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/icm.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/intf.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/main.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/main.c.orig
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/mcg.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/mlx4.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/mlx4_en.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/mlx4_stats.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/mr.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/pd.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/port.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/profile.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/qp.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/reset.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/resource_tracker.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/sense.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx4/srq.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/Kconfig
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/Makefile
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/alloc.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/cmd.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/cq.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/debugfs.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/en.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/en_debugfs.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/en_ethtool.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/en_flow_table.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/en_main.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/en_rx.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/en_tx.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/en_txrx.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/eq.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/flow_table.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/fw.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/health.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/mad.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/main.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/mcg.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/mlx5_core.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/mr.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/pagealloc.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/params.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/pd.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/port.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/qp.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/sriov.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/srq.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/transobj.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/transobj.h
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/uar.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/vport.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/wq.c
create mode 100644 drivers/net/mlnx_uio/mlnx/mlx5/core/wq.h
create mode 100644 drivers/net/mlnx_uio/mlx4_en_special.h
create mode 100644 drivers/net/mlnx_uio/mlx4_uio.c
create mode 100755 drivers/net/mlnx_uio/prepare.py
create mode 100644 drivers/net/mlnx_uio/rte_pmd_mlnx_uio_version.map
diff --git a/config/common_linuxapp b/config/common_linuxapp
index f5646e0..e20d049 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -106,6 +106,12 @@ CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n
CONFIG_RTE_EAL_IGB_UIO=y
CONFIG_RTE_EAL_VFIO=y
+#persistent memory
+CONFIG_RTE_EAL_PERSISTENT_MEM=y
+CONFIG_RTE_EAL_PERSISTENT_MEM_UNIT=2097152
+CONFIG_RTE_EAL_PERSISTENT_MEM_COUNT=256
+CONFIG_RTE_LIBRTE_PERSISTENT=y
+
#
# Special configurations in PCI Config Space for high performance
#
@@ -212,7 +218,11 @@ CONFIG_RTE_LIBRTE_MLX4_MAX_INLINE=0
CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE=8
CONFIG_RTE_LIBRTE_MLX4_SOFT_COUNTERS=1
+# Compile UIO based Mellanox ConnectX-3 (MLX4) PMD
#
+CONFIG_RTE_LIBRTE_MLNX_UIO_PMD=y
+CONFIG_RTE_LIBRTE_MLNX_UIO_DEBUG=y
+
# Compile burst-oriented Chelsio Terminator 10GbE/40GbE (CXGBE) PMD
#
CONFIG_RTE_LIBRTE_CXGBE_PMD=y
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 644cacb..5f624d8 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -40,6 +40,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_FM10K_PMD) += fm10k
DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e
DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
+DIRS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx_uio
DIRS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += null
DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += pcap
DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += ring
diff --git a/drivers/net/mlnx_uio/.gitignore b/drivers/net/mlnx_uio/.gitignore
new file mode 100644
index 0000000..b2a14c8
--- /dev/null
+++ b/drivers/net/mlnx_uio/.gitignore
@@ -0,0 +1 @@
+driver_sources.mk
\ No newline at end of file
diff --git a/drivers/net/mlnx_uio/Makefile b/drivers/net/mlnx_uio/Makefile
new file mode 100644
index 0000000..8b5dd26
--- /dev/null
+++ b/drivers/net/mlnx_uio/Makefile
@@ -0,0 +1,139 @@
+# BSD LICENSE
+#
+# Copyright 2012-2015 6WIND S.A.
+# Copyright 2012 Mellanox.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of 6WIND S.A. nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# Library name.
+LIB = librte_pmd_mlnx_uio.a
+
+#External driver sources
+#-include $(SRCDIR)/driver_sources.mk
+
+MLNX_SRC = $(wildcard $(SRCDIR)/mlnx/*.c) $(wildcard $(SRCDIR)/mlnx/*/*.c) $(wildcard $(SRCDIR)/mlnx/*/*/*.c)
+
+$(SRCDIR)/driver_sources.mk: $(MLNX_SRC)
+ echo $(MLNX_SRC)
+ bash -c "cd $(SRCDIR); python3 prepare.py ."
+ bash -c "cd $(SRCDIR); python3 convert.py mlnx"
+
+VPATH+= $(SRCDIR)/kernel
+CFLAGS+= -I$(SRCDIR)
+CFLAGS+= -I$(SRCDIR)/include
+CFLAGS+= -I$(SRCDIR)/mlnx/include
+
+
+# Sources.
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += rbtree.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += radix-tree.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += bitmap.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += kcompat.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlx4_uio.c
+# mlx4_en sources
+#SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/en_tx.c
+#SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/en_rx.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/en_tx_uio.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/en_rx_uio.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/en_port.c
+#SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/en_netdev.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/en_main.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/en_selftest.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/en_sysfs.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/en_ethtool.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/en_dcb_nl.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/en_clock.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/en_cq.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/en_resources.c
+
+# mlx4_core sources
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/main.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/port.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/qp.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/srq.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/eq.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/cq.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/sense.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/profile.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/pd.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/mr.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/fw_qos.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/catas.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/resource_tracker.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/intf.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/alloc.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/mcg.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/reset.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/icm.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/fw.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += mlnx/mlx4/cmd.c
+
+# Dependencies.
+DEPDIRS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += lib/librte_persistent
+DEPDIRS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += lib/librte_ether
+DEPDIRS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += lib/librte_mempool
+DEPDIRS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += lib/librte_ring
+DEPDIRS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += lib/librte_malloc
+
+# Basic CFLAGS.
+CFLAGS += -O3
+CFLAGS += -std=gnu99 -Wall -Wextra
+CFLAGS += -g
+CFLAGS += -I.
+CFLAGS += -D_XOPEN_SOURCE=600
+#CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -Werror=implicit-function-declaration
+
+# A few warnings cannot be avoided in external headers.
+CFLAGS += -Wno-error=cast-qual
+
+EXPORT_MAP := rte_pmd_mlnx_uio_version.map
+LIBABIVER := 1
+
+# DEBUG which is usually provided on the command-line may enable
+# CONFIG_RTE_LIBRTE_MLNX_UIO_DEBUG.
+ifeq ($(DEBUG),1)
+CONFIG_RTE_LIBRTE_MLNX_UIO_DEBUG := y
+endif
+
+# User-defined CFLAGS.
+ifeq ($(CONFIG_RTE_LIBRTE_MLNX_UIO_DEBUG),y)
+CFLAGS += -UNDEBUG #-DPEDANTIC -pedantic
+else
+CFLAGS += -DNDEBUG -UPEDANTIC
+endif
+
+include $(RTE_SDK)/mk/rte.lib.mk
+
+# Generate and clean-up MLNX_autoconf.h.
+
+export CC CFLAGS CPPFLAGS EXTRA_CFLAGS EXTRA_CPPFLAGS
+export AUTO_CONFIG_CFLAGS = -Wno-error
diff --git a/drivers/net/mlnx_uio/convert.py b/drivers/net/mlnx_uio/convert.py
new file mode 100755
index 0000000..26d58ec
--- /dev/null
+++ b/drivers/net/mlnx_uio/convert.py
@@ -0,0 +1,50 @@
+#!/usr/bin/env python3
+import re
+import os
+import os.path
+import sys
+
+source_ext = ['.c','.cc','.cpp','.cxx']
+header_ext = ['.h','.hpp','.hh','.hxx']
+
+root_dir = sys.argv[1]
+
+for curdir, subdirs, files in os.walk(root_dir):
+# if( not ("source_list" in files)):
+# continue
+ for source in files:
+ purename, extension = os.path.splitext(source)
+ if (not extension in source_ext) and (not extension in header_ext):
+ continue
+
+ with open (os.path.join(curdir, source), 'r') as source_file:
+ source_string= source_file.readlines() #map(lambda s: s.rstrip('\n'), source_file.readlines())
+ if '#define K_CONVERTED\n' in source_string:
+ continue
+
+ source_string.insert(0, '#include \"kmod.h\"\n')
+ #if '#include <linux/skbuff.h>\n' in source_string:
+ # source_string.insert(0, '#define K_SKBUFF\n')
+ source_string.insert(0, '#endif\n')
+ source_string.insert(0, '#define K_CONVERTED\n')
+ source_string.insert(0, '#ifndef K_CONVERTED\n')
+ out_lines = []
+
+ with open (os.path.join(curdir, source), 'w') as source_file:
+ rx_bracketed_include = re.compile(r'^.*(#include\s<[^>]+>).*$')
+ for line in source_string:
+ if not rx_bracketed_include.search(line):
+ out_lines.append(line)
+
+# rx_quoted_include = re.compile(r'^.*(#include\s*"[^"]+")$')
+# last_idx = 0
+# for idx, line in enumerate(out_lines):
+# if rx_quoted_include.search(line):
+# last_idx = idx
+
+
+ for line in out_lines:
+ source_file.write(line)
+
+ if extension in header_ext:
+ source_file.write('\n#include "post_kmod.h"\n')
diff --git a/drivers/net/mlnx_uio/include/autoconf.h b/drivers/net/mlnx_uio/include/autoconf.h
new file mode 100644
index 0000000..f75cd87
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/autoconf.h
@@ -0,0 +1,10 @@
+#undef CONFIG_MLX4_CORE
+#define CONFIG_MLX4_CORE 1
+#undef CONFIG_MLX4_EN
+#define CONFIG_MLX4_EN 1
+#undef CONFIG_MLX4_EN_DCB
+#define CONFIG_MLX4_EN_DCB 1
+#undef CONFIG_MLX5_CORE
+#define CONFIG_MLX5_CORE 1
+
+
diff --git a/drivers/net/mlnx_uio/include/bitmap.h b/drivers/net/mlnx_uio/include/bitmap.h
new file mode 100644
index 0000000..80d5c5d
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/bitmap.h
@@ -0,0 +1,314 @@
+#ifndef __LINUX_BITMAP_H
+#define __LINUX_BITMAP_H
+
+#ifndef __ASSEMBLY__
+
+#include "kmod.h"
+
+#include "bitops.h"
+/*
+ * bitmaps provide bit arrays that consume one or more unsigned
+ * longs. The bitmap interface and available operations are listed
+ * here, in bitmap.h
+ *
+ * Function implementations generic to all architectures are in
+ * lib/bitmap.c. Function implementations that are architecture
+ * specific are in various include/asm-<arch>/bitops.h headers
+ * and other arch/<arch> specific files.
+ *
+ * See lib/bitmap.c for more details.
+ */
+
+/*
+ * The available bitmap operations and their rough meaning in the
+ * case that the bitmap is a single unsigned long are thus:
+ *
+ * Note that nbits should always be a compile-time evaluable constant.
+ * Otherwise many inlines will generate horrible code.
+ *
+ * bitmap_zero(dst, nbits) *dst = 0UL
+ * bitmap_fill(dst, nbits) *dst = ~0UL
+ * bitmap_copy(dst, src, nbits) *dst = *src
+ * bitmap_and(dst, src1, src2, nbits) *dst = *src1 & *src2
+ * bitmap_or(dst, src1, src2, nbits) *dst = *src1 | *src2
+ * bitmap_xor(dst, src1, src2, nbits) *dst = *src1 ^ *src2
+ * bitmap_andnot(dst, src1, src2, nbits) *dst = *src1 & ~(*src2)
+ * bitmap_complement(dst, src, nbits) *dst = ~(*src)
+ * bitmap_equal(src1, src2, nbits) Are *src1 and *src2 equal?
+ * bitmap_intersects(src1, src2, nbits) Do *src1 and *src2 overlap?
+ * bitmap_subset(src1, src2, nbits) Is *src1 a subset of *src2?
+ * bitmap_empty(src, nbits) Are all bits zero in *src?
+ * bitmap_full(src, nbits) Are all bits set in *src?
+ * bitmap_weight(src, nbits) Hamming Weight: number of set bits
+ * bitmap_set(dst, pos, nbits) Set specified bit area
+ * bitmap_clear(dst, pos, nbits) Clear specified bit area
+ * bitmap_find_next_zero_area(buf, len, pos, n, mask) Find bit free area
+ * bitmap_shift_right(dst, src, n, nbits) *dst = *src >> n
+ * bitmap_shift_left(dst, src, n, nbits) *dst = *src << n
+ * bitmap_remap(dst, src, old, new, nbits) *dst = map(old, new)(src)
+ * bitmap_bitremap(oldbit, old, new, nbits) newbit = map(old, new)(oldbit)
+ * bitmap_onto(dst, orig, relmap, nbits) *dst = orig relative to relmap
+ * bitmap_fold(dst, orig, sz, nbits) dst bits = orig bits mod sz
+ * bitmap_scnprintf(buf, len, src, nbits) Print bitmap src to buf
+ * bitmap_parse(buf, buflen, dst, nbits) Parse bitmap dst from kernel buf
+ * bitmap_parse_user(ubuf, ulen, dst, nbits) Parse bitmap dst from user buf
+ * bitmap_scnlistprintf(buf, len, src, nbits) Print bitmap src as list to buf
+ * bitmap_parselist(buf, dst, nbits) Parse bitmap dst from kernel buf
+ * bitmap_parselist_user(buf, dst, nbits) Parse bitmap dst from user buf
+ * bitmap_find_free_region(bitmap, bits, order) Find and allocate bit region
+ * bitmap_release_region(bitmap, pos, order) Free specified bit region
+ * bitmap_allocate_region(bitmap, pos, order) Allocate specified bit region
+ */
+
+/*
+ * Also the following operations in asm/bitops.h apply to bitmaps.
+ *
+ * set_bit(bit, addr) *addr |= bit
+ * clear_bit(bit, addr) *addr &= ~bit
+ * change_bit(bit, addr) *addr ^= bit
+ * test_bit(bit, addr) Is bit set in *addr?
+ * test_and_set_bit(bit, addr) Set bit and return old value
+ * test_and_clear_bit(bit, addr) Clear bit and return old value
+ * test_and_change_bit(bit, addr) Change bit and return old value
+ * find_first_zero_bit(addr, nbits) Position first zero bit in *addr
+ * find_first_bit(addr, nbits) Position first set bit in *addr
+ * find_next_zero_bit(addr, nbits, bit) Position next zero bit in *addr >= bit
+ * find_next_bit(addr, nbits, bit) Position next set bit in *addr >= bit
+ */
+
+/*
+ * The DECLARE_BITMAP(name,bits) macro, in linux/types.h, can be used
+ * to declare an array named 'name' of just enough unsigned longs to
+ * contain all bit positions from 0 to 'bits' - 1.
+ */
+#define DECLARE_BITMAP(name,bits) \
+ unsigned long name[BITS_TO_LONGS(bits)]
+
+/*
+ * lib/bitmap.c provides these functions:
+ */
+
+extern int __bitmap_empty(const unsigned long *bitmap, int bits);
+extern int __bitmap_full(const unsigned long *bitmap, int bits);
+extern int __bitmap_equal(const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits);
+extern void __bitmap_complement(unsigned long *dst, const unsigned long *src,
+ int bits);
+extern void __bitmap_shift_right(unsigned long *dst,
+ const unsigned long *src, int shift, int bits);
+extern void __bitmap_shift_left(unsigned long *dst,
+ const unsigned long *src, int shift, int bits);
+extern int __bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits);
+extern void __bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits);
+extern void __bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits);
+extern int __bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits);
+extern int __bitmap_intersects(const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits);
+extern int __bitmap_subset(const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits);
+extern int __bitmap_weight(const unsigned long *bitmap, int bits);
+
+extern void bitmap_set(unsigned long *map, int i, int len);
+extern void bitmap_clear(unsigned long *map, int start, int nr);
+extern unsigned long bitmap_find_next_zero_area(unsigned long *map,
+ unsigned long size,
+ unsigned long start,
+ unsigned int nr,
+ unsigned long align_mask);
+
+extern int bitmap_scnprintf(char *buf, unsigned int len,
+ const unsigned long *src, int nbits);
+extern int __bitmap_parse(const char *buf, unsigned int buflen, int is_user,
+ unsigned long *dst, int nbits);
+extern int bitmap_parse_user(const char __user *ubuf, unsigned int ulen,
+ unsigned long *dst, int nbits);
+extern int bitmap_scnlistprintf(char *buf, unsigned int len,
+ const unsigned long *src, int nbits);
+extern int bitmap_parselist(const char *buf, unsigned long *maskp,
+ int nmaskbits);
+extern int bitmap_parselist_user(const char __user *ubuf, unsigned int ulen,
+ unsigned long *dst, int nbits);
+extern void bitmap_remap(unsigned long *dst, const unsigned long *src,
+ const unsigned long *old, const unsigned long *new, int bits);
+extern int bitmap_bitremap(int oldbit,
+ const unsigned long *old, const unsigned long *new, int bits);
+extern void bitmap_onto(unsigned long *dst, const unsigned long *orig,
+ const unsigned long *relmap, int bits);
+extern void bitmap_fold(unsigned long *dst, const unsigned long *orig,
+ int sz, int bits);
+extern int bitmap_find_free_region(unsigned long *bitmap, int bits, int order);
+extern void bitmap_release_region(unsigned long *bitmap, int pos, int order);
+extern int bitmap_allocate_region(unsigned long *bitmap, int pos, int order);
+extern void bitmap_copy_le(void *dst, const unsigned long *src, int nbits);
+extern int bitmap_ord_to_pos(const unsigned long *bitmap, int n, int bits);
+
+#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) % BITS_PER_LONG))
+#define BITMAP_LAST_WORD_MASK(nbits) \
+( \
+ ((nbits) % BITS_PER_LONG) ? \
+ (1UL<<((nbits) % BITS_PER_LONG))-1 : ~0UL \
+)
+
+#define small_const_nbits(nbits) \
+ (__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG)
+
+#ifndef BITS_TO_LONGS
+#define BITS_TO_LONGS(bits) (((bits)+BITS_PER_LONG-1)/BITS_PER_LONG)
+#endif
+static inline void bitmap_zero(unsigned long *dst, int nbits)
+{
+ if (small_const_nbits(nbits))
+ *dst = 0UL;
+ else {
+ int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
+ memset(dst, 0, len);
+ }
+}
+
+static inline void bitmap_fill(unsigned long *dst, int nbits)
+{
+ size_t nlongs = BITS_TO_LONGS(nbits);
+ if (!small_const_nbits(nbits)) {
+ int len = (nlongs - 1) * sizeof(unsigned long);
+ memset(dst, 0xff, len);
+ }
+ dst[nlongs - 1] = BITMAP_LAST_WORD_MASK(nbits);
+}
+
+static inline void bitmap_copy(unsigned long *dst, const unsigned long *src,
+ int nbits)
+{
+ if (small_const_nbits(nbits))
+ *dst = *src;
+ else {
+ int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
+ memcpy(dst, src, len);
+ }
+}
+
+static inline int bitmap_and(unsigned long *dst, const unsigned long *src1,
+ const unsigned long *src2, int nbits)
+{
+ if (small_const_nbits(nbits))
+ return (*dst = *src1 & *src2) != 0;
+ return __bitmap_and(dst, src1, src2, nbits);
+}
+
+static inline void bitmap_or(unsigned long *dst, const unsigned long *src1,
+ const unsigned long *src2, int nbits)
+{
+ if (small_const_nbits(nbits))
+ *dst = *src1 | *src2;
+ else
+ __bitmap_or(dst, src1, src2, nbits);
+}
+
+static inline void bitmap_xor(unsigned long *dst, const unsigned long *src1,
+ const unsigned long *src2, int nbits)
+{
+ if (small_const_nbits(nbits))
+ *dst = *src1 ^ *src2;
+ else
+ __bitmap_xor(dst, src1, src2, nbits);
+}
+
+static inline int bitmap_andnot(unsigned long *dst, const unsigned long *src1,
+ const unsigned long *src2, int nbits)
+{
+ if (small_const_nbits(nbits))
+ return (*dst = *src1 & ~(*src2)) != 0;
+ return __bitmap_andnot(dst, src1, src2, nbits);
+}
+
+static inline void bitmap_complement(unsigned long *dst, const unsigned long *src,
+ int nbits)
+{
+ if (small_const_nbits(nbits))
+ *dst = ~(*src) & BITMAP_LAST_WORD_MASK(nbits);
+ else
+ __bitmap_complement(dst, src, nbits);
+}
+
+static inline int bitmap_equal(const unsigned long *src1,
+ const unsigned long *src2, int nbits)
+{
+ if (small_const_nbits(nbits))
+ return ! ((*src1 ^ *src2) & BITMAP_LAST_WORD_MASK(nbits));
+ else
+ return __bitmap_equal(src1, src2, nbits);
+}
+
+static inline int bitmap_intersects(const unsigned long *src1,
+ const unsigned long *src2, int nbits)
+{
+ if (small_const_nbits(nbits))
+ return ((*src1 & *src2) & BITMAP_LAST_WORD_MASK(nbits)) != 0;
+ else
+ return __bitmap_intersects(src1, src2, nbits);
+}
+
+static inline int bitmap_subset(const unsigned long *src1,
+ const unsigned long *src2, int nbits)
+{
+ if (small_const_nbits(nbits))
+ return ! ((*src1 & ~(*src2)) & BITMAP_LAST_WORD_MASK(nbits));
+ else
+ return __bitmap_subset(src1, src2, nbits);
+}
+
+static inline int bitmap_empty(const unsigned long *src, int nbits)
+{
+ if (small_const_nbits(nbits))
+ return ! (*src & BITMAP_LAST_WORD_MASK(nbits));
+ else
+ return __bitmap_empty(src, nbits);
+}
+
+static inline int bitmap_full(const unsigned long *src, int nbits)
+{
+ if (small_const_nbits(nbits))
+ return ! (~(*src) & BITMAP_LAST_WORD_MASK(nbits));
+ else
+ return __bitmap_full(src, nbits);
+}
+
+static inline int bitmap_weight(const unsigned long *src, int nbits)
+{
+ if (small_const_nbits(nbits))
+ return hweight_long(*src & BITMAP_LAST_WORD_MASK(nbits));
+ return __bitmap_weight(src, nbits);
+}
+
+static inline void bitmap_shift_right(unsigned long *dst,
+ const unsigned long *src, int n, int nbits)
+{
+ if (small_const_nbits(nbits))
+ *dst = *src >> n;
+ else
+ __bitmap_shift_right(dst, src, n, nbits);
+}
+
+static inline void bitmap_shift_left(unsigned long *dst,
+ const unsigned long *src, int n, int nbits)
+{
+ if (small_const_nbits(nbits))
+ *dst = (*src << n) & BITMAP_LAST_WORD_MASK(nbits);
+ else
+ __bitmap_shift_left(dst, src, n, nbits);
+}
+
+#if 0
+static inline int bitmap_parse(const char *buf, unsigned int buflen,
+ unsigned long *maskp, int nmaskbits)
+{
+ return __bitmap_parse(buf, buflen, 0, maskp, nmaskbits);
+}
+#endif
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* __LINUX_BITMAP_H */
diff --git a/drivers/net/mlnx_uio/include/bitops.h b/drivers/net/mlnx_uio/include/bitops.h
new file mode 100644
index 0000000..3534e41
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/bitops.h
@@ -0,0 +1,558 @@
+#ifndef _LINUX_BITOPS_H
+#define _LINUX_BITOPS_H
+
+#include "kmod.h"
+
+#define BIT(nr) (1UL << (nr))
+#define BIT_ULL(nr) (1ULL << (nr))
+#define BIT_MASK(nr) (1UL << ((nr) % BITS_PER_LONG))
+#define BIT_WORD(nr) ((nr) / BITS_PER_LONG)
+#define BIT_ULL_MASK(nr) (1ULL << ((nr) % BITS_PER_LONG_LONG))
+#define BIT_ULL_WORD(nr) ((nr) / BITS_PER_LONG_LONG)
+#define BITS_PER_BYTE 8
+#define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))
+
+static __always_inline unsigned long __ffs(unsigned long word)
+{
+ int num = 0;
+
+#if BITS_PER_LONG == 64
+ if ((word & 0xffffffff) == 0) {
+ num += 32;
+ word >>= 32;
+ }
+#endif
+ if ((word & 0xffff) == 0) {
+ num += 16;
+ word >>= 16;
+ }
+ if ((word & 0xff) == 0) {
+ num += 8;
+ word >>= 8;
+ }
+ if ((word & 0xf) == 0) {
+ num += 4;
+ word >>= 4;
+ }
+ if ((word & 0x3) == 0) {
+ num += 2;
+ word >>= 2;
+ }
+ if ((word & 0x1) == 0)
+ num += 1;
+ return num;
+}
+
+
+static __always_inline int fls(int x)
+{
+ int r = 32;
+
+ if (!x)
+ return 0;
+ if (!(x & 0xffff0000u)) {
+ x <<= 16;
+ r -= 16;
+ }
+ if (!(x & 0xff000000u)) {
+ x <<= 8;
+ r -= 8;
+ }
+ if (!(x & 0xf0000000u)) {
+ x <<= 4;
+ r -= 4;
+ }
+ if (!(x & 0xc0000000u)) {
+ x <<= 2;
+ r -= 2;
+ }
+ if (!(x & 0x80000000u)) {
+ x <<= 1;
+ r -= 1;
+ }
+ return r;
+}
+
+static __always_inline unsigned long __fls(unsigned long word)
+{
+ int num = BITS_PER_LONG - 1;
+
+#if BITS_PER_LONG == 64
+ if (!(word & (~0ul << 32))) {
+ num -= 32;
+ word <<= 32;
+ }
+#endif
+ if (!(word & (~0ul << (BITS_PER_LONG-16)))) {
+ num -= 16;
+ word <<= 16;
+ }
+ if (!(word & (~0ul << (BITS_PER_LONG-8)))) {
+ num -= 8;
+ word <<= 8;
+ }
+ if (!(word & (~0ul << (BITS_PER_LONG-4)))) {
+ num -= 4;
+ word <<= 4;
+ }
+ if (!(word & (~0ul << (BITS_PER_LONG-2)))) {
+ num -= 2;
+ word <<= 2;
+ }
+ if (!(word & (~0ul << (BITS_PER_LONG-1))))
+ num -= 1;
+ return num;
+}
+
+
+
+#if BITS_PER_LONG == 32
+static __always_inline int fls64(u64 x)
+{
+ u32 h = x >> 32;
+ if (h)
+ return fls(h) + 32;
+ return fls(x);
+}
+#elif BITS_PER_LONG == 64
+static __always_inline int fls64(u64 x)
+{
+ if (x == 0)
+ return 0;
+ return __fls(x) + 1;
+}
+#else
+#error BITS_PER_LONG not 32 or 64
+#endif
+
+
+
+#define for_each_set_bit(bit, addr, size) \
+ for ((bit) = find_first_bit((addr), (size)); \
+ (bit) < (size); \
+ (bit) = find_next_bit((addr), (size), (bit) + 1))
+
+/* same as for_each_set_bit() but use bit as value to start with */
+#define for_each_set_bit_from(bit, addr, size) \
+ for ((bit) = find_next_bit((addr), (size), (bit)); \
+ (bit) < (size); \
+ (bit) = find_next_bit((addr), (size), (bit) + 1))
+
+#define for_each_clear_bit(bit, addr, size) \
+ for ((bit) = find_first_zero_bit((addr), (size)); \
+ (bit) < (size); \
+ (bit) = find_next_zero_bit((addr), (size), (bit) + 1))
+
+/* same as for_each_clear_bit() but use bit as value to start with */
+#define for_each_clear_bit_from(bit, addr, size) \
+ for ((bit) = find_next_zero_bit((addr), (size), (bit)); \
+ (bit) < (size); \
+ (bit) = find_next_zero_bit((addr), (size), (bit) + 1))
+
+static __inline__ int get_bitmask_order(unsigned int count)
+{
+ int order;
+
+ order = fls(count);
+ return order; /* We could be slightly more clever with -1 here... */
+}
+
+static __inline__ int get_count_order(unsigned int count)
+{
+ int order;
+
+ order = fls(count) - 1;
+ if (count & (count - 1))
+ order++;
+ return order;
+}
+
+#define __const_hweight8(w) \
+ ((unsigned int) \
+ ((!!((w) & (1ULL << 0))) + \
+ (!!((w) & (1ULL << 1))) + \
+ (!!((w) & (1ULL << 2))) + \
+ (!!((w) & (1ULL << 3))) + \
+ (!!((w) & (1ULL << 4))) + \
+ (!!((w) & (1ULL << 5))) + \
+ (!!((w) & (1ULL << 6))) + \
+ (!!((w) & (1ULL << 7)))))
+
+#define __const_hweight16(w) (__const_hweight8(w) + __const_hweight8((w) >> 8 ))
+#define __const_hweight32(w) (__const_hweight16(w) + __const_hweight16((w) >> 16))
+#define __const_hweight64(w) (__const_hweight32(w) + __const_hweight32((w) >> 32))
+
+/*
+ * Generic interface.
+ */
+#define hweight8(w) (__const_hweight8(w))
+#define hweight16(w) (__const_hweight16(w))
+#define hweight32(w) (__const_hweight32(w))
+#define hweight64(w) (__const_hweight64(w))
+
+static inline unsigned long hweight_long(unsigned long w)
+{
+ return sizeof(w) == 4 ? hweight32(w) : hweight64(w);
+}
+
+/**
+ * rol64 - rotate a 64-bit value left
+ * @word: value to rotate
+ * @shift: bits to roll
+ */
+static inline u64 rol64(u64 word, unsigned int shift)
+{
+ return (word << shift) | (word >> (64 - shift));
+}
+
+/**
+ * ror64 - rotate a 64-bit value right
+ * @word: value to rotate
+ * @shift: bits to roll
+ */
+static inline u64 ror64(u64 word, unsigned int shift)
+{
+ return (word >> shift) | (word << (64 - shift));
+}
+
+/**
+ * rol32 - rotate a 32-bit value left
+ * @word: value to rotate
+ * @shift: bits to roll
+ */
+static inline u32 rol32(u32 word, unsigned int shift)
+{
+ return (word << shift) | (word >> (32 - shift));
+}
+
+/**
+ * ror32 - rotate a 32-bit value right
+ * @word: value to rotate
+ * @shift: bits to roll
+ */
+static inline u32 ror32(u32 word, unsigned int shift)
+{
+ return (word >> shift) | (word << (32 - shift));
+}
+
+/**
+ * rol16 - rotate a 16-bit value left
+ * @word: value to rotate
+ * @shift: bits to roll
+ */
+static inline u16 rol16(u16 word, unsigned int shift)
+{
+ return (word << shift) | (word >> (16 - shift));
+}
+
+/**
+ * ror16 - rotate a 16-bit value right
+ * @word: value to rotate
+ * @shift: bits to roll
+ */
+static inline u16 ror16(u16 word, unsigned int shift)
+{
+ return (word >> shift) | (word << (16 - shift));
+}
+
+/**
+ * rol8 - rotate an 8-bit value left
+ * @word: value to rotate
+ * @shift: bits to roll
+ */
+static inline u8 rol8(u8 word, unsigned int shift)
+{
+ return (word << shift) | (word >> (8 - shift));
+}
+
+/**
+ * ror8 - rotate an 8-bit value right
+ * @word: value to rotate
+ * @shift: bits to roll
+ */
+static inline u8 ror8(u8 word, unsigned int shift)
+{
+ return (word >> shift) | (word << (8 - shift));
+}
+
+/**
+ * sign_extend32 - sign extend a 32-bit value using specified bit as sign-bit
+ * @value: value to sign extend
+ * @index: 0 based bit index (0<=index<32) to sign bit
+ */
+static inline s32 sign_extend32(u32 value, int index)
+{
+ u8 shift = 31 - index;
+ return (s32)(value << shift) >> shift;
+}
+
+static inline unsigned fls_long(unsigned long l)
+{
+ if (sizeof(l) == 4)
+ return fls(l);
+ return fls64(l);
+}
+
+/**
+ * __ffs64 - find first set bit in a 64 bit word
+ * @word: The 64 bit word
+ *
+ * On 64 bit arches this is a synonym for __ffs
+ * The result is not defined if no bits are set, so check that @word
+ * is non-zero before calling this.
+ */
+static inline unsigned long __ffs64(u64 word)
+{
+#if BITS_PER_LONG == 32
+ if (((u32)word) == 0UL)
+ return __ffs((u32)(word >> 32)) + 32;
+#elif BITS_PER_LONG != 64
+#error BITS_PER_LONG not 32 or 64
+#endif
+ return __ffs((unsigned long)word);
+}
+
+
+#define BITOP_WORD(nr) ((nr) / BITS_PER_LONG)
+
+/*
+ * Find the next set bit in a memory region.
+ */
+static inline unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
+ unsigned long offset)
+{
+ const unsigned long *p = addr + BITOP_WORD(offset);
+ unsigned long result = offset & ~(BITS_PER_LONG-1);
+ unsigned long tmp;
+
+ if (offset >= size)
+ return size;
+ size -= result;
+ offset %= BITS_PER_LONG;
+ if (offset) {
+ tmp = *(p++);
+ tmp &= (~0UL << offset);
+ if (size < BITS_PER_LONG)
+ goto found_first;
+ if (tmp)
+ goto found_middle;
+ size -= BITS_PER_LONG;
+ result += BITS_PER_LONG;
+ }
+ while (size & ~(BITS_PER_LONG-1)) {
+ if ((tmp = *(p++)))
+ goto found_middle;
+ result += BITS_PER_LONG;
+ size -= BITS_PER_LONG;
+ }
+ if (!size)
+ return result;
+ tmp = *p;
+
+ found_first:
+ tmp &= (~0UL >> (BITS_PER_LONG - size));
+ if (tmp == 0UL) /* Are any bits set? */
+ return result + size; /* Nope. */
+ found_middle:
+ return result + __ffs(tmp);
+}
+EXPORT_SYMBOL(find_next_bit);
+
+/*
+ * ffz - find first zero in word.
+ * @word: The word to search
+ *
+ * Undefined if no zero exists, so code should check against ~0UL first.
+ */
+#define ffz(x) __ffs(~(x))
+
+/*
+ * This implementation of find_{first,next}_zero_bit was stolen from
+ * Linus' asm-alpha/bitops.h.
+ */
+static inline unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
+ unsigned long offset)
+{
+ const unsigned long *p = addr + BITOP_WORD(offset);
+ unsigned long result = offset & ~(BITS_PER_LONG-1);
+ unsigned long tmp;
+
+ if (offset >= size)
+ return size;
+ size -= result;
+ offset %= BITS_PER_LONG;
+ if (offset) {
+ tmp = *(p++);
+ tmp |= ~0UL >> (BITS_PER_LONG - offset);
+ if (size < BITS_PER_LONG)
+ goto found_first;
+ if (~tmp)
+ goto found_middle;
+ size -= BITS_PER_LONG;
+ result += BITS_PER_LONG;
+ }
+ while (size & ~(BITS_PER_LONG-1)) {
+ if (~(tmp = *(p++)))
+ goto found_middle;
+ result += BITS_PER_LONG;
+ size -= BITS_PER_LONG;
+ }
+ if (!size)
+ return result;
+ tmp = *p;
+
+ found_first:
+ tmp |= ~0UL << size;
+ if (tmp == ~0UL) /* Are any bits zero? */
+ return result + size; /* Nope. */
+ found_middle:
+ return result + ffz(tmp);
+}
+EXPORT_SYMBOL(find_next_zero_bit);
+
+/*
+ * Find the first set bit in a memory region.
+ */
+static inline unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
+{
+ const unsigned long *p = addr;
+ unsigned long result = 0;
+ unsigned long tmp;
+
+ while (size & ~(BITS_PER_LONG-1)) {
+ if ((tmp = *(p++)))
+ goto found;
+ result += BITS_PER_LONG;
+ size -= BITS_PER_LONG;
+ }
+ if (!size)
+ return result;
+
+ tmp = (*p) & (~0UL >> (BITS_PER_LONG - size));
+ if (tmp == 0UL) /* Are any bits set? */
+ return result + size; /* Nope. */
+ found:
+ return result + __ffs(tmp);
+}
+EXPORT_SYMBOL(find_first_bit);
+
+/*
+ * Find the first cleared bit in a memory region.
+ */
+static inline unsigned long find_first_zero_bit(const unsigned long *addr, unsigned long size)
+{
+ const unsigned long *p = addr;
+ unsigned long result = 0;
+ unsigned long tmp;
+
+ while (size & ~(BITS_PER_LONG-1)) {
+ if (~(tmp = *(p++)))
+ goto found;
+ result += BITS_PER_LONG;
+ size -= BITS_PER_LONG;
+ }
+ if (!size)
+ return result;
+
+ tmp = (*p) | (~0UL << size);
+ if (tmp == ~0UL) /* Are any bits zero? */
+ return result + size; /* Nope. */
+ found:
+ return result + ffz(tmp);
+}
+EXPORT_SYMBOL(find_first_zero_bit);
+
+/**
+ * test_bit - Determine whether a bit is set
+ * @nr: bit number to test
+ * @addr: Address to start counting from
+ */
+static inline int test_bit(int nr, const volatile unsigned long *addr)
+{
+ return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
+}
+
+static inline void set_bit(int nr, volatile unsigned long *addr)
+{
+ unsigned long mask = BIT_MASK(nr);
+ volatile unsigned long *p = addr + BIT_WORD(nr);
+
+ *p |= mask;
+}
+
+static inline void clear_bit(int nr, volatile unsigned long *addr)
+{
+ unsigned long mask = BIT_MASK(nr);
+ volatile unsigned long *p = addr + BIT_WORD(nr);
+
+ *p &= ~mask;
+}
+
+/**
+ * change_bit - Toggle a bit in memory
+ * @nr: the bit to change
+ * @addr: the address to start counting from
+ *
+ * Unlike the kernel's change_bit(), this version is non-atomic and may be reordered.
+ * If it's called on the same region of memory simultaneously, the effect
+ * may be that only one operation succeeds.
+ */
+static inline void change_bit(int nr, volatile unsigned long *addr)
+{
+ unsigned long mask = BIT_MASK(nr);
+ volatile unsigned long *p = addr + BIT_WORD(nr);
+
+ *p ^= mask;
+}
+
+/**
+ * test_and_set_bit - Set a bit and return its old value
+ * @nr: Bit to set
+ * @addr: Address to count from
+ *
+ * This operation is non-atomic and can be reordered.
+ * If two instances of this operation race, one can appear to succeed
+ * but actually fail. You must protect multiple accesses with a lock.
+ */
+static inline int test_and_set_bit(int nr, volatile unsigned long *addr)
+{
+ unsigned long mask = BIT_MASK(nr);
+ volatile unsigned long *p = addr + BIT_WORD(nr);
+ unsigned long old = *p;
+
+ *p = old | mask;
+ return (old & mask) != 0;
+}
+
+/**
+ * test_and_clear_bit - Clear a bit and return its old value
+ * @nr: Bit to clear
+ * @addr: Address to count from
+ *
+ * This operation is non-atomic and can be reordered.
+ * If two instances of this operation race, one can appear to succeed
+ * but actually fail. You must protect multiple accesses with a lock.
+ */
+static inline int test_and_clear_bit(int nr, volatile unsigned long *addr)
+{
+ unsigned long mask = BIT_MASK(nr);
+ volatile unsigned long *p = addr + BIT_WORD(nr);
+ unsigned long old = *p;
+
+ *p = old & ~mask;
+ return (old & mask) != 0;
+}
+
+/* WARNING: non atomic and it can be reordered! */
+static inline int test_and_change_bit(int nr,
+ volatile unsigned long *addr)
+{
+ unsigned long mask = BIT_MASK(nr);
+ volatile unsigned long *p = addr + BIT_WORD(nr);
+ unsigned long old = *p;
+
+ *p = old ^ mask;
+ return (old & mask) != 0;
+}
+
+#endif
diff --git a/drivers/net/mlnx_uio/include/dcbnl.h b/drivers/net/mlnx_uio/include/dcbnl.h
new file mode 100644
index 0000000..e5f190f
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/dcbnl.h
@@ -0,0 +1,751 @@
+ /*
+ * Copyright (c) 2008-2011, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ *
+ * Author: Lucy Liu <lucy.liu@intel.com>
+ */
+
+#ifndef __LINUX_DCBNL_H__
+#define __LINUX_DCBNL_H__
+
+#include "kmod.h"
+
+/* IEEE 802.1Qaz std supported values */
+#define IEEE_8021QAZ_MAX_TCS 8
+
+#define IEEE_8021QAZ_TSA_STRICT 0
+#define IEEE_8021QAZ_TSA_CB_SHAPER 1
+#define IEEE_8021QAZ_TSA_ETS 2
+#define IEEE_8021QAZ_TSA_VENDOR 255
+
+/* This structure contains the IEEE 802.1Qaz ETS managed object
+ *
+ * @willing: willing bit in ETS configuration TLV
+ * @ets_cap: indicates supported capacity of ets feature
+ * @cbs: credit based shaper ets algorithm supported
+ * @tc_tx_bw: tc tx bandwidth indexed by traffic class
+ * @tc_rx_bw: tc rx bandwidth indexed by traffic class
+ * @tc_tsa: TSA Assignment table, indexed by traffic class
+ * @prio_tc: priority assignment table mapping 8021Qp to traffic class
+ * @tc_reco_bw: recommended tc bandwidth indexed by traffic class for TLV
+ * @tc_reco_tsa: recommended TSA assignment indexed by traffic class for TLV
+ * @reco_prio_tc: recommended priority to traffic class mapping for TLV
+ *
+ * Recommended values are used to set fields in the ETS recommendation TLV
+ * with hardware offloaded LLDP.
+ *
+ * ----
+ * TSA Assignment 8 bit identifiers
+ * 0 strict priority
+ * 1 credit-based shaper
+ * 2 enhanced transmission selection
+ * 3-254 reserved
+ * 255 vendor specific
+ */
+struct ieee_ets {
+ __u8 willing;
+ __u8 ets_cap;
+ __u8 cbs;
+ __u8 tc_tx_bw[IEEE_8021QAZ_MAX_TCS];
+ __u8 tc_rx_bw[IEEE_8021QAZ_MAX_TCS];
+ __u8 tc_tsa[IEEE_8021QAZ_MAX_TCS];
+ __u8 prio_tc[IEEE_8021QAZ_MAX_TCS];
+ __u8 tc_reco_bw[IEEE_8021QAZ_MAX_TCS];
+ __u8 tc_reco_tsa[IEEE_8021QAZ_MAX_TCS];
+ __u8 reco_prio_tc[IEEE_8021QAZ_MAX_TCS];
+};
+
+/* This structure contains rate limit extension to the IEEE 802.1Qaz ETS
+ * managed object.
+ * Values are 64 bits long and specified in Kbps to enable usage over both
+ * slow and very fast networks.
+ *
+ * @tc_maxrate: maximal tc tx bandwidth indexed by traffic class
+ */
+struct ieee_maxrate {
+ __u64 tc_maxrate[IEEE_8021QAZ_MAX_TCS];
+};
+
+enum dcbnl_cndd_states {
+ DCB_CNDD_RESET = 0,
+ DCB_CNDD_EDGE,
+ DCB_CNDD_INTERIOR,
+ DCB_CNDD_INTERIOR_READY,
+};
+
+/* This structure contains the IEEE 802.1Qau QCN managed object.
+ *
+ *@rpg_enable: enable QCN RP
+ *@rppp_max_rps: maximum number of RPs allowed for this CNPV on this port
+ *@rpg_time_reset: time between rate increases if no CNMs received.
+ * given in u-seconds
+ *@rpg_byte_reset: transmitted data between rate increases if no CNMs received.
+ * given in Bytes
+ *@rpg_threshold: The number of times rpByteStage or rpTimeStage can count
+ * before RP rate control state machine advances states
+ *@rpg_max_rate: the maximum rate, in Mbits per second,
+ * at which an RP can transmit
+ *@rpg_ai_rate: The rate, in Mbits per second,
+ * used to increase rpTargetRate in the RPR_ACTIVE_INCREASE state
+ *@rpg_hai_rate: The rate, in Mbits per second,
+ * used to increase rpTargetRate in the RPR_HYPER_INCREASE state
+ *@rpg_gd: Upon CNM receive, flow rate is limited to (Fb/Gd)*CurrentRate.
+ * rpgGd is given as log2(Gd), where Gd may only be powers of 2
+ *@rpg_min_dec_fac: The minimum factor by which the current transmit rate
+ * can be changed by reception of a CNM.
+ * value is given as percentage (1-100)
+ *@rpg_min_rate: The minimum value, in bits per second, for rate to limit
+ *@cndd_state_machine: The state of the congestion notification domain
+ * defense state machine, as defined by IEEE 802.1Qau
+ * section 32.1.1. In the interior ready state,
+ * the QCN capable hardware may add CN-TAG TLV to the
+ * outgoing traffic, to specifically identify outgoing
+ * flows.
+ */
+
+struct ieee_qcn {
+ __u8 rpg_enable[IEEE_8021QAZ_MAX_TCS];
+ __u32 rppp_max_rps[IEEE_8021QAZ_MAX_TCS];
+ __u32 rpg_time_reset[IEEE_8021QAZ_MAX_TCS];
+ __u32 rpg_byte_reset[IEEE_8021QAZ_MAX_TCS];
+ __u32 rpg_threshold[IEEE_8021QAZ_MAX_TCS];
+ __u32 rpg_max_rate[IEEE_8021QAZ_MAX_TCS];
+ __u32 rpg_ai_rate[IEEE_8021QAZ_MAX_TCS];
+ __u32 rpg_hai_rate[IEEE_8021QAZ_MAX_TCS];
+ __u32 rpg_gd[IEEE_8021QAZ_MAX_TCS];
+ __u32 rpg_min_dec_fac[IEEE_8021QAZ_MAX_TCS];
+ __u32 rpg_min_rate[IEEE_8021QAZ_MAX_TCS];
+ __u32 cndd_state_machine[IEEE_8021QAZ_MAX_TCS];
+};
+
+/* This structure contains the IEEE 802.1Qau QCN statistics.
+ *
+ *@rppp_rp_centiseconds: the number of RP-centiseconds accumulated
+ * by RPs at this priority level on this Port
+ *@rppp_created_rps: number of active RPs(flows) that react to CNMs
+ */
+
+struct ieee_qcn_stats {
+ __u64 rppp_rp_centiseconds[IEEE_8021QAZ_MAX_TCS];
+ __u32 rppp_created_rps[IEEE_8021QAZ_MAX_TCS];
+};
+
+/* This structure contains the IEEE 802.1Qaz PFC managed object
+ *
+ * @pfc_cap: Indicates the number of traffic classes on the local device
+ * that may simultaneously have PFC enabled.
+ * @pfc_en: bitmap indicating pfc enabled traffic classes
+ * @mbc: enable macsec bypass capability
+ * @delay: the allowance made for a round-trip propagation delay of the
+ * link in bits.
+ * @requests: count of the sent pfc frames
+ * @indications: count of the received pfc frames
+ */
+struct ieee_pfc {
+ __u8 pfc_cap;
+ __u8 pfc_en;
+ __u8 mbc;
+ __u16 delay;
+ __u64 requests[IEEE_8021QAZ_MAX_TCS];
+ __u64 indications[IEEE_8021QAZ_MAX_TCS];
+};
+
+/* CEE DCBX std supported values */
+#define CEE_DCBX_MAX_PGS 8
+#define CEE_DCBX_MAX_PRIO 8
+
+/**
+ * struct cee_pg - CEE Priority-Group managed object
+ *
+ * @willing: willing bit in the PG tlv
+ * @error: error bit in the PG tlv
+ * @pg_en: enable bit of the PG feature
+ * @tcs_supported: number of traffic classes supported
+ * @pg_bw: bandwidth percentage for each priority group
+ * @prio_pg: priority to PG mapping indexed by priority
+ */
+struct cee_pg {
+ __u8 willing;
+ __u8 error;
+ __u8 pg_en;
+ __u8 tcs_supported;
+ __u8 pg_bw[CEE_DCBX_MAX_PGS];
+ __u8 prio_pg[CEE_DCBX_MAX_PGS];
+};
+
+/**
+ * struct cee_pfc - CEE PFC managed object
+ *
+ * @willing: willing bit in the PFC tlv
+ * @error: error bit in the PFC tlv
+ * @pfc_en: bitmap indicating pfc enabled traffic classes
+ * @tcs_supported: number of traffic classes supported
+ */
+struct cee_pfc {
+ __u8 willing;
+ __u8 error;
+ __u8 pfc_en;
+ __u8 tcs_supported;
+};
+
+/* IEEE 802.1Qaz std supported values */
+#define IEEE_8021QAZ_APP_SEL_ETHERTYPE 1
+#define IEEE_8021QAZ_APP_SEL_STREAM 2
+#define IEEE_8021QAZ_APP_SEL_DGRAM 3
+#define IEEE_8021QAZ_APP_SEL_ANY 4
+
+/* This structure contains the IEEE 802.1Qaz APP managed object. This
+ * object is also used for the CEE std. There is no difference
+ * between the objects.
+ *
+ * @selector: protocol identifier type
+ * @protocol: protocol of type indicated
+ * @priority: 3-bit unsigned integer indicating priority for IEEE
+ * 8-bit 802.1p user priority bitmap for CEE
+ *
+ * ----
+ * Selector field values
+ * 0 Reserved
+ * 1 Ethertype
+ * 2 Well known port number over TCP or SCTP
+ * 3 Well known port number over UDP or DCCP
+ * 4 Well known port number over TCP, SCTP, UDP, or DCCP
+ * 5-7 Reserved
+ */
+struct dcb_app {
+ __u8 selector;
+ __u8 priority;
+ __u16 protocol;
+};
+
+/**
+ * struct dcb_peer_app_info - APP feature information sent by the peer
+ *
+ * @willing: willing bit in the peer APP tlv
+ * @error: error bit in the peer APP tlv
+ *
+ * In addition to this information the full peer APP tlv also contains
+ * a table of 'app_count' APP objects defined above.
+ */
+struct dcb_peer_app_info {
+ __u8 willing;
+ __u8 error;
+};
+
+struct dcbmsg {
+ __u8 dcb_family;
+ __u8 cmd;
+ __u16 dcb_pad;
+};
+
+/**
+ * enum dcbnl_commands - supported DCB commands
+ *
+ * @DCB_CMD_UNDEFINED: unspecified command to catch errors
+ * @DCB_CMD_GSTATE: request the state of DCB in the device
+ * @DCB_CMD_SSTATE: set the state of DCB in the device
+ * @DCB_CMD_PGTX_GCFG: request the priority group configuration for Tx
+ * @DCB_CMD_PGTX_SCFG: set the priority group configuration for Tx
+ * @DCB_CMD_PGRX_GCFG: request the priority group configuration for Rx
+ * @DCB_CMD_PGRX_SCFG: set the priority group configuration for Rx
+ * @DCB_CMD_PFC_GCFG: request the priority flow control configuration
+ * @DCB_CMD_PFC_SCFG: set the priority flow control configuration
+ * @DCB_CMD_SET_ALL: apply all changes to the underlying device
+ * @DCB_CMD_GPERM_HWADDR: get the permanent MAC address of the underlying
+ * device. Only useful when using bonding.
+ * @DCB_CMD_GCAP: request the DCB capabilities of the device
+ * @DCB_CMD_GNUMTCS: get the number of traffic classes currently supported
+ * @DCB_CMD_SNUMTCS: set the number of traffic classes
+ * @DCB_CMD_BCN_GCFG: get backward congestion notification configuration
+ * @DCB_CMD_BCN_SCFG: set backward congestion notification configuration
+ * @DCB_CMD_GAPP: get application protocol configuration
+ * @DCB_CMD_SAPP: set application protocol configuration
+ * @DCB_CMD_IEEE_SET: set IEEE 802.1Qaz configuration
+ * @DCB_CMD_IEEE_GET: get IEEE 802.1Qaz configuration
+ * @DCB_CMD_GDCBX: get DCBX engine configuration
+ * @DCB_CMD_SDCBX: set DCBX engine configuration
+ * @DCB_CMD_GFEATCFG: get DCBX features flags
+ * @DCB_CMD_SFEATCFG: set DCBX features negotiation flags
+ * @DCB_CMD_CEE_GET: get CEE aggregated configuration
+ * @DCB_CMD_IEEE_DEL: delete IEEE 802.1Qaz configuration
+ */
+enum dcbnl_commands {
+ DCB_CMD_UNDEFINED,
+
+ DCB_CMD_GSTATE,
+ DCB_CMD_SSTATE,
+
+ DCB_CMD_PGTX_GCFG,
+ DCB_CMD_PGTX_SCFG,
+ DCB_CMD_PGRX_GCFG,
+ DCB_CMD_PGRX_SCFG,
+
+ DCB_CMD_PFC_GCFG,
+ DCB_CMD_PFC_SCFG,
+
+ DCB_CMD_SET_ALL,
+
+ DCB_CMD_GPERM_HWADDR,
+
+ DCB_CMD_GCAP,
+
+ DCB_CMD_GNUMTCS,
+ DCB_CMD_SNUMTCS,
+
+ DCB_CMD_PFC_GSTATE,
+ DCB_CMD_PFC_SSTATE,
+
+ DCB_CMD_BCN_GCFG,
+ DCB_CMD_BCN_SCFG,
+
+ DCB_CMD_GAPP,
+ DCB_CMD_SAPP,
+
+ DCB_CMD_IEEE_SET,
+ DCB_CMD_IEEE_GET,
+
+ DCB_CMD_GDCBX,
+ DCB_CMD_SDCBX,
+
+ DCB_CMD_GFEATCFG,
+ DCB_CMD_SFEATCFG,
+
+ DCB_CMD_CEE_GET,
+ DCB_CMD_IEEE_DEL,
+
+ __DCB_CMD_ENUM_MAX,
+ DCB_CMD_MAX = __DCB_CMD_ENUM_MAX - 1,
+};
+
+/**
+ * enum dcbnl_attrs - DCB top-level netlink attributes
+ *
+ * @DCB_ATTR_UNDEFINED: unspecified attribute to catch errors
+ * @DCB_ATTR_IFNAME: interface name of the underlying device (NLA_STRING)
+ * @DCB_ATTR_STATE: enable state of DCB in the device (NLA_U8)
+ * @DCB_ATTR_PFC_STATE: enable state of PFC in the device (NLA_U8)
+ * @DCB_ATTR_PFC_CFG: priority flow control configuration (NLA_NESTED)
+ * @DCB_ATTR_NUM_TC: number of traffic classes supported in the device (NLA_U8)
+ * @DCB_ATTR_PG_CFG: priority group configuration (NLA_NESTED)
+ * @DCB_ATTR_SET_ALL: bool to commit changes to hardware or not (NLA_U8)
+ * @DCB_ATTR_PERM_HWADDR: MAC address of the physical device (NLA_NESTED)
+ * @DCB_ATTR_CAP: DCB capabilities of the device (NLA_NESTED)
+ * @DCB_ATTR_NUMTCS: number of traffic classes supported (NLA_NESTED)
+ * @DCB_ATTR_BCN: backward congestion notification configuration (NLA_NESTED)
+ * @DCB_ATTR_IEEE: IEEE 802.1Qaz supported attributes (NLA_NESTED)
+ * @DCB_ATTR_DCBX: DCBX engine configuration in the device (NLA_U8)
+ * @DCB_ATTR_FEATCFG: DCBX features flags (NLA_NESTED)
+ * @DCB_ATTR_CEE: CEE std supported attributes (NLA_NESTED)
+ */
+enum dcbnl_attrs {
+ DCB_ATTR_UNDEFINED,
+
+ DCB_ATTR_IFNAME,
+ DCB_ATTR_STATE,
+ DCB_ATTR_PFC_STATE,
+ DCB_ATTR_PFC_CFG,
+ DCB_ATTR_NUM_TC,
+ DCB_ATTR_PG_CFG,
+ DCB_ATTR_SET_ALL,
+ DCB_ATTR_PERM_HWADDR,
+ DCB_ATTR_CAP,
+ DCB_ATTR_NUMTCS,
+ DCB_ATTR_BCN,
+ DCB_ATTR_APP,
+
+ /* IEEE std attributes */
+ DCB_ATTR_IEEE,
+
+ DCB_ATTR_DCBX,
+ DCB_ATTR_FEATCFG,
+
+ /* CEE nested attributes */
+ DCB_ATTR_CEE,
+
+ __DCB_ATTR_ENUM_MAX,
+ DCB_ATTR_MAX = __DCB_ATTR_ENUM_MAX - 1,
+};
+
+/**
+ * enum ieee_attrs - IEEE 802.1Qaz get/set attributes
+ *
+ * @DCB_ATTR_IEEE_UNSPEC: unspecified
+ * @DCB_ATTR_IEEE_ETS: negotiated ETS configuration
+ * @DCB_ATTR_IEEE_PFC: negotiated PFC configuration
+ * @DCB_ATTR_IEEE_APP_TABLE: negotiated APP configuration
+ * @DCB_ATTR_IEEE_PEER_ETS: peer ETS configuration - get only
+ * @DCB_ATTR_IEEE_PEER_PFC: peer PFC configuration - get only
+ * @DCB_ATTR_IEEE_PEER_APP: peer APP tlv - get only
+ */
+enum ieee_attrs {
+ DCB_ATTR_IEEE_UNSPEC,
+ DCB_ATTR_IEEE_ETS,
+ DCB_ATTR_IEEE_PFC,
+ DCB_ATTR_IEEE_APP_TABLE,
+ DCB_ATTR_IEEE_PEER_ETS,
+ DCB_ATTR_IEEE_PEER_PFC,
+ DCB_ATTR_IEEE_PEER_APP,
+ DCB_ATTR_IEEE_MAXRATE,
+ DCB_ATTR_IEEE_QCN,
+ DCB_ATTR_IEEE_QCN_STATS,
+ __DCB_ATTR_IEEE_MAX
+};
+#define DCB_ATTR_IEEE_MAX (__DCB_ATTR_IEEE_MAX - 1)
+
+enum ieee_attrs_app {
+ DCB_ATTR_IEEE_APP_UNSPEC,
+ DCB_ATTR_IEEE_APP,
+ __DCB_ATTR_IEEE_APP_MAX
+};
+#define DCB_ATTR_IEEE_APP_MAX (__DCB_ATTR_IEEE_APP_MAX - 1)
+
+/**
+ * enum cee_attrs - CEE DCBX get attributes.
+ *
+ * @DCB_ATTR_CEE_UNSPEC: unspecified
+ * @DCB_ATTR_CEE_PEER_PG: peer PG configuration - get only
+ * @DCB_ATTR_CEE_PEER_PFC: peer PFC configuration - get only
+ * @DCB_ATTR_CEE_PEER_APP_TABLE: peer APP tlv - get only
+ * @DCB_ATTR_CEE_TX_PG: TX PG configuration (DCB_CMD_PGTX_GCFG)
+ * @DCB_ATTR_CEE_RX_PG: RX PG configuration (DCB_CMD_PGRX_GCFG)
+ * @DCB_ATTR_CEE_PFC: PFC configuration (DCB_CMD_PFC_GCFG)
+ * @DCB_ATTR_CEE_APP_TABLE: APP configuration (multi DCB_CMD_GAPP)
+ * @DCB_ATTR_CEE_FEAT: DCBX features flags (DCB_CMD_GFEATCFG)
+ *
+ * An aggregated collection of the cee std negotiated parameters.
+ */
+enum cee_attrs {
+ DCB_ATTR_CEE_UNSPEC,
+ DCB_ATTR_CEE_PEER_PG,
+ DCB_ATTR_CEE_PEER_PFC,
+ DCB_ATTR_CEE_PEER_APP_TABLE,
+ DCB_ATTR_CEE_TX_PG,
+ DCB_ATTR_CEE_RX_PG,
+ DCB_ATTR_CEE_PFC,
+ DCB_ATTR_CEE_APP_TABLE,
+ DCB_ATTR_CEE_FEAT,
+ __DCB_ATTR_CEE_MAX
+};
+#define DCB_ATTR_CEE_MAX (__DCB_ATTR_CEE_MAX - 1)
+
+enum peer_app_attr {
+ DCB_ATTR_CEE_PEER_APP_UNSPEC,
+ DCB_ATTR_CEE_PEER_APP_INFO,
+ DCB_ATTR_CEE_PEER_APP,
+ __DCB_ATTR_CEE_PEER_APP_MAX
+};
+#define DCB_ATTR_CEE_PEER_APP_MAX (__DCB_ATTR_CEE_PEER_APP_MAX - 1)
+
+enum cee_attrs_app {
+ DCB_ATTR_CEE_APP_UNSPEC,
+ DCB_ATTR_CEE_APP,
+ __DCB_ATTR_CEE_APP_MAX
+};
+#define DCB_ATTR_CEE_APP_MAX (__DCB_ATTR_CEE_APP_MAX - 1)
+
+/**
+ * enum dcbnl_pfc_up_attrs - DCB Priority Flow Control user priority nested attrs
+ *
+ * @DCB_PFC_UP_ATTR_UNDEFINED: unspecified attribute to catch errors
+ * @DCB_PFC_UP_ATTR_0: Priority Flow Control value for User Priority 0 (NLA_U8)
+ * @DCB_PFC_UP_ATTR_1: Priority Flow Control value for User Priority 1 (NLA_U8)
+ * @DCB_PFC_UP_ATTR_2: Priority Flow Control value for User Priority 2 (NLA_U8)
+ * @DCB_PFC_UP_ATTR_3: Priority Flow Control value for User Priority 3 (NLA_U8)
+ * @DCB_PFC_UP_ATTR_4: Priority Flow Control value for User Priority 4 (NLA_U8)
+ * @DCB_PFC_UP_ATTR_5: Priority Flow Control value for User Priority 5 (NLA_U8)
+ * @DCB_PFC_UP_ATTR_6: Priority Flow Control value for User Priority 6 (NLA_U8)
+ * @DCB_PFC_UP_ATTR_7: Priority Flow Control value for User Priority 7 (NLA_U8)
+ * @DCB_PFC_UP_ATTR_MAX: highest attribute number currently defined
+ * @DCB_PFC_UP_ATTR_ALL: apply to all priority flow control attrs (NLA_FLAG)
+ *
+ */
+enum dcbnl_pfc_up_attrs {
+ DCB_PFC_UP_ATTR_UNDEFINED,
+
+ DCB_PFC_UP_ATTR_0,
+ DCB_PFC_UP_ATTR_1,
+ DCB_PFC_UP_ATTR_2,
+ DCB_PFC_UP_ATTR_3,
+ DCB_PFC_UP_ATTR_4,
+ DCB_PFC_UP_ATTR_5,
+ DCB_PFC_UP_ATTR_6,
+ DCB_PFC_UP_ATTR_7,
+ DCB_PFC_UP_ATTR_ALL,
+
+ __DCB_PFC_UP_ATTR_ENUM_MAX,
+ DCB_PFC_UP_ATTR_MAX = __DCB_PFC_UP_ATTR_ENUM_MAX - 1,
+};
+
+/**
+ * enum dcbnl_pg_attrs - DCB Priority Group attributes
+ *
+ * @DCB_PG_ATTR_UNDEFINED: unspecified attribute to catch errors
+ * @DCB_PG_ATTR_TC_0: Priority Group Traffic Class 0 configuration (NLA_NESTED)
+ * @DCB_PG_ATTR_TC_1: Priority Group Traffic Class 1 configuration (NLA_NESTED)
+ * @DCB_PG_ATTR_TC_2: Priority Group Traffic Class 2 configuration (NLA_NESTED)
+ * @DCB_PG_ATTR_TC_3: Priority Group Traffic Class 3 configuration (NLA_NESTED)
+ * @DCB_PG_ATTR_TC_4: Priority Group Traffic Class 4 configuration (NLA_NESTED)
+ * @DCB_PG_ATTR_TC_5: Priority Group Traffic Class 5 configuration (NLA_NESTED)
+ * @DCB_PG_ATTR_TC_6: Priority Group Traffic Class 6 configuration (NLA_NESTED)
+ * @DCB_PG_ATTR_TC_7: Priority Group Traffic Class 7 configuration (NLA_NESTED)
+ * @DCB_PG_ATTR_TC_MAX: highest attribute number currently defined
+ * @DCB_PG_ATTR_TC_ALL: apply to all traffic classes (NLA_NESTED)
+ * @DCB_PG_ATTR_BW_ID_0: Percent of link bandwidth for Priority Group 0 (NLA_U8)
+ * @DCB_PG_ATTR_BW_ID_1: Percent of link bandwidth for Priority Group 1 (NLA_U8)
+ * @DCB_PG_ATTR_BW_ID_2: Percent of link bandwidth for Priority Group 2 (NLA_U8)
+ * @DCB_PG_ATTR_BW_ID_3: Percent of link bandwidth for Priority Group 3 (NLA_U8)
+ * @DCB_PG_ATTR_BW_ID_4: Percent of link bandwidth for Priority Group 4 (NLA_U8)
+ * @DCB_PG_ATTR_BW_ID_5: Percent of link bandwidth for Priority Group 5 (NLA_U8)
+ * @DCB_PG_ATTR_BW_ID_6: Percent of link bandwidth for Priority Group 6 (NLA_U8)
+ * @DCB_PG_ATTR_BW_ID_7: Percent of link bandwidth for Priority Group 7 (NLA_U8)
+ * @DCB_PG_ATTR_BW_ID_MAX: highest attribute number currently defined
+ * @DCB_PG_ATTR_BW_ID_ALL: apply to all priority groups (NLA_FLAG)
+ *
+ */
+enum dcbnl_pg_attrs {
+ DCB_PG_ATTR_UNDEFINED,
+
+ DCB_PG_ATTR_TC_0,
+ DCB_PG_ATTR_TC_1,
+ DCB_PG_ATTR_TC_2,
+ DCB_PG_ATTR_TC_3,
+ DCB_PG_ATTR_TC_4,
+ DCB_PG_ATTR_TC_5,
+ DCB_PG_ATTR_TC_6,
+ DCB_PG_ATTR_TC_7,
+ DCB_PG_ATTR_TC_MAX,
+ DCB_PG_ATTR_TC_ALL,
+
+ DCB_PG_ATTR_BW_ID_0,
+ DCB_PG_ATTR_BW_ID_1,
+ DCB_PG_ATTR_BW_ID_2,
+ DCB_PG_ATTR_BW_ID_3,
+ DCB_PG_ATTR_BW_ID_4,
+ DCB_PG_ATTR_BW_ID_5,
+ DCB_PG_ATTR_BW_ID_6,
+ DCB_PG_ATTR_BW_ID_7,
+ DCB_PG_ATTR_BW_ID_MAX,
+ DCB_PG_ATTR_BW_ID_ALL,
+
+ __DCB_PG_ATTR_ENUM_MAX,
+ DCB_PG_ATTR_MAX = __DCB_PG_ATTR_ENUM_MAX - 1,
+};
+
+/**
+ * enum dcbnl_tc_attrs - DCB Traffic Class attributes
+ *
+ * @DCB_TC_ATTR_PARAM_UNDEFINED: unspecified attribute to catch errors
+ * @DCB_TC_ATTR_PARAM_PGID: (NLA_U8) Priority group the traffic class belongs to
+ * Valid values are: 0-7
+ * @DCB_TC_ATTR_PARAM_UP_MAPPING: (NLA_U8) Traffic class to user priority map
+ * Some devices may not support changing the
+ * user priority map of a TC.
+ * @DCB_TC_ATTR_PARAM_STRICT_PRIO: (NLA_U8) Strict priority setting
+ * 0 - none
+ * 1 - group strict
+ * 2 - link strict
+ * @DCB_TC_ATTR_PARAM_BW_PCT: optional - (NLA_U8) If supported by the device and
+ * not configured to use link strict priority,
+ * this is the percentage of bandwidth of the
+ * priority group this traffic class belongs to
+ * @DCB_TC_ATTR_PARAM_ALL: (NLA_FLAG) all traffic class parameters
+ *
+ */
+enum dcbnl_tc_attrs {
+ DCB_TC_ATTR_PARAM_UNDEFINED,
+
+ DCB_TC_ATTR_PARAM_PGID,
+ DCB_TC_ATTR_PARAM_UP_MAPPING,
+ DCB_TC_ATTR_PARAM_STRICT_PRIO,
+ DCB_TC_ATTR_PARAM_BW_PCT,
+ DCB_TC_ATTR_PARAM_ALL,
+
+ __DCB_TC_ATTR_PARAM_ENUM_MAX,
+ DCB_TC_ATTR_PARAM_MAX = __DCB_TC_ATTR_PARAM_ENUM_MAX - 1,
+};
+
+/**
+ * enum dcbnl_cap_attrs - DCB Capability attributes
+ *
+ * @DCB_CAP_ATTR_UNDEFINED: unspecified attribute to catch errors
+ * @DCB_CAP_ATTR_ALL: (NLA_FLAG) all capability parameters
+ * @DCB_CAP_ATTR_PG: (NLA_U8) device supports Priority Groups
+ * @DCB_CAP_ATTR_PFC: (NLA_U8) device supports Priority Flow Control
+ * @DCB_CAP_ATTR_UP2TC: (NLA_U8) device supports user priority to
+ * traffic class mapping
+ * @DCB_CAP_ATTR_PG_TCS: (NLA_U8) bitmap where each bit represents a
+ * number of traffic classes the device
+ * can be configured to use for Priority Groups
+ * @DCB_CAP_ATTR_PFC_TCS: (NLA_U8) bitmap where each bit represents a
+ * number of traffic classes the device can be
+ * configured to use for Priority Flow Control
+ * @DCB_CAP_ATTR_GSP: (NLA_U8) device supports group strict priority
+ * @DCB_CAP_ATTR_BCN: (NLA_U8) device supports Backwards Congestion
+ * Notification
+ * @DCB_CAP_ATTR_DCBX: (NLA_U8) device supports DCBX engine
+ *
+ */
+enum dcbnl_cap_attrs {
+ DCB_CAP_ATTR_UNDEFINED,
+ DCB_CAP_ATTR_ALL,
+ DCB_CAP_ATTR_PG,
+ DCB_CAP_ATTR_PFC,
+ DCB_CAP_ATTR_UP2TC,
+ DCB_CAP_ATTR_PG_TCS,
+ DCB_CAP_ATTR_PFC_TCS,
+ DCB_CAP_ATTR_GSP,
+ DCB_CAP_ATTR_BCN,
+ DCB_CAP_ATTR_DCBX,
+
+ __DCB_CAP_ATTR_ENUM_MAX,
+ DCB_CAP_ATTR_MAX = __DCB_CAP_ATTR_ENUM_MAX - 1,
+};
+
+/**
+ * DCBX capability flags
+ *
+ * @DCB_CAP_DCBX_HOST: DCBX negotiation is performed by the host LLDP agent.
+ * 'set' routines are used to configure the device with
+ * the negotiated parameters
+ *
+ * @DCB_CAP_DCBX_LLD_MANAGED: DCBX negotiation is not performed in the host but
+ * by another entity
+ * 'get' routines are used to retrieve the
+ * negotiated parameters
+ * 'set' routines can be used to set the initial
+ * negotiation configuration
+ *
+ * @DCB_CAP_DCBX_VER_CEE: for a non-host DCBX engine, indicates the engine
+ * supports the CEE protocol flavor
+ *
+ * @DCB_CAP_DCBX_VER_IEEE: for a non-host DCBX engine, indicates the engine
+ * supports the IEEE protocol flavor
+ *
+ * @DCB_CAP_DCBX_STATIC: for a non-host DCBX engine, indicates the engine
+ * supports static configuration (i.e. no actual
+ * negotiation is performed; negotiated parameters equal
+ * the initial configuration)
+ *
+ */
+#define DCB_CAP_DCBX_HOST 0x01
+#define DCB_CAP_DCBX_LLD_MANAGED 0x02
+#define DCB_CAP_DCBX_VER_CEE 0x04
+#define DCB_CAP_DCBX_VER_IEEE 0x08
+#define DCB_CAP_DCBX_STATIC 0x10
+
+/**
+ * enum dcbnl_numtcs_attrs - number of traffic classes
+ *
+ * @DCB_NUMTCS_ATTR_UNDEFINED: unspecified attribute to catch errors
+ * @DCB_NUMTCS_ATTR_ALL: (NLA_FLAG) all traffic class attributes
+ * @DCB_NUMTCS_ATTR_PG: (NLA_U8) number of traffic classes used for
+ * priority groups
+ * @DCB_NUMTCS_ATTR_PFC: (NLA_U8) number of traffic classes which can
+ * support priority flow control
+ */
+enum dcbnl_numtcs_attrs {
+ DCB_NUMTCS_ATTR_UNDEFINED,
+ DCB_NUMTCS_ATTR_ALL,
+ DCB_NUMTCS_ATTR_PG,
+ DCB_NUMTCS_ATTR_PFC,
+
+ __DCB_NUMTCS_ATTR_ENUM_MAX,
+ DCB_NUMTCS_ATTR_MAX = __DCB_NUMTCS_ATTR_ENUM_MAX - 1,
+};
+
+enum dcbnl_bcn_attrs {
+ DCB_BCN_ATTR_UNDEFINED = 0,
+
+ DCB_BCN_ATTR_RP_0,
+ DCB_BCN_ATTR_RP_1,
+ DCB_BCN_ATTR_RP_2,
+ DCB_BCN_ATTR_RP_3,
+ DCB_BCN_ATTR_RP_4,
+ DCB_BCN_ATTR_RP_5,
+ DCB_BCN_ATTR_RP_6,
+ DCB_BCN_ATTR_RP_7,
+ DCB_BCN_ATTR_RP_ALL,
+
+ DCB_BCN_ATTR_BCNA_0,
+ DCB_BCN_ATTR_BCNA_1,
+ DCB_BCN_ATTR_ALPHA,
+ DCB_BCN_ATTR_BETA,
+ DCB_BCN_ATTR_GD,
+ DCB_BCN_ATTR_GI,
+ DCB_BCN_ATTR_TMAX,
+ DCB_BCN_ATTR_TD,
+ DCB_BCN_ATTR_RMIN,
+ DCB_BCN_ATTR_W,
+ DCB_BCN_ATTR_RD,
+ DCB_BCN_ATTR_RU,
+ DCB_BCN_ATTR_WRTT,
+ DCB_BCN_ATTR_RI,
+ DCB_BCN_ATTR_C,
+ DCB_BCN_ATTR_ALL,
+
+ __DCB_BCN_ATTR_ENUM_MAX,
+ DCB_BCN_ATTR_MAX = __DCB_BCN_ATTR_ENUM_MAX - 1,
+};
+
+/**
+ * enum dcb_general_attr_values - general DCB attribute values
+ *
+ * @DCB_ATTR_VALUE_UNDEFINED: value used to indicate an attribute is not supported
+ *
+ */
+enum dcb_general_attr_values {
+ DCB_ATTR_VALUE_UNDEFINED = 0xff
+};
+
+#define DCB_APP_IDTYPE_ETHTYPE 0x00
+#define DCB_APP_IDTYPE_PORTNUM 0x01
+enum dcbnl_app_attrs {
+ DCB_APP_ATTR_UNDEFINED,
+
+ DCB_APP_ATTR_IDTYPE,
+ DCB_APP_ATTR_ID,
+ DCB_APP_ATTR_PRIORITY,
+
+ __DCB_APP_ATTR_ENUM_MAX,
+ DCB_APP_ATTR_MAX = __DCB_APP_ATTR_ENUM_MAX - 1,
+};
+
+/**
+ * enum dcbnl_featcfg_attrs - features configuration flags
+ *
+ * @DCB_FEATCFG_ATTR_UNDEFINED: unspecified attribute to catch errors
+ * @DCB_FEATCFG_ATTR_ALL: (NLA_FLAG) all features configuration attributes
+ * @DCB_FEATCFG_ATTR_PG: (NLA_U8) configuration flags for priority groups
+ * @DCB_FEATCFG_ATTR_PFC: (NLA_U8) configuration flags for priority
+ * flow control
+ * @DCB_FEATCFG_ATTR_APP: (NLA_U8) configuration flags for application TLV
+ *
+ */
+#define DCB_FEATCFG_ERROR 0x01 /* error in feature resolution */
+#define DCB_FEATCFG_ENABLE 0x02 /* enable feature */
+#define DCB_FEATCFG_WILLING 0x04 /* feature is willing */
+#define DCB_FEATCFG_ADVERTISE 0x08 /* advertise feature */
+enum dcbnl_featcfg_attrs {
+ DCB_FEATCFG_ATTR_UNDEFINED,
+ DCB_FEATCFG_ATTR_ALL,
+ DCB_FEATCFG_ATTR_PG,
+ DCB_FEATCFG_ATTR_PFC,
+ DCB_FEATCFG_ATTR_APP,
+
+ __DCB_FEATCFG_ATTR_ENUM_MAX,
+ DCB_FEATCFG_ATTR_MAX = __DCB_FEATCFG_ATTR_ENUM_MAX - 1,
+};
+
+#endif /* __LINUX_DCBNL_H__ */
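Reviewer note on dcbnl.h: the DCB_CAP_DCBX_* values above are bit flags
meant to be OR-ed into a single u8, and pfc_en in struct ieee_pfc and
struct cee_pfc is a per-priority bitmap. A minimal sketch of how a caller
might test them follows. The helper names are hypothetical and not part of
this patch, and the mapping "bit N of pfc_en = priority N" is an assumption
on my part, since the header only calls it a bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values copied from the dcbnl.h hunk above. */
#define DCB_CAP_DCBX_HOST        0x01
#define DCB_CAP_DCBX_LLD_MANAGED 0x02
#define DCB_CAP_DCBX_VER_CEE     0x04
#define DCB_CAP_DCBX_VER_IEEE    0x08
#define DCB_CAP_DCBX_STATIC      0x10

/* Hypothetical helper: true when DCBX negotiation is done by a non-host
 * entity that supports the IEEE protocol flavor. */
static bool dcbx_is_offloaded_ieee(uint8_t mode)
{
	return (mode & DCB_CAP_DCBX_LLD_MANAGED) &&
	       (mode & DCB_CAP_DCBX_VER_IEEE) &&
	       !(mode & DCB_CAP_DCBX_HOST);
}

/* Hypothetical helper: test one bit of a pfc_en-style bitmap.
 * Assumes bit N corresponds to priority N (0..7). */
static bool pfc_enabled_for(uint8_t pfc_en, unsigned int prio)
{
	return prio < 8 && ((pfc_en >> prio) & 1u);
}
```

With these, a mode of (DCB_CAP_DCBX_LLD_MANAGED | DCB_CAP_DCBX_VER_IEEE)
is reported as offloaded IEEE, while any mode carrying DCB_CAP_DCBX_HOST
is not.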
diff --git a/drivers/net/mlnx_uio/include/etherdevice.h b/drivers/net/mlnx_uio/include/etherdevice.h
new file mode 100644
index 0000000..c080f9a
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/etherdevice.h
@@ -0,0 +1,189 @@
+/*
+ * INET An implementation of the TCP/IP protocol suite for the LINUX
+ * operating system. NET is implemented using the BSD Socket
+ * interface as the means of communication with the user level.
+ *
+ * Definitions for the Ethernet handlers.
+ *
+ * Version: @(#)eth.h 1.0.4 05/13/93
+ *
+ * Authors: Ross Biro
+ * Fred N. van Kempen, <waltje@uWalt.NL.Mugnet.ORG>
+ *
+ * Relocated to include/linux where it belongs by Alan Cox
+ * <gw4pts@gw4pts.ampr.org>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ */
+#ifndef _LINUX_ETHERDEVICE_H
+#define _LINUX_ETHERDEVICE_H
+
+#include <endian.h>
+#include <rte_ether.h>
+
+/* Reserved Ethernet Addresses per IEEE 802.1Q */
+static const u8 eth_reserved_addr_base[6] =
+{ 0x01, 0x80, 0xc2, 0x00, 0x00, 0x00 };
+
+/**
+ * is_link_local_ether_addr - Determine if given Ethernet address is link-local
+ * @addr: Pointer to a six-byte array containing the Ethernet address
+ *
+ * Return true if address is link local reserved addr (01:80:c2:00:00:0X) per
+ * IEEE 802.1Q 8.6.3 Frame filtering.
+ */
+static inline bool is_link_local_ether_addr(const u8 *addr)
+{
+ const __be16 *a = (const __be16 *)addr;
+ const __be16 *b = (const __be16 *)eth_reserved_addr_base;
+ const __be16 m = cpu_to_be16(0xfff0);
+
+ return ((a[0] ^ b[0]) | (a[1] ^ b[1]) | ((a[2] ^ b[2]) & m)) == 0;
+}
+
+
+/**
+ * is_local_ether_addr - Determine if the Ethernet address is locally-assigned one (IEEE 802).
+ * @addr: Pointer to a six-byte array containing the Ethernet address
+ *
+ * Return true if the address is a local address.
+ */
+static inline bool is_local_ether_addr(const u8 *addr)
+{
+ return 0x02 & addr[0];
+}
+
+#if 0
+/**
+ * is_broadcast_ether_addr - Determine if the Ethernet address is broadcast
+ * @addr: Pointer to a six-byte array containing the Ethernet address
+ *
+ * Return true if the address is the broadcast address.
+ */
+static inline bool is_broadcast_ether_addr(const u8 *addr)
+{
+ return (addr[0] & addr[1] & addr[2] & addr[3] & addr[4] & addr[5]) == 0xff;
+}
+/**
+ * is_unicast_ether_addr - Determine if the Ethernet address is unicast
+ * @addr: Pointer to a six-byte array containing the Ethernet address
+ *
+ * Return true if the address is a unicast address.
+ */
+static inline bool is_unicast_ether_addr(const u8 *addr)
+{
+ return !is_multicast_ether_addr(addr);
+}
+#endif
+
+/**
+ * is_valid_ether_addr - Determine if the given Ethernet address is valid
+ * @addr: Pointer to a six-byte array containing the Ethernet address
+ *
+ * Check that the Ethernet address (MAC) is not 00:00:00:00:00:00, is not
+ * a multicast address, and is not FF:FF:FF:FF:FF:FF.
+ *
+ * Return true if the address is valid.
+ */
+static inline bool is_valid_ether_addr(const u8 *addr)
+{
+ /* FF:FF:FF:FF:FF:FF is a multicast address so we don't need to
+ * explicitly check for it here. */
+ return !is_multicast_ether_addr((struct ether_addr *)addr) && !is_zero_ether_addr((struct ether_addr *)addr);
+}
+
+
+/**
+ * eth_broadcast_addr - Assign broadcast address
+ * @addr: Pointer to a six-byte array containing the Ethernet address
+ *
+ * Assign the broadcast address to the given address array.
+ */
+static inline void eth_broadcast_addr(u8 *addr)
+{
+ memset(addr, 0xff, 6);
+}
+
+/**
+ * eth_zero_addr - Assign zero address
+ * @addr: Pointer to a six-byte array containing the Ethernet address
+ *
+ * Assign the zero address to the given address array.
+ */
+static inline void eth_zero_addr(u8 *addr)
+{
+ memset(addr, 0x00, 6);
+}
+
+
+/**
+ * compare_ether_addr - Compare two Ethernet addresses
+ * @addr1: Pointer to a six-byte array containing the Ethernet address
+ * @addr2: Pointer to another six-byte array containing the Ethernet address
+ *
+ * Compare two Ethernet addresses, returns 0 if equal, non-zero otherwise.
+ * Unlike memcmp(), it doesn't return a value suitable for sorting.
+ */
+static inline unsigned compare_ether_addr(const u8 *addr1, const u8 *addr2)
+{
+ const u16 *a = (const u16 *) addr1;
+ const u16 *b = (const u16 *) addr2;
+
+ return ((a[0] ^ b[0]) | (a[1] ^ b[1]) | (a[2] ^ b[2])) != 0;
+}
+
+/**
+ * ether_addr_equal - Compare two Ethernet addresses
+ * @addr1: Pointer to a six-byte array containing the Ethernet address
+ * @addr2: Pointer to another six-byte array containing the Ethernet address
+ *
+ * Compare two Ethernet addresses, returns true if equal
+ */
+static inline bool ether_addr_equal(const u8 *addr1, const u8 *addr2)
+{
+ return !compare_ether_addr(addr1, addr2);
+}
+
+/**
+ * ether_addr_equal_64bits - Compare two Ethernet addresses
+ * @addr1: Pointer to an array of 8 bytes
+ * @addr2: Pointer to another array of 8 bytes
+ *
+ * Compare two Ethernet addresses, returns true if equal, false otherwise.
+ *
+ * The function doesn't need any conditional branches and possibly uses
+ * word memory accesses on CPU allowing cheap unaligned memory reads.
+ * arrays = { byte1, byte2, byte3, byte4, byte5, byte6, pad1, pad2 }
+ *
+ * Please note that the alignment of addr1 & addr2 is only guaranteed to be 16 bits.
+ */
+
+static inline bool ether_addr_equal_64bits(const u8 addr1[6+2],
+ const u8 addr2[6+2])
+{
+ return ether_addr_equal(addr1, addr2);
+}
+
+
+/**
+ * compare_ether_header - Compare two Ethernet headers
+ * @a: Pointer to Ethernet header
+ * @b: Pointer to Ethernet header
+ *
+ * Compare two Ethernet headers, returns 0 if equal.
+ * This assumes that the network header (i.e., IP header) is 4-byte
+ * aligned OR the platform can handle unaligned access. This is the
+ * case for all packets coming into netif_receive_skb or similar
+ * entry points.
+ */
+
+static inline unsigned long compare_ether_header(const void *a, const void *b)
+{
+ return memcmp(a, b, 14) != 0; /* 14 = full Ethernet header, not just the 6-byte destination */
+}
+
+#endif /* _LINUX_ETHERDEVICE_H */
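Reviewer note on etherdevice.h: is_link_local_ether_addr() above compares
three 16-bit words and masks the last one with cpu_to_be16(0xfff0), so it
matches 01:80:c2:00:00:00 through 01:80:c2:00:00:0f. The byte-wise sketch
below expresses the same check without the pointer casts, which makes the
mask easier to verify. The function name is hypothetical and it is not a
proposed replacement, only an illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Byte-wise equivalent of the masked 16-bit comparison: the first five
 * bytes must equal 01:80:c2:00:00 and the high nibble of the sixth byte
 * must be zero, leaving the low nibble (the "X" in 0X) free. */
static bool link_local_bytewise(const uint8_t addr[6])
{
	static const uint8_t base[5] = { 0x01, 0x80, 0xc2, 0x00, 0x00 };

	return memcmp(addr, base, sizeof(base)) == 0 &&
	       (addr[5] & 0xf0) == 0;
}
```

For example, 01:80:c2:00:00:0e passes the check, while 01:80:c2:00:00:10
and ff:ff:ff:ff:ff:ff do not.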
diff --git a/drivers/net/mlnx_uio/include/ib_mad.h b/drivers/net/mlnx_uio/include/ib_mad.h
new file mode 100644
index 0000000..918bc61
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/ib_mad.h
@@ -0,0 +1,664 @@
+/*
+ * Copyright (c) 2004 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2004 Infinicon Corporation. All rights reserved.
+ * Copyright (c) 2004 Intel Corporation. All rights reserved.
+ * Copyright (c) 2004 Topspin Corporation. All rights reserved.
+ * Copyright (c) 2004-2006 Voltaire Corporation. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#if !defined(IB_MAD_H)
+#define IB_MAD_H
+
+#include "ib_verbs.h"
+#include "list.h"
+#include "bitmap.h"
+
+/* Management base version */
+#define IB_MGMT_BASE_VERSION 1
+
+/* Management classes */
+#define IB_MGMT_CLASS_SUBN_LID_ROUTED 0x01
+#define IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE 0x81
+#define IB_MGMT_CLASS_SUBN_ADM 0x03
+#define IB_MGMT_CLASS_PERF_MGMT 0x04
+#define IB_MGMT_CLASS_BM 0x05
+#define IB_MGMT_CLASS_DEVICE_MGMT 0x06
+#define IB_MGMT_CLASS_CM 0x07
+#define IB_MGMT_CLASS_SNMP 0x08
+#define IB_MGMT_CLASS_DEVICE_ADM 0x10
+#define IB_MGMT_CLASS_BOOT_MGMT 0x11
+#define IB_MGMT_CLASS_BIS 0x12
+#define IB_MGMT_CLASS_CONG_MGMT 0x21
+#define IB_MGMT_CLASS_VENDOR_RANGE2_START 0x30
+#define IB_MGMT_CLASS_VENDOR_RANGE2_END 0x4F
+
+#define IB_OPENIB_OUI (0x001405)
+
+/* Management methods */
+#define IB_MGMT_METHOD_GET 0x01
+#define IB_MGMT_METHOD_SET 0x02
+#define IB_MGMT_METHOD_GET_RESP 0x81
+#define IB_MGMT_METHOD_SEND 0x03
+#define IB_MGMT_METHOD_TRAP 0x05
+#define IB_MGMT_METHOD_REPORT 0x06
+#define IB_MGMT_METHOD_REPORT_RESP 0x86
+#define IB_MGMT_METHOD_TRAP_REPRESS 0x07
+
+#define IB_MGMT_METHOD_RESP 0x80
+#define IB_BM_ATTR_MOD_RESP cpu_to_be32(1)
+
+#define IB_MGMT_MAX_METHODS 128
+
+/* MAD Status field bit masks */
+#define IB_MGMT_MAD_STATUS_SUCCESS 0x0000
+#define IB_MGMT_MAD_STATUS_BUSY 0x0001
+#define IB_MGMT_MAD_STATUS_REDIRECT_REQD 0x0002
+#define IB_MGMT_MAD_STATUS_BAD_VERSION 0x0004
+#define IB_MGMT_MAD_STATUS_UNSUPPORTED_METHOD 0x0008
+#define IB_MGMT_MAD_STATUS_UNSUPPORTED_METHOD_ATTRIB 0x000c
+#define IB_MGMT_MAD_STATUS_INVALID_ATTRIB_VALUE 0x001c
+
+/* RMPP information */
+#define IB_MGMT_RMPP_VERSION 1
+
+#define IB_MGMT_RMPP_TYPE_DATA 1
+#define IB_MGMT_RMPP_TYPE_ACK 2
+#define IB_MGMT_RMPP_TYPE_STOP 3
+#define IB_MGMT_RMPP_TYPE_ABORT 4
+
+#define IB_MGMT_RMPP_FLAG_ACTIVE 1
+#define IB_MGMT_RMPP_FLAG_FIRST (1<<1)
+#define IB_MGMT_RMPP_FLAG_LAST (1<<2)
+
+#define IB_MGMT_RMPP_NO_RESPTIME 0x1F
+
+#define IB_MGMT_RMPP_STATUS_SUCCESS 0
+#define IB_MGMT_RMPP_STATUS_RESX 1
+#define IB_MGMT_RMPP_STATUS_ABORT_MIN 118
+#define IB_MGMT_RMPP_STATUS_T2L 118
+#define IB_MGMT_RMPP_STATUS_BAD_LEN 119
+#define IB_MGMT_RMPP_STATUS_BAD_SEG 120
+#define IB_MGMT_RMPP_STATUS_BADT 121
+#define IB_MGMT_RMPP_STATUS_W2S 122
+#define IB_MGMT_RMPP_STATUS_S2B 123
+#define IB_MGMT_RMPP_STATUS_BAD_STATUS 124
+#define IB_MGMT_RMPP_STATUS_UNV 125
+#define IB_MGMT_RMPP_STATUS_TMR 126
+#define IB_MGMT_RMPP_STATUS_UNSPEC 127
+#define IB_MGMT_RMPP_STATUS_ABORT_MAX 127
+
+#define IB_QP0 0
+#define IB_QP1 cpu_to_be32(1)
+#define IB_QP1_QKEY 0x80010000
+#define IB_QP_SET_QKEY 0x80000000
+
+#define IB_DEFAULT_PKEY_PARTIAL 0x7FFF
+#define IB_DEFAULT_PKEY_FULL 0xFFFF
+
+enum {
+ IB_MGMT_MAD_HDR = 24,
+ IB_MGMT_MAD_DATA = 232,
+ IB_MGMT_RMPP_HDR = 36,
+ IB_MGMT_RMPP_DATA = 220,
+ IB_MGMT_VENDOR_HDR = 40,
+ IB_MGMT_VENDOR_DATA = 216,
+ IB_MGMT_SA_HDR = 56,
+ IB_MGMT_SA_DATA = 200,
+ IB_MGMT_DEVICE_HDR = 64,
+ IB_MGMT_DEVICE_DATA = 192,
+};
+
+struct ib_mad_hdr {
+ u8 base_version;
+ u8 mgmt_class;
+ u8 class_version;
+ u8 method;
+ __be16 status;
+ __be16 class_specific;
+ __be64 tid;
+ __be16 attr_id;
+ __be16 resv;
+ __be32 attr_mod;
+};
+
+struct ib_rmpp_hdr {
+ u8 rmpp_version;
+ u8 rmpp_type;
+ u8 rmpp_rtime_flags;
+ u8 rmpp_status;
+ __be32 seg_num;
+ __be32 paylen_newwin;
+};
+
+typedef u64 __bitwise ib_sa_comp_mask;
+
+#define IB_SA_COMP_MASK(n) ((__force ib_sa_comp_mask) cpu_to_be64(1ull << (n)))
+
+/*
+ * ib_sa_hdr and ib_sa_mad structures must be packed because they have
+ * 64-bit fields that are only 32-bit aligned. 64-bit architectures will
+ * lay them out wrong otherwise. (And unfortunately they are sent on
+ * the wire so we can't change the layout)
+ */
+struct ib_sa_hdr {
+ __be64 sm_key;
+ __be16 attr_offset;
+ __be16 reserved;
+ ib_sa_comp_mask comp_mask;
+} __attribute__ ((packed));
+
+struct ib_mad {
+ struct ib_mad_hdr mad_hdr;
+ u8 data[IB_MGMT_MAD_DATA];
+};
+
+struct ib_rmpp_mad {
+ struct ib_mad_hdr mad_hdr;
+ struct ib_rmpp_hdr rmpp_hdr;
+ u8 data[IB_MGMT_RMPP_DATA];
+};
+
+struct ib_sa_mad {
+ struct ib_mad_hdr mad_hdr;
+ struct ib_rmpp_hdr rmpp_hdr;
+ struct ib_sa_hdr sa_hdr;
+ u8 data[IB_MGMT_SA_DATA];
+} __attribute__ ((packed));
+
+struct ib_vendor_mad {
+ struct ib_mad_hdr mad_hdr;
+ struct ib_rmpp_hdr rmpp_hdr;
+ u8 reserved;
+ u8 oui[3];
+ u8 data[IB_MGMT_VENDOR_DATA];
+};
+
+struct ib_class_port_info {
+ u8 base_version;
+ u8 class_version;
+ __be16 capability_mask;
+ u8 reserved[3];
+ u8 resp_time_value;
+ u8 redirect_gid[16];
+ __be32 redirect_tcslfl;
+ __be16 redirect_lid;
+ __be16 redirect_pkey;
+ __be32 redirect_qp;
+ __be32 redirect_qkey;
+ u8 trap_gid[16];
+ __be32 trap_tcslfl;
+ __be16 trap_lid;
+ __be16 trap_pkey;
+ __be32 trap_hlqp;
+ __be32 trap_qkey;
+};
+
+/**
+ * ib_mad_send_buf - MAD data buffer and work request for sends.
+ * @next: A pointer used to chain together MADs for posting.
+ * @mad: References an allocated MAD data buffer for MADs that do not have
+ * RMPP active. For MADs using RMPP, references the common and management
+ * class specific headers.
+ * @mad_agent: MAD agent that allocated the buffer.
+ * @ah: The address handle to use when sending the MAD.
+ * @context: User-controlled context fields.
+ * @hdr_len: Indicates the size of the data header of the MAD. This length
+ * includes the common MAD, RMPP, and class specific headers.
+ * @data_len: Indicates the total size of user-transferred data.
+ * @seg_count: The number of RMPP segments allocated for this send.
+ * @seg_size: Size of each RMPP segment.
+ * @timeout_ms: Time to wait for a response.
+ * @retries: Number of times to retry a request for a response. For MADs
+ * using RMPP, this applies per window. On completion, returns the number
+ * of retries needed to complete the transfer.
+ *
+ * Users are responsible for initializing the MAD buffer itself, with the
+ * exception of any RMPP header. Additional segment buffer space allocated
+ * beyond data_len is padding.
+ */
+struct ib_mad_send_buf {
+ struct ib_mad_send_buf *next;
+ void *mad;
+ struct ib_mad_agent *mad_agent;
+ struct ib_ah *ah;
+ void *context[2];
+ int hdr_len;
+ int data_len;
+ int seg_count;
+ int seg_size;
+ int timeout_ms;
+ int retries;
+};
+
+/**
+ * ib_response_mad - Returns whether the specified MAD has been generated in
+ * response to a sent request or trap.
+ */
+int ib_response_mad(struct ib_mad *mad);
+
+/**
+ * ib_get_rmpp_resptime - Returns the RMPP response time.
+ * @rmpp_hdr: An RMPP header.
+ */
+static inline u8 ib_get_rmpp_resptime(struct ib_rmpp_hdr *rmpp_hdr)
+{
+ return rmpp_hdr->rmpp_rtime_flags >> 3;
+}
+
+/**
+ * ib_get_rmpp_flags - Returns the RMPP flags.
+ * @rmpp_hdr: An RMPP header.
+ */
+static inline u8 ib_get_rmpp_flags(struct ib_rmpp_hdr *rmpp_hdr)
+{
+ return rmpp_hdr->rmpp_rtime_flags & 0x7;
+}
+
+/**
+ * ib_set_rmpp_resptime - Sets the response time in an RMPP header.
+ * @rmpp_hdr: An RMPP header.
+ * @rtime: The response time to set.
+ */
+static inline void ib_set_rmpp_resptime(struct ib_rmpp_hdr *rmpp_hdr, u8 rtime)
+{
+ rmpp_hdr->rmpp_rtime_flags = ib_get_rmpp_flags(rmpp_hdr) | (rtime << 3);
+}
+
+/**
+ * ib_set_rmpp_flags - Sets the flags in an RMPP header.
+ * @rmpp_hdr: An RMPP header.
+ * @flags: The flags to set.
+ */
+static inline void ib_set_rmpp_flags(struct ib_rmpp_hdr *rmpp_hdr, u8 flags)
+{
+ rmpp_hdr->rmpp_rtime_flags = (rmpp_hdr->rmpp_rtime_flags & 0xF8) |
+ (flags & 0x7);
+}
+
+struct ib_mad_agent;
+struct ib_mad_send_wc;
+struct ib_mad_recv_wc;
+
+/**
+ * ib_mad_send_handler - callback handler for a sent MAD.
+ * @mad_agent: MAD agent that sent the MAD.
+ * @mad_send_wc: Send work completion information on the sent MAD.
+ */
+typedef void (*ib_mad_send_handler)(struct ib_mad_agent *mad_agent,
+ struct ib_mad_send_wc *mad_send_wc);
+
+/**
+ * ib_mad_snoop_handler - Callback handler for snooping sent MADs.
+ * @mad_agent: MAD agent that snooped the MAD.
+ * @send_buf: Send buffer information on the sent MAD.
+ * @mad_send_wc: Work completion information on the sent MAD. Valid
+ * only for snooping that occurs on a send completion.
+ *
+ * Clients snooping MADs should not modify data referenced by the @send_buf
+ * or @mad_send_wc.
+ */
+typedef void (*ib_mad_snoop_handler)(struct ib_mad_agent *mad_agent,
+ struct ib_mad_send_buf *send_buf,
+ struct ib_mad_send_wc *mad_send_wc);
+
+/**
+ * ib_mad_recv_handler - callback handler for a received MAD.
+ * @mad_agent: MAD agent requesting the received MAD.
+ * @mad_recv_wc: Received work completion information on the received MAD.
+ *
+ * MADs received in response to a send request operation will be handed to
+ * the user before the send operation completes. All data buffers given
+ * to registered agents through this routine are owned by the receiving
+ * client, except for snooping agents. Clients snooping MADs should not
+ * modify the data referenced by @mad_recv_wc.
+ */
+typedef void (*ib_mad_recv_handler)(struct ib_mad_agent *mad_agent,
+ struct ib_mad_recv_wc *mad_recv_wc);
+
+/**
+ * ib_mad_agent - Used to track MAD registration with the access layer.
+ * @device: Reference to the device the registration is on.
+ * @qp: Reference to QP used for sending and receiving MADs.
+ * @mr: Memory region for system memory usable for DMA.
+ * @recv_handler: Callback handler for a received MAD.
+ * @send_handler: Callback handler for a sent MAD.
+ * @snoop_handler: Callback handler for snooped sent MADs.
+ * @context: User-specified context associated with this registration.
+ * @hi_tid: Access layer assigned transaction ID for this client.
+ * Unsolicited MADs sent by this client will have the upper 32-bits
+ * of their TID set to this value.
+ * @port_num: Port number on which QP is registered
+ * @rmpp_version: If set, indicates the RMPP version used by this agent.
+ */
+struct ib_mad_agent {
+ struct ib_device *device;
+ struct ib_qp *qp;
+ struct ib_mr *mr;
+ ib_mad_recv_handler recv_handler;
+ ib_mad_send_handler send_handler;
+ ib_mad_snoop_handler snoop_handler;
+ void *context;
+ u32 hi_tid;
+ u8 port_num;
+ u8 rmpp_version;
+};
+
+/**
+ * ib_mad_send_wc - MAD send completion information.
+ * @send_buf: Send MAD data buffer associated with the send MAD request.
+ * @status: Completion status.
+ * @vendor_err: Optional vendor error information returned with a failed
+ * request.
+ */
+struct ib_mad_send_wc {
+ struct ib_mad_send_buf *send_buf;
+ enum ib_wc_status status;
+ u32 vendor_err;
+};
+
+/**
+ * ib_mad_recv_buf - received MAD buffer information.
+ * @list: Reference to next data buffer for a received RMPP MAD.
+ * @grh: References a data buffer containing the global route header.
+ * The data referenced by this buffer is only valid if the GRH is
+ * valid.
+ * @mad: References the start of the received MAD.
+ */
+struct ib_mad_recv_buf {
+ struct list_head list;
+ struct ib_grh *grh;
+ struct ib_mad *mad;
+};
+
+/**
+ * ib_mad_recv_wc - received MAD information.
+ * @wc: Completion information for the received data.
+ * @recv_buf: Specifies the location of the received data buffer(s).
+ * @rmpp_list: Specifies a list of RMPP reassembled received MAD buffers.
+ * @mad_len: The length of the received MAD, without duplicated headers.
+ *
+ * For a received response, the wr_id contains a pointer to the ib_mad_send_buf
+ * for the corresponding send request.
+ */
+struct ib_mad_recv_wc {
+ struct ib_wc *wc;
+ struct ib_mad_recv_buf recv_buf;
+ struct list_head rmpp_list;
+ int mad_len;
+};
+
+/**
+ * ib_mad_reg_req - MAD registration request
+ * @mgmt_class: Indicates which management class of MADs should be received
+ * by the caller. This field is only required if the user wishes to
+ * receive unsolicited MADs, otherwise it should be 0.
+ * @mgmt_class_version: Indicates which version of MADs for the given
+ * management class to receive.
+ * @oui: Indicates IEEE OUI when mgmt_class is a vendor class
+ * in the range from 0x30 to 0x4f. Otherwise not used.
+ * @method_mask: The caller will receive unsolicited MADs for any method
+ * where @method_mask = 1.
+ */
+struct ib_mad_reg_req {
+ u8 mgmt_class;
+ u8 mgmt_class_version;
+ u8 oui[3];
+ DECLARE_BITMAP(method_mask, IB_MGMT_MAX_METHODS);
+};
+
+/**
+ * ib_register_mad_agent - Register to send/receive MADs.
+ * @device: The device to register with.
+ * @port_num: The port on the specified device to use.
+ * @qp_type: Specifies which QP to access. Must be either
+ * IB_QPT_SMI or IB_QPT_GSI.
+ * @mad_reg_req: Specifies which unsolicited MADs should be received
+ * by the caller. This parameter may be NULL if the caller only
+ * wishes to receive solicited responses.
+ * @rmpp_version: If set, indicates that the client will send
+ * and receive MADs that contain the RMPP header for the given version.
+ * If set to 0, indicates that RMPP is not used by this client.
+ * @send_handler: The completion callback routine invoked after a send
+ * request has completed.
+ * @recv_handler: The completion callback routine invoked for a received
+ * MAD.
+ * @context: User specified context associated with the registration.
+ */
+struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device,
+ u8 port_num,
+ enum ib_qp_type qp_type,
+ struct ib_mad_reg_req *mad_reg_req,
+ u8 rmpp_version,
+ ib_mad_send_handler send_handler,
+ ib_mad_recv_handler recv_handler,
+ void *context);
+
+enum ib_mad_snoop_flags {
+ /*IB_MAD_SNOOP_POSTED_SENDS = 1,*/
+ /*IB_MAD_SNOOP_RMPP_SENDS = (1<<1),*/
+ IB_MAD_SNOOP_SEND_COMPLETIONS = (1<<2),
+ /*IB_MAD_SNOOP_RMPP_SEND_COMPLETIONS = (1<<3),*/
+ IB_MAD_SNOOP_RECVS = (1<<4)
+ /*IB_MAD_SNOOP_RMPP_RECVS = (1<<5),*/
+ /*IB_MAD_SNOOP_REDIRECTED_QPS = (1<<6)*/
+};
+
+/**
+ * ib_register_mad_snoop - Register to snoop sent and received MADs.
+ * @device: The device to register with.
+ * @port_num: The port on the specified device to use.
+ * @qp_type: Specifies which QP traffic to snoop. Must be either
+ * IB_QPT_SMI or IB_QPT_GSI.
+ * @mad_snoop_flags: Specifies where snooping occurs.
+ * @snoop_handler: The callback routine invoked for a snooped send.
+ * @recv_handler: The callback routine invoked for a snooped receive.
+ * @context: User specified context associated with the registration.
+ */
+struct ib_mad_agent *ib_register_mad_snoop(struct ib_device *device,
+ u8 port_num,
+ enum ib_qp_type qp_type,
+ int mad_snoop_flags,
+ ib_mad_snoop_handler snoop_handler,
+ ib_mad_recv_handler recv_handler,
+ void *context);
+
+/**
+ * ib_unregister_mad_agent - Unregisters a client from using MAD services.
+ * @mad_agent: Corresponding MAD registration request to deregister.
+ *
+ * After invoking this routine, MAD services are no longer usable by the
+ * client on the associated QP.
+ */
+int ib_unregister_mad_agent(struct ib_mad_agent *mad_agent);
+
+/**
+ * ib_post_send_mad - Posts MAD(s) to the send queue of the QP associated
+ * with the registered client.
+ * @send_buf: Specifies the information needed to send the MAD(s).
+ * @bad_send_buf: Specifies the MAD on which an error was encountered. This
+ * parameter is optional if only a single MAD is posted.
+ *
+ * Sent MADs are not guaranteed to complete in the order that they were posted.
+ *
+ * If the MAD requires RMPP, the data buffer should contain a single copy
+ * of the common MAD, RMPP, and class specific headers, followed by the class
+ * defined data. If the class defined data would not divide evenly into
+ * RMPP segments, then space must be allocated at the end of the referenced
+ * buffer for any required padding. To indicate the amount of class defined
+ * data being transferred, the paylen_newwin field in the RMPP header should
+ * be set to the size of the class specific header plus the amount of class
+ * defined data being transferred. The paylen_newwin field should be
+ * specified in network-byte order.
+ */
+int ib_post_send_mad(struct ib_mad_send_buf *send_buf,
+ struct ib_mad_send_buf **bad_send_buf);
+
+
+/**
+ * ib_free_recv_mad - Returns data buffers used to receive a MAD.
+ * @mad_recv_wc: Work completion information for a received MAD.
+ *
+ * Clients receiving MADs through their ib_mad_recv_handler must call this
+ * routine to return the work completion buffers to the access layer.
+ */
+void ib_free_recv_mad(struct ib_mad_recv_wc *mad_recv_wc);
+
+/**
+ * ib_cancel_mad - Cancels an outstanding send MAD operation.
+ * @mad_agent: Specifies the registration associated with sent MAD.
+ * @send_buf: Indicates the MAD to cancel.
+ *
+ * MADs will be returned to the user through the corresponding
+ * ib_mad_send_handler.
+ */
+void ib_cancel_mad(struct ib_mad_agent *mad_agent,
+ struct ib_mad_send_buf *send_buf);
+
+/**
+ * ib_modify_mad - Modifies an outstanding send MAD operation.
+ * @mad_agent: Specifies the registration associated with sent MAD.
+ * @send_buf: Indicates the MAD to modify.
+ * @timeout_ms: New timeout value for sent MAD.
+ *
+ * This call will reset the timeout value for a sent MAD to the specified
+ * value.
+ */
+int ib_modify_mad(struct ib_mad_agent *mad_agent,
+ struct ib_mad_send_buf *send_buf, u32 timeout_ms);
+
+/**
+ * ib_redirect_mad_qp - Registers a QP for MAD services.
+ * @qp: Reference to a QP that requires MAD services.
+ * @rmpp_version: If set, indicates that the client will send
+ * and receive MADs that contain the RMPP header for the given version.
+ * If set to 0, indicates that RMPP is not used by this client.
+ * @send_handler: The completion callback routine invoked after a send
+ * request has completed.
+ * @recv_handler: The completion callback routine invoked for a received
+ * MAD.
+ * @context: User specified context associated with the registration.
+ *
+ * Use of this call allows clients to use MAD services, such as RMPP,
+ * on user-owned QPs. After calling this routine, users may send
+ * MADs on the specified QP by calling ib_post_send_mad.
+ */
+struct ib_mad_agent *ib_redirect_mad_qp(struct ib_qp *qp,
+ u8 rmpp_version,
+ ib_mad_send_handler send_handler,
+ ib_mad_recv_handler recv_handler,
+ void *context);
+
+/**
+ * ib_process_mad_wc - Processes a work completion associated with a
+ * MAD sent or received on a redirected QP.
+ * @mad_agent: Specifies the registered MAD service using the redirected QP.
+ * @wc: References a work completion associated with a sent or received
+ * MAD segment.
+ *
+ * This routine is used to complete or continue processing on a MAD request.
+ * If the work completion is associated with a send operation, calling
+ * this routine is required to continue an RMPP transfer or to wait for a
+ * corresponding response, if it is a request. If the work completion is
+ * associated with a receive operation, calling this routine is required to
+ * process an inbound or outbound RMPP transfer, or to match a response MAD
+ * with its corresponding request.
+ */
+int ib_process_mad_wc(struct ib_mad_agent *mad_agent,
+ struct ib_wc *wc);
+
+/**
+ * ib_create_send_mad - Allocate and initialize a data buffer and work request
+ * for sending a MAD.
+ * @mad_agent: Specifies the registered MAD service to associate with the MAD.
+ * @remote_qpn: Specifies the QPN of the receiving node.
+ * @pkey_index: Specifies which PKey the MAD will be sent using. This field
+ * is valid only if the remote_qpn is QP 1.
+ * @rmpp_active: Indicates if the send will enable RMPP.
+ * @hdr_len: Indicates the size of the data header of the MAD. This length
+ * should include the common MAD header, RMPP header, plus any class
+ * specific header.
+ * @data_len: Indicates the size of any user-transferred data. The call will
+ * automatically adjust the allocated buffer size to account for any
+ * additional padding that may be necessary.
+ * @gfp_mask: GFP mask used for the memory allocation.
+ *
+ * This routine allocates a MAD for sending. The returned MAD send buffer
+ * will reference a data buffer usable for sending a MAD, along
+ * with an initialized work request structure. Users may modify the returned
+ * MAD data buffer before posting the send.
+ *
+ * The returned MAD header, class specific headers, and any padding will be
+ * cleared. Users are responsible for initializing the common MAD header,
+ * any class specific header, and MAD data area.
+ * If @rmpp_active is set, the RMPP header will be initialized for sending.
+ */
+struct ib_mad_send_buf *ib_create_send_mad(struct ib_mad_agent *mad_agent,
+ u32 remote_qpn, u16 pkey_index,
+ int rmpp_active,
+ int hdr_len, int data_len,
+ gfp_t gfp_mask);
+
+/**
+ * ib_is_mad_class_rmpp - returns whether given management class
+ * supports RMPP.
+ * @mgmt_class: management class
+ *
+ * This routine returns whether the management class supports RMPP.
+ */
+int ib_is_mad_class_rmpp(u8 mgmt_class);
+
+/**
+ * ib_get_mad_data_offset - returns the data offset for a given
+ * management class.
+ * @mgmt_class: management class
+ *
+ * This routine returns the data offset in the MAD for the management
+ * class requested.
+ */
+int ib_get_mad_data_offset(u8 mgmt_class);
+
+/**
+ * ib_get_rmpp_segment - returns the data buffer for a given RMPP segment.
+ * @send_buf: Previously allocated send data buffer.
+ * @seg_num: number of segment to return
+ *
+ * This routine returns a pointer to the data buffer of an RMPP MAD.
+ * Users must provide synchronization to @send_buf around this call.
+ */
+void *ib_get_rmpp_segment(struct ib_mad_send_buf *send_buf, int seg_num);
+
+/**
+ * ib_free_send_mad - Returns data buffers used to send a MAD.
+ * @send_buf: Previously allocated send data buffer.
+ */
+void ib_free_send_mad(struct ib_mad_send_buf *send_buf);
+
+#endif /* IB_MAD_H */
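Illustration only, outside the patch hunks: a minimal sketch of the bit packing done by the ib_get_rmpp_*/ib_set_rmpp_* helpers in ib_mad.h. The struct and every sketch_* name are hypothetical stand-ins, not part of the patch. The sketch assumes only what the helpers show: the upper five bits of rmpp_rtime_flags hold the response time and the lower three bits hold the flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the one byte of struct ib_rmpp_hdr used here. */
struct sketch_rmpp_hdr {
	uint8_t rmpp_rtime_flags; /* bits 7..3: response time, bits 2..0: flags */
};

static inline uint8_t sketch_get_resptime(const struct sketch_rmpp_hdr *h)
{
	return h->rmpp_rtime_flags >> 3;
}

static inline uint8_t sketch_get_flags(const struct sketch_rmpp_hdr *h)
{
	return h->rmpp_rtime_flags & 0x7;
}

/* Replace the response time, keeping the current flags. */
static inline void sketch_set_resptime(struct sketch_rmpp_hdr *h, uint8_t rtime)
{
	h->rmpp_rtime_flags = (uint8_t)(sketch_get_flags(h) | (rtime << 3));
}

/* Replace the flags, keeping the current response time. */
static inline void sketch_set_flags(struct sketch_rmpp_hdr *h, uint8_t flags)
{
	h->rmpp_rtime_flags = (uint8_t)((h->rmpp_rtime_flags & 0xF8) | (flags & 0x7));
}

/* Round-trip check: each setter leaves the other field untouched. */
void sketch_rmpp_selfcheck(void)
{
	struct sketch_rmpp_hdr h = { 0 };

	sketch_set_resptime(&h, 0x1F);
	sketch_set_flags(&h, 0x5);
	assert(h.rmpp_rtime_flags == 0xFD);
	assert(sketch_get_resptime(&h) == 0x1F);
	assert(sketch_get_flags(&h) == 0x5);

	sketch_set_resptime(&h, 2);
	assert(sketch_get_resptime(&h) == 2);
	assert(sketch_get_flags(&h) == 0x5);
}
```

Calling sketch_rmpp_selfcheck() from a test program's main() exercises both setters.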
diff --git a/drivers/net/mlnx_uio/include/ib_smi.h b/drivers/net/mlnx_uio/include/ib_smi.h
new file mode 100644
index 0000000..1b6c201
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/ib_smi.h
@@ -0,0 +1,128 @@
+/*
+ * Copyright (c) 2004 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2004 Infinicon Corporation. All rights reserved.
+ * Copyright (c) 2004 Intel Corporation. All rights reserved.
+ * Copyright (c) 2004 Topspin Corporation. All rights reserved.
+ * Copyright (c) 2004 Voltaire Corporation. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#if !defined(IB_SMI_H)
+#define IB_SMI_H
+
+#include "ib_mad.h"
+
+#define IB_SMP_DATA_SIZE 64
+#define IB_SMP_MAX_PATH_HOPS 64
+
+struct ib_smp {
+ u8 base_version;
+ u8 mgmt_class;
+ u8 class_version;
+ u8 method;
+ __be16 status;
+ u8 hop_ptr;
+ u8 hop_cnt;
+ __be64 tid;
+ __be16 attr_id;
+ __be16 resv;
+ __be32 attr_mod;
+ __be64 mkey;
+ __be16 dr_slid;
+ __be16 dr_dlid;
+ u8 reserved[28];
+ u8 data[IB_SMP_DATA_SIZE];
+ u8 initial_path[IB_SMP_MAX_PATH_HOPS];
+ u8 return_path[IB_SMP_MAX_PATH_HOPS];
+} __attribute__ ((packed));
+
+#define IB_SMP_DIRECTION cpu_to_be16(0x8000)
+
+/* Subnet management attributes */
+#define IB_SMP_ATTR_NOTICE cpu_to_be16(0x0002)
+#define IB_SMP_ATTR_NODE_DESC cpu_to_be16(0x0010)
+#define IB_SMP_ATTR_NODE_INFO cpu_to_be16(0x0011)
+#define IB_SMP_ATTR_SWITCH_INFO cpu_to_be16(0x0012)
+#define IB_SMP_ATTR_GUID_INFO cpu_to_be16(0x0014)
+#define IB_SMP_ATTR_PORT_INFO cpu_to_be16(0x0015)
+#define IB_SMP_ATTR_PKEY_TABLE cpu_to_be16(0x0016)
+#define IB_SMP_ATTR_SL_TO_VL_TABLE cpu_to_be16(0x0017)
+#define IB_SMP_ATTR_VL_ARB_TABLE cpu_to_be16(0x0018)
+#define IB_SMP_ATTR_LINEAR_FORWARD_TABLE cpu_to_be16(0x0019)
+#define IB_SMP_ATTR_RANDOM_FORWARD_TABLE cpu_to_be16(0x001A)
+#define IB_SMP_ATTR_MCAST_FORWARD_TABLE cpu_to_be16(0x001B)
+#define IB_SMP_ATTR_SM_INFO cpu_to_be16(0x0020)
+#define IB_SMP_ATTR_VENDOR_DIAG cpu_to_be16(0x0030)
+#define IB_SMP_ATTR_LED_INFO cpu_to_be16(0x0031)
+#define IB_SMP_ATTR_VENDOR_MASK cpu_to_be16(0xFF00)
+
+struct ib_port_info {
+ __be64 mkey;
+ __be64 gid_prefix;
+ __be16 lid;
+ __be16 sm_lid;
+ __be32 cap_mask;
+ __be16 diag_code;
+ __be16 mkey_lease_period;
+ u8 local_port_num;
+ u8 link_width_enabled;
+ u8 link_width_supported;
+ u8 link_width_active;
+ u8 linkspeed_portstate; /* 4 bits, 4 bits */
+ u8 portphysstate_linkdown; /* 4 bits, 4 bits */
+ u8 mkeyprot_resv_lmc; /* 2 bits, 3, 3 */
+ u8 linkspeedactive_enabled; /* 4 bits, 4 bits */
+ u8 neighbormtu_mastersmsl; /* 4 bits, 4 bits */
+ u8 vlcap_inittype; /* 4 bits, 4 bits */
+ u8 vl_high_limit;
+ u8 vl_arb_high_cap;
+ u8 vl_arb_low_cap;
+ u8 inittypereply_mtucap; /* 4 bits, 4 bits */
+ u8 vlstallcnt_hoqlife; /* 3 bits, 5 bits */
+ u8 operationalvl_pei_peo_fpi_fpo; /* 4 bits, 1, 1, 1, 1 */
+ __be16 mkey_violations;
+ __be16 pkey_violations;
+ __be16 qkey_violations;
+ u8 guid_cap;
+ u8 clientrereg_resv_subnetto; /* 1 bit, 2 bits, 5 */
+ u8 resv_resptimevalue; /* 3 bits, 5 bits */
+ u8 localphyerrors_overrunerrors; /* 4 bits, 4 bits */
+ __be16 max_credit_hint;
+ u8 resv;
+ u8 link_roundtrip_latency[3];
+};
+
+static inline u8
+ib_get_smp_direction(struct ib_smp *smp)
+{
+ return ((smp->status & IB_SMP_DIRECTION) == IB_SMP_DIRECTION);
+}
+
+#endif /* IB_SMI_H */
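Illustration only, outside the patch hunks: a minimal sketch of the test performed by ib_get_smp_direction() in ib_smi.h. The status field is stored big-endian, so the mask is converted to big-endian before the comparison. All sketch_* names are hypothetical; sketch_cpu_to_be16() is a portable stand-in for cpu_to_be16(), which this example does not have available.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Portable stand-in for cpu_to_be16(): build the big-endian byte layout. */
static inline uint16_t sketch_cpu_to_be16(uint16_t v)
{
	uint8_t b[2] = { (uint8_t)(v >> 8), (uint8_t)(v & 0xFF) };
	uint16_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* Same mask value as IB_SMP_DIRECTION in ib_smi.h. */
#define SKETCH_SMP_DIRECTION sketch_cpu_to_be16(0x8000)

/* Hypothetical stand-in holding only the field the helper reads. */
struct sketch_smp {
	uint16_t status; /* big-endian, as in struct ib_smp */
};

/* Returns 1 when the direction bit is set in the big-endian status. */
static inline int sketch_get_smp_direction(const struct sketch_smp *smp)
{
	return (smp->status & SKETCH_SMP_DIRECTION) == SKETCH_SMP_DIRECTION;
}

/* The top bit alone decides the result; the low 15 bits are ignored. */
void sketch_smp_selfcheck(void)
{
	struct sketch_smp smp;

	smp.status = sketch_cpu_to_be16(0x8001);
	assert(sketch_get_smp_direction(&smp) == 1);

	smp.status = sketch_cpu_to_be16(0x7FFF);
	assert(sketch_get_smp_direction(&smp) == 0);
}
```

Converting the constant rather than the field keeps the comparison correct on both little-endian and big-endian hosts without swapping the stored value.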
diff --git a/drivers/net/mlnx_uio/include/ib_verbs.h b/drivers/net/mlnx_uio/include/ib_verbs.h
new file mode 100644
index 0000000..59e9c60
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/ib_verbs.h
@@ -0,0 +1,806 @@
+/*
+ * Copyright (c) 2004 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2004 Infinicon Corporation. All rights reserved.
+ * Copyright (c) 2004 Intel Corporation. All rights reserved.
+ * Copyright (c) 2004 Topspin Corporation. All rights reserved.
+ * Copyright (c) 2004 Voltaire Corporation. All rights reserved.
+ * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007 Cisco Systems. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#if !defined(IB_VERBS_H)
+#define IB_VERBS_H
+
+extern struct workqueue_struct *ib_wq;
+
+union ib_gid {
+ u8 raw[16];
+ struct {
+ __be64 subnet_prefix;
+ __be64 interface_id;
+ } global;
+};
+
+enum rdma_node_type {
+ /* IB values map to NodeInfo:NodeType. */
+ RDMA_NODE_IB_CA = 1,
+ RDMA_NODE_IB_SWITCH,
+ RDMA_NODE_IB_ROUTER,
+ RDMA_NODE_RNIC,
+ RDMA_NODE_USNIC,
+ RDMA_NODE_USNIC_UDP,
+};
+
+enum rdma_transport_type {
+ RDMA_TRANSPORT_IB,
+ RDMA_TRANSPORT_IWARP,
+ RDMA_TRANSPORT_USNIC,
+ RDMA_TRANSPORT_USNIC_UDP
+};
+
+enum rdma_transport_type
+rdma_node_get_transport(enum rdma_node_type node_type) __attribute_const__;
+
+enum rdma_link_layer {
+ IB_LINK_LAYER_UNSPECIFIED,
+ IB_LINK_LAYER_INFINIBAND,
+ IB_LINK_LAYER_ETHERNET,
+};
+
+enum ib_device_cap_flags {
+ IB_DEVICE_RESIZE_MAX_WR = 1,
+ IB_DEVICE_BAD_PKEY_CNTR = (1<<1),
+ IB_DEVICE_BAD_QKEY_CNTR = (1<<2),
+ IB_DEVICE_RAW_MULTI = (1<<3),
+ IB_DEVICE_AUTO_PATH_MIG = (1<<4),
+ IB_DEVICE_CHANGE_PHY_PORT = (1<<5),
+ IB_DEVICE_UD_AV_PORT_ENFORCE = (1<<6),
+ IB_DEVICE_CURR_QP_STATE_MOD = (1<<7),
+ IB_DEVICE_SHUTDOWN_PORT = (1<<8),
+ IB_DEVICE_INIT_TYPE = (1<<9),
+ IB_DEVICE_PORT_ACTIVE_EVENT = (1<<10),
+ IB_DEVICE_SYS_IMAGE_GUID = (1<<11),
+ IB_DEVICE_RC_RNR_NAK_GEN = (1<<12),
+ IB_DEVICE_SRQ_RESIZE = (1<<13),
+ IB_DEVICE_N_NOTIFY_CQ = (1<<14),
+ IB_DEVICE_LOCAL_DMA_LKEY = (1<<15),
+ IB_DEVICE_RESERVED = (1<<16), /* old SEND_W_INV */
+ IB_DEVICE_MEM_WINDOW = (1<<17),
+ /*
+ * Devices should set IB_DEVICE_UD_IP_CSUM if they support
+ * insertion of UDP and TCP checksum on outgoing UD IPoIB
+ * messages and can verify the validity of checksum for
+ * incoming messages. Setting this flag implies that the
+ * IPoIB driver may set NETIF_F_IP_CSUM for datagram mode.
+ */
+ IB_DEVICE_UD_IP_CSUM = (1<<18),
+ IB_DEVICE_UD_TSO = (1<<19),
+ IB_DEVICE_XRC = (1<<20),
+ IB_DEVICE_MEM_MGT_EXTENSIONS = (1<<21),
+ IB_DEVICE_BLOCK_MULTICAST_LOOPBACK = (1<<22),
+ IB_DEVICE_MEM_WINDOW_TYPE_2A = (1<<23),
+ IB_DEVICE_MEM_WINDOW_TYPE_2B = (1<<24),
+ IB_DEVICE_MANAGED_FLOW_STEERING = (1<<29)
+};
+
+enum ib_atomic_cap {
+ IB_ATOMIC_NONE,
+ IB_ATOMIC_HCA,
+ IB_ATOMIC_GLOB
+};
+
+struct ib_device_attr {
+ u64 fw_ver;
+ __be64 sys_image_guid;
+ u64 max_mr_size;
+ u64 page_size_cap;
+ u32 vendor_id;
+ u32 vendor_part_id;
+ u32 hw_ver;
+ int max_qp;
+ int max_qp_wr;
+ int device_cap_flags;
+ int max_sge;
+ int max_sge_rd;
+ int max_cq;
+ int max_cqe;
+ int max_mr;
+ int max_pd;
+ int max_qp_rd_atom;
+ int max_ee_rd_atom;
+ int max_res_rd_atom;
+ int max_qp_init_rd_atom;
+ int max_ee_init_rd_atom;
+ enum ib_atomic_cap atomic_cap;
+ enum ib_atomic_cap masked_atomic_cap;
+ int max_ee;
+ int max_rdd;
+ int max_mw;
+ int max_raw_ipv6_qp;
+ int max_raw_ethy_qp;
+ int max_mcast_grp;
+ int max_mcast_qp_attach;
+ int max_total_mcast_qp_attach;
+ int max_ah;
+ int max_fmr;
+ int max_map_per_fmr;
+ int max_srq;
+ int max_srq_wr;
+ int max_srq_sge;
+ unsigned int max_fast_reg_page_list_len;
+ u16 max_pkeys;
+ u8 local_ca_ack_delay;
+};
+
+enum ib_port_state {
+ IB_PORT_NOP = 0,
+ IB_PORT_DOWN = 1,
+ IB_PORT_INIT = 2,
+ IB_PORT_ARMED = 3,
+ IB_PORT_ACTIVE = 4,
+ IB_PORT_ACTIVE_DEFER = 5
+};
+
+enum ib_port_cap_flags {
+ IB_PORT_SM = 1 << 1,
+ IB_PORT_NOTICE_SUP = 1 << 2,
+ IB_PORT_TRAP_SUP = 1 << 3,
+ IB_PORT_OPT_IPD_SUP = 1 << 4,
+ IB_PORT_AUTO_MIGR_SUP = 1 << 5,
+ IB_PORT_SL_MAP_SUP = 1 << 6,
+ IB_PORT_MKEY_NVRAM = 1 << 7,
+ IB_PORT_PKEY_NVRAM = 1 << 8,
+ IB_PORT_LED_INFO_SUP = 1 << 9,
+ IB_PORT_SM_DISABLED = 1 << 10,
+ IB_PORT_SYS_IMAGE_GUID_SUP = 1 << 11,
+ IB_PORT_PKEY_SW_EXT_PORT_TRAP_SUP = 1 << 12,
+ IB_PORT_EXTENDED_SPEEDS_SUP = 1 << 14,
+ IB_PORT_CM_SUP = 1 << 16,
+ IB_PORT_SNMP_TUNNEL_SUP = 1 << 17,
+ IB_PORT_REINIT_SUP = 1 << 18,
+ IB_PORT_DEVICE_MGMT_SUP = 1 << 19,
+ IB_PORT_VENDOR_CLASS_SUP = 1 << 20,
+ IB_PORT_DR_NOTICE_SUP = 1 << 21,
+ IB_PORT_CAP_MASK_NOTICE_SUP = 1 << 22,
+ IB_PORT_BOOT_MGMT_SUP = 1 << 23,
+ IB_PORT_LINK_LATENCY_SUP = 1 << 24,
+ IB_PORT_CLIENT_REG_SUP = 1 << 25,
+ IB_PORT_IP_BASED_GIDS = 1 << 26
+};
+
+enum ib_port_width {
+ IB_WIDTH_1X = 1,
+ IB_WIDTH_4X = 2,
+ IB_WIDTH_8X = 4,
+ IB_WIDTH_12X = 8
+};
+
+static inline int ib_width_enum_to_int(enum ib_port_width width)
+{
+ switch (width) {
+ case IB_WIDTH_1X: return 1;
+ case IB_WIDTH_4X: return 4;
+ case IB_WIDTH_8X: return 8;
+ case IB_WIDTH_12X: return 12;
+ default: return -1;
+ }
+}
+
+enum ib_port_speed {
+ IB_SPEED_SDR = 1,
+ IB_SPEED_DDR = 2,
+ IB_SPEED_QDR = 4,
+ IB_SPEED_FDR10 = 8,
+ IB_SPEED_FDR = 16,
+ IB_SPEED_EDR = 32
+};
+
+struct ib_protocol_stats {
+ /* TBD... */
+};
+
+struct iw_protocol_stats {
+ u64 ipInReceives;
+ u64 ipInHdrErrors;
+ u64 ipInTooBigErrors;
+ u64 ipInNoRoutes;
+ u64 ipInAddrErrors;
+ u64 ipInUnknownProtos;
+ u64 ipInTruncatedPkts;
+ u64 ipInDiscards;
+ u64 ipInDelivers;
+ u64 ipOutForwDatagrams;
+ u64 ipOutRequests;
+ u64 ipOutDiscards;
+ u64 ipOutNoRoutes;
+ u64 ipReasmTimeout;
+ u64 ipReasmReqds;
+ u64 ipReasmOKs;
+ u64 ipReasmFails;
+ u64 ipFragOKs;
+ u64 ipFragFails;
+ u64 ipFragCreates;
+ u64 ipInMcastPkts;
+ u64 ipOutMcastPkts;
+ u64 ipInBcastPkts;
+ u64 ipOutBcastPkts;
+
+ u64 tcpRtoAlgorithm;
+ u64 tcpRtoMin;
+ u64 tcpRtoMax;
+ u64 tcpMaxConn;
+ u64 tcpActiveOpens;
+ u64 tcpPassiveOpens;
+ u64 tcpAttemptFails;
+ u64 tcpEstabResets;
+ u64 tcpCurrEstab;
+ u64 tcpInSegs;
+ u64 tcpOutSegs;
+ u64 tcpRetransSegs;
+ u64 tcpInErrs;
+ u64 tcpOutRsts;
+};
+
+union rdma_protocol_stats {
+ struct ib_protocol_stats ib;
+ struct iw_protocol_stats iw;
+};
+
+struct ib_port_attr {
+ enum ib_port_state state;
+ int max_mtu;
+ int active_mtu;
+ int gid_tbl_len;
+ u32 port_cap_flags;
+ u32 max_msg_sz;
+ u32 bad_pkey_cntr;
+ u32 qkey_viol_cntr;
+ u16 pkey_tbl_len;
+ u16 lid;
+ u16 sm_lid;
+ u8 lmc;
+ u8 max_vl_num;
+ u8 sm_sl;
+ u8 subnet_timeout;
+ u8 init_type_reply;
+ u8 active_width;
+ u8 active_speed;
+ u8 phys_state;
+};
+
+enum ib_device_modify_flags {
+ IB_DEVICE_MODIFY_SYS_IMAGE_GUID = 1 << 0,
+ IB_DEVICE_MODIFY_NODE_DESC = 1 << 1
+};
+
+struct ib_device_modify {
+ u64 sys_image_guid;
+ char node_desc[64];
+};
+
+enum ib_port_modify_flags {
+ IB_PORT_SHUTDOWN = 1,
+ IB_PORT_INIT_TYPE = (1<<2),
+ IB_PORT_RESET_QKEY_CNTR = (1<<3)
+};
+
+struct ib_port_modify {
+ u32 set_port_cap_mask;
+ u32 clr_port_cap_mask;
+ u8 init_type;
+};
+
+enum ib_event_type {
+ IB_EVENT_CQ_ERR,
+ IB_EVENT_QP_FATAL,
+ IB_EVENT_QP_REQ_ERR,
+ IB_EVENT_QP_ACCESS_ERR,
+ IB_EVENT_COMM_EST,
+ IB_EVENT_SQ_DRAINED,
+ IB_EVENT_PATH_MIG,
+ IB_EVENT_PATH_MIG_ERR,
+ IB_EVENT_DEVICE_FATAL,
+ IB_EVENT_PORT_ACTIVE,
+ IB_EVENT_PORT_ERR,
+ IB_EVENT_LID_CHANGE,
+ IB_EVENT_PKEY_CHANGE,
+ IB_EVENT_SM_CHANGE,
+ IB_EVENT_SRQ_ERR,
+ IB_EVENT_SRQ_LIMIT_REACHED,
+ IB_EVENT_QP_LAST_WQE_REACHED,
+ IB_EVENT_CLIENT_REREGISTER,
+ IB_EVENT_GID_CHANGE,
+};
+
+struct ib_event {
+ struct ib_device *device;
+ union {
+ struct ib_cq *cq;
+ struct ib_qp *qp;
+ struct ib_srq *srq;
+ u8 port_num;
+ } element;
+ enum ib_event_type event;
+};
+
+struct ib_event_handler {
+ struct ib_device *device;
+ void (*handler)(struct ib_event_handler *, struct ib_event *);
+ struct list_head list;
+};
+
+#define INIT_IB_EVENT_HANDLER(_ptr, _device, _handler) \
+ do { \
+ (_ptr)->device = _device; \
+ (_ptr)->handler = _handler; \
+ INIT_LIST_HEAD(&(_ptr)->list); \
+ } while (0)
+
+struct ib_global_route {
+ union ib_gid dgid;
+ u32 flow_label;
+ u8 sgid_index;
+ u8 hop_limit;
+ u8 traffic_class;
+};
+
+struct ib_grh {
+ __be32 version_tclass_flow;
+ __be16 paylen;
+ u8 next_hdr;
+ u8 hop_limit;
+ union ib_gid sgid;
+ union ib_gid dgid;
+};
+
+enum {
+ IB_MULTICAST_QPN = 0xffffff
+};
+
+#define IB_LID_PERMISSIVE cpu_to_be16(0xFFFF)
+
+enum ib_ah_flags {
+ IB_AH_GRH = 1
+};
+
+enum ib_rate {
+ IB_RATE_PORT_CURRENT = 0,
+ IB_RATE_2_5_GBPS = 2,
+ IB_RATE_5_GBPS = 5,
+ IB_RATE_10_GBPS = 3,
+ IB_RATE_20_GBPS = 6,
+ IB_RATE_30_GBPS = 4,
+ IB_RATE_40_GBPS = 7,
+ IB_RATE_60_GBPS = 8,
+ IB_RATE_80_GBPS = 9,
+ IB_RATE_120_GBPS = 10,
+ IB_RATE_14_GBPS = 11,
+ IB_RATE_56_GBPS = 12,
+ IB_RATE_112_GBPS = 13,
+ IB_RATE_168_GBPS = 14,
+ IB_RATE_25_GBPS = 15,
+ IB_RATE_100_GBPS = 16,
+ IB_RATE_200_GBPS = 17,
+ IB_RATE_300_GBPS = 18
+};
+
+/**
+ * ib_rate_to_mult - Convert the IB rate enum to a multiple of the
+ * base rate of 2.5 Gbit/sec. For example, IB_RATE_5_GBPS will be
+ * converted to 2, since 5 Gbit/sec is 2 * 2.5 Gbit/sec.
+ * @rate: rate to convert.
+ */
+int ib_rate_to_mult(enum ib_rate rate) __attribute_const__;
+
+/**
+ * ib_rate_to_mbps - Convert the IB rate enum to Mbps.
+ * For example, IB_RATE_2_5_GBPS will be converted to 2500.
+ * @rate: rate to convert.
+ */
+int ib_rate_to_mbps(enum ib_rate rate) __attribute_const__;
+
+/**
+ * mult_to_ib_rate - Convert a multiple of 2.5 Gbit/sec to an IB rate
+ * enum.
+ * @mult: multiple to convert.
+ */
+enum ib_rate mult_to_ib_rate(int mult) __attribute_const__;
+
+struct ib_ah_attr {
+ struct ib_global_route grh;
+ u16 dlid;
+ u8 sl;
+ u8 src_path_bits;
+ u8 static_rate;
+ u8 ah_flags;
+ u8 port_num;
+ u8 dmac[ETH_ALEN];
+ u16 vlan_id;
+};
+
+enum ib_wc_status {
+ IB_WC_SUCCESS,
+ IB_WC_LOC_LEN_ERR,
+ IB_WC_LOC_QP_OP_ERR,
+ IB_WC_LOC_EEC_OP_ERR,
+ IB_WC_LOC_PROT_ERR,
+ IB_WC_WR_FLUSH_ERR,
+ IB_WC_MW_BIND_ERR,
+ IB_WC_BAD_RESP_ERR,
+ IB_WC_LOC_ACCESS_ERR,
+ IB_WC_REM_INV_REQ_ERR,
+ IB_WC_REM_ACCESS_ERR,
+ IB_WC_REM_OP_ERR,
+ IB_WC_RETRY_EXC_ERR,
+ IB_WC_RNR_RETRY_EXC_ERR,
+ IB_WC_LOC_RDD_VIOL_ERR,
+ IB_WC_REM_INV_RD_REQ_ERR,
+ IB_WC_REM_ABORT_ERR,
+ IB_WC_INV_EECN_ERR,
+ IB_WC_INV_EEC_STATE_ERR,
+ IB_WC_FATAL_ERR,
+ IB_WC_RESP_TIMEOUT_ERR,
+ IB_WC_GENERAL_ERR
+};
+
+enum ib_wc_opcode {
+ IB_WC_SEND,
+ IB_WC_RDMA_WRITE,
+ IB_WC_RDMA_READ,
+ IB_WC_COMP_SWAP,
+ IB_WC_FETCH_ADD,
+ IB_WC_BIND_MW,
+ IB_WC_LSO,
+ IB_WC_LOCAL_INV,
+ IB_WC_FAST_REG_MR,
+ IB_WC_MASKED_COMP_SWAP,
+ IB_WC_MASKED_FETCH_ADD,
+/*
+ * Set value of IB_WC_RECV so consumers can test if a completion is a
+ * receive by testing (opcode & IB_WC_RECV).
+ */
+ IB_WC_RECV = 1 << 7,
+ IB_WC_RECV_RDMA_WITH_IMM
+};
+
+enum ib_wc_flags {
+ IB_WC_GRH = 1,
+ IB_WC_WITH_IMM = (1<<1),
+ IB_WC_WITH_INVALIDATE = (1<<2),
+ IB_WC_IP_CSUM_OK = (1<<3),
+ IB_WC_WITH_SMAC = (1<<4),
+ IB_WC_WITH_VLAN = (1<<5),
+};
+
+struct ib_wc {
+ u64 wr_id;
+ enum ib_wc_status status;
+ enum ib_wc_opcode opcode;
+ u32 vendor_err;
+ u32 byte_len;
+ struct ib_qp *qp;
+ union {
+ __be32 imm_data;
+ u32 invalidate_rkey;
+ } ex;
+ u32 src_qp;
+ int wc_flags;
+ u16 pkey_index;
+ u16 slid;
+ u8 sl;
+ u8 dlid_path_bits;
+ u8 port_num; /* valid only for DR SMPs on switches */
+ u8 smac[ETH_ALEN];
+ u16 vlan_id;
+};
+
+enum ib_cq_notify_flags {
+ IB_CQ_SOLICITED = 1 << 0,
+ IB_CQ_NEXT_COMP = 1 << 1,
+ IB_CQ_SOLICITED_MASK = IB_CQ_SOLICITED | IB_CQ_NEXT_COMP,
+ IB_CQ_REPORT_MISSED_EVENTS = 1 << 2,
+};
+
+enum ib_srq_type {
+ IB_SRQT_BASIC,
+ IB_SRQT_XRC
+};
+
+enum ib_srq_attr_mask {
+ IB_SRQ_MAX_WR = 1 << 0,
+ IB_SRQ_LIMIT = 1 << 1,
+};
+
+struct ib_srq_attr {
+ u32 max_wr;
+ u32 max_sge;
+ u32 srq_limit;
+};
+
+struct ib_srq_init_attr {
+ void (*event_handler)(struct ib_event *, void *);
+ void *srq_context;
+ struct ib_srq_attr attr;
+ enum ib_srq_type srq_type;
+
+ union {
+ struct {
+ struct ib_xrcd *xrcd;
+ struct ib_cq *cq;
+ } xrc;
+ } ext;
+};
+
+struct ib_qp_cap {
+ u32 max_send_wr;
+ u32 max_recv_wr;
+ u32 max_send_sge;
+ u32 max_recv_sge;
+ u32 max_inline_data;
+};
+
+enum ib_sig_type {
+ IB_SIGNAL_ALL_WR,
+ IB_SIGNAL_REQ_WR
+};
+
+enum ib_qp_type {
+ /*
+ * IB_QPT_SMI and IB_QPT_GSI have to be the first two entries
+ * here (and in that order) since the MAD layer uses them as
+ * indices into a 2-entry table.
+ */
+ IB_QPT_SMI,
+ IB_QPT_GSI,
+
+ IB_QPT_RC,
+ IB_QPT_UC,
+ IB_QPT_UD,
+ IB_QPT_RAW_IPV6,
+ IB_QPT_RAW_ETHERTYPE,
+ IB_QPT_RAW_PACKET = 8,
+ IB_QPT_XRC_INI = 9,
+ IB_QPT_XRC_TGT,
+ IB_QPT_MAX,
+ /* Reserve a range for qp types internal to the low level driver.
+ * These qp types will not be visible at the IB core layer, so the
+ * IB_QPT_MAX usages should not be affected in the core layer
+ */
+ IB_QPT_RESERVED1 = 0x1000,
+ IB_QPT_RESERVED2,
+ IB_QPT_RESERVED3,
+ IB_QPT_RESERVED4,
+ IB_QPT_RESERVED5,
+ IB_QPT_RESERVED6,
+ IB_QPT_RESERVED7,
+ IB_QPT_RESERVED8,
+ IB_QPT_RESERVED9,
+ IB_QPT_RESERVED10,
+};
+
+enum ib_qp_create_flags {
+ IB_QP_CREATE_IPOIB_UD_LSO = 1 << 0,
+ IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK = 1 << 1,
+ IB_QP_CREATE_NETIF_QP = 1 << 5,
+ /* reserve bits 26-31 for low level drivers' internal use */
+ IB_QP_CREATE_RESERVED_START = 1 << 26,
+ IB_QP_CREATE_RESERVED_END = 1 << 31,
+};
+
+
+/*
+ * Note: users may not call ib_close_qp or ib_destroy_qp from the event_handler
+ * callback to destroy the passed in QP.
+ */
+
+struct ib_qp_init_attr {
+ void (*event_handler)(struct ib_event *, void *);
+ void *qp_context;
+ struct ib_cq *send_cq;
+ struct ib_cq *recv_cq;
+ struct ib_srq *srq;
+ struct ib_xrcd *xrcd; /* XRC TGT QPs only */
+ struct ib_qp_cap cap;
+ enum ib_sig_type sq_sig_type;
+ enum ib_qp_type qp_type;
+ enum ib_qp_create_flags create_flags;
+ u8 port_num; /* special QP types only */
+};
+
+struct ib_qp_open_attr {
+ void (*event_handler)(struct ib_event *, void *);
+ void *qp_context;
+ u32 qp_num;
+ enum ib_qp_type qp_type;
+};
+
+enum ib_rnr_timeout {
+ IB_RNR_TIMER_655_36 = 0,
+ IB_RNR_TIMER_000_01 = 1,
+ IB_RNR_TIMER_000_02 = 2,
+ IB_RNR_TIMER_000_03 = 3,
+ IB_RNR_TIMER_000_04 = 4,
+ IB_RNR_TIMER_000_06 = 5,
+ IB_RNR_TIMER_000_08 = 6,
+ IB_RNR_TIMER_000_12 = 7,
+ IB_RNR_TIMER_000_16 = 8,
+ IB_RNR_TIMER_000_24 = 9,
+ IB_RNR_TIMER_000_32 = 10,
+ IB_RNR_TIMER_000_48 = 11,
+ IB_RNR_TIMER_000_64 = 12,
+ IB_RNR_TIMER_000_96 = 13,
+ IB_RNR_TIMER_001_28 = 14,
+ IB_RNR_TIMER_001_92 = 15,
+ IB_RNR_TIMER_002_56 = 16,
+ IB_RNR_TIMER_003_84 = 17,
+ IB_RNR_TIMER_005_12 = 18,
+ IB_RNR_TIMER_007_68 = 19,
+ IB_RNR_TIMER_010_24 = 20,
+ IB_RNR_TIMER_015_36 = 21,
+ IB_RNR_TIMER_020_48 = 22,
+ IB_RNR_TIMER_030_72 = 23,
+ IB_RNR_TIMER_040_96 = 24,
+ IB_RNR_TIMER_061_44 = 25,
+ IB_RNR_TIMER_081_92 = 26,
+ IB_RNR_TIMER_122_88 = 27,
+ IB_RNR_TIMER_163_84 = 28,
+ IB_RNR_TIMER_245_76 = 29,
+ IB_RNR_TIMER_327_68 = 30,
+ IB_RNR_TIMER_491_52 = 31
+};
+
+enum ib_qp_attr_mask {
+ IB_QP_STATE = 1,
+ IB_QP_CUR_STATE = (1<<1),
+ IB_QP_EN_SQD_ASYNC_NOTIFY = (1<<2),
+ IB_QP_ACCESS_FLAGS = (1<<3),
+ IB_QP_PKEY_INDEX = (1<<4),
+ IB_QP_PORT = (1<<5),
+ IB_QP_QKEY = (1<<6),
+ IB_QP_AV = (1<<7),
+ IB_QP_PATH_MTU = (1<<8),
+ IB_QP_TIMEOUT = (1<<9),
+ IB_QP_RETRY_CNT = (1<<10),
+ IB_QP_RNR_RETRY = (1<<11),
+ IB_QP_RQ_PSN = (1<<12),
+ IB_QP_MAX_QP_RD_ATOMIC = (1<<13),
+ IB_QP_ALT_PATH = (1<<14),
+ IB_QP_MIN_RNR_TIMER = (1<<15),
+ IB_QP_SQ_PSN = (1<<16),
+ IB_QP_MAX_DEST_RD_ATOMIC = (1<<17),
+ IB_QP_PATH_MIG_STATE = (1<<18),
+ IB_QP_CAP = (1<<19),
+ IB_QP_DEST_QPN = (1<<20),
+ IB_QP_SMAC = (1<<21),
+ IB_QP_ALT_SMAC = (1<<22),
+ IB_QP_VID = (1<<23),
+ IB_QP_ALT_VID = (1<<24),
+};
+
+enum ib_qp_state {
+ IB_QPS_RESET,
+ IB_QPS_INIT,
+ IB_QPS_RTR,
+ IB_QPS_RTS,
+ IB_QPS_SQD,
+ IB_QPS_SQE,
+ IB_QPS_ERR
+};
+
+enum ib_mig_state {
+ IB_MIG_MIGRATED,
+ IB_MIG_REARM,
+ IB_MIG_ARMED
+};
+
+enum ib_mw_type {
+ IB_MW_TYPE_1 = 1,
+ IB_MW_TYPE_2 = 2
+};
+
+struct ib_qp_attr {
+ enum ib_qp_state qp_state;
+ enum ib_qp_state cur_qp_state;
+ int path_mtu;
+ enum ib_mig_state path_mig_state;
+ u32 qkey;
+ u32 rq_psn;
+ u32 sq_psn;
+ u32 dest_qp_num;
+ int qp_access_flags;
+ struct ib_qp_cap cap;
+ struct ib_ah_attr ah_attr;
+ struct ib_ah_attr alt_ah_attr;
+ u16 pkey_index;
+ u16 alt_pkey_index;
+ u8 en_sqd_async_notify;
+ u8 sq_draining;
+ u8 max_rd_atomic;
+ u8 max_dest_rd_atomic;
+ u8 min_rnr_timer;
+ u8 port_num;
+ u8 timeout;
+ u8 retry_cnt;
+ u8 rnr_retry;
+ u8 alt_port_num;
+ u8 alt_timeout;
+ u8 smac[ETH_ALEN];
+ u8 alt_smac[ETH_ALEN];
+ u16 vlan_id;
+ u16 alt_vlan_id;
+};
+
+enum ib_wr_opcode {
+ IB_WR_RDMA_WRITE,
+ IB_WR_RDMA_WRITE_WITH_IMM,
+ IB_WR_SEND,
+ IB_WR_SEND_WITH_IMM,
+ IB_WR_RDMA_READ,
+ IB_WR_ATOMIC_CMP_AND_SWP,
+ IB_WR_ATOMIC_FETCH_AND_ADD,
+ IB_WR_LSO,
+ IB_WR_SEND_WITH_INV,
+ IB_WR_RDMA_READ_WITH_INV,
+ IB_WR_LOCAL_INV,
+ IB_WR_FAST_REG_MR,
+ IB_WR_MASKED_ATOMIC_CMP_AND_SWP,
+ IB_WR_MASKED_ATOMIC_FETCH_AND_ADD,
+ IB_WR_BIND_MW,
+ /* reserve values for low level drivers' internal use.
+ * These values will not be used at all in the ib core layer.
+ */
+ IB_WR_RESERVED1 = 0xf0,
+ IB_WR_RESERVED2,
+ IB_WR_RESERVED3,
+ IB_WR_RESERVED4,
+ IB_WR_RESERVED5,
+ IB_WR_RESERVED6,
+ IB_WR_RESERVED7,
+ IB_WR_RESERVED8,
+ IB_WR_RESERVED9,
+ IB_WR_RESERVED10,
+};
+
+enum ib_send_flags {
+ IB_SEND_FENCE = 1,
+ IB_SEND_SIGNALED = (1<<1),
+ IB_SEND_SOLICITED = (1<<2),
+ IB_SEND_INLINE = (1<<3),
+ IB_SEND_IP_CSUM = (1<<4),
+
+ /* reserve bits 26-31 for low level drivers' internal use */
+ IB_SEND_RESERVED_START = (1 << 26),
+ IB_SEND_RESERVED_END = (1 << 31),
+};
+
+
+
+#endif /* IB_VERBS_H */
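Note on the ib_wc_opcode enum above: its comment says a consumer can detect a
receive completion by testing (opcode & IB_WC_RECV). A minimal sketch of that
test follows. The enum is a trimmed, hypothetical copy (sketch_wc_opcode and
the SK_* names are assumptions made for illustration) so the snippet compiles
on its own; it is not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed mirror of enum ib_wc_opcode (illustration only). */
enum sketch_wc_opcode {
	SK_WC_SEND,
	SK_WC_RDMA_WRITE,
	SK_WC_RDMA_READ,
	SK_WC_MASKED_FETCH_ADD = 10,
	/* Receive opcodes start at bit 7, as in the header above. */
	SK_WC_RECV = 1 << 7,
	SK_WC_RECV_RDMA_WITH_IMM
};

/* Returns true when the completion opcode belongs to the receive side. */
static bool sketch_wc_is_receive(int opcode)
{
	assert(opcode >= 0); /* opcodes are small non-negative enum values */
	return (opcode & SK_WC_RECV) != 0;
}
```

The test works because the send-side opcodes stay below 128, while every
receive opcode is numbered from 1 << 7 upward and so has bit 7 set.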
diff --git a/drivers/net/mlnx_uio/include/inline_functions.h b/drivers/net/mlnx_uio/include/inline_functions.h
new file mode 100644
index 0000000..dccc038
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/inline_functions.h
@@ -0,0 +1,307 @@
+/*
+ * inline_functions.h
+ *
+ * Created on: Jun 24, 2015
+ * Author: leeopop
+ */
+
+#ifndef DRIVERS_NET_MLNX_UIO_INCLUDE_INLINE_FUNCTIONS_H_
+#define DRIVERS_NET_MLNX_UIO_INCLUDE_INLINE_FUNCTIONS_H_
+
+#include <pthread.h>
+
+struct completion
+{
+ int done;
+ pthread_cond_t wait;
+ pthread_mutex_t lock;
+};
+
+#define INLINE_MACRO static inline __attribute__((always_inline))
+
+INLINE_MACRO bool time_after(uint64_t a, uint64_t b)
+{
+ return a > b;
+}
+
+INLINE_MACRO bool time_before(uint64_t a, uint64_t b)
+{
+ return a < b;
+}
+
+INLINE_MACRO bool time_after_eq(uint64_t a, uint64_t b)
+{
+ return a >= b;
+}
+
+INLINE_MACRO bool time_before_eq(uint64_t a, uint64_t b)
+{
+ return a <= b;
+}
+
+/* $OpenBSD: strlcpy.c,v 1.8 /06/17 21:56:24 millert Exp $ */
+
+/*
+ * Copyright (c) Todd C. Miller <Todd.Miller@courtesan.com>
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+/*
+ * Copy src to string dst of size siz. At most siz-1 characters
+ * will be copied. Always NUL terminates (unless siz == 0).
+ * Returns strlen(src); if retval >= siz, truncation occurred.
+ */
+INLINE_MACRO size_t
+strlcpy(char *dst, const char *src, size_t siz)
+{
+ register char *d = dst;
+ register const char *s = src;
+ register size_t n = siz;
+
+ /* Copy as many bytes as will fit */
+ if (n != 0 && --n != 0) {
+ do {
+ if ((*d++ = *s++) == 0)
+ break;
+ } while (--n != 0);
+ }
+
+ /* Not enough room in dst, add NUL and traverse rest of src */
+ if (n == 0) {
+ if (siz != 0)
+ *d = '\0'; /* NUL-terminate dst */
+ while (*s++)
+ ;
+ }
+
+ return(s - src - 1); /* count does not include NUL */
+}
+
+/**
+ * div_u64_rem - unsigned 64bit divide with 32bit divisor with remainder
+ *
+ * This is commonly provided by 32bit archs to provide an optimized 64bit
+ * divide.
+ */
+INLINE_MACRO u64 div_u64_rem(u64 dividend, u32 divisor, u32 *remainder)
+{
+ *remainder = dividend % divisor;
+ return dividend / divisor;
+}
+
+/**
+ * div_s64_rem - signed 64bit divide with 32bit divisor with remainder
+ */
+INLINE_MACRO s64 div_s64_rem(s64 dividend, s32 divisor, s32 *remainder)
+{
+ *remainder = dividend % divisor;
+ return dividend / divisor;
+}
+
+
+INLINE_MACRO u64 div_u64(u64 dividend, u32 divisor)
+{
+ u32 remainder;
+ return div_u64_rem(dividend, divisor, &remainder);
+}
+
+INLINE_MACRO s64 div_s64(s64 dividend, s32 divisor)
+{
+ s32 remainder;
+ return div_s64_rem(dividend, divisor, &remainder);
+}
+
+INLINE_MACRO u8 read8(const volatile void* addr)
+{
+ return *((const volatile u8*)addr);
+}
+
+INLINE_MACRO void write8(u8 val, volatile void* addr)
+{
+ (*((volatile u8*)addr)) = val;
+}
+
+INLINE_MACRO u16 read16(const volatile void* addr)
+{
+ return *((const volatile u16*)addr);
+}
+
+INLINE_MACRO void write16(u16 val, volatile void* addr)
+{
+ (*((volatile u16*)addr)) = val;
+}
+
+INLINE_MACRO u32 read32(const volatile void* addr)
+{
+ return *((const volatile u32*)addr);
+}
+
+INLINE_MACRO void write32(u32 val, volatile void* addr)
+{
+ (*((volatile u32*)addr)) = val;
+}
+
+INLINE_MACRO u64 read64(const volatile void* addr)
+{
+ return *((const volatile u64*)addr);
+}
+
+INLINE_MACRO void write64(u64 val, volatile void* addr)
+{
+ (*((volatile u64*)addr)) = val;
+}
+
+INLINE_MACRO void sema_init(semaphore_t* sema, int val)
+{
+ mutex_init(&sema->lock);
+ sema->count = val;
+}
+
+INLINE_MACRO void down(semaphore_t* sema)
+{
+ while(1)
+ {
+ mutex_lock(&sema->lock);
+ if(sema->count > 0)
+ {
+ sema->count--;
+ mutex_unlock(&sema->lock);
+ break;
+ }
+ mutex_unlock(&sema->lock);
+ cond_resched();
+ }
+}
+INLINE_MACRO void up(semaphore_t* sema)
+{
+ mutex_lock(&sema->lock);
+ sema->count++;
+ mutex_unlock(&sema->lock);
+}
+
+INLINE_MACRO void init_completion(struct completion *x)
+{
+ x->done = 0;
+ pthread_cond_init(&x->wait, NULL);
+ pthread_mutex_init(&x->lock, NULL);
+}
+
+INLINE_MACRO void reinit_completion(struct completion *x)
+{
+ pthread_mutex_lock(&x->lock);
+ x->done = 0;
+ pthread_mutex_unlock(&x->lock);
+}
+
+INLINE_MACRO void wait_for_completion(struct completion *x)
+{
+ pthread_mutex_lock(&x->lock);
+ while(x->done == 0)
+ pthread_cond_wait(&x->wait, &x->lock);
+ x->done--;
+ pthread_mutex_unlock(&x->lock);
+}
+INLINE_MACRO void wait_for_completion_io(struct completion *x)
+{
+ wait_for_completion(x);
+}
+INLINE_MACRO int wait_for_completion_interruptible(struct completion *x)
+{
+ wait_for_completion(x);
+ return 0;
+}
+INLINE_MACRO int wait_for_completion_killable(struct completion *x)
+{
+ wait_for_completion(x);
+ return 0;
+}
+INLINE_MACRO unsigned long wait_for_completion_timeout(struct completion *x,
+ unsigned long timeout)
+{
+ struct timespec time, endtime;
+ int timeover = 0;
+ unsigned long msec = jiffies_to_msec(timeout);
+ unsigned long nsec = msec * NSEC_PER_MSEC;
+ //gettimeofday(&time, NULL);
+ clock_gettime(CLOCK_REALTIME_COARSE, &time);
+ time.tv_nsec += nsec%NSEC_PER_SEC;
+ time.tv_sec += nsec/NSEC_PER_SEC;
+ time.tv_sec += time.tv_nsec/NSEC_PER_SEC;
+ time.tv_nsec = time.tv_nsec%NSEC_PER_SEC;
+ pthread_mutex_lock(&x->lock);
+ while(!x->done && timeover == 0)
+ timeover = pthread_cond_timedwait(&x->wait, &x->lock, &time);
+
+ if(timeover == 0)
+ {
+ clock_gettime(CLOCK_REALTIME_COARSE, &endtime);
+ //gettimeofday(&time, NULL);
+ unsigned long remaining = MAX(1, (time.tv_sec - endtime.tv_sec)*NSEC_PER_SEC + time.tv_nsec - endtime.tv_nsec);
+ x->done--;
+
+ pthread_mutex_unlock(&x->lock);
+
+ return remaining;
+ }
+ else
+ {
+ pthread_mutex_unlock(&x->lock);
+ return 0;
+ }
+}
+INLINE_MACRO unsigned long wait_for_completion_io_timeout(struct completion *x,
+ unsigned long timeout)
+{
+ return wait_for_completion_timeout(x,timeout);
+}
+INLINE_MACRO unsigned long wait_for_completion_interruptible_timeout(
+ struct completion *x, unsigned long timeout)
+{
+ return wait_for_completion_timeout(x,timeout);
+}
+INLINE_MACRO unsigned long wait_for_completion_killable_timeout(
+ struct completion *x, unsigned long timeout)
+{
+ return wait_for_completion_timeout(x,timeout);
+}
+INLINE_MACRO bool try_wait_for_completion(struct completion *x)
+{
+ unsigned long ret = wait_for_completion_timeout(x, 1);
+ return RTE_MIN(1UL, ret);
+}
+INLINE_MACRO bool completion_done(struct completion *x)
+{
+ bool ret;
+ pthread_mutex_lock(&x->lock);
+ ret = x->done;
+ pthread_mutex_unlock(&x->lock);
+ return ret;
+}
+
+INLINE_MACRO void complete(struct completion *x)
+{
+ pthread_mutex_lock(&x->lock);
+ x->done++;
+ pthread_cond_signal(&x->wait);
+ pthread_mutex_unlock(&x->lock);
+}
+INLINE_MACRO void complete_all(struct completion *x)
+{
+ pthread_mutex_lock(&x->lock);
+ x->done = INT_MAX;
+ pthread_cond_broadcast(&x->wait);
+ pthread_mutex_unlock(&x->lock);
+}
+
+#endif /* DRIVERS_NET_MLNX_UIO_INCLUDE_INLINE_FUNCTIONS_H_ */
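Note on wait_for_completion_timeout() above: it converts a relative timeout
into the absolute deadline that pthread_cond_timedwait() expects. The sketch
below isolates that step under a few assumptions: the function name
sketch_deadline_after_ms and the SKETCH_* constants are hypothetical, and the
timeout is taken in milliseconds. It splits off whole seconds before
multiplying, which keeps the nanosecond arithmetic in range on platforms where
unsigned long is 32 bits wide.

```c
#include <assert.h>
#include <time.h>

#define SKETCH_NSEC_PER_SEC  1000000000L
#define SKETCH_NSEC_PER_MSEC 1000000L

/*
 * Return the absolute time "now + msec milliseconds" with tv_nsec
 * normalized into [0, 1e9). Hypothetical helper, illustration only.
 */
static struct timespec sketch_deadline_after_ms(struct timespec now,
						unsigned long msec)
{
	struct timespec d = now;

	assert(now.tv_nsec >= 0 && now.tv_nsec < SKETCH_NSEC_PER_SEC);

	/* Whole seconds first, so the product below stays under 1e9. */
	d.tv_sec += (time_t)(msec / 1000UL);
	d.tv_nsec += (long)(msec % 1000UL) * SKETCH_NSEC_PER_MSEC;

	/* At most one carry is possible: both addends are below 1e9. */
	if (d.tv_nsec >= SKETCH_NSEC_PER_SEC) {
		d.tv_sec += 1;
		d.tv_nsec -= SKETCH_NSEC_PER_SEC;
	}
	return d;
}
```

A caller would read the current time with clock_gettime(), pass it in, and
hand the result to pthread_cond_timedwait(), which by default measures the
deadline against CLOCK_REALTIME.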
diff --git a/drivers/net/mlnx_uio/include/kcompat.h b/drivers/net/mlnx_uio/include/kcompat.h
new file mode 100644
index 0000000..270e29b
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/kcompat.h
@@ -0,0 +1,36 @@
+/*
+ * kcompat.h
+ *
+ * Created on: Jun 24, 2015
+ * Author: leeopop
+ */
+
+#ifndef DRIVERS_NET_MLNX_UIO_INCLUDE_KCOMPAT_H_
+#define DRIVERS_NET_MLNX_UIO_INCLUDE_KCOMPAT_H_
+
+
+void register_module_parameter(__module_param_t* param_t);
+
+
+void register_module_parameter_desc(__module_param_t* param_t, const char* desc);
+
+
+int module_paramter_count();
+
+
+enum module_param_type module_paramter_type(int index);
+
+
+void* module_paramter_ptr(int index);
+
+
+const char* module_paramter_name(int index);
+
+
+const char* module_paramter_desc(int index);
+
+
+int module_parameter_elt_count(int index);
+
+
+#endif /* DRIVERS_NET_MLNX_UIO_INCLUDE_KCOMPAT_H_ */
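Note on kcompat.h above: the header only declares the module-parameter
registry (register, count, and per-index accessors). The sketch below shows
one possible shape for such a registry using a fixed-size table. It is an
assumption made for illustration, not the patch's implementation: the list
member in __module_param_t suggests the patch links entries with struct
list_head instead, and every sketch_* name here is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_PARAMS 16

/* Hypothetical, reduced stand-in for __module_param_t. */
struct sketch_param {
	const char *name;
	void *ptr;
	const char *desc;
};

static struct sketch_param *sketch_params[SKETCH_MAX_PARAMS];
static int sketch_param_total;

/* Append a parameter; returns 0 on success, -1 when the table is full. */
static int sketch_register_param(struct sketch_param *p)
{
	assert(p != NULL && p->name != NULL);
	if (sketch_param_total >= SKETCH_MAX_PARAMS)
		return -1;
	sketch_params[sketch_param_total++] = p;
	return 0;
}

/* Number of parameters registered so far. */
static int sketch_param_count(void)
{
	return sketch_param_total;
}

/* Name of the parameter at index, or NULL when index is out of range. */
static const char *sketch_param_name(int index)
{
	if (index < 0 || index >= sketch_param_total)
		return NULL;
	return sketch_params[index]->name;
}
```

Index-based accessors like these let a caller walk all parameters with a plain
loop from 0 to the count, which matches the accessor style declared in the
header.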
diff --git a/drivers/net/mlnx_uio/include/kmod.h b/drivers/net/mlnx_uio/include/kmod.h
new file mode 100644
index 0000000..79984eb
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/kmod.h
@@ -0,0 +1,768 @@
+/*
+ * kmod.h
+ *
+ * Created on: Jun 24, 2015
+ * Author: leeopop
+ */
+
+#ifndef DRIVERS_NET_MLNX_UIO_INCLUDE_KMOD_H_
+#define DRIVERS_NET_MLNX_UIO_INCLUDE_KMOD_H_
+
+#define KMOD_MODIFIED
+#undef KMOD_DISABLED
+#undef KMOD_REMOVED
+
+#define CONFIG_MLX4_DEBUG
+
+#include "autoconf.h"
+#include <rte_common.h>
+#include <rte_atomic.h>
+#include <rte_memory.h>
+#include <rte_malloc.h>
+#include <rte_persistent.h>
+#include <rte_branch_prediction.h>
+#include <rte_byteorder.h>
+#include <rte_pci.h>
+#include <rte_dev.h>
+#include <rte_ethdev.h>
+#include <rte_log.h>
+#include <rte_eal.h>
+#include <rte_spinlock.h>
+#include <rte_cycles.h>
+#include <rte_errno.h>
+#include <rte_ring.h>
+#include <rte_mbuf.h>
+#include <errno.h>
+#define ERESTARTSYS 512
+#define ERESTARTNOINTR 513
+#define ERESTARTNOHAND 514 /* restart if no handler.. */
+#define ENOIOCTLCMD 515 /* No ioctl command */
+#define ERESTART_RESTARTBLOCK 516 /* restart by calling sys_restart_syscall */
+#define EPROBE_DEFER 517 /* Driver requests probe retry */
+#define EOPENSTALE 518 /* open found a stale dentry */
+
+/* Defined for the NFSv3 protocol */
+#define EBADHANDLE 521 /* Illegal NFS file handle */
+#define ENOTSYNC 522 /* Update synchronization mismatch */
+#define EBADCOOKIE 523 /* Cookie is stale */
+#define ENOTSUPP 524 /* Operation is not supported */
+#define ETOOSMALL 525 /* Buffer or request is too small */
+#define ESERVERFAULT 526 /* An untranslatable error occurred */
+#define EBADTYPE 527 /* Type not supported by server */
+#define EJUKEBOX 528 /* Request initiated, but will not complete before timeout */
+#define EIOCBQUEUED 529 /* iocb queued, will get completion event */
+
+#include <stddef.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <asm/bitsperlong.h>
+#include <string.h>
+#include <strings.h>
+#include <memory.h>
+#include <assert.h>
+
+//linux include
+#include <linux/if_link.h>
+#include <sys/sysinfo.h>
+#include <sys/pci.h>
+
+#define __USE_MISC
+#include <netinet/in.h>
+
+#define BITS_PER_LONG __BITS_PER_LONG
+
+typedef uint8_t u8;
+typedef uint16_t u16;
+typedef uint32_t u32;
+typedef uint64_t u64;
+typedef int8_t s8;
+typedef int16_t s16;
+typedef int32_t s32;
+typedef int64_t s64;
+typedef uint64_t dma_addr_t;
+#if 0
+
+typedef u8 __u8;
+typedef u16 __u16;
+typedef u32 __u32;
+typedef u64 __u64;
+typedef s8 __s8;
+typedef s16 __s16;
+typedef s32 __s32;
+typedef s64 __s64;
+
+
+typedef uint64_t __be64;
+typedef uint64_t __le64;
+typedef uint32_t __be32;
+typedef uint32_t __le32;
+typedef uint16_t __be16;
+typedef uint16_t __le16;
+#endif
+
+#define cpu_to_le16(X) rte_cpu_to_le_16(X)
+#define cpu_to_le32(X) rte_cpu_to_le_32(X)
+#define cpu_to_le64(X) rte_cpu_to_le_64(X)
+#define cpu_to_be16(X) rte_cpu_to_be_16(X)
+#define cpu_to_be32(X) rte_cpu_to_be_32(X)
+#define cpu_to_be64(X) rte_cpu_to_be_64(X)
+
+#define be16_to_cpu(X) rte_be_to_cpu_16(X)
+#define be32_to_cpu(X) rte_be_to_cpu_32(X)
+#define be64_to_cpu(X) rte_be_to_cpu_64(X)
+#define le16_to_cpu(X) rte_le_to_cpu_16(X)
+#define le32_to_cpu(X) rte_le_to_cpu_32(X)
+#define le64_to_cpu(X) rte_le_to_cpu_64(X)
+
+#define be64_to_cpup(X) be64_to_cpu(*((__be64*)(X)))
+#define be32_to_cpup(X) be32_to_cpu(*((__be32*)(X)))
+#define be16_to_cpup(X) be16_to_cpu(*((__be16*)(X)))
+
+#define le64_to_cpup(X) le64_to_cpu(*((__le64*)(X)))
+#define le32_to_cpup(X) le32_to_cpu(*((__le32*)(X)))
+#define le16_to_cpup(X) le16_to_cpu(*((__le16*)(X)))
+
+#if 0
+#define be16_to_cpup(X) __be16_to_cpup(X)
+#define be32_to_cpup(X) __be32_to_cpup(X)
+#define be64_to_cpup(X) __be64_to_cpup(X)
+#define be16_to_cpu(X) __be16_to_cpu(X)
+#define be32_to_cpu(X) __be32_to_cpu(X)
+#define be64_to_cpu(X) __be64_to_cpu(X)
+#define le16_to_cpu(X) __le16_to_cpu(X)
+#define le32_to_cpu(X) __le32_to_cpu(X)
+#define le64_to_cpu(X) __le64_to_cpu(X)
+#define le16_to_cpus(X) __le16_to_cpus(X)
+#define le32_to_cpus(X) __le32_to_cpus(X)
+#define cpu_to_be16(X) __cpu_to_be16(X)
+#define cpu_to_be32(X) __cpu_to_be32(X)
+#define cpu_to_be64(X) __cpu_to_be64(X)
+#define cpu_to_le16(X) __cpu_to_le16(X)
+#define cpu_to_le32(X) __cpu_to_le32(X)
+#define cpu_to_le64(X) __cpu_to_le64(X)
+#define cpu_to_le16s(X) __cpu_to_le16s(X)
+#endif
+
+#define swab16(X) rte_bswap16(X)
+#define swab32(X) rte_bswap32(X)
+#define swab64(X) rte_bswap64(X)
+
+typedef unsigned gfp_t;
+typedef unsigned fmode_t;
+typedef unsigned oom_flags_t;
+
+static inline int WARN_ON_ONCE(int val) { if(val != 0) printf("WARN_ON_ONCE\n"); return val; }
+static inline int WARN_ON(int val) { if(val != 0) printf("WARN_ON\n"); return val; }
+//RCU
+#define rcu_assign_pointer(a,b) ((a) = (b))
+#define rcu_dereference(X) ((X))
+#define rcu_dereference_protected(X, LOCK) ((X))
+#define rcu_read_lock()
+#define rcu_read_unlock()
+#define __bitwise__
+#define __must_check
+#define __user
+#define __kernel
+#define __safe
+#define __force
+#define __nocast
+#define __iomem
+#define __chk_user_ptr(x) (void)0
+#define __chk_io_ptr(x) (void)0
+#define __builtin_warning(x, y...) (1)
+#define __must_hold(x)
+#define __acquires(x)
+#define __releases(x)
+#define __acquire(x) (void)0
+#define __release(x) (void)0
+#define __cond_lock(x,c) (c)
+#define __percpu
+#define __rcu
+#define __read_mostly
+#define __devinitdata
+#define __devinit
+#define __devexit_p(X) (X)
+#define ____cacheline_aligned_in_smp __rte_cache_aligned
+#define EXPORT_SYMBOL(X)
+#define EXPORT_SYMBOL_GPL(X)
+#define MODULE_AUTHOR(X)
+#define MODULE_DESCRIPTION(X)
+#define MODULE_LICENSE(X)
+#define MODULE_VERSION(X)
+
+#define __pure __attribute__((pure))
+#define __aligned(x) __attribute__((aligned(x)))
+#define __printf(a, b) __attribute__((format(printf, a, b)))
+#define __scanf(a, b) __attribute__((format(scanf, a, b)))
+#define noinline __attribute__((noinline))
+//#define __attribute_const__ __attribute__((__const__))
+#define __maybe_unused __attribute__((unused))
+#define __always_unused __attribute__((unused))
+
+#define MAX_MSIX_NUMBER 1024
+#define MAX_IRQ_NUMBER 16
+#define MAX_IRQ_DESC 256
+
+
+#define ETH_ALEN (6)
+#define PAGE_SIZE (4096)
+#define PAGE_SHIFT (12)
+#define PAGE_MASK (~(PAGE_SIZE-1))
+#define MSEC_PER_SEC 1000L
+#define USEC_PER_MSEC 1000L
+#define NSEC_PER_USEC 1000L
+#define NSEC_PER_MSEC 1000000L
+#define USEC_PER_SEC 1000000L
+#define NSEC_PER_SEC 1000000000L
+#define FSEC_PER_SEC 1000000000000000LL
+#define VLAN_N_VID 4096
+
+//#define ETH_ALEN 6 /* Octets in one ethernet addr */
+#define ETH_HLEN 14 /* Total octets in header. */
+#define ETH_ZLEN 60 /* Min. octets in frame sans FCS */
+#define ETH_DATA_LEN 1500 /* Max. octets in payload */
+#define ETH_FRAME_LEN 1514 /* Max. octets in frame sans FCS */
+#define ETH_FCS_LEN 4 /* Octets in the FCS */
+#define VLAN_HLEN 4
+#define VLAN_ETH_HLEN 18
+
+#define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1)
+#define __ALIGN_KERNEL_MASK(x, mask) (((x) + (mask)) & ~(mask))
+#define __ALIGN_MASK(x, mask) __ALIGN_KERNEL_MASK((x), (mask))
+#define ALIGN(x, a) __ALIGN_KERNEL((x), (a))
+#define PAGE_ALIGN(addr) ALIGN(addr, PAGE_SIZE)
+#define PTR_ALIGN(p, a) ((typeof(p))ALIGN((unsigned long)(p), (a)))
+#define IS_ALIGNED(x, a) (((x) & ((typeof(x))(a) - 1)) == 0)
+#define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f))
+#define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))
+#define roundup(x, y) ((((x) + ((y) - 1)) / (y)) * (y))
+#define DIV_ROUND_CLOSEST(x, divisor)( \
+ { \
+ typeof(divisor) __divisor = divisor; \
+ (((x) + ((__divisor) / 2)) / (__divisor)); \
+ } \
+)
+#define mdelay(X) rte_delay_ms(X)
+#define msleep_interruptible(X) mdelay(X) // TODO: double checked
+#define udelay(X) rte_delay_us(X)
+#define msleep(X) rte_delay_ms(X)
+
+#define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))
+
+/* Plain integer GFP bitmasks. Do not use this directly. */
+#define ___GFP_DMA 0x01u
+#define ___GFP_HIGHMEM 0x02u
+#define ___GFP_DMA32 0x04u
+#define ___GFP_MOVABLE 0x08u
+#define ___GFP_WAIT 0x10u
+#define ___GFP_HIGH 0x20u
+#define ___GFP_IO 0x40u
+#define ___GFP_FS 0x80u
+#define ___GFP_COLD 0x100u
+#define ___GFP_NOWARN 0x200u
+#define ___GFP_REPEAT 0x400u
+#define ___GFP_NOFAIL 0x800u
+#define ___GFP_NORETRY 0x1000u
+#define ___GFP_MEMALLOC 0x2000u
+#define ___GFP_COMP 0x4000u
+#define ___GFP_ZERO 0x8000u
+#define ___GFP_NOMEMALLOC 0x10000u
+#define ___GFP_HARDWALL 0x20000u
+#define ___GFP_THISNODE 0x40000u
+#define ___GFP_RECLAIMABLE 0x80000u
+#define ___GFP_NOTRACK 0x200000u
+#define ___GFP_NO_KSWAPD 0x400000u
+#define ___GFP_OTHER_NODE 0x800000u
+#define ___GFP_WRITE 0x1000000u
+/* If the above are modified, __GFP_BITS_SHIFT may need updating */
+
+#ifndef PCI_VENDOR_ID_MELLANOX
+#define PCI_VENDOR_ID_MELLANOX 0x15b3
+#endif
+
+/*
+ * GFP bitmasks..
+ *
+ * Zone modifiers (see linux/mmzone.h - low three bits)
+ *
+ * Do not put any conditional on these. If necessary modify the definitions
+ * without the underscores and use them consistently. The definitions here may
+ * be used in bit comparisons.
+ */
+#define __GFP_DMA ((__force gfp_t)___GFP_DMA)
+#define __GFP_HIGHMEM ((__force gfp_t)___GFP_HIGHMEM)
+#define __GFP_DMA32 ((__force gfp_t)___GFP_DMA32)
+#define __GFP_MOVABLE ((__force gfp_t)___GFP_MOVABLE) /* Page is movable */
+#define GFP_ZONEMASK (__GFP_DMA|__GFP_HIGHMEM|__GFP_DMA32|__GFP_MOVABLE)
+/*
+ * Action modifiers - doesn't change the zoning
+ *
+ * __GFP_REPEAT: Try hard to allocate the memory, but the allocation attempt
+ * _might_ fail. This depends upon the particular VM implementation.
+ *
+ * __GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
+ * cannot handle allocation failures. This modifier is deprecated and no new
+ * users should be added.
+ *
+ * __GFP_NORETRY: The VM implementation must not retry indefinitely.
+ *
+ * __GFP_MOVABLE: Flag that this page will be movable by the page migration
+ * mechanism or reclaimed
+ */
+#define __GFP_WAIT ((__force gfp_t)___GFP_WAIT) /* Can wait and reschedule? */
+#define __GFP_HIGH ((__force gfp_t)___GFP_HIGH) /* Should access emergency pools? */
+#define __GFP_IO ((__force gfp_t)___GFP_IO) /* Can start physical IO? */
+#define __GFP_FS ((__force gfp_t)___GFP_FS) /* Can call down to low-level FS? */
+#define __GFP_COLD ((__force gfp_t)___GFP_COLD) /* Cache-cold page required */
+#define __GFP_NOWARN ((__force gfp_t)___GFP_NOWARN) /* Suppress page allocation failure warning */
+#define __GFP_REPEAT ((__force gfp_t)___GFP_REPEAT) /* See above */
+#define __GFP_NOFAIL ((__force gfp_t)___GFP_NOFAIL) /* See above */
+#define __GFP_NORETRY ((__force gfp_t)___GFP_NORETRY) /* See above */
+#define __GFP_MEMALLOC ((__force gfp_t)___GFP_MEMALLOC)/* Allow access to emergency reserves */
+#define __GFP_COMP ((__force gfp_t)___GFP_COMP) /* Add compound page metadata */
+#define __GFP_ZERO ((__force gfp_t)___GFP_ZERO) /* Return zeroed page on success */
+#define __GFP_NOMEMALLOC ((__force gfp_t)___GFP_NOMEMALLOC) /* Don't use emergency reserves.
+ * This takes precedence over the
+ * __GFP_MEMALLOC flag if both are
+ * set
+ */
+#define __GFP_HARDWALL ((__force gfp_t)___GFP_HARDWALL) /* Enforce hardwall cpuset memory allocs */
+#define __GFP_THISNODE ((__force gfp_t)___GFP_THISNODE)/* No fallback, no policies */
+#define __GFP_RECLAIMABLE ((__force gfp_t)___GFP_RECLAIMABLE) /* Page is reclaimable */
+#define __GFP_NOTRACK ((__force gfp_t)___GFP_NOTRACK) /* Don't track with kmemcheck */
+
+#define __GFP_NO_KSWAPD ((__force gfp_t)___GFP_NO_KSWAPD)
+#define __GFP_OTHER_NODE ((__force gfp_t)___GFP_OTHER_NODE) /* On behalf of other node */
+#define __GFP_WRITE ((__force gfp_t)___GFP_WRITE) /* Allocator intends to dirty page */
+
+/*
+ * This may seem redundant, but it's a way of annotating false positives vs.
+ * allocations that simply cannot be supported (e.g. page tables).
+ */
+#define __GFP_NOTRACK_FALSE_POSITIVE (__GFP_NOTRACK)
+
+#define __GFP_BITS_SHIFT 25 /* Room for N __GFP_FOO bits */
+#define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
+
+/* This equals 0, but use constants in case they ever change */
+#define GFP_NOWAIT (GFP_ATOMIC & ~__GFP_HIGH)
+/* GFP_ATOMIC means both !wait (__GFP_WAIT not set) and use emergency pool */
+#define GFP_ATOMIC (__GFP_HIGH)
+#define GFP_NOIO (__GFP_WAIT)
+#define GFP_NOFS (__GFP_WAIT | __GFP_IO)
+#define GFP_KERNEL (__GFP_WAIT | __GFP_IO | __GFP_FS)
+#define GFP_TEMPORARY (__GFP_WAIT | __GFP_IO | __GFP_FS | \
+ __GFP_RECLAIMABLE)
+#define GFP_USER (__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HARDWALL)
+#define GFP_HIGHUSER (__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HARDWALL | \
+ __GFP_HIGHMEM)
+#define GFP_HIGHUSER_MOVABLE (__GFP_WAIT | __GFP_IO | __GFP_FS | \
+ __GFP_HARDWALL | __GFP_HIGHMEM | \
+ __GFP_MOVABLE)
+#define GFP_IOFS (__GFP_IO | __GFP_FS)
+#define GFP_TRANSHUGE (GFP_HIGHUSER_MOVABLE | __GFP_COMP | \
+ __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN | \
+ __GFP_NO_KSWAPD)
+
+#include "list.h"
+
+
+enum module_param_type
+{
+ param_type_uint,
+ param_type_int,
+ param_type_string,
+ param_type_ushort,
+ param_type_none,
+};
+
+typedef struct
+{
+ struct list_head list;
+ const char* name;
+ void* ptr;
+ int ptr_size;
+ int elt_size;
+ enum module_param_type param_type;
+ int permission;
+ const char* description;
+}__module_param_t;
+
+
+#define __init
+#define __exit
+
+#define __MODULE_STRING_1(x) #x
+#define __MODULE_STRING(x) __MODULE_STRING_1(x)
+
+#define ARRAY_SIZE(arr) (sizeof(arr)/sizeof(arr[0]))
+
+//#define MODULE_PARM(A) MODULE_PARM_DESC(A, "")
+#define module_param(name, type, perm) \
+ module_param_named(name, name, type, perm)
+
+
+#define module_param_string(_name, _variable, _size, _permission) \
+ static __module_param_t __module_param_ ## _name \
+ = { .name = #_name, .param_type = param_type_string, .ptr = &_variable, .permission = _permission, .ptr_size = _size, .elt_size = 1}; \
+ void __module_param_init_func__ ## _name(void); \
+ void __attribute__((constructor, used)) __module_param_init_func__ ## _name(void) \
+ { \
+ register_module_parameter(&__module_param_ ## _name); \
+ }
+
+#define module_param_named(_name, _variable, _type, _permission) \
+ static __module_param_t __module_param_ ## _name \
+ = { .name = #_name, .param_type = param_type_ ## _type, .ptr = &_variable, .permission = _permission, .ptr_size = sizeof(_type), .elt_size = sizeof(_type)}; \
+ void __module_param_init_func__ ## _name(void); \
+ void __attribute__((constructor, used)) __module_param_init_func__ ## _name(void) \
+ { \
+ register_module_parameter(&__module_param_ ## _name); \
+ }
+
+#define MODULE_PARM(A,B) MODULE_PARM_DESC(A,B)
+
+#define MODULE_PARM_DESC(_name, _description_string) \
+ void __module_param_desc_func__ ## _name(void); \
+ void __attribute__((constructor, used)) __module_param_desc_func__ ## _name(void) \
+ { \
+ register_module_parameter_desc(&__module_param_ ## _name, _description_string); \
+ }
+
+#define module_param_array(_variable, _type, nump, _permission) module_param_array_named(_variable, _variable, _type, nump, _permission)
+
+#define module_param_array_named(_name, _variable, _type, nump, _permission) \
+ static __module_param_t __module_param_ ## _name \
+ = { .name = #_name, .param_type = param_type_ ## _type, .ptr = &_variable, .permission = _permission, .ptr_size = sizeof(_variable), .elt_size = sizeof(_variable[0])}; \
+ void __module_param_init_func__ ## _name(void); \
+ void __attribute__((constructor, used)) __module_param_init_func__ ## _name(void) \
+ { \
+ *(nump) = ARRAY_SIZE(_variable); \
+ register_module_parameter(&__module_param_ ## _name); \
+ }
+
+
+
+//end module def
+
+typedef rte_spinlock_t mutex_t;
+typedef rte_spinlock_t spinlock_t;
+typedef rte_atomic32_t atomic_t;
+typedef rte_spinlock_t rwlock_t;
+
+#define ATOMIC_INIT(X) RTE_ATOMIC32_INIT(X)
+
+#define DEFINE_MUTEX(_mutex) \
+ rte_spinlock_t _mutex = {.locked=0}
+
+#define mmiowb() rte_wmb()
+#define wmb() rte_wmb()
+#define mb() rte_mb()
+#define rmb() rte_rmb()
+#define read_barrier_depends() rte_rmb()
+#define smp_mb() rte_mb()
+#define smp_wmb() rte_wmb()
+#define smp_rmb() rte_rmb()
+#define smp_mb__before_atomic() barrier()
+#define smp_mb__before_clear_bit() barrier()
+#define synchronize_rcu() rte_mb()
+#define synchronize_irq(X) rte_mb()
+#define synchronize_rcu_expedited() rte_mb()
+//#define timecounter_cyc2time(X, Y) (Y)
+#define timecounter_cyc2time(clock, timestamp) (timestamp)
+#define timecounter_read(clock)
+#define timecounter_init(a,b,c)
+#define time_is_before_jiffies(X) (jiffies > (X))
+
+#define prefetch(ptr) rte_prefetch0(ptr)
+#define prefetchw(ptr) rte_prefetch0(ptr)
+//#define HZ (rte_get_tsc_hz())
+#ifndef HZ
+#define HZ 1000UL //1000 Hz
+#endif
+#define jiffies (rte_rdtsc() / (rte_get_tsc_hz() / HZ)) /* divide first: HZ*tsc can overflow 64 bits */
+
+#define round_jiffies(x) (x)
+#define round_jiffies_relative(x) (x)
+
+#define spin_lock_init(X) rte_spinlock_init((mutex_t*)(X))
+#define spin_lock(X) rte_spinlock_lock((mutex_t*)(X))
+#define spin_unlock(X) rte_spinlock_unlock((mutex_t*)(X))
+
+#define spin_lock_irq spin_lock
+#define spin_unlock_irq spin_unlock
+#define spin_lock_irqsave(X,flag) spin_lock(X)
+#define spin_unlock_irqrestore(X,flag) spin_unlock(X)
+#define spin_lock_bh(X) spin_lock(X)
+#define spin_unlock_bh(X) spin_unlock(X)
+
+#define mutex_init(X) rte_spinlock_init((mutex_t*)(X))
+#define mutex_lock(X) rte_spinlock_lock((mutex_t*)(X))
+#define mutex_unlock(X) rte_spinlock_unlock((mutex_t*)(X))
+#define mutex_destroy(X)
+
+#define rwlock_init(X) rte_spinlock_init(X)
+#define read_lock(X) rte_spinlock_lock(X)
+#define read_unlock(X) rte_spinlock_unlock(X)
+#define write_lock(X) rte_spinlock_lock(X)
+#define write_unlock(X) rte_spinlock_unlock(X)
+#define read_lock_irqsave(X,flag) rte_spinlock_lock(X)
+#define read_unlock_irqrestore(X,flag) rte_spinlock_unlock(X)
+#define write_lock_irqsave(X,flag) rte_spinlock_lock(X)
+#define write_unlock_irqrestore(X,flag) rte_spinlock_unlock(X)
+
+#define readb read8
+#define writeb write8
+#define readw read16
+#define writew write16
+#define readl read32
+#define writel write32
+#define readq read64
+#define writeq write64
+
+#define kmalloc(size, flag) rte_malloc("kmalloc", size, RTE_CACHE_LINE_SIZE)
+#define kzalloc(size, flag) rte_zmalloc("kzalloc", size, RTE_CACHE_LINE_SIZE)
+#define kcalloc(count, unit, kern_flag) rte_calloc("kcalloc", count, unit, RTE_CACHE_LINE_SIZE)
+#define kmalloc_node(size, flag, node) rte_malloc_socket("kmalloc_node", size, RTE_CACHE_LINE_SIZE, node)
+#define kzalloc_node(size, kern_flag, node) rte_zmalloc_socket("kzalloc_node", size, RTE_CACHE_LINE_SIZE, node)
+#define vzalloc(size) rte_zmalloc("vzalloc", size, RTE_CACHE_LINE_SIZE)
+#define vzalloc_node(size, node) rte_zmalloc_socket("vzalloc", size, RTE_CACHE_LINE_SIZE, node)
+#define vfree(ptr) rte_free(ptr)
+#define vmalloc(size) rte_malloc("vmalloc", size, RTE_CACHE_LINE_SIZE)
+#define vmalloc_node(size, node) rte_malloc_socket("vmalloc", size, RTE_CACHE_LINE_SIZE, node)
+
+#define kfree(ptr) rte_free(ptr)
+
+#define L1_CACHE_BYTES RTE_CACHE_LINE_SIZE
+#define SMP_CACHE_BYTES L1_CACHE_BYTES
+#define cache_line_size(X) RTE_CACHE_LINE_SIZE
+
+#define atomic_cmpset(a,b,c) rte_atomic32_cmpset(a,b,c)
+#define atomic_init(a) rte_atomic32_init(a)
+#define atomic_set(a,b) rte_atomic32_set(a,b)
+#define atomic_read(a) rte_atomic32_read(a)
+#define atomic_add(a,b) rte_atomic32_add(b,a)
+#define atomic_sub(a,b) rte_atomic32_sub(b,a)
+#define atomic_inc(a) rte_atomic32_inc(a)
+#define atomic_dec(a) rte_atomic32_dec(a)
+#define atomic_add_return(a,b) rte_atomic32_add_return(b,a)
+#define atomic_inc_return(a) rte_atomic32_add_return(a,1)
+#define atomic_dec_return(a) rte_atomic32_add_return(a,-1)
+#define atomic_inc_and_test(a) rte_atomic32_inc_and_test(a)
+#define atomic_dec_and_test(a) rte_atomic32_dec_and_test(a)
+#define atomic_test_and_set(a) rte_atomic32_test_and_set(a)
+#define atomic_clear(a) rte_atomic32_clear(a)
+#define cond_resched() rte_pause()
+
+#define jiffies_to_msecs(X) (MSEC_PER_SEC*(X)/HZ)
+
+#define max __MAX
+#define min __MIN
+#define MAX __MAX
+#define MIN __MIN
+#define __MAX(a,b) RTE_MAX((a),(b))
+#define __MIN(a,b) RTE_MIN((a),(b))
+#define min3(a,b,c) RTE_MIN(RTE_MIN((a),(b)),(c))
+#define clamp_t(type, val, lo, hi) min_t(type, max_t(type, val, lo), hi)
+#define min_t(type, a, b) MIN((type)(a), (type)(b))
+#define max_t(type, a, b) MAX((type)(a), (type)(b))
+
+//
+
+#define __always_inline __inline __attribute__ ((__always_inline__))
+#define __packed __attribute__((packed))
+
+#define time_get_ts(p_timespec) (clock_gettime(CLOCK_MONOTONIC_RAW, p_timespec))
+
+#define msecs_to_jiffies(msec) (((msec) * HZ) / MSEC_PER_SEC)
+#define jiffies_to_msec(jifi) (((jifi) * MSEC_PER_SEC) / HZ)
+
+#define __raw_writeq write64
+
+struct mutex{
+ mutex_t mutex;
+}__attribute__((packed));
+
+typedef struct semaphore
+{
+ int count;
+ mutex_t lock;
+}semaphore_t;
+#define rw_semaphore semaphore
+#define down_read down
+#define up_read up
+#define down_write down
+#define up_write up
+#define init_rwsem(x) sema_init(x,1)
+
+#define KERN_EMERG "[KERN_EMERG]"
+#define KERN_ALERT "[KERN_ALERT]"
+#define KERN_CRIT "[KERN_CRIT]"
+#define KERN_ERR "[KERN_ERR]"
+#define KERN_WARNING "[KERN_WARNING]"
+#define KERN_NOTICE "[KERN_NOTICE]"
+#define KERN_DEBUG "[KERN_DEBUG]"
+#define KERN_INFO "[KERN_INFO]"
+
+#define printk printf
+#define printk_once printf
+#define si_meminfo(ptr) sysinfo(ptr)
+
+#define dev_printk(PRINT_LEVEL, device, format, arg...) \
+ do { \
+ printf("[dev_printk] "); \
+ printf(format, ##arg); \
+ } while(0)
+
+#define dev_err(device, format, arg...) \
+ do { \
+ printf("[dev_err] "); \
+ printf(format, ##arg); \
+ } while(0)
+
+#define dev_info(device, format, arg...) \
+ do { \
+ printf("[dev_info] "); \
+ printf(format, ##arg); \
+ } while(0)
+
+#define dev_warn(device, format, arg...) \
+ do { \
+ printf("[dev_warn] "); \
+ printf(format, ##arg); \
+ } while(0)
+
+#define pr_warning(format, arg...) \
+ do { \
+ printf("[pr_warning] "); \
+ printf(format, ##arg); \
+ } while(0)
+
+#define pr_warn(format, arg...) \
+ do { \
+ printf("[pr_warn] "); \
+ printf(format, ##arg); \
+ } while(0)
+
+#define pr_debug(format, arg...) \
+ do { \
+ printf("[pr_debug] "); \
+ printf(format, ##arg); \
+ } while(0)
+
+#define pr_info(format, arg...) \
+ do { \
+ printf("[pr_info] "); \
+ printf(format, ##arg); \
+ } while(0)
+
+#define pr_err(format, arg...) \
+ do { \
+ printf("[pr_err] "); \
+ printf(format, ##arg); \
+ } while(0)
+
+#define pr_devel(format, arg...) \
+ do { \
+ printf("[pr_devel] "); \
+ printf(format, ##arg); \
+ } while(0)
+
+# define do_div(n,base) ({ \
+ uint64_t __rem; \
+ __rem = ((uint64_t)(n)) % (base); \
+ (n) = ((uint64_t)(n)) / (base); \
+ __rem; \
+ })
+
+#define dev_dbg(dev, format, arg...) \
+ ({ \
+ if (0) \
+ dev_printk(KERN_DEBUG, dev, format, ##arg); \
+ 0; \
+ })
+
+#define __raw_writel write32
+#define __raw_readl read32
+#define writel write32
+#define readl read32
+
+#define BUG_ON(X) assert(!(X))
+
+
+#define MAX_ERRNO 4095
+
+
+#define IS_ERR_VALUE(x) unlikely((x) >= (unsigned long)-MAX_ERRNO)
+
+static inline void * __must_check ERR_PTR(long error)
+{
+ return (void *) error;
+}
+
+static inline long __must_check PTR_ERR(__force const void *ptr)
+{
+ return (long) ptr;
+}
+
+static inline bool __must_check IS_ERR(__force const void *ptr)
+{
+ return IS_ERR_VALUE((unsigned long)ptr);
+}
+
+static inline bool __must_check IS_ERR_OR_NULL(__force const void *ptr)
+{
+ return !ptr || IS_ERR_VALUE((unsigned long)ptr);
+}
+
+/**
+ * ERR_CAST - Explicitly cast an error-valued pointer to another pointer type
+ * @ptr: The pointer to cast.
+ *
+ * Explicitly cast an error-valued pointer to another pointer type in such a
+ * way as to make it clear that's what's going on.
+ */
+static inline void * __must_check ERR_CAST(__force const void *ptr)
+{
+ /* cast away the const */
+ return (void *) ptr;
+}
+
+static inline int __must_check PTR_ERR_OR_ZERO(__force const void *ptr)
+{
+ if (IS_ERR(ptr))
+ return PTR_ERR(ptr);
+ else
+ return 0;
+}
+
+/* Deprecated */
+#define PTR_RET(p) PTR_ERR_OR_ZERO(p)
+
+
+
+enum {
+ NETIF_MSG_DRV = 0x0001,
+ NETIF_MSG_PROBE = 0x0002,
+ NETIF_MSG_LINK = 0x0004,
+ NETIF_MSG_TIMER = 0x0008,
+ NETIF_MSG_IFDOWN = 0x0010,
+ NETIF_MSG_IFUP = 0x0020,
+ NETIF_MSG_RX_ERR = 0x0040,
+ NETIF_MSG_TX_ERR = 0x0080,
+ NETIF_MSG_TX_QUEUED = 0x0100,
+ NETIF_MSG_INTR = 0x0200,
+ NETIF_MSG_TX_DONE = 0x0400,
+ NETIF_MSG_RX_STATUS = 0x0800,
+ NETIF_MSG_PKTDATA = 0x1000,
+ NETIF_MSG_HW = 0x2000,
+ NETIF_MSG_WOL = 0x4000,
+};
+
+
+#include "kcompat.h"
+#include "inline_functions.h"
+#include "etherdevice.h"
+#include "netdev_features.h"
+
+#endif /* DRIVERS_NET_MLNX_UIO_INCLUDE_KMOD_H_ */
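
Illustrative aside, not part of the patch: the ERR_PTR()/IS_ERR()/PTR_ERR()
helpers in kmod.h encode a negative errno in the top MAX_ERRNO (4095) values
of the pointer range, so one return value can carry either a valid pointer or
an error code. The sketch below shows the calling convention in plain
userspace C. It assumes an LP64 target where long and pointers have the same
width, re-declares the helpers without the kernel annotations, and uses
hypothetical names (table, lookup_slot, read_slot) purely for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ERRNO 4095

/* Stand-ins for the kmod.h helpers: same logic, no __must_check/__force. */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	/* True only for the last MAX_ERRNO addresses of the address space. */
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical data and lookup function, for illustration only. */
static int table[4] = { 10, 20, 30, 40 };

static int *lookup_slot(int idx)
{
	if (idx < 0 || idx >= 4)
		return ERR_PTR(-EINVAL); /* error travels inside the pointer */
	return &table[idx];
}

/* Returns the slot value, or a negative errno when the lookup failed. */
static int read_slot(int idx)
{
	int *slot = lookup_slot(idx);

	if (IS_ERR(slot))
		return (int)PTR_ERR(slot);
	return *slot;
}
```

Note that IS_ERR(NULL) is false, which is why callers that may also receive
NULL use IS_ERR_OR_NULL() instead.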
diff --git a/drivers/net/mlnx_uio/include/list.h b/drivers/net/mlnx_uio/include/list.h
new file mode 100644
index 0000000..2706890
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/list.h
@@ -0,0 +1,780 @@
+/*
+ * list.h
+ *
+ * Created on: Jun 24, 2015
+ * Author: leeopop
+ */
+#ifndef _LINUX_LIST_H
+#define _LINUX_LIST_H
+
+#include <stddef.h>
+
+/*
+ * Simple doubly linked list implementation.
+ *
+ * Some of the internal functions ("__xxx") are useful when
+ * manipulating whole lists rather than single entries, as
+ * sometimes we already know the next/prev entries and we can
+ * generate better code by using them directly rather than
+ * using the generic single-entry routines.
+ */
+
+struct list_head{
+ struct list_head *next;
+ struct list_head *prev;
+};
+
+#define LIST_HEAD_INIT(name) { &(name), &(name) }
+
+#ifdef LIST_HEAD
+#undef LIST_HEAD
+#endif
+#define LIST_HEAD(name) \
+ struct list_head name = LIST_HEAD_INIT(name)
+
+#ifndef LIST_POISON1
+#define LIST_POISON1 ((void*)-1)
+#endif
+#ifndef LIST_POISON2
+#define LIST_POISON2 ((void*)-2)
+#endif
+
+
+#define container_of(ptr, type, member) ({ \
+ const typeof( ((type *)0)->member ) *__mptr = (ptr); \
+ (type *)( (char *)__mptr - offsetof(type,member) );})
+
+static inline void INIT_LIST_HEAD(struct list_head *list)
+{
+ list->next = list;
+ list->prev = list;
+}
+
+/*
+ * Insert a new entry between two known consecutive entries.
+ *
+ * This is only for internal list manipulation where we know
+ * the prev/next entries already!
+ */
+#ifndef CONFIG_DEBUG_LIST
+static inline void __list_add(struct list_head *new,
+ struct list_head *prev,
+ struct list_head *next)
+{
+ next->prev = new;
+ new->next = next;
+ new->prev = prev;
+ prev->next = new;
+}
+#else
+extern void __list_add(struct list_head *new,
+ struct list_head *prev,
+ struct list_head *next);
+#endif
+
+/**
+ * list_add - add a new entry
+ * @new: new entry to be added
+ * @head: list head to add it after
+ *
+ * Insert a new entry after the specified head.
+ * This is good for implementing stacks.
+ */
+static inline void list_add(struct list_head *new, struct list_head *head)
+{
+ __list_add(new, head, head->next);
+}
+
+
+/**
+ * list_add_tail - add a new entry
+ * @new: new entry to be added
+ * @head: list head to add it before
+ *
+ * Insert a new entry before the specified head.
+ * This is useful for implementing queues.
+ */
+static inline void list_add_tail(struct list_head *new, struct list_head *head)
+{
+ __list_add(new, head->prev, head);
+}
+
+/*
+ * Delete a list entry by making the prev/next entries
+ * point to each other.
+ *
+ * This is only for internal list manipulation where we know
+ * the prev/next entries already!
+ */
+static inline void __list_del(struct list_head * prev, struct list_head * next)
+{
+ next->prev = prev;
+ prev->next = next;
+}
+
+/**
+ * list_del - deletes entry from list.
+ * @entry: the element to delete from the list.
+ * Note: list_empty() on entry does not return true after this, the entry is
+ * in an undefined state.
+ */
+#ifndef CONFIG_DEBUG_LIST
+static inline void __list_del_entry(struct list_head *entry)
+{
+ __list_del(entry->prev, entry->next);
+}
+
+static inline void list_del(struct list_head *entry)
+{
+ __list_del(entry->prev, entry->next);
+ entry->next = LIST_POISON1;
+ entry->prev = LIST_POISON2;
+}
+#else
+extern void __list_del_entry(struct list_head *entry);
+extern void list_del(struct list_head *entry);
+#endif
+
+/**
+ * list_replace - replace old entry by new one
+ * @old : the element to be replaced
+ * @new : the new element to insert
+ *
+ * If @old was empty, it will be overwritten.
+ */
+static inline void list_replace(struct list_head *old,
+ struct list_head *new)
+{
+ new->next = old->next;
+ new->next->prev = new;
+ new->prev = old->prev;
+ new->prev->next = new;
+}
+
+static inline void list_replace_init(struct list_head *old,
+ struct list_head *new)
+{
+ list_replace(old, new);
+ INIT_LIST_HEAD(old);
+}
+
+/**
+ * list_del_init - deletes entry from list and reinitialize it.
+ * @entry: the element to delete from the list.
+ */
+static inline void list_del_init(struct list_head *entry)
+{
+ __list_del_entry(entry);
+ INIT_LIST_HEAD(entry);
+}
+
+/**
+ * list_move - delete from one list and add as another's head
+ * @list: the entry to move
+ * @head: the head that will precede our entry
+ */
+static inline void list_move(struct list_head *list, struct list_head *head)
+{
+ __list_del_entry(list);
+ list_add(list, head);
+}
+
+/**
+ * list_move_tail - delete from one list and add as another's tail
+ * @list: the entry to move
+ * @head: the head that will follow our entry
+ */
+static inline void list_move_tail(struct list_head *list,
+ struct list_head *head)
+{
+ __list_del_entry(list);
+ list_add_tail(list, head);
+}
+
+/**
+ * list_is_last - tests whether @list is the last entry in list @head
+ * @list: the entry to test
+ * @head: the head of the list
+ */
+static inline int list_is_last(const struct list_head *list,
+ const struct list_head *head)
+{
+ return list->next == head;
+}
+
+/**
+ * list_empty - tests whether a list is empty
+ * @head: the list to test.
+ */
+static inline int list_empty(const struct list_head *head)
+{
+ return head->next == head;
+}
+
+/**
+ * list_empty_careful - tests whether a list is empty and not being modified
+ * @head: the list to test
+ *
+ * Description:
+ * tests whether a list is empty _and_ checks that no other CPU might be
+ * in the process of modifying either member (next or prev)
+ *
+ * NOTE: using list_empty_careful() without synchronization
+ * can only be safe if the only activity that can happen
+ * to the list entry is list_del_init(). Eg. it cannot be used
+ * if another CPU could re-list_add() it.
+ */
+static inline int list_empty_careful(const struct list_head *head)
+{
+ struct list_head *next = head->next;
+ return (next == head) && (next == head->prev);
+}
+
+/**
+ * list_rotate_left - rotate the list to the left
+ * @head: the head of the list
+ */
+static inline void list_rotate_left(struct list_head *head)
+{
+ struct list_head *first;
+
+ if (!list_empty(head)) {
+ first = head->next;
+ list_move_tail(first, head);
+ }
+}
+
+/**
+ * list_is_singular - tests whether a list has just one entry.
+ * @head: the list to test.
+ */
+static inline int list_is_singular(const struct list_head *head)
+{
+ return !list_empty(head) && (head->next == head->prev);
+}
+
+static inline void __list_cut_position(struct list_head *list,
+ struct list_head *head, struct list_head *entry)
+{
+ struct list_head *new_first = entry->next;
+ list->next = head->next;
+ list->next->prev = list;
+ list->prev = entry;
+ entry->next = list;
+ head->next = new_first;
+ new_first->prev = head;
+}
+
+/**
+ * list_cut_position - cut a list into two
+ * @list: a new list to add all removed entries
+ * @head: a list with entries
+ * @entry: an entry within head, could be the head itself
+ * and if so we won't cut the list
+ *
+ * This helper moves the initial part of @head, up to and
+ * including @entry, from @head to @list. You should
+ * pass on @entry an element you know is on @head. @list
+ * should be an empty list or a list you do not care about
+ * losing its data.
+ *
+ */
+static inline void list_cut_position(struct list_head *list,
+ struct list_head *head, struct list_head *entry)
+{
+ if (list_empty(head))
+ return;
+ if (list_is_singular(head) &&
+ (head->next != entry && head != entry))
+ return;
+ if (entry == head)
+ INIT_LIST_HEAD(list);
+ else
+ __list_cut_position(list, head, entry);
+}
+
+static inline void __list_splice(const struct list_head *list,
+ struct list_head *prev,
+ struct list_head *next)
+{
+ struct list_head *first = list->next;
+ struct list_head *last = list->prev;
+
+ first->prev = prev;
+ prev->next = first;
+
+ last->next = next;
+ next->prev = last;
+}
+
+/**
+ * list_splice - join two lists, this is designed for stacks
+ * @list: the new list to add.
+ * @head: the place to add it in the first list.
+ */
+static inline void list_splice(const struct list_head *list,
+ struct list_head *head)
+{
+ if (!list_empty(list))
+ __list_splice(list, head, head->next);
+}
+
+/**
+ * list_splice_tail - join two lists, each list being a queue
+ * @list: the new list to add.
+ * @head: the place to add it in the first list.
+ */
+static inline void list_splice_tail(struct list_head *list,
+ struct list_head *head)
+{
+ if (!list_empty(list))
+ __list_splice(list, head->prev, head);
+}
+
+/**
+ * list_splice_init - join two lists and reinitialise the emptied list.
+ * @list: the new list to add.
+ * @head: the place to add it in the first list.
+ *
+ * The list at @list is reinitialised
+ */
+static inline void list_splice_init(struct list_head *list,
+ struct list_head *head)
+{
+ if (!list_empty(list)) {
+ __list_splice(list, head, head->next);
+ INIT_LIST_HEAD(list);
+ }
+}
+
+/**
+ * list_splice_tail_init - join two lists and reinitialise the emptied list
+ * @list: the new list to add.
+ * @head: the place to add it in the first list.
+ *
+ * Each of the lists is a queue.
+ * The list at @list is reinitialised
+ */
+static inline void list_splice_tail_init(struct list_head *list,
+ struct list_head *head)
+{
+ if (!list_empty(list)) {
+ __list_splice(list, head->prev, head);
+ INIT_LIST_HEAD(list);
+ }
+}
+
+/**
+ * list_entry - get the struct for this entry
+ * @ptr: the &struct list_head pointer.
+ * @type: the type of the struct this is embedded in.
+ * @member: the name of the list_struct within the struct.
+ */
+#define list_entry(ptr, type, member) \
+ container_of(ptr, type, member)
+
+/**
+ * list_first_entry - get the first element from a list
+ * @ptr: the list head to take the element from.
+ * @type: the type of the struct this is embedded in.
+ * @member: the name of the list_struct within the struct.
+ *
+ * Note, that list is expected to be not empty.
+ */
+#define list_first_entry(ptr, type, member) \
+ list_entry((ptr)->next, type, member)
+
+/**
+ * list_last_entry - get the last element from a list
+ * @ptr: the list head to take the element from.
+ * @type: the type of the struct this is embedded in.
+ * @member: the name of the list_struct within the struct.
+ *
+ * Note, that list is expected to be not empty.
+ */
+#define list_last_entry(ptr, type, member) \
+ list_entry((ptr)->prev, type, member)
+
+/**
+ * list_first_entry_or_null - get the first element from a list
+ * @ptr: the list head to take the element from.
+ * @type: the type of the struct this is embedded in.
+ * @member: the name of the list_struct within the struct.
+ *
+ * Note that if the list is empty, it returns NULL.
+ */
+#define list_first_entry_or_null(ptr, type, member) \
+ (!list_empty(ptr) ? list_first_entry(ptr, type, member) : NULL)
+
+/**
+ * list_next_entry - get the next element in list
+ * @pos: the type * to cursor
+ * @member: the name of the list_struct within the struct.
+ */
+#define list_next_entry(pos, member) \
+ list_entry((pos)->member.next, typeof(*(pos)), member)
+
+/**
+ * list_prev_entry - get the prev element in list
+ * @pos: the type * to cursor
+ * @member: the name of the list_struct within the struct.
+ */
+#define list_prev_entry(pos, member) \
+ list_entry((pos)->member.prev, typeof(*(pos)), member)
+
+/**
+ * list_for_each - iterate over a list
+ * @pos: the &struct list_head to use as a loop cursor.
+ * @head: the head for your list.
+ */
+#define list_for_each(pos, head) \
+ for (pos = (head)->next; pos != (head); pos = pos->next)
+
+/**
+ * list_for_each_prev - iterate over a list backwards
+ * @pos: the &struct list_head to use as a loop cursor.
+ * @head: the head for your list.
+ */
+#define list_for_each_prev(pos, head) \
+ for (pos = (head)->prev; pos != (head); pos = pos->prev)
+
+/**
+ * list_for_each_safe - iterate over a list safe against removal of list entry
+ * @pos: the &struct list_head to use as a loop cursor.
+ * @n: another &struct list_head to use as temporary storage
+ * @head: the head for your list.
+ */
+#define list_for_each_safe(pos, n, head) \
+ for (pos = (head)->next, n = pos->next; pos != (head); \
+ pos = n, n = pos->next)
+
+/**
+ * list_for_each_prev_safe - iterate over a list backwards safe against removal of list entry
+ * @pos: the &struct list_head to use as a loop cursor.
+ * @n: another &struct list_head to use as temporary storage
+ * @head: the head for your list.
+ */
+#define list_for_each_prev_safe(pos, n, head) \
+ for (pos = (head)->prev, n = pos->prev; \
+ pos != (head); \
+ pos = n, n = pos->prev)
+
+/**
+ * list_for_each_entry - iterate over list of given type
+ * @pos: the type * to use as a loop cursor.
+ * @head: the head for your list.
+ * @member: the name of the list_struct within the struct.
+ */
+#define list_for_each_entry(pos, head, member) \
+ for (pos = list_first_entry(head, typeof(*pos), member); \
+ &pos->member != (head); \
+ pos = list_next_entry(pos, member))
+
+/**
+ * list_for_each_entry_reverse - iterate backwards over list of given type.
+ * @pos: the type * to use as a loop cursor.
+ * @head: the head for your list.
+ * @member: the name of the list_struct within the struct.
+ */
+#define list_for_each_entry_reverse(pos, head, member) \
+ for (pos = list_last_entry(head, typeof(*pos), member); \
+ &pos->member != (head); \
+ pos = list_prev_entry(pos, member))
+
+/**
+ * list_prepare_entry - prepare a pos entry for use in list_for_each_entry_continue()
+ * @pos: the type * to use as a start point
+ * @head: the head of the list
+ * @member: the name of the list_struct within the struct.
+ *
+ * Prepares a pos entry for use as a start point in list_for_each_entry_continue().
+ */
+#define list_prepare_entry(pos, head, member) \
+ ((pos) ? : list_entry(head, typeof(*pos), member))
+
+/**
+ * list_for_each_entry_continue - continue iteration over list of given type
+ * @pos: the type * to use as a loop cursor.
+ * @head: the head for your list.
+ * @member: the name of the list_struct within the struct.
+ *
+ * Continue to iterate over list of given type, continuing after
+ * the current position.
+ */
+#define list_for_each_entry_continue(pos, head, member) \
+ for (pos = list_next_entry(pos, member); \
+ &pos->member != (head); \
+ pos = list_next_entry(pos, member))
+
+/**
+ * list_for_each_entry_continue_reverse - iterate backwards from the given point
+ * @pos: the type * to use as a loop cursor.
+ * @head: the head for your list.
+ * @member: the name of the list_struct within the struct.
+ *
+ * Start to iterate over list of given type backwards, continuing after
+ * the current position.
+ */
+#define list_for_each_entry_continue_reverse(pos, head, member) \
+ for (pos = list_prev_entry(pos, member); \
+ &pos->member != (head); \
+ pos = list_prev_entry(pos, member))
+
+/**
+ * list_for_each_entry_from - iterate over list of given type from the current point
+ * @pos: the type * to use as a loop cursor.
+ * @head: the head for your list.
+ * @member: the name of the list_struct within the struct.
+ *
+ * Iterate over list of given type, continuing from current position.
+ */
+#define list_for_each_entry_from(pos, head, member) \
+ for (; &pos->member != (head); \
+ pos = list_next_entry(pos, member))
+
+/**
+ * list_for_each_entry_safe - iterate over list of given type safe against removal of list entry
+ * @pos: the type * to use as a loop cursor.
+ * @n: another type * to use as temporary storage
+ * @head: the head for your list.
+ * @member: the name of the list_struct within the struct.
+ */
+#define list_for_each_entry_safe(pos, n, head, member) \
+ for (pos = list_first_entry(head, typeof(*pos), member), \
+ n = list_next_entry(pos, member); \
+ &pos->member != (head); \
+ pos = n, n = list_next_entry(n, member))
+
+/**
+ * list_for_each_entry_safe_continue - continue list iteration safe against removal
+ * @pos: the type * to use as a loop cursor.
+ * @n: another type * to use as temporary storage
+ * @head: the head for your list.
+ * @member: the name of the list_struct within the struct.
+ *
+ * Iterate over list of given type, continuing after current point,
+ * safe against removal of list entry.
+ */
+#define list_for_each_entry_safe_continue(pos, n, head, member) \
+ for (pos = list_next_entry(pos, member), \
+ n = list_next_entry(pos, member); \
+ &pos->member != (head); \
+ pos = n, n = list_next_entry(n, member))
+
+/**
+ * list_for_each_entry_safe_from - iterate over list from current point safe against removal
+ * @pos: the type * to use as a loop cursor.
+ * @n: another type * to use as temporary storage
+ * @head: the head for your list.
+ * @member: the name of the list_struct within the struct.
+ *
+ * Iterate over list of given type from current point, safe against
+ * removal of list entry.
+ */
+#define list_for_each_entry_safe_from(pos, n, head, member) \
+ for (n = list_next_entry(pos, member); \
+ &pos->member != (head); \
+ pos = n, n = list_next_entry(n, member))
+
+/**
+ * list_for_each_entry_safe_reverse - iterate backwards over list safe against removal
+ * @pos: the type * to use as a loop cursor.
+ * @n: another type * to use as temporary storage
+ * @head: the head for your list.
+ * @member: the name of the list_struct within the struct.
+ *
+ * Iterate backwards over list of given type, safe against removal
+ * of list entry.
+ */
+#define list_for_each_entry_safe_reverse(pos, n, head, member) \
+ for (pos = list_last_entry(head, typeof(*pos), member), \
+ n = list_prev_entry(pos, member); \
+ &pos->member != (head); \
+ pos = n, n = list_prev_entry(n, member))
+
+/**
+ * list_safe_reset_next - reset a stale list_for_each_entry_safe loop
+ * @pos: the loop cursor used in the list_for_each_entry_safe loop
+ * @n: temporary storage used in list_for_each_entry_safe
+ * @member: the name of the list_struct within the struct.
+ *
+ * list_safe_reset_next is not safe to use in general if the list may be
+ * modified concurrently (eg. the lock is dropped in the loop body). An
+ * exception to this is if the cursor element (pos) is pinned in the list,
+ * and list_safe_reset_next is called after re-taking the lock and before
+ * completing the current iteration of the loop body.
+ */
+#define list_safe_reset_next(pos, n, member) \
+ n = list_next_entry(pos, member)
+
+struct hlist_head {
+ struct hlist_node *first;
+};
+
+struct hlist_node {
+ struct hlist_node *next, **pprev;
+};
+
+/**
+ * struct callback_head - callback structure for use with RCU and task_work
+ * @next: next update requests in a list
+ * @func: actual update function to call after the grace period.
+ */
+struct callback_head {
+ struct callback_head *next;
+ void (*func)(struct callback_head *head);
+};
+#define rcu_head callback_head
+
+#define HLIST_HEAD_INIT { .first = NULL }
+#define INIT_HLIST_HEAD(ptr) ((ptr)->first = NULL)
+
+static inline void INIT_HLIST_NODE(struct hlist_node *h)
+{
+ h->next = NULL;
+ h->pprev = NULL;
+}
+
+static inline int hlist_unhashed(const struct hlist_node *h)
+{
+ return !h->pprev;
+}
+
+static inline int hlist_empty(const struct hlist_head *h)
+{
+ return !h->first;
+}
+
+static inline void __hlist_del(struct hlist_node *n)
+{
+ struct hlist_node *next = n->next;
+ struct hlist_node **pprev = n->pprev;
+ *pprev = next;
+ if (next)
+ next->pprev = pprev;
+}
+
+
+static inline void hlist_del(struct hlist_node *n)
+{
+ __hlist_del(n);
+ n->next = LIST_POISON1;
+ n->pprev = LIST_POISON2;
+}
+
+static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
+{
+ struct hlist_node *first = h->first;
+ n->next = first;
+ if (first)
+ first->pprev = &n->next;
+ h->first = n;
+ n->pprev = &h->first;
+}
+
+static inline void hlist_add_behind(struct hlist_node *n,
+ struct hlist_node *prev)
+{
+ n->next = prev->next;
+ prev->next = n;
+ n->pprev = &prev->next;
+
+ if (n->next)
+ n->next->pprev = &n->next;
+}
+
+#define hlist_entry(ptr, type, member) container_of(ptr,type,member)
+
+#define hlist_for_each(pos, head) \
+ for (pos = (head)->first; pos ; pos = pos->next)
+
+#define hlist_for_each_safe(pos, n, head) \
+ for (pos = (head)->first; pos && ({ n = pos->next; 1; }); \
+ pos = n)
+
+#define hlist_entry_safe(ptr, type, member) \
+ ({ typeof(ptr) ____ptr = (ptr); \
+ ____ptr ? hlist_entry(____ptr, type, member) : NULL; \
+ })
+
+#define hlist_for_each_entry_safe(pos, n, head, member) \
+ for (pos = hlist_entry_safe((head)->first, typeof(*pos), member);\
+ pos && ({ n = pos->member.next; 1; }); \
+ pos = hlist_entry_safe(n, typeof(*pos), member))
+
+#define hlist_for_each_entry(pos, head, member) \
+ for (pos = hlist_entry_safe((head)->first, typeof(*(pos)), member);\
+ pos; \
+ pos = hlist_entry_safe((pos)->member.next, typeof(*(pos)), member))
+
+#define hlist_first_rcu(head) (*((struct hlist_node __rcu **)(&(head)->first)))
+#define hlist_next_rcu(node) (*((struct hlist_node __rcu **)(&(node)->next)))
+#define hlist_pprev_rcu(node) (*((struct hlist_node __rcu **)((node)->pprev)))
+
+#define hlist_for_each_entry_rcu(pos, head, member) \
+ for (pos = hlist_entry_safe (rcu_dereference_raw(hlist_first_rcu(head)),\
+ typeof(*(pos)), member); \
+ pos; \
+ pos = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(\
+ &(pos)->member)), typeof(*(pos)), member))
+
+#define list_entry_rcu(ptr, type, member) \
+ ({ \
+ typeof(*ptr) __rcu *__ptr = (typeof(*ptr) __rcu __force *)ptr; \
+ container_of((typeof(ptr))rcu_dereference_raw(__ptr), type, member); \
+ })
+
+/**
+ * Where are list_empty_rcu() and list_first_entry_rcu()?
+ *
+ * Implementing those functions following their counterparts list_empty() and
+ * list_first_entry() is not advisable because they lead to subtle race
+ * conditions as the following snippet shows:
+ *
+ * if (!list_empty_rcu(mylist)) {
+ * struct foo *bar = list_first_entry_rcu(mylist, struct foo, list_member);
+ * do_something(bar);
+ * }
+ *
+ * The list may not be empty when list_empty_rcu checks it, but it may be when
+ * list_first_entry_rcu rereads the ->next pointer.
+ *
+ * Rereading the ->next pointer is not a problem for list_empty() and
+ * list_first_entry() because they would be protected by a lock that blocks
+ * writers.
+ *
+ * See list_first_or_null_rcu for an alternative.
+ */
+
+
+/**
+ * list_for_each_entry_rcu - iterate over rcu list of given type
+ * @pos: the type * to use as a loop cursor.
+ * @head: the head for your list.
+ * @member: the name of the list_struct within the struct.
+ *
+ * This list-traversal primitive may safely run concurrently with
+ * the _rcu list-mutation primitives such as list_add_rcu()
+ * as long as the traversal is guarded by rcu_read_lock().
+ */
+#define list_for_each_entry_rcu(pos, head, member) \
+ for (pos = list_entry_rcu((head)->next, typeof(*pos), member); \
+ &pos->member != (head); \
+ pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
+
+/**
+ * list_for_each_entry_continue_rcu - continue iteration over list of given type
+ * @pos: the type * to use as a loop cursor.
+ * @head: the head for your list.
+ * @member: the name of the list_struct within the struct.
+ *
+ * Continue to iterate over list of given type, continuing after
+ * the current position.
+ */
+#define list_for_each_entry_continue_rcu(pos, head, member) \
+ for (pos = list_entry_rcu(pos->member.next, typeof(*pos), member); \
+ &pos->member != (head); \
+ pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
+
+
+#endif
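
Illustrative aside, not part of the patch: list.h implements an intrusive
list, where the struct list_head node is embedded in the payload structure and
container_of() recovers the payload from a node pointer. The sketch below is a
reduced, self-contained version of that pattern, assuming plain ISO C (so
container_of drops the typeof check, and list_del only unlinks, without
poisoning). struct pkt and demo_prune_and_sum are hypothetical names. It also
shows why the _safe iterator is needed when entries are deleted during
traversal: the next pointer is saved before the loop body runs.

```c
#include <stddef.h>

struct list_head {
	struct list_head *next;
	struct list_head *prev;
};

/* Simplified container_of: pointer arithmetic only, no typeof check. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Same shape as the patch's list_for_each_safe: n caches pos->next. */
#define list_for_each_safe(pos, n, head) \
	for (pos = (head)->next, n = pos->next; pos != (head); \
	     pos = n, n = pos->next)

static void INIT_LIST_HEAD(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	struct list_head *prev = head->prev;

	head->prev = entry;
	entry->next = head;
	entry->prev = prev;
	prev->next = entry;
}

/* Unlink only; the patch's list_del additionally poisons the pointers. */
static void list_del(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
}

/* Hypothetical payload with an embedded list node. */
struct pkt {
	int len;
	struct list_head link;
};

/* Drops entries shorter than 16 and returns the summed length of the rest. */
static int demo_prune_and_sum(void)
{
	struct pkt a = { .len = 64 }, b = { .len = 8 }, c = { .len = 128 };
	struct list_head head, *pos, *n;
	int sum = 0;

	INIT_LIST_HEAD(&head);
	list_add_tail(&a.link, &head);
	list_add_tail(&b.link, &head);
	list_add_tail(&c.link, &head);

	list_for_each_safe(pos, n, &head) {
		struct pkt *p = container_of(pos, struct pkt, link);

		if (p->len < 16)
			list_del(pos); /* safe: n already holds the successor */
		else
			sum += p->len;
	}
	return sum;
}
```

With the non-safe list_for_each, the same loop would read pos->next after the
entry was unlinked, which in the patch's list_del is a poisoned pointer.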
diff --git a/drivers/net/mlnx_uio/include/log2.h b/drivers/net/mlnx_uio/include/log2.h
new file mode 100644
index 0000000..2c0ca9f
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/log2.h
@@ -0,0 +1,229 @@
+/* Integer base 2 logarithm calculation
+ *
+ * Copyright (C) 2006 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _LINUX_LOG2_H
+#define _LINUX_LOG2_H
+
+#include "bitops.h"
+#include <assert.h>
+/*
+ * deal with unrepresentable constant logarithms
+ */
+__attribute__((const))
+static inline int ____ilog2_NaN(void)
+{
+ assert(0);
+ return -1;
+}
+/*
+ * non-constant log of base 2 calculators
+ * - the arch may override these in asm/bitops.h if they can be implemented
+ * more efficiently than using fls() and fls64()
+ * - the arch is not required to handle n==0 if implementing the fallback
+ */
+#ifndef CONFIG_ARCH_HAS_ILOG2_U32
+static inline __attribute__((const))
+int __ilog2_u32(u32 n)
+{
+ return fls(n) - 1;
+}
+#endif
+
+#ifndef CONFIG_ARCH_HAS_ILOG2_U64
+static inline __attribute__((const))
+int __ilog2_u64(u64 n)
+{
+ return fls64(n) - 1;
+}
+#endif
+
+/*
+ * Determine whether some value is a power of two, where zero is
+ * *not* considered a power of two.
+ */
+
+static inline __attribute__((const))
+bool is_power_of_2(unsigned long n)
+{
+ return (n != 0 && ((n & (n - 1)) == 0));
+}
+
+/*
+ * round up to nearest power of two
+ */
+static inline __attribute__((const))
+unsigned long __roundup_pow_of_two(unsigned long n)
+{
+ return 1UL << fls_long(n - 1);
+}
+
+/*
+ * round down to nearest power of two
+ */
+static inline __attribute__((const))
+unsigned long __rounddown_pow_of_two(unsigned long n)
+{
+ return 1UL << (fls_long(n) - 1);
+}
+
+/**
+ * ilog2 - log of base 2 of 32-bit or a 64-bit unsigned value
+ * @n - parameter
+ *
+ * constant-capable log of base 2 calculation
+ * - this can be used to initialise global variables from constant data, hence
+ * the massive ternary operator construction
+ *
+ * selects the appropriately-sized optimised version depending on sizeof(n)
+ */
+#define ilog2(n) \
+( \
+ __builtin_constant_p(n) ? ( \
+ (n) < 1 ? ____ilog2_NaN() : \
+ (n) & (1ULL << 63) ? 63 : \
+ (n) & (1ULL << 62) ? 62 : \
+ (n) & (1ULL << 61) ? 61 : \
+ (n) & (1ULL << 60) ? 60 : \
+ (n) & (1ULL << 59) ? 59 : \
+ (n) & (1ULL << 58) ? 58 : \
+ (n) & (1ULL << 57) ? 57 : \
+ (n) & (1ULL << 56) ? 56 : \
+ (n) & (1ULL << 55) ? 55 : \
+ (n) & (1ULL << 54) ? 54 : \
+ (n) & (1ULL << 53) ? 53 : \
+ (n) & (1ULL << 52) ? 52 : \
+ (n) & (1ULL << 51) ? 51 : \
+ (n) & (1ULL << 50) ? 50 : \
+ (n) & (1ULL << 49) ? 49 : \
+ (n) & (1ULL << 48) ? 48 : \
+ (n) & (1ULL << 47) ? 47 : \
+ (n) & (1ULL << 46) ? 46 : \
+ (n) & (1ULL << 45) ? 45 : \
+ (n) & (1ULL << 44) ? 44 : \
+ (n) & (1ULL << 43) ? 43 : \
+ (n) & (1ULL << 42) ? 42 : \
+ (n) & (1ULL << 41) ? 41 : \
+ (n) & (1ULL << 40) ? 40 : \
+ (n) & (1ULL << 39) ? 39 : \
+ (n) & (1ULL << 38) ? 38 : \
+ (n) & (1ULL << 37) ? 37 : \
+ (n) & (1ULL << 36) ? 36 : \
+ (n) & (1ULL << 35) ? 35 : \
+ (n) & (1ULL << 34) ? 34 : \
+ (n) & (1ULL << 33) ? 33 : \
+ (n) & (1ULL << 32) ? 32 : \
+ (n) & (1ULL << 31) ? 31 : \
+ (n) & (1ULL << 30) ? 30 : \
+ (n) & (1ULL << 29) ? 29 : \
+ (n) & (1ULL << 28) ? 28 : \
+ (n) & (1ULL << 27) ? 27 : \
+ (n) & (1ULL << 26) ? 26 : \
+ (n) & (1ULL << 25) ? 25 : \
+ (n) & (1ULL << 24) ? 24 : \
+ (n) & (1ULL << 23) ? 23 : \
+ (n) & (1ULL << 22) ? 22 : \
+ (n) & (1ULL << 21) ? 21 : \
+ (n) & (1ULL << 20) ? 20 : \
+ (n) & (1ULL << 19) ? 19 : \
+ (n) & (1ULL << 18) ? 18 : \
+ (n) & (1ULL << 17) ? 17 : \
+ (n) & (1ULL << 16) ? 16 : \
+ (n) & (1ULL << 15) ? 15 : \
+ (n) & (1ULL << 14) ? 14 : \
+ (n) & (1ULL << 13) ? 13 : \
+ (n) & (1ULL << 12) ? 12 : \
+ (n) & (1ULL << 11) ? 11 : \
+ (n) & (1ULL << 10) ? 10 : \
+ (n) & (1ULL << 9) ? 9 : \
+ (n) & (1ULL << 8) ? 8 : \
+ (n) & (1ULL << 7) ? 7 : \
+ (n) & (1ULL << 6) ? 6 : \
+ (n) & (1ULL << 5) ? 5 : \
+ (n) & (1ULL << 4) ? 4 : \
+ (n) & (1ULL << 3) ? 3 : \
+ (n) & (1ULL << 2) ? 2 : \
+ (n) & (1ULL << 1) ? 1 : \
+ (n) & (1ULL << 0) ? 0 : \
+ ____ilog2_NaN() \
+ ) : \
+ (sizeof(n) <= 4) ? \
+ __ilog2_u32(n) : \
+ __ilog2_u64(n) \
+ )
+
+/**
+ * roundup_pow_of_two - round the given value up to nearest power of two
+ * @n - parameter
+ *
+ * round the given value up to the nearest power of two
+ * - the result is undefined when n == 0
+ * - this can be used to initialise global variables from constant data
+ */
+#define roundup_pow_of_two(n) \
+( \
+ __builtin_constant_p(n) ? ( \
+ (n == 1) ? 1 : \
+ (1UL << (ilog2((n) - 1) + 1)) \
+ ) : \
+ __roundup_pow_of_two(n) \
+ )
+
+/**
+ * rounddown_pow_of_two - round the given value down to nearest power of two
+ * @n - parameter
+ *
+ * round the given value down to the nearest power of two
+ * - the result is undefined when n == 0
+ * - this can be used to initialise global variables from constant data
+ */
+#define rounddown_pow_of_two(n) \
+( \
+ __builtin_constant_p(n) ? ( \
+ (1UL << ilog2(n))) : \
+ __rounddown_pow_of_two(n) \
+ )
+
+/**
+ * order_base_2 - calculate the (rounded up) base 2 order of the argument
+ * @n: parameter
+ *
+ * The first few values calculated by this routine:
+ * ob2(0) = 0
+ * ob2(1) = 0
+ * ob2(2) = 1
+ * ob2(3) = 2
+ * ob2(4) = 2
+ * ob2(5) = 3
+ * ... and so on.
+ */
+
+#define order_base_2(n) ilog2(roundup_pow_of_two(n))
+
+/*
+ * Runtime evaluation of get_order()
+ */
+static inline __attribute_const__
+int __get_order(unsigned long size)
+{
+ int order;
+
+ size--;
+ size >>= PAGE_SHIFT;
+#if BITS_PER_LONG == 32
+ order = fls(size);
+#else
+ order = fls64(size);
+#endif
+ return order;
+}
+#define get_order(n) __get_order(n)
+
+#endif /* _LINUX_LOG2_H */
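
Note on log2.h above: the runtime helpers all reduce to "find last set bit". The block below is a minimal standalone sketch of that arithmetic and is not part of the patch. Every sketch_* name is hypothetical, and a portable loop stands in for fls64() from bitops.h, which is not shown in this section. As in the header, results are undefined for n == 0, and here also for n above 2^63 in the round-up case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for fls64(): 1-based index of the highest set bit,
 * or 0 when n == 0. */
static inline int sketch_fls64(uint64_t n)
{
	int pos = 0;

	while (n) {
		pos++;
		n >>= 1;
	}
	return pos;
}

/* floor(log2(n)); yields -1 for n == 0, mirroring fls64(n) - 1. */
static inline int sketch_ilog2(uint64_t n)
{
	return sketch_fls64(n) - 1;
}

/* Smallest power of two >= n. Caller must ensure 1 <= n <= 2^63. */
static inline uint64_t sketch_roundup_pow_of_two(uint64_t n)
{
	return UINT64_C(1) << sketch_fls64(n - 1);
}

/* Largest power of two <= n. Caller must ensure n >= 1. */
static inline uint64_t sketch_rounddown_pow_of_two(uint64_t n)
{
	return UINT64_C(1) << (sketch_fls64(n) - 1);
}

/* Rounded-up base-2 order, matching the ob2() table for n >= 1. */
static inline int sketch_order_base_2(uint64_t n)
{
	return sketch_ilog2(sketch_roundup_pow_of_two(n));
}
```

For n = 1..5 the last helper follows the ob2() table in the header comment (0, 1, 2, 2, 3). One consequence worth keeping in mind: ilog2() rounds down, so a caller that passes a count which is not a power of two gets the order of the next lower power of two.
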
diff --git a/drivers/net/mlnx_uio/include/mlx4_dpdk.h b/drivers/net/mlnx_uio/include/mlx4_dpdk.h
new file mode 100644
index 0000000..361f025
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/mlx4_dpdk.h
@@ -0,0 +1,17 @@
+/*
+ * mlx4_dpdk.h
+ *
+ * Created on: Jun 27, 2015
+ * Author: leeopop
+ */
+
+#ifndef DRIVERS_NET_MLNX_UIO_INCLUDE_MLX4_DPDK_H_
+#define DRIVERS_NET_MLNX_UIO_INCLUDE_MLX4_DPDK_H_
+
+#include "kmod.h"
+
+struct mlx4_eth_private
+{
+};
+
+#endif /* DRIVERS_NET_MLNX_UIO_INCLUDE_MLX4_DPDK_H_ */
diff --git a/drivers/net/mlnx_uio/include/mlx4_uio.h b/drivers/net/mlnx_uio/include/mlx4_uio.h
new file mode 100644
index 0000000..f425420
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/mlx4_uio.h
@@ -0,0 +1,24 @@
+/*
+ * mlx4_uio.h
+ *
+ * Created on: Jun 30, 2015
+ * Author: leeopop
+ */
+
+#ifndef DRIVERS_NET_MLNX_UIO_INCLUDE_MLX4_UIO_H_
+#define DRIVERS_NET_MLNX_UIO_INCLUDE_MLX4_UIO_H_
+
+#include <rte_common.h>
+#include <rte_ethdev.h>
+
+uint16_t
+mlx4_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
+ uint16_t nb_pkts);
+uint16_t
+mlx4_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
+ uint16_t nb_pkts);
+
+extern const struct eth_dev_ops mlx4_eth_dev_ops;
+
+
+#endif /* DRIVERS_NET_MLNX_UIO_INCLUDE_MLX4_UIO_H_ */
diff --git a/drivers/net/mlnx_uio/include/mlx4_uio_helper.h b/drivers/net/mlnx_uio/include/mlx4_uio_helper.h
new file mode 100644
index 0000000..b57b9e1
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/mlx4_uio_helper.h
@@ -0,0 +1,800 @@
+/*
+ * mlx4_uio_helper.h
+ *
+ * Created on: Jul 1, 2015
+ * Author: leeopop
+ */
+
+#ifndef DRIVERS_NET_MLNX_UIO_INCLUDE_MLX4_UIO_HELPER_H_
+#define DRIVERS_NET_MLNX_UIO_INCLUDE_MLX4_UIO_HELPER_H_
+
+#include "kmod.h"
+#include "mlnx/mlx4/mlx4_en.h"
+#include "log2.h"
+
+static void mlx4_en_u64_to_mac(unsigned char dst_mac[ETH_ALEN], u64 src_mac)
+{
+ int i;
+ for (i = ETH_ALEN - 1; i >= 0; --i) {
+ dst_mac[i] = src_mac & 0xff;
+ src_mac >>= 8;
+ }
+}
+
+static int mlx4_en_uc_steer_add(struct mlx4_en_priv *priv,
+ unsigned char *mac, int *qpn, u64 *reg_id)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_dev *dev = mdev->dev;
+ int err;
+
+ switch (dev->caps.steering_mode) {
+ case MLX4_STEERING_MODE_B0: {
+ struct mlx4_qp qp;
+ u8 gid[16] = {0};
+
+ qp.qpn = *qpn;
+ memcpy(&gid[10], mac, ETH_ALEN);
+ gid[5] = priv->port;
+
+ err = mlx4_unicast_attach(dev, &qp, gid, 0, MLX4_PROT_ETH);
+ break;
+ }
+ case MLX4_STEERING_MODE_DEVICE_MANAGED: {
+ struct mlx4_spec_list spec_eth = { {NULL} };
+ __be64 mac_mask = cpu_to_be64(MLX4_MAC_MASK << 16);
+
+ struct mlx4_net_trans_rule rule = {
+ .queue_mode = MLX4_NET_TRANS_Q_FIFO,
+ .exclusive = 0,
+ .allow_loopback = 1,
+ .promisc_mode = MLX4_FS_REGULAR,
+ .priority = MLX4_DOMAIN_NIC,
+ };
+
+ rule.port = priv->port;
+ rule.qpn = *qpn;
+ INIT_LIST_HEAD(&rule.list);
+
+ spec_eth.id = MLX4_NET_TRANS_RULE_ID_ETH;
+ memcpy(spec_eth.eth.dst_mac, mac, ETH_ALEN);
+ memcpy(spec_eth.eth.dst_mac_msk, &mac_mask, ETH_ALEN);
+ list_add_tail(&spec_eth.list, &rule.list);
+
+ err = mlx4_flow_attach(dev, &rule, reg_id);
+ break;
+ }
+ default:
+ return -EINVAL;
+ }
+ if (err)
+ en_warn(priv, "Failed Attaching Unicast\n");
+
+ return err;
+}
+
+static int mlx4_en_tunnel_steer_add(struct mlx4_en_priv *priv, unsigned char *addr,
+ int qpn, u64 *reg_id)
+{
+ int err;
+
+ if (priv->mdev->dev->caps.tunnel_offload_mode != MLX4_TUNNEL_OFFLOAD_MODE_VXLAN ||
+ priv->mdev->dev->caps.dmfs_high_steer_mode == MLX4_STEERING_DMFS_A0_STATIC)
+ return 0; /* do nothing */
+
+ err = mlx4_tunnel_steer_add(priv->mdev->dev, addr, priv->port, qpn,
+ MLX4_DOMAIN_NIC, reg_id);
+ if (err) {
+ en_err(priv, "failed to add vxlan steering rule, err %d\n", err);
+ return err;
+ }
+ en_dbg(DRV, priv, "added vxlan steering rule, mac %pM reg_id %llx\n", addr, *reg_id);
+ return 0;
+}
+
+static void mlx4_en_uc_steer_release(struct mlx4_en_priv *priv,
+ unsigned char *mac, int qpn, u64 reg_id)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_dev *dev = mdev->dev;
+
+ switch (dev->caps.steering_mode) {
+ case MLX4_STEERING_MODE_B0: {
+ struct mlx4_qp qp;
+ u8 gid[16] = {0};
+
+ qp.qpn = qpn;
+ memcpy(&gid[10], mac, ETH_ALEN);
+ gid[5] = priv->port;
+
+ mlx4_unicast_detach(dev, &qp, gid, MLX4_PROT_ETH);
+ break;
+ }
+ case MLX4_STEERING_MODE_DEVICE_MANAGED: {
+ mlx4_flow_detach(dev, reg_id);
+ break;
+ }
+ default:
+ en_err(priv, "Invalid steering mode.\n");
+ }
+}
+
+
+static int mlx4_en_get_qp(struct mlx4_en_priv *priv)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_dev *dev = mdev->dev;
+ struct mlx4_mac_entry *entry;
+ int index = 0;
+ int err = 0;
+ u64 reg_id = 0;
+ int *qpn = &priv->base_qpn;
+ u64 mac = mlx4_mac_to_u64(priv->current_mac);
+
+ en_dbg(DRV, priv, "Registering MAC: %pM for adding\n",
+ priv->current_mac);
+ index = mlx4_register_mac(dev, priv->port, mac);
+ if (index < 0) {
+ err = index;
+ en_err(priv, "Failed adding MAC: %pM\n",
+ priv->current_mac);
+ return err;
+ }
+
+ if (dev->caps.steering_mode == MLX4_STEERING_MODE_A0) {
+ int base_qpn = mlx4_get_base_qpn(dev, priv->port);
+ *qpn = base_qpn + index;
+ return 0;
+ }
+
+ err = mlx4_qp_reserve_range(dev, 1, 1, qpn, MLX4_RESERVE_A0_QP);
+ en_dbg(DRV, priv, "Reserved qp %d\n", *qpn);
+ if (err) {
+ en_err(priv, "Failed to reserve qp for mac registration\n");
+ goto qp_err;
+ }
+
+ err = mlx4_en_uc_steer_add(priv, priv->current_mac, qpn, &reg_id);
+ if (err)
+ goto steer_err;
+
+ err = mlx4_en_tunnel_steer_add(priv, priv->current_mac, *qpn,
+ &priv->tunnel_reg_id);
+ if (err)
+ goto tunnel_err;
+
+ entry = kmalloc(sizeof(*entry), GFP_KERNEL);
+ if (!entry) {
+ err = -ENOMEM;
+ goto alloc_err;
+ }
+ memcpy(entry->mac, priv->current_mac, sizeof(entry->mac));
+ memcpy(priv->current_mac, entry->mac, sizeof(priv->current_mac));
+ entry->reg_id = reg_id;
+
+ hlist_add_head(&entry->hlist,
+ &priv->mac_hash[entry->mac[MLX4_EN_MAC_HASH_IDX]]);
+
+ return 0;
+
+alloc_err:
+ if (priv->tunnel_reg_id)
+ mlx4_flow_detach(priv->mdev->dev, priv->tunnel_reg_id);
+tunnel_err:
+ mlx4_en_uc_steer_release(priv, priv->current_mac, *qpn, reg_id);
+
+steer_err:
+ mlx4_qp_release_range(dev, *qpn, 1);
+
+qp_err:
+ mlx4_unregister_mac(dev, priv->port, mac);
+ return err;
+}
+
+static inline void mlx4_en_update_rx_prod_db(struct mlx4_en_rx_ring *ring)
+{
+ *ring->wqres.db.db = cpu_to_be32(ring->prod & 0xffff);
+}
+
+static int mlx4_en_config_rss_qp(struct mlx4_en_priv *priv, int qpn,
+ struct mlx4_en_rx_ring *ring,
+ enum mlx4_qp_state *state,
+ struct mlx4_qp *qp)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_qp_context *context;
+ int err = 0;
+
+ context = kmalloc(sizeof(*context), GFP_KERNEL);
+ if (!context)
+ return -ENOMEM;
+
+ err = mlx4_qp_alloc(mdev->dev, qpn, qp, GFP_KERNEL);
+ if (err) {
+ en_err(priv, "Failed to allocate qp #%x\n", qpn);
+ goto out;
+ }
+ qp->event = mlx4_en_sqp_event;
+
+ memset(context, 0, sizeof *context);
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ mlx4_en_fill_qp_context(priv, ring->actual_size, ring->stride, 0, 0,
+ qpn, ring->cqn, -1, context);
+#else
+ mlx4_en_fill_qp_context(priv, ring->actual_size, ring->stride, 0, 0,
+ qpn, ring->rx_cq.mcq.cqn, context);
+#endif
+ context->db_rec_addr = cpu_to_be64(ring->wqres.db.dma);
+
+ /* Cancel FCS removal if FW allows */
+ if (mdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_FCS_KEEP) {
+ context->param3 |= cpu_to_be32(1 << 29);
+#ifdef HAVE_NETIF_F_RXFCS
+ if (priv->dev->features & NETIF_F_RXFCS)
+#else
+ //if (priv->pflags & MLX4_EN_PRIV_FLAGS_RXFCS)
+#endif
+ ring->fcs_del = 0;
+ //else
+ // ring->fcs_del = ETH_FCS_LEN;
+ } else
+ ring->fcs_del = 0;
+
+ err = mlx4_qp_to_ready(mdev->dev, &ring->wqres.mtt, context, qp, state);
+ if (err) {
+ mlx4_qp_remove(mdev->dev, qp);
+ mlx4_qp_free(mdev->dev, qp);
+ }
+ mlx4_en_update_rx_prod_db(ring);
+out:
+ kfree(context);
+ return err;
+}
+
+/* Allocate rx qp's and configure them according to rss map */
+static int mlx4_en_config_rss_steer(struct rte_eth_dev *dev)
+{
+ struct mlx4_en_priv* priv = rtedev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_en_rss_map *rss_map = &priv->rss_map;
+ struct mlx4_qp_context context;
+ struct mlx4_rss_context *rss_context;
+ int rss_rings;
+ void *ptr;
+ u8 rss_mask = (MLX4_RSS_IPV4 | MLX4_RSS_TCP_IPV4 | MLX4_RSS_IPV6 |
+ MLX4_RSS_TCP_IPV6);
+ int i, qpn;
+ int err = 0;
+ int good_qps = 0;
+#ifndef HAVE_NETDEV_RSS_KEY_FILL
+ static const u32 rsskey[MLX4_EN_RSS_KEY_SIZE] = { 0xD181C62C, 0xF7F4DB5B, 0x1983A2FC,
+ 0x943E1ADB, 0xD9389E6B, 0xD1039C2C, 0xA74499AD,
+ 0x593D56D9, 0xF3253C06, 0x2ADC1FFC};
+#endif
+
+ en_dbg(DRV, priv, "Configuring rss steering\n");
+ err = mlx4_qp_reserve_range(mdev->dev, dev->data->nb_rx_queues,
+ dev->data->nb_rx_queues,
+ &rss_map->base_qpn, 0);
+ if (err) {
+ en_err(priv, "Failed reserving %d qps\n", dev->data->nb_rx_queues);
+ return err;
+ }
+
+ for (i = 0; i < dev->data->nb_rx_queues; i++) {
+ qpn = rss_map->base_qpn + i;
+ err = mlx4_en_config_rss_qp(priv, qpn, dev->data->rx_queues[i],
+ &rss_map->state[i],
+ &rss_map->qps[i]);
+ if (err)
+ goto rss_err;
+
+ ++good_qps;
+ }
+
+ /* Configure RSS indirection qp */
+ err = mlx4_qp_alloc(mdev->dev, priv->base_qpn, &rss_map->indir_qp, GFP_KERNEL);
+ if (err) {
+ en_err(priv, "Failed to allocate RSS indirection QP\n");
+ goto rss_err;
+ }
+ rss_map->indir_qp.event = mlx4_en_sqp_event;
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ mlx4_en_fill_qp_context(priv, 0, 0, 0, 1, priv->base_qpn,
+ priv->rx_ring[0]->cqn, -1, &context);
+#else
+ mlx4_en_fill_qp_context(priv, 0, 0, 0, 1, priv->base_qpn,
+ ((struct mlx4_en_rx_ring*)(dev->data->rx_queues[0]))->rx_cq.mcq.cqn, &context);
+#endif
+
+ rss_rings = dev->data->nb_rx_queues;
+ /*
+ if (!priv->prof->rss_rings || priv->prof->rss_rings > dev->data->nb_rx_queues)
+ rss_rings = dev->data->nb_rx_queues;
+ else
+ rss_rings = priv->prof->rss_rings;
+ */
+
+ ptr = ((void *) &context) + offsetof(struct mlx4_qp_context, pri_path)
+ + MLX4_RSS_OFFSET_IN_QPC_PRI_PATH;
+ rss_context = ptr;
+ rss_context->base_qpn = cpu_to_be32(ilog2(rss_rings) << 24 |
+ (rss_map->base_qpn));
+ rss_context->default_qpn = cpu_to_be32(rss_map->base_qpn);
+ if (priv->mdev->profile.udp_rss) {
+ rss_mask |= MLX4_RSS_UDP_IPV4 | MLX4_RSS_UDP_IPV6;
+ rss_context->base_qpn_udp = rss_context->default_qpn;
+ }
+
+ if (mdev->dev->caps.tunnel_offload_mode == MLX4_TUNNEL_OFFLOAD_MODE_VXLAN) {
+ en_info(priv, "Setting RSS context tunnel type to RSS on inner headers\n");
+ rss_mask |= MLX4_RSS_BY_INNER_HEADERS;
+ }
+
+ rss_context->flags = rss_mask;
+ rss_context->hash_fn = MLX4_RSS_HASH_TOP;
+#ifdef HAVE_ETH_SS_RSS_HASH_FUNCS
+ if (priv->rss_hash_fn == ETH_RSS_HASH_XOR) {
+ rss_context->hash_fn = MLX4_RSS_HASH_XOR;
+ } else if (priv->rss_hash_fn == ETH_RSS_HASH_TOP) {
+ rss_context->hash_fn = MLX4_RSS_HASH_TOP;
+ memcpy(rss_context->rss_key, priv->rss_key,
+ MLX4_EN_RSS_KEY_SIZE);
+#ifdef HAVE_NETDEV_RSS_KEY_FILL
+ netdev_rss_key_fill(rss_context->rss_key,
+ MLX4_EN_RSS_KEY_SIZE);
+#else
+ for (i = 0; i < MLX4_EN_RSS_KEY_SIZE; i++)
+ rss_context->rss_key[i] = cpu_to_be32(rsskey[i]);
+#endif
+ } else {
+ en_err(priv, "Unknown RSS hash function requested\n");
+ err = -EINVAL;
+ goto indir_err;
+ }
+#else
+#ifndef HAVE_NETDEV_RSS_KEY_FILL
+ for (i = 0; i < MLX4_EN_RSS_KEY_SIZE; i++)
+ rss_context->rss_key[i] = cpu_to_be32(rsskey[i]);
+#else
+ memcpy(rss_context->rss_key, priv->rss_key, MLX4_EN_RSS_KEY_SIZE);
+#endif
+#endif
+ err = mlx4_qp_to_ready(mdev->dev, &priv->res.mtt, &context,
+ &rss_map->indir_qp, &rss_map->indir_state);
+ if (err)
+ goto indir_err;
+
+ return 0;
+
+indir_err:
+ mlx4_qp_modify(mdev->dev, NULL, rss_map->indir_state,
+ MLX4_QP_STATE_RST, NULL, 0, 0, &rss_map->indir_qp);
+ mlx4_qp_remove(mdev->dev, &rss_map->indir_qp);
+ mlx4_qp_free(mdev->dev, &rss_map->indir_qp);
+rss_err:
+ for (i = 0; i < good_qps; i++) {
+ mlx4_qp_modify(mdev->dev, NULL, rss_map->state[i],
+ MLX4_QP_STATE_RST, NULL, 0, 0, &rss_map->qps[i]);
+ mlx4_qp_remove(mdev->dev, &rss_map->qps[i]);
+ mlx4_qp_free(mdev->dev, &rss_map->qps[i]);
+ }
+ mlx4_qp_release_range(mdev->dev, rss_map->base_qpn, dev->data->nb_rx_queues);
+ return err;
+}
+
+static int mlx4_en_create_drop_qp(struct mlx4_en_priv *priv)
+{
+ int err;
+ u32 qpn;
+
+ err = mlx4_qp_reserve_range(priv->mdev->dev, 1, 1, &qpn,
+ MLX4_RESERVE_A0_QP);
+ if (err) {
+ en_err(priv, "Failed reserving drop qpn\n");
+ return err;
+ }
+ err = mlx4_qp_alloc(priv->mdev->dev, qpn, &priv->drop_qp, GFP_KERNEL);
+ if (err) {
+ en_err(priv, "Failed allocating drop qp\n");
+ mlx4_qp_release_range(priv->mdev->dev, qpn, 1);
+ return err;
+ }
+
+ return 0;
+}
+
+static struct mlx4_en_tx_desc *mlx4_en_bounce_to_desc(struct mlx4_en_priv *priv,
+ struct mlx4_en_tx_ring *ring,
+ u32 index,
+ unsigned int desc_size)
+{
+ u32 copy = (ring->size - index) * TXBB_SIZE;
+ int i;
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ __be32 owner_bit = (ring->prod & ring->size) ?
+ cpu_to_be32(MLX4_EN_BIT_DESC_OWN) : 0;
+#endif
+
+ for (i = desc_size - copy - 4; i >= 0; i -= 4) {
+ if ((i & (TXBB_SIZE - 1)) == 0) {
+ wmb();
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ *((u32 *) (ring->buf + i)) =
+ (*((u32 *) (ring->bounce_buf + copy + i)) &
+ WQE_FORMAT_1_MASK) |
+ owner_bit;
+ continue;
+#endif
+ }
+
+ *((u32 *) (ring->buf + i)) =
+ *((u32 *) (ring->bounce_buf + copy + i));
+ }
+
+ for (i = copy - 4; i >= 4; i -= 4) {
+ if ((i & (TXBB_SIZE - 1)) == 0)
+ wmb();
+
+ *((u32 *) (ring->buf + index * TXBB_SIZE + i)) =
+ *((u32 *) (ring->bounce_buf + i));
+ }
+
+ /* Return real descriptor location */
+ return ring->buf + index * TXBB_SIZE;
+}
+
+static int mlx4_txq_is_full(struct mlx4_en_tx_ring* ring)
+{
+ int stop_queue = (int)(ring->prod - ring->cons) > (ring->size - HEADROOM - MAX_DESC_TXBBS);
+ return stop_queue;
+}
+
+static u32 mlx4_en_free_tx_desc(struct mlx4_en_tx_ring *ring,
+ int index, u8 owner, u64 timestamp)
+{
+ struct mlx4_en_tx_info *tx_info = &ring->tx_info[index];
+ struct mlx4_en_tx_desc *tx_desc = ring->buf + index * TXBB_SIZE;
+ struct mlx4_wqe_data_seg *data = &tx_desc->data;
+ void *end = ring->buf + ring->buf_size;
+ struct rte_mbuf* mbuf = tx_info->mbuf;
+ int i;
+
+ if(timestamp)
+ {
+ if(ring->tx_tstamp_callback)
+ {
+ ring->tx_tstamp_callback(timestamp, mbuf, ring->tx_tstamp_callback_arg);
+ }
+ }
+
+ rte_pktmbuf_free(mbuf);
+ return tx_info->nr_txbb;
+}
+
+static void mlx4_en_stamp_wqe(struct mlx4_en_priv *priv,
+ struct mlx4_en_tx_ring *ring, int index,
+ u8 owner)
+{
+ __be32 stamp = cpu_to_be32(STAMP_VAL | (!!owner << STAMP_SHIFT));
+ struct mlx4_en_tx_desc *tx_desc = ring->buf + index * TXBB_SIZE;
+ struct mlx4_en_tx_info *tx_info = &ring->tx_info[index];
+ void *end = ring->buf + ring->buf_size;
+ __be32 *ptr = (__be32 *)tx_desc;
+ int i;
+
+ /* Optimize the common case when there are no wraparounds */
+ if (likely((void *)tx_desc + tx_info->nr_txbb * TXBB_SIZE <= end)) {
+ /* Stamp the freed descriptor */
+ for (i = 0; i < tx_info->nr_txbb * TXBB_SIZE;
+ i += STAMP_STRIDE) {
+ *ptr = stamp;
+ ptr += STAMP_DWORDS;
+ }
+ } else {
+ /* Stamp the freed descriptor */
+ for (i = 0; i < tx_info->nr_txbb * TXBB_SIZE;
+ i += STAMP_STRIDE) {
+ *ptr = stamp;
+ ptr += STAMP_DWORDS;
+ if ((void *)ptr >= end) {
+ ptr = ring->buf;
+ stamp ^= cpu_to_be32(0x80000000);
+ }
+ }
+ }
+}
+
+static int mlx4_en_process_tx_cq(struct mlx4_en_tx_ring *ring)
+{
+ struct mlx4_en_cq* cq = &ring->tx_cq;
+ struct mlx4_cq *mcq = &cq->mcq;
+ struct rte_eth_dev* dev = cq->rte_dev;
+ struct mlx4_en_priv *priv = rtedev_priv(dev);
+ struct mlx4_cqe *cqe;
+ u16 index;
+ u16 new_index, ring_index, stamp_index;
+ u32 txbbs_skipped = 0;
+#ifndef CONFIG_INFINIBAND_WQE_FORMAT
+ u32 txbbs_stamp = 0;
+#endif
+ u32 cons_index = mcq->cons_index;
+ int size = cq->size;
+ u32 size_mask = ring->size_mask;
+ struct mlx4_cqe *buf = cq->buf;
+ u32 packets = 0;
+ u32 bytes = 0;
+ int factor = priv->cqe_factor;
+ u64 timestamp = 0;
+ int done = 0;
+ u32 last_nr_txbb;
+ u32 ring_cons;
+
+ index = cons_index & size_mask;
+ cqe = mlx4_en_get_cqe(buf, index, priv->cqe_size) + factor;
+ last_nr_txbb = ACCESS_ONCE(ring->last_nr_txbb);
+ ring_cons = ACCESS_ONCE(ring->cons);
+ ring_index = ring_cons & size_mask;
+ stamp_index = ring_index;
+
+ /* Process all completed CQEs */
+ while (XNOR(cqe->owner_sr_opcode & MLX4_CQE_OWNER_MASK,
+ cons_index & size)) {
+ /*
+ * make sure we read the CQE after we read the
+ * ownership bit
+ */
+ rmb();
+
+ if (unlikely((cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK) ==
+ MLX4_CQE_OPCODE_ERROR)) {
+ struct mlx4_err_cqe *cqe_err = (struct mlx4_err_cqe *)cqe;
+
+ en_err(priv, "CQE error - vendor syndrome: 0x%x syndrome: 0x%x\n",
+ cqe_err->vendor_err_syndrome,
+ cqe_err->syndrome);
+ }
+
+ /* Skip over last polled CQE */
+ new_index = be16_to_cpu(cqe->wqe_index) & size_mask;
+
+ do {
+ txbbs_skipped += last_nr_txbb;
+ ring_index = (ring_index + last_nr_txbb) & size_mask;
+ if (ring->enable_hwtstamp)
+ {
+ timestamp = mlx4_en_get_cqe_ts(cqe);
+ }
+
+ /* free next descriptor */
+ last_nr_txbb = mlx4_en_free_tx_desc(
+ ring, ring_index,
+ !!((ring_cons + txbbs_skipped) &
+ ring->size), timestamp);
+
+ mlx4_en_stamp_wqe(priv, ring, stamp_index,
+ !!((ring_cons + txbbs_stamp) &
+ ring->size));
+ stamp_index = ring_index;
+ txbbs_stamp = txbbs_skipped;
+ ++done;
+ } while ((ring_index != new_index));
+
+ ++cons_index;
+ index = cons_index & size_mask;
+ cqe = mlx4_en_get_cqe(buf, index, priv->cqe_size) + factor;
+ }
+
+
+ /*
+ * To prevent CQ overflow we first update CQ consumer and only then
+ * the ring consumer.
+ */
+ mcq->cons_index = cons_index;
+ mlx4_cq_set_ci(mcq);
+ wmb();
+
+ /* we want to dirty this cache line once */
+ ACCESS_ONCE(ring->last_nr_txbb) = last_nr_txbb;
+ ACCESS_ONCE(ring->cons) = ring_cons + txbbs_skipped;
+
+ return done;
+}
+static int mlx4_en_prepare_rx_desc(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_ring *ring, int index)
+{
+ struct mlx4_en_rx_desc *rx_desc = ring->buf + (index * ring->stride);
+ //struct rte_mbuf **frags = ring->rx_info + index;
+
+ {
+ dma_addr_t dma;
+ int i;
+ struct rte_mbuf* mbuf = NULL;
+
+ for (i = 0; i < ring->num_frags; i++) {
+ mbuf = rte_pktmbuf_alloc(ring->mb_pool);
+ assert(mbuf);
+ assert((mbuf->buf_len - mbuf->data_off) >= ring->frag_size);
+ dma = mbuf->buf_physaddr + mbuf->data_off;
+ rx_desc->data[i].addr = cpu_to_be64(dma);
+ ring->rx_info[ring->num_frags*index + i] = mbuf;
+ }
+
+ return 0;
+ }
+}
+static void mlx4_en_refill_rx_buffers(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_ring *ring)
+{
+ int index = ring->prod & ring->size_mask;
+
+ while ((u32) (ring->prod - ring->cons) < ring->actual_size) {
+ if (mlx4_en_prepare_rx_desc(priv, ring, index))
+ break;
+ ring->prod++;
+ index = ring->prod & ring->size_mask;
+ }
+}
+
+static struct rte_mbuf* mlx4_en_complete_rx_desc(
+ struct mlx4_en_rx_ring *ring,
+ struct mlx4_en_rx_desc *rx_desc,
+ struct rte_mbuf **mbuf_frags,
+ int length)
+{
+ int nr;
+ int remaining = length;
+
+ /* Collect used fragments while replacing them in the HW descriptors */
+ struct rte_mbuf* head = mbuf_frags[0];
+ struct rte_mbuf* prev = 0;
+ struct rte_mbuf* mbuf = 0;
+ int frags = 0;
+ for (nr = 0; nr < ring->num_frags; nr++) {
+ mbuf = mbuf_frags[nr];
+ if(remaining == 0)
+ {
+ rte_pktmbuf_free(mbuf);
+ continue;
+ }
+ ++frags;
+ int frag_len = ring->frag_size;
+ if(remaining < frag_len)
+ mbuf->data_len = remaining;
+ else
+ mbuf->data_len = frag_len;
+ remaining -= mbuf->data_len;
+
+ if(prev)
+ prev->next = mbuf;
+ prev = mbuf;
+ }
+ head->nb_segs = frags;
+ head->pkt_len = length;
+ return head;
+}
+
+static int mlx4_en_process_rx_cq(struct mlx4_en_rx_ring *ring, struct rte_mbuf** ret_array, int budget)
+{
+ int received = 0;
+ struct mlx4_en_cq* cq = &ring->rx_cq;
+ struct rte_eth_dev* dev = cq->rte_dev;
+ struct mlx4_en_priv *priv = rtedev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_cqe *cqe;
+ struct rte_mbuf **mbuf_frag;
+ struct rte_mbuf *mbuf = 0;
+ struct mlx4_en_rx_desc *rx_desc;
+ int index;
+ int nr;
+ unsigned int length;
+ int polled = 0;
+ u64 ol_flags = 0;
+ int factor = priv->cqe_factor;
+ u64 timestamp;
+#ifdef HAVE_NETDEV_HW_ENC_FEATURES
+ bool l2_tunnel;
+#endif
+
+
+ /* We assume a 1:1 mapping between CQEs and Rx descriptors, so Rx
+ * descriptor offset can be deduced from the CQE index instead of
+ * reading 'cqe->index' */
+ index = cq->mcq.cons_index & ring->size_mask;
+ cqe = mlx4_en_get_cqe(cq->buf, index, priv->cqe_size) + factor;
+
+ /* Process all completed CQEs */
+ while (XNOR(cqe->owner_sr_opcode & MLX4_CQE_OWNER_MASK,
+ cq->mcq.cons_index & cq->size)) {
+
+ mbuf_frag = ring->rx_info + (index * ring->num_frags);
+ rx_desc = ring->buf + (index << ring->log_stride);
+
+ /*
+ * make sure we read the CQE after we read the ownership bit
+ */
+ rmb();
+
+ /* Drop packet on bad receive or bad checksum */
+ if (unlikely((cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK) ==
+ MLX4_CQE_OPCODE_ERROR)) {
+ en_err(priv, "CQE completed in error - vendor syndrome:%d syndrome:%d\n",
+ ((struct mlx4_err_cqe *)cqe)->vendor_err_syndrome,
+ ((struct mlx4_err_cqe *)cqe)->syndrome);
+ goto next;
+ }
+ if (unlikely(cqe->badfcs_enc & MLX4_CQE_BAD_FCS)) {
+ en_dbg(RX_ERR, priv, "Accepted frame with bad FCS\n");
+ goto next;
+ }
+
+ length = be32_to_cpu(cqe->byte_cnt);
+ length -= ring->fcs_del;
+
+ //if (cqe->owner_sr_opcode & MLX4_CQE_IS_RECV_MASK)
+ //mlx4_en_inline_scatter(ring, frags,
+ // rx_desc, priv, length);
+
+
+ /*
+ * Packet is OK - process it.
+ */
+
+
+ if (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_TCP |
+ MLX4_CQE_STATUS_UDP)) {
+ if ((cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IPOK)) &&
+ cqe->checksum == cpu_to_be16(0xffff)) {
+ ol_flags |= PKT_TX_IP_CKSUM;
+ if(cqe->status & cpu_to_be16(MLX4_CQE_STATUS_TCP))
+ ol_flags |= PKT_TX_TCP_CKSUM;
+ else if(cqe->status & cpu_to_be16(MLX4_CQE_STATUS_UDP))
+ ol_flags |= PKT_TX_UDP_CKSUM;
+
+ }
+ }
+
+ /* GRO not possible, complete processing here */
+ mbuf = mlx4_en_complete_rx_desc(ring, rx_desc, mbuf_frag, length);
+ if (!mbuf) {
+ goto next;
+ }
+
+ mbuf->ol_flags = ol_flags;
+
+/*
+ if ((be32_to_cpu(cqe->vlan_my_qpn) &
+ MLX4_CQE_VLAN_PRESENT_MASK) &&
+ (dev->features & NETIF_F_HW_VLAN_CTAG_RX)) {
+ __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), be16_to_cpu(cqe->sl_vid));
+ }
+ */
+
+ if (ring->enable_hwtstamp) {
+ timestamp = mlx4_en_get_cqe_ts(cqe);
+ mbuf->udata64 = timestamp;
+ }
+
+ ret_array[received++] = mbuf;
+
+next:
+ //for (nr = 0; nr < priv->num_frags; nr++)
+ // mlx4_en_free_frag(priv, frags, nr);
+
+ ++cq->mcq.cons_index;
+ index = (cq->mcq.cons_index) & ring->size_mask;
+ cqe = mlx4_en_get_cqe(cq->buf, index, priv->cqe_size) + factor;
+ if (++polled == budget)
+ goto out;
+ }
+
+out:
+ mlx4_cq_set_ci(&cq->mcq);
+ wmb(); /* ensure HW sees CQ consumer before we post new buffers */
+ ring->cons = cq->mcq.cons_index;
+ mlx4_en_refill_rx_buffers(priv, ring);
+ mlx4_en_update_rx_prod_db(ring);
+ return polled;
+}
+
+#endif /* DRIVERS_NET_MLNX_UIO_INCLUDE_MLX4_UIO_HELPER_H_ */
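
Note on the CQ polling loops above: both mlx4_en_process_tx_cq() and mlx4_en_process_rx_cq() keep consuming entries while XNOR(owner bit, cons_index & size) holds. The block below is a minimal standalone sketch of that test and is not part of the patch. The sketch_* names are hypothetical, the 0x80 owner mask is an assumption about MLX4_CQE_OWNER_MASK, and the ring size is assumed to be a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position of the owner flag in owner_sr_opcode. */
#define SKETCH_CQE_OWNER_MASK 0x80u

/* True when both arguments are zero or both are non-zero. */
static inline bool sketch_xnor(uint32_t a, uint32_t b)
{
	return !a == !b;
}

/*
 * Decide whether software may consume the CQE at cons_index.
 * ring_size must be a power of two, so (cons_index & ring_size) isolates
 * the bit that toggles each time the consumer index wraps the ring.
 */
static inline bool sketch_cqe_is_ready(uint8_t owner_sr_opcode,
				       uint32_t cons_index,
				       uint32_t ring_size)
{
	return sketch_xnor(owner_sr_opcode & SKETCH_CQE_OWNER_MASK,
			   cons_index & ring_size);
}
```

With a ring of 4 entries, indices 0..3 match entries whose owner bit is clear, indices 4..7 match entries whose owner bit is set, and so on. The expected owner value therefore flips on every wrap, which lets the loop tell fresh completions from stale ones without clearing consumed entries.
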
diff --git a/drivers/net/mlnx_uio/include/module.h b/drivers/net/mlnx_uio/include/module.h
new file mode 100644
index 0000000..01b8358
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/module.h
@@ -0,0 +1,12 @@
+/*
+ * module.h
+ *
+ * Created on: Jun 24, 2015
+ * Author: leeopop
+ */
+
+#ifndef DRIVERS_NET_MLNX_UIO_INCLUDE_MODULE_H_
+#define DRIVERS_NET_MLNX_UIO_INCLUDE_MODULE_H_
+
+
+#endif /* DRIVERS_NET_MLNX_UIO_INCLUDE_MODULE_H_ */
diff --git a/drivers/net/mlnx_uio/include/netdev_features.h b/drivers/net/mlnx_uio/include/netdev_features.h
new file mode 100644
index 0000000..8d82064
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/netdev_features.h
@@ -0,0 +1,166 @@
+/*
+ * Network device features.
+ *
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#ifndef _LINUX_NETDEV_FEATURES_H
+#define _LINUX_NETDEV_FEATURES_H
+
+enum netdev_features{
+ NETIF_F_SG_BIT, /* Scatter/gather IO. */
+ NETIF_F_IP_CSUM_BIT, /* Can checksum TCP/UDP over IPv4. */
+ __UNUSED_NETIF_F_1,
+ NETIF_F_HW_CSUM_BIT, /* Can checksum all the packets. */
+ NETIF_F_IPV6_CSUM_BIT, /* Can checksum TCP/UDP over IPV6 */
+ NETIF_F_HIGHDMA_BIT, /* Can DMA to high memory. */
+ NETIF_F_FRAGLIST_BIT, /* Scatter/gather IO. */
+ NETIF_F_HW_VLAN_CTAG_TX_BIT, /* Transmit VLAN CTAG HW acceleration */
+ NETIF_F_HW_VLAN_CTAG_RX_BIT, /* Receive VLAN CTAG HW acceleration */
+ NETIF_F_HW_VLAN_CTAG_FILTER_BIT,/* Receive filtering on VLAN CTAGs */
+ NETIF_F_VLAN_CHALLENGED_BIT, /* Device cannot handle VLAN packets */
+ NETIF_F_GSO_BIT, /* Enable software GSO. */
+ NETIF_F_LLTX_BIT, /* LockLess TX - deprecated. Please */
+ /* do not use LLTX in new drivers */
+ NETIF_F_NETNS_LOCAL_BIT, /* Does not change network namespaces */
+ NETIF_F_GRO_BIT, /* Generic receive offload */
+ NETIF_F_LRO_BIT, /* large receive offload */
+
+ /**/NETIF_F_GSO_SHIFT, /* keep the order of SKB_GSO_* bits */
+ NETIF_F_TSO_BIT /* ... TCPv4 segmentation */
+ = NETIF_F_GSO_SHIFT,
+ NETIF_F_UFO_BIT, /* ... UDPv4 fragmentation */
+ NETIF_F_GSO_ROBUST_BIT, /* ... ->SKB_GSO_DODGY */
+ NETIF_F_TSO_ECN_BIT, /* ... TCP ECN support */
+ NETIF_F_TSO6_BIT, /* ... TCPv6 segmentation */
+ NETIF_F_FSO_BIT, /* ... FCoE segmentation */
+ NETIF_F_GSO_GRE_BIT, /* ... GRE with TSO */
+ NETIF_F_GSO_IPIP_BIT, /* ... IPIP tunnel with TSO */
+ NETIF_F_GSO_SIT_BIT, /* ... SIT tunnel with TSO */
+ NETIF_F_GSO_UDP_TUNNEL_BIT, /* ... UDP TUNNEL with TSO */
+ NETIF_F_GSO_MPLS_BIT, /* ... MPLS segmentation */
+ /**/NETIF_F_GSO_LAST = /* last bit, see GSO_MASK */
+ NETIF_F_GSO_MPLS_BIT,
+
+ NETIF_F_FCOE_CRC_BIT, /* FCoE CRC32 */
+ NETIF_F_SCTP_CSUM_BIT, /* SCTP checksum offload */
+ NETIF_F_FCOE_MTU_BIT, /* Supports max FCoE MTU, 2158 bytes*/
+ NETIF_F_NTUPLE_BIT, /* N-tuple filters supported */
+ NETIF_F_RXHASH_BIT, /* Receive hashing offload */
+ NETIF_F_RXCSUM_BIT, /* Receive checksumming offload */
+ NETIF_F_NOCACHE_COPY_BIT, /* Use no-cache copyfromuser */
+ NETIF_F_LOOPBACK_BIT, /* Enable loopback */
+ NETIF_F_RXFCS_BIT, /* Append FCS to skb pkt data */
+ NETIF_F_RXALL_BIT, /* Receive errored frames too */
+ NETIF_F_HW_VLAN_STAG_TX_BIT, /* Transmit VLAN STAG HW acceleration */
+ NETIF_F_HW_VLAN_STAG_RX_BIT, /* Receive VLAN STAG HW acceleration */
+ NETIF_F_HW_VLAN_STAG_FILTER_BIT,/* Receive filtering on VLAN STAGs */
+ NETIF_F_HW_L2FW_DOFFLOAD_BIT, /* Allow L2 Forwarding in Hardware */
+ NETIF_F_HW_QDISC_BIT, /* Supports hardware Qdisc */
+
+ /*
+ * Add your fresh new feature above and remember to update
+ * netdev_features_strings[] in net/core/ethtool.c and maybe
+ * some feature mask #defines below. Please also describe it
+ * in Documentation/networking/netdev-features.txt.
+ */
+
+ /**/NETDEV_FEATURE_COUNT
+};
+
+typedef unsigned long long netdev_features_t; /* 64-bit mask: bit indexes above 31 do not fit an enum-sized type */
+
+/* copy'n'paste compression ;) */
+#define __NETIF_F_BIT(bit) ((netdev_features_t)1 << (bit))
+#define __NETIF_F(name) __NETIF_F_BIT(NETIF_F_##name##_BIT)
+
+#define NETIF_F_FCOE_CRC __NETIF_F(FCOE_CRC)
+#define NETIF_F_FCOE_MTU __NETIF_F(FCOE_MTU)
+#define NETIF_F_FRAGLIST __NETIF_F(FRAGLIST)
+#define NETIF_F_FSO __NETIF_F(FSO)
+#define NETIF_F_GRO __NETIF_F(GRO)
+#define NETIF_F_GSO __NETIF_F(GSO)
+#define NETIF_F_GSO_ROBUST __NETIF_F(GSO_ROBUST)
+#define NETIF_F_HIGHDMA __NETIF_F(HIGHDMA)
+#define NETIF_F_HW_CSUM __NETIF_F(HW_CSUM)
+#define NETIF_F_HW_VLAN_CTAG_FILTER __NETIF_F(HW_VLAN_CTAG_FILTER)
+#define NETIF_F_HW_VLAN_CTAG_RX __NETIF_F(HW_VLAN_CTAG_RX)
+#define NETIF_F_HW_VLAN_CTAG_TX __NETIF_F(HW_VLAN_CTAG_TX)
+#define NETIF_F_HW_QDISC __NETIF_F(HW_QDISC)
+#define NETIF_F_IP_CSUM __NETIF_F(IP_CSUM)
+#define NETIF_F_IPV6_CSUM __NETIF_F(IPV6_CSUM)
+#define NETIF_F_LLTX __NETIF_F(LLTX)
+#define NETIF_F_LOOPBACK __NETIF_F(LOOPBACK)
+#define NETIF_F_LRO __NETIF_F(LRO)
+#define NETIF_F_NETNS_LOCAL __NETIF_F(NETNS_LOCAL)
+#define NETIF_F_NOCACHE_COPY __NETIF_F(NOCACHE_COPY)
+#define NETIF_F_NTUPLE __NETIF_F(NTUPLE)
+#define NETIF_F_RXCSUM __NETIF_F(RXCSUM)
+#define NETIF_F_RXHASH __NETIF_F(RXHASH)
+#define NETIF_F_SCTP_CSUM __NETIF_F(SCTP_CSUM)
+#define NETIF_F_SG __NETIF_F(SG)
+#define NETIF_F_TSO6 __NETIF_F(TSO6)
+#define NETIF_F_TSO_ECN __NETIF_F(TSO_ECN)
+#define NETIF_F_TSO __NETIF_F(TSO)
+#define NETIF_F_UFO __NETIF_F(UFO)
+#define NETIF_F_VLAN_CHALLENGED __NETIF_F(VLAN_CHALLENGED)
+#define NETIF_F_RXFCS __NETIF_F(RXFCS)
+#define NETIF_F_RXALL __NETIF_F(RXALL)
+#define NETIF_F_GSO_GRE __NETIF_F(GSO_GRE)
+#define NETIF_F_GSO_IPIP __NETIF_F(GSO_IPIP)
+#define NETIF_F_GSO_SIT __NETIF_F(GSO_SIT)
+#define NETIF_F_GSO_UDP_TUNNEL __NETIF_F(GSO_UDP_TUNNEL)
+#define NETIF_F_GSO_MPLS __NETIF_F(GSO_MPLS)
+#define NETIF_F_HW_VLAN_STAG_FILTER __NETIF_F(HW_VLAN_STAG_FILTER)
+#define NETIF_F_HW_VLAN_STAG_RX __NETIF_F(HW_VLAN_STAG_RX)
+#define NETIF_F_HW_VLAN_STAG_TX __NETIF_F(HW_VLAN_STAG_TX)
+#define NETIF_F_HW_L2FW_DOFFLOAD __NETIF_F(HW_L2FW_DOFFLOAD)
+
+/* Features valid for ethtool to change */
+/* = all defined minus driver/device-class-related */
+#define NETIF_F_NEVER_CHANGE (NETIF_F_VLAN_CHALLENGED | \
+ NETIF_F_LLTX | NETIF_F_NETNS_LOCAL)
+
+/* remember that ((t)1 << t_BITS) is undefined in C99 */
+#define NETIF_F_ETHTOOL_BITS ((__NETIF_F_BIT(NETDEV_FEATURE_COUNT - 1) | \
+ (__NETIF_F_BIT(NETDEV_FEATURE_COUNT - 1) - 1)) & \
+ ~NETIF_F_NEVER_CHANGE)
+
+/* Segmentation offload feature mask */
+#define NETIF_F_GSO_MASK (__NETIF_F_BIT(NETIF_F_GSO_LAST + 1) - \
+ __NETIF_F_BIT(NETIF_F_GSO_SHIFT))
+
+/* List of features with software fallbacks. */
+#define NETIF_F_GSO_SOFTWARE (NETIF_F_TSO | NETIF_F_TSO_ECN | \
+ NETIF_F_TSO6 | NETIF_F_UFO)
+
+#define NETIF_F_GEN_CSUM NETIF_F_HW_CSUM
+#define NETIF_F_V4_CSUM (NETIF_F_GEN_CSUM | NETIF_F_IP_CSUM)
+#define NETIF_F_V6_CSUM (NETIF_F_GEN_CSUM | NETIF_F_IPV6_CSUM)
+#define NETIF_F_ALL_CSUM (NETIF_F_V4_CSUM | NETIF_F_V6_CSUM)
+
+#define NETIF_F_ALL_TSO (NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_TSO_ECN)
+
+#define NETIF_F_ALL_FCOE (NETIF_F_FCOE_CRC | NETIF_F_FCOE_MTU | \
+ NETIF_F_FSO)
+
+/*
+ * If one device supports one of these features, then enable them
+ * for all in netdev_increment_features.
+ */
+#define NETIF_F_ONE_FOR_ALL (NETIF_F_GSO_SOFTWARE | NETIF_F_GSO_ROBUST | \
+ NETIF_F_SG | NETIF_F_HIGHDMA | \
+ NETIF_F_FRAGLIST | NETIF_F_VLAN_CHALLENGED)
+/*
+ * If one device doesn't support one of these features, then disable it
+ * for all in netdev_increment_features.
+ */
+#define NETIF_F_ALL_FOR_ALL (NETIF_F_NOCACHE_COPY | NETIF_F_FSO)
+
+/* changeable features with no special hardware requirements */
+#define NETIF_F_SOFT_FEATURES (NETIF_F_GSO | NETIF_F_GRO)
+
+#endif /* _LINUX_NETDEV_FEATURES_H */
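Note on the netdev_features.h hunk above: the header pairs an enum of bit
indexes with masks built by shifting 1 by that index. The following is a
minimal standalone sketch of that pattern, not the header itself. All DEMO_*
and demo_* names and the bit positions are hypothetical, and the sketch
assumes a 64-bit mask type so that indexes above 31 stay well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit-index enum; positions are illustrative only. */
enum demo_feature_bit {
	DEMO_F_SG_BIT,				/* 0 */
	DEMO_F_GSO_SHIFT,			/* first segmentation bit */
	DEMO_F_TSO_BIT = DEMO_F_GSO_SHIFT,	/* 1 */
	DEMO_F_UFO_BIT,				/* 2 */
	DEMO_F_TSO6_BIT,			/* 3 */
	DEMO_F_GSO_LAST = DEMO_F_TSO6_BIT,	/* last segmentation bit */
	DEMO_F_RXCSUM_BIT = 40,			/* deliberately above bit 31 */
	DEMO_FEATURE_COUNT			/* 41 */
};

/* Assumption: the mask type is 64 bits wide. */
typedef uint64_t demo_features_t;

#define DEMO_F_BIT(bit)	((demo_features_t)1 << (bit))
#define DEMO_F(name)	DEMO_F_BIT(DEMO_F_##name##_BIT)

/* Contiguous range [GSO_SHIFT, GSO_LAST], same form as NETIF_F_GSO_MASK. */
#define DEMO_F_GSO_MASK	(DEMO_F_BIT(DEMO_F_GSO_LAST + 1) - \
			 DEMO_F_BIT(DEMO_F_GSO_SHIFT))

/* Every bit below DEMO_FEATURE_COUNT, without shifting by the full count. */
#define DEMO_F_ALL_BITS	(DEMO_F_BIT(DEMO_FEATURE_COUNT - 1) | \
			 (DEMO_F_BIT(DEMO_FEATURE_COUNT - 1) - 1))

/* True when every bit of 'feature' is present in 'set'. */
static int demo_has_feature(demo_features_t set, demo_features_t feature)
{
	assert(feature != 0);
	return (set & feature) == feature;
}

/* Keep only the segmentation-offload bits of 'set'. */
static demo_features_t demo_gso_only(demo_features_t set)
{
	return set & DEMO_F_GSO_MASK;
}
```

The mask type must be at least as wide as the highest bit index. With a
32-bit type, a shift by 32 or more is undefined behaviour in C.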
diff --git a/drivers/net/mlnx_uio/include/post_kmod.h b/drivers/net/mlnx_uio/include/post_kmod.h
new file mode 100644
index 0000000..f81e8ff
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/post_kmod.h
@@ -0,0 +1,13 @@
+/*
+ * post_kmod.h
+ *
+ * Created on: Jun 24, 2015
+ * Author: leeopop
+ */
+
+#ifndef DRIVERS_NET_MLNX_UIO_INCLUDE_POST_KMOD_H_
+#define DRIVERS_NET_MLNX_UIO_INCLUDE_POST_KMOD_H_
+
+
+
+#endif /* DRIVERS_NET_MLNX_UIO_INCLUDE_POST_KMOD_H_ */
diff --git a/drivers/net/mlnx_uio/include/radix-tree.h b/drivers/net/mlnx_uio/include/radix-tree.h
new file mode 100644
index 0000000..e63b212
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/radix-tree.h
@@ -0,0 +1,48 @@
+/*
+ * Copyright (C) 2001 Momchil Velikov
+ * Portions Copyright (C) 2001 Christoph Hellwig
+ * Copyright (C) 2006 Nick Piggin
+ * Copyright (C) 2012 Konstantin Khlebnikov
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+#ifndef _LINUX_RADIX_TREE_H
+#define _LINUX_RADIX_TREE_H
+
+#include <linux/types.h>
+#include <string.h> /* memset() used by INIT_RADIX_TREE() */
+/* number of slots in the flat value[] table below (no gfp_mask tags here) */
+#define RADIX_TREE_MAP_SIZE (1024*1024)
+
+struct radix_tree_root {
+ void* value[RADIX_TREE_MAP_SIZE];
+};
+
+#define RADIX_TREE_INIT(mask) {0,}
+
+#define RADIX_TREE(name, mask) \
+ struct radix_tree_root name = RADIX_TREE_INIT(mask)
+
+#define INIT_RADIX_TREE(root, mask) \
+do { \
+ memset((root)->value, 0, sizeof((root)->value)); \
+} while (0)
+
+int radix_tree_insert(struct radix_tree_root *, unsigned long, void *);
+void *radix_tree_lookup(struct radix_tree_root *, unsigned long);
+void **radix_tree_lookup_slot(struct radix_tree_root *, unsigned long);
+void *radix_tree_delete(struct radix_tree_root *, unsigned long);
+
+#endif /* _LINUX_RADIX_TREE_H */
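Note on the radix-tree.h hunk above: the header replaces the kernel radix
tree with one flat array of pointers indexed directly by key. The functions
behind the four prototypes are not part of this hunk, so the following is
only a sketch of how such a table could behave. Every demo_* name, the small
DEMO_MAP_SIZE, and the choice of error codes are assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAP_SIZE (1024)	/* hypothetical; the header uses 1024*1024 slots */

struct demo_radix_root {
	void *value[DEMO_MAP_SIZE];	/* NULL marks an empty slot */
};

static void demo_radix_init(struct demo_radix_root *root)
{
	memset(root->value, 0, sizeof(root->value));
}

/* 0 on success, -EINVAL if index is out of range, -EEXIST if occupied. */
static int demo_radix_insert(struct demo_radix_root *root,
			     unsigned long index, void *item)
{
	assert(item != NULL);	/* NULL is reserved for "empty" */
	if (index >= DEMO_MAP_SIZE)
		return -EINVAL;
	if (root->value[index] != NULL)
		return -EEXIST;
	root->value[index] = item;
	return 0;
}

/* Stored pointer, or NULL when the slot is empty or out of range. */
static void *demo_radix_lookup(struct demo_radix_root *root,
			       unsigned long index)
{
	return index < DEMO_MAP_SIZE ? root->value[index] : NULL;
}

/* Clears the slot and returns what was stored there (NULL if nothing). */
static void *demo_radix_delete(struct demo_radix_root *root,
			       unsigned long index)
{
	void *old = demo_radix_lookup(root, index);

	if (old != NULL)
		root->value[index] = NULL;
	return old;
}
```

With 1024*1024 slots and 8-byte pointers, each struct radix_tree_root takes
8 MiB, so a root should not be placed on the stack.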
diff --git a/drivers/net/mlnx_uio/include/rbtree.h b/drivers/net/mlnx_uio/include/rbtree.h
new file mode 100644
index 0000000..f8d9d47
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/rbtree.h
@@ -0,0 +1,105 @@
+/*
+ Red Black Trees
+ (C) 1999 Andrea Arcangeli <andrea@suse.de>
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program; if not, write to the Free Software
+ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+
+ linux/include/linux/rbtree.h
+
+ To use rbtrees you'll have to implement your own insert and search cores.
+ This avoids callbacks, which would degrade performance dramatically.
+ I know it's not the cleanest way, but in C (not in C++) that is how to get
+ both performance and genericity...
+
+ See Documentation/rbtree.txt for documentation and samples.
+*/
+
+#ifndef _LINUX_RBTREE_H
+#define _LINUX_RBTREE_H
+
+struct rb_node {
+ unsigned long __rb_parent_color;
+ struct rb_node *rb_right;
+ struct rb_node *rb_left;
+} __attribute__((aligned(sizeof(long))));
+ /* The alignment might seem pointless, but allegedly CRIS needs it */
+
+struct rb_root {
+ struct rb_node *rb_node;
+};
+
+
+#define rb_parent(r) ((struct rb_node *)((r)->__rb_parent_color & ~3))
+
+#define RB_ROOT (struct rb_root) { NULL, }
+#define rb_entry(ptr, type, member) container_of(ptr, type, member)
+
+#define RB_EMPTY_ROOT(root) ((root)->rb_node == NULL)
+
+/* 'empty' nodes are nodes that are known not to be inserted in an rbtree */
+#define RB_EMPTY_NODE(node) \
+ ((node)->__rb_parent_color == (unsigned long)(node))
+#define RB_CLEAR_NODE(node) \
+ ((node)->__rb_parent_color = (unsigned long)(node))
+
+
+extern void rb_insert_color(struct rb_node *, struct rb_root *);
+extern void rb_erase(struct rb_node *, struct rb_root *);
+
+
+/* Find logical next and previous nodes in a tree */
+extern struct rb_node *rb_next(const struct rb_node *);
+extern struct rb_node *rb_prev(const struct rb_node *);
+extern struct rb_node *rb_first(const struct rb_root *);
+extern struct rb_node *rb_last(const struct rb_root *);
+
+/* Postorder iteration - always visit the parent after its children */
+extern struct rb_node *rb_first_postorder(const struct rb_root *);
+extern struct rb_node *rb_next_postorder(const struct rb_node *);
+
+/* Fast replacement of a single node without remove/rebalance/add/rebalance */
+extern void rb_replace_node(struct rb_node *victim, struct rb_node *new,
+ struct rb_root *root);
+
+static inline void rb_link_node(struct rb_node * node, struct rb_node * parent,
+ struct rb_node ** rb_link)
+{
+ node->__rb_parent_color = (unsigned long)parent;
+ node->rb_left = node->rb_right = NULL;
+
+ *rb_link = node;
+}
+
+#define rb_entry_safe(ptr, type, member) \
+ ({ typeof(ptr) ____ptr = (ptr); \
+ ____ptr ? rb_entry(____ptr, type, member) : NULL; \
+ })
+
+/**
+ * rbtree_postorder_for_each_entry_safe - iterate over rb_root in post order of
+ * given type safe against removal of rb_node entry
+ *
+ * @pos: the 'type *' to use as a loop cursor.
+ * @n: another 'type *' to use as temporary storage
+ * @root: 'rb_root *' of the rbtree.
+ * @field: the name of the rb_node field within 'type'.
+ */
+#define rbtree_postorder_for_each_entry_safe(pos, n, root, field) \
+ for (pos = rb_entry_safe(rb_first_postorder(root), typeof(*pos), field); \
+ pos && ({ n = rb_entry_safe(rb_next_postorder(&pos->field), \
+ typeof(*pos), field); 1; }); \
+ pos = n)
+
+#endif /* _LINUX_RBTREE_H */
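Note on the rbtree.h hunk above: the header comment says callers write their
own insert and search cores. The following is a minimal sketch of such cores
for a hypothetical struct demo_item keyed by an int. The rb_node, rb_root,
rb_entry and rb_link_node definitions are copied in reduced form so the block
compiles alone. rb_insert_color() is only declared in the header, so the
sketch stops at the link step and leaves the tree as a plain binary search
tree.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced copies of the header's declarations, for a standalone build. */
struct rb_node {
	unsigned long __rb_parent_color;
	struct rb_node *rb_right;
	struct rb_node *rb_left;
};

struct rb_root {
	struct rb_node *rb_node;
};

#define rb_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void rb_link_node(struct rb_node *node, struct rb_node *parent,
			 struct rb_node **rb_link)
{
	node->__rb_parent_color = (unsigned long)parent;
	node->rb_left = node->rb_right = NULL;
	*rb_link = node;
}

/* Hypothetical payload type embedding an rb_node. */
struct demo_item {
	struct rb_node node;
	int key;
};

/* Search core: ordinary binary-search descent, no callbacks. */
static struct demo_item *demo_search(struct rb_root *root, int key)
{
	struct rb_node *n = root->rb_node;

	while (n) {
		struct demo_item *it = rb_entry(n, struct demo_item, node);

		if (key < it->key)
			n = n->rb_left;
		else if (key > it->key)
			n = n->rb_right;
		else
			return it;
	}
	return NULL;
}

/* Insert core: returns 1 if linked, 0 if the key already exists. */
static int demo_insert(struct rb_root *root, struct demo_item *item)
{
	struct rb_node **link = &root->rb_node, *parent = NULL;

	assert(item != NULL);
	while (*link) {
		struct demo_item *it = rb_entry(*link, struct demo_item, node);

		parent = *link;
		if (item->key < it->key)
			link = &(*link)->rb_left;
		else if (item->key > it->key)
			link = &(*link)->rb_right;
		else
			return 0;
	}
	rb_link_node(&item->node, parent, link);
	/* Real code would now call rb_insert_color(&item->node, root). */
	return 1;
}
```

Without the rebalancing call the lookups remain correct, but the tree can
degenerate into a list for sorted input.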
diff --git a/drivers/net/mlnx_uio/include/rbtree_augmented.h b/drivers/net/mlnx_uio/include/rbtree_augmented.h
new file mode 100644
index 0000000..bfdd7d0
--- /dev/null
+++ b/drivers/net/mlnx_uio/include/rbtree_augmented.h
@@ -0,0 +1,230 @@
+/*
+ Red Black Trees
+ (C) 1999 Andrea Arcangeli <andrea@suse.de>
+ (C) 2002 David Woodhouse <dwmw2@infradead.org>
+ (C) 2012 Michel Lespinasse <walken@google.com>
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program; if not, write to the Free Software
+ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+
+ linux/include/linux/rbtree_augmented.h
+*/
+
+#ifndef _LINUX_RBTREE_AUGMENTED_H
+#define _LINUX_RBTREE_AUGMENTED_H
+#include "kmod.h"
+#include "rbtree.h"
+/*
+ * Please note - only struct rb_augment_callbacks and the prototypes for
+ * rb_insert_augmented() and rb_erase_augmented() are intended to be public.
+ * The rest are implementation details you are not expected to depend on.
+ *
+ * See Documentation/rbtree.txt for documentation and samples.
+ */
+
+struct rb_augment_callbacks {
+ void (*propagate)(struct rb_node *node, struct rb_node *stop);
+ void (*copy)(struct rb_node *old, struct rb_node *new);
+ void (*rotate)(struct rb_node *old, struct rb_node *new);
+};
+
+extern void __rb_insert_augmented(struct rb_node *node, struct rb_root *root,
+ void (*augment_rotate)(struct rb_node *old, struct rb_node *new));
+static inline void
+rb_insert_augmented(struct rb_node *node, struct rb_root *root,
+ const struct rb_augment_callbacks *augment)
+{
+ __rb_insert_augmented(node, root, augment->rotate);
+}
+
+#define RB_DECLARE_CALLBACKS(rbstatic, rbname, rbstruct, rbfield, \
+ rbtype, rbaugmented, rbcompute) \
+static inline void \
+rbname ## _propagate(struct rb_node *rb, struct rb_node *stop) \
+{ \
+ while (rb != stop) { \
+ rbstruct *node = rb_entry(rb, rbstruct, rbfield); \
+ rbtype augmented = rbcompute(node); \
+ if (node->rbaugmented == augmented) \
+ break; \
+ node->rbaugmented = augmented; \
+ rb = rb_parent(&node->rbfield); \
+ } \
+} \
+static inline void \
+rbname ## _copy(struct rb_node *rb_old, struct rb_node *rb_new) \
+{ \
+ rbstruct *old = rb_entry(rb_old, rbstruct, rbfield); \
+ rbstruct *new = rb_entry(rb_new, rbstruct, rbfield); \
+ new->rbaugmented = old->rbaugmented; \
+} \
+static void \
+rbname ## _rotate(struct rb_node *rb_old, struct rb_node *rb_new) \
+{ \
+ rbstruct *old = rb_entry(rb_old, rbstruct, rbfield); \
+ rbstruct *new = rb_entry(rb_new, rbstruct, rbfield); \
+ new->rbaugmented = old->rbaugmented; \
+ old->rbaugmented = rbcompute(old); \
+} \
+rbstatic const struct rb_augment_callbacks rbname = { \
+ rbname ## _propagate, rbname ## _copy, rbname ## _rotate \
+};
+
+
+#define RB_RED 0
+#define RB_BLACK 1
+
+#define __rb_parent(pc) ((struct rb_node *)((pc) & ~3))
+
+#define __rb_color(pc) ((pc) & 1)
+#define __rb_is_black(pc) __rb_color(pc)
+#define __rb_is_red(pc) (!__rb_color(pc))
+#define rb_color(rb) __rb_color((rb)->__rb_parent_color)
+#define rb_is_red(rb) __rb_is_red((rb)->__rb_parent_color)
+#define rb_is_black(rb) __rb_is_black((rb)->__rb_parent_color)
+
+static inline void rb_set_parent(struct rb_node *rb, struct rb_node *p)
+{
+ rb->__rb_parent_color = rb_color(rb) | (unsigned long)p;
+}
+
+static inline void rb_set_parent_color(struct rb_node *rb,
+ struct rb_node *p, int color)
+{
+ rb->__rb_parent_color = (unsigned long)p | color;
+}
+
+static inline void
+__rb_change_child(struct rb_node *old, struct rb_node *new,
+ struct rb_node *parent, struct rb_root *root)
+{
+ if (parent) {
+ if (parent->rb_left == old)
+ parent->rb_left = new;
+ else
+ parent->rb_right = new;
+ } else
+ root->rb_node = new;
+}
+
+extern void __rb_erase_color(struct rb_node *parent, struct rb_root *root,
+ void (*augment_rotate)(struct rb_node *old, struct rb_node *new));
+
+static __always_inline struct rb_node *
+__rb_erase_augmented(struct rb_node *node, struct rb_root *root,
+ const struct rb_augment_callbacks *augment)
+{
+ struct rb_node *child = node->rb_right, *tmp = node->rb_left;
+ struct rb_node *parent, *rebalance;
+ unsigned long pc;
+
+ if (!tmp) {
+ /*
+ * Case 1: node to erase has no more than 1 child (easy!)
+ *
+ * Note that if there is one child it must be red due to 5)
+ * and node must be black due to 4). We adjust colors locally
+ * so as to bypass __rb_erase_color() later on.
+ */
+ pc = node->__rb_parent_color;
+ parent = __rb_parent(pc);
+ __rb_change_child(node, child, parent, root);
+ if (child) {
+ child->__rb_parent_color = pc;
+ rebalance = NULL;
+ } else
+ rebalance = __rb_is_black(pc) ? parent : NULL;
+ tmp = parent;
+ } else if (!child) {
+ /* Still case 1, but this time the child is node->rb_left */
+ tmp->__rb_parent_color = pc = node->__rb_parent_color;
+ parent = __rb_parent(pc);
+ __rb_change_child(node, tmp, parent, root);
+ rebalance = NULL;
+ tmp = parent;
+ } else {
+ struct rb_node *successor = child, *child2;
+ tmp = child->rb_left;
+ if (!tmp) {
+ /*
+ * Case 2: node's successor is its right child
+ *
+ * (n) (s)
+ * / \ / \
+ * (x) (s) -> (x) (c)
+ * \
+ * (c)
+ */
+ parent = successor;
+ child2 = successor->rb_right;
+ augment->copy(node, successor);
+ } else {
+ /*
+ * Case 3: node's successor is leftmost under
+ * node's right child subtree
+ *
+ * (n) (s)
+ * / \ / \
+ * (x) (y) -> (x) (y)
+ * / /
+ * (p) (p)
+ * / /
+ * (s) (c)
+ * \
+ * (c)
+ */
+ do {
+ parent = successor;
+ successor = tmp;
+ tmp = tmp->rb_left;
+ } while (tmp);
+ parent->rb_left = child2 = successor->rb_right;
+ successor->rb_right = child;
+ rb_set_parent(child, successor);
+ augment->copy(node, successor);
+ augment->propagate(parent, successor);
+ }
+
+ successor->rb_left = tmp = node->rb_left;
+ rb_set_parent(tmp, successor);
+
+ pc = node->__rb_parent_color;
+ tmp = __rb_parent(pc);
+ __rb_change_child(node, successor, tmp, root);
+ if (child2) {
+ successor->__rb_parent_color = pc;
+ rb_set_parent_color(child2, parent, RB_BLACK);
+ rebalance = NULL;
+ } else {
+ unsigned long pc2 = successor->__rb_parent_color;
+ successor->__rb_parent_color = pc;
+ rebalance = __rb_is_black(pc2) ? parent : NULL;
+ }
+ tmp = successor;
+ }
+
+ augment->propagate(tmp, NULL);
+ return rebalance;
+}
+
+static __always_inline void
+rb_erase_augmented(struct rb_node *node, struct rb_root *root,
+ const struct rb_augment_callbacks *augment)
+{
+ struct rb_node *rebalance = __rb_erase_augmented(node, root, augment);
+ if (rebalance)
+ __rb_erase_color(rebalance, root, augment->rotate);
+}
+
+#endif /* _LINUX_RBTREE_AUGMENTED_H */
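Note on the rbtree_augmented.h hunk above: __rb_parent_color stores the
parent pointer and the node color in one word, relying on node alignment to
keep the low two bits of the pointer free. The following is a minimal
standalone sketch of that packing. The demo_* names are hypothetical, and
uintptr_t is used where the header uses unsigned long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RED	0
#define DEMO_BLACK	1

struct demo_node {
	uintptr_t parent_color;		/* parent pointer | color in bit 0 */
	struct demo_node *left, *right;
};

/* Mask off the two low tag bits to recover the parent pointer. */
static struct demo_node *demo_parent(const struct demo_node *n)
{
	return (struct demo_node *)(n->parent_color & ~(uintptr_t)3);
}

/* Bit 0 carries the color. */
static int demo_color(const struct demo_node *n)
{
	return (int)(n->parent_color & 1);
}

/* Store parent and color together; the pointer must be 4-byte aligned. */
static void demo_set_parent_color(struct demo_node *n, struct demo_node *p,
				  int color)
{
	assert(((uintptr_t)p & 3) == 0);
	assert(color == DEMO_RED || color == DEMO_BLACK);
	n->parent_color = (uintptr_t)p | (uintptr_t)color;
}

/* Change the parent while keeping the current color. */
static void demo_set_parent(struct demo_node *n, struct demo_node *p)
{
	demo_set_parent_color(n, p, demo_color(n));
}
```

A root node has a NULL parent, so its word holds only the color bit.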
diff --git a/drivers/net/mlnx_uio/kernel/LICENSE b/drivers/net/mlnx_uio/kernel/LICENSE
new file mode 100644
index 0000000..23cb790
--- /dev/null
+++ b/drivers/net/mlnx_uio/kernel/LICENSE
@@ -0,0 +1,339 @@
+ GNU GENERAL PUBLIC LICENSE
+ Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc., <http://fsf.org/>
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+ Preamble
+
+ The licenses for most software are designed to take away your
+freedom to share and change it. By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users. This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it. (Some other Free Software Foundation software is covered by
+the GNU Lesser General Public License instead.) You can apply it to
+your programs, too.
+
+ When we speak of free software, we are referring to freedom, not
+price. Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+ To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+ For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have. You must make sure that they, too, receive or can get the
+source code. And you must show them these terms so they know their
+rights.
+
+ We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+ Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software. If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+ Finally, any free program is threatened constantly by software
+patents. We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary. To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+ The precise terms and conditions for copying, distribution and
+modification follow.
+
+ GNU GENERAL PUBLIC LICENSE
+ TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+ 0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License. The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language. (Hereinafter, translation is included without limitation in
+the term "modification".) Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope. The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+ 1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+ 2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+ a) You must cause the modified files to carry prominent notices
+ stating that you changed the files and the date of any change.
+
+ b) You must cause any work that you distribute or publish, that in
+ whole or in part contains or is derived from the Program or any
+ part thereof, to be licensed as a whole at no charge to all third
+ parties under the terms of this License.
+
+ c) If the modified program normally reads commands interactively
+ when run, you must cause it, when started running for such
+ interactive use in the most ordinary way, to print or display an
+ announcement including an appropriate copyright notice and a
+ notice that there is no warranty (or else, saying that you provide
+ a warranty) and that users may redistribute the program under
+ these conditions, and telling the user how to view a copy of this
+ License. (Exception: if the Program itself is interactive but
+ does not normally print such an announcement, your work based on
+ the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole. If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works. But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+ 3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+ a) Accompany it with the complete corresponding machine-readable
+ source code, which must be distributed under the terms of Sections
+ 1 and 2 above on a medium customarily used for software interchange; or,
+
+ b) Accompany it with a written offer, valid for at least three
+ years, to give any third party, for a charge no more than your
+ cost of physically performing source distribution, a complete
+ machine-readable copy of the corresponding source code, to be
+ distributed under the terms of Sections 1 and 2 above on a medium
+ customarily used for software interchange; or,
+
+ c) Accompany it with the information you received as to the offer
+ to distribute corresponding source code. (This alternative is
+ allowed only for noncommercial distribution and only if you
+ received the program in object code or executable form with such
+ an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it. For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable. However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+ 4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License. Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+ 5. You are not required to accept this License, since you have not
+signed it. However, nothing else grants you permission to modify or
+distribute the Program or its derivative works. These actions are
+prohibited by law if you do not accept this License. Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+ 6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions. You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+ 7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License. If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all. For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices. Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+ 8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded. In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+ 9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time. Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number. If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation. If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+ 10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission. For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this. Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+ NO WARRANTY
+
+ 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+ 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+ END OF TERMS AND CONDITIONS
+
+ How to Apply These Terms to Your New Programs
+
+ If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+ To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+ {description}
+ Copyright (C) {year} {fullname}
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License along
+ with this program; if not, write to the Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+ Gnomovision version 69, Copyright (C) year name of author
+ Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+ This is free software, and you are welcome to redistribute it
+ under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License. Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary. Here is a sample; alter the names:
+
+ Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+ `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+ {signature of Ty Coon}, 1 April 1989
+ Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs. If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library. If this is what you want to do, use the GNU Lesser General
+Public License instead of this License.
diff --git a/drivers/net/mlnx_uio/kernel/bitmap.c b/drivers/net/mlnx_uio/kernel/bitmap.c
new file mode 100644
index 0000000..01adb4f
--- /dev/null
+++ b/drivers/net/mlnx_uio/kernel/bitmap.c
@@ -0,0 +1,831 @@
+/*
+ * bitmap.c
+ *
+ * Created on: Jul 15, 2014
+ * Author: leeopop
+ */
+
+
+/*
+ * lib/bitmap.c
+ * Helper functions for bitmap.h.
+ *
+ * This source code is licensed under the GNU General Public License,
+ * Version 2. See the file COPYING for more details.
+ */
+#include "kmod.h"
+#include "bitmap.h"
+
+#include <assert.h>
+#include <errno.h>
+#include "bitops.h"
+
+/*
+ * bitmaps provide an array of bits, implemented using an
+ * array of unsigned longs. The number of valid bits in a
+ * given bitmap does _not_ need to be an exact multiple of
+ * BITS_PER_LONG.
+ *
+ * The possible unused bits in the last, partially used word
+ * of a bitmap are 'don't care'. The implementation makes
+ * no particular effort to keep them zero. It ensures that
+ * their value will not affect the results of any operation.
+ * The bitmap operations that return Boolean (bitmap_empty,
+ * for example) or scalar (bitmap_weight, for example) results
+ * carefully filter out these unused bits from impacting their
+ * results.
+ *
+ * These operations actually hold to a slightly stronger rule:
+ * if you don't input any bitmaps to these ops that have some
+ * unused bits set, then they won't output any set unused bits
+ * in output bitmaps.
+ *
+ * The byte ordering of bitmaps is more natural on little
+ * endian architectures. See the big-endian headers
+ * include/asm-ppc64/bitops.h and include/asm-s390/bitops.h
+ * for the best explanations of this ordering.
+ */
+
+int __bitmap_empty(const unsigned long *bitmap, int bits)
+{
+ int k, lim = bits/BITS_PER_LONG;
+ for (k = 0; k < lim; ++k)
+ if (bitmap[k])
+ return 0;
+
+ if (bits % BITS_PER_LONG)
+ if (bitmap[k] & BITMAP_LAST_WORD_MASK(bits))
+ return 0;
+
+ return 1;
+}
+EXPORT_SYMBOL(__bitmap_empty);
+
+int __bitmap_full(const unsigned long *bitmap, int bits)
+{
+ int k, lim = bits/BITS_PER_LONG;
+ for (k = 0; k < lim; ++k)
+ if (~bitmap[k])
+ return 0;
+
+ if (bits % BITS_PER_LONG)
+ if (~bitmap[k] & BITMAP_LAST_WORD_MASK(bits))
+ return 0;
+
+ return 1;
+}
+EXPORT_SYMBOL(__bitmap_full);
+
+int __bitmap_equal(const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits)
+{
+ int k, lim = bits/BITS_PER_LONG;
+ for (k = 0; k < lim; ++k)
+ if (bitmap1[k] != bitmap2[k])
+ return 0;
+
+ if (bits % BITS_PER_LONG)
+ if ((bitmap1[k] ^ bitmap2[k]) & BITMAP_LAST_WORD_MASK(bits))
+ return 0;
+
+ return 1;
+}
+EXPORT_SYMBOL(__bitmap_equal);
+
+void __bitmap_complement(unsigned long *dst, const unsigned long *src, int bits)
+{
+ int k, lim = bits/BITS_PER_LONG;
+ for (k = 0; k < lim; ++k)
+ dst[k] = ~src[k];
+
+ if (bits % BITS_PER_LONG)
+ dst[k] = ~src[k] & BITMAP_LAST_WORD_MASK(bits);
+}
+EXPORT_SYMBOL(__bitmap_complement);
+
+/**
+ * __bitmap_shift_right - logical right shift of the bits in a bitmap
+ * @dst : destination bitmap
+ * @src : source bitmap
+ * @shift : shift by this many bits
+ * @bits : bitmap size, in bits
+ *
+ * Shifting right (dividing) means moving bits in the MS -> LS bit
+ * direction. Zeros are fed into the vacated MS positions and the
+ * LS bits shifted off the bottom are lost.
+ */
+void __bitmap_shift_right(unsigned long *dst,
+ const unsigned long *src, int shift, int bits)
+{
+ int k, lim = BITS_TO_LONGS(bits), left = bits % BITS_PER_LONG;
+ int off = shift/BITS_PER_LONG, rem = shift % BITS_PER_LONG;
+ unsigned long mask = (1UL << left) - 1;
+ for (k = 0; off + k < lim; ++k) {
+ unsigned long upper, lower;
+
+ /*
+ * If shift is not word aligned, take lower rem bits of
+ * word above and make them the top rem bits of result.
+ */
+ if (!rem || off + k + 1 >= lim)
+ upper = 0;
+ else {
+ upper = src[off + k + 1];
+ if (off + k + 1 == lim - 1 && left)
+ upper &= mask;
+ }
+ lower = src[off + k];
+ if (left && off + k == lim - 1)
+ lower &= mask;
+ dst[k] = upper << (BITS_PER_LONG - rem) | lower >> rem;
+ if (left && k == lim - 1)
+ dst[k] &= mask;
+ }
+ if (off)
+ memset(&dst[lim - off], 0, off*sizeof(unsigned long));
+}
+EXPORT_SYMBOL(__bitmap_shift_right);
+
+
+/**
+ * __bitmap_shift_left - logical left shift of the bits in a bitmap
+ * @dst : destination bitmap
+ * @src : source bitmap
+ * @shift : shift by this many bits
+ * @bits : bitmap size, in bits
+ *
+ * Shifting left (multiplying) means moving bits in the LS -> MS
+ * direction. Zeros are fed into the vacated LS bit positions
+ * and those MS bits shifted off the top are lost.
+ */
+
+void __bitmap_shift_left(unsigned long *dst,
+ const unsigned long *src, int shift, int bits)
+{
+ int k, lim = BITS_TO_LONGS(bits), left = bits % BITS_PER_LONG;
+ int off = shift/BITS_PER_LONG, rem = shift % BITS_PER_LONG;
+ for (k = lim - off - 1; k >= 0; --k) {
+ unsigned long upper, lower;
+
+ /*
+ * If shift is not word aligned, take upper rem bits of
+ * word below and make them the bottom rem bits of result.
+ */
+ if (rem && k > 0)
+ lower = src[k - 1];
+ else
+ lower = 0;
+ upper = src[k];
+ if (left && k == lim - 1)
+ upper &= (1UL << left) - 1;
+ dst[k + off] = lower >> (BITS_PER_LONG - rem) | upper << rem;
+ if (left && k + off == lim - 1)
+ dst[k + off] &= (1UL << left) - 1;
+ }
+ if (off)
+ memset(dst, 0, off*sizeof(unsigned long));
+}
+EXPORT_SYMBOL(__bitmap_shift_left);
+
+int __bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits)
+{
+ int k;
+ int nr = BITS_TO_LONGS(bits);
+ unsigned long result = 0;
+
+ for (k = 0; k < nr; k++)
+ result |= (dst[k] = bitmap1[k] & bitmap2[k]);
+ return result != 0;
+}
+EXPORT_SYMBOL(__bitmap_and);
+
+void __bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits)
+{
+ int k;
+ int nr = BITS_TO_LONGS(bits);
+
+ for (k = 0; k < nr; k++)
+ dst[k] = bitmap1[k] | bitmap2[k];
+}
+EXPORT_SYMBOL(__bitmap_or);
+
+void __bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits)
+{
+ int k;
+ int nr = BITS_TO_LONGS(bits);
+
+ for (k = 0; k < nr; k++)
+ dst[k] = bitmap1[k] ^ bitmap2[k];
+}
+EXPORT_SYMBOL(__bitmap_xor);
+
+int __bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits)
+{
+ int k;
+ int nr = BITS_TO_LONGS(bits);
+ unsigned long result = 0;
+
+ for (k = 0; k < nr; k++)
+ result |= (dst[k] = bitmap1[k] & ~bitmap2[k]);
+ return result != 0;
+}
+EXPORT_SYMBOL(__bitmap_andnot);
+
+int __bitmap_intersects(const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits)
+{
+ int k, lim = bits/BITS_PER_LONG;
+ for (k = 0; k < lim; ++k)
+ if (bitmap1[k] & bitmap2[k])
+ return 1;
+
+ if (bits % BITS_PER_LONG)
+ if ((bitmap1[k] & bitmap2[k]) & BITMAP_LAST_WORD_MASK(bits))
+ return 1;
+ return 0;
+}
+EXPORT_SYMBOL(__bitmap_intersects);
+
+int __bitmap_subset(const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits)
+{
+ int k, lim = bits/BITS_PER_LONG;
+ for (k = 0; k < lim; ++k)
+ if (bitmap1[k] & ~bitmap2[k])
+ return 0;
+
+ if (bits % BITS_PER_LONG)
+ if ((bitmap1[k] & ~bitmap2[k]) & BITMAP_LAST_WORD_MASK(bits))
+ return 0;
+ return 1;
+}
+EXPORT_SYMBOL(__bitmap_subset);
+
+int __bitmap_weight(const unsigned long *bitmap, int bits)
+{
+ int k, w = 0, lim = bits/BITS_PER_LONG;
+
+ for (k = 0; k < lim; k++)
+ w += hweight_long(bitmap[k]);
+
+ if (bits % BITS_PER_LONG)
+ w += hweight_long(bitmap[k] & BITMAP_LAST_WORD_MASK(bits));
+
+ return w;
+}
+EXPORT_SYMBOL(__bitmap_weight);
+
+void bitmap_set(unsigned long *map, int start, int nr)
+{
+ unsigned long *p = map + BIT_WORD(start);
+ const int size = start + nr;
+ int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG);
+ unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);
+
+ while (nr - bits_to_set >= 0) {
+ *p |= mask_to_set;
+ nr -= bits_to_set;
+ bits_to_set = BITS_PER_LONG;
+ mask_to_set = ~0UL;
+ p++;
+ }
+ if (nr) {
+ mask_to_set &= BITMAP_LAST_WORD_MASK(size);
+ *p |= mask_to_set;
+ }
+}
+EXPORT_SYMBOL(bitmap_set);
+
+void bitmap_clear(unsigned long *map, int start, int nr)
+{
+ unsigned long *p = map + BIT_WORD(start);
+ const int size = start + nr;
+ int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG);
+ unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);
+
+ while (nr - bits_to_clear >= 0) {
+ *p &= ~mask_to_clear;
+ nr -= bits_to_clear;
+ bits_to_clear = BITS_PER_LONG;
+ mask_to_clear = ~0UL;
+ p++;
+ }
+ if (nr) {
+ mask_to_clear &= BITMAP_LAST_WORD_MASK(size);
+ *p &= ~mask_to_clear;
+ }
+}
+EXPORT_SYMBOL(bitmap_clear);
+
+/*
+ * bitmap_find_next_zero_area - find a contiguous aligned zero area
+ * @map: The address to base the search on
+ * @size: The bitmap size in bits
+ * @start: The bitnumber to start searching at
+ * @nr: The number of zeroed bits we're looking for
+ * @align_mask: Alignment mask for zero area
+ *
+ * The @align_mask should be one less than a power of 2; the effect is that
+ * the bit offset of every zero area this function finds is a multiple of that
+ * power of 2. An @align_mask of 0 means no alignment is required.
+ */
+unsigned long bitmap_find_next_zero_area(unsigned long *map,
+ unsigned long size,
+ unsigned long start,
+ unsigned int nr,
+ unsigned long align_mask)
+{
+ unsigned long index, end, i;
+ again:
+ index = find_next_zero_bit(map, size, start);
+
+ /* Align allocation */
+ index = __ALIGN_MASK(index, align_mask);
+
+ end = index + nr;
+ if (end > size)
+ return end;
+ i = find_next_bit(map, end, index);
+ if (i < end) {
+ start = i + 1;
+ goto again;
+ }
+ return index;
+}
+EXPORT_SYMBOL(bitmap_find_next_zero_area);
+
+/*
+ * Bitmap printing & parsing functions: first version by Nadia Yvette Chambers,
+ * second version by Paul Jackson, third by Joe Korty.
+ */
+
+#define CHUNKSZ 32
+#define nbits_to_hold_value(val) fls(val)
+#define BASEDEC 10 /* fancier cpuset lists input in decimal */
+
+
+/**
+ * bitmap_pos_to_ord - find ordinal of set bit at given position in bitmap
+ * @buf: pointer to a bitmap
+ * @pos: a bit position in @buf (0 <= @pos < @bits)
+ * @bits: number of valid bit positions in @buf
+ *
+ * Map the bit at position @pos in @buf (of length @bits) to the
+ * ordinal of which set bit it is. If it is not set or if @pos
+ * is not a valid bit position, map to -1.
+ *
+ * If for example, just bits 4 through 7 are set in @buf, then @pos
+ * values 4 through 7 will get mapped to 0 through 3, respectively,
+ * and other @pos values will get mapped to -1. When @pos value 7
+ * gets mapped to (returns) @ord value 3 in this example, that means
+ * that bit 7 is the 3rd (starting with 0th) set bit in @buf.
+ *
+ * The bit positions 0 through @bits - 1 are valid positions in @buf.
+ */
+static int bitmap_pos_to_ord(const unsigned long *buf, int pos, int bits)
+{
+ int i, ord;
+
+ if (pos < 0 || pos >= bits || !test_bit(pos, buf))
+ return -1;
+
+ i = find_first_bit(buf, bits);
+ ord = 0;
+ while (i < pos) {
+ i = find_next_bit(buf, bits, i + 1);
+ ord++;
+ }
+ assert(i == pos);
+
+ return ord;
+}
+
+/**
+ * bitmap_ord_to_pos - find position of n-th set bit in bitmap
+ * @buf: pointer to bitmap
+ * @ord: ordinal bit position (n-th set bit, n >= 0)
+ * @bits: number of valid bit positions in @buf
+ *
+ * Map the ordinal offset of bit @ord in @buf to its position in @buf.
+ * Value of @ord should be in range 0 <= @ord < weight(buf), else
+ * results are undefined.
+ *
+ * If for example, just bits 4 through 7 are set in @buf, then @ord
+ * values 0 through 3 will get mapped to 4 through 7, respectively,
+ * and all other @ord values return undefined values. When @ord value 3
+ * gets mapped to (returns) @pos value 7 in this example, that means
+ * that the 3rd set bit (starting with 0th) is at position 7 in @buf.
+ *
+ * The bit positions 0 through @bits - 1 are valid positions in @buf.
+ */
+int bitmap_ord_to_pos(const unsigned long *buf, int ord, int bits)
+{
+ int pos = 0;
+
+ if (ord >= 0 && ord < bits) {
+ int i;
+
+ for (i = find_first_bit(buf, bits);
+ i < bits && ord > 0;
+ i = find_next_bit(buf, bits, i + 1))
+ ord--;
+ if (i < bits && ord == 0)
+ pos = i;
+ }
+
+ return pos;
+}
+
+/**
+ * bitmap_remap - Apply map defined by a pair of bitmaps to another bitmap
+ * @dst: remapped result
+ * @src: subset to be remapped
+ * @old: defines domain of map
+ * @new: defines range of map
+ * @bits: number of bits in each of these bitmaps
+ *
+ * Let @old and @new define a mapping of bit positions, such that
+ * whatever position is held by the n-th set bit in @old is mapped
+ * to the n-th set bit in @new. In the more general case, allowing
+ * for the possibility that the weight 'w' of @new is less than the
+ * weight of @old, map the position of the n-th set bit in @old to
+ * the position of the m-th set bit in @new, where m == n % w.
+ *
+ * If either of the @old and @new bitmaps are empty, or if @src and
+ * @dst point to the same location, then this routine copies @src
+ * to @dst.
+ *
+ * The positions of unset bits in @old are mapped to themselves
+ * (the identity map).
+ *
+ * Apply the above specified mapping to @src, placing the result in
+ * @dst, clearing any bits previously set in @dst.
+ *
+ * For example, let's say that @old has bits 4 through 7 set, and
+ * @new has bits 12 through 15 set. This defines the mapping of bit
+ * position 4 to 12, 5 to 13, 6 to 14 and 7 to 15, and of all other
+ * bit positions unchanged. So if say @src comes into this routine
+ * with bits 1, 5 and 7 set, then @dst should leave with bits 1,
+ * 13 and 15 set.
+ */
+void bitmap_remap(unsigned long *dst, const unsigned long *src,
+ const unsigned long *old, const unsigned long *new,
+ int bits)
+{
+ int oldbit, w;
+
+ if (dst == src) /* following doesn't handle inplace remaps */
+ return;
+ bitmap_zero(dst, bits);
+
+ w = bitmap_weight(new, bits);
+ for_each_set_bit(oldbit, src, bits) {
+ int n = bitmap_pos_to_ord(old, oldbit, bits);
+
+ if (n < 0 || w == 0)
+ set_bit(oldbit, dst); /* identity map */
+ else
+ set_bit(bitmap_ord_to_pos(new, n % w, bits), dst);
+ }
+}
+EXPORT_SYMBOL(bitmap_remap);
+
+/**
+ * bitmap_bitremap - Apply map defined by a pair of bitmaps to a single bit
+ * @oldbit: bit position to be mapped
+ * @old: defines domain of map
+ * @new: defines range of map
+ * @bits: number of bits in each of these bitmaps
+ *
+ * Let @old and @new define a mapping of bit positions, such that
+ * whatever position is held by the n-th set bit in @old is mapped
+ * to the n-th set bit in @new. In the more general case, allowing
+ * for the possibility that the weight 'w' of @new is less than the
+ * weight of @old, map the position of the n-th set bit in @old to
+ * the position of the m-th set bit in @new, where m == n % w.
+ *
+ * The positions of unset bits in @old are mapped to themselves
+ * (the identity map).
+ *
+ * Apply the above specified mapping to bit position @oldbit, returning
+ * the new bit position.
+ *
+ * For example, let's say that @old has bits 4 through 7 set, and
+ * @new has bits 12 through 15 set. This defines the mapping of bit
+ * position 4 to 12, 5 to 13, 6 to 14 and 7 to 15, and of all other
+ * bit positions unchanged. So if say @oldbit is 5, then this routine
+ * returns 13.
+ */
+int bitmap_bitremap(int oldbit, const unsigned long *old,
+ const unsigned long *new, int bits)
+{
+ int w = bitmap_weight(new, bits);
+ int n = bitmap_pos_to_ord(old, oldbit, bits);
+ if (n < 0 || w == 0)
+ return oldbit;
+ else
+ return bitmap_ord_to_pos(new, n % w, bits);
+}
+EXPORT_SYMBOL(bitmap_bitremap);
+
+/**
+ * bitmap_onto - translate one bitmap relative to another
+ * @dst: resulting translated bitmap
+ * @orig: original untranslated bitmap
+ * @relmap: bitmap relative to which translated
+ * @bits: number of bits in each of these bitmaps
+ *
+ * Set the n-th bit of @dst iff there exists some m such that the
+ * n-th bit of @relmap is set, the m-th bit of @orig is set, and
+ * the n-th bit of @relmap is also the m-th _set_ bit of @relmap.
+ * (If you understood the previous sentence the first time you
+ * read it, you're overqualified for your current job.)
+ *
+ * In other words, @orig is mapped onto (surjectively) @dst,
+ * using the map { <n, m> | the n-th bit of @relmap is the
+ * m-th set bit of @relmap }.
+ *
+ * Any set bits in @orig above bit number W, where W is the
+ * weight of (number of set bits in) @relmap are mapped nowhere.
+ * In particular, if for all bits m set in @orig, m >= W, then
+ * @dst will end up empty. In situations where the possibility
+ * of such an empty result is not desired, one way to avoid it is
+ * to use the bitmap_fold() operator, below, to first fold the
+ * @orig bitmap over itself so that all its set bits x are in the
+ * range 0 <= x < W. The bitmap_fold() operator does this by
+ * setting the bit (m % W) in @dst, for each bit (m) set in @orig.
+ *
+ * Example [1] for bitmap_onto():
+ * Let's say @relmap has bits 30-39 set, and @orig has bits
+ * 1, 3, 5, 7, 9 and 11 set. Then on return from this routine,
+ * @dst will have bits 31, 33, 35, 37 and 39 set.
+ *
+ * When bit 0 is set in @orig, it means turn on the bit in
+ * @dst corresponding to whatever is the first bit (if any)
+ * that is turned on in @relmap. Since bit 0 was off in the
+ * above example, we leave off that bit (bit 30) in @dst.
+ *
+ * When bit 1 is set in @orig (as in the above example), it
+ * means turn on the bit in @dst corresponding to whatever
+ * is the second bit that is turned on in @relmap. The second
+ * bit in @relmap that was turned on in the above example was
+ * bit 31, so we turned on bit 31 in @dst.
+ *
+ * Similarly, we turned on bits 33, 35, 37 and 39 in @dst,
+ * because they were the 4th, 6th, 8th and 10th set bits
+ * in @relmap, and the 4th, 6th, 8th and 10th bits of
+ * @orig (i.e. bits 3, 5, 7 and 9) were also set.
+ *
+ * When bit 11 is set in @orig, it means turn on the bit in
+ * @dst corresponding to whatever is the twelfth bit that is
+ * turned on in @relmap. In the above example, there were
+ * only ten bits turned on in @relmap (30..39), so the fact that
+ * bit 11 was set in @orig had no effect on @dst.
+ *
+ * Example [2] for bitmap_fold() + bitmap_onto():
+ * Let's say @relmap has these ten bits set:
+ * 40 41 42 43 45 48 53 61 74 95
+ * (for the curious, that's 40 plus the first ten terms of the
+ * Fibonacci sequence.)
+ *
+ * Further, let's say we use the following code, invoking
+ * bitmap_fold() then bitmap_onto(), as suggested above to
+ * avoid the possibility of an empty @dst result:
+ *
+ * unsigned long *tmp; // a temporary bitmap's bits
+ *
+ * bitmap_fold(tmp, orig, bitmap_weight(relmap, bits), bits);
+ * bitmap_onto(dst, tmp, relmap, bits);
+ *
+ * Then this table shows what various values of @dst would be, for
+ * various @orig's. I list the zero-based positions of each set bit.
+ * The tmp column shows the intermediate result, as computed by
+ * using bitmap_fold() to fold the @orig bitmap modulo ten
+ * (the weight of @relmap).
+ *
+ *      @orig           tmp             @dst
+ *      0               0               40
+ *      1               1               41
+ *      9               9               95
+ *      10              0               40 (*)
+ *      1 3 5 7         1 3 5 7         41 43 48 61
+ *      0 1 2 3 4       0 1 2 3 4       40 41 42 43 45
+ *      0 9 18 27       0 9 8 7         40 61 74 95
+ *      0 10 20 30      0               40
+ *      0 11 22 33      0 1 2 3         40 41 42 43
+ *      0 12 24 36      0 2 4 6         40 42 45 53
+ *      78 102 211      1 2 8           41 42 74 (*)
+ *
+ * (*) For these marked lines, if we hadn't first done bitmap_fold()
+ * into tmp, then the @dst result would have been empty.
+ *
+ * If either of @orig or @relmap is empty (no set bits), then @dst
+ * will be returned empty.
+ *
+ * If (as explained above) the only set bits in @orig are in positions
+ * m where m >= W, (where W is the weight of @relmap) then @dst will
+ * once again be returned empty.
+ *
+ * All bits in @dst not set by the above rule are cleared.
+ */
+void bitmap_onto(unsigned long *dst, const unsigned long *orig,
+ const unsigned long *relmap, int bits)
+{
+ int n, m; /* same meaning as in above comment */
+
+ if (dst == orig) /* following doesn't handle inplace mappings */
+ return;
+ bitmap_zero(dst, bits);
+
+ /*
+ * The following code is a more efficient, but less
+ * obvious, equivalent to the loop:
+ * for (m = 0; m < bitmap_weight(relmap, bits); m++) {
+ * n = bitmap_ord_to_pos(relmap, m, bits);
+ * if (test_bit(m, orig))
+ * set_bit(n, dst);
+ * }
+ */
+
+ m = 0;
+ for_each_set_bit(n, relmap, bits) {
+ /* m == bitmap_pos_to_ord(relmap, n, bits) */
+ if (test_bit(m, orig))
+ set_bit(n, dst);
+ m++;
+ }
+}
+EXPORT_SYMBOL(bitmap_onto);
+
+/**
+ * bitmap_fold - fold larger bitmap into smaller, modulo specified size
+ * @dst: resulting smaller bitmap
+ * @orig: original larger bitmap
+ * @sz: specified size
+ * @bits: number of bits in each of these bitmaps
+ *
+ * For each bit oldbit in @orig, set bit oldbit mod @sz in @dst.
+ * Clear all other bits in @dst. See further the comment and
+ * Example [2] for bitmap_onto() for why and how to use this.
+ */
+void bitmap_fold(unsigned long *dst, const unsigned long *orig,
+ int sz, int bits)
+{
+ int oldbit;
+
+ if (dst == orig) /* following doesn't handle inplace mappings */
+ return;
+ bitmap_zero(dst, bits);
+
+ for_each_set_bit(oldbit, orig, bits)
+ set_bit(oldbit % sz, dst);
+}
+EXPORT_SYMBOL(bitmap_fold);
+
+/*
+ * Common code for bitmap_*_region() routines.
+ * bitmap: array of unsigned longs corresponding to the bitmap
+ * pos: the beginning of the region
+ * order: region size (log base 2 of number of bits)
+ * reg_op: operation(s) to perform on that region of bitmap
+ *
+ * Can set, verify and/or release a region of bits in a bitmap,
+ * depending on which combination of REG_OP_* flag bits is set.
+ *
+ * A region of a bitmap is a sequence of bits in the bitmap, of
+ * some size '1 << order' (a power of two), aligned to that same
+ * '1 << order' power of two.
+ *
+ * Returns 1 if REG_OP_ISFREE succeeds (region is all zero bits).
+ * Returns 0 in all other cases and reg_ops.
+ */
+
+enum {
+ REG_OP_ISFREE, /* true if region is all zero bits */
+ REG_OP_ALLOC, /* set all bits in region */
+ REG_OP_RELEASE, /* clear all bits in region */
+};
+
+static int __reg_op(unsigned long *bitmap, int pos, int order, int reg_op)
+{
+ int nbits_reg; /* number of bits in region */
+ int index; /* index first long of region in bitmap */
+ int offset; /* bit offset region in bitmap[index] */
+ int nlongs_reg; /* num longs spanned by region in bitmap */
+ int nbitsinlong; /* num bits of region in each spanned long */
+ unsigned long mask; /* bitmask for one long of region */
+ int i; /* scans bitmap by longs */
+ int ret = 0; /* return value */
+
+ /*
+ * Either nlongs_reg == 1 (for small orders that fit in one long)
+ * or (offset == 0 && mask == ~0UL) (for larger multiword orders.)
+ */
+ nbits_reg = 1 << order;
+ index = pos / BITS_PER_LONG;
+ offset = pos - (index * BITS_PER_LONG);
+ nlongs_reg = BITS_TO_LONGS(nbits_reg);
+ nbitsinlong = min(nbits_reg, BITS_PER_LONG);
+
+ /*
+ * Can't do "mask = (1UL << nbitsinlong) - 1", as that
+ * overflows if nbitsinlong == BITS_PER_LONG.
+ */
+ mask = (1UL << (nbitsinlong - 1));
+ mask += mask - 1;
+ mask <<= offset;
+
+ switch (reg_op) {
+ case REG_OP_ISFREE:
+ for (i = 0; i < nlongs_reg; i++) {
+ if (bitmap[index + i] & mask)
+ goto done;
+ }
+ ret = 1; /* all bits in region free (zero) */
+ break;
+
+ case REG_OP_ALLOC:
+ for (i = 0; i < nlongs_reg; i++)
+ bitmap[index + i] |= mask;
+ break;
+
+ case REG_OP_RELEASE:
+ for (i = 0; i < nlongs_reg; i++)
+ bitmap[index + i] &= ~mask;
+ break;
+ }
+ done:
+ return ret;
+}
+
+/**
+ * bitmap_find_free_region - find a contiguous aligned mem region
+ * @bitmap: array of unsigned longs corresponding to the bitmap
+ * @bits: number of bits in the bitmap
+ * @order: region size (log base 2 of number of bits) to find
+ *
+ * Find a region of free (zero) bits in a @bitmap of @bits bits and
+ * allocate them (set them to one). Only consider regions of length
+ * a power (@order) of two, aligned to that power of two, which
+ * makes the search algorithm much faster.
+ *
+ * Return the bit offset in bitmap of the allocated region,
+ * or -errno on failure.
+ */
+int bitmap_find_free_region(unsigned long *bitmap, int bits, int order)
+{
+ int pos, end; /* scans bitmap by regions of size order */
+
+ for (pos = 0 ; (end = pos + (1 << order)) <= bits; pos = end) {
+ if (!__reg_op(bitmap, pos, order, REG_OP_ISFREE))
+ continue;
+ __reg_op(bitmap, pos, order, REG_OP_ALLOC);
+ return pos;
+ }
+ return -ENOMEM;
+}
+EXPORT_SYMBOL(bitmap_find_free_region);
+
+/**
+ * bitmap_release_region - release allocated bitmap region
+ * @bitmap: array of unsigned longs corresponding to the bitmap
+ * @pos: beginning of bit region to release
+ * @order: region size (log base 2 of number of bits) to release
+ *
+ * This is the complement to bitmap_find_free_region() and releases
+ * the found region (by clearing it in the bitmap).
+ *
+ * No return value.
+ */
+void bitmap_release_region(unsigned long *bitmap, int pos, int order)
+{
+ __reg_op(bitmap, pos, order, REG_OP_RELEASE);
+}
+EXPORT_SYMBOL(bitmap_release_region);
+
+/**
+ * bitmap_allocate_region - allocate bitmap region
+ * @bitmap: array of unsigned longs corresponding to the bitmap
+ * @pos: beginning of bit region to allocate
+ * @order: region size (log base 2 of number of bits) to allocate
+ *
+ * Allocate (set bits in) a specified region of a bitmap.
+ *
+ * Return 0 on success, or %-EBUSY if specified region wasn't
+ * free (not all bits were zero).
+ */
+int bitmap_allocate_region(unsigned long *bitmap, int pos, int order)
+{
+ if (!__reg_op(bitmap, pos, order, REG_OP_ISFREE))
+ return -EBUSY;
+ __reg_op(bitmap, pos, order, REG_OP_ALLOC);
+ return 0;
+}
+EXPORT_SYMBOL(bitmap_allocate_region);
+
+
+EXPORT_SYMBOL(bitmap_copy_le);
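
Note on the bitmap_fold() + bitmap_onto() comment in bitmap.c above: the
table in Example [2] can be checked with a small standalone sketch. The code
below is not part of the patch. All demo_* names, the fixed 256-bit size and
the bit-at-a-time loops are assumptions made for illustration; they only
mirror the semantics described in the comments, not the word-oriented helpers
the patch uses.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical fixed-size bitmap, large enough for the table's bit 211. */
#define DEMO_BITS      256u
#define DEMO_WORD_BITS ((unsigned)(sizeof(unsigned long) * CHAR_BIT))
#define DEMO_WORDS     ((DEMO_BITS + DEMO_WORD_BITS - 1) / DEMO_WORD_BITS)

typedef struct {
	unsigned long w[DEMO_WORDS];
} demo_bitmap;

/* Return 1 if bit n is set, 0 otherwise. */
static int demo_test(const demo_bitmap *b, unsigned n)
{
	return (int)((b->w[n / DEMO_WORD_BITS] >> (n % DEMO_WORD_BITS)) & 1UL);
}

static void demo_set(demo_bitmap *b, unsigned n)
{
	b->w[n / DEMO_WORD_BITS] |= 1UL << (n % DEMO_WORD_BITS);
}

/* Count the set bits (the "weight" of the bitmap). */
static unsigned demo_weight(const demo_bitmap *b)
{
	unsigned n, w = 0;

	for (n = 0; n < DEMO_BITS; n++)
		w += (unsigned)demo_test(b, n);
	return w;
}

/* Build a bitmap from a list of bit positions (each < DEMO_BITS). */
static demo_bitmap demo_from_list(const unsigned *pos, unsigned count)
{
	demo_bitmap b;
	unsigned i;

	memset(&b, 0, sizeof(b));
	for (i = 0; i < count; i++)
		demo_set(&b, pos[i]);
	return b;
}

/* For every bit m set in orig, set bit (m % sz) in dst. Needs sz > 0. */
static void demo_fold(demo_bitmap *dst, const demo_bitmap *orig, unsigned sz)
{
	unsigned m;

	memset(dst, 0, sizeof(*dst));
	for (m = 0; m < DEMO_BITS; m++)
		if (demo_test(orig, m))
			demo_set(dst, m % sz);
}

/* If bit m is set in orig, set the position of relmap's m-th set bit in dst. */
static void demo_onto(demo_bitmap *dst, const demo_bitmap *orig,
		      const demo_bitmap *relmap)
{
	unsigned n, m = 0;

	memset(dst, 0, sizeof(*dst));
	for (n = 0; n < DEMO_BITS; n++) {
		if (!demo_test(relmap, n))
			continue;
		if (demo_test(orig, m))
			demo_set(dst, n);
		m++;	/* m counts the set bits of relmap seen so far */
	}
}
```

With relmap = {40 41 42 43 45 48 53 61 74 95} and orig = {78 102 211},
calling demo_fold(&tmp, &orig, demo_weight(&relmap)) and then
demo_onto(&dst, &tmp, &relmap) should leave bits 41, 42 and 74 set in dst,
while calling demo_onto() on orig directly leaves dst empty, matching the
last (*) row of the table.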
diff --git a/drivers/net/mlnx_uio/kernel/kcompat.c b/drivers/net/mlnx_uio/kernel/kcompat.c
new file mode 100644
index 0000000..2151dd3
--- /dev/null
+++ b/drivers/net/mlnx_uio/kernel/kcompat.c
@@ -0,0 +1,96 @@
+/*
+ * kcompat.c
+ *
+ * Created on: Jun 24, 2015
+ * Author: leeopop
+ */
+
+#include "kmod.h"
+
+static struct list_head __module_parameter_list = LIST_HEAD_INIT(__module_parameter_list);
+void register_module_parameter(__module_param_t* param_t)
+{
+ list_add_tail(&param_t->list, &__module_parameter_list);
+}
+
+void register_module_parameter_desc(__module_param_t* param_t, const char* desc)
+{
+ param_t->description = desc;
+}
+
+int module_paramter_count()
+{
+ int count = 0;
+ __module_param_t* iter = 0;
+ list_for_each_entry(iter, &__module_parameter_list, list)
+ {
+ count++;
+ }
+ return count;
+}
+
+enum module_param_type module_paramter_type(int index)
+{
+ int count = 0;
+ __module_param_t* iter = 0;
+ list_for_each_entry(iter, &__module_parameter_list, list)
+ {
+ if(count == index)
+ return iter->param_type;
+ count++;
+ }
+ return param_type_none;
+}
+
+void* module_paramter_ptr(int index)
+{
+ int count = 0;
+ __module_param_t* iter = 0;
+ list_for_each_entry(iter, &__module_parameter_list, list)
+ {
+ if(count == index)
+ return iter->ptr;
+ count++;
+ }
+ return 0;
+}
+
+const char* module_paramter_name(int index)
+{
+ int count = 0;
+ __module_param_t* iter = 0;
+ list_for_each_entry(iter, &__module_parameter_list, list)
+ {
+ if(count == index)
+ return iter->name;
+ count++;
+ }
+ return 0;
+}
+
+const char* module_paramter_desc(int index)
+{
+ int count = 0;
+ __module_param_t* iter = 0;
+ list_for_each_entry(iter, &__module_parameter_list, list)
+ {
+ if(count == index)
+ return iter->description;
+ count++;
+ }
+ return 0;
+}
+
+int module_parameter_elt_count(int index)
+{
+ int count = 0;
+ __module_param_t* iter = 0;
+ list_for_each_entry(iter, &__module_parameter_list, list)
+ {
+ if(count == index)
+ return iter->ptr_size / iter->elt_size;
+ count++;
+ }
+ return 0;
+}
+
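
Note on kcompat.c above: every accessor walks the parameter list from the
head and stops at the requested index, so a lookup is linear in the number of
registered parameters. A minimal standalone sketch of that pattern follows.
It is not part of the patch: struct demo_param, the singly linked list and
all demo_* functions are hypothetical stand-ins for __module_param_t and the
list.h helpers, whose definitions are not shown in this section.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for __module_param_t. */
struct demo_param {
	const char *name;
	void *ptr;		/* storage backing the parameter */
	size_t ptr_size;	/* total size of that storage, in bytes */
	size_t elt_size;	/* size of one element, in bytes */
	struct demo_param *next;
};

static struct demo_param *demo_head;
static struct demo_param **demo_tail = &demo_head;

/* Append at the tail so that indices follow registration order. */
static void demo_register(struct demo_param *p)
{
	p->next = NULL;
	*demo_tail = p;
	demo_tail = &p->next;
}

/* Walk the list and return the entry at index, or NULL if out of range. */
static struct demo_param *demo_param_at(int index)
{
	struct demo_param *it;
	int count = 0;

	for (it = demo_head; it != NULL; it = it->next) {
		if (count == index)
			return it;
		count++;
	}
	return NULL;
}

static int demo_param_count(void)
{
	struct demo_param *it;
	int count = 0;

	for (it = demo_head; it != NULL; it = it->next)
		count++;
	return count;
}

/* Number of elements in an array parameter; 0 for an unknown index. */
static int demo_param_elt_count(int index)
{
	struct demo_param *p = demo_param_at(index);

	return p ? (int)(p->ptr_size / p->elt_size) : 0;
}
```

Funnelling all accessors through one lookup helper such as demo_param_at()
keeps the walk in a single place. Whether the patch should be restructured
that way is a design choice for the author.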
diff --git a/drivers/net/mlnx_uio/kernel/radix-tree.c b/drivers/net/mlnx_uio/kernel/radix-tree.c
new file mode 100644
index 0000000..4ed6289
--- /dev/null
+++ b/drivers/net/mlnx_uio/kernel/radix-tree.c
@@ -0,0 +1,78 @@
+/*
+ * Copyright (C) 2001 Momchil Velikov
+ * Portions Copyright (C) 2001 Christoph Hellwig
+ * Copyright (C) 2005 SGI, Christoph Lameter
+ * Copyright (C) 2006 Nick Piggin
+ * Copyright (C) 2012 Konstantin Khlebnikov
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include "radix-tree.h"
+
+#include "kmod.h"
+
+/**
+ * radix_tree_insert - insert into a radix tree
+ * @root: radix tree root
+ * @index: index key
+ * @item: item to insert
+ *
+ * Insert an item into the radix tree at position @index.
+ */
+int radix_tree_insert(struct radix_tree_root *root,
+ unsigned long index, void *item)
+{
+ assert(index < RADIX_TREE_MAP_SIZE);
+ root->value[index] = item;
+
+ return 0;
+}
+EXPORT_SYMBOL(radix_tree_insert);
+
+/*
+ * is_slot is kept from the kernel signature (1: slot, 0: node) but is
+ * ignored here: this flat-array version always returns the stored item.
+ */
+static void *radix_tree_lookup_element(struct radix_tree_root *root,
+ unsigned long index, int is_slot)
+{
+ assert(index < RADIX_TREE_MAP_SIZE);
+ return root->value[index];
+}
+
+void *radix_tree_lookup(struct radix_tree_root *root, unsigned long index)
+{
+ return radix_tree_lookup_element(root, index, 0);
+}
+EXPORT_SYMBOL(radix_tree_lookup);
+
+/**
+ * radix_tree_delete - delete an item from a radix tree
+ * @root: radix tree root
+ * @index: index key
+ *
+ * Remove the item at @index from the radix tree rooted at @root.
+ *
+ * Returns the address of the deleted item, or NULL if it was not present.
+ */
+void *radix_tree_delete(struct radix_tree_root *root, unsigned long index)
+{
+ assert(index < RADIX_TREE_MAP_SIZE);
+ void* ret = root->value[index];
+ root->value[index] = 0;
+ return ret;
+}
+EXPORT_SYMBOL(radix_tree_delete);
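Note on radix-tree.c above: the file keeps the kernel function names but
stores items in a flat array indexed directly by the key, so insert,
lookup and delete are single array accesses and keys must be smaller
than RADIX_TREE_MAP_SIZE. The sketch below shows the same idea in a
standalone form. It is an illustration under assumptions: `struct
flat_map`, the `flat_map_*` functions and the size of 64 are
hypothetical (the real structure and constant are defined in
radix-tree.h), and out-of-range keys are reported with a return value
instead of assert().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size; the real RADIX_TREE_MAP_SIZE is defined in radix-tree.h. */
#define FLAT_MAP_SIZE 64

/* Hypothetical stand-in for struct radix_tree_root. */
struct flat_map {
	void *value[FLAT_MAP_SIZE];
};

/* Store item at index; returns 0 on success, -1 if index is out of range. */
static int flat_map_insert(struct flat_map *m, unsigned long index, void *item)
{
	if (index >= FLAT_MAP_SIZE)
		return -1;
	m->value[index] = item;
	return 0;
}

/* Return the item at index, or NULL if empty or out of range. */
static void *flat_map_lookup(const struct flat_map *m, unsigned long index)
{
	return index < FLAT_MAP_SIZE ? m->value[index] : NULL;
}

/* Clear the slot at index and return what it held (NULL if nothing). */
static void *flat_map_delete(struct flat_map *m, unsigned long index)
{
	void *old;

	if (index >= FLAT_MAP_SIZE)
		return NULL;
	old = m->value[index];
	m->value[index] = NULL;
	return old;
}
```

As in the patch, inserting at an occupied index silently overwrites the
previous item.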
diff --git a/drivers/net/mlnx_uio/kernel/rbtree.c b/drivers/net/mlnx_uio/kernel/rbtree.c
new file mode 100644
index 0000000..8102f48
--- /dev/null
+++ b/drivers/net/mlnx_uio/kernel/rbtree.c
@@ -0,0 +1,561 @@
+/*
+ Red Black Trees
+ (C) 1999 Andrea Arcangeli <andrea@suse.de>
+ (C) 2002 David Woodhouse <dwmw2@infradead.org>
+ (C) 2012 Michel Lespinasse <walken@google.com>
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program; if not, write to the Free Software
+ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+
+ linux/lib/rbtree.c
+*/
+
+#include "kmod.h"
+#include "rbtree_augmented.h"
+/*
+ * red-black trees properties: http://en.wikipedia.org/wiki/Rbtree
+ *
+ * 1) A node is either red or black
+ * 2) The root is black
+ * 3) All leaves (NULL) are black
+ * 4) Both children of every red node are black
+ * 5) Every simple path from root to leaves contains the same number
+ * of black nodes.
+ *
+ * 4 and 5 give the O(log n) guarantee, since 4 implies you cannot have two
+ * consecutive red nodes in a path and every red node is therefore followed by
+ * a black. So if B is the number of black nodes on every simple path (as per
+ * 5), then the longest possible path due to 4 is 2B.
+ *
+ * We shall indicate color with case, where black nodes are uppercase and red
+ * nodes will be lowercase. Unknown color nodes shall be drawn as red within
+ * parentheses and have some accompanying text comment.
+ */
+
+static inline void rb_set_black(struct rb_node *rb)
+{
+ rb->__rb_parent_color |= RB_BLACK;
+}
+
+static inline struct rb_node *rb_red_parent(struct rb_node *red)
+{
+ return (struct rb_node *)red->__rb_parent_color;
+}
+
+/*
+ * Helper function for rotations:
+ * - old's parent and color get assigned to new
+ * - old gets assigned new as a parent and 'color' as a color.
+ */
+static inline void
+__rb_rotate_set_parents(struct rb_node *old, struct rb_node *new,
+ struct rb_root *root, int color)
+{
+ struct rb_node *parent = rb_parent(old);
+ new->__rb_parent_color = old->__rb_parent_color;
+ rb_set_parent_color(old, new, color);
+ __rb_change_child(old, new, parent, root);
+}
+
+static __always_inline void
+__rb_insert(struct rb_node *node, struct rb_root *root,
+ void (*augment_rotate)(struct rb_node *old, struct rb_node *new))
+{
+ struct rb_node *parent = rb_red_parent(node), *gparent, *tmp;
+
+ while (true) {
+ /*
+ * Loop invariant: node is red
+ *
+ * If there is a black parent, we are done.
+ * Otherwise, take some corrective action as we don't
+ * want a red root or two consecutive red nodes.
+ */
+ if (!parent) {
+ rb_set_parent_color(node, NULL, RB_BLACK);
+ break;
+ } else if (rb_is_black(parent))
+ break;
+
+ gparent = rb_red_parent(parent);
+
+ tmp = gparent->rb_right;
+ if (parent != tmp) { /* parent == gparent->rb_left */
+ if (tmp && rb_is_red(tmp)) {
+ /*
+ * Case 1 - color flips
+ *
+ * G g
+ * / \ / \
+ * p u --> P U
+ * / /
+ * n N
+ *
+ * However, since g's parent might be red, and
+ * 4) does not allow this, we need to recurse
+ * at g.
+ */
+ rb_set_parent_color(tmp, gparent, RB_BLACK);
+ rb_set_parent_color(parent, gparent, RB_BLACK);
+ node = gparent;
+ parent = rb_parent(node);
+ rb_set_parent_color(node, parent, RB_RED);
+ continue;
+ }
+
+ tmp = parent->rb_right;
+ if (node == tmp) {
+ /*
+ * Case 2 - left rotate at parent
+ *
+ * G G
+ * / \ / \
+ * p U --> n U
+ * \ /
+ * n p
+ *
+ * This still leaves us in violation of 4), the
+ * continuation into Case 3 will fix that.
+ */
+ parent->rb_right = tmp = node->rb_left;
+ node->rb_left = parent;
+ if (tmp)
+ rb_set_parent_color(tmp, parent,
+ RB_BLACK);
+ rb_set_parent_color(parent, node, RB_RED);
+ augment_rotate(parent, node);
+ parent = node;
+ tmp = node->rb_right;
+ }
+
+ /*
+ * Case 3 - right rotate at gparent
+ *
+ * G P
+ * / \ / \
+ * p U --> n g
+ * / \
+ * n U
+ */
+ gparent->rb_left = tmp; /* == parent->rb_right */
+ parent->rb_right = gparent;
+ if (tmp)
+ rb_set_parent_color(tmp, gparent, RB_BLACK);
+ __rb_rotate_set_parents(gparent, parent, root, RB_RED);
+ augment_rotate(gparent, parent);
+ break;
+ } else {
+ tmp = gparent->rb_left;
+ if (tmp && rb_is_red(tmp)) {
+ /* Case 1 - color flips */
+ rb_set_parent_color(tmp, gparent, RB_BLACK);
+ rb_set_parent_color(parent, gparent, RB_BLACK);
+ node = gparent;
+ parent = rb_parent(node);
+ rb_set_parent_color(node, parent, RB_RED);
+ continue;
+ }
+
+ tmp = parent->rb_left;
+ if (node == tmp) {
+ /* Case 2 - right rotate at parent */
+ parent->rb_left = tmp = node->rb_right;
+ node->rb_right = parent;
+ if (tmp)
+ rb_set_parent_color(tmp, parent,
+ RB_BLACK);
+ rb_set_parent_color(parent, node, RB_RED);
+ augment_rotate(parent, node);
+ parent = node;
+ tmp = node->rb_left;
+ }
+
+ /* Case 3 - left rotate at gparent */
+ gparent->rb_right = tmp; /* == parent->rb_left */
+ parent->rb_left = gparent;
+ if (tmp)
+ rb_set_parent_color(tmp, gparent, RB_BLACK);
+ __rb_rotate_set_parents(gparent, parent, root, RB_RED);
+ augment_rotate(gparent, parent);
+ break;
+ }
+ }
+}
+
+/*
+ * Inline version for rb_erase() use - we want to be able to inline
+ * and eliminate the dummy_rotate callback there
+ */
+static __always_inline void
+____rb_erase_color(struct rb_node *parent, struct rb_root *root,
+ void (*augment_rotate)(struct rb_node *old, struct rb_node *new))
+{
+ struct rb_node *node = NULL, *sibling, *tmp1, *tmp2;
+
+ while (true) {
+ /*
+ * Loop invariants:
+ * - node is black (or NULL on first iteration)
+ * - node is not the root (parent is not NULL)
+ * - All leaf paths going through parent and node have a
+ * black node count that is 1 lower than other leaf paths.
+ */
+ sibling = parent->rb_right;
+ if (node != sibling) { /* node == parent->rb_left */
+ if (rb_is_red(sibling)) {
+ /*
+ * Case 1 - left rotate at parent
+ *
+ * P S
+ * / \ / \
+ * N s --> p Sr
+ * / \ / \
+ * Sl Sr N Sl
+ */
+ parent->rb_right = tmp1 = sibling->rb_left;
+ sibling->rb_left = parent;
+ rb_set_parent_color(tmp1, parent, RB_BLACK);
+ __rb_rotate_set_parents(parent, sibling, root,
+ RB_RED);
+ augment_rotate(parent, sibling);
+ sibling = tmp1;
+ }
+ tmp1 = sibling->rb_right;
+ if (!tmp1 || rb_is_black(tmp1)) {
+ tmp2 = sibling->rb_left;
+ if (!tmp2 || rb_is_black(tmp2)) {
+ /*
+ * Case 2 - sibling color flip
+ * (p could be either color here)
+ *
+ * (p) (p)
+ * / \ / \
+ * N S --> N s
+ * / \ / \
+ * Sl Sr Sl Sr
+ *
+ * This leaves us violating 5) which
+ * can be fixed by flipping p to black
+ * if it was red, or by recursing at p.
+ * p is red when coming from Case 1.
+ */
+ rb_set_parent_color(sibling, parent,
+ RB_RED);
+ if (rb_is_red(parent))
+ rb_set_black(parent);
+ else {
+ node = parent;
+ parent = rb_parent(node);
+ if (parent)
+ continue;
+ }
+ break;
+ }
+ /*
+ * Case 3 - right rotate at sibling
+ * (p could be either color here)
+ *
+ * (p) (p)
+ * / \ / \
+ * N S --> N Sl
+ * / \ \
+ * sl Sr s
+ * \
+ * Sr
+ */
+ sibling->rb_left = tmp1 = tmp2->rb_right;
+ tmp2->rb_right = sibling;
+ parent->rb_right = tmp2;
+ if (tmp1)
+ rb_set_parent_color(tmp1, sibling,
+ RB_BLACK);
+ augment_rotate(sibling, tmp2);
+ tmp1 = sibling;
+ sibling = tmp2;
+ }
+ /*
+ * Case 4 - left rotate at parent + color flips
+ * (p and sl could be either color here.
+ * After rotation, p becomes black, s acquires
+ * p's color, and sl keeps its color)
+ *
+ * (p) (s)
+ * / \ / \
+ * N S --> P Sr
+ * / \ / \
+ * (sl) sr N (sl)
+ */
+ parent->rb_right = tmp2 = sibling->rb_left;
+ sibling->rb_left = parent;
+ rb_set_parent_color(tmp1, sibling, RB_BLACK);
+ if (tmp2)
+ rb_set_parent(tmp2, parent);
+ __rb_rotate_set_parents(parent, sibling, root,
+ RB_BLACK);
+ augment_rotate(parent, sibling);
+ break;
+ } else {
+ sibling = parent->rb_left;
+ if (rb_is_red(sibling)) {
+ /* Case 1 - right rotate at parent */
+ parent->rb_left = tmp1 = sibling->rb_right;
+ sibling->rb_right = parent;
+ rb_set_parent_color(tmp1, parent, RB_BLACK);
+ __rb_rotate_set_parents(parent, sibling, root,
+ RB_RED);
+ augment_rotate(parent, sibling);
+ sibling = tmp1;
+ }
+ tmp1 = sibling->rb_left;
+ if (!tmp1 || rb_is_black(tmp1)) {
+ tmp2 = sibling->rb_right;
+ if (!tmp2 || rb_is_black(tmp2)) {
+ /* Case 2 - sibling color flip */
+ rb_set_parent_color(sibling, parent,
+ RB_RED);
+ if (rb_is_red(parent))
+ rb_set_black(parent);
+ else {
+ node = parent;
+ parent = rb_parent(node);
+ if (parent)
+ continue;
+ }
+ break;
+ }
+ /* Case 3 - left rotate at sibling */
+ sibling->rb_right = tmp1 = tmp2->rb_left;
+ tmp2->rb_left = sibling;
+ parent->rb_left = tmp2;
+ if (tmp1)
+ rb_set_parent_color(tmp1, sibling,
+ RB_BLACK);
+ augment_rotate(sibling, tmp2);
+ tmp1 = sibling;
+ sibling = tmp2;
+ }
+ /* Case 4 - right rotate at parent + color flips */
+ parent->rb_left = tmp2 = sibling->rb_right;
+ sibling->rb_right = parent;
+ rb_set_parent_color(tmp1, sibling, RB_BLACK);
+ if (tmp2)
+ rb_set_parent(tmp2, parent);
+ __rb_rotate_set_parents(parent, sibling, root,
+ RB_BLACK);
+ augment_rotate(parent, sibling);
+ break;
+ }
+ }
+}
+
+/* Non-inline version for rb_erase_augmented() use */
+void __rb_erase_color(struct rb_node *parent, struct rb_root *root,
+ void (*augment_rotate)(struct rb_node *old, struct rb_node *new))
+{
+ ____rb_erase_color(parent, root, augment_rotate);
+}
+EXPORT_SYMBOL(__rb_erase_color);
+
+/*
+ * Non-augmented rbtree manipulation functions.
+ *
+ * We use dummy augmented callbacks here, and have the compiler optimize them
+ * out of the rb_insert_color() and rb_erase() function definitions.
+ */
+
+static inline void dummy_propagate(struct rb_node *node, struct rb_node *stop) {}
+static inline void dummy_copy(struct rb_node *old, struct rb_node *new) {}
+static inline void dummy_rotate(struct rb_node *old, struct rb_node *new) {}
+
+static const struct rb_augment_callbacks dummy_callbacks = {
+ dummy_propagate, dummy_copy, dummy_rotate
+};
+
+void rb_insert_color(struct rb_node *node, struct rb_root *root)
+{
+ __rb_insert(node, root, dummy_rotate);
+}
+EXPORT_SYMBOL(rb_insert_color);
+
+void rb_erase(struct rb_node *node, struct rb_root *root)
+{
+ struct rb_node *rebalance;
+ rebalance = __rb_erase_augmented(node, root, &dummy_callbacks);
+ if (rebalance)
+ ____rb_erase_color(rebalance, root, dummy_rotate);
+}
+EXPORT_SYMBOL(rb_erase);
+
+/*
+ * Augmented rbtree manipulation functions.
+ *
+ * This instantiates the same __always_inline functions as in the non-augmented
+ * case, but this time with user-defined callbacks.
+ */
+
+void __rb_insert_augmented(struct rb_node *node, struct rb_root *root,
+ void (*augment_rotate)(struct rb_node *old, struct rb_node *new))
+{
+ __rb_insert(node, root, augment_rotate);
+}
+EXPORT_SYMBOL(__rb_insert_augmented);
+
+/*
+ * This function returns the first node (in sort order) of the tree.
+ */
+struct rb_node *rb_first(const struct rb_root *root)
+{
+ struct rb_node *n;
+
+ n = root->rb_node;
+ if (!n)
+ return NULL;
+ while (n->rb_left)
+ n = n->rb_left;
+ return n;
+}
+EXPORT_SYMBOL(rb_first);
+
+struct rb_node *rb_last(const struct rb_root *root)
+{
+ struct rb_node *n;
+
+ n = root->rb_node;
+ if (!n)
+ return NULL;
+ while (n->rb_right)
+ n = n->rb_right;
+ return n;
+}
+EXPORT_SYMBOL(rb_last);
+
+struct rb_node *rb_next(const struct rb_node *node)
+{
+ struct rb_node *parent;
+
+ if (RB_EMPTY_NODE(node))
+ return NULL;
+
+ /*
+ * If we have a right-hand child, go down and then left as far
+ * as we can.
+ */
+ if (node->rb_right) {
+ node = node->rb_right;
+ while (node->rb_left)
+ node=node->rb_left;
+ return (struct rb_node *)node;
+ }
+
+ /*
+ * No right-hand children. Everything down and left is smaller than us,
+ * so any 'next' node must be in the general direction of our parent.
+ * Go up the tree; any time the ancestor is a right-hand child of its
+ * parent, keep going up. First time it's a left-hand child of its
+ * parent, said parent is our 'next' node.
+ */
+ while ((parent = rb_parent(node)) && node == parent->rb_right)
+ node = parent;
+
+ return parent;
+}
+EXPORT_SYMBOL(rb_next);
+
+struct rb_node *rb_prev(const struct rb_node *node)
+{
+ struct rb_node *parent;
+
+ if (RB_EMPTY_NODE(node))
+ return NULL;
+
+ /*
+ * If we have a left-hand child, go down and then right as far
+ * as we can.
+ */
+ if (node->rb_left) {
+ node = node->rb_left;
+ while (node->rb_right)
+ node=node->rb_right;
+ return (struct rb_node *)node;
+ }
+
+ /*
+ * No left-hand children. Go up till we find an ancestor which
+ * is a right-hand child of its parent.
+ */
+ while ((parent = rb_parent(node)) && node == parent->rb_left)
+ node = parent;
+
+ return parent;
+}
+EXPORT_SYMBOL(rb_prev);
+
+void rb_replace_node(struct rb_node *victim, struct rb_node *new,
+ struct rb_root *root)
+{
+ struct rb_node *parent = rb_parent(victim);
+
+ /* Set the surrounding nodes to point to the replacement */
+ __rb_change_child(victim, new, parent, root);
+ if (victim->rb_left)
+ rb_set_parent(victim->rb_left, new);
+ if (victim->rb_right)
+ rb_set_parent(victim->rb_right, new);
+
+ /* Copy the pointers/colour from the victim to the replacement */
+ *new = *victim;
+}
+EXPORT_SYMBOL(rb_replace_node);
+
+static struct rb_node *rb_left_deepest_node(const struct rb_node *node)
+{
+ for (;;) {
+ if (node->rb_left)
+ node = node->rb_left;
+ else if (node->rb_right)
+ node = node->rb_right;
+ else
+ return (struct rb_node *)node;
+ }
+ assert(0);
+ return 0;
+}
+
+struct rb_node *rb_next_postorder(const struct rb_node *node)
+{
+ const struct rb_node *parent;
+ if (!node)
+ return NULL;
+ parent = rb_parent(node);
+
+ /* If we're sitting on node, we've already seen our children */
+ if (parent && node == parent->rb_left && parent->rb_right) {
+ /* If we are the parent's left node, go to the parent's right
+ * node then all the way down to the left */
+ return rb_left_deepest_node(parent->rb_right);
+ } else
+ /* Otherwise we are the parent's right node, and the parent
+ * should be next */
+ return (struct rb_node *)parent;
+}
+EXPORT_SYMBOL(rb_next_postorder);
+
+struct rb_node *rb_first_postorder(const struct rb_root *root)
+{
+ if (!root->rb_node)
+ return NULL;
+
+ return rb_left_deepest_node(root->rb_node);
+}
+EXPORT_SYMBOL(rb_first_postorder);
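Note on rb_next() above: the in-order successor is either the leftmost
node of the right subtree or, if there is no right child, the first
ancestor reached from a left child. The sketch below restates that walk
on a simplified node. It is an illustration only: `struct node` and
`node_next` are hypothetical, and the parent is a plain pointer rather
than the packed __rb_parent_color word used by the real code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified node with an explicit parent pointer. */
struct node {
	struct node *parent;
	struct node *left;
	struct node *right;
};

/* In-order successor of n, or NULL if n is the last node. */
static struct node *node_next(struct node *n)
{
	struct node *parent;

	/* Right subtree present: go right once, then left as far as possible. */
	if (n->right) {
		n = n->right;
		while (n->left)
			n = n->left;
		return n;
	}

	/* Otherwise climb while we are a right child; the parent we stop at is next. */
	while ((parent = n->parent) && n == parent->right)
		n = parent;
	return parent;
}
```

rb_prev() is the mirror image, with left and right exchanged.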
diff --git a/drivers/net/mlnx_uio/mlnx/include/mlx4/cmd.h b/drivers/net/mlnx_uio/mlnx/include/mlx4/cmd.h
new file mode 100644
index 0000000..efe68f0
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/include/mlx4/cmd.h
@@ -0,0 +1,309 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2006 Cisco Systems, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef MLX4_CMD_H
+#define MLX4_CMD_H
+
+
+enum {
+ /* initialization and general commands */
+ MLX4_CMD_SYS_EN = 0x1,
+ MLX4_CMD_SYS_DIS = 0x2,
+ MLX4_CMD_MAP_FA = 0xfff,
+ MLX4_CMD_UNMAP_FA = 0xffe,
+ MLX4_CMD_RUN_FW = 0xff6,
+ MLX4_CMD_MOD_STAT_CFG = 0x34,
+ MLX4_CMD_QUERY_DEV_CAP = 0x3,
+ MLX4_CMD_QUERY_FW = 0x4,
+ MLX4_CMD_ENABLE_LAM = 0xff8,
+ MLX4_CMD_DISABLE_LAM = 0xff7,
+ MLX4_CMD_QUERY_DDR = 0x5,
+ MLX4_CMD_QUERY_ADAPTER = 0x6,
+ MLX4_CMD_INIT_HCA = 0x7,
+ MLX4_CMD_CLOSE_HCA = 0x8,
+ MLX4_CMD_INIT_PORT = 0x9,
+ MLX4_CMD_CLOSE_PORT = 0xa,
+ MLX4_CMD_QUERY_HCA = 0xb,
+ MLX4_CMD_QUERY_PORT = 0x43,
+ MLX4_CMD_SENSE_PORT = 0x4d,
+ MLX4_CMD_HW_HEALTH_CHECK = 0x50,
+ MLX4_CMD_SET_PORT = 0xc,
+ MLX4_CMD_SET_NODE = 0x5a,
+ MLX4_CMD_QUERY_FUNC = 0x56,
+ MLX4_CMD_ACCESS_DDR = 0x2e,
+ MLX4_CMD_MAP_ICM = 0xffa,
+ MLX4_CMD_UNMAP_ICM = 0xff9,
+ MLX4_CMD_MAP_ICM_AUX = 0xffc,
+ MLX4_CMD_UNMAP_ICM_AUX = 0xffb,
+ MLX4_CMD_SET_ICM_SIZE = 0xffd,
+ MLX4_CMD_ACCESS_REG = 0x3b,
+ MLX4_CMD_ALLOCATE_VPP = 0x80,
+ MLX4_CMD_SET_VPORT_QOS = 0x81,
+
+ /* master notifies fw when a slave's FLR has finished */
+ MLX4_CMD_INFORM_FLR_DONE = 0x5b,
+ MLX4_CMD_VIRT_PORT_MAP = 0x5c,
+ MLX4_CMD_GET_OP_REQ = 0x59,
+
+ /* TPT commands */
+ MLX4_CMD_SW2HW_MPT = 0xd,
+ MLX4_CMD_QUERY_MPT = 0xe,
+ MLX4_CMD_HW2SW_MPT = 0xf,
+ MLX4_CMD_READ_MTT = 0x10,
+ MLX4_CMD_WRITE_MTT = 0x11,
+ MLX4_CMD_SYNC_TPT = 0x2f,
+
+ /* EQ commands */
+ MLX4_CMD_MAP_EQ = 0x12,
+ MLX4_CMD_SW2HW_EQ = 0x13,
+ MLX4_CMD_HW2SW_EQ = 0x14,
+ MLX4_CMD_QUERY_EQ = 0x15,
+
+ /* CQ commands */
+ MLX4_CMD_SW2HW_CQ = 0x16,
+ MLX4_CMD_HW2SW_CQ = 0x17,
+ MLX4_CMD_QUERY_CQ = 0x18,
+ MLX4_CMD_MODIFY_CQ = 0x2c,
+
+ /* SRQ commands */
+ MLX4_CMD_SW2HW_SRQ = 0x35,
+ MLX4_CMD_HW2SW_SRQ = 0x36,
+ MLX4_CMD_QUERY_SRQ = 0x37,
+ MLX4_CMD_ARM_SRQ = 0x40,
+
+ /* QP/EE commands */
+ MLX4_CMD_RST2INIT_QP = 0x19,
+ MLX4_CMD_INIT2RTR_QP = 0x1a,
+ MLX4_CMD_RTR2RTS_QP = 0x1b,
+ MLX4_CMD_RTS2RTS_QP = 0x1c,
+ MLX4_CMD_SQERR2RTS_QP = 0x1d,
+ MLX4_CMD_2ERR_QP = 0x1e,
+ MLX4_CMD_RTS2SQD_QP = 0x1f,
+ MLX4_CMD_SQD2SQD_QP = 0x38,
+ MLX4_CMD_SQD2RTS_QP = 0x20,
+ MLX4_CMD_2RST_QP = 0x21,
+ MLX4_CMD_QUERY_QP = 0x22,
+ MLX4_CMD_INIT2INIT_QP = 0x2d,
+ MLX4_CMD_SUSPEND_QP = 0x32,
+ MLX4_CMD_UNSUSPEND_QP = 0x33,
+ MLX4_CMD_UPDATE_QP = 0x61,
+ /* special QP and management commands */
+ MLX4_CMD_CONF_SPECIAL_QP = 0x23,
+ MLX4_CMD_MAD_IFC = 0x24,
+ MLX4_CMD_MAD_DEMUX = 0x203,
+
+ /* multicast commands */
+ MLX4_CMD_READ_MCG = 0x25,
+ MLX4_CMD_WRITE_MCG = 0x26,
+ MLX4_CMD_MGID_HASH = 0x27,
+
+ /* miscellaneous commands */
+ MLX4_CMD_DIAG_RPRT = 0x30,
+ MLX4_CMD_NOP = 0x31,
+ MLX4_CMD_CONFIG_DEV = 0x3a,
+ MLX4_CMD_ACCESS_MEM = 0x2e,
+ MLX4_CMD_SET_VEP = 0x52,
+
+ /* Ethernet specific commands */
+ MLX4_CMD_SET_VLAN_FLTR = 0x47,
+ MLX4_CMD_SET_MCAST_FLTR = 0x48,
+ MLX4_CMD_DUMP_ETH_STATS = 0x49,
+
+ /* Communication channel commands */
+ MLX4_CMD_ARM_COMM_CHANNEL = 0x57,
+ MLX4_CMD_GEN_EQE = 0x58,
+
+ /* virtual commands */
+ MLX4_CMD_ALLOC_RES = 0xf00,
+ MLX4_CMD_FREE_RES = 0xf01,
+ MLX4_CMD_MCAST_ATTACH = 0xf05,
+ MLX4_CMD_UCAST_ATTACH = 0xf06,
+ MLX4_CMD_PROMISC = 0xf08,
+ MLX4_CMD_QUERY_FUNC_CAP = 0xf0a,
+ MLX4_CMD_QP_ATTACH = 0xf0b,
+
+ /* debug commands */
+ MLX4_CMD_QUERY_DEBUG_MSG = 0x2a,
+ MLX4_CMD_SET_DEBUG_MSG = 0x2b,
+
+ /* statistics commands */
+ MLX4_CMD_QUERY_IF_STAT = 0x54,
+ MLX4_CMD_SET_IF_STAT = 0x55,
+
+ /* register/delete flow steering network rules */
+ MLX4_QP_FLOW_STEERING_ATTACH = 0x65,
+ MLX4_QP_FLOW_STEERING_DETACH = 0x66,
+ MLX4_FLOW_STEERING_IB_UC_QP_RANGE = 0x64,
+
+ /* Update and read QCN parameters */
+ MLX4_CMD_CONGESTION_CTRL_OPCODE = 0x68,
+};
+
+enum {
+ MLX4_CMD_TIME_CLASS_A = 60000,
+ MLX4_CMD_TIME_CLASS_B = 60000,
+ MLX4_CMD_TIME_CLASS_C = 60000,
+};
+
+enum {
+ /* virtual to physical port mapping opcode modifiers */
+ MLX4_GET_PORT_VIRT2PHY = 0x0,
+ MLX4_SET_PORT_VIRT2PHY = 0x1,
+};
+
+enum {
+ MLX4_MAILBOX_SIZE = 4096,
+ MLX4_ACCESS_MEM_ALIGN = 256,
+};
+
+enum {
+ /* Set port opcode modifiers */
+ MLX4_SET_PORT_IB_OPCODE = 0x0,
+ MLX4_SET_PORT_ETH_OPCODE = 0x1,
+ MLX4_SET_PORT_BEACON_OPCODE = 0x4,
+};
+
+enum {
+ /* Set port Ethernet input modifiers */
+ MLX4_SET_PORT_GENERAL = 0x0,
+ MLX4_SET_PORT_RQP_CALC = 0x1,
+ MLX4_SET_PORT_MAC_TABLE = 0x2,
+ MLX4_SET_PORT_VLAN_TABLE = 0x3,
+ MLX4_SET_PORT_PRIO_MAP = 0x4,
+ MLX4_SET_PORT_GID_TABLE = 0x5,
+ MLX4_SET_PORT_PRIO2TC = 0x8,
+ MLX4_SET_PORT_SCHEDULER = 0x9,
+ MLX4_SET_PORT_VXLAN = 0xB,
+ MLX4_SET_PORT_ROCE_ADDR = 0xD
+};
+
+enum {
+ MLX4_CMD_MAD_DEMUX_CONFIG = 0,
+ MLX4_CMD_MAD_DEMUX_QUERY_STATE = 1,
+ MLX4_CMD_MAD_DEMUX_QUERY_RESTR = 2, /* Query mad demux restrictions */
+};
+
+enum {
+ MLX4_CMD_WRAPPED,
+ MLX4_CMD_NATIVE
+};
+
+struct mlx4_config_dev_params {
+ u16 vxlan_udp_dport;
+ u8 rx_csum_flags_port_1;
+ u8 rx_csum_flags_port_2;
+};
+
+struct mlx4_dev;
+
+struct mlx4_cmd_mailbox {
+ void *buf;
+ dma_addr_t dma;
+};
+
+int __mlx4_cmd(struct mlx4_dev *dev, u64 in_param, u64 *out_param,
+ int out_is_imm, u32 in_modifier, u8 op_modifier,
+ u16 op, unsigned long timeout, int native);
+
+/* Invoke a command with no output parameter */
+static inline int mlx4_cmd(struct mlx4_dev *dev, u64 in_param, u32 in_modifier,
+ u8 op_modifier, u16 op, unsigned long timeout,
+ int native)
+{
+ return __mlx4_cmd(dev, in_param, NULL, 0, in_modifier,
+ op_modifier, op, timeout, native);
+}
+
+/* Invoke a command with an output mailbox */
+static inline int mlx4_cmd_box(struct mlx4_dev *dev, u64 in_param, u64 out_param,
+ u32 in_modifier, u8 op_modifier, u16 op,
+ unsigned long timeout, int native)
+{
+ return __mlx4_cmd(dev, in_param, &out_param, 0, in_modifier,
+ op_modifier, op, timeout, native);
+}
+
+/*
+ * Invoke a command with an immediate output parameter (and copy the
+ * output into the caller's out_param pointer after the command
+ * executes).
+ */
+static inline int mlx4_cmd_imm(struct mlx4_dev *dev, u64 in_param, u64 *out_param,
+ u32 in_modifier, u8 op_modifier, u16 op,
+ unsigned long timeout, int native)
+{
+ return __mlx4_cmd(dev, in_param, out_param, 1, in_modifier,
+ op_modifier, op, timeout, native);
+}
+
+struct mlx4_cmd_mailbox *mlx4_alloc_cmd_mailbox(struct mlx4_dev *dev);
+void mlx4_free_cmd_mailbox(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox);
+
+#ifdef KMOD_DISABLED
+int mlx4_get_vf_statistics(struct mlx4_dev *dev, int port, int vf,
+ struct net_device_stats *link_stats);
+#endif
+
+u32 mlx4_comm_get_version(void);
+int mlx4_set_vf_mac(struct mlx4_dev *dev, int port, int vf, u64 mac);
+int mlx4_set_vf_vlan(struct mlx4_dev *dev, int port, int vf, u16 vlan, u8 qos);
+int mlx4_set_vf_rate(struct mlx4_dev *dev, int port, int vf, int min_tx_rate,
+ int max_tx_rate);
+int mlx4_set_vf_spoofchk(struct mlx4_dev *dev, int port, int vf, bool setting);
+#ifdef HAVE_NDO_SET_VF_MAC
+int mlx4_get_vf_config(struct mlx4_dev *dev, int port, int vf, struct ifla_vf_info *ivf);
+#endif
+int mlx4_set_vf_link_state(struct mlx4_dev *dev, int port, int vf, int link_state);
+int mlx4_get_vf_link_state(struct mlx4_dev *dev, int port, int vf);
+int mlx4_config_dev_retrieval(struct mlx4_dev *dev,
+ struct mlx4_config_dev_params *params);
+void mlx4_cmd_wake_completions(struct mlx4_dev *dev);
+void mlx4_report_internal_err_comm_event(struct mlx4_dev *dev);
+ssize_t mlx4_get_vf_rate(struct mlx4_dev *dev, int port, int vf, char *buf);
+/*
+ * mlx4_get_slave_default_vlan -
+ * return true if the slave is in VST (default VLAN) mode;
+ * in that case also return vlan and qos through any non-NULL pointer
+ */
+bool mlx4_get_slave_default_vlan(struct mlx4_dev *dev, int port, int slave,
+ u16 *vlan, u8 *qos);
+
+#define MLX4_COMM_GET_IF_REV(cmd_chan_ver) (u8)((cmd_chan_ver) >> 8)
+#define COMM_CHAN_EVENT_INTERNAL_ERR (1 << 17)
+
+#endif /* MLX4_CMD_H */
+
+#include "post_kmod.h"
diff --git a/drivers/net/mlnx_uio/mlnx/include/mlx4/cq.h b/drivers/net/mlnx_uio/mlnx/include/mlx4/cq.h
new file mode 100644
index 0000000..eb5aaaa
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/include/mlx4/cq.h
@@ -0,0 +1,195 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2007 Cisco Systems, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef MLX4_CQ_H
+#define MLX4_CQ_H
+
+#ifdef HAVE_UAPI_LINUX_IF_ETHER_H
+#else
+#endif
+
+#include "mlx4/device.h"
+#include "mlx4/doorbell.h"
+
+struct mlx4_cqe {
+ __be32 vlan_my_qpn;
+ __be32 immed_rss_invalid;
+ __be32 g_mlpath_rqpn;
+ union {
+ struct {
+ union {
+ struct {
+ __be16 sl_vid;
+ __be16 rlid;
+ };
+ __be32 timestamp_16_47;
+ };
+ __be16 status;
+ u8 ipv6_ext_mask;
+ u8 badfcs_enc;
+ };
+ struct {
+ __be16 reserved1;
+ u8 smac[6];
+ };
+ };
+ __be32 byte_cnt;
+ __be16 wqe_index;
+ __be16 checksum;
+ u8 reserved2[1];
+ __be16 timestamp_0_15;
+ u8 owner_sr_opcode;
+} __packed;
+
+struct mlx4_err_cqe {
+ __be32 my_qpn;
+ u32 reserved1[5];
+ __be16 wqe_index;
+ u8 vendor_err_syndrome;
+ u8 syndrome;
+ u8 reserved2[3];
+ u8 owner_sr_opcode;
+};
+
+struct mlx4_ts_cqe {
+ __be32 vlan_my_qpn;
+ __be32 immed_rss_invalid;
+ __be32 g_mlpath_rqpn;
+ __be32 timestamp_hi;
+ __be16 status;
+ u8 ipv6_ext_mask;
+ u8 badfcs_enc;
+ __be32 byte_cnt;
+ __be16 wqe_index;
+ __be16 checksum;
+ u8 reserved;
+ __be16 timestamp_lo;
+ u8 owner_sr_opcode;
+} __packed;
+
+enum {
+ MLX4_CQE_L2_TUNNEL_IPOK = 1 << 31,
+ MLX4_CQE_VLAN_PRESENT_MASK = 1 << 29,
+ MLX4_CQE_L2_TUNNEL = 1 << 27,
+ MLX4_CQE_L2_TUNNEL_CSUM = 1 << 26,
+ MLX4_CQE_L2_TUNNEL_IPV4 = 1 << 25,
+
+ MLX4_CQE_QPN_MASK = 0xffffff,
+ MLX4_CQE_VID_MASK = 0xfff,
+};
+
+enum {
+ MLX4_CQE_OWNER_MASK = 0x80,
+ MLX4_CQE_IS_SEND_MASK = 0x40,
+ MLX4_CQE_IS_RECV_MASK = 0x20,
+ MLX4_CQE_OPCODE_MASK = 0x1f
+};
+
+enum {
+ MLX4_CQE_SYNDROME_LOCAL_LENGTH_ERR = 0x01,
+ MLX4_CQE_SYNDROME_LOCAL_QP_OP_ERR = 0x02,
+ MLX4_CQE_SYNDROME_LOCAL_PROT_ERR = 0x04,
+ MLX4_CQE_SYNDROME_WR_FLUSH_ERR = 0x05,
+ MLX4_CQE_SYNDROME_MW_BIND_ERR = 0x06,
+ MLX4_CQE_SYNDROME_BAD_RESP_ERR = 0x10,
+ MLX4_CQE_SYNDROME_LOCAL_ACCESS_ERR = 0x11,
+ MLX4_CQE_SYNDROME_REMOTE_INVAL_REQ_ERR = 0x12,
+ MLX4_CQE_SYNDROME_REMOTE_ACCESS_ERR = 0x13,
+ MLX4_CQE_SYNDROME_REMOTE_OP_ERR = 0x14,
+ MLX4_CQE_SYNDROME_TRANSPORT_RETRY_EXC_ERR = 0x15,
+ MLX4_CQE_SYNDROME_RNR_RETRY_EXC_ERR = 0x16,
+ MLX4_CQE_SYNDROME_REMOTE_ABORTED_ERR = 0x22,
+};
+
+enum {
+ MLX4_CQE_STATUS_IPV4 = 1 << 6,
+ MLX4_CQE_STATUS_IPV4F = 1 << 7,
+ MLX4_CQE_STATUS_IPV6 = 1 << 8,
+ MLX4_CQE_STATUS_IPV4OPT = 1 << 9,
+ MLX4_CQE_STATUS_TCP = 1 << 10,
+ MLX4_CQE_STATUS_UDP = 1 << 11,
+ MLX4_CQE_STATUS_IPOK = 1 << 12,
+};
+
+enum {
+ MLX4_CQE_LLC = 1,
+ MLX4_CQE_SNAP = 1 << 1,
+ MLX4_CQE_BAD_FCS = 1 << 4,
+};
+
+static inline void mlx4_cq_arm(struct mlx4_cq *cq, u32 cmd,
+ void __iomem *uar_page,
+ spinlock_t *doorbell_lock)
+{
+ __be32 doorbell[2];
+ u32 sn;
+ u32 ci;
+
+ sn = cq->arm_sn & 3;
+ ci = cq->cons_index & 0xffffff;
+
+ *cq->arm_db = cpu_to_be32(sn << 28 | cmd | ci);
+
+ /*
+ * Make sure that the doorbell record in host memory is
+ * written before ringing the doorbell via PCI MMIO.
+ */
+ wmb();
+
+ doorbell[0] = cpu_to_be32(sn << 28 | cmd | cq->cqn);
+ doorbell[1] = cpu_to_be32(ci);
+
+ mlx4_write64(doorbell, uar_page + MLX4_CQ_DOORBELL, doorbell_lock);
+}
+
+static inline void mlx4_cq_set_ci(struct mlx4_cq *cq)
+{
+ *cq->set_ci_db = cpu_to_be32(cq->cons_index & 0xffffff);
+}
+
+enum {
+ MLX4_CQ_DB_REQ_NOT_SOL = 1 << 24,
+ MLX4_CQ_DB_REQ_NOT = 2 << 24
+};
+
+int mlx4_cq_modify(struct mlx4_dev *dev, struct mlx4_cq *cq,
+ u16 count, u16 period);
+int mlx4_cq_resize(struct mlx4_dev *dev, struct mlx4_cq *cq,
+ int entries, struct mlx4_mtt *mtt);
+int mlx4_cq_ignore_overrun(struct mlx4_dev *dev, struct mlx4_cq *cq);
+#endif /* MLX4_CQ_H */
+
+#include "post_kmod.h"
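Note on mlx4_cq_arm() above: both the doorbell record and the first
doorbell word are built as `sn << 28 | cmd | low bits`, where sn is the
2-bit arm sequence number and cmd is one of the MLX4_CQ_DB_REQ_* values.
The sketch below reproduces that bit packing in host byte order so the
layout can be checked in isolation. It is an illustration under
assumptions: `cq_arm_word` is a hypothetical helper, the cpu_to_be32()
conversion is left out, and the low field is always masked to 24 bits,
whereas the code above masks cons_index but uses cq->cqn as is.

```c
#include <assert.h>
#include <stdint.h>

/* Same values as MLX4_CQ_DB_REQ_NOT_SOL and MLX4_CQ_DB_REQ_NOT in cq.h. */
#define CQ_DB_REQ_NOT_SOL	(1u << 24)
#define CQ_DB_REQ_NOT		(2u << 24)

/*
 * Host-order arm word: bits 29:28 carry the arm sequence number, cmd
 * carries the request type, and the low 24 bits carry either the
 * consumer index or the CQ number.
 */
static uint32_t cq_arm_word(uint32_t arm_sn, uint32_t cmd, uint32_t low24)
{
	uint32_t sn = arm_sn & 3;

	return (sn << 28) | cmd | (low24 & 0xffffff);
}
```

For example, arm_sn = 5 (so sn = 1), CQ_DB_REQ_NOT and a low field of 1
give 0x12000001 before byte swapping.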
diff --git a/drivers/net/mlnx_uio/mlnx/include/mlx4/device.h b/drivers/net/mlnx_uio/mlnx/include/mlx4/device.h
new file mode 100644
index 0000000..97c1c68
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/include/mlx4/device.h
@@ -0,0 +1,1744 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef MLX4_DEVICE_H
+#define MLX4_DEVICE_H
+
+#include "bitmap.h"
+#include "radix-tree.h"
+
+#ifdef HAVE_TIMECOUNTER_H
+#else
+#endif
+
+#define MAX_MSIX_P_PORT 17
+#define MAX_MSIX 1024
+#define MIN_MSIX_P_PORT 5
+#define PPC_MAX_MSIX 32
+#define MLX4_IS_LEGACY_EQ_MODE(dev_cap) ((dev_cap).num_comp_vectors < \
+ (dev_cap).num_ports * MIN_MSIX_P_PORT)
+
+#define MLX4_MAX_100M_UNITS_VAL 255 /*
+ * Workaround: can't set values
+ * greater than this value when
+ * using 100 Mbps units.
+ */
+#define MLX4_RATELIMIT_100M_UNITS 3 /* 100 Mbps */
+#define MLX4_RATELIMIT_1G_UNITS 4 /* 1 Gbps */
+#define MLX4_RATELIMIT_DEFAULT 0x00ff
+
+#define MLX4_GID_LEN 16
+#define MLX4_ROCE_MAX_GIDS 128
+#define MLX4_ROCE_PF_GIDS 16
+
+/*
+ * MLX4_RX_CSUM_MODE_VAL_NON_TCP_UDP -
+ * Receive checksum value is reported in CQE also for non TCP/UDP packets.
+ *
+ * MLX4_RX_CSUM_MODE_L4 -
+ * L4_CSUM bit in CQE, which indicates whether or not L4 checksum
+ * was validated correctly, is supported.
+ *
+ * MLX4_RX_CSUM_MODE_IP_OK_IP_NON_TCP_UDP -
+ * IP_OK CQE's field is supported also for non TCP/UDP IP packets.
+ *
+ * MLX4_RX_CSUM_MODE_MULTI_VLAN -
+ * Receive Checksum offload is supported for packets with
+ * more than 2 vlan headers.
+ */
+enum mlx4_rx_csum_mode {
+ MLX4_RX_CSUM_MODE_VAL_NON_TCP_UDP = 1UL << 0,
+ MLX4_RX_CSUM_MODE_L4 = 1UL << 1,
+ MLX4_RX_CSUM_MODE_IP_OK_IP_NON_TCP_UDP = 1UL << 2,
+ MLX4_RX_CSUM_MODE_MULTI_VLAN = 1UL << 3
+};
+
+enum {
+ MLX4_FLAG_MSI_X = 1 << 0,
+ MLX4_FLAG_OLD_PORT_CMDS = 1 << 1,
+ MLX4_FLAG_MASTER = 1 << 2,
+ MLX4_FLAG_SLAVE = 1 << 3,
+ MLX4_FLAG_SRIOV = 1 << 4,
+ MLX4_FLAG_OLD_REG_MAC = 1 << 6,
+ MLX4_FLAG_DEV_NUM_STR = 1 << 5,
+ MLX4_FLAG_BONDED = 1 << 7
+};
+
+enum {
+ MLX4_PORT_CAP_IS_SM = 1 << 1,
+ MLX4_PORT_CAP_DEV_MGMT_SUP = 1 << 19,
+};
+
+enum {
+ MLX4_MAX_PORTS = 2,
+ MLX4_MAX_PORT_PKEYS = 128,
+ MLX4_MAX_PORT_GIDS = 128
+};
+
+/* Base qkey for use in SR-IOV tunnel-qp/proxy-qp communication.
+ * These qkeys must not be allowed for general use. This is a 64k range,
+ * and to test for a violation we use the mask (guards against changes).
+ */
+#define MLX4_RESERVED_QKEY_BASE (0xFFFF0000)
+#define MLX4_RESERVED_QKEY_MASK (0xFFFF0000)
+
+enum {
+ MLX4_BOARD_ID_LEN = 64,
+ MLX4_VSD_LEN = 208
+};
+
+enum {
+ MLX4_MAX_NUM_PF = 16,
+ MLX4_MAX_NUM_VF = 126,
+ MLX4_MAX_NUM_VF_P_PORT = 64,
+ MLX4_MFUNC_MAX = 128,
+ MLX4_MAX_EQ_NUM = 1024,
+ MLX4_MFUNC_EQ_NUM = 4,
+ MLX4_MFUNC_MAX_EQES = 8,
+ MLX4_MFUNC_EQE_MASK = (MLX4_MFUNC_MAX_EQES - 1)
+};
+
+/* Driver supports 3 different device methods to manage traffic steering:
+ * - Device managed - High level API for ib and eth flow steering. FW is
+ * managing flow steering tables.
+ * - B0 steering mode - Common low level API for ib and (if supported) eth.
+ * - A0 steering mode - Limited low level API for eth. In case of IB,
+ * B0 mode is in use.
+ */
+enum {
+ MLX4_STEERING_MODE_A0,
+ MLX4_STEERING_MODE_B0,
+ MLX4_STEERING_MODE_DEVICE_MANAGED
+};
+
+enum {
+ MLX4_STEERING_DMFS_A0_DEFAULT,
+ MLX4_STEERING_DMFS_A0_DYNAMIC,
+ MLX4_STEERING_DMFS_A0_STATIC,
+ MLX4_STEERING_DMFS_A0_DISABLE,
+ MLX4_STEERING_DMFS_A0_NOT_SUPPORTED
+};
+
+enum {
+ MLX4_STEERING_ATTR_DMFS_IPOIB = (1UL << 0),
+ MLX4_STEERING_ATTR_DMFS_EN = (1UL << 1),
+ MLX4_STEERING_ATTR_IB_IGNORE_SIP = (1UL << 2),
+ MLX4_STEERING_ATTR_ETH_IGNORE_SIP = (1UL << 3),
+};
+
+static inline const char *mlx4_steering_mode_str(int steering_mode)
+{
+ switch (steering_mode) {
+ case MLX4_STEERING_MODE_A0:
+ return "A0 steering";
+
+ case MLX4_STEERING_MODE_B0:
+ return "B0 steering";
+
+ case MLX4_STEERING_MODE_DEVICE_MANAGED:
+ return "Device managed flow steering";
+
+ default:
+ return "Unrecognized steering mode";
+ }
+}
+
+enum {
+ MLX4_TUNNEL_OFFLOAD_MODE_NONE,
+ MLX4_TUNNEL_OFFLOAD_MODE_VXLAN
+};
+
+enum {
+ MLX4_DEV_CAP_FLAG_RC = 1LL << 0,
+ MLX4_DEV_CAP_FLAG_UC = 1LL << 1,
+ MLX4_DEV_CAP_FLAG_UD = 1LL << 2,
+ MLX4_DEV_CAP_FLAG_XRC = 1LL << 3,
+ MLX4_DEV_CAP_FLAG_SRQ = 1LL << 6,
+ MLX4_DEV_CAP_FLAG_IPOIB_CSUM = 1LL << 7,
+ MLX4_DEV_CAP_FLAG_BAD_PKEY_CNTR = 1LL << 8,
+ MLX4_DEV_CAP_FLAG_BAD_QKEY_CNTR = 1LL << 9,
+ MLX4_DEV_CAP_FLAG_DPDP = 1LL << 12,
+ MLX4_DEV_CAP_FLAG_BLH = 1LL << 15,
+ MLX4_DEV_CAP_FLAG_MEM_WINDOW = 1LL << 16,
+ MLX4_DEV_CAP_FLAG_APM = 1LL << 17,
+ MLX4_DEV_CAP_FLAG_ATOMIC = 1LL << 18,
+ MLX4_DEV_CAP_FLAG_RAW_MCAST = 1LL << 19,
+ MLX4_DEV_CAP_FLAG_UD_AV_PORT = 1LL << 20,
+ MLX4_DEV_CAP_FLAG_UD_MCAST = 1LL << 21,
+ MLX4_DEV_CAP_FLAG_IBOE = 1LL << 30,
+ MLX4_DEV_CAP_FLAG_UC_LOOPBACK = 1LL << 32,
+ MLX4_DEV_CAP_FLAG_FCS_KEEP = 1LL << 34,
+ MLX4_DEV_CAP_FLAG_WOL_PORT1 = 1LL << 37,
+ MLX4_DEV_CAP_FLAG_WOL_PORT2 = 1LL << 38,
+ MLX4_DEV_CAP_FLAG_UDP_RSS = 1LL << 40,
+ MLX4_DEV_CAP_FLAG_VEP_UC_STEER = 1LL << 41,
+ MLX4_DEV_CAP_FLAG_VEP_MC_STEER = 1LL << 42,
+ MLX4_DEV_CAP_FLAG_CROSS_CHANNEL = 1LL << 44,
+ MLX4_DEV_CAP_FLAG_COUNTERS = 1LL << 48,
+ MLX4_DEV_CAP_FLAG_COUNTERS_EXT = 1LL << 49,
+ MLX4_DEV_CAP_FLAG_RSS_IP_FRAG = 1LL << 52,
+ MLX4_DEV_CAP_FLAG_SET_ETH_SCHED = 1LL << 53,
+ MLX4_DEV_CAP_FLAG_SENSE_SUPPORT = 1LL << 55,
+ MLX4_DEV_CAP_FLAG_FAST_DROP = 1LL << 57,
+ MLX4_DEV_CAP_FLAG_PORT_MNG_CHG_EV = 1LL << 59,
+ MLX4_DEV_CAP_FLAG_64B_EQE = 1LL << 61,
+ MLX4_DEV_CAP_FLAG_64B_CQE = 1LL << 62,
+ MLX4_DEV_CAP_FLAG_R_ROCE = 1LL << 63
+};
+
+enum {
+ MLX4_DEV_CAP_FLAG2_RSS = 1LL << 0,
+ MLX4_DEV_CAP_FLAG2_RSS_TOP = 1LL << 1,
+ MLX4_DEV_CAP_FLAG2_RSS_XOR = 1LL << 2,
+ MLX4_DEV_CAP_FLAG2_FS_EN = 1LL << 3,
+ MLX4_DEV_CAP_FLAG2_REASSIGN_MAC_EN = 1LL << 4,
+ MLX4_DEV_CAP_FLAG2_TS = 1LL << 5,
+ MLX4_DEV_CAP_FLAG2_VLAN_CONTROL = 1LL << 6,
+ MLX4_DEV_CAP_FLAG2_FSM = 1LL << 7,
+ MLX4_DEV_CAP_FLAG2_UPDATE_QP = 1LL << 8,
+ MLX4_DEV_CAP_FLAG2_DMFS_IPOIB = 1LL << 9,
+ MLX4_DEV_CAP_FLAG2_VXLAN_OFFLOADS = 1LL << 10,
+ MLX4_DEV_CAP_FLAG2_MAD_DEMUX = 1LL << 11,
+ MLX4_DEV_CAP_FLAG2_CQE_STRIDE = 1LL << 12,
+ MLX4_DEV_CAP_FLAG2_EQE_STRIDE = 1LL << 13,
+ MLX4_DEV_CAP_FLAG2_ETH_PROT_CTRL = 1LL << 14,
+ MLX4_DEV_CAP_FLAG2_ETH_BACKPL_AN_REP = 1LL << 15,
+ MLX4_DEV_CAP_FLAG2_CONFIG_DEV = 1LL << 16,
+ MLX4_DEV_CAP_FLAG2_SYS_EQS = 1LL << 17,
+ MLX4_DEV_CAP_FLAG2_80_VFS = 1LL << 18,
+ MLX4_DEV_CAP_FLAG2_FS_A0 = 1LL << 19,
+ MLX4_DEV_CAP_FLAG2_RECOVERABLE_ERROR_EVENT = 1LL << 20,
+ MLX4_DEV_CAP_FLAG2_PORT_REMAP = 1LL << 21,
+ MLX4_DEV_CAP_FLAG2_ROCEV2 = 1LL << 22,
+ MLX4_DEV_CAP_FLAG2_UPDATE_QP_SRC_CHECK_LB = 1LL << 23,
+ MLX4_DEV_CAP_FLAG2_RX_CSUM_MODE = 1LL << 24,
+ MLX4_DEV_CAP_FLAG2_MODIFY_PARSER = 1LL << 25,
+ MLX4_DEV_CAP_FLAG2_LB_SRC_CHK = 1LL << 26,
+ MLX4_DEV_CAP_FLAG2_ETS_CFG = 1LL << 27,
+ MLX4_DEV_CAP_FLAG2_FLOWSTATS_EN = 1LL << 28,
+ MLX4_DEV_CAP_FLAG2_DRIVER_VERSION_TO_FW = 1LL << 29,
+ MLX4_DEV_CAP_FLAG2_FS_EN_NCSI = 1LL << 30,
+ MLX4_DEV_CAP_FLAG2_DMFS_TAG_MODE = 1LL << 31,
+ MLX4_DEV_CAP_FLAG2_ROCE_V1_V2 = 1LL << 32,
+ MLX4_DEV_CAP_FLAG2_QCN = 1LL << 33,
+ MLX4_DEV_CAP_FLAG2_DISABLE_SIP_CHECK = 1ULL << 34,
+ MLX4_DEV_CAP_FLAG2_QOS_VPP = 1ULL << 35,
+ MLX4_DEV_CAP_FLAG2_PORT_BEACON = 1ULL << 36,
+ MLX4_DEV_CAP_FLAG2_IGNORE_FCS = 1ULL << 37,
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ MLX4_DEV_CAP_FLAG2_WQE_FORMAT = 1ULL << 38,
+#endif
+};
+
+enum {
+ MLX4_QUERY_FUNC_FLAGS_BF_RES_QP = 1LL << 0,
+ MLX4_QUERY_FUNC_FLAGS_A0_RES_QP = 1LL << 1,
+ MLX4_QUERY_FUNC_FLAGS_ROCE_ADDR = 1LL << 2
+};
+
+enum {
+ MLX4_VF_CAP_FLAG_RESET = 1 << 0
+};
+
+/* bit enums for an 8-bit flags field indicating special use
+ * QPs which require special handling in qp_reserve_range.
+ * Currently, this only includes QPs used by the ETH interface,
+ * where we expect to use blueflame. These QPs must not have
+ * bits 6 and 7 set in their qp number.
+ *
+ * This enum may use only bits 0..7.
+ */
+enum {
+ MLX4_RESERVE_A0_QP = 1 << 6,
+ MLX4_RESERVE_ETH_BF_QP = 1 << 7,
+};
+
+enum {
+ MLX4_DEV_CAP_CQ_FLAG_IO = 1 << 0
+};
+
+enum {
+ MLX4_DEV_CAP_64B_EQE_ENABLED = 1LL << 0,
+ MLX4_DEV_CAP_64B_CQE_ENABLED = 1LL << 1,
+ MLX4_DEV_CAP_CQE_STRIDE_ENABLED = 1LL << 2,
+ MLX4_DEV_CAP_EQE_STRIDE_ENABLED = 1LL << 3
+};
+
+enum {
+ MLX4_USER_DEV_CAP_LARGE_CQE = 1L << 0,
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ MLX4_USER_DEV_CAP_WQE_FORMAT = 1L << 1
+#endif
+};
+
+enum {
+ MLX4_FUNC_CAP_64B_EQE_CQE = 1L << 0,
+ MLX4_FUNC_CAP_EQE_CQE_STRIDE = 1L << 1,
+ MLX4_FUNC_CAP_DMFS_A0_STATIC = 1L << 2
+};
+
+
+#define MLX4_ATTR_EXTENDED_PORT_INFO cpu_to_be16(0xff90)
+
+enum {
+ MLX4_BMME_FLAG_WIN_TYPE_2B = 1 << 1,
+ MLX4_BMME_FLAG_LOCAL_INV = 1 << 6,
+ MLX4_BMME_FLAG_REMOTE_INV = 1 << 7,
+ MLX4_BMME_FLAG_TYPE_2_WIN = 1 << 9,
+ MLX4_BMME_FLAG_RESERVED_LKEY = 1 << 10,
+ MLX4_BMME_FLAG_FAST_REG_WR = 1 << 11,
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ MLX4_BMME_FLAG_WQE_FORMAT = 1 << 17,
+#endif
+ MLX4_BMME_FLAG_ROCE_V1_V2 = 1 << 19,
+ MLX4_BMME_FLAG_PORT_REMAP = 1 << 24,
+ MLX4_BMME_FLAG_VSD_INIT2RTR = 1 << 28,
+};
+
+enum {
+ MLX4_FLAG_PORT_REMAP = MLX4_BMME_FLAG_PORT_REMAP
+};
+
+enum {
+ MLX4_FLAG_ROCE_V1_V2 = MLX4_BMME_FLAG_ROCE_V1_V2
+};
+
+enum mlx4_event {
+ MLX4_EVENT_TYPE_COMP = 0x00,
+ MLX4_EVENT_TYPE_PATH_MIG = 0x01,
+ MLX4_EVENT_TYPE_COMM_EST = 0x02,
+ MLX4_EVENT_TYPE_SQ_DRAINED = 0x03,
+ MLX4_EVENT_TYPE_SRQ_QP_LAST_WQE = 0x13,
+ MLX4_EVENT_TYPE_SRQ_LIMIT = 0x14,
+ MLX4_EVENT_TYPE_CQ_ERROR = 0x04,
+ MLX4_EVENT_TYPE_WQ_CATAS_ERROR = 0x05,
+ MLX4_EVENT_TYPE_EEC_CATAS_ERROR = 0x06,
+ MLX4_EVENT_TYPE_PATH_MIG_FAILED = 0x07,
+ MLX4_EVENT_TYPE_WQ_INVAL_REQ_ERROR = 0x10,
+ MLX4_EVENT_TYPE_WQ_ACCESS_ERROR = 0x11,
+ MLX4_EVENT_TYPE_SRQ_CATAS_ERROR = 0x12,
+ MLX4_EVENT_TYPE_LOCAL_CATAS_ERROR = 0x08,
+ MLX4_EVENT_TYPE_PORT_CHANGE = 0x09,
+ MLX4_EVENT_TYPE_EQ_OVERFLOW = 0x0f,
+ MLX4_EVENT_TYPE_ECC_DETECT = 0x0e,
+ MLX4_EVENT_TYPE_CMD = 0x0a,
+ MLX4_EVENT_TYPE_VEP_UPDATE = 0x19,
+ MLX4_EVENT_TYPE_COMM_CHANNEL = 0x18,
+ MLX4_EVENT_TYPE_OP_REQUIRED = 0x1a,
+ MLX4_EVENT_TYPE_FATAL_WARNING = 0x1b,
+ MLX4_EVENT_TYPE_FLR_EVENT = 0x1c,
+ MLX4_EVENT_TYPE_PORT_MNG_CHG_EVENT = 0x1d,
+ MLX4_EVENT_TYPE_RECOVERABLE_ERROR_EVENT = 0x3e,
+ MLX4_EVENT_TYPE_NONE = 0xff,
+};
+
+enum {
+ MLX4_PORT_CHANGE_SUBTYPE_DOWN = 1,
+ MLX4_PORT_CHANGE_SUBTYPE_ACTIVE = 4
+};
+
+enum {
+ MLX4_RECOVERABLE_ERROR_EVENT_SUBTYPE_BAD_CABLE = 1,
+ MLX4_RECOVERABLE_ERROR_EVENT_SUBTYPE_UNSUPPORTED_CABLE = 2,
+ MLX4_RECOVERABLE_ERROR_EVENT_SUBTYPE_BAD_UNREADABLE_EEPROM = 4,
+};
+
+enum {
+ MLX4_FATAL_WARNING_SUBTYPE_WARMING = 0,
+};
+
+enum slave_port_state {
+ SLAVE_PORT_DOWN = 0,
+ SLAVE_PENDING_UP,
+ SLAVE_PORT_UP,
+};
+
+enum slave_port_gen_event {
+ SLAVE_PORT_GEN_EVENT_DOWN = 0,
+ SLAVE_PORT_GEN_EVENT_UP,
+ SLAVE_PORT_GEN_EVENT_NONE,
+};
+
+enum slave_port_state_event {
+ MLX4_PORT_STATE_DEV_EVENT_PORT_DOWN,
+ MLX4_PORT_STATE_DEV_EVENT_PORT_UP,
+ MLX4_PORT_STATE_IB_PORT_STATE_EVENT_GID_VALID,
+ MLX4_PORT_STATE_IB_EVENT_GID_INVALID,
+};
+
+enum {
+ MLX4_PERM_LOCAL_READ = 1 << 10,
+ MLX4_PERM_LOCAL_WRITE = 1 << 11,
+ MLX4_PERM_REMOTE_READ = 1 << 12,
+ MLX4_PERM_REMOTE_WRITE = 1 << 13,
+ MLX4_PERM_ATOMIC = 1 << 14,
+ MLX4_PERM_BIND_MW = 1 << 15,
+ MLX4_PERM_MASK = 0xFC00
+};
+
+enum {
+ MLX4_OPCODE_NOP = 0x00,
+ MLX4_OPCODE_SEND_INVAL = 0x01,
+ MLX4_OPCODE_RDMA_WRITE = 0x08,
+ MLX4_OPCODE_RDMA_WRITE_IMM = 0x09,
+ MLX4_OPCODE_SEND = 0x0a,
+ MLX4_OPCODE_SEND_IMM = 0x0b,
+ MLX4_OPCODE_LSO = 0x0e,
+ MLX4_OPCODE_RDMA_READ = 0x10,
+ MLX4_OPCODE_ATOMIC_CS = 0x11,
+ MLX4_OPCODE_ATOMIC_FA = 0x12,
+ MLX4_OPCODE_MASKED_ATOMIC_CS = 0x14,
+ MLX4_OPCODE_MASKED_ATOMIC_FA = 0x15,
+ MLX4_OPCODE_BIND_MW = 0x18,
+ MLX4_OPCODE_FMR = 0x19,
+ MLX4_OPCODE_LOCAL_INVAL = 0x1b,
+ MLX4_OPCODE_CONFIG_CMD = 0x1f,
+
+ MLX4_RECV_OPCODE_RDMA_WRITE_IMM = 0x00,
+ MLX4_RECV_OPCODE_SEND = 0x01,
+ MLX4_RECV_OPCODE_SEND_IMM = 0x02,
+ MLX4_RECV_OPCODE_SEND_INVAL = 0x03,
+
+ MLX4_CQE_OPCODE_ERROR = 0x1e,
+ MLX4_CQE_OPCODE_RESIZE = 0x16,
+};
+
+enum {
+ MLX4_STAT_RATE_OFFSET = 5
+};
+
+enum mlx4_protocol {
+ MLX4_PROT_IB_IPV6 = 0,
+ MLX4_PROT_ETH,
+ MLX4_PROT_IB_IPV4,
+ MLX4_PROT_FCOE
+};
+
+enum mlx4_flow_roce_type {
+ MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV6 = 0,
+ MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV4
+};
+
+enum {
+ MLX4_MTT_FLAG_PRESENT = 1
+};
+
+enum {
+ MLX4_MAX_MTT_SHIFT = 31
+};
+
+enum mlx4_qp_region {
+ MLX4_QP_REGION_FW = 0,
+ MLX4_QP_REGION_RSS_RAW_ETH,
+ MLX4_QP_REGION_BOTTOM = MLX4_QP_REGION_RSS_RAW_ETH,
+ MLX4_QP_REGION_ETH_ADDR,
+ MLX4_QP_REGION_FC_ADDR,
+ MLX4_QP_REGION_FC_EXCH,
+ MLX4_NUM_QP_REGION
+};
+
+enum mlx4_port_type {
+ MLX4_PORT_TYPE_NONE = 0,
+ MLX4_PORT_TYPE_IB = 1,
+ MLX4_PORT_TYPE_ETH = 2,
+ MLX4_PORT_TYPE_AUTO = 3,
+ MLX4_PORT_TYPE_NA = 4
+};
+
+enum mlx4_special_vlan_idx {
+ MLX4_NO_VLAN_IDX = 0,
+ MLX4_VLAN_MISS_IDX,
+ MLX4_VLAN_REGULAR
+};
+
+enum mlx4_steer_type {
+ MLX4_MC_STEER = 0,
+ MLX4_UC_STEER,
+ MLX4_NUM_STEERS
+};
+
+enum {
+ MLX4_NUM_FEXCH = 64 * 1024,
+};
+
+enum {
+ MLX4_MAX_FAST_REG_PAGES = 511,
+};
+
+enum {
+ MLX4_DEV_PMC_SUBTYPE_GUID_INFO = 0x14,
+ MLX4_DEV_PMC_SUBTYPE_PORT_INFO = 0x15,
+ MLX4_DEV_PMC_SUBTYPE_PKEY_TABLE = 0x16,
+};
+
+/* Port mgmt change event handling */
+enum {
+ MLX4_EQ_PORT_INFO_MSTR_SM_LID_CHANGE_MASK = 1 << 0,
+ MLX4_EQ_PORT_INFO_GID_PFX_CHANGE_MASK = 1 << 1,
+ MLX4_EQ_PORT_INFO_LID_CHANGE_MASK = 1 << 2,
+ MLX4_EQ_PORT_INFO_CLIENT_REREG_MASK = 1 << 3,
+ MLX4_EQ_PORT_INFO_MSTR_SM_SL_CHANGE_MASK = 1 << 4,
+};
+
+enum {
+ MLX4_DEVICE_STATE_UP = 1 << 0,
+ MLX4_DEVICE_STATE_INTERNAL_ERROR = 1 << 1,
+};
+
+enum {
+ MLX4_INTERFACE_STATE_UP = 1 << 0,
+ MLX4_INTERFACE_STATE_DELETION = 1 << 1,
+};
+
+#define MSTR_SM_CHANGE_MASK (MLX4_EQ_PORT_INFO_MSTR_SM_SL_CHANGE_MASK | \
+ MLX4_EQ_PORT_INFO_MSTR_SM_LID_CHANGE_MASK)
+
+enum mlx4_module_id {
+ MLX4_MODULE_ID_SFP = 0x3,
+ MLX4_MODULE_ID_QSFP = 0xC,
+ MLX4_MODULE_ID_QSFP_PLUS = 0xD,
+ MLX4_MODULE_ID_QSFP28 = 0x11,
+};
+
+enum mlx4_roce_mode {
+ MLX4_ROCE_MODE_1,
+ MLX4_ROCE_MODE_1_5,
+ MLX4_ROCE_MODE_2,
+ MLX4_ROCE_MODE_1_5_PLUS_2,
+ MLX4_ROCE_MODE_1_PLUS_2,
+ MLX4_ROCE_MODE_MAX,
+ MLX4_ROCE_MODE_INVALID = MLX4_ROCE_MODE_MAX
+};
+
+static inline const char *mlx4_roce_mode_to_str(enum mlx4_roce_mode m)
+{
+ switch (m) {
+ case MLX4_ROCE_MODE_1:
+ return "RoCE V1";
+ case MLX4_ROCE_MODE_1_5:
+ return "RoCE V1.5";
+ case MLX4_ROCE_MODE_2:
+ return "RoCE V2";
+ case MLX4_ROCE_MODE_1_5_PLUS_2:
+ return "RoCE V1.5/V2";
+ case MLX4_ROCE_MODE_1_PLUS_2:
+ return "RoCE V1/V2";
+ default:
+ return "Unknown";
+ }
+}
+
+static inline u64 mlx4_fw_ver(u64 major, u64 minor, u64 subminor)
+{
+ return (major << 32) | (minor << 16) | subminor;
+}
+
+struct mlx4_phys_caps {
+ u32 gid_phys_table_len[MLX4_MAX_PORTS + 1];
+ u32 pkey_phys_table_len[MLX4_MAX_PORTS + 1];
+ u32 num_phys_eqs;
+ u32 base_sqpn;
+ u32 base_proxy_sqpn;
+ u32 base_tunnel_sqpn;
+};
+
+enum mlx4_roce_gid_type {
+ MLX4_ROCE_GID_TYPE_V1 = 0,
+ MLX4_ROCE_GID_TYPE_V1_5 = 1,
+ MLX4_ROCE_GID_TYPE_V2 = 2,
+ MLX4_ROCE_GID_TYPE_MAX,
+ MLX4_ROCE_GID_TYPE_INVALID = MLX4_ROCE_GID_TYPE_MAX,
+};
+
+static inline const char *mlx4_roce_gid_type_to_str(enum mlx4_roce_gid_type t)
+{
+ switch (t) {
+ case MLX4_ROCE_GID_TYPE_V1:
+ return "V1";
+ case MLX4_ROCE_GID_TYPE_V1_5:
+ return "V1.5";
+ case MLX4_ROCE_GID_TYPE_V2:
+ return "V2";
+ default:
+ return "Unknown";
+ }
+}
+
+static inline int mlx4_roce_is_over_ip(enum mlx4_roce_gid_type gid_type)
+{
+ return gid_type != MLX4_ROCE_GID_TYPE_V1;
+}
+
+struct mlx4_caps {
+ u64 fw_ver;
+ u32 function;
+ int num_ports;
+ int vl_cap[MLX4_MAX_PORTS + 1];
+ int ib_mtu_cap[MLX4_MAX_PORTS + 1];
+ __be32 ib_port_def_cap[MLX4_MAX_PORTS + 1];
+ u64 def_mac[MLX4_MAX_PORTS + 1];
+ int eth_mtu_cap[MLX4_MAX_PORTS + 1];
+ int gid_table_len[MLX4_MAX_PORTS + 1];
+ int pkey_table_len[MLX4_MAX_PORTS + 1];
+ int trans_type[MLX4_MAX_PORTS + 1];
+ int vendor_oui[MLX4_MAX_PORTS + 1];
+ int wavelength[MLX4_MAX_PORTS + 1];
+ u64 trans_code[MLX4_MAX_PORTS + 1];
+ int local_ca_ack_delay;
+ int num_uars;
+ u32 uar_page_size;
+ int bf_reg_size;
+ int bf_regs_per_page;
+ int max_sq_sg;
+ int max_rq_sg;
+ int num_qps;
+ int max_wqes;
+ int max_sq_desc_sz;
+ int max_rq_desc_sz;
+ int max_qp_init_rdma;
+ int max_qp_dest_rdma;
+ u32 *qp0_qkey;
+ u32 *qp0_proxy;
+ u32 *qp1_proxy;
+ u32 *qp0_tunnel;
+ u32 *qp1_tunnel;
+ int num_srqs;
+ int max_srq_wqes;
+ int max_srq_sge;
+ int reserved_srqs;
+ int num_cqs;
+ int max_cqes;
+ int reserved_cqs;
+ int num_sys_eqs;
+ int num_eqs;
+ int reserved_eqs;
+ int num_comp_vectors;
+ int num_mpts;
+ int max_fmr_maps;
+ int num_mtts;
+ int fmr_reserved_mtts;
+ int reserved_mtts;
+ int reserved_mrws;
+ int reserved_uars;
+ int num_mgms;
+ int num_amgms;
+ int reserved_mcgs;
+ int num_qp_per_mgm;
+ int steering_mode;
+ int steering_attr;
+ enum mlx4_roce_mode roce_mode;
+ enum mlx4_roce_gid_type ud_gid_type;
+ int dmfs_high_steer_mode;
+ int fs_log_max_ucast_qp_range_size;
+ int num_pds;
+ int reserved_pds;
+ int max_xrcds;
+ int reserved_xrcds;
+ int mtt_entry_sz;
+ u32 max_msg_sz;
+ u32 page_size_cap;
+ u64 flags;
+ u64 flags2;
+ u32 bmme_flags;
+ u32 reserved_lkey;
+ u16 stat_rate_support;
+ u8 port_width_cap[MLX4_MAX_PORTS + 1];
+ int max_gso_sz;
+ int max_rss_tbl_sz;
+ int reserved_qps_cnt[MLX4_NUM_QP_REGION];
+ int reserved_qps;
+ int reserved_qps_base[MLX4_NUM_QP_REGION];
+ int log_num_macs;
+ int log_num_vlans;
+ enum mlx4_port_type port_type[MLX4_MAX_PORTS + 1];
+ u8 supported_type[MLX4_MAX_PORTS + 1];
+ u8 suggested_type[MLX4_MAX_PORTS + 1];
+ u8 default_sense[MLX4_MAX_PORTS + 1];
+ u32 port_mask[MLX4_MAX_PORTS + 1];
+ enum mlx4_port_type possible_type[MLX4_MAX_PORTS + 1];
+ u32 max_counters;
+ u8 port_ib_mtu[MLX4_MAX_PORTS + 1];
+ u16 sqp_demux;
+ u32 mad_demux;
+ u32 sync_qp;
+ u32 cq_flags;
+ u32 eqe_size;
+ u32 cqe_size;
+ u8 eqe_factor;
+ u32 userspace_caps; /* userspace must be aware of these */
+ u32 function_caps; /* VFs must be aware of these */
+ u8 fast_drop;
+ u16 hca_core_clock;
+ u64 phys_port_id[MLX4_MAX_PORTS + 1];
+ u32 max_basic_counters;
+ u32 max_extended_counters;
+ u8 def_counter_index[MLX4_MAX_PORTS + 1];
+ int tunnel_offload_mode;
+ u8 cq_overrun;
+ u8 rx_checksum_flags_port[MLX4_MAX_PORTS + 1];
+ u8 alloc_res_qp_mask;
+ u32 dmfs_high_rate_qpn_base;
+ u32 dmfs_high_rate_qpn_range;
+ u32 vf_caps;
+ u8 rr_proto;
+ u8 roce_addr_support;
+};
+
+struct mlx4_buf_list {
+ void *buf;
+ dma_addr_t map;
+};
+
+struct mlx4_buf {
+ struct mlx4_buf_list direct;
+ struct mlx4_buf_list *page_list;
+ int nbufs;
+ int npages;
+ int page_shift;
+};
+
+struct mlx4_mtt {
+ u32 offset;
+ int order;
+ int page_shift;
+};
+
+enum {
+ MLX4_DB_PER_PAGE = PAGE_SIZE / 4
+};
+
+struct mlx4_db_pgdir {
+ struct list_head list;
+ DECLARE_BITMAP(order0, MLX4_DB_PER_PAGE);
+ DECLARE_BITMAP(order1, MLX4_DB_PER_PAGE / 2);
+ unsigned long *bits[2];
+ __be32 *db_page;
+ dma_addr_t db_dma;
+};
+
+struct mlx4_ib_user_db_page;
+
+struct mlx4_db {
+ __be32 *db;
+ union {
+ struct mlx4_db_pgdir *pgdir;
+ struct mlx4_ib_user_db_page *user_page;
+ } u;
+ dma_addr_t dma;
+ int index;
+ int order;
+};
+
+struct mlx4_hwq_resources {
+ struct mlx4_db db;
+ struct mlx4_mtt mtt;
+ struct mlx4_buf buf;
+};
+
+struct mlx4_mr {
+ struct mlx4_mtt mtt;
+ u64 iova;
+ u64 size;
+ u32 key;
+ u32 pd;
+ u32 access;
+ int enabled;
+};
+
+enum mlx4_mw_type {
+ MLX4_MW_TYPE_1 = 1,
+ MLX4_MW_TYPE_2 = 2,
+};
+
+struct mlx4_mw {
+ u32 key;
+ u32 pd;
+ enum mlx4_mw_type type;
+ int enabled;
+};
+
+struct mlx4_fmr {
+ struct mlx4_mr mr;
+ struct mlx4_mpt_entry *mpt;
+ __be64 *mtts;
+ dma_addr_t dma_handle;
+ int max_pages;
+ int max_maps;
+ int maps;
+ u8 page_shift;
+};
+
+struct mlx4_uar {
+#ifdef KMOD_MODIFIED
+ void *pfn_addr;
+#else
+ unsigned long pfn;
+#endif
+ int index;
+ struct list_head bf_list;
+ unsigned free_bf_bmap;
+ void __iomem *map;
+ void __iomem *bf_map;
+};
+
+struct mlx4_bf {
+ unsigned int offset;
+ int buf_size;
+ struct mlx4_uar *uar;
+ void __iomem *reg;
+};
+
+struct mlx4_cq {
+#ifdef KMOD_MODIFIED
+ /* XXX: use direct calls for completion handlers */
+#else
+ void (*comp) (struct mlx4_cq *);
+ void (*event) (struct mlx4_cq *, enum mlx4_event);
+#endif
+
+ struct mlx4_uar *uar;
+
+ u32 cons_index;
+
+ u16 irq;
+ __be32 *set_ci_db;
+ __be32 *arm_db;
+ int arm_sn;
+
+ int cqn;
+ unsigned vector;
+
+ atomic_t refcount;
+#ifdef KMOD_MODIFIED
+ struct completion free;
+#endif
+ int eqn;
+ struct {
+ struct list_head list;
+ void (*comp)(struct mlx4_cq *);
+ void *priv;
+ } tasklet_ctx;
+ int reset_notify_added;
+ struct list_head reset_notify;
+};
+
+struct mlx4_qp {
+ void (*event) (struct mlx4_qp *, enum mlx4_event);
+
+ int qpn;
+
+ atomic_t refcount;
+#ifdef KMOD_MODIFIED
+ struct completion free;
+#endif
+};
+
+struct mlx4_srq {
+ void (*event) (struct mlx4_srq *, enum mlx4_event);
+
+ int srqn;
+ int max;
+ int max_gs;
+ int wqe_shift;
+
+ atomic_t refcount;
+#ifdef KMOD_MODIFIED
+ struct completion free;
+#endif
+};
+
+struct mlx4_av {
+ __be32 port_pd;
+ u8 reserved1;
+ u8 g_slid;
+ __be16 dlid;
+ u8 reserved2;
+ u8 gid_index;
+ u8 stat_rate;
+ u8 hop_limit;
+ __be32 sl_tclass_flowlabel;
+ u8 dgid[16];
+};
+
+struct mlx4_eth_av {
+ __be32 port_pd;
+ u8 reserved1;
+ u8 smac_idx;
+ u16 reserved2;
+ u8 reserved3;
+ u8 gid_index;
+ u8 stat_rate;
+ u8 hop_limit;
+ __be32 sl_tclass_flowlabel;
+ u8 dgid[16];
+ u8 s_mac[6];
+ u8 reserved4[2];
+ __be16 vlan;
+ u8 mac[ETH_ALEN];
+};
+
+union mlx4_ext_av {
+ struct mlx4_av ib;
+ struct mlx4_eth_av eth;
+};
+
+struct mlx4_if_stat_control {
+ u8 reserved1[3];
+ /* Extended counters enabled */
+ u8 cnt_mode;
+ /* Number of interfaces */
+ __be32 num_of_if;
+ __be32 reserved[2];
+};
+
+struct mlx4_if_stat_basic {
+ struct mlx4_if_stat_control control;
+ struct {
+ __be64 IfRxFrames;
+ __be64 IfRxOctets;
+ __be64 IfTxFrames;
+ __be64 IfTxOctets;
+ } counters[];
+};
+#define MLX4_IF_STAT_BSC_SZ(ports)(sizeof(struct mlx4_if_stat_extended) +\
+ sizeof(((struct mlx4_if_stat_extended *)0)->\
+ counters[0]) * ports)
+
+struct mlx4_if_stat_extended {
+ struct mlx4_if_stat_control control;
+ struct {
+ __be64 IfRxUnicastFrames;
+ __be64 IfRxUnicastOctets;
+ __be64 IfRxMulticastFrames;
+ __be64 IfRxMulticastOctets;
+ __be64 IfRxBroadcastFrames;
+ __be64 IfRxBroadcastOctets;
+ __be64 IfRxNoBufferFrames;
+ __be64 IfRxNoBufferOctets;
+ __be64 IfRxErrorFrames;
+ __be64 IfRxErrorOctets;
+ __be32 reserved[39];
+ __be64 IfTxUnicastFrames;
+ __be64 IfTxUnicastOctets;
+ __be64 IfTxMulticastFrames;
+ __be64 IfTxMulticastOctets;
+ __be64 IfTxBroadcastFrames;
+ __be64 IfTxBroadcastOctets;
+ __be64 IfTxDroppedFrames;
+ __be64 IfTxDroppedOctets;
+ __be64 IfTxRequestedFramesSent;
+ __be64 IfTxGeneratedFramesSent;
+ __be64 IfTxTsoOctets;
+ } __packed counters[];
+};
+#define MLX4_IF_STAT_EXT_SZ(ports) (sizeof(struct mlx4_if_stat_extended) +\
+ sizeof(((struct mlx4_if_stat_extended *)\
+ 0)->counters[0]) * ports)
+
+union mlx4_counter {
+ struct mlx4_if_stat_control control;
+ struct mlx4_if_stat_basic basic;
+ struct mlx4_if_stat_extended ext;
+};
+#define MLX4_IF_STAT_SZ(ports) MLX4_IF_STAT_EXT_SZ(ports)
+
+struct mlx4_quotas {
+ int qp;
+ int cq;
+ int srq;
+ int mpt;
+ int mtt;
+ int counter;
+ int xrcd;
+};
+
+struct mlx4_vf_dev {
+ u8 min_port;
+ u8 n_ports;
+};
+
+struct mlx4_slaves_base_gid_index {
+ DECLARE_BITMAP(slaves, MLX4_MFUNC_MAX);
+};
+
+struct mlx4_dev_persistent {
+ struct rte_pci_device *rte_pdev;
+ struct mlx4_dev *dev;
+ int nvfs[MLX4_MAX_PORTS + 1];
+ int num_vfs;
+ enum mlx4_port_type curr_port_type[MLX4_MAX_PORTS + 1];
+ enum mlx4_port_type curr_port_poss_type[MLX4_MAX_PORTS + 1];
+#ifdef KMOD_REMOVED
+ struct work_struct catas_work; /* collect errors */
+ struct workqueue_struct *catas_wq;
+#endif
+ struct mutex device_state_mutex; /* protect HW state */
+ u8 state;
+ struct mutex interface_state_mutex; /* protect SW state */
+ u8 interface_state;
+};
+
+struct mlx4_dev {
+ struct mlx4_dev_persistent *persist;
+ unsigned long flags;
+ unsigned long num_slaves;
+ struct mlx4_caps caps;
+ struct mlx4_phys_caps phys_caps;
+ struct mlx4_quotas quotas;
+ struct radix_tree_root qp_table_tree;
+ u8 rev_id;
+ char board_id[MLX4_BOARD_ID_LEN];
+ int numa_node;
+ int oper_log_mgm_entry_size;
+ u64 regid_promisc_array[MLX4_MAX_PORTS + 1];
+ u64 regid_allmulti_array[MLX4_MAX_PORTS + 1];
+ struct mlx4_vf_dev *dev_vfs;
+ spinlock_t eq_accounting_lock;
+};
+
+struct mlx4_clock_params {
+ u64 offset;
+ u8 bar;
+ u8 size;
+};
+
+struct mlx4_eqe {
+ u8 reserved1;
+ u8 type;
+ u8 reserved2;
+ u8 subtype;
+ union {
+ u32 raw[6];
+ struct {
+ __be32 cqn;
+ } __packed comp;
+ struct {
+ u16 reserved1;
+ __be16 token;
+ u32 reserved2;
+ u8 reserved3[3];
+ u8 status;
+ __be64 out_param;
+ } __packed cmd;
+ struct {
+ __be32 qpn;
+ } __packed qp;
+ struct {
+ __be32 srqn;
+ } __packed srq;
+ struct {
+ __be32 cqn;
+ u32 reserved1;
+ u8 reserved2[3];
+ u8 syndrome;
+ } __packed cq_err;
+ struct {
+ u32 reserved1[2];
+ __be32 port;
+ } __packed port_change;
+ struct {
+ #define COMM_CHANNEL_BIT_ARRAY_SIZE 4
+ u32 reserved;
+ u32 bit_vec[COMM_CHANNEL_BIT_ARRAY_SIZE];
+ } __packed comm_channel_arm;
+ struct {
+ u8 port;
+ u8 reserved[3];
+ __be64 mac;
+ } __packed mac_update;
+ struct {
+ __be32 slave_id;
+ } __packed flr_event;
+ struct {
+ __be16 current_temperature;
+ __be16 warning_threshold;
+ } __packed warming;
+ struct {
+ u8 reserved[3];
+ u8 port;
+ union {
+ struct {
+ __be16 mstr_sm_lid;
+ __be16 port_lid;
+ __be32 changed_attr;
+ u8 reserved[3];
+ u8 mstr_sm_sl;
+ __be64 gid_prefix;
+ } __packed port_info;
+ struct {
+ __be32 block_ptr;
+ __be32 tbl_entries_mask;
+ } __packed tbl_change_info;
+ } params;
+ } __packed port_mgmt_change;
+ struct {
+ u8 reserved[3];
+ u8 port;
+ u32 reserved1[5];
+ } __packed bad_cable;
+ } event;
+ u8 slave_id;
+ u8 reserved3[2];
+ u8 owner;
+} __packed;
+
+struct mlx4_init_port_param {
+ int set_guid0;
+ int set_node_guid;
+ int set_si_guid;
+ u16 mtu;
+ int port_width_cap;
+ u16 vl_cap;
+ u16 max_gid;
+ u16 max_pkey;
+ u64 guid0;
+ u64 node_guid;
+ u64 si_guid;
+};
+
+#define MAD_IFC_DATA_SZ 192
+/* MAD IFC Mailbox */
+struct mlx4_mad_ifc {
+ u8 base_version;
+ u8 mgmt_class;
+ u8 class_version;
+ u8 method;
+ __be16 status;
+ __be16 class_specific;
+ __be64 tid;
+ __be16 attr_id;
+ __be16 resv;
+ __be32 attr_mod;
+ __be64 mkey;
+ __be16 dr_slid;
+ __be16 dr_dlid;
+ u8 reserved[28];
+ u8 data[MAD_IFC_DATA_SZ];
+} __packed;
+
+#define mlx4_foreach_port(port, dev, type) \
+ for ((port) = 1; (port) <= (dev)->caps.num_ports; (port)++) \
+ if ((type) == (dev)->caps.port_mask[(port)])
+
+#define mlx4_foreach_non_ib_transport_port(port, dev) \
+ for ((port) = 1; (port) <= (dev)->caps.num_ports; (port)++) \
+ if (((dev)->caps.port_mask[port] != MLX4_PORT_TYPE_IB))
+
+#define mlx4_foreach_ib_transport_port(port, dev) \
+ for ((port) = 1; (port) <= (dev)->caps.num_ports; (port)++) \
+ if (((dev)->caps.port_mask[port] == MLX4_PORT_TYPE_IB) || \
+ ((dev)->caps.flags & MLX4_DEV_CAP_FLAG_IBOE) || \
+ ((dev)->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2) || \
+ ((dev)->caps.flags & MLX4_DEV_CAP_FLAG_R_ROCE) || \
+ ((dev)->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCEV2))
+
+#define MLX4_INVALID_SLAVE_ID 0xFF
+
+#define MLX4_SINK_COUNTER_INDEX 0xff
+
+#ifdef KMOD_REMOVED
+void handle_port_mgmt_change_event(struct work_struct *work);
+#endif
+
+static inline int mlx4_master_func_num(struct mlx4_dev *dev)
+{
+ return dev->caps.function;
+}
+
+static inline int mlx4_is_master(struct mlx4_dev *dev)
+{
+ return dev->flags & MLX4_FLAG_MASTER;
+}
+
+static inline int mlx4_num_reserved_sqps(struct mlx4_dev *dev)
+{
+ return dev->phys_caps.base_sqpn + 8 +
+ 16 * MLX4_MFUNC_MAX * !!mlx4_is_master(dev);
+}
+
+static inline int mlx4_is_qp_reserved(struct mlx4_dev *dev, u32 qpn)
+{
+ return (qpn < dev->phys_caps.base_sqpn + 8 +
+ 16 * MLX4_MFUNC_MAX * !!mlx4_is_master(dev) &&
+ qpn >= dev->phys_caps.base_sqpn) ||
+ (qpn < dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW]);
+}
+
+static inline int mlx4_is_guest_proxy(struct mlx4_dev *dev, int slave, u32 qpn)
+{
+ int guest_proxy_base = dev->phys_caps.base_proxy_sqpn + slave * 8;
+
+ if (qpn >= guest_proxy_base && qpn < guest_proxy_base + 8)
+ return 1;
+
+ return 0;
+}
+
+static inline int mlx4_is_mfunc(struct mlx4_dev *dev)
+{
+ return dev->flags & (MLX4_FLAG_SLAVE | MLX4_FLAG_MASTER);
+}
+
+static inline int mlx4_is_slave(struct mlx4_dev *dev)
+{
+ return dev->flags & MLX4_FLAG_SLAVE;
+}
+
+static inline int mlx4_is_eth(struct mlx4_dev *dev, int port)
+{
+ return dev->caps.port_type[port] == MLX4_PORT_TYPE_IB ? 0 : 1;
+}
+
+int mlx4_buf_alloc(struct mlx4_dev *dev, int size, int max_direct,
+ struct mlx4_buf *buf, gfp_t gfp);
+void mlx4_buf_free(struct mlx4_dev *dev, int size, struct mlx4_buf *buf);
+static inline void *mlx4_buf_offset(struct mlx4_buf *buf, int offset)
+{
+ if (BITS_PER_LONG == 64 || buf->nbufs == 1)
+ return buf->direct.buf + offset;
+ else
+ return buf->page_list[offset >> PAGE_SHIFT].buf +
+ (offset & (PAGE_SIZE - 1));
+}
+
+int mlx4_pd_alloc(struct mlx4_dev *dev, u32 *pdn);
+void mlx4_pd_free(struct mlx4_dev *dev, u32 pdn);
+int mlx4_xrcd_alloc(struct mlx4_dev *dev, u32 *xrcdn);
+void mlx4_xrcd_free(struct mlx4_dev *dev, u32 xrcdn);
+
+int mlx4_uar_alloc(struct mlx4_dev *dev, struct mlx4_uar *uar);
+void mlx4_uar_free(struct mlx4_dev *dev, struct mlx4_uar *uar);
+int mlx4_bf_alloc(struct mlx4_dev *dev, struct mlx4_bf *bf, int node);
+void mlx4_bf_free(struct mlx4_dev *dev, struct mlx4_bf *bf);
+
+int mlx4_mtt_init(struct mlx4_dev *dev, int npages, int page_shift,
+ struct mlx4_mtt *mtt);
+void mlx4_mtt_cleanup(struct mlx4_dev *dev, struct mlx4_mtt *mtt);
+u64 mlx4_mtt_addr(struct mlx4_dev *dev, struct mlx4_mtt *mtt);
+
+int mlx4_mr_alloc(struct mlx4_dev *dev, u32 pd, u64 iova, u64 size, u32 access,
+ int npages, int page_shift, struct mlx4_mr *mr);
+int mlx4_mr_free(struct mlx4_dev *dev, struct mlx4_mr *mr);
+int mlx4_mr_enable(struct mlx4_dev *dev, struct mlx4_mr *mr);
+int mlx4_mw_alloc(struct mlx4_dev *dev, u32 pd, enum mlx4_mw_type type,
+ struct mlx4_mw *mw);
+void mlx4_mw_free(struct mlx4_dev *dev, struct mlx4_mw *mw);
+int mlx4_mw_enable(struct mlx4_dev *dev, struct mlx4_mw *mw);
+int mlx4_write_mtt(struct mlx4_dev *dev, struct mlx4_mtt *mtt,
+ int start_index, int npages, u64 *page_list);
+int mlx4_buf_write_mtt(struct mlx4_dev *dev, struct mlx4_mtt *mtt,
+ struct mlx4_buf *buf, gfp_t gfp);
+
+int mlx4_db_alloc(struct mlx4_dev *dev, struct mlx4_db *db, int order,
+ gfp_t gfp);
+void mlx4_db_free(struct mlx4_dev *dev, struct mlx4_db *db);
+
+int mlx4_alloc_hwq_res(struct mlx4_dev *dev, struct mlx4_hwq_resources *wqres,
+ int size, int max_direct);
+void mlx4_free_hwq_res(struct mlx4_dev *mdev, struct mlx4_hwq_resources *wqres,
+ int size);
+
+int mlx4_cq_alloc(struct mlx4_dev *dev, int nent, struct mlx4_mtt *mtt,
+ struct mlx4_uar *uar, u64 db_rec, struct mlx4_cq *cq,
+ unsigned vector, int collapsed, int timestamp_en);
+void mlx4_cq_free(struct mlx4_dev *dev, struct mlx4_cq *cq);
+int mlx4_qp_reserve_range(struct mlx4_dev *dev, int cnt, int align,
+ int *base, u8 flags);
+void mlx4_qp_release_range(struct mlx4_dev *dev, int base_qpn, int cnt);
+
+int mlx4_qp_alloc(struct mlx4_dev *dev, int qpn, struct mlx4_qp *qp,
+ gfp_t gfp);
+void mlx4_qp_free(struct mlx4_dev *dev, struct mlx4_qp *qp);
+
+int mlx4_srq_alloc(struct mlx4_dev *dev, u32 pdn, u32 cqn, u16 xrcdn,
+ struct mlx4_mtt *mtt, u64 db_rec, struct mlx4_srq *srq);
+void mlx4_srq_free(struct mlx4_dev *dev, struct mlx4_srq *srq);
+int mlx4_srq_arm(struct mlx4_dev *dev, struct mlx4_srq *srq, int limit_watermark);
+int mlx4_srq_query(struct mlx4_dev *dev, struct mlx4_srq *srq, int *limit_watermark);
+
+int mlx4_INIT_PORT(struct mlx4_dev *dev, int port);
+int mlx4_CLOSE_PORT(struct mlx4_dev *dev, int port);
+
+int mlx4_unicast_attach(struct mlx4_dev *dev, struct mlx4_qp *qp, u8 gid[16],
+ int block_mcast_loopback, enum mlx4_protocol prot);
+int mlx4_unicast_detach(struct mlx4_dev *dev, struct mlx4_qp *qp, u8 gid[16],
+ enum mlx4_protocol prot);
+int mlx4_multicast_attach(struct mlx4_dev *dev, struct mlx4_qp *qp, u8 gid[16],
+ u8 port, int block_mcast_loopback,
+ enum mlx4_protocol protocol, u64 *reg_id);
+int mlx4_multicast_detach(struct mlx4_dev *dev, struct mlx4_qp *qp, u8 gid[16],
+ enum mlx4_protocol protocol, u64 reg_id);
+
+enum {
+ MLX4_DOMAIN_UVERBS = 0x1000,
+ MLX4_DOMAIN_ETHTOOL = 0x2000,
+ MLX4_DOMAIN_RFS = 0x3000,
+ MLX4_DOMAIN_NIC = 0x5000,
+};
+
+enum mlx4_net_trans_rule_id {
+ MLX4_NET_TRANS_RULE_ID_ETH = 0,
+ MLX4_NET_TRANS_RULE_ID_IB,
+ MLX4_NET_TRANS_RULE_ID_IPV6,
+ MLX4_NET_TRANS_RULE_ID_IPV4,
+ MLX4_NET_TRANS_RULE_ID_TCP,
+ MLX4_NET_TRANS_RULE_ID_UDP,
+ MLX4_NET_TRANS_RULE_ID_VXLAN,
+ MLX4_NET_TRANS_RULE_NUM, /* should be last */
+};
+
+extern const u16 __sw_id_hw[];
+
+static inline int map_hw_to_sw_id(u16 header_id)
+{
+
+ int i;
+ for (i = 0; i < MLX4_NET_TRANS_RULE_NUM; i++) {
+ if (header_id == __sw_id_hw[i])
+ return i;
+ }
+ return -EINVAL;
+}
+
+enum mlx4_net_trans_promisc_mode {
+ MLX4_FS_REGULAR = 1,
+ MLX4_FS_ALL_DEFAULT,
+ MLX4_FS_MC_DEFAULT,
+ MLX4_FS_UC_SNIFFER,
+ MLX4_FS_MC_SNIFFER,
+ MLX4_FS_MODE_NUM, /* should be last */
+};
+
+struct mlx4_spec_eth {
+ u8 dst_mac[ETH_ALEN];
+ u8 dst_mac_msk[ETH_ALEN];
+ u8 src_mac[ETH_ALEN];
+ u8 src_mac_msk[ETH_ALEN];
+ u8 ether_type_enable;
+ __be16 ether_type;
+ __be16 vlan_id_msk;
+ __be16 vlan_id;
+};
+
+struct mlx4_spec_tcp_udp {
+ __be16 dst_port;
+ __be16 dst_port_msk;
+ __be16 src_port;
+ __be16 src_port_msk;
+};
+
+struct mlx4_spec_ipv4 {
+ __be32 dst_ip;
+ __be32 dst_ip_msk;
+ __be32 src_ip;
+ __be32 src_ip_msk;
+};
+
+struct mlx4_spec_ib {
+ __be32 l3_qpn;
+ __be32 qpn_msk;
+ enum mlx4_flow_roce_type roce_type;
+ u8 dst_gid[16];
+ u8 dst_gid_msk[16];
+};
+
+struct mlx4_spec_vxlan {
+ __be32 vni;
+ __be32 vni_mask;
+
+};
+
+struct mlx4_spec_list {
+ struct list_head list;
+ enum mlx4_net_trans_rule_id id;
+ union {
+ struct mlx4_spec_eth eth;
+ struct mlx4_spec_ib ib;
+ struct mlx4_spec_ipv4 ipv4;
+ struct mlx4_spec_tcp_udp tcp_udp;
+ struct mlx4_spec_vxlan vxlan;
+ };
+};
+
+enum mlx4_net_trans_hw_rule_queue {
+ MLX4_NET_TRANS_Q_FIFO,
+ MLX4_NET_TRANS_Q_LIFO,
+};
+
+struct mlx4_net_trans_rule {
+ struct list_head list;
+ enum mlx4_net_trans_hw_rule_queue queue_mode;
+ bool exclusive;
+ bool allow_loopback;
+ enum mlx4_net_trans_promisc_mode promisc_mode;
+ u8 port;
+ u16 priority;
+ u32 qpn;
+};
+
+struct mlx4_net_trans_rule_hw_ctrl {
+ __be16 prio;
+ u8 type;
+ u8 flags;
+ u8 rsvd1;
+ u8 funcid;
+ u8 vep;
+ u8 port;
+ __be32 qpn;
+ __be32 rsvd2;
+};
+
+struct mlx4_net_trans_rule_hw_ib {
+ u8 size;
+ u8 rsvd1;
+ __be16 id;
+ u32 rsvd2;
+ __be32 l3_qpn;
+ __be32 qpn_mask;
+ u8 dst_gid[16];
+ u8 dst_gid_msk[16];
+} __packed;
+
+struct mlx4_net_trans_rule_hw_eth {
+ u8 size;
+ u8 rsvd;
+ __be16 id;
+ u8 rsvd1[6];
+ u8 dst_mac[6];
+ u16 rsvd2;
+ u8 dst_mac_msk[6];
+ u16 rsvd3;
+ u8 src_mac[6];
+ u16 rsvd4;
+ u8 src_mac_msk[6];
+ u8 rsvd5;
+ u8 ether_type_enable;
+ __be16 ether_type;
+ __be16 vlan_tag_msk;
+ __be16 vlan_tag;
+} __packed;
+
+struct mlx4_net_trans_rule_hw_tcp_udp {
+ u8 size;
+ u8 rsvd;
+ __be16 id;
+ __be16 rsvd1[3];
+ __be16 dst_port;
+ __be16 rsvd2;
+ __be16 dst_port_msk;
+ __be16 rsvd3;
+ __be16 src_port;
+ __be16 rsvd4;
+ __be16 src_port_msk;
+} __packed;
+
+struct mlx4_net_trans_rule_hw_ipv4 {
+ u8 size;
+ u8 rsvd;
+ __be16 id;
+ __be32 rsvd1;
+ __be32 dst_ip;
+ __be32 dst_ip_msk;
+ __be32 src_ip;
+ __be32 src_ip_msk;
+} __packed;
+
+struct mlx4_net_trans_rule_hw_vxlan {
+ u8 size;
+ u8 rsvd;
+ __be16 id;
+ __be32 rsvd1;
+ __be32 vni;
+ __be32 vni_mask;
+} __packed;
+
+struct _rule_hw {
+ union {
+ struct {
+ u8 size;
+ u8 rsvd;
+ __be16 id;
+ };
+ struct mlx4_net_trans_rule_hw_eth eth;
+ struct mlx4_net_trans_rule_hw_ib ib;
+ struct mlx4_net_trans_rule_hw_ipv4 ipv4;
+ struct mlx4_net_trans_rule_hw_tcp_udp tcp_udp;
+ struct mlx4_net_trans_rule_hw_vxlan vxlan;
+ };
+};
+
+enum {
+ VXLAN_STEER_BY_OUTER_MAC = 1 << 0,
+ VXLAN_STEER_BY_OUTER_VLAN = 1 << 1,
+ VXLAN_STEER_BY_VSID_VNI = 1 << 2,
+ VXLAN_STEER_BY_INNER_MAC = 1 << 3,
+ VXLAN_STEER_BY_INNER_VLAN = 1 << 4,
+};
+
+enum {
+ MLX4_EQ_ID_EN,
+ MLX4_EQ_ID_IB
+};
+
+#define MLX4_EQ_ID_TO_UUID(id, port, n) (((unsigned)(id)) << 31 | (port) << 24 | (n))
+#define MLX4_EQ_UUID_TO_ID(uuid) ((uuid) >> 31)
+#define MLX4_NET_TRANS_PROMISC_MODE_OFFSET MLX4_FS_REGULAR
+
+
+int mlx4_flow_steer_promisc_add(struct mlx4_dev *dev, u8 port, u32 qpn,
+ enum mlx4_net_trans_promisc_mode mode);
+int mlx4_flow_steer_promisc_remove(struct mlx4_dev *dev, u8 port,
+ enum mlx4_net_trans_promisc_mode mode);
+int mlx4_multicast_promisc_add(struct mlx4_dev *dev, u32 qpn, u8 port);
+int mlx4_multicast_promisc_remove(struct mlx4_dev *dev, u32 qpn, u8 port);
+int mlx4_unicast_promisc_add(struct mlx4_dev *dev, u32 qpn, u8 port);
+int mlx4_unicast_promisc_remove(struct mlx4_dev *dev, u32 qpn, u8 port);
+int mlx4_SET_MCAST_FLTR(struct mlx4_dev *dev, u8 port, u64 mac, u64 clear, u8 mode);
+
+int mlx4_register_mac(struct mlx4_dev *dev, u8 port, u64 mac);
+void mlx4_unregister_mac(struct mlx4_dev *dev, u8 port, u64 mac);
+int mlx4_get_base_qpn(struct mlx4_dev *dev, u8 port);
+int __mlx4_replace_mac(struct mlx4_dev *dev, u8 port, int qpn, u64 new_mac);
+int mlx4_SET_PORT_general(struct mlx4_dev *dev, u8 port, int mtu,
+ u8 pptx, u8 pfctx, u8 pprx, u8 pfcrx);
+int mlx4_SET_PORT_qpn_calc(struct mlx4_dev *dev, u8 port, u32 base_qpn,
+ u8 promisc);
+int mlx4_SET_PORT_BEACON(struct mlx4_dev *dev, u8 port, u16 time);
+int mlx4_SET_PORT_fcs_check(struct mlx4_dev *dev, u8 port,
+ u8 ignore_fcs_value);
+int mlx4_SET_PORT_VXLAN(struct mlx4_dev *dev, u8 port, u8 steering, int enable);
+int mlx4_find_cached_mac(struct mlx4_dev *dev, u8 port, u64 mac, int *idx);
+int mlx4_find_cached_vlan(struct mlx4_dev *dev, u8 port, u16 vid, int *idx);
+int mlx4_register_vlan(struct mlx4_dev *dev, u8 port, u16 vlan, int *index);
+void mlx4_unregister_vlan(struct mlx4_dev *dev, u8 port, u16 vlan);
+
+int mlx4_map_phys_fmr(struct mlx4_dev *dev, struct mlx4_fmr *fmr, u64 *page_list,
+ int npages, u64 iova, u32 *lkey, u32 *rkey);
+int mlx4_fmr_alloc(struct mlx4_dev *dev, u32 pd, u32 access, int max_pages,
+ int max_maps, u8 page_shift, struct mlx4_fmr *fmr);
+int mlx4_fmr_enable(struct mlx4_dev *dev, struct mlx4_fmr *fmr);
+void mlx4_fmr_unmap(struct mlx4_dev *dev, struct mlx4_fmr *fmr,
+ u32 *lkey, u32 *rkey);
+int mlx4_fmr_free(struct mlx4_dev *dev, struct mlx4_fmr *fmr);
+int mlx4_SYNC_TPT(struct mlx4_dev *dev);
+int mlx4_query_diag_counters(struct mlx4_dev *mlx4_dev, int array_length,
+ u8 op_modifier, u32 in_offset[],
+ u32 counter_out[]);
+
+int mlx4_test_interrupts(struct mlx4_dev *dev);
+u32 mlx4_get_eqs_per_port(struct mlx4_dev *dev, u8 port);
+bool mlx4_is_eq_vector_valid(struct mlx4_dev *dev, u8 port, int vector);
+struct cpu_rmap *mlx4_get_cpu_rmap(struct mlx4_dev *dev, int port);
+int mlx4_assign_eq(struct mlx4_dev *dev, u8 port, u32 consumer_uuid,
+ void (*cb)(unsigned vector, u32 uuid, void *data),
+ void *notifier_data, int *vector);
+void mlx4_release_eq(struct mlx4_dev *dev, int uuid, int vec);
+__printf(5, 6) int mlx4_rename_eq(struct mlx4_dev *dev, int port, int vector,
+ u8 priority, const char namefmt[], ...);
+
+int mlx4_is_eq_shared(struct mlx4_dev *dev, int vector);
+int mlx4_eq_get_irq(struct mlx4_dev *dev, int vec);
+
+int mlx4_get_phys_port_id(struct mlx4_dev *dev);
+int mlx4_wol_read(struct mlx4_dev *dev, u64 *config, int port);
+int mlx4_wol_write(struct mlx4_dev *dev, u64 config, int port);
+
+int mlx4_counter_alloc(struct mlx4_dev *dev, u8 port, u32 *idx);
+void mlx4_counter_free(struct mlx4_dev *dev, u8 port, u32 idx);
+
+void mlx4_set_admin_guid(struct mlx4_dev *dev, __be64 guid, int entry,
+ int port);
+__be64 mlx4_get_admin_guid(struct mlx4_dev *dev, int entry, int port);
+void mlx4_set_random_admin_guid(struct mlx4_dev *dev, int entry, int port);
+int mlx4_flow_attach(struct mlx4_dev *dev,
+ struct mlx4_net_trans_rule *rule, u64 *reg_id);
+int mlx4_flow_detach(struct mlx4_dev *dev, u64 reg_id);
+int mlx4_map_sw_to_hw_steering_mode(struct mlx4_dev *dev,
+ enum mlx4_net_trans_promisc_mode flow_type);
+int mlx4_map_hw_to_sw_steering_mode(struct mlx4_dev *dev, u8 flow_type);
+int mlx4_map_sw_to_hw_steering_id(struct mlx4_dev *dev,
+ enum mlx4_net_trans_rule_id id);
+int mlx4_hw_rule_sz(struct mlx4_dev *dev, enum mlx4_net_trans_rule_id id);
+
+int mlx4_tunnel_steer_add(struct mlx4_dev *dev, unsigned char *addr,
+ int port, int qpn, u16 prio, u64 *reg_id);
+
+void mlx4_sync_pkey_table(struct mlx4_dev *dev, int slave, int port,
+ int i, int val);
+
+int mlx4_get_parav_qkey(struct mlx4_dev *dev, u32 qpn, u32 *qkey);
+
+int mlx4_is_slave_active(struct mlx4_dev *dev, int slave);
+int mlx4_gen_pkey_eqe(struct mlx4_dev *dev, int slave, u8 port);
+int mlx4_gen_guid_change_eqe(struct mlx4_dev *dev, int slave, u8 port);
+int mlx4_gen_slaves_port_mgt_ev(struct mlx4_dev *dev, u8 port, int attr, u16 lid, u8 sl);
+int mlx4_gen_port_state_change_eqe(struct mlx4_dev *dev, int slave, u8 port, u8 port_subtype_change);
+enum slave_port_state mlx4_get_slave_port_state(struct mlx4_dev *dev, int slave, u8 port);
+int set_and_calc_slave_port_state(struct mlx4_dev *dev, int slave, u8 port, int event, enum slave_port_gen_event *gen_event);
+
+void mlx4_put_slave_node_guid(struct mlx4_dev *dev, int slave, __be64 guid);
+__be64 mlx4_get_slave_node_guid(struct mlx4_dev *dev, int slave);
+
+int mlx4_get_slave_from_roce_gid(struct mlx4_dev *dev, int port, u8 *gid,
+ int *slave_id);
+int mlx4_get_roce_gid_from_slave(struct mlx4_dev *dev, int port, int slave_id,
+ u8 *gid, enum mlx4_roce_gid_type *gid_type);
+
+int mlx4_FLOW_STEERING_IB_UC_QP_RANGE(struct mlx4_dev *dev, u32 min_range_qpn,
+ u32 max_range_qpn);
+
+uint64_t mlx4_read_clock(struct mlx4_dev *dev);
+int mlx4_get_internal_clock_params(struct mlx4_dev *dev,
+ struct mlx4_clock_params *params);
+
+struct mlx4_active_ports {
+ DECLARE_BITMAP(ports, MLX4_MAX_PORTS);
+};
+/* Returns a bitmap of the physical ports which are assigned to slave */
+struct mlx4_active_ports mlx4_get_active_ports(struct mlx4_dev *dev, int slave);
+
+/* Returns the physical port that represents the virtual port of the slave, */
+/* or a value < 0 in case of an error. If a slave has 2 ports, the identity */
+/* mapping is returned. */
+int mlx4_slave_convert_port(struct mlx4_dev *dev, int slave, int port);
+
+struct mlx4_slaves_pport {
+ DECLARE_BITMAP(slaves, MLX4_MFUNC_MAX);
+};
+/* Returns a bitmap of all slaves that are assigned to port. */
+struct mlx4_slaves_pport mlx4_phys_to_slaves_pport(struct mlx4_dev *dev,
+ int port);
+
+/* Returns a bitmap of all slaves that are assigned exactly to all the */
+/* ports that are set in crit_ports. */
+struct mlx4_slaves_pport mlx4_phys_to_slaves_pport_actv(
+ struct mlx4_dev *dev,
+ const struct mlx4_active_ports *crit_ports);
+
+/* Returns the slave's virtual port that represents the physical port. */
+int mlx4_phys_to_slave_port(struct mlx4_dev *dev, int slave, int port);
+
+int mlx4_get_base_gid_ix(struct mlx4_dev *dev, int slave, int port);
+
+int mlx4_config_vxlan_port(struct mlx4_dev *dev, __be16 udp_port);
+int mlx4_disable_rx_port_check(struct mlx4_dev *dev, bool dis);
+int mlx4_config_roce_v2_port(struct mlx4_dev *dev, u16 udp_port);
+int mlx4_virt2phy_port_map(struct mlx4_dev *dev, u32 port1, u32 port2);
+int mlx4_vf_smi_enabled(struct mlx4_dev *dev, int slave, int port);
+int mlx4_vf_get_enable_smi_admin(struct mlx4_dev *dev, int slave, int port);
+int mlx4_vf_set_enable_smi_admin(struct mlx4_dev *dev, int slave, int port,
+ int enable);
+int mlx4_mr_hw_get_mpt(struct mlx4_dev *dev, struct mlx4_mr *mmr,
+ struct mlx4_mpt_entry ***mpt_entry);
+int mlx4_mr_hw_write_mpt(struct mlx4_dev *dev, struct mlx4_mr *mmr,
+ struct mlx4_mpt_entry **mpt_entry);
+int mlx4_mr_hw_change_pd(struct mlx4_dev *dev, struct mlx4_mpt_entry *mpt_entry,
+ u32 pdn);
+int mlx4_mr_hw_change_access(struct mlx4_dev *dev,
+ struct mlx4_mpt_entry *mpt_entry,
+ u32 access);
+void mlx4_mr_hw_put_mpt(struct mlx4_dev *dev,
+ struct mlx4_mpt_entry **mpt_entry);
+void mlx4_mr_rereg_mem_cleanup(struct mlx4_dev *dev, struct mlx4_mr *mr);
+int mlx4_mr_rereg_mem_write(struct mlx4_dev *dev, struct mlx4_mr *mr,
+ u64 iova, u64 size, int npages,
+ int page_shift, struct mlx4_mpt_entry *mpt_entry);
+
+int mlx4_get_module_info(struct mlx4_dev *dev, u8 port,
+ u16 offset, u16 size, u8 *data);
+
+/* Returns true if running in low memory profile (kdump kernel) */
+#ifdef KMOD_MODIFIED
+static inline bool mlx4_low_memory_profile(void)
+{
+ /* Kernel original: return is_kdump_kernel(); (memory save mode) */
+ return 0; /* the low memory profile is never selected here */
+}
+#endif
+
+/* ACCESS REG commands */
+enum mlx4_access_reg_method {
+ MLX4_ACCESS_REG_QUERY = 0x1,
+ MLX4_ACCESS_REG_WRITE = 0x2,
+};
+
+/* ACCESS PTYS Reg command */
+enum mlx4_ptys_proto {
+ MLX4_PTYS_IB = 1<<0,
+ MLX4_PTYS_EN = 1<<2,
+};
+
+struct mlx4_ptys_reg {
+ u8 resrvd1;
+ u8 local_port;
+ u8 resrvd2;
+ u8 proto_mask;
+ __be32 resrvd3[2];
+ __be32 eth_proto_cap;
+ __be16 ib_width_cap;
+ __be16 ib_speed_cap;
+ __be32 resrvd4;
+ __be32 eth_proto_admin;
+ __be16 ib_width_admin;
+ __be16 ib_speed_admin;
+ __be32 resrvd5;
+ __be32 eth_proto_oper;
+ __be16 ib_width_oper;
+ __be16 ib_speed_oper;
+ __be32 resrvd6;
+ __be32 eth_proto_lp_adv;
+} __packed;
+
+int mlx4_ACCESS_PTYS_REG(struct mlx4_dev *dev,
+ enum mlx4_access_reg_method method,
+ struct mlx4_ptys_reg *ptys_reg);
+
+struct mlx4_roce_addr {
+ u8 gid[MLX4_GID_LEN];
+ enum mlx4_roce_gid_type type;
+};
+
+struct mlx4_roce_addr_table {
+ struct mlx4_roce_addr addr[MLX4_ROCE_MAX_GIDS];
+};
+
+int mlx4_update_roce_addr_table(struct mlx4_dev *dev, u8 port_num,
+ struct mlx4_roce_addr_table *table,
+ int native_or_wrapped);
+
+#endif /* MLX4_DEVICE_H */
+
+#include "post_kmod.h"
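Note on the EQ UUID macros in device.h above: MLX4_EQ_ID_TO_UUID() packs an owner id, a port and an index into a single 32-bit value, and MLX4_EQ_UUID_TO_ID() recovers the owner id from bit 31. The sketch below restates that packing as inline functions so the bit layout is easier to see. The helper names are hypothetical and not part of the patch, and the port extractor assumes that the port fits in bits 24..30 and the index in bits 0..23, which the header itself does not state.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of the patch: function forms of
 * MLX4_EQ_ID_TO_UUID / MLX4_EQ_UUID_TO_ID from device.h. */
static inline uint32_t eq_uuid_pack(uint32_t id, uint32_t port, uint32_t n)
{
	/* bit 31: EQ owner id, bits 24 and up: port, low bits: n */
	return (id << 31) | (port << 24) | n;
}

static inline uint32_t eq_uuid_to_id(uint32_t uuid)
{
	return uuid >> 31;
}

/* Assumption: port occupies bits 24..30 and n fits in bits 0..23;
 * the header defines no such extractor. */
static inline uint32_t eq_uuid_to_port(uint32_t uuid)
{
	return (uuid >> 24) & 0x7f;
}
```

With MLX4_EQ_ID_IB (value 1), port 2 and index 5, eq_uuid_pack(1, 2, 5) yields 0x82000005. Using unsigned types throughout avoids shifting a signed int into the sign bit.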
diff --git a/drivers/net/mlnx_uio/mlnx/include/mlx4/doorbell.h b/drivers/net/mlnx_uio/mlnx/include/mlx4/doorbell.h
new file mode 100644
index 0000000..8b18449
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/include/mlx4/doorbell.h
@@ -0,0 +1,90 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2004 Topspin Communications. All rights reserved.
+ * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved.
+ * Copyright (c) 2005 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef MLX4_DOORBELL_H
+#define MLX4_DOORBELL_H
+
+
+#define MLX4_SEND_DOORBELL 0x14
+#define MLX4_CQ_DOORBELL 0x20
+
+#if BITS_PER_LONG == 64
+/*
+ * Assume that we can just write a 64-bit doorbell atomically. s390
+ * actually doesn't have writeq() but S/390 systems don't even have
+ * PCI so we won't worry about it.
+ */
+
+#define MLX4_DECLARE_DOORBELL_LOCK(name)
+#define MLX4_INIT_DOORBELL_LOCK(ptr) do { } while (0)
+#define MLX4_GET_DOORBELL_LOCK(ptr) (NULL)
+
+static inline void mlx4_write64(__be32 val[2], void __iomem *dest,
+ spinlock_t *doorbell_lock)
+{
+ __raw_writeq(*(u64 *) val, dest);
+}
+
+#else
+
+/*
+ * Just fall back to a spinlock to protect the doorbell if
+ * BITS_PER_LONG is 32 -- there's no portable way to do atomic 64-bit
+ * MMIO writes.
+ */
+
+#define MLX4_DECLARE_DOORBELL_LOCK(name) spinlock_t name;
+#define MLX4_INIT_DOORBELL_LOCK(ptr) spin_lock_init(ptr)
+#define MLX4_GET_DOORBELL_LOCK(ptr) (ptr)
+
+static inline void mlx4_write64(__be32 val[2], void __iomem *dest,
+ spinlock_t *doorbell_lock)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(doorbell_lock, flags);
+ __raw_writel((__force u32) val[0], dest);
+ __raw_writel((__force u32) val[1], dest + 4);
+ spin_unlock_irqrestore(doorbell_lock, flags);
+}
+
+#endif
+
+#endif /* MLX4_DOORBELL_H */
+
+#include "post_kmod.h"
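Note on the 64-bit branch of mlx4_write64() in doorbell.h above: it reads the two __be32 words through a u64 pointer cast before handing them to __raw_writeq(). In a userspace build, where strict-aliasing optimizations are commonly enabled, memcpy() is one way to do the same reinterpretation without the cast. The sketch below is only an illustration under that assumption. The function name is hypothetical, dest is assumed to be a plain volatile pointer into a mapped region, and whatever kmod.h supplies for __raw_writeq() is not visible in this part of the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the 64-bit mlx4_write64() path: combine the
 * two 32-bit doorbell words and issue them as a single 64-bit store. */
static inline void doorbell_write64(const uint32_t val[2],
				    volatile uint64_t *dest)
{
	uint64_t v;

	/* Byte-wise copy keeps the in-memory order of val[0], val[1]
	 * and avoids dereferencing a uint32_t array as uint64_t. */
	memcpy(&v, val, sizeof(v));
	*dest = v;
}
```

Because the bytes are copied in memory order, the words reach the destination exactly as the caller laid them out, whatever the host endianness.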
diff --git a/drivers/net/mlnx_uio/mlnx/include/mlx4/driver.h b/drivers/net/mlnx_uio/mlnx/include/mlx4/driver.h
new file mode 100644
index 0000000..aac5155
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/include/mlx4/driver.h
@@ -0,0 +1,175 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2006 Cisco Systems, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef MLX4_DRIVER_H
+#define MLX4_DRIVER_H
+
+
+struct mlx4_dev;
+
+#define MLX4_MAC_MASK 0xffffffffffffULL
+
+enum mlx4_dev_event {
+ MLX4_DEV_EVENT_CATASTROPHIC_ERROR,
+ MLX4_DEV_EVENT_PORT_UP,
+ MLX4_DEV_EVENT_PORT_DOWN,
+ MLX4_DEV_EVENT_PORT_REINIT,
+ MLX4_DEV_EVENT_PORT_MGMT_CHANGE,
+ MLX4_DEV_EVENT_SLAVE_INIT,
+ MLX4_DEV_EVENT_SLAVE_SHUTDOWN,
+};
+
+enum {
+ MLX4_INTFF_BONDING = 1 << 0
+};
+
+struct mlx4_interface {
+ void * (*add) (struct mlx4_dev *dev);
+ void (*remove)(struct mlx4_dev *dev, void *context);
+ void (*event) (struct mlx4_dev *dev, void *context,
+ enum mlx4_dev_event event, unsigned long param);
+ void * (*get_dev)(struct mlx4_dev *dev, void *context, u8 port);
+ void (*activate)(struct mlx4_dev *dev, void *context);
+ struct list_head list;
+ enum mlx4_protocol protocol;
+ int flags;
+};
+
+enum {
+ MLX4_MAX_DEVICES = 32,
+ MLX4_DEVS_TBL_SIZE = MLX4_MAX_DEVICES + 1,
+ MLX4_DBDF2VAL_STR_SIZE = 512,
+ MLX4_STR_NAME_SIZE = 64,
+ MLX4_MAX_BDF_VALS = 3,
+ MLX4_ENDOF_TBL = -1LL
+};
+
+struct mlx4_dbdf2val {
+ u64 dbdf;
+ int argc;
+ int val[MLX4_MAX_BDF_VALS];
+};
+
+struct mlx4_range {
+ int min;
+ int max;
+};
+
+/*
+ * mlx4_dbdf2val_lst struct holds all the data needed to convert
+ * dbdf-to-value-list string into dbdf-to-value table.
+ * dbdf-to-value-list string is a comma separated list of dbdf-to-value strings.
+ * the format of dbdf-to-value string is: "[mmmm:]bb:dd.f-v1[;v2]"
+ * mmmm - Domain number (optional)
+ * bb - Bus number
+ * dd - Device number
+ * f - Function number
+ * v1 - First value related to the domain-bus-device-function.
+ * v2 - Second value related to the domain-bus-device-function (optional).
+ * bb, dd - Two hexadecimal digits without preceding 0x.
+ * mmmm - Four hexadecimal digits without preceding 0x.
+ * f - One hexadecimal digit without preceding 0x.
+ * v1,v2 - Number with normal convention (e.g. 100, 0xd3).
+ * dbdf-to-value-list string format:
+ * "[mmmm:]bb:dd.f-v1[;v2],[mmmm:]bb:dd.f-v1[;v2],..."
+ *
+ */
+struct mlx4_dbdf2val_lst {
+ char name[MLX4_STR_NAME_SIZE]; /* String name */
+ char str[MLX4_DBDF2VAL_STR_SIZE]; /* dbdf2val list str */
+ struct mlx4_dbdf2val tbl[MLX4_DEVS_TBL_SIZE];/* dbdf to value table */
+ int num_vals; /* # of vals per dbdf */
+ int def_val[MLX4_MAX_BDF_VALS]; /* Default values */
+ struct mlx4_range range; /* Valid values range */
+ int num_inval_vals; /* # of values in middle of range
+ * which are invalid */
+ int inval_val[MLX4_MAX_BDF_VALS]; /* invalid values table */
+};
+
+int mlx4_fill_dbdf2val_tbl(struct mlx4_dbdf2val_lst *dbdf2val_lst);
+#ifdef KMOD_MODIFIED
+int mlx4_get_val(struct mlx4_dbdf2val *tbl, struct rte_pci_device *pdev, int idx,
+ int *val);
+#endif
+
+int mlx4_register_interface(struct mlx4_interface *intf);
+void mlx4_unregister_interface(struct mlx4_interface *intf);
+
+int mlx4_bond(struct mlx4_dev *dev);
+int mlx4_unbond(struct mlx4_dev *dev);
+static inline int mlx4_is_bonded(struct mlx4_dev *dev)
+{
+ return !!(dev->flags & MLX4_FLAG_BONDED);
+}
+
+struct mlx4_port_map {
+ u8 port1;
+ u8 port2;
+};
+
+int mlx4_port_map_set(struct mlx4_dev *dev, struct mlx4_port_map *v2p);
+int mlx4_port_map_get(struct mlx4_dev *dev, u8 vport, u8 *pport);
+
+void *mlx4_get_protocol_dev(struct mlx4_dev *dev, enum mlx4_protocol proto, int port);
+
+static inline u64 mlx4_mac_to_u64(u8 *addr)
+{
+ u64 mac = 0;
+ int i;
+
+ for (i = 0; i < ETH_ALEN; i++) {
+ mac <<= 8;
+ mac |= addr[i];
+ }
+ return mac;
+}
+
+static inline int mlx4_is_little_endian(void)
+{
+#if defined(__LITTLE_ENDIAN)
+ return 1;
+#elif defined(__BIG_ENDIAN)
+ return 0;
+#else
+#error Host endianness not defined
+#endif
+}
+
+int mlx4_choose_vector(struct mlx4_dev *dev, int vector, int num_comp);
+void mlx4_release_vector(struct mlx4_dev *dev, int vector);
+#endif /* MLX4_DRIVER_H */
+
+#include "post_kmod.h"
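Note on the dbdf-to-value string format documented in driver.h above ("[mmmm:]bb:dd.f-v1[;v2]"): a single entry can be parsed with sscanf() as sketched below. This is not the parser behind mlx4_fill_dbdf2val_tbl(), which is not shown in this part of the patch. The struct and function names are hypothetical, and the sketch handles one entry only, not the comma separated list.

```c
#include <stdio.h>

/* Hypothetical result type, not the patch's struct mlx4_dbdf2val. */
struct dbdf_entry {
	unsigned int domain, bus, dev, fn;
	int val[2];
	int nvals;
};

/* Parse one "[mmmm:]bb:dd.f-v1[;v2]" string.
 * Returns 0 on success, -1 on a malformed entry. */
static int parse_dbdf(const char *s, struct dbdf_entry *e)
{
	int n = 0;

	/* Try the long form with a domain first; %n records the offset
	 * of the first value and stays 0 if the '-' is missing. */
	if (sscanf(s, "%4x:%2x:%2x.%1x-%n",
		   &e->domain, &e->bus, &e->dev, &e->fn, &n) != 4 || n == 0) {
		/* Fall back to the short form without a domain. */
		e->domain = 0;
		n = 0;
		if (sscanf(s, "%2x:%2x.%1x-%n",
			   &e->bus, &e->dev, &e->fn, &n) != 3 || n == 0)
			return -1;
	}

	/* %i accepts decimal and 0x-prefixed hex values; note that it
	 * also reads a leading 0 as octal. */
	e->nvals = sscanf(s + n, "%i;%i", &e->val[0], &e->val[1]);
	return (e->nvals >= 1) ? 0 : -1;
}
```

For example, "0000:04:00.0-5;7" gives bus 4, device 0, function 0 and the two values 5 and 7, while "81:1f.3-0xd3" gives bus 0x81, device 0x1f, function 3 and the single value 211. Trailing characters after the last value are not validated in this sketch.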
diff --git a/drivers/net/mlnx_uio/mlnx/include/mlx4/qp.h b/drivers/net/mlnx_uio/mlnx/include/mlx4/qp.h
new file mode 100644
index 0000000..91503a5
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/include/mlx4/qp.h
@@ -0,0 +1,540 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2007 Cisco Systems, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef MLX4_QP_H
+#define MLX4_QP_H
+
+#include "radix-tree.h"
+#include "mlx4/device.h"
+
+#define MLX4_INVALID_LKEY 0x100
+#define DS_SIZE_ALIGNMENT 16
+
+/* When using these macros we must use another name for "owner_bit" */
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ #define WQE_FORMAT_1_MASK cpu_to_be32(0xbfffffff)
+ #define SET_BYTE_COUNT(byte_count) (cpu_to_be32(byte_count) | owner_bit)
+ #define SET_LSO_MSS(mss_hdr_size) (cpu_to_be32(mss_hdr_size) | owner_bit)
+
+ /* The +8 is for mss_header and inline header */
+ #define GET_LSO_SEG_SIZE(lso_header_size) \
+ ((lso_header_size > 60) ? \
+ ALIGN(lso_header_size + 8, DS_SIZE_ALIGNMENT) : \
+ ALIGN(lso_header_size + 4, DS_SIZE_ALIGNMENT))
+
+ #define DS_BYTE_COUNT_MASK cpu_to_be32(0x3fffffff)
+
+#else
+ #define SET_BYTE_COUNT(byte_count) cpu_to_be32(byte_count)
+ #define SET_LSO_MSS(mss_hdr_size) cpu_to_be32(mss_hdr_size)
+ #define GET_LSO_SEG_SIZE(lso_header_size) \
+ ALIGN(lso_header_size + 4, DS_SIZE_ALIGNMENT)
+ #define DS_BYTE_COUNT_MASK cpu_to_be32(0x7fffffff)
+#endif
+
+enum mlx4_qp_optpar {
+ MLX4_QP_OPTPAR_ALT_ADDR_PATH = 1 << 0,
+ MLX4_QP_OPTPAR_RRE = 1 << 1,
+ MLX4_QP_OPTPAR_RAE = 1 << 2,
+ MLX4_QP_OPTPAR_RWE = 1 << 3,
+ MLX4_QP_OPTPAR_PKEY_INDEX = 1 << 4,
+ MLX4_QP_OPTPAR_Q_KEY = 1 << 5,
+ MLX4_QP_OPTPAR_RNR_TIMEOUT = 1 << 6,
+ MLX4_QP_OPTPAR_PRIMARY_ADDR_PATH = 1 << 7,
+ MLX4_QP_OPTPAR_SRA_MAX = 1 << 8,
+ MLX4_QP_OPTPAR_RRA_MAX = 1 << 9,
+ MLX4_QP_OPTPAR_PM_STATE = 1 << 10,
+ MLX4_QP_OPTPAR_RETRY_COUNT = 1 << 12,
+ MLX4_QP_OPTPAR_RNR_RETRY = 1 << 13,
+ MLX4_QP_OPTPAR_ACK_TIMEOUT = 1 << 14,
+ MLX4_QP_OPTPAR_SCHED_QUEUE = 1 << 16,
+ MLX4_QP_OPTPAR_COUNTER_INDEX = 1 << 20,
+ MLX4_QP_OPTPAR_VLAN_STRIPPING = 1 << 21,
+};
+
+enum mlx4_qp_state {
+ MLX4_QP_STATE_RST = 0,
+ MLX4_QP_STATE_INIT = 1,
+ MLX4_QP_STATE_RTR = 2,
+ MLX4_QP_STATE_RTS = 3,
+ MLX4_QP_STATE_SQER = 4,
+ MLX4_QP_STATE_SQD = 5,
+ MLX4_QP_STATE_ERR = 6,
+ MLX4_QP_STATE_SQ_DRAINING = 7,
+ MLX4_QP_NUM_STATE
+};
+
+enum {
+ MLX4_QP_ST_RC = 0x0,
+ MLX4_QP_ST_UC = 0x1,
+ MLX4_QP_ST_RD = 0x2,
+ MLX4_QP_ST_UD = 0x3,
+ MLX4_QP_ST_XRC = 0x6,
+ MLX4_QP_ST_MLX = 0x7
+};
+
+enum {
+ MLX4_QP_PM_MIGRATED = 0x3,
+ MLX4_QP_PM_ARMED = 0x0,
+ MLX4_QP_PM_REARM = 0x1
+};
+
+enum {
+ /* params1 */
+ MLX4_QP_BIT_SRE = 1 << 15,
+ MLX4_QP_BIT_SWE = 1 << 14,
+ MLX4_QP_BIT_SAE = 1 << 13,
+ /* params2 */
+ MLX4_QP_BIT_RRE = 1 << 15,
+ MLX4_QP_BIT_RWE = 1 << 14,
+ MLX4_QP_BIT_RAE = 1 << 13,
+ MLX4_QP_BIT_FPP = 1 << 3,
+ MLX4_QP_BIT_RIC = 1 << 4,
+ MLX4_QP_BIT_COLL_SYNC_RQ = 1 << 2,
+ MLX4_QP_BIT_COLL_SYNC_SQ = 1 << 1,
+ MLX4_QP_BIT_COLL_MASTER = 1 << 0
+};
+
+enum {
+ MLX4_RSS_HASH_XOR = 0,
+ MLX4_RSS_HASH_TOP = 1,
+
+ MLX4_RSS_UDP_IPV6 = 1 << 0,
+ MLX4_RSS_UDP_IPV4 = 1 << 1,
+ MLX4_RSS_TCP_IPV6 = 1 << 2,
+ MLX4_RSS_IPV6 = 1 << 3,
+ MLX4_RSS_TCP_IPV4 = 1 << 4,
+ MLX4_RSS_IPV4 = 1 << 5,
+
+ MLX4_RSS_BY_OUTER_HEADERS = 0 << 6,
+ MLX4_RSS_BY_INNER_HEADERS = 2 << 6,
+ MLX4_RSS_BY_INNER_HEADERS_IPONLY = 3 << 6,
+
+ /* offset of mlx4_rss_context within mlx4_qp_context.pri_path */
+ MLX4_RSS_OFFSET_IN_QPC_PRI_PATH = 0x24,
+ /* offset of being RSS indirection QP within mlx4_qp_context.flags */
+ MLX4_RSS_QPC_FLAG_OFFSET = 13,
+};
+
+#ifdef HAVE_NETDEV_RSS_KEY_FILL
+#define MLX4_EN_RSS_KEY_SIZE 40
+#else
+#define MLX4_EN_RSS_KEY_SIZE 10
+#endif
+
+struct mlx4_rss_context {
+ __be32 base_qpn;
+ __be32 default_qpn;
+ u16 reserved;
+ u8 hash_fn;
+ u8 flags;
+#ifdef HAVE_NETDEV_RSS_KEY_FILL
+ __be32 rss_key[MLX4_EN_RSS_KEY_SIZE / sizeof(__be32)];
+#else
+ __be32 rss_key[MLX4_EN_RSS_KEY_SIZE];
+#endif
+ __be32 base_qpn_udp;
+};
+
+struct mlx4_qp_path {
+ u8 fl;
+ union {
+ u8 vlan_control;
+ u8 control;
+ };
+ u8 disable_pkey_check;
+ u8 pkey_index;
+ u8 counter_index;
+ u8 grh_mylmc;
+ __be16 rlid;
+ u8 ackto;
+ u8 mgid_index;
+ u8 static_rate;
+ u8 hop_limit;
+ __be32 tclass_flowlabel;
+ u8 rgid[16];
+ u8 sched_queue;
+ u8 vlan_index;
+ u8 feup;
+ u8 fvl_rx;
+ u8 reserved4[2];
+ u8 dmac[ETH_ALEN];
+};
+
+enum { /* fl */
+ MLX4_FL_CV = 1 << 6,
+ MLX4_FL_ETH_HIDE_CQE_VLAN = 1 << 2,
+ MLX4_FL_ETH_SRC_CHECK_MC_LB = 1 << 1,
+ MLX4_FL_ETH_SRC_CHECK_UC_LB = 1 << 0,
+};
+
+enum { /* control */
+ MLX4_CTRL_ETH_SRC_CHECK_IF_COUNTER = 1 << 7,
+};
+
+enum { /* vlan_control */
+ MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED = 1 << 6,
+ MLX4_VLAN_CTRL_ETH_TX_BLOCK_PRIO_TAGGED = 1 << 5, /* 802.1p priority tag */
+ MLX4_VLAN_CTRL_ETH_TX_BLOCK_UNTAGGED = 1 << 4,
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_TAGGED = 1 << 2,
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_PRIO_TAGGED = 1 << 1, /* 802.1p priority tag */
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_UNTAGGED = 1 << 0
+};
+
+enum { /* feup */
+ MLX4_FEUP_FORCE_ETH_UP = 1 << 6, /* force Eth UP */
+ MLX4_FSM_FORCE_ETH_SRC_MAC = 1 << 5, /* force Source MAC */
+ MLX4_FVL_FORCE_ETH_VLAN = 1 << 3 /* force Eth vlan */
+};
+
+enum { /* fvl_rx */
+ MLX4_FVL_RX_FORCE_ETH_VLAN = 1 << 0 /* enforce Eth rx vlan */
+};
+
+struct mlx4_qp_context {
+ __be32 flags;
+ __be32 pd;
+ u8 mtu_msgmax;
+ u8 rq_size_stride;
+ u8 sq_size_stride;
+ u8 rlkey_roce_mode;
+ __be32 usr_page;
+ __be32 local_qpn;
+ __be32 remote_qpn;
+ struct mlx4_qp_path pri_path;
+ struct mlx4_qp_path alt_path;
+ __be32 params1;
+ u32 reserved1;
+ __be32 next_send_psn;
+ __be32 cqn_send;
+ __be16 roce_entropy;
+ __be16 reserved2[3];
+ __be32 last_acked_psn;
+ __be32 ssn;
+ __be32 params2;
+ __be32 rnr_nextrecvpsn;
+ __be32 xrcd;
+ __be32 cqn_recv;
+ __be64 db_rec_addr;
+ __be32 qkey;
+ __be32 srqn;
+ __be32 msn;
+ __be16 rq_wqe_counter;
+ __be16 sq_wqe_counter;
+ u8 reserved3[7];
+ u8 qos_vport;
+ __be32 param3;
+ __be32 nummmcpeers_basemkey;
+ u8 log_page_size;
+ u8 reserved4[2];
+ u8 mtt_base_addr_h;
+ __be32 mtt_base_addr_l;
+ u32 reserved5[10];
+};
+
+struct mlx4_update_qp_context {
+ __be64 qp_mask;
+ __be64 primary_addr_path_mask;
+ __be64 secondary_addr_path_mask;
+ u64 reserved1;
+ struct mlx4_qp_context qp_context;
+ u64 reserved2[58];
+};
+
+enum {
+ MLX4_UPD_QP_MASK_PM_STATE = 32,
+ MLX4_UPD_QP_MASK_VSD = 33,
+ MLX4_UPD_QP_MASK_QOS_VPP = 34,
+};
+
+enum {
+ MLX4_UPD_QP_PATH_MASK_PKEY_INDEX = 0 + 32,
+ MLX4_UPD_QP_PATH_MASK_FSM = 1 + 32,
+ MLX4_UPD_QP_PATH_MASK_MAC_INDEX = 2 + 32,
+ MLX4_UPD_QP_PATH_MASK_FVL = 3 + 32,
+ MLX4_UPD_QP_PATH_MASK_CV = 4 + 32,
+ MLX4_UPD_QP_PATH_MASK_VLAN_INDEX = 5 + 32,
+ MLX4_UPD_QP_PATH_MASK_ETH_HIDE_CQE_VLAN = 6 + 32,
+ MLX4_UPD_QP_PATH_MASK_ETH_TX_BLOCK_UNTAGGED = 7 + 32,
+ MLX4_UPD_QP_PATH_MASK_ETH_TX_BLOCK_1P = 8 + 32,
+ MLX4_UPD_QP_PATH_MASK_ETH_TX_BLOCK_TAGGED = 9 + 32,
+ MLX4_UPD_QP_PATH_MASK_ETH_RX_BLOCK_UNTAGGED = 10 + 32,
+ MLX4_UPD_QP_PATH_MASK_ETH_RX_BLOCK_1P = 11 + 32,
+ MLX4_UPD_QP_PATH_MASK_ETH_RX_BLOCK_TAGGED = 12 + 32,
+ MLX4_UPD_QP_PATH_MASK_FEUP = 13 + 32,
+ MLX4_UPD_QP_PATH_MASK_SCHED_QUEUE = 14 + 32,
+ MLX4_UPD_QP_PATH_MASK_IF_COUNTER_INDEX = 15 + 32,
+ MLX4_UPD_QP_PATH_MASK_FVL_RX = 16 + 32,
+ MLX4_UPD_QP_PATH_MASK_ETH_SRC_CHECK_UC_LB = 18 + 32,
+ MLX4_UPD_QP_PATH_MASK_ETH_SRC_CHECK_MC_LB = 19 + 32,
+};
+
+enum { /* param3 */
+ MLX4_STRIP_VLAN = 1 << 30
+};
+
+/* Which firmware version adds support for NEC (NoErrorCompletion) bit */
+#define MLX4_FW_VER_WQE_CTRL_NEC mlx4_fw_ver(2, 2, 232)
+
+enum {
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ MLX4_WQE_CTRL_NEC = 1 << 31,
+ MLX4_WQE_CTRL_OWN = 1 << 30,
+ MLX4_WQE_CTRL_RR = 0,
+#else
+ MLX4_WQE_CTRL_OWN = 1 << 31,
+ MLX4_WQE_CTRL_NEC = 1 << 29,
+ MLX4_WQE_CTRL_RR = 1 << 6,
+#endif
+ MLX4_WQE_CTRL_IIP = 1 << 28,
+ MLX4_WQE_CTRL_ILP = 1 << 27,
+ MLX4_WQE_CTRL_FENCE = 1 << 6,
+ MLX4_WQE_CTRL_CQ_UPDATE = 3 << 2,
+ MLX4_WQE_CTRL_SOLICITED = 1 << 1,
+ MLX4_WQE_CTRL_IP_CSUM = 1 << 4,
+ MLX4_WQE_CTRL_TCP_UDP_CSUM = 1 << 5,
+ MLX4_WQE_CTRL_INS_VLAN = 1 << 6,
+ MLX4_WQE_CTRL_STRONG_ORDER = 1 << 7,
+ MLX4_WQE_CTRL_FORCE_LOOPBACK = 1 << 0,
+};
+
+struct mlx4_wqe_ctrl_seg {
+ __be32 owner_opcode;
+ union {
+ struct {
+ __be16 vlan_tag;
+ u8 ins_vlan;
+ u8 fence_size;
+ };
+ __be32 bf_qpn;
+ };
+ /*
+ * High 24 bits are SRC remote buffer; low 8 bits are flags:
+ * [7] SO (strong ordering)
+ * [5] TCP/UDP checksum
+ * [4] IP checksum
+ * [3:2] C (generate completion queue entry)
+ * [1] SE (solicited event)
+ * [0] FL (force loopback)
+ */
+ union {
+ __be32 srcrb_flags;
+ __be16 srcrb_flags16[2];
+ };
+ /*
+ * imm is immediate data for send/RDMA write w/ immediate;
+ * also invalidation key for send with invalidate; input
+ * modifier for WQEs on CCQs.
+ */
+ __be32 imm;
+};
+
+enum {
+ MLX4_WQE_MLX_VL15 = 1 << 17,
+ MLX4_WQE_MLX_SLR = 1 << 16
+};
+
+struct mlx4_wqe_mlx_seg {
+ u8 owner;
+ u8 reserved1[2];
+ u8 opcode;
+ __be16 sched_prio;
+ u8 reserved2;
+ u8 size;
+ /*
+ * [17] VL15
+ * [16] SLR
+ * [15:12] static rate
+ * [11:8] SL
+ * [4] ICRC
+ * [3:2] C
+ * [0] FL (force loopback)
+ */
+ __be32 flags;
+ __be16 rlid;
+ u16 reserved3;
+};
+
+struct mlx4_wqe_datagram_seg {
+ __be32 av[8];
+ __be32 dqpn;
+ __be32 qkey;
+ __be16 vlan;
+ u8 mac[ETH_ALEN];
+};
+
+struct mlx4_wqe_lso_seg {
+ __be32 mss_hdr_size;
+ __be32 header[0];
+};
+
+enum mlx4_wqe_bind_seg_flags2 {
+ MLX4_WQE_BIND_ZERO_BASED = (1 << 30),
+ MLX4_WQE_BIND_TYPE_2 = (1 << 31),
+};
+
+struct mlx4_wqe_bind_seg {
+ __be32 flags1;
+ __be32 flags2;
+ __be32 new_rkey;
+ __be32 lkey;
+ __be64 addr;
+ __be64 length;
+};
+
+enum {
+ MLX4_WQE_FMR_PERM_LOCAL_READ = 1 << 27,
+ MLX4_WQE_FMR_PERM_LOCAL_WRITE = 1 << 28,
+ MLX4_WQE_FMR_AND_BIND_PERM_REMOTE_READ = 1 << 29,
+ MLX4_WQE_FMR_AND_BIND_PERM_REMOTE_WRITE = 1 << 30,
+ MLX4_WQE_FMR_AND_BIND_PERM_ATOMIC = 1 << 31
+};
+
+struct mlx4_wqe_fmr_seg {
+ __be32 flags;
+ __be32 mem_key;
+ __be64 buf_list;
+ __be64 start_addr;
+ __be64 reg_len;
+ __be32 offset;
+ __be32 page_size;
+ u32 reserved[2];
+};
+
+struct mlx4_wqe_fmr_ext_seg {
+ u8 flags;
+ u8 reserved;
+ __be16 app_mask;
+ __be16 wire_app_tag;
+ __be16 mem_app_tag;
+ __be32 wire_ref_tag_base;
+ __be32 mem_ref_tag_base;
+};
+
+struct mlx4_wqe_local_inval_seg {
+ u64 reserved1;
+ __be32 mem_key;
+ u32 reserved2;
+ u64 reserved3[2];
+};
+
+struct mlx4_wqe_raddr_seg {
+ __be64 raddr;
+ __be32 rkey;
+ u32 reserved;
+};
+
+struct mlx4_wqe_atomic_seg {
+ __be64 swap_add;
+ __be64 compare;
+};
+
+struct mlx4_wqe_masked_atomic_seg {
+ __be64 swap_add;
+ __be64 compare;
+ __be64 swap_add_mask;
+ __be64 compare_mask;
+};
+
+struct mlx4_wqe_data_seg {
+ __be32 byte_count;
+ __be32 lkey;
+ __be64 addr;
+};
+
+enum {
+ MLX4_INLINE_ALIGN = 64,
+ MLX4_INLINE_SEG = 1 << 31,
+};
+
+struct mlx4_wqe_inline_seg {
+ __be32 byte_count;
+};
+
+enum mlx4_update_qp_attr {
+ MLX4_UPDATE_QP_SMAC = 1 << 0,
+ MLX4_UPDATE_QP_ETH_SRC_CHECK_MC_LB = 1 << 1,
+ MLX4_UPDATE_QP_VSD = 1 << 2,
+ MLX4_UPDATE_QP_QOS_VPORT = 1 << 3,
+ MLX4_UPDATE_QP_SUPPORTED_ATTRS = (1 << 4) - 1
+};
+
+enum mlx4_update_qp_params_flags {
+ MLX4_UPDATE_QP_PARAMS_FLAGS_ETH_CHECK_MC_LB = 1 << 0,
+ MLX4_UPDATE_QP_PARAMS_FLAGS_VSD_ENABLE = 1 << 1,
+};
+
+struct mlx4_update_qp_params {
+ u8 smac_index;
+ u8 qos_vport;
+ u32 flags;
+};
+
+int mlx4_update_qp(struct mlx4_dev *dev, u32 qpn,
+ enum mlx4_update_qp_attr attr,
+ struct mlx4_update_qp_params *params);
+int mlx4_qp_modify(struct mlx4_dev *dev, struct mlx4_mtt *mtt,
+ enum mlx4_qp_state cur_state, enum mlx4_qp_state new_state,
+ struct mlx4_qp_context *context, enum mlx4_qp_optpar optpar,
+ int sqd_event, struct mlx4_qp *qp);
+
+int mlx4_qp_query(struct mlx4_dev *dev, struct mlx4_qp *qp,
+ struct mlx4_qp_context *context, int native_or_wrapped);
+
+int mlx4_qp_to_ready(struct mlx4_dev *dev, struct mlx4_mtt *mtt,
+ struct mlx4_qp_context *context,
+ struct mlx4_qp *qp, enum mlx4_qp_state *qp_state);
+
+static inline struct mlx4_qp *__mlx4_qp_lookup(struct mlx4_dev *dev, u32 qpn)
+{
+ return radix_tree_lookup(&dev->qp_table_tree, qpn & (dev->caps.num_qps - 1));
+}
+
+void mlx4_qp_remove(struct mlx4_dev *dev, struct mlx4_qp *qp);
+
+static inline u16 folded_qp(u32 q)
+{
+ u16 res;
+
+ res = ((q & 0xff) ^ ((q & 0xff0000) >> 16)) | (q & 0xff00);
+ return res;
+}
+
+u32 mlx4_qp_roce_entropy(struct mlx4_dev *dev, u32 qpn);
+
+#endif /* MLX4_QP_H */
+
+#include "post_kmod.h"
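Note on folded_qp() in mlx4/qp.h above: the helper XORs bits [23:16] of the
QP number into bits [7:0] and keeps bits [15:8] unchanged, producing a 16-bit
value. The following is a minimal standalone sketch of that expression, not
part of the patch. It assumes uint32_t/uint16_t in place of the kernel's
u32/u16, and folded_qp_selfcheck() is a hypothetical helper added only for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Same expression as folded_qp() in the header, using fixed-width types. */
static inline uint16_t folded_qp(uint32_t q)
{
	return (uint16_t)(((q & 0xff) ^ ((q & 0xff0000) >> 16)) | (q & 0xff00));
}

/* Hypothetical self-check, not part of the patch. */
static inline void folded_qp_selfcheck(void)
{
	/* 0x123456: low byte 0x56 ^ high byte 0x12 = 0x44; middle byte 0x34 is kept. */
	assert(folded_qp(0x123456) == 0x3444);
	/* Bits above bit 23 do not contribute to the result. */
	assert(folded_qp(0xff123456) == 0x3444);
}
```

Because only the low 24 bits are read, two QP numbers that differ only in
bits [31:24] fold to the same value.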
diff --git a/drivers/net/mlnx_uio/mlnx/include/mlx4/srq.h b/drivers/net/mlnx_uio/mlnx/include/mlx4/srq.h
new file mode 100644
index 0000000..a852bfe
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/include/mlx4/srq.h
@@ -0,0 +1,50 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2007 Cisco Systems, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef MLX4_SRQ_H
+#define MLX4_SRQ_H
+
+struct mlx4_wqe_srq_next_seg {
+ u16 reserved1;
+ __be16 next_wqe_index;
+ u32 reserved2[3];
+};
+
+struct mlx4_srq *mlx4_srq_lookup(struct mlx4_dev *dev, u32 srqn);
+
+#endif /* MLX4_SRQ_H */
+
+#include "post_kmod.h"
diff --git a/drivers/net/mlnx_uio/mlnx/include/mlx5/cmd.h b/drivers/net/mlnx_uio/mlnx/include/mlx5/cmd.h
new file mode 100644
index 0000000..55f37d9
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/include/mlx5/cmd.h
@@ -0,0 +1,56 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef MLX5_CMD_H
+#define MLX5_CMD_H
+
+
+struct manage_pages_layout {
+ u64 ptr;
+ u32 reserved;
+ u16 num_entries;
+ u16 func_id;
+};
+
+
+struct mlx5_cmd_alloc_uar_imm_out {
+ u32 rsvd[3];
+ u32 uarn;
+};
+
+#endif /* MLX5_CMD_H */
+
+#include "post_kmod.h"
diff --git a/drivers/net/mlnx_uio/mlnx/include/mlx5/cq.h b/drivers/net/mlnx_uio/mlnx/include/mlx5/cq.h
new file mode 100644
index 0000000..5555856
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/include/mlx5/cq.h
@@ -0,0 +1,182 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef MLX5_CORE_CQ_H
+#define MLX5_CORE_CQ_H
+
+
+
+struct mlx5_core_cq {
+ u32 cqn;
+ int cqe_sz;
+ __be32 *set_ci_db;
+ __be32 *arm_db;
+ unsigned vector;
+ int irqn;
+ void (*comp) (struct mlx5_core_cq *);
+ void (*event) (struct mlx5_core_cq *, enum mlx5_event);
+ struct mlx5_uar *uar;
+ u32 cons_index;
+ unsigned arm_sn;
+ struct mlx5_rsc_debug *dbg;
+ int pid;
+ int reset_notify_added;
+ struct list_head reset_notify;
+};
+
+
+enum {
+ MLX5_CQE_SYNDROME_LOCAL_LENGTH_ERR = 0x01,
+ MLX5_CQE_SYNDROME_LOCAL_QP_OP_ERR = 0x02,
+ MLX5_CQE_SYNDROME_LOCAL_PROT_ERR = 0x04,
+ MLX5_CQE_SYNDROME_WR_FLUSH_ERR = 0x05,
+ MLX5_CQE_SYNDROME_MW_BIND_ERR = 0x06,
+ MLX5_CQE_SYNDROME_BAD_RESP_ERR = 0x10,
+ MLX5_CQE_SYNDROME_LOCAL_ACCESS_ERR = 0x11,
+ MLX5_CQE_SYNDROME_REMOTE_INVAL_REQ_ERR = 0x12,
+ MLX5_CQE_SYNDROME_REMOTE_ACCESS_ERR = 0x13,
+ MLX5_CQE_SYNDROME_REMOTE_OP_ERR = 0x14,
+ MLX5_CQE_SYNDROME_TRANSPORT_RETRY_EXC_ERR = 0x15,
+ MLX5_CQE_SYNDROME_RNR_RETRY_EXC_ERR = 0x16,
+ MLX5_CQE_SYNDROME_REMOTE_ABORTED_ERR = 0x22,
+};
+
+enum {
+ MLX5_CQE_OWNER_MASK = 1,
+ MLX5_CQE_REQ = 0,
+ MLX5_CQE_RESP_WR_IMM = 1,
+ MLX5_CQE_RESP_SEND = 2,
+ MLX5_CQE_RESP_SEND_IMM = 3,
+ MLX5_CQE_RESP_SEND_INV = 4,
+ MLX5_CQE_RESIZE_CQ = 5,
+ MLX5_CQE_SIG_ERR = 12,
+ MLX5_CQE_REQ_ERR = 13,
+ MLX5_CQE_RESP_ERR = 14,
+ MLX5_CQE_INVALID = 15,
+};
+
+enum {
+ MLX5_CQ_MODIFY_PERIOD = 1 << 0,
+ MLX5_CQ_MODIFY_COUNT = 1 << 1,
+ MLX5_CQ_MODIFY_OVERRUN = 1 << 2,
+};
+
+enum {
+ MLX5_CQ_OPMOD_RESIZE = 1,
+ MLX5_MODIFY_CQ_MASK_LOG_SIZE = 1 << 0,
+ MLX5_MODIFY_CQ_MASK_PG_OFFSET = 1 << 1,
+ MLX5_MODIFY_CQ_MASK_PG_SIZE = 1 << 2,
+};
+
+struct mlx5_cq_modify_params {
+ int type;
+ union {
+ struct {
+ u32 page_offset;
+ u8 log_cq_size;
+ } resize;
+
+ struct {
+ } moder;
+
+ struct {
+ } mapping;
+ } params;
+};
+
+enum {
+ CQE_SIZE_64 = 0,
+ CQE_SIZE_128 = 1,
+};
+
+static inline int cqe_sz_to_mlx_sz(u8 size)
+{
+ return size == 64 ? CQE_SIZE_64 : CQE_SIZE_128;
+}
+
+static inline void mlx5_cq_set_ci(struct mlx5_core_cq *cq)
+{
+ *cq->set_ci_db = cpu_to_be32(cq->cons_index & 0xffffff);
+}
+
+enum {
+ MLX5_CQ_DB_REQ_NOT_SOL = 1 << 24,
+ MLX5_CQ_DB_REQ_NOT = 0 << 24
+};
+
+static inline void mlx5_cq_arm(struct mlx5_core_cq *cq, u32 cmd,
+ void __iomem *uar_page,
+ spinlock_t *doorbell_lock,
+ u32 cons_index)
+{
+ __be32 doorbell[2];
+ u32 sn;
+ u32 ci;
+
+ sn = cq->arm_sn & 3;
+ ci = cons_index & 0xffffff;
+
+ *cq->arm_db = cpu_to_be32(sn << 28 | cmd | ci);
+
+ /* Make sure that the doorbell record in host memory is
+ * written before ringing the doorbell via PCI MMIO.
+ */
+ wmb();
+
+ doorbell[0] = cpu_to_be32(sn << 28 | cmd | ci);
+ doorbell[1] = cpu_to_be32(cq->cqn);
+
+ mlx5_write64(doorbell, uar_page + MLX5_CQ_DOORBELL, doorbell_lock);
+}
+
+int mlx5_init_cq_table(struct mlx5_core_dev *dev);
+void mlx5_cleanup_cq_table(struct mlx5_core_dev *dev);
+int mlx5_core_create_cq(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq,
+ struct mlx5_create_cq_mbox_in *in, int inlen);
+int mlx5_core_destroy_cq(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq);
+int mlx5_core_query_cq(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq,
+ struct mlx5_query_cq_mbox_out *out);
+int mlx5_core_modify_cq(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq,
+ struct mlx5_modify_cq_mbox_in *in, int in_sz);
+int mlx5_core_modify_cq_moderation(struct mlx5_core_dev *dev,
+ struct mlx5_core_cq *cq, u16 cq_period,
+ u16 cq_max_count);
+int mlx5_debug_cq_add(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq);
+void mlx5_debug_cq_remove(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq);
+
+#endif /* MLX5_CORE_CQ_H */
+
+#include "post_kmod.h"
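Note on mlx5_cq_arm() in mlx5/cq.h above: the doorbell word packs the 2-bit
arm sequence number at bit 28, the request type at bit 24, and the 24-bit
consumer index in the low bits. The sketch below shows only that host-order
packing, under stated assumptions: cq_arm_word() and the CQ_DB_* names are
hypothetical stand-ins for illustration, and the big-endian conversion, the
write barrier, and the MMIO write done by the real function are left out.

```c
#include <assert.h>
#include <stdint.h>

enum {
	CQ_DB_REQ_NOT_SOL = 1 << 24, /* mirrors MLX5_CQ_DB_REQ_NOT_SOL */
	CQ_DB_REQ_NOT     = 0 << 24, /* mirrors MLX5_CQ_DB_REQ_NOT */
};

/*
 * Host-order doorbell word, following the expression in mlx5_cq_arm():
 *   [29:28] arm sequence number (arm_sn & 3)
 *   [24]    request type (cmd)
 *   [23:0]  consumer index (cons_index & 0xffffff)
 */
static inline uint32_t cq_arm_word(uint32_t arm_sn, uint32_t cmd,
				   uint32_t cons_index)
{
	uint32_t sn = arm_sn & 3;
	uint32_t ci = cons_index & 0xffffff;

	return sn << 28 | cmd | ci;
}
```

For example, an arm_sn of 5 is truncated to 1, so with a solicited-only
request and a consumer index of 0x1234567 the word is 0x11234567.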
diff --git a/drivers/net/mlnx_uio/mlnx/include/mlx5/device.h b/drivers/net/mlnx_uio/mlnx/include/mlx5/device.h
new file mode 100644
index 0000000..ee979da
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/include/mlx5/device.h
@@ -0,0 +1,1204 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef MLX5_DEVICE_H
+#define MLX5_DEVICE_H
+
+
+#if defined(__LITTLE_ENDIAN)
+#define MLX5_SET_HOST_ENDIANNESS 0
+#elif defined(__BIG_ENDIAN)
+#define MLX5_SET_HOST_ENDIANNESS 0x80
+#else
+#error Host endianness not defined
+#endif
+
+/* helper macros */
+#define __mlx5_nullp(typ) ((struct mlx5_ifc_##typ##_bits *)0)
+#define __mlx5_bit_sz(typ, fld) sizeof(__mlx5_nullp(typ)->fld)
+#define __mlx5_bit_off(typ, fld) ((unsigned)(unsigned long)(&(__mlx5_nullp(typ)->fld)))
+#define __mlx5_dw_off(typ, fld) (__mlx5_bit_off(typ, fld) / 32)
+#define __mlx5_64_off(typ, fld) (__mlx5_bit_off(typ, fld) / 64)
+#define __mlx5_dw_bit_off(typ, fld) (32 - __mlx5_bit_sz(typ, fld) - (__mlx5_bit_off(typ, fld) & 0x1f))
+#define __mlx5_mask(typ, fld) ((u32)((1ull << __mlx5_bit_sz(typ, fld)) - 1))
+#define __mlx5_dw_mask(typ, fld) (__mlx5_mask(typ, fld) << __mlx5_dw_bit_off(typ, fld))
+#define __mlx5_st_sz_bits(typ) sizeof(struct mlx5_ifc_##typ##_bits)
+
+#define MLX5_FLD_SZ_BYTES(typ, fld) (__mlx5_bit_sz(typ, fld) / 8)
+#define MLX5_ST_SZ_BYTES(typ) (sizeof(struct mlx5_ifc_##typ##_bits) / 8)
+#define MLX5_ST_SZ_DW(typ) (sizeof(struct mlx5_ifc_##typ##_bits) / 32)
+#define MLX5_UN_SZ_BYTES(typ) (sizeof(union mlx5_ifc_##typ##_bits) / 8)
+#define MLX5_UN_SZ_DW(typ) (sizeof(union mlx5_ifc_##typ##_bits) / 32)
+#define MLX5_BYTE_OFF(typ, fld) (__mlx5_bit_off(typ, fld) / 8)
+#define MLX5_ADDR_OF(typ, p, fld) ((char *)(p) + MLX5_BYTE_OFF(typ, fld))
+
+/* insert a value to a struct */
+#define MLX5_SET(typ, p, fld, v) do { \
+ BUILD_BUG_ON(__mlx5_st_sz_bits(typ) % 32); \
+ *((__be32 *)(p) + __mlx5_dw_off(typ, fld)) = \
+ cpu_to_be32((be32_to_cpu(*((__be32 *)(p) + __mlx5_dw_off(typ, fld))) & \
+ (~__mlx5_dw_mask(typ, fld))) | (((v) & __mlx5_mask(typ, fld)) \
+ << __mlx5_dw_bit_off(typ, fld))); \
+} while (0)
+
+#define MLX5_SET_TO_ONES(typ, p, fld) do { \
+ BUILD_BUG_ON(__mlx5_st_sz_bits(typ) % 32); \
+ *((__be32 *)(p) + __mlx5_dw_off(typ, fld)) = \
+ cpu_to_be32((be32_to_cpu(*((__be32 *)(p) + __mlx5_dw_off(typ, fld))) & \
+ (~__mlx5_dw_mask(typ, fld))) | ((__mlx5_mask(typ, fld)) \
+ << __mlx5_dw_bit_off(typ, fld))); \
+} while (0)
+
+#define MLX5_GET(typ, p, fld) ((be32_to_cpu(*((__be32 *)(p) +\
+__mlx5_dw_off(typ, fld))) >> __mlx5_dw_bit_off(typ, fld)) & \
+__mlx5_mask(typ, fld))
+
+#define MLX5_GET_PR(typ, p, fld) ({ \
+ u32 ___t = MLX5_GET(typ, p, fld); \
+ pr_debug(#fld " = 0x%x\n", ___t); \
+ ___t; \
+})
+
+#define MLX5_SET64(typ, p, fld, v) do { \
+ BUILD_BUG_ON(__mlx5_bit_sz(typ, fld) != 64); \
+ BUILD_BUG_ON(__mlx5_bit_off(typ, fld) % 64); \
+ *((__be64 *)(p) + __mlx5_64_off(typ, fld)) = cpu_to_be64(v); \
+} while (0)
+
+#define MLX5_GET64(typ, p, fld) be64_to_cpu(*((__be64 *)(p) + __mlx5_64_off(typ, fld)))
+
+#define MLX5_GET64_PR(typ, p, fld) ({ \
+ u64 ___t = MLX5_GET64(typ, p, fld); \
+ pr_debug(#fld " = 0x%llx\n", ___t); \
+ ___t; \
+})
+
+enum {
+ MLX5_MAX_COMMANDS = 32,
+ MLX5_CMD_DATA_BLOCK_SIZE = 512,
+ MLX5_PCI_CMD_XPORT = 7,
+ MLX5_MKEY_BSF_OCTO_SIZE = 4,
+ MLX5_MAX_PSVS = 4,
+};
+
+enum {
+ MLX5_EXTENDED_UD_AV = 0x80000000,
+};
+
+enum {
+ MLX5_CQ_STATE_ARMED = 9,
+ MLX5_CQ_STATE_ALWAYS_ARMED = 0xb,
+ MLX5_CQ_STATE_FIRED = 0xa,
+};
+
+enum {
+ MLX5_CQ_FLAGS_OI = 2,
+};
+
+enum {
+ MLX5_STAT_RATE_OFFSET = 5,
+};
+
+enum {
+ MLX5_INLINE_SEG = 0x80000000,
+};
+
+enum {
+ MLX5_HW_START_PADDING = MLX5_INLINE_SEG,
+};
+
+enum {
+ MLX5_MIN_PKEY_TABLE_SIZE = 128,
+ MLX5_MAX_LOG_PKEY_TABLE = 5,
+};
+
+enum {
+ MLX5_MKEY_INBOX_PG_ACCESS = 1 << 31
+};
+
+enum {
+ MLX5_PFAULT_SUBTYPE_WQE = 0,
+ MLX5_PFAULT_SUBTYPE_RDMA = 1,
+};
+
+enum {
+ MLX5_PERM_LOCAL_READ = 1 << 2,
+ MLX5_PERM_LOCAL_WRITE = 1 << 3,
+ MLX5_PERM_REMOTE_READ = 1 << 4,
+ MLX5_PERM_REMOTE_WRITE = 1 << 5,
+ MLX5_PERM_ATOMIC = 1 << 6,
+ MLX5_PERM_UMR_EN = 1 << 7,
+};
+
+enum {
+ MLX5_PCIE_CTRL_SMALL_FENCE = 1 << 0,
+ MLX5_PCIE_CTRL_RELAXED_ORDERING = 1 << 2,
+ MLX5_PCIE_CTRL_NO_SNOOP = 1 << 3,
+ MLX5_PCIE_CTRL_TLP_PROCE_EN = 1 << 6,
+ MLX5_PCIE_CTRL_TPH_MASK = 3 << 4,
+};
+
+enum {
+ MLX5_ACCESS_MODE_PA = 0,
+ MLX5_ACCESS_MODE_MTT = 1,
+ MLX5_ACCESS_MODE_KLM = 2
+};
+
+enum {
+ MLX5_MKEY_REMOTE_INVAL = 1 << 24,
+ MLX5_MKEY_FLAG_SYNC_UMR = 1 << 29,
+ MLX5_MKEY_BSF_EN = 1 << 30,
+ MLX5_MKEY_LEN64 = 1 << 31,
+};
+
+enum {
+ MLX5_EN_RD = (u64)1,
+ MLX5_EN_WR = (u64)2
+};
+
+enum {
+ MLX5_BF_REGS_PER_PAGE = 4,
+ MLX5_MAX_UAR_PAGES = 1 << 8,
+ MLX5_NON_FP_BF_REGS_PER_PAGE = 2,
+ MLX5_MAX_UUARS = MLX5_MAX_UAR_PAGES * MLX5_NON_FP_BF_REGS_PER_PAGE,
+};
+
+enum {
+ MLX5_MKEY_MASK_LEN = 1ull << 0,
+ MLX5_MKEY_MASK_PAGE_SIZE = 1ull << 1,
+ MLX5_MKEY_MASK_START_ADDR = 1ull << 6,
+ MLX5_MKEY_MASK_PD = 1ull << 7,
+ MLX5_MKEY_MASK_EN_RINVAL = 1ull << 8,
+ MLX5_MKEY_MASK_EN_SIGERR = 1ull << 9,
+ MLX5_MKEY_MASK_BSF_EN = 1ull << 12,
+ MLX5_MKEY_MASK_KEY = 1ull << 13,
+ MLX5_MKEY_MASK_QPN = 1ull << 14,
+ MLX5_MKEY_MASK_LR = 1ull << 17,
+ MLX5_MKEY_MASK_LW = 1ull << 18,
+ MLX5_MKEY_MASK_RR = 1ull << 19,
+ MLX5_MKEY_MASK_RW = 1ull << 20,
+ MLX5_MKEY_MASK_A = 1ull << 21,
+ MLX5_MKEY_MASK_SMALL_FENCE = 1ull << 23,
+ MLX5_MKEY_MASK_FREE = 1ull << 29,
+};
+
+enum {
+ MLX5_UMR_TRANSLATION_OFFSET_EN = (1 << 4),
+
+ MLX5_UMR_CHECK_NOT_FREE = (1 << 5),
+ MLX5_UMR_CHECK_FREE = (2 << 5),
+
+ MLX5_UMR_INLINE = (1 << 7),
+};
+
+#define MLX5_UMR_MTT_ALIGNMENT 0x40
+#define MLX5_UMR_MTT_MASK (MLX5_UMR_MTT_ALIGNMENT - 1)
+#define MLX5_UMR_MTT_MIN_CHUNK_SIZE MLX5_UMR_MTT_ALIGNMENT
+
+enum mlx5_event {
+ MLX5_EVENT_TYPE_COMP = 0x0,
+
+ MLX5_EVENT_TYPE_PATH_MIG = 0x01,
+ MLX5_EVENT_TYPE_COMM_EST = 0x02,
+ MLX5_EVENT_TYPE_SQ_DRAINED = 0x03,
+ MLX5_EVENT_TYPE_SRQ_LAST_WQE = 0x13,
+ MLX5_EVENT_TYPE_SRQ_RQ_LIMIT = 0x14,
+
+ MLX5_EVENT_TYPE_CQ_ERROR = 0x04,
+ MLX5_EVENT_TYPE_WQ_CATAS_ERROR = 0x05,
+ MLX5_EVENT_TYPE_PATH_MIG_FAILED = 0x07,
+ MLX5_EVENT_TYPE_WQ_INVAL_REQ_ERROR = 0x10,
+ MLX5_EVENT_TYPE_WQ_ACCESS_ERROR = 0x11,
+ MLX5_EVENT_TYPE_SRQ_CATAS_ERROR = 0x12,
+
+ MLX5_EVENT_TYPE_INTERNAL_ERROR = 0x08,
+ MLX5_EVENT_TYPE_PORT_CHANGE = 0x09,
+ MLX5_EVENT_TYPE_GPIO_EVENT = 0x15,
+ MLX5_EVENT_TYPE_REMOTE_CONFIG = 0x19,
+
+ MLX5_EVENT_TYPE_DB_BF_CONGESTION = 0x1a,
+ MLX5_EVENT_TYPE_STALL_EVENT = 0x1b,
+
+ MLX5_EVENT_TYPE_CMD = 0x0a,
+ MLX5_EVENT_TYPE_PAGE_REQUEST = 0xb,
+
+ MLX5_EVENT_TYPE_PAGE_FAULT = 0xc,
+
+ MLX5_EVENT_TYPE_DCT_DRAINED = 0x1c,
+ MLX5_EVENT_TYPE_DCT_KEY_VIOLATION = 0x1d,
+};
+
+enum {
+ MLX5_PORT_CHANGE_SUBTYPE_DOWN = 1,
+ MLX5_PORT_CHANGE_SUBTYPE_ACTIVE = 4,
+ MLX5_PORT_CHANGE_SUBTYPE_INITIALIZED = 5,
+ MLX5_PORT_CHANGE_SUBTYPE_LID = 6,
+ MLX5_PORT_CHANGE_SUBTYPE_PKEY = 7,
+ MLX5_PORT_CHANGE_SUBTYPE_GUID = 8,
+ MLX5_PORT_CHANGE_SUBTYPE_CLIENT_REREG = 9,
+};
+
+enum {
+ MLX5_MAX_INLINE_RECEIVE_SIZE = 64
+};
+
+enum {
+ MLX5_DEV_CAP_FLAG_XRC = 1LL << 3,
+ MLX5_DEV_CAP_FLAG_BAD_PKEY_CNTR = 1LL << 8,
+ MLX5_DEV_CAP_FLAG_BAD_QKEY_CNTR = 1LL << 9,
+ MLX5_DEV_CAP_FLAG_APM = 1LL << 17,
+ MLX5_DEV_CAP_FLAG_BLOCK_MCAST = 1LL << 23,
+ MLX5_DEV_CAP_FLAG_ON_DMND_PG = 1LL << 24,
+ MLX5_DEV_CAP_FLAG_CQ_MODER = 1LL << 29,
+ MLX5_DEV_CAP_FLAG_RESIZE_CQ = 1LL << 30,
+ MLX5_DEV_CAP_FLAG_ATOMIC = 1LL << 33,
+ MLX5_DEV_CAP_FLAG_ROCE = 1LL << 34,
+ MLX5_DEV_CAP_FLAG_DCT = 1LL << 37,
+ MLX5_DEV_CAP_FLAG_SIG_HAND_OVER = 1LL << 40,
+ MLX5_DEV_CAP_FLAG_CMDIF_CSUM = 3LL << 46,
+};
+
+enum {
+ MLX5_ROCE_VERSION_1 = 0,
+ MLX5_ROCE_VERSION_2 = 2,
+};
+
+enum {
+ MLX5_ROCE_VERSION_1_CAP = 1 << MLX5_ROCE_VERSION_1,
+ MLX5_ROCE_VERSION_2_CAP = 1 << MLX5_ROCE_VERSION_2,
+};
+
+enum {
+ MLX5_ROCE_L3_TYPE_IPV4 = 0,
+ MLX5_ROCE_L3_TYPE_IPV6 = 1,
+};
+
+enum {
+ MLX5_ROCE_L3_TYPE_IPV4_CAP = 1 << 1,
+ MLX5_ROCE_L3_TYPE_IPV6_CAP = 1 << 2,
+};
+
+enum {
+ MLX5_OPCODE_NOP = 0x00,
+ MLX5_OPCODE_SEND_INVAL = 0x01,
+ MLX5_OPCODE_RDMA_WRITE = 0x08,
+ MLX5_OPCODE_RDMA_WRITE_IMM = 0x09,
+ MLX5_OPCODE_SEND = 0x0a,
+ MLX5_OPCODE_SEND_IMM = 0x0b,
+ MLX5_OPCODE_LSO = 0x0e,
+ MLX5_OPCODE_RDMA_READ = 0x10,
+ MLX5_OPCODE_ATOMIC_CS = 0x11,
+ MLX5_OPCODE_ATOMIC_FA = 0x12,
+ MLX5_OPCODE_ATOMIC_MASKED_CS = 0x14,
+ MLX5_OPCODE_ATOMIC_MASKED_FA = 0x15,
+ MLX5_OPCODE_BIND_MW = 0x18,
+ MLX5_OPCODE_CONFIG_CMD = 0x1f,
+
+ MLX5_RECV_OPCODE_RDMA_WRITE_IMM = 0x00,
+ MLX5_RECV_OPCODE_SEND = 0x01,
+ MLX5_RECV_OPCODE_SEND_IMM = 0x02,
+ MLX5_RECV_OPCODE_SEND_INVAL = 0x03,
+
+ MLX5_CQE_OPCODE_ERROR = 0x1e,
+ MLX5_CQE_OPCODE_RESIZE = 0x16,
+
+ MLX5_OPCODE_SET_PSV = 0x20,
+ MLX5_OPCODE_GET_PSV = 0x21,
+ MLX5_OPCODE_CHECK_PSV = 0x22,
+ MLX5_OPCODE_RGET_PSV = 0x26,
+ MLX5_OPCODE_RCHECK_PSV = 0x27,
+
+ MLX5_OPCODE_UMR = 0x25,
+
+};
+
+enum {
+ MLX5_SET_PORT_RESET_QKEY = 0,
+ MLX5_SET_PORT_GUID0 = 16,
+ MLX5_SET_PORT_NODE_GUID = 17,
+ MLX5_SET_PORT_SYS_GUID = 18,
+ MLX5_SET_PORT_GID_TABLE = 19,
+ MLX5_SET_PORT_PKEY_TABLE = 20,
+};
+
+enum {
+ MLX5_MAX_PAGE_SHIFT = 31
+};
+
+enum {
+ MLX5_ADAPTER_PAGE_SHIFT = 12,
+ MLX5_ADAPTER_PAGE_SIZE = 1 << MLX5_ADAPTER_PAGE_SHIFT,
+};
+
+enum {
+ MLX5_CAP_OFF_DCT = 41,
+ MLX5_CAP_OFF_CMDIF_CSUM = 46,
+};
+
+struct mlx5_inbox_hdr {
+ __be16 opcode;
+ u8 rsvd[4];
+ __be16 opmod;
+};
+
+struct mlx5_outbox_hdr {
+ u8 status;
+ u8 rsvd[3];
+ __be32 syndrome;
+};
+
+enum mlx5_odp_transport_cap_bits {
+ MLX5_ODP_SUPPORT_SEND = 1 << 31,
+ MLX5_ODP_SUPPORT_RECV = 1 << 30,
+ MLX5_ODP_SUPPORT_WRITE = 1 << 29,
+ MLX5_ODP_SUPPORT_READ = 1 << 28,
+};
+
+struct mlx5_odp_caps {
+ char reserved[0x10];
+ struct {
+ __be32 rc_odp_caps;
+ __be32 uc_odp_caps;
+ __be32 ud_odp_caps;
+ } per_transport_caps;
+ char reserved2[0xe4];
+};
+
+struct mlx5_cmd_layout {
+ u8 type;
+ u8 rsvd0[3];
+ __be32 inlen;
+ __be64 in_ptr;
+ __be32 in[4];
+ __be32 out[4];
+ __be64 out_ptr;
+ __be32 outlen;
+ u8 token;
+ u8 sig;
+ u8 rsvd1;
+ u8 status_own;
+};
+
+
+struct health_buffer {
+ __be32 assert_var[5];
+ __be32 rsvd0[3];
+ __be32 assert_exit_ptr;
+ __be32 assert_callra;
+ __be32 rsvd1[2];
+ __be32 fw_ver;
+ __be32 hw_id;
+ __be32 rsvd2;
+ u8 irisc_index;
+ u8 synd;
+ __be16 ext_synd;
+};
+
+struct mlx5_init_seg {
+ __be32 fw_rev;
+ __be32 cmdif_rev_fw_sub;
+ __be32 rsvd0[2];
+ __be32 cmdq_addr_h;
+ __be32 cmdq_addr_l_sz;
+ __be32 cmd_dbell;
+ __be32 rsvd1[120];
+ __be32 initializing;
+ struct health_buffer health;
+ __be32 rsvd2[884];
+ __be32 health_counter;
+ __be32 rsvd3[1019];
+ __be64 ieee1588_clk;
+ __be32 ieee1588_clk_type;
+ __be32 clr_intx;
+};
+
+struct mlx5_eqe_comp {
+ __be32 reserved[6];
+ __be32 cqn;
+};
+
+struct mlx5_eqe_qp_srq {
+ __be32 reserved[6];
+ __be32 qp_srq_n;
+};
+
+struct mlx5_eqe_cq_err {
+ __be32 cqn;
+ u8 reserved1[7];
+ u8 syndrome;
+};
+
+struct mlx5_eqe_port_state {
+ u8 reserved0[8];
+ u8 port;
+};
+
+struct mlx5_eqe_gpio {
+ __be32 reserved0[2];
+ __be64 gpio_event;
+};
+
+struct mlx5_eqe_congestion {
+ u8 type;
+ u8 rsvd0;
+ u8 congestion_level;
+};
+
+struct mlx5_eqe_stall_vl {
+ u8 rsvd0[3];
+ u8 port_vl;
+};
+
+struct mlx5_eqe_cmd {
+ __be32 vector;
+ __be32 rsvd[6];
+};
+
+struct mlx5_eqe_page_req {
+ u8 rsvd0[2];
+ __be16 func_id;
+ __be32 num_pages;
+ __be32 rsvd1[5];
+};
+
+struct mlx5_eqe_page_fault {
+ __be32 bytes_committed;
+ union {
+ struct {
+ u16 reserved1;
+ __be16 wqe_index;
+ u16 reserved2;
+ __be16 packet_length;
+ u8 reserved3[12];
+ } __packed wqe;
+ struct {
+ __be32 r_key;
+ u16 reserved1;
+ __be16 packet_length;
+ __be32 rdma_op_len;
+ __be64 rdma_va;
+ } __packed rdma;
+ } __packed;
+ __be32 flags_qpn;
+} __packed;
+
+struct mlx5_eqe_dct {
+ __be32 reserved[6];
+ __be32 dctn;
+};
+
+union ev_data {
+ __be32 raw[7];
+ struct mlx5_eqe_cmd cmd;
+ struct mlx5_eqe_comp comp;
+ struct mlx5_eqe_qp_srq qp_srq;
+ struct mlx5_eqe_cq_err cq_err;
+ struct mlx5_eqe_port_state port;
+ struct mlx5_eqe_gpio gpio;
+ struct mlx5_eqe_congestion cong;
+ struct mlx5_eqe_stall_vl stall_vl;
+ struct mlx5_eqe_page_req req_pages;
+ struct mlx5_eqe_page_fault page_fault;
+ struct mlx5_eqe_dct dct;
+} __packed;
+
+struct mlx5_eqe {
+ u8 rsvd0;
+ u8 type;
+ u8 rsvd1;
+ u8 sub_type;
+ __be32 rsvd2[7];
+ union ev_data data;
+ __be16 rsvd3;
+ u8 signature;
+ u8 owner;
+} __packed;
+
+struct mlx5_cmd_prot_block {
+ u8 data[MLX5_CMD_DATA_BLOCK_SIZE];
+ u8 rsvd0[48];
+ __be64 next;
+ __be32 block_num;
+ u8 rsvd1;
+ u8 token;
+ u8 ctrl_sig;
+ u8 sig;
+};
+
+enum {
+ MLX5_CQE_SYND_FLUSHED_IN_ERROR = 5,
+};
+
+struct mlx5_err_cqe {
+ u8 rsvd0[32];
+ __be32 srqn;
+ u8 rsvd1[18];
+ u8 vendor_err_synd;
+ u8 syndrome;
+ __be32 s_wqe_opcode_qpn;
+ __be16 wqe_counter;
+ u8 signature;
+ u8 op_own;
+};
+
+struct mlx5_cqe64 {
+ u8 rsvd0[4];
+ u8 lro_tcppsh_abort_dupack;
+ u8 lro_min_ttl;
+ __be16 lro_tcp_win;
+ __be32 lro_ack_seq_num;
+ __be32 rss_hash_result;
+ u8 rss_hash_type;
+ u8 ml_path;
+ u8 rsvd20[2];
+ __be16 check_sum;
+ __be16 slid;
+ __be32 flags_rqpn;
+ u8 hds_ip_ext;
+ u8 l4_hdr_type_etc;
+ __be16 vlan_info;
+ __be32 srqn; /* [31:24]: lro_num_seg, [23:0]: srqn */
+ __be32 imm_inval_pkey;
+ u8 rsvd40[4];
+ __be32 byte_cnt;
+ __be64 timestamp;
+ __be32 sop_drop_qpn;
+ __be16 wqe_counter;
+ u8 signature;
+ u8 op_own;
+};
+
+static inline int get_cqe_lro_tcppsh(struct mlx5_cqe64 *cqe)
+{
+ return (cqe->lro_tcppsh_abort_dupack >> 6) & 1;
+}
+
+static inline u8 get_cqe_l4_hdr_type(struct mlx5_cqe64 *cqe)
+{
+ return (cqe->l4_hdr_type_etc >> 4) & 0x7;
+}
+
+static inline int cqe_has_vlan(struct mlx5_cqe64 *cqe)
+{
+ return !!(cqe->l4_hdr_type_etc & 0x1);
+}
+
+enum {
+ CQE_L4_HDR_TYPE_NONE = 0x0,
+ CQE_L4_HDR_TYPE_TCP_NO_ACK = 0x1,
+ CQE_L4_HDR_TYPE_UDP = 0x2,
+ CQE_L4_HDR_TYPE_TCP_ACK_NO_DATA = 0x3,
+ CQE_L4_HDR_TYPE_TCP_ACK_AND_DATA = 0x4,
+};
+
+enum {
+ CQE_RSS_HTYPE_IP = 0x3 << 6,
+ CQE_RSS_HTYPE_L4 = 0x3 << 2,
+};
+
+enum {
+ CQE_ROCE_L3_HEADER_TYPE_GRH = 0x0,
+ CQE_ROCE_L3_HEADER_TYPE_IPV6 = 0x1,
+ CQE_ROCE_L3_HEADER_TYPE_IPV4 = 0x2,
+};
+
+enum {
+ CQE_L2_OK = 1 << 0,
+ CQE_L3_OK = 1 << 1,
+ CQE_L4_OK = 1 << 2,
+};
+
+struct mlx5_sig_err_cqe {
+ u8 rsvd0[16];
+ __be32 expected_trans_sig;
+ __be32 actual_trans_sig;
+ __be32 expected_reftag;
+ __be32 actual_reftag;
+ __be16 syndrome;
+ u8 rsvd22[2];
+ __be32 mkey;
+ __be64 err_offset;
+ u8 rsvd30[8];
+ __be32 qpn;
+ u8 rsvd38[2];
+ u8 signature;
+ u8 op_own;
+};
+
+struct mlx5_wqe_srq_next_seg {
+ u8 rsvd0[2];
+ __be16 next_wqe_index;
+ u8 signature;
+ u8 rsvd1[11];
+};
+
+union mlx5_ext_cqe {
+ struct ib_grh grh;
+ u8 inl[64];
+};
+
+struct mlx5_cqe128 {
+ union mlx5_ext_cqe inl_grh;
+ struct mlx5_cqe64 cqe64;
+};
+
+struct mlx5_srq_ctx {
+ u8 state_log_sz;
+ u8 rsvd0[3];
+ __be32 flags_xrcd;
+ __be32 pgoff_cqn;
+ u8 rsvd1[4];
+ u8 log_pg_sz;
+ u8 rsvd2[7];
+ __be32 pd;
+ __be16 lwm;
+ __be16 wqe_cnt;
+ u8 rsvd3[8];
+ __be64 db_record;
+};
+
+struct mlx5_create_srq_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 input_srqn;
+ u8 rsvd0[4];
+ struct mlx5_srq_ctx ctx;
+ u8 rsvd1[208];
+ __be64 pas[0];
+};
+
+struct mlx5_create_srq_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ __be32 srqn;
+ u8 rsvd[4];
+};
+
+struct mlx5_destroy_srq_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 srqn;
+ u8 rsvd[4];
+};
+
+struct mlx5_destroy_srq_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd[8];
+};
+
+struct mlx5_query_srq_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 srqn;
+ u8 rsvd0[4];
+};
+
+struct mlx5_query_srq_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd0[8];
+ struct mlx5_srq_ctx ctx;
+ u8 rsvd1[32];
+ __be64 pas[0];
+};
+
+struct mlx5_arm_srq_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 srqn;
+ __be16 rsvd;
+ __be16 lwm;
+};
+
+struct mlx5_arm_srq_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd[8];
+};
+
+struct mlx5_cq_context {
+ u8 status;
+ u8 cqe_sz_flags;
+ u8 st;
+ u8 rsvd3;
+ u8 rsvd4[6];
+ __be16 page_offset;
+ __be32 log_sz_usr_page;
+ __be16 cq_period;
+ __be16 cq_max_count;
+ __be16 rsvd20;
+ __be16 c_eqn;
+ u8 log_pg_sz;
+ u8 rsvd25[7];
+ __be32 last_notified_index;
+ __be32 solicit_producer_index;
+ __be32 consumer_counter;
+ __be32 producer_counter;
+ u8 rsvd48[8];
+ __be64 db_record_addr;
+};
+
+struct mlx5_create_cq_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 input_cqn;
+ u8 rsvdx[4];
+ struct mlx5_cq_context ctx;
+ u8 rsvd6[192];
+ __be64 pas[0];
+};
+
+struct mlx5_create_cq_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ __be32 cqn;
+ u8 rsvd0[4];
+};
+
+struct mlx5_destroy_cq_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 cqn;
+ u8 rsvd0[4];
+};
+
+struct mlx5_destroy_cq_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd0[8];
+};
+
+struct mlx5_query_cq_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 cqn;
+ u8 rsvd0[4];
+};
+
+struct mlx5_query_cq_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd0[8];
+ struct mlx5_cq_context ctx;
+ u8 rsvd6[16];
+ __be64 pas[0];
+};
+
+struct mlx5_modify_cq_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 cqn;
+ __be32 field_select;
+ struct mlx5_cq_context ctx;
+ u8 rsvd[192];
+ __be64 pas[0];
+};
+
+struct mlx5_modify_cq_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd[8];
+};
+
+struct mlx5_eq_context {
+ u8 status;
+ u8 ec_oi;
+ u8 st;
+ u8 rsvd2[7];
+ __be16 page_pffset;
+ __be32 log_sz_usr_page;
+ u8 rsvd3[7];
+ u8 intr;
+ u8 log_page_size;
+ u8 rsvd4[15];
+ __be32 consumer_counter;
+ __be32 produser_counter;
+ u8 rsvd5[16];
+};
+
+struct mlx5_create_eq_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ u8 rsvd0[3];
+ u8 input_eqn;
+ u8 rsvd1[4];
+ struct mlx5_eq_context ctx;
+ u8 rsvd2[8];
+ __be64 events_mask;
+ u8 rsvd3[176];
+ __be64 pas[0];
+};
+
+struct mlx5_create_eq_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd0[3];
+ u8 eq_number;
+ u8 rsvd1[4];
+};
+
+struct mlx5_map_eq_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be64 mask;
+ u8 mu;
+ u8 rsvd0[2];
+ u8 eqn;
+ u8 rsvd1[24];
+};
+
+struct mlx5_map_eq_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd[8];
+};
+
+struct mlx5_query_eq_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ u8 rsvd0[3];
+ u8 eqn;
+ u8 rsvd1[4];
+};
+
+struct mlx5_query_eq_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd[8];
+ struct mlx5_eq_context ctx;
+};
+
+enum {
+ MLX5_MKEY_STATUS_FREE = 1 << 6,
+};
+
+struct mlx5_mkey_seg {
+ /* This is a two bit field occupying bits 31-30.
+ * bit 31 is always 0,
+ * bit 30 is zero for regular MRs and 1 (i.e. free) for UMRs that do not have translation
+ */
+ u8 status;
+ u8 pcie_control;
+ u8 flags;
+ u8 version;
+ __be32 qpn_mkey7_0;
+ u8 rsvd1[4];
+ __be32 flags_pd;
+ __be64 start_addr;
+ __be64 len;
+ __be32 bsfs_octo_size;
+ u8 rsvd2[16];
+ __be32 xlt_oct_size;
+ u8 rsvd3[3];
+ u8 log2_page_size;
+ u8 rsvd4[4];
+};
+
+struct mlx5_query_special_ctxs_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ u8 rsvd[8];
+};
+
+struct mlx5_query_special_ctxs_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ __be32 dump_fill_mkey;
+ __be32 reserved_lkey;
+};
+
+struct mlx5_create_mkey_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 input_mkey_index;
+ __be32 flags;
+ struct mlx5_mkey_seg seg;
+ u8 rsvd1[16];
+ __be32 xlat_oct_act_size;
+ __be32 rsvd2;
+ u8 rsvd3[168];
+ __be64 pas[0];
+};
+
+struct mlx5_create_mkey_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ __be32 mkey;
+ u8 rsvd[4];
+};
+
+struct mlx5_query_mkey_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 mkey;
+};
+
+struct mlx5_query_mkey_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ __be64 pas[0];
+};
+
+struct mlx5_modify_mkey_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 mkey;
+ __be64 pas[0];
+};
+
+struct mlx5_modify_mkey_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd[8];
+};
+
+struct mlx5_dump_mkey_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+};
+
+struct mlx5_dump_mkey_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ __be32 mkey;
+};
+
+struct mlx5_mad_ifc_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be16 remote_lid;
+ u8 rsvd0;
+ u8 port;
+ u8 rsvd1[4];
+ u8 data[256];
+};
+
+struct mlx5_mad_ifc_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd[8];
+ u8 data[256];
+};
+
+struct mlx5_access_reg_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ u8 rsvd0[2];
+ __be16 register_id;
+ __be32 arg;
+ __be32 data[0];
+};
+
+struct mlx5_access_reg_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd[8];
+ __be32 data[0];
+};
+
+#define MLX5_ATTR_EXTENDED_PORT_INFO cpu_to_be16(0xff90)
+
+enum {
+ MLX_EXT_PORT_CAP_FLAG_EXTENDED_PORT_INFO = 1 << 0
+};
+
+enum {
+ DCT_STATE_ACTIVE = 0,
+ DCT_STATE_DRAINING = 1,
+ DCT_STATE_DRAINED = 2
+};
+
+static inline const char *mlx5_dct_state_str(u8 state)
+{
+ switch (state) {
+ case DCT_STATE_ACTIVE: return "Active";
+ case DCT_STATE_DRAINING: return "Draining";
+ case DCT_STATE_DRAINED: return "Drained";
+ default: return "Invalid";
+ }
+}
+
+struct mlx5_allocate_psv_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 npsv_pd;
+ __be32 rsvd_psv0;
+};
+
+struct mlx5_allocate_psv_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd[8];
+ __be32 psv_idx[4];
+};
+
+struct mlx5_destroy_psv_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 psv_number;
+ u8 rsvd[4];
+};
+
+struct mlx5_destroy_psv_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd[8];
+};
+
+static inline int mlx5_host_is_le(void)
+{
+#if defined(__LITTLE_ENDIAN)
+ return 1;
+#elif defined(__BIG_ENDIAN)
+ return 0;
+#else
+#error Host endianness not defined
+#endif
+}
+
+#define MLX5_CMD_OP_MAX 0x920
+
+enum {
+ VPORT_STATE_DOWN = 0x0,
+ VPORT_STATE_UP = 0x1,
+};
+
+enum {
+ MLX5_L3_PROT_TYPE_IPV4 = 0,
+ MLX5_L3_PROT_TYPE_IPV6 = 1,
+};
+
+enum {
+ MLX5_L4_PROT_TYPE_TCP = 0,
+ MLX5_L4_PROT_TYPE_UDP = 1,
+};
+
+enum {
+ MLX5_HASH_FIELD_SEL_SRC_IP = 1 << 0,
+ MLX5_HASH_FIELD_SEL_DST_IP = 1 << 1,
+ MLX5_HASH_FIELD_SEL_L4_SPORT = 1 << 2,
+ MLX5_HASH_FIELD_SEL_L4_DPORT = 1 << 3,
+ MLX5_HASH_FIELD_SEL_IPSEC_SPI = 1 << 4,
+};
+
+enum {
+ MLX5_MATCH_OUTER_HEADERS = 1 << 0,
+ MLX5_MATCH_MISC_PARAMETERS = 1 << 1,
+ MLX5_MATCH_INNER_HEADERS = 1 << 2,
+
+};
+
+enum {
+ MLX5_FLOW_TABLE_TYPE_NIC_RCV = 0,
+ MLX5_FLOW_TABLE_TYPE_ESWITCH = 4,
+};
+
+enum {
+ MLX5_FLOW_CONTEXT_DEST_TYPE_VPORT = 0,
+ MLX5_FLOW_CONTEXT_DEST_TYPE_FLOW_TABLE = 1,
+ MLX5_FLOW_CONTEXT_DEST_TYPE_TIR = 2,
+};
+
+enum {
+ MLX5_RQC_RQ_TYPE_MEMORY_RQ_INLINE = 0x0,
+ MLX5_RQC_RQ_TYPE_MEMORY_RQ_RPM = 0x1,
+};
+
+enum {
+ MLX5_CMD_STAT_OK = 0x0,
+ MLX5_CMD_STAT_INT_ERR = 0x1,
+ MLX5_CMD_STAT_BAD_OP_ERR = 0x2,
+ MLX5_CMD_STAT_BAD_PARAM_ERR = 0x3,
+ MLX5_CMD_STAT_BAD_SYS_STATE_ERR = 0x4,
+ MLX5_CMD_STAT_BAD_RES_ERR = 0x5,
+ MLX5_CMD_STAT_RES_BUSY = 0x6,
+ MLX5_CMD_STAT_LIM_ERR = 0x8,
+ MLX5_CMD_STAT_BAD_RES_STATE_ERR = 0x9,
+ MLX5_CMD_STAT_IX_ERR = 0xa,
+ MLX5_CMD_STAT_NO_RES_ERR = 0xf,
+ MLX5_CMD_STAT_BAD_INP_LEN_ERR = 0x50,
+ MLX5_CMD_STAT_BAD_OUTP_LEN_ERR = 0x51,
+ MLX5_CMD_STAT_BAD_QP_STATE_ERR = 0x10,
+ MLX5_CMD_STAT_BAD_PKT_ERR = 0x30,
+ MLX5_CMD_STAT_BAD_SIZE_OUTS_CQES_ERR = 0x40,
+};
+
+enum {
+ MLX5_IEEE_802_3_COUNTERS_GROUP = 0x0,
+ MLX5_RFC_2863_COUNTERS_GROUP = 0x1,
+ MLX5_RFC_2819_COUNTERS_GROUP = 0x2,
+ MLX5_RFC_3635_COUNTERS_GROUP = 0x3,
+ MLX5_ETHERNET_EXTENDED_COUNTERS_GROUP = 0x5,
+ MLX5_PER_PRIORITY_COUNTERS_GROUP = 0x10,
+ MLX5_PER_TRAFFIC_CLASS_COUNTERS_GROUP = 0x11
+};
+
+/* MLX5 DEV CAPs */
+
+/* TODO: EAT.ME */
+enum mlx5_cap_mode {
+ HCA_CAP_OPMOD_GET_MAX = 0,
+ HCA_CAP_OPMOD_GET_CUR = 1,
+};
+
+enum mlx5_cap_type {
+ MLX5_CAP_GENERAL = 0,
+ MLX5_CAP_ETHERNET_OFFLOADS,
+ MLX5_CAP_ODP,
+ MLX5_CAP_ATOMIC,
+ MLX5_CAP_ROCE,
+ MLX5_CAP_IPOIB_OFFLOADS,
+ MLX5_CAP_EOIB_OFFLOADS,
+ MLX5_CAP_FLOW_TABLE,
+ /* NUM OF CAP Types */
+ MLX5_CAP_NUM
+};
+
+/* GET Dev Caps macros */
+#define MLX5_CAP_GEN(mdev, cap) \
+ MLX5_GET(cmd_hca_cap, mdev->hca_caps_cur[MLX5_CAP_GENERAL], cap)
+
+#define MLX5_CAP_GEN_MAX(mdev, cap) \
+ MLX5_GET(cmd_hca_cap, mdev->hca_caps_max[MLX5_CAP_GENERAL], cap)
+
+#define MLX5_CAP_ETH(mdev, cap) \
+ MLX5_GET(per_protocol_networking_offload_caps,\
+ mdev->hca_caps_cur[MLX5_CAP_ETHERNET_OFFLOADS], cap)
+
+#define MLX5_CAP_ETH_MAX(mdev, cap) \
+ MLX5_GET(per_protocol_networking_offload_caps,\
+ mdev->hca_caps_max[MLX5_CAP_ETHERNET_OFFLOADS], cap)
+
+#define MLX5_CAP_ROCE(mdev, cap) \
+ MLX5_GET(roce_cap, mdev->hca_caps_cur[MLX5_CAP_ROCE], cap)
+
+#define MLX5_CAP_ROCE_MAX(mdev, cap) \
+ MLX5_GET(roce_cap, mdev->hca_caps_max[MLX5_CAP_ROCE], cap)
+
+#define MLX5_CAP_ATOMIC(mdev, cap) \
+ MLX5_GET(atomic_caps, mdev->hca_caps_cur[MLX5_CAP_ATOMIC], cap)
+
+#define MLX5_CAP_ATOMIC_MAX(mdev, cap) \
+ MLX5_GET(atomic_caps, mdev->hca_caps_max[MLX5_CAP_ATOMIC], cap)
+
+#define MLX5_CAP_FLOWTABLE(mdev, cap) \
+ MLX5_GET(flow_table_nic_cap, mdev->hca_caps_cur[MLX5_CAP_FLOW_TABLE], cap)
+
+#define MLX5_CAP_FLOWTABLE_MAX(mdev, cap) \
+ MLX5_GET(flow_table_nic_cap, mdev->hca_caps_max[MLX5_CAP_FLOW_TABLE], cap)
+
+#define MLX5_CAP_ODP(mdev, cap)\
+ MLX5_GET(odp_cap, mdev->hca_caps_cur[MLX5_CAP_ODP], cap)
+
+#define MLX5_CAP_ODP_MAX(mdev, cap)\
+ MLX5_GET(odp_cap, mdev->hca_caps_max[MLX5_CAP_ODP], cap)
+
+#define MLX5_CAP_ODP_PT(mdev, trans, odp_pt_cap)\
+ MLX5_GET(odp_per_transport_service_cap,\
+ MLX5_ADDR_OF(odp_cap, mdev->hca_caps_cur[MLX5_CAP_ODP],\
+ trans),\
+ odp_pt_cap)
+
+#define MLX5_CAP_ODP_PT_MAX(mdev, trans, odp_pt_cap)\
+ MLX5_GET(odp_per_transport_service_cap,\
+ MLX5_ADDR_OF(odp_cap, mdev->hca_caps_max[MLX5_CAP_ODP],\
+ trans),\
+ odp_pt_cap)
+
+#endif /* MLX5_DEVICE_H */
+
+#include "post_kmod.h"
diff --git a/drivers/net/mlnx_uio/mlnx/include/mlx5/doorbell.h b/drivers/net/mlnx_uio/mlnx/include/mlx5/doorbell.h
new file mode 100644
index 0000000..dacc852
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/include/mlx5/doorbell.h
@@ -0,0 +1,85 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef MLX5_DOORBELL_H
+#define MLX5_DOORBELL_H
+
+#define MLX5_BF_OFFSET 0x800
+#define MLX5_CQ_DOORBELL 0x20
+
+#if BITS_PER_LONG == 64
+/* Assume that we can just write a 64-bit doorbell atomically. s390
+ * actually doesn't have writeq() but S/390 systems don't even have
+ * PCI so we won't worry about it.
+ */
+
+#define MLX5_DECLARE_DOORBELL_LOCK(name)
+#define MLX5_INIT_DOORBELL_LOCK(ptr) do { } while (0)
+#define MLX5_GET_DOORBELL_LOCK(ptr) (NULL)
+
+static inline void mlx5_write64(__be32 val[2], void __iomem *dest,
+ spinlock_t *doorbell_lock)
+{
+ __raw_writeq(*(u64 *)val, dest);
+}
+
+#else
+
+/* Just fall back to a spinlock to protect the doorbell if
+ * BITS_PER_LONG is 32 -- there's no portable way to do atomic 64-bit
+ * MMIO writes.
+ */
+
+#define MLX5_DECLARE_DOORBELL_LOCK(name) spinlock_t name;
+#define MLX5_INIT_DOORBELL_LOCK(ptr) spin_lock_init(ptr)
+#define MLX5_GET_DOORBELL_LOCK(ptr) (ptr)
+
+static inline void mlx5_write64(__be32 val[2], void __iomem *dest,
+ spinlock_t *doorbell_lock)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(doorbell_lock, flags);
+ __raw_writel((__force u32) val[0], dest);
+ __raw_writel((__force u32) val[1], dest + 4);
+ spin_unlock_irqrestore(doorbell_lock, flags);
+}
+
+#endif
+
+#endif /* MLX5_DOORBELL_H */
+
+#include "post_kmod.h"
diff --git a/drivers/net/mlnx_uio/mlnx/include/mlx5/driver.h b/drivers/net/mlnx_uio/mlnx/include/mlx5/driver.h
new file mode 100644
index 0000000..0b4c58a
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/include/mlx5/driver.h
@@ -0,0 +1,1063 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef MLX5_DRIVER_H
+#define MLX5_DRIVER_H
+
+
+
+enum mlx5_res_type {
+ MLX5_RES_QP,
+ MLX5_RES_SRQ,
+ MLX5_RES_XSRQ,
+ MLX5_RES_DCT,
+};
+
+enum {
+ MLX5_BOARD_ID_LEN = 64,
+ MLX5_MAX_NAME_LEN = 16,
+};
+
+enum {
+ /* two minutes for the sake of bringup. Generally, commands must always
+ * complete and we may need to increase this timeout value
+ */
+ MLX5_CMD_TIMEOUT_MSEC = 2 * 60 * 1000,
+ MLX5_CMD_WQ_MAX_NAME = 32,
+};
+
+enum {
+ CMD_OWNER_SW = 0x0,
+ CMD_OWNER_HW = 0x1,
+ CMD_STATUS_SUCCESS = 0,
+};
+
+enum mlx5_sqp_t {
+ MLX5_SQP_SMI = 0,
+ MLX5_SQP_GSI = 1,
+ MLX5_SQP_IEEE_1588 = 2,
+ MLX5_SQP_SNIFFER = 3,
+ MLX5_SQP_SYNC_UMR = 4,
+};
+
+enum {
+ MLX5_MAX_PORTS = 2,
+};
+
+enum {
+ MLX5_EQ_VEC_PAGES = 0,
+ MLX5_EQ_VEC_CMD = 1,
+ MLX5_EQ_VEC_ASYNC = 2,
+ MLX5_EQ_VEC_COMP_BASE,
+};
+
+enum {
+ MLX5_MAX_IRQ_NAME = 32
+};
+
+enum {
+ MLX5_ATOMIC_MODE_OFF = 16,
+ MLX5_ATOMIC_MODE_NONE = 0 << MLX5_ATOMIC_MODE_OFF,
+ MLX5_ATOMIC_MODE_IB_COMP = 1 << MLX5_ATOMIC_MODE_OFF,
+ MLX5_ATOMIC_MODE_CX = 2 << MLX5_ATOMIC_MODE_OFF,
+ MLX5_ATOMIC_MODE_8B = 3 << MLX5_ATOMIC_MODE_OFF,
+ MLX5_ATOMIC_MODE_16B = 4 << MLX5_ATOMIC_MODE_OFF,
+ MLX5_ATOMIC_MODE_32B = 5 << MLX5_ATOMIC_MODE_OFF,
+ MLX5_ATOMIC_MODE_64B = 6 << MLX5_ATOMIC_MODE_OFF,
+ MLX5_ATOMIC_MODE_128B = 7 << MLX5_ATOMIC_MODE_OFF,
+ MLX5_ATOMIC_MODE_256B = 8 << MLX5_ATOMIC_MODE_OFF,
+};
+
+enum {
+ MLX5_ATOMIC_MODE_DCT_OFF = 20,
+ MLX5_ATOMIC_MODE_DCT_NONE = 0 << MLX5_ATOMIC_MODE_DCT_OFF,
+ MLX5_ATOMIC_MODE_DCT_IB_COMP = 1 << MLX5_ATOMIC_MODE_DCT_OFF,
+ MLX5_ATOMIC_MODE_DCT_CX = 2 << MLX5_ATOMIC_MODE_DCT_OFF,
+ MLX5_ATOMIC_MODE_DCT_8B = 3 << MLX5_ATOMIC_MODE_DCT_OFF,
+ MLX5_ATOMIC_MODE_DCT_16B = 4 << MLX5_ATOMIC_MODE_DCT_OFF,
+ MLX5_ATOMIC_MODE_DCT_32B = 5 << MLX5_ATOMIC_MODE_DCT_OFF,
+ MLX5_ATOMIC_MODE_DCT_64B = 6 << MLX5_ATOMIC_MODE_DCT_OFF,
+ MLX5_ATOMIC_MODE_DCT_128B = 7 << MLX5_ATOMIC_MODE_DCT_OFF,
+ MLX5_ATOMIC_MODE_DCT_256B = 8 << MLX5_ATOMIC_MODE_DCT_OFF,
+};
+
+enum {
+ MLX5_ATOMIC_OPS_CMP_SWAP = 1 << 0,
+ MLX5_ATOMIC_OPS_FETCH_ADD = 1 << 1,
+ MLX5_ATOMIC_OPS_MASKED_CMP_SWAP = 1 << 2,
+ MLX5_ATOMIC_OPS_MASKED_FETCH_ADD = 1 << 3,
+};
+
+enum {
+ MLX5_REG_PCAP = 0x5001,
+ MLX5_REG_PMTU = 0x5003,
+ MLX5_REG_PTYS = 0x5004,
+ MLX5_REG_PAOS = 0x5006,
+ MLX5_REG_PPCNT = 0x5008,
+ MLX5_REG_PMAOS = 0x5012,
+ MLX5_REG_PUDE = 0x5009,
+ MLX5_REG_PMPE = 0x5010,
+ MLX5_REG_PELC = 0x500e,
+ MLX5_REG_PVLC = 0x500f,
+ MLX5_REG_PMLP = 0, /* TBD */
+ MLX5_REG_NODE_DESC = 0x6001,
+ MLX5_REG_HOST_ENDIANNESS = 0x7004,
+};
+
+enum mlx5_page_fault_resume_flags {
+ MLX5_PAGE_FAULT_RESUME_REQUESTOR = 1 << 0,
+ MLX5_PAGE_FAULT_RESUME_WRITE = 1 << 1,
+ MLX5_PAGE_FAULT_RESUME_RDMA = 1 << 2,
+ MLX5_PAGE_FAULT_RESUME_ERROR = 1 << 7,
+};
+
+enum dbg_rsc_type {
+ MLX5_DBG_RSC_QP,
+ MLX5_DBG_RSC_EQ,
+ MLX5_DBG_RSC_CQ,
+ MLX5_DBG_RSC_DCT,
+};
+
+struct mlx5_field_desc {
+ struct dentry *dent;
+ int i;
+};
+
+struct mlx5_rsc_debug {
+ struct mlx5_core_dev *dev;
+ void *object;
+ enum dbg_rsc_type type;
+ struct dentry *root;
+ struct mlx5_field_desc fields[0];
+};
+
+enum mlx5_dev_event {
+ MLX5_DEV_EVENT_SYS_ERROR,
+ MLX5_DEV_EVENT_PORT_UP,
+ MLX5_DEV_EVENT_PORT_DOWN,
+ MLX5_DEV_EVENT_PORT_INITIALIZED,
+ MLX5_DEV_EVENT_LID_CHANGE,
+ MLX5_DEV_EVENT_PKEY_CHANGE,
+ MLX5_DEV_EVENT_GUID_CHANGE,
+ MLX5_DEV_EVENT_CLIENT_REREG,
+};
+
+enum mlx5_port_status {
+ MLX5_PORT_UP = 1 << 1,
+ MLX5_PORT_DOWN = 1 << 2,
+};
+
+struct mlx5_uuar_info {
+ struct mlx5_uar *uars;
+ int num_uars;
+ int num_low_latency_uuars;
+ unsigned long *bitmap;
+ unsigned int *count;
+ struct mlx5_bf *bfs;
+
+ /*
+ * protect uuar allocation data structs
+ */
+ struct mutex lock;
+ u32 ver;
+};
+
+struct mlx5_bf {
+ void __iomem *reg;
+ void __iomem *regreg;
+ int buf_size;
+ struct mlx5_uar *uar;
+ unsigned long offset;
+ int need_lock;
+ /* protect blue flame buffer selection when needed
+ */
+ spinlock_t lock;
+
+ /* serialize 64 bit writes when done as two 32 bit accesses
+ */
+ spinlock_t lock32;
+ int uuarn;
+};
+
+struct mlx5_cmd_first {
+ __be32 data[4];
+};
+
+struct mlx5_cmd_msg {
+ struct list_head list;
+ struct mlx5_cmd_cache_head *ch;
+ u32 len;
+ struct mlx5_cmd_first first;
+ struct mlx5_cmd_mailbox *next;
+};
+
+struct mlx5_cmd_debug {
+ struct dentry *dbg_root;
+ struct dentry *dbg_in;
+ struct dentry *dbg_out;
+ struct dentry *dbg_outlen;
+ struct dentry *dbg_status;
+ struct dentry *dbg_run;
+ void *in_msg;
+ void *out_msg;
+ u8 status;
+ u16 inlen;
+ u16 outlen;
+};
+
+struct mlx5_cmd_cache_head {
+ /* protect block chain allocations
+ */
+ spinlock_t lock;
+ struct list_head head;
+ struct kobject kobj;
+ unsigned max_inbox_size;
+ unsigned num_ent;
+ unsigned miss;
+ unsigned total_commands;
+ unsigned free;
+ struct mlx5_core_dev *dev;
+};
+
+enum {
+ MLX5_NUM_COMMAND_CACHES = 5,
+};
+
+struct cmd_msg_cache {
+ struct kobject *ko;
+ struct mlx5_cmd_cache_head ch[MLX5_NUM_COMMAND_CACHES];
+ atomic_t real_miss;
+};
+
+struct mlx5_cmd_stats {
+ u64 sum;
+ u64 n;
+ struct dentry *root;
+ struct dentry *avg;
+ struct dentry *count;
+ /* protect command average calculations */
+ spinlock_t lock;
+};
+
+struct mlx5_cmd {
+ void *cmd_alloc_buf;
+ dma_addr_t alloc_dma;
+ int alloc_size;
+ void *cmd_buf;
+ dma_addr_t dma;
+ u16 cmdif_rev;
+ u8 log_sz;
+ u8 log_stride;
+ int max_reg_cmds;
+ int events;
+ u32 __iomem *vector;
+
+ /* protect command queue allocations
+ */
+ spinlock_t alloc_lock;
+
+ /* protect token allocations
+ */
+ spinlock_t token_lock;
+ u8 token;
+ unsigned long bitmask;
+ char wq_name[MLX5_CMD_WQ_MAX_NAME];
+ struct workqueue_struct *wq;
+ struct semaphore sem;
+ struct semaphore pages_sem;
+ int mode;
+ struct mlx5_cmd_work_ent *ent_arr[MLX5_MAX_COMMANDS];
+ struct pci_pool *pool;
+ struct mlx5_cmd_debug dbg;
+ struct cmd_msg_cache cache;
+ int checksum_disabled;
+ struct mlx5_cmd_stats stats[MLX5_CMD_OP_MAX];
+};
+
+struct mlx5_port_caps {
+ int gid_table_len;
+ int pkey_table_len;
+ u8 ext_port_cap;
+};
+
+struct mlx5_cmd_mailbox {
+ void *buf;
+ dma_addr_t dma;
+ struct mlx5_cmd_mailbox *next;
+};
+
+struct mlx5_buf_list {
+ void *buf;
+ dma_addr_t map;
+};
+
+struct mlx5_buf {
+ struct mlx5_buf_list direct;
+ struct mlx5_buf_list *page_list;
+ int nbufs;
+ int npages;
+ int size;
+ u8 page_shift;
+};
+
+struct mlx5_eq {
+ struct mlx5_core_dev *dev;
+ __be32 __iomem *doorbell;
+ u32 cons_index;
+ struct mlx5_buf buf;
+ int size;
+ u8 irqn;
+ u8 eqn;
+ int nent;
+ struct list_head list;
+ int index;
+ struct mlx5_rsc_debug *dbg;
+ cpumask_var_t affinity_mask;
+};
+
+struct mlx5_core_psv {
+ u32 psv_idx;
+ struct psv_layout {
+ u32 pd;
+ u16 syndrome;
+ u16 reserved;
+ u16 bg;
+ u16 app_tag;
+ u32 ref_tag;
+ } psv;
+};
+
+struct mlx5_core_sig_ctx {
+ struct mlx5_core_psv psv_memory;
+ struct mlx5_core_psv psv_wire;
+ struct ib_sig_err err_item;
+ bool sig_status_checked;
+ bool sig_err_exists;
+ u32 sigerr_count;
+};
+
+struct mlx5_core_mr {
+ u64 iova;
+ u64 size;
+ u32 key;
+ u32 pd;
+};
+
+struct mlx5_core_rsc_common {
+ enum mlx5_res_type res;
+ atomic_t refcount;
+ struct completion free;
+};
+
+struct mlx5_core_srq {
+ struct mlx5_core_rsc_common common; /* must be first */
+ u32 srqn;
+ int max;
+ int max_gs;
+ int max_avail_gather;
+ int wqe_shift;
+ void (*event)(struct mlx5_core_srq *, enum mlx5_event);
+
+ atomic_t refcount;
+ struct completion free;
+};
+
+struct mlx5_eq_table {
+ void __iomem *update_ci;
+ void __iomem *update_arm_ci;
+ struct list_head comp_eqs_list;
+ struct mlx5_eq pages_eq;
+ struct mlx5_eq async_eq;
+ struct mlx5_eq cmd_eq;
+ cpumask_var_t *irq_masks;
+ int num_comp_vectors;
+ /* protect EQs list
+ */
+ spinlock_t lock;
+};
+
+struct mlx5_uar {
+ u32 index;
+ struct list_head bf_list;
+ unsigned free_bf_bmap;
+ void __iomem *bf_map;
+ void __iomem *map;
+};
+
+
+struct mlx5_core_health {
+ struct health_buffer __iomem *health;
+ __be32 __iomem *health_counter;
+ struct timer_list timer;
+ struct list_head list;
+ u32 prev;
+ int miss_counter;
+};
+
+struct mlx5_cq_table {
+ /* protect radix tree
+ */
+ spinlock_t lock;
+ struct radix_tree_root tree;
+};
+
+struct mlx5_qp_table {
+ /* protect radix tree
+ */
+ spinlock_t lock;
+ struct radix_tree_root tree;
+};
+
+struct mlx5_srq_table {
+ /* protect radix tree
+ */
+ spinlock_t lock;
+ struct radix_tree_root tree;
+};
+
+struct mlx5_mr_table {
+ /* protect radix tree
+ */
+ spinlock_t lock;
+ struct radix_tree_root tree;
+};
+
+struct mlx5_vf_context {
+ u32 state_mask;
+ int enabled;
+};
+
+struct mlx5_sriov_vf {
+ struct mlx5_core_dev *dev;
+ struct kobject kobj;
+ int vf;
+};
+
+struct mlx5_core_sriov {
+ struct mlx5_vf_context *vfs_ctx;
+ struct kobject *config;
+ struct kobject node_guid_kobj;
+ struct mlx5_sriov_vf *vfs;
+ int num_vfs;
+ int vf_partial_init;
+};
+
+struct mlx5_irq_info {
+ cpumask_var_t mask;
+ char name[MLX5_MAX_IRQ_NAME];
+};
+
+struct mlx5_priv {
+ char name[MLX5_MAX_NAME_LEN];
+ struct mlx5_eq_table eq_table;
+ struct msix_entry *msix_arr;
+ struct mlx5_irq_info *irq_info;
+ struct mlx5_uuar_info uuari;
+ MLX5_DECLARE_DOORBELL_LOCK(cq_uar_lock);
+
+ struct io_mapping *bf_mapping;
+
+ /* pages stuff */
+ struct workqueue_struct *pg_wq;
+ struct rb_root page_root;
+ int fw_pages;
+ atomic_t reg_pages;
+ struct list_head free_list;
+
+ struct mlx5_core_health health;
+
+ struct mlx5_srq_table srq_table;
+
+ /* start: qp stuff */
+ struct mlx5_qp_table qp_table;
+ struct dentry *qp_debugfs;
+ struct dentry *eq_debugfs;
+ struct dentry *cq_debugfs;
+ struct dentry *cmdif_debugfs;
+ /* end: qp stuff */
+
+ /* start: dct stuff */
+ struct dentry *dct_debugfs;
+ /* end: dct stuff */
+
+ /* start: cq stuff */
+ struct mlx5_cq_table cq_table;
+ /* end: cq stuff */
+
+ /* start: mr stuff */
+ struct mlx5_mr_table mr_table;
+ /* end: mr stuff */
+
+ /* start: alloc stuff */
+ /* protect buffer allocation according to NUMA node */
+ struct mutex alloc_mutex;
+ int numa_node;
+
+ struct mutex pgdir_mutex;
+ struct list_head pgdir_list;
+ /* end: alloc stuff */
+ struct dentry *dbg_root;
+
+ /* protect mkey key part */
+ spinlock_t mkey_lock;
+ u8 mkey_key;
+
+ struct list_head dev_list;
+ struct list_head ctx_list;
+ spinlock_t ctx_lock;
+
+ struct mlx5_core_sriov sriov;
+ unsigned long pci_dev_data;
+};
+
+enum mlx5_device_state {
+ MLX5_DEVICE_STATE_UP,
+ MLX5_DEVICE_STATE_INTERNAL_ERROR,
+};
+
+enum mlx5_interface_state {
+ MLX5_INTERFACE_STATE_DOWN,
+ MLX5_INTERFACE_STATE_UP,
+};
+
+enum mlx5_pci_status {
+ MLX5_PCI_STATUS_DISABLED,
+ MLX5_PCI_STATUS_ENABLED,
+};
+
+struct mlx5_special_contexts {
+ int resd_lkey;
+};
+
+struct mlx5_core_dev {
+ struct pci_dev *pdev;
+ struct mutex pci_status_mutex;
+ enum mlx5_pci_status pci_status;
+ u8 rev_id;
+ char board_id[MLX5_BOARD_ID_LEN];
+ struct mlx5_cmd cmd;
+ struct mlx5_port_caps port_caps[MLX5_MAX_PORTS];
+ u32 hca_caps_cur[MLX5_CAP_NUM][MLX5_UN_SZ_DW(hca_cap_union)];
+ u32 hca_caps_max[MLX5_CAP_NUM][MLX5_UN_SZ_DW(hca_cap_union)];
+ phys_addr_t iseg_base;
+ struct mlx5_init_seg __iomem *iseg;
+ enum mlx5_device_state state;
+ struct mutex intf_state_mutex;
+ enum mlx5_interface_state interface_state;
+ void (*event) (struct mlx5_core_dev *dev,
+ enum mlx5_dev_event event,
+ unsigned long param);
+ struct mlx5_priv priv;
+ struct mlx5_profile *profile;
+ atomic_t num_qps;
+ u64 aysnc_events_mask;
+ u32 supported_issi_mask;
+ u32 issi;
+ struct mlx5_special_contexts special_contexts;
+};
+
+struct mlx5_db {
+ __be32 *db;
+ union {
+ struct mlx5_db_pgdir *pgdir;
+ struct mlx5_ib_user_db_page *user_page;
+ } u;
+ dma_addr_t dma;
+ int index;
+};
+
+enum {
+ MLX5_DB_PER_PAGE = PAGE_SIZE / L1_CACHE_BYTES,
+};
+
+struct mlx5_core_dct {
+ struct mlx5_core_rsc_common common; /* must be first */
+ void (*event)(struct mlx5_core_dct *, enum mlx5_event);
+ int dctn;
+ struct completion drained;
+ struct mlx5_rsc_debug *dbg;
+ int pid;
+};
+
+enum {
+ MLX5_COMP_EQ_SIZE = 1024,
+};
+
+enum {
+ MLX5_PTYS_IB = 1 << 0,
+ MLX5_PTYS_EN = 1 << 2,
+};
+
+struct mlx5_db_pgdir {
+ struct list_head list;
+ DECLARE_BITMAP(bitmap, MLX5_DB_PER_PAGE);
+ __be32 *db_page;
+ dma_addr_t db_dma;
+};
+
+typedef void (*mlx5_cmd_cbk_t)(int status, void *context);
+
+struct mlx5_cmd_work_ent {
+ struct mlx5_cmd_msg *in;
+ struct mlx5_cmd_msg *out;
+ void *uout;
+ int uout_size;
+ mlx5_cmd_cbk_t callback;
+ void *context;
+ int idx;
+ struct completion done;
+ struct mlx5_cmd *cmd;
+ struct work_struct work;
+ struct mlx5_cmd_layout *lay;
+ int ret;
+ int page_queue;
+ u8 status;
+ u8 token;
+#ifdef HAVE_KTIME_GET_NS
+ u64 ts1;
+ u64 ts2;
+#else
+ struct timespec ts1;
+ struct timespec ts2;
+#endif
+ u16 op;
+};
+
+struct mlx5_pas {
+ u64 pa;
+ u8 log_sz;
+};
+
+enum port_state_policy {
+ MLX5_AAA_000
+};
+
+enum phy_port_state {
+ MLX5_AAA_111
+};
+
+struct mlx5_hca_vport_context {
+ u32 field_select;
+ bool sm_virt_aware;
+ bool has_smi;
+ bool has_raw;
+ enum port_state_policy policy;
+ enum phy_port_state phys_state;
+ enum ib_port_state vport_state;
+ u8 port_physical_state;
+ u64 sys_image_guid;
+ u64 port_guid;
+ u64 node_guid;
+ u32 cap_mask1;
+ u32 cap_mask1_perm;
+ u32 cap_mask2;
+ u32 cap_mask2_perm;
+ u16 lid;
+ u8 init_type_reply; /* bitmask: see ib spec 14.2.5.6 InitTypeReply */
+ u8 lmc;
+ u8 subnet_timeout;
+ u16 sm_lid;
+ u8 sm_sl;
+ u16 qkey_violation_counter;
+ u16 pkey_violation_counter;
+ bool grh_required;
+};
+
+struct mlx5_net_counters {
+ u64 packets;
+ u64 octets;
+};
+
+struct mlx5_ptys_reg {
+ u8 local_port;
+ u8 proto_mask;
+ u32 eth_proto_cap;
+ u16 ib_link_width_cap;
+ u16 ib_proto_cap;
+ u32 eth_proto_admin;
+ u16 ib_link_width_admin;
+ u16 ib_proto_admin;
+ u32 eth_proto_oper;
+ u16 ib_link_width_oper;
+ u16 ib_proto_oper;
+ u32 eth_proto_lp_advertise;
+};
+
+struct mlx5_pvlc_reg {
+ u8 local_port;
+ u8 vl_hw_cap;
+ u8 vl_admin;
+ u8 vl_operational;
+};
+
+struct mlx5_pmtu_reg {
+ u8 local_port;
+ u16 max_mtu;
+ u16 admin_mtu;
+ u16 oper_mtu;
+};
+
+struct mlx5_vport_counters {
+ struct mlx5_net_counters received_errors;
+ struct mlx5_net_counters transmit_errors;
+ struct mlx5_net_counters received_ib_unicast;
+ struct mlx5_net_counters transmitted_ib_unicast;
+ struct mlx5_net_counters received_ib_multicast;
+ struct mlx5_net_counters transmitted_ib_multicast;
+ struct mlx5_net_counters received_eth_broadcast;
+ struct mlx5_net_counters transmitted_eth_broadcast;
+};
+
+static inline void *mlx5_buf_offset(struct mlx5_buf *buf, int offset)
+{
+ if (likely(BITS_PER_LONG == 64 || buf->nbufs == 1))
+ return buf->direct.buf + offset;
+ else
+ return buf->page_list[offset >> PAGE_SHIFT].buf +
+ (offset & (PAGE_SIZE - 1));
+}
+
+extern struct workqueue_struct *mlx5_core_wq;
+
+#define STRUCT_FIELD(header, field) \
+ .struct_offset_bytes = offsetof(struct ib_unpacked_ ## header, field), \
+ .struct_size_bytes = sizeof((struct ib_unpacked_ ## header *)0)->field
+
+static inline struct mlx5_core_dev *pci2mlx5_core_dev(struct pci_dev *pdev)
+{
+ return pci_get_drvdata(pdev);
+}
+
+extern struct dentry *mlx5_debugfs_root;
+
+static inline u16 fw_rev_maj(struct mlx5_core_dev *dev)
+{
+ return ioread32be(&dev->iseg->fw_rev) & 0xffff;
+}
+
+static inline u16 fw_rev_min(struct mlx5_core_dev *dev)
+{
+ return ioread32be(&dev->iseg->fw_rev) >> 16;
+}
+
+static inline u16 fw_rev_sub(struct mlx5_core_dev *dev)
+{
+ return ioread32be(&dev->iseg->cmdif_rev_fw_sub) & 0xffff;
+}
+
+static inline u16 cmdif_rev(struct mlx5_core_dev *dev)
+{
+ return ioread32be(&dev->iseg->cmdif_rev_fw_sub) >> 16;
+}
+
+static inline void *mlx5_vzalloc(unsigned long size)
+{
+ void *rtn;
+
+ rtn = kzalloc(size, GFP_KERNEL | __GFP_NOWARN);
+ if (!rtn)
+ rtn = vzalloc(size);
+ return rtn;
+}
+
+static inline void *mlx5_vmalloc(unsigned long size)
+{
+ void *rtn;
+
+ rtn = kmalloc(size, GFP_KERNEL | __GFP_NOWARN);
+ if (!rtn)
+ rtn = vmalloc(size);
+ return rtn;
+}
+
+int mlx5_cmd_init(struct mlx5_core_dev *dev);
+void mlx5_cmd_cleanup(struct mlx5_core_dev *dev);
+void mlx5_cmd_use_events(struct mlx5_core_dev *dev);
+void mlx5_cmd_use_polling(struct mlx5_core_dev *dev);
+int mlx5_cmd_status_to_err(struct mlx5_outbox_hdr *hdr);
+int mlx5_cmd_status_to_err_v2(void *ptr);
+int mlx5_core_query_special_contexts(struct mlx5_core_dev *dev);
+int mlx5_core_get_caps(struct mlx5_core_dev *dev, enum mlx5_cap_type cap_type,
+ enum mlx5_cap_mode cap_mode);
+int mlx5_cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
+ int out_size);
+int mlx5_cmd_exec_cb(struct mlx5_core_dev *dev, void *in, int in_size,
+ void *out, int out_size, mlx5_cmd_cbk_t callback,
+ void *context);
+int mlx5_cmd_alloc_uar(struct mlx5_core_dev *dev, u32 *uarn);
+int mlx5_cmd_free_uar(struct mlx5_core_dev *dev, u32 uarn);
+int mlx5_alloc_uuars(struct mlx5_core_dev *dev, struct mlx5_uuar_info *uuari);
+int mlx5_free_uuars(struct mlx5_core_dev *dev, struct mlx5_uuar_info *uuari);
+int mlx5_alloc_map_uar(struct mlx5_core_dev *mdev, struct mlx5_uar *uar);
+void mlx5_unmap_free_uar(struct mlx5_core_dev *mdev, struct mlx5_uar *uar);
+void mlx5_health_cleanup(void);
+void __init mlx5_health_init(void);
+void mlx5_start_health_poll(struct mlx5_core_dev *dev);
+void mlx5_stop_health_poll(struct mlx5_core_dev *dev);
+int mlx5_buf_alloc_node(struct mlx5_core_dev *dev, int size, int max_direct,
+ struct mlx5_buf *buf, int node);
+int mlx5_buf_alloc(struct mlx5_core_dev *dev, int size, int max_direct,
+ struct mlx5_buf *buf);
+void mlx5_buf_free(struct mlx5_core_dev *dev, struct mlx5_buf *buf);
+struct mlx5_cmd_mailbox *mlx5_alloc_cmd_mailbox_chain(struct mlx5_core_dev *dev,
+ gfp_t flags, int npages);
+void mlx5_free_cmd_mailbox_chain(struct mlx5_core_dev *dev,
+ struct mlx5_cmd_mailbox *head);
+int mlx5_core_create_srq(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
+ struct mlx5_create_srq_mbox_in *in, int inlen,
+ int is_xrc);
+int mlx5_core_destroy_srq(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq);
+int mlx5_core_query_srq(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
+ struct mlx5_query_srq_mbox_out *out);
+int mlx5_core_arm_srq(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
+ u16 lwm, int is_srq);
+void mlx5_init_mr_table(struct mlx5_core_dev *dev);
+void mlx5_cleanup_mr_table(struct mlx5_core_dev *dev);
+int mlx5_core_create_mkey(struct mlx5_core_dev *dev, struct mlx5_core_mr *mr,
+ struct mlx5_create_mkey_mbox_in *in, int inlen,
+ mlx5_cmd_cbk_t callback, void *context,
+ struct mlx5_create_mkey_mbox_out *out);
+int mlx5_core_destroy_mkey(struct mlx5_core_dev *dev, struct mlx5_core_mr *mr);
+int mlx5_core_query_mkey(struct mlx5_core_dev *dev, struct mlx5_core_mr *mr,
+ struct mlx5_query_mkey_mbox_out *out, int outlen);
+int mlx5_core_dump_fill_mkey(struct mlx5_core_dev *dev, struct mlx5_core_mr *mr,
+ u32 *mkey);
+int mlx5_core_alloc_pd(struct mlx5_core_dev *dev, u32 *pdn);
+int mlx5_core_dealloc_pd(struct mlx5_core_dev *dev, u32 pdn);
+int mlx5_core_mad_ifc(struct mlx5_core_dev *dev, void *inb, void *outb,
+ u16 opmod, u8 port);
+void mlx5_pagealloc_init(struct mlx5_core_dev *dev);
+void mlx5_pagealloc_cleanup(struct mlx5_core_dev *dev);
+int mlx5_pagealloc_start(struct mlx5_core_dev *dev);
+void mlx5_pagealloc_stop(struct mlx5_core_dev *dev);
+void mlx5_core_req_pages_handler(struct mlx5_core_dev *dev, u16 func_id,
+ s32 npages);
+int mlx5_update_guids(struct mlx5_core_dev *dev);
+int mlx5_satisfy_startup_pages(struct mlx5_core_dev *dev, int boot);
+int mlx5_reclaim_startup_pages(struct mlx5_core_dev *dev);
+void mlx5_register_debugfs(void);
+void mlx5_unregister_debugfs(void);
+int mlx5_eq_init(struct mlx5_core_dev *dev);
+void mlx5_eq_cleanup(struct mlx5_core_dev *dev);
+void mlx5_fill_page_array(struct mlx5_buf *buf, __be64 *pas);
+void mlx5_cq_completion(struct mlx5_core_dev *dev, u32 cqn);
+int mlx5_rsc_event(struct mlx5_core_dev *dev, u32 rsn, int event_type);
+#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
+void mlx5_eq_pagefault(struct mlx5_core_dev *dev, struct mlx5_eqe *eqe);
+#endif
+void mlx5_srq_event(struct mlx5_core_dev *dev, u32 srqn, int event_type);
+struct mlx5_core_srq *mlx5_core_get_srq(struct mlx5_core_dev *dev, u32 srqn);
+void mlx5_cmd_comp_handler(struct mlx5_core_dev *dev, unsigned long vector);
+void mlx5_cq_event(struct mlx5_core_dev *dev, u32 cqn, int event_type);
+int mlx5_create_map_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq, u8 vecidx,
+ int nent, u64 mask, const char *name, struct mlx5_uar *uar);
+int mlx5_destroy_unmap_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq);
+int mlx5_start_eqs(struct mlx5_core_dev *dev);
+int mlx5_stop_eqs(struct mlx5_core_dev *dev);
+int mlx5_vector2eqn(struct mlx5_core_dev *dev, int vector, int *eqn, int *irqn);
+int mlx5_core_attach_mcg(struct mlx5_core_dev *dev, union ib_gid *mgid, u32 qpn);
+int mlx5_core_detach_mcg(struct mlx5_core_dev *dev, union ib_gid *mgid, u32 qpn);
+
+int mlx5_qp_debugfs_init(struct mlx5_core_dev *dev);
+void mlx5_qp_debugfs_cleanup(struct mlx5_core_dev *dev);
+int mlx5_dct_debugfs_init(struct mlx5_core_dev *dev);
+void mlx5_dct_debugfs_cleanup(struct mlx5_core_dev *dev);
+int mlx5_core_access_reg(struct mlx5_core_dev *dev, void *data_in,
+ int size_in, void *data_out, int size_out,
+ u16 reg_num, int arg, int write);
+int mlx5_set_port_caps(struct mlx5_core_dev *dev, u8 port_num,
+ u32 set, u32 clear, u32 cur);
+int mlx5_query_port_ptys(struct mlx5_core_dev *dev, u32 *ptys,
+ int ptys_size, int proto_mask);
+int mlx5_query_port_proto_cap(struct mlx5_core_dev *dev,
+ u32 *proto_cap, int proto_mask);
+int mlx5_core_access_ptys(struct mlx5_core_dev *dev, struct mlx5_ptys_reg *ptys, int write);
+int mlx5_core_access_pvlc(struct mlx5_core_dev *dev, struct mlx5_pvlc_reg *pvlc, int write);
+int mlx5_core_access_pmtu(struct mlx5_core_dev *dev, struct mlx5_pmtu_reg *pmtu, int write);
+int mlx5_query_port_proto_admin(struct mlx5_core_dev *dev,
+ u32 *proto_admin, int proto_mask);
+int mlx5_set_port_proto(struct mlx5_core_dev *dev, u32 proto_admin,
+ int proto_mask);
+int mlx5_set_port_status(struct mlx5_core_dev *dev,
+ enum mlx5_port_status status);
+int mlx5_query_port_status(struct mlx5_core_dev *dev, u8 *status);
+int mlx5_set_port_mtu(struct mlx5_core_dev *dev, int mtu);
+void mlx5_query_port_max_mtu(struct mlx5_core_dev *dev, int *max_mtu);
+void mlx5_query_port_oper_mtu(struct mlx5_core_dev *dev, int *oper_mtu);
+
+int mlx5_debug_eq_add(struct mlx5_core_dev *dev, struct mlx5_eq *eq);
+void mlx5_debug_eq_remove(struct mlx5_core_dev *dev, struct mlx5_eq *eq);
+int mlx5_core_eq_query(struct mlx5_core_dev *dev, struct mlx5_eq *eq,
+ struct mlx5_query_eq_mbox_out *out, int outlen);
+int mlx5_eq_debugfs_init(struct mlx5_core_dev *dev);
+void mlx5_eq_debugfs_cleanup(struct mlx5_core_dev *dev);
+int mlx5_cq_debugfs_init(struct mlx5_core_dev *dev);
+void mlx5_cq_debugfs_cleanup(struct mlx5_core_dev *dev);
+int mlx5_db_alloc(struct mlx5_core_dev *dev, struct mlx5_db *db);
+int mlx5_db_alloc_node(struct mlx5_core_dev *dev, struct mlx5_db *db,
+ int node);
+void mlx5_db_free(struct mlx5_core_dev *dev, struct mlx5_db *db);
+
+const char *mlx5_command_str(int command);
+int mlx5_cmdif_debugfs_init(struct mlx5_core_dev *dev);
+void mlx5_cmdif_debugfs_cleanup(struct mlx5_core_dev *dev);
+int mlx5_core_create_psv(struct mlx5_core_dev *dev, u32 pdn,
+ int npsvs, u32 *sig_index);
+int mlx5_core_destroy_psv(struct mlx5_core_dev *dev, int psv_num);
+void mlx5_core_put_rsc(struct mlx5_core_rsc_common *common);
+int mlx5_query_odp_caps(struct mlx5_core_dev *dev,
+ struct mlx5_odp_caps *odp_caps);
+int mlx5_core_modify_hca_vport_context(struct mlx5_core_dev *dev,
+ u8 other_vport, u8 port_num,
+ u16 vf_num,
+ struct mlx5_hca_vport_context *req);
+int mlx5_core_check_enable_vf_hca(struct mlx5_core_dev *dev, u32 field_select, u8 vport_num);
+int mlx5_core_query_hca_vport_context(struct mlx5_core_dev *dev,
+ u8 other_vport, u8 port_num,
+ u16 vf_num,
+ struct mlx5_hca_vport_context *rep);
+int mlx5_core_query_gids(struct mlx5_core_dev *dev, u8 other_vport,
+ u8 port_num, u16 vf_num, u16 gid_index,
+ union ib_gid *gid);
+int mlx5_core_query_pkeys(struct mlx5_core_dev *dev, u8 other_vport,
+ u8 port_num, u16 vf_num, u16 pkey_index,
+ u16 *pkey);
+int mlx5_core_query_vport_counter(struct mlx5_core_dev *dev, u8 other_vport,
+ u8 port_num, u16 vf_num,
+ struct mlx5_vport_counters *vc);
+int mlx5_sriov_init(struct mlx5_core_dev *dev);
+int mlx5_sriov_cleanup(struct mlx5_core_dev *dev);
+
+static inline u32 mlx5_mkey_to_idx(u32 mkey)
+{
+ return mkey >> 8; /* upper 24 bits hold the mkey index */
+}
+
+static inline int fw_initializing(struct mlx5_core_dev *dev)
+{
+ return ioread32be(&dev->iseg->initializing) >> 31;
+}
+
+static inline u32 mlx5_idx_to_mkey(u32 mkey_idx)
+{
+ return mkey_idx << 8; /* leaves the low 8 variant bits zero */
+}
+
+static inline u8 mlx5_mkey_variant(u32 mkey)
+{
+ return mkey & 0xff; /* low 8 bits hold the mkey variant */
+}
+
+enum {
+ MLX5_PROF_MASK_QP_SIZE = (u64)1 << 0,
+ MLX5_PROF_MASK_MR_CACHE = (u64)1 << 1,
+ MLX5_PROF_MASK_DCT = (u64)1 << 2,
+};
+
+enum {
+ MAX_MR_CACHE_ENTRIES = 15,
+};
+
+enum {
+ MLX5_INTERFACE_PROTOCOL_IB = 0,
+ MLX5_INTERFACE_PROTOCOL_ETH = 1,
+};
+
+struct mlx5_interface {
+ void * (*add)(struct mlx5_core_dev *dev);
+ void (*remove)(struct mlx5_core_dev *dev, void *context);
+ void (*event)(struct mlx5_core_dev *dev, void *context,
+ enum mlx5_dev_event event, unsigned long param);
+ void * (*get_dev)(void *context);
+ int protocol;
+ struct list_head list;
+};
+
+void *mlx5_get_protocol_dev(struct mlx5_core_dev *mdev, int protocol);
+int mlx5_register_interface(struct mlx5_interface *intf);
+void mlx5_unregister_interface(struct mlx5_interface *intf);
+int mlx5_vport_enable_roce(struct mlx5_core_dev *mdev);
+int mlx5_core_query_vendor_id(struct mlx5_core_dev *mdev, u32 *vendor_id);
+
+struct mlx5_profile {
+ u64 mask;
+ u8 log_max_qp;
+ int dct_enable;
+ struct {
+ int size;
+ int limit;
+ } mr_cache[MAX_MR_CACHE_ENTRIES];
+};
+
+static inline int mlx5_get_out_cmd_status(void *ptr)
+{
+ return *(u8 *)ptr;
+}
+
+enum {
+ MLX5_PCI_DEV_IS_VF = 1 << 0,
+};
+
+static inline int mlx5_core_is_pf(struct mlx5_core_dev *dev)
+{
+ return !(dev->priv.pci_dev_data & MLX5_PCI_DEV_IS_VF);
+}
+
+static inline int mlx5_get_gid_table_len(u16 param)
+{
+ if (param > 4) {
+ pr_warn("gid table size encoding %u is out of range, reporting zero length\n", param);
+ return 0;
+ }
+
+ return 8 * (1 << param);
+}
+
+static inline u16 mlx5_to_sw_pkey_sz(int pkey_sz)
+{
+ if (pkey_sz > MLX5_MAX_LOG_PKEY_TABLE)
+ return 0;
+
+ return MLX5_MIN_PKEY_TABLE_SIZE << pkey_sz;
+}
+
+enum {
+ MLX5_SRIOV_UNLOAD_MAGIC = 0x2cf58291
+};
+
+#endif /* MLX5_DRIVER_H */
+
+#include "post_kmod.h"
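
The inline helpers near the end of mlx5/driver.h split a memory key into an index (upper 24 bits) and a variant byte (low 8 bits), and decode the GID table size encoding as 8 << param for encodings 0 through 4. The following is a minimal standalone sketch of that arithmetic, not code from the patch. It assumes <stdint.h> types as userspace stand-ins for the kernel's u32/u16/u8, and every sketch_* name is hypothetical. sketch_make_mkey() has no counterpart in the header, and the pr_warn() call is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel integer types (assumption). */
typedef uint32_t u32;
typedef uint16_t u16;
typedef uint8_t u8;

/* Same arithmetic as mlx5_mkey_to_idx(): drop the low 8 variant bits. */
static inline u32 sketch_mkey_to_idx(u32 mkey)
{
	return mkey >> 8;
}

/* Same arithmetic as mlx5_idx_to_mkey(): variant bits are left at zero. */
static inline u32 sketch_idx_to_mkey(u32 mkey_idx)
{
	return mkey_idx << 8;
}

/* Same arithmetic as mlx5_mkey_variant(): keep only the low 8 bits. */
static inline u8 sketch_mkey_variant(u32 mkey)
{
	return mkey & 0xff;
}

/* Hypothetical helper (not in the patch): combine index and variant. */
static inline u32 sketch_make_mkey(u32 mkey_idx, u8 variant)
{
	return sketch_idx_to_mkey(mkey_idx) | variant;
}

/* Mirrors mlx5_get_gid_table_len() without the warning message. */
static inline int sketch_gid_table_len(u16 param)
{
	if (param > 4)
		return 0;

	return 8 * (1 << param);
}

/* Round-trip checks; call this from a test harness or from main(). */
static inline void sketch_selfcheck(void)
{
	u32 mkey = sketch_make_mkey(0x1234, 0xab);

	assert(mkey == 0x1234ab);
	assert(sketch_mkey_to_idx(mkey) == 0x1234);
	assert(sketch_mkey_variant(mkey) == 0xab);
	assert(sketch_gid_table_len(0) == 8);
	assert(sketch_gid_table_len(4) == 128);
	assert(sketch_gid_table_len(5) == 0);
}
```

Calling sketch_selfcheck() from a small main() is enough to exercise the sketch. The encodings 0..4 line up with the MLX5_CMD_HCA_CAP_GID_TABLE_SIZE_* values (8 to 128 entries) in mlx5_ifc.h.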
diff --git a/drivers/net/mlnx_uio/mlnx/include/mlx5/flow_table.h b/drivers/net/mlnx_uio/mlnx/include/mlx5/flow_table.h
new file mode 100644
index 0000000..37d5981
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/include/mlx5/flow_table.h
@@ -0,0 +1,59 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies, Ltd. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef MLX5_FLOW_TABLE_H
+#define MLX5_FLOW_TABLE_H
+
+
+struct mlx5_flow_table_group {
+ u8 log_sz;
+ u8 match_criteria_enable;
+ u32 match_criteria[MLX5_ST_SZ_DW(fte_match_param)];
+};
+
+void *mlx5_create_flow_table(struct mlx5_core_dev *dev, u8 level, u8 table_type,
+ u16 num_groups,
+ struct mlx5_flow_table_group *group);
+void mlx5_destroy_flow_table(void *flow_table);
+int mlx5_add_flow_table_entry(void *flow_table, u8 match_criteria_enable,
+ void *match_criteria, void *flow_context,
+ u32 *flow_index);
+void mlx5_del_flow_table_entry(void *flow_table, u32 flow_index);
+u32 mlx5_get_flow_table_id(void *flow_table);
+
+#endif /* MLX5_FLOW_TABLE_H */
+
+#include "post_kmod.h"
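
flow_table.h sizes match_criteria with MLX5_ST_SZ_DW(fte_match_param), and the mlx5_ifc.h structs that follow appear to use one u8 array element per hardware bit, so that sizeof() on a *_bits struct gives its width in bits. The sketch below illustrates that convention on a copy of the mlx5_ifc_cmd_pas_bits layout. The SKETCH_* macros are assumptions written for this note. The real MLX5_ST_SZ_DW is defined outside this part of the patch and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8; /* userspace stand-in for the kernel's u8 (assumption) */

/* Copy of the mlx5_ifc_cmd_pas_bits layout: each array element is one bit. */
struct sketch_ifc_cmd_pas_bits {
	u8 pa_h[0x20];
	u8 pa_l[0x14];
	u8 reserved_0[0xc];
};

/* Hypothetical macros: struct width in bits and in 32-bit dwords. */
#define SKETCH_ST_SZ_BITS(typ) (sizeof(struct sketch_ifc_##typ##_bits))
#define SKETCH_ST_SZ_DW(typ)   (SKETCH_ST_SZ_BITS(typ) / 32)

/* Hypothetical macros: bit offset and bit width of a single field. */
#define SKETCH_BIT_OFF(typ, fld) \
	(offsetof(struct sketch_ifc_##typ##_bits, fld))
#define SKETCH_BIT_SZ(typ, fld) \
	(sizeof(((struct sketch_ifc_##typ##_bits *)0)->fld))

/* A buffer sized the way mlx5_flow_table_group sizes match_criteria. */
struct sketch_group {
	uint32_t criteria[SKETCH_ST_SZ_DW(cmd_pas)];
};

/* Layout checks; call this from a test harness or from main(). */
static inline void sketch_layout_selfcheck(void)
{
	assert(SKETCH_ST_SZ_BITS(cmd_pas) == 64);  /* 0x20 + 0x14 + 0xc bits */
	assert(SKETCH_ST_SZ_DW(cmd_pas) == 2);     /* two 32-bit dwords */
	assert(SKETCH_BIT_OFF(cmd_pas, pa_l) == 32);
	assert(SKETCH_BIT_SZ(cmd_pas, pa_l) == 20);
	assert(sizeof(struct sketch_group) == 8);  /* 2 dwords = 8 bytes */
}
```

Under the same assumption, summing the field widths of mlx5_ifc_fte_match_param_bits as listed in mlx5_ifc.h gives 0x1000 bits, i.e. 128 dwords for match_criteria.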
diff --git a/drivers/net/mlnx_uio/mlnx/include/mlx5/mlx5_ifc.h b/drivers/net/mlnx_uio/mlnx/include/mlx5/mlx5_ifc.h
new file mode 100644
index 0000000..7f986d6
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/include/mlx5/mlx5_ifc.h
@@ -0,0 +1,6892 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+
+ Autogenerated file.
+ Date: 2015-04-13 14:59
+ Source Document Name: Mellanox <Doc Name>
+ Source Document Version: 0.25
+ Generated by adb_to_c.py (EAT.ME Version: 1.0.70)
+*/
+#ifndef MLX5_IFC_H
+#define MLX5_IFC_H
+
+enum {
+ MLX5_EVENT_TYPE_CODING_COMPLETION_EVENTS = 0x0,
+ MLX5_EVENT_TYPE_CODING_PATH_MIGRATED_SUCCEEDED = 0x1,
+ MLX5_EVENT_TYPE_CODING_COMMUNICATION_ESTABLISHED = 0x2,
+ MLX5_EVENT_TYPE_CODING_SEND_QUEUE_DRAINED = 0x3,
+ MLX5_EVENT_TYPE_CODING_LAST_WQE_REACHED = 0x13,
+ MLX5_EVENT_TYPE_CODING_SRQ_LIMIT = 0x14,
+ MLX5_EVENT_TYPE_CODING_DCT_ALL_CONNECTIONS_CLOSED = 0x1c,
+ MLX5_EVENT_TYPE_CODING_DCT_ACCESS_KEY_VIOLATION = 0x1d,
+ MLX5_EVENT_TYPE_CODING_CQ_ERROR = 0x4,
+ MLX5_EVENT_TYPE_CODING_LOCAL_WQ_CATASTROPHIC_ERROR = 0x5,
+ MLX5_EVENT_TYPE_CODING_PATH_MIGRATION_FAILED = 0x7,
+ MLX5_EVENT_TYPE_CODING_PAGE_FAULT_EVENT = 0xc,
+ MLX5_EVENT_TYPE_CODING_INVALID_REQUEST_LOCAL_WQ_ERROR = 0x10,
+ MLX5_EVENT_TYPE_CODING_LOCAL_ACCESS_VIOLATION_WQ_ERROR = 0x11,
+ MLX5_EVENT_TYPE_CODING_LOCAL_SRQ_CATASTROPHIC_ERROR = 0x12,
+ MLX5_EVENT_TYPE_CODING_INTERNAL_ERROR = 0x8,
+ MLX5_EVENT_TYPE_CODING_PORT_STATE_CHANGE = 0x9,
+ MLX5_EVENT_TYPE_CODING_GPIO_EVENT = 0x15,
+ MLX5_EVENT_TYPE_CODING_REMOTE_CONFIGURATION_PROTOCOL_EVENT = 0x19,
+ MLX5_EVENT_TYPE_CODING_DOORBELL_BLUEFLAME_CONGESTION_EVENT = 0x1a,
+ MLX5_EVENT_TYPE_CODING_STALL_VL_EVENT = 0x1b,
+ MLX5_EVENT_TYPE_CODING_DROPPED_PACKET_LOGGED_EVENT = 0x1f,
+ MLX5_EVENT_TYPE_CODING_COMMAND_INTERFACE_COMPLETION = 0xa,
+ MLX5_EVENT_TYPE_CODING_PAGE_REQUEST = 0xb
+};
+
+enum {
+ MLX5_MODIFY_TIR_BITMASK_LRO = 0x0,
+ MLX5_MODIFY_TIR_BITMASK_INDIRECT_TABLE = 0x1,
+ MLX5_MODIFY_TIR_BITMASK_HASH = 0x2,
+ MLX5_MODIFY_TIR_BITMASK_TUNNELED_OFFLOAD_EN = 0x3
+};
+
+enum {
+ MLX5_CMD_OP_QUERY_HCA_CAP = 0x100,
+ MLX5_CMD_OP_QUERY_ADAPTER = 0x101,
+ MLX5_CMD_OP_INIT_HCA = 0x102,
+ MLX5_CMD_OP_TEARDOWN_HCA = 0x103,
+ MLX5_CMD_OP_ENABLE_HCA = 0x104,
+ MLX5_CMD_OP_DISABLE_HCA = 0x105,
+ MLX5_CMD_OP_QUERY_PAGES = 0x107,
+ MLX5_CMD_OP_MANAGE_PAGES = 0x108,
+ MLX5_CMD_OP_SET_HCA_CAP = 0x109,
+ MLX5_CMD_OP_QUERY_ISSI = 0x10a,
+ MLX5_CMD_OP_SET_ISSI = 0x10b,
+ MLX5_CMD_OP_CREATE_MKEY = 0x200,
+ MLX5_CMD_OP_QUERY_MKEY = 0x201,
+ MLX5_CMD_OP_DESTROY_MKEY = 0x202,
+ MLX5_CMD_OP_QUERY_SPECIAL_CONTEXTS = 0x203,
+ MLX5_CMD_OP_PAGE_FAULT_RESUME = 0x204,
+ MLX5_CMD_OP_CREATE_EQ = 0x301,
+ MLX5_CMD_OP_DESTROY_EQ = 0x302,
+ MLX5_CMD_OP_QUERY_EQ = 0x303,
+ MLX5_CMD_OP_GEN_EQE = 0x304,
+ MLX5_CMD_OP_CREATE_CQ = 0x400,
+ MLX5_CMD_OP_DESTROY_CQ = 0x401,
+ MLX5_CMD_OP_QUERY_CQ = 0x402,
+ MLX5_CMD_OP_MODIFY_CQ = 0x403,
+ MLX5_CMD_OP_CREATE_QP = 0x500,
+ MLX5_CMD_OP_DESTROY_QP = 0x501,
+ MLX5_CMD_OP_RST2INIT_QP = 0x502,
+ MLX5_CMD_OP_INIT2RTR_QP = 0x503,
+ MLX5_CMD_OP_RTR2RTS_QP = 0x504,
+ MLX5_CMD_OP_RTS2RTS_QP = 0x505,
+ MLX5_CMD_OP_SQERR2RTS_QP = 0x506,
+ MLX5_CMD_OP_2ERR_QP = 0x507,
+ MLX5_CMD_OP_2RST_QP = 0x50a,
+ MLX5_CMD_OP_QUERY_QP = 0x50b,
+ MLX5_CMD_OP_SQD_RTS_QP = 0x50c,
+ MLX5_CMD_OP_INIT2INIT_QP = 0x50e,
+ MLX5_CMD_OP_CREATE_PSV = 0x600,
+ MLX5_CMD_OP_DESTROY_PSV = 0x601,
+ MLX5_CMD_OP_CREATE_SRQ = 0x700,
+ MLX5_CMD_OP_DESTROY_SRQ = 0x701,
+ MLX5_CMD_OP_QUERY_SRQ = 0x702,
+ MLX5_CMD_OP_ARM_RQ = 0x703,
+ MLX5_CMD_OP_CREATE_XRC_SRQ = 0x705,
+ MLX5_CMD_OP_DESTROY_XRC_SRQ = 0x706,
+ MLX5_CMD_OP_QUERY_XRC_SRQ = 0x707,
+ MLX5_CMD_OP_ARM_XRC_SRQ = 0x708,
+ MLX5_CMD_OP_CREATE_DCT = 0x710,
+ MLX5_CMD_OP_DESTROY_DCT = 0x711,
+ MLX5_CMD_OP_DRAIN_DCT = 0x712,
+ MLX5_CMD_OP_QUERY_DCT = 0x713,
+ MLX5_CMD_OP_ARM_DCT_FOR_KEY_VIOLATION = 0x714,
+ MLX5_CMD_OP_QUERY_VPORT_STATE = 0x750,
+ MLX5_CMD_OP_MODIFY_VPORT_STATE = 0x751,
+ MLX5_CMD_OP_QUERY_ESW_VPORT_CONTEXT = 0x752,
+ MLX5_CMD_OP_MODIFY_ESW_VPORT_CONTEXT = 0x753,
+ MLX5_CMD_OP_QUERY_NIC_VPORT_CONTEXT = 0x754,
+ MLX5_CMD_OP_MODIFY_NIC_VPORT_CONTEXT = 0x755,
+ MLX5_CMD_OP_QUERY_ROCE_ADDRESS = 0x760,
+ MLX5_CMD_OP_SET_ROCE_ADDRESS = 0x761,
+ MLX5_CMD_OP_QUERY_HCA_VPORT_CONTEXT = 0x762,
+ MLX5_CMD_OP_MODIFY_HCA_VPORT_CONTEXT = 0x763,
+ MLX5_CMD_OP_QUERY_HCA_VPORT_GID = 0x764,
+ MLX5_CMD_OP_QUERY_HCA_VPORT_PKEY = 0x765,
+ MLX5_CMD_OP_QUERY_VPORT_COUNTER = 0x770,
+ MLX5_CMD_OP_ALLOC_Q_COUNTER = 0x771,
+ MLX5_CMD_OP_DEALLOC_Q_COUNTER = 0x772,
+ MLX5_CMD_OP_QUERY_Q_COUNTER = 0x773,
+ MLX5_CMD_OP_ALLOC_PD = 0x800,
+ MLX5_CMD_OP_DEALLOC_PD = 0x801,
+ MLX5_CMD_OP_ALLOC_UAR = 0x802,
+ MLX5_CMD_OP_DEALLOC_UAR = 0x803,
+ MLX5_CMD_OP_CONFIG_INT_MODERATION = 0x804,
+ MLX5_CMD_OP_ACCESS_REG = 0x805,
+ MLX5_CMD_OP_ATTACH_TO_MCG = 0x806,
+ MLX5_CMD_OP_DETTACH_FROM_MCG = 0x807,
+ MLX5_CMD_OP_GET_DROPPED_PACKET_LOG = 0x80a,
+ MLX5_CMD_OP_MAD_IFC = 0x50d,
+ MLX5_CMD_OP_QUERY_MAD_DEMUX = 0x80b,
+ MLX5_CMD_OP_SET_MAD_DEMUX = 0x80c,
+ MLX5_CMD_OP_NOP = 0x80d,
+ MLX5_CMD_OP_ALLOC_XRCD = 0x80e,
+ MLX5_CMD_OP_DEALLOC_XRCD = 0x80f,
+ MLX5_CMD_OP_ALLOC_TRANSPORT_DOMAIN = 0x816,
+ MLX5_CMD_OP_DEALLOC_TRANSPORT_DOMAIN = 0x817,
+ MLX5_CMD_OP_QUERY_CONG_STATUS = 0x822,
+ MLX5_CMD_OP_MODIFY_CONG_STATUS = 0x823,
+ MLX5_CMD_OP_QUERY_CONG_PARAMS = 0x824,
+ MLX5_CMD_OP_MODIFY_CONG_PARAMS = 0x825,
+ MLX5_CMD_OP_QUERY_CONG_STATISTICS = 0x826,
+ MLX5_CMD_OP_ADD_VXLAN_UDP_DPORT = 0x827,
+ MLX5_CMD_OP_DELETE_VXLAN_UDP_DPORT = 0x828,
+ MLX5_CMD_OP_SET_L2_TABLE_ENTRY = 0x829,
+ MLX5_CMD_OP_QUERY_L2_TABLE_ENTRY = 0x82a,
+ MLX5_CMD_OP_DELETE_L2_TABLE_ENTRY = 0x82b,
+ MLX5_CMD_OP_CREATE_TIR = 0x900,
+ MLX5_CMD_OP_MODIFY_TIR = 0x901,
+ MLX5_CMD_OP_DESTROY_TIR = 0x902,
+ MLX5_CMD_OP_QUERY_TIR = 0x903,
+ MLX5_CMD_OP_CREATE_SQ = 0x904,
+ MLX5_CMD_OP_MODIFY_SQ = 0x905,
+ MLX5_CMD_OP_DESTROY_SQ = 0x906,
+ MLX5_CMD_OP_QUERY_SQ = 0x907,
+ MLX5_CMD_OP_CREATE_RQ = 0x908,
+ MLX5_CMD_OP_MODIFY_RQ = 0x909,
+ MLX5_CMD_OP_DESTROY_RQ = 0x90a,
+ MLX5_CMD_OP_QUERY_RQ = 0x90b,
+ MLX5_CMD_OP_CREATE_RMP = 0x90c,
+ MLX5_CMD_OP_MODIFY_RMP = 0x90d,
+ MLX5_CMD_OP_DESTROY_RMP = 0x90e,
+ MLX5_CMD_OP_QUERY_RMP = 0x90f,
+ MLX5_CMD_OP_CREATE_TIS = 0x912,
+ MLX5_CMD_OP_MODIFY_TIS = 0x913,
+ MLX5_CMD_OP_DESTROY_TIS = 0x914,
+ MLX5_CMD_OP_QUERY_TIS = 0x915,
+ MLX5_CMD_OP_CREATE_RQT = 0x916,
+ MLX5_CMD_OP_MODIFY_RQT = 0x917,
+ MLX5_CMD_OP_DESTROY_RQT = 0x918,
+ MLX5_CMD_OP_QUERY_RQT = 0x919,
+ MLX5_CMD_OP_CREATE_FLOW_TABLE = 0x930,
+ MLX5_CMD_OP_DESTROY_FLOW_TABLE = 0x931,
+ MLX5_CMD_OP_QUERY_FLOW_TABLE = 0x932,
+ MLX5_CMD_OP_CREATE_FLOW_GROUP = 0x933,
+ MLX5_CMD_OP_DESTROY_FLOW_GROUP = 0x934,
+ MLX5_CMD_OP_QUERY_FLOW_GROUP = 0x935,
+ MLX5_CMD_OP_SET_FLOW_TABLE_ENTRY = 0x936,
+ MLX5_CMD_OP_QUERY_FLOW_TABLE_ENTRY = 0x937,
+ MLX5_CMD_OP_DELETE_FLOW_TABLE_ENTRY = 0x938
+};
+
+struct mlx5_ifc_flow_table_fields_supported_bits {
+ u8 outer_dmac[0x1];
+ u8 outer_smac[0x1];
+ u8 outer_ether_type[0x1];
+ u8 reserved_0[0x1];
+ u8 outer_first_prio[0x1];
+ u8 outer_first_cfi[0x1];
+ u8 outer_first_vid[0x1];
+ u8 reserved_1[0x1];
+ u8 outer_second_prio[0x1];
+ u8 outer_second_cfi[0x1];
+ u8 outer_second_vid[0x1];
+ u8 reserved_2[0x1];
+ u8 outer_sip[0x1];
+ u8 outer_dip[0x1];
+ u8 outer_frag[0x1];
+ u8 outer_ip_protocol[0x1];
+ u8 outer_ip_ecn[0x1];
+ u8 outer_ip_dscp[0x1];
+ u8 outer_udp_sport[0x1];
+ u8 outer_udp_dport[0x1];
+ u8 outer_tcp_sport[0x1];
+ u8 outer_tcp_dport[0x1];
+ u8 outer_tcp_flags[0x1];
+ u8 outer_gre_protocol[0x1];
+ u8 outer_gre_key[0x1];
+ u8 outer_vxlan_vni[0x1];
+ u8 reserved_3[0x5];
+ u8 source_eswitch_port[0x1];
+
+ u8 inner_dmac[0x1];
+ u8 inner_smac[0x1];
+ u8 inner_ether_type[0x1];
+ u8 reserved_4[0x1];
+ u8 inner_first_prio[0x1];
+ u8 inner_first_cfi[0x1];
+ u8 inner_first_vid[0x1];
+ u8 reserved_5[0x1];
+ u8 inner_second_prio[0x1];
+ u8 inner_second_cfi[0x1];
+ u8 inner_second_vid[0x1];
+ u8 reserved_6[0x1];
+ u8 inner_sip[0x1];
+ u8 inner_dip[0x1];
+ u8 inner_frag[0x1];
+ u8 inner_ip_protocol[0x1];
+ u8 inner_ip_ecn[0x1];
+ u8 inner_ip_dscp[0x1];
+ u8 inner_udp_sport[0x1];
+ u8 inner_udp_dport[0x1];
+ u8 inner_tcp_sport[0x1];
+ u8 inner_tcp_dport[0x1];
+ u8 inner_tcp_flags[0x1];
+ u8 reserved_7[0x9];
+
+ u8 reserved_8[0x40];
+};
+
+struct mlx5_ifc_flow_table_prop_layout_bits {
+ u8 ft_support[0x1];
+ u8 reserved_0[0x1f];
+
+ u8 reserved_1[0x2];
+ u8 log_max_ft_size[0x6];
+ u8 reserved_2[0x10];
+ u8 max_ft_level[0x8];
+
+ u8 reserved_3[0x20];
+
+ u8 reserved_4[0x18];
+ u8 log_max_ft_num[0x8];
+
+ u8 reserved_5[0x18];
+ u8 log_max_destination[0x8];
+
+ u8 reserved_6[0x18];
+ u8 log_max_flow[0x8];
+
+ u8 reserved_7[0x40];
+
+ struct mlx5_ifc_flow_table_fields_supported_bits ft_field_support;
+
+ struct mlx5_ifc_flow_table_fields_supported_bits ft_field_bitmask_support;
+};
+
+struct mlx5_ifc_odp_per_transport_service_cap_bits {
+ u8 send[0x1];
+ u8 receive[0x1];
+ u8 write[0x1];
+ u8 read[0x1];
+ u8 reserved_0[0x1];
+ u8 srq_receive[0x1];
+ u8 reserved_1[0x1a];
+};
+
+struct mlx5_ifc_fte_match_set_lyr_2_4_bits {
+ u8 smac_47_16[0x20];
+
+ u8 smac_15_0[0x10];
+ u8 ethertype[0x10];
+
+ u8 dmac_47_16[0x20];
+
+ u8 dmac_15_0[0x10];
+ u8 first_prio[0x3];
+ u8 first_cfi[0x1];
+ u8 first_vid[0xc];
+
+ u8 ip_protocol[0x8];
+ u8 ip_dscp[0x6];
+ u8 ip_ecn[0x2];
+ u8 vlan_tag[0x1];
+ u8 reserved_0[0x1];
+ u8 frag[0x1];
+ u8 reserved_1[0x4];
+ u8 tcp_flags[0x9];
+
+ u8 tcp_sport[0x10];
+ u8 tcp_dport[0x10];
+
+ u8 reserved_2[0x20];
+
+ u8 udp_sport[0x10];
+ u8 udp_dport[0x10];
+
+ u8 src_ip[4][0x20];
+
+ u8 dst_ip[4][0x20];
+};
+
+struct mlx5_ifc_fte_match_set_misc_bits {
+ u8 reserved_0[0x20];
+
+ u8 reserved_1[0x10];
+ u8 source_port[0x10];
+
+ u8 outer_second_prio[0x3];
+ u8 outer_second_cfi[0x1];
+ u8 outer_second_vid[0xc];
+ u8 inner_second_prio[0x3];
+ u8 inner_second_cfi[0x1];
+ u8 inner_second_vid[0xc];
+
+ u8 outer_second_vlan_tag[0x1];
+ u8 inner_second_vlan_tag[0x1];
+ u8 reserved_2[0xe];
+ u8 gre_protocol[0x10];
+
+ u8 gre_key_h[0x18];
+ u8 gre_key_l[0x8];
+
+ u8 vxlan_vni[0x18];
+ u8 reserved_3[0x8];
+
+ u8 reserved_4[0x20];
+
+ u8 reserved_5[0xc];
+ u8 outer_ipv6_flow_label[0x14];
+
+ u8 reserved_6[0xc];
+ u8 inner_ipv6_flow_label[0x14];
+
+ u8 reserved_7[0xe0];
+};
+
+struct mlx5_ifc_cmd_pas_bits {
+ u8 pa_h[0x20];
+
+ u8 pa_l[0x14];
+ u8 reserved_0[0xc];
+};
+
+struct mlx5_ifc_uint64_bits {
+ u8 hi[0x20];
+
+ u8 lo[0x20];
+};
+
+enum {
+ MLX5_ADS_STAT_RATE_NO_LIMIT = 0x0,
+ MLX5_ADS_STAT_RATE_2_5GBPS = 0x7,
+ MLX5_ADS_STAT_RATE_10GBPS = 0x8,
+ MLX5_ADS_STAT_RATE_30GBPS = 0x9,
+ MLX5_ADS_STAT_RATE_5GBPS = 0xa,
+ MLX5_ADS_STAT_RATE_20GBPS = 0xb,
+ MLX5_ADS_STAT_RATE_40GBPS = 0xc,
+ MLX5_ADS_STAT_RATE_60GBPS = 0xd,
+ MLX5_ADS_STAT_RATE_80GBPS = 0xe,
+ MLX5_ADS_STAT_RATE_120GBPS = 0xf,
+};
+
+struct mlx5_ifc_ads_bits {
+ u8 fl[0x1];
+ u8 free_ar[0x1];
+ u8 reserved_0[0xe];
+ u8 pkey_index[0x10];
+
+ u8 reserved_1[0x8];
+ u8 grh[0x1];
+ u8 mlid[0x7];
+ u8 rlid[0x10];
+
+ u8 ack_timeout[0x5];
+ u8 reserved_2[0x3];
+ u8 src_addr_index[0x8];
+ u8 reserved_3[0x4];
+ u8 stat_rate[0x4];
+ u8 hop_limit[0x8];
+
+ u8 reserved_4[0x4];
+ u8 tclass[0x8];
+ u8 flow_label[0x14];
+
+ u8 rgid_rip[16][0x8];
+
+ u8 reserved_5[0x4];
+ u8 f_dscp[0x1];
+ u8 f_ecn[0x1];
+ u8 reserved_6[0x1];
+ u8 f_eth_prio[0x1];
+ u8 ecn[0x2];
+ u8 dscp[0x6];
+ u8 udp_sport[0x10];
+
+ u8 dei_cfi[0x1];
+ u8 eth_prio[0x3];
+ u8 sl[0x4];
+ u8 port[0x8];
+ u8 rmac_47_32[0x10];
+
+ u8 rmac_31_0[0x20];
+};
+
+struct mlx5_ifc_flow_table_nic_cap_bits {
+ u8 reserved_0[0x200];
+
+ struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_nic_receive;
+
+ u8 reserved_1[0x200];
+
+ struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_nic_receive_sniffer;
+
+ struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_nic_transmit;
+
+ u8 reserved_2[0x200];
+
+ struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_nic_transmit_sniffer;
+
+ u8 reserved_3[0x7200];
+};
+
+struct mlx5_ifc_per_protocol_networking_offload_caps_bits {
+ u8 csum_cap[0x1];
+ u8 vlan_cap[0x1];
+ u8 lro_cap[0x1];
+ u8 lro_psh_flag[0x1];
+ u8 lro_time_stamp[0x1];
+ u8 reserved_0[0x6];
+ u8 max_lso_cap[0x5];
+ u8 reserved_1[0x4];
+ u8 rss_ind_tbl_cap[0x4];
+ u8 reserved_2[0x3];
+ u8 tunnel_lso_const_out_ip_id[0x1];
+ u8 reserved_3[0x2];
+ u8 tunnel_statless_gre[0x1];
+ u8 tunnel_stateless_vxlan[0x1];
+
+ u8 reserved_4[0x20];
+
+ u8 reserved_5[0x10];
+ u8 lro_min_mss_size[0x10];
+
+ u8 reserved_6[0x120];
+
+ u8 lro_timer_supported_periods[4][0x20];
+
+ u8 reserved_7[0x600];
+};
+
+struct mlx5_ifc_roce_cap_bits {
+ u8 roce_apm[0x1];
+ u8 reserved_0[0x1f];
+
+ u8 reserved_1[0x60];
+
+ u8 reserved_2[0xc];
+ u8 l3_type[0x4];
+ u8 reserved_3[0x8];
+ u8 roce_version[0x8];
+
+ u8 reserved_4[0x10];
+ u8 r_roce_dest_udp_port[0x10];
+
+ u8 r_roce_max_src_udp_port[0x10];
+ u8 r_roce_min_src_udp_port[0x10];
+
+ u8 reserved_5[0x10];
+ u8 roce_address_table_size[0x10];
+
+ u8 reserved_6[0x700];
+};
+
+enum {
+ MLX5_ATOMIC_CAPS_ATOMIC_SIZE_QP_1_BYTE = 0x0,
+ MLX5_ATOMIC_CAPS_ATOMIC_SIZE_QP_2_BYTES = 0x2,
+ MLX5_ATOMIC_CAPS_ATOMIC_SIZE_QP_4_BYTES = 0x4,
+ MLX5_ATOMIC_CAPS_ATOMIC_SIZE_QP_8_BYTES = 0x8,
+ MLX5_ATOMIC_CAPS_ATOMIC_SIZE_QP_16_BYTES = 0x10,
+ MLX5_ATOMIC_CAPS_ATOMIC_SIZE_QP_32_BYTES = 0x20,
+ MLX5_ATOMIC_CAPS_ATOMIC_SIZE_QP_64_BYTES = 0x40,
+ MLX5_ATOMIC_CAPS_ATOMIC_SIZE_QP_128_BYTES = 0x80,
+ MLX5_ATOMIC_CAPS_ATOMIC_SIZE_QP_256_BYTES = 0x100,
+};
+
+enum {
+ MLX5_ATOMIC_CAPS_ATOMIC_SIZE_DC_1_BYTE = 0x1,
+ MLX5_ATOMIC_CAPS_ATOMIC_SIZE_DC_2_BYTES = 0x2,
+ MLX5_ATOMIC_CAPS_ATOMIC_SIZE_DC_4_BYTES = 0x4,
+ MLX5_ATOMIC_CAPS_ATOMIC_SIZE_DC_8_BYTES = 0x8,
+ MLX5_ATOMIC_CAPS_ATOMIC_SIZE_DC_16_BYTES = 0x10,
+ MLX5_ATOMIC_CAPS_ATOMIC_SIZE_DC_32_BYTES = 0x20,
+ MLX5_ATOMIC_CAPS_ATOMIC_SIZE_DC_64_BYTES = 0x40,
+ MLX5_ATOMIC_CAPS_ATOMIC_SIZE_DC_128_BYTES = 0x80,
+ MLX5_ATOMIC_CAPS_ATOMIC_SIZE_DC_256_BYTES = 0x100,
+};
+
+struct mlx5_ifc_atomic_caps_bits {
+ u8 reserved_0[0x40];
+
+ u8 atomic_req_endianess[0x1];
+ u8 reserved_1[0x1f];
+
+ u8 reserved_2[0x20];
+
+ u8 reserved_3[0x10];
+ u8 atomic_operations[0x10];
+
+ u8 reserved_4[0x10];
+ u8 atomic_size_qp[0x10];
+
+ u8 reserved_5[0x10];
+ u8 atomic_size_dc[0x10];
+
+ u8 reserved_6[0x720];
+};
+
+struct mlx5_ifc_odp_cap_bits {
+ u8 reserved_0[0x40];
+
+ u8 sig[0x1];
+ u8 reserved_1[0x1f];
+
+ u8 reserved_2[0x20];
+
+ struct mlx5_ifc_odp_per_transport_service_cap_bits rc_odp_caps;
+
+ struct mlx5_ifc_odp_per_transport_service_cap_bits uc_odp_caps;
+
+ struct mlx5_ifc_odp_per_transport_service_cap_bits ud_odp_caps;
+
+ u8 reserved_3[0x720];
+};
+
+enum {
+ MLX5_WQ_TYPE_LINKED_LIST = 0x0,
+ MLX5_WQ_TYPE_CYCLIC = 0x1,
+ MLX5_WQ_TYPE_STRQ = 0x2,
+};
+
+enum {
+ MLX5_WQ_END_PAD_MODE_NONE = 0x0,
+ MLX5_WQ_END_PAD_MODE_ALIGN = 0x1,
+};
+
+enum {
+ MLX5_CMD_HCA_CAP_GID_TABLE_SIZE_8_GID_ENTRIES = 0x0,
+ MLX5_CMD_HCA_CAP_GID_TABLE_SIZE_16_GID_ENTRIES = 0x1,
+ MLX5_CMD_HCA_CAP_GID_TABLE_SIZE_32_GID_ENTRIES = 0x2,
+ MLX5_CMD_HCA_CAP_GID_TABLE_SIZE_64_GID_ENTRIES = 0x3,
+ MLX5_CMD_HCA_CAP_GID_TABLE_SIZE_128_GID_ENTRIES = 0x4,
+};
+
+enum {
+ MLX5_CMD_HCA_CAP_PKEY_TABLE_SIZE_128_ENTRIES = 0x0,
+ MLX5_CMD_HCA_CAP_PKEY_TABLE_SIZE_256_ENTRIES = 0x1,
+ MLX5_CMD_HCA_CAP_PKEY_TABLE_SIZE_512_ENTRIES = 0x2,
+ MLX5_CMD_HCA_CAP_PKEY_TABLE_SIZE_1K_ENTRIES = 0x3,
+ MLX5_CMD_HCA_CAP_PKEY_TABLE_SIZE_2K_ENTRIES = 0x4,
+ MLX5_CMD_HCA_CAP_PKEY_TABLE_SIZE_4K_ENTRIES = 0x5,
+};
+
+enum {
+ MLX5_CMD_HCA_CAP_PORT_TYPE_IB = 0x0,
+ MLX5_CMD_HCA_CAP_PORT_TYPE_ETHERNET = 0x1,
+};
+
+enum {
+ MLX5_CMD_HCA_CAP_CMDIF_CHECKSUM_DISABLED = 0x0,
+ MLX5_CMD_HCA_CAP_CMDIF_CHECKSUM_INITIAL_STATE = 0x1,
+ MLX5_CMD_HCA_CAP_CMDIF_CHECKSUM_ENABLED = 0x3,
+};
+
+enum {
+ MLX5_CAP_PORT_TYPE_IB = 0x0,
+ MLX5_CAP_PORT_TYPE_ETH = 0x1,
+};
+
+struct mlx5_ifc_cmd_hca_cap_bits {
+ u8 reserved_0[0x80];
+
+ u8 log_max_srq_sz[0x8];
+ u8 log_max_qp_sz[0x8];
+ u8 reserved_1[0xb];
+ u8 log_max_qp[0x5];
+
+ u8 reserved_2[0xb];
+ u8 log_max_srq[0x5];
+ u8 reserved_3[0x10];
+
+ u8 reserved_4[0x8];
+ u8 log_max_cq_sz[0x8];
+ u8 reserved_5[0xb];
+ u8 log_max_cq[0x5];
+
+ u8 log_max_eq_sz[0x8];
+ u8 reserved_6[0x2];
+ u8 log_max_mkey[0x6];
+ u8 reserved_7[0xc];
+ u8 log_max_eq[0x4];
+
+ u8 max_indirection[0x8];
+ u8 reserved_8[0x1];
+ u8 log_max_mrw_sz[0x7];
+ u8 reserved_9[0x2];
+ u8 log_max_bsf_list_size[0x6];
+ u8 reserved_10[0x2];
+ u8 log_max_klm_list_size[0x6];
+
+ u8 reserved_11[0xa];
+ u8 log_max_ra_req_dc[0x6];
+ u8 reserved_12[0xa];
+ u8 log_max_ra_res_dc[0x6];
+
+ u8 reserved_13[0xa];
+ u8 log_max_ra_req_qp[0x6];
+ u8 reserved_14[0xa];
+ u8 log_max_ra_res_qp[0x6];
+
+ u8 pad_cap[0x1];
+ u8 cc_query_allowed[0x1];
+ u8 cc_modify_allowed[0x1];
+ u8 reserved_15[0xd];
+ u8 gid_table_size[0x10];
+
+ u8 out_of_seq_cnt[0x1];
+ u8 vport_counters[0x1];
+ u8 reserved_16[0x4];
+ u8 max_qp_cnt[0xa];
+ u8 pkey_table_size[0x10];
+
+ u8 vport_group_manager[0x1];
+ u8 vhca_group_manager[0x1];
+ u8 ib_virt[0x1];
+ u8 eth_virt[0x1];
+ u8 reserved_17[0x1];
+ u8 ets[0x1];
+ u8 nic_flow_table[0x1];
+ u8 reserved_18[0x4];
+ u8 local_ca_ack_delay[0x5];
+ u8 reserved_19[0x6];
+ u8 port_type[0x2];
+ u8 num_ports[0x8];
+
+ u8 reserved_20[0x3];
+ u8 log_max_msg[0x5];
+ u8 reserved_21[0x18];
+
+ u8 stat_rate_support[0x10];
+ u8 reserved_22[0xc];
+ u8 cqe_version[0x4];
+
+ u8 compact_address_vector[0x1];
+ u8 reserved_23[0xe];
+ u8 drain_sigerr[0x1];
+ u8 cmdif_checksum[0x2];
+ u8 sigerr_cqe[0x1];
+ u8 reserved_24[0x1];
+ u8 wq_signature[0x1];
+ u8 sctr_data_cqe[0x1];
+ u8 reserved_25[0x1];
+ u8 sho[0x1];
+ u8 tph[0x1];
+ u8 rf[0x1];
+ u8 dct[0x1];
+ u8 reserved_26[0x1];
+ u8 eth_net_offloads[0x1];
+ u8 roce[0x1];
+ u8 atomic[0x1];
+ u8 reserved_27[0x1];
+
+ u8 cq_oi[0x1];
+ u8 cq_resize[0x1];
+ u8 cq_moderation[0x1];
+ u8 reserved_28[0x3];
+ u8 cq_eq_remap[0x1];
+ u8 pg[0x1];
+ u8 block_lb_mc[0x1];
+ u8 reserved_29[0x1];
+ u8 scqe_break_moderation[0x1];
+ u8 reserved_30[0x1];
+ u8 cd[0x1];
+ u8 reserved_31[0x1];
+ u8 apm[0x1];
+ u8 reserved_32[0x7];
+ u8 qkv[0x1];
+ u8 pkv[0x1];
+ u8 reserved_33[0x4];
+ u8 xrc[0x1];
+ u8 ud[0x1];
+ u8 uc[0x1];
+ u8 rc[0x1];
+
+ u8 reserved_34[0xa];
+ u8 uar_sz[0x6];
+ u8 reserved_35[0x8];
+ u8 log_pg_sz[0x8];
+
+ u8 bf[0x1];
+ u8 reserved_36[0x1];
+ u8 pad_tx_eth_packet[0x1];
+ u8 reserved_37[0x8];
+ u8 log_bf_reg_size[0x5];
+ u8 reserved_38[0x10];
+
+ u8 reserved_39[0x10];
+ u8 max_wqe_sz_sq[0x10];
+
+ u8 reserved_40[0x10];
+ u8 max_wqe_sz_rq[0x10];
+
+ u8 reserved_41[0x10];
+ u8 max_wqe_sz_sq_dc[0x10];
+
+ u8 reserved_42[0x7];
+ u8 max_qp_mcg[0x19];
+
+ u8 reserved_43[0x18];
+ u8 log_max_mcg[0x8];
+
+ u8 reserved_44[0x3];
+ u8 log_max_transport_domain[0x5];
+ u8 reserved_45[0x3];
+ u8 log_max_pd[0x5];
+ u8 reserved_46[0xb];
+ u8 log_max_xrcd[0x5];
+
+ u8 reserved_47[0x20];
+
+ u8 reserved_48[0x3];
+ u8 log_max_rq[0x5];
+ u8 reserved_49[0x3];
+ u8 log_max_sq[0x5];
+ u8 reserved_50[0x3];
+ u8 log_max_tir[0x5];
+ u8 reserved_51[0x3];
+ u8 log_max_tis[0x5];
+
+ u8 basic_cyclic_rcv_wqe[0x1];
+ u8 reserved_52[0x2];
+ u8 log_max_rmp[0x5];
+ u8 reserved_53[0x3];
+ u8 log_max_rqt[0x5];
+ u8 reserved_54[0x3];
+ u8 log_max_rqt_size[0x5];
+ u8 reserved_55[0x3];
+ u8 log_max_tis_per_sq[0x5];
+
+ u8 reserved_56[0x3];
+ u8 log_max_stride_sz_rq[0x5];
+ u8 reserved_57[0x3];
+ u8 log_min_stride_sz_rq[0x5];
+ u8 reserved_58[0x3];
+ u8 log_max_stride_sz_sq[0x5];
+ u8 reserved_59[0x3];
+ u8 log_min_stride_sz_sq[0x5];
+
+ u8 reserved_60[0x1b];
+ u8 log_max_wq_sz[0x5];
+
+ u8 reserved_61[0xa0];
+
+ u8 reserved_62[0x3];
+ u8 log_max_l2_table[0x5];
+ u8 reserved_63[0x8];
+ u8 log_uar_page_sz[0x10];
+
+ u8 reserved_64[0x100];
+
+ u8 reserved_65[0x1f];
+ u8 cqe_zip[0x1];
+
+ u8 cqe_zip_timeout[0x10];
+ u8 cqe_zip_max_num[0x10];
+
+ u8 reserved_66[0x220];
+};
+
+enum {
+ MLX5_DEST_FORMAT_STRUCT_DESTINATION_TYPE_FLOW_TABLE_ = 0x1,
+ MLX5_DEST_FORMAT_STRUCT_DESTINATION_TYPE_TIR = 0x2,
+};
+
+struct mlx5_ifc_dest_format_struct_bits {
+ u8 destination_type[0x8];
+ u8 destination_id[0x18];
+
+ u8 reserved_0[0x20];
+};
+
+struct mlx5_ifc_fte_match_param_bits {
+ struct mlx5_ifc_fte_match_set_lyr_2_4_bits outer_headers;
+
+ struct mlx5_ifc_fte_match_set_misc_bits misc_parameters;
+
+ struct mlx5_ifc_fte_match_set_lyr_2_4_bits inner_headers;
+
+ u8 reserved_0[0xa00];
+};
+
+enum {
+ MLX5_RX_HASH_FIELD_SELECT_SELECTED_FIELDS_SRC_IP = 0x0,
+ MLX5_RX_HASH_FIELD_SELECT_SELECTED_FIELDS_DST_IP = 0x1,
+ MLX5_RX_HASH_FIELD_SELECT_SELECTED_FIELDS_L4_SPORT = 0x2,
+ MLX5_RX_HASH_FIELD_SELECT_SELECTED_FIELDS_L4_DPORT = 0x3,
+ MLX5_RX_HASH_FIELD_SELECT_SELECTED_FIELDS_IPSEC_SPI = 0x4,
+};
+
+struct mlx5_ifc_rx_hash_field_select_bits {
+ u8 l3_prot_type[0x1];
+ u8 l4_prot_type[0x1];
+ u8 selected_fields[0x1e];
+};
+
+enum {
+ MLX5_WQ_WQ_TYPE_WQ_LINKED_LIST = 0x0,
+ MLX5_WQ_WQ_TYPE_WQ_CYCLIC = 0x1,
+};
+
+enum {
+ MLX5_WQ_END_PADDING_MODE_END_PAD_NONE = 0x0,
+ MLX5_WQ_END_PADDING_MODE_END_PAD_ALIGN = 0x1,
+};
+
+struct mlx5_ifc_wq_bits {
+ u8 wq_type[0x4];
+ u8 wq_signature[0x1];
+ u8 end_padding_mode[0x2];
+ u8 cd_slave[0x1];
+ u8 reserved_0[0x18];
+
+ u8 hds_skip_first_sge[0x1];
+ u8 log2_hds_buf_size[0x3];
+ u8 reserved_1[0x7];
+ u8 page_offset[0x5];
+ u8 lwm[0x10];
+
+ u8 reserved_2[0x8];
+ u8 pd[0x18];
+
+ u8 reserved_3[0x8];
+ u8 uar_page[0x18];
+
+ u8 dbr_addr[0x40];
+
+ u8 hw_counter[0x20];
+
+ u8 sw_counter[0x20];
+
+ u8 reserved_4[0xc];
+ u8 log_wq_stride[0x4];
+ u8 reserved_5[0x3];
+ u8 log_wq_pg_sz[0x5];
+ u8 reserved_6[0x3];
+ u8 log_wq_sz[0x5];
+
+ u8 reserved_7[0x4e0];
+
+ struct mlx5_ifc_cmd_pas_bits pas[0];
+};
+
+struct mlx5_ifc_rq_num_bits {
+ u8 reserved_0[0x8];
+ u8 rq_num[0x18];
+};
+
+struct mlx5_ifc_mac_address_layout_bits {
+ u8 reserved_0[0x10];
+ u8 mac_addr_47_32[0x10];
+
+ u8 mac_addr_31_0[0x20];
+};
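
Note on mlx5_ifc_mac_address_layout_bits: the 48-bit address is split into mac_addr_47_32 and mac_addr_31_0 behind 16 reserved bits. A minimal sketch of packing it, assuming a big-endian layout and that mac[0] is the most significant byte; `mac_layout_pack` is a hypothetical helper, not code from this patch:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* mlx5_ifc_mac_address_layout_bits spans 0x40 bits, i.e. 8 bytes. */
#define MAC_LAYOUT_BYTES 8

/*
 * Hypothetical helper: place a 6-byte MAC into the 8-byte layout.
 * Assumption: big-endian device layout, mac[0] carries bits 47..40.
 */
static void mac_layout_pack(uint8_t out[MAC_LAYOUT_BYTES], const uint8_t mac[6])
{
	assert(out != NULL && mac != NULL);
	memset(out, 0, 2);       /* reserved_0: upper 16 bits stay zero */
	memcpy(out + 2, mac, 6); /* mac_addr_47_32, then mac_addr_31_0 */
}
```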
+
+struct mlx5_ifc_cong_control_r_roce_ecn_np_bits {
+ u8 reserved_0[0xa0];
+
+ u8 min_time_between_cnps[0x20];
+
+ u8 reserved_1[0x12];
+ u8 cnp_dscp[0x6];
+ u8 reserved_2[0x5];
+ u8 cnp_802p_prio[0x3];
+
+ u8 reserved_3[0x720];
+};
+
+struct mlx5_ifc_cong_control_r_roce_ecn_rp_bits {
+ u8 reserved_0[0x60];
+
+ u8 reserved_1[0x4];
+ u8 clamp_tgt_rate[0x1];
+ u8 reserved_2[0x3];
+ u8 clamp_tgt_rate_after_time_inc[0x1];
+ u8 reserved_3[0x17];
+
+ u8 reserved_4[0x20];
+
+ u8 rpg_time_reset[0x20];
+
+ u8 rpg_byte_reset[0x20];
+
+ u8 rpg_threshold[0x20];
+
+ u8 rpg_max_rate[0x20];
+
+ u8 rpg_ai_rate[0x20];
+
+ u8 rpg_hai_rate[0x20];
+
+ u8 rpg_gd[0x20];
+
+ u8 rpg_min_dec_fac[0x20];
+
+ u8 rpg_min_rate[0x20];
+
+ u8 reserved_5[0xe0];
+
+ u8 rate_to_set_on_first_cnp[0x20];
+
+ u8 dce_tcp_g[0x20];
+
+ u8 dce_tcp_rtt[0x20];
+
+ u8 rate_reduce_monitor_period[0x20];
+
+ u8 reserved_6[0x20];
+
+ u8 initial_alpha_value[0x20];
+
+ u8 reserved_7[0x4a0];
+};
+
+struct mlx5_ifc_cong_control_802_1qau_rp_bits {
+ u8 reserved_0[0x80];
+
+ u8 rppp_max_rps[0x20];
+
+ u8 rpg_time_reset[0x20];
+
+ u8 rpg_byte_reset[0x20];
+
+ u8 rpg_threshold[0x20];
+
+ u8 rpg_max_rate[0x20];
+
+ u8 rpg_ai_rate[0x20];
+
+ u8 rpg_hai_rate[0x20];
+
+ u8 rpg_gd[0x20];
+
+ u8 rpg_min_dec_fac[0x20];
+
+ u8 rpg_min_rate[0x20];
+
+ u8 reserved_1[0x640];
+};
+
+enum {
+ MLX5_RESIZE_FIELD_SELECT_RESIZE_FIELD_SELECT_LOG_CQ_SIZE = 0x1,
+ MLX5_RESIZE_FIELD_SELECT_RESIZE_FIELD_SELECT_PAGE_OFFSET = 0x2,
+ MLX5_RESIZE_FIELD_SELECT_RESIZE_FIELD_SELECT_LOG_PAGE_SIZE = 0x4,
+};
+
+struct mlx5_ifc_resize_field_select_bits {
+ u8 resize_field_select[0x20];
+};
+
+enum {
+ MLX5_MODIFY_FIELD_SELECT_MODIFY_FIELD_SELECT_CQ_PERIOD = 0x1,
+ MLX5_MODIFY_FIELD_SELECT_MODIFY_FIELD_SELECT_CQ_MAX_COUNT = 0x2,
+ MLX5_MODIFY_FIELD_SELECT_MODIFY_FIELD_SELECT_OI = 0x4,
+ MLX5_MODIFY_FIELD_SELECT_MODIFY_FIELD_SELECT_C_EQN = 0x8,
+};
+
+struct mlx5_ifc_modify_field_select_bits {
+ u8 modify_field_select[0x20];
+};
+
+struct mlx5_ifc_field_select_r_roce_np_bits {
+ u8 field_select_r_roce_np[0x20];
+};
+
+struct mlx5_ifc_field_select_r_roce_rp_bits {
+ u8 field_select_r_roce_rp[0x20];
+};
+
+enum {
+ MLX5_FIELD_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPPP_MAX_RPS = 0x4,
+ MLX5_FIELD_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPG_TIME_RESET = 0x8,
+ MLX5_FIELD_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPG_BYTE_RESET = 0x10,
+ MLX5_FIELD_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPG_THRESHOLD = 0x20,
+ MLX5_FIELD_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPG_MAX_RATE = 0x40,
+ MLX5_FIELD_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPG_AI_RATE = 0x80,
+ MLX5_FIELD_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPG_HAI_RATE = 0x100,
+ MLX5_FIELD_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPG_GD = 0x200,
+ MLX5_FIELD_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPG_MIN_DEC_FAC = 0x400,
+ MLX5_FIELD_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPG_MIN_RATE = 0x800,
+};
+
+struct mlx5_ifc_field_select_802_1qau_rp_bits {
+ u8 field_select_8021qaurp[0x20];
+};
+
+struct mlx5_ifc_phys_layer_cntrs_bits {
+ u8 time_since_last_clear_high[0x20];
+
+ u8 time_since_last_clear_low[0x20];
+
+ u8 symbol_errors_high[0x20];
+
+ u8 symbol_errors_low[0x20];
+
+ u8 sync_headers_errors_high[0x20];
+
+ u8 sync_headers_errors_low[0x20];
+
+ u8 edpl_bip_errors_lane0_high[0x20];
+
+ u8 edpl_bip_errors_lane0_low[0x20];
+
+ u8 edpl_bip_errors_lane1_high[0x20];
+
+ u8 edpl_bip_errors_lane1_low[0x20];
+
+ u8 edpl_bip_errors_lane2_high[0x20];
+
+ u8 edpl_bip_errors_lane2_low[0x20];
+
+ u8 edpl_bip_errors_lane3_high[0x20];
+
+ u8 edpl_bip_errors_lane3_low[0x20];
+
+ u8 fc_fec_corrected_blocks_lane0_high[0x20];
+
+ u8 fc_fec_corrected_blocks_lane0_low[0x20];
+
+ u8 fc_fec_corrected_blocks_lane1_high[0x20];
+
+ u8 fc_fec_corrected_blocks_lane1_low[0x20];
+
+ u8 fc_fec_corrected_blocks_lane2_high[0x20];
+
+ u8 fc_fec_corrected_blocks_lane2_low[0x20];
+
+ u8 fc_fec_corrected_blocks_lane3_high[0x20];
+
+ u8 fc_fec_corrected_blocks_lane3_low[0x20];
+
+ u8 fc_fec_uncorrectable_blocks_lane0_high[0x20];
+
+ u8 fc_fec_uncorrectable_blocks_lane0_low[0x20];
+
+ u8 fc_fec_uncorrectable_blocks_lane1_high[0x20];
+
+ u8 fc_fec_uncorrectable_blocks_lane1_low[0x20];
+
+ u8 fc_fec_uncorrectable_blocks_lane2_high[0x20];
+
+ u8 fc_fec_uncorrectable_blocks_lane2_low[0x20];
+
+ u8 fc_fec_uncorrectable_blocks_lane3_high[0x20];
+
+ u8 fc_fec_uncorrectable_blocks_lane3_low[0x20];
+
+ u8 rs_fec_corrected_blocks_high[0x20];
+
+ u8 rs_fec_corrected_blocks_low[0x20];
+
+ u8 rs_fec_uncorrectable_blocks_high[0x20];
+
+ u8 rs_fec_uncorrectable_blocks_low[0x20];
+
+ u8 rs_fec_no_errors_blocks_high[0x20];
+
+ u8 rs_fec_no_errors_blocks_low[0x20];
+
+ u8 rs_fec_single_error_blocks_high[0x20];
+
+ u8 rs_fec_single_error_blocks_low[0x20];
+
+ u8 rs_fec_corrected_symbols_total_high[0x20];
+
+ u8 rs_fec_corrected_symbols_total_low[0x20];
+
+ u8 rs_fec_corrected_symbols_lane0_high[0x20];
+
+ u8 rs_fec_corrected_symbols_lane0_low[0x20];
+
+ u8 rs_fec_corrected_symbols_lane1_high[0x20];
+
+ u8 rs_fec_corrected_symbols_lane1_low[0x20];
+
+ u8 rs_fec_corrected_symbols_lane2_high[0x20];
+
+ u8 rs_fec_corrected_symbols_lane2_low[0x20];
+
+ u8 rs_fec_corrected_symbols_lane3_high[0x20];
+
+ u8 rs_fec_corrected_symbols_lane3_low[0x20];
+
+ u8 link_down_events[0x20];
+
+ u8 successful_recovery_events[0x20];
+
+ u8 reserved_0[0x180];
+};
+
+struct mlx5_ifc_eth_per_traffic_grp_data_layout_bits {
+ u8 transmit_queue_high[0x20];
+
+ u8 transmit_queue_low[0x20];
+
+ u8 reserved_0[0x780];
+};
+
+struct mlx5_ifc_eth_per_prio_grp_data_layout_bits {
+ u8 rx_octets_high[0x20];
+
+ u8 rx_octets_low[0x20];
+
+ u8 reserved_0[0xc0];
+
+ u8 rx_frames_high[0x20];
+
+ u8 rx_frames_low[0x20];
+
+ u8 tx_octets_high[0x20];
+
+ u8 tx_octets_low[0x20];
+
+ u8 reserved_1[0xc0];
+
+ u8 tx_frames_high[0x20];
+
+ u8 tx_frames_low[0x20];
+
+ u8 rx_pause_high[0x20];
+
+ u8 rx_pause_low[0x20];
+
+ u8 rx_pause_duration_high[0x20];
+
+ u8 rx_pause_duration_low[0x20];
+
+ u8 tx_pause_high[0x20];
+
+ u8 tx_pause_low[0x20];
+
+ u8 tx_pause_duration_high[0x20];
+
+ u8 tx_pause_duration_low[0x20];
+
+ u8 rx_pause_transition_high[0x20];
+
+ u8 rx_pause_transition_low[0x20];
+
+ u8 reserved_2[0x400];
+};
+
+struct mlx5_ifc_eth_extended_cntrs_grp_data_layout_bits {
+ u8 port_transmit_wait_high[0x20];
+
+ u8 port_transmit_wait_low[0x20];
+
+ u8 reserved_0[0x780];
+};
+
+struct mlx5_ifc_eth_3635_cntrs_grp_data_layout_bits {
+ u8 dot3stats_alignment_errors_high[0x20];
+
+ u8 dot3stats_alignment_errors_low[0x20];
+
+ u8 dot3stats_fcs_errors_high[0x20];
+
+ u8 dot3stats_fcs_errors_low[0x20];
+
+ u8 dot3stats_single_collision_frames_high[0x20];
+
+ u8 dot3stats_single_collision_frames_low[0x20];
+
+ u8 dot3stats_multiple_collision_frames_high[0x20];
+
+ u8 dot3stats_multiple_collision_frames_low[0x20];
+
+ u8 dot3stats_sqe_test_errors_high[0x20];
+
+ u8 dot3stats_sqe_test_errors_low[0x20];
+
+ u8 dot3stats_deferred_transmissions_high[0x20];
+
+ u8 dot3stats_deferred_transmissions_low[0x20];
+
+ u8 dot3stats_late_collisions_high[0x20];
+
+ u8 dot3stats_late_collisions_low[0x20];
+
+ u8 dot3stats_excessive_collisions_high[0x20];
+
+ u8 dot3stats_excessive_collisions_low[0x20];
+
+ u8 dot3stats_internal_mac_transmit_errors_high[0x20];
+
+ u8 dot3stats_internal_mac_transmit_errors_low[0x20];
+
+ u8 dot3stats_carrier_sense_errors_high[0x20];
+
+ u8 dot3stats_carrier_sense_errors_low[0x20];
+
+ u8 dot3stats_frame_too_longs_high[0x20];
+
+ u8 dot3stats_frame_too_longs_low[0x20];
+
+ u8 dot3stats_internal_mac_receive_errors_high[0x20];
+
+ u8 dot3stats_internal_mac_receive_errors_low[0x20];
+
+ u8 dot3stats_symbol_errors_high[0x20];
+
+ u8 dot3stats_symbol_errors_low[0x20];
+
+ u8 dot3control_in_unknown_opcodes_high[0x20];
+
+ u8 dot3control_in_unknown_opcodes_low[0x20];
+
+ u8 dot3in_pause_frames_high[0x20];
+
+ u8 dot3in_pause_frames_low[0x20];
+
+ u8 dot3out_pause_frames_high[0x20];
+
+ u8 dot3out_pause_frames_low[0x20];
+
+ u8 reserved_0[0x3c0];
+};
+
+struct mlx5_ifc_eth_2819_cntrs_grp_data_layout_bits {
+ u8 ether_stats_drop_events_high[0x20];
+
+ u8 ether_stats_drop_events_low[0x20];
+
+ u8 ether_stats_octets_high[0x20];
+
+ u8 ether_stats_octets_low[0x20];
+
+ u8 ether_stats_pkts_high[0x20];
+
+ u8 ether_stats_pkts_low[0x20];
+
+ u8 ether_stats_broadcast_pkts_high[0x20];
+
+ u8 ether_stats_broadcast_pkts_low[0x20];
+
+ u8 ether_stats_multicast_pkts_high[0x20];
+
+ u8 ether_stats_multicast_pkts_low[0x20];
+
+ u8 ether_stats_crc_align_errors_high[0x20];
+
+ u8 ether_stats_crc_align_errors_low[0x20];
+
+ u8 ether_stats_undersize_pkts_high[0x20];
+
+ u8 ether_stats_undersize_pkts_low[0x20];
+
+ u8 ether_stats_oversize_pkts_high[0x20];
+
+ u8 ether_stats_oversize_pkts_low[0x20];
+
+ u8 ether_stats_fragments_high[0x20];
+
+ u8 ether_stats_fragments_low[0x20];
+
+ u8 ether_stats_jabbers_high[0x20];
+
+ u8 ether_stats_jabbers_low[0x20];
+
+ u8 ether_stats_collisions_high[0x20];
+
+ u8 ether_stats_collisions_low[0x20];
+
+ u8 ether_stats_pkts64octets_high[0x20];
+
+ u8 ether_stats_pkts64octets_low[0x20];
+
+ u8 ether_stats_pkts65to127octets_high[0x20];
+
+ u8 ether_stats_pkts65to127octets_low[0x20];
+
+ u8 ether_stats_pkts128to255octets_high[0x20];
+
+ u8 ether_stats_pkts128to255octets_low[0x20];
+
+ u8 ether_stats_pkts256to511octets_high[0x20];
+
+ u8 ether_stats_pkts256to511octets_low[0x20];
+
+ u8 ether_stats_pkts512to1023octets_high[0x20];
+
+ u8 ether_stats_pkts512to1023octets_low[0x20];
+
+ u8 ether_stats_pkts1024to1518octets_high[0x20];
+
+ u8 ether_stats_pkts1024to1518octets_low[0x20];
+
+ u8 ether_stats_pkts1519to2047octets_high[0x20];
+
+ u8 ether_stats_pkts1519to2047octets_low[0x20];
+
+ u8 ether_stats_pkts2048to4095octets_high[0x20];
+
+ u8 ether_stats_pkts2048to4095octets_low[0x20];
+
+ u8 ether_stats_pkts4096to8191octets_high[0x20];
+
+ u8 ether_stats_pkts4096to8191octets_low[0x20];
+
+ u8 ether_stats_pkts8192to10239octets_high[0x20];
+
+ u8 ether_stats_pkts8192to10239octets_low[0x20];
+
+ u8 reserved_0[0x280];
+};
+
+struct mlx5_ifc_eth_2863_cntrs_grp_data_layout_bits {
+ u8 if_in_octets_high[0x20];
+
+ u8 if_in_octets_low[0x20];
+
+ u8 if_in_ucast_pkts_high[0x20];
+
+ u8 if_in_ucast_pkts_low[0x20];
+
+ u8 if_in_discards_high[0x20];
+
+ u8 if_in_discards_low[0x20];
+
+ u8 if_in_errors_high[0x20];
+
+ u8 if_in_errors_low[0x20];
+
+ u8 if_in_unknown_protos_high[0x20];
+
+ u8 if_in_unknown_protos_low[0x20];
+
+ u8 if_out_octets_high[0x20];
+
+ u8 if_out_octets_low[0x20];
+
+ u8 if_out_ucast_pkts_high[0x20];
+
+ u8 if_out_ucast_pkts_low[0x20];
+
+ u8 if_out_discards_high[0x20];
+
+ u8 if_out_discards_low[0x20];
+
+ u8 if_out_errors_high[0x20];
+
+ u8 if_out_errors_low[0x20];
+
+ u8 if_in_multicast_pkts_high[0x20];
+
+ u8 if_in_multicast_pkts_low[0x20];
+
+ u8 if_in_broadcast_pkts_high[0x20];
+
+ u8 if_in_broadcast_pkts_low[0x20];
+
+ u8 if_out_multicast_pkts_high[0x20];
+
+ u8 if_out_multicast_pkts_low[0x20];
+
+ u8 if_out_broadcast_pkts_high[0x20];
+
+ u8 if_out_broadcast_pkts_low[0x20];
+
+ u8 reserved_0[0x480];
+};
+
+struct mlx5_ifc_eth_802_3_cntrs_grp_data_layout_bits {
+ u8 a_frames_transmitted_ok_high[0x20];
+
+ u8 a_frames_transmitted_ok_low[0x20];
+
+ u8 a_frames_received_ok_high[0x20];
+
+ u8 a_frames_received_ok_low[0x20];
+
+ u8 a_frame_check_sequence_errors_high[0x20];
+
+ u8 a_frame_check_sequence_errors_low[0x20];
+
+ u8 a_alignment_errors_high[0x20];
+
+ u8 a_alignment_errors_low[0x20];
+
+ u8 a_octets_transmitted_ok_high[0x20];
+
+ u8 a_octets_transmitted_ok_low[0x20];
+
+ u8 a_octets_received_ok_high[0x20];
+
+ u8 a_octets_received_ok_low[0x20];
+
+ u8 a_multicast_frames_xmitted_ok_high[0x20];
+
+ u8 a_multicast_frames_xmitted_ok_low[0x20];
+
+ u8 a_broadcast_frames_xmitted_ok_high[0x20];
+
+ u8 a_broadcast_frames_xmitted_ok_low[0x20];
+
+ u8 a_multicast_frames_received_ok_high[0x20];
+
+ u8 a_multicast_frames_received_ok_low[0x20];
+
+ u8 a_broadcast_frames_received_ok_high[0x20];
+
+ u8 a_broadcast_frames_received_ok_low[0x20];
+
+ u8 a_in_range_length_errors_high[0x20];
+
+ u8 a_in_range_length_errors_low[0x20];
+
+ u8 a_out_of_range_length_field_high[0x20];
+
+ u8 a_out_of_range_length_field_low[0x20];
+
+ u8 a_frame_too_long_errors_high[0x20];
+
+ u8 a_frame_too_long_errors_low[0x20];
+
+ u8 a_symbol_error_during_carrier_high[0x20];
+
+ u8 a_symbol_error_during_carrier_low[0x20];
+
+ u8 a_mac_control_frames_transmitted_high[0x20];
+
+ u8 a_mac_control_frames_transmitted_low[0x20];
+
+ u8 a_mac_control_frames_received_high[0x20];
+
+ u8 a_mac_control_frames_received_low[0x20];
+
+ u8 a_unsupported_opcodes_received_high[0x20];
+
+ u8 a_unsupported_opcodes_received_low[0x20];
+
+ u8 a_pause_mac_ctrl_frames_received_high[0x20];
+
+ u8 a_pause_mac_ctrl_frames_received_low[0x20];
+
+ u8 a_pause_mac_ctrl_frames_transmitted_high[0x20];
+
+ u8 a_pause_mac_ctrl_frames_transmitted_low[0x20];
+
+ u8 reserved_0[0x300];
+};
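
Note on the *_high/*_low pairs in the counter groups above: each 64-bit counter is split into two consecutive 32-bit fields, high half first. A minimal sketch of joining them, assuming big-endian dwords in the buffer; `ifc_counter64` and `CNTRS_GRP_PAIRS` are hypothetical names, not part of this patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Each counter group is 0x7c0 bits = 248 bytes = 31 high/low pairs. */
#define CNTRS_GRP_PAIRS 31

/*
 * Hypothetical helper: read the pair_idx-th high/low pair from a raw
 * counter group buffer. Assumption: dwords are big-endian and the
 * _high field precedes the _low field, as in the structs above.
 */
static uint64_t ifc_counter64(const uint8_t *grp, size_t pair_idx)
{
	const uint8_t *p = grp + pair_idx * 8;
	uint64_t v = 0;
	size_t i;

	assert(pair_idx < CNTRS_GRP_PAIRS);
	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}
```

In mlx5_ifc_eth_802_3_cntrs_grp_data_layout_bits, pair 0 would be a_frames_transmitted_ok and pair 1 a_frames_received_ok.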
+
+struct mlx5_ifc_cmd_inter_comp_event_bits {
+ u8 command_completion_vector[0x20];
+
+ u8 reserved_0[0xc0];
+};
+
+struct mlx5_ifc_stall_vl_event_bits {
+ u8 reserved_0[0x18];
+ u8 port_num[0x1];
+ u8 reserved_1[0x3];
+ u8 vl[0x4];
+
+ u8 reserved_2[0xa0];
+};
+
+struct mlx5_ifc_db_bf_congestion_event_bits {
+ u8 event_subtype[0x8];
+ u8 reserved_0[0x8];
+ u8 congestion_level[0x8];
+ u8 reserved_1[0x8];
+
+ u8 reserved_2[0xa0];
+};
+
+struct mlx5_ifc_gpio_event_bits {
+ u8 reserved_0[0x60];
+
+ u8 gpio_event_hi[0x20];
+
+ u8 gpio_event_lo[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_port_state_change_event_bits {
+ u8 reserved_0[0x40];
+
+ u8 port_num[0x4];
+ u8 reserved_1[0x1c];
+
+ u8 reserved_2[0x80];
+};
+
+struct mlx5_ifc_dropped_packet_logged_bits {
+ u8 reserved_0[0xe0];
+};
+
+enum {
+ MLX5_CQ_ERROR_SYNDROME_CQ_OVERRUN = 0x1,
+ MLX5_CQ_ERROR_SYNDROME_CQ_ACCESS_VIOLATION_ERROR = 0x2,
+};
+
+struct mlx5_ifc_cq_error_bits {
+ u8 reserved_0[0x8];
+ u8 cqn[0x18];
+
+ u8 reserved_1[0x20];
+
+ u8 reserved_2[0x18];
+ u8 syndrome[0x8];
+
+ u8 reserved_3[0x80];
+};
+
+struct mlx5_ifc_rdma_page_fault_event_bits {
+ u8 bytes_commited[0x20];
+
+ u8 r_key[0x20];
+
+ u8 reserved_0[0x10];
+ u8 packet_len[0x10];
+
+ u8 rdma_op_len[0x20];
+
+ u8 rdma_va[0x40];
+
+ u8 reserved_1[0x5];
+ u8 rdma[0x1];
+ u8 write[0x1];
+ u8 requestor[0x1];
+ u8 qp_number[0x18];
+};
+
+struct mlx5_ifc_wqe_associated_page_fault_event_bits {
+ u8 bytes_committed[0x20];
+
+ u8 reserved_0[0x10];
+ u8 wqe_index[0x10];
+
+ u8 reserved_1[0x10];
+ u8 len[0x10];
+
+ u8 reserved_2[0x60];
+
+ u8 reserved_3[0x5];
+ u8 rdma[0x1];
+ u8 write_read[0x1];
+ u8 requestor[0x1];
+ u8 qpn[0x18];
+};
+
+struct mlx5_ifc_qp_events_bits {
+ u8 reserved_0[0xa0];
+
+ u8 type[0x8];
+ u8 reserved_1[0x18];
+
+ u8 reserved_2[0x8];
+ u8 qpn_rqn_sqn[0x18];
+};
+
+struct mlx5_ifc_dct_events_bits {
+ u8 reserved_0[0xc0];
+
+ u8 reserved_1[0x8];
+ u8 dct_number[0x18];
+};
+
+struct mlx5_ifc_comp_event_bits {
+ u8 reserved_0[0xc0];
+
+ u8 reserved_1[0x8];
+ u8 cq_number[0x18];
+};
+
+enum {
+ MLX5_QPC_STATE_RST = 0x0,
+ MLX5_QPC_STATE_INIT = 0x1,
+ MLX5_QPC_STATE_RTR = 0x2,
+ MLX5_QPC_STATE_RTS = 0x3,
+ MLX5_QPC_STATE_SQER = 0x4,
+ MLX5_QPC_STATE_ERR = 0x6,
+ MLX5_QPC_STATE_SQD = 0x7,
+ MLX5_QPC_STATE_SUSPENDED = 0x9,
+};
+
+enum {
+ MLX5_QPC_ST_RC = 0x0,
+ MLX5_QPC_ST_UC = 0x1,
+ MLX5_QPC_ST_UD = 0x2,
+ MLX5_QPC_ST_XRC = 0x3,
+ MLX5_QPC_ST_DCI = 0x5,
+ MLX5_QPC_ST_QP0 = 0x7,
+ MLX5_QPC_ST_QP1 = 0x8,
+ MLX5_QPC_ST_RAW_DATAGRAM = 0x9,
+ MLX5_QPC_ST_REG_UMR = 0xc,
+};
+
+enum {
+ MLX5_QPC_PM_STATE_ARMED = 0x0,
+ MLX5_QPC_PM_STATE_REARM = 0x1,
+ MLX5_QPC_PM_STATE_RESERVED = 0x2,
+ MLX5_QPC_PM_STATE_MIGRATED = 0x3,
+};
+
+enum {
+ MLX5_QPC_END_PADDING_MODE_SCATTER_AS_IS = 0x0,
+ MLX5_QPC_END_PADDING_MODE_PAD_TO_CACHE_LINE_ALIGNMENT = 0x1,
+};
+
+enum {
+ MLX5_QPC_MTU_256_BYTES = 0x1,
+ MLX5_QPC_MTU_512_BYTES = 0x2,
+ MLX5_QPC_MTU_1K_BYTES = 0x3,
+ MLX5_QPC_MTU_2K_BYTES = 0x4,
+ MLX5_QPC_MTU_4K_BYTES = 0x5,
+ MLX5_QPC_MTU_RAW_ETHERNET_QP = 0x7,
+};
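
Note on the MTU encoding above: values 0x1 to 0x5 step from 256 to 4096 bytes in powers of two, so the byte size can be computed instead of looked up. A minimal sketch; `qpc_mtu_to_bytes` is a hypothetical helper and returning 0 for every other value is a choice made here, not something the patch defines:

```c
#include <stdint.h>

/* Local copy of the enum above. */
enum {
	MLX5_QPC_MTU_256_BYTES = 0x1,
	MLX5_QPC_MTU_512_BYTES = 0x2,
	MLX5_QPC_MTU_1K_BYTES = 0x3,
	MLX5_QPC_MTU_2K_BYTES = 0x4,
	MLX5_QPC_MTU_4K_BYTES = 0x5,
	MLX5_QPC_MTU_RAW_ETHERNET_QP = 0x7,
};

/*
 * Hypothetical helper: convert the 3-bit mtu field to a byte count.
 * Returns 0 for values with no byte size in the enum, including
 * MLX5_QPC_MTU_RAW_ETHERNET_QP.
 */
static uint32_t qpc_mtu_to_bytes(uint32_t mtu)
{
	if (mtu < MLX5_QPC_MTU_256_BYTES || mtu > MLX5_QPC_MTU_4K_BYTES)
		return 0;
	return 128u << mtu; /* 0x1 -> 256, ..., 0x5 -> 4096 */
}
```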
+
+enum {
+ MLX5_QPC_ATOMIC_MODE_IB_SPEC = 0x1,
+ MLX5_QPC_ATOMIC_MODE_ONLY_8B = 0x2,
+ MLX5_QPC_ATOMIC_MODE_UP_TO_8B = 0x3,
+ MLX5_QPC_ATOMIC_MODE_UP_TO_16B = 0x4,
+ MLX5_QPC_ATOMIC_MODE_UP_TO_32B = 0x5,
+ MLX5_QPC_ATOMIC_MODE_UP_TO_64B = 0x6,
+ MLX5_QPC_ATOMIC_MODE_UP_TO_128B = 0x7,
+ MLX5_QPC_ATOMIC_MODE_UP_TO_256B = 0x8,
+};
+
+enum {
+ MLX5_QPC_CS_REQ_DISABLE = 0x0,
+ MLX5_QPC_CS_REQ_UP_TO_32B = 0x11,
+ MLX5_QPC_CS_REQ_UP_TO_64B = 0x22,
+};
+
+enum {
+ MLX5_QPC_CS_RES_DISABLE = 0x0,
+ MLX5_QPC_CS_RES_UP_TO_32B = 0x1,
+ MLX5_QPC_CS_RES_UP_TO_64B = 0x2,
+};
+
+struct mlx5_ifc_qpc_bits {
+ u8 state[0x4];
+ u8 reserved_0[0x4];
+ u8 st[0x8];
+ u8 reserved_1[0x3];
+ u8 pm_state[0x2];
+ u8 reserved_2[0x7];
+ u8 end_padding_mode[0x2];
+ u8 reserved_3[0x2];
+
+ u8 wq_signature[0x1];
+ u8 block_lb_mc[0x1];
+ u8 atomic_like_write_en[0x1];
+ u8 latency_sensitive[0x1];
+ u8 reserved_4[0x1];
+ u8 drain_sigerr[0x1];
+ u8 reserved_5[0x2];
+ u8 pd[0x18];
+
+ u8 mtu[0x3];
+ u8 log_msg_max[0x5];
+ u8 reserved_6[0x1];
+ u8 log_rq_size[0x4];
+ u8 log_rq_stride[0x3];
+ u8 no_sq[0x1];
+ u8 log_sq_size[0x4];
+ u8 reserved_7[0x6];
+ u8 rlky[0x1];
+ u8 reserved_8[0x4];
+
+ u8 counter_set_id[0x8];
+ u8 uar_page[0x18];
+
+ u8 reserved_9[0x8];
+ u8 user_index[0x18];
+
+ u8 reserved_10[0x3];
+ u8 log_page_size[0x5];
+ u8 remote_qpn[0x18];
+
+ struct mlx5_ifc_ads_bits primary_address_path;
+
+ struct mlx5_ifc_ads_bits secondary_address_path;
+
+ u8 log_ack_req_freq[0x4];
+ u8 reserved_11[0x4];
+ u8 log_sra_max[0x3];
+ u8 reserved_12[0x2];
+ u8 retry_count[0x3];
+ u8 rnr_retry[0x3];
+ u8 reserved_13[0x1];
+ u8 fre[0x1];
+ u8 cur_rnr_retry[0x3];
+ u8 cur_retry_count[0x3];
+ u8 reserved_14[0x5];
+
+ u8 reserved_15[0x20];
+
+ u8 reserved_16[0x8];
+ u8 next_send_psn[0x18];
+
+ u8 reserved_17[0x8];
+ u8 cqn_snd[0x18];
+
+ u8 reserved_18[0x40];
+
+ u8 reserved_19[0x8];
+ u8 last_acked_psn[0x18];
+
+ u8 reserved_20[0x8];
+ u8 ssn[0x18];
+
+ u8 reserved_21[0x8];
+ u8 log_rra_max[0x3];
+ u8 reserved_22[0x1];
+ u8 atomic_mode[0x4];
+ u8 rre[0x1];
+ u8 rwe[0x1];
+ u8 rae[0x1];
+ u8 reserved_23[0x1];
+ u8 page_offset[0x6];
+ u8 reserved_24[0x3];
+ u8 cd_slave_receive[0x1];
+ u8 cd_slave_send[0x1];
+ u8 cd_master[0x1];
+
+ u8 reserved_25[0x3];
+ u8 min_rnr_nak[0x5];
+ u8 next_rcv_psn[0x18];
+
+ u8 reserved_26[0x8];
+ u8 xrcd[0x18];
+
+ u8 reserved_27[0x8];
+ u8 cqn_rcv[0x18];
+
+ u8 dbr_addr[0x40];
+
+ u8 q_key[0x20];
+
+ u8 reserved_28[0x5];
+ u8 rq_type[0x3];
+ u8 srqn_rmpn[0x18];
+
+ u8 reserved_29[0x8];
+ u8 rmsn[0x18];
+
+ u8 hw_sq_wqebb_counter[0x10];
+ u8 sw_sq_wqebb_counter[0x10];
+
+ u8 hw_rq_counter[0x20];
+
+ u8 sw_rq_counter[0x20];
+
+ u8 reserved_30[0x20];
+
+ u8 reserved_31[0xf];
+ u8 cgs[0x1];
+ u8 cs_req[0x8];
+ u8 cs_res[0x8];
+
+ u8 dc_access_key[0x40];
+
+ u8 reserved_32[0xc0];
+};
+
+struct mlx5_ifc_roce_addr_layout_bits {
+ u8 source_l3_address[16][0x8];
+
+ u8 reserved_0[0x3];
+ u8 vlan_valid[0x1];
+ u8 vlan_id[0xc];
+ u8 source_mac_47_32[0x10];
+
+ u8 source_mac_31_0[0x20];
+
+ u8 reserved_1[0x14];
+ u8 roce_l3_type[0x4];
+ u8 roce_version[0x8];
+
+ u8 reserved_2[0x20];
+};
+
+union mlx5_ifc_hca_cap_union_bits {
+ struct mlx5_ifc_cmd_hca_cap_bits cmd_hca_cap;
+ struct mlx5_ifc_odp_cap_bits odp_cap;
+ struct mlx5_ifc_atomic_caps_bits atomic_caps;
+ struct mlx5_ifc_roce_cap_bits roce_cap;
+ struct mlx5_ifc_per_protocol_networking_offload_caps_bits per_protocol_networking_offload_caps;
+ struct mlx5_ifc_flow_table_nic_cap_bits flow_table_nic_cap;
+ u8 reserved_0[0x8000];
+};
+
+enum {
+ MLX5_FLOW_CONTEXT_ACTION_ALLOW = 0x1,
+ MLX5_FLOW_CONTEXT_ACTION_DROP = 0x2,
+ MLX5_FLOW_CONTEXT_ACTION_FWD_DEST = 0x4,
+};
+
+struct mlx5_ifc_flow_context_bits {
+ u8 reserved_0[0x20];
+
+ u8 group_id[0x20];
+
+ u8 reserved_1[0x8];
+ u8 flow_tag[0x18];
+
+ u8 reserved_2[0x10];
+ u8 action[0x10];
+
+ u8 reserved_3[0x8];
+ u8 destination_list_size[0x18];
+
+ u8 reserved_4[0x160];
+
+ struct mlx5_ifc_fte_match_param_bits match_value;
+
+ u8 reserved_5[0x600];
+
+ struct mlx5_ifc_dest_format_struct_bits destination[0];
+};
+
+enum {
+ MLX5_XRC_SRQC_STATE_GOOD = 0x0,
+ MLX5_XRC_SRQC_STATE_ERROR = 0x1,
+};
+
+struct mlx5_ifc_xrc_srqc_bits {
+ u8 state[0x4];
+ u8 log_xrc_srq_size[0x4];
+ u8 reserved_0[0x18];
+
+ u8 wq_signature[0x1];
+ u8 cont_srq[0x1];
+ u8 reserved_1[0x1];
+ u8 rlky[0x1];
+ u8 basic_cyclic_rcv_wqe[0x1];
+ u8 log_rq_stride[0x3];
+ u8 xrcd[0x18];
+
+ u8 page_offset[0x6];
+ u8 reserved_2[0x2];
+ u8 cqn[0x18];
+
+ u8 reserved_3[0x20];
+
+ u8 reserved_4[0x2];
+ u8 log_page_size[0x6];
+ u8 user_index[0x18];
+
+ u8 reserved_5[0x20];
+
+ u8 reserved_6[0x8];
+ u8 pd[0x18];
+
+ u8 lwm[0x10];
+ u8 wqe_cnt[0x10];
+
+ u8 reserved_7[0x40];
+
+ u8 dbr_addr[0x40];
+
+ u8 reserved_8[0x80];
+};
+
+struct mlx5_ifc_traffic_counter_bits {
+ u8 packets[0x40];
+
+ u8 octets[0x40];
+};
+
+struct mlx5_ifc_tisc_bits {
+ u8 reserved_0[0xc];
+ u8 prio[0x4];
+ u8 reserved_1[0x10];
+
+ u8 reserved_2[0x100];
+
+ u8 reserved_3[0x8];
+ u8 transport_domain[0x18];
+
+ u8 reserved_4[0x3c0];
+};
+
+enum {
+ MLX5_TIRC_DISP_TYPE_DIRECT = 0x0,
+ MLX5_TIRC_DISP_TYPE_INDIRECT = 0x1,
+};
+
+enum {
+ MLX5_TIRC_LRO_ENABLE_MASK_IPV4_LRO = 0x1,
+ MLX5_TIRC_LRO_ENABLE_MASK_IPV6_LRO = 0x2,
+};
+
+enum {
+ MLX5_TIRC_RX_HASH_FN_HASH_NONE = 0x0,
+ MLX5_TIRC_RX_HASH_FN_HASH_INVERTED_XOR8 = 0x1,
+ MLX5_TIRC_RX_HASH_FN_HASH_TOEPLITZ = 0x2,
+};
+
+enum {
+ MLX5_TIRC_SELF_LB_BLOCK_BLOCK_UNICAST_ = 0x1,
+ MLX5_TIRC_SELF_LB_BLOCK_BLOCK_MULTICAST_ = 0x2,
+};
+
+struct mlx5_ifc_tirc_bits {
+ u8 reserved_0[0x20];
+
+ u8 disp_type[0x4];
+ u8 reserved_1[0x1c];
+
+ u8 reserved_2[0x40];
+
+ u8 reserved_3[0x4];
+ u8 lro_timeout_period_usecs[0x10];
+ u8 lro_enable_mask[0x4];
+ u8 lro_max_ip_payload_size[0x8];
+
+ u8 reserved_4[0x40];
+
+ u8 reserved_5[0x8];
+ u8 inline_rqn[0x18];
+
+ u8 rx_hash_symmetric[0x1];
+ u8 reserved_6[0x1];
+ u8 tunneled_offload_en[0x1];
+ u8 reserved_7[0x5];
+ u8 indirect_table[0x18];
+
+ u8 rx_hash_fn[0x4];
+ u8 reserved_8[0x2];
+ u8 self_lb_block[0x2];
+ u8 transport_domain[0x18];
+
+ u8 rx_hash_toeplitz_key[10][0x20];
+
+ struct mlx5_ifc_rx_hash_field_select_bits rx_hash_field_selector_outer;
+
+ struct mlx5_ifc_rx_hash_field_select_bits rx_hash_field_selector_inner;
+
+ u8 reserved_9[0x4c0];
+};
+
+enum {
+ MLX5_SRQC_STATE_GOOD = 0x0,
+ MLX5_SRQC_STATE_ERROR = 0x1,
+};
+
+struct mlx5_ifc_srqc_bits {
+ u8 state[0x4];
+ u8 log_srq_size[0x4];
+ u8 reserved_0[0x18];
+
+ u8 wq_signature[0x1];
+ u8 cont_srq[0x1];
+ u8 reserved_1[0x1];
+ u8 rlky[0x1];
+ u8 basic_cyclic_rcv_wqe[0x1];
+ u8 log_rq_stride[0x3];
+ u8 xrcd[0x18];
+
+ u8 page_offset[0x6];
+ u8 reserved_2[0x2];
+ u8 cqn[0x18];
+
+ u8 reserved_3[0x20];
+
+ u8 reserved_4[0x2];
+ u8 log_page_size[0x6];
+ u8 user_index[0x18];
+
+ u8 reserved_5[0x20];
+
+ u8 reserved_6[0x8];
+ u8 pd[0x18];
+
+ u8 lwm[0x10];
+ u8 wqe_cnt[0x10];
+
+ u8 reserved_7[0x40];
+
+ u8 dbr_addr[0x40];
+
+ u8 reserved_8[0x80];
+};
+
+enum {
+ MLX5_SQC_STATE_RST = 0x0,
+ MLX5_SQC_STATE_RDY = 0x1,
+ MLX5_SQC_STATE_ERR = 0x3,
+};
+
+struct mlx5_ifc_sqc_bits {
+ u8 rlky[0x1];
+ u8 cd_master[0x1];
+ u8 fre[0x1];
+ u8 flush_in_error_en[0x1];
+ u8 reserved_0[0x4];
+ u8 state[0x4];
+ u8 reserved_1[0x14];
+
+ u8 reserved_2[0x8];
+ u8 user_index[0x18];
+
+ u8 reserved_3[0x8];
+ u8 cqn[0x18];
+
+ u8 reserved_4[0xa0];
+
+ u8 tis_lst_sz[0x10];
+ u8 reserved_5[0x10];
+
+ u8 reserved_6[0x40];
+
+ u8 reserved_7[0x8];
+ u8 tis_num_0[0x18];
+
+ struct mlx5_ifc_wq_bits wq;
+};
+
+struct mlx5_ifc_rqtc_bits {
+ u8 reserved_0[0xa0];
+
+ u8 reserved_1[0x10];
+ u8 rqt_max_size[0x10];
+
+ u8 reserved_2[0x10];
+ u8 rqt_actual_size[0x10];
+
+ u8 reserved_3[0x6a0];
+
+ struct mlx5_ifc_rq_num_bits rq_num[0];
+};
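
Note on the trailing rq_num[0] member: the RQT context has a fixed part followed by a variable number of 32-bit RQ entries, so the buffer size depends on the table size. A minimal sketch of the size calculation, assuming one u8 array element per bit; `rqtc_bytes` is a hypothetical helper, and the local copies use a C99 flexible array member where the patch uses `[0]`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the layouts above, with uint8_t standing in for u8. */
struct mlx5_ifc_rq_num_bits {
	uint8_t reserved_0[0x8];
	uint8_t rq_num[0x18];
};

struct mlx5_ifc_rqtc_bits {
	uint8_t reserved_0[0xa0];

	uint8_t reserved_1[0x10];
	uint8_t rqt_max_size[0x10];

	uint8_t reserved_2[0x10];
	uint8_t rqt_actual_size[0x10];

	uint8_t reserved_3[0x6a0];

	struct mlx5_ifc_rq_num_bits rq_num[]; /* the patch declares [0] */
};

/*
 * Hypothetical helper: bytes needed for an RQT context that carries
 * n_rq entries. sizeof() counts bits here, hence the division by 8.
 */
static size_t rqtc_bytes(size_t n_rq)
{
	assert(n_rq <= 0xffff); /* rqt_actual_size is a 16-bit field */
	return (sizeof(struct mlx5_ifc_rqtc_bits) +
		n_rq * sizeof(struct mlx5_ifc_rq_num_bits)) / 8;
}
```

With these layouts the fixed part is 0x780 bits (240 bytes) and each entry adds 4 bytes.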
+
+enum {
+ MLX5_RQC_MEM_RQ_TYPE_MEMORY_RQ_INLINE = 0x0,
+ MLX5_RQC_MEM_RQ_TYPE_MEMORY_RQ_RMP = 0x1,
+};
+
+enum {
+ MLX5_RQC_STATE_RST = 0x0,
+ MLX5_RQC_STATE_RDY = 0x1,
+ MLX5_RQC_STATE_ERR = 0x3,
+};
+
+struct mlx5_ifc_rqc_bits {
+ u8 rlky[0x1];
+ u8 reserved_0[0x2];
+ u8 vsd[0x1];
+ u8 mem_rq_type[0x4];
+ u8 state[0x4];
+ u8 reserved_1[0x1];
+ u8 flush_in_error_en[0x1];
+ u8 reserved_2[0x12];
+
+ u8 reserved_3[0x8];
+ u8 user_index[0x18];
+
+ u8 reserved_4[0x8];
+ u8 cqn[0x18];
+
+ u8 counter_set_id[0x8];
+ u8 reserved_5[0x18];
+
+ u8 reserved_6[0x8];
+ u8 rmpn[0x18];
+
+ u8 reserved_7[0xe0];
+
+ struct mlx5_ifc_wq_bits wq;
+};
+
+enum {
+ MLX5_RMPC_STATE_RDY = 0x1,
+ MLX5_RMPC_STATE_ERR = 0x3,
+};
+
+struct mlx5_ifc_rmpc_bits {
+ u8 reserved_0[0x8];
+ u8 state[0x4];
+ u8 reserved_1[0x14];
+
+ u8 basic_cyclic_rcv_wqe[0x1];
+ u8 reserved_2[0x1f];
+
+ u8 reserved_3[0x140];
+
+ struct mlx5_ifc_wq_bits wq;
+};
+
+enum {
+ MLX5_NIC_VPORT_CONTEXT_ALLOWED_LIST_TYPE_CURRENT_UC_MAC_ADDRESS = 0x0,
+};
+
+struct mlx5_ifc_nic_vport_context_bits {
+ u8 reserved_0[0x1f];
+ u8 roce_en[0x1];
+
+ u8 reserved_1[0x120];
+
+ u8 system_image_guid[0x40];
+ u8 port_guid[0x40];
+ u8 node_guid[0x40];
+
+ u8 reserved_2[0x140];
+ u8 qkey_violation_counter[0x10];
+ u8 reserved_3[0x435];
+
+ u8 allowed_list_type[0x3];
+ u8 reserved_4[0xc];
+ u8 allowed_list_size[0xc];
+
+ struct mlx5_ifc_mac_address_layout_bits permanent_address;
+
+ u8 reserved_5[0x20];
+
+ u8 current_uc_mac_address[0][0x40];
+};
+
+enum {
+ MLX5_MKC_ACCESS_MODE_PA = 0x0,
+ MLX5_MKC_ACCESS_MODE_MTT = 0x1,
+ MLX5_MKC_ACCESS_MODE_KLMS = 0x2,
+};
+
+struct mlx5_ifc_mkc_bits {
+ u8 reserved_0[0x1];
+ u8 free[0x1];
+ u8 reserved_1[0xd];
+ u8 small_fence_on_rdma_read_response[0x1];
+ u8 umr_en[0x1];
+ u8 a[0x1];
+ u8 rw[0x1];
+ u8 rr[0x1];
+ u8 lw[0x1];
+ u8 lr[0x1];
+ u8 access_mode[0x2];
+ u8 reserved_2[0x8];
+
+ u8 qpn[0x18];
+ u8 mkey_7_0[0x8];
+
+ u8 reserved_3[0x20];
+
+ u8 length64[0x1];
+ u8 bsf_en[0x1];
+ u8 sync_umr[0x1];
+ u8 reserved_4[0x2];
+ u8 expected_sigerr_count[0x1];
+ u8 reserved_5[0x1];
+ u8 en_rinval[0x1];
+ u8 pd[0x18];
+
+ u8 start_addr[0x40];
+
+ u8 len[0x40];
+
+ u8 bsf_octword_size[0x20];
+
+ u8 reserved_6[0x80];
+
+ u8 translations_octword_size[0x20];
+
+ u8 reserved_7[0x1b];
+ u8 log_page_size[0x5];
+
+ u8 reserved_8[0x20];
+};
+
+struct mlx5_ifc_pkey_bits {
+ u8 reserved_0[0x10];
+ u8 pkey[0x10];
+};
+
+struct mlx5_ifc_array128_auto_bits {
+ u8 array128_auto[16][0x8];
+};
+
+struct mlx5_ifc_hca_vport_context_bits {
+ u8 field_select[0x20];
+
+ u8 reserved_0[0xe0];
+
+ u8 sm_virt_aware[0x1];
+ u8 has_smi[0x1];
+ u8 has_raw[0x1];
+ u8 grh_required[0x1];
+ u8 reserved_1[0xc];
+ u8 port_physical_state[0x4];
+ u8 vport_state_policy[0x4];
+ u8 port_state[0x4];
+ u8 vport_state[0x4];
+
+ u8 reserved_2[0x20];
+
+ u8 system_image_guid[0x40];
+
+ u8 port_guid[0x40];
+
+ u8 node_guid[0x40];
+
+ u8 cap_mask1[0x20];
+
+ u8 cap_mask1_field_select[0x20];
+
+ u8 cap_mask2[0x20];
+
+ u8 cap_mask2_field_select[0x20];
+
+ u8 reserved_3[0x80];
+
+ u8 lid[0x10];
+ u8 reserved_4[0x4];
+ u8 init_type_reply[0x4];
+ u8 lmc[0x3];
+ u8 subnet_timeout[0x5];
+
+ u8 sm_lid[0x10];
+ u8 sm_sl[0x4];
+ u8 reserved_5[0xc];
+
+ u8 qkey_violation_counter[0x10];
+ u8 pkey_violation_counter[0x10];
+
+ u8 reserved_6[0xca0];
+};
+
+enum {
+ MLX5_EQC_STATUS_OK = 0x0,
+ MLX5_EQC_STATUS_EQ_WRITE_FAILURE = 0xa,
+};
+
+enum {
+ MLX5_EQC_ST_ARMED = 0x9,
+ MLX5_EQC_ST_FIRED = 0xa,
+};
+
+struct mlx5_ifc_eqc_bits {
+ u8 status[0x4];
+ u8 reserved_0[0x9];
+ u8 ec[0x1];
+ u8 oi[0x1];
+ u8 reserved_1[0x5];
+ u8 st[0x4];
+ u8 reserved_2[0x8];
+
+ u8 reserved_3[0x20];
+
+ u8 reserved_4[0x14];
+ u8 page_offset[0x6];
+ u8 reserved_5[0x6];
+
+ u8 reserved_6[0x3];
+ u8 log_eq_size[0x5];
+ u8 uar_page[0x18];
+
+ u8 reserved_7[0x20];
+
+ u8 reserved_8[0x18];
+ u8 intr[0x8];
+
+ u8 reserved_9[0x3];
+ u8 log_page_size[0x5];
+ u8 reserved_10[0x18];
+
+ u8 reserved_11[0x60];
+
+ u8 reserved_12[0x8];
+ u8 consumer_counter[0x18];
+
+ u8 reserved_13[0x8];
+ u8 producer_counter[0x18];
+
+ u8 reserved_14[0x80];
+};
+
+enum {
+ MLX5_DCTC_STATE_ACTIVE = 0x0,
+ MLX5_DCTC_STATE_DRAINING = 0x1,
+ MLX5_DCTC_STATE_DRAINED = 0x2,
+};
+
+enum {
+ MLX5_DCTC_CS_RES_DISABLE = 0x0,
+ MLX5_DCTC_CS_RES_NA = 0x1,
+ MLX5_DCTC_CS_RES_UP_TO_64B = 0x2,
+};
+
+enum {
+ MLX5_DCTC_MTU_256_BYTES = 0x1,
+ MLX5_DCTC_MTU_512_BYTES = 0x2,
+ MLX5_DCTC_MTU_1K_BYTES = 0x3,
+ MLX5_DCTC_MTU_2K_BYTES = 0x4,
+ MLX5_DCTC_MTU_4K_BYTES = 0x5,
+};
+
+struct mlx5_ifc_dctc_bits {
+ u8 reserved_0[0x4];
+ u8 state[0x4];
+ u8 reserved_1[0x18];
+
+ u8 reserved_2[0x8];
+ u8 user_index[0x18];
+
+ u8 reserved_3[0x8];
+ u8 cqn[0x18];
+
+ u8 counter_set_id[0x8];
+ u8 atomic_mode[0x4];
+ u8 rre[0x1];
+ u8 rwe[0x1];
+ u8 rae[0x1];
+ u8 atomic_like_write_en[0x1];
+ u8 latency_sensitive[0x1];
+ u8 rlky[0x1];
+ u8 free_ar[0x1];
+ u8 reserved_4[0xd];
+
+ u8 reserved_5[0x8];
+ u8 cs_res[0x8];
+ u8 reserved_6[0x3];
+ u8 min_rnr_nak[0x5];
+ u8 reserved_7[0x8];
+
+ u8 reserved_8[0x8];
+ u8 srqn[0x18];
+
+ u8 reserved_9[0x8];
+ u8 pd[0x18];
+
+ u8 tclass[0x8];
+ u8 reserved_10[0x4];
+ u8 flow_label[0x14];
+
+ u8 dc_access_key[0x40];
+
+ u8 reserved_11[0x5];
+ u8 mtu[0x3];
+ u8 port[0x8];
+ u8 pkey_index[0x10];
+
+ u8 reserved_12[0x8];
+ u8 my_addr_index[0x8];
+ u8 reserved_13[0x8];
+ u8 hop_limit[0x8];
+
+ u8 dc_access_key_violation_count[0x20];
+
+ u8 reserved_14[0x14];
+ u8 dei_cfi[0x1];
+ u8 eth_prio[0x3];
+ u8 ecn[0x2];
+ u8 dscp[0x6];
+
+ u8 reserved_15[0x40];
+};
+
+enum {
+ MLX5_CQC_STATUS_OK = 0x0,
+ MLX5_CQC_STATUS_CQ_OVERFLOW = 0x9,
+ MLX5_CQC_STATUS_CQ_WRITE_FAIL = 0xa,
+};
+
+enum {
+ MLX5_CQC_CQE_SZ_64_BYTES = 0x0,
+ MLX5_CQC_CQE_SZ_128_BYTES = 0x1,
+};
+
+enum {
+ MLX5_CQC_ST_SOLICITED_NOTIFICATION_REQUEST_ARMED = 0x6,
+ MLX5_CQC_ST_NOTIFICATION_REQUEST_ARMED = 0x9,
+ MLX5_CQC_ST_FIRED = 0xa,
+};
+
+struct mlx5_ifc_cqc_bits {
+ u8 status[0x4];
+ u8 reserved_0[0x4];
+ u8 cqe_sz[0x3];
+ u8 cc[0x1];
+ u8 reserved_1[0x1];
+ u8 scqe_break_moderation_en[0x1];
+ u8 oi[0x1];
+ u8 reserved_2[0x2];
+ u8 cqe_zip_en[0x1];
+ u8 mini_cqe_res_format[0x2];
+ u8 st[0x4];
+ u8 reserved_3[0x8];
+
+ u8 reserved_4[0x20];
+
+ u8 reserved_5[0x14];
+ u8 page_offset[0x6];
+ u8 reserved_6[0x6];
+
+ u8 reserved_7[0x3];
+ u8 log_cq_size[0x5];
+ u8 uar_page[0x18];
+
+ u8 reserved_8[0x4];
+ u8 cq_period[0xc];
+ u8 cq_max_count[0x10];
+
+ u8 reserved_9[0x18];
+ u8 c_eqn[0x8];
+
+ u8 reserved_10[0x3];
+ u8 log_page_size[0x5];
+ u8 reserved_11[0x18];
+
+ u8 reserved_12[0x20];
+
+ u8 reserved_13[0x8];
+ u8 last_notified_index[0x18];
+
+ u8 reserved_14[0x8];
+ u8 last_solicit_index[0x18];
+
+ u8 reserved_15[0x8];
+ u8 consumer_counter[0x18];
+
+ u8 reserved_16[0x8];
+ u8 producer_counter[0x18];
+
+ u8 reserved_17[0x40];
+
+ u8 dbr_addr[0x40];
+};
+
+union mlx5_ifc_cong_control_roce_ecn_auto_bits {
+ struct mlx5_ifc_cong_control_802_1qau_rp_bits cong_control_802_1qau_rp;
+ struct mlx5_ifc_cong_control_r_roce_ecn_rp_bits cong_control_r_roce_ecn_rp;
+ struct mlx5_ifc_cong_control_r_roce_ecn_np_bits cong_control_r_roce_ecn_np;
+ u8 reserved_0[0x800];
+};
+
+struct mlx5_ifc_query_adapter_param_block_bits {
+ u8 reserved_0[0xc0];
+
+ u8 reserved_1[0x8];
+ u8 ieee_vendor_id[0x18];
+
+ u8 reserved_2[0x10];
+ u8 vsd_vendor_id[0x10];
+
+ u8 vsd[208][0x8]; /* 208 entries of 8 bits each, i.e. 208 bytes */
+
+ u8 vsd_contd_psid[16][0x8];
+};
+
+union mlx5_ifc_modify_field_select_resize_field_select_auto_bits {
+ struct mlx5_ifc_modify_field_select_bits modify_field_select;
+ struct mlx5_ifc_resize_field_select_bits resize_field_select;
+ u8 reserved_0[0x20];
+};
+
+union mlx5_ifc_field_select_802_1_r_roce_auto_bits {
+ struct mlx5_ifc_field_select_802_1qau_rp_bits field_select_802_1qau_rp;
+ struct mlx5_ifc_field_select_r_roce_rp_bits field_select_r_roce_rp;
+ struct mlx5_ifc_field_select_r_roce_np_bits field_select_r_roce_np;
+ u8 reserved_0[0x20];
+};
+
+union mlx5_ifc_eth_cntrs_grp_data_layout_auto_bits {
+ struct mlx5_ifc_eth_802_3_cntrs_grp_data_layout_bits eth_802_3_cntrs_grp_data_layout;
+ struct mlx5_ifc_eth_2863_cntrs_grp_data_layout_bits eth_2863_cntrs_grp_data_layout;
+ struct mlx5_ifc_eth_2819_cntrs_grp_data_layout_bits eth_2819_cntrs_grp_data_layout;
+ struct mlx5_ifc_eth_3635_cntrs_grp_data_layout_bits eth_3635_cntrs_grp_data_layout;
+ struct mlx5_ifc_eth_extended_cntrs_grp_data_layout_bits eth_extended_cntrs_grp_data_layout;
+ struct mlx5_ifc_eth_per_prio_grp_data_layout_bits eth_per_prio_grp_data_layout;
+ struct mlx5_ifc_eth_per_traffic_grp_data_layout_bits eth_per_traffic_grp_data_layout;
+ struct mlx5_ifc_phys_layer_cntrs_bits phys_layer_cntrs;
+ u8 reserved_0[0x7c0];
+};
+
+union mlx5_ifc_event_auto_bits {
+ struct mlx5_ifc_comp_event_bits comp_event;
+ struct mlx5_ifc_dct_events_bits dct_events;
+ struct mlx5_ifc_qp_events_bits qp_events;
+ struct mlx5_ifc_wqe_associated_page_fault_event_bits wqe_associated_page_fault_event;
+ struct mlx5_ifc_rdma_page_fault_event_bits rdma_page_fault_event;
+ struct mlx5_ifc_cq_error_bits cq_error;
+ struct mlx5_ifc_dropped_packet_logged_bits dropped_packet_logged;
+ struct mlx5_ifc_port_state_change_event_bits port_state_change_event;
+ struct mlx5_ifc_gpio_event_bits gpio_event;
+ struct mlx5_ifc_db_bf_congestion_event_bits db_bf_congestion_event;
+ struct mlx5_ifc_stall_vl_event_bits stall_vl_event;
+ struct mlx5_ifc_cmd_inter_comp_event_bits cmd_inter_comp_event;
+ u8 reserved_0[0xe0];
+};
+
+struct mlx5_ifc_health_buffer_bits {
+ u8 reserved_0[0x100];
+
+ u8 assert_existptr[0x20];
+
+ u8 assert_callra[0x20];
+
+ u8 reserved_1[0x40];
+
+ u8 fw_version[0x20];
+
+ u8 hw_id[0x20];
+
+ u8 reserved_2[0x20];
+
+ u8 irisc_index[0x8];
+ u8 synd[0x8];
+ u8 ext_synd[0x10];
+};
+
+struct mlx5_ifc_register_loopback_control_bits {
+ u8 no_lb[0x1];
+ u8 reserved_0[0x7];
+ u8 port[0x8];
+ u8 reserved_1[0x10];
+
+ u8 reserved_2[0x60];
+};
+
+struct mlx5_ifc_teardown_hca_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+enum {
+ MLX5_TEARDOWN_HCA_IN_PROFILE_GRACEFUL_CLOSE = 0x0,
+ MLX5_TEARDOWN_HCA_IN_PROFILE_PANIC_CLOSE = 0x1,
+};
+
+struct mlx5_ifc_teardown_hca_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x10];
+ u8 profile[0x10];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_sqerr2rts_qp_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_sqerr2rts_qp_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 qpn[0x18];
+
+ u8 reserved_3[0x20];
+
+ u8 opt_param_mask[0x20];
+
+ u8 reserved_4[0x20];
+
+ struct mlx5_ifc_qpc_bits qpc;
+
+ u8 reserved_5[0x80];
+};
+
+struct mlx5_ifc_sqd2rts_qp_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_sqd2rts_qp_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 qpn[0x18];
+
+ u8 reserved_3[0x20];
+
+ u8 opt_param_mask[0x20];
+
+ u8 reserved_4[0x20];
+
+ struct mlx5_ifc_qpc_bits qpc;
+
+ u8 reserved_5[0x80];
+};
+
+struct mlx5_ifc_set_roce_address_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_set_roce_address_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 roce_address_index[0x10];
+ u8 reserved_2[0x10];
+
+ u8 reserved_3[0x20];
+
+ struct mlx5_ifc_roce_addr_layout_bits roce_address;
+};
+
+struct mlx5_ifc_set_mad_demux_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+enum {
+ MLX5_SET_MAD_DEMUX_IN_DEMUX_MODE_PASS_ALL = 0x0,
+ MLX5_SET_MAD_DEMUX_IN_DEMUX_MODE_SELECTIVE = 0x2,
+};
+
+struct mlx5_ifc_set_mad_demux_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x20];
+
+ u8 reserved_3[0x6];
+ u8 demux_mode[0x2];
+ u8 reserved_4[0x18];
+};
+
+struct mlx5_ifc_set_l2_table_entry_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_set_l2_table_entry_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x60];
+
+ u8 reserved_3[0x8];
+ u8 table_index[0x18];
+
+ u8 reserved_4[0x20];
+
+ u8 reserved_5[0x13];
+ u8 vlan_valid[0x1];
+ u8 vlan[0xc];
+
+ struct mlx5_ifc_mac_address_layout_bits mac_address;
+
+ u8 reserved_6[0xc0];
+};
+
+struct mlx5_ifc_set_issi_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_set_issi_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x10];
+ u8 current_issi[0x10];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_set_hca_cap_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_set_hca_cap_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+
+ union mlx5_ifc_hca_cap_union_bits capability;
+};
+
+struct mlx5_ifc_set_fte_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_set_fte_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+
+ u8 table_type[0x8];
+ u8 reserved_3[0x18];
+
+ u8 reserved_4[0x8];
+ u8 table_id[0x18];
+
+ u8 reserved_5[0x40];
+
+ u8 flow_index[0x20];
+
+ u8 reserved_6[0xe0];
+
+ struct mlx5_ifc_flow_context_bits flow_context;
+};
+
+struct mlx5_ifc_rts2rts_qp_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_rst2init_qp_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 qpn[0x18];
+
+ u8 reserved_3[0x20];
+
+ u8 opt_param_mask[0x20];
+
+ u8 reserved_4[0x20];
+
+ struct mlx5_ifc_qpc_bits qpc;
+
+ u8 reserved_5[0x80];
+};
+
+struct mlx5_ifc_rst2init_qp_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_rts2rts_qp_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 qpn[0x18];
+
+ u8 reserved_3[0x20];
+
+ u8 opt_param_mask[0x20];
+
+ u8 reserved_4[0x20];
+
+ struct mlx5_ifc_qpc_bits qpc;
+
+ u8 reserved_5[0x80];
+};
+
+struct mlx5_ifc_rtr2rts_qp_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_rtr2rts_qp_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 qpn[0x18];
+
+ u8 reserved_3[0x20];
+
+ u8 opt_param_mask[0x20];
+
+ u8 reserved_4[0x20];
+
+ struct mlx5_ifc_qpc_bits qpc;
+
+ u8 reserved_5[0x80];
+};
+
+struct mlx5_ifc_query_xrc_srq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ struct mlx5_ifc_xrc_srqc_bits xrc_srq_context_entry;
+
+ u8 reserved_2[0x600];
+
+ u8 pas[0][0x40]; /* zero-length trailing array; each entry is 64 bits */
+};
+
+struct mlx5_ifc_query_xrc_srq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 xrc_srqn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+enum {
+ MLX5_QUERY_VPORT_STATE_OUT_STATE_DOWN = 0x0,
+ MLX5_QUERY_VPORT_STATE_OUT_STATE_UP = 0x1,
+};
+
+struct mlx5_ifc_query_vport_state_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x20];
+
+ u8 reserved_2[0x18];
+ u8 admin_state[0x4];
+ u8 state[0x4];
+};
+
+enum {
+ MLX5_QUERY_VPORT_STATE_IN_OP_MOD_VNIC_VPORT = 0x0,
+};
+
+struct mlx5_ifc_query_vport_state_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 other_vport[0x1];
+ u8 reserved_2[0xf];
+ u8 vport_number[0x10];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_query_vport_counter_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ struct mlx5_ifc_traffic_counter_bits received_errors;
+
+ struct mlx5_ifc_traffic_counter_bits transmit_errors;
+
+ struct mlx5_ifc_traffic_counter_bits received_ib_unicast;
+
+ struct mlx5_ifc_traffic_counter_bits transmitted_ib_unicast;
+
+ struct mlx5_ifc_traffic_counter_bits received_ib_multicast;
+
+ struct mlx5_ifc_traffic_counter_bits transmitted_ib_multicast;
+
+ struct mlx5_ifc_traffic_counter_bits received_eth_broadcast;
+
+ struct mlx5_ifc_traffic_counter_bits transmitted_eth_broadcast;
+
+ struct mlx5_ifc_traffic_counter_bits received_eth_unicast;
+
+ struct mlx5_ifc_traffic_counter_bits transmitted_eth_unicast;
+
+ struct mlx5_ifc_traffic_counter_bits received_eth_multicast;
+
+ struct mlx5_ifc_traffic_counter_bits transmitted_eth_multicast;
+
+ u8 reserved_2[0xa00];
+};
+
+enum {
+ MLX5_QUERY_VPORT_COUNTER_IN_OP_MOD_VPORT_COUNTERS = 0x0,
+};
+
+struct mlx5_ifc_query_vport_counter_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 other_vport[0x1];
+ u8 reserved_2[0xb];
+ u8 port_num[0x4];
+ u8 vport_number[0x10];
+
+ u8 reserved_3[0x60];
+
+ u8 clear[0x1];
+ u8 reserved_4[0x1f];
+
+ u8 reserved_5[0x20];
+};
+
+struct mlx5_ifc_query_tis_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ struct mlx5_ifc_tisc_bits tis_context;
+};
+
+struct mlx5_ifc_query_tis_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 tisn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_query_tir_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0xc0];
+
+ struct mlx5_ifc_tirc_bits tir_context;
+};
+
+struct mlx5_ifc_query_tir_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 tirn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_query_srq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ struct mlx5_ifc_srqc_bits srq_context_entry;
+
+ u8 reserved_2[0x600];
+
+ u8 pas[0][0x40];
+};
+
+struct mlx5_ifc_query_srq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 srqn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_query_sq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0xc0];
+
+ struct mlx5_ifc_sqc_bits sq_context;
+};
+
+struct mlx5_ifc_query_sq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 sqn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_query_special_contexts_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x20];
+
+ u8 resd_lkey[0x20];
+};
+
+struct mlx5_ifc_query_special_contexts_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+};
+
+struct mlx5_ifc_query_rqt_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0xc0];
+
+ struct mlx5_ifc_rqtc_bits rqt_context;
+};
+
+struct mlx5_ifc_query_rqt_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 rqtn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_query_rq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0xc0];
+
+ struct mlx5_ifc_rqc_bits rq_context;
+};
+
+struct mlx5_ifc_query_rq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 rqn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_query_roce_address_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ struct mlx5_ifc_roce_addr_layout_bits roce_address;
+};
+
+struct mlx5_ifc_query_roce_address_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 roce_address_index[0x10];
+ u8 reserved_2[0x10];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_query_rmp_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0xc0];
+
+ struct mlx5_ifc_rmpc_bits rmp_context;
+};
+
+struct mlx5_ifc_query_rmp_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 rmpn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_query_qp_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ u8 opt_param_mask[0x20];
+
+ u8 reserved_2[0x20];
+
+ struct mlx5_ifc_qpc_bits qpc;
+
+ u8 reserved_3[0x80];
+
+ u8 pas[0][0x40];
+};
+
+struct mlx5_ifc_query_qp_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 qpn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_query_q_counter_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ u8 rx_write_requests[0x20];
+
+ u8 reserved_2[0x20];
+
+ u8 rx_read_requests[0x20];
+
+ u8 reserved_3[0x20];
+
+ u8 rx_atomic_requests[0x20];
+
+ u8 reserved_4[0x20];
+
+ u8 rx_dct_connect[0x20];
+
+ u8 reserved_5[0x20];
+
+ u8 out_of_buffer[0x20];
+
+ u8 reserved_6[0x20];
+
+ u8 out_of_sequence[0x20];
+
+ u8 reserved_7[0x620];
+};
+
+struct mlx5_ifc_query_q_counter_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x80];
+
+ u8 clear[0x1];
+ u8 reserved_3[0x1f];
+
+ u8 reserved_4[0x18];
+ u8 counter_set_id[0x8];
+};
+
+struct mlx5_ifc_query_pages_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x10];
+ u8 function_id[0x10];
+
+ u8 num_pages[0x20];
+};
+
+enum {
+ MLX5_QUERY_PAGES_IN_OP_MOD_BOOT_PAGES = 0x1,
+ MLX5_QUERY_PAGES_IN_OP_MOD_INIT_PAGES = 0x2,
+ MLX5_QUERY_PAGES_IN_OP_MOD_REGULAR_PAGES = 0x3,
+};
+
+struct mlx5_ifc_query_pages_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x10];
+ u8 function_id[0x10];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_query_nic_vport_context_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ struct mlx5_ifc_nic_vport_context_bits nic_vport_context;
+};
+
+struct mlx5_ifc_query_nic_vport_context_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 other_vport[0x1];
+ u8 reserved_2[0xf];
+ u8 vport_number[0x10];
+
+ u8 reserved_3[0x5];
+ u8 allowed_list_type[0x3];
+ u8 reserved_4[0x18];
+};
+
+struct mlx5_ifc_query_mkey_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ struct mlx5_ifc_mkc_bits memory_key_mkey_entry;
+
+ u8 reserved_2[0x600];
+
+ u8 bsf0_klm0_pas_mtt0_1[16][0x8];
+
+ u8 bsf1_klm1_pas_mtt2_3[16][0x8];
+};
+
+struct mlx5_ifc_query_mkey_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 mkey_index[0x18];
+
+ u8 pg_access[0x1];
+ u8 reserved_3[0x1f];
+};
+
+struct mlx5_ifc_query_mad_demux_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ u8 mad_dumux_parameters_block[0x20];
+};
+
+struct mlx5_ifc_query_mad_demux_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+};
+
+struct mlx5_ifc_query_l2_table_entry_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0xa0];
+
+ u8 reserved_2[0x13];
+ u8 vlan_valid[0x1];
+ u8 vlan[0xc];
+
+ struct mlx5_ifc_mac_address_layout_bits mac_address;
+
+ u8 reserved_3[0xc0];
+};
+
+struct mlx5_ifc_query_l2_table_entry_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x60];
+
+ u8 reserved_3[0x8];
+ u8 table_index[0x18];
+
+ u8 reserved_4[0x140];
+};
+
+struct mlx5_ifc_query_issi_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x10];
+ u8 current_issi[0x10];
+
+ u8 reserved_2[0xa0];
+
+ u8 supported_issi_reserved[76][0x8];
+ u8 supported_issi_dw0[0x20];
+};
+
+struct mlx5_ifc_query_issi_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+};
+
+struct mlx5_ifc_query_hca_vport_pkey_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ struct mlx5_ifc_pkey_bits pkey[0];
+};
+
+struct mlx5_ifc_query_hca_vport_gid_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 other_vport[0x1];
+ u8 reserved_2[0xb];
+ u8 port_num[0x4];
+ u8 vport_number[0x10];
+
+ u8 reserved_3[0x10];
+ u8 gid_index[0x10];
+};
+
+struct mlx5_ifc_query_hca_vport_context_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ struct mlx5_ifc_hca_vport_context_bits hca_vport_context;
+};
+
+struct mlx5_ifc_query_hca_vport_context_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 other_vport[0x1];
+ u8 reserved_2[0xb];
+ u8 port_num[0x4];
+ u8 vport_number[0x10];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_flow_table_field_bitmask_bits {
+ u8 outer_dmac[0x1];
+ u8 outer_smac[0x1];
+ u8 outer_ether_type[0x1];
+ u8 reserved_0[0x1];
+ u8 outer_first_prio[0x1];
+ u8 outer_first_cfi[0x1];
+ u8 outer_first_vid[0x1];
+ u8 reserved_1[0x1];
+ u8 outer_second_prio[0x1];
+ u8 outer_second_cfi[0x1];
+ u8 outer_second_vid[0x1];
+ u8 reserved_2[0x1];
+ u8 outer_sip[0x1];
+ u8 outer_dip[0x1];
+ u8 outer_frag[0x1];
+ u8 outer_ip_protocol[0x1];
+ u8 reserved_3[0x4];
+ u8 outer_l4_sport[0x1];
+ u8 outer_l4_dport[0x1];
+ u8 outer_tcp_flags[0x1];
+ u8 outer_gre_protcol[0x1];
+ u8 outer_gre_key[0x1];
+ u8 outer_vxlan_vni[0x1];
+ u8 reserved_4[0x5];
+ u8 source_eswitch_port[0x1];
+
+ u8 inner_dmac[0x1];
+ u8 inner_smac[0x1];
+ u8 inner_ether_type[0x1];
+ u8 reserved_5[0x1];
+ u8 inner_first_prio[0x1];
+ u8 inner_first_cfi[0x1];
+ u8 inner_first_vid[0x1];
+ u8 reserved_6[0x1];
+ u8 inner_second_prio[0x1];
+ u8 inner_second_cfi[0x1];
+ u8 inner_second_vid[0x1];
+ u8 reserved_7[0x1];
+ u8 inner_sip[0x1];
+ u8 inner_dip[0x1];
+ u8 inner_frag[0x1];
+ u8 inner_ip_protocol[0x1];
+ u8 reserved_8[0x4];
+ u8 inner_l4_sport[0x1];
+ u8 inner_l4_dport[0x1];
+ u8 inner_tcp_flags[0x1];
+ u8 reserved_9[0x9];
+
+ u8 reserved_10[0x40];
+};
+
+union mlx5_ifc_cmd_hca_cap_odp_cap_atomic_caps_roce_cap_per_protocol_networking_offload_caps_auto_bits {
+ struct mlx5_ifc_cmd_hca_cap_bits cmd_hca_cap;
+ struct mlx5_ifc_odp_cap_bits odp_cap;
+ struct mlx5_ifc_atomic_caps_bits atomic_caps;
+ struct mlx5_ifc_roce_cap_bits roce_cap;
+ struct mlx5_ifc_per_protocol_networking_offload_caps_bits per_protocol_networking_offload_caps;
+ u8 reserved_0[0x800];
+};
+
+struct mlx5_ifc_query_hca_cap_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ union mlx5_ifc_hca_cap_union_bits capability;
+};
+
+struct mlx5_ifc_query_hca_cap_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+};
+
+struct mlx5_ifc_query_flow_table_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x80];
+
+ u8 reserved_2[0x8];
+ u8 level[0x8];
+ u8 reserved_3[0x8];
+ u8 log_size[0x8];
+
+ u8 reserved_4[0x120];
+};
+
+struct mlx5_ifc_query_flow_table_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+
+ u8 table_type[0x8];
+ u8 reserved_3[0x18];
+
+ u8 reserved_4[0x8];
+ u8 table_id[0x18];
+
+ u8 reserved_5[0x140];
+};
+
+struct mlx5_ifc_query_fte_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x1c0];
+
+ struct mlx5_ifc_flow_context_bits flow_context;
+};
+
+struct mlx5_ifc_query_fte_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+
+ u8 table_type[0x8];
+ u8 reserved_3[0x18];
+
+ u8 reserved_4[0x8];
+ u8 table_id[0x18];
+
+ u8 reserved_5[0x40];
+
+ u8 flow_index[0x20];
+
+ u8 reserved_6[0xe0];
+};
+
+enum {
+ MLX5_QUERY_FLOW_GROUP_OUT_MATCH_CRITERIA_ENABLE_OUTER_HEADERS = 0x0,
+ MLX5_QUERY_FLOW_GROUP_OUT_MATCH_CRITERIA_ENABLE_MISC_PARAMETERS = 0x1,
+ MLX5_QUERY_FLOW_GROUP_OUT_MATCH_CRITERIA_ENABLE_INNER_HEADERS = 0x2,
+};
+
+struct mlx5_ifc_query_flow_group_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0xa0];
+
+ u8 start_flow_index[0x20];
+
+ u8 reserved_2[0x20];
+
+ u8 end_flow_index[0x20];
+
+ u8 reserved_3[0xa0];
+
+ u8 reserved_4[0x18];
+ u8 match_criteria_enable[0x8];
+
+ struct mlx5_ifc_fte_match_param_bits match_criteria;
+
+ u8 reserved_5[0xe00];
+};
+
+struct mlx5_ifc_query_flow_group_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+
+ u8 table_type[0x8];
+ u8 reserved_3[0x18];
+
+ u8 reserved_4[0x8];
+ u8 table_id[0x18];
+
+ u8 group_id[0x20];
+
+ u8 reserved_5[0x120];
+};
+
+struct mlx5_ifc_query_eq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ struct mlx5_ifc_eqc_bits eq_context_entry;
+
+ u8 reserved_2[0x40];
+
+ u8 event_bitmask[0x40];
+
+ u8 reserved_3[0x580];
+
+ u8 pas[0][0x40];
+};
+
+struct mlx5_ifc_query_eq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x18];
+ u8 eq_number[0x8];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_query_dct_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ struct mlx5_ifc_dctc_bits dct_context_entry;
+
+ u8 reserved_2[0x180];
+};
+
+struct mlx5_ifc_query_dct_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 dctn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_query_cq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ struct mlx5_ifc_cqc_bits cq_context;
+
+ u8 reserved_2[0x600];
+
+ u8 pas[0][0x40];
+};
+
+struct mlx5_ifc_query_cq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 cqn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_query_cong_status_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x20];
+
+ u8 enable[0x1];
+ u8 tag_enable[0x1];
+ u8 reserved_2[0x1e];
+};
+
+struct mlx5_ifc_query_cong_status_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x18];
+ u8 priority[0x4];
+ u8 cong_protocol[0x4];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_query_cong_statistics_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ u8 cur_flows[0x20];
+
+ u8 sum_flows[0x20];
+
+ u8 cnp_ignored_high[0x20];
+
+ u8 cnp_ignored_low[0x20];
+
+ u8 cnp_handled_high[0x20];
+
+ u8 cnp_handled_low[0x20];
+
+ u8 reserved_2[0x100];
+
+ u8 time_stamp_high[0x20];
+
+ u8 time_stamp_low[0x20];
+
+ u8 accumulators_period[0x20];
+
+ u8 ecn_marked_roce_packets_high[0x20];
+
+ u8 ecn_marked_roce_packets_low[0x20];
+
+ u8 cnps_sent_high[0x20];
+
+ u8 cnps_sent_low[0x20];
+
+ u8 reserved_3[0x560];
+};
+
+struct mlx5_ifc_query_cong_statistics_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 clear[0x1];
+ u8 reserved_2[0x1f];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_query_cong_params_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ union mlx5_ifc_cong_control_roce_ecn_auto_bits congestion_parameters;
+};
+
+struct mlx5_ifc_query_cong_params_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x1c];
+ u8 cong_protocol[0x4];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_query_adapter_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ struct mlx5_ifc_query_adapter_param_block_bits query_adapter_struct;
+};
+
+struct mlx5_ifc_query_adapter_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+};
+
+struct mlx5_ifc_qp_2rst_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_qp_2rst_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 qpn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_qp_2err_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_qp_2err_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 qpn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_page_fault_resume_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_page_fault_resume_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 error[0x1];
+ u8 reserved_2[0x4];
+ u8 rdma[0x1];
+ u8 read_write[0x1];
+ u8 req_res[0x1];
+ u8 qpn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_nop_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_nop_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+};
+
+struct mlx5_ifc_modify_vport_state_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_modify_vport_state_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 other_vport[0x1];
+ u8 reserved_2[0xf];
+ u8 vport_number[0x10];
+
+ u8 reserved_3[0x18];
+ u8 admin_state[0x4];
+ u8 reserved_4[0x4];
+};
+
+struct mlx5_ifc_modify_tis_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_modify_tis_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 tisn[0x18];
+
+ u8 reserved_3[0x20];
+
+ u8 modify_bitmask[0x40];
+
+ u8 reserved_4[0x40];
+
+ struct mlx5_ifc_tisc_bits ctx;
+};
+
+struct mlx5_ifc_modify_tir_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_modify_tir_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 tirn[0x18];
+
+ u8 reserved_3[0x20];
+
+ u8 modify_bitmask[0x40];
+
+ u8 reserved_4[0x40];
+
+ struct mlx5_ifc_tirc_bits ctx;
+};
+
+struct mlx5_ifc_modify_sq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_modify_sq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 sq_state[0x4];
+ u8 reserved_2[0x4];
+ u8 sqn[0x18];
+
+ u8 reserved_3[0x20];
+
+ u8 modify_bitmask[0x40];
+
+ u8 reserved_4[0x40];
+
+ struct mlx5_ifc_sqc_bits ctx;
+};
+
+struct mlx5_ifc_modify_rqt_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_modify_rqt_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 rqtn[0x18];
+
+ u8 reserved_3[0x20];
+
+ u8 modify_bitmask[0x40];
+
+ u8 reserved_4[0x40];
+
+ struct mlx5_ifc_rqtc_bits ctx;
+};
+
+struct mlx5_ifc_modify_rq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_modify_rq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 rq_state[0x4];
+ u8 reserved_2[0x4];
+ u8 rqn[0x18];
+
+ u8 reserved_3[0x20];
+
+ u8 modify_bitmask[0x40];
+
+ u8 reserved_4[0x40];
+
+ struct mlx5_ifc_rqc_bits ctx;
+};
+
+struct mlx5_ifc_modify_rmp_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_rmp_bitmask_bits {
+ u8 reserved[0x20];
+
+ u8 reserved1[0x1f];
+ u8 lwm[0x1];
+};
+
+struct mlx5_ifc_modify_rmp_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 rmp_state[0x4];
+ u8 reserved_2[0x4];
+ u8 rmpn[0x18];
+
+ u8 reserved_3[0x20];
+
+ struct mlx5_ifc_rmp_bitmask_bits bitmask;
+
+ u8 reserved_4[0x40];
+
+ struct mlx5_ifc_rmpc_bits ctx;
+};
+
+struct mlx5_ifc_modify_nic_vport_context_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_modify_nic_vport_field_select_bits {
+ u8 reserved_0[0x1c];
+ u8 permanent_address[0x1];
+ u8 addresses_list[0x1];
+ u8 roce_en[0x1];
+ u8 reserved_1[0x1];
+};
+
+struct mlx5_ifc_modify_nic_vport_context_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 other_vport[0x1];
+ u8 reserved_2[0xf];
+ u8 vport_number[0x10];
+
+ struct mlx5_ifc_modify_nic_vport_field_select_bits field_select;
+
+ u8 reserved_3[0x780];
+
+ struct mlx5_ifc_nic_vport_context_bits nic_vport_context;
+};
+
+struct mlx5_ifc_modify_hca_vport_context_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 other_vport[0x1];
+ u8 reserved_2[0xb];
+ u8 port_num[0x4];
+ u8 vport_number[0x10];
+
+ u8 reserved_3[0x20];
+
+ struct mlx5_ifc_hca_vport_context_bits hca_vport_context;
+};
+
+struct mlx5_ifc_modify_cq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+enum {
+ MLX5_MODIFY_CQ_IN_OP_MOD_MODIFY_CQ = 0x0,
+ MLX5_MODIFY_CQ_IN_OP_MOD_RESIZE_CQ = 0x1,
+};
+
+struct mlx5_ifc_modify_cq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 cqn[0x18];
+
+ union mlx5_ifc_modify_field_select_resize_field_select_auto_bits modify_field_select_resize_field_select;
+
+ struct mlx5_ifc_cqc_bits cq_context;
+
+ u8 reserved_3[0x600];
+
+ u8 pas[0][0x40];
+};
+
+struct mlx5_ifc_modify_cong_status_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_modify_cong_status_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x18];
+ u8 priority[0x4];
+ u8 cong_protocol[0x4];
+
+ u8 enable[0x1];
+ u8 tag_enable[0x1];
+ u8 reserved_3[0x1e];
+};
+
+struct mlx5_ifc_modify_cong_params_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_modify_cong_params_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x1c];
+ u8 cong_protocol[0x4];
+
+ union mlx5_ifc_field_select_802_1_r_roce_auto_bits field_select;
+
+ u8 reserved_3[0x80];
+
+ union mlx5_ifc_cong_control_roce_ecn_auto_bits congestion_parameters;
+};
+
+struct mlx5_ifc_manage_pages_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 output_num_entries[0x20];
+
+ u8 reserved_1[0x20];
+
+ u8 pas[0][0x40];
+};
+
+enum {
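+ MLX5_MANAGE_PAGES_IN_OP_MOD_ALLOCATION_FAIL = 0x0, /* driver could not supply the requested pages */
+ MLX5_MANAGE_PAGES_IN_OP_MOD_ALLOCATION_SUCCESS = 0x1, /* driver gives pages to the HCA */
+ MLX5_MANAGE_PAGES_IN_OP_MOD_HCA_RETURN_PAGES = 0x2, /* driver reclaims pages from the HCA */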
+};
+
+struct mlx5_ifc_manage_pages_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x10];
+ u8 function_id[0x10];
+
+ u8 input_num_entries[0x20];
+
+ u8 pas[0][0x40];
+};
+
+struct mlx5_ifc_mad_ifc_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ u8 response_mad_packet[256][0x8]; /* 256 entries of 8 bits: a 256-byte MAD */
+};
+
+struct mlx5_ifc_mad_ifc_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 remote_lid[0x10];
+ u8 reserved_2[0x8];
+ u8 port[0x8];
+
+ u8 reserved_3[0x20];
+
+ u8 mad[256][0x8];
+};
+
+struct mlx5_ifc_init_hca_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_init_hca_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+};
+
+struct mlx5_ifc_init2rtr_qp_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_init2rtr_qp_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 qpn[0x18];
+
+ u8 reserved_3[0x20];
+
+ u8 opt_param_mask[0x20];
+
+ u8 reserved_4[0x20];
+
+ struct mlx5_ifc_qpc_bits qpc;
+
+ u8 reserved_5[0x80];
+};
+
+struct mlx5_ifc_init2init_qp_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_init2init_qp_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 qpn[0x18];
+
+ u8 reserved_3[0x20];
+
+ u8 opt_param_mask[0x20];
+
+ u8 reserved_4[0x20];
+
+ struct mlx5_ifc_qpc_bits qpc;
+
+ u8 reserved_5[0x80];
+};
+
+struct mlx5_ifc_get_dropped_packet_log_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ u8 packet_headers_log[128][0x8];
+
+ u8 packet_syndrome[64][0x8];
+};
+
+struct mlx5_ifc_get_dropped_packet_log_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+};
+
+struct mlx5_ifc_gen_eqe_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x18];
+ u8 eq_number[0x8];
+
+ u8 reserved_3[0x20];
+
+ u8 eqe[64][0x8];
+};
+
+struct mlx5_ifc_gen_eq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_filed_select_r_roce_rp_bits {
+ u8 field_select_r_roce_rp[0x20];
+};
+
+struct mlx5_ifc_filed_select_r_roce_np_bits {
+ u8 field_select_r_roce_np[0x20];
+};
+
+enum { /* bit masks for the field_select_8021qaurp field */
+ MLX5_FILED_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPPP_MAX_RPS = 0x4,
+ MLX5_FILED_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPG_TIME_RESET = 0x8,
+ MLX5_FILED_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPG_BYTE_RESET = 0x10,
+ MLX5_FILED_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPG_THRESHOLD = 0x20,
+ MLX5_FILED_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPG_MAX_RATE = 0x40,
+ MLX5_FILED_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPG_AI_RATE = 0x80,
+ MLX5_FILED_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPG_HAI_RATE = 0x100,
+ MLX5_FILED_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPG_GD = 0x200,
+ MLX5_FILED_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPG_MIN_DEC_FAC = 0x400,
+ MLX5_FILED_SELECT_802_1QAU_RP_FIELD_SELECT_8021QAURP_RPG_MIN_RATE = 0x800,
+};
+
+struct mlx5_ifc_filed_select_802_1qau_rp_bits {
+ u8 field_select_8021qaurp[0x20];
+};
+
+struct mlx5_ifc_enable_hca_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x20];
+};
+
+struct mlx5_ifc_enable_hca_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x10];
+ u8 function_id[0x10];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_drain_dct_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_drain_dct_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 dctn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_disable_hca_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x20];
+};
+
+struct mlx5_ifc_disable_hca_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x10];
+ u8 function_id[0x10];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_detach_from_mcg_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_detach_from_mcg_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 qpn[0x18];
+
+ u8 reserved_3[0x20];
+
+ u8 multicast_gid[16][0x8];
+};
+
+struct mlx5_ifc_destroy_xrc_srq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_destroy_xrc_srq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 xrc_srqn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_destroy_tis_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_destroy_tis_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 tisn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_destroy_tir_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_destroy_tir_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 tirn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_destroy_srq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_destroy_srq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 srqn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_destroy_sq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_destroy_sq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 sqn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_destroy_rqt_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_destroy_rqt_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 rqtn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_destroy_rq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_destroy_rq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 rqn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_destroy_rmp_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_destroy_rmp_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 rmpn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_destroy_qp_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_destroy_qp_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 qpn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_destroy_psv_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_destroy_psv_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 psvn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_destroy_mkey_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_destroy_mkey_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 mkey_index[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_destroy_flow_table_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_destroy_flow_table_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+
+ u8 table_type[0x8];
+ u8 reserved_3[0x18];
+
+ u8 reserved_4[0x8];
+ u8 table_id[0x18];
+
+ u8 reserved_5[0x140];
+};
+
+struct mlx5_ifc_destroy_flow_group_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_destroy_flow_group_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+
+ u8 table_type[0x8];
+ u8 reserved_3[0x18];
+
+ u8 reserved_4[0x8];
+ u8 table_id[0x18];
+
+ u8 group_id[0x20];
+
+ u8 reserved_5[0x120];
+};
+
+struct mlx5_ifc_destroy_eq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_destroy_eq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x18];
+ u8 eq_number[0x8];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_destroy_dct_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_destroy_dct_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 dctn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_destroy_cq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_destroy_cq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 cqn[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_delete_vxlan_udp_dport_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_delete_vxlan_udp_dport_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x20];
+
+ u8 reserved_3[0x10];
+ u8 vxlan_udp_port[0x10];
+};
+
+struct mlx5_ifc_delete_l2_table_entry_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_delete_l2_table_entry_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x60];
+
+ u8 reserved_3[0x8];
+ u8 table_index[0x18];
+
+ u8 reserved_4[0x140];
+};
+
+struct mlx5_ifc_delete_fte_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_delete_fte_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+
+ u8 table_type[0x8];
+ u8 reserved_3[0x18];
+
+ u8 reserved_4[0x8];
+ u8 table_id[0x18];
+
+ u8 reserved_5[0x40];
+
+ u8 flow_index[0x20];
+
+ u8 reserved_6[0xe0];
+};
+
+struct mlx5_ifc_dealloc_xrcd_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_dealloc_xrcd_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 xrcd[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_dealloc_uar_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_dealloc_uar_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 uar[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_dealloc_transport_domain_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_dealloc_transport_domain_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 transport_domain[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_dealloc_q_counter_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_dealloc_q_counter_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x18];
+ u8 counter_set_id[0x8];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_dealloc_pd_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_dealloc_pd_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 pd[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_create_xrc_srq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x8];
+ u8 xrc_srqn[0x18];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_create_xrc_srq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+
+ struct mlx5_ifc_xrc_srqc_bits xrc_srq_context_entry;
+
+ u8 reserved_3[0x600];
+
+ u8 pas[0][0x40];
+};
+
+struct mlx5_ifc_create_tis_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x8];
+ u8 tisn[0x18];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_create_tis_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0xc0];
+
+ struct mlx5_ifc_tisc_bits ctx;
+};
+
+struct mlx5_ifc_create_tir_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x8];
+ u8 tirn[0x18];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_create_tir_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0xc0];
+
+ struct mlx5_ifc_tirc_bits ctx;
+};
+
+struct mlx5_ifc_create_srq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x8];
+ u8 srqn[0x18];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_create_srq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+
+ struct mlx5_ifc_srqc_bits srq_context_entry;
+
+ u8 reserved_3[0x600];
+
+ u8 pas[0][0x40];
+};
+
+struct mlx5_ifc_create_sq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x8];
+ u8 sqn[0x18];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_create_sq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0xc0];
+
+ struct mlx5_ifc_sqc_bits ctx;
+};
+
+struct mlx5_ifc_create_rqt_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x8];
+ u8 rqtn[0x18];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_create_rqt_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0xc0];
+
+ struct mlx5_ifc_rqtc_bits rqt_context;
+};
+
+struct mlx5_ifc_create_rq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x8];
+ u8 rqn[0x18];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_create_rq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0xc0];
+
+ struct mlx5_ifc_rqc_bits ctx;
+};
+
+struct mlx5_ifc_create_rmp_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x8];
+ u8 rmpn[0x18];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_create_rmp_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0xc0];
+
+ struct mlx5_ifc_rmpc_bits ctx;
+};
+
+struct mlx5_ifc_create_qp_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x8];
+ u8 qpn[0x18];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_create_qp_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+
+ u8 opt_param_mask[0x20];
+
+ u8 reserved_3[0x20];
+
+ struct mlx5_ifc_qpc_bits qpc;
+
+ u8 reserved_4[0x80];
+
+ u8 pas[0][0x40];
+};
+
+struct mlx5_ifc_create_psv_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ u8 reserved_2[0x8];
+ u8 psv0_index[0x18];
+
+ u8 reserved_3[0x8];
+ u8 psv1_index[0x18];
+
+ u8 reserved_4[0x8];
+ u8 psv2_index[0x18];
+
+ u8 reserved_5[0x8];
+ u8 psv3_index[0x18];
+};
+
+struct mlx5_ifc_create_psv_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 num_psv[0x4];
+ u8 reserved_2[0x4];
+ u8 pd[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_create_mkey_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x8];
+ u8 mkey_index[0x18];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_create_mkey_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x20];
+
+ u8 pg_access[0x1];
+ u8 reserved_3[0x1f];
+
+ struct mlx5_ifc_mkc_bits memory_key_mkey_entry;
+
+ u8 reserved_4[0x80];
+
+ u8 translations_octword_actual_size[0x20];
+
+ u8 reserved_5[0x560];
+
+ u8 klm_pas_mtt[0][0x20];
+};
+
+struct mlx5_ifc_create_flow_table_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x8];
+ u8 table_id[0x18];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_create_flow_table_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+
+ u8 table_type[0x8];
+ u8 reserved_3[0x18];
+
+ u8 reserved_4[0x20];
+
+ u8 reserved_5[0x8];
+ u8 level[0x8];
+ u8 reserved_6[0x8];
+ u8 log_size[0x8];
+
+ u8 reserved_7[0x120];
+};
+
+struct mlx5_ifc_create_flow_group_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x8];
+ u8 group_id[0x18];
+
+ u8 reserved_2[0x20];
+};
+
+enum { /* bit positions within match_criteria_enable, not mask values */
+ MLX5_CREATE_FLOW_GROUP_IN_MATCH_CRITERIA_ENABLE_OUTER_HEADERS = 0x0,
+ MLX5_CREATE_FLOW_GROUP_IN_MATCH_CRITERIA_ENABLE_MISC_PARAMETERS = 0x1,
+ MLX5_CREATE_FLOW_GROUP_IN_MATCH_CRITERIA_ENABLE_INNER_HEADERS = 0x2,
+};
+
+struct mlx5_ifc_create_flow_group_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+
+ u8 table_type[0x8];
+ u8 reserved_3[0x18];
+
+ u8 reserved_4[0x8];
+ u8 table_id[0x18];
+
+ u8 reserved_5[0x20];
+
+ u8 start_flow_index[0x20];
+
+ u8 reserved_6[0x20];
+
+ u8 end_flow_index[0x20];
+
+ u8 reserved_7[0xa0];
+
+ u8 reserved_8[0x18];
+ u8 match_criteria_enable[0x8];
+
+ struct mlx5_ifc_fte_match_param_bits match_criteria;
+
+ u8 reserved_9[0xe00];
+};
+
+struct mlx5_ifc_create_eq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x18];
+ u8 eq_number[0x8];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_create_eq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+
+ struct mlx5_ifc_eqc_bits eq_context_entry;
+
+ u8 reserved_3[0x40];
+
+ u8 event_bitmask[0x40];
+
+ u8 reserved_4[0x580];
+
+ u8 pas[0][0x40];
+};
+
+struct mlx5_ifc_create_dct_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x8];
+ u8 dctn[0x18];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_create_dct_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+
+ struct mlx5_ifc_dctc_bits dct_context_entry;
+
+ u8 reserved_3[0x180];
+};
+
+struct mlx5_ifc_create_cq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x8];
+ u8 cqn[0x18];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_create_cq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+
+ struct mlx5_ifc_cqc_bits cq_context;
+
+ u8 reserved_3[0x600];
+
+ u8 pas[0][0x40];
+};
+
+struct mlx5_ifc_config_int_moderation_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x4];
+ u8 min_delay[0xc];
+ u8 int_vector[0x10];
+
+ u8 reserved_2[0x20];
+};
+
+enum {
+ MLX5_CONFIG_INT_MODERATION_IN_OP_MOD_WRITE = 0x0,
+ MLX5_CONFIG_INT_MODERATION_IN_OP_MOD_READ = 0x1,
+};
+
+struct mlx5_ifc_config_int_moderation_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x4];
+ u8 min_delay[0xc];
+ u8 int_vector[0x10];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_attach_to_mcg_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_attach_to_mcg_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 qpn[0x18];
+
+ u8 reserved_3[0x20];
+
+ u8 multicast_gid[16][0x8];
+};
+
+struct mlx5_ifc_arm_xrc_srq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+enum {
+ MLX5_ARM_XRC_SRQ_IN_OP_MOD_XRC_SRQ = 0x1,
+};
+
+struct mlx5_ifc_arm_xrc_srq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 xrc_srqn[0x18];
+
+ u8 reserved_3[0x10];
+ u8 lwm[0x10];
+};
+
+struct mlx5_ifc_arm_rq_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+enum {
+ MLX5_ARM_RQ_IN_OP_MOD_SRQ_ = 0x1,
+};
+
+struct mlx5_ifc_arm_rq_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 srq_number[0x18];
+
+ u8 reserved_3[0x10];
+ u8 lwm[0x10];
+};
+
+struct mlx5_ifc_arm_dct_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_arm_dct_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x8];
+ u8 dct_number[0x18];
+
+ u8 reserved_3[0x20];
+};
+
+struct mlx5_ifc_alloc_xrcd_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x8];
+ u8 xrcd[0x18];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_alloc_xrcd_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+};
+
+struct mlx5_ifc_alloc_uar_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x8];
+ u8 uar[0x18];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_alloc_uar_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+};
+
+struct mlx5_ifc_alloc_transport_domain_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x8];
+ u8 transport_domain[0x18];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_alloc_transport_domain_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+};
+
+struct mlx5_ifc_alloc_q_counter_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x18];
+ u8 counter_set_id[0x8];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_alloc_q_counter_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+};
+
+struct mlx5_ifc_alloc_pd_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x8];
+ u8 pd[0x18];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_alloc_pd_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x40];
+};
+
+struct mlx5_ifc_add_vxlan_udp_dport_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+struct mlx5_ifc_add_vxlan_udp_dport_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x20];
+
+ u8 reserved_3[0x10];
+ u8 vxlan_udp_port[0x10];
+};
+
+struct mlx5_ifc_access_register_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+
+ u8 register_data[0][0x20];
+};
+
+enum {
+ MLX5_ACCESS_REGISTER_IN_OP_MOD_WRITE = 0x0,
+ MLX5_ACCESS_REGISTER_IN_OP_MOD_READ = 0x1,
+};
+
+struct mlx5_ifc_access_register_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 reserved_2[0x10];
+ u8 register_id[0x10];
+
+ u8 argument[0x20];
+
+ u8 register_data[0][0x20];
+};
+
+struct mlx5_ifc_sltp_reg_bits {
+ u8 status[0x4];
+ u8 version[0x4];
+ u8 local_port[0x8];
+ u8 pnat[0x2];
+ u8 reserved_0[0x2];
+ u8 lane[0x4];
+ u8 reserved_1[0x8];
+
+ u8 reserved_2[0x20];
+
+ u8 reserved_3[0x7];
+ u8 polarity[0x1];
+ u8 ob_tap0[0x8];
+ u8 ob_tap1[0x8];
+ u8 ob_tap2[0x8];
+
+ u8 reserved_4[0xc];
+ u8 ob_preemp_mode[0x4];
+ u8 ob_reg[0x8];
+ u8 ob_bias[0x8];
+
+ u8 reserved_5[0x20];
+};
+
+struct mlx5_ifc_slrg_reg_bits {
+ u8 status[0x4];
+ u8 version[0x4];
+ u8 local_port[0x8];
+ u8 pnat[0x2];
+ u8 reserved_0[0x2];
+ u8 lane[0x4];
+ u8 reserved_1[0x8];
+
+ u8 time_to_link_up[0x10];
+ u8 reserved_2[0xc];
+ u8 grade_lane_speed[0x4];
+
+ u8 grade_version[0x8];
+ u8 grade[0x18];
+
+ u8 reserved_3[0x4];
+ u8 height_grade_type[0x4];
+ u8 height_grade[0x18];
+
+ u8 height_dz[0x10];
+ u8 height_dv[0x10];
+
+ u8 reserved_4[0x10];
+ u8 height_sigma[0x10];
+
+ u8 reserved_5[0x20];
+
+ u8 reserved_6[0x4];
+ u8 phase_grade_type[0x4];
+ u8 phase_grade[0x18];
+
+ u8 reserved_7[0x8];
+ u8 phase_eo_pos[0x8];
+ u8 reserved_8[0x8];
+ u8 phase_eo_neg[0x8];
+
+ u8 ffe_set_tested[0x10];
+ u8 test_errors_per_lane[0x10];
+};
+
+struct mlx5_ifc_pvlc_reg_bits {
+ u8 reserved_0[0x8];
+ u8 local_port[0x8];
+ u8 reserved_1[0x10];
+
+ u8 reserved_2[0x1c];
+ u8 vl_hw_cap[0x4];
+
+ u8 reserved_3[0x1c];
+ u8 vl_admin[0x4];
+
+ u8 reserved_4[0x1c];
+ u8 vl_operational[0x4];
+};
+
+struct mlx5_ifc_pude_reg_bits {
+ u8 swid[0x8];
+ u8 local_port[0x8];
+ u8 reserved_0[0x4];
+ u8 admin_status[0x4];
+ u8 reserved_1[0x4];
+ u8 oper_status[0x4];
+
+ u8 reserved_2[0x60];
+};
+
+struct mlx5_ifc_ptys_reg_bits {
+ u8 reserved_0[0x8];
+ u8 local_port[0x8];
+ u8 reserved_1[0xd];
+ u8 proto_mask[0x3];
+
+ u8 reserved_2[0x40];
+
+ u8 eth_proto_capability[0x20];
+
+ u8 ib_link_width_capability[0x10];
+ u8 ib_proto_capability[0x10];
+
+ u8 reserved_3[0x20];
+
+ u8 eth_proto_admin[0x20];
+
+ u8 ib_link_width_admin[0x10];
+ u8 ib_proto_admin[0x10];
+
+ u8 reserved_4[0x20];
+
+ u8 eth_proto_oper[0x20];
+
+ u8 ib_link_width_oper[0x10];
+ u8 ib_proto_oper[0x10];
+
+ u8 reserved_5[0x20];
+
+ u8 eth_proto_lp_advertise[0x20];
+
+ u8 reserved_6[0x60];
+};
+
+struct mlx5_ifc_ptas_reg_bits {
+ u8 reserved_0[0x20];
+
+ u8 algorithm_options[0x10];
+ u8 reserved_1[0x4];
+ u8 repetitions_mode[0x4];
+ u8 num_of_repetitions[0x8];
+
+ u8 grade_version[0x8];
+ u8 height_grade_type[0x4];
+ u8 phase_grade_type[0x4];
+ u8 height_grade_weight[0x8];
+ u8 phase_grade_weight[0x8];
+
+ u8 gisim_measure_bits[0x10];
+ u8 adaptive_tap_measure_bits[0x10];
+
+ u8 ber_bath_high_error_threshold[0x10];
+ u8 ber_bath_mid_error_threshold[0x10];
+
+ u8 ber_bath_low_error_threshold[0x10];
+ u8 one_ratio_high_threshold[0x10];
+
+ u8 one_ratio_high_mid_threshold[0x10];
+ u8 one_ratio_low_mid_threshold[0x10];
+
+ u8 one_ratio_low_threshold[0x10];
+ u8 ndeo_error_threshold[0x10];
+
+ u8 mixer_offset_step_size[0x10];
+ u8 reserved_2[0x8];
+ u8 mix90_phase_for_voltage_bath[0x8];
+
+ u8 mixer_offset_start[0x10];
+ u8 mixer_offset_end[0x10];
+
+ u8 reserved_3[0x15];
+ u8 ber_test_time[0xb];
+};
+
+struct mlx5_ifc_pspa_reg_bits {
+ u8 swid[0x8];
+ u8 local_port[0x8];
+ u8 sub_port[0x8];
+ u8 reserved_0[0x8];
+
+ u8 reserved_1[0x20];
+};
+
+struct mlx5_ifc_pqdr_reg_bits {
+ u8 reserved_0[0x8];
+ u8 local_port[0x8];
+ u8 reserved_1[0x5];
+ u8 prio[0x3];
+ u8 reserved_2[0x6];
+ u8 mode[0x2];
+
+ u8 reserved_3[0x20];
+
+ u8 reserved_4[0x10];
+ u8 min_threshold[0x10];
+
+ u8 reserved_5[0x10];
+ u8 max_threshold[0x10];
+
+ u8 reserved_6[0x10];
+ u8 mark_probability_denominator[0x10];
+
+ u8 reserved_7[0x60];
+};
+
+struct mlx5_ifc_ppsc_reg_bits {
+ u8 reserved_0[0x8];
+ u8 local_port[0x8];
+ u8 reserved_1[0x10];
+
+ u8 reserved_2[0x60];
+
+ u8 reserved_3[0x1c];
+ u8 wrps_admin[0x4];
+
+ u8 reserved_4[0x1c];
+ u8 wrps_status[0x4];
+
+ u8 reserved_5[0x8];
+ u8 up_threshold[0x8];
+ u8 reserved_6[0x8];
+ u8 down_threshold[0x8];
+
+ u8 reserved_7[0x20];
+
+ u8 reserved_8[0x1c];
+ u8 srps_admin[0x4];
+
+ u8 reserved_9[0x1c];
+ u8 srps_status[0x4];
+
+ u8 reserved_10[0x40];
+};
+
+struct mlx5_ifc_pplr_reg_bits {
+ u8 reserved_0[0x8];
+ u8 local_port[0x8];
+ u8 reserved_1[0x10];
+
+ u8 reserved_2[0x8];
+ u8 lb_cap[0x8];
+ u8 reserved_3[0x8];
+ u8 lb_en[0x8];
+};
+
+struct mlx5_ifc_pplm_reg_bits {
+ u8 reserved_0[0x8];
+ u8 local_port[0x8];
+ u8 reserved_1[0x10];
+
+ u8 reserved_2[0x20];
+
+ u8 port_profile_mode[0x8];
+ u8 static_port_profile[0x8];
+ u8 active_port_profile[0x8];
+ u8 reserved_3[0x8];
+
+ u8 retransmission_active[0x8];
+ u8 fec_mode_active[0x18];
+
+ u8 reserved_4[0x20];
+};
+
+struct mlx5_ifc_ppcnt_reg_bits {
+ u8 swid[0x8];
+ u8 local_port[0x8];
+ u8 pnat[0x2];
+ u8 reserved_0[0x8];
+ u8 grp[0x6];
+
+ u8 clr[0x1];
+ u8 reserved_1[0x1c];
+ u8 prio_tc[0x3];
+
+ union mlx5_ifc_eth_cntrs_grp_data_layout_auto_bits counter_set;
+};
+
+struct mlx5_ifc_ppad_reg_bits {
+ u8 reserved_0[0x3];
+ u8 single_mac[0x1];
+ u8 reserved_1[0x4];
+ u8 local_port[0x8];
+ u8 mac_47_32[0x10];
+
+ u8 mac_31_0[0x20];
+
+ u8 reserved_2[0x40];
+};
+
+struct mlx5_ifc_pmtu_reg_bits {
+ u8 reserved_0[0x8];
+ u8 local_port[0x8];
+ u8 reserved_1[0x10];
+
+ u8 max_mtu[0x10];
+ u8 reserved_2[0x10];
+
+ u8 admin_mtu[0x10];
+ u8 reserved_3[0x10];
+
+ u8 oper_mtu[0x10];
+ u8 reserved_4[0x10];
+};
+
+struct mlx5_ifc_pmpr_reg_bits {
+ u8 reserved_0[0x8];
+ u8 module[0x8];
+ u8 reserved_1[0x10];
+
+ u8 reserved_2[0x18];
+ u8 attenuation_5g[0x8];
+
+ u8 reserved_3[0x18];
+ u8 attenuation_7g[0x8];
+
+ u8 reserved_4[0x18];
+ u8 attenuation_12g[0x8];
+};
+
+struct mlx5_ifc_pmpe_reg_bits {
+ u8 reserved_0[0x8];
+ u8 module[0x8];
+ u8 reserved_1[0xc];
+ u8 module_status[0x4];
+
+ u8 reserved_2[0x60];
+};
+
+struct mlx5_ifc_pmpc_reg_bits {
+ u8 module_state_updated[32][0x8];
+};
+
+struct mlx5_ifc_pmlpn_reg_bits {
+ u8 reserved_0[0x4];
+ u8 mlpn_status[0x4];
+ u8 local_port[0x8];
+ u8 reserved_1[0x10];
+
+ u8 e[0x1];
+ u8 reserved_2[0x1f];
+};
+
+struct mlx5_ifc_pmlp_reg_bits {
+ u8 rxtx[0x1];
+ u8 reserved_0[0x7];
+ u8 local_port[0x8];
+ u8 reserved_1[0x8];
+ u8 width[0x8];
+
+ u8 lane0_module_mapping[0x20];
+
+ u8 lane1_module_mapping[0x20];
+
+ u8 lane2_module_mapping[0x20];
+
+ u8 lane3_module_mapping[0x20];
+
+ u8 reserved_2[0x160];
+};
+
+struct mlx5_ifc_pmaos_reg_bits {
+ u8 reserved_0[0x8];
+ u8 module[0x8];
+ u8 reserved_1[0x4];
+ u8 admin_status[0x4];
+ u8 reserved_2[0x4];
+ u8 oper_status[0x4];
+
+ u8 ase[0x1];
+ u8 ee[0x1];
+ u8 reserved_3[0x1c];
+ u8 e[0x2];
+
+ u8 reserved_4[0x40];
+};
+
+struct mlx5_ifc_plpc_reg_bits {
+ u8 reserved_0[0x4];
+ u8 profile_id[0xc];
+ u8 reserved_1[0x4];
+ u8 proto_mask[0x4];
+ u8 reserved_2[0x8];
+
+ u8 reserved_3[0x10];
+ u8 lane_speed[0x10];
+
+ u8 reserved_4[0x17];
+ u8 lpbf[0x1];
+ u8 fec_mode_policy[0x8];
+
+ u8 retransmission_capability[0x8];
+ u8 fec_mode_capability[0x18];
+
+ u8 retransmission_support_admin[0x8];
+ u8 fec_mode_support_admin[0x18];
+
+ u8 retransmission_request_admin[0x8];
+ u8 fec_mode_request_admin[0x18];
+
+ u8 reserved_5[0x80];
+};
+
+struct mlx5_ifc_plib_reg_bits {
+ u8 reserved_0[0x8];
+ u8 local_port[0x8];
+ u8 reserved_1[0x8];
+ u8 ib_port[0x8];
+
+ u8 reserved_2[0x60];
+};
+
+struct mlx5_ifc_plbf_reg_bits {
+ u8 reserved_0[0x8];
+ u8 local_port[0x8];
+ u8 reserved_1[0xd];
+ u8 lbf_mode[0x3];
+
+ u8 reserved_2[0x20];
+};
+
+struct mlx5_ifc_pipg_reg_bits {
+ u8 reserved_0[0x8];
+ u8 local_port[0x8];
+ u8 reserved_1[0x10];
+
+ u8 dic[0x1];
+ u8 reserved_2[0x19];
+ u8 ipg[0x4];
+ u8 reserved_3[0x2];
+};
+
+struct mlx5_ifc_pifr_reg_bits {
+ u8 reserved_0[0x8];
+ u8 local_port[0x8];
+ u8 reserved_1[0x10];
+
+ u8 reserved_2[0xe0];
+
+ u8 port_filter[8][0x20];
+
+ u8 port_filter_update_en[8][0x20];
+};
+
+struct mlx5_ifc_pfcc_reg_bits {
+ u8 reserved_0[0x8];
+ u8 local_port[0x8];
+ u8 reserved_1[0x10];
+
+ u8 ppan[0x4];
+ u8 reserved_2[0x4];
+ u8 prio_mask_tx[0x8];
+ u8 reserved_3[0x8];
+ u8 prio_mask_rx[0x8];
+
+ u8 pptx[0x1];
+ u8 aptx[0x1];
+ u8 reserved_4[0x6];
+ u8 pfctx[0x8];
+ u8 reserved_5[0x10];
+
+ u8 pprx[0x1];
+ u8 aprx[0x1];
+ u8 reserved_6[0x6];
+ u8 pfcrx[0x8];
+ u8 reserved_7[0x10];
+
+ u8 reserved_8[0x80];
+};
+
+struct mlx5_ifc_pelc_reg_bits {
+ u8 op[0x4];
+ u8 reserved_0[0x4];
+ u8 local_port[0x8];
+ u8 reserved_1[0x10];
+
+ u8 op_admin[0x8];
+ u8 op_capability[0x8];
+ u8 op_request[0x8];
+ u8 op_active[0x8];
+
+ u8 admin[0x40];
+
+ u8 capability[0x40];
+
+ u8 request[0x40];
+
+ u8 active[0x40];
+
+ u8 reserved_2[0x80];
+};
+
+struct mlx5_ifc_peir_reg_bits {
+ u8 reserved_0[0x8];
+ u8 local_port[0x8];
+ u8 reserved_1[0x10];
+
+ u8 reserved_2[0xc];
+ u8 error_count[0x4];
+ u8 reserved_3[0x10];
+
+ u8 reserved_4[0xc];
+ u8 lane[0x4];
+ u8 reserved_5[0x8];
+ u8 error_type[0x8];
+};
+
+struct mlx5_ifc_pcap_reg_bits {
+ u8 reserved_0[0x8];
+ u8 local_port[0x8];
+ u8 reserved_1[0x10];
+
+ u8 port_capability_mask[4][0x20];
+};
+
+struct mlx5_ifc_paos_reg_bits {
+ u8 swid[0x8];
+ u8 local_port[0x8];
+ u8 reserved_0[0x4];
+ u8 admin_status[0x4];
+ u8 reserved_1[0x4];
+ u8 oper_status[0x4];
+
+ u8 ase[0x1];
+ u8 ee[0x1];
+ u8 reserved_2[0x1c];
+ u8 e[0x2];
+
+ u8 reserved_3[0x40];
+};
+
+struct mlx5_ifc_pamp_reg_bits {
+ u8 reserved_0[0x8];
+ u8 opamp_group[0x8];
+ u8 reserved_1[0xc];
+ u8 opamp_group_type[0x4];
+
+ u8 start_index[0x10];
+ u8 reserved_2[0x4];
+ u8 num_of_indices[0xc];
+
+ u8 index_data[18][0x10];
+};
+
+struct mlx5_ifc_lane_2_module_mapping_bits {
+ u8 reserved_0[0x6];
+ u8 rx_lane[0x2];
+ u8 reserved_1[0x6];
+ u8 tx_lane[0x2];
+ u8 reserved_2[0x8];
+ u8 module[0x8];
+};
+
+struct mlx5_ifc_bufferx_reg_bits {
+ u8 reserved_0[0x6];
+ u8 lossy[0x1];
+ u8 epsb[0x1];
+ u8 reserved_1[0xc];
+ u8 size[0xc];
+
+ u8 xoff_threshold[0x10];
+ u8 xon_threshold[0x10];
+};
+
+struct mlx5_ifc_set_node_in_bits {
+ u8 node_description[64][0x8];
+};
+
+struct mlx5_ifc_register_power_settings_bits {
+ u8 reserved_0[0x18];
+ u8 power_settings_level[0x8];
+
+ u8 reserved_1[0x60];
+};
+
+struct mlx5_ifc_register_host_endianess_bits {
+ u8 he[0x1];
+ u8 reserved_0[0x1f];
+
+ u8 reserved_1[0x60];
+};
+
+struct mlx5_ifc_umr_pointer_desc_argument_bits {
+ u8 reserved_0[0x20];
+
+ u8 mkey[0x20];
+
+ u8 addressh_63_32[0x20];
+
+ u8 addressl_31_0[0x20];
+};
+
+struct mlx5_ifc_ud_adrs_vector_bits {
+ u8 dc_key[0x40];
+
+ u8 ext[0x1];
+ u8 reserved_0[0x7];
+ u8 destination_qp_dct[0x18];
+
+ u8 static_rate[0x4];
+ u8 sl_eth_prio[0x4];
+ u8 fl[0x1];
+ u8 mlid[0x7];
+ u8 rlid_udp_sport[0x10];
+
+ u8 reserved_1[0x20];
+
+ u8 rmac_47_16[0x20];
+
+ u8 rmac_15_0[0x10];
+ u8 tclass[0x8];
+ u8 hop_limit[0x8];
+
+ u8 reserved_2[0x1];
+ u8 grh[0x1];
+ u8 reserved_3[0x2];
+ u8 src_addr_index[0x8];
+ u8 flow_label[0x14];
+
+ u8 rgid_rip[16][0x8];
+};
+
+struct mlx5_ifc_pages_req_event_bits {
+ u8 reserved_0[0x10];
+ u8 function_id[0x10];
+
+ u8 num_pages[0x20];
+
+ u8 reserved_1[0xa0];
+};
+
+struct mlx5_ifc_eqe_bits {
+ u8 reserved_0[0x8];
+ u8 event_type[0x8];
+ u8 reserved_1[0x8];
+ u8 event_sub_type[0x8];
+
+ u8 reserved_2[0xe0];
+
+ union mlx5_ifc_event_auto_bits event_data;
+
+ u8 reserved_3[0x10];
+ u8 signature[0x8];
+ u8 reserved_4[0x7];
+ u8 owner[0x1];
+};
+
+enum {
+ MLX5_CMD_QUEUE_ENTRY_TYPE_PCIE_CMD_IF_TRANSPORT = 0x7,
+};
+
+struct mlx5_ifc_cmd_queue_entry_bits {
+ u8 type[0x8];
+ u8 reserved_0[0x18];
+
+ u8 input_length[0x20];
+
+ u8 input_mailbox_pointer_63_32[0x20];
+
+ u8 input_mailbox_pointer_31_9[0x17];
+ u8 reserved_1[0x9];
+
+ u8 command_input_inline_data[16][0x8];
+
+ u8 command_output_inline_data[16][0x8];
+
+ u8 output_mailbox_pointer_63_32[0x20];
+
+ u8 output_mailbox_pointer_31_9[0x17];
+ u8 reserved_2[0x9];
+
+ u8 output_length[0x20];
+
+ u8 token[0x8];
+ u8 signature[0x8];
+ u8 reserved_3[0x8];
+ u8 status[0x7];
+ u8 ownership[0x1];
+};
+
+struct mlx5_ifc_cmd_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 command_output[0x20];
+};
+
+struct mlx5_ifc_cmd_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 command[0][0x20];
+};
+
+struct mlx5_ifc_cmd_if_box_bits {
+ u8 mailbox_data[512][0x8];
+
+ u8 reserved_0[0x180];
+
+ u8 next_pointer_63_32[0x20];
+
+ u8 next_pointer_31_10[0x16];
+ u8 reserved_1[0xa];
+
+ u8 block_number[0x20];
+
+ u8 reserved_2[0x8];
+ u8 token[0x8];
+ u8 ctrl_signature[0x8];
+ u8 signature[0x8];
+};
+
+struct mlx5_ifc_mtt_bits {
+ u8 ptag_63_32[0x20];
+
+ u8 ptag_31_8[0x18];
+ u8 reserved_0[0x6];
+ u8 wr_en[0x1];
+ u8 rd_en[0x1];
+};
+
+enum {
+ MLX5_INITIAL_SEG_NIC_INTERFACE_FULL_DRIVER = 0x0,
+ MLX5_INITIAL_SEG_NIC_INTERFACE_DISABLED = 0x1,
+ MLX5_INITIAL_SEG_NIC_INTERFACE_NO_DRAM_NIC = 0x2,
+};
+
+enum {
+ MLX5_INITIAL_SEG_NIC_INTERFACE_SUPPORTED_FULL_DRIVER = 0x0,
+ MLX5_INITIAL_SEG_NIC_INTERFACE_SUPPORTED_DISABLED = 0x1,
+ MLX5_INITIAL_SEG_NIC_INTERFACE_SUPPORTED_NO_DRAM_NIC = 0x2,
+};
+
+enum {
+ MLX5_INITIAL_SEG_HEALTH_SYNDROME_FW_INTERNAL_ERR = 0x1,
+ MLX5_INITIAL_SEG_HEALTH_SYNDROME_DEAD_IRISC = 0x7,
+ MLX5_INITIAL_SEG_HEALTH_SYNDROME_HW_FATAL_ERR = 0x8,
+ MLX5_INITIAL_SEG_HEALTH_SYNDROME_FW_CRC_ERR = 0x9,
+ MLX5_INITIAL_SEG_HEALTH_SYNDROME_ICM_FETCH_PCI_ERR = 0xa,
+ MLX5_INITIAL_SEG_HEALTH_SYNDROME_ICM_PAGE_ERR = 0xb,
+ MLX5_INITIAL_SEG_HEALTH_SYNDROME_ASYNCHRONOUS_EQ_BUF_OVERRUN = 0xc,
+ MLX5_INITIAL_SEG_HEALTH_SYNDROME_EQ_IN_ERR = 0xd,
+ MLX5_INITIAL_SEG_HEALTH_SYNDROME_EQ_INV = 0xe,
+ MLX5_INITIAL_SEG_HEALTH_SYNDROME_FFSER_ERR = 0xf,
+ MLX5_INITIAL_SEG_HEALTH_SYNDROME_HIGH_TEMP_ERR = 0x10,
+};
+
+struct mlx5_ifc_initial_seg_bits {
+ u8 fw_rev_minor[0x10];
+ u8 fw_rev_major[0x10];
+
+ u8 cmd_interface_rev[0x10];
+ u8 fw_rev_subminor[0x10];
+
+ u8 reserved_0[0x40];
+
+ u8 cmdq_phy_addr_63_32[0x20];
+
+ u8 cmdq_phy_addr_31_12[0x14];
+ u8 reserved_1[0x2];
+ u8 nic_interface[0x2];
+ u8 log_cmdq_size[0x4];
+ u8 log_cmdq_stride[0x4];
+
+ u8 command_doorbell_vector[0x20];
+
+ u8 reserved_2[0xf00];
+
+ u8 initializing[0x1];
+ u8 reserved_3[0x4];
+ u8 nic_interface_supported[0x3];
+ u8 reserved_4[0x18];
+
+ struct mlx5_ifc_health_buffer_bits health_buffer;
+
+ u8 no_dram_nic_offset[0x20];
+
+ u8 reserved_5[0x6e40];
+
+ u8 reserved_6[0x1f];
+ u8 clear_int[0x1];
+
+ u8 health_syndrome[0x8];
+ u8 health_counter[0x18];
+
+ u8 reserved_7[0x17fc0];
+};
+
+union mlx5_ifc_ports_control_registers_document_bits {
+ struct mlx5_ifc_bufferx_reg_bits bufferx_reg;
+ struct mlx5_ifc_eth_2819_cntrs_grp_data_layout_bits eth_2819_cntrs_grp_data_layout;
+ struct mlx5_ifc_eth_2863_cntrs_grp_data_layout_bits eth_2863_cntrs_grp_data_layout;
+ struct mlx5_ifc_eth_3635_cntrs_grp_data_layout_bits eth_3635_cntrs_grp_data_layout;
+ struct mlx5_ifc_eth_802_3_cntrs_grp_data_layout_bits eth_802_3_cntrs_grp_data_layout;
+ struct mlx5_ifc_eth_extended_cntrs_grp_data_layout_bits eth_extended_cntrs_grp_data_layout;
+ struct mlx5_ifc_eth_per_prio_grp_data_layout_bits eth_per_prio_grp_data_layout;
+ struct mlx5_ifc_eth_per_traffic_grp_data_layout_bits eth_per_traffic_grp_data_layout;
+ struct mlx5_ifc_lane_2_module_mapping_bits lane_2_module_mapping;
+ struct mlx5_ifc_pamp_reg_bits pamp_reg;
+ struct mlx5_ifc_paos_reg_bits paos_reg;
+ struct mlx5_ifc_pcap_reg_bits pcap_reg;
+ struct mlx5_ifc_peir_reg_bits peir_reg;
+ struct mlx5_ifc_pelc_reg_bits pelc_reg;
+ struct mlx5_ifc_pfcc_reg_bits pfcc_reg;
+ struct mlx5_ifc_phys_layer_cntrs_bits phys_layer_cntrs;
+ struct mlx5_ifc_pifr_reg_bits pifr_reg;
+ struct mlx5_ifc_pipg_reg_bits pipg_reg;
+ struct mlx5_ifc_plbf_reg_bits plbf_reg;
+ struct mlx5_ifc_plib_reg_bits plib_reg;
+ struct mlx5_ifc_plpc_reg_bits plpc_reg;
+ struct mlx5_ifc_pmaos_reg_bits pmaos_reg;
+ struct mlx5_ifc_pmlp_reg_bits pmlp_reg;
+ struct mlx5_ifc_pmlpn_reg_bits pmlpn_reg;
+ struct mlx5_ifc_pmpc_reg_bits pmpc_reg;
+ struct mlx5_ifc_pmpe_reg_bits pmpe_reg;
+ struct mlx5_ifc_pmpr_reg_bits pmpr_reg;
+ struct mlx5_ifc_pmtu_reg_bits pmtu_reg;
+ struct mlx5_ifc_ppad_reg_bits ppad_reg;
+ struct mlx5_ifc_ppcnt_reg_bits ppcnt_reg;
+ struct mlx5_ifc_pplm_reg_bits pplm_reg;
+ struct mlx5_ifc_pplr_reg_bits pplr_reg;
+ struct mlx5_ifc_ppsc_reg_bits ppsc_reg;
+ struct mlx5_ifc_pqdr_reg_bits pqdr_reg;
+ struct mlx5_ifc_pspa_reg_bits pspa_reg;
+ struct mlx5_ifc_ptas_reg_bits ptas_reg;
+ struct mlx5_ifc_ptys_reg_bits ptys_reg;
+ struct mlx5_ifc_pude_reg_bits pude_reg;
+ struct mlx5_ifc_pvlc_reg_bits pvlc_reg;
+ struct mlx5_ifc_slrg_reg_bits slrg_reg;
+ struct mlx5_ifc_sltp_reg_bits sltp_reg;
+ u8 reserved_0[0x60e0];
+};
+
+union mlx5_ifc_debug_enhancements_document_bits {
+ struct mlx5_ifc_health_buffer_bits health_buffer;
+ u8 reserved_0[0x200];
+};
+
+union mlx5_ifc_uplink_pci_interface_document_bits {
+ struct mlx5_ifc_initial_seg_bits initial_seg;
+ u8 reserved_0[0x20060];
+};
+
+
+enum {
+ MLX5_HCA_VPORT_SEL_PORT_GUID = 1 << 0,
+ MLX5_HCA_VPORT_SEL_NODE_GUID = 1 << 1,
+ MLX5_HCA_VPORT_SEL_STATE_POLICY = 1 << 2,
+};
+
+struct mlx5_ifc_query_hca_vport_gid_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x20];
+
+ u8 gids_num[0x10];
+ u8 reserved_2[0x10];
+
+ struct mlx5_ifc_array128_auto_bits gid[0];
+};
+
+struct mlx5_ifc_query_hca_vport_pkey_in_bits {
+ u8 opcode[0x10];
+ u8 reserved_0[0x10];
+
+ u8 reserved_1[0x10];
+ u8 op_mod[0x10];
+
+ u8 other_vport[0x1];
+ u8 reserved_2[0xb];
+ u8 port_num[0x4];
+ u8 vport_number[0x10];
+
+ u8 reserved_3[0x10];
+ u8 pkey_index[0x10];
+};
+
+struct mlx5_ifc_modify_hca_vport_context_out_bits {
+ u8 status[0x8];
+ u8 reserved_0[0x18];
+
+ u8 syndrome[0x20];
+
+ u8 reserved_1[0x40];
+};
+
+#endif /* MLX5_IFC_H */
+
+#include "post_kmod.h"
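
Note on the mlx5_ifc.h layout convention: in the *_bits structures above, every u8 array element stands for one bit, so offsetof() yields a field's bit offset and sizeof() its bit width. The sketch below shows how such a description could be turned into accessors over a big-endian command buffer. It is only an illustration under stated assumptions: the helper names (IFC_BIT_OFF, IFC_BIT_SZ, IFC_BYTES, ifc_get, ifc_set) are hypothetical and not part of this patch, and fields are assumed to be at most 32 bits wide and never to straddle a 32-bit word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;

/* Layout copied from the patch: each array element stands for ONE bit. */
struct mlx5_ifc_pmtu_reg_bits {
	u8 reserved_0[0x8];
	u8 local_port[0x8];
	u8 reserved_1[0x10];

	u8 max_mtu[0x10];
	u8 reserved_2[0x10];

	u8 admin_mtu[0x10];
	u8 reserved_3[0x10];

	u8 oper_mtu[0x10];
	u8 reserved_4[0x10];
};

/* Hypothetical helper macros; the names are illustrative only. */
#define IFC_BIT_OFF(typ, fld) offsetof(struct mlx5_ifc_##typ##_bits, fld)
#define IFC_BIT_SZ(typ, fld)  sizeof(((struct mlx5_ifc_##typ##_bits *)0)->fld)
#define IFC_BYTES(typ)        (sizeof(struct mlx5_ifc_##typ##_bits) / 8)

/* Read one big-endian 32-bit word from the buffer. */
static uint32_t load_be32(const u8 *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Write one big-endian 32-bit word to the buffer. */
static void store_be32(u8 *p, uint32_t v)
{
	p[0] = (u8)(v >> 24);
	p[1] = (u8)(v >> 16);
	p[2] = (u8)(v >> 8);
	p[3] = (u8)v;
}

/* Mask with the low bit_sz bits set (bit_sz in 1..32). */
static uint32_t ifc_mask(size_t bit_sz)
{
	return bit_sz == 32 ? 0xffffffffu : ((1u << bit_sz) - 1u);
}

/* Extract a field; bit offset 0 is the most significant bit of word 0. */
static uint32_t ifc_get(const u8 *buf, size_t bit_off, size_t bit_sz)
{
	size_t in_dw = bit_off % 32;

	assert(bit_sz >= 1 && in_dw + bit_sz <= 32);
	unsigned shift = (unsigned)(32 - in_dw - bit_sz);
	return (load_be32(buf + 4 * (bit_off / 32)) >> shift) & ifc_mask(bit_sz);
}

/* Store a field, leaving the other bits of the same word untouched. */
static void ifc_set(u8 *buf, size_t bit_off, size_t bit_sz, uint32_t val)
{
	size_t in_dw = bit_off % 32;

	assert(bit_sz >= 1 && in_dw + bit_sz <= 32);
	unsigned shift = (unsigned)(32 - in_dw - bit_sz);
	u8 *dw = buf + 4 * (bit_off / 32);
	uint32_t mask = ifc_mask(bit_sz) << shift;

	store_be32(dw, (load_be32(dw) & ~mask) | ((val << shift) & mask));
}
```

With these helpers, a caller would size a buffer as u8 buf[IFC_BYTES(pmtu_reg)] and write a field with ifc_set(buf, IFC_BIT_OFF(pmtu_reg, admin_mtu), IFC_BIT_SZ(pmtu_reg, admin_mtu), 1500). The point of the one-element-per-bit convention is that the compiler computes offsets and widths, so no hand-written shift or mask tables are needed.
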
diff --git a/drivers/net/mlnx_uio/mlnx/include/mlx5/qp.h b/drivers/net/mlnx_uio/mlnx/include/mlx5/qp.h
new file mode 100644
index 0000000..d3af935
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/include/mlx5/qp.h
@@ -0,0 +1,804 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef MLX5_QP_H
+#define MLX5_QP_H
+
+
+#define MLX5_INVALID_LKEY 0x100
+#define MLX5_SIG_WQE_SIZE (MLX5_SEND_WQE_BB * 5)
+#define MLX5_DIF_SIZE 8
+#define MLX5_STRIDE_BLOCK_OP 0x400
+#define MLX5_CPY_GRD_MASK 0xc0
+#define MLX5_CPY_APP_MASK 0x30
+#define MLX5_CPY_REF_MASK 0x0f
+#define MLX5_BSF_INC_REFTAG (1 << 6)
+#define MLX5_BSF_INL_VALID (1 << 15)
+#define MLX5_BSF_REFRESH_DIF (1 << 14)
+#define MLX5_BSF_REPEAT_BLOCK (1 << 7)
+#define MLX5_BSF_APPTAG_ESCAPE 0x1
+#define MLX5_BSF_APPREF_ESCAPE 0x2
+
+#define MLX5_QPN_BITS 24
+#define MLX5_QPN_MASK ((1 << MLX5_QPN_BITS) - 1)
+
+enum mlx5_qp_optpar {
+ MLX5_QP_OPTPAR_ALT_ADDR_PATH = 1 << 0,
+ MLX5_QP_OPTPAR_RRE = 1 << 1,
+ MLX5_QP_OPTPAR_RAE = 1 << 2,
+ MLX5_QP_OPTPAR_RWE = 1 << 3,
+ MLX5_QP_OPTPAR_PKEY_INDEX = 1 << 4,
+ MLX5_QP_OPTPAR_Q_KEY = 1 << 5,
+ MLX5_QP_OPTPAR_RNR_TIMEOUT = 1 << 6,
+ MLX5_QP_OPTPAR_PRIMARY_ADDR_PATH = 1 << 7,
+ MLX5_QP_OPTPAR_SRA_MAX = 1 << 8,
+ MLX5_QP_OPTPAR_RRA_MAX = 1 << 9,
+ MLX5_QP_OPTPAR_PM_STATE = 1 << 10,
+ MLX5_QP_OPTPAR_RETRY_COUNT = 1 << 12,
+ MLX5_QP_OPTPAR_RNR_RETRY = 1 << 13,
+ MLX5_QP_OPTPAR_ACK_TIMEOUT = 1 << 14,
+ MLX5_QP_OPTPAR_PRI_PORT = 1 << 16,
+ MLX5_QP_OPTPAR_SRQN = 1 << 18,
+ MLX5_QP_OPTPAR_CQN_RCV = 1 << 19,
+ MLX5_QP_OPTPAR_DC_HS = 1 << 20,
+ MLX5_QP_OPTPAR_DC_KEY = 1 << 21,
+};
+
+enum mlx5_qp_state {
+ MLX5_QP_STATE_RST = 0,
+ MLX5_QP_STATE_INIT = 1,
+ MLX5_QP_STATE_RTR = 2,
+ MLX5_QP_STATE_RTS = 3,
+ MLX5_QP_STATE_SQER = 4,
+ MLX5_QP_STATE_SQD = 5,
+ MLX5_QP_STATE_ERR = 6,
+ MLX5_QP_STATE_SQ_DRAINING = 7,
+ MLX5_QP_STATE_SUSPENDED = 9,
+ MLX5_QP_NUM_STATE
+};
+
+enum {
+ MLX5_QP_ST_RC = 0x0,
+ MLX5_QP_ST_UC = 0x1,
+ MLX5_QP_ST_UD = 0x2,
+ MLX5_QP_ST_XRC = 0x3,
+ MLX5_QP_ST_MLX = 0x4,
+ MLX5_QP_ST_DC = 0x5,
+ MLX5_QP_ST_QP0 = 0x7,
+ MLX5_QP_ST_QP1 = 0x8,
+ MLX5_QP_ST_RAW_ETHERTYPE = 0x9,
+ MLX5_QP_ST_RAW_IPV6 = 0xa,
+ MLX5_QP_ST_SNIFFER = 0xb,
+ MLX5_QP_ST_REG_UMR = 0xc,
+ MLX5_QP_ST_PTP_1588 = 0xd,
+ MLX5_QP_ST_SYNC_UMR = 0xe,
+ MLX5_QP_ST_MAX
+};
+
+enum {
+ MLX5_QP_PM_MIGRATED = 0x3,
+ MLX5_QP_PM_ARMED = 0x0,
+ MLX5_QP_PM_REARM = 0x1
+};
+
+enum {
+ MLX5_NON_ZERO_RQ = 0 << 24,
+ MLX5_SRQ_RQ = 1 << 24,
+ MLX5_CRQ_RQ = 2 << 24,
+ MLX5_ZERO_LEN_RQ = 3 << 24
+};
+
+enum {
+ /* params1 */
+ MLX5_QP_BIT_SRE = 1 << 15,
+ MLX5_QP_BIT_SWE = 1 << 14,
+ MLX5_QP_BIT_SAE = 1 << 13,
+ /* params2 */
+ MLX5_QP_BIT_RRE = 1 << 15,
+ MLX5_QP_BIT_RWE = 1 << 14,
+ MLX5_QP_BIT_RAE = 1 << 13,
+ MLX5_QP_BIT_RIC = 1 << 4,
+ MLX5_QP_BIT_COLL_SYNC_RQ = 1 << 2,
+ MLX5_QP_BIT_COLL_SYNC_SQ = 1 << 1,
+ MLX5_QP_BIT_COLL_MASTER = 1 << 0
+};
+
+enum {
+ MLX5_DCT_BIT_RRE = 1 << 19,
+ MLX5_DCT_BIT_RWE = 1 << 18,
+ MLX5_DCT_BIT_RAE = 1 << 17,
+};
+
+enum {
+ MLX5_WQE_CTRL_CQ_UPDATE = 2 << 2,
+ MLX5_WQE_CTRL_CQ_UPDATE_AND_EQE = 3 << 2,
+ MLX5_WQE_CTRL_SOLICITED = 1 << 1,
+};
+
+enum {
+ MLX5_SEND_WQE_DS = 16,
+ MLX5_SEND_WQE_BB = 64,
+};
+
+#define MLX5_SEND_WQEBB_NUM_DS (MLX5_SEND_WQE_BB / MLX5_SEND_WQE_DS)
+
+enum {
+ MLX5_SEND_WQE_MAX_WQEBBS = 16,
+};
+
+enum {
+ MLX5_WQE_FMR_PERM_LOCAL_READ = 1 << 27,
+ MLX5_WQE_FMR_PERM_LOCAL_WRITE = 1 << 28,
+ MLX5_WQE_FMR_PERM_REMOTE_READ = 1 << 29,
+ MLX5_WQE_FMR_PERM_REMOTE_WRITE = 1 << 30,
+ MLX5_WQE_FMR_PERM_ATOMIC = 1 << 31
+};
+
+enum {
+ MLX5_FENCE_MODE_NONE = 0 << 5,
+ MLX5_FENCE_MODE_INITIATOR_SMALL = 1 << 5,
+ MLX5_FENCE_MODE_STRONG_ORDERING = 3 << 5,
+ MLX5_FENCE_MODE_SMALL_AND_FENCE = 4 << 5,
+};
+
+enum {
+ MLX5_QP_LAT_SENSITIVE = 1 << 28,
+ MLX5_QP_BLOCK_MCAST = 1 << 30,
+ MLX5_QP_ENABLE_SIG = 1 << 31,
+};
+
+enum {
+ MLX5_RCV_DBR = 0,
+ MLX5_SND_DBR = 1,
+};
+
+enum {
+ MLX5_FLAGS_INLINE = 1<<7,
+ MLX5_FLAGS_CHECK_FREE = 1<<5,
+};
+
+struct mlx5_wqe_fmr_seg {
+ __be32 flags;
+ __be32 mem_key;
+ __be64 buf_list;
+ __be64 start_addr;
+ __be64 reg_len;
+ __be32 offset;
+ __be32 page_size;
+ u32 reserved[2];
+};
+
+struct mlx5_wqe_ctrl_seg {
+ __be32 opmod_idx_opcode;
+ __be32 qpn_ds;
+ u8 signature;
+ u8 rsvd[2];
+ u8 fm_ce_se;
+ __be32 imm;
+};
+
+#define MLX5_WQE_CTRL_DS_MASK 0x3f
+#define MLX5_WQE_CTRL_QPN_MASK 0xffffff00
+#define MLX5_WQE_CTRL_QPN_SHIFT 8
+#define MLX5_WQE_DS_UNITS 16
+#define MLX5_WQE_CTRL_OPCODE_MASK 0xff
+#define MLX5_WQE_CTRL_WQE_INDEX_MASK 0x00ffff00
+#define MLX5_WQE_CTRL_WQE_INDEX_SHIFT 8
+
+enum {
+ MLX5_ETH_WQE_L3_INNER_CSUM = 1 << 4,
+ MLX5_ETH_WQE_L4_INNER_CSUM = 1 << 5,
+ MLX5_ETH_WQE_L3_CSUM = 1 << 6,
+ MLX5_ETH_WQE_L4_CSUM = 1 << 7,
+};
+
+struct mlx5_wqe_eth_seg {
+ u8 rsvd0[4];
+ u8 cs_flags;
+ u8 rsvd1;
+ __be16 mss;
+ __be32 rsvd2;
+ __be16 inline_hdr_sz;
+ u8 inline_hdr_start[2];
+};
+
+struct mlx5_wqe_xrc_seg {
+ __be32 xrc_srqn;
+ u8 rsvd[12];
+};
+
+struct mlx5_wqe_masked_atomic_seg {
+ __be64 swap_add;
+ __be64 compare;
+ __be64 swap_add_mask;
+ __be64 compare_mask;
+};
+
+struct mlx5_av {
+ union {
+ struct {
+ __be32 qkey;
+ __be32 reserved;
+ } qkey;
+ __be64 dc_key;
+ } key;
+ __be32 dqp_dct;
+ u8 stat_rate_sl;
+ u8 fl_mlid;
+ union {
+ __be16 rlid;
+ __be16 udp_sport;
+ };
+ u8 reserved0[4];
+ u8 rmac[6];
+ u8 tclass;
+ u8 hop_limit;
+ __be32 grh_gid_fl;
+ u8 rgid[16];
+};
+
+struct mlx5_wqe_datagram_seg {
+ struct mlx5_av av;
+};
+
+struct mlx5_wqe_raddr_seg {
+ __be64 raddr;
+ __be32 rkey;
+ u32 reserved;
+};
+
+struct mlx5_wqe_atomic_seg {
+ __be64 swap_add;
+ __be64 compare;
+};
+
+struct mlx5_wqe_data_seg {
+ __be32 byte_count;
+ __be32 lkey;
+ __be64 addr;
+};
+
+struct mlx5_wqe_umr_ctrl_seg {
+ u8 flags;
+ u8 rsvd0[3];
+ __be16 klm_octowords;
+ __be16 bsf_octowords;
+ __be64 mkey_mask;
+ u8 rsvd1[32];
+};
+
+struct mlx5_seg_set_psv {
+ __be32 psv_num;
+ __be16 syndrome;
+ __be16 status;
+ __be32 transient_sig;
+ __be32 ref_tag;
+};
+
+struct mlx5_seg_get_psv {
+ u8 rsvd[19];
+ u8 num_psv;
+ __be32 l_key;
+ __be64 va;
+ __be32 psv_index[4];
+};
+
+struct mlx5_seg_check_psv {
+ u8 rsvd0[2];
+ __be16 err_coalescing_op;
+ u8 rsvd1[2];
+ __be16 xport_err_op;
+ u8 rsvd2[2];
+ __be16 xport_err_mask;
+ u8 rsvd3[7];
+ u8 num_psv;
+ __be32 l_key;
+ __be64 va;
+ __be32 psv_index[4];
+};
+
+struct mlx5_rwqe_sig {
+ u8 rsvd0[4];
+ u8 signature;
+ u8 rsvd1[11];
+};
+
+struct mlx5_wqe_signature_seg {
+ u8 rsvd0[4];
+ u8 signature;
+ u8 rsvd1[11];
+};
+
+#define MLX5_WQE_INLINE_SEG_BYTE_COUNT_MASK 0x3ff
+
+struct mlx5_wqe_inline_seg {
+ __be32 byte_count;
+};
+
+enum mlx5_sig_type {
+ MLX5_DIF_CRC = 0x1,
+ MLX5_DIF_IPCS = 0x2,
+};
+
+struct mlx5_bsf_inl {
+ __be16 vld_refresh;
+ __be16 dif_apptag;
+ __be32 dif_reftag;
+ u8 sig_type;
+ u8 rp_inv_seed;
+ u8 rsvd[3];
+ u8 dif_inc_ref_guard_check;
+ __be16 dif_app_bitmask_check;
+};
+
+struct mlx5_bsf {
+ struct mlx5_bsf_basic {
+ u8 bsf_size_sbs;
+ u8 check_byte_mask;
+ union {
+ u8 copy_byte_mask;
+ u8 bs_selector;
+ u8 rsvd_wflags;
+ } wire;
+ union {
+ u8 bs_selector;
+ u8 rsvd_mflags;
+ } mem;
+ __be32 raw_data_size;
+ __be32 w_bfs_psv;
+ __be32 m_bfs_psv;
+ } basic;
+ struct mlx5_bsf_ext {
+ __be32 t_init_gen_pro_size;
+ __be32 rsvd_epi_size;
+ __be32 w_tfs_psv;
+ __be32 m_tfs_psv;
+ } ext;
+ struct mlx5_bsf_inl w_inl;
+ struct mlx5_bsf_inl m_inl;
+};
+
+struct mlx5_klm {
+ __be32 bcount;
+ __be32 key;
+ __be64 va;
+};
+
+struct mlx5_stride_block_entry {
+ __be16 stride;
+ __be16 bcount;
+ __be32 key;
+ __be64 va;
+};
+
+struct mlx5_stride_block_ctrl_seg {
+ __be32 bcount_per_cycle;
+ __be32 op;
+ __be32 repeat_count;
+ u16 rsvd;
+ __be16 num_entries;
+};
+
+enum mlx5_pagefault_flags {
+ MLX5_PFAULT_REQUESTOR = 1 << 0,
+ MLX5_PFAULT_WRITE = 1 << 1,
+ MLX5_PFAULT_RDMA = 1 << 2,
+};
+
+/* Contains the details of a pagefault. */
+struct mlx5_pagefault {
+ u32 bytes_committed;
+ u8 event_subtype;
+ enum mlx5_pagefault_flags flags;
+ union {
+ /* Initiator or send message responder pagefault details. */
+ struct {
+ /* Received packet size, only valid for responders. */
+ u32 packet_size;
+ /*
+ * WQE index. Refers to either the send queue or
+ * receive queue, according to event_subtype.
+ */
+ u16 wqe_index;
+ } wqe;
+ /* RDMA responder pagefault details */
+ struct {
+ u32 r_key;
+ /*
+ * Received packet size, minimal size page fault
+ * resolution required for forward progress.
+ */
+ u32 packet_size;
+ u32 rdma_op_len;
+ u64 rdma_va;
+ } rdma;
+ };
+};
+
+struct mlx5_core_qp {
+ struct mlx5_core_rsc_common common; /* must be first */
+ void (*event) (struct mlx5_core_qp *, int);
+ void (*pfault_handler)(struct mlx5_core_qp *, struct mlx5_pagefault *);
+ int qpn;
+ struct mlx5_rsc_debug *dbg;
+ int pid;
+};
+
+struct mlx5_qp_path {
+ u8 fl_free_ar;
+ u8 rsvd3;
+ __be16 pkey_index;
+ u8 rsvd0;
+ u8 grh_mlid;
+ __be16 rlid;
+ u8 ackto_lt;
+ u8 mgid_index;
+ u8 static_rate;
+ u8 hop_limit;
+ __be32 tclass_flowlabel;
+ union {
+ u8 rgid[16];
+ u8 rip[16];
+ };
+ u8 f_dscp_ecn_prio;
+ u8 ecn_dscp;
+ __be16 udp_sport;
+ u8 dci_cfi_prio_sl;
+ u8 port;
+ u8 rmac[6];
+};
+
+struct mlx5_qp_context {
+ __be32 flags;
+ __be32 flags_pd;
+ u8 mtu_msgmax;
+ u8 rq_size_stride;
+ __be16 sq_crq_size;
+ __be32 qp_counter_set_usr_page;
+ __be32 wire_qpn;
+ __be32 log_pg_sz_remote_qpn;
+ struct mlx5_qp_path pri_path;
+ struct mlx5_qp_path alt_path;
+ __be32 params1;
+ u8 reserved2[4];
+ __be32 next_send_psn;
+ __be32 cqn_send;
+ u8 reserved3[8];
+ __be32 last_acked_psn;
+ __be32 ssn;
+ __be32 params2;
+ __be32 rnr_nextrecvpsn;
+ __be32 xrcd;
+ __be32 cqn_recv;
+ __be64 db_rec_addr;
+ __be32 qkey;
+ __be32 rq_type_srqn;
+ __be32 rmsn;
+ __be16 hw_sq_wqe_counter;
+ __be16 sw_sq_wqe_counter;
+ __be16 hw_rcyclic_byte_counter;
+ __be16 hw_rq_counter;
+ __be16 sw_rcyclic_byte_counter;
+ __be16 sw_rq_counter;
+ u8 rsvd0[5];
+ u8 cgs;
+ u8 cs_req;
+ u8 cs_res;
+ __be64 dc_access_key;
+ u8 rsvd1[24];
+};
+
+struct mlx5_create_qp_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 input_qpn;
+ u8 rsvd0[4];
+ __be32 opt_param_mask;
+ u8 rsvd1[4];
+ struct mlx5_qp_context ctx;
+ u8 rsvd3[16];
+ __be64 pas[0];
+};
+
+struct mlx5_dct_context {
+ u8 state;
+ u8 rsvd0[7];
+ __be32 cqn;
+ __be32 flags;
+ u8 rsvd1;
+ u8 cs_res;
+ u8 min_rnr;
+ u8 rsvd2;
+ __be32 srqn;
+ __be32 pdn;
+ __be32 tclass_flow_label;
+ __be64 access_key;
+ u8 mtu;
+ u8 port;
+ __be16 pkey_index;
+ u8 rsvd4;
+ u8 mgid_index;
+ u8 rsvd5;
+ u8 hop_limit;
+ __be32 access_violations;
+ u8 rsvd[12];
+};
+
+struct mlx5_create_dct_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ u8 rsvd0[8];
+ struct mlx5_dct_context context;
+ u8 rsvd[48];
+};
+
+struct mlx5_create_dct_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ __be32 dctn;
+ u8 rsvd0[4];
+};
+
+struct mlx5_destroy_dct_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 dctn;
+ u8 rsvd0[4];
+};
+
+struct mlx5_destroy_dct_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd0[8];
+};
+
+struct mlx5_drain_dct_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 dctn;
+ u8 rsvd0[4];
+};
+
+struct mlx5_drain_dct_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd0[8];
+};
+
+struct mlx5_create_qp_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ __be32 qpn;
+ u8 rsvd0[4];
+};
+
+struct mlx5_destroy_qp_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 qpn;
+ u8 rsvd0[4];
+};
+
+struct mlx5_destroy_qp_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd0[8];
+};
+
+struct mlx5_modify_qp_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 qpn;
+ u8 rsvd1[4];
+ __be32 optparam;
+ u8 rsvd0[4];
+ struct mlx5_qp_context ctx;
+};
+
+struct mlx5_modify_qp_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd0[8];
+};
+
+struct mlx5_query_qp_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 qpn;
+ u8 rsvd[4];
+};
+
+struct mlx5_query_qp_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd1[8];
+ __be32 optparam;
+ u8 rsvd0[4];
+ struct mlx5_qp_context ctx;
+ u8 rsvd2[16];
+ __be64 pas[0];
+};
+
+struct mlx5_query_dct_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 dctn;
+ u8 rsvd[4];
+};
+
+struct mlx5_query_dct_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd0[8];
+ struct mlx5_dct_context ctx;
+ u8 rsvd1[48];
+};
+
+struct mlx5_arm_dct_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 dctn;
+ u8 rsvd[4];
+};
+
+struct mlx5_arm_dct_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd0[8];
+};
+
+struct mlx5_conf_sqp_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 qpn;
+ u8 rsvd[3];
+ u8 type;
+};
+
+struct mlx5_conf_sqp_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd[8];
+};
+
+struct mlx5_alloc_xrcd_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ u8 rsvd[8];
+};
+
+struct mlx5_alloc_xrcd_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ __be32 xrcdn;
+ u8 rsvd[4];
+};
+
+struct mlx5_dealloc_xrcd_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 xrcdn;
+ u8 rsvd[4];
+};
+
+struct mlx5_dealloc_xrcd_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd[8];
+};
+
+static inline struct mlx5_core_qp *__mlx5_qp_lookup(struct mlx5_core_dev *dev, u32 qpn)
+{
+ return radix_tree_lookup(&dev->priv.qp_table.tree, qpn);
+}
+
+static inline struct mlx5_core_mr *__mlx5_mr_lookup(struct mlx5_core_dev *dev, u32 key)
+{
+ return radix_tree_lookup(&dev->priv.mr_table.tree, key);
+}
+
+struct mlx5_page_fault_resume_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 flags_qpn;
+ u8 reserved[4];
+};
+
+struct mlx5_page_fault_resume_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvd[8];
+};
+
+int mlx5_core_create_qp(struct mlx5_core_dev *dev,
+ struct mlx5_core_qp *qp,
+ struct mlx5_create_qp_mbox_in *in,
+ int inlen);
+int mlx5_core_qp_modify(struct mlx5_core_dev *dev, enum mlx5_qp_state cur_state,
+ enum mlx5_qp_state new_state,
+ struct mlx5_modify_qp_mbox_in *in, int sqd_event,
+ struct mlx5_core_qp *qp);
+int mlx5_core_destroy_qp(struct mlx5_core_dev *dev,
+ struct mlx5_core_qp *qp);
+int mlx5_core_qp_query(struct mlx5_core_dev *dev, struct mlx5_core_qp *qp,
+ struct mlx5_query_qp_mbox_out *out, int outlen);
+int mlx5_core_dct_query(struct mlx5_core_dev *dev, struct mlx5_core_dct *dct,
+ struct mlx5_query_dct_mbox_out *out);
+int mlx5_core_arm_dct(struct mlx5_core_dev *dev, struct mlx5_core_dct *dct);
+
+int mlx5_core_xrcd_alloc(struct mlx5_core_dev *dev, u32 *xrcdn);
+int mlx5_core_xrcd_dealloc(struct mlx5_core_dev *dev, u32 xrcdn);
+void mlx5_init_qp_table(struct mlx5_core_dev *dev);
+void mlx5_cleanup_qp_table(struct mlx5_core_dev *dev);
+void mlx5_init_dct_table(struct mlx5_core_dev *dev);
+void mlx5_cleanup_dct_table(struct mlx5_core_dev *dev);
+int mlx5_debug_qp_add(struct mlx5_core_dev *dev, struct mlx5_core_qp *qp);
+void mlx5_debug_qp_remove(struct mlx5_core_dev *dev, struct mlx5_core_qp *qp);
+int mlx5_core_create_dct(struct mlx5_core_dev *dev,
+ struct mlx5_core_dct *dct,
+ struct mlx5_create_dct_mbox_in *in);
+int mlx5_core_destroy_dct(struct mlx5_core_dev *dev,
+ struct mlx5_core_dct *dct);
+int mlx5_debug_dct_add(struct mlx5_core_dev *dev, struct mlx5_core_dct *dct);
+void mlx5_debug_dct_remove(struct mlx5_core_dev *dev, struct mlx5_core_dct *dct);
+#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
+int mlx5_core_page_fault_resume(struct mlx5_core_dev *dev, u32 qpn,
+ u8 context, int error);
+#endif
+
+static inline const char *mlx5_qp_type_str(int type)
+{
+ switch (type) {
+ case MLX5_QP_ST_RC: return "RC";
+ case MLX5_QP_ST_UC: return "UC";
+ case MLX5_QP_ST_UD: return "UD";
+ case MLX5_QP_ST_XRC: return "XRC";
+ case MLX5_QP_ST_MLX: return "MLX";
+ case MLX5_QP_ST_DC: return "DC";
+ case MLX5_QP_ST_QP0: return "QP0";
+ case MLX5_QP_ST_QP1: return "QP1";
+ case MLX5_QP_ST_RAW_ETHERTYPE: return "RAW_ETHERTYPE";
+ case MLX5_QP_ST_RAW_IPV6: return "RAW_IPV6";
+ case MLX5_QP_ST_SNIFFER: return "SNIFFER";
+ case MLX5_QP_ST_SYNC_UMR: return "SYNC_UMR";
+ case MLX5_QP_ST_PTP_1588: return "PTP_1588";
+ case MLX5_QP_ST_REG_UMR: return "REG_UMR";
+ default: return "Invalid transport type";
+ }
+}
+
+static inline const char *mlx5_qp_state_str(int state)
+{
+ switch (state) {
+ case MLX5_QP_STATE_RST:
+ return "RST";
+ case MLX5_QP_STATE_INIT:
+ return "INIT";
+ case MLX5_QP_STATE_RTR:
+ return "RTR";
+ case MLX5_QP_STATE_RTS:
+ return "RTS";
+ case MLX5_QP_STATE_SQER:
+ return "SQER";
+ case MLX5_QP_STATE_SQD:
+ return "SQD";
+ case MLX5_QP_STATE_ERR:
+ return "ERR";
+ case MLX5_QP_STATE_SQ_DRAINING:
+ return "SQ_DRAINING";
+ case MLX5_QP_STATE_SUSPENDED:
+ return "SUSPENDED";
+ default: return "Invalid QP state";
+ }
+}
+
+#endif /* MLX5_QP_H */
+
+#include "post_kmod.h"
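
Note on the WQE control segment masks in qp.h: the MLX5_WQE_CTRL_* masks and shifts suggest that qpn_ds carries the QP number in its upper 24 bits and the descriptor-segment count in its low 6 bits, and that opmod_idx_opcode carries a 16-bit WQE index above an 8-bit opcode. The sketch below packs and unpacks those words in host byte order and rounds a segment count up to whole 64-byte basic blocks. The function names are hypothetical, and because the real fields are __be32 a byte-order conversion would still be needed before writing them to the device; that step is deliberately left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from the patch. */
enum {
	MLX5_SEND_WQE_DS = 16,
	MLX5_SEND_WQE_BB = 64,
};

#define MLX5_SEND_WQEBB_NUM_DS (MLX5_SEND_WQE_BB / MLX5_SEND_WQE_DS)

#define MLX5_WQE_CTRL_DS_MASK 0x3f
#define MLX5_WQE_CTRL_QPN_MASK 0xffffff00
#define MLX5_WQE_CTRL_QPN_SHIFT 8
#define MLX5_WQE_CTRL_OPCODE_MASK 0xff
#define MLX5_WQE_CTRL_WQE_INDEX_MASK 0x00ffff00
#define MLX5_WQE_CTRL_WQE_INDEX_SHIFT 8

/* Hypothetical helper: build the host-order value for qpn_ds. */
static uint32_t ctrl_pack_qpn_ds(uint32_t qpn, uint32_t ds)
{
	return ((qpn << MLX5_WQE_CTRL_QPN_SHIFT) & MLX5_WQE_CTRL_QPN_MASK) |
	       (ds & MLX5_WQE_CTRL_DS_MASK);
}

/* Hypothetical helper: recover the QP number from a host-order qpn_ds. */
static uint32_t ctrl_qpn(uint32_t qpn_ds)
{
	return (qpn_ds & MLX5_WQE_CTRL_QPN_MASK) >> MLX5_WQE_CTRL_QPN_SHIFT;
}

/* Hypothetical helper: build the host-order value for opmod_idx_opcode
 * with the opcode modifier (top byte) left at zero. */
static uint32_t ctrl_pack_idx_opcode(uint32_t wqe_index, uint32_t opcode)
{
	return ((wqe_index << MLX5_WQE_CTRL_WQE_INDEX_SHIFT) &
		MLX5_WQE_CTRL_WQE_INDEX_MASK) |
	       (opcode & MLX5_WQE_CTRL_OPCODE_MASK);
}

/* Number of 64-byte basic blocks needed for ds 16-byte segments. */
static uint32_t wqebbs_for_ds(uint32_t ds)
{
	return (ds + MLX5_SEND_WQEBB_NUM_DS - 1) / MLX5_SEND_WQEBB_NUM_DS;
}
```

For instance, a WQE made of five 16-byte segments occupies two basic blocks, since MLX5_SEND_WQEBB_NUM_DS evaluates to 4.
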
diff --git a/drivers/net/mlnx_uio/mlnx/include/mlx5/srq.h b/drivers/net/mlnx_uio/mlnx/include/mlx5/srq.h
new file mode 100644
index 0000000..e197b08
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/include/mlx5/srq.h
@@ -0,0 +1,46 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef MLX5_SRQ_H
+#define MLX5_SRQ_H
+
+
+void mlx5_init_srq_table(struct mlx5_core_dev *dev);
+void mlx5_cleanup_srq_table(struct mlx5_core_dev *dev);
+
+#endif /* MLX5_SRQ_H */
+
+#include "post_kmod.h"
diff --git a/drivers/net/mlnx_uio/mlnx/include/mlx5/vport.h b/drivers/net/mlnx_uio/mlnx/include/mlx5/vport.h
new file mode 100644
index 0000000..5494708
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/include/mlx5/vport.h
@@ -0,0 +1,52 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies, Ltd. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __MLX5_VPORT_H__
+#define __MLX5_VPORT_H__
+
+
+u8 mlx5_query_vport_state(struct mlx5_core_dev *mdev, u8 opmod);
+int mlx5_query_nic_vport_mac_address(struct mlx5_core_dev *mdev, u8 *addr);
+int mlx5_nic_vport_enable_roce(struct mlx5_core_dev *mdev);
+int mlx5_nic_vport_disable_roce(struct mlx5_core_dev *mdev);
+int mlx5_query_nic_vport_system_image_guid(struct mlx5_core_dev *mdev,
+ u64 *system_image_guid);
+int mlx5_query_nic_vport_node_guid(struct mlx5_core_dev *mdev, u64 *node_guid);
+int mlx5_query_nic_vport_qkey_viol_cntr(struct mlx5_core_dev *mdev,
+ u16 *qkey_viol_cntr);
+#endif /* __MLX5_VPORT_H__ */
+
+#include "post_kmod.h"
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/Kconfig b/drivers/net/mlnx_uio/mlnx/mlx4/Kconfig
new file mode 100644
index 0000000..1486ce9
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/Kconfig
@@ -0,0 +1,46 @@
+#
+# Mellanox driver configuration
+#
+
+config MLX4_EN
+ tristate "Mellanox Technologies 1/10/40Gbit Ethernet support"
+ depends on PCI
+ select MLX4_CORE
+ select PTP_1588_CLOCK
+ ---help---
+ This driver supports Mellanox Technologies ConnectX Ethernet
+ devices.
+
+config MLX4_EN_DCB
+ bool "Data Center Bridging (DCB) Support"
+ default y
+ depends on MLX4_EN && DCB
+ ---help---
+ Say Y here if you want to use Data Center Bridging (DCB) in the
+ driver.
+ If set to N, you will not be able to configure QoS and rate-limit attributes.
+ This flag depends on the kernel's DCB support.
+
+ If unsure, set to Y
+
+config MLX4_EN_VXLAN
+ bool "VXLAN offloads Support"
+ default y
+ depends on MLX4_EN && VXLAN && !(MLX4_EN=y && VXLAN=m)
+ ---help---
+ Say Y here if you want to use VXLAN offloads in the driver.
+
+config MLX4_CORE
+ tristate
+ depends on PCI
+ default n
+
+config MLX4_DEBUG
+ bool "Verbose debugging output" if (MLX4_CORE && EXPERT)
+ depends on MLX4_CORE
+ default y
+ ---help---
+ This option causes debugging code to be compiled into the
+ mlx4_core driver. The output can be turned on via the
+ debug_level module parameter (which can also be set after
+ the driver is loaded through sysfs).
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/Makefile b/drivers/net/mlnx_uio/mlnx/mlx4/Makefile
new file mode 100644
index 0000000..4dc4d05
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/Makefile
@@ -0,0 +1,19 @@
+ccflags-y += $(MLNX_CFLAGS)
+
+obj-$(CONFIG_MLX4_CORE) += mlx4_core.o
+
+mlx4_core-y := alloc.o catas.o cmd.o cq.o eq.o fw.o fw_qos.o icm.o intf.o \
+ main.o mcg.o mr.o pd.o port.o profile.o qp.o reset.o sense.o \
+ srq.o resource_tracker.o
+
+obj-$(CONFIG_MLX4_EN) += mlx4_en.o
+
+mlx4_en-y := en_main.o en_tx.o en_rx.o en_ethtool.o en_port.o en_cq.o \
+ en_resources.o en_netdev.o en_selftest.o en_clock.o
+
+ifeq ($(CONFIG_COMPAT_DISABLE_DCB),)
+mlx4_en-$(CONFIG_MLX4_EN_DCB) += en_dcb_nl.o
+endif
+ifneq ($(CONFIG_COMPAT_EN_SYSFS),)
+mlx4_en-y += en_sysfs.o
+endif
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/alloc.c b/drivers/net/mlnx_uio/mlnx/mlx4/alloc.c
new file mode 100644
index 0000000..ebe24b2
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/alloc.c
@@ -0,0 +1,872 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved.
+ * Copyright (c) 2007, 2008 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+
+#include "mlx4.h"
+#include "log2.h"
+
+u32 mlx4_bitmap_alloc(struct mlx4_bitmap *bitmap)
+{
+ u32 obj;
+
+ spin_lock(&bitmap->lock);
+
+ obj = find_next_zero_bit(bitmap->table, bitmap->max, bitmap->last);
+ if (obj >= bitmap->max) {
+ bitmap->top = (bitmap->top + bitmap->max + bitmap->reserved_top)
+ & bitmap->mask;
+ obj = find_first_zero_bit(bitmap->table, bitmap->max);
+ }
+
+ if (obj < bitmap->max) {
+ set_bit(obj, bitmap->table);
+ bitmap->last = (obj + 1);
+ if (bitmap->last == bitmap->max)
+ bitmap->last = 0;
+ obj |= bitmap->top;
+ } else
+ obj = -1;
+
+ if (obj != -1)
+ --bitmap->avail;
+
+ spin_unlock(&bitmap->lock);
+
+ return obj;
+}
+
+void mlx4_bitmap_free(struct mlx4_bitmap *bitmap, u32 obj, int use_rr)
+{
+ mlx4_bitmap_free_range(bitmap, obj, 1, use_rr);
+}
+
+static unsigned long find_aligned_range(unsigned long *bitmap,
+ u32 start, u32 nbits,
+ int len, int align, u32 skip_mask)
+{
+ unsigned long end, i;
+
+again:
+ start = ALIGN(start, align);
+
+ while ((start < nbits) && (test_bit(start, bitmap) ||
+ (start & skip_mask)))
+ start += align;
+
+ if (start >= nbits)
+ return -1;
+
+ end = start+len;
+ if (end > nbits)
+ return -1;
+
+ for (i = start + 1; i < end; i++) {
+ if (test_bit(i, bitmap) || ((u32)i & skip_mask)) {
+ start = i + 1;
+ goto again;
+ }
+ }
+
+ return start;
+}
+
+u32 mlx4_bitmap_alloc_range(struct mlx4_bitmap *bitmap, int cnt,
+ int align, u32 skip_mask)
+{
+ u32 obj;
+
+ if (likely(cnt == 1 && align == 1 && !skip_mask))
+ return mlx4_bitmap_alloc(bitmap);
+
+ spin_lock(&bitmap->lock);
+
+ obj = find_aligned_range(bitmap->table, bitmap->last,
+ bitmap->max, cnt, align, skip_mask);
+ if (obj >= bitmap->max) {
+ bitmap->top = (bitmap->top + bitmap->max + bitmap->reserved_top)
+ & bitmap->mask;
+ obj = find_aligned_range(bitmap->table, 0, bitmap->max,
+ cnt, align, skip_mask);
+ }
+
+ if (obj < bitmap->max) {
+ bitmap_set(bitmap->table, obj, cnt);
+ if (obj == bitmap->last) {
+ bitmap->last = (obj + cnt);
+ if (bitmap->last >= bitmap->max)
+ bitmap->last = 0;
+ }
+ obj |= bitmap->top;
+ } else
+ obj = -1;
+
+ if (obj != -1)
+ bitmap->avail -= cnt;
+
+ spin_unlock(&bitmap->lock);
+
+ return obj;
+}
+
+u32 mlx4_bitmap_avail(struct mlx4_bitmap *bitmap)
+{
+ return bitmap->avail;
+}
+
+static u32 mlx4_bitmap_masked_value(struct mlx4_bitmap *bitmap, u32 obj)
+{
+ return obj & (bitmap->max + bitmap->reserved_top - 1);
+}
+
+void mlx4_bitmap_free_range(struct mlx4_bitmap *bitmap, u32 obj, int cnt,
+ int use_rr)
+{
+ obj &= bitmap->max + bitmap->reserved_top - 1;
+
+ spin_lock(&bitmap->lock);
+ if (!use_rr) {
+ bitmap->last = min(bitmap->last, obj);
+ bitmap->top = (bitmap->top + bitmap->max + bitmap->reserved_top)
+ & bitmap->mask;
+ }
+ bitmap_clear(bitmap->table, obj, cnt);
+ bitmap->avail += cnt;
+ spin_unlock(&bitmap->lock);
+}
+
+int mlx4_bitmap_init(struct mlx4_bitmap *bitmap, u32 num, u32 mask,
+ u32 reserved_bot, u32 reserved_top)
+{
+ /* num must be a power of 2 */
+ if (num != roundup_pow_of_two(num))
+ return -EINVAL;
+
+ bitmap->last = 0;
+ bitmap->top = 0;
+ bitmap->max = num - reserved_top;
+ bitmap->mask = mask;
+ bitmap->reserved_top = reserved_top;
+ bitmap->avail = num - reserved_top - reserved_bot;
+ bitmap->effective_len = bitmap->avail;
+ spin_lock_init(&bitmap->lock);
+ bitmap->table = kzalloc(BITS_TO_LONGS(bitmap->max) *
+ sizeof (long), GFP_KERNEL);
+ if (!bitmap->table)
+ return -ENOMEM;
+
+ bitmap_set(bitmap->table, 0, reserved_bot);
+
+ return 0;
+}
+
+void mlx4_bitmap_cleanup(struct mlx4_bitmap *bitmap)
+{
+ kfree(bitmap->table);
+}
+
+struct mlx4_zone_allocator {
+ struct list_head entries;
+ struct list_head prios;
+ u32 last_uid;
+ u32 mask;
+ /* protect the zone_allocator from concurrent accesses */
+ spinlock_t lock;
+ enum mlx4_zone_alloc_flags flags;
+};
+
+struct mlx4_zone_entry {
+ struct list_head list;
+ struct list_head prio_list;
+ u32 uid;
+ struct mlx4_zone_allocator *allocator;
+ struct mlx4_bitmap *bitmap;
+ int use_rr;
+ int priority;
+ int offset;
+ enum mlx4_zone_flags flags;
+};
+
+struct mlx4_zone_allocator *mlx4_zone_allocator_create(enum mlx4_zone_alloc_flags flags)
+{
+ struct mlx4_zone_allocator *zones = kmalloc(sizeof(*zones), GFP_KERNEL);
+
+ if (NULL == zones)
+ return NULL;
+
+ INIT_LIST_HEAD(&zones->entries);
+ INIT_LIST_HEAD(&zones->prios);
+ spin_lock_init(&zones->lock);
+ zones->last_uid = 0;
+ zones->mask = 0;
+ zones->flags = flags;
+
+ return zones;
+}
+
+int mlx4_zone_add_one(struct mlx4_zone_allocator *zone_alloc,
+ struct mlx4_bitmap *bitmap,
+ u32 flags,
+ int priority,
+ int offset,
+ u32 *puid)
+{
+ u32 mask = mlx4_bitmap_masked_value(bitmap, (u32)-1);
+ struct mlx4_zone_entry *it;
+ struct mlx4_zone_entry *zone = kmalloc(sizeof(*zone), GFP_KERNEL);
+
+ if (NULL == zone)
+ return -ENOMEM;
+
+ zone->flags = flags;
+ zone->bitmap = bitmap;
+ zone->use_rr = (flags & MLX4_ZONE_USE_RR) ? MLX4_USE_RR : 0;
+ zone->priority = priority;
+ zone->offset = offset;
+
+ spin_lock(&zone_alloc->lock);
+
+ zone->uid = zone_alloc->last_uid++;
+ zone->allocator = zone_alloc;
+
+ if (zone_alloc->mask < mask)
+ zone_alloc->mask = mask;
+
+ list_for_each_entry(it, &zone_alloc->prios, prio_list)
+ if (it->priority >= priority)
+ break;
+
+ if (&it->prio_list == &zone_alloc->prios || it->priority > priority)
+ list_add_tail(&zone->prio_list, &it->prio_list);
+ list_add_tail(&zone->list, &it->list);
+
+ spin_unlock(&zone_alloc->lock);
+
+ *puid = zone->uid;
+
+ return 0;
+}
+
+/* Should be called under a lock */
+static int __mlx4_zone_remove_one_entry(struct mlx4_zone_entry *entry)
+{
+ struct mlx4_zone_allocator *zone_alloc = entry->allocator;
+
+ if (!list_empty(&entry->prio_list)) {
+ /* Check if we need to add an alternative node to the prio list */
+ if (!list_is_last(&entry->list, &zone_alloc->entries)) {
+ struct mlx4_zone_entry *next = list_first_entry(&entry->list,
+ typeof(*next),
+ list);
+
+ if (next->priority == entry->priority)
+ list_add_tail(&next->prio_list, &entry->prio_list);
+ }
+
+ list_del(&entry->prio_list);
+ }
+
+ list_del(&entry->list);
+
+ if (zone_alloc->flags & MLX4_ZONE_ALLOC_FLAGS_NO_OVERLAP) {
+ u32 mask = 0;
+ struct mlx4_zone_entry *it;
+
+ list_for_each_entry(it, &zone_alloc->prios, prio_list) {
+ u32 cur_mask = mlx4_bitmap_masked_value(it->bitmap, (u32)-1);
+
+ if (mask < cur_mask)
+ mask = cur_mask;
+ }
+ zone_alloc->mask = mask;
+ }
+
+ return 0;
+}
+
+void mlx4_zone_allocator_destroy(struct mlx4_zone_allocator *zone_alloc)
+{
+ struct mlx4_zone_entry *zone, *tmp;
+
+ spin_lock(&zone_alloc->lock);
+
+ list_for_each_entry_safe(zone, tmp, &zone_alloc->entries, list) {
+ list_del(&zone->list);
+ list_del(&zone->prio_list);
+ kfree(zone);
+ }
+
+ spin_unlock(&zone_alloc->lock);
+ kfree(zone_alloc);
+}
+
+/* Should be called under a lock */
+static u32 __mlx4_alloc_from_zone(struct mlx4_zone_entry *zone, int count,
+ int align, u32 skip_mask, u32 *puid)
+{
+ u32 uid = 0;
+ u32 res;
+ struct mlx4_zone_allocator *zone_alloc = zone->allocator;
+ struct mlx4_zone_entry *curr_node;
+
+ res = mlx4_bitmap_alloc_range(zone->bitmap, count,
+ align, skip_mask);
+
+ if (res != (u32)-1) {
+ res += zone->offset;
+ uid = zone->uid;
+ goto out;
+ }
+
+ list_for_each_entry(curr_node, &zone_alloc->prios, prio_list) {
+ if (unlikely(curr_node->priority == zone->priority))
+ break;
+ }
+
+ if (zone->flags & MLX4_ZONE_ALLOW_ALLOC_FROM_LOWER_PRIO) {
+ struct mlx4_zone_entry *it = curr_node;
+
+ list_for_each_entry_continue_reverse(it, &zone_alloc->entries, list) {
+ res = mlx4_bitmap_alloc_range(it->bitmap, count,
+ align, skip_mask);
+ if (res != (u32)-1) {
+ res += it->offset;
+ uid = it->uid;
+ goto out;
+ }
+ }
+ }
+
+ if (zone->flags & MLX4_ZONE_ALLOW_ALLOC_FROM_EQ_PRIO) {
+ struct mlx4_zone_entry *it = curr_node;
+
+ list_for_each_entry_from(it, &zone_alloc->entries, list) {
+ if (unlikely(it == zone))
+ continue;
+
+ if (unlikely(it->priority != curr_node->priority))
+ break;
+
+ res = mlx4_bitmap_alloc_range(it->bitmap, count,
+ align, skip_mask);
+ if (res != (u32)-1) {
+ res += it->offset;
+ uid = it->uid;
+ goto out;
+ }
+ }
+ }
+
+ if (zone->flags & MLX4_ZONE_FALLBACK_TO_HIGHER_PRIO) {
+ if (list_is_last(&curr_node->prio_list, &zone_alloc->prios))
+ goto out;
+
+ curr_node = list_first_entry(&curr_node->prio_list,
+ typeof(*curr_node),
+ prio_list);
+
+ list_for_each_entry_from(curr_node, &zone_alloc->entries, list) {
+ res = mlx4_bitmap_alloc_range(curr_node->bitmap, count,
+ align, skip_mask);
+ if (res != (u32)-1) {
+ res += curr_node->offset;
+ uid = curr_node->uid;
+ goto out;
+ }
+ }
+ }
+
+out:
+ if (NULL != puid && res != (u32)-1)
+ *puid = uid;
+ return res;
+}
+
+/* Should be called under a lock */
+static void __mlx4_free_from_zone(struct mlx4_zone_entry *zone, u32 obj,
+ u32 count)
+{
+ mlx4_bitmap_free_range(zone->bitmap, obj - zone->offset, count, zone->use_rr);
+}
+
+/* Should be called under a lock */
+static struct mlx4_zone_entry *__mlx4_find_zone_by_uid(
+ struct mlx4_zone_allocator *zones, u32 uid)
+{
+ struct mlx4_zone_entry *zone;
+
+ list_for_each_entry(zone, &zones->entries, list) {
+ if (zone->uid == uid)
+ return zone;
+ }
+
+ return NULL;
+}
+
+struct mlx4_bitmap *mlx4_zone_get_bitmap(struct mlx4_zone_allocator *zones, u32 uid)
+{
+ struct mlx4_zone_entry *zone;
+ struct mlx4_bitmap *bitmap;
+
+ spin_lock(&zones->lock);
+
+ zone = __mlx4_find_zone_by_uid(zones, uid);
+
+ bitmap = zone == NULL ? NULL : zone->bitmap;
+
+ spin_unlock(&zones->lock);
+
+ return bitmap;
+}
+
+int mlx4_zone_remove_one(struct mlx4_zone_allocator *zones, u32 uid)
+{
+ struct mlx4_zone_entry *zone;
+ int res;
+
+ spin_lock(&zones->lock);
+
+ zone = __mlx4_find_zone_by_uid(zones, uid);
+
+ if (NULL == zone) {
+ res = -1;
+ goto out;
+ }
+
+ res = __mlx4_zone_remove_one_entry(zone);
+
+out:
+ spin_unlock(&zones->lock);
+ kfree(zone);
+
+ return res;
+}
+
+/* Should be called under a lock */
+static struct mlx4_zone_entry *__mlx4_find_zone_by_uid_unique(
+ struct mlx4_zone_allocator *zones, u32 obj)
+{
+ struct mlx4_zone_entry *zone, *zone_candidate = NULL;
+ u32 dist = (u32)-1;
+
+ /* Search for the smallest zone that this obj could be
+ * allocated from. This is done in order to handle
+ * situations when small bitmaps are allocated from bigger
+ * bitmaps (and the allocated space is marked as reserved in
+ * the bigger bitmap).
+ */
+ list_for_each_entry(zone, &zones->entries, list) {
+ if (obj >= zone->offset) {
+ u32 mobj = (obj - zone->offset) & zones->mask;
+
+ if (mobj < zone->bitmap->max) {
+ u32 curr_dist = zone->bitmap->effective_len;
+
+ if (curr_dist < dist) {
+ dist = curr_dist;
+ zone_candidate = zone;
+ }
+ }
+ }
+ }
+
+ return zone_candidate;
+}
+
+u32 mlx4_zone_alloc_entries(struct mlx4_zone_allocator *zones, u32 uid, int count,
+ int align, u32 skip_mask, u32 *puid)
+{
+ struct mlx4_zone_entry *zone;
+ int res = -1;
+
+ spin_lock(&zones->lock);
+
+ zone = __mlx4_find_zone_by_uid(zones, uid);
+
+ if (NULL == zone)
+ goto out;
+
+ res = __mlx4_alloc_from_zone(zone, count, align, skip_mask, puid);
+
+out:
+ spin_unlock(&zones->lock);
+
+ return res;
+}
+
+u32 mlx4_zone_free_entries(struct mlx4_zone_allocator *zones, u32 uid, u32 obj, u32 count)
+{
+ struct mlx4_zone_entry *zone;
+ int res = 0;
+
+ spin_lock(&zones->lock);
+
+ zone = __mlx4_find_zone_by_uid(zones, uid);
+
+ if (NULL == zone) {
+ res = -1;
+ goto out;
+ }
+
+ __mlx4_free_from_zone(zone, obj, count);
+
+out:
+ spin_unlock(&zones->lock);
+
+ return res;
+}
+
+u32 mlx4_zone_free_entries_unique(struct mlx4_zone_allocator *zones, u32 obj, u32 count)
+{
+ struct mlx4_zone_entry *zone;
+ int res;
+
+ if (!(zones->flags & MLX4_ZONE_ALLOC_FLAGS_NO_OVERLAP))
+ return -EFAULT;
+
+ spin_lock(&zones->lock);
+
+ zone = __mlx4_find_zone_by_uid_unique(zones, obj);
+
+ if (NULL == zone) {
+ res = -1;
+ goto out;
+ }
+
+ __mlx4_free_from_zone(zone, obj, count);
+ res = 0;
+
+out:
+ spin_unlock(&zones->lock);
+
+ return res;
+}
+/*
+ * Handling for queue buffers -- we allocate a bunch of memory and
+ * register it in a memory region at HCA virtual address 0. If the
+ * requested size is > max_direct, we split the allocation into
+ * multiple pages, so we don't require too much contiguous memory.
+ */
+
+int mlx4_buf_alloc(struct mlx4_dev *dev, int size, int max_direct,
+ struct mlx4_buf *buf, gfp_t gfp)
+{
+ dma_addr_t t;
+
+/* ARM arch does not allow vmap(virt_to_page(x)) operations.
+ * In this case, we must allocate 1 contiguous DMA buffer.
+ */
+#ifdef CONFIG_ARM
+ max_direct = size;
+#endif
+
+#ifdef KMOD_MODIFIED
+ if (size > max_direct)
+ {
+ dev_warn(dev, "max_direct is %d, but request size is %d, ignored.\n", max_direct, size);
+ max_direct = size;
+ }
+#endif
+
+ if (size <= max_direct) {
+ buf->nbufs = 1;
+ buf->npages = 1;
+ buf->page_shift = get_order(size) + PAGE_SHIFT;
+#ifdef KMOD_MODIFIED
+ buf->direct.buf = rte_persistent_alloc(size, dev->persist->rte_pdev->numa_node);
+ t = rte_persistent_hw_addr(buf->direct.buf);
+#else
+ buf->direct.buf = dma_alloc_coherent(&dev->persist->pdev->dev,
+ size, &t, gfp);
+#endif
+ if (!buf->direct.buf)
+ return -ENOMEM;
+
+ buf->direct.map = t;
+
+ while (t & ((1 << buf->page_shift) - 1)) {
+ --buf->page_shift;
+ buf->npages *= 2;
+ }
+
+ memset(buf->direct.buf, 0, size);
+ }
+#ifdef KMOD_REMOVED
+ else {
+ int i;
+
+ buf->direct.buf = NULL;
+ buf->nbufs = (size + PAGE_SIZE - 1) / PAGE_SIZE;
+ buf->npages = buf->nbufs;
+ buf->page_shift = PAGE_SHIFT;
+ buf->page_list = kcalloc(buf->nbufs, sizeof(*buf->page_list),
+ gfp);
+ if (!buf->page_list)
+ return -ENOMEM;
+
+ for (i = 0; i < buf->nbufs; ++i) {
+ buf->page_list[i].buf =
+ dma_alloc_coherent(&dev->persist->pdev->dev,
+ PAGE_SIZE,
+ &t, gfp);
+ if (!buf->page_list[i].buf)
+ goto err_free;
+
+ buf->page_list[i].map = t;
+
+ memset(buf->page_list[i].buf, 0, PAGE_SIZE);
+ }
+
+ if (BITS_PER_LONG == 64) {
+ struct page **pages;
+ pages = kmalloc(sizeof *pages * buf->nbufs, gfp);
+ if (!pages)
+ goto err_free;
+ for (i = 0; i < buf->nbufs; ++i)
+ pages[i] = virt_to_page(buf->page_list[i].buf);
+ buf->direct.buf = vmap(pages, buf->nbufs, VM_MAP, PAGE_KERNEL);
+ kfree(pages);
+ if (!buf->direct.buf)
+ goto err_free;
+ }
+ }
+#endif
+ return 0;
+
+err_free:
+ mlx4_buf_free(dev, size, buf);
+
+ return -ENOMEM;
+}
+EXPORT_SYMBOL_GPL(mlx4_buf_alloc);
+
+void mlx4_buf_free(struct mlx4_dev *dev, int size, struct mlx4_buf *buf)
+{
+ int i;
+#ifdef KMOD_REMOVED
+ if (buf->nbufs == 1)
+#endif
+#ifdef KMOD_MODIFIED
+ rte_persistent_free(buf->direct.buf);
+#else
+ dma_free_coherent(&dev->persist->pdev->dev, size,
+ buf->direct.buf,
+ buf->direct.map);
+#endif
+#ifdef KMOD_REMOVED
+ else {
+ if (BITS_PER_LONG == 64)
+ vunmap(buf->direct.buf);
+
+ for (i = 0; i < buf->nbufs; ++i)
+ if (buf->page_list[i].buf)
+ dma_free_coherent(&dev->persist->pdev->dev,
+ PAGE_SIZE,
+ buf->page_list[i].buf,
+ buf->page_list[i].map);
+ kfree(buf->page_list);
+ }
+#endif
+}
+EXPORT_SYMBOL_GPL(mlx4_buf_free);
+
+static struct mlx4_db_pgdir *mlx4_alloc_db_pgdir(struct rte_pci_device *dma_device,
+ gfp_t gfp)
+{
+ struct mlx4_db_pgdir *pgdir;
+
+ pgdir = kzalloc(sizeof *pgdir, gfp);
+ if (!pgdir)
+ return NULL;
+
+ bitmap_fill(pgdir->order1, MLX4_DB_PER_PAGE / 2);
+ pgdir->bits[0] = pgdir->order0;
+ pgdir->bits[1] = pgdir->order1;
+#ifdef KMOD_MODIFIED
+ pgdir->db_page = rte_persistent_alloc(PAGE_SIZE, dma_device->numa_node);
+ pgdir->db_dma = rte_persistent_hw_addr(pgdir->db_page);
+#else
+ pgdir->db_page = dma_alloc_coherent(dma_device, PAGE_SIZE,
+ &pgdir->db_dma, gfp);
+#endif
+ if (!pgdir->db_page) {
+ kfree(pgdir);
+ return NULL;
+ }
+
+ return pgdir;
+}
+
+static int mlx4_alloc_db_from_pgdir(struct mlx4_db_pgdir *pgdir,
+ struct mlx4_db *db, int order)
+{
+ int o;
+ int i;
+
+ for (o = order; o <= 1; ++o) {
+ i = find_first_bit(pgdir->bits[o], MLX4_DB_PER_PAGE >> o);
+ if (i < MLX4_DB_PER_PAGE >> o)
+ goto found;
+ }
+
+ return -ENOMEM;
+
+found:
+ clear_bit(i, pgdir->bits[o]);
+
+ i <<= o;
+
+ if (o > order)
+ set_bit(i ^ 1, pgdir->bits[order]);
+
+ db->u.pgdir = pgdir;
+ db->index = i;
+ db->db = pgdir->db_page + db->index;
+ db->dma = pgdir->db_dma + db->index * 4;
+ db->order = order;
+
+ return 0;
+}
+
+int mlx4_db_alloc(struct mlx4_dev *dev, struct mlx4_db *db, int order, gfp_t gfp)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_db_pgdir *pgdir;
+ int ret = 0;
+
+ mutex_lock(&priv->pgdir_mutex);
+
+ list_for_each_entry(pgdir, &priv->pgdir_list, list)
+ if (!mlx4_alloc_db_from_pgdir(pgdir, db, order))
+ goto out;
+#ifdef KMOD_MODIFIED
+ pgdir = mlx4_alloc_db_pgdir(dev->persist->rte_pdev, gfp);
+#endif
+ if (!pgdir) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ list_add(&pgdir->list, &priv->pgdir_list);
+
+ /* This should never fail -- we just allocated an empty page: */
+ WARN_ON(mlx4_alloc_db_from_pgdir(pgdir, db, order));
+
+out:
+ mutex_unlock(&priv->pgdir_mutex);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(mlx4_db_alloc);
+
+void mlx4_db_free(struct mlx4_dev *dev, struct mlx4_db *db)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int o;
+ int i;
+
+ mutex_lock(&priv->pgdir_mutex);
+
+ o = db->order;
+ i = db->index;
+
+ if (db->order == 0 && test_bit(i ^ 1, db->u.pgdir->order0)) {
+ clear_bit(i ^ 1, db->u.pgdir->order0);
+ ++o;
+ }
+ i >>= o;
+ set_bit(i, db->u.pgdir->bits[o]);
+
+ if (bitmap_full(db->u.pgdir->order1, MLX4_DB_PER_PAGE / 2)) {
+#ifdef KMOD_MODIFIED
+ rte_persistent_free(db->u.pgdir->db_page);
+#else
+ dma_free_coherent(&dev->persist->pdev->dev, PAGE_SIZE,
+ db->u.pgdir->db_page, db->u.pgdir->db_dma);
+#endif
+ list_del(&db->u.pgdir->list);
+ kfree(db->u.pgdir);
+ }
+
+ mutex_unlock(&priv->pgdir_mutex);
+}
+EXPORT_SYMBOL_GPL(mlx4_db_free);
+
+int mlx4_alloc_hwq_res(struct mlx4_dev *dev, struct mlx4_hwq_resources *wqres,
+ int size, int max_direct)
+{
+ int err;
+
+ err = mlx4_db_alloc(dev, &wqres->db, 1, GFP_KERNEL);
+ if (err)
+ return err;
+
+ *wqres->db.db = 0;
+
+ err = mlx4_buf_alloc(dev, size, max_direct, &wqres->buf, GFP_KERNEL);
+ if (err)
+ goto err_db;
+
+ err = mlx4_mtt_init(dev, wqres->buf.npages, wqres->buf.page_shift,
+ &wqres->mtt);
+ if (err)
+ goto err_buf;
+
+ err = mlx4_buf_write_mtt(dev, &wqres->mtt, &wqres->buf, GFP_KERNEL);
+ if (err)
+ goto err_mtt;
+
+ return 0;
+
+err_mtt:
+ mlx4_mtt_cleanup(dev, &wqres->mtt);
+err_buf:
+ mlx4_buf_free(dev, size, &wqres->buf);
+err_db:
+ mlx4_db_free(dev, &wqres->db);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_alloc_hwq_res);
+
+void mlx4_free_hwq_res(struct mlx4_dev *dev, struct mlx4_hwq_resources *wqres,
+ int size)
+{
+ mlx4_mtt_cleanup(dev, &wqres->mtt);
+ mlx4_buf_free(dev, size, &wqres->buf);
+ mlx4_db_free(dev, &wqres->db);
+}
+EXPORT_SYMBOL_GPL(mlx4_free_hwq_res);
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/catas.c b/drivers/net/mlnx_uio/mlnx/mlx4/catas.c
new file mode 100644
index 0000000..7fafd67
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/catas.c
@@ -0,0 +1,350 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2007 Cisco Systems, Inc. All rights reserved.
+ * Copyright (c) 2007, 2008 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+
+#include "mlx4.h"
+
+enum {
+ MLX4_CATAS_POLL_INTERVAL = 5 * HZ,
+};
+
+
+
+int mlx4_internal_err_reset = 1;
+module_param_named(internal_err_reset, mlx4_internal_err_reset, int, 0644);
+MODULE_PARM_DESC(internal_err_reset,
+ "Reset device on internal errors if non-zero (default 1)");
+
+static int read_vendor_id(struct mlx4_dev *dev)
+{
+#ifdef KMOD_REMOVED
+ u16 vendor_id = 0;
+ int ret;
+
+ ret = pci_read_config_word(dev->persist->pdev, 0, &vendor_id);
+ if (ret) {
+ mlx4_err(dev, "Failed to read vendor ID, ret=%d\n", ret);
+ return ret;
+ }
+
+ if (vendor_id == 0xffff) {
+ mlx4_err(dev, "PCI can't be accessed to read vendor id\n");
+ return -EINVAL;
+ }
+
+ return 0;
+#endif
+ return dev->persist->rte_pdev->id.vendor_id == 0xffff ? -EINVAL : 0;
+}
+
+static int mlx4_reset_master(struct mlx4_dev *dev)
+{
+ int err = 0;
+
+ if (mlx4_is_master(dev))
+ mlx4_report_internal_err_comm_event(dev);
+#ifdef KMOD_REMOVED
+ if (!pci_channel_offline(dev->persist->pdev))
+#endif
+ {
+ err = read_vendor_id(dev);
+ /* If PCI can't be accessed to read vendor ID we assume that its
+ * link was disabled and chip was already reset.
+ */
+ if (err)
+ return 0;
+
+ err = mlx4_reset(dev);
+ if (err)
+ mlx4_err(dev, "Failed to reset HCA\n");
+ }
+
+ return err;
+}
+
+static int mlx4_reset_slave(struct mlx4_dev *dev)
+{
+#define COM_CHAN_RST_REQ_OFFSET 0x10
+#define COM_CHAN_RST_ACK_OFFSET 0x08
+
+ u32 comm_flags;
+ u32 rst_req;
+ u32 rst_ack;
+ unsigned long end;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+#ifdef KMOD_REMOVED
+ if (pci_channel_offline(dev->persist->pdev))
+ return 0;
+#endif
+
+ comm_flags = swab32(readl((__iomem char *)priv->mfunc.comm +
+ MLX4_COMM_CHAN_FLAGS));
+ if (comm_flags == 0xffffffff) {
+ mlx4_err(dev, "VF reset is not needed\n");
+ return 0;
+ }
+
+ if (!(dev->caps.vf_caps & MLX4_VF_CAP_FLAG_RESET)) {
+ mlx4_err(dev, "VF reset is not supported\n");
+ return -EOPNOTSUPP;
+ }
+
+ rst_req = (comm_flags & (u32)(1 << COM_CHAN_RST_REQ_OFFSET)) >>
+ COM_CHAN_RST_REQ_OFFSET;
+ rst_ack = (comm_flags & (u32)(1 << COM_CHAN_RST_ACK_OFFSET)) >>
+ COM_CHAN_RST_ACK_OFFSET;
+ if (rst_req != rst_ack) {
+ mlx4_err(dev, "Communication channel isn't sync, fail to send reset\n");
+ return -EIO;
+ }
+
+ rst_req ^= 1;
+ mlx4_warn(dev, "VF is sending reset request to Firmware\n");
+ comm_flags = rst_req << COM_CHAN_RST_REQ_OFFSET;
+ __raw_writel((__force u32)cpu_to_be32(comm_flags),
+ (__iomem char *)priv->mfunc.comm + MLX4_COMM_CHAN_FLAGS);
+ /* Make sure that our comm channel write doesn't
+ * get mixed in with writes from another CPU.
+ */
+ mmiowb();
+
+ end = msecs_to_jiffies(MLX4_COMM_TIME) + jiffies;
+ while (time_before(jiffies, end)) {
+ comm_flags = swab32(readl((__iomem char *)priv->mfunc.comm +
+ MLX4_COMM_CHAN_FLAGS));
+ rst_ack = (comm_flags & (u32)(1 << COM_CHAN_RST_ACK_OFFSET)) >>
+ COM_CHAN_RST_ACK_OFFSET;
+
+ /* Reading rst_req again since the communication channel can
+ * be reset at any time by the PF and all its bits will be
+ * set to zero.
+ */
+ rst_req = (comm_flags & (u32)(1 << COM_CHAN_RST_REQ_OFFSET)) >>
+ COM_CHAN_RST_REQ_OFFSET;
+
+ if (rst_ack == rst_req) {
+ mlx4_warn(dev, "VF reset succeeded\n");
+ return 0;
+ }
+ cond_resched();
+ }
+ mlx4_err(dev, "Fail to send reset over the communication channel\n");
+ return -ETIMEDOUT;
+}
+
+static int mlx4_comm_internal_err(u32 slave_read)
+{
+ return (u32)COMM_CHAN_EVENT_INTERNAL_ERR ==
+ (slave_read & (u32)COMM_CHAN_EVENT_INTERNAL_ERR) ? 1 : 0;
+}
+
+void mlx4_enter_error_state(struct mlx4_dev_persistent *persist)
+{
+ int err;
+ struct mlx4_dev *dev;
+
+ if (!mlx4_internal_err_reset)
+ return;
+
+ mutex_lock(&persist->device_state_mutex);
+ if (persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR)
+ goto out;
+
+ dev = persist->dev;
+ mlx4_err(dev, "device is going to be reset\n");
+ if (mlx4_is_slave(dev))
+ err = mlx4_reset_slave(dev);
+ else
+ err = mlx4_reset_master(dev);
+ BUG_ON(err != 0);
+
+ dev->persist->state |= MLX4_DEVICE_STATE_INTERNAL_ERROR;
+ mlx4_err(dev, "device was reset successfully\n");
+ mutex_unlock(&persist->device_state_mutex);
+
+ /* At that step HW was already reset, now notify clients */
+ mlx4_dispatch_event(dev, MLX4_DEV_EVENT_CATASTROPHIC_ERROR, 0);
+ mlx4_cmd_wake_completions(dev);
+ return;
+
+out:
+ mutex_unlock(&persist->device_state_mutex);
+}
+
+static void mlx4_handle_error_state(struct mlx4_dev_persistent *persist)
+{
+ int err = 0;
+
+ mlx4_enter_error_state(persist);
+ mutex_lock(&persist->interface_state_mutex);
+ if (persist->interface_state & MLX4_INTERFACE_STATE_UP &&
+ !(persist->interface_state & MLX4_INTERFACE_STATE_DELETION)) {
+#ifdef KMOD_REMOVED
+ err = mlx4_restart_one(persist->rte_pdev);
+#endif
+ mlx4_info(persist->dev, "mlx4_restart_one was ended, ret=%d\n",
+ err);
+ }
+ mutex_unlock(&persist->interface_state_mutex);
+}
+
+static void dump_err_buf(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ int i;
+
+ mlx4_err(dev, "Internal error detected:\n");
+ for (i = 0; i < priv->fw.catas_size; ++i)
+ mlx4_err(dev, " buf[%02x]: %08x\n",
+ i, swab32(readl(priv->catas_err.map + i)));
+}
+
+static void poll_catas(unsigned long dev_ptr)
+{
+ struct mlx4_dev *dev = (struct mlx4_dev *) dev_ptr;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ u32 slave_read;
+
+ if (mlx4_is_slave(dev)) {
+ slave_read = swab32(readl(&priv->mfunc.comm->slave_read));
+ if (mlx4_comm_internal_err(slave_read)) {
+ mlx4_warn(dev, "Internal error detected on the communication channel\n");
+ goto internal_err;
+ }
+ } else if (readl(priv->catas_err.map)) {
+ dump_err_buf(dev);
+ goto internal_err;
+ }
+
+ if (dev->persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR) {
+ mlx4_warn(dev, "Internal error mark was detected on device\n");
+ goto internal_err;
+ }
+#ifdef KMOD_REMOVED
+ mod_timer(&priv->catas_err.timer,
+ round_jiffies(jiffies + MLX4_CATAS_POLL_INTERVAL));
+ return;
+
+internal_err:
+ if (mlx4_internal_err_reset)
+ queue_work(dev->persist->catas_wq, &dev->persist->catas_work);
+#else
+internal_err: ; /* no reset work to queue in the user-space build */
+#endif
+}
+#ifdef KMOD_MODIFIED
+static void catas_reset(struct mlx4_dev_persistent *persist)
+{
+ mlx4_handle_error_state(persist);
+}
+#endif
+
+void mlx4_start_catas_poll(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ //phys_addr_t addr;
+
+ INIT_LIST_HEAD(&priv->catas_err.list);
+#ifdef KMOD_DISABLED
+ init_timer(&priv->catas_err.timer);
+#endif
+ priv->catas_err.map = NULL;
+
+ if (!mlx4_is_slave(dev)) {
+#ifdef KMOD_MODIFIED
+ assert(dev->persist->rte_pdev->mem_resource[priv->fw.catas_bar].len
+ >= ( priv->fw.catas_offset + priv->fw.catas_size * 4 ));
+ priv->catas_err.map = RTE_PTR_ADD(dev->persist->rte_pdev->mem_resource[priv->fw.catas_bar].addr, priv->fw.catas_offset);
+#else
+ addr = pci_resource_start(dev->persist->pdev,
+ priv->fw.catas_bar) +
+ priv->fw.catas_offset;
+
+ priv->catas_err.map = ioremap(addr, priv->fw.catas_size * 4);
+#endif
+ if (!priv->catas_err.map) {
+ mlx4_warn(dev, "Failed to map internal error buffer\n");
+ return;
+ }
+ }
+#ifdef KMOD_REMOVED
+ priv->catas_err.timer.data = (unsigned long) dev;
+ priv->catas_err.timer.function = poll_catas;
+ priv->catas_err.timer.expires =
+ round_jiffies(jiffies + MLX4_CATAS_POLL_INTERVAL);
+ add_timer(&priv->catas_err.timer);
+#endif
+}
+
+void mlx4_stop_catas_poll(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+#ifdef KMOD_REMOVED
+ del_timer_sync(&priv->catas_err.timer);
+
+ if (priv->catas_err.map) {
+ iounmap(priv->catas_err.map);
+ priv->catas_err.map = NULL;
+ }
+
+ if (dev->persist->interface_state & MLX4_INTERFACE_STATE_DELETION)
+ flush_workqueue(dev->persist->catas_wq);
+#endif
+}
+
+int mlx4_catas_init(struct mlx4_dev *dev)
+{
+#ifdef KMOD_REMOVED
+ INIT_WORK(&dev->persist->catas_work, catas_reset);
+ dev->persist->catas_wq = create_singlethread_workqueue("mlx4_health");
+ if (!dev->persist->catas_wq)
+ return -ENOMEM;
+#endif
+
+ return 0;
+}
+
+void mlx4_catas_end(struct mlx4_dev *dev)
+{
+#ifdef KMOD_REMOVED
+ if (dev->persist->catas_wq) {
+ destroy_workqueue(dev->persist->catas_wq);
+ dev->persist->catas_wq = NULL;
+ }
+#endif
+}
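
In the KMOD_MODIFIED branch, mlx4_start_catas_poll() asserts that the
error buffer (catas_size 32-bit words at catas_offset) fits inside the
BAR before computing its address with RTE_PTR_ADD. If asserts are compiled
out, that check disappears. Below is a minimal sketch of the same bounds
check that fails softly instead. The names struct bar_region and
catas_map() are hypothetical and exist only for this illustration; the
struct stands in for one rte_pdev->mem_resource[] entry and assumes only
the addr and len fields that the patch itself uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of rte_pdev->mem_resource[]. */
struct bar_region {
	void *addr;	/* user-space mapping of the BAR */
	uint64_t len;	/* mapped length in bytes */
};

/*
 * Return a pointer to the catastrophic-error buffer, or NULL when the
 * buffer (catas_size 32-bit words starting at catas_offset) would run
 * past the end of the BAR. The comparison is arranged so that the
 * arithmetic cannot overflow.
 */
static volatile uint32_t *catas_map(const struct bar_region *bar,
				    uint64_t catas_offset,
				    uint32_t catas_size)
{
	uint64_t bytes = (uint64_t)catas_size * 4;

	assert(bar != NULL);
	if (bar->addr == NULL || catas_offset > bar->len ||
	    bytes > bar->len - catas_offset)
		return NULL;
	return (volatile uint32_t *)((uint8_t *)bar->addr + catas_offset);
}
```

With a helper of this shape, a NULL result could be handled by the
existing "Failed to map internal error buffer" warning path rather than
aborting the process.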
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/cmd.c b/drivers/net/mlnx_uio/mlnx/mlx4/cmd.c
new file mode 100644
index 0000000..3337e46
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/cmd.c
@@ -0,0 +1,3456 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007, 2008 Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007 Cisco Systems, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "ib_verbs.h"
+#include "ib_smi.h"
+#include "ib_mad.h"
+
+#include "mlx4/device.h"
+#include "mlx4.h"
+#include "fw.h"
+#include "fw_qos.h"
+
+#define CMD_POLL_TOKEN 0xffff
+#define INBOX_MASK 0xffffffffffffff00ULL
+
+#define CMD_CHAN_VER 1
+#define CMD_CHAN_IF_REV 1
+
+enum {
+ /* command completed successfully: */
+ CMD_STAT_OK = 0x00,
+ /* Internal error (such as a bus error) occurred while processing command: */
+ CMD_STAT_INTERNAL_ERR = 0x01,
+ /* Operation/command not supported or opcode modifier not supported: */
+ CMD_STAT_BAD_OP = 0x02,
+ /* Parameter not supported or parameter out of range: */
+ CMD_STAT_BAD_PARAM = 0x03,
+ /* System not enabled or bad system state: */
+ CMD_STAT_BAD_SYS_STATE = 0x04,
+ /* Attempt to access reserved or unallocated resource: */
+ CMD_STAT_BAD_RESOURCE = 0x05,
+ /* Requested resource is currently executing a command, or is otherwise busy: */
+ CMD_STAT_RESOURCE_BUSY = 0x06,
+ /* Required capability exceeds device limits: */
+ CMD_STAT_EXCEED_LIM = 0x08,
+ /* Resource is not in the appropriate state or ownership: */
+ CMD_STAT_BAD_RES_STATE = 0x09,
+ /* Index out of range: */
+ CMD_STAT_BAD_INDEX = 0x0a,
+ /* FW image corrupted: */
+ CMD_STAT_BAD_NVMEM = 0x0b,
+ /* Error in ICM mapping (e.g. not enough auxiliary ICM pages to execute command): */
+ CMD_STAT_ICM_ERROR = 0x0c,
+ /* Attempt to modify a QP/EE which is not in the presumed state: */
+ CMD_STAT_BAD_QP_STATE = 0x10,
+ /* Bad segment parameters (Address/Size): */
+ CMD_STAT_BAD_SEG_PARAM = 0x20,
+ /* Memory Region has Memory Windows bound to: */
+ CMD_STAT_REG_BOUND = 0x21,
+ /* HCA local attached memory not present: */
+ CMD_STAT_LAM_NOT_PRE = 0x22,
+ /* Bad management packet (silently discarded): */
+ CMD_STAT_BAD_PKT = 0x30,
+ /* More outstanding CQEs in CQ than new CQ size: */
+ CMD_STAT_BAD_SIZE = 0x40,
+ /* Multi Function device support required: */
+ CMD_STAT_MULTI_FUNC_REQ = 0x50,
+};
+
+enum {
+ HCR_IN_PARAM_OFFSET = 0x00,
+ HCR_IN_MODIFIER_OFFSET = 0x08,
+ HCR_OUT_PARAM_OFFSET = 0x0c,
+ HCR_TOKEN_OFFSET = 0x14,
+ HCR_STATUS_OFFSET = 0x18,
+
+ HCR_OPMOD_SHIFT = 12,
+ HCR_T_BIT = 21,
+ HCR_E_BIT = 22,
+ HCR_GO_BIT = 23
+};
+
+enum {
+ GO_BIT_TIMEOUT_MSECS = 10000
+};
+
+enum mlx4_vlan_transition {
+ MLX4_VLAN_TRANSITION_VST_VST = 0,
+ MLX4_VLAN_TRANSITION_VST_VGT = 1,
+ MLX4_VLAN_TRANSITION_VGT_VST = 2,
+ MLX4_VLAN_TRANSITION_VGT_VGT = 3,
+};
+
+
+struct mlx4_cmd_context {
+ struct completion done;
+ int result;
+ int next;
+ u64 out_param;
+ u16 token;
+ u8 fw_status;
+};
+
+static int mlx4_master_process_vhcr(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr_cmd *in_vhcr);
+
+static int mlx4_status_to_errno(u8 status)
+{
+ static const int trans_table[] = {
+ [CMD_STAT_INTERNAL_ERR] = -EIO,
+ [CMD_STAT_BAD_OP] = -EPERM,
+ [CMD_STAT_BAD_PARAM] = -EINVAL,
+ [CMD_STAT_BAD_SYS_STATE] = -ENXIO,
+ [CMD_STAT_BAD_RESOURCE] = -EBADF,
+ [CMD_STAT_RESOURCE_BUSY] = -EBUSY,
+ [CMD_STAT_EXCEED_LIM] = -ENOMEM,
+ [CMD_STAT_BAD_RES_STATE] = -EBADF,
+ [CMD_STAT_BAD_INDEX] = -EBADF,
+ [CMD_STAT_BAD_NVMEM] = -EFAULT,
+ [CMD_STAT_ICM_ERROR] = -ENFILE,
+ [CMD_STAT_BAD_QP_STATE] = -EINVAL,
+ [CMD_STAT_BAD_SEG_PARAM] = -EFAULT,
+ [CMD_STAT_REG_BOUND] = -EBUSY,
+ [CMD_STAT_LAM_NOT_PRE] = -EAGAIN,
+ [CMD_STAT_BAD_PKT] = -EINVAL,
+ [CMD_STAT_BAD_SIZE] = -ENOMEM,
+ [CMD_STAT_MULTI_FUNC_REQ] = -EACCES,
+ };
+
+ if (status >= ARRAY_SIZE(trans_table) ||
+ (status != CMD_STAT_OK && trans_table[status] == 0))
+ return -EIO;
+
+ return trans_table[status];
+}
+
+static u8 mlx4_errno_to_status(int err)
+{
+ switch (err) {
+ case -EPERM:
+ return CMD_STAT_BAD_OP;
+ case -EINVAL:
+ return CMD_STAT_BAD_PARAM;
+ case -ENXIO:
+ return CMD_STAT_BAD_SYS_STATE;
+ case -EBUSY:
+ return CMD_STAT_RESOURCE_BUSY;
+ case -ENOMEM:
+ return CMD_STAT_EXCEED_LIM;
+ case -ENFILE:
+ return CMD_STAT_ICM_ERROR;
+ default:
+ return CMD_STAT_INTERNAL_ERR;
+ }
+}
+
+static int mlx4_internal_err_ret_value(struct mlx4_dev *dev, u16 op,
+ u8 op_modifier)
+{
+ switch (op) {
+ case MLX4_CMD_UNMAP_ICM:
+ case MLX4_CMD_UNMAP_ICM_AUX:
+ case MLX4_CMD_UNMAP_FA:
+ case MLX4_CMD_2RST_QP:
+ case MLX4_CMD_HW2SW_EQ:
+ case MLX4_CMD_HW2SW_CQ:
+ case MLX4_CMD_HW2SW_SRQ:
+ case MLX4_CMD_HW2SW_MPT:
+ case MLX4_CMD_CLOSE_HCA:
+ case MLX4_QP_FLOW_STEERING_DETACH:
+ case MLX4_CMD_FREE_RES:
+ case MLX4_CMD_CLOSE_PORT:
+ return CMD_STAT_OK;
+
+ case MLX4_CMD_QP_ATTACH:
+ /* On Detach case return success */
+ if (op_modifier == 0)
+ return CMD_STAT_OK;
+ return mlx4_status_to_errno(CMD_STAT_INTERNAL_ERR);
+
+ default:
+ return mlx4_status_to_errno(CMD_STAT_INTERNAL_ERR);
+ }
+}
+
+static int mlx4_closing_cmd_fatal_error(u16 op, u8 fw_status)
+{
+ /* Any error during the closing commands below is considered fatal */
+ if (op == MLX4_CMD_CLOSE_HCA ||
+ op == MLX4_CMD_HW2SW_EQ ||
+ op == MLX4_CMD_HW2SW_CQ ||
+ op == MLX4_CMD_2RST_QP ||
+ op == MLX4_CMD_HW2SW_SRQ ||
+ op == MLX4_CMD_SYNC_TPT ||
+ op == MLX4_CMD_UNMAP_ICM ||
+ op == MLX4_CMD_UNMAP_ICM_AUX ||
+ op == MLX4_CMD_UNMAP_FA)
+ return 1;
+ /* Error on MLX4_CMD_HW2SW_MPT is fatal except when fw status equals
+ * CMD_STAT_REG_BOUND.
+ * This status indicates that memory region has memory windows bound to it
+ * which may result from invalid user space usage and is not fatal.
+ */
+ if (op == MLX4_CMD_HW2SW_MPT && fw_status != CMD_STAT_REG_BOUND)
+ return 1;
+ return 0;
+}
+
+static int mlx4_cmd_reset_flow(struct mlx4_dev *dev, u16 op, u8 op_modifier,
+ int err)
+{
+ /* Only if reset flow is really active return code is based on
+ * command, otherwise current error code is returned.
+ */
+ if (mlx4_internal_err_reset) {
+ mlx4_enter_error_state(dev->persist);
+ err = mlx4_internal_err_ret_value(dev, op, op_modifier);
+ }
+
+ return err;
+}
+
+static int comm_pending(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ u32 status = readl(&priv->mfunc.comm->slave_read);
+
+ return (swab32(status) >> 31) != priv->cmd.comm_toggle;
+}
+
+static int mlx4_comm_cmd_post(struct mlx4_dev *dev, u8 cmd, u16 param)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ u32 val;
+
+ /* To avoid writing to unknown addresses after the device state was
+ * changed to internal error and the function was reset,
+ * check the INTERNAL_ERROR flag which is updated under
+ * device_state_mutex lock.
+ */
+ mutex_lock(&dev->persist->device_state_mutex);
+
+ if (dev->persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR) {
+ mutex_unlock(&dev->persist->device_state_mutex);
+ return -EIO;
+ }
+
+ priv->cmd.comm_toggle ^= 1;
+ val = param | (cmd << 16) | (priv->cmd.comm_toggle << 31);
+ __raw_writel((__force u32) cpu_to_be32(val),
+ &priv->mfunc.comm->slave_write);
+ mmiowb();
+ mutex_unlock(&dev->persist->device_state_mutex);
+ return 0;
+}
+
+static int mlx4_comm_cmd_poll(struct mlx4_dev *dev, u8 cmd, u16 param,
+ unsigned long timeout)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ unsigned long end;
+ int err = 0;
+ int ret_from_pending = 0;
+
+ /* First, verify that the master reports correct status */
+ if (comm_pending(dev)) {
+ mlx4_warn(dev, "Communication channel is not idle - my toggle is %d (cmd:0x%x)\n",
+ priv->cmd.comm_toggle, cmd);
+ return -EAGAIN;
+ }
+
+ /* Write command */
+ down(&priv->cmd.poll_sem);
+ if (mlx4_comm_cmd_post(dev, cmd, param)) {
+ /* Only in case the device state is INTERNAL_ERROR,
+ * mlx4_comm_cmd_post returns with an error
+ */
+ err = mlx4_status_to_errno(CMD_STAT_INTERNAL_ERR);
+ goto out;
+ }
+
+ end = msecs_to_jiffies(timeout) + jiffies;
+ while (comm_pending(dev) && time_before(jiffies, end))
+ cond_resched();
+ ret_from_pending = comm_pending(dev);
+ if (ret_from_pending) {
+ /* Check whether the slave is trying to boot in the middle of
+ * the FLR process. The only non-zero result of the RESET command
+ * is MLX4_DELAY_RESET_SLAVE. */
+ if (cmd == MLX4_COMM_CMD_RESET) {
+ err = MLX4_DELAY_RESET_SLAVE;
+ goto out;
+ } else {
+ mlx4_warn(dev, "Communication channel command 0x%x timed out\n",
+ cmd);
+ err = mlx4_status_to_errno(CMD_STAT_INTERNAL_ERR);
+ }
+ }
+
+ if (err)
+ mlx4_enter_error_state(dev->persist);
+out:
+ up(&priv->cmd.poll_sem);
+ return err;
+}
+
+#ifdef KMOD_DISABLED
+
+static int mlx4_comm_cmd_wait(struct mlx4_dev *dev, u8 vhcr_cmd,
+ u16 param, u16 op, unsigned long timeout)
+{
+ struct mlx4_cmd *cmd = &mlx4_priv(dev)->cmd;
+ struct mlx4_cmd_context *context;
+ unsigned long end;
+ int err = 0;
+
+ down(&cmd->event_sem);
+
+ spin_lock(&cmd->context_lock);
+ BUG_ON(cmd->free_head < 0);
+ context = &cmd->context[cmd->free_head];
+ context->token += cmd->token_mask + 1;
+ cmd->free_head = context->next;
+ spin_unlock(&cmd->context_lock);
+
+ reinit_completion(&context->done);
+
+ if (mlx4_comm_cmd_post(dev, vhcr_cmd, param)) {
+ /* Only in case the device state is INTERNAL_ERROR,
+ * mlx4_comm_cmd_post returns with an error
+ */
+ err = mlx4_status_to_errno(CMD_STAT_INTERNAL_ERR);
+ goto out;
+ }
+
+ if (!wait_for_completion_timeout(&context->done,
+ msecs_to_jiffies(timeout))) {
+ mlx4_warn(dev, "communication channel command 0x%x (op=0x%x) timed out\n",
+ vhcr_cmd, op);
+ goto out_reset;
+ }
+
+ err = context->result;
+ if (err && context->fw_status != CMD_STAT_MULTI_FUNC_REQ) {
+ mlx4_err(dev, "command 0x%x failed: fw status = 0x%x\n",
+ vhcr_cmd, context->fw_status);
+ if (mlx4_closing_cmd_fatal_error(op, context->fw_status))
+ goto out_reset;
+ }
+
+ /* Wait for the comm channel to become ready.
+ * This is necessary to prevent a race
+ * when switching from event mode to polling mode.
+ * Skip this section when the device is in the INTERNAL_ERROR state:
+ * in that state, no commands are sent via the comm channel until
+ * the device has returned from reset.
+ */
+ if (!(dev->persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR)) {
+ end = msecs_to_jiffies(timeout) + jiffies;
+ while (comm_pending(dev) && time_before(jiffies, end))
+ cond_resched();
+ }
+ goto out;
+
+out_reset:
+ err = mlx4_status_to_errno(CMD_STAT_INTERNAL_ERR);
+ mlx4_enter_error_state(dev->persist);
+out:
+ spin_lock(&cmd->context_lock);
+ context->next = cmd->free_head;
+ cmd->free_head = context - cmd->context;
+ spin_unlock(&cmd->context_lock);
+
+ up(&cmd->event_sem);
+ return err;
+}
+
+#endif
+
+int mlx4_comm_cmd(struct mlx4_dev *dev, u8 cmd, u16 param,
+ u16 op, unsigned long timeout)
+{
+ if (dev->persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR)
+ return mlx4_status_to_errno(CMD_STAT_INTERNAL_ERR);
+
+ if (mlx4_priv(dev)->cmd.use_events)
+ {
+ assert(0);
+#ifdef KMOD_DISABLED
+ //return mlx4_comm_cmd_wait(dev, cmd, param, op, timeout);
+#endif
+ }
+ return mlx4_comm_cmd_poll(dev, cmd, param, timeout);
+}
+
+static int cmd_pending(struct mlx4_dev *dev)
+{
+ u32 status;
+#ifdef KMOD_DISABLED
+ //if (pci_channel_offline(dev->persist->pdev))
+ // return -EIO;
+#endif
+ status = readl(mlx4_priv(dev)->cmd.hcr + HCR_STATUS_OFFSET);
+
+ return (status & swab32(1 << HCR_GO_BIT)) ||
+ (mlx4_priv(dev)->cmd.toggle ==
+ !!(status & swab32(1 << HCR_T_BIT)));
+}
+
+static int mlx4_cmd_post(struct mlx4_dev *dev, u64 in_param, u64 out_param,
+ u32 in_modifier, u8 op_modifier, u16 op, u16 token,
+ int event)
+{
+ struct mlx4_cmd *cmd = &mlx4_priv(dev)->cmd;
+ u32 __iomem *hcr = cmd->hcr;
+ int ret = -EIO;
+ unsigned long end;
+
+ mutex_lock(&dev->persist->device_state_mutex);
+ /* To avoid writing to unknown addresses after the device state was
+ * changed to internal error and the chip was reset,
+ * check the INTERNAL_ERROR flag which is updated under
+ * device_state_mutex lock.
+ */
+#ifdef KMOD_DISABLED
+ //if (pci_channel_offline(dev->persist->pdev) ||
+#endif
+ if (dev->persist->state &
+     MLX4_DEVICE_STATE_INTERNAL_ERROR) {
+ /*
+ * Device is going through error recovery
+ * and cannot accept commands.
+ */
+ goto out;
+ }
+
+ end = jiffies;
+ if (event)
+ end += msecs_to_jiffies(GO_BIT_TIMEOUT_MSECS);
+
+ while (cmd_pending(dev)) {
+#ifdef KMOD_DISABLED
+ //if (pci_channel_offline(dev->persist->pdev)) {
+ /*
+ * Device is going through error recovery
+ * and cannot accept commands.
+ */
+ // goto out;
+ //}
+#endif
+
+ if (time_after_eq(jiffies, end)) {
+ mlx4_err(dev, "%s:cmd_pending failed\n", __func__);
+ goto out;
+ }
+ cond_resched();
+ }
+
+ /*
+ * We use writel (instead of something like memcpy_toio)
+ * because writes of less than 32 bits to the HCR don't work
+ * (and some architectures such as ia64 implement memcpy_toio
+ * in terms of writeb).
+ */
+ __raw_writel((__force u32) cpu_to_be32(in_param >> 32), hcr + 0);
+ __raw_writel((__force u32) cpu_to_be32(in_param & 0xfffffffful), hcr + 1);
+ __raw_writel((__force u32) cpu_to_be32(in_modifier), hcr + 2);
+ __raw_writel((__force u32) cpu_to_be32(out_param >> 32), hcr + 3);
+ __raw_writel((__force u32) cpu_to_be32(out_param & 0xfffffffful), hcr + 4);
+ __raw_writel((__force u32) cpu_to_be32(token << 16), hcr + 5);
+
+ /* __raw_writel may not order writes. */
+ wmb();
+
+ __raw_writel((__force u32) cpu_to_be32((1 << HCR_GO_BIT) |
+ (cmd->toggle << HCR_T_BIT) |
+ (event ? (1 << HCR_E_BIT) : 0) |
+ (op_modifier << HCR_OPMOD_SHIFT) |
+ op), hcr + 6);
+
+ /*
+ * Make sure that our HCR writes don't get mixed in with
+ * writes from another CPU starting a FW command.
+ */
+ mmiowb();
+
+ cmd->toggle = cmd->toggle ^ 1;
+
+ ret = 0;
+
+out:
+ if (ret)
+ mlx4_warn(dev, "Could not post command 0x%x: ret=%d, in_param=0x%llx, in_mod=0x%x, op_mod=0x%x\n",
+ op, ret, in_param, in_modifier, op_modifier);
+ mutex_unlock(&dev->persist->device_state_mutex);
+
+ return ret;
+}
+
+static int mlx4_slave_cmd(struct mlx4_dev *dev, u64 in_param, u64 *out_param,
+ int out_is_imm, u32 in_modifier, u8 op_modifier,
+ u16 op, unsigned long timeout)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_vhcr_cmd *vhcr = priv->mfunc.vhcr;
+ int ret;
+
+ mutex_lock(&priv->cmd.slave_cmd_mutex);
+
+ vhcr->in_param = cpu_to_be64(in_param);
+ vhcr->out_param = out_param ? cpu_to_be64(*out_param) : 0;
+ vhcr->in_modifier = cpu_to_be32(in_modifier);
+ vhcr->opcode = cpu_to_be16((((u16) op_modifier) << 12) | (op & 0xfff));
+ vhcr->token = cpu_to_be16(CMD_POLL_TOKEN);
+ vhcr->status = 0;
+ vhcr->flags = !!(priv->cmd.use_events) << 6;
+
+ if (mlx4_is_master(dev)) {
+ ret = mlx4_master_process_vhcr(dev, dev->caps.function, vhcr);
+ if (!ret) {
+ if (out_is_imm) {
+ if (out_param)
+ *out_param =
+ be64_to_cpu(vhcr->out_param);
+ else {
+ mlx4_err(dev, "response expected while output mailbox is NULL for command 0x%x\n",
+ op);
+ vhcr->status = CMD_STAT_BAD_PARAM;
+ }
+ }
+ ret = mlx4_status_to_errno(vhcr->status);
+ }
+ if (ret &&
+ dev->persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR)
+ ret = mlx4_internal_err_ret_value(dev, op, op_modifier);
+ } else {
+ ret = mlx4_comm_cmd(dev, MLX4_COMM_CMD_VHCR_POST, 0, op,
+ MLX4_COMM_TIME + timeout);
+ if (!ret) {
+ if (out_is_imm) {
+ if (out_param)
+ *out_param =
+ be64_to_cpu(vhcr->out_param);
+ else {
+ mlx4_err(dev, "response expected while output mailbox is NULL for command 0x%x\n",
+ op);
+ vhcr->status = CMD_STAT_BAD_PARAM;
+ }
+ }
+ ret = mlx4_status_to_errno(vhcr->status);
+ } else {
+ if (dev->persist->state &
+ MLX4_DEVICE_STATE_INTERNAL_ERROR)
+ ret = mlx4_internal_err_ret_value(dev, op,
+ op_modifier);
+ else
+ mlx4_err(dev, "failed execution of VHCR_POST command opcode 0x%x\n", op);
+ }
+ }
+
+ mutex_unlock(&priv->cmd.slave_cmd_mutex);
+ return ret;
+}
+
+static int mlx4_cmd_poll(struct mlx4_dev *dev, u64 in_param, u64 *out_param,
+ int out_is_imm, u32 in_modifier, u8 op_modifier,
+ u16 op, unsigned long timeout)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ void __iomem *hcr = priv->cmd.hcr;
+ int err = 0;
+ unsigned long end;
+ u32 stat;
+
+ down(&priv->cmd.poll_sem);
+
+ if (dev->persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR) {
+ /*
+ * Device is going through error recovery
+ * and cannot accept commands.
+ */
+ err = mlx4_internal_err_ret_value(dev, op, op_modifier);
+ goto out;
+ }
+
+ if (out_is_imm && !out_param) {
+ mlx4_err(dev, "response expected while output mailbox is NULL for command 0x%x\n",
+ op);
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = mlx4_cmd_post(dev, in_param, out_param ? *out_param : 0,
+ in_modifier, op_modifier, op, CMD_POLL_TOKEN, 0);
+ if (err)
+ goto out_reset;
+
+ end = msecs_to_jiffies(timeout) + jiffies;
+ while (cmd_pending(dev) && time_before(jiffies, end)) {
+#ifdef KMOD_DISABLED
+ if (pci_channel_offline(dev->persist->pdev)) {
+ /*
+ * Device is going through error recovery
+ * and cannot accept commands.
+ */
+ err = -EIO;
+ goto out_reset;
+ }
+#endif
+
+ if (dev->persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR) {
+ err = mlx4_internal_err_ret_value(dev, op, op_modifier);
+ goto out;
+ }
+
+ cond_resched();
+ }
+
+ if (cmd_pending(dev)) {
+ mlx4_warn(dev, "command 0x%x timed out (go bit not cleared)\n",
+ op);
+ err = -EIO;
+ goto out_reset;
+ }
+
+ if (out_is_imm)
+ *out_param =
+ (u64) be32_to_cpu((__force __be32)
+ __raw_readl(hcr + HCR_OUT_PARAM_OFFSET)) << 32 |
+ (u64) be32_to_cpu((__force __be32)
+ __raw_readl(hcr + HCR_OUT_PARAM_OFFSET + 4));
+ stat = be32_to_cpu((__force __be32)
+ __raw_readl(hcr + HCR_STATUS_OFFSET)) >> 24;
+ err = mlx4_status_to_errno(stat);
+ if (err) {
+ mlx4_err(dev, "command 0x%x failed: fw status = 0x%x\n",
+ op, stat);
+ if (mlx4_closing_cmd_fatal_error(op, stat))
+ goto out_reset;
+ goto out;
+ }
+
+out_reset:
+ if (err)
+ err = mlx4_cmd_reset_flow(dev, op, op_modifier, err);
+out:
+ up(&priv->cmd.poll_sem);
+ return err;
+}
+
+void mlx4_cmd_event(struct mlx4_dev *dev, u16 token, u8 status, u64 out_param)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_cmd_context *context =
+ &priv->cmd.context[token & priv->cmd.token_mask];
+
+ /* previously timed out command completing at long last */
+ if (token != context->token)
+ return;
+
+ context->fw_status = status;
+ context->result = mlx4_status_to_errno(status);
+ context->out_param = out_param;
+ complete(&context->done);
+}
+
+static int mlx4_cmd_wait(struct mlx4_dev *dev, u64 in_param, u64 *out_param,
+ int out_is_imm, u32 in_modifier, u8 op_modifier,
+ u16 op, unsigned long timeout)
+{
+ struct mlx4_cmd *cmd = &mlx4_priv(dev)->cmd;
+ struct mlx4_cmd_context *context;
+ int err = 0;
+
+ down(&cmd->event_sem);
+
+ spin_lock(&cmd->context_lock);
+ BUG_ON(cmd->free_head < 0);
+ context = &cmd->context[cmd->free_head];
+ context->token += cmd->token_mask + 1;
+ cmd->free_head = context->next;
+ spin_unlock(&cmd->context_lock);
+
+ if (out_is_imm && !out_param) {
+ mlx4_err(dev, "response expected while output mailbox is NULL for command 0x%x\n",
+ op);
+ err = -EINVAL;
+ goto out;
+ }
+
+ reinit_completion(&context->done);
+
+ err = mlx4_cmd_post(dev, in_param, out_param ? *out_param : 0,
+ in_modifier, op_modifier, op, context->token, 1);
+ if (err)
+ goto out_reset;
+
+ if (!wait_for_completion_timeout(&context->done,
+ msecs_to_jiffies(timeout))) {
+ mlx4_warn(dev, "command 0x%x timed out (go bit not cleared)\n",
+ op);
+ err = -EIO;
+ goto out_reset;
+ }
+
+ err = context->result;
+ if (err) {
+ /* To avoid always printing this error message at driver
+ * start when there are ConnectX2 HCAs on the host, demote
+ * the message for this specific
+ * command/input_mod/opcode_mod/fw-status to debug level.
+ */
+ if (op == MLX4_CMD_SET_PORT &&
+ (in_modifier == 1 || in_modifier == 2) &&
+ op_modifier == MLX4_SET_PORT_IB_OPCODE &&
+ context->fw_status == CMD_STAT_BAD_SIZE)
+ mlx4_dbg(dev, "command 0x%x failed: fw status = 0x%x\n",
+ op, context->fw_status);
+ else
+ mlx4_err(dev, "command 0x%x failed: fw status = 0x%x\n",
+ op, context->fw_status);
+ if (dev->persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR)
+ err = mlx4_internal_err_ret_value(dev, op, op_modifier);
+ else if (mlx4_closing_cmd_fatal_error(op, context->fw_status))
+ goto out_reset;
+
+ goto out;
+ }
+
+ if (out_is_imm)
+ *out_param = context->out_param;
+
+out_reset:
+ if (err)
+ err = mlx4_cmd_reset_flow(dev, op, op_modifier, err);
+out:
+ spin_lock(&cmd->context_lock);
+ context->next = cmd->free_head;
+ cmd->free_head = context - cmd->context;
+ spin_unlock(&cmd->context_lock);
+
+ up(&cmd->event_sem);
+ return err;
+}
+
+int __mlx4_cmd(struct mlx4_dev *dev, u64 in_param, u64 *out_param,
+ int out_is_imm, u32 in_modifier, u8 op_modifier,
+ u16 op, unsigned long timeout, int native)
+{
+#ifdef KMOD_DISABLED
+ if (pci_channel_offline(dev->persist->pdev))
+ return mlx4_cmd_reset_flow(dev, op, op_modifier, -EIO);
+#endif
+
+ if (!mlx4_is_mfunc(dev) || (native && mlx4_is_master(dev))) {
+ int ret = -ENOSYS; /* returned if assert() is compiled out in event mode */
+
+ if (dev->persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR)
+ return mlx4_internal_err_ret_value(dev, op,
+ op_modifier);
+ down_read(&mlx4_priv(dev)->cmd.switch_sem);
+ if (mlx4_priv(dev)->cmd.use_events)
+ {
+#ifdef KMOD_DISABLED
+ ret = mlx4_cmd_wait(dev, in_param, out_param,
+ out_is_imm, in_modifier,
+ op_modifier, op, timeout);
+#else
+ assert(0);
+#endif
+ }
+ else
+ ret = mlx4_cmd_poll(dev, in_param, out_param,
+ out_is_imm, in_modifier,
+ op_modifier, op, timeout);
+
+ up_read(&mlx4_priv(dev)->cmd.switch_sem);
+ return ret;
+ }
+ return mlx4_slave_cmd(dev, in_param, out_param, out_is_imm,
+ in_modifier, op_modifier, op, timeout);
+}
+EXPORT_SYMBOL_GPL(__mlx4_cmd);
+
+
+int mlx4_ARM_COMM_CHANNEL(struct mlx4_dev *dev)
+{
+ return mlx4_cmd(dev, 0, 0, 0, MLX4_CMD_ARM_COMM_CHANNEL,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_NATIVE);
+}
+
+static int mlx4_ACCESS_MEM(struct mlx4_dev *dev, u64 master_addr,
+ int slave, u64 slave_addr,
+ int size, int is_read)
+{
+ u64 in_param;
+ u64 out_param;
+
+ if ((slave_addr & 0xfff) | (master_addr & 0xfff) |
+ (slave & ~0x7f) | (size & 0xff)) {
+ mlx4_err(dev, "Bad access mem params - slave_addr:0x%llx master_addr:0x%llx slave_id:%d size:%d\n",
+ slave_addr, master_addr, slave, size);
+ return -EINVAL;
+ }
+
+ if (is_read) {
+ in_param = (u64) slave | slave_addr;
+ out_param = (u64) dev->caps.function | master_addr;
+ } else {
+ in_param = (u64) dev->caps.function | master_addr;
+ out_param = (u64) slave | slave_addr;
+ }
+
+ return mlx4_cmd_imm(dev, in_param, &out_param, size, 0,
+ MLX4_CMD_ACCESS_MEM,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+}
+
+static int query_pkey_block(struct mlx4_dev *dev, u8 port, u16 index, u16 *pkey,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox)
+{
+ struct ib_smp *in_mad = (struct ib_smp *)(inbox->buf);
+ struct ib_smp *out_mad = (struct ib_smp *)(outbox->buf);
+ int err;
+ int i;
+
+ if (index & 0x1f)
+ return -EINVAL;
+
+ in_mad->attr_mod = cpu_to_be32(index / 32);
+
+ err = mlx4_cmd_box(dev, inbox->dma, outbox->dma, port, 3,
+ MLX4_CMD_MAD_IFC, MLX4_CMD_TIME_CLASS_C,
+ MLX4_CMD_NATIVE);
+ if (err)
+ return err;
+
+ for (i = 0; i < 32; ++i)
+ pkey[i] = be16_to_cpu(((__be16 *) out_mad->data)[i]);
+
+ return err;
+}
+
+static int get_full_pkey_table(struct mlx4_dev *dev, u8 port, u16 *table,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox)
+{
+ int i;
+ int err;
+
+ for (i = 0; i < dev->caps.pkey_table_len[port]; i += 32) {
+ err = query_pkey_block(dev, port, i, table + i, inbox, outbox);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+#define PORT_CAPABILITY_LOCATION_IN_SMP 20
+#define PORT_STATE_OFFSET 32
+
+static enum ib_port_state vf_port_state(struct mlx4_dev *dev, int port, int vf)
+{
+ if (mlx4_get_slave_port_state(dev, vf, port) == SLAVE_PORT_UP)
+ return IB_PORT_ACTIVE;
+ else
+ return IB_PORT_DOWN;
+}
+
+static int mlx4_MAD_IFC_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ struct ib_smp *smp = inbox->buf;
+ u32 index;
+ u8 port;
+ u8 opcode_modifier;
+ u16 *table;
+ int err;
+ int vidx, pidx;
+ int network_view;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct ib_smp *outsmp = outbox->buf;
+ __be16 *outtab = (__be16 *)(outsmp->data);
+ __be32 slave_cap_mask;
+ __be64 slave_node_guid;
+
+ port = vhcr->in_modifier;
+
+ /* network-view bit is for driver use only, and should not be passed to FW */
+ opcode_modifier = vhcr->op_modifier & ~0x8; /* clear netw view bit */
+ network_view = !!(vhcr->op_modifier & 0x8);
+
+ if (smp->base_version == 1 &&
+ smp->mgmt_class == IB_MGMT_CLASS_SUBN_LID_ROUTED &&
+ smp->class_version == 1) {
+ /* host view is paravirtualized */
+ if (!network_view && smp->method == IB_MGMT_METHOD_GET) {
+ if (smp->attr_id == IB_SMP_ATTR_PKEY_TABLE) {
+ index = be32_to_cpu(smp->attr_mod);
+ if (port < 1 || port > dev->caps.num_ports)
+ return -EINVAL;
+ table = kcalloc((dev->caps.pkey_table_len[port] / 32) + 1,
+ sizeof(*table) * 32, GFP_KERNEL);
+
+ if (!table)
+ return -ENOMEM;
+ /* need to get the full pkey table because the paravirtualized
+ * pkeys may be scattered among several pkey blocks.
+ */
+ err = get_full_pkey_table(dev, port, table, inbox, outbox);
+ if (!err) {
+ for (vidx = index * 32; vidx < (index + 1) * 32; ++vidx) {
+ pidx = priv->virt2phys_pkey[slave][port - 1][vidx];
+ outtab[vidx % 32] = cpu_to_be16(table[pidx]);
+ }
+ }
+ kfree(table);
+ return err;
+ }
+ if (smp->attr_id == IB_SMP_ATTR_PORT_INFO) {
+ /* get the slave-specific caps: */
+ /* do the command */
+ err = mlx4_cmd_box(dev, inbox->dma, outbox->dma,
+ vhcr->in_modifier, opcode_modifier,
+ vhcr->op, MLX4_CMD_TIME_CLASS_C, MLX4_CMD_NATIVE);
+ /* modify the response for slaves */
+ if (!err && slave != mlx4_master_func_num(dev)) {
+ u8 *state = outsmp->data + PORT_STATE_OFFSET;
+
+ *state = (*state & 0xf0) | vf_port_state(dev, port, slave);
+ slave_cap_mask = priv->mfunc.master.slave_state[slave].ib_cap_mask[port];
+ memcpy(outsmp->data + PORT_CAPABILITY_LOCATION_IN_SMP, &slave_cap_mask, 4);
+ }
+ return err;
+ }
+ if (smp->attr_id == IB_SMP_ATTR_GUID_INFO) {
+ __be64 guid = mlx4_get_admin_guid(dev, slave,
+ port);
+
+ /* set the PF admin guid to the FW/HW burned
+ * GUID, if it wasn't yet set
+ */
+ if (slave == 0 && guid == 0) {
+ smp->attr_mod = 0;
+ err = mlx4_cmd_box(dev,
+ inbox->dma,
+ outbox->dma,
+ vhcr->in_modifier,
+ opcode_modifier,
+ vhcr->op,
+ MLX4_CMD_TIME_CLASS_C,
+ MLX4_CMD_NATIVE);
+ if (err)
+ return err;
+ mlx4_set_admin_guid(dev,
+ *(__be64 *)outsmp->
+ data, slave, port);
+ } else {
+ memcpy(outsmp->data, &guid, 8);
+ }
+
+ /* clean all other gids */
+ memset(outsmp->data + 8, 0, 56);
+ return 0;
+ }
+ if (smp->attr_id == IB_SMP_ATTR_NODE_INFO) {
+ err = mlx4_cmd_box(dev, inbox->dma, outbox->dma,
+ vhcr->in_modifier, opcode_modifier,
+ vhcr->op, MLX4_CMD_TIME_CLASS_C, MLX4_CMD_NATIVE);
+ if (!err) {
+ slave_node_guid = mlx4_get_slave_node_guid(dev, slave);
+ memcpy(outsmp->data + 12, &slave_node_guid, 8);
+ }
+ return err;
+ }
+ }
+ }
+
+ /* Non-privileged VFs are only allowed "host" view LID-routed 'Get' MADs.
+ * These are the MADs used by ib verbs (such as ib_query_gids).
+ */
+ if (slave != mlx4_master_func_num(dev) &&
+ !mlx4_vf_smi_enabled(dev, slave, port)) {
+ if (!(smp->mgmt_class == IB_MGMT_CLASS_SUBN_LID_ROUTED &&
+ smp->method == IB_MGMT_METHOD_GET) || network_view) {
+ mlx4_err(dev, "Unprivileged slave %d is trying to execute a Subnet MGMT MAD, class 0x%x, method 0x%x, view=%s for attr 0x%x. Rejecting\n",
+ slave, smp->mgmt_class, smp->method,
+ network_view ? "Network" : "Host",
+ be16_to_cpu(smp->attr_id));
+ return -EPERM;
+ }
+ }
+
+ return mlx4_cmd_box(dev, inbox->dma, outbox->dma,
+ vhcr->in_modifier, opcode_modifier,
+ vhcr->op, MLX4_CMD_TIME_CLASS_C, MLX4_CMD_NATIVE);
+}
+
+static int mlx4_CMD_EPERM_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ return -EPERM;
+}
+
+int mlx4_DMA_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ u64 in_param;
+ u64 out_param;
+ int err;
+
+ in_param = cmd->has_inbox ? (u64) inbox->dma : vhcr->in_param;
+ out_param = cmd->has_outbox ? (u64) outbox->dma : vhcr->out_param;
+ if (cmd->encode_slave_id) {
+ in_param &= 0xffffffffffffff00ll;
+ in_param |= slave;
+ }
+
+ err = __mlx4_cmd(dev, in_param, &out_param, cmd->out_is_imm,
+ vhcr->in_modifier, vhcr->op_modifier, vhcr->op,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+
+ if (cmd->out_is_imm)
+ vhcr->out_param = out_param;
+
+ return err;
+}
+
+static struct mlx4_cmd_info cmd_info[] = {
+ {
+ .opcode = MLX4_CMD_QUERY_FW,
+ .has_inbox = false,
+ .has_outbox = true,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_QUERY_FW_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_QUERY_HCA,
+ .has_inbox = false,
+ .has_outbox = true,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = NULL
+ },
+ {
+ .opcode = MLX4_CMD_QUERY_DEV_CAP,
+ .has_inbox = false,
+ .has_outbox = true,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_QUERY_DEV_CAP_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_QUERY_FUNC_CAP,
+ .has_inbox = false,
+ .has_outbox = true,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_QUERY_FUNC_CAP_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_QUERY_ADAPTER,
+ .has_inbox = false,
+ .has_outbox = true,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = NULL
+ },
+ {
+ .opcode = MLX4_CMD_INIT_PORT,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_INIT_PORT_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_CLOSE_PORT,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_CLOSE_PORT_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_QUERY_PORT,
+ .has_inbox = false,
+ .has_outbox = true,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_QUERY_PORT_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_SET_PORT,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_SET_PORT_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_MAP_EQ,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_MAP_EQ_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_SW2HW_EQ,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = true,
+ .verify = NULL,
+ .wrapper = mlx4_SW2HW_EQ_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_HW_HEALTH_CHECK,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = NULL
+ },
+ {
+ .opcode = MLX4_CMD_NOP,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = NULL
+ },
+ {
+ .opcode = MLX4_CMD_CONFIG_DEV,
+ .has_inbox = false,
+ .has_outbox = true,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_CONFIG_DEV_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_ALLOC_RES,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = true,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_ALLOC_RES_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_FREE_RES,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_FREE_RES_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_SW2HW_MPT,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = true,
+ .verify = NULL,
+ .wrapper = mlx4_SW2HW_MPT_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_QUERY_MPT,
+ .has_inbox = false,
+ .has_outbox = true,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_QUERY_MPT_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_HW2SW_MPT,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_HW2SW_MPT_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_READ_MTT,
+ .has_inbox = false,
+ .has_outbox = true,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = NULL
+ },
+ {
+ .opcode = MLX4_CMD_WRITE_MTT,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_WRITE_MTT_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_SYNC_TPT,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = NULL
+ },
+ {
+ .opcode = MLX4_CMD_HW2SW_EQ,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = true,
+ .verify = NULL,
+ .wrapper = mlx4_HW2SW_EQ_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_QUERY_EQ,
+ .has_inbox = false,
+ .has_outbox = true,
+ .out_is_imm = false,
+ .encode_slave_id = true,
+ .verify = NULL,
+ .wrapper = mlx4_QUERY_EQ_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_SW2HW_CQ,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = true,
+ .verify = NULL,
+ .wrapper = mlx4_SW2HW_CQ_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_HW2SW_CQ,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_HW2SW_CQ_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_QUERY_CQ,
+ .has_inbox = false,
+ .has_outbox = true,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_QUERY_CQ_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_MODIFY_CQ,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = true,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_MODIFY_CQ_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_SW2HW_SRQ,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = true,
+ .verify = NULL,
+ .wrapper = mlx4_SW2HW_SRQ_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_HW2SW_SRQ,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_HW2SW_SRQ_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_QUERY_SRQ,
+ .has_inbox = false,
+ .has_outbox = true,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_QUERY_SRQ_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_ARM_SRQ,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_ARM_SRQ_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_RST2INIT_QP,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = true,
+ .verify = NULL,
+ .wrapper = mlx4_RST2INIT_QP_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_INIT2INIT_QP,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_INIT2INIT_QP_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_INIT2RTR_QP,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_INIT2RTR_QP_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_RTR2RTS_QP,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_RTR2RTS_QP_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_RTS2RTS_QP,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_RTS2RTS_QP_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_SQERR2RTS_QP,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_SQERR2RTS_QP_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_2ERR_QP,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_GEN_QP_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_RTS2SQD_QP,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_GEN_QP_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_SQD2SQD_QP,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_SQD2SQD_QP_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_SQD2RTS_QP,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_SQD2RTS_QP_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_2RST_QP,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_2RST_QP_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_QUERY_QP,
+ .has_inbox = false,
+ .has_outbox = true,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_GEN_QP_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_SUSPEND_QP,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_GEN_QP_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_UNSUSPEND_QP,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_GEN_QP_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_UPDATE_QP,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_UPDATE_QP_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_GET_OP_REQ,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_CMD_EPERM_wrapper,
+ },
+ {
+ .opcode = MLX4_CMD_ALLOCATE_VPP,
+ .has_inbox = false,
+ .has_outbox = true,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_CMD_EPERM_wrapper,
+ },
+ {
+ .opcode = MLX4_CMD_SET_VPORT_QOS,
+ .has_inbox = false,
+ .has_outbox = true,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_CMD_EPERM_wrapper,
+ },
+ {
+ .opcode = MLX4_CMD_CONF_SPECIAL_QP,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL, /* XXX verify: only demux can do this */
+ .wrapper = NULL
+ },
+ {
+ .opcode = MLX4_CMD_MAD_IFC,
+ .has_inbox = true,
+ .has_outbox = true,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_MAD_IFC_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_MAD_DEMUX,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_CMD_EPERM_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_QUERY_IF_STAT,
+ .has_inbox = false,
+ .has_outbox = true,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_QUERY_IF_STAT_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_ACCESS_REG,
+ .has_inbox = true,
+ .has_outbox = true,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_ACCESS_REG_wrapper,
+ },
+ {
+ .opcode = MLX4_CMD_CONGESTION_CTRL_OPCODE,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_CMD_EPERM_wrapper,
+ },
+ /* Native multicast commands are not available for guests */
+ {
+ .opcode = MLX4_CMD_QP_ATTACH,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_QP_ATTACH_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_PROMISC,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_PROMISC_wrapper
+ },
+ /* Ethernet specific commands */
+ {
+ .opcode = MLX4_CMD_SET_VLAN_FLTR,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_SET_VLAN_FLTR_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_SET_MCAST_FLTR,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_SET_MCAST_FLTR_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_DUMP_ETH_STATS,
+ .has_inbox = false,
+ .has_outbox = true,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_DUMP_ETH_STATS_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_INFORM_FLR_DONE,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = NULL
+ },
+ /* flow steering commands */
+ {
+ .opcode = MLX4_QP_FLOW_STEERING_ATTACH,
+ .has_inbox = true,
+ .has_outbox = false,
+ .out_is_imm = true,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_QP_FLOW_STEERING_ATTACH_wrapper
+ },
+ {
+ .opcode = MLX4_QP_FLOW_STEERING_DETACH,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_QP_FLOW_STEERING_DETACH_wrapper
+ },
+ {
+ .opcode = MLX4_FLOW_STEERING_IB_UC_QP_RANGE,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_CMD_EPERM_wrapper
+ },
+ {
+ .opcode = MLX4_CMD_VIRT_PORT_MAP,
+ .has_inbox = false,
+ .has_outbox = false,
+ .out_is_imm = false,
+ .encode_slave_id = false,
+ .verify = NULL,
+ .wrapper = mlx4_CMD_EPERM_wrapper
+ },
+};
+
+static int mlx4_master_process_vhcr(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr_cmd *in_vhcr)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_cmd_info *cmd = NULL;
+ struct mlx4_vhcr_cmd *vhcr_cmd = in_vhcr ? in_vhcr : priv->mfunc.vhcr;
+ struct mlx4_vhcr *vhcr;
+ struct mlx4_cmd_mailbox *inbox = NULL;
+ struct mlx4_cmd_mailbox *outbox = NULL;
+ u64 in_param;
+ u64 out_param;
+ int ret = 0;
+ int i;
+ int err = 0;
+
+ /* Create sw representation of Virtual HCR */
+ vhcr = kzalloc(sizeof(struct mlx4_vhcr), GFP_KERNEL);
+ if (!vhcr)
+ return -ENOMEM;
+
+ /* DMA in the vHCR */
+ if (!in_vhcr) {
+ ret = mlx4_ACCESS_MEM(dev, priv->mfunc.vhcr_dma, slave,
+ priv->mfunc.master.slave_state[slave].vhcr_dma,
+ ALIGN(sizeof(struct mlx4_vhcr_cmd),
+ MLX4_ACCESS_MEM_ALIGN), 1);
+ if (ret) {
+ if (!(dev->persist->state &
+ MLX4_DEVICE_STATE_INTERNAL_ERROR))
+ mlx4_err(dev, "%s: Failed reading vhcr ret: 0x%x\n",
+ __func__, ret);
+ kfree(vhcr);
+ return ret;
+ }
+ }
+
+ /* Fill SW VHCR fields */
+ vhcr->in_param = be64_to_cpu(vhcr_cmd->in_param);
+ vhcr->out_param = be64_to_cpu(vhcr_cmd->out_param);
+ vhcr->in_modifier = be32_to_cpu(vhcr_cmd->in_modifier);
+ vhcr->token = be16_to_cpu(vhcr_cmd->token);
+ vhcr->op = be16_to_cpu(vhcr_cmd->opcode) & 0xfff;
+ vhcr->op_modifier = (u8) (be16_to_cpu(vhcr_cmd->opcode) >> 12);
+ vhcr->e_bit = vhcr_cmd->flags & (1 << 6);
+
+ /* Lookup command */
+ for (i = 0; i < ARRAY_SIZE(cmd_info); ++i) {
+ if (vhcr->op == cmd_info[i].opcode) {
+ cmd = &cmd_info[i];
+ break;
+ }
+ }
+ if (!cmd) {
+ mlx4_err(dev, "Unknown command:0x%x accepted from slave:%d\n",
+ vhcr->op, slave);
+ vhcr_cmd->status = CMD_STAT_BAD_PARAM;
+ goto out_status;
+ }
+
+ /* Read inbox */
+ if (cmd->has_inbox) {
+ vhcr->in_param &= INBOX_MASK;
+ inbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(inbox)) {
+ vhcr_cmd->status = CMD_STAT_BAD_SIZE;
+ inbox = NULL;
+ goto out_status;
+ }
+
+ ret = mlx4_ACCESS_MEM(dev, inbox->dma, slave,
+ vhcr->in_param,
+ MLX4_MAILBOX_SIZE, 1);
+ if (ret) {
+ if (!(dev->persist->state &
+ MLX4_DEVICE_STATE_INTERNAL_ERROR))
+ mlx4_err(dev, "%s: Failed reading inbox (cmd:0x%x)\n",
+ __func__, cmd->opcode);
+ vhcr_cmd->status = CMD_STAT_INTERNAL_ERR;
+ goto out_status;
+ }
+ }
+
+ /* Apply permission and bound checks if applicable */
+ if (cmd->verify && cmd->verify(dev, slave, vhcr, inbox)) {
+ mlx4_warn(dev, "Command:0x%x from slave: %d failed protection checks for resource_id:%d\n",
+ vhcr->op, slave, vhcr->in_modifier);
+ vhcr_cmd->status = CMD_STAT_BAD_OP;
+ goto out_status;
+ }
+
+ /* Allocate outbox */
+ if (cmd->has_outbox) {
+ outbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(outbox)) {
+ vhcr_cmd->status = CMD_STAT_BAD_SIZE;
+ outbox = NULL;
+ goto out_status;
+ }
+ }
+
+ /* Execute the command! */
+ if (cmd->wrapper) {
+ err = cmd->wrapper(dev, slave, vhcr, inbox, outbox,
+ cmd);
+ if (cmd->out_is_imm)
+ vhcr_cmd->out_param = cpu_to_be64(vhcr->out_param);
+ } else {
+ in_param = cmd->has_inbox ? (u64) inbox->dma :
+ vhcr->in_param;
+ out_param = cmd->has_outbox ? (u64) outbox->dma :
+ vhcr->out_param;
+ err = __mlx4_cmd(dev, in_param, &out_param,
+ cmd->out_is_imm, vhcr->in_modifier,
+ vhcr->op_modifier, vhcr->op,
+ MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+
+ if (cmd->out_is_imm) {
+ vhcr->out_param = out_param;
+ vhcr_cmd->out_param = cpu_to_be64(vhcr->out_param);
+ }
+ }
+
+ if (err) {
+ if (!(dev->persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR))
+ mlx4_warn(dev, "vhcr command 0x%x slave:%d in_param 0x%llx in_mod=0x%x op_mod=0x%x failed with error:%d, status %d\n",
+ vhcr->op, slave, vhcr->in_param,
+ vhcr->in_modifier, vhcr->op_modifier,
+ vhcr->err_no, err);
+ vhcr_cmd->status = mlx4_errno_to_status(err);
+ goto out_status;
+ }
+
+
+ /* Write outbox if command completed successfully */
+ if (cmd->has_outbox && !vhcr_cmd->status) {
+ ret = mlx4_ACCESS_MEM(dev, outbox->dma, slave,
+ vhcr->out_param,
+ MLX4_MAILBOX_SIZE, MLX4_CMD_WRAPPED);
+ if (ret) {
+ /* If we failed to write back the outbox after the
+ * command was successfully executed, we must fail this
+ * slave, as it is now in an undefined state */
+ if (!(dev->persist->state &
+ MLX4_DEVICE_STATE_INTERNAL_ERROR))
+ mlx4_err(dev, "%s:Failed writing outbox\n", __func__);
+ goto out;
+ }
+ }
+
+out_status:
+ /* DMA back vhcr result */
+ if (!in_vhcr) {
+ ret = mlx4_ACCESS_MEM(dev, priv->mfunc.vhcr_dma, slave,
+ priv->mfunc.master.slave_state[slave].vhcr_dma,
+ ALIGN(sizeof(struct mlx4_vhcr),
+ MLX4_ACCESS_MEM_ALIGN),
+ MLX4_CMD_WRAPPED);
+ if (ret)
+ mlx4_err(dev, "%s:Failed writing vhcr result\n",
+ __func__);
+ else if (vhcr->e_bit &&
+ mlx4_GEN_EQE(dev, slave, &priv->mfunc.master.cmd_eqe))
+ mlx4_warn(dev, "Failed to generate command completion eqe for slave %d\n",
+ slave);
+ }
+
+out:
+ kfree(vhcr);
+ mlx4_free_cmd_mailbox(dev, inbox);
+ mlx4_free_cmd_mailbox(dev, outbox);
+ return ret;
+}
+
+static int mlx4_master_immediate_activate_vlan_qos(struct mlx4_priv *priv,
+ int slave, int port)
+{
+ struct mlx4_vport_oper_state *vp_oper;
+ struct mlx4_vport_state *vp_admin;
+ struct mlx4_vf_immed_vlan_work *work;
+ struct mlx4_dev *dev = &(priv->dev);
+ int err;
+ int admin_vlan_ix = NO_INDX;
+
+ vp_oper = &priv->mfunc.master.vf_oper[slave].vport[port];
+ vp_admin = &priv->mfunc.master.vf_admin[slave].vport[port];
+
+ if (vp_oper->state.default_vlan == vp_admin->default_vlan &&
+ vp_oper->state.default_qos == vp_admin->default_qos &&
+ vp_oper->state.link_state == vp_admin->link_state &&
+ vp_oper->state.qos_vport == vp_admin->qos_vport)
+ return 0;
+
+ if (!(priv->mfunc.master.slave_state[slave].active &&
+ dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_UPDATE_QP)) {
+ /* even if the UPDATE_QP command isn't supported, we still want
+ * to set this VF link according to the admin directive
+ */
+ vp_oper->state.link_state = vp_admin->link_state;
+ return -1;
+ }
+
+ mlx4_dbg(dev, "updating immediately admin params slave %d port %d\n",
+ slave, port);
+ mlx4_dbg(dev, "vlan %d QoS %d link down %d\n",
+ vp_admin->default_vlan, vp_admin->default_qos,
+ vp_admin->link_state);
+
+ work = kzalloc(sizeof(*work), GFP_KERNEL);
+ if (!work)
+ return -ENOMEM;
+
+ if (vp_oper->state.default_vlan != vp_admin->default_vlan) {
+ if (MLX4_VGT != vp_admin->default_vlan) {
+ err = __mlx4_register_vlan(&priv->dev, port,
+ vp_admin->default_vlan,
+ &admin_vlan_ix);
+ if (err) {
+ kfree(work);
+ mlx4_warn(&priv->dev,
+ "No vlan resources slave %d, port %d\n",
+ slave, port);
+ return err;
+ }
+ } else {
+ admin_vlan_ix = NO_INDX;
+ }
+ work->flags |= MLX4_VF_IMMED_VLAN_FLAG_VLAN;
+ mlx4_dbg(&priv->dev,
+ "alloc vlan %d idx %d slave %d port %d\n",
+ (int)(vp_admin->default_vlan),
+ admin_vlan_ix, slave, port);
+ }
+
+ /* save original vlan ix and vlan id */
+ work->orig_vlan_id = vp_oper->state.default_vlan;
+ work->orig_vlan_ix = vp_oper->vlan_idx;
+
+ /* handle new qos */
+ if (vp_oper->state.default_qos != vp_admin->default_qos)
+ work->flags |= MLX4_VF_IMMED_VLAN_FLAG_QOS;
+
+ if (work->flags & MLX4_VF_IMMED_VLAN_FLAG_VLAN)
+ vp_oper->vlan_idx = admin_vlan_ix;
+
+ vp_oper->state.default_vlan = vp_admin->default_vlan;
+ vp_oper->state.default_qos = vp_admin->default_qos;
+ vp_oper->state.link_state = vp_admin->link_state;
+ vp_oper->state.qos_vport = vp_admin->qos_vport;
+
+ if (vp_admin->link_state == IFLA_VF_LINK_STATE_DISABLE)
+ work->flags |= MLX4_VF_IMMED_VLAN_FLAG_LINK_DISABLE;
+
+ /* iterate over QPs owned by this slave, using UPDATE_QP */
+ work->port = port;
+ work->slave = slave;
+ work->qos = vp_oper->state.default_qos;
+ work->qos_vport = vp_oper->state.qos_vport;
+ work->vlan_id = vp_oper->state.default_vlan;
+ work->vlan_ix = vp_oper->vlan_idx;
+ work->priv = priv;
+
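+#ifdef KMOD_DISABLED
+ INIT_WORK(&work->work, mlx4_vf_immed_vlan_work_handler);
+ queue_work(priv->mfunc.master.comm_wq, &work->work);
+#else
+ /* no work queue in this build: the deferred UPDATE_QP pass is
+ * skipped, so free the work item here instead of leaking it
+ */
+ kfree(work);
+#endif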
+
+ return 0;
+}
+
+static void mlx4_set_default_port_qos(struct mlx4_dev *dev, int port)
+{
+ struct mlx4_qos_manager *port_qos_ctl;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ port_qos_ctl = &priv->mfunc.master.qos_ctl[port];
+ bitmap_zero(port_qos_ctl->priority_bm, MLX4_NUM_UP);
+
+ /* Enable only default prio at PF init routine */
+ set_bit(MLX4_DEFAULT_QOS_PRIO, port_qos_ctl->priority_bm);
+}
+
+static void mlx4_allocate_port_vpps(struct mlx4_dev *dev, int port)
+{
+ int i;
+ int err;
+ int num_vfs;
+ u16 availible_vpp;
+ u8 vpp_param[MLX4_NUM_UP];
+ struct mlx4_qos_manager *port_qos;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ err = mlx4_ALLOCATE_VPP_get(dev, port, &availible_vpp, vpp_param);
+ if (err) {
+ mlx4_info(dev, "Failed to query available VPPs\n");
+ return;
+ }
+
+ port_qos = &priv->mfunc.master.qos_ctl[port];
+ num_vfs = (availible_vpp /
+ bitmap_weight(port_qos->priority_bm, MLX4_NUM_UP));
+
+ for (i = 0; i < MLX4_NUM_UP; i++) {
+ if (test_bit(i, port_qos->priority_bm))
+ vpp_param[i] = num_vfs;
+ }
+
+ err = mlx4_ALLOCATE_VPP_set(dev, port, vpp_param);
+ if (err) {
+ mlx4_info(dev, "Failed allocating VPPs\n");
+ return;
+ }
+
+ /* Query actual allocated VPP, just to make sure */
+ err = mlx4_ALLOCATE_VPP_get(dev, port, &availible_vpp, vpp_param);
+ if (err) {
+ mlx4_info(dev, "Failed to query available VPPs\n");
+ return;
+ }
+
+ port_qos->num_of_qos_vfs = num_vfs;
+ mlx4_dbg(dev, "Port %d Available VPPs %d\n", port, availible_vpp);
+
+ for (i = 0; i < MLX4_NUM_UP; i++)
+ mlx4_dbg(dev, "Port %d UP %d Allocated %d VPPs\n", port, i,
+ vpp_param[i]);
+}
+
+static int mlx4_master_activate_admin_state(struct mlx4_priv *priv, int slave)
+{
+ int port, err;
+ struct mlx4_vport_state *vp_admin;
+ struct mlx4_vport_oper_state *vp_oper;
+ struct mlx4_active_ports actv_ports = mlx4_get_active_ports(
+ &priv->dev, slave);
+ int min_port = find_first_bit(actv_ports.ports,
+ priv->dev.caps.num_ports) + 1;
+ int max_port = min_port - 1 +
+ bitmap_weight(actv_ports.ports, priv->dev.caps.num_ports);
+
+ for (port = min_port; port <= max_port; port++) {
+ if (!test_bit(port - 1, actv_ports.ports))
+ continue;
+ priv->mfunc.master.vf_oper[slave].smi_enabled[port] =
+ priv->mfunc.master.vf_admin[slave].enable_smi[port];
+ vp_oper = &priv->mfunc.master.vf_oper[slave].vport[port];
+ vp_admin = &priv->mfunc.master.vf_admin[slave].vport[port];
+ vp_oper->state = *vp_admin;
+ if (MLX4_VGT != vp_admin->default_vlan) {
+ err = __mlx4_register_vlan(&priv->dev, port,
+ vp_admin->default_vlan, &(vp_oper->vlan_idx));
+ if (err) {
+ vp_oper->vlan_idx = NO_INDX;
+ mlx4_warn(&priv->dev,
+ "No vlan resources slave %d, port %d\n",
+ slave, port);
+ return err;
+ }
+ mlx4_dbg(&priv->dev, "alloc vlan %d idx %d slave %d port %d\n",
+ (int)(vp_oper->state.default_vlan),
+ vp_oper->vlan_idx, slave, port);
+ }
+ if (vp_admin->spoofchk) {
+ vp_oper->mac_idx = __mlx4_register_mac(&priv->dev,
+ port,
+ vp_admin->mac);
+ if (0 > vp_oper->mac_idx) {
+ err = vp_oper->mac_idx;
+ vp_oper->mac_idx = NO_INDX;
+ mlx4_warn(&priv->dev,
+ "No mac resources slave %d, port %d\n",
+ slave, port);
+ return err;
+ }
+ mlx4_dbg(&priv->dev, "alloc mac %llx idx %d slave %d port %d\n",
+ vp_oper->state.mac, vp_oper->mac_idx, slave, port);
+ }
+ }
+ return 0;
+}
+
+static void mlx4_master_deactivate_admin_state(struct mlx4_priv *priv, int slave)
+{
+ int port;
+ struct mlx4_vport_oper_state *vp_oper;
+ struct mlx4_active_ports actv_ports = mlx4_get_active_ports(
+ &priv->dev, slave);
+ int min_port = find_first_bit(actv_ports.ports,
+ priv->dev.caps.num_ports) + 1;
+ int max_port = min_port - 1 +
+ bitmap_weight(actv_ports.ports, priv->dev.caps.num_ports);
+
+
+ for (port = min_port; port <= max_port; port++) {
+ if (!test_bit(port - 1, actv_ports.ports))
+ continue;
+ priv->mfunc.master.vf_oper[slave].smi_enabled[port] =
+ MLX4_VF_SMI_DISABLED;
+ vp_oper = &priv->mfunc.master.vf_oper[slave].vport[port];
+ if (NO_INDX != vp_oper->vlan_idx) {
+ __mlx4_unregister_vlan(&priv->dev,
+ port, vp_oper->state.default_vlan);
+ vp_oper->vlan_idx = NO_INDX;
+ }
+ if (NO_INDX != vp_oper->mac_idx) {
+ __mlx4_unregister_mac(&priv->dev, port, vp_oper->state.mac);
+ vp_oper->mac_idx = NO_INDX;
+ }
+ }
+ return;
+}
+
+static void mlx4_master_do_cmd(struct mlx4_dev *dev, int slave, u8 cmd,
+ u16 param, u8 toggle)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_slave_state *slave_state = priv->mfunc.master.slave_state;
+ u32 reply;
+ u8 is_going_down = 0;
+ int i;
+ unsigned long flags;
+
+ slave_state[slave].comm_toggle ^= 1;
+ reply = (u32) slave_state[slave].comm_toggle << 31;
+ if (toggle != slave_state[slave].comm_toggle) {
+ mlx4_warn(dev, "Incorrect toggle %d from slave %d. *** MASTER STATE COMPROMISED ***\n",
+ toggle, slave);
+ goto reset_slave;
+ }
+ if (cmd == MLX4_COMM_CMD_RESET) {
+ mlx4_warn(dev, "Received reset from slave:%d\n", slave);
+ slave_state[slave].active = false;
+ slave_state[slave].old_vlan_api = false;
+ mlx4_master_deactivate_admin_state(priv, slave);
+ for (i = 0; i < MLX4_EVENT_TYPES_NUM; ++i) {
+ slave_state[slave].event_eq[i].eqn = -1;
+ slave_state[slave].event_eq[i].token = 0;
+ }
+ /* check if we are in the middle of the FLR process;
+ * if so, return "retry" status to the slave */
+ if (MLX4_COMM_CMD_FLR == slave_state[slave].last_cmd)
+ goto inform_slave_state;
+
+ mlx4_dispatch_event(dev, MLX4_DEV_EVENT_SLAVE_SHUTDOWN, slave);
+
+ /* write the version in the event field */
+ reply |= mlx4_comm_get_version();
+
+ goto reset_slave;
+ }
+ /* command from slave in the middle of FLR */
+ if (cmd != MLX4_COMM_CMD_RESET &&
+ MLX4_COMM_CMD_FLR == slave_state[slave].last_cmd) {
+ mlx4_warn(dev, "slave:%d is trying to run cmd(0x%x) in the middle of FLR\n",
+ slave, cmd);
+ return;
+ }
+
+ switch (cmd) {
+ case MLX4_COMM_CMD_VHCR0:
+ if (slave_state[slave].last_cmd != MLX4_COMM_CMD_RESET)
+ goto reset_slave;
+ slave_state[slave].vhcr_dma = ((u64) param) << 48;
+ priv->mfunc.master.slave_state[slave].cookie = 0;
+ break;
+ case MLX4_COMM_CMD_VHCR1:
+ if (slave_state[slave].last_cmd != MLX4_COMM_CMD_VHCR0)
+ goto reset_slave;
+ slave_state[slave].vhcr_dma |= ((u64) param) << 32;
+ break;
+ case MLX4_COMM_CMD_VHCR2:
+ if (slave_state[slave].last_cmd != MLX4_COMM_CMD_VHCR1)
+ goto reset_slave;
+ slave_state[slave].vhcr_dma |= ((u64) param) << 16;
+ break;
+ case MLX4_COMM_CMD_VHCR_EN:
+ if (slave_state[slave].last_cmd != MLX4_COMM_CMD_VHCR2)
+ goto reset_slave;
+ slave_state[slave].vhcr_dma |= param;
+ if (mlx4_master_activate_admin_state(priv, slave))
+ goto reset_slave;
+ slave_state[slave].active = true;
+ mlx4_dispatch_event(dev, MLX4_DEV_EVENT_SLAVE_INIT, slave);
+ break;
+ case MLX4_COMM_CMD_VHCR_POST:
+ if ((slave_state[slave].last_cmd != MLX4_COMM_CMD_VHCR_EN) &&
+ (slave_state[slave].last_cmd != MLX4_COMM_CMD_VHCR_POST)) {
+ mlx4_warn(dev, "slave:%d is out of sync, cmd=0x%x, last command=0x%x, reset is needed\n",
+ slave, cmd, slave_state[slave].last_cmd);
+ goto reset_slave;
+ }
+
+ mutex_lock(&priv->cmd.slave_cmd_mutex);
+ if (mlx4_master_process_vhcr(dev, slave, NULL)) {
+ mlx4_err(dev, "Failed processing vhcr for slave:%d, resetting slave\n",
+ slave);
+ mutex_unlock(&priv->cmd.slave_cmd_mutex);
+ goto reset_slave;
+ }
+ mutex_unlock(&priv->cmd.slave_cmd_mutex);
+ break;
+ default:
+ mlx4_warn(dev, "Bad comm cmd:%d from slave:%d\n", cmd, slave);
+ goto reset_slave;
+ }
+ spin_lock_irqsave(&priv->mfunc.master.slave_state_lock, flags);
+ if (!slave_state[slave].is_slave_going_down)
+ slave_state[slave].last_cmd = cmd;
+ else
+ is_going_down = 1;
+ spin_unlock_irqrestore(&priv->mfunc.master.slave_state_lock, flags);
+ if (is_going_down) {
+ mlx4_warn(dev, "Slave is going down, aborting command(%d) executing from slave:%d\n",
+ cmd, slave);
+ return;
+ }
+ __raw_writel((__force u32) cpu_to_be32(reply),
+ &priv->mfunc.comm[slave].slave_read);
+ mmiowb();
+
+ return;
+
+reset_slave:
+ /* cleanup any slave resources */
+ if (dev->persist->interface_state & MLX4_INTERFACE_STATE_UP)
+ mlx4_delete_all_resources_for_slave(dev, slave);
+
+ if (cmd != MLX4_COMM_CMD_RESET) {
+ mlx4_warn(dev, "Turn on internal error to force reset, slave=%d, cmd=0x%x\n",
+ slave, cmd);
+ /* Turn on internal error, letting the slave reset itself immediately;
+ * otherwise it might wait until the command timeout has passed
+ */
+ reply |= ((u32)COMM_CHAN_EVENT_INTERNAL_ERR);
+ }
+
+ spin_lock_irqsave(&priv->mfunc.master.slave_state_lock, flags);
+ if (!slave_state[slave].is_slave_going_down)
+ slave_state[slave].last_cmd = MLX4_COMM_CMD_RESET;
+ spin_unlock_irqrestore(&priv->mfunc.master.slave_state_lock, flags);
+ /* with slave in the middle of FLR, no need to clean resources again */
+inform_slave_state:
+ memset(&slave_state[slave].event_eq, 0,
+ sizeof(struct mlx4_slave_event_eq_info));
+ __raw_writel((__force u32) cpu_to_be32(reply),
+ &priv->mfunc.comm[slave].slave_read);
+ wmb();
+}
+
+/* master command processing */
+#ifdef KMOD_MODIFIED
+void mlx4_master_comm_channel(struct mlx4_mfunc_master_ctx *master)
+{
+#else
+void mlx4_master_comm_channel(struct work_struct *work)
+{
+ struct mlx4_mfunc_master_ctx *master =
+ container_of(work,
+ struct mlx4_mfunc_master_ctx,
+ comm_work);
+#endif
+ struct mlx4_mfunc *mfunc =
+ container_of(master, struct mlx4_mfunc, master);
+ struct mlx4_priv *priv =
+ container_of(mfunc, struct mlx4_priv, mfunc);
+ struct mlx4_dev *dev = &priv->dev;
+ __be32 *bit_vec;
+ u32 comm_cmd;
+ u32 vec;
+ int i, j, slave;
+ int toggle;
+ int served = 0;
+ int reported = 0;
+ u32 slt;
+
+ bit_vec = master->comm_arm_bit_vector;
+ for (i = 0; i < COMM_CHANNEL_BIT_ARRAY_SIZE; i++) {
+ vec = be32_to_cpu(bit_vec[i]);
+ for (j = 0; j < 32; j++) {
+ if (!(vec & (1U << j)))
+ continue;
+ ++reported;
+ slave = (i * 32) + j;
+ comm_cmd = swab32(readl(
+ &mfunc->comm[slave].slave_write));
+ slt = swab32(readl(&mfunc->comm[slave].slave_read))
+ >> 31;
+ toggle = comm_cmd >> 31;
+ if (toggle != slt) {
+ if (master->slave_state[slave].comm_toggle
+ != slt) {
+ pr_info("slave %d out of sync. read toggle %d, state toggle %d. Resynching.\n",
+ slave, slt,
+ master->slave_state[slave].comm_toggle);
+ master->slave_state[slave].comm_toggle =
+ slt;
+ }
+ mlx4_master_do_cmd(dev, slave,
+ comm_cmd >> 16 & 0xff,
+ comm_cmd & 0xffff, toggle);
+ ++served;
+ }
+ }
+ }
+
+ if (reported && reported != served)
+ mlx4_warn(dev, "Got command event with bitmask from %d slaves but %d were served\n",
+ reported, served);
+
+ if (mlx4_ARM_COMM_CHANNEL(dev))
+ mlx4_warn(dev, "Failed to arm comm channel events\n");
+}
+
+static int sync_toggles(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ u32 wr_toggle;
+ u32 rd_toggle;
+ unsigned long end;
+
+ wr_toggle = swab32(readl(&priv->mfunc.comm->slave_write));
+ if (wr_toggle == 0xffffffff)
+ end = jiffies + msecs_to_jiffies(30000);
+ else
+ end = jiffies + msecs_to_jiffies(5000);
+
+ while (time_before(jiffies, end)) {
+ rd_toggle = swab32(readl(&priv->mfunc.comm->slave_read));
+ if (wr_toggle == 0xffffffff || rd_toggle == 0xffffffff) {
+ /* PCI might be offline */
+ msleep(100);
+ wr_toggle = swab32(readl(&priv->mfunc.comm->
+ slave_write));
+ continue;
+ }
+
+ if (rd_toggle >> 31 == wr_toggle >> 31) {
+ priv->cmd.comm_toggle = rd_toggle >> 31;
+ return 0;
+ }
+
+ cond_resched();
+ }
+
+ /*
+ * we could reach here if for example the previous VM using this
+ * function misbehaved and left the channel with unsynced state. We
+ * should fix this here and give this VM a chance to use a properly
+ * synced channel
+ */
+ mlx4_warn(dev, "recovering from previously misbehaved VM\n");
+ __raw_writel((__force u32) 0, &priv->mfunc.comm->slave_read);
+ __raw_writel((__force u32) 0, &priv->mfunc.comm->slave_write);
+ priv->cmd.comm_toggle = 0;
+
+ return 0;
+}
+
+int mlx4_multi_func_init(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_slave_state *s_state;
+ int i, j, err, port;
+
+ if (mlx4_is_master(dev))
+ {
+
+#ifdef KMOD_MODIFIED
+ assert(
+ (priv->fw.comm_base + MLX4_COMM_PAGESIZE) <= dev->persist->rte_pdev->mem_resource[priv->fw.comm_bar].len
+ );
+ priv->mfunc.comm = RTE_PTR_ADD(dev->persist->rte_pdev->mem_resource[priv->fw.comm_bar].addr, priv->fw.comm_base);
+
+#else
+ priv->mfunc.comm = ioremap(pci_resource_start(dev->persist->pdev,
+ priv->fw.comm_bar) +
+ priv->fw.comm_base, MLX4_COMM_PAGESIZE);
+#endif
+ }
+ else
+ {
+#ifdef KMOD_MODIFIED
+ assert(
+ (MLX4_SLAVE_COMM_BASE + MLX4_COMM_PAGESIZE) <= dev->persist->rte_pdev->mem_resource[2].len
+ );
+ priv->mfunc.comm = RTE_PTR_ADD(dev->persist->rte_pdev->mem_resource[2].addr, MLX4_SLAVE_COMM_BASE);
+#else
+ priv->mfunc.comm =
+ ioremap(pci_resource_start(dev->persist->pdev, 2) +
+ MLX4_SLAVE_COMM_BASE, MLX4_COMM_PAGESIZE);
+#endif
+ }
+ if (!priv->mfunc.comm) {
+ mlx4_err(dev, "Couldn't map communication vector\n");
+ goto err_vhcr;
+ }
+
+ if (mlx4_is_master(dev)) {
+ struct mlx4_vf_oper_state *vf_oper;
+ struct mlx4_vf_admin_state *vf_admin;
+
+ priv->mfunc.master.slave_state =
+ kzalloc(dev->num_slaves *
+ sizeof(struct mlx4_slave_state), GFP_KERNEL);
+ if (!priv->mfunc.master.slave_state)
+ goto err_comm;
+
+ for (i = 0; i < dev->num_slaves; i++)
+ priv->mfunc.master.slave_state[i].slave_gid_type = MLX4_ROCE_GID_TYPE_INVALID;
+
+ priv->mfunc.master.vf_admin =
+ kzalloc(dev->num_slaves *
+ sizeof(struct mlx4_vf_admin_state), GFP_KERNEL);
+ if (!priv->mfunc.master.vf_admin)
+ goto err_comm_admin;
+
+ priv->mfunc.master.vf_oper =
+ kzalloc(dev->num_slaves *
+ sizeof(struct mlx4_vf_oper_state), GFP_KERNEL);
+ if (!priv->mfunc.master.vf_oper)
+ goto err_comm_oper;
+
+ for (i = 0; i < dev->num_slaves; ++i) {
+ vf_admin = &priv->mfunc.master.vf_admin[i];
+ vf_oper = &priv->mfunc.master.vf_oper[i];
+ s_state = &priv->mfunc.master.slave_state[i];
+ s_state->last_cmd = MLX4_COMM_CMD_RESET;
+ mutex_init(&priv->mfunc.master.gen_eqe_mutex[i]);
+ for (j = 0; j < MLX4_EVENT_TYPES_NUM; ++j)
+ s_state->event_eq[j].eqn = -1;
+ __raw_writel((__force u32) 0,
+ &priv->mfunc.comm[i].slave_write);
+ __raw_writel((__force u32) 0,
+ &priv->mfunc.comm[i].slave_read);
+ mmiowb();
+ for (port = 1; port <= MLX4_MAX_PORTS; port++) {
+ struct mlx4_vport_state *admin_vport;
+ struct mlx4_vport_state *oper_vport;
+
+ s_state->vlan_filter[port] =
+ kzalloc(sizeof(struct mlx4_vlan_fltr),
+ GFP_KERNEL);
+ if (!s_state->vlan_filter[port]) {
+ if (--port)
+ kfree(s_state->vlan_filter[port]);
+ goto err_slaves;
+ }
+
+ admin_vport = &vf_admin->vport[port];
+ oper_vport = &vf_oper->vport[port].state;
+ INIT_LIST_HEAD(&s_state->mcast_filters[port]);
+ admin_vport->default_vlan = MLX4_VGT;
+ oper_vport->default_vlan = MLX4_VGT;
+ admin_vport->qos_vport =
+ MLX4_VPP_DEFAULT_VPORT;
+ vf_oper->vport[port].vlan_idx = NO_INDX;
+ vf_oper->vport[port].mac_idx = NO_INDX;
+ mlx4_set_random_admin_guid(dev, i, port);
+ }
+ spin_lock_init(&s_state->lock);
+ }
+
+ if (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_QOS_VPP) {
+ for (port = 1; port <= dev->caps.num_ports; port++) {
+ if (mlx4_is_eth(dev, port)) {
+ mlx4_set_default_port_qos(dev, port);
+ mlx4_allocate_port_vpps(dev, port);
+ }
+ }
+ }
+
+ memset(&priv->mfunc.master.cmd_eqe, 0, dev->caps.eqe_size);
+ priv->mfunc.master.cmd_eqe.type = MLX4_EVENT_TYPE_CMD;
+#ifdef KMOD_DISABLED
+ INIT_WORK(&priv->mfunc.master.comm_work,
+ mlx4_master_comm_channel);
+ INIT_WORK(&priv->mfunc.master.slave_event_work,
+ mlx4_gen_slave_eqe);
+ INIT_WORK(&priv->mfunc.master.slave_flr_event_work,
+ mlx4_master_handle_slave_flr);
+#endif
+ spin_lock_init(&priv->mfunc.master.slave_state_lock);
+ spin_lock_init(&priv->mfunc.master.slave_eq.event_lock);
+#ifdef KMOD_DISABLED
+ priv->mfunc.master.comm_wq =
+ create_singlethread_workqueue("mlx4_comm");
+
+ if (!priv->mfunc.master.comm_wq)
+ goto err_slaves;
+#endif
+
+ if (mlx4_init_resource_tracker(dev))
+ goto err_thread;
+
+ } else {
+ err = sync_toggles(dev);
+ if (err) {
+ mlx4_err(dev, "Couldn't sync toggles\n");
+ goto err_comm;
+ }
+ }
+ return 0;
+
+err_thread:
+#ifdef KMOD_DISABLED
+ flush_workqueue(priv->mfunc.master.comm_wq);
+ destroy_workqueue(priv->mfunc.master.comm_wq);
+#endif
+err_slaves:
+ while (i-- > 0) {
+ for (port = 1; port <= MLX4_MAX_PORTS; port++)
+ kfree(priv->mfunc.master.slave_state[i].vlan_filter[port]);
+ }
+ kfree(priv->mfunc.master.vf_oper);
+err_comm_oper:
+ kfree(priv->mfunc.master.vf_admin);
+err_comm_admin:
+ kfree(priv->mfunc.master.slave_state);
+err_comm:
+#ifdef KMOD_DISABLED
+ iounmap(priv->mfunc.comm);
+#endif
+err_vhcr:
+#ifdef KMOD_DISABLED
+ dma_free_coherent(&dev->persist->pdev->dev, PAGE_SIZE,
+ priv->mfunc.vhcr,
+ priv->mfunc.vhcr_dma);
+#endif
+ priv->mfunc.vhcr = NULL;
+ return -ENOMEM;
+}
+
+int mlx4_cmd_init(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int flags = 0;
+
+ if (!priv->cmd.initialized) {
+ init_rwsem(&priv->cmd.switch_sem);
+ mutex_init(&priv->cmd.slave_cmd_mutex);
+ sema_init(&priv->cmd.poll_sem, 1);
+ priv->cmd.use_events = 0;
+ priv->cmd.toggle = 1;
+ priv->cmd.initialized = 1;
+ flags |= MLX4_CMD_CLEANUP_STRUCT;
+ }
+
+ if (!mlx4_is_slave(dev) && !priv->cmd.hcr) {
+#ifdef KMOD_MODIFIED
+ assert(dev->persist->rte_pdev->mem_resource[0].len >= (MLX4_HCR_BASE+MLX4_HCR_SIZE));
+ priv->cmd.hcr = RTE_PTR_ADD(dev->persist->rte_pdev->mem_resource[0].addr, MLX4_HCR_BASE);
+#else
+ priv->cmd.hcr = ioremap(pci_resource_start(dev->persist->pdev,
+ 0) + MLX4_HCR_BASE, MLX4_HCR_SIZE);
+#endif
+ if (!priv->cmd.hcr) {
+ mlx4_err(dev, "Couldn't map command register\n");
+ goto err;
+ }
+ flags |= MLX4_CMD_CLEANUP_HCR;
+ }
+
+ if (mlx4_is_mfunc(dev) && !priv->mfunc.vhcr) {
+#ifdef KMOD_MODIFIED
+ priv->mfunc.vhcr = rte_persistent_alloc(PAGE_SIZE, dev->persist->rte_pdev->numa_node);
+ if (!priv->mfunc.vhcr)
+ goto err;
+ priv->mfunc.vhcr_dma = rte_persistent_hw_addr(priv->mfunc.vhcr);
+#else
+ priv->mfunc.vhcr = dma_alloc_coherent(&dev->persist->pdev->dev,
+ PAGE_SIZE,
+ &priv->mfunc.vhcr_dma,
+ GFP_KERNEL);
+ if (!priv->mfunc.vhcr)
+ goto err;
+#endif
+
+ flags |= MLX4_CMD_CLEANUP_VHCR;
+ }
+#ifdef KMOD_MODIFIED
+ /* No mailbox pool is needed; mailboxes are allocated with rte_malloc instead. */
+#else
+ if (!priv->cmd.pool) {
+ priv->cmd.pool = pci_pool_create("mlx4_cmd",
+ dev->persist->pdev,
+ MLX4_MAILBOX_SIZE,
+ MLX4_MAILBOX_SIZE, 0);
+ if (!priv->cmd.pool)
+ goto err;
+
+ flags |= MLX4_CMD_CLEANUP_POOL;
+ }
+#endif
+
+ return 0;
+
+err:
+ mlx4_cmd_cleanup(dev, flags);
+ return -ENOMEM;
+}
+
+void mlx4_report_internal_err_comm_event(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int slave;
+ u32 slave_read;
+
+ /* Report an internal error event to all
+ * communication channels.
+ */
+ for (slave = 0; slave < dev->num_slaves; slave++) {
+ slave_read = swab32(readl(&priv->mfunc.comm[slave].slave_read));
+ slave_read |= (u32)COMM_CHAN_EVENT_INTERNAL_ERR;
+ __raw_writel((__force u32)cpu_to_be32(slave_read),
+ &priv->mfunc.comm[slave].slave_read);
+ /* Make sure that our comm channel write doesn't
+ * get mixed in with writes from another CPU.
+ */
+ mmiowb();
+ }
+}
+
+void mlx4_multi_func_cleanup(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int i, port;
+
+ if (mlx4_is_master(dev)) {
+#ifdef KMOD_DISABLED
+ flush_workqueue(priv->mfunc.master.comm_wq);
+ destroy_workqueue(priv->mfunc.master.comm_wq);
+#endif
+ for (i = 0; i < dev->num_slaves; i++) {
+ for (port = 1; port <= MLX4_MAX_PORTS; port++)
+ kfree(priv->mfunc.master.slave_state[i].vlan_filter[port]);
+ }
+ kfree(priv->mfunc.master.slave_state);
+ kfree(priv->mfunc.master.vf_admin);
+ kfree(priv->mfunc.master.vf_oper);
+ dev->num_slaves = 0;
+ }
+#ifdef KMOD_REMOVED
+ iounmap(priv->mfunc.comm);
+#endif
+}
+
+void mlx4_cmd_cleanup(struct mlx4_dev *dev, int cleanup_mask)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+#ifdef KMOD_MODIFIED
+ /* no mailbox pool */
+#else
+ if (priv->cmd.pool && (cleanup_mask & MLX4_CMD_CLEANUP_POOL)) {
+ pci_pool_destroy(priv->cmd.pool);
+ priv->cmd.pool = NULL;
+ }
+#endif
+
+ if (!mlx4_is_slave(dev) && priv->cmd.hcr &&
+ (cleanup_mask & MLX4_CMD_CLEANUP_HCR)) {
+#ifdef KMOD_REMOVED
+ iounmap(priv->cmd.hcr);
+#endif
+ priv->cmd.hcr = NULL;
+ }
+ if (mlx4_is_mfunc(dev) && priv->mfunc.vhcr &&
+ (cleanup_mask & MLX4_CMD_CLEANUP_VHCR)) {
+#ifdef KMOD_MODIFIED
+ rte_persistent_free(priv->mfunc.vhcr);
+#else
+ dma_free_coherent(&dev->persist->pdev->dev, PAGE_SIZE,
+ priv->mfunc.vhcr, priv->mfunc.vhcr_dma);
+#endif
+ priv->mfunc.vhcr = NULL;
+ }
+ if (priv->cmd.initialized && (cleanup_mask & MLX4_CMD_CLEANUP_STRUCT))
+ priv->cmd.initialized = 0;
+}
+
+/*
+ * Switch to using events to issue FW commands (can only be called
+ * after event queue for command events has been initialized).
+ */
+int mlx4_cmd_use_events(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int i;
+ int err = 0;
+
+ priv->cmd.context = kmalloc(priv->cmd.max_cmds *
+ sizeof (struct mlx4_cmd_context),
+ GFP_KERNEL);
+ if (!priv->cmd.context)
+ return -ENOMEM;
+
+ down_write(&priv->cmd.switch_sem);
+ for (i = 0; i < priv->cmd.max_cmds; ++i) {
+ priv->cmd.context[i].token = i;
+ priv->cmd.context[i].next = i + 1;
+ /* To support fatal error flow, initialize all
+ * cmd contexts to allow simulating completions
+ * with complete() at any time.
+ */
+ init_completion(&priv->cmd.context[i].done);
+ }
+
+ priv->cmd.context[priv->cmd.max_cmds - 1].next = -1;
+ priv->cmd.free_head = 0;
+
+ sema_init(&priv->cmd.event_sem, priv->cmd.max_cmds);
+ spin_lock_init(&priv->cmd.context_lock);
+
+ for (priv->cmd.token_mask = 1;
+ priv->cmd.token_mask < priv->cmd.max_cmds;
+ priv->cmd.token_mask <<= 1)
+ ; /* nothing */
+ --priv->cmd.token_mask;
+
+ down(&priv->cmd.poll_sem);
+ priv->cmd.use_events = 1;
+ up_write(&priv->cmd.switch_sem);
+
+ return err;
+}
+
+/*
+ * Switch back to polling (used when shutting down the device)
+ */
+void mlx4_cmd_use_polling(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int i;
+
+ down_write(&priv->cmd.switch_sem);
+ priv->cmd.use_events = 0;
+
+ for (i = 0; i < priv->cmd.max_cmds; ++i)
+ down(&priv->cmd.event_sem);
+
+ kfree(priv->cmd.context);
+
+ up(&priv->cmd.poll_sem);
+ up_write(&priv->cmd.switch_sem);
+}
+
+struct mlx4_cmd_mailbox *mlx4_alloc_cmd_mailbox(struct mlx4_dev *dev)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+
+ mailbox = kmalloc(sizeof *mailbox, GFP_KERNEL);
+ if (!mailbox)
+ return ERR_PTR(-ENOMEM);
+#ifdef KMOD_MODIFIED
+
+ mailbox->buf = rte_malloc_socket("mailbox", MLX4_MAILBOX_SIZE, MLX4_MAILBOX_SIZE, dev->numa_node);
+ if (mailbox->buf)
+ mailbox->dma = rte_malloc_virt2phy(mailbox->buf);
+#else
+ mailbox->buf = pci_pool_alloc(mlx4_priv(dev)->cmd.pool, GFP_KERNEL,
+ &mailbox->dma);
+#endif
+ if (!mailbox->buf) {
+ kfree(mailbox);
+ return ERR_PTR(-ENOMEM);
+ }
+
+ memset(mailbox->buf, 0, MLX4_MAILBOX_SIZE);
+
+ return mailbox;
+}
+EXPORT_SYMBOL_GPL(mlx4_alloc_cmd_mailbox);
+
+void mlx4_free_cmd_mailbox(struct mlx4_dev *dev,
+ struct mlx4_cmd_mailbox *mailbox)
+{
+ if (!mailbox)
+ return;
+#ifdef KMOD_MODIFIED
+ rte_free(mailbox->buf);
+ mailbox->buf = NULL;
+#else
+ pci_pool_free(mlx4_priv(dev)->cmd.pool, mailbox->buf, mailbox->dma);
+#endif
+ kfree(mailbox);
+}
+EXPORT_SYMBOL_GPL(mlx4_free_cmd_mailbox);
+
+u32 mlx4_comm_get_version(void)
+{
+ return ((u32) CMD_CHAN_IF_REV << 8) | (u32) CMD_CHAN_VER;
+}
+
+int mlx4_get_slave_indx(struct mlx4_dev *dev, int vf)
+{
+ if ((vf < 0) || (vf >= dev->persist->num_vfs)) {
+ mlx4_err(dev, "Bad vf number:%d (number of activated vf: %d)\n",
+ vf, dev->persist->num_vfs);
+ return -EINVAL;
+ }
+
+ return vf+1;
+}
+
+int mlx4_get_vf_indx(struct mlx4_dev *dev, int slave)
+{
+ if (slave < 1 || slave > dev->persist->num_vfs) {
+ mlx4_err(dev,
+ "Bad slave number:%d (number of activated slaves: %lu)\n",
+ slave, dev->num_slaves);
+ return -EINVAL;
+ }
+ return slave - 1;
+}
+
+void mlx4_cmd_wake_completions(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_cmd_context *context;
+ int i;
+
+ spin_lock(&priv->cmd.context_lock);
+ if (priv->cmd.context) {
+ for (i = 0; i < priv->cmd.max_cmds; ++i) {
+ context = &priv->cmd.context[i];
+ context->fw_status = CMD_STAT_INTERNAL_ERR;
+ context->result =
+ mlx4_status_to_errno(CMD_STAT_INTERNAL_ERR);
+ complete(&context->done);
+ }
+ }
+ spin_unlock(&priv->cmd.context_lock);
+}
+
+struct mlx4_active_ports mlx4_get_active_ports(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_active_ports actv_ports;
+ int vf;
+
+ bitmap_zero(actv_ports.ports, MLX4_MAX_PORTS);
+
+ if (slave == 0) {
+ bitmap_fill(actv_ports.ports, dev->caps.num_ports);
+ return actv_ports;
+ }
+
+ vf = mlx4_get_vf_indx(dev, slave);
+ if (vf < 0)
+ return actv_ports;
+
+ bitmap_set(actv_ports.ports, dev->dev_vfs[vf].min_port - 1,
+ min((int)dev->dev_vfs[mlx4_get_vf_indx(dev, slave)].n_ports,
+ dev->caps.num_ports));
+
+ return actv_ports;
+}
+EXPORT_SYMBOL_GPL(mlx4_get_active_ports);
+
+int mlx4_slave_convert_port(struct mlx4_dev *dev, int slave, int port)
+{
+ unsigned n;
+ struct mlx4_active_ports actv_ports = mlx4_get_active_ports(dev, slave);
+ unsigned m = bitmap_weight(actv_ports.ports, dev->caps.num_ports);
+
+ if (port <= 0 || port > m)
+ return -EINVAL;
+
+ n = find_first_bit(actv_ports.ports, dev->caps.num_ports);
+ if (port <= n)
+ port = n + 1;
+
+ return port;
+}
+EXPORT_SYMBOL_GPL(mlx4_slave_convert_port);
+
+int mlx4_phys_to_slave_port(struct mlx4_dev *dev, int slave, int port)
+{
+ struct mlx4_active_ports actv_ports = mlx4_get_active_ports(dev, slave);
+ if (test_bit(port - 1, actv_ports.ports))
+ return port -
+ find_first_bit(actv_ports.ports, dev->caps.num_ports);
+
+ return -1;
+}
+EXPORT_SYMBOL_GPL(mlx4_phys_to_slave_port);
+
+struct mlx4_slaves_pport mlx4_phys_to_slaves_pport(struct mlx4_dev *dev,
+ int port)
+{
+ unsigned i;
+ struct mlx4_slaves_pport slaves_pport;
+
+ bitmap_zero(slaves_pport.slaves, MLX4_MFUNC_MAX);
+
+ if (port <= 0 || port > dev->caps.num_ports)
+ return slaves_pport;
+
+ for (i = 0; i < dev->persist->num_vfs + 1; i++) {
+ struct mlx4_active_ports actv_ports =
+ mlx4_get_active_ports(dev, i);
+ if (test_bit(port - 1, actv_ports.ports))
+ set_bit(i, slaves_pport.slaves);
+ }
+
+ return slaves_pport;
+}
+EXPORT_SYMBOL_GPL(mlx4_phys_to_slaves_pport);
+
+struct mlx4_slaves_pport mlx4_phys_to_slaves_pport_actv(
+ struct mlx4_dev *dev,
+ const struct mlx4_active_ports *crit_ports)
+{
+ unsigned i;
+ struct mlx4_slaves_pport slaves_pport;
+
+ bitmap_zero(slaves_pport.slaves, MLX4_MFUNC_MAX);
+
+ for (i = 0; i < dev->persist->num_vfs + 1; i++) {
+ struct mlx4_active_ports actv_ports =
+ mlx4_get_active_ports(dev, i);
+ if (bitmap_equal(crit_ports->ports, actv_ports.ports,
+ dev->caps.num_ports))
+ set_bit(i, slaves_pport.slaves);
+ }
+
+ return slaves_pport;
+}
+EXPORT_SYMBOL_GPL(mlx4_phys_to_slaves_pport_actv);
+
+static int mlx4_slaves_closest_port(struct mlx4_dev *dev, int slave, int port)
+{
+ struct mlx4_active_ports actv_ports = mlx4_get_active_ports(dev, slave);
+ int min_port = find_first_bit(actv_ports.ports, dev->caps.num_ports)
+ + 1;
+ int max_port = min_port +
+ bitmap_weight(actv_ports.ports, dev->caps.num_ports);
+
+ if (port < min_port)
+ port = min_port;
+ else if (port >= max_port)
+ port = max_port - 1;
+
+ return port;
+}
+
+static int mlx4_set_vport_qos(struct mlx4_priv *priv, int slave, int port,
+ int max_tx_rate)
+{
+ int i;
+ int err;
+ struct mlx4_qos_manager *port_qos;
+ struct mlx4_dev *dev = &priv->dev;
+ struct mlx4_vport_qos_param vpp_qos[MLX4_NUM_UP];
+
+ port_qos = &priv->mfunc.master.qos_ctl[port];
+ memset(vpp_qos, 0, sizeof(struct mlx4_vport_qos_param) * MLX4_NUM_UP);
+
+ if (slave > port_qos->num_of_qos_vfs) {
+ mlx4_info(dev, "No available VPP resources for this VF\n");
+ return -EINVAL;
+ }
+
+ /* Query for default QoS values from Vport 0 is needed */
+ err = mlx4_SET_VPORT_QOS_get(dev, port, 0, vpp_qos);
+ if (err) {
+ mlx4_info(dev, "Failed to query Vport 0 QoS values\n");
+ return err;
+ }
+
+ for (i = 0; i < MLX4_NUM_UP; i++) {
+ if (test_bit(i, port_qos->priority_bm) && max_tx_rate) {
+ vpp_qos[i].max_avg_bw = max_tx_rate;
+ vpp_qos[i].enable = 1;
+ } else {
+ /* if user supplied tx_rate == 0, no rate limit
+ * configuration is required, so we leave the
+ * value of max_avg_bw as queried from Vport 0.
+ */
+ vpp_qos[i].enable = 0;
+ }
+ }
+
+ err = mlx4_SET_VPORT_QOS_set(dev, port, slave, vpp_qos);
+ if (err) {
+ mlx4_info(dev, "Failed to set Vport %d QoS values\n", slave);
+ return err;
+ }
+
+ return 0;
+}
+
+static bool mlx4_is_vf_vst_and_prio_qos(struct mlx4_dev *dev, int port,
+ struct mlx4_vport_state *vf_admin)
+{
+ struct mlx4_qos_manager *info;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ if (!mlx4_is_master(dev) ||
+ !(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_QOS_VPP))
+ return false;
+
+ info = &priv->mfunc.master.qos_ctl[port];
+
+ if (vf_admin->default_vlan != MLX4_VGT &&
+ test_bit(vf_admin->default_qos, info->priority_bm))
+ return true;
+
+ return false;
+}
+
+static bool mlx4_valid_vf_state_change(struct mlx4_dev *dev, int port,
+ struct mlx4_vport_state *vf_admin,
+ int vlan, int qos)
+{
+ struct mlx4_vport_state dummy_admin = {0};
+
+ if (!mlx4_is_vf_vst_and_prio_qos(dev, port, vf_admin) ||
+ !vf_admin->tx_rate)
+ return true;
+
+ dummy_admin.default_qos = qos;
+ dummy_admin.default_vlan = vlan;
+
+ /* VF wants to move to other VST state which is valid with current
+ * rate limit. Either different default vlan in VST or other
+ * supported QoS priority. Otherwise we don't allow this change when
+ * the TX rate is still configured.
+ */
+ if (mlx4_is_vf_vst_and_prio_qos(dev, port, &dummy_admin))
+ return true;
+
+ mlx4_info(dev, "Cannot change VF state to %s while rate is set\n",
+ (vlan == MLX4_VGT) ? "VGT" : "VST");
+
+ if (vlan != MLX4_VGT)
+ mlx4_info(dev, "VST priority %d not supported for QoS\n", qos);
+
+ mlx4_info(dev, "Please set rate to 0 prior to this VF state change\n");
+
+ return false;
+}
+
+int mlx4_set_vf_mac(struct mlx4_dev *dev, int port, int vf, u64 mac)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_vport_state *s_info;
+ int slave;
+
+ if (!mlx4_is_master(dev))
+ return -EPROTONOSUPPORT;
+
+ slave = mlx4_get_slave_indx(dev, vf);
+ if (slave < 0)
+ return -EINVAL;
+
+ port = mlx4_slaves_closest_port(dev, slave, port);
+ s_info = &priv->mfunc.master.vf_admin[slave].vport[port];
+ s_info->mac = mac;
+ mlx4_info(dev, "default mac on vf %d port %d to %llX will take effect only after vf restart\n",
+ vf, port, s_info->mac);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_set_vf_mac);
+
+
+int mlx4_set_vf_vlan(struct mlx4_dev *dev, int port, int vf, u16 vlan, u8 qos)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_vport_state *vf_admin;
+ int slave;
+
+ if ((!mlx4_is_master(dev)) ||
+ !(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_VLAN_CONTROL))
+ return -EPROTONOSUPPORT;
+
+ if ((vlan > 4095) || (qos > 7))
+ return -EINVAL;
+
+ slave = mlx4_get_slave_indx(dev, vf);
+ if (slave < 0)
+ return -EINVAL;
+
+ port = mlx4_slaves_closest_port(dev, slave, port);
+ vf_admin = &priv->mfunc.master.vf_admin[slave].vport[port];
+
+ if (!mlx4_valid_vf_state_change(dev, port, vf_admin, vlan, qos))
+ return -EPERM;
+
+ if ((0 == vlan) && (0 == qos))
+ vf_admin->default_vlan = MLX4_VGT;
+ else
+ vf_admin->default_vlan = vlan;
+ vf_admin->default_qos = qos;
+
+ /* If rate was configured prior to VST, we saved the configured rate
+ * in vf_admin->tx_rate and now, if the priority is supported, we enforce the QoS
+ */
+ if (mlx4_is_vf_vst_and_prio_qos(dev, port, vf_admin) &&
+ vf_admin->tx_rate)
+ vf_admin->qos_vport = slave;
+
+ if (mlx4_master_immediate_activate_vlan_qos(priv, slave, port))
+ mlx4_info(dev,
+ "updating vf %d port %d config will take effect on next VF restart\n",
+ vf, port);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_set_vf_vlan);
+
+int mlx4_set_vf_rate(struct mlx4_dev *dev, int port, int vf, int min_tx_rate,
+ int max_tx_rate)
+{
+ int err;
+ int slave;
+ struct mlx4_vport_state *vf_admin;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ if (!mlx4_is_master(dev) ||
+ !(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_QOS_VPP))
+ return -EPROTONOSUPPORT;
+
+ if (min_tx_rate) {
+ mlx4_info(dev, "Minimum BW share not supported\n");
+ return -EPROTONOSUPPORT;
+ }
+
+ slave = mlx4_get_slave_indx(dev, vf);
+ if (slave < 0)
+ return -EINVAL;
+
+ port = mlx4_slaves_closest_port(dev, slave, port);
+ vf_admin = &priv->mfunc.master.vf_admin[slave].vport[port];
+
+ err = mlx4_set_vport_qos(priv, slave, port, max_tx_rate);
+ if (err) {
+ mlx4_info(dev, "vf %d failed to set rate %d\n", vf,
+ max_tx_rate);
+ return err;
+ }
+
+ vf_admin->tx_rate = max_tx_rate;
+ /* if VF is not in supported mode (VST with supported prio),
+ * we do not change vport configuration for its QPs, but save
+ * the rate, so it will be enforced when it moves to supported
+ * mode next time.
+ */
+ if (!mlx4_is_vf_vst_and_prio_qos(dev, port, vf_admin)) {
+ mlx4_info(dev,
+ "rate set for VF %d when not in valid state\n", vf);
+
+ if (vf_admin->default_vlan != MLX4_VGT)
+ mlx4_info(dev, "VST priority not supported by QoS\n");
+ else
+ mlx4_info(dev, "VF in VGT mode (needed VST)\n");
+
+ mlx4_info(dev,
+ "rate %d will take effect when VF moves to valid state\n",
+ max_tx_rate);
+ return 0;
+ }
+
+ /* If the user sets rate 0, assign the default vport to its QPs */
+ vf_admin->qos_vport = max_tx_rate ? slave : MLX4_VPP_DEFAULT_VPORT;
+
+ if (priv->mfunc.master.slave_state[slave].active &&
+ dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_UPDATE_QP)
+ mlx4_master_immediate_activate_vlan_qos(priv, slave, port);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_set_vf_rate);
+
+ /* mlx4_get_slave_default_vlan -
+ * return true if the slave is in VST mode (has a default vlan);
+ * in that case, also return vlan & qos through the pointers (if not NULL)
+ */
+bool mlx4_get_slave_default_vlan(struct mlx4_dev *dev, int port, int slave,
+ u16 *vlan, u8 *qos)
+{
+ struct mlx4_vport_oper_state *vp_oper;
+ struct mlx4_priv *priv;
+
+ priv = mlx4_priv(dev);
+ port = mlx4_slaves_closest_port(dev, slave, port);
+ vp_oper = &priv->mfunc.master.vf_oper[slave].vport[port];
+
+ if (MLX4_VGT != vp_oper->state.default_vlan) {
+ if (vlan)
+ *vlan = vp_oper->state.default_vlan;
+ if (qos)
+ *qos = vp_oper->state.default_qos;
+ return true;
+ }
+ return false;
+}
+EXPORT_SYMBOL_GPL(mlx4_get_slave_default_vlan);
+
+int mlx4_set_vf_spoofchk(struct mlx4_dev *dev, int port, int vf, bool setting)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_vport_state *s_info;
+ int slave;
+
+ if ((!mlx4_is_master(dev)) ||
+ !(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_FSM))
+ return -EPROTONOSUPPORT;
+
+ slave = mlx4_get_slave_indx(dev, vf);
+ if (slave < 0)
+ return -EINVAL;
+
+ port = mlx4_slaves_closest_port(dev, slave, port);
+ s_info = &priv->mfunc.master.vf_admin[slave].vport[port];
+ s_info->spoofchk = setting;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_set_vf_spoofchk);
+
+#ifdef HAVE_NDO_SET_VF_MAC
+int mlx4_get_vf_config(struct mlx4_dev *dev, int port, int vf, struct ifla_vf_info *ivf)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_vport_state *s_info;
+ int slave;
+
+ if (!mlx4_is_master(dev))
+ return -EPROTONOSUPPORT;
+
+ slave = mlx4_get_slave_indx(dev, vf);
+ if (slave < 0)
+ return -EINVAL;
+
+ s_info = &priv->mfunc.master.vf_admin[slave].vport[port];
+ ivf->vf = vf;
+
+ /* need to convert it to a func */
+ ivf->mac[0] = ((s_info->mac >> (5*8)) & 0xff);
+ ivf->mac[1] = ((s_info->mac >> (4*8)) & 0xff);
+ ivf->mac[2] = ((s_info->mac >> (3*8)) & 0xff);
+ ivf->mac[3] = ((s_info->mac >> (2*8)) & 0xff);
+ ivf->mac[4] = ((s_info->mac >> (1*8)) & 0xff);
+ ivf->mac[5] = ((s_info->mac) & 0xff);
+
+ ivf->vlan = s_info->default_vlan;
+ ivf->qos = s_info->default_qos;
+
+#ifdef HAVE_TX_RATE_LIMIT
+ if (mlx4_is_vf_vst_and_prio_qos(dev, port, s_info))
+ ivf->max_tx_rate = s_info->tx_rate;
+ else
+ ivf->max_tx_rate = 0;
+
+ ivf->min_tx_rate = 0;
+#elif defined(HAVE_VF_TX_RATE)
+ if (mlx4_is_vf_vst_and_prio_qos(dev, port, s_info))
+ ivf->tx_rate = s_info->tx_rate;
+ else
+ ivf->tx_rate = 0;
+#endif
+#ifdef HAVE_VF_INFO_SPOOFCHK
+ ivf->spoofchk = s_info->spoofchk;
+#endif
+#ifdef HAVE_LINKSTATE
+ ivf->linkstate = s_info->link_state;
+#endif
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_get_vf_config);
+#endif
+
+int mlx4_set_vf_link_state(struct mlx4_dev *dev, int port, int vf, int link_state)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_vport_state *s_info;
+ int slave;
+ u8 link_stat_event;
+
+ slave = mlx4_get_slave_indx(dev, vf);
+ if (slave < 0)
+ return -EINVAL;
+
+ port = mlx4_slaves_closest_port(dev, slave, port);
+ switch (link_state) {
+ case IFLA_VF_LINK_STATE_AUTO:
+ /* get current link state */
+ if (!priv->sense.do_sense_port[port])
+ link_stat_event = MLX4_PORT_CHANGE_SUBTYPE_ACTIVE;
+ else
+ link_stat_event = MLX4_PORT_CHANGE_SUBTYPE_DOWN;
+ break;
+
+ case IFLA_VF_LINK_STATE_ENABLE:
+ link_stat_event = MLX4_PORT_CHANGE_SUBTYPE_ACTIVE;
+ break;
+
+ case IFLA_VF_LINK_STATE_DISABLE:
+ link_stat_event = MLX4_PORT_CHANGE_SUBTYPE_DOWN;
+ break;
+
+ default:
+ mlx4_warn(dev, "unknown value for link_state %02x on slave %d port %d\n",
+ link_state, slave, port);
+ return -EINVAL;
+ }
+ s_info = &priv->mfunc.master.vf_admin[slave].vport[port];
+ s_info->link_state = link_state;
+
+ /* send event */
+ mlx4_gen_port_state_change_eqe(dev, slave, port, link_stat_event);
+
+ if (mlx4_master_immediate_activate_vlan_qos(priv, slave, port))
+ mlx4_dbg(dev,
+ "updating vf %d port %d no link state HW enforcement\n",
+ vf, port);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_set_vf_link_state);
+
+int mlx4_get_vf_link_state(struct mlx4_dev *dev, int port, int vf)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_vport_state *s_info;
+ int slave;
+
+ if (!mlx4_is_master(dev))
+ return -EPROTONOSUPPORT;
+
+ slave = mlx4_get_slave_indx(dev, vf);
+ if (slave < 0)
+ return -EINVAL;
+
+ s_info = &priv->mfunc.master.vf_admin[slave].vport[port];
+
+ return s_info->link_state;
+}
+EXPORT_SYMBOL_GPL(mlx4_get_vf_link_state);
+
+#ifdef KMOD_DISABLED
+int mlx4_get_vf_statistics(struct mlx4_dev *dev, int port, int vf_idx,
+ struct net_device_stats *link_stats)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_cmd_mailbox *if_stat_mailbox = NULL;
+ union mlx4_counter *counter;
+ int slave;
+ int err = 0;
+ u32 if_stat_in_mod;
+ struct counter_index *vf, *tmp_vf;
+
+ if (!link_stats)
+ return -EINVAL;
+
+ if (!mlx4_is_master(dev))
+ return -EPROTONOSUPPORT;
+
+ slave = mlx4_get_slave_indx(dev, vf_idx);
+ if (slave < 0)
+ return -EINVAL;
+
+ if_stat_mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(if_stat_mailbox)) {
+ err = PTR_ERR(if_stat_mailbox);
+ return err;
+ }
+
+ memset(link_stats, 0, sizeof(*link_stats));
+
+ mutex_lock(&priv->counters_table.mutex);
+ list_for_each_entry_safe(vf, tmp_vf,
+ &priv->counters_table.vf_list[slave - 1][port - 1],
+ list) {
+ mlx4_dbg(dev, "%s: read statistics for slave %d, port %d, counter index %d\n",
+ __func__, slave, port, vf->index);
+
+ memset(if_stat_mailbox->buf, 0, sizeof(union mlx4_counter));
+ if_stat_in_mod = (vf->index & 0xff);
+ if (if_stat_in_mod == MLX4_SINK_COUNTER_INDEX)
+ continue;
+ err = mlx4_cmd_box(dev, 0, if_stat_mailbox->dma,
+ if_stat_in_mod, 0,
+ MLX4_CMD_QUERY_IF_STAT,
+ MLX4_CMD_TIME_CLASS_C,
+ MLX4_CMD_NATIVE);
+ if (err) {
+ mlx4_dbg(dev, "%s: failed to read statistics for counter index %d\n",
+ __func__, vf->index);
+ goto if_stat_out;
+ }
+ counter = (union mlx4_counter *)if_stat_mailbox->buf;
+ if ((counter->control.cnt_mode & 0xf) == 1) {
+ link_stats->rx_packets += be64_to_cpu(counter->ext.counters[0].IfRxBroadcastFrames) +
+ be64_to_cpu(counter->ext.counters[0].IfRxUnicastFrames) +
+ be64_to_cpu(counter->ext.counters[0].IfRxMulticastFrames);
+ link_stats->tx_packets += be64_to_cpu(counter->ext.counters[0].IfTxBroadcastFrames) +
+ be64_to_cpu(counter->ext.counters[0].IfTxUnicastFrames) +
+ be64_to_cpu(counter->ext.counters[0].IfTxMulticastFrames);
+ link_stats->rx_bytes += be64_to_cpu(counter->ext.counters[0].IfRxBroadcastOctets) +
+ be64_to_cpu(counter->ext.counters[0].IfRxUnicastOctets) +
+ be64_to_cpu(counter->ext.counters[0].IfRxMulticastOctets);
+ link_stats->tx_bytes += be64_to_cpu(counter->ext.counters[0].IfTxBroadcastOctets) +
+ be64_to_cpu(counter->ext.counters[0].IfTxUnicastOctets) +
+ be64_to_cpu(counter->ext.counters[0].IfTxMulticastOctets);
+ link_stats->rx_errors += be64_to_cpu(counter->ext.counters[0].IfRxErrorFrames);
+ link_stats->rx_dropped += be64_to_cpu(counter->ext.counters[0].IfRxNoBufferFrames);
+ link_stats->tx_dropped += be64_to_cpu(counter->ext.counters[0].IfTxDroppedFrames);
+ link_stats->multicast += be64_to_cpu(counter->ext.counters[0].IfRxMulticastFrames);
+ }
+ }
+
+if_stat_out:
+ mutex_unlock(&priv->counters_table.mutex);
+ mlx4_free_cmd_mailbox(dev, if_stat_mailbox);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_get_vf_statistics);
+#endif
+
+int mlx4_vf_smi_enabled(struct mlx4_dev *dev, int slave, int port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ if (slave < 1 || slave >= dev->num_slaves ||
+ port < 1 || port > MLX4_MAX_PORTS)
+ return 0;
+
+ return priv->mfunc.master.vf_oper[slave].smi_enabled[port] ==
+ MLX4_VF_SMI_ENABLED;
+}
+EXPORT_SYMBOL_GPL(mlx4_vf_smi_enabled);
+
+int mlx4_vf_get_enable_smi_admin(struct mlx4_dev *dev, int slave, int port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ if (slave == mlx4_master_func_num(dev))
+ return 1;
+
+ if (slave < 1 || slave >= dev->num_slaves ||
+ port < 1 || port > MLX4_MAX_PORTS)
+ return 0;
+
+ return priv->mfunc.master.vf_admin[slave].enable_smi[port] ==
+ MLX4_VF_SMI_ENABLED;
+}
+EXPORT_SYMBOL_GPL(mlx4_vf_get_enable_smi_admin);
+
+int mlx4_vf_set_enable_smi_admin(struct mlx4_dev *dev, int slave, int port,
+ int enabled)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ if (slave == mlx4_master_func_num(dev))
+ return 0;
+
+ if (slave < 1 || slave >= dev->num_slaves ||
+ port < 1 || port > MLX4_MAX_PORTS ||
+ enabled < 0 || enabled > 1)
+ return -EINVAL;
+
+ priv->mfunc.master.vf_admin[slave].enable_smi[port] = enabled;
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_vf_set_enable_smi_admin);
+
+ssize_t mlx4_get_vf_rate(struct mlx4_dev *dev, int port, int vf, char *buf)
+{
+ int slave;
+ int rate = 0;
+ ssize_t len = 0;
+ struct mlx4_vport_state *s_info;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ if (!mlx4_is_master(dev) ||
+ !(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_QOS_VPP))
+ return -EOPNOTSUPP;
+
+ slave = mlx4_get_slave_indx(dev, vf);
+ if (slave < 0)
+ return -EINVAL;
+
+ s_info = &priv->mfunc.master.vf_admin[slave].vport[port];
+
+ if (mlx4_is_vf_vst_and_prio_qos(dev, port, s_info))
+ rate = s_info->tx_rate;
+
+ len += sprintf(&buf[len], "%d\n", rate);
+
+ return len;
+}
+EXPORT_SYMBOL_GPL(mlx4_get_vf_rate);
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/cq.c b/drivers/net/mlnx_uio/mlnx/mlx4/cq.c
new file mode 100644
index 0000000..20e1133
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/cq.c
@@ -0,0 +1,443 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved.
+ * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007 Cisco Systems, Inc. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007, 2008 Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2004 Voltaire, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx4/cq.h"
+
+#include "mlx4.h"
+#include "icm.h"
+#include "log2.h"
+
+#define MLX4_CQ_STATUS_OK ( 0 << 28)
+#define MLX4_CQ_STATUS_OVERFLOW ( 9 << 28)
+#define MLX4_CQ_STATUS_WRITE_FAIL (10 << 28)
+#define MLX4_CQ_FLAG_CC ( 1 << 18)
+#define MLX4_CQ_FLAG_OI ( 1 << 17)
+#define MLX4_CQ_STATE_ARMED ( 9 << 8)
+#define MLX4_CQ_STATE_ARMED_SOL ( 6 << 8)
+#define MLX4_EQ_STATE_FIRED (10 << 8)
+
+#define TASKLET_MAX_TIME 2
+#define TASKLET_MAX_TIME_JIFFIES msecs_to_jiffies(TASKLET_MAX_TIME)
+
+#ifdef KMOD_REMOVED
+void mlx4_cq_tasklet_cb(unsigned long data)
+{
+ unsigned long flags;
+ unsigned long end = jiffies + TASKLET_MAX_TIME_JIFFIES;
+ struct mlx4_eq_tasklet *ctx = (struct mlx4_eq_tasklet *)data;
+ struct mlx4_cq *mcq, *temp;
+
+ spin_lock_irqsave(&ctx->lock, flags);
+ list_splice_tail_init(&ctx->list, &ctx->process_list);
+ spin_unlock_irqrestore(&ctx->lock, flags);
+
+ list_for_each_entry_safe(mcq, temp, &ctx->process_list, tasklet_ctx.list) {
+ list_del_init(&mcq->tasklet_ctx.list);
+ mcq->tasklet_ctx.comp(mcq);
+ if (atomic_dec_and_test(&mcq->refcount))
+ complete(&mcq->free);
+ if (time_after(jiffies, end))
+ break;
+ }
+
+ if (!list_empty(&ctx->process_list))
+ tasklet_schedule(&ctx->task);
+}
+
+static void mlx4_add_cq_to_tasklet(struct mlx4_cq *cq)
+{
+ unsigned long flags;
+ struct mlx4_eq_tasklet *tasklet_ctx = cq->tasklet_ctx.priv;
+
+ spin_lock_irqsave(&tasklet_ctx->lock, flags);
+ /* Once migration of CQs between EQs is implemented, this
+ * point will need synchronization: while a CQ is being
+ * migrated, completions from the old EQ could still
+ * arrive.
+ */
+ if (list_empty_careful(&cq->tasklet_ctx.list)) {
+ atomic_inc(&cq->refcount);
+ list_add_tail(&cq->tasklet_ctx.list, &tasklet_ctx->list);
+ }
+ spin_unlock_irqrestore(&tasklet_ctx->lock, flags);
+}
+#endif
+
+void mlx4_cq_completion(struct mlx4_dev *dev, u32 cqn)
+{
+ struct mlx4_cq *cq;
+
+ rcu_read_lock();
+ cq = radix_tree_lookup(&mlx4_priv(dev)->cq_table.tree,
+ cqn & (dev->caps.num_cqs - 1));
+ rcu_read_unlock();
+
+ if (!cq) {
+ mlx4_dbg(dev, "Completion event for bogus CQ %08x\n", cqn);
+ return;
+ }
+
+ ++cq->arm_sn;
+
+ //cq->comp(cq); XXX
+}
+
+void mlx4_cq_event(struct mlx4_dev *dev, u32 cqn, int event_type)
+{
+ struct mlx4_cq_table *cq_table = &mlx4_priv(dev)->cq_table;
+ struct mlx4_cq *cq;
+
+ rcu_read_lock();
+ cq = radix_tree_lookup(&cq_table->tree, cqn & (dev->caps.num_cqs - 1));
+ rcu_read_unlock();
+
+ if (cq) {
+ atomic_inc(&cq->refcount);
+ } else {
+ mlx4_warn(dev, "Async event for bogus CQ %08x\n", cqn);
+ return;
+ }
+
+ //cq->event(cq, event_type); XXX
+
+ if (atomic_dec_and_test(&cq->refcount))
+ complete(&cq->free);
+}
+
+static int mlx4_SW2HW_CQ(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox,
+ int cq_num)
+{
+ return mlx4_cmd(dev, mailbox->dma, cq_num, 0,
+ MLX4_CMD_SW2HW_CQ, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_WRAPPED);
+}
+
+static int mlx4_MODIFY_CQ(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox,
+ int cq_num, u32 opmod)
+{
+ return mlx4_cmd(dev, mailbox->dma, cq_num, opmod, MLX4_CMD_MODIFY_CQ,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+}
+
+static int mlx4_HW2SW_CQ(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox,
+ int cq_num)
+{
+ return mlx4_cmd_box(dev, 0, mailbox ? mailbox->dma : 0,
+ cq_num, mailbox ? 0 : 1, MLX4_CMD_HW2SW_CQ,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+}
+
+int mlx4_cq_modify(struct mlx4_dev *dev, struct mlx4_cq *cq,
+ u16 count, u16 period)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_cq_context *cq_context;
+ int err;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ cq_context = mailbox->buf;
+ cq_context->cq_max_count = cpu_to_be16(count);
+ cq_context->cq_period = cpu_to_be16(period);
+
+ err = mlx4_MODIFY_CQ(dev, mailbox, cq->cqn, 1);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_cq_modify);
+
+int mlx4_cq_resize(struct mlx4_dev *dev, struct mlx4_cq *cq,
+ int entries, struct mlx4_mtt *mtt)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_cq_context *cq_context;
+ u64 mtt_addr;
+ int err;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ cq_context = mailbox->buf;
+ cq_context->logsize_usrpage = cpu_to_be32(ilog2(entries) << 24);
+ cq_context->log_page_size = mtt->page_shift - 12;
+ mtt_addr = mlx4_mtt_addr(dev, mtt);
+ cq_context->mtt_base_addr_h = mtt_addr >> 32;
+ cq_context->mtt_base_addr_l = cpu_to_be32(mtt_addr & 0xffffffff);
+
+ err = mlx4_MODIFY_CQ(dev, mailbox, cq->cqn, 0);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_cq_resize);
+
+int mlx4_cq_ignore_overrun(struct mlx4_dev *dev, struct mlx4_cq *cq)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_cq_context *cq_context;
+ int err;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ cq_context = mailbox->buf;
+ memset(cq_context, 0, sizeof *cq_context);
+
+ cq_context->flags |= cpu_to_be32(MLX4_CQ_FLAG_OI);
+
+ err = mlx4_MODIFY_CQ(dev, mailbox, cq->cqn, 3);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_cq_ignore_overrun);
+
+int __mlx4_cq_alloc_icm(struct mlx4_dev *dev, int *cqn)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_cq_table *cq_table = &priv->cq_table;
+ int err;
+
+ *cqn = mlx4_bitmap_alloc(&cq_table->bitmap);
+ if (*cqn == -1)
+ return -ENOMEM;
+
+ err = mlx4_table_get(dev, &cq_table->table, *cqn, GFP_KERNEL);
+ if (err)
+ goto err_out;
+
+ err = mlx4_table_get(dev, &cq_table->cmpt_table, *cqn, GFP_KERNEL);
+ if (err)
+ goto err_put;
+ return 0;
+
+err_put:
+ mlx4_table_put(dev, &cq_table->table, *cqn);
+
+err_out:
+ mlx4_bitmap_free(&cq_table->bitmap, *cqn, MLX4_NO_RR);
+ return err;
+}
+
+static int mlx4_cq_alloc_icm(struct mlx4_dev *dev, int *cqn)
+{
+ u64 out_param;
+ int err;
+
+ if (mlx4_is_mfunc(dev)) {
+ err = mlx4_cmd_imm(dev, 0, &out_param, RES_CQ,
+ RES_OP_RESERVE_AND_MAP, MLX4_CMD_ALLOC_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+ if (err)
+ return err;
+ else {
+ *cqn = get_param_l(&out_param);
+ return 0;
+ }
+ }
+ return __mlx4_cq_alloc_icm(dev, cqn);
+}
+
+void __mlx4_cq_free_icm(struct mlx4_dev *dev, int cqn)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_cq_table *cq_table = &priv->cq_table;
+
+ mlx4_table_put(dev, &cq_table->cmpt_table, cqn);
+ mlx4_table_put(dev, &cq_table->table, cqn);
+ mlx4_bitmap_free(&cq_table->bitmap, cqn, MLX4_NO_RR);
+}
+
+static void mlx4_cq_free_icm(struct mlx4_dev *dev, int cqn)
+{
+ u64 in_param = 0;
+ int err;
+
+ if (mlx4_is_mfunc(dev)) {
+ set_param_l(&in_param, cqn);
+ err = mlx4_cmd(dev, in_param, RES_CQ, RES_OP_RESERVE_AND_MAP,
+ MLX4_CMD_FREE_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+ if (err)
+ mlx4_warn(dev, "Failed freeing cq:%d\n", cqn);
+ } else
+ __mlx4_cq_free_icm(dev, cqn);
+}
+
+int mlx4_cq_alloc(struct mlx4_dev *dev, int nent,
+ struct mlx4_mtt *mtt, struct mlx4_uar *uar, u64 db_rec,
+ struct mlx4_cq *cq, unsigned vector, int collapsed,
+ int timestamp_en)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_cq_table *cq_table = &priv->cq_table;
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_cq_context *cq_context;
+ u64 mtt_addr;
+ int err;
+
+ if (vector >= dev->caps.num_comp_vectors)
+ return -EINVAL;
+
+ cq->vector = vector;
+
+ err = mlx4_cq_alloc_icm(dev, &cq->cqn);
+ if (err)
+ return err;
+
+ spin_lock(&cq_table->lock);
+ err = radix_tree_insert(&cq_table->tree, cq->cqn, cq);
+ spin_unlock(&cq_table->lock);
+ if (err)
+ goto err_icm;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox)) {
+ err = PTR_ERR(mailbox);
+ goto err_radix;
+ }
+
+ cq_context = mailbox->buf;
+ cq_context->flags = cpu_to_be32(!!collapsed << 18);
+ if (timestamp_en)
+ cq_context->flags |= cpu_to_be32(1 << 19);
+
+ cq_context->logsize_usrpage = cpu_to_be32((ilog2(nent) << 24) | uar->index);
+ cq_context->comp_eqn = priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(vector)].eqn;
+ cq_context->log_page_size = mtt->page_shift - MLX4_ICM_PAGE_SHIFT;
+
+ mtt_addr = mlx4_mtt_addr(dev, mtt);
+ cq_context->mtt_base_addr_h = mtt_addr >> 32;
+ cq_context->mtt_base_addr_l = cpu_to_be32(mtt_addr & 0xffffffff);
+ cq_context->db_rec_addr = cpu_to_be64(db_rec);
+
+ err = mlx4_SW2HW_CQ(dev, mailbox, cq->cqn);
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ if (err)
+ goto err_radix;
+
+ cq->cons_index = 0;
+ cq->arm_sn = 1;
+ cq->uar = uar;
+ cq->eqn = priv->eq_table.eq[cq->vector].eqn;
+ cq->irq = priv->eq_table.eq[cq->vector].irq;
+
+ atomic_set(&cq->refcount, 1);
+ init_completion(&cq->free);
+#ifdef KMOD_REMOVED
+ cq->comp = mlx4_add_cq_to_tasklet;
+ cq->tasklet_ctx.priv =
+ &priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(vector)].tasklet_ctx;
+#endif
+ INIT_LIST_HEAD(&cq->tasklet_ctx.list);
+
+ cq->irq = priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(vector)].irq;
+ return 0;
+
+err_radix:
+ spin_lock(&cq_table->lock);
+ radix_tree_delete(&cq_table->tree, cq->cqn);
+ spin_unlock(&cq_table->lock);
+
+err_icm:
+ mlx4_cq_free_icm(dev, cq->cqn);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_cq_alloc);
+
+void mlx4_cq_free(struct mlx4_dev *dev, struct mlx4_cq *cq)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_cq_table *cq_table = &priv->cq_table;
+ int err;
+
+ err = mlx4_HW2SW_CQ(dev, NULL, cq->cqn);
+ if (err)
+ mlx4_warn(dev, "HW2SW_CQ failed (%d) for CQN %06x\n", err, cq->cqn);
+
+ spin_lock(&cq_table->lock);
+ radix_tree_delete(&cq_table->tree, cq->cqn);
+ spin_unlock(&cq_table->lock);
+
+ synchronize_irq(priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(cq->vector)].irq);
+ /* synchronize ASYNC irq */
+ if (priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(cq->vector)].irq !=
+ priv->eq_table.eq[MLX4_EQ_ASYNC].irq)
+ synchronize_irq(priv->eq_table.eq[MLX4_EQ_ASYNC].irq);
+
+ if (atomic_dec_and_test(&cq->refcount))
+ complete(&cq->free);
+ wait_for_completion(&cq->free);
+
+ mlx4_cq_free_icm(dev, cq->cqn);
+}
+EXPORT_SYMBOL_GPL(mlx4_cq_free);
+
+int mlx4_init_cq_table(struct mlx4_dev *dev)
+{
+ struct mlx4_cq_table *cq_table = &mlx4_priv(dev)->cq_table;
+ int err;
+
+ spin_lock_init(&cq_table->lock);
+ INIT_RADIX_TREE(&cq_table->tree, GFP_ATOMIC);
+ if (mlx4_is_slave(dev))
+ return 0;
+
+ err = mlx4_bitmap_init(&cq_table->bitmap, dev->caps.num_cqs,
+ dev->caps.num_cqs - 1, dev->caps.reserved_cqs, 0);
+ if (err)
+ return err;
+
+ return 0;
+}
+
+void mlx4_cleanup_cq_table(struct mlx4_dev *dev)
+{
+ if (mlx4_is_slave(dev))
+ return;
+ /* Nothing to do to clean up radix_tree */
+ mlx4_bitmap_cleanup(&mlx4_priv(dev)->cq_table.bitmap);
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/en_clock.c b/drivers/net/mlnx_uio/mlnx/mlx4/en_clock.c
new file mode 100644
index 0000000..6b52622
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/en_clock.c
@@ -0,0 +1,330 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2012 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+
+#include "mlx4_en.h"
+
+/* mlx4_en_read_clock - read raw cycle counter (to be used by time counter)
+ */
+#ifdef KMOD_DISABLED
+static cycle_t mlx4_en_read_clock(const struct cyclecounter *tc)
+{
+ struct mlx4_en_dev *mdev =
+ container_of(tc, struct mlx4_en_dev, cycles);
+ struct mlx4_dev *dev = mdev->dev;
+
+ return mlx4_read_clock(dev) & tc->mask;
+}
+#endif
+
+u64 mlx4_en_get_cqe_ts(struct mlx4_cqe *cqe)
+{
+ u64 hi, lo;
+ struct mlx4_ts_cqe *ts_cqe = (struct mlx4_ts_cqe *)cqe;
+
+ lo = (u64)be16_to_cpu(ts_cqe->timestamp_lo);
+ hi = ((u64)be32_to_cpu(ts_cqe->timestamp_hi) + !lo) << 16;
+
+ return hi | lo;
+}
+
+#ifdef KMOD_DISABLED
+void mlx4_en_fill_hwtstamps(struct mlx4_en_dev *mdev,
+ struct skb_shared_hwtstamps *hwts,
+ u64 timestamp)
+{
+ unsigned long flags;
+ u64 nsec;
+
+ read_lock_irqsave(&mdev->clock_lock, flags);
+ nsec = timecounter_cyc2time(&mdev->clock, timestamp);
+ read_unlock_irqrestore(&mdev->clock_lock, flags);
+
+ memset(hwts, 0, sizeof(struct skb_shared_hwtstamps));
+ hwts->hwtstamp = ns_to_ktime(nsec);
+}
+#endif
+
+#if defined (HAVE_PTP_CLOCK_INFO) && (defined (CONFIG_PTP_1588_CLOCK) || defined(CONFIG_PTP_1588_CLOCK_MODULE))
+/**
+ * mlx4_en_remove_timestamp - disable PTP device
+ * @mdev: board private structure
+ *
+ * Stop the PTP support.
+ **/
+void mlx4_en_remove_timestamp(struct mlx4_en_dev *mdev)
+{
+ if (mdev->ptp_clock) {
+ ptp_clock_unregister(mdev->ptp_clock);
+ mdev->ptp_clock = NULL;
+ mlx4_info(mdev, "removed PHC\n");
+ }
+}
+#endif
+
+#ifdef KMOD_DISABLED
+void mlx4_en_ptp_overflow_check(struct mlx4_en_dev *mdev)
+{
+ bool timeout = time_is_before_jiffies(mdev->last_overflow_check +
+ mdev->overflow_period);
+ unsigned long flags;
+
+ if (timeout) {
+ write_lock_irqsave(&mdev->clock_lock, flags);
+ timecounter_read(&mdev->clock);
+ write_unlock_irqrestore(&mdev->clock_lock, flags);
+ mdev->last_overflow_check = jiffies;
+ }
+}
+#endif
+
+#if defined (HAVE_PTP_CLOCK_INFO) && (defined (CONFIG_PTP_1588_CLOCK) || defined(CONFIG_PTP_1588_CLOCK_MODULE))
+/**
+ * mlx4_en_phc_adjfreq - adjust the frequency of the hardware clock
+ * @ptp: ptp clock structure
+ * @delta: Desired frequency change in parts per billion
+ *
+ * Adjust the frequency of the PHC cycle counter by the indicated delta from
+ * the base frequency.
+ **/
+static int mlx4_en_phc_adjfreq(struct ptp_clock_info *ptp, s32 delta)
+{
+ u64 adj;
+ u32 diff, mult;
+ int neg_adj = 0;
+ unsigned long flags;
+ struct mlx4_en_dev *mdev = container_of(ptp, struct mlx4_en_dev,
+ ptp_clock_info);
+
+ if (delta < 0) {
+ neg_adj = 1;
+ delta = -delta;
+ }
+ mult = mdev->nominal_c_mult;
+ adj = mult;
+ adj *= delta;
+ diff = div_u64(adj, 1000000000ULL);
+
+ write_lock_irqsave(&mdev->clock_lock, flags);
+ timecounter_read(&mdev->clock);
+ mdev->cycles.mult = neg_adj ? mult - diff : mult + diff;
+ write_unlock_irqrestore(&mdev->clock_lock, flags);
+
+ return 0;
+}
+
+/**
+ * mlx4_en_phc_adjtime - Shift the time of the hardware clock
+ * @ptp: ptp clock structure
+ * @delta: Desired change in nanoseconds
+ *
+ * Adjust the timer by resetting the timecounter structure.
+ **/
+static int mlx4_en_phc_adjtime(struct ptp_clock_info *ptp, s64 delta)
+{
+ struct mlx4_en_dev *mdev = container_of(ptp, struct mlx4_en_dev,
+ ptp_clock_info);
+ unsigned long flags;
+
+ write_lock_irqsave(&mdev->clock_lock, flags);
+ timecounter_adjtime(&mdev->clock, delta);
+ write_unlock_irqrestore(&mdev->clock_lock, flags);
+
+ return 0;
+}
+
+/**
+ * mlx4_en_phc_gettime - Reads the current time from the hardware clock
+ * @ptp: ptp clock structure
+ * @ts: timespec structure to hold the current time value
+ *
+ * Read the timecounter and store the current time, converted from ns,
+ * in the timespec pointed to by @ts.
+ **/
+#ifdef HAVE_PTP_CLOCK_INFO_GETTIME_32BIT
+static int mlx4_en_phc_gettime(struct ptp_clock_info *ptp, struct timespec *ts)
+#else
+static int mlx4_en_phc_gettime(struct ptp_clock_info *ptp,
+ struct timespec64 *ts)
+#endif
+{
+ struct mlx4_en_dev *mdev = container_of(ptp, struct mlx4_en_dev,
+ ptp_clock_info);
+ unsigned long flags;
+ u32 remainder;
+ u64 ns;
+
+ write_lock_irqsave(&mdev->clock_lock, flags);
+ ns = timecounter_read(&mdev->clock);
+ write_unlock_irqrestore(&mdev->clock_lock, flags);
+
+ ts->tv_sec = div_u64_rem(ns, NSEC_PER_SEC, &remainder);
+ ts->tv_nsec = remainder;
+
+ return 0;
+}
+
+/**
+ * mlx4_en_phc_settime - Set the current time on the hardware clock
+ * @ptp: ptp clock structure
+ * @ts: timespec containing the new time for the cycle counter
+ *
+ * Reset the timecounter to use a new base value instead of the kernel
+ * wall timer value.
+ **/
+static int mlx4_en_phc_settime(struct ptp_clock_info *ptp,
+#ifdef HAVE_PTP_CLOCK_INFO_GETTIME_32BIT
+ const struct timespec *ts)
+#else
+ const struct timespec64 *ts)
+#endif
+{
+ struct mlx4_en_dev *mdev = container_of(ptp, struct mlx4_en_dev,
+ ptp_clock_info);
+#ifdef HAVE_PTP_CLOCK_INFO_GETTIME_32BIT
+ u64 ns = timespec_to_ns(ts);
+#else
+ u64 ns = timespec64_to_ns(ts);
+#endif
+ unsigned long flags;
+
+ /* reset the timecounter */
+ write_lock_irqsave(&mdev->clock_lock, flags);
+ timecounter_init(&mdev->clock, &mdev->cycles, ns);
+ write_unlock_irqrestore(&mdev->clock_lock, flags);
+
+ return 0;
+}
+
+/**
+ * mlx4_en_phc_enable - enable or disable an ancillary feature
+ * @ptp: ptp clock structure
+ * @request: Desired resource to enable or disable
+ * @on: Caller passes one to enable or zero to disable
+ *
+ * Enable (or disable) ancillary features of the PHC subsystem.
+ * Currently, no ancillary features are supported.
+ **/
+static int mlx4_en_phc_enable(struct ptp_clock_info __always_unused *ptp,
+ struct ptp_clock_request __always_unused *request,
+ int __always_unused on)
+{
+ return -EOPNOTSUPP;
+}
+
+static const struct ptp_clock_info mlx4_en_ptp_clock_info = {
+ .owner = THIS_MODULE,
+ .max_adj = 100000000,
+ .n_alarm = 0,
+ .n_ext_ts = 0,
+ .n_per_out = 0,
+#ifdef HAVE_PTP_CLOCK_INFO_N_PINS
+ .n_pins = 0,
+#endif
+ .pps = 0,
+ .adjfreq = mlx4_en_phc_adjfreq,
+ .adjtime = mlx4_en_phc_adjtime,
+#ifdef HAVE_PTP_CLOCK_INFO_GETTIME_32BIT
+ .gettime = mlx4_en_phc_gettime,
+ .settime = mlx4_en_phc_settime,
+#else
+ .gettime64 = mlx4_en_phc_gettime,
+ .settime64 = mlx4_en_phc_settime,
+#endif
+ .enable = mlx4_en_phc_enable,
+};
+#endif
+
+#ifdef KMOD_DISABLED
+void mlx4_en_init_timestamp(struct mlx4_en_dev *mdev)
+{
+ struct mlx4_dev *dev = mdev->dev;
+ unsigned long flags;
+#ifdef HAVE_CYCLECOUNTER_CYC2NS_4_PARAMS
+ u64 ns, zero = 0;
+#else
+ u64 ns;
+#endif
+
+ rwlock_init(&mdev->clock_lock);
+
+ memset(&mdev->cycles, 0, sizeof(mdev->cycles));
+ mdev->cycles.read = mlx4_en_read_clock;
+ mdev->cycles.mask = CLOCKSOURCE_MASK(48);
+ /* Use a shift to make the calculation more accurate. Since the current
+ * HW clock frequency is 427 MHz and cycles are read from a 48-bit
+ * register, the largest shift that keeps a u64 calculation from
+ * overflowing is 14 (max_cycles * multiplier < 2^64).
+ */
+ mdev->cycles.shift = 14;
+ mdev->cycles.mult =
+ clocksource_khz2mult(1000 * dev->caps.hca_core_clock, mdev->cycles.shift);
+ mdev->nominal_c_mult = mdev->cycles.mult;
+
+ write_lock_irqsave(&mdev->clock_lock, flags);
+ timecounter_init(&mdev->clock, &mdev->cycles,
+ ktime_to_ns(ktime_get_real()));
+ write_unlock_irqrestore(&mdev->clock_lock, flags);
+
+ /* Calculate period in seconds to call the overflow watchdog - to make
+ * sure counter is checked at least once every wrap around.
+ */
+#ifdef HAVE_CYCLECOUNTER_CYC2NS_4_PARAMS
+ ns = cyclecounter_cyc2ns(&mdev->cycles, mdev->cycles.mask, zero, &zero);
+#else
+ ns = cyclecounter_cyc2ns(&mdev->cycles, mdev->cycles.mask);
+#endif
+ do_div(ns, NSEC_PER_SEC / 2 / HZ);
+ mdev->overflow_period = ns;
+
+#if defined (HAVE_PTP_CLOCK_INFO) && (defined (CONFIG_PTP_1588_CLOCK) || defined(CONFIG_PTP_1588_CLOCK_MODULE))
+ /* Configure the PHC */
+ mdev->ptp_clock_info = mlx4_en_ptp_clock_info;
+ snprintf(mdev->ptp_clock_info.name, 16, "mlx4 ptp");
+
+ mdev->ptp_clock = ptp_clock_register(&mdev->ptp_clock_info,
+ &mdev->pdev->dev);
+ if (IS_ERR(mdev->ptp_clock)) {
+ mdev->ptp_clock = NULL;
+ mlx4_err(mdev, "ptp_clock_register failed\n");
+ } else {
+ mlx4_info(mdev, "registered PHC clock\n");
+ }
+#endif
+
+}
+#endif
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/en_cq.c b/drivers/net/mlnx_uio/mlnx/mlx4/en_cq.c
new file mode 100644
index 0000000..75ecfc7
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/en_cq.c
@@ -0,0 +1,257 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2007 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+
+#include "mlx4_en.h"
+
+static void mlx4_en_cq_event(struct mlx4_cq *cq, enum mlx4_event event)
+{
+ return;
+}
+
+
+int mlx4_en_create_cq(struct mlx4_en_priv *priv,
+ struct mlx4_en_cq *cq,
+ int entries, int ring, enum cq_type mode,
+ int node)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int err;
+
+ cq->size = entries;
+ cq->buf_size = cq->size * mdev->dev->caps.cqe_size;
+
+ cq->ring = ring;
+ cq->is_tx = mode;
+ cq->vector = mdev->dev->caps.num_comp_vectors;
+
+ /* Allocate HW buffers on provided NUMA node.
+ * dev->numa_node is used in mtt range allocation flow.
+ */
+ //set_dev_node(&mdev->dev->persist->pdev->dev, node);
+ err = mlx4_alloc_hwq_res(mdev->dev, &cq->wqres,
+ cq->buf_size, 2 * PAGE_SIZE);
+ //set_dev_node(&mdev->dev->persist->pdev->dev, mdev->dev->numa_node);
+ if (err)
+ goto err_cq;
+
+ err = mlx4_en_map_buffer(&cq->wqres.buf);
+ if (err)
+ goto err_res;
+
+ cq->buf = (struct mlx4_cqe *)cq->wqres.buf.direct.buf;
+
+ return 0;
+
+err_res:
+ mlx4_free_hwq_res(mdev->dev, &cq->wqres, cq->buf_size);
+err_cq:
+ return err;
+}
+
+#define MLX4_EN_EQ_NAME_PRIORITY 2
+
+#ifdef KMOD_DISABLED
+static void mlx4_en_cq_eq_cb(unsigned vector, u32 uuid, void *data)
+{
+ int err;
+ struct mlx4_en_cq **pcq = data;
+
+ if (MLX4_EQ_UUID_TO_ID(uuid) == MLX4_EQ_ID_EN) {
+ struct mlx4_en_cq *cq = *pcq;
+ //struct mlx4_en_priv *priv = netdev_priv(cq->dev);
+ struct mlx4_en_priv *priv = cq->rte_dev->data->dev_private;
+ struct mlx4_en_dev *mdev = priv->mdev;
+
+ if (uuid == MLX4_EQ_ID_TO_UUID(MLX4_EQ_ID_EN, priv->port,
+ pcq - priv->rx_cq)) {
+ err = mlx4_rename_eq(mdev->dev, priv->port, vector,
+ MLX4_EN_EQ_NAME_PRIORITY, "%s-%d",
+ priv->rte_dev->data->name, cq->ring);
+ if (err)
+ mlx4_warn(mdev, "Failed to rename EQ, continuing with default name\n");
+ }
+ }
+}
+#endif
+
+int mlx4_en_activate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq,
+ int cq_idx, int timestamp_en)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int err = 0;
+ char name[25];
+ bool assigned_eq = false;
+
+ cq->rte_dev = mdev->rte_pndev[priv->port];
+ cq->mcq.set_ci_db = cq->wqres.db.db;
+ cq->mcq.arm_db = cq->wqres.db.db + 1;
+ *cq->mcq.set_ci_db = 0;
+ *cq->mcq.arm_db = 0;
+ memset(cq->buf, 0, cq->buf_size);
+
+ if (cq->is_tx == RX) {
+ if (!mlx4_is_eq_vector_valid(mdev->dev, priv->port,
+ cq->vector)) {
+#ifdef KMOD_MODIFIED
+ cq->vector = 0; //XXX
+#else
+ cq->vector = cpumask_first(priv->rx_ring[cq->ring]->affinity_mask);
+#endif
+
+ err = mlx4_assign_eq(mdev->dev, priv->port,
+ MLX4_EQ_ID_TO_UUID(MLX4_EQ_ID_EN,
+ priv->port,
+ cq_idx),
+ NULL, //mlx4_en_cq_eq_cb,
+ NULL,//&priv->rx_cq[cq_idx],
+ &cq->vector);
+ if (err) {
+ snprintf(name, sizeof(name), "%s-%d", priv->rte_dev->data->name, cq->ring);
+ mlx4_err(mdev, "Failed assigning an EQ to %s\n", name);
+ goto free_eq;
+ }
+
+ assigned_eq = true;
+ }
+
+ /* Set IRQ for specific name (per ring) */
+ err = mlx4_rename_eq(mdev->dev, priv->port, cq->vector,
+ MLX4_EN_EQ_NAME_PRIORITY, "%s-%d",
+ priv->rte_dev->data->name, cq->ring);
+
+ if (err) {
+ mlx4_warn(mdev, "Failed to rename EQ, continuing with default name\n");
+ err = 0;
+ }
+
+#if defined(HAVE_IRQ_DESC_GET_IRQ_DATA) && defined(HAVE_IRQ_TO_DESC_EXPORTED)
+ cq->irq_desc =
+ irq_to_desc(mlx4_eq_get_irq(mdev->dev,
+ cq->vector));
+#endif
+ } else {
+ /* For TX we use the same irq per
+ ring we assigned for the RX */
+ struct mlx4_en_rx_ring *rx_ring;
+
+ cq_idx = cq_idx % priv->rte_dev->data->nb_rx_queues;
+ //rx_cq = priv->rx_cq[cq_idx];
+ rx_ring = priv->rte_dev->data->rx_queues[cq_idx];
+ cq->vector = rx_ring->rx_cq.vector;
+ }
+
+ if (!cq->is_tx)
+ cq->size = ((struct mlx4_en_rx_ring *)priv->rte_dev->data->rx_queues[cq->ring])->actual_size;
+#ifdef KMOD_MODIFIED
+
+#else
+ if ((cq->is_tx && priv->hwtstamp_config.tx_type) ||
+ (!cq->is_tx && priv->hwtstamp_config.rx_filter))
+ timestamp_en = 1;
+#endif
+
+ err = mlx4_cq_alloc(mdev->dev, cq->size, &cq->wqres.mtt,
+ &mdev->priv_uar, cq->wqres.db.dma, &cq->mcq,
+ cq->vector, 0, timestamp_en);
+ if (err)
+ goto free_eq;
+#ifdef KMOD_MODIFIED
+ //let's call them directly
+#else
+ cq->mcq.comp = cq->is_tx ? mlx4_en_tx_irq : mlx4_en_rx_irq;
+ cq->mcq.event = mlx4_en_cq_event;
+#endif
+
+#ifdef KMOD_DISABLED
+ if (cq->is_tx) {
+ netif_napi_add(cq->dev, &cq->napi, mlx4_en_poll_tx_cq,
+ NAPI_POLL_WEIGHT);
+ } else {
+ netif_napi_add(cq->dev, &cq->napi, mlx4_en_poll_rx_cq, 64);
+#ifdef HAVE_NAPI_HASH_ADD
+ napi_hash_add(&cq->napi);
+#endif
+ }
+
+ napi_enable(&cq->napi);
+#endif
+
+ return 0;
+
+free_eq:
+ if (assigned_eq)
+ mlx4_release_eq(mdev->dev, MLX4_EQ_ID_TO_UUID(
+ MLX4_EQ_ID_EN, priv->port, cq_idx),
+ cq->vector);
+ cq->vector = mdev->dev->caps.num_comp_vectors;
+ return err;
+}
+
+
+void mlx4_en_deactivate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq)
+{
+#ifdef KMOD_DISABLED
+ napi_disable(&cq->napi);
+#ifdef HAVE_NAPI_HASH_ADD
+ if (!cq->is_tx) {
+ napi_hash_del(&cq->napi);
+ synchronize_rcu();
+ }
+#endif
+ netif_napi_del(&cq->napi);
+#endif
+
+ mlx4_cq_free(priv->mdev->dev, &cq->mcq);
+}
+
+/* Set rx cq moderation parameters */
+int mlx4_en_set_cq_moder(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq)
+{
+ return mlx4_cq_modify(priv->mdev->dev, &cq->mcq,
+ cq->moder_cnt, cq->moder_time);
+}
+
+int mlx4_en_arm_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq)
+{
+ mlx4_cq_arm(&cq->mcq, MLX4_CQ_DB_REQ_NOT, priv->mdev->uar_map,
+ &priv->mdev->uar_lock);
+
+ return 0;
+}
+
+
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/en_dcb_nl.c b/drivers/net/mlnx_uio/mlnx/mlx4/en_dcb_nl.c
new file mode 100644
index 0000000..f486378
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/en_dcb_nl.c
@@ -0,0 +1,613 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2011 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+
+#include "mlx4_en.h"
+#include "fw_qos.h"
+
+/* Definitions for QCN
+ */
+
+#define MLX4_EN_MAX_TX_RING_P_UP 32
+#define MLX4_EN_NUM_UP 8
+
+struct mlx4_congestion_control_mb_prio_802_1_qau_params {
+ __be32 modify_enable_high;
+ __be32 modify_enable_low;
+ __be32 reserved1;
+ __be32 extended_enable;
+ __be32 rppp_max_rps;
+ __be32 rpg_time_reset;
+ __be32 rpg_byte_reset;
+ __be32 rpg_threshold;
+ __be32 rpg_max_rate;
+ __be32 rpg_ai_rate;
+ __be32 rpg_hai_rate;
+ __be32 rpg_gd;
+ __be32 rpg_min_dec_fac;
+ __be32 rpg_min_rate;
+ __be32 max_time_rise;
+ __be32 max_byte_rise;
+ __be32 max_qdelta;
+ __be32 min_qoffset;
+ __be32 gd_coefficient;
+ __be32 reserved2[5];
+ __be32 cp_sample_base;
+ __be32 reserved3[39];
+} __packed;
+
+struct mlx4_congestion_control_mb_prio_802_1_qau_statistics {
+ __be64 rppp_rp_centiseconds;
+ __be32 reserved1;
+ __be32 ignored_cnm;
+ __be32 rppp_created_rps;
+ __be32 estimated_total_rate;
+ __be32 max_active_rate_limiter_index;
+ __be32 dropped_cnms_busy_fw;
+ __be32 reserved2;
+ __be32 cnms_handled_successfully;
+ __be32 min_total_limiters_rate;
+ __be32 max_total_limiters_rate;
+ __be32 reserved3[4];
+} __packed;
+
+enum mlx4_en_congestion_control_algorithm {
+ MLX4_CTRL_ALGO_802_1_QAU_REACTION_POINT = 0,
+};
+
+enum mlx4_en_congestion_control_opmod {
+ MLX4_CONGESTION_CONTROL_GET_PARAMS,
+ MLX4_CONGESTION_CONTROL_GET_STATISTICS,
+ MLX4_CONGESTION_CONTROL_SET_PARAMS = 4,
+};
+
+static int mlx4_en_dcbnl_ieee_getets(struct rte_eth_dev *dev,
+ struct ieee_ets *ets)
+{
+ struct mlx4_en_priv *priv = dev->data->dev_private;
+ struct ieee_ets *my_ets = &priv->ets;
+
+ /* No IEEE ETS settings available */
+ if (!my_ets)
+ return -EINVAL;
+
+ ets->ets_cap = IEEE_8021QAZ_MAX_TCS;
+ ets->cbs = my_ets->cbs;
+ memcpy(ets->tc_tx_bw, my_ets->tc_tx_bw, sizeof(ets->tc_tx_bw));
+ memcpy(ets->tc_tsa, my_ets->tc_tsa, sizeof(ets->tc_tsa));
+ memcpy(ets->prio_tc, my_ets->prio_tc, sizeof(ets->prio_tc));
+
+ return 0;
+}
+
+static int mlx4_en_ets_validate(struct mlx4_en_priv *priv, struct ieee_ets *ets)
+{
+ int i;
+ int total_ets_bw = 0;
+ int has_ets_tc = 0;
+
+ for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) {
+ if (ets->prio_tc[i] >= MLX4_EN_NUM_UP) {
+ en_err(priv, "Bad priority in UP <=> TC mapping. TC: %d, UP: %d\n",
+ i, ets->prio_tc[i]);
+ return -EINVAL;
+ }
+
+ switch (ets->tc_tsa[i]) {
+ case IEEE_8021QAZ_TSA_VENDOR:
+ case IEEE_8021QAZ_TSA_STRICT:
+ break;
+ case IEEE_8021QAZ_TSA_ETS:
+ has_ets_tc = 1;
+ total_ets_bw += ets->tc_tx_bw[i];
+ break;
+ default:
+ en_err(priv, "TC[%d]: Not supported TSA: %d\n",
+ i, ets->tc_tsa[i]);
+ return -ENOTSUPP;
+ }
+ }
+
+ if (has_ets_tc && total_ets_bw != MLX4_EN_BW_MAX) {
+ en_err(priv, "Bad ETS BW sum: %d. Should be exactly 100%%\n",
+ total_ets_bw);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+int mlx4_disable_32_14_4_e_read(struct mlx4_dev *dev, u8 *config, int port)
+{
+ struct mlx4_congestion_control_mb_prio_802_1_qau_params *hw_qcn;
+ struct mlx4_cmd_mailbox *mailbox_out = NULL;
+ u64 mailbox_in_dma = 0;
+ u32 inmod = 0;
+ int err = 0;
+
+ if (!(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_QCN))
+ return -EOPNOTSUPP;
+
+ mailbox_out = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox_out))
+ return -ENOMEM;
+
+ hw_qcn =
+ (struct mlx4_congestion_control_mb_prio_802_1_qau_params *)
+ mailbox_out->buf;
+
+ inmod = port | 1 << 8 |
+ (MLX4_CTRL_ALGO_802_1_QAU_REACTION_POINT << 16);
+
+ err = mlx4_cmd_box(dev, mailbox_in_dma,
+ mailbox_out->dma,
+ inmod, MLX4_CONGESTION_CONTROL_GET_PARAMS,
+ MLX4_CMD_CONGESTION_CTRL_OPCODE,
+ MLX4_CMD_TIME_CLASS_C,
+ MLX4_CMD_NATIVE);
+ if (!err)
+ *config = be32_to_cpu(hw_qcn->extended_enable) >> 22;
+
+ mlx4_free_cmd_mailbox(dev, mailbox_out);
+
+ return err;
+}
+
+int mlx4_disable_32_14_4_e_write(struct mlx4_dev *dev, u8 config, int port)
+{
+ struct mlx4_congestion_control_mb_prio_802_1_qau_params *hw_qcn;
+ struct mlx4_cmd_mailbox *mailbox_in = NULL;
+ u64 mailbox_in_dma = 0;
+ u32 inmod = 0;
+ int err;
+
+ if (!(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_QCN))
+ return -EOPNOTSUPP;
+
+ mailbox_in = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox_in))
+ return -ENOMEM;
+
+ mailbox_in_dma = mailbox_in->dma;
+ hw_qcn =
+ (struct mlx4_congestion_control_mb_prio_802_1_qau_params *)mailbox_in->buf;
+
+ inmod = port | 0xff << 8 |
+ (MLX4_CTRL_ALGO_802_1_QAU_REACTION_POINT << 16);
+
+ /* Before updating QCN parameter,
+ * need to set its modify enable bit to 1
+ */
+
+ hw_qcn->modify_enable_high = cpu_to_be32(1 << 22);
+
+ hw_qcn->extended_enable = cpu_to_be32(config << 22);
+
+ err = mlx4_cmd(dev, mailbox_in_dma, inmod,
+ MLX4_CONGESTION_CONTROL_SET_PARAMS,
+ MLX4_CMD_CONGESTION_CTRL_OPCODE,
+ MLX4_CMD_TIME_CLASS_C,
+ MLX4_CMD_NATIVE);
+
+ mlx4_free_cmd_mailbox(dev, mailbox_in);
+ return err;
+}
+
+static int mlx4_en_config_port_scheduler(struct mlx4_en_priv *priv,
+ struct ieee_ets *ets, u16 *ratelimit)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int num_strict = 0;
+ int i;
+ __u8 tc_tx_bw[IEEE_8021QAZ_MAX_TCS] = { 0 };
+ __u8 pg[IEEE_8021QAZ_MAX_TCS] = { 0 };
+
+ ets = ets ?: &priv->ets;
+ ratelimit = ratelimit ?: priv->maxrate;
+
+ /* higher TC means higher priority => lower pg */
+ for (i = IEEE_8021QAZ_MAX_TCS - 1; i >= 0; i--) {
+ switch (ets->tc_tsa[i]) {
+ case IEEE_8021QAZ_TSA_VENDOR:
+ pg[i] = MLX4_EN_TC_VENDOR;
+ tc_tx_bw[i] = MLX4_EN_BW_MAX;
+ break;
+ case IEEE_8021QAZ_TSA_STRICT:
+ pg[i] = num_strict++;
+ tc_tx_bw[i] = MLX4_EN_BW_MAX;
+ break;
+ case IEEE_8021QAZ_TSA_ETS:
+ pg[i] = MLX4_EN_TC_ETS;
+ tc_tx_bw[i] = ets->tc_tx_bw[i] ?: MLX4_EN_BW_MIN;
+ break;
+ }
+ }
+
+ return mlx4_SET_PORT_SCHEDULER(mdev->dev, priv->port, tc_tx_bw, pg,
+ ratelimit);
+}
+
+static int
+mlx4_en_dcbnl_ieee_setets(struct rte_eth_dev *rte_dev, struct ieee_ets *ets)
+{
+ struct mlx4_en_priv *priv = rte_dev->data->dev_private;
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int err;
+
+ err = mlx4_en_ets_validate(priv, ets);
+ if (err)
+ return err;
+
+ err = mlx4_SET_PORT_PRIO2TC(mdev->dev, priv->port, ets->prio_tc);
+ if (err)
+ return err;
+
+ err = mlx4_en_config_port_scheduler(priv, ets, NULL);
+ if (err)
+ return err;
+
+ memcpy(&priv->ets, ets, sizeof(priv->ets));
+
+ return 0;
+}
+
+static int mlx4_en_dcbnl_ieee_getpfc(struct rte_eth_dev *dev,
+ struct ieee_pfc *pfc)
+{
+ struct mlx4_en_priv *priv = dev->data->dev_private;
+
+ pfc->pfc_cap = IEEE_8021QAZ_MAX_TCS;
+ pfc->pfc_en = priv->prof->tx_ppp;
+
+ return 0;
+}
+
+static int mlx4_en_dcbnl_ieee_setpfc(struct rte_eth_dev *rte_dev,
+ struct ieee_pfc *pfc)
+{
+ struct mlx4_en_priv *priv = rte_dev->data->dev_private;
+ struct mlx4_en_port_profile *prof = priv->prof;
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int err;
+
+ en_dbg(DRV, priv, "cap: 0x%x en: 0x%x mbc: 0x%x delay: %d\n",
+ pfc->pfc_cap,
+ pfc->pfc_en,
+ pfc->mbc,
+ pfc->delay);
+
+ prof->rx_pause = !pfc->pfc_en;
+ prof->tx_pause = !pfc->pfc_en;
+ prof->rx_ppp = pfc->pfc_en;
+ prof->tx_ppp = pfc->pfc_en;
+
+ err = mlx4_SET_PORT_general(mdev->dev, priv->port,
+ priv->eff_mtu + ETH_FCS_LEN,
+ prof->tx_pause,
+ prof->tx_ppp,
+ prof->rx_pause,
+ prof->rx_ppp);
+ if (err) {
+ en_err(priv, "Failed setting pause params\n");
+ } else {
+ /*
+ mlx4_en_update_pfc_stats_bitmap(mdev->dev, &priv->stats_bitmap,
+ prof->rx_ppp, prof->rx_pause,
+ prof->tx_ppp, prof->tx_pause);
+ XXX */
+ }
+
+ return err;
+}
+
+#ifdef KMOD_DISABLED
+static u8 mlx4_en_dcbnl_getdcbx(struct net_device *dev)
+{
+ return DCB_CAP_DCBX_HOST | DCB_CAP_DCBX_VER_IEEE;
+}
+
+static u8 mlx4_en_dcbnl_setdcbx(struct net_device *dev, u8 mode)
+{
+ if ((mode & DCB_CAP_DCBX_LLD_MANAGED) ||
+ (mode & DCB_CAP_DCBX_VER_CEE) ||
+ !(mode & DCB_CAP_DCBX_VER_IEEE) ||
+ !(mode & DCB_CAP_DCBX_HOST))
+ return 1;
+
+ return 0;
+}
+#endif
+
+#define MLX4_RATELIMIT_UNITS_IN_KB 100000 /* rate-limit HW unit in Kbps */
+#ifndef CONFIG_SYSFS_MAXRATE
+static
+#endif
+int mlx4_en_dcbnl_ieee_getmaxrate(struct rte_eth_dev *rte_dev,
+ struct ieee_maxrate *maxrate)
+{
+ struct mlx4_en_priv *priv = rte_dev->data->dev_private;
+ int i;
+
+ for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++)
+ maxrate->tc_maxrate[i] =
+ priv->maxrate[i] * MLX4_RATELIMIT_UNITS_IN_KB;
+
+ return 0;
+}
+
+#ifndef CONFIG_SYSFS_MAXRATE
+static
+#endif
+int mlx4_en_dcbnl_ieee_setmaxrate(struct rte_eth_dev *rte_dev,
+ struct ieee_maxrate *maxrate)
+{
+ struct mlx4_en_priv *priv = rte_dev->data->dev_private;
+ u16 tmp[IEEE_8021QAZ_MAX_TCS];
+ int i, err;
+
+ for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) {
+ /* Convert from Kbps into HW units, rounding result up.
+ * Setting to 0 means unlimited BW.
+ */
+ tmp[i] = div_u64(maxrate->tc_maxrate[i] +
+ MLX4_RATELIMIT_UNITS_IN_KB - 1,
+ MLX4_RATELIMIT_UNITS_IN_KB);
+ }
+
+ err = mlx4_en_config_port_scheduler(priv, NULL, tmp);
+ if (err)
+ return err;
+
+ memcpy(priv->maxrate, tmp, sizeof(priv->maxrate));
+
+ return 0;
+}
+
+#define RPG_ENABLE_BIT 31
+#define CN_TAG_BIT 30
+
+#ifndef CONFIG_SYSFS_QCN
+static
+#endif
+int mlx4_en_dcbnl_ieee_getqcn(struct rte_eth_dev *rte_dev,
+ struct ieee_qcn *qcn)
+{
+ struct mlx4_en_priv *priv = rte_dev->data->dev_private;
+ struct mlx4_congestion_control_mb_prio_802_1_qau_params *hw_qcn;
+ struct mlx4_cmd_mailbox *mailbox_out = NULL;
+ u64 mailbox_in_dma = 0;
+ u32 inmod = 0;
+ int i, err;
+
+ if (!(priv->mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_QCN))
+ return -EOPNOTSUPP;
+
+ mailbox_out = mlx4_alloc_cmd_mailbox(priv->mdev->dev);
+ if (IS_ERR(mailbox_out))
+ return -ENOMEM;
+ hw_qcn =
+ (struct mlx4_congestion_control_mb_prio_802_1_qau_params *)
+ mailbox_out->buf;
+
+ for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) {
+ inmod = priv->port | ((1<<i) << 8) |
+ (MLX4_CTRL_ALGO_802_1_QAU_REACTION_POINT << 16);
+ err = mlx4_cmd_box(priv->mdev->dev, mailbox_in_dma,
+ mailbox_out->dma,
+ inmod, MLX4_CONGESTION_CONTROL_GET_PARAMS,
+ MLX4_CMD_CONGESTION_CTRL_OPCODE,
+ MLX4_CMD_TIME_CLASS_C,
+ MLX4_CMD_NATIVE);
+ if (err) {
+ mlx4_free_cmd_mailbox(priv->mdev->dev, mailbox_out);
+ return err;
+ }
+
+ qcn->rpg_enable[i] =
+ be32_to_cpu(hw_qcn->extended_enable) >> RPG_ENABLE_BIT;
+ qcn->rppp_max_rps[i] =
+ be32_to_cpu(hw_qcn->rppp_max_rps);
+ qcn->rpg_time_reset[i] =
+ be32_to_cpu(hw_qcn->rpg_time_reset);
+ qcn->rpg_byte_reset[i] =
+ be32_to_cpu(hw_qcn->rpg_byte_reset);
+ qcn->rpg_threshold[i] =
+ be32_to_cpu(hw_qcn->rpg_threshold);
+ qcn->rpg_max_rate[i] =
+ be32_to_cpu(hw_qcn->rpg_max_rate);
+ qcn->rpg_ai_rate[i] =
+ be32_to_cpu(hw_qcn->rpg_ai_rate);
+ qcn->rpg_hai_rate[i] =
+ be32_to_cpu(hw_qcn->rpg_hai_rate);
+ qcn->rpg_gd[i] =
+ be32_to_cpu(hw_qcn->rpg_gd);
+ qcn->rpg_min_dec_fac[i] =
+ be32_to_cpu(hw_qcn->rpg_min_dec_fac);
+ qcn->rpg_min_rate[i] =
+ be32_to_cpu(hw_qcn->rpg_min_rate);
+ qcn->cndd_state_machine[i] =
+ priv->cndd_state[i];
+ }
+ mlx4_free_cmd_mailbox(priv->mdev->dev, mailbox_out);
+ return 0;
+}
+
+#ifndef CONFIG_SYSFS_QCN
+static
+#endif
+int mlx4_en_dcbnl_ieee_setqcn(struct rte_eth_dev *rte_dev,
+ struct ieee_qcn *qcn)
+{
+ struct mlx4_en_priv *priv = rte_dev->data->dev_private;
+ struct mlx4_congestion_control_mb_prio_802_1_qau_params *hw_qcn;
+ struct mlx4_cmd_mailbox *mailbox_in = NULL;
+ u64 mailbox_in_dma = 0;
+ u32 inmod = 0;
+ int i, err;
+#define MODIFY_ENABLE_HIGH_MASK 0xc0000000
+#define MODIFY_ENABLE_LOW_MASK 0xffc00000
+
+ if (!(priv->mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_QCN))
+ return -EOPNOTSUPP;
+
+ mailbox_in = mlx4_alloc_cmd_mailbox(priv->mdev->dev);
+ if (IS_ERR(mailbox_in))
+ return -ENOMEM;
+
+ mailbox_in_dma = mailbox_in->dma;
+ hw_qcn =
+ (struct mlx4_congestion_control_mb_prio_802_1_qau_params *)mailbox_in->buf;
+ for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) {
+ inmod = priv->port | ((1<<i) << 8) |
+ (MLX4_CTRL_ALGO_802_1_QAU_REACTION_POINT << 16);
+
+ /* Before updating QCN parameter,
+ * need to set its modify enable bit to 1
+ */
+
+ hw_qcn->modify_enable_high = cpu_to_be32(
+ MODIFY_ENABLE_HIGH_MASK);
+ hw_qcn->modify_enable_low = cpu_to_be32(MODIFY_ENABLE_LOW_MASK);
+
+ hw_qcn->extended_enable = cpu_to_be32(qcn->rpg_enable[i] << RPG_ENABLE_BIT);
+ hw_qcn->rppp_max_rps = cpu_to_be32(qcn->rppp_max_rps[i]);
+ hw_qcn->rpg_time_reset = cpu_to_be32(qcn->rpg_time_reset[i]);
+ hw_qcn->rpg_byte_reset = cpu_to_be32(qcn->rpg_byte_reset[i]);
+ hw_qcn->rpg_threshold = cpu_to_be32(qcn->rpg_threshold[i]);
+ hw_qcn->rpg_max_rate = cpu_to_be32(qcn->rpg_max_rate[i]);
+ hw_qcn->rpg_ai_rate = cpu_to_be32(qcn->rpg_ai_rate[i]);
+ hw_qcn->rpg_hai_rate = cpu_to_be32(qcn->rpg_hai_rate[i]);
+ hw_qcn->rpg_gd = cpu_to_be32(qcn->rpg_gd[i]);
+ hw_qcn->rpg_min_dec_fac = cpu_to_be32(qcn->rpg_min_dec_fac[i]);
+ hw_qcn->rpg_min_rate = cpu_to_be32(qcn->rpg_min_rate[i]);
+ priv->cndd_state[i] = qcn->cndd_state_machine[i];
+ if (qcn->cndd_state_machine[i] == DCB_CNDD_INTERIOR_READY)
+ hw_qcn->extended_enable |= cpu_to_be32(1 << CN_TAG_BIT);
+
+ err = mlx4_cmd(priv->mdev->dev, mailbox_in_dma, inmod,
+ MLX4_CONGESTION_CONTROL_SET_PARAMS,
+ MLX4_CMD_CONGESTION_CTRL_OPCODE,
+ MLX4_CMD_TIME_CLASS_C,
+ MLX4_CMD_NATIVE);
+ if (err) {
+ mlx4_free_cmd_mailbox(priv->mdev->dev, mailbox_in);
+ return err;
+ }
+ }
+ mlx4_free_cmd_mailbox(priv->mdev->dev, mailbox_in);
+ return 0;
+}
+
+#ifndef CONFIG_SYSFS_QCN
+static
+#endif
+int mlx4_en_dcbnl_ieee_getqcnstats(struct rte_eth_dev *rte_dev,
+ struct ieee_qcn_stats *qcn_stats)
+{
+ struct mlx4_en_priv *priv = rte_dev->data->dev_private;
+ struct mlx4_congestion_control_mb_prio_802_1_qau_statistics *hw_qcn_stats;
+ struct mlx4_cmd_mailbox *mailbox_out = NULL;
+ u64 mailbox_in_dma = 0;
+ u32 inmod = 0;
+ int i, err;
+
+ if (!(priv->mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_QCN))
+ return -EOPNOTSUPP;
+
+ mailbox_out = mlx4_alloc_cmd_mailbox(priv->mdev->dev);
+ if (IS_ERR(mailbox_out))
+ return -ENOMEM;
+
+ hw_qcn_stats =
+ (struct mlx4_congestion_control_mb_prio_802_1_qau_statistics *)
+ mailbox_out->buf;
+
+ for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) {
+ inmod = priv->port | ((1<<i) << 8) |
+ (MLX4_CTRL_ALGO_802_1_QAU_REACTION_POINT << 16);
+ err = mlx4_cmd_box(priv->mdev->dev, mailbox_in_dma,
+ mailbox_out->dma, inmod,
+ MLX4_CONGESTION_CONTROL_GET_STATISTICS,
+ MLX4_CMD_CONGESTION_CTRL_OPCODE,
+ MLX4_CMD_TIME_CLASS_C,
+ MLX4_CMD_NATIVE);
+ if (err) {
+ mlx4_free_cmd_mailbox(priv->mdev->dev, mailbox_out);
+ return err;
+ }
+ qcn_stats->rppp_rp_centiseconds[i] =
+ be64_to_cpu(hw_qcn_stats->rppp_rp_centiseconds);
+ qcn_stats->rppp_created_rps[i] =
+ be32_to_cpu(hw_qcn_stats->rppp_created_rps);
+ }
+ mlx4_free_cmd_mailbox(priv->mdev->dev, mailbox_out);
+ return 0;
+}
+
+#ifdef KMOD_DISABLED
+const struct dcbnl_rtnl_ops mlx4_en_dcbnl_ops = {
+ .ieee_getets = mlx4_en_dcbnl_ieee_getets,
+ .ieee_setets = mlx4_en_dcbnl_ieee_setets,
+#ifdef HAVE_IEEE_GET_SET_MAXRATE
+ .ieee_getmaxrate = mlx4_en_dcbnl_ieee_getmaxrate,
+ .ieee_setmaxrate = mlx4_en_dcbnl_ieee_setmaxrate,
+#endif
+ .ieee_getpfc = mlx4_en_dcbnl_ieee_getpfc,
+ .ieee_setpfc = mlx4_en_dcbnl_ieee_setpfc,
+
+ .getdcbx = mlx4_en_dcbnl_getdcbx,
+ .setdcbx = mlx4_en_dcbnl_setdcbx,
+#ifdef HAVE_IEEE_GETQCN
+ .ieee_getqcn = mlx4_en_dcbnl_ieee_getqcn,
+ .ieee_setqcn = mlx4_en_dcbnl_ieee_setqcn,
+ .ieee_getqcnstats = mlx4_en_dcbnl_ieee_getqcnstats,
+#endif
+};
+
+const struct dcbnl_rtnl_ops mlx4_en_dcbnl_pfc_ops = {
+ .ieee_getpfc = mlx4_en_dcbnl_ieee_getpfc,
+ .ieee_setpfc = mlx4_en_dcbnl_ieee_setpfc,
+
+ .getdcbx = mlx4_en_dcbnl_getdcbx,
+ .setdcbx = mlx4_en_dcbnl_setdcbx,
+#ifdef HAVE_IEEE_GETQCN
+ .ieee_getqcn = mlx4_en_dcbnl_ieee_getqcn,
+ .ieee_setqcn = mlx4_en_dcbnl_ieee_setqcn,
+ .ieee_getqcnstats = mlx4_en_dcbnl_ieee_getqcnstats,
+#endif
+};
+#endif
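
Note on the rate-limit conversion in mlx4_en_dcbnl_ieee_setmaxrate() and
mlx4_en_dcbnl_ieee_getmaxrate() above: the set path rounds Kbps up to
hardware units of MLX4_RATELIMIT_UNITS_IN_KB and stores the result in a u16,
and the get path multiplies back. The standalone sketch below only
illustrates that arithmetic. The helper names (kbps_to_hw_units,
hw_units_to_kbps, hw_units_fit_u16, ratelimit_selfcheck) are hypothetical
and not part of the patch, and plain division stands in for div_u64().

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors MLX4_RATELIMIT_UNITS_IN_KB in the patch: one HW unit = 100000 Kbps. */
#define RATELIMIT_UNITS_IN_KB 100000ULL

/* Hypothetical helper (not in the patch): Kbps -> HW units, rounded up.
 * 0 stays 0, which the driver comment describes as unlimited BW. */
static uint64_t kbps_to_hw_units(uint64_t kbps)
{
	return (kbps + RATELIMIT_UNITS_IN_KB - 1) / RATELIMIT_UNITS_IN_KB;
}

/* Hypothetical helper: HW units -> Kbps, the direction getmaxrate reports. */
static uint64_t hw_units_to_kbps(uint16_t units)
{
	return (uint64_t)units * RATELIMIT_UNITS_IN_KB;
}

/* Hypothetical range check: the driver narrows the result to a u16,
 * so anything above UINT16_MAX units would be truncated. */
static int hw_units_fit_u16(uint64_t kbps)
{
	return kbps_to_hw_units(kbps) <= UINT16_MAX;
}

/* Small self-check of the rounding behaviour. */
void ratelimit_selfcheck(void)
{
	assert(kbps_to_hw_units(0) == 0);       /* unlimited stays unlimited */
	assert(kbps_to_hw_units(1) == 1);       /* any non-zero rate rounds up */
	assert(kbps_to_hw_units(100000) == 1);  /* exact multiple is unchanged */
	assert(kbps_to_hw_units(100001) == 2);  /* one over rounds to next unit */
	assert(hw_units_to_kbps(2) == 200000);  /* read-back is the rounded rate */
	assert(hw_units_fit_u16(6553500000ULL));
}
```

A consequence of the round-up is that a rate read back through the get path
can be higher than the rate that was requested, by up to one hardware unit
minus 1 Kbps. Whether the set path should reject requests that do not fit in
a u16 rather than narrowing them is left open here; the check above is only
one possible way to detect that case.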
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/en_ethtool.c b/drivers/net/mlnx_uio/mlnx/mlx4/en_ethtool.c
new file mode 100644
index 0000000..0612b26
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/en_ethtool.c
@@ -0,0 +1,2582 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2007 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+
+#include "mlx4_en.h"
+#include "en_port.h"
+
+#define EN_ETHTOOL_QP_ATTACH (1ull << 63)
+#define EN_ETHTOOL_SHORT_MASK cpu_to_be16(0xffff)
+#define EN_ETHTOOL_WORD_MASK cpu_to_be32(0xffffffff)
+
+union mlx4_ethtool_flow_union {
+ struct ethtool_tcpip4_spec tcp_ip4_spec;
+ struct ethtool_tcpip4_spec udp_ip4_spec;
+ struct ethtool_tcpip4_spec sctp_ip4_spec;
+ struct ethtool_ah_espip4_spec ah_ip4_spec;
+ struct ethtool_ah_espip4_spec esp_ip4_spec;
+ struct ethtool_usrip4_spec usr_ip4_spec;
+ struct ethhdr ether_spec;
+ __u8 hdata[52];
+};
+
+struct mlx4_ethtool_flow_ext {
+ __u8 padding[2];
+ unsigned char h_dest[ETH_ALEN];
+ __be16 vlan_etype;
+ __be16 vlan_tci;
+ __be32 data[2];
+};
+
+struct mlx4_ethtool_rx_flow_spec {
+ __u32 flow_type;
+ union mlx4_ethtool_flow_union h_u;
+ struct mlx4_ethtool_flow_ext h_ext;
+ union mlx4_ethtool_flow_union m_u;
+ struct mlx4_ethtool_flow_ext m_ext;
+ __u64 ring_cookie;
+ __u32 location;
+};
+
+struct mlx4_ethtool_rxnfc {
+ __u32 cmd;
+ __u32 flow_type;
+ __u64 data;
+ struct mlx4_ethtool_rx_flow_spec fs;
+ __u32 rule_cnt;
+ __u32 rule_locs[0];
+};
+
+#ifndef FLOW_MAC_EXT
+#define FLOW_MAC_EXT 0x40000000
+#endif
+
+#ifdef KMOD_DISABLED
+
+static int mlx4_en_change_inline_scatter_thold(struct net_device *dev,
+ int thold)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int val = 0;
+ int stride = 0;
+ int ret = 0;
+ int port_up = 0;
+
+ /* can enable inline scatter if port is up and MTU is 1500 */
+ if (priv->num_frags != 1)
+ return -EINVAL;
+
+ if (thold >= MIN_INLINE_SCATTER) {
+ stride = roundup_pow_of_two(thold);
+ val = stride;
+ }
+
+ /* disable inline scatter and reset stride */
+ if (stride == 0) {
+ stride = roundup_pow_of_two(
+ sizeof(struct mlx4_en_rx_desc) +
+ DS_SIZE * MLX4_EN_MAX_RX_FRAGS);
+ val = 0;
+ }
+
+ if (stride > MAX_INLINE_SCATTER) {
+ ret = -EINVAL;
+ } else if (stride > MAX_DESC_SIZE && stride < dev->mtu) {
+ /* stride cannot be larger than MAX_DESC_SIZE,
+ * unless we ensure that all packets will
+ * be inline scattered - thold >= MTU
+ */
+ ret = -EINVAL;
+ } else {
+ /* inline scatter thold is good */
+ priv->prof->inline_scatter_thold = val;
+
+ mutex_lock(&mdev->state_lock);
+ if (priv->port_up) {
+ port_up = 1;
+ mlx4_en_stop_port(dev, 1);
+ }
+
+ mlx4_en_free_resources(priv);
+
+ priv->stride = stride;
+
+ ret = mlx4_en_alloc_resources(priv);
+ if (ret) {
+ en_err(priv, "Failed reallocating port resources\n");
+ goto out;
+ }
+
+ if (port_up) {
+ ret = mlx4_en_start_port(dev);
+ if (ret)
+ en_err(priv, "Failed starting port\n");
+ }
+
+out:
+ mutex_unlock(&mdev->state_lock);
+ }
+
+ return ret;
+}
+
+static int mlx4_en_moderation_update(struct mlx4_en_priv *priv)
+{
+ int i;
+ int err = 0;
+
+ for (i = 0; i < priv->tx_ring_num; i++) {
+ priv->tx_cq[i]->moder_cnt = priv->tx_frames;
+ priv->tx_cq[i]->moder_time = priv->tx_usecs;
+ if (priv->port_up) {
+ err = mlx4_en_set_cq_moder(priv, priv->tx_cq[i]);
+ if (err)
+ return err;
+ }
+ }
+
+ if (priv->adaptive_rx_coal)
+ return 0;
+
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ priv->rx_cq[i]->moder_cnt = priv->rx_frames;
+ priv->rx_cq[i]->moder_time = priv->rx_usecs;
+ priv->last_moder_time[i] = MLX4_EN_AUTO_CONF;
+ if (priv->port_up) {
+ err = mlx4_en_set_cq_moder(priv, priv->rx_cq[i]);
+ if (err)
+ return err;
+ }
+ }
+
+ return err;
+}
+
+static void
+mlx4_en_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *drvinfo)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+
+ strlcpy(drvinfo->driver, DRV_NAME, sizeof(drvinfo->driver));
+ strlcpy(drvinfo->version, DRV_VERSION " (" DRV_RELDATE ")",
+ sizeof(drvinfo->version));
+ snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
+ "%d.%d.%d",
+ (u16) (mdev->dev->caps.fw_ver >> 32),
+ (u16) ((mdev->dev->caps.fw_ver >> 16) & 0xffff),
+ (u16) (mdev->dev->caps.fw_ver & 0xffff));
+ strlcpy(drvinfo->bus_info, pci_name(mdev->dev->persist->pdev),
+ sizeof(drvinfo->bus_info));
+ drvinfo->n_stats = 0;
+ drvinfo->regdump_len = 0;
+ drvinfo->eedump_len = 0;
+}
+
+#if (!defined(HAVE_NETDEV_HW_FEATURES) && !defined(HAVE_NET_DEVICE_OPS_EXT))
+static u32 mlx4_en_get_tso(struct net_device *dev)
+{
+ return (dev->features & NETIF_F_TSO) != 0;
+}
+
+static int mlx4_en_set_tso(struct net_device *dev, u32 data)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+
+ if (data) {
+ if (!priv->mdev->LSO_support)
+ return -EPERM;
+ dev->features |= (NETIF_F_TSO | NETIF_F_TSO6);
+#ifndef HAVE_VLAN_GRO_RECEIVE
+ dev->vlan_features |= (NETIF_F_TSO | NETIF_F_TSO6);
+#else
+ if (priv->vlgrp) {
+ int i;
+ struct net_device *vdev;
+ for (i = 0; i < VLAN_N_VID; i++) {
+ vdev = vlan_group_get_device(priv->vlgrp, i);
+ if (vdev) {
+ vdev->features |= (NETIF_F_TSO | NETIF_F_TSO6);
+ vlan_group_set_device(priv->vlgrp, i, vdev);
+ }
+ }
+ }
+#endif
+ } else {
+ dev->features &= ~(NETIF_F_TSO | NETIF_F_TSO6);
+#ifndef HAVE_VLAN_GRO_RECEIVE
+ dev->vlan_features &= ~(NETIF_F_TSO | NETIF_F_TSO6);
+#else
+ if (priv->vlgrp) {
+ int i;
+ struct net_device *vdev;
+ for (i = 0; i < VLAN_N_VID; i++) {
+ vdev = vlan_group_get_device(priv->vlgrp, i);
+ if (vdev) {
+ vdev->features &= ~(NETIF_F_TSO | NETIF_F_TSO6);
+ vlan_group_set_device(priv->vlgrp, i, vdev);
+ }
+ }
+ }
+#endif
+ }
+ return 0;
+}
+
+static u32 mlx4_en_get_rx_csum(struct net_device *dev)
+{
+ return dev->features & NETIF_F_RXCSUM;
+}
+
+static int mlx4_en_set_rx_csum(struct net_device *dev, u32 data)
+{
+ if (!data) {
+ dev->features &= ~NETIF_F_RXCSUM;
+ return 0;
+ }
+ dev->features |= NETIF_F_RXCSUM;
+ return 0;
+}
+#endif
+
+static const char mlx4_en_priv_flags[][ETH_GSTRING_LEN] = {
+ "blueflame",
+ "mlx4_flow_steering_ethernet_l2",
+ "mlx4_flow_steering_ipv4",
+ "mlx4_flow_steering_tcp",
+ "mlx4_flow_steering_udp",
+ "qcn_disable_32_14_4_e",
+ "rx-copy",
+ "rx-fcs",
+ "rx-all",
+};
+
+static const char main_strings[][ETH_GSTRING_LEN] = {
+ /* main statistics */
+ "rx_packets", "tx_packets", "rx_bytes", "tx_bytes", "rx_errors",
+ "tx_errors", "rx_dropped", "tx_dropped", "multicast", "collisions",
+ "rx_length_errors", "rx_over_errors", "rx_crc_errors",
+ "rx_frame_errors", "rx_fifo_errors", "rx_missed_errors",
+ "tx_aborted_errors", "tx_carrier_errors", "tx_fifo_errors",
+ "tx_heartbeat_errors", "tx_window_errors",
+
+ /* port statistics */
+#ifdef CONFIG_COMPAT_LRO_ENABLED
+ "rx_lro_aggregated", "rx_lro_flushed", "rx_lro_no_desc",
+#endif
+ "tso_packets",
+ "xmit_more",
+ "queue_stopped", "wake_queue", "tx_timeout", "rx_alloc_failed",
+ "rx_csum_good", "rx_csum_none", "rx_csum_complete", "tx_chksum_offload",
+
+ /* priority flow control statistics rx */
+ "rx_pause_prio_0", "rx_pause_duration_prio_0",
+ "rx_pause_transition_prio_0",
+ "rx_pause_prio_1", "rx_pause_duration_prio_1",
+ "rx_pause_transition_prio_1",
+ "rx_pause_prio_2", "rx_pause_duration_prio_2",
+ "rx_pause_transition_prio_2",
+ "rx_pause_prio_3", "rx_pause_duration_prio_3",
+ "rx_pause_transition_prio_3",
+ "rx_pause_prio_4", "rx_pause_duration_prio_4",
+ "rx_pause_transition_prio_4",
+ "rx_pause_prio_5", "rx_pause_duration_prio_5",
+ "rx_pause_transition_prio_5",
+ "rx_pause_prio_6", "rx_pause_duration_prio_6",
+ "rx_pause_transition_prio_6",
+ "rx_pause_prio_7", "rx_pause_duration_prio_7",
+ "rx_pause_transition_prio_7",
+
+ /* flow control statistics rx */
+ "rx_pause", "rx_pause_duration", "rx_pause_transition",
+
+ /* priority flow control statistics tx */
+ "tx_pause_prio_0", "tx_pause_duration_prio_0",
+ "tx_pause_transition_prio_0",
+ "tx_pause_prio_1", "tx_pause_duration_prio_1",
+ "tx_pause_transition_prio_1",
+ "tx_pause_prio_2", "tx_pause_duration_prio_2",
+ "tx_pause_transition_prio_2",
+ "tx_pause_prio_3", "tx_pause_duration_prio_3",
+ "tx_pause_transition_prio_3",
+ "tx_pause_prio_4", "tx_pause_duration_prio_4",
+ "tx_pause_transition_prio_4",
+ "tx_pause_prio_5", "tx_pause_duration_prio_5",
+ "tx_pause_transition_prio_5",
+ "tx_pause_prio_6", "tx_pause_duration_prio_6",
+ "tx_pause_transition_prio_6",
+ "tx_pause_prio_7", "tx_pause_duration_prio_7",
+ "tx_pause_transition_prio_7",
+
+ /* flow control statistics tx */
+ "tx_pause", "tx_pause_duration", "tx_pause_transition",
+
+ /* VF statistics */
+ "rx_multicast_packets",
+ "rx_broadcast_packets",
+ "rx_filtered",
+ "tx_multicast_packets",
+ "tx_broadcast_packets",
+ "tx_dropped",
+
+ /* VPort statistics */
+ "vport_rx_unicast_packets",
+ "vport_rx_unicast_bytes",
+ "vport_rx_multicast_packets",
+ "vport_rx_multicast_bytes",
+ "vport_rx_broadcast_packets",
+ "vport_rx_broadcast_bytes",
+ "vport_rx_dropped",
+ "vport_rx_filtered",
+ "vport_tx_unicast_packets",
+ "vport_tx_unicast_bytes",
+ "vport_tx_multicast_packets",
+ "vport_tx_multicast_bytes",
+ "vport_tx_broadcast_packets",
+ "vport_tx_broadcast_bytes",
+ "vport_tx_dropped",
+
+ /* packet statistics */
+ "rx_multicast_packets",
+ "rx_broadcast_packets",
+ "rx_jabbers",
+ "rx_in_range_length_error",
+ "rx_out_range_length_error",
+ "tx_multicast_packets",
+ "tx_broadcast_packets",
+ "rx_prio_0_packets", "rx_prio_0_bytes",
+ "rx_prio_1_packets", "rx_prio_1_bytes",
+ "rx_prio_2_packets", "rx_prio_2_bytes",
+ "rx_prio_3_packets", "rx_prio_3_bytes",
+ "rx_prio_4_packets", "rx_prio_4_bytes",
+ "rx_prio_5_packets", "rx_prio_5_bytes",
+ "rx_prio_6_packets", "rx_prio_6_bytes",
+ "rx_prio_7_packets", "rx_prio_7_bytes",
+ "rx_novlan_packets", "rx_novlan_bytes",
+ "tx_prio_0_packets", "tx_prio_0_bytes",
+ "tx_prio_1_packets", "tx_prio_1_bytes",
+ "tx_prio_2_packets", "tx_prio_2_bytes",
+ "tx_prio_3_packets", "tx_prio_3_bytes",
+ "tx_prio_4_packets", "tx_prio_4_bytes",
+ "tx_prio_5_packets", "tx_prio_5_bytes",
+ "tx_prio_6_packets", "tx_prio_6_bytes",
+ "tx_prio_7_packets", "tx_prio_7_bytes",
+ "tx_novlan_packets", "tx_novlan_bytes",
+
+};
+
+static const char mlx4_en_test_names[][ETH_GSTRING_LEN] = {
+ "Interrupt Test",
+ "Link Test",
+ "Speed Test",
+ "Register Test",
+ "Loopback Test",
+};
+
+static u32 mlx4_en_get_msglevel(struct net_device *dev)
+{
+ return ((struct mlx4_en_priv *) netdev_priv(dev))->msg_enable;
+}
+
+static void mlx4_en_set_msglevel(struct net_device *dev, u32 val)
+{
+ ((struct mlx4_en_priv *) netdev_priv(dev))->msg_enable = val;
+}
+
+static void mlx4_en_get_wol(struct net_device *netdev,
+ struct ethtool_wolinfo *wol)
+{
+ struct mlx4_en_priv *priv = netdev_priv(netdev);
+ int err = 0;
+ u64 config = 0;
+ u64 mask;
+
+ if ((priv->port < 1) || (priv->port > 2)) {
+ en_err(priv, "Failed to get WoL information\n");
+ return;
+ }
+
+ mask = (priv->port == 1) ? MLX4_DEV_CAP_FLAG_WOL_PORT1 :
+ MLX4_DEV_CAP_FLAG_WOL_PORT2;
+
+ if (!(priv->mdev->dev->caps.flags & mask)) {
+ wol->supported = 0;
+ wol->wolopts = 0;
+ return;
+ }
+
+ err = mlx4_wol_read(priv->mdev->dev, &config, priv->port);
+ if (err) {
+ en_err(priv, "Failed to get WoL information\n");
+ return;
+ }
+
+ if (config & MLX4_EN_WOL_MAGIC)
+ wol->supported = WAKE_MAGIC;
+ else
+ wol->supported = 0;
+
+ if (config & MLX4_EN_WOL_ENABLED)
+ wol->wolopts = WAKE_MAGIC;
+ else
+ wol->wolopts = 0;
+}
+
+static int mlx4_en_set_wol(struct net_device *netdev,
+ struct ethtool_wolinfo *wol)
+{
+ struct mlx4_en_priv *priv = netdev_priv(netdev);
+ u64 config = 0;
+ int err = 0;
+ u64 mask;
+
+ if ((priv->port < 1) || (priv->port > 2))
+ return -EOPNOTSUPP;
+
+ mask = (priv->port == 1) ? MLX4_DEV_CAP_FLAG_WOL_PORT1 :
+ MLX4_DEV_CAP_FLAG_WOL_PORT2;
+
+ if (!(priv->mdev->dev->caps.flags & mask))
+ return -EOPNOTSUPP;
+
+ if (wol->supported & ~WAKE_MAGIC)
+ return -EINVAL;
+
+ err = mlx4_wol_read(priv->mdev->dev, &config, priv->port);
+ if (err) {
+ en_err(priv, "Failed to get WoL info, unable to modify\n");
+ return err;
+ }
+
+ if (wol->wolopts & WAKE_MAGIC) {
+ config |= MLX4_EN_WOL_DO_MODIFY | MLX4_EN_WOL_ENABLED |
+ MLX4_EN_WOL_MAGIC;
+ } else {
+ config &= ~(MLX4_EN_WOL_ENABLED | MLX4_EN_WOL_MAGIC);
+ config |= MLX4_EN_WOL_DO_MODIFY;
+ }
+
+ err = mlx4_wol_write(priv->mdev->dev, config, priv->port);
+ if (err)
+ en_err(priv, "Failed to set WoL information\n");
+
+ return err;
+}
+
+struct bitmap_iterator {
+ unsigned long *stats_bitmap;
+ unsigned int count;
+ unsigned int iterator;
+ bool advance_array; /* if set, test bits in stats_bitmap; else report every stat */
+};
+
+static inline void bitmap_iterator_init(struct bitmap_iterator *h,
+ unsigned long *stats_bitmap,
+ int count)
+{
+ h->iterator = 0;
+ h->advance_array = !bitmap_empty(stats_bitmap, count);
+ h->count = h->advance_array ? bitmap_weight(stats_bitmap, count)
+ : count;
+ h->stats_bitmap = stats_bitmap;
+}
+
+static inline int bitmap_iterator_test(struct bitmap_iterator *h)
+{
+ return !h->advance_array ? 1 : test_bit(h->iterator, h->stats_bitmap);
+}
+
+static inline int bitmap_iterator_inc(struct bitmap_iterator *h)
+{
+ return h->iterator++;
+}
+
+static inline unsigned int
+bitmap_iterator_count(struct bitmap_iterator *h)
+{
+ return h->count;
+}
+
+static int mlx4_en_get_sset_count(struct net_device *dev, int sset)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct bitmap_iterator it;
+
+ bitmap_iterator_init(&it, priv->stats_bitmap.bitmap, NUM_ALL_STATS);
+
+ switch (sset) {
+ case ETH_SS_STATS:
+ return bitmap_iterator_count(&it) +
+ (priv->tx_ring_num * 2) +
+#ifdef CONFIG_NET_RX_BUSY_POLL
+ (priv->rx_ring_num * 5);
+#else
+ (priv->rx_ring_num * 2);
+#endif
+ case ETH_SS_TEST:
+ return MLX4_EN_NUM_SELF_TEST - !(priv->mdev->dev->caps.flags
+ & MLX4_DEV_CAP_FLAG_UC_LOOPBACK) * 2;
+ case ETH_SS_PRIV_FLAGS:
+ return ARRAY_SIZE(mlx4_en_priv_flags);
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
+#ifdef CONFIG_COMPAT_LRO_ENABLED
+static void mlx4_en_update_lro_stats(struct mlx4_en_priv *priv)
+{
+ int i;
+
+ priv->port_stats.lro_aggregated = 0;
+ priv->port_stats.lro_flushed = 0;
+ priv->port_stats.lro_no_desc = 0;
+
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ priv->port_stats.lro_aggregated += priv->rx_ring[i]->lro.lro_mgr.stats.aggregated;
+ priv->port_stats.lro_flushed += priv->rx_ring[i]->lro.lro_mgr.stats.flushed;
+ priv->port_stats.lro_no_desc += priv->rx_ring[i]->lro.lro_mgr.stats.no_desc;
+ }
+}
+#endif
+
+static void mlx4_en_get_ethtool_stats(struct net_device *dev,
+ struct ethtool_stats *stats, uint64_t *data)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ int index = 0;
+ int i;
+ struct bitmap_iterator it;
+
+ bitmap_iterator_init(&it, priv->stats_bitmap.bitmap, NUM_ALL_STATS);
+
+ spin_lock_bh(&priv->stats_lock);
+
+#ifdef CONFIG_COMPAT_LRO_ENABLED
+ mlx4_en_update_lro_stats(priv);
+#endif
+
+ for (i = 0; i < NUM_MAIN_STATS; i++, bitmap_iterator_inc(&it))
+ if (bitmap_iterator_test(&it))
+ data[index++] = ((unsigned long *)&priv->stats)[i];
+
+ for (i = 0; i < NUM_PORT_STATS; i++, bitmap_iterator_inc(&it))
+ if (bitmap_iterator_test(&it))
+ data[index++] = ((unsigned long *)&priv->port_stats)[i];
+
+ for (i = 0; i < NUM_FLOW_PRIORITY_STATS_RX;
+ i++, bitmap_iterator_inc(&it))
+ if (bitmap_iterator_test(&it))
+ data[index++] =
+ ((u64 *)&priv->rx_priority_flowstats)[i];
+
+ for (i = 0; i < NUM_FLOW_STATS_RX; i++, bitmap_iterator_inc(&it))
+ if (bitmap_iterator_test(&it))
+ data[index++] = ((u64 *)&priv->rx_flowstats)[i];
+
+ for (i = 0; i < NUM_FLOW_PRIORITY_STATS_TX;
+ i++, bitmap_iterator_inc(&it))
+ if (bitmap_iterator_test(&it))
+ data[index++] =
+ ((u64 *)&priv->tx_priority_flowstats)[i];
+
+ for (i = 0; i < NUM_FLOW_STATS_TX; i++, bitmap_iterator_inc(&it))
+ if (bitmap_iterator_test(&it))
+ data[index++] = ((u64 *)&priv->tx_flowstats)[i];
+
+ for (i = 0; i < NUM_VF_STATS; i++,
+ bitmap_iterator_inc(&it))
+ if (bitmap_iterator_test(&it))
+ data[index++] = ((unsigned long *)&priv->vf_stats)[i];
+
+ for (i = 0; i < NUM_VPORT_STATS; i++,
+ bitmap_iterator_inc(&it))
+ if (bitmap_iterator_test(&it))
+ data[index++] = ((unsigned long *)&priv->vport_stats)[i];
+
+ for (i = 0; i < NUM_PKT_STATS; i++, bitmap_iterator_inc(&it))
+ if (bitmap_iterator_test(&it))
+ data[index++] = ((unsigned long *)&priv->pkstats)[i];
+
+ for (i = 0; i < priv->tx_ring_num; i++) {
+ data[index++] = priv->tx_ring[i]->packets;
+ data[index++] = priv->tx_ring[i]->bytes;
+ }
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ data[index++] = priv->rx_ring[i]->packets;
+ data[index++] = priv->rx_ring[i]->bytes;
+#ifdef CONFIG_NET_RX_BUSY_POLL
+ data[index++] = priv->rx_ring[i]->yields;
+ data[index++] = priv->rx_ring[i]->misses;
+ data[index++] = priv->rx_ring[i]->cleaned;
+#endif
+ }
+ spin_unlock_bh(&priv->stats_lock);
+
+}
+
+static void mlx4_en_self_test(struct net_device *dev,
+ struct ethtool_test *etest, u64 *buf)
+{
+ mlx4_en_ex_selftest(dev, &etest->flags, buf);
+}
+
+static void mlx4_en_get_strings(struct net_device *dev,
+ uint32_t stringset, uint8_t *data)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ int index = 0;
+ int i, strings = 0;
+ struct bitmap_iterator it;
+
+ bitmap_iterator_init(&it, priv->stats_bitmap.bitmap, NUM_ALL_STATS);
+
+ switch (stringset) {
+ case ETH_SS_TEST:
+ for (i = 0; i < MLX4_EN_NUM_SELF_TEST - 2; i++)
+ strcpy(data + i * ETH_GSTRING_LEN, mlx4_en_test_names[i]);
+ if (priv->mdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_UC_LOOPBACK)
+ for (; i < MLX4_EN_NUM_SELF_TEST; i++)
+ strcpy(data + i * ETH_GSTRING_LEN, mlx4_en_test_names[i]);
+ break;
+
+ case ETH_SS_STATS:
+ /* Add main counters */
+ for (i = 0; i < NUM_MAIN_STATS; i++, strings++,
+ bitmap_iterator_inc(&it))
+ if (bitmap_iterator_test(&it))
+ strcpy(data + (index++) * ETH_GSTRING_LEN,
+ main_strings[strings]);
+
+ for (i = 0; i < NUM_PORT_STATS; i++, strings++,
+ bitmap_iterator_inc(&it))
+ if (bitmap_iterator_test(&it))
+ strcpy(data + (index++) * ETH_GSTRING_LEN,
+ main_strings[strings]);
+
+ for (i = 0; i < NUM_FLOW_STATS; i++, strings++,
+ bitmap_iterator_inc(&it))
+ if (bitmap_iterator_test(&it))
+ strcpy(data + (index++) * ETH_GSTRING_LEN,
+ main_strings[strings]);
+
+ for (i = 0; i < NUM_VF_STATS; i++, strings++,
+ bitmap_iterator_inc(&it))
+ if (bitmap_iterator_test(&it))
+ strcpy(data + (index++) * ETH_GSTRING_LEN,
+ main_strings[strings]);
+
+ for (i = 0; i < NUM_VPORT_STATS; i++, strings++,
+ bitmap_iterator_inc(&it))
+ if (bitmap_iterator_test(&it))
+ strcpy(data + (index++) * ETH_GSTRING_LEN,
+ main_strings[strings]);
+
+ for (i = 0; i < NUM_PKT_STATS; i++, strings++,
+ bitmap_iterator_inc(&it))
+ if (bitmap_iterator_test(&it))
+ strcpy(data + (index++) * ETH_GSTRING_LEN,
+ main_strings[strings]);
+
+ for (i = 0; i < priv->tx_ring_num; i++) {
+ sprintf(data + (index++) * ETH_GSTRING_LEN,
+ "tx%d_packets", i);
+ sprintf(data + (index++) * ETH_GSTRING_LEN,
+ "tx%d_bytes", i);
+ }
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ sprintf(data + (index++) * ETH_GSTRING_LEN,
+ "rx%d_packets", i);
+ sprintf(data + (index++) * ETH_GSTRING_LEN,
+ "rx%d_bytes", i);
+#ifdef CONFIG_NET_RX_BUSY_POLL
+ sprintf(data + (index++) * ETH_GSTRING_LEN,
+ "rx%d_napi_yield", i);
+ sprintf(data + (index++) * ETH_GSTRING_LEN,
+ "rx%d_misses", i);
+ sprintf(data + (index++) * ETH_GSTRING_LEN,
+ "rx%d_cleaned", i);
+#endif
+ }
+ break;
+ case ETH_SS_PRIV_FLAGS:
+ for (i = 0; i < ARRAY_SIZE(mlx4_en_priv_flags); i++)
+ strcpy(data + i * ETH_GSTRING_LEN,
+ mlx4_en_priv_flags[i]);
+ break;
+
+ }
+}
+
+static u32 mlx4_en_autoneg_get(struct net_device *dev)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ u32 autoneg = AUTONEG_DISABLE;
+
+ if ((mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ETH_BACKPL_AN_REP) &&
+ (priv->port_state.flags & MLX4_EN_PORT_ANE))
+ autoneg = AUTONEG_ENABLE;
+
+ return autoneg;
+}
+
+static u32 ptys_get_supported_port(struct mlx4_ptys_reg *ptys_reg)
+{
+ u32 eth_proto = be32_to_cpu(ptys_reg->eth_proto_cap);
+
+ if (eth_proto & (MLX4_PROT_MASK(MLX4_10GBASE_T)
+ | MLX4_PROT_MASK(MLX4_1000BASE_T)
+ | MLX4_PROT_MASK(MLX4_100BASE_TX))) {
+ return SUPPORTED_TP;
+ }
+
+ if (eth_proto & (MLX4_PROT_MASK(MLX4_10GBASE_CR)
+ | MLX4_PROT_MASK(MLX4_10GBASE_SR)
+ | MLX4_PROT_MASK(MLX4_56GBASE_SR4)
+ | MLX4_PROT_MASK(MLX4_40GBASE_CR4)
+ | MLX4_PROT_MASK(MLX4_40GBASE_SR4)
+ | MLX4_PROT_MASK(MLX4_1000BASE_CX_SGMII))) {
+ return SUPPORTED_FIBRE;
+ }
+
+ if (eth_proto & (MLX4_PROT_MASK(MLX4_56GBASE_KR4)
+ | MLX4_PROT_MASK(MLX4_40GBASE_KR4)
+ | MLX4_PROT_MASK(MLX4_20GBASE_KR2)
+ | MLX4_PROT_MASK(MLX4_10GBASE_KR)
+ | MLX4_PROT_MASK(MLX4_10GBASE_KX4)
+ | MLX4_PROT_MASK(MLX4_1000BASE_KX))) {
+ return SUPPORTED_Backplane;
+ }
+ return 0;
+}
+
+static u32 ptys_get_active_port(struct mlx4_ptys_reg *ptys_reg)
+{
+ u32 eth_proto = be32_to_cpu(ptys_reg->eth_proto_oper);
+
+ if (!eth_proto) /* link down */
+ eth_proto = be32_to_cpu(ptys_reg->eth_proto_cap);
+
+ if (eth_proto & (MLX4_PROT_MASK(MLX4_10GBASE_T)
+ | MLX4_PROT_MASK(MLX4_1000BASE_T)
+ | MLX4_PROT_MASK(MLX4_100BASE_TX))) {
+ return PORT_TP;
+ }
+
+ if (eth_proto & (MLX4_PROT_MASK(MLX4_10GBASE_SR)
+ | MLX4_PROT_MASK(MLX4_56GBASE_SR4)
+ | MLX4_PROT_MASK(MLX4_40GBASE_SR4)
+ | MLX4_PROT_MASK(MLX4_1000BASE_CX_SGMII))) {
+ return PORT_FIBRE;
+ }
+
+ if (eth_proto & (MLX4_PROT_MASK(MLX4_10GBASE_CR)
+ | MLX4_PROT_MASK(MLX4_56GBASE_CR4)
+ | MLX4_PROT_MASK(MLX4_40GBASE_CR4))) {
+ return PORT_DA;
+ }
+
+ if (eth_proto & (MLX4_PROT_MASK(MLX4_56GBASE_KR4)
+ | MLX4_PROT_MASK(MLX4_40GBASE_KR4)
+ | MLX4_PROT_MASK(MLX4_20GBASE_KR2)
+ | MLX4_PROT_MASK(MLX4_10GBASE_KR)
+ | MLX4_PROT_MASK(MLX4_10GBASE_KX4)
+ | MLX4_PROT_MASK(MLX4_1000BASE_KX))) {
+ return PORT_NONE;
+ }
+ return PORT_OTHER;
+}
+
+#define MLX4_LINK_MODES_SZ \
+ (FIELD_SIZEOF(struct mlx4_ptys_reg, eth_proto_cap) * 8)
+
+enum ethtool_report {
+ SUPPORTED = 0,
+ ADVERTISED = 1,
+ SPEED = 2
+};
+
+/* Translates mlx4 link mode to equivalent ethtool Link modes/speed */
+static u32 ptys2ethtool_map[MLX4_LINK_MODES_SZ][3] = {
+ [MLX4_100BASE_TX] = {
+ SUPPORTED_100baseT_Full,
+ ADVERTISED_100baseT_Full,
+ SPEED_100
+ },
+
+ [MLX4_1000BASE_T] = {
+ SUPPORTED_1000baseT_Full,
+ ADVERTISED_1000baseT_Full,
+ SPEED_1000
+ },
+ [MLX4_1000BASE_CX_SGMII] = {
+ SUPPORTED_1000baseKX_Full,
+ ADVERTISED_1000baseKX_Full,
+ SPEED_1000
+ },
+ [MLX4_1000BASE_KX] = {
+ SUPPORTED_1000baseKX_Full,
+ ADVERTISED_1000baseKX_Full,
+ SPEED_1000
+ },
+
+ [MLX4_10GBASE_T] = {
+ SUPPORTED_10000baseT_Full,
+ ADVERTISED_10000baseT_Full,
+ SPEED_10000
+ },
+ [MLX4_10GBASE_CX4] = {
+ SUPPORTED_10000baseKX4_Full,
+ ADVERTISED_10000baseKX4_Full,
+ SPEED_10000
+ },
+ [MLX4_10GBASE_KX4] = {
+ SUPPORTED_10000baseKX4_Full,
+ ADVERTISED_10000baseKX4_Full,
+ SPEED_10000
+ },
+ [MLX4_10GBASE_KR] = {
+ SUPPORTED_10000baseKR_Full,
+ ADVERTISED_10000baseKR_Full,
+ SPEED_10000
+ },
+ [MLX4_10GBASE_CR] = {
+ SUPPORTED_10000baseKR_Full,
+ ADVERTISED_10000baseKR_Full,
+ SPEED_10000
+ },
+ [MLX4_10GBASE_SR] = {
+ SUPPORTED_10000baseKR_Full,
+ ADVERTISED_10000baseKR_Full,
+ SPEED_10000
+ },
+
+ [MLX4_20GBASE_KR2] = {
+ SUPPORTED_20000baseMLD2_Full | SUPPORTED_20000baseKR2_Full,
+ ADVERTISED_20000baseMLD2_Full | ADVERTISED_20000baseKR2_Full,
+ SPEED_20000
+ },
+
+ [MLX4_40GBASE_CR4] = {
+ SUPPORTED_40000baseCR4_Full,
+ ADVERTISED_40000baseCR4_Full,
+ SPEED_40000
+ },
+ [MLX4_40GBASE_KR4] = {
+ SUPPORTED_40000baseKR4_Full,
+ ADVERTISED_40000baseKR4_Full,
+ SPEED_40000
+ },
+ [MLX4_40GBASE_SR4] = {
+ SUPPORTED_40000baseSR4_Full,
+ ADVERTISED_40000baseSR4_Full,
+ SPEED_40000
+ },
+
+ [MLX4_56GBASE_KR4] = {
+ SUPPORTED_56000baseKR4_Full,
+ ADVERTISED_56000baseKR4_Full,
+ SPEED_56000
+ },
+ [MLX4_56GBASE_CR4] = {
+ SUPPORTED_56000baseCR4_Full,
+ ADVERTISED_56000baseCR4_Full,
+ SPEED_56000
+ },
+ [MLX4_56GBASE_SR4] = {
+ SUPPORTED_56000baseSR4_Full,
+ ADVERTISED_56000baseSR4_Full,
+ SPEED_56000
+ },
+};
+
+static u32 ptys2ethtool_link_modes(u32 eth_proto, enum ethtool_report report)
+{
+ int i;
+ u32 link_modes = 0;
+
+ for (i = 0; i < MLX4_LINK_MODES_SZ; i++) {
+ if (eth_proto & MLX4_PROT_MASK(i))
+ link_modes |= ptys2ethtool_map[i][report];
+ }
+ return link_modes;
+}
+
+static u32 ethtool2ptys_link_modes(u32 link_modes, enum ethtool_report report)
+{
+ int i;
+ u32 ptys_modes = 0;
+
+ for (i = 0; i < MLX4_LINK_MODES_SZ; i++) {
+ if (ptys2ethtool_map[i][report] & link_modes)
+ ptys_modes |= 1 << i;
+ }
+ return ptys_modes;
+}
+
+/* Convert actual speed (SPEED_XXX) to ptys link modes */
+static u32 speed2ptys_link_modes(u32 speed)
+{
+ int i;
+ u32 ptys_modes = 0;
+
+ for (i = 0; i < MLX4_LINK_MODES_SZ; i++) {
+ if (ptys2ethtool_map[i][SPEED] == speed)
+ ptys_modes |= 1 << i;
+ }
+ return ptys_modes;
+}
+
+static int ethtool_get_ptys_settings(struct net_device *dev,
+ struct ethtool_cmd *cmd)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_ptys_reg ptys_reg;
+ u32 eth_proto;
+ int ret;
+
+ memset(&ptys_reg, 0, sizeof(ptys_reg));
+ ptys_reg.local_port = priv->port;
+ ptys_reg.proto_mask = MLX4_PTYS_EN;
+ ret = mlx4_ACCESS_PTYS_REG(priv->mdev->dev,
+ MLX4_ACCESS_REG_QUERY, &ptys_reg);
+ if (ret) {
+ en_warn(priv, "Failed to run mlx4_ACCESS_PTYS_REG status(%x)\n",
+ ret);
+ return ret;
+ }
+ en_dbg(DRV, priv, "ptys_reg.proto_mask %x\n",
+ ptys_reg.proto_mask);
+ en_dbg(DRV, priv, "ptys_reg.eth_proto_cap %x\n",
+ be32_to_cpu(ptys_reg.eth_proto_cap));
+ en_dbg(DRV, priv, "ptys_reg.eth_proto_admin %x\n",
+ be32_to_cpu(ptys_reg.eth_proto_admin));
+ en_dbg(DRV, priv, "ptys_reg.eth_proto_oper %x\n",
+ be32_to_cpu(ptys_reg.eth_proto_oper));
+ en_dbg(DRV, priv, "ptys_reg.eth_proto_lp_adv %x\n",
+ be32_to_cpu(ptys_reg.eth_proto_lp_adv));
+
+ cmd->supported = 0;
+ cmd->advertising = 0;
+
+ cmd->supported |= ptys_get_supported_port(&ptys_reg);
+
+ eth_proto = be32_to_cpu(ptys_reg.eth_proto_cap);
+ cmd->supported |= ptys2ethtool_link_modes(eth_proto, SUPPORTED);
+
+ eth_proto = be32_to_cpu(ptys_reg.eth_proto_admin);
+ cmd->advertising |= ptys2ethtool_link_modes(eth_proto, ADVERTISED);
+
+ cmd->supported |= SUPPORTED_Pause | SUPPORTED_Asym_Pause;
+ cmd->advertising |= (priv->prof->tx_pause) ? ADVERTISED_Pause : 0;
+
+ cmd->advertising |= (priv->prof->tx_pause ^ priv->prof->rx_pause) ?
+ ADVERTISED_Asym_Pause : 0;
+
+ cmd->port = ptys_get_active_port(&ptys_reg);
+ cmd->transceiver = (SUPPORTED_TP & cmd->supported) ?
+ XCVR_EXTERNAL : XCVR_INTERNAL;
+
+ if (mlx4_en_autoneg_get(dev)) {
+ cmd->supported |= SUPPORTED_Autoneg;
+ cmd->advertising |= ADVERTISED_Autoneg;
+ }
+
+ cmd->autoneg = (priv->port_state.flags & MLX4_EN_PORT_ANC) ?
+ AUTONEG_ENABLE : AUTONEG_DISABLE;
+
+ eth_proto = be32_to_cpu(ptys_reg.eth_proto_lp_adv);
+ cmd->lp_advertising = ptys2ethtool_link_modes(eth_proto, ADVERTISED);
+
+ cmd->lp_advertising |= (priv->port_state.flags & MLX4_EN_PORT_ANC) ?
+ ADVERTISED_Autoneg : 0;
+
+ cmd->phy_address = 0;
+ cmd->mdio_support = 0;
+ cmd->maxtxpkt = 0;
+ cmd->maxrxpkt = 0;
+ cmd->eth_tp_mdix = ETH_TP_MDI_INVALID;
+#if defined(ETH_TP_MDI_AUTO)
+ cmd->eth_tp_mdix_ctrl = ETH_TP_MDI_AUTO;
+#endif
+
+ return ret;
+}
+
+static void ethtool_get_default_settings(struct net_device *dev,
+ struct ethtool_cmd *cmd)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ int trans_type;
+
+ cmd->autoneg = AUTONEG_DISABLE;
+ cmd->supported = SUPPORTED_10000baseT_Full;
+ cmd->advertising = ADVERTISED_10000baseT_Full;
+ trans_type = priv->port_state.transceiver;
+
+ if (trans_type > 0 && trans_type <= 0xC) {
+ cmd->port = PORT_FIBRE;
+ cmd->transceiver = XCVR_EXTERNAL;
+ cmd->supported |= SUPPORTED_FIBRE;
+ cmd->advertising |= ADVERTISED_FIBRE;
+ } else if (trans_type == 0x80 || trans_type == 0) {
+ cmd->port = PORT_TP;
+ cmd->transceiver = XCVR_INTERNAL;
+ cmd->supported |= SUPPORTED_TP;
+ cmd->advertising |= ADVERTISED_TP;
+ } else {
+ cmd->port = -1;
+ cmd->transceiver = -1;
+ }
+}
+
+static int mlx4_en_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ int ret = -EINVAL;
+
+ if (mlx4_en_QUERY_PORT(priv->mdev, priv->port))
+ return -ENOMEM;
+
+ en_dbg(DRV, priv, "query port state.flags ANC(%x) ANE(%x)\n",
+ priv->port_state.flags & MLX4_EN_PORT_ANC,
+ priv->port_state.flags & MLX4_EN_PORT_ANE);
+
+ if (priv->mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ETH_PROT_CTRL)
+ ret = ethtool_get_ptys_settings(dev, cmd);
+ if (ret) /* ETH PROT CTRL is not supported or PTYS CMD failed */
+ ethtool_get_default_settings(dev, cmd);
+
+ if (netif_carrier_ok(dev)) {
+ ethtool_cmd_speed_set(cmd, priv->port_state.link_speed);
+ cmd->duplex = DUPLEX_FULL;
+ } else {
+ ethtool_cmd_speed_set(cmd, SPEED_UNKNOWN);
+ cmd->duplex = DUPLEX_UNKNOWN;
+ }
+ return 0;
+}
+
+/* Calculate PTYS admin according to ethtool speed (SPEED_XXX) */
+static __be32 speed_set_ptys_admin(struct mlx4_en_priv *priv, u32 speed,
+ __be32 proto_cap)
+{
+ __be32 proto_admin = 0;
+
+ if (!speed) { /* Speed = 0 ==> Reset Link modes */
+ proto_admin = proto_cap;
+ en_info(priv, "Speed was set to 0, Reset advertised Link Modes to default (%x)\n",
+ be32_to_cpu(proto_cap));
+ } else {
+ u32 ptys_link_modes = speed2ptys_link_modes(speed);
+
+ proto_admin = cpu_to_be32(ptys_link_modes) & proto_cap;
+ en_info(priv, "Setting Speed to %d\n", speed);
+ }
+ return proto_admin;
+}
+
+static int mlx4_en_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_ptys_reg ptys_reg;
+ __be32 proto_admin;
+ int ret;
+
+ u32 ptys_adv = ethtool2ptys_link_modes(cmd->advertising, ADVERTISED);
+ int speed = ethtool_cmd_speed(cmd);
+
+ en_dbg(DRV, priv, "Set Speed=%d adv=0x%x autoneg=%d duplex=%d\n",
+ speed, cmd->advertising, cmd->autoneg, cmd->duplex);
+
+ if (!(priv->mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ETH_PROT_CTRL) ||
+ (cmd->duplex == DUPLEX_HALF))
+ return -EINVAL;
+
+ memset(&ptys_reg, 0, sizeof(ptys_reg));
+ ptys_reg.local_port = priv->port;
+ ptys_reg.proto_mask = MLX4_PTYS_EN;
+ ret = mlx4_ACCESS_PTYS_REG(priv->mdev->dev,
+ MLX4_ACCESS_REG_QUERY, &ptys_reg);
+ if (ret) {
+ en_warn(priv, "Failed to QUERY mlx4_ACCESS_PTYS_REG status(%x)\n",
+ ret);
+ return 0;
+ }
+
+ proto_admin = cmd->autoneg == AUTONEG_ENABLE ?
+ cpu_to_be32(ptys_adv) :
+ speed_set_ptys_admin(priv, speed,
+ ptys_reg.eth_proto_cap);
+
+ proto_admin &= ptys_reg.eth_proto_cap;
+ if (!proto_admin) {
+ en_warn(priv, "Not supported link mode(s) requested, check supported link modes.\n");
+ return -EINVAL; /* nothing to change due to bad input */
+ }
+
+ if (proto_admin == ptys_reg.eth_proto_admin)
+ return 0; /* Nothing to change */
+
+ en_dbg(DRV, priv, "mlx4_ACCESS_PTYS_REG SET: ptys_reg.eth_proto_admin = 0x%x\n",
+ be32_to_cpu(proto_admin));
+
+ ptys_reg.eth_proto_admin = proto_admin;
+ ret = mlx4_ACCESS_PTYS_REG(priv->mdev->dev, MLX4_ACCESS_REG_WRITE,
+ &ptys_reg);
+ if (ret) {
+ en_warn(priv, "Failed to write mlx4_ACCESS_PTYS_REG eth_proto_admin(0x%x) status(0x%x)\n",
+ be32_to_cpu(ptys_reg.eth_proto_admin), ret);
+ return ret;
+ }
+
+ mutex_lock(&priv->mdev->state_lock);
+ if (priv->port_up) {
+ en_warn(priv, "Port link mode changed, restarting port...\n");
+ mlx4_en_stop_port(dev, 1);
+ if (mlx4_en_start_port(dev))
+ en_err(priv, "Failed restarting port %d\n", priv->port);
+ }
+ mutex_unlock(&priv->mdev->state_lock);
+ return 0;
+}
+
+static int mlx4_en_get_coalesce(struct net_device *dev,
+ struct ethtool_coalesce *coal)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+
+ coal->tx_coalesce_usecs = priv->tx_usecs;
+ coal->tx_max_coalesced_frames = priv->tx_frames;
+ coal->tx_max_coalesced_frames_irq = priv->tx_work_limit;
+
+ coal->rx_coalesce_usecs = priv->rx_usecs;
+ coal->rx_max_coalesced_frames = priv->rx_frames;
+
+ coal->pkt_rate_low = priv->pkt_rate_low;
+ coal->rx_coalesce_usecs_low = priv->rx_usecs_low;
+ coal->pkt_rate_high = priv->pkt_rate_high;
+ coal->rx_coalesce_usecs_high = priv->rx_usecs_high;
+ coal->rate_sample_interval = priv->sample_interval;
+ coal->use_adaptive_rx_coalesce = priv->adaptive_rx_coal;
+
+ return 0;
+}
+
+static int mlx4_en_set_coalesce(struct net_device *dev,
+ struct ethtool_coalesce *coal)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+
+ if (!coal->tx_max_coalesced_frames_irq)
+ return -EINVAL;
+
+ priv->rx_frames = (coal->rx_max_coalesced_frames ==
+ MLX4_EN_AUTO_CONF) ?
+ MLX4_EN_RX_COAL_TARGET :
+ coal->rx_max_coalesced_frames;
+ priv->rx_usecs = (coal->rx_coalesce_usecs ==
+ MLX4_EN_AUTO_CONF) ?
+ MLX4_EN_RX_COAL_TIME :
+ coal->rx_coalesce_usecs;
+
+ /* Setting TX coalescing parameters */
+ if (coal->tx_coalesce_usecs != priv->tx_usecs ||
+ coal->tx_max_coalesced_frames != priv->tx_frames) {
+ priv->tx_usecs = coal->tx_coalesce_usecs;
+ priv->tx_frames = coal->tx_max_coalesced_frames;
+ }
+
+ /* Set adaptive coalescing params */
+ priv->pkt_rate_low = coal->pkt_rate_low;
+ priv->rx_usecs_low = coal->rx_coalesce_usecs_low;
+ priv->pkt_rate_high = coal->pkt_rate_high;
+ priv->rx_usecs_high = coal->rx_coalesce_usecs_high;
+ priv->sample_interval = coal->rate_sample_interval;
+ priv->adaptive_rx_coal = coal->use_adaptive_rx_coalesce;
+ priv->tx_work_limit = coal->tx_max_coalesced_frames_irq;
+
+ return mlx4_en_moderation_update(priv);
+}
+
+static int mlx4_en_set_pauseparam(struct net_device *dev,
+ struct ethtool_pauseparam *pause)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int err;
+
+ if (pause->autoneg)
+ return -EINVAL;
+
+ priv->prof->tx_pause = pause->tx_pause != 0;
+ priv->prof->rx_pause = pause->rx_pause != 0;
+ err = mlx4_SET_PORT_general(mdev->dev, priv->port,
+ priv->rx_skb_size + ETH_FCS_LEN,
+ priv->prof->tx_pause,
+ priv->prof->tx_ppp,
+ priv->prof->rx_pause,
+ priv->prof->rx_ppp);
+ if (err) {
+ en_err(priv, "Failed setting pause params\n");
+ } else {
+ mlx4_en_update_pfc_stats_bitmap(mdev->dev, &priv->stats_bitmap,
+ priv->prof->rx_ppp,
+ priv->prof->rx_pause,
+ priv->prof->tx_ppp,
+ priv->prof->tx_pause);
+ }
+
+ return err;
+}
+
+static void mlx4_en_get_pauseparam(struct net_device *dev,
+ struct ethtool_pauseparam *pause)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+
+ pause->tx_pause = priv->prof->tx_pause;
+ pause->rx_pause = priv->prof->rx_pause;
+}
+
+static int mlx4_en_set_ringparam(struct net_device *dev,
+ struct ethtool_ringparam *param)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ u32 rx_size, tx_size;
+ int port_up = 0;
+ int err = 0;
+
+ if (param->rx_jumbo_pending || param->rx_mini_pending)
+ return -EINVAL;
+
+ rx_size = roundup_pow_of_two(param->rx_pending);
+ rx_size = max_t(u32, rx_size, MLX4_EN_MIN_RX_SIZE);
+ rx_size = min_t(u32, rx_size, MLX4_EN_MAX_RX_SIZE);
+ tx_size = roundup_pow_of_two(param->tx_pending);
+ tx_size = max_t(u32, tx_size, MLX4_EN_MIN_TX_SIZE);
+ tx_size = min_t(u32, tx_size, MLX4_EN_MAX_TX_SIZE);
+
+ if (rx_size == (priv->port_up ? priv->rx_ring[0]->actual_size :
+ priv->rx_ring[0]->size) &&
+ tx_size == priv->tx_ring[0]->size)
+ return 0;
+
+ mutex_lock(&mdev->state_lock);
+ if (priv->port_up) {
+ port_up = 1;
+ mlx4_en_stop_port(dev, 1);
+ }
+
+ mlx4_en_free_resources(priv);
+
+ priv->prof->tx_ring_size = tx_size;
+ priv->prof->rx_ring_size = rx_size;
+
+ err = mlx4_en_alloc_resources(priv);
+ if (err) {
+ en_err(priv, "Failed reallocating port resources\n");
+ goto out;
+ }
+ if (port_up) {
+ err = mlx4_en_start_port(dev);
+ if (err)
+ en_err(priv, "Failed starting port\n");
+ }
+
+ err = mlx4_en_moderation_update(priv);
+
+out:
+ mutex_unlock(&mdev->state_lock);
+ return err;
+}
+
+static void mlx4_en_get_ringparam(struct net_device *dev,
+ struct ethtool_ringparam *param)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+
+ memset(param, 0, sizeof(*param));
+ param->rx_max_pending = MLX4_EN_MAX_RX_SIZE;
+ param->tx_max_pending = MLX4_EN_MAX_TX_SIZE;
+ param->rx_pending = priv->port_up ?
+ priv->rx_ring[0]->actual_size : priv->rx_ring[0]->size;
+ param->tx_pending = priv->tx_ring[0]->size;
+}
+
+#ifndef CONFIG_SYSFS_INDIR_SETTING
+static
+#endif
+u32 mlx4_en_get_rxfh_indir_size(struct net_device *dev)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+
+ return priv->rx_ring_num;
+}
+
+#if defined(HAVE_GET_SET_RXFH) && !defined(HAVE_GET_SET_RXFH_INDIR_EXT)
+static u32 mlx4_en_get_rxfh_key_size(struct net_device *netdev)
+{
+ return MLX4_EN_RSS_KEY_SIZE;
+}
+#endif
+
+#ifdef HAVE_ETH_SS_RSS_HASH_FUNCS
+static int mlx4_en_check_rxfh_func(struct net_device *dev, u8 hfunc)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+
+ /* check if requested function is supported by the device */
+ if ((hfunc == ETH_RSS_HASH_TOP &&
+ !(priv->mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_RSS_TOP)) ||
+ (hfunc == ETH_RSS_HASH_XOR &&
+ !(priv->mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_RSS_XOR)))
+ return -EINVAL;
+
+ priv->rss_hash_fn = hfunc;
+#ifdef HAVE_NETIF_F_RXHASH
+ if (hfunc == ETH_RSS_HASH_TOP && !(dev->features & NETIF_F_RXHASH))
+ en_warn(priv,
+ "Toeplitz hash function should be used in conjunction with RX hashing for optimal performance\n");
+ if (hfunc == ETH_RSS_HASH_XOR && (dev->features & NETIF_F_RXHASH))
+ en_warn(priv,
+ "Enabling both XOR Hash function and RX Hashing can limit RPS functionality\n");
+#endif
+ return 0;
+}
+#endif
+
+#ifdef CONFIG_SYSFS_INDIR_SETTING
+int mlx4_en_get_rxfh_indir(struct net_device *dev, u32 *ring_index)
+#else
+#ifdef HAVE_GET_SET_RXFH_INDIR_EXT
+static int mlx4_en_get_rxfh_indir(struct net_device *dev, u32 *ring_index)
+#else
+#ifdef HAVE_ETH_SS_RSS_HASH_FUNCS
+static int mlx4_en_get_rxfh(struct net_device *dev, u32 *ring_index, u8 *key,
+ u8 *hfunc)
+#else
+static int mlx4_en_get_rxfh(struct net_device *dev, u32 *ring_index, u8 *key)
+#endif
+#endif
+#endif
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_rss_map *rss_map = &priv->rss_map;
+ int rss_rings;
+ size_t n = priv->rx_ring_num;
+ int err = 0;
+
+ rss_rings = priv->prof->rss_rings ?: priv->rx_ring_num;
+ rss_rings = 1 << ilog2(rss_rings);
+
+ while (n--) {
+ if (!ring_index)
+ break;
+ ring_index[n] = rss_map->qps[n % rss_rings].qpn -
+ rss_map->base_qpn;
+ }
+#if !defined(HAVE_GET_SET_RXFH_INDIR_EXT) && !defined(CONFIG_SYSFS_INDIR_SETTING)
+ if (key)
+ memcpy(key, priv->rss_key, MLX4_EN_RSS_KEY_SIZE);
+#endif
+#ifdef HAVE_ETH_SS_RSS_HASH_FUNCS
+ if (hfunc)
+ *hfunc = priv->rss_hash_fn;
+#endif
+ return err;
+}
+
+#ifdef CONFIG_SYSFS_INDIR_SETTING
+int mlx4_en_set_rxfh_indir(struct net_device *dev,
+ const u32 *ring_index)
+#else
+#ifdef HAVE_GET_SET_RXFH_INDIR_EXT
+static int mlx4_en_set_rxfh_indir(struct net_device *dev,
+ const u32 *ring_index)
+#else
+static int mlx4_en_set_rxfh(struct net_device *dev, const u32 *ring_index,
+#ifdef HAVE_ETH_SS_RSS_HASH_FUNCS
+ const u8 *key, const u8 hfunc)
+#else
+ const u8 *key)
+#endif
+#endif
+#endif
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int port_up = 0;
+ int err = 0;
+ int i;
+ int rss_rings = 0;
+
+ /* Calculate RSS table size and make sure flows are spread evenly
+ * between rings
+ */
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ if (!ring_index)
+ continue;
+ if (i > 0 && !ring_index[i] && !rss_rings)
+ rss_rings = i;
+
+ if (ring_index[i] != (i % (rss_rings ?: priv->rx_ring_num)))
+ return -EINVAL;
+ }
+
+ if (!rss_rings)
+ rss_rings = priv->rx_ring_num;
+
+ /* RSS table size must be a power of 2 */
+ if (!is_power_of_2(rss_rings))
+ return -EINVAL;
+
+#ifdef HAVE_ETH_SS_RSS_HASH_FUNCS
+ if (hfunc != ETH_RSS_HASH_NO_CHANGE) {
+ err = mlx4_en_check_rxfh_func(dev, hfunc);
+ if (err)
+ return err;
+ }
+#endif
+
+ mutex_lock(&mdev->state_lock);
+ if (priv->port_up) {
+ port_up = 1;
+ mlx4_en_stop_port(dev, 1);
+ }
+
+ if (ring_index)
+ priv->prof->rss_rings = rss_rings;
+#if !defined(HAVE_GET_SET_RXFH_INDIR_EXT) && !defined(CONFIG_SYSFS_INDIR_SETTING)
+ if (key)
+ memcpy(priv->rss_key, key, MLX4_EN_RSS_KEY_SIZE);
+#endif
+
+ if (port_up) {
+ err = mlx4_en_start_port(dev);
+ if (err)
+ en_err(priv, "Failed starting port\n");
+ }
+
+ mutex_unlock(&mdev->state_lock);
+ return err;
+}
+
+#define all_zeros_or_all_ones(field) \
+ ((field) == 0 || (field) == (__force typeof(field))-1)
+
+static int mlx4_en_validate_flow(struct net_device *dev,
+ struct mlx4_ethtool_rxnfc *cmd)
+{
+ struct ethtool_usrip4_spec *l3_mask;
+ struct ethtool_tcpip4_spec *l4_mask;
+ struct ethhdr *eth_mask;
+
+ if (cmd->fs.location >= MAX_NUM_OF_FS_RULES)
+ return -EINVAL;
+
+#ifdef HAVE_ETHTOOL_FLOW_EXT_H_DEST
+ if (cmd->fs.flow_type & FLOW_MAC_EXT) {
+ /* dest mac mask must be ff:ff:ff:ff:ff:ff */
+ if (!is_broadcast_ether_addr(cmd->fs.m_ext.h_dest))
+ return -EINVAL;
+ }
+#endif
+
+ switch (cmd->fs.flow_type & ~(FLOW_EXT | FLOW_MAC_EXT)) {
+ case TCP_V4_FLOW:
+ case UDP_V4_FLOW:
+ if (cmd->fs.m_u.tcp_ip4_spec.tos)
+ return -EINVAL;
+ l4_mask = &cmd->fs.m_u.tcp_ip4_spec;
+ /* don't allow mask which isn't all 0 or 1 */
+ if (!all_zeros_or_all_ones(l4_mask->ip4src) ||
+ !all_zeros_or_all_ones(l4_mask->ip4dst) ||
+ !all_zeros_or_all_ones(l4_mask->psrc) ||
+ !all_zeros_or_all_ones(l4_mask->pdst))
+ return -EINVAL;
+ break;
+ case IP_USER_FLOW:
+ l3_mask = &cmd->fs.m_u.usr_ip4_spec;
+ if (l3_mask->l4_4_bytes || l3_mask->tos || l3_mask->proto ||
+ cmd->fs.h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4 ||
+ (!l3_mask->ip4src && !l3_mask->ip4dst) ||
+ !all_zeros_or_all_ones(l3_mask->ip4src) ||
+ !all_zeros_or_all_ones(l3_mask->ip4dst))
+ return -EINVAL;
+ break;
+ case ETHER_FLOW:
+ eth_mask = &cmd->fs.m_u.ether_spec;
+ /* source mac mask must not be set */
+ if (!is_zero_ether_addr(eth_mask->h_source))
+ return -EINVAL;
+
+ /* dest mac mask must be ff:ff:ff:ff:ff:ff */
+ if (!is_broadcast_ether_addr(eth_mask->h_dest))
+ return -EINVAL;
+
+ if (!all_zeros_or_all_ones(eth_mask->h_proto))
+ return -EINVAL;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ if ((cmd->fs.flow_type & FLOW_EXT)) {
+ if (cmd->fs.m_ext.vlan_etype ||
+ !((cmd->fs.m_ext.vlan_tci & cpu_to_be16(VLAN_VID_MASK)) ==
+ 0 ||
+ (cmd->fs.m_ext.vlan_tci & cpu_to_be16(VLAN_VID_MASK)) ==
+ cpu_to_be16(VLAN_VID_MASK)))
+ return -EINVAL;
+
+ if (cmd->fs.m_ext.vlan_tci) {
+ if (be16_to_cpu(cmd->fs.h_ext.vlan_tci) >= VLAN_N_VID)
+ return -EINVAL;
+
+ }
+ }
+
+ return 0;
+}
+
+static int mlx4_en_ethtool_add_mac_rule(struct mlx4_ethtool_rxnfc *cmd,
+ struct list_head *rule_list_h,
+ struct mlx4_spec_list *spec_l2,
+ unsigned char *mac)
+{
+ int err = 0;
+ __be64 mac_msk = cpu_to_be64(MLX4_MAC_MASK << 16);
+
+ spec_l2->id = MLX4_NET_TRANS_RULE_ID_ETH;
+ memcpy(spec_l2->eth.dst_mac_msk, &mac_msk, ETH_ALEN);
+ memcpy(spec_l2->eth.dst_mac, mac, ETH_ALEN);
+
+ if ((cmd->fs.flow_type & FLOW_EXT) &&
+ (cmd->fs.m_ext.vlan_tci & cpu_to_be16(VLAN_VID_MASK))) {
+ spec_l2->eth.vlan_id = cmd->fs.h_ext.vlan_tci;
+ spec_l2->eth.vlan_id_msk = cpu_to_be16(VLAN_VID_MASK);
+ }
+
+ list_add_tail(&spec_l2->list, rule_list_h);
+
+ return err;
+}
+
+static int mlx4_en_ethtool_add_mac_rule_by_ipv4(struct mlx4_en_priv *priv,
+ struct mlx4_ethtool_rxnfc *cmd,
+ struct list_head *rule_list_h,
+ struct mlx4_spec_list *spec_l2,
+ __be32 ipv4_dst)
+{
+#ifdef CONFIG_INET
+ unsigned char mac[ETH_ALEN];
+
+ if (!ipv4_is_multicast(ipv4_dst)) {
+#ifdef HAVE_ETHTOOL_FLOW_EXT_H_DEST
+ if (cmd->fs.flow_type & FLOW_MAC_EXT)
+ memcpy(&mac, cmd->fs.h_ext.h_dest, ETH_ALEN);
+ else
+#endif
+ memcpy(&mac, priv->dev->dev_addr, ETH_ALEN);
+ } else {
+ ip_eth_mc_map(ipv4_dst, mac);
+ }
+
+ return mlx4_en_ethtool_add_mac_rule(cmd, rule_list_h, spec_l2, &mac[0]);
+#else
+ return -EINVAL;
+#endif
+}
+
+static int add_ip_rule(struct mlx4_en_priv *priv,
+ struct mlx4_ethtool_rxnfc *cmd,
+ struct list_head *list_h)
+{
+ int err;
+ struct mlx4_spec_list *spec_l2 = NULL;
+ struct mlx4_spec_list *spec_l3 = NULL;
+ struct ethtool_usrip4_spec *l3_mask = &cmd->fs.m_u.usr_ip4_spec;
+
+ spec_l3 = kzalloc(sizeof(*spec_l3), GFP_KERNEL);
+ spec_l2 = kzalloc(sizeof(*spec_l2), GFP_KERNEL);
+ if (!spec_l2 || !spec_l3) {
+ err = -ENOMEM;
+ goto free_spec;
+ }
+
+ err = mlx4_en_ethtool_add_mac_rule_by_ipv4(priv, cmd, list_h, spec_l2,
+ cmd->fs.h_u.
+ usr_ip4_spec.ip4dst);
+ if (err)
+ goto free_spec;
+ spec_l3->id = MLX4_NET_TRANS_RULE_ID_IPV4;
+ spec_l3->ipv4.src_ip = cmd->fs.h_u.usr_ip4_spec.ip4src;
+ if (l3_mask->ip4src)
+ spec_l3->ipv4.src_ip_msk = EN_ETHTOOL_WORD_MASK;
+ spec_l3->ipv4.dst_ip = cmd->fs.h_u.usr_ip4_spec.ip4dst;
+ if (l3_mask->ip4dst)
+ spec_l3->ipv4.dst_ip_msk = EN_ETHTOOL_WORD_MASK;
+ list_add_tail(&spec_l3->list, list_h);
+
+ return 0;
+
+free_spec:
+ kfree(spec_l2);
+ kfree(spec_l3);
+ return err;
+}
+
+static int add_tcp_udp_rule(struct mlx4_en_priv *priv,
+ struct mlx4_ethtool_rxnfc *cmd,
+ struct list_head *list_h, int proto)
+{
+ int err;
+ struct mlx4_spec_list *spec_l2 = NULL;
+ struct mlx4_spec_list *spec_l3 = NULL;
+ struct mlx4_spec_list *spec_l4 = NULL;
+ struct ethtool_tcpip4_spec *l4_mask = &cmd->fs.m_u.tcp_ip4_spec;
+
+ spec_l2 = kzalloc(sizeof(*spec_l2), GFP_KERNEL);
+ spec_l3 = kzalloc(sizeof(*spec_l3), GFP_KERNEL);
+ spec_l4 = kzalloc(sizeof(*spec_l4), GFP_KERNEL);
+ if (!spec_l2 || !spec_l3 || !spec_l4) {
+ err = -ENOMEM;
+ goto free_spec;
+ }
+
+ spec_l3->id = MLX4_NET_TRANS_RULE_ID_IPV4;
+
+ if (proto == TCP_V4_FLOW) {
+ err = mlx4_en_ethtool_add_mac_rule_by_ipv4(priv, cmd, list_h,
+ spec_l2,
+ cmd->fs.h_u.
+ tcp_ip4_spec.ip4dst);
+ if (err)
+ goto free_spec;
+ spec_l4->id = MLX4_NET_TRANS_RULE_ID_TCP;
+ spec_l3->ipv4.src_ip = cmd->fs.h_u.tcp_ip4_spec.ip4src;
+ spec_l3->ipv4.dst_ip = cmd->fs.h_u.tcp_ip4_spec.ip4dst;
+ spec_l4->tcp_udp.src_port = cmd->fs.h_u.tcp_ip4_spec.psrc;
+ spec_l4->tcp_udp.dst_port = cmd->fs.h_u.tcp_ip4_spec.pdst;
+ } else {
+ err = mlx4_en_ethtool_add_mac_rule_by_ipv4(priv, cmd, list_h,
+ spec_l2,
+ cmd->fs.h_u.
+ udp_ip4_spec.ip4dst);
+ if (err)
+ goto free_spec;
+ spec_l4->id = MLX4_NET_TRANS_RULE_ID_UDP;
+ spec_l3->ipv4.src_ip = cmd->fs.h_u.udp_ip4_spec.ip4src;
+ spec_l3->ipv4.dst_ip = cmd->fs.h_u.udp_ip4_spec.ip4dst;
+ spec_l4->tcp_udp.src_port = cmd->fs.h_u.udp_ip4_spec.psrc;
+ spec_l4->tcp_udp.dst_port = cmd->fs.h_u.udp_ip4_spec.pdst;
+ }
+
+ if (l4_mask->ip4src)
+ spec_l3->ipv4.src_ip_msk = EN_ETHTOOL_WORD_MASK;
+ if (l4_mask->ip4dst)
+ spec_l3->ipv4.dst_ip_msk = EN_ETHTOOL_WORD_MASK;
+
+ if (l4_mask->psrc)
+ spec_l4->tcp_udp.src_port_msk = EN_ETHTOOL_SHORT_MASK;
+ if (l4_mask->pdst)
+ spec_l4->tcp_udp.dst_port_msk = EN_ETHTOOL_SHORT_MASK;
+
+ list_add_tail(&spec_l3->list, list_h);
+ list_add_tail(&spec_l4->list, list_h);
+
+ return 0;
+
+free_spec:
+ kfree(spec_l2);
+ kfree(spec_l3);
+ kfree(spec_l4);
+ return err;
+}
+
+static int mlx4_en_ethtool_to_net_trans_rule(struct net_device *dev,
+ struct mlx4_ethtool_rxnfc *cmd,
+ struct list_head *rule_list_h)
+{
+ int err;
+ struct ethhdr *eth_spec;
+ struct mlx4_spec_list *spec_l2;
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+
+ err = mlx4_en_validate_flow(dev, cmd);
+ if (err)
+ return err;
+
+ switch (cmd->fs.flow_type & ~(FLOW_EXT | FLOW_MAC_EXT)) {
+ case ETHER_FLOW:
+ spec_l2 = kzalloc(sizeof(*spec_l2), GFP_KERNEL);
+ if (!spec_l2)
+ return -ENOMEM;
+
+ eth_spec = &cmd->fs.h_u.ether_spec;
+ mlx4_en_ethtool_add_mac_rule(cmd, rule_list_h, spec_l2,
+ &eth_spec->h_dest[0]);
+ spec_l2->eth.ether_type = eth_spec->h_proto;
+ if (eth_spec->h_proto)
+ spec_l2->eth.ether_type_enable = 1;
+ break;
+ case IP_USER_FLOW:
+ err = add_ip_rule(priv, cmd, rule_list_h);
+ break;
+ case TCP_V4_FLOW:
+ err = add_tcp_udp_rule(priv, cmd, rule_list_h, TCP_V4_FLOW);
+ break;
+ case UDP_V4_FLOW:
+ err = add_tcp_udp_rule(priv, cmd, rule_list_h, UDP_V4_FLOW);
+ break;
+ }
+
+ return err;
+}
+
+static int mlx4_en_flow_replace(struct net_device *dev,
+ struct mlx4_ethtool_rxnfc *cmd)
+{
+ int err;
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct ethtool_flow_id *loc_rule;
+ struct mlx4_spec_list *spec, *tmp_spec;
+ u32 qpn;
+ u64 reg_id;
+
+ struct mlx4_net_trans_rule rule = {
+ .queue_mode = MLX4_NET_TRANS_Q_FIFO,
+ .exclusive = 0,
+ .allow_loopback = 1,
+ .promisc_mode = MLX4_FS_REGULAR,
+ };
+
+ rule.port = priv->port;
+ rule.priority = MLX4_DOMAIN_ETHTOOL | cmd->fs.location;
+ INIT_LIST_HEAD(&rule.list);
+
+ /* Allow direct QP attaches if the EN_ETHTOOL_QP_ATTACH flag is set */
+ if (cmd->fs.ring_cookie == RX_CLS_FLOW_DISC)
+ qpn = priv->drop_qp.qpn;
+ else if (cmd->fs.ring_cookie & EN_ETHTOOL_QP_ATTACH) {
+ qpn = cmd->fs.ring_cookie & (EN_ETHTOOL_QP_ATTACH - 1);
+ } else {
+ if (cmd->fs.ring_cookie >= priv->rx_ring_num) {
+ en_warn(priv, "rxnfc: RX ring (%llu) doesn't exist\n",
+ cmd->fs.ring_cookie);
+ return -EINVAL;
+ }
+ qpn = priv->rss_map.qps[cmd->fs.ring_cookie].qpn;
+ if (!qpn) {
+ en_warn(priv, "rxnfc: RX ring (%llu) is inactive\n",
+ cmd->fs.ring_cookie);
+ return -EINVAL;
+ }
+ }
+ rule.qpn = qpn;
+ err = mlx4_en_ethtool_to_net_trans_rule(dev, cmd, &rule.list);
+ if (err)
+ goto out_free_list;
+
+ loc_rule = &priv->ethtool_rules[cmd->fs.location];
+ if (loc_rule->id) {
+ err = mlx4_flow_detach(priv->mdev->dev, loc_rule->id);
+ if (err) {
+ en_err(priv, "Fail to detach network rule at location %d. registration id = %llx\n",
+ cmd->fs.location, loc_rule->id);
+ goto out_free_list;
+ }
+ loc_rule->id = 0;
+ memset(&loc_rule->flow_spec, 0,
+ sizeof(struct ethtool_rx_flow_spec));
+ list_del(&loc_rule->list);
+ }
+ err = mlx4_flow_attach(priv->mdev->dev, &rule, &reg_id);
+ if (err) {
+ en_err(priv, "Fail to attach network rule at location %d\n",
+ cmd->fs.location);
+ goto out_free_list;
+ }
+ loc_rule->id = reg_id;
+ memcpy(&loc_rule->flow_spec, &cmd->fs,
+ sizeof(struct ethtool_rx_flow_spec));
+ list_add_tail(&loc_rule->list, &priv->ethtool_list);
+
+out_free_list:
+ list_for_each_entry_safe(spec, tmp_spec, &rule.list, list) {
+ list_del(&spec->list);
+ kfree(spec);
+ }
+ return err;
+}
+
+static int mlx4_en_flow_detach(struct net_device *dev,
+ struct mlx4_ethtool_rxnfc *cmd)
+{
+ int err = 0;
+ struct ethtool_flow_id *rule;
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+
+ if (cmd->fs.location >= MAX_NUM_OF_FS_RULES)
+ return -EINVAL;
+
+ rule = &priv->ethtool_rules[cmd->fs.location];
+ if (!rule->id) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ err = mlx4_flow_detach(priv->mdev->dev, rule->id);
+ if (err) {
+ en_err(priv, "Fail to detach network rule at location %d. registration id = 0x%llx\n",
+ cmd->fs.location, rule->id);
+ goto out;
+ }
+ rule->id = 0;
+ memset(&rule->flow_spec, 0, sizeof(struct ethtool_rx_flow_spec));
+ list_del(&rule->list);
+out:
+ return err;
+
+}
+
+static int mlx4_en_get_flow(struct net_device *dev, struct mlx4_ethtool_rxnfc *cmd,
+ int loc)
+{
+ int err = 0;
+ struct ethtool_flow_id *rule;
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+
+ if (loc < 0 || loc >= MAX_NUM_OF_FS_RULES)
+ return -EINVAL;
+
+ rule = &priv->ethtool_rules[loc];
+ if (rule->id)
+ memcpy(&cmd->fs, &rule->flow_spec,
+ sizeof(struct ethtool_rx_flow_spec));
+ else
+ err = -ENOENT;
+
+ return err;
+}
+
+static int mlx4_en_get_num_flows(struct mlx4_en_priv *priv)
+{
+
+ int i, res = 0;
+ for (i = 0; i < MAX_NUM_OF_FS_RULES; i++) {
+ if (priv->ethtool_rules[i].id)
+ res++;
+ }
+ return res;
+
+}
+
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3,2,0))
+static int mlx4_en_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *c,
+ u32 *rule_locs)
+#else
+static int mlx4_en_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *c,
+ void *rule_locs)
+#endif
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int err = 0;
+ int i = 0, priority = 0;
+ struct mlx4_ethtool_rxnfc *cmd = (struct mlx4_ethtool_rxnfc *)c;
+
+ if ((cmd->cmd == ETHTOOL_GRXCLSRLCNT ||
+ cmd->cmd == ETHTOOL_GRXCLSRULE ||
+ cmd->cmd == ETHTOOL_GRXCLSRLALL) &&
+ (mdev->dev->caps.steering_mode !=
+ MLX4_STEERING_MODE_DEVICE_MANAGED || !priv->port_up))
+ return -EINVAL;
+
+ switch (cmd->cmd) {
+ case ETHTOOL_GRXRINGS:
+ cmd->data = priv->rx_ring_num;
+ break;
+ case ETHTOOL_GRXCLSRLCNT:
+ cmd->rule_cnt = mlx4_en_get_num_flows(priv);
+ break;
+ case ETHTOOL_GRXCLSRULE:
+ err = mlx4_en_get_flow(dev, cmd, cmd->fs.location);
+ break;
+ case ETHTOOL_GRXCLSRLALL:
+ while ((!err || err == -ENOENT) && priority < cmd->rule_cnt) {
+ err = mlx4_en_get_flow(dev, cmd, i);
+ if (!err)
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3,2,0))
+ rule_locs[priority++] = i;
+#else
+ ((u32 *)(rule_locs))[priority++] = i;
+#endif
+ i++;
+ }
+ err = 0;
+ break;
+ default:
+ err = -EOPNOTSUPP;
+ break;
+ }
+
+ return err;
+}
+
+static int mlx4_en_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *c)
+{
+ int err = 0;
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_ethtool_rxnfc *cmd = (struct mlx4_ethtool_rxnfc *)c;
+
+ if (mdev->dev->caps.steering_mode !=
+ MLX4_STEERING_MODE_DEVICE_MANAGED || !priv->port_up)
+ return -EINVAL;
+
+ switch (cmd->cmd) {
+ case ETHTOOL_SRXCLSRLINS:
+ err = mlx4_en_flow_replace(dev, cmd);
+ break;
+ case ETHTOOL_SRXCLSRLDEL:
+ err = mlx4_en_flow_detach(dev, cmd);
+ break;
+ default:
+ en_warn(priv, "Unsupported ethtool command. (%d)\n", cmd->cmd);
+ return -EINVAL;
+ }
+
+ return err;
+}
+
+#ifndef CONFIG_SYSFS_NUM_CHANNELS
+static
+#endif
+void mlx4_en_get_channels(struct net_device *dev,
+ struct ethtool_channels *channel)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+
+ memset(channel, 0, sizeof(*channel));
+
+ channel->max_rx = MAX_RX_RINGS;
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ channel->max_tx = MLX4_EN_MAX_TX_RING_P_UP;
+#else
+ channel->max_tx = MLX4_EN_NUM_TX_RINGS * 2;
+#endif
+
+ channel->rx_count = priv->rx_ring_num;
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ channel->tx_count = priv->tx_ring_num / MLX4_EN_NUM_UP;
+#else
+ channel->tx_count = priv->tx_ring_num -
+ (!!priv->prof->rx_ppp) * MLX4_EN_NUM_PPP_RINGS;
+#endif
+}
+
+#ifndef CONFIG_SYSFS_NUM_CHANNELS
+static
+#endif
+int mlx4_en_set_channels(struct net_device *dev,
+ struct ethtool_channels *channel)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int port_up = 0;
+ int err = 0;
+
+ if (channel->other_count || channel->combined_count ||
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ channel->tx_count > MLX4_EN_MAX_TX_RING_P_UP ||
+#else
+ channel->tx_count > MLX4_EN_NUM_TX_RINGS * 2 ||
+#endif
+ channel->rx_count > MAX_RX_RINGS ||
+ !channel->tx_count || !channel->rx_count)
+ return -EINVAL;
+
+ mutex_lock(&mdev->state_lock);
+ if (priv->port_up) {
+ port_up = 1;
+ mlx4_en_stop_port(dev, 1);
+ }
+
+ mlx4_en_free_resources(priv);
+
+ priv->num_tx_rings_p_up = channel->tx_count;
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ priv->tx_ring_num = channel->tx_count * MLX4_EN_NUM_UP;
+#else
+ priv->tx_ring_num = channel->tx_count +
+ (!!priv->prof->rx_ppp) * MLX4_EN_NUM_PPP_RINGS;
+#endif
+ priv->rx_ring_num = channel->rx_count;
+
+ err = mlx4_en_alloc_resources(priv);
+ if (err) {
+ en_err(priv, "Failed reallocating port resources\n");
+ goto out;
+ }
+
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ netif_set_real_num_tx_queues(dev, priv->tx_ring_num);
+#else
+ dev->real_num_tx_queues = priv->tx_ring_num;
+#endif
+ netif_set_real_num_rx_queues(dev, priv->rx_ring_num);
+
+#ifdef HAVE_NEW_TX_RING_SCHEME
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,39))
+ if (dev->num_tc)
+#else
+ if (netdev_get_num_tc(dev))
+#endif
+ mlx4_en_setup_tc(dev, MLX4_EN_NUM_UP);
+#endif
+
+ en_warn(priv, "Using %d TX rings\n", priv->tx_ring_num);
+ en_warn(priv, "Using %d RX rings\n", priv->rx_ring_num);
+
+ if (port_up) {
+ err = mlx4_en_start_port(dev);
+ if (err)
+ en_err(priv, "Failed starting port\n");
+ }
+
+ err = mlx4_en_moderation_update(priv);
+
+out:
+ mutex_unlock(&mdev->state_lock);
+ return err;
+}
+
+#if defined(HAVE_GET_TS_INFO) || defined(HAVE_GET_TS_INFO_EXT)
+static int mlx4_en_get_ts_info(struct net_device *dev,
+ struct ethtool_ts_info *info)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int ret;
+
+ ret = ethtool_op_get_ts_info(dev, info);
+ if (ret)
+ return ret;
+
+ if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_TS) {
+ info->so_timestamping |=
+ SOF_TIMESTAMPING_TX_HARDWARE |
+ SOF_TIMESTAMPING_RX_HARDWARE |
+ SOF_TIMESTAMPING_RAW_HARDWARE;
+
+ info->tx_types =
+ (1 << HWTSTAMP_TX_OFF) |
+ (1 << HWTSTAMP_TX_ON);
+
+ info->rx_filters =
+ (1 << HWTSTAMP_FILTER_NONE) |
+ (1 << HWTSTAMP_FILTER_ALL);
+
+#if defined (HAVE_PTP_CLOCK_INFO) && (defined (CONFIG_PTP_1588_CLOCK) || defined(CONFIG_PTP_1588_CLOCK_MODULE))
+ if (mdev->ptp_clock)
+ info->phc_index = ptp_clock_index(mdev->ptp_clock);
+#endif
+ }
+
+ return ret;
+}
+#endif
+
+#if (!defined(HAVE_NETDEV_HW_FEATURES) && !defined(HAVE_NET_DEVICE_OPS_EXT))
+int mlx4_en_set_flags(struct net_device *dev, u32 data)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+
+ if (DEV_FEATURE_CHANGED(dev, data, NETIF_F_HW_VLAN_CTAG_RX)) {
+ en_info(priv, "Turn %s RX vlan strip offload\n",
+ (data & NETIF_F_HW_VLAN_CTAG_RX) ? "ON" : "OFF");
+
+ if (data & NETIF_F_HW_VLAN_CTAG_RX)
+ priv->hwtstamp_config.flags |= NETIF_F_HW_VLAN_CTAG_RX;
+ else
+ priv->hwtstamp_config.flags &= ~NETIF_F_HW_VLAN_CTAG_RX;
+
+ mlx4_en_reset_config(dev, priv->hwtstamp_config, data);
+ }
+
+ if (DEV_FEATURE_CHANGED(dev, data, NETIF_F_HW_VLAN_CTAG_TX)) {
+ en_info(priv, "Turn %s TX vlan strip offload\n",
+ (data & NETIF_F_HW_VLAN_CTAG_TX) ? "ON" : "OFF");
+
+ if (data & NETIF_F_HW_VLAN_CTAG_TX)
+ dev->features |= NETIF_F_HW_VLAN_CTAG_TX;
+ else
+ dev->features &= ~NETIF_F_HW_VLAN_CTAG_TX;
+ }
+
+ if (data & ETH_FLAG_LRO)
+ dev->features |= NETIF_F_LRO;
+ else
+ dev->features &= ~NETIF_F_LRO;
+
+ return 0;
+}
+
+u32 mlx4_en_get_flags(struct net_device *dev)
+{
+ return ethtool_op_get_flags(dev) |
+ (dev->features & NETIF_F_HW_VLAN_CTAG_RX) |
+ (dev->features & NETIF_F_HW_VLAN_CTAG_TX);
+}
+#endif
+
+static int mlx4_en_set_priv_flags(struct net_device *dev, u32 flags)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ bool bf_enabled_new = !!(flags & MLX4_EN_PRIV_FLAGS_BLUEFLAME);
+ bool bf_enabled_old = !!(priv->pflags & MLX4_EN_PRIV_FLAGS_BLUEFLAME);
+ int i;
+
+ if ((flags ^ priv->pflags) &
+ (MLX4_EN_PRIV_FLAGS_FS_EN_L2 |
+ MLX4_EN_PRIV_FLAGS_FS_EN_IPV4 |
+ MLX4_EN_PRIV_FLAGS_FS_EN_TCP |
+ MLX4_EN_PRIV_FLAGS_FS_EN_UDP))
+ return -EINVAL;
+
+#ifndef CONFIG_COMPAT_DISABLE_DCB
+ if ((flags & MLX4_EN_PRIV_FLAGS_DISABLE_32_14_4_E) &&
+ !(priv->pflags & MLX4_EN_PRIV_FLAGS_DISABLE_32_14_4_E)) {
+#ifndef CONFIG_MLX4_EN_DCB
+ return -EOPNOTSUPP;
+#endif
+ if (mlx4_disable_32_14_4_e_write(mdev->dev, 1, priv->port)) {
+ en_err(priv, "Failed configure QCN parameter\n");
+ } else {
+ priv->pflags |= MLX4_EN_PRIV_FLAGS_DISABLE_32_14_4_E;
+ }
+
+ } else if (!(flags & MLX4_EN_PRIV_FLAGS_DISABLE_32_14_4_E) &&
+ (priv->pflags & MLX4_EN_PRIV_FLAGS_DISABLE_32_14_4_E)) {
+#ifndef CONFIG_MLX4_EN_DCB
+ return -EOPNOTSUPP;
+#endif
+ if (mlx4_disable_32_14_4_e_write(mdev->dev, 0, priv->port)) {
+ en_err(priv, "Failed configure QCN parameter\n");
+ } else {
+ priv->pflags &= ~MLX4_EN_PRIV_FLAGS_DISABLE_32_14_4_E;
+ }
+ }
+#endif
+
+ if ((flags ^ priv->pflags) & MLX4_EN_PRIV_FLAGS_RXFCS) {
+ int err = 0;
+ bool port_up = false;
+ u8 rxfcs_value = (flags & MLX4_EN_PRIV_FLAGS_RXFCS) ? 1 : 0;
+
+#ifdef HAVE_NETIF_F_RXFCS
+ return -EOPNOTSUPP;
+#endif
+
+ if (!(mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_IGNORE_FCS)
+ || mlx4_is_mfunc(mdev->dev))
+ return -EOPNOTSUPP;
+
+ en_info(priv, "Turn %s RX-FCS\n", rxfcs_value ? "ON" : "OFF");
+
+ if (rxfcs_value)
+ priv->pflags |= MLX4_EN_PRIV_FLAGS_RXFCS;
+ else
+ priv->pflags &= ~MLX4_EN_PRIV_FLAGS_RXFCS;
+
+ mutex_lock(&mdev->state_lock);
+ if (priv->port_up) {
+ port_up = true;
+ en_warn(priv,
+ "Port link mode changed, restarting port...\n");
+ mlx4_en_stop_port(dev, 1);
+ }
+ if (port_up) {
+ err = mlx4_en_start_port(dev);
+ if (err)
+ en_err(priv, "Failed restarting port %d\n",
+ priv->port);
+ }
+ mutex_unlock(&mdev->state_lock);
+
+ if (err)
+ return err;
+ }
+
+ if ((flags ^ priv->pflags) & MLX4_EN_PRIV_FLAGS_RXALL) {
+ int ret = 0;
+ u8 rxall_value = (flags & MLX4_EN_PRIV_FLAGS_RXALL) ? 1 : 0;
+
+#ifdef HAVE_NETIF_F_RXALL
+ return -EOPNOTSUPP;
+#endif
+
+ if (!(mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_IGNORE_FCS)
+ || mlx4_is_mfunc(mdev->dev))
+ return -EOPNOTSUPP;
+
+ en_info(priv, "Turn %s RX-ALL\n", rxall_value ? "ON" : "OFF");
+
+ if (rxall_value)
+ priv->pflags |= MLX4_EN_PRIV_FLAGS_RXALL;
+ else
+ priv->pflags &= ~MLX4_EN_PRIV_FLAGS_RXALL;
+
+ ret = mlx4_SET_PORT_fcs_check(mdev->dev,
+ priv->port, rxall_value);
+ if (ret)
+ return ret;
+ }
+
+ if ((flags ^ priv->pflags) & MLX4_EN_PRIV_FLAGS_INLINE_SCATTER) {
+ int ret = 0;
+ u8 rx_copy_value = !!(flags &
+ MLX4_EN_PRIV_FLAGS_INLINE_SCATTER);
+
+ if (rx_copy_value)
+ priv->pflags |= MLX4_EN_PRIV_FLAGS_INLINE_SCATTER;
+ else
+ priv->pflags &= ~MLX4_EN_PRIV_FLAGS_INLINE_SCATTER;
+
+ ret = mlx4_en_change_inline_scatter_thold(dev,
+ MAX_INLINE_SCATTER *
+ rx_copy_value);
+
+ if (ret)
+ return ret;
+ }
+
+ if (bf_enabled_new == bf_enabled_old)
+ return 0; /* Nothing to do */
+
+ if (bf_enabled_new) {
+ bool bf_supported = true;
+
+ for (i = 0; i < priv->tx_ring_num; i++)
+ bf_supported &= priv->tx_ring[i]->bf_alloced;
+
+ if (!bf_supported) {
+ en_err(priv, "BlueFlame is not supported\n");
+ return -EINVAL;
+ }
+
+ priv->pflags |= MLX4_EN_PRIV_FLAGS_BLUEFLAME;
+ } else {
+ priv->pflags &= ~MLX4_EN_PRIV_FLAGS_BLUEFLAME;
+ }
+
+ for (i = 0; i < priv->tx_ring_num; i++)
+ priv->tx_ring[i]->bf_enabled = bf_enabled_new;
+
+ en_info(priv, "BlueFlame %s\n",
+ bf_enabled_new ? "Enabled" : "Disabled");
+
+ return !(flags == priv->pflags);
+}
+
+static u32 mlx4_en_get_priv_flags(struct net_device *dev)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+
+ return priv->pflags;
+}
+
+#ifdef HAVE_GET_SET_TUNABLE
+static int mlx4_en_get_tunable(struct net_device *dev,
+ const struct ethtool_tunable *tuna,
+ void *data)
+{
+ const struct mlx4_en_priv *priv = netdev_priv(dev);
+ int ret = 0;
+
+ switch (tuna->id) {
+ case ETHTOOL_TX_COPYBREAK:
+ *(u32 *)data = priv->prof->inline_thold;
+ break;
+ case ETHTOOL_RX_COPYBREAK:
+ *(u32 *)data = priv->prof->inline_scatter_thold;
+ break;
+ default:
+ ret = -EINVAL;
+ break;
+ }
+
+ return ret;
+}
+
+static int mlx4_en_set_tunable(struct net_device *dev,
+ const struct ethtool_tunable *tuna,
+ const void *data)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ int val;
+ int ret = 0;
+
+ switch (tuna->id) {
+ case ETHTOOL_TX_COPYBREAK:
+ val = *(u32 *)data;
+ if (val < MIN_PKT_LEN || val > MAX_INLINE)
+ ret = -EINVAL;
+ else
+ priv->prof->inline_thold = val;
+ break;
+ case ETHTOOL_RX_COPYBREAK:
+ val = *(u32 *)data;
+ ret = mlx4_en_change_inline_scatter_thold(dev, val);
+ break;
+ default:
+ ret = -EINVAL;
+ break;
+ }
+
+ return ret;
+}
+#endif
+
+#ifdef HAVE_GET_MODULE_EEPROM
+static int mlx4_en_get_module_info(struct net_device *dev,
+ struct ethtool_modinfo *modinfo)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int ret;
+ u8 data[4];
+
+ /* Read first 2 bytes to get Module & REV ID */
+ ret = mlx4_get_module_info(mdev->dev, priv->port,
+ 0/*offset*/, 2/*size*/, data);
+ if (ret < 2)
+ return -EIO;
+
+ switch (data[0] /* identifier */) {
+ case MLX4_MODULE_ID_QSFP:
+ modinfo->type = ETH_MODULE_SFF_8436;
+ modinfo->eeprom_len = ETH_MODULE_SFF_8436_LEN;
+ break;
+ case MLX4_MODULE_ID_QSFP_PLUS:
+ if (data[1] >= 0x3) { /* revision id */
+ modinfo->type = ETH_MODULE_SFF_8636;
+ modinfo->eeprom_len = ETH_MODULE_SFF_8636_LEN;
+ } else {
+ modinfo->type = ETH_MODULE_SFF_8436;
+ modinfo->eeprom_len = ETH_MODULE_SFF_8436_LEN;
+ }
+ break;
+ case MLX4_MODULE_ID_QSFP28:
+ modinfo->type = ETH_MODULE_SFF_8636;
+ modinfo->eeprom_len = ETH_MODULE_SFF_8636_LEN;
+ break;
+ case MLX4_MODULE_ID_SFP:
+ modinfo->type = ETH_MODULE_SFF_8472;
+ modinfo->eeprom_len = ETH_MODULE_SFF_8472_LEN;
+ break;
+ default:
+ return -ENOSYS;
+ }
+
+ return 0;
+}
+#endif
+
+#ifdef HAVE_GET_MODULE_EEPROM
+static int mlx4_en_get_module_eeprom(struct net_device *dev,
+ struct ethtool_eeprom *ee,
+ u8 *data)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int offset = ee->offset;
+ int i = 0, ret;
+
+ if (ee->len == 0)
+ return -EINVAL;
+
+ memset(data, 0, ee->len);
+
+ while (i < ee->len) {
+ en_dbg(DRV, priv,
+ "mlx4_get_module_info i(%d) offset(%d) len(%d)\n",
+ i, offset, ee->len - i);
+
+ ret = mlx4_get_module_info(mdev->dev, priv->port,
+ offset, ee->len - i, data + i);
+
+ if (!ret) /* Done reading */
+ return 0;
+
+ if (ret < 0) {
+ en_err(priv,
+ "mlx4_get_module_info i(%d) offset(%d) bytes_to_read(%d) - FAILED (0x%x)\n",
+ i, offset, ee->len - i, ret);
+ return 0;
+ }
+
+ i += ret;
+ offset += ret;
+ }
+ return 0;
+}
+#endif
+
+#if defined(HAVE_SET_PHYS_ID) || defined(HAVE_SET_PHYS_ID_EXT)
+static int mlx4_en_set_phys_id(struct net_device *dev,
+ enum ethtool_phys_id_state state)
+{
+ int err;
+ u16 beacon_duration;
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+
+ if (!(mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_PORT_BEACON))
+ return -EOPNOTSUPP;
+
+ switch (state) {
+ case ETHTOOL_ID_ACTIVE:
+ beacon_duration = PORT_BEACON_MAX_LIMIT;
+ break;
+ case ETHTOOL_ID_INACTIVE:
+ beacon_duration = 0;
+ break;
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ err = mlx4_SET_PORT_BEACON(mdev->dev, priv->port, beacon_duration);
+ return err;
+}
+#endif
+
+const struct ethtool_ops mlx4_en_ethtool_ops = {
+ .get_drvinfo = mlx4_en_get_drvinfo,
+ .get_settings = mlx4_en_get_settings,
+ .set_settings = mlx4_en_set_settings,
+#if (!defined(HAVE_NETDEV_HW_FEATURES) && !defined(HAVE_NET_DEVICE_OPS_EXT))
+#ifdef NETIF_F_TSO
+ .get_tso = mlx4_en_get_tso,
+ .set_tso = mlx4_en_set_tso,
+#endif
+ .get_sg = ethtool_op_get_sg,
+ .set_sg = ethtool_op_set_sg,
+ .get_rx_csum = mlx4_en_get_rx_csum,
+ .set_rx_csum = mlx4_en_set_rx_csum,
+ .get_tx_csum = ethtool_op_get_tx_csum,
+ .set_tx_csum = ethtool_op_set_tx_ipv6_csum,
+#endif
+ .get_link = ethtool_op_get_link,
+ .get_strings = mlx4_en_get_strings,
+ .get_sset_count = mlx4_en_get_sset_count,
+ .get_ethtool_stats = mlx4_en_get_ethtool_stats,
+ .self_test = mlx4_en_self_test,
+#if defined(HAVE_SET_PHYS_ID) && !defined(HAVE_SET_PHYS_ID_EXT)
+ .set_phys_id = mlx4_en_set_phys_id,
+#endif
+ .get_wol = mlx4_en_get_wol,
+ .set_wol = mlx4_en_set_wol,
+ .get_msglevel = mlx4_en_get_msglevel,
+ .set_msglevel = mlx4_en_set_msglevel,
+ .get_coalesce = mlx4_en_get_coalesce,
+ .set_coalesce = mlx4_en_set_coalesce,
+ .get_pauseparam = mlx4_en_get_pauseparam,
+ .set_pauseparam = mlx4_en_set_pauseparam,
+ .get_ringparam = mlx4_en_get_ringparam,
+ .set_ringparam = mlx4_en_set_ringparam,
+#if (!defined(HAVE_NETDEV_HW_FEATURES) && !defined(HAVE_NET_DEVICE_OPS_EXT))
+ .get_flags = mlx4_en_get_flags,
+ .set_flags = mlx4_en_set_flags,
+#endif
+ .get_rxnfc = mlx4_en_get_rxnfc,
+ .set_rxnfc = mlx4_en_set_rxnfc,
+#if defined(HAVE_GET_SET_RXFH) && !defined(HAVE_GET_SET_RXFH_INDIR_EXT)
+ .get_rxfh_indir_size = mlx4_en_get_rxfh_indir_size,
+ .get_rxfh_key_size = mlx4_en_get_rxfh_key_size,
+ .get_rxfh = mlx4_en_get_rxfh,
+ .set_rxfh = mlx4_en_set_rxfh,
+#endif
+#ifdef HAVE_GET_SET_CHANNELS
+ .get_channels = mlx4_en_get_channels,
+ .set_channels = mlx4_en_set_channels,
+#endif
+#if defined(HAVE_GET_TS_INFO) && !defined(HAVE_GET_TS_INFO_EXT)
+ .get_ts_info = mlx4_en_get_ts_info,
+#endif
+ .set_priv_flags = mlx4_en_set_priv_flags,
+ .get_priv_flags = mlx4_en_get_priv_flags,
+#ifdef HAVE_GET_SET_TUNABLE
+ .get_tunable = mlx4_en_get_tunable,
+ .set_tunable = mlx4_en_set_tunable,
+#endif
+#ifdef HAVE_GET_MODULE_EEPROM
+ .get_module_info = mlx4_en_get_module_info,
+ .get_module_eeprom = mlx4_en_get_module_eeprom
+#endif
+};
+
+#ifdef HAVE_ETHTOOL_OPS_EXT
+const struct ethtool_ops_ext mlx4_en_ethtool_ops_ext = {
+ .size = sizeof(struct ethtool_ops_ext),
+ .get_rxfh_indir_size = mlx4_en_get_rxfh_indir_size,
+#ifdef HAVE_GET_SET_RXFH_INDIR_EXT
+ .get_rxfh_indir = mlx4_en_get_rxfh_indir,
+ .set_rxfh_indir = mlx4_en_set_rxfh_indir,
+#endif
+#ifdef HAVE_GET_SET_CHANNELS_EXT
+ .get_channels = mlx4_en_get_channels,
+ .set_channels = mlx4_en_set_channels,
+#endif
+#ifdef HAVE_GET_TS_INFO_EXT
+ .get_ts_info = mlx4_en_get_ts_info,
+#endif
+#ifdef HAVE_SET_PHYS_ID_EXT
+ .set_phys_id = mlx4_en_set_phys_id,
+#endif
+};
+#endif
+
+
+
+#endif //kmod
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/en_main.c b/drivers/net/mlnx_uio/mlnx/mlx4/en_main.c
new file mode 100644
index 0000000..6f95caa
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/en_main.c
@@ -0,0 +1,493 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2007 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+
+
+#include "mlx4_en.h"
+#include "mlx4_uio.h"
+#include "mlx4_uio_helper.h"
+
+MODULE_AUTHOR("Liran Liss, Yevgeny Petrilin");
+MODULE_DESCRIPTION("Mellanox ConnectX HCA Ethernet driver");
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_VERSION(DRV_VERSION " ("DRV_RELDATE")");
+
+static const char mlx4_en_version[] =
+ DRV_NAME ": Mellanox ConnectX HCA Ethernet driver v"
+ DRV_VERSION " (" DRV_RELDATE ")\n";
+
+#define MLX4_EN_PARM_INT(X, def_val, desc) \
+ static unsigned int X = def_val;\
+ module_param(X , int, 0444); \
+ MODULE_PARM_DESC(X, desc);
+
+
+/*
+ * Device scope module parameters
+ */
+
+/* Enable RSS UDP traffic */
+MLX4_EN_PARM_INT(udp_rss, 1,
+ "Enable RSS for incoming UDP traffic or disabled (0)");
+
+/* Priority pausing */
+MLX4_EN_PARM_INT(pfctx, 0, "Priority based Flow Control policy on TX[7:0]."
+ " Per priority bit mask");
+MLX4_EN_PARM_INT(pfcrx, 0, "Priority based Flow Control policy on RX[7:0]."
+ " Per priority bit mask");
+
+MLX4_EN_PARM_INT(inline_thold, MAX_INLINE,
+ "Threshold for using inline data (range: 17-104, default: 104)");
+
+#define MAX_PFC_TX 0xff
+#define MAX_PFC_RX 0xff
+
+#if defined(HAVE_VA_FORMAT) && !defined(CONFIG_X86_XEN)
+void en_print(const char *level, const struct mlx4_en_priv *priv,
+ const char *format, ...)
+{
+ va_list args;
+ struct va_format vaf;
+
+ va_start(args, format);
+
+ vaf.fmt = format;
+ vaf.va = &args;
+ if (priv->registered)
+ printk("%s%s: %s: %pV",
+ level, DRV_NAME, priv->dev->name, &vaf);
+ else
+ printk("%s%s: %s: Port %d: %pV",
+ level, DRV_NAME, dev_name(&priv->mdev->pdev->dev),
+ priv->port, &vaf);
+ va_end(args);
+}
+#endif
+
+void mlx4_en_update_loopback_state(struct rte_eth_dev *dev,
+ netdev_features_t features)
+{
+ struct mlx4_en_priv *priv = dev->data->dev_private;
+
+ if (features & NETIF_F_LOOPBACK)
+ priv->ctrl_flags |= cpu_to_be32(MLX4_WQE_CTRL_FORCE_LOOPBACK);
+ else
+ priv->ctrl_flags &= cpu_to_be32(~MLX4_WQE_CTRL_FORCE_LOOPBACK);
+
+ priv->flags &= ~(MLX4_EN_FLAG_RX_FILTER_NEEDED|
+ MLX4_EN_FLAG_ENABLE_HW_LOOPBACK);
+
+ /* Drop the packet if SRIOV is not enabled
+ * and not performing the selftest or flb disabled
+ */
+ if (mlx4_is_mfunc(priv->mdev->dev) &&
+ !(features & NETIF_F_LOOPBACK) && !priv->validate_loopback)
+ priv->flags |= MLX4_EN_FLAG_RX_FILTER_NEEDED;
+
+ /* Set dmac in Tx WQE if we are in SRIOV mode or if loopback selftest
+ * is requested
+ */
+ if (mlx4_is_mfunc(priv->mdev->dev) || priv->validate_loopback)
+ priv->flags |= MLX4_EN_FLAG_ENABLE_HW_LOOPBACK;
+
+ mutex_lock(&priv->mdev->state_lock);
+ if (priv->mdev->dev->caps.flags2 &
+ MLX4_DEV_CAP_FLAG2_UPDATE_QP_SRC_CHECK_LB &&
+ priv->rss_map.indir_qp.qpn) {
+ int i;
+ int err = 0;
+
+ for (i = 0; i < dev->data->nb_rx_queues; i++) {
+ int ret;
+
+ ret = mlx4_en_change_mcast_loopback(priv,
+ &priv->rss_map.qps[i],
+ !!(features &
+ NETIF_F_LOOPBACK));
+ if (!err)
+ err = ret;
+ }
+ if (err)
+ mlx4_warn(priv->mdev, "failed to change mcast loopback\n");
+ }
+ mutex_unlock(&priv->mdev->state_lock);
+}
+
+static int mlx4_en_get_profile(struct mlx4_en_dev *mdev)
+{
+ struct mlx4_en_profile *params = &mdev->profile;
+ int i;
+
+ params->udp_rss = udp_rss;
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ params->num_tx_rings_p_up = mlx4_low_memory_profile() ?
+ MLX4_EN_MIN_TX_RING_P_UP :
+ min_t(int, num_online_cpus(), MLX4_EN_MAX_TX_RING_P_UP);
+#endif
+
+ if (params->udp_rss && !(mdev->dev->caps.flags
+ & MLX4_DEV_CAP_FLAG_UDP_RSS)) {
+ mlx4_warn(mdev, "UDP RSS is not supported on this device\n");
+ params->udp_rss = 0;
+ }
+ for (i = 1; i <= MLX4_MAX_PORTS; i++) {
+ params->prof[i].rx_pause = 1;
+ params->prof[i].rx_ppp = pfcrx;
+ params->prof[i].tx_pause = 1;
+ params->prof[i].tx_ppp = pfctx;
+ params->prof[i].tx_ring_size = MLX4_EN_DEF_TX_RING_SIZE;
+ params->prof[i].rx_ring_size = MLX4_EN_DEF_RX_RING_SIZE;
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ params->prof[i].tx_ring_num = params->num_tx_rings_p_up *
+ MLX4_EN_NUM_UP;
+#else
+ params->prof[i].tx_ring_num = MLX4_EN_NUM_TX_RINGS +
+ (!!pfcrx) * MLX4_EN_NUM_PPP_RINGS;
+#endif
+ params->prof[i].rss_rings = 0;
+ params->prof[i].inline_thold = inline_thold;
+ params->prof[i].inline_scatter_thold = 0;
+ }
+
+ return 0;
+}
+
+static void *mlx4_en_get_rte_eth_dev(struct mlx4_dev *dev, void *ctx, u8 port)
+{
+ struct mlx4_en_dev *endev = ctx;
+
+ return endev->rte_pndev[port];
+}
+
+static void mlx4_en_event(struct mlx4_dev *dev, void *endev_ptr,
+ enum mlx4_dev_event event, unsigned long port)
+{
+ struct mlx4_en_dev *mdev = (struct mlx4_en_dev *) endev_ptr;
+ struct mlx4_en_priv *priv;
+
+ switch (event) {
+ case MLX4_DEV_EVENT_PORT_UP:
+ case MLX4_DEV_EVENT_PORT_DOWN:
+ if (!mdev->rte_pndev[port])
+ return;
+ priv = mdev->rte_pndev[port]->data->dev_private;
+ /* To prevent races, we poll the link state in a separate
+ task rather than changing it here */
+ priv->link_state = event;
+ //queue_work(mdev->workqueue, &priv->linkstate_task);
+ break;
+
+ case MLX4_DEV_EVENT_CATASTROPHIC_ERROR:
+ mlx4_err(mdev, "Internal error detected, restarting device\n");
+ break;
+
+ case MLX4_DEV_EVENT_SLAVE_INIT:
+ case MLX4_DEV_EVENT_SLAVE_SHUTDOWN:
+ break;
+ default:
+ if (port < 1 || port > dev->caps.num_ports ||
+ !mdev->rte_pndev[port])
+ return;
+ mlx4_warn(mdev, "Unhandled event %d for port %d\n", event,
+ (int) port);
+ }
+}
+
+static void mlx4_en_remove(struct mlx4_dev *dev, void *endev_ptr)
+{
+ struct mlx4_en_dev *mdev = endev_ptr;
+ int i;
+
+ mutex_lock(&mdev->state_lock);
+ mdev->device_up = false;
+ mutex_unlock(&mdev->state_lock);
+
+ mlx4_foreach_port(i, dev, MLX4_PORT_TYPE_ETH)
+ if (mdev->rte_pndev[i])
+ {
+ //mlx4_en_destroy_rte_eth_dev(mdev->rte_pndev[i]);
+ assert(0); //XXX
+ }
+
+#if defined (HAVE_PTP_CLOCK_INFO) && (defined (CONFIG_PTP_1588_CLOCK) || defined(CONFIG_PTP_1588_CLOCK_MODULE))
+ if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_TS)
+ mlx4_en_remove_timestamp(mdev);
+#endif
+
+ //flush_workqueue(mdev->workqueue);
+ //destroy_workqueue(mdev->workqueue);
+ (void) mlx4_mr_free(dev, &mdev->mr);
+ //iounmap(mdev->uar_map);
+ mlx4_uar_free(dev, &mdev->priv_uar);
+ mlx4_pd_free(dev, mdev->priv_pdn);
+ //if (mdev->nb.notifier_call)
+// unregister_netdevice_notifier(&mdev->nb);
+ kfree(mdev);
+}
+
+static int mlx4_alloc_rtedev(struct mlx4_dev* dev, struct mlx4_en_dev *mdev, int port, struct mlx4_en_port_profile* prof)
+{
+ struct rte_eth_dev* rte_dev;
+ char ethdev_name[RTE_ETH_NAME_MAX_LEN];
+ snprintf(ethdev_name, sizeof(ethdev_name), "%d:%d.%d:port%d",
+ dev->persist->rte_pdev->addr.bus, dev->persist->rte_pdev->addr.devid,
+ dev->persist->rte_pdev->addr.function, port);
+ rte_dev = rte_eth_dev_allocate(ethdev_name, mlx4_is_slave(dev) ? RTE_ETH_DEV_VIRTUAL : RTE_ETH_DEV_PCI);
+ if (rte_dev == NULL)
+ return -ENOMEM;
+
+ rte_dev->pci_dev = dev->persist->rte_pdev;
+ rte_dev->data->rx_mbuf_alloc_failed = 0;
+
+ /*
+ * Initialize driver private data
+ */
+
+ rte_dev->data->dev_private = rte_zmalloc("mlx4_private", sizeof(struct mlx4_en_priv), RTE_CACHE_LINE_SIZE);
+ struct mlx4_en_priv* priv = rte_dev->data->dev_private;
+ priv->prof = prof;
+ priv->port = port;
+ priv->mdev = mdev;
+ rte_dev->data->mtu = ETHER_MTU;
+ TAILQ_INIT(&(rte_dev->link_intr_cbs));
+
+ rte_dev->dev_ops = &mlx4_eth_dev_ops;
+ rte_dev->rx_pkt_burst = &mlx4_recv_pkts;
+ rte_dev->tx_pkt_burst = &mlx4_xmit_pkts;
+
+ /* Set default MAC */
+ rte_dev->data->mac_addrs = rte_zmalloc_socket("mlx4_mac",
+ sizeof(struct ether_addr) * (1 << mdev->dev->caps.log_num_macs),
+ RTE_CACHE_LINE_SIZE, rte_dev->pci_dev->numa_node);
+ mlx4_en_u64_to_mac(rte_dev->data->mac_addrs[0].addr_bytes, mdev->dev->caps.def_mac[priv->port]);
+ if (!is_valid_ether_addr(rte_dev->data->mac_addrs[0].addr_bytes)) {
+ if (mlx4_is_slave(priv->mdev->dev)) {
+ u64 mac_u64 = rte_rand();
+ mdev->dev->caps.def_mac[priv->port] = mac_u64;
+ mlx4_en_u64_to_mac(rte_dev->data->mac_addrs[0].addr_bytes, mdev->dev->caps.def_mac[priv->port]);
+
+ en_warn(priv, "Assigned random MAC address %pM\n", rte_dev->data->mac_addrs[0].addr_bytes);
+
+ } else {
+ en_err(priv, "Port: %d, invalid mac burned: %pM, quitting\n",
+ priv->port, rte_dev->data->mac_addrs[0].addr_bytes);
+ return -EINVAL;
+ }
+ }
+
+ memcpy(priv->current_mac, rte_dev->data->mac_addrs[0].addr_bytes, sizeof(priv->current_mac));
+
+ return 0;
+}
+
+static void mlx4_en_activate(struct mlx4_dev *dev, void *ctx)
+{
+ int i;
+ struct mlx4_en_dev *mdev = ctx;
+
+ /* Create a netdev for each port */
+ mlx4_foreach_port(i, dev, MLX4_PORT_TYPE_ETH) {
+ mlx4_info(mdev, "Activating port:%d\n", i);
+
+ if (mlx4_alloc_rtedev(dev, mdev, i, &mdev->profile.prof[i]))
+ mdev->rte_pndev[i] = NULL;
+ }
+
+#ifdef HAVE_NETDEV_BONDING_INFO
+ /* register notifier */
+ mdev->nb.notifier_call = mlx4_en_netdev_event;
+ if (register_netdevice_notifier(&mdev->nb)) {
+ mdev->nb.notifier_call = NULL;
+ mlx4_err(mdev, "Failed to create notifier\n");
+ }
+#endif
+}
+
+static void *mlx4_en_add(struct mlx4_dev *dev)
+{
+ struct mlx4_en_dev *mdev;
+ int i;
+
+ printk_once(KERN_INFO "%s", mlx4_en_version);
+
+ mdev = kzalloc(sizeof(*mdev), GFP_KERNEL);
+ if (!mdev)
+ goto err_free_res;
+
+ if (mlx4_pd_alloc(dev, &mdev->priv_pdn))
+ goto err_free_dev;
+
+ if (mlx4_uar_alloc(dev, &mdev->priv_uar))
+ goto err_pd;
+
+ //mdev->uar_map = ioremap((phys_addr_t) mdev->priv_uar.pfn << PAGE_SHIFT,
+ // PAGE_SIZE);
+ mdev->uar_map = mdev->priv_uar.pfn_addr;
+
+ if (!mdev->uar_map)
+ goto err_uar;
+ spin_lock_init(&mdev->uar_lock);
+
+ mdev->dev = dev;
+ //mdev->dma_device = &dev->persist->pdev->dev;
+ mdev->rte_pdev = dev->persist->rte_pdev;
+ mdev->device_up = false;
+
+ mdev->LSO_support = !!(dev->caps.flags & (1 << 15));
+ if (!mdev->LSO_support)
+ mlx4_warn(mdev, "LSO not supported, please upgrade to later FW version to enable LSO\n");
+
+ if (mlx4_mr_alloc(mdev->dev, mdev->priv_pdn, 0, ~0ull,
+ MLX4_PERM_LOCAL_WRITE | MLX4_PERM_LOCAL_READ,
+ 0, 0, &mdev->mr)) {
+ mlx4_err(mdev, "Failed allocating memory region\n");
+ goto err_map;
+ }
+ if (mlx4_mr_enable(mdev->dev, &mdev->mr)) {
+ mlx4_err(mdev, "Failed enabling memory region\n");
+ goto err_mr;
+ }
+
+ /* Build device profile according to supplied module parameters */
+ if (mlx4_en_get_profile(mdev)) {
+ mlx4_err(mdev, "Bad module parameters, aborting\n");
+ goto err_mr;
+ }
+
+ /* Configure which ports to start according to module parameters */
+ mdev->port_cnt = 0;
+ mlx4_foreach_port(i, dev, MLX4_PORT_TYPE_ETH)
+ mdev->port_cnt++;
+
+ /* Initialize time stamp mechanism */
+ //if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_TS)
+ // mlx4_en_init_timestamp(mdev);
+
+ /* Set default number of RX rings*/
+ //mlx4_en_set_num_rx_rings(mdev);
+
+ /* Create our own workqueue for reset/multicast tasks
+ * Note: we cannot use the shared workqueue because of deadlocks caused
+ * by the rtnl lock */
+ //mdev->workqueue = create_singlethread_workqueue("mlx4_en");
+ //if (!mdev->workqueue)
+ // goto err_mr;
+
+ /* At this stage all non-port specific tasks are complete:
+ * mark the card state as up */
+ mutex_init(&mdev->state_lock);
+ mdev->device_up = true;
+
+ return mdev;
+
+err_mr:
+ (void) mlx4_mr_free(dev, &mdev->mr);
+err_map:
+ //if (mdev->uar_map)
+ //iounmap(mdev->uar_map);
+err_uar:
+ mlx4_uar_free(dev, &mdev->priv_uar);
+err_pd:
+ mlx4_pd_free(dev, mdev->priv_pdn);
+err_free_dev:
+ kfree(mdev);
+err_free_res:
+ return NULL;
+}
+
+static struct mlx4_interface mlx4_en_interface = {
+ .add = mlx4_en_add,
+ .remove = mlx4_en_remove,
+ .event = mlx4_en_event,
+ .get_dev = mlx4_en_get_rte_eth_dev,
+ .protocol = MLX4_PROT_ETH,
+ .activate = mlx4_en_activate,
+};
+
+static void mlx4_en_verify_params(void)
+{
+ if (pfctx > MAX_PFC_TX) {
+ pr_warn("mlx4_en: WARNING: illegal module parameter pfctx 0x%x - should be in range 0-0x%x, will be changed to default (0)\n",
+ pfctx, MAX_PFC_TX);
+ pfctx = 0;
+ }
+
+ if (pfcrx > MAX_PFC_RX) {
+ pr_warn("mlx4_en: WARNING: illegal module parameter pfcrx 0x%x - should be in range 0-0x%x, will be changed to default (0)\n",
+ pfcrx, MAX_PFC_RX);
+ pfcrx = 0;
+ }
+
+ if (inline_thold < MIN_PKT_LEN || inline_thold > MAX_INLINE) {
+ pr_warn("mlx4_en: WARNING: illegal module parameter inline_thold %d - should be in range %d-%d, will be changed to default (%d)\n",
+ inline_thold, MIN_PKT_LEN, MAX_INLINE, MAX_INLINE);
+ inline_thold = MAX_INLINE;
+ }
+}
+
+
+static int mlx4_en_init(const char *name, const char *args)
+{
+ inline_thold = 0;
+
+ mlx4_en_verify_params();
+
+ return mlx4_register_interface(&mlx4_en_interface);
+}
+
+
+static int mlx4_en_cleanup(const char *name)
+{
+ mlx4_unregister_interface(&mlx4_en_interface);
+ return 0;
+}
+
+#ifdef KMOD_DISABLED
+
+module_init(mlx4_en_init);
+module_exit(mlx4_en_cleanup);
+#endif
+
+
+static struct rte_driver mlx4_en = {
+ .type = PMD_PDEV,
+ .init = mlx4_en_init,
+ .uninit = mlx4_en_cleanup,
+};
+
+
+PMD_REGISTER_DRIVER(mlx4_en);
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/en_netdev.c b/drivers/net/mlnx_uio/mlnx/mlx4/en_netdev.c
new file mode 100644
index 0000000..0334420
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/en_netdev.c
@@ -0,0 +1,3786 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+#include "log2.h"
+#include <linux/version.h>
+/*
+ * Copyright (c) 2007 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#ifdef KMOD_REMOVED
+
+#ifdef CONFIG_NET_RX_BUSY_POLL
+#endif
+#ifdef HAVE_VXLAN_ENABLED
+#ifdef HAVE_VXLAN_DYNAMIC_PORT
+#endif
+#endif
+
+
+#include "mlx4_en.h"
+#include "en_port.h"
+
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ #define INIT_OWNER_BIT cpu_to_be32(1 << 30)
+#else
+ #define INIT_OWNER_BIT 0xffffffff
+#endif
+
+#ifdef HAVE_NEW_TX_RING_SCHEME
+int mlx4_en_setup_tc(struct net_device *dev, u8 up)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ int i;
+ unsigned int offset = 0;
+
+ if (up && up != MLX4_EN_NUM_UP)
+ return -EINVAL;
+
+ netdev_set_num_tc(dev, up);
+
+ /* Partition Tx queues evenly amongst UP's */
+ for (i = 0; i < up; i++) {
+ netdev_set_tc_queue(dev, i, priv->num_tx_rings_p_up, offset);
+ offset += priv->num_tx_rings_p_up;
+ }
+
+ return 0;
+}
+#endif
+
+#ifdef CONFIG_NET_RX_BUSY_POLL
+/* must be called with local_bh_disable()d */
+static int mlx4_en_low_latency_recv(struct napi_struct *napi)
+{
+ struct mlx4_en_cq *cq = container_of(napi, struct mlx4_en_cq, napi);
+ struct net_device *dev = cq->dev;
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_rx_ring *rx_ring = priv->rx_ring[cq->ring];
+ int done;
+
+ if (!priv->port_up)
+ return LL_FLUSH_FAILED;
+
+ if (!mlx4_en_cq_lock_poll(cq))
+ return LL_FLUSH_BUSY;
+
+ done = mlx4_en_process_rx_cq(dev, cq, 4);
+ if (likely(done))
+ rx_ring->cleaned += done;
+ else
+ rx_ring->misses++;
+
+ mlx4_en_cq_unlock_poll(cq);
+
+ return done;
+}
+#endif /* CONFIG_NET_RX_BUSY_POLL */
+
+#ifdef CONFIG_RFS_ACCEL
+
+#ifdef HAVE_NDO_RX_FLOW_STEER
+struct mlx4_en_filter {
+ struct list_head next;
+ struct work_struct work;
+
+ u8 ip_proto;
+ __be32 src_ip;
+ __be32 dst_ip;
+ __be16 src_port;
+ __be16 dst_port;
+
+ int rxq_index;
+ struct mlx4_en_priv *priv;
+ u32 flow_id; /* RFS infrastructure id */
+ int id; /* mlx4_en driver id */
+ u64 reg_id; /* Flow steering API id */
+ u8 activated; /* Used to prevent expiry before filter
+ * is attached
+ */
+ struct hlist_node filter_chain;
+};
+
+static void mlx4_en_filter_rfs_expire(struct mlx4_en_priv *priv);
+
+static enum mlx4_net_trans_rule_id mlx4_ip_proto_to_trans_rule_id(u8 ip_proto)
+{
+ switch (ip_proto) {
+ case IPPROTO_UDP:
+ return MLX4_NET_TRANS_RULE_ID_UDP;
+ case IPPROTO_TCP:
+ return MLX4_NET_TRANS_RULE_ID_TCP;
+ default:
+ return MLX4_NET_TRANS_RULE_NUM;
+ }
+};
+
+static void mlx4_en_filter_work(struct work_struct *work)
+{
+ struct mlx4_en_filter *filter = container_of(work,
+ struct mlx4_en_filter,
+ work);
+ struct mlx4_en_priv *priv = filter->priv;
+ struct mlx4_spec_list spec_tcp_udp = {
+ .id = mlx4_ip_proto_to_trans_rule_id(filter->ip_proto),
+ {
+ .tcp_udp = {
+ .dst_port = filter->dst_port,
+ .dst_port_msk = (__force __be16)-1,
+ .src_port = filter->src_port,
+ .src_port_msk = (__force __be16)-1,
+ },
+ },
+ };
+ struct mlx4_spec_list spec_ip = {
+ .id = MLX4_NET_TRANS_RULE_ID_IPV4,
+ {
+ .ipv4 = {
+ .dst_ip = filter->dst_ip,
+ .dst_ip_msk = (__force __be32)-1,
+ .src_ip = filter->src_ip,
+ .src_ip_msk = (__force __be32)-1,
+ },
+ },
+ };
+ struct mlx4_spec_list spec_eth = {
+ .id = MLX4_NET_TRANS_RULE_ID_ETH,
+ };
+ struct mlx4_net_trans_rule rule = {
+ .list = LIST_HEAD_INIT(rule.list),
+ .queue_mode = MLX4_NET_TRANS_Q_LIFO,
+ .exclusive = 1,
+ .allow_loopback = 1,
+ .promisc_mode = MLX4_FS_REGULAR,
+ .port = priv->port,
+ .priority = MLX4_DOMAIN_RFS,
+ };
+ int rc;
+ __be64 mac_mask = cpu_to_be64(MLX4_MAC_MASK << 16);
+
+ if (spec_tcp_udp.id >= MLX4_NET_TRANS_RULE_NUM) {
+ en_warn(priv, "RFS: ignoring unsupported ip protocol (%d)\n",
+ filter->ip_proto);
+ goto ignore;
+ }
+ list_add_tail(&spec_eth.list, &rule.list);
+ list_add_tail(&spec_ip.list, &rule.list);
+ list_add_tail(&spec_tcp_udp.list, &rule.list);
+
+ rule.qpn = priv->rss_map.qps[filter->rxq_index].qpn;
+ memcpy(spec_eth.eth.dst_mac, mlx4_priv_mac_addr(priv), ETH_ALEN);
+ memcpy(spec_eth.eth.dst_mac_msk, &mac_mask, ETH_ALEN);
+
+ filter->activated = 0;
+
+ if (filter->reg_id) {
+ rc = mlx4_flow_detach(priv->mdev->dev, filter->reg_id);
+ if (rc && rc != -ENOENT)
+ en_err(priv, "Error detaching flow. rc = %d\n", rc);
+ }
+
+ rc = mlx4_flow_attach(priv->mdev->dev, &rule, &filter->reg_id);
+ if (rc)
+ en_err(priv, "Error attaching flow. err = %d\n", rc);
+
+ignore:
+ mlx4_en_filter_rfs_expire(priv);
+
+ filter->activated = 1;
+}
+
+static inline struct hlist_head *
+filter_hash_bucket(struct mlx4_en_priv *priv, __be32 src_ip, __be32 dst_ip,
+ __be16 src_port, __be16 dst_port)
+{
+ unsigned long l;
+ int bucket_idx;
+
+ l = (__force unsigned long)src_port |
+ ((__force unsigned long)dst_port << 2);
+ l ^= (__force unsigned long)(src_ip ^ dst_ip);
+
+ bucket_idx = hash_long(l, MLX4_EN_FILTER_HASH_SHIFT);
+
+ return &priv->filter_hash[bucket_idx];
+}
+
+static struct mlx4_en_filter *
+mlx4_en_filter_alloc(struct mlx4_en_priv *priv, int rxq_index, __be32 src_ip,
+ __be32 dst_ip, u8 ip_proto, __be16 src_port,
+ __be16 dst_port, u32 flow_id)
+{
+ struct mlx4_en_filter *filter = NULL;
+
+ filter = kzalloc(sizeof(struct mlx4_en_filter), GFP_ATOMIC);
+ if (!filter)
+ return NULL;
+
+ filter->priv = priv;
+ filter->rxq_index = rxq_index;
+ INIT_WORK(&filter->work, mlx4_en_filter_work);
+
+ filter->src_ip = src_ip;
+ filter->dst_ip = dst_ip;
+ filter->ip_proto = ip_proto;
+ filter->src_port = src_port;
+ filter->dst_port = dst_port;
+
+ filter->flow_id = flow_id;
+
+ filter->id = priv->last_filter_id++ % RPS_NO_FILTER;
+
+ list_add_tail(&filter->next, &priv->filters);
+ hlist_add_head(&filter->filter_chain,
+ filter_hash_bucket(priv, src_ip, dst_ip, src_port,
+ dst_port));
+
+ return filter;
+}
+
+static void mlx4_en_filter_free(struct mlx4_en_filter *filter)
+{
+ struct mlx4_en_priv *priv = filter->priv;
+ int rc;
+
+ list_del(&filter->next);
+
+ rc = mlx4_flow_detach(priv->mdev->dev, filter->reg_id);
+ if (rc && rc != -ENOENT)
+ en_err(priv, "Error detaching flow. rc = %d\n", rc);
+
+ kfree(filter);
+}
+
+static inline struct mlx4_en_filter *
+mlx4_en_filter_find(struct mlx4_en_priv *priv, __be32 src_ip, __be32 dst_ip,
+ u8 ip_proto, __be16 src_port, __be16 dst_port)
+{
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(3,9,0))
+ struct hlist_node *elem;
+#endif
+ struct mlx4_en_filter *filter;
+ struct mlx4_en_filter *ret = NULL;
+
+ hlist_for_each_entry(filter,
+ filter_hash_bucket(priv, src_ip, dst_ip,
+ src_port, dst_port),
+ filter_chain) {
+ if (filter->src_ip == src_ip &&
+ filter->dst_ip == dst_ip &&
+ filter->ip_proto == ip_proto &&
+ filter->src_port == src_port &&
+ filter->dst_port == dst_port) {
+ ret = filter;
+ break;
+ }
+ }
+
+ return ret;
+}
+
+static int
+mlx4_en_filter_rfs(struct net_device *net_dev, const struct sk_buff *skb,
+ u16 rxq_index, u32 flow_id)
+{
+ struct mlx4_en_priv *priv = netdev_priv(net_dev);
+ struct mlx4_en_filter *filter;
+ const struct iphdr *ip;
+ const __be16 *ports;
+ u8 ip_proto;
+ __be32 src_ip;
+ __be32 dst_ip;
+ __be16 src_port;
+ __be16 dst_port;
+ int nhoff = skb_network_offset(skb);
+ int ret = 0;
+
+ if (skb->protocol != htons(ETH_P_IP))
+ return -EPROTONOSUPPORT;
+
+ ip = (const struct iphdr *)(skb->data + nhoff);
+ if (ip_is_fragment(ip))
+ return -EPROTONOSUPPORT;
+
+ if ((ip->protocol != IPPROTO_TCP) && (ip->protocol != IPPROTO_UDP))
+ return -EPROTONOSUPPORT;
+ ports = (const __be16 *)(skb->data + nhoff + 4 * ip->ihl);
+
+ ip_proto = ip->protocol;
+ src_ip = ip->saddr;
+ dst_ip = ip->daddr;
+ src_port = ports[0];
+ dst_port = ports[1];
+
+ spin_lock_bh(&priv->filters_lock);
+ filter = mlx4_en_filter_find(priv, src_ip, dst_ip, ip_proto,
+ src_port, dst_port);
+ if (filter) {
+ if (filter->rxq_index == rxq_index)
+ goto out;
+
+ filter->rxq_index = rxq_index;
+ } else {
+ filter = mlx4_en_filter_alloc(priv, rxq_index,
+ src_ip, dst_ip, ip_proto,
+ src_port, dst_port, flow_id);
+ if (!filter) {
+ ret = -ENOMEM;
+ goto err;
+ }
+ }
+
+ queue_work(priv->mdev->workqueue, &filter->work);
+
+out:
+ ret = filter->id;
+err:
+ spin_unlock_bh(&priv->filters_lock);
+
+ return ret;
+}
+
+void mlx4_en_cleanup_filters(struct mlx4_en_priv *priv)
+{
+ struct mlx4_en_filter *filter, *tmp;
+ LIST_HEAD(del_list);
+
+ spin_lock_bh(&priv->filters_lock);
+ list_for_each_entry_safe(filter, tmp, &priv->filters, next) {
+ list_move(&filter->next, &del_list);
+ hlist_del(&filter->filter_chain);
+ }
+ spin_unlock_bh(&priv->filters_lock);
+
+ list_for_each_entry_safe(filter, tmp, &del_list, next) {
+ cancel_work_sync(&filter->work);
+ mlx4_en_filter_free(filter);
+ }
+}
+
+static void mlx4_en_filter_rfs_expire(struct mlx4_en_priv *priv)
+{
+ struct mlx4_en_filter *filter = NULL, *tmp, *last_filter = NULL;
+ LIST_HEAD(del_list);
+ int i = 0;
+
+ spin_lock_bh(&priv->filters_lock);
+ list_for_each_entry_safe(filter, tmp, &priv->filters, next) {
+ if (i > MLX4_EN_FILTER_EXPIRY_QUOTA)
+ break;
+
+ if (filter->activated &&
+ !work_pending(&filter->work) &&
+ rps_may_expire_flow(priv->dev,
+ filter->rxq_index, filter->flow_id,
+ filter->id)) {
+ list_move(&filter->next, &del_list);
+ hlist_del(&filter->filter_chain);
+ } else
+ last_filter = filter;
+
+ i++;
+ }
+
+ if (last_filter && (&last_filter->next != priv->filters.next))
+ list_move(&priv->filters, &last_filter->next);
+
+ spin_unlock_bh(&priv->filters_lock);
+
+ list_for_each_entry_safe(filter, tmp, &del_list, next)
+ mlx4_en_filter_free(filter);
+}
+#endif
+#endif
+
+#ifdef HAVE_VLAN_GRO_RECEIVE
+static void mlx4_en_vlan_rx_register(struct net_device *dev, struct vlan_group *grp)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+
+ en_dbg(HW, priv, "Registering VLAN group:%p\n", grp);
+
+ priv->vlgrp = grp;
+}
+#endif
+
+static int mlx4_en_vlan_rx_add_vid(struct rte_eth_dev *dev, unsigned short vid)
+{
+ struct mlx4_en_priv *priv = rtedev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int err;
+ int idx;
+
+ en_dbg(HW, priv, "adding VLAN:%d\n", vid);
+
+ set_bit(vid, priv->active_vlans);
+
+ /* Add VID to port VLAN filter */
+ mutex_lock(&mdev->state_lock);
+ if (mdev->device_up && priv->port_up) {
+ err = mlx4_SET_VLAN_FLTR(mdev->dev, priv);
+ if (err)
+ en_err(priv, "Failed configuring VLAN filter\n");
+ }
+ if (mlx4_register_vlan(mdev->dev, priv->port, vid, &idx))
+ en_dbg(HW, priv, "failed adding vlan %d\n", vid);
+ mutex_unlock(&mdev->state_lock);
+
+ return 0;
+}
+
+static int mlx4_en_vlan_rx_kill_vid(struct rte_eth_dev *dev, unsigned short vid)
+{
+ struct mlx4_en_priv *priv = rtedev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int err;
+
+ en_dbg(HW, priv, "Killing VID:%d\n", vid);
+
+ clear_bit(vid, priv->active_vlans);
+
+ /* Remove VID from port VLAN filter */
+ mutex_lock(&mdev->state_lock);
+ mlx4_unregister_vlan(mdev->dev, priv->port, vid);
+
+ if (mdev->device_up && priv->port_up) {
+ err = mlx4_SET_VLAN_FLTR(mdev->dev, priv);
+ if (err)
+ en_err(priv, "Failed configuring VLAN filter\n");
+ }
+ mutex_unlock(&mdev->state_lock);
+
+ return 0;
+}
+
+static void mlx4_en_u64_to_mac(unsigned char dst_mac[ETH_ALEN], u64 src_mac)
+{
+ int i;
+ for (i = ETH_ALEN - 1; i >= 0; --i) {
+ dst_mac[i] = src_mac & 0xff;
+ src_mac >>= 8;
+ }
+ //memset(&dst_mac[ETH_ALEN], 0, 2);
+}
+
+
+static int mlx4_en_tunnel_steer_add(struct mlx4_en_priv *priv, unsigned char *addr,
+ int qpn, u64 *reg_id)
+{
+ int err;
+
+ if (priv->mdev->dev->caps.tunnel_offload_mode != MLX4_TUNNEL_OFFLOAD_MODE_VXLAN ||
+ priv->mdev->dev->caps.dmfs_high_steer_mode == MLX4_STEERING_DMFS_A0_STATIC)
+ return 0; /* do nothing */
+
+ err = mlx4_tunnel_steer_add(priv->mdev->dev, addr, priv->port, qpn,
+ MLX4_DOMAIN_NIC, reg_id);
+ if (err) {
+ en_err(priv, "failed to add vxlan steering rule, err %d\n", err);
+ return err;
+ }
+ en_dbg(DRV, priv, "added vxlan steering rule, mac %pM reg_id %llx\n", addr, *reg_id);
+ return 0;
+}
+
+
+static int mlx4_en_uc_steer_add(struct mlx4_en_priv *priv,
+ unsigned char *mac, int *qpn, u64 *reg_id)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_dev *dev = mdev->dev;
+ int err;
+
+ switch (dev->caps.steering_mode) {
+ case MLX4_STEERING_MODE_B0: {
+ struct mlx4_qp qp;
+ u8 gid[16] = {0};
+
+ qp.qpn = *qpn;
+ memcpy(&gid[10], mac, ETH_ALEN);
+ gid[5] = priv->port;
+
+ err = mlx4_unicast_attach(dev, &qp, gid, 0, MLX4_PROT_ETH);
+ break;
+ }
+ case MLX4_STEERING_MODE_DEVICE_MANAGED: {
+ struct mlx4_spec_list spec_eth = { {NULL} };
+ __be64 mac_mask = cpu_to_be64(MLX4_MAC_MASK << 16);
+
+ struct mlx4_net_trans_rule rule = {
+ .queue_mode = MLX4_NET_TRANS_Q_FIFO,
+ .exclusive = 0,
+ .allow_loopback = 1,
+ .promisc_mode = MLX4_FS_REGULAR,
+ .priority = MLX4_DOMAIN_NIC,
+ };
+
+ rule.port = priv->port;
+ rule.qpn = *qpn;
+ INIT_LIST_HEAD(&rule.list);
+
+ spec_eth.id = MLX4_NET_TRANS_RULE_ID_ETH;
+ memcpy(spec_eth.eth.dst_mac, mac, ETH_ALEN);
+ memcpy(spec_eth.eth.dst_mac_msk, &mac_mask, ETH_ALEN);
+ list_add_tail(&spec_eth.list, &rule.list);
+
+ err = mlx4_flow_attach(dev, &rule, reg_id);
+ break;
+ }
+ default:
+ return -EINVAL;
+ }
+ if (err)
+ en_warn(priv, "Failed Attaching Unicast\n");
+
+ return err;
+}
+
+static void mlx4_en_uc_steer_release(struct mlx4_en_priv *priv,
+ unsigned char *mac, int qpn, u64 reg_id)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_dev *dev = mdev->dev;
+
+ switch (dev->caps.steering_mode) {
+ case MLX4_STEERING_MODE_B0: {
+ struct mlx4_qp qp;
+ u8 gid[16] = {0};
+
+ qp.qpn = qpn;
+ memcpy(&gid[10], mac, ETH_ALEN);
+ gid[5] = priv->port;
+
+ mlx4_unicast_detach(dev, &qp, gid, MLX4_PROT_ETH);
+ break;
+ }
+ case MLX4_STEERING_MODE_DEVICE_MANAGED: {
+ mlx4_flow_detach(dev, reg_id);
+ break;
+ }
+ default:
+ en_err(priv, "Invalid steering mode.\n");
+ }
+}
+
+static uint8_t* mlx4_priv_mac_addr(struct mlx4_en_priv* priv)
+{
+ return priv->rte_dev->data->mac_addrs[0].addr_bytes;
+}
+
+static int mlx4_en_get_qp(struct mlx4_en_priv *priv)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_dev *dev = mdev->dev;
+ struct mlx4_mac_entry *entry;
+ int index = 0;
+ int err = 0;
+ u64 reg_id = 0;
+ int *qpn = &priv->base_qpn;
+ u64 mac = mlx4_mac_to_u64(mlx4_priv_mac_addr(priv));
+
+ en_dbg(DRV, priv, "Registering MAC: %pM for adding\n",
+ mlx4_priv_mac_addr(priv));
+ index = mlx4_register_mac(dev, priv->port, mac);
+ if (index < 0) {
+ err = index;
+ en_err(priv, "Failed adding MAC: %pM\n",
+ mlx4_priv_mac_addr(priv));
+ return err;
+ }
+
+ if (dev->caps.steering_mode == MLX4_STEERING_MODE_A0) {
+ int base_qpn = mlx4_get_base_qpn(dev, priv->port);
+ *qpn = base_qpn + index;
+ return 0;
+ }
+
+ err = mlx4_qp_reserve_range(dev, 1, 1, qpn, MLX4_RESERVE_A0_QP);
+ en_dbg(DRV, priv, "Reserved qp %d\n", *qpn);
+ if (err) {
+ en_err(priv, "Failed to reserve qp for mac registration\n");
+ goto qp_err;
+ }
+
+ err = mlx4_en_uc_steer_add(priv, mlx4_priv_mac_addr(priv), qpn, &reg_id);
+ if (err)
+ goto steer_err;
+
+ err = mlx4_en_tunnel_steer_add(priv, mlx4_priv_mac_addr(priv), *qpn,
+ &priv->tunnel_reg_id);
+ if (err)
+ goto tunnel_err;
+
+ entry = kmalloc(sizeof(*entry), GFP_KERNEL);
+ if (!entry) {
+ err = -ENOMEM;
+ goto alloc_err;
+ }
+ memcpy(entry->mac, mlx4_priv_mac_addr(priv), sizeof(entry->mac));
+ memcpy(priv->current_mac, entry->mac, sizeof(priv->current_mac));
+ entry->reg_id = reg_id;
+
+ hlist_add_head(&entry->hlist,
+ &priv->mac_hash[entry->mac[MLX4_EN_MAC_HASH_IDX]]);
+
+ return 0;
+
+alloc_err:
+ if (priv->tunnel_reg_id)
+ mlx4_flow_detach(priv->mdev->dev, priv->tunnel_reg_id);
+tunnel_err:
+ mlx4_en_uc_steer_release(priv, mlx4_priv_mac_addr(priv), *qpn, reg_id);
+
+steer_err:
+ mlx4_qp_release_range(dev, *qpn, 1);
+
+qp_err:
+ mlx4_unregister_mac(dev, priv->port, mac);
+ return err;
+}
+
+static void mlx4_en_put_qp(struct mlx4_en_priv *priv)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_dev *dev = mdev->dev;
+ int qpn = priv->base_qpn;
+ u64 mac;
+
+ if (dev->caps.steering_mode == MLX4_STEERING_MODE_A0) {
+ mac = mlx4_mac_to_u64(mlx4_priv_mac_addr(priv));
+ en_dbg(DRV, priv, "Registering MAC: %pM for deleting\n",
+ mlx4_priv_mac_addr(priv));
+ mlx4_unregister_mac(dev, priv->port, mac);
+ } else {
+ struct mlx4_mac_entry *entry;
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(3,9,0))
+ struct hlist_node *n, *tmp;
+#else
+ struct hlist_node *tmp;
+#endif
+ struct hlist_head *bucket;
+ unsigned int i;
+
+ for (i = 0; i < MLX4_EN_MAC_HASH_SIZE; ++i) {
+ bucket = &priv->mac_hash[i];
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(3,9,0))
+ hlist_for_each_entry_safe(entry, n, tmp, bucket, hlist) {
+#else
+ hlist_for_each_entry_safe(entry, tmp, bucket, hlist) {
+#endif
+ mac = mlx4_mac_to_u64(entry->mac);
+ en_dbg(DRV, priv, "Registering MAC: %pM for deleting\n",
+ entry->mac);
+ mlx4_en_uc_steer_release(priv, entry->mac,
+ qpn, entry->reg_id);
+
+ mlx4_unregister_mac(dev, priv->port, mac);
+ hlist_del(&entry->hlist);
+ kfree(entry);
+ }
+ }
+
+ if (priv->tunnel_reg_id) {
+ mlx4_flow_detach(priv->mdev->dev, priv->tunnel_reg_id);
+ priv->tunnel_reg_id = 0;
+ }
+
+ en_dbg(DRV, priv, "Releasing qp: port %d, qpn %d\n",
+ priv->port, qpn);
+ mlx4_qp_release_range(dev, qpn, 1);
+ priv->flags &= ~MLX4_EN_FLAG_FORCE_PROMISC;
+ }
+}
+
+static int mlx4_en_replace_mac(struct mlx4_en_priv *priv, int qpn,
+ unsigned char *new_mac, unsigned char *prev_mac)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_dev *dev = mdev->dev;
+ int err = 0;
+ u64 new_mac_u64 = mlx4_mac_to_u64(new_mac);
+
+ if (dev->caps.steering_mode != MLX4_STEERING_MODE_A0) {
+ struct hlist_head *bucket;
+ unsigned int mac_hash;
+ struct mlx4_mac_entry *entry;
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(3,9,0))
+ struct hlist_node *n, *tmp;
+#else
+ struct hlist_node *tmp;
+#endif
+ u64 prev_mac_u64 = mlx4_mac_to_u64(prev_mac);
+
+ bucket = &priv->mac_hash[prev_mac[MLX4_EN_MAC_HASH_IDX]];
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(3,9,0))
+ hlist_for_each_entry_safe(entry, n, tmp, bucket, hlist) {
+#else
+ hlist_for_each_entry_safe(entry, tmp, bucket, hlist) {
+#endif
+ if (ether_addr_equal_64bits(entry->mac, prev_mac)) {
+ mlx4_en_uc_steer_release(priv, entry->mac,
+ qpn, entry->reg_id);
+ mlx4_unregister_mac(dev, priv->port,
+ prev_mac_u64);
+ hlist_del(&entry->hlist);
+ synchronize_rcu();
+ memcpy(entry->mac, new_mac, ETH_ALEN);
+ entry->reg_id = 0;
+ mac_hash = new_mac[MLX4_EN_MAC_HASH_IDX];
+ hlist_add_head(&entry->hlist,
+ &priv->mac_hash[mac_hash]);
+ mlx4_register_mac(dev, priv->port, new_mac_u64);
+ err = mlx4_en_uc_steer_add(priv, new_mac,
+ &qpn,
+ &entry->reg_id);
+ if (err)
+ return err;
+ if (priv->tunnel_reg_id) {
+ mlx4_flow_detach(priv->mdev->dev, priv->tunnel_reg_id);
+ priv->tunnel_reg_id = 0;
+ }
+ err = mlx4_en_tunnel_steer_add(priv, new_mac, qpn,
+ &priv->tunnel_reg_id);
+ return err;
+ }
+ }
+ return -EINVAL;
+ }
+
+ return __mlx4_replace_mac(dev, priv->port, qpn, new_mac_u64);
+}
+
+static int mlx4_en_do_set_mac(struct mlx4_en_priv *priv,
+ unsigned char new_mac[ETH_ALEN + 2])
+{
+ int err = 0;
+
+ if (priv->port_up) {
+ /* Remove old MAC and insert the new one */
+ err = mlx4_en_replace_mac(priv, priv->base_qpn,
+ new_mac, priv->current_mac);
+ if (err)
+ en_err(priv, "Failed changing HW MAC address\n");
+ } else
+ en_dbg(HW, priv, "Port is down while registering mac, exiting...\n");
+
+ if (!err)
+ memcpy(priv->current_mac, new_mac, sizeof(priv->current_mac));
+
+ return err;
+}
+
+static int mlx4_en_set_mac(struct rte_eth_dev *dev, void *addr)
+{
+ struct mlx4_en_priv *priv = rtedev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct sockaddr *saddr = addr;
+ unsigned char new_mac[ETH_ALEN + 2];
+ int err;
+
+ if (!is_valid_ether_addr(saddr->sa_data))
+ return -EADDRNOTAVAIL;
+
+ mutex_lock(&mdev->state_lock);
+ memcpy(new_mac, saddr->sa_data, ETH_ALEN);
+ err = mlx4_en_do_set_mac(priv, new_mac);
+ if (!err)
+ memcpy(mlx4_priv_mac_addr(priv), saddr->sa_data, ETH_ALEN);
+ mutex_unlock(&mdev->state_lock);
+
+ return err;
+}
+
+static void mlx4_en_clear_list(struct rte_eth_dev *dev)
+{
+ struct mlx4_en_priv *priv = rtedev_priv(dev);
+ struct mlx4_en_mc_list *tmp, *mc_to_del;
+
+ list_for_each_entry_safe(mc_to_del, tmp, &priv->mc_list, list) {
+ list_del(&mc_to_del->list);
+ kfree(mc_to_del);
+ }
+}
+
+#ifdef KMOD_DISABLED
+
+static void mlx4_en_cache_mclist(struct net_device *dev)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,35))
+ struct netdev_hw_addr *ha;
+#else
+ struct dev_mc_list *mclist;
+#endif
+ struct mlx4_en_mc_list *tmp;
+
+ mlx4_en_clear_list(dev);
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,35))
+ netdev_for_each_mc_addr(ha, dev) {
+#else
+ for (mclist = dev->mc_list; mclist; mclist = mclist->next) {
+#endif
+ tmp = kzalloc(sizeof(struct mlx4_en_mc_list), GFP_ATOMIC);
+ if (!tmp) {
+ mlx4_en_clear_list(dev);
+ return;
+ }
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,35))
+ memcpy(tmp->addr, ha->addr, ETH_ALEN);
+#else
+ memcpy(tmp->addr, mclist->dmi_addr, ETH_ALEN);
+#endif
+ list_add_tail(&tmp->list, &priv->mc_list);
+ }
+}
+#endif
+
+static void update_mclist_flags(struct mlx4_en_priv *priv,
+ struct list_head *dst,
+ struct list_head *src)
+{
+ struct mlx4_en_mc_list *dst_tmp, *src_tmp, *new_mc;
+ bool found;
+
+ /* Find all the entries that should be removed from dst.
+ * These are the entries that are not found in src.
+ */
+ list_for_each_entry(dst_tmp, dst, list) {
+ found = false;
+ list_for_each_entry(src_tmp, src, list) {
+ if (ether_addr_equal(dst_tmp->addr, src_tmp->addr)) {
+ found = true;
+ break;
+ }
+ }
+ if (!found)
+ dst_tmp->action = MCLIST_REM;
+ }
+
+ /* Add entries that exist in src but not in dst,
+ * marking them as needing to be added.
+ */
+ list_for_each_entry(src_tmp, src, list) {
+ found = false;
+ list_for_each_entry(dst_tmp, dst, list) {
+ if (ether_addr_equal(dst_tmp->addr, src_tmp->addr)) {
+ dst_tmp->action = MCLIST_NONE;
+ found = true;
+ break;
+ }
+ }
+ if (!found) {
+
+ /*
+ new_mc = kmemdup(src_tmp,
+ sizeof(struct mlx4_en_mc_list),
+ GFP_KERNEL);
+ */
+ new_mc = kmalloc(sizeof(struct mlx4_en_mc_list), GFP_KERNEL);
+ if (!new_mc)
+ return;
+ memcpy(new_mc, src_tmp, sizeof(struct mlx4_en_mc_list));
+
+ new_mc->action = MCLIST_ADD;
+ list_add_tail(&new_mc->list, dst);
+ }
+ }
+}
+
+#ifdef KMOD_DISABLED
+static void mlx4_en_set_rx_mode(struct net_device *dev)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+
+ if (!priv->port_up)
+ return;
+
+ //queue_work(priv->mdev->workqueue, &priv->rx_mode_task);
+}
+#endif
+
+static void mlx4_en_set_promisc_mode(struct mlx4_en_priv *priv,
+ struct mlx4_en_dev *mdev)
+{
+ int err = 0;
+
+ if (!(priv->flags & MLX4_EN_FLAG_PROMISC)) {
+ //if (netif_msg_rx_status(priv))
+ en_warn(priv, "Entering promiscuous mode\n");
+ priv->flags |= MLX4_EN_FLAG_PROMISC;
+
+ /* Enable promiscuous mode */
+ switch (mdev->dev->caps.steering_mode) {
+ case MLX4_STEERING_MODE_DEVICE_MANAGED:
+ err = mlx4_flow_steer_promisc_add(mdev->dev,
+ priv->port,
+ priv->base_qpn,
+ MLX4_FS_ALL_DEFAULT);
+ if (err)
+ en_err(priv, "Failed enabling promiscuous mode\n");
+ priv->flags |= MLX4_EN_FLAG_MC_PROMISC;
+ break;
+
+ case MLX4_STEERING_MODE_B0:
+ err = mlx4_unicast_promisc_add(mdev->dev,
+ priv->base_qpn,
+ priv->port);
+ if (err)
+ en_err(priv, "Failed enabling unicast promiscuous mode\n");
+
+ /* Add the default qp number as multicast
+ * promisc
+ */
+ if (!(priv->flags & MLX4_EN_FLAG_MC_PROMISC)) {
+ err = mlx4_multicast_promisc_add(mdev->dev,
+ priv->base_qpn,
+ priv->port);
+ if (err)
+ en_err(priv, "Failed enabling multicast promiscuous mode\n");
+ priv->flags |= MLX4_EN_FLAG_MC_PROMISC;
+ }
+ break;
+
+ case MLX4_STEERING_MODE_A0:
+ err = mlx4_SET_PORT_qpn_calc(mdev->dev,
+ priv->port,
+ priv->base_qpn,
+ 1);
+ if (err)
+ en_err(priv, "Failed enabling promiscuous mode\n");
+ break;
+ }
+
+ /* Disable port multicast filter (unconditionally) */
+ err = mlx4_SET_MCAST_FLTR(mdev->dev, priv->port, 0,
+ 0, MLX4_MCAST_DISABLE);
+ if (err)
+ en_err(priv, "Failed disabling multicast filter\n");
+ }
+}
+
+static void mlx4_en_clear_promisc_mode(struct mlx4_en_priv *priv,
+ struct mlx4_en_dev *mdev)
+{
+ int err = 0;
+
+ //if (netif_msg_rx_status(priv))
+ en_warn(priv, "Leaving promiscuous mode\n");
+ priv->flags &= ~MLX4_EN_FLAG_PROMISC;
+
+ /* Disable promiscuous mode */
+ switch (mdev->dev->caps.steering_mode) {
+ case MLX4_STEERING_MODE_DEVICE_MANAGED:
+ err = mlx4_flow_steer_promisc_remove(mdev->dev,
+ priv->port,
+ MLX4_FS_ALL_DEFAULT);
+ if (err)
+ en_err(priv, "Failed disabling promiscuous mode\n");
+ priv->flags &= ~MLX4_EN_FLAG_MC_PROMISC;
+ break;
+
+ case MLX4_STEERING_MODE_B0:
+ err = mlx4_unicast_promisc_remove(mdev->dev,
+ priv->base_qpn,
+ priv->port);
+ if (err)
+ en_err(priv, "Failed disabling unicast promiscuous mode\n");
+ /* Disable Multicast promisc */
+ if (priv->flags & MLX4_EN_FLAG_MC_PROMISC) {
+ err = mlx4_multicast_promisc_remove(mdev->dev,
+ priv->base_qpn,
+ priv->port);
+ if (err)
+ en_err(priv, "Failed disabling multicast promiscuous mode\n");
+ priv->flags &= ~MLX4_EN_FLAG_MC_PROMISC;
+ }
+ break;
+
+ case MLX4_STEERING_MODE_A0:
+ err = mlx4_SET_PORT_qpn_calc(mdev->dev,
+ priv->port,
+ priv->base_qpn, 0);
+ if (err)
+ en_err(priv, "Failed disabling promiscuous mode\n");
+ break;
+ }
+}
+
+#ifdef KMOD_DISABLED
+
+static void mlx4_en_do_multicast(struct mlx4_en_priv *priv,
+ struct rte_eth_dev *dev,
+ struct mlx4_en_dev *mdev)
+{
+ struct mlx4_en_mc_list *mclist, *tmp;
+ u64 mcast_addr = 0;
+ u8 mc_list[16] = {0};
+ int err = 0;
+
+ /* Enable/disable the multicast filter according to IFF_ALLMULTI */
+ if (dev->flags & IFF_ALLMULTI) {
+ err = mlx4_SET_MCAST_FLTR(mdev->dev, priv->port, 0,
+ 0, MLX4_MCAST_DISABLE);
+ if (err)
+ en_err(priv, "Failed disabling multicast filter\n");
+
+ /* Add the default qp number as multicast promisc */
+ if (!(priv->flags & MLX4_EN_FLAG_MC_PROMISC)) {
+ switch (mdev->dev->caps.steering_mode) {
+ case MLX4_STEERING_MODE_DEVICE_MANAGED:
+ err = mlx4_flow_steer_promisc_add(mdev->dev,
+ priv->port,
+ priv->base_qpn,
+ MLX4_FS_MC_DEFAULT);
+ break;
+
+ case MLX4_STEERING_MODE_B0:
+ err = mlx4_multicast_promisc_add(mdev->dev,
+ priv->base_qpn,
+ priv->port);
+ break;
+
+ case MLX4_STEERING_MODE_A0:
+ break;
+ }
+ if (err)
+ en_err(priv, "Failed entering multicast promisc mode\n");
+ priv->flags |= MLX4_EN_FLAG_MC_PROMISC;
+ }
+ } else {
+ /* Disable Multicast promisc */
+ if (priv->flags & MLX4_EN_FLAG_MC_PROMISC) {
+ switch (mdev->dev->caps.steering_mode) {
+ case MLX4_STEERING_MODE_DEVICE_MANAGED:
+ err = mlx4_flow_steer_promisc_remove(mdev->dev,
+ priv->port,
+ MLX4_FS_MC_DEFAULT);
+ break;
+
+ case MLX4_STEERING_MODE_B0:
+ err = mlx4_multicast_promisc_remove(mdev->dev,
+ priv->base_qpn,
+ priv->port);
+ break;
+
+ case MLX4_STEERING_MODE_A0:
+ break;
+ }
+ if (err)
+ en_err(priv, "Failed disabling multicast promiscuous mode\n");
+ priv->flags &= ~MLX4_EN_FLAG_MC_PROMISC;
+ }
+
+ err = mlx4_SET_MCAST_FLTR(mdev->dev, priv->port, 0,
+ 0, MLX4_MCAST_DISABLE);
+ if (err)
+ en_err(priv, "Failed disabling multicast filter\n");
+
+ /* Flush mcast filter and init it with broadcast address */
+ mlx4_SET_MCAST_FLTR(mdev->dev, priv->port, ETH_BCAST,
+ 1, MLX4_MCAST_CONFIG);
+
+ /* Update multicast list - we cache all addresses so they won't
+ * change while HW is updated holding the command semaphore */
+ netif_addr_lock_bh(dev);
+ mlx4_en_cache_mclist(dev);
+ netif_addr_unlock_bh(dev);
+ list_for_each_entry(mclist, &priv->mc_list, list) {
+ mcast_addr = mlx4_mac_to_u64(mclist->addr);
+ mlx4_SET_MCAST_FLTR(mdev->dev, priv->port,
+ mcast_addr, 0, MLX4_MCAST_CONFIG);
+ }
+ err = mlx4_SET_MCAST_FLTR(mdev->dev, priv->port, 0,
+ 0, MLX4_MCAST_ENABLE);
+ if (err)
+ en_err(priv, "Failed enabling multicast filter\n");
+
+ update_mclist_flags(priv, &priv->curr_list, &priv->mc_list);
+ list_for_each_entry_safe(mclist, tmp, &priv->curr_list, list) {
+ if (mclist->action == MCLIST_REM) {
+ /* detach this address and delete from list */
+ memcpy(&mc_list[10], mclist->addr, ETH_ALEN);
+ mc_list[5] = priv->port;
+ err = mlx4_multicast_detach(mdev->dev,
+ &priv->rss_map.indir_qp,
+ mc_list,
+ MLX4_PROT_ETH,
+ mclist->reg_id);
+ if (err)
+ en_err(priv, "Failed to detach multicast address\n");
+
+ if (mclist->tunnel_reg_id) {
+ err = mlx4_flow_detach(priv->mdev->dev, mclist->tunnel_reg_id);
+ if (err)
+ en_err(priv, "Failed to detach multicast address\n");
+ }
+
+ /* remove from list */
+ list_del(&mclist->list);
+ kfree(mclist);
+ } else if (mclist->action == MCLIST_ADD) {
+ /* attach the address */
+ memcpy(&mc_list[10], mclist->addr, ETH_ALEN);
+ /* needed for B0 steering support */
+ mc_list[5] = priv->port;
+ err = mlx4_multicast_attach(mdev->dev,
+ &priv->rss_map.indir_qp,
+ mc_list,
+ priv->port, 0,
+ MLX4_PROT_ETH,
+ &mclist->reg_id);
+ if (err)
+ en_err(priv, "Failed to attach multicast address\n");
+
+ err = mlx4_en_tunnel_steer_add(priv, &mc_list[10], priv->base_qpn,
+ &mclist->tunnel_reg_id);
+ if (err)
+ en_err(priv, "Failed to attach multicast address\n");
+ }
+ }
+ }
+}
+#endif
+
+static void mlx4_en_do_uc_filter(struct mlx4_en_priv *priv,
+ struct rte_eth_dev *dev,
+ struct mlx4_en_dev *mdev)
+{
+ struct mlx4_mac_entry *entry;
+ struct hlist_node *tmp;
+ bool found;
+ u64 mac;
+ int err = 0;
+ struct hlist_head *bucket;
+ unsigned int i;
+ int removed = 0;
+ u32 prev_flags;
+
+ /* Note that we do not need to protect our mac_hash traversal with rcu,
+ * since all modification code is protected by mdev->state_lock
+ */
+
+ /* find what to remove */
+ for (i = 0; i < MLX4_EN_MAC_HASH_SIZE; ++i) {
+ bucket = &priv->mac_hash[i];
+ hlist_for_each_entry_safe(entry, tmp, bucket, hlist) {
+ found = false;
+ int _k;
+ uint8_t invalid_mac[] = {0,0,0,0,0,0,};
+ for(_k=0; _k< (1 << priv->mdev->dev->caps.log_num_macs); _k++)
+ {
+ if (ether_addr_equal_64bits(entry->mac,
+ mlx4_priv_mac_addr(priv) + ETHER_ADDR_LEN*_k)) {
+ found = true;
+ break;
+ }
+ }
+
+ /* MAC address of the port is not in uc list */
+ if (ether_addr_equal_64bits(entry->mac,
+ priv->current_mac))
+ found = true;
+
+ if (!found) {
+ mac = mlx4_mac_to_u64(entry->mac);
+ mlx4_en_uc_steer_release(priv, entry->mac,
+ priv->base_qpn,
+ entry->reg_id);
+ mlx4_unregister_mac(mdev->dev, priv->port, mac);
+
+ en_dbg(DRV, priv, "Removed MAC %pM on port:%d\n",
+ entry->mac, priv->port);
+ hlist_del(&entry->hlist);
+ kfree(entry);
+ ++removed;
+ }
+ }
+ }
+
+ /* if we didn't remove anything, there is no use in trying to add
+ * again once we are in a forced promisc mode state
+ */
+ if ((priv->flags & MLX4_EN_FLAG_FORCE_PROMISC) && 0 == removed)
+ return;
+
+ prev_flags = priv->flags;
+ priv->flags &= ~MLX4_EN_FLAG_FORCE_PROMISC;
+
+ /* find what to add */
+ int _k;
+ uint8_t invalid_mac[] = {0,0,0,0,0,0,};
+ for(_k=0; _k< (1 << priv->mdev->dev->caps.log_num_macs); _k++)
+ {
+ uint8_t* ha_addr = (mlx4_priv_mac_addr(priv) + ETHER_ADDR_LEN*_k);
+ if(0 == memcmp(ha_addr, invalid_mac, ETHER_ADDR_LEN))
+ continue;
+ found = false;
+ bucket = &priv->mac_hash[ha_addr[MLX4_EN_MAC_HASH_IDX]];
+ hlist_for_each_entry(entry, bucket, hlist) {
+ if (ether_addr_equal_64bits(entry->mac, ha_addr)) {
+ found = true;
+ break;
+ }
+ }
+
+ if (!found) {
+ entry = kmalloc(sizeof(*entry), GFP_KERNEL);
+ if (!entry) {
+ en_err(priv, "Failed adding MAC %pM on port:%d (out of memory)\n",
+ ha_addr, priv->port);
+ priv->flags |= MLX4_EN_FLAG_FORCE_PROMISC;
+ break;
+ }
+ mac = mlx4_mac_to_u64(ha_addr);
+ memcpy(entry->mac, ha_addr, ETH_ALEN);
+ err = mlx4_register_mac(mdev->dev, priv->port, mac);
+ if (err < 0) {
+ en_err(priv, "Failed registering MAC %pM on port %d: %d\n",
+ ha_addr, priv->port, err);
+ kfree(entry);
+ priv->flags |= MLX4_EN_FLAG_FORCE_PROMISC;
+ break;
+ }
+ err = mlx4_en_uc_steer_add(priv, ha_addr,
+ &priv->base_qpn,
+ &entry->reg_id);
+ if (err) {
+ en_err(priv, "Failed adding MAC %pM on port %d: %d\n",
+ ha_addr, priv->port, err);
+ mlx4_unregister_mac(mdev->dev, priv->port, mac);
+ kfree(entry);
+ priv->flags |= MLX4_EN_FLAG_FORCE_PROMISC;
+ break;
+ } else {
+ unsigned int mac_hash;
+ en_dbg(DRV, priv, "Added MAC %pM on port:%d\n",
+ ha_addr, priv->port);
+ mac_hash = ha_addr[MLX4_EN_MAC_HASH_IDX];
+ bucket = &priv->mac_hash[mac_hash];
+ hlist_add_head(&entry->hlist, bucket);
+ }
+ }
+ }
+
+ if (priv->flags & MLX4_EN_FLAG_FORCE_PROMISC) {
+ en_warn(priv, "Forcing promiscuous mode on port:%d\n",
+ priv->port);
+ } else if (prev_flags & MLX4_EN_FLAG_FORCE_PROMISC) {
+ en_warn(priv, "Stop forcing promiscuous mode on port:%d\n",
+ priv->port);
+ }
+}
+
+#ifdef KMOD_DISABLED
+
+static void mlx4_en_do_set_rx_mode(struct work_struct *work)
+{
+ struct mlx4_en_priv *priv = container_of(work, struct mlx4_en_priv,
+ rx_mode_task);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct net_device *dev = priv->dev;
+
+ mutex_lock(&mdev->state_lock);
+ if (!mdev->device_up) {
+ en_dbg(HW, priv, "Card is not up, ignoring rx mode change.\n");
+ goto out;
+ }
+ if (!priv->port_up) {
+ en_dbg(HW, priv, "Port is down, ignoring rx mode change.\n");
+ goto out;
+ }
+
+ if (!netif_carrier_ok(dev)) {
+ if (!mlx4_en_QUERY_PORT(mdev, priv->port)) {
+ if (priv->port_state.link_state) {
+ priv->last_link_state = MLX4_DEV_EVENT_PORT_UP;
+ netif_carrier_on(dev);
+ en_dbg(LINK, priv, "Link Up\n");
+ }
+ }
+ }
+
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3,2,0))
+ if (dev->priv_flags & IFF_UNICAST_FLT)
+#else
+ if (mdev->dev->caps.steering_mode != MLX4_STEERING_MODE_A0)
+#endif
+ mlx4_en_do_uc_filter(priv, dev, mdev);
+
+ /* Promiscuous mode: disable all filters */
+ if ((dev->flags & IFF_PROMISC) ||
+ (priv->flags & MLX4_EN_FLAG_FORCE_PROMISC)) {
+ mlx4_en_set_promisc_mode(priv, mdev);
+ goto out;
+ }
+
+ /* Not in promiscuous mode */
+ if (priv->flags & MLX4_EN_FLAG_PROMISC)
+ mlx4_en_clear_promisc_mode(priv, mdev);
+
+ mlx4_en_do_multicast(priv, dev, mdev);
+out:
+ mutex_unlock(&mdev->state_lock);
+}
+
+#ifdef CONFIG_NET_POLL_CONTROLLER
+static void mlx4_en_netpoll(struct net_device *dev)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_cq *cq;
+ int i;
+
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ cq = priv->rx_cq[i];
+ napi_schedule(&cq->napi);
+ }
+}
+#endif
+
+static void mlx4_en_tx_timeout(struct net_device *dev)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int i;
+
+ if (netif_msg_timer(priv))
+ en_warn(priv, "Tx timeout called on port:%d\n", priv->port);
+
+ for (i = 0; i < priv->tx_ring_num; i++) {
+ if (!netif_tx_queue_stopped(netdev_get_tx_queue(dev, i)))
+ continue;
+ en_warn(priv, "TX timeout on queue: %d, QP: 0x%x, CQ: 0x%x, Cons: 0x%x, Prod: 0x%x\n",
+ i, priv->tx_ring[i]->qpn, priv->tx_ring[i]->cqn,
+ priv->tx_ring[i]->cons, priv->tx_ring[i]->prod);
+ }
+
+ priv->port_stats.tx_timeout++;
+ en_dbg(DRV, priv, "Scheduling watchdog\n");
+ queue_work(mdev->workqueue, &priv->watchdog_task);
+}
+
+
+static struct net_device_stats *mlx4_en_get_stats(struct net_device *dev)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+
+ spin_lock_bh(&priv->stats_lock);
+ memcpy(&priv->ret_stats, &priv->stats, sizeof(priv->stats));
+ spin_unlock_bh(&priv->stats_lock);
+
+ return &priv->ret_stats;
+}
+#endif
+
+static void mlx4_en_set_default_moderation(struct mlx4_en_priv *priv)
+{
+ struct mlx4_en_cq *cq;
+ int i;
+
+ /* If we haven't received a specific coalescing setting
+ * (module param), we set the moderation parameters as follows:
+ * - moder_cnt is set to the number of mtu sized packets to
+ * satisfy our coalescing target.
+ * - moder_time is set to a fixed value.
+ */
+ priv->rx_frames = MLX4_EN_RX_COAL_TARGET;
+ priv->rx_usecs = MLX4_EN_RX_COAL_TIME;
+ priv->tx_frames = MLX4_EN_TX_COAL_PKTS;
+ priv->tx_usecs = MLX4_EN_TX_COAL_TIME;
+ en_dbg(INTR, priv, "Default coalescing params for mtu:%d - rx_frames:%d rx_usecs:%d\n",
+ priv->rte_dev->data->mtu, priv->rx_frames, priv->rx_usecs);
+
+ /* Setup cq moderation params */
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ cq = priv->rx_cq[i];
+ cq->moder_cnt = priv->rx_frames;
+ cq->moder_time = priv->rx_usecs;
+ priv->last_moder_time[i] = MLX4_EN_AUTO_CONF;
+ priv->last_moder_packets[i] = 0;
+ priv->last_moder_bytes[i] = 0;
+ }
+
+ for (i = 0; i < priv->tx_ring_num; i++) {
+ cq = priv->tx_cq[i];
+ cq->moder_cnt = priv->tx_frames;
+ cq->moder_time = priv->tx_usecs;
+ }
+
+ /* Reset auto-moderation params */
+ priv->pkt_rate_low = MLX4_EN_RX_RATE_LOW;
+ priv->rx_usecs_low = MLX4_EN_RX_COAL_TIME_LOW;
+ priv->pkt_rate_high = MLX4_EN_RX_RATE_HIGH;
+ priv->rx_usecs_high = MLX4_EN_RX_COAL_TIME_HIGH;
+ priv->sample_interval = MLX4_EN_SAMPLE_INTERVAL;
+ priv->adaptive_rx_coal = 1;
+ priv->last_moder_jiffies = 0;
+ priv->last_moder_tx_packets = 0;
+}
+
+static void mlx4_en_auto_moderation(struct mlx4_en_priv *priv)
+{
+ unsigned long period = (unsigned long) (jiffies - priv->last_moder_jiffies);
+ struct mlx4_en_cq *cq;
+ unsigned long packets;
+ unsigned long rate;
+ unsigned long avg_pkt_size;
+ unsigned long rx_packets;
+ unsigned long rx_bytes;
+ unsigned long rx_pkt_diff;
+ int moder_time;
+ int ring, err;
+
+ if (!priv->adaptive_rx_coal || period < priv->sample_interval * HZ)
+ return;
+
+ for (ring = 0; ring < priv->rx_ring_num; ring++) {
+ spin_lock_bh(&priv->stats_lock);
+ rx_packets = priv->rx_ring[ring]->packets;
+ rx_bytes = priv->rx_ring[ring]->bytes;
+ spin_unlock_bh(&priv->stats_lock);
+
+ rx_pkt_diff = ((unsigned long) (rx_packets -
+ priv->last_moder_packets[ring]));
+ packets = rx_pkt_diff;
+ rate = packets * HZ / period;
+ avg_pkt_size = packets ? ((unsigned long) (rx_bytes -
+ priv->last_moder_bytes[ring])) / packets : 0;
+
+ /* Apply auto-moderation only when packet rate
+ * exceeds the rate at which it matters */
+ if (rate > (MLX4_EN_RX_RATE_THRESH / priv->rx_ring_num) &&
+ avg_pkt_size > MLX4_EN_AVG_PKT_SMALL) {
+ if (rate < priv->pkt_rate_low)
+ moder_time = priv->rx_usecs_low;
+ else if (rate > priv->pkt_rate_high)
+ moder_time = priv->rx_usecs_high;
+ else
+ moder_time = (rate - priv->pkt_rate_low) *
+ (priv->rx_usecs_high - priv->rx_usecs_low) /
+ (priv->pkt_rate_high - priv->pkt_rate_low) +
+ priv->rx_usecs_low;
+ } else {
+ moder_time = priv->rx_usecs_low;
+ }
+
+ if (moder_time != priv->last_moder_time[ring]) {
+ priv->last_moder_time[ring] = moder_time;
+ cq = priv->rx_cq[ring];
+ cq->moder_time = moder_time;
+ cq->moder_cnt = priv->rx_frames;
+ err = mlx4_en_set_cq_moder(priv, cq);
+ if (err)
+ en_err(priv, "Failed modifying moderation for cq:%d\n",
+ ring);
+ }
+ priv->last_moder_packets[ring] = rx_packets;
+ priv->last_moder_bytes[ring] = rx_bytes;
+ }
+
+ priv->last_moder_jiffies = jiffies;
+}
+
+#ifdef KMOD_DISABLED
+static void mlx4_en_do_get_stats(struct work_struct *work)
+{
+ struct delayed_work *delay = to_delayed_work(work);
+ struct mlx4_en_priv *priv = container_of(delay, struct mlx4_en_priv,
+ stats_task);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int err;
+
+ mutex_lock(&mdev->state_lock);
+ if (mdev->device_up) {
+ if (priv->port_up) {
+ if (mlx4_is_slave(mdev->dev))
+ err = mlx4_en_get_vport_stats(mdev, priv->port);
+ else
+ err = mlx4_en_DUMP_ETH_STATS(mdev, priv->port, 0);
+ if (err)
+ en_dbg(HW, priv, "Could not update stats\n");
+
+ mlx4_en_auto_moderation(priv);
+ }
+
+ queue_delayed_work(mdev->workqueue, &priv->stats_task, STATS_DELAY);
+ }
+ if (mdev->mac_removed[MLX4_MAX_PORTS + 1 - priv->port]) {
+ mlx4_en_do_set_mac(priv, priv->current_mac);
+ mdev->mac_removed[MLX4_MAX_PORTS + 1 - priv->port] = 0;
+ }
+ mutex_unlock(&mdev->state_lock);
+}
+#endif
+
+/* mlx4_en_service_task - Run service task for tasks that need to be done
+ * periodically
+ */
+#ifdef KMOD_DISABLED
+static void mlx4_en_service_task(struct work_struct *work)
+{
+ struct delayed_work *delay = to_delayed_work(work);
+ struct mlx4_en_priv *priv = container_of(delay, struct mlx4_en_priv,
+ service_task);
+ struct mlx4_en_dev *mdev = priv->mdev;
+
+ mutex_lock(&mdev->state_lock);
+ if (mdev->device_up) {
+ if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_TS)
+ mlx4_en_ptp_overflow_check(mdev);
+
+ mlx4_en_recover_from_oom(priv);
+ queue_delayed_work(mdev->workqueue, &priv->service_task,
+ SERVICE_TASK_DELAY);
+ }
+ mutex_unlock(&mdev->state_lock);
+}
+
+static void mlx4_en_linkstate(struct work_struct *work)
+{
+ struct mlx4_en_priv *priv = container_of(work, struct mlx4_en_priv,
+ linkstate_task);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int linkstate = priv->link_state;
+
+ mutex_lock(&mdev->state_lock);
+ /* If observable port state changed set carrier state and
+ * report to system log */
+ if (priv->last_link_state != linkstate) {
+ if (linkstate == MLX4_DEV_EVENT_PORT_DOWN) {
+ en_info(priv, "Link Down\n");
+ netif_carrier_off(priv->dev);
+ } else {
+ en_info(priv, "Link Up\n");
+ netif_carrier_on(priv->dev);
+ }
+ }
+ priv->last_link_state = linkstate;
+ mutex_unlock(&mdev->state_lock);
+}
+
+
+static int mlx4_en_init_affinity_hint(struct mlx4_en_priv *priv, int ring_idx)
+{
+ struct mlx4_en_rx_ring *ring = priv->rx_ring[ring_idx];
+ int numa_node = priv->mdev->dev->numa_node;
+ int ret = 0;
+
+ if (!zalloc_cpumask_var(&ring->affinity_mask, GFP_KERNEL))
+ return -ENOMEM;
+
+ ret = cpumask_set_cpu_local_first(ring_idx, numa_node,
+ ring->affinity_mask);
+ if (ret)
+ free_cpumask_var(ring->affinity_mask);
+
+ return ret;
+}
+
+
+static void mlx4_en_free_affinity_hint(struct mlx4_en_priv *priv, int ring_idx)
+{
+ free_cpumask_var(priv->rx_ring[ring_idx]->affinity_mask);
+}
+#endif
+
+
+int mlx4_en_start_port(struct rte_eth_dev *dev)
+{
+ struct mlx4_en_priv *priv = rtedev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_en_cq *cq;
+ struct mlx4_en_tx_ring *tx_ring;
+ int rx_index = 0;
+ int tx_index = 0;
+ int err = 0;
+ int i;
+ int j;
+ u8 mc_list[16] = {0};
+
+ if (priv->port_up) {
+ en_dbg(DRV, priv, "start port called while port already up\n");
+ return 0;
+ }
+
+ INIT_LIST_HEAD(&priv->mc_list);
+ INIT_LIST_HEAD(&priv->curr_list);
+ INIT_LIST_HEAD(&priv->ethtool_list);
+ memset(&priv->ethtool_rules[0], 0,
+ sizeof(struct ethtool_flow_id) * MAX_NUM_OF_FS_RULES);
+
+ /* Calculate Rx buf size */
+ dev->data->mtu = min(dev->data->mtu, priv->max_mtu);
+ mlx4_en_calc_rx_buf(dev);
+ en_dbg(DRV, priv, "Rx buf size:%d\n", priv->rx_skb_size);
+
+ /* Configure rx cq's and rings */
+ err = mlx4_en_activate_rx_rings(priv);
+ if (err) {
+ en_err(priv, "Failed to activate RX rings\n");
+ return err;
+ }
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ cq = priv->rx_cq[i];
+
+#ifdef KMOD_DISABLED
+ mlx4_en_cq_init_lock(cq);
+
+ err = mlx4_en_init_affinity_hint(priv, i);
+ if (err) {
+ en_err(priv, "Failed preparing IRQ affinity hint\n");
+ goto cq_err;
+ }
+#endif
+
+ err = mlx4_en_activate_cq(priv, cq, i);
+ if (err) {
+ en_err(priv, "Failed activating Rx CQ\n");
+ //mlx4_en_free_affinity_hint(priv, i);
+ goto cq_err;
+ }
+
+ for (j = 0; j < cq->size; j++) {
+ struct mlx4_cqe *cqe = NULL;
+
+ cqe = mlx4_en_get_cqe(cq->buf, j, priv->cqe_size) +
+ priv->cqe_factor;
+ cqe->owner_sr_opcode = MLX4_CQE_OWNER_MASK;
+ }
+
+ err = mlx4_en_set_cq_moder(priv, cq);
+ if (err) {
+ en_err(priv, "Failed setting cq moderation parameters\n");
+ mlx4_en_deactivate_cq(priv, cq);
+ //mlx4_en_free_affinity_hint(priv, i);
+ goto cq_err;
+ }
+ mlx4_en_arm_cq(priv, cq);
+ priv->rx_ring[i]->cqn = cq->mcq.cqn;
+ ++rx_index;
+ }
+
+ /* Set qp number */
+ en_dbg(DRV, priv, "Getting qp number for port %d\n", priv->port);
+ err = mlx4_en_get_qp(priv);
+ if (err) {
+ en_err(priv, "Failed getting eth qp\n");
+ goto cq_err;
+ }
+ mdev->mac_removed[priv->port] = 0;
+
+ /* gets default allocated counter index from func cap */
+ /* or sink counter index if no resources */
+ priv->counter_index = mdev->dev->caps.def_counter_index[priv->port - 1];
+
+ en_dbg(DRV, priv, "%s: default counter index %d for port %d\n",
+ __func__, priv->counter_index, priv->port);
+
+ err = mlx4_en_config_rss_steer(priv);
+ if (err) {
+ en_err(priv, "Failed configuring rss steering\n");
+ goto mac_err;
+ }
+
+ err = mlx4_en_create_drop_qp(priv);
+ if (err)
+ goto rss_err;
+
+ /* Configure tx cq's and rings */
+ for (i = 0; i < priv->tx_ring_num; i++) {
+ /* Configure cq */
+ cq = priv->tx_cq[i];
+ err = mlx4_en_activate_cq(priv, cq, i);
+ if (err) {
+ en_err(priv, "Failed allocating Tx CQ\n");
+ goto tx_err;
+ }
+ err = mlx4_en_set_cq_moder(priv, cq);
+ if (err) {
+ en_err(priv, "Failed setting cq moderation parameters\n");
+ mlx4_en_deactivate_cq(priv, cq);
+ goto tx_err;
+ }
+ en_dbg(DRV, priv, "Resetting index of collapsed CQ:%d to -1\n", i);
+ cq->buf->wqe_index = cpu_to_be16(0xffff);
+
+ /* Configure ring */
+ tx_ring = priv->tx_ring[i];
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ err = mlx4_en_activate_tx_ring(priv, tx_ring, cq->mcq.cqn,
+ i / priv->num_tx_rings_p_up);
+#else
+ err = mlx4_en_activate_tx_ring(priv, tx_ring, cq->mcq.cqn);
+#endif
+ if (err) {
+ en_err(priv, "Failed allocating Tx ring\n");
+ mlx4_en_deactivate_cq(priv, cq);
+ goto tx_err;
+ }
+ //tx_ring->tx_queue = netdev_get_tx_queue(dev, i);
+
+ /* Arm CQ for TX completions */
+ mlx4_en_arm_cq(priv, cq);
+
+ /* Set initial ownership of all Tx TXBBs to SW (1) */
+ for (j = 0; j < tx_ring->buf_size; j += STAMP_STRIDE)
+ *((u32 *) (tx_ring->buf + j)) = INIT_OWNER_BIT;
+ ++tx_index;
+ }
+
+ /* Configure port */
+ err = mlx4_SET_PORT_general(mdev->dev, priv->port,
+ priv->rx_skb_size + ETH_FCS_LEN,
+ priv->prof->tx_pause,
+ priv->prof->tx_ppp,
+ priv->prof->rx_pause,
+ priv->prof->rx_ppp);
+ if (err) {
+ en_err(priv, "Failed setting port general configurations for port %d, with error %d\n",
+ priv->port, err);
+ goto tx_err;
+ }
+ /* Set default qp number */
+ err = mlx4_SET_PORT_qpn_calc(mdev->dev, priv->port, priv->base_qpn, 0);
+ if (err) {
+ en_err(priv, "Failed setting default qp numbers\n");
+ goto tx_err;
+ }
+
+ if (mdev->dev->caps.tunnel_offload_mode == MLX4_TUNNEL_OFFLOAD_MODE_VXLAN) {
+ err = mlx4_SET_PORT_VXLAN(mdev->dev, priv->port, VXLAN_STEER_BY_OUTER_MAC, 1);
+ if (err) {
+ en_err(priv, "Failed setting port L2 tunnel configuration, err %d\n",
+ err);
+ goto tx_err;
+ }
+ }
+
+ /* Init port */
+ en_dbg(HW, priv, "Initializing port\n");
+ err = mlx4_INIT_PORT(mdev->dev, priv->port);
+ if (err) {
+ en_err(priv, "Failed Initializing port\n");
+ goto tx_err;
+ }
+
+ /* Attach rx QP to broadcast address */
+ memset(&mc_list[10], 0xff, ETH_ALEN);
+ mc_list[5] = priv->port; /* needed for B0 steering support */
+ if (mlx4_multicast_attach(mdev->dev, &priv->rss_map.indir_qp, mc_list,
+ priv->port, 0, MLX4_PROT_ETH,
+ &priv->broadcast_id))
+ mlx4_warn(mdev, "Failed Attaching Broadcast\n");
+
+ /* Must redo promiscuous mode setup. */
+ priv->flags &= ~(MLX4_EN_FLAG_PROMISC | MLX4_EN_FLAG_MC_PROMISC);
+
+ /* Schedule multicast task to populate multicast list */
+ //queue_work(mdev->workqueue, &priv->rx_mode_task);
+
+#ifdef HAVE_VXLAN_DYNAMIC_PORT
+ if (priv->mdev->dev->caps.tunnel_offload_mode == MLX4_TUNNEL_OFFLOAD_MODE_VXLAN)
+ vxlan_get_rx_port(dev);
+#endif
+
+ priv->port_up = true;
+ //netif_tx_start_all_queues(dev);
+ //netif_device_attach(dev);
+
+ return 0;
+
+tx_err:
+ while (tx_index--) {
+ mlx4_en_deactivate_tx_ring(priv, priv->tx_ring[tx_index]);
+ mlx4_en_deactivate_cq(priv, priv->tx_cq[tx_index]);
+ }
+ mlx4_en_destroy_drop_qp(priv);
+rss_err:
+ mlx4_en_release_rss_steer(priv);
+mac_err:
+ mlx4_en_put_qp(priv);
+cq_err:
+ while (rx_index--) {
+ mlx4_en_deactivate_cq(priv, priv->rx_cq[rx_index]);
+ //mlx4_en_free_affinity_hint(priv, rx_index);
+ }
+ for (i = 0; i < priv->rx_ring_num; i++)
+ mlx4_en_deactivate_rx_ring(priv, priv->rx_ring[i]);
+
+ return err; /* need to close devices */
+}
+
+
+void mlx4_en_stop_port(struct rte_eth_dev *dev, int detach)
+{
+ struct mlx4_en_priv *priv = rtedev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_en_mc_list *mclist, *tmp;
+ struct ethtool_flow_id *flow, *tmp_flow;
+ int i;
+ u8 mc_list[16] = {0};
+
+ if (!priv->port_up) {
+ en_dbg(DRV, priv, "stop port called while port already down\n");
+ return;
+ }
+
+ /* Close port */
+ mlx4_CLOSE_PORT(mdev->dev, priv->port);
+
+ /* Synchronize with tx routine */
+ //netif_tx_lock_bh(dev);
+ //if (detach)
+ // netif_device_detach(dev);
+ //netif_tx_stop_all_queues(dev);
+ //netif_tx_unlock_bh(dev);
+
+ //netif_tx_disable(dev);
+
+ /* Set port as not active */
+ priv->port_up = false;
+
+ /* Promiscuous mode */
+ if (mdev->dev->caps.steering_mode ==
+ MLX4_STEERING_MODE_DEVICE_MANAGED) {
+ priv->flags &= ~(MLX4_EN_FLAG_PROMISC |
+ MLX4_EN_FLAG_MC_PROMISC);
+ mlx4_flow_steer_promisc_remove(mdev->dev,
+ priv->port,
+ MLX4_FS_ALL_DEFAULT);
+ mlx4_flow_steer_promisc_remove(mdev->dev,
+ priv->port,
+ MLX4_FS_MC_DEFAULT);
+ } else if (priv->flags & MLX4_EN_FLAG_PROMISC) {
+ priv->flags &= ~MLX4_EN_FLAG_PROMISC;
+
+ /* Disable promiscuous mode */
+ mlx4_unicast_promisc_remove(mdev->dev, priv->base_qpn,
+ priv->port);
+
+ /* Disable Multicast promisc */
+ if (priv->flags & MLX4_EN_FLAG_MC_PROMISC) {
+ mlx4_multicast_promisc_remove(mdev->dev, priv->base_qpn,
+ priv->port);
+ priv->flags &= ~MLX4_EN_FLAG_MC_PROMISC;
+ }
+ }
+
+ /* Detach All multicasts */
+ memset(&mc_list[10], 0xff, ETH_ALEN);
+ mc_list[5] = priv->port; /* needed for B0 steering support */
+ mlx4_multicast_detach(mdev->dev, &priv->rss_map.indir_qp, mc_list,
+ MLX4_PROT_ETH, priv->broadcast_id);
+ list_for_each_entry(mclist, &priv->curr_list, list) {
+ memcpy(&mc_list[10], mclist->addr, ETH_ALEN);
+ mc_list[5] = priv->port;
+ mlx4_multicast_detach(mdev->dev, &priv->rss_map.indir_qp,
+ mc_list, MLX4_PROT_ETH, mclist->reg_id);
+ if (mclist->tunnel_reg_id)
+ mlx4_flow_detach(mdev->dev, mclist->tunnel_reg_id);
+ }
+ mlx4_en_clear_list(dev);
+ list_for_each_entry_safe(mclist, tmp, &priv->curr_list, list) {
+ list_del(&mclist->list);
+ kfree(mclist);
+ }
+
+ /* Flush multicast filter */
+ mlx4_SET_MCAST_FLTR(mdev->dev, priv->port, 0, 1, MLX4_MCAST_CONFIG);
+
+ /* Remove flow steering rules for the port */
+ if (mdev->dev->caps.steering_mode ==
+ MLX4_STEERING_MODE_DEVICE_MANAGED) {
+ //ASSERT_RTNL();
+ assert(0);
+ list_for_each_entry_safe(flow, tmp_flow,
+ &priv->ethtool_list, list) {
+ mlx4_flow_detach(mdev->dev, flow->id);
+ list_del(&flow->list);
+ }
+ }
+
+ mlx4_en_destroy_drop_qp(priv);
+
+ /* Free TX Rings */
+ for (i = 0; i < priv->tx_ring_num; i++) {
+ mlx4_en_deactivate_tx_ring(priv, priv->tx_ring[i]);
+ mlx4_en_deactivate_cq(priv, priv->tx_cq[i]);
+ }
+ msleep(10);
+
+ for (i = 0; i < priv->tx_ring_num; i++)
+ mlx4_en_free_tx_buf(dev, priv->tx_ring[i]);
+
+ /* Free RSS qps */
+ mlx4_en_release_rss_steer(priv);
+
+ /* Unregister Mac address for the port */
+ mlx4_en_put_qp(priv);
+ if (!(mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_REASSIGN_MAC_EN))
+ mdev->mac_removed[priv->port] = 1;
+
+ /* Free RX Rings */
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ struct mlx4_en_cq *cq = priv->rx_cq[i];
+/*
+ local_bh_disable();
+ while (!mlx4_en_cq_lock_napi(cq)) {
+ pr_info("CQ %d locked\n", i);
+ mdelay(1);
+ }
+ local_bh_enable();
+
+ napi_synchronize(&cq->napi);
+ */
+ mlx4_en_deactivate_rx_ring(priv, priv->rx_ring[i]);
+ mlx4_en_deactivate_cq(priv, cq);
+
+ //mlx4_en_free_affinity_hint(priv, i);
+ }
+}
+
+#ifdef KMOD_DISABLED
+static void mlx4_en_restart(struct work_struct *work)
+{
+ struct mlx4_en_priv *priv = container_of(work, struct mlx4_en_priv,
+ watchdog_task);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct net_device *dev = priv->dev;
+
+ en_dbg(DRV, priv, "Watchdog task called for port %d\n", priv->port);
+
+ mutex_lock(&mdev->state_lock);
+ if (priv->port_up) {
+ mlx4_en_stop_port(dev, 1);
+ if (mlx4_en_start_port(dev))
+ en_err(priv, "Failed restarting port %d\n", priv->port);
+ }
+ mutex_unlock(&mdev->state_lock);
+}
+
+
+static void mlx4_en_clear_stats(struct net_device *dev)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int i;
+
+ if (!mlx4_is_slave(mdev->dev))
+ if (mlx4_en_DUMP_ETH_STATS(mdev, priv->port, 1))
+ en_dbg(HW, priv, "Failed dumping statistics\n");
+
+ memset(&priv->stats, 0, sizeof(priv->stats));
+ memset(&priv->pstats, 0, sizeof(priv->pstats));
+ memset(&priv->pkstats, 0, sizeof(priv->pkstats));
+ memset(&priv->port_stats, 0, sizeof(priv->port_stats));
+ memset(&priv->vport_stats, 0, sizeof(priv->vport_stats));
+ memset(&priv->rx_flowstats, 0, sizeof(priv->rx_flowstats));
+ memset(&priv->tx_flowstats, 0, sizeof(priv->tx_flowstats));
+ memset(&priv->rx_priority_flowstats, 0,
+ sizeof(priv->rx_priority_flowstats));
+ memset(&priv->tx_priority_flowstats, 0,
+ sizeof(priv->tx_priority_flowstats));
+
+ for (i = 0; i < priv->tx_ring_num; i++) {
+ priv->tx_ring[i]->bytes = 0;
+ priv->tx_ring[i]->packets = 0;
+ priv->tx_ring[i]->tx_csum = 0;
+ }
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ priv->rx_ring[i]->bytes = 0;
+ priv->rx_ring[i]->packets = 0;
+ priv->rx_ring[i]->csum_ok = 0;
+ priv->rx_ring[i]->csum_none = 0;
+ priv->rx_ring[i]->csum_complete = 0;
+ }
+}
+#endif
+
+#ifdef KMOD_DISABLED
+static int mlx4_en_open(struct rte_eth_dev *dev)
+{
+ struct mlx4_en_priv *priv = rtedev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int err = 0;
+
+ mutex_lock(&mdev->state_lock);
+
+ if (!mdev->device_up) {
+ en_err(priv, "Cannot open - device down/disabled\n");
+ err = -EBUSY;
+ goto out;
+ }
+
+ /* Reset HW statistics and SW counters */
+ //mlx4_en_clear_stats(dev);
+
+ err = mlx4_en_start_port(dev);
+ if (err)
+ en_err(priv, "Failed starting port:%d\n", priv->port);
+
+out:
+ mutex_unlock(&mdev->state_lock);
+ return err;
+}
+
+
+static int mlx4_en_close(struct rte_eth_dev *dev)
+{
+ struct mlx4_en_priv *priv = rtedev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+
+ en_dbg(IFDOWN, priv, "Close port called\n");
+
+ mutex_lock(&mdev->state_lock);
+
+ mlx4_en_stop_port(dev, 0);
+ //netif_carrier_off(dev);
+
+ mutex_unlock(&mdev->state_lock);
+ return 0;
+}
+
+void mlx4_en_free_resources(struct mlx4_en_priv *priv)
+{
+ int i;
+
+#ifdef HAVE_NETDEV_RX_CPU_RMAP
+#ifdef CONFIG_RFS_ACCEL
+ priv->dev->rx_cpu_rmap = NULL;
+#endif
+#endif
+
+ for (i = 0; i < priv->tx_ring_num; i++) {
+ if (priv->tx_ring && priv->tx_ring[i])
+ mlx4_en_destroy_tx_ring(priv, &priv->tx_ring[i]);
+ if (priv->tx_cq && priv->tx_cq[i])
+ mlx4_en_destroy_cq(priv, &priv->tx_cq[i]);
+ }
+
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ if (priv->rx_ring[i])
+ mlx4_en_destroy_rx_ring(priv, &priv->rx_ring[i],
+ priv->prof->rx_ring_size, priv->stride);
+ if (priv->rx_cq[i])
+ mlx4_en_destroy_cq(priv, &priv->rx_cq[i]);
+ }
+}
+
+int mlx4_en_alloc_resources(struct mlx4_en_priv *priv)
+{
+ struct mlx4_en_port_profile *prof = priv->prof;
+ int i;
+ int node;
+
+ /* Create tx Rings */
+ for (i = 0; i < priv->tx_ring_num; i++) {
+ //node = cpu_to_node(i % num_online_cpus());
+ node = priv->rte_dev->pci_dev->numa_node;
+ if (mlx4_en_create_cq(priv, &priv->tx_cq[i],
+ prof->tx_ring_size, i, TX, node))
+ goto err;
+
+ if (mlx4_en_create_tx_ring(priv, &priv->tx_ring[i],
+ prof->tx_ring_size, TXBB_SIZE,
+ node, i))
+ goto err;
+ }
+
+ /* Create rx Rings */
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ //node = cpu_to_node(i % num_online_cpus());
+ node = priv->rte_dev->pci_dev->numa_node;
+ if (mlx4_en_create_cq(priv, &priv->rx_cq[i],
+ prof->rx_ring_size, i, RX, node))
+ goto err;
+
+ if (mlx4_en_create_rx_ring(priv, &priv->rx_ring[i],
+ prof->rx_ring_size, priv->stride,
+ node))
+ goto err;
+ }
+
+#ifdef HAVE_NETDEV_RX_CPU_RMAP
+#ifdef CONFIG_RFS_ACCEL
+ priv->dev->rx_cpu_rmap = mlx4_get_cpu_rmap(priv->mdev->dev, priv->port);
+#endif
+#endif
+
+ return 0;
+
+err:
+ en_err(priv, "Failed to allocate NIC resources\n");
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ if (priv->rx_ring[i])
+ mlx4_en_destroy_rx_ring(priv, &priv->rx_ring[i],
+ prof->rx_ring_size,
+ priv->stride);
+ if (priv->rx_cq[i])
+ mlx4_en_destroy_cq(priv, &priv->rx_cq[i]);
+ }
+ for (i = 0; i < priv->tx_ring_num; i++) {
+ if (priv->tx_ring[i])
+ mlx4_en_destroy_tx_ring(priv, &priv->tx_ring[i]);
+ if (priv->tx_cq[i])
+ mlx4_en_destroy_cq(priv, &priv->tx_cq[i]);
+ }
+ return -ENOMEM;
+}
+#endif
+
+#ifdef KMOD_DISABLED
+static const char fmt_u64[] = "%llu\n";
+
+struct en_stats_attribute {
+ struct attribute attr;
+ ssize_t (*show)(struct en_port *, struct en_stats_attribute *,
+ char *buf);
+ ssize_t (*store)(struct en_port *, struct en_stats_attribute *,
+ char *buf, size_t count);
+};
+
+struct en_port_attribute {
+ struct attribute attr;
+ ssize_t (*show)(struct en_port *, struct en_port_attribute *,
+ char *buf);
+ ssize_t (*store)(struct en_port *, struct en_port_attribute *,
+ const char *buf, size_t count);
+};
+
+#define EN_PORT_ATTR(_name, _mode, _show, _store) \
+struct en_stats_attribute en_stats_attr_##_name = \
+ __ATTR(_name, _mode, _show, _store)
+
+/* Show a given attribute in the statistics group */
+static ssize_t mlx4_en_show_vf_statistics(struct en_port *en_p,
+ struct en_stats_attribute *attr,
+ char *buf, unsigned long offset)
+{
+ ssize_t ret = -EINVAL;
+ struct net_device_stats link_stats;
+
+ memset(&link_stats, 0xff, sizeof(struct net_device_stats));
+
+ mlx4_get_vf_statistics(en_p->dev, en_p->port_num, en_p->vport_num, &link_stats);
+
+ ret = sprintf(buf, fmt_u64, *(u64 *)(((u8 *)&link_stats) + offset));
+
+ return ret;
+}
+
+/* generate a read-only statistics attribute */
+#define VFSTAT_ENTRY(name) \
+static ssize_t name##_show(struct en_port *en_p, \
+ struct en_stats_attribute *attr, char *buf) \
+{ \
+ return mlx4_en_show_vf_statistics(en_p, attr, buf, \
+ offsetof(struct net_device_stats, name)); \
+} \
+static EN_PORT_ATTR(name, S_IRUGO, name##_show, NULL)
+
+VFSTAT_ENTRY(rx_packets);
+VFSTAT_ENTRY(tx_packets);
+VFSTAT_ENTRY(rx_bytes);
+VFSTAT_ENTRY(tx_bytes);
+VFSTAT_ENTRY(rx_errors);
+VFSTAT_ENTRY(tx_errors);
+VFSTAT_ENTRY(rx_dropped);
+VFSTAT_ENTRY(tx_dropped);
+VFSTAT_ENTRY(multicast);
+VFSTAT_ENTRY(collisions);
+VFSTAT_ENTRY(rx_length_errors);
+VFSTAT_ENTRY(rx_over_errors);
+VFSTAT_ENTRY(rx_crc_errors);
+VFSTAT_ENTRY(rx_frame_errors);
+VFSTAT_ENTRY(rx_fifo_errors);
+VFSTAT_ENTRY(rx_missed_errors);
+VFSTAT_ENTRY(tx_aborted_errors);
+VFSTAT_ENTRY(tx_carrier_errors);
+VFSTAT_ENTRY(tx_fifo_errors);
+VFSTAT_ENTRY(tx_heartbeat_errors);
+VFSTAT_ENTRY(tx_window_errors);
+VFSTAT_ENTRY(rx_compressed);
+VFSTAT_ENTRY(tx_compressed);
+
+static struct attribute *vfstat_attrs[] = {
+ &en_stats_attr_rx_packets.attr,
+ &en_stats_attr_tx_packets.attr,
+ &en_stats_attr_rx_bytes.attr,
+ &en_stats_attr_tx_bytes.attr,
+ &en_stats_attr_rx_errors.attr,
+ &en_stats_attr_tx_errors.attr,
+ &en_stats_attr_rx_dropped.attr,
+ &en_stats_attr_tx_dropped.attr,
+ &en_stats_attr_multicast.attr,
+ &en_stats_attr_collisions.attr,
+ &en_stats_attr_rx_length_errors.attr,
+ &en_stats_attr_rx_over_errors.attr,
+ &en_stats_attr_rx_crc_errors.attr,
+ &en_stats_attr_rx_frame_errors.attr,
+ &en_stats_attr_rx_fifo_errors.attr,
+ &en_stats_attr_rx_missed_errors.attr,
+ &en_stats_attr_tx_aborted_errors.attr,
+ &en_stats_attr_tx_carrier_errors.attr,
+ &en_stats_attr_tx_fifo_errors.attr,
+ &en_stats_attr_tx_heartbeat_errors.attr,
+ &en_stats_attr_tx_window_errors.attr,
+ &en_stats_attr_rx_compressed.attr,
+ &en_stats_attr_tx_compressed.attr,
+ NULL
+};
+
+static ssize_t en_stats_show(struct kobject *kobj, struct attribute *attr,
+ char *buf)
+{
+ struct en_stats_attribute *en_stats_attr =
+ container_of(attr, struct en_stats_attribute, attr);
+ struct en_port *p = container_of(kobj, struct en_port, kobj_stats);
+
+ if (!en_stats_attr->show)
+ return -EIO;
+
+ return en_stats_attr->show(p, en_stats_attr, buf);
+}
+
+#ifdef CONFIG_COMPAT_SYSFS_OPS_CONST
+static const struct sysfs_ops en_port_stats_sysfs_ops = {
+#else
+static struct sysfs_ops en_port_stats_sysfs_ops = {
+#endif
+ .show = en_stats_show
+};
+
+static struct kobj_type en_port_stats = {
+ .sysfs_ops = &en_port_stats_sysfs_ops,
+ .default_attrs = vfstat_attrs,
+};
+
+static ssize_t mlx4_en_show_vf_link_state(struct en_port *en_p,
+ struct en_port_attribute *attr,
+ char *buf)
+{
+ static const char * const str[] = { "auto", "enable", "disable" };
+ int link_state;
+ ssize_t len = 0;
+
+ link_state = mlx4_get_vf_link_state(en_p->dev, en_p->port_num,
+ en_p->vport_num);
+ if (link_state >= 0)
+ len += sprintf(&buf[len], "%s\n", str[link_state]);
+
+ return len;
+}
+
+static ssize_t mlx4_en_store_vf_link_state(struct en_port *en_p,
+ struct en_port_attribute *attr,
+ const char *buf, size_t count)
+{
+ int err, link_state;
+
+ if (count > 128)
+ return -EINVAL;
+
+ if (strstr(buf, "auto"))
+ link_state = IFLA_VF_LINK_STATE_AUTO;
+ else if (strstr(buf, "enable"))
+ link_state = IFLA_VF_LINK_STATE_ENABLE;
+ else if (strstr(buf, "disable"))
+ link_state = IFLA_VF_LINK_STATE_DISABLE;
+ else
+ return -EINVAL;
+
+ err = mlx4_set_vf_link_state(en_p->dev, en_p->port_num,
+ en_p->vport_num, link_state);
+ return err ? err : count;
+}
+
+struct en_port_attribute en_port_attr_link_state = __ATTR(link_state,
+ S_IRUGO | S_IWUSR,
+ mlx4_en_show_vf_link_state,
+ mlx4_en_store_vf_link_state);
+
+static ssize_t mlx4_en_show_tx_rate(struct en_port *en_p,
+ struct en_port_attribute *attr,
+ char *buf)
+{
+ return mlx4_get_vf_rate(en_p->dev, en_p->port_num,
+ en_p->vport_num, buf);
+}
+
+struct en_port_attribute en_port_attr_tx_rate = __ATTR(tx_rate,
+ S_IRUGO,
+ mlx4_en_show_tx_rate,
+ NULL);
+
+static ssize_t en_port_show(struct kobject *kobj,
+ struct attribute *attr, char *buf)
+{
+ struct en_port_attribute *en_port_attr =
+ container_of(attr, struct en_port_attribute, attr);
+ struct en_port *p = container_of(kobj, struct en_port, kobj_vf);
+
+ if (!en_port_attr->show)
+ return -EIO;
+
+ return en_port_attr->show(p, en_port_attr, buf);
+}
+
+static ssize_t en_port_store(struct kobject *kobj,
+ struct attribute *attr,
+ const char *buf, size_t count)
+{
+ struct en_port_attribute *en_port_attr =
+ container_of(attr, struct en_port_attribute, attr);
+ struct en_port *p = container_of(kobj, struct en_port, kobj_vf);
+
+ if (!en_port_attr->store)
+ return -EIO;
+
+ return en_port_attr->store(p, en_port_attr, buf, count);
+}
+
+#ifdef CONFIG_COMPAT_SYSFS_OPS_CONST
+static const struct sysfs_ops en_port_vf_ops = {
+#else
+static struct sysfs_ops en_port_vf_ops = {
+#endif
+ .show = en_port_show,
+ .store = en_port_store,
+};
+
+static struct attribute *vf_attrs[] = {
+ &en_port_attr_link_state.attr,
+ &en_port_attr_tx_rate.attr,
+ NULL
+};
+
+static struct kobj_type en_port_type = {
+ .sysfs_ops = &en_port_vf_ops,
+ .default_attrs = vf_attrs,
+};
+#endif
+
+#ifdef KMOD_DISABLED
+void mlx4_en_destroy_netdev(struct rte_eth_dev *dev)
+{
+ struct mlx4_en_priv *priv = rtedev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int i;
+
+ en_dbg(DRV, priv, "Destroying netdev on port:%d\n", priv->port);
+
+#ifdef CONFIG_COMPAT_EN_SYSFS
+ if (priv->sysfs_group_initialized)
+ mlx4_en_sysfs_remove(dev);
+#endif
+
+ /* Unregister device - this will close the port if it was up */
+ if (priv->registered)
+ {
+ //unregister_netdev(dev); XXX
+ }
+
+ if (priv->allocated)
+ mlx4_free_hwq_res(mdev->dev, &priv->res, MLX4_EN_PAGE_SIZE);
+
+ //cancel_delayed_work(&priv->stats_task);
+ //cancel_delayed_work(&priv->service_task);
+ /* flush any pending task for this netdev */
+ //flush_workqueue(mdev->workqueue);
+
+ /* Detach the netdev so tasks would not attempt to access it */
+ mutex_lock(&mdev->state_lock);
+ mdev->rte_pndev[priv->port] = NULL;
+ mdev->rte_upper[priv->port] = NULL;
+ mutex_unlock(&mdev->state_lock);
+
+ if (mlx4_is_master(priv->mdev->dev)) {
+ for (i = 0; i < priv->mdev->dev->persist->num_vfs; i++) {
+ if (priv->vf_ports[i]) {
+ //kobject_put(&priv->vf_ports[i]->kobj_stats);
+ //kobject_put(&priv->vf_ports[i]->kobj_vf);
+ kfree(priv->vf_ports[i]);
+ priv->vf_ports[i] = NULL;
+ }
+ }
+ }
+
+ mlx4_en_free_resources(priv);
+
+ kfree(priv->tx_ring);
+ kfree(priv->tx_cq);
+
+ //free_netdev(dev);
+ rte_free(dev);
+}
+#endif
+
+static int mlx4_en_change_mtu(struct rte_eth_dev *dev, int new_mtu)
+{
+ struct mlx4_en_priv *priv = rtedev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int err = 0;
+
+ en_dbg(DRV, priv, "Change MTU called - current:%d new:%d\n",
+ dev->data->mtu, new_mtu);
+
+ if ((new_mtu < MLX4_EN_MIN_MTU) || (new_mtu > priv->max_mtu)) {
+ en_err(priv, "Bad MTU size:%d.\n", new_mtu);
+ return -EPERM;
+ }
+
+ if (priv->prof->inline_scatter_thold >= MIN_INLINE_SCATTER) {
+ en_err(priv, "Please disable RX Copybreak by setting to 0\n");
+ return -EPERM;
+ }
+
+ dev->data->mtu = new_mtu;
+
+ //if (netif_running(dev))
+ {
+ mutex_lock(&mdev->state_lock);
+ if (!mdev->device_up) {
+ /* NIC is probably restarting - let watchdog task reset
+ * the port */
+ en_dbg(DRV, priv, "Change MTU called with card down!?\n");
+ } else {
+ mlx4_en_stop_port(dev, 1);
+ err = mlx4_en_start_port(dev);
+ if (err) {
+ en_err(priv, "Failed restarting port:%d\n",
+ priv->port);
+ //queue_work(mdev->workqueue, &priv->watchdog_task);
+ }
+ }
+ mutex_unlock(&mdev->state_lock);
+ }
+ return 0;
+}
+
+#ifdef KMOD_DISABLED
+
+#ifdef HAVE_SIOCGHWTSTAMP
+static int mlx4_en_hwtstamp_set(struct net_device *dev, struct ifreq *ifr)
+#else
+static int mlx4_en_hwtstamp_ioctl(struct net_device *dev, struct ifreq *ifr)
+#endif
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct hwtstamp_config config;
+
+ if (copy_from_user(&config, ifr->ifr_data, sizeof(config)))
+ return -EFAULT;
+
+ /* reserved for future extensions */
+ if (config.flags)
+ return -EINVAL;
+
+ /* device doesn't support time stamping */
+ if (!(mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_TS))
+ return -EINVAL;
+
+ /* TX HW timestamp */
+ switch (config.tx_type) {
+ case HWTSTAMP_TX_OFF:
+ case HWTSTAMP_TX_ON:
+ break;
+ default:
+ return -ERANGE;
+ }
+
+ /* RX HW timestamp */
+ switch (config.rx_filter) {
+ case HWTSTAMP_FILTER_NONE:
+ break;
+ case HWTSTAMP_FILTER_ALL:
+ case HWTSTAMP_FILTER_SOME:
+ case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
+ case HWTSTAMP_FILTER_PTP_V1_L4_SYNC:
+ case HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ:
+ case HWTSTAMP_FILTER_PTP_V2_L4_EVENT:
+ case HWTSTAMP_FILTER_PTP_V2_L4_SYNC:
+ case HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ:
+ case HWTSTAMP_FILTER_PTP_V2_L2_EVENT:
+ case HWTSTAMP_FILTER_PTP_V2_L2_SYNC:
+ case HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ:
+ case HWTSTAMP_FILTER_PTP_V2_EVENT:
+ case HWTSTAMP_FILTER_PTP_V2_SYNC:
+ case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ config.rx_filter = HWTSTAMP_FILTER_ALL;
+ break;
+ default:
+ return -ERANGE;
+ }
+
+ if (mlx4_en_reset_config(dev, config, dev->features)) {
+ config.tx_type = HWTSTAMP_TX_OFF;
+ config.rx_filter = HWTSTAMP_FILTER_NONE;
+ }
+
+ return copy_to_user(ifr->ifr_data, &config,
+ sizeof(config)) ? -EFAULT : 0;
+}
+
+#ifdef HAVE_SIOCGHWTSTAMP
+static int mlx4_en_hwtstamp_get(struct net_device *dev, struct ifreq *ifr)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+
+ return copy_to_user(ifr->ifr_data, &priv->hwtstamp_config,
+ sizeof(priv->hwtstamp_config)) ? -EFAULT : 0;
+}
+#endif
+
+static int mlx4_en_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
+{
+ switch (cmd) {
+ case SIOCSHWTSTAMP:
+#ifdef HAVE_SIOCGHWTSTAMP
+ return mlx4_en_hwtstamp_set(dev, ifr);
+ case SIOCGHWTSTAMP:
+ return mlx4_en_hwtstamp_get(dev, ifr);
+#else
+ return mlx4_en_hwtstamp_ioctl(dev, ifr);
+#endif
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
+#ifndef CONFIG_SYSFS_LOOPBACK
+static
+#endif
+int mlx4_en_set_features(struct net_device *netdev,
+#ifdef HAVE_NET_DEVICE_OPS_EXT
+ u32 features)
+#else
+ netdev_features_t features)
+#endif
+{
+ struct mlx4_en_priv *priv = netdev_priv(netdev);
+ bool reset = false;
+ int ret = 0;
+
+#ifdef HAVE_NETIF_F_RXFCS
+ if (DEV_FEATURE_CHANGED(netdev, features, NETIF_F_RXFCS)) {
+ en_info(priv, "Turn %s RX-FCS\n",
+ (features & NETIF_F_RXFCS) ? "ON" : "OFF");
+ reset = true;
+ }
+#endif
+
+#ifdef HAVE_NETIF_F_RXALL
+ if (DEV_FEATURE_CHANGED(netdev, features, NETIF_F_RXALL)) {
+ u8 ignore_fcs_value = (features & NETIF_F_RXALL) ? 1 : 0;
+
+ en_info(priv, "Turn %s RX-ALL\n",
+ ignore_fcs_value ? "ON" : "OFF");
+ ret = mlx4_SET_PORT_fcs_check(priv->mdev->dev,
+ priv->port, ignore_fcs_value);
+ if (ret)
+ return ret;
+ }
+#endif
+
+ if (DEV_FEATURE_CHANGED(netdev, features, NETIF_F_HW_VLAN_CTAG_RX)) {
+ en_info(priv, "Turn %s RX vlan strip offload\n",
+ (features & NETIF_F_HW_VLAN_CTAG_RX) ? "ON" : "OFF");
+ reset = true;
+ }
+
+ if (DEV_FEATURE_CHANGED(netdev, features, NETIF_F_HW_VLAN_CTAG_TX))
+ en_info(priv, "Turn %s TX vlan insert offload\n",
+ (features & NETIF_F_HW_VLAN_CTAG_TX) ? "ON" : "OFF");
+
+ if (DEV_FEATURE_CHANGED(netdev, features, NETIF_F_LOOPBACK)) {
+ en_info(priv, "Turn %s loopback\n",
+ (features & NETIF_F_LOOPBACK) ? "ON" : "OFF");
+ mlx4_en_update_loopback_state(netdev, features);
+ }
+
+ if (reset) {
+ ret = mlx4_en_reset_config(netdev, priv->hwtstamp_config,
+ features);
+ if (ret)
+ return ret;
+ }
+ return 0;
+}
+
+#endif
+
+#ifdef HAVE_NDO_SET_VF_MAC
+static int mlx4_en_set_vf_mac(struct net_device *dev, int queue, u8 *mac)
+{
+ struct mlx4_en_priv *en_priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = en_priv->mdev;
+ u64 mac_u64 = mlx4_mac_to_u64(mac);
+
+ if (!is_valid_ether_addr(mac))
+ return -EINVAL;
+
+ return mlx4_set_vf_mac(mdev->dev, en_priv->port, queue, mac_u64);
+}
+
+static int mlx4_en_set_vf_vlan(struct net_device *dev, int vf, u16 vlan, u8 qos)
+{
+ struct mlx4_en_priv *en_priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = en_priv->mdev;
+
+ return mlx4_set_vf_vlan(mdev->dev, en_priv->port, vf, vlan, qos);
+}
+#endif
+
+#ifdef HAVE_TX_RATE_LIMIT
+static int mlx4_en_set_vf_rate(struct net_device *dev, int vf, int min_tx_rate,
+ int max_tx_rate)
+{
+ struct mlx4_en_priv *en_priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = en_priv->mdev;
+
+ return mlx4_set_vf_rate(mdev->dev, en_priv->port, vf, min_tx_rate,
+ max_tx_rate);
+}
+#elif defined(HAVE_VF_TX_RATE)
+static int mlx4_en_set_vf_tx_rate(struct net_device *dev, int vf, int rate)
+{
+ struct mlx4_en_priv *en_priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = en_priv->mdev;
+
+ return mlx4_set_vf_rate(mdev->dev, en_priv->port, vf, 0, rate);
+}
+#endif
+
+#if defined(HAVE_VF_INFO_SPOOFCHK) || defined(HAVE_NETDEV_OPS_EXT_NDO_SET_VF_SPOOFCHK)
+static int mlx4_en_set_vf_spoofchk(struct net_device *dev, int vf, bool setting)
+{
+ struct mlx4_en_priv *en_priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = en_priv->mdev;
+
+ return mlx4_set_vf_spoofchk(mdev->dev, en_priv->port, vf, setting);
+}
+#endif
+
+#ifdef HAVE_NDO_SET_VF_MAC
+static int mlx4_en_get_vf_config(struct net_device *dev, int vf, struct ifla_vf_info *ivf)
+{
+ struct mlx4_en_priv *en_priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = en_priv->mdev;
+
+ return mlx4_get_vf_config(mdev->dev, en_priv->port, vf, ivf);
+}
+#endif
+
+#if defined(HAVE_NETDEV_OPS_NDO_SET_VF_LINK_STATE) || defined(HAVE_NETDEV_OPS_EXT_NDO_SET_VF_LINK_STATE)
+static int mlx4_en_set_vf_link_state(struct net_device *dev, int vf, int link_state)
+{
+ struct mlx4_en_priv *en_priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = en_priv->mdev;
+
+ return mlx4_set_vf_link_state(mdev->dev, en_priv->port, vf, link_state);
+}
+#endif
+
+#if defined(HAVE_NETDEV_NDO_GET_PHYS_PORT_ID) || defined(HAVE_NETDEV_EXT_NDO_GET_PHYS_PORT_ID)
+#define PORT_ID_BYTE_LEN 8
+static int mlx4_en_get_phys_port_id(struct net_device *dev,
+#ifdef HAVE_NETDEV_PHYS_ITEM_ID
+ struct netdev_phys_item_id *ppid)
+#else
+ struct netdev_phys_port_id *ppid)
+#endif
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_dev *mdev = priv->mdev->dev;
+ int i;
+ u64 phys_port_id = mdev->caps.phys_port_id[priv->port];
+
+ if (!phys_port_id)
+ return -EOPNOTSUPP;
+
+ ppid->id_len = sizeof(phys_port_id);
+ for (i = PORT_ID_BYTE_LEN - 1; i >= 0; --i) {
+ ppid->id[i] = phys_port_id & 0xff;
+ phys_port_id >>= 8;
+ }
+ return 0;
+}
+#endif
+
+#ifdef HAVE_VXLAN_ENABLED
+static void mlx4_en_add_vxlan_offloads(struct work_struct *work)
+{
+ int ret;
+ struct mlx4_en_priv *priv = container_of(work, struct mlx4_en_priv,
+ vxlan_add_task);
+
+ ret = mlx4_config_vxlan_port(priv->mdev->dev, priv->vxlan_port);
+ if (ret)
+ goto out;
+
+ ret = mlx4_SET_PORT_VXLAN(priv->mdev->dev, priv->port,
+ VXLAN_STEER_BY_OUTER_MAC, 1);
+out:
+ if (ret) {
+ en_err(priv, "failed setting L2 tunnel configuration ret %d\n", ret);
+ return;
+ }
+
+ /* set offloads */
+#ifdef HAVE_NETDEV_HW_ENC_FEATURES
+ priv->dev->hw_enc_features |= NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
+ NETIF_F_TSO | NETIF_F_GSO_UDP_TUNNEL;
+#endif
+#ifdef HAVE_NETDEV_HW_FEATURES
+ priv->dev->hw_features |= NETIF_F_GSO_UDP_TUNNEL;
+#elif defined(HAVE_NET_DEVICE_OPS_EXT)
+ netdev_extended(priv->dev)->hw_features |= NETIF_F_GSO_UDP_TUNNEL;
+#endif
+ priv->dev->features |= NETIF_F_GSO_UDP_TUNNEL;
+}
+
+static void mlx4_en_del_vxlan_offloads(struct work_struct *work)
+{
+ int ret;
+ struct mlx4_en_priv *priv = container_of(work, struct mlx4_en_priv,
+ vxlan_del_task);
+ /* unset offloads */
+#ifdef HAVE_NETDEV_HW_ENC_FEATURES
+ priv->dev->hw_enc_features &= ~(NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
+ NETIF_F_TSO | NETIF_F_GSO_UDP_TUNNEL);
+#endif
+#ifdef HAVE_NETDEV_HW_FEATURES
+ priv->dev->hw_features &= ~NETIF_F_GSO_UDP_TUNNEL;
+#elif defined(HAVE_NET_DEVICE_OPS_EXT)
+ netdev_extended(priv->dev)->hw_features &= ~NETIF_F_GSO_UDP_TUNNEL;
+#endif
+ priv->dev->features &= ~NETIF_F_GSO_UDP_TUNNEL;
+
+ ret = mlx4_SET_PORT_VXLAN(priv->mdev->dev, priv->port,
+ VXLAN_STEER_BY_OUTER_MAC, 0);
+ if (ret)
+ en_err(priv, "failed setting L2 tunnel configuration ret %d\n", ret);
+
+ priv->vxlan_port = 0;
+}
+
+#ifdef HAVE_VXLAN_DYNAMIC_PORT
+static void mlx4_en_add_vxlan_port(struct net_device *dev,
+ sa_family_t sa_family, __be16 port)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ __be16 current_port;
+
+ if (priv->mdev->dev->caps.tunnel_offload_mode != MLX4_TUNNEL_OFFLOAD_MODE_VXLAN)
+ return;
+
+ if (sa_family == AF_INET6)
+ return;
+
+ current_port = priv->vxlan_port;
+ if (current_port && current_port != port) {
+ en_warn(priv, "vxlan port %d configured, can't add port %d\n",
+ ntohs(current_port), ntohs(port));
+ return;
+ }
+
+ priv->vxlan_port = port;
+ queue_work(priv->mdev->workqueue, &priv->vxlan_add_task);
+}
+
+static void mlx4_en_del_vxlan_port(struct net_device *dev,
+ sa_family_t sa_family, __be16 port)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ __be16 current_port;
+
+ if (priv->mdev->dev->caps.tunnel_offload_mode != MLX4_TUNNEL_OFFLOAD_MODE_VXLAN)
+ return;
+
+ if (sa_family == AF_INET6)
+ return;
+
+ current_port = priv->vxlan_port;
+ if (current_port != port) {
+ en_dbg(DRV, priv, "vxlan port %d isn't configured, ignoring\n", ntohs(port));
+ return;
+ }
+
+ queue_work(priv->mdev->workqueue, &priv->vxlan_del_task);
+}
+
+#ifdef HAVE_NETDEV_FEATURES_T
+static netdev_features_t mlx4_en_features_check(struct sk_buff *skb,
+ struct net_device *dev,
+ netdev_features_t features)
+{
+ return vxlan_features_check(skb, features);
+}
+
+#else
+#ifdef HAVE_VXLAN_GSO_CHECK
+static bool mlx4_en_gso_check(struct sk_buff *skb, struct net_device *dev)
+{
+ return vxlan_gso_check(skb);
+}
+#endif
+#endif
+#endif
+#endif
+
+#ifdef KMOD_DISABLED
+static const struct net_device_ops mlx4_netdev_ops = {
+ .ndo_open = mlx4_en_open,
+ .ndo_stop = mlx4_en_close,
+ .ndo_start_xmit = mlx4_en_xmit,
+ .ndo_select_queue = mlx4_en_select_queue,
+ .ndo_get_stats = mlx4_en_get_stats,
+ .ndo_set_rx_mode = mlx4_en_set_rx_mode,
+ .ndo_set_mac_address = mlx4_en_set_mac,
+ .ndo_validate_addr = eth_validate_addr,
+ .ndo_change_mtu = mlx4_en_change_mtu,
+ .ndo_do_ioctl = mlx4_en_ioctl,
+ .ndo_tx_timeout = mlx4_en_tx_timeout,
+#ifdef HAVE_VLAN_GRO_RECEIVE
+ .ndo_vlan_rx_register = mlx4_en_vlan_rx_register,
+#endif
+ .ndo_vlan_rx_add_vid = mlx4_en_vlan_rx_add_vid,
+ .ndo_vlan_rx_kill_vid = mlx4_en_vlan_rx_kill_vid,
+#ifdef CONFIG_NET_POLL_CONTROLLER
+ .ndo_poll_controller = mlx4_en_netpoll,
+#endif
+#ifdef HAVE_NDO_SET_FEATURES
+ .ndo_set_features = mlx4_en_set_features,
+#endif
+#ifdef HAVE_NDO_SETUP_TC
+ .ndo_setup_tc = mlx4_en_setup_tc,
+#endif
+#ifdef HAVE_NDO_RX_FLOW_STEER
+#ifdef CONFIG_RFS_ACCEL
+ .ndo_rx_flow_steer = mlx4_en_filter_rfs,
+#endif
+#endif
+#ifdef CONFIG_NET_RX_BUSY_POLL
+#ifndef HAVE_NETDEV_EXTENDED_NDO_BUSY_POLL
+ .ndo_busy_poll = mlx4_en_low_latency_recv,
+#endif
+#endif
+#ifdef HAVE_NETDEV_NDO_GET_PHYS_PORT_ID
+ .ndo_get_phys_port_id = mlx4_en_get_phys_port_id,
+#endif
+#ifdef HAVE_VXLAN_ENABLED
+#ifdef HAVE_VXLAN_DYNAMIC_PORT
+ .ndo_add_vxlan_port = mlx4_en_add_vxlan_port,
+ .ndo_del_vxlan_port = mlx4_en_del_vxlan_port,
+#ifdef HAVE_NETDEV_FEATURES_T
+ .ndo_features_check = mlx4_en_features_check,
+#else
+#ifdef HAVE_VXLAN_GSO_CHECK
+ .ndo_gso_check = mlx4_en_gso_check,
+#endif
+#endif
+#endif
+#endif
+};
+
+static const struct net_device_ops mlx4_netdev_ops_master = {
+ .ndo_open = mlx4_en_open,
+ .ndo_stop = mlx4_en_close,
+ .ndo_start_xmit = mlx4_en_xmit,
+ .ndo_select_queue = mlx4_en_select_queue,
+ .ndo_get_stats = mlx4_en_get_stats,
+ .ndo_set_rx_mode = mlx4_en_set_rx_mode,
+ .ndo_set_mac_address = mlx4_en_set_mac,
+ .ndo_validate_addr = eth_validate_addr,
+ .ndo_change_mtu = mlx4_en_change_mtu,
+ .ndo_tx_timeout = mlx4_en_tx_timeout,
+#ifdef HAVE_VLAN_GRO_RECEIVE
+ .ndo_vlan_rx_register = mlx4_en_vlan_rx_register,
+#endif
+ .ndo_vlan_rx_add_vid = mlx4_en_vlan_rx_add_vid,
+ .ndo_vlan_rx_kill_vid = mlx4_en_vlan_rx_kill_vid,
+#ifdef HAVE_NDO_SET_VF_MAC
+ .ndo_set_vf_mac = mlx4_en_set_vf_mac,
+ .ndo_set_vf_vlan = mlx4_en_set_vf_vlan,
+#endif
+#ifdef HAVE_TX_RATE_LIMIT
+ .ndo_set_vf_rate = mlx4_en_set_vf_rate,
+#elif defined(HAVE_VF_TX_RATE)
+ .ndo_set_vf_tx_rate = mlx4_en_set_vf_tx_rate,
+#endif
+#if (defined(HAVE_NETDEV_OPS_NDO_SET_VF_SPOOFCHK) && !defined(HAVE_NET_DEVICE_OPS_EXT))
+ .ndo_set_vf_spoofchk = mlx4_en_set_vf_spoofchk,
+#endif
+#if (defined(HAVE_NETDEV_OPS_NDO_SET_VF_LINK_STATE) && !defined(HAVE_NET_DEVICE_OPS_EXT))
+ .ndo_set_vf_link_state = mlx4_en_set_vf_link_state,
+#endif
+#ifdef HAVE_NDO_SET_VF_MAC
+ .ndo_get_vf_config = mlx4_en_get_vf_config,
+#endif
+#ifdef CONFIG_NET_POLL_CONTROLLER
+ .ndo_poll_controller = mlx4_en_netpoll,
+#endif
+#if (defined(HAVE_NDO_SET_FEATURES) && !defined(HAVE_NET_DEVICE_OPS_EXT))
+ .ndo_set_features = mlx4_en_set_features,
+#endif
+#ifdef HAVE_NDO_SETUP_TC
+ .ndo_setup_tc = mlx4_en_setup_tc,
+#endif
+#ifdef HAVE_NDO_RX_FLOW_STEER
+#ifdef CONFIG_RFS_ACCEL
+ .ndo_rx_flow_steer = mlx4_en_filter_rfs,
+#endif
+#endif
+#ifdef HAVE_NETDEV_NDO_GET_PHYS_PORT_ID
+ .ndo_get_phys_port_id = mlx4_en_get_phys_port_id,
+#endif
+#ifdef HAVE_VXLAN_ENABLED
+#ifdef HAVE_VXLAN_DYNAMIC_PORT
+ .ndo_add_vxlan_port = mlx4_en_add_vxlan_port,
+ .ndo_del_vxlan_port = mlx4_en_del_vxlan_port,
+#ifdef HAVE_NETDEV_FEATURES_T
+ .ndo_features_check = mlx4_en_features_check,
+#endif
+#endif
+#endif
+};
+
+#ifdef HAVE_NET_DEVICE_OPS_EXT
+static const struct net_device_ops_ext mlx4_netdev_ops_ext = {
+ .size = sizeof(struct net_device_ops_ext),
+ .ndo_set_features = mlx4_en_set_features,
+#ifdef HAVE_NETDEV_EXT_NDO_GET_PHYS_PORT_ID
+ .ndo_get_phys_port_id = mlx4_en_get_phys_port_id,
+#endif
+};
+
+static const struct net_device_ops_ext mlx4_netdev_ops_master_ext = {
+ .size = sizeof(struct net_device_ops_ext),
+#ifdef HAVE_NETDEV_OPS_EXT_NDO_SET_VF_SPOOFCHK
+ .ndo_set_vf_spoofchk = mlx4_en_set_vf_spoofchk,
+#endif
+#if defined(HAVE_NETDEV_OPS_EXT_NDO_SET_VF_LINK_STATE)
+ .ndo_set_vf_link_state = mlx4_en_set_vf_link_state,
+#endif
+#ifdef HAVE_NETDEV_EXT_NDO_GET_PHYS_PORT_ID
+ .ndo_get_phys_port_id = mlx4_en_get_phys_port_id,
+#endif
+ .ndo_set_features = mlx4_en_set_features,
+};
+#endif
+#endif
+
+#ifdef KMOD_DISABLED
+struct mlx4_en_bond {
+ struct work_struct work;
+ struct mlx4_en_priv *priv;
+ int is_bonded;
+ struct mlx4_port_map port_map;
+};
+#endif
+
+#ifdef HAVE_NETDEV_BONDING_INFO
+static void mlx4_en_bond_work(struct work_struct *work)
+{
+ struct mlx4_en_bond *bond = container_of(work,
+ struct mlx4_en_bond,
+ work);
+ int err = 0;
+ struct mlx4_dev *dev = bond->priv->mdev->dev;
+
+ if (bond->is_bonded) {
+ if (!mlx4_is_bonded(dev)) {
+ err = mlx4_bond(dev);
+ if (err)
+ en_err(bond->priv, "Fail to bond device\n");
+ }
+ if (!err) {
+ err = mlx4_port_map_set(dev, &bond->port_map);
+ if (err)
+ en_err(bond->priv, "Fail to set port map [%d][%d]: %d\n",
+ bond->port_map.port1,
+ bond->port_map.port2,
+ err);
+ }
+ } else if (mlx4_is_bonded(dev)) {
+ err = mlx4_unbond(dev);
+ if (err)
+ en_err(bond->priv, "Fail to unbond device\n");
+ }
+ dev_put(bond->priv->dev);
+ kfree(bond);
+}
+
+static int mlx4_en_queue_bond_work(struct mlx4_en_priv *priv, int is_bonded,
+ u8 v2p_p1, u8 v2p_p2)
+{
+ struct mlx4_en_bond *bond = NULL;
+
+ bond = kzalloc(sizeof(*bond), GFP_ATOMIC);
+ if (!bond)
+ return -ENOMEM;
+
+ INIT_WORK(&bond->work, mlx4_en_bond_work);
+ bond->priv = priv;
+ bond->is_bonded = is_bonded;
+ bond->port_map.port1 = v2p_p1;
+ bond->port_map.port2 = v2p_p2;
+ dev_hold(priv->dev);
+ queue_work(priv->mdev->workqueue, &bond->work);
+ return 0;
+}
+
+int mlx4_en_netdev_event(struct notifier_block *this,
+ unsigned long event, void *ptr)
+{
+ struct net_device *ndev = netdev_notifier_info_to_dev(ptr);
+ u8 port = 0;
+ struct mlx4_en_dev *mdev;
+ struct mlx4_dev *dev;
+ int i, num_eth_ports = 0;
+ bool do_bond = true;
+ struct mlx4_en_priv *priv;
+ u8 v2p_port1 = 0;
+ u8 v2p_port2 = 0;
+
+ if (!net_eq(dev_net(ndev), &init_net))
+ return NOTIFY_DONE;
+
+ mdev = container_of(this, struct mlx4_en_dev, nb);
+ dev = mdev->dev;
+
+ /* Go into this mode only when two network devices set on two ports
+ * of the same mlx4 device are slaves of the same bonding master
+ */
+ mlx4_foreach_port(i, dev, MLX4_PORT_TYPE_ETH) {
+ ++num_eth_ports;
+ if (!port && (mdev->pndev[i] == ndev))
+ port = i;
+ mdev->upper[i] = mdev->pndev[i] ?
+ netdev_master_upper_dev_get(mdev->pndev[i]) : NULL;
+ /* condition not met: network device is a slave */
+ if (!mdev->upper[i])
+ do_bond = false;
+ if (num_eth_ports < 2)
+ continue;
+ /* condition not met: same master */
+ if (mdev->upper[i] != mdev->upper[i-1])
+ do_bond = false;
+ }
+ /* condition not met: 2 slaves */
+ do_bond = (num_eth_ports == 2) ? do_bond : false;
+
+ /* handle only events that come with enough info */
+ if ((do_bond && (event != NETDEV_BONDING_INFO)) || !port)
+ return NOTIFY_DONE;
+
+ priv = netdev_priv(ndev);
+ if (do_bond) {
+ struct netdev_notifier_bonding_info *notifier_info = ptr;
+ struct netdev_bonding_info *bonding_info =
+ &notifier_info->bonding_info;
+
+ /* required mode 1, 2 or 4 */
+ if ((bonding_info->master.bond_mode != BOND_MODE_ACTIVEBACKUP) &&
+ (bonding_info->master.bond_mode != BOND_MODE_XOR) &&
+ (bonding_info->master.bond_mode != BOND_MODE_8023AD))
+ do_bond = false;
+
+ /* require exactly 2 slaves */
+ if (bonding_info->master.num_slaves != 2)
+ do_bond = false;
+
+ /* calc v2p */
+ if (do_bond) {
+ if (bonding_info->master.bond_mode ==
+ BOND_MODE_ACTIVEBACKUP) {
+ /* in active-backup mode virtual ports are
+ * mapped to the physical port of the active
+ * slave */
+ if (bonding_info->slave.state ==
+ BOND_STATE_BACKUP) {
+ if (port == 1) {
+ v2p_port1 = 2;
+ v2p_port2 = 2;
+ } else {
+ v2p_port1 = 1;
+ v2p_port2 = 1;
+ }
+ } else { /* BOND_STATE_ACTIVE */
+ if (port == 1) {
+ v2p_port1 = 1;
+ v2p_port2 = 1;
+ } else {
+ v2p_port1 = 2;
+ v2p_port2 = 2;
+ }
+ }
+ } else { /* Active-Active */
+ /* in active-active mode a virtual port is
+ * mapped to the native physical port if and only
+ * if the physical port is up */
+ __s8 link = bonding_info->slave.link;
+
+ if (port == 1)
+ v2p_port2 = 2;
+ else
+ v2p_port1 = 1;
+ if ((link == BOND_LINK_UP) ||
+ (link == BOND_LINK_FAIL)) {
+ if (port == 1)
+ v2p_port1 = 1;
+ else
+ v2p_port2 = 2;
+ } else { /* BOND_LINK_DOWN || BOND_LINK_BACK */
+ if (port == 1)
+ v2p_port1 = 2;
+ else
+ v2p_port2 = 1;
+ }
+ }
+ }
+ }
+
+ mlx4_en_queue_bond_work(priv, do_bond,
+ v2p_port1, v2p_port2);
+
+ return NOTIFY_DONE;
+}
+#endif
+
+#ifdef KMOD_DISABLED
+void mlx4_en_update_pfc_stats_bitmap(struct mlx4_dev *dev,
+ struct mlx4_en_stats_bitmap *stats_bitmap,
+ u8 rx_ppp, u8 rx_pause,
+ u8 tx_ppp, u8 tx_pause)
+{
+ int last_i = NUM_MAIN_STATS + NUM_PORT_STATS;
+
+ if (!mlx4_is_slave(dev) &&
+ (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_FLOWSTATS_EN)) {
+ mutex_lock(&stats_bitmap->mutex);
+ bitmap_clear(stats_bitmap->bitmap, last_i, NUM_FLOW_STATS);
+
+ if (rx_ppp)
+ bitmap_set(stats_bitmap->bitmap, last_i,
+ NUM_FLOW_PRIORITY_STATS_RX);
+ last_i += NUM_FLOW_PRIORITY_STATS_RX;
+
+ if (rx_pause && !(rx_ppp))
+ bitmap_set(stats_bitmap->bitmap, last_i,
+ NUM_FLOW_STATS_RX);
+ last_i += NUM_FLOW_STATS_RX;
+
+ if (tx_ppp)
+ bitmap_set(stats_bitmap->bitmap, last_i,
+ NUM_FLOW_PRIORITY_STATS_TX);
+ last_i += NUM_FLOW_PRIORITY_STATS_TX;
+
+ if (tx_pause && !(tx_ppp))
+ bitmap_set(stats_bitmap->bitmap, last_i,
+ NUM_FLOW_STATS_TX);
+ last_i += NUM_FLOW_STATS_TX;
+
+ mutex_unlock(&stats_bitmap->mutex);
+ }
+}
+
+void mlx4_en_set_stats_bitmap(struct mlx4_dev *dev,
+ struct mlx4_en_stats_bitmap *stats_bitmap,
+ u8 rx_ppp, u8 rx_pause,
+ u8 tx_ppp, u8 tx_pause)
+{
+ int last_i = 0;
+
+ mutex_init(&stats_bitmap->mutex);
+ bitmap_zero(stats_bitmap->bitmap, NUM_ALL_STATS);
+
+ if (mlx4_is_slave(dev)) {
+ bitmap_set(stats_bitmap->bitmap, last_i +
+ MLX4_FIND_NETDEV_STAT(rx_packets), 1);
+ bitmap_set(stats_bitmap->bitmap, last_i +
+ MLX4_FIND_NETDEV_STAT(tx_packets), 1);
+ bitmap_set(stats_bitmap->bitmap, last_i +
+ MLX4_FIND_NETDEV_STAT(rx_bytes), 1);
+ bitmap_set(stats_bitmap->bitmap, last_i +
+ MLX4_FIND_NETDEV_STAT(tx_bytes), 1);
+ bitmap_set(stats_bitmap->bitmap, last_i +
+ MLX4_FIND_NETDEV_STAT(rx_dropped), 1);
+ bitmap_set(stats_bitmap->bitmap, last_i +
+ MLX4_FIND_NETDEV_STAT(tx_dropped), 1);
+ } else {
+ bitmap_set(stats_bitmap->bitmap, last_i, NUM_MAIN_STATS);
+ }
+ last_i += NUM_MAIN_STATS;
+
+ bitmap_set(stats_bitmap->bitmap, last_i, NUM_PORT_STATS);
+ last_i += NUM_PORT_STATS;
+
+ mlx4_en_update_pfc_stats_bitmap(dev, stats_bitmap,
+ rx_ppp, rx_pause,
+ tx_ppp, tx_pause);
+ last_i += NUM_FLOW_STATS;
+
+ if (mlx4_is_slave(dev))
+ bitmap_set(stats_bitmap->bitmap, last_i, NUM_VF_STATS);
+ last_i += NUM_VF_STATS;
+
+ if (!mlx4_is_slave(dev))
+ bitmap_set(stats_bitmap->bitmap, last_i, NUM_VPORT_STATS);
+ last_i += NUM_VPORT_STATS;
+
+ if (!mlx4_is_slave(dev))
+ bitmap_set(stats_bitmap->bitmap, last_i, NUM_PKT_STATS);
+}
+#endif
+
+#ifdef KMOD_REMOVED
+int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
+ struct mlx4_en_port_profile *prof)
+{
+ struct net_device *dev;
+ struct mlx4_en_priv *priv;
+ int i;
+ int err;
+ u64 mac_u64;
+#if (!defined(CONFIG_COMPAT_DISABLE_DCB) && defined(CONFIG_MLX4_EN_DCB))
+ u8 config = 0;
+#endif
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ dev = alloc_etherdev_mqs(sizeof(struct mlx4_en_priv),
+ MAX_TX_RINGS, MAX_RX_RINGS);
+#else
+ dev = alloc_etherdev_mq(sizeof(struct mlx4_en_priv), MAX_TX_RINGS);
+#endif
+ if (dev == NULL)
+ return -ENOMEM;
+
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ netif_set_real_num_tx_queues(dev, prof->tx_ring_num);
+#else
+ dev->real_num_tx_queues = prof->tx_ring_num;
+#endif
+ netif_set_real_num_rx_queues(dev, prof->rx_ring_num);
+
+ SET_NETDEV_DEV(dev, &mdev->dev->persist->pdev->dev);
+#ifdef HAVE_NET_DEVICE_DEV_PORT
+ dev->dev_port = port - 1;
+#else
+ dev->dev_id = port - 1;
+#endif
+
+ /*
+ * Initialize driver private data
+ */
+
+ priv = netdev_priv(dev);
+ memset(priv, 0, sizeof(struct mlx4_en_priv));
+ priv->counter_index = 0xff;
+ spin_lock_init(&priv->stats_lock);
+ INIT_WORK(&priv->rx_mode_task, mlx4_en_do_set_rx_mode);
+ INIT_WORK(&priv->watchdog_task, mlx4_en_restart);
+ INIT_WORK(&priv->linkstate_task, mlx4_en_linkstate);
+ INIT_DELAYED_WORK(&priv->stats_task, mlx4_en_do_get_stats);
+ INIT_DELAYED_WORK(&priv->service_task, mlx4_en_service_task);
+#ifdef HAVE_VXLAN_ENABLED
+ INIT_WORK(&priv->vxlan_add_task, mlx4_en_add_vxlan_offloads);
+ INIT_WORK(&priv->vxlan_del_task, mlx4_en_del_vxlan_offloads);
+#endif
+#ifdef CONFIG_RFS_ACCEL
+ INIT_LIST_HEAD(&priv->filters);
+ spin_lock_init(&priv->filters_lock);
+#endif
+
+ priv->dev = dev;
+ priv->mdev = mdev;
+ priv->ddev = &mdev->pdev->dev;
+ priv->prof = prof;
+ priv->port = port;
+ priv->port_up = false;
+ priv->flags = prof->flags;
+ priv->pflags = MLX4_EN_PRIV_FLAGS_BLUEFLAME;
+ priv->ctrl_flags = cpu_to_be32(MLX4_WQE_CTRL_CQ_UPDATE |
+ MLX4_WQE_CTRL_SOLICITED);
+ priv->num_tx_rings_p_up = mdev->profile.num_tx_rings_p_up;
+ priv->tx_ring_num = prof->tx_ring_num;
+ priv->tx_work_limit = MLX4_EN_DEFAULT_TX_WORK;
+#ifdef HAVE_NETDEV_RSS_KEY_FILL
+ netdev_rss_key_fill(priv->rss_key, sizeof(priv->rss_key));
+#endif
+
+ priv->tx_ring = kzalloc(sizeof(struct mlx4_en_tx_ring *) * MAX_TX_RINGS,
+ GFP_KERNEL);
+ if (!priv->tx_ring) {
+ err = -ENOMEM;
+ goto out;
+ }
+ priv->tx_cq = kzalloc(sizeof(struct mlx4_en_cq *) * MAX_TX_RINGS,
+ GFP_KERNEL);
+ if (!priv->tx_cq) {
+ err = -ENOMEM;
+ goto out;
+ }
+ priv->rx_ring_num = prof->rx_ring_num;
+ priv->cqe_factor = (mdev->dev->caps.cqe_size == 64) ? 1 : 0;
+ priv->cqe_size = mdev->dev->caps.cqe_size;
+ priv->mac_index = -1;
+ priv->msg_enable = MLX4_EN_MSG_LEVEL;
+#ifndef CONFIG_COMPAT_DISABLE_DCB
+#ifdef CONFIG_MLX4_EN_DCB
+ if (!mlx4_is_slave(priv->mdev->dev)) {
+ u8 prio;
+
+ for (prio = 0; prio < IEEE_8021QAZ_MAX_TCS; ++prio) {
+ priv->ets.prio_tc[prio] = prio;
+ priv->ets.tc_tsa[prio] = IEEE_8021QAZ_TSA_VENDOR;
+ }
+
+ if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ETS_CFG) {
+ dev->dcbnl_ops = &mlx4_en_dcbnl_ops;
+ } else {
+ en_info(priv, "enabling only PFC DCB ops\n");
+ dev->dcbnl_ops = &mlx4_en_dcbnl_pfc_ops;
+ }
+ /* Query for default disable_32_14_4_e value for qcn */
+ err = mlx4_disable_32_14_4_e_read(priv->mdev->dev, &config, priv->port);
+ if (!err) {
+ if (config)
+ priv->pflags |= MLX4_EN_PRIV_FLAGS_DISABLE_32_14_4_E;
+ else
+ priv->pflags &= ~MLX4_EN_PRIV_FLAGS_DISABLE_32_14_4_E;
+ } else {
+ en_err(priv, "Failed to query disable_32_14_4_e field for QCN\n");
+ }
+ }
+#endif
+#endif
+
+ for (i = 0; i < MLX4_EN_MAC_HASH_SIZE; ++i)
+ INIT_HLIST_HEAD(&priv->mac_hash[i]);
+
+ /* Query for default mac and max mtu */
+ priv->max_mtu = mdev->dev->caps.eth_mtu_cap[priv->port];
+
+ if (mdev->dev->caps.rx_checksum_flags_port[priv->port] &
+ MLX4_RX_CSUM_MODE_VAL_NON_TCP_UDP)
+ priv->flags |= MLX4_EN_FLAG_RX_CSUM_NON_TCP_UDP;
+
+ /* Set default MAC */
+ dev->addr_len = ETH_ALEN;
+ mlx4_en_u64_to_mac(dev->dev_addr, mdev->dev->caps.def_mac[priv->port]);
+ if (!is_valid_ether_addr(dev->dev_addr)) {
+ if (mlx4_is_slave(priv->mdev->dev)) {
+ eth_hw_addr_random(dev);
+ en_warn(priv, "Assigned random MAC address %pM\n", dev->dev_addr);
+ mac_u64 = mlx4_mac_to_u64(dev->dev_addr);
+ mdev->dev->caps.def_mac[priv->port] = mac_u64;
+ } else {
+ en_err(priv, "Port: %d, invalid mac burned: %pM, quitting\n",
+ priv->port, dev->dev_addr);
+ err = -EINVAL;
+ goto out;
+ }
+ }
+
+ memcpy(priv->current_mac, dev->dev_addr, sizeof(priv->current_mac));
+
+ priv->stride = prof->inline_scatter_thold >= MIN_INLINE_SCATTER ?
+ prof->inline_scatter_thold :
+ roundup_pow_of_two(sizeof(struct mlx4_en_rx_desc) +
+ DS_SIZE * MLX4_EN_MAX_RX_FRAGS);
+
+ err = mlx4_en_alloc_resources(priv);
+ if (err)
+ goto out;
+
+ /* Initialize time stamping config */
+ priv->hwtstamp_config.flags = 0;
+ priv->hwtstamp_config.tx_type = HWTSTAMP_TX_OFF;
+ priv->hwtstamp_config.rx_filter = HWTSTAMP_FILTER_NONE;
+
+ /* Allocate page for receive rings */
+ err = mlx4_alloc_hwq_res(mdev->dev, &priv->res,
+ MLX4_EN_PAGE_SIZE, MLX4_EN_PAGE_SIZE);
+ if (err) {
+ en_err(priv, "Failed to allocate page for rx qps\n");
+ goto out;
+ }
+ priv->allocated = 1;
+
+ /*
+ * Initialize netdev entry points
+ */
+ if (mlx4_is_master(priv->mdev->dev))
+ dev->netdev_ops = &mlx4_netdev_ops_master;
+ else
+ dev->netdev_ops = &mlx4_netdev_ops;
+
+#ifdef HAVE_NET_DEVICE_OPS_EXT
+ if (mlx4_is_master(priv->mdev->dev))
+ set_netdev_ops_ext(dev, &mlx4_netdev_ops_master_ext);
+ else
+ set_netdev_ops_ext(dev, &mlx4_netdev_ops_ext);
+#endif
+
+ dev->watchdog_timeo = MLX4_EN_WATCHDOG_TIMEOUT;
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ netif_set_real_num_tx_queues(dev, priv->tx_ring_num);
+#else
+ dev->real_num_tx_queues = priv->tx_ring_num;
+#endif
+ netif_set_real_num_rx_queues(dev, priv->rx_ring_num);
+
+#ifdef HAVE_ETHTOOL_OPS_EXT
+ SET_ETHTOOL_OPS(dev, &mlx4_en_ethtool_ops);
+ set_ethtool_ops_ext(dev, &mlx4_en_ethtool_ops_ext);
+#else
+ dev->ethtool_ops = &mlx4_en_ethtool_ops;
+#endif
+
+#ifdef CONFIG_NET_RX_BUSY_POLL
+#ifdef HAVE_NETDEV_EXTENDED_NDO_BUSY_POLL
+ netdev_extended(dev)->ndo_busy_poll = mlx4_en_low_latency_recv;
+#endif
+#endif
+
+ /*
+ * Set driver features
+ */
+#ifdef HAVE_NETDEV_HW_FEATURES
+ dev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM;
+ if (mdev->LSO_support)
+ dev->hw_features |= NETIF_F_TSO | NETIF_F_TSO6;
+
+ dev->vlan_features = dev->hw_features;
+
+#ifdef HAVE_NETIF_F_RXHASH
+ dev->hw_features |= NETIF_F_RXCSUM | NETIF_F_RXHASH;
+#else
+ dev->hw_features |= NETIF_F_RXCSUM;
+#endif
+ dev->features = dev->hw_features | NETIF_F_HIGHDMA |
+ NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX |
+ NETIF_F_HW_VLAN_CTAG_FILTER;
+ dev->hw_features |= NETIF_F_LOOPBACK |
+ NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX;
+
+#ifdef HAVE_NETIF_F_RXFCS
+ if (mdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_FCS_KEEP)
+ dev->hw_features |= NETIF_F_RXFCS;
+#endif
+
+#ifdef HAVE_NETIF_F_RXALL
+ if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_IGNORE_FCS)
+ dev->hw_features |= NETIF_F_RXALL;
+#endif
+
+ if (mdev->dev->caps.steering_mode ==
+ MLX4_STEERING_MODE_DEVICE_MANAGED &&
+ mdev->dev->caps.dmfs_high_steer_mode != MLX4_STEERING_DMFS_A0_STATIC)
+ dev->hw_features |= NETIF_F_NTUPLE;
+
+#ifndef NETIF_F_SOFT_FEATURES
+ dev->hw_features |= NETIF_F_GSO | NETIF_F_GRO;
+ dev->features |= NETIF_F_GSO | NETIF_F_GRO;
+#endif
+#ifdef CONFIG_COMPAT_LRO_ENABLED
+ dev->hw_features |= NETIF_F_LRO;
+ dev->features |= NETIF_F_LRO;
+#endif
+#else
+ dev->features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM;
+
+ if (mdev->LSO_support)
+ dev->features |= NETIF_F_TSO | NETIF_F_TSO6;
+
+ dev->vlan_features = dev->features;
+
+#ifdef HAVE_NETIF_F_RXHASH
+ dev->features |= NETIF_F_RXCSUM | NETIF_F_RXHASH;
+#else
+ dev->features |= NETIF_F_RXCSUM;
+#endif
+
+#ifdef CONFIG_COMPAT_LRO_ENABLED
+ dev->features |= NETIF_F_LRO;
+#endif
+#ifdef HAVE_SET_NETDEV_HW_FEATURES
+ set_netdev_hw_features(dev, dev->features);
+#endif
+ dev->features = dev->features | NETIF_F_HIGHDMA |
+ NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX |
+ NETIF_F_HW_VLAN_CTAG_FILTER;
+#ifdef HAVE_NETDEV_EXTENDED_HW_FEATURES
+ netdev_extended(dev)->hw_features |= NETIF_F_LOOPBACK;
+ netdev_extended(dev)->hw_features |= NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_HW_VLAN_CTAG_TX;
+#endif
+
+ if (mdev->dev->caps.steering_mode ==
+ MLX4_STEERING_MODE_DEVICE_MANAGED)
+#ifdef HAVE_NETDEV_EXTENDED_HW_FEATURES
+ netdev_extended(dev)->hw_features |= NETIF_F_NTUPLE;
+#else
+ dev->features |= NETIF_F_NTUPLE;
+#endif
+
+#ifndef NETIF_F_SOFT_FEATURES
+ dev->features |= NETIF_F_GSO | NETIF_F_GRO;
+#endif
+#endif
+
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3,2,0))
+
+ if (mdev->dev->caps.steering_mode != MLX4_STEERING_MODE_A0)
+ dev->priv_flags |= IFF_UNICAST_FLT;
+#endif
+
+#ifdef HAVE_ETH_SS_RSS_HASH_FUNCS
+ /* Setting a default hash function value */
+ if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_RSS_TOP) {
+ priv->rss_hash_fn = ETH_RSS_HASH_TOP;
+ } else if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_RSS_XOR) {
+ priv->rss_hash_fn = ETH_RSS_HASH_XOR;
+ } else {
+ en_warn(priv,
+ "No RSS hash capabilities exposed, using Toeplitz\n");
+ priv->rss_hash_fn = ETH_RSS_HASH_TOP;
+ }
+#endif
+
+ mdev->pndev[port] = dev;
+ mdev->upper[port] = NULL;
+
+ netif_carrier_off(dev);
+ mlx4_en_set_default_moderation(priv);
+
+ en_warn(priv, "Using %d TX rings\n", prof->tx_ring_num);
+ en_warn(priv, "Using %d RX rings\n", prof->rx_ring_num);
+
+ mlx4_en_update_loopback_state(priv->dev, priv->dev->features);
+
+ /* Configure port */
+ mlx4_en_calc_rx_buf(dev);
+ err = mlx4_SET_PORT_general(mdev->dev, priv->port,
+ priv->rx_skb_size + ETH_FCS_LEN,
+ prof->tx_pause, prof->tx_ppp,
+ prof->rx_pause, prof->rx_ppp);
+ if (err) {
+ en_err(priv, "Failed setting port general configurations for port %d, with error %d\n",
+ priv->port, err);
+ goto out;
+ }
+
+ if (mdev->dev->caps.tunnel_offload_mode == MLX4_TUNNEL_OFFLOAD_MODE_VXLAN) {
+ err = mlx4_SET_PORT_VXLAN(mdev->dev, priv->port, VXLAN_STEER_BY_OUTER_MAC, 1);
+ if (err) {
+ en_err(priv, "Failed setting port L2 tunnel configuration, err %d\n",
+ err);
+ goto out;
+ }
+ }
+
+ /* Init port */
+ en_warn(priv, "Initializing port\n");
+ err = mlx4_INIT_PORT(mdev->dev, priv->port);
+ if (err) {
+ en_err(priv, "Failed Initializing port\n");
+ goto out;
+ }
+ queue_delayed_work(mdev->workqueue, &priv->stats_task, STATS_DELAY);
+
+ if (mdev->dev->caps.steering_mode ==
+ MLX4_STEERING_MODE_DEVICE_MANAGED) {
+ priv->pflags |= MLX4_EN_PRIV_FLAGS_FS_EN_L2;
+ if (mdev->dev->caps.dmfs_high_steer_mode != MLX4_STEERING_DMFS_A0_STATIC)
+ priv->pflags |= MLX4_EN_PRIV_FLAGS_FS_EN_IPV4 |
+ MLX4_EN_PRIV_FLAGS_FS_EN_TCP |
+ MLX4_EN_PRIV_FLAGS_FS_EN_UDP;
+ }
+ if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_TS)
+ queue_delayed_work(mdev->workqueue, &priv->service_task,
+ SERVICE_TASK_DELAY);
+
+ mlx4_en_set_stats_bitmap(mdev->dev, &priv->stats_bitmap,
+ mdev->profile.prof[priv->port].rx_ppp,
+ mdev->profile.prof[priv->port].rx_pause,
+ mdev->profile.prof[priv->port].tx_ppp,
+ mdev->profile.prof[priv->port].tx_pause);
+
+ err = register_netdev(dev);
+ if (err) {
+ en_err(priv, "Netdev registration failed for port %d\n", port);
+ goto out;
+ }
+
+ priv->registered = 1;
+
+ if (mlx4_is_master(priv->mdev->dev)) {
+ for (i = 0; i < priv->mdev->dev->persist->num_vfs; i++) {
+ priv->vf_ports[i] = kzalloc(sizeof(*priv->vf_ports[i]), GFP_KERNEL);
+ if (!priv->vf_ports[i]) {
+ err = -ENOMEM;
+ goto out;
+ }
+ priv->vf_ports[i]->dev = priv->mdev->dev;
+ priv->vf_ports[i]->port_num = port & 0xff;
+ priv->vf_ports[i]->vport_num = i & 0xff;
+ err = kobject_init_and_add(&priv->vf_ports[i]->kobj_vf,
+ &en_port_type,
+ &dev->dev.kobj,
+ "vf%d", i);
+ if (err) {
+ kfree(priv->vf_ports[i]);
+ priv->vf_ports[i] = NULL;
+ goto out;
+ }
+ err = kobject_init_and_add(&priv->vf_ports[i]->kobj_stats,
+ &en_port_stats,
+ &priv->vf_ports[i]->kobj_vf,
+ "statistics");
+ if (err) {
+ kobject_put(&priv->vf_ports[i]->kobj_vf);
+ kfree(priv->vf_ports[i]);
+ priv->vf_ports[i] = NULL;
+ goto out;
+ }
+ }
+ }
+
+#ifdef CONFIG_COMPAT_EN_SYSFS
+ err = mlx4_en_sysfs_create(dev);
+ if (err)
+ goto out;
+ priv->sysfs_group_initialized = 1;
+#endif
+
+ return 0;
+
+out:
+ mlx4_en_destroy_netdev(dev);
+ return err;
+}
+
+int mlx4_en_reset_config(struct net_device *dev,
+ struct hwtstamp_config ts_config,
+ netdev_features_t features)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int port_up = 0;
+ int err = 0;
+
+ if (priv->hwtstamp_config.tx_type == ts_config.tx_type &&
+ priv->hwtstamp_config.rx_filter == ts_config.rx_filter &&
+ !DEV_FEATURE_CHANGED(dev, features, NETIF_F_HW_VLAN_CTAG_RX)
+#ifdef HAVE_NETIF_F_RXFCS
+ && !DEV_FEATURE_CHANGED(dev, features, NETIF_F_RXFCS)
+#endif
+ )
+ return 0; /* Nothing to change */
+
+ if (DEV_FEATURE_CHANGED(dev, features, NETIF_F_HW_VLAN_CTAG_RX) &&
+ (features & NETIF_F_HW_VLAN_CTAG_RX) &&
+ (priv->hwtstamp_config.rx_filter != HWTSTAMP_FILTER_NONE)) {
+ en_warn(priv, "Can't turn ON rx vlan offload while time-stamping rx filter is ON\n");
+ return -EINVAL;
+ }
+
+ mutex_lock(&mdev->state_lock);
+ if (priv->port_up) {
+ port_up = 1;
+ mlx4_en_stop_port(dev, 1);
+ }
+
+ mlx4_en_free_resources(priv);
+
+ en_warn(priv, "Changing device configuration rx filter(%x) rx vlan(%x)\n",
+ ts_config.rx_filter, !!(features & NETIF_F_HW_VLAN_CTAG_RX));
+
+ priv->hwtstamp_config.tx_type = ts_config.tx_type;
+ priv->hwtstamp_config.rx_filter = ts_config.rx_filter;
+
+ if (DEV_FEATURE_CHANGED(dev, features, NETIF_F_HW_VLAN_CTAG_RX)) {
+ if (features & NETIF_F_HW_VLAN_CTAG_RX)
+ dev->features |= NETIF_F_HW_VLAN_CTAG_RX;
+ else
+ dev->features &= ~NETIF_F_HW_VLAN_CTAG_RX;
+#ifdef HAVE_WANTED_FEATURES
+ } else if (ts_config.rx_filter == HWTSTAMP_FILTER_NONE) {
+ /* RX time-stamping is OFF, update the RX vlan offload
+ * to the latest wanted state
+ */
+ if (dev->wanted_features & NETIF_F_HW_VLAN_CTAG_RX)
+ dev->features |= NETIF_F_HW_VLAN_CTAG_RX;
+ else
+ dev->features &= ~NETIF_F_HW_VLAN_CTAG_RX;
+#endif
+ }
+
+#ifdef HAVE_NETIF_F_RXFCS
+ if (DEV_FEATURE_CHANGED(dev, features, NETIF_F_RXFCS)) {
+ if (features & NETIF_F_RXFCS)
+ dev->features |= NETIF_F_RXFCS;
+ else
+ dev->features &= ~NETIF_F_RXFCS;
+ }
+#endif
+
+ /* RX vlan offload and RX time-stamping can't co-exist!
+ * Regardless of the caller's choice,
+ * turn off RX vlan offload if time-stamping is ON
+ */
+ if (ts_config.rx_filter != HWTSTAMP_FILTER_NONE) {
+ if (dev->features & NETIF_F_HW_VLAN_CTAG_RX)
+ en_warn(priv, "Turning off RX vlan offload since RX time-stamping is ON\n");
+ dev->features &= ~NETIF_F_HW_VLAN_CTAG_RX;
+ }
+
+ err = mlx4_en_alloc_resources(priv);
+ if (err) {
+ en_err(priv, "Failed reallocating port resources\n");
+ goto out;
+ }
+ if (port_up) {
+ err = mlx4_en_start_port(dev);
+ if (err)
+ en_err(priv, "Failed starting port\n");
+ }
+
+out:
+ mutex_unlock(&mdev->state_lock);
+ netdev_features_change(dev);
+ return err;
+}
+#endif
+
+#endif
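
Reviewer note on mlx4_en_reset_config() above: the rule that RX VLAN
offload and RX time-stamping cannot both be enabled is easier to see
outside the kernel context. The following is a minimal standalone
sketch, not the driver code. FEAT_VLAN_RX and resolve_features() are
hypothetical names introduced here for illustration, and the sketch
leaves out the wanted_features branch, the early -EINVAL rejection and
all locking and resource reallocation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for NETIF_F_HW_VLAN_CTAG_RX (assumption). */
#define FEAT_VLAN_RX (1u << 0)

/*
 * Return the feature word that results from a reconfiguration request.
 * current:      features enabled before the request
 * requested:    features the caller asks for
 * rx_tstamp_on: true when an RX time-stamping filter is requested
 */
static inline uint32_t resolve_features(uint32_t current, uint32_t requested,
					bool rx_tstamp_on)
{
	uint32_t out = current;

	/* Apply the caller's choice only if the VLAN RX bit actually changed. */
	if ((current ^ requested) & FEAT_VLAN_RX)
		out = (out & ~FEAT_VLAN_RX) | (requested & FEAT_VLAN_RX);

	/* Time-stamping wins: RX VLAN offload is forced off. */
	if (rx_tstamp_on)
		out &= ~FEAT_VLAN_RX;

	return out;
}
```

With time-stamping on, the VLAN RX bit is cleared whatever the caller
asked for, which matches the last block of the function before the
resources are reallocated.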
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/en_port.c b/drivers/net/mlnx_uio/mlnx/mlx4/en_port.c
new file mode 100644
index 0000000..6627e6b
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/en_port.c
@@ -0,0 +1,493 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2007 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+
+
+
+#include "en_port.h"
+#include "mlx4_en.h"
+
+
+int mlx4_SET_VLAN_FLTR(struct mlx4_dev *dev, struct mlx4_en_priv *priv)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_set_vlan_fltr_mbox *filter;
+ int i;
+ int j;
+ int index = 0;
+ u32 entry;
+ int err = 0;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ filter = mailbox->buf;
+ for (i = VLAN_FLTR_SIZE - 1; i >= 0; i--) {
+ entry = 0;
+ for (j = 0; j < 32; j++)
+ if (test_bit(index++, priv->active_vlans))
+ entry |= 1U << j;
+ filter->entry[i] = cpu_to_be32(entry);
+ }
+ err = mlx4_cmd(dev, mailbox->dma, priv->port, 0, MLX4_CMD_SET_VLAN_FLTR,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_WRAPPED);
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+
+int mlx4_en_QUERY_PORT(struct mlx4_en_dev *mdev, u8 port)
+{
+ struct mlx4_en_query_port_context *qport_context;
+ struct mlx4_en_priv *priv = rtedev_priv(mdev->rte_pndev[port]);
+ struct mlx4_en_port_state *state = &priv->port_state;
+ struct mlx4_cmd_mailbox *mailbox;
+ int err;
+
+ mailbox = mlx4_alloc_cmd_mailbox(mdev->dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ err = mlx4_cmd_box(mdev->dev, 0, mailbox->dma, port, 0,
+ MLX4_CMD_QUERY_PORT, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_WRAPPED);
+ if (err)
+ goto out;
+ qport_context = mailbox->buf;
+
+ /* This command is always issued from ethtool context, which is
+ * already synchronized, so no locking is needed */
+ state->link_state = !!(qport_context->link_up & MLX4_EN_LINK_UP_MASK);
+ switch (qport_context->link_speed & MLX4_EN_SPEED_MASK) {
+ case MLX4_EN_100M_SPEED:
+ state->link_speed = SPEED_100;
+ break;
+ case MLX4_EN_1G_SPEED:
+ state->link_speed = SPEED_1000;
+ break;
+ case MLX4_EN_10G_SPEED_XAUI:
+ case MLX4_EN_10G_SPEED_XFI:
+ state->link_speed = SPEED_10000;
+ break;
+ case MLX4_EN_20G_SPEED:
+ state->link_speed = SPEED_20000;
+ break;
+ case MLX4_EN_40G_SPEED:
+ state->link_speed = SPEED_40000;
+ break;
+ case MLX4_EN_56G_SPEED:
+ state->link_speed = SPEED_56000;
+ break;
+ default:
+ state->link_speed = -1;
+ break;
+ }
+
+ state->transceiver = qport_context->transceiver;
+
+ state->flags = 0; /* Reset and recalculate the port flags */
+ state->flags |= (qport_context->link_up & MLX4_EN_ANC_MASK) ?
+ MLX4_EN_PORT_ANC : 0;
+ state->flags |= (qport_context->autoneg & MLX4_EN_AUTONEG_MASK) ?
+ MLX4_EN_PORT_ANE : 0;
+
+out:
+ mlx4_free_cmd_mailbox(mdev->dev, mailbox);
+ return err;
+}
+
+/* Each counter set is located in struct mlx4_en_stat_out_mbox
+ * with a const offset between its prio components.
+ * This function runs over a counter set and sums all of its prio components.
+ */
+static unsigned long en_stats_adder(__be64 *start, __be64 *next, int num)
+{
+ __be64 *curr = start;
+ unsigned long ret = 0;
+ int i;
+ int offset = next - start;
+
+ for (i = 0; i < num; i++) {
+ ret += be64_to_cpu(*curr);
+ curr += offset;
+ }
+
+ return ret;
+}
+
+#ifdef KMOD_REMOVED
+
+int mlx4_en_DUMP_ETH_STATS(struct mlx4_en_dev *mdev, u8 port, u8 reset)
+{
+ struct mlx4_en_vport_stats tmp_vport_stats;
+ struct mlx4_en_stat_out_mbox *mlx4_en_stats;
+ struct mlx4_en_stat_out_flow_control_mbox *flowstats;
+ struct mlx4_en_priv *priv = netdev_priv(mdev->pndev[port]);
+ struct net_device_stats *stats = &priv->stats;
+ struct mlx4_en_vport_stats *vport_stats = &priv->vport_stats;
+ struct mlx4_cmd_mailbox *mailbox;
+ u64 in_mod = reset << 8 | port;
+ int err;
+ int i, read_counters = 0;
+
+ mailbox = mlx4_alloc_cmd_mailbox(mdev->dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ err = mlx4_cmd_box(mdev->dev, 0, mailbox->dma, in_mod, 0,
+ MLX4_CMD_DUMP_ETH_STATS, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_NATIVE);
+ if (err)
+ goto out;
+
+ mlx4_en_stats = mailbox->buf;
+
+ spin_lock_bh(&priv->stats_lock);
+
+ stats->rx_packets = 0;
+ stats->rx_bytes = 0;
+ priv->port_stats.rx_chksum_good = 0;
+ priv->port_stats.rx_chksum_none = 0;
+ priv->port_stats.rx_chksum_complete = 0;
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ priv->port_stats.rx_chksum_good += priv->rx_ring[i]->csum_ok;
+ priv->port_stats.rx_chksum_none += priv->rx_ring[i]->csum_none;
+ priv->port_stats.rx_chksum_complete += priv->rx_ring[i]->csum_complete;
+ }
+ stats->tx_packets = 0;
+ stats->tx_bytes = 0;
+ priv->port_stats.tx_chksum_offload = 0;
+ priv->port_stats.queue_stopped = 0;
+ priv->port_stats.wake_queue = 0;
+ priv->port_stats.tso_packets = 0;
+ priv->port_stats.xmit_more = 0;
+
+ for (i = 0; i < priv->tx_ring_num; i++) {
+ const struct mlx4_en_tx_ring *ring = priv->tx_ring[i];
+
+ priv->port_stats.tx_chksum_offload += ring->tx_csum;
+ priv->port_stats.queue_stopped += ring->queue_stopped;
+ priv->port_stats.wake_queue += ring->wake_queue;
+ priv->port_stats.tso_packets += ring->tso_packets;
+ priv->port_stats.xmit_more += ring->xmit_more;
+ }
+
+ /* net device stats */
+ stats->rx_errors = be64_to_cpu(mlx4_en_stats->PCS) +
+ be32_to_cpu(mlx4_en_stats->RJBBR) +
+ be32_to_cpu(mlx4_en_stats->RCRC) +
+ be32_to_cpu(mlx4_en_stats->RRUNT) +
+ be64_to_cpu(mlx4_en_stats->RInRangeLengthErr) +
+ be64_to_cpu(mlx4_en_stats->ROutRangeLengthErr) +
+ be32_to_cpu(mlx4_en_stats->RSHORT) +
+ en_stats_adder(&mlx4_en_stats->RGIANT_prio_0,
+ &mlx4_en_stats->RGIANT_prio_1,
+ NUM_PRIORITIES);
+ stats->tx_errors = en_stats_adder(&mlx4_en_stats->TGIANT_prio_0,
+ &mlx4_en_stats->TGIANT_prio_1,
+ NUM_PRIORITIES);
+ stats->multicast = en_stats_adder(&mlx4_en_stats->MCAST_prio_0,
+ &mlx4_en_stats->MCAST_prio_1,
+ NUM_PRIORITIES);
+ stats->collisions = 0;
+ stats->rx_dropped = be32_to_cpu(mlx4_en_stats->RDROP);
+ stats->rx_length_errors = be32_to_cpu(mlx4_en_stats->RdropLength);
+ stats->rx_over_errors = be32_to_cpu(mlx4_en_stats->RdropOvflw);
+ stats->rx_crc_errors = be32_to_cpu(mlx4_en_stats->RCRC);
+ stats->rx_frame_errors = 0;
+ stats->rx_fifo_errors = be32_to_cpu(mlx4_en_stats->RdropOvflw);
+ stats->rx_missed_errors = be32_to_cpu(mlx4_en_stats->RdropOvflw);
+ stats->tx_aborted_errors = 0;
+ stats->tx_carrier_errors = 0;
+ stats->tx_fifo_errors = 0;
+ stats->tx_heartbeat_errors = 0;
+ stats->tx_window_errors = 0;
+ stats->tx_dropped = be32_to_cpu(mlx4_en_stats->TDROP);
+
+ /* RX stats */
+ stats->rx_packets = en_stats_adder(&mlx4_en_stats->RTOT_prio_0,
+ &mlx4_en_stats->RTOT_prio_1,
+ NUM_PRIORITIES);
+ stats->rx_bytes = en_stats_adder(&mlx4_en_stats->ROCT_prio_0,
+ &mlx4_en_stats->ROCT_prio_1,
+ NUM_PRIORITIES);
+ priv->pkstats.rx_multicast_packets = stats->multicast;
+ priv->pkstats.rx_broadcast_packets =
+ en_stats_adder(&mlx4_en_stats->RBCAST_prio_0,
+ &mlx4_en_stats->RBCAST_prio_1,
+ NUM_PRIORITIES);
+ priv->pkstats.rx_jabbers = be32_to_cpu(mlx4_en_stats->RJBBR);
+ priv->pkstats.rx_in_range_length_error =
+ be64_to_cpu(mlx4_en_stats->RInRangeLengthErr);
+ priv->pkstats.rx_out_range_length_error =
+ be64_to_cpu(mlx4_en_stats->ROutRangeLengthErr);
+
+ /* Tx stats */
+ stats->tx_packets = en_stats_adder(&mlx4_en_stats->TTOT_prio_0,
+ &mlx4_en_stats->TTOT_prio_1,
+ NUM_PRIORITIES);
+ stats->tx_bytes = en_stats_adder(&mlx4_en_stats->TOCT_prio_0,
+ &mlx4_en_stats->TOCT_prio_1,
+ NUM_PRIORITIES);
+ priv->pkstats.tx_multicast_packets =
+ en_stats_adder(&mlx4_en_stats->TMCAST_prio_0,
+ &mlx4_en_stats->TMCAST_prio_1,
+ NUM_PRIORITIES);
+ priv->pkstats.tx_broadcast_packets =
+ en_stats_adder(&mlx4_en_stats->TBCAST_prio_0,
+ &mlx4_en_stats->TBCAST_prio_1,
+ NUM_PRIORITIES);
+
+ priv->pkstats.rx_prio[0][0] = be64_to_cpu(mlx4_en_stats->RTOT_prio_0);
+ priv->pkstats.rx_prio[0][1] = be64_to_cpu(mlx4_en_stats->ROCT_prio_0);
+ priv->pkstats.rx_prio[1][0] = be64_to_cpu(mlx4_en_stats->RTOT_prio_1);
+ priv->pkstats.rx_prio[1][1] = be64_to_cpu(mlx4_en_stats->ROCT_prio_1);
+ priv->pkstats.rx_prio[2][0] = be64_to_cpu(mlx4_en_stats->RTOT_prio_2);
+ priv->pkstats.rx_prio[2][1] = be64_to_cpu(mlx4_en_stats->ROCT_prio_2);
+ priv->pkstats.rx_prio[3][0] = be64_to_cpu(mlx4_en_stats->RTOT_prio_3);
+ priv->pkstats.rx_prio[3][1] = be64_to_cpu(mlx4_en_stats->ROCT_prio_3);
+ priv->pkstats.rx_prio[4][0] = be64_to_cpu(mlx4_en_stats->RTOT_prio_4);
+ priv->pkstats.rx_prio[4][1] = be64_to_cpu(mlx4_en_stats->ROCT_prio_4);
+ priv->pkstats.rx_prio[5][0] = be64_to_cpu(mlx4_en_stats->RTOT_prio_5);
+ priv->pkstats.rx_prio[5][1] = be64_to_cpu(mlx4_en_stats->ROCT_prio_5);
+ priv->pkstats.rx_prio[6][0] = be64_to_cpu(mlx4_en_stats->RTOT_prio_6);
+ priv->pkstats.rx_prio[6][1] = be64_to_cpu(mlx4_en_stats->ROCT_prio_6);
+ priv->pkstats.rx_prio[7][0] = be64_to_cpu(mlx4_en_stats->RTOT_prio_7);
+ priv->pkstats.rx_prio[7][1] = be64_to_cpu(mlx4_en_stats->ROCT_prio_7);
+ priv->pkstats.rx_prio[8][0] = be64_to_cpu(mlx4_en_stats->RTOT_novlan);
+ priv->pkstats.rx_prio[8][1] = be64_to_cpu(mlx4_en_stats->ROCT_novlan);
+ priv->pkstats.tx_prio[0][0] = be64_to_cpu(mlx4_en_stats->TTOT_prio_0);
+ priv->pkstats.tx_prio[0][1] = be64_to_cpu(mlx4_en_stats->TOCT_prio_0);
+ priv->pkstats.tx_prio[1][0] = be64_to_cpu(mlx4_en_stats->TTOT_prio_1);
+ priv->pkstats.tx_prio[1][1] = be64_to_cpu(mlx4_en_stats->TOCT_prio_1);
+ priv->pkstats.tx_prio[2][0] = be64_to_cpu(mlx4_en_stats->TTOT_prio_2);
+ priv->pkstats.tx_prio[2][1] = be64_to_cpu(mlx4_en_stats->TOCT_prio_2);
+ priv->pkstats.tx_prio[3][0] = be64_to_cpu(mlx4_en_stats->TTOT_prio_3);
+ priv->pkstats.tx_prio[3][1] = be64_to_cpu(mlx4_en_stats->TOCT_prio_3);
+ priv->pkstats.tx_prio[4][0] = be64_to_cpu(mlx4_en_stats->TTOT_prio_4);
+ priv->pkstats.tx_prio[4][1] = be64_to_cpu(mlx4_en_stats->TOCT_prio_4);
+ priv->pkstats.tx_prio[5][0] = be64_to_cpu(mlx4_en_stats->TTOT_prio_5);
+ priv->pkstats.tx_prio[5][1] = be64_to_cpu(mlx4_en_stats->TOCT_prio_5);
+ priv->pkstats.tx_prio[6][0] = be64_to_cpu(mlx4_en_stats->TTOT_prio_6);
+ priv->pkstats.tx_prio[6][1] = be64_to_cpu(mlx4_en_stats->TOCT_prio_6);
+ priv->pkstats.tx_prio[7][0] = be64_to_cpu(mlx4_en_stats->TTOT_prio_7);
+ priv->pkstats.tx_prio[7][1] = be64_to_cpu(mlx4_en_stats->TOCT_prio_7);
+ priv->pkstats.tx_prio[8][0] = be64_to_cpu(mlx4_en_stats->TTOT_novlan);
+ priv->pkstats.tx_prio[8][1] = be64_to_cpu(mlx4_en_stats->TOCT_novlan);
+
+ spin_unlock_bh(&priv->stats_lock);
+
+ memset(&tmp_vport_stats, 0, sizeof(tmp_vport_stats));
+ err = mlx4_get_vport_ethtool_stats(mdev->dev, port,
+ &tmp_vport_stats, reset,
+ &read_counters);
+ spin_lock_bh(&priv->stats_lock);
+ if (!err && read_counters) {
+ /* ethtool stats format */
+ vport_stats->rx_unicast_packets = tmp_vport_stats.rx_unicast_packets;
+ vport_stats->rx_unicast_bytes = tmp_vport_stats.rx_unicast_bytes;
+ vport_stats->rx_multicast_packets = tmp_vport_stats.rx_multicast_packets;
+ vport_stats->rx_multicast_bytes = tmp_vport_stats.rx_multicast_bytes;
+ vport_stats->rx_broadcast_packets = tmp_vport_stats.rx_broadcast_packets;
+ vport_stats->rx_broadcast_bytes = tmp_vport_stats.rx_broadcast_bytes;
+ vport_stats->rx_dropped = tmp_vport_stats.rx_dropped;
+ vport_stats->rx_filtered = tmp_vport_stats.rx_filtered;
+ vport_stats->tx_unicast_packets = tmp_vport_stats.tx_unicast_packets;
+ vport_stats->tx_unicast_bytes = tmp_vport_stats.tx_unicast_bytes;
+ vport_stats->tx_multicast_packets = tmp_vport_stats.tx_multicast_packets;
+ vport_stats->tx_multicast_bytes = tmp_vport_stats.tx_multicast_bytes;
+ vport_stats->tx_broadcast_packets = tmp_vport_stats.tx_broadcast_packets;
+ vport_stats->tx_broadcast_bytes = tmp_vport_stats.tx_broadcast_bytes;
+ vport_stats->tx_dropped = tmp_vport_stats.tx_dropped;
+ }
+
+ if (mlx4_is_mfunc(mdev->dev) && !err && read_counters) {
+ /* netdevice stats format */
+ stats->rx_packets = tmp_vport_stats.rx_unicast_packets +
+ tmp_vport_stats.rx_broadcast_packets +
+ tmp_vport_stats.rx_multicast_packets;
+ stats->tx_packets = tmp_vport_stats.tx_unicast_packets +
+ tmp_vport_stats.tx_broadcast_packets +
+ tmp_vport_stats.tx_multicast_packets;
+ stats->rx_bytes = tmp_vport_stats.rx_unicast_bytes +
+ tmp_vport_stats.rx_broadcast_bytes +
+ tmp_vport_stats.rx_multicast_bytes;
+ stats->tx_bytes = tmp_vport_stats.tx_unicast_bytes +
+ tmp_vport_stats.tx_broadcast_bytes +
+ tmp_vport_stats.tx_multicast_bytes;
+ /* PF netdev stats behave like VF stats, so no rx_errors. */
+ stats->rx_errors = 0;
+ stats->rx_dropped = tmp_vport_stats.rx_dropped;
+ stats->tx_dropped = tmp_vport_stats.tx_dropped;
+ stats->multicast = tmp_vport_stats.rx_multicast_packets;
+ }
+
+ spin_unlock_bh(&priv->stats_lock);
+
+ /* 0xffs indicates invalid value */
+ memset(mailbox->buf, 0xff, sizeof(*flowstats) * MLX4_NUM_PRIORITIES);
+
+ if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_FLOWSTATS_EN) {
+ memset(mailbox->buf, 0,
+ sizeof(*flowstats) * MLX4_NUM_PRIORITIES);
+ err = mlx4_cmd_box(mdev->dev, 0, mailbox->dma,
+ in_mod | MLX4_DUMP_ETH_STATS_FLOW_CONTROL,
+ 0, MLX4_CMD_DUMP_ETH_STATS,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_NATIVE);
+ if (err)
+ goto out;
+ }
+
+ flowstats = mailbox->buf;
+
+ spin_lock_bh(&priv->stats_lock);
+
+ for (i = 0; i < MLX4_NUM_PRIORITIES; i++) {
+ priv->rx_priority_flowstats[i].rx_pause =
+ be64_to_cpu(flowstats[i].rx_pause);
+ priv->rx_priority_flowstats[i].rx_pause_duration =
+ be64_to_cpu(flowstats[i].rx_pause_duration);
+ priv->rx_priority_flowstats[i].rx_pause_transition =
+ be64_to_cpu(flowstats[i].rx_pause_transition);
+ priv->tx_priority_flowstats[i].tx_pause =
+ be64_to_cpu(flowstats[i].tx_pause);
+ priv->tx_priority_flowstats[i].tx_pause_duration =
+ be64_to_cpu(flowstats[i].tx_pause_duration);
+ priv->tx_priority_flowstats[i].tx_pause_transition =
+ be64_to_cpu(flowstats[i].tx_pause_transition);
+ }
+
+ /* if pfc is not in use, all priorities counters have the same value */
+ priv->rx_flowstats.rx_pause =
+ be64_to_cpu(flowstats[0].rx_pause);
+ priv->rx_flowstats.rx_pause_duration =
+ be64_to_cpu(flowstats[0].rx_pause_duration);
+ priv->rx_flowstats.rx_pause_transition =
+ be64_to_cpu(flowstats[0].rx_pause_transition);
+ priv->tx_flowstats.tx_pause =
+ be64_to_cpu(flowstats[0].tx_pause);
+ priv->tx_flowstats.tx_pause_duration =
+ be64_to_cpu(flowstats[0].tx_pause_duration);
+ priv->tx_flowstats.tx_pause_transition =
+ be64_to_cpu(flowstats[0].tx_pause_transition);
+
+ spin_unlock_bh(&priv->stats_lock);
+
+out:
+ mlx4_free_cmd_mailbox(mdev->dev, mailbox);
+ return err;
+}
+
+int mlx4_en_get_vport_stats(struct mlx4_en_dev *mdev, u8 port)
+{
+ struct mlx4_en_priv *priv = netdev_priv(mdev->pndev[port]);
+ struct mlx4_en_vport_stats tmp_vport_stats;
+ struct mlx4_en_vf_stats *vf_stats = &priv->vf_stats;
+ int err, i, read_counters = 0;
+
+ spin_lock_bh(&priv->stats_lock);
+
+ priv->stats.rx_packets = 0;
+ priv->stats.rx_bytes = 0;
+ priv->port_stats.rx_chksum_good = 0;
+ priv->port_stats.rx_chksum_none = 0;
+ priv->port_stats.rx_chksum_complete = 0;
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ priv->stats.rx_packets += priv->rx_ring[i]->packets;
+ priv->stats.rx_bytes += priv->rx_ring[i]->bytes;
+ priv->port_stats.rx_chksum_good += priv->rx_ring[i]->csum_ok;
+ priv->port_stats.rx_chksum_none += priv->rx_ring[i]->csum_none;
+ priv->port_stats.rx_chksum_complete += priv->rx_ring[i]->csum_complete;
+ }
+ priv->stats.tx_packets = 0;
+ priv->stats.tx_bytes = 0;
+ priv->port_stats.tx_chksum_offload = 0;
+ priv->port_stats.queue_stopped = 0;
+ priv->port_stats.wake_queue = 0;
+ priv->port_stats.tso_packets = 0;
+ priv->port_stats.xmit_more = 0;
+
+ for (i = 0; i < priv->tx_ring_num; i++) {
+ const struct mlx4_en_tx_ring *ring = priv->tx_ring[i];
+
+ priv->stats.tx_packets += ring->packets;
+ priv->stats.tx_bytes += ring->bytes;
+ priv->port_stats.tx_chksum_offload += ring->tx_csum;
+ priv->port_stats.queue_stopped += ring->queue_stopped;
+ priv->port_stats.wake_queue += ring->wake_queue;
+ priv->port_stats.tso_packets += ring->tso_packets;
+ priv->port_stats.xmit_more += ring->xmit_more;
+ }
+
+ spin_unlock_bh(&priv->stats_lock);
+
+ memset(&tmp_vport_stats, 0, sizeof(tmp_vport_stats));
+
+ err = mlx4_get_vport_ethtool_stats(mdev->dev, port, &tmp_vport_stats, 0, &read_counters);
+ if (!err && read_counters) {
+ spin_lock_bh(&priv->stats_lock);
+ vf_stats->rx_multicast_packets = tmp_vport_stats.rx_multicast_packets;
+ vf_stats->rx_broadcast_packets = tmp_vport_stats.rx_broadcast_packets;
+ vf_stats->rx_filtered = tmp_vport_stats.rx_filtered;
+ vf_stats->tx_multicast_packets = tmp_vport_stats.tx_multicast_packets;
+ vf_stats->tx_broadcast_packets = tmp_vport_stats.tx_broadcast_packets;
+ vf_stats->tx_dropped = tmp_vport_stats.tx_dropped;
+ priv->stats.rx_packets = tmp_vport_stats.rx_unicast_packets +
+ tmp_vport_stats.rx_multicast_packets +
+ tmp_vport_stats.rx_broadcast_packets;
+ priv->stats.rx_bytes = tmp_vport_stats.rx_unicast_bytes +
+ tmp_vport_stats.rx_multicast_bytes +
+ tmp_vport_stats.rx_broadcast_bytes;
+ priv->stats.tx_packets = tmp_vport_stats.tx_unicast_packets +
+ tmp_vport_stats.tx_multicast_packets +
+ tmp_vport_stats.tx_broadcast_packets;
+ priv->stats.tx_bytes = tmp_vport_stats.tx_unicast_bytes +
+ tmp_vport_stats.tx_multicast_bytes +
+ tmp_vport_stats.tx_broadcast_bytes;
+ /* PF&VFs are not expected to report errors in ifconfig.
+ * rx_errors will be reported in PF's ethtool statistics,
+ * see: mlx4_en_DUMP_ETH_STATS
+ */
+ priv->stats.rx_errors = 0;
+ priv->stats.rx_dropped = tmp_vport_stats.rx_dropped;
+ priv->stats.tx_dropped = tmp_vport_stats.tx_dropped;
+ priv->stats.multicast = vf_stats->rx_multicast_packets;
+
+ spin_unlock_bh(&priv->stats_lock);
+ }
+
+ return err;
+}
+#endif
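
Reviewer note on en_stats_adder() above: the helper sums one counter
per priority by stepping through the mailbox with a constant stride,
taken as the distance between the first two priority fields. The
following is a minimal standalone sketch of that idea, assuming a flat
byte buffer in place of struct mlx4_en_stat_out_mbox. load_be64(),
store_be64() and stats_adder() are hypothetical names for illustration
and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a big-endian 64-bit value (portable stand-in for be64_to_cpu). */
static inline uint64_t load_be64(const unsigned char *p)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Encode a 64-bit value as big-endian; useful for building test buffers. */
static inline void store_be64(unsigned char *p, uint64_t v)
{
	int i;

	for (i = 7; i >= 0; i--) {
		p[i] = (unsigned char)(v & 0xff);
		v >>= 8;
	}
}

/*
 * Sum num big-endian counters. The first counter is at start, and each
 * following counter sits (next - start) bytes after the previous one.
 */
static inline uint64_t stats_adder(const unsigned char *start,
				   const unsigned char *next, int num)
{
	size_t stride;
	uint64_t sum = 0;
	int i;

	assert(next > start);
	stride = (size_t)(next - start);

	for (i = 0; i < num; i++)
		sum += load_be64(start + (size_t)i * stride);
	return sum;
}
```

Unlike the helper in the patch, the sketch accumulates into uint64_t.
The patch uses unsigned long, which is narrower than the 64-bit
hardware counters on 32-bit targets.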
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/en_port.h b/drivers/net/mlnx_uio/mlnx/mlx4/en_port.h
new file mode 100644
index 0000000..6f432d9
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/en_port.h
@@ -0,0 +1,593 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2007 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#ifndef _MLX4_EN_PORT_H_
+#define _MLX4_EN_PORT_H_
+
+
+#define SET_PORT_GEN_ALL_VALID 0x7
+#define SET_PORT_PROMISC_SHIFT 31
+#define SET_PORT_MC_PROMISC_SHIFT 30
+
+#define MLX4_EN_NUM_TC 8
+
+#define VLAN_FLTR_SIZE 128
+struct mlx4_set_vlan_fltr_mbox {
+ __be32 entry[VLAN_FLTR_SIZE];
+};
+
+
+enum {
+ MLX4_MCAST_CONFIG = 0,
+ MLX4_MCAST_DISABLE = 1,
+ MLX4_MCAST_ENABLE = 2,
+};
+
+enum mlx4_link_mode {
+ MLX4_1000BASE_CX_SGMII = 0,
+ MLX4_1000BASE_KX = 1,
+ MLX4_10GBASE_CX4 = 2,
+ MLX4_10GBASE_KX4 = 3,
+ MLX4_10GBASE_KR = 4,
+ MLX4_20GBASE_KR2 = 5,
+ MLX4_40GBASE_CR4 = 6,
+ MLX4_40GBASE_KR4 = 7,
+ MLX4_56GBASE_KR4 = 8,
+ MLX4_10GBASE_CR = 12,
+ MLX4_10GBASE_SR = 13,
+ MLX4_40GBASE_SR4 = 15,
+ MLX4_56GBASE_CR4 = 17,
+ MLX4_56GBASE_SR4 = 18,
+ MLX4_100BASE_TX = 24,
+ MLX4_1000BASE_T = 25,
+ MLX4_10GBASE_T = 26,
+};
+
+#define MLX4_PROT_MASK(link_mode) (1 << (link_mode))
+
+enum {
+ MLX4_EN_100M_SPEED = 0x04,
+ MLX4_EN_10G_SPEED_XAUI = 0x00,
+ MLX4_EN_10G_SPEED_XFI = 0x01,
+ MLX4_EN_1G_SPEED = 0x02,
+ MLX4_EN_20G_SPEED = 0x08,
+ MLX4_EN_40G_SPEED = 0x40,
+ MLX4_EN_56G_SPEED = 0x20,
+ MLX4_EN_OTHER_SPEED = 0x0f,
+};
+
+struct mlx4_en_query_port_context {
+ u8 link_up;
+#define MLX4_EN_LINK_UP_MASK 0x80
+#define MLX4_EN_ANC_MASK 0x40
+ u8 autoneg;
+#define MLX4_EN_AUTONEG_MASK 0x80
+ __be16 mtu;
+ u8 reserved2;
+ u8 link_speed;
+#define MLX4_EN_SPEED_MASK 0x6f
+ u16 reserved3[5];
+ __be64 mac;
+ u8 transceiver;
+};
+
+
+struct mlx4_en_stat_out_mbox {
+ /* Received frames with a length of 64 octets */
+ __be64 R64_prio_0;
+ __be64 R64_prio_1;
+ __be64 R64_prio_2;
+ __be64 R64_prio_3;
+ __be64 R64_prio_4;
+ __be64 R64_prio_5;
+ __be64 R64_prio_6;
+ __be64 R64_prio_7;
+ __be64 R64_novlan;
+ /* Received frames with a length of 127 octets */
+ __be64 R127_prio_0;
+ __be64 R127_prio_1;
+ __be64 R127_prio_2;
+ __be64 R127_prio_3;
+ __be64 R127_prio_4;
+ __be64 R127_prio_5;
+ __be64 R127_prio_6;
+ __be64 R127_prio_7;
+ __be64 R127_novlan;
+ /* Received frames with a length of 128 to 255 octets */
+ __be64 R255_prio_0;
+ __be64 R255_prio_1;
+ __be64 R255_prio_2;
+ __be64 R255_prio_3;
+ __be64 R255_prio_4;
+ __be64 R255_prio_5;
+ __be64 R255_prio_6;
+ __be64 R255_prio_7;
+ __be64 R255_novlan;
+ /* Received frames with a length of 256 to 511 octets */
+ __be64 R511_prio_0;
+ __be64 R511_prio_1;
+ __be64 R511_prio_2;
+ __be64 R511_prio_3;
+ __be64 R511_prio_4;
+ __be64 R511_prio_5;
+ __be64 R511_prio_6;
+ __be64 R511_prio_7;
+ __be64 R511_novlan;
+ /* Received frames with a length of 512 to 1023 octets */
+ __be64 R1023_prio_0;
+ __be64 R1023_prio_1;
+ __be64 R1023_prio_2;
+ __be64 R1023_prio_3;
+ __be64 R1023_prio_4;
+ __be64 R1023_prio_5;
+ __be64 R1023_prio_6;
+ __be64 R1023_prio_7;
+ __be64 R1023_novlan;
+ /* Received frames with a length of 1024 to 1518 octets */
+ __be64 R1518_prio_0;
+ __be64 R1518_prio_1;
+ __be64 R1518_prio_2;
+ __be64 R1518_prio_3;
+ __be64 R1518_prio_4;
+ __be64 R1518_prio_5;
+ __be64 R1518_prio_6;
+ __be64 R1518_prio_7;
+ __be64 R1518_novlan;
+ /* Received frames with a length of 1519 to 1522 octets */
+ __be64 R1522_prio_0;
+ __be64 R1522_prio_1;
+ __be64 R1522_prio_2;
+ __be64 R1522_prio_3;
+ __be64 R1522_prio_4;
+ __be64 R1522_prio_5;
+ __be64 R1522_prio_6;
+ __be64 R1522_prio_7;
+ __be64 R1522_novlan;
+ /* Received frames with a length of 1523 to 1548 octets */
+ __be64 R1548_prio_0;
+ __be64 R1548_prio_1;
+ __be64 R1548_prio_2;
+ __be64 R1548_prio_3;
+ __be64 R1548_prio_4;
+ __be64 R1548_prio_5;
+ __be64 R1548_prio_6;
+ __be64 R1548_prio_7;
+ __be64 R1548_novlan;
+ /* Received frames with a length of 1549 to MTU octets */
+ __be64 R2MTU_prio_0;
+ __be64 R2MTU_prio_1;
+ __be64 R2MTU_prio_2;
+ __be64 R2MTU_prio_3;
+ __be64 R2MTU_prio_4;
+ __be64 R2MTU_prio_5;
+ __be64 R2MTU_prio_6;
+ __be64 R2MTU_prio_7;
+ __be64 R2MTU_novlan;
+ /* Received frames with a length greater than MTU octets and a good CRC */
+ __be64 RGIANT_prio_0;
+ __be64 RGIANT_prio_1;
+ __be64 RGIANT_prio_2;
+ __be64 RGIANT_prio_3;
+ __be64 RGIANT_prio_4;
+ __be64 RGIANT_prio_5;
+ __be64 RGIANT_prio_6;
+ __be64 RGIANT_prio_7;
+ __be64 RGIANT_novlan;
+ /* Received broadcast frames with good CRC */
+ __be64 RBCAST_prio_0;
+ __be64 RBCAST_prio_1;
+ __be64 RBCAST_prio_2;
+ __be64 RBCAST_prio_3;
+ __be64 RBCAST_prio_4;
+ __be64 RBCAST_prio_5;
+ __be64 RBCAST_prio_6;
+ __be64 RBCAST_prio_7;
+ __be64 RBCAST_novlan;
+ /* Received multicast frames with good CRC */
+ __be64 MCAST_prio_0;
+ __be64 MCAST_prio_1;
+ __be64 MCAST_prio_2;
+ __be64 MCAST_prio_3;
+ __be64 MCAST_prio_4;
+ __be64 MCAST_prio_5;
+ __be64 MCAST_prio_6;
+ __be64 MCAST_prio_7;
+ __be64 MCAST_novlan;
+ /* Received unicast frames with a good CRC that are neither short nor giant */
+ __be64 RTOTG_prio_0;
+ __be64 RTOTG_prio_1;
+ __be64 RTOTG_prio_2;
+ __be64 RTOTG_prio_3;
+ __be64 RTOTG_prio_4;
+ __be64 RTOTG_prio_5;
+ __be64 RTOTG_prio_6;
+ __be64 RTOTG_prio_7;
+ __be64 RTOTG_novlan;
+
+ /* Count of total octets of received frames, includes framing characters */
+ __be64 RTTLOCT_prio_0;
+ /* Count of total octets of received frames, not including framing
+ characters */
+ __be64 RTTLOCT_NOFRM_prio_0;
+ /* Count of Total number of octets received
+ (only for frames without errors) */
+ __be64 ROCT_prio_0;
+
+ __be64 RTTLOCT_prio_1;
+ __be64 RTTLOCT_NOFRM_prio_1;
+ __be64 ROCT_prio_1;
+
+ __be64 RTTLOCT_prio_2;
+ __be64 RTTLOCT_NOFRM_prio_2;
+ __be64 ROCT_prio_2;
+
+ __be64 RTTLOCT_prio_3;
+ __be64 RTTLOCT_NOFRM_prio_3;
+ __be64 ROCT_prio_3;
+
+ __be64 RTTLOCT_prio_4;
+ __be64 RTTLOCT_NOFRM_prio_4;
+ __be64 ROCT_prio_4;
+
+ __be64 RTTLOCT_prio_5;
+ __be64 RTTLOCT_NOFRM_prio_5;
+ __be64 ROCT_prio_5;
+
+ __be64 RTTLOCT_prio_6;
+ __be64 RTTLOCT_NOFRM_prio_6;
+ __be64 ROCT_prio_6;
+
+ __be64 RTTLOCT_prio_7;
+ __be64 RTTLOCT_NOFRM_prio_7;
+ __be64 ROCT_prio_7;
+
+ __be64 RTTLOCT_novlan;
+ __be64 RTTLOCT_NOFRM_novlan;
+ __be64 ROCT_novlan;
+
+ /* Count of Total received frames including bad frames */
+ __be64 RTOT_prio_0;
+ /* Count of Total number of received frames with 802.1Q encapsulation */
+ __be64 R1Q_prio_0;
+ __be64 reserved1;
+
+ __be64 RTOT_prio_1;
+ __be64 R1Q_prio_1;
+ __be64 reserved2;
+
+ __be64 RTOT_prio_2;
+ __be64 R1Q_prio_2;
+ __be64 reserved3;
+
+ __be64 RTOT_prio_3;
+ __be64 R1Q_prio_3;
+ __be64 reserved4;
+
+ __be64 RTOT_prio_4;
+ __be64 R1Q_prio_4;
+ __be64 reserved5;
+
+ __be64 RTOT_prio_5;
+ __be64 R1Q_prio_5;
+ __be64 reserved6;
+
+ __be64 RTOT_prio_6;
+ __be64 R1Q_prio_6;
+ __be64 reserved7;
+
+ __be64 RTOT_prio_7;
+ __be64 R1Q_prio_7;
+ __be64 reserved8;
+
+ __be64 RTOT_novlan;
+ __be64 R1Q_novlan;
+ __be64 reserved9;
+
+ /* Total number of Successfully Received Control Frames */
+ __be64 RCNTL;
+ __be64 reserved10;
+ __be64 reserved11;
+ __be64 reserved12;
+ /* Count of received frames with a length/type field value between 46
+ (42 for VLAN-tagged frames) and 1500 (also 1500 for VLAN-tagged frames),
+ inclusive */
+ __be64 RInRangeLengthErr;
+ /* Count of received frames with length/type field between 1501 and 1535
+ decimal, inclusive */
+ __be64 ROutRangeLengthErr;
+ /* Count of received frames that are longer than max allowed size for
+ 802.3 frames (1518/1522) */
+ __be64 RFrmTooLong;
+ /* Count frames received with PCS error */
+ __be64 PCS;
+
+ /* Transmit frames with a length of 64 octets */
+ __be64 T64_prio_0;
+ __be64 T64_prio_1;
+ __be64 T64_prio_2;
+ __be64 T64_prio_3;
+ __be64 T64_prio_4;
+ __be64 T64_prio_5;
+ __be64 T64_prio_6;
+ __be64 T64_prio_7;
+ __be64 T64_novlan;
+ __be64 T64_loopbk;
+ /* Transmit frames with a length of 65 to 127 octets. */
+ __be64 T127_prio_0;
+ __be64 T127_prio_1;
+ __be64 T127_prio_2;
+ __be64 T127_prio_3;
+ __be64 T127_prio_4;
+ __be64 T127_prio_5;
+ __be64 T127_prio_6;
+ __be64 T127_prio_7;
+ __be64 T127_novlan;
+ __be64 T127_loopbk;
+ /* Transmit frames with a length of 128 to 255 octets */
+ __be64 T255_prio_0;
+ __be64 T255_prio_1;
+ __be64 T255_prio_2;
+ __be64 T255_prio_3;
+ __be64 T255_prio_4;
+ __be64 T255_prio_5;
+ __be64 T255_prio_6;
+ __be64 T255_prio_7;
+ __be64 T255_novlan;
+ __be64 T255_loopbk;
+ /* Transmit frames with a length of 256 to 511 octets */
+ __be64 T511_prio_0;
+ __be64 T511_prio_1;
+ __be64 T511_prio_2;
+ __be64 T511_prio_3;
+ __be64 T511_prio_4;
+ __be64 T511_prio_5;
+ __be64 T511_prio_6;
+ __be64 T511_prio_7;
+ __be64 T511_novlan;
+ __be64 T511_loopbk;
+ /* Transmit frames with a length of 512 to 1023 octets */
+ __be64 T1023_prio_0;
+ __be64 T1023_prio_1;
+ __be64 T1023_prio_2;
+ __be64 T1023_prio_3;
+ __be64 T1023_prio_4;
+ __be64 T1023_prio_5;
+ __be64 T1023_prio_6;
+ __be64 T1023_prio_7;
+ __be64 T1023_novlan;
+ __be64 T1023_loopbk;
+ /* Transmit frames with a length of 1024 to 1518 octets */
+ __be64 T1518_prio_0;
+ __be64 T1518_prio_1;
+ __be64 T1518_prio_2;
+ __be64 T1518_prio_3;
+ __be64 T1518_prio_4;
+ __be64 T1518_prio_5;
+ __be64 T1518_prio_6;
+ __be64 T1518_prio_7;
+ __be64 T1518_novlan;
+ __be64 T1518_loopbk;
+ /* Counts transmit frames with a length of 1519 to 1522 bytes */
+ __be64 T1522_prio_0;
+ __be64 T1522_prio_1;
+ __be64 T1522_prio_2;
+ __be64 T1522_prio_3;
+ __be64 T1522_prio_4;
+ __be64 T1522_prio_5;
+ __be64 T1522_prio_6;
+ __be64 T1522_prio_7;
+ __be64 T1522_novlan;
+ __be64 T1522_loopbk;
+ /* Transmit frames with a length of 1523 to 1548 octets */
+ __be64 T1548_prio_0;
+ __be64 T1548_prio_1;
+ __be64 T1548_prio_2;
+ __be64 T1548_prio_3;
+ __be64 T1548_prio_4;
+ __be64 T1548_prio_5;
+ __be64 T1548_prio_6;
+ __be64 T1548_prio_7;
+ __be64 T1548_novlan;
+ __be64 T1548_loopbk;
+ /* Counts transmit frames with a length of 1549 to MTU bytes */
+ __be64 T2MTU_prio_0;
+ __be64 T2MTU_prio_1;
+ __be64 T2MTU_prio_2;
+ __be64 T2MTU_prio_3;
+ __be64 T2MTU_prio_4;
+ __be64 T2MTU_prio_5;
+ __be64 T2MTU_prio_6;
+ __be64 T2MTU_prio_7;
+ __be64 T2MTU_novlan;
+ __be64 T2MTU_loopbk;
+ /* Transmit frames with a length greater than MTU octets and a good CRC. */
+ __be64 TGIANT_prio_0;
+ __be64 TGIANT_prio_1;
+ __be64 TGIANT_prio_2;
+ __be64 TGIANT_prio_3;
+ __be64 TGIANT_prio_4;
+ __be64 TGIANT_prio_5;
+ __be64 TGIANT_prio_6;
+ __be64 TGIANT_prio_7;
+ __be64 TGIANT_novlan;
+ __be64 TGIANT_loopbk;
+ /* Transmit broadcast frames with a good CRC */
+ __be64 TBCAST_prio_0;
+ __be64 TBCAST_prio_1;
+ __be64 TBCAST_prio_2;
+ __be64 TBCAST_prio_3;
+ __be64 TBCAST_prio_4;
+ __be64 TBCAST_prio_5;
+ __be64 TBCAST_prio_6;
+ __be64 TBCAST_prio_7;
+ __be64 TBCAST_novlan;
+ __be64 TBCAST_loopbk;
+ /* Transmit multicast frames with a good CRC */
+ __be64 TMCAST_prio_0;
+ __be64 TMCAST_prio_1;
+ __be64 TMCAST_prio_2;
+ __be64 TMCAST_prio_3;
+ __be64 TMCAST_prio_4;
+ __be64 TMCAST_prio_5;
+ __be64 TMCAST_prio_6;
+ __be64 TMCAST_prio_7;
+ __be64 TMCAST_novlan;
+ __be64 TMCAST_loopbk;
+ /* Transmit good frames that are neither broadcast nor multicast */
+ __be64 TTOTG_prio_0;
+ __be64 TTOTG_prio_1;
+ __be64 TTOTG_prio_2;
+ __be64 TTOTG_prio_3;
+ __be64 TTOTG_prio_4;
+ __be64 TTOTG_prio_5;
+ __be64 TTOTG_prio_6;
+ __be64 TTOTG_prio_7;
+ __be64 TTOTG_novlan;
+ __be64 TTOTG_loopbk;
+
+ /* total octets of transmitted frames, including framing characters */
+ __be64 TTTLOCT_prio_0;
+ /* total octets of transmitted frames, not including framing characters */
+ __be64 TTTLOCT_NOFRM_prio_0;
+ /* ifOutOctets */
+ __be64 TOCT_prio_0;
+
+ __be64 TTTLOCT_prio_1;
+ __be64 TTTLOCT_NOFRM_prio_1;
+ __be64 TOCT_prio_1;
+
+ __be64 TTTLOCT_prio_2;
+ __be64 TTTLOCT_NOFRM_prio_2;
+ __be64 TOCT_prio_2;
+
+ __be64 TTTLOCT_prio_3;
+ __be64 TTTLOCT_NOFRM_prio_3;
+ __be64 TOCT_prio_3;
+
+ __be64 TTTLOCT_prio_4;
+ __be64 TTTLOCT_NOFRM_prio_4;
+ __be64 TOCT_prio_4;
+
+ __be64 TTTLOCT_prio_5;
+ __be64 TTTLOCT_NOFRM_prio_5;
+ __be64 TOCT_prio_5;
+
+ __be64 TTTLOCT_prio_6;
+ __be64 TTTLOCT_NOFRM_prio_6;
+ __be64 TOCT_prio_6;
+
+ __be64 TTTLOCT_prio_7;
+ __be64 TTTLOCT_NOFRM_prio_7;
+ __be64 TOCT_prio_7;
+
+ __be64 TTTLOCT_novlan;
+ __be64 TTTLOCT_NOFRM_novlan;
+ __be64 TOCT_novlan;
+
+ __be64 TTTLOCT_loopbk;
+ __be64 TTTLOCT_NOFRM_loopbk;
+ __be64 TOCT_loopbk;
+
+ /* Total frames transmitted with a good CRC that are not aborted */
+ __be64 TTOT_prio_0;
+ /* Total number of frames transmitted with 802.1Q encapsulation */
+ __be64 T1Q_prio_0;
+ __be64 reserved13;
+
+ __be64 TTOT_prio_1;
+ __be64 T1Q_prio_1;
+ __be64 reserved14;
+
+ __be64 TTOT_prio_2;
+ __be64 T1Q_prio_2;
+ __be64 reserved15;
+
+ __be64 TTOT_prio_3;
+ __be64 T1Q_prio_3;
+ __be64 reserved16;
+
+ __be64 TTOT_prio_4;
+ __be64 T1Q_prio_4;
+ __be64 reserved17;
+
+ __be64 TTOT_prio_5;
+ __be64 T1Q_prio_5;
+ __be64 reserved18;
+
+ __be64 TTOT_prio_6;
+ __be64 T1Q_prio_6;
+ __be64 reserved19;
+
+ __be64 TTOT_prio_7;
+ __be64 T1Q_prio_7;
+ __be64 reserved20;
+
+ __be64 TTOT_novlan;
+ __be64 T1Q_novlan;
+ __be64 reserved21;
+
+ __be64 TTOT_loopbk;
+ __be64 T1Q_loopbk;
+ __be64 reserved22;
+
+ /* Received frames with a length greater than MTU octets and a bad CRC */
+ __be32 RJBBR;
+ /* Received frames with a bad CRC that are not runts, jabbers,
+ or alignment errors */
+ __be32 RCRC;
+ /* Received frames with SFD with a length of less than 64 octets and a
+ bad CRC */
+ __be32 RRUNT;
+ /* Received frames with a length less than 64 octets and a good CRC */
+ __be32 RSHORT;
+ /* Total Number of Received Packets Dropped */
+ __be32 RDROP;
+ /* Drop due to overflow */
+ __be32 RdropOvflw;
+ /* Drop due to length error */
+ __be32 RdropLength;
+ /* Total of good frames. Does not include frames received with
+ frame-too-long, FCS, or length errors */
+ __be32 RTOTFRMS;
+ /* Total dropped transmitted packets */
+ __be32 TDROP;
+};
+
+
+#endif
+
+#include "post_kmod.h"
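The masks and speed codes in mlx4_en_port.h are easier to follow with a
small decoder. The following is only a minimal sketch, assuming the
link_up and link_speed bytes have already been read out of a
struct mlx4_en_query_port_context. The helper names example_link_is_up()
and example_link_speed_mbps() are hypothetical and are not part of this
patch, and the Mb/s values are inferred from the enumerator names.

```c
#include <stdint.h>

/* Constants copied from mlx4_en_port.h in this patch. */
#define MLX4_EN_LINK_UP_MASK 0x80
#define MLX4_EN_SPEED_MASK   0x6f

enum {
	MLX4_EN_100M_SPEED     = 0x04,
	MLX4_EN_10G_SPEED_XAUI = 0x00,
	MLX4_EN_10G_SPEED_XFI  = 0x01,
	MLX4_EN_1G_SPEED       = 0x02,
	MLX4_EN_20G_SPEED      = 0x08,
	MLX4_EN_40G_SPEED      = 0x40,
	MLX4_EN_56G_SPEED      = 0x20,
	MLX4_EN_OTHER_SPEED    = 0x0f,
};

/* Hypothetical helper: non-zero when the link-up bit is set. */
int example_link_is_up(uint8_t link_up)
{
	return (link_up & MLX4_EN_LINK_UP_MASK) != 0;
}

/*
 * Hypothetical helper: map the masked link_speed byte to Mb/s.
 * Returns -1 for MLX4_EN_OTHER_SPEED and any unrecognized code.
 */
int example_link_speed_mbps(uint8_t link_speed)
{
	switch (link_speed & MLX4_EN_SPEED_MASK) {
	case MLX4_EN_100M_SPEED:
		return 100;
	case MLX4_EN_1G_SPEED:
		return 1000;
	case MLX4_EN_10G_SPEED_XAUI:
	case MLX4_EN_10G_SPEED_XFI:
		return 10000;
	case MLX4_EN_20G_SPEED:
		return 20000;
	case MLX4_EN_40G_SPEED:
		return 40000;
	case MLX4_EN_56G_SPEED:
		return 56000;
	default:
		return -1;
	}
}
```

Bits outside MLX4_EN_SPEED_MASK (0x80 and 0x10) are ignored, so a raw
byte of 0x82 decodes the same way as 0x02.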
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/en_resources.c b/drivers/net/mlnx_uio/mlnx/mlx4/en_resources.c
new file mode 100644
index 0000000..8b17cae
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/en_resources.c
@@ -0,0 +1,184 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2007 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+
+#include "mlx4_en.h"
+#include "log2.h"
+
+void mlx4_en_fill_qp_context(struct mlx4_en_priv *priv, int size, int stride,
+ int is_tx, int rss, int qpn, int cqn,
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ int user_prio, struct mlx4_qp_context *context)
+#else
+ struct mlx4_qp_context *context)
+#endif
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ //struct net_device *dev = priv->dev;
+ struct rte_eth_dev *rte_dev = priv->rte_dev;
+
+ memset(context, 0, sizeof *context);
+ context->flags = cpu_to_be32(7 << 16 | rss << MLX4_RSS_QPC_FLAG_OFFSET);
+ context->pd = cpu_to_be32(mdev->priv_pdn);
+ context->mtu_msgmax = 0xff;
+ if (!is_tx && !rss)
+ context->rq_size_stride = ilog2(size) << 3 | (ilog2(stride) - 4);
+ if (is_tx) {
+ context->sq_size_stride = ilog2(size) << 3 | (ilog2(stride) - 4);
+ if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_PORT_REMAP)
+ context->params2 |= MLX4_QP_BIT_FPP;
+
+ } else {
+ context->sq_size_stride = ilog2(TXBB_SIZE) - 4;
+ }
+ context->usr_page = cpu_to_be32(mdev->priv_uar.index);
+ context->local_qpn = cpu_to_be32(qpn);
+ context->pri_path.ackto = 1 & 0x07;
+ context->pri_path.sched_queue = 0x83 | (priv->port - 1) << 6;
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ if (user_prio >= 0) {
+ context->pri_path.sched_queue |= user_prio << 3;
+ context->pri_path.feup = MLX4_FEUP_FORCE_ETH_UP;
+ }
+#endif
+ context->pri_path.counter_index = (u8)(priv->counter_index);
+ context->cqn_send = cpu_to_be32(cqn);
+ context->cqn_recv = cpu_to_be32(cqn);
+ if (!rss &&
+ (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_LB_SRC_CHK) &&
+ context->pri_path.counter_index != MLX4_SINK_COUNTER_INDEX) {
+ /* disable multicast loopback to qp with same counter */
+#ifdef KMOD_MODIFIED
+ if(!(rte_dev->data->dev_conf.lpbk_mode))
+ context->pri_path.fl |= MLX4_FL_ETH_SRC_CHECK_MC_LB;
+#else
+ if (!(dev->features & NETIF_F_LOOPBACK))
+ context->pri_path.fl |= MLX4_FL_ETH_SRC_CHECK_MC_LB;
+#endif
+ context->pri_path.control |=
+ MLX4_CTRL_ETH_SRC_CHECK_IF_COUNTER;
+ }
+ context->db_rec_addr = cpu_to_be64(priv->res.db.dma << 2);
+#ifdef KMOD_MODIFIED
+ //XXX VLAN rte_dev->data->dev_conf.rxmode.hw_vlan_strip;
+#else
+ if (!(dev->features & NETIF_F_HW_VLAN_CTAG_RX))
+ context->param3 |= cpu_to_be32(1 << 30);
+#endif
+#ifdef KMOD_MODIFIED
+ //XXX DISABLE inline scatter
+#else
+ if (!is_tx && !rss &&
+ (priv->prof->inline_scatter_thold >= MIN_INLINE_SCATTER)) {
+ context->param3 |= cpu_to_be32(1 << 25);
+ }
+#endif
+
+#ifdef KMOD_MODIFIED
+ //XXX DISABLE VXLAN tunnel offloading
+ //we usually use multi rx rings with rss
+#else
+ if (!is_tx && !rss &&
+ (mdev->dev->caps.tunnel_offload_mode == MLX4_TUNNEL_OFFLOAD_MODE_VXLAN)) {
+ en_dbg(HW, priv, "Setting RX qp %x tunnel mode to RX tunneled & non-tunneled\n", qpn);
+ context->srqn = cpu_to_be32(7 << 28); /* this fills bits 30:28 */
+ }
+#endif
+}
+
+int mlx4_en_change_mcast_loopback(struct mlx4_en_priv *priv, struct mlx4_qp *qp,
+ int loopback)
+{
+ int ret;
+ struct mlx4_update_qp_params qp_params;
+
+ memset(&qp_params, 0, sizeof(qp_params));
+ if (!loopback)
+ qp_params.flags = MLX4_UPDATE_QP_PARAMS_FLAGS_ETH_CHECK_MC_LB;
+
+ ret = mlx4_update_qp(priv->mdev->dev, qp->qpn,
+ MLX4_UPDATE_QP_ETH_SRC_CHECK_MC_LB,
+ &qp_params);
+
+ return ret;
+}
+
+int mlx4_en_map_buffer(struct mlx4_buf *buf)
+{
+ //struct page **pages;
+ //int i;
+
+ if (BITS_PER_LONG == 64 || buf->nbufs == 1)
+ return 0;
+#ifdef KMOD_MODIFIED
+ assert(buf->nbufs == 1); //we do not provide vmap of virtual addresses
+ return -1;
+#else
+
+ pages = kmalloc(sizeof *pages * buf->nbufs, GFP_KERNEL);
+ if (!pages)
+ return -ENOMEM;
+
+ for (i = 0; i < buf->nbufs; ++i)
+ pages[i] = virt_to_page(buf->page_list[i].buf);
+
+ buf->direct.buf = vmap(pages, buf->nbufs, VM_MAP, PAGE_KERNEL);
+ kfree(pages);
+ if (!buf->direct.buf)
+ return -ENOMEM;
+
+ return 0;
+#endif
+}
+
+void mlx4_en_unmap_buffer(struct mlx4_buf *buf)
+{
+ if (BITS_PER_LONG == 64 || buf->nbufs == 1)
+ return;
+#ifdef KMOD_MODIFIED
+ assert(buf->nbufs == 1); //we do not provide vmap of virtual addresses
+ return;
+#else
+ vunmap(buf->direct.buf);
+#endif
+}
+
+void mlx4_en_sqp_event(struct mlx4_qp *qp, enum mlx4_event event)
+{
+ return;
+}
+
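mlx4_en_fill_qp_context() packs two values into single bytes:
rq_size_stride / sq_size_stride and pri_path.sched_queue. The sketch
below reproduces only that arithmetic so the bit layout can be checked
in isolation. It assumes size and stride are powers of two with a
stride of at least 16 bytes. The names example_ilog2(),
example_size_stride() and example_sched_queue() are hypothetical and do
not exist in the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's ilog2() on a power of two. */
unsigned int example_ilog2(uint32_t v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/*
 * Mirrors "ilog2(size) << 3 | (ilog2(stride) - 4)":
 * log2(size) lands in the upper five bits and log2(stride) - 4
 * in the lower three bits of the byte.
 */
uint8_t example_size_stride(uint32_t size, uint32_t stride)
{
	unsigned int log_size, log_stride;

	assert(size != 0 && (size & (size - 1)) == 0);
	assert(stride >= 16 && (stride & (stride - 1)) == 0);

	log_size = example_ilog2(size);
	log_stride = example_ilog2(stride) - 4;
	assert(log_stride < 8);	/* must fit in three bits */

	return (uint8_t)(log_size << 3 | log_stride);
}

/* Mirrors "0x83 | (priv->port - 1) << 6" for the 1-based ports 1 and 2. */
uint8_t example_sched_queue(int port)
{
	assert(port == 1 || port == 2);
	return (uint8_t)(0x83 | (port - 1) << 6);
}
```

For a ring of 1024 entries with a 64-byte stride this gives
10 << 3 | 2 = 0x52, and port 2 sets bit 6 of sched_queue (0xc3).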
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/en_rx.c b/drivers/net/mlnx_uio/mlnx/mlx4/en_rx.c
new file mode 100644
index 0000000..31ccb18
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/en_rx.c
@@ -0,0 +1,1565 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2007 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+#ifdef HAVE_SKB_MARK_NAPI_ID
+#endif
+
+#if IS_ENABLED(CONFIG_IPV6)
+#endif
+
+#include "mlx4_en.h"
+
+static int mlx4_alloc_pages(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_alloc *page_alloc,
+ const struct mlx4_en_frag_info *frag_info,
+ gfp_t _gfp)
+{
+ int order;
+ struct page *page;
+ dma_addr_t dma;
+
+ for (order = MLX4_EN_ALLOC_PREFER_ORDER; ;) {
+ gfp_t gfp = _gfp;
+
+ if (order)
+ gfp |= __GFP_COMP | __GFP_NOWARN;
+ page = alloc_pages(gfp, order);
+ if (likely(page))
+ break;
+ if (--order < 0 ||
+ ((PAGE_SIZE << order) < frag_info->frag_size))
+ return -ENOMEM;
+ }
+ dma = dma_map_page(priv->ddev, page, 0, PAGE_SIZE << order,
+ PCI_DMA_FROMDEVICE);
+ if (dma_mapping_error(priv->ddev, dma)) {
+ put_page(page);
+ return -ENOMEM;
+ }
+ page_alloc->page_size = PAGE_SIZE << order;
+ page_alloc->page = page;
+ page_alloc->dma = dma;
+ page_alloc->page_offset = 0;
+ /* Not doing get_page() for each frag is a big win
+ * on asymmetric workloads. Note we cannot use atomic_set().
+ */
+ atomic_add(page_alloc->page_size / frag_info->frag_stride - 1,
+ &page->_count);
+ return 0;
+}
+
+static int mlx4_en_alloc_frags(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_desc *rx_desc,
+ struct mlx4_en_rx_alloc *frags,
+ struct mlx4_en_rx_alloc *ring_alloc,
+ gfp_t gfp)
+{
+ struct mlx4_en_rx_alloc page_alloc[MLX4_EN_MAX_RX_FRAGS];
+ const struct mlx4_en_frag_info *frag_info;
+ struct page *page;
+ dma_addr_t dma;
+ int i;
+
+ for (i = 0; i < priv->num_frags; i++) {
+ frag_info = &priv->frag_info[i];
+ page_alloc[i] = ring_alloc[i];
+ page_alloc[i].page_offset += frag_info->frag_stride;
+
+ if (page_alloc[i].page_offset + frag_info->frag_stride <=
+ ring_alloc[i].page_size)
+ continue;
+
+ if (mlx4_alloc_pages(priv, &page_alloc[i], frag_info, gfp))
+ goto out;
+ }
+
+ for (i = 0; i < priv->num_frags; i++) {
+ frags[i] = ring_alloc[i];
+ dma = ring_alloc[i].dma + ring_alloc[i].page_offset;
+ ring_alloc[i] = page_alloc[i];
+ rx_desc->data[i].addr = cpu_to_be64(dma);
+ }
+
+ return 0;
+
+out:
+ while (i--) {
+ if (page_alloc[i].page != ring_alloc[i].page) {
+ dma_unmap_page(priv->ddev, page_alloc[i].dma,
+ page_alloc[i].page_size, PCI_DMA_FROMDEVICE);
+ page = page_alloc[i].page;
+ atomic_set(&page->_count, 1);
+ put_page(page);
+ }
+ }
+ return -ENOMEM;
+}
+
+static void mlx4_en_free_frag(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_alloc *frags,
+ int i)
+{
+ const struct mlx4_en_frag_info *frag_info = &priv->frag_info[i];
+ u32 next_frag_end = frags[i].page_offset + 2 * frag_info->frag_stride;
+
+
+ if (next_frag_end > frags[i].page_size)
+ dma_unmap_page(priv->ddev, frags[i].dma, frags[i].page_size,
+ PCI_DMA_FROMDEVICE);
+
+ if (frags[i].page)
+ put_page(frags[i].page);
+}
+
+static int mlx4_en_init_allocator(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_ring *ring)
+{
+ int i;
+ struct mlx4_en_rx_alloc *page_alloc;
+
+ for (i = 0; i < priv->num_frags; i++) {
+ const struct mlx4_en_frag_info *frag_info = &priv->frag_info[i];
+
+ if (mlx4_alloc_pages(priv, &ring->page_alloc[i],
+ frag_info, GFP_KERNEL | __GFP_COLD))
+ goto out;
+
+ en_dbg(DRV, priv, " frag %d allocator: - size:%d frags:%d\n",
+ i, ring->page_alloc[i].page_size,
+ atomic_read(&ring->page_alloc[i].page->_count));
+ }
+ return 0;
+
+out:
+ while (i--) {
+ struct page *page;
+
+ page_alloc = &ring->page_alloc[i];
+ dma_unmap_page(priv->ddev, page_alloc->dma,
+ page_alloc->page_size, PCI_DMA_FROMDEVICE);
+ page = page_alloc->page;
+ atomic_set(&page->_count, 1);
+ put_page(page);
+ page_alloc->page = NULL;
+ }
+ return -ENOMEM;
+}
+
+static void mlx4_en_destroy_allocator(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_ring *ring)
+{
+ struct mlx4_en_rx_alloc *page_alloc;
+ int i;
+
+ for (i = 0; i < priv->num_frags; i++) {
+ const struct mlx4_en_frag_info *frag_info = &priv->frag_info[i];
+
+ page_alloc = &ring->page_alloc[i];
+ en_dbg(DRV, priv, "Freeing allocator:%d count:%d\n",
+ i, page_count(page_alloc->page));
+
+ dma_unmap_page(priv->ddev, page_alloc->dma,
+ page_alloc->page_size, PCI_DMA_FROMDEVICE);
+ while (page_alloc->page_offset + frag_info->frag_stride <
+ page_alloc->page_size) {
+ put_page(page_alloc->page);
+ page_alloc->page_offset += frag_info->frag_stride;
+ }
+ page_alloc->page = NULL;
+ }
+}
+
+static void mlx4_en_init_rx_desc(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_ring *ring, int index)
+{
+ struct mlx4_en_rx_desc *rx_desc = ring->buf + ring->stride * index;
+ int possible_frags;
+ int i;
+
+ /* Set size and memtype fields */
+ for (i = 0; i < priv->num_frags; i++) {
+ rx_desc->data[i].byte_count =
+ cpu_to_be32(priv->frag_info[i].frag_size);
+ rx_desc->data[i].lkey = cpu_to_be32(priv->mdev->mr.key);
+ }
+
+ /* If the number of used fragments does not fill up the ring stride,
+ * remaining (unused) fragments must be padded with null address/size
+ * and a special memory key */
+ possible_frags = (ring->stride - sizeof(struct mlx4_en_rx_desc)) / DS_SIZE;
+ for (i = priv->num_frags; i < possible_frags; i++) {
+ rx_desc->data[i].byte_count = 0;
+ rx_desc->data[i].lkey = cpu_to_be32(MLX4_EN_MEMTYPE_PAD);
+ rx_desc->data[i].addr = 0;
+ }
+}
+
+static int mlx4_en_prepare_rx_desc(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_ring *ring, int index,
+ gfp_t gfp)
+{
+ struct mlx4_en_rx_desc *rx_desc = ring->buf + (index * ring->stride);
+ struct mlx4_en_rx_alloc *frags = ring->rx_info +
+ (index << priv->log_rx_info);
+
+ return mlx4_en_alloc_frags(priv, rx_desc, frags, ring->page_alloc, gfp);
+}
+
+static inline bool mlx4_en_is_ring_empty(struct mlx4_en_rx_ring *ring)
+{
+ BUG_ON((u32)(ring->prod - ring->cons) > ring->actual_size);
+ return ring->prod == ring->cons;
+}
+
+static inline void mlx4_en_update_rx_prod_db(struct mlx4_en_rx_ring *ring)
+{
+ *ring->wqres.db.db = cpu_to_be32(ring->prod & 0xffff);
+}
+
+static void mlx4_en_free_rx_desc(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_ring *ring,
+ int index)
+{
+ struct mlx4_en_rx_alloc *frags;
+ int nr;
+
+ frags = ring->rx_info + (index << priv->log_rx_info);
+ for (nr = 0; nr < priv->num_frags; nr++) {
+ en_dbg(DRV, priv, "Freeing fragment:%d\n", nr);
+ mlx4_en_free_frag(priv, frags, nr);
+ }
+}
+
+#ifdef CONFIG_COMPAT_LRO_ENABLED
+static inline int mlx4_en_can_lro(__be16 status)
+{
+ static __be16 status_all;
+ static __be16 status_ipv4_ipok_tcp;
+
+ status_all = cpu_to_be16(
+ MLX4_CQE_STATUS_IPV4 |
+ MLX4_CQE_STATUS_IPV4F |
+ MLX4_CQE_STATUS_IPV6 |
+ MLX4_CQE_STATUS_IPV4OPT |
+ MLX4_CQE_STATUS_TCP |
+ MLX4_CQE_STATUS_UDP |
+ MLX4_CQE_STATUS_IPOK);
+
+ status_ipv4_ipok_tcp = cpu_to_be16(
+ MLX4_CQE_STATUS_IPV4 |
+ MLX4_CQE_STATUS_IPOK |
+ MLX4_CQE_STATUS_TCP);
+
+ status &= status_all;
+ return status == status_ipv4_ipok_tcp;
+}
+#endif
+
+static int mlx4_en_fill_rx_buffers(struct mlx4_en_priv *priv)
+{
+ struct mlx4_en_rx_ring *ring;
+ int ring_ind;
+ int buf_ind;
+ int new_size;
+
+ for (buf_ind = 0; buf_ind < priv->prof->rx_ring_size; buf_ind++) {
+ for (ring_ind = 0; ring_ind < priv->rx_ring_num; ring_ind++) {
+ ring = priv->rx_ring[ring_ind];
+
+ if (mlx4_en_prepare_rx_desc(priv, ring,
+ ring->actual_size,
+ GFP_KERNEL | __GFP_COLD)) {
+ if (ring->actual_size < MLX4_EN_MIN_RX_SIZE) {
+ en_err(priv, "Failed to allocate enough rx buffers\n");
+ return -ENOMEM;
+ } else {
+ new_size = rounddown_pow_of_two(ring->actual_size);
+ en_warn(priv, "Only %d buffers allocated reducing ring size to %d\n",
+ ring->actual_size, new_size);
+ goto reduce_rings;
+ }
+ }
+ ring->actual_size++;
+ ring->prod++;
+ }
+ }
+ return 0;
+
+reduce_rings:
+ for (ring_ind = 0; ring_ind < priv->rx_ring_num; ring_ind++) {
+ ring = priv->rx_ring[ring_ind];
+ while (ring->actual_size > new_size) {
+ ring->actual_size--;
+ ring->prod--;
+ mlx4_en_free_rx_desc(priv, ring, ring->actual_size);
+ }
+ }
+
+ return 0;
+}
+
+static void mlx4_en_free_rx_buf(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_ring *ring)
+{
+ int index;
+
+ en_dbg(DRV, priv, "Freeing Rx buf - cons:%d prod:%d\n",
+ ring->cons, ring->prod);
+
+ /* Unmap and free Rx buffers */
+ while (!mlx4_en_is_ring_empty(ring)) {
+ index = ring->cons & ring->size_mask;
+ en_dbg(DRV, priv, "Processing descriptor:%d\n", index);
+ mlx4_en_free_rx_desc(priv, ring, index);
+ ++ring->cons;
+ }
+}
+
+void mlx4_en_set_num_rx_rings(struct mlx4_en_dev *mdev)
+{
+ int i;
+ int num_of_eqs;
+ int num_rx_rings;
+ struct mlx4_dev *dev = mdev->dev;
+
+ mlx4_foreach_port(i, dev, MLX4_PORT_TYPE_ETH) {
+ num_of_eqs = max_t(int, MIN_RX_RINGS,
+ min_t(int,
+ mlx4_get_eqs_per_port(mdev->dev, i),
+ DEF_RX_RINGS));
+
+ num_rx_rings = mlx4_low_memory_profile() ? MIN_RX_RINGS :
+ num_of_eqs;
+ mdev->profile.prof[i].rx_ring_num =
+ rounddown_pow_of_two(num_rx_rings);
+ }
+}
+
+#ifdef CONFIG_COMPAT_LRO_ENABLED
+static int mlx4_en_get_frag_hdr(struct skb_frag_struct *frags, void **mac_hdr,
+ void **ip_hdr, void **tcpudp_hdr,
+ u64 *hdr_flags, void *priv)
+{
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(3,1,0))
+ *mac_hdr = page_address(frags->page) + frags->page_offset;
+#else
+ *mac_hdr = page_address(skb_frag_page(frags)) + frags->page_offset;
+#endif
+ *ip_hdr = *mac_hdr + ETH_HLEN;
+ *tcpudp_hdr = (struct tcphdr *)(*ip_hdr + sizeof(struct iphdr));
+ *hdr_flags = LRO_IPV4 | LRO_TCP;
+
+ return 0;
+}
+
+static void mlx4_en_lro_init(struct mlx4_en_rx_ring *ring,
+ struct mlx4_en_priv *priv)
+{
+ /*
+ * Commit 9d4dde5215779 reduced the SKB frags array from 18 to 17
+ * entries. The lro_receive_frags routine adds priv->num_frags to this
+ * array and only then checks that the total number of frags has not
+ * exceeded max_aggr, so max_aggr must be aligned to a multiple of
+ * priv->num_frags for LRO to avoid overflow.
+ */
+ ring->lro.lro_mgr.max_aggr =
+ MAX_SKB_FRAGS - (MAX_SKB_FRAGS % priv->num_frags);
+
+ ring->lro.lro_mgr.max_desc = MLX4_EN_LRO_MAX_DESC;
+ ring->lro.lro_mgr.lro_arr = ring->lro.lro_desc;
+ ring->lro.lro_mgr.get_frag_header = mlx4_en_get_frag_hdr;
+ ring->lro.lro_mgr.features = LRO_F_NAPI;
+ ring->lro.lro_mgr.frag_align_pad = NET_IP_ALIGN;
+ ring->lro.lro_mgr.dev = priv->dev;
+ ring->lro.lro_mgr.ip_summed = CHECKSUM_UNNECESSARY;
+ ring->lro.lro_mgr.ip_summed_aggr = CHECKSUM_UNNECESSARY;
+}
+#endif
+
+int mlx4_en_create_rx_ring(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_ring *ring,
+ u32 size, u16 stride, int node)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_en_rx_ring *ring;
+ int err = -ENOMEM;
+ int tmp;
+
+ ring->prod = 0;
+ ring->cons = 0;
+ ring->size = size;
+ ring->size_mask = size - 1;
+ ring->stride = stride;
+ ring->log_stride = ffs(ring->stride) - 1;
+ ring->buf_size = ring->size * ring->stride + TXBB_SIZE;
+
+ tmp = size * roundup_pow_of_two(MLX4_EN_MAX_RX_FRAGS *
+ sizeof(struct mlx4_en_rx_alloc));
+ ring->rx_info = vmalloc_node(tmp, node);
+ if (!ring->rx_info) {
+ ring->rx_info = vmalloc(tmp);
+ if (!ring->rx_info) {
+ err = -ENOMEM;
+ goto err_ring;
+ }
+ }
+
+ en_dbg(DRV, priv, "Allocated rx_info ring at addr:%p size:%d\n",
+ ring->rx_info, tmp);
+
+ /* Allocate HW buffers on provided NUMA node */
+ set_dev_node(&mdev->dev->persist->pdev->dev, node);
+ err = mlx4_alloc_hwq_res(mdev->dev, &ring->wqres,
+ ring->buf_size, 2 * PAGE_SIZE);
+ set_dev_node(&mdev->dev->persist->pdev->dev, mdev->dev->numa_node);
+ if (err)
+ goto err_info;
+
+ err = mlx4_en_map_buffer(&ring->wqres.buf);
+ if (err) {
+ en_err(priv, "Failed to map RX buffer\n");
+ goto err_hwq;
+ }
+ ring->buf = ring->wqres.buf.direct.buf;
+
+ ring->hwtstamp_rx_filter = priv->hwtstamp_config.rx_filter;
+
+ *pring = ring;
+ return 0;
+
+err_hwq:
+ mlx4_free_hwq_res(mdev->dev, &ring->wqres, ring->buf_size);
+err_info:
+ vfree(ring->rx_info);
+ ring->rx_info = NULL;
+err_ring:
+ kfree(ring);
+ *pring = NULL;
+
+ return err;
+}
+
+int mlx4_en_activate_rx_rings(struct mlx4_en_priv *priv)
+{
+ struct mlx4_en_rx_ring *ring;
+ int i;
+ int ring_ind;
+ int err;
+ int stride = (priv->prof->inline_scatter_thold >= MIN_INLINE_SCATTER) ?
+ priv->stride :
+ roundup_pow_of_two(sizeof(struct mlx4_en_rx_desc) +
+ DS_SIZE * priv->num_frags);
+
+ for (ring_ind = 0; ring_ind < priv->rx_ring_num; ring_ind++) {
+ ring = priv->rx_ring[ring_ind];
+
+ ring->prod = 0;
+ ring->cons = 0;
+ ring->actual_size = 0;
+ ring->cqn = priv->rx_cq[ring_ind]->mcq.cqn;
+
+ ring->stride = stride;
+ if (ring->stride <= TXBB_SIZE)
+ ring->buf += TXBB_SIZE;
+
+ ring->log_stride = ffs(ring->stride) - 1;
+ ring->buf_size = ring->size * ring->stride;
+
+ memset(ring->buf, 0, ring->buf_size);
+ mlx4_en_update_rx_prod_db(ring);
+
+ /* Initialize all descriptors */
+ for (i = 0; i < ring->size; i++)
+ mlx4_en_init_rx_desc(priv, ring, i);
+
+ /* Initialize page allocators */
+ err = mlx4_en_init_allocator(priv, ring);
+ if (err) {
+ en_err(priv, "Failed initializing ring allocator\n");
+ if (ring->stride <= TXBB_SIZE)
+ ring->buf -= TXBB_SIZE;
+ ring_ind--;
+ goto err_allocator;
+ }
+#ifdef CONFIG_COMPAT_LRO_ENABLED
+ mlx4_en_lro_init(ring, priv);
+#endif
+ }
+ err = mlx4_en_fill_rx_buffers(priv);
+ if (err)
+ goto err_buffers;
+
+ for (ring_ind = 0; ring_ind < priv->rx_ring_num; ring_ind++) {
+ ring = priv->rx_ring[ring_ind];
+
+ ring->size_mask = ring->actual_size - 1;
+ mlx4_en_update_rx_prod_db(ring);
+ }
+
+ return 0;
+
+err_buffers:
+ for (ring_ind = 0; ring_ind < priv->rx_ring_num; ring_ind++)
+ mlx4_en_free_rx_buf(priv, priv->rx_ring[ring_ind]);
+
+ ring_ind = priv->rx_ring_num - 1;
+err_allocator:
+ while (ring_ind >= 0) {
+ if (priv->rx_ring[ring_ind]->stride <= TXBB_SIZE)
+ priv->rx_ring[ring_ind]->buf -= TXBB_SIZE;
+ mlx4_en_destroy_allocator(priv, priv->rx_ring[ring_ind]);
+ ring_ind--;
+ }
+ return err;
+}
+
+/* We recover from out of memory by scheduling our napi poll
+ * function (mlx4_en_process_cq), which tries to allocate
+ * all missing RX buffers (call to mlx4_en_refill_rx_buffers).
+ */
+void mlx4_en_recover_from_oom(struct mlx4_en_priv *priv)
+{
+ int ring;
+
+ if (!priv->port_up)
+ return;
+
+ for (ring = 0; ring < priv->rx_ring_num; ring++) {
+ if (mlx4_en_is_ring_empty(priv->rx_ring[ring]))
+ napi_reschedule(&priv->rx_cq[ring]->napi);
+ }
+}
+
+void mlx4_en_destroy_rx_ring(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_ring **pring,
+ u32 size, u16 stride)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_en_rx_ring *ring = *pring;
+
+ mlx4_en_unmap_buffer(&ring->wqres.buf);
+ mlx4_free_hwq_res(mdev->dev, &ring->wqres, size * stride + TXBB_SIZE);
+ vfree(ring->rx_info);
+ ring->rx_info = NULL;
+ kfree(ring);
+ *pring = NULL;
+#ifdef CONFIG_RFS_ACCEL
+#ifdef HAVE_NDO_RX_FLOW_STEER
+ mlx4_en_cleanup_filters(priv);
+#endif
+#endif
+}
+
+void mlx4_en_deactivate_rx_ring(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_ring *ring)
+{
+ mlx4_en_free_rx_buf(priv, ring);
+ if (ring->stride <= TXBB_SIZE)
+ ring->buf -= TXBB_SIZE;
+ mlx4_en_destroy_allocator(priv, ring);
+}
+
+static int mlx4_en_complete_rx_desc(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_desc *rx_desc,
+ struct mlx4_en_rx_alloc *frags,
+ struct skb_frag_struct *skb_frags_rx,
+ int length,
+ int *truesize)
+{
+ struct mlx4_en_frag_info *frag_info;
+ int nr;
+ dma_addr_t dma;
+
+ /* Collect used fragments while replacing them in the HW descriptors */
+ for (nr = 0; nr < priv->num_frags; nr++) {
+ frag_info = &priv->frag_info[nr];
+ if (length <= frag_info->frag_prefix_size)
+ break;
+ if (!frags[nr].page)
+ goto fail;
+
+ dma = be64_to_cpu(rx_desc->data[nr].addr);
+ dma_sync_single_for_cpu(priv->ddev, dma, frag_info->frag_size,
+ DMA_FROM_DEVICE);
+
+ /* Save page reference in skb */
+ __skb_frag_set_page(&skb_frags_rx[nr], frags[nr].page);
+ skb_frag_size_set(&skb_frags_rx[nr], frag_info->frag_size);
+ skb_frags_rx[nr].page_offset = frags[nr].page_offset;
+ *truesize += frag_info->frag_stride;
+ frags[nr].page = NULL;
+ }
+ /* Adjust size of last fragment to match actual length */
+ if (nr > 0)
+ skb_frag_size_set(&skb_frags_rx[nr - 1],
+ length - priv->frag_info[nr - 1].frag_prefix_size);
+ return nr;
+
+fail:
+ while (nr > 0) {
+ nr--;
+ __skb_frag_unref(&skb_frags_rx[nr]);
+ }
+ return 0;
+}
+
+
+static struct sk_buff *mlx4_en_rx_skb(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_desc *rx_desc,
+ struct mlx4_en_rx_alloc *frags,
+ unsigned int length)
+{
+ struct sk_buff *skb;
+ void *va;
+ int used_frags;
+ dma_addr_t dma;
+
+ skb = netdev_alloc_skb(priv->dev, SMALL_PACKET_SIZE + NET_IP_ALIGN);
+ if (!skb) {
+ en_dbg(RX_ERR, priv, "Failed allocating skb\n");
+ return NULL;
+ }
+ skb_reserve(skb, NET_IP_ALIGN);
+ skb->len = length;
+
+ /* Get a pointer to the first fragment so we can copy the headers
+ * into the linear part of the skb */
+ va = page_address(frags[0].page) + frags[0].page_offset;
+
+ if (length <= SMALL_PACKET_SIZE) {
+ /* We are copying all relevant data to the skb - temporarily
+ * sync buffers for the copy */
+ dma = be64_to_cpu(rx_desc->data[0].addr);
+ dma_sync_single_for_cpu(priv->ddev, dma, length,
+ DMA_FROM_DEVICE);
+ skb_copy_to_linear_data(skb, va, length);
+ skb->tail += length;
+ } else {
+#ifdef HAVE_ETH_GET_HEADLEN
+ unsigned int pull_len;
+#endif
+
+ /* Move relevant fragments to skb */
+ used_frags = mlx4_en_complete_rx_desc(priv, rx_desc, frags,
+ skb_shinfo(skb)->frags,
+ length, &skb->truesize);
+ if (unlikely(!used_frags)) {
+ kfree_skb(skb);
+ return NULL;
+ }
+ skb_shinfo(skb)->nr_frags = used_frags;
+
+#ifdef HAVE_ETH_GET_HEADLEN
+ pull_len = eth_get_headlen(va, SMALL_PACKET_SIZE);
+ /* Copy headers into the skb linear buffer */
+ memcpy(skb->data, va, pull_len);
+ skb->tail += pull_len;
+
+ /* Skip headers in first fragment */
+ skb_shinfo(skb)->frags[0].page_offset += pull_len;
+
+ /* Adjust size of first fragment */
+ skb_frag_size_sub(&skb_shinfo(skb)->frags[0], pull_len);
+ skb->data_len = length - pull_len;
+#else
+ memcpy(skb->data, va, HEADER_COPY_SIZE);
+ skb->tail += HEADER_COPY_SIZE;
+
+ /* Skip headers in first fragment */
+ skb_shinfo(skb)->frags[0].page_offset += HEADER_COPY_SIZE;
+
+ /* Adjust size of first fragment */
+ skb_frag_size_sub(&skb_shinfo(skb)->frags[0], HEADER_COPY_SIZE);
+ skb->data_len = length - HEADER_COPY_SIZE;
+#endif
+ }
+ return skb;
+}
+
+static void validate_loopback(struct mlx4_en_priv *priv, struct sk_buff *skb)
+{
+ int i;
+ int offset = ETH_HLEN;
+
+ for (i = 0; i < MLX4_LOOPBACK_TEST_PAYLOAD; i++, offset++) {
+ if (*(skb->data + offset) != (unsigned char) (i & 0xff))
+ goto out_loopback;
+ }
+ /* Loopback found */
+ priv->loopback_ok = 1;
+
+out_loopback:
+ dev_kfree_skb_any(skb);
+}
+
+static void mlx4_en_refill_rx_buffers(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_ring *ring)
+{
+ int index = ring->prod & ring->size_mask;
+
+ while ((u32) (ring->prod - ring->cons) < ring->actual_size) {
+ if (mlx4_en_prepare_rx_desc(priv, ring, index,
+ GFP_ATOMIC | __GFP_COLD))
+ break;
+ ring->prod++;
+ index = ring->prod & ring->size_mask;
+ }
+}
+
+/* When hardware doesn't strip the vlan, we need to calculate the checksum
+ * over it and add it to the hardware's checksum calculation
+ */
+static inline __wsum get_fixed_vlan_csum(__wsum hw_checksum,
+ struct vlan_hdr *vlanh)
+{
+ return csum_add(hw_checksum, *(__wsum *)vlanh);
+}
+
+/* The stack expects a checksum that does not include the pseudo header,
+ * but the HW adds it. To compensate, subtract the pseudo header checksum
+ * from the checksum value provided by the HW.
+ */
+static void get_fixed_ipv4_csum(__wsum hw_checksum, struct sk_buff *skb,
+ struct iphdr *iph)
+{
+ __u16 length_for_csum = 0;
+ __wsum csum_pseudo_header = 0;
+
+ length_for_csum = (be16_to_cpu(iph->tot_len) - (iph->ihl << 2));
+ csum_pseudo_header = csum_tcpudp_nofold(iph->saddr, iph->daddr,
+ length_for_csum, iph->protocol, 0);
+ skb->csum = csum_sub(hw_checksum, csum_pseudo_header);
+}
+
+#if IS_ENABLED(CONFIG_IPV6)
+/* In IPv6 packets, besides subtracting the pseudo header checksum,
+ * we also compute/add the IP header checksum which
+ * is not added by the HW.
+ */
+static int get_fixed_ipv6_csum(__wsum hw_checksum, struct sk_buff *skb,
+ struct ipv6hdr *ipv6h)
+{
+ __wsum csum_pseudo_hdr = 0;
+
+ if (ipv6h->nexthdr == IPPROTO_FRAGMENT || ipv6h->nexthdr == IPPROTO_HOPOPTS)
+ return -1;
+ hw_checksum = csum_add(hw_checksum, (__force __wsum)(ipv6h->nexthdr << 8));
+
+ csum_pseudo_hdr = csum_partial(&ipv6h->saddr,
+ sizeof(ipv6h->saddr) + sizeof(ipv6h->daddr), 0);
+ csum_pseudo_hdr = csum_add(csum_pseudo_hdr, (__force __wsum)ipv6h->payload_len);
+ csum_pseudo_hdr = csum_add(csum_pseudo_hdr, (__force __wsum)ntohs(ipv6h->nexthdr));
+
+ skb->csum = csum_sub(hw_checksum, csum_pseudo_hdr);
+ skb->csum = csum_add(skb->csum, csum_partial(ipv6h, sizeof(struct ipv6hdr), 0));
+ return 0;
+}
+#endif
+static int check_csum(struct mlx4_cqe *cqe, struct sk_buff *skb, void *va,
+ netdev_features_t dev_features)
+{
+ __wsum hw_checksum = 0;
+
+ void *hdr = (u8 *)va + sizeof(struct ethhdr);
+
+ hw_checksum = csum_unfold((__force __sum16)cqe->checksum);
+
+ if (((struct ethhdr *)va)->h_proto == htons(ETH_P_8021Q) &&
+ !(dev_features & NETIF_F_HW_VLAN_CTAG_RX)) {
+ /* next protocol non IPv4 or IPv6 */
+ if (((struct vlan_hdr *)hdr)->h_vlan_encapsulated_proto
+ != htons(ETH_P_IP) &&
+ ((struct vlan_hdr *)hdr)->h_vlan_encapsulated_proto
+ != htons(ETH_P_IPV6))
+ return -1;
+ hw_checksum = get_fixed_vlan_csum(hw_checksum, hdr);
+ hdr += sizeof(struct vlan_hdr);
+ }
+
+ if (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IPV4))
+ get_fixed_ipv4_csum(hw_checksum, skb, hdr);
+#if IS_ENABLED(CONFIG_IPV6)
+ else if (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IPV6))
+ if (get_fixed_ipv6_csum(hw_checksum, skb, hdr))
+ return -1;
+#endif
+ return 0;
+}
+
+static void mlx4_en_inline_scatter(struct mlx4_en_rx_ring *ring,
+ struct mlx4_en_rx_alloc *frags,
+ struct mlx4_en_rx_desc *rx_desc,
+ struct mlx4_en_priv *priv,
+ unsigned int length)
+{
+ int frag_size;
+ dma_addr_t dma;
+ void *va;
+
+ /* fill frag */
+ frag_size = priv->frag_info[0].frag_size;
+ va = page_address(frags[0].page) + frags[0].page_offset;
+ dma = frags[0].dma + frags[0].page_offset;
+ dma_sync_single_for_cpu(priv->ddev, dma, frag_size,
+ DMA_FROM_DEVICE);
+ memcpy(va, rx_desc, length);
+
+ /* prepare a valid rx_desc */
+ memset(rx_desc, 0, ring->stride);
+ rx_desc->data[0].byte_count = cpu_to_be32(frag_size);
+ rx_desc->data[0].lkey = cpu_to_be32(priv->mdev->mr.key);
+ rx_desc->data[0].addr = cpu_to_be64(dma);
+
+ rx_desc->data[1].lkey = cpu_to_be32(MLX4_EN_MEMTYPE_PAD);
+ rx_desc->data[1].byte_count = 0;
+}
+
+int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int budget)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_cqe *cqe;
+ struct mlx4_en_rx_ring *ring = priv->rx_ring[cq->ring];
+ struct mlx4_en_rx_alloc *frags;
+ struct mlx4_en_rx_desc *rx_desc;
+ struct sk_buff *skb;
+ int index;
+ int nr;
+ unsigned int length;
+ int polled = 0;
+ int ip_summed;
+ int factor = priv->cqe_factor;
+ u64 timestamp;
+#ifdef HAVE_NETDEV_HW_ENC_FEATURES
+ bool l2_tunnel;
+#endif
+
+ if (!priv->port_up)
+ return 0;
+
+ if (budget <= 0)
+ return polled;
+
+ /* We assume a 1:1 mapping between CQEs and Rx descriptors, so Rx
+ * descriptor offset can be deduced from the CQE index instead of
+ * reading 'cqe->index' */
+ index = cq->mcq.cons_index & ring->size_mask;
+ cqe = mlx4_en_get_cqe(cq->buf, index, priv->cqe_size) + factor;
+
+ /* Process all completed CQEs */
+ while (XNOR(cqe->owner_sr_opcode & MLX4_CQE_OWNER_MASK,
+ cq->mcq.cons_index & cq->size)) {
+
+ frags = ring->rx_info + (index << priv->log_rx_info);
+ rx_desc = ring->buf + (index << ring->log_stride);
+
+ /*
+ * make sure we read the CQE after we read the ownership bit
+ */
+ rmb();
+
+ /* Drop packet on bad receive or bad checksum */
+ if (unlikely((cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK) ==
+ MLX4_CQE_OPCODE_ERROR)) {
+ en_err(priv, "CQE completed in error - vendor syndrome:%d syndrome:%d\n",
+ ((struct mlx4_err_cqe *)cqe)->vendor_err_syndrome,
+ ((struct mlx4_err_cqe *)cqe)->syndrome);
+ goto next;
+ }
+ if (unlikely(cqe->badfcs_enc & MLX4_CQE_BAD_FCS)) {
+ en_dbg(RX_ERR, priv, "Accepted frame with bad FCS\n");
+ goto next;
+ }
+
+ length = be32_to_cpu(cqe->byte_cnt);
+ length -= ring->fcs_del;
+
+ if (cqe->owner_sr_opcode & MLX4_CQE_IS_RECV_MASK)
+ mlx4_en_inline_scatter(ring, frags,
+ rx_desc, priv, length);
+
+ /* Check if we need to drop the packet if SRIOV is not enabled
+ * and not performing the selftest or flb disabled
+ */
+ if (priv->flags & MLX4_EN_FLAG_RX_FILTER_NEEDED) {
+ struct ethhdr *ethh;
+ dma_addr_t dma;
+ /* Get a pointer to the first fragment, since there is no
+ * skb yet, and cast it to struct ethhdr
+ */
+ dma = be64_to_cpu(rx_desc->data[0].addr);
+ dma_sync_single_for_cpu(priv->ddev, dma, sizeof(*ethh),
+ DMA_FROM_DEVICE);
+ ethh = (struct ethhdr *)(page_address(frags[0].page) +
+ frags[0].page_offset);
+
+ if (is_multicast_ether_addr(ethh->h_dest)) {
+ struct mlx4_mac_entry *entry;
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(3,9,0))
+ struct hlist_node *n;
+#endif
+ struct hlist_head *bucket;
+ unsigned int mac_hash;
+
+ /* Drop the packet if the HW looped it back to us */
+ mac_hash = ethh->h_source[MLX4_EN_MAC_HASH_IDX];
+ bucket = &priv->mac_hash[mac_hash];
+ rcu_read_lock();
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(3,9,0))
+ hlist_for_each_entry_rcu(entry, n, bucket, hlist) {
+#else
+ hlist_for_each_entry_rcu(entry, bucket, hlist) {
+#endif
+ if (ether_addr_equal_64bits(entry->mac,
+ ethh->h_source)) {
+ rcu_read_unlock();
+ goto next;
+ }
+ }
+ rcu_read_unlock();
+ }
+ }
+
+ /*
+ * Packet is OK - process it.
+ */
+ ring->bytes += length;
+ ring->packets++;
+#ifdef HAVE_NETDEV_HW_ENC_FEATURES
+ l2_tunnel = (dev->hw_enc_features & NETIF_F_RXCSUM) &&
+ (cqe->vlan_my_qpn & cpu_to_be32(MLX4_CQE_L2_TUNNEL));
+#endif
+
+ if (likely(dev->features & NETIF_F_RXCSUM)) {
+ if (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_TCP |
+ MLX4_CQE_STATUS_UDP)) {
+ if ((cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IPOK)) &&
+ cqe->checksum == cpu_to_be16(0xffff)) {
+ ip_summed = CHECKSUM_UNNECESSARY;
+ ring->csum_ok++;
+#ifdef CONFIG_COMPAT_LRO_ENABLED
+ /* traffic eligible for LRO */
+ if ((dev->features & NETIF_F_LRO) &&
+ mlx4_en_can_lro(cqe->status) &&
+ (ring->hwtstamp_rx_filter ==
+ HWTSTAMP_FILTER_NONE) &&
+#ifdef HAVE_NETDEV_HW_ENC_FEATURES
+ !l2_tunnel &&
+#endif
+ !(be32_to_cpu(cqe->vlan_my_qpn) &
+ MLX4_CQE_VLAN_PRESENT_MASK)) {
+ int truesize = 0;
+ struct skb_frag_struct lro_frag[MLX4_EN_MAX_RX_FRAGS];
+
+ nr = mlx4_en_complete_rx_desc(priv, rx_desc, frags,
+ lro_frag, length, &truesize);
+
+ if (unlikely(!nr))
+ goto next;
+
+ /* Push it up the stack (LRO) */
+ lro_receive_frags(&ring->lro.lro_mgr, lro_frag,
+ length, truesize, NULL, 0);
+ goto next;
+ }
+#endif
+ } else {
+ ip_summed = CHECKSUM_NONE;
+ ring->csum_none++;
+ }
+ } else {
+ if (priv->flags & MLX4_EN_FLAG_RX_CSUM_NON_TCP_UDP &&
+ (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IPV4 |
+ MLX4_CQE_STATUS_IPV6))) {
+ ip_summed = CHECKSUM_COMPLETE;
+ ring->csum_complete++;
+ } else {
+ ip_summed = CHECKSUM_NONE;
+ ring->csum_none++;
+ }
+ }
+ } else {
+ ip_summed = CHECKSUM_NONE;
+ ring->csum_none++;
+ }
+
+ /* This packet is eligible for GRO if it is:
+ * - DIX Ethernet (type interpretation)
+ * - TCP/IP (v4)
+ * - without IP options
+ * - not an IP fragment
+ * - no LLS polling in progress
+ */
+ if ((dev->features & NETIF_F_GRO)
+#ifdef HAVE_SKB_MARK_NAPI_ID
+ && (!mlx4_en_cq_busy_polling(cq))
+#endif
+#ifdef HAVE_VLAN_GRO_RECEIVE
+ && (!(be32_to_cpu(cqe->vlan_my_qpn) &
+ MLX4_CQE_VLAN_PRESENT_MASK))
+#endif
+ ) {
+ struct sk_buff *gro_skb = napi_get_frags(&cq->napi);
+ if (!gro_skb)
+ goto next;
+
+ nr = mlx4_en_complete_rx_desc(priv,
+ rx_desc, frags, skb_shinfo(gro_skb)->frags,
+ length, &gro_skb->truesize);
+ if (!nr)
+ goto next;
+
+ if (ip_summed == CHECKSUM_COMPLETE) {
+ void *va = skb_frag_address(skb_shinfo(gro_skb)->frags);
+ if (check_csum(cqe, gro_skb, va, dev->features)) {
+ ip_summed = CHECKSUM_NONE;
+ ring->csum_none++;
+ ring->csum_complete--;
+ }
+ }
+
+ skb_shinfo(gro_skb)->nr_frags = nr;
+ gro_skb->len = length;
+ gro_skb->data_len = length;
+ gro_skb->ip_summed = ip_summed;
+
+#ifdef HAVE_NETDEV_HW_ENC_FEATURES
+ if (l2_tunnel && ip_summed == CHECKSUM_UNNECESSARY)
+#ifdef HAVE_SK_BUFF_CSUM_LEVEL
+ gro_skb->csum_level = 1;
+#else
+ gro_skb->encapsulation = 1;
+#endif
+#endif
+
+ if ((cqe->vlan_my_qpn &
+ cpu_to_be32(MLX4_CQE_VLAN_PRESENT_MASK)) &&
+ (dev->features & NETIF_F_HW_VLAN_CTAG_RX)) {
+ u16 vid = be16_to_cpu(cqe->sl_vid);
+
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(3,10,0))
+ __vlan_hwaccel_put_tag(gro_skb, vid);
+#else
+ __vlan_hwaccel_put_tag(gro_skb, htons(ETH_P_8021Q), vid);
+#endif
+ }
+
+#ifdef HAVE_NETIF_F_RXHASH
+ if (dev->features & NETIF_F_RXHASH)
+#ifdef HAVE_SKB_SET_HASH
+ skb_set_hash(gro_skb,
+ be32_to_cpu(cqe->immed_rss_invalid),
+ PKT_HASH_TYPE_L3);
+#else
+ gro_skb->rxhash = be32_to_cpu(cqe->immed_rss_invalid);
+#endif
+#endif
+
+ skb_record_rx_queue(gro_skb, cq->ring);
+#ifdef HAVE_SKB_MARK_NAPI_ID
+ skb_mark_napi_id(gro_skb, &cq->napi);
+#endif
+
+ if (ring->hwtstamp_rx_filter == HWTSTAMP_FILTER_ALL) {
+ timestamp = mlx4_en_get_cqe_ts(cqe);
+ mlx4_en_fill_hwtstamps(mdev,
+ skb_hwtstamps(gro_skb),
+ timestamp);
+ }
+
+ napi_gro_frags(&cq->napi);
+ goto next;
+ }
+
+ /* GRO not possible, complete processing here */
+ skb = mlx4_en_rx_skb(priv, rx_desc, frags, length);
+ if (!skb) {
+ priv->stats.rx_dropped++;
+ goto next;
+ }
+
+ if (unlikely(priv->validate_loopback)) {
+ validate_loopback(priv, skb);
+ goto next;
+ }
+
+ if (ip_summed == CHECKSUM_COMPLETE) {
+ if (check_csum(cqe, skb, skb->data, dev->features)) {
+ ip_summed = CHECKSUM_NONE;
+ ring->csum_complete--;
+ ring->csum_none++;
+ }
+ }
+
+ skb->ip_summed = ip_summed;
+ skb->protocol = eth_type_trans(skb, dev);
+ skb_record_rx_queue(skb, cq->ring);
+
+#ifdef HAVE_NETDEV_HW_ENC_FEATURES
+#ifdef HAVE_SK_BUFF_CSUM_LEVEL
+ if (l2_tunnel && ip_summed == CHECKSUM_UNNECESSARY)
+ skb->csum_level = 1;
+#else
+ if (l2_tunnel)
+ skb->encapsulation = 1;
+#endif
+#endif
+
+#ifdef HAVE_NETIF_F_RXHASH
+ if (dev->features & NETIF_F_RXHASH)
+#ifdef HAVE_SKB_SET_HASH
+ skb_set_hash(skb,
+ be32_to_cpu(cqe->immed_rss_invalid),
+ PKT_HASH_TYPE_L3);
+#else
+ skb->rxhash = be32_to_cpu(cqe->immed_rss_invalid);
+#endif
+#endif
+
+ if ((be32_to_cpu(cqe->vlan_my_qpn) &
+ MLX4_CQE_VLAN_PRESENT_MASK) &&
+ (dev->features & NETIF_F_HW_VLAN_CTAG_RX)) {
+#ifdef HAVE_VLAN_GRO_RECEIVE
+ if (priv->vlgrp) {
+ vlan_gro_receive(&cq->napi, priv->vlgrp,
+ be16_to_cpu(cqe->sl_vid),
+ skb);
+ goto next;
+ }
+#endif
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(3,10,0))
+ __vlan_hwaccel_put_tag(skb, be16_to_cpu(cqe->sl_vid));
+#else
+ __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), be16_to_cpu(cqe->sl_vid));
+#endif
+ }
+
+ if (ring->hwtstamp_rx_filter == HWTSTAMP_FILTER_ALL) {
+ timestamp = mlx4_en_get_cqe_ts(cqe);
+ mlx4_en_fill_hwtstamps(mdev, skb_hwtstamps(skb),
+ timestamp);
+ }
+
+#ifdef HAVE_SKB_MARK_NAPI_ID
+ skb_mark_napi_id(skb, &cq->napi);
+#endif
+
+ if (!mlx4_en_cq_busy_polling(cq))
+ napi_gro_receive(&cq->napi, skb);
+ else
+ netif_receive_skb(skb);
+
+next:
+ for (nr = 0; nr < priv->num_frags; nr++)
+ mlx4_en_free_frag(priv, frags, nr);
+
+ ++cq->mcq.cons_index;
+ index = (cq->mcq.cons_index) & ring->size_mask;
+ cqe = mlx4_en_get_cqe(cq->buf, index, priv->cqe_size) + factor;
+ if (++polled == budget)
+ goto out;
+ }
+
+out:
+ AVG_PERF_COUNTER(priv->pstats.rx_coal_avg, polled);
+#ifdef CONFIG_COMPAT_LRO_ENABLED
+ if (dev->features & NETIF_F_LRO)
+ lro_flush_all(&priv->rx_ring[cq->ring]->lro.lro_mgr);
+#endif
+ mlx4_cq_set_ci(&cq->mcq);
+ wmb(); /* ensure HW sees CQ consumer before we post new buffers */
+ ring->cons = cq->mcq.cons_index;
+ mlx4_en_refill_rx_buffers(priv, ring);
+ mlx4_en_update_rx_prod_db(ring);
+ return polled;
+}
+
+
+void mlx4_en_rx_irq(struct mlx4_cq *mcq)
+{
+ struct mlx4_en_cq *cq = container_of(mcq, struct mlx4_en_cq, mcq);
+ struct mlx4_en_priv *priv = netdev_priv(cq->dev);
+
+ if (likely(priv->port_up))
+#ifdef HAVE_NAPI_SCHEDULE_IRQOFF
+ napi_schedule_irqoff(&cq->napi);
+#else
+ napi_schedule(&cq->napi);
+#endif
+ else
+ mlx4_en_arm_cq(priv, cq);
+}
+
+/* Rx CQ polling - called by NAPI */
+int mlx4_en_poll_rx_cq(struct napi_struct *napi, int budget)
+{
+ struct mlx4_en_cq *cq = container_of(napi, struct mlx4_en_cq, napi);
+ struct net_device *dev = cq->dev;
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ int done;
+
+#ifdef HAVE_SKB_MARK_NAPI_ID
+ if (!mlx4_en_cq_lock_napi(cq))
+ return budget;
+#endif
+
+ done = mlx4_en_process_rx_cq(dev, cq, budget);
+
+#ifdef HAVE_SKB_MARK_NAPI_ID
+ mlx4_en_cq_unlock_napi(cq);
+#endif
+
+ /* If we used up all the quota - we're probably not done yet... */
+#if !(defined(HAVE_IRQ_DESC_GET_IRQ_DATA) && defined(HAVE_IRQ_TO_DESC_EXPORTED))
+ cq->tot_rx += done;
+#endif
+ if (done == budget) {
+#if defined(HAVE_IRQ_DESC_GET_IRQ_DATA) && defined(HAVE_IRQ_TO_DESC_EXPORTED)
+ int cpu_curr;
+ const struct cpumask *aff;
+#endif
+
+ INC_PERF_COUNTER(priv->pstats.napi_quota);
+
+#if defined(HAVE_IRQ_DESC_GET_IRQ_DATA) && defined(HAVE_IRQ_TO_DESC_EXPORTED)
+ cpu_curr = smp_processor_id();
+ aff = irq_desc_get_irq_data(cq->irq_desc)->affinity;
+
+ if (likely(cpumask_test_cpu(cpu_curr, aff)))
+ return budget;
+
+ /* The current CPU is not in the IRQ affinity mask, so the
+ * affinity has probably changed. Stop this NAPI poll and
+ * restart it on the right CPU.
+ */
+ done = 0;
+#else
+ if (cq->tot_rx < MLX4_EN_MIN_RX_ARM)
+ return budget;
+
+ cq->tot_rx = 0;
+ done = 0;
+ } else {
+ cq->tot_rx = 0;
+#endif
+ }
+ /* Done for now */
+#ifdef HAVE_NAPI_COMPLETE_DONE
+ napi_complete_done(napi, done);
+#else
+ napi_complete(napi);
+#endif
+ mlx4_en_arm_cq(priv, cq);
+ return done;
+}
+
+static const int frag_sizes[] = {
+ FRAG_SZ0,
+ FRAG_SZ1,
+ FRAG_SZ2,
+ FRAG_SZ3
+};
+
+void mlx4_en_calc_rx_buf(struct net_device *dev)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ int eff_mtu = dev->mtu + ETH_HLEN + VLAN_HLEN;
+ int buf_size = 0;
+ int i = 0;
+
+ while (buf_size < eff_mtu) {
+ priv->frag_info[i].frag_size =
+ (eff_mtu > buf_size + frag_sizes[i]) ?
+ frag_sizes[i] : eff_mtu - buf_size;
+ priv->frag_info[i].frag_prefix_size = buf_size;
+ priv->frag_info[i].frag_stride =
+ ALIGN(priv->frag_info[i].frag_size,
+ SMP_CACHE_BYTES);
+ buf_size += priv->frag_info[i].frag_size;
+ i++;
+ }
+
+ priv->num_frags = i;
+ priv->rx_skb_size = eff_mtu;
+ priv->log_rx_info = ROUNDUP_LOG2(i * sizeof(struct mlx4_en_rx_alloc));
+
+ en_dbg(DRV, priv, "Rx buffer scatter-list (effective-mtu:%d num_frags:%d):\n",
+ eff_mtu, priv->num_frags);
+ for (i = 0; i < priv->num_frags; i++) {
+ en_dbg(DRV, priv,
+ " frag:%d - size:%d prefix:%d stride:%d\n",
+ i,
+ priv->frag_info[i].frag_size,
+ priv->frag_info[i].frag_prefix_size,
+ priv->frag_info[i].frag_stride);
+ }
+}
+
+/* RSS related functions */
+
+static int mlx4_en_config_rss_qp(struct mlx4_en_priv *priv, int qpn,
+ struct mlx4_en_rx_ring *ring,
+ enum mlx4_qp_state *state,
+ struct mlx4_qp *qp)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_qp_context *context;
+ int err = 0;
+
+ context = kmalloc(sizeof(*context), GFP_KERNEL);
+ if (!context)
+ return -ENOMEM;
+
+ err = mlx4_qp_alloc(mdev->dev, qpn, qp, GFP_KERNEL);
+ if (err) {
+ en_err(priv, "Failed to allocate qp #%x\n", qpn);
+ goto out;
+ }
+ qp->event = mlx4_en_sqp_event;
+
+ memset(context, 0, sizeof(*context));
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ mlx4_en_fill_qp_context(priv, ring->actual_size, ring->stride, 0, 0,
+ qpn, ring->cqn, -1, context);
+#else
+ mlx4_en_fill_qp_context(priv, ring->actual_size, ring->stride, 0, 0,
+ qpn, ring->cqn, context);
+#endif
+ context->db_rec_addr = cpu_to_be64(ring->wqres.db.dma);
+
+ /* Cancel FCS removal if FW allows */
+ if (mdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_FCS_KEEP) {
+ context->param3 |= cpu_to_be32(1 << 29);
+#ifdef HAVE_NETIF_F_RXFCS
+ if (priv->dev->features & NETIF_F_RXFCS)
+#else
+ if (priv->pflags & MLX4_EN_PRIV_FLAGS_RXFCS)
+#endif
+ ring->fcs_del = 0;
+ else
+ ring->fcs_del = ETH_FCS_LEN;
+ } else
+ ring->fcs_del = 0;
+
+ err = mlx4_qp_to_ready(mdev->dev, &ring->wqres.mtt, context, qp, state);
+ if (err) {
+ mlx4_qp_remove(mdev->dev, qp);
+ mlx4_qp_free(mdev->dev, qp);
+ }
+ mlx4_en_update_rx_prod_db(ring);
+out:
+ kfree(context);
+ return err;
+}
+
+int mlx4_en_create_drop_qp(struct mlx4_en_priv *priv)
+{
+ int err;
+ u32 qpn;
+
+ err = mlx4_qp_reserve_range(priv->mdev->dev, 1, 1, &qpn,
+ MLX4_RESERVE_A0_QP);
+ if (err) {
+ en_err(priv, "Failed reserving drop qpn\n");
+ return err;
+ }
+ err = mlx4_qp_alloc(priv->mdev->dev, qpn, &priv->drop_qp, GFP_KERNEL);
+ if (err) {
+ en_err(priv, "Failed allocating drop qp\n");
+ mlx4_qp_release_range(priv->mdev->dev, qpn, 1);
+ return err;
+ }
+
+ return 0;
+}
+
+void mlx4_en_destroy_drop_qp(struct mlx4_en_priv *priv)
+{
+ u32 qpn;
+
+ qpn = priv->drop_qp.qpn;
+ mlx4_qp_remove(priv->mdev->dev, &priv->drop_qp);
+ mlx4_qp_free(priv->mdev->dev, &priv->drop_qp);
+ mlx4_qp_release_range(priv->mdev->dev, qpn, 1);
+}
+
+/* Allocate RX QPs and configure them according to the RSS map */
+int mlx4_en_config_rss_steer(struct mlx4_en_priv *priv)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_en_rss_map *rss_map = &priv->rss_map;
+ struct mlx4_qp_context context;
+ struct mlx4_rss_context *rss_context;
+ int rss_rings;
+ void *ptr;
+ u8 rss_mask = (MLX4_RSS_IPV4 | MLX4_RSS_TCP_IPV4 | MLX4_RSS_IPV6 |
+ MLX4_RSS_TCP_IPV6);
+ int i, qpn;
+ int err = 0;
+ int good_qps = 0;
+#ifndef HAVE_NETDEV_RSS_KEY_FILL
+ static const u32 rsskey[MLX4_EN_RSS_KEY_SIZE] = { 0xD181C62C, 0xF7F4DB5B, 0x1983A2FC,
+ 0x943E1ADB, 0xD9389E6B, 0xD1039C2C, 0xA74499AD,
+ 0x593D56D9, 0xF3253C06, 0x2ADC1FFC};
+#endif
+
+ en_dbg(DRV, priv, "Configuring rss steering\n");
+ err = mlx4_qp_reserve_range(mdev->dev, priv->rx_ring_num,
+ priv->rx_ring_num,
+ &rss_map->base_qpn, 0);
+ if (err) {
+ en_err(priv, "Failed reserving %d qps\n", priv->rx_ring_num);
+ return err;
+ }
+
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ qpn = rss_map->base_qpn + i;
+ err = mlx4_en_config_rss_qp(priv, qpn, priv->rx_ring[i],
+ &rss_map->state[i],
+ &rss_map->qps[i]);
+ if (err)
+ goto rss_err;
+
+ ++good_qps;
+ }
+
+ /* Configure RSS indirection qp */
+ err = mlx4_qp_alloc(mdev->dev, priv->base_qpn, &rss_map->indir_qp, GFP_KERNEL);
+ if (err) {
+ en_err(priv, "Failed to allocate RSS indirection QP\n");
+ goto rss_err;
+ }
+ rss_map->indir_qp.event = mlx4_en_sqp_event;
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ mlx4_en_fill_qp_context(priv, 0, 0, 0, 1, priv->base_qpn,
+ priv->rx_ring[0]->cqn, -1, &context);
+#else
+ mlx4_en_fill_qp_context(priv, 0, 0, 0, 1, priv->base_qpn,
+ priv->rx_ring[0]->cqn, &context);
+#endif
+
+ if (!priv->prof->rss_rings || priv->prof->rss_rings > priv->rx_ring_num)
+ rss_rings = priv->rx_ring_num;
+ else
+ rss_rings = priv->prof->rss_rings;
+
+ ptr = ((void *) &context) + offsetof(struct mlx4_qp_context, pri_path)
+ + MLX4_RSS_OFFSET_IN_QPC_PRI_PATH;
+ rss_context = ptr;
+ rss_context->base_qpn = cpu_to_be32(ilog2(rss_rings) << 24 |
+ (rss_map->base_qpn));
+ rss_context->default_qpn = cpu_to_be32(rss_map->base_qpn);
+ if (priv->mdev->profile.udp_rss) {
+ rss_mask |= MLX4_RSS_UDP_IPV4 | MLX4_RSS_UDP_IPV6;
+ rss_context->base_qpn_udp = rss_context->default_qpn;
+ }
+
+ if (mdev->dev->caps.tunnel_offload_mode == MLX4_TUNNEL_OFFLOAD_MODE_VXLAN) {
+ en_info(priv, "Setting RSS context tunnel type to RSS on inner headers\n");
+ rss_mask |= MLX4_RSS_BY_INNER_HEADERS;
+ }
+
+ rss_context->flags = rss_mask;
+ rss_context->hash_fn = MLX4_RSS_HASH_TOP;
+#ifdef HAVE_ETH_SS_RSS_HASH_FUNCS
+ if (priv->rss_hash_fn == ETH_RSS_HASH_XOR) {
+ rss_context->hash_fn = MLX4_RSS_HASH_XOR;
+ } else if (priv->rss_hash_fn == ETH_RSS_HASH_TOP) {
+ rss_context->hash_fn = MLX4_RSS_HASH_TOP;
+ memcpy(rss_context->rss_key, priv->rss_key,
+ MLX4_EN_RSS_KEY_SIZE);
+#ifdef HAVE_NETDEV_RSS_KEY_FILL
+ netdev_rss_key_fill(rss_context->rss_key,
+ MLX4_EN_RSS_KEY_SIZE);
+#else
+ for (i = 0; i < MLX4_EN_RSS_KEY_SIZE; i++)
+ rss_context->rss_key[i] = cpu_to_be32(rsskey[i]);
+#endif
+ } else {
+ en_err(priv, "Unknown RSS hash function requested\n");
+ err = -EINVAL;
+ goto indir_err;
+ }
+#else
+#ifndef HAVE_NETDEV_RSS_KEY_FILL
+ for (i = 0; i < MLX4_EN_RSS_KEY_SIZE; i++)
+ rss_context->rss_key[i] = cpu_to_be32(rsskey[i]);
+#else
+ memcpy(rss_context->rss_key, priv->rss_key, MLX4_EN_RSS_KEY_SIZE);
+#endif
+#endif
+ err = mlx4_qp_to_ready(mdev->dev, &priv->res.mtt, &context,
+ &rss_map->indir_qp, &rss_map->indir_state);
+ if (err)
+ goto indir_err;
+
+ return 0;
+
+indir_err:
+ mlx4_qp_modify(mdev->dev, NULL, rss_map->indir_state,
+ MLX4_QP_STATE_RST, NULL, 0, 0, &rss_map->indir_qp);
+ mlx4_qp_remove(mdev->dev, &rss_map->indir_qp);
+ mlx4_qp_free(mdev->dev, &rss_map->indir_qp);
+rss_err:
+ for (i = 0; i < good_qps; i++) {
+ mlx4_qp_modify(mdev->dev, NULL, rss_map->state[i],
+ MLX4_QP_STATE_RST, NULL, 0, 0, &rss_map->qps[i]);
+ mlx4_qp_remove(mdev->dev, &rss_map->qps[i]);
+ mlx4_qp_free(mdev->dev, &rss_map->qps[i]);
+ }
+ mlx4_qp_release_range(mdev->dev, rss_map->base_qpn, priv->rx_ring_num);
+ return err;
+}
+
+void mlx4_en_release_rss_steer(struct mlx4_en_priv *priv)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_en_rss_map *rss_map = &priv->rss_map;
+ int i;
+
+ mlx4_qp_modify(mdev->dev, NULL, rss_map->indir_state,
+ MLX4_QP_STATE_RST, NULL, 0, 0, &rss_map->indir_qp);
+ mlx4_qp_remove(mdev->dev, &rss_map->indir_qp);
+ mlx4_qp_free(mdev->dev, &rss_map->indir_qp);
+
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ mlx4_qp_modify(mdev->dev, NULL, rss_map->state[i],
+ MLX4_QP_STATE_RST, NULL, 0, 0, &rss_map->qps[i]);
+ mlx4_qp_remove(mdev->dev, &rss_map->qps[i]);
+ mlx4_qp_free(mdev->dev, &rss_map->qps[i]);
+ }
+ mlx4_qp_release_range(mdev->dev, rss_map->base_qpn, priv->rx_ring_num);
+}
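
Reviewer note on mlx4_en_calc_rx_buf() above: the loop splits the effective
MTU across a fixed table of fragment sizes and records, for each fragment, its
size, the number of bytes held by earlier fragments (the prefix), and a
cache-line-aligned stride. The standalone sketch below mirrors that arithmetic
so it can be checked outside the driver. It is an illustration only: the
ex_ names, the fragment size table and the 64-byte cache line are assumptions,
because the real FRAG_SZ0..FRAG_SZ3 and SMP_CACHE_BYTES values are defined
outside this hunk.

```c
#include <assert.h>

/* Hypothetical stand-ins: the driver takes these from FRAG_SZ0..FRAG_SZ3
 * and SMP_CACHE_BYTES, whose values are not shown in this patch hunk. */
#define EX_MAX_FRAGS  4
#define EX_CACHE_LINE 64

static const int ex_frag_sizes[EX_MAX_FRAGS] = { 512, 1024, 4096, 4096 };

struct ex_frag_info {
	int frag_size;        /* packet bytes stored in this fragment */
	int frag_prefix_size; /* packet bytes stored in earlier fragments */
	int frag_stride;      /* frag_size rounded up to a cache line */
};

/* Round x up to a multiple of a; a must be a power of two. */
static int ex_align(int x, int a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Split eff_mtu bytes across the fragment table, following the same
 * loop shape as mlx4_en_calc_rx_buf(). Returns the fragment count. */
static int ex_calc_rx_buf(int eff_mtu, struct ex_frag_info *info)
{
	int buf_size = 0;
	int i = 0;

	while (buf_size < eff_mtu) {
		/* Guard added for the sketch: never run past the table. */
		assert(i < EX_MAX_FRAGS);
		info[i].frag_size = (eff_mtu > buf_size + ex_frag_sizes[i]) ?
			ex_frag_sizes[i] : eff_mtu - buf_size;
		info[i].frag_prefix_size = buf_size;
		info[i].frag_stride = ex_align(info[i].frag_size,
					       EX_CACHE_LINE);
		buf_size += info[i].frag_size;
		i++;
	}
	return i;
}
```

With these assumed sizes, an effective MTU of 1518 bytes (1500 + ETH_HLEN +
VLAN_HLEN) needs two fragments: the first takes 512 bytes and the second takes
the remaining 1006. The prefix size is what mlx4_en_complete_rx_desc() compares
against the packet length to decide how many fragments a packet really used.
Note that the sketch asserts when the MTU exceeds the sum of the table, a
bound the loop in the patch does not check.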
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/en_rx_uio.c b/drivers/net/mlnx_uio/mlnx/mlx4/en_rx_uio.c
new file mode 100644
index 0000000..837aaa5
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/en_rx_uio.c
@@ -0,0 +1,187 @@
+/*
+ * en_rx_uio.c
+ *
+ * Created on: Jul 1, 2015
+ * Author: leeopop
+ */
+
+
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2007 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#include "mlx4_en.h"
+#include "log2.h"
+
+#include "mlx4_uio_helper.h"
+
+static void mlx4_en_init_rx_desc(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_ring *ring, int index)
+{
+ struct mlx4_en_rx_desc *rx_desc = ring->buf + ring->stride * index;
+ int possible_frags;
+ int i;
+
+ /* Set size and memtype fields */
+ for (i = 0; i < ring->num_frags; i++) {
+ rx_desc->data[i].byte_count =
+ cpu_to_be32(ring->frag_size);
+ rx_desc->data[i].lkey = cpu_to_be32(priv->mdev->mr.key);
+ }
+
+ /* If the number of used fragments does not fill up the ring stride,
+ * remaining (unused) fragments must be padded with null address/size
+ * and a special memory key */
+ possible_frags = (ring->stride - sizeof(struct mlx4_en_rx_desc)) / DS_SIZE;
+ for (i = ring->num_frags; i < possible_frags; i++) {
+ rx_desc->data[i].byte_count = 0;
+ rx_desc->data[i].lkey = cpu_to_be32(MLX4_EN_MEMTYPE_PAD);
+ rx_desc->data[i].addr = 0;
+ }
+}
+
+static void mlx4_en_free_rx_desc(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_ring *ring,
+ int index)
+{
+ struct rte_mbuf **frags;
+ int nr;
+
+ frags = ring->rx_info + (index * ring->num_frags);
+ for (nr = 0; nr < ring->num_frags; nr++) {
+ en_dbg(DRV, priv, "Freeing fragment:%d\n", nr);
+ rte_pktmbuf_free_seg(frags[nr]);
+ }
+}
+
+static int mlx4_en_fill_rx_buffers(struct mlx4_en_priv *priv)
+{
+ struct mlx4_en_rx_ring *ring;
+ int ring_ind;
+ int buf_ind;
+ int new_size;
+ for (ring_ind = 0; ring_ind < priv->rte_dev->data->nb_rx_queues; ring_ind++) {
+ ring = priv->rte_dev->data->rx_queues[ring_ind];
+ for (buf_ind = 0; buf_ind < priv->prof->rx_ring_size; buf_ind++) {
+ if (mlx4_en_prepare_rx_desc(priv, ring, ring->actual_size))
+ {
+ if (ring->actual_size < MLX4_EN_MIN_RX_SIZE)
+ {
+ en_err(priv, "Failed to allocate enough rx buffers\n");
+ return -ENOMEM;
+ }
+ else
+ {
+ new_size = rounddown_pow_of_two(ring->actual_size);
+ en_warn(priv, "Only %d buffers allocated, reducing ring size to %d\n",
+ ring->actual_size, new_size);
+ while (ring->actual_size > new_size) {
+ ring->actual_size--;
+ ring->prod--;
+ mlx4_en_free_rx_desc(priv, ring, ring->actual_size);
+ }
+ break;
+ }
+ }
+ ring->actual_size++;
+ ring->prod++;
+ }
+ }
+ return 0;
+}
+
+int mlx4_en_activate_rx_rings(struct mlx4_en_priv *priv)
+{
+ struct mlx4_en_rx_ring *ring;
+ int i;
+ int ring_ind;
+ int err;
+
+
+ for (ring_ind = 0; ring_ind < priv->rte_dev->data->nb_rx_queues; ring_ind++) {
+ ring = priv->rte_dev->data->rx_queues[ring_ind];
+ int stride = (priv->prof->inline_scatter_thold >= MIN_INLINE_SCATTER) ?
+ priv->stride :
+ roundup_pow_of_two(sizeof(struct mlx4_en_rx_desc) +
+ DS_SIZE * ring->num_frags);
+
+ ring->prod = 0;
+ ring->cons = 0;
+ ring->actual_size = 0;
+ //ring->cqn = priv->rx_cq[ring_ind]->mcq.cqn;
+
+ ring->stride = stride;
+ if (ring->stride <= TXBB_SIZE)
+ ring->buf += TXBB_SIZE;
+
+ ring->log_stride = ffs(ring->stride) - 1;
+ ring->buf_size = ring->size * ring->stride;
+
+ memset(ring->buf, 0, ring->buf_size);
+ mlx4_en_update_rx_prod_db(ring);
+
+ /* Initialize all descriptors */
+ for (i = 0; i < ring->size; i++)
+ mlx4_en_init_rx_desc(priv, ring, i);
+
+ /* Initialize page allocators */
+#ifdef KMOD_DISABLED
+ err = mlx4_en_init_allocator(priv, ring);
+ if (err) {
+ en_err(priv, "Failed initializing ring allocator\n");
+ if (ring->stride <= TXBB_SIZE)
+ ring->buf -= TXBB_SIZE;
+ ring_ind--;
+ goto err_allocator;
+ }
+#endif
+#ifdef CONFIG_COMPAT_LRO_ENABLED
+ mlx4_en_lro_init(ring, priv);
+#endif
+ }
+ err = mlx4_en_fill_rx_buffers(priv);
+ if (err)
+ return err;
+
+ for (ring_ind = 0; ring_ind < priv->rte_dev->data->nb_rx_queues; ring_ind++) {
+ ring = priv->rte_dev->data->rx_queues[ring_ind];
+
+ ring->size_mask = ring->actual_size - 1;
+ mlx4_en_update_rx_prod_db(ring);
+ }
+
+ return err;
+}
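Review note: mlx4_en_fill_rx_buffers() above shrinks a partially filled ring to the largest power of two that does not exceed the number of buffers actually allocated, and mlx4_en_activate_rx_rings() then derives size_mask as actual_size - 1. That mask only works as a wrap mask when the size is a power of two. The following is a minimal standalone sketch of that arithmetic, not the driver's code; rounddown_pow2_u32() and ring_index() are hypothetical names standing in for the kernel's rounddown_pow_of_two() and the masking done on prod/cons.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's rounddown_pow_of_two(). */
static uint32_t rounddown_pow2_u32(uint32_t n)
{
	uint32_t p = 1;

	assert(n != 0); /* the result is undefined for 0 */
	/* Comparing against n / 2 avoids overflowing p for large n. */
	while (p <= n / 2)
		p *= 2;
	return p;
}

/* Map a free-running producer/consumer counter onto a ring slot. */
static uint32_t ring_index(uint32_t counter, uint32_t size_mask)
{
	return counter & size_mask;
}
```

With 1000 buffers allocated the ring would be trimmed to 512 entries, giving a mask of 511, so a counter value of 513 lands on slot 1.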
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/en_selftest.c b/drivers/net/mlnx_uio/mlnx/mlx4/en_selftest.c
new file mode 100644
index 0000000..75c0e74
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/en_selftest.c
@@ -0,0 +1,194 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2007 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+
+#include "mlx4_en.h"
+
+
+
+static int mlx4_en_test_registers(struct mlx4_en_priv *priv)
+{
+ return mlx4_cmd(priv->mdev->dev, 0, 0, 0, MLX4_CMD_HW_HEALTH_CHECK,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+}
+
+#ifdef KMOD_DISABLED
+
+static int mlx4_en_test_loopback_xmit(struct mlx4_en_priv *priv)
+{
+ struct sk_buff *skb;
+ struct ethhdr *ethh;
+ unsigned char *packet;
+ unsigned int packet_size = MLX4_LOOPBACK_TEST_PAYLOAD;
+ unsigned int i;
+ int err;
+
+
+ /* build the pkt before xmit */
+ skb = netdev_alloc_skb(priv->dev, MLX4_LOOPBACK_TEST_PAYLOAD + ETH_HLEN + NET_IP_ALIGN);
+ if (!skb)
+ return -ENOMEM;
+
+ skb_reserve(skb, NET_IP_ALIGN);
+
+ ethh = (struct ethhdr *)skb_put(skb, sizeof(struct ethhdr));
+ packet = (unsigned char *)skb_put(skb, packet_size);
+ memcpy(ethh->h_dest, priv->dev->dev_addr, ETH_ALEN);
+ memset(ethh->h_source, 0, ETH_ALEN);
+ ethh->h_proto = htons(ETH_P_ARP);
+ skb_set_mac_header(skb, 0);
+ for (i = 0; i < packet_size; ++i) /* fill our packet */
+ packet[i] = (unsigned char)(i & 0xff);
+
+ /* xmit the pkt */
+ err = mlx4_en_xmit(skb, priv->dev);
+ return err;
+}
+
+static int mlx4_en_test_loopback(struct mlx4_en_priv *priv)
+{
+ u32 loopback_ok = 0;
+ int i;
+ bool gro_enabled;
+
+ priv->loopback_ok = 0;
+ priv->validate_loopback = 1;
+ gro_enabled = priv->dev->features & NETIF_F_GRO;
+
+ mlx4_en_update_loopback_state(priv->dev, priv->dev->features);
+ priv->dev->features &= ~NETIF_F_GRO;
+
+ /* xmit */
+ if (mlx4_en_test_loopback_xmit(priv)) {
+ en_err(priv, "Transmitting loopback packet failed\n");
+ goto mlx4_en_test_loopback_exit;
+ }
+
+ /* polling for result */
+ for (i = 0; i < MLX4_EN_LOOPBACK_RETRIES; ++i) {
+ msleep(MLX4_EN_LOOPBACK_TIMEOUT);
+ if (priv->loopback_ok) {
+ loopback_ok = 1;
+ break;
+ }
+ }
+ if (!loopback_ok)
+ en_err(priv, "Loopback packet didn't arrive\n");
+
+mlx4_en_test_loopback_exit:
+
+ priv->validate_loopback = 0;
+
+ if (gro_enabled)
+ priv->dev->features |= NETIF_F_GRO;
+
+ mlx4_en_update_loopback_state(priv->dev, priv->dev->features);
+ return !loopback_ok;
+}
+
+#endif
+
+static int mlx4_en_test_link(struct mlx4_en_priv *priv)
+{
+ if (mlx4_en_QUERY_PORT(priv->mdev, priv->port))
+ return -ENOMEM;
+ if (priv->port_state.link_state == 1)
+ return 0;
+ else
+ return 1;
+}
+
+static int mlx4_en_test_speed(struct mlx4_en_priv *priv)
+{
+
+ if (mlx4_en_QUERY_PORT(priv->mdev, priv->port))
+ return -ENOMEM;
+
+ /* The device supports 100M, 1G, 10G, 20G, 40G and 56G speeds */
+ if (priv->port_state.link_speed != SPEED_100 &&
+ priv->port_state.link_speed != SPEED_1000 &&
+ priv->port_state.link_speed != SPEED_10000 &&
+ priv->port_state.link_speed != SPEED_20000 &&
+ priv->port_state.link_speed != SPEED_40000 &&
+ priv->port_state.link_speed != SPEED_56000)
+ return priv->port_state.link_speed;
+
+ return 0;
+}
+
+
+int mlx4_en_ex_selftest(struct rte_eth_dev *rte_dev)
+{
+ struct mlx4_en_priv *priv = rte_dev->data->dev_private;
+ struct mlx4_en_dev *mdev = priv->mdev;
+
+#ifdef KMOD_DISABLED
+ int i, carrier_ok;
+
+
+ memset(buf, 0, sizeof(u64) * MLX4_EN_NUM_SELF_TEST);
+
+ if (*flags & ETH_TEST_FL_OFFLINE) {
+ /* disable the interface */
+ carrier_ok = netif_carrier_ok(dev);
+
+ netif_carrier_off(dev);
+ /* Wait until all tx queues are empty.
+ * There should not be any additional incoming traffic
+ * since we turned the carrier off */
+ msleep(200);
+
+ if (priv->mdev->dev->caps.flags &
+ MLX4_DEV_CAP_FLAG_UC_LOOPBACK) {
+ buf[3] = mlx4_en_test_registers(priv);
+ if (priv->port_up)
+ buf[4] = mlx4_en_test_loopback(priv);
+ }
+
+ if (carrier_ok)
+ netif_carrier_on(dev);
+
+ }
+#endif
+ //buf[0] = mlx4_test_interrupts(mdev->dev);
+ int ret = 0;
+ ret |= mlx4_en_test_registers(priv);
+ ret |= mlx4_en_test_link(priv);
+ ret |= mlx4_en_test_speed(priv);
+
+ return (ret == 0) ? 0 : -1;
+}
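Review note: mlx4_en_test_speed() above returns 0 when the reported link speed is one of the supported values and otherwise returns the raw speed, which mlx4_en_ex_selftest() then ORs together with the other results and collapses to 0 or -1. The same check can be written as a table lookup. This is only a sketch under assumptions: check_speed() and mlx4_speeds_mbps are hypothetical names, and the table assumes the SPEED_* constants equal the link speed in Mb/s.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed Mb/s values behind SPEED_100 ... SPEED_56000. */
static const unsigned int mlx4_speeds_mbps[] = {
	100, 1000, 10000, 20000, 40000, 56000
};

/* Return 0 if speed is in the table, otherwise the offending speed. */
static int check_speed(const unsigned int *table, size_t n,
		       unsigned int speed)
{
	size_t i;

	assert(table != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (table[i] == speed)
			return 0;
	return (int)speed;
}
```

A table keeps the list of supported speeds in one place, so adding a speed does not mean extending a chain of comparisons.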
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/en_sysfs.c b/drivers/net/mlnx_uio/mlnx/mlx4/en_sysfs.c
new file mode 100644
index 0000000..092b8ea
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/en_sysfs.c
@@ -0,0 +1,623 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2015 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+
+#include "mlx4_en.h"
+
+#ifdef KMOD_DISABLED
+
+#define to_en_priv(cd) ((struct mlx4_en_priv *)(netdev_priv(to_net_dev(cd))))
+
+#ifdef CONFIG_SYSFS_QCN
+
+#define MLX4_EN_NUM_QCN_PARAMS 12
+
+static ssize_t mlx4_en_show_qcn(struct device *d,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct mlx4_en_priv *priv = to_en_priv(d);
+ int i;
+ int len = 0;
+ struct ieee_qcn qcn;
+ int ret;
+
+ ret = mlx4_en_dcbnl_ieee_getqcn(priv->dev, &qcn);
+ if (ret)
+ return ret;
+
+ for (i = 0; i < MLX4_EN_NUM_TC; i++) {
+ len += sprintf(buf + len, "%s %d %s", "priority", i, ": ");
+ len += sprintf(buf + len, "%u ", qcn.rpg_enable[i]);
+ len += sprintf(buf + len, "%u ", qcn.rppp_max_rps[i]);
+ len += sprintf(buf + len, "%u ", qcn.rpg_time_reset[i]);
+ len += sprintf(buf + len, "%u ", qcn.rpg_byte_reset[i]);
+ len += sprintf(buf + len, "%u ", qcn.rpg_threshold[i]);
+ len += sprintf(buf + len, "%u ", qcn.rpg_max_rate[i]);
+ len += sprintf(buf + len, "%u ", qcn.rpg_ai_rate[i]);
+ len += sprintf(buf + len, "%u ", qcn.rpg_hai_rate[i]);
+ len += sprintf(buf + len, "%u ", qcn.rpg_gd[i]);
+ len += sprintf(buf + len, "%u ", qcn.rpg_min_dec_fac[i]);
+ len += sprintf(buf + len, "%u ", qcn.rpg_min_rate[i]);
+ len += sprintf(buf + len, "%u ", qcn.cndd_state_machine[i]);
+ len += sprintf(buf + len, "%s", "|");
+ }
+ len += sprintf(buf + len, "\n");
+
+ return len;
+}
+
+static ssize_t mlx4_en_store_qcn(struct device *d,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int ret;
+ struct mlx4_en_priv *priv = to_en_priv(d);
+ char save;
+ int i = 0;
+ int j = 0;
+ struct ieee_qcn qcn;
+
+ do {
+ int len;
+ u32 new_value;
+
+ if (i >= (MLX4_EN_NUM_TC * MLX4_EN_NUM_QCN_PARAMS))
+ goto bad_elem_count;
+
+ len = strcspn(buf, " ");
+ /* nul-terminate and parse */
+ save = buf[len];
+ ((char *)buf)[len] = '\0';
+
+ if (sscanf(buf, "%u", &new_value) != 1 ||
+ new_value < 0) {
+ en_err(priv, "bad qcn value: '%s'\n", buf);
+ ret = -EINVAL;
+ goto out;
+ }
+ switch (i % MLX4_EN_NUM_QCN_PARAMS) {
+ case 0:
+ qcn.rpg_enable[j] = new_value;
+ break;
+ case 1:
+ qcn.rppp_max_rps[j] = new_value;
+ break;
+ case 2:
+ qcn.rpg_time_reset[j] = new_value;
+ break;
+ case 3:
+ qcn.rpg_byte_reset[j] = new_value;
+ break;
+ case 4:
+ qcn.rpg_threshold[j] = new_value;
+ break;
+ case 5:
+ qcn.rpg_max_rate[j] = new_value;
+ break;
+ case 6:
+ qcn.rpg_ai_rate[j] = new_value;
+ break;
+ case 7:
+ qcn.rpg_hai_rate[j] = new_value;
+ break;
+ case 8:
+ qcn.rpg_gd[j] = new_value;
+ break;
+ case 9:
+ qcn.rpg_min_dec_fac[j] = new_value;
+ break;
+ case 10:
+ qcn.rpg_min_rate[j] = new_value;
+ break;
+ case 11:
+ qcn.cndd_state_machine[j] = new_value;
+ break;
+ default:
+ ret = -EINVAL;
+ goto out;
+ }
+
+ buf += len+1;
+ i++;
+ if ((i % MLX4_EN_NUM_QCN_PARAMS) == 0)
+ j++;
+ } while (save == ' ');
+
+ if (i != (MLX4_EN_NUM_TC * MLX4_EN_NUM_QCN_PARAMS))
+ goto bad_elem_count;
+
+ ret = mlx4_en_dcbnl_ieee_setqcn(priv->dev, &qcn);
+ if (!ret)
+ ret = count;
+
+out:
+ return ret;
+bad_elem_count:
+ en_err(priv, "bad number of elements in qcn array\n");
+ return -EINVAL;
+}
+
+static ssize_t mlx4_en_show_qcnstats(struct device *d,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct mlx4_en_priv *priv = to_en_priv(d);
+ int i;
+ int len = 0;
+ struct ieee_qcn_stats qcn_stats;
+ int ret;
+
+ ret = mlx4_en_dcbnl_ieee_getqcnstats(priv->dev, &qcn_stats);
+ if (ret)
+ return ret;
+
+ for (i = 0; i < MLX4_EN_NUM_TC; i++) {
+ len += sprintf(buf + len, "%s %d %s", "priority", i, ": ");
+ len += sprintf(buf + len, "%llu ", qcn_stats.rppp_rp_centiseconds[i]);
+ len += sprintf(buf + len, "%u ", qcn_stats.rppp_created_rps[i]);
+ len += sprintf(buf + len, "%u ", qcn_stats.ignored_cnm[i]);
+ len += sprintf(buf + len, "%u ", qcn_stats.estimated_total_rate[i]);
+ len += sprintf(buf + len, "%u ", qcn_stats.cnms_handled_successfully[i]);
+ len += sprintf(buf + len, "%u ", qcn_stats.min_total_limiters_rate[i]);
+ len += sprintf(buf + len, "%u ", qcn_stats.max_total_limiters_rate[i]);
+ len += sprintf(buf + len, "%s", "|");
+ }
+ len += sprintf(buf + len, "\n");
+
+ return len;
+}
+
+static DEVICE_ATTR(qcn, S_IRUGO | S_IWUSR,
+ mlx4_en_show_qcn, mlx4_en_store_qcn);
+
+static DEVICE_ATTR(qcn_stats, S_IRUGO,
+ mlx4_en_show_qcnstats, NULL);
+#endif
+
+#ifdef CONFIG_SYSFS_MAXRATE
+static ssize_t mlx4_en_show_maxrate(struct device *d,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct mlx4_en_priv *priv = to_en_priv(d);
+ int i;
+ int len = 0;
+ struct ieee_maxrate maxrate;
+ int ret;
+
+ ret = mlx4_en_dcbnl_ieee_getmaxrate(priv->dev, &maxrate);
+ if (ret)
+ return ret;
+
+ for (i = 0; i < MLX4_EN_NUM_TC; i++)
+ len += sprintf(buf + len, "%llu ", maxrate.tc_maxrate[i]);
+ len += sprintf(buf + len, "\n");
+
+ return len;
+}
+
+static ssize_t mlx4_en_store_maxrate(struct device *d,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int ret;
+ struct mlx4_en_priv *priv = to_en_priv(d);
+ char save;
+ int i = 0;
+ struct ieee_maxrate maxrate;
+
+ do {
+ int len;
+ u64 new_value;
+
+ if (i >= MLX4_EN_NUM_TC)
+ goto bad_elem_count;
+
+ len = strcspn(buf, " ");
+
+ /* nul-terminate and parse */
+ save = buf[len];
+ ((char *)buf)[len] = '\0';
+
+ if (sscanf(buf, "%llu", &new_value) != 1 ||
+ new_value < 0) {
+ en_err(priv, "bad maxrate value: '%s'\n", buf);
+ ret = -EINVAL;
+ goto out;
+ }
+ maxrate.tc_maxrate[i] = new_value;
+
+ buf += len+1;
+ i++;
+ } while (save == ' ');
+
+ if (i != MLX4_EN_NUM_TC)
+ goto bad_elem_count;
+
+ ret = mlx4_en_dcbnl_ieee_setmaxrate(priv->dev, &maxrate);
+ if (!ret)
+ ret = count;
+
+out:
+ return ret;
+
+bad_elem_count:
+ en_err(priv, "bad number of elements in maxrate array\n");
+ return -EINVAL;
+}
+
+static DEVICE_ATTR(maxrate, S_IRUGO | S_IWUSR,
+ mlx4_en_show_maxrate, mlx4_en_store_maxrate);
+#endif
+
+#ifdef CONFIG_SYSFS_MQPRIO
+
+#define MLX4_EN_NUM_SKPRIO 16
+
+static ssize_t mlx4_en_show_skprio2up(struct device *d,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct mlx4_en_priv *priv = to_en_priv(d);
+ struct net_device *dev = priv->dev;
+ int i;
+ int len = 0;
+
+ for (i = 0; i < MLX4_EN_NUM_SKPRIO; i++)
+ len += sprintf(buf + len, "%d ",
+ netdev_get_prio_tc_map(dev, i));
+ len += sprintf(buf + len, "\n");
+
+ return len;
+}
+
+static ssize_t mlx4_en_store_skprio2up(struct device *d,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int ret = count;
+ struct mlx4_en_priv *priv = to_en_priv(d);
+ struct net_device *dev = priv->dev;
+ char save;
+ int i = 0;
+ u8 skprio2up[MLX4_EN_NUM_SKPRIO];
+
+ do {
+ int len;
+ int new_value;
+
+ if (i >= MLX4_EN_NUM_SKPRIO)
+ goto bad_elem_count;
+
+ len = strcspn(buf, " ");
+
+ /* nul-terminate and parse */
+ save = buf[len];
+ ((char *)buf)[len] = '\0';
+
+ if (sscanf(buf, "%d", &new_value) != 1 ||
+ new_value > MLX4_EN_NUM_UP || new_value < 0) {
+ en_err(priv, "bad user priority: '%s'\n", buf);
+ ret = -EINVAL;
+ goto out;
+ }
+ skprio2up[i] = new_value;
+
+ buf += len+1;
+ i++;
+ } while (save == ' ');
+
+ if (i != MLX4_EN_NUM_SKPRIO)
+ goto bad_elem_count;
+
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ mlx4_en_setup_tc(dev, MLX4_EN_NUM_UP);
+#endif
+
+ for (i = 0; i < MLX4_EN_NUM_SKPRIO; i++)
+ netdev_set_prio_tc_map(dev, i, skprio2up[i]);
+
+out:
+ return ret;
+
+bad_elem_count:
+ en_err(priv, "bad number of elements in skprio2up array\n");
+ return -EINVAL;
+}
+
+static DEVICE_ATTR(skprio2up, S_IRUGO | S_IWUSR,
+ mlx4_en_show_skprio2up, mlx4_en_store_skprio2up);
+#endif
+
+#ifdef CONFIG_SYSFS_INDIR_SETTING
+static ssize_t mlx4_en_show_rxfh_indir(struct device *d,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct net_device *dev = to_net_dev(d);
+ int i, err;
+ int len = 0;
+ int ring_num;
+ u32 *ring_index;
+
+ ring_num = mlx4_en_get_rxfh_indir_size(dev);
+ if (ring_num < 0)
+ return -EINVAL;
+
+ ring_index = kzalloc(sizeof(u32) * ring_num, GFP_KERNEL);
+ if (!ring_index)
+ return -ENOMEM;
+
+ err = mlx4_en_get_rxfh_indir(dev, ring_index);
+ if (err)
+ goto err;
+
+ for (i = 0; i < ring_num; i++)
+ len += sprintf(buf + len, "%d\n", ring_index[i]);
+
+ err = len;
+err:
+ kfree(ring_index);
+
+ return err;
+}
+
+static ssize_t mlx4_en_store_rxfh_indir(struct device *d,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct net_device *dev = to_net_dev(d);
+ char *endp;
+ unsigned long new;
+ int i, err;
+ int ring_num;
+ u32 *ring_index;
+
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
+ new = simple_strtoul(buf, &endp, 0);
+ if (endp == buf)
+ return -EINVAL;
+
+ if (!is_power_of_2(new))
+ return -EINVAL;
+
+ ring_num = mlx4_en_get_rxfh_indir_size(dev);
+ if (ring_num < 0)
+ return -EINVAL;
+
+ ring_index = kzalloc(sizeof(u32) * ring_num, GFP_KERNEL);
+ if (!ring_index)
+ return -ENOMEM;
+
+ for (i = 0; i < ring_num; i++)
+ ring_index[i] = i % new;
+
+ err = mlx4_en_set_rxfh_indir(dev, ring_index);
+ if (err)
+ goto err;
+
+ err = count;
+
+err:
+ kfree(ring_index);
+
+ return err;
+}
+
+static DEVICE_ATTR(rxfh_indir, S_IRUGO | S_IWUSR,
+ mlx4_en_show_rxfh_indir, mlx4_en_store_rxfh_indir);
+#endif
+
+#ifdef CONFIG_SYSFS_NUM_CHANNELS
+static ssize_t mlx4_en_show_channels(struct device *d,
+ struct device_attribute *attr,
+ char *buf, int is_tx)
+{
+ struct net_device *dev = to_net_dev(d);
+ struct ethtool_channels channel;
+ int len = 0;
+
+ mlx4_en_get_channels(dev, &channel);
+
+ len += sprintf(buf + len, "%d\n",
+ is_tx ? channel.tx_count : channel.rx_count);
+
+ return len;
+}
+
+static ssize_t mlx4_en_store_channels(struct device *d,
+ struct device_attribute *attr,
+ const char *buf, size_t count, int is_tx)
+{
+ struct net_device *dev = to_net_dev(d);
+ char *endp;
+ struct ethtool_channels channel;
+ int ret = -EINVAL;
+
+ mlx4_en_get_channels(dev, &channel);
+
+ if (is_tx)
+ channel.tx_count = simple_strtoul(buf, &endp, 0);
+ else
+ channel.rx_count = simple_strtoul(buf, &endp, 0);
+ if (endp == buf)
+ goto err;
+
+ rtnl_lock();
+ ret = mlx4_en_set_channels(dev, &channel);
+ rtnl_unlock();
+ if (ret)
+ goto err;
+
+ ret = count;
+err:
+ return ret;
+}
+
+static ssize_t mlx4_en_show_tx_channels(struct device *d,
+ struct device_attribute *attr,
+ char *buf)
+{
+ return mlx4_en_show_channels(d, attr, buf, 1);
+}
+
+static ssize_t mlx4_en_store_tx_channels(struct device *d,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ return mlx4_en_store_channels(d, attr, buf, count, 1);
+}
+
+static ssize_t mlx4_en_show_rx_channels(struct device *d,
+ struct device_attribute *attr,
+ char *buf)
+{
+ return mlx4_en_show_channels(d, attr, buf, 0);
+}
+
+static ssize_t mlx4_en_store_rx_channels(struct device *d,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ return mlx4_en_store_channels(d, attr, buf, count, 0);
+}
+
+static DEVICE_ATTR(tx_channels, S_IRUGO | S_IWUSR,
+ mlx4_en_show_tx_channels, mlx4_en_store_tx_channels);
+
+static DEVICE_ATTR(rx_channels, S_IRUGO | S_IWUSR,
+ mlx4_en_show_rx_channels, mlx4_en_store_rx_channels);
+
+#endif
+
+#ifdef CONFIG_SYSFS_LOOPBACK
+static ssize_t mlx4_en_show_loopback(struct device *d,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct net_device *dev = to_net_dev(d);
+ int len = 0;
+
+ len += sprintf(buf + len, "%d\n", !!(dev->features & NETIF_F_LOOPBACK));
+
+ return len;
+}
+
+static ssize_t mlx4_en_store_loopback(struct device *d,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct net_device *dev = to_net_dev(d);
+ char *endp;
+ unsigned long new;
+ int ret = -EINVAL;
+#ifdef HAVE_NET_DEVICE_OPS_EXT
+ u32 features = dev->features;
+#else
+ netdev_features_t features = dev->features;
+#endif
+
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
+ new = simple_strtoul(buf, &endp, 0);
+ if (endp == buf)
+ goto err;
+
+ if (new)
+ features |= NETIF_F_LOOPBACK;
+ else
+ features &= ~NETIF_F_LOOPBACK;
+
+ rtnl_lock();
+ mlx4_en_set_features(dev, features);
+ dev->features = features;
+ rtnl_unlock();
+
+ ret = count;
+
+err:
+ return ret;
+}
+
+static DEVICE_ATTR(loopback, S_IRUGO | S_IWUSR,
+ mlx4_en_show_loopback, mlx4_en_store_loopback);
+#endif
+
+static struct attribute *mlx4_en_qos_attrs[] = {
+#ifdef CONFIG_SYSFS_MAXRATE
+ &dev_attr_maxrate.attr,
+#endif
+#ifdef CONFIG_SYSFS_MQPRIO
+ &dev_attr_skprio2up.attr,
+#endif
+#ifdef CONFIG_SYSFS_INDIR_SETTING
+ &dev_attr_rxfh_indir.attr,
+#endif
+#ifdef CONFIG_SYSFS_NUM_CHANNELS
+ &dev_attr_tx_channels.attr,
+ &dev_attr_rx_channels.attr,
+#endif
+#ifdef CONFIG_SYSFS_LOOPBACK
+ &dev_attr_loopback.attr,
+#endif
+#ifdef CONFIG_SYSFS_QCN
+ &dev_attr_qcn.attr,
+ &dev_attr_qcn_stats.attr,
+#endif
+ NULL,
+};
+
+static struct attribute_group qos_group = {
+ .name = "qos",
+ .attrs = mlx4_en_qos_attrs,
+};
+
+int mlx4_en_sysfs_create(struct net_device *dev)
+{
+ return sysfs_create_group(&(dev->dev.kobj), &qos_group);
+}
+
+void mlx4_en_sysfs_remove(struct net_device *dev)
+{
+ sysfs_remove_group(&(dev->dev.kobj), &qos_group);
+}
+
+#endif
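Review note: the store handlers above (qcn, maxrate, skprio2up) split the input by casting away const and writing a NUL into the caller's buffer, and they accept signed input through sscanf(). The same fixed-length, space-separated list can be parsed without modifying the buffer. The sketch below is plain userspace C under assumptions, not a drop-in replacement: parse_u32_list() is a hypothetical helper, and it uses strtoul() where in-kernel code would use the kernel's own string-to-integer helpers.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse exactly 'expected' unsigned decimal values separated by spaces.
 * Returns 0 on success or -EINVAL on malformed input or a wrong count.
 * The input buffer is never written to.
 */
static int parse_u32_list(const char *buf, uint32_t *out, size_t expected)
{
	size_t i;

	assert(buf != NULL && out != NULL);
	for (i = 0; i < expected; i++) {
		char *end;
		unsigned long v;

		while (*buf == ' ')
			buf++;
		/* Require a digit: strtoul() alone would accept "-1". */
		if (!isdigit((unsigned char)*buf))
			return -EINVAL;
		errno = 0;
		v = strtoul(buf, &end, 10);
		if (errno == ERANGE || v > UINT32_MAX)
			return -EINVAL;
		out[i] = (uint32_t)v;
		buf = end;
	}
	/* Only trailing spaces or a newline may follow the last value. */
	while (*buf == ' ' || *buf == '\n')
		buf++;
	return *buf == '\0' ? 0 : -EINVAL;
}
```

Checking the element count inside the parser covers both the "too few" and "too many" cases that the handlers above route to bad_elem_count.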
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/en_tx.c b/drivers/net/mlnx_uio/mlnx/mlx4/en_tx.c
new file mode 100644
index 0000000..55930f6
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/en_tx.c
@@ -0,0 +1,1143 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2007 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+
+#include "mlx4_en.h"
+
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ /* The +8 is for mss_header and inline header */
+ #define GET_LSO_SEG_SIZE_EN(lso_header_size) \
+ ((lso_header_size > 44) ? \
+ ALIGN(lso_header_size + 8, DS_SIZE_ALIGNMENT) : \
+ ALIGN(lso_header_size + 4, DS_SIZE_ALIGNMENT))
+#else
+ #define GET_LSO_SEG_SIZE_EN(lso_header_size) \
+ ALIGN(lso_header_size + 4, DS_SIZE_ALIGNMENT)
+#endif
+static inline void copy_lso_header(__be32 *dst, void *src, int hdr_sz,
+ __be32 owner_bit)
+{
+/* In WQE_FORMAT = 1 we need to split segments larger
+ * than 64 bytes, in this case: 64 - sizeof(ctrl) -
+ * sizeof(lso->mss_hdr_size) = 44
+ */
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ if (likely(hdr_sz > 44)) {
+ memcpy(dst, src, 44);
+ /* Write the rest of the header, leaving 4 bytes to
+ * write the inline header
+ */
+ memcpy((dst + 12), src + 44,
+ hdr_sz - 44);
+ /* make sure we write the rest of the segment before
+ * setting ownership bit to HW
+ */
+ wmb();
+ *(dst + 11) =
+ cpu_to_be32((1 << 31) |
+ (hdr_sz - 44)) |
+ owner_bit;
+ } else
+#endif
+ memcpy(dst, src, hdr_sz);
+}
+
+int mlx4_en_create_tx_ring(struct mlx4_en_priv *priv,
+ struct mlx4_en_tx_ring *ring, u32 size,
+ u16 stride, int node, int queue_index)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_en_tx_ring *ring;
+ int tmp;
+ int err;
+
+ ring->size = size;
+ ring->size_mask = size - 1;
+ ring->stride = stride;
+
+ tmp = size * sizeof(struct mlx4_en_tx_info);
+ ring->tx_info = kmalloc_node(tmp, GFP_KERNEL | __GFP_NOWARN, node);
+ if (!ring->tx_info) {
+ ring->tx_info = vmalloc(tmp);
+ if (!ring->tx_info) {
+ err = -ENOMEM;
+ goto err_ring;
+ }
+ }
+
+ en_dbg(DRV, priv, "Allocated tx_info ring at addr:%p size:%d\n",
+ ring->tx_info, tmp);
+
+ ring->bounce_buf = kmalloc_node(MAX_DESC_SIZE, GFP_KERNEL, node);
+ if (!ring->bounce_buf) {
+ ring->bounce_buf = kmalloc(MAX_DESC_SIZE, GFP_KERNEL);
+ if (!ring->bounce_buf) {
+ err = -ENOMEM;
+ goto err_info;
+ }
+ }
+ ring->buf_size = ALIGN(size * ring->stride, MLX4_EN_PAGE_SIZE);
+
+ /* Allocate HW buffers on provided NUMA node */
+ set_dev_node(&mdev->dev->persist->pdev->dev, node);
+ err = mlx4_alloc_hwq_res(mdev->dev, &ring->wqres, ring->buf_size,
+ 2 * PAGE_SIZE);
+ set_dev_node(&mdev->dev->persist->pdev->dev, mdev->dev->numa_node);
+ if (err) {
+ en_err(priv, "Failed allocating hwq resources\n");
+ goto err_bounce;
+ }
+
+ err = mlx4_en_map_buffer(&ring->wqres.buf);
+ if (err) {
+ en_err(priv, "Failed to map TX buffer\n");
+ goto err_hwq_res;
+ }
+
+ ring->buf = ring->wqres.buf.direct.buf;
+
+ en_dbg(DRV, priv, "Allocated TX ring (addr:%p) - buf:%p size:%d buf_size:%d dma:%llx\n",
+ ring, ring->buf, ring->size, ring->buf_size,
+ (unsigned long long) ring->wqres.buf.direct.map);
+
+ err = mlx4_qp_reserve_range(mdev->dev, 1, 1, &ring->qpn,
+ MLX4_RESERVE_ETH_BF_QP);
+ if (err) {
+ en_err(priv, "Failed reserving qp for TX ring\n");
+ goto err_map;
+ }
+
+ err = mlx4_qp_alloc(mdev->dev, ring->qpn, &ring->qp, GFP_KERNEL);
+ if (err) {
+ en_err(priv, "Failed allocating qp %d\n", ring->qpn);
+ goto err_reserve;
+ }
+ ring->qp.event = mlx4_en_sqp_event;
+
+ err = mlx4_bf_alloc(mdev->dev, &ring->bf, node);
+ if (err) {
+ en_dbg(DRV, priv, "working without blueflame (%d)\n", err);
+ ring->bf.uar = &mdev->priv_uar;
+ ring->bf.uar->map = mdev->uar_map;
+ ring->bf_enabled = false;
+ ring->bf_alloced = false;
+ priv->pflags &= ~MLX4_EN_PRIV_FLAGS_BLUEFLAME;
+ } else {
+ ring->bf_alloced = true;
+ ring->bf_enabled = !!(priv->pflags &
+ MLX4_EN_PRIV_FLAGS_BLUEFLAME);
+ }
+
+ ring->hwtstamp_tx_type = priv->hwtstamp_config.tx_type;
+ ring->queue_index = queue_index;
+
+ if (queue_index < priv->num_tx_rings_p_up && cpu_online(queue_index))
+ cpumask_set_cpu(queue_index, &ring->affinity_mask);
+
+ *pring = ring;
+ return 0;
+
+err_reserve:
+ mlx4_qp_release_range(mdev->dev, ring->qpn, 1);
+err_map:
+ mlx4_en_unmap_buffer(&ring->wqres.buf);
+err_hwq_res:
+ mlx4_free_hwq_res(mdev->dev, &ring->wqres, ring->buf_size);
+err_bounce:
+ kfree(ring->bounce_buf);
+ ring->bounce_buf = NULL;
+err_info:
+ kvfree(ring->tx_info);
+ ring->tx_info = NULL;
+err_ring:
+ kfree(ring);
+ *pring = NULL;
+ return err;
+}
+
+void mlx4_en_destroy_tx_ring(struct mlx4_en_priv *priv,
+ struct mlx4_en_tx_ring **pring)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_en_tx_ring *ring = *pring;
+ en_dbg(DRV, priv, "Destroying tx ring, qpn: %d\n", ring->qpn);
+
+ if (ring->bf_alloced)
+ mlx4_bf_free(mdev->dev, &ring->bf);
+ mlx4_qp_remove(mdev->dev, &ring->qp);
+ mlx4_qp_free(mdev->dev, &ring->qp);
+ mlx4_qp_release_range(priv->mdev->dev, ring->qpn, 1);
+ mlx4_en_unmap_buffer(&ring->wqres.buf);
+ mlx4_free_hwq_res(mdev->dev, &ring->wqres, ring->buf_size);
+ kfree(ring->bounce_buf);
+ ring->bounce_buf = NULL;
+ kvfree(ring->tx_info);
+ ring->tx_info = NULL;
+ kfree(ring);
+ *pring = NULL;
+}
+
+int mlx4_en_activate_tx_ring(struct mlx4_en_priv *priv,
+ struct mlx4_en_tx_ring *ring,
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ int cq, int user_prio)
+#else
+ int cq)
+#endif
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int err;
+
+ ring->cqn = cq;
+ ring->prod = 0;
+ ring->cons = 0xffffffff;
+ ring->last_nr_txbb = 1;
+ memset(ring->tx_info, 0, ring->size * sizeof(struct mlx4_en_tx_info));
+ memset(ring->buf, 0, ring->buf_size);
+
+ ring->qp_state = MLX4_QP_STATE_RST;
+ ring->doorbell_qpn = cpu_to_be32(ring->qp.qpn << 8);
+ ring->mr_key = cpu_to_be32(mdev->mr.key);
+
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ mlx4_en_fill_qp_context(priv, ring->size, ring->stride, 1, 0, ring->qpn,
+ ring->cqn, user_prio, &ring->context);
+#else
+ mlx4_en_fill_qp_context(priv, ring->size, ring->stride, 1, 0, ring->qpn,
+ ring->cqn, &ring->context);
+#endif
+ if (ring->bf_alloced)
+ ring->context.usr_page = cpu_to_be32(ring->bf.uar->index);
+
+ err = mlx4_qp_to_ready(mdev->dev, &ring->wqres.mtt, &ring->context,
+ &ring->qp, &ring->qp_state);
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ if (!user_prio && cpu_online(ring->queue_index))
+#else
+ if (cpu_online(ring->queue_index))
+#endif
+ netif_set_xps_queue(priv->dev, &ring->affinity_mask,
+ ring->queue_index);
+
+ return err;
+}
+
+void mlx4_en_deactivate_tx_ring(struct mlx4_en_priv *priv,
+ struct mlx4_en_tx_ring *ring)
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+
+ mlx4_qp_modify(mdev->dev, NULL, ring->qp_state,
+ MLX4_QP_STATE_RST, NULL, 0, 0, &ring->qp);
+}
+
+#ifndef CONFIG_INFINIBAND_WQE_FORMAT
+static void mlx4_en_stamp_wqe(struct mlx4_en_priv *priv,
+ struct mlx4_en_tx_ring *ring, int index,
+ u8 owner)
+{
+ __be32 stamp = cpu_to_be32(STAMP_VAL | (!!owner << STAMP_SHIFT));
+ struct mlx4_en_tx_desc *tx_desc = ring->buf + index * TXBB_SIZE;
+ struct mlx4_en_tx_info *tx_info = &ring->tx_info[index];
+ void *end = ring->buf + ring->buf_size;
+ __be32 *ptr = (__be32 *)tx_desc;
+ int i;
+
+ /* Optimize the common case when there are no wraparounds */
+ if (likely((void *)tx_desc + tx_info->nr_txbb * TXBB_SIZE <= end)) {
+ /* Stamp the freed descriptor */
+ for (i = 0; i < tx_info->nr_txbb * TXBB_SIZE;
+ i += STAMP_STRIDE) {
+ *ptr = stamp;
+ ptr += STAMP_DWORDS;
+ }
+ } else {
+ /* Stamp the freed descriptor */
+ for (i = 0; i < tx_info->nr_txbb * TXBB_SIZE;
+ i += STAMP_STRIDE) {
+ *ptr = stamp;
+ ptr += STAMP_DWORDS;
+ if ((void *)ptr >= end) {
+ ptr = ring->buf;
+ stamp ^= cpu_to_be32(0x80000000);
+ }
+ }
+ }
+}
+#endif
+
+
+static u32 mlx4_en_free_tx_desc(struct mlx4_en_priv *priv,
+ struct mlx4_en_tx_ring *ring,
+ int index, u8 owner, u64 timestamp)
+{
+ struct mlx4_en_tx_info *tx_info = &ring->tx_info[index];
+ struct mlx4_en_tx_desc *tx_desc = ring->buf + index * TXBB_SIZE;
+ struct mlx4_wqe_data_seg *data = (void *) tx_desc + tx_info->data_offset;
+ void *end = ring->buf + ring->buf_size;
+ struct sk_buff *skb = tx_info->skb;
+ int nr_maps = tx_info->nr_maps;
+ int i;
+
+ /* We do not touch skb here, so prefetch skb->users location
+ * to speed up consume_skb()
+ */
+ prefetchw(&skb->users);
+
+ if (unlikely(timestamp)) {
+ struct skb_shared_hwtstamps hwts;
+
+ mlx4_en_fill_hwtstamps(priv->mdev, &hwts, timestamp);
+ skb_tstamp_tx(skb, &hwts);
+ }
+
+ /* Optimize the common case when there are no wraparounds */
+ if (likely((void *) tx_desc + tx_info->nr_txbb * TXBB_SIZE <= end)) {
+ if (!tx_info->inl) {
+ if (tx_info->linear)
+ dma_unmap_single(priv->ddev,
+ tx_info->map0_dma,
+ tx_info->map0_byte_count,
+ PCI_DMA_TODEVICE);
+ else
+ dma_unmap_page(priv->ddev,
+ tx_info->map0_dma,
+ tx_info->map0_byte_count,
+ PCI_DMA_TODEVICE);
+ for (i = 1; i < nr_maps; i++) {
+ data++;
+ dma_unmap_page(priv->ddev,
+ (dma_addr_t)be64_to_cpu(data->addr),
+ be32_to_cpu(data->byte_count),
+ PCI_DMA_TODEVICE);
+ }
+ }
+ } else {
+ if (!tx_info->inl) {
+ if ((void *) data >= end) {
+ data = ring->buf + ((void *)data - end);
+ }
+
+ if (tx_info->linear)
+ dma_unmap_single(priv->ddev,
+ tx_info->map0_dma,
+ tx_info->map0_byte_count,
+ PCI_DMA_TODEVICE);
+ else
+ dma_unmap_page(priv->ddev,
+ tx_info->map0_dma,
+ tx_info->map0_byte_count,
+ PCI_DMA_TODEVICE);
+ for (i = 1; i < nr_maps; i++) {
+ data++;
+ /* Check for wraparound before unmapping */
+ if ((void *) data >= end)
+ data = ring->buf;
+ dma_unmap_page(priv->ddev,
+ (dma_addr_t)be64_to_cpu(data->addr),
+ be32_to_cpu(data->byte_count),
+ PCI_DMA_TODEVICE);
+ }
+ }
+ }
+#ifdef HAVE_DEV_CONSUME_SKB_ANY
+ dev_consume_skb_any(skb);
+#else
+ dev_kfree_skb_any(skb);
+#endif
+ return tx_info->nr_txbb;
+}
+
+
+int mlx4_en_free_tx_buf(struct net_device *dev, struct mlx4_en_tx_ring *ring)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ int cnt = 0;
+
+ /* Skip last polled descriptor */
+ ring->cons += ring->last_nr_txbb;
+ en_dbg(DRV, priv, "Freeing Tx buf - cons:0x%x prod:0x%x\n",
+ ring->cons, ring->prod);
+
+ if ((u32) (ring->prod - ring->cons) > ring->size) {
+ if (netif_msg_tx_err(priv))
+ en_warn(priv, "Tx consumer passed producer!\n");
+ return 0;
+ }
+
+ while (ring->cons != ring->prod) {
+ ring->last_nr_txbb = mlx4_en_free_tx_desc(priv, ring,
+ ring->cons & ring->size_mask,
+ !!(ring->cons & ring->size), 0);
+ ring->cons += ring->last_nr_txbb;
+ cnt++;
+ }
+
+ netdev_tx_reset_queue(ring->tx_queue);
+
+ if (cnt)
+ en_dbg(DRV, priv, "Freed %d uncompleted tx descriptors\n", cnt);
+
+ return cnt;
+}
+
+static bool mlx4_en_process_tx_cq(struct net_device *dev, struct mlx4_en_cq *cq)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_cq *mcq = &cq->mcq;
+ struct mlx4_en_tx_ring *ring = priv->tx_ring[cq->ring];
+ struct mlx4_cqe *cqe;
+ u16 index;
+ u16 new_index, ring_index, stamp_index;
+ u32 txbbs_skipped = 0;
+#ifndef CONFIG_INFINIBAND_WQE_FORMAT
+ u32 txbbs_stamp = 0;
+#endif
+ u32 cons_index = mcq->cons_index;
+ int size = cq->size;
+ u32 size_mask = ring->size_mask;
+ struct mlx4_cqe *buf = cq->buf;
+ u32 packets = 0;
+ u32 bytes = 0;
+ int factor = priv->cqe_factor;
+ u64 timestamp = 0;
+ int done = 0;
+ int budget = priv->tx_work_limit;
+ u32 last_nr_txbb;
+ u32 ring_cons;
+
+ if (!priv->port_up)
+ return true;
+
+#ifdef HAVE_NETDEV_TXQ_BQL_PREFETCHW
+ netdev_txq_bql_complete_prefetchw(ring->tx_queue);
+#else
+#ifdef CONFIG_BQL
+ prefetchw(&ring->tx_queue->dql.limit);
+#endif
+#endif
+
+ index = cons_index & size_mask;
+ cqe = mlx4_en_get_cqe(buf, index, priv->cqe_size) + factor;
+ last_nr_txbb = ACCESS_ONCE(ring->last_nr_txbb);
+ ring_cons = ACCESS_ONCE(ring->cons);
+ ring_index = ring_cons & size_mask;
+ stamp_index = ring_index;
+
+ /* Process all completed CQEs */
+ while (XNOR(cqe->owner_sr_opcode & MLX4_CQE_OWNER_MASK,
+ cons_index & size) && (done < budget)) {
+ /*
+ * make sure we read the CQE after we read the
+ * ownership bit
+ */
+ rmb();
+
+ if (unlikely((cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK) ==
+ MLX4_CQE_OPCODE_ERROR)) {
+ struct mlx4_err_cqe *cqe_err = (struct mlx4_err_cqe *)cqe;
+
+ en_err(priv, "CQE error - vendor syndrome: 0x%x syndrome: 0x%x\n",
+ cqe_err->vendor_err_syndrome,
+ cqe_err->syndrome);
+ }
+
+ /* Skip over last polled CQE */
+ new_index = be16_to_cpu(cqe->wqe_index) & size_mask;
+
+ do {
+ txbbs_skipped += last_nr_txbb;
+ ring_index = (ring_index + last_nr_txbb) & size_mask;
+ if (ring->tx_info[ring_index].ts_requested)
+ timestamp = mlx4_en_get_cqe_ts(cqe);
+
+ /* free next descriptor */
+ last_nr_txbb = mlx4_en_free_tx_desc(
+ priv, ring, ring_index,
+ !!((ring_cons + txbbs_skipped) &
+ ring->size), timestamp);
+
+#ifndef CONFIG_INFINIBAND_WQE_FORMAT
+ mlx4_en_stamp_wqe(priv, ring, stamp_index,
+ !!((ring_cons + txbbs_stamp) &
+ ring->size));
+ stamp_index = ring_index;
+ txbbs_stamp = txbbs_skipped;
+#endif
+ packets++;
+ bytes += ring->tx_info[ring_index].nr_bytes;
+ } while ((++done < budget) && (ring_index != new_index));
+
+ ++cons_index;
+ index = cons_index & size_mask;
+ cqe = mlx4_en_get_cqe(buf, index, priv->cqe_size) + factor;
+ }
+
+
+ /*
+ * To prevent CQ overflow we first update CQ consumer and only then
+ * the ring consumer.
+ */
+ mcq->cons_index = cons_index;
+ mlx4_cq_set_ci(mcq);
+ wmb();
+
+ /* we want to dirty this cache line once */
+ ACCESS_ONCE(ring->last_nr_txbb) = last_nr_txbb;
+ ACCESS_ONCE(ring->cons) = ring_cons + txbbs_skipped;
+
+ netdev_tx_completed_queue(ring->tx_queue, packets, bytes);
+
+ /*
+ * Wake up the Tx queue if it was stopped and at least one packet
+ * was completed
+ */
+ if (netif_tx_queue_stopped(ring->tx_queue) &&
+ (ring->prod - ring->cons) <=
+ (ring->size - HEADROOM - MAX_DESC_TXBBS)) {
+ netif_tx_wake_queue(ring->tx_queue);
+ ring->wake_queue++;
+ }
+ return done < budget;
+}
+
+void mlx4_en_tx_irq(struct mlx4_cq *mcq)
+{
+ struct mlx4_en_cq *cq = container_of(mcq, struct mlx4_en_cq, mcq);
+ struct mlx4_en_priv *priv = netdev_priv(cq->dev);
+
+ if (likely(priv->port_up))
+#ifdef HAVE_NAPI_SCHEDULE_IRQOFF
+ napi_schedule_irqoff(&cq->napi);
+#else
+ napi_schedule(&cq->napi);
+#endif
+ else
+ mlx4_en_arm_cq(priv, cq);
+}
+
+/* TX CQ polling - called by NAPI */
+int mlx4_en_poll_tx_cq(struct napi_struct *napi, int budget)
+{
+ struct mlx4_en_cq *cq = container_of(napi, struct mlx4_en_cq, napi);
+ struct net_device *dev = cq->dev;
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ int clean_complete;
+
+ clean_complete = mlx4_en_process_tx_cq(dev, cq);
+ if (!clean_complete)
+ return budget;
+
+ napi_complete(napi);
+ mlx4_en_arm_cq(priv, cq);
+
+ return 0;
+}
+
+static struct mlx4_en_tx_desc *mlx4_en_bounce_to_desc(struct mlx4_en_priv *priv,
+ struct mlx4_en_tx_ring *ring,
+ u32 index,
+ unsigned int desc_size)
+{
+ u32 copy = (ring->size - index) * TXBB_SIZE;
+ int i;
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ __be32 owner_bit = (ring->prod & ring->size) ?
+ cpu_to_be32(MLX4_EN_BIT_DESC_OWN) : 0;
+#endif
+
+ for (i = desc_size - copy - 4; i >= 0; i -= 4) {
+ if ((i & (TXBB_SIZE - 1)) == 0) {
+ wmb();
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ *((u32 *) (ring->buf + i)) =
+ (*((u32 *) (ring->bounce_buf + copy + i)) &
+ WQE_FORMAT_1_MASK) |
+ owner_bit;
+ continue;
+#endif
+ }
+
+ *((u32 *) (ring->buf + i)) =
+ *((u32 *) (ring->bounce_buf + copy + i));
+ }
+
+ for (i = copy - 4; i >= 4; i -= 4) {
+ if ((i & (TXBB_SIZE - 1)) == 0)
+ wmb();
+
+ *((u32 *) (ring->buf + index * TXBB_SIZE + i)) =
+ *((u32 *) (ring->bounce_buf + i));
+ }
+
+ /* Return real descriptor location */
+ return ring->buf + index * TXBB_SIZE;
+}
+
+/* Decide if skb can be inlined in tx descriptor to avoid dma mapping
+ *
+ * It seems strange we do not simply use skb_copy_bits().
+ * This would allow inlining all skbs iff skb->len <= inline_thold
+ *
+ * Note that caller already checked skb was not a gso packet
+ */
+static bool is_inline(int inline_thold, const struct sk_buff *skb,
+ const struct skb_shared_info *shinfo,
+ void **pfrag)
+{
+ void *ptr;
+
+ if (skb->len > inline_thold || !inline_thold)
+ return false;
+
+ if (shinfo->nr_frags == 1) {
+ ptr = skb_frag_address_safe(&shinfo->frags[0]);
+ if (unlikely(!ptr))
+ return false;
+ *pfrag = ptr;
+ return true;
+ }
+ if (shinfo->nr_frags)
+ return false;
+ return true;
+}
+
+static int inline_size(const struct sk_buff *skb)
+{
+ if (skb->len + CTRL_SIZE + sizeof(struct mlx4_wqe_inline_seg)
+ <= MLX4_INLINE_ALIGN)
+ return ALIGN(skb->len + CTRL_SIZE +
+ sizeof(struct mlx4_wqe_inline_seg), 16);
+ else
+ return ALIGN(skb->len + CTRL_SIZE + 2 *
+ sizeof(struct mlx4_wqe_inline_seg), 16);
+}
+
+static int get_real_size(const struct sk_buff *skb,
+ const struct skb_shared_info *shinfo,
+ struct net_device *dev, int *lso_header_size,
+ bool *inline_ok, void **pfrag)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ int real_size;
+
+ if (shinfo->gso_size) {
+ *inline_ok = false;
+#ifdef HAVE_SKB_INNER_TRANSPORT_HEADER
+ if (skb->encapsulation)
+ *lso_header_size = (skb_inner_transport_header(skb) - skb->data) + inner_tcp_hdrlen(skb);
+ else
+#endif
+ *lso_header_size = skb_transport_offset(skb) + tcp_hdrlen(skb);
+
+ real_size = CTRL_SIZE + shinfo->nr_frags * DS_SIZE +
+ GET_LSO_SEG_SIZE_EN(*lso_header_size);
+ if (unlikely(*lso_header_size != skb_headlen(skb))) {
+ /* We add a segment for the skb linear buffer only if
+ * it contains data */
+ if (*lso_header_size < skb_headlen(skb)) {
+ real_size += DS_SIZE;
+ } else {
+ if (netif_msg_tx_err(priv))
+ en_warn(priv, "Non-linear headers\n");
+ return 0;
+ }
+ }
+ } else {
+ *lso_header_size = 0;
+ *inline_ok = is_inline(priv->prof->inline_thold, skb,
+ shinfo, pfrag);
+
+ if (*inline_ok)
+ real_size = inline_size(skb);
+ else
+ real_size = CTRL_SIZE +
+ (shinfo->nr_frags + 1) * DS_SIZE;
+ }
+
+ return real_size;
+}
+
+/* If we are working with wqe format 0, owner_bit must be 0 */
+static inline void build_inline_wqe(struct mlx4_en_tx_desc *tx_desc,
+ const struct sk_buff *skb,
+ const struct skb_shared_info *shinfo,
+ int real_size, u16 *vlan_tag, int tx_ind,
+ void *fragptr, __be32 owner_bit)
+{
+ struct mlx4_wqe_inline_seg *inl = &tx_desc->inl;
+ int spc = MLX4_INLINE_ALIGN - CTRL_SIZE - sizeof *inl;
+ unsigned int hlen = skb_headlen(skb);
+
+ if (skb->len <= spc) {
+ if (likely(skb->len >= MIN_PKT_LEN)) {
+ inl->byte_count = SET_BYTE_COUNT((1 << 31 | skb->len));
+ } else {
+ inl->byte_count = SET_BYTE_COUNT((1 << 31 |
+ MIN_PKT_LEN));
+
+ memset(((void *)(inl + 1)) + skb->len, 0,
+ MIN_PKT_LEN - skb->len);
+ }
+ skb_copy_from_linear_data(skb, inl + 1, hlen);
+ if (shinfo->nr_frags)
+ memcpy(((void *)(inl + 1)) + hlen, fragptr,
+ skb_frag_size(&shinfo->frags[0]));
+
+ } else {
+ inl->byte_count = SET_BYTE_COUNT((1 << 31 | spc));
+ if (hlen <= spc) {
+ skb_copy_from_linear_data(skb, inl + 1, hlen);
+ if (hlen < spc) {
+ memcpy(((void *)(inl + 1)) + hlen,
+ fragptr, spc - hlen);
+ fragptr += spc - hlen;
+ }
+ inl = (void *) (inl + 1) + spc;
+ memcpy(((void *)(inl + 1)), fragptr, skb->len - spc);
+ } else {
+ skb_copy_from_linear_data(skb, inl + 1, spc);
+ inl = (void *) (inl + 1) + spc;
+ skb_copy_from_linear_data_offset(skb, spc, inl + 1,
+ hlen - spc);
+ if (shinfo->nr_frags)
+ memcpy(((void *)(inl + 1)) + hlen - spc,
+ fragptr,
+ skb_frag_size(&shinfo->frags[0]));
+ }
+
+ wmb();
+ inl->byte_count = SET_BYTE_COUNT((1 << 31 | (skb->len - spc)));
+ }
+}
+
+#if defined(NDO_SELECT_QUEUE_HAS_ACCEL_PRIV) || defined(HAVE_SELECT_QUEUE_FALLBACK_T)
+u16 mlx4_en_select_queue(struct net_device *dev, struct sk_buff *skb,
+#ifdef HAVE_SELECT_QUEUE_FALLBACK_T
+ void *accel_priv, select_queue_fallback_t fallback)
+#else
+ void *accel_priv)
+#endif
+#else /* NDO_SELECT_QUEUE_HAS_ACCEL_PRIV || HAVE_SELECT_QUEUE_FALLBACK_T */
+u16 mlx4_en_select_queue(struct net_device *dev, struct sk_buff *skb)
+#endif
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ u16 rings_p_up = priv->num_tx_rings_p_up;
+#endif
+ u8 up = 0;
+
+#ifdef HAVE_NEW_TX_RING_SCHEME
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,39))
+ if (dev->num_tc)
+#else
+ if (netdev_get_num_tc(dev))
+#endif
+ return skb_tx_hash(dev, skb);
+
+ if (skb_vlan_tag_present(skb))
+ up = skb_vlan_tag_get(skb) >> VLAN_PRIO_SHIFT;
+
+#ifdef HAVE_SELECT_QUEUE_FALLBACK_T
+ return fallback(dev, skb) % rings_p_up + up * rings_p_up;
+#else
+ return __netdev_pick_tx(dev, skb) % rings_p_up + up * rings_p_up;
+#endif
+#else /* HAVE_NEW_TX_RING_SCHEME */
+ /* If we support per priority flow control and the packet contains
+ * a vlan tag, send the packet to the TX ring assigned to that priority
+ */
+ if (priv->prof->rx_ppp)
+ return MLX4_EN_NUM_TX_RINGS + up;
+
+ return __netdev_pick_tx(dev, skb);
+#endif
+}
+
+static void mlx4_bf_copy(void __iomem *dst, const void *src,
+ unsigned int bytecnt)
+{
+ __iowrite64_copy(dst, src, bytecnt / 8);
+}
+
+netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+ struct skb_shared_info *shinfo = skb_shinfo(skb);
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct device *ddev = priv->ddev;
+ struct mlx4_en_tx_ring *ring;
+ struct mlx4_en_tx_desc *tx_desc;
+ struct mlx4_wqe_data_seg *data;
+ struct mlx4_en_tx_info *tx_info;
+ int tx_ind = 0;
+ int nr_txbb;
+ int desc_size;
+ int real_size;
+ u32 index, bf_index;
+ __be32 op_own;
+ u16 vlan_tag = 0;
+ int i_frag;
+ int lso_header_size;
+ void *fragptr = NULL;
+ bool bounce = false;
+ bool send_doorbell;
+ bool stop_queue;
+ bool inline_ok;
+ u32 ring_cons;
+ __be32 owner_bit;
+
+ if (!priv->port_up)
+ goto tx_drop;
+
+ tx_ind = skb_get_queue_mapping(skb);
+ ring = priv->tx_ring[tx_ind];
+
+ owner_bit = (ring->prod & ring->size) ?
+ cpu_to_be32(MLX4_EN_BIT_DESC_OWN) : 0;
+
+ /* fetch ring->cons far ahead before needing it to avoid stall */
+ ring_cons = ACCESS_ONCE(ring->cons);
+
+ real_size = get_real_size(skb, shinfo, dev, &lso_header_size,
+ &inline_ok, &fragptr);
+ if (unlikely(!real_size))
+ goto tx_drop;
+
+ /* Align descriptor to TXBB size */
+ desc_size = ALIGN(real_size, TXBB_SIZE);
+ nr_txbb = desc_size / TXBB_SIZE;
+ if (unlikely(nr_txbb > MAX_DESC_TXBBS)) {
+ if (netif_msg_tx_err(priv))
+ en_warn(priv, "Oversized header or SG list\n");
+ goto tx_drop;
+ }
+
+ if (skb_vlan_tag_present(skb))
+ vlan_tag = skb_vlan_tag_get(skb);
+
+
+#ifdef HAVE_NETDEV_TXQ_BQL_PREFETCHW
+ netdev_txq_bql_enqueue_prefetchw(ring->tx_queue);
+#else
+#ifdef CONFIG_BQL
+ prefetchw(&ring->tx_queue->dql);
+#endif
+#endif
+
+ /* Track current inflight packets for performance analysis */
+ AVG_PERF_COUNTER(priv->pstats.inflight_avg,
+ (u32)(ring->prod - ring_cons - 1));
+
+ /* Packet is good - grab an index and transmit it */
+ index = ring->prod & ring->size_mask;
+ bf_index = ring->prod;
+
+ /* See if we have enough space for whole descriptor TXBB for setting
+ * SW ownership on next descriptor; if not, use a bounce buffer. */
+ if (likely(index + nr_txbb <= ring->size)) {
+ tx_desc = ring->buf + index * TXBB_SIZE;
+ } else {
+ tx_desc = (struct mlx4_en_tx_desc *) ring->bounce_buf;
+ bounce = true;
+ }
+
+ /* Save skb in tx_info ring */
+ tx_info = &ring->tx_info[index];
+ tx_info->skb = skb;
+ tx_info->nr_txbb = nr_txbb;
+
+ data = &tx_desc->data;
+ if (lso_header_size)
+ data = ((void *)&tx_desc->lso +
+ GET_LSO_SEG_SIZE_EN(lso_header_size));
+
+ /* valid only for non-inline segments */
+ tx_info->data_offset = (void *)data - (void *)tx_desc;
+
+ tx_info->inl = inline_ok;
+
+ tx_info->linear = (lso_header_size < skb_headlen(skb) &&
+ !inline_ok) ? 1 : 0;
+
+ tx_info->nr_maps = shinfo->nr_frags + tx_info->linear;
+ data += tx_info->nr_maps - 1;
+
+ if (!tx_info->inl) {
+ dma_addr_t dma = 0;
+ u32 byte_count = 0;
+
+ /* Map fragments if any */
+ for (i_frag = shinfo->nr_frags - 1; i_frag >= 0; i_frag--) {
+ const struct skb_frag_struct *frag;
+
+ frag = &shinfo->frags[i_frag];
+ byte_count = skb_frag_size(frag);
+ dma = skb_frag_dma_map(ddev, frag,
+ 0, byte_count,
+ DMA_TO_DEVICE);
+ if (dma_mapping_error(ddev, dma))
+ goto tx_drop_unmap;
+
+ data->addr = cpu_to_be64(dma);
+ data->lkey = ring->mr_key;
+ wmb();
+ data->byte_count = SET_BYTE_COUNT(byte_count);
+ --data;
+ }
+
+ /* Map linear part if needed */
+ if (tx_info->linear) {
+ byte_count = skb_headlen(skb) - lso_header_size;
+
+ dma = dma_map_single(ddev, skb->data +
+ lso_header_size, byte_count,
+ PCI_DMA_TODEVICE);
+ if (dma_mapping_error(ddev, dma))
+ goto tx_drop_unmap;
+
+ data->addr = cpu_to_be64(dma);
+ data->lkey = ring->mr_key;
+ wmb();
+ data->byte_count = SET_BYTE_COUNT(byte_count);
+ }
+ /* tx completion can avoid cache line miss for common cases */
+ tx_info->map0_dma = dma;
+ tx_info->map0_byte_count = byte_count;
+ }
+
+ /*
+ * For timestamping add flag to skb_shinfo and
+ * set flag for further reference
+ */
+ tx_info->ts_requested = 0;
+ if (unlikely(ring->hwtstamp_tx_type == HWTSTAMP_TX_ON &&
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 37)
+ shinfo->tx_flags & SKBTX_HW_TSTAMP)) {
+ shinfo->tx_flags |= SKBTX_IN_PROGRESS;
+#else
+ shinfo->tx_flags.flags & SKBTX_HW_TSTAMP)) {
+ shinfo->tx_flags.flags |= SKBTX_IN_PROGRESS;
+#endif
+ tx_info->ts_requested = 1;
+ }
+
+ /* Prepare ctrl segment apart from opcode+ownership, which depend on
+ * whether LSO is used */
+ tx_desc->ctrl.srcrb_flags = priv->ctrl_flags;
+ if (likely(skb->ip_summed == CHECKSUM_PARTIAL)) {
+#ifdef HAVE_SK_BUFF_ENCAPSULATION
+ if (!skb->encapsulation)
+ tx_desc->ctrl.srcrb_flags |= cpu_to_be32(MLX4_WQE_CTRL_IP_CSUM |
+ MLX4_WQE_CTRL_TCP_UDP_CSUM);
+ else
+ tx_desc->ctrl.srcrb_flags |= cpu_to_be32(MLX4_WQE_CTRL_IP_CSUM);
+#else
+ tx_desc->ctrl.srcrb_flags |= cpu_to_be32(MLX4_WQE_CTRL_IP_CSUM |
+ MLX4_WQE_CTRL_TCP_UDP_CSUM);
+#endif
+ ring->tx_csum++;
+ }
+
+ if (priv->flags & MLX4_EN_FLAG_ENABLE_HW_LOOPBACK) {
+ struct ethhdr *ethh;
+
+ /* Copy dst mac address to wqe. This allows loopback in eSwitch,
+ * so that VFs and PF can communicate with each other
+ */
+ ethh = (struct ethhdr *)skb->data;
+ tx_desc->ctrl.srcrb_flags16[0] = get_unaligned((__be16 *)ethh->h_dest);
+ tx_desc->ctrl.imm = get_unaligned((__be32 *)(ethh->h_dest + 2));
+ }
+
+ /* Handle LSO (TSO) packets */
+ if (lso_header_size) {
+ int i;
+
+ /* Mark opcode as LSO */
+ op_own = cpu_to_be32(MLX4_OPCODE_LSO | MLX4_WQE_CTRL_RR);
+
+ /* Fill in the LSO prefix */
+ tx_desc->lso.mss_hdr_size = cpu_to_be32(
+ shinfo->gso_size << 16 | lso_header_size);
+
+ /* Copy headers;
+ * note that we already verified that it is linear */
+ copy_lso_header(tx_desc->lso.header, skb->data,
+ lso_header_size, owner_bit);
+
+ ring->tso_packets++;
+
+ i = ((skb->len - lso_header_size) / shinfo->gso_size) +
+ !!((skb->len - lso_header_size) % shinfo->gso_size);
+ tx_info->nr_bytes = skb->len + (i - 1) * lso_header_size;
+ ring->packets += i;
+ } else {
+ /* Normal (Non LSO) packet */
+ op_own = cpu_to_be32(MLX4_OPCODE_SEND);
+ tx_info->nr_bytes = max_t(unsigned int, skb->len, ETH_ZLEN);
+ ring->packets++;
+ }
+ ring->bytes += tx_info->nr_bytes;
+ netdev_tx_sent_queue(ring->tx_queue, tx_info->nr_bytes);
+ AVG_PERF_COUNTER(priv->pstats.tx_pktsz_avg, skb->len);
+ if (tx_info->inl)
+ build_inline_wqe(tx_desc, skb, shinfo, real_size, &vlan_tag,
+ tx_ind, fragptr, owner_bit);
+
+#ifdef HAVE_SKB_INNER_NETWORK_HEADER
+ if (skb->encapsulation) {
+ struct iphdr *ipv4 = (struct iphdr *)skb_inner_network_header(skb);
+ if (ipv4->protocol == IPPROTO_TCP || ipv4->protocol == IPPROTO_UDP)
+ op_own |= cpu_to_be32(MLX4_WQE_CTRL_IIP | MLX4_WQE_CTRL_ILP);
+ else
+ op_own |= cpu_to_be32(MLX4_WQE_CTRL_IIP);
+ }
+#endif
+
+ op_own |= owner_bit;
+
+ ring->prod += nr_txbb;
+
+ /* If we used a bounce buffer then copy descriptor back into place */
+ if (unlikely(bounce))
+ tx_desc = mlx4_en_bounce_to_desc(priv, ring, index, desc_size);
+
+ skb_tx_timestamp(skb);
+
+ /* Check available TXBBs and 2K spare for prefetch */
+ stop_queue = (int)(ring->prod - ring_cons) >
+ ring->size - HEADROOM - MAX_DESC_TXBBS;
+ if (unlikely(stop_queue)) {
+ netif_tx_stop_queue(ring->tx_queue);
+ ring->queue_stopped++;
+ }
+#ifdef HAVE_SK_BUFF_XMIT_MORE
+ send_doorbell = !skb->xmit_more || netif_xmit_stopped(ring->tx_queue);
+#else
+ send_doorbell = true;
+#endif
+
+ real_size = (real_size / 16) & 0x3f;
+
+ if (ring->bf_enabled && desc_size <= MAX_BF && !bounce &&
+ !skb_vlan_tag_present(skb) && send_doorbell) {
+ tx_desc->ctrl.bf_qpn = ring->doorbell_qpn |
+ cpu_to_be32(real_size);
+
+ op_own |= htonl((bf_index & 0xffff) << 8);
+ /* Ensure new descriptor hits memory
+ * before setting ownership of this descriptor to HW
+ */
+ wmb();
+ tx_desc->ctrl.owner_opcode = op_own;
+
+ wmb();
+
+ mlx4_bf_copy(ring->bf.reg + ring->bf.offset, &tx_desc->ctrl,
+ desc_size);
+
+ wmb();
+
+ ring->bf.offset ^= ring->bf.buf_size;
+ } else {
+ tx_desc->ctrl.vlan_tag = cpu_to_be16(vlan_tag);
+ tx_desc->ctrl.ins_vlan = MLX4_WQE_CTRL_INS_VLAN *
+ !!skb_vlan_tag_present(skb);
+ tx_desc->ctrl.fence_size = real_size;
+
+ /* Ensure new descriptor hits memory
+ * before setting ownership of this descriptor to HW
+ */
+ wmb();
+ tx_desc->ctrl.owner_opcode = op_own;
+ if (send_doorbell) {
+ wmb();
+ /* Since there is no iowrite*_native() that writes the
+ * value as is, without byteswapping, use the one
+ * that doesn't do byteswapping in the relevant arch
+ * endianness.
+ */
+#if defined(__LITTLE_ENDIAN)
+ iowrite32(
+#else
+ iowrite32be(
+#endif
+ ring->doorbell_qpn,
+ ring->bf.uar->map + MLX4_SEND_DOORBELL);
+#ifdef HAVE_SK_BUFF_XMIT_MORE
+ } else {
+ ring->xmit_more++;
+#endif
+ }
+ }
+
+ if (unlikely(stop_queue)) {
+ /* If the queue was emptied after the if (stop_queue) check and before
+ * the netif_tx_stop_queue(), we need to wake the queue,
+ * or else it will remain stopped forever.
+ * Need a memory barrier to make sure ring->cons was not
+ * updated before queue was stopped.
+ */
+ smp_rmb();
+
+ ring_cons = ACCESS_ONCE(ring->cons);
+ if (unlikely(((int)(ring->prod - ring_cons)) <=
+ ring->size - HEADROOM - MAX_DESC_TXBBS)) {
+ netif_tx_wake_queue(ring->tx_queue);
+ ring->wake_queue++;
+ }
+ }
+ return NETDEV_TX_OK;
+
+tx_drop_unmap:
+ en_err(priv, "DMA mapping error\n");
+
+ while (++i_frag < shinfo->nr_frags) {
+ ++data;
+ dma_unmap_page(ddev, (dma_addr_t) be64_to_cpu(data->addr),
+ be32_to_cpu(data->byte_count &
+ DS_BYTE_COUNT_MASK),
+ PCI_DMA_TODEVICE);
+ }
+
+tx_drop:
+ dev_kfree_skb_any(skb);
+ priv->stats.tx_dropped++;
+ return NETDEV_TX_OK;
+}
+
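
Note on the ring arithmetic in en_tx.c above: prod and cons are
free-running 32-bit counters, the slot is counter & size_mask, the
ownership parity is !!(counter & size), and a descriptor that would run
past the end of the ring goes through the bounce buffer. The sketch below
restates that arithmetic in standalone C. It is an illustration only, not
part of the patch: struct tx_ring_sketch and the ring_sketch_* helpers are
hypothetical names, and a power-of-two ring size is assumed (the use of
size_mask suggests this, but the sketch does not depend on driver code).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the few mlx4_en_tx_ring fields used here. */
struct tx_ring_sketch {
	uint32_t prod;      /* free-running producer counter, in TXBBs */
	uint32_t cons;      /* free-running consumer counter, in TXBBs */
	uint32_t size;      /* ring size in TXBBs; assumed power of two */
	uint32_t size_mask; /* size - 1 */
};

static void ring_sketch_init(struct tx_ring_sketch *r, uint32_t size)
{
	/* The masking below is only valid for a power-of-two size. */
	assert(size != 0 && (size & (size - 1)) == 0);
	r->prod = 0;
	r->cons = 0;
	r->size = size;
	r->size_mask = size - 1;
}

/* Slot in the descriptor buffer for a free-running counter value. */
static uint32_t ring_sketch_index(const struct tx_ring_sketch *r,
				  uint32_t counter)
{
	return counter & r->size_mask;
}

/* Ownership parity: flips each time the counter passes the ring end. */
static unsigned int ring_sketch_owner(const struct tx_ring_sketch *r,
				      uint32_t counter)
{
	return !!(counter & r->size);
}

/* TXBBs in flight; unsigned subtraction survives 32-bit wraparound. */
static uint32_t ring_sketch_inflight(const struct tx_ring_sketch *r)
{
	return r->prod - r->cons;
}

/* Same condition as the bounce-buffer test in mlx4_en_xmit(). */
static int ring_sketch_needs_bounce(const struct tx_ring_sketch *r,
				    uint32_t nr_txbb)
{
	return ring_sketch_index(r, r->prod) + nr_txbb > r->size;
}
```

With an 8-entry ring, counter value 9 maps to slot 1, the owner parity is
0 for counters 0..7 and 1 for 8..15, and a 3-TXBB descriptor starting at
slot 6 needs the bounce buffer while a 2-TXBB one does not.
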
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/en_tx_uio.c b/drivers/net/mlnx_uio/mlnx/mlx4/en_tx_uio.c
new file mode 100644
index 0000000..f95fc14
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/en_tx_uio.c
@@ -0,0 +1,47 @@
+/*
+ * en_tx_uio.c
+ *
+ * Created on: Jul 1, 2015
+ * Author: leeopop
+ */
+
+#include "mlx4_en.h"
+#include "mlx4_uio_helper.h"
+
+int mlx4_en_activate_tx_ring(struct mlx4_en_priv *priv,
+ struct mlx4_en_tx_ring *ring,
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ int cq, int user_prio)
+#else
+ int cq)
+#endif
+{
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int err;
+
+ ring->cqn = cq;
+ ring->prod = 0;
+ ring->cons = 0xffffffff;
+ ring->last_nr_txbb = 1;
+ memset(ring->tx_info, 0, ring->size * sizeof(struct mlx4_en_tx_info));
+ memset(ring->buf, 0, ring->buf_size);
+
+ ring->qp_state = MLX4_QP_STATE_RST;
+ ring->doorbell_qpn = cpu_to_be32(ring->qp.qpn << 8);
+ ring->mr_key = cpu_to_be32(mdev->mr.key);
+
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ mlx4_en_fill_qp_context(priv, ring->size, ring->stride, 1, 0, ring->qpn,
+ ring->cqn, user_prio, &ring->context);
+#else
+ mlx4_en_fill_qp_context(priv, ring->size, ring->stride, 1, 0, ring->qpn,
+ ring->cqn, &ring->context);
+#endif
+ if (ring->bf_alloced)
+ ring->context.usr_page = cpu_to_be32(ring->bf.uar->index);
+
+ err = mlx4_qp_to_ready(mdev->dev, &ring->wqres.mtt, &ring->context,
+ &ring->qp, &ring->qp_state);
+
+ return err;
+}
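
Note on the initial values set in mlx4_en_activate_tx_ring() above
(cons = 0xffffffff, last_nr_txbb = 1): in the completion path of en_tx.c
the consumer counter points at the last freed descriptor rather than past
it, and the next descriptor is found by adding last_nr_txbb. Starting at
0xffffffff with a one-TXBB "previous" descriptor therefore makes the first
completion land on slot 0. The sketch below is a simplified model of that
bookkeeping under my reading of the code, not the driver's implementation;
struct cons_sketch and the cons_sketch_* helpers are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical model of ring->cons and ring->last_nr_txbb. */
struct cons_sketch {
	uint32_t cons;         /* counter of the last freed descriptor */
	uint32_t last_nr_txbb; /* size of that descriptor, in TXBBs */
};

/* Same starting values as mlx4_en_activate_tx_ring(). */
static void cons_sketch_init(struct cons_sketch *c)
{
	c->cons = 0xffffffffu;
	c->last_nr_txbb = 1;
}

/*
 * Complete one descriptor of nr_txbb TXBBs: step over the previously
 * freed descriptor, remember the size of this one, and return the ring
 * slot that was freed. size_mask is assumed to be (power of two) - 1.
 */
static uint32_t cons_sketch_complete(struct cons_sketch *c,
				     uint32_t size_mask, uint32_t nr_txbb)
{
	c->cons += c->last_nr_txbb;
	c->last_nr_txbb = nr_txbb;
	return c->cons & size_mask;
}

/*
 * First TXBB not yet reclaimed; this is the value mlx4_en_free_tx_buf()
 * obtains when it skips the last polled descriptor.
 */
static uint32_t cons_sketch_next(const struct cons_sketch *c)
{
	return c->cons + c->last_nr_txbb;
}
```

Right after initialisation cons_sketch_next() is 0, which equals the
initial prod, so the ring reads as empty even though cons itself is
0xffffffff.
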
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/eq.c b/drivers/net/mlnx_uio/mlnx/mlx4/eq.c
new file mode 100644
index 0000000..337c949
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/eq.c
@@ -0,0 +1,1777 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2005, 2006, 2007, 2008 Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007 Cisco Systems, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx4/device.h"
+#include "log2.h"
+
+#include "mlx4.h"
+#include "fw.h"
+
+enum {
+ MLX4_IRQNAME_SIZE = 32
+};
+
+enum {
+ MLX4_NUM_ASYNC_EQE = 0x100,
+ MLX4_NUM_SPARE_EQE = 0x80,
+ MLX4_EQ_ENTRY_SIZE = 0x20
+};
+
+#define MLX4_EQ_STATUS_OK ( 0 << 28)
+#define MLX4_EQ_STATUS_WRITE_FAIL (10 << 28)
+#define MLX4_EQ_OWNER_SW ( 0 << 24)
+#define MLX4_EQ_OWNER_HW ( 1 << 24)
+#define MLX4_EQ_FLAG_EC ( 1 << 18)
+#define MLX4_EQ_FLAG_OI ( 1 << 17)
+#define MLX4_EQ_STATE_ARMED ( 9 << 8)
+#define MLX4_EQ_STATE_FIRED (10 << 8)
+#define MLX4_EQ_STATE_ALWAYS_ARMED (11 << 8)
+
+#define MLX4_ASYNC_EVENT_MASK ((1ull << MLX4_EVENT_TYPE_PATH_MIG) | \
+ (1ull << MLX4_EVENT_TYPE_COMM_EST) | \
+ (1ull << MLX4_EVENT_TYPE_SQ_DRAINED) | \
+ (1ull << MLX4_EVENT_TYPE_CQ_ERROR) | \
+ (1ull << MLX4_EVENT_TYPE_WQ_CATAS_ERROR) | \
+ (1ull << MLX4_EVENT_TYPE_EEC_CATAS_ERROR) | \
+ (1ull << MLX4_EVENT_TYPE_PATH_MIG_FAILED) | \
+ (1ull << MLX4_EVENT_TYPE_WQ_INVAL_REQ_ERROR) | \
+ (1ull << MLX4_EVENT_TYPE_WQ_ACCESS_ERROR) | \
+ (1ull << MLX4_EVENT_TYPE_PORT_CHANGE) | \
+ (1ull << MLX4_EVENT_TYPE_ECC_DETECT) | \
+ (1ull << MLX4_EVENT_TYPE_SRQ_CATAS_ERROR) | \
+ (1ull << MLX4_EVENT_TYPE_SRQ_QP_LAST_WQE) | \
+ (1ull << MLX4_EVENT_TYPE_SRQ_LIMIT) | \
+ (1ull << MLX4_EVENT_TYPE_CMD) | \
+ (1ull << MLX4_EVENT_TYPE_OP_REQUIRED) | \
+ (1ull << MLX4_EVENT_TYPE_COMM_CHANNEL) | \
+ (1ull << MLX4_EVENT_TYPE_FLR_EVENT) | \
+ (1ull << MLX4_EVENT_TYPE_FATAL_WARNING))
+
+static u64 get_async_ev_mask(struct mlx4_dev *dev)
+{
+ u64 async_ev_mask = MLX4_ASYNC_EVENT_MASK;
+ if (dev->caps.flags & MLX4_DEV_CAP_FLAG_PORT_MNG_CHG_EV)
+ async_ev_mask |= (1ull << MLX4_EVENT_TYPE_PORT_MNG_CHG_EVENT);
+ if (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_RECOVERABLE_ERROR_EVENT)
+ async_ev_mask |= (1ull << MLX4_EVENT_TYPE_RECOVERABLE_ERROR_EVENT);
+
+ return async_ev_mask;
+}
+
+static void eq_set_ci(struct mlx4_eq *eq, int req_not)
+{
+ __raw_writel((__force u32) cpu_to_be32((eq->cons_index & 0xffffff) |
+ req_not << 31),
+ eq->doorbell);
+ /* We still want ordering, just not swabbing, so add a barrier */
+ mb();
+}
+
+static struct mlx4_eqe *get_eqe(struct mlx4_eq *eq, u32 entry, u8 eqe_factor,
+ u8 eqe_size)
+{
+ /* (entry & (eq->nent - 1)) gives us a cyclic array */
+ unsigned long offset = (entry & (eq->nent - 1)) * eqe_size;
+ /* CX3 is capable of extending the EQE from 32 to 64 bytes.
+ * When this feature is enabled, the first (in the lower addresses)
+ * 32 bytes in the 64 byte EQE are reserved and the next 32 bytes
+ * contain the legacy EQE information.
+ */
+ return eq->page_list[offset / PAGE_SIZE].buf + (offset + (eqe_factor ? MLX4_EQ_ENTRY_SIZE : 0)) % PAGE_SIZE;
+}
+
+static struct mlx4_eqe *next_eqe_sw(struct mlx4_eq *eq, u8 eqe_factor, u8 size)
+{
+ struct mlx4_eqe *eqe = get_eqe(eq, eq->cons_index, eqe_factor, size);
+ return !!(eqe->owner & 0x80) ^ !!(eq->cons_index & eq->nent) ? NULL : eqe;
+}
+
+static struct mlx4_eqe *next_slave_event_eqe(struct mlx4_slave_event_eq *slave_eq)
+{
+ struct mlx4_eqe *eqe =
+ &slave_eq->event_eqe[slave_eq->cons & (SLAVE_EVENT_EQ_SIZE - 1)];
+ return (!!(eqe->owner & 0x80) ^
+ !!(slave_eq->cons & SLAVE_EVENT_EQ_SIZE)) ?
+ eqe : NULL;
+}
+
+void mlx4_gen_slave_eqe(struct mlx4_mfunc_master_ctx *master)
+{
+ /*
+ struct mlx4_mfunc_master_ctx *master =
+ container_of(work, struct mlx4_mfunc_master_ctx,
+ slave_event_work);
+ */
+ struct mlx4_mfunc *mfunc =
+ container_of(master, struct mlx4_mfunc, master);
+
+ struct mlx4_priv *priv = container_of(mfunc, struct mlx4_priv, mfunc);
+ struct mlx4_dev *dev = &priv->dev;
+ struct mlx4_slave_event_eq *slave_eq = &mfunc->master.slave_eq;
+ struct mlx4_eqe *eqe;
+ u8 slave;
+ int i;
+
+ for (eqe = next_slave_event_eqe(slave_eq); eqe;
+ eqe = next_slave_event_eqe(slave_eq)) {
+ slave = eqe->slave_id;
+
+ /* All active slaves need to receive the event */
+ if (slave == ALL_SLAVES) {
+ for (i = 0; i < dev->num_slaves; i++) {
+ if (mlx4_GEN_EQE(dev, i, eqe))
+ mlx4_warn(dev, "Failed to generate "
+ "event for slave %d\n", i);
+ }
+ } else {
+ if (mlx4_GEN_EQE(dev, slave, eqe))
+ mlx4_warn(dev, "Failed to generate event "
+ "for slave %d\n", slave);
+ }
+ ++slave_eq->cons;
+ }
+}
+
+
+static void slave_event(struct mlx4_dev *dev, u8 slave, struct mlx4_eqe *eqe)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_slave_event_eq *slave_eq = &priv->mfunc.master.slave_eq;
+ struct mlx4_eqe *s_eqe;
+ unsigned long flags;
+
+ spin_lock_irqsave(&slave_eq->event_lock, flags);
+ s_eqe = &slave_eq->event_eqe[slave_eq->prod & (SLAVE_EVENT_EQ_SIZE - 1)];
+ if ((!!(s_eqe->owner & 0x80)) ^
+ (!!(slave_eq->prod & SLAVE_EVENT_EQ_SIZE))) {
+ mlx4_warn(dev, "Master failed to generate an EQE for slave: %d. "
+ "No free EQE on slave events queue\n", slave);
+ spin_unlock_irqrestore(&slave_eq->event_lock, flags);
+ return;
+ }
+
+ memcpy(s_eqe, eqe, dev->caps.eqe_size - 1);
+ s_eqe->slave_id = slave;
+ /* ensure all information is written before setting the ownership bit */
+ wmb();
+ s_eqe->owner = !!(slave_eq->prod & SLAVE_EVENT_EQ_SIZE) ? 0x0 : 0x80;
+ ++slave_eq->prod;
+#ifdef KMOD_DISABLED
+ queue_work(priv->mfunc.master.comm_wq,
+ &priv->mfunc.master.slave_event_work);
+#endif
+ spin_unlock_irqrestore(&slave_eq->event_lock, flags);
+}
+
+static void mlx4_slave_event(struct mlx4_dev *dev, int slave,
+ struct mlx4_eqe *eqe)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ if (slave < 0 || slave >= dev->num_slaves ||
+ slave == dev->caps.function)
+ return;
+
+ if (!priv->mfunc.master.slave_state[slave].active)
+ return;
+
+ slave_event(dev, slave, eqe);
+}
+
+#ifdef KMOD_DISABLED
+static void mlx4_set_eq_affinity_hint(struct mlx4_priv *priv, int vec)
+{
+ int hint_err;
+ struct mlx4_dev *dev = &priv->dev;
+ struct mlx4_eq *eq = &priv->eq_table.eq[vec];
+
+ if (!eq->affinity_mask || cpumask_empty(eq->affinity_mask))
+ return;
+
+ hint_err = irq_set_affinity_hint(eq->irq, eq->affinity_mask);
+
+ if (hint_err) {
+ switch (hint_err) {
+ case -EINVAL:
+ mlx4_warn(dev, "irq_set_affinity_hint failed\n");
+ break;
+ case -ENOSYS:
+ mlx4_dbg(dev, "irq_set_affinity_hint not supported\n");
+ break;
+ default:
+ mlx4_dbg(dev, "irq_set_affinity_hint undefined err\n");
+ }
+ }
+}
+#endif
+
+int mlx4_gen_pkey_eqe(struct mlx4_dev *dev, int slave, u8 port)
+{
+ struct mlx4_eqe eqe;
+
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_slave_state *s_slave = &priv->mfunc.master.slave_state[slave];
+
+ if (!s_slave->active)
+ return 0;
+
+ memset(&eqe, 0, sizeof eqe);
+
+ eqe.type = MLX4_EVENT_TYPE_PORT_MNG_CHG_EVENT;
+ eqe.subtype = MLX4_DEV_PMC_SUBTYPE_PKEY_TABLE;
+ eqe.event.port_mgmt_change.port = port;
+
+ return mlx4_GEN_EQE(dev, slave, &eqe);
+}
+EXPORT_SYMBOL(mlx4_gen_pkey_eqe);
+
+int mlx4_gen_guid_change_eqe(struct mlx4_dev *dev, int slave, u8 port)
+{
+ struct mlx4_eqe eqe;
+
+ /* don't send if we don't have that slave */
+ if (dev->persist->num_vfs < slave)
+ return 0;
+ memset(&eqe, 0, sizeof eqe);
+
+ eqe.type = MLX4_EVENT_TYPE_PORT_MNG_CHG_EVENT;
+ eqe.subtype = MLX4_DEV_PMC_SUBTYPE_GUID_INFO;
+ eqe.event.port_mgmt_change.port = port;
+
+ return mlx4_GEN_EQE(dev, slave, &eqe);
+}
+EXPORT_SYMBOL(mlx4_gen_guid_change_eqe);
+
+int mlx4_gen_port_state_change_eqe(struct mlx4_dev *dev, int slave, u8 port,
+ u8 port_subtype_change)
+{
+ struct mlx4_eqe eqe;
+
+ /* don't send if we don't have that slave */
+ if (dev->persist->num_vfs < slave)
+ return 0;
+ memset(&eqe, 0, sizeof eqe);
+
+ eqe.type = MLX4_EVENT_TYPE_PORT_CHANGE;
+ eqe.subtype = port_subtype_change;
+ eqe.event.port_change.port = cpu_to_be32(port << 28);
+
+ mlx4_dbg(dev, "%s: sending: %d to slave: %d on port: %d\n", __func__,
+ port_subtype_change, slave, port);
+ return mlx4_GEN_EQE(dev, slave, &eqe);
+}
+EXPORT_SYMBOL(mlx4_gen_port_state_change_eqe);
+
+enum slave_port_state mlx4_get_slave_port_state(struct mlx4_dev *dev, int slave, u8 port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_slave_state *s_state = priv->mfunc.master.slave_state;
+ struct mlx4_active_ports actv_ports = mlx4_get_active_ports(dev, slave);
+
+ if (slave >= dev->num_slaves || port > dev->caps.num_ports ||
+ port <= 0 || !test_bit(port - 1, actv_ports.ports)) {
+ pr_err("%s: Error: asking for slave:%d, port:%d\n",
+ __func__, slave, port);
+ return SLAVE_PORT_DOWN;
+ }
+ return s_state[slave].port_state[port];
+}
+EXPORT_SYMBOL(mlx4_get_slave_port_state);
+
+static int mlx4_set_slave_port_state(struct mlx4_dev *dev, int slave, u8 port,
+ enum slave_port_state state)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_slave_state *s_state = priv->mfunc.master.slave_state;
+ struct mlx4_active_ports actv_ports = mlx4_get_active_ports(dev, slave);
+
+ if (slave >= dev->num_slaves || port > dev->caps.num_ports ||
+ port <= 0 || !test_bit(port - 1, actv_ports.ports)) {
+ pr_err("%s: Error: asking for slave:%d, port:%d\n",
+ __func__, slave, port);
+ return -1;
+ }
+ s_state[slave].port_state[port] = state;
+
+ return 0;
+}
+
+static void set_all_slave_state(struct mlx4_dev *dev, u8 port, int event)
+{
+ int i;
+ enum slave_port_gen_event gen_event;
+ struct mlx4_slaves_pport slaves_pport = mlx4_phys_to_slaves_pport(dev,
+ port);
+
+ for (i = 0; i < dev->persist->num_vfs + 1; i++)
+ if (test_bit(i, slaves_pport.slaves))
+ set_and_calc_slave_port_state(dev, i, port,
+ event, &gen_event);
+}
+/**************************************************************************
+ The function takes as input the new event for the given port and,
+ according to the previous state, changes the slave's port state.
+ The events are:
+ MLX4_PORT_STATE_DEV_EVENT_PORT_DOWN,
+ MLX4_PORT_STATE_DEV_EVENT_PORT_UP,
+ MLX4_PORT_STATE_IB_PORT_STATE_EVENT_GID_VALID,
+ MLX4_PORT_STATE_IB_EVENT_GID_INVALID
+***************************************************************************/
+int set_and_calc_slave_port_state(struct mlx4_dev *dev, int slave,
+ u8 port, int event,
+ enum slave_port_gen_event *gen_event)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_slave_state *ctx = NULL;
+ unsigned long flags;
+ int ret = -1;
+ struct mlx4_active_ports actv_ports = mlx4_get_active_ports(dev, slave);
+ enum slave_port_state cur_state =
+ mlx4_get_slave_port_state(dev, slave, port);
+
+ *gen_event = SLAVE_PORT_GEN_EVENT_NONE;
+
+ if (slave >= dev->num_slaves || port > dev->caps.num_ports ||
+ port <= 0 || !test_bit(port - 1, actv_ports.ports)) {
+ pr_err("%s: Error: asking for slave:%d, port:%d\n",
+ __func__, slave, port);
+ return ret;
+ }
+
+ ctx = &priv->mfunc.master.slave_state[slave];
+ spin_lock_irqsave(&ctx->lock, flags);
+
+ switch (cur_state) {
+ case SLAVE_PORT_DOWN:
+ if (MLX4_PORT_STATE_DEV_EVENT_PORT_UP == event)
+ mlx4_set_slave_port_state(dev, slave, port,
+ SLAVE_PENDING_UP);
+ break;
+ case SLAVE_PENDING_UP:
+ if (MLX4_PORT_STATE_DEV_EVENT_PORT_DOWN == event)
+ mlx4_set_slave_port_state(dev, slave, port,
+ SLAVE_PORT_DOWN);
+ else if (MLX4_PORT_STATE_IB_PORT_STATE_EVENT_GID_VALID == event) {
+ mlx4_set_slave_port_state(dev, slave, port,
+ SLAVE_PORT_UP);
+ *gen_event = SLAVE_PORT_GEN_EVENT_UP;
+ }
+ break;
+ case SLAVE_PORT_UP:
+ if (MLX4_PORT_STATE_DEV_EVENT_PORT_DOWN == event) {
+ mlx4_set_slave_port_state(dev, slave, port,
+ SLAVE_PORT_DOWN);
+ *gen_event = SLAVE_PORT_GEN_EVENT_DOWN;
+ } else if (MLX4_PORT_STATE_IB_EVENT_GID_INVALID ==
+ event) {
+ mlx4_set_slave_port_state(dev, slave, port,
+ SLAVE_PENDING_UP);
+ *gen_event = SLAVE_PORT_GEN_EVENT_DOWN;
+ }
+ break;
+ default:
+ pr_err("%s: BUG!!! UNKNOWN state: "
+ "slave:%d, port:%d\n", __func__, slave, port);
+ goto out;
+ }
+ ret = mlx4_get_slave_port_state(dev, slave, port);
+
+out:
+ spin_unlock_irqrestore(&ctx->lock, flags);
+ return ret;
+}
+
+EXPORT_SYMBOL(set_and_calc_slave_port_state);
+
+int mlx4_gen_slaves_port_mgt_ev(struct mlx4_dev *dev, u8 port, int attr, u16 sm_lid, u8 sm_sl)
+{
+ struct mlx4_eqe eqe;
+
+ memset(&eqe, 0, sizeof eqe);
+
+ eqe.type = MLX4_EVENT_TYPE_PORT_MNG_CHG_EVENT;
+ eqe.subtype = MLX4_DEV_PMC_SUBTYPE_PORT_INFO;
+ eqe.event.port_mgmt_change.port = port;
+ eqe.event.port_mgmt_change.params.port_info.changed_attr =
+ cpu_to_be32((u32) attr);
+ if (attr & MSTR_SM_CHANGE_MASK) {
+ eqe.event.port_mgmt_change.params.port_info.mstr_sm_lid =
+ cpu_to_be16(sm_lid);
+ eqe.event.port_mgmt_change.params.port_info.mstr_sm_sl =
+ sm_sl;
+ }
+
+ slave_event(dev, ALL_SLAVES, &eqe);
+ return 0;
+}
+EXPORT_SYMBOL(mlx4_gen_slaves_port_mgt_ev);
+
+void mlx4_master_handle_slave_flr(struct mlx4_mfunc_master_ctx *master)
+{
+ /*
+ struct mlx4_mfunc_master_ctx *master =
+ container_of(work, struct mlx4_mfunc_master_ctx,
+ slave_flr_event_work);
+ */
+ struct mlx4_mfunc *mfunc =
+ container_of(master, struct mlx4_mfunc, master);
+ struct mlx4_priv *priv =
+ container_of(mfunc, struct mlx4_priv, mfunc);
+ struct mlx4_dev *dev = &priv->dev;
+ struct mlx4_slave_state *slave_state = priv->mfunc.master.slave_state;
+ int i;
+ int err;
+ unsigned long flags;
+
+ mlx4_dbg(dev, "mlx4_handle_slave_flr\n");
+
+ for (i = 0 ; i < dev->num_slaves; i++) {
+
+ if (MLX4_COMM_CMD_FLR == slave_state[i].last_cmd) {
+ mlx4_dbg(dev, "mlx4_handle_slave_flr: "
+ "clean slave: %d\n", i);
+
+ /* In case of 'Reset flow', FLR can be generated for
+ * a slave before mlx4_load_one is done.
+ * Make sure the interface is up before trying to delete
+ * slave resources which weren't allocated yet.
+ */
+ if (dev->persist->interface_state & MLX4_INTERFACE_STATE_UP)
+ mlx4_delete_all_resources_for_slave(dev, i);
+ /* return the slave to running mode */
+ spin_lock_irqsave(&priv->mfunc.master.slave_state_lock, flags);
+ slave_state[i].last_cmd = MLX4_COMM_CMD_RESET;
+ slave_state[i].is_slave_going_down = 0;
+ spin_unlock_irqrestore(&priv->mfunc.master.slave_state_lock, flags);
+ /* notify the FW */
+ err = mlx4_cmd(dev, 0, i, 0, MLX4_CMD_INFORM_FLR_DONE,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+ if (err)
+ mlx4_warn(dev, "Failed to notify FW on "
+ "FLR done (slave:%d)\n", i);
+ }
+ }
+}
+
+static int mlx4_eq_int(struct mlx4_dev *dev, struct mlx4_eq *eq)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_eqe *eqe;
+ int cqn = -1;
+ int eqes_found = 0;
+ int set_ci = 0;
+ int port;
+ int slave = 0;
+ int ret;
+ u32 flr_slave;
+ u8 update_slave_state;
+ int i;
+ enum slave_port_gen_event gen_event;
+ unsigned long flags;
+#ifdef HAVE_LINKSTATE
+ struct mlx4_vport_state *s_info;
+#endif
+ int eqe_size = dev->caps.eqe_size;
+
+ while ((eqe = next_eqe_sw(eq, dev->caps.eqe_factor, eqe_size))) {
+ /*
+ * Make sure we read EQ entry contents after we've
+ * checked the ownership bit.
+ */
+ rmb();
+
+ switch (eqe->type) {
+ case MLX4_EVENT_TYPE_COMP:
+ cqn = be32_to_cpu(eqe->event.comp.cqn) & 0xffffff;
+ mlx4_cq_completion(dev, cqn);
+ break;
+
+ case MLX4_EVENT_TYPE_PATH_MIG:
+ case MLX4_EVENT_TYPE_COMM_EST:
+ case MLX4_EVENT_TYPE_SQ_DRAINED:
+ case MLX4_EVENT_TYPE_SRQ_QP_LAST_WQE:
+ case MLX4_EVENT_TYPE_WQ_CATAS_ERROR:
+ case MLX4_EVENT_TYPE_PATH_MIG_FAILED:
+ case MLX4_EVENT_TYPE_WQ_INVAL_REQ_ERROR:
+ case MLX4_EVENT_TYPE_WQ_ACCESS_ERROR:
+ mlx4_dbg(dev, "event %d arrived\n", eqe->type);
+ if (mlx4_is_master(dev)) {
+ /* forward only to slave owning the QP */
+ ret = mlx4_get_slave_from_resource_id(dev,
+ RES_QP,
+ be32_to_cpu(eqe->event.qp.qpn)
+ & 0xffffff, &slave);
+ if (ret && ret != -ENOENT) {
+ mlx4_dbg(dev, "QP event %02x(%02x) on "
+ "EQ %d at index %u: could "
+ "not get slave id (%d)\n",
+ eqe->type, eqe->subtype,
+ eq->eqn, eq->cons_index, ret);
+ break;
+ }
+
+ if (!ret && slave != dev->caps.function) {
+ mlx4_slave_event(dev, slave, eqe);
+ break;
+ }
+
+ }
+ mlx4_qp_event(dev, be32_to_cpu(eqe->event.qp.qpn) &
+ 0xffffff, eqe->type);
+ break;
+
+ case MLX4_EVENT_TYPE_SRQ_LIMIT:
+ mlx4_dbg(dev, "%s: MLX4_EVENT_TYPE_SRQ_LIMIT\n",
+ __func__);
+ /* fall through */
+ case MLX4_EVENT_TYPE_SRQ_CATAS_ERROR:
+ if (mlx4_is_master(dev)) {
+ /* forward only to slave owning the SRQ */
+ ret = mlx4_get_slave_from_resource_id(dev,
+ RES_SRQ,
+ be32_to_cpu(eqe->event.srq.srqn)
+ & 0xffffff,
+ &slave);
+ if (ret && ret != -ENOENT) {
+ mlx4_warn(dev, "SRQ event %02x(%02x) "
+ "on EQ %d at index %u: could"
+ " not get slave id (%d)\n",
+ eqe->type, eqe->subtype,
+ eq->eqn, eq->cons_index, ret);
+ break;
+ }
+ mlx4_dbg(dev, "%s: slave:%d, srq_no:0x%x, event: %02x(%02x)\n",
+ __func__, slave,
+ be32_to_cpu(eqe->event.srq.srqn),
+ eqe->type, eqe->subtype);
+
+ if (!ret && slave != dev->caps.function) {
+ mlx4_dbg(dev, "%s: sending event %02x(%02x) to slave:%d\n",
+ __func__, eqe->type,
+ eqe->subtype, slave);
+ mlx4_slave_event(dev, slave, eqe);
+ break;
+ }
+ }
+ mlx4_srq_event(dev, be32_to_cpu(eqe->event.srq.srqn) &
+ 0xffffff, eqe->type);
+ break;
+
+ case MLX4_EVENT_TYPE_CMD:
+ mlx4_cmd_event(dev,
+ be16_to_cpu(eqe->event.cmd.token),
+ eqe->event.cmd.status,
+ be64_to_cpu(eqe->event.cmd.out_param));
+ break;
+
+ case MLX4_EVENT_TYPE_PORT_CHANGE: {
+ struct mlx4_slaves_pport slaves_port;
+ port = be32_to_cpu(eqe->event.port_change.port) >> 28;
+ slaves_port = mlx4_phys_to_slaves_pport(dev, port);
+ if (eqe->subtype == MLX4_PORT_CHANGE_SUBTYPE_DOWN) {
+ mlx4_dispatch_event(dev, MLX4_DEV_EVENT_PORT_DOWN,
+ port);
+ mlx4_priv(dev)->sense.do_sense_port[port] = 1;
+ if (!mlx4_is_master(dev))
+ break;
+ for (i = 0; i < dev->persist->num_vfs + 1; i++) {
+ if (!test_bit(i, slaves_port.slaves))
+ continue;
+ if (dev->caps.port_type[port] == MLX4_PORT_TYPE_ETH) {
+ if (i == mlx4_master_func_num(dev))
+ continue;
+ mlx4_dbg(dev, "%s: Sending MLX4_PORT_CHANGE_SUBTYPE_DOWN"
+ " to slave: %d, port:%d\n",
+ __func__, i, port);
+#ifdef HAVE_LINKSTATE
+ s_info = &priv->mfunc.master.vf_oper[i].vport[port].state;
+ if (IFLA_VF_LINK_STATE_AUTO == s_info->link_state)
+ mlx4_slave_event(dev, i, eqe);
+#else
+ mlx4_slave_event(dev, i, eqe);
+#endif
+ } else { /* IB port */
+ set_and_calc_slave_port_state(dev, i, port,
+ MLX4_PORT_STATE_DEV_EVENT_PORT_DOWN,
+ &gen_event);
+ /* we may be in the pending state, in which case no port_down event is sent */
+ if (SLAVE_PORT_GEN_EVENT_DOWN == gen_event) {
+ if (i == mlx4_master_func_num(dev))
+ continue;
+ mlx4_slave_event(dev, i, eqe);
+ }
+ }
+ }
+ } else {
+ mlx4_dispatch_event(dev, MLX4_DEV_EVENT_PORT_UP, port);
+
+ mlx4_priv(dev)->sense.do_sense_port[port] = 0;
+
+ if (!mlx4_is_master(dev))
+ break;
+ if (dev->caps.port_type[port] == MLX4_PORT_TYPE_ETH)
+ for (i = 0; i < dev->persist->num_vfs + 1; i++) {
+ if (!test_bit(i, slaves_port.slaves))
+ continue;
+ if (i == mlx4_master_func_num(dev))
+ continue;
+#ifdef HAVE_LINKSTATE
+ s_info = &priv->mfunc.master.vf_oper[i].vport[port].state;
+ if (IFLA_VF_LINK_STATE_AUTO == s_info->link_state)
+ mlx4_slave_event(dev, i, eqe);
+#else
+ mlx4_slave_event(dev, i, eqe);
+#endif
+ }
+ else /* IB port */
+ /* port-up event will be sent to a slave when the
+ * slave's alias-guid is set. This is done in alias_GUID.c
+ */
+ set_all_slave_state(dev, port, MLX4_DEV_EVENT_PORT_UP);
+ }
+ break;
+ }
+
+ case MLX4_EVENT_TYPE_CQ_ERROR:
+ mlx4_warn(dev, "CQ %s on CQN %06x\n",
+ eqe->event.cq_err.syndrome == 1 ?
+ "overrun" : "access violation",
+ be32_to_cpu(eqe->event.cq_err.cqn) & 0xffffff);
+ if (mlx4_is_master(dev)) {
+ ret = mlx4_get_slave_from_resource_id(dev,
+ RES_CQ,
+ be32_to_cpu(eqe->event.cq_err.cqn)
+ & 0xffffff, &slave);
+ if (ret && ret != -ENOENT) {
+ mlx4_dbg(dev, "CQ event %02x(%02x) on "
+ "EQ %d at index %u: could "
+ "not get slave id (%d)\n",
+ eqe->type, eqe->subtype,
+ eq->eqn, eq->cons_index, ret);
+ break;
+ }
+
+ if (!ret && slave != dev->caps.function) {
+ mlx4_slave_event(dev, slave, eqe);
+ break;
+ }
+ }
+ mlx4_cq_event(dev,
+ be32_to_cpu(eqe->event.cq_err.cqn)
+ & 0xffffff,
+ eqe->type);
+ break;
+
+ case MLX4_EVENT_TYPE_EQ_OVERFLOW:
+ mlx4_warn(dev, "EQ overrun on EQN %d\n", eq->eqn);
+ break;
+
+ case MLX4_EVENT_TYPE_OP_REQUIRED:
+ atomic_inc(&priv->opreq_count);
+ /* FW commands can't be executed from interrupt context,
+ so the work is done in a deferred task */
+#ifdef KMOD_DISABLED
+ queue_work(mlx4_wq, &priv->opreq_task);
+#endif
+ break;
+
+ case MLX4_EVENT_TYPE_COMM_CHANNEL:
+ if (!mlx4_is_master(dev)) {
+ mlx4_warn(dev, "Received comm channel event "
+ "for non master device\n");
+ break;
+ }
+
+ memcpy(&priv->mfunc.master.comm_arm_bit_vector,
+ eqe->event.comm_channel_arm.bit_vec,
+ sizeof eqe->event.comm_channel_arm.bit_vec);
+#ifdef KMOD_DISABLED
+ if (!queue_work(priv->mfunc.master.comm_wq,
+ &priv->mfunc.master.comm_work))
+ mlx4_warn(dev, "Failed to queue comm channel work\n");
+#endif
+ break;
+
+ case MLX4_EVENT_TYPE_FLR_EVENT:
+ flr_slave = be32_to_cpu(eqe->event.flr_event.slave_id);
+ if (!mlx4_is_master(dev)) {
+ mlx4_warn(dev, "Non-master function received "
+ "FLR event\n");
+ break;
+ }
+
+ mlx4_dbg(dev, "FLR event for slave: %d\n", flr_slave);
+
+ if (flr_slave >= dev->num_slaves) {
+ mlx4_warn(dev,
+ "Got FLR for unknown function: %d\n",
+ flr_slave);
+ update_slave_state = 0;
+ } else
+ update_slave_state = 1;
+
+ spin_lock_irqsave(&priv->mfunc.master.slave_state_lock, flags);
+ if (update_slave_state) {
+ priv->mfunc.master.slave_state[flr_slave].active = false;
+ priv->mfunc.master.slave_state[flr_slave].last_cmd = MLX4_COMM_CMD_FLR;
+ priv->mfunc.master.slave_state[flr_slave].is_slave_going_down = 1;
+ }
+ spin_unlock_irqrestore(&priv->mfunc.master.slave_state_lock, flags);
+ mlx4_dispatch_event(dev, MLX4_DEV_EVENT_SLAVE_SHUTDOWN,
+ flr_slave);
+#ifdef KMOD_DISABLED
+ queue_work(priv->mfunc.master.comm_wq,
+ &priv->mfunc.master.slave_flr_event_work);
+#endif
+ break;
+
+ case MLX4_EVENT_TYPE_FATAL_WARNING:
+ if (eqe->subtype == MLX4_FATAL_WARNING_SUBTYPE_WARMING) {
+ if (mlx4_is_master(dev))
+ for (i = 0; i < dev->num_slaves; i++) {
+ mlx4_dbg(dev, "%s: Sending "
+ "MLX4_FATAL_WARNING_SUBTYPE_WARMING"
+ " to slave: %d\n", __func__, i);
+ if (i == dev->caps.function)
+ continue;
+ mlx4_slave_event(dev, i, eqe);
+ }
+ mlx4_err(dev, "Temperature Threshold was reached! "
+ "Threshold: %d celsius degrees; "
+ "Current Temperature: %d\n",
+ be16_to_cpu(eqe->event.warming.warning_threshold),
+ be16_to_cpu(eqe->event.warming.current_temperature));
+ } else
+ mlx4_warn(dev, "Unhandled event FATAL WARNING (%02x), "
+ "subtype %02x on EQ %d at index %u. owner=%x, "
+ "nent=0x%x, slave=%x, ownership=%s\n",
+ eqe->type, eqe->subtype, eq->eqn,
+ eq->cons_index, eqe->owner, eq->nent,
+ eqe->slave_id,
+ !!(eqe->owner & 0x80) ^
+ !!(eq->cons_index & eq->nent) ? "HW" : "SW");
+
+ break;
+
+ case MLX4_EVENT_TYPE_PORT_MNG_CHG_EVENT:
+ mlx4_dispatch_event(dev, MLX4_DEV_EVENT_PORT_MGMT_CHANGE,
+ (unsigned long) eqe);
+ break;
+
+ case MLX4_EVENT_TYPE_RECOVERABLE_ERROR_EVENT:
+ switch (eqe->subtype) {
+ case MLX4_RECOVERABLE_ERROR_EVENT_SUBTYPE_BAD_CABLE:
+ mlx4_warn(dev, "Bad cable detected on port %u\n",
+ eqe->event.bad_cable.port);
+ break;
+ case MLX4_RECOVERABLE_ERROR_EVENT_SUBTYPE_UNSUPPORTED_CABLE:
+ mlx4_warn(dev, "Unsupported cable detected\n");
+ break;
+ case MLX4_RECOVERABLE_ERROR_EVENT_SUBTYPE_BAD_UNREADABLE_EEPROM:
+ mlx4_warn(dev, "Bad or unreadable EEPROM on port %u\n",
+ eqe->event.bad_cable.port);
+ break;
+ default:
+ mlx4_dbg(dev,
+ "Unhandled recoverable error event detected: %02x(%02x) on EQ %d at index %u. owner=%x, nent=0x%x, ownership=%s\n",
+ eqe->type, eqe->subtype, eq->eqn,
+ eq->cons_index, eqe->owner, eq->nent,
+ !!(eqe->owner & 0x80) ^
+ !!(eq->cons_index & eq->nent) ? "HW" : "SW");
+ break;
+ }
+ break;
+
+ case MLX4_EVENT_TYPE_EEC_CATAS_ERROR:
+ case MLX4_EVENT_TYPE_ECC_DETECT:
+ default:
+ mlx4_warn(dev, "Unhandled event %02x(%02x) on EQ %d at index %u. owner=%x, nent=0x%x, slave=%x, ownership=%s\n",
+ eqe->type, eqe->subtype, eq->eqn,
+ eq->cons_index, eqe->owner, eq->nent,
+ eqe->slave_id,
+ !!(eqe->owner & 0x80) ^
+ !!(eq->cons_index & eq->nent) ? "HW" : "SW");
+ break;
+ }
+
+ ++eq->cons_index;
+ eqes_found = 1;
+ ++set_ci;
+
+ /*
+ * The HCA will think the queue has overflowed if we
+ * don't tell it we've been processing events. We
+ * create our EQs with MLX4_NUM_SPARE_EQE extra
+ * entries, so we must update our consumer index at
+ * least that often.
+ */
+ if (unlikely(set_ci >= MLX4_NUM_SPARE_EQE)) {
+ eq_set_ci(eq, 0);
+ set_ci = 0;
+ }
+ }
+
+ eq_set_ci(eq, 1);
+
+ /* cqn is 24 bits wide but is initialized to -1, so its higher bits
+ * are all ones. Thus, if we got any completion event, cqn's high
+ * bits are cleared and we need to schedule the tasklet.
+ */
+#ifdef KMOD_DISABLED
+ if (!(cqn & ~0xffffff))
+ tasklet_schedule(&eq->tasklet_ctx.task);
+#endif
+
+ return eqes_found;
+}
+
+#ifdef KMOD_MODIFIED
+static int mlx4_interrupt(int irq, void *dev_ptr)
+{
+ struct mlx4_dev *dev = dev_ptr;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int work = 0;
+ int i;
+
+ writel(priv->eq_table.clr_mask, priv->eq_table.clr_int);
+
+ for (i = 0; i < dev->caps.num_comp_vectors + 1; ++i)
+ work |= mlx4_eq_int(dev, &priv->eq_table.eq[i]);
+
+ return work;
+ //return IRQ_RETVAL(work);
+}
+#endif
+#ifdef KMOD_MODIFIED
+static int mlx4_msi_x_interrupt(int irq, void *eq_ptr)
+{
+ struct mlx4_eq *eq = eq_ptr;
+ struct mlx4_dev *dev = eq->dev;
+
+ mlx4_eq_int(dev, eq);
+
+ /* MSI-X vectors always belong to us */
+ //return IRQ_HANDLED;
+ return 0;
+}
+#endif
+
+int mlx4_MAP_EQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_slave_event_eq_info *event_eq =
+ priv->mfunc.master.slave_state[slave].event_eq;
+ u32 in_modifier = vhcr->in_modifier;
+ u32 eqn = in_modifier & 0x3FF;
+ u64 in_param = vhcr->in_param;
+ int err = 0;
+ int i;
+
+ if (slave == dev->caps.function)
+ err = mlx4_cmd(dev, in_param, (in_modifier & 0x80000000) | eqn,
+ 0, MLX4_CMD_MAP_EQ, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_NATIVE);
+ if (!err)
+ for (i = 0; i < MLX4_EVENT_TYPES_NUM; ++i)
+ if (in_param & (1LL << i))
+ event_eq[i].eqn = in_modifier >> 31 ? -1 : eqn;
+
+ return err;
+}
+
+static int mlx4_MAP_EQ(struct mlx4_dev *dev, u64 event_mask, int unmap,
+ int eq_num)
+{
+ return mlx4_cmd(dev, event_mask, (unmap << 31) | eq_num,
+ 0, MLX4_CMD_MAP_EQ, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_WRAPPED);
+}
+
+static int mlx4_SW2HW_EQ(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox,
+ int eq_num)
+{
+ return mlx4_cmd(dev, mailbox->dma, eq_num, 0,
+ MLX4_CMD_SW2HW_EQ, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_WRAPPED);
+}
+
+static int mlx4_HW2SW_EQ(struct mlx4_dev *dev, int eq_num)
+{
+ return mlx4_cmd(dev, 0, eq_num, 1, MLX4_CMD_HW2SW_EQ,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+}
+
+static int mlx4_num_eq_uar(struct mlx4_dev *dev)
+{
+ /*
+ * Each UAR holds 4 EQ doorbells. To figure out how many UARs
+ * we need to map, take the difference of highest index and
+ * the lowest index we'll use and add 1.
+ */
+ return (dev->caps.num_comp_vectors + 1 + dev->caps.reserved_eqs) / 4 -
+ dev->caps.reserved_eqs / 4 + 1;
+}
+
+static void __iomem *mlx4_get_eq_uar(struct mlx4_dev *dev, struct mlx4_eq *eq)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int index;
+
+ index = eq->eqn / 4 - dev->caps.reserved_eqs / 4;
+
+ if (!priv->eq_table.uar_map[index]) {
+
+#ifdef KMOD_MODIFIED
+ assert(dev->persist->rte_pdev->mem_resource[2].len >= (((eq->eqn / 4) << PAGE_SHIFT) + PAGE_SIZE));
+ priv->eq_table.uar_map[index] =
+ RTE_PTR_ADD(dev->persist->rte_pdev->mem_resource[2].addr, ((eq->eqn / 4) << PAGE_SHIFT));
+#else
+ priv->eq_table.uar_map[index] =
+ ioremap(pci_resource_start(dev->persist->pdev, 2) +
+ ((eq->eqn / 4) << PAGE_SHIFT),
+ PAGE_SIZE);
+#endif
+ if (!priv->eq_table.uar_map[index]) {
+ mlx4_err(dev, "Couldn't map EQ doorbell for EQN 0x%06x\n",
+ eq->eqn);
+ return NULL;
+ }
+ }
+
+ return priv->eq_table.uar_map[index] + 0x800 + 8 * (eq->eqn % 4);
+}
+
+static void mlx4_unmap_uar(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int i;
+
+ for (i = 0; i < mlx4_num_eq_uar(dev); ++i)
+ if (priv->eq_table.uar_map[i]) {
+#ifdef KMOD_REMOVED
+ iounmap(priv->eq_table.uar_map[i]);
+#endif
+ priv->eq_table.uar_map[i] = NULL;
+ }
+}
+
+static int mlx4_create_eq(struct mlx4_dev *dev, int nent,
+ u8 intr, struct mlx4_eq *eq)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_eq_context *eq_context;
+ int npages;
+ u64 *dma_list = NULL;
+ dma_addr_t t;
+ u64 mtt_addr;
+ int err = -ENOMEM;
+ int i;
+
+ eq->dev = dev;
+ eq->nent = roundup_pow_of_two(max(nent, 2));
+ /* CX3 is capable of extending the CQE/EQE from 32 to 64 bytes */
+ npages = PAGE_ALIGN(eq->nent * dev->caps.eqe_size) / PAGE_SIZE;
+
+ eq->page_list = kmalloc(npages * sizeof *eq->page_list,
+ GFP_KERNEL);
+ if (!eq->page_list)
+ goto err_out;
+
+ for (i = 0; i < npages; ++i)
+ eq->page_list[i].buf = NULL;
+
+ dma_list = kmalloc(npages * sizeof *dma_list, GFP_KERNEL);
+ if (!dma_list)
+ goto err_out_free;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ goto err_out_free;
+ eq_context = mailbox->buf;
+
+ for (i = 0; i < npages; ++i) {
+#ifdef KMOD_MODIFIED
+ eq->page_list[i].buf = rte_persistent_alloc(PAGE_SIZE, dev->persist->rte_pdev->numa_node);
+ t = rte_persistent_hw_addr(eq->page_list[i].buf);
+#else
+ eq->page_list[i].buf = dma_alloc_coherent(&dev->persist->pdev->dev,
+ PAGE_SIZE, &t, GFP_KERNEL);
+#endif
+ if (!eq->page_list[i].buf)
+ goto err_out_free_pages;
+
+ dma_list[i] = t;
+ eq->page_list[i].map = t;
+
+ memset(eq->page_list[i].buf, 0, PAGE_SIZE);
+ }
+
+ eq->eqn = mlx4_bitmap_alloc(&priv->eq_table.bitmap);
+ if (eq->eqn == -1)
+ goto err_out_free_pages;
+
+ eq->name_priority = 0;
+
+ eq->doorbell = mlx4_get_eq_uar(dev, eq);
+ if (!eq->doorbell) {
+ err = -ENOMEM;
+ goto err_out_free_eq;
+ }
+
+ err = mlx4_mtt_init(dev, npages, PAGE_SHIFT, &eq->mtt);
+ if (err)
+ goto err_out_free_eq;
+
+ err = mlx4_write_mtt(dev, &eq->mtt, 0, npages, dma_list);
+ if (err)
+ goto err_out_free_mtt;
+
+ memset(eq_context, 0, sizeof *eq_context);
+ eq_context->flags = cpu_to_be32(MLX4_EQ_STATUS_OK |
+ MLX4_EQ_STATE_ARMED);
+ eq_context->log_eq_size = ilog2(eq->nent);
+ eq_context->intr = intr;
+ eq_context->log_page_size = PAGE_SHIFT - MLX4_ICM_PAGE_SHIFT;
+
+ mtt_addr = mlx4_mtt_addr(dev, &eq->mtt);
+ eq_context->mtt_base_addr_h = mtt_addr >> 32;
+ eq_context->mtt_base_addr_l = cpu_to_be32(mtt_addr & 0xffffffff);
+
+ err = mlx4_SW2HW_EQ(dev, mailbox, eq->eqn);
+ if (err) {
+ mlx4_warn(dev, "SW2HW_EQ failed (%d) for eqn %d\n",
+ err, eq->eqn);
+ goto err_out_free_mtt;
+ }
+
+ kfree(dma_list);
+ mlx4_free_cmd_mailbox(dev, mailbox);
+#ifdef KMOD_DISABLED
+ RAW_INIT_NOTIFIER_HEAD(&eq->notifiers_list);
+#endif
+
+ eq->cons_index = 0;
+#ifdef KMOD_DISABLED
+ INIT_LIST_HEAD(&eq->tasklet_ctx.list);
+ INIT_LIST_HEAD(&eq->tasklet_ctx.process_list);
+ spin_lock_init(&eq->tasklet_ctx.lock);
+ tasklet_init(&eq->tasklet_ctx.task, mlx4_cq_tasklet_cb,
+ (unsigned long)&eq->tasklet_ctx);
+#endif
+
+ return err;
+
+err_out_free_mtt:
+ mlx4_mtt_cleanup(dev, &eq->mtt);
+
+err_out_free_eq:
+ mlx4_bitmap_free(&priv->eq_table.bitmap, eq->eqn, MLX4_USE_RR);
+
+err_out_free_pages:
+ for (i = 0; i < npages; ++i)
+ if (eq->page_list[i].buf)
+ {
+#ifdef KMOD_MODIFIED
+ rte_persistent_free(eq->page_list[i].buf);
+#else
+ dma_free_coherent(&dev->persist->pdev->dev, PAGE_SIZE,
+ eq->page_list[i].buf,
+ eq->page_list[i].map);
+#endif
+ }
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+
+err_out_free:
+ kfree(eq->page_list);
+ kfree(dma_list);
+
+err_out:
+ return err;
+}
+
+#ifdef KMOD_DISABLED
+struct mlx4_eq_notifier {
+ struct notifier_block nb;
+ u32 uuid;
+ void (*cb)(unsigned vector, u32 uuid, void *data);
+ void *data;
+};
+#endif
+
+static void mlx4_free_eq(struct mlx4_dev *dev,
+ struct mlx4_eq *eq)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int err;
+ int i;
+ /* CX3 is capable of extending the CQE/EQE from 32 to 64 bytes, with
+ * strides of 64B, 128B and 256B
+ */
+ int npages = PAGE_ALIGN(dev->caps.eqe_size * eq->nent) / PAGE_SIZE;
+
+ err = mlx4_HW2SW_EQ(dev, eq->eqn);
+ if (err)
+ mlx4_warn(dev, "HW2SW_EQ failed (%d)\n", err);
+
+ synchronize_irq(eq->irq);
+#ifdef KMOD_DISABLED
+ tasklet_disable(&eq->tasklet_ctx.task);
+#endif
+
+ mlx4_mtt_cleanup(dev, &eq->mtt);
+ for (i = 0; i < npages; ++i)
+ {
+#ifdef KMOD_MODIFIED
+ rte_persistent_free(eq->page_list[i].buf);
+#else
+ dma_free_coherent(&dev->persist->pdev->dev, PAGE_SIZE,
+ eq->page_list[i].buf,
+ eq->page_list[i].map);
+#endif
+ }
+
+ kfree(eq->page_list);
+ mlx4_bitmap_free(&priv->eq_table.bitmap, eq->eqn, MLX4_USE_RR);
+}
+
+static void mlx4_free_irqs(struct mlx4_dev *dev)
+{
+ struct mlx4_eq_table *eq_table = &mlx4_priv(dev)->eq_table;
+ int i;
+#ifdef KMOD_DISABLED
+ if (eq_table->have_irq)
+ free_irq(dev->persist->pdev->irq, dev);
+
+ for (i = 0; i < dev->caps.num_comp_vectors + 1; ++i)
+ if (eq_table->eq[i].have_irq) {
+ free_cpumask_var(eq_table->eq[i].affinity_mask);
+#if defined(CONFIG_SMP)
+ irq_set_affinity_hint(eq_table->eq[i].irq, NULL);
+#endif
+ free_irq(eq_table->eq[i].irq, eq_table->eq + i);
+ eq_table->eq[i].have_irq = 0;
+ }
+#endif
+
+ kfree(eq_table->irq_names);
+}
+
+static int mlx4_map_clr_int(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+#ifdef KMOD_MODIFIED
+ assert(dev->persist->rte_pdev->mem_resource[priv->fw.clr_int_bar].len >= priv->fw.clr_int_base + MLX4_CLR_INT_SIZE);
+
+ priv->clr_base = RTE_PTR_ADD(dev->persist->rte_pdev->mem_resource[priv->fw.clr_int_bar].addr,priv->fw.clr_int_base);
+#else
+ priv->clr_base = ioremap(pci_resource_start(dev->persist->pdev, priv->fw.clr_int_bar) +
+ priv->fw.clr_int_base, MLX4_CLR_INT_SIZE);
+#endif
+ if (!priv->clr_base) {
+ mlx4_err(dev, "Couldn't map interrupt clear register, aborting.\n");
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static void mlx4_unmap_clr_int(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+#ifdef KMOD_DISABLED
+ iounmap(priv->clr_base);
+#endif
+}
+
+int mlx4_alloc_eq_table(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ priv->eq_table.eq = kcalloc(dev->caps.num_eqs - dev->caps.reserved_eqs,
+ sizeof *priv->eq_table.eq, GFP_KERNEL);
+ if (!priv->eq_table.eq)
+ return -ENOMEM;
+
+ return 0;
+}
+
+void mlx4_free_eq_table(struct mlx4_dev *dev)
+{
+ kfree(mlx4_priv(dev)->eq_table.eq);
+}
+
+int mlx4_init_eq_table(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int err;
+ int i;
+
+ spin_lock_init(&dev->eq_accounting_lock);
+ priv->eq_table.uar_map = kcalloc(mlx4_num_eq_uar(dev),
+ sizeof *priv->eq_table.uar_map,
+ GFP_KERNEL);
+ if (!priv->eq_table.uar_map) {
+ err = -ENOMEM;
+ goto err_out_free;
+ }
+
+ err = mlx4_bitmap_init(&priv->eq_table.bitmap,
+ roundup_pow_of_two(dev->caps.num_eqs),
+ dev->caps.num_eqs - 1,
+ dev->caps.reserved_eqs,
+ roundup_pow_of_two(dev->caps.num_eqs) -
+ dev->caps.num_eqs);
+ if (err)
+ goto err_out_free;
+
+ for (i = 0; i < mlx4_num_eq_uar(dev); ++i)
+ priv->eq_table.uar_map[i] = NULL;
+
+ if (!mlx4_is_slave(dev)) {
+ err = mlx4_map_clr_int(dev);
+ if (err)
+ goto err_out_bitmap;
+
+ priv->eq_table.clr_mask =
+ swab32(1 << (priv->eq_table.inta_pin & 31));
+ priv->eq_table.clr_int = priv->clr_base +
+ (priv->eq_table.inta_pin < 32 ? 4 : 0);
+ }
+
+ priv->eq_table.irq_names =
+ kmalloc(MLX4_IRQNAME_SIZE * (dev->caps.num_comp_vectors + 1),
+ GFP_KERNEL);
+ if (!priv->eq_table.irq_names) {
+ err = -ENOMEM;
+ goto err_out_clr_int;
+ }
+
+
+ for (i = 0; i < dev->caps.num_comp_vectors + 1; ++i) {
+ if (i == MLX4_EQ_ASYNC) {
+ err = mlx4_create_eq(dev,
+ MLX4_NUM_ASYNC_EQE + MLX4_NUM_SPARE_EQE,
+ 0, &priv->eq_table.eq[MLX4_EQ_ASYNC]);
+ } else {
+ struct mlx4_eq *eq = &priv->eq_table.eq[i];
+#ifdef HAVE_CPU_RMAP
+#ifdef CONFIG_RFS_ACCEL
+ int port = find_first_bit(eq->actv_ports.ports,
+ dev->caps.num_ports) + 1;
+
+ if (port <= dev->caps.num_ports) {
+ struct mlx4_port_info *info =
+ &mlx4_priv(dev)->port[port];
+
+ if (!info->rmap) {
+ info->rmap = alloc_irq_cpu_rmap(
+ mlx4_get_eqs_per_port(dev, port));
+ if (!info->rmap) {
+ mlx4_warn(dev, "Failed to allocate cpu rmap\n");
+ err = -ENOMEM;
+ goto err_out_unmap;
+ }
+ }
+ err = irq_cpu_rmap_add(
+ info->rmap, eq->irq);
+ if (err)
+ mlx4_warn(dev, "Failed adding irq rmap\n");
+ }
+#endif
+#endif
+ err = mlx4_create_eq(dev, dev->caps.num_cqs -
+ dev->caps.reserved_cqs +
+ MLX4_NUM_SPARE_EQE,
+ (dev->flags & MLX4_FLAG_MSI_X) ?
+ i + 1 - !!(i > MLX4_EQ_ASYNC) : 0,
+ eq);
+ }
+ if (err)
+ goto err_out_unmap;
+ }
+
+#ifdef KMOD_DISABLED
+ if (dev->flags & MLX4_FLAG_MSI_X) {
+ const char *eq_name;
+
+ snprintf(priv->eq_table.irq_names +
+ MLX4_EQ_ASYNC * MLX4_IRQNAME_SIZE,
+ MLX4_IRQNAME_SIZE,
+ "mlx4-async@pci:%s",
+ pci_name(dev->persist->pdev));
+ eq_name = priv->eq_table.irq_names +
+ MLX4_EQ_ASYNC * MLX4_IRQNAME_SIZE;
+
+ err = request_irq(priv->eq_table.eq[MLX4_EQ_ASYNC].irq,
+ mlx4_msi_x_interrupt, 0, eq_name,
+ priv->eq_table.eq + MLX4_EQ_ASYNC);
+ if (err)
+ goto err_out_unmap;
+
+ priv->eq_table.eq[MLX4_EQ_ASYNC].have_irq = 1;
+ } else {
+#endif
+#ifdef KMOD_MODIFIED
+ snprintf(priv->eq_table.irq_names,
+ MLX4_IRQNAME_SIZE,
+ DRV_NAME "@pci:%s",
+ "pci_name"); /* originally pci_name(dev->persist->pdev) */
+ err = 0; /* originally request_irq(dev->persist->pdev->irq, mlx4_interrupt,
+ * IRQF_SHARED, priv->eq_table.irq_names, dev) */
+ if (err)
+ goto err_out_unmap;
+
+ priv->eq_table.have_irq = 1;
+#endif
+#ifdef KMOD_DISABLED
+ }
+#endif
+
+ err = mlx4_MAP_EQ(dev, get_async_ev_mask(dev), 0,
+ priv->eq_table.eq[MLX4_EQ_ASYNC].eqn);
+ if (err)
+ mlx4_warn(dev, "MAP_EQ for async EQ %d failed (%d)\n",
+ priv->eq_table.eq[MLX4_EQ_ASYNC].eqn, err);
+
+ /* arm ASYNC eq */
+ eq_set_ci(&priv->eq_table.eq[MLX4_EQ_ASYNC], 1);
+
+ return 0;
+
+err_out_unmap:
+ while (i >= 0)
+ mlx4_free_eq(dev, &priv->eq_table.eq[i--]);
+#ifdef CONFIG_RFS_ACCEL
+ for (i = 1; i <= dev->caps.num_ports; i++) {
+ if (mlx4_priv(dev)->port[i].rmap) {
+ free_irq_cpu_rmap(mlx4_priv(dev)->port[i].rmap);
+ mlx4_priv(dev)->port[i].rmap = NULL;
+ }
+ }
+#endif
+ mlx4_free_irqs(dev);
+
+err_out_clr_int:
+ if (!mlx4_is_slave(dev))
+ mlx4_unmap_clr_int(dev);
+
+err_out_bitmap:
+ mlx4_unmap_uar(dev);
+ mlx4_bitmap_cleanup(&priv->eq_table.bitmap);
+
+err_out_free:
+ kfree(priv->eq_table.uar_map);
+
+ return err;
+}
+
+void mlx4_cleanup_eq_table(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int i;
+
+ mlx4_MAP_EQ(dev, get_async_ev_mask(dev), 1,
+ priv->eq_table.eq[MLX4_EQ_ASYNC].eqn);
+
+#ifdef CONFIG_RFS_ACCEL
+ for (i = 1; i <= dev->caps.num_ports; i++) {
+ if (mlx4_priv(dev)->port[i].rmap) {
+ free_irq_cpu_rmap(mlx4_priv(dev)->port[i].rmap);
+ mlx4_priv(dev)->port[i].rmap = NULL;
+ }
+ }
+#endif
+ mlx4_free_irqs(dev);
+
+ for (i = 0; i < dev->caps.num_comp_vectors + 1; ++i)
+ mlx4_free_eq(dev, &priv->eq_table.eq[i]);
+
+ if (!mlx4_is_slave(dev))
+ mlx4_unmap_clr_int(dev);
+
+ mlx4_unmap_uar(dev);
+ mlx4_bitmap_cleanup(&priv->eq_table.bitmap);
+
+ kfree(priv->eq_table.uar_map);
+}
+
+/* A test that verifies that we can accept interrupts on all
+ * the irq vectors of the device.
+ * Interrupts are checked using the NOP command.
+ */
+#ifdef KMOD_DISABLED
+int mlx4_test_interrupts(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int i;
+ int err;
+
+ err = mlx4_NOP(dev);
+ /* When not in MSI_X, there is only one irq to check */
+ if (!(dev->flags & MLX4_FLAG_MSI_X) || mlx4_is_slave(dev))
+ return err;
+
+ /* A loop over all completion vectors, for each vector we will check
+ * whether it works by mapping command completions to that vector
+ * and performing a NOP command
+ */
+ for (i = 0; !err && (i < dev->caps.num_comp_vectors); ++i) {
+ /* Temporarily use polling for command completions */
+ mlx4_cmd_use_polling(dev);
+
+ /* Map the new eq to handle all asynchronous events */
+ err = mlx4_MAP_EQ(dev, get_async_ev_mask(dev), 0,
+ priv->eq_table.eq[i].eqn);
+ if (err) {
+ mlx4_warn(dev, "Failed mapping eq for interrupt test\n");
+ mlx4_cmd_use_events(dev);
+ break;
+ }
+
+ /* Go back to using events */
+ mlx4_cmd_use_events(dev);
+ err = mlx4_NOP(dev);
+ }
+
+ /* Return to default */
+ mlx4_MAP_EQ(dev, get_async_ev_mask(dev), 0,
+ priv->eq_table.eq[MLX4_EQ_ASYNC].eqn);
+ return err;
+}
+EXPORT_SYMBOL(mlx4_test_interrupts);
+#endif
+
+bool mlx4_is_eq_vector_valid(struct mlx4_dev *dev, u8 port, int vector)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ vector = MLX4_CQ_TO_EQ_VECTOR(vector);
+ if (vector < 0 || (vector >= dev->caps.num_comp_vectors + 1) ||
+ (vector == MLX4_EQ_ASYNC))
+ return false;
+
+ return test_bit(port - 1, priv->eq_table.eq[vector].actv_ports.ports);
+}
+EXPORT_SYMBOL(mlx4_is_eq_vector_valid);
+
+u32 mlx4_get_eqs_per_port(struct mlx4_dev *dev, u8 port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ unsigned int i;
+ unsigned int sum = 0;
+
+ for (i = 0; i < dev->caps.num_comp_vectors + 1; i++)
+ sum += !!test_bit(port - 1,
+ priv->eq_table.eq[i].actv_ports.ports);
+
+ return sum;
+}
+EXPORT_SYMBOL(mlx4_get_eqs_per_port);
+
+int mlx4_is_eq_shared(struct mlx4_dev *dev, int vector)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ vector = MLX4_CQ_TO_EQ_VECTOR(vector);
+ if (vector <= 0 || (vector >= dev->caps.num_comp_vectors + 1))
+ return -EINVAL;
+
+ return !!(bitmap_weight(priv->eq_table.eq[vector].actv_ports.ports,
+ dev->caps.num_ports) > 1);
+}
+EXPORT_SYMBOL(mlx4_is_eq_shared);
+
+struct cpu_rmap *mlx4_get_cpu_rmap(struct mlx4_dev *dev, int port)
+{
+ return mlx4_priv(dev)->port[port].rmap;
+}
+EXPORT_SYMBOL(mlx4_get_cpu_rmap);
+
+int mlx4_rename_eq(struct mlx4_dev *dev, int port, int vector,
+ u8 priority, const char namefmt[], ...)
+{
+ va_list args;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int eq_vector = MLX4_CQ_TO_EQ_VECTOR(vector);
+
+ if (!mlx4_is_eq_vector_valid(dev, port, vector) ||
+ (dev->flags & MLX4_FLAG_MSI_X &&
+ !test_bit(eq_vector, priv->msix_ctl.pool_bm)))
+ return -EINVAL;
+
+ if (priv->eq_table.eq[eq_vector].name_priority >= priority)
+ return 0;
+
+ priv->eq_table.eq[eq_vector].name_priority = priority;
+ va_start(args, namefmt);
+ vsnprintf(priv->eq_table.irq_names +
+ eq_vector * MLX4_IRQNAME_SIZE,
+ MLX4_IRQNAME_SIZE, namefmt, args);
+ va_end(args);
+
+ return 0;
+}
+EXPORT_SYMBOL(mlx4_rename_eq);
+
+struct mlx4_eq_notifier_event {
+ int vec;
+ int uuid;
+ struct mlx4_eq *eq;
+};
+
+#ifdef KMOD_DISABLED
+static int eq_notifier_cb(struct notifier_block *nb, unsigned long action,
+ void *data)
+{
+ struct mlx4_eq_notifier *eq_notifier = container_of(nb,
+ struct mlx4_eq_notifier,
+ nb);
+ struct mlx4_eq_notifier_event *event = data;
+
+ if (eq_notifier->uuid == event->uuid) {
+ raw_notifier_chain_unregister(&event->eq->notifiers_list, nb);
+ kfree(eq_notifier);
+ } else {
+ eq_notifier->cb(event->vec, eq_notifier->uuid,
+ eq_notifier->data);
+ }
+
+ return NOTIFY_DONE;
+}
+#endif
+
+int mlx4_assign_eq(struct mlx4_dev *dev, u8 port, u32 consumer_uuid,
+ void (*cb)(unsigned vector, u32 uuid, void *data),
+ void *notifier_data, int *vector)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int err = 0, i = 0;
+ u32 min_ref_count_val = (u32)-1;
+ int requested_vector = MLX4_CQ_TO_EQ_VECTOR(*vector);
+ int *prequested_vector = NULL;
+ struct mlx4_eq_notifier *notifier = NULL;
+
+ if (cb) {
+#ifdef KMOD_DISABLED
+ notifier = kmalloc(sizeof(*notifier), GFP_KERNEL);
+
+ if (!notifier)
+ return -ENOMEM;
+
+ notifier->nb.notifier_call = eq_notifier_cb;
+ notifier->nb.priority = 0;
+ notifier->cb = cb;
+ notifier->uuid = consumer_uuid;
+ notifier->data = notifier_data;
+#endif
+ }
+ mutex_lock(&priv->msix_ctl.pool_lock);
+ if (requested_vector < (dev->caps.num_comp_vectors + 1) &&
+ (requested_vector >= 0) &&
+ (requested_vector != MLX4_EQ_ASYNC)) {
+ if (test_bit(port - 1,
+ priv->eq_table.eq[requested_vector].actv_ports.ports)) {
+ prequested_vector = &requested_vector;
+ } else {
+ struct mlx4_eq *eq;
+
+ for (i = 1; i < port;
+ requested_vector += mlx4_get_eqs_per_port(dev, i++))
+ ;
+
+ eq = &priv->eq_table.eq[requested_vector];
+ if (requested_vector < dev->caps.num_comp_vectors + 1 &&
+ test_bit(port - 1, eq->actv_ports.ports)) {
+ prequested_vector = &requested_vector;
+ }
+ }
+ }
+
+ if (!prequested_vector) {
+ requested_vector = -1;
+ for (i = 0; min_ref_count_val && i < dev->caps.num_comp_vectors + 1;
+ i++) {
+ struct mlx4_eq *eq = &priv->eq_table.eq[i];
+
+ if (min_ref_count_val > eq->ref_count &&
+ test_bit(port - 1, eq->actv_ports.ports)) {
+ min_ref_count_val = eq->ref_count;
+ requested_vector = i;
+ }
+ }
+
+ if (requested_vector < 0) {
+ err = -ENOSPC;
+ goto err_unlock;
+ }
+
+ prequested_vector = &requested_vector;
+ }
+
+ if (!test_bit(*prequested_vector, priv->msix_ctl.pool_bm) &&
+ dev->flags & MLX4_FLAG_MSI_X) {
+ set_bit(*prequested_vector, priv->msix_ctl.pool_bm);
+ snprintf(priv->eq_table.irq_names +
+ *prequested_vector * MLX4_IRQNAME_SIZE,
+ MLX4_IRQNAME_SIZE, "mlx4-%d@%s",
+ *prequested_vector, "dev_name");//dev_name(&dev->persist->pdev->dev));
+
+ err = 0;/*request_irq(priv->eq_table.eq[*prequested_vector].irq,
+ mlx4_msi_x_interrupt, 0,
+ &priv->eq_table.irq_names[*prequested_vector << 5],
+ priv->eq_table.eq + *prequested_vector);*/
+
+ if (err) {
+ clear_bit(*prequested_vector, priv->msix_ctl.pool_bm);
+ *prequested_vector = -1;
+ } else {
+#if defined(CONFIG_SMP)
+ mlx4_set_eq_affinity_hint(priv, *prequested_vector);
+#endif
+ eq_set_ci(&priv->eq_table.eq[*prequested_vector], 1);
+ priv->eq_table.eq[*prequested_vector].have_irq = 1;
+ }
+ }
+
+ if (!err && *prequested_vector >= 0) {
+ priv->eq_table.eq[*prequested_vector].ref_count++;
+#ifdef KMOD_DISABLED
+ if (cb)
+ raw_notifier_chain_register(
+ &priv->eq_table.eq[*prequested_vector].notifiers_list,
+ &notifier->nb);
+#endif
+ }
+
+err_unlock:
+ mutex_unlock(&priv->msix_ctl.pool_lock);
+
+ if (!err && *prequested_vector >= 0) {
+ *vector = MLX4_EQ_TO_CQ_VECTOR(*prequested_vector);
+ } else {
+ *vector = 0;
+ kfree(notifier);
+ }
+ return err;
+}
+EXPORT_SYMBOL(mlx4_assign_eq);
+
+int mlx4_eq_get_irq(struct mlx4_dev *dev, int cq_vec)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ return priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(cq_vec)].irq;
+}
+EXPORT_SYMBOL(mlx4_eq_get_irq);
+
+void mlx4_release_eq(struct mlx4_dev *dev, int uuid, int vec)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int eq_vec = MLX4_CQ_TO_EQ_VECTOR(vec);
+ struct mlx4_eq_notifier_event event = {.vec = vec, .uuid = uuid,
+ .eq = &priv->eq_table.eq[eq_vec]};
+
+ mutex_lock(&priv->msix_ctl.pool_lock);
+ priv->eq_table.eq[eq_vec].ref_count--;
+
+ /* we zero the name priority, but keep the name. Afterward, we
+ * call all notifiers to notify that there was a change in this EQ.
+ * Live notifiers could rename this EQ back to the required name.
+ */
+ priv->eq_table.eq[eq_vec].name_priority = 0;
+ snprintf(priv->eq_table.irq_names +
+ eq_vec * MLX4_IRQNAME_SIZE,
+ MLX4_IRQNAME_SIZE, "mlx4-%d@%s", vec,
+ "dev_name");
+ //dev_name(&dev->persist->pdev->dev));
+
+#ifdef KMOD_DISABLED
+ raw_notifier_call_chain(&priv->eq_table.eq[eq_vec].notifiers_list, 0,
+ &event);
+#endif
+
+ /* Once an EQ is allocated, we don't release it because it might be
+ * bound to cpu_rmap.
+ */
+ mutex_unlock(&priv->msix_ctl.pool_lock);
+}
+EXPORT_SYMBOL(mlx4_release_eq);
+
+int mlx4_choose_vector(struct mlx4_dev *dev, int vector, int num_comp)
+{
+ struct mlx4_eq *chosen;
+ int k;
+
+ vector = MLX4_CQ_TO_EQ_VECTOR(vector);
+#ifdef KMOD_DISABLED
+ if (vector || smp_processor_id() == (vector % num_online_cpus())) {
+ spin_lock(&dev->eq_accounting_lock);
+ mlx4_priv(dev)->eq_table.eq[vector].ncqs++;
+ spin_unlock(&dev->eq_accounting_lock);
+ } else {
+#endif
+ spin_lock(&dev->eq_accounting_lock);
+ chosen = &mlx4_priv(dev)->eq_table.eq[0];
+ for (k = 0; k < num_comp; k++) {
+ if (mlx4_priv(dev)->eq_table.eq[k].ncqs < chosen->ncqs) {
+ chosen = &mlx4_priv(dev)->eq_table.eq[k];
+ vector = k;
+ }
+ }
+ chosen->ncqs++;
+ spin_unlock(&dev->eq_accounting_lock);
+#ifdef KMOD_DISABLED
+ }
+#endif /* use a single MSI interrupt */
+
+ return MLX4_EQ_TO_CQ_VECTOR(vector);
+}
+EXPORT_SYMBOL(mlx4_choose_vector);
+
+void mlx4_release_vector(struct mlx4_dev *dev, int vector)
+{
+ spin_lock(&dev->eq_accounting_lock);
+ mlx4_priv(dev)->eq_table.eq[vector].ncqs--;
+ spin_unlock(&dev->eq_accounting_lock);
+}
+EXPORT_SYMBOL(mlx4_release_vector);
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/fw.c b/drivers/net/mlnx_uio/mlnx/mlx4/fw.c
new file mode 100644
index 0000000..1b7022a
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/fw.c
@@ -0,0 +1,3005 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007, 2008 Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007 Cisco Systems, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+
+#include "fw.h"
+#include "icm.h"
+#include "log2.h"
+
+
+static inline bool ipv6_addr_v4mapped(const struct in6_addr *a)
+{
+ return (
+ (__force unsigned long)(a->s6_addr32[0] | a->s6_addr32[1]) |
+ (__force unsigned long)(a->s6_addr32[2] ^
+ cpu_to_be32(0x0000ffff))) == 0UL;
+}
+
+enum {
+ MLX4_COMMAND_INTERFACE_MIN_REV = 2,
+ MLX4_COMMAND_INTERFACE_MAX_REV = 3,
+ MLX4_COMMAND_INTERFACE_NEW_PORT_CMDS = 3,
+};
+
+extern void __buggy_use_of_MLX4_GET(void);
+extern void __buggy_use_of_MLX4_PUT(void);
+
+static bool enable_qos = false;
+module_param(enable_qos, bool, 0444);
+MODULE_PARM_DESC(enable_qos, "Enable Enhanced QoS support (default: off)");
+
+static bool enable_vfs_qos;
+module_param(enable_vfs_qos, bool, 0444);
+MODULE_PARM_DESC(enable_vfs_qos, "Enable Virtual VFs QoS (default: off)");
+
+#define MLX4_GET(dest, source, offset) \
+ do { \
+ void *__p = (char *) (source) + (offset); \
+ switch (sizeof (dest)) { \
+ case 1: (dest) = *(u8 *) __p; break; \
+ case 2: (dest) = be16_to_cpup(__p); break; \
+ case 4: (dest) = be32_to_cpup(__p); break; \
+ case 8: (dest) = be64_to_cpup(__p); break; \
+ default: __buggy_use_of_MLX4_GET(); \
+ } \
+ } while (0)
+
+#define MLX4_PUT(dest, source, offset) \
+ do { \
+ void *__d = ((char *) (dest) + (offset)); \
+ switch (sizeof(source)) { \
+ case 1: *(u8 *) __d = (source); break; \
+ case 2: *(__be16 *) __d = cpu_to_be16(source); break; \
+ case 4: *(__be32 *) __d = cpu_to_be32(source); break; \
+ case 8: *(__be64 *) __d = cpu_to_be64(source); break; \
+ default: __buggy_use_of_MLX4_PUT(); \
+ } \
+ } while (0)
+
+static void dump_dev_cap_flags(struct mlx4_dev *dev, u64 flags)
+{
+ static const char *fname[] = {
+ [ 0] = "RC transport",
+ [ 1] = "UC transport",
+ [ 2] = "UD transport",
+ [ 3] = "XRC transport",
+ [ 6] = "SRQ support",
+ [ 7] = "IPoIB checksum offload",
+ [ 8] = "P_Key violation counter",
+ [ 9] = "Q_Key violation counter",
+ [12] = "Dual Port Different Protocol (DPDP) support",
+ [15] = "Big LSO headers",
+ [16] = "MW support",
+ [17] = "APM support",
+ [18] = "Atomic ops support",
+ [19] = "Raw multicast support",
+ [20] = "Address vector port checking support",
+ [21] = "UD multicast support",
+ [30] = "IBoE support",
+ [32] = "Unicast loopback support",
+ [34] = "FCS header control",
+ [37] = "Wake On LAN (port1) support",
+ [38] = "Wake On LAN (port2) support",
+ [40] = "UDP RSS support",
+ [41] = "Unicast VEP steering support",
+ [42] = "Multicast VEP steering support",
+ [44] = "Cross-channel (sync_qp) operations support",
+ [48] = "Counters support",
+ [52] = "RSS IP fragments support",
+ [53] = "Port ETS Scheduler support",
+ [55] = "Port link type sensing support",
+ [59] = "Port management change event support",
+ [61] = "64 byte EQE support",
+ [62] = "64 byte CQE support",
+ };
+ int i;
+
+ mlx4_dbg(dev, "DEV_CAP flags:\n");
+ for (i = 0; i < ARRAY_SIZE(fname); ++i)
+ if (fname[i] && (flags & (1LL << i)))
+ mlx4_dbg(dev, " %s\n", fname[i]);
+}
+
+static void dump_dev_cap_flags2(struct mlx4_dev *dev, u64 flags)
+{
+ static const char * const fname[] = {
+ [0] = "RSS support",
+ [1] = "RSS Toeplitz Hash Function support",
+ [2] = "RSS XOR Hash Function support",
+ [3] = "Device managed flow steering support",
+ [4] = "Automatic MAC reassignment support",
+ [5] = "Time stamping support",
+ [6] = "VST (control vlan insertion/stripping) support",
+ [7] = "FSM (MAC anti-spoofing) support",
+ [8] = "Dynamic QP updates support",
+ [9] = "Device managed flow steering IPoIB support",
+ [10] = "TCP/IP offloads/flow-steering for VXLAN support",
+ [11] = "MAD DEMUX (Secure-Host) support",
+ [12] = "Large cache line (>64B) CQE stride support",
+ [13] = "Large cache line (>64B) EQE stride support",
+ [14] = "Ethernet protocol control support",
+ [15] = "Ethernet Backplane autoneg support",
+ [16] = "CONFIG DEV support",
+ [17] = "Asymmetric EQs support",
+ [18] = "More than 80 VFs support",
+ [19] = "Performance optimized for limited rule configuration flow steering support",
+ [20] = "Recoverable error events support",
+ [21] = "Port Remap support",
+ [23] = "Modifying loopback source checks using UPDATE_QP support",
+ [25] = "Set ingress parser mode support",
+ [26] = "Loopback source checks support",
+ [27] = "Port ETS Scheduler support",
+ [28] = "Ethernet Flow control statistics support",
+ [30] = "NCSI in DMFS mode support",
+ [32] = "RoCEv2 support",
+ [33] = "QCN support",
+ [34] = "Optimized steering table for non source IP rules",
+ [35] = "Granular QoS per VF support",
+ [36] = "Port beacon support",
+ [37] = "RX-ALL support",
+ };
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(fname); ++i)
+ if (fname[i] && (flags & (1LL << i)))
+ mlx4_dbg(dev, " %s\n", fname[i]);
+}
+
+int mlx4_MOD_STAT_CFG(struct mlx4_dev *dev, struct mlx4_mod_stat_cfg *cfg)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ u32 *inbox;
+ int err = 0;
+
+#define MOD_STAT_CFG_IN_SIZE 0x100
+
+#define MOD_STAT_CFG_PG_SZ_M_OFFSET 0x002
+#define MOD_STAT_CFG_PG_SZ_OFFSET 0x003
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ inbox = mailbox->buf;
+
+ MLX4_PUT(inbox, cfg->log_pg_sz, MOD_STAT_CFG_PG_SZ_OFFSET);
+ MLX4_PUT(inbox, cfg->log_pg_sz_m, MOD_STAT_CFG_PG_SZ_M_OFFSET);
+
+ err = mlx4_cmd(dev, mailbox->dma, 0, 0, MLX4_CMD_MOD_STAT_CFG,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+
+int mlx4_QUERY_FUNC(struct mlx4_dev *dev, struct mlx4_func *func, int slave)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ u32 *outbox;
+ u8 in_modifier;
+ u8 field;
+ u16 field16;
+ int err;
+
+#define QUERY_FUNC_BUS_OFFSET 0x00
+#define QUERY_FUNC_DEVICE_OFFSET 0x01
+#define QUERY_FUNC_FUNCTION_OFFSET 0x01
+#define QUERY_FUNC_PHYSICAL_FUNCTION_OFFSET 0x03
+#define QUERY_FUNC_RSVD_EQS_OFFSET 0x04
+#define QUERY_FUNC_MAX_EQ_OFFSET 0x06
+#define QUERY_FUNC_RSVD_UARS_OFFSET 0x0b
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ outbox = mailbox->buf;
+
+ in_modifier = slave;
+
+ err = mlx4_cmd_box(dev, 0, mailbox->dma, in_modifier, 0,
+ MLX4_CMD_QUERY_FUNC,
+ MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+ if (err)
+ goto out;
+
+ MLX4_GET(field, outbox, QUERY_FUNC_BUS_OFFSET);
+ func->bus = field & 0xf;
+ MLX4_GET(field, outbox, QUERY_FUNC_DEVICE_OFFSET);
+ func->device = field & 0xf1;
+ MLX4_GET(field, outbox, QUERY_FUNC_FUNCTION_OFFSET);
+ func->function = field & 0x7;
+ MLX4_GET(field, outbox, QUERY_FUNC_PHYSICAL_FUNCTION_OFFSET);
+ func->physical_function = field & 0xf;
+ MLX4_GET(field16, outbox, QUERY_FUNC_RSVD_EQS_OFFSET);
+ func->rsvd_eqs = field16 & 0xffff;
+ MLX4_GET(field16, outbox, QUERY_FUNC_MAX_EQ_OFFSET);
+ func->max_eq = field16 & 0xffff;
+ MLX4_GET(field, outbox, QUERY_FUNC_RSVD_UARS_OFFSET);
+ func->rsvd_uars = field & 0x0f;
+
+ mlx4_dbg(dev, "Bus: %d, Device: %d, Function: %d, Physical function: %d, Max EQs: %d, Reserved EQs: %d, Reserved UARs: %d\n",
+ func->bus, func->device, func->function, func->physical_function,
+ func->max_eq, func->rsvd_eqs, func->rsvd_uars);
+
+out:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+
+int mlx4_QUERY_FUNC_CAP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ u8 field, port;
+ u32 size, proxy_qp, qkey;
+ int err = 0;
+ struct mlx4_func func;
+
+#define QUERY_FUNC_CAP_FLAGS_OFFSET 0x0
+#define QUERY_FUNC_CAP_NUM_PORTS_OFFSET 0x1
+#define QUERY_FUNC_CAP_PF_BHVR_OFFSET 0x4
+#define QUERY_FUNC_CAP_FMR_OFFSET 0x8
+#define QUERY_FUNC_CAP_QP_QUOTA_OFFSET_DEP 0x10
+#define QUERY_FUNC_CAP_CQ_QUOTA_OFFSET_DEP 0x14
+#define QUERY_FUNC_CAP_SRQ_QUOTA_OFFSET_DEP 0x18
+#define QUERY_FUNC_CAP_MPT_QUOTA_OFFSET_DEP 0x20
+#define QUERY_FUNC_CAP_MTT_QUOTA_OFFSET_DEP 0x24
+#define QUERY_FUNC_CAP_MCG_QUOTA_OFFSET_DEP 0x28
+#define QUERY_FUNC_CAP_MAX_EQ_OFFSET 0x2c
+#define QUERY_FUNC_CAP_RESERVED_EQ_OFFSET 0x30
+#define QUERY_FUNC_CAP_QP_RESD_LKEY_OFFSET 0x48
+
+#define QUERY_FUNC_CAP_QP_QUOTA_OFFSET 0x50
+#define QUERY_FUNC_CAP_CQ_QUOTA_OFFSET 0x54
+#define QUERY_FUNC_CAP_SRQ_QUOTA_OFFSET 0x58
+#define QUERY_FUNC_CAP_MPT_QUOTA_OFFSET 0x60
+#define QUERY_FUNC_CAP_MTT_QUOTA_OFFSET 0x64
+#define QUERY_FUNC_CAP_MCG_QUOTA_OFFSET 0x68
+
+#define QUERY_FUNC_CAP_EXTRA_FLAGS_OFFSET 0x6c
+
+#define QUERY_FUNC_CAP_FMR_FLAG 0x80
+#define QUERY_FUNC_CAP_FLAG_RDMA 0x40
+#define QUERY_FUNC_CAP_FLAG_ETH 0x80
+#define QUERY_FUNC_CAP_FLAG_QUOTAS 0x10
+#define QUERY_FUNC_CAP_FLAG_RESD_LKEY 0x08
+#define QUERY_FUNC_CAP_FLAG_VALID_MAILBOX 0x04
+
+#define QUERY_FUNC_CAP_EXTRA_FLAGS_BF_QP_ALLOC_FLAG (1UL << 31)
+#define QUERY_FUNC_CAP_EXTRA_FLAGS_A0_QP_ALLOC_FLAG (1UL << 30)
+#define QUERY_FUNC_CAP_EXTRA_FLAGS_ROCE_MODE_PER_ADDR_FLAG (1UL << 27)
+
+/* when opcode modifier = 1 */
+#define QUERY_FUNC_CAP_PHYS_PORT_OFFSET 0x3
+#define QUERY_FUNC_CAP_PRIV_VF_QKEY_OFFSET 0x4
+#define QUERY_FUNC_CAP_FLAGS0_OFFSET 0x8
+#define QUERY_FUNC_CAP_FLAGS1_OFFSET 0xc
+#define QUERY_FUNC_CAP_COUNTER_INDEX_OFFSET 0xd
+
+#define QUERY_FUNC_CAP_QP0_TUNNEL 0x10
+#define QUERY_FUNC_CAP_QP0_PROXY 0x14
+#define QUERY_FUNC_CAP_QP1_TUNNEL 0x18
+#define QUERY_FUNC_CAP_QP1_PROXY 0x1c
+#define QUERY_FUNC_CAP_PHYS_PORT_ID 0x28
+
+#define QUERY_FUNC_CAP_FLAGS1_FORCE_MAC 0x40
+#define QUERY_FUNC_CAP_FLAGS1_FORCE_VLAN 0x80
+#define QUERY_FUNC_CAP_FLAGS1_NIC_INFO 0x10
+#define QUERY_FUNC_CAP_PROPS_DEF_COUNTER 0x20
+#define QUERY_FUNC_CAP_VF_ENABLE_QP0 0x08
+
+#define QUERY_FUNC_CAP_FLAGS0_FORCE_PHY_WQE_GID 0x80
+#define QUERY_FUNC_CAP_SUPPORTS_NON_POWER_OF_2_NUM_EQS (1 << 31)
+
+ if (vhcr->op_modifier == 1) {
+ struct mlx4_active_ports actv_ports =
+ mlx4_get_active_ports(dev, slave);
+ int converted_port = mlx4_slave_convert_port(
+ dev, slave, vhcr->in_modifier);
+
+ if (converted_port < 0)
+ return -EINVAL;
+
+ vhcr->in_modifier = converted_port;
+ /* phys-port = logical-port */
+ field = vhcr->in_modifier -
+ find_first_bit(actv_ports.ports, dev->caps.num_ports);
+ MLX4_PUT(outbox->buf, field, QUERY_FUNC_CAP_PHYS_PORT_OFFSET);
+
+ port = vhcr->in_modifier;
+ proxy_qp = dev->phys_caps.base_proxy_sqpn + 8 * slave + port - 1;
+
+ /* Set nic_info bit to mark new fields support */
+ field = QUERY_FUNC_CAP_FLAGS1_NIC_INFO;
+ field |= QUERY_FUNC_CAP_PROPS_DEF_COUNTER; /* def counter */
+
+ if (mlx4_vf_smi_enabled(dev, slave, port) &&
+ !mlx4_get_parav_qkey(dev, proxy_qp, &qkey)) {
+ field |= QUERY_FUNC_CAP_VF_ENABLE_QP0;
+ MLX4_PUT(outbox->buf, qkey,
+ QUERY_FUNC_CAP_PRIV_VF_QKEY_OFFSET);
+ }
+ MLX4_PUT(outbox->buf, field, QUERY_FUNC_CAP_FLAGS1_OFFSET);
+
+ /* There is always a default counter: a legal one or the sink counter */
+ field = mlx4_get_default_counter_index(dev, slave, vhcr->in_modifier);
+ MLX4_PUT(outbox->buf, field, QUERY_FUNC_CAP_COUNTER_INDEX_OFFSET);
+
+ /* size is now the QP number */
+ size = dev->phys_caps.base_tunnel_sqpn + 8 * slave + port - 1;
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_QP0_TUNNEL);
+
+ size += 2;
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_QP1_TUNNEL);
+
+ MLX4_PUT(outbox->buf, proxy_qp, QUERY_FUNC_CAP_QP0_PROXY);
+ proxy_qp += 2;
+ MLX4_PUT(outbox->buf, proxy_qp, QUERY_FUNC_CAP_QP1_PROXY);
+
+ MLX4_PUT(outbox->buf, dev->caps.phys_port_id[vhcr->in_modifier],
+ QUERY_FUNC_CAP_PHYS_PORT_ID);
+
+ } else if (vhcr->op_modifier == 0) {
+ struct mlx4_active_ports actv_ports =
+ mlx4_get_active_ports(dev, slave);
+ /* enable rdma and ethernet interfaces, new quota locations,
+ * and reserved lkey
+ */
+ field = (QUERY_FUNC_CAP_FLAG_ETH | QUERY_FUNC_CAP_FLAG_RDMA |
+ QUERY_FUNC_CAP_FLAG_QUOTAS | QUERY_FUNC_CAP_FLAG_VALID_MAILBOX |
+ QUERY_FUNC_CAP_FLAG_RESD_LKEY);
+ MLX4_PUT(outbox->buf, field, QUERY_FUNC_CAP_FLAGS_OFFSET);
+
+ field = min(
+ bitmap_weight(actv_ports.ports, dev->caps.num_ports),
+ dev->caps.num_ports);
+ MLX4_PUT(outbox->buf, field, QUERY_FUNC_CAP_NUM_PORTS_OFFSET);
+
+ size = dev->caps.function_caps; /* set PF behaviours */
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_PF_BHVR_OFFSET);
+
+ field = 0; /* protected FMR support not available as yet */
+ MLX4_PUT(outbox->buf, field, QUERY_FUNC_CAP_FMR_OFFSET);
+
+ size = priv->mfunc.master.res_tracker.res_alloc[RES_QP].quota[slave];
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_QP_QUOTA_OFFSET);
+ size = dev->caps.num_qps;
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_QP_QUOTA_OFFSET_DEP);
+
+ size = priv->mfunc.master.res_tracker.res_alloc[RES_SRQ].quota[slave];
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_SRQ_QUOTA_OFFSET);
+ size = dev->caps.num_srqs;
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_SRQ_QUOTA_OFFSET_DEP);
+
+ size = priv->mfunc.master.res_tracker.res_alloc[RES_CQ].quota[slave];
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_CQ_QUOTA_OFFSET);
+ size = dev->caps.num_cqs;
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_CQ_QUOTA_OFFSET_DEP);
+
+ if (!(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_SYS_EQS) ||
+ mlx4_QUERY_FUNC(dev, &func, slave)) {
+ size = vhcr->in_modifier &
+ QUERY_FUNC_CAP_SUPPORTS_NON_POWER_OF_2_NUM_EQS ?
+ dev->caps.num_eqs :
+ rounddown_pow_of_two(dev->caps.num_eqs);
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_MAX_EQ_OFFSET);
+ size = dev->caps.reserved_eqs;
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_RESERVED_EQ_OFFSET);
+ } else {
+ size = vhcr->in_modifier &
+ QUERY_FUNC_CAP_SUPPORTS_NON_POWER_OF_2_NUM_EQS ?
+ func.max_eq :
+ rounddown_pow_of_two(func.max_eq);
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_MAX_EQ_OFFSET);
+ size = func.rsvd_eqs;
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_RESERVED_EQ_OFFSET);
+ }
+
+ size = priv->mfunc.master.res_tracker.res_alloc[RES_MPT].quota[slave];
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_MPT_QUOTA_OFFSET);
+ size = dev->caps.num_mpts;
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_MPT_QUOTA_OFFSET_DEP);
+
+ size = priv->mfunc.master.res_tracker.res_alloc[RES_MTT].quota[slave];
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_MTT_QUOTA_OFFSET);
+ size = dev->caps.num_mtts;
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_MTT_QUOTA_OFFSET_DEP);
+
+ size = dev->caps.num_mgms + dev->caps.num_amgms;
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_MCG_QUOTA_OFFSET);
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_MCG_QUOTA_OFFSET_DEP);
+
+ size = QUERY_FUNC_CAP_EXTRA_FLAGS_BF_QP_ALLOC_FLAG |
+ QUERY_FUNC_CAP_EXTRA_FLAGS_A0_QP_ALLOC_FLAG;
+ if (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2)
+ size |= QUERY_FUNC_CAP_EXTRA_FLAGS_ROCE_MODE_PER_ADDR_FLAG;
+
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_EXTRA_FLAGS_OFFSET);
+
+ size = dev->caps.reserved_lkey + ((slave << 8) & 0xFF00);
+ MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_QP_RESD_LKEY_OFFSET);
+ } else
+ err = -EINVAL;
+
+ return err;
+}
+
+int mlx4_QUERY_FUNC_CAP(struct mlx4_dev *dev, u8 gen_or_port,
+ struct mlx4_func_cap *func_cap)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ u32 *outbox;
+ u8 field, field1, op_modifier;
+ u32 size, qkey;
+ int err = 0, quotas = 0;
+ u32 in_modifier;
+
+ op_modifier = !!gen_or_port; /* 0 = general, 1 = logical port */
+ in_modifier = op_modifier ? gen_or_port :
+ QUERY_FUNC_CAP_SUPPORTS_NON_POWER_OF_2_NUM_EQS;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ err = mlx4_cmd_box(dev, 0, mailbox->dma, in_modifier, op_modifier,
+ MLX4_CMD_QUERY_FUNC_CAP,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+ if (err)
+ goto out;
+
+ outbox = mailbox->buf;
+
+ if (!op_modifier) {
+ MLX4_GET(field, outbox, QUERY_FUNC_CAP_FLAGS_OFFSET);
+ if (!(field & (QUERY_FUNC_CAP_FLAG_ETH | QUERY_FUNC_CAP_FLAG_RDMA))) {
+ mlx4_err(dev, "The host supports neither eth nor rdma interfaces\n");
+ err = -EPROTONOSUPPORT;
+ goto out;
+ }
+ func_cap->flags = field;
+ quotas = !!(func_cap->flags & QUERY_FUNC_CAP_FLAG_QUOTAS);
+
+ MLX4_GET(field, outbox, QUERY_FUNC_CAP_NUM_PORTS_OFFSET);
+ func_cap->num_ports = field;
+
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_PF_BHVR_OFFSET);
+ func_cap->pf_context_behaviour = size;
+
+ if (quotas) {
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_QP_QUOTA_OFFSET);
+ func_cap->qp_quota = size & 0xFFFFFF;
+
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_SRQ_QUOTA_OFFSET);
+ func_cap->srq_quota = size & 0xFFFFFF;
+
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_CQ_QUOTA_OFFSET);
+ func_cap->cq_quota = size & 0xFFFFFF;
+
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_MPT_QUOTA_OFFSET);
+ func_cap->mpt_quota = size & 0xFFFFFF;
+
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_MTT_QUOTA_OFFSET);
+ func_cap->mtt_quota = size & 0xFFFFFF;
+
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_MCG_QUOTA_OFFSET);
+ func_cap->mcg_quota = size & 0xFFFFFF;
+
+ } else {
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_QP_QUOTA_OFFSET_DEP);
+ func_cap->qp_quota = size & 0xFFFFFF;
+
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_SRQ_QUOTA_OFFSET_DEP);
+ func_cap->srq_quota = size & 0xFFFFFF;
+
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_CQ_QUOTA_OFFSET_DEP);
+ func_cap->cq_quota = size & 0xFFFFFF;
+
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_MPT_QUOTA_OFFSET_DEP);
+ func_cap->mpt_quota = size & 0xFFFFFF;
+
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_MTT_QUOTA_OFFSET_DEP);
+ func_cap->mtt_quota = size & 0xFFFFFF;
+
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_MCG_QUOTA_OFFSET_DEP);
+ func_cap->mcg_quota = size & 0xFFFFFF;
+ }
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_MAX_EQ_OFFSET);
+ func_cap->max_eq = size & 0xFFFFFF;
+
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_RESERVED_EQ_OFFSET);
+ func_cap->reserved_eq = size & 0xFFFFFF;
+
+ if (func_cap->flags & QUERY_FUNC_CAP_FLAG_RESD_LKEY) {
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_QP_RESD_LKEY_OFFSET);
+ func_cap->reserved_lkey = size;
+ } else {
+ func_cap->reserved_lkey = 0;
+ }
+
+ func_cap->extra_flags = 0;
+
+ /* Mailbox data from 0x6c and onward should only be treated if
+ * QUERY_FUNC_CAP_FLAG_VALID_MAILBOX is set in func_cap->flags
+ */
+ if (func_cap->flags & QUERY_FUNC_CAP_FLAG_VALID_MAILBOX) {
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_EXTRA_FLAGS_OFFSET);
+ if (size & QUERY_FUNC_CAP_EXTRA_FLAGS_BF_QP_ALLOC_FLAG)
+ func_cap->extra_flags |= MLX4_QUERY_FUNC_FLAGS_BF_RES_QP;
+ if (size & QUERY_FUNC_CAP_EXTRA_FLAGS_A0_QP_ALLOC_FLAG)
+ func_cap->extra_flags |= MLX4_QUERY_FUNC_FLAGS_A0_RES_QP;
+ if (size & QUERY_FUNC_CAP_EXTRA_FLAGS_ROCE_MODE_PER_ADDR_FLAG)
+ func_cap->extra_flags |= MLX4_QUERY_FUNC_FLAGS_ROCE_ADDR;
+ }
+
+ goto out;
+ }
+
+ /* logical port query */
+ if (gen_or_port > dev->caps.num_ports) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ MLX4_GET(func_cap->flags1, outbox, QUERY_FUNC_CAP_FLAGS1_OFFSET);
+ if (dev->caps.port_type[gen_or_port] == MLX4_PORT_TYPE_ETH) {
+ if (func_cap->flags1 & QUERY_FUNC_CAP_FLAGS1_FORCE_VLAN) {
+ mlx4_err(dev, "VLAN is enforced on this port\n");
+ err = -EPROTONOSUPPORT;
+ goto out;
+ }
+
+ if (func_cap->flags1 & QUERY_FUNC_CAP_FLAGS1_FORCE_MAC) {
+ mlx4_err(dev, "Force mac is enabled on this port\n");
+ err = -EPROTONOSUPPORT;
+ goto out;
+ }
+ } else if (dev->caps.port_type[gen_or_port] == MLX4_PORT_TYPE_IB) {
+ MLX4_GET(field, outbox, QUERY_FUNC_CAP_FLAGS0_OFFSET);
+ if (field & QUERY_FUNC_CAP_FLAGS0_FORCE_PHY_WQE_GID) {
+ mlx4_err(dev, "phy_wqe_gid is enforced on this ib port\n");
+ err = -EPROTONOSUPPORT;
+ goto out;
+ }
+ }
+
+ MLX4_GET(field, outbox, QUERY_FUNC_CAP_PHYS_PORT_OFFSET);
+ func_cap->physical_port = field;
+ if (func_cap->physical_port != gen_or_port) {
+ err = -ENOSYS;
+ goto out;
+ }
+
+ MLX4_GET(field, outbox, QUERY_FUNC_CAP_FLAGS1_OFFSET);
+ if (field & QUERY_FUNC_CAP_PROPS_DEF_COUNTER) {
+ MLX4_GET(field1, outbox, QUERY_FUNC_CAP_COUNTER_INDEX_OFFSET);
+ func_cap->def_counter_index = field1;
+ } else {
+ func_cap->def_counter_index = MLX4_SINK_COUNTER_INDEX;
+ }
+
+ if (func_cap->flags1 & QUERY_FUNC_CAP_VF_ENABLE_QP0) {
+ MLX4_GET(qkey, outbox, QUERY_FUNC_CAP_PRIV_VF_QKEY_OFFSET);
+ func_cap->qp0_qkey = qkey;
+ } else {
+ func_cap->qp0_qkey = 0;
+ }
+
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_QP0_TUNNEL);
+ func_cap->qp0_tunnel_qpn = size & 0xFFFFFF;
+
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_QP0_PROXY);
+ func_cap->qp0_proxy_qpn = size & 0xFFFFFF;
+
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_QP1_TUNNEL);
+ func_cap->qp1_tunnel_qpn = size & 0xFFFFFF;
+
+ MLX4_GET(size, outbox, QUERY_FUNC_CAP_QP1_PROXY);
+ func_cap->qp1_proxy_qpn = size & 0xFFFFFF;
+
+ if (func_cap->flags1 & QUERY_FUNC_CAP_FLAGS1_NIC_INFO)
+ MLX4_GET(func_cap->phys_port_id, outbox,
+ QUERY_FUNC_CAP_PHYS_PORT_ID);
+
+ /* All other resources are allocated by the master, but we still report
+ * 'num' and 'reserved' capabilities as follows:
+ * - num remains the maximum resource index
+ * - 'num - reserved' is the total available objects of a resource, but
+ * resource indices may be less than 'reserved'
+ * TODO: set per-resource quotas */
+
+out:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+
+ return err;
+}
+
+int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ u32 *outbox;
+ u8 field;
+ u32 field32, flags, ext_flags;
+ u16 size;
+ u16 stat_rate;
+ int err;
+ int i;
+
+#define QUERY_DEV_CAP_OUT_SIZE 0x100
+#define QUERY_DEV_CAP_MAX_SRQ_SZ_OFFSET 0x10
+#define QUERY_DEV_CAP_MAX_QP_SZ_OFFSET 0x11
+#define QUERY_DEV_CAP_RSVD_QP_OFFSET 0x12
+#define QUERY_DEV_CAP_MAX_QP_OFFSET 0x13
+#define QUERY_DEV_CAP_RSVD_SRQ_OFFSET 0x14
+#define QUERY_DEV_CAP_MAX_SRQ_OFFSET 0x15
+#define QUERY_DEV_CAP_RSVD_EEC_OFFSET 0x16
+#define QUERY_DEV_CAP_MAX_EEC_OFFSET 0x17
+#define QUERY_DEV_CAP_MAX_CQ_SZ_OFFSET 0x19
+#define QUERY_DEV_CAP_RSVD_CQ_OFFSET 0x1a
+#define QUERY_DEV_CAP_MAX_CQ_OFFSET 0x1b
+#define QUERY_DEV_CAP_MAX_MPT_OFFSET 0x1d
+#define QUERY_DEV_CAP_RSVD_EQ_OFFSET 0x1e
+#define QUERY_DEV_CAP_MAX_EQ_OFFSET 0x1f
+#define QUERY_DEV_CAP_RSVD_MTT_OFFSET 0x20
+#define QUERY_DEV_CAP_MAX_MRW_SZ_OFFSET 0x21
+#define QUERY_DEV_CAP_RSVD_MRW_OFFSET 0x22
+#define QUERY_DEV_CAP_MAX_MTT_SEG_OFFSET 0x23
+#define QUERY_DEV_CAP_NUM_SYS_EQ_OFFSET 0x26
+#define QUERY_DEV_CAP_MAX_AV_OFFSET 0x27
+#define QUERY_DEV_CAP_MAX_REQ_QP_OFFSET 0x29
+#define QUERY_DEV_CAP_MAX_RES_QP_OFFSET 0x2b
+#define QUERY_DEV_CAP_MAX_GSO_OFFSET 0x2d
+#define QUERY_DEV_CAP_RSS_OFFSET 0x2e
+#define QUERY_DEV_CAP_MAX_RDMA_OFFSET 0x2f
+#define QUERY_DEV_CAP_RSZ_SRQ_OFFSET 0x33
+#define QUERY_DEV_CAP_PORT_BEACON_OFFSET 0x34
+#define QUERY_DEV_CAP_ACK_DELAY_OFFSET 0x35
+#define QUERY_DEV_CAP_MTU_WIDTH_OFFSET 0x36
+#define QUERY_DEV_CAP_VL_PORT_OFFSET 0x37
+#define QUERY_DEV_CAP_MAX_MSG_SZ_OFFSET 0x38
+#define QUERY_DEV_CAP_MAX_GID_OFFSET 0x3b
+#define QUERY_DEV_CAP_RATE_SUPPORT_OFFSET 0x3c
+#define QUERY_DEV_CAP_CQ_TS_SUPPORT_OFFSET 0x3e
+#define QUERY_DEV_CAP_MAX_PKEY_OFFSET 0x3f
+#define QUERY_DEV_CAP_EXT_FLAGS_OFFSET 0x40
+#define QUERY_DEV_CAP_FLAGS_OFFSET 0x44
+#define QUERY_DEV_CAP_RSVD_UAR_OFFSET 0x48
+#define QUERY_DEV_CAP_UAR_SZ_OFFSET 0x49
+#define QUERY_DEV_CAP_PAGE_SZ_OFFSET 0x4b
+#define QUERY_DEV_CAP_BF_OFFSET 0x4c
+#define QUERY_DEV_CAP_LOG_BF_REG_SZ_OFFSET 0x4d
+#define QUERY_DEV_CAP_LOG_MAX_BF_REGS_PER_PAGE_OFFSET 0x4e
+#define QUERY_DEV_CAP_LOG_MAX_BF_PAGES_OFFSET 0x4f
+#define QUERY_DEV_CAP_MAX_SG_SQ_OFFSET 0x51
+#define QUERY_DEV_CAP_MAX_DESC_SZ_SQ_OFFSET 0x52
+#define QUERY_DEV_CAP_MAX_SG_RQ_OFFSET 0x55
+#define QUERY_DEV_CAP_MAX_DESC_SZ_RQ_OFFSET 0x56
+#define QUERY_DEV_CAP_MAX_QP_MCG_OFFSET 0x61
+#define QUERY_DEV_CAP_RSVD_MCG_OFFSET 0x62
+#define QUERY_DEV_CAP_MAX_MCG_OFFSET 0x63
+#define QUERY_DEV_CAP_RSVD_PD_OFFSET 0x64
+#define QUERY_DEV_CAP_MAX_PD_OFFSET 0x65
+#define QUERY_DEV_CAP_RSVD_XRC_OFFSET 0x66
+#define QUERY_DEV_CAP_MAX_XRC_OFFSET 0x67
+#define QUERY_DEV_CAP_MAX_BASIC_COUNTERS_OFFSET 0x68
+#define QUERY_DEV_CAP_MAX_EXTENDED_COUNTERS_OFFSET 0x6c
+#define QUERY_DEV_CAP_PORT_FLOWSTATS_COUNTERS_OFFSET 0x70
+#define QUERY_DEV_CAP_EXT_2_FLAGS_OFFSET 0x70
+#define QUERY_DEV_CAP_FLOW_STEERING_IPOIB_OFFSET 0x74
+#define QUERY_DEV_CAP_FLOW_STEERING_RANGE_EN_OFFSET 0x76
+#define QUERY_DEV_CAP_FLOW_STEERING_MAX_QP_OFFSET 0x77
+#define QUERY_DEV_CAP_CQ_OVERRUN_OFFSET 0x7a
+#define QUERY_DEV_CAP_CQ_EQ_CACHE_LINE_STRIDE 0x7a
+#define QUERY_DEV_CAP_ETH_PROT_CTRL_OFFSET 0x7a
+#define QUERY_DEV_CAP_ECN_QCN_VER_OFFSET 0x7B
+#define QUERY_DEV_CAP_RDMARC_ENTRY_SZ_OFFSET 0x80
+#define QUERY_DEV_CAP_QPC_ENTRY_SZ_OFFSET 0x82
+#define QUERY_DEV_CAP_AUX_ENTRY_SZ_OFFSET 0x84
+#define QUERY_DEV_CAP_ALTC_ENTRY_SZ_OFFSET 0x86
+#define QUERY_DEV_CAP_EQC_ENTRY_SZ_OFFSET 0x88
+#define QUERY_DEV_CAP_CQC_ENTRY_SZ_OFFSET 0x8a
+#define QUERY_DEV_CAP_SRQ_ENTRY_SZ_OFFSET 0x8c
+#define QUERY_DEV_CAP_C_MPT_ENTRY_SZ_OFFSET 0x8e
+#define QUERY_DEV_CAP_MTT_ENTRY_SZ_OFFSET 0x90
+#define QUERY_DEV_CAP_D_MPT_ENTRY_SZ_OFFSET 0x92
+#define QUERY_DEV_CAP_BMME_FLAGS_OFFSET 0x94
+#define QUERY_DEV_CAP_CONFIG_DEV_OFFSET 0x94
+#define QUERY_DEV_CAP_RSVD_LKEY_OFFSET 0x98
+#define QUERY_DEV_CAP_MAX_ICM_SZ_OFFSET 0xa0
+#define QUERY_DEV_CAP_ETH_BACKPL_OFFSET 0x9c
+#define QUERY_DEV_CAP_FW_REASSIGN_MAC 0x9d
+#define QUERY_DEV_CAP_VXLAN 0x9e
+#define QUERY_DEV_CAP_ADD_MAC 0x9f
+#define QUERY_DEV_CAP_MAD_DEMUX_OFFSET 0xb0
+#define QUERY_DEV_CAP_DMFS_HIGH_RATE_QPN_BASE_OFFSET 0xa8
+#define QUERY_DEV_CAP_DMFS_HIGH_RATE_QPN_RANGE_OFFSET 0xac
+
+ dev_cap->flags2 = 0;
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ outbox = mailbox->buf;
+
+ err = mlx4_cmd_box(dev, 0, mailbox->dma, 0, 0, MLX4_CMD_QUERY_DEV_CAP,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+ if (err)
+ goto out;
+
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_RSVD_QP_OFFSET);
+ dev_cap->reserved_qps = 1 << (field & 0xf);
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_QP_OFFSET);
+ dev_cap->max_qps = 1 << (field & 0x1f);
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_RSVD_SRQ_OFFSET);
+ dev_cap->reserved_srqs = 1 << (field >> 4);
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_SRQ_OFFSET);
+ dev_cap->max_srqs = 1 << (field & 0x1f);
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_CQ_SZ_OFFSET);
+ dev_cap->max_cq_sz = 1 << field;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_RSVD_CQ_OFFSET);
+ dev_cap->reserved_cqs = 1 << (field & 0xf);
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_CQ_OFFSET);
+ dev_cap->max_cqs = 1 << (field & 0x1f);
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_MPT_OFFSET);
+ dev_cap->max_mpts = 1 << (field & 0x3f);
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_RSVD_EQ_OFFSET);
+ dev_cap->reserved_eqs = 1 << (field & 0xf);
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_EQ_OFFSET);
+ dev_cap->max_eqs = 1 << (field & 0xf);
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_RSVD_MTT_OFFSET);
+ dev_cap->reserved_mtts = 1 << (field >> 4);
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_MRW_SZ_OFFSET);
+ dev_cap->max_mrw_sz = 1 << field;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_RSVD_MRW_OFFSET);
+ dev_cap->reserved_mrws = 1 << (field & 0xf);
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_MTT_SEG_OFFSET);
+ dev_cap->max_mtt_seg = 1 << (field & 0x3f);
+ MLX4_GET(size, outbox, QUERY_DEV_CAP_NUM_SYS_EQ_OFFSET);
+ dev_cap->num_sys_eqs = size & 0xfff;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_REQ_QP_OFFSET);
+ dev_cap->max_requester_per_qp = 1 << (field & 0x3f);
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_RES_QP_OFFSET);
+ dev_cap->max_responder_per_qp = 1 << (field & 0x3f);
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_GSO_OFFSET);
+ field &= 0x1f;
+ if (!field)
+ dev_cap->max_gso_sz = 0;
+ else
+ dev_cap->max_gso_sz = 1 << field;
+
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_RSS_OFFSET);
+ if (field & 0x20)
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_RSS_XOR;
+ if (field & 0x10)
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_RSS_TOP;
+ field &= 0xf;
+ if (field) {
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_RSS;
+ dev_cap->max_rss_tbl_sz = 1 << field;
+ } else
+ dev_cap->max_rss_tbl_sz = 0;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_RDMA_OFFSET);
+ dev_cap->max_rdma_global = 1 << (field & 0x3f);
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_ACK_DELAY_OFFSET);
+ dev_cap->local_ca_ack_delay = field & 0x1f;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_VL_PORT_OFFSET);
+ dev_cap->num_ports = field & 0xf;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_MSG_SZ_OFFSET);
+ dev_cap->max_msg_sz = 1 << (field & 0x1f);
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_PORT_FLOWSTATS_COUNTERS_OFFSET);
+ if (field & 0x10)
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_FLOWSTATS_EN;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_FLOW_STEERING_RANGE_EN_OFFSET);
+ if (field & 0x80)
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_FS_EN;
+ if (field & 0x40)
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_FS_EN_NCSI;
+ dev_cap->fs_log_max_ucast_qp_range_size = field & 0x1f;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_PORT_BEACON_OFFSET);
+ if (field & 0x80)
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_PORT_BEACON;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_FLOW_STEERING_IPOIB_OFFSET);
+ if (field & 0x1)
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_DISABLE_SIP_CHECK;
+ if (field & 0x80)
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_DMFS_IPOIB;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_FLOW_STEERING_MAX_QP_OFFSET);
+ dev_cap->fs_max_num_qp_per_entry = field;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_ECN_QCN_VER_OFFSET);
+ if (field & 0x1)
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_QCN;
+ MLX4_GET(stat_rate, outbox, QUERY_DEV_CAP_RATE_SUPPORT_OFFSET);
+ dev_cap->stat_rate_support = stat_rate;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_CQ_TS_SUPPORT_OFFSET);
+ if (field & 0x80)
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_TS;
+ MLX4_GET(ext_flags, outbox, QUERY_DEV_CAP_EXT_FLAGS_OFFSET);
+ MLX4_GET(flags, outbox, QUERY_DEV_CAP_FLAGS_OFFSET);
+ dev_cap->flags = flags | (u64)ext_flags << 32;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_RSVD_UAR_OFFSET);
+ dev_cap->reserved_uars = field >> 4;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_UAR_SZ_OFFSET);
+ dev_cap->uar_size = 1 << ((field & 0x3f) + 20);
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_PAGE_SZ_OFFSET);
+ dev_cap->min_page_sz = 1 << field;
+
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_BF_OFFSET);
+ if (field & 0x80) {
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_LOG_BF_REG_SZ_OFFSET);
+ dev_cap->bf_reg_size = 1 << (field & 0x1f);
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_LOG_MAX_BF_REGS_PER_PAGE_OFFSET);
+ if ((1 << (field & 0x3f)) > (PAGE_SIZE / dev_cap->bf_reg_size))
+ field = 3;
+ dev_cap->bf_regs_per_page = 1 << (field & 0x3f);
+ } else {
+ dev_cap->bf_reg_size = 0;
+ }
+
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_SG_SQ_OFFSET);
+ dev_cap->max_sq_sg = field;
+ MLX4_GET(size, outbox, QUERY_DEV_CAP_MAX_DESC_SZ_SQ_OFFSET);
+ dev_cap->max_sq_desc_sz = size;
+
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_QP_MCG_OFFSET);
+ dev_cap->max_qp_per_mcg = 1 << field;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_RSVD_MCG_OFFSET);
+ dev_cap->reserved_mgms = field & 0xf;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_MCG_OFFSET);
+ dev_cap->max_mcgs = 1 << field;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_RSVD_PD_OFFSET);
+ dev_cap->reserved_pds = field >> 4;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_PD_OFFSET);
+ dev_cap->max_pds = 1 << (field & 0x3f);
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_RSVD_XRC_OFFSET);
+ dev_cap->reserved_xrcds = field >> 4;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_XRC_OFFSET);
+ dev_cap->max_xrcds = 1 << (field & 0x1f);
+
+ MLX4_GET(size, outbox, QUERY_DEV_CAP_RDMARC_ENTRY_SZ_OFFSET);
+ dev_cap->rdmarc_entry_sz = size;
+ MLX4_GET(size, outbox, QUERY_DEV_CAP_QPC_ENTRY_SZ_OFFSET);
+ dev_cap->qpc_entry_sz = size;
+ MLX4_GET(size, outbox, QUERY_DEV_CAP_AUX_ENTRY_SZ_OFFSET);
+ dev_cap->aux_entry_sz = size;
+ MLX4_GET(size, outbox, QUERY_DEV_CAP_ALTC_ENTRY_SZ_OFFSET);
+ dev_cap->altc_entry_sz = size;
+ MLX4_GET(size, outbox, QUERY_DEV_CAP_EQC_ENTRY_SZ_OFFSET);
+ dev_cap->eqc_entry_sz = size;
+ MLX4_GET(size, outbox, QUERY_DEV_CAP_CQC_ENTRY_SZ_OFFSET);
+ dev_cap->cqc_entry_sz = size;
+ MLX4_GET(size, outbox, QUERY_DEV_CAP_SRQ_ENTRY_SZ_OFFSET);
+ dev_cap->srq_entry_sz = size;
+ MLX4_GET(size, outbox, QUERY_DEV_CAP_C_MPT_ENTRY_SZ_OFFSET);
+ dev_cap->cmpt_entry_sz = size;
+ MLX4_GET(size, outbox, QUERY_DEV_CAP_MTT_ENTRY_SZ_OFFSET);
+ dev_cap->mtt_entry_sz = size;
+ MLX4_GET(size, outbox, QUERY_DEV_CAP_D_MPT_ENTRY_SZ_OFFSET);
+ dev_cap->dmpt_entry_sz = size;
+
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_SRQ_SZ_OFFSET);
+ dev_cap->max_srq_sz = 1 << field;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_QP_SZ_OFFSET);
+ dev_cap->max_qp_sz = 1 << field;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_RSZ_SRQ_OFFSET);
+ dev_cap->resize_srq = field & 1;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_SG_RQ_OFFSET);
+ dev_cap->max_rq_sg = field;
+ MLX4_GET(size, outbox, QUERY_DEV_CAP_MAX_DESC_SZ_RQ_OFFSET);
+ dev_cap->max_rq_desc_sz = size;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_CQ_OVERRUN_OFFSET);
+ dev_cap->cq_overrun = (field >> 1) & 1;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_CQ_EQ_CACHE_LINE_STRIDE);
+ if (field & (1 << 4) && enable_vfs_qos)
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_QOS_VPP;
+ if (field & (1 << 5))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_ETH_PROT_CTRL;
+ if (field & (1 << 6))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_CQE_STRIDE;
+ if (field & (1 << 7))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_EQE_STRIDE;
+ MLX4_GET(dev_cap->bmme_flags, outbox,
+ QUERY_DEV_CAP_BMME_FLAGS_OFFSET);
+ if (dev_cap->bmme_flags & MLX4_FLAG_ROCE_V1_V2)
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_ROCE_V1_V2;
+ if (dev_cap->bmme_flags & MLX4_FLAG_PORT_REMAP)
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_PORT_REMAP;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_CONFIG_DEV_OFFSET);
+ if (field & 0x20)
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_CONFIG_DEV;
+ if (field & (1 << 2))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_IGNORE_FCS;
+ MLX4_GET(dev_cap->reserved_lkey, outbox,
+ QUERY_DEV_CAP_RSVD_LKEY_OFFSET);
+ MLX4_GET(field32, outbox, QUERY_DEV_CAP_ETH_BACKPL_OFFSET);
+ if (field32 & (1 << 0))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_ETH_BACKPL_AN_REP;
+ if (field32 & (1 << 4))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_MODIFY_PARSER;
+ if (field32 & (1 << 7))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_RECOVERABLE_ERROR_EVENT;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_FW_REASSIGN_MAC);
+ if (field & (1 << 6))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_REASSIGN_MAC_EN;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_VXLAN);
+ if (field & (1 << 3))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_VXLAN_OFFLOADS;
+ if (field & (1 << 5))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_ETS_CFG;
+ MLX4_GET(dev_cap->max_icm_sz, outbox,
+ QUERY_DEV_CAP_MAX_ICM_SZ_OFFSET);
+ if (dev_cap->flags & MLX4_DEV_CAP_FLAG_COUNTERS)
+ MLX4_GET(dev_cap->max_basic_counters, outbox,
+ QUERY_DEV_CAP_MAX_BASIC_COUNTERS_OFFSET);
+ /* FW reports 256, but the real value is 255 */
+ dev_cap->max_basic_counters = min_t(u32, dev_cap->max_basic_counters, 255);
+ if (dev_cap->flags & MLX4_DEV_CAP_FLAG_COUNTERS_EXT)
+ MLX4_GET(dev_cap->max_extended_counters, outbox,
+ QUERY_DEV_CAP_MAX_EXTENDED_COUNTERS_OFFSET);
+
+ MLX4_GET(field32, outbox,
+ QUERY_DEV_CAP_MAD_DEMUX_OFFSET);
+ if (field32 & (1 << 0))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_MAD_DEMUX;
+
+ MLX4_GET(dev_cap->dmfs_high_rate_qpn_base, outbox,
+ QUERY_DEV_CAP_DMFS_HIGH_RATE_QPN_BASE_OFFSET);
+ dev_cap->dmfs_high_rate_qpn_base &= MGM_QPN_MASK;
+ MLX4_GET(dev_cap->dmfs_high_rate_qpn_range, outbox,
+ QUERY_DEV_CAP_DMFS_HIGH_RATE_QPN_RANGE_OFFSET);
+ dev_cap->dmfs_high_rate_qpn_range &= MGM_QPN_MASK;
+
+ MLX4_GET(field32, outbox, QUERY_DEV_CAP_EXT_2_FLAGS_OFFSET);
+ if (field32 & (1 << 16))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_UPDATE_QP;
+ if (field32 & (1 << 18))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_UPDATE_QP_SRC_CHECK_LB;
+ if (field32 & (1 << 19))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_LB_SRC_CHK;
+ if (field32 & (1 << 26))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_VLAN_CONTROL;
+ if (field32 & (1 << 20))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_FSM;
+ if (field32 & (1 << 21))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_80_VFS;
+ if (field32 & (1 << 24))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_ROCEV2;
+
+
+ for (i = 1; i <= dev_cap->num_ports; i++) {
+ err = mlx4_QUERY_PORT(dev, i, dev_cap->port_cap + i);
+ if (err)
+ goto out;
+ }
+
+ /*
+ * Each UAR has 4 EQ doorbells; so if a UAR is reserved, then
+ * we can't use any EQs whose doorbell falls on that page,
+ * even if the EQ itself isn't reserved.
+ */
+ if (dev_cap->num_sys_eqs == 0)
+ dev_cap->reserved_eqs = max(dev_cap->reserved_uars * 4,
+ dev_cap->reserved_eqs);
+ else
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_SYS_EQS;
+
+out:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+
+void mlx4_dev_cap_dump(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
+{
+ if (dev_cap->bf_reg_size > 0)
+ mlx4_dbg(dev, "BlueFlame available (reg size %d, regs/page %d)\n",
+ dev_cap->bf_reg_size, dev_cap->bf_regs_per_page);
+ else
+ mlx4_dbg(dev, "BlueFlame not available\n");
+
+ mlx4_dbg(dev, "Base MM extensions: flags %08x, rsvd L_Key %08x\n",
+ dev_cap->bmme_flags, dev_cap->reserved_lkey);
+ mlx4_dbg(dev, "Max ICM size %lld MB\n",
+ (unsigned long long) dev_cap->max_icm_sz >> 20);
+ mlx4_dbg(dev, "Max QPs: %d, reserved QPs: %d, entry size: %d\n",
+ dev_cap->max_qps, dev_cap->reserved_qps, dev_cap->qpc_entry_sz);
+ mlx4_dbg(dev, "Max SRQs: %d, reserved SRQs: %d, entry size: %d\n",
+ dev_cap->max_srqs, dev_cap->reserved_srqs, dev_cap->srq_entry_sz);
+ mlx4_dbg(dev, "Max CQs: %d, reserved CQs: %d, entry size: %d\n",
+ dev_cap->max_cqs, dev_cap->reserved_cqs, dev_cap->cqc_entry_sz);
+ mlx4_dbg(dev, "Num sys EQs: %d, max EQs: %d, reserved EQs: %d, entry size: %d\n",
+ dev_cap->num_sys_eqs, dev_cap->max_eqs, dev_cap->reserved_eqs,
+ dev_cap->eqc_entry_sz);
+ mlx4_dbg(dev, "reserved MPTs: %d, reserved MTTs: %d\n",
+ dev_cap->reserved_mrws, dev_cap->reserved_mtts);
+ mlx4_dbg(dev, "Max PDs: %d, reserved PDs: %d, reserved UARs: %d\n",
+ dev_cap->max_pds, dev_cap->reserved_pds, dev_cap->reserved_uars);
+ mlx4_dbg(dev, "Max QP/MCG: %d, reserved MGMs: %d\n",
+ dev_cap->max_qp_per_mcg, dev_cap->reserved_mgms);
+ mlx4_dbg(dev, "Max CQEs: %d, max WQEs: %d, max SRQ WQEs: %d\n",
+ dev_cap->max_cq_sz, dev_cap->max_qp_sz, dev_cap->max_srq_sz);
+ mlx4_dbg(dev, "Local CA ACK delay: %d, max MTU: %d, port width cap: %d\n",
+ dev_cap->local_ca_ack_delay, 128 << dev_cap->port_cap[1].ib_mtu,
+ dev_cap->port_cap[1].max_port_width);
+ mlx4_dbg(dev, "Max SQ desc size: %d, max SQ S/G: %d\n",
+ dev_cap->max_sq_desc_sz, dev_cap->max_sq_sg);
+ mlx4_dbg(dev, "Max RQ desc size: %d, max RQ S/G: %d\n",
+ dev_cap->max_rq_desc_sz, dev_cap->max_rq_sg);
+ mlx4_dbg(dev, "Max GSO size: %d\n", dev_cap->max_gso_sz);
+ mlx4_dbg(dev, "Max basic counters: %d\n", dev_cap->max_basic_counters);
+ mlx4_dbg(dev, "Max extended counters: %d\n", dev_cap->max_extended_counters);
+ mlx4_dbg(dev, "Max RSS Table size: %d\n", dev_cap->max_rss_tbl_sz);
+ mlx4_dbg(dev, "DMFS high rate steer QPn base: %d\n",
+ dev_cap->dmfs_high_rate_qpn_base);
+ mlx4_dbg(dev, "DMFS high rate steer QPn range: %d\n",
+ dev_cap->dmfs_high_rate_qpn_range);
+ dump_dev_cap_flags(dev, dev_cap->flags);
+ dump_dev_cap_flags2(dev, dev_cap->flags2);
+}
+
+int mlx4_QUERY_PORT(struct mlx4_dev *dev, int port, struct mlx4_port_cap *port_cap)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ u32 *outbox;
+ u8 field;
+ u32 field32;
+ int err;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ outbox = mailbox->buf;
+
+ if (dev->flags & MLX4_FLAG_OLD_PORT_CMDS) {
+ err = mlx4_cmd_box(dev, 0, mailbox->dma, 0, 0, MLX4_CMD_QUERY_DEV_CAP,
+ MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+
+ if (err)
+ goto out;
+
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_VL_PORT_OFFSET);
+ port_cap->max_vl = field >> 4;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MTU_WIDTH_OFFSET);
+ port_cap->ib_mtu = field >> 4;
+ port_cap->max_port_width = field & 0xf;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_GID_OFFSET);
+ port_cap->max_gids = 1 << (field & 0xf);
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_PKEY_OFFSET);
+ port_cap->max_pkeys = 1 << (field & 0xf);
+ } else {
+#define QUERY_PORT_SUPPORTED_TYPE_OFFSET 0x00
+#define QUERY_PORT_MTU_OFFSET 0x01
+#define QUERY_PORT_ETH_MTU_OFFSET 0x02
+#define QUERY_PORT_WIDTH_OFFSET 0x06
+#define QUERY_PORT_MAX_GID_PKEY_OFFSET 0x07
+#define QUERY_PORT_MAX_MACVLAN_OFFSET 0x0a
+#define QUERY_PORT_MAX_VL_OFFSET 0x0b
+#define QUERY_PORT_MAC_OFFSET 0x10
+#define QUERY_PORT_TRANS_VENDOR_OFFSET 0x18
+#define QUERY_PORT_WAVELENGTH_OFFSET 0x1c
+#define QUERY_PORT_TRANS_CODE_OFFSET 0x20
+
+ err = mlx4_cmd_box(dev, 0, mailbox->dma, port, 0, MLX4_CMD_QUERY_PORT,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_NATIVE);
+ if (err)
+ goto out;
+
+ MLX4_GET(field, outbox, QUERY_PORT_SUPPORTED_TYPE_OFFSET);
+ port_cap->supported_port_types = field & 3;
+ port_cap->suggested_type = (field >> 3) & 1;
+ port_cap->default_sense = (field >> 4) & 1;
+ port_cap->dmfs_optimized_state = (field >> 5) & 1;
+ MLX4_GET(field, outbox, QUERY_PORT_MTU_OFFSET);
+ port_cap->ib_mtu = field & 0xf;
+ MLX4_GET(field, outbox, QUERY_PORT_WIDTH_OFFSET);
+ port_cap->max_port_width = field & 0xf;
+ MLX4_GET(field, outbox, QUERY_PORT_MAX_GID_PKEY_OFFSET);
+ port_cap->max_gids = 1 << (field >> 4);
+ port_cap->max_pkeys = 1 << (field & 0xf);
+ MLX4_GET(field, outbox, QUERY_PORT_MAX_VL_OFFSET);
+ port_cap->max_vl = field & 0xf;
+ MLX4_GET(field, outbox, QUERY_PORT_MAX_MACVLAN_OFFSET);
+ port_cap->log_max_macs = field & 0xf;
+ port_cap->log_max_vlans = field >> 4;
+ MLX4_GET(port_cap->eth_mtu, outbox, QUERY_PORT_ETH_MTU_OFFSET);
+ MLX4_GET(port_cap->def_mac, outbox, QUERY_PORT_MAC_OFFSET);
+ MLX4_GET(field32, outbox, QUERY_PORT_TRANS_VENDOR_OFFSET);
+ port_cap->trans_type = field32 >> 24;
+ port_cap->vendor_oui = field32 & 0xffffff;
+ MLX4_GET(port_cap->wavelength, outbox, QUERY_PORT_WAVELENGTH_OFFSET);
+ MLX4_GET(port_cap->trans_code, outbox, QUERY_PORT_TRANS_CODE_OFFSET);
+ }
+
+out:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+
+#define DEV_CAP_EXT_2_FLAG_PFC_COUNTERS (1 << 28)
+#define DEV_CAP_EXT_2_FLAG_VLAN_CONTROL (1 << 26)
+#define DEV_CAP_EXT_2_FLAG_80_VFS (1 << 21)
+#define DEV_CAP_EXT_2_FLAG_FSM (1 << 20)
+
+static void slave_disable_roce_caps(void *buf, bool dis_roce_1,
+ bool dis_roce_1_5, bool dis_roce_2,
+ bool dis_roce_1_plus_2)
+{
+ u32 flags;
+
+ if (dis_roce_1) {
+ MLX4_GET(flags, buf, QUERY_DEV_CAP_FLAGS_OFFSET);
+ flags &= ~(1UL << 30);
+ MLX4_PUT(buf, flags, QUERY_DEV_CAP_FLAGS_OFFSET);
+ }
+ if (dis_roce_1_5) {
+ MLX4_GET(flags, buf, QUERY_DEV_CAP_EXT_FLAGS_OFFSET);
+ flags &= ~(1UL << 31);
+ MLX4_PUT(buf, flags, QUERY_DEV_CAP_EXT_FLAGS_OFFSET);
+ }
+ if (dis_roce_2) {
+ MLX4_GET(flags, buf, QUERY_DEV_CAP_EXT_2_FLAGS_OFFSET);
+ flags &= ~(1UL << 24);
+ MLX4_PUT(buf, flags, QUERY_DEV_CAP_EXT_2_FLAGS_OFFSET);
+ }
+ if (dis_roce_1_plus_2) {
+ MLX4_GET(flags, buf, QUERY_DEV_CAP_BMME_FLAGS_OFFSET);
+ flags &= ~(MLX4_FLAG_ROCE_V1_V2);
+ MLX4_PUT(buf, flags, QUERY_DEV_CAP_BMME_FLAGS_OFFSET);
+ }
+}
+
+int mlx4_QUERY_DEV_CAP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ u64 flags;
+ int err = 0;
+ u8 field;
+ u32 bmme_flags, field32;
+ int real_port;
+ int slave_port;
+ int first_port;
+ struct mlx4_active_ports actv_ports;
+
+ err = mlx4_cmd_box(dev, 0, outbox->dma, 0, 0, MLX4_CMD_QUERY_DEV_CAP,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+ if (err)
+ return err;
+
+ /* Unconditionally add the port management change event capability
+ * and disable memory window type 1 for slaves
+ */
+ MLX4_GET(flags, outbox->buf, QUERY_DEV_CAP_EXT_FLAGS_OFFSET);
+ flags |= MLX4_DEV_CAP_FLAG_PORT_MNG_CHG_EV;
+ flags &= ~MLX4_DEV_CAP_FLAG_MEM_WINDOW;
+ actv_ports = mlx4_get_active_ports(dev, slave);
+ first_port = find_first_bit(actv_ports.ports, dev->caps.num_ports);
+ for (slave_port = 0, real_port = first_port;
+ real_port < first_port +
+ bitmap_weight(actv_ports.ports, dev->caps.num_ports);
+ ++real_port, ++slave_port) {
+ if (flags & (MLX4_DEV_CAP_FLAG_WOL_PORT1 << real_port))
+ flags |= MLX4_DEV_CAP_FLAG_WOL_PORT1 << slave_port;
+ else
+ flags &= ~(MLX4_DEV_CAP_FLAG_WOL_PORT1 << slave_port);
+ }
+ for (; slave_port < dev->caps.num_ports; ++slave_port)
+ flags &= ~(MLX4_DEV_CAP_FLAG_WOL_PORT1 << slave_port);
+
+ /* Not exposing RSS IP fragments to guests */
+ flags &= ~MLX4_DEV_CAP_FLAG_RSS_IP_FRAG;
+ MLX4_PUT(outbox->buf, flags, QUERY_DEV_CAP_EXT_FLAGS_OFFSET);
+
+ MLX4_GET(field, outbox->buf, QUERY_DEV_CAP_VL_PORT_OFFSET);
+ field &= ~0x0F;
+ field |= bitmap_weight(actv_ports.ports, dev->caps.num_ports) & 0x0F;
+ MLX4_PUT(outbox->buf, field, QUERY_DEV_CAP_VL_PORT_OFFSET);
+
+ /* For guests, disable timestamp */
+ MLX4_GET(field, outbox->buf, QUERY_DEV_CAP_CQ_TS_SUPPORT_OFFSET);
+ field &= 0x7f;
+ MLX4_PUT(outbox->buf, field, QUERY_DEV_CAP_CQ_TS_SUPPORT_OFFSET);
+
+ /* For guests, disable vxlan tunneling and QoS support */
+ MLX4_GET(field, outbox->buf, QUERY_DEV_CAP_VXLAN);
+ field &= 0xd7;
+ MLX4_PUT(outbox->buf, field, QUERY_DEV_CAP_VXLAN);
+
+ /* For guests report additional-mac query not available */
+ MLX4_GET(field, outbox->buf, QUERY_DEV_CAP_ADD_MAC);
+ field &= 0xfb;
+ MLX4_PUT(outbox->buf, field, QUERY_DEV_CAP_ADD_MAC);
+
+ /* For guests, disable port BEACON */
+ MLX4_GET(field, outbox->buf, QUERY_DEV_CAP_PORT_BEACON_OFFSET);
+ field &= 0x7f;
+ MLX4_PUT(outbox->buf, field, QUERY_DEV_CAP_PORT_BEACON_OFFSET);
+
+ /* For guests, report Blueflame disabled */
+ MLX4_GET(field, outbox->buf, QUERY_DEV_CAP_BF_OFFSET);
+ field &= 0x7f;
+ MLX4_PUT(outbox->buf, field, QUERY_DEV_CAP_BF_OFFSET);
+
+ /* For guests, disable mw type 2 and port remap */
+ MLX4_GET(bmme_flags, outbox->buf, QUERY_DEV_CAP_BMME_FLAGS_OFFSET);
+ bmme_flags &= ~MLX4_BMME_FLAG_TYPE_2_WIN;
+ bmme_flags &= ~MLX4_FLAG_PORT_REMAP;
+ MLX4_PUT(outbox->buf, bmme_flags, QUERY_DEV_CAP_BMME_FLAGS_OFFSET);
+
+ /* turn off device-managed steering capability if not enabled */
+ if (dev->caps.steering_mode != MLX4_STEERING_MODE_DEVICE_MANAGED) {
+ MLX4_GET(field, outbox->buf,
+ QUERY_DEV_CAP_FLOW_STEERING_RANGE_EN_OFFSET);
+ field &= 0x7f;
+ MLX4_PUT(outbox->buf, field,
+ QUERY_DEV_CAP_FLOW_STEERING_RANGE_EN_OFFSET);
+ }
+
+ /* turn off ipoib managed steering and sip check ignore for guests */
+ MLX4_GET(field, outbox->buf, QUERY_DEV_CAP_FLOW_STEERING_IPOIB_OFFSET);
+ field &= ~0x81;
+ MLX4_PUT(outbox->buf, field, QUERY_DEV_CAP_FLOW_STEERING_IPOIB_OFFSET);
+
+ /* turn off QoS per VF support for guests */
+ MLX4_GET(field, outbox->buf, QUERY_DEV_CAP_CQ_EQ_CACHE_LINE_STRIDE);
+ field &= 0xef;
+ MLX4_PUT(outbox->buf, field, QUERY_DEV_CAP_CQ_EQ_CACHE_LINE_STRIDE);
+
+ /* turn off host side virt features (VST, FSM, etc) for guests */
+ MLX4_GET(field32, outbox->buf, QUERY_DEV_CAP_EXT_2_FLAGS_OFFSET);
+ field32 &= ~(DEV_CAP_EXT_2_FLAG_VLAN_CONTROL | DEV_CAP_EXT_2_FLAG_80_VFS |
+ DEV_CAP_EXT_2_FLAG_FSM | DEV_CAP_EXT_2_FLAG_PFC_COUNTERS);
+ MLX4_PUT(outbox->buf, field32, QUERY_DEV_CAP_EXT_2_FLAGS_OFFSET);
+
+ /* turn off ignore FCS feature for guests */
+ MLX4_GET(field, outbox->buf, QUERY_DEV_CAP_CONFIG_DEV_OFFSET);
+ field &= 0xfb;
+ MLX4_PUT(outbox->buf, field, QUERY_DEV_CAP_CONFIG_DEV_OFFSET);
+
+ if (slave) {
+ switch (dev->caps.roce_mode) {
+ case MLX4_ROCE_MODE_1:
+ slave_disable_roce_caps(outbox->buf, false, true, true, true);
+ break;
+ case MLX4_ROCE_MODE_1_5:
+ slave_disable_roce_caps(outbox->buf, true, false, true, true);
+ break;
+ case MLX4_ROCE_MODE_2:
+ case MLX4_ROCE_MODE_1_5_PLUS_2:
+ slave_disable_roce_caps(outbox->buf, true, false, false, true);
+ break;
+ case MLX4_ROCE_MODE_1_PLUS_2:
+ slave_disable_roce_caps(outbox->buf, false, true, false, false);
+ break;
+ default:
+ break;
+ }
+ }
+
+ return 0;
+}
+
+int mlx4_QUERY_PORT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ u64 def_mac;
+ u8 port_type;
+ u16 short_field;
+ int err;
+ int admin_link_state;
+ int port = mlx4_slave_convert_port(dev, slave,
+ vhcr->in_modifier & 0xFF);
+
+#define MLX4_VF_PORT_NO_LINK_SENSE_MASK 0xE0
+#define MLX4_PORT_LINK_UP_MASK 0x80
+#define QUERY_PORT_CUR_MAX_PKEY_OFFSET 0x0c
+#define QUERY_PORT_CUR_MAX_GID_OFFSET 0x0e
+
+ if (port < 0)
+ return -EINVAL;
+
+ /* Protect against untrusted guests: enforce that this is the
+ * QUERY_PORT general query.
+ */
+ if (vhcr->op_modifier || vhcr->in_modifier & ~0xFF)
+ return -EINVAL;
+
+ vhcr->in_modifier = port;
+
+ err = mlx4_cmd_box(dev, 0, outbox->dma, vhcr->in_modifier, 0,
+ MLX4_CMD_QUERY_PORT, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_NATIVE);
+
+ if (!err && dev->caps.function != slave) {
+ def_mac = priv->mfunc.master.vf_oper[slave].vport[vhcr->in_modifier].state.mac;
+ MLX4_PUT(outbox->buf, def_mac, QUERY_PORT_MAC_OFFSET);
+
+ /* get port type - currently only eth is enabled */
+ MLX4_GET(port_type, outbox->buf,
+ QUERY_PORT_SUPPORTED_TYPE_OFFSET);
+
+ /* No link sensing allowed */
+ port_type &= MLX4_VF_PORT_NO_LINK_SENSE_MASK;
+ /* set port type to currently operating port type */
+ port_type |= (dev->caps.port_type[vhcr->in_modifier] & 0x3);
+
+ admin_link_state = priv->mfunc.master.vf_oper[slave].vport[vhcr->in_modifier].state.link_state;
+ if (IFLA_VF_LINK_STATE_ENABLE == admin_link_state)
+ port_type |= MLX4_PORT_LINK_UP_MASK;
+ else if (IFLA_VF_LINK_STATE_DISABLE == admin_link_state)
+ port_type &= ~MLX4_PORT_LINK_UP_MASK;
+
+ MLX4_PUT(outbox->buf, port_type,
+ QUERY_PORT_SUPPORTED_TYPE_OFFSET);
+
+ if (dev->caps.port_type[vhcr->in_modifier] == MLX4_PORT_TYPE_ETH)
+ short_field = mlx4_get_slave_num_gids(dev, slave, port);
+ else
+ short_field = 1; /* slave max gids */
+ MLX4_PUT(outbox->buf, short_field,
+ QUERY_PORT_CUR_MAX_GID_OFFSET);
+
+ short_field = dev->caps.pkey_table_len[vhcr->in_modifier];
+ MLX4_PUT(outbox->buf, short_field,
+ QUERY_PORT_CUR_MAX_PKEY_OFFSET);
+ }
+
+ return err;
+}
+
+int mlx4_get_slave_pkey_gid_tbl_len(struct mlx4_dev *dev, u8 port,
+ int *gid_tbl_len, int *pkey_tbl_len)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ u32 *outbox;
+ u16 field;
+ int err;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ err = mlx4_cmd_box(dev, 0, mailbox->dma, port, 0,
+ MLX4_CMD_QUERY_PORT, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_WRAPPED);
+ if (err)
+ goto out;
+
+ outbox = mailbox->buf;
+
+ MLX4_GET(field, outbox, QUERY_PORT_CUR_MAX_GID_OFFSET);
+ *gid_tbl_len = field;
+
+ MLX4_GET(field, outbox, QUERY_PORT_CUR_MAX_PKEY_OFFSET);
+ *pkey_tbl_len = field;
+
+out:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+EXPORT_SYMBOL(mlx4_get_slave_pkey_gid_tbl_len);
+
+int mlx4_map_cmd(struct mlx4_dev *dev, u16 op, struct mlx4_icm *icm, u64 virt)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_icm_iter iter;
+ __be64 *pages;
+ int lg;
+ int nent = 0;
+ int i;
+ int err = 0;
+ int ts = 0, tc = 0;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ pages = mailbox->buf;
+
+ for (mlx4_icm_first(icm, &iter);
+ !mlx4_icm_last(&iter);
+ mlx4_icm_next(&iter)) {
+ /*
+ * We have to pass pages that are aligned to their
+ * size, so find the least significant 1 in the
+ * address or size and use that as our log2 size.
+ */
+ lg = ffs(mlx4_icm_addr(&iter) | mlx4_icm_size(&iter)) - 1;
+ if (lg < MLX4_ICM_PAGE_SHIFT) {
+ mlx4_warn(dev, "Got FW area not aligned to %d (%llx/%lx)\n",
+ MLX4_ICM_PAGE_SIZE,
+ (unsigned long long) mlx4_icm_addr(&iter),
+ mlx4_icm_size(&iter));
+ err = -EINVAL;
+ goto out;
+ }
+
+ for (i = 0; i < mlx4_icm_size(&iter) >> lg; ++i) {
+ if (virt != -1) {
+ pages[nent * 2] = cpu_to_be64(virt);
+ virt += 1 << lg;
+ }
+
+ pages[nent * 2 + 1] =
+ cpu_to_be64((mlx4_icm_addr(&iter) + (i << lg)) |
+ (lg - MLX4_ICM_PAGE_SHIFT));
+ ts += 1 << (lg - 10);
+ ++tc;
+
+ if (++nent == MLX4_MAILBOX_SIZE / 16) {
+ err = mlx4_cmd(dev, mailbox->dma, nent, 0, op,
+ MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_NATIVE);
+ if (err)
+ goto out;
+ nent = 0;
+ }
+ }
+ }
+
+ if (nent)
+ err = mlx4_cmd(dev, mailbox->dma, nent, 0, op,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_NATIVE);
+ if (err)
+ goto out;
+
+ switch (op) {
+ case MLX4_CMD_MAP_FA:
+ mlx4_dbg(dev, "Mapped %d chunks/%d KB for FW\n", tc, ts);
+ break;
+ case MLX4_CMD_MAP_ICM_AUX:
+ mlx4_dbg(dev, "Mapped %d chunks/%d KB for ICM aux\n", tc, ts);
+ break;
+ case MLX4_CMD_MAP_ICM:
+ mlx4_dbg(dev, "Mapped %d chunks/%d KB at %llx for ICM\n",
+ tc, ts, (unsigned long long) virt - (ts << 10));
+ break;
+ }
+
+out:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+
+int mlx4_MAP_FA(struct mlx4_dev *dev, struct mlx4_icm *icm)
+{
+ return mlx4_map_cmd(dev, MLX4_CMD_MAP_FA, icm, -1);
+}
+
+int mlx4_UNMAP_FA(struct mlx4_dev *dev)
+{
+ return mlx4_cmd(dev, 0, 0, 0, MLX4_CMD_UNMAP_FA,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_NATIVE);
+}
+
+
+int mlx4_RUN_FW(struct mlx4_dev *dev)
+{
+ return mlx4_cmd(dev, 0, 0, 0, MLX4_CMD_RUN_FW,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+}
+
+int mlx4_QUERY_FW(struct mlx4_dev *dev)
+{
+ struct mlx4_fw *fw = &mlx4_priv(dev)->fw;
+ struct mlx4_cmd *cmd = &mlx4_priv(dev)->cmd;
+ struct mlx4_cmd_mailbox *mailbox;
+ u32 *outbox;
+ int err = 0;
+ u64 fw_ver;
+ u16 cmd_if_rev;
+ u8 lg;
+
+#define QUERY_FW_OUT_SIZE 0x100
+#define QUERY_FW_VER_OFFSET 0x00
+#define QUERY_FW_PPF_ID 0x09
+#define QUERY_FW_CMD_IF_REV_OFFSET 0x0a
+#define QUERY_FW_MAX_CMD_OFFSET 0x0f
+#define QUERY_FW_ERR_START_OFFSET 0x30
+#define QUERY_FW_ERR_SIZE_OFFSET 0x38
+#define QUERY_FW_ERR_BAR_OFFSET 0x3c
+
+#define QUERY_FW_SIZE_OFFSET 0x00
+#define QUERY_FW_CLR_INT_BASE_OFFSET 0x20
+#define QUERY_FW_CLR_INT_BAR_OFFSET 0x28
+
+#define QUERY_FW_COMM_BASE_OFFSET 0x40
+#define QUERY_FW_COMM_BAR_OFFSET 0x48
+
+#define QUERY_FW_CLOCK_OFFSET 0x50
+#define QUERY_FW_CLOCK_BAR 0x58
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ outbox = mailbox->buf;
+
+ err = mlx4_cmd_box(dev, 0, mailbox->dma, 0, 0, MLX4_CMD_QUERY_FW,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+ if (err)
+ goto out;
+
+ MLX4_GET(fw_ver, outbox, QUERY_FW_VER_OFFSET);
+ /*
+ * FW subminor version is at more significant bits than minor
+ * version, so swap here.
+ */
+ dev->caps.fw_ver = (fw_ver & 0xffff00000000ull) |
+ ((fw_ver & 0xffff0000ull) >> 16) |
+ ((fw_ver & 0x0000ffffull) << 16);
+
+ MLX4_GET(lg, outbox, QUERY_FW_PPF_ID);
+ dev->caps.function = lg;
+
+ if (mlx4_is_slave(dev))
+ goto out;
+
+
+ MLX4_GET(cmd_if_rev, outbox, QUERY_FW_CMD_IF_REV_OFFSET);
+ if (cmd_if_rev < MLX4_COMMAND_INTERFACE_MIN_REV ||
+ cmd_if_rev > MLX4_COMMAND_INTERFACE_MAX_REV) {
+ mlx4_err(dev, "Installed FW has unsupported command interface revision %d\n",
+ cmd_if_rev);
+ mlx4_err(dev, "(Installed FW version is %d.%d.%03d)\n",
+ (int) (dev->caps.fw_ver >> 32),
+ (int) (dev->caps.fw_ver >> 16) & 0xffff,
+ (int) dev->caps.fw_ver & 0xffff);
+ mlx4_err(dev, "This driver version supports only revisions %d to %d\n",
+ MLX4_COMMAND_INTERFACE_MIN_REV, MLX4_COMMAND_INTERFACE_MAX_REV);
+ err = -ENODEV;
+ goto out;
+ }
+
+ if (cmd_if_rev < MLX4_COMMAND_INTERFACE_NEW_PORT_CMDS)
+ dev->flags |= MLX4_FLAG_OLD_PORT_CMDS;
+
+ MLX4_GET(lg, outbox, QUERY_FW_MAX_CMD_OFFSET);
+ cmd->max_cmds = 1 << lg;
+
+ mlx4_dbg(dev, "FW version %d.%d.%03d (cmd intf rev %d), max commands %d\n",
+ (int) (dev->caps.fw_ver >> 32),
+ (int) (dev->caps.fw_ver >> 16) & 0xffff,
+ (int) dev->caps.fw_ver & 0xffff,
+ cmd_if_rev, cmd->max_cmds);
+
+ MLX4_GET(fw->catas_offset, outbox, QUERY_FW_ERR_START_OFFSET);
+ MLX4_GET(fw->catas_size, outbox, QUERY_FW_ERR_SIZE_OFFSET);
+ MLX4_GET(fw->catas_bar, outbox, QUERY_FW_ERR_BAR_OFFSET);
+ fw->catas_bar = (fw->catas_bar >> 6) * 2;
+
+ mlx4_dbg(dev, "Catastrophic error buffer at 0x%llx, size 0x%x, BAR %d\n",
+ (unsigned long long) fw->catas_offset, fw->catas_size, fw->catas_bar);
+
+ MLX4_GET(fw->fw_pages, outbox, QUERY_FW_SIZE_OFFSET);
+ MLX4_GET(fw->clr_int_base, outbox, QUERY_FW_CLR_INT_BASE_OFFSET);
+ MLX4_GET(fw->clr_int_bar, outbox, QUERY_FW_CLR_INT_BAR_OFFSET);
+ fw->clr_int_bar = (fw->clr_int_bar >> 6) * 2;
+
+ MLX4_GET(fw->comm_base, outbox, QUERY_FW_COMM_BASE_OFFSET);
+ MLX4_GET(fw->comm_bar, outbox, QUERY_FW_COMM_BAR_OFFSET);
+ fw->comm_bar = (fw->comm_bar >> 6) * 2;
+ mlx4_dbg(dev, "Communication vector bar:%d offset:0x%llx\n",
+ fw->comm_bar, fw->comm_base);
+ mlx4_dbg(dev, "FW size %d KB\n", fw->fw_pages >> 2);
+
+ MLX4_GET(fw->clock_offset, outbox, QUERY_FW_CLOCK_OFFSET);
+ MLX4_GET(fw->clock_bar, outbox, QUERY_FW_CLOCK_BAR);
+ fw->clock_bar = (fw->clock_bar >> 6) * 2;
+ mlx4_dbg(dev, "Internal clock bar:%d offset:0x%llx\n",
+ fw->clock_bar, fw->clock_offset);
+
+ /*
+ * Round up number of system pages needed in case
+ * MLX4_ICM_PAGE_SIZE < PAGE_SIZE.
+ */
+ fw->fw_pages =
+ ALIGN(fw->fw_pages, PAGE_SIZE / MLX4_ICM_PAGE_SIZE) >>
+ (PAGE_SHIFT - MLX4_ICM_PAGE_SHIFT);
+
+ mlx4_dbg(dev, "Clear int @ %llx, BAR %d\n",
+ (unsigned long long) fw->clr_int_base, fw->clr_int_bar);
+
+out:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+
+int mlx4_QUERY_FW_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ u8 *outbuf;
+ int err;
+
+ outbuf = outbox->buf;
+ err = mlx4_cmd_box(dev, 0, outbox->dma, 0, 0, MLX4_CMD_QUERY_FW,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+ if (err)
+ return err;
+
+ /* for slaves, set pci PPF ID to invalid and zero out everything
+ * else except FW version */
+ outbuf[0] = outbuf[1] = 0;
+ memset(&outbuf[8], 0, QUERY_FW_OUT_SIZE - 8);
+ outbuf[QUERY_FW_PPF_ID] = MLX4_INVALID_SLAVE_ID;
+
+ return 0;
+}
+
+static void get_board_id(void *vsd, char *board_id)
+{
+ int i;
+
+#define VSD_OFFSET_SIG1 0x00
+#define VSD_OFFSET_SIG2 0xde
+#define VSD_OFFSET_MLX_BOARD_ID 0xd0
+#define VSD_OFFSET_TS_BOARD_ID 0x20
+
+#define VSD_SIGNATURE_TOPSPIN 0x5ad
+
+ memset(board_id, 0, MLX4_BOARD_ID_LEN);
+
+ if (be16_to_cpup(vsd + VSD_OFFSET_SIG1) == VSD_SIGNATURE_TOPSPIN &&
+ be16_to_cpup(vsd + VSD_OFFSET_SIG2) == VSD_SIGNATURE_TOPSPIN) {
+ strlcpy(board_id, vsd + VSD_OFFSET_TS_BOARD_ID, MLX4_BOARD_ID_LEN);
+ } else {
+ /*
+ * The board ID is a string but the firmware byte
+ * swaps each 4-byte word before passing it back to
+ * us. Therefore we need to swab it before printing.
+ */
+ for (i = 0; i < 4; ++i)
+ ((u32 *) board_id)[i] =
+ swab32(*(u32 *) (vsd + VSD_OFFSET_MLX_BOARD_ID + i * 4));
+ }
+}
+
+int mlx4_QUERY_ADAPTER(struct mlx4_dev *dev, struct mlx4_adapter *adapter)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ u32 *outbox;
+ int err;
+
+#define QUERY_ADAPTER_OUT_SIZE 0x100
+#define QUERY_ADAPTER_INTA_PIN_OFFSET 0x10
+#define QUERY_ADAPTER_VSD_OFFSET 0x20
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ outbox = mailbox->buf;
+
+ err = mlx4_cmd_box(dev, 0, mailbox->dma, 0, 0, MLX4_CMD_QUERY_ADAPTER,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+ if (err)
+ goto out;
+
+ MLX4_GET(adapter->inta_pin, outbox, QUERY_ADAPTER_INTA_PIN_OFFSET);
+
+ get_board_id(outbox + QUERY_ADAPTER_VSD_OFFSET / 4,
+ adapter->board_id);
+
+out:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+
+int mlx4_INIT_HCA(struct mlx4_dev *dev, struct mlx4_init_hca_param *param)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ __be32 *inbox;
+ int err;
+ static const u8 a0_dmfs_hw_steering[] = {
+ [MLX4_STEERING_DMFS_A0_DEFAULT] = 0,
+ [MLX4_STEERING_DMFS_A0_DYNAMIC] = 1,
+ [MLX4_STEERING_DMFS_A0_STATIC] = 2,
+ [MLX4_STEERING_DMFS_A0_DISABLE] = 3
+ };
+ u8 field_ipoib, field_eth;
+
+#define INIT_HCA_IN_SIZE 0x200
+#define INIT_HCA_VERSION_OFFSET 0x000
+#define INIT_HCA_VERSION 2
+#define INIT_HCA_VXLAN_OFFSET 0x0c
+#define INIT_HCA_CACHELINE_SZ_OFFSET 0x0e
+#define INIT_HCA_FLAGS_OFFSET 0x014
+#define INIT_HCA_RECOVERABLE_ERROR_EVENT_OFFSET 0x018
+#define INIT_HCA_QPC_OFFSET 0x020
+#define INIT_HCA_QPC_BASE_OFFSET (INIT_HCA_QPC_OFFSET + 0x10)
+#define INIT_HCA_LOG_QP_OFFSET (INIT_HCA_QPC_OFFSET + 0x17)
+#define INIT_HCA_SRQC_BASE_OFFSET (INIT_HCA_QPC_OFFSET + 0x28)
+#define INIT_HCA_LOG_SRQ_OFFSET (INIT_HCA_QPC_OFFSET + 0x2f)
+#define INIT_HCA_CQC_BASE_OFFSET (INIT_HCA_QPC_OFFSET + 0x30)
+#define INIT_HCA_LOG_CQ_OFFSET (INIT_HCA_QPC_OFFSET + 0x37)
+#define INIT_HCA_EQE_CQE_OFFSETS (INIT_HCA_QPC_OFFSET + 0x38)
+#define INIT_HCA_EQE_CQE_STRIDE_OFFSET (INIT_HCA_QPC_OFFSET + 0x3b)
+#define INIT_HCA_ALTC_BASE_OFFSET (INIT_HCA_QPC_OFFSET + 0x40)
+#define INIT_HCA_AUXC_BASE_OFFSET (INIT_HCA_QPC_OFFSET + 0x50)
+#define INIT_HCA_EQC_BASE_OFFSET (INIT_HCA_QPC_OFFSET + 0x60)
+#define INIT_HCA_LOG_EQ_OFFSET (INIT_HCA_QPC_OFFSET + 0x67)
+#define INIT_HCA_NUM_SYS_EQS_OFFSET (INIT_HCA_QPC_OFFSET + 0x6a)
+#define INIT_HCA_RDMARC_BASE_OFFSET (INIT_HCA_QPC_OFFSET + 0x70)
+#define INIT_HCA_LOG_RD_OFFSET (INIT_HCA_QPC_OFFSET + 0x77)
+#define INIT_HCA_MCAST_OFFSET 0x0c0
+#define INIT_HCA_MC_BASE_OFFSET (INIT_HCA_MCAST_OFFSET + 0x00)
+#define INIT_HCA_LOG_MC_ENTRY_SZ_OFFSET (INIT_HCA_MCAST_OFFSET + 0x12)
+#define INIT_HCA_LOG_MC_HASH_SZ_OFFSET (INIT_HCA_MCAST_OFFSET + 0x16)
+#define INIT_HCA_UC_STEERING_OFFSET (INIT_HCA_MCAST_OFFSET + 0x18)
+#define INIT_HCA_LOG_MC_TABLE_SZ_OFFSET (INIT_HCA_MCAST_OFFSET + 0x1b)
+#define INIT_HCA_DEVICE_MANAGED_FLOW_STEERING_EN 0x6
+#define INIT_HCA_FS_PARAM_OFFSET 0x1d0
+#define INIT_HCA_FS_BASE_OFFSET (INIT_HCA_FS_PARAM_OFFSET + 0x00)
+#define INIT_HCA_FS_LOG_ENTRY_SZ_OFFSET (INIT_HCA_FS_PARAM_OFFSET + 0x12)
+#define INIT_HCA_FS_A0_OFFSET (INIT_HCA_FS_PARAM_OFFSET + 0x18)
+#define INIT_HCA_FS_LOG_TABLE_SZ_OFFSET (INIT_HCA_FS_PARAM_OFFSET + 0x1b)
+#define INIT_HCA_FS_ETH_BITS_OFFSET (INIT_HCA_FS_PARAM_OFFSET + 0x21)
+#define INIT_HCA_FS_ETH_NUM_ADDRS_OFFSET (INIT_HCA_FS_PARAM_OFFSET + 0x22)
+#define INIT_HCA_FS_IB_BITS_OFFSET (INIT_HCA_FS_PARAM_OFFSET + 0x25)
+#define INIT_HCA_FS_IB_NUM_ADDRS_OFFSET (INIT_HCA_FS_PARAM_OFFSET + 0x26)
+#define INIT_HCA_TPT_OFFSET 0x0f0
+#define INIT_HCA_DMPT_BASE_OFFSET (INIT_HCA_TPT_OFFSET + 0x00)
+#define INIT_HCA_TPT_MW_OFFSET (INIT_HCA_TPT_OFFSET + 0x08)
+#define INIT_HCA_LOG_MPT_SZ_OFFSET (INIT_HCA_TPT_OFFSET + 0x0b)
+#define INIT_HCA_MTT_BASE_OFFSET (INIT_HCA_TPT_OFFSET + 0x10)
+#define INIT_HCA_CMPT_BASE_OFFSET (INIT_HCA_TPT_OFFSET + 0x18)
+#define INIT_HCA_UAR_OFFSET 0x120
+#define INIT_HCA_LOG_UAR_SZ_OFFSET (INIT_HCA_UAR_OFFSET + 0x0a)
+#define INIT_HCA_UAR_PAGE_SZ_OFFSET (INIT_HCA_UAR_OFFSET + 0x0b)
+#define MLX4_FS_UDP_UC_EN (1 << 1)
+#define MLX4_FS_TCP_UC_EN (1 << 2)
+#define MLX4_FS_IP_SIP_DISABLE (1 << 3)
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ inbox = mailbox->buf;
+
+ *((u8 *) mailbox->buf + INIT_HCA_VERSION_OFFSET) = INIT_HCA_VERSION;
+
+ *((u8 *) mailbox->buf + INIT_HCA_CACHELINE_SZ_OFFSET) =
+ ((ilog2(cache_line_size()) - 4) << 5) | (1 << 4);
+
+#if defined(__LITTLE_ENDIAN)
+ *(inbox + INIT_HCA_FLAGS_OFFSET / 4) &= ~cpu_to_be32(1 << 1);
+#elif defined(__BIG_ENDIAN)
+ *(inbox + INIT_HCA_FLAGS_OFFSET / 4) |= cpu_to_be32(1 << 1);
+#else
+#error Host endianness not defined
+#endif
+ /* Check port for UD address vector: */
+ *(inbox + INIT_HCA_FLAGS_OFFSET / 4) |= cpu_to_be32(1);
+
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ /* Set wqe_format to be 1 */
+ if (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_WQE_FORMAT) {
+ *(inbox + INIT_HCA_FLAGS_OFFSET / 4) |= cpu_to_be32(1 << 8);
+ dev->caps.userspace_caps |= MLX4_USER_DEV_CAP_WQE_FORMAT;
+ } else {
+ mlx4_err(dev, "INIT_HCA failed: WQE_FORMAT 1 not supported by FW\n");
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return -ENOSYS;
+ }
+#endif
+ /* Enable IPoIB checksumming if we can: */
+ if (dev->caps.flags & MLX4_DEV_CAP_FLAG_IPOIB_CSUM)
+ *(inbox + INIT_HCA_FLAGS_OFFSET / 4) |= cpu_to_be32(1 << 3);
+
+ /* Enable QoS support if module parameter set */
+ if (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ETS_CFG && enable_qos)
+ *(inbox + INIT_HCA_FLAGS_OFFSET / 4) |= cpu_to_be32(1 << 2);
+
+ /* Enable fast drop performance optimization */
+ if (dev->caps.fast_drop)
+ *(inbox + INIT_HCA_FLAGS_OFFSET / 4) |= cpu_to_be32(1 << 7);
+
+ /* enable counters */
+ if (dev->caps.flags & MLX4_DEV_CAP_FLAG_COUNTERS)
+ *(inbox + INIT_HCA_FLAGS_OFFSET / 4) |= cpu_to_be32(1 << 4);
+
+ /* Enable RSS spread to fragmented IP packets when supported */
+ if (dev->caps.flags & MLX4_DEV_CAP_FLAG_RSS_IP_FRAG)
+ *(inbox + INIT_HCA_FLAGS_OFFSET / 4) |= cpu_to_be32(1 << 13);
+
+ /* CX3 is capable of extending CQEs/EQEs from 32 to 64 bytes */
+ if (dev->caps.flags & MLX4_DEV_CAP_FLAG_64B_EQE) {
+ *(inbox + INIT_HCA_EQE_CQE_OFFSETS / 4) |= cpu_to_be32(1 << 29);
+ dev->caps.eqe_size = 64;
+ dev->caps.eqe_factor = 1;
+ } else {
+ dev->caps.eqe_size = 32;
+ dev->caps.eqe_factor = 0;
+ }
+
+ if (dev->caps.flags & MLX4_DEV_CAP_FLAG_64B_CQE) {
+ *(inbox + INIT_HCA_EQE_CQE_OFFSETS / 4) |= cpu_to_be32(1 << 30);
+ dev->caps.cqe_size = 64;
+ dev->caps.userspace_caps |= MLX4_USER_DEV_CAP_LARGE_CQE;
+ } else {
+ dev->caps.cqe_size = 32;
+ }
+
+ /* CX3 is capable of extending CQEs/EQEs to strides larger than 64B */
+ if ((dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_EQE_STRIDE) &&
+ (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_CQE_STRIDE)) {
+ dev->caps.eqe_size = cache_line_size();
+ dev->caps.cqe_size = cache_line_size();
+ dev->caps.eqe_factor = 0;
+ MLX4_PUT(inbox, (u8)((ilog2(dev->caps.eqe_size) - 5) << 4 |
+ (ilog2(dev->caps.eqe_size) - 5)),
+ INIT_HCA_EQE_CQE_STRIDE_OFFSET);
+
+ /* User still needs to know to support CQE > 32B */
+ dev->caps.userspace_caps |= MLX4_USER_DEV_CAP_LARGE_CQE;
+ }
+
+ if (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_RECOVERABLE_ERROR_EVENT)
+ *(inbox + INIT_HCA_RECOVERABLE_ERROR_EVENT_OFFSET / 4) |= cpu_to_be32(1 << 31);
+
+ if (ingress_parser_mode == MLX4_INGRESS_PARSER_MODE_NON_L4_CSUM_OFFLOAD) {
+ if (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_MODIFY_PARSER) {
+ *(inbox + INIT_HCA_RECOVERABLE_ERROR_EVENT_OFFSET / 4) |= cpu_to_be32(1 << 26);
+ *(inbox + INIT_HCA_RECOVERABLE_ERROR_EVENT_OFFSET / 4) |= cpu_to_be32(1 << 28);
+ } else {
+ mlx4_warn(dev, "Device does not support change of ingress parser\n");
+ }
+ }
+
+ /* QPC/EEC/CQC/EQC/RDMARC attributes */
+
+ MLX4_PUT(inbox, param->qpc_base, INIT_HCA_QPC_BASE_OFFSET);
+ MLX4_PUT(inbox, param->log_num_qps, INIT_HCA_LOG_QP_OFFSET);
+ MLX4_PUT(inbox, param->srqc_base, INIT_HCA_SRQC_BASE_OFFSET);
+ MLX4_PUT(inbox, param->log_num_srqs, INIT_HCA_LOG_SRQ_OFFSET);
+ MLX4_PUT(inbox, param->cqc_base, INIT_HCA_CQC_BASE_OFFSET);
+ MLX4_PUT(inbox, param->log_num_cqs, INIT_HCA_LOG_CQ_OFFSET);
+ MLX4_PUT(inbox, param->altc_base, INIT_HCA_ALTC_BASE_OFFSET);
+ MLX4_PUT(inbox, param->auxc_base, INIT_HCA_AUXC_BASE_OFFSET);
+ MLX4_PUT(inbox, param->eqc_base, INIT_HCA_EQC_BASE_OFFSET);
+ MLX4_PUT(inbox, param->log_num_eqs, INIT_HCA_LOG_EQ_OFFSET);
+ MLX4_PUT(inbox, param->num_sys_eqs, INIT_HCA_NUM_SYS_EQS_OFFSET);
+ MLX4_PUT(inbox, param->rdmarc_base, INIT_HCA_RDMARC_BASE_OFFSET);
+ MLX4_PUT(inbox, param->log_rd_per_qp, INIT_HCA_LOG_RD_OFFSET);
+
+ /* steering attributes */
+ if (dev->caps.steering_mode ==
+ MLX4_STEERING_MODE_DEVICE_MANAGED) {
+ *(inbox + INIT_HCA_FLAGS_OFFSET / 4) |=
+ cpu_to_be32(1 <<
+ INIT_HCA_DEVICE_MANAGED_FLOW_STEERING_EN);
+
+ MLX4_PUT(inbox, param->mc_base, INIT_HCA_FS_BASE_OFFSET);
+ MLX4_PUT(inbox, param->log_mc_entry_sz,
+ INIT_HCA_FS_LOG_ENTRY_SZ_OFFSET);
+ MLX4_PUT(inbox, param->log_mc_table_sz,
+ INIT_HCA_FS_LOG_TABLE_SZ_OFFSET);
+ field_ipoib = MLX4_FS_UDP_UC_EN | MLX4_FS_TCP_UC_EN;
+ field_eth = field_ipoib;
+ if (dev->caps.steering_attr & MLX4_STEERING_ATTR_ETH_IGNORE_SIP)
+ field_eth |= MLX4_FS_IP_SIP_DISABLE;
+ if (dev->caps.steering_attr & MLX4_STEERING_ATTR_IB_IGNORE_SIP)
+ field_ipoib |= MLX4_FS_IP_SIP_DISABLE;
+ /* Enable Ethernet flow steering
+ * with udp unicast, tcp unicast and disable sip check
+ */
+ if (dev->caps.steering_attr & MLX4_STEERING_ATTR_DMFS_EN) {
+ if (dev->caps.dmfs_high_steer_mode !=
+ MLX4_STEERING_DMFS_A0_STATIC)
+ MLX4_PUT(inbox, field_eth, INIT_HCA_FS_ETH_BITS_OFFSET);
+
+ MLX4_PUT(inbox, (u16)MLX4_FS_NUM_OF_L2_ADDR,
+ INIT_HCA_FS_ETH_NUM_ADDRS_OFFSET);
+ }
+ /* Enable IPoIB flow steering
+ * with udp unicast, tcp unicast and disable sip check
+ */
+ if (dev->caps.steering_attr & MLX4_STEERING_ATTR_DMFS_IPOIB) {
+ MLX4_PUT(inbox, field_ipoib, INIT_HCA_FS_IB_BITS_OFFSET);
+ MLX4_PUT(inbox, (u16)MLX4_FS_NUM_OF_L2_ADDR,
+ INIT_HCA_FS_IB_NUM_ADDRS_OFFSET);
+ }
+
+ if (dev->caps.dmfs_high_steer_mode !=
+ MLX4_STEERING_DMFS_A0_NOT_SUPPORTED)
+ MLX4_PUT(inbox,
+ ((u8)(a0_dmfs_hw_steering[dev->caps.dmfs_high_steer_mode]
+ << 6)),
+ INIT_HCA_FS_A0_OFFSET);
+ } else {
+ MLX4_PUT(inbox, param->mc_base, INIT_HCA_MC_BASE_OFFSET);
+ MLX4_PUT(inbox, param->log_mc_entry_sz,
+ INIT_HCA_LOG_MC_ENTRY_SZ_OFFSET);
+ MLX4_PUT(inbox, param->log_mc_hash_sz,
+ INIT_HCA_LOG_MC_HASH_SZ_OFFSET);
+ MLX4_PUT(inbox, param->log_mc_table_sz,
+ INIT_HCA_LOG_MC_TABLE_SZ_OFFSET);
+ if (dev->caps.steering_mode == MLX4_STEERING_MODE_B0)
+ MLX4_PUT(inbox, (u8) (1 << 3),
+ INIT_HCA_UC_STEERING_OFFSET);
+ }
+
+ /* TPT attributes */
+
+ MLX4_PUT(inbox, param->dmpt_base, INIT_HCA_DMPT_BASE_OFFSET);
+ MLX4_PUT(inbox, param->mw_enabled, INIT_HCA_TPT_MW_OFFSET);
+ MLX4_PUT(inbox, param->log_mpt_sz, INIT_HCA_LOG_MPT_SZ_OFFSET);
+ MLX4_PUT(inbox, param->mtt_base, INIT_HCA_MTT_BASE_OFFSET);
+ MLX4_PUT(inbox, param->cmpt_base, INIT_HCA_CMPT_BASE_OFFSET);
+
+ /* UAR attributes */
+
+ MLX4_PUT(inbox, param->uar_page_sz, INIT_HCA_UAR_PAGE_SZ_OFFSET);
+ MLX4_PUT(inbox, param->log_uar_sz, INIT_HCA_LOG_UAR_SZ_OFFSET);
+
+ /* set parser VXLAN attributes */
+ if (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_VXLAN_OFFLOADS) {
+ u8 parser_params = 0;
+ MLX4_PUT(inbox, parser_params, INIT_HCA_VXLAN_OFFSET);
+ }
+
+ err = mlx4_cmd(dev, mailbox->dma, 0, 0, MLX4_CMD_INIT_HCA,
+ MLX4_CMD_TIME_CLASS_C, MLX4_CMD_NATIVE);
+
+ if (err)
+ mlx4_err(dev, "INIT_HCA returns %d\n", err);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+
+int mlx4_QUERY_HCA(struct mlx4_dev *dev,
+ struct mlx4_init_hca_param *param)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ __be32 *outbox;
+ u32 dword_field;
+ int err;
+ u8 byte_field;
+ static const u8 a0_dmfs_query_hw_steering[] = {
+ [0] = MLX4_STEERING_DMFS_A0_DEFAULT,
+ [1] = MLX4_STEERING_DMFS_A0_DYNAMIC,
+ [2] = MLX4_STEERING_DMFS_A0_STATIC,
+ [3] = MLX4_STEERING_DMFS_A0_DISABLE
+ };
+
+#define QUERY_HCA_GLOBAL_CAPS_OFFSET 0x04
+#define QUERY_HCA_CORE_CLOCK_OFFSET 0x0c
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ outbox = mailbox->buf;
+
+ err = mlx4_cmd_box(dev, 0, mailbox->dma, 0, 0,
+ MLX4_CMD_QUERY_HCA,
+ MLX4_CMD_TIME_CLASS_B,
+ !mlx4_is_slave(dev));
+ if (err)
+ goto out;
+
+ MLX4_GET(param->global_caps, outbox, QUERY_HCA_GLOBAL_CAPS_OFFSET);
+ MLX4_GET(param->hca_core_clock, outbox, QUERY_HCA_CORE_CLOCK_OFFSET);
+
+ /* QPC/EEC/CQC/EQC/RDMARC attributes */
+
+ MLX4_GET(param->qpc_base, outbox, INIT_HCA_QPC_BASE_OFFSET);
+ MLX4_GET(param->log_num_qps, outbox, INIT_HCA_LOG_QP_OFFSET);
+ MLX4_GET(param->srqc_base, outbox, INIT_HCA_SRQC_BASE_OFFSET);
+ MLX4_GET(param->log_num_srqs, outbox, INIT_HCA_LOG_SRQ_OFFSET);
+ MLX4_GET(param->cqc_base, outbox, INIT_HCA_CQC_BASE_OFFSET);
+ MLX4_GET(param->log_num_cqs, outbox, INIT_HCA_LOG_CQ_OFFSET);
+ MLX4_GET(param->altc_base, outbox, INIT_HCA_ALTC_BASE_OFFSET);
+ MLX4_GET(param->auxc_base, outbox, INIT_HCA_AUXC_BASE_OFFSET);
+ MLX4_GET(param->eqc_base, outbox, INIT_HCA_EQC_BASE_OFFSET);
+ MLX4_GET(param->log_num_eqs, outbox, INIT_HCA_LOG_EQ_OFFSET);
+ MLX4_GET(param->num_sys_eqs, outbox, INIT_HCA_NUM_SYS_EQS_OFFSET);
+ MLX4_GET(param->rdmarc_base, outbox, INIT_HCA_RDMARC_BASE_OFFSET);
+ MLX4_GET(param->log_rd_per_qp, outbox, INIT_HCA_LOG_RD_OFFSET);
+
+ MLX4_GET(dword_field, outbox, INIT_HCA_FLAGS_OFFSET);
+ if (dword_field & (1 << INIT_HCA_DEVICE_MANAGED_FLOW_STEERING_EN)) {
+ param->steering_mode = MLX4_STEERING_MODE_DEVICE_MANAGED;
+ } else {
+ MLX4_GET(byte_field, outbox, INIT_HCA_UC_STEERING_OFFSET);
+ if (byte_field & 0x8)
+ param->steering_mode = MLX4_STEERING_MODE_B0;
+ else
+ param->steering_mode = MLX4_STEERING_MODE_A0;
+ }
+
+ if (dword_field & (1 << 13))
+ param->rss_ip_frags = 1;
+
+ /* steering attributes */
+ if (param->steering_mode == MLX4_STEERING_MODE_DEVICE_MANAGED) {
+ MLX4_GET(param->mc_base, outbox, INIT_HCA_FS_BASE_OFFSET);
+ MLX4_GET(param->log_mc_entry_sz, outbox,
+ INIT_HCA_FS_LOG_ENTRY_SZ_OFFSET);
+ MLX4_GET(param->log_mc_table_sz, outbox,
+ INIT_HCA_FS_LOG_TABLE_SZ_OFFSET);
+ MLX4_GET(byte_field, outbox,
+ INIT_HCA_FS_A0_OFFSET);
+ param->dmfs_high_steer_mode =
+ a0_dmfs_query_hw_steering[(byte_field >> 6) & 3];
+
+ param->steering_attr = MLX4_STEERING_ATTR_DMFS_EN;
+
+ MLX4_GET(byte_field, outbox, INIT_HCA_FS_ETH_BITS_OFFSET);
+ if (byte_field & MLX4_FS_IP_SIP_DISABLE)
+ param->steering_attr |= MLX4_STEERING_ATTR_ETH_IGNORE_SIP;
+ MLX4_GET(byte_field, outbox, INIT_HCA_FS_IB_BITS_OFFSET);
+ if (byte_field & MLX4_FS_UDP_UC_EN && byte_field & MLX4_FS_TCP_UC_EN)
+ param->steering_attr |= MLX4_STEERING_ATTR_DMFS_IPOIB;
+ if (byte_field & MLX4_FS_IP_SIP_DISABLE)
+ param->steering_attr |= MLX4_STEERING_ATTR_IB_IGNORE_SIP;
+ } else {
+ MLX4_GET(param->mc_base, outbox, INIT_HCA_MC_BASE_OFFSET);
+ MLX4_GET(param->log_mc_entry_sz, outbox,
+ INIT_HCA_LOG_MC_ENTRY_SZ_OFFSET);
+ MLX4_GET(param->log_mc_hash_sz, outbox,
+ INIT_HCA_LOG_MC_HASH_SZ_OFFSET);
+ MLX4_GET(param->log_mc_table_sz, outbox,
+ INIT_HCA_LOG_MC_TABLE_SZ_OFFSET);
+ }
+
+ /* CX3 is capable of extending CQEs/EQEs from 32 to 64 bytes */
+ MLX4_GET(byte_field, outbox, INIT_HCA_EQE_CQE_OFFSETS);
+ if (byte_field & 0x20) /* 64-bytes eqe enabled */
+ param->dev_cap_enabled |= MLX4_DEV_CAP_64B_EQE_ENABLED;
+ if (byte_field & 0x40) /* 64-bytes cqe enabled */
+ param->dev_cap_enabled |= MLX4_DEV_CAP_64B_CQE_ENABLED;
+
+ /* CX3 is capable of extending CQEs/EQEs to strides larger than 64B */
+ MLX4_GET(byte_field, outbox, INIT_HCA_EQE_CQE_STRIDE_OFFSET);
+ if (byte_field) {
+ param->dev_cap_enabled |= MLX4_DEV_CAP_EQE_STRIDE_ENABLED;
+ param->dev_cap_enabled |= MLX4_DEV_CAP_CQE_STRIDE_ENABLED;
+ param->cqe_size = 1 << ((byte_field &
+ MLX4_CQE_SIZE_MASK_STRIDE) + 5);
+ param->eqe_size = 1 << (((byte_field &
+ MLX4_EQE_SIZE_MASK_STRIDE) >> 4) + 5);
+ }
+
+ /* TPT attributes */
+
+ MLX4_GET(param->dmpt_base, outbox, INIT_HCA_DMPT_BASE_OFFSET);
+ MLX4_GET(param->mw_enabled, outbox, INIT_HCA_TPT_MW_OFFSET);
+ MLX4_GET(param->log_mpt_sz, outbox, INIT_HCA_LOG_MPT_SZ_OFFSET);
+ MLX4_GET(param->mtt_base, outbox, INIT_HCA_MTT_BASE_OFFSET);
+ MLX4_GET(param->cmpt_base, outbox, INIT_HCA_CMPT_BASE_OFFSET);
+
+ /* UAR attributes */
+
+ MLX4_GET(param->uar_page_sz, outbox, INIT_HCA_UAR_PAGE_SZ_OFFSET);
+ MLX4_GET(param->log_uar_sz, outbox, INIT_HCA_LOG_UAR_SZ_OFFSET);
+
+out:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+
+ return err;
+}
+
+static int mlx4_hca_core_clock_update(struct mlx4_dev *dev)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ __be32 *outbox;
+ int err;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox)) {
+ mlx4_warn(dev, "hca_core_clock mailbox allocation failed\n");
+ return PTR_ERR(mailbox);
+ }
+ outbox = mailbox->buf;
+
+ err = mlx4_cmd_box(dev, 0, mailbox->dma, 0, 0,
+ MLX4_CMD_QUERY_HCA,
+ MLX4_CMD_TIME_CLASS_B,
+ !mlx4_is_slave(dev));
+ if (err) {
+ mlx4_warn(dev, "hca_core_clock update failed\n");
+ goto out;
+ }
+
+ MLX4_GET(dev->caps.hca_core_clock, outbox, QUERY_HCA_CORE_CLOCK_OFFSET);
+
+out:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+
+ return err;
+}
+
+/* for IB-type ports only in SRIOV mode. Checks that both proxy QP0
+ * and real QP0 are active, so that the paravirtualized QP0 is ready
+ * to operate */
+static int check_qp0_state(struct mlx4_dev *dev, int function, int port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ /* irrelevant if not infiniband */
+ if (priv->mfunc.master.qp0_state[port].proxy_qp0_active &&
+ priv->mfunc.master.qp0_state[port].qp0_active)
+ return 1;
+ return 0;
+}
+
+int mlx4_INIT_PORT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int port = mlx4_slave_convert_port(dev, slave, vhcr->in_modifier);
+ int err;
+
+ if (port < 0)
+ return -EINVAL;
+
+ if (priv->mfunc.master.slave_state[slave].init_port_mask & (1 << port))
+ return 0;
+
+ if (dev->caps.port_mask[port] != MLX4_PORT_TYPE_IB) {
+ /* Enable port only if it was previously disabled */
+ if (!priv->mfunc.master.init_port_ref[port]) {
+ err = mlx4_cmd(dev, 0, port, 0, MLX4_CMD_INIT_PORT,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+ if (err)
+ return err;
+ }
+ priv->mfunc.master.slave_state[slave].init_port_mask |= (1 << port);
+ } else {
+ if (slave == mlx4_master_func_num(dev)) {
+ if (check_qp0_state(dev, slave, port) &&
+ !priv->mfunc.master.qp0_state[port].port_active) {
+ err = mlx4_cmd(dev, 0, port, 0, MLX4_CMD_INIT_PORT,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+ if (err)
+ return err;
+ priv->mfunc.master.qp0_state[port].port_active = 1;
+ priv->mfunc.master.slave_state[slave].init_port_mask |= (1 << port);
+ }
+ } else
+ priv->mfunc.master.slave_state[slave].init_port_mask |= (1 << port);
+ }
+ ++priv->mfunc.master.init_port_ref[port];
+ return 0;
+}
+
+int mlx4_INIT_PORT(struct mlx4_dev *dev, int port)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ u32 *inbox;
+ int err;
+ u32 flags;
+ u16 field;
+
+ if (dev->flags & MLX4_FLAG_OLD_PORT_CMDS) {
+#define INIT_PORT_IN_SIZE 256
+#define INIT_PORT_FLAGS_OFFSET 0x00
+#define INIT_PORT_FLAG_SIG (1 << 18)
+#define INIT_PORT_FLAG_NG (1 << 17)
+#define INIT_PORT_FLAG_G0 (1 << 16)
+#define INIT_PORT_VL_SHIFT 4
+#define INIT_PORT_PORT_WIDTH_SHIFT 8
+#define INIT_PORT_MTU_OFFSET 0x04
+#define INIT_PORT_MAX_GID_OFFSET 0x06
+#define INIT_PORT_MAX_PKEY_OFFSET 0x0a
+#define INIT_PORT_GUID0_OFFSET 0x10
+#define INIT_PORT_NODE_GUID_OFFSET 0x18
+#define INIT_PORT_SI_GUID_OFFSET 0x20
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ inbox = mailbox->buf;
+
+ flags = 0;
+ flags |= (dev->caps.vl_cap[port] & 0xf) << INIT_PORT_VL_SHIFT;
+ flags |= (dev->caps.port_width_cap[port] & 0xf) << INIT_PORT_PORT_WIDTH_SHIFT;
+ MLX4_PUT(inbox, flags, INIT_PORT_FLAGS_OFFSET);
+
+ field = 128 << dev->caps.ib_mtu_cap[port];
+ MLX4_PUT(inbox, field, INIT_PORT_MTU_OFFSET);
+ field = dev->caps.gid_table_len[port];
+ MLX4_PUT(inbox, field, INIT_PORT_MAX_GID_OFFSET);
+ field = dev->caps.pkey_table_len[port];
+ MLX4_PUT(inbox, field, INIT_PORT_MAX_PKEY_OFFSET);
+
+ err = mlx4_cmd(dev, mailbox->dma, port, 0, MLX4_CMD_INIT_PORT,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ } else
+ err = mlx4_cmd(dev, 0, port, 0, MLX4_CMD_INIT_PORT,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+
+ if (!err)
+ mlx4_hca_core_clock_update(dev);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_INIT_PORT);
+
+int mlx4_CLOSE_PORT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int port = mlx4_slave_convert_port(dev, slave, vhcr->in_modifier);
+ int err;
+
+ if (port < 0)
+ return -EINVAL;
+
+ if (!(priv->mfunc.master.slave_state[slave].init_port_mask &
+ (1 << port)))
+ return 0;
+
+ if (dev->caps.port_mask[port] != MLX4_PORT_TYPE_IB) {
+ if (priv->mfunc.master.init_port_ref[port] == 1) {
+ err = mlx4_cmd(dev, 0, port, 0, MLX4_CMD_CLOSE_PORT,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+ if (err)
+ return err;
+ }
+ priv->mfunc.master.slave_state[slave].init_port_mask &= ~(1 << port);
+ } else {
+ /* infiniband port */
+ if (slave == mlx4_master_func_num(dev)) {
+ if (!priv->mfunc.master.qp0_state[port].qp0_active &&
+ priv->mfunc.master.qp0_state[port].port_active) {
+ err = mlx4_cmd(dev, 0, port, 0, MLX4_CMD_CLOSE_PORT,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+ if (err)
+ return err;
+ priv->mfunc.master.slave_state[slave].init_port_mask &= ~(1 << port);
+ priv->mfunc.master.qp0_state[port].port_active = 0;
+ }
+ } else
+ priv->mfunc.master.slave_state[slave].init_port_mask &= ~(1 << port);
+ }
+ --priv->mfunc.master.init_port_ref[port];
+ return 0;
+}
+
+int mlx4_CLOSE_PORT(struct mlx4_dev *dev, int port)
+{
+ return mlx4_cmd(dev, 0, port, 0, MLX4_CMD_CLOSE_PORT,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+}
+EXPORT_SYMBOL_GPL(mlx4_CLOSE_PORT);
+
+int mlx4_CLOSE_HCA(struct mlx4_dev *dev, int panic)
+{
+ return mlx4_cmd(dev, 0, 0, panic, MLX4_CMD_CLOSE_HCA,
+ MLX4_CMD_TIME_CLASS_C, MLX4_CMD_NATIVE);
+}
+
+struct mlx4_config_dev {
+ __be32 update_flags;
+ __be32 rsvd1[3];
+ __be16 vxlan_udp_dport;
+ __be16 rsvd2;
+ __be16 roce_v2_entropy;
+ __be16 roce_v2_udp_dport;
+ __be32 roce_flags;
+ __be32 rsvd4[25];
+ __be16 rsvd5;
+ u8 rsvd6;
+ u8 rx_checksum_val;
+};
+
+#define MLX4_VXLAN_UDP_DPORT (1 << 0)
+#define MLX4_ROCE_V2_UDP_DPORT BIT(3)
+#define MLX4_DISABLE_RX_PORT BIT(18)
+
+static int mlx4_CONFIG_DEV_set(struct mlx4_dev *dev, struct mlx4_config_dev *config_dev)
+{
+ int err;
+ struct mlx4_cmd_mailbox *mailbox;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ memcpy(mailbox->buf, config_dev, sizeof(*config_dev));
+
+ err = mlx4_cmd(dev, mailbox->dma, 0, 0, MLX4_CMD_CONFIG_DEV,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_NATIVE);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+
+static int mlx4_CONFIG_DEV_get(struct mlx4_dev *dev, struct mlx4_config_dev *config_dev)
+{
+ int err;
+ struct mlx4_cmd_mailbox *mailbox;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ err = mlx4_cmd_box(dev, 0, mailbox->dma, 0, 1, MLX4_CMD_CONFIG_DEV,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+
+ if (!err)
+ memcpy(config_dev, mailbox->buf, sizeof(*config_dev));
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+
+/* Conversion between the HW values and the actual functionality.
+ * The value is represented by the array index,
+ * and the functionality is determined by the flags.
+ */
+static const u8 config_dev_csum_flags[] = {
+ [0] = 0,
+ [1] = MLX4_RX_CSUM_MODE_VAL_NON_TCP_UDP,
+ [2] = MLX4_RX_CSUM_MODE_VAL_NON_TCP_UDP |
+ MLX4_RX_CSUM_MODE_L4,
+ [3] = MLX4_RX_CSUM_MODE_L4 |
+ MLX4_RX_CSUM_MODE_IP_OK_IP_NON_TCP_UDP |
+ MLX4_RX_CSUM_MODE_MULTI_VLAN,
+ [4] = MLX4_RX_CSUM_MODE_VAL_NON_TCP_UDP |
+ MLX4_RX_CSUM_MODE_L4 |
+ MLX4_RX_CSUM_MODE_IP_OK_IP_NON_TCP_UDP
+};
+
+int mlx4_config_dev_retrieval(struct mlx4_dev *dev,
+ struct mlx4_config_dev_params *params)
+{
+ struct mlx4_config_dev config_dev = {0};
+ int err;
+ u8 csum_mask;
+
+#define CONFIG_DEV_RX_CSUM_MODE_MASK 0x7
+#define CONFIG_DEV_RX_CSUM_MODE_PORT1_BIT_OFFSET 0
+#define CONFIG_DEV_RX_CSUM_MODE_PORT2_BIT_OFFSET 4
+
+ if (!(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_CONFIG_DEV))
+ return -ENOTSUPP;
+
+ err = mlx4_CONFIG_DEV_get(dev, &config_dev);
+ if (err)
+ return err;
+
+ csum_mask = (config_dev.rx_checksum_val >> CONFIG_DEV_RX_CSUM_MODE_PORT1_BIT_OFFSET) &
+ CONFIG_DEV_RX_CSUM_MODE_MASK;
+
+ if (csum_mask >= sizeof(config_dev_csum_flags)/sizeof(config_dev_csum_flags[0]))
+ return -EINVAL;
+ params->rx_csum_flags_port_1 = config_dev_csum_flags[csum_mask];
+
+ csum_mask = (config_dev.rx_checksum_val >> CONFIG_DEV_RX_CSUM_MODE_PORT2_BIT_OFFSET) &
+ CONFIG_DEV_RX_CSUM_MODE_MASK;
+
+ if (csum_mask >= sizeof(config_dev_csum_flags)/sizeof(config_dev_csum_flags[0]))
+ return -EINVAL;
+ params->rx_csum_flags_port_2 = config_dev_csum_flags[csum_mask];
+
+ params->vxlan_udp_dport = be16_to_cpu(config_dev.vxlan_udp_dport);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_config_dev_retrieval);
+
+int mlx4_config_vxlan_port(struct mlx4_dev *dev, __be16 udp_port)
+{
+ struct mlx4_config_dev config_dev;
+
+ memset(&config_dev, 0, sizeof(config_dev));
+ config_dev.update_flags = cpu_to_be32(MLX4_VXLAN_UDP_DPORT);
+ config_dev.vxlan_udp_dport = udp_port;
+
+ return mlx4_CONFIG_DEV_set(dev, &config_dev);
+}
+EXPORT_SYMBOL_GPL(mlx4_config_vxlan_port);
+
+#define CONFIG_DISABLE_RX_PORT BIT(15)
+int mlx4_disable_rx_port_check(struct mlx4_dev *dev, bool dis)
+{
+ struct mlx4_config_dev config_dev;
+
+ memset(&config_dev, 0, sizeof(config_dev));
+ config_dev.update_flags = cpu_to_be32(MLX4_DISABLE_RX_PORT);
+ if (dis)
+ config_dev.roce_flags =
+ cpu_to_be32(CONFIG_DISABLE_RX_PORT);
+
+ return mlx4_CONFIG_DEV_set(dev, &config_dev);
+}
+
+int mlx4_config_roce_v2_port(struct mlx4_dev *dev, u16 udp_port)
+{
+ struct mlx4_config_dev config_dev;
+
+ memset(&config_dev, 0, sizeof(config_dev));
+ config_dev.update_flags = cpu_to_be32(MLX4_ROCE_V2_UDP_DPORT);
+ config_dev.roce_v2_udp_dport = cpu_to_be16(udp_port);
+
+ return mlx4_CONFIG_DEV_set(dev, &config_dev);
+}
+EXPORT_SYMBOL_GPL(mlx4_config_roce_v2_port);
+
+int mlx4_virt2phy_port_map(struct mlx4_dev *dev, u32 port1, u32 port2)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ struct {
+ __be32 v_port1;
+ __be32 v_port2;
+ } *v2p;
+ int err;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ v2p = mailbox->buf;
+ v2p->v_port1 = cpu_to_be32(port1);
+ v2p->v_port2 = cpu_to_be32(port2);
+
+ err = mlx4_cmd(dev, mailbox->dma, 0,
+ MLX4_SET_PORT_VIRT2PHY, MLX4_CMD_VIRT_PORT_MAP,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_NATIVE);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+
+
+int mlx4_SET_ICM_SIZE(struct mlx4_dev *dev, u64 icm_size, u64 *aux_pages)
+{
+ int ret = mlx4_cmd_imm(dev, icm_size, aux_pages, 0, 0,
+ MLX4_CMD_SET_ICM_SIZE,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+ if (ret)
+ return ret;
+
+ /*
+ * Round up number of system pages needed in case
+ * MLX4_ICM_PAGE_SIZE < PAGE_SIZE.
+ */
+ *aux_pages = ALIGN(*aux_pages, PAGE_SIZE / MLX4_ICM_PAGE_SIZE) >>
+ (PAGE_SHIFT - MLX4_ICM_PAGE_SHIFT);
+
+ return 0;
+}
+
+int mlx4_NOP(struct mlx4_dev *dev)
+{
+ /* Input modifier of 0x1f means "finish as soon as possible." */
+ return mlx4_cmd(dev, 0, 0x1f, 0, MLX4_CMD_NOP, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+}
+
+int mlx4_get_phys_port_id(struct mlx4_dev *dev)
+{
+ u8 port;
+ u32 *outbox;
+ struct mlx4_cmd_mailbox *mailbox;
+ u32 in_mod;
+ u32 guid_hi, guid_lo;
+ int err, ret = 0;
+#define MOD_STAT_CFG_PORT_OFFSET 8
+#define MOD_STAT_CFG_GUID_H 0x14
+#define MOD_STAT_CFG_GUID_L 0x1c
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ outbox = mailbox->buf;
+
+ for (port = 1; port <= dev->caps.num_ports; port++) {
+ in_mod = port << MOD_STAT_CFG_PORT_OFFSET;
+ err = mlx4_cmd_box(dev, 0, mailbox->dma, in_mod, 0x2,
+ MLX4_CMD_MOD_STAT_CFG, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+ if (err) {
+ mlx4_err(dev, "Failed to get port %d uplink guid\n",
+ port);
+ ret = err;
+ } else {
+ MLX4_GET(guid_hi, outbox, MOD_STAT_CFG_GUID_H);
+ MLX4_GET(guid_lo, outbox, MOD_STAT_CFG_GUID_L);
+ dev->caps.phys_port_id[port] = (u64)guid_lo |
+ (u64)guid_hi << 32;
+ }
+ }
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return ret;
+}
+
+int mlx4_query_diag_counters(struct mlx4_dev *dev, int array_length,
+ u8 op_modifier, u32 in_offset[],
+ u32 counter_out[])
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ u32 *outbox;
+ int ret;
+ int i;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ outbox = mailbox->buf;
+
+ ret = mlx4_cmd_box(dev, 0, mailbox->dma, 0, op_modifier,
+ MLX4_CMD_DIAG_RPRT, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+ if (ret)
+ goto out;
+
+ for (i = 0; i < array_length; i++) {
+ if (in_offset[i] > MLX4_MAILBOX_SIZE - sizeof(u32)) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ MLX4_GET(counter_out[i], outbox, in_offset[i]);
+ }
+
+out:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(mlx4_query_diag_counters);
+
+#define MLX4_WOL_SETUP_MODE (5 << 28)
+int mlx4_wol_read(struct mlx4_dev *dev, u64 *config, int port)
+{
+ u32 in_mod = MLX4_WOL_SETUP_MODE | port << 8;
+
+ return mlx4_cmd_imm(dev, 0, config, in_mod, 0x3,
+ MLX4_CMD_MOD_STAT_CFG, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+}
+EXPORT_SYMBOL_GPL(mlx4_wol_read);
+
+int mlx4_wol_write(struct mlx4_dev *dev, u64 config, int port)
+{
+ u32 in_mod = MLX4_WOL_SETUP_MODE | port << 8;
+
+ return mlx4_cmd(dev, config, in_mod, 0x1, MLX4_CMD_MOD_STAT_CFG,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+}
+EXPORT_SYMBOL_GPL(mlx4_wol_write);
+
+enum {
+ ADD_TO_MCG = 0x26,
+};
+
+#ifdef KMOD_REMOVED
+void mlx4_opreq_action(struct work_struct *work)
+{
+ struct mlx4_priv *priv = container_of(work, struct mlx4_priv,
+ opreq_task);
+ struct mlx4_dev *dev = &priv->dev;
+ int num_tasks = atomic_read(&priv->opreq_count);
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_mgm *mgm;
+ u32 *outbox;
+ u32 modifier;
+ u16 token;
+ u16 type;
+ int err;
+ u32 num_qps;
+ struct mlx4_qp qp;
+ int i;
+ u8 rem_mcg;
+ u8 prot;
+
+#define GET_OP_REQ_MODIFIER_OFFSET 0x08
+#define GET_OP_REQ_TOKEN_OFFSET 0x14
+#define GET_OP_REQ_TYPE_OFFSET 0x1a
+#define GET_OP_REQ_DATA_OFFSET 0x20
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox)) {
+ mlx4_err(dev, "Failed to allocate mailbox for GET_OP_REQ\n");
+ return;
+ }
+ outbox = mailbox->buf;
+
+ while (num_tasks) {
+ err = mlx4_cmd_box(dev, 0, mailbox->dma, 0, 0,
+ MLX4_CMD_GET_OP_REQ, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+ if (err) {
+ mlx4_err(dev, "Failed to retrieve required operation: %d\n",
+ err);
+ goto out;
+ }
+ MLX4_GET(modifier, outbox, GET_OP_REQ_MODIFIER_OFFSET);
+ MLX4_GET(token, outbox, GET_OP_REQ_TOKEN_OFFSET);
+ MLX4_GET(type, outbox, GET_OP_REQ_TYPE_OFFSET);
+ type &= 0xfff;
+
+ switch (type) {
+ case ADD_TO_MCG:
+ if (dev->caps.steering_mode ==
+ MLX4_STEERING_MODE_DEVICE_MANAGED) {
+ mlx4_warn(dev, "ADD MCG operation is not supported in DEVICE_MANAGED steering mode\n");
+ err = EPERM;
+ break;
+ }
+ mgm = (struct mlx4_mgm *)((u8 *)(outbox) +
+ GET_OP_REQ_DATA_OFFSET);
+ num_qps = be32_to_cpu(mgm->members_count) &
+ MGM_QPN_MASK;
+ rem_mcg = ((u8 *)(&mgm->members_count))[0] & 1;
+ prot = ((u8 *)(&mgm->members_count))[0] >> 6;
+
+ for (i = 0; i < num_qps; i++) {
+ qp.qpn = be32_to_cpu(mgm->qp[i]);
+ if (rem_mcg)
+ err = mlx4_multicast_detach(dev, &qp,
+ mgm->gid,
+ prot, 0);
+ else
+ err = mlx4_multicast_attach(dev, &qp,
+ mgm->gid,
+ mgm->gid[5]
+ , 0, prot,
+ NULL);
+ if (err)
+ break;
+ }
+ break;
+ default:
+ mlx4_warn(dev, "Bad type for required operation\n");
+ err = EINVAL;
+ break;
+ }
+ err = mlx4_cmd(dev, 0, ((u32) err |
+ (__force u32)cpu_to_be32(token) << 16),
+ 1, MLX4_CMD_GET_OP_REQ, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+ if (err) {
+ mlx4_err(dev, "Failed to acknowledge required request: %d\n",
+ err);
+ goto out;
+ }
+ memset(outbox, 0, 0xffc);
+ num_tasks = atomic_dec_return(&priv->opreq_count);
+ }
+
+out:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+}
+#endif
+
+static int mlx4_check_smp_firewall_active(struct mlx4_dev *dev,
+ struct mlx4_cmd_mailbox *mailbox)
+{
+#define MLX4_CMD_MAD_DEMUX_SET_ATTR_OFFSET 0x10
+#define MLX4_CMD_MAD_DEMUX_GETRESP_ATTR_OFFSET 0x20
+#define MLX4_CMD_MAD_DEMUX_TRAP_ATTR_OFFSET 0x40
+#define MLX4_CMD_MAD_DEMUX_TRAP_REPRESS_ATTR_OFFSET 0x70
+
+ u32 set_attr_mask, getresp_attr_mask;
+ u32 trap_attr_mask, traprepress_attr_mask;
+
+ MLX4_GET(set_attr_mask, mailbox->buf,
+ MLX4_CMD_MAD_DEMUX_SET_ATTR_OFFSET);
+ mlx4_dbg(dev, "SMP firewall set_attribute_mask = 0x%x\n",
+ set_attr_mask);
+
+ MLX4_GET(getresp_attr_mask, mailbox->buf,
+ MLX4_CMD_MAD_DEMUX_GETRESP_ATTR_OFFSET);
+ mlx4_dbg(dev, "SMP firewall getresp_attribute_mask = 0x%x\n",
+ getresp_attr_mask);
+
+ MLX4_GET(trap_attr_mask, mailbox->buf,
+ MLX4_CMD_MAD_DEMUX_TRAP_ATTR_OFFSET);
+ mlx4_dbg(dev, "SMP firewall trap_attribute_mask = 0x%x\n",
+ trap_attr_mask);
+
+ MLX4_GET(traprepress_attr_mask, mailbox->buf,
+ MLX4_CMD_MAD_DEMUX_TRAP_REPRESS_ATTR_OFFSET);
+ mlx4_dbg(dev, "SMP firewall traprepress_attribute_mask = 0x%x\n",
+ traprepress_attr_mask);
+
+ if (set_attr_mask && getresp_attr_mask && trap_attr_mask &&
+ traprepress_attr_mask)
+ return 1;
+
+ return 0;
+}
+
+int mlx4_config_mad_demux(struct mlx4_dev *dev)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ int secure_host_active;
+ int err;
+
+ /* Check if mad_demux is supported */
+ if (!(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_MAD_DEMUX))
+ return 0;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox)) {
+ mlx4_warn(dev, "Failed to allocate mailbox for cmd MAD_DEMUX\n");
+ return -ENOMEM;
+ }
+
+ /* Query mad_demux to find out which MADs are handled by internal sma */
+ err = mlx4_cmd_box(dev, 0, mailbox->dma, 0x01 /* subn mgmt class */,
+ MLX4_CMD_MAD_DEMUX_QUERY_RESTR, MLX4_CMD_MAD_DEMUX,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_NATIVE);
+ if (err) {
+ mlx4_warn(dev, "MLX4_CMD_MAD_DEMUX: query restrictions failed (%d)\n",
+ err);
+ goto out;
+ }
+
+ secure_host_active = mlx4_check_smp_firewall_active(dev, mailbox);
+
+ /* Config mad_demux to handle all MADs returned by the query above */
+ err = mlx4_cmd(dev, mailbox->dma, 0x01 /* subn mgmt class */,
+ MLX4_CMD_MAD_DEMUX_CONFIG, MLX4_CMD_MAD_DEMUX,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_NATIVE);
+ if (err) {
+ mlx4_warn(dev, "MLX4_CMD_MAD_DEMUX: configure failed (%d)\n", err);
+ goto out;
+ }
+
+ if (secure_host_active)
+ mlx4_warn(dev, "HCA operating in secure-host mode. SMP firewall activated.\n");
+out:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+
+/* Access Reg commands */
+enum mlx4_access_reg_masks {
+ MLX4_ACCESS_REG_STATUS_MASK = 0x7f,
+ MLX4_ACCESS_REG_METHOD_MASK = 0x7f,
+ MLX4_ACCESS_REG_LEN_MASK = 0x7ff
+};
+
+struct mlx4_access_reg {
+ __be16 constant1;
+ u8 status;
+ u8 resrvd1;
+ __be16 reg_id;
+ u8 method;
+ u8 constant2;
+ __be32 resrvd2[2];
+ __be16 len_const;
+ __be16 resrvd3;
+#define MLX4_ACCESS_REG_HEADER_SIZE (20)
+ u8 reg_data[MLX4_MAILBOX_SIZE-MLX4_ACCESS_REG_HEADER_SIZE];
+} __attribute__((__packed__));
+
+/**
+ * mlx4_ACCESS_REG - Generic access reg command.
+ * @dev: mlx4_dev.
+ * @reg_id: register ID to access.
+ * @method: Access method Read/Write.
+ * @reg_len: register length to Read/Write in bytes.
+ * @reg_data: reg_data pointer to Read/Write From/To.
+ *
+ * Access ConnectX registers FW command.
+ * Returns 0 on success and copies outbox mlx4_access_reg data
+ * field into reg_data or a negative error code.
+ */
+static int mlx4_ACCESS_REG(struct mlx4_dev *dev, u16 reg_id,
+ enum mlx4_access_reg_method method,
+ u16 reg_len, void *reg_data)
+{
+ struct mlx4_cmd_mailbox *inbox, *outbox;
+ struct mlx4_access_reg *inbuf, *outbuf;
+ int err;
+
+ inbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(inbox))
+ return PTR_ERR(inbox);
+
+ outbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(outbox)) {
+ mlx4_free_cmd_mailbox(dev, inbox);
+ return PTR_ERR(outbox);
+ }
+
+ inbuf = inbox->buf;
+ outbuf = outbox->buf;
+
+ inbuf->constant1 = cpu_to_be16(0x1<<11 | 0x4);
+ inbuf->constant2 = 0x1;
+ inbuf->reg_id = cpu_to_be16(reg_id);
+ inbuf->method = method & MLX4_ACCESS_REG_METHOD_MASK;
+
+ reg_len = min(reg_len, (u16)(sizeof(inbuf->reg_data)));
+ inbuf->len_const =
+ cpu_to_be16(((reg_len/4 + 1) & MLX4_ACCESS_REG_LEN_MASK) |
+ ((0x3) << 12));
+
+ memcpy(inbuf->reg_data, reg_data, reg_len);
+ err = mlx4_cmd_box(dev, inbox->dma, outbox->dma, 0, 0,
+ MLX4_CMD_ACCESS_REG, MLX4_CMD_TIME_CLASS_C,
+ MLX4_CMD_WRAPPED);
+ if (err)
+ goto out;
+
+ if (outbuf->status & MLX4_ACCESS_REG_STATUS_MASK) {
+ err = outbuf->status & MLX4_ACCESS_REG_STATUS_MASK;
+ mlx4_err(dev,
+ "MLX4_CMD_ACCESS_REG(%x) returned REG status (%x)\n",
+ reg_id, err);
+ goto out;
+ }
+
+ memcpy(reg_data, outbuf->reg_data, reg_len);
+out:
+ mlx4_free_cmd_mailbox(dev, inbox);
+ mlx4_free_cmd_mailbox(dev, outbox);
+ return err;
+}
+
+/* ConnectX registers IDs */
+enum mlx4_reg_id {
+ MLX4_REG_ID_PTYS = 0x5004,
+};
+
+/**
+ * mlx4_ACCESS_PTYS_REG - Access PTYS (Port Type and Speed)
+ * register
+ * @dev: mlx4_dev.
+ * @method: Access method Read/Write.
+ * @ptys_reg: PTYS register data pointer.
+ *
+ * Access ConnectX PTYS register, to Read/Write Port Type/Speed
+ * configuration
+ * Returns 0 on success or a negative error code.
+ */
+int mlx4_ACCESS_PTYS_REG(struct mlx4_dev *dev,
+ enum mlx4_access_reg_method method,
+ struct mlx4_ptys_reg *ptys_reg)
+{
+ return mlx4_ACCESS_REG(dev, MLX4_REG_ID_PTYS,
+ method, sizeof(*ptys_reg), ptys_reg);
+}
+EXPORT_SYMBOL_GPL(mlx4_ACCESS_PTYS_REG);
+
+int mlx4_ACCESS_REG_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ struct mlx4_access_reg *inbuf = inbox->buf;
+ u8 method = inbuf->method & MLX4_ACCESS_REG_METHOD_MASK;
+ u16 reg_id = be16_to_cpu(inbuf->reg_id);
+
+ if (slave != mlx4_master_func_num(dev) &&
+ method == MLX4_ACCESS_REG_WRITE)
+ return -EPERM;
+
+ if (reg_id == MLX4_REG_ID_PTYS) {
+ struct mlx4_ptys_reg *ptys_reg =
+ (struct mlx4_ptys_reg *)inbuf->reg_data;
+
+ ptys_reg->local_port =
+ mlx4_slave_convert_port(dev, slave,
+ ptys_reg->local_port);
+ }
+
+ return mlx4_cmd_box(dev, inbox->dma, outbox->dma, vhcr->in_modifier,
+ 0, MLX4_CMD_ACCESS_REG, MLX4_CMD_TIME_CLASS_C,
+ MLX4_CMD_NATIVE);
+}
+
+#define MLX4_ROCE_ADDR_L3_TYPE_IPV4 0
+#define MLX4_ROCE_ADDR_L3_TYPE_IPV6 1
+
+int mlx4_update_roce_addr_table(struct mlx4_dev *dev, u8 port_num,
+ struct mlx4_roce_addr_table *table,
+ int native_or_wrapped)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ int i;
+ int err;
+ u32 in_modifier;
+
+ if ((native_or_wrapped != MLX4_CMD_WRAPPED) &&
+ (native_or_wrapped != MLX4_CMD_NATIVE))
+ return -EINVAL;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR_OR_NULL(mailbox))
+ return -ENOMEM;
+
+ if (dev->caps.roce_addr_support) {
+ struct {
+ u8 gid[MLX4_GID_LEN];
+ __be32 rsrvd1[2];
+ __be16 rsrvd2;
+ u8 type;
+ u8 version;
+ __be32 rsrvd3;
+ } *gid_tbl;
+
+ gid_tbl = mailbox->buf;
+ for (i = 0; i < MLX4_MAX_PORT_GIDS; ++i) {
+ memcpy(gid_tbl[i].gid, table->addr[i].gid, MLX4_GID_LEN);
+ gid_tbl[i].version = table->addr[i].type;
+
+ if (table->addr[i].type != MLX4_ROCE_GID_TYPE_V1) {
+ if (ipv6_addr_v4mapped((struct in6_addr *)table->addr[i].gid))
+ gid_tbl[i].type = MLX4_ROCE_ADDR_L3_TYPE_IPV4;
+ else
+ gid_tbl[i].type = MLX4_ROCE_ADDR_L3_TYPE_IPV6;
+ }
+ }
+ in_modifier = MLX4_SET_PORT_ROCE_ADDR;
+ } else {
+ struct {
+ u8 gid[MLX4_GID_LEN];
+ } *gid_tbl;
+
+ gid_tbl = mailbox->buf;
+
+ for (i = 0; i < MLX4_MAX_PORT_GIDS; ++i)
+ memcpy(gid_tbl[i].gid, table->addr[i].gid, MLX4_GID_LEN);
+ in_modifier = MLX4_SET_PORT_GID_TABLE;
+ }
+
+ err = mlx4_cmd(dev, mailbox->dma,
+ in_modifier << 8 | port_num,
+ MLX4_SET_PORT_ETH_OPCODE, MLX4_CMD_SET_PORT, MLX4_CMD_TIME_CLASS_B,
+ native_or_wrapped);
+ if (!err && mlx4_is_bonded(dev))
+ err = mlx4_cmd(dev, mailbox->dma,
+ in_modifier << 8 | 2,
+ MLX4_SET_PORT_ETH_OPCODE, MLX4_CMD_SET_PORT, MLX4_CMD_TIME_CLASS_B,
+ native_or_wrapped);
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_update_roce_addr_table);
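
Side note on the rounding in mlx4_SET_ICM_SIZE above: the firmware reports
aux_pages in ICM pages, and the driver converts that to system pages, rounding
up. Below is a minimal standalone sketch of that arithmetic, not the driver
code itself. The names align_up and icm_to_sys_pages are hypothetical, and
ICM_PAGE_SHIFT = 12 is an assumption about MLX4_ICM_PAGE_SHIFT, which is not
defined in this part of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define ICM_PAGE_SHIFT 12 /* assumed value of MLX4_ICM_PAGE_SHIFT */

/* Round x up to a multiple of a; a must be a power of two (like ALIGN()). */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Convert a count of ICM pages into system pages, rounding up.
 * page_shift stands in for PAGE_SHIFT and must be >= ICM_PAGE_SHIFT.
 */
static uint64_t icm_to_sys_pages(uint64_t icm_pages, unsigned int page_shift)
{
	unsigned int delta;

	assert(page_shift >= ICM_PAGE_SHIFT);
	delta = page_shift - ICM_PAGE_SHIFT;

	/* Same shape as: ALIGN(n, PAGE_SIZE / ICM_PAGE_SIZE) >> delta */
	return align_up(icm_pages, 1ULL << delta) >> delta;
}
```

With 64 KiB system pages (page_shift = 16) one system page holds 16 ICM pages,
so 17 ICM pages need 2 system pages. With 4 KiB system pages the count is
unchanged.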
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/fw.h b/drivers/net/mlnx_uio/mlnx/mlx4/fw.h
new file mode 100644
index 0000000..04466c5
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/fw.h
@@ -0,0 +1,270 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007, 2008 Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2006, 2007 Cisco Systems. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef MLX4_FW_H
+#define MLX4_FW_H
+
+#include "mlx4.h"
+#include "icm.h"
+
+struct mlx4_mod_stat_cfg {
+ u8 log_pg_sz;
+ u8 log_pg_sz_m;
+};
+
+struct mlx4_port_cap {
+ u8 supported_port_types;
+ u8 suggested_type;
+ u8 default_sense;
+ u8 log_max_macs;
+ u8 log_max_vlans;
+ int ib_mtu;
+ int max_port_width;
+ int max_vl;
+ int max_gids;
+ int max_pkeys;
+ u64 def_mac;
+ u16 eth_mtu;
+ int trans_type;
+ int vendor_oui;
+ u16 wavelength;
+ u64 trans_code;
+ u8 dmfs_optimized_state;
+};
+
+struct mlx4_dev_cap {
+ int max_srq_sz;
+ int max_qp_sz;
+ int reserved_qps;
+ int max_qps;
+ int reserved_srqs;
+ int max_srqs;
+ int max_cq_sz;
+ int reserved_cqs;
+ int max_cqs;
+ int max_mpts;
+ int reserved_eqs;
+ int max_eqs;
+ int num_sys_eqs;
+ int reserved_mtts;
+ int max_mrw_sz;
+ int reserved_mrws;
+ int max_mtt_seg;
+ int max_requester_per_qp;
+ int max_responder_per_qp;
+ int max_rdma_global;
+ int local_ca_ack_delay;
+ int num_ports;
+ u32 max_msg_sz;
+ u16 stat_rate_support;
+ int fs_log_max_ucast_qp_range_size;
+ int fs_max_num_qp_per_entry;
+ u64 flags;
+ u64 flags2;
+ int reserved_uars;
+ int uar_size;
+ int min_page_sz;
+ int bf_reg_size;
+ int bf_regs_per_page;
+ int max_sq_sg;
+ int max_sq_desc_sz;
+ int max_rq_sg;
+ int max_rq_desc_sz;
+ int max_qp_per_mcg;
+ int reserved_mgms;
+ int max_mcgs;
+ int reserved_pds;
+ int max_pds;
+ int reserved_xrcds;
+ int max_xrcds;
+ int qpc_entry_sz;
+ int rdmarc_entry_sz;
+ int altc_entry_sz;
+ int aux_entry_sz;
+ int srq_entry_sz;
+ int cqc_entry_sz;
+ int eqc_entry_sz;
+ int dmpt_entry_sz;
+ int cmpt_entry_sz;
+ int mtt_entry_sz;
+ int resize_srq;
+ u32 bmme_flags;
+ u32 reserved_lkey;
+ u64 max_icm_sz;
+ int max_gso_sz;
+ int max_rss_tbl_sz;
+ u32 max_basic_counters;
+ u32 sync_qp;
+ u32 max_extended_counters;
+ u32 mad_demux;
+ u8 cq_overrun;
+ u32 dmfs_high_rate_qpn_base;
+ u32 dmfs_high_rate_qpn_range;
+ struct mlx4_port_cap port_cap[MLX4_MAX_PORTS + 1];
+};
+
+struct mlx4_func_cap {
+ u8 num_ports;
+ u8 flags;
+ u32 pf_context_behaviour;
+ int qp_quota;
+ int cq_quota;
+ int srq_quota;
+ int mpt_quota;
+ int mtt_quota;
+ int max_eq;
+ int reserved_eq;
+ int mcg_quota;
+ u32 qp0_qkey;
+ u32 qp0_tunnel_qpn;
+ u32 qp0_proxy_qpn;
+ u32 qp1_tunnel_qpn;
+ u32 qp1_proxy_qpn;
+ u32 reserved_lkey;
+ u8 physical_port;
+ u8 port_flags;
+ u8 flags1;
+ u64 phys_port_id;
+ u8 def_counter_index;
+ u32 extra_flags;
+};
+
+struct mlx4_func {
+ int bus;
+ int device;
+ int function;
+ int physical_function;
+ int rsvd_eqs;
+ int max_eq;
+ int rsvd_uars;
+};
+
+struct mlx4_adapter {
+ char board_id[MLX4_BOARD_ID_LEN];
+ u8 inta_pin;
+};
+
+struct mlx4_init_hca_param {
+ u64 qpc_base;
+ u64 rdmarc_base;
+ u64 auxc_base;
+ u64 altc_base;
+ u64 srqc_base;
+ u64 cqc_base;
+ u64 eqc_base;
+ u64 mc_base;
+ u64 dmpt_base;
+ u64 cmpt_base;
+ u64 mtt_base;
+ u64 global_caps;
+ u16 log_mc_entry_sz;
+ u16 log_mc_hash_sz;
+ u16 hca_core_clock; /* Internal Clock Frequency (in MHz) */
+ u8 log_num_qps;
+ u8 log_num_srqs;
+ u8 log_num_cqs;
+ u8 log_num_eqs;
+ u16 num_sys_eqs;
+ u8 log_rd_per_qp;
+ u8 log_mc_table_sz;
+ u8 log_mpt_sz;
+ u8 log_uar_sz;
+ u8 mw_enabled; /* Enable memory windows */
+ u8 uar_page_sz; /* log pg sz in 4k chunks */
+ u8 steering_mode; /* for QUERY_HCA */
+ u8 dmfs_high_steer_mode; /* for QUERY_HCA */
+ u8 steering_attr; /* for QUERY_HCA */
+ u64 dev_cap_enabled;
+ u16 cqe_size; /* For use only when CQE stride feature enabled */
+ u16 eqe_size; /* For use only when EQE stride feature enabled */
+ u8 rss_ip_frags;
+};
+
+struct mlx4_init_ib_param {
+ int port_width;
+ int vl_cap;
+ int mtu_cap;
+ u16 gid_cap;
+ u16 pkey_cap;
+ int set_guid0;
+ u64 guid0;
+ int set_node_guid;
+ u64 node_guid;
+ int set_si_guid;
+ u64 si_guid;
+};
+
+struct mlx4_set_ib_param {
+ int set_si_guid;
+ int reset_qkey_viol;
+ u64 si_guid;
+ u32 cap_mask;
+};
+
+void mlx4_dev_cap_dump(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap);
+int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap);
+int mlx4_QUERY_PORT(struct mlx4_dev *dev, int port, struct mlx4_port_cap *port_cap);
+int mlx4_QUERY_FUNC_CAP(struct mlx4_dev *dev, u8 gen_or_port,
+ struct mlx4_func_cap *func_cap);
+int mlx4_QUERY_FUNC_CAP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_QUERY_FUNC(struct mlx4_dev *dev, struct mlx4_func *func, int slave);
+int mlx4_MAP_FA(struct mlx4_dev *dev, struct mlx4_icm *icm);
+int mlx4_UNMAP_FA(struct mlx4_dev *dev);
+int mlx4_RUN_FW(struct mlx4_dev *dev);
+int mlx4_QUERY_FW(struct mlx4_dev *dev);
+int mlx4_QUERY_ADAPTER(struct mlx4_dev *dev, struct mlx4_adapter *adapter);
+int mlx4_INIT_HCA(struct mlx4_dev *dev, struct mlx4_init_hca_param *param);
+int mlx4_QUERY_HCA(struct mlx4_dev *dev, struct mlx4_init_hca_param *param);
+int mlx4_CLOSE_HCA(struct mlx4_dev *dev, int panic);
+int mlx4_map_cmd(struct mlx4_dev *dev, u16 op, struct mlx4_icm *icm, u64 virt);
+int mlx4_SET_ICM_SIZE(struct mlx4_dev *dev, u64 icm_size, u64 *aux_pages);
+int mlx4_MAP_ICM_AUX(struct mlx4_dev *dev, struct mlx4_icm *icm);
+int mlx4_UNMAP_ICM_AUX(struct mlx4_dev *dev);
+int mlx4_NOP(struct mlx4_dev *dev);
+int mlx4_MOD_STAT_CFG(struct mlx4_dev *dev, struct mlx4_mod_stat_cfg *cfg);
+#ifdef KMOD_REMOVED
+void mlx4_opreq_action(struct work_struct *work);
+#endif
+
+#endif /* MLX4_FW_H */
+
+#include "post_kmod.h"
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/fw_qos.c b/drivers/net/mlnx_uio/mlnx/mlx4/fw_qos.c
new file mode 100644
index 0000000..345a676
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/fw_qos.c
@@ -0,0 +1,292 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007, 2008 Mellanox Technologies.
+ * All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "fw_qos.h"
+#include "fw.h"
+
+enum {
+ /* allocate vpp opcode modifiers */
+ MLX4_ALLOCATE_VPP_ALLOCATE = 0x0,
+ MLX4_ALLOCATE_VPP_QUERY = 0x1
+};
+
+enum {
+ /* set vport qos opcode modifiers */
+ MLX4_SET_VPORT_QOS_SET = 0x0,
+ MLX4_SET_VPORT_QOS_QUERY = 0x1
+};
+
+struct mlx4_set_port_prio2tc_context {
+ u8 prio2tc[4];
+};
+
+struct mlx4_port_scheduler_tc_cfg_be {
+ __be16 pg;
+ __be16 bw_precentage;
+ __be16 max_bw_units; /* 3-100Mbps, 4-1Gbps, other values - reserved */
+ __be16 max_bw_value;
+};
+
+struct mlx4_set_port_scheduler_context {
+ struct mlx4_port_scheduler_tc_cfg_be tc[MLX4_NUM_TC];
+};
+
+/* Granular Qos (per VF) section */
+struct mlx4_alloc_vpp_param {
+ __be32 availible_vpp;
+ __be32 vpp_p_up[MLX4_NUM_UP];
+};
+
+struct mlx4_prio_qos_param {
+ __be32 bw_share;
+ __be32 max_avg_bw;
+ __be32 reserved;
+ __be32 enable;
+ __be32 reserved1[4];
+};
+
+struct mlx4_set_vport_context {
+ __be32 reserved[8];
+ struct mlx4_prio_qos_param qos_p_up[MLX4_NUM_UP];
+};
+
+int mlx4_SET_PORT_PRIO2TC(struct mlx4_dev *dev, u8 port, u8 *prio2tc)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_set_port_prio2tc_context *context;
+ int err;
+ u32 in_mod;
+ int i;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ context = mailbox->buf;
+
+ for (i = 0; i < MLX4_NUM_UP; i += 2)
+ context->prio2tc[i >> 1] = prio2tc[i] << 4 | prio2tc[i + 1];
+
+ in_mod = MLX4_SET_PORT_PRIO2TC << 8 | port;
+ err = mlx4_cmd(dev, mailbox->dma, in_mod, 1, MLX4_CMD_SET_PORT,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_NATIVE);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+EXPORT_SYMBOL(mlx4_SET_PORT_PRIO2TC);
+
+int mlx4_SET_PORT_SCHEDULER(struct mlx4_dev *dev, u8 port, u8 *tc_tx_bw,
+ u8 *pg, u16 *ratelimit)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_set_port_scheduler_context *context;
+ int err;
+ u32 in_mod;
+ int i;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ context = mailbox->buf;
+
+ for (i = 0; i < MLX4_NUM_TC; i++) {
+ struct mlx4_port_scheduler_tc_cfg_be *tc = &context->tc[i];
+ u16 r;
+
+ if (ratelimit && ratelimit[i]) {
+ if (ratelimit[i] <= MLX4_MAX_100M_UNITS_VAL) {
+ r = ratelimit[i];
+ tc->max_bw_units =
+ htons(MLX4_RATELIMIT_100M_UNITS);
+ } else {
+ r = ratelimit[i] / 10;
+ tc->max_bw_units =
+ htons(MLX4_RATELIMIT_1G_UNITS);
+ }
+ tc->max_bw_value = htons(r);
+ } else {
+ tc->max_bw_value = htons(MLX4_RATELIMIT_DEFAULT);
+ tc->max_bw_units = htons(MLX4_RATELIMIT_1G_UNITS);
+ }
+
+ tc->pg = htons(pg[i]);
+ tc->bw_precentage = htons(tc_tx_bw[i]);
+ }
+
+ in_mod = MLX4_SET_PORT_SCHEDULER << 8 | port;
+ err = mlx4_cmd(dev, mailbox->dma, in_mod, 1, MLX4_CMD_SET_PORT,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_NATIVE);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+EXPORT_SYMBOL(mlx4_SET_PORT_SCHEDULER);
+
+int mlx4_ALLOCATE_VPP_get(struct mlx4_dev *dev, u8 port,
+ u16 *availible_vpp, u8 *vpp_p_up)
+{
+ int i;
+ int err;
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_alloc_vpp_param *out_param;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ out_param = mailbox->buf;
+
+ err = mlx4_cmd_box(dev, 0, mailbox->dma, port,
+ MLX4_ALLOCATE_VPP_QUERY,
+ MLX4_CMD_ALLOCATE_VPP,
+ MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+ if (err)
+ goto out;
+
+ /* Total number of supported VPPs */
+ *availible_vpp = (u16)be32_to_cpu(out_param->availible_vpp);
+
+ for (i = 0; i < MLX4_NUM_UP; i++)
+ vpp_p_up[i] = (u8)be32_to_cpu(out_param->vpp_p_up[i]);
+
+out:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+
+ return err;
+}
+EXPORT_SYMBOL(mlx4_ALLOCATE_VPP_get);
+
+int mlx4_ALLOCATE_VPP_set(struct mlx4_dev *dev, u8 port, u8 *vpp_p_up)
+{
+ int i;
+ int err;
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_alloc_vpp_param *in_param;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ in_param = mailbox->buf;
+
+ for (i = 0; i < MLX4_NUM_UP; i++)
+ in_param->vpp_p_up[i] = cpu_to_be32(vpp_p_up[i]);
+
+ err = mlx4_cmd(dev, mailbox->dma, port,
+ MLX4_ALLOCATE_VPP_ALLOCATE,
+ MLX4_CMD_ALLOCATE_VPP,
+ MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+EXPORT_SYMBOL(mlx4_ALLOCATE_VPP_set);
+
+int mlx4_SET_VPORT_QOS_get(struct mlx4_dev *dev, u8 port, u8 vport,
+ struct mlx4_vport_qos_param *out_param)
+{
+ int i;
+ int err;
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_set_vport_context *ctx;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ ctx = mailbox->buf;
+
+ err = mlx4_cmd_box(dev, 0, mailbox->dma, (vport << 8) | port,
+ MLX4_SET_VPORT_QOS_QUERY,
+ MLX4_CMD_SET_VPORT_QOS,
+ MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+ if (err)
+ goto out;
+
+ for (i = 0; i < MLX4_NUM_UP; i++) {
+ out_param[i].bw_share = be32_to_cpu(ctx->qos_p_up[i].bw_share);
+ out_param[i].max_avg_bw =
+ be32_to_cpu(ctx->qos_p_up[i].max_avg_bw);
+ out_param[i].enable =
+ !!(be32_to_cpu(ctx->qos_p_up[i].enable) >> 31);
+ }
+
+out:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+
+ return err;
+}
+EXPORT_SYMBOL(mlx4_SET_VPORT_QOS_get);
+
+int mlx4_SET_VPORT_QOS_set(struct mlx4_dev *dev, u8 port, u8 vport,
+ struct mlx4_vport_qos_param *in_param)
+{
+ int i;
+ int err;
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_set_vport_context *ctx;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ ctx = mailbox->buf;
+
+ for (i = 0; i < MLX4_NUM_UP; i++) {
+ ctx->qos_p_up[i].bw_share = cpu_to_be32(in_param[i].bw_share);
+ ctx->qos_p_up[i].max_avg_bw =
+ cpu_to_be32(in_param[i].max_avg_bw);
+ ctx->qos_p_up[i].enable =
+ cpu_to_be32(in_param[i].enable << 31);
+ }
+
+ err = mlx4_cmd(dev, mailbox->dma, (vport << 8) | port,
+ MLX4_SET_VPORT_QOS_SET,
+ MLX4_CMD_SET_VPORT_QOS,
+ MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+EXPORT_SYMBOL(mlx4_SET_VPORT_QOS_set);
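
Side note on mlx4_SET_PORT_PRIO2TC above: the loop packs two priority-to-TC
mappings into each context byte, with the even priority in the high nibble and
the odd priority in the low nibble. Below is a minimal standalone sketch of
that packing, not the driver code. The name pack_prio2tc is hypothetical,
NUM_UP mirrors MLX4_NUM_UP from fw_qos.h, and the 4-bit range check is an
assumption added for illustration (the driver function does not validate its
input).

```c
#include <assert.h>
#include <stdint.h>

#define NUM_UP 8 /* mirrors MLX4_NUM_UP from fw_qos.h */

/*
 * Pack eight user-priority -> TC mappings into four bytes, two per byte:
 * even priority in the high nibble, odd priority in the low nibble.
 */
static void pack_prio2tc(const uint8_t prio2tc[NUM_UP],
			 uint8_t out[NUM_UP / 2])
{
	int i;

	for (i = 0; i < NUM_UP; i += 2) {
		/* Assumption: each TC value must fit in 4 bits. */
		assert(prio2tc[i] < 16 && prio2tc[i + 1] < 16);
		out[i >> 1] = (uint8_t)(prio2tc[i] << 4 | prio2tc[i + 1]);
	}
}
```

For the identity mapping (priority n to TC n), the four packed bytes are
0x01, 0x23, 0x45 and 0x67.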
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/fw_qos.h b/drivers/net/mlnx_uio/mlnx/mlx4/fw_qos.h
new file mode 100644
index 0000000..a173060
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/fw_qos.h
@@ -0,0 +1,150 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007, 2008 Mellanox Technologies.
+ * All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef MLX4_FW_QOS_H
+#define MLX4_FW_QOS_H
+
+#include "mlx4/device.h"
+
+#define MLX4_NUM_UP 8
+#define MLX4_NUM_TC 8
+
+/* Default supported priorities for VPP allocation */
+#define MLX4_DEFAULT_QOS_PRIO (0)
+
+/* Derived from FW feature definition, 0 is the default vport for all QPs */
+#define MLX4_VPP_DEFAULT_VPORT (0)
+
+struct mlx4_vport_qos_param {
+ u32 bw_share;
+ u32 max_avg_bw;
+ u8 enable;
+};
+
+/**
+ * mlx4_SET_PORT_PRIO2TC - This routine maps user priorities to traffic
+ * classes of a given port and device.
+ *
+ * @dev: mlx4_dev.
+ * @port: Physical port number.
+ * @prio2tc: Array of TC associated with each priority.
+ *
+ * Returns 0 on success or a negative mlx4_core errno code.
+ **/
+int mlx4_SET_PORT_PRIO2TC(struct mlx4_dev *dev, u8 port, u8 *prio2tc);
+
+/**
+ * mlx4_SET_PORT_SCHEDULER - This routine configures the arbitration between
+ * traffic classes (ETS) and configured rate limit for traffic classes.
+ * tc_tx_bw, pg and ratelimit are arrays where each index represents a TC.
+ * The description for those parameters below refers to a single TC.
+ *
+ * @dev: mlx4_dev.
+ * @port: Physical port number.
+ * @tc_tx_bw: The percentage of the bandwidth allocated for traffic class
+ * within a TC group. The sum of the bw_percentage of all the traffic
+ * classes within a TC group must equal 100% for correct operation.
+ * @pg: The TC group the traffic class is associated with.
+ * @ratelimit: The maximal bandwidth allowed for the use by this traffic class.
+ *
+ * Returns 0 on success or a negative mlx4_core errno code.
+ **/
+int mlx4_SET_PORT_SCHEDULER(struct mlx4_dev *dev, u8 port, u8 *tc_tx_bw,
+ u8 *pg, u16 *ratelimit);
+/**
+ * mlx4_ALLOCATE_VPP_get - Query port VPP available resources and allocation.
+ * Before distribution of VPPs to priorities, only availible_vpp is returned.
+ * After initialization it returns the distribution of VPPs among priorities.
+ *
+ * @dev: mlx4_dev.
+ * @port: Physical port number.
+ * @availible_vpp: Pointer to variable where number of available VPPs is stored
+ * @vpp_p_up: Distribution of VPPs to priorities is stored in this array
+ *
+ * Returns 0 on success or a negative mlx4_core errno code.
+ **/
+int mlx4_ALLOCATE_VPP_get(struct mlx4_dev *dev, u8 port,
+ u16 *availible_vpp, u8 *vpp_p_up);
+/**
+ * mlx4_ALLOCATE_VPP_set - Distribution of VPPs among different priorities.
+ * The total number of VPPs assigned to all priorities for a port must not exceed
+ * the value reported by availible_vpp in mlx4_ALLOCATE_VPP_get.
+ * VPP allocation is allowed only after the port type has been set,
+ * and while no QPs are open for this port.
+ *
+ * @dev: mlx4_dev.
+ * @port: Physical port number.
+ * @vpp_p_up: Allocation of VPPs to different priorities.
+ *
+ * Returns 0 on success or a negative mlx4_core errno code.
+ **/
+int mlx4_ALLOCATE_VPP_set(struct mlx4_dev *dev, u8 port, u8 *vpp_p_up);
+
+/**
+ * mlx4_SET_VPORT_QOS_get - Query QoS properties of a Vport.
+ * Each priority allowed for the Vport is assigned with a share of the BW,
+ * and a BW limitation. This command queries the current QoS values.
+ *
+ * @dev: mlx4_dev.
+ * @port: Physical port number.
+ * @vport: Vport id.
+ * @out_param: Array of mlx4_vport_qos_param that will contain the values.
+ *
+ * Returns 0 on success or a negative mlx4_core errno code.
+ **/
+int mlx4_SET_VPORT_QOS_get(struct mlx4_dev *dev, u8 port, u8 vport,
+ struct mlx4_vport_qos_param *out_param);
+
+/**
+ * mlx4_SET_VPORT_QOS_set - Set QoS properties of a Vport.
+ * QoS parameters can be modified at any time, but must be initialized
+ * before any QP is associated with the VPort.
+ *
+ * @dev: mlx4_dev.
+ * @port: Physical port number.
+ * @vport: Vport id.
+ * @in_param: Array of mlx4_vport_qos_param which holds the requested values.
+ *
+ * Returns 0 on success or a negative mlx4_core errno code.
+ **/
+int mlx4_SET_VPORT_QOS_set(struct mlx4_dev *dev, u8 port, u8 vport,
+ struct mlx4_vport_qos_param *in_param);
+
+#endif /* MLX4_FW_QOS_H */
+
+#include "post_kmod.h"
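Note (illustrative only, not part of the patch): the mlx4_SET_PORT_SCHEDULER
comment requires the tc_tx_bw values of all traffic classes within one TC group
to sum to 100%. A caller might check that before issuing the command. The
sketch below is a hypothetical helper; the name sketch_check_ets_bw, the
SKETCH_NUM_TC constant, and the assumption that group ids lie in 0..7 are mine
and do not come from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NUM_TC 8	/* mirrors MLX4_NUM_TC; an assumption of this sketch */

/*
 * Hypothetical helper, not part of the patch.
 * Returns 0 if every TC group that has at least one traffic class
 * assigned to it has tc_tx_bw values summing to exactly 100,
 * and -1 otherwise.
 */
static int sketch_check_ets_bw(const uint8_t *tc_tx_bw, const uint8_t *pg)
{
	unsigned int sum[SKETCH_NUM_TC] = {0};
	unsigned int used[SKETCH_NUM_TC] = {0};
	int tc;

	for (tc = 0; tc < SKETCH_NUM_TC; tc++) {
		if (pg[tc] >= SKETCH_NUM_TC)
			return -1;	/* group id outside the range assumed here */
		sum[pg[tc]] += tc_tx_bw[tc];
		used[pg[tc]] = 1;
	}

	for (tc = 0; tc < SKETCH_NUM_TC; tc++)
		if (used[tc] && sum[tc] != 100)
			return -1;

	return 0;
}
```

The real firmware may accept special group values (for example strict
priority groups) that this check would reject, so treat it as a sanity test
for the plain ETS case only.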
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/icm.c b/drivers/net/mlnx_uio/mlnx/mlx4/icm.c
new file mode 100644
index 0000000..2a1b7d7
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/icm.c
@@ -0,0 +1,522 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2005, 2006, 2007, 2008 Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+
+
+#include "mlx4.h"
+#include "icm.h"
+#include "fw.h"
+#include "log2.h"
+
+#include <rte_persistent.h>
+
+/*
+ * We allocate in as big chunks as we can, up to a maximum of 256 KB
+ * per chunk.
+ */
+enum {
+ MLX4_ICM_ALLOC_SIZE = 1 << 18,
+ MLX4_TABLE_CHUNK_SIZE = 1 << 18
+};
+
+#ifdef KMOD_REMOVED
+static void mlx4_free_icm_pages(struct mlx4_dev *dev, struct mlx4_icm_chunk *chunk)
+{
+#ifdef KMOD_MODIFIED
+ int i;
+
+ //if (chunk->nsg > 0)
+ // pci_unmap_sg(dev->persist->pdev, chunk->mem, chunk->npages,
+ // PCI_DMA_BIDIRECTIONAL);
+
+ //for (i = 0; i < chunk->npages; ++i)
+ // __free_pages(sg_page(&chunk->mem[i]),
+ // get_order(chunk->mem[i].length));
+ for(i=0; i<chunk->npages; i++)
+ {
+ rte_persistent_free(chunk->persistent_mem[i]);
+ chunk->persistent_mem[i] = NULL;
+ }
+#endif
+}
+#endif
+
+#ifdef KMOD_MODIFIED
+static void mlx4_free_icm_coherent(struct mlx4_dev *dev, struct mlx4_icm_chunk *chunk)
+{
+ int i;
+ /*
+ for (i = 0; i < chunk->npages; ++i)
+ dma_free_coherent(&dev->persist->pdev->dev,
+ chunk->mem[i].length,
+ lowmem_page_address(sg_page(&chunk->mem[i])),
+ sg_dma_address(&chunk->mem[i]));
+ */
+ for(i=0; i<chunk->npages; i++)
+ {
+ rte_persistent_free(chunk->persistent_mem[i]);
+ chunk->persistent_mem[i] = NULL;
+ }
+}
+#endif
+
+#ifdef KMOD_MODIFIED
+void mlx4_free_icm(struct mlx4_dev *dev, struct mlx4_icm *icm/*, int coherent = 1*/)
+{
+ struct mlx4_icm_chunk *chunk, *tmp;
+
+ if (!icm)
+ return;
+
+ list_for_each_entry_safe(chunk, tmp, &icm->chunk_list, list) {
+ //if (coherent)
+ mlx4_free_icm_coherent(dev, chunk);
+ //else
+ // mlx4_free_icm_pages(dev, chunk);
+
+ kfree(chunk);
+ }
+
+ kfree(icm);
+}
+#endif
+
+#ifdef KMOD_REMOVED
+static int mlx4_alloc_icm_pages(struct scatterlist *mem, int order,
+ gfp_t gfp_mask, int node)
+{
+ struct page *page;
+
+ page = alloc_pages_node(node, gfp_mask, order);
+ if (!page) {
+ page = alloc_pages(gfp_mask, order);
+ if (!page)
+ return -ENOMEM;
+ }
+
+ sg_set_page(mem, page, PAGE_SIZE << order, 0);
+ return 0;
+}
+#endif
+
+#ifdef KMOD_MODIFIED
+static int mlx4_alloc_icm_coherent(struct rte_pci_device *dev, void **persistent_mem,
+ int order, gfp_t gfp_mask)
+{
+ void *buf = rte_persistent_alloc(PAGE_SIZE << order, dev->numa_node);
+ if (!buf)
+ return -ENOMEM;
+
+ *persistent_mem = buf;
+ return 0;
+}
+#endif
+
+#ifdef KMOD_MODIFIED
+struct mlx4_icm *mlx4_alloc_icm(struct mlx4_dev *dev, int npages,
+ gfp_t gfp_mask/*, int coherent*/)
+{
+ struct mlx4_icm *icm;
+ struct mlx4_icm_chunk *chunk = NULL;
+ int cur_order;
+ int ret;
+ int coherent = 1; //only persistent memory
+
+ /* We use sg_set_buf for coherent allocs, which assumes low memory */
+ //BUG_ON(coherent && (gfp_mask & __GFP_HIGHMEM));
+
+ icm = kmalloc_node(sizeof(*icm),
+ gfp_mask & ~(__GFP_HIGHMEM | __GFP_NOWARN),
+ dev->numa_node);
+ if (!icm) {
+ icm = kmalloc(sizeof(*icm),
+ gfp_mask & ~(__GFP_HIGHMEM | __GFP_NOWARN));
+ if (!icm)
+ return NULL;
+ }
+
+ icm->refcount = 0;
+ INIT_LIST_HEAD(&icm->chunk_list);
+
+ cur_order = get_order(MLX4_ICM_ALLOC_SIZE);
+
+ while (npages > 0) {
+ if (!chunk) {
+ chunk = kmalloc_node(sizeof(*chunk),
+ gfp_mask & ~(__GFP_HIGHMEM |
+ __GFP_NOWARN),
+ dev->numa_node);
+ if (!chunk) {
+ chunk = kmalloc(sizeof(*chunk),
+ gfp_mask & ~(__GFP_HIGHMEM |
+ __GFP_NOWARN));
+ if (!chunk)
+ goto fail;
+ }
+
+ //sg_init_table(chunk->mem, MLX4_ICM_CHUNK_LEN);
+ memset(chunk->persistent_mem, 0, sizeof(chunk->persistent_mem));
+ chunk->npages = 0;
+ chunk->nsg = 0;
+ list_add_tail(&chunk->list, &icm->chunk_list);
+ }
+
+ while (1 << cur_order > npages)
+ --cur_order;
+
+ if (coherent)
+ {
+ ret = mlx4_alloc_icm_coherent(dev->persist->rte_pdev,
+ &chunk->persistent_mem[chunk->npages],
+ cur_order, gfp_mask);
+ }
+ else
+ {
+ assert(0);
+ //ret = mlx4_alloc_icm_pages(&chunk->mem[chunk->npages],
+ // cur_order, gfp_mask,
+ // dev->numa_node);
+ }
+
+ if (ret) {
+ if (--cur_order < 0)
+ goto fail;
+ else
+ continue;
+ }
+
+ ++chunk->npages;
+
+ if (coherent)
+ ++chunk->nsg;
+ else if (chunk->npages == MLX4_ICM_CHUNK_LEN) {
+ assert(0);
+ /*
+ chunk->nsg = pci_map_sg(dev->persist->pdev, chunk->mem,
+ chunk->npages,
+ PCI_DMA_BIDIRECTIONAL);
+
+ if (chunk->nsg <= 0)
+ goto fail;
+ */
+ }
+
+ if (chunk->npages == MLX4_ICM_CHUNK_LEN)
+ chunk = NULL;
+
+ npages -= 1 << cur_order;
+ }
+/*
+ if (!coherent && chunk) {
+ chunk->nsg = pci_map_sg(dev->persist->pdev, chunk->mem,
+ chunk->npages,
+ PCI_DMA_BIDIRECTIONAL);
+
+ if (chunk->nsg <= 0)
+ goto fail;
+ }
+*/
+ return icm;
+
+fail:
+ mlx4_free_icm(dev, icm/*, coherent*/);
+ return NULL;
+}
+#endif
+
+static int mlx4_MAP_ICM(struct mlx4_dev *dev, struct mlx4_icm *icm, u64 virt)
+{
+ return mlx4_map_cmd(dev, MLX4_CMD_MAP_ICM, icm, virt);
+}
+
+static int mlx4_UNMAP_ICM(struct mlx4_dev *dev, u64 virt, u32 page_count)
+{
+ return mlx4_cmd(dev, virt, page_count, 0, MLX4_CMD_UNMAP_ICM,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_NATIVE);
+}
+
+int mlx4_MAP_ICM_AUX(struct mlx4_dev *dev, struct mlx4_icm *icm)
+{
+ return mlx4_map_cmd(dev, MLX4_CMD_MAP_ICM_AUX, icm, -1);
+}
+
+int mlx4_UNMAP_ICM_AUX(struct mlx4_dev *dev)
+{
+ return mlx4_cmd(dev, 0, 0, 0, MLX4_CMD_UNMAP_ICM_AUX,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_NATIVE);
+}
+
+#ifdef KMOD_MODIFIED
+int mlx4_table_get(struct mlx4_dev *dev, struct mlx4_icm_table *table, u32 obj,
+ gfp_t gfp)
+{
+ u32 i = (obj & (table->num_obj - 1)) /
+ (MLX4_TABLE_CHUNK_SIZE / table->obj_size);
+ int ret = 0;
+
+ mutex_lock(&table->mutex);
+
+ if (table->icm[i]) {
+ ++table->icm[i]->refcount;
+ goto out;
+ }
+
+ table->icm[i] = mlx4_alloc_icm(dev, MLX4_TABLE_CHUNK_SIZE >> PAGE_SHIFT,
+ (table->lowmem ? gfp : GFP_HIGHUSER) |
+ __GFP_NOWARN/*, table->coherent*/);
+ if (!table->icm[i]) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ if (mlx4_MAP_ICM(dev, table->icm[i], table->virt +
+ (u64) i * MLX4_TABLE_CHUNK_SIZE)) {
+ mlx4_free_icm(dev, table->icm[i]/*, table->coherent*/);
+ table->icm[i] = NULL;
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ ++table->icm[i]->refcount;
+
+out:
+ mutex_unlock(&table->mutex);
+ return ret;
+}
+#endif
+
+#ifdef KMOD_MODIFIED
+void mlx4_table_put(struct mlx4_dev *dev, struct mlx4_icm_table *table, u32 obj)
+{
+ u32 i;
+ u64 offset;
+
+ i = (obj & (table->num_obj - 1)) / (MLX4_TABLE_CHUNK_SIZE / table->obj_size);
+
+ mutex_lock(&table->mutex);
+
+ if (--table->icm[i]->refcount == 0) {
+ offset = (u64) i * MLX4_TABLE_CHUNK_SIZE;
+ mlx4_UNMAP_ICM(dev, table->virt + offset,
+ MLX4_TABLE_CHUNK_SIZE / MLX4_ICM_PAGE_SIZE);
+ mlx4_free_icm(dev, table->icm[i]/*, table->coherent*/);
+ table->icm[i] = NULL;
+ }
+
+ mutex_unlock(&table->mutex);
+}
+#endif
+
+#ifdef KMOD_MODIFIED
+void *mlx4_table_find(struct mlx4_icm_table *table, u32 obj,
+ dma_addr_t *dma_handle)
+{
+ int offset, dma_offset, i;
+ u64 idx;
+ struct mlx4_icm_chunk *chunk;
+ struct mlx4_icm *icm;
+ void *persistent_addr = NULL;
+
+ if (!table->lowmem)
+ return NULL;
+
+ mutex_lock(&table->mutex);
+
+ idx = (u64) (obj & (table->num_obj - 1)) * table->obj_size;
+ icm = table->icm[idx / MLX4_TABLE_CHUNK_SIZE];
+ dma_offset = offset = idx % MLX4_TABLE_CHUNK_SIZE;
+
+ if (!icm)
+ goto out;
+
+ list_for_each_entry(chunk, &icm->chunk_list, list) {
+ for (i = 0; i < chunk->npages; ++i) {
+ /*
+ if (dma_handle && dma_offset >= 0) {
+ if (sg_dma_len(&chunk->mem[i]) > dma_offset)
+ *dma_handle = sg_dma_address(&chunk->mem[i]) +
+ dma_offset;
+ dma_offset -= sg_dma_len(&chunk->mem[i]);
+ }
+ */
+ if (dma_handle && dma_offset >= 0) {
+ if (rte_persistent_mem_length(chunk->persistent_mem[i]) > dma_offset)
+ *dma_handle = rte_persistent_hw_addr(chunk->persistent_mem[i]) +
+ dma_offset;
+ dma_offset -= rte_persistent_mem_length(chunk->persistent_mem[i]);
+ }
+ /*
+ * DMA mapping can merge pages but not split them,
+ * so if we found the page, dma_handle has already
+ * been assigned to.
+ */
+ if (rte_persistent_mem_length(chunk->persistent_mem[i]) > offset) {
+ persistent_addr = chunk->persistent_mem[i];
+ goto out;
+ }
+ offset -= rte_persistent_mem_length(chunk->persistent_mem[i]);
+ }
+ }
+
+out:
+ mutex_unlock(&table->mutex);
+ return persistent_addr ? RTE_PTR_ADD(persistent_addr, offset) : NULL;
+}
+#endif
+
+#ifdef KMOD_MODIFIED
+int mlx4_table_get_range(struct mlx4_dev *dev, struct mlx4_icm_table *table,
+ u32 start, u32 end)
+{
+ int inc = MLX4_TABLE_CHUNK_SIZE / table->obj_size;
+ int err;
+ u32 i;
+
+ for (i = start; i <= end; i += inc) {
+ err = mlx4_table_get(dev, table, i, GFP_KERNEL);
+ if (err)
+ goto fail;
+ }
+
+ return 0;
+
+fail:
+ while (i > start) {
+ i -= inc;
+ mlx4_table_put(dev, table, i);
+ }
+
+ return err;
+}
+#endif
+
+void mlx4_table_put_range(struct mlx4_dev *dev, struct mlx4_icm_table *table,
+ u32 start, u32 end)
+{
+ u32 i;
+
+ for (i = start; i <= end; i += MLX4_TABLE_CHUNK_SIZE / table->obj_size)
+ mlx4_table_put(dev, table, i);
+}
+
+#ifdef KMOD_MODIFIED
+
+int mlx4_init_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table,
+ u64 virt, int obj_size, u32 nobj, int reserved,
+ int use_lowmem, int use_coherent)
+{
+ int obj_per_chunk;
+ int num_icm;
+ unsigned chunk_size;
+ int i;
+ u64 size;
+
+ if(!use_coherent)
+ {
+ dev_warn(dev, "%s: non-coherent ICM was requested, but coherent memory is always used.\n", __FUNCTION__);
+ }
+
+ obj_per_chunk = MLX4_TABLE_CHUNK_SIZE / obj_size;
+ num_icm = (nobj + obj_per_chunk - 1) / obj_per_chunk;
+
+ table->icm = kcalloc(num_icm, sizeof *table->icm, GFP_KERNEL);
+ if (!table->icm)
+ return -ENOMEM;
+ table->virt = virt;
+ table->num_icm = num_icm;
+ table->num_obj = nobj;
+ table->obj_size = obj_size;
+ table->lowmem = use_lowmem;
+ table->coherent = use_coherent;
+ mutex_init(&table->mutex);
+
+ size = (u64) nobj * obj_size;
+ for (i = 0; i * MLX4_TABLE_CHUNK_SIZE < reserved * obj_size; ++i) {
+ chunk_size = MLX4_TABLE_CHUNK_SIZE;
+ if ((i + 1) * MLX4_TABLE_CHUNK_SIZE > size)
+ chunk_size = PAGE_ALIGN(size -
+ i * MLX4_TABLE_CHUNK_SIZE);
+
+ table->icm[i] = mlx4_alloc_icm(dev, chunk_size >> PAGE_SHIFT,
+ (use_lowmem ? GFP_KERNEL : GFP_HIGHUSER) |
+ __GFP_NOWARN/*, use_coherent*/);
+ if (!table->icm[i])
+ goto err;
+ if (mlx4_MAP_ICM(dev, table->icm[i], virt + i * MLX4_TABLE_CHUNK_SIZE)) {
+ mlx4_free_icm(dev, table->icm[i]/*, use_coherent*/);
+ table->icm[i] = NULL;
+ goto err;
+ }
+
+ /*
+ * Add a reference to this ICM chunk so that it never
+ * gets freed (since it contains reserved firmware objects).
+ */
+ ++table->icm[i]->refcount;
+ }
+
+ return 0;
+
+err:
+ for (i = 0; i < num_icm; ++i)
+ if (table->icm[i]) {
+ mlx4_UNMAP_ICM(dev, virt + i * MLX4_TABLE_CHUNK_SIZE,
+ MLX4_TABLE_CHUNK_SIZE / MLX4_ICM_PAGE_SIZE);
+ mlx4_free_icm(dev, table->icm[i]/*, use_coherent*/);
+ }
+
+ kfree(table->icm);
+
+ return -ENOMEM;
+}
+#endif
+
+#ifdef KMOD_MODIFIED
+void mlx4_cleanup_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table)
+{
+ int i;
+
+ for (i = 0; i < table->num_icm; ++i)
+ if (table->icm[i]) {
+ mlx4_UNMAP_ICM(dev, table->virt + i * MLX4_TABLE_CHUNK_SIZE,
+ MLX4_TABLE_CHUNK_SIZE / MLX4_ICM_PAGE_SIZE);
+ mlx4_free_icm(dev, table->icm[i]/*, table->coherent*/);
+ }
+
+ kfree(table->icm);
+}
+#endif
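Note (illustrative only, not part of the patch): mlx4_table_get(),
mlx4_table_put() and mlx4_table_find() all map an object number to a chunk
index in table->icm[] and a byte offset inside that chunk. The sketch below
restates that arithmetic in standalone form. The names sketch_icm_pos and
sketch_icm_locate are hypothetical, and the code assumes, as the masking in
the driver does, that num_obj is a power of two.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_TABLE_CHUNK_SIZE (1u << 18)	/* same value as MLX4_TABLE_CHUNK_SIZE */

struct sketch_icm_pos {
	uint32_t chunk;		/* index into table->icm[] */
	uint32_t offset;	/* byte offset inside that chunk */
};

/*
 * Hypothetical helper, not part of the patch.
 * Wraps obj into the table (num_obj must be a power of two), converts it
 * to a byte index, then splits the byte index into chunk and offset.
 */
static struct sketch_icm_pos sketch_icm_locate(uint32_t obj, uint32_t num_obj,
					       uint32_t obj_size)
{
	struct sketch_icm_pos pos;
	uint64_t idx = (uint64_t)(obj & (num_obj - 1)) * obj_size;

	pos.chunk = (uint32_t)(idx / SKETCH_TABLE_CHUNK_SIZE);
	pos.offset = (uint32_t)(idx % SKETCH_TABLE_CHUNK_SIZE);
	return pos;
}
```

When obj_size divides the chunk size evenly, the chunk value agrees with the
expression used in mlx4_table_get(), which divides the masked object number
by the number of objects per chunk.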
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/icm.h b/drivers/net/mlnx_uio/mlnx/mlx4/icm.h
new file mode 100644
index 0000000..9cd74f6
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/icm.h
@@ -0,0 +1,133 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2005, 2006, 2007, 2008 Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef MLX4_ICM_H
+#define MLX4_ICM_H
+
+#include <rte_persistent.h>
+
+#define MLX4_ICM_CHUNK_LEN \
+ ((256 - sizeof (struct list_head) - 2 * sizeof (int)) / \
+ (sizeof (void*)))
+
+enum {
+ MLX4_ICM_PAGE_SHIFT = 12,
+ MLX4_ICM_PAGE_SIZE = 1 << MLX4_ICM_PAGE_SHIFT,
+};
+
+struct mlx4_icm_chunk {
+ struct list_head list;
+ int npages;
+ int nsg;
+ void* persistent_mem[MLX4_ICM_CHUNK_LEN];
+};
+
+struct mlx4_icm {
+ struct list_head chunk_list;
+ int refcount;
+};
+
+struct mlx4_icm_iter {
+ struct mlx4_icm *icm;
+ struct mlx4_icm_chunk *chunk;
+ int page_idx;
+};
+
+struct mlx4_dev;
+
+struct mlx4_icm *mlx4_alloc_icm(struct mlx4_dev *dev, int npages,
+ gfp_t gfp_mask /*, int coherent*/);
+void mlx4_free_icm(struct mlx4_dev *dev, struct mlx4_icm *icm/*, int coherent = 1*/);
+
+int mlx4_table_get(struct mlx4_dev *dev, struct mlx4_icm_table *table, u32 obj,
+ gfp_t gfp);
+void mlx4_table_put(struct mlx4_dev *dev, struct mlx4_icm_table *table, u32 obj);
+int mlx4_table_get_range(struct mlx4_dev *dev, struct mlx4_icm_table *table,
+ u32 start, u32 end);
+void mlx4_table_put_range(struct mlx4_dev *dev, struct mlx4_icm_table *table,
+ u32 start, u32 end);
+int mlx4_init_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table,
+ u64 virt, int obj_size, u32 nobj, int reserved,
+ int use_lowmem, int use_coherent);
+void mlx4_cleanup_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table);
+void *mlx4_table_find(struct mlx4_icm_table *table, u32 obj, dma_addr_t *dma_handle);
+
+static inline void mlx4_icm_first(struct mlx4_icm *icm,
+ struct mlx4_icm_iter *iter)
+{
+ iter->icm = icm;
+ iter->chunk = list_empty(&icm->chunk_list) ?
+ NULL : list_entry(icm->chunk_list.next,
+ struct mlx4_icm_chunk, list);
+ iter->page_idx = 0;
+}
+
+static inline int mlx4_icm_last(struct mlx4_icm_iter *iter)
+{
+ return !iter->chunk;
+}
+
+static inline void mlx4_icm_next(struct mlx4_icm_iter *iter)
+{
+ if (++iter->page_idx >= iter->chunk->nsg) {
+ if (iter->chunk->list.next == &iter->icm->chunk_list) {
+ iter->chunk = NULL;
+ return;
+ }
+
+ iter->chunk = list_entry(iter->chunk->list.next,
+ struct mlx4_icm_chunk, list);
+ iter->page_idx = 0;
+ }
+}
+
+static inline dma_addr_t mlx4_icm_addr(struct mlx4_icm_iter *iter)
+{
+ return rte_persistent_hw_addr(iter->chunk->persistent_mem[iter->page_idx]);
+}
+
+static inline unsigned long mlx4_icm_size(struct mlx4_icm_iter *iter)
+{
+ return rte_persistent_mem_length(iter->chunk->persistent_mem[iter->page_idx]);
+}
+
+int mlx4_MAP_ICM_AUX(struct mlx4_dev *dev, struct mlx4_icm *icm);
+int mlx4_UNMAP_ICM_AUX(struct mlx4_dev *dev);
+
+#endif /* MLX4_ICM_H */
+
+#include "post_kmod.h"
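Note (illustrative only, not part of the patch): MLX4_ICM_CHUNK_LEN is sized
so that struct mlx4_icm_chunk fits in 256 bytes once the scatterlist array is
replaced by an array of pointers. The sketch below reproduces the computation
with a stand-in list head so it compiles on its own. All sketch_ names are
hypothetical, and the layout claim assumes a common ABI with no unusual
padding between the members.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel-style struct list_head (two pointers). */
struct sketch_list_head {
	struct sketch_list_head *next, *prev;
};

/* Same formula as MLX4_ICM_CHUNK_LEN in icm.h. */
#define SKETCH_ICM_CHUNK_LEN \
	((256 - sizeof(struct sketch_list_head) - 2 * sizeof(int)) / \
	 (sizeof(void *)))

/* Same member layout as struct mlx4_icm_chunk in icm.h. */
struct sketch_icm_chunk {
	struct sketch_list_head list;
	int npages;
	int nsg;
	void *persistent_mem[SKETCH_ICM_CHUNK_LEN];
};
```

With 8-byte pointers and 4-byte ints the formula gives (256 - 16 - 8) / 8 = 29
slots per chunk, so one chunk descriptor tracks up to 29 persistent regions.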
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/intf.c b/drivers/net/mlnx_uio/mlnx/mlx4/intf.c
new file mode 100644
index 0000000..cf0c6bc
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/intf.c
@@ -0,0 +1,246 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved.
+ * Copyright (c) 2007, 2008 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+
+#include "mlx4.h"
+
+struct mlx4_device_context {
+ struct list_head list;
+ struct list_head bond_list;
+ struct mlx4_interface *intf;
+ void *context;
+};
+
+static LIST_HEAD(intf_list);
+static LIST_HEAD(dev_list);
+static DEFINE_MUTEX(intf_mutex);
+
+static void mlx4_add_device(struct mlx4_interface *intf, struct mlx4_priv *priv)
+{
+ struct mlx4_device_context *dev_ctx;
+
+ dev_ctx = kmalloc(sizeof *dev_ctx, GFP_KERNEL);
+ if (!dev_ctx)
+ return;
+
+ dev_ctx->intf = intf;
+ dev_ctx->context = intf->add(&priv->dev);
+
+ if (dev_ctx->context) {
+ spin_lock_irq(&priv->ctx_lock);
+ list_add_tail(&dev_ctx->list, &priv->ctx_list);
+ spin_unlock_irq(&priv->ctx_lock);
+ if (intf->activate)
+ intf->activate(&priv->dev, dev_ctx->context);
+ } else
+ kfree(dev_ctx);
+
+}
+
+static void mlx4_remove_device(struct mlx4_interface *intf, struct mlx4_priv *priv)
+{
+ struct mlx4_device_context *dev_ctx;
+
+ list_for_each_entry(dev_ctx, &priv->ctx_list, list)
+ if (dev_ctx->intf == intf) {
+ spin_lock_irq(&priv->ctx_lock);
+ list_del(&dev_ctx->list);
+ spin_unlock_irq(&priv->ctx_lock);
+
+ intf->remove(&priv->dev, dev_ctx->context);
+ kfree(dev_ctx);
+ return;
+ }
+}
+
+int mlx4_register_interface(struct mlx4_interface *intf)
+{
+ struct mlx4_priv *priv;
+
+ if (!intf->add || !intf->remove)
+ return -EINVAL;
+
+ mutex_lock(&intf_mutex);
+
+ list_add_tail(&intf->list, &intf_list);
+ list_for_each_entry(priv, &dev_list, dev_list)
+ mlx4_add_device(intf, priv);
+
+ mutex_unlock(&intf_mutex);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_register_interface);
+
+void mlx4_unregister_interface(struct mlx4_interface *intf)
+{
+ struct mlx4_priv *priv;
+
+ mutex_lock(&intf_mutex);
+
+ list_for_each_entry(priv, &dev_list, dev_list)
+ mlx4_remove_device(intf, priv);
+
+ list_del(&intf->list);
+
+ mutex_unlock(&intf_mutex);
+}
+EXPORT_SYMBOL_GPL(mlx4_unregister_interface);
+
+int mlx4_do_bond(struct mlx4_dev *dev, bool enable)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_device_context *dev_ctx = NULL, *temp_dev_ctx;
+ unsigned long flags;
+ int ret;
+ LIST_HEAD(bond_list);
+
+ if (!(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_PORT_REMAP))
+ return -ENOTSUPP;
+
+ ret = mlx4_disable_rx_port_check(dev, enable);
+ if (ret) {
+ mlx4_err(dev, "Fail to %s rx port check\n",
+ enable ? "enable" : "disable");
+ return ret;
+ }
+ if (enable) {
+ dev->flags |= MLX4_FLAG_BONDED;
+ } else {
+ ret = mlx4_virt2phy_port_map(dev, 1, 2);
+ if (ret) {
+ mlx4_err(dev, "Fail to reset port map\n");
+ return ret;
+ }
+ dev->flags &= ~MLX4_FLAG_BONDED;
+ }
+
+ spin_lock_irqsave(&priv->ctx_lock, flags);
+ list_for_each_entry_safe(dev_ctx, temp_dev_ctx, &priv->ctx_list, list) {
+ if (dev_ctx->intf->flags & MLX4_INTFF_BONDING) {
+ list_add_tail(&dev_ctx->bond_list, &bond_list);
+ list_del(&dev_ctx->list);
+ }
+ }
+ spin_unlock_irqrestore(&priv->ctx_lock, flags);
+
+ list_for_each_entry(dev_ctx, &bond_list, bond_list) {
+ dev_ctx->intf->remove(dev, dev_ctx->context);
+ dev_ctx->context = dev_ctx->intf->add(dev);
+
+ spin_lock_irqsave(&priv->ctx_lock, flags);
+ list_add_tail(&dev_ctx->list, &priv->ctx_list);
+ spin_unlock_irqrestore(&priv->ctx_lock, flags);
+
+ mlx4_dbg(dev, "Interface for protocol %d restarted when bonded mode is %s\n",
+ dev_ctx->intf->protocol, enable ?
+ "enabled" : "disabled");
+ }
+ return 0;
+}
+
+void mlx4_dispatch_event(struct mlx4_dev *dev, enum mlx4_dev_event type,
+ unsigned long param)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_device_context *dev_ctx;
+ unsigned long flags;
+
+ spin_lock_irqsave(&priv->ctx_lock, flags);
+
+ list_for_each_entry(dev_ctx, &priv->ctx_list, list)
+ if (dev_ctx->intf->event)
+ dev_ctx->intf->event(dev, dev_ctx->context, type, param);
+
+ spin_unlock_irqrestore(&priv->ctx_lock, flags);
+}
+
+int mlx4_register_device(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_interface *intf;
+
+ mutex_lock(&intf_mutex);
+
+ dev->persist->interface_state |= MLX4_INTERFACE_STATE_UP;
+ list_add_tail(&priv->dev_list, &dev_list);
+ list_for_each_entry(intf, &intf_list, list)
+ mlx4_add_device(intf, priv);
+
+ mutex_unlock(&intf_mutex);
+ mlx4_start_catas_poll(dev);
+
+ return 0;
+}
+
+void mlx4_unregister_device(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_interface *intf;
+
+ mlx4_stop_catas_poll(dev);
+ mutex_lock(&intf_mutex);
+
+ list_for_each_entry(intf, &intf_list, list)
+ mlx4_remove_device(intf, priv);
+
+ list_del(&priv->dev_list);
+ dev->persist->interface_state &= ~MLX4_INTERFACE_STATE_UP;
+
+ mutex_unlock(&intf_mutex);
+}
+
+void *mlx4_get_protocol_dev(struct mlx4_dev *dev, enum mlx4_protocol proto, int port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_device_context *dev_ctx;
+ unsigned long flags;
+ void *result = NULL;
+
+ spin_lock_irqsave(&priv->ctx_lock, flags);
+
+ list_for_each_entry(dev_ctx, &priv->ctx_list, list)
+ if (dev_ctx->intf->protocol == proto && dev_ctx->intf->get_dev) {
+ result = dev_ctx->intf->get_dev(dev, dev_ctx->context, port);
+ break;
+ }
+
+ spin_unlock_irqrestore(&priv->ctx_lock, flags);
+
+ return result;
+}
+EXPORT_SYMBOL_GPL(mlx4_get_protocol_dev);
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/main.c b/drivers/net/mlnx_uio/mlnx/mlx4/main.c
new file mode 100644
index 0000000..79951d3
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/main.c
@@ -0,0 +1,5485 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved.
+ * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007, 2008 Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx4/device.h"
+
+#include "mlx4.h"
+#include "fw.h"
+#include "icm.h"
+#include "mlx4_stats.h"
+
+#include "log2.h"
+
+MODULE_AUTHOR("Roland Dreier");
+MODULE_DESCRIPTION("Mellanox ConnectX HCA low-level driver");
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_VERSION(DRV_VERSION);
+
+#ifdef KMOD_MODIFIED
+//struct workqueue_struct *mlx4_wq;
+#endif
+
+#ifdef CONFIG_MLX4_DEBUG
+
+int mlx4_debug_level = 0;
+module_param_named(debug_level, mlx4_debug_level, int, 0644);
+MODULE_PARM_DESC(debug_level, "Enable debug tracing if > 0");
+
+#endif /* CONFIG_MLX4_DEBUG */
+
+#ifdef CONFIG_PCI_MSI
+
+static int msi_x = 1;
+module_param(msi_x, int, 0444);
+MODULE_PARM_DESC(msi_x, "attempt to use MSI-X if nonzero");
+
+#else /* CONFIG_PCI_MSI */
+
+#define msi_x (0)
+
+#endif /* CONFIG_PCI_MSI */
+
+static int enable_sys_tune = 0;
+module_param(enable_sys_tune, int, 0444);
+MODULE_PARM_DESC(enable_sys_tune, "Tune the CPUs for better performance (default 0)");
+
+int mlx4_blck_lb = 1;
+module_param_named(block_loopback, mlx4_blck_lb, int, 0644);
+MODULE_PARM_DESC(block_loopback, "Block multicast loopback packets if > 0 "
+ "(default: 1)");
+
+#define MLX4_ROCE_1_5_DEF_PROTO 0xfe
+
+int mlx4_roce_proto_config = MLX4_ROCE_1_5_DEF_PROTO;
+module_param_named(rr_proto, mlx4_roce_proto_config, int, 0444);
+MODULE_PARM_DESC(rr_proto, "IP next protocol for RoCEv1.5 or destination port for RoCEv2. Setting 0 means using driver default values");
+
+int ingress_parser_mode = MLX4_INGRESS_PARSER_MODE_STANDARD;
+module_param(ingress_parser_mode, int, 0444);
+MODULE_PARM_DESC(ingress_parser_mode, "Mode of ingress parser for ConnectX3-Pro. 0 - standard. 1 - checksum for non TCP/UDP. (default: standard)");
+
+enum {
+ DEFAULT_DOMAIN = 0,
+ BDF_STR_SIZE = 8, /* bb:dd.f- */
+ DBDF_STR_SIZE = 13 /* mmmm:bb:dd.f- */
+};
+
+enum {
+ NUM_VFS,
+ PROBE_VF,
+ PORT_TYPE_ARRAY,
+ ROCE_MODE,
+ UD_GID_TYPE
+};
+
+enum {
+ VALID_DATA,
+ INVALID_DATA,
+ INVALID_STR
+};
+
+struct param_data {
+ int id;
+ struct mlx4_dbdf2val_lst dbdf2val;
+};
+
+static struct param_data roce_mode = {
+ .id = ROCE_MODE,
+ .dbdf2val = {
+ .name = "roce_mode param",
+ .num_vals = 1,
+ .def_val = {0},
+ .range = {0, 4},
+ .num_inval_vals = 0
+ }
+};
+module_param_string(roce_mode, roce_mode.dbdf2val.str,
+ sizeof(roce_mode.dbdf2val.str), 0444);
+MODULE_PARM_DESC(roce_mode,
+ "Set RoCE modes supported by the port\n"
+ "\tA single value (e.g. 0) to define uniform preferred RoCE_mode value for all devices\n"
+ "\t\tor a string to map device function numbers to their RoCE mode value (e.g. '0000:04:00.0-0,002b:1c:0b.a-0').\n"
+ "\t\tAllowed values are 0: RoCEv1 (default), 1: RoCEv1.5, 2: RoCEv2, 3: RoCEv1.5+2 and 4: RoCEv1+2)\n");
+
+static struct param_data ud_gid_type = {
+ .id = UD_GID_TYPE,
+ .dbdf2val = {
+ .name = "ud_gid_type param",
+ .num_vals = 1,
+ .def_val = {MLX4_ROCE_GID_TYPE_V1_5},
+ .range = {MLX4_ROCE_GID_TYPE_V1, MLX4_ROCE_GID_TYPE_V2},
+ .num_inval_vals = 0
+ }
+};
+module_param_string(ud_gid_type, ud_gid_type.dbdf2val.str,
+ sizeof(ud_gid_type.dbdf2val.str), 0444);
+MODULE_PARM_DESC(ud_gid_type,
+ "Set gid type for UD QPs\n"
+ "\tA single value (e.g. 1) to define uniform UD QP gid type for all devices\n"
+ "\t\tor a string to map device function numbers to their UD QP gid type (e.g. '0000:04:00.0-0,002b:1c:0b.a-1').\n"
+ "\t\tAllowed values are 0 for RoCEv1, 1 for RoCEv1.5 (default) and 2 for RoCEv2");
+
+static struct param_data num_vfs = {
+ .id = NUM_VFS,
+ .dbdf2val = {
+ .name = "num_vfs param",
+ .num_vals = 3,
+ .def_val = {0},
+ .range = {0, MLX4_MAX_NUM_VF},
+ .num_inval_vals = 0
+ }
+};
+module_param_string(num_vfs, num_vfs.dbdf2val.str,
+ sizeof(num_vfs.dbdf2val.str), 0444);
+MODULE_PARM_DESC(num_vfs,
+ "Either single value (e.g. '5') or triplet (e.g. '10,11,12') to define uniform num_vfs value for all device functions.\n"
+ "\t\tIf a single value is given, this value will be used in order to define <num_vfs> dual-port virtual functions.\n"
+ "\t\tIf a triplet <a,b,c> is given, <a> single-port virtual functions are defined on port1, <b> single-port\n"
+ "\t\tvirtual functions are defined on port2 and <c> dual-port virtual functions are defined.\n"
+ "\t\tAlternatively, a string to map device function numbers to their num_vfs values\n"
+ "\t\t(e.g. '0000:04:00.0-5,002b:1c:0b.a-15;2;4') could be given.\n"
+ "\t\tHexadecimal digits for the device function (e.g. 002b:1c:0b.a) and decimal or triplet for num_vfs value\n"
+ "\t\t(e.g. 15 or 1;2;3).");
+
+static struct param_data probe_vf = {
+ .id = PROBE_VF,
+ .dbdf2val = {
+ .name = "probe_vf param",
+ .num_vals = 3,
+ .def_val = {0},
+ .range = {0, MLX4_MAX_NUM_VF},
+ .num_inval_vals = 0
+ }
+};
+module_param_string(probe_vf, probe_vf.dbdf2val.str,
+ sizeof(probe_vf.dbdf2val.str), 0444);
+MODULE_PARM_DESC(probe_vf,
+ "Either single value (e.g. '3') or triplet (e.g. '1,2,3') to define uniform number of VFs to probe by the PF\n"
+ "\t\tdriver for all device functions.\n"
+ "\t\tIf a single value is given, this value will be used in order to define <probe_vf> probed dual-port virtual\n"
+ "\t\tfunctions. If a triplet <a,b,c> is given, <a> single-port virtual functions are probed on port1, <b> single-port\n"
+ "\t\tvirtual functions are probed on port2 and <c> dual-port virtual functions are probed.\n"
+ "\t\tAlternatively, a string to map device function numbers to their probe_vf values\n"
+ "\t\t(e.g. '0000:04:00.0-3,002b:1c:0b.a-13;12;11') could be given.\n"
+ "\t\tHexadecimal digits for the device function (e.g. 002b:1c:0b.a) and decimal for probe_vf value (e.g. 13 or 1;2;3).");
+
+#define MLX4_FORCE_DMFS_IF_NO_NCSI_FS (1U << 0)
+#define MLX4_DMFS_ETH_ONLY (1U << 1)
+#define MLX4_DMFS_A0_STEERING (1U << 2)
+#define MLX4_DISABLE_DMFS_LOW_QP_NUM (1U << 3)
+#define MLX4_IB_IGNORE_SIP_CHECK (1U << 4)
+#define MLX4_ETH_IGNORE_SIP_CHECK (1U << 5)
+#define MLX4_DMFS_PARAM_VALUES ((MLX4_ETH_IGNORE_SIP_CHECK << 1) - 1)
+
+int mlx4_log_num_mgm_entry_size = -(MLX4_DMFS_ETH_ONLY | MLX4_DISABLE_DMFS_LOW_QP_NUM);
+module_param_named(log_num_mgm_entry_size,
+ mlx4_log_num_mgm_entry_size, int, 0444);
+MODULE_PARM_DESC(log_num_mgm_entry_size, "log mgm size, that defines the num"
+ " of qp per mcg, for example:"
+ " 10 gives 248. Range: 7 <="
+ " log_num_mgm_entry_size <= 12 (default = -10).\n"
+ "\t\tTo activate one of the device managed"
+ " flow steering modes, set to a non-positive value (-x) and set bits in x:\n"
+ "\t\t0: Force DMFS, even at the expense of NCSI support\n"
+ "\t\t1: Disable IPoIB DMFS rules (if enabled performance might decrease. Can't be cleared if b3 is set)\n"
+ "\t\t2: Enable optimized steering (even if in limited L2 mode. Can't be set if b2 is cleared)\n"
+ "\t\t3: Disable DMFS if number of QPs per MCG is low\n"
+ "\t\t4: Optimize IPoIB/EoIB steering table for non source IP rules if possible\n"
+ "\t\t5: Optimize steering table for non source IP rules if possible");
+
+static int fast_drop;
+module_param_named(fast_drop, fast_drop, int, 0444);
+MODULE_PARM_DESC(fast_drop,
+ "Enable fast packet drop when no receive WQEs are posted");
+
+static bool enable_64b_cqe_eqe = true;
+module_param(enable_64b_cqe_eqe, bool, 0444);
+MODULE_PARM_DESC(enable_64b_cqe_eqe,
+ "Enable 64 byte CQEs/EQEs when the FW supports this (default: True)");
+
+#define PF_CONTEXT_BEHAVIOUR_MASK (MLX4_FUNC_CAP_64B_EQE_CQE | \
+ MLX4_FUNC_CAP_EQE_CQE_STRIDE | \
+ MLX4_FUNC_CAP_DMFS_A0_STATIC)
+
+#define RESET_PERSIST_MASK_FLAGS (MLX4_FLAG_SRIOV)
+
+static char mlx4_version[] =
+ DRV_NAME ": Mellanox ConnectX core driver v"
+ DRV_VERSION " (" DRV_RELDATE ")\n";
+
+static struct mlx4_profile low_mem_profile = {
+ .num_qp = 1 << 17,
+ .num_srq = 1 << 6,
+ .rdmarc_per_qp = 1 << 4,
+ .num_cq = 1 << 8,
+ .num_mcg = 1 << 8,
+ .num_mpt = 1 << 9,
+ .num_mtt = 1 << 7,
+};
+
+
+#define MLX4_MAX_LOG_NUM_MACS 7
+static int log_num_mac = MLX4_MAX_LOG_NUM_MACS;
+module_param_named(log_num_mac, log_num_mac, int, 0444);
+MODULE_PARM_DESC(log_num_mac, "Log2 max number of MACs per ETH port (1-7)");
+
+static int log_num_vlan;
+module_param_named(log_num_vlan, log_num_vlan, int, 0444);
+MODULE_PARM_DESC(log_num_vlan, "Log2 max number of VLANs per ETH port (0-7)");
+/* Log2 max number of VLANs per ETH port (0-7) */
+#define MLX4_LOG_NUM_VLANS 7
+#define MLX4_MIN_LOG_NUM_VLANS 0
+#define MLX4_MIN_LOG_NUM_MAC 1
+
+static int use_prio;
+module_param_named(use_prio, use_prio, int, 0444);
+MODULE_PARM_DESC(use_prio, "Enable steering by VLAN priority on ETH ports (deprecated)");
+
+int log_mtts_per_seg = ilog2(1);
+module_param_named(log_mtts_per_seg, log_mtts_per_seg, int, 0444);
+MODULE_PARM_DESC(log_mtts_per_seg, "Log2 number of MTT entries per segment (0-7) (default: 0)");
+
+static struct param_data port_type_array = {
+ .id = PORT_TYPE_ARRAY,
+ .dbdf2val = {
+ .name = "port_type_array param",
+ .num_vals = 2,
+ .def_val = {MLX4_PORT_TYPE_ETH, MLX4_PORT_TYPE_ETH},
+ .range = {MLX4_PORT_TYPE_ETH, MLX4_PORT_TYPE_ETH},
+ .num_inval_vals = 1,
+ .inval_val = {MLX4_PORT_TYPE_AUTO}
+ }
+};
+module_param_string(port_type_array, port_type_array.dbdf2val.str,
+ sizeof(port_type_array.dbdf2val.str), 0444);
+MODULE_PARM_DESC(port_type_array,
+ "Valid only if num_vfs is non-zero (SRIOV mode). Ignored otherwise.\n"
+ "\t\tEither pair of values (e.g. '1,2') to define uniform port1/port2 types configuration for all device functions\n"
+ "\t\tor a string to map device function numbers to their pair of port types values (e.g. '0000:04:00.0-1;2,002b:1c:0b.a-1;1').\n"
+ "\t\tValid port types: 1-ib, 2-eth, 4-N/A\n"
+ "\t\tIn case that only one port is available use the N/A port type for port2 (e.g. '1,4').");
+
+
+struct mlx4_port_config {
+ struct list_head list;
+ enum mlx4_port_type port_type[MLX4_MAX_PORTS + 1];
+ struct pci_dev *pdev;
+};
+
+static atomic_t pf_loading = ATOMIC_INIT(0);
+
+#define MLX4_LOG_NUM_MTT 20
+/* We limit to 30 because of a bitmap issue: the bitmap code uses int, not uint.
+ See mlx4_buddy_init -> bitmap_zero, which takes an int.
+*/
+#define MLX4_MAX_LOG_NUM_MTT 30
+static struct mlx4_profile mod_param_profile = {
+ .num_qp = 19,
+ .num_srq = 16,
+ .rdmarc_per_qp = 4,
+ .num_cq = 16,
+ .num_mcg = 13,
+ .num_mpt = 19,
+ .num_mtt = 0, /* max(20, 2*MTTs for host memory) */
+};
+
+module_param_named(log_num_qp, mod_param_profile.num_qp, int, 0444);
+MODULE_PARM_DESC(log_num_qp, "log maximum number of QPs per HCA (default: 19)");
+
+module_param_named(log_num_srq, mod_param_profile.num_srq, int, 0444);
+MODULE_PARM_DESC(log_num_srq, "log maximum number of SRQs per HCA "
+ "(default: 16)");
+
+module_param_named(log_rdmarc_per_qp, mod_param_profile.rdmarc_per_qp, int,
+ 0444);
+MODULE_PARM_DESC(log_rdmarc_per_qp, "log number of RDMARC buffers per QP "
+ "(default: 4)");
+
+module_param_named(log_num_cq, mod_param_profile.num_cq, int, 0444);
+MODULE_PARM_DESC(log_num_cq, "log maximum number of CQs per HCA (default: 16)");
+
+module_param_named(log_num_mcg, mod_param_profile.num_mcg, int, 0444);
+MODULE_PARM_DESC(log_num_mcg, "log maximum number of multicast groups per HCA "
+ "(default: 13)");
+
+module_param_named(log_num_mpt, mod_param_profile.num_mpt, int, 0444);
+MODULE_PARM_DESC(log_num_mpt,
+ "log maximum number of memory protection table entries per "
+ "HCA (default: 19)");
+
+module_param_named(log_num_mtt, mod_param_profile.num_mtt, int, 0444);
+MODULE_PARM_DESC(log_num_mtt,
+ "log maximum number of memory translation table segments per "
+ "HCA (default: max(20, 2*MTTs needed to register all of the host memory, limited to 30))");
+
+static void process_mod_param_profile(struct mlx4_profile *profile)
+{
+ struct sysinfo si;
+
+ profile->num_qp = 1 << mod_param_profile.num_qp;
+ profile->num_srq = 1 << mod_param_profile.num_srq;
+ profile->rdmarc_per_qp = 1 << mod_param_profile.rdmarc_per_qp;
+ profile->num_cq = 1 << mod_param_profile.num_cq;
+ profile->num_mcg = 1 << mod_param_profile.num_mcg;
+ profile->num_mpt = 1 << mod_param_profile.num_mpt;
+ /* We want to scale the number of MTTs with the size of the
+ * system memory, since it makes sense to register a lot of
+ * memory on a system with a lot of memory. As a heuristic,
+ * make sure we have enough MTTs to register twice the system
+ * memory (with PAGE_SIZE entries).
+ *
+ * This number has to be a power of two and fit into 32 bits
+ * due to device limitations. We cap this at 2^30 because the bitmap
+ * code works with int instead of uint (mlx4_buddy_init -> bitmap_zero).
+ * That limits us to 4TB of memory registration per HCA with
+ * 4KB pages, which is probably OK for the next few months.
+ */
+ if (mod_param_profile.num_mtt)
+ profile->num_mtt = 1 << mod_param_profile.num_mtt;
+ else {
+ si_meminfo(&si);
+ profile->num_mtt =
+ roundup_pow_of_two(max_t(unsigned,
+ 1 << (MLX4_LOG_NUM_MTT - log_mtts_per_seg),
+ min(1UL << (MLX4_MAX_LOG_NUM_MTT - log_mtts_per_seg),
+ (si.totalram << 1) >> log_mtts_per_seg)));
+ /* set the actual value, so it will be reflected to the user
+ * using the sysfs
+ */
+ mod_param_profile.num_mtt = ilog2(profile->num_mtt);
+ }
+}
+
+enum {
+ MLX4_IF_STATE_BASIC,
+ MLX4_IF_STATE_EXTENDED
+};
+
+static inline u64 dbdf_to_u64(int domain, int bus, int dev, int fn)
+{
+ return ((u64)domain << 20) | ((u64)bus << 12) | ((u64)dev << 4) | fn;
+}
+
+static inline void pr_bdf_err(const char *dbdf, const char *pname)
+{
+ pr_warn("mlx4_core: '%s' is not a valid bdf in '%s'\n", dbdf, pname);
+}
+
+static inline void pr_val_err(const char *dbdf, const char *pname,
+ const char *val)
+{
+ pr_warn("mlx4_core: value '%s' of bdf '%s' in '%s' is not valid\n"
+ , val, dbdf, pname);
+}
+
+static inline void pr_out_of_range_bdf(const char *dbdf, int val,
+ struct mlx4_dbdf2val_lst *dbdf2val)
+{
+ pr_warn("mlx4_core: value %d in bdf '%s' of '%s' is out of its valid range (%d,%d)\n"
+ , val, dbdf, dbdf2val->name , dbdf2val->range.min,
+ dbdf2val->range.max);
+}
+
+static inline void pr_out_of_range(struct mlx4_dbdf2val_lst *dbdf2val)
+{
+ pr_warn("mlx4_core: value of '%s' is out of its valid range (%d,%d)\n"
+ , dbdf2val->name , dbdf2val->range.min, dbdf2val->range.max);
+}
+
+static inline int is_valid_value(int val, struct mlx4_dbdf2val_lst *v)
+{
+ int i;
+
+ for (i = 0; i < v->num_inval_vals; i++) {
+ if (val == v->inval_val[i])
+ return 0;
+ }
+ return 1;
+}
+
+static inline void pr_invalid_value(int val, struct mlx4_dbdf2val_lst *dbdf2val)
+{
+ pr_warn("mlx4_core: value %d of '%s' is not allowed\n",
+ val, dbdf2val->name);
+}
+
+static inline int is_in_range(int val, struct mlx4_range *r)
+{
+ return (val >= r->min && val <= r->max);
+}
+
+static int parse_array(struct param_data *pdata, char *p, long *vals, u32 n)
+{
+ u32 iter = 0;
+
+ while (n != 0 && strlen(p)) {
+ char *t = strchr(p, ',');
+ int val_len = t ? t - p : 0; /* no pointer arithmetic on a NULL strchr() result */
+ char sval[32];
+ int ret;
+ *vals = atol(p);
+ /* Try to parse as last element */
+ if (!t) {
+ if (!is_in_range(*vals, &pdata->dbdf2val.range)) {
+ pr_out_of_range(&pdata->dbdf2val);
+ return -INVALID_DATA;
+ }
+ if (!is_valid_value(*vals, &pdata->dbdf2val)) {
+ pr_invalid_value(*vals, &pdata->dbdf2val);
+ return -INVALID_DATA;
+ }
+ return ++iter;
+ }
+
+ if (!t || t == p || val_len >= (int)sizeof(sval)) /* leave room for the NUL terminator */
+ return -INVALID_STR;
+
+ strncpy(sval, p, val_len);
+ sval[val_len] = 0;
+
+ ret = 0;
+ *vals = atol(sval);
+
+ if (ret == -EINVAL)
+ return -INVALID_STR;
+ if (ret || !is_in_range(*vals, &pdata->dbdf2val.range)) {
+ pr_out_of_range(&pdata->dbdf2val);
+ return -INVALID_DATA;
+ }
+ if (!is_valid_value(*vals, &pdata->dbdf2val)) {
+ pr_invalid_value(*vals, &pdata->dbdf2val);
+ return -INVALID_DATA;
+ }
+
+ ++iter;
+ ++vals;
+ p += val_len + 1;
+ if (n > 0)
+ n--;
+ }
+
+ return -INVALID_STR;
+}
+
+#define ARRAY_LEN(arr) (sizeof((arr))/sizeof((arr)[0]))
+static int parse_mod_param(struct param_data *pdata)
+{
+ int i;
+ int ret = 0;
+ long port_array[ARRAY_LEN(pdata->dbdf2val.tbl[0].val)];
+ char *p = pdata->dbdf2val.str;
+
+ ret = parse_array(pdata, p, port_array,
+ pdata->dbdf2val.num_vals);
+ if (ret > pdata->dbdf2val.num_vals || ret <= 0)
+ return ret < 0 ? -ret : INVALID_STR;
+ for (i = 0; i < ret; i++)
+ pdata->dbdf2val.tbl[0].val[i] = port_array[i];
+ pdata->dbdf2val.tbl[0].argc = i;
+ return 0;
+}
+
+static int update_defaults(struct param_data *pdata)
+{
+ int ret;
+ char *p = pdata->dbdf2val.str;
+
+ if (!strlen(p) || strchr(p, ':') || strchr(p, '.') || strchr(p, ';'))
+ return INVALID_STR;
+
+ switch (pdata->id) {
+ case UD_GID_TYPE:
+ case ROCE_MODE:
+ case PORT_TYPE_ARRAY:
+ case NUM_VFS:
+ case PROBE_VF:
+ ret = parse_mod_param(pdata);
+ if (ret)
+ return ret;
+ break;
+ default:
+ return INVALID_DATA;
+ }
+ pdata->dbdf2val.tbl[1].dbdf = MLX4_ENDOF_TBL;
+
+ return VALID_DATA;
+}
+
+int mlx4_fill_dbdf2val_tbl(struct mlx4_dbdf2val_lst *dbdf2val_lst)
+{
+ int domain, bus, dev, fn;
+ u64 dbdf;
+ char *p, *t, *v;
+ char tmp[32];
+ char sbdf[32];
+ char sep = ',';
+ int j, k, str_size, i = 1;
+ int prfx_size;
+
+ p = dbdf2val_lst->str;
+
+ for (j = 0; j < dbdf2val_lst->num_vals; j++)
+ dbdf2val_lst->tbl[0].val[j] = dbdf2val_lst->def_val[j];
+ dbdf2val_lst->tbl[0].argc = 0;
+ dbdf2val_lst->tbl[1].dbdf = MLX4_ENDOF_TBL;
+
+ str_size = strlen(dbdf2val_lst->str);
+
+ if (str_size == 0)
+ return 0;
+
+ while (strlen(p)) {
+ prfx_size = BDF_STR_SIZE;
+ sbdf[prfx_size] = 0;
+ strncpy(sbdf, p, prfx_size);
+ domain = DEFAULT_DOMAIN;
+ if (sscanf(sbdf, "%02x:%02x.%x-", &bus, &dev, &fn) != 3) {
+ prfx_size = DBDF_STR_SIZE;
+ sbdf[prfx_size] = 0;
+ strncpy(sbdf, p, prfx_size);
+ if (sscanf(sbdf, "%04x:%02x:%02x.%x-", &domain, &bus,
+ &dev, &fn) != 4) {
+ pr_bdf_err(sbdf, dbdf2val_lst->name);
+ goto err;
+ }
+ sprintf(tmp, "%04x:%02x:%02x.%x-", domain, bus, dev,
+ fn);
+ } else {
+ sprintf(tmp, "%02x:%02x.%x-", bus, dev, fn);
+ }
+
+ if (strncasecmp(sbdf, tmp, sizeof(tmp))) {
+ pr_bdf_err(sbdf, dbdf2val_lst->name);
+ goto err;
+ }
+
+ dbdf = dbdf_to_u64(domain, bus, dev, fn);
+
+ for (j = 1; j < i; j++)
+ if (dbdf2val_lst->tbl[j].dbdf == dbdf) {
+ pr_warn("mlx4_core: in '%s', %s appears multiple times\n"
+ , dbdf2val_lst->name, sbdf);
+ goto err;
+ }
+
+ if (i >= MLX4_DEVS_TBL_SIZE) {
+ pr_warn("mlx4_core: Too many devices in '%s'\n"
+ , dbdf2val_lst->name);
+ goto err;
+ }
+
+ p += prfx_size;
+ t = strchr(p, sep);
+ t = t ? t : p + strlen(p);
+ if (p >= t) {
+ pr_val_err(sbdf, dbdf2val_lst->name, "");
+ goto err;
+ }
+
+ for (k = 0; k < dbdf2val_lst->num_vals; k++) {
+ char sval[32];
+ long int val;
+ int ret, val_len;
+ char vsep = ';';
+ int last_occurence = 0;
+
+ v = (k == dbdf2val_lst->num_vals - 1) ? t : strchr(p, vsep);
+ if (NULL == v) {
+ v = t;
+ last_occurence = 1;
+ }
+ if (!v || v > t || v == p || (size_t)(v - p) >= sizeof(sval)) {
+ pr_val_err(sbdf, dbdf2val_lst->name, p);
+ goto err;
+ }
+ val_len = v - p;
+ strncpy(sval, p, val_len);
+ sval[val_len] = 0;
+
+ ret = 0;
+ val = atol(sval);
+ if (ret) {
+ if (strchr(p, vsep))
+ pr_warn("mlx4_core: too many vals in bdf '%s' of '%s'\n"
+ , sbdf, dbdf2val_lst->name);
+ else
+ pr_val_err(sbdf, dbdf2val_lst->name,
+ sval);
+ goto err;
+ }
+ if (!is_in_range(val, &dbdf2val_lst->range)) {
+ pr_out_of_range_bdf(sbdf, val, dbdf2val_lst);
+ goto err;
+ }
+
+ dbdf2val_lst->tbl[i].val[k] = val;
+ dbdf2val_lst->tbl[i].argc = k + 1;
+ p = v;
+ if (p[0] == vsep)
+ p++;
+ if (last_occurence)
+ break;
+ }
+
+ dbdf2val_lst->tbl[i].dbdf = dbdf;
+ if (strlen(p)) {
+ if (p[0] != sep) {
+ pr_warn("mlx4_core: expected separator '%c' before '%s' in '%s'\n"
+ , sep, p, dbdf2val_lst->name);
+ goto err;
+ }
+ p++;
+ }
+ i++;
+ if (i < MLX4_DEVS_TBL_SIZE)
+ dbdf2val_lst->tbl[i].dbdf = MLX4_ENDOF_TBL;
+ }
+
+ return 0;
+
+err:
+ dbdf2val_lst->tbl[1].dbdf = MLX4_ENDOF_TBL;
+ pr_warn("mlx4_core: The value of '%s' is incorrect. The value is discarded!\n"
+ , dbdf2val_lst->name);
+
+ return -EINVAL;
+}
+EXPORT_SYMBOL(mlx4_fill_dbdf2val_tbl);
+
+int mlx4_get_val(struct mlx4_dbdf2val *tbl, struct rte_pci_device *pdev, int idx,
+ int *val)
+{
+ u64 dbdf;
+ int i = 1;
+
+ *val = tbl[0].val[idx];
+ if (!pdev)
+ return -EINVAL;
+
+ if (!pdev->addr.bus) {
+ pr_debug("mlx4_core: pci_dev without valid bus number\n");
+ return -EINVAL;
+ }
+
+ dbdf = dbdf_to_u64(pdev->addr.domain, pdev->addr.bus,
+ pdev->addr.devid, pdev->addr.function);
+
+ while ((i < MLX4_DEVS_TBL_SIZE) && (tbl[i].dbdf != MLX4_ENDOF_TBL)) {
+ if (tbl[i].dbdf == dbdf) {
+ if (idx < tbl[i].argc) {
+ *val = tbl[i].val[idx];
+ return 0;
+ } else {
+ return -EINVAL;
+ }
+ }
+ i++;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL(mlx4_get_val);
+
+
+int mlx4_get_argc(struct mlx4_dbdf2val *tbl, struct rte_pci_device *pdev)
+{
+ u64 dbdf;
+ int i = 1;
+
+ if (!pdev)
+ return -EINVAL;
+
+ if (!pdev->addr.bus) {
+ pr_debug("mlx4_core: pci_dev without valid bus number\n");
+ return -EINVAL;
+ }
+
+ dbdf = dbdf_to_u64(pdev->addr.domain, pdev->addr.bus,
+ pdev->addr.devid, pdev->addr.function);
+
+ while ((i < MLX4_DEVS_TBL_SIZE) && (tbl[i].dbdf != MLX4_ENDOF_TBL)) {
+ if (tbl[i].dbdf == dbdf)
+ return tbl[i].argc;
+ i++;
+ }
+
+ return tbl[0].argc;
+}
+EXPORT_SYMBOL(mlx4_get_argc);
+
+int mlx4_check_port_params(struct mlx4_dev *dev,
+ enum mlx4_port_type *port_type)
+{
+ int i;
+
+ if (!(dev->caps.flags & MLX4_DEV_CAP_FLAG_DPDP)) {
+ for (i = 0; i < dev->caps.num_ports - 1; i++) {
+ if (port_type[i] != port_type[i + 1]) {
+ mlx4_err(dev, "Only same port types supported on this HCA, aborting\n");
+ return -EINVAL;
+ }
+ }
+ }
+
+ for (i = 0; i < dev->caps.num_ports; i++) {
+ if (!(port_type[i] & dev->caps.supported_type[i+1])) {
+ mlx4_err(dev, "Requested port type for port %d is not supported on this HCA\n",
+ i + 1);
+ return -EINVAL;
+ }
+ }
+ return 0;
+}
+
+static void mlx4_set_port_mask(struct mlx4_dev *dev)
+{
+ int i;
+
+ for (i = 1; i <= dev->caps.num_ports; ++i)
+ dev->caps.port_mask[i] = dev->caps.port_type[i];
+}
+
+enum {
+ MLX4_QUERY_FUNC_NUM_SYS_EQS = 1 << 0,
+};
+
+static int mlx4_query_func(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
+{
+ int err = 0;
+ struct mlx4_func func;
+
+ if (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_SYS_EQS) {
+ err = mlx4_QUERY_FUNC(dev, &func, 0);
+ if (err) {
+ mlx4_err(dev, "QUERY_FUNC command failed, aborting.\n");
+ return err;
+ }
+ dev_cap->max_eqs = func.max_eq;
+ dev_cap->reserved_eqs = func.rsvd_eqs;
+ dev_cap->reserved_uars = func.rsvd_uars;
+ err |= MLX4_QUERY_FUNC_NUM_SYS_EQS;
+ }
+ return err;
+}
+
+static void mlx4_enable_cqe_eqe_stride(struct mlx4_dev *dev)
+{
+ struct mlx4_caps *dev_cap = &dev->caps;
+
+ /* FW not supporting or cancelled by user */
+ if (!(dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_EQE_STRIDE) ||
+ !(dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_CQE_STRIDE))
+ return;
+
+ /* Must have 64B CQE_EQE enabled by FW to use bigger stride
+ * When FW has NCSI it may decide not to report 64B CQE/EQEs
+ */
+ if (!(dev_cap->flags & MLX4_DEV_CAP_FLAG_64B_EQE) ||
+ !(dev_cap->flags & MLX4_DEV_CAP_FLAG_64B_CQE)) {
+ dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_CQE_STRIDE;
+ dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_EQE_STRIDE;
+ return;
+ }
+
+ if (cache_line_size() == 128 || cache_line_size() == 256) {
+ mlx4_dbg(dev, "Enabling CQE stride cacheLine supported\n");
+ /* Changing the real data inside CQE size to 32B */
+ dev_cap->flags &= ~MLX4_DEV_CAP_FLAG_64B_CQE;
+ dev_cap->flags &= ~MLX4_DEV_CAP_FLAG_64B_EQE;
+
+ if (mlx4_is_master(dev))
+ dev_cap->function_caps |= MLX4_FUNC_CAP_EQE_CQE_STRIDE;
+ } else {
+ if (cache_line_size() != 32 && cache_line_size() != 64)
+ mlx4_dbg(dev, "Disabling CQE stride, cacheLine size unsupported\n");
+ dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_CQE_STRIDE;
+ dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_EQE_STRIDE;
+ }
+}
+
+static int _mlx4_dev_port(struct mlx4_dev *dev, int port,
+ struct mlx4_port_cap *port_cap)
+{
+ dev->caps.vl_cap[port] = port_cap->max_vl;
+ dev->caps.ib_mtu_cap[port] = port_cap->ib_mtu;
+ dev->phys_caps.gid_phys_table_len[port] = port_cap->max_gids;
+ dev->phys_caps.pkey_phys_table_len[port] = port_cap->max_pkeys;
+ /* set gid and pkey table operating lengths by default
+ * to non-sriov values
+ */
+ dev->caps.gid_table_len[port] = port_cap->max_gids;
+ dev->caps.pkey_table_len[port] = port_cap->max_pkeys;
+ dev->caps.port_width_cap[port] = port_cap->max_port_width;
+ dev->caps.eth_mtu_cap[port] = port_cap->eth_mtu;
+ dev->caps.def_mac[port] = port_cap->def_mac;
+ dev->caps.supported_type[port] = port_cap->supported_port_types;
+ dev->caps.suggested_type[port] = port_cap->suggested_type;
+ dev->caps.default_sense[port] = port_cap->default_sense;
+ dev->caps.trans_type[port] = port_cap->trans_type;
+ dev->caps.vendor_oui[port] = port_cap->vendor_oui;
+ dev->caps.wavelength[port] = port_cap->wavelength;
+ dev->caps.trans_code[port] = port_cap->trans_code;
+
+ return 0;
+}
+
+static int mlx4_dev_port(struct mlx4_dev *dev, int port,
+ struct mlx4_port_cap *port_cap)
+{
+ int err = 0;
+
+ err = mlx4_QUERY_PORT(dev, port, port_cap);
+
+ if (err)
+ mlx4_err(dev, "QUERY_PORT command failed.\n");
+
+ return err;
+}
+
+static inline void mlx4_enable_ignore_fcs(struct mlx4_dev *dev)
+{
+ if (!(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_IGNORE_FCS))
+ return;
+
+ if (mlx4_is_mfunc(dev)) {
+ mlx4_dbg(dev, "SRIOV mode - Disabling Ignore FCS\n");
+ dev->caps.flags2 &= ~MLX4_DEV_CAP_FLAG2_IGNORE_FCS;
+ return;
+ }
+
+ if (!(dev->caps.flags & MLX4_DEV_CAP_FLAG_FCS_KEEP)) {
+ mlx4_dbg(dev,
+ "Keep FCS is not supported - Disabling Ignore FCS\n");
+ dev->caps.flags2 &= ~MLX4_DEV_CAP_FLAG2_IGNORE_FCS;
+ return;
+ }
+}
+
+#define MLX4_A0_STEERING_TABLE_SIZE 256
+static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
+{
+ int err;
+ int i;
+
+ err = mlx4_QUERY_DEV_CAP(dev, dev_cap);
+ if (err) {
+ mlx4_err(dev, "QUERY_DEV_CAP command failed, aborting\n");
+ return err;
+ }
+
+ if ((ingress_parser_mode != MLX4_INGRESS_PARSER_MODE_STANDARD) &&
+ (dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_MODIFY_PARSER)) {
+ dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_VXLAN_OFFLOADS;
+ dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_ROCEV2;
+ }
+
+ mlx4_dev_cap_dump(dev, dev_cap);
+
+ if (dev_cap->min_page_sz > PAGE_SIZE) {
+ mlx4_err(dev, "HCA minimum page size of %d bigger than kernel PAGE_SIZE of %ld, aborting\n",
+ dev_cap->min_page_sz, PAGE_SIZE);
+ return -ENODEV;
+ }
+ if (dev_cap->num_ports > MLX4_MAX_PORTS) {
+ mlx4_err(dev, "HCA has %d ports, but we only support %d, aborting\n",
+ dev_cap->num_ports, MLX4_MAX_PORTS);
+ return -ENODEV;
+ }
+
+ if (dev_cap->uar_size > dev->persist->rte_pdev->mem_resource[2].len) {
+ mlx4_err(dev, "HCA reported UAR size of 0x%x bigger than PCI resource 2 size of 0x%llx, aborting\n",
+ dev_cap->uar_size,
+ (unsigned long long)
+ dev->persist->rte_pdev->mem_resource[2].len);
+ return -ENODEV;
+ }
+ if (dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2)
+ dev->caps.roce_addr_support = 1;
+
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ if ((dev_cap->bmme_flags & MLX4_BMME_FLAG_WQE_FORMAT))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_WQE_FORMAT;
+#endif
+ dev->caps.num_ports = dev_cap->num_ports;
+ dev->caps.num_sys_eqs = dev_cap->num_sys_eqs;
+ dev->phys_caps.num_phys_eqs = dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_SYS_EQS ?
+ dev->caps.num_sys_eqs :
+ MLX4_MAX_EQ_NUM;
+ for (i = 1; i <= dev->caps.num_ports; ++i) {
+ err = _mlx4_dev_port(dev, i, dev_cap->port_cap + i);
+ if (err) {
+ mlx4_err(dev, "QUERY_PORT command failed, aborting\n");
+ return err;
+ }
+ }
+
+ dev->caps.uar_page_size = PAGE_SIZE;
+ dev->caps.num_uars = dev_cap->uar_size / PAGE_SIZE;
+ dev->caps.local_ca_ack_delay = dev_cap->local_ca_ack_delay;
+ dev->caps.bf_reg_size = dev_cap->bf_reg_size;
+ dev->caps.bf_regs_per_page = dev_cap->bf_regs_per_page;
+ dev->caps.max_sq_sg = dev_cap->max_sq_sg;
+ dev->caps.max_rq_sg = dev_cap->max_rq_sg;
+ dev->caps.max_wqes = dev_cap->max_qp_sz;
+ dev->caps.max_qp_init_rdma = dev_cap->max_requester_per_qp;
+ dev->caps.max_srq_wqes = dev_cap->max_srq_sz;
+ dev->caps.max_srq_sge = dev_cap->max_rq_sg - 1;
+ dev->caps.reserved_srqs = dev_cap->reserved_srqs;
+ dev->caps.max_sq_desc_sz = dev_cap->max_sq_desc_sz;
+ dev->caps.max_rq_desc_sz = dev_cap->max_rq_desc_sz;
+ /*
+ * Subtract 1 from the limit because we need to allocate a
+ * spare CQE so the HCA HW can tell the difference between an
+ * empty CQ and a full CQ.
+ */
+ dev->caps.max_cqes = dev_cap->max_cq_sz - 1;
+ dev->caps.reserved_cqs = dev_cap->reserved_cqs;
+ dev->caps.reserved_eqs = dev_cap->reserved_eqs;
+ dev->caps.reserved_mtts = dev_cap->reserved_mtts;
+ dev->caps.reserved_mrws = dev_cap->reserved_mrws;
+
+ /* The first 128 UARs are used for EQ doorbells */
+ dev->caps.reserved_uars = max_t(int, 128, dev_cap->reserved_uars);
+ dev->caps.reserved_pds = dev_cap->reserved_pds;
+ dev->caps.reserved_xrcds = (dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC) ?
+ dev_cap->reserved_xrcds : 0;
+ dev->caps.max_xrcds = (dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC) ?
+ dev_cap->max_xrcds : 0;
+ dev->caps.mtt_entry_sz = dev_cap->mtt_entry_sz;
+
+ dev->caps.max_msg_sz = dev_cap->max_msg_sz;
+ dev->caps.page_size_cap = ~(u32) (dev_cap->min_page_sz - 1);
+ dev->caps.flags = dev_cap->flags;
+ dev->caps.flags2 = dev_cap->flags2;
+ dev->caps.bmme_flags = dev_cap->bmme_flags;
+ dev->caps.reserved_lkey = dev_cap->reserved_lkey;
+ dev->caps.stat_rate_support = dev_cap->stat_rate_support;
+ dev->caps.max_gso_sz = dev_cap->max_gso_sz;
+ dev->caps.max_rss_tbl_sz = dev_cap->max_rss_tbl_sz;
+ dev->caps.cq_overrun = dev_cap->cq_overrun;
+
+ /* Sense port always allowed on supported devices for ConnectX-1 and -2 */
+ if (mlx4_priv(dev)->pci_dev_data & MLX4_PCI_DEV_FORCE_SENSE_PORT)
+ dev->caps.flags |= MLX4_DEV_CAP_FLAG_SENSE_SUPPORT;
+ /* Don't do sense port on multifunction devices (for now at least) */
+ if (mlx4_is_mfunc(dev))
+ dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_SENSE_SUPPORT;
+
+ if (mlx4_low_memory_profile()) {
+ dev->caps.log_num_macs = MLX4_MIN_LOG_NUM_MAC;
+ dev->caps.log_num_vlans = MLX4_MIN_LOG_NUM_VLANS;
+ } else {
+ dev->caps.log_num_macs = log_num_mac;
+ dev->caps.log_num_vlans = MLX4_LOG_NUM_VLANS;
+ }
+
+ dev->caps.fast_drop = fast_drop ?
+ !!(dev->caps.flags & MLX4_DEV_CAP_FLAG_FAST_DROP) :
+ 0;
+
+ for (i = 1; i <= dev->caps.num_ports; ++i) {
+ dev->caps.port_type[i] = MLX4_PORT_TYPE_NONE;
+ if (dev->caps.supported_type[i]) {
+ /* if only ETH is supported - assign ETH */
+ if (dev->caps.supported_type[i] == MLX4_PORT_TYPE_ETH)
+ dev->caps.port_type[i] = MLX4_PORT_TYPE_ETH;
+ /* if only IB is supported, assign IB */
+ else if (dev->caps.supported_type[i] ==
+ MLX4_PORT_TYPE_IB)
+ dev->caps.port_type[i] = MLX4_PORT_TYPE_IB;
+ else {
+ /*
+ * if IB and ETH are supported, we set the port
+ * type according to user selection of port type;
+ * if there is no user selection, take the FW hint
+ */
+ int pta;
+ mlx4_get_val(port_type_array.dbdf2val.tbl,
+ dev->persist->rte_pdev, i - 1,
+ &pta);
+ if (pta == MLX4_PORT_TYPE_NONE) {
+ dev->caps.port_type[i] = dev->caps.suggested_type[i] ?
+ MLX4_PORT_TYPE_ETH : MLX4_PORT_TYPE_IB;
+ } else if (pta == MLX4_PORT_TYPE_NA) {
+ mlx4_err(dev, "Port %d is a valid port. "
+ "It is not allowed to configure its type to N/A(%d)\n",
+ i, MLX4_PORT_TYPE_NA);
+ return -EINVAL;
+ } else {
+ dev->caps.port_type[i] = pta;
+ }
+ }
+ }
+ /* Link sensing is not allowed for ETH only package */
+ mlx4_priv(dev)->sense.sense_allowed[i] = 0;
+
+ /*
+ * If "default_sense" bit is set, we move the port to "AUTO" mode
+ * and perform sense_port FW command to try and set the correct
+ * port type from beginning
+ */
+ if (mlx4_priv(dev)->sense.sense_allowed[i] && dev->caps.default_sense[i]) {
+ enum mlx4_port_type sensed_port = MLX4_PORT_TYPE_NONE;
+ dev->caps.possible_type[i] = MLX4_PORT_TYPE_AUTO;
+ mlx4_SENSE_PORT(dev, i, &sensed_port);
+ if (sensed_port != MLX4_PORT_TYPE_NONE)
+ dev->caps.port_type[i] = sensed_port;
+ } else {
+ dev->caps.possible_type[i] = dev->caps.port_type[i];
+ }
+
+ if (dev->caps.log_num_macs > dev_cap->port_cap[i].log_max_macs) {
+ dev->caps.log_num_macs = dev_cap->port_cap[i].log_max_macs;
+ mlx4_warn(dev, "Requested number of MACs is too much for port %d, reducing to %d\n",
+ i, 1 << dev->caps.log_num_macs);
+ }
+ if (dev->caps.log_num_vlans > dev_cap->port_cap[i].log_max_vlans) {
+ dev->caps.log_num_vlans = dev_cap->port_cap[i].log_max_vlans;
+ mlx4_warn(dev, "Requested number of VLANs is too much for port %d, reducing to %d\n",
+ i, 1 << dev->caps.log_num_vlans);
+ }
+ }
+
+ dev->caps.max_basic_counters = dev_cap->max_basic_counters;
+ dev->caps.max_extended_counters = dev_cap->max_extended_counters;
+ /* support extended counters if available */
+ if (dev->caps.flags & MLX4_DEV_CAP_FLAG_COUNTERS_EXT)
+ dev->caps.max_counters = dev->caps.max_extended_counters;
+ else
+ dev->caps.max_counters = dev->caps.max_basic_counters;
+
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW] = dev_cap->reserved_qps;
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_ETH_ADDR] =
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FC_ADDR] =
+ (1 << dev->caps.log_num_macs) *
+ (1 << dev->caps.log_num_vlans) *
+ dev->caps.num_ports;
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FC_EXCH] = MLX4_NUM_FEXCH;
+
+ if (dev_cap->dmfs_high_rate_qpn_base > 0 &&
+ dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_FS_EN)
+ dev->caps.dmfs_high_rate_qpn_base = dev_cap->dmfs_high_rate_qpn_base;
+ else
+ dev->caps.dmfs_high_rate_qpn_base =
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW];
+
+ if (dev_cap->dmfs_high_rate_qpn_range > 0 &&
+ dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_FS_EN) {
+ dev->caps.dmfs_high_rate_qpn_range = dev_cap->dmfs_high_rate_qpn_range;
+ dev->caps.dmfs_high_steer_mode = MLX4_STEERING_DMFS_A0_DEFAULT;
+ dev->caps.flags2 |= MLX4_DEV_CAP_FLAG2_FS_A0;
+ } else {
+ dev->caps.dmfs_high_steer_mode = MLX4_STEERING_DMFS_A0_NOT_SUPPORTED;
+ dev->caps.dmfs_high_rate_qpn_base =
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW];
+ dev->caps.dmfs_high_rate_qpn_range = MLX4_A0_STEERING_TABLE_SIZE;
+ }
+
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_RSS_RAW_ETH] =
+ dev->caps.dmfs_high_rate_qpn_range;
+
+ dev->caps.reserved_qps = dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW] +
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_ETH_ADDR] +
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FC_ADDR] +
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FC_EXCH];
+
+ dev->caps.sync_qp = dev_cap->sync_qp;
+ if (dev->persist->rte_pdev->id.device_id == 0x1003 || dev->caps.cq_overrun)
+ dev->caps.cq_flags |= MLX4_DEV_CAP_CQ_FLAG_IO;
+
+ dev->caps.sqp_demux = (mlx4_is_master(dev)) ? MLX4_MAX_NUM_SLAVES : 0;
+
+ if (!enable_64b_cqe_eqe && !mlx4_is_slave(dev)) {
+ if (dev_cap->flags &
+ (MLX4_DEV_CAP_FLAG_64B_CQE | MLX4_DEV_CAP_FLAG_64B_EQE)) {
+ mlx4_warn(dev, "64B EQEs/CQEs supported by the device but not enabled\n");
+ dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_64B_CQE;
+ dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_64B_EQE;
+ }
+
+ if (dev_cap->flags2 &
+ (MLX4_DEV_CAP_FLAG2_CQE_STRIDE |
+ MLX4_DEV_CAP_FLAG2_EQE_STRIDE)) {
+ mlx4_warn(dev, "Disabling EQE/CQE stride per user request\n");
+ dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_CQE_STRIDE;
+ dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_EQE_STRIDE;
+ }
+ }
+
+ if ((dev->caps.flags &
+ (MLX4_DEV_CAP_FLAG_64B_CQE | MLX4_DEV_CAP_FLAG_64B_EQE)) &&
+ mlx4_is_master(dev))
+ dev->caps.function_caps |= MLX4_FUNC_CAP_64B_EQE_CQE;
+
+ if (!mlx4_is_slave(dev)) {
+ for (i = 0; i < dev->caps.num_ports; ++i)
+ dev->caps.def_counter_index[i] = i << 1;
+ mlx4_enable_cqe_eqe_stride(dev);
+ dev->caps.alloc_res_qp_mask =
+ (dev->caps.bf_reg_size ? MLX4_RESERVE_ETH_BF_QP : 0) |
+ MLX4_RESERVE_A0_QP;
+
+ if (!(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ETS_CFG) &&
+ dev->caps.flags & MLX4_DEV_CAP_FLAG_SET_ETH_SCHED) {
+ mlx4_warn(dev, "Old device ETS support detected\n");
+ mlx4_warn(dev, "Consider upgrading device FW.\n");
+ dev->caps.flags2 |= MLX4_DEV_CAP_FLAG2_ETS_CFG;
+ }
+
+ } else {
+ dev->caps.alloc_res_qp_mask = 0;
+ }
+
+ mlx4_enable_ignore_fcs(dev);
+
+ return 0;
+}
+#ifdef KMOD_DISABLED
+static int mlx4_get_pcie_dev_link_caps(struct mlx4_dev *dev,
+ enum pci_bus_speed *speed,
+ enum pcie_link_width *width)
+{
+ u32 lnkcap1, lnkcap2;
+ int err1, err2;
+
+#define PCIE_MLW_CAP_SHIFT 4 /* start of MLW mask in link capabilities */
+
+ *speed = PCI_SPEED_UNKNOWN;
+ *width = PCIE_LNK_WIDTH_UNKNOWN;
+
+ err1 = pcie_capability_read_dword(dev->persist->pdev, PCI_EXP_LNKCAP,
+ &lnkcap1);
+ err2 = pcie_capability_read_dword(dev->persist->pdev, PCI_EXP_LNKCAP2,
+ &lnkcap2);
+ if (!err2 && lnkcap2) { /* PCIe r3.0-compliant */
+ if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_8_0GB)
+ *speed = PCIE_SPEED_8_0GT;
+ else if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_5_0GB)
+ *speed = PCIE_SPEED_5_0GT;
+ else if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_2_5GB)
+ *speed = PCIE_SPEED_2_5GT;
+ }
+ if (!err1) {
+ *width = (lnkcap1 & PCI_EXP_LNKCAP_MLW) >> PCIE_MLW_CAP_SHIFT;
+ if (!lnkcap2) { /* pre-r3.0 */
+ if (lnkcap1 & PCI_EXP_LNKCAP_SLS_5_0GB)
+ *speed = PCIE_SPEED_5_0GT;
+ else if (lnkcap1 & PCI_EXP_LNKCAP_SLS_2_5GB)
+ *speed = PCIE_SPEED_2_5GT;
+ }
+ }
+
+ if (*speed == PCI_SPEED_UNKNOWN || *width == PCIE_LNK_WIDTH_UNKNOWN) {
+ return err1 ? err1 :
+ err2 ? err2 : -EINVAL;
+ }
+ return 0;
+}
+#endif
+
+#ifdef KMOD_DISABLED
+static void mlx4_check_pcie_caps(struct mlx4_dev *dev)
+{
+ enum pcie_link_width width, width_cap;
+ enum pci_bus_speed speed, speed_cap;
+ int err;
+
+#define PCIE_SPEED_STR(speed) \
+ (speed == PCIE_SPEED_8_0GT ? "8.0GT/s" : \
+ speed == PCIE_SPEED_5_0GT ? "5.0GT/s" : \
+ speed == PCIE_SPEED_2_5GT ? "2.5GT/s" : \
+ "Unknown")
+
+ err = mlx4_get_pcie_dev_link_caps(dev, &speed_cap, &width_cap);
+ if (err) {
+ mlx4_warn(dev,
+ "Unable to determine PCIe device BW capabilities\n");
+ return;
+ }
+
+ err = pcie_get_minimum_link(dev->persist->pdev, &speed, &width);
+ if (err || speed == PCI_SPEED_UNKNOWN ||
+ width == PCIE_LNK_WIDTH_UNKNOWN) {
+ mlx4_warn(dev,
+ "Unable to determine PCI device chain minimum BW\n");
+ return;
+ }
+
+ if (width != width_cap || speed != speed_cap)
+ mlx4_warn(dev,
+ "PCIe BW is different than device's capability\n");
+
+ mlx4_info(dev, "PCIe link speed is %s, device supports %s\n",
+ PCIE_SPEED_STR(speed), PCIE_SPEED_STR(speed_cap));
+ mlx4_info(dev, "PCIe link width is x%d, device supports x%d\n",
+ width, width_cap);
+ return;
+}
+#endif
+
+/* Count the VFs (slaves) that are still active and return their number. */
+static int mlx4_how_many_lives_vf(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_slave_state *s_state;
+ int i;
+ int ret = 0;
+
+ for (i = 1/*the ppf is 0*/; i < dev->num_slaves; ++i) {
+ s_state = &priv->mfunc.master.slave_state[i];
+ if (s_state->active && s_state->last_cmd !=
+ MLX4_COMM_CMD_RESET) {
+ mlx4_warn(dev, "%s: slave: %d is still active\n",
+ __func__, i);
+ ret++;
+ }
+ }
+ return ret;
+}
+
+int mlx4_get_parav_qkey(struct mlx4_dev *dev, u32 qpn, u32 *qkey)
+{
+ u32 qk = MLX4_RESERVED_QKEY_BASE;
+
+ if (qpn >= dev->phys_caps.base_tunnel_sqpn + 8 * MLX4_MFUNC_MAX ||
+ qpn < dev->phys_caps.base_proxy_sqpn)
+ return -EINVAL;
+
+ if (qpn >= dev->phys_caps.base_tunnel_sqpn)
+ /* tunnel qp */
+ qk += qpn - dev->phys_caps.base_tunnel_sqpn;
+ else
+ qk += qpn - dev->phys_caps.base_proxy_sqpn;
+ *qkey = qk;
+ return 0;
+}
+EXPORT_SYMBOL(mlx4_get_parav_qkey);
+
+void mlx4_sync_pkey_table(struct mlx4_dev *dev, int slave, int port, int i, int val)
+{
+ struct mlx4_priv *priv = container_of(dev, struct mlx4_priv, dev);
+
+ if (!mlx4_is_master(dev))
+ return;
+
+ priv->virt2phys_pkey[slave][port - 1][i] = val;
+}
+EXPORT_SYMBOL(mlx4_sync_pkey_table);
+
+void mlx4_put_slave_node_guid(struct mlx4_dev *dev, int slave, __be64 guid)
+{
+ struct mlx4_priv *priv = container_of(dev, struct mlx4_priv, dev);
+
+ if (!mlx4_is_master(dev))
+ return;
+
+ priv->slave_node_guids[slave] = guid;
+}
+EXPORT_SYMBOL(mlx4_put_slave_node_guid);
+
+__be64 mlx4_get_slave_node_guid(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_priv *priv = container_of(dev, struct mlx4_priv, dev);
+
+ if (!mlx4_is_master(dev))
+ return 0;
+
+ return priv->slave_node_guids[slave];
+}
+EXPORT_SYMBOL(mlx4_get_slave_node_guid);
+
+int mlx4_is_slave_active(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_slave_state *s_slave;
+
+ if (!mlx4_is_master(dev))
+ return 0;
+
+ s_slave = &priv->mfunc.master.slave_state[slave];
+ return !!s_slave->active;
+}
+EXPORT_SYMBOL(mlx4_is_slave_active);
+
+static void slave_adjust_steering_mode(struct mlx4_dev *dev,
+ struct mlx4_dev_cap *dev_cap,
+ struct mlx4_init_hca_param *hca_param)
+{
+ dev->caps.steering_mode = hca_param->steering_mode;
+ dev->caps.steering_attr = hca_param->steering_attr;
+ if (dev->caps.steering_mode == MLX4_STEERING_MODE_DEVICE_MANAGED) {
+ dev->caps.num_qp_per_mgm = dev_cap->fs_max_num_qp_per_entry;
+ dev->caps.fs_log_max_ucast_qp_range_size =
+ dev_cap->fs_log_max_ucast_qp_range_size;
+ } else
+ dev->caps.num_qp_per_mgm =
+ 4 * ((1 << hca_param->log_mc_entry_sz)/16 - 2);
+
+ mlx4_dbg(dev, "Steering mode is: %s\n",
+ mlx4_steering_mode_str(dev->caps.steering_mode));
+}
+
+static void mlx4_slave_destroy_special_qp_cap(struct mlx4_dev *dev)
+{
+ kfree(dev->caps.qp0_qkey);
+ kfree(dev->caps.qp0_tunnel);
+ kfree(dev->caps.qp0_proxy);
+ kfree(dev->caps.qp1_tunnel);
+ kfree(dev->caps.qp1_proxy);
+ dev->caps.qp0_qkey = NULL;
+ dev->caps.qp0_tunnel = NULL;
+ dev->caps.qp0_proxy = NULL;
+ dev->caps.qp1_tunnel = NULL;
+ dev->caps.qp1_proxy = NULL;
+}
+
+static int mlx4_slave_special_qp_cap(struct mlx4_dev *dev)
+{
+ struct mlx4_func_cap *func_cap = NULL;
+ int i, err;
+
+ func_cap = kzalloc(sizeof(*func_cap), GFP_KERNEL);
+ dev->caps.qp0_qkey = kcalloc(dev->caps.num_ports,
+ sizeof(u32), GFP_KERNEL);
+ dev->caps.qp0_tunnel = kcalloc(dev->caps.num_ports,
+ sizeof(u32), GFP_KERNEL);
+ dev->caps.qp0_proxy = kcalloc(dev->caps.num_ports,
+ sizeof(u32), GFP_KERNEL);
+ dev->caps.qp1_tunnel = kcalloc(dev->caps.num_ports,
+ sizeof(u32), GFP_KERNEL);
+ dev->caps.qp1_proxy = kcalloc(dev->caps.num_ports,
+ sizeof(u32), GFP_KERNEL);
+
+ if (!dev->caps.qp0_tunnel || !dev->caps.qp0_proxy ||
+ !dev->caps.qp1_tunnel || !dev->caps.qp1_proxy ||
+ !dev->caps.qp0_qkey || !func_cap) {
+ mlx4_err(dev, "Failed to allocate memory for special qps cap\n");
+ err = -ENOMEM;
+ goto err_mem;
+ }
+
+ for (i = 1; i <= dev->caps.num_ports; ++i) {
+ err = mlx4_QUERY_FUNC_CAP(dev, i, func_cap);
+ if (err) {
+ mlx4_err(dev, "QUERY_FUNC_CAP port command failed for port %d, aborting (%d)\n",
+ i, err);
+ goto err_mem;
+ }
+ dev->caps.qp0_qkey[i - 1] = func_cap->qp0_qkey;
+ dev->caps.qp0_tunnel[i - 1] = func_cap->qp0_tunnel_qpn;
+ dev->caps.qp0_proxy[i - 1] = func_cap->qp0_proxy_qpn;
+ dev->caps.qp1_tunnel[i - 1] = func_cap->qp1_tunnel_qpn;
+ dev->caps.qp1_proxy[i - 1] = func_cap->qp1_proxy_qpn;
+ dev->caps.def_counter_index[i - 1] = func_cap->def_counter_index;
+ dev->caps.port_mask[i] = dev->caps.port_type[i];
+ dev->caps.phys_port_id[i] = func_cap->phys_port_id;
+ err = mlx4_get_slave_pkey_gid_tbl_len(dev, i,
+ &dev->caps.gid_table_len[i],
+ &dev->caps.pkey_table_len[i]);
+ if (err) {
+ mlx4_err(dev, "QUERY_PORT command failed for port %d, aborting (%d)\n",
+ i, err);
+ goto err_mem;
+ }
+ }
+
+ kfree(func_cap);
+ return 0;
+
+err_mem:
+ kfree(func_cap);
+ mlx4_slave_destroy_special_qp_cap(dev);
+
+ return err;
+}
+
+int mlx4_verify_supported_gid_type(struct mlx4_dev *dev, enum mlx4_roce_gid_type gid_type,
+ enum mlx4_roce_gid_type *alt_type)
+{
+ static const int supported_gid_types[][2] = {
+ [MLX4_ROCE_MODE_1] = {MLX4_ROCE_GID_TYPE_V1, -1},
+ [MLX4_ROCE_MODE_1_5] = {MLX4_ROCE_GID_TYPE_V1_5, -1},
+ [MLX4_ROCE_MODE_2] = {MLX4_ROCE_GID_TYPE_V2, -1},
+ [MLX4_ROCE_MODE_1_5_PLUS_2] = {MLX4_ROCE_GID_TYPE_V1_5, MLX4_ROCE_GID_TYPE_V2},
+ [MLX4_ROCE_MODE_1_PLUS_2] = {MLX4_ROCE_GID_TYPE_V1, MLX4_ROCE_GID_TYPE_V2}
+ };
+ enum mlx4_roce_mode roce_mode = dev->caps.roce_mode;
+ int i;
+ if (roce_mode == MLX4_ROCE_MODE_INVALID) {
+ if (alt_type)
+ *alt_type = MLX4_ROCE_GID_TYPE_INVALID;
+ return -EINVAL;
+ }
+
+ for (i = 0; i < ARRAY_SIZE(supported_gid_types[roce_mode]) &&
+ gid_type != supported_gid_types[roce_mode][i]; i++)
+ ;
+
+ if (i == ARRAY_SIZE(supported_gid_types[roce_mode])) {
+ if (alt_type)
+ *alt_type = supported_gid_types[roce_mode][0];
+ return -EINVAL;
+ }
+ return 0;
+}
+
+static void choose_roce_mode(struct mlx4_dev *dev,
+ struct mlx4_dev_cap *dev_cap)
+{
+ int req_roce_mode;
+ enum mlx4_roce_mode def_roce_mode;
+ int req_ud_gid_type;
+ enum mlx4_roce_gid_type alt_gid_type;
+
+ def_roce_mode = (dev_cap->flags & MLX4_DEV_CAP_FLAG_IBOE) ?
+ MLX4_ROCE_MODE_1 : MLX4_ROCE_MODE_INVALID;
+
+ mlx4_get_val(roce_mode.dbdf2val.tbl,
+ dev->persist->rte_pdev, 0, &req_roce_mode);
+ switch (req_roce_mode) {
+ case MLX4_ROCE_MODE_1:
+ req_roce_mode = def_roce_mode;
+ break;
+ case MLX4_ROCE_MODE_1_5:
+ if (!(dev_cap->flags & MLX4_DEV_CAP_FLAG_R_ROCE) &&
+ !(dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_ROCEV2))
+ req_roce_mode = def_roce_mode;
+ break;
+ case MLX4_ROCE_MODE_2:
+ if (!(dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_ROCEV2))
+ req_roce_mode = def_roce_mode;
+ break;
+ case MLX4_ROCE_MODE_1_5_PLUS_2:
+ if (!(dev_cap->flags & MLX4_DEV_CAP_FLAG_R_ROCE) ||
+ !dev->caps.roce_addr_support)
+ req_roce_mode = def_roce_mode;
+ break;
+ case MLX4_ROCE_MODE_1_PLUS_2:
+ if (!(dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2))
+ req_roce_mode = def_roce_mode;
+ break;
+ default:
+ req_roce_mode = def_roce_mode;
+ }
+ dev->caps.roce_mode = req_roce_mode;
+ pr_info("mlx4_core: device is working in RoCE mode: %s\n",
+ mlx4_roce_mode_to_str(dev->caps.roce_mode));
+
+ mlx4_get_val(ud_gid_type.dbdf2val.tbl, dev->persist->rte_pdev, 0, &req_ud_gid_type);
+
+ if (mlx4_verify_supported_gid_type(dev, req_ud_gid_type, &alt_gid_type)) {
+ pr_warn("mlx4_core: gid_type %d for UD QPs is not supported by the device, "
+ "gid_type %d was chosen instead\n", req_ud_gid_type, alt_gid_type);
+ req_ud_gid_type = alt_gid_type;
+ }
+ dev->caps.ud_gid_type = req_ud_gid_type;
+ pr_info("mlx4_core: UD QP Gid type is: %s\n",
+ mlx4_roce_gid_type_to_str(dev->caps.ud_gid_type));
+ dev->caps.rr_proto = mlx4_roce_proto_config;
+}
+
+static int mlx4_slave_cap(struct mlx4_dev *dev)
+{
+ int err;
+ u32 page_size;
+ struct mlx4_dev_cap *dev_cap = NULL;
+ struct mlx4_func_cap *func_cap = NULL;
+ struct mlx4_init_hca_param *hca_param = NULL;
+
+ hca_param = kzalloc(sizeof(*hca_param), GFP_KERNEL);
+ func_cap = kzalloc(sizeof(*func_cap), GFP_KERNEL);
+ dev_cap = kzalloc(sizeof(*dev_cap), GFP_KERNEL);
+ if (!hca_param || !func_cap || !dev_cap) {
+ mlx4_err(dev, "Failed to allocate memory for slave_cap\n");
+ err = -ENOMEM;
+ goto free_mem;
+ }
+
+ err = mlx4_QUERY_HCA(dev, hca_param);
+ if (err) {
+ mlx4_err(dev, "QUERY_HCA command failed, aborting\n");
+ goto free_mem;
+ }
+
+ /* Fail if the HCA reports an unknown global capability;
+ * at this time global_caps should always be zero.
+ */
+ if (hca_param->global_caps) {
+ mlx4_err(dev, "Unknown hca global capabilities\n");
+ err = -ENOSYS;
+ goto free_mem;
+ }
+
+ dev->caps.hca_core_clock = hca_param->hca_core_clock;
+
+ dev->caps.max_qp_dest_rdma = 1 << hca_param->log_rd_per_qp;
+ err = mlx4_dev_cap(dev, dev_cap);
+ if (err) {
+ mlx4_err(dev, "QUERY_DEV_CAP command failed, aborting\n");
+ goto free_mem;
+ }
+
+ err = mlx4_QUERY_FW(dev);
+ if (err)
+ mlx4_err(dev, "QUERY_FW command failed: could not get FW version\n");
+
+ page_size = ~dev->caps.page_size_cap + 1;
+ mlx4_warn(dev, "HCA minimum page size:%d\n", page_size);
+ if (page_size > PAGE_SIZE) {
+ mlx4_err(dev, "HCA minimum page size of %d bigger than kernel PAGE_SIZE of %ld, aborting\n",
+ page_size, PAGE_SIZE);
+ err = -ENODEV;
+ goto free_mem;
+ }
+
+ /* slave gets uar page size from QUERY_HCA fw command */
+ dev->caps.uar_page_size = 1 << (hca_param->uar_page_sz + 12);
+
+ /* TODO: relax this assumption */
+ if (dev->caps.uar_page_size != PAGE_SIZE) {
+ mlx4_err(dev, "UAR size:%d != kernel PAGE_SIZE of %ld\n",
+ dev->caps.uar_page_size, PAGE_SIZE);
+ err = -ENODEV;
+ goto free_mem;
+ }
+
+ err = mlx4_QUERY_FUNC_CAP(dev, 0, func_cap);
+ if (err) {
+ mlx4_err(dev, "QUERY_FUNC_CAP general command failed, aborting (%d)\n",
+ err);
+ goto free_mem;
+ }
+
+ if ((func_cap->pf_context_behaviour | PF_CONTEXT_BEHAVIOUR_MASK) !=
+ PF_CONTEXT_BEHAVIOUR_MASK) {
+ mlx4_err(dev, "Unknown pf context behaviour %x known flags %x\n",
+ func_cap->pf_context_behaviour,
+ PF_CONTEXT_BEHAVIOUR_MASK);
+ err = -ENOSYS;
+ goto free_mem;
+ }
+
+ dev->caps.num_ports = func_cap->num_ports;
+ dev->quotas.qp = func_cap->qp_quota;
+ dev->quotas.srq = func_cap->srq_quota;
+ dev->quotas.cq = func_cap->cq_quota;
+ dev->quotas.mpt = func_cap->mpt_quota;
+ dev->quotas.mtt = func_cap->mtt_quota;
+ dev->caps.num_qps = 1 << hca_param->log_num_qps;
+ dev->caps.num_srqs = 1 << hca_param->log_num_srqs;
+ dev->caps.num_cqs = 1 << hca_param->log_num_cqs;
+ dev->caps.num_mpts = 1 << hca_param->log_mpt_sz;
+ dev->caps.num_eqs = func_cap->max_eq;
+ dev->caps.reserved_eqs = func_cap->reserved_eq;
+ dev->caps.reserved_lkey = func_cap->reserved_lkey;
+ dev->caps.num_pds = MLX4_NUM_PDS;
+ dev->caps.num_mgms = 0;
+ dev->caps.num_amgms = 0;
+
+ if (dev->caps.num_ports > MLX4_MAX_PORTS) {
+ mlx4_err(dev, "HCA has %d ports, but we only support %d, aborting\n",
+ dev->caps.num_ports, MLX4_MAX_PORTS);
+ err = -ENODEV;
+ goto free_mem;
+ }
+
+ err = mlx4_slave_special_qp_cap(dev);
+ if (err) {
+ mlx4_err(dev, "Set special QP caps failed. aborting\n");
+ goto free_mem;
+ }
+
+ if (dev->caps.uar_page_size * (dev->caps.num_uars -
+ dev->caps.reserved_uars) >
+ dev->persist->rte_pdev->mem_resource[2].len) {
+ mlx4_err(dev, "HCA reported UAR region size of 0x%x bigger than PCI resource 2 size of 0x%llx, aborting\n",
+ dev->caps.uar_page_size * dev->caps.num_uars,
+ (unsigned long long)
+ dev->persist->rte_pdev->mem_resource[2].len);
+ err = -ENOMEM;
+ goto err_mem;
+ }
+
+ if (hca_param->dev_cap_enabled & MLX4_DEV_CAP_64B_EQE_ENABLED) {
+ dev->caps.eqe_size = 64;
+ dev->caps.eqe_factor = 1;
+ } else {
+ dev->caps.eqe_size = 32;
+ dev->caps.eqe_factor = 0;
+ }
+
+ if (hca_param->dev_cap_enabled & MLX4_DEV_CAP_64B_CQE_ENABLED) {
+ dev->caps.cqe_size = 64;
+ dev->caps.userspace_caps |= MLX4_USER_DEV_CAP_LARGE_CQE;
+ } else {
+ dev->caps.cqe_size = 32;
+ }
+
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ if (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_WQE_FORMAT)
+ dev->caps.userspace_caps |= MLX4_USER_DEV_CAP_WQE_FORMAT;
+#endif
+ if (hca_param->dev_cap_enabled & MLX4_DEV_CAP_EQE_STRIDE_ENABLED) {
+ dev->caps.eqe_size = hca_param->eqe_size;
+ dev->caps.eqe_factor = 0;
+ }
+
+ if (hca_param->dev_cap_enabled & MLX4_DEV_CAP_CQE_STRIDE_ENABLED) {
+ dev->caps.cqe_size = hca_param->cqe_size;
+ /* Users still need to know when the CQE is larger than 32B */
+ dev->caps.userspace_caps |= MLX4_USER_DEV_CAP_LARGE_CQE;
+ }
+
+ dev->caps.flags2 &= ~MLX4_DEV_CAP_FLAG2_TS;
+ mlx4_warn(dev, "Timestamping is not supported in slave mode\n");
+
+ slave_adjust_steering_mode(dev, dev_cap, hca_param);
+ mlx4_dbg(dev, "RSS support for IP fragments is %s\n",
+ hca_param->rss_ip_frags ? "on" : "off");
+
+ if (func_cap->extra_flags & MLX4_QUERY_FUNC_FLAGS_BF_RES_QP &&
+ dev->caps.bf_reg_size)
+ dev->caps.alloc_res_qp_mask |= MLX4_RESERVE_ETH_BF_QP;
+
+ if (func_cap->extra_flags & MLX4_QUERY_FUNC_FLAGS_A0_RES_QP)
+ dev->caps.alloc_res_qp_mask |= MLX4_RESERVE_A0_QP;
+
+ if (func_cap->extra_flags & MLX4_QUERY_FUNC_FLAGS_ROCE_ADDR)
+ dev->caps.roce_addr_support = 1;
+
+ choose_roce_mode(dev, dev_cap);
+
+err_mem:
+ if (err)
+ mlx4_slave_destroy_special_qp_cap(dev);
+free_mem:
+ kfree(hca_param);
+ kfree(func_cap);
+ kfree(dev_cap);
+ return err;
+}
+#ifdef KMOD_DISABLED
+static void mlx4_request_modules(struct mlx4_dev *dev)
+{
+ int port;
+ int has_ib_port = false;
+ int has_eth_port = false;
+#define EN_DRV_NAME "mlx4_en"
+#define IB_DRV_NAME "mlx4_ib"
+
+ for (port = 1; port <= dev->caps.num_ports; port++) {
+ if (dev->caps.port_type[port] == MLX4_PORT_TYPE_IB)
+ has_ib_port = true;
+ else if (dev->caps.port_type[port] == MLX4_PORT_TYPE_ETH)
+ has_eth_port = true;
+ }
+
+ if (has_eth_port)
+ request_module_nowait(EN_DRV_NAME);
+ if (has_ib_port || (dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE))
+ request_module_nowait(IB_DRV_NAME);
+}
+#endif
+
+/*
+ * Change the port configuration of the device.
+ * Every user of this function must hold the port mutex.
+ */
+int mlx4_change_port_types(struct mlx4_dev *dev,
+ enum mlx4_port_type *port_types)
+{
+ int err = 0;
+ int change = 0;
+ int port;
+
+ for (port = 0; port < dev->caps.num_ports; port++) {
+ /* Change the port type only if the new type is different
+ * from the current, and not set to Auto */
+ if (port_types[port] != dev->caps.port_type[port + 1])
+ change = 1;
+ }
+ if (change) {
+ mlx4_unregister_device(dev);
+ for (port = 1; port <= dev->caps.num_ports; port++) {
+ mlx4_CLOSE_PORT(dev, port);
+ dev->caps.port_type[port] = port_types[port - 1];
+ err = mlx4_SET_PORT(dev, port, -1);
+ if (err) {
+ mlx4_err(dev, "Failed to set port %d, aborting\n",
+ port);
+ goto out;
+ }
+ }
+ mlx4_set_port_mask(dev);
+ err = mlx4_register_device(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to register device\n");
+ goto out;
+ }
+ //mlx4_request_modules(dev);
+ }
+
+out:
+ return err;
+}
+
+#ifdef KMOD_DISABLED
+
+static ssize_t show_port_type(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct mlx4_port_info *info = container_of(attr, struct mlx4_port_info,
+ port_attr);
+ struct mlx4_dev *mdev = info->dev;
+ char type[8];
+
+ sprintf(type, "%s",
+ (mdev->caps.port_type[info->port] == MLX4_PORT_TYPE_IB) ?
+ "ib" : "eth");
+ if (mdev->caps.possible_type[info->port] == MLX4_PORT_TYPE_AUTO)
+ sprintf(buf, "auto (%s)\n", type);
+ else
+ sprintf(buf, "%s\n", type);
+
+ return strlen(buf);
+}
+
+
+static ssize_t set_port_type(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct mlx4_port_info *info = container_of(attr, struct mlx4_port_info,
+ port_attr);
+ struct mlx4_dev *mdev = info->dev;
+ struct mlx4_priv *priv = mlx4_priv(mdev);
+ enum mlx4_port_type types[MLX4_MAX_PORTS];
+ enum mlx4_port_type new_types[MLX4_MAX_PORTS];
+ static DEFINE_MUTEX(set_port_type_mutex);
+ int i;
+ int err = 0;
+
+ mutex_lock(&set_port_type_mutex);
+
+ if (!strcmp(buf, "ib\n"))
+ info->tmp_type = MLX4_PORT_TYPE_IB;
+ else if (!strcmp(buf, "eth\n"))
+ info->tmp_type = MLX4_PORT_TYPE_ETH;
+ else if (!strcmp(buf, "auto\n"))
+ info->tmp_type = MLX4_PORT_TYPE_AUTO;
+ else {
+ mlx4_err(mdev, "%s is not a supported port type\n", buf);
+ err = -EINVAL;
+ goto err_out;
+ }
+
+ if ((info->tmp_type & mdev->caps.supported_type[info->port]) !=
+ info->tmp_type) {
+ mlx4_err(mdev,
+ "Requested port type for port %d is not supported on this HCA\n",
+ info->port);
+ err = -EINVAL;
+ goto err_out;
+ }
+
+ mlx4_stop_sense(mdev);
+ mutex_lock(&priv->port_mutex);
+ /* Possible type is always the one that was delivered */
+ mdev->caps.possible_type[info->port] = info->tmp_type;
+
+ for (i = 0; i < mdev->caps.num_ports; i++) {
+ types[i] = priv->port[i+1].tmp_type ? priv->port[i+1].tmp_type :
+ mdev->caps.possible_type[i+1];
+ if (types[i] == MLX4_PORT_TYPE_AUTO)
+ types[i] = mdev->caps.port_type[i+1];
+ }
+
+ if (!(mdev->caps.flags & MLX4_DEV_CAP_FLAG_DPDP) &&
+ !(mdev->caps.flags & MLX4_DEV_CAP_FLAG_SENSE_SUPPORT)) {
+ for (i = 1; i <= mdev->caps.num_ports; i++) {
+ if (mdev->caps.possible_type[i] == MLX4_PORT_TYPE_AUTO) {
+ mdev->caps.possible_type[i] = mdev->caps.port_type[i];
+ err = -EINVAL;
+ }
+ }
+ }
+ if (err) {
+ mlx4_err(mdev, "Auto sensing is not supported on this HCA. Set only 'eth' or 'ib' for both ports (should be the same)\n");
+ goto out;
+ }
+
+ mlx4_do_sense_ports(mdev, new_types, types);
+
+ err = mlx4_check_port_params(mdev, new_types);
+ if (err)
+ goto out;
+
+ /* We are about to apply the changes after the configuration
+ * was verified, no need to remember the temporary types
+ * any more */
+ for (i = 0; i < mdev->caps.num_ports; i++)
+ priv->port[i + 1].tmp_type = 0;
+
+ err = mlx4_change_port_types(mdev, new_types);
+
+out:
+ mlx4_start_sense(mdev);
+ mutex_unlock(&priv->port_mutex);
+err_out:
+ mutex_unlock(&set_port_type_mutex);
+
+ return err ? err : count;
+}
+
+#endif
+
+enum ibta_mtu {
+ IB_MTU_256 = 1,
+ IB_MTU_512 = 2,
+ IB_MTU_1024 = 3,
+ IB_MTU_2048 = 4,
+ IB_MTU_4096 = 5
+};
+
+static inline int int_to_ibta_mtu(int mtu)
+{
+ switch (mtu) {
+ case 256: return IB_MTU_256;
+ case 512: return IB_MTU_512;
+ case 1024: return IB_MTU_1024;
+ case 2048: return IB_MTU_2048;
+ case 4096: return IB_MTU_4096;
+ default: return -1;
+ }
+}
+
+static inline int ibta_mtu_to_int(enum ibta_mtu mtu)
+{
+ switch (mtu) {
+ case IB_MTU_256: return 256;
+ case IB_MTU_512: return 512;
+ case IB_MTU_1024: return 1024;
+ case IB_MTU_2048: return 2048;
+ case IB_MTU_4096: return 4096;
+ default: return -1;
+ }
+}
+
+#ifdef KMOD_DISABLED
+
+static ssize_t show_port_ib_mtu(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct mlx4_port_info *info = container_of(attr, struct mlx4_port_info,
+ port_mtu_attr);
+ struct mlx4_dev *mdev = info->dev;
+
+ if (mdev->caps.port_type[info->port] == MLX4_PORT_TYPE_ETH)
+ mlx4_warn(mdev, "port level mtu is only used for IB ports\n");
+
+ sprintf(buf, "%d\n",
+ ibta_mtu_to_int(mdev->caps.port_ib_mtu[info->port]));
+ return strlen(buf);
+}
+
+
+static ssize_t set_port_ib_mtu(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct mlx4_port_info *info = container_of(attr, struct mlx4_port_info,
+ port_mtu_attr);
+ struct mlx4_dev *mdev = info->dev;
+ struct mlx4_priv *priv = mlx4_priv(mdev);
+ int err, port, mtu, ibta_mtu = -1;
+
+ if (mdev->caps.port_type[info->port] == MLX4_PORT_TYPE_ETH) {
+ mlx4_warn(mdev, "port level mtu is only used for IB ports\n");
+ return -EINVAL;
+ }
+
+ mtu = atoi(buf);
+ ibta_mtu = int_to_ibta_mtu(mtu);
+
+ if (ibta_mtu < 0) {
+ mlx4_err(mdev, "%s is not a valid IBTA MTU\n", buf);
+ return -EINVAL;
+ }
+
+ mdev->caps.port_ib_mtu[info->port] = ibta_mtu;
+
+ mlx4_stop_sense(mdev);
+ mutex_lock(&priv->port_mutex);
+ mlx4_unregister_device(mdev);
+ for (port = 1; port <= mdev->caps.num_ports; port++) {
+ mlx4_CLOSE_PORT(mdev, port);
+ err = mlx4_SET_PORT(mdev, port, -1);
+ if (err) {
+ mlx4_err(mdev, "Failed to set port %d, aborting\n",
+ port);
+ goto err_set_port;
+ }
+ }
+ err = mlx4_register_device(mdev);
+err_set_port:
+ mutex_unlock(&priv->port_mutex);
+ mlx4_start_sense(mdev);
+ return err ? err : count;
+}
+#endif
+
+int mlx4_bond(struct mlx4_dev *dev)
+{
+ int ret = 0;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ mutex_lock(&priv->bond_mutex);
+
+ if (!mlx4_is_bonded(dev))
+ ret = mlx4_do_bond(dev, true);
+ else
+ ret = 0;
+
+ mutex_unlock(&priv->bond_mutex);
+ if (ret)
+ mlx4_err(dev, "Failed to bond device: %d\n", ret);
+ else
+ mlx4_dbg(dev, "Device is bonded\n");
+ return ret;
+}
+EXPORT_SYMBOL_GPL(mlx4_bond);
+
+int mlx4_unbond(struct mlx4_dev *dev)
+{
+ int ret = 0;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ mutex_lock(&priv->bond_mutex);
+
+ if (mlx4_is_bonded(dev))
+ ret = mlx4_do_bond(dev, false);
+
+ mutex_unlock(&priv->bond_mutex);
+ if (ret)
+ mlx4_err(dev, "Failed to unbond device: %d\n", ret);
+ else
+ mlx4_dbg(dev, "Device is unbonded\n");
+ return ret;
+}
+EXPORT_SYMBOL_GPL(mlx4_unbond);
+
+
+int mlx4_port_map_set(struct mlx4_dev *dev, struct mlx4_port_map *v2p)
+{
+ u8 port1 = v2p->port1;
+ u8 port2 = v2p->port2;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int err;
+
+ if (!(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_PORT_REMAP))
+ return -ENOTSUPP;
+
+ mutex_lock(&priv->bond_mutex);
+
+ /* zero means keep current mapping for this port */
+ if (port1 == 0)
+ port1 = priv->v2p.port1;
+ if (port2 == 0)
+ port2 = priv->v2p.port2;
+
+ if ((port1 < 1) || (port1 > MLX4_MAX_PORTS) ||
+ (port2 < 1) || (port2 > MLX4_MAX_PORTS) ||
+ (port1 == 2 && port2 == 1)) {
+ /* Besides the boundary checks, cross mapping makes
+ * no sense and is therefore not allowed */
+ err = -EINVAL;
+ } else if ((port1 == priv->v2p.port1) &&
+ (port2 == priv->v2p.port2)) {
+ err = 0;
+ } else {
+ err = mlx4_virt2phy_port_map(dev, port1, port2);
+ if (!err) {
+ mlx4_dbg(dev, "port map changed: [%d][%d]\n",
+ port1, port2);
+ priv->v2p.port1 = port1;
+ priv->v2p.port2 = port2;
+ } else {
+ mlx4_err(dev, "Failed to change port map: %d\n", err);
+ }
+ }
+
+ mutex_unlock(&priv->bond_mutex);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_port_map_set);
+
+int mlx4_port_map_get(struct mlx4_dev *dev, u8 vport, u8 *pport)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ if (!pport)
+ return -EINVAL;
+ *pport = 0;
+
+ if (vport == 1)
+ *pport = priv->v2p.port1;
+ else if (vport == 2)
+ *pport = priv->v2p.port2;
+ if (!*pport)
+ return -EINVAL;
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_port_map_get);
+
+static int mlx4_load_fw(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int err;
+
+ priv->fw.fw_icm = mlx4_alloc_icm(dev, priv->fw.fw_pages,
+ GFP_HIGHUSER | __GFP_NOWARN/*, 0*/);
+ if (!priv->fw.fw_icm) {
+ mlx4_err(dev, "Couldn't allocate FW area, aborting\n");
+ return -ENOMEM;
+ }
+
+ err = mlx4_MAP_FA(dev, priv->fw.fw_icm);
+ if (err) {
+ mlx4_err(dev, "MAP_FA command failed, aborting\n");
+ goto err_free;
+ }
+
+ err = mlx4_RUN_FW(dev);
+ if (err) {
+ mlx4_err(dev, "RUN_FW command failed, aborting\n");
+ goto err_unmap_fa;
+ }
+
+ return 0;
+
+err_unmap_fa:
+ mlx4_UNMAP_FA(dev);
+
+err_free:
+ mlx4_free_icm(dev, priv->fw.fw_icm/*, 0*/);
+ return err;
+}
+
+static int mlx4_init_cmpt_table(struct mlx4_dev *dev, u64 cmpt_base,
+ int cmpt_entry_sz)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int err;
+ int num_eqs;
+
+ err = mlx4_init_icm_table(dev, &priv->qp_table.cmpt_table,
+ cmpt_base +
+ ((u64) (MLX4_CMPT_TYPE_QP *
+ cmpt_entry_sz) << MLX4_CMPT_SHIFT),
+ cmpt_entry_sz, dev->caps.num_qps,
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW],
+ 0, 0);
+ if (err)
+ goto err;
+
+ err = mlx4_init_icm_table(dev, &priv->srq_table.cmpt_table,
+ cmpt_base +
+ ((u64) (MLX4_CMPT_TYPE_SRQ *
+ cmpt_entry_sz) << MLX4_CMPT_SHIFT),
+ cmpt_entry_sz, dev->caps.num_srqs,
+ dev->caps.reserved_srqs, 0, 0);
+ if (err)
+ goto err_qp;
+
+ err = mlx4_init_icm_table(dev, &priv->cq_table.cmpt_table,
+ cmpt_base +
+ ((u64) (MLX4_CMPT_TYPE_CQ *
+ cmpt_entry_sz) << MLX4_CMPT_SHIFT),
+ cmpt_entry_sz, dev->caps.num_cqs,
+ dev->caps.reserved_cqs, 0, 0);
+ if (err)
+ goto err_srq;
+
+ num_eqs = dev->phys_caps.num_phys_eqs;
+ err = mlx4_init_icm_table(dev, &priv->eq_table.cmpt_table,
+ cmpt_base +
+ ((u64) (MLX4_CMPT_TYPE_EQ *
+ cmpt_entry_sz) << MLX4_CMPT_SHIFT),
+ cmpt_entry_sz, num_eqs, num_eqs, 0, 0);
+ if (err)
+ goto err_cq;
+
+ return 0;
+
+err_cq:
+ mlx4_cleanup_icm_table(dev, &priv->cq_table.cmpt_table);
+
+err_srq:
+ mlx4_cleanup_icm_table(dev, &priv->srq_table.cmpt_table);
+
+err_qp:
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.cmpt_table);
+
+err:
+ return err;
+}
+
+static int mlx4_init_icm(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap,
+ struct mlx4_init_hca_param *init_hca, u64 icm_size)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ u64 aux_pages;
+ int num_eqs;
+ int err;
+
+ err = mlx4_SET_ICM_SIZE(dev, icm_size, &aux_pages);
+ if (err) {
+ mlx4_err(dev, "SET_ICM_SIZE command failed, aborting\n");
+ return err;
+ }
+
+ mlx4_dbg(dev, "%lld KB of HCA context requires %lld KB aux memory\n",
+ (unsigned long long) icm_size >> 10,
+ (unsigned long long) aux_pages << 2);
+
+ priv->fw.aux_icm = mlx4_alloc_icm(dev, aux_pages,
+ GFP_HIGHUSER | __GFP_NOWARN/*, 0*/);
+ if (!priv->fw.aux_icm) {
+ mlx4_err(dev, "Couldn't allocate aux memory, aborting\n");
+ return -ENOMEM;
+ }
+
+ err = mlx4_MAP_ICM_AUX(dev, priv->fw.aux_icm);
+ if (err) {
+ mlx4_err(dev, "MAP_ICM_AUX command failed, aborting\n");
+ goto err_free_aux;
+ }
+
+ err = mlx4_init_cmpt_table(dev, init_hca->cmpt_base, dev_cap->cmpt_entry_sz);
+ if (err) {
+ mlx4_err(dev, "Failed to map cMPT context memory, aborting\n");
+ goto err_unmap_aux;
+ }
+
+
+ num_eqs = dev->phys_caps.num_phys_eqs;
+ err = mlx4_init_icm_table(dev, &priv->eq_table.table,
+ init_hca->eqc_base, dev_cap->eqc_entry_sz,
+ num_eqs, num_eqs, 0, 0);
+ if (err) {
+ mlx4_err(dev, "Failed to map EQ context memory, aborting\n");
+ goto err_unmap_cmpt;
+ }
+
+ /*
+ * Reserved MTT entries must be aligned up to a cacheline
+ * boundary, since the FW will write to them, while the driver
+ * writes to all other MTT entries. (The variable
+ * dev->caps.mtt_entry_sz below is really the MTT segment
+ * size, not the raw entry size)
+ */
+ dev->caps.reserved_mtts =
+ ALIGN(dev->caps.reserved_mtts * dev->caps.mtt_entry_sz,
+ RTE_CACHE_LINE_SIZE) / dev->caps.mtt_entry_sz;
+
+ err = mlx4_init_icm_table(dev, &priv->mr_table.mtt_table,
+ init_hca->mtt_base,
+ dev->caps.mtt_entry_sz,
+ dev->caps.num_mtts,
+ dev->caps.reserved_mtts, 1, 0);
+ if (err) {
+ mlx4_err(dev, "Failed to map MTT context memory, aborting\n");
+ goto err_unmap_eq;
+ }
+
+ err = mlx4_init_icm_table(dev, &priv->mr_table.dmpt_table,
+ init_hca->dmpt_base,
+ dev_cap->dmpt_entry_sz,
+ dev->caps.num_mpts,
+ dev->caps.reserved_mrws, 1, 1);
+ if (err) {
+ mlx4_err(dev, "Failed to map dMPT context memory, aborting\n");
+ goto err_unmap_mtt;
+ }
+
+ err = mlx4_init_icm_table(dev, &priv->qp_table.qp_table,
+ init_hca->qpc_base,
+ dev_cap->qpc_entry_sz,
+ dev->caps.num_qps,
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW],
+ 0, 0);
+ if (err) {
+ mlx4_err(dev, "Failed to map QP context memory, aborting\n");
+ goto err_unmap_dmpt;
+ }
+
+ err = mlx4_init_icm_table(dev, &priv->qp_table.auxc_table,
+ init_hca->auxc_base,
+ dev_cap->aux_entry_sz,
+ dev->caps.num_qps,
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW],
+ 0, 0);
+ if (err) {
+ mlx4_err(dev, "Failed to map AUXC context memory, aborting\n");
+ goto err_unmap_qp;
+ }
+
+ err = mlx4_init_icm_table(dev, &priv->qp_table.altc_table,
+ init_hca->altc_base,
+ dev_cap->altc_entry_sz,
+ dev->caps.num_qps,
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW],
+ 0, 0);
+ if (err) {
+ mlx4_err(dev, "Failed to map ALTC context memory, aborting\n");
+ goto err_unmap_auxc;
+ }
+
+ err = mlx4_init_icm_table(dev, &priv->qp_table.rdmarc_table,
+ init_hca->rdmarc_base,
+ dev_cap->rdmarc_entry_sz << priv->qp_table.rdmarc_shift,
+ dev->caps.num_qps,
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW],
+ 0, 0);
+ if (err) {
+ mlx4_err(dev, "Failed to map RDMARC context memory, aborting\n");
+ goto err_unmap_altc;
+ }
+
+ err = mlx4_init_icm_table(dev, &priv->cq_table.table,
+ init_hca->cqc_base,
+ dev_cap->cqc_entry_sz,
+ dev->caps.num_cqs,
+ dev->caps.reserved_cqs, 0, 0);
+ if (err) {
+ mlx4_err(dev, "Failed to map CQ context memory, aborting\n");
+ goto err_unmap_rdmarc;
+ }
+
+ err = mlx4_init_icm_table(dev, &priv->srq_table.table,
+ init_hca->srqc_base,
+ dev_cap->srq_entry_sz,
+ dev->caps.num_srqs,
+ dev->caps.reserved_srqs, 0, 0);
+ if (err) {
+ mlx4_err(dev, "Failed to map SRQ context memory, aborting\n");
+ goto err_unmap_cq;
+ }
+
+ /*
+ * For flow steering device managed mode it is required to use
+ * mlx4_init_icm_table. For B0 steering mode it's not strictly
+ * required, but for simplicity just map the whole multicast
+ * group table now. The table isn't very big and it's a lot
+ * easier than trying to track ref counts.
+ */
+ err = mlx4_init_icm_table(dev, &priv->mcg_table.table,
+ init_hca->mc_base,
+ mlx4_get_mgm_entry_size(dev),
+ dev->caps.num_mgms + dev->caps.num_amgms,
+ dev->caps.num_mgms + dev->caps.num_amgms,
+ 0, 0);
+ if (err) {
+ mlx4_err(dev, "Failed to map MCG context memory, aborting\n");
+ goto err_unmap_srq;
+ }
+
+ return 0;
+
+err_unmap_srq:
+ mlx4_cleanup_icm_table(dev, &priv->srq_table.table);
+
+err_unmap_cq:
+ mlx4_cleanup_icm_table(dev, &priv->cq_table.table);
+
+err_unmap_rdmarc:
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.rdmarc_table);
+
+err_unmap_altc:
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.altc_table);
+
+err_unmap_auxc:
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.auxc_table);
+
+err_unmap_qp:
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.qp_table);
+
+err_unmap_dmpt:
+ mlx4_cleanup_icm_table(dev, &priv->mr_table.dmpt_table);
+
+err_unmap_mtt:
+ mlx4_cleanup_icm_table(dev, &priv->mr_table.mtt_table);
+
+err_unmap_eq:
+ mlx4_cleanup_icm_table(dev, &priv->eq_table.table);
+
+err_unmap_cmpt:
+ mlx4_cleanup_icm_table(dev, &priv->eq_table.cmpt_table);
+ mlx4_cleanup_icm_table(dev, &priv->cq_table.cmpt_table);
+ mlx4_cleanup_icm_table(dev, &priv->srq_table.cmpt_table);
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.cmpt_table);
+
+err_unmap_aux:
+ mlx4_UNMAP_ICM_AUX(dev);
+
+err_free_aux:
+ mlx4_free_icm(dev, priv->fw.aux_icm/*, 0*/);
+
+ return err;
+}
+
+static void mlx4_free_icms(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ mlx4_cleanup_icm_table(dev, &priv->mcg_table.table);
+ mlx4_cleanup_icm_table(dev, &priv->srq_table.table);
+ mlx4_cleanup_icm_table(dev, &priv->cq_table.table);
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.rdmarc_table);
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.altc_table);
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.auxc_table);
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.qp_table);
+ mlx4_cleanup_icm_table(dev, &priv->mr_table.dmpt_table);
+ mlx4_cleanup_icm_table(dev, &priv->mr_table.mtt_table);
+ mlx4_cleanup_icm_table(dev, &priv->eq_table.table);
+ mlx4_cleanup_icm_table(dev, &priv->eq_table.cmpt_table);
+ mlx4_cleanup_icm_table(dev, &priv->cq_table.cmpt_table);
+ mlx4_cleanup_icm_table(dev, &priv->srq_table.cmpt_table);
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.cmpt_table);
+
+ mlx4_UNMAP_ICM_AUX(dev);
+ mlx4_free_icm(dev, priv->fw.aux_icm/*, 0*/);
+}
+
+static void mlx4_slave_exit(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ mutex_lock(&priv->cmd.slave_cmd_mutex);
+ if (mlx4_comm_cmd(dev, MLX4_COMM_CMD_RESET, 0, MLX4_COMM_CMD_NA_OP,
+ MLX4_COMM_TIME))
+ mlx4_warn(dev, "Failed to close slave function\n");
+ mutex_unlock(&priv->cmd.slave_cmd_mutex);
+}
+
+static int map_bf_area(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ void* bf_start;
+ size_t bf_len;
+ int err = 0;
+
+ if (!dev->caps.bf_reg_size)
+ return -ENXIO;
+
+ bf_start = RTE_PTR_ADD(dev->persist->rte_pdev->mem_resource[2].addr,dev->caps.num_uars << PAGE_SHIFT);
+ bf_len = dev->persist->rte_pdev->mem_resource[2].len -
+ (dev->caps.num_uars << PAGE_SHIFT);
+ priv->bf_mapping_addr = bf_start;
+ priv->bf_mapping_len = bf_len;
+ priv->bf_mapping_phys_addr = rte_mem_virt2phy(bf_start);
+ if (!priv->bf_mapping_addr)
+ err = -ENOMEM;
+
+ return err;
+}
+
+static void unmap_bf_area(struct mlx4_dev *dev)
+{
+ mlx4_priv(dev)->bf_mapping_addr = 0;
+ //if (mlx4_priv(dev)->bf_mapping)
+// io_mapping_free(mlx4_priv(dev)->bf_mapping);
+}
+
+#ifdef KMOD_MODIFIED
+uint64_t mlx4_read_clock(struct mlx4_dev *dev)
+{
+ u32 clockhi, clocklo, clockhi1;
+ uint64_t cycles;
+ int i;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ for (i = 0; i < 10; i++) {
+ clockhi = swab32(readl(priv->clock_mapping));
+ clocklo = swab32(readl(priv->clock_mapping + 4));
+ clockhi1 = swab32(readl(priv->clock_mapping));
+ if (clockhi == clockhi1)
+ break;
+ }
+
+ cycles = (u64) clockhi << 32 | (u64) clocklo;
+
+ return cycles;
+}
+EXPORT_SYMBOL_GPL(mlx4_read_clock);
+#endif
+
+
+int mlx4_get_internal_clock_params(struct mlx4_dev *dev,
+ struct mlx4_clock_params *params)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ if (mlx4_is_slave(dev))
+ return -ENOTSUPP;
+ if (!params)
+ return -EINVAL;
+
+ params->bar = priv->fw.clock_bar;
+ params->offset = priv->fw.clock_offset;
+ params->size = MLX4_CLOCK_SIZE;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_get_internal_clock_params);
+
+static int map_internal_clock(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+#ifdef KMOD_MODIFIED
+ assert(dev->persist->rte_pdev->mem_resource[priv->fw.clock_bar].len >= (priv->fw.clock_offset + MLX4_CLOCK_SIZE));
+ priv->clock_mapping = RTE_PTR_ADD(dev->persist->rte_pdev->mem_resource[priv->fw.clock_bar].addr,priv->fw.clock_offset);
+#else
+ priv->clock_mapping =
+ ioremap(pci_resource_start(dev->persist->pdev,
+ priv->fw.clock_bar) +
+ priv->fw.clock_offset, MLX4_CLOCK_SIZE);
+#endif
+
+ if (!priv->clock_mapping)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void unmap_internal_clock(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ if (priv->clock_mapping)
+ priv->clock_mapping = 0;
+}
+
+static void mlx4_close_hca(struct mlx4_dev *dev)
+{
+ unmap_internal_clock(dev);
+ unmap_bf_area(dev);
+ if (mlx4_is_slave(dev))
+ mlx4_slave_exit(dev);
+ else {
+ mlx4_CLOSE_HCA(dev, 0);
+ mlx4_free_icms(dev);
+ }
+}
+
+static void mlx4_close_fw(struct mlx4_dev *dev)
+{
+ if (!mlx4_is_slave(dev)) {
+ mlx4_UNMAP_FA(dev);
+ mlx4_free_icm(dev, mlx4_priv(dev)->fw.fw_icm/*, 0*/);
+ }
+}
+
+static int mlx4_comm_check_offline(struct mlx4_dev *dev)
+{
+#define COMM_CHAN_OFFLINE_OFFSET 0x09
+
+ u32 comm_flags;
+ u32 offline_bit;
+ unsigned long end;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ end = msecs_to_jiffies(MLX4_COMM_OFFLINE_TIME_OUT) + jiffies;
+ while (time_before(jiffies, end)) {
+ comm_flags = swab32(readl((__iomem char *)priv->mfunc.comm +
+ MLX4_COMM_CHAN_FLAGS));
+ offline_bit = (comm_flags &
+ (u32)(1 << COMM_CHAN_OFFLINE_OFFSET));
+ if (!offline_bit)
+ return 0;
+ /* There are cases as part of AER/Reset flow that PF needs
+ * around 100 msec to load. We therefore sleep for 100 msec
+ * to allow other tasks to make use of that CPU during this
+ * time interval.
+ */
+ msleep(100);
+ }
+ mlx4_err(dev, "Communication channel is offline.\n");
+ return -EIO;
+}
+
+static void mlx4_reset_vf_support(struct mlx4_dev *dev)
+{
+#define COMM_CHAN_RST_OFFSET 0x1e
+
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ u32 comm_rst;
+ u32 comm_caps;
+
+ comm_caps = swab32(readl((__iomem char *)priv->mfunc.comm +
+ MLX4_COMM_CHAN_CAPS));
+ comm_rst = (comm_caps & (u32)(1 << COMM_CHAN_RST_OFFSET));
+
+ if (comm_rst)
+ dev->caps.vf_caps |= MLX4_VF_CAP_FLAG_RESET;
+}
+
+static int mlx4_init_slave(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ u64 dma = (u64) priv->mfunc.vhcr_dma;
+ int ret_from_reset = 0;
+ u32 slave_read;
+ u32 cmd_channel_ver;
+
+ if (atomic_read(&pf_loading)) {
+ mlx4_warn(dev, "PF is not ready - Deferring probe\n");
+ return -EPROBE_DEFER;
+ }
+
+ mutex_lock(&priv->cmd.slave_cmd_mutex);
+ priv->cmd.max_cmds = 1;
+ if (mlx4_comm_check_offline(dev)) {
+ mlx4_err(dev, "PF is not responsive, skipping initialization\n");
+ goto err_offline;
+ }
+
+ mlx4_reset_vf_support(dev);
+ mlx4_warn(dev, "Sending reset\n");
+ ret_from_reset = mlx4_comm_cmd(dev, MLX4_COMM_CMD_RESET, 0,
+ MLX4_COMM_CMD_NA_OP, MLX4_COMM_TIME);
+ /* If we are in the middle of FLR, the slave will try
+ * NUM_OF_RESET_RETRIES times before leaving. */
+ if (ret_from_reset) {
+ if (MLX4_DELAY_RESET_SLAVE == ret_from_reset) {
+ mlx4_warn(dev, "slave is currently in the middle of FLR - Deferring probe\n");
+ mutex_unlock(&priv->cmd.slave_cmd_mutex);
+ return -EPROBE_DEFER;
+ } else
+ goto err;
+ }
+
+ /* check the driver version - the slave I/F revision
+ * must match the master's */
+ slave_read = swab32(readl(&priv->mfunc.comm->slave_read));
+ cmd_channel_ver = mlx4_comm_get_version();
+
+ if (MLX4_COMM_GET_IF_REV(cmd_channel_ver) !=
+ MLX4_COMM_GET_IF_REV(slave_read)) {
+ mlx4_err(dev, "slave driver version is not supported by the master\n");
+ goto err;
+ }
+
+ mlx4_warn(dev, "Sending vhcr0\n");
+ if (mlx4_comm_cmd(dev, MLX4_COMM_CMD_VHCR0, dma >> 48,
+ MLX4_COMM_CMD_NA_OP, MLX4_COMM_TIME))
+ goto err;
+ if (mlx4_comm_cmd(dev, MLX4_COMM_CMD_VHCR1, dma >> 32,
+ MLX4_COMM_CMD_NA_OP, MLX4_COMM_TIME))
+ goto err;
+ if (mlx4_comm_cmd(dev, MLX4_COMM_CMD_VHCR2, dma >> 16,
+ MLX4_COMM_CMD_NA_OP, MLX4_COMM_TIME))
+ goto err;
+ if (mlx4_comm_cmd(dev, MLX4_COMM_CMD_VHCR_EN, dma,
+ MLX4_COMM_CMD_NA_OP, MLX4_COMM_TIME))
+ goto err;
+
+ mutex_unlock(&priv->cmd.slave_cmd_mutex);
+ return 0;
+
+err:
+ mlx4_comm_cmd(dev, MLX4_COMM_CMD_RESET, 0, MLX4_COMM_CMD_NA_OP, 0);
+err_offline:
+ mutex_unlock(&priv->cmd.slave_cmd_mutex);
+ return -EIO;
+}
+
+static void mlx4_parav_master_pf_caps(struct mlx4_dev *dev)
+{
+ int i;
+
+ for (i = 1; i <= dev->caps.num_ports; i++) {
+ if (dev->caps.port_type[i] == MLX4_PORT_TYPE_ETH)
+ dev->caps.gid_table_len[i] =
+ mlx4_get_slave_num_gids(dev, 0, i);
+ else
+ dev->caps.gid_table_len[i] = 1;
+ dev->caps.pkey_table_len[i] =
+ dev->phys_caps.pkey_phys_table_len[i] - 1;
+ }
+}
+
+static int choose_log_fs_mgm_entry_size(int qp_per_entry)
+{
+ int i = MLX4_MIN_MGM_LOG_ENTRY_SIZE;
+
+ for (i = MLX4_MIN_MGM_LOG_ENTRY_SIZE; i <= MLX4_MAX_MGM_LOG_ENTRY_SIZE;
+ i++) {
+ if (qp_per_entry <= 4 * ((1 << i) / 16 - 2))
+ break;
+ }
+
+ return (i <= MLX4_MAX_MGM_LOG_ENTRY_SIZE) ? i : -1;
+}
+
+static const char *dmfs_high_rate_steering_mode_str(int dmfs_high_steer_mode)
+{
+ switch (dmfs_high_steer_mode) {
+ case MLX4_STEERING_DMFS_A0_DEFAULT:
+ return "default performance";
+
+ case MLX4_STEERING_DMFS_A0_DYNAMIC:
+ return "dynamic hybrid mode";
+
+ case MLX4_STEERING_DMFS_A0_STATIC:
+ return "performance optimized for limited rule configuration (static)";
+
+ case MLX4_STEERING_DMFS_A0_DISABLE:
+ return "disabled performance optimized steering";
+
+ case MLX4_STEERING_DMFS_A0_NOT_SUPPORTED:
+ return "performance optimized steering not supported";
+
+ default:
+ return "Unrecognized mode";
+ }
+}
+
+#define MLX4_DMFS_LOW_QP_COUNT 63
+
+static void choose_steering_mode(struct mlx4_dev *dev,
+ struct mlx4_dev_cap *dev_cap)
+{
+ int mlx4_current_steering_mode = mlx4_log_num_mgm_entry_size;
+ dev->caps.steering_attr = 0;
+
+ if (mlx4_current_steering_mode <= 0) {
+ if (!((-mlx4_current_steering_mode) & MLX4_FORCE_DMFS_IF_NO_NCSI_FS))
+ if (!(dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_FS_EN_NCSI))
+ mlx4_current_steering_mode =
+ MLX4_DEFAULT_MGM_LOG_ENTRY_SIZE;
+
+ if ((-mlx4_current_steering_mode) & MLX4_DISABLE_DMFS_LOW_QP_NUM)
+ if (dev_cap->fs_max_num_qp_per_entry <= MLX4_DMFS_LOW_QP_COUNT) {
+ mlx4_warn(dev, "FW supports only %d QPs per mcg entry, "
+ "falling back to B0\n",
+ dev_cap->fs_max_num_qp_per_entry);
+ mlx4_current_steering_mode =
+ MLX4_DEFAULT_MGM_LOG_ENTRY_SIZE;
+ }
+
+ if ((-mlx4_current_steering_mode) & MLX4_DMFS_A0_STEERING) {
+ if (dev->caps.dmfs_high_steer_mode ==
+ MLX4_STEERING_DMFS_A0_NOT_SUPPORTED)
+ mlx4_err(dev, "DMFS high rate mode not supported\n");
+ else
+ dev->caps.dmfs_high_steer_mode =
+ MLX4_STEERING_DMFS_A0_STATIC;
+ }
+ if (dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_DISABLE_SIP_CHECK) {
+ if (-mlx4_current_steering_mode & MLX4_IB_IGNORE_SIP_CHECK)
+ dev->caps.steering_attr |= MLX4_STEERING_ATTR_IB_IGNORE_SIP;
+ if (-mlx4_current_steering_mode & MLX4_ETH_IGNORE_SIP_CHECK)
+ dev->caps.steering_attr |= MLX4_STEERING_ATTR_ETH_IGNORE_SIP;
+ }
+ }
+
+ if (mlx4_current_steering_mode <= 0 &&
+ dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_FS_EN &&
+ (!mlx4_is_mfunc(dev) ||
+ (dev_cap->fs_max_num_qp_per_entry >=
+ (dev->persist->num_vfs + 1))) &&
+ choose_log_fs_mgm_entry_size(dev_cap->fs_max_num_qp_per_entry) >=
+ MLX4_MIN_MGM_LOG_ENTRY_SIZE) {
+ dev->oper_log_mgm_entry_size =
+ choose_log_fs_mgm_entry_size(dev_cap->fs_max_num_qp_per_entry);
+ dev->caps.steering_mode = MLX4_STEERING_MODE_DEVICE_MANAGED;
+ dev->caps.num_qp_per_mgm = dev_cap->fs_max_num_qp_per_entry;
+
+ dev->caps.steering_attr |= MLX4_STEERING_ATTR_DMFS_EN;
+
+ if (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_DMFS_IPOIB &&
+ (!((-mlx4_current_steering_mode) & MLX4_DMFS_ETH_ONLY)))
+ dev->caps.steering_attr |= MLX4_STEERING_ATTR_DMFS_IPOIB;
+
+ dev->caps.fs_log_max_ucast_qp_range_size =
+ dev_cap->fs_log_max_ucast_qp_range_size;
+ } else {
+ if (dev->caps.dmfs_high_steer_mode !=
+ MLX4_STEERING_DMFS_A0_NOT_SUPPORTED)
+ dev->caps.dmfs_high_steer_mode = MLX4_STEERING_DMFS_A0_DISABLE;
+ if (dev->caps.flags & MLX4_DEV_CAP_FLAG_VEP_UC_STEER &&
+ dev->caps.flags & MLX4_DEV_CAP_FLAG_VEP_MC_STEER)
+ dev->caps.steering_mode = MLX4_STEERING_MODE_B0;
+ else {
+ dev->caps.steering_mode = MLX4_STEERING_MODE_A0;
+
+ if (dev->caps.flags & MLX4_DEV_CAP_FLAG_VEP_UC_STEER ||
+ dev->caps.flags & MLX4_DEV_CAP_FLAG_VEP_MC_STEER)
+ mlx4_warn(dev, "Must have both UC_STEER and MC_STEER flags set to use B0 steering - falling back to A0 steering mode\n");
+ }
+ dev->oper_log_mgm_entry_size =
+ mlx4_current_steering_mode > 0 ?
+ mlx4_current_steering_mode :
+ MLX4_DEFAULT_MGM_LOG_ENTRY_SIZE;
+ dev->caps.num_qp_per_mgm = mlx4_get_qp_per_mgm(dev);
+ }
+ mlx4_dbg(dev, "Steering mode is: %s, oper_log_mgm_entry_size = %d, modparam log_num_mgm_entry_size = %d\n",
+ mlx4_steering_mode_str(dev->caps.steering_mode),
+ dev->oper_log_mgm_entry_size,
+ mlx4_current_steering_mode);
+}
+
+static void choose_tunnel_offload_mode(struct mlx4_dev *dev,
+ struct mlx4_dev_cap *dev_cap)
+{
+#ifdef HAVE_VXLAN_ENABLED
+ if (dev->caps.steering_mode == MLX4_STEERING_MODE_DEVICE_MANAGED &&
+ dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_VXLAN_OFFLOADS)
+ dev->caps.tunnel_offload_mode = MLX4_TUNNEL_OFFLOAD_MODE_VXLAN;
+ else
+#endif
+ dev->caps.tunnel_offload_mode = MLX4_TUNNEL_OFFLOAD_MODE_NONE;
+
+ mlx4_dbg(dev, "Tunneling offload mode is: %s\n", (dev->caps.tunnel_offload_mode
+ == MLX4_TUNNEL_OFFLOAD_MODE_VXLAN) ? "vxlan" : "none");
+}
+
+static int mlx4_validate_optimized_steering(struct mlx4_dev *dev)
+{
+ int i;
+ struct mlx4_port_cap port_cap;
+
+ if (dev->caps.dmfs_high_steer_mode == MLX4_STEERING_DMFS_A0_NOT_SUPPORTED)
+ return -EINVAL;
+
+ for (i = 1; i <= dev->caps.num_ports; i++) {
+ if (mlx4_dev_port(dev, i, &port_cap)) {
+ mlx4_err(dev,
+ "QUERY_DEV_CAP command failed, can't verify DMFS high rate steering.\n");
+ } else if ((dev->caps.dmfs_high_steer_mode !=
+ MLX4_STEERING_DMFS_A0_DEFAULT) &&
+ (port_cap.dmfs_optimized_state ==
+ !!(dev->caps.dmfs_high_steer_mode ==
+ MLX4_STEERING_DMFS_A0_DISABLE))) {
+ mlx4_err(dev,
+ "DMFS high rate steer mode differs: driver requested %s but %s in FW.\n",
+ dmfs_high_rate_steering_mode_str(
+ dev->caps.dmfs_high_steer_mode),
+ (port_cap.dmfs_optimized_state ?
+ "enabled" : "disabled"));
+ }
+ }
+
+ return 0;
+}
+
+static int mlx4_init_fw(struct mlx4_dev *dev)
+{
+ struct mlx4_mod_stat_cfg mlx4_cfg;
+ int err = 0;
+
+ if (!mlx4_is_slave(dev)) {
+ err = mlx4_QUERY_FW(dev);
+ if (err) {
+ if (err == -EACCES)
+ mlx4_info(dev, "non-primary physical function, skipping\n");
+ else
+ mlx4_err(dev, "QUERY_FW command failed, aborting\n");
+ return err;
+ }
+
+ err = mlx4_load_fw(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to start FW, aborting\n");
+ return err;
+ }
+
+ mlx4_cfg.log_pg_sz_m = 1;
+ mlx4_cfg.log_pg_sz = 0;
+ err = mlx4_MOD_STAT_CFG(dev, &mlx4_cfg);
+ if (err)
+ mlx4_warn(dev, "Failed to override log_pg_sz parameter\n");
+ }
+
+ return err;
+}
+
+static int mlx4_init_hca(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_adapter adapter;
+ struct mlx4_dev_cap dev_cap;
+ struct mlx4_profile profile;
+ struct mlx4_init_hca_param init_hca;
+ u64 icm_size;
+ struct mlx4_config_dev_params params;
+ int err;
+
+ if (!mlx4_is_slave(dev)) {
+ err = mlx4_dev_cap(dev, &dev_cap);
+ if (err) {
+ mlx4_err(dev, "QUERY_DEV_CAP command failed, aborting\n");
+ return err;
+ }
+
+ choose_steering_mode(dev, &dev_cap);
+ choose_roce_mode(dev, &dev_cap);
+ choose_tunnel_offload_mode(dev, &dev_cap);
+
+ if (dev->caps.dmfs_high_steer_mode == MLX4_STEERING_DMFS_A0_STATIC &&
+ mlx4_is_master(dev))
+ dev->caps.function_caps |= MLX4_FUNC_CAP_DMFS_A0_STATIC;
+
+ err = mlx4_get_phys_port_id(dev);
+ if (err)
+ mlx4_err(dev, "Fail to get physical port id\n");
+
+ if (mlx4_is_master(dev))
+ mlx4_parav_master_pf_caps(dev);
+
+ if (mlx4_low_memory_profile()) {
+ mlx4_info(dev, "Running from within kdump kernel. Using low memory profile\n");
+ /* use old default log_mtts_per_seg */
+ log_mtts_per_seg = ilog2(MLX4_MTT_ENTRY_PER_SEG);
+ profile = low_mem_profile;
+ } else {
+ process_mod_param_profile(&profile);
+ }
+ if (dev->caps.steering_mode ==
+ MLX4_STEERING_MODE_DEVICE_MANAGED)
+ profile.num_mcg = MLX4_FS_NUM_MCG;
+
+ icm_size = mlx4_make_profile(dev, &profile, &dev_cap,
+ &init_hca);
+ if ((long long) icm_size < 0) {
+ err = icm_size;
+ return err;
+ }
+
+ dev->caps.max_fmr_maps = (1 << (32 - ilog2(dev->caps.num_mpts))) - 1;
+
+ init_hca.log_uar_sz = ilog2(dev->caps.num_uars);
+ init_hca.uar_page_sz = PAGE_SHIFT - 12;
+ init_hca.mw_enabled = 0;
+ if (dev->caps.flags & MLX4_DEV_CAP_FLAG_MEM_WINDOW ||
+ dev->caps.bmme_flags & MLX4_BMME_FLAG_TYPE_2_WIN)
+ init_hca.mw_enabled = INIT_HCA_TPT_MW_ENABLE;
+
+ err = mlx4_init_icm(dev, &dev_cap, &init_hca, icm_size);
+ if (err)
+ return err;
+
+ err = mlx4_INIT_HCA(dev, &init_hca);
+ if (err) {
+ mlx4_err(dev, "INIT_HCA command failed, aborting\n");
+ goto err_free_icm;
+ }
+
+ if (dev_cap.flags2 & MLX4_DEV_CAP_FLAG2_SYS_EQS) {
+ err = mlx4_query_func(dev, &dev_cap);
+ if (err < 0) {
+ mlx4_err(dev, "QUERY_FUNC command failed, aborting.\n");
+ goto err_close;
+ } else if (err & MLX4_QUERY_FUNC_NUM_SYS_EQS) {
+ dev->caps.num_eqs = dev_cap.max_eqs;
+ dev->caps.reserved_eqs = dev_cap.reserved_eqs;
+ dev->caps.reserved_uars = dev_cap.reserved_uars;
+ }
+ }
+
+ /*
+ * If TS is supported by FW
+ * read HCA frequency by QUERY_HCA command
+ */
+ if (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_TS) {
+ memset(&init_hca, 0, sizeof(init_hca));
+ err = mlx4_QUERY_HCA(dev, &init_hca);
+ if (err) {
+ mlx4_err(dev, "QUERY_HCA command failed, disable timestamp\n");
+ dev->caps.flags2 &= ~MLX4_DEV_CAP_FLAG2_TS;
+ } else {
+ dev->caps.hca_core_clock =
+ init_hca.hca_core_clock;
+ }
+
+ /* In case we got HCA frequency 0 - disable timestamping
+ * to avoid dividing by zero
+ */
+ if (!dev->caps.hca_core_clock) {
+ dev->caps.flags2 &= ~MLX4_DEV_CAP_FLAG2_TS;
+ mlx4_err(dev,
+ "HCA frequency is 0 - timestamping is not supported\n");
+ } else if (map_internal_clock(dev)) {
+ /*
+ * Map internal clock,
+ * in case of failure disable timestamping
+ */
+ dev->caps.flags2 &= ~MLX4_DEV_CAP_FLAG2_TS;
+ mlx4_err(dev, "Failed to map internal clock. Timestamping is not supported\n");
+ }
+ }
+
+ if (dev->caps.dmfs_high_steer_mode !=
+ MLX4_STEERING_DMFS_A0_NOT_SUPPORTED) {
+ if (mlx4_validate_optimized_steering(dev))
+ mlx4_warn(dev, "Optimized steering validation failed\n");
+
+ if (dev->caps.dmfs_high_steer_mode ==
+ MLX4_STEERING_DMFS_A0_DISABLE) {
+ dev->caps.dmfs_high_rate_qpn_base =
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW];
+ dev->caps.dmfs_high_rate_qpn_range =
+ MLX4_A0_STEERING_TABLE_SIZE;
+ }
+
+ mlx4_dbg(dev, "DMFS high rate steer mode is: %s\n",
+ dmfs_high_rate_steering_mode_str(
+ dev->caps.dmfs_high_steer_mode));
+ }
+ } else {
+ err = mlx4_init_slave(dev);
+ if (err) {
+ if (err != -EPROBE_DEFER)
+ mlx4_err(dev, "Failed to initialize slave\n");
+ return err;
+ }
+
+ err = mlx4_slave_cap(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to obtain slave caps\n");
+ goto err_close;
+ }
+ }
+
+ if (map_bf_area(dev))
+ mlx4_dbg(dev, "Failed to map blue flame area\n");
+
+ /* Only the master sets the ports; all other functions get them from it. */
+ if (!mlx4_is_slave(dev))
+ mlx4_set_port_mask(dev);
+
+ err = mlx4_QUERY_ADAPTER(dev, &adapter);
+ if (err) {
+ mlx4_err(dev, "QUERY_ADAPTER command failed, aborting\n");
+ goto unmap_bf;
+ }
+
+ /* Query CONFIG_DEV parameters */
+ err = mlx4_config_dev_retrieval(dev, &params);
+ if (err && err != -ENOTSUPP) {
+ mlx4_err(dev, "Failed to query CONFIG_DEV parameters\n");
+ } else if (!err) {
+ dev->caps.rx_checksum_flags_port[1] = params.rx_csum_flags_port_1;
+ dev->caps.rx_checksum_flags_port[2] = params.rx_csum_flags_port_2;
+ }
+ priv->eq_table.inta_pin = adapter.inta_pin;
+ memcpy(dev->board_id, adapter.board_id, sizeof dev->board_id);
+
+ return 0;
+
+unmap_bf:
+ unmap_internal_clock(dev);
+ unmap_bf_area(dev);
+
+ if (mlx4_is_slave(dev))
+ mlx4_slave_destroy_special_qp_cap(dev);
+
+err_close:
+ if (mlx4_is_slave(dev))
+ mlx4_slave_exit(dev);
+ else
+ mlx4_CLOSE_HCA(dev, 0);
+
+err_free_icm:
+ if (!mlx4_is_slave(dev))
+ mlx4_free_icms(dev);
+
+ return err;
+}
+
+static int mlx4_init_counters_table(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int nent_pow2, port_indx, vf_index, num_counters;
+ int res, index = 0;
+ struct counter_index *new_counter_index;
+
+
+ mutex_init(&priv->counters_table.mutex);
+
+ if (!(dev->caps.flags & MLX4_DEV_CAP_FLAG_COUNTERS))
+ return -ENOENT;
+
+ if (!mlx4_is_slave(dev) &&
+ dev->caps.max_counters == dev->caps.max_extended_counters) {
+ res = mlx4_cmd(dev, MLX4_IF_STATE_EXTENDED, 0, 0,
+ MLX4_CMD_SET_IF_STAT,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+ if (res) {
+ mlx4_err(dev, "Failed to set extended counters (err=%d)\n", res);
+ return res;
+ }
+ }
+
+ if (mlx4_is_slave(dev)) {
+ for (port_indx = 0; port_indx < dev->caps.num_ports; port_indx++) {
+ INIT_LIST_HEAD(&priv->counters_table.global_port_list[port_indx]);
+ if (dev->caps.def_counter_index[port_indx] != 0xFF) {
+ new_counter_index = kmalloc(sizeof(struct counter_index), GFP_KERNEL);
+ if (!new_counter_index)
+ return -ENOMEM;
+ new_counter_index->index = dev->caps.def_counter_index[port_indx];
+ list_add_tail(&new_counter_index->list, &priv->counters_table.global_port_list[port_indx]);
+ }
+ }
+ mlx4_dbg(dev, "%s: slave allocated %d counters for %d ports\n",
+ __func__, dev->caps.num_ports, dev->caps.num_ports);
+ return 0;
+ }
+
+ nent_pow2 = roundup_pow_of_two(dev->caps.max_counters);
+
+ for (port_indx = 0; port_indx < dev->caps.num_ports; port_indx++) {
+ INIT_LIST_HEAD(&priv->counters_table.global_port_list[port_indx]);
+ /* allocating 2 counters per port for PFs */
+ /* For the PF, the ETH default counters are 0,2; */
+ /* and the RoCE default counters are 1,3 */
+ for (num_counters = 0; num_counters < 2; num_counters++, index++) {
+ new_counter_index = kmalloc(sizeof(struct counter_index), GFP_KERNEL);
+ if (!new_counter_index)
+ return -ENOMEM;
+ new_counter_index->index = index;
+ list_add_tail(&new_counter_index->list,
+ &priv->counters_table.global_port_list[port_indx]);
+ }
+ }
+
+ if (mlx4_is_master(dev)) {
+ for (vf_index = 0; vf_index < dev->persist->num_vfs; vf_index++) {
+ int slave = mlx4_get_slave_indx(&priv->dev, vf_index);
+ struct mlx4_active_ports actv_ports;
+ if (slave < 0)
+ continue;
+ actv_ports = mlx4_get_active_ports(&priv->dev, slave);
+ for (port_indx = 0; port_indx < dev->caps.num_ports; port_indx++) {
+ INIT_LIST_HEAD(&priv->counters_table.vf_list[vf_index][port_indx]);
+ new_counter_index = kmalloc(sizeof(struct counter_index), GFP_KERNEL);
+ if (!new_counter_index)
+ return -ENOMEM;
+ if (index < nent_pow2 - 1 &&
+ test_bit(port_indx, actv_ports.ports)) {
+ new_counter_index->index = index;
+ index++;
+ } else {
+ new_counter_index->index = MLX4_SINK_COUNTER_INDEX;
+ }
+
+ list_add_tail(&new_counter_index->list,
+ &priv->counters_table.vf_list[vf_index][port_indx]);
+ }
+ }
+
+ res = mlx4_bitmap_init(&priv->counters_table.bitmap,
+ nent_pow2, nent_pow2 - 1,
+ index, 1);
+ mlx4_dbg(dev, "%s: master allocated %d counters for %d VFs\n",
+ __func__, index, dev->persist->num_vfs);
+ } else {
+ res = mlx4_bitmap_init(&priv->counters_table.bitmap,
+ nent_pow2, nent_pow2 - 1,
+ index, 1);
+ mlx4_dbg(dev, "%s: native allocated %d counters for %d ports\n",
+ __func__, index, dev->caps.num_ports);
+ }
+
+ return res;
+
+}
+
+static void mlx4_cleanup_counters_table(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int i, j;
+ struct counter_index *port, *tmp_port;
+ struct counter_index *vf, *tmp_vf;
+
+ mutex_lock(&priv->counters_table.mutex);
+
+ if (dev->caps.flags & MLX4_DEV_CAP_FLAG_COUNTERS) {
+ for (i = 0; i < dev->caps.num_ports; i++) {
+ list_for_each_entry_safe(port, tmp_port,
+ &priv->counters_table.global_port_list[i],
+ list) {
+ list_del(&port->list);
+ kfree(port);
+ }
+ }
+ if (mlx4_is_master(dev)) {
+ for (i = 0; i < dev->persist->num_vfs; i++) {
+ for (j = 0; j < dev->caps.num_ports; j++) {
+ list_for_each_entry_safe(vf, tmp_vf,
+ &priv->counters_table.vf_list[i][j],
+ list) {
+ /* clear the counter statistic */
+ if (__mlx4_clear_if_stat(dev, vf->index))
+ mlx4_dbg(dev, "%s: reset counter %d failed\n",
+ __func__, vf->index);
+ list_del(&vf->list);
+ kfree(vf);
+ }
+ }
+ }
+ }
+ mlx4_bitmap_cleanup(&priv->counters_table.bitmap);
+ }
+ mutex_unlock(&priv->counters_table.mutex);
+}
+
+int __mlx4_slave_counters_free(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int i, first;
+ struct counter_index *vf, *tmp_vf;
+
+ /* clean VF's counters for the next usage */
+ if (slave > 0 && slave <= dev->persist->num_vfs) {
+ mlx4_dbg(dev, "%s: free counters of slave(%d)\n"
+ , __func__, slave);
+
+ mutex_lock(&priv->counters_table.mutex);
+ for (i = 0; i < dev->caps.num_ports; i++) {
+ first = 0;
+ list_for_each_entry_safe(vf, tmp_vf,
+ &priv->counters_table.vf_list[slave - 1][i],
+ list) {
+ /* clear the counter statistic */
+ if (__mlx4_clear_if_stat(dev, vf->index))
+ mlx4_dbg(dev, "%s: reset counter %d failed\n",
+ __func__, vf->index);
+ if (first++ && vf->index != MLX4_SINK_COUNTER_INDEX) {
+ mlx4_dbg(dev, "%s: delete counter index %d for slave %d and port %d\n"
+ , __func__, vf->index, slave, i + 1);
+ mlx4_bitmap_free(&priv->counters_table.bitmap, vf->index, MLX4_USE_RR);
+ list_del(&vf->list);
+ kfree(vf);
+ } else {
+ mlx4_dbg(dev, "%s: can't delete default counter index %d for slave %d and port %d\n"
+ , __func__, vf->index, slave, i + 1);
+ }
+ }
+ }
+ mutex_unlock(&priv->counters_table.mutex);
+ }
+
+ return 0;
+}
+
+int __mlx4_counter_alloc(struct mlx4_dev *dev, int slave, int port, u32 *idx)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct counter_index *new_counter_index;
+
+ if (!(dev->caps.flags & MLX4_DEV_CAP_FLAG_COUNTERS))
+ return -ENOENT;
+
+ if ((slave > MLX4_MAX_NUM_VF) || (slave < 0) ||
+ (port < 0) || (port > dev->caps.num_ports)) {
+ mlx4_dbg(dev, "%s: invalid slave(%d) or port(%d) index\n",
+ __func__, slave, port);
+ return -EINVAL;
+ }
+
+ /* handle old guests whose requests do not specify a port index */
+ if (port == 0) {
+ *idx = MLX4_SINK_COUNTER_INDEX;
+ mlx4_dbg(dev, "%s: allocated default counter index %d for slave %d port %d\n"
+ , __func__, *idx, slave, port);
+ return 0;
+ }
+
+ mutex_lock(&priv->counters_table.mutex);
+
+ *idx = mlx4_bitmap_alloc(&priv->counters_table.bitmap);
+ /* if no resources return the default counter of the slave and port */
+ if (*idx == -1) {
+ if (slave == 0) { /* its the ethernet counter ?????? */
+ new_counter_index = list_entry(priv->counters_table.global_port_list[port - 1].next,
+ struct counter_index,
+ list);
+ } else {
+ new_counter_index = list_entry(priv->counters_table.vf_list[slave - 1][port - 1].next,
+ struct counter_index,
+ list);
+ }
+
+ *idx = new_counter_index->index;
+ mlx4_dbg(dev, "%s: allocated default counter index %d for slave %d port %d\n"
+ , __func__, *idx, slave, port);
+ goto out;
+ }
+
+ if (slave == 0) { /* native or master */
+ new_counter_index = kmalloc(sizeof(struct counter_index), GFP_KERNEL);
+ if (!new_counter_index)
+ goto no_mem;
+ new_counter_index->index = *idx;
+ list_add_tail(&new_counter_index->list, &priv->counters_table.global_port_list[port - 1]);
+ } else {
+ new_counter_index = kmalloc(sizeof(struct counter_index), GFP_KERNEL);
+ if (!new_counter_index)
+ goto no_mem;
+ new_counter_index->index = *idx;
+ list_add_tail(&new_counter_index->list, &priv->counters_table.vf_list[slave - 1][port - 1]);
+ }
+
+ mlx4_dbg(dev, "%s: allocated counter index %d for slave %d port %d\n"
+ , __func__, *idx, slave, port);
+out:
+ mutex_unlock(&priv->counters_table.mutex);
+ return 0;
+
+no_mem:
+ mlx4_bitmap_free(&priv->counters_table.bitmap, *idx, MLX4_USE_RR);
+ mutex_unlock(&priv->counters_table.mutex);
+ *idx = MLX4_SINK_COUNTER_INDEX;
+ mlx4_dbg(dev, "%s: failed err (%d)\n"
+ , __func__, -ENOMEM);
+ return -ENOMEM;
+}
+
+int mlx4_counter_alloc(struct mlx4_dev *dev, u8 port, u32 *idx)
+{
+ u64 out_param;
+ int err;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct counter_index *new_counter_index, *c_index;
+
+ if (mlx4_is_mfunc(dev)) {
+ err = mlx4_cmd_imm(dev, 0, &out_param,
+ ((u32) port) << 8 | (u32) RES_COUNTER,
+ RES_OP_RESERVE, MLX4_CMD_ALLOC_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+ if (!err) {
+ *idx = get_param_l(&out_param);
+ if (*idx == MLX4_SINK_COUNTER_INDEX)
+ return -ENOSPC;
+
+ mutex_lock(&priv->counters_table.mutex);
+ c_index = list_entry(priv->counters_table.global_port_list[port - 1].next,
+ struct counter_index,
+ list);
+ mutex_unlock(&priv->counters_table.mutex);
+ if (c_index->index == *idx)
+ return -EEXIST;
+
+ if (mlx4_is_slave(dev)) {
+ new_counter_index = kmalloc(sizeof(struct counter_index), GFP_KERNEL);
+ if (!new_counter_index) {
+ mlx4_counter_free(dev, port, *idx);
+ return -ENOMEM;
+ }
+ new_counter_index->index = *idx;
+ mutex_lock(&priv->counters_table.mutex);
+ list_add_tail(&new_counter_index->list, &priv->counters_table.global_port_list[port - 1]);
+ mutex_unlock(&priv->counters_table.mutex);
+ mlx4_dbg(dev, "%s: allocated counter index %d for port %d\n"
+ , __func__, *idx, port);
+ }
+ }
+ return err;
+ }
+ return __mlx4_counter_alloc(dev, 0, port, idx);
+}
+EXPORT_SYMBOL_GPL(mlx4_counter_alloc);
+
+void __mlx4_counter_free(struct mlx4_dev *dev, int slave, int port, u32 idx)
+{
+ /* check whether native or slave and delete accordingly */
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct counter_index *pf, *tmp_pf;
+ struct counter_index *vf, *tmp_vf;
+ int first;
+
+
+ if (idx == MLX4_SINK_COUNTER_INDEX) {
+ mlx4_dbg(dev, "%s: try to delete default counter index %d for port %d\n"
+ , __func__, idx, port);
+ return;
+ }
+
+ if ((slave > MLX4_MAX_NUM_VF) || (slave < 0) ||
+ (port < 0) || (port > MLX4_MAX_PORTS)) {
+ mlx4_warn(dev, "%s: deletion failed due to invalid slave(%d) or port(%d) index\n"
+ , __func__, slave, port);
+ return;
+ }
+
+ mutex_lock(&priv->counters_table.mutex);
+ if (slave == 0) {
+ first = 0;
+ list_for_each_entry_safe(pf, tmp_pf,
+ &priv->counters_table.global_port_list[port - 1],
+ list) {
+ /* the first 2 counters are reserved */
+ if (pf->index == idx) {
+ /* clear the counter statistic */
+ if (__mlx4_clear_if_stat(dev, pf->index))
+ mlx4_dbg(dev, "%s: reset counter %d failed\n",
+ __func__, pf->index);
+ if (1 < first && idx != MLX4_SINK_COUNTER_INDEX) {
+ list_del(&pf->list);
+ kfree(pf);
+ mlx4_dbg(dev, "%s: delete counter index %d for native device (%d) port %d\n"
+ , __func__, idx, slave, port);
+ mlx4_bitmap_free(&priv->counters_table.bitmap, idx, MLX4_USE_RR);
+ goto out;
+ } else {
+ mlx4_dbg(dev, "%s: can't delete default counter index %d for native device (%d) port %d\n"
+ , __func__, idx, slave, port);
+ goto out;
+ }
+ }
+ first++;
+ }
+ mlx4_dbg(dev, "%s: can't delete counter index %d for native device (%d) port %d\n"
+ , __func__, idx, slave, port);
+ } else {
+ first = 0;
+ list_for_each_entry_safe(vf, tmp_vf,
+ &priv->counters_table.vf_list[slave - 1][port - 1],
+ list) {
+ /* the first element is reserved */
+ if (vf->index == idx) {
+ /* clear the counter statistic */
+ if (__mlx4_clear_if_stat(dev, vf->index))
+ mlx4_dbg(dev, "%s: reset counter %d failed\n",
+ __func__, vf->index);
+ if (first) {
+ list_del(&vf->list);
+ kfree(vf);
+ mlx4_dbg(dev, "%s: delete counter index %d for slave %d port %d\n",
+ __func__, idx, slave, port);
+ mlx4_bitmap_free(&priv->counters_table.bitmap, idx, MLX4_USE_RR);
+ goto out;
+ } else {
+ mlx4_dbg(dev, "%s: can't delete default slave (%d) counter index %d for port %d\n"
+ , __func__, slave, idx, port);
+ goto out;
+ }
+ }
+ first++;
+ }
+ mlx4_dbg(dev, "%s: can't delete slave (%d) counter index %d for port %d\n"
+ , __func__, slave, idx, port);
+ }
+
+out:
+ mutex_unlock(&priv->counters_table.mutex);
+}
+
+void mlx4_counter_free(struct mlx4_dev *dev, u8 port, u32 idx)
+{
+ u64 in_param = 0;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct counter_index *counter, *tmp_counter;
+ int first = 0;
+
+ if (mlx4_is_mfunc(dev)) {
+ set_param_l(&in_param, idx);
+ mlx4_cmd(dev, in_param,
+ ((u32) port) << 8 | (u32) RES_COUNTER,
+ RES_OP_RESERVE,
+ MLX4_CMD_FREE_RES, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_WRAPPED);
+
+ if (mlx4_is_slave(dev) && idx != MLX4_SINK_COUNTER_INDEX) {
+ mutex_lock(&priv->counters_table.mutex);
+ list_for_each_entry_safe(counter, tmp_counter,
+ &priv->counters_table.global_port_list[port - 1],
+ list) {
+ if (counter->index == idx && first++) {
+ list_del(&counter->list);
+ kfree(counter);
+ mlx4_dbg(dev, "%s: delete counter index %d for port %d\n"
+ , __func__, idx, port);
+ mutex_unlock(&priv->counters_table.mutex);
+ return;
+ }
+ }
+ mutex_unlock(&priv->counters_table.mutex);
+ }
+
+ return;
+ }
+ __mlx4_counter_free(dev, 0, port, idx);
+}
+EXPORT_SYMBOL_GPL(mlx4_counter_free);
+
+int __mlx4_clear_if_stat(struct mlx4_dev *dev,
+ u8 counter_index)
+{
+ struct mlx4_cmd_mailbox *if_stat_mailbox = NULL;
+ int err = 0;
+ u32 if_stat_in_mod = (counter_index & 0xff) | (1 << 31);
+
+ if (counter_index == MLX4_SINK_COUNTER_INDEX)
+ return -EINVAL;
+
+ if (mlx4_is_slave(dev))
+ return 0;
+
+ if_stat_mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(if_stat_mailbox)) {
+ err = PTR_ERR(if_stat_mailbox);
+ return err;
+ }
+
+ err = mlx4_cmd_box(dev, 0, if_stat_mailbox->dma, if_stat_in_mod, 0,
+ MLX4_CMD_QUERY_IF_STAT, MLX4_CMD_TIME_CLASS_C,
+ MLX4_CMD_NATIVE);
+
+ mlx4_free_cmd_mailbox(dev, if_stat_mailbox);
+ return err;
+}
+
+u8 mlx4_get_default_counter_index(struct mlx4_dev *dev, int slave, int port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct counter_index *new_counter_index;
+
+ mutex_lock(&priv->counters_table.mutex);
+ if (slave == 0) {
+ new_counter_index = list_entry(priv->counters_table.global_port_list[port - 1].next,
+ struct counter_index,
+ list);
+ } else {
+ new_counter_index = list_entry(priv->counters_table.vf_list[slave - 1][port - 1].next,
+ struct counter_index,
+ list);
+ }
+ mutex_unlock(&priv->counters_table.mutex);
+
+ mlx4_dbg(dev, "%s: return counter index %d for slave %d port %d\n",
+ __func__, new_counter_index->index, slave, port);
+
+ return (u8)new_counter_index->index;
+}
+
+int mlx4_get_vport_ethtool_stats(struct mlx4_dev *dev, int port,
+ struct mlx4_en_vport_stats *vport_stats,
+ int reset, int *read_counters)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_cmd_mailbox *if_stat_mailbox = NULL;
+ union mlx4_counter *counter;
+ int err = 0;
+ u32 if_stat_in_mod;
+ struct counter_index *vport, *tmp_vport;
+
+ if (!vport_stats)
+ return -EINVAL;
+
+ if_stat_mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(if_stat_mailbox)) {
+ err = PTR_ERR(if_stat_mailbox);
+ return err;
+ }
+
+ mutex_lock(&priv->counters_table.mutex);
+ list_for_each_entry_safe(vport, tmp_vport,
+ &priv->counters_table.global_port_list[port - 1],
+ list) {
+ if (vport->index == MLX4_SINK_COUNTER_INDEX)
+ continue;
+
+ memset(if_stat_mailbox->buf, 0, sizeof(union mlx4_counter));
+ if_stat_in_mod = (vport->index & 0xff) | ((reset & 1) << 31);
+ err = mlx4_cmd_box(dev, 0, if_stat_mailbox->dma,
+ if_stat_in_mod, 0,
+ MLX4_CMD_QUERY_IF_STAT,
+ MLX4_CMD_TIME_CLASS_C,
+ MLX4_CMD_NATIVE);
+ if (err) {
+ mlx4_dbg(dev, "%s: failed to read statistics for counter index %d\n",
+ __func__, vport->index);
+ goto if_stat_out;
+ }
+ counter = (union mlx4_counter *)if_stat_mailbox->buf;
+ if ((counter->control.cnt_mode & 0xf) == 1) {
+ vport_stats->rx_broadcast_packets += be64_to_cpu(counter->ext.counters[0].IfRxBroadcastFrames);
+ vport_stats->rx_unicast_packets += be64_to_cpu(counter->ext.counters[0].IfRxUnicastFrames);
+ vport_stats->rx_multicast_packets += be64_to_cpu(counter->ext.counters[0].IfRxMulticastFrames);
+ vport_stats->tx_broadcast_packets += be64_to_cpu(counter->ext.counters[0].IfTxBroadcastFrames);
+ vport_stats->tx_unicast_packets += be64_to_cpu(counter->ext.counters[0].IfTxUnicastFrames);
+ vport_stats->tx_multicast_packets += be64_to_cpu(counter->ext.counters[0].IfTxMulticastFrames);
+ vport_stats->rx_broadcast_bytes += be64_to_cpu(counter->ext.counters[0].IfRxBroadcastOctets);
+ vport_stats->rx_unicast_bytes += be64_to_cpu(counter->ext.counters[0].IfRxUnicastOctets);
+ vport_stats->rx_multicast_bytes += be64_to_cpu(counter->ext.counters[0].IfRxMulticastOctets);
+ vport_stats->tx_broadcast_bytes += be64_to_cpu(counter->ext.counters[0].IfTxBroadcastOctets);
+ vport_stats->tx_unicast_bytes += be64_to_cpu(counter->ext.counters[0].IfTxUnicastOctets);
+ vport_stats->tx_multicast_bytes += be64_to_cpu(counter->ext.counters[0].IfTxMulticastOctets);
+ vport_stats->rx_filtered += be64_to_cpu(counter->ext.counters[0].IfRxErrorFrames);
+ vport_stats->rx_dropped += be64_to_cpu(counter->ext.counters[0].IfRxNoBufferFrames);
+ vport_stats->tx_dropped += be64_to_cpu(counter->ext.counters[0].IfTxDroppedFrames);
+ if (read_counters)
+ (*read_counters)++;
+ }
+ }
+
+if_stat_out:
+ mutex_unlock(&priv->counters_table.mutex);
+ mlx4_free_cmd_mailbox(dev, if_stat_mailbox);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_get_vport_ethtool_stats);
+
+void mlx4_set_admin_guid(struct mlx4_dev *dev, __be64 guid, int entry, int port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ priv->mfunc.master.vf_admin[entry].vport[port].guid = guid;
+}
+EXPORT_SYMBOL_GPL(mlx4_set_admin_guid);
+
+__be64 mlx4_get_admin_guid(struct mlx4_dev *dev, int entry, int port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ return priv->mfunc.master.vf_admin[entry].vport[port].guid;
+}
+EXPORT_SYMBOL_GPL(mlx4_get_admin_guid);
+
+void mlx4_set_random_admin_guid(struct mlx4_dev *dev, int entry, int port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ __be64 guid;
+
+ /* hw GUID */
+ if (entry == 0)
+ return;
+
+ guid = rte_rand();
+ guid &= ~(cpu_to_be64(1ULL << 56));
+ guid |= cpu_to_be64(1ULL << 57);
+ priv->mfunc.master.vf_admin[entry].vport[port].guid = guid;
+}
+
+static int mlx4_setup_hca(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int err;
+ int port;
+ __be32 ib_port_default_caps;
+
+ err = mlx4_init_uar_table(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to initialize user access region table, aborting\n");
+ return err;
+ }
+
+ err = mlx4_uar_alloc(dev, &priv->driver_uar);
+ if (err) {
+ mlx4_err(dev, "Failed to allocate driver access region, aborting\n");
+ goto err_uar_table_free;
+ }
+
+ //priv->kar = ioremap((phys_addr_t) priv->driver_uar.pfn << PAGE_SHIFT, PAGE_SIZE);
+
+ priv->kar = priv->driver_uar.pfn_addr;
+
+ if (!priv->kar) {
+ mlx4_err(dev, "Couldn't map kernel access region, aborting\n");
+ err = -ENOMEM;
+ goto err_uar_free;
+ }
+
+ err = mlx4_init_pd_table(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to initialize protection domain table, aborting\n");
+ goto err_kar_unmap;
+ }
+
+ err = mlx4_init_xrcd_table(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to initialize reliable connection domain table, aborting\n");
+ goto err_pd_table_free;
+ }
+
+ err = mlx4_init_mr_table(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to initialize memory region table, aborting\n");
+ goto err_xrcd_table_free;
+ }
+
+ if (!mlx4_is_slave(dev)) {
+ err = mlx4_init_mcg_table(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to initialize multicast group table, aborting\n");
+ goto err_mr_table_free;
+ }
+ err = mlx4_config_mad_demux(dev);
+ if (err) {
+ mlx4_err(dev, "Failed in config_mad_demux, aborting\n");
+ goto err_mcg_table_free;
+ }
+ }
+
+ err = mlx4_init_eq_table(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to initialize event queue table, aborting\n");
+ goto err_mcg_table_free;
+ }
+/*
+ err = mlx4_cmd_use_events(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to switch to event-driven firmware commands, aborting\n");
+ goto err_eq_table_free;
+ }
+ */
+
+ err = mlx4_NOP(dev);
+ if (err) {
+ if (dev->flags & MLX4_FLAG_MSI_X) {
+ mlx4_warn(dev, "NOP command failed to generate MSI-X interrupt (IRQ %d)\n",
+ priv->eq_table.eq[MLX4_EQ_ASYNC].irq);
+ mlx4_warn(dev, "Trying again without MSI-X\n");
+ } else {
+ mlx4_err(dev, "NOP command failed to generate interrupt (IRQ %d), aborting\n",
+ priv->eq_table.eq[MLX4_EQ_ASYNC].irq);
+ mlx4_err(dev, "BIOS or ACPI interrupt routing problem?\n");
+ }
+
+ goto err_cmd_poll;
+ }
+
+ mlx4_dbg(dev, "NOP command IRQ test passed\n");
+
+ err = mlx4_init_cq_table(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to initialize completion queue table, aborting\n");
+ goto err_cmd_poll;
+ }
+
+ err = mlx4_init_srq_table(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to initialize shared receive queue table, aborting\n");
+ goto err_cq_table_free;
+ }
+
+ err = mlx4_init_qp_table(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to initialize queue pair table, aborting\n");
+ goto err_srq_table_free;
+ }
+
+ err = mlx4_init_counters_table(dev);
+ if (err && err != -ENOENT) {
+ mlx4_err(dev, "Failed to initialize counters table, aborting\n");
+ goto err_qp_table_free;
+ }
+
+ if (!mlx4_is_slave(dev)) {
+ for (port = 1; port <= dev->caps.num_ports; port++) {
+ ib_port_default_caps = 0;
+ err = mlx4_get_port_ib_caps(dev, port,
+ &ib_port_default_caps);
+ if (err)
+ mlx4_warn(dev, "failed to get port %d default ib capabilities (%d). Continuing with caps = 0\n",
+ port, err);
+ dev->caps.ib_port_def_cap[port] = ib_port_default_caps;
+
+ /* initialize per-slave default ib port capabilities */
+ if (mlx4_is_master(dev)) {
+ int i;
+ for (i = 0; i < dev->num_slaves; i++) {
+ if (i == mlx4_master_func_num(dev))
+ continue;
+ priv->mfunc.master.slave_state[i].ib_cap_mask[port] =
+ ib_port_default_caps;
+ }
+ }
+
+ dev->caps.port_ib_mtu[port] = IB_MTU_4096;
+
+ err = mlx4_SET_PORT(dev, port, mlx4_is_master(dev) ?
+ dev->caps.pkey_table_len[port] : -1);
+ if (err) {
+ mlx4_err(dev, "Failed to set port %d, aborting\n",
+ port);
+ goto err_counters_table_free;
+ }
+ }
+ }
+
+ return 0;
+
+err_counters_table_free:
+ mlx4_cleanup_counters_table(dev);
+
+err_qp_table_free:
+ mlx4_cleanup_qp_table(dev);
+
+err_srq_table_free:
+ mlx4_cleanup_srq_table(dev);
+
+err_cq_table_free:
+ mlx4_cleanup_cq_table(dev);
+
+err_cmd_poll:
+ mlx4_cmd_use_polling(dev);
+
+err_eq_table_free:
+ mlx4_cleanup_eq_table(dev);
+
+err_mcg_table_free:
+ if (!mlx4_is_slave(dev))
+ mlx4_cleanup_mcg_table(dev);
+
+err_mr_table_free:
+ mlx4_cleanup_mr_table(dev);
+
+err_xrcd_table_free:
+ mlx4_cleanup_xrcd_table(dev);
+
+err_pd_table_free:
+ mlx4_cleanup_pd_table(dev);
+
+err_kar_unmap:
+ //iounmap(priv->kar);
+
+err_uar_free:
+ mlx4_uar_free(dev, &priv->driver_uar);
+
+err_uar_table_free:
+ mlx4_cleanup_uar_table(dev);
+ return err;
+}
+
+static int mlx4_init_affinity_hint(struct mlx4_dev *dev, int port, int eqn)
+{
+ int requested_cpu = 0;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_eq *eq;
+ int off = 0;
+ int i;
+
+ if (eqn > dev->caps.num_comp_vectors)
+ return -EINVAL;
+
+ for (i = 1; i < port; i++)
+ off += mlx4_get_eqs_per_port(dev, i);
+
+ requested_cpu = eqn - off - !!(eqn > MLX4_EQ_ASYNC);
+
+ /* Meaning EQs are shared, and this call comes from the second port */
+ if (requested_cpu < 0)
+ return 0;
+
+ eq = &priv->eq_table.eq[eqn];
+
+// if (!zalloc_cpumask_var(&eq->affinity_mask, GFP_KERNEL))
+// return -ENOMEM;
+
+// cpumask_set_cpu(requested_cpu, eq->affinity_mask);
+
+ return 0;
+}
+
+static void mlx4_enable_msi_x(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct msix_entry *entries;
+ int i;
+ int port = 0;
+#ifndef HAVE_PCI_ENABLE_MSIX_RANGE
+ int err;
+#endif
+
+#ifdef KMOD_DISABLED
+
+ if (msi_x) {
+ int nreq = dev->caps.num_ports * num_online_cpus() + 1;
+
+ nreq = min_t(int, dev->caps.num_eqs - dev->caps.reserved_eqs,
+ nreq);
+#ifdef CONFIG_PPC
+ nreq = min_t(int, nreq, PPC_MAX_MSIX);
+#endif
+ entries = kcalloc(nreq, sizeof *entries, GFP_KERNEL);
+ if (!entries)
+ goto no_msi;
+
+ for (i = 0; i < nreq; ++i)
+ entries[i].entry = i;
+
+#ifdef HAVE_PCI_ENABLE_MSIX_RANGE
+ nreq = pci_enable_msix_range(dev->persist->pdev, entries, 2,
+ nreq);
+#else
+ retry:
+ err = pci_enable_msix(dev->persist->pdev, entries, nreq);
+ if (err) {
+ /* Try again if at least 2 vectors are available */
+ if (err > 1) {
+ mlx4_info(dev, "Requested %d vectors, "
+ "but only %d MSI-X vectors available, "
+ "trying again\n", nreq, err);
+ nreq = err;
+ goto retry;
+ }
+ nreq = -1;
+ }
+#endif
+
+ if (nreq < 2 || nreq < MLX4_EQ_ASYNC + 1) {
+ kfree(entries);
+ goto no_msi;
+ }
+ /* 1 is reserved for events (asynchronous EQ) */
+ dev->caps.num_comp_vectors = nreq - 1;
+
+ priv->eq_table.eq[MLX4_EQ_ASYNC].irq = entries[0].vector;
+ bitmap_zero(priv->eq_table.eq[MLX4_EQ_ASYNC].actv_ports.ports,
+ dev->caps.num_ports);
+
+ for (i = 0; i < dev->caps.num_comp_vectors + 1; i++) {
+ if (i == MLX4_EQ_ASYNC)
+ continue;
+
+ priv->eq_table.eq[i].irq =
+ entries[i + 1 - !!(i > MLX4_EQ_ASYNC)].vector;
+
+ if (MLX4_IS_LEGACY_EQ_MODE(dev->caps)) {
+ bitmap_fill(priv->eq_table.eq[i].actv_ports.ports,
+ dev->caps.num_ports);
+ /* We don't set affinity hint when there
+ * aren't enough EQs
+ */
+ } else {
+ set_bit(port,
+ priv->eq_table.eq[i].actv_ports.ports);
+ if (mlx4_init_affinity_hint(dev, port + 1, i))
+ mlx4_warn(dev, "Couldn't init hint cpumask for EQ %d\n",
+ i);
+ }
+ /* We divide the Eqs evenly between the two ports.
+ * (dev->caps.num_comp_vectors / dev->caps.num_ports)
+ * refers to the number of Eqs per port
+ * (i.e. eqs_per_port). Theoretically, we would like to
+ * write something like (i + 1) % eqs_per_port == 0.
+ * However, since there's an asynchronous Eq, we have
+ * to skip over it by comparing this condition to
+ * !!((i + 1) > MLX4_EQ_ASYNC).
+ */
+ if ((dev->caps.num_comp_vectors > dev->caps.num_ports) &&
+ ((i + 1) %
+ (dev->caps.num_comp_vectors / dev->caps.num_ports)) ==
+ !!((i + 1) > MLX4_EQ_ASYNC))
+ /* If dev->caps.num_comp_vectors < dev->caps.num_ports,
+ * everything is shared anyway.
+ */
+ port++;
+ }
+
+ dev->flags |= MLX4_FLAG_MSI_X;
+
+ kfree(entries);
+ return;
+ }
+
+#endif
+
+no_msi:
+ dev->caps.num_comp_vectors = 1;
+
+ BUG_ON(MLX4_EQ_ASYNC >= 2);
+ for (i = 0; i < 2; ++i) {
+ priv->eq_table.eq[i].irq = dev->persist->rte_pdev->intr_handle.fd; //XXX
+ if (i != MLX4_EQ_ASYNC) {
+ bitmap_fill(priv->eq_table.eq[i].actv_ports.ports,
+ dev->caps.num_ports);
+ }
+ }
+}
+
+static int mlx4_init_port_info(struct mlx4_dev *dev, int port)
+{
+ struct mlx4_port_info *info = &mlx4_priv(dev)->port[port];
+ int err = 0;
+
+ info->dev = dev;
+ info->port = port;
+ if (!mlx4_is_slave(dev)) {
+ mlx4_init_mac_table(dev, &info->mac_table);
+ mlx4_init_vlan_table(dev, &info->vlan_table);
+ mlx4_init_roce_gid_table(dev, &info->roce);
+ info->base_qpn = mlx4_get_base_qpn(dev, port);
+ }
+#ifdef KMOD_DISABLED
+ sprintf(info->dev_name, "mlx4_port%d", port);
+ info->port_attr.attr.name = info->dev_name;
+ if (mlx4_is_mfunc(dev))
+ info->port_attr.attr.mode = S_IRUGO;
+ else {
+ info->port_attr.attr.mode = S_IRUGO | S_IWUSR;
+ info->port_attr.store = set_port_type;
+ }
+ info->port_attr.show = show_port_type;
+ sysfs_attr_init(&info->port_attr.attr);
+
+ err = device_create_file(&dev->persist->pdev->dev, &info->port_attr);
+ if (err) {
+ mlx4_err(dev, "Failed to create file for port %d\n", port);
+ info->port = -1;
+ }
+
+ sprintf(info->dev_mtu_name, "mlx4_port%d_mtu", port);
+ info->port_mtu_attr.attr.name = info->dev_mtu_name;
+ if (mlx4_is_mfunc(dev))
+ info->port_mtu_attr.attr.mode = S_IRUGO;
+ else {
+ info->port_mtu_attr.attr.mode = S_IRUGO | S_IWUSR;
+ info->port_mtu_attr.store = set_port_ib_mtu;
+ }
+ info->port_mtu_attr.show = show_port_ib_mtu;
+ sysfs_attr_init(&info->port_mtu_attr.attr);
+
+ err = device_create_file(&dev->persist->pdev->dev,
+ &info->port_mtu_attr);
+ if (err) {
+ mlx4_err(dev, "Failed to create mtu file for port %d\n", port);
+ device_remove_file(&info->dev->persist->pdev->dev,
+ &info->port_attr);
+ info->port = -1;
+ }
+#endif
+
+ return err;
+}
+
+static void mlx4_cleanup_port_info(struct mlx4_port_info *info)
+{
+ if (info->port < 0)
+ return;
+#ifdef KMOD_DISABLED
+ device_remove_file(&info->dev->persist->pdev->dev, &info->port_attr);
+ device_remove_file(&info->dev->persist->pdev->dev,
+ &info->port_mtu_attr);
+#endif
+#ifdef CONFIG_RFS_ACCEL
+ free_irq_cpu_rmap(info->rmap);
+ info->rmap = NULL;
+#endif
+
+}
+
+static int mlx4_init_steering(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int num_entries = dev->caps.num_ports;
+ int i, j;
+
+ priv->steer = kzalloc(sizeof(struct mlx4_steer) * num_entries, GFP_KERNEL);
+ if (!priv->steer)
+ return -ENOMEM;
+
+ for (i = 0; i < num_entries; i++)
+ for (j = 0; j < MLX4_NUM_STEERS; j++) {
+ INIT_LIST_HEAD(&priv->steer[i].promisc_qps[j]);
+ INIT_LIST_HEAD(&priv->steer[i].steer_entries[j]);
+ }
+ return 0;
+}
+
+static void mlx4_clear_steering(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_steer_index *entry, *tmp_entry;
+ struct mlx4_promisc_qp *pqp, *tmp_pqp;
+ int num_entries = dev->caps.num_ports;
+ int i, j;
+
+ for (i = 0; i < num_entries; i++) {
+ for (j = 0; j < MLX4_NUM_STEERS; j++) {
+ list_for_each_entry_safe(pqp, tmp_pqp,
+ &priv->steer[i].promisc_qps[j],
+ list) {
+ list_del(&pqp->list);
+ kfree(pqp);
+ }
+ list_for_each_entry_safe(entry, tmp_entry,
+ &priv->steer[i].steer_entries[j],
+ list) {
+ list_del(&entry->list);
+ list_for_each_entry_safe(pqp, tmp_pqp,
+ &entry->duplicates,
+ list) {
+ list_del(&pqp->list);
+ kfree(pqp);
+ }
+ kfree(entry);
+ }
+ }
+ }
+ kfree(priv->steer);
+}
+
+static int extended_func_num(struct rte_pci_device *pdev)
+{
+ return PCI_SLOT(pdev->addr.function) * 8 + PCI_FUNC(pdev->addr.function); //XXX
+}
+
+#define MLX4_OWNER_BASE 0x8069c
+#define MLX4_OWNER_SIZE 4
+
+static int mlx4_get_ownership(struct mlx4_dev *dev)
+{
+ void __iomem *owner;
+ u32 ret;
+
+ //if (pci_channel_offline(dev->persist->pdev))
+// return -EIO;
+ assert(dev->persist->rte_pdev->mem_resource[0].len >= (MLX4_OWNER_BASE + MLX4_OWNER_SIZE));
+ owner = RTE_PTR_ADD(dev->persist->rte_pdev->mem_resource[0].addr, MLX4_OWNER_BASE);
+/*
+ owner = ioremap(pci_resource_start(dev->persist->pdev, 0) +
+ MLX4_OWNER_BASE,
+ MLX4_OWNER_SIZE);
+ */
+ if (!owner) {
+ mlx4_err(dev, "Failed to obtain ownership bit\n");
+ return -ENOMEM;
+ }
+
+ ret = readl(owner);
+ //iounmap(owner);
+ return (int) !!ret;
+}
+
+static void mlx4_free_ownership(struct mlx4_dev *dev)
+{
+ void __iomem *owner;
+
+ //if (pci_channel_offline(dev->persist->pdev))
+ // return;
+ assert(dev->persist->rte_pdev->mem_resource[0].len >= (MLX4_OWNER_BASE + MLX4_OWNER_SIZE));
+ owner = RTE_PTR_ADD(dev->persist->rte_pdev->mem_resource[0].addr, MLX4_OWNER_BASE);
+ /*
+ owner = ioremap(pci_resource_start(dev->persist->pdev, 0) +
+ MLX4_OWNER_BASE,
+ MLX4_OWNER_SIZE);
+ */
+ if (!owner) {
+ mlx4_err(dev, "Failed to obtain ownership bit\n");
+ return;
+ }
+ writel(0, owner);
+ //msleep(1000);
+ msleep(5000); //sleep more
+ //iounmap(owner);
+}
+
+#define SRIOV_VALID_STATE(flags) (!!((flags) & MLX4_FLAG_SRIOV) ==\
+ !!((flags) & MLX4_FLAG_MASTER))
+
+#ifndef HAVE_PCI_NUM_VF
+static int mlx4_find_vfs(struct rte_pci_device *pdev)
+{
+#ifdef KMOD_DISABLED
+ struct pci_dev *dev;
+ int vfs = 0, pos;
+ u16 offset, stride;
+
+ pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_SRIOV);
+ if (!pos)
+ return 0;
+ pci_read_config_word(pdev, pos + PCI_SRIOV_VF_OFFSET, &offset);
+ pci_read_config_word(pdev, pos + PCI_SRIOV_VF_STRIDE, &stride);
+
+ dev = pci_get_device(pdev->vendor, PCI_ANY_ID, NULL);
+ while (dev) {
+ if (dev->is_virtfn && pci_physfn(dev) == pdev) {
+ vfs++;
+ }
+ dev = pci_get_device(pdev->vendor, PCI_ANY_ID, dev);
+ }
+ return vfs;
+#else
+ return 0; //no virtual function
+#endif
+}
+#endif
+
+static u64 mlx4_enable_sriov(struct mlx4_dev *dev, struct rte_pci_device *pdev,
+ u8 total_vfs, int existing_vfs, int reset_flow)
+{
+ u64 dev_flags = dev->flags;
+ int err = 0;
+
+ if (reset_flow) {
+ dev->dev_vfs = kcalloc(total_vfs, sizeof(*dev->dev_vfs),
+ GFP_KERNEL);
+ if (!dev->dev_vfs)
+ goto free_mem;
+ return dev_flags;
+ }
+
+ atomic_inc(&pf_loading);
+ if (dev->flags & MLX4_FLAG_SRIOV) {
+ if (existing_vfs != total_vfs) {
+ mlx4_err(dev, "SR-IOV was already enabled, but with num_vfs (%d) different than requested (%d)\n",
+ existing_vfs, total_vfs);
+ total_vfs = existing_vfs;
+ }
+ }
+
+ dev->dev_vfs = kzalloc(total_vfs * sizeof(*dev->dev_vfs), GFP_KERNEL);
+ if (NULL == dev->dev_vfs) {
+ mlx4_err(dev, "Failed to allocate memory for VFs\n");
+ goto disable_sriov;
+ }
+
+ if (!(dev->flags & MLX4_FLAG_SRIOV)) {
+ mlx4_warn(dev, "Enabling SR-IOV with %d VFs\n", total_vfs);
+ //err = pci_enable_sriov(pdev, total_vfs);
+ assert(0); //not implemented
+ }
+ if (err) {
+ mlx4_err(dev, "Failed to enable SR-IOV, continuing without SR-IOV (err = %d)\n",
+ err);
+ goto disable_sriov;
+ } else {
+ mlx4_warn(dev, "Running in master mode\n");
+ dev_flags |= MLX4_FLAG_SRIOV |
+ MLX4_FLAG_MASTER;
+ dev_flags &= ~MLX4_FLAG_SLAVE;
+ dev->persist->num_vfs = total_vfs;
+ }
+ return dev_flags;
+
+disable_sriov:
+ atomic_dec(&pf_loading);
+free_mem:
+ dev->persist->num_vfs = 0;
+ kfree(dev->dev_vfs);
+ dev->dev_vfs = NULL;
+ return dev_flags & ~MLX4_FLAG_MASTER;
+}
+
+enum {
+ MLX4_DEV_CAP_CHECK_NUM_VFS_ABOVE_64 = -1,
+};
+
+static int mlx4_check_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap,
+ int *nvfs)
+{
+ int requested_vfs = nvfs[0] + nvfs[1] + nvfs[2];
+ /* Checking for 64 VFs as a limitation of CX2 */
+ if (!(dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_80_VFS) &&
+ requested_vfs >= 64) {
+ mlx4_err(dev, "Requested %d VFs, but FW does not support more than 64\n",
+ requested_vfs);
+ return MLX4_DEV_CAP_CHECK_NUM_VFS_ABOVE_64;
+ }
+ return 0;
+}
+
+static int mlx4_load_one(struct rte_pci_device *pdev, int pci_dev_data,
+ int total_vfs, int *nvfs, struct mlx4_priv *priv,
+ int reset_flow)
+{
+ struct mlx4_dev *dev;
+ unsigned sum = 0;
+ int err;
+ int port;
+ int i;
+ struct mlx4_dev_cap *dev_cap = NULL;
+ int num_vfs_argc =
+ mlx4_get_argc(num_vfs.dbdf2val.tbl, pdev);
+ int probe_vfs_argc =
+ mlx4_get_argc(probe_vf.dbdf2val.tbl, pdev);
+ /* existing_vfs will contain the number of VFs which were active when
+ remove_one was invoked on the PF driver. In this case,
+ the PF driver did not disable SRIOV during remove_one.
+ When the PF is reloaded (mlx4_load_one), SRIOV is therefore
+ still enabled, and pci_enable_sriov should not be called. */
+ int existing_vfs = 0;
+
+ dev = &priv->dev;
+
+ INIT_LIST_HEAD(&priv->dev_list);
+ INIT_LIST_HEAD(&priv->ctx_list);
+ spin_lock_init(&priv->ctx_lock);
+
+ mutex_init(&priv->port_mutex);
+ mutex_init(&priv->bond_mutex);
+
+ INIT_LIST_HEAD(&priv->pgdir_list);
+ mutex_init(&priv->pgdir_mutex);
+
+ INIT_LIST_HEAD(&priv->bf_list);
+ mutex_init(&priv->bf_mutex);
+
+ dev->rev_id = 0;//pdev->revision; XXX
+ dev->numa_node = pdev->numa_node;
+ if (dev->numa_node == -1)
+ dev->numa_node = 0;
+ memcpy(dev->persist->nvfs, nvfs, sizeof(dev->persist->nvfs));
+
+ /* Detect if this device is a virtual function */
+ if (pci_dev_data & MLX4_PCI_DEV_IS_VF) {
+ mlx4_warn(dev, "Detected virtual function - running in slave mode\n");
+ dev->flags |= MLX4_FLAG_SLAVE;
+ } else {
+ /* We reset the device and enable SRIOV only for physical
+ * devices. Try to claim ownership on the device;
+ * if already taken, skip -- do not allow multiple PFs */
+#ifdef KMOD_DISABLED
+ err = mlx4_get_ownership(dev);
+ if (err) {
+ if (err < 0)
+ return err;
+ else {
+ mlx4_warn(dev, "Multiple PFs not yet supported - Skipping PF\n");
+ return -EINVAL;
+ }
+ }
+#endif
+
+ atomic_set(&priv->opreq_count, 0);
+ //INIT_WORK(&priv->opreq_task, mlx4_opreq_action);
+
+ /*
+ * Now reset the HCA before we touch the PCI capabilities or
+ * attempt a firmware command, since a boot ROM may have left
+ * the HCA in an undefined state.
+ */
+ err = mlx4_reset(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to reset HCA, aborting\n");
+ goto err_sriov;
+ }
+
+ if (total_vfs) {
+ dev->flags = MLX4_FLAG_MASTER;
+#ifdef HAVE_PCI_NUM_VF
+ existing_vfs = pci_num_vf(pdev);
+#else
+ existing_vfs = mlx4_find_vfs(pdev);
+#endif
+ if (existing_vfs)
+ dev->flags |= MLX4_FLAG_SRIOV;
+ dev->persist->num_vfs = total_vfs;
+ }
+ }
+
+ /* on load remove any previous indication of internal error,
+ * device is up.
+ */
+ dev->persist->state = MLX4_DEVICE_STATE_UP;
+
+slave_start:
+ err = mlx4_cmd_init(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to init command interface, aborting\n");
+ goto err_sriov;
+ }
+
+ /* In slave functions, the communication channel must be initialized
+ * before posting commands. Also, init num_slaves before calling
+ * mlx4_init_hca */
+ if (mlx4_is_mfunc(dev)) {
+ if (mlx4_is_master(dev)) {
+ dev->num_slaves = MLX4_MAX_NUM_SLAVES;
+
+ } else {
+ dev->num_slaves = 0;
+ err = mlx4_multi_func_init(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to init slave mfunc interface, aborting\n");
+ goto err_cmd;
+ }
+ }
+ }
+
+ err = mlx4_init_fw(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to init fw, aborting.\n");
+ goto err_mfunc;
+ }
+
+ if (mlx4_is_master(dev)) {
+ /* when we hit the goto slave_start below, dev_cap is already initialized */
+ if (!dev_cap) {
+ dev_cap = kzalloc(sizeof(*dev_cap), GFP_KERNEL);
+
+ if (!dev_cap) {
+ err = -ENOMEM;
+ goto err_fw;
+ }
+
+ err = mlx4_QUERY_DEV_CAP(dev, dev_cap);
+ if (err) {
+ mlx4_err(dev, "QUERY_DEV_CAP command failed, aborting.\n");
+ goto err_fw;
+ }
+
+ if (mlx4_check_dev_cap(dev, dev_cap, nvfs))
+ goto err_fw;
+
+ if (!(dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_SYS_EQS)) {
+ u64 dev_flags = mlx4_enable_sriov(dev, pdev,
+ total_vfs,
+ existing_vfs,
+ reset_flow);
+
+ mlx4_cmd_cleanup(dev, MLX4_CMD_CLEANUP_ALL);
+ dev->flags = dev_flags;
+ if (!SRIOV_VALID_STATE(dev->flags)) {
+ mlx4_err(dev, "Invalid SRIOV state\n");
+ goto err_sriov;
+ }
+ err = mlx4_reset(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to reset HCA, aborting.\n");
+ goto err_sriov;
+ }
+ goto slave_start;
+ }
+ } else {
+ /* Legacy mode FW requires SRIOV to be enabled before
+ * doing QUERY_DEV_CAP, since max_eq's value is different if
+ * SRIOV is enabled.
+ */
+ memset(dev_cap, 0, sizeof(*dev_cap));
+ err = mlx4_QUERY_DEV_CAP(dev, dev_cap);
+ if (err) {
+ mlx4_err(dev, "QUERY_DEV_CAP command failed, aborting.\n");
+ goto err_fw;
+ }
+
+ if (mlx4_check_dev_cap(dev, dev_cap, nvfs))
+ goto err_fw;
+ }
+ }
+
+ err = mlx4_init_hca(dev);
+ if (err) {
+ if (err == -EACCES) {
+ /* Not primary Physical function
+ * Running in slave mode */
+ mlx4_cmd_cleanup(dev, MLX4_CMD_CLEANUP_ALL);
+ /* We're not a PF */
+ if (dev->flags & MLX4_FLAG_SRIOV) {
+ if (!existing_vfs)
+ {
+ //pci_disable_sriov(pdev);
+ //assert(0);
+ }
+ if (mlx4_is_master(dev) && !reset_flow)
+ atomic_dec(&pf_loading);
+ dev->flags &= ~MLX4_FLAG_SRIOV;
+ }
+ if (!mlx4_is_slave(dev))
+ mlx4_free_ownership(dev);
+ dev->flags |= MLX4_FLAG_SLAVE;
+ dev->flags &= ~MLX4_FLAG_MASTER;
+ goto slave_start;
+ } else
+ goto err_fw;
+ }
+
+ if (mlx4_is_master(dev) && (dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_SYS_EQS)) {
+ u64 dev_flags = mlx4_enable_sriov(dev, pdev, total_vfs,
+ existing_vfs, reset_flow);
+
+ if ((dev->flags ^ dev_flags) & (MLX4_FLAG_MASTER | MLX4_FLAG_SLAVE)) {
+ mlx4_cmd_cleanup(dev, MLX4_CMD_CLEANUP_VHCR);
+ dev->flags = dev_flags;
+ err = mlx4_cmd_init(dev);
+ if (err) {
+ /* Only VHCR is cleaned up, so could still
+ * send FW commands
+ */
+ mlx4_err(dev, "Failed to init VHCR command interface, aborting\n");
+ goto err_close;
+ }
+ } else {
+ dev->flags = dev_flags;
+ }
+
+ if (!SRIOV_VALID_STATE(dev->flags)) {
+ mlx4_err(dev, "Invalid SRIOV state\n");
+ goto err_close;
+ }
+ }
+
+ /* Check if the device is functioning at its maximum possible speed.
+ * No return code for this call; just warn the user if the PCI
+ * Express device capabilities are under-satisfied by the bus.
+ */
+ //if (!mlx4_is_slave(dev))
+ // mlx4_check_pcie_caps(dev);
+
+ /* In master functions, the communication channel must be initialized
+ * after obtaining its address from fw */
+ if (mlx4_is_master(dev)) {
+ int ib_ports = 0;
+
+ mlx4_foreach_port(i, dev, MLX4_PORT_TYPE_IB)
+ ib_ports++;
+
+ if (ib_ports &&
+ (num_vfs_argc > 1 || probe_vfs_argc > 1)) {
+ mlx4_err(dev,
+ "Invalid syntax of num_vfs/probe_vfs with IB port - single port VFs syntax is only supported when all ports are configured as ethernet\n");
+ err = -EINVAL;
+ goto err_close;
+ }
+ if (dev->caps.num_ports < 2 &&
+ num_vfs_argc > 1) {
+ err = -EINVAL;
+ mlx4_err(dev,
+ "Error: Trying to configure VFs on port 2, but HCA has only %d physical ports\n",
+ dev->caps.num_ports);
+ goto err_close;
+ }
+ memcpy(dev->persist->nvfs, nvfs, sizeof(dev->persist->nvfs));
+
+ for (i = 0;
+ i < sizeof(dev->persist->nvfs)/
+ sizeof(dev->persist->nvfs[0]); i++) {
+ unsigned j;
+
+ for (j = 0; j < dev->persist->nvfs[i]; ++sum, ++j) {
+ dev->dev_vfs[sum].min_port = i < 2 ? i + 1 : 1;
+ dev->dev_vfs[sum].n_ports = i < 2 ? 1 :
+ dev->caps.num_ports;
+ }
+ }
+
+ /* In master functions, the communication channel
+ * must be initialized after obtaining its address from fw
+ */
+ err = mlx4_multi_func_init(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to init master mfunc interface, aborting.\n");
+ goto err_close;
+ }
+ }
+
+ err = mlx4_alloc_eq_table(dev);
+ if (err)
+ goto err_master_mfunc;
+
+ bitmap_zero(priv->msix_ctl.pool_bm, MAX_MSIX);
+ mutex_init(&priv->msix_ctl.pool_lock);
+
+ mlx4_enable_msi_x(dev);
+ if ((mlx4_is_mfunc(dev)) &&
+ !(dev->flags & MLX4_FLAG_MSI_X)) {
+ err = -ENOSYS;
+ mlx4_err(dev, "INTx is not supported in multi-function mode, aborting\n");
+ goto err_free_eq;
+ }
+
+ if (!mlx4_is_slave(dev)) {
+ err = mlx4_init_steering(dev);
+ if (err)
+ goto err_disable_msix;
+ }
+
+ //err = mlx4_setup_hca(dev);
+ // we have only one intr vector XXX
+ //if (err == -EBUSY && (dev->flags & MLX4_FLAG_MSI_X) &&
+ // !mlx4_is_mfunc(dev)) {
+ dev->flags &= ~MLX4_FLAG_MSI_X;
+ dev->caps.num_comp_vectors = 1;
+ //pci_disable_msix(pdev);
+ err = mlx4_setup_hca(dev);
+ //}
+
+ if (err)
+ goto err_steer;
+
+ mlx4_init_quotas(dev);
+ /* When PF resources are ready arm its comm channel to enable
+ * getting commands
+ */
+ if (mlx4_is_master(dev)) {
+ err = mlx4_ARM_COMM_CHANNEL(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to arm comm channel eq: %x\n",
+ err);
+ goto err_steer;
+ }
+ }
+
+ for (port = 1; port <= dev->caps.num_ports; port++) {
+ err = mlx4_init_port_info(dev, port);
+ if (err)
+ goto err_port;
+ }
+
+ priv->v2p.port1 = 1;
+ priv->v2p.port2 = 2;
+
+ err = mlx4_register_device(dev);
+ if (err)
+ goto err_port;
+
+ //mlx4_request_modules(dev);
+
+ mlx4_sense_init(dev);
+ mlx4_start_sense(dev);
+
+ priv->removed = 0;
+
+ if (mlx4_is_master(dev) && dev->persist->num_vfs && !reset_flow)
+ atomic_dec(&pf_loading);
+
+ kfree(dev_cap);
+ return 0;
+
+err_port:
+ for (--port; port >= 1; --port)
+ mlx4_cleanup_port_info(&priv->port[port]);
+
+ mlx4_cleanup_counters_table(dev);
+ mlx4_cleanup_qp_table(dev);
+ mlx4_cleanup_srq_table(dev);
+ mlx4_cleanup_cq_table(dev);
+ mlx4_cmd_use_polling(dev);
+ mlx4_cleanup_eq_table(dev);
+ mlx4_cleanup_mcg_table(dev);
+ mlx4_cleanup_mr_table(dev);
+ mlx4_cleanup_xrcd_table(dev);
+ mlx4_cleanup_pd_table(dev);
+ mlx4_cleanup_uar_table(dev);
+
+err_steer:
+ if (!mlx4_is_slave(dev))
+ mlx4_clear_steering(dev);
+
+err_disable_msix:
+ //if (dev->flags & MLX4_FLAG_MSI_X)
+ // pci_disable_msix(pdev);
+
+err_free_eq:
+ mlx4_free_eq_table(dev);
+
+err_master_mfunc:
+ if (mlx4_is_master(dev)) {
+ mlx4_free_resource_tracker(dev, RES_TR_FREE_STRUCTS_ONLY);
+ mlx4_multi_func_cleanup(dev);
+ }
+
+ if (mlx4_is_slave(dev)) {
+ kfree(dev->caps.qp0_qkey);
+ kfree(dev->caps.qp0_tunnel);
+ kfree(dev->caps.qp0_proxy);
+ kfree(dev->caps.qp1_tunnel);
+ kfree(dev->caps.qp1_proxy);
+ }
+
+err_close:
+ mlx4_close_hca(dev);
+
+err_fw:
+ mlx4_close_fw(dev);
+
+err_mfunc:
+ if (mlx4_is_slave(dev))
+ mlx4_multi_func_cleanup(dev);
+
+err_cmd:
+ mlx4_cmd_cleanup(dev, MLX4_CMD_CLEANUP_ALL);
+
+err_sriov:
+ if (dev->flags & MLX4_FLAG_SRIOV && !existing_vfs) {
+ //pci_disable_sriov(pdev);
+ dev->flags &= ~MLX4_FLAG_SRIOV;
+ }
+
+ if (mlx4_is_master(dev) && dev->persist->num_vfs && !reset_flow)
+ atomic_dec(&pf_loading);
+
+ kfree(priv->dev.dev_vfs);
+
+ if (!mlx4_is_slave(dev))
+ mlx4_free_ownership(dev);
+
+ kfree(dev_cap);
+ return err;
+}
+
+static int __mlx4_init_one(struct rte_pci_device *pdev, int pci_dev_data,
+ struct mlx4_priv *priv)
+{
+ int err;
+ unsigned int i;
+ unsigned total_vfs = 0;
+ int nvfs[MLX4_MAX_PORTS + 1] = {0, 0, 0};
+ int prb_vf[MLX4_MAX_PORTS + 1] = {0, 0, 0};
+ const int param_map[MLX4_MAX_PORTS + 1][MLX4_MAX_PORTS + 1] = {
+ {2, 0, 0}, {0, 1, 2}, {0, 1, 2} };
+ int num_vfs_argc =
+ mlx4_get_argc(num_vfs.dbdf2val.tbl, pdev);
+ int probe_vfs_argc =
+ mlx4_get_argc(probe_vf.dbdf2val.tbl, pdev);
+
+ pr_info(DRV_NAME ": Initializing %s\n", "mlx4");
+
+ err = 0;//pci_enable_device(pdev);
+ if (err) {
+ dev_err(&pdev->dev, "Cannot enable PCI device, aborting.\n");
+ return err;
+ }
+
+ for (i = 0; i < num_vfs_argc;
+ total_vfs += nvfs[param_map[num_vfs_argc - 1][i]], i++) {
+ int *cur_nvfs = &nvfs[param_map[num_vfs_argc - 1][i]];
+ mlx4_get_val(num_vfs.dbdf2val.tbl, pdev, i,
+ cur_nvfs);
+ if (*cur_nvfs < 0) {
+ dev_err(&pdev->dev, "num_vfs module parameter cannot be negative\n");
+ err = -EINVAL;
+ goto err_disable_pdev;
+ }
+ }
+ for (i = 0; i < probe_vfs_argc; i++) {
+ int *cur_prbvf = &prb_vf[param_map[probe_vfs_argc - 1][i]];
+ mlx4_get_val(probe_vf.dbdf2val.tbl, pdev, i,
+ cur_prbvf);
+ if (*cur_prbvf < 0) {
+ dev_err(&pdev->dev, "probe_vf module parameter cannot be negative\n");
+ err = -EINVAL;
+ goto err_disable_pdev;
+ }
+ }
+ for (i = 0; i < sizeof(nvfs)/sizeof(nvfs[0]); i++) {
+ if (prb_vf[i] > nvfs[i]) {
+ dev_err(&pdev->dev, "probe_vf module parameter cannot be greater than num_vfs\n");
+ err = -EINVAL;
+ goto err_disable_pdev;
+ }
+ }
+ if (total_vfs > MLX4_MAX_NUM_VF) {
+ dev_err(&pdev->dev, "total vfs (%u) can't be more than %d\n",
+ total_vfs, MLX4_MAX_NUM_VF);
+ err = -EINVAL;
+ goto err_disable_pdev;
+ }
+
+ for (i = 0; i < MLX4_MAX_PORTS; i++) {
+ if (nvfs[i] + nvfs[2] >= MLX4_MAX_NUM_VF_P_PORT) {
+ dev_err(&pdev->dev,
+ "Requested more VF's (%d) for port (%d) than allowed (%d)\n",
+ nvfs[i] + nvfs[2], i + 1,
+ MLX4_MAX_NUM_VF_P_PORT - 1);
+ err = -EINVAL;
+ goto err_disable_pdev;
+ }
+ }
+
+ /* Check for BARs. */
+ if (!(pci_dev_data & MLX4_PCI_DEV_IS_VF) &&
+ pdev->mem_resource[0].len == 0) {
+ dev_err(&pdev->dev, "Missing DCS, aborting (driver_data: 0x%x)\n",
+ pci_dev_data);
+ err = -ENODEV;
+ goto err_disable_pdev;
+ }
+ if (pdev->mem_resource[2].len == 0) {
+ dev_err(&pdev->dev, "Missing UAR, aborting\n");
+ err = -ENODEV;
+ goto err_disable_pdev;
+ }
+/*
+ err = pci_request_regions(pdev, DRV_NAME);
+ if (err) {
+ dev_err(&pdev->dev, "Couldn't get PCI resources, aborting\n");
+ goto err_disable_pdev;
+ }
+ */
+
+ //pci_set_master(pdev);
+#ifdef KMOD_DISABLED
+ err = pci_set_dma_mask(pdev, DMA_BIT_MASK(64));
+ if (err) {
+ dev_warn(&pdev->dev, "Warning: couldn't set 64-bit PCI DMA mask\n");
+ err = pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
+ if (err) {
+ dev_err(&pdev->dev, "Can't set PCI DMA mask, aborting\n");
+ goto err_release_regions;
+ }
+ }
+ err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
+ if (err) {
+ dev_warn(&pdev->dev, "Warning: couldn't set 64-bit consistent PCI DMA mask\n");
+ err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
+ if (err) {
+ dev_err(&pdev->dev, "Can't set consistent PCI DMA mask, aborting\n");
+ goto err_release_regions;
+ }
+ }
+
+ /* Allow large DMA segments, up to the firmware limit of 1 GB */
+ dma_set_max_seg_size(&pdev->dev, 1024 * 1024 * 1024);
+#endif
+ /* Detect if this device is a virtual function */
+ if (pci_dev_data & MLX4_PCI_DEV_IS_VF) {
+ /* When acting as pf, we normally skip vfs unless explicitly
+ * requested to probe them.
+ */
+ if (total_vfs) {
+ unsigned vfs_offset = 0;
+
+ for (i = 0; i < sizeof(nvfs)/sizeof(nvfs[0]) &&
+ vfs_offset + nvfs[i] < extended_func_num(pdev);
+ vfs_offset += nvfs[i], i++)
+ ;
+ if (i == sizeof(nvfs)/sizeof(nvfs[0])) {
+ err = -ENODEV;
+ goto err_release_regions;
+ }
+ if ((extended_func_num(pdev) - vfs_offset)
+ > prb_vf[i]) {
+ dev_warn(&pdev->dev, "Skipping virtual function:%d\n",
+ extended_func_num(pdev));
+ err = -ENODEV;
+ goto err_release_regions;
+ }
+ }
+ }
+
+ err = mlx4_catas_init(&priv->dev);
+ if (err)
+ goto err_release_regions;
+
+ err = mlx4_load_one(pdev, pci_dev_data, total_vfs, nvfs, priv, 0);
+ if (err)
+ goto err_catas;
+
+ return 0;
+
+err_catas:
+ mlx4_catas_end(&priv->dev);
+
+err_release_regions:
+ //pci_release_regions(pdev);
+
+err_disable_pdev:
+ //pci_disable_device(pdev);
+ //pci_set_drvdata(pdev, NULL);
+ return err;
+}
+
+static int mlx4_init_one(struct rte_pci_device *pdev, unsigned long driver_data)
+{
+ struct mlx4_priv *priv;
+ struct mlx4_dev *dev;
+ int ret;
+
+ printk_once(KERN_INFO "%s", mlx4_version);
+
+ priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
+
+ dev = &priv->dev;
+ dev->persist = kzalloc(sizeof(*dev->persist), GFP_KERNEL);
+ if (!dev->persist) {
+ kfree(priv);
+ return -ENOMEM;
+ }
+ dev->persist->rte_pdev = pdev;
+ dev->persist->dev = dev;
+ pdev->driver->priv = dev->persist;
+ //pci_set_drvdata(pdev, dev->persist);
+ priv->pci_dev_data = driver_data;
+ mutex_init(&dev->persist->device_state_mutex);
+ mutex_init(&dev->persist->interface_state_mutex);
+
+ ret = __mlx4_init_one(pdev, driver_data, priv);
+ if (ret) {
+ pdev->driver->priv = NULL; /* do not leave a dangling pointer */
+ kfree(dev->persist);
+ kfree(priv);
+ }
+ /* Kernel driver calls pci_save_state(pdev) here on success. */
+
+ return ret;
+}
+
+static void mlx4_clean_dev(struct mlx4_dev *dev)
+{
+ struct mlx4_dev_persistent *persist = dev->persist;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ unsigned long flags = (dev->flags & RESET_PERSIST_MASK_FLAGS);
+
+ memset(priv, 0, sizeof(*priv));
+ priv->dev.persist = persist;
+ priv->dev.flags = flags;
+}
+
+static void mlx4_unload_one(struct rte_pci_device *pdev)
+{
+ struct mlx4_dev_persistent *persist = (struct mlx4_dev_persistent *)pdev->driver->priv;//pci_get_drvdata(pdev);
+ struct mlx4_dev *dev = persist->dev;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int pci_dev_data;
+ int p, i;
+
+ if (priv->removed)
+ return;
+
+ /* Save the current port types for later use. */
+ for (i = 0; i < dev->caps.num_ports; i++) {
+ dev->persist->curr_port_type[i] = dev->caps.port_type[i + 1];
+ dev->persist->curr_port_poss_type[i] = dev->caps.
+ possible_type[i + 1];
+ }
+
+ pci_dev_data = priv->pci_dev_data;
+
+ mlx4_stop_sense(dev);
+ mlx4_unregister_device(dev);
+
+ for (p = 1; p <= dev->caps.num_ports; p++) {
+ mlx4_cleanup_port_info(&priv->port[p]);
+ mlx4_CLOSE_PORT(dev, p);
+ }
+
+ if (mlx4_is_master(dev))
+ mlx4_free_resource_tracker(dev,
+ RES_TR_FREE_SLAVES_ONLY);
+
+ mlx4_cleanup_counters_table(dev);
+ mlx4_cleanup_qp_table(dev);
+ mlx4_cleanup_srq_table(dev);
+ mlx4_cleanup_cq_table(dev);
+ mlx4_cmd_use_polling(dev);
+ mlx4_cleanup_eq_table(dev);
+ mlx4_cleanup_mcg_table(dev);
+ mlx4_cleanup_mr_table(dev);
+ mlx4_cleanup_xrcd_table(dev);
+ mlx4_cleanup_pd_table(dev);
+
+ if (mlx4_is_master(dev))
+ mlx4_free_resource_tracker(dev,
+ RES_TR_FREE_STRUCTS_ONLY);
+
+ //iounmap(priv->kar);
+ mlx4_uar_free(dev, &priv->driver_uar);
+ mlx4_cleanup_uar_table(dev);
+ if (!mlx4_is_slave(dev))
+ mlx4_clear_steering(dev);
+ mlx4_free_eq_table(dev);
+ if (mlx4_is_master(dev))
+ mlx4_multi_func_cleanup(dev);
+ mlx4_close_hca(dev);
+ mlx4_close_fw(dev);
+ if (mlx4_is_slave(dev))
+ mlx4_multi_func_cleanup(dev);
+ mlx4_cmd_cleanup(dev, MLX4_CMD_CLEANUP_ALL);
+
+// if (dev->flags & MLX4_FLAG_MSI_X)
+// pci_disable_msix(pdev);
+
+ if (!mlx4_is_slave(dev))
+ mlx4_free_ownership(dev);
+
+ kfree(dev->caps.qp0_qkey);
+ kfree(dev->caps.qp0_tunnel);
+ kfree(dev->caps.qp0_proxy);
+ kfree(dev->caps.qp1_tunnel);
+ kfree(dev->caps.qp1_proxy);
+ kfree(dev->dev_vfs);
+
+ mlx4_clean_dev(dev);
+ priv->pci_dev_data = pci_dev_data;
+ priv->removed = 1;
+}
+
+static void mlx4_remove_one(struct rte_pci_device *pdev)
+{
+ struct mlx4_dev_persistent *persist = (struct mlx4_dev_persistent *)pdev->driver->priv;
+ struct mlx4_dev *dev = persist->dev;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int active_vfs = 0;
+
+ mutex_lock(&persist->interface_state_mutex);
+ persist->interface_state |= MLX4_INTERFACE_STATE_DELETION;
+ mutex_unlock(&persist->interface_state_mutex);
+
+ /* Disabling SR-IOV is not allowed while there are active vf's */
+ if (mlx4_is_master(dev) && dev->flags & MLX4_FLAG_SRIOV) {
+ active_vfs = mlx4_how_many_lives_vf(dev);
+ if (active_vfs) {
+ pr_warn("Removing PF when there are active VF's !!\n");
+ pr_warn("Will not disable SR-IOV.\n");
+ }
+ }
+
+ /* The device is now marked as being deleted; proceed without the
+ * lock so that other tasks can terminate.
+ */
+ if (persist->interface_state & MLX4_INTERFACE_STATE_UP)
+ mlx4_unload_one(pdev);
+ else
+ mlx4_info(dev, "%s: interface is down\n", __func__);
+ mlx4_catas_end(dev);
+ if (dev->flags & MLX4_FLAG_SRIOV && !active_vfs) {
+ mlx4_warn(dev, "Disabling SR-IOV\n");
+// pci_disable_sriov(pdev);
+ }
+
+// pci_release_regions(pdev);
+// pci_disable_device(pdev);
+ kfree(dev->persist);
+ kfree(priv);
+ pdev->driver->priv = NULL;
+ //pci_set_drvdata(pdev, NULL);
+}
+
+static int restore_current_port_types(struct mlx4_dev *dev,
+ enum mlx4_port_type *types,
+ enum mlx4_port_type *poss_types)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int err, i;
+
+ mlx4_stop_sense(dev);
+
+ mutex_lock(&priv->port_mutex);
+ for (i = 0; i < dev->caps.num_ports; i++)
+ dev->caps.possible_type[i + 1] = poss_types[i];
+ err = mlx4_change_port_types(dev, types);
+ mlx4_start_sense(dev);
+ mutex_unlock(&priv->port_mutex);
+
+ return err;
+}
+
+int mlx4_restart_one(struct rte_pci_device *pdev)
+{
+ struct mlx4_dev_persistent *persist = (struct mlx4_dev_persistent *)pdev->driver->priv;
+ struct mlx4_dev *dev = persist->dev;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int nvfs[MLX4_MAX_PORTS + 1] = {0, 0, 0};
+ int pci_dev_data, err, total_vfs;
+
+ pci_dev_data = priv->pci_dev_data;
+ total_vfs = dev->persist->num_vfs;
+ memcpy(nvfs, dev->persist->nvfs, sizeof(dev->persist->nvfs));
+
+ mlx4_unload_one(pdev);
+ err = mlx4_load_one(pdev, pci_dev_data, total_vfs, nvfs, priv, 1);
+ if (err) {
+ mlx4_err(dev, "%s: ERROR: mlx4_load_one failed, pci_name=%s, err=%d\n",
+ __func__, "mlx4", err);
+ return err;
+ }
+
+ err = restore_current_port_types(dev, dev->persist->curr_port_type,
+ dev->persist->curr_port_poss_type);
+ if (err)
+ mlx4_err(dev, "could not restore original port types (%d)\n",
+ err);
+
+ return err;
+}
+
+struct pci_driver_data
+{
+ u32 vendor;
+ u32 device;
+ unsigned long driver_data;
+};
+
+#define PCI_ENTRY(a,b,c) .vendor = a, .device = b, .driver_data = c
+
+static const struct pci_driver_data mlx4_pci_table[] = {
+ /* MT25408 "Hermon" SDR */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x6340, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT25408 "Hermon" DDR */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x634a, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT25408 "Hermon" QDR */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x6354, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT25408 "Hermon" DDR PCIe gen2 */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x6732, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT25408 "Hermon" QDR PCIe gen2 */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x673c, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT25408 "Hermon" EN 10GigE */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x6368, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT25408 "Hermon" EN 10GigE PCIe gen2 */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x6750, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT25458 ConnectX EN 10GBASE-T 10GigE */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x6372, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT25458 ConnectX EN 10GBASE-T+Gen2 10GigE */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x675a, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT26468 ConnectX EN 10GigE PCIe gen2*/
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x6764, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT26438 ConnectX EN 40GigE PCIe gen2 5GT/s */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x6746, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT26478 ConnectX2 40GigE PCIe gen2 */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x676e, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT25400 Family [ConnectX-2 Virtual Function] */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x1002, MLX4_PCI_DEV_IS_VF )},
+ /* MT27500 Family [ConnectX-3] */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x1003, 0 )},
+ /* MT27500 Family [ConnectX-3 Virtual Function] */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x1004, MLX4_PCI_DEV_IS_VF )},
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x1005, 0 )}, /* MT27510 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x1006, 0 )}, /* MT27511 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x1007, 0 )}, /* MT27520 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x1008, 0 )}, /* MT27521 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x1009, 0 )}, /* MT27530 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x100a, 0 )}, /* MT27531 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x100b, 0 )}, /* MT27540 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x100c, 0 )}, /* MT27541 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x100d, 0 )}, /* MT27550 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x100e, 0 )}, /* MT27551 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x100f, 0 )}, /* MT27560 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x1010, 0 )}, /* MT27561 Family */
+ { 0, }
+};
+
+#undef PCI_ENTRY
+
+#define PCI_ENTRY(a,b,c) .vendor_id = a, .device_id = b, .subsystem_vendor_id = PCI_ANY_ID, .subsystem_device_id = PCI_ANY_ID
+
+static const struct rte_pci_id rte_mlx4_pci_table[] = {
+ /* MT25408 "Hermon" SDR */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x6340, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT25408 "Hermon" DDR */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x634a, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT25408 "Hermon" QDR */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x6354, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT25408 "Hermon" DDR PCIe gen2 */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x6732, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT25408 "Hermon" QDR PCIe gen2 */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x673c, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT25408 "Hermon" EN 10GigE */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x6368, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT25408 "Hermon" EN 10GigE PCIe gen2 */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x6750, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT25458 ConnectX EN 10GBASE-T 10GigE */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x6372, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT25458 ConnectX EN 10GBASE-T+Gen2 10GigE */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x675a, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT26468 ConnectX EN 10GigE PCIe gen2*/
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x6764, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT26438 ConnectX EN 40GigE PCIe gen2 5GT/s */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x6746, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT26478 ConnectX2 40GigE PCIe gen2 */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x676e, MLX4_PCI_DEV_FORCE_SENSE_PORT )},
+ /* MT25400 Family [ConnectX-2 Virtual Function] */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x1002, MLX4_PCI_DEV_IS_VF )},
+ /* MT27500 Family [ConnectX-3] */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x1003, 0 )},
+ /* MT27500 Family [ConnectX-3 Virtual Function] */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x1004, MLX4_PCI_DEV_IS_VF )},
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x1005, 0 )}, /* MT27510 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x1006, 0 )}, /* MT27511 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x1007, 0 )}, /* MT27520 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x1008, 0 )}, /* MT27521 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x1009, 0 )}, /* MT27530 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x100a, 0 )}, /* MT27531 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x100b, 0 )}, /* MT27540 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x100c, 0 )}, /* MT27541 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x100d, 0 )}, /* MT27550 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x100e, 0 )}, /* MT27551 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x100f, 0 )}, /* MT27560 Family */
+ { PCI_ENTRY(PCI_VENDOR_ID_MELLANOX, 0x1010, 0 )}, /* MT27561 Family */
+ { 0, }
+};
+
+#undef PCI_ENTRY
+
+//MODULE_DEVICE_TABLE(pci, mlx4_pci_table);
+
+/* The kernel callback also takes a pci_channel_state_t state argument. */
+static int mlx4_pci_err_detected(struct rte_pci_device *pdev)
+{
+ struct mlx4_dev_persistent *persist = (struct mlx4_dev_persistent *)pdev->driver->priv;
+
+ mlx4_err(persist->dev, "mlx4_pci_err_detected was called\n");
+ mlx4_enter_error_state(persist);
+
+ mutex_lock(&persist->interface_state_mutex);
+ if (persist->interface_state & MLX4_INTERFACE_STATE_UP)
+ mlx4_unload_one(pdev);
+
+ mutex_unlock(&persist->interface_state_mutex);
+ //if (state == pci_channel_io_perm_failure)
+ // return PCI_ERS_RESULT_DISCONNECT;
+
+ //pci_disable_device(pdev);
+ //return PCI_ERS_RESULT_NEED_RESET;
+ return -1;
+}
+
+static int mlx4_pci_slot_reset(struct rte_pci_device *pdev)
+{
+ struct mlx4_dev_persistent *persist = (struct mlx4_dev_persistent *)pdev->driver->priv;
+ struct mlx4_dev *dev = persist->dev;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int ret = 0;
+ int nvfs[MLX4_MAX_PORTS + 1] = {0, 0, 0};
+ int total_vfs;
+
+ mlx4_err(dev, "mlx4_pci_slot_reset was called\n");
+ /*
+ ret = pci_enable_device(pdev);
+ if (ret) {
+ mlx4_err(dev, "Can not re-enable device, ret=%d\n", ret);
+ return PCI_ERS_RESULT_DISCONNECT;
+ }
+ */
+
+ //pci_set_master(pdev);
+ //pci_restore_state(pdev);
+ //pci_save_state(pdev);
+
+ total_vfs = dev->persist->num_vfs;
+ memcpy(nvfs, dev->persist->nvfs, sizeof(dev->persist->nvfs));
+
+ mutex_lock(&persist->interface_state_mutex);
+ if (!(persist->interface_state & MLX4_INTERFACE_STATE_UP)) {
+ ret = mlx4_load_one(pdev, priv->pci_dev_data, total_vfs, nvfs,
+ priv, 1);
+ if (ret) {
+ mlx4_err(dev, "%s: mlx4_load_one failed, ret=%d\n",
+ __func__, ret);
+ goto end;
+ }
+
+ ret = restore_current_port_types(dev, dev->persist->
+ curr_port_type, dev->persist->
+ curr_port_poss_type);
+ if (ret)
+ mlx4_err(dev, "could not restore original port types (%d)\n", ret);
+ }
+end:
+ mutex_unlock(&persist->interface_state_mutex);
+
+ return ret;
+}
+
+static void mlx4_shutdown(struct rte_pci_device *pdev)
+{
+ struct mlx4_dev_persistent *persist = (struct mlx4_dev_persistent *)pdev->driver->priv;
+
+ mlx4_info(persist->dev, "mlx4_shutdown was called\n");
+ mutex_lock(&persist->interface_state_mutex);
+ if (persist->interface_state & MLX4_INTERFACE_STATE_UP)
+ mlx4_unload_one(pdev);
+ mutex_unlock(&persist->interface_state_mutex);
+}
+#ifdef KMOD_DISABLED
+#ifdef CONFIG_COMPAT_IS_CONST_PCI_ERROR_HANDLERS
+static const struct pci_error_handlers mlx4_err_handler = {
+#else
+static struct pci_error_handlers mlx4_err_handler = {
+#endif
+ .error_detected = mlx4_pci_err_detected,
+ .slot_reset = mlx4_pci_slot_reset,
+};
+#endif
+
+static int mlx4_suspend(struct rte_pci_device *pdev)
+{
+ struct mlx4_dev_persistent *persist = (struct mlx4_dev_persistent *)pdev->driver->priv;
+ struct mlx4_dev *dev = persist->dev;
+
+ mlx4_err(dev, "suspend was called\n");
+ mutex_lock(&persist->interface_state_mutex);
+ if (persist->interface_state & MLX4_INTERFACE_STATE_UP)
+ mlx4_unload_one(pdev);
+ mutex_unlock(&persist->interface_state_mutex);
+
+ return 0;
+}
+
+static int mlx4_resume(struct rte_pci_device *pdev)
+{
+ int nvfs[MLX4_MAX_PORTS + 1] = {0, 0, 0};
+ int total_vfs;
+ struct mlx4_dev_persistent *persist = (struct mlx4_dev_persistent *)pdev->driver->priv;
+ struct mlx4_dev *dev = persist->dev;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int ret = 0;
+
+ mlx4_err(dev, "resume was called\n");
+ total_vfs = dev->persist->num_vfs;
+ memcpy(nvfs, dev->persist->nvfs, sizeof(dev->persist->nvfs));
+
+ mutex_lock(&persist->interface_state_mutex);
+ if (!(persist->interface_state & MLX4_INTERFACE_STATE_UP)) {
+ ret = mlx4_load_one(pdev, priv->pci_dev_data, total_vfs, nvfs, priv, 1);
+ if (!ret) {
+ ret = restore_current_port_types(dev, dev->persist->
+ curr_port_type, dev->persist->
+ curr_port_poss_type);
+ if (ret)
+ mlx4_err(dev, "resume: could not restore original port types (%d)\n", ret);
+ }
+ }
+ mutex_unlock(&persist->interface_state_mutex);
+
+ return ret;
+}
+
+#ifdef KMOD_DISABLED
+static struct pci_driver mlx4_driver = {
+ .name = DRV_NAME,
+ .id_table = mlx4_pci_table,
+ .probe = mlx4_init_one,
+ .shutdown = mlx4_shutdown,
+ .remove = mlx4_remove_one,
+ .suspend = mlx4_suspend,
+ .resume = mlx4_resume,
+ .err_handler = &mlx4_err_handler,
+};
+#endif
+
+static int __init mlx4_verify_params(void)
+{
+ int status;
+
+ status = update_defaults(&port_type_array);
+ if (status == INVALID_STR) {
+ if (mlx4_fill_dbdf2val_tbl(&port_type_array.dbdf2val))
+ return -1;
+ } else if (status == INVALID_DATA) {
+ return -1;
+ }
+
+ status = update_defaults(&num_vfs);
+ if (status == INVALID_STR) {
+ if (mlx4_fill_dbdf2val_tbl(&num_vfs.dbdf2val))
+ return -1;
+ } else if (status == INVALID_DATA) {
+ return -1;
+ }
+
+ status = update_defaults(&probe_vf);
+ if (status == INVALID_STR) {
+ if (mlx4_fill_dbdf2val_tbl(&probe_vf.dbdf2val))
+ return -1;
+ } else if (status == INVALID_DATA) {
+ return -1;
+ }
+
+ status = update_defaults(&roce_mode);
+ if (status == INVALID_STR) {
+ if (mlx4_fill_dbdf2val_tbl(&roce_mode.dbdf2val))
+ return -1;
+ } else if (status == INVALID_DATA) {
+ return -1;
+ }
+
+ status = update_defaults(&ud_gid_type);
+ if (status == INVALID_STR) {
+ if (mlx4_fill_dbdf2val_tbl(&ud_gid_type.dbdf2val))
+ return -1;
+ } else if (status == INVALID_DATA) {
+ return -1;
+ }
+
+ if (msi_x < 0) {
+ pr_warn("mlx4_core: bad msi_x: %d\n", msi_x);
+ return -1;
+ }
+
+ if ((log_num_mac < 0) || (log_num_mac > MLX4_MAX_LOG_NUM_MACS)) {
+ pr_warning("mlx4_core: bad log_num_mac: %d\n", log_num_mac);
+ return -1;
+ }
+
+ if (log_num_vlan != 0)
+ pr_warning("mlx4_core: log_num_vlan - obsolete module param, using %d\n",
+ MLX4_LOG_NUM_VLANS);
+
+ if ((log_mtts_per_seg < 0) || (log_mtts_per_seg > 7)) {
+ pr_warning("mlx4_core: bad log_mtts_per_seg: %d\n", log_mtts_per_seg);
+ return -1;
+ }
+
+ if (mlx4_log_num_mgm_entry_size < (int)(-MLX4_DMFS_PARAM_VALUES) ||
+ (mlx4_log_num_mgm_entry_size > 0 &&
+ (mlx4_log_num_mgm_entry_size < MLX4_MIN_MGM_LOG_ENTRY_SIZE ||
+ mlx4_log_num_mgm_entry_size > MLX4_MAX_MGM_LOG_ENTRY_SIZE))) {
+ pr_warning("mlx4_core: mlx4_log_num_mgm_entry_size (%d) not "
+ "in legal range -%d..0 or %d..%d\n",
+ mlx4_log_num_mgm_entry_size,
+ MLX4_DMFS_PARAM_VALUES,
+ MLX4_MIN_MGM_LOG_ENTRY_SIZE,
+ MLX4_MAX_MGM_LOG_ENTRY_SIZE);
+ return -1;
+ }
+ if (ingress_parser_mode < MLX4_INGRESS_PARSER_MODE_STANDARD ||
+ ingress_parser_mode >= MLX4_INGRESS_PARSER_MODE_MAX) {
+ pr_warn("mlx4_core: ingress_parser_mode (%d) not "
+ "in legal range %d..%d. "
+ "Changing to default\n",
+ ingress_parser_mode,
+ MLX4_INGRESS_PARSER_MODE_STANDARD,
+ MLX4_INGRESS_PARSER_MODE_MAX - 1);
+ ingress_parser_mode = MLX4_INGRESS_PARSER_MODE_STANDARD;
+ }
+
+ if (mlx4_log_num_mgm_entry_size < 0 &&
+ (!((-mlx4_log_num_mgm_entry_size) & MLX4_DMFS_ETH_ONLY)) &&
+ ((-mlx4_log_num_mgm_entry_size) & MLX4_DMFS_A0_STEERING)) {
+ pr_warn("mlx4_core: Can't support IPoIB flow steering along "
+ "with optimized steering\n");
+ return -1;
+ }
+
+ if (mod_param_profile.num_qp < 18 || mod_param_profile.num_qp > 23) {
+ pr_warning("mlx4_core: bad log_num_qp: %d\n",
+ mod_param_profile.num_qp);
+ return -1;
+ }
+
+ if (mod_param_profile.num_srq < 10) {
+ pr_warning("mlx4_core: too low log_num_srq: %d\n",
+ mod_param_profile.num_srq);
+ return -1;
+ }
+
+ if (mod_param_profile.num_cq < 10) {
+ pr_warning("mlx4_core: too low log_num_cq: %d\n",
+ mod_param_profile.num_cq);
+ return -1;
+ }
+
+ if (mod_param_profile.num_mpt < 10) {
+ pr_warning("mlx4_core: too low log_num_mpt: %d\n",
+ mod_param_profile.num_mpt);
+ return -1;
+ }
+
+ if (mod_param_profile.num_mtt &&
+ mod_param_profile.num_mtt < 15) {
+ pr_warning("mlx4_core: too low log_num_mtt: %d\n",
+ mod_param_profile.num_mtt);
+ return -1;
+ }
+
+ if (mod_param_profile.num_mtt > MLX4_MAX_LOG_NUM_MTT) {
+ pr_warning("mlx4_core: too high log_num_mtt: %d\n",
+ mod_param_profile.num_mtt);
+ return -1;
+ }
+ return 0;
+}
+
+static int mlx4_init_one_helper(struct rte_pci_driver *drv, struct rte_pci_device *dev)
+{
+ int k;
+ unsigned long param = 0;
+
+ if (rte_persistent_init() < 0)
+ return -1;
+
+ /* Look up the per-device flags that the kernel passes as driver_data. */
+ for (k = 0; k < ARRAY_LEN(mlx4_pci_table); k++) {
+ if (mlx4_pci_table[k].vendor == dev->id.vendor_id &&
+ mlx4_pci_table[k].device == dev->id.device_id) {
+ param = mlx4_pci_table[k].driver_data;
+ break;
+ }
+ }
+ return mlx4_init_one(dev, param);
+}
+
+static struct rte_pci_driver mlx4_core_driver = {
+ .name = "mlx4_core_uio",
+ .devinit = mlx4_init_one_helper,
+ .id_table = rte_mlx4_pci_table,
+ .drv_flags = RTE_PCI_DRV_NEED_MAPPING,
+};
+
+static int mlx4_init(const char *name, const char *args)
+{
+ //int ret;
+ //for debug
+ mlx4_debug_level = 1;
+ fast_drop = 1;
+
+ if (mlx4_verify_params())
+ return -EINVAL;
+
+
+ //mlx4_wq = create_singlethread_workqueue("mlx4");
+ //if (!mlx4_wq)
+ // return -ENOMEM;
+
+ //ret = pci_register_driver(&mlx4_driver);
+ rte_eal_pci_register(&mlx4_core_driver);
+ //if (ret < 0)
+ // destroy_workqueue(mlx4_wq);
+ //return ret < 0 ? ret : 0;
+ return 0;
+}
+
+
+static struct rte_driver mlx4_core = {
+ .type = PMD_PDEV,
+ .init = mlx4_init,
+};
+
+
+PMD_REGISTER_DRIVER(mlx4_core);
+
+#ifdef KMOD_DISABLED
+static void __exit mlx4_cleanup(void)
+{
+ pci_unregister_driver(&mlx4_driver);
+ destroy_workqueue(mlx4_wq);
+}
+#endif
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/main.c.orig b/drivers/net/mlnx_uio/mlnx/mlx4/main.c.orig
new file mode 100644
index 0000000..5bb9b82
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/main.c.orig
@@ -0,0 +1,5335 @@
+/*
+ * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved.
+ * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007, 2008 Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/errno.h>
+#include <linux/pci.h>
+#include <linux/dma-mapping.h>
+#include <linux/slab.h>
+#include <linux/io-mapping.h>
+#include <linux/delay.h>
+#include <linux/kmod.h>
+
+#include <linux/mlx4/device.h>
+#include <linux/mlx4/doorbell.h>
+
+#include "mlx4.h"
+#include "fw.h"
+#include "icm.h"
+#include "mlx4_stats.h"
+
+MODULE_AUTHOR("Roland Dreier");
+MODULE_DESCRIPTION("Mellanox ConnectX HCA low-level driver");
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_VERSION(DRV_VERSION);
+
+struct workqueue_struct *mlx4_wq;
+
+#ifdef CONFIG_MLX4_DEBUG
+
+int mlx4_debug_level = 0;
+module_param_named(debug_level, mlx4_debug_level, int, 0644);
+MODULE_PARM_DESC(debug_level, "Enable debug tracing if > 0");
+
+#endif /* CONFIG_MLX4_DEBUG */
+
+#ifdef CONFIG_PCI_MSI
+
+static int msi_x = 1;
+module_param(msi_x, int, 0444);
+MODULE_PARM_DESC(msi_x, "attempt to use MSI-X if nonzero");
+
+#else /* CONFIG_PCI_MSI */
+
+#define msi_x (0)
+
+#endif /* CONFIG_PCI_MSI */
+
+static int enable_sys_tune = 0;
+module_param(enable_sys_tune, int, 0444);
+MODULE_PARM_DESC(enable_sys_tune, "Tune the cpu's for better performance (default 0)");
+
+int mlx4_blck_lb = 1;
+module_param_named(block_loopback, mlx4_blck_lb, int, 0644);
+MODULE_PARM_DESC(block_loopback, "Block multicast loopback packets if > 0 "
+ "(default: 1)");
+
+#define MLX4_ROCE_1_5_DEF_PROTO 0xfe
+
+int mlx4_roce_proto_config = MLX4_ROCE_1_5_DEF_PROTO;
+module_param_named(rr_proto, mlx4_roce_proto_config, int, 0444);
+MODULE_PARM_DESC(rr_proto, "IP next protocol for RoCEv1.5 or destination port for RoCEv2. Setting 0 means using driver default values");
+
+int ingress_parser_mode = MLX4_INGRESS_PARSER_MODE_STANDARD;
+module_param(ingress_parser_mode, int, 0444);
+MODULE_PARM_DESC(ingress_parser_mode, "Mode of ingress parser for ConnectX3-Pro. 0 - standard. 1 - checksum for non TCP/UDP. (default: standard)");
+
+enum {
+ DEFAULT_DOMAIN = 0,
+ BDF_STR_SIZE = 8, /* bb:dd.f- */
+ DBDF_STR_SIZE = 13 /* mmmm:bb:dd.f- */
+};
+
+enum {
+ NUM_VFS,
+ PROBE_VF,
+ PORT_TYPE_ARRAY,
+ ROCE_MODE,
+ UD_GID_TYPE
+};
+
+enum {
+ VALID_DATA,
+ INVALID_DATA,
+ INVALID_STR
+};
+
+struct param_data {
+ int id;
+ struct mlx4_dbdf2val_lst dbdf2val;
+};
+
+static struct param_data roce_mode = {
+ .id = ROCE_MODE,
+ .dbdf2val = {
+ .name = "roce_mode param",
+ .num_vals = 1,
+ .def_val = {0},
+ .range = {0, 4},
+ .num_inval_vals = 0
+ }
+};
+module_param_string(roce_mode, roce_mode.dbdf2val.str,
+ sizeof(roce_mode.dbdf2val.str), 0444);
+MODULE_PARM_DESC(roce_mode,
+ "Set RoCE modes supported by the port\n"
+ "\tA single value (e.g. 0) to define uniform preferred RoCE_mode value for all devices\n"
+ "\t\tor a string to map device function numbers to their RoCE mode value (e.g. '0000:04:00.0-0,002b:1c:0b.a-0').\n"
+ "\t\tAllowed values are 0: RoCEv1 (default), 1: RoCEv1.5, 2: RoCEv2, 3: RoCEv1.5+2 and 4: RoCEv1+2)\n");
+
+static struct param_data ud_gid_type = {
+ .id = UD_GID_TYPE,
+ .dbdf2val = {
+ .name = "ud_gid_type param",
+ .num_vals = 1,
+ .def_val = {MLX4_ROCE_GID_TYPE_V1_5},
+ .range = {MLX4_ROCE_GID_TYPE_V1, MLX4_ROCE_GID_TYPE_V2},
+ .num_inval_vals = 0
+ }
+};
+module_param_string(ud_gid_type, ud_gid_type.dbdf2val.str,
+ sizeof(ud_gid_type.dbdf2val.str), 0444);
+MODULE_PARM_DESC(ud_gid_type,
+ "Set gid type for UD QPs\n"
+ "\tA single value (e.g. 1) to define uniform UD QP gid type for all devices\n"
+ "\t\tor a string to map device function numbers to their UD QP gid type (e.g. '0000:04:00.0-0,002b:1c:0b.a-1').\n"
+ "\t\tAllowed values are 0 for RoCEv1, 1 for RoCEv1.5 (default) and 2 for RoCEv2");
+
+static struct param_data num_vfs = {
+ .id = NUM_VFS,
+ .dbdf2val = {
+ .name = "num_vfs param",
+ .num_vals = 3,
+ .def_val = {0},
+ .range = {0, MLX4_MAX_NUM_VF},
+ .num_inval_vals = 0
+ }
+};
+module_param_string(num_vfs, num_vfs.dbdf2val.str,
+ sizeof(num_vfs.dbdf2val.str), 0444);
+MODULE_PARM_DESC(num_vfs,
+ "Either single value (e.g. '5') or triplet (e.g. '10,11,12') to define uniform num_vfs value for all devices functions.\n"
+ "\t\tIf a single value is given, this value will be used in order to define <num_vfs> dual ports virtual functions.\n"
+ "\t\tIf a triplet <a,b,c> is given, <a> single port virtual functions are defined on port1, <b> single port\n"
+ "\t\tvirtual functions are defined on port2 and <c> dual port virtual functions are defined.\n"
+ "\t\tAlternatively, a string to map device function numbers to their num_vfs values\n"
+ "\t\t (e.g. '0000:04:00.0-5,002b:1c:0b.a-15;2;4') could be given.\n"
+ "\t\tHexadecimal digits for the device function (e.g. 002b:1c:0b.a) and decimal or triplet for num_vfs value\n"
+ "\t\t(e.g. 15 or 1;2;3).");
+
+static struct param_data probe_vf = {
+ .id = PROBE_VF,
+ .dbdf2val = {
+ .name = "probe_vf param",
+ .num_vals = 3,
+ .def_val = {0},
+ .range = {0, MLX4_MAX_NUM_VF},
+ .num_inval_vals = 0
+ }
+};
+module_param_string(probe_vf, probe_vf.dbdf2val.str,
+ sizeof(probe_vf.dbdf2val.str), 0444);
+MODULE_PARM_DESC(probe_vf,
+ "Either a single value (e.g. '3') or a triplet (e.g. '1,2,3') to define a uniform number of VFs to probe by the PF\n"
+ "\t\tdriver for all device functions.\n"
+ "\t\tIf a single value is given, this value will be used in order to define <probe_vf> probed dual-port virtual\n"
+ "\t\tfunctions. If a triplet <a,b,c> is given, <a> single port virtual functions are probed on port1, <b> single port\n"
+ "\t\tvirtual functions are probed on port2 and <c> dual port virtual functions are probed.\n"
+ "\t\tAlternatively, a string to map device function numbers to their probe_vf values\n"
+ "\t\t(e.g. '0000:04:00.0-3,002b:1c:0b.a-13;12;11') could be given.\n"
+ "\t\tHexadecimal digits for the device function (e.g. 002b:1c:0b.a) and decimal for probe_vf value (e.g. 13 or 1;2;3).");
+
+#define MLX4_FORCE_DMFS_IF_NO_NCSI_FS (1U << 0)
+#define MLX4_DMFS_ETH_ONLY (1U << 1)
+#define MLX4_DMFS_A0_STEERING (1U << 2)
+#define MLX4_DISABLE_DMFS_LOW_QP_NUM (1U << 3)
+#define MLX4_IB_IGNORE_SIP_CHECK (1U << 4)
+#define MLX4_ETH_IGNORE_SIP_CHECK (1U << 5)
+#define MLX4_DMFS_PARAM_VALUES ((MLX4_ETH_IGNORE_SIP_CHECK << 1) - 1)
+
+int mlx4_log_num_mgm_entry_size = -(MLX4_DMFS_ETH_ONLY | MLX4_DISABLE_DMFS_LOW_QP_NUM);
+module_param_named(log_num_mgm_entry_size,
+ mlx4_log_num_mgm_entry_size, int, 0444);
+MODULE_PARM_DESC(log_num_mgm_entry_size, "log mgm size, that defines the num"
+ " of qp per mcg, for example:"
+ " 10 gives 248. Range: 7 <="
+ " log_num_mgm_entry_size <= 12 (default = -10).\n"
+ "\t\tTo activate one of the device managed"
+ " flow steering modes, set to a non-positive value (-x) and set bits in x:\n"
+ "\t\t0: Force DMFS, even at the expense of NCSI support\n"
+ "\t\t1: Disable IPoIB DMFS rules (if enabled performance might decrease. Can't be cleared if b3 is set)\n"
+ "\t\t2: Enable optimized steering (even if in limited L2 mode. Can't be set if b2 is cleared)\n"
+ "\t\t3: Disable DMFS if number of QPs per MCG is low\n"
+ "\t\t4: Optimize IPoIB/EoIB steering table for non source IP rules if possible\n"
+ "\t\t5: Optimize steering table for non source IP rules if possible");
+
+static int fast_drop;
+module_param_named(fast_drop, fast_drop, int, 0444);
+MODULE_PARM_DESC(fast_drop,
+ "Enable fast packet drop when no receive WQEs are posted");
+
+static bool enable_64b_cqe_eqe = true;
+module_param(enable_64b_cqe_eqe, bool, 0444);
+MODULE_PARM_DESC(enable_64b_cqe_eqe,
+ "Enable 64 byte CQEs/EQEs when the FW supports this (default: True)");
+
+#define PF_CONTEXT_BEHAVIOUR_MASK (MLX4_FUNC_CAP_64B_EQE_CQE | \
+ MLX4_FUNC_CAP_EQE_CQE_STRIDE | \
+ MLX4_FUNC_CAP_DMFS_A0_STATIC)
+
+#define RESET_PERSIST_MASK_FLAGS (MLX4_FLAG_SRIOV)
+
+static char mlx4_version[] =
+ DRV_NAME ": Mellanox ConnectX core driver v"
+ DRV_VERSION " (" DRV_RELDATE ")\n";
+
+static struct mlx4_profile low_mem_profile = {
+ .num_qp = 1 << 17,
+ .num_srq = 1 << 6,
+ .rdmarc_per_qp = 1 << 4,
+ .num_cq = 1 << 8,
+ .num_mcg = 1 << 8,
+ .num_mpt = 1 << 9,
+ .num_mtt = 1 << 7,
+};
+
+
+#define MLX4_MAX_LOG_NUM_MACS 7
+static int log_num_mac = MLX4_MAX_LOG_NUM_MACS;
+module_param_named(log_num_mac, log_num_mac, int, 0444);
+MODULE_PARM_DESC(log_num_mac, "Log2 max number of MACs per ETH port (1-7)");
+
+static int log_num_vlan;
+module_param_named(log_num_vlan, log_num_vlan, int, 0444);
+MODULE_PARM_DESC(log_num_vlan, "Log2 max number of VLANs per ETH port (0-7)");
+/* Log2 max number of VLANs per ETH port (0-7) */
+#define MLX4_LOG_NUM_VLANS 7
+#define MLX4_MIN_LOG_NUM_VLANS 0
+#define MLX4_MIN_LOG_NUM_MAC 1
+
+static bool use_prio;
+module_param_named(use_prio, use_prio, bool, 0444);
+MODULE_PARM_DESC(use_prio, "Enable steering by VLAN priority on ETH ports (deprecated)");
+
+int log_mtts_per_seg = ilog2(1);
+module_param_named(log_mtts_per_seg, log_mtts_per_seg, int, 0444);
+MODULE_PARM_DESC(log_mtts_per_seg, "Log2 number of MTT entries per segment (0-7) (default: 0)");
+
+static struct param_data port_type_array = {
+ .id = PORT_TYPE_ARRAY,
+ .dbdf2val = {
+ .name = "port_type_array param",
+ .num_vals = 2,
+ .def_val = {MLX4_PORT_TYPE_NONE, MLX4_PORT_TYPE_NONE},
+ .range = {MLX4_PORT_TYPE_IB, MLX4_PORT_TYPE_NA},
+ .num_inval_vals = 1,
+ .inval_val = {MLX4_PORT_TYPE_AUTO}
+ }
+};
+module_param_string(port_type_array, port_type_array.dbdf2val.str,
+ sizeof(port_type_array.dbdf2val.str), 0444);
+MODULE_PARM_DESC(port_type_array,
+ "Valid only if num_vfs is non-zero (SRIOV mode). Ignored otherwise.\n"
+ "\t\tEither a pair of values (e.g. '1,2') to define a uniform port1/port2 types configuration for all device functions\n"
+ "\t\tor a string to map device function numbers to their pair of port types values (e.g. '0000:04:00.0-1;2,002b:1c:0b.a-1;1').\n"
+ "\t\tValid port types: 1-ib, 2-eth, 4-N/A\n"
+ "\t\tIf only one port is available, use the N/A port type for port2 (e.g. '1,4').");
+
+
+struct mlx4_port_config {
+ struct list_head list;
+ enum mlx4_port_type port_type[MLX4_MAX_PORTS + 1];
+ struct pci_dev *pdev;
+};
+
+static atomic_t pf_loading = ATOMIC_INIT(0);
+
+#define MLX4_LOG_NUM_MTT 20
+/* We limit this to 30 because of a bitmap issue: the size is an int, not a uint.
+ See mlx4_buddy_init -> bitmap_zero, which takes an int.
+*/
+#define MLX4_MAX_LOG_NUM_MTT 30
+static struct mlx4_profile mod_param_profile = {
+ .num_qp = 19,
+ .num_srq = 16,
+ .rdmarc_per_qp = 4,
+ .num_cq = 16,
+ .num_mcg = 13,
+ .num_mpt = 19,
+ .num_mtt = 0, /* max(20, 2*MTTs for host memory) */
+};
+
+module_param_named(log_num_qp, mod_param_profile.num_qp, int, 0444);
+MODULE_PARM_DESC(log_num_qp, "log maximum number of QPs per HCA (default: 19)");
+
+module_param_named(log_num_srq, mod_param_profile.num_srq, int, 0444);
+MODULE_PARM_DESC(log_num_srq, "log maximum number of SRQs per HCA "
+ "(default: 16)");
+
+module_param_named(log_rdmarc_per_qp, mod_param_profile.rdmarc_per_qp, int,
+ 0444);
+MODULE_PARM_DESC(log_rdmarc_per_qp, "log number of RDMARC buffers per QP "
+ "(default: 4)");
+
+module_param_named(log_num_cq, mod_param_profile.num_cq, int, 0444);
+MODULE_PARM_DESC(log_num_cq, "log maximum number of CQs per HCA (default: 16)");
+
+module_param_named(log_num_mcg, mod_param_profile.num_mcg, int, 0444);
+MODULE_PARM_DESC(log_num_mcg, "log maximum number of multicast groups per HCA "
+ "(default: 13)");
+
+module_param_named(log_num_mpt, mod_param_profile.num_mpt, int, 0444);
+MODULE_PARM_DESC(log_num_mpt,
+ "log maximum number of memory protection table entries per "
+ "HCA (default: 19)");
+
+module_param_named(log_num_mtt, mod_param_profile.num_mtt, int, 0444);
+MODULE_PARM_DESC(log_num_mtt,
+ "log maximum number of memory translation table segments per "
+ "HCA (default: max(20, 2*MTTs needed to register all of the host memory), limited to 30)");
+
+static void process_mod_param_profile(struct mlx4_profile *profile)
+{
+ struct sysinfo si;
+
+ profile->num_qp = 1 << mod_param_profile.num_qp;
+ profile->num_srq = 1 << mod_param_profile.num_srq;
+ profile->rdmarc_per_qp = 1 << mod_param_profile.rdmarc_per_qp;
+ profile->num_cq = 1 << mod_param_profile.num_cq;
+ profile->num_mcg = 1 << mod_param_profile.num_mcg;
+ profile->num_mpt = 1 << mod_param_profile.num_mpt;
+ /* We want to scale the number of MTTs with the size of the
+ * system memory, since it makes sense to register a lot of
+ * memory on a system with a lot of memory. As a heuristic,
+ * make sure we have enough MTTs to register twice the system
+ * memory (with PAGE_SIZE entries).
+ *
+ * This number has to be a power of two and fit into 32 bits
+ * due to device limitations. We cap this at 2^30 because the bitmap
+ * code works with int instead of uint (mlx4_buddy_init -> bitmap_zero).
+ * That limits us to 4TB of memory registration per HCA with
+ * 4KB pages, which is probably OK for the next few months.
+ */
+ if (mod_param_profile.num_mtt)
+ profile->num_mtt = 1 << mod_param_profile.num_mtt;
+ else {
+ si_meminfo(&si);
+ profile->num_mtt =
+ roundup_pow_of_two(max_t(unsigned,
+ 1 << (MLX4_LOG_NUM_MTT - log_mtts_per_seg),
+ min(1UL << (MLX4_MAX_LOG_NUM_MTT - log_mtts_per_seg),
+ (si.totalram << 1) >> log_mtts_per_seg)));
+ /* set the actual value, so it will be reflected to the user
+ * using the sysfs
+ */
+ mod_param_profile.num_mtt = ilog2(profile->num_mtt);
+ }
+}
+
+enum {
+ MLX4_IF_STATE_BASIC,
+ MLX4_IF_STATE_EXTENDED
+};
+
+static inline u64 dbdf_to_u64(int domain, int bus, int dev, int fn)
+{
+ /* Widen before shifting so a 16-bit domain cannot overflow int */
+ return ((u64)domain << 20) | (bus << 12) | (dev << 4) | fn;
+}
+
+static inline void pr_bdf_err(const char *dbdf, const char *pname)
+{
+ pr_warn("mlx4_core: '%s' is not a valid bdf in '%s'\n", dbdf, pname);
+}
+
+static inline void pr_val_err(const char *dbdf, const char *pname,
+ const char *val)
+{
+ pr_warn("mlx4_core: value '%s' of bdf '%s' in '%s' is not valid\n"
+ , val, dbdf, pname);
+}
+
+static inline void pr_out_of_range_bdf(const char *dbdf, int val,
+ struct mlx4_dbdf2val_lst *dbdf2val)
+{
+ pr_warn("mlx4_core: value %d in bdf '%s' of '%s' is out of its valid range (%d,%d)\n"
+ , val, dbdf, dbdf2val->name , dbdf2val->range.min,
+ dbdf2val->range.max);
+}
+
+static inline void pr_out_of_range(struct mlx4_dbdf2val_lst *dbdf2val)
+{
+ pr_warn("mlx4_core: value of '%s' is out of its valid range (%d,%d)\n"
+ , dbdf2val->name , dbdf2val->range.min, dbdf2val->range.max);
+}
+
+static inline int is_valid_value(int val, struct mlx4_dbdf2val_lst *v)
+{
+ int i;
+
+ for (i = 0; i < v->num_inval_vals; i++) {
+ if (val == v->inval_val[i])
+ return 0;
+ }
+ return 1;
+}
+
+static inline void pr_invalid_value(int val, struct mlx4_dbdf2val_lst *dbdf2val)
+{
+ pr_warn("mlx4_core: value %d of '%s' is not allowed\n",
+ val, dbdf2val->name);
+}
+
+static inline int is_in_range(int val, struct mlx4_range *r)
+{
+ return (val >= r->min && val <= r->max);
+}
+
+static int parse_array(struct param_data *pdata, char *p, long *vals, u32 n)
+{
+ u32 iter = 0;
+
+ while (n != 0 && strlen(p)) {
+ char *t = strchr(p, ',');
+ int val_len = t - p;
+ char sval[32];
+ int ret;
+
+ /* Try to parse as last element */
+ if (!t && !kstrtol(p, 0, vals)) {
+ if (!is_in_range(*vals, &pdata->dbdf2val.range)) {
+ pr_out_of_range(&pdata->dbdf2val);
+ return -INVALID_DATA;
+ }
+ if (!is_valid_value(*vals, &pdata->dbdf2val)) {
+ pr_invalid_value(*vals, &pdata->dbdf2val);
+ return -INVALID_DATA;
+ }
+ return ++iter;
+ }
+
+ /* Use >= so the terminating NUL written below stays inside sval */
+ if (!t || t == p || val_len >= sizeof(sval))
+ return -INVALID_STR;
+
+ strncpy(sval, p, val_len);
+ sval[val_len] = 0;
+
+ ret = kstrtol(sval, 0, vals);
+
+ if (ret == -EINVAL)
+ return -INVALID_STR;
+ if (ret || !is_in_range(*vals, &pdata->dbdf2val.range)) {
+ pr_out_of_range(&pdata->dbdf2val);
+ return -INVALID_DATA;
+ }
+ if (!is_valid_value(*vals, &pdata->dbdf2val)) {
+ pr_invalid_value(*vals, &pdata->dbdf2val);
+ return -INVALID_DATA;
+ }
+
+ ++iter;
+ ++vals;
+ p += val_len + 1;
+ if (n > 0)
+ n--;
+ }
+
+ return -INVALID_STR;
+}
+
+#define ARRAY_LEN(arr) (sizeof((arr))/sizeof((arr)[0]))
+static int parse_mod_param(struct param_data *pdata)
+{
+ int i;
+ int ret = 0;
+ long port_array[ARRAY_LEN(pdata->dbdf2val.tbl[0].val)];
+ char *p = pdata->dbdf2val.str;
+
+ ret = parse_array(pdata, p, port_array,
+ pdata->dbdf2val.num_vals);
+ if (ret > pdata->dbdf2val.num_vals || ret <= 0)
+ return ret < 0 ? -ret : INVALID_STR;
+ for (i = 0; i < ret; i++)
+ pdata->dbdf2val.tbl[0].val[i] = port_array[i];
+ pdata->dbdf2val.tbl[0].argc = i;
+ return 0;
+}
+
+static int update_defaults(struct param_data *pdata)
+{
+ int ret;
+ char *p = pdata->dbdf2val.str;
+
+ if (!strlen(p) || strchr(p, ':') || strchr(p, '.') || strchr(p, ';'))
+ return INVALID_STR;
+
+ switch (pdata->id) {
+ case UD_GID_TYPE:
+ case ROCE_MODE:
+ case PORT_TYPE_ARRAY:
+ case NUM_VFS:
+ case PROBE_VF:
+ ret = parse_mod_param(pdata);
+ if (ret)
+ return ret;
+ break;
+ default:
+ return INVALID_DATA;
+ }
+ pdata->dbdf2val.tbl[1].dbdf = MLX4_ENDOF_TBL;
+
+ return VALID_DATA;
+}
+
+int mlx4_fill_dbdf2val_tbl(struct mlx4_dbdf2val_lst *dbdf2val_lst)
+{
+ int domain, bus, dev, fn;
+ u64 dbdf;
+ char *p, *t, *v;
+ char tmp[32];
+ char sbdf[32];
+ char sep = ',';
+ int j, k, str_size, i = 1;
+ int prfx_size;
+
+ p = dbdf2val_lst->str;
+
+ for (j = 0; j < dbdf2val_lst->num_vals; j++)
+ dbdf2val_lst->tbl[0].val[j] = dbdf2val_lst->def_val[j];
+ dbdf2val_lst->tbl[0].argc = 0;
+ dbdf2val_lst->tbl[1].dbdf = MLX4_ENDOF_TBL;
+
+ str_size = strlen(dbdf2val_lst->str);
+
+ if (str_size == 0)
+ return 0;
+
+ while (strlen(p)) {
+ prfx_size = BDF_STR_SIZE;
+ sbdf[prfx_size] = 0;
+ strncpy(sbdf, p, prfx_size);
+ domain = DEFAULT_DOMAIN;
+ if (sscanf(sbdf, "%02x:%02x.%x-", &bus, &dev, &fn) != 3) {
+ prfx_size = DBDF_STR_SIZE;
+ sbdf[prfx_size] = 0;
+ strncpy(sbdf, p, prfx_size);
+ if (sscanf(sbdf, "%04x:%02x:%02x.%x-", &domain, &bus,
+ &dev, &fn) != 4) {
+ pr_bdf_err(sbdf, dbdf2val_lst->name);
+ goto err;
+ }
+ sprintf(tmp, "%04x:%02x:%02x.%x-", domain, bus, dev,
+ fn);
+ } else {
+ sprintf(tmp, "%02x:%02x.%x-", bus, dev, fn);
+ }
+
+ if (strnicmp(sbdf, tmp, sizeof(tmp))) {
+ pr_bdf_err(sbdf, dbdf2val_lst->name);
+ goto err;
+ }
+
+ dbdf = dbdf_to_u64(domain, bus, dev, fn);
+
+ for (j = 1; j < i; j++)
+ if (dbdf2val_lst->tbl[j].dbdf == dbdf) {
+ pr_warn("mlx4_core: in '%s', %s appears multiple times\n"
+ , dbdf2val_lst->name, sbdf);
+ goto err;
+ }
+
+ if (i >= MLX4_DEVS_TBL_SIZE) {
+ pr_warn("mlx4_core: Too many devices in '%s'\n"
+ , dbdf2val_lst->name);
+ goto err;
+ }
+
+ p += prfx_size;
+ t = strchr(p, sep);
+ t = t ? t : p + strlen(p);
+ if (p >= t) {
+ pr_val_err(sbdf, dbdf2val_lst->name, "");
+ goto err;
+ }
+
+ for (k = 0; k < dbdf2val_lst->num_vals; k++) {
+ char sval[32];
+ long int val;
+ int ret, val_len;
+ char vsep = ';';
+ int last_occurence = 0;
+
+ v = (k == dbdf2val_lst->num_vals - 1) ? t : strchr(p, vsep);
+ if (NULL == v) {
+ v = t;
+ last_occurence = 1;
+ }
+ /* Use >= so the terminating NUL written below stays inside sval */
+ if (!v || v > t || v == p || (v - p) >= sizeof(sval)) {
+ pr_val_err(sbdf, dbdf2val_lst->name, p);
+ goto err;
+ }
+ val_len = v - p;
+ strncpy(sval, p, val_len);
+ sval[val_len] = 0;
+
+ ret = kstrtol(sval, 0, &val);
+ if (ret) {
+ if (strchr(p, vsep))
+ pr_warn("mlx4_core: too many vals in bdf '%s' of '%s'\n"
+ , sbdf, dbdf2val_lst->name);
+ else
+ pr_val_err(sbdf, dbdf2val_lst->name,
+ sval);
+ goto err;
+ }
+ if (!is_in_range(val, &dbdf2val_lst->range)) {
+ pr_out_of_range_bdf(sbdf, val, dbdf2val_lst);
+ goto err;
+ }
+
+ dbdf2val_lst->tbl[i].val[k] = val;
+ dbdf2val_lst->tbl[i].argc = k + 1;
+ p = v;
+ if (p[0] == vsep)
+ p++;
+ if (last_occurence)
+ break;
+ }
+
+ dbdf2val_lst->tbl[i].dbdf = dbdf;
+ if (strlen(p)) {
+ if (p[0] != sep) {
+ pr_warn("mlx4_core: expect separator '%c' before '%s' in '%s'\n"
+ , sep, p, dbdf2val_lst->name);
+ goto err;
+ }
+ p++;
+ }
+ i++;
+ if (i < MLX4_DEVS_TBL_SIZE)
+ dbdf2val_lst->tbl[i].dbdf = MLX4_ENDOF_TBL;
+ }
+
+ return 0;
+
+err:
+ dbdf2val_lst->tbl[1].dbdf = MLX4_ENDOF_TBL;
+ pr_warn("mlx4_core: The value of '%s' is incorrect. The value is discarded!\n"
+ , dbdf2val_lst->name);
+
+ return -EINVAL;
+}
+EXPORT_SYMBOL(mlx4_fill_dbdf2val_tbl);
+
+int mlx4_get_val(struct mlx4_dbdf2val *tbl, struct pci_dev *pdev, int idx,
+ int *val)
+{
+ u64 dbdf;
+ int i = 1;
+
+ *val = tbl[0].val[idx];
+ if (!pdev)
+ return -EINVAL;
+
+ if (!pdev->bus) {
+ pr_debug("mlx4_core: pci_dev without valid bus number\n");
+ return -EINVAL;
+ }
+
+ dbdf = dbdf_to_u64(pci_domain_nr(pdev->bus), pdev->bus->number,
+ PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
+
+ while ((i < MLX4_DEVS_TBL_SIZE) && (tbl[i].dbdf != MLX4_ENDOF_TBL)) {
+ if (tbl[i].dbdf == dbdf) {
+ if (idx < tbl[i].argc) {
+ *val = tbl[i].val[idx];
+ return 0;
+ } else {
+ return -EINVAL;
+ }
+ }
+ i++;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL(mlx4_get_val);
+
+
+int mlx4_get_argc(struct mlx4_dbdf2val *tbl, struct pci_dev *pdev)
+{
+ u64 dbdf;
+ int i = 1;
+
+ if (!pdev)
+ return -EINVAL;
+
+ if (!pdev->bus) {
+ pr_debug("mlx4_core: pci_dev without valid bus number\n");
+ return -EINVAL;
+ }
+
+ dbdf = dbdf_to_u64(pci_domain_nr(pdev->bus), pdev->bus->number,
+ PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
+
+ while ((i < MLX4_DEVS_TBL_SIZE) && (tbl[i].dbdf != MLX4_ENDOF_TBL)) {
+ if (tbl[i].dbdf == dbdf)
+ return tbl[i].argc;
+ i++;
+ }
+
+ return tbl[0].argc;
+}
+EXPORT_SYMBOL(mlx4_get_argc);
+
+int mlx4_check_port_params(struct mlx4_dev *dev,
+ enum mlx4_port_type *port_type)
+{
+ int i;
+
+ if (!(dev->caps.flags & MLX4_DEV_CAP_FLAG_DPDP)) {
+ for (i = 0; i < dev->caps.num_ports - 1; i++) {
+ if (port_type[i] != port_type[i + 1]) {
+ mlx4_err(dev, "Only same port types supported on this HCA, aborting\n");
+ return -EINVAL;
+ }
+ }
+ }
+
+ for (i = 0; i < dev->caps.num_ports; i++) {
+ if (!(port_type[i] & dev->caps.supported_type[i+1])) {
+ mlx4_err(dev, "Requested port type for port %d is not supported on this HCA\n",
+ i + 1);
+ return -EINVAL;
+ }
+ }
+ return 0;
+}
+
+static void mlx4_set_port_mask(struct mlx4_dev *dev)
+{
+ int i;
+
+ for (i = 1; i <= dev->caps.num_ports; ++i)
+ dev->caps.port_mask[i] = dev->caps.port_type[i];
+}
+
+enum {
+ MLX4_QUERY_FUNC_NUM_SYS_EQS = 1 << 0,
+};
+
+static int mlx4_query_func(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
+{
+ int err = 0;
+ struct mlx4_func func;
+
+ if (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_SYS_EQS) {
+ err = mlx4_QUERY_FUNC(dev, &func, 0);
+ if (err) {
+ mlx4_err(dev, "QUERY_FUNC command failed, aborting.\n");
+ return err;
+ }
+ dev_cap->max_eqs = func.max_eq;
+ dev_cap->reserved_eqs = func.rsvd_eqs;
+ dev_cap->reserved_uars = func.rsvd_uars;
+ err |= MLX4_QUERY_FUNC_NUM_SYS_EQS;
+ }
+ return err;
+}
+
+static void mlx4_enable_cqe_eqe_stride(struct mlx4_dev *dev)
+{
+ struct mlx4_caps *dev_cap = &dev->caps;
+
+ /* FW not supporting or cancelled by user */
+ if (!(dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_EQE_STRIDE) ||
+ !(dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_CQE_STRIDE))
+ return;
+
+ /* Must have 64B CQE_EQE enabled by FW to use bigger stride
+ * When FW has NCSI it may decide not to report 64B CQE/EQEs
+ */
+ if (!(dev_cap->flags & MLX4_DEV_CAP_FLAG_64B_EQE) ||
+ !(dev_cap->flags & MLX4_DEV_CAP_FLAG_64B_CQE)) {
+ dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_CQE_STRIDE;
+ dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_EQE_STRIDE;
+ return;
+ }
+
+ if (cache_line_size() == 128 || cache_line_size() == 256) {
+ mlx4_dbg(dev, "Enabling CQE stride, cacheLine size supported\n");
+ /* Changing the real data inside CQE size to 32B */
+ dev_cap->flags &= ~MLX4_DEV_CAP_FLAG_64B_CQE;
+ dev_cap->flags &= ~MLX4_DEV_CAP_FLAG_64B_EQE;
+
+ if (mlx4_is_master(dev))
+ dev_cap->function_caps |= MLX4_FUNC_CAP_EQE_CQE_STRIDE;
+ } else {
+ if (cache_line_size() != 32 && cache_line_size() != 64)
+ mlx4_dbg(dev, "Disabling CQE stride, cacheLine size unsupported\n");
+ dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_CQE_STRIDE;
+ dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_EQE_STRIDE;
+ }
+}
+
+static int _mlx4_dev_port(struct mlx4_dev *dev, int port,
+ struct mlx4_port_cap *port_cap)
+{
+ dev->caps.vl_cap[port] = port_cap->max_vl;
+ dev->caps.ib_mtu_cap[port] = port_cap->ib_mtu;
+ dev->phys_caps.gid_phys_table_len[port] = port_cap->max_gids;
+ dev->phys_caps.pkey_phys_table_len[port] = port_cap->max_pkeys;
+ /* set gid and pkey table operating lengths by default
+ * to non-sriov values
+ */
+ dev->caps.gid_table_len[port] = port_cap->max_gids;
+ dev->caps.pkey_table_len[port] = port_cap->max_pkeys;
+ dev->caps.port_width_cap[port] = port_cap->max_port_width;
+ dev->caps.eth_mtu_cap[port] = port_cap->eth_mtu;
+ dev->caps.def_mac[port] = port_cap->def_mac;
+ dev->caps.supported_type[port] = port_cap->supported_port_types;
+ dev->caps.suggested_type[port] = port_cap->suggested_type;
+ dev->caps.default_sense[port] = port_cap->default_sense;
+ dev->caps.trans_type[port] = port_cap->trans_type;
+ dev->caps.vendor_oui[port] = port_cap->vendor_oui;
+ dev->caps.wavelength[port] = port_cap->wavelength;
+ dev->caps.trans_code[port] = port_cap->trans_code;
+
+ return 0;
+}
+
+static int mlx4_dev_port(struct mlx4_dev *dev, int port,
+ struct mlx4_port_cap *port_cap)
+{
+ int err = 0;
+
+ err = mlx4_QUERY_PORT(dev, port, port_cap);
+
+ if (err)
+ mlx4_err(dev, "QUERY_PORT command failed.\n");
+
+ return err;
+}
+
+static inline void mlx4_enable_ignore_fcs(struct mlx4_dev *dev)
+{
+ if (!(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_IGNORE_FCS))
+ return;
+
+ if (mlx4_is_mfunc(dev)) {
+ mlx4_dbg(dev, "SRIOV mode - Disabling Ignore FCS\n");
+ dev->caps.flags2 &= ~MLX4_DEV_CAP_FLAG2_IGNORE_FCS;
+ return;
+ }
+
+ if (!(dev->caps.flags & MLX4_DEV_CAP_FLAG_FCS_KEEP)) {
+ mlx4_dbg(dev,
+ "Keep FCS is not supported - Disabling Ignore FCS\n");
+ dev->caps.flags2 &= ~MLX4_DEV_CAP_FLAG2_IGNORE_FCS;
+ return;
+ }
+}
+
+#define MLX4_A0_STEERING_TABLE_SIZE 256
+static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
+{
+ int err;
+ int i;
+
+ err = mlx4_QUERY_DEV_CAP(dev, dev_cap);
+ if (err) {
+ mlx4_err(dev, "QUERY_DEV_CAP command failed, aborting\n");
+ return err;
+ }
+
+ if ((ingress_parser_mode != MLX4_INGRESS_PARSER_MODE_STANDARD) &&
+ (dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_MODIFY_PARSER)) {
+ dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_VXLAN_OFFLOADS;
+ dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_ROCEV2;
+ }
+
+ mlx4_dev_cap_dump(dev, dev_cap);
+
+ if (dev_cap->min_page_sz > PAGE_SIZE) {
+ mlx4_err(dev, "HCA minimum page size of %d bigger than kernel PAGE_SIZE of %ld, aborting\n",
+ dev_cap->min_page_sz, PAGE_SIZE);
+ return -ENODEV;
+ }
+ if (dev_cap->num_ports > MLX4_MAX_PORTS) {
+ mlx4_err(dev, "HCA has %d ports, but we only support %d, aborting\n",
+ dev_cap->num_ports, MLX4_MAX_PORTS);
+ return -ENODEV;
+ }
+
+ if (dev_cap->uar_size > pci_resource_len(dev->persist->pdev, 2)) {
+ mlx4_err(dev, "HCA reported UAR size of 0x%x bigger than PCI resource 2 size of 0x%llx, aborting\n",
+ dev_cap->uar_size,
+ (unsigned long long)
+ pci_resource_len(dev->persist->pdev, 2));
+ return -ENODEV;
+ }
+ if (dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2)
+ dev->caps.roce_addr_support = 1;
+
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ if ((dev_cap->bmme_flags & MLX4_BMME_FLAG_WQE_FORMAT))
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_WQE_FORMAT;
+#endif
+ dev->caps.num_ports = dev_cap->num_ports;
+ dev->caps.num_sys_eqs = dev_cap->num_sys_eqs;
+ dev->phys_caps.num_phys_eqs = dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_SYS_EQS ?
+ dev->caps.num_sys_eqs :
+ MLX4_MAX_EQ_NUM;
+ for (i = 1; i <= dev->caps.num_ports; ++i) {
+ err = _mlx4_dev_port(dev, i, dev_cap->port_cap + i);
+ if (err) {
+ mlx4_err(dev, "QUERY_PORT command failed, aborting\n");
+ return err;
+ }
+ }
+
+ dev->caps.uar_page_size = PAGE_SIZE;
+ dev->caps.num_uars = dev_cap->uar_size / PAGE_SIZE;
+ dev->caps.local_ca_ack_delay = dev_cap->local_ca_ack_delay;
+ dev->caps.bf_reg_size = dev_cap->bf_reg_size;
+ dev->caps.bf_regs_per_page = dev_cap->bf_regs_per_page;
+ dev->caps.max_sq_sg = dev_cap->max_sq_sg;
+ dev->caps.max_rq_sg = dev_cap->max_rq_sg;
+ dev->caps.max_wqes = dev_cap->max_qp_sz;
+ dev->caps.max_qp_init_rdma = dev_cap->max_requester_per_qp;
+ dev->caps.max_srq_wqes = dev_cap->max_srq_sz;
+ dev->caps.max_srq_sge = dev_cap->max_rq_sg - 1;
+ dev->caps.reserved_srqs = dev_cap->reserved_srqs;
+ dev->caps.max_sq_desc_sz = dev_cap->max_sq_desc_sz;
+ dev->caps.max_rq_desc_sz = dev_cap->max_rq_desc_sz;
+ /*
+ * Subtract 1 from the limit because we need to allocate a
+ * spare CQE so the HCA HW can tell the difference between an
+ * empty CQ and a full CQ.
+ */
+ dev->caps.max_cqes = dev_cap->max_cq_sz - 1;
+ dev->caps.reserved_cqs = dev_cap->reserved_cqs;
+ dev->caps.reserved_eqs = dev_cap->reserved_eqs;
+ dev->caps.reserved_mtts = dev_cap->reserved_mtts;
+ dev->caps.reserved_mrws = dev_cap->reserved_mrws;
+
+ /* The first 128 UARs are used for EQ doorbells */
+ dev->caps.reserved_uars = max_t(int, 128, dev_cap->reserved_uars);
+ dev->caps.reserved_pds = dev_cap->reserved_pds;
+ dev->caps.reserved_xrcds = (dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC) ?
+ dev_cap->reserved_xrcds : 0;
+ dev->caps.max_xrcds = (dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC) ?
+ dev_cap->max_xrcds : 0;
+ dev->caps.mtt_entry_sz = dev_cap->mtt_entry_sz;
+
+ dev->caps.max_msg_sz = dev_cap->max_msg_sz;
+ dev->caps.page_size_cap = ~(u32) (dev_cap->min_page_sz - 1);
+ dev->caps.flags = dev_cap->flags;
+ dev->caps.flags2 = dev_cap->flags2;
+ dev->caps.bmme_flags = dev_cap->bmme_flags;
+ dev->caps.reserved_lkey = dev_cap->reserved_lkey;
+ dev->caps.stat_rate_support = dev_cap->stat_rate_support;
+ dev->caps.max_gso_sz = dev_cap->max_gso_sz;
+ dev->caps.max_rss_tbl_sz = dev_cap->max_rss_tbl_sz;
+ dev->caps.cq_overrun = dev_cap->cq_overrun;
+
+ /* Sense port always allowed on supported devices for ConnectX-1 and -2 */
+ if (mlx4_priv(dev)->pci_dev_data & MLX4_PCI_DEV_FORCE_SENSE_PORT)
+ dev->caps.flags |= MLX4_DEV_CAP_FLAG_SENSE_SUPPORT;
+ /* Don't do sense port on multifunction devices (for now at least) */
+ if (mlx4_is_mfunc(dev))
+ dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_SENSE_SUPPORT;
+
+ if (mlx4_low_memory_profile()) {
+ dev->caps.log_num_macs = MLX4_MIN_LOG_NUM_MAC;
+ dev->caps.log_num_vlans = MLX4_MIN_LOG_NUM_VLANS;
+ } else {
+ dev->caps.log_num_macs = log_num_mac;
+ dev->caps.log_num_vlans = MLX4_LOG_NUM_VLANS;
+ }
+
+ dev->caps.fast_drop = fast_drop ?
+ !!(dev->caps.flags & MLX4_DEV_CAP_FLAG_FAST_DROP) :
+ 0;
+
+ for (i = 1; i <= dev->caps.num_ports; ++i) {
+ dev->caps.port_type[i] = MLX4_PORT_TYPE_NONE;
+ if (dev->caps.supported_type[i]) {
+ /* if only ETH is supported - assign ETH */
+ if (dev->caps.supported_type[i] == MLX4_PORT_TYPE_ETH)
+ dev->caps.port_type[i] = MLX4_PORT_TYPE_ETH;
+ /* if only IB is supported, assign IB */
+ else if (dev->caps.supported_type[i] ==
+ MLX4_PORT_TYPE_IB)
+ dev->caps.port_type[i] = MLX4_PORT_TYPE_IB;
+ else {
+ /*
+ * if IB and ETH are supported, we set the port
+ * type according to user selection of port type;
+ * if there is no user selection, take the FW hint
+ */
+ int pta;
+ mlx4_get_val(port_type_array.dbdf2val.tbl,
+ pci_physfn(dev->persist->pdev), i - 1,
+ &pta);
+ if (pta == MLX4_PORT_TYPE_NONE) {
+ dev->caps.port_type[i] = dev->caps.suggested_type[i] ?
+ MLX4_PORT_TYPE_ETH : MLX4_PORT_TYPE_IB;
+ } else if (pta == MLX4_PORT_TYPE_NA) {
+ mlx4_err(dev, "Port %d is a valid port. "
+ "Configuring its type as N/A(%d) is not allowed\n",
+ i, MLX4_PORT_TYPE_NA);
+ return -EINVAL;
+ } else {
+ dev->caps.port_type[i] = pta;
+ }
+ }
+ }
+ /*
+ * Link sensing is allowed on the port if 3 conditions are true:
+ * 1. Both protocols are supported on the port.
+ * 2. The device supports different types on its ports (DPDP).
+ * 3. FW declared that it supports link sensing.
+ */
+ mlx4_priv(dev)->sense.sense_allowed[i] =
+ ((dev->caps.supported_type[i] == MLX4_PORT_TYPE_AUTO) &&
+ (dev->caps.flags & MLX4_DEV_CAP_FLAG_DPDP) &&
+ (dev->caps.flags & MLX4_DEV_CAP_FLAG_SENSE_SUPPORT));
+
+ /*
+ * If "default_sense" bit is set, we move the port to "AUTO" mode
+ * and perform sense_port FW command to try and set the correct
+ * port type from beginning
+ */
+ if (mlx4_priv(dev)->sense.sense_allowed[i] && dev->caps.default_sense[i]) {
+ enum mlx4_port_type sensed_port = MLX4_PORT_TYPE_NONE;
+ dev->caps.possible_type[i] = MLX4_PORT_TYPE_AUTO;
+ mlx4_SENSE_PORT(dev, i, &sensed_port);
+ if (sensed_port != MLX4_PORT_TYPE_NONE)
+ dev->caps.port_type[i] = sensed_port;
+ } else {
+ dev->caps.possible_type[i] = dev->caps.port_type[i];
+ }
+
+ if (dev->caps.log_num_macs > dev_cap->port_cap[i].log_max_macs) {
+ dev->caps.log_num_macs = dev_cap->port_cap[i].log_max_macs;
+ mlx4_warn(dev, "Requested number of MACs is too much for port %d, reducing to %d\n",
+ i, 1 << dev->caps.log_num_macs);
+ }
+ if (dev->caps.log_num_vlans > dev_cap->port_cap[i].log_max_vlans) {
+ dev->caps.log_num_vlans = dev_cap->port_cap[i].log_max_vlans;
+ mlx4_warn(dev, "Requested number of VLANs is too much for port %d, reducing to %d\n",
+ i, 1 << dev->caps.log_num_vlans);
+ }
+ }
+
+ dev->caps.max_basic_counters = dev_cap->max_basic_counters;
+ dev->caps.max_extended_counters = dev_cap->max_extended_counters;
+ /* support extended counters if available */
+ if (dev->caps.flags & MLX4_DEV_CAP_FLAG_COUNTERS_EXT)
+ dev->caps.max_counters = dev->caps.max_extended_counters;
+ else
+ dev->caps.max_counters = dev->caps.max_basic_counters;
+
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW] = dev_cap->reserved_qps;
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_ETH_ADDR] =
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FC_ADDR] =
+ (1 << dev->caps.log_num_macs) *
+ (1 << dev->caps.log_num_vlans) *
+ dev->caps.num_ports;
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FC_EXCH] = MLX4_NUM_FEXCH;
+
+ if (dev_cap->dmfs_high_rate_qpn_base > 0 &&
+ dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_FS_EN)
+ dev->caps.dmfs_high_rate_qpn_base = dev_cap->dmfs_high_rate_qpn_base;
+ else
+ dev->caps.dmfs_high_rate_qpn_base =
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW];
+
+ if (dev_cap->dmfs_high_rate_qpn_range > 0 &&
+ dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_FS_EN) {
+ dev->caps.dmfs_high_rate_qpn_range = dev_cap->dmfs_high_rate_qpn_range;
+ dev->caps.dmfs_high_steer_mode = MLX4_STEERING_DMFS_A0_DEFAULT;
+ dev->caps.flags2 |= MLX4_DEV_CAP_FLAG2_FS_A0;
+ } else {
+ dev->caps.dmfs_high_steer_mode = MLX4_STEERING_DMFS_A0_NOT_SUPPORTED;
+ dev->caps.dmfs_high_rate_qpn_base =
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW];
+ dev->caps.dmfs_high_rate_qpn_range = MLX4_A0_STEERING_TABLE_SIZE;
+ }
+
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_RSS_RAW_ETH] =
+ dev->caps.dmfs_high_rate_qpn_range;
+
+ dev->caps.reserved_qps = dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW] +
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_ETH_ADDR] +
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FC_ADDR] +
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FC_EXCH];
+
+ dev->caps.sync_qp = dev_cap->sync_qp;
+ if (dev->persist->pdev->device == 0x1003 || dev->caps.cq_overrun)
+ dev->caps.cq_flags |= MLX4_DEV_CAP_CQ_FLAG_IO;
+
+ dev->caps.sqp_demux = (mlx4_is_master(dev)) ? MLX4_MAX_NUM_SLAVES : 0;
+
+ if (!enable_64b_cqe_eqe && !mlx4_is_slave(dev)) {
+ if (dev_cap->flags &
+ (MLX4_DEV_CAP_FLAG_64B_CQE | MLX4_DEV_CAP_FLAG_64B_EQE)) {
+ mlx4_warn(dev, "64B EQEs/CQEs supported by the device but not enabled\n");
+ dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_64B_CQE;
+ dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_64B_EQE;
+ }
+
+ if (dev_cap->flags2 &
+ (MLX4_DEV_CAP_FLAG2_CQE_STRIDE |
+ MLX4_DEV_CAP_FLAG2_EQE_STRIDE)) {
+ mlx4_warn(dev, "Disabling EQE/CQE stride per user request\n");
+ dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_CQE_STRIDE;
+ dev_cap->flags2 &= ~MLX4_DEV_CAP_FLAG2_EQE_STRIDE;
+ }
+ }
+
+ if ((dev->caps.flags &
+ (MLX4_DEV_CAP_FLAG_64B_CQE | MLX4_DEV_CAP_FLAG_64B_EQE)) &&
+ mlx4_is_master(dev))
+ dev->caps.function_caps |= MLX4_FUNC_CAP_64B_EQE_CQE;
+
+ if (!mlx4_is_slave(dev)) {
+ for (i = 0; i < dev->caps.num_ports; ++i)
+ dev->caps.def_counter_index[i] = i << 1;
+ mlx4_enable_cqe_eqe_stride(dev);
+ dev->caps.alloc_res_qp_mask =
+ (dev->caps.bf_reg_size ? MLX4_RESERVE_ETH_BF_QP : 0) |
+ MLX4_RESERVE_A0_QP;
+
+ if (!(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ETS_CFG) &&
+ dev->caps.flags & MLX4_DEV_CAP_FLAG_SET_ETH_SCHED) {
+ mlx4_warn(dev, "Old device ETS support detected\n");
+ mlx4_warn(dev, "Consider upgrading device FW.\n");
+ dev->caps.flags2 |= MLX4_DEV_CAP_FLAG2_ETS_CFG;
+ }
+
+ } else {
+ dev->caps.alloc_res_qp_mask = 0;
+ }
+
+ mlx4_enable_ignore_fcs(dev);
+
+ return 0;
+}
+
+static int mlx4_get_pcie_dev_link_caps(struct mlx4_dev *dev,
+ enum pci_bus_speed *speed,
+ enum pcie_link_width *width)
+{
+ u32 lnkcap1, lnkcap2;
+ int err1, err2;
+
+#define PCIE_MLW_CAP_SHIFT 4 /* start of MLW mask in link capabilities */
+
+ *speed = PCI_SPEED_UNKNOWN;
+ *width = PCIE_LNK_WIDTH_UNKNOWN;
+
+ err1 = pcie_capability_read_dword(dev->persist->pdev, PCI_EXP_LNKCAP,
+ &lnkcap1);
+ err2 = pcie_capability_read_dword(dev->persist->pdev, PCI_EXP_LNKCAP2,
+ &lnkcap2);
+ if (!err2 && lnkcap2) { /* PCIe r3.0-compliant */
+ if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_8_0GB)
+ *speed = PCIE_SPEED_8_0GT;
+ else if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_5_0GB)
+ *speed = PCIE_SPEED_5_0GT;
+ else if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_2_5GB)
+ *speed = PCIE_SPEED_2_5GT;
+ }
+ if (!err1) {
+ *width = (lnkcap1 & PCI_EXP_LNKCAP_MLW) >> PCIE_MLW_CAP_SHIFT;
+ if (!lnkcap2) { /* pre-r3.0 */
+ if (lnkcap1 & PCI_EXP_LNKCAP_SLS_5_0GB)
+ *speed = PCIE_SPEED_5_0GT;
+ else if (lnkcap1 & PCI_EXP_LNKCAP_SLS_2_5GB)
+ *speed = PCIE_SPEED_2_5GT;
+ }
+ }
+
+ if (*speed == PCI_SPEED_UNKNOWN || *width == PCIE_LNK_WIDTH_UNKNOWN) {
+ return err1 ? err1 :
+ err2 ? err2 : -EINVAL;
+ }
+ return 0;
+}
+
+static void mlx4_check_pcie_caps(struct mlx4_dev *dev)
+{
+ enum pcie_link_width width, width_cap;
+ enum pci_bus_speed speed, speed_cap;
+ int err;
+
+#define PCIE_SPEED_STR(speed) \
+ (speed == PCIE_SPEED_8_0GT ? "8.0GT/s" : \
+ speed == PCIE_SPEED_5_0GT ? "5.0GT/s" : \
+ speed == PCIE_SPEED_2_5GT ? "2.5GT/s" : \
+ "Unknown")
+
+ err = mlx4_get_pcie_dev_link_caps(dev, &speed_cap, &width_cap);
+ if (err) {
+ mlx4_warn(dev,
+ "Unable to determine PCIe device BW capabilities\n");
+ return;
+ }
+
+ err = pcie_get_minimum_link(dev->persist->pdev, &speed, &width);
+ if (err || speed == PCI_SPEED_UNKNOWN ||
+ width == PCIE_LNK_WIDTH_UNKNOWN) {
+ mlx4_warn(dev,
+ "Unable to determine PCI device chain minimum BW\n");
+ return;
+ }
+
+ if (width != width_cap || speed != speed_cap)
+ mlx4_warn(dev,
+ "PCIe BW is different than device's capability\n");
+
+ mlx4_info(dev, "PCIe link speed is %s, device supports %s\n",
+ PCIE_SPEED_STR(speed), PCIE_SPEED_STR(speed_cap));
+ mlx4_info(dev, "PCIe link width is x%d, device supports x%d\n",
+ width, width_cap);
+ return;
+}
+
+/* Count the VFs that are still alive and return their number. */
+static int mlx4_how_many_lives_vf(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_slave_state *s_state;
+ int i;
+ int ret = 0;
+
+ for (i = 1/*the ppf is 0*/; i < dev->num_slaves; ++i) {
+ s_state = &priv->mfunc.master.slave_state[i];
+ if (s_state->active && s_state->last_cmd !=
+ MLX4_COMM_CMD_RESET) {
+ mlx4_warn(dev, "%s: slave: %d is still active\n",
+ __func__, i);
+ ret++;
+ }
+ }
+ return ret;
+}
+
+int mlx4_get_parav_qkey(struct mlx4_dev *dev, u32 qpn, u32 *qkey)
+{
+ u32 qk = MLX4_RESERVED_QKEY_BASE;
+
+ if (qpn >= dev->phys_caps.base_tunnel_sqpn + 8 * MLX4_MFUNC_MAX ||
+ qpn < dev->phys_caps.base_proxy_sqpn)
+ return -EINVAL;
+
+ if (qpn >= dev->phys_caps.base_tunnel_sqpn)
+ /* tunnel qp */
+ qk += qpn - dev->phys_caps.base_tunnel_sqpn;
+ else
+ qk += qpn - dev->phys_caps.base_proxy_sqpn;
+ *qkey = qk;
+ return 0;
+}
+EXPORT_SYMBOL(mlx4_get_parav_qkey);
+
+void mlx4_sync_pkey_table(struct mlx4_dev *dev, int slave, int port, int i, int val)
+{
+ struct mlx4_priv *priv = container_of(dev, struct mlx4_priv, dev);
+
+ if (!mlx4_is_master(dev))
+ return;
+
+ priv->virt2phys_pkey[slave][port - 1][i] = val;
+}
+EXPORT_SYMBOL(mlx4_sync_pkey_table);
+
+void mlx4_put_slave_node_guid(struct mlx4_dev *dev, int slave, __be64 guid)
+{
+ struct mlx4_priv *priv = container_of(dev, struct mlx4_priv, dev);
+
+ if (!mlx4_is_master(dev))
+ return;
+
+ priv->slave_node_guids[slave] = guid;
+}
+EXPORT_SYMBOL(mlx4_put_slave_node_guid);
+
+__be64 mlx4_get_slave_node_guid(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_priv *priv = container_of(dev, struct mlx4_priv, dev);
+
+ if (!mlx4_is_master(dev))
+ return 0;
+
+ return priv->slave_node_guids[slave];
+}
+EXPORT_SYMBOL(mlx4_get_slave_node_guid);
+
+int mlx4_is_slave_active(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_slave_state *s_slave;
+
+ if (!mlx4_is_master(dev))
+ return 0;
+
+ s_slave = &priv->mfunc.master.slave_state[slave];
+ return !!s_slave->active;
+}
+EXPORT_SYMBOL(mlx4_is_slave_active);
+
+static void slave_adjust_steering_mode(struct mlx4_dev *dev,
+ struct mlx4_dev_cap *dev_cap,
+ struct mlx4_init_hca_param *hca_param)
+{
+ dev->caps.steering_mode = hca_param->steering_mode;
+ dev->caps.steering_attr = hca_param->steering_attr;
+ if (dev->caps.steering_mode == MLX4_STEERING_MODE_DEVICE_MANAGED) {
+ dev->caps.num_qp_per_mgm = dev_cap->fs_max_num_qp_per_entry;
+ dev->caps.fs_log_max_ucast_qp_range_size =
+ dev_cap->fs_log_max_ucast_qp_range_size;
+ } else
+ dev->caps.num_qp_per_mgm =
+ 4 * ((1 << hca_param->log_mc_entry_sz)/16 - 2);
+
+ mlx4_dbg(dev, "Steering mode is: %s\n",
+ mlx4_steering_mode_str(dev->caps.steering_mode));
+}
+
+static void mlx4_slave_destroy_special_qp_cap(struct mlx4_dev *dev)
+{
+ kfree(dev->caps.qp0_qkey);
+ kfree(dev->caps.qp0_tunnel);
+ kfree(dev->caps.qp0_proxy);
+ kfree(dev->caps.qp1_tunnel);
+ kfree(dev->caps.qp1_proxy);
+ dev->caps.qp0_qkey = NULL;
+ dev->caps.qp0_tunnel = NULL;
+ dev->caps.qp0_proxy = NULL;
+ dev->caps.qp1_tunnel = NULL;
+ dev->caps.qp1_proxy = NULL;
+}
+
+static int mlx4_slave_special_qp_cap(struct mlx4_dev *dev)
+{
+ struct mlx4_func_cap *func_cap = NULL;
+ int i, err;
+
+ func_cap = kzalloc(sizeof(*func_cap), GFP_KERNEL);
+ dev->caps.qp0_qkey = kcalloc(dev->caps.num_ports,
+ sizeof(u32), GFP_KERNEL);
+ dev->caps.qp0_tunnel = kcalloc(dev->caps.num_ports,
+ sizeof(u32), GFP_KERNEL);
+ dev->caps.qp0_proxy = kcalloc(dev->caps.num_ports,
+ sizeof(u32), GFP_KERNEL);
+ dev->caps.qp1_tunnel = kcalloc(dev->caps.num_ports,
+ sizeof(u32), GFP_KERNEL);
+ dev->caps.qp1_proxy = kcalloc(dev->caps.num_ports,
+ sizeof(u32), GFP_KERNEL);
+
+ if (!dev->caps.qp0_tunnel || !dev->caps.qp0_proxy ||
+ !dev->caps.qp1_tunnel || !dev->caps.qp1_proxy ||
+ !dev->caps.qp0_qkey || !func_cap) {
+ mlx4_err(dev, "Failed to allocate memory for special qps cap\n");
+ err = -ENOMEM;
+ goto err_mem;
+ }
+
+ for (i = 1; i <= dev->caps.num_ports; ++i) {
+ err = mlx4_QUERY_FUNC_CAP(dev, i, func_cap);
+ if (err) {
+ mlx4_err(dev, "QUERY_FUNC_CAP port command failed for port %d, aborting (%d)\n",
+ i, err);
+ goto err_mem;
+ }
+ dev->caps.qp0_qkey[i - 1] = func_cap->qp0_qkey;
+ dev->caps.qp0_tunnel[i - 1] = func_cap->qp0_tunnel_qpn;
+ dev->caps.qp0_proxy[i - 1] = func_cap->qp0_proxy_qpn;
+ dev->caps.qp1_tunnel[i - 1] = func_cap->qp1_tunnel_qpn;
+ dev->caps.qp1_proxy[i - 1] = func_cap->qp1_proxy_qpn;
+ dev->caps.def_counter_index[i - 1] = func_cap->def_counter_index;
+ dev->caps.port_mask[i] = dev->caps.port_type[i];
+ dev->caps.phys_port_id[i] = func_cap->phys_port_id;
+ err = mlx4_get_slave_pkey_gid_tbl_len(dev, i,
+ &dev->caps.gid_table_len[i],
+ &dev->caps.pkey_table_len[i]);
+ if (err) {
+ mlx4_err(dev, "QUERY_PORT command failed for port %d, aborting (%d)\n",
+ i, err);
+ goto err_mem;
+ }
+ }
+
+ kfree(func_cap);
+ return 0;
+
+err_mem:
+ kfree(func_cap);
+ mlx4_slave_destroy_special_qp_cap(dev);
+
+ return err;
+}
+
+int mlx4_verify_supported_gid_type(struct mlx4_dev *dev, enum mlx4_roce_gid_type gid_type,
+ enum mlx4_roce_gid_type *alt_type)
+{
+ static const int supported_gid_types[][2] = {
+ [MLX4_ROCE_MODE_1] = {MLX4_ROCE_GID_TYPE_V1, -1},
+ [MLX4_ROCE_MODE_1_5] = {MLX4_ROCE_GID_TYPE_V1_5, -1},
+ [MLX4_ROCE_MODE_2] = {MLX4_ROCE_GID_TYPE_V2, -1},
+ [MLX4_ROCE_MODE_1_5_PLUS_2] = {MLX4_ROCE_GID_TYPE_V1_5, MLX4_ROCE_GID_TYPE_V2},
+ [MLX4_ROCE_MODE_1_PLUS_2] = {MLX4_ROCE_GID_TYPE_V1, MLX4_ROCE_GID_TYPE_V2}
+ };
+ enum mlx4_roce_mode roce_mode = dev->caps.roce_mode;
+ int i;
+ if (roce_mode == MLX4_ROCE_MODE_INVALID) {
+ if (alt_type)
+ *alt_type = MLX4_ROCE_GID_TYPE_INVALID;
+ return -EINVAL;
+ }
+
+ for (i = 0; i < ARRAY_SIZE(supported_gid_types[roce_mode]) &&
+ gid_type != supported_gid_types[roce_mode][i]; i++)
+ ;
+
+ if (i == ARRAY_SIZE(supported_gid_types[roce_mode])) {
+ if (alt_type)
+ *alt_type = supported_gid_types[roce_mode][0];
+ return -EINVAL;
+ }
+ return 0;
+}
+
+static void choose_roce_mode(struct mlx4_dev *dev,
+ struct mlx4_dev_cap *dev_cap)
+{
+ int req_roce_mode;
+ enum mlx4_roce_mode def_roce_mode;
+ int req_ud_gid_type;
+ enum mlx4_roce_gid_type alt_gid_type;
+
+ def_roce_mode = (dev_cap->flags & MLX4_DEV_CAP_FLAG_IBOE) ?
+ MLX4_ROCE_MODE_1 : MLX4_ROCE_MODE_INVALID;
+
+ mlx4_get_val(roce_mode.dbdf2val.tbl,
+ pci_physfn(dev->persist->pdev), 0, &req_roce_mode);
+ switch (req_roce_mode) {
+ case MLX4_ROCE_MODE_1:
+ req_roce_mode = def_roce_mode;
+ break;
+ case MLX4_ROCE_MODE_1_5:
+ if (!(dev_cap->flags & MLX4_DEV_CAP_FLAG_R_ROCE) &&
+ !(dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_ROCEV2))
+ req_roce_mode = def_roce_mode;
+ break;
+ case MLX4_ROCE_MODE_2:
+ if (!(dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_ROCEV2))
+ req_roce_mode = def_roce_mode;
+ break;
+ case MLX4_ROCE_MODE_1_5_PLUS_2:
+ if (!(dev_cap->flags & MLX4_DEV_CAP_FLAG_R_ROCE) ||
+ !dev->caps.roce_addr_support)
+ req_roce_mode = def_roce_mode;
+ break;
+ case MLX4_ROCE_MODE_1_PLUS_2:
+ if (!(dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_ROCE_V1_V2))
+ req_roce_mode = def_roce_mode;
+ break;
+ default:
+ req_roce_mode = def_roce_mode;
+ }
+ dev->caps.roce_mode = req_roce_mode;
+ pr_info("mlx4_core: device is working in RoCE mode: %s\n",
+ mlx4_roce_mode_to_str(dev->caps.roce_mode));
+
+ mlx4_get_val(ud_gid_type.dbdf2val.tbl, pci_physfn(dev->persist->pdev), 0, &req_ud_gid_type);
+
+ if (mlx4_verify_supported_gid_type(dev, req_ud_gid_type, &alt_gid_type)) {
+ pr_warn("mlx4_core: gid_type %d for UD QPs is not supported by the device, "
+ "gid_type %d was chosen instead\n", req_ud_gid_type, alt_gid_type);
+ req_ud_gid_type = alt_gid_type;
+ }
+ dev->caps.ud_gid_type = req_ud_gid_type;
+ pr_info("mlx4_core: UD QP Gid type is: %s\n",
+ mlx4_roce_gid_type_to_str(dev->caps.ud_gid_type));
+ dev->caps.rr_proto = mlx4_roce_proto_config;
+}
+
+static int mlx4_slave_cap(struct mlx4_dev *dev)
+{
+ int err;
+ u32 page_size;
+ struct mlx4_dev_cap *dev_cap = NULL;
+ struct mlx4_func_cap *func_cap = NULL;
+ struct mlx4_init_hca_param *hca_param = NULL;
+
+ hca_param = kzalloc(sizeof(*hca_param), GFP_KERNEL);
+ func_cap = kzalloc(sizeof(*func_cap), GFP_KERNEL);
+ dev_cap = kzalloc(sizeof(*dev_cap), GFP_KERNEL);
+ if (!hca_param || !func_cap || !dev_cap) {
+ mlx4_err(dev, "Failed to allocate memory for slave_cap\n");
+ err = -ENOMEM;
+ goto free_mem;
+ }
+
+ err = mlx4_QUERY_HCA(dev, hca_param);
+ if (err) {
+ mlx4_err(dev, "QUERY_HCA command failed, aborting\n");
+ goto free_mem;
+ }
+
+ /* Fail if the HCA reports an unknown global capability;
+ * at this time global_caps should always be zeroed.
+ */
+ if (hca_param->global_caps) {
+ mlx4_err(dev, "Unknown hca global capabilities\n");
+ err = -ENOSYS;
+ goto free_mem;
+ }
+
+ dev->caps.hca_core_clock = hca_param->hca_core_clock;
+
+ dev->caps.max_qp_dest_rdma = 1 << hca_param->log_rd_per_qp;
+ err = mlx4_dev_cap(dev, dev_cap);
+ if (err) {
+ mlx4_err(dev, "QUERY_DEV_CAP command failed, aborting\n");
+ goto free_mem;
+ }
+
+ err = mlx4_QUERY_FW(dev);
+ if (err)
+ mlx4_err(dev, "QUERY_FW command failed: could not get FW version\n");
+
+ page_size = ~dev->caps.page_size_cap + 1;
+ mlx4_warn(dev, "HCA minimum page size:%d\n", page_size);
+ if (page_size > PAGE_SIZE) {
+ mlx4_err(dev, "HCA minimum page size of %d bigger than kernel PAGE_SIZE of %ld, aborting\n",
+ page_size, PAGE_SIZE);
+ err = -ENODEV;
+ goto free_mem;
+ }
+
+ /* slave gets uar page size from QUERY_HCA fw command */
+ dev->caps.uar_page_size = 1 << (hca_param->uar_page_sz + 12);
+
+ /* TODO: relax this assumption */
+ if (dev->caps.uar_page_size != PAGE_SIZE) {
+ mlx4_err(dev, "UAR size:%d != kernel PAGE_SIZE of %ld\n",
+ dev->caps.uar_page_size, PAGE_SIZE);
+ err = -ENODEV;
+ goto free_mem;
+ }
+
+ err = mlx4_QUERY_FUNC_CAP(dev, 0, func_cap);
+ if (err) {
+ mlx4_err(dev, "QUERY_FUNC_CAP general command failed, aborting (%d)\n",
+ err);
+ goto free_mem;
+ }
+
+ if ((func_cap->pf_context_behaviour | PF_CONTEXT_BEHAVIOUR_MASK) !=
+ PF_CONTEXT_BEHAVIOUR_MASK) {
+ mlx4_err(dev, "Unknown pf context behaviour %x known flags %x\n",
+ func_cap->pf_context_behaviour,
+ PF_CONTEXT_BEHAVIOUR_MASK);
+ err = -ENOSYS;
+ goto free_mem;
+ }
+
+ dev->caps.num_ports = func_cap->num_ports;
+ dev->quotas.qp = func_cap->qp_quota;
+ dev->quotas.srq = func_cap->srq_quota;
+ dev->quotas.cq = func_cap->cq_quota;
+ dev->quotas.mpt = func_cap->mpt_quota;
+ dev->quotas.mtt = func_cap->mtt_quota;
+ dev->caps.num_qps = 1 << hca_param->log_num_qps;
+ dev->caps.num_srqs = 1 << hca_param->log_num_srqs;
+ dev->caps.num_cqs = 1 << hca_param->log_num_cqs;
+ dev->caps.num_mpts = 1 << hca_param->log_mpt_sz;
+ dev->caps.num_eqs = func_cap->max_eq;
+ dev->caps.reserved_eqs = func_cap->reserved_eq;
+ dev->caps.reserved_lkey = func_cap->reserved_lkey;
+ dev->caps.num_pds = MLX4_NUM_PDS;
+ dev->caps.num_mgms = 0;
+ dev->caps.num_amgms = 0;
+
+ if (dev->caps.num_ports > MLX4_MAX_PORTS) {
+ mlx4_err(dev, "HCA has %d ports, but we only support %d, aborting\n",
+ dev->caps.num_ports, MLX4_MAX_PORTS);
+ err = -ENODEV;
+ goto free_mem;
+ }
+
+ err = mlx4_slave_special_qp_cap(dev);
+ if (err) {
+ mlx4_err(dev, "Set special QP caps failed, aborting\n");
+ goto free_mem;
+ }
+
+ if (dev->caps.uar_page_size * (dev->caps.num_uars -
+ dev->caps.reserved_uars) >
+ pci_resource_len(dev->persist->pdev,
+ 2)) {
+ mlx4_err(dev, "HCA reported UAR region size of 0x%x bigger than PCI resource 2 size of 0x%llx, aborting\n",
+ dev->caps.uar_page_size * dev->caps.num_uars,
+ (unsigned long long)
+ pci_resource_len(dev->persist->pdev, 2));
+ err = -ENOMEM;
+ goto err_mem;
+ }
+
+ if (hca_param->dev_cap_enabled & MLX4_DEV_CAP_64B_EQE_ENABLED) {
+ dev->caps.eqe_size = 64;
+ dev->caps.eqe_factor = 1;
+ } else {
+ dev->caps.eqe_size = 32;
+ dev->caps.eqe_factor = 0;
+ }
+
+ if (hca_param->dev_cap_enabled & MLX4_DEV_CAP_64B_CQE_ENABLED) {
+ dev->caps.cqe_size = 64;
+ dev->caps.userspace_caps |= MLX4_USER_DEV_CAP_LARGE_CQE;
+ } else {
+ dev->caps.cqe_size = 32;
+ }
+
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ if (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_WQE_FORMAT)
+ dev->caps.userspace_caps |= MLX4_USER_DEV_CAP_WQE_FORMAT;
+#endif
+ if (hca_param->dev_cap_enabled & MLX4_DEV_CAP_EQE_STRIDE_ENABLED) {
+ dev->caps.eqe_size = hca_param->eqe_size;
+ dev->caps.eqe_factor = 0;
+ }
+
+ if (hca_param->dev_cap_enabled & MLX4_DEV_CAP_CQE_STRIDE_ENABLED) {
+ dev->caps.cqe_size = hca_param->cqe_size;
+ /* User still needs to know when CQE > 32B */
+ dev->caps.userspace_caps |= MLX4_USER_DEV_CAP_LARGE_CQE;
+ }
+
+ dev->caps.flags2 &= ~MLX4_DEV_CAP_FLAG2_TS;
+ mlx4_warn(dev, "Timestamping is not supported in slave mode\n");
+
+ slave_adjust_steering_mode(dev, dev_cap, hca_param);
+ mlx4_dbg(dev, "RSS support for IP fragments is %s\n",
+ hca_param->rss_ip_frags ? "on" : "off");
+
+ if (func_cap->extra_flags & MLX4_QUERY_FUNC_FLAGS_BF_RES_QP &&
+ dev->caps.bf_reg_size)
+ dev->caps.alloc_res_qp_mask |= MLX4_RESERVE_ETH_BF_QP;
+
+ if (func_cap->extra_flags & MLX4_QUERY_FUNC_FLAGS_A0_RES_QP)
+ dev->caps.alloc_res_qp_mask |= MLX4_RESERVE_A0_QP;
+
+ if (func_cap->extra_flags & MLX4_QUERY_FUNC_FLAGS_ROCE_ADDR)
+ dev->caps.roce_addr_support = 1;
+
+ choose_roce_mode(dev, dev_cap);
+
+err_mem:
+ if (err)
+ mlx4_slave_destroy_special_qp_cap(dev);
+free_mem:
+ kfree(hca_param);
+ kfree(func_cap);
+ kfree(dev_cap);
+ return err;
+}
+
+static void mlx4_request_modules(struct mlx4_dev *dev)
+{
+ int port;
+ int has_ib_port = false;
+ int has_eth_port = false;
+#define EN_DRV_NAME "mlx4_en"
+#define IB_DRV_NAME "mlx4_ib"
+
+ for (port = 1; port <= dev->caps.num_ports; port++) {
+ if (dev->caps.port_type[port] == MLX4_PORT_TYPE_IB)
+ has_ib_port = true;
+ else if (dev->caps.port_type[port] == MLX4_PORT_TYPE_ETH)
+ has_eth_port = true;
+ }
+
+ if (has_eth_port)
+ request_module_nowait(EN_DRV_NAME);
+ if (has_ib_port || (dev->caps.flags & MLX4_DEV_CAP_FLAG_IBOE))
+ request_module_nowait(IB_DRV_NAME);
+}
+
+/*
+ * Change the port configuration of the device.
+ * Every user of this function must hold the port mutex.
+ */
+int mlx4_change_port_types(struct mlx4_dev *dev,
+ enum mlx4_port_type *port_types)
+{
+ int err = 0;
+ int change = 0;
+ int port;
+
+ for (port = 0; port < dev->caps.num_ports; port++) {
+ /* Change the port type only if the new type is different
+ * from the current, and not set to Auto */
+ if (port_types[port] != dev->caps.port_type[port + 1])
+ change = 1;
+ }
+ if (change) {
+ mlx4_unregister_device(dev);
+ for (port = 1; port <= dev->caps.num_ports; port++) {
+ mlx4_CLOSE_PORT(dev, port);
+ dev->caps.port_type[port] = port_types[port - 1];
+ err = mlx4_SET_PORT(dev, port, -1);
+ if (err) {
+ mlx4_err(dev, "Failed to set port %d, aborting\n",
+ port);
+ goto out;
+ }
+ }
+ mlx4_set_port_mask(dev);
+ err = mlx4_register_device(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to register device\n");
+ goto out;
+ }
+ mlx4_request_modules(dev);
+ }
+
+out:
+ return err;
+}
+
+static ssize_t show_port_type(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct mlx4_port_info *info = container_of(attr, struct mlx4_port_info,
+ port_attr);
+ struct mlx4_dev *mdev = info->dev;
+ char type[8];
+
+ sprintf(type, "%s",
+ (mdev->caps.port_type[info->port] == MLX4_PORT_TYPE_IB) ?
+ "ib" : "eth");
+ if (mdev->caps.possible_type[info->port] == MLX4_PORT_TYPE_AUTO)
+ sprintf(buf, "auto (%s)\n", type);
+ else
+ sprintf(buf, "%s\n", type);
+
+ return strlen(buf);
+}
+
+static ssize_t set_port_type(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct mlx4_port_info *info = container_of(attr, struct mlx4_port_info,
+ port_attr);
+ struct mlx4_dev *mdev = info->dev;
+ struct mlx4_priv *priv = mlx4_priv(mdev);
+ enum mlx4_port_type types[MLX4_MAX_PORTS];
+ enum mlx4_port_type new_types[MLX4_MAX_PORTS];
+ static DEFINE_MUTEX(set_port_type_mutex);
+ int i;
+ int err = 0;
+
+ mutex_lock(&set_port_type_mutex);
+
+ if (!strcmp(buf, "ib\n"))
+ info->tmp_type = MLX4_PORT_TYPE_IB;
+ else if (!strcmp(buf, "eth\n"))
+ info->tmp_type = MLX4_PORT_TYPE_ETH;
+ else if (!strcmp(buf, "auto\n"))
+ info->tmp_type = MLX4_PORT_TYPE_AUTO;
+ else {
+ mlx4_err(mdev, "%s is not a supported port type\n", buf);
+ err = -EINVAL;
+ goto err_out;
+ }
+
+ if ((info->tmp_type & mdev->caps.supported_type[info->port]) !=
+ info->tmp_type) {
+ mlx4_err(mdev,
+ "Requested port type for port %d is not supported on this HCA\n",
+ info->port);
+ err = -EINVAL;
+ goto err_out;
+ }
+
+ mlx4_stop_sense(mdev);
+ mutex_lock(&priv->port_mutex);
+ /* Possible type is always the one that was delivered */
+ mdev->caps.possible_type[info->port] = info->tmp_type;
+
+ for (i = 0; i < mdev->caps.num_ports; i++) {
+ types[i] = priv->port[i+1].tmp_type ? priv->port[i+1].tmp_type :
+ mdev->caps.possible_type[i+1];
+ if (types[i] == MLX4_PORT_TYPE_AUTO)
+ types[i] = mdev->caps.port_type[i+1];
+ }
+
+ if (!(mdev->caps.flags & MLX4_DEV_CAP_FLAG_DPDP) &&
+ !(mdev->caps.flags & MLX4_DEV_CAP_FLAG_SENSE_SUPPORT)) {
+ for (i = 1; i <= mdev->caps.num_ports; i++) {
+ if (mdev->caps.possible_type[i] == MLX4_PORT_TYPE_AUTO) {
+ mdev->caps.possible_type[i] = mdev->caps.port_type[i];
+ err = -EINVAL;
+ }
+ }
+ }
+ if (err) {
+ mlx4_err(mdev, "Auto sensing is not supported on this HCA. Set only 'eth' or 'ib' for both ports (should be the same)\n");
+ goto out;
+ }
+
+ mlx4_do_sense_ports(mdev, new_types, types);
+
+ err = mlx4_check_port_params(mdev, new_types);
+ if (err)
+ goto out;
+
+ /* We are about to apply the changes after the configuration
+ * was verified, no need to remember the temporary types
+ * any more */
+ for (i = 0; i < mdev->caps.num_ports; i++)
+ priv->port[i + 1].tmp_type = 0;
+
+ err = mlx4_change_port_types(mdev, new_types);
+
+out:
+ mlx4_start_sense(mdev);
+ mutex_unlock(&priv->port_mutex);
+err_out:
+ mutex_unlock(&set_port_type_mutex);
+
+ return err ? err : count;
+}
+
+enum ibta_mtu {
+ IB_MTU_256 = 1,
+ IB_MTU_512 = 2,
+ IB_MTU_1024 = 3,
+ IB_MTU_2048 = 4,
+ IB_MTU_4096 = 5
+};
+
+static inline int int_to_ibta_mtu(int mtu)
+{
+ switch (mtu) {
+ case 256: return IB_MTU_256;
+ case 512: return IB_MTU_512;
+ case 1024: return IB_MTU_1024;
+ case 2048: return IB_MTU_2048;
+ case 4096: return IB_MTU_4096;
+ default: return -1;
+ }
+}
+
+static inline int ibta_mtu_to_int(enum ibta_mtu mtu)
+{
+ switch (mtu) {
+ case IB_MTU_256: return 256;
+ case IB_MTU_512: return 512;
+ case IB_MTU_1024: return 1024;
+ case IB_MTU_2048: return 2048;
+ case IB_MTU_4096: return 4096;
+ default: return -1;
+ }
+}
+
+static ssize_t show_port_ib_mtu(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct mlx4_port_info *info = container_of(attr, struct mlx4_port_info,
+ port_mtu_attr);
+ struct mlx4_dev *mdev = info->dev;
+
+ if (mdev->caps.port_type[info->port] == MLX4_PORT_TYPE_ETH)
+ mlx4_warn(mdev, "port level mtu is only used for IB ports\n");
+
+ sprintf(buf, "%d\n",
+ ibta_mtu_to_int(mdev->caps.port_ib_mtu[info->port]));
+ return strlen(buf);
+}
+
+static ssize_t set_port_ib_mtu(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct mlx4_port_info *info = container_of(attr, struct mlx4_port_info,
+ port_mtu_attr);
+ struct mlx4_dev *mdev = info->dev;
+ struct mlx4_priv *priv = mlx4_priv(mdev);
+ int err, port, mtu, ibta_mtu = -1;
+
+ if (mdev->caps.port_type[info->port] == MLX4_PORT_TYPE_ETH) {
+ mlx4_warn(mdev, "port level mtu is only used for IB ports\n");
+ return -EINVAL;
+ }
+
+ err = kstrtoint(buf, 0, &mtu);
+ if (!err)
+ ibta_mtu = int_to_ibta_mtu(mtu);
+
+ if (err || ibta_mtu < 0) {
+ mlx4_err(mdev, "%s is an invalid IBTA mtu\n", buf);
+ return -EINVAL;
+ }
+
+ mdev->caps.port_ib_mtu[info->port] = ibta_mtu;
+
+ mlx4_stop_sense(mdev);
+ mutex_lock(&priv->port_mutex);
+ mlx4_unregister_device(mdev);
+ for (port = 1; port <= mdev->caps.num_ports; port++) {
+ mlx4_CLOSE_PORT(mdev, port);
+ err = mlx4_SET_PORT(mdev, port, -1);
+ if (err) {
+ mlx4_err(mdev, "Failed to set port %d, aborting\n",
+ port);
+ goto err_set_port;
+ }
+ }
+ err = mlx4_register_device(mdev);
+err_set_port:
+ mutex_unlock(&priv->port_mutex);
+ mlx4_start_sense(mdev);
+ return err ? err : count;
+}
+
+int mlx4_bond(struct mlx4_dev *dev)
+{
+ int ret = 0;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ mutex_lock(&priv->bond_mutex);
+
+ if (!mlx4_is_bonded(dev))
+ ret = mlx4_do_bond(dev, true);
+ else
+ ret = 0;
+
+ mutex_unlock(&priv->bond_mutex);
+ if (ret)
+ mlx4_err(dev, "Failed to bond device: %d\n", ret);
+ else
+ mlx4_dbg(dev, "Device is bonded\n");
+ return ret;
+}
+EXPORT_SYMBOL_GPL(mlx4_bond);
+
+int mlx4_unbond(struct mlx4_dev *dev)
+{
+ int ret = 0;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ mutex_lock(&priv->bond_mutex);
+
+ if (mlx4_is_bonded(dev))
+ ret = mlx4_do_bond(dev, false);
+
+ mutex_unlock(&priv->bond_mutex);
+ if (ret)
+ mlx4_err(dev, "Failed to unbond device: %d\n", ret);
+ else
+ mlx4_dbg(dev, "Device is unbonded\n");
+ return ret;
+}
+EXPORT_SYMBOL_GPL(mlx4_unbond);
+
+
+int mlx4_port_map_set(struct mlx4_dev *dev, struct mlx4_port_map *v2p)
+{
+ u8 port1 = v2p->port1;
+ u8 port2 = v2p->port2;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int err;
+
+ if (!(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_PORT_REMAP))
+ return -ENOTSUPP;
+
+ mutex_lock(&priv->bond_mutex);
+
+ /* zero means keep current mapping for this port */
+ if (port1 == 0)
+ port1 = priv->v2p.port1;
+ if (port2 == 0)
+ port2 = priv->v2p.port2;
+
+ if ((port1 < 1) || (port1 > MLX4_MAX_PORTS) ||
+ (port2 < 1) || (port2 > MLX4_MAX_PORTS) ||
+ (port1 == 2 && port2 == 1)) {
+ /* besides the boundary checks, cross mapping makes
+ * no sense and is therefore not allowed */
+ err = -EINVAL;
+ } else if ((port1 == priv->v2p.port1) &&
+ (port2 == priv->v2p.port2)) {
+ err = 0;
+ } else {
+ err = mlx4_virt2phy_port_map(dev, port1, port2);
+ if (!err) {
+ mlx4_dbg(dev, "port map changed: [%d][%d]\n",
+ port1, port2);
+ priv->v2p.port1 = port1;
+ priv->v2p.port2 = port2;
+ } else {
+ mlx4_err(dev, "Failed to change port map: %d\n", err);
+ }
+ }
+
+ mutex_unlock(&priv->bond_mutex);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_port_map_set);
+
+int mlx4_port_map_get(struct mlx4_dev *dev, u8 vport, u8 *pport)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ if (!pport)
+ return -EINVAL;
+ *pport = 0;
+
+ if (vport == 1)
+ *pport = priv->v2p.port1;
+ else if (vport == 2)
+ *pport = priv->v2p.port2;
+ if (!*pport)
+ return -EINVAL;
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_port_map_get);
+
+static int mlx4_load_fw(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int err;
+
+ priv->fw.fw_icm = mlx4_alloc_icm(dev, priv->fw.fw_pages,
+ GFP_HIGHUSER | __GFP_NOWARN, 0);
+ if (!priv->fw.fw_icm) {
+ mlx4_err(dev, "Couldn't allocate FW area, aborting\n");
+ return -ENOMEM;
+ }
+
+ err = mlx4_MAP_FA(dev, priv->fw.fw_icm);
+ if (err) {
+ mlx4_err(dev, "MAP_FA command failed, aborting\n");
+ goto err_free;
+ }
+
+ err = mlx4_RUN_FW(dev);
+ if (err) {
+ mlx4_err(dev, "RUN_FW command failed, aborting\n");
+ goto err_unmap_fa;
+ }
+
+ return 0;
+
+err_unmap_fa:
+ mlx4_UNMAP_FA(dev);
+
+err_free:
+ mlx4_free_icm(dev, priv->fw.fw_icm, 0);
+ return err;
+}
+
+static int mlx4_init_cmpt_table(struct mlx4_dev *dev, u64 cmpt_base,
+ int cmpt_entry_sz)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int err;
+ int num_eqs;
+
+ err = mlx4_init_icm_table(dev, &priv->qp_table.cmpt_table,
+ cmpt_base +
+ ((u64) (MLX4_CMPT_TYPE_QP *
+ cmpt_entry_sz) << MLX4_CMPT_SHIFT),
+ cmpt_entry_sz, dev->caps.num_qps,
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW],
+ 0, 0);
+ if (err)
+ goto err;
+
+ err = mlx4_init_icm_table(dev, &priv->srq_table.cmpt_table,
+ cmpt_base +
+ ((u64) (MLX4_CMPT_TYPE_SRQ *
+ cmpt_entry_sz) << MLX4_CMPT_SHIFT),
+ cmpt_entry_sz, dev->caps.num_srqs,
+ dev->caps.reserved_srqs, 0, 0);
+ if (err)
+ goto err_qp;
+
+ err = mlx4_init_icm_table(dev, &priv->cq_table.cmpt_table,
+ cmpt_base +
+ ((u64) (MLX4_CMPT_TYPE_CQ *
+ cmpt_entry_sz) << MLX4_CMPT_SHIFT),
+ cmpt_entry_sz, dev->caps.num_cqs,
+ dev->caps.reserved_cqs, 0, 0);
+ if (err)
+ goto err_srq;
+
+ num_eqs = dev->phys_caps.num_phys_eqs;
+ err = mlx4_init_icm_table(dev, &priv->eq_table.cmpt_table,
+ cmpt_base +
+ ((u64) (MLX4_CMPT_TYPE_EQ *
+ cmpt_entry_sz) << MLX4_CMPT_SHIFT),
+ cmpt_entry_sz, num_eqs, num_eqs, 0, 0);
+ if (err)
+ goto err_cq;
+
+ return 0;
+
+err_cq:
+ mlx4_cleanup_icm_table(dev, &priv->cq_table.cmpt_table);
+
+err_srq:
+ mlx4_cleanup_icm_table(dev, &priv->srq_table.cmpt_table);
+
+err_qp:
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.cmpt_table);
+
+err:
+ return err;
+}
+
+static int mlx4_init_icm(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap,
+ struct mlx4_init_hca_param *init_hca, u64 icm_size)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ u64 aux_pages;
+ int num_eqs;
+ int err;
+
+ err = mlx4_SET_ICM_SIZE(dev, icm_size, &aux_pages);
+ if (err) {
+ mlx4_err(dev, "SET_ICM_SIZE command failed, aborting\n");
+ return err;
+ }
+
+ mlx4_dbg(dev, "%lld KB of HCA context requires %lld KB aux memory\n",
+ (unsigned long long) icm_size >> 10,
+ (unsigned long long) aux_pages << 2);
+
+ priv->fw.aux_icm = mlx4_alloc_icm(dev, aux_pages,
+ GFP_HIGHUSER | __GFP_NOWARN, 0);
+ if (!priv->fw.aux_icm) {
+ mlx4_err(dev, "Couldn't allocate aux memory, aborting\n");
+ return -ENOMEM;
+ }
+
+ err = mlx4_MAP_ICM_AUX(dev, priv->fw.aux_icm);
+ if (err) {
+ mlx4_err(dev, "MAP_ICM_AUX command failed, aborting\n");
+ goto err_free_aux;
+ }
+
+ err = mlx4_init_cmpt_table(dev, init_hca->cmpt_base, dev_cap->cmpt_entry_sz);
+ if (err) {
+ mlx4_err(dev, "Failed to map cMPT context memory, aborting\n");
+ goto err_unmap_aux;
+ }
+
+
+ num_eqs = dev->phys_caps.num_phys_eqs;
+ err = mlx4_init_icm_table(dev, &priv->eq_table.table,
+ init_hca->eqc_base, dev_cap->eqc_entry_sz,
+ num_eqs, num_eqs, 0, 0);
+ if (err) {
+ mlx4_err(dev, "Failed to map EQ context memory, aborting\n");
+ goto err_unmap_cmpt;
+ }
+
+ /*
+ * Reserved MTT entries must be aligned up to a cacheline
+ * boundary, since the FW will write to them, while the driver
+ * writes to all other MTT entries. (The variable
+ * dev->caps.mtt_entry_sz below is really the MTT segment
+ * size, not the raw entry size)
+ */
+ dev->caps.reserved_mtts =
+ ALIGN(dev->caps.reserved_mtts * dev->caps.mtt_entry_sz,
+ dma_get_cache_alignment()) / dev->caps.mtt_entry_sz;
+
+ err = mlx4_init_icm_table(dev, &priv->mr_table.mtt_table,
+ init_hca->mtt_base,
+ dev->caps.mtt_entry_sz,
+ dev->caps.num_mtts,
+ dev->caps.reserved_mtts, 1, 0);
+ if (err) {
+ mlx4_err(dev, "Failed to map MTT context memory, aborting\n");
+ goto err_unmap_eq;
+ }
+
+ err = mlx4_init_icm_table(dev, &priv->mr_table.dmpt_table,
+ init_hca->dmpt_base,
+ dev_cap->dmpt_entry_sz,
+ dev->caps.num_mpts,
+ dev->caps.reserved_mrws, 1, 1);
+ if (err) {
+ mlx4_err(dev, "Failed to map dMPT context memory, aborting\n");
+ goto err_unmap_mtt;
+ }
+
+ err = mlx4_init_icm_table(dev, &priv->qp_table.qp_table,
+ init_hca->qpc_base,
+ dev_cap->qpc_entry_sz,
+ dev->caps.num_qps,
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW],
+ 0, 0);
+ if (err) {
+ mlx4_err(dev, "Failed to map QP context memory, aborting\n");
+ goto err_unmap_dmpt;
+ }
+
+ err = mlx4_init_icm_table(dev, &priv->qp_table.auxc_table,
+ init_hca->auxc_base,
+ dev_cap->aux_entry_sz,
+ dev->caps.num_qps,
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW],
+ 0, 0);
+ if (err) {
+ mlx4_err(dev, "Failed to map AUXC context memory, aborting\n");
+ goto err_unmap_qp;
+ }
+
+ err = mlx4_init_icm_table(dev, &priv->qp_table.altc_table,
+ init_hca->altc_base,
+ dev_cap->altc_entry_sz,
+ dev->caps.num_qps,
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW],
+ 0, 0);
+ if (err) {
+ mlx4_err(dev, "Failed to map ALTC context memory, aborting\n");
+ goto err_unmap_auxc;
+ }
+
+ err = mlx4_init_icm_table(dev, &priv->qp_table.rdmarc_table,
+ init_hca->rdmarc_base,
+ dev_cap->rdmarc_entry_sz << priv->qp_table.rdmarc_shift,
+ dev->caps.num_qps,
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW],
+ 0, 0);
+ if (err) {
+ mlx4_err(dev, "Failed to map RDMARC context memory, aborting\n");
+ goto err_unmap_altc;
+ }
+
+ err = mlx4_init_icm_table(dev, &priv->cq_table.table,
+ init_hca->cqc_base,
+ dev_cap->cqc_entry_sz,
+ dev->caps.num_cqs,
+ dev->caps.reserved_cqs, 0, 0);
+ if (err) {
+ mlx4_err(dev, "Failed to map CQ context memory, aborting\n");
+ goto err_unmap_rdmarc;
+ }
+
+ err = mlx4_init_icm_table(dev, &priv->srq_table.table,
+ init_hca->srqc_base,
+ dev_cap->srq_entry_sz,
+ dev->caps.num_srqs,
+ dev->caps.reserved_srqs, 0, 0);
+ if (err) {
+ mlx4_err(dev, "Failed to map SRQ context memory, aborting\n");
+ goto err_unmap_cq;
+ }
+
+ /*
+ * For flow steering device managed mode it is required to use
+ * mlx4_init_icm_table. For B0 steering mode it's not strictly
+ * required, but for simplicity just map the whole multicast
+ * group table now. The table isn't very big and it's a lot
+ * easier than trying to track ref counts.
+ */
+ err = mlx4_init_icm_table(dev, &priv->mcg_table.table,
+ init_hca->mc_base,
+ mlx4_get_mgm_entry_size(dev),
+ dev->caps.num_mgms + dev->caps.num_amgms,
+ dev->caps.num_mgms + dev->caps.num_amgms,
+ 0, 0);
+ if (err) {
+ mlx4_err(dev, "Failed to map MCG context memory, aborting\n");
+ goto err_unmap_srq;
+ }
+
+ return 0;
+
+err_unmap_srq:
+ mlx4_cleanup_icm_table(dev, &priv->srq_table.table);
+
+err_unmap_cq:
+ mlx4_cleanup_icm_table(dev, &priv->cq_table.table);
+
+err_unmap_rdmarc:
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.rdmarc_table);
+
+err_unmap_altc:
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.altc_table);
+
+err_unmap_auxc:
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.auxc_table);
+
+err_unmap_qp:
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.qp_table);
+
+err_unmap_dmpt:
+ mlx4_cleanup_icm_table(dev, &priv->mr_table.dmpt_table);
+
+err_unmap_mtt:
+ mlx4_cleanup_icm_table(dev, &priv->mr_table.mtt_table);
+
+err_unmap_eq:
+ mlx4_cleanup_icm_table(dev, &priv->eq_table.table);
+
+err_unmap_cmpt:
+ mlx4_cleanup_icm_table(dev, &priv->eq_table.cmpt_table);
+ mlx4_cleanup_icm_table(dev, &priv->cq_table.cmpt_table);
+ mlx4_cleanup_icm_table(dev, &priv->srq_table.cmpt_table);
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.cmpt_table);
+
+err_unmap_aux:
+ mlx4_UNMAP_ICM_AUX(dev);
+
+err_free_aux:
+ mlx4_free_icm(dev, priv->fw.aux_icm, 0);
+
+ return err;
+}
+
+static void mlx4_free_icms(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ mlx4_cleanup_icm_table(dev, &priv->mcg_table.table);
+ mlx4_cleanup_icm_table(dev, &priv->srq_table.table);
+ mlx4_cleanup_icm_table(dev, &priv->cq_table.table);
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.rdmarc_table);
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.altc_table);
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.auxc_table);
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.qp_table);
+ mlx4_cleanup_icm_table(dev, &priv->mr_table.dmpt_table);
+ mlx4_cleanup_icm_table(dev, &priv->mr_table.mtt_table);
+ mlx4_cleanup_icm_table(dev, &priv->eq_table.table);
+ mlx4_cleanup_icm_table(dev, &priv->eq_table.cmpt_table);
+ mlx4_cleanup_icm_table(dev, &priv->cq_table.cmpt_table);
+ mlx4_cleanup_icm_table(dev, &priv->srq_table.cmpt_table);
+ mlx4_cleanup_icm_table(dev, &priv->qp_table.cmpt_table);
+
+ mlx4_UNMAP_ICM_AUX(dev);
+ mlx4_free_icm(dev, priv->fw.aux_icm, 0);
+}
+
+static void mlx4_slave_exit(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ mutex_lock(&priv->cmd.slave_cmd_mutex);
+ if (mlx4_comm_cmd(dev, MLX4_COMM_CMD_RESET, 0, MLX4_COMM_CMD_NA_OP,
+ MLX4_COMM_TIME))
+ mlx4_warn(dev, "Failed to close slave function\n");
+ mutex_unlock(&priv->cmd.slave_cmd_mutex);
+}
+
+static int map_bf_area(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ resource_size_t bf_start;
+ resource_size_t bf_len;
+ int err = 0;
+
+ if (!dev->caps.bf_reg_size)
+ return -ENXIO;
+
+ bf_start = pci_resource_start(dev->persist->pdev, 2) +
+ (dev->caps.num_uars << PAGE_SHIFT);
+ bf_len = pci_resource_len(dev->persist->pdev, 2) -
+ (dev->caps.num_uars << PAGE_SHIFT);
+ priv->bf_mapping = io_mapping_create_wc(bf_start, bf_len);
+ if (!priv->bf_mapping)
+ err = -ENOMEM;
+
+ return err;
+}
+
+static void unmap_bf_area(struct mlx4_dev *dev)
+{
+ if (mlx4_priv(dev)->bf_mapping)
+ io_mapping_free(mlx4_priv(dev)->bf_mapping);
+}
+
+cycle_t mlx4_read_clock(struct mlx4_dev *dev)
+{
+ u32 clockhi, clocklo, clockhi1;
+ cycle_t cycles;
+ int i;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ for (i = 0; i < 10; i++) {
+ clockhi = swab32(readl(priv->clock_mapping));
+ clocklo = swab32(readl(priv->clock_mapping + 4));
+ clockhi1 = swab32(readl(priv->clock_mapping));
+ if (clockhi == clockhi1)
+ break;
+ }
+
+ cycles = (u64) clockhi << 32 | (u64) clocklo;
+
+ return cycles;
+}
+EXPORT_SYMBOL_GPL(mlx4_read_clock);
+
+
+int mlx4_get_internal_clock_params(struct mlx4_dev *dev,
+ struct mlx4_clock_params *params)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ if (mlx4_is_slave(dev))
+ return -ENOTSUPP;
+ if (!params)
+ return -EINVAL;
+
+ params->bar = priv->fw.clock_bar;
+ params->offset = priv->fw.clock_offset;
+ params->size = MLX4_CLOCK_SIZE;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_get_internal_clock_params);
+
+static int map_internal_clock(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ priv->clock_mapping =
+ ioremap(pci_resource_start(dev->persist->pdev,
+ priv->fw.clock_bar) +
+ priv->fw.clock_offset, MLX4_CLOCK_SIZE);
+
+ if (!priv->clock_mapping)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void unmap_internal_clock(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ if (priv->clock_mapping)
+ iounmap(priv->clock_mapping);
+}
+
+static void mlx4_close_hca(struct mlx4_dev *dev)
+{
+ unmap_internal_clock(dev);
+ unmap_bf_area(dev);
+ if (mlx4_is_slave(dev))
+ mlx4_slave_exit(dev);
+ else {
+ mlx4_CLOSE_HCA(dev, 0);
+ mlx4_free_icms(dev);
+ }
+}
+
+static void mlx4_close_fw(struct mlx4_dev *dev)
+{
+ if (!mlx4_is_slave(dev)) {
+ mlx4_UNMAP_FA(dev);
+ mlx4_free_icm(dev, mlx4_priv(dev)->fw.fw_icm, 0);
+ }
+}
+
+static int mlx4_comm_check_offline(struct mlx4_dev *dev)
+{
+#define COMM_CHAN_OFFLINE_OFFSET 0x09
+
+ u32 comm_flags;
+ u32 offline_bit;
+ unsigned long end;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ end = msecs_to_jiffies(MLX4_COMM_OFFLINE_TIME_OUT) + jiffies;
+ while (time_before(jiffies, end)) {
+ comm_flags = swab32(readl((__iomem char *)priv->mfunc.comm +
+ MLX4_COMM_CHAN_FLAGS));
+ offline_bit = (comm_flags &
+ (u32)(1 << COMM_CHAN_OFFLINE_OFFSET));
+ if (!offline_bit)
+ return 0;
+ /* There are cases as part of AER/Reset flow that PF needs
+ * around 100 msec to load. We therefore sleep for 100 msec
+ * to allow other tasks to make use of that CPU during this
+ * time interval.
+ */
+ msleep(100);
+ }
+ mlx4_err(dev, "Communication channel is offline.\n");
+ return -EIO;
+}
+
+static void mlx4_reset_vf_support(struct mlx4_dev *dev)
+{
+#define COMM_CHAN_RST_OFFSET 0x1e
+
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ u32 comm_rst;
+ u32 comm_caps;
+
+ comm_caps = swab32(readl((__iomem char *)priv->mfunc.comm +
+ MLX4_COMM_CHAN_CAPS));
+ comm_rst = (comm_caps & (u32)(1 << COMM_CHAN_RST_OFFSET));
+
+ if (comm_rst)
+ dev->caps.vf_caps |= MLX4_VF_CAP_FLAG_RESET;
+}
+
+static int mlx4_init_slave(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ u64 dma = (u64) priv->mfunc.vhcr_dma;
+ int ret_from_reset = 0;
+ u32 slave_read;
+ u32 cmd_channel_ver;
+
+ if (atomic_read(&pf_loading)) {
+ mlx4_warn(dev, "PF is not ready - Deferring probe\n");
+ return -EPROBE_DEFER;
+ }
+
+ mutex_lock(&priv->cmd.slave_cmd_mutex);
+ priv->cmd.max_cmds = 1;
+ if (mlx4_comm_check_offline(dev)) {
+ mlx4_err(dev, "PF is not responsive, skipping initialization\n");
+ goto err_offline;
+ }
+
+ mlx4_reset_vf_support(dev);
+ mlx4_warn(dev, "Sending reset\n");
+ ret_from_reset = mlx4_comm_cmd(dev, MLX4_COMM_CMD_RESET, 0,
+ MLX4_COMM_CMD_NA_OP, MLX4_COMM_TIME);
+ /* if we are in the middle of flr the slave will try
+ * NUM_OF_RESET_RETRIES times before leaving.*/
+ if (ret_from_reset) {
+ if (MLX4_DELAY_RESET_SLAVE == ret_from_reset) {
+ mlx4_warn(dev, "slave is currently in the middle of FLR - Deferring probe\n");
+ mutex_unlock(&priv->cmd.slave_cmd_mutex);
+ return -EPROBE_DEFER;
+ } else
+ goto err;
+ }
+
+ /* check the driver version - the slave I/F revision
+ * must match the master's */
+ slave_read = swab32(readl(&priv->mfunc.comm->slave_read));
+ cmd_channel_ver = mlx4_comm_get_version();
+
+ if (MLX4_COMM_GET_IF_REV(cmd_channel_ver) !=
+ MLX4_COMM_GET_IF_REV(slave_read)) {
+ mlx4_err(dev, "slave driver version is not supported by the master\n");
+ goto err;
+ }
+
+ mlx4_warn(dev, "Sending vhcr0\n");
+ if (mlx4_comm_cmd(dev, MLX4_COMM_CMD_VHCR0, dma >> 48,
+ MLX4_COMM_CMD_NA_OP, MLX4_COMM_TIME))
+ goto err;
+ if (mlx4_comm_cmd(dev, MLX4_COMM_CMD_VHCR1, dma >> 32,
+ MLX4_COMM_CMD_NA_OP, MLX4_COMM_TIME))
+ goto err;
+ if (mlx4_comm_cmd(dev, MLX4_COMM_CMD_VHCR2, dma >> 16,
+ MLX4_COMM_CMD_NA_OP, MLX4_COMM_TIME))
+ goto err;
+ if (mlx4_comm_cmd(dev, MLX4_COMM_CMD_VHCR_EN, dma,
+ MLX4_COMM_CMD_NA_OP, MLX4_COMM_TIME))
+ goto err;
+
+ mutex_unlock(&priv->cmd.slave_cmd_mutex);
+ return 0;
+
+err:
+ mlx4_comm_cmd(dev, MLX4_COMM_CMD_RESET, 0, MLX4_COMM_CMD_NA_OP, 0);
+err_offline:
+ mutex_unlock(&priv->cmd.slave_cmd_mutex);
+ return -EIO;
+}
+
+static void mlx4_parav_master_pf_caps(struct mlx4_dev *dev)
+{
+ int i;
+
+ for (i = 1; i <= dev->caps.num_ports; i++) {
+ if (dev->caps.port_type[i] == MLX4_PORT_TYPE_ETH)
+ dev->caps.gid_table_len[i] =
+ mlx4_get_slave_num_gids(dev, 0, i);
+ else
+ dev->caps.gid_table_len[i] = 1;
+ dev->caps.pkey_table_len[i] =
+ dev->phys_caps.pkey_phys_table_len[i] - 1;
+ }
+}
+
+static int choose_log_fs_mgm_entry_size(int qp_per_entry)
+{
+ int i = MLX4_MIN_MGM_LOG_ENTRY_SIZE;
+
+ for (i = MLX4_MIN_MGM_LOG_ENTRY_SIZE; i <= MLX4_MAX_MGM_LOG_ENTRY_SIZE;
+ i++) {
+ if (qp_per_entry <= 4 * ((1 << i) / 16 - 2))
+ break;
+ }
+
+ return (i <= MLX4_MAX_MGM_LOG_ENTRY_SIZE) ? i : -1;
+}
+
+static const char *dmfs_high_rate_steering_mode_str(int dmfs_high_steer_mode)
+{
+ switch (dmfs_high_steer_mode) {
+ case MLX4_STEERING_DMFS_A0_DEFAULT:
+ return "default performance";
+
+ case MLX4_STEERING_DMFS_A0_DYNAMIC:
+ return "dynamic hybrid mode";
+
+ case MLX4_STEERING_DMFS_A0_STATIC:
+ return "performance optimized for limited rule configuration (static)";
+
+ case MLX4_STEERING_DMFS_A0_DISABLE:
+ return "disabled performance optimized steering";
+
+ case MLX4_STEERING_DMFS_A0_NOT_SUPPORTED:
+ return "performance optimized steering not supported";
+
+ default:
+ return "Unrecognized mode";
+ }
+}
+
+#define MLX4_DMFS_LOW_QP_COUNT 63
+
+static void choose_steering_mode(struct mlx4_dev *dev,
+ struct mlx4_dev_cap *dev_cap)
+{
+ int mlx4_current_steering_mode = mlx4_log_num_mgm_entry_size;
+ dev->caps.steering_attr = 0;
+
+ if (mlx4_current_steering_mode <= 0) {
+ if (!((-mlx4_current_steering_mode) & MLX4_FORCE_DMFS_IF_NO_NCSI_FS))
+ if (!(dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_FS_EN_NCSI))
+ mlx4_current_steering_mode =
+ MLX4_DEFAULT_MGM_LOG_ENTRY_SIZE;
+
+ if ((-mlx4_current_steering_mode) & MLX4_DISABLE_DMFS_LOW_QP_NUM)
+ if (dev_cap->fs_max_num_qp_per_entry <= MLX4_DMFS_LOW_QP_COUNT) {
+ mlx4_warn(dev, "FW supports only %d QPs per mcg entry, "
+ "falling back to B0\n",
+ dev_cap->fs_max_num_qp_per_entry);
+ mlx4_current_steering_mode =
+ MLX4_DEFAULT_MGM_LOG_ENTRY_SIZE;
+ }
+
+ if ((-mlx4_current_steering_mode) & MLX4_DMFS_A0_STEERING) {
+ if (dev->caps.dmfs_high_steer_mode ==
+ MLX4_STEERING_DMFS_A0_NOT_SUPPORTED)
+ mlx4_err(dev, "DMFS high rate mode not supported\n");
+ else
+ dev->caps.dmfs_high_steer_mode =
+ MLX4_STEERING_DMFS_A0_STATIC;
+ }
+ if (dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_DISABLE_SIP_CHECK) {
+ if (-mlx4_current_steering_mode & MLX4_IB_IGNORE_SIP_CHECK)
+ dev->caps.steering_attr |= MLX4_STEERING_ATTR_IB_IGNORE_SIP;
+ if (-mlx4_current_steering_mode & MLX4_ETH_IGNORE_SIP_CHECK)
+ dev->caps.steering_attr |= MLX4_STEERING_ATTR_ETH_IGNORE_SIP;
+ }
+ }
+
+ if (mlx4_current_steering_mode <= 0 &&
+ dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_FS_EN &&
+ (!mlx4_is_mfunc(dev) ||
+ (dev_cap->fs_max_num_qp_per_entry >=
+ (dev->persist->num_vfs + 1))) &&
+ choose_log_fs_mgm_entry_size(dev_cap->fs_max_num_qp_per_entry) >=
+ MLX4_MIN_MGM_LOG_ENTRY_SIZE) {
+ dev->oper_log_mgm_entry_size =
+ choose_log_fs_mgm_entry_size(dev_cap->fs_max_num_qp_per_entry);
+ dev->caps.steering_mode = MLX4_STEERING_MODE_DEVICE_MANAGED;
+ dev->caps.num_qp_per_mgm = dev_cap->fs_max_num_qp_per_entry;
+
+ dev->caps.steering_attr |= MLX4_STEERING_ATTR_DMFS_EN;
+
+ if (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_DMFS_IPOIB &&
+ (!((-mlx4_current_steering_mode) & MLX4_DMFS_ETH_ONLY)))
+ dev->caps.steering_attr |= MLX4_STEERING_ATTR_DMFS_IPOIB;
+
+ dev->caps.fs_log_max_ucast_qp_range_size =
+ dev_cap->fs_log_max_ucast_qp_range_size;
+ } else {
+ if (dev->caps.dmfs_high_steer_mode !=
+ MLX4_STEERING_DMFS_A0_NOT_SUPPORTED)
+ dev->caps.dmfs_high_steer_mode = MLX4_STEERING_DMFS_A0_DISABLE;
+ if (dev->caps.flags & MLX4_DEV_CAP_FLAG_VEP_UC_STEER &&
+ dev->caps.flags & MLX4_DEV_CAP_FLAG_VEP_MC_STEER)
+ dev->caps.steering_mode = MLX4_STEERING_MODE_B0;
+ else {
+ dev->caps.steering_mode = MLX4_STEERING_MODE_A0;
+
+ if (dev->caps.flags & MLX4_DEV_CAP_FLAG_VEP_UC_STEER ||
+ dev->caps.flags & MLX4_DEV_CAP_FLAG_VEP_MC_STEER)
+ mlx4_warn(dev, "Must have both UC_STEER and MC_STEER flags set to use B0 steering - falling back to A0 steering mode\n");
+ }
+ dev->oper_log_mgm_entry_size =
+ mlx4_current_steering_mode > 0 ?
+ mlx4_current_steering_mode :
+ MLX4_DEFAULT_MGM_LOG_ENTRY_SIZE;
+ dev->caps.num_qp_per_mgm = mlx4_get_qp_per_mgm(dev);
+ }
+ mlx4_dbg(dev, "Steering mode is: %s, oper_log_mgm_entry_size = %d, modparam log_num_mgm_entry_size = %d\n",
+ mlx4_steering_mode_str(dev->caps.steering_mode),
+ dev->oper_log_mgm_entry_size,
+ mlx4_current_steering_mode);
+}
+
+static void choose_tunnel_offload_mode(struct mlx4_dev *dev,
+ struct mlx4_dev_cap *dev_cap)
+{
+#ifdef HAVE_VXLAN_ENABLED
+ if (dev->caps.steering_mode == MLX4_STEERING_MODE_DEVICE_MANAGED &&
+ dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_VXLAN_OFFLOADS)
+ dev->caps.tunnel_offload_mode = MLX4_TUNNEL_OFFLOAD_MODE_VXLAN;
+ else
+#endif
+ dev->caps.tunnel_offload_mode = MLX4_TUNNEL_OFFLOAD_MODE_NONE;
+
+ mlx4_dbg(dev, "Tunneling offload mode is: %s\n", (dev->caps.tunnel_offload_mode
+ == MLX4_TUNNEL_OFFLOAD_MODE_VXLAN) ? "vxlan" : "none");
+}
+
+static int mlx4_validate_optimized_steering(struct mlx4_dev *dev)
+{
+ int i;
+ struct mlx4_port_cap port_cap;
+
+ if (dev->caps.dmfs_high_steer_mode == MLX4_STEERING_DMFS_A0_NOT_SUPPORTED)
+ return -EINVAL;
+
+ for (i = 1; i <= dev->caps.num_ports; i++) {
+ if (mlx4_dev_port(dev, i, &port_cap)) {
+ mlx4_err(dev,
+ "QUERY_DEV_CAP command failed, can't verify DMFS high rate steering.\n");
+ } else if ((dev->caps.dmfs_high_steer_mode !=
+ MLX4_STEERING_DMFS_A0_DEFAULT) &&
+ (port_cap.dmfs_optimized_state ==
+ !!(dev->caps.dmfs_high_steer_mode ==
+ MLX4_STEERING_DMFS_A0_DISABLE))) {
+ mlx4_err(dev,
+ "DMFS high rate steer mode differs, driver requested %s but %s in FW.\n",
+ dmfs_high_rate_steering_mode_str(
+ dev->caps.dmfs_high_steer_mode),
+ (port_cap.dmfs_optimized_state ?
+ "enabled" : "disabled"));
+ }
+ }
+
+ return 0;
+}
+
+static int mlx4_init_fw(struct mlx4_dev *dev)
+{
+ struct mlx4_mod_stat_cfg mlx4_cfg;
+ int err = 0;
+
+ if (!mlx4_is_slave(dev)) {
+ err = mlx4_QUERY_FW(dev);
+ if (err) {
+ if (err == -EACCES)
+ mlx4_info(dev, "non-primary physical function, skipping\n");
+ else
+ mlx4_err(dev, "QUERY_FW command failed, aborting\n");
+ return err;
+ }
+
+ err = mlx4_load_fw(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to start FW, aborting\n");
+ return err;
+ }
+
+ mlx4_cfg.log_pg_sz_m = 1;
+ mlx4_cfg.log_pg_sz = 0;
+ err = mlx4_MOD_STAT_CFG(dev, &mlx4_cfg);
+ if (err)
+ mlx4_warn(dev, "Failed to override log_pg_sz parameter\n");
+ }
+
+ return err;
+}
+
+static int mlx4_init_hca(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_adapter adapter;
+ struct mlx4_dev_cap dev_cap;
+ struct mlx4_profile profile;
+ struct mlx4_init_hca_param init_hca;
+ u64 icm_size;
+ struct mlx4_config_dev_params params;
+ int err;
+
+ if (!mlx4_is_slave(dev)) {
+ err = mlx4_dev_cap(dev, &dev_cap);
+ if (err) {
+ mlx4_err(dev, "QUERY_DEV_CAP command failed, aborting\n");
+ return err;
+ }
+
+ choose_steering_mode(dev, &dev_cap);
+ choose_roce_mode(dev, &dev_cap);
+ choose_tunnel_offload_mode(dev, &dev_cap);
+
+ if (dev->caps.dmfs_high_steer_mode == MLX4_STEERING_DMFS_A0_STATIC &&
+ mlx4_is_master(dev))
+ dev->caps.function_caps |= MLX4_FUNC_CAP_DMFS_A0_STATIC;
+
+ err = mlx4_get_phys_port_id(dev);
+ if (err)
+ mlx4_err(dev, "Failed to get physical port id\n");
+
+ if (mlx4_is_master(dev))
+ mlx4_parav_master_pf_caps(dev);
+
+ if (mlx4_low_memory_profile()) {
+ mlx4_info(dev, "Running from within kdump kernel. Using low memory profile\n");
+ /* use old default log_mtts_per_seg */
+ log_mtts_per_seg = ilog2(MLX4_MTT_ENTRY_PER_SEG);
+ profile = low_mem_profile;
+ } else {
+ process_mod_param_profile(&profile);
+ }
+ if (dev->caps.steering_mode ==
+ MLX4_STEERING_MODE_DEVICE_MANAGED)
+ profile.num_mcg = MLX4_FS_NUM_MCG;
+
+ icm_size = mlx4_make_profile(dev, &profile, &dev_cap,
+ &init_hca);
+ if ((long long) icm_size < 0) {
+ err = icm_size;
+ return err;
+ }
+
+ dev->caps.max_fmr_maps = (1 << (32 - ilog2(dev->caps.num_mpts))) - 1;
+
+ init_hca.log_uar_sz = ilog2(dev->caps.num_uars);
+ init_hca.uar_page_sz = PAGE_SHIFT - 12;
+ init_hca.mw_enabled = 0;
+ if (dev->caps.flags & MLX4_DEV_CAP_FLAG_MEM_WINDOW ||
+ dev->caps.bmme_flags & MLX4_BMME_FLAG_TYPE_2_WIN)
+ init_hca.mw_enabled = INIT_HCA_TPT_MW_ENABLE;
+
+ err = mlx4_init_icm(dev, &dev_cap, &init_hca, icm_size);
+ if (err)
+ return err;
+
+ err = mlx4_INIT_HCA(dev, &init_hca);
+ if (err) {
+ mlx4_err(dev, "INIT_HCA command failed, aborting\n");
+ goto err_free_icm;
+ }
+
+ if (dev_cap.flags2 & MLX4_DEV_CAP_FLAG2_SYS_EQS) {
+ err = mlx4_query_func(dev, &dev_cap);
+ if (err < 0) {
+ mlx4_err(dev, "QUERY_FUNC command failed, aborting.\n");
+ goto err_close;
+ } else if (err & MLX4_QUERY_FUNC_NUM_SYS_EQS) {
+ dev->caps.num_eqs = dev_cap.max_eqs;
+ dev->caps.reserved_eqs = dev_cap.reserved_eqs;
+ dev->caps.reserved_uars = dev_cap.reserved_uars;
+ }
+ }
+
+ /*
+ * If TS is supported by FW
+ * read HCA frequency by QUERY_HCA command
+ */
+ if (dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_TS) {
+ memset(&init_hca, 0, sizeof(init_hca));
+ err = mlx4_QUERY_HCA(dev, &init_hca);
+ if (err) {
+ mlx4_err(dev, "QUERY_HCA command failed, disable timestamp\n");
+ dev->caps.flags2 &= ~MLX4_DEV_CAP_FLAG2_TS;
+ } else {
+ dev->caps.hca_core_clock =
+ init_hca.hca_core_clock;
+ }
+
+ /* In case we got HCA frequency 0 - disable timestamping
+ * to avoid dividing by zero
+ */
+ if (!dev->caps.hca_core_clock) {
+ dev->caps.flags2 &= ~MLX4_DEV_CAP_FLAG2_TS;
+ mlx4_err(dev,
+ "HCA frequency is 0 - timestamping is not supported\n");
+ } else if (map_internal_clock(dev)) {
+ /*
+ * Map internal clock,
+ * in case of failure disable timestamping
+ */
+ dev->caps.flags2 &= ~MLX4_DEV_CAP_FLAG2_TS;
+ mlx4_err(dev, "Failed to map internal clock. Timestamping is not supported\n");
+ }
+ }
+
+ if (dev->caps.dmfs_high_steer_mode !=
+ MLX4_STEERING_DMFS_A0_NOT_SUPPORTED) {
+ if (mlx4_validate_optimized_steering(dev))
+ mlx4_warn(dev, "Optimized steering validation failed\n");
+
+ if (dev->caps.dmfs_high_steer_mode ==
+ MLX4_STEERING_DMFS_A0_DISABLE) {
+ dev->caps.dmfs_high_rate_qpn_base =
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW];
+ dev->caps.dmfs_high_rate_qpn_range =
+ MLX4_A0_STEERING_TABLE_SIZE;
+ }
+
+ mlx4_dbg(dev, "DMFS high rate steer mode is: %s\n",
+ dmfs_high_rate_steering_mode_str(
+ dev->caps.dmfs_high_steer_mode));
+ }
+ } else {
+ err = mlx4_init_slave(dev);
+ if (err) {
+ if (err != -EPROBE_DEFER)
+ mlx4_err(dev, "Failed to initialize slave\n");
+ return err;
+ }
+
+ err = mlx4_slave_cap(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to obtain slave caps\n");
+ goto err_close;
+ }
+ }
+
+ if (map_bf_area(dev))
+ mlx4_dbg(dev, "Failed to map blue flame area\n");
+
+ /* Only the master sets the ports; all the rest get them from it. */
+ if (!mlx4_is_slave(dev))
+ mlx4_set_port_mask(dev);
+
+ err = mlx4_QUERY_ADAPTER(dev, &adapter);
+ if (err) {
+ mlx4_err(dev, "QUERY_ADAPTER command failed, aborting\n");
+ goto unmap_bf;
+ }
+
+ /* Query CONFIG_DEV parameters */
+ err = mlx4_config_dev_retrieval(dev, &params);
+ if (err && err != -ENOTSUPP) {
+ mlx4_err(dev, "Failed to query CONFIG_DEV parameters\n");
+ } else if (!err) {
+ dev->caps.rx_checksum_flags_port[1] = params.rx_csum_flags_port_1;
+ dev->caps.rx_checksum_flags_port[2] = params.rx_csum_flags_port_2;
+ }
+ priv->eq_table.inta_pin = adapter.inta_pin;
+ memcpy(dev->board_id, adapter.board_id, sizeof dev->board_id);
+
+ return 0;
+
+unmap_bf:
+ unmap_internal_clock(dev);
+ unmap_bf_area(dev);
+
+ if (mlx4_is_slave(dev))
+ mlx4_slave_destroy_special_qp_cap(dev);
+
+err_close:
+ if (mlx4_is_slave(dev))
+ mlx4_slave_exit(dev);
+ else
+ mlx4_CLOSE_HCA(dev, 0);
+
+err_free_icm:
+ if (!mlx4_is_slave(dev))
+ mlx4_free_icms(dev);
+
+ return err;
+}
+
+static int mlx4_init_counters_table(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int nent_pow2, port_indx, vf_index, num_counters;
+ int res, index = 0;
+ struct counter_index *new_counter_index;
+
+
+ mutex_init(&priv->counters_table.mutex);
+
+ if (!(dev->caps.flags & MLX4_DEV_CAP_FLAG_COUNTERS))
+ return -ENOENT;
+
+ if (!mlx4_is_slave(dev) &&
+ dev->caps.max_counters == dev->caps.max_extended_counters) {
+ res = mlx4_cmd(dev, MLX4_IF_STATE_EXTENDED, 0, 0,
+ MLX4_CMD_SET_IF_STAT,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+ if (res) {
+ mlx4_err(dev, "Failed to set extended counters (err=%d)\n", res);
+ return res;
+ }
+ }
+
+ if (mlx4_is_slave(dev)) {
+ for (port_indx = 0; port_indx < dev->caps.num_ports; port_indx++) {
+ INIT_LIST_HEAD(&priv->counters_table.global_port_list[port_indx]);
+ if (dev->caps.def_counter_index[port_indx] != 0xFF) {
+ new_counter_index = kmalloc(sizeof(struct counter_index), GFP_KERNEL);
+ if (!new_counter_index)
+ return -ENOMEM;
+ new_counter_index->index = dev->caps.def_counter_index[port_indx];
+ list_add_tail(&new_counter_index->list, &priv->counters_table.global_port_list[port_indx]);
+ }
+ }
+ mlx4_dbg(dev, "%s: slave allocated %d counters for %d ports\n",
+ __func__, dev->caps.num_ports, dev->caps.num_ports);
+ return 0;
+ }
+
+ nent_pow2 = roundup_pow_of_two(dev->caps.max_counters);
+
+ for (port_indx = 0; port_indx < dev->caps.num_ports; port_indx++) {
+ INIT_LIST_HEAD(&priv->counters_table.global_port_list[port_indx]);
+ /* allocating 2 counters per port for PFs */
+ /* For the PF, the ETH default counters are 0,2; */
+ /* and the RoCE default counters are 1,3 */
+ for (num_counters = 0; num_counters < 2; num_counters++, index++) {
+ new_counter_index = kmalloc(sizeof(struct counter_index), GFP_KERNEL);
+ if (!new_counter_index)
+ return -ENOMEM;
+ new_counter_index->index = index;
+ list_add_tail(&new_counter_index->list,
+ &priv->counters_table.global_port_list[port_indx]);
+ }
+ }
+
+ if (mlx4_is_master(dev)) {
+ for (vf_index = 0; vf_index < dev->persist->num_vfs; vf_index++) {
+ int slave = mlx4_get_slave_indx(&priv->dev, vf_index);
+ struct mlx4_active_ports actv_ports;
+ if (slave < 0)
+ continue;
+ actv_ports = mlx4_get_active_ports(&priv->dev, slave);
+ for (port_indx = 0; port_indx < dev->caps.num_ports; port_indx++) {
+ INIT_LIST_HEAD(&priv->counters_table.vf_list[vf_index][port_indx]);
+ new_counter_index = kmalloc(sizeof(struct counter_index), GFP_KERNEL);
+ if (!new_counter_index)
+ return -ENOMEM;
+ if (index < nent_pow2 - 1 &&
+ test_bit(port_indx, actv_ports.ports)) {
+ new_counter_index->index = index;
+ index++;
+ } else {
+ new_counter_index->index = MLX4_SINK_COUNTER_INDEX;
+ }
+
+ list_add_tail(&new_counter_index->list,
+ &priv->counters_table.vf_list[vf_index][port_indx]);
+ }
+ }
+
+ res = mlx4_bitmap_init(&priv->counters_table.bitmap,
+ nent_pow2, nent_pow2 - 1,
+ index, 1);
+ mlx4_dbg(dev, "%s: master allocated %d counters for %d VFs\n",
+ __func__, index, dev->persist->num_vfs);
+ } else {
+ res = mlx4_bitmap_init(&priv->counters_table.bitmap,
+ nent_pow2, nent_pow2 - 1,
+ index, 1);
+ mlx4_dbg(dev, "%s: native allocated %d counters for %d ports\n",
+ __func__, index, dev->caps.num_ports);
+ }
+
+ return 0;
+
+}
+
+static void mlx4_cleanup_counters_table(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int i, j;
+ struct counter_index *port, *tmp_port;
+ struct counter_index *vf, *tmp_vf;
+
+ mutex_lock(&priv->counters_table.mutex);
+
+ if (dev->caps.flags & MLX4_DEV_CAP_FLAG_COUNTERS) {
+ for (i = 0; i < dev->caps.num_ports; i++) {
+ list_for_each_entry_safe(port, tmp_port,
+ &priv->counters_table.global_port_list[i],
+ list) {
+ list_del(&port->list);
+ kfree(port);
+ }
+ }
+ if (mlx4_is_master(dev)) {
+ for (i = 0; i < dev->persist->num_vfs; i++) {
+ for (j = 0; j < dev->caps.num_ports; j++) {
+ list_for_each_entry_safe(vf, tmp_vf,
+ &priv->counters_table.vf_list[i][j],
+ list) {
+ /* clear the counter statistic */
+ if (__mlx4_clear_if_stat(dev, vf->index))
+ mlx4_dbg(dev, "%s: reset counter %d failed\n",
+ __func__, vf->index);
+ list_del(&vf->list);
+ kfree(vf);
+ }
+ }
+ }
+ }
+ mlx4_bitmap_cleanup(&priv->counters_table.bitmap);
+ }
+ mutex_unlock(&priv->counters_table.mutex);
+}
+
+int __mlx4_slave_counters_free(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int i, first;
+ struct counter_index *vf, *tmp_vf;
+
+ /* clean VF's counters for the next usage */
+ if (slave > 0 && slave <= dev->persist->num_vfs) {
+ mlx4_dbg(dev, "%s: free counters of slave(%d)\n"
+ , __func__, slave);
+
+ mutex_lock(&priv->counters_table.mutex);
+ for (i = 0; i < dev->caps.num_ports; i++) {
+ first = 0;
+ list_for_each_entry_safe(vf, tmp_vf,
+ &priv->counters_table.vf_list[slave - 1][i],
+ list) {
+ /* clear the counter statistic */
+ if (__mlx4_clear_if_stat(dev, vf->index))
+ mlx4_dbg(dev, "%s: reset counter %d failed\n",
+ __func__, vf->index);
+ if (first++ && vf->index != MLX4_SINK_COUNTER_INDEX) {
+ mlx4_dbg(dev, "%s: delete counter index %d for slave %d and port %d\n"
+ , __func__, vf->index, slave, i + 1);
+ mlx4_bitmap_free(&priv->counters_table.bitmap, vf->index, MLX4_USE_RR);
+ list_del(&vf->list);
+ kfree(vf);
+ } else {
+ mlx4_dbg(dev, "%s: can't delete default counter index %d for slave %d and port %d\n"
+ , __func__, vf->index, slave, i + 1);
+ }
+ }
+ }
+ mutex_unlock(&priv->counters_table.mutex);
+ }
+
+ return 0;
+}
+
+int __mlx4_counter_alloc(struct mlx4_dev *dev, int slave, int port, u32 *idx)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct counter_index *new_counter_index;
+
+ if (!(dev->caps.flags & MLX4_DEV_CAP_FLAG_COUNTERS))
+ return -ENOENT;
+
+ if ((slave > MLX4_MAX_NUM_VF) || (slave < 0) ||
+ (port < 0) || (port > dev->caps.num_ports)) {
+ mlx4_dbg(dev, "%s: invalid slave(%d) or port(%d) index\n",
+ __func__, slave, port);
+ return -EINVAL;
+ }
+
+ /* handle old guest requests that do not support requesting by port index */
+ if (port == 0) {
+ *idx = MLX4_SINK_COUNTER_INDEX;
+ mlx4_dbg(dev, "%s: allocated default counter index %d for slave %d port %d\n"
+ , __func__, *idx, slave, port);
+ return 0;
+ }
+
+ mutex_lock(&priv->counters_table.mutex);
+
+ *idx = mlx4_bitmap_alloc(&priv->counters_table.bitmap);
+ /* if no resources return the default counter of the slave and port */
+ if (*idx == -1) {
+ if (slave == 0) { /* its the ethernet counter ?????? */
+ new_counter_index = list_entry(priv->counters_table.global_port_list[port - 1].next,
+ struct counter_index,
+ list);
+ } else {
+ new_counter_index = list_entry(priv->counters_table.vf_list[slave - 1][port - 1].next,
+ struct counter_index,
+ list);
+ }
+
+ *idx = new_counter_index->index;
+ mlx4_dbg(dev, "%s: allocated default counter index %d for slave %d port %d\n"
+ , __func__, *idx, slave, port);
+ goto out;
+ }
+
+ if (slave == 0) { /* native or master */
+ new_counter_index = kmalloc(sizeof(struct counter_index), GFP_KERNEL);
+ if (!new_counter_index)
+ goto no_mem;
+ new_counter_index->index = *idx;
+ list_add_tail(&new_counter_index->list, &priv->counters_table.global_port_list[port - 1]);
+ } else {
+ new_counter_index = kmalloc(sizeof(struct counter_index), GFP_KERNEL);
+ if (!new_counter_index)
+ goto no_mem;
+ new_counter_index->index = *idx;
+ list_add_tail(&new_counter_index->list, &priv->counters_table.vf_list[slave - 1][port - 1]);
+ }
+
+ mlx4_dbg(dev, "%s: allocated counter index %d for slave %d port %d\n"
+ , __func__, *idx, slave, port);
+out:
+ mutex_unlock(&priv->counters_table.mutex);
+ return 0;
+
+no_mem:
+ mlx4_bitmap_free(&priv->counters_table.bitmap, *idx, MLX4_USE_RR);
+ mutex_unlock(&priv->counters_table.mutex);
+ *idx = MLX4_SINK_COUNTER_INDEX;
+ mlx4_dbg(dev, "%s: failed err (%d)\n"
+ , __func__, -ENOMEM);
+ return -ENOMEM;
+}
+
+int mlx4_counter_alloc(struct mlx4_dev *dev, u8 port, u32 *idx)
+{
+ u64 out_param;
+ int err;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct counter_index *new_counter_index, *c_index;
+
+ if (mlx4_is_mfunc(dev)) {
+ err = mlx4_cmd_imm(dev, 0, &out_param,
+ ((u32) port) << 8 | (u32) RES_COUNTER,
+ RES_OP_RESERVE, MLX4_CMD_ALLOC_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+ if (!err) {
+ *idx = get_param_l(&out_param);
+ if (*idx == MLX4_SINK_COUNTER_INDEX)
+ return -ENOSPC;
+
+ mutex_lock(&priv->counters_table.mutex);
+ c_index = list_entry(priv->counters_table.global_port_list[port - 1].next,
+ struct counter_index,
+ list);
+ mutex_unlock(&priv->counters_table.mutex);
+ if (c_index->index == *idx)
+ return -EEXIST;
+
+ if (mlx4_is_slave(dev)) {
+ new_counter_index = kmalloc(sizeof(struct counter_index), GFP_KERNEL);
+ if (!new_counter_index) {
+ mlx4_counter_free(dev, port, *idx);
+ return -ENOMEM;
+ }
+ new_counter_index->index = *idx;
+ mutex_lock(&priv->counters_table.mutex);
+ list_add_tail(&new_counter_index->list, &priv->counters_table.global_port_list[port - 1]);
+ mutex_unlock(&priv->counters_table.mutex);
+ mlx4_dbg(dev, "%s: allocated counter index %d for port %d\n"
+ , __func__, *idx, port);
+ }
+ }
+ return err;
+ }
+ return __mlx4_counter_alloc(dev, 0, port, idx);
+}
+EXPORT_SYMBOL_GPL(mlx4_counter_alloc);
+
+void __mlx4_counter_free(struct mlx4_dev *dev, int slave, int port, u32 idx)
+{
+ /* check whether native or slave, and delete accordingly */
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct counter_index *pf, *tmp_pf;
+ struct counter_index *vf, *tmp_vf;
+ int first;
+
+
+ if (idx == MLX4_SINK_COUNTER_INDEX) {
+ mlx4_dbg(dev, "%s: try to delete default counter index %d for port %d\n"
+ , __func__, idx, port);
+ return;
+ }
+
+ if ((slave > MLX4_MAX_NUM_VF) || (slave < 0) ||
+ (port < 0) || (port > MLX4_MAX_PORTS)) {
+ mlx4_warn(dev, "%s: deletion failed due to invalid slave(%d) or port(%d) index\n"
+ , __func__, slave, port);
+ return;
+ }
+
+ mutex_lock(&priv->counters_table.mutex);
+ if (slave == 0) {
+ first = 0;
+ list_for_each_entry_safe(pf, tmp_pf,
+ &priv->counters_table.global_port_list[port - 1],
+ list) {
+ /* the first 2 counters are reserved */
+ if (pf->index == idx) {
+ /* clear the counter statistic */
+ if (__mlx4_clear_if_stat(dev, pf->index))
+ mlx4_dbg(dev, "%s: reset counter %d failed\n",
+ __func__, pf->index);
+ if (1 < first && idx != MLX4_SINK_COUNTER_INDEX) {
+ list_del(&pf->list);
+ kfree(pf);
+ mlx4_dbg(dev, "%s: delete counter index %d for native device (%d) port %d\n"
+ , __func__, idx, slave, port);
+ mlx4_bitmap_free(&priv->counters_table.bitmap, idx, MLX4_USE_RR);
+ goto out;
+ } else {
+ mlx4_dbg(dev, "%s: can't delete default counter index %d for native device (%d) port %d\n"
+ , __func__, idx, slave, port);
+ goto out;
+ }
+ }
+ first++;
+ }
+ mlx4_dbg(dev, "%s: can't delete counter index %d for native device (%d) port %d\n"
+ , __func__, idx, slave, port);
+ } else {
+ first = 0;
+ list_for_each_entry_safe(vf, tmp_vf,
+ &priv->counters_table.vf_list[slave - 1][port - 1],
+ list) {
+ /* the first element is reserved */
+ if (vf->index == idx) {
+ /* clear the counter statistic */
+ if (__mlx4_clear_if_stat(dev, vf->index))
+ mlx4_dbg(dev, "%s: reset counter %d failed\n",
+ __func__, vf->index);
+ if (first) {
+ list_del(&vf->list);
+ kfree(vf);
+ mlx4_dbg(dev, "%s: delete counter index %d for slave %d port %d\n",
+ __func__, idx, slave, port);
+ mlx4_bitmap_free(&priv->counters_table.bitmap, idx, MLX4_USE_RR);
+ goto out;
+ } else {
+ mlx4_dbg(dev, "%s: can't delete default slave (%d) counter index %d for port %d\n"
+ , __func__, slave, idx, port);
+ goto out;
+ }
+ }
+ first++;
+ }
+ mlx4_dbg(dev, "%s: can't delete slave (%d) counter index %d for port %d\n"
+ , __func__, slave, idx, port);
+ }
+
+out:
+ mutex_unlock(&priv->counters_table.mutex);
+}
+
+void mlx4_counter_free(struct mlx4_dev *dev, u8 port, u32 idx)
+{
+ u64 in_param = 0;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct counter_index *counter, *tmp_counter;
+ int first = 0;
+
+ if (mlx4_is_mfunc(dev)) {
+ set_param_l(&in_param, idx);
+ mlx4_cmd(dev, in_param,
+ ((u32) port) << 8 | (u32) RES_COUNTER,
+ RES_OP_RESERVE,
+ MLX4_CMD_FREE_RES, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_WRAPPED);
+
+ if (mlx4_is_slave(dev) && idx != MLX4_SINK_COUNTER_INDEX) {
+ mutex_lock(&priv->counters_table.mutex);
+ list_for_each_entry_safe(counter, tmp_counter,
+ &priv->counters_table.global_port_list[port - 1],
+ list) {
+ if (counter->index == idx && first++) {
+ list_del(&counter->list);
+ kfree(counter);
+ mlx4_dbg(dev, "%s: delete counter index %d for port %d\n"
+ , __func__, idx, port);
+ mutex_unlock(&priv->counters_table.mutex);
+ return;
+ }
+ }
+ mutex_unlock(&priv->counters_table.mutex);
+ }
+
+ return;
+ }
+ __mlx4_counter_free(dev, 0, port, idx);
+}
+EXPORT_SYMBOL_GPL(mlx4_counter_free);
+
+int __mlx4_clear_if_stat(struct mlx4_dev *dev,
+ u8 counter_index)
+{
+ struct mlx4_cmd_mailbox *if_stat_mailbox = NULL;
+ int err = 0;
+ u32 if_stat_in_mod = (counter_index & 0xff) | (1 << 31);
+
+ if (counter_index == MLX4_SINK_COUNTER_INDEX)
+ return -EINVAL;
+
+ if (mlx4_is_slave(dev))
+ return 0;
+
+ if_stat_mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(if_stat_mailbox)) {
+ err = PTR_ERR(if_stat_mailbox);
+ return err;
+ }
+
+ err = mlx4_cmd_box(dev, 0, if_stat_mailbox->dma, if_stat_in_mod, 0,
+ MLX4_CMD_QUERY_IF_STAT, MLX4_CMD_TIME_CLASS_C,
+ MLX4_CMD_NATIVE);
+
+ mlx4_free_cmd_mailbox(dev, if_stat_mailbox);
+ return err;
+}
+
+u8 mlx4_get_default_counter_index(struct mlx4_dev *dev, int slave, int port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct counter_index *new_counter_index;
+
+ mutex_lock(&priv->counters_table.mutex);
+ if (slave == 0) {
+ new_counter_index = list_entry(priv->counters_table.global_port_list[port - 1].next,
+ struct counter_index,
+ list);
+ } else {
+ new_counter_index = list_entry(priv->counters_table.vf_list[slave - 1][port - 1].next,
+ struct counter_index,
+ list);
+ }
+ mutex_unlock(&priv->counters_table.mutex);
+
+ mlx4_dbg(dev, "%s: return counter index %d for slave %d port %d\n",
+ __func__, new_counter_index->index, slave, port);
+
+ return (u8)new_counter_index->index;
+}
+
+int mlx4_get_vport_ethtool_stats(struct mlx4_dev *dev, int port,
+ struct mlx4_en_vport_stats *vport_stats,
+ int reset, int *read_counters)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_cmd_mailbox *if_stat_mailbox = NULL;
+ union mlx4_counter *counter;
+ int err = 0;
+ u32 if_stat_in_mod;
+ struct counter_index *vport, *tmp_vport;
+
+ if (!vport_stats)
+ return -EINVAL;
+
+ if_stat_mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(if_stat_mailbox)) {
+ err = PTR_ERR(if_stat_mailbox);
+ return err;
+ }
+
+ mutex_lock(&priv->counters_table.mutex);
+ list_for_each_entry_safe(vport, tmp_vport,
+ &priv->counters_table.global_port_list[port - 1],
+ list) {
+ if (vport->index == MLX4_SINK_COUNTER_INDEX)
+ continue;
+
+ memset(if_stat_mailbox->buf, 0, sizeof(union mlx4_counter));
+ if_stat_in_mod = (vport->index & 0xff) | ((reset & 1) << 31);
+ err = mlx4_cmd_box(dev, 0, if_stat_mailbox->dma,
+ if_stat_in_mod, 0,
+ MLX4_CMD_QUERY_IF_STAT,
+ MLX4_CMD_TIME_CLASS_C,
+ MLX4_CMD_NATIVE);
+ if (err) {
+ mlx4_dbg(dev, "%s: failed to read statistics for counter index %d\n",
+ __func__, vport->index);
+ goto if_stat_out;
+ }
+ counter = (union mlx4_counter *)if_stat_mailbox->buf;
+ if ((counter->control.cnt_mode & 0xf) == 1) {
+ vport_stats->rx_broadcast_packets += be64_to_cpu(counter->ext.counters[0].IfRxBroadcastFrames);
+ vport_stats->rx_unicast_packets += be64_to_cpu(counter->ext.counters[0].IfRxUnicastFrames);
+ vport_stats->rx_multicast_packets += be64_to_cpu(counter->ext.counters[0].IfRxMulticastFrames);
+ vport_stats->tx_broadcast_packets += be64_to_cpu(counter->ext.counters[0].IfTxBroadcastFrames);
+ vport_stats->tx_unicast_packets += be64_to_cpu(counter->ext.counters[0].IfTxUnicastFrames);
+ vport_stats->tx_multicast_packets += be64_to_cpu(counter->ext.counters[0].IfTxMulticastFrames);
+ vport_stats->rx_broadcast_bytes += be64_to_cpu(counter->ext.counters[0].IfRxBroadcastOctets);
+ vport_stats->rx_unicast_bytes += be64_to_cpu(counter->ext.counters[0].IfRxUnicastOctets);
+ vport_stats->rx_multicast_bytes += be64_to_cpu(counter->ext.counters[0].IfRxMulticastOctets);
+ vport_stats->tx_broadcast_bytes += be64_to_cpu(counter->ext.counters[0].IfTxBroadcastOctets);
+ vport_stats->tx_unicast_bytes += be64_to_cpu(counter->ext.counters[0].IfTxUnicastOctets);
+ vport_stats->tx_multicast_bytes += be64_to_cpu(counter->ext.counters[0].IfTxMulticastOctets);
+ vport_stats->rx_filtered += be64_to_cpu(counter->ext.counters[0].IfRxErrorFrames);
+ vport_stats->rx_dropped += be64_to_cpu(counter->ext.counters[0].IfRxNoBufferFrames);
+ vport_stats->tx_dropped += be64_to_cpu(counter->ext.counters[0].IfTxDroppedFrames);
+ if (read_counters)
+ (*read_counters)++;
+ }
+ }
+
+if_stat_out:
+ mutex_unlock(&priv->counters_table.mutex);
+ mlx4_free_cmd_mailbox(dev, if_stat_mailbox);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_get_vport_ethtool_stats);
+
+void mlx4_set_admin_guid(struct mlx4_dev *dev, __be64 guid, int entry, int port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ priv->mfunc.master.vf_admin[entry].vport[port].guid = guid;
+}
+EXPORT_SYMBOL_GPL(mlx4_set_admin_guid);
+
+__be64 mlx4_get_admin_guid(struct mlx4_dev *dev, int entry, int port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ return priv->mfunc.master.vf_admin[entry].vport[port].guid;
+}
+EXPORT_SYMBOL_GPL(mlx4_get_admin_guid);
+
+void mlx4_set_random_admin_guid(struct mlx4_dev *dev, int entry, int port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ __be64 guid;
+
+ /* hw GUID */
+ if (entry == 0)
+ return;
+
+ get_random_bytes((char *)&guid, sizeof(guid));
+ guid &= ~(cpu_to_be64(1ULL << 56));
+ guid |= cpu_to_be64(1ULL << 57);
+ priv->mfunc.master.vf_admin[entry].vport[port].guid = guid;
+}
+
+static int mlx4_setup_hca(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int err;
+ int port;
+ __be32 ib_port_default_caps;
+
+ err = mlx4_init_uar_table(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to initialize user access region table, aborting\n");
+ return err;
+ }
+
+ err = mlx4_uar_alloc(dev, &priv->driver_uar);
+ if (err) {
+ mlx4_err(dev, "Failed to allocate driver access region, aborting\n");
+ goto err_uar_table_free;
+ }
+
+ priv->kar = ioremap((phys_addr_t) priv->driver_uar.pfn << PAGE_SHIFT, PAGE_SIZE);
+ if (!priv->kar) {
+ mlx4_err(dev, "Couldn't map kernel access region, aborting\n");
+ err = -ENOMEM;
+ goto err_uar_free;
+ }
+
+ err = mlx4_init_pd_table(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to initialize protection domain table, aborting\n");
+ goto err_kar_unmap;
+ }
+
+ err = mlx4_init_xrcd_table(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to initialize extended reliable connection domain table, aborting\n");
+ goto err_pd_table_free;
+ }
+
+ err = mlx4_init_mr_table(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to initialize memory region table, aborting\n");
+ goto err_xrcd_table_free;
+ }
+
+ if (!mlx4_is_slave(dev)) {
+ err = mlx4_init_mcg_table(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to initialize multicast group table, aborting\n");
+ goto err_mr_table_free;
+ }
+ err = mlx4_config_mad_demux(dev);
+ if (err) {
+ mlx4_err(dev, "Failed in config_mad_demux, aborting\n");
+ goto err_mcg_table_free;
+ }
+ }
+
+ err = mlx4_init_eq_table(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to initialize event queue table, aborting\n");
+ goto err_mcg_table_free;
+ }
+
+ err = mlx4_cmd_use_events(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to switch to event-driven firmware commands, aborting\n");
+ goto err_eq_table_free;
+ }
+
+ err = mlx4_NOP(dev);
+ if (err) {
+ if (dev->flags & MLX4_FLAG_MSI_X) {
+ mlx4_warn(dev, "NOP command failed to generate MSI-X interrupt (IRQ %d)\n",
+ priv->eq_table.eq[MLX4_EQ_ASYNC].irq);
+ mlx4_warn(dev, "Trying again without MSI-X\n");
+ } else {
+ mlx4_err(dev, "NOP command failed to generate interrupt (IRQ %d), aborting\n",
+ priv->eq_table.eq[MLX4_EQ_ASYNC].irq);
+ mlx4_err(dev, "BIOS or ACPI interrupt routing problem?\n");
+ }
+
+ goto err_cmd_poll;
+ }
+
+ mlx4_dbg(dev, "NOP command IRQ test passed\n");
+
+ err = mlx4_init_cq_table(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to initialize completion queue table, aborting\n");
+ goto err_cmd_poll;
+ }
+
+ err = mlx4_init_srq_table(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to initialize shared receive queue table, aborting\n");
+ goto err_cq_table_free;
+ }
+
+ err = mlx4_init_qp_table(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to initialize queue pair table, aborting\n");
+ goto err_srq_table_free;
+ }
+
+ err = mlx4_init_counters_table(dev);
+ if (err && err != -ENOENT) {
+ mlx4_err(dev, "Failed to initialize counters table, aborting\n");
+ goto err_qp_table_free;
+ }
+
+ if (!mlx4_is_slave(dev)) {
+ for (port = 1; port <= dev->caps.num_ports; port++) {
+ ib_port_default_caps = 0;
+ err = mlx4_get_port_ib_caps(dev, port,
+ &ib_port_default_caps);
+ if (err)
+ mlx4_warn(dev, "failed to get port %d default ib capabilities (%d). Continuing with caps = 0\n",
+ port, err);
+ dev->caps.ib_port_def_cap[port] = ib_port_default_caps;
+
+ /* initialize per-slave default ib port capabilities */
+ if (mlx4_is_master(dev)) {
+ int i;
+ for (i = 0; i < dev->num_slaves; i++) {
+ if (i == mlx4_master_func_num(dev))
+ continue;
+ priv->mfunc.master.slave_state[i].ib_cap_mask[port] =
+ ib_port_default_caps;
+ }
+ }
+
+ dev->caps.port_ib_mtu[port] = IB_MTU_4096;
+
+ err = mlx4_SET_PORT(dev, port, mlx4_is_master(dev) ?
+ dev->caps.pkey_table_len[port] : -1);
+ if (err) {
+ mlx4_err(dev, "Failed to set port %d, aborting\n",
+ port);
+ goto err_counters_table_free;
+ }
+ }
+ }
+
+ return 0;
+
+err_counters_table_free:
+ mlx4_cleanup_counters_table(dev);
+
+err_qp_table_free:
+ mlx4_cleanup_qp_table(dev);
+
+err_srq_table_free:
+ mlx4_cleanup_srq_table(dev);
+
+err_cq_table_free:
+ mlx4_cleanup_cq_table(dev);
+
+err_cmd_poll:
+ mlx4_cmd_use_polling(dev);
+
+err_eq_table_free:
+ mlx4_cleanup_eq_table(dev);
+
+err_mcg_table_free:
+ if (!mlx4_is_slave(dev))
+ mlx4_cleanup_mcg_table(dev);
+
+err_mr_table_free:
+ mlx4_cleanup_mr_table(dev);
+
+err_xrcd_table_free:
+ mlx4_cleanup_xrcd_table(dev);
+
+err_pd_table_free:
+ mlx4_cleanup_pd_table(dev);
+
+err_kar_unmap:
+ iounmap(priv->kar);
+
+err_uar_free:
+ mlx4_uar_free(dev, &priv->driver_uar);
+
+err_uar_table_free:
+ mlx4_cleanup_uar_table(dev);
+ return err;
+}
+
+static int mlx4_init_affinity_hint(struct mlx4_dev *dev, int port, int eqn)
+{
+ int requested_cpu = 0;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_eq *eq;
+ int off = 0;
+ int i;
+
+ if (eqn > dev->caps.num_comp_vectors)
+ return -EINVAL;
+
+ for (i = 1; i < port; i++)
+ off += mlx4_get_eqs_per_port(dev, i);
+
+ requested_cpu = eqn - off - !!(eqn > MLX4_EQ_ASYNC);
+
+ /* Meaning EQs are shared, and this call comes from the second port */
+ if (requested_cpu < 0)
+ return 0;
+
+ eq = &priv->eq_table.eq[eqn];
+
+ if (!zalloc_cpumask_var(&eq->affinity_mask, GFP_KERNEL))
+ return -ENOMEM;
+
+ cpumask_set_cpu(requested_cpu, eq->affinity_mask);
+
+ return 0;
+}
+
+static void mlx4_enable_msi_x(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct msix_entry *entries;
+ int i;
+ int port = 0;
+#ifndef HAVE_PCI_ENABLE_MSIX_RANGE
+ int err;
+#endif
+
+ if (msi_x) {
+ int nreq = dev->caps.num_ports * num_online_cpus() + 1;
+
+ nreq = min_t(int, dev->caps.num_eqs - dev->caps.reserved_eqs,
+ nreq);
+#ifdef CONFIG_PPC
+ nreq = min_t(int, nreq, PPC_MAX_MSIX);
+#endif
+ entries = kcalloc(nreq, sizeof *entries, GFP_KERNEL);
+ if (!entries)
+ goto no_msi;
+
+ for (i = 0; i < nreq; ++i)
+ entries[i].entry = i;
+
+#ifdef HAVE_PCI_ENABLE_MSIX_RANGE
+ nreq = pci_enable_msix_range(dev->persist->pdev, entries, 2,
+ nreq);
+#else
+ retry:
+ err = pci_enable_msix(dev->persist->pdev, entries, nreq);
+ if (err) {
+ /* Try again if at least 2 vectors are available */
+ if (err > 1) {
+ mlx4_info(dev, "Requested %d vectors, "
+ "but only %d MSI-X vectors available, "
+ "trying again\n", nreq, err);
+ nreq = err;
+ goto retry;
+ }
+ nreq = -1;
+ }
+#endif
+
+ if (nreq < 2 || nreq < MLX4_EQ_ASYNC + 1) {
+ kfree(entries);
+ goto no_msi;
+ }
+ /* 1 is reserved for events (asynchronous EQ) */
+ dev->caps.num_comp_vectors = nreq - 1;
+
+ priv->eq_table.eq[MLX4_EQ_ASYNC].irq = entries[0].vector;
+ bitmap_zero(priv->eq_table.eq[MLX4_EQ_ASYNC].actv_ports.ports,
+ dev->caps.num_ports);
+
+ for (i = 0; i < dev->caps.num_comp_vectors + 1; i++) {
+ if (i == MLX4_EQ_ASYNC)
+ continue;
+
+ priv->eq_table.eq[i].irq =
+ entries[i + 1 - !!(i > MLX4_EQ_ASYNC)].vector;
+
+ if (MLX4_IS_LEGACY_EQ_MODE(dev->caps)) {
+ bitmap_fill(priv->eq_table.eq[i].actv_ports.ports,
+ dev->caps.num_ports);
+ /* We don't set affinity hint when there
+ * aren't enough EQs
+ */
+ } else {
+ set_bit(port,
+ priv->eq_table.eq[i].actv_ports.ports);
+ if (mlx4_init_affinity_hint(dev, port + 1, i))
+ mlx4_warn(dev, "Couldn't init hint cpumask for EQ %d\n",
+ i);
+ }
+ /* We divide the EQs evenly between the two ports.
+ * (dev->caps.num_comp_vectors / dev->caps.num_ports)
+ * refers to the number of EQs per port
+ * (i.e. eqs_per_port). Theoretically, we would like to
+ * write something like (i + 1) % eqs_per_port == 0.
+ * However, since there's an asynchronous EQ, we have
+ * to skip over it by comparing this condition to
+ * !!((i + 1) > MLX4_EQ_ASYNC).
+ */
+ if ((dev->caps.num_comp_vectors > dev->caps.num_ports) &&
+ ((i + 1) %
+ (dev->caps.num_comp_vectors / dev->caps.num_ports)) ==
+ !!((i + 1) > MLX4_EQ_ASYNC))
+ /* If dev->caps.num_comp_vectors < dev->caps.num_ports,
+ * everything is shared anyway.
+ */
+ port++;
+ }
+
+ dev->flags |= MLX4_FLAG_MSI_X;
+
+ kfree(entries);
+ return;
+ }
+
+no_msi:
+ dev->caps.num_comp_vectors = 1;
+
+ BUG_ON(MLX4_EQ_ASYNC >= 2);
+ for (i = 0; i < 2; ++i) {
+ priv->eq_table.eq[i].irq = dev->persist->pdev->irq;
+ if (i != MLX4_EQ_ASYNC) {
+ bitmap_fill(priv->eq_table.eq[i].actv_ports.ports,
+ dev->caps.num_ports);
+ }
+ }
+}
+
+static int mlx4_init_port_info(struct mlx4_dev *dev, int port)
+{
+ struct mlx4_port_info *info = &mlx4_priv(dev)->port[port];
+ int err = 0;
+
+ info->dev = dev;
+ info->port = port;
+ if (!mlx4_is_slave(dev)) {
+ mlx4_init_mac_table(dev, &info->mac_table);
+ mlx4_init_vlan_table(dev, &info->vlan_table);
+ mlx4_init_roce_gid_table(dev, &info->roce);
+ info->base_qpn = mlx4_get_base_qpn(dev, port);
+ }
+
+ sprintf(info->dev_name, "mlx4_port%d", port);
+ info->port_attr.attr.name = info->dev_name;
+ if (mlx4_is_mfunc(dev))
+ info->port_attr.attr.mode = S_IRUGO;
+ else {
+ info->port_attr.attr.mode = S_IRUGO | S_IWUSR;
+ info->port_attr.store = set_port_type;
+ }
+ info->port_attr.show = show_port_type;
+ sysfs_attr_init(&info->port_attr.attr);
+
+ err = device_create_file(&dev->persist->pdev->dev, &info->port_attr);
+ if (err) {
+ mlx4_err(dev, "Failed to create file for port %d\n", port);
+ info->port = -1;
+ }
+
+ sprintf(info->dev_mtu_name, "mlx4_port%d_mtu", port);
+ info->port_mtu_attr.attr.name = info->dev_mtu_name;
+ if (mlx4_is_mfunc(dev))
+ info->port_mtu_attr.attr.mode = S_IRUGO;
+ else {
+ info->port_mtu_attr.attr.mode = S_IRUGO | S_IWUSR;
+ info->port_mtu_attr.store = set_port_ib_mtu;
+ }
+ info->port_mtu_attr.show = show_port_ib_mtu;
+ sysfs_attr_init(&info->port_mtu_attr.attr);
+
+ err = device_create_file(&dev->persist->pdev->dev,
+ &info->port_mtu_attr);
+ if (err) {
+ mlx4_err(dev, "Failed to create mtu file for port %d\n", port);
+ device_remove_file(&info->dev->persist->pdev->dev,
+ &info->port_attr);
+ info->port = -1;
+ }
+
+ return err;
+}
+
+static void mlx4_cleanup_port_info(struct mlx4_port_info *info)
+{
+ if (info->port < 0)
+ return;
+
+ device_remove_file(&info->dev->persist->pdev->dev, &info->port_attr);
+ device_remove_file(&info->dev->persist->pdev->dev,
+ &info->port_mtu_attr);
+#ifdef CONFIG_RFS_ACCEL
+ free_irq_cpu_rmap(info->rmap);
+ info->rmap = NULL;
+#endif
+}
+
+static int mlx4_init_steering(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int num_entries = dev->caps.num_ports;
+ int i, j;
+
+ priv->steer = kzalloc(sizeof(struct mlx4_steer) * num_entries, GFP_KERNEL);
+ if (!priv->steer)
+ return -ENOMEM;
+
+ for (i = 0; i < num_entries; i++)
+ for (j = 0; j < MLX4_NUM_STEERS; j++) {
+ INIT_LIST_HEAD(&priv->steer[i].promisc_qps[j]);
+ INIT_LIST_HEAD(&priv->steer[i].steer_entries[j]);
+ }
+ return 0;
+}
+
+static void mlx4_clear_steering(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_steer_index *entry, *tmp_entry;
+ struct mlx4_promisc_qp *pqp, *tmp_pqp;
+ int num_entries = dev->caps.num_ports;
+ int i, j;
+
+ for (i = 0; i < num_entries; i++) {
+ for (j = 0; j < MLX4_NUM_STEERS; j++) {
+ list_for_each_entry_safe(pqp, tmp_pqp,
+ &priv->steer[i].promisc_qps[j],
+ list) {
+ list_del(&pqp->list);
+ kfree(pqp);
+ }
+ list_for_each_entry_safe(entry, tmp_entry,
+ &priv->steer[i].steer_entries[j],
+ list) {
+ list_del(&entry->list);
+ list_for_each_entry_safe(pqp, tmp_pqp,
+ &entry->duplicates,
+ list) {
+ list_del(&pqp->list);
+ kfree(pqp);
+ }
+ kfree(entry);
+ }
+ }
+ }
+ kfree(priv->steer);
+}
+
+static int extended_func_num(struct pci_dev *pdev)
+{
+ return PCI_SLOT(pdev->devfn) * 8 + PCI_FUNC(pdev->devfn);
+}
+
+#define MLX4_OWNER_BASE 0x8069c
+#define MLX4_OWNER_SIZE 4
+
+static int mlx4_get_ownership(struct mlx4_dev *dev)
+{
+ void __iomem *owner;
+ u32 ret;
+
+ if (pci_channel_offline(dev->persist->pdev))
+ return -EIO;
+
+ owner = ioremap(pci_resource_start(dev->persist->pdev, 0) +
+ MLX4_OWNER_BASE,
+ MLX4_OWNER_SIZE);
+ if (!owner) {
+ mlx4_err(dev, "Failed to obtain ownership bit\n");
+ return -ENOMEM;
+ }
+
+ ret = readl(owner);
+ iounmap(owner);
+ return (int) !!ret;
+}
+
+static void mlx4_free_ownership(struct mlx4_dev *dev)
+{
+ void __iomem *owner;
+
+ if (pci_channel_offline(dev->persist->pdev))
+ return;
+
+ owner = ioremap(pci_resource_start(dev->persist->pdev, 0) +
+ MLX4_OWNER_BASE,
+ MLX4_OWNER_SIZE);
+ if (!owner) {
+ mlx4_err(dev, "Failed to obtain ownership bit\n");
+ return;
+ }
+ writel(0, owner);
+ msleep(1000);
+ iounmap(owner);
+}
+
+#define SRIOV_VALID_STATE(flags) (!!((flags) & MLX4_FLAG_SRIOV) ==\
+ !!((flags) & MLX4_FLAG_MASTER))
+
+#ifndef HAVE_PCI_NUM_VF
+static int mlx4_find_vfs(struct pci_dev *pdev)
+{
+ struct pci_dev *dev;
+ int vfs = 0, pos;
+ u16 offset, stride;
+
+ pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_SRIOV);
+ if (!pos)
+ return 0;
+ pci_read_config_word(pdev, pos + PCI_SRIOV_VF_OFFSET, &offset);
+ pci_read_config_word(pdev, pos + PCI_SRIOV_VF_STRIDE, &stride);
+
+ dev = pci_get_device(pdev->vendor, PCI_ANY_ID, NULL);
+ while (dev) {
+ if (dev->is_virtfn && pci_physfn(dev) == pdev) {
+ vfs++;
+ }
+ dev = pci_get_device(pdev->vendor, PCI_ANY_ID, dev);
+ }
+ return vfs;
+}
+#endif
+
+static u64 mlx4_enable_sriov(struct mlx4_dev *dev, struct pci_dev *pdev,
+ u8 total_vfs, int existing_vfs, int reset_flow)
+{
+ u64 dev_flags = dev->flags;
+ int err = 0;
+
+ if (reset_flow) {
+ dev->dev_vfs = kcalloc(total_vfs, sizeof(*dev->dev_vfs),
+ GFP_KERNEL);
+ if (!dev->dev_vfs)
+ goto free_mem;
+ return dev_flags;
+ }
+
+ atomic_inc(&pf_loading);
+ if (dev->flags & MLX4_FLAG_SRIOV) {
+ if (existing_vfs != total_vfs) {
+ mlx4_err(dev, "SR-IOV was already enabled, but with num_vfs (%d) different than requested (%d)\n",
+ existing_vfs, total_vfs);
+ total_vfs = existing_vfs;
+ }
+ }
+
+ dev->dev_vfs = kzalloc(total_vfs * sizeof(*dev->dev_vfs), GFP_KERNEL);
+ if (NULL == dev->dev_vfs) {
+ mlx4_err(dev, "Failed to allocate memory for VFs\n");
+ goto disable_sriov;
+ }
+
+ if (!(dev->flags & MLX4_FLAG_SRIOV)) {
+ mlx4_warn(dev, "Enabling SR-IOV with %d VFs\n", total_vfs);
+ err = pci_enable_sriov(pdev, total_vfs);
+ }
+ if (err) {
+ mlx4_err(dev, "Failed to enable SR-IOV, continuing without SR-IOV (err = %d)\n",
+ err);
+ goto disable_sriov;
+ } else {
+ mlx4_warn(dev, "Running in master mode\n");
+ dev_flags |= MLX4_FLAG_SRIOV |
+ MLX4_FLAG_MASTER;
+ dev_flags &= ~MLX4_FLAG_SLAVE;
+ dev->persist->num_vfs = total_vfs;
+ }
+ return dev_flags;
+
+disable_sriov:
+ atomic_dec(&pf_loading);
+free_mem:
+ dev->persist->num_vfs = 0;
+ kfree(dev->dev_vfs);
+ dev->dev_vfs = NULL;
+ return dev_flags & ~MLX4_FLAG_MASTER;
+}
+
+enum {
+ MLX4_DEV_CAP_CHECK_NUM_VFS_ABOVE_64 = -1,
+};
+
+static int mlx4_check_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap,
+ int *nvfs)
+{
+ int requested_vfs = nvfs[0] + nvfs[1] + nvfs[2];
+ /* Checking for 64 VFs as a limitation of CX2 */
+ if (!(dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_80_VFS) &&
+ requested_vfs >= 64) {
+ mlx4_err(dev, "Requested %d VFs, but FW does not support more than 64\n",
+ requested_vfs);
+ return MLX4_DEV_CAP_CHECK_NUM_VFS_ABOVE_64;
+ }
+ return 0;
+}
+
+static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
+ int total_vfs, int *nvfs, struct mlx4_priv *priv,
+ int reset_flow)
+{
+ struct mlx4_dev *dev;
+ unsigned sum = 0;
+ int err;
+ int port;
+ int i;
+ struct mlx4_dev_cap *dev_cap = NULL;
+ int num_vfs_argc =
+ mlx4_get_argc(num_vfs.dbdf2val.tbl, pci_physfn(pdev));
+ int probe_vfs_argc =
+ mlx4_get_argc(probe_vf.dbdf2val.tbl, pci_physfn(pdev));
+ /* existing_vfs will contain the number of VFs which were active when
+ remove_one was invoked on the PF driver. In this case,
+ the PF driver did not disable SRIOV during remove_one.
+ When the PF is reloaded (mlx4_load_one), SRIOV is therefore
+ still enabled, and pci_enable_sriov should not be called. */
+ int existing_vfs = 0;
+
+ dev = &priv->dev;
+
+ INIT_LIST_HEAD(&priv->dev_list);
+ INIT_LIST_HEAD(&priv->ctx_list);
+ spin_lock_init(&priv->ctx_lock);
+
+ mutex_init(&priv->port_mutex);
+ mutex_init(&priv->bond_mutex);
+
+ INIT_LIST_HEAD(&priv->pgdir_list);
+ mutex_init(&priv->pgdir_mutex);
+
+ INIT_LIST_HEAD(&priv->bf_list);
+ mutex_init(&priv->bf_mutex);
+
+ dev->rev_id = pdev->revision;
+ dev->numa_node = dev_to_node(&pdev->dev);
+ if (dev->numa_node == -1)
+ dev->numa_node = first_online_node;
+ memcpy(dev->persist->nvfs, nvfs, sizeof(dev->persist->nvfs));
+
+ /* Detect if this device is a virtual function */
+ if (pci_dev_data & MLX4_PCI_DEV_IS_VF) {
+ mlx4_warn(dev, "Detected virtual function - running in slave mode\n");
+ dev->flags |= MLX4_FLAG_SLAVE;
+ } else {
+ /* We reset the device and enable SRIOV only for physical
+ * devices. Try to claim ownership on the device;
+ * if already taken, skip -- do not allow multiple PFs */
+ err = mlx4_get_ownership(dev);
+ if (err) {
+ if (err < 0)
+ return err;
+ else {
+ mlx4_warn(dev, "Multiple PFs not yet supported - Skipping PF\n");
+ return -EINVAL;
+ }
+ }
+
+ atomic_set(&priv->opreq_count, 0);
+ INIT_WORK(&priv->opreq_task, mlx4_opreq_action);
+
+ /*
+ * Now reset the HCA before we touch the PCI capabilities or
+ * attempt a firmware command, since a boot ROM may have left
+ * the HCA in an undefined state.
+ */
+ err = mlx4_reset(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to reset HCA, aborting\n");
+ goto err_sriov;
+ }
+
+ if (total_vfs) {
+ dev->flags = MLX4_FLAG_MASTER;
+#ifdef HAVE_PCI_NUM_VF
+ existing_vfs = pci_num_vf(pdev);
+#else
+ existing_vfs = mlx4_find_vfs(pdev);
+#endif
+ if (existing_vfs)
+ dev->flags |= MLX4_FLAG_SRIOV;
+ dev->persist->num_vfs = total_vfs;
+ }
+ }
+
+ /* on load remove any previous indication of internal error,
+ * device is up.
+ */
+ dev->persist->state = MLX4_DEVICE_STATE_UP;
+
+slave_start:
+ err = mlx4_cmd_init(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to init command interface, aborting\n");
+ goto err_sriov;
+ }
+
+ /* In slave functions, the communication channel must be initialized
+ * before posting commands. Also, init num_slaves before calling
+ * mlx4_init_hca */
+ if (mlx4_is_mfunc(dev)) {
+ if (mlx4_is_master(dev)) {
+ dev->num_slaves = MLX4_MAX_NUM_SLAVES;
+
+ } else {
+ dev->num_slaves = 0;
+ err = mlx4_multi_func_init(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to init slave mfunc interface, aborting\n");
+ goto err_cmd;
+ }
+ }
+ }
+
+ err = mlx4_init_fw(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to init fw, aborting.\n");
+ goto err_mfunc;
+ }
+
+ if (mlx4_is_master(dev)) {
+ /* when we hit the goto slave_start below, dev_cap is already initialized */
+ if (!dev_cap) {
+ dev_cap = kzalloc(sizeof(*dev_cap), GFP_KERNEL);
+
+ if (!dev_cap) {
+ err = -ENOMEM;
+ goto err_fw;
+ }
+
+ err = mlx4_QUERY_DEV_CAP(dev, dev_cap);
+ if (err) {
+ mlx4_err(dev, "QUERY_DEV_CAP command failed, aborting.\n");
+ goto err_fw;
+ }
+
+ if (mlx4_check_dev_cap(dev, dev_cap, nvfs))
+ goto err_fw;
+
+ if (!(dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_SYS_EQS)) {
+ u64 dev_flags = mlx4_enable_sriov(dev, pdev,
+ total_vfs,
+ existing_vfs,
+ reset_flow);
+
+ mlx4_cmd_cleanup(dev, MLX4_CMD_CLEANUP_ALL);
+ dev->flags = dev_flags;
+ if (!SRIOV_VALID_STATE(dev->flags)) {
+ mlx4_err(dev, "Invalid SRIOV state\n");
+ goto err_sriov;
+ }
+ err = mlx4_reset(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to reset HCA, aborting.\n");
+ goto err_sriov;
+ }
+ goto slave_start;
+ }
+ } else {
+ /* Legacy mode FW requires SRIOV to be enabled before
+ * doing QUERY_DEV_CAP, since max_eq's value is different if
+ * SRIOV is enabled.
+ */
+ memset(dev_cap, 0, sizeof(*dev_cap));
+ err = mlx4_QUERY_DEV_CAP(dev, dev_cap);
+ if (err) {
+ mlx4_err(dev, "QUERY_DEV_CAP command failed, aborting.\n");
+ goto err_fw;
+ }
+
+ if (mlx4_check_dev_cap(dev, dev_cap, nvfs))
+ goto err_fw;
+ }
+ }
+
+ err = mlx4_init_hca(dev);
+ if (err) {
+ if (err == -EACCES) {
+ /* Not primary Physical function
+ * Running in slave mode */
+ mlx4_cmd_cleanup(dev, MLX4_CMD_CLEANUP_ALL);
+ /* We're not a PF */
+ if (dev->flags & MLX4_FLAG_SRIOV) {
+ if (!existing_vfs)
+ pci_disable_sriov(pdev);
+ if (mlx4_is_master(dev) && !reset_flow)
+ atomic_dec(&pf_loading);
+ dev->flags &= ~MLX4_FLAG_SRIOV;
+ }
+ if (!mlx4_is_slave(dev))
+ mlx4_free_ownership(dev);
+ dev->flags |= MLX4_FLAG_SLAVE;
+ dev->flags &= ~MLX4_FLAG_MASTER;
+ goto slave_start;
+ } else
+ goto err_fw;
+ }
+
+ if (mlx4_is_master(dev) && (dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_SYS_EQS)) {
+ u64 dev_flags = mlx4_enable_sriov(dev, pdev, total_vfs,
+ existing_vfs, reset_flow);
+
+ if ((dev->flags ^ dev_flags) & (MLX4_FLAG_MASTER | MLX4_FLAG_SLAVE)) {
+ mlx4_cmd_cleanup(dev, MLX4_CMD_CLEANUP_VHCR);
+ dev->flags = dev_flags;
+ err = mlx4_cmd_init(dev);
+ if (err) {
+ /* Only VHCR is cleaned up, so could still
+ * send FW commands
+ */
+ mlx4_err(dev, "Failed to init VHCR command interface, aborting\n");
+ goto err_close;
+ }
+ } else {
+ dev->flags = dev_flags;
+ }
+
+ if (!SRIOV_VALID_STATE(dev->flags)) {
+ mlx4_err(dev, "Invalid SRIOV state\n");
+ goto err_close;
+ }
+ }
+
+ /* Check whether the device is functioning at its maximum possible speed.
+ * This call has no return code; it only warns the user when the PCI
+ * Express device capabilities are under-satisfied by the bus.
+ */
+ if (!mlx4_is_slave(dev))
+ mlx4_check_pcie_caps(dev);
+
+ /* In master functions, the communication channel must be initialized
+ * after obtaining its address from fw */
+ if (mlx4_is_master(dev)) {
+ int ib_ports = 0;
+
+ mlx4_foreach_port(i, dev, MLX4_PORT_TYPE_IB)
+ ib_ports++;
+
+ if (ib_ports &&
+ (num_vfs_argc > 1 || probe_vfs_argc > 1)) {
+ mlx4_err(dev,
+ "Invalid syntax of num_vfs/probe_vfs with IB port - single port VFs syntax is only supported when all ports are configured as ethernet\n");
+ err = -EINVAL;
+ goto err_close;
+ }
+ if (dev->caps.num_ports < 2 &&
+ num_vfs_argc > 1) {
+ err = -EINVAL;
+ mlx4_err(dev,
+ "Error: Trying to configure VFs on port 2, but HCA has only %d physical ports\n",
+ dev->caps.num_ports);
+ goto err_close;
+ }
+ memcpy(dev->persist->nvfs, nvfs, sizeof(dev->persist->nvfs));
+
+ for (i = 0;
+ i < sizeof(dev->persist->nvfs)/
+ sizeof(dev->persist->nvfs[0]); i++) {
+ unsigned j;
+
+ for (j = 0; j < dev->persist->nvfs[i]; ++sum, ++j) {
+ dev->dev_vfs[sum].min_port = i < 2 ? i + 1 : 1;
+ dev->dev_vfs[sum].n_ports = i < 2 ? 1 :
+ dev->caps.num_ports;
+ }
+ }
+
+ /* In master functions, the communication channel
+ * must be initialized after obtaining its address from fw
+ */
+ err = mlx4_multi_func_init(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to init master mfunc interface, aborting.\n");
+ goto err_close;
+ }
+ }
+
+ err = mlx4_alloc_eq_table(dev);
+ if (err)
+ goto err_master_mfunc;
+
+ bitmap_zero(priv->msix_ctl.pool_bm, MAX_MSIX);
+ mutex_init(&priv->msix_ctl.pool_lock);
+
+ mlx4_enable_msi_x(dev);
+ if ((mlx4_is_mfunc(dev)) &&
+ !(dev->flags & MLX4_FLAG_MSI_X)) {
+ err = -ENOSYS;
+ mlx4_err(dev, "INTx is not supported in multi-function mode, aborting\n");
+ goto err_free_eq;
+ }
+
+ if (!mlx4_is_slave(dev)) {
+ err = mlx4_init_steering(dev);
+ if (err)
+ goto err_disable_msix;
+ }
+
+ err = mlx4_setup_hca(dev);
+ if (err == -EBUSY && (dev->flags & MLX4_FLAG_MSI_X) &&
+ !mlx4_is_mfunc(dev)) {
+ dev->flags &= ~MLX4_FLAG_MSI_X;
+ dev->caps.num_comp_vectors = 1;
+ pci_disable_msix(pdev);
+ err = mlx4_setup_hca(dev);
+ }
+
+ if (err)
+ goto err_steer;
+
+ mlx4_init_quotas(dev);
+ /* When PF resources are ready arm its comm channel to enable
+ * getting commands
+ */
+ if (mlx4_is_master(dev)) {
+ err = mlx4_ARM_COMM_CHANNEL(dev);
+ if (err) {
+ mlx4_err(dev, "Failed to arm comm channel eq: %x\n",
+ err);
+ goto err_steer;
+ }
+ }
+
+ for (port = 1; port <= dev->caps.num_ports; port++) {
+ err = mlx4_init_port_info(dev, port);
+ if (err)
+ goto err_port;
+ }
+
+ priv->v2p.port1 = 1;
+ priv->v2p.port2 = 2;
+
+ err = mlx4_register_device(dev);
+ if (err)
+ goto err_port;
+
+ mlx4_request_modules(dev);
+
+ mlx4_sense_init(dev);
+ mlx4_start_sense(dev);
+
+ priv->removed = 0;
+
+ if (mlx4_is_master(dev) && dev->persist->num_vfs && !reset_flow)
+ atomic_dec(&pf_loading);
+
+ kfree(dev_cap);
+ return 0;
+
+err_port:
+ for (--port; port >= 1; --port)
+ mlx4_cleanup_port_info(&priv->port[port]);
+
+ mlx4_cleanup_counters_table(dev);
+ mlx4_cleanup_qp_table(dev);
+ mlx4_cleanup_srq_table(dev);
+ mlx4_cleanup_cq_table(dev);
+ mlx4_cmd_use_polling(dev);
+ mlx4_cleanup_eq_table(dev);
+ mlx4_cleanup_mcg_table(dev);
+ mlx4_cleanup_mr_table(dev);
+ mlx4_cleanup_xrcd_table(dev);
+ mlx4_cleanup_pd_table(dev);
+ mlx4_cleanup_uar_table(dev);
+
+err_steer:
+ if (!mlx4_is_slave(dev))
+ mlx4_clear_steering(dev);
+
+err_disable_msix:
+ if (dev->flags & MLX4_FLAG_MSI_X)
+ pci_disable_msix(pdev);
+
+err_free_eq:
+ mlx4_free_eq_table(dev);
+
+err_master_mfunc:
+ if (mlx4_is_master(dev)) {
+ mlx4_free_resource_tracker(dev, RES_TR_FREE_STRUCTS_ONLY);
+ mlx4_multi_func_cleanup(dev);
+ }
+
+ if (mlx4_is_slave(dev)) {
+ kfree(dev->caps.qp0_qkey);
+ kfree(dev->caps.qp0_tunnel);
+ kfree(dev->caps.qp0_proxy);
+ kfree(dev->caps.qp1_tunnel);
+ kfree(dev->caps.qp1_proxy);
+ }
+
+err_close:
+ mlx4_close_hca(dev);
+
+err_fw:
+ mlx4_close_fw(dev);
+
+err_mfunc:
+ if (mlx4_is_slave(dev))
+ mlx4_multi_func_cleanup(dev);
+
+err_cmd:
+ mlx4_cmd_cleanup(dev, MLX4_CMD_CLEANUP_ALL);
+
+err_sriov:
+ if (dev->flags & MLX4_FLAG_SRIOV && !existing_vfs) {
+ pci_disable_sriov(pdev);
+ dev->flags &= ~MLX4_FLAG_SRIOV;
+ }
+
+ if (mlx4_is_master(dev) && dev->persist->num_vfs && !reset_flow)
+ atomic_dec(&pf_loading);
+
+ kfree(priv->dev.dev_vfs);
+
+ if (!mlx4_is_slave(dev))
+ mlx4_free_ownership(dev);
+
+ kfree(dev_cap);
+ return err;
+}
+
+static int __mlx4_init_one(struct pci_dev *pdev, int pci_dev_data,
+ struct mlx4_priv *priv)
+{
+ int err;
+ unsigned int i;
+ unsigned total_vfs = 0;
+ int nvfs[MLX4_MAX_PORTS + 1] = {0, 0, 0};
+ int prb_vf[MLX4_MAX_PORTS + 1] = {0, 0, 0};
+ const int param_map[MLX4_MAX_PORTS + 1][MLX4_MAX_PORTS + 1] = {
+ {2, 0, 0}, {0, 1, 2}, {0, 1, 2} };
+ int num_vfs_argc =
+ mlx4_get_argc(num_vfs.dbdf2val.tbl, pci_physfn(pdev));
+ int probe_vfs_argc =
+ mlx4_get_argc(probe_vf.dbdf2val.tbl, pci_physfn(pdev));
+
+ pr_info(DRV_NAME ": Initializing %s\n", pci_name(pdev));
+
+ err = pci_enable_device(pdev);
+ if (err) {
+ dev_err(&pdev->dev, "Cannot enable PCI device, aborting.\n");
+ return err;
+ }
+
+ for (i = 0; i < num_vfs_argc;
+ total_vfs += nvfs[param_map[num_vfs_argc - 1][i]], i++) {
+ int *cur_nvfs = &nvfs[param_map[num_vfs_argc - 1][i]];
+ mlx4_get_val(num_vfs.dbdf2val.tbl, pci_physfn(pdev), i,
+ cur_nvfs);
+ if (*cur_nvfs < 0) {
+ dev_err(&pdev->dev, "num_vfs module parameter cannot be negative\n");
+ err = -EINVAL;
+ goto err_disable_pdev;
+ }
+ }
+ for (i = 0; i < probe_vfs_argc; i++) {
+ int *cur_prbvf = &prb_vf[param_map[probe_vfs_argc - 1][i]];
+ mlx4_get_val(probe_vf.dbdf2val.tbl, pci_physfn(pdev), i,
+ cur_prbvf);
+ if (*cur_prbvf < 0) {
+ dev_err(&pdev->dev, "probe_vf module parameter cannot be negative\n");
+ err = -EINVAL;
+ goto err_disable_pdev;
+ }
+ }
+ for (i = 0; i < sizeof(nvfs)/sizeof(nvfs[0]); i++) {
+ if (prb_vf[i] > nvfs[i]) {
+ dev_err(&pdev->dev, "probe_vf module parameter cannot be greater than num_vfs\n");
+ err = -EINVAL;
+ goto err_disable_pdev;
+ }
+ }
+ if (total_vfs > MLX4_MAX_NUM_VF) {
+ dev_err(&pdev->dev, "total vfs (%d) can't be more than %d\n",
+ total_vfs, MLX4_MAX_NUM_VF);
+ err = -EINVAL;
+ goto err_disable_pdev;
+ }
+
+ for (i = 0; i < MLX4_MAX_PORTS; i++) {
+ if (nvfs[i] + nvfs[2] >= MLX4_MAX_NUM_VF_P_PORT) {
+ dev_err(&pdev->dev,
+ "Requested more VF's (%d) for port (%d) than allowed (%d)\n",
+ nvfs[i] + nvfs[2], i + 1,
+ MLX4_MAX_NUM_VF_P_PORT - 1);
+ err = -EINVAL;
+ goto err_disable_pdev;
+ }
+ }
+
+ /* Check for BARs. */
+ if (!(pci_dev_data & MLX4_PCI_DEV_IS_VF) &&
+ !(pci_resource_flags(pdev, 0) & IORESOURCE_MEM)) {
+ dev_err(&pdev->dev, "Missing DCS, aborting (driver_data: 0x%x, pci_resource_flags(pdev, 0):0x%lx)\n",
+ pci_dev_data, pci_resource_flags(pdev, 0));
+ err = -ENODEV;
+ goto err_disable_pdev;
+ }
+ if (!(pci_resource_flags(pdev, 2) & IORESOURCE_MEM)) {
+ dev_err(&pdev->dev, "Missing UAR, aborting\n");
+ err = -ENODEV;
+ goto err_disable_pdev;
+ }
+
+ err = pci_request_regions(pdev, DRV_NAME);
+ if (err) {
+ dev_err(&pdev->dev, "Couldn't get PCI resources, aborting\n");
+ goto err_disable_pdev;
+ }
+
+ pci_set_master(pdev);
+
+ err = pci_set_dma_mask(pdev, DMA_BIT_MASK(64));
+ if (err) {
+ dev_warn(&pdev->dev, "Warning: couldn't set 64-bit PCI DMA mask\n");
+ err = pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
+ if (err) {
+ dev_err(&pdev->dev, "Can't set PCI DMA mask, aborting\n");
+ goto err_release_regions;
+ }
+ }
+ err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
+ if (err) {
+ dev_warn(&pdev->dev, "Warning: couldn't set 64-bit consistent PCI DMA mask\n");
+ err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
+ if (err) {
+ dev_err(&pdev->dev, "Can't set consistent PCI DMA mask, aborting\n");
+ goto err_release_regions;
+ }
+ }
+
+ /* Allow large DMA segments, up to the firmware limit of 1 GB */
+ dma_set_max_seg_size(&pdev->dev, 1024 * 1024 * 1024);
+ /* Detect if this device is a virtual function */
+ if (pci_dev_data & MLX4_PCI_DEV_IS_VF) {
+ /* When acting as pf, we normally skip vfs unless explicitly
+ * requested to probe them.
+ */
+ if (total_vfs) {
+ unsigned vfs_offset = 0;
+
+ for (i = 0; i < sizeof(nvfs)/sizeof(nvfs[0]) &&
+ vfs_offset + nvfs[i] < extended_func_num(pdev);
+ vfs_offset += nvfs[i], i++)
+ ;
+ if (i == sizeof(nvfs)/sizeof(nvfs[0])) {
+ err = -ENODEV;
+ goto err_release_regions;
+ }
+ if ((extended_func_num(pdev) - vfs_offset)
+ > prb_vf[i]) {
+ dev_warn(&pdev->dev, "Skipping virtual function:%d\n",
+ extended_func_num(pdev));
+ err = -ENODEV;
+ goto err_release_regions;
+ }
+ }
+ }
+
+ err = mlx4_catas_init(&priv->dev);
+ if (err)
+ goto err_release_regions;
+
+ err = mlx4_load_one(pdev, pci_dev_data, total_vfs, nvfs, priv, 0);
+ if (err)
+ goto err_catas;
+
+ return 0;
+
+err_catas:
+ mlx4_catas_end(&priv->dev);
+
+err_release_regions:
+ pci_release_regions(pdev);
+
+err_disable_pdev:
+ pci_disable_device(pdev);
+ pci_set_drvdata(pdev, NULL);
+ return err;
+}
+
+static int mlx4_init_one(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct mlx4_priv *priv;
+ struct mlx4_dev *dev;
+ int ret;
+
+ printk_once(KERN_INFO "%s", mlx4_version);
+
+ priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
+
+ dev = &priv->dev;
+ dev->persist = kzalloc(sizeof(*dev->persist), GFP_KERNEL);
+ if (!dev->persist) {
+ kfree(priv);
+ return -ENOMEM;
+ }
+ dev->persist->pdev = pdev;
+ dev->persist->dev = dev;
+ pci_set_drvdata(pdev, dev->persist);
+ priv->pci_dev_data = id->driver_data;
+ mutex_init(&dev->persist->device_state_mutex);
+ mutex_init(&dev->persist->interface_state_mutex);
+
+ ret = __mlx4_init_one(pdev, id->driver_data, priv);
+ if (ret) {
+ kfree(dev->persist);
+ kfree(priv);
+ } else {
+ pci_save_state(pdev);
+ }
+
+ return ret;
+}
+
+static void mlx4_clean_dev(struct mlx4_dev *dev)
+{
+ struct mlx4_dev_persistent *persist = dev->persist;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ unsigned long flags = (dev->flags & RESET_PERSIST_MASK_FLAGS);
+
+ memset(priv, 0, sizeof(*priv));
+ priv->dev.persist = persist;
+ priv->dev.flags = flags;
+}
+
+static void mlx4_unload_one(struct pci_dev *pdev)
+{
+ struct mlx4_dev_persistent *persist = pci_get_drvdata(pdev);
+ struct mlx4_dev *dev = persist->dev;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int pci_dev_data;
+ int p, i;
+
+ if (priv->removed)
+ return;
+
+ /* saving current ports type for further use */
+ for (i = 0; i < dev->caps.num_ports; i++) {
+ dev->persist->curr_port_type[i] = dev->caps.port_type[i + 1];
+ dev->persist->curr_port_poss_type[i] = dev->caps.
+ possible_type[i + 1];
+ }
+
+ pci_dev_data = priv->pci_dev_data;
+
+ mlx4_stop_sense(dev);
+ mlx4_unregister_device(dev);
+
+ for (p = 1; p <= dev->caps.num_ports; p++) {
+ mlx4_cleanup_port_info(&priv->port[p]);
+ mlx4_CLOSE_PORT(dev, p);
+ }
+
+ if (mlx4_is_master(dev))
+ mlx4_free_resource_tracker(dev,
+ RES_TR_FREE_SLAVES_ONLY);
+
+ mlx4_cleanup_counters_table(dev);
+ mlx4_cleanup_qp_table(dev);
+ mlx4_cleanup_srq_table(dev);
+ mlx4_cleanup_cq_table(dev);
+ mlx4_cmd_use_polling(dev);
+ mlx4_cleanup_eq_table(dev);
+ mlx4_cleanup_mcg_table(dev);
+ mlx4_cleanup_mr_table(dev);
+ mlx4_cleanup_xrcd_table(dev);
+ mlx4_cleanup_pd_table(dev);
+
+ if (mlx4_is_master(dev))
+ mlx4_free_resource_tracker(dev,
+ RES_TR_FREE_STRUCTS_ONLY);
+
+ iounmap(priv->kar);
+ mlx4_uar_free(dev, &priv->driver_uar);
+ mlx4_cleanup_uar_table(dev);
+ if (!mlx4_is_slave(dev))
+ mlx4_clear_steering(dev);
+ mlx4_free_eq_table(dev);
+ if (mlx4_is_master(dev))
+ mlx4_multi_func_cleanup(dev);
+ mlx4_close_hca(dev);
+ mlx4_close_fw(dev);
+ if (mlx4_is_slave(dev))
+ mlx4_multi_func_cleanup(dev);
+ mlx4_cmd_cleanup(dev, MLX4_CMD_CLEANUP_ALL);
+
+ if (dev->flags & MLX4_FLAG_MSI_X)
+ pci_disable_msix(pdev);
+
+ if (!mlx4_is_slave(dev))
+ mlx4_free_ownership(dev);
+
+ kfree(dev->caps.qp0_qkey);
+ kfree(dev->caps.qp0_tunnel);
+ kfree(dev->caps.qp0_proxy);
+ kfree(dev->caps.qp1_tunnel);
+ kfree(dev->caps.qp1_proxy);
+ kfree(dev->dev_vfs);
+
+ mlx4_clean_dev(dev);
+ priv->pci_dev_data = pci_dev_data;
+ priv->removed = 1;
+}
+
+static void mlx4_remove_one(struct pci_dev *pdev)
+{
+ struct mlx4_dev_persistent *persist = pci_get_drvdata(pdev);
+ struct mlx4_dev *dev = persist->dev;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int active_vfs = 0;
+
+ mutex_lock(&persist->interface_state_mutex);
+ persist->interface_state |= MLX4_INTERFACE_STATE_DELETION;
+ mutex_unlock(&persist->interface_state_mutex);
+
+ /* Disabling SR-IOV is not allowed while there are active vf's */
+ if (mlx4_is_master(dev) && dev->flags & MLX4_FLAG_SRIOV) {
+ active_vfs = mlx4_how_many_lives_vf(dev);
+ if (active_vfs) {
+ pr_warn("Removing PF when there are active VF's !!\n");
+ pr_warn("Will not disable SR-IOV.\n");
+ }
+ }
+
+ /* The device is now marked as being under deletion; continue without
+ * the lock so that other tasks can terminate.
+ */
+ if (persist->interface_state & MLX4_INTERFACE_STATE_UP)
+ mlx4_unload_one(pdev);
+ else
+ mlx4_info(dev, "%s: interface is down\n", __func__);
+ mlx4_catas_end(dev);
+ if (dev->flags & MLX4_FLAG_SRIOV && !active_vfs) {
+ mlx4_warn(dev, "Disabling SR-IOV\n");
+ pci_disable_sriov(pdev);
+ }
+
+ pci_release_regions(pdev);
+ pci_disable_device(pdev);
+ kfree(dev->persist);
+ kfree(priv);
+ pci_set_drvdata(pdev, NULL);
+}
+
+static int restore_current_port_types(struct mlx4_dev *dev,
+ enum mlx4_port_type *types,
+ enum mlx4_port_type *poss_types)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int err, i;
+
+ mlx4_stop_sense(dev);
+
+ mutex_lock(&priv->port_mutex);
+ for (i = 0; i < dev->caps.num_ports; i++)
+ dev->caps.possible_type[i + 1] = poss_types[i];
+ err = mlx4_change_port_types(dev, types);
+ mlx4_start_sense(dev);
+ mutex_unlock(&priv->port_mutex);
+
+ return err;
+}
+
+int mlx4_restart_one(struct pci_dev *pdev)
+{
+ struct mlx4_dev_persistent *persist = pci_get_drvdata(pdev);
+ struct mlx4_dev *dev = persist->dev;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int nvfs[MLX4_MAX_PORTS + 1] = {0, 0, 0};
+ int pci_dev_data, err, total_vfs;
+
+ pci_dev_data = priv->pci_dev_data;
+ total_vfs = dev->persist->num_vfs;
+ memcpy(nvfs, dev->persist->nvfs, sizeof(dev->persist->nvfs));
+
+ mlx4_unload_one(pdev);
+ err = mlx4_load_one(pdev, pci_dev_data, total_vfs, nvfs, priv, 1);
+ if (err) {
+ mlx4_err(dev, "%s: ERROR: mlx4_load_one failed, pci_name=%s, err=%d\n",
+ __func__, pci_name(pdev), err);
+ return err;
+ }
+
+ err = restore_current_port_types(dev, dev->persist->curr_port_type,
+ dev->persist->curr_port_poss_type);
+ if (err)
+ mlx4_err(dev, "could not restore original port types (%d)\n",
+ err);
+
+ return err;
+}
+
+static const struct pci_device_id mlx4_pci_table[] = {
+ /* MT25408 "Hermon" SDR */
+ { PCI_VDEVICE(MELLANOX, 0x6340), MLX4_PCI_DEV_FORCE_SENSE_PORT },
+ /* MT25408 "Hermon" DDR */
+ { PCI_VDEVICE(MELLANOX, 0x634a), MLX4_PCI_DEV_FORCE_SENSE_PORT },
+ /* MT25408 "Hermon" QDR */
+ { PCI_VDEVICE(MELLANOX, 0x6354), MLX4_PCI_DEV_FORCE_SENSE_PORT },
+ /* MT25408 "Hermon" DDR PCIe gen2 */
+ { PCI_VDEVICE(MELLANOX, 0x6732), MLX4_PCI_DEV_FORCE_SENSE_PORT },
+ /* MT25408 "Hermon" QDR PCIe gen2 */
+ { PCI_VDEVICE(MELLANOX, 0x673c), MLX4_PCI_DEV_FORCE_SENSE_PORT },
+ /* MT25408 "Hermon" EN 10GigE */
+ { PCI_VDEVICE(MELLANOX, 0x6368), MLX4_PCI_DEV_FORCE_SENSE_PORT },
+ /* MT25408 "Hermon" EN 10GigE PCIe gen2 */
+ { PCI_VDEVICE(MELLANOX, 0x6750), MLX4_PCI_DEV_FORCE_SENSE_PORT },
+ /* MT25458 ConnectX EN 10GBASE-T 10GigE */
+ { PCI_VDEVICE(MELLANOX, 0x6372), MLX4_PCI_DEV_FORCE_SENSE_PORT },
+ /* MT25458 ConnectX EN 10GBASE-T+Gen2 10GigE */
+ { PCI_VDEVICE(MELLANOX, 0x675a), MLX4_PCI_DEV_FORCE_SENSE_PORT },
+ /* MT26468 ConnectX EN 10GigE PCIe gen2 */
+ { PCI_VDEVICE(MELLANOX, 0x6764), MLX4_PCI_DEV_FORCE_SENSE_PORT },
+ /* MT26438 ConnectX EN 40GigE PCIe gen2 5GT/s */
+ { PCI_VDEVICE(MELLANOX, 0x6746), MLX4_PCI_DEV_FORCE_SENSE_PORT },
+ /* MT26478 ConnectX2 40GigE PCIe gen2 */
+ { PCI_VDEVICE(MELLANOX, 0x676e), MLX4_PCI_DEV_FORCE_SENSE_PORT },
+ /* MT25400 Family [ConnectX-2 Virtual Function] */
+ { PCI_VDEVICE(MELLANOX, 0x1002), MLX4_PCI_DEV_IS_VF },
+ /* MT27500 Family [ConnectX-3] */
+ { PCI_VDEVICE(MELLANOX, 0x1003), 0 },
+ /* MT27500 Family [ConnectX-3 Virtual Function] */
+ { PCI_VDEVICE(MELLANOX, 0x1004), MLX4_PCI_DEV_IS_VF },
+ { PCI_VDEVICE(MELLANOX, 0x1005), 0 }, /* MT27510 Family */
+ { PCI_VDEVICE(MELLANOX, 0x1006), 0 }, /* MT27511 Family */
+ { PCI_VDEVICE(MELLANOX, 0x1007), 0 }, /* MT27520 Family */
+ { PCI_VDEVICE(MELLANOX, 0x1008), 0 }, /* MT27521 Family */
+ { PCI_VDEVICE(MELLANOX, 0x1009), 0 }, /* MT27530 Family */
+ { PCI_VDEVICE(MELLANOX, 0x100a), 0 }, /* MT27531 Family */
+ { PCI_VDEVICE(MELLANOX, 0x100b), 0 }, /* MT27540 Family */
+ { PCI_VDEVICE(MELLANOX, 0x100c), 0 }, /* MT27541 Family */
+ { PCI_VDEVICE(MELLANOX, 0x100d), 0 }, /* MT27550 Family */
+ { PCI_VDEVICE(MELLANOX, 0x100e), 0 }, /* MT27551 Family */
+ { PCI_VDEVICE(MELLANOX, 0x100f), 0 }, /* MT27560 Family */
+ { PCI_VDEVICE(MELLANOX, 0x1010), 0 }, /* MT27561 Family */
+ { 0, }
+};
+
+MODULE_DEVICE_TABLE(pci, mlx4_pci_table);
+
+static pci_ers_result_t mlx4_pci_err_detected(struct pci_dev *pdev,
+ pci_channel_state_t state)
+{
+ struct mlx4_dev_persistent *persist = pci_get_drvdata(pdev);
+
+ mlx4_err(persist->dev, "mlx4_pci_err_detected was called\n");
+ mlx4_enter_error_state(persist);
+
+ mutex_lock(&persist->interface_state_mutex);
+ if (persist->interface_state & MLX4_INTERFACE_STATE_UP)
+ mlx4_unload_one(pdev);
+
+ mutex_unlock(&persist->interface_state_mutex);
+ if (state == pci_channel_io_perm_failure)
+ return PCI_ERS_RESULT_DISCONNECT;
+
+ pci_disable_device(pdev);
+ return PCI_ERS_RESULT_NEED_RESET;
+}
+
+static pci_ers_result_t mlx4_pci_slot_reset(struct pci_dev *pdev)
+{
+ struct mlx4_dev_persistent *persist = pci_get_drvdata(pdev);
+ struct mlx4_dev *dev = persist->dev;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int ret;
+ int nvfs[MLX4_MAX_PORTS + 1] = {0, 0, 0};
+ int total_vfs;
+
+ mlx4_err(dev, "mlx4_pci_slot_reset was called\n");
+ ret = pci_enable_device(pdev);
+ if (ret) {
+ mlx4_err(dev, "Can not re-enable device, ret=%d\n", ret);
+ return PCI_ERS_RESULT_DISCONNECT;
+ }
+
+ pci_set_master(pdev);
+ pci_restore_state(pdev);
+ pci_save_state(pdev);
+
+ total_vfs = dev->persist->num_vfs;
+ memcpy(nvfs, dev->persist->nvfs, sizeof(dev->persist->nvfs));
+
+ mutex_lock(&persist->interface_state_mutex);
+ if (!(persist->interface_state & MLX4_INTERFACE_STATE_UP)) {
+ ret = mlx4_load_one(pdev, priv->pci_dev_data, total_vfs, nvfs,
+ priv, 1);
+ if (ret) {
+ mlx4_err(dev, "%s: mlx4_load_one failed, ret=%d\n",
+ __func__, ret);
+ goto end;
+ }
+
+ ret = restore_current_port_types(dev, dev->persist->
+ curr_port_type, dev->persist->
+ curr_port_poss_type);
+ if (ret)
+ mlx4_err(dev, "could not restore original port types (%d)\n", ret);
+ }
+end:
+ mutex_unlock(&persist->interface_state_mutex);
+
+ return ret ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
+}
+
+static void mlx4_shutdown(struct pci_dev *pdev)
+{
+ struct mlx4_dev_persistent *persist = pci_get_drvdata(pdev);
+
+ mlx4_info(persist->dev, "mlx4_shutdown was called\n");
+ mutex_lock(&persist->interface_state_mutex);
+ if (persist->interface_state & MLX4_INTERFACE_STATE_UP)
+ mlx4_unload_one(pdev);
+ mutex_unlock(&persist->interface_state_mutex);
+}
+
+#ifdef CONFIG_COMPAT_IS_CONST_PCI_ERROR_HANDLERS
+static const struct pci_error_handlers mlx4_err_handler = {
+#else
+static struct pci_error_handlers mlx4_err_handler = {
+#endif
+ .error_detected = mlx4_pci_err_detected,
+ .slot_reset = mlx4_pci_slot_reset,
+};
+
+static int mlx4_suspend(struct pci_dev *pdev, pm_message_t state)
+{
+ struct mlx4_dev_persistent *persist = pci_get_drvdata(pdev);
+ struct mlx4_dev *dev = persist->dev;
+
+ mlx4_err(dev, "suspend was called\n");
+ mutex_lock(&persist->interface_state_mutex);
+ if (persist->interface_state & MLX4_INTERFACE_STATE_UP)
+ mlx4_unload_one(pdev);
+ mutex_unlock(&persist->interface_state_mutex);
+
+ return 0;
+}
+
+static int mlx4_resume(struct pci_dev *pdev)
+{
+ int nvfs[MLX4_MAX_PORTS + 1] = {0, 0, 0};
+ int total_vfs;
+ struct mlx4_dev_persistent *persist = pci_get_drvdata(pdev);
+ struct mlx4_dev *dev = persist->dev;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int ret = 0;
+
+ mlx4_err(dev, "resume was called\n");
+ total_vfs = dev->persist->num_vfs;
+ memcpy(nvfs, dev->persist->nvfs, sizeof(dev->persist->nvfs));
+
+ mutex_lock(&persist->interface_state_mutex);
+ if (!(persist->interface_state & MLX4_INTERFACE_STATE_UP)) {
+ ret = mlx4_load_one(pdev, priv->pci_dev_data, total_vfs, nvfs, priv, 1);
+ if (!ret) {
+ ret = restore_current_port_types(dev, dev->persist->
+ curr_port_type, dev->persist->
+ curr_port_poss_type);
+ if (ret)
+ mlx4_err(dev, "resume: could not restore original port types (%d)\n", ret);
+ }
+ }
+ mutex_unlock(&persist->interface_state_mutex);
+
+ return ret;
+}
+
+static struct pci_driver mlx4_driver = {
+ .name = DRV_NAME,
+ .id_table = mlx4_pci_table,
+ .probe = mlx4_init_one,
+ .shutdown = mlx4_shutdown,
+ .remove = mlx4_remove_one,
+ .suspend = mlx4_suspend,
+ .resume = mlx4_resume,
+ .err_handler = &mlx4_err_handler,
+};
+
+static int __init mlx4_verify_params(void)
+{
+ int status;
+
+ status = update_defaults(&port_type_array);
+ if (status == INVALID_STR) {
+ if (mlx4_fill_dbdf2val_tbl(&port_type_array.dbdf2val))
+ return -1;
+ } else if (status == INVALID_DATA) {
+ return -1;
+ }
+
+ status = update_defaults(&num_vfs);
+ if (status == INVALID_STR) {
+ if (mlx4_fill_dbdf2val_tbl(&num_vfs.dbdf2val))
+ return -1;
+ } else if (status == INVALID_DATA) {
+ return -1;
+ }
+
+ status = update_defaults(&probe_vf);
+ if (status == INVALID_STR) {
+ if (mlx4_fill_dbdf2val_tbl(&probe_vf.dbdf2val))
+ return -1;
+ } else if (status == INVALID_DATA) {
+ return -1;
+ }
+
+ status = update_defaults(&roce_mode);
+ if (status == INVALID_STR) {
+ if (mlx4_fill_dbdf2val_tbl(&roce_mode.dbdf2val))
+ return -1;
+ } else if (status == INVALID_DATA) {
+ return -1;
+ }
+
+ status = update_defaults(&ud_gid_type);
+ if (status == INVALID_STR) {
+ if (mlx4_fill_dbdf2val_tbl(&ud_gid_type.dbdf2val))
+ return -1;
+ } else if (status == INVALID_DATA) {
+ return -1;
+ }
+
+ if (msi_x < 0) {
+ pr_warn("mlx4_core: bad msi_x: %d\n", msi_x);
+ return -1;
+ }
+
+ if ((log_num_mac < 0) || (log_num_mac > MLX4_MAX_LOG_NUM_MACS)) {
+ pr_warning("mlx4_core: bad log_num_mac: %d\n", log_num_mac);
+ return -1;
+ }
+
+ if (log_num_vlan != 0)
+ pr_warning("mlx4_core: log_num_vlan - obsolete module param, using %d\n",
+ MLX4_LOG_NUM_VLANS);
+
+ if ((log_mtts_per_seg < 0) || (log_mtts_per_seg > 7)) {
+ pr_warning("mlx4_core: bad log_mtts_per_seg: %d\n", log_mtts_per_seg);
+ return -1;
+ }
+
+ if (mlx4_log_num_mgm_entry_size < (int)(-MLX4_DMFS_PARAM_VALUES) ||
+ (mlx4_log_num_mgm_entry_size > 0 &&
+ (mlx4_log_num_mgm_entry_size < MLX4_MIN_MGM_LOG_ENTRY_SIZE ||
+ mlx4_log_num_mgm_entry_size > MLX4_MAX_MGM_LOG_ENTRY_SIZE))) {
+ pr_warning("mlx4_core: mlx4_log_num_mgm_entry_size (%d) not "
+ "in legal range (-%d..0 or %d..%d)\n",
+ mlx4_log_num_mgm_entry_size,
+ MLX4_DMFS_PARAM_VALUES,
+ MLX4_MIN_MGM_LOG_ENTRY_SIZE,
+ MLX4_MAX_MGM_LOG_ENTRY_SIZE);
+ return -1;
+ }
+ if (ingress_parser_mode < MLX4_INGRESS_PARSER_MODE_STANDARD ||
+ ingress_parser_mode >= MLX4_INGRESS_PARSER_MODE_MAX) {
+ pr_warn("mlx4_core: ingress_parser_mode (%d) not "
+ "in legal range %d..%d. "
+ "Changing to default\n",
+ ingress_parser_mode,
+ MLX4_INGRESS_PARSER_MODE_STANDARD,
+ MLX4_INGRESS_PARSER_MODE_MAX - 1);
+ ingress_parser_mode = MLX4_INGRESS_PARSER_MODE_STANDARD;
+ }
+
+ if (mlx4_log_num_mgm_entry_size < 0 &&
+ (!((-mlx4_log_num_mgm_entry_size) & MLX4_DMFS_ETH_ONLY)) &&
+ ((-mlx4_log_num_mgm_entry_size) & MLX4_DMFS_A0_STEERING)) {
+ pr_warn("mlx4_core: Can't support IPoIB flow steering along "
+ "with optimized steering\n");
+ return -1;
+ }
+
+ if (mod_param_profile.num_qp < 18 || mod_param_profile.num_qp > 23) {
+ pr_warning("mlx4_core: bad log_num_qp: %d\n",
+ mod_param_profile.num_qp);
+ return -1;
+ }
+
+ if (mod_param_profile.num_srq < 10) {
+ pr_warning("mlx4_core: too low log_num_srq: %d\n",
+ mod_param_profile.num_srq);
+ return -1;
+ }
+
+ if (mod_param_profile.num_cq < 10) {
+ pr_warning("mlx4_core: too low log_num_cq: %d\n",
+ mod_param_profile.num_cq);
+ return -1;
+ }
+
+ if (mod_param_profile.num_mpt < 10) {
+ pr_warning("mlx4_core: too low log_num_mpt: %d\n",
+ mod_param_profile.num_mpt);
+ return -1;
+ }
+
+ if (mod_param_profile.num_mtt &&
+ mod_param_profile.num_mtt < 15) {
+ pr_warning("mlx4_core: too low log_num_mtt: %d\n",
+ mod_param_profile.num_mtt);
+ return -1;
+ }
+
+ if (mod_param_profile.num_mtt > MLX4_MAX_LOG_NUM_MTT) {
+ pr_warning("mlx4_core: too high log_num_mtt: %d\n",
+ mod_param_profile.num_mtt);
+ return -1;
+ }
+ return 0;
+}
+
+static int __init mlx4_init(void)
+{
+ int ret;
+
+ if (mlx4_verify_params())
+ return -EINVAL;
+
+
+ mlx4_wq = create_singlethread_workqueue("mlx4");
+ if (!mlx4_wq)
+ return -ENOMEM;
+
+ ret = pci_register_driver(&mlx4_driver);
+ if (ret < 0)
+ destroy_workqueue(mlx4_wq);
+ return ret < 0 ? ret : 0;
+}
+
+static void __exit mlx4_cleanup(void)
+{
+ pci_unregister_driver(&mlx4_driver);
+ destroy_workqueue(mlx4_wq);
+}
+
+module_init(mlx4_init);
+module_exit(mlx4_cleanup);
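
Note on the num_vfs handling in __mlx4_init_one() above: the param_map
table decides which nvfs[] slot each module-parameter value lands in,
depending on how many values were given. The sketch below is a
standalone illustration, not part of the patch. It assumes
MLX4_MAX_PORTS is 2 (suggested by the {0, 0, 0} initializers), and the
sketch_* names are hypothetical. Judging from the per-port check
(nvfs[i] + nvfs[2]), slot 2 appears to count against both ports.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_PORTS 2	/* assumption: mirrors MLX4_MAX_PORTS */

/* Same values as param_map in __mlx4_init_one():
 * row = (number of values given) - 1, column = position of the value,
 * entry = index of the nvfs[] slot that receives the value.
 */
static const int sketch_param_map[SKETCH_MAX_PORTS + 1][SKETCH_MAX_PORTS + 1] = {
	{2, 0, 0}, {0, 1, 2}, {0, 1, 2}
};

/* Distribute argc values from args[] into nvfs[] and return their sum,
 * or -1 if any value is negative (the driver returns -EINVAL there).
 */
static int sketch_distribute_vfs(const int *args, int argc,
				 int nvfs[SKETCH_MAX_PORTS + 1])
{
	int total = 0;
	int i;

	assert(argc >= 0 && argc <= SKETCH_MAX_PORTS + 1);

	for (i = 0; i <= SKETCH_MAX_PORTS; i++)
		nvfs[i] = 0;

	for (i = 0; i < argc; i++) {
		int slot = sketch_param_map[argc - 1][i];

		if (args[i] < 0)
			return -1;
		nvfs[slot] = args[i];
		total += args[i];
	}
	return total;
}
```

With a single value, e.g. 8, the value goes to slot 2 and the total is 8.
With three values, e.g. 4, 2, 1, the slots are filled in order and the
total is 7.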
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/mcg.c b/drivers/net/mlnx_uio/mlnx/mlx4/mcg.c
new file mode 100644
index 0000000..3c17bb6
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/mcg.c
@@ -0,0 +1,1665 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved.
+ * Copyright (c) 2007, 2008 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+
+
+#include "mlx4.h"
+
+static const u8 zero_gid[16]; /* automatically initialized to 0 */
+
+int mlx4_get_mgm_entry_size(struct mlx4_dev *dev)
+{
+ return 1 << dev->oper_log_mgm_entry_size;
+}
+
+int mlx4_get_qp_per_mgm(struct mlx4_dev *dev)
+{
+ return 4 * (mlx4_get_mgm_entry_size(dev) / 16 - 2);
+}
+
+static int mlx4_QP_FLOW_STEERING_ATTACH(struct mlx4_dev *dev,
+ struct mlx4_cmd_mailbox *mailbox,
+ u32 size,
+ u64 *reg_id)
+{
+ u64 imm;
+ int err = 0;
+
+ err = mlx4_cmd_imm(dev, mailbox->dma, &imm, size, 0,
+ MLX4_QP_FLOW_STEERING_ATTACH, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+ if (err)
+ return err;
+ *reg_id = imm;
+
+ return err;
+}
+
+static int mlx4_QP_FLOW_STEERING_DETACH(struct mlx4_dev *dev, u64 regid)
+{
+ int err = 0;
+
+ err = mlx4_cmd(dev, regid, 0, 0,
+ MLX4_QP_FLOW_STEERING_DETACH, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+
+ return err;
+}
+
+static int mlx4_READ_ENTRY(struct mlx4_dev *dev, int index,
+ struct mlx4_cmd_mailbox *mailbox)
+{
+ return mlx4_cmd_box(dev, 0, mailbox->dma, index, 0, MLX4_CMD_READ_MCG,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+}
+
+static int mlx4_WRITE_ENTRY(struct mlx4_dev *dev, int index,
+ struct mlx4_cmd_mailbox *mailbox)
+{
+ return mlx4_cmd(dev, mailbox->dma, index, 0, MLX4_CMD_WRITE_MCG,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+}
+
+static int mlx4_WRITE_PROMISC(struct mlx4_dev *dev, u8 port, u8 steer,
+ struct mlx4_cmd_mailbox *mailbox)
+{
+ u32 in_mod;
+
+ in_mod = (u32) port << 16 | steer << 1;
+ return mlx4_cmd(dev, mailbox->dma, in_mod, 0x1,
+ MLX4_CMD_WRITE_MCG, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+}
+
+static int mlx4_GID_HASH(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox,
+ u16 *hash, u8 op_mod)
+{
+ u64 imm;
+ int err;
+
+ err = mlx4_cmd_imm(dev, mailbox->dma, &imm, 0, op_mod,
+ MLX4_CMD_MGID_HASH, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+
+ if (!err)
+ *hash = imm;
+
+ return err;
+}
+
+static struct mlx4_promisc_qp *get_promisc_qp(struct mlx4_dev *dev, u8 port,
+ enum mlx4_steer_type steer,
+ u32 qpn)
+{
+ struct mlx4_steer *s_steer;
+ struct mlx4_promisc_qp *pqp;
+
+ if (port < 1 || port > dev->caps.num_ports)
+ return NULL;
+
+ s_steer = &mlx4_priv(dev)->steer[port - 1];
+
+ list_for_each_entry(pqp, &s_steer->promisc_qps[steer], list) {
+ if (pqp->qpn == qpn)
+ return pqp;
+ }
+ /* not found */
+ return NULL;
+}
+
+/*
+ * Add new entry to steering data structure.
+ * All promisc QPs should be added as well
+ */
+static int new_steering_entry(struct mlx4_dev *dev, u8 port,
+ enum mlx4_steer_type steer,
+ unsigned int index, u32 qpn)
+{
+ struct mlx4_steer *s_steer;
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_mgm *mgm;
+ u32 members_count;
+ struct mlx4_steer_index *new_entry;
+ struct mlx4_promisc_qp *pqp;
+ struct mlx4_promisc_qp *dqp = NULL;
+ u32 prot;
+ int err;
+
+ if (port < 1 || port > dev->caps.num_ports)
+ return -EINVAL;
+
+ s_steer = &mlx4_priv(dev)->steer[port - 1];
+ new_entry = kzalloc(sizeof *new_entry, GFP_KERNEL);
+ if (!new_entry)
+ return -ENOMEM;
+
+ INIT_LIST_HEAD(&new_entry->duplicates);
+ new_entry->index = index;
+ list_add_tail(&new_entry->list, &s_steer->steer_entries[steer]);
+
+ /* If the given qpn is also a promisc qp,
+ * it should be inserted to duplicates list
+ */
+ pqp = get_promisc_qp(dev, port, steer, qpn);
+ if (pqp) {
+ dqp = kmalloc(sizeof *dqp, GFP_KERNEL);
+ if (!dqp) {
+ err = -ENOMEM;
+ goto out_alloc;
+ }
+ dqp->qpn = qpn;
+ list_add_tail(&dqp->list, &new_entry->duplicates);
+ }
+
+ /* if no promisc qps for this vep, we are done */
+ if (list_empty(&s_steer->promisc_qps[steer]))
+ return 0;
+
+ /* now need to add all the promisc qps to the new
+ * steering entry, as they should also receive the packets
+ * destined to this address */
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox)) {
+ err = -ENOMEM;
+ goto out_alloc;
+ }
+ mgm = mailbox->buf;
+
+ err = mlx4_READ_ENTRY(dev, index, mailbox);
+ if (err)
+ goto out_mailbox;
+
+ members_count = be32_to_cpu(mgm->members_count) & 0xffffff;
+ prot = be32_to_cpu(mgm->members_count) >> 30;
+ list_for_each_entry(pqp, &s_steer->promisc_qps[steer], list) {
+ /* don't add already existing qpn */
+ if (pqp->qpn == qpn)
+ continue;
+ if (members_count == dev->caps.num_qp_per_mgm) {
+ /* out of space */
+ err = -ENOMEM;
+ goto out_mailbox;
+ }
+
+ /* add the qpn */
+ mgm->qp[members_count++] = cpu_to_be32(pqp->qpn & MGM_QPN_MASK);
+ }
+ /* update the qps count and update the entry with all the promisc qps */
+ mgm->members_count = cpu_to_be32(members_count | (prot << 30));
+ err = mlx4_WRITE_ENTRY(dev, index, mailbox);
+
+out_mailbox:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ if (!err)
+ return 0;
+out_alloc:
+ if (dqp) {
+ list_del(&dqp->list);
+ kfree(dqp);
+ }
+ list_del(&new_entry->list);
+ kfree(new_entry);
+ return err;
+}
+
+/* update the data structures with existing steering entry */
+static int existing_steering_entry(struct mlx4_dev *dev, u8 port,
+ enum mlx4_steer_type steer,
+ unsigned int index, u32 qpn)
+{
+ struct mlx4_steer *s_steer;
+ struct mlx4_steer_index *tmp_entry, *entry = NULL;
+ struct mlx4_promisc_qp *pqp;
+ struct mlx4_promisc_qp *dqp;
+
+ if (port < 1 || port > dev->caps.num_ports)
+ return -EINVAL;
+
+ s_steer = &mlx4_priv(dev)->steer[port - 1];
+
+ pqp = get_promisc_qp(dev, port, steer, qpn);
+ if (!pqp)
+ return 0; /* nothing to do */
+
+ list_for_each_entry(tmp_entry, &s_steer->steer_entries[steer], list) {
+ if (tmp_entry->index == index) {
+ entry = tmp_entry;
+ break;
+ }
+ }
+ if (unlikely(!entry)) {
+ mlx4_warn(dev, "Steering entry at index %x is not registered\n", index);
+ return -EINVAL;
+ }
+
+ /* The given qpn is listed as a promisc qpn, so
+ * add it as a duplicate to this entry
+ * for future reference. */
+ list_for_each_entry(dqp, &entry->duplicates, list) {
+ if (qpn == dqp->qpn)
+ return 0; /* qp is already duplicated */
+ }
+
+ /* add the qp as a duplicate on this index */
+ dqp = kmalloc(sizeof *dqp, GFP_KERNEL);
+ if (!dqp)
+ return -ENOMEM;
+ dqp->qpn = qpn;
+ list_add_tail(&dqp->list, &entry->duplicates);
+
+ return 0;
+}
+
+/* Check whether a qpn is a duplicate on steering entry
+ * If so, it should not be removed from mgm */
+static bool check_duplicate_entry(struct mlx4_dev *dev, u8 port,
+ enum mlx4_steer_type steer,
+ unsigned int index, u32 qpn)
+{
+ struct mlx4_steer *s_steer;
+ struct mlx4_steer_index *tmp_entry, *entry = NULL;
+ struct mlx4_promisc_qp *dqp, *tmp_dqp;
+
+ if (port < 1 || port > dev->caps.num_ports)
+ return false;
+
+ s_steer = &mlx4_priv(dev)->steer[port - 1];
+
+ /* if qp is not promisc, it cannot be duplicated */
+ if (!get_promisc_qp(dev, port, steer, qpn))
+ return false;
+
+ /* The qp is promisc qp so it is a duplicate on this index
+ * Find the index entry, and remove the duplicate */
+ list_for_each_entry(tmp_entry, &s_steer->steer_entries[steer], list) {
+ if (tmp_entry->index == index) {
+ entry = tmp_entry;
+ break;
+ }
+ }
+ if (unlikely(!entry)) {
+ mlx4_warn(dev, "Steering entry for index %x is not registered\n", index);
+ return false;
+ }
+ list_for_each_entry_safe(dqp, tmp_dqp, &entry->duplicates, list) {
+ if (dqp->qpn == qpn) {
+ list_del(&dqp->list);
+ kfree(dqp);
+ }
+ }
+ return true;
+}
+
+/* Returns true if all the QPs != tqpn contained in this entry
+ * are Promisc QPs. Returns false otherwise.
+ */
+static bool promisc_steering_entry(struct mlx4_dev *dev, u8 port,
+ enum mlx4_steer_type steer,
+ unsigned int index, u32 tqpn,
+ u32 *members_count)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_mgm *mgm;
+ u32 m_count;
+ bool ret = false;
+ int i;
+
+ if (port < 1 || port > dev->caps.num_ports)
+ return false;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return false;
+ mgm = mailbox->buf;
+
+ if (mlx4_READ_ENTRY(dev, index, mailbox))
+ goto out;
+ m_count = be32_to_cpu(mgm->members_count) & 0xffffff;
+ if (members_count)
+ *members_count = m_count;
+
+ for (i = 0; i < m_count; i++) {
+ u32 qpn = be32_to_cpu(mgm->qp[i]) & MGM_QPN_MASK;
+ if (!get_promisc_qp(dev, port, steer, qpn) && qpn != tqpn) {
+ /* the qp is not promisc, the entry can't be removed */
+ goto out;
+ }
+ }
+ ret = true;
+out:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return ret;
+}
+
+/* If a steering entry contains only promisc QPs, it can be removed. */
+static bool can_remove_steering_entry(struct mlx4_dev *dev, u8 port,
+ enum mlx4_steer_type steer,
+ unsigned int index, u32 tqpn)
+{
+ struct mlx4_steer *s_steer;
+ struct mlx4_steer_index *entry = NULL, *tmp_entry;
+ u32 members_count;
+ bool ret = false;
+
+ if (port < 1 || port > dev->caps.num_ports)
+ return false;
+
+ s_steer = &mlx4_priv(dev)->steer[port - 1];
+
+ if (!promisc_steering_entry(dev, port, steer, index,
+ tqpn, &members_count))
+ goto out;
+
+ /* All the qps currently registered for this entry are promiscuous,
+ * Checking for duplicates */
+ ret = true;
+ list_for_each_entry_safe(entry, tmp_entry, &s_steer->steer_entries[steer], list) {
+ if (entry->index == index) {
+ if (list_empty(&entry->duplicates) ||
+ members_count == 1) {
+ struct mlx4_promisc_qp *pqp, *tmp_pqp;
+ /* If there is only 1 entry in duplicates then
+ * this is the QP we want to delete, going over
+ * the list and deleting the entry.
+ */
+ list_del(&entry->list);
+ list_for_each_entry_safe(pqp, tmp_pqp,
+ &entry->duplicates,
+ list) {
+ list_del(&pqp->list);
+ kfree(pqp);
+ }
+ kfree(entry);
+ } else {
+ /* This entry contains duplicates so it shouldn't be removed */
+ ret = false;
+ goto out;
+ }
+ }
+ }
+
+out:
+ return ret;
+}
+
+static int add_promisc_qp(struct mlx4_dev *dev, u8 port,
+ enum mlx4_steer_type steer, u32 qpn)
+{
+ struct mlx4_steer *s_steer;
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_mgm *mgm;
+ struct mlx4_steer_index *entry;
+ struct mlx4_promisc_qp *pqp;
+ struct mlx4_promisc_qp *dqp;
+ u32 members_count;
+ u32 prot;
+ int i;
+ bool found;
+ int err;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ if (port < 1 || port > dev->caps.num_ports)
+ return -EINVAL;
+
+ s_steer = &mlx4_priv(dev)->steer[port - 1];
+
+ mutex_lock(&priv->mcg_table.mutex);
+
+ if (get_promisc_qp(dev, port, steer, qpn)) {
+ err = 0; /* Nothing to do, already exists */
+ goto out_mutex;
+ }
+
+ pqp = kmalloc(sizeof *pqp, GFP_KERNEL);
+ if (!pqp) {
+ err = -ENOMEM;
+ goto out_mutex;
+ }
+ pqp->qpn = qpn;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox)) {
+ err = -ENOMEM;
+ goto out_alloc;
+ }
+ mgm = mailbox->buf;
+
+ if (!(mlx4_is_mfunc(dev) && steer == MLX4_UC_STEER)) {
+ /* The promisc QP needs to be added for each one of the steering
+ * entries. If it already exists, needs to be added as
+ * a duplicate for this entry.
+ */
+ list_for_each_entry(entry,
+ &s_steer->steer_entries[steer],
+ list) {
+ err = mlx4_READ_ENTRY(dev, entry->index, mailbox);
+ if (err)
+ goto out_mailbox;
+
+ members_count = be32_to_cpu(mgm->members_count) &
+ 0xffffff;
+ prot = be32_to_cpu(mgm->members_count) >> 30;
+ found = false;
+ for (i = 0; i < members_count; i++) {
+ if ((be32_to_cpu(mgm->qp[i]) &
+ MGM_QPN_MASK) == qpn) {
+ /* Entry already exists.
+ * Add to duplicates.
+ */
+ dqp = kmalloc(sizeof(*dqp), GFP_KERNEL);
+ if (!dqp) {
+ err = -ENOMEM;
+ goto out_mailbox;
+ }
+ dqp->qpn = qpn;
+ list_add_tail(&dqp->list,
+ &entry->duplicates);
+ found = true;
+ }
+ }
+ if (!found) {
+ /* Need to add the qpn to mgm */
+ if (members_count ==
+ dev->caps.num_qp_per_mgm) {
+ /* entry is full */
+ err = -ENOMEM;
+ goto out_mailbox;
+ }
+ mgm->qp[members_count++] =
+ cpu_to_be32(qpn & MGM_QPN_MASK);
+ mgm->members_count =
+ cpu_to_be32(members_count |
+ (prot << 30));
+ err = mlx4_WRITE_ENTRY(dev, entry->index,
+ mailbox);
+ if (err)
+ goto out_mailbox;
+ }
+ }
+ }
+
+ /* add the new qpn to list of promisc qps */
+ list_add_tail(&pqp->list, &s_steer->promisc_qps[steer]);
+ /* now need to add all the promisc qps to default entry */
+ memset(mgm, 0, sizeof *mgm);
+ members_count = 0;
+ list_for_each_entry(dqp, &s_steer->promisc_qps[steer], list) {
+ if (members_count == dev->caps.num_qp_per_mgm) {
+ /* entry is full */
+ err = -ENOMEM;
+ goto out_list;
+ }
+ mgm->qp[members_count++] = cpu_to_be32(dqp->qpn & MGM_QPN_MASK);
+ }
+ mgm->members_count = cpu_to_be32(members_count | MLX4_PROT_ETH << 30);
+
+ err = mlx4_WRITE_PROMISC(dev, port, steer, mailbox);
+ if (err)
+ goto out_list;
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ mutex_unlock(&priv->mcg_table.mutex);
+ return 0;
+
+out_list:
+ list_del(&pqp->list);
+out_mailbox:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+out_alloc:
+ kfree(pqp);
+out_mutex:
+ mutex_unlock(&priv->mcg_table.mutex);
+ return err;
+}
+
+static int remove_promisc_qp(struct mlx4_dev *dev, u8 port,
+ enum mlx4_steer_type steer, u32 qpn)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_steer *s_steer;
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_mgm *mgm;
+ struct mlx4_steer_index *entry, *tmp_entry;
+ struct mlx4_promisc_qp *pqp;
+ struct mlx4_promisc_qp *dqp;
+ u32 members_count;
+ bool found;
+ bool back_to_list = false;
+ int i;
+ int err;
+
+ if (port < 1 || port > dev->caps.num_ports)
+ return -EINVAL;
+
+ s_steer = &mlx4_priv(dev)->steer[port - 1];
+ mutex_lock(&priv->mcg_table.mutex);
+
+ pqp = get_promisc_qp(dev, port, steer, qpn);
+ if (unlikely(!pqp)) {
+ mlx4_warn(dev, "QP %x is not promiscuous QP\n", qpn);
+ /* nothing to do */
+ err = 0;
+ goto out_mutex;
+ }
+
+ /* remove from list of promisc qps */
+ list_del(&pqp->list);
+
+ /* set the default entry not to include the removed one */
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox)) {
+ err = -ENOMEM;
+ back_to_list = true;
+ goto out_list;
+ }
+ mgm = mailbox->buf;
+ members_count = 0;
+ list_for_each_entry(dqp, &s_steer->promisc_qps[steer], list)
+ mgm->qp[members_count++] = cpu_to_be32(dqp->qpn & MGM_QPN_MASK);
+ mgm->members_count = cpu_to_be32(members_count | MLX4_PROT_ETH << 30);
+
+ err = mlx4_WRITE_PROMISC(dev, port, steer, mailbox);
+ if (err)
+ goto out_mailbox;
+
+ if (!(mlx4_is_mfunc(dev) && steer == MLX4_UC_STEER)) {
+ /* Remove the QP from all the steering entries */
+ list_for_each_entry_safe(entry, tmp_entry,
+ &s_steer->steer_entries[steer],
+ list) {
+ found = false;
+ list_for_each_entry(dqp, &entry->duplicates, list) {
+ if (dqp->qpn == qpn) {
+ found = true;
+ break;
+ }
+ }
+ if (found) {
+ /* A duplicate, no need to change the MGM,
+ * only update the duplicates list
+ */
+ list_del(&dqp->list);
+ kfree(dqp);
+ } else {
+ int loc = -1;
+
+ err = mlx4_READ_ENTRY(dev,
+ entry->index,
+ mailbox);
+ if (err)
+ goto out_mailbox;
+ members_count =
+ be32_to_cpu(mgm->members_count) &
+ 0xffffff;
+ if (!members_count) {
+ mlx4_warn(dev, "QP %06x wasn't found in entry %x mcount=0. deleting entry...\n",
+ qpn, entry->index);
+ list_del(&entry->list);
+ kfree(entry);
+ continue;
+ }
+
+ for (i = 0; i < members_count; ++i)
+ if ((be32_to_cpu(mgm->qp[i]) &
+ MGM_QPN_MASK) == qpn) {
+ loc = i;
+ break;
+ }
+
+ if (loc < 0) {
+ mlx4_err(dev, "QP %06x wasn't found in entry %d\n",
+ qpn, entry->index);
+ err = -EINVAL;
+ goto out_mailbox;
+ }
+
+ /* Copy the last QP in this MGM
+ * over removed QP
+ */
+ mgm->qp[loc] = mgm->qp[members_count - 1];
+ mgm->qp[members_count - 1] = 0;
+ mgm->members_count =
+ cpu_to_be32(--members_count |
+ (MLX4_PROT_ETH << 30));
+
+ err = mlx4_WRITE_ENTRY(dev,
+ entry->index,
+ mailbox);
+ if (err)
+ goto out_mailbox;
+ }
+ }
+ }
+
+out_mailbox:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+out_list:
+ if (back_to_list)
+ list_add_tail(&pqp->list, &s_steer->promisc_qps[steer]);
+ else
+ kfree(pqp);
+out_mutex:
+ mutex_unlock(&priv->mcg_table.mutex);
+ return err;
+}
+
+/*
+ * Caller must hold the MCG table mutex. gid and mgm parameters must
+ * be properly aligned for command interface.
+ *
+ * Returns 0 unless a firmware command error occurs.
+ *
+ * If GID is found in MGM or MGM is empty, *index = *hash, *prev = -1
+ * and *mgm holds MGM entry.
+ *
+ * If GID is found in AMGM, *index = index in AMGM, *prev = index of
+ * previous entry in hash chain and *mgm holds AMGM entry.
+ *
+ * If no AMGM exists for given gid, *index = -1, *prev = index of last
+ * entry in hash chain and *mgm holds end of hash chain.
+ */
+static int find_entry(struct mlx4_dev *dev, u8 port,
+ u8 *gid, enum mlx4_protocol prot,
+ struct mlx4_cmd_mailbox *mgm_mailbox,
+ int *prev, int *index)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_mgm *mgm = mgm_mailbox->buf;
+ u8 *mgid;
+ int err;
+ u16 hash = 0;
+ u8 op_mod = (prot == MLX4_PROT_ETH) ?
+ !!(dev->caps.flags & MLX4_DEV_CAP_FLAG_VEP_MC_STEER) : 0;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return -ENOMEM;
+ mgid = mailbox->buf;
+
+ memcpy(mgid, gid, 16);
+
+ err = mlx4_GID_HASH(dev, mailbox, &hash, op_mod);
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ if (err)
+ return err;
+
+ if (0)
+ mlx4_dbg(dev, "Hash for %pI6 is %04x\n", gid, hash);
+
+ *index = hash;
+ *prev = -1;
+
+ do {
+ err = mlx4_READ_ENTRY(dev, *index, mgm_mailbox);
+ if (err)
+ return err;
+
+ if (!(be32_to_cpu(mgm->members_count) & 0xffffff)) {
+ if (*index != hash) {
+ mlx4_err(dev, "Found zero MGID in AMGM\n");
+ err = -EINVAL;
+ }
+ return err;
+ }
+
+ if (!memcmp(mgm->gid, gid, 16) &&
+ be32_to_cpu(mgm->members_count) >> 30 == prot)
+ return err;
+
+ *prev = *index;
+ *index = be32_to_cpu(mgm->next_gid_index) >> 6;
+ } while (*index);
+
+ *index = -1;
+ return err;
+}
+
+static const u8 __promisc_mode[] = {
+ [MLX4_FS_REGULAR] = 0x0,
+ [MLX4_FS_ALL_DEFAULT] = 0x1,
+ [MLX4_FS_MC_DEFAULT] = 0x3,
+ [MLX4_FS_UC_SNIFFER] = 0x4,
+ [MLX4_FS_MC_SNIFFER] = 0x5,
+};
+
+int mlx4_map_sw_to_hw_steering_mode(struct mlx4_dev *dev,
+ enum mlx4_net_trans_promisc_mode flow_type)
+{
+ if (flow_type >= MLX4_FS_MODE_NUM) {
+ mlx4_err(dev, "Invalid flow type. type = %d\n", flow_type);
+ return -EINVAL;
+ }
+ return __promisc_mode[flow_type];
+}
+EXPORT_SYMBOL_GPL(mlx4_map_sw_to_hw_steering_mode);
+
+int mlx4_map_hw_to_sw_steering_mode(struct mlx4_dev *dev,
+ u8 flow_type)
+{
+ u8 i;
+
+ for (i = MLX4_NET_TRANS_PROMISC_MODE_OFFSET;
+ i < sizeof(__promisc_mode) / sizeof(__promisc_mode[0]) +
+ MLX4_NET_TRANS_PROMISC_MODE_OFFSET; i++) {
+ if (__promisc_mode[i] == flow_type)
+ return i;
+ }
+
+ return -1;
+}
+EXPORT_SYMBOL_GPL(mlx4_map_hw_to_sw_steering_mode);
+
+static void trans_rule_ctrl_to_hw(struct mlx4_net_trans_rule *ctrl,
+ struct mlx4_net_trans_rule_hw_ctrl *hw)
+{
+ u8 flags = 0;
+
+ flags = ctrl->queue_mode == MLX4_NET_TRANS_Q_LIFO ? 1 : 0;
+ flags |= ctrl->exclusive ? (1 << 2) : 0;
+ flags |= ctrl->allow_loopback ? (1 << 3) : 0;
+
+ hw->flags = flags;
+ hw->type = __promisc_mode[ctrl->promisc_mode];
+ hw->prio = cpu_to_be16(ctrl->priority);
+ hw->port = ctrl->port;
+ hw->qpn = cpu_to_be32(ctrl->qpn);
+}
+
+const u16 __sw_id_hw[] = {
+ [MLX4_NET_TRANS_RULE_ID_ETH] = 0xE001,
+ [MLX4_NET_TRANS_RULE_ID_IB] = 0xE005,
+ [MLX4_NET_TRANS_RULE_ID_IPV6] = 0xE003,
+ [MLX4_NET_TRANS_RULE_ID_IPV4] = 0xE002,
+ [MLX4_NET_TRANS_RULE_ID_TCP] = 0xE004,
+ [MLX4_NET_TRANS_RULE_ID_UDP] = 0xE006,
+ [MLX4_NET_TRANS_RULE_ID_VXLAN] = 0xE008
+};
+
+int mlx4_map_sw_to_hw_steering_id(struct mlx4_dev *dev,
+ enum mlx4_net_trans_rule_id id)
+{
+ if (id >= MLX4_NET_TRANS_RULE_NUM) {
+ mlx4_err(dev, "Invalid network rule id. id = %d\n", id);
+ return -EINVAL;
+ }
+ return __sw_id_hw[id];
+}
+EXPORT_SYMBOL_GPL(mlx4_map_sw_to_hw_steering_id);
+
+static const int __rule_hw_sz[] = {
+ [MLX4_NET_TRANS_RULE_ID_ETH] =
+ sizeof(struct mlx4_net_trans_rule_hw_eth),
+ [MLX4_NET_TRANS_RULE_ID_IB] =
+ sizeof(struct mlx4_net_trans_rule_hw_ib),
+ [MLX4_NET_TRANS_RULE_ID_IPV6] = 0,
+ [MLX4_NET_TRANS_RULE_ID_IPV4] =
+ sizeof(struct mlx4_net_trans_rule_hw_ipv4),
+ [MLX4_NET_TRANS_RULE_ID_TCP] =
+ sizeof(struct mlx4_net_trans_rule_hw_tcp_udp),
+ [MLX4_NET_TRANS_RULE_ID_UDP] =
+ sizeof(struct mlx4_net_trans_rule_hw_tcp_udp),
+ [MLX4_NET_TRANS_RULE_ID_VXLAN] =
+ sizeof(struct mlx4_net_trans_rule_hw_vxlan)
+};
+
+int mlx4_hw_rule_sz(struct mlx4_dev *dev,
+ enum mlx4_net_trans_rule_id id)
+{
+ if (id >= MLX4_NET_TRANS_RULE_NUM) {
+ mlx4_err(dev, "Invalid network rule id. id = %d\n", id);
+ return -EINVAL;
+ }
+
+ return __rule_hw_sz[id];
+}
+EXPORT_SYMBOL_GPL(mlx4_hw_rule_sz);
+
+static int parse_trans_rule(struct mlx4_dev *dev, struct mlx4_spec_list *spec,
+ struct _rule_hw *rule_hw)
+{
+ if (mlx4_hw_rule_sz(dev, spec->id) < 0)
+ return -EINVAL;
+ memset(rule_hw, 0, mlx4_hw_rule_sz(dev, spec->id));
+ rule_hw->id = cpu_to_be16(__sw_id_hw[spec->id]);
+ rule_hw->size = mlx4_hw_rule_sz(dev, spec->id) >> 2;
+
+ switch (spec->id) {
+ case MLX4_NET_TRANS_RULE_ID_ETH:
+ memcpy(rule_hw->eth.dst_mac, spec->eth.dst_mac, ETH_ALEN);
+ memcpy(rule_hw->eth.dst_mac_msk, spec->eth.dst_mac_msk,
+ ETH_ALEN);
+ memcpy(rule_hw->eth.src_mac, spec->eth.src_mac, ETH_ALEN);
+ memcpy(rule_hw->eth.src_mac_msk, spec->eth.src_mac_msk,
+ ETH_ALEN);
+ if (spec->eth.ether_type_enable) {
+ rule_hw->eth.ether_type_enable = 1;
+ rule_hw->eth.ether_type = spec->eth.ether_type;
+ }
+ rule_hw->eth.vlan_tag = spec->eth.vlan_id;
+ rule_hw->eth.vlan_tag_msk = spec->eth.vlan_id_msk;
+ break;
+
+ case MLX4_NET_TRANS_RULE_ID_IB:
+ rule_hw->ib.l3_qpn = spec->ib.l3_qpn |
+ (spec->ib.roce_type == MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV4 ? 0x80 : 0);
+ rule_hw->ib.qpn_mask = spec->ib.qpn_msk;
+ memcpy(&rule_hw->ib.dst_gid, &spec->ib.dst_gid, 16);
+ memcpy(&rule_hw->ib.dst_gid_msk, &spec->ib.dst_gid_msk, 16);
+ break;
+
+ case MLX4_NET_TRANS_RULE_ID_IPV6:
+ return -EOPNOTSUPP;
+
+ case MLX4_NET_TRANS_RULE_ID_IPV4:
+ rule_hw->ipv4.src_ip = spec->ipv4.src_ip;
+ rule_hw->ipv4.src_ip_msk = spec->ipv4.src_ip_msk;
+ rule_hw->ipv4.dst_ip = spec->ipv4.dst_ip;
+ rule_hw->ipv4.dst_ip_msk = spec->ipv4.dst_ip_msk;
+ break;
+
+ case MLX4_NET_TRANS_RULE_ID_TCP:
+ case MLX4_NET_TRANS_RULE_ID_UDP:
+ rule_hw->tcp_udp.dst_port = spec->tcp_udp.dst_port;
+ rule_hw->tcp_udp.dst_port_msk = spec->tcp_udp.dst_port_msk;
+ rule_hw->tcp_udp.src_port = spec->tcp_udp.src_port;
+ rule_hw->tcp_udp.src_port_msk = spec->tcp_udp.src_port_msk;
+ break;
+
+ case MLX4_NET_TRANS_RULE_ID_VXLAN:
+ rule_hw->vxlan.vni =
+ cpu_to_be32(be32_to_cpu(spec->vxlan.vni) << 8);
+ rule_hw->vxlan.vni_mask =
+ cpu_to_be32(be32_to_cpu(spec->vxlan.vni_mask) << 8);
+ break;
+
+ default:
+ return -EINVAL;
+ }
+
+ return __rule_hw_sz[spec->id];
+}
+
+static void mlx4_err_rule(struct mlx4_dev *dev, char *str,
+ struct mlx4_net_trans_rule *rule)
+{
+#define BUF_SIZE 256
+ struct mlx4_spec_list *cur;
+ char buf[BUF_SIZE];
+ int len = 0;
+
+ mlx4_err(dev, "%s", str);
+ len += snprintf(buf + len, BUF_SIZE - len,
+ "port = %d prio = 0x%x qp = 0x%x ",
+ rule->port, rule->priority, rule->qpn);
+
+ list_for_each_entry(cur, &rule->list, list) {
+ switch (cur->id) {
+ case MLX4_NET_TRANS_RULE_ID_ETH:
+ len += snprintf(buf + len, BUF_SIZE - len,
+ "dmac = %pM ", &cur->eth.dst_mac);
+ if (cur->eth.ether_type)
+ len += snprintf(buf + len, BUF_SIZE - len,
+ "ethertype = 0x%x ",
+ be16_to_cpu(cur->eth.ether_type));
+ if (cur->eth.vlan_id)
+ len += snprintf(buf + len, BUF_SIZE - len,
+ "vlan-id = %d ",
+ be16_to_cpu(cur->eth.vlan_id));
+ break;
+
+ case MLX4_NET_TRANS_RULE_ID_IPV4:
+ if (cur->ipv4.src_ip)
+ len += snprintf(buf + len, BUF_SIZE - len,
+ "src-ip = %pI4 ",
+ &cur->ipv4.src_ip);
+ if (cur->ipv4.dst_ip)
+ len += snprintf(buf + len, BUF_SIZE - len,
+ "dst-ip = %pI4 ",
+ &cur->ipv4.dst_ip);
+ break;
+
+ case MLX4_NET_TRANS_RULE_ID_TCP:
+ case MLX4_NET_TRANS_RULE_ID_UDP:
+ if (cur->tcp_udp.src_port)
+ len += snprintf(buf + len, BUF_SIZE - len,
+ "src-port = %d ",
+ be16_to_cpu(cur->tcp_udp.src_port));
+ if (cur->tcp_udp.dst_port)
+ len += snprintf(buf + len, BUF_SIZE - len,
+ "dst-port = %d ",
+ be16_to_cpu(cur->tcp_udp.dst_port));
+ break;
+
+ case MLX4_NET_TRANS_RULE_ID_IB:
+ len += snprintf(buf + len, BUF_SIZE - len,
+ "dst-gid = %pI6\n", cur->ib.dst_gid);
+ len += snprintf(buf + len, BUF_SIZE - len,
+ "dst-gid-mask = %pI6\n",
+ cur->ib.dst_gid_msk);
+ break;
+
+ case MLX4_NET_TRANS_RULE_ID_VXLAN:
+ len += snprintf(buf + len, BUF_SIZE - len,
+ "VNID = %d ", be32_to_cpu(cur->vxlan.vni));
+ break;
+ case MLX4_NET_TRANS_RULE_ID_IPV6:
+ break;
+
+ default:
+ break;
+ }
+ }
+ len += snprintf(buf + len, BUF_SIZE - len, "\n");
+ mlx4_err(dev, "%s", buf);
+
+ if (len >= BUF_SIZE)
+ mlx4_err(dev, "Network rule error message was truncated, print buffer is too small\n");
+}
+
+int mlx4_flow_attach(struct mlx4_dev *dev,
+ struct mlx4_net_trans_rule *rule, u64 *reg_id)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_spec_list *cur;
+ u32 size = 0;
+ int ret;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ trans_rule_ctrl_to_hw(rule, mailbox->buf);
+
+ size += sizeof(struct mlx4_net_trans_rule_hw_ctrl);
+
+ list_for_each_entry(cur, &rule->list, list) {
+ ret = parse_trans_rule(dev, cur, mailbox->buf + size);
+ if (ret < 0) {
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return ret;
+ }
+ size += ret;
+ }
+
+ ret = mlx4_QP_FLOW_STEERING_ATTACH(dev, mailbox, size >> 2, reg_id);
+ if (ret == -ENOMEM) {
+ mlx4_err_rule(dev,
+ "mcg table is full. Fail to register network rule\n",
+ rule);
+ } else if (ret) {
+ if (ret == -ENXIO) {
+ if (dev->caps.steering_mode != MLX4_STEERING_MODE_DEVICE_MANAGED)
+ mlx4_err_rule(dev,
+ "DMFS is not enabled, "
+ "failed to register network rule.\n",
+ rule);
+ else
+ mlx4_err_rule(dev,
+ "Rule exceeds the dmfs_high_rate_mode limitations, "
+ "failed to register network rule.\n",
+ rule);
+
+ } else {
+ mlx4_err_rule(dev, "Fail to register network rule.\n", rule);
+ }
+ }
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(mlx4_flow_attach);
+
+int mlx4_flow_detach(struct mlx4_dev *dev, u64 reg_id)
+{
+ int err;
+
+ err = mlx4_QP_FLOW_STEERING_DETACH(dev, reg_id);
+ if (err)
+ mlx4_err(dev, "Fail to detach network rule. registration id = 0x%llx\n",
+ reg_id);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_flow_detach);
+
+int mlx4_tunnel_steer_add(struct mlx4_dev *dev, unsigned char *addr,
+ int port, int qpn, u16 prio, u64 *reg_id)
+{
+ int err;
+ struct mlx4_spec_list spec_eth_outer = { {NULL} };
+ struct mlx4_spec_list spec_vxlan = { {NULL} };
+ struct mlx4_spec_list spec_eth_inner = { {NULL} };
+
+ struct mlx4_net_trans_rule rule = {
+ .queue_mode = MLX4_NET_TRANS_Q_FIFO,
+ .exclusive = 0,
+ .allow_loopback = 1,
+ .promisc_mode = MLX4_FS_REGULAR,
+ };
+
+ __be64 mac_mask = cpu_to_be64(MLX4_MAC_MASK << 16);
+
+ rule.port = port;
+ rule.qpn = qpn;
+ rule.priority = prio;
+ INIT_LIST_HEAD(&rule.list);
+
+ spec_eth_outer.id = MLX4_NET_TRANS_RULE_ID_ETH;
+ memcpy(spec_eth_outer.eth.dst_mac, addr, ETH_ALEN);
+ memcpy(spec_eth_outer.eth.dst_mac_msk, &mac_mask, ETH_ALEN);
+
+ spec_vxlan.id = MLX4_NET_TRANS_RULE_ID_VXLAN; /* any vxlan header */
+ spec_eth_inner.id = MLX4_NET_TRANS_RULE_ID_ETH; /* any inner eth header */
+
+ list_add_tail(&spec_eth_outer.list, &rule.list);
+ list_add_tail(&spec_vxlan.list, &rule.list);
+ list_add_tail(&spec_eth_inner.list, &rule.list);
+
+ err = mlx4_flow_attach(dev, &rule, reg_id);
+ return err;
+}
+EXPORT_SYMBOL(mlx4_tunnel_steer_add);
+
+int mlx4_FLOW_STEERING_IB_UC_QP_RANGE(struct mlx4_dev *dev, u32 min_range_qpn,
+ u32 max_range_qpn)
+{
+ int err;
+ u64 in_param;
+
+ in_param = ((u64) min_range_qpn) << 32;
+ in_param |= ((u64) max_range_qpn) & 0xFFFFFFFF;
+
+ err = mlx4_cmd(dev, in_param, 0, 0,
+ MLX4_FLOW_STEERING_IB_UC_QP_RANGE,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_FLOW_STEERING_IB_UC_QP_RANGE);
+
+int mlx4_qp_attach_common(struct mlx4_dev *dev, struct mlx4_qp *qp, u8 gid[16],
+ int block_mcast_loopback, enum mlx4_protocol prot,
+ enum mlx4_steer_type steer)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_mgm *mgm;
+ u32 members_count;
+ int index, prev;
+ int link = 0;
+ int i;
+ int err;
+ u8 port = gid[5];
+ u8 new_entry = 0;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ mgm = mailbox->buf;
+
+ mutex_lock(&priv->mcg_table.mutex);
+ err = find_entry(dev, port, gid, prot,
+ mailbox, &prev, &index);
+ if (err)
+ goto out;
+
+ if (index != -1) {
+ if (!(be32_to_cpu(mgm->members_count) & 0xffffff)) {
+ new_entry = 1;
+ memcpy(mgm->gid, gid, 16);
+ }
+ } else {
+ link = 1;
+
+ index = mlx4_bitmap_alloc(&priv->mcg_table.bitmap);
+ if (index == -1) {
+ mlx4_err(dev, "No AMGM entries left\n");
+ err = -ENOMEM;
+ goto out;
+ }
+ index += dev->caps.num_mgms;
+
+ new_entry = 1;
+ memset(mgm, 0, sizeof *mgm);
+ memcpy(mgm->gid, gid, 16);
+ }
+
+ members_count = be32_to_cpu(mgm->members_count) & 0xffffff;
+ if (members_count == dev->caps.num_qp_per_mgm) {
+ mlx4_err(dev, "MGM at index %x is full\n", index);
+ err = -ENOMEM;
+ goto out;
+ }
+
+ for (i = 0; i < members_count; ++i)
+ if ((be32_to_cpu(mgm->qp[i]) & MGM_QPN_MASK) == qp->qpn) {
+ mlx4_dbg(dev, "QP %06x already a member of MGM\n", qp->qpn);
+ err = 0;
+ goto out;
+ }
+
+ if (block_mcast_loopback)
+ mgm->qp[members_count++] = cpu_to_be32((qp->qpn & MGM_QPN_MASK) |
+ (1U << MGM_BLCK_LB_BIT));
+ else
+ mgm->qp[members_count++] = cpu_to_be32(qp->qpn & MGM_QPN_MASK);
+
+ mgm->members_count = cpu_to_be32(members_count | (u32) prot << 30);
+
+ err = mlx4_WRITE_ENTRY(dev, index, mailbox);
+ if (err)
+ goto out;
+
+ if (!link)
+ goto out;
+
+ err = mlx4_READ_ENTRY(dev, prev, mailbox);
+ if (err)
+ goto out;
+
+ mgm->next_gid_index = cpu_to_be32(index << 6);
+
+ err = mlx4_WRITE_ENTRY(dev, prev, mailbox);
+ if (err)
+ goto out;
+
+out:
+ if (prot == MLX4_PROT_ETH) {
+ /* manage the steering entry for promisc mode */
+ if (new_entry)
+ new_steering_entry(dev, port, steer, index, qp->qpn);
+ else
+ existing_steering_entry(dev, port, steer,
+ index, qp->qpn);
+ }
+ if (err && link && index != -1) {
+ if (index < dev->caps.num_mgms)
+ mlx4_warn(dev, "Got AMGM index %d < %d\n",
+ index, dev->caps.num_mgms);
+ else
+ mlx4_bitmap_free(&priv->mcg_table.bitmap,
+ index - dev->caps.num_mgms, MLX4_USE_RR);
+ }
+ mutex_unlock(&priv->mcg_table.mutex);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+
+int mlx4_qp_detach_common(struct mlx4_dev *dev, struct mlx4_qp *qp, u8 gid[16],
+ enum mlx4_protocol prot, enum mlx4_steer_type steer)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_mgm *mgm;
+ u32 members_count;
+ int prev, index;
+ int i, loc = -1;
+ int err;
+ u8 port = gid[5];
+ bool removed_entry = false;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ mgm = mailbox->buf;
+
+ mutex_lock(&priv->mcg_table.mutex);
+
+ err = find_entry(dev, port, gid, prot,
+ mailbox, &prev, &index);
+ if (err)
+ goto out;
+
+ if (index == -1) {
+ mlx4_err(dev, "MGID %pI6 not found\n", gid);
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* If this QP is also a promisc QP, it is kept (not removed) only if
+ * at least one non-promisc QP is also attached to this MCG
+ */
+ if (prot == MLX4_PROT_ETH &&
+ check_duplicate_entry(dev, port, steer, index, qp->qpn) &&
+ !promisc_steering_entry(dev, port, steer, index, qp->qpn, NULL))
+ goto out;
+
+ members_count = be32_to_cpu(mgm->members_count) & 0xffffff;
+ for (i = 0; i < members_count; ++i)
+ if ((be32_to_cpu(mgm->qp[i]) & MGM_QPN_MASK) == qp->qpn) {
+ loc = i;
+ break;
+ }
+
+ if (loc == -1) {
+ mlx4_err(dev, "QP %06x not found in MGM\n", qp->qpn);
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* copy the last QP in this MGM over removed QP */
+ mgm->qp[loc] = mgm->qp[members_count - 1];
+ mgm->qp[members_count - 1] = 0;
+ mgm->members_count = cpu_to_be32(--members_count | (u32) prot << 30);
+
+ if (prot == MLX4_PROT_ETH)
+ removed_entry = can_remove_steering_entry(dev, port, steer,
+ index, qp->qpn);
+ if (members_count && (prot != MLX4_PROT_ETH || !removed_entry)) {
+ err = mlx4_WRITE_ENTRY(dev, index, mailbox);
+ goto out;
+ }
+
+ /* We are going to delete the entry, members count should be 0 */
+ mgm->members_count = cpu_to_be32((u32) prot << 30);
+
+ if (prev == -1) {
+ /* Remove entry from MGM */
+ int amgm_index = be32_to_cpu(mgm->next_gid_index) >> 6;
+ if (amgm_index) {
+ err = mlx4_READ_ENTRY(dev, amgm_index, mailbox);
+ if (err)
+ goto out;
+ } else
+ memset(mgm->gid, 0, 16);
+
+ err = mlx4_WRITE_ENTRY(dev, index, mailbox);
+ if (err)
+ goto out;
+
+ if (amgm_index) {
+ if (amgm_index < dev->caps.num_mgms)
+ mlx4_warn(dev, "MGM entry %d had AMGM index %d < %d\n",
+ index, amgm_index, dev->caps.num_mgms);
+ else
+ mlx4_bitmap_free(&priv->mcg_table.bitmap,
+ amgm_index - dev->caps.num_mgms, MLX4_USE_RR);
+ }
+ } else {
+ /* Remove entry from AMGM */
+ int cur_next_index = be32_to_cpu(mgm->next_gid_index) >> 6;
+ err = mlx4_READ_ENTRY(dev, prev, mailbox);
+ if (err)
+ goto out;
+
+ mgm->next_gid_index = cpu_to_be32(cur_next_index << 6);
+
+ err = mlx4_WRITE_ENTRY(dev, prev, mailbox);
+ if (err)
+ goto out;
+
+ if (index < dev->caps.num_mgms)
+ mlx4_warn(dev, "entry %d had next AMGM index %d < %d\n",
+ prev, index, dev->caps.num_mgms);
+ else
+ mlx4_bitmap_free(&priv->mcg_table.bitmap,
+ index - dev->caps.num_mgms, MLX4_USE_RR);
+ }
+
+out:
+ mutex_unlock(&priv->mcg_table.mutex);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ if (err && dev->persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR)
+ /* In case device is under an error, return success as a closing command */
+ err = 0;
+ return err;
+}
+
+static int mlx4_QP_ATTACH(struct mlx4_dev *dev, struct mlx4_qp *qp,
+ u8 gid[16], u8 attach, u8 block_loopback,
+ enum mlx4_protocol prot)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ int err = 0;
+ int qpn;
+
+ if (!mlx4_is_mfunc(dev))
+ return -EBADF;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ memcpy(mailbox->buf, gid, 16);
+ qpn = qp->qpn;
+ qpn |= (prot << 28);
+ if (attach && block_loopback)
+ qpn |= (1 << 31);
+
+ err = mlx4_cmd(dev, mailbox->dma, qpn, attach,
+ MLX4_CMD_QP_ATTACH, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_WRAPPED);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ if (err && !attach &&
+ dev->persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR)
+ err = 0;
+ return err;
+}
+
+int mlx4_trans_to_dmfs_attach(struct mlx4_dev *dev, struct mlx4_qp *qp,
+ u8 gid[16], u8 port,
+ int block_mcast_loopback,
+ enum mlx4_protocol prot, u64 *reg_id)
+{
+ struct mlx4_spec_list spec = { {NULL} };
+ __be64 mac_mask = cpu_to_be64(MLX4_MAC_MASK << 16);
+
+ struct mlx4_net_trans_rule rule = {
+ .queue_mode = MLX4_NET_TRANS_Q_FIFO,
+ .exclusive = 0,
+ .promisc_mode = MLX4_FS_REGULAR,
+ .priority = MLX4_DOMAIN_NIC,
+ };
+
+ rule.allow_loopback = !block_mcast_loopback;
+ rule.port = port;
+ rule.qpn = qp->qpn;
+ INIT_LIST_HEAD(&rule.list);
+
+ switch (prot) {
+ case MLX4_PROT_ETH:
+ spec.id = MLX4_NET_TRANS_RULE_ID_ETH;
+ memcpy(spec.eth.dst_mac, &gid[10], ETH_ALEN);
+ memcpy(spec.eth.dst_mac_msk, &mac_mask, ETH_ALEN);
+ break;
+
+ case MLX4_PROT_IB_IPV4:
+ spec.id = MLX4_NET_TRANS_RULE_ID_IB;
+ memcpy(spec.ib.dst_gid + 12, gid + 12, 4);
+ memset(spec.ib.dst_gid_msk + 12, 0xff, 4);
+ spec.ib.roce_type = MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV4;
+
+ break;
+ case MLX4_PROT_IB_IPV6:
+ spec.id = MLX4_NET_TRANS_RULE_ID_IB;
+ memcpy(spec.ib.dst_gid, gid, 16);
+ memset(spec.ib.dst_gid_msk, 0xff, 16);
+ spec.ib.roce_type = MLX4_FLOW_SPEC_IB_ROCE_TYPE_IPV6;
+ break;
+ default:
+ return -EINVAL;
+ }
+ list_add_tail(&spec.list, &rule.list);
+
+ return mlx4_flow_attach(dev, &rule, reg_id);
+}
+
+int mlx4_multicast_attach(struct mlx4_dev *dev, struct mlx4_qp *qp, u8 gid[16],
+ u8 port, int block_mcast_loopback,
+ enum mlx4_protocol prot, u64 *reg_id)
+{
+ enum mlx4_steer_type steer;
+ steer = (is_valid_ether_addr(&gid[10])) ? MLX4_UC_STEER : MLX4_MC_STEER;
+
+ switch (dev->caps.steering_mode) {
+ case MLX4_STEERING_MODE_A0:
+ if (prot == MLX4_PROT_ETH)
+ return 0;
+ /* fall through */
+ case MLX4_STEERING_MODE_B0:
+ if (prot == MLX4_PROT_ETH)
+ gid[7] |= (steer << 1);
+
+ if (mlx4_is_mfunc(dev))
+ return mlx4_QP_ATTACH(dev, qp, gid, 1,
+ block_mcast_loopback, prot);
+ return mlx4_qp_attach_common(dev, qp, gid,
+ block_mcast_loopback, prot,
+ MLX4_MC_STEER);
+
+ case MLX4_STEERING_MODE_DEVICE_MANAGED:
+ return mlx4_trans_to_dmfs_attach(dev, qp, gid, port,
+ block_mcast_loopback,
+ prot, reg_id);
+ default:
+ return -EINVAL;
+ }
+}
+EXPORT_SYMBOL_GPL(mlx4_multicast_attach);
+
+int mlx4_multicast_detach(struct mlx4_dev *dev, struct mlx4_qp *qp, u8 gid[16],
+ enum mlx4_protocol prot, u64 reg_id)
+{
+ enum mlx4_steer_type steer;
+ steer = (is_valid_ether_addr(&gid[10])) ? MLX4_UC_STEER : MLX4_MC_STEER;
+
+ switch (dev->caps.steering_mode) {
+ case MLX4_STEERING_MODE_A0:
+ if (prot == MLX4_PROT_ETH)
+ return 0;
+ /* fall through */
+ case MLX4_STEERING_MODE_B0:
+ if (prot == MLX4_PROT_ETH)
+ gid[7] |= (steer << 1);
+
+ if (mlx4_is_mfunc(dev))
+ return mlx4_QP_ATTACH(dev, qp, gid, 0, 0, prot);
+
+ return mlx4_qp_detach_common(dev, qp, gid, prot,
+ MLX4_MC_STEER);
+
+ case MLX4_STEERING_MODE_DEVICE_MANAGED:
+ return mlx4_flow_detach(dev, reg_id);
+
+ default:
+ return -EINVAL;
+ }
+}
+EXPORT_SYMBOL_GPL(mlx4_multicast_detach);
+
+int mlx4_flow_steer_promisc_add(struct mlx4_dev *dev, u8 port,
+ u32 qpn, enum mlx4_net_trans_promisc_mode mode)
+{
+ struct mlx4_net_trans_rule rule = { .queue_mode = MLX4_NET_TRANS_Q_FIFO, .allow_loopback = 1 };
+ u64 *regid_p;
+
+ switch (mode) {
+ case MLX4_FS_ALL_DEFAULT:
+ regid_p = &dev->regid_promisc_array[port];
+ break;
+ case MLX4_FS_MC_DEFAULT:
+ regid_p = &dev->regid_allmulti_array[port];
+ break;
+ default:
+ return -1;
+ }
+
+ if (*regid_p != 0)
+ return -1;
+
+ rule.promisc_mode = mode;
+ rule.port = port;
+ rule.qpn = qpn;
+ INIT_LIST_HEAD(&rule.list);
+ mlx4_err(dev, "going promisc on %x\n", port);
+
+ return mlx4_flow_attach(dev, &rule, regid_p);
+}
+EXPORT_SYMBOL_GPL(mlx4_flow_steer_promisc_add);
+
+int mlx4_flow_steer_promisc_remove(struct mlx4_dev *dev, u8 port,
+ enum mlx4_net_trans_promisc_mode mode)
+{
+ int ret;
+ u64 *regid_p;
+
+ switch (mode) {
+ case MLX4_FS_ALL_DEFAULT:
+ regid_p = &dev->regid_promisc_array[port];
+ break;
+ case MLX4_FS_MC_DEFAULT:
+ regid_p = &dev->regid_allmulti_array[port];
+ break;
+ default:
+ return -1;
+ }
+
+ if (*regid_p == 0)
+ return -1;
+
+ ret = mlx4_flow_detach(dev, *regid_p);
+ if (ret == 0)
+ *regid_p = 0;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(mlx4_flow_steer_promisc_remove);
+
+int mlx4_unicast_attach(struct mlx4_dev *dev,
+ struct mlx4_qp *qp, u8 gid[16],
+ int block_mcast_loopback, enum mlx4_protocol prot)
+{
+ if (prot == MLX4_PROT_ETH)
+ gid[7] |= (MLX4_UC_STEER << 1);
+
+ if (mlx4_is_mfunc(dev))
+ return mlx4_QP_ATTACH(dev, qp, gid, 1,
+ block_mcast_loopback, prot);
+
+ return mlx4_qp_attach_common(dev, qp, gid, block_mcast_loopback,
+ prot, MLX4_UC_STEER);
+}
+EXPORT_SYMBOL_GPL(mlx4_unicast_attach);
+
+int mlx4_unicast_detach(struct mlx4_dev *dev, struct mlx4_qp *qp,
+ u8 gid[16], enum mlx4_protocol prot)
+{
+ if (prot == MLX4_PROT_ETH)
+ gid[7] |= (MLX4_UC_STEER << 1);
+
+ if (mlx4_is_mfunc(dev))
+ return mlx4_QP_ATTACH(dev, qp, gid, 0, 0, prot);
+
+ return mlx4_qp_detach_common(dev, qp, gid, prot, MLX4_UC_STEER);
+}
+EXPORT_SYMBOL_GPL(mlx4_unicast_detach);
+
+int mlx4_PROMISC_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ u32 qpn = (u32) vhcr->in_param & 0xffffffff;
+ int port = mlx4_slave_convert_port(dev, slave, vhcr->in_param >> 62);
+ enum mlx4_steer_type steer = vhcr->in_modifier;
+
+ if (port < 0)
+ return -EINVAL;
+
+ /* Promiscuous unicast is not allowed in mfunc */
+ if (mlx4_is_mfunc(dev) && steer == MLX4_UC_STEER)
+ return 0;
+
+ if (vhcr->op_modifier)
+ return add_promisc_qp(dev, port, steer, qpn);
+ else
+ return remove_promisc_qp(dev, port, steer, qpn);
+}
+
+static int mlx4_PROMISC(struct mlx4_dev *dev, u32 qpn,
+ enum mlx4_steer_type steer, u8 add, u8 port)
+{
+ return mlx4_cmd(dev, (u64) qpn | (u64) port << 62, (u32) steer, add,
+ MLX4_CMD_PROMISC, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_WRAPPED);
+}
+
+int mlx4_multicast_promisc_add(struct mlx4_dev *dev, u32 qpn, u8 port)
+{
+ if (mlx4_is_mfunc(dev))
+ return mlx4_PROMISC(dev, qpn, MLX4_MC_STEER, 1, port);
+
+ return add_promisc_qp(dev, port, MLX4_MC_STEER, qpn);
+}
+EXPORT_SYMBOL_GPL(mlx4_multicast_promisc_add);
+
+int mlx4_multicast_promisc_remove(struct mlx4_dev *dev, u32 qpn, u8 port)
+{
+ if (mlx4_is_mfunc(dev))
+ return mlx4_PROMISC(dev, qpn, MLX4_MC_STEER, 0, port);
+
+ return remove_promisc_qp(dev, port, MLX4_MC_STEER, qpn);
+}
+EXPORT_SYMBOL_GPL(mlx4_multicast_promisc_remove);
+
+int mlx4_unicast_promisc_add(struct mlx4_dev *dev, u32 qpn, u8 port)
+{
+ if (mlx4_is_mfunc(dev))
+ return mlx4_PROMISC(dev, qpn, MLX4_UC_STEER, 1, port);
+
+ return add_promisc_qp(dev, port, MLX4_UC_STEER, qpn);
+}
+EXPORT_SYMBOL_GPL(mlx4_unicast_promisc_add);
+
+int mlx4_unicast_promisc_remove(struct mlx4_dev *dev, u32 qpn, u8 port)
+{
+ if (mlx4_is_mfunc(dev))
+ return mlx4_PROMISC(dev, qpn, MLX4_UC_STEER, 0, port);
+
+ return remove_promisc_qp(dev, port, MLX4_UC_STEER, qpn);
+}
+EXPORT_SYMBOL_GPL(mlx4_unicast_promisc_remove);
+
+int mlx4_init_mcg_table(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int err;
+
+ /* No need for mcg_table when fw manages the mcg table */
+ if (dev->caps.steering_mode ==
+ MLX4_STEERING_MODE_DEVICE_MANAGED)
+ return 0;
+ err = mlx4_bitmap_init(&priv->mcg_table.bitmap, dev->caps.num_amgms,
+ dev->caps.num_amgms - 1, 0, 0);
+ if (err)
+ return err;
+
+ mutex_init(&priv->mcg_table.mutex);
+
+ return 0;
+}
+
+void mlx4_cleanup_mcg_table(struct mlx4_dev *dev)
+{
+ if (dev->caps.steering_mode !=
+ MLX4_STEERING_MODE_DEVICE_MANAGED)
+ mlx4_bitmap_cleanup(&mlx4_priv(dev)->mcg_table.bitmap);
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/mlx4.h b/drivers/net/mlnx_uio/mlnx/mlx4/mlx4.h
new file mode 100644
index 0000000..ec881a3
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/mlx4.h
@@ -0,0 +1,1514 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved.
+ * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007 Cisco Systems. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007, 2008 Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2004 Voltaire, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef MLX4_H
+#define MLX4_H
+
+
+#include "fw_qos.h"
+#include "mlx4/device.h"
+#include "mlx4/driver.h"
+#include "mlx4/cmd.h"
+#include "rbtree.h"
+
+#define DRV_NAME "mlx4_core"
+#define PFX DRV_NAME ": "
+#define DRV_VERSION "3.0-1.0.1"
+#define DRV_RELDATE "Feb, 2014"
+
+#define MLX4_FS_NUM_OF_L2_ADDR 8
+#define MLX4_FS_MGM_LOG_ENTRY_SIZE 7
+#define MLX4_FS_NUM_MCG (1 << 17)
+
+#define INIT_HCA_TPT_MW_ENABLE (1 << 7)
+
+enum {
+ MLX4_INGRESS_PARSER_MODE_STANDARD = 0,
+ MLX4_INGRESS_PARSER_MODE_NON_L4_CSUM_OFFLOAD = 1,
+ MLX4_INGRESS_PARSER_MODE_MAX = 2
+};
+
+enum {
+ MLX4_HCR_BASE = 0x80680,
+ MLX4_HCR_SIZE = 0x0001c,
+ MLX4_CLR_INT_SIZE = 0x00008,
+ MLX4_SLAVE_COMM_BASE = 0x0,
+ MLX4_COMM_PAGESIZE = 0x1000,
+ MLX4_CLOCK_SIZE = 0x00008,
+ MLX4_COMM_CHAN_CAPS = 0x8,
+ MLX4_COMM_CHAN_FLAGS = 0xc
+};
+
+enum {
+ MLX4_DEFAULT_MGM_LOG_ENTRY_SIZE = 10,
+ MLX4_MIN_MGM_LOG_ENTRY_SIZE = 7,
+ MLX4_MAX_MGM_LOG_ENTRY_SIZE = 12,
+ MLX4_MAX_QP_PER_MGM = 4 * ((1 << MLX4_MAX_MGM_LOG_ENTRY_SIZE) / 16 - 2),
+ MLX4_MTT_ENTRY_PER_SEG = 8,
+};
+
+enum {
+ MLX4_NUM_PDS = 1 << 15
+};
+
+enum {
+ MLX4_NUM_A0_MAC_PER_PORT = 256
+};
+
+enum {
+ MLX4_CMPT_TYPE_QP = 0,
+ MLX4_CMPT_TYPE_SRQ = 1,
+ MLX4_CMPT_TYPE_CQ = 2,
+ MLX4_CMPT_TYPE_EQ = 3,
+ MLX4_CMPT_NUM_TYPE
+};
+
+enum {
+ MLX4_CMPT_SHIFT = 24,
+ MLX4_NUM_CMPTS = MLX4_CMPT_NUM_TYPE << MLX4_CMPT_SHIFT
+};
+
+enum mlx4_mpt_state {
+ MLX4_MPT_DISABLED = 0,
+ MLX4_MPT_EN_HW,
+ MLX4_MPT_EN_SW
+};
+
+#define MLX4_COMM_TIME 10000
+#define MLX4_COMM_OFFLINE_TIME_OUT 30000
+#define MLX4_COMM_CMD_NA_OP 0x0
+
+
+enum {
+ MLX4_COMM_CMD_RESET,
+ MLX4_COMM_CMD_VHCR0,
+ MLX4_COMM_CMD_VHCR1,
+ MLX4_COMM_CMD_VHCR2,
+ MLX4_COMM_CMD_VHCR_EN,
+ MLX4_COMM_CMD_VHCR_POST,
+ MLX4_COMM_CMD_FLR = 254
+};
+
+enum {
+ MLX4_VF_SMI_DISABLED,
+ MLX4_VF_SMI_ENABLED
+};
+
+/* The flag indicates that the slave should delay the RESET cmd */
+#define MLX4_DELAY_RESET_SLAVE 0xbbbbbbb
+/* Number of retries attempted if we are in the middle of an FLR */
+#define NUM_OF_RESET_RETRIES 10
+#define SLEEP_TIME_IN_RESET (2 * 1000)
+enum mlx4_resource {
+ RES_QP,
+ RES_CQ,
+ RES_SRQ,
+ RES_XRCD,
+ RES_MPT,
+ RES_MTT,
+ RES_MAC,
+ RES_VLAN,
+ RES_NPORT_ID,
+ RES_COUNTER,
+ RES_FS_RULE,
+ RES_EQ,
+ MLX4_NUM_OF_RESOURCE_TYPE
+};
+
+enum mlx4_alloc_mode {
+ RES_OP_RESERVE,
+ RES_OP_RESERVE_AND_MAP,
+ RES_OP_MAP_ICM,
+};
+
+enum mlx4_res_tracker_free_type {
+ RES_TR_FREE_ALL,
+ RES_TR_FREE_SLAVES_ONLY,
+ RES_TR_FREE_STRUCTS_ONLY,
+};
+
+/*
+ * Virtual HCR structures.
+ * mlx4_vhcr is the sw representation, in machine endianness
+ *
+ * mlx4_vhcr_cmd is the formalized structure, the one that is passed
+ * to FW to go through communication channel.
+ * It is big endian, and has the same structure as the physical HCR
+ * used by command interface
+ */
+struct mlx4_vhcr {
+ u64 in_param;
+ u64 out_param;
+ u32 in_modifier;
+ u32 err_no;
+ u16 op;
+ u16 token;
+ u8 op_modifier;
+ u8 e_bit;
+};
+
+struct mlx4_vhcr_cmd {
+ __be64 in_param;
+ __be32 in_modifier;
+ u32 reserved1;
+ __be64 out_param;
+ __be16 token;
+ u16 reserved;
+ u8 status;
+ u8 flags;
+ __be16 opcode;
+};
+
+struct mlx4_cmd_info {
+ u16 opcode;
+ bool has_inbox;
+ bool has_outbox;
+ bool out_is_imm;
+ bool encode_slave_id;
+ int (*verify)(struct mlx4_dev *dev, int slave, struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox);
+ int (*wrapper)(struct mlx4_dev *dev, int slave, struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+};
+
+#ifdef CONFIG_MLX4_DEBUG
+extern int mlx4_debug_level;
+#else /* CONFIG_MLX4_DEBUG */
+#define mlx4_debug_level (0)
+#endif /* CONFIG_MLX4_DEBUG */
+
+#define mlx4_dbg(mdev, format, ...) \
+do { \
+ if (mlx4_debug_level) \
+ dev_printk(KERN_DEBUG, \
+ &(mdev)->persist->pdev->dev, format, \
+ ##__VA_ARGS__); \
+} while (0)
+
+#define mlx4_err(mdev, format, ...) \
+ dev_err(&(mdev)->persist->pdev->dev, format, ##__VA_ARGS__)
+#define mlx4_info(mdev, format, ...) \
+ dev_info(&(mdev)->persist->pdev->dev, format, ##__VA_ARGS__)
+#define mlx4_warn(mdev, format, ...) \
+ dev_warn(&(mdev)->persist->pdev->dev, format, ##__VA_ARGS__)
+
+extern int mlx4_log_num_mgm_entry_size;
+extern int log_mtts_per_seg;
+extern int mlx4_internal_err_reset;
+extern int ingress_parser_mode;
+
+#define MLX4_MAX_NUM_SLAVES (RTE_MIN(MLX4_MAX_NUM_PF + MLX4_MAX_NUM_VF, MLX4_MFUNC_MAX))
+#define ALL_SLAVES 0xff
+
+struct mlx4_bitmap {
+ u32 last;
+ u32 top;
+ u32 max;
+ u32 reserved_top;
+ u32 mask;
+ u32 avail;
+ u32 effective_len;
+ spinlock_t lock;
+ unsigned long *table;
+};
+
+struct mlx4_buddy {
+ unsigned long **bits;
+ unsigned int *num_free;
+ u32 max_order;
+ spinlock_t lock;
+};
+
+struct mlx4_icm;
+
+struct mlx4_icm_table {
+ u64 virt;
+ int num_icm;
+ u32 num_obj;
+ int obj_size;
+ int lowmem;
+ int coherent;
+ struct mutex mutex;
+ struct mlx4_icm **icm;
+};
+
+#define MLX4_MPT_FLAG_SW_OWNS (0xfUL << 28)
+#define MLX4_MPT_FLAG_FREE (0x3UL << 28)
+#define MLX4_MPT_FLAG_MIO (1 << 17)
+#define MLX4_MPT_FLAG_BIND_ENABLE (1 << 15)
+#define MLX4_MPT_FLAG_PHYSICAL (1 << 9)
+#define MLX4_MPT_FLAG_REGION (1 << 8)
+
+#define MLX4_MPT_PD_MASK (0x1FFFFUL)
+#define MLX4_MPT_PD_VF_MASK (0xFE0000UL)
+#define MLX4_MPT_PD_FLAG_FAST_REG (1 << 27)
+#define MLX4_MPT_PD_FLAG_RAE (1 << 28)
+#define MLX4_MPT_PD_FLAG_EN_INV (3 << 24)
+
+#define MLX4_MPT_QP_FLAG_BOUND_QP (1 << 7)
+
+#define MLX4_MPT_STATUS_SW 0xF0
+#define MLX4_MPT_STATUS_HW 0x00
+
+#define MLX4_CQE_SIZE_MASK_STRIDE 0x3
+#define MLX4_EQE_SIZE_MASK_STRIDE 0x30
+
+#define MLX4_EQ_ASYNC 0
+#define MLX4_EQ_TO_CQ_VECTOR(vector) ((vector) - \
+ !!((int)(vector) >= MLX4_EQ_ASYNC))
+#define MLX4_CQ_TO_EQ_VECTOR(vector) ((vector) + \
+ !!((int)(vector) >= MLX4_EQ_ASYNC))
+
+/*
+ * Must be packed because mtt_addr is 64 bits but only aligned to 32 bits.
+ */
+struct mlx4_mpt_entry {
+ __be32 flags;
+ __be32 qpn;
+ __be32 key;
+ __be32 pd_flags;
+ __be64 start;
+ __be64 length;
+ __be32 lkey;
+ __be32 win_cnt;
+ u8 reserved1[3];
+ u8 mtt_rep;
+ __be64 mtt_addr;
+ __be32 mtt_sz;
+ __be32 entity_size;
+ __be32 first_byte_offset;
+} __packed;
+
+/*
+ * Must be packed because start is 64 bits but only aligned to 32 bits.
+ */
+struct mlx4_eq_context {
+ __be32 flags;
+ u16 reserved1[3];
+ __be16 page_offset;
+ u8 log_eq_size;
+ u8 reserved2[4];
+ u8 eq_period;
+ u8 reserved3;
+ u8 eq_max_count;
+ u8 reserved4[3];
+ u8 intr;
+ u8 log_page_size;
+ u8 reserved5[2];
+ u8 mtt_base_addr_h;
+ __be32 mtt_base_addr_l;
+ u32 reserved6[2];
+ __be32 consumer_index;
+ __be32 producer_index;
+ u32 reserved7[4];
+};
+
+struct mlx4_cq_context {
+ __be32 flags;
+ u16 reserved1[3];
+ __be16 page_offset;
+ __be32 logsize_usrpage;
+ __be16 cq_period;
+ __be16 cq_max_count;
+ u8 reserved2[3];
+ u8 comp_eqn;
+ u8 log_page_size;
+ u8 reserved3[2];
+ u8 mtt_base_addr_h;
+ __be32 mtt_base_addr_l;
+ __be32 last_notified_index;
+ __be32 solicit_producer_index;
+ __be32 consumer_index;
+ __be32 producer_index;
+ u32 reserved4[2];
+ __be64 db_rec_addr;
+};
+
+struct mlx4_srq_context {
+ __be32 state_logsize_srqn;
+ u8 logstride;
+ u8 reserved1;
+ __be16 xrcd;
+ __be32 pg_offset_cqn;
+ u32 reserved2;
+ u8 log_page_size;
+ u8 reserved3[2];
+ u8 mtt_base_addr_h;
+ __be32 mtt_base_addr_l;
+ __be32 pd;
+ __be16 limit_watermark;
+ __be16 wqe_cnt;
+ u16 reserved4;
+ __be16 wqe_counter;
+ u32 reserved5;
+ __be64 db_rec_addr;
+};
+
+#ifdef KMOD_REMOVED
+struct mlx4_eq_tasklet {
+ struct list_head list;
+ struct list_head process_list;
+
+ struct tasklet_struct task;
+
+ /* lock on completion tasklet list */
+ spinlock_t lock;
+};
+#endif
+
+struct mlx4_eq {
+ struct mlx4_dev *dev;
+ void __iomem *doorbell;
+ int eqn;
+ u32 cons_index;
+ u16 irq;
+ u16 have_irq;
+ int nent;
+ struct mlx4_buf_list *page_list;
+ struct mlx4_mtt mtt;
+ u32 ncqs;
+#ifdef KMOD_REMOVED
+ struct mlx4_eq_tasklet tasklet_ctx;
+#endif
+ struct mlx4_active_ports actv_ports;
+ u32 ref_count;
+ u8 name_priority;
+#ifdef KMOD_MODIFIED
+ //struct list notifier_raw_list;
+#else
+ struct raw_notifier_head notifiers_list;
+ cpumask_var_t affinity_mask;
+#endif
+};
+
+struct mlx4_slave_eqe {
+ u8 type;
+ u8 port;
+ u32 param;
+};
+
+struct mlx4_slave_event_eq_info {
+ int eqn;
+ u16 token;
+};
+
+struct mlx4_profile {
+ int num_qp;
+ int rdmarc_per_qp;
+ int num_srq;
+ int num_cq;
+ int num_mcg;
+ int num_mpt;
+ unsigned num_mtt;
+};
+
+struct mlx4_fw {
+ u64 clr_int_base;
+ u64 catas_offset;
+ u64 comm_base;
+ u64 clock_offset;
+ struct mlx4_icm *fw_icm;
+ struct mlx4_icm *aux_icm;
+ u32 catas_size;
+ u16 fw_pages;
+ u8 clr_int_bar;
+ u8 catas_bar;
+ u8 comm_bar;
+ u8 clock_bar;
+};
+
+struct mlx4_comm {
+ u32 slave_write;
+ u32 slave_read;
+};
+
+enum {
+ MLX4_MCAST_CONFIG = 0,
+ MLX4_MCAST_DISABLE = 1,
+ MLX4_MCAST_ENABLE = 2,
+};
+
+#define VLAN_FLTR_SIZE 128
+
+struct mlx4_vlan_fltr {
+ __be32 entry[VLAN_FLTR_SIZE];
+};
+
+struct mlx4_mcast_entry {
+ struct list_head list;
+ u64 addr;
+};
+
+struct mlx4_promisc_qp {
+ struct list_head list;
+ u32 qpn;
+};
+
+struct mlx4_steer_index {
+ struct list_head list;
+ unsigned int index;
+ struct list_head duplicates;
+};
+
+#define MLX4_EVENT_TYPES_NUM 64
+
+struct mlx4_slave_state {
+ u8 comm_toggle;
+ u8 last_cmd;
+ u8 init_port_mask;
+ bool active;
+ bool old_vlan_api;
+ u8 function;
+ dma_addr_t vhcr_dma;
+ u16 mtu[MLX4_MAX_PORTS + 1];
+ __be32 ib_cap_mask[MLX4_MAX_PORTS + 1];
+ struct mlx4_slave_eqe eq[MLX4_MFUNC_MAX_EQES];
+ struct list_head mcast_filters[MLX4_MAX_PORTS + 1];
+ struct mlx4_vlan_fltr *vlan_filter[MLX4_MAX_PORTS + 1];
+ /* event type to eq number lookup */
+ struct mlx4_slave_event_eq_info event_eq[MLX4_EVENT_TYPES_NUM];
+ u16 eq_pi;
+ u16 eq_ci;
+ spinlock_t lock;
+ /* initialized via kzalloc */
+ u8 is_slave_going_down;
+ u32 cookie;
+ enum slave_port_state port_state[MLX4_MAX_PORTS + 1];
+ enum mlx4_roce_gid_type slave_gid_type;
+};
+
+#define MLX4_VGT 4095
+#define NO_INDX (-1)
+
+struct mlx4_vport_state {
+ u64 mac;
+ u16 default_vlan;
+ u8 default_qos;
+ u32 tx_rate;
+ bool spoofchk;
+ u32 link_state;
+ u8 qos_vport;
+ __be64 guid;
+};
+
+struct mlx4_vf_admin_state {
+ struct mlx4_vport_state vport[MLX4_MAX_PORTS + 1];
+ u8 enable_smi[MLX4_MAX_PORTS + 1];
+};
+
+struct mlx4_vport_oper_state {
+ struct mlx4_vport_state state;
+ int mac_idx;
+ int vlan_idx;
+};
+
+struct mlx4_vf_oper_state {
+ struct mlx4_vport_oper_state vport[MLX4_MAX_PORTS + 1];
+ u8 smi_enabled[MLX4_MAX_PORTS + 1];
+};
+
+struct slave_list {
+ struct mutex mutex;
+ struct list_head res_list[MLX4_NUM_OF_RESOURCE_TYPE];
+};
+
+struct resource_allocator {
+ spinlock_t alloc_lock; /* protect quotas */
+ union {
+ int res_reserved;
+ int res_port_rsvd[MLX4_MAX_PORTS];
+ };
+ union {
+ int res_free;
+ int res_port_free[MLX4_MAX_PORTS];
+ };
+ int *quota;
+ int *allocated;
+ int *guaranteed;
+};
+
+struct mlx4_resource_tracker {
+ spinlock_t lock;
+ /* one tree per resource type */
+ struct rb_root res_tree[MLX4_NUM_OF_RESOURCE_TYPE];
+ /* array of num_of_slave lists, one per slave */
+ struct slave_list *slave_list;
+ struct resource_allocator res_alloc[MLX4_NUM_OF_RESOURCE_TYPE];
+};
+
+#define SLAVE_EVENT_EQ_SIZE 128
+struct mlx4_slave_event_eq {
+ u32 eqn;
+ u32 cons;
+ u32 prod;
+ spinlock_t event_lock;
+ struct mlx4_eqe event_eqe[SLAVE_EVENT_EQ_SIZE];
+};
+
+struct mlx4_qos_manager {
+ int num_of_qos_vfs;
+ DECLARE_BITMAP(priority_bm, MLX4_NUM_UP);
+};
+
+struct mlx4_master_qp0_state {
+ int proxy_qp0_active;
+ int qp0_active;
+ int port_active;
+};
+
+struct mlx4_mfunc_master_ctx {
+ struct mlx4_slave_state *slave_state;
+ struct mlx4_vf_admin_state *vf_admin;
+ struct mlx4_vf_oper_state *vf_oper;
+ struct mlx4_master_qp0_state qp0_state[MLX4_MAX_PORTS + 1];
+ int init_port_ref[MLX4_MAX_PORTS + 1];
+ u16 max_mtu[MLX4_MAX_PORTS + 1];
+ int disable_mcast_ref[MLX4_MAX_PORTS + 1];
+ struct mlx4_resource_tracker res_tracker;
+#ifdef KMOD_REMOVED
+ struct workqueue_struct *comm_wq;
+ struct work_struct comm_work;
+ struct work_struct slave_event_work;
+ struct work_struct slave_flr_event_work;
+#endif
+ spinlock_t slave_state_lock;
+ __be32 comm_arm_bit_vector[4];
+ struct mlx4_eqe cmd_eqe;
+ struct mlx4_slave_event_eq slave_eq;
+ struct mutex gen_eqe_mutex[MLX4_MFUNC_MAX];
+ struct mlx4_qos_manager qos_ctl[MLX4_MAX_PORTS + 1];
+};
+
+struct mlx4_mfunc {
+ struct mlx4_comm __iomem *comm;
+ struct mlx4_vhcr_cmd *vhcr;
+ dma_addr_t vhcr_dma;
+
+ struct mlx4_mfunc_master_ctx master;
+};
+
+#define MGM_QPN_MASK 0x00FFFFFF
+#define MGM_BLCK_LB_BIT 30
+
+struct mlx4_mgm {
+ __be32 next_gid_index;
+ __be32 members_count;
+ u32 reserved[2];
+ u8 gid[16];
+ __be32 qp[MLX4_MAX_QP_PER_MGM];
+};
+
+struct mlx4_cmd {
+#ifdef KMOD_MODIFIED
+ //struct rte_ring* mailbox_buf_ring;
+#else
+ struct pci_pool *pool;
+#endif
+ void __iomem *hcr;
+ struct mutex slave_cmd_mutex;
+ struct semaphore poll_sem;
+ struct semaphore event_sem;
+ struct rw_semaphore switch_sem;
+ int max_cmds;
+ spinlock_t context_lock;
+ int free_head;
+ struct mlx4_cmd_context *context;
+ u16 token_mask;
+ u8 use_events;
+ u8 toggle;
+ u8 comm_toggle;
+ u8 initialized;
+};
+
+enum {
+ MLX4_VF_IMMED_VLAN_FLAG_VLAN = 1 << 0,
+ MLX4_VF_IMMED_VLAN_FLAG_QOS = 1 << 1,
+ MLX4_VF_IMMED_VLAN_FLAG_LINK_DISABLE = 1 << 2,
+};
+struct mlx4_vf_immed_vlan_work {
+#ifdef KMOD_REMOVED
+ struct work_struct work;
+#endif
+ struct mlx4_priv *priv;
+ int flags;
+ int slave;
+ int vlan_ix;
+ int orig_vlan_ix;
+ u8 port;
+ u8 qos;
+ u8 qos_vport;
+ u16 vlan_id;
+ u16 orig_vlan_id;
+};
+
+
+struct mlx4_uar_table {
+ struct mlx4_bitmap bitmap;
+};
+
+struct mlx4_mr_table {
+ struct mlx4_bitmap mpt_bitmap;
+ struct mlx4_buddy mtt_buddy;
+ u64 mtt_base;
+ u64 mpt_base;
+ struct mlx4_icm_table mtt_table;
+ struct mlx4_icm_table dmpt_table;
+};
+
+struct mlx4_cq_table {
+ struct mlx4_bitmap bitmap;
+ spinlock_t lock;
+ struct radix_tree_root tree;
+ struct mlx4_icm_table table;
+ struct mlx4_icm_table cmpt_table;
+};
+
+struct mlx4_eq_table {
+ struct mlx4_bitmap bitmap;
+ char *irq_names;
+ void __iomem *clr_int;
+ void __iomem **uar_map;
+ u32 clr_mask;
+ struct mlx4_eq *eq;
+ struct mlx4_icm_table table;
+ struct mlx4_icm_table cmpt_table;
+ int have_irq;
+ u8 inta_pin;
+};
+
+struct mlx4_srq_table {
+ struct mlx4_bitmap bitmap;
+ spinlock_t lock;
+ struct radix_tree_root tree;
+ struct mlx4_icm_table table;
+ struct mlx4_icm_table cmpt_table;
+};
+
+enum mlx4_qp_table_zones {
+ MLX4_QP_TABLE_ZONE_GENERAL,
+ MLX4_QP_TABLE_ZONE_RSS,
+ MLX4_QP_TABLE_ZONE_RAW_ETH,
+ MLX4_QP_TABLE_ZONE_NUM
+};
+
+struct mlx4_qp_table {
+ struct mlx4_bitmap *bitmap_gen;
+ struct mlx4_zone_allocator *zones;
+ u32 zones_uids[MLX4_QP_TABLE_ZONE_NUM];
+ u32 rdmarc_base;
+ int rdmarc_shift;
+ spinlock_t lock;
+ struct mlx4_icm_table qp_table;
+ struct mlx4_icm_table auxc_table;
+ struct mlx4_icm_table altc_table;
+ struct mlx4_icm_table rdmarc_table;
+ struct mlx4_icm_table cmpt_table;
+};
+
+struct mlx4_mcg_table {
+ struct mutex mutex;
+ struct mlx4_bitmap bitmap;
+ struct mlx4_icm_table table;
+};
+
+struct mlx4_catas_err {
+ u32 __iomem *map;
+#ifdef KMOD_REMOVED
+ struct timer_list timer;
+#endif
+ struct list_head list;
+};
+
+#define MLX4_MAX_MAC_NUM 128
+#define MLX4_MAC_TABLE_SIZE (MLX4_MAX_MAC_NUM << 3)
+
+struct mlx4_mac_table {
+ __be64 entries[MLX4_MAX_MAC_NUM];
+ int refs[MLX4_MAX_MAC_NUM];
+ struct mutex mutex;
+ int total;
+ int max;
+};
+
+struct mlx4_roce_info {
+ struct mlx4_roce_addr_table addr_table;
+ struct mutex mutex;
+};
+
+#define MLX4_MAX_VLAN_NUM 128
+#define MLX4_VLAN_TABLE_SIZE (MLX4_MAX_VLAN_NUM << 2)
+
+struct mlx4_vlan_table {
+ __be32 entries[MLX4_MAX_VLAN_NUM];
+ int refs[MLX4_MAX_VLAN_NUM];
+ struct mutex mutex;
+ int total;
+ int max;
+};
+
+#define SET_PORT_GEN_ALL_VALID 0x7
+#define SET_PORT_ROCE_1_5_FLAGS 0x30
+#define SET_PORT_ROCE_2_FLAGS 0x10
+#define SET_PORT_PROMISC_SHIFT 31
+#define SET_PORT_MC_PROMISC_SHIFT 30
+
+enum {
+ MCAST_DIRECT_ONLY = 0,
+ MCAST_DIRECT = 1,
+ MCAST_DEFAULT = 2
+};
+
+struct mlx4_set_port_general_context {
+ u16 reserved1;
+ u8 v_ignore_fcs;
+ u8 flags;
+ u8 roce_mode;
+ u8 rr_proto;
+ __be16 mtu;
+ u8 pptx;
+ u8 pfctx;
+ u16 reserved3;
+ u8 pprx;
+ u8 pfcrx;
+ u16 reserved4;
+};
+
+struct mlx4_set_port_rqp_calc_context {
+ __be32 base_qpn;
+ u8 rererved;
+ u8 n_mac;
+ u8 n_vlan;
+ u8 n_prio;
+ u8 reserved2[3];
+ u8 mac_miss;
+ u8 intra_no_vlan;
+ u8 no_vlan;
+ u8 intra_vlan_miss;
+ u8 vlan_miss;
+ u8 reserved3[3];
+ u8 no_vlan_prio;
+ __be32 promisc;
+ __be32 mcast;
+};
+
+struct mlx4_port_info {
+ struct mlx4_dev *dev;
+ int port;
+ char dev_name[16];
+#ifdef KMOD_REMOVED
+ struct device_attribute port_attr;
+#endif
+ enum mlx4_port_type tmp_type;
+ char dev_mtu_name[16];
+#ifdef KMOD_REMOVED
+ struct device_attribute port_mtu_attr;
+#endif
+ struct mlx4_mac_table mac_table;
+ struct mlx4_vlan_table vlan_table;
+ struct mlx4_roce_info roce;
+ int base_qpn;
+ struct cpu_rmap *rmap;
+};
+
+struct mlx4_sense {
+ struct mlx4_dev *dev;
+ u8 do_sense_port[MLX4_MAX_PORTS + 1];
+ u8 sense_allowed[MLX4_MAX_PORTS + 1];
+#ifdef KMOD_REMOVED
+ struct delayed_work sense_poll;
+#endif
+};
+
+struct mlx4_msix_ctl {
+ DECLARE_BITMAP(pool_bm, MAX_MSIX);
+ struct mutex pool_lock;
+};
+
+struct mlx4_steer {
+ struct list_head promisc_qps[MLX4_NUM_STEERS];
+ struct list_head steer_entries[MLX4_NUM_STEERS];
+};
+
+enum {
+ MLX4_PCI_DEV_IS_VF = 1 << 0,
+ MLX4_PCI_DEV_FORCE_SENSE_PORT = 1 << 1,
+};
+
+struct counter_index {
+ struct list_head list;
+ u32 index;
+};
+
+struct mlx4_counters {
+ struct mlx4_bitmap bitmap;
+ struct list_head global_port_list[MLX4_MAX_PORTS];
+ struct list_head vf_list[MLX4_MAX_NUM_VF][MLX4_MAX_PORTS];
+ struct mutex mutex;
+};
+
+enum {
+ MLX4_NO_RR = 0,
+ MLX4_USE_RR = 1,
+};
+
+struct mlx4_priv {
+ struct mlx4_dev dev;
+
+ struct list_head dev_list;
+ struct list_head ctx_list;
+ spinlock_t ctx_lock;
+
+ int pci_dev_data;
+ int removed;
+
+ struct list_head pgdir_list;
+ struct mutex pgdir_mutex;
+
+ struct mlx4_fw fw;
+ struct mlx4_cmd cmd;
+ struct mlx4_mfunc mfunc;
+
+ struct mlx4_bitmap pd_bitmap;
+ struct mlx4_bitmap xrcd_bitmap;
+ struct mlx4_uar_table uar_table;
+ struct mlx4_mr_table mr_table;
+ struct mlx4_cq_table cq_table;
+ struct mlx4_eq_table eq_table;
+ struct mlx4_srq_table srq_table;
+ struct mlx4_qp_table qp_table;
+ struct mlx4_mcg_table mcg_table;
+ struct mlx4_bitmap counters_bitmap;
+ struct mlx4_counters counters_table;
+
+ struct mlx4_catas_err catas_err;
+
+ void __iomem *clr_base;
+
+ struct mlx4_uar driver_uar;
+ void __iomem *kar;
+ struct mlx4_port_info port[MLX4_MAX_PORTS + 1];
+ struct mlx4_sense sense;
+ struct mutex port_mutex;
+ struct mlx4_msix_ctl msix_ctl;
+ struct mlx4_steer *steer;
+ struct list_head bf_list;
+ struct mutex bf_mutex;
+#ifdef KMOD_MODIFIED
+ void *bf_mapping_addr;
+ dma_addr_t bf_mapping_phys_addr;
+ size_t bf_mapping_len;
+#else
+ struct io_mapping *bf_mapping;
+#endif
+ void __iomem *clock_mapping;
+ int reserved_mtts;
+ int fs_hash_mode;
+ u8 virt2phys_pkey[MLX4_MFUNC_MAX][MLX4_MAX_PORTS][MLX4_MAX_PORT_PKEYS];
+ struct mlx4_port_map v2p; /* cached port mapping configuration */
+ struct mutex bond_mutex; /* for bond mode */
+ __be64 slave_node_guids[MLX4_MFUNC_MAX];
+
+ atomic_t opreq_count;
+#ifdef KMOD_REMOVED
+ struct work_struct opreq_task;
+#endif
+};
+
+static inline struct mlx4_priv *mlx4_priv(struct mlx4_dev *dev)
+{
+ return container_of(dev, struct mlx4_priv, dev);
+}
+
+#define MLX4_SENSE_RANGE (HZ * 3)
+
+extern struct workqueue_struct *mlx4_wq;
+
+u32 mlx4_bitmap_alloc(struct mlx4_bitmap *bitmap);
+void mlx4_bitmap_free(struct mlx4_bitmap *bitmap, u32 obj, int use_rr);
+u32 mlx4_bitmap_alloc_range(struct mlx4_bitmap *bitmap, int cnt,
+ int align, u32 skip_mask);
+void mlx4_bitmap_free_range(struct mlx4_bitmap *bitmap, u32 obj, int cnt,
+ int use_rr);
+u32 mlx4_bitmap_avail(struct mlx4_bitmap *bitmap);
+int mlx4_bitmap_init(struct mlx4_bitmap *bitmap, u32 num, u32 mask,
+ u32 reserved_bot, u32 reserved_top);
+void mlx4_bitmap_cleanup(struct mlx4_bitmap *bitmap);
+
+int mlx4_reset(struct mlx4_dev *dev);
+
+int mlx4_alloc_eq_table(struct mlx4_dev *dev);
+void mlx4_free_eq_table(struct mlx4_dev *dev);
+
+int mlx4_init_pd_table(struct mlx4_dev *dev);
+int mlx4_init_xrcd_table(struct mlx4_dev *dev);
+int mlx4_init_uar_table(struct mlx4_dev *dev);
+int mlx4_init_mr_table(struct mlx4_dev *dev);
+int mlx4_init_eq_table(struct mlx4_dev *dev);
+int mlx4_init_cq_table(struct mlx4_dev *dev);
+int mlx4_init_qp_table(struct mlx4_dev *dev);
+int mlx4_init_srq_table(struct mlx4_dev *dev);
+int mlx4_init_mcg_table(struct mlx4_dev *dev);
+
+void mlx4_cleanup_pd_table(struct mlx4_dev *dev);
+void mlx4_cleanup_xrcd_table(struct mlx4_dev *dev);
+void mlx4_cleanup_uar_table(struct mlx4_dev *dev);
+void mlx4_cleanup_mr_table(struct mlx4_dev *dev);
+void mlx4_cleanup_eq_table(struct mlx4_dev *dev);
+void mlx4_cleanup_cq_table(struct mlx4_dev *dev);
+void mlx4_cleanup_qp_table(struct mlx4_dev *dev);
+void mlx4_cleanup_srq_table(struct mlx4_dev *dev);
+void mlx4_cleanup_mcg_table(struct mlx4_dev *dev);
+int __mlx4_qp_alloc_icm(struct mlx4_dev *dev, int qpn, gfp_t gfp);
+void __mlx4_qp_free_icm(struct mlx4_dev *dev, int qpn);
+int __mlx4_cq_alloc_icm(struct mlx4_dev *dev, int *cqn);
+void __mlx4_cq_free_icm(struct mlx4_dev *dev, int cqn);
+int __mlx4_srq_alloc_icm(struct mlx4_dev *dev, int *srqn);
+void __mlx4_srq_free_icm(struct mlx4_dev *dev, int srqn);
+int __mlx4_mpt_reserve(struct mlx4_dev *dev);
+void __mlx4_mpt_release(struct mlx4_dev *dev, u32 index);
+int __mlx4_mpt_alloc_icm(struct mlx4_dev *dev, u32 index, gfp_t gfp);
+void __mlx4_mpt_free_icm(struct mlx4_dev *dev, u32 index);
+u32 __mlx4_alloc_mtt_range(struct mlx4_dev *dev, int order);
+void __mlx4_free_mtt_range(struct mlx4_dev *dev, u32 first_seg, int order);
+
+int mlx4_WRITE_MTT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_SYNC_TPT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_SW2HW_MPT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_HW2SW_MPT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_QUERY_MPT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_SW2HW_EQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_CONFIG_DEV_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_DMA_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int __mlx4_qp_reserve_range(struct mlx4_dev *dev, int cnt, int align,
+ int *base, u8 flags);
+void __mlx4_qp_release_range(struct mlx4_dev *dev, int base_qpn, int cnt);
+int __mlx4_register_mac(struct mlx4_dev *dev, u8 port, u64 mac);
+void __mlx4_unregister_mac(struct mlx4_dev *dev, u8 port, u64 mac);
+int __mlx4_write_mtt(struct mlx4_dev *dev, struct mlx4_mtt *mtt,
+ int start_index, int npages, u64 *page_list);
+int __mlx4_counter_alloc(struct mlx4_dev *dev, int slave, int port, u32 *idx);
+void __mlx4_counter_free(struct mlx4_dev *dev, int slave, int port, u32 idx);
+int __mlx4_slave_counters_free(struct mlx4_dev *dev, int slave);
+int __mlx4_clear_if_stat(struct mlx4_dev *dev,
+ u8 counter_index);
+u8 mlx4_get_default_counter_index(struct mlx4_dev *dev, int slave, int port);
+
+int __mlx4_xrcd_alloc(struct mlx4_dev *dev, u32 *xrcdn);
+void __mlx4_xrcd_free(struct mlx4_dev *dev, u32 xrcdn);
+
+void mlx4_start_catas_poll(struct mlx4_dev *dev);
+void mlx4_stop_catas_poll(struct mlx4_dev *dev);
+int mlx4_catas_init(struct mlx4_dev *dev);
+void mlx4_catas_end(struct mlx4_dev *dev);
+#ifdef KMOD_MODIFIED
+int mlx4_restart_one(struct rte_pci_device *pdev);
+#endif
+int mlx4_register_device(struct mlx4_dev *dev);
+void mlx4_unregister_device(struct mlx4_dev *dev);
+void mlx4_dispatch_event(struct mlx4_dev *dev, enum mlx4_dev_event type,
+ unsigned long param);
+
+struct mlx4_dev_cap;
+struct mlx4_init_hca_param;
+
+u64 mlx4_make_profile(struct mlx4_dev *dev,
+ struct mlx4_profile *request,
+ struct mlx4_dev_cap *dev_cap,
+ struct mlx4_init_hca_param *init_hca);
+#ifdef KMOD_REMOVED
+void mlx4_master_comm_channel(struct work_struct *work);
+void mlx4_gen_slave_eqe(struct work_struct *work);
+void mlx4_master_handle_slave_flr(struct work_struct *work);
+#endif
+
+int mlx4_ALLOC_RES_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_FREE_RES_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_MAP_EQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr, struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_COMM_INT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_HW2SW_EQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_QUERY_EQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_SW2HW_CQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_HW2SW_CQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_QUERY_CQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_MODIFY_CQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_SW2HW_SRQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_HW2SW_SRQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_QUERY_SRQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_ARM_SRQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_GEN_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_RST2INIT_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_INIT2INIT_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_INIT2RTR_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_RTR2RTS_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_RTS2RTS_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_SQERR2RTS_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_2ERR_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_RTS2SQD_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_SQD2SQD_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_SQD2RTS_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_2RST_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_QUERY_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+
+int mlx4_GEN_EQE(struct mlx4_dev *dev, int slave, struct mlx4_eqe *eqe);
+
+enum {
+ MLX4_CMD_CLEANUP_STRUCT = 1UL << 0,
+ MLX4_CMD_CLEANUP_POOL = 1UL << 1,
+ MLX4_CMD_CLEANUP_HCR = 1UL << 2,
+ MLX4_CMD_CLEANUP_VHCR = 1UL << 3,
+ MLX4_CMD_CLEANUP_ALL = (MLX4_CMD_CLEANUP_VHCR << 1) - 1
+};
+
+int mlx4_cmd_init(struct mlx4_dev *dev);
+void mlx4_cmd_cleanup(struct mlx4_dev *dev, int cleanup_mask);
+int mlx4_multi_func_init(struct mlx4_dev *dev);
+int mlx4_ARM_COMM_CHANNEL(struct mlx4_dev *dev);
+void mlx4_multi_func_cleanup(struct mlx4_dev *dev);
+void mlx4_cmd_event(struct mlx4_dev *dev, u16 token, u8 status, u64 out_param);
+#ifdef KMOD_DISABLED
+int mlx4_cmd_use_events(struct mlx4_dev *dev);
+#endif
+void mlx4_cmd_use_polling(struct mlx4_dev *dev);
+
+int mlx4_comm_cmd(struct mlx4_dev *dev, u8 cmd, u16 param,
+ u16 op, unsigned long timeout);
+
+void mlx4_cq_tasklet_cb(unsigned long data);
+void mlx4_cq_completion(struct mlx4_dev *dev, u32 cqn);
+void mlx4_cq_event(struct mlx4_dev *dev, u32 cqn, int event_type);
+
+void mlx4_qp_event(struct mlx4_dev *dev, u32 qpn, int event_type);
+
+void mlx4_srq_event(struct mlx4_dev *dev, u32 srqn, int event_type);
+
+void mlx4_enter_error_state(struct mlx4_dev_persistent *persist);
+
+int mlx4_SENSE_PORT(struct mlx4_dev *dev, int port,
+ enum mlx4_port_type *type);
+void mlx4_do_sense_ports(struct mlx4_dev *dev,
+ enum mlx4_port_type *stype,
+ enum mlx4_port_type *defaults);
+void mlx4_start_sense(struct mlx4_dev *dev);
+void mlx4_stop_sense(struct mlx4_dev *dev);
+void mlx4_sense_init(struct mlx4_dev *dev);
+int mlx4_check_port_params(struct mlx4_dev *dev,
+ enum mlx4_port_type *port_type);
+int mlx4_change_port_types(struct mlx4_dev *dev,
+ enum mlx4_port_type *port_types);
+
+void mlx4_init_mac_table(struct mlx4_dev *dev, struct mlx4_mac_table *table);
+void mlx4_init_vlan_table(struct mlx4_dev *dev, struct mlx4_vlan_table *table);
+void mlx4_init_roce_gid_table(struct mlx4_dev *dev,
+ struct mlx4_roce_info *roce);
+void __mlx4_unregister_vlan(struct mlx4_dev *dev, u8 port, u16 vlan);
+int __mlx4_register_vlan(struct mlx4_dev *dev, u8 port, u16 vlan, int *index);
+
+int mlx4_SET_PORT(struct mlx4_dev *dev, u8 port, int pkey_tbl_sz);
+/* resource tracker functions */
+int mlx4_get_slave_from_resource_id(struct mlx4_dev *dev,
+ enum mlx4_resource resource_type,
+ u64 resource_id, int *slave);
+void mlx4_delete_all_resources_for_slave(struct mlx4_dev *dev, int slave_id);
+void mlx4_reset_roce_gids(struct mlx4_dev *dev, int slave);
+int mlx4_init_resource_tracker(struct mlx4_dev *dev);
+
+void mlx4_free_resource_tracker(struct mlx4_dev *dev,
+ enum mlx4_res_tracker_free_type type);
+
+int mlx4_QUERY_FW_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_SET_PORT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_INIT_PORT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_CLOSE_PORT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_QUERY_DEV_CAP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_QUERY_PORT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_get_port_ib_caps(struct mlx4_dev *dev, u8 port, __be32 *caps);
+
+int mlx4_get_slave_pkey_gid_tbl_len(struct mlx4_dev *dev, u8 port,
+ int *gid_tbl_len, int *pkey_tbl_len);
+
+int mlx4_QP_ATTACH_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+
+int mlx4_UPDATE_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+
+int mlx4_PROMISC_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_qp_detach_common(struct mlx4_dev *dev, struct mlx4_qp *qp, u8 gid[16],
+ enum mlx4_protocol prot, enum mlx4_steer_type steer);
+int mlx4_qp_attach_common(struct mlx4_dev *dev, struct mlx4_qp *qp, u8 gid[16],
+ int block_mcast_loopback, enum mlx4_protocol prot,
+ enum mlx4_steer_type steer);
+int mlx4_trans_to_dmfs_attach(struct mlx4_dev *dev, struct mlx4_qp *qp,
+ u8 gid[16], u8 port,
+ int block_mcast_loopback,
+ enum mlx4_protocol prot, u64 *reg_id);
+int mlx4_SET_MCAST_FLTR_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_SET_VLAN_FLTR_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_common_set_vlan_fltr(struct mlx4_dev *dev, int function,
+ int port, void *buf);
+int mlx4_DUMP_ETH_STATS_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_PKEY_TABLE_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_QUERY_IF_STAT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_QP_FLOW_STEERING_ATTACH_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_QP_FLOW_STEERING_DETACH_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+int mlx4_ACCESS_REG_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd);
+
+int mlx4_get_mgm_entry_size(struct mlx4_dev *dev);
+int mlx4_get_qp_per_mgm(struct mlx4_dev *dev);
+
+static inline void set_param_l(u64 *arg, u32 val)
+{
+ *arg = (*arg & 0xffffffff00000000ULL) | (u64) val;
+}
+
+static inline void set_param_h(u64 *arg, u32 val)
+{
+ *arg = (*arg & 0xffffffff) | ((u64) val << 32);
+}
+
+static inline u32 get_param_l(u64 *arg)
+{
+ return (u32) (*arg & 0xffffffff);
+}
+
+static inline u32 get_param_h(u64 *arg)
+{
+ return (u32)(*arg >> 32);
+}
+
+static inline spinlock_t *mlx4_tlock(struct mlx4_dev *dev)
+{
+ return &mlx4_priv(dev)->mfunc.master.res_tracker.lock;
+}
+
+#define NOT_MASKED_PD_BITS 17
+#ifdef KMOD_MODIFIED
+//void mlx4_vf_immed_vlan_work_handler(struct work_struct *_work);
+void mlx4_vf_immed_vlan_work_handler(struct mlx4_vf_immed_vlan_work *work);
+#endif
+void mlx4_init_quotas(struct mlx4_dev *dev);
+
+int mlx4_get_slave_num_gids(struct mlx4_dev *dev, int slave, int port);
+int mlx4_get_slave_indx(struct mlx4_dev *dev, int vf);
+/* Returns the VF index of slave */
+int mlx4_get_vf_indx(struct mlx4_dev *dev, int slave);
+int mlx4_config_mad_demux(struct mlx4_dev *dev);
+int mlx4_do_bond(struct mlx4_dev *dev, bool enable);
+int mlx4_verify_supported_gid_type(struct mlx4_dev *dev, enum mlx4_roce_gid_type gid_type,
+ enum mlx4_roce_gid_type *alt_type);
+
+enum mlx4_zone_flags {
+ MLX4_ZONE_ALLOW_ALLOC_FROM_LOWER_PRIO = 1UL << 0,
+ MLX4_ZONE_ALLOW_ALLOC_FROM_EQ_PRIO = 1UL << 1,
+ MLX4_ZONE_FALLBACK_TO_HIGHER_PRIO = 1UL << 2,
+ MLX4_ZONE_USE_RR = 1UL << 3,
+};
+
+enum mlx4_zone_alloc_flags {
+ /* No two objects could overlap between zones. UID
+ * could be left unused. If this flag is given and
+ * two overlapped zones are used, an object will be freed
+ * from the smallest possible matching zone.
+ */
+ MLX4_ZONE_ALLOC_FLAGS_NO_OVERLAP = 1UL << 0,
+};
+
+struct mlx4_zone_allocator;
+
+/* Create a new zone allocator */
+struct mlx4_zone_allocator *mlx4_zone_allocator_create(enum mlx4_zone_alloc_flags flags);
+
+/* Attach a mlx4_bitmap <bitmap> of priority <priority> to the zone allocator
+ * <zone_alloc>. Allocating an object from this zone adds an offset <offset>.
+ * Similarly, when searching for an object to free, this offset is taken into
+ * account. The use_rr mlx4_ib parameter for allocating objects from this <bitmap>
+ * is given through the MLX4_ZONE_USE_RR flag in <flags>.
+ * When an allocation fails, <zone_alloc> tries to allocate from other zones
+ * according to the policy set by <flags>. <puid> receives the unique identifier
+ * assigned to this zone.
+ */
+int mlx4_zone_add_one(struct mlx4_zone_allocator *zone_alloc,
+ struct mlx4_bitmap *bitmap,
+ u32 flags,
+ int priority,
+ int offset,
+ u32 *puid);
+
+/* Remove bitmap indicated by <uid> from <zone_alloc> */
+int mlx4_zone_remove_one(struct mlx4_zone_allocator *zone_alloc, u32 uid);
+
+/* Delete the zone allocator <zone_alloc>. This function doesn't destroy
+ * the attached bitmaps.
+ */
+void mlx4_zone_allocator_destroy(struct mlx4_zone_allocator *zone_alloc);
+
+/* Allocate <count> objects with align <align> and skip_mask <skip_mask>
+ * from the mlx4_bitmap whose uid is <uid>. The bitmap which we actually
+ * allocated from is returned in <puid>. If the allocation fails, a negative
+ * number is returned. Otherwise, the offset of the first object is returned.
+ */
+u32 mlx4_zone_alloc_entries(struct mlx4_zone_allocator *zones, u32 uid, int count,
+ int align, u32 skip_mask, u32 *puid);
+
+/* Free <count> objects, starting from <obj>, of the uid <uid> from zone
+ * allocator <zones>.
+ */
+u32 mlx4_zone_free_entries(struct mlx4_zone_allocator *zones,
+ u32 uid, u32 obj, u32 count);
+
+/* If <zones> was allocated with MLX4_ZONE_ALLOC_FLAGS_NO_OVERLAP, the zone
+ * allocator can work out the uid by itself when freeing an object, so it need
+ * not be specified. Other parameters are as in mlx4_zone_free_entries.
+ */
+u32 mlx4_zone_free_entries_unique(struct mlx4_zone_allocator *zones, u32 obj, u32 count);
+
+/* Returns a pointer to mlx4_bitmap that was attached to <zones> with <uid> */
+struct mlx4_bitmap *mlx4_zone_get_bitmap(struct mlx4_zone_allocator *zones, u32 uid);
+
+#endif /* MLX4_H */
+
+#include "post_kmod.h"
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/mlx4_en.h b/drivers/net/mlnx_uio/mlnx/mlx4/mlx4_en.h
new file mode 100644
index 0000000..78e119a
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/mlx4_en.h
@@ -0,0 +1,1188 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2007 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#ifndef _MLX4_EN_H_
+#define _MLX4_EN_H_
+
+#include "mlx4/cq.h"
+#include "mlx4/qp.h"
+#include "mlx4/cmd.h"
+#include "mlx4/srq.h"
+#include "mlx4/doorbell.h"
+#include "mlx4/driver.h"
+#include "mlx4/device.h"
+
+#include "dcbnl.h"
+
+#ifdef CONFIG_MLX4_EN_DCB
+#endif
+#if defined (HAVE_PTP_CLOCK_INFO) && (defined (CONFIG_PTP_1588_CLOCK) || defined(CONFIG_PTP_1588_CLOCK_MODULE))
+#endif
+
+#ifdef CONFIG_COMPAT_LRO_ENABLED
+#endif
+
+
+#include "en_port.h"
+#include "mlx4_stats.h"
+
+#define DRV_NAME "mlx4_en"
+#define DRV_VERSION "3.0-1.0.1"
+#define DRV_RELDATE "Feb 2014"
+
+#ifndef CONFIG_COMPAT_DISABLE_DCB
+#ifdef CONFIG_MLX4_EN_DCB
+
+#ifndef HAVE_IEEE_GET_SET_MAXRATE
+#define CONFIG_SYSFS_MAXRATE
+#endif
+
+/* make sure to define QCN only when DCB is not disabled
+ * and EN_DCB is defined
+ */
+#ifndef HAVE_IEEE_GETQCN
+#define CONFIG_SYSFS_QCN
+#endif
+
+#ifdef HAVE_NETDEV_GET_PRIO_TC_MAP
+#define CONFIG_SYSFS_MQPRIO
+#endif
+
+#endif
+#endif
+
+#if !defined(HAVE_GET_SET_RXFH) && !defined(HAVE_GET_SET_RXFH_INDIR_EXT)
+#define CONFIG_SYSFS_INDIR_SETTING
+#endif
+
+#if !defined(HAVE_GET_SET_CHANNELS) && !defined(HAVE_GET_SET_CHANNELS_EXT)
+#define CONFIG_SYSFS_NUM_CHANNELS
+#endif
+
+#ifndef HAVE_NDO_SET_FEATURES
+#define CONFIG_SYSFS_LOOPBACK
+#endif
+
+#define MLX4_EN_MSG_LEVEL (NETIF_MSG_LINK | NETIF_MSG_IFDOWN)
+
+/*
+ * Device constants
+ */
+
+
+#define MLX4_EN_PAGE_SHIFT 12
+#define MLX4_EN_PAGE_SIZE (1 << MLX4_EN_PAGE_SHIFT)
+#define DEF_RX_RINGS 16
+#define MAX_RX_RINGS 128
+#define MIN_RX_RINGS 4
+#define TXBB_SIZE 64
+
+/* When working with WQE format 1,
+ * we do not need headroom for stamping
+ */
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ #define HEADROOM 0
+#else
+ #define HEADROOM (2048 / TXBB_SIZE + 1)
+#endif
+
+#define STAMP_STRIDE 64
+#define STAMP_DWORDS (STAMP_STRIDE / 4)
+#define STAMP_SHIFT 31
+#define STAMP_VAL 0x7fffffff
+#define STATS_DELAY (HZ / 4)
+#define SERVICE_TASK_DELAY (HZ / 4)
+#define MAX_NUM_OF_FS_RULES 256
+
+#define MLX4_EN_FILTER_HASH_SHIFT 4
+#define MLX4_EN_FILTER_EXPIRY_QUOTA 60
+
+/* Typical TSO descriptor with 16 gather entries is 352 bytes... */
+#define MAX_DESC_SIZE 512
+#define MAX_DESC_TXBBS (MAX_DESC_SIZE / TXBB_SIZE)
+
+/*
+ * OS related constants and tunables
+ */
+
+enum {
+ MLX4_EN_PRIV_FLAGS_BLUEFLAME = (1 << 0),
+ MLX4_EN_PRIV_FLAGS_FS_EN_L2 = (1 << 1),
+ MLX4_EN_PRIV_FLAGS_FS_EN_IPV4 = (1 << 2),
+ MLX4_EN_PRIV_FLAGS_FS_EN_TCP = (1 << 3),
+ MLX4_EN_PRIV_FLAGS_FS_EN_UDP = (1 << 4),
+ MLX4_EN_PRIV_FLAGS_DISABLE_32_14_4_E = (1 << 5),
+ MLX4_EN_PRIV_FLAGS_INLINE_SCATTER = (1 << 6),
+ MLX4_EN_PRIV_FLAGS_RXFCS = (1 << 7),
+ MLX4_EN_PRIV_FLAGS_RXALL = (1 << 8),
+};
+
+#define MLX4_EN_WATCHDOG_TIMEOUT (15 * HZ)
+
+/* Use the maximum between 16384 and a single page */
+#define MLX4_EN_ALLOC_SIZE PAGE_ALIGN(16384)
+
+#define MLX4_EN_ALLOC_PREFER_ORDER PAGE_ALLOC_COSTLY_ORDER
+
+/* Receive fragment sizes; we use at most 3 fragments (for 9600 byte MTU
+ * and 4K allocations) */
+enum {
+#ifdef KMOD_MODIFIED
+ FRAG_SZ0 = 1536, //we do not need net_ip_align
+#else
+ FRAG_SZ0 = 1536 - NET_IP_ALIGN,
+#endif
+ FRAG_SZ1 = 4096,
+ FRAG_SZ2 = 4096,
+ FRAG_SZ3 = MLX4_EN_ALLOC_SIZE
+};
+#define MLX4_EN_MAX_RX_FRAGS 4
+#if !(defined(HAVE_IRQ_DESC_GET_IRQ_DATA) && defined(HAVE_IRQ_TO_DESC_EXPORTED))
+/* Minimum packet number till arming the CQ */
+#define MLX4_EN_MIN_RX_ARM 2097152
+#endif
+
+/* Maximum ring sizes */
+#define MLX4_EN_MAX_TX_SIZE 8192
+#define MLX4_EN_MAX_RX_SIZE 8192
+
+/* Minimum ring size for our page-allocation scheme to work */
+#define MLX4_EN_MIN_RX_SIZE (MLX4_EN_ALLOC_SIZE / SMP_CACHE_BYTES)
+#define MLX4_EN_MIN_TX_SIZE (4096 / TXBB_SIZE)
+
+#define MLX4_EN_SMALL_PKT_SIZE 64
+#define MLX4_EN_MIN_TX_RING_P_UP 1
+
+#ifdef HAVE_NEW_TX_RING_SCHEME
+#define MLX4_EN_MAX_TX_RING_P_UP 32
+#define MLX4_EN_NUM_UP 8
+#define MAX_TX_RINGS (MLX4_EN_MAX_TX_RING_P_UP * \
+ MLX4_EN_NUM_UP)
+#else
+#define MLX4_EN_NUM_TX_RINGS 8
+#define MLX4_EN_NUM_PPP_RINGS 8
+#define MAX_TX_RINGS (MLX4_EN_NUM_TX_RINGS * 2 + \
+ MLX4_EN_NUM_PPP_RINGS)
+#endif
+
+#define MLX4_EN_DEF_TX_RING_SIZE 512
+#define MLX4_EN_DEF_RX_RING_SIZE 1024
+
+#define MLX4_EN_DEFAULT_TX_WORK 256
+
+/* Target number of packets to coalesce with interrupt moderation */
+#define MLX4_EN_RX_COAL_TARGET 44
+#define MLX4_EN_RX_COAL_TIME 0x10
+
+#define MLX4_EN_TX_COAL_PKTS 16
+#define MLX4_EN_TX_COAL_TIME 0x10
+
+#define MLX4_EN_RX_RATE_LOW 400000
+#define MLX4_EN_RX_COAL_TIME_LOW 0
+#define MLX4_EN_RX_RATE_HIGH 450000
+#define MLX4_EN_RX_COAL_TIME_HIGH 128
+#define MLX4_EN_RX_SIZE_THRESH 1024
+#define MLX4_EN_RX_RATE_THRESH (1000000 / MLX4_EN_RX_COAL_TIME_HIGH)
+#define MLX4_EN_SAMPLE_INTERVAL 0
+#define MLX4_EN_AVG_PKT_SMALL 256
+
+#define MLX4_EN_AUTO_CONF 0xffff
+
+#define MLX4_EN_DEF_RX_PAUSE 1
+#define MLX4_EN_DEF_TX_PAUSE 1
+
+/* Interval between successive polls in the Tx routine when polling is used
+ instead of interrupts (in per-core Tx rings) - should be power of 2 */
+#define MLX4_EN_TX_POLL_MODER 16
+#define MLX4_EN_TX_POLL_TIMEOUT (HZ / 4)
+
+#define SMALL_PACKET_SIZE (256 - NET_IP_ALIGN)
+#define HEADER_COPY_SIZE (128 - NET_IP_ALIGN)
+#define MLX4_LOOPBACK_TEST_PAYLOAD (HEADER_COPY_SIZE - ETH_HLEN)
+
+#define MLX4_EN_MIN_MTU 46
+#define ETH_BCAST 0xffffffffffffULL
+
+#define MLX4_EN_LOOPBACK_RETRIES 5
+#define MLX4_EN_LOOPBACK_TIMEOUT 100
+
+#ifdef MLX4_EN_PERF_STAT
+/* Number of samples to 'average' */
+#define AVG_SIZE 128
+#define AVG_FACTOR 1024
+
+#define INC_PERF_COUNTER(cnt) (++(cnt))
+#define ADD_PERF_COUNTER(cnt, add) ((cnt) += (add))
+#define AVG_PERF_COUNTER(cnt, sample) \
+ ((cnt) = ((cnt) * (AVG_SIZE - 1) + (sample) * AVG_FACTOR) / AVG_SIZE)
+#define GET_PERF_COUNTER(cnt) (cnt)
+#define GET_AVG_PERF_COUNTER(cnt) ((cnt) / AVG_FACTOR)
+
+#else
+
+#define INC_PERF_COUNTER(cnt) do {} while (0)
+#define ADD_PERF_COUNTER(cnt, add) do {} while (0)
+#define AVG_PERF_COUNTER(cnt, sample) do {} while (0)
+#define GET_PERF_COUNTER(cnt) (0)
+#define GET_AVG_PERF_COUNTER(cnt) (0)
+#endif /* MLX4_EN_PERF_STAT */
+
+/* Constants for TX flow */
+enum {
+ MAX_INLINE = 104, /* 128 - 16 - 4 - 4 */
+ MAX_BF = 256,
+ MIN_PKT_LEN = 17,
+};
+
+/* Constants for RX flow */
+enum {
+ MAX_INLINE_SCATTER = 2048,
+ MIN_INLINE_SCATTER = 64,
+};
+
+/*
+ * Configurables
+ */
+
+enum cq_type {
+ RX = 0,
+ TX = 1,
+};
+
+
+/*
+ * Useful macros
+ */
+#define ROUNDUP_LOG2(x) ilog2(roundup_pow_of_two(x))
+#define XNOR(x, y) (!(x) == !(y))
+
+struct mlx4_en_tx_info {
+#ifdef KMOD_MODIFIED
+ struct rte_mbuf* mbuf; //SAVE
+#else
+ struct sk_buff *skb;
+#endif
+ //dma_addr_t map0_dma;
+ //u32 map0_byte_count;
+ u32 nr_txbb;
+} ____cacheline_aligned_in_smp;
+
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ #define MLX4_EN_BIT_DESC_OWN 0x40000000
+#else
+ #define MLX4_EN_BIT_DESC_OWN 0x80000000
+#endif
+
+#define CTRL_SIZE sizeof(struct mlx4_wqe_ctrl_seg)
+#define MLX4_EN_MEMTYPE_PAD 0x100
+#define DS_SIZE sizeof(struct mlx4_wqe_data_seg)
+
+
+struct mlx4_en_tx_desc {
+ struct mlx4_wqe_ctrl_seg ctrl;
+ union {
+ struct mlx4_wqe_data_seg data; /* at least one data segment */
+ struct mlx4_wqe_lso_seg lso;
+ struct mlx4_wqe_inline_seg inl;
+ };
+};
+
+#ifdef CONFIG_COMPAT_LRO_ENABLED
+/* LRO defines for MLX4_EN */
+#define MLX4_EN_LRO_MAX_DESC 32
+
+struct mlx4_en_lro {
+ struct net_lro_mgr lro_mgr;
+ struct net_lro_desc lro_desc[MLX4_EN_LRO_MAX_DESC];
+};
+#endif
+
+#define MLX4_EN_USE_SRQ 0x01000000
+
+#define MLX4_EN_CX3_LOW_ID 0x1000
+#define MLX4_EN_CX3_HIGH_ID 0x1005
+/*
+struct mlx4_en_rx_alloc {
+ struct page *page;
+ dma_addr_t dma;
+ u32 page_offset;
+ u32 page_size;
+};
+*/
+
+struct mlx4_en_cq {
+ struct mlx4_cq mcq;
+ struct mlx4_hwq_resources wqres;
+ int ring;
+#ifdef KMOD_MODIFIED
+ struct rte_eth_dev* rte_dev;
+#else
+ struct net_device *dev;
+ struct napi_struct napi;
+#endif
+ int size;
+ int buf_size;
+ int vector;
+ enum cq_type is_tx;
+ u16 moder_time;
+ u16 moder_cnt;
+ struct mlx4_cqe *buf;
+#define MLX4_EN_OPCODE_ERROR 0x1e
+#if !(defined(HAVE_IRQ_DESC_GET_IRQ_DATA) && defined(HAVE_IRQ_TO_DESC_EXPORTED))
+ u32 tot_rx;
+#endif
+
+#ifdef CONFIG_NET_RX_BUSY_POLL
+ unsigned int state;
+#define MLX4_EN_CQ_STATE_IDLE 0
+#define MLX4_EN_CQ_STATE_NAPI 1 /* NAPI owns this CQ */
+#define MLX4_EN_CQ_STATE_POLL 2 /* poll owns this CQ */
+#define MLX4_CQ_LOCKED (MLX4_EN_CQ_STATE_NAPI | MLX4_EN_CQ_STATE_POLL)
+#define MLX4_EN_CQ_STATE_NAPI_YIELD 4 /* NAPI yielded this CQ */
+#define MLX4_EN_CQ_STATE_POLL_YIELD 8 /* poll yielded this CQ */
+#define CQ_YIELD (MLX4_EN_CQ_STATE_NAPI_YIELD | MLX4_EN_CQ_STATE_POLL_YIELD)
+#define CQ_USER_PEND (MLX4_EN_CQ_STATE_POLL | MLX4_EN_CQ_STATE_POLL_YIELD)
+ spinlock_t poll_lock; /* protects from LLS/napi conflicts */
+#endif /* CONFIG_NET_RX_BUSY_POLL */
+
+#ifdef KMOD_MODIFIED
+#else
+ struct irq_desc *irq_desc;
+#endif
+};
+
+struct mlx4_en_tx_ring {
+ /* cache line used and dirtied in tx completion
+ * (mlx4_en_free_tx_buf())
+ */
+ u32 last_nr_txbb;
+ u32 cons;
+ unsigned long wake_queue;
+
+ /* cache line used and dirtied in mlx4_en_xmit() */
+ u32 prod ____cacheline_aligned_in_smp;
+ struct mlx4_bf bf;
+
+ /* Following part should be mostly read */
+#ifdef KMOD_MODIFIED
+ //cpumask_t affinity_mask;
+#endif
+ struct mlx4_qp qp;
+ struct mlx4_hwq_resources wqres;
+ u32 size; /* number of TXBBs */
+ u32 size_mask;
+ u16 stride;
+ u16 cqn; /* index of port CQ associated with this ring */
+ struct mlx4_en_cq tx_cq;
+ u32 buf_size;
+ __be32 doorbell_qpn;
+ __be32 mr_key;
+ void *buf;
+ struct mlx4_en_tx_info *tx_info;
+ u8 *bounce_buf;
+ struct mlx4_qp_context context;
+ int qpn;
+ enum mlx4_qp_state qp_state;
+ u8 queue_index;
+ bool bf_enabled;
+ bool bf_alloced;
+ //struct netdev_queue *tx_queue;
+ //int hwtstamp_tx_type;
+ int enable_hwtstamp;
+ int is_stopped;
+ void (*tx_tstamp_callback)(uint64_t tstamp, struct rte_mbuf* mbuf, void* arg);
+ void* tx_tstamp_callback_arg;
+} ____cacheline_aligned_in_smp;
+
+struct mlx4_en_rx_desc {
+ /* actual number of entries depends on rx ring stride */
+ struct mlx4_wqe_data_seg data[0];
+};
+
+struct mlx4_en_rx_ring {
+ struct mlx4_hwq_resources wqres;
+#ifdef KMOD_MODIFIED
+ struct rte_mempool *mb_pool;
+#else
+ struct mlx4_en_rx_alloc page_alloc[MLX4_EN_MAX_RX_FRAGS];
+#endif
+ u32 size; /* number of Rx descs */
+ u32 actual_size;
+ u32 size_mask;
+ u16 stride;
+ u16 log_stride;
+ //u16 cqn; /* index of port CQ associated with this ring */
+ struct mlx4_en_cq rx_cq;
+ u32 prod;
+ u32 cons;
+ u32 buf_size;
+ u8 fcs_del;
+ void *buf;
+ struct rte_mbuf **rx_info;
+ //int hwtstamp_rx_filter;
+ int enable_hwtstamp;
+
+ int frag_size;
+ u16 num_frags;
+};
+
+struct mlx4_en_port_profile {
+ u32 flags;
+ u32 tx_ring_num;
+ u32 rx_ring_num;
+ u32 tx_ring_size;
+ u32 rx_ring_size;
+ u8 rx_pause;
+ u8 rx_ppp;
+ u8 tx_pause;
+ u8 tx_ppp;
+ int rss_rings;
+ int inline_thold;
+ int inline_scatter_thold;
+};
+
+struct mlx4_en_profile {
+#ifndef HAVE_ETH_SS_RSS_HASH_FUNCS
+ int rss_xor;
+#endif
+ int udp_rss;
+ u8 rss_mask;
+ u32 active_ports;
+ u32 small_pkt_int;
+ u8 no_reset;
+ u8 num_tx_rings_p_up;
+ struct mlx4_en_port_profile prof[MLX4_MAX_PORTS + 1];
+};
+
+struct mlx4_en_dev {
+ struct mlx4_dev *dev;
+#ifdef KMOD_MODIFIED
+ struct rte_pci_device* rte_pdev;
+#else
+ struct pci_dev *pdev;
+#endif
+ struct mutex state_lock;
+#ifdef KMOD_MODIFIED
+ struct rte_eth_dev *rte_pndev[MLX4_MAX_PORTS + 1];
+ struct rte_eth_dev *rte_upper[MLX4_MAX_PORTS + 1];
+#else
+ struct net_device *pndev[MLX4_MAX_PORTS + 1];
+ struct net_device *upper[MLX4_MAX_PORTS + 1];
+#endif
+ u32 port_cnt;
+ bool device_up;
+ struct mlx4_en_profile profile;
+ u32 LSO_support;
+#ifdef KMOD_DISABLED
+ struct workqueue_struct *workqueue;
+ struct device *dma_device;
+#endif
+ void __iomem *uar_map;
+ struct mlx4_uar priv_uar;
+ struct mlx4_mr mr;
+ u32 priv_pdn;
+ spinlock_t uar_lock;
+ u8 mac_removed[MLX4_MAX_PORTS + 1];
+ rwlock_t clock_lock;
+ u32 nominal_c_mult;
+#ifdef KMOD_MODIFIED
+ //uint64_t internal_clock_hz;
+ //uint64_t internal_clock;
+#else
+ struct cyclecounter cycles;
+ struct timecounter clock;
+#endif
+ unsigned long last_overflow_check;
+ unsigned long overflow_period;
+#if defined (HAVE_PTP_CLOCK_INFO) && (defined (CONFIG_PTP_1588_CLOCK) || defined(CONFIG_PTP_1588_CLOCK_MODULE))
+ struct ptp_clock *ptp_clock;
+ struct ptp_clock_info ptp_clock_info;
+#endif
+#ifdef KMOD_DISABLED
+ struct notifier_block nb;
+#endif
+};
+
+
+struct mlx4_en_rss_map {
+ int base_qpn;
+ struct mlx4_qp qps[MAX_RX_RINGS];
+ enum mlx4_qp_state state[MAX_RX_RINGS];
+ struct mlx4_qp indir_qp;
+ enum mlx4_qp_state indir_state;
+};
+
+enum mlx4_en_port_flag {
+ MLX4_EN_PORT_ANC = 1<<0, /* Auto-negotiation complete */
+ MLX4_EN_PORT_ANE = 1<<1, /* Auto-negotiation enabled */
+};
+
+struct mlx4_en_port_state {
+ int link_state;
+ int link_speed;
+ int transceiver;
+ u32 flags;
+};
+
+enum mlx4_en_mclist_act {
+ MCLIST_NONE,
+ MCLIST_REM,
+ MCLIST_ADD,
+};
+
+struct mlx4_en_mc_list {
+ struct list_head list;
+ enum mlx4_en_mclist_act action;
+ u8 addr[ETH_ALEN];
+ u64 reg_id;
+ u64 tunnel_reg_id;
+};
+
+/*
+struct mlx4_en_frag_info {
+ u16 frag_size;
+ u16 frag_prefix_size;
+ u16 frag_stride;
+};
+*/
+
+#ifdef CONFIG_MLX4_EN_DCB
+/* Minimal TC BW - setting to 0 will block traffic */
+#define MLX4_EN_BW_MIN 1
+#define MLX4_EN_BW_MAX 100 /* Utilize 100% of the line */
+
+#define MLX4_EN_TC_VENDOR 0
+#define MLX4_EN_TC_ETS 7
+
+#endif
+
+#include <linux/ethtool.h>
+
+struct ethtool_flow_id {
+ struct list_head list;
+ struct ethtool_rx_flow_spec flow_spec;
+ u64 id;
+};
+
+enum {
+ MLX4_EN_FLAG_PROMISC = (1 << 0),
+ MLX4_EN_FLAG_MC_PROMISC = (1 << 1),
+ /* whether we need to enable hardware loopback by putting dmac
+ * in Tx WQE
+ */
+ MLX4_EN_FLAG_ENABLE_HW_LOOPBACK = (1 << 2),
+ /* whether we need to drop packets that the hardware looped back */
+ MLX4_EN_FLAG_RX_FILTER_NEEDED = (1 << 3),
+ MLX4_EN_FLAG_FORCE_PROMISC = (1 << 4),
+ MLX4_EN_FLAG_RX_CSUM_NON_TCP_UDP = (1 << 5),
+};
+
+#define PORT_BEACON_MAX_LIMIT (65535)
+#define MLX4_EN_MAC_HASH_SIZE (1 << BITS_PER_BYTE)
+#define MLX4_EN_MAC_HASH_IDX 5
+
+struct mlx4_en_stats_bitmap {
+ DECLARE_BITMAP(bitmap, NUM_ALL_STATS);
+ struct mutex mutex; /* for mutual access to stats bitmap */
+};
+
+struct en_port {
+#ifdef KMOD_DISABLED
+ struct kobject kobj_vf;
+ struct kobject kobj_stats;
+#endif
+ struct mlx4_dev *dev;
+ u8 port_num;
+ u8 vport_num;
+};
+
+struct mlx4_en_priv {
+ struct mlx4_en_dev *mdev;
+ struct mlx4_en_port_profile *prof;
+#ifdef KMOD_MODIFIED
+ struct rte_eth_dev* rte_dev;
+#else
+ struct net_device *dev;
+#endif
+#ifdef HAVE_VLAN_GRO_RECEIVE
+ struct vlan_group *vlgrp;
+#endif
+ unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
+#ifdef KMOD_DISABLED
+ struct net_device_stats stats;
+ struct net_device_stats ret_stats;
+#endif
+ struct mlx4_en_port_state port_state;
+ spinlock_t stats_lock;
+ struct ethtool_flow_id ethtool_rules[MAX_NUM_OF_FS_RULES];
+ /* To allow rules removal while port is going down */
+ struct list_head ethtool_list;
+
+ unsigned long last_moder_packets[MAX_RX_RINGS];
+ unsigned long last_moder_tx_packets;
+ unsigned long last_moder_bytes[MAX_RX_RINGS];
+ unsigned long last_moder_jiffies;
+ int last_moder_time[MAX_RX_RINGS];
+ u16 rx_usecs;
+ u16 rx_frames;
+ u16 tx_usecs;
+ u16 tx_frames;
+ u32 pkt_rate_low;
+ u16 rx_usecs_low;
+ u32 pkt_rate_high;
+ u16 rx_usecs_high;
+ u16 sample_interval;
+ u16 adaptive_rx_coal;
+ u32 msg_enable;
+ u32 loopback_ok;
+ u32 validate_loopback;
+
+ struct mlx4_hwq_resources res;
+ int link_state;
+ int last_link_state;
+ bool port_up;
+ int port;
+ int registered;
+ int allocated;
+ int stride;
+ unsigned char current_mac[ETH_ALEN];
+ int mac_index;
+ unsigned max_mtu;
+ int base_qpn;
+ int cqe_factor;
+ int cqe_size;
+
+ struct mlx4_en_rss_map rss_map;
+ __be32 ctrl_flags;
+ u32 flags;
+ u32 eff_mtu;
+ //u8 num_tx_rings_p_up;
+ //u32 tx_work_limit;
+ //u32 tx_ring_num;
+ //u32 rx_ring_num;
+ //u32 rx_skb_size;
+
+ //struct mlx4_en_tx_ring **tx_ring;
+ //struct mlx4_en_rx_ring *rx_ring[MAX_RX_RINGS];
+ //struct mlx4_en_cq **tx_cq;
+ //struct mlx4_en_cq *rx_cq[MAX_RX_RINGS];
+ struct mlx4_qp drop_qp;
+#ifdef KMOD_DISABLED
+ struct work_struct rx_mode_task;
+ struct work_struct watchdog_task;
+ struct work_struct linkstate_task;
+ struct delayed_work stats_task;
+ struct delayed_work service_task;
+ struct work_struct vxlan_add_task;
+ struct work_struct vxlan_del_task;
+#endif
+ struct mlx4_en_perf_stats pstats;
+ struct mlx4_en_pkt_stats pkstats;
+ struct mlx4_en_flow_stats_rx rx_priority_flowstats[MLX4_NUM_PRIORITIES];
+ struct mlx4_en_flow_stats_tx tx_priority_flowstats[MLX4_NUM_PRIORITIES];
+ struct mlx4_en_flow_stats_rx rx_flowstats;
+ struct mlx4_en_flow_stats_tx tx_flowstats;
+ struct mlx4_en_port_stats port_stats;
+ struct mlx4_en_vport_stats vport_stats;
+ struct mlx4_en_vf_stats vf_stats;
+ struct mlx4_en_stats_bitmap stats_bitmap;
+ struct list_head mc_list;
+ struct list_head curr_list;
+ u64 broadcast_id;
+ struct mlx4_en_stat_out_mbox hw_stats;
+ int vids[128];
+ bool wol;
+ //struct device *ddev;
+ u32 counter_index;
+ struct en_port *vf_ports[MLX4_MAX_NUM_VF];
+ struct hlist_head mac_hash[MLX4_EN_MAC_HASH_SIZE];
+#ifdef KMOD_MODIFIED
+ int stat_reset;
+#else
+ struct hwtstamp_config hwtstamp_config;
+#endif
+
+#ifndef CONFIG_COMPAT_DISABLE_DCB
+#ifdef CONFIG_MLX4_EN_DCB
+ struct ieee_ets ets;
+ u16 maxrate[IEEE_8021QAZ_MAX_TCS];
+ enum dcbnl_cndd_states cndd_state[IEEE_8021QAZ_MAX_TCS];
+#endif
+#endif
+#ifdef CONFIG_RFS_ACCEL
+ spinlock_t filters_lock;
+ int last_filter_id;
+ struct list_head filters;
+ struct hlist_head filter_hash[1 << MLX4_EN_FILTER_HASH_SHIFT];
+#endif
+ u64 tunnel_reg_id;
+ __be16 vxlan_port;
+#ifdef CONFIG_COMPAT_EN_SYSFS
+ int sysfs_group_initialized;
+#endif
+
+ u32 pflags;
+ u8 rss_key[MLX4_EN_RSS_KEY_SIZE];
+#ifdef HAVE_ETH_SS_RSS_HASH_FUNCS
+ u8 rss_hash_fn;
+#endif
+};
+
+enum mlx4_en_wol {
+ MLX4_EN_WOL_MAGIC = (1ULL << 61),
+ MLX4_EN_WOL_ENABLED = (1ULL << 62),
+};
+
+struct mlx4_mac_entry {
+ struct hlist_node hlist;
+ unsigned char mac[ETH_ALEN];
+ u64 reg_id;
+ //struct rcu_head rcu;
+};
+
+static inline struct mlx4_cqe *mlx4_en_get_cqe(void *buf, int idx, int cqe_sz)
+{
+ return buf + idx * cqe_sz;
+}
+
+#ifdef KMOD_DISABLED
+
+#ifdef CONFIG_NET_RX_BUSY_POLL
+static inline void mlx4_en_cq_init_lock(struct mlx4_en_cq *cq)
+{
+ spin_lock_init(&cq->poll_lock);
+ cq->state = MLX4_EN_CQ_STATE_IDLE;
+}
+
+/* called from the device poll routine to get ownership of a cq */
+static inline bool mlx4_en_cq_lock_napi(struct mlx4_en_cq *cq)
+{
+ int rc = true;
+ spin_lock(&cq->poll_lock);
+ if (cq->state & MLX4_CQ_LOCKED) {
+ WARN_ON(cq->state & MLX4_EN_CQ_STATE_NAPI);
+ cq->state |= MLX4_EN_CQ_STATE_NAPI_YIELD;
+ rc = false;
+ } else
+ /* we don't care if someone yielded */
+ cq->state = MLX4_EN_CQ_STATE_NAPI;
+ spin_unlock(&cq->poll_lock);
+ return rc;
+}
+
+/* returns true if someone tried to get the cq while napi had it */
+static inline bool mlx4_en_cq_unlock_napi(struct mlx4_en_cq *cq)
+{
+ int rc = false;
+ spin_lock(&cq->poll_lock);
+ WARN_ON(cq->state & (MLX4_EN_CQ_STATE_POLL |
+ MLX4_EN_CQ_STATE_NAPI_YIELD));
+
+ if (cq->state & MLX4_EN_CQ_STATE_POLL_YIELD)
+ rc = true;
+ cq->state = MLX4_EN_CQ_STATE_IDLE;
+ spin_unlock(&cq->poll_lock);
+ return rc;
+}
+
+/* called from mlx4_en_low_latency_poll() */
+static inline bool mlx4_en_cq_lock_poll(struct mlx4_en_cq *cq)
+{
+ int rc = true;
+ spin_lock_bh(&cq->poll_lock);
+ if ((cq->state & MLX4_CQ_LOCKED)) {
+ struct net_device *dev = cq->dev;
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_rx_ring *rx_ring = priv->rx_ring[cq->ring];
+
+ cq->state |= MLX4_EN_CQ_STATE_POLL_YIELD;
+ rc = false;
+ rx_ring->yields++;
+ } else
+ /* preserve yield marks */
+ cq->state |= MLX4_EN_CQ_STATE_POLL;
+ spin_unlock_bh(&cq->poll_lock);
+ return rc;
+}
+
+/* returns true if someone tried to get the cq while it was locked */
+static inline bool mlx4_en_cq_unlock_poll(struct mlx4_en_cq *cq)
+{
+ int rc = false;
+ spin_lock_bh(&cq->poll_lock);
+ WARN_ON(cq->state & (MLX4_EN_CQ_STATE_NAPI));
+
+ if (cq->state & MLX4_EN_CQ_STATE_POLL_YIELD)
+ rc = true;
+ cq->state = MLX4_EN_CQ_STATE_IDLE;
+ spin_unlock_bh(&cq->poll_lock);
+ return rc;
+}
+
+/* true if a socket is polling, even if it did not get the lock */
+static inline bool mlx4_en_cq_busy_polling(struct mlx4_en_cq *cq)
+{
+ WARN_ON(!(cq->state & MLX4_CQ_LOCKED));
+ return cq->state & CQ_USER_PEND;
+}
+#else
+static inline void mlx4_en_cq_init_lock(struct mlx4_en_cq *cq)
+{
+}
+
+static inline bool mlx4_en_cq_lock_napi(struct mlx4_en_cq *cq)
+{
+ return true;
+}
+
+static inline bool mlx4_en_cq_unlock_napi(struct mlx4_en_cq *cq)
+{
+ return false;
+}
+
+static inline bool mlx4_en_cq_lock_poll(struct mlx4_en_cq *cq)
+{
+ return false;
+}
+
+static inline bool mlx4_en_cq_unlock_poll(struct mlx4_en_cq *cq)
+{
+ return false;
+}
+
+static inline bool mlx4_en_cq_busy_polling(struct mlx4_en_cq *cq)
+{
+ return false;
+}
+#endif /* CONFIG_NET_RX_BUSY_POLL */
+#endif
+
+#define MLX4_EN_WOL_DO_MODIFY (1ULL << 63)
+
+#ifdef KMOD_MODIFIED
+void mlx4_en_update_loopback_state(struct rte_eth_dev *dev,
+ netdev_features_t features);
+
+void mlx4_en_destroy_netdev(struct rte_eth_dev *dev);
+int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
+ struct mlx4_en_port_profile *prof);
+
+int mlx4_en_start_port(struct rte_eth_dev *dev);
+void mlx4_en_stop_port(struct rte_eth_dev *dev, int detach);
+#endif
+int mlx4_en_get_vport_stats(struct mlx4_en_dev *mdev, u8 port);
+void mlx4_en_set_stats_bitmap(struct mlx4_dev *dev,
+ struct mlx4_en_stats_bitmap *stats_bitmap,
+ u8 rx_ppp, u8 rx_pause,
+ u8 tx_ppp, u8 tx_pause);
+
+int mlx4_disable_32_14_4_e_write(struct mlx4_dev *dev, u8 config, int port);
+int mlx4_disable_32_14_4_e_read(struct mlx4_dev *dev, u8 *config, int port);
+
+void mlx4_en_free_resources(struct mlx4_en_priv *priv);
+int mlx4_en_alloc_resources(struct mlx4_en_priv *priv);
+
+int mlx4_en_create_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq,
+ int entries, int ring, enum cq_type mode, int node);
+void mlx4_en_destroy_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq);
+int mlx4_en_activate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq,
+ int cq_idx, int timestamp_en);
+void mlx4_en_deactivate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq);
+int mlx4_en_set_cq_moder(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq);
+int mlx4_en_arm_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq);
+
+void mlx4_en_tx_irq(struct mlx4_cq *mcq);
+#if defined(NDO_SELECT_QUEUE_HAS_ACCEL_PRIV) || defined(HAVE_SELECT_QUEUE_FALLBACK_T)
+u16 mlx4_en_select_queue(struct net_device *dev, struct sk_buff *skb,
+#ifdef HAVE_SELECT_QUEUE_FALLBACK_T
+ void *accel_priv, select_queue_fallback_t fallback);
+#else
+ void *accel_priv);
+#endif
+#else /* NDO_SELECT_QUEUE_HAS_ACCEL_PRIV */
+#ifdef KMOD_DISABLED
+u16 mlx4_en_select_queue(struct net_device *dev, struct sk_buff *skb);
+#endif
+#endif
+#ifdef KMOD_DISABLED
+netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev);
+#endif
+
+int mlx4_en_create_tx_ring(struct mlx4_en_priv *priv,
+ struct mlx4_en_tx_ring *ring,
+ u32 size, u16 stride,
+ int node, int queue_index);
+void mlx4_en_destroy_tx_ring(struct mlx4_en_priv *priv,
+ struct mlx4_en_tx_ring **pring);
+int mlx4_en_activate_tx_ring(struct mlx4_en_priv *priv,
+ struct mlx4_en_tx_ring *ring,
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ int cq, int user_prio);
+#else
+ int cq);
+#endif
+void mlx4_en_deactivate_tx_ring(struct mlx4_en_priv *priv,
+ struct mlx4_en_tx_ring *ring);
+void mlx4_en_set_num_rx_rings(struct mlx4_en_dev *mdev);
+void mlx4_en_recover_from_oom(struct mlx4_en_priv *priv);
+int mlx4_en_create_rx_ring(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_ring *ring,
+ u32 size, u16 stride, int node);
+void mlx4_en_destroy_rx_ring(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_ring **pring,
+ u32 size, u16 stride);
+int mlx4_en_activate_rx_rings(struct mlx4_en_priv *priv);
+void mlx4_en_deactivate_rx_ring(struct mlx4_en_priv *priv,
+ struct mlx4_en_rx_ring *ring);
+#ifdef KMOD_MODIFIED
+/*int mlx4_en_process_rx_cq(struct rte_eth_dev *dev,
+ struct mlx4_en_cq *cq,
+ int budget);
+ */
+int mlx4_en_poll_rx_cq(struct mlx4_en_cq *cq, int budget);
+int mlx4_en_poll_tx_cq(struct mlx4_en_cq *cq, int budget);
+#endif
+void mlx4_en_fill_qp_context(struct mlx4_en_priv *priv, int size, int stride,
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ int is_tx, int rss, int qpn, int cqn, int user_prio,
+#else
+ int is_tx, int rss, int qpn, int cqn,
+#endif
+ struct mlx4_qp_context *context);
+void mlx4_en_sqp_event(struct mlx4_qp *qp, enum mlx4_event event);
+int mlx4_en_map_buffer(struct mlx4_buf *buf);
+void mlx4_en_unmap_buffer(struct mlx4_buf *buf);
+int mlx4_en_change_mcast_loopback(struct mlx4_en_priv *priv, struct mlx4_qp *qp,
+ int loopback);
+
+#ifdef KMOD_MODIFIED
+void mlx4_en_calc_rx_buf(struct rte_eth_dev *dev);
+#endif
+//int mlx4_en_config_rss_steer(struct mlx4_en_priv *priv);
+//void mlx4_en_release_rss_steer(struct mlx4_en_priv *priv);
+//int mlx4_en_create_drop_qp(struct mlx4_en_priv *priv);
+//void mlx4_en_destroy_drop_qp(struct mlx4_en_priv *priv);
+#ifdef KMOD_MODIFIED
+int mlx4_en_free_tx_buf(struct rte_eth_dev *dev, struct mlx4_en_tx_ring *ring);
+#endif
+void mlx4_en_rx_irq(struct mlx4_cq *mcq);
+
+int mlx4_SET_MCAST_FLTR(struct mlx4_dev *dev, u8 port, u64 mac, u64 clear, u8 mode);
+int mlx4_SET_VLAN_FLTR(struct mlx4_dev *dev, struct mlx4_en_priv *priv);
+
+int mlx4_get_vport_ethtool_stats(struct mlx4_dev *dev, int port,
+ struct mlx4_en_vport_stats *vport_stats,
+ int reset, int *read_counters);
+
+int mlx4_en_DUMP_ETH_STATS(struct mlx4_en_dev *mdev, u8 port, u8 reset);
+int mlx4_en_QUERY_PORT(struct mlx4_en_dev *mdev, u8 port);
+
+#ifndef CONFIG_COMPAT_DISABLE_DCB
+#ifdef CONFIG_MLX4_EN_DCB
+extern const struct dcbnl_rtnl_ops mlx4_en_dcbnl_ops;
+extern const struct dcbnl_rtnl_ops mlx4_en_dcbnl_pfc_ops;
+#endif
+#endif
+
+#ifdef CONFIG_SYSFS_QCN
+
+#ifdef KMOD_MODIFIED
+int mlx4_en_dcbnl_ieee_getqcn(struct rte_eth_dev *dev,
+ struct ieee_qcn *qcn);
+int mlx4_en_dcbnl_ieee_setqcn(struct rte_eth_dev *dev,
+ struct ieee_qcn *qcn);
+int mlx4_en_dcbnl_ieee_getqcnstats(struct rte_eth_dev *dev,
+ struct ieee_qcn_stats *qcn_stats);
+#endif
+#endif
+
+#ifdef CONFIG_COMPAT_EN_SYSFS
+int mlx4_en_sysfs_create(struct net_device *dev);
+void mlx4_en_sysfs_remove(struct net_device *dev);
+#endif
+
+#ifdef CONFIG_SYSFS_MAXRATE
+#ifdef KMOD_MODIFIED
+
+int mlx4_en_dcbnl_ieee_setmaxrate(struct rte_eth_dev *dev,
+ struct ieee_maxrate *maxrate);
+int mlx4_en_dcbnl_ieee_getmaxrate(struct rte_eth_dev *dev,
+ struct ieee_maxrate *maxrate);
+#endif
+#endif
+
+#ifdef KMOD_DISABLED
+#ifdef CONFIG_SYSFS_NUM_CHANNELS
+struct ethtool_channels {
+ __u32 cmd;
+ __u32 max_rx;
+ __u32 max_tx;
+ __u32 max_other;
+ __u32 max_combined;
+ __u32 rx_count;
+ __u32 tx_count;
+ __u32 other_count;
+ __u32 combined_count;
+};
+
+int mlx4_en_set_channels(struct net_device *dev,
+ struct ethtool_channels *channel);
+void mlx4_en_get_channels(struct net_device *dev,
+ struct ethtool_channels *channel);
+#endif
+#endif
+
+#ifdef KMOD_MODIFIED
+int mlx4_en_setup_tc(struct rte_eth_dev *dev, u8 up);
+#endif
+
+#ifdef CONFIG_RFS_ACCEL
+#ifdef HAVE_NDO_RX_FLOW_STEER
+void mlx4_en_cleanup_filters(struct mlx4_en_priv *priv);
+#endif
+#endif
+
+#ifdef KMOD_DISABLED
+#define MLX4_EN_NUM_SELF_TEST 5
+void mlx4_en_ex_selftest(struct net_device *dev, u32 *flags, u64 *buf);
+void mlx4_en_ptp_overflow_check(struct mlx4_en_dev *mdev);
+
+#define DEV_FEATURE_CHANGED(dev, new_features, feature) \
+ ((dev->features & feature) ^ (new_features & feature))
+
+int mlx4_en_reset_config(struct net_device *dev,
+ struct hwtstamp_config ts_config,
+ netdev_features_t new_features);
+#endif
+void mlx4_en_update_pfc_stats_bitmap(struct mlx4_dev *dev,
+ struct mlx4_en_stats_bitmap *stats_bitmap,
+ u8 rx_ppp, u8 rx_pause,
+ u8 tx_ppp, u8 tx_pause);
+#ifdef KMOD_DISABLED
+int mlx4_en_netdev_event(struct notifier_block *this,
+ unsigned long event, void *ptr);
+#endif
+
+/*
+ * Functions for time stamping
+ */
+u64 mlx4_en_get_cqe_ts(struct mlx4_cqe *cqe);
+#ifdef KMOD_DISABLED
+void mlx4_en_fill_hwtstamps(struct mlx4_en_dev *mdev,
+ struct skb_shared_hwtstamps *hwts,
+ u64 timestamp);
+#endif
+void mlx4_en_init_timestamp(struct mlx4_en_dev *mdev);
+#if defined (HAVE_PTP_CLOCK_INFO) && (defined (CONFIG_PTP_1588_CLOCK) || defined(CONFIG_PTP_1588_CLOCK_MODULE))
+void mlx4_en_remove_timestamp(struct mlx4_en_dev *mdev);
+#endif
+
+#ifdef KMOD_DISABLED
+/* Globals
+ */
+extern const struct ethtool_ops mlx4_en_ethtool_ops;
+#ifdef HAVE_ETHTOOL_OPS_EXT
+extern const struct ethtool_ops_ext mlx4_en_ethtool_ops_ext;
+#endif
+#endif
+
+/*
+ * printk / logging functions
+ */
+
+#if !defined(HAVE_VA_FORMAT) || defined CONFIG_X86_XEN
+#define en_print(level, priv, format, arg...) \
+ { \
+ if ((priv)->registered) \
+ printk(level "%s: %s: " format, DRV_NAME, \
+ "mlx4_en", ## arg); \
+ else \
+ printk(level "%s: %s: Port %d: " format, \
+ DRV_NAME, "mlx4_en", \
+ (priv)->port, ## arg); \
+ }
+#else
+__printf(3, 4)
+void en_print(const char *level, const struct mlx4_en_priv *priv,
+ const char *format, ...);
+#endif
+
+#define en_dbg(mlevel, priv, format, ...) \
+do { \
+ if (NETIF_MSG_##mlevel & (priv)->msg_enable) \
+ en_print(KERN_DEBUG, priv, format, ##__VA_ARGS__); \
+} while (0)
+#define en_warn(priv, format, ...) \
+ en_print(KERN_WARNING, priv, format, ##__VA_ARGS__)
+#define en_err(priv, format, ...) \
+ en_print(KERN_ERR, priv, format, ##__VA_ARGS__)
+#define en_info(priv, format, ...) \
+ en_print(KERN_INFO, priv, format, ##__VA_ARGS__)
+
+#define mlx4_err(mdev, format, ...) \
+ pr_err(DRV_NAME " %s: " format, \
+ "mlx4_en", ##__VA_ARGS__)
+#define mlx4_info(mdev, format, ...) \
+ pr_info(DRV_NAME " %s: " format, \
+ "mlx4_en", ##__VA_ARGS__)
+#define mlx4_warn(mdev, format, ...) \
+ pr_warn(DRV_NAME " %s: " format, \
+ "mlx4_en", ##__VA_ARGS__)
+#ifdef KMOD_DISABLED
+#ifdef CONFIG_SYSFS_INDIR_SETTING
+u32 mlx4_en_get_rxfh_indir_size(struct net_device *dev);
+int mlx4_en_get_rxfh_indir(struct net_device *dev, u32 *ring_index);
+int mlx4_en_set_rxfh_indir(struct net_device *dev, const u32 *ring_index);
+#endif
+#ifdef CONFIG_SYSFS_LOOPBACK
+int mlx4_en_set_features(struct net_device *netdev,
+#ifdef HAVE_NET_DEVICE_OPS_EXT
+ u32 features);
+#else
+ netdev_features_t features);
+#endif
+#endif
+#endif
+
+static inline struct mlx4_en_priv *rtedev_priv(struct rte_eth_dev *dev)
+{
+ return dev->data->dev_private;
+}
+
+#endif
+
+#include "post_kmod.h"
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/mlx4_stats.h b/drivers/net/mlnx_uio/mlnx/mlx4/mlx4_stats.h
new file mode 100644
index 0000000..3317b9c
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/mlx4_stats.h
@@ -0,0 +1,153 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+#ifndef _MLX4_STATS_
+#define _MLX4_STATS_
+
+#ifdef MLX4_EN_PERF_STAT
+#define NUM_PERF_STATS NUM_PERF_COUNTERS
+#else
+#define NUM_PERF_STATS 0
+#endif
+
+#define NUM_PRIORITIES 9
+#define NUM_PRIORITY_STATS 2
+
+struct mlx4_en_pkt_stats {
+ unsigned long rx_multicast_packets;
+ unsigned long rx_broadcast_packets;
+ unsigned long rx_jabbers;
+ unsigned long rx_in_range_length_error;
+ unsigned long rx_out_range_length_error;
+ unsigned long tx_multicast_packets;
+ unsigned long tx_broadcast_packets;
+ unsigned long rx_prio[NUM_PRIORITIES][NUM_PRIORITY_STATS];
+ unsigned long tx_prio[NUM_PRIORITIES][NUM_PRIORITY_STATS];
+#define NUM_PKT_STATS 43
+};
+
+struct mlx4_en_vf_stats {
+ unsigned long rx_multicast_packets;
+ unsigned long rx_broadcast_packets;
+ unsigned long rx_filtered;
+ unsigned long tx_multicast_packets;
+ unsigned long tx_broadcast_packets;
+ unsigned long tx_dropped;
+#define NUM_VF_STATS 6
+};
+
+struct mlx4_en_vport_stats {
+ unsigned long rx_unicast_packets;
+ unsigned long rx_unicast_bytes;
+ unsigned long rx_multicast_packets;
+ unsigned long rx_multicast_bytes;
+ unsigned long rx_broadcast_packets;
+ unsigned long rx_broadcast_bytes;
+ unsigned long rx_dropped;
+ unsigned long rx_filtered;
+ unsigned long tx_unicast_packets;
+ unsigned long tx_unicast_bytes;
+ unsigned long tx_multicast_packets;
+ unsigned long tx_multicast_bytes;
+ unsigned long tx_broadcast_packets;
+ unsigned long tx_broadcast_bytes;
+ unsigned long tx_dropped;
+#define NUM_VPORT_STATS 15
+};
+
+
+struct mlx4_en_port_stats {
+#ifdef CONFIG_COMPAT_LRO_ENABLED
+ unsigned long lro_aggregated;
+ unsigned long lro_flushed;
+ unsigned long lro_no_desc;
+#endif
+ unsigned long tso_packets;
+ unsigned long xmit_more;
+ unsigned long queue_stopped;
+ unsigned long wake_queue;
+ unsigned long tx_timeout;
+ unsigned long rx_alloc_failed;
+ unsigned long rx_chksum_good;
+ unsigned long rx_chksum_none;
+ unsigned long rx_chksum_complete;
+ unsigned long tx_chksum_offload;
+#ifdef CONFIG_COMPAT_LRO_ENABLED
+#define NUM_PORT_STATS 13
+#else
+#define NUM_PORT_STATS 10
+#endif
+};
+
+struct mlx4_en_perf_stats {
+ u32 tx_poll;
+ u64 tx_pktsz_avg;
+ u32 inflight_avg;
+ u16 tx_coal_avg;
+ u16 rx_coal_avg;
+ u32 napi_quota;
+#define NUM_PERF_COUNTERS 6
+};
+
+#define NUM_MAIN_STATS 21
+
+#define MLX4_NUM_PRIORITIES 8
+
+struct mlx4_en_flow_stats_rx {
+ u64 rx_pause;
+ u64 rx_pause_duration;
+ u64 rx_pause_transition;
+#define NUM_FLOW_STATS_RX 3
+#define NUM_FLOW_PRIORITY_STATS_RX (NUM_FLOW_STATS_RX * \
+ MLX4_NUM_PRIORITIES)
+};
+
+struct mlx4_en_flow_stats_tx {
+ u64 tx_pause;
+ u64 tx_pause_duration;
+ u64 tx_pause_transition;
+#define NUM_FLOW_STATS_TX 3
+#define NUM_FLOW_PRIORITY_STATS_TX (NUM_FLOW_STATS_TX * \
+ MLX4_NUM_PRIORITIES)
+};
+
+#define NUM_FLOW_STATS (NUM_FLOW_STATS_RX + NUM_FLOW_STATS_TX + \
+ NUM_FLOW_PRIORITY_STATS_TX + \
+ NUM_FLOW_PRIORITY_STATS_RX)
+
+struct mlx4_en_stat_out_flow_control_mbox {
+ /* Total number of PAUSE frames received from the far-end port */
+ __be64 rx_pause;
+ /* Total number of microseconds that far-end port requested to pause
+ * transmission of packets
+ */
+ __be64 rx_pause_duration;
+ /* Number of receiver transitions from XOFF state to XON state */
+ __be64 rx_pause_transition;
+ /* Total number of PAUSE frames sent to the far-end port */
+ __be64 tx_pause;
+ /* Total time in microseconds that transmission of packets has been
+ * paused
+ */
+ __be64 tx_pause_duration;
+ /* Number of transmitter transitions from XOFF state to XON state */
+ __be64 tx_pause_transition;
+ /* Reserved */
+ __be64 reserved[2];
+};
+
+enum {
+ MLX4_DUMP_ETH_STATS_FLOW_CONTROL = 1 << 12
+};
+
+#define NUM_ALL_STATS (NUM_MAIN_STATS + NUM_PORT_STATS + NUM_PKT_STATS + \
+ NUM_VF_STATS + NUM_FLOW_STATS + NUM_PERF_STATS + \
+ NUM_VPORT_STATS)
+
+#define MLX4_FIND_NETDEV_STAT(n) (offsetof(struct net_device_stats, n) / \
+ sizeof(((struct net_device_stats *)0)->n))
+
+#endif
+
+#include "post_kmod.h"
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/mr.c b/drivers/net/mlnx_uio/mlnx/mlx4/mr.c
new file mode 100644
index 0000000..cf84287
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/mr.c
@@ -0,0 +1,1178 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2004 Topspin Communications. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007, 2008 Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+
+
+#include "mlx4.h"
+#include "icm.h"
+
+#include "log2.h"
+
+static u32 mlx4_buddy_alloc(struct mlx4_buddy *buddy, int order)
+{
+ int o;
+ int m;
+ u32 seg;
+
+ spin_lock(&buddy->lock);
+
+ for (o = order; o <= buddy->max_order; ++o)
+ if (buddy->num_free[o]) {
+ m = 1 << (buddy->max_order - o);
+ seg = find_first_bit(buddy->bits[o], m);
+ if (seg < m)
+ goto found;
+ }
+
+ spin_unlock(&buddy->lock);
+ return -1;
+
+ found:
+ clear_bit(seg, buddy->bits[o]);
+ --buddy->num_free[o];
+
+ while (o > order) {
+ --o;
+ seg <<= 1;
+ set_bit(seg ^ 1, buddy->bits[o]);
+ ++buddy->num_free[o];
+ }
+
+ spin_unlock(&buddy->lock);
+
+ seg <<= order;
+
+ return seg;
+}
+
+static void mlx4_buddy_free(struct mlx4_buddy *buddy, u32 seg, int order)
+{
+ seg >>= order;
+
+ spin_lock(&buddy->lock);
+
+ while (test_bit(seg ^ 1, buddy->bits[order])) {
+ clear_bit(seg ^ 1, buddy->bits[order]);
+ --buddy->num_free[order];
+ seg >>= 1;
+ ++order;
+ }
+
+ set_bit(seg, buddy->bits[order]);
+ ++buddy->num_free[order];
+
+ spin_unlock(&buddy->lock);
+}
+
+static int mlx4_buddy_init(struct mlx4_buddy *buddy, int max_order)
+{
+ int i, s;
+
+ buddy->max_order = max_order;
+ spin_lock_init(&buddy->lock);
+
+ buddy->bits = kcalloc(buddy->max_order + 1, sizeof (long *),
+ GFP_KERNEL);
+ buddy->num_free = kcalloc((buddy->max_order + 1), sizeof *buddy->num_free,
+ GFP_KERNEL);
+ if (!buddy->bits || !buddy->num_free)
+ goto err_out;
+
+ for (i = 0; i <= buddy->max_order; ++i) {
+ s = BITS_TO_LONGS(1 << (buddy->max_order - i));
+ buddy->bits[i] = kcalloc(s, sizeof (long), GFP_KERNEL | __GFP_NOWARN);
+ if (!buddy->bits[i]) {
+ buddy->bits[i] = vzalloc(s * sizeof(long));
+ if (!buddy->bits[i])
+ goto err_out_free;
+ }
+ }
+
+ set_bit(0, buddy->bits[buddy->max_order]);
+ buddy->num_free[buddy->max_order] = 1;
+
+ return 0;
+
+err_out_free:
+#ifdef KMOD_MODIFIED
+ for (i = 0; i <= buddy->max_order; ++i)
+ kfree(buddy->bits[i]);
+#endif
+
+err_out:
+ kfree(buddy->bits);
+ kfree(buddy->num_free);
+
+ return -ENOMEM;
+}
+
+static void mlx4_buddy_cleanup(struct mlx4_buddy *buddy)
+{
+ int i;
+
+#ifdef KMOD_MODIFIED
+ for (i = 0; i <= buddy->max_order; ++i)
+ kfree(buddy->bits[i]);
+#endif
+
+ kfree(buddy->bits);
+ kfree(buddy->num_free);
+}
+
+u32 __mlx4_alloc_mtt_range(struct mlx4_dev *dev, int order)
+{
+ struct mlx4_mr_table *mr_table = &mlx4_priv(dev)->mr_table;
+ u32 seg;
+ int seg_order;
+ u32 offset;
+
+ seg_order = max_t(int, order - log_mtts_per_seg, 0);
+
+ seg = mlx4_buddy_alloc(&mr_table->mtt_buddy, seg_order);
+ if (seg == -1)
+ return -1;
+
+ offset = seg * (1 << log_mtts_per_seg);
+
+ if (mlx4_table_get_range(dev, &mr_table->mtt_table, offset,
+ offset + (1 << order) - 1)) {
+ mlx4_buddy_free(&mr_table->mtt_buddy, seg, seg_order);
+ return -1;
+ }
+
+ return offset;
+}
+
+static u32 mlx4_alloc_mtt_range(struct mlx4_dev *dev, int order)
+{
+ u64 in_param = 0;
+ u64 out_param;
+ int err;
+
+ if (mlx4_is_mfunc(dev)) {
+ set_param_l(&in_param, order);
+ err = mlx4_cmd_imm(dev, in_param, &out_param, RES_MTT,
+ RES_OP_RESERVE_AND_MAP,
+ MLX4_CMD_ALLOC_RES,
+ MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_WRAPPED);
+ if (err)
+ return -1;
+ return get_param_l(&out_param);
+ }
+ return __mlx4_alloc_mtt_range(dev, order);
+}
+
+int mlx4_mtt_init(struct mlx4_dev *dev, int npages, int page_shift,
+ struct mlx4_mtt *mtt)
+{
+ int i;
+
+ if (!npages) {
+ mtt->order = -1;
+ mtt->page_shift = MLX4_ICM_PAGE_SHIFT;
+ return 0;
+ } else
+ mtt->page_shift = page_shift;
+
+ for (mtt->order = 0, i = 1; i < npages; i <<= 1)
+ ++mtt->order;
+
+ mtt->offset = mlx4_alloc_mtt_range(dev, mtt->order);
+ if (mtt->offset == -1)
+ return -ENOMEM;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_mtt_init);
+
+void __mlx4_free_mtt_range(struct mlx4_dev *dev, u32 offset, int order)
+{
+ u32 first_seg;
+ int seg_order;
+ struct mlx4_mr_table *mr_table = &mlx4_priv(dev)->mr_table;
+
+ seg_order = max_t(int, order - log_mtts_per_seg, 0);
+ first_seg = offset / (1 << log_mtts_per_seg);
+
+ mlx4_buddy_free(&mr_table->mtt_buddy, first_seg, seg_order);
+ mlx4_table_put_range(dev, &mr_table->mtt_table, offset,
+ offset + (1 << order) - 1);
+}
+
+static void mlx4_free_mtt_range(struct mlx4_dev *dev, u32 offset, int order)
+{
+ u64 in_param = 0;
+ int err;
+
+ if (mlx4_is_mfunc(dev)) {
+ set_param_l(&in_param, offset);
+ set_param_h(&in_param, order);
+ err = mlx4_cmd(dev, in_param, RES_MTT, RES_OP_RESERVE_AND_MAP,
+ MLX4_CMD_FREE_RES,
+ MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_WRAPPED);
+ if (err)
+ mlx4_warn(dev, "Failed to free mtt range at:%d order:%d\n",
+ offset, order);
+ return;
+ }
+ __mlx4_free_mtt_range(dev, offset, order);
+}
+
+void mlx4_mtt_cleanup(struct mlx4_dev *dev, struct mlx4_mtt *mtt)
+{
+ if (mtt->order < 0)
+ return;
+
+ mlx4_free_mtt_range(dev, mtt->offset, mtt->order);
+}
+EXPORT_SYMBOL_GPL(mlx4_mtt_cleanup);
+
+u64 mlx4_mtt_addr(struct mlx4_dev *dev, struct mlx4_mtt *mtt)
+{
+ return (u64) mtt->offset * dev->caps.mtt_entry_sz;
+}
+EXPORT_SYMBOL_GPL(mlx4_mtt_addr);
+
+static u32 hw_index_to_key(u32 ind)
+{
+ return (ind >> 24) | (ind << 8);
+}
+
+static u32 key_to_hw_index(u32 key)
+{
+ return (key << 24) | (key >> 8);
+}
+
+static int mlx4_SW2HW_MPT(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox,
+ int mpt_index)
+{
+ return mlx4_cmd(dev, mailbox->dma, mpt_index,
+ 0, MLX4_CMD_SW2HW_MPT, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_WRAPPED);
+}
+
+static int mlx4_HW2SW_MPT(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox,
+ int mpt_index)
+{
+ return mlx4_cmd_box(dev, 0, mailbox ? mailbox->dma : 0, mpt_index,
+ !mailbox, MLX4_CMD_HW2SW_MPT,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_WRAPPED);
+}
+
+/* Must protect against concurrent access */
+int mlx4_mr_hw_get_mpt(struct mlx4_dev *dev, struct mlx4_mr *mmr,
+ struct mlx4_mpt_entry ***mpt_entry)
+{
+ int err;
+ int key = key_to_hw_index(mmr->key) & (dev->caps.num_mpts - 1);
+ struct mlx4_cmd_mailbox *mailbox = NULL;
+
+ if (mmr->enabled != MLX4_MPT_EN_HW)
+ return -EINVAL;
+
+ err = mlx4_HW2SW_MPT(dev, NULL, key);
+ if (err) {
+ mlx4_warn(dev, "HW2SW_MPT failed (%d).", err);
+ mlx4_warn(dev, "Most likely the MR has MWs bound to it.\n");
+ return err;
+ }
+
+ mmr->enabled = MLX4_MPT_EN_SW;
+
+ if (!mlx4_is_mfunc(dev)) {
+ **mpt_entry = mlx4_table_find(
+ &mlx4_priv(dev)->mr_table.dmpt_table,
+ key, NULL);
+ } else {
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR_OR_NULL(mailbox))
+ return PTR_ERR(mailbox);
+
+ err = mlx4_cmd_box(dev, 0, mailbox->dma, key,
+ 0, MLX4_CMD_QUERY_MPT,
+ MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_WRAPPED);
+ if (err)
+ goto free_mailbox;
+
+ *mpt_entry = (struct mlx4_mpt_entry **)&mailbox->buf;
+ }
+
+ if (!(*mpt_entry) || !(**mpt_entry)) {
+ err = -ENOMEM;
+ goto free_mailbox;
+ }
+
+ return 0;
+
+free_mailbox:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_mr_hw_get_mpt);
+
+int mlx4_mr_hw_write_mpt(struct mlx4_dev *dev, struct mlx4_mr *mmr,
+ struct mlx4_mpt_entry **mpt_entry)
+{
+ int err;
+
+ if (!mlx4_is_mfunc(dev)) {
+ /* Make sure any changes to this entry are flushed */
+ wmb();
+
+ *(u8 *)(*mpt_entry) = MLX4_MPT_STATUS_HW;
+
+ /* Make sure the new status is written */
+ wmb();
+
+ err = mlx4_SYNC_TPT(dev);
+ } else {
+ int key = key_to_hw_index(mmr->key) & (dev->caps.num_mpts - 1);
+
+ struct mlx4_cmd_mailbox *mailbox =
+ container_of((void *)mpt_entry, struct mlx4_cmd_mailbox,
+ buf);
+
+ err = mlx4_SW2HW_MPT(dev, mailbox, key);
+ }
+
+ if (!err) {
+ mmr->pd = be32_to_cpu((*mpt_entry)->pd_flags) & MLX4_MPT_PD_MASK;
+ mmr->enabled = MLX4_MPT_EN_HW;
+ }
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_mr_hw_write_mpt);
+
+void mlx4_mr_hw_put_mpt(struct mlx4_dev *dev,
+ struct mlx4_mpt_entry **mpt_entry)
+{
+ if (mlx4_is_mfunc(dev)) {
+ struct mlx4_cmd_mailbox *mailbox =
+ container_of((void *)mpt_entry, struct mlx4_cmd_mailbox,
+ buf);
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ }
+}
+EXPORT_SYMBOL_GPL(mlx4_mr_hw_put_mpt);
+
+int mlx4_mr_hw_change_pd(struct mlx4_dev *dev, struct mlx4_mpt_entry *mpt_entry,
+ u32 pdn)
+{
+ u32 pd_flags = be32_to_cpu(mpt_entry->pd_flags) & ~MLX4_MPT_PD_MASK;
+ /* The wrapper function will put the slave's id here */
+ if (mlx4_is_mfunc(dev))
+ pd_flags &= ~MLX4_MPT_PD_VF_MASK;
+
+ mpt_entry->pd_flags = cpu_to_be32(pd_flags |
+ (pdn & MLX4_MPT_PD_MASK)
+ | MLX4_MPT_PD_FLAG_EN_INV);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_mr_hw_change_pd);
+
+int mlx4_mr_hw_change_access(struct mlx4_dev *dev,
+ struct mlx4_mpt_entry *mpt_entry,
+ u32 access)
+{
+ u32 flags = (be32_to_cpu(mpt_entry->flags) & ~MLX4_PERM_MASK) |
+ (access & MLX4_PERM_MASK);
+
+ mpt_entry->flags = cpu_to_be32(flags);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_mr_hw_change_access);
+
+static int mlx4_mr_alloc_reserved(struct mlx4_dev *dev, u32 mridx, u32 pd,
+ u64 iova, u64 size, u32 access, int npages,
+ int page_shift, struct mlx4_mr *mr)
+{
+ mr->iova = iova;
+ mr->size = size;
+ mr->pd = pd;
+ mr->access = access;
+ mr->enabled = MLX4_MPT_DISABLED;
+ mr->key = hw_index_to_key(mridx);
+
+ return mlx4_mtt_init(dev, npages, page_shift, &mr->mtt);
+}
+
+static int mlx4_WRITE_MTT(struct mlx4_dev *dev,
+ struct mlx4_cmd_mailbox *mailbox,
+ int num_entries)
+{
+ return mlx4_cmd(dev, mailbox->dma, num_entries, 0, MLX4_CMD_WRITE_MTT,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+}
+
+int __mlx4_mpt_reserve(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ return mlx4_bitmap_alloc(&priv->mr_table.mpt_bitmap);
+}
+
+static int mlx4_mpt_reserve(struct mlx4_dev *dev)
+{
+ u64 out_param;
+
+ if (mlx4_is_mfunc(dev)) {
+ if (mlx4_cmd_imm(dev, 0, &out_param, RES_MPT, RES_OP_RESERVE,
+ MLX4_CMD_ALLOC_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED))
+ return -1;
+ return get_param_l(&out_param);
+ }
+ return __mlx4_mpt_reserve(dev);
+}
+
+void __mlx4_mpt_release(struct mlx4_dev *dev, u32 index)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ mlx4_bitmap_free(&priv->mr_table.mpt_bitmap, index, MLX4_NO_RR);
+}
+
+static void mlx4_mpt_release(struct mlx4_dev *dev, u32 index)
+{
+ u64 in_param = 0;
+
+ if (mlx4_is_mfunc(dev)) {
+ set_param_l(&in_param, index);
+ if (mlx4_cmd(dev, in_param, RES_MPT, RES_OP_RESERVE,
+ MLX4_CMD_FREE_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED))
+ mlx4_warn(dev, "Failed to release mr index:%d\n",
+ index);
+ return;
+ }
+ __mlx4_mpt_release(dev, index);
+}
+
+int __mlx4_mpt_alloc_icm(struct mlx4_dev *dev, u32 index, gfp_t gfp)
+{
+ struct mlx4_mr_table *mr_table = &mlx4_priv(dev)->mr_table;
+
+ return mlx4_table_get(dev, &mr_table->dmpt_table, index, gfp);
+}
+
+static int mlx4_mpt_alloc_icm(struct mlx4_dev *dev, u32 index, gfp_t gfp)
+{
+ u64 param = 0;
+
+ if (mlx4_is_mfunc(dev)) {
+ set_param_l(&param, index);
+ return mlx4_cmd_imm(dev, param, &param, RES_MPT, RES_OP_MAP_ICM,
+ MLX4_CMD_ALLOC_RES,
+ MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_WRAPPED);
+ }
+ return __mlx4_mpt_alloc_icm(dev, index, gfp);
+}
+
+void __mlx4_mpt_free_icm(struct mlx4_dev *dev, u32 index)
+{
+ struct mlx4_mr_table *mr_table = &mlx4_priv(dev)->mr_table;
+
+ mlx4_table_put(dev, &mr_table->dmpt_table, index);
+}
+
+static void mlx4_mpt_free_icm(struct mlx4_dev *dev, u32 index)
+{
+ u64 in_param = 0;
+
+ if (mlx4_is_mfunc(dev)) {
+ set_param_l(&in_param, index);
+ if (mlx4_cmd(dev, in_param, RES_MPT, RES_OP_MAP_ICM,
+ MLX4_CMD_FREE_RES, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_WRAPPED))
+ mlx4_warn(dev, "Failed to free icm of mr index:%d\n",
+ index);
+ return;
+ }
+ __mlx4_mpt_free_icm(dev, index);
+}
+
+int mlx4_mr_alloc(struct mlx4_dev *dev, u32 pd, u64 iova, u64 size, u32 access,
+ int npages, int page_shift, struct mlx4_mr *mr)
+{
+ u32 index;
+ int err;
+
+ index = mlx4_mpt_reserve(dev);
+ if (index == -1)
+ return -ENOMEM;
+
+ err = mlx4_mr_alloc_reserved(dev, index, pd, iova, size,
+ access, npages, page_shift, mr);
+ if (err)
+ mlx4_mpt_release(dev, index);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_mr_alloc);
+
+static int mlx4_mr_free_reserved(struct mlx4_dev *dev, struct mlx4_mr *mr)
+{
+ int err;
+
+ if (mr->enabled == MLX4_MPT_EN_HW) {
+ err = mlx4_HW2SW_MPT(dev, NULL,
+ key_to_hw_index(mr->key) &
+ (dev->caps.num_mpts - 1));
+ if (err) {
+ mlx4_warn(dev, "HW2SW_MPT failed (%d), MR has MWs bound to it\n",
+ err);
+ return err;
+ }
+
+ mr->enabled = MLX4_MPT_EN_SW;
+ }
+ mlx4_mtt_cleanup(dev, &mr->mtt);
+
+ return 0;
+}
+
+int mlx4_mr_free(struct mlx4_dev *dev, struct mlx4_mr *mr)
+{
+ int ret;
+
+ ret = mlx4_mr_free_reserved(dev, mr);
+ if (ret)
+ return ret;
+ if (mr->enabled)
+ mlx4_mpt_free_icm(dev, key_to_hw_index(mr->key));
+ mlx4_mpt_release(dev, key_to_hw_index(mr->key));
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_mr_free);
+
+void mlx4_mr_rereg_mem_cleanup(struct mlx4_dev *dev, struct mlx4_mr *mr)
+{
+ mlx4_mtt_cleanup(dev, &mr->mtt);
+ mr->mtt.order = -1;
+}
+EXPORT_SYMBOL_GPL(mlx4_mr_rereg_mem_cleanup);
+
+int mlx4_mr_rereg_mem_write(struct mlx4_dev *dev, struct mlx4_mr *mr,
+ u64 iova, u64 size, int npages,
+ int page_shift, struct mlx4_mpt_entry *mpt_entry)
+{
+ int err;
+
+ err = mlx4_mtt_init(dev, npages, page_shift, &mr->mtt);
+ if (err)
+ return err;
+
+ mpt_entry->start = cpu_to_be64(iova);
+ mpt_entry->length = cpu_to_be64(size);
+ mpt_entry->entity_size = cpu_to_be32(page_shift);
+ mpt_entry->flags &= ~(cpu_to_be32(MLX4_MPT_FLAG_FREE |
+ MLX4_MPT_FLAG_SW_OWNS));
+ if (mr->mtt.order < 0) {
+ mpt_entry->flags |= cpu_to_be32(MLX4_MPT_FLAG_PHYSICAL);
+ mpt_entry->mtt_addr = 0;
+ } else {
+ mpt_entry->mtt_addr = cpu_to_be64(mlx4_mtt_addr(dev,
+ &mr->mtt));
+ if (mr->mtt.page_shift == 0)
+ mpt_entry->mtt_sz = cpu_to_be32(1 << mr->mtt.order);
+ }
+ if (mr->mtt.order >= 0 && mr->mtt.page_shift == 0) {
+ /* fast register MR in free state */
+ mpt_entry->flags |= cpu_to_be32(MLX4_MPT_FLAG_FREE);
+ mpt_entry->pd_flags |= cpu_to_be32(MLX4_MPT_PD_FLAG_FAST_REG |
+ MLX4_MPT_PD_FLAG_RAE);
+ } else {
+ mpt_entry->flags |= cpu_to_be32(MLX4_MPT_FLAG_SW_OWNS);
+ }
+ mr->enabled = MLX4_MPT_EN_SW;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_mr_rereg_mem_write);
+
+int mlx4_mr_enable(struct mlx4_dev *dev, struct mlx4_mr *mr)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_mpt_entry *mpt_entry;
+ int err;
+
+ err = mlx4_mpt_alloc_icm(dev, key_to_hw_index(mr->key), GFP_KERNEL);
+ if (err)
+ return err;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox)) {
+ err = PTR_ERR(mailbox);
+ goto err_table;
+ }
+ mpt_entry = mailbox->buf;
+ mpt_entry->flags = cpu_to_be32(MLX4_MPT_FLAG_MIO |
+ MLX4_MPT_FLAG_REGION |
+ mr->access);
+
+ mpt_entry->key = cpu_to_be32(key_to_hw_index(mr->key));
+ mpt_entry->pd_flags = cpu_to_be32(mr->pd | MLX4_MPT_PD_FLAG_EN_INV);
+ mpt_entry->start = cpu_to_be64(mr->iova);
+ mpt_entry->length = cpu_to_be64(mr->size);
+ mpt_entry->entity_size = cpu_to_be32(mr->mtt.page_shift);
+
+ if (mr->mtt.order < 0) {
+ mpt_entry->flags |= cpu_to_be32(MLX4_MPT_FLAG_PHYSICAL);
+ mpt_entry->mtt_addr = 0;
+ } else {
+ mpt_entry->mtt_addr = cpu_to_be64(mlx4_mtt_addr(dev,
+ &mr->mtt));
+ }
+
+ if (mr->mtt.order >= 0 && mr->mtt.page_shift == 0) {
+ /* fast register MR in free state */
+ mpt_entry->flags |= cpu_to_be32(MLX4_MPT_FLAG_FREE);
+ mpt_entry->pd_flags |= cpu_to_be32(MLX4_MPT_PD_FLAG_FAST_REG |
+ MLX4_MPT_PD_FLAG_RAE);
+ mpt_entry->mtt_sz = cpu_to_be32(1 << mr->mtt.order);
+ } else {
+ mpt_entry->flags |= cpu_to_be32(MLX4_MPT_FLAG_SW_OWNS);
+ }
+
+ err = mlx4_SW2HW_MPT(dev, mailbox,
+ key_to_hw_index(mr->key) & (dev->caps.num_mpts - 1));
+ if (err) {
+ mlx4_warn(dev, "SW2HW_MPT failed (%d)\n", err);
+ goto err_cmd;
+ }
+ mr->enabled = MLX4_MPT_EN_HW;
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+
+ return 0;
+
+err_cmd:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+
+err_table:
+ mlx4_mpt_free_icm(dev, key_to_hw_index(mr->key));
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_mr_enable);
+
+static int mlx4_write_mtt_chunk(struct mlx4_dev *dev, struct mlx4_mtt *mtt,
+ int start_index, int npages, u64 *page_list)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ __be64 *mtts;
+ dma_addr_t dma_handle;
+ int i;
+
+ mtts = mlx4_table_find(&priv->mr_table.mtt_table, mtt->offset +
+ start_index, &dma_handle);
+
+ if (!mtts)
+ return -ENOMEM;
+#ifdef KMOD_MODIFIED
+ rte_mb(); /* full barrier; assumes a cache-coherent architecture */
+#else
+ dma_sync_single_for_cpu(&dev->persist->pdev->dev, dma_handle,
+ npages * sizeof (u64), DMA_TO_DEVICE);
+#endif
+
+ for (i = 0; i < npages; ++i)
+ mtts[i] = cpu_to_be64(page_list[i] | MLX4_MTT_FLAG_PRESENT);
+
+#ifdef KMOD_MODIFIED
+ rte_mb(); /* coherent architecture: a full barrier stands in for the DMA sync */
+#else
+ dma_sync_single_for_device(&dev->persist->pdev->dev, dma_handle,
+ npages * sizeof (u64), DMA_TO_DEVICE);
+#endif
+
+ return 0;
+}
+
+int __mlx4_write_mtt(struct mlx4_dev *dev, struct mlx4_mtt *mtt,
+ int start_index, int npages, u64 *page_list)
+{
+ int err = 0;
+ int chunk;
+ int mtts_per_page;
+ int max_mtts_first_page;
+
+ /* compute how many MTTs fit in the first page */
+ mtts_per_page = PAGE_SIZE / sizeof(u64);
+ max_mtts_first_page = mtts_per_page - (mtt->offset + start_index)
+ % mtts_per_page;
+
+ chunk = min_t(int, max_mtts_first_page, npages);
+
+ while (npages > 0) {
+ err = mlx4_write_mtt_chunk(dev, mtt, start_index, chunk, page_list);
+ if (err)
+ return err;
+ npages -= chunk;
+ start_index += chunk;
+ page_list += chunk;
+
+ chunk = min_t(int, mtts_per_page, npages);
+ }
+ return err;
+}
+
+int mlx4_write_mtt(struct mlx4_dev *dev, struct mlx4_mtt *mtt,
+ int start_index, int npages, u64 *page_list)
+{
+ struct mlx4_cmd_mailbox *mailbox = NULL;
+ __be64 *inbox = NULL;
+ int chunk;
+ int err = 0;
+ int i;
+
+ if (mtt->order < 0)
+ return -EINVAL;
+
+ if (mlx4_is_mfunc(dev)) {
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ inbox = mailbox->buf;
+
+ while (npages > 0) {
+ chunk = min_t(int, MLX4_MAILBOX_SIZE / sizeof(u64) - 2,
+ npages);
+ inbox[0] = cpu_to_be64(mtt->offset + start_index);
+ inbox[1] = 0;
+ for (i = 0; i < chunk; ++i)
+ inbox[i + 2] = cpu_to_be64(page_list[i] |
+ MLX4_MTT_FLAG_PRESENT);
+ err = mlx4_WRITE_MTT(dev, mailbox, chunk);
+ if (err) {
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+ }
+
+ npages -= chunk;
+ start_index += chunk;
+ page_list += chunk;
+ }
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+ }
+
+ return __mlx4_write_mtt(dev, mtt, start_index, npages, page_list);
+}
+EXPORT_SYMBOL_GPL(mlx4_write_mtt);
+
+int mlx4_buf_write_mtt(struct mlx4_dev *dev, struct mlx4_mtt *mtt,
+ struct mlx4_buf *buf, gfp_t gfp)
+{
+ u64 *page_list;
+ int err;
+ int i;
+
+ page_list = kmalloc(buf->npages * sizeof *page_list,
+ gfp);
+ if (!page_list)
+ return -ENOMEM;
+
+ for (i = 0; i < buf->npages; ++i)
+ if (buf->nbufs == 1)
+ page_list[i] = buf->direct.map + (i << buf->page_shift);
+ else
+ page_list[i] = buf->page_list[i].map;
+
+ err = mlx4_write_mtt(dev, mtt, 0, buf->npages, page_list);
+
+ kfree(page_list);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_buf_write_mtt);
+
+int mlx4_mw_alloc(struct mlx4_dev *dev, u32 pd, enum mlx4_mw_type type,
+ struct mlx4_mw *mw)
+{
+ u32 index;
+
+ if ((type == MLX4_MW_TYPE_1 &&
+ !(dev->caps.flags & MLX4_DEV_CAP_FLAG_MEM_WINDOW)) ||
+ (type == MLX4_MW_TYPE_2 &&
+ !(dev->caps.bmme_flags & MLX4_BMME_FLAG_TYPE_2_WIN)))
+ return -ENOTSUPP;
+
+ index = mlx4_mpt_reserve(dev);
+ if (index == -1)
+ return -ENOMEM;
+
+ mw->key = hw_index_to_key(index);
+ mw->pd = pd;
+ mw->type = type;
+ mw->enabled = MLX4_MPT_DISABLED;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_mw_alloc);
+
+int mlx4_mw_enable(struct mlx4_dev *dev, struct mlx4_mw *mw)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_mpt_entry *mpt_entry;
+ int err;
+
+ err = mlx4_mpt_alloc_icm(dev, key_to_hw_index(mw->key), GFP_KERNEL);
+ if (err)
+ return err;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox)) {
+ err = PTR_ERR(mailbox);
+ goto err_table;
+ }
+ mpt_entry = mailbox->buf;
+
+ /* Note that the MLX4_MPT_FLAG_REGION bit in mpt_entry->flags is turned
+ * off, thus creating a memory window and not a memory region.
+ */
+ mpt_entry->key = cpu_to_be32(key_to_hw_index(mw->key));
+ mpt_entry->pd_flags = cpu_to_be32(mw->pd);
+ if (mw->type == MLX4_MW_TYPE_2) {
+ mpt_entry->flags |= cpu_to_be32(MLX4_MPT_FLAG_FREE);
+ mpt_entry->qpn = cpu_to_be32(MLX4_MPT_QP_FLAG_BOUND_QP);
+ mpt_entry->pd_flags |= cpu_to_be32(MLX4_MPT_PD_FLAG_EN_INV);
+ }
+
+ err = mlx4_SW2HW_MPT(dev, mailbox,
+ key_to_hw_index(mw->key) &
+ (dev->caps.num_mpts - 1));
+ if (err) {
+ mlx4_warn(dev, "SW2HW_MPT failed (%d)\n", err);
+ goto err_cmd;
+ }
+ mw->enabled = MLX4_MPT_EN_HW;
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+
+ return 0;
+
+err_cmd:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+
+err_table:
+ mlx4_mpt_free_icm(dev, key_to_hw_index(mw->key));
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_mw_enable);
+
+void mlx4_mw_free(struct mlx4_dev *dev, struct mlx4_mw *mw)
+{
+ int err;
+
+ if (mw->enabled == MLX4_MPT_EN_HW) {
+ err = mlx4_HW2SW_MPT(dev, NULL,
+ key_to_hw_index(mw->key) &
+ (dev->caps.num_mpts - 1));
+ if (err)
+ mlx4_warn(dev, "xxx HW2SW_MPT failed (%d)\n", err);
+
+ mw->enabled = MLX4_MPT_EN_SW;
+ }
+ if (mw->enabled)
+ mlx4_mpt_free_icm(dev, key_to_hw_index(mw->key));
+ mlx4_mpt_release(dev, key_to_hw_index(mw->key));
+}
+EXPORT_SYMBOL_GPL(mlx4_mw_free);
+
+int mlx4_init_mr_table(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_mr_table *mr_table = &priv->mr_table;
+ int err;
+
+ /* Nothing to do for slaves - all MR handling is forwarded
+ * to the master */
+ if (mlx4_is_slave(dev))
+ return 0;
+
+ if (!is_power_of_2(dev->caps.num_mpts))
+ return -EINVAL;
+
+ err = mlx4_bitmap_init(&mr_table->mpt_bitmap, dev->caps.num_mpts,
+ ~0, dev->caps.reserved_mrws, 0);
+ if (err)
+ return err;
+
+ err = mlx4_buddy_init(&mr_table->mtt_buddy,
+ ilog2((u32)dev->caps.num_mtts /
+ (1 << log_mtts_per_seg)));
+ if (err)
+ goto err_buddy;
+
+ if (dev->caps.reserved_mtts) {
+ priv->reserved_mtts =
+ mlx4_alloc_mtt_range(dev,
+ fls(dev->caps.reserved_mtts - 1));
+ if (priv->reserved_mtts < 0) {
+ mlx4_warn(dev, "MTT table of order %u is too small\n",
+ mr_table->mtt_buddy.max_order);
+ err = -ENOMEM;
+ goto err_reserve_mtts;
+ }
+ }
+
+ return 0;
+
+err_reserve_mtts:
+ mlx4_buddy_cleanup(&mr_table->mtt_buddy);
+
+err_buddy:
+ mlx4_bitmap_cleanup(&mr_table->mpt_bitmap);
+
+ return err;
+}
+
+void mlx4_cleanup_mr_table(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_mr_table *mr_table = &priv->mr_table;
+
+ if (mlx4_is_slave(dev))
+ return;
+ if (priv->reserved_mtts >= 0)
+ mlx4_free_mtt_range(dev, priv->reserved_mtts,
+ fls(dev->caps.reserved_mtts - 1));
+ mlx4_buddy_cleanup(&mr_table->mtt_buddy);
+ mlx4_bitmap_cleanup(&mr_table->mpt_bitmap);
+}
+
+static inline int mlx4_check_fmr(struct mlx4_fmr *fmr, u64 *page_list,
+ int npages, u64 iova)
+{
+ int i, page_mask;
+
+ if (npages > fmr->max_pages)
+ return -EINVAL;
+
+ page_mask = (1 << fmr->page_shift) - 1;
+
+ /* We are getting page lists, so va must be page aligned. */
+ if (iova & page_mask)
+ return -EINVAL;
+
+ /* Trust the user not to pass misaligned data in page_list */
+ if (0)
+ for (i = 0; i < npages; ++i) {
+ if (page_list[i] & ~page_mask)
+ return -EINVAL;
+ }
+
+ if (fmr->maps >= fmr->max_maps)
+ return -EINVAL;
+
+ return 0;
+}
+
+int mlx4_map_phys_fmr(struct mlx4_dev *dev, struct mlx4_fmr *fmr, u64 *page_list,
+ int npages, u64 iova, u32 *lkey, u32 *rkey)
+{
+ u32 key;
+ int i, err;
+
+ err = mlx4_check_fmr(fmr, page_list, npages, iova);
+ if (err)
+ return err;
+
+ ++fmr->maps;
+
+ key = key_to_hw_index(fmr->mr.key);
+ key += dev->caps.num_mpts;
+ *lkey = *rkey = fmr->mr.key = hw_index_to_key(key);
+
+ *(u8 *) fmr->mpt = MLX4_MPT_STATUS_SW;
+
+ /* Make sure MPT status is visible before writing MTT entries */
+ wmb();
+
+#ifdef KMOD_MODIFIED
+ rte_mb();
+#else
+ dma_sync_single_for_cpu(&dev->persist->pdev->dev, fmr->dma_handle,
+ npages * sizeof(u64), DMA_TO_DEVICE);
+#endif
+
+ for (i = 0; i < npages; ++i)
+ fmr->mtts[i] = cpu_to_be64(page_list[i] | MLX4_MTT_FLAG_PRESENT);
+#ifdef KMOD_MODIFIED
+ rte_mb();
+#else
+ dma_sync_single_for_device(&dev->persist->pdev->dev, fmr->dma_handle,
+ npages * sizeof(u64), DMA_TO_DEVICE);
+#endif
+
+ fmr->mpt->key = cpu_to_be32(key);
+ fmr->mpt->lkey = cpu_to_be32(key);
+ fmr->mpt->length = cpu_to_be64(npages * (1ull << fmr->page_shift));
+ fmr->mpt->start = cpu_to_be64(iova);
+
+ /* Make sure MTT entries are visible before setting MPT status */
+ wmb();
+
+ *(u8 *) fmr->mpt = MLX4_MPT_STATUS_HW;
+
+ /* Make sure MPT status is visible before consumer can use FMR */
+ wmb();
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_map_phys_fmr);
+
+int mlx4_fmr_alloc(struct mlx4_dev *dev, u32 pd, u32 access, int max_pages,
+ int max_maps, u8 page_shift, struct mlx4_fmr *fmr)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int err = -ENOMEM;
+
+ if (max_maps > dev->caps.max_fmr_maps)
+ return -EINVAL;
+
+ if (page_shift < (ffs(dev->caps.page_size_cap) - 1) || page_shift >= 32)
+ return -EINVAL;
+
+ /* All MTTs must fit in the same page */
+ if (max_pages * sizeof *fmr->mtts > PAGE_SIZE)
+ return -EINVAL;
+
+ fmr->page_shift = page_shift;
+ fmr->max_pages = max_pages;
+ fmr->max_maps = max_maps;
+ fmr->maps = 0;
+
+ err = mlx4_mr_alloc(dev, pd, 0, 0, access, max_pages,
+ page_shift, &fmr->mr);
+ if (err)
+ return err;
+
+ fmr->mtts = mlx4_table_find(&priv->mr_table.mtt_table,
+ fmr->mr.mtt.offset,
+ &fmr->dma_handle);
+
+ if (!fmr->mtts) {
+ err = -ENOMEM;
+ goto err_free;
+ }
+
+ return 0;
+
+err_free:
+ (void) mlx4_mr_free(dev, &fmr->mr);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_fmr_alloc);
+
+int mlx4_fmr_enable(struct mlx4_dev *dev, struct mlx4_fmr *fmr)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int err;
+
+ err = mlx4_mr_enable(dev, &fmr->mr);
+ if (err)
+ return err;
+
+ fmr->mpt = mlx4_table_find(&priv->mr_table.dmpt_table,
+ key_to_hw_index(fmr->mr.key), NULL);
+ if (!fmr->mpt)
+ return -ENOMEM;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_fmr_enable);
+
+void mlx4_fmr_unmap(struct mlx4_dev *dev, struct mlx4_fmr *fmr,
+ u32 *lkey, u32 *rkey)
+{
+ if (!fmr->maps)
+ return;
+
+ /* To unmap: it is sufficient to take back ownership from HW */
+ *(u8 *)fmr->mpt = MLX4_MPT_STATUS_SW;
+
+ /* Make sure MPT status is visible */
+ wmb();
+
+ fmr->maps = 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_fmr_unmap);
+
+int mlx4_fmr_free(struct mlx4_dev *dev, struct mlx4_fmr *fmr)
+{
+ int ret;
+
+ if (fmr->maps)
+ return -EBUSY;
+ if (fmr->mr.enabled == MLX4_MPT_EN_HW) {
+ /* If the FMR was enabled and then unmapped,
+ * make sure to give ownership of the MPT back to HW
+ * so that the HW2SW_MPT command will succeed.
+ */
+ *(u8 *)fmr->mpt = MLX4_MPT_STATUS_SW;
+ /* Make sure MPT status is visible before changing MPT fields */
+ wmb();
+ fmr->mpt->length = 0;
+ fmr->mpt->start = 0;
+ /* Make sure MPT data is visible before changing MPT status */
+ wmb();
+ *(u8 *)fmr->mpt = MLX4_MPT_STATUS_HW;
+ /* make sure MPT status is visible */
+ wmb();
+ }
+
+ ret = mlx4_mr_free(dev, &fmr->mr);
+ if (ret)
+ return ret;
+ fmr->mr.enabled = MLX4_MPT_DISABLED;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_fmr_free);
+
+int mlx4_SYNC_TPT(struct mlx4_dev *dev)
+{
+ return mlx4_cmd(dev, 0, 0, 0, MLX4_CMD_SYNC_TPT,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
+}
+EXPORT_SYMBOL_GPL(mlx4_SYNC_TPT);
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/pd.c b/drivers/net/mlnx_uio/mlnx/mlx4/pd.c
new file mode 100644
index 0000000..55c3ce5
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/pd.c
@@ -0,0 +1,310 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved.
+ * Copyright (c) 2005 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+
+
+#include "mlx4.h"
+#include "icm.h"
+
+enum {
+ MLX4_NUM_RESERVED_UARS = 8
+};
+
+int mlx4_pd_alloc(struct mlx4_dev *dev, u32 *pdn)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ *pdn = mlx4_bitmap_alloc(&priv->pd_bitmap);
+ if (*pdn == -1)
+ return -ENOMEM;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_pd_alloc);
+
+void mlx4_pd_free(struct mlx4_dev *dev, u32 pdn)
+{
+ mlx4_bitmap_free(&mlx4_priv(dev)->pd_bitmap, pdn, MLX4_USE_RR);
+}
+EXPORT_SYMBOL_GPL(mlx4_pd_free);
+
+int __mlx4_xrcd_alloc(struct mlx4_dev *dev, u32 *xrcdn)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ *xrcdn = mlx4_bitmap_alloc(&priv->xrcd_bitmap);
+ if (*xrcdn == -1)
+ return -ENOMEM;
+
+ return 0;
+}
+
+int mlx4_xrcd_alloc(struct mlx4_dev *dev, u32 *xrcdn)
+{
+ u64 out_param;
+ int err;
+
+ if (mlx4_is_mfunc(dev)) {
+ err = mlx4_cmd_imm(dev, 0, &out_param,
+ RES_XRCD, RES_OP_RESERVE,
+ MLX4_CMD_ALLOC_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+ if (err)
+ return err;
+
+ *xrcdn = get_param_l(&out_param);
+ return 0;
+ }
+ return __mlx4_xrcd_alloc(dev, xrcdn);
+}
+EXPORT_SYMBOL_GPL(mlx4_xrcd_alloc);
+
+void __mlx4_xrcd_free(struct mlx4_dev *dev, u32 xrcdn)
+{
+ mlx4_bitmap_free(&mlx4_priv(dev)->xrcd_bitmap, xrcdn, MLX4_USE_RR);
+}
+
+void mlx4_xrcd_free(struct mlx4_dev *dev, u32 xrcdn)
+{
+ u64 in_param = 0;
+ int err;
+
+ if (mlx4_is_mfunc(dev)) {
+ set_param_l(&in_param, xrcdn);
+ err = mlx4_cmd(dev, in_param, RES_XRCD,
+ RES_OP_RESERVE, MLX4_CMD_FREE_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+ if (err)
+ mlx4_warn(dev, "Failed to release xrcdn %d\n", xrcdn);
+ } else
+ __mlx4_xrcd_free(dev, xrcdn);
+}
+EXPORT_SYMBOL_GPL(mlx4_xrcd_free);
+
+int mlx4_init_pd_table(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ return mlx4_bitmap_init(&priv->pd_bitmap, dev->caps.num_pds,
+ (1 << NOT_MASKED_PD_BITS) - 1,
+ dev->caps.reserved_pds, 0);
+}
+
+void mlx4_cleanup_pd_table(struct mlx4_dev *dev)
+{
+ mlx4_bitmap_cleanup(&mlx4_priv(dev)->pd_bitmap);
+}
+
+int mlx4_init_xrcd_table(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ return mlx4_bitmap_init(&priv->xrcd_bitmap, (1 << 16),
+ (1 << 16) - 1, dev->caps.reserved_xrcds + 1, 0);
+}
+
+void mlx4_cleanup_xrcd_table(struct mlx4_dev *dev)
+{
+ mlx4_bitmap_cleanup(&mlx4_priv(dev)->xrcd_bitmap);
+}
+
+int mlx4_uar_alloc(struct mlx4_dev *dev, struct mlx4_uar *uar)
+{
+ int offset;
+
+ uar->index = mlx4_bitmap_alloc(&mlx4_priv(dev)->uar_table.bitmap);
+ if (uar->index == -1)
+ return -ENOMEM;
+
+ if (mlx4_is_slave(dev))
+ {
+#ifdef KMOD_MODIFIED
+ int temp = dev->persist->rte_pdev->mem_resource[2].len;
+ offset = uar->index % ( temp /
+ dev->caps.uar_page_size);
+#else
+ offset = uar->index % ((int)pci_resource_len(dev->persist->pdev,
+ 2) /
+ dev->caps.uar_page_size);
+#endif
+ }
+ else
+ offset = uar->index;
+#ifdef KMOD_MODIFIED
+ uar->pfn_addr = RTE_PTR_ADD(dev->persist->rte_pdev->mem_resource[2].addr,
+ offset << PAGE_SHIFT);
+#else
+ uar->pfn = (pci_resource_start(dev->persist->pdev, 2) >> PAGE_SHIFT)
+ + offset;
+#endif
+ uar->map = NULL;
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_uar_alloc);
+
+void mlx4_uar_free(struct mlx4_dev *dev, struct mlx4_uar *uar)
+{
+ mlx4_bitmap_free(&mlx4_priv(dev)->uar_table.bitmap, uar->index, MLX4_USE_RR);
+}
+EXPORT_SYMBOL_GPL(mlx4_uar_free);
+
+int mlx4_bf_alloc(struct mlx4_dev *dev, struct mlx4_bf *bf, int node)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_uar *uar;
+ int err = 0;
+ int idx;
+#ifdef KMOD_MODIFIED
+ if (!priv->bf_mapping_addr)
+ return -ENOMEM;
+#endif
+
+ mutex_lock(&priv->bf_mutex);
+ if (!list_empty(&priv->bf_list))
+ uar = list_entry(priv->bf_list.next, struct mlx4_uar, bf_list);
+ else {
+ if (mlx4_bitmap_avail(&priv->uar_table.bitmap) < MLX4_NUM_RESERVED_UARS) {
+ err = -ENOMEM;
+ goto out;
+ }
+ uar = kmalloc_node(sizeof(*uar), GFP_KERNEL, node);
+ if (!uar) {
+ uar = kmalloc(sizeof(*uar), GFP_KERNEL);
+ if (!uar) {
+ err = -ENOMEM;
+ goto out;
+ }
+ }
+ err = mlx4_uar_alloc(dev, uar);
+ if (err)
+ goto free_kmalloc;
+#ifdef KMOD_MODIFIED
+ uar->map = uar->pfn_addr;
+#else
+ uar->map = ioremap(uar->pfn << PAGE_SHIFT, PAGE_SIZE);
+#endif
+ if (!uar->map) {
+ err = -ENOMEM;
+ goto free_uar;
+ }
+#ifdef KMOD_MODIFIED
+ uar->bf_map = RTE_PTR_ADD(priv->bf_mapping_addr, uar->index << PAGE_SHIFT);
+#else
+ uar->bf_map = io_mapping_map_wc(priv->bf_mapping, uar->index << PAGE_SHIFT);
+#endif
+ if (!uar->bf_map) {
+ err = -ENOMEM;
+ goto unamp_uar;
+ }
+ uar->free_bf_bmap = 0;
+ list_add(&uar->bf_list, &priv->bf_list);
+ }
+
+ idx = ffz(uar->free_bf_bmap);
+ uar->free_bf_bmap |= 1 << idx;
+ bf->uar = uar;
+ bf->offset = 0;
+ bf->buf_size = dev->caps.bf_reg_size / 2;
+ bf->reg = uar->bf_map + idx * dev->caps.bf_reg_size;
+ if (uar->free_bf_bmap == (1 << dev->caps.bf_regs_per_page) - 1)
+ list_del_init(&uar->bf_list);
+
+ goto out;
+
+unamp_uar:
+ bf->uar = NULL;
+#ifdef KMOD_REMOVED
+ iounmap(uar->map);
+#endif
+
+free_uar:
+ mlx4_uar_free(dev, uar);
+
+free_kmalloc:
+ kfree(uar);
+
+out:
+ mutex_unlock(&priv->bf_mutex);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_bf_alloc);
+
+void mlx4_bf_free(struct mlx4_dev *dev, struct mlx4_bf *bf)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int idx;
+
+ if (!bf->uar || !bf->uar->bf_map)
+ return;
+
+ mutex_lock(&priv->bf_mutex);
+ idx = (bf->reg - bf->uar->bf_map) / dev->caps.bf_reg_size;
+ bf->uar->free_bf_bmap &= ~(1 << idx);
+ if (!bf->uar->free_bf_bmap) {
+ if (!list_empty(&bf->uar->bf_list))
+ list_del(&bf->uar->bf_list);
+#ifdef KMOD_REMOVED
+ io_mapping_unmap(bf->uar->bf_map);
+ iounmap(bf->uar->map);
+#endif
+ mlx4_uar_free(dev, bf->uar);
+ kfree(bf->uar);
+ } else if (list_empty(&bf->uar->bf_list))
+ list_add(&bf->uar->bf_list, &priv->bf_list);
+
+ mutex_unlock(&priv->bf_mutex);
+}
+EXPORT_SYMBOL_GPL(mlx4_bf_free);
+
+int mlx4_init_uar_table(struct mlx4_dev *dev)
+{
+ if (dev->caps.num_uars <= 128) {
+ mlx4_err(dev, "Only %d UAR pages (need more than 128)\n",
+ dev->caps.num_uars);
+ mlx4_err(dev, "Increase firmware log2_uar_bar_megabytes?\n");
+ return -ENODEV;
+ }
+
+ return mlx4_bitmap_init(&mlx4_priv(dev)->uar_table.bitmap,
+ dev->caps.num_uars, dev->caps.num_uars - 1,
+ dev->caps.reserved_uars, 0);
+}
+
+void mlx4_cleanup_uar_table(struct mlx4_dev *dev)
+{
+ mlx4_bitmap_cleanup(&mlx4_priv(dev)->uar_table.bitmap);
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/port.c b/drivers/net/mlnx_uio/mlnx/mlx4/port.c
new file mode 100644
index 0000000..09f6b83
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/port.c
@@ -0,0 +1,1636 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2007 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+
+
+#include "mlx4.h"
+#include "mlx4_stats.h"
+
+#define MLX4_MAC_VALID (1ull << 63)
+
+#define MLX4_VLAN_VALID (1u << 31)
+#define MLX4_VLAN_MASK 0xfff
+
+#define MLX4_FLAG_V_IGNORE_FCS_MASK 0x2
+#define MLX4_IGNORE_FCS_MASK 0x1
+
+void mlx4_init_mac_table(struct mlx4_dev *dev, struct mlx4_mac_table *table)
+{
+ int i;
+
+ mutex_init(&table->mutex);
+ for (i = 0; i < MLX4_MAX_MAC_NUM; i++) {
+ table->entries[i] = 0;
+ table->refs[i] = 0;
+ }
+ table->max = 1 << dev->caps.log_num_macs;
+ table->total = 0;
+}
+
+void mlx4_init_vlan_table(struct mlx4_dev *dev, struct mlx4_vlan_table *table)
+{
+ int i;
+
+ mutex_init(&table->mutex);
+ for (i = 0; i < MLX4_MAX_VLAN_NUM; i++) {
+ table->entries[i] = 0;
+ table->refs[i] = 0;
+ }
+ table->max = (1 << dev->caps.log_num_vlans) - MLX4_VLAN_REGULAR;
+ table->total = 0;
+}
+
+void mlx4_init_roce_gid_table(struct mlx4_dev *dev,
+ struct mlx4_roce_info *roce)
+{
+ struct mlx4_roce_addr_table *addr_table = &roce->addr_table;
+
+ mutex_init(&roce->mutex);
+ memset(addr_table, 0, sizeof(*addr_table));
+}
+
+static int validate_index(struct mlx4_dev *dev,
+ struct mlx4_mac_table *table, int index)
+{
+ int err = 0;
+
+ if (index < 0 || index >= table->max || !table->entries[index]) {
+ mlx4_warn(dev, "No valid Mac entry for the given index\n");
+ err = -EINVAL;
+ }
+ return err;
+}
+
+static int find_index(struct mlx4_dev *dev,
+ struct mlx4_mac_table *table, u64 mac)
+{
+ int i;
+
+ for (i = 0; i < MLX4_MAX_MAC_NUM; i++) {
+ if (table->refs[i] &&
+ (MLX4_MAC_MASK & mac) ==
+ (MLX4_MAC_MASK & be64_to_cpu(table->entries[i])))
+ return i;
+ }
+ /* MAC not found */
+ return -EINVAL;
+}
+
+static int mlx4_set_port_mac_table(struct mlx4_dev *dev, u8 port,
+ __be64 *entries)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ u32 in_mod;
+ int err;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ memcpy(mailbox->buf, entries, MLX4_MAC_TABLE_SIZE);
+
+ in_mod = MLX4_SET_PORT_MAC_TABLE << 8 | port;
+
+ err = mlx4_cmd(dev, mailbox->dma, in_mod, MLX4_SET_PORT_ETH_OPCODE,
+ MLX4_CMD_SET_PORT, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_NATIVE);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+
+int mlx4_find_cached_mac(struct mlx4_dev *dev, u8 port, u64 mac, int *idx)
+{
+ struct mlx4_port_info *info = &mlx4_priv(dev)->port[port];
+ struct mlx4_mac_table *table = &info->mac_table;
+ int i;
+
+ for (i = 0; i < MLX4_MAX_MAC_NUM; i++) {
+ if (!table->refs[i])
+ continue;
+
+ if (mac == (MLX4_MAC_MASK & be64_to_cpu(table->entries[i]))) {
+ *idx = i;
+ return 0;
+ }
+ }
+
+ return -ENOENT;
+}
+EXPORT_SYMBOL_GPL(mlx4_find_cached_mac);
+
+int __mlx4_register_mac(struct mlx4_dev *dev, u8 port, u64 mac)
+{
+ struct mlx4_port_info *info = &mlx4_priv(dev)->port[port];
+ struct mlx4_mac_table *table = &info->mac_table;
+ int i, err = 0;
+ int free = -1;
+
+ mlx4_dbg(dev, "Registering MAC: 0x%llx for port %d\n",
+ (unsigned long long) mac, port);
+
+ mutex_lock(&table->mutex);
+ for (i = 0; i < MLX4_MAX_MAC_NUM; i++) {
+ if (!table->refs[i]) {
+ if (free < 0)
+ free = i;
+ continue;
+ }
+
+ if ((MLX4_MAC_MASK & mac) ==
+ (MLX4_MAC_MASK & be64_to_cpu(table->entries[i]))) {
+ /* MAC already registered, increment ref count */
+ err = i;
+ ++table->refs[i];
+ goto out;
+ }
+ }
+
+ mlx4_dbg(dev, "Free MAC index is %d\n", free);
+
+ if (table->total == table->max) {
+ /* No free MAC entries */
+ err = -ENOSPC;
+ goto out;
+ }
+
+ /* Register new MAC */
+ table->entries[free] = cpu_to_be64(mac | MLX4_MAC_VALID);
+
+ err = mlx4_set_port_mac_table(dev, port, table->entries);
+ if (unlikely(err)) {
+ mlx4_err(dev, "Failed adding MAC: 0x%llx\n",
+ (unsigned long long) mac);
+ table->entries[free] = 0;
+ goto out;
+ }
+ table->refs[free] = 1;
+ err = free;
+ ++table->total;
+out:
+ mutex_unlock(&table->mutex);
+ return err;
+}
+EXPORT_SYMBOL_GPL(__mlx4_register_mac);
+
+int mlx4_register_mac(struct mlx4_dev *dev, u8 port, u64 mac)
+{
+ u64 out_param = 0;
+ int err = -EINVAL;
+
+ if (mlx4_is_mfunc(dev)) {
+ if (!(dev->flags & MLX4_FLAG_OLD_REG_MAC)) {
+ err = mlx4_cmd_imm(dev, mac, &out_param,
+ ((u32) port) << 8 | (u32) RES_MAC,
+ RES_OP_RESERVE_AND_MAP, MLX4_CMD_ALLOC_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+ }
+ if (err && err == -EINVAL && mlx4_is_slave(dev)) {
+ /* retry using old REG_MAC format */
+ set_param_l(&out_param, port);
+ err = mlx4_cmd_imm(dev, mac, &out_param, RES_MAC,
+ RES_OP_RESERVE_AND_MAP, MLX4_CMD_ALLOC_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+ if (!err)
+ dev->flags |= MLX4_FLAG_OLD_REG_MAC;
+ }
+ if (err)
+ return err;
+
+ return get_param_l(&out_param);
+ }
+ return __mlx4_register_mac(dev, port, mac);
+}
+EXPORT_SYMBOL_GPL(mlx4_register_mac);
+
+int mlx4_get_base_qpn(struct mlx4_dev *dev, u8 port)
+{
+ return dev->caps.reserved_qps_base[MLX4_QP_REGION_ETH_ADDR] +
+ (port - 1) * (1 << dev->caps.log_num_macs);
+}
+EXPORT_SYMBOL_GPL(mlx4_get_base_qpn);
+
+void __mlx4_unregister_mac(struct mlx4_dev *dev, u8 port, u64 mac)
+{
+ struct mlx4_port_info *info;
+ struct mlx4_mac_table *table;
+ int index;
+
+ if (port < 1 || port > dev->caps.num_ports) {
+ mlx4_warn(dev, "invalid port number (%d), aborting...\n", port);
+ return;
+ }
+ info = &mlx4_priv(dev)->port[port];
+ table = &info->mac_table;
+ mutex_lock(&table->mutex);
+ index = find_index(dev, table, mac);
+
+ if (validate_index(dev, table, index))
+ goto out;
+ if (--table->refs[index]) {
+ mlx4_dbg(dev, "Have more references for index %d, no need to modify mac table\n",
+ index);
+ goto out;
+ }
+
+ table->entries[index] = 0;
+ mlx4_set_port_mac_table(dev, port, table->entries);
+ --table->total;
+out:
+ mutex_unlock(&table->mutex);
+}
+EXPORT_SYMBOL_GPL(__mlx4_unregister_mac);
+
+void mlx4_unregister_mac(struct mlx4_dev *dev, u8 port, u64 mac)
+{
+ u64 out_param = 0;
+
+ if (mlx4_is_mfunc(dev)) {
+ if (!(dev->flags & MLX4_FLAG_OLD_REG_MAC)) {
+ (void) mlx4_cmd_imm(dev, mac, &out_param,
+ ((u32) port) << 8 | (u32) RES_MAC,
+ RES_OP_RESERVE_AND_MAP, MLX4_CMD_FREE_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+ } else {
+ /* use old unregister mac format */
+ set_param_l(&out_param, port);
+ (void) mlx4_cmd_imm(dev, mac, &out_param, RES_MAC,
+ RES_OP_RESERVE_AND_MAP, MLX4_CMD_FREE_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+ }
+ return;
+ }
+ __mlx4_unregister_mac(dev, port, mac);
+ return;
+}
+EXPORT_SYMBOL_GPL(mlx4_unregister_mac);
+
+int __mlx4_replace_mac(struct mlx4_dev *dev, u8 port, int qpn, u64 new_mac)
+{
+ struct mlx4_port_info *info = &mlx4_priv(dev)->port[port];
+ struct mlx4_mac_table *table = &info->mac_table;
+ int index = qpn - info->base_qpn;
+ int err = 0;
+
+ /* CX1 doesn't support multi-functions */
+ mutex_lock(&table->mutex);
+
+ err = validate_index(dev, table, index);
+ if (err)
+ goto out;
+
+ table->entries[index] = cpu_to_be64(new_mac | MLX4_MAC_VALID);
+
+ err = mlx4_set_port_mac_table(dev, port, table->entries);
+ if (unlikely(err)) {
+ mlx4_err(dev, "Failed adding MAC: 0x%llx\n",
+ (unsigned long long) new_mac);
+ table->entries[index] = 0;
+ }
+out:
+ mutex_unlock(&table->mutex);
+ return err;
+}
+EXPORT_SYMBOL_GPL(__mlx4_replace_mac);
+
+static int mlx4_set_port_vlan_table(struct mlx4_dev *dev, u8 port,
+ __be32 *entries)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ u32 in_mod;
+ int err;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ memcpy(mailbox->buf, entries, MLX4_VLAN_TABLE_SIZE);
+ in_mod = MLX4_SET_PORT_VLAN_TABLE << 8 | port;
+ err = mlx4_cmd(dev, mailbox->dma, in_mod, MLX4_SET_PORT_ETH_OPCODE,
+ MLX4_CMD_SET_PORT, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_NATIVE);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+
+ return err;
+}
+
+int mlx4_find_cached_vlan(struct mlx4_dev *dev, u8 port, u16 vid, int *idx)
+{
+ struct mlx4_vlan_table *table = &mlx4_priv(dev)->port[port].vlan_table;
+ int i;
+
+ for (i = 0; i < MLX4_MAX_VLAN_NUM; ++i) {
+ if (table->refs[i] &&
+ (vid == (MLX4_VLAN_MASK &
+ be32_to_cpu(table->entries[i])))) {
+ /* VLAN found in the cached table; the reference count is not changed */
+ *idx = i;
+ return 0;
+ }
+ }
+
+ return -ENOENT;
+}
+EXPORT_SYMBOL_GPL(mlx4_find_cached_vlan);
+
+int __mlx4_register_vlan(struct mlx4_dev *dev, u8 port, u16 vlan,
+ int *index)
+{
+ struct mlx4_vlan_table *table = &mlx4_priv(dev)->port[port].vlan_table;
+ int i, err = 0;
+ int free = -1;
+
+ mutex_lock(&table->mutex);
+
+ if (table->total == table->max) {
+ /* No free vlan entries */
+ err = -ENOSPC;
+ goto out;
+ }
+
+ for (i = MLX4_VLAN_REGULAR; i < MLX4_MAX_VLAN_NUM; i++) {
+ if (free < 0 && (table->refs[i] == 0)) {
+ free = i;
+ continue;
+ }
+
+ if (table->refs[i] &&
+ (vlan == (MLX4_VLAN_MASK &
+ be32_to_cpu(table->entries[i])))) {
+ /* VLAN already registered, increase reference count */
+ *index = i;
+ ++table->refs[i];
+ goto out;
+ }
+ }
+
+ if (free < 0) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ /* Register new VLAN */
+ table->refs[free] = 1;
+ table->entries[free] = cpu_to_be32(vlan | MLX4_VLAN_VALID);
+
+ err = mlx4_set_port_vlan_table(dev, port, table->entries);
+ if (unlikely(err)) {
+ mlx4_warn(dev, "Failed adding vlan: %u\n", vlan);
+ table->refs[free] = 0;
+ table->entries[free] = 0;
+ goto out;
+ }
+
+ *index = free;
+ ++table->total;
+out:
+ mutex_unlock(&table->mutex);
+ return err;
+}
+
+int mlx4_register_vlan(struct mlx4_dev *dev, u8 port, u16 vlan, int *index)
+{
+ u64 out_param = 0;
+ int err;
+
+ if (vlan > 4095)
+ return -EINVAL;
+
+ if (mlx4_is_mfunc(dev)) {
+ err = mlx4_cmd_imm(dev, vlan, &out_param,
+ ((u32) port) << 8 | (u32) RES_VLAN,
+ RES_OP_RESERVE_AND_MAP, MLX4_CMD_ALLOC_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+ if (!err)
+ *index = get_param_l(&out_param);
+
+ return err;
+ }
+ return __mlx4_register_vlan(dev, port, vlan, index);
+}
+EXPORT_SYMBOL_GPL(mlx4_register_vlan);
+
+void __mlx4_unregister_vlan(struct mlx4_dev *dev, u8 port, u16 vlan)
+{
+ struct mlx4_vlan_table *table = &mlx4_priv(dev)->port[port].vlan_table;
+ int index;
+
+ mutex_lock(&table->mutex);
+ if (mlx4_find_cached_vlan(dev, port, vlan, &index)) {
+ mlx4_warn(dev, "vlan 0x%x is not in the vlan table\n", vlan);
+ goto out;
+ }
+
+ if (index < MLX4_VLAN_REGULAR) {
+ mlx4_warn(dev, "Trying to free special vlan index %d\n", index);
+ goto out;
+ }
+
+ if (--table->refs[index]) {
+ mlx4_dbg(dev, "Have %d more references for index %d, no need to modify vlan table\n",
+ table->refs[index], index);
+ goto out;
+ }
+ table->entries[index] = 0;
+ mlx4_set_port_vlan_table(dev, port, table->entries);
+ --table->total;
+out:
+ mutex_unlock(&table->mutex);
+}
+
+void mlx4_unregister_vlan(struct mlx4_dev *dev, u8 port, u16 vlan)
+{
+ u64 out_param = 0;
+
+ if (mlx4_is_mfunc(dev)) {
+ (void) mlx4_cmd_imm(dev, vlan, &out_param,
+ ((u32) port) << 8 | (u32) RES_VLAN,
+ RES_OP_RESERVE_AND_MAP,
+ MLX4_CMD_FREE_RES, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_WRAPPED);
+ return;
+ }
+ __mlx4_unregister_vlan(dev, port, vlan);
+}
+EXPORT_SYMBOL_GPL(mlx4_unregister_vlan);
+
+int mlx4_get_port_ib_caps(struct mlx4_dev *dev, u8 port, __be32 *caps)
+{
+ struct mlx4_cmd_mailbox *inmailbox, *outmailbox;
+ u8 *inbuf, *outbuf;
+ int err;
+
+ inmailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(inmailbox))
+ return PTR_ERR(inmailbox);
+
+ outmailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(outmailbox)) {
+ mlx4_free_cmd_mailbox(dev, inmailbox);
+ return PTR_ERR(outmailbox);
+ }
+
+ inbuf = inmailbox->buf;
+ outbuf = outmailbox->buf;
+ inbuf[0] = 1;
+ inbuf[1] = 1;
+ inbuf[2] = 1;
+ inbuf[3] = 1;
+ *(__be16 *) (&inbuf[16]) = cpu_to_be16(0x0015);
+ *(__be32 *) (&inbuf[20]) = cpu_to_be32(port);
+
+ err = mlx4_cmd_box(dev, inmailbox->dma, outmailbox->dma, port, 3,
+ MLX4_CMD_MAD_IFC, MLX4_CMD_TIME_CLASS_C,
+ MLX4_CMD_NATIVE);
+ if (!err)
+ *caps = *(__be32 *) (outbuf + 84);
+ mlx4_free_cmd_mailbox(dev, inmailbox);
+ mlx4_free_cmd_mailbox(dev, outmailbox);
+ return err;
+}
+
+static u8 mlx4_zgid[MLX4_GID_LEN];
+
+int mlx4_get_slave_num_gids(struct mlx4_dev *dev, int slave, int port)
+{
+ int vfs;
+ int slave_gid = slave;
+ unsigned i;
+ struct mlx4_slaves_pport slaves_pport;
+ struct mlx4_active_ports actv_ports;
+ unsigned max_port_p_one;
+
+ if (slave == 0)
+ return MLX4_ROCE_PF_GIDS;
+
+ /* Slave is a VF */
+ slaves_pport = mlx4_phys_to_slaves_pport(dev, port);
+ actv_ports = mlx4_get_active_ports(dev, slave);
+ max_port_p_one = find_first_bit(actv_ports.ports, dev->caps.num_ports) +
+ bitmap_weight(actv_ports.ports, dev->caps.num_ports) + 1;
+
+ for (i = 1; i < max_port_p_one; i++) {
+ struct mlx4_active_ports exclusive_ports;
+ struct mlx4_slaves_pport slaves_pport_actv;
+ bitmap_zero(exclusive_ports.ports, dev->caps.num_ports);
+ set_bit(i - 1, exclusive_ports.ports);
+ if (i == port)
+ continue;
+ slaves_pport_actv = mlx4_phys_to_slaves_pport_actv(
+ dev, &exclusive_ports);
+ slave_gid -= bitmap_weight(slaves_pport_actv.slaves,
+ dev->persist->num_vfs + 1);
+ }
+ vfs = bitmap_weight(slaves_pport.slaves, dev->persist->num_vfs + 1) - 1;
+ if (slave_gid <= ((MLX4_ROCE_MAX_GIDS - MLX4_ROCE_PF_GIDS) % vfs))
+ return ((MLX4_ROCE_MAX_GIDS - MLX4_ROCE_PF_GIDS) / vfs) + 1;
+ return (MLX4_ROCE_MAX_GIDS - MLX4_ROCE_PF_GIDS) / vfs;
+}
+
+int mlx4_get_base_gid_ix(struct mlx4_dev *dev, int slave, int port)
+{
+ int gids;
+ unsigned i;
+ int slave_gid = slave;
+ int vfs;
+
+ struct mlx4_slaves_pport slaves_pport;
+ struct mlx4_active_ports actv_ports;
+ unsigned max_port_p_one;
+
+ if (slave == 0)
+ return 0;
+
+ slaves_pport = mlx4_phys_to_slaves_pport(dev, port);
+ actv_ports = mlx4_get_active_ports(dev, slave);
+ max_port_p_one = find_first_bit(actv_ports.ports, dev->caps.num_ports) +
+ bitmap_weight(actv_ports.ports, dev->caps.num_ports) + 1;
+
+ for (i = 1; i < max_port_p_one; i++) {
+ struct mlx4_active_ports exclusive_ports;
+ struct mlx4_slaves_pport slaves_pport_actv;
+ bitmap_zero(exclusive_ports.ports, dev->caps.num_ports);
+ set_bit(i - 1, exclusive_ports.ports);
+ if (i == port)
+ continue;
+ slaves_pport_actv = mlx4_phys_to_slaves_pport_actv(
+ dev, &exclusive_ports);
+ slave_gid -= bitmap_weight(slaves_pport_actv.slaves,
+ dev->persist->num_vfs + 1);
+ }
+ gids = MLX4_ROCE_MAX_GIDS - MLX4_ROCE_PF_GIDS;
+ vfs = bitmap_weight(slaves_pport.slaves, dev->persist->num_vfs + 1) - 1;
+ if (slave_gid <= gids % vfs)
+ return MLX4_ROCE_PF_GIDS + ((gids / vfs) + 1) * (slave_gid - 1);
+
+ return MLX4_ROCE_PF_GIDS + (gids % vfs) +
+ ((gids / vfs) * (slave_gid - 1));
+}
+EXPORT_SYMBOL_GPL(mlx4_get_base_gid_ix);
+
+static int mlx4_reset_roce_port_gids(struct mlx4_dev *dev, int slave,
+ int port, struct mlx4_cmd_mailbox *mailbox)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int num_gids, base, offset;
+ int i, err;
+ struct mlx4_roce_addr_table *t = &priv->port[port].roce.addr_table;
+
+ num_gids = mlx4_get_slave_num_gids(dev, slave, port);
+ base = mlx4_get_base_gid_ix(dev, slave, port);
+
+ memset(mailbox->buf, 0, MLX4_MAILBOX_SIZE);
+
+ mutex_lock(&(priv->port[port].roce.mutex));
+ /* Zero-out gids belonging to that slave in the port GID table */
+ for (i = 0, offset = base; i < num_gids; offset++, i++)
+ memcpy(t->addr[offset].gid, &mlx4_zgid, MLX4_GID_LEN);
+
+ err = mlx4_update_roce_addr_table(dev, port, t, MLX4_CMD_NATIVE);
+ mutex_unlock(&(priv->port[port].roce.mutex));
+
+ return err;
+}
+
+
+void mlx4_reset_roce_gids(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_active_ports actv_ports;
+ struct mlx4_cmd_mailbox *mailbox;
+ int num_eth_ports, err;
+ int i;
+
+ if (slave < 0 || slave > dev->persist->num_vfs)
+ return;
+
+ actv_ports = mlx4_get_active_ports(dev, slave);
+
+ for (i = 0, num_eth_ports = 0; i < dev->caps.num_ports; i++) {
+ if (test_bit(i, actv_ports.ports)) {
+ if (dev->caps.port_type[i + 1] != MLX4_PORT_TYPE_ETH)
+ continue;
+ num_eth_ports++;
+ }
+ }
+
+ if (!num_eth_ports)
+ return;
+
+ /* have ETH ports. Alloc mailbox for SET_PORT command */
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return;
+
+ for (i = 0; i < dev->caps.num_ports; i++) {
+ if (test_bit(i, actv_ports.ports)) {
+ if (dev->caps.port_type[i + 1] != MLX4_PORT_TYPE_ETH)
+ continue;
+ err = mlx4_reset_roce_port_gids(dev, slave, i + 1, mailbox);
+ if (err)
+ mlx4_warn(dev, "Could not reset ETH port GID table for slave %d, port %d (%d)\n",
+ slave, i + 1, err);
+ }
+ }
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return;
+}
+
+struct roce_gid_table_mbox_entry {
+ u8 gid[MLX4_GID_LEN];
+};
+
+struct roce_adr_table_mbox_entry {
+ u8 gid[MLX4_GID_LEN];
+ __be32 rsrvd1[2];
+ __be16 rsrvd2;
+ u8 type;
+ u8 version;
+ __be32 rsrvd3;
+};
+
+static inline bool roce_table_entry_is_empty(int inmod, void *e)
+{
+ struct roce_gid_table_mbox_entry *gid_e = (struct roce_gid_table_mbox_entry *)e;
+ struct roce_adr_table_mbox_entry *adr_e = (struct roce_adr_table_mbox_entry *)e;
+
+ switch (inmod) {
+ case MLX4_SET_PORT_GID_TABLE:
+ return memcmp(gid_e->gid, mlx4_zgid, MLX4_GID_LEN) ? false : true;
+ case MLX4_SET_PORT_ROCE_ADDR:
+ return memcmp(adr_e->gid, mlx4_zgid, MLX4_GID_LEN) ? false : true;
+ default:
+ return false;
+ }
+}
+
+static inline bool roce_table_entry_is_valid(struct mlx4_dev *dev, int inmod, void *e)
+{
+ struct roce_adr_table_mbox_entry *adr_e = (struct roce_adr_table_mbox_entry *)e;
+
+ if (roce_table_entry_is_empty(inmod, e))
+ return true;
+ switch (inmod) {
+ case MLX4_SET_PORT_GID_TABLE:
+ return true;
+ case MLX4_SET_PORT_ROCE_ADDR:
+ return mlx4_verify_supported_gid_type(dev, adr_e->version, NULL) ?
+ false : true;
+ default:
+ return false;
+ }
+}
+
+static inline bool roce_table_entry_has_gid(int inmod, void *e, u8 *gid)
+{
+ struct roce_gid_table_mbox_entry *gid_e = (struct roce_gid_table_mbox_entry *)e;
+ struct roce_adr_table_mbox_entry *adr_e = (struct roce_adr_table_mbox_entry *)e;
+
+ switch (inmod) {
+ case MLX4_SET_PORT_GID_TABLE:
+ return memcmp(gid_e->gid, gid, MLX4_GID_LEN) ? false : true;
+ case MLX4_SET_PORT_ROCE_ADDR:
+ return memcmp(adr_e->gid, gid, MLX4_GID_LEN) ? false : true;
+ default:
+ return false;
+ }
+}
+
+bool roce_table_entry_is_eq(int inmod, void *e1, void *e2)
+{
+ struct roce_gid_table_mbox_entry *gid_e1 = (struct roce_gid_table_mbox_entry *)e1;
+ struct roce_gid_table_mbox_entry *gid_e2 = (struct roce_gid_table_mbox_entry *)e2;
+ struct roce_adr_table_mbox_entry *adr_e1 = (struct roce_adr_table_mbox_entry *)e1;
+ struct roce_adr_table_mbox_entry *adr_e2 = (struct roce_adr_table_mbox_entry *)e2;
+
+ switch (inmod) {
+ case MLX4_SET_PORT_GID_TABLE:
+ return memcmp(gid_e1->gid, gid_e2->gid, MLX4_GID_LEN) ? false : true;
+ case MLX4_SET_PORT_ROCE_ADDR:
+ return (!memcmp(adr_e1->gid, adr_e2->gid, MLX4_GID_LEN) &&
+ (adr_e1->version == adr_e2->version)) ? true : false;
+ default:
+ return false;
+ }
+}
+
+void *roce_table_entry_next(int inmod, void *e)
+{
+ struct roce_gid_table_mbox_entry *gid_e = (struct roce_gid_table_mbox_entry *)e;
+ struct roce_adr_table_mbox_entry *adr_e = (struct roce_adr_table_mbox_entry *)e;
+
+ switch (inmod) {
+ case MLX4_SET_PORT_GID_TABLE:
+ return (void *)(gid_e + 1);
+ case MLX4_SET_PORT_ROCE_ADDR:
+ return (void *)(adr_e + 1);
+ default:
+ return NULL;
+ }
+}
+
+void roce_table_entry_copy(int inmod, void *e, struct mlx4_roce_addr *to)
+{
+ struct roce_gid_table_mbox_entry *gid_e = (struct roce_gid_table_mbox_entry *)e;
+ struct roce_adr_table_mbox_entry *adr_e = (struct roce_adr_table_mbox_entry *)e;
+
+ switch (inmod) {
+ case MLX4_SET_PORT_GID_TABLE:
+ memcpy(to->gid, gid_e->gid, MLX4_GID_LEN);
+ return;
+ case MLX4_SET_PORT_ROCE_ADDR:
+ memcpy(to->gid, adr_e->gid, MLX4_GID_LEN);
+ to->type = adr_e->version;
+ return;
+ default:
+ return;
+ }
+}
+
+enum mlx4_set_port_roce_mode {
+ MLX4_SET_PORT_ROCE_MODE_1,
+ MLX4_SET_PORT_ROCE_MODE_1_5,
+ MLX4_SET_PORT_ROCE_MODE_1_PLUS_2,
+ MLX4_SET_PORT_ROCE_MODE_1_5_PLUS_2,
+ MLX4_SET_PORT_ROCE_MODE_MAX,
+ MLX4_SET_PORT_ROCE_MODE_INVALID = MLX4_SET_PORT_ROCE_MODE_MAX
+};
+
+static int mlx4_common_set_port(struct mlx4_dev *dev, int slave, u32 in_mod,
+ u8 op_mod, struct mlx4_cmd_mailbox *inbox)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_port_info *port_info;
+ struct mlx4_mfunc_master_ctx *master = &priv->mfunc.master;
+ struct mlx4_slave_state *slave_st = &master->slave_state[slave];
+ struct mlx4_set_port_rqp_calc_context *qpn_context;
+ struct mlx4_set_port_general_context *gen_context;
+ int reset_qkey_viols;
+ int port;
+ int is_eth;
+ int num_gids;
+ int base;
+ u32 in_modifier;
+ u32 promisc;
+ u16 mtu, prev_mtu;
+ int err;
+ int i, j;
+ int offset;
+ __be32 agg_cap_mask;
+ __be32 slave_cap_mask;
+ __be32 new_cap_mask;
+ void *mbox, *mbox2;
+ enum mlx4_set_port_roce_mode slave_set_port_mode;
+
+ port = in_mod & 0xff;
+ in_modifier = in_mod >> 8;
+ is_eth = op_mod;
+ port_info = &priv->port[port];
+
+ /* Slaves may only issue SET_PORT GENERAL (MTU), ROCE_ADDR and GID_TABLE */
+ if (is_eth) {
+ if (slave != dev->caps.function &&
+ in_modifier != MLX4_SET_PORT_GENERAL &&
+ in_modifier != MLX4_SET_PORT_ROCE_ADDR &&
+ in_modifier != MLX4_SET_PORT_GID_TABLE) {
+ mlx4_warn(dev, "denying SET_PORT for slave:%d\n",
+ slave);
+ return -EINVAL;
+ }
+ switch (in_modifier) {
+ case MLX4_SET_PORT_RQP_CALC:
+ qpn_context = inbox->buf;
+ qpn_context->base_qpn =
+ cpu_to_be32(port_info->base_qpn);
+ qpn_context->n_mac = 0x7;
+ promisc = be32_to_cpu(qpn_context->promisc) >>
+ SET_PORT_PROMISC_SHIFT;
+ qpn_context->promisc = cpu_to_be32(
+ promisc << SET_PORT_PROMISC_SHIFT |
+ port_info->base_qpn);
+ promisc = be32_to_cpu(qpn_context->mcast) >>
+ SET_PORT_MC_PROMISC_SHIFT;
+ qpn_context->mcast = cpu_to_be32(
+ promisc << SET_PORT_MC_PROMISC_SHIFT |
+ port_info->base_qpn);
+ break;
+ case MLX4_SET_PORT_GENERAL:
+ gen_context = inbox->buf;
+ /* MTU is configured as the max MTU among all
+ * the functions on the port. */
+ mtu = be16_to_cpu(gen_context->mtu);
+ mtu = min_t(int, mtu, dev->caps.eth_mtu_cap[port] +
+ ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN);
+ prev_mtu = slave_st->mtu[port];
+ slave_st->mtu[port] = mtu;
+ if (mtu > master->max_mtu[port])
+ master->max_mtu[port] = mtu;
+ if (mtu < prev_mtu && prev_mtu ==
+ master->max_mtu[port]) {
+ slave_st->mtu[port] = mtu;
+ master->max_mtu[port] = mtu;
+ for (i = 0; i < dev->num_slaves; i++) {
+ master->max_mtu[port] =
+ max(master->max_mtu[port],
+ master->slave_state[i].mtu[port]);
+ }
+ }
+ gen_context->mtu = cpu_to_be16(master->max_mtu[port]);
+
+ /* For old slaves we parse the port settings and figure
+ * out the roce_mode of the slave. Such slaves are
+ * assumed to use input_modifier MLX4_SET_PORT_GID_TABLE
+ * later on
+ */
+#define SET_PORT_ROCE_MODE_BITS 0x10
+ if (gen_context->flags & SET_PORT_ROCE_MODE_BITS) {
+ slave_set_port_mode = (gen_context->roce_mode >> 4) & 7;
+ switch (slave_set_port_mode) {
+ case MLX4_SET_PORT_ROCE_MODE_1:
+ slave_st->slave_gid_type = MLX4_ROCE_GID_TYPE_V1;
+ break;
+ case MLX4_SET_PORT_ROCE_MODE_1_5:
+ slave_st->slave_gid_type = MLX4_ROCE_GID_TYPE_V1_5;
+ break;
+ case MLX4_SET_PORT_ROCE_MODE_1_PLUS_2:
+ slave_st->slave_gid_type = MLX4_ROCE_GID_TYPE_INVALID;
+ break;
+ case MLX4_SET_PORT_ROCE_MODE_1_5_PLUS_2:
+ slave_st->slave_gid_type = MLX4_ROCE_GID_TYPE_V2;
+ break;
+ default:
+ return -EINVAL;
+ }
+ } else {
+ /* Old slaves don't set roce_mode
+ * if the mode is V1
+ */
+ slave_st->slave_gid_type = MLX4_ROCE_GID_TYPE_V1;
+ }
+
+ if ((slave_st->slave_gid_type != MLX4_ROCE_GID_TYPE_INVALID) &&
+ (mlx4_verify_supported_gid_type(dev, slave_st->slave_gid_type, NULL))) {
+ slave_st->slave_gid_type = MLX4_ROCE_GID_TYPE_INVALID;
+ return -EINVAL;
+ }
+
+#define SET_PORT_ROCE_IP_PROTO_BITS 0x20
+ if (slave)
+ gen_context->flags &= ~(SET_PORT_ROCE_MODE_BITS | SET_PORT_ROCE_IP_PROTO_BITS);
+ break;
+ case MLX4_SET_PORT_GID_TABLE:
+ if (slave_st->slave_gid_type == MLX4_ROCE_GID_TYPE_INVALID)
+ return -EINVAL;
+ case MLX4_SET_PORT_ROCE_ADDR: /* GID_TABLE falls through to here */
+ num_gids = mlx4_get_slave_num_gids(dev, slave, port);
+ base = mlx4_get_base_gid_ix(dev, slave, port);
+
+ /* check that VF table is valid */
+ mbox = (struct roce_adr_table_mbox_entry *)(inbox->buf);
+ for (i = 0;
+ i < num_gids;
+ i++, mbox = roce_table_entry_next(in_modifier, mbox)) {
+ if (!roce_table_entry_is_valid(dev, in_modifier, mbox)) {
+ pr_err("Invalid entry %d in RoCE GID table mailbox of slave %d\n",
+ i, slave);
+ return -EINVAL;
+ }
+ }
+
+ /* check for duplicates inside VF mailbox */
+ mbox = (void *)(inbox->buf);
+ for (i = 0;
+ i < num_gids;
+ i++, mbox = roce_table_entry_next(in_modifier, mbox)) {
+ if (roce_table_entry_is_empty(in_modifier, mbox))
+ continue;
+ mbox2 = roce_table_entry_next(in_modifier, mbox);
+ for (j = i + 1;
+ j < num_gids;
+ j++, mbox2 = roce_table_entry_next(in_modifier, mbox2)) {
+ if (roce_table_entry_is_eq(in_modifier, mbox, mbox2)) {
+ pr_err("Duplicate entry within slave %d GID table\n", slave);
+ return -EINVAL;
+ }
+ }
+ }
+ mutex_lock(&(priv->port[port].roce.mutex));
+ /* check for duplicates with other VFs */
+ for (i = 0; i < MLX4_ROCE_MAX_GIDS; i++) {
+ struct mlx4_roce_addr *a = &priv->port[port].roce.addr_table.addr[i];
+
+ if (i >= base && i < base + num_gids)
+ continue; /* don't compare to slave's current gids */
+
+ if (!memcmp(a->gid, mlx4_zgid, MLX4_GID_LEN))
+ continue;
+
+ mbox = (void *)(inbox->buf);
+ for (j = 0;
+ j < num_gids;
+ j++, mbox = roce_table_entry_next(in_modifier, mbox)) {
+ if (roce_table_entry_has_gid(in_modifier, mbox, a->gid)) {
+ mutex_unlock(&(priv->port[port].roce.mutex));
+ pr_err("Duplicate GID for slave %d with another slave\n", slave);
+ return -EINVAL;
+ }
+
+ }
+ }
+ /* add GIDs to HW */
+ mbox = (void *)(inbox->buf);
+ for (i = 0, offset = base;
+ i < num_gids;
+ mbox = roce_table_entry_next(in_modifier, mbox), offset++, i++) {
+ struct mlx4_roce_addr *a = &priv->port[port].roce.addr_table.addr[offset];
+
+ roce_table_entry_copy(in_modifier, mbox, a);
+
+ /* If slave doesn't use MLX4_SET_PORT_ROCE_ADDR
+ * take type from slave's global RoCE mode
+ */
+ if (in_modifier == MLX4_SET_PORT_GID_TABLE)
+ a->type = slave_st->slave_gid_type;
+ }
+ mutex_unlock(&(priv->port[port].roce.mutex));
+ err = mlx4_update_roce_addr_table(dev, port, &priv->port[port].roce.addr_table, MLX4_CMD_NATIVE);
+ return err;
+ }
+ return mlx4_cmd(dev, inbox->dma, in_mod & 0xffff, op_mod,
+ MLX4_CMD_SET_PORT, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_NATIVE);
+ }
+
+ /* Slaves are not allowed to SET_PORT beacon (LED) blink */
+ if (op_mod == MLX4_SET_PORT_BEACON_OPCODE) {
+ mlx4_warn(dev, "denying SET_PORT Beacon slave:%d\n", slave);
+ return -EPERM;
+ }
+
+ /* For IB, we only consider:
+ * - The capability mask, which is set to the aggregate of all
+ * slave function capabilities
+ * - The QKey violation counter - reset according to each request.
+ */
+
+ if (dev->flags & MLX4_FLAG_OLD_PORT_CMDS) {
+ reset_qkey_viols = (*(u8 *) inbox->buf) & 0x40;
+ new_cap_mask = ((__be32 *) inbox->buf)[2];
+ } else {
+ reset_qkey_viols = ((u8 *) inbox->buf)[3] & 0x1;
+ new_cap_mask = ((__be32 *) inbox->buf)[1];
+ }
+
+ /* slave may not set the IS_SM capability for the port */
+ if (slave != mlx4_master_func_num(dev) &&
+ (be32_to_cpu(new_cap_mask) & MLX4_PORT_CAP_IS_SM))
+ return -EINVAL;
+
+ /* No DEV_MGMT in multifunc mode */
+ if (mlx4_is_mfunc(dev) &&
+ (be32_to_cpu(new_cap_mask) & MLX4_PORT_CAP_DEV_MGMT_SUP))
+ return -EINVAL;
+
+ agg_cap_mask = 0;
+ slave_cap_mask =
+ priv->mfunc.master.slave_state[slave].ib_cap_mask[port];
+ priv->mfunc.master.slave_state[slave].ib_cap_mask[port] = new_cap_mask;
+ for (i = 0; i < dev->num_slaves; i++)
+ agg_cap_mask |=
+ priv->mfunc.master.slave_state[i].ib_cap_mask[port];
+
+ /* only clear mailbox for guests. Master may be setting
+ * MTU or PKEY table size
+ */
+ if (slave != dev->caps.function)
+ memset(inbox->buf, 0, 256);
+ if (dev->flags & MLX4_FLAG_OLD_PORT_CMDS) {
+ *(u8 *) inbox->buf |= !!reset_qkey_viols << 6;
+ ((__be32 *) inbox->buf)[2] = agg_cap_mask;
+ } else {
+ ((u8 *) inbox->buf)[3] |= !!reset_qkey_viols;
+ ((__be32 *) inbox->buf)[1] = agg_cap_mask;
+ }
+
+ err = mlx4_cmd(dev, inbox->dma, port, is_eth, MLX4_CMD_SET_PORT,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_NATIVE);
+ if (err)
+ priv->mfunc.master.slave_state[slave].ib_cap_mask[port] =
+ slave_cap_mask;
+ return err;
+}
+
+int mlx4_SET_PORT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int port = mlx4_slave_convert_port(
+ dev, slave, vhcr->in_modifier & 0xFF);
+
+ if (port < 0)
+ return -EINVAL;
+
+ vhcr->in_modifier = (vhcr->in_modifier & ~0xFF) |
+ (port & 0xFF);
+
+ return mlx4_common_set_port(dev, slave, vhcr->in_modifier,
+ vhcr->op_modifier, inbox);
+}
+
+/* bit locations for set port command with zero op modifier */
+enum {
+ MLX4_SET_PORT_VL_CAP = 4, /* bits 7:4 */
+ MLX4_SET_PORT_MTU_CAP = 12, /* bits 15:12 */
+ MLX4_CHANGE_PORT_PKEY_TBL_SZ = 20,
+ MLX4_CHANGE_PORT_VL_CAP = 21,
+ MLX4_CHANGE_PORT_MTU_CAP = 22,
+};
+
+int mlx4_SET_PORT(struct mlx4_dev *dev, u8 port, int pkey_tbl_sz)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ int err, vl_cap, pkey_tbl_flag = 0;
+
+ if (dev->caps.port_type[port] == MLX4_PORT_TYPE_ETH)
+ return 0;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ ((__be32 *) mailbox->buf)[1] = dev->caps.ib_port_def_cap[port];
+
+ if (pkey_tbl_sz >= 0 && mlx4_is_master(dev)) {
+ pkey_tbl_flag = 1;
+ ((__be16 *) mailbox->buf)[20] = cpu_to_be16(pkey_tbl_sz);
+ }
+
+ /* IB VL CAP enum isn't used by the firmware, just numerical values */
+ for (vl_cap = 8; vl_cap >= 1; vl_cap >>= 1) {
+ ((__be32 *) mailbox->buf)[0] = cpu_to_be32(
+ (1 << MLX4_CHANGE_PORT_MTU_CAP) |
+ (1 << MLX4_CHANGE_PORT_VL_CAP) |
+ (pkey_tbl_flag << MLX4_CHANGE_PORT_PKEY_TBL_SZ) |
+ (dev->caps.port_ib_mtu[port] << MLX4_SET_PORT_MTU_CAP) |
+ (vl_cap << MLX4_SET_PORT_VL_CAP));
+ err = mlx4_cmd(dev, mailbox->dma, port,
+ MLX4_SET_PORT_IB_OPCODE, MLX4_CMD_SET_PORT,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_WRAPPED);
+ if (err != -ENOMEM)
+ break;
+ }
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+
+static inline enum mlx4_set_port_roce_mode get_set_port_roce_mode(struct mlx4_dev *dev)
+{
+ switch (dev->caps.roce_mode) {
+ case MLX4_ROCE_MODE_1:
+ return MLX4_SET_PORT_ROCE_MODE_1;
+ case MLX4_ROCE_MODE_1_5:
+ return MLX4_SET_PORT_ROCE_MODE_1_5;
+ case MLX4_ROCE_MODE_2:
+ case MLX4_ROCE_MODE_1_5_PLUS_2:
+ return MLX4_SET_PORT_ROCE_MODE_1_5_PLUS_2;
+ case MLX4_ROCE_MODE_1_PLUS_2:
+ return MLX4_SET_PORT_ROCE_MODE_1_PLUS_2;
+ default:
+ return MLX4_SET_PORT_ROCE_MODE_INVALID;
+ }
+}
+#define SET_PORT_ROCE_1_5_FLAGS 0x30
+#define SET_PORT_ROCE_2_FLAGS 0x10
+#define MLX4_SET_PORT_ROCE_V1_V2 0x2
+int mlx4_SET_PORT_general(struct mlx4_dev *dev, u8 port, int mtu,
+ u8 pptx, u8 pfctx, u8 pprx, u8 pfcrx)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_set_port_general_context *context;
+ int err;
+ u32 in_mod;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ context = mailbox->buf;
+ context->flags = SET_PORT_GEN_ALL_VALID;
+ context->mtu = cpu_to_be16(mtu);
+ context->pptx = (pptx * (!pfctx)) << 7;
+ context->pfctx = pfctx;
+ context->pprx = (pprx * (!pfcrx)) << 7;
+ context->pfcrx = pfcrx;
+
+ if (dev->caps.port_type[port] == MLX4_PORT_TYPE_ETH) {
+ enum mlx4_set_port_roce_mode set_roce_mode = get_set_port_roce_mode(dev);
+
+ if (set_roce_mode == MLX4_SET_PORT_ROCE_MODE_INVALID)
+ return -EINVAL;
+
+ context->roce_mode |= (set_roce_mode & 7) << 4;
+ if (set_roce_mode == MLX4_SET_PORT_ROCE_MODE_1_5 ||
+ set_roce_mode == MLX4_SET_PORT_ROCE_MODE_1_5_PLUS_2) {
+ context->flags |= SET_PORT_ROCE_1_5_FLAGS;
+ context->rr_proto = dev->caps.rr_proto;
+ } else if (set_roce_mode == MLX4_SET_PORT_ROCE_MODE_1_PLUS_2) {
+ context->flags |= SET_PORT_ROCE_2_FLAGS;
+ }
+
+ if (!mlx4_is_slave(dev) &&
+ (dev->caps.roce_mode == MLX4_ROCE_MODE_1_5_PLUS_2 ||
+ dev->caps.roce_mode == MLX4_ROCE_MODE_2 ||
+ dev->caps.roce_mode == MLX4_ROCE_MODE_1_PLUS_2)) {
+#define MLX4_ROCE_V2_UDP_DPORT BIT(3)
+ //err = mlx4_config_roce_v2_port(dev, ROCE_V2_UDP_DPORT);
+ err = mlx4_config_roce_v2_port(dev, MLX4_ROCE_V2_UDP_DPORT);
+ if (err)
+ return err;
+ }
+ }
+
+ in_mod = MLX4_SET_PORT_GENERAL << 8 | port;
+ err = mlx4_cmd(dev, mailbox->dma, in_mod, MLX4_SET_PORT_ETH_OPCODE,
+ MLX4_CMD_SET_PORT, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_WRAPPED);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+EXPORT_SYMBOL(mlx4_SET_PORT_general);
+
+int mlx4_SET_PORT_qpn_calc(struct mlx4_dev *dev, u8 port, u32 base_qpn,
+ u8 promisc)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_set_port_rqp_calc_context *context;
+ int err;
+ u32 in_mod;
+ u32 m_promisc = (dev->caps.flags & MLX4_DEV_CAP_FLAG_VEP_MC_STEER) ?
+ MCAST_DIRECT : MCAST_DEFAULT;
+
+ if (dev->caps.steering_mode != MLX4_STEERING_MODE_A0)
+ return 0;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ context = mailbox->buf;
+ context->base_qpn = cpu_to_be32(base_qpn);
+ context->n_mac = dev->caps.log_num_macs;
+ context->promisc = cpu_to_be32(promisc << SET_PORT_PROMISC_SHIFT |
+ base_qpn);
+ context->mcast = cpu_to_be32(m_promisc << SET_PORT_MC_PROMISC_SHIFT |
+ base_qpn);
+ context->intra_no_vlan = 0;
+ context->no_vlan = MLX4_NO_VLAN_IDX;
+ context->intra_vlan_miss = 0;
+ context->vlan_miss = MLX4_VLAN_MISS_IDX;
+
+ in_mod = MLX4_SET_PORT_RQP_CALC << 8 | port;
+ err = mlx4_cmd(dev, mailbox->dma, in_mod, MLX4_SET_PORT_ETH_OPCODE,
+ MLX4_CMD_SET_PORT, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_WRAPPED);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+EXPORT_SYMBOL(mlx4_SET_PORT_qpn_calc);
+
+int mlx4_SET_PORT_fcs_check(struct mlx4_dev *dev, u8 port, u8 ignore_fcs_value)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_set_port_general_context *context;
+ u32 in_mod;
+ int err;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ context = mailbox->buf;
+ context->v_ignore_fcs |= MLX4_FLAG_V_IGNORE_FCS_MASK;
+ if (ignore_fcs_value)
+ context->roce_mode |= MLX4_IGNORE_FCS_MASK;
+ else
+ context->roce_mode &= ~MLX4_IGNORE_FCS_MASK;
+
+ in_mod = MLX4_SET_PORT_GENERAL << 8 | port;
+ err = mlx4_cmd(dev, mailbox->dma, in_mod, 1, MLX4_CMD_SET_PORT,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_NATIVE);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+EXPORT_SYMBOL(mlx4_SET_PORT_fcs_check);
+
+enum {
+ VXLAN_ENABLE_MODIFY = 1 << 7,
+ VXLAN_STEERING_MODIFY = 1 << 6,
+
+ VXLAN_ENABLE = 1 << 7,
+};
+
+struct mlx4_set_port_vxlan_context {
+ u32 reserved1;
+ u8 modify_flags;
+ u8 reserved2;
+ u8 enable_flags;
+ u8 steering;
+};
+
+int mlx4_SET_PORT_VXLAN(struct mlx4_dev *dev, u8 port, u8 steering, int enable)
+{
+ int err;
+ u32 in_mod;
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_set_port_vxlan_context *context;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+ context = mailbox->buf;
+ memset(context, 0, sizeof(*context));
+
+ context->modify_flags = VXLAN_ENABLE_MODIFY | VXLAN_STEERING_MODIFY;
+ if (enable)
+ context->enable_flags = VXLAN_ENABLE;
+ context->steering = steering;
+
+ in_mod = MLX4_SET_PORT_VXLAN << 8 | port;
+ err = mlx4_cmd(dev, mailbox->dma, in_mod, MLX4_SET_PORT_ETH_OPCODE,
+ MLX4_CMD_SET_PORT, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_NATIVE);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+EXPORT_SYMBOL(mlx4_SET_PORT_VXLAN);
+
+int mlx4_SET_PORT_BEACON(struct mlx4_dev *dev, u8 port, u16 time)
+{
+ int err;
+ struct mlx4_cmd_mailbox *mailbox;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ *((__be32 *)mailbox->buf) = cpu_to_be32(time);
+
+ err = mlx4_cmd(dev, mailbox->dma, port, MLX4_SET_PORT_BEACON_OPCODE,
+ MLX4_CMD_SET_PORT, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_NATIVE);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+EXPORT_SYMBOL(mlx4_SET_PORT_BEACON);
+
+int mlx4_SET_MCAST_FLTR_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err = 0;
+
+ return err;
+}
+
+int mlx4_SET_MCAST_FLTR(struct mlx4_dev *dev, u8 port,
+ u64 mac, u64 clear, u8 mode)
+{
+ return mlx4_cmd(dev, (mac | (clear << 63)), port, mode,
+ MLX4_CMD_SET_MCAST_FLTR, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_WRAPPED);
+}
+EXPORT_SYMBOL(mlx4_SET_MCAST_FLTR);
+
+int mlx4_SET_VLAN_FLTR_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err = 0;
+
+ return err;
+}
+
+int mlx4_DUMP_ETH_STATS_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ return 0;
+}
+
+int mlx4_get_slave_from_roce_gid(struct mlx4_dev *dev, int port, u8 *gid,
+ int *slave_id)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int i, found_ix = -1;
+ int vf_gids = MLX4_ROCE_MAX_GIDS - MLX4_ROCE_PF_GIDS;
+ struct mlx4_slaves_pport slaves_pport;
+ unsigned num_vfs;
+ int slave_gid;
+
+ if (!mlx4_is_mfunc(dev))
+ return -EINVAL;
+
+ slaves_pport = mlx4_phys_to_slaves_pport(dev, port);
+ num_vfs = bitmap_weight(slaves_pport.slaves,
+ dev->persist->num_vfs + 1) - 1;
+
+ for (i = 0; i < MLX4_ROCE_MAX_GIDS; i++) {
+ struct mlx4_roce_addr *a = &priv->port[port].roce.addr_table.addr[i];
+
+ if (!memcmp(a->gid, gid, MLX4_GID_LEN)) {
+ found_ix = i;
+ break;
+ }
+ }
+
+ if (found_ix >= 0) {
+ /* Calculate a slave_gid which is the slave number in the gid
+ * table and not a globally unique slave number.
+ */
+ if (found_ix < MLX4_ROCE_PF_GIDS)
+ slave_gid = 0;
+ else if (found_ix < MLX4_ROCE_PF_GIDS + (vf_gids % num_vfs) *
+ (vf_gids / num_vfs + 1))
+ slave_gid = ((found_ix - MLX4_ROCE_PF_GIDS) /
+ (vf_gids / num_vfs + 1)) + 1;
+ else
+ slave_gid =
+ ((found_ix - MLX4_ROCE_PF_GIDS -
+ ((vf_gids % num_vfs) * ((vf_gids / num_vfs + 1)))) /
+ (vf_gids / num_vfs)) + vf_gids % num_vfs + 1;
+
+ /* Calculate the globally unique slave id */
+ if (slave_gid) {
+ struct mlx4_active_ports exclusive_ports;
+ struct mlx4_active_ports actv_ports;
+ struct mlx4_slaves_pport slaves_pport_actv;
+ unsigned max_port_p_one;
+ int num_vfs_before = 0;
+ int candidate_slave_gid;
+
+ /* Calculate how many VFs are on the previous ports, if any */
+ for (i = 1; i < port; i++) {
+ bitmap_zero(exclusive_ports.ports, dev->caps.num_ports);
+ set_bit(i - 1, exclusive_ports.ports);
+ slaves_pport_actv =
+ mlx4_phys_to_slaves_pport_actv(
+ dev, &exclusive_ports);
+ num_vfs_before += bitmap_weight(
+ slaves_pport_actv.slaves,
+ dev->persist->num_vfs + 1);
+ }
+
+ /* candidate_slave_gid isn't necessarily the correct slave, but
+ * it has the same number of ports and is assigned to the same
+ * ports as the real slave we're looking for. On dual port VF,
+ * slave_gid = [single port VFs on port <port>] +
+ * [offset of the current slave from the first dual port VF] +
+ * 1 (for the PF).
+ */
+ candidate_slave_gid = slave_gid + num_vfs_before;
+
+ actv_ports = mlx4_get_active_ports(dev, candidate_slave_gid);
+ max_port_p_one = find_first_bit(
+ actv_ports.ports, dev->caps.num_ports) +
+ bitmap_weight(actv_ports.ports,
+ dev->caps.num_ports) + 1;
+
+ /* Calculate the real slave number */
+ for (i = 1; i < max_port_p_one; i++) {
+ if (i == port)
+ continue;
+ bitmap_zero(exclusive_ports.ports,
+ dev->caps.num_ports);
+ set_bit(i - 1, exclusive_ports.ports);
+ slaves_pport_actv =
+ mlx4_phys_to_slaves_pport_actv(
+ dev, &exclusive_ports);
+ slave_gid += bitmap_weight(
+ slaves_pport_actv.slaves,
+ dev->persist->num_vfs + 1);
+ }
+ }
+ *slave_id = slave_gid;
+ }
+
+ return (found_ix >= 0) ? 0 : -EINVAL;
+}
+EXPORT_SYMBOL(mlx4_get_slave_from_roce_gid);
+
+int mlx4_get_roce_gid_from_slave(struct mlx4_dev *dev, int port, int slave_id,
+ u8 *gid, enum mlx4_roce_gid_type *gid_type)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_roce_addr *a = &priv->port[port].roce.addr_table.addr[slave_id];
+
+
+ if (!mlx4_is_master(dev))
+ return -EINVAL;
+
+ memcpy(gid, a->gid, MLX4_GID_LEN);
+ *gid_type = a->type;
+ return 0;
+}
+EXPORT_SYMBOL(mlx4_get_roce_gid_from_slave);
+
+/* Cable Module Info */
+#define MODULE_INFO_MAX_READ 48
+
+#define I2C_ADDR_LOW 0x50
+#define I2C_ADDR_HIGH 0x51
+#define I2C_PAGE_SIZE 256
+
+/* Module Info Data */
+struct mlx4_cable_info {
+ u8 i2c_addr;
+ u8 page_num;
+ __be16 dev_mem_address;
+ __be16 reserved1;
+ __be16 size;
+ __be32 reserved2[2];
+ u8 data[MODULE_INFO_MAX_READ];
+};
+
+enum cable_info_err {
+ CABLE_INF_INV_PORT = 0x1,
+ CABLE_INF_OP_NOSUP = 0x2,
+ CABLE_INF_NOT_CONN = 0x3,
+ CABLE_INF_NO_EEPRM = 0x4,
+ CABLE_INF_PAGE_ERR = 0x5,
+ CABLE_INF_INV_ADDR = 0x6,
+ CABLE_INF_I2C_ADDR = 0x7,
+ CABLE_INF_QSFP_VIO = 0x8,
+ CABLE_INF_I2C_BUSY = 0x9,
+};
+
+#define MAD_STATUS_2_CABLE_ERR(mad_status) ((mad_status >> 8) & 0xFF)
+
+static inline const char *cable_info_mad_err_str(u16 mad_status)
+{
+ u8 err = MAD_STATUS_2_CABLE_ERR(mad_status);
+
+ switch (err) {
+ case CABLE_INF_INV_PORT:
+ return "invalid port selected";
+ case CABLE_INF_OP_NOSUP:
+ return "operation not supported for this port (the port is of type CX4 or internal)";
+ case CABLE_INF_NOT_CONN:
+ return "cable is not connected";
+ case CABLE_INF_NO_EEPRM:
+ return "the connected cable has no EPROM (passive copper cable)";
+ case CABLE_INF_PAGE_ERR:
+ return "page number is greater than 15";
+ case CABLE_INF_INV_ADDR:
+ return "invalid device_address or size (that is, size equals 0 or address+size is greater than 256)";
+ case CABLE_INF_I2C_ADDR:
+ return "invalid I2C slave address";
+ case CABLE_INF_QSFP_VIO:
+ return "at least one cable violates the QSFP specification and ignores the modsel signal";
+ case CABLE_INF_I2C_BUSY:
+ return "I2C bus is constantly busy";
+ }
+ return "Unknown Error";
+}
+
+/**
+ * mlx4_get_module_info - Read cable module eeprom data
+ * @dev: mlx4_dev.
+ * @port: port number.
+ * @offset: byte offset in eeprom to start reading data from.
+ * @size: num of bytes to read.
+ * @data: output buffer to put the requested data into.
+ *
+ * Reads cable module eeprom data, puts the outcome data into
+ * data pointer parameter.
+ * Returns num of read bytes on success or a negative error
+ * code.
+ */
+int mlx4_get_module_info(struct mlx4_dev *dev, u8 port,
+ u16 offset, u16 size, u8 *data)
+{
+ struct mlx4_cmd_mailbox *inbox, *outbox;
+ struct mlx4_mad_ifc *inmad, *outmad;
+ struct mlx4_cable_info *cable_info;
+ u16 i2c_addr;
+ int ret;
+
+ if (size > MODULE_INFO_MAX_READ)
+ size = MODULE_INFO_MAX_READ;
+
+ inbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(inbox))
+ return PTR_ERR(inbox);
+
+ outbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(outbox)) {
+ mlx4_free_cmd_mailbox(dev, inbox);
+ return PTR_ERR(outbox);
+ }
+
+ inmad = (struct mlx4_mad_ifc *)(inbox->buf);
+ outmad = (struct mlx4_mad_ifc *)(outbox->buf);
+
+ inmad->method = 0x1; /* Get */
+ inmad->class_version = 0x1;
+ inmad->mgmt_class = 0x1;
+ inmad->base_version = 0x1;
+ inmad->attr_id = cpu_to_be16(0xFF60); /* Module Info */
+
+ if (offset < I2C_PAGE_SIZE && offset + size > I2C_PAGE_SIZE)
+ /* Cross-page reads are not allowed;
+ * read only up to offset 256 in the low page
+ */
+ size -= offset + size - I2C_PAGE_SIZE;
+
+ i2c_addr = I2C_ADDR_LOW;
+ if (offset >= I2C_PAGE_SIZE) {
+ /* Reset offset to high page */
+ i2c_addr = I2C_ADDR_HIGH;
+ offset -= I2C_PAGE_SIZE;
+ }
+
+ cable_info = (struct mlx4_cable_info *)inmad->data;
+ cable_info->dev_mem_address = cpu_to_be16(offset);
+ cable_info->page_num = 0;
+ cable_info->i2c_addr = i2c_addr;
+ cable_info->size = cpu_to_be16(size);
+
+ ret = mlx4_cmd_box(dev, inbox->dma, outbox->dma, port, 3,
+ MLX4_CMD_MAD_IFC, MLX4_CMD_TIME_CLASS_C,
+ MLX4_CMD_NATIVE);
+ if (ret)
+ goto out;
+
+ if (be16_to_cpu(outmad->status)) {
+ /* Mad returned with bad status */
+ ret = be16_to_cpu(outmad->status);
+ mlx4_warn(dev,
+ "MLX4_CMD_MAD_IFC Get Module info attr(%x) port(%d) i2c_addr(%x) offset(%d) size(%d): Response Mad Status(%x) - %s\n",
+ 0xFF60, port, i2c_addr, offset, size,
+ ret, cable_info_mad_err_str(ret));
+
+ if (i2c_addr == I2C_ADDR_HIGH &&
+ MAD_STATUS_2_CABLE_ERR(ret) == CABLE_INF_I2C_ADDR)
+ /* Some SFP cables do not support i2c slave
+ * address 0x51 (high page), abort silently.
+ */
+ ret = 0;
+ else
+ ret = -ret;
+ goto out;
+ }
+ cable_info = (struct mlx4_cable_info *)outmad->data;
+ memcpy(data, cable_info->data, size);
+ ret = size;
+out:
+ mlx4_free_cmd_mailbox(dev, inbox);
+ mlx4_free_cmd_mailbox(dev, outbox);
+ return ret;
+}
+EXPORT_SYMBOL(mlx4_get_module_info);
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/profile.c b/drivers/net/mlnx_uio/mlnx/mlx4/profile.c
new file mode 100644
index 0000000..e149173
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/profile.c
@@ -0,0 +1,259 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved.
+ * Copyright (c) 2005 Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+
+#include "mlx4.h"
+#include "fw.h"
+
+#include "log2.h"
+
+enum {
+ MLX4_RES_QP,
+ MLX4_RES_RDMARC,
+ MLX4_RES_ALTC,
+ MLX4_RES_AUXC,
+ MLX4_RES_SRQ,
+ MLX4_RES_CQ,
+ MLX4_RES_EQ,
+ MLX4_RES_DMPT,
+ MLX4_RES_CMPT,
+ MLX4_RES_MTT,
+ MLX4_RES_MCG,
+ MLX4_RES_NUM
+};
+
+static const char *res_name[] = {
+ [MLX4_RES_QP] = "QP",
+ [MLX4_RES_RDMARC] = "RDMARC",
+ [MLX4_RES_ALTC] = "ALTC",
+ [MLX4_RES_AUXC] = "AUXC",
+ [MLX4_RES_SRQ] = "SRQ",
+ [MLX4_RES_CQ] = "CQ",
+ [MLX4_RES_EQ] = "EQ",
+ [MLX4_RES_DMPT] = "DMPT",
+ [MLX4_RES_CMPT] = "CMPT",
+ [MLX4_RES_MTT] = "MTT",
+ [MLX4_RES_MCG] = "MCG",
+};
+
+u64 mlx4_make_profile(struct mlx4_dev *dev,
+ struct mlx4_profile *request,
+ struct mlx4_dev_cap *dev_cap,
+ struct mlx4_init_hca_param *init_hca)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource {
+ u64 size;
+ u64 start;
+ int type;
+ u32 num;
+ int log_num;
+ };
+
+ u64 total_size = 0;
+ struct mlx4_resource *profile;
+ struct mlx4_resource tmp;
+ int i, j;
+
+ profile = kcalloc(MLX4_RES_NUM, sizeof(*profile), GFP_KERNEL);
+ if (!profile)
+ return -ENOMEM;
+
+ profile[MLX4_RES_QP].size = dev_cap->qpc_entry_sz;
+ profile[MLX4_RES_RDMARC].size = dev_cap->rdmarc_entry_sz;
+ profile[MLX4_RES_ALTC].size = dev_cap->altc_entry_sz;
+ profile[MLX4_RES_AUXC].size = dev_cap->aux_entry_sz;
+ profile[MLX4_RES_SRQ].size = dev_cap->srq_entry_sz;
+ profile[MLX4_RES_CQ].size = dev_cap->cqc_entry_sz;
+ profile[MLX4_RES_EQ].size = dev_cap->eqc_entry_sz;
+ profile[MLX4_RES_DMPT].size = dev_cap->dmpt_entry_sz;
+ profile[MLX4_RES_CMPT].size = dev_cap->cmpt_entry_sz;
+ profile[MLX4_RES_MTT].size = dev_cap->mtt_entry_sz;
+ profile[MLX4_RES_MCG].size = mlx4_get_mgm_entry_size(dev);
+
+ profile[MLX4_RES_QP].num = request->num_qp;
+ profile[MLX4_RES_RDMARC].num = request->num_qp * request->rdmarc_per_qp;
+ profile[MLX4_RES_ALTC].num = request->num_qp;
+ profile[MLX4_RES_AUXC].num = request->num_qp;
+ profile[MLX4_RES_SRQ].num = request->num_srq;
+ profile[MLX4_RES_CQ].num = request->num_cq;
+ profile[MLX4_RES_EQ].num = mlx4_is_mfunc(dev) ? dev->phys_caps.num_phys_eqs :
+ min_t(unsigned, dev_cap->max_eqs, MAX_MSIX);
+ profile[MLX4_RES_DMPT].num = request->num_mpt;
+ profile[MLX4_RES_CMPT].num = MLX4_NUM_CMPTS;
+ profile[MLX4_RES_MTT].num = request->num_mtt * (1 << log_mtts_per_seg);
+ profile[MLX4_RES_MCG].num = request->num_mcg;
+
+ for (i = 0; i < MLX4_RES_NUM; ++i) {
+ profile[i].type = i;
+ profile[i].num = roundup_pow_of_two(profile[i].num);
+ profile[i].log_num = ilog2(profile[i].num);
+ profile[i].size *= profile[i].num;
+ profile[i].size = max(profile[i].size, (u64) PAGE_SIZE);
+ }
+
+ /*
+ * Sort the resources in decreasing order of size. Since they
+ * all have sizes that are powers of 2, we'll be able to keep
+ * resources aligned to their size and pack them without gaps
+ * using the sorted order.
+ */
+ for (i = MLX4_RES_NUM; i > 0; --i)
+ for (j = 1; j < i; ++j) {
+ if (profile[j].size > profile[j - 1].size) {
+ tmp = profile[j];
+ profile[j] = profile[j - 1];
+ profile[j - 1] = tmp;
+ }
+ }
+
+ for (i = 0; i < MLX4_RES_NUM; ++i) {
+ if (profile[i].size) {
+ profile[i].start = total_size;
+ total_size += profile[i].size;
+ }
+
+ if (total_size > dev_cap->max_icm_sz) {
+ mlx4_err(dev, "Profile requires 0x%llx bytes; won't fit in 0x%llx bytes of context memory\n",
+ (unsigned long long) total_size,
+ (unsigned long long) dev_cap->max_icm_sz);
+ kfree(profile);
+ return -ENOMEM;
+ }
+
+ if (profile[i].size)
+ mlx4_dbg(dev, " profile[%2d] (%6s): 2^%02d entries @ 0x%10llx, size 0x%10llx\n",
+ i, res_name[profile[i].type],
+ profile[i].log_num,
+ (unsigned long long) profile[i].start,
+ (unsigned long long) profile[i].size);
+ }
+
+ mlx4_dbg(dev, "HCA context memory: reserving %d KB\n",
+ (int) (total_size >> 10));
+
+ for (i = 0; i < MLX4_RES_NUM; ++i) {
+ switch (profile[i].type) {
+ case MLX4_RES_QP:
+ dev->caps.num_qps = profile[i].num;
+ init_hca->qpc_base = profile[i].start;
+ init_hca->log_num_qps = profile[i].log_num;
+ break;
+ case MLX4_RES_RDMARC:
+ for (priv->qp_table.rdmarc_shift = 0;
+ request->num_qp << priv->qp_table.rdmarc_shift < profile[i].num;
+ ++priv->qp_table.rdmarc_shift)
+ ; /* nothing */
+ dev->caps.max_qp_dest_rdma = 1 << priv->qp_table.rdmarc_shift;
+ priv->qp_table.rdmarc_base = (u32) profile[i].start;
+ init_hca->rdmarc_base = profile[i].start;
+ init_hca->log_rd_per_qp = priv->qp_table.rdmarc_shift;
+ break;
+ case MLX4_RES_ALTC:
+ init_hca->altc_base = profile[i].start;
+ break;
+ case MLX4_RES_AUXC:
+ init_hca->auxc_base = profile[i].start;
+ break;
+ case MLX4_RES_SRQ:
+ dev->caps.num_srqs = profile[i].num;
+ init_hca->srqc_base = profile[i].start;
+ init_hca->log_num_srqs = profile[i].log_num;
+ break;
+ case MLX4_RES_CQ:
+ dev->caps.num_cqs = profile[i].num;
+ init_hca->cqc_base = profile[i].start;
+ init_hca->log_num_cqs = profile[i].log_num;
+ break;
+ case MLX4_RES_EQ:
+ if (dev_cap->flags2 & MLX4_DEV_CAP_FLAG2_SYS_EQS) {
+ init_hca->log_num_eqs = 0x1f;
+ init_hca->eqc_base = profile[i].start;
+ init_hca->num_sys_eqs = dev_cap->num_sys_eqs;
+ } else {
+ dev->caps.num_eqs = roundup_pow_of_two(
+ min_t(unsigned,
+ dev_cap->max_eqs,
+ MAX_MSIX));
+ init_hca->eqc_base = profile[i].start;
+ init_hca->log_num_eqs = ilog2(dev->caps.num_eqs);
+ }
+ break;
+ case MLX4_RES_DMPT:
+ dev->caps.num_mpts = profile[i].num;
+ priv->mr_table.mpt_base = profile[i].start;
+ init_hca->dmpt_base = profile[i].start;
+ init_hca->log_mpt_sz = profile[i].log_num;
+ break;
+ case MLX4_RES_CMPT:
+ init_hca->cmpt_base = profile[i].start;
+ break;
+ case MLX4_RES_MTT:
+ dev->caps.num_mtts = profile[i].num;
+ priv->mr_table.mtt_base = profile[i].start;
+ init_hca->mtt_base = profile[i].start;
+ break;
+ case MLX4_RES_MCG:
+ init_hca->mc_base = profile[i].start;
+ init_hca->log_mc_entry_sz =
+ ilog2(mlx4_get_mgm_entry_size(dev));
+ init_hca->log_mc_table_sz = profile[i].log_num;
+ if (dev->caps.steering_mode ==
+ MLX4_STEERING_MODE_DEVICE_MANAGED) {
+ dev->caps.num_mgms = profile[i].num;
+ } else {
+ init_hca->log_mc_hash_sz =
+ profile[i].log_num - 1;
+ dev->caps.num_mgms = profile[i].num >> 1;
+ dev->caps.num_amgms = profile[i].num >> 1;
+ }
+ break;
+ default:
+ break;
+ }
+ }
+
+ /*
+ * PDs don't take any HCA memory, but we assign them as part
+ * of the HCA profile anyway.
+ */
+ dev->caps.num_pds = MLX4_NUM_PDS;
+
+ kfree(profile);
+ return total_size;
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/qp.c b/drivers/net/mlnx_uio/mlnx/mlx4/qp.c
new file mode 100644
index 0000000..b010028
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/qp.c
@@ -0,0 +1,956 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2004 Topspin Communications. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007 Cisco Systems, Inc. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007, 2008 Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2004 Voltaire, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+
+
+#include "mlx4.h"
+#include "icm.h"
+#include "mlx4/device.h"
+#include "mlx4/qp.h"
+#include "log2.h"
+
+/* QP to support BF should have bits 6,7 cleared */
+#define MLX4_BF_QP_SKIP_MASK 0xc0
+#define MLX4_MAX_BF_QP_RANGE 0x40
+
+void mlx4_qp_event(struct mlx4_dev *dev, u32 qpn, int event_type)
+{
+ struct mlx4_qp_table *qp_table = &mlx4_priv(dev)->qp_table;
+ struct mlx4_qp *qp;
+
+ spin_lock(&qp_table->lock);
+
+ qp = __mlx4_qp_lookup(dev, qpn);
+ if (qp)
+ atomic_inc(&qp->refcount);
+
+ spin_unlock(&qp_table->lock);
+
+ if (!qp) {
+ mlx4_dbg(dev, "Async event for non-existent QP %08x\n", qpn);
+ return;
+ }
+
+ qp->event(qp, event_type);
+
+ if (atomic_dec_and_test(&qp->refcount))
+ complete(&qp->free);
+}
+
+/* used for INIT/CLOSE port logic */
+static int is_master_qp0(struct mlx4_dev *dev, struct mlx4_qp *qp, int *real_qp0, int *proxy_qp0)
+{
+ /* this procedure is called after we already know we are on the master */
+ /* qp0 is either the proxy qp0, or the real qp0 */
+ u32 pf_proxy_offset = dev->phys_caps.base_proxy_sqpn + 8 * mlx4_master_func_num(dev);
+ *proxy_qp0 = qp->qpn >= pf_proxy_offset && qp->qpn <= pf_proxy_offset + 1;
+
+ *real_qp0 = qp->qpn >= dev->phys_caps.base_sqpn &&
+ qp->qpn <= dev->phys_caps.base_sqpn + 1;
+
+ return *real_qp0 || *proxy_qp0;
+}
+
+static int __mlx4_qp_modify(struct mlx4_dev *dev, struct mlx4_mtt *mtt,
+ enum mlx4_qp_state cur_state, enum mlx4_qp_state new_state,
+ struct mlx4_qp_context *context,
+ enum mlx4_qp_optpar optpar,
+ int sqd_event, struct mlx4_qp *qp, int native)
+{
+ static const u16 op[MLX4_QP_NUM_STATE][MLX4_QP_NUM_STATE] = {
+ [MLX4_QP_STATE_RST] = {
+ [MLX4_QP_STATE_RST] = MLX4_CMD_2RST_QP,
+ [MLX4_QP_STATE_ERR] = MLX4_CMD_2ERR_QP,
+ [MLX4_QP_STATE_INIT] = MLX4_CMD_RST2INIT_QP,
+ },
+ [MLX4_QP_STATE_INIT] = {
+ [MLX4_QP_STATE_RST] = MLX4_CMD_2RST_QP,
+ [MLX4_QP_STATE_ERR] = MLX4_CMD_2ERR_QP,
+ [MLX4_QP_STATE_INIT] = MLX4_CMD_INIT2INIT_QP,
+ [MLX4_QP_STATE_RTR] = MLX4_CMD_INIT2RTR_QP,
+ },
+ [MLX4_QP_STATE_RTR] = {
+ [MLX4_QP_STATE_RST] = MLX4_CMD_2RST_QP,
+ [MLX4_QP_STATE_ERR] = MLX4_CMD_2ERR_QP,
+ [MLX4_QP_STATE_RTS] = MLX4_CMD_RTR2RTS_QP,
+ },
+ [MLX4_QP_STATE_RTS] = {
+ [MLX4_QP_STATE_RST] = MLX4_CMD_2RST_QP,
+ [MLX4_QP_STATE_ERR] = MLX4_CMD_2ERR_QP,
+ [MLX4_QP_STATE_RTS] = MLX4_CMD_RTS2RTS_QP,
+ [MLX4_QP_STATE_SQD] = MLX4_CMD_RTS2SQD_QP,
+ },
+ [MLX4_QP_STATE_SQD] = {
+ [MLX4_QP_STATE_RST] = MLX4_CMD_2RST_QP,
+ [MLX4_QP_STATE_ERR] = MLX4_CMD_2ERR_QP,
+ [MLX4_QP_STATE_RTS] = MLX4_CMD_SQD2RTS_QP,
+ [MLX4_QP_STATE_SQD] = MLX4_CMD_SQD2SQD_QP,
+ },
+ [MLX4_QP_STATE_SQER] = {
+ [MLX4_QP_STATE_RST] = MLX4_CMD_2RST_QP,
+ [MLX4_QP_STATE_ERR] = MLX4_CMD_2ERR_QP,
+ [MLX4_QP_STATE_RTS] = MLX4_CMD_SQERR2RTS_QP,
+ },
+ [MLX4_QP_STATE_ERR] = {
+ [MLX4_QP_STATE_RST] = MLX4_CMD_2RST_QP,
+ [MLX4_QP_STATE_ERR] = MLX4_CMD_2ERR_QP,
+ }
+ };
+
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_cmd_mailbox *mailbox;
+ int ret = 0;
+ int real_qp0 = 0;
+ int proxy_qp0 = 0;
+ u8 port;
+
+ if (cur_state >= MLX4_QP_NUM_STATE || new_state >= MLX4_QP_NUM_STATE ||
+ !op[cur_state][new_state])
+ return -EINVAL;
+
+ if (op[cur_state][new_state] == MLX4_CMD_2RST_QP) {
+ ret = mlx4_cmd(dev, 0, qp->qpn, 2,
+ MLX4_CMD_2RST_QP, MLX4_CMD_TIME_CLASS_A, native);
+ if (mlx4_is_master(dev) && cur_state != MLX4_QP_STATE_ERR &&
+ cur_state != MLX4_QP_STATE_RST &&
+ is_master_qp0(dev, qp, &real_qp0, &proxy_qp0)) {
+ port = (qp->qpn & 1) + 1;
+ if (proxy_qp0)
+ priv->mfunc.master.qp0_state[port].proxy_qp0_active = 0;
+ else
+ priv->mfunc.master.qp0_state[port].qp0_active = 0;
+ }
+ return ret;
+ }
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ if (cur_state == MLX4_QP_STATE_RST && new_state == MLX4_QP_STATE_INIT) {
+ u64 mtt_addr = mlx4_mtt_addr(dev, mtt);
+ context->mtt_base_addr_h = mtt_addr >> 32;
+ context->mtt_base_addr_l = cpu_to_be32(mtt_addr & 0xffffffff);
+ context->log_page_size = mtt->page_shift - MLX4_ICM_PAGE_SHIFT;
+ }
+
+ if ((cur_state == MLX4_QP_STATE_RTR) &&
+ (new_state == MLX4_QP_STATE_RTS) &&
+ ((dev->caps.roce_mode == MLX4_ROCE_MODE_2) ||
+ (dev->caps.roce_mode == MLX4_ROCE_MODE_1_5_PLUS_2) ||
+ (dev->caps.roce_mode == MLX4_ROCE_MODE_1_PLUS_2)) &&
+ !mlx4_is_mfunc(dev))
+ context->roce_entropy = cpu_to_be16(mlx4_qp_roce_entropy(dev, qp->qpn));
+
+ *(__be32 *) mailbox->buf = cpu_to_be32(optpar);
+ memcpy(mailbox->buf + 8, context, sizeof *context);
+
+ ((struct mlx4_qp_context *) (mailbox->buf + 8))->local_qpn =
+ cpu_to_be32(qp->qpn);
+
+ ret = mlx4_cmd(dev, mailbox->dma,
+ qp->qpn | (!!sqd_event << 31),
+ new_state == MLX4_QP_STATE_RST ? 2 : 0,
+ op[cur_state][new_state], MLX4_CMD_TIME_CLASS_C, native);
+
+ if (mlx4_is_master(dev) && is_master_qp0(dev, qp, &real_qp0, &proxy_qp0)) {
+ port = (qp->qpn & 1) + 1;
+ if (cur_state != MLX4_QP_STATE_ERR &&
+ cur_state != MLX4_QP_STATE_RST &&
+ new_state == MLX4_QP_STATE_ERR) {
+ if (proxy_qp0)
+ priv->mfunc.master.qp0_state[port].proxy_qp0_active = 0;
+ else
+ priv->mfunc.master.qp0_state[port].qp0_active = 0;
+ } else if (new_state == MLX4_QP_STATE_RTR) {
+ if (proxy_qp0)
+ priv->mfunc.master.qp0_state[port].proxy_qp0_active = 1;
+ else
+ priv->mfunc.master.qp0_state[port].qp0_active = 1;
+ }
+ }
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return ret;
+}
+
+int mlx4_qp_modify(struct mlx4_dev *dev, struct mlx4_mtt *mtt,
+ enum mlx4_qp_state cur_state, enum mlx4_qp_state new_state,
+ struct mlx4_qp_context *context,
+ enum mlx4_qp_optpar optpar,
+ int sqd_event, struct mlx4_qp *qp)
+{
+ return __mlx4_qp_modify(dev, mtt, cur_state, new_state, context,
+ optpar, sqd_event, qp, 0);
+}
+EXPORT_SYMBOL_GPL(mlx4_qp_modify);
+
+int __mlx4_qp_reserve_range(struct mlx4_dev *dev, int cnt, int align,
+ int *base, u8 flags)
+{
+ u32 uid;
+ int bf_qp = !!(flags & (u8)MLX4_RESERVE_ETH_BF_QP);
+
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_qp_table *qp_table = &priv->qp_table;
+
+ if (cnt > MLX4_MAX_BF_QP_RANGE && bf_qp)
+ return -ENOMEM;
+
+ uid = MLX4_QP_TABLE_ZONE_GENERAL;
+ if (flags & (u8)MLX4_RESERVE_A0_QP) {
+ if (bf_qp)
+ uid = MLX4_QP_TABLE_ZONE_RAW_ETH;
+ else
+ uid = MLX4_QP_TABLE_ZONE_RSS;
+ }
+
+ *base = mlx4_zone_alloc_entries(qp_table->zones, uid, cnt, align,
+ bf_qp ? MLX4_BF_QP_SKIP_MASK : 0, NULL);
+ if (*base == -1)
+ return -ENOMEM;
+
+ return 0;
+}
+
+int mlx4_qp_reserve_range(struct mlx4_dev *dev, int cnt, int align,
+ int *base, u8 flags)
+{
+ u64 in_param = 0;
+ u64 out_param;
+ int err;
+
+ /* Turn off all unsupported QP allocation flags */
+ flags &= dev->caps.alloc_res_qp_mask;
+
+ if (mlx4_is_mfunc(dev)) {
+ set_param_l(&in_param, (((u32)flags) << 24) | (u32)cnt);
+ set_param_h(&in_param, align);
+ err = mlx4_cmd_imm(dev, in_param, &out_param,
+ RES_QP, RES_OP_RESERVE,
+ MLX4_CMD_ALLOC_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+ if (err)
+ return err;
+
+ *base = get_param_l(&out_param);
+ return 0;
+ }
+ return __mlx4_qp_reserve_range(dev, cnt, align, base, flags);
+}
+EXPORT_SYMBOL_GPL(mlx4_qp_reserve_range);
+
+void __mlx4_qp_release_range(struct mlx4_dev *dev, int base_qpn, int cnt)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_qp_table *qp_table = &priv->qp_table;
+
+ if (mlx4_is_qp_reserved(dev, (u32) base_qpn))
+ return;
+ mlx4_zone_free_entries_unique(qp_table->zones, base_qpn, cnt);
+}
+
+void mlx4_qp_release_range(struct mlx4_dev *dev, int base_qpn, int cnt)
+{
+ u64 in_param = 0;
+ int err;
+
+ if (mlx4_is_mfunc(dev)) {
+ set_param_l(&in_param, base_qpn);
+ set_param_h(&in_param, cnt);
+ err = mlx4_cmd(dev, in_param, RES_QP, RES_OP_RESERVE,
+ MLX4_CMD_FREE_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+ if (err) {
+ mlx4_warn(dev, "Failed to release qp range base:%d cnt:%d\n",
+ base_qpn, cnt);
+ }
+ } else
+ __mlx4_qp_release_range(dev, base_qpn, cnt);
+}
+EXPORT_SYMBOL_GPL(mlx4_qp_release_range);
+
+int __mlx4_qp_alloc_icm(struct mlx4_dev *dev, int qpn, gfp_t gfp)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_qp_table *qp_table = &priv->qp_table;
+ int err;
+
+ err = mlx4_table_get(dev, &qp_table->qp_table, qpn, gfp);
+ if (err)
+ goto err_out;
+
+ err = mlx4_table_get(dev, &qp_table->auxc_table, qpn, gfp);
+ if (err)
+ goto err_put_qp;
+
+ err = mlx4_table_get(dev, &qp_table->altc_table, qpn, gfp);
+ if (err)
+ goto err_put_auxc;
+
+ err = mlx4_table_get(dev, &qp_table->rdmarc_table, qpn, gfp);
+ if (err)
+ goto err_put_altc;
+
+ err = mlx4_table_get(dev, &qp_table->cmpt_table, qpn, gfp);
+ if (err)
+ goto err_put_rdmarc;
+
+ return 0;
+
+err_put_rdmarc:
+ mlx4_table_put(dev, &qp_table->rdmarc_table, qpn);
+
+err_put_altc:
+ mlx4_table_put(dev, &qp_table->altc_table, qpn);
+
+err_put_auxc:
+ mlx4_table_put(dev, &qp_table->auxc_table, qpn);
+
+err_put_qp:
+ mlx4_table_put(dev, &qp_table->qp_table, qpn);
+
+err_out:
+ return err;
+}
+
+static int mlx4_qp_alloc_icm(struct mlx4_dev *dev, int qpn, gfp_t gfp)
+{
+ u64 param = 0;
+
+ if (mlx4_is_mfunc(dev)) {
+ set_param_l(&param, qpn);
+ return mlx4_cmd_imm(dev, param, &param, RES_QP, RES_OP_MAP_ICM,
+ MLX4_CMD_ALLOC_RES, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_WRAPPED);
+ }
+ return __mlx4_qp_alloc_icm(dev, qpn, gfp);
+}
+
+void __mlx4_qp_free_icm(struct mlx4_dev *dev, int qpn)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_qp_table *qp_table = &priv->qp_table;
+
+ mlx4_table_put(dev, &qp_table->cmpt_table, qpn);
+ mlx4_table_put(dev, &qp_table->rdmarc_table, qpn);
+ mlx4_table_put(dev, &qp_table->altc_table, qpn);
+ mlx4_table_put(dev, &qp_table->auxc_table, qpn);
+ mlx4_table_put(dev, &qp_table->qp_table, qpn);
+}
+
+static void mlx4_qp_free_icm(struct mlx4_dev *dev, int qpn)
+{
+ u64 in_param = 0;
+
+ if (mlx4_is_mfunc(dev)) {
+ set_param_l(&in_param, qpn);
+ if (mlx4_cmd(dev, in_param, RES_QP, RES_OP_MAP_ICM,
+ MLX4_CMD_FREE_RES, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_WRAPPED))
+ mlx4_warn(dev, "Failed to free icm of qp:%d\n", qpn);
+ } else
+ __mlx4_qp_free_icm(dev, qpn);
+}
+
+int mlx4_qp_alloc(struct mlx4_dev *dev, int qpn, struct mlx4_qp *qp, gfp_t gfp)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_qp_table *qp_table = &priv->qp_table;
+ int err;
+
+ if (!qpn)
+ return -EINVAL;
+
+ qp->qpn = qpn;
+
+ err = mlx4_qp_alloc_icm(dev, qpn, gfp);
+ if (err)
+ return err;
+
+ spin_lock_irq(&qp_table->lock);
+ err = radix_tree_insert(&dev->qp_table_tree, qp->qpn &
+ (dev->caps.num_qps - 1), qp);
+ spin_unlock_irq(&qp_table->lock);
+ if (err)
+ goto err_icm;
+
+ atomic_set(&qp->refcount, 1);
+ init_completion(&qp->free);
+
+ return 0;
+
+err_icm:
+ mlx4_qp_free_icm(dev, qpn);
+ return err;
+}
+
+EXPORT_SYMBOL_GPL(mlx4_qp_alloc);
+
+int mlx4_update_qp(struct mlx4_dev *dev, u32 qpn,
+ enum mlx4_update_qp_attr attr,
+ struct mlx4_update_qp_params *params)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_update_qp_context *cmd;
+ u64 pri_addr_path_mask = 0;
+ u64 qp_mask = 0;
+ int err = 0;
+
+ if (!attr || (attr & ~MLX4_UPDATE_QP_SUPPORTED_ATTRS))
+ return -EINVAL;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ cmd = (struct mlx4_update_qp_context *)mailbox->buf;
+
+ if (attr & MLX4_UPDATE_QP_SMAC) {
+ pri_addr_path_mask |= 1ULL << MLX4_UPD_QP_PATH_MASK_MAC_INDEX;
+ cmd->qp_context.pri_path.grh_mylmc = params->smac_index;
+ }
+
+ if (attr & MLX4_UPDATE_QP_ETH_SRC_CHECK_MC_LB) {
+ if (!(dev->caps.flags2
+ & MLX4_DEV_CAP_FLAG2_UPDATE_QP_SRC_CHECK_LB)) {
+ mlx4_warn(dev,
+ "Trying to set src check LB, but it isn't supported\n");
+ err = -ENOTSUPP;
+ goto out;
+ }
+ pri_addr_path_mask |= 1ULL << MLX4_UPD_QP_PATH_MASK_ETH_SRC_CHECK_MC_LB;
+ if (params->flags &
+ MLX4_UPDATE_QP_PARAMS_FLAGS_ETH_CHECK_MC_LB) {
+ cmd->qp_context.pri_path.fl |= MLX4_FL_ETH_SRC_CHECK_MC_LB;
+ }
+ }
+
+ if (attr & MLX4_UPDATE_QP_VSD) {
+ qp_mask |= 1ULL << MLX4_UPD_QP_MASK_VSD;
+ if (params->flags & MLX4_UPDATE_QP_PARAMS_FLAGS_VSD_ENABLE)
+ cmd->qp_context.param3 |= cpu_to_be32(MLX4_STRIP_VLAN);
+ }
+
+ if (attr & MLX4_UPDATE_QP_QOS_VPORT) {
+ if (!(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_QOS_VPP)) {
+ mlx4_warn(dev, "Granular QoS not supported\n");
+ err = -ENOTSUPP;
+ goto out;
+ }
+
+ qp_mask |= 1ULL << MLX4_UPD_QP_MASK_QOS_VPP;
+ cmd->qp_context.qos_vport = params->qos_vport;
+ }
+
+ cmd->primary_addr_path_mask = cpu_to_be64(pri_addr_path_mask);
+ cmd->qp_mask = cpu_to_be64(qp_mask);
+
+ err = mlx4_cmd(dev, mailbox->dma, qpn & 0xffffff, 0,
+ MLX4_CMD_UPDATE_QP, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+out:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_update_qp);
+
+void mlx4_qp_remove(struct mlx4_dev *dev, struct mlx4_qp *qp)
+{
+ struct mlx4_qp_table *qp_table = &mlx4_priv(dev)->qp_table;
+ unsigned long flags;
+
+ spin_lock_irqsave(&qp_table->lock, flags);
+ radix_tree_delete(&dev->qp_table_tree, qp->qpn & (dev->caps.num_qps - 1));
+ spin_unlock_irqrestore(&qp_table->lock, flags);
+}
+EXPORT_SYMBOL_GPL(mlx4_qp_remove);
+
+void mlx4_qp_free(struct mlx4_dev *dev, struct mlx4_qp *qp)
+{
+ if (atomic_dec_and_test(&qp->refcount))
+ complete(&qp->free);
+ wait_for_completion(&qp->free);
+
+ mlx4_qp_free_icm(dev, qp->qpn);
+}
+EXPORT_SYMBOL_GPL(mlx4_qp_free);
+
+static int mlx4_CONF_SPECIAL_QP(struct mlx4_dev *dev, u32 base_qpn)
+{
+ return mlx4_cmd(dev, 0, base_qpn, 0, MLX4_CMD_CONF_SPECIAL_QP,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_NATIVE);
+}
+
+#define MLX4_QP_TABLE_RSS_ETH_PRIORITY 2
+#define MLX4_QP_TABLE_RAW_ETH_PRIORITY 1
+#define MLX4_QP_TABLE_RAW_ETH_SIZE 256
+
+static int mlx4_create_zones(struct mlx4_dev *dev,
+ u32 reserved_bottom_general,
+ u32 reserved_top_general,
+ u32 reserved_bottom_rss,
+ u32 start_offset_rss,
+ u32 max_table_offset)
+{
+ struct mlx4_qp_table *qp_table = &mlx4_priv(dev)->qp_table;
+ struct mlx4_bitmap (*bitmap)[MLX4_QP_TABLE_ZONE_NUM] = NULL;
+ int bitmap_initialized = 0;
+ u32 last_offset;
+ int k;
+ int err;
+
+ qp_table->zones = mlx4_zone_allocator_create(MLX4_ZONE_ALLOC_FLAGS_NO_OVERLAP);
+
+ if (NULL == qp_table->zones)
+ return -ENOMEM;
+
+ bitmap = kmalloc(sizeof(*bitmap), GFP_KERNEL);
+
+ if (NULL == bitmap) {
+ err = -ENOMEM;
+ goto free_zone;
+ }
+
+ err = mlx4_bitmap_init(*bitmap + MLX4_QP_TABLE_ZONE_GENERAL, dev->caps.num_qps,
+ (1 << 23) - 1, reserved_bottom_general,
+ reserved_top_general);
+
+ if (err)
+ goto free_bitmap;
+
+ ++bitmap_initialized;
+
+ err = mlx4_zone_add_one(qp_table->zones, *bitmap + MLX4_QP_TABLE_ZONE_GENERAL,
+ MLX4_ZONE_FALLBACK_TO_HIGHER_PRIO |
+ MLX4_ZONE_USE_RR, 0,
+ 0, qp_table->zones_uids + MLX4_QP_TABLE_ZONE_GENERAL);
+
+ if (err)
+ goto free_bitmap;
+
+ err = mlx4_bitmap_init(*bitmap + MLX4_QP_TABLE_ZONE_RSS,
+ reserved_bottom_rss,
+ reserved_bottom_rss - 1,
+ dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW],
+ reserved_bottom_rss - start_offset_rss);
+
+ if (err)
+ goto free_bitmap;
+
+ ++bitmap_initialized;
+
+ err = mlx4_zone_add_one(qp_table->zones, *bitmap + MLX4_QP_TABLE_ZONE_RSS,
+ MLX4_ZONE_ALLOW_ALLOC_FROM_LOWER_PRIO |
+ MLX4_ZONE_ALLOW_ALLOC_FROM_EQ_PRIO |
+ MLX4_ZONE_USE_RR, MLX4_QP_TABLE_RSS_ETH_PRIORITY,
+ 0, qp_table->zones_uids + MLX4_QP_TABLE_ZONE_RSS);
+
+ if (err)
+ goto free_bitmap;
+
+ last_offset = dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW];
+ /* We have a single zone for the A0 steering QPs area of the FW. This area
+ * needs to be split into subareas. One set of subareas is for RSS QPs
+ * (in which qp number bits 6 and/or 7 are set); the other set of subareas
+ * is for RAW_ETH QPs, which require that both bits 6 and 7 are zero.
+ * Currently, the values returned by the FW (A0 steering area starting qp number
+ * and A0 steering area size) are such that there are only two subareas -- one
+ * for RSS and one for RAW_ETH.
+ */
+ for (k = MLX4_QP_TABLE_ZONE_RSS + 1; k < sizeof(*bitmap)/sizeof((*bitmap)[0]);
+ k++) {
+ int size;
+ u32 offset = start_offset_rss;
+ u32 bf_mask;
+ u32 requested_size;
+
+ /* Assuming MLX4_BF_QP_SKIP_MASK is consecutive ones, this calculates
+ * a mask of all LSB bits set until (and not including) the first
+ * set bit of MLX4_BF_QP_SKIP_MASK. For example, if MLX4_BF_QP_SKIP_MASK
+ * is 0xc0, bf_mask will be 0x3f.
+ */
+ bf_mask = (MLX4_BF_QP_SKIP_MASK & ~(MLX4_BF_QP_SKIP_MASK - 1)) - 1;
+ requested_size = min((u32)MLX4_QP_TABLE_RAW_ETH_SIZE, bf_mask + 1);
+
+ if (((last_offset & MLX4_BF_QP_SKIP_MASK) &&
+ ((int)(max_table_offset - last_offset)) >=
+ roundup_pow_of_two(MLX4_BF_QP_SKIP_MASK)) ||
+ (!(last_offset & MLX4_BF_QP_SKIP_MASK) &&
+ !((last_offset + requested_size - 1) &
+ MLX4_BF_QP_SKIP_MASK)))
+ size = requested_size;
+ else {
+ u32 candidate_offset =
+ (last_offset | MLX4_BF_QP_SKIP_MASK | bf_mask) + 1;
+
+ if (last_offset & MLX4_BF_QP_SKIP_MASK)
+ last_offset = candidate_offset;
+
+ /* From this point, the BF bits are 0 */
+
+ if (last_offset > max_table_offset) {
+ /* need to skip */
+ size = -1;
+ } else {
+ size = min3(max_table_offset - last_offset,
+ bf_mask - (last_offset & bf_mask),
+ requested_size);
+ if (size < requested_size) {
+ int candidate_size;
+
+ candidate_size = min3(
+ max_table_offset - candidate_offset,
+ bf_mask - (last_offset & bf_mask),
+ requested_size);
+
+ /* We will not take this path if last_offset was
+ * already set above to candidate_offset
+ */
+ if (candidate_size > size) {
+ last_offset = candidate_offset;
+ size = candidate_size;
+ }
+ }
+ }
+ }
+
+ if (size > 0) {
+ /* mlx4_bitmap_alloc_range will find a contiguous range of "size"
+ * QPs in which both bits 6 and 7 are zero, because we pass it
+ * MLX4_BF_QP_SKIP_MASK.
+ */
+ offset = mlx4_bitmap_alloc_range(
+ *bitmap + MLX4_QP_TABLE_ZONE_RSS,
+ size, 1,
+ MLX4_BF_QP_SKIP_MASK);
+
+ if (offset == (u32)-1) {
+ err = -ENOMEM;
+ break;
+ }
+
+ last_offset = offset + size;
+
+ err = mlx4_bitmap_init(*bitmap + k, roundup_pow_of_two(size),
+ roundup_pow_of_two(size) - 1, 0,
+ roundup_pow_of_two(size) - size);
+ } else {
+ /* Add an empty bitmap, we'll allocate from different zones (since
+ * at least one is reserved)
+ */
+ err = mlx4_bitmap_init(*bitmap + k, 1,
+ MLX4_QP_TABLE_RAW_ETH_SIZE - 1, 0,
+ 0);
+ mlx4_bitmap_alloc_range(*bitmap + k, 1, 1, 0);
+ }
+
+ if (err)
+ break;
+
+ ++bitmap_initialized;
+
+ err = mlx4_zone_add_one(qp_table->zones, *bitmap + k,
+ MLX4_ZONE_ALLOW_ALLOC_FROM_LOWER_PRIO |
+ MLX4_ZONE_ALLOW_ALLOC_FROM_EQ_PRIO |
+ MLX4_ZONE_USE_RR, MLX4_QP_TABLE_RAW_ETH_PRIORITY,
+ offset, qp_table->zones_uids + k);
+
+ if (err)
+ break;
+ }
+
+ if (err)
+ goto free_bitmap;
+
+ qp_table->bitmap_gen = *bitmap;
+
+ return err;
+
+free_bitmap:
+ for (k = 0; k < bitmap_initialized; k++)
+ mlx4_bitmap_cleanup(*bitmap + k);
+ kfree(bitmap);
+free_zone:
+ mlx4_zone_allocator_destroy(qp_table->zones);
+ return err;
+}
+
+static void mlx4_cleanup_qp_zones(struct mlx4_dev *dev)
+{
+ struct mlx4_qp_table *qp_table = &mlx4_priv(dev)->qp_table;
+
+ if (qp_table->zones) {
+ int i;
+
+ for (i = 0;
+ i < sizeof(qp_table->zones_uids)/sizeof(qp_table->zones_uids[0]);
+ i++) {
+ struct mlx4_bitmap *bitmap =
+ mlx4_zone_get_bitmap(qp_table->zones,
+ qp_table->zones_uids[i]);
+
+ mlx4_zone_remove_one(qp_table->zones, qp_table->zones_uids[i]);
+ if (NULL == bitmap)
+ continue;
+
+ mlx4_bitmap_cleanup(bitmap);
+ }
+ mlx4_zone_allocator_destroy(qp_table->zones);
+ kfree(qp_table->bitmap_gen);
+ qp_table->bitmap_gen = NULL;
+ qp_table->zones = NULL;
+ }
+}
+
+int mlx4_init_qp_table(struct mlx4_dev *dev)
+{
+ struct mlx4_qp_table *qp_table = &mlx4_priv(dev)->qp_table;
+ int err;
+ int reserved_from_top = 0;
+ int reserved_from_bot;
+ int k;
+ int fixed_reserved_from_bot_rv = 0;
+ int bottom_reserved_for_rss_bitmap;
+ u32 max_table_offset = dev->caps.dmfs_high_rate_qpn_base +
+ dev->caps.dmfs_high_rate_qpn_range;
+
+ spin_lock_init(&qp_table->lock);
+ INIT_RADIX_TREE(&dev->qp_table_tree, GFP_ATOMIC);
+ if (mlx4_is_slave(dev))
+ return 0;
+
+ /* We reserve 2 extra QPs per port for the special QPs. The
+ * block of special QPs must be aligned to a multiple of 8, so
+ * round up.
+ *
+ * We also reserve the MSB of the 24-bit QP number to indicate
+ * that a QP is an XRC QP.
+ */
+ for (k = 0; k <= MLX4_QP_REGION_BOTTOM; k++)
+ fixed_reserved_from_bot_rv += dev->caps.reserved_qps_cnt[k];
+
+ if (fixed_reserved_from_bot_rv < max_table_offset)
+ fixed_reserved_from_bot_rv = max_table_offset;
+
+ /* We reserve at least 1 extra for bitmaps that we don't have enough space for */
+ bottom_reserved_for_rss_bitmap =
+ roundup_pow_of_two(fixed_reserved_from_bot_rv + 1);
+ dev->phys_caps.base_sqpn = ALIGN(bottom_reserved_for_rss_bitmap, 8);
+
+ {
+ int sort[MLX4_NUM_QP_REGION];
+ int i, j, tmp;
+ int last_base = dev->caps.num_qps;
+
+ for (i = 1; i < MLX4_NUM_QP_REGION; ++i)
+ sort[i] = i;
+
+ for (i = MLX4_NUM_QP_REGION; i > MLX4_QP_REGION_BOTTOM; --i) {
+ for (j = MLX4_QP_REGION_BOTTOM + 2; j < i; ++j) {
+ if (dev->caps.reserved_qps_cnt[sort[j]] >
+ dev->caps.reserved_qps_cnt[sort[j - 1]]) {
+ tmp = sort[j];
+ sort[j] = sort[j - 1];
+ sort[j - 1] = tmp;
+ }
+ }
+ }
+
+ for (i = MLX4_QP_REGION_BOTTOM + 1; i < MLX4_NUM_QP_REGION; ++i) {
+ last_base -= dev->caps.reserved_qps_cnt[sort[i]];
+ dev->caps.reserved_qps_base[sort[i]] = last_base;
+ reserved_from_top +=
+ dev->caps.reserved_qps_cnt[sort[i]];
+ }
+ }
+
+ /* Reserve 8 real SQPs in both native and SRIOV modes.
+ * In addition, in SRIOV mode, reserve 8 proxy SQPs per function
+ * (for all PFs and VFs), and 8 corresponding tunnel QPs.
+ * Each proxy SQP works opposite its own tunnel QP.
+ *
+ * The QPs are arranged as follows:
+ * a. 8 real SQPs
+ * b. All the proxy SQPs (8 per function)
+ * c. All the tunnel QPs (8 per function)
+ */
+ reserved_from_bot = mlx4_num_reserved_sqps(dev);
+ if (reserved_from_bot + reserved_from_top > dev->caps.num_qps) {
+ mlx4_err(dev, "Number of reserved QPs is higher than number of QPs\n");
+ return -EINVAL;
+ }
+
+ err = mlx4_create_zones(dev, reserved_from_bot, reserved_from_bot,
+ bottom_reserved_for_rss_bitmap,
+ fixed_reserved_from_bot_rv,
+ max_table_offset);
+
+ if (err)
+ return err;
+
+ if (mlx4_is_mfunc(dev)) {
+ /* for PPF use */
+ dev->phys_caps.base_proxy_sqpn = dev->phys_caps.base_sqpn + 8;
+ dev->phys_caps.base_tunnel_sqpn = dev->phys_caps.base_sqpn + 8 + 8 * MLX4_MFUNC_MAX;
+
+ /* In mfunc, calculate proxy and tunnel qp offsets for the PF here,
+ * since the PF does not call mlx4_slave_caps */
+ dev->caps.qp0_tunnel = kcalloc(dev->caps.num_ports, sizeof (u32), GFP_KERNEL);
+ dev->caps.qp0_proxy = kcalloc(dev->caps.num_ports, sizeof (u32), GFP_KERNEL);
+ dev->caps.qp1_tunnel = kcalloc(dev->caps.num_ports, sizeof (u32), GFP_KERNEL);
+ dev->caps.qp1_proxy = kcalloc(dev->caps.num_ports, sizeof (u32), GFP_KERNEL);
+
+ if (!dev->caps.qp0_tunnel || !dev->caps.qp0_proxy ||
+ !dev->caps.qp1_tunnel || !dev->caps.qp1_proxy) {
+ err = -ENOMEM;
+ goto err_mem;
+ }
+
+ for (k = 0; k < dev->caps.num_ports; k++) {
+ dev->caps.qp0_proxy[k] = dev->phys_caps.base_proxy_sqpn +
+ 8 * mlx4_master_func_num(dev) + k;
+ dev->caps.qp0_tunnel[k] = dev->caps.qp0_proxy[k] + 8 * MLX4_MFUNC_MAX;
+ dev->caps.qp1_proxy[k] = dev->phys_caps.base_proxy_sqpn +
+ 8 * mlx4_master_func_num(dev) + MLX4_MAX_PORTS + k;
+ dev->caps.qp1_tunnel[k] = dev->caps.qp1_proxy[k] + 8 * MLX4_MFUNC_MAX;
+ }
+ }
+
+
+ err = mlx4_CONF_SPECIAL_QP(dev, dev->phys_caps.base_sqpn);
+ if (err)
+ goto err_mem;
+
+ return err;
+
+err_mem:
+ kfree(dev->caps.qp0_tunnel);
+ kfree(dev->caps.qp0_proxy);
+ kfree(dev->caps.qp1_tunnel);
+ kfree(dev->caps.qp1_proxy);
+ dev->caps.qp0_tunnel = dev->caps.qp0_proxy =
+ dev->caps.qp1_tunnel = dev->caps.qp1_proxy = NULL;
+ mlx4_cleanup_qp_zones(dev);
+ return err;
+}
+
+void mlx4_cleanup_qp_table(struct mlx4_dev *dev)
+{
+ if (mlx4_is_slave(dev))
+ return;
+
+ mlx4_CONF_SPECIAL_QP(dev, 0);
+
+ mlx4_cleanup_qp_zones(dev);
+}
+
+int mlx4_qp_query(struct mlx4_dev *dev, struct mlx4_qp *qp,
+ struct mlx4_qp_context *context, int native_or_wrapped)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ int err;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ err = mlx4_cmd_box(dev, 0, mailbox->dma, qp->qpn, 0,
+ MLX4_CMD_QUERY_QP, MLX4_CMD_TIME_CLASS_A,
+ native_or_wrapped);
+ if (!err)
+ memcpy(context, mailbox->buf + 8, sizeof *context);
+
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_qp_query);
+
+int mlx4_qp_to_ready(struct mlx4_dev *dev, struct mlx4_mtt *mtt,
+ struct mlx4_qp_context *context,
+ struct mlx4_qp *qp, enum mlx4_qp_state *qp_state)
+{
+ int err;
+ int i;
+ enum mlx4_qp_state states[] = {
+ MLX4_QP_STATE_RST,
+ MLX4_QP_STATE_INIT,
+ MLX4_QP_STATE_RTR,
+ MLX4_QP_STATE_RTS
+ };
+
+ for (i = 0; i < ARRAY_SIZE(states) - 1; i++) {
+ context->flags &= cpu_to_be32(~(0xf << 28));
+ context->flags |= cpu_to_be32(states[i + 1] << 28);
+ if (states[i + 1] != MLX4_QP_STATE_RTR)
+ context->params2 &= ~MLX4_QP_BIT_FPP;
+ err = mlx4_qp_modify(dev, mtt, states[i], states[i + 1],
+ context, 0, 0, qp);
+ if (err) {
+ mlx4_err(dev, "Failed to bring QP to state: %d with error: %d\n",
+ states[i + 1], err);
+ return err;
+ }
+
+ *qp_state = states[i + 1];
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx4_qp_to_ready);
+
+u32 mlx4_qp_roce_entropy(struct mlx4_dev *dev, u32 qpn)
+{
+ struct mlx4_qp_context context;
+ struct mlx4_qp qp;
+ int err;
+
+ qp.qpn = qpn;
+ err = mlx4_qp_query(dev, &qp, &context, MLX4_CMD_NATIVE);
+ if (!err) {
+ u32 dest_qpn = be32_to_cpu(context.remote_qpn) & 0xffffff;
+ u16 folded_dst = folded_qp(dest_qpn);
+ u16 folded_src = folded_qp(qpn);
+
+ return (dest_qpn != qpn) ? ((folded_dst ^ folded_src) | 0xC000) :
+ folded_src | 0xC000;
+ }
+ return 0xdead;
+}
+EXPORT_SYMBOL_GPL(mlx4_qp_roce_entropy);
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/reset.c b/drivers/net/mlnx_uio/mlnx/mlx4/reset.c
new file mode 100644
index 0000000..e8bdb4a
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/reset.c
@@ -0,0 +1,202 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved.
+ * Copyright (c) 2007, 2008 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+
+#include "mlx4.h"
+
+int mlx4_reset(struct mlx4_dev *dev)
+{
+ void __iomem *reset;
+ u32 *hca_header = NULL;
+ int pcie_cap;
+ u16 devctl;
+ u16 linkctl;
+ u16 vendor;
+ unsigned long end;
+ u32 sem;
+ int i;
+ int err = 0;
+
+#define MLX4_RESET_BASE 0xf0000
+#define MLX4_RESET_SIZE 0x400
+#define MLX4_SEM_OFFSET 0x3fc
+#define MLX4_RESET_OFFSET 0x10
+#define MLX4_RESET_VALUE swab32(1)
+
+#define MLX4_SEM_TIMEOUT_JIFFIES (10 * HZ)
+#define MLX4_RESET_TIMEOUT_JIFFIES (2 * HZ)
+
+ /*
+ * Reset the chip. This is somewhat ugly because we have to
+ * save off the PCI header before reset and then restore it
+ * after the chip reboots. We skip config space offsets 22
+ * and 23 since those have a special meaning.
+ */
+
+ /* Do we need to save off the full 4K PCI Express header?? */
+ hca_header = kmalloc(256, GFP_KERNEL);
+ if (!hca_header) {
+ err = -ENOMEM;
+ mlx4_err(dev, "Couldn't allocate memory to save HCA PCI header, aborting\n");
+ goto out;
+ }
+
+#ifdef KMOD_REMOVED
+ pcie_cap = pci_pcie_cap(dev->persist->pdev);
+
+ for (i = 0; i < 64; ++i) {
+ if (i == 22 || i == 23)
+ continue;
+ if (pci_read_config_dword(dev->persist->pdev, i * 4,
+ hca_header + i)) {
+ err = -ENODEV;
+ mlx4_err(dev, "Couldn't save HCA PCI header, aborting\n");
+ goto out;
+ }
+ }
+#endif
+
+#ifdef KMOD_MODIFIED
+ assert((MLX4_RESET_BASE+MLX4_RESET_SIZE) <= dev->persist->rte_pdev->mem_resource[0].len);
+ reset = RTE_PTR_ADD(dev->persist->rte_pdev->mem_resource[0].addr, MLX4_RESET_BASE);
+#else
+ reset = ioremap(pci_resource_start(dev->persist->pdev, 0) +
+ MLX4_RESET_BASE,
+ MLX4_RESET_SIZE);
+#endif
+ if (!reset) {
+ err = -ENOMEM;
+ mlx4_err(dev, "Couldn't map HCA reset register, aborting\n");
+ goto out;
+ }
+
+ /* grab HW semaphore to lock out flash updates */
+ end = jiffies + MLX4_SEM_TIMEOUT_JIFFIES;
+ do {
+ sem = readl(reset + MLX4_SEM_OFFSET);
+ if (!sem)
+ break;
+
+ msleep(1);
+ } while (time_before(jiffies, end));
+
+ if (sem) {
+ mlx4_err(dev, "Failed to obtain HW semaphore, aborting\n");
+ err = -EAGAIN;
+#ifdef KMOD_REMOVED
+ iounmap(reset);
+#endif
+ goto out;
+ }
+
+ /* actually hit reset */
+ writel(MLX4_RESET_VALUE, reset + MLX4_RESET_OFFSET);
+#ifdef KMOD_REMOVED
+ iounmap(reset);
+#endif
+
+ /* Docs say to wait one second before accessing device */
+#ifdef KMOD_MODIFIED
+ /* We do not check PCI config space here, so sleep longer for safety */
+ msleep(5000);
+#else
+ msleep(1000);
+#endif
+
+ end = jiffies + MLX4_RESET_TIMEOUT_JIFFIES;
+#ifdef KMOD_REMOVED
+ do {
+ if (!pci_read_config_word(dev->persist->pdev, PCI_VENDOR_ID,
+ &vendor) && vendor != 0xffff)
+ break;
+
+ msleep(1);
+ } while (time_before(jiffies, end));
+
+
+ if (vendor == 0xffff) {
+ err = -ENODEV;
+ mlx4_err(dev, "PCI device did not come back after reset, aborting\n");
+ goto out;
+ }
+
+ /* Now restore the PCI headers */
+ if (pcie_cap) {
+ devctl = hca_header[(pcie_cap + PCI_EXP_DEVCTL) / 4];
+ if (pcie_capability_write_word(dev->persist->pdev,
+ PCI_EXP_DEVCTL,
+ devctl)) {
+ err = -ENODEV;
+ mlx4_err(dev, "Couldn't restore HCA PCI Express Device Control register, aborting\n");
+ goto out;
+ }
+ linkctl = hca_header[(pcie_cap + PCI_EXP_LNKCTL) / 4];
+ if (pcie_capability_write_word(dev->persist->pdev,
+ PCI_EXP_LNKCTL,
+ linkctl)) {
+ err = -ENODEV;
+ mlx4_err(dev, "Couldn't restore HCA PCI Express Link control register, aborting\n");
+ goto out;
+ }
+ }
+
+ for (i = 0; i < 16; ++i) {
+ if (i * 4 == PCI_COMMAND)
+ continue;
+
+ if (pci_write_config_dword(dev->persist->pdev, i * 4,
+ hca_header[i])) {
+ err = -ENODEV;
+ mlx4_err(dev, "Couldn't restore HCA reg %x, aborting\n",
+ i);
+ goto out;
+ }
+ }
+
+ if (pci_write_config_dword(dev->persist->pdev, PCI_COMMAND,
+ hca_header[PCI_COMMAND / 4])) {
+ err = -ENODEV;
+ mlx4_err(dev, "Couldn't restore HCA COMMAND, aborting\n");
+ goto out;
+ }
+#endif
+
+out:
+ kfree(hca_header);
+
+ return err;
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/resource_tracker.c b/drivers/net/mlnx_uio/mlnx/mlx4/resource_tracker.c
new file mode 100644
index 0000000..2b09623
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/resource_tracker.c
@@ -0,0 +1,5052 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved.
+ * Copyright (c) 2005, 2006, 2007, 2008 Mellanox Technologies.
+ * All rights reserved.
+ * Copyright (c) 2005, 2006, 2007 Cisco Systems, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx4/qp.h"
+#include "log2.h"
+
+#include "mlx4.h"
+#include "fw.h"
+
+#define MLX4_MAC_VALID (1ull << 63)
+
+struct mac_res {
+ struct list_head list;
+ u64 mac;
+ int ref_count;
+ u8 smac_index;
+ u8 port;
+};
+
+struct vlan_res {
+ struct list_head list;
+ u16 vlan;
+ int ref_count;
+ int vlan_index;
+ u8 port;
+};
+
+struct res_common {
+ struct list_head list;
+ struct rb_node node;
+ u64 res_id;
+ int owner;
+ int state;
+ int from_state;
+ int to_state;
+ int removing;
+ const char *func_name;
+};
+
+enum {
+ RES_ANY_BUSY = 1
+};
+
+struct res_gid {
+ struct list_head list;
+ u8 gid[16];
+ enum mlx4_protocol prot;
+ enum mlx4_steer_type steer;
+ u64 reg_id;
+};
+
+enum res_qp_states {
+ RES_QP_BUSY = RES_ANY_BUSY,
+
+ /* QP number was allocated */
+ RES_QP_RESERVED,
+
+ /* ICM memory for QP context was mapped */
+ RES_QP_MAPPED,
+
+ /* QP is in hw ownership */
+ RES_QP_HW
+};
+
+struct res_qp {
+ struct res_common com;
+ struct res_mtt *mtt;
+ struct res_cq *rcq;
+ struct res_cq *scq;
+ struct res_srq *srq;
+ struct list_head mcg_list;
+ spinlock_t mcg_spl;
+ int local_qpn;
+ atomic_t ref_count;
+ u32 qpc_flags;
+ /* saved qp params before VST enforcement in order to restore on VGT */
+ u8 sched_queue;
+ __be32 param3;
+ u8 vlan_control;
+ u8 fvl_rx;
+ u8 pri_path_fl;
+ u8 vlan_index;
+ u8 feup;
+};
+
+enum res_mtt_states {
+ RES_MTT_BUSY = RES_ANY_BUSY,
+ RES_MTT_ALLOCATED,
+};
+
+static inline const char *mtt_states_str(enum res_mtt_states state)
+{
+ switch (state) {
+ case RES_MTT_BUSY: return "RES_MTT_BUSY";
+ case RES_MTT_ALLOCATED: return "RES_MTT_ALLOCATED";
+ default: return "Unknown";
+ }
+}
+
+struct res_mtt {
+ struct res_common com;
+ int order;
+ atomic_t ref_count;
+};
+
+enum res_mpt_states {
+ RES_MPT_BUSY = RES_ANY_BUSY,
+ RES_MPT_RESERVED,
+ RES_MPT_MAPPED,
+ RES_MPT_HW,
+};
+
+struct res_mpt {
+ struct res_common com;
+ struct res_mtt *mtt;
+ int key;
+};
+
+enum res_eq_states {
+ RES_EQ_BUSY = RES_ANY_BUSY,
+ RES_EQ_RESERVED,
+ RES_EQ_HW,
+};
+
+struct res_eq {
+ struct res_common com;
+ struct res_mtt *mtt;
+};
+
+enum res_cq_states {
+ RES_CQ_BUSY = RES_ANY_BUSY,
+ RES_CQ_ALLOCATED,
+ RES_CQ_HW,
+};
+
+struct res_cq {
+ struct res_common com;
+ struct res_mtt *mtt;
+ atomic_t ref_count;
+};
+
+enum res_srq_states {
+ RES_SRQ_BUSY = RES_ANY_BUSY,
+ RES_SRQ_ALLOCATED,
+ RES_SRQ_HW,
+};
+
+struct res_srq {
+ struct res_common com;
+ struct res_mtt *mtt;
+ struct res_cq *cq;
+ atomic_t ref_count;
+};
+
+enum res_counter_states {
+ RES_COUNTER_BUSY = RES_ANY_BUSY,
+ RES_COUNTER_ALLOCATED,
+};
+
+struct res_counter {
+ struct res_common com;
+ int port;
+};
+
+enum res_xrcdn_states {
+ RES_XRCD_BUSY = RES_ANY_BUSY,
+ RES_XRCD_ALLOCATED,
+};
+
+struct res_xrcdn {
+ struct res_common com;
+ int port;
+};
+
+enum res_fs_rule_states {
+ RES_FS_RULE_BUSY = RES_ANY_BUSY,
+ RES_FS_RULE_ALLOCATED,
+};
+
+struct res_fs_rule {
+ struct res_common com;
+ int qpn;
+};
+
+static void *res_tracker_lookup(struct rb_root *root, u64 res_id)
+{
+ struct rb_node *node = root->rb_node;
+
+ while (node) {
+ struct res_common *res = container_of(node, struct res_common,
+ node);
+
+ if (res_id < res->res_id)
+ node = node->rb_left;
+ else if (res_id > res->res_id)
+ node = node->rb_right;
+ else
+ return res;
+ }
+ return NULL;
+}
+
+static int res_tracker_insert(struct rb_root *root, struct res_common *res)
+{
+ struct rb_node **new = &(root->rb_node), *parent = NULL;
+
+ /* Figure out where to put new node */
+ while (*new) {
+ struct res_common *this = container_of(*new, struct res_common,
+ node);
+
+ parent = *new;
+ if (res->res_id < this->res_id)
+ new = &((*new)->rb_left);
+ else if (res->res_id > this->res_id)
+ new = &((*new)->rb_right);
+ else
+ return -EEXIST;
+ }
+
+ /* Add new node and rebalance tree. */
+ rb_link_node(&res->node, parent, new);
+ rb_insert_color(&res->node, root);
+
+ return 0;
+}
+
+enum qp_transition {
+ QP_TRANS_INIT2RTR,
+ QP_TRANS_RTR2RTS,
+ QP_TRANS_RTS2RTS,
+ QP_TRANS_SQERR2RTS,
+ QP_TRANS_SQD2SQD,
+ QP_TRANS_SQD2RTS
+};
+
+/* For Debug uses */
+static const char *resource_str(enum mlx4_resource rt)
+{
+ switch (rt) {
+ case RES_QP: return "RES_QP";
+ case RES_CQ: return "RES_CQ";
+ case RES_SRQ: return "RES_SRQ";
+ case RES_MPT: return "RES_MPT";
+ case RES_MTT: return "RES_MTT";
+ case RES_MAC: return "RES_MAC";
+ case RES_VLAN: return "RES_VLAN";
+ case RES_EQ: return "RES_EQ";
+ case RES_COUNTER: return "RES_COUNTER";
+ case RES_FS_RULE: return "RES_FS_RULE";
+ case RES_XRCD: return "RES_XRCD";
+ default: return "Unknown resource type !!!";
+ };
+}
+
+static void rem_slave_vlans(struct mlx4_dev *dev, int slave);
+static inline int mlx4_grant_resource(struct mlx4_dev *dev, int slave,
+ enum mlx4_resource res_type, int count,
+ int port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct resource_allocator *res_alloc =
+ &priv->mfunc.master.res_tracker.res_alloc[res_type];
+ int err = -EINVAL;
+ int allocated, free, reserved, guaranteed, from_free;
+ int from_rsvd;
+
+ if (slave > dev->persist->num_vfs)
+ return -EINVAL;
+
+ spin_lock(&res_alloc->alloc_lock);
+ allocated = (port > 0) ?
+ res_alloc->allocated[(port - 1) *
+ (dev->persist->num_vfs + 1) + slave] :
+ res_alloc->allocated[slave];
+ free = (port > 0) ? res_alloc->res_port_free[port - 1] :
+ res_alloc->res_free;
+ reserved = (port > 0) ? res_alloc->res_port_rsvd[port - 1] :
+ res_alloc->res_reserved;
+ guaranteed = res_alloc->guaranteed[slave];
+
+ if (allocated + count > res_alloc->quota[slave]) {
+ mlx4_warn(dev, "VF %d port %d res %s: quota exceeded, count %d alloc %d quota %d\n",
+ slave, port, resource_str(res_type), count,
+ allocated, res_alloc->quota[slave]);
+ goto out;
+ }
+
+ if (allocated + count <= guaranteed) {
+ err = 0;
+ from_rsvd = count;
+ } else {
+ /* portion may need to be obtained from free area */
+ if (guaranteed - allocated > 0)
+ from_free = count - (guaranteed - allocated);
+ else
+ from_free = count;
+
+ from_rsvd = count - from_free;
+
+ if (free - from_free >= reserved)
+ err = 0;
+ else
+ mlx4_warn(dev, "VF %d port %d res %s: free pool empty, free %d from_free %d rsvd %d\n",
+ slave, port, resource_str(res_type), free,
+ from_free, reserved);
+ }
+
+ if (!err) {
+ /* grant the request */
+ if (port > 0) {
+ res_alloc->allocated[(port - 1) *
+ (dev->persist->num_vfs + 1) + slave] += count;
+ res_alloc->res_port_free[port - 1] -= count;
+ res_alloc->res_port_rsvd[port - 1] -= from_rsvd;
+ } else {
+ res_alloc->allocated[slave] += count;
+ res_alloc->res_free -= count;
+ res_alloc->res_reserved -= from_rsvd;
+ }
+ }
+
+out:
+ spin_unlock(&res_alloc->alloc_lock);
+ return err;
+}
+
+static inline void mlx4_release_resource(struct mlx4_dev *dev, int slave,
+ enum mlx4_resource res_type, int count,
+ int port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct resource_allocator *res_alloc =
+ &priv->mfunc.master.res_tracker.res_alloc[res_type];
+ int allocated, guaranteed, from_rsvd;
+
+ if (slave > dev->persist->num_vfs)
+ return;
+
+ spin_lock(&res_alloc->alloc_lock);
+
+ allocated = (port > 0) ?
+ res_alloc->allocated[(port - 1) *
+ (dev->persist->num_vfs + 1) + slave] :
+ res_alloc->allocated[slave];
+ guaranteed = res_alloc->guaranteed[slave];
+
+ if (allocated - count >= guaranteed) {
+ from_rsvd = 0;
+ } else {
+ /* portion may need to be returned to reserved area */
+ if (allocated - guaranteed > 0)
+ from_rsvd = count - (allocated - guaranteed);
+ else
+ from_rsvd = count;
+ }
+
+ if (port > 0) {
+ res_alloc->allocated[(port - 1) *
+ (dev->persist->num_vfs + 1) + slave] -= count;
+ res_alloc->res_port_free[port - 1] += count;
+ res_alloc->res_port_rsvd[port - 1] += from_rsvd;
+ } else {
+ res_alloc->allocated[slave] -= count;
+ res_alloc->res_free += count;
+ res_alloc->res_reserved += from_rsvd;
+ }
+
+ spin_unlock(&res_alloc->alloc_lock);
+ return;
+}
+
+static inline void initialize_res_quotas(struct mlx4_dev *dev,
+ struct resource_allocator *res_alloc,
+ enum mlx4_resource res_type,
+ int vf, int num_instances)
+{
+ res_alloc->guaranteed[vf] = num_instances /
+ (2 * (dev->persist->num_vfs + 1));
+ res_alloc->quota[vf] = (num_instances / 2) + res_alloc->guaranteed[vf];
+ if (vf == mlx4_master_func_num(dev)) {
+ res_alloc->res_free = num_instances;
+ if (res_type == RES_MTT) {
+ /* reserved mtts will be taken out of the PF allocation */
+ res_alloc->res_free += dev->caps.reserved_mtts;
+ res_alloc->guaranteed[vf] += dev->caps.reserved_mtts;
+ res_alloc->quota[vf] += dev->caps.reserved_mtts;
+ }
+ }
+}
+
+void mlx4_init_quotas(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int pf;
+
+ /* quotas for VFs are initialized in mlx4_slave_cap */
+ if (mlx4_is_slave(dev))
+ return;
+
+ if (!mlx4_is_mfunc(dev)) {
+ dev->quotas.qp = dev->caps.num_qps - dev->caps.reserved_qps -
+ mlx4_num_reserved_sqps(dev);
+ dev->quotas.cq = dev->caps.num_cqs - dev->caps.reserved_cqs;
+ dev->quotas.srq = dev->caps.num_srqs - dev->caps.reserved_srqs;
+ dev->quotas.mtt = dev->caps.num_mtts - dev->caps.reserved_mtts;
+ dev->quotas.mpt = dev->caps.num_mpts - dev->caps.reserved_mrws;
+ return;
+ }
+
+ pf = mlx4_master_func_num(dev);
+ dev->quotas.qp =
+ priv->mfunc.master.res_tracker.res_alloc[RES_QP].quota[pf];
+ dev->quotas.cq =
+ priv->mfunc.master.res_tracker.res_alloc[RES_CQ].quota[pf];
+ dev->quotas.srq =
+ priv->mfunc.master.res_tracker.res_alloc[RES_SRQ].quota[pf];
+ dev->quotas.mtt =
+ priv->mfunc.master.res_tracker.res_alloc[RES_MTT].quota[pf];
+ dev->quotas.mpt =
+ priv->mfunc.master.res_tracker.res_alloc[RES_MPT].quota[pf];
+}
+int mlx4_init_resource_tracker(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int i, j;
+ int t;
+
+ priv->mfunc.master.res_tracker.slave_list =
+ kzalloc(dev->num_slaves * sizeof(struct slave_list),
+ GFP_KERNEL);
+ if (!priv->mfunc.master.res_tracker.slave_list)
+ return -ENOMEM;
+
+ for (i = 0 ; i < dev->num_slaves; i++) {
+ for (t = 0; t < MLX4_NUM_OF_RESOURCE_TYPE; ++t)
+ INIT_LIST_HEAD(&priv->mfunc.master.res_tracker.
+ slave_list[i].res_list[t]);
+ mutex_init(&priv->mfunc.master.res_tracker.slave_list[i].mutex);
+ }
+
+ mlx4_dbg(dev, "Started init_resource_tracker: %ld slaves\n",
+ dev->num_slaves);
+ for (i = 0 ; i < MLX4_NUM_OF_RESOURCE_TYPE; i++)
+ priv->mfunc.master.res_tracker.res_tree[i] = RB_ROOT;
+
+ for (i = 0; i < MLX4_NUM_OF_RESOURCE_TYPE; i++) {
+ struct resource_allocator *res_alloc =
+ &priv->mfunc.master.res_tracker.res_alloc[i];
+ res_alloc->quota = kmalloc((dev->persist->num_vfs + 1) *
+ sizeof(int), GFP_KERNEL);
+ res_alloc->guaranteed = kmalloc((dev->persist->num_vfs + 1) *
+ sizeof(int), GFP_KERNEL);
+ if (i == RES_MAC || i == RES_VLAN)
+ res_alloc->allocated = kzalloc(MLX4_MAX_PORTS *
+ (dev->persist->num_vfs
+ + 1) *
+ sizeof(int), GFP_KERNEL);
+ else
+ res_alloc->allocated = kzalloc((dev->persist->
+ num_vfs + 1) *
+ sizeof(int), GFP_KERNEL);
+
+ if (!res_alloc->quota || !res_alloc->guaranteed ||
+ !res_alloc->allocated)
+ goto no_mem_err;
+
+ spin_lock_init(&res_alloc->alloc_lock);
+ for (t = 0; t < dev->persist->num_vfs + 1; t++) {
+ struct mlx4_active_ports actv_ports =
+ mlx4_get_active_ports(dev, t);
+ switch (i) {
+ case RES_QP:
+ initialize_res_quotas(dev, res_alloc, RES_QP,
+ t, dev->caps.num_qps -
+ dev->caps.reserved_qps -
+ mlx4_num_reserved_sqps(dev));
+ break;
+ case RES_CQ:
+ initialize_res_quotas(dev, res_alloc, RES_CQ,
+ t, dev->caps.num_cqs -
+ dev->caps.reserved_cqs);
+ break;
+ case RES_SRQ:
+ initialize_res_quotas(dev, res_alloc, RES_SRQ,
+ t, dev->caps.num_srqs -
+ dev->caps.reserved_srqs);
+ break;
+ case RES_MPT:
+ initialize_res_quotas(dev, res_alloc, RES_MPT,
+ t, dev->caps.num_mpts -
+ dev->caps.reserved_mrws);
+ break;
+ case RES_MTT:
+ initialize_res_quotas(dev, res_alloc, RES_MTT,
+ t, dev->caps.num_mtts -
+ dev->caps.reserved_mtts);
+ break;
+ case RES_MAC:
+ if (t == mlx4_master_func_num(dev)) {
+ int max_vfs_pport = 0;
+ /* Calculate the max vfs per port for */
+ /* both ports. */
+ for (j = 0; j < dev->caps.num_ports;
+ j++) {
+ struct mlx4_slaves_pport slaves_pport =
+ mlx4_phys_to_slaves_pport(dev, j + 1);
+ unsigned current_slaves =
+ bitmap_weight(slaves_pport.slaves,
+ dev->caps.num_ports) - 1;
+ if (max_vfs_pport < current_slaves)
+ max_vfs_pport =
+ current_slaves;
+ }
+ res_alloc->quota[t] =
+ MLX4_MAX_MAC_NUM -
+ 2 * max_vfs_pport;
+ res_alloc->guaranteed[t] = 2;
+ for (j = 0; j < MLX4_MAX_PORTS; j++)
+ res_alloc->res_port_free[j] =
+ MLX4_MAX_MAC_NUM;
+ } else {
+ res_alloc->quota[t] = MLX4_MAX_MAC_NUM;
+ res_alloc->guaranteed[t] = 2;
+ }
+ break;
+ case RES_VLAN:
+ if (t == mlx4_master_func_num(dev)) {
+ res_alloc->quota[t] = MLX4_MAX_VLAN_NUM;
+ res_alloc->guaranteed[t] = MLX4_MAX_VLAN_NUM / 2;
+ for (j = 0; j < MLX4_MAX_PORTS; j++)
+ res_alloc->res_port_free[j] =
+ res_alloc->quota[t];
+ } else {
+ res_alloc->quota[t] = MLX4_MAX_VLAN_NUM / 2;
+ res_alloc->guaranteed[t] = 0;
+ }
+ break;
+ case RES_COUNTER:
+ res_alloc->quota[t] = dev->caps.max_counters;
+ res_alloc->guaranteed[t] = 0;
+ if (t == mlx4_master_func_num(dev))
+ res_alloc->res_free = res_alloc->quota[t];
+ break;
+ default:
+ break;
+ }
+ if (i == RES_MAC || i == RES_VLAN) {
+ for (j = 0; j < dev->caps.num_ports; j++)
+ if (test_bit(j, actv_ports.ports))
+ res_alloc->res_port_rsvd[j] +=
+ res_alloc->guaranteed[t];
+ } else {
+ res_alloc->res_reserved += res_alloc->guaranteed[t];
+ }
+ }
+ }
+ spin_lock_init(&priv->mfunc.master.res_tracker.lock);
+ return 0;
+
+no_mem_err:
+ for (i = 0; i < MLX4_NUM_OF_RESOURCE_TYPE; i++) {
+ kfree(priv->mfunc.master.res_tracker.res_alloc[i].allocated);
+ priv->mfunc.master.res_tracker.res_alloc[i].allocated = NULL;
+ kfree(priv->mfunc.master.res_tracker.res_alloc[i].guaranteed);
+ priv->mfunc.master.res_tracker.res_alloc[i].guaranteed = NULL;
+ kfree(priv->mfunc.master.res_tracker.res_alloc[i].quota);
+ priv->mfunc.master.res_tracker.res_alloc[i].quota = NULL;
+ }
+ return -ENOMEM;
+}
+
+void mlx4_free_resource_tracker(struct mlx4_dev *dev,
+ enum mlx4_res_tracker_free_type type)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int i;
+
+ if (priv->mfunc.master.res_tracker.slave_list) {
+ if (type != RES_TR_FREE_STRUCTS_ONLY) {
+ for (i = 0; i < dev->num_slaves; i++) {
+ if (type == RES_TR_FREE_ALL ||
+ dev->caps.function != i)
+ mlx4_delete_all_resources_for_slave(dev, i);
+ }
+ /* free master's vlans */
+ i = dev->caps.function;
+ mlx4_reset_roce_gids(dev, i);
+ mutex_lock(&priv->mfunc.master.res_tracker.slave_list[i].mutex);
+ rem_slave_vlans(dev, i);
+ mutex_unlock(&priv->mfunc.master.res_tracker.slave_list[i].mutex);
+ }
+
+ if (type != RES_TR_FREE_SLAVES_ONLY) {
+ for (i = 0; i < MLX4_NUM_OF_RESOURCE_TYPE; i++) {
+ kfree(priv->mfunc.master.res_tracker.res_alloc[i].allocated);
+ priv->mfunc.master.res_tracker.res_alloc[i].allocated = NULL;
+ kfree(priv->mfunc.master.res_tracker.res_alloc[i].guaranteed);
+ priv->mfunc.master.res_tracker.res_alloc[i].guaranteed = NULL;
+ kfree(priv->mfunc.master.res_tracker.res_alloc[i].quota);
+ priv->mfunc.master.res_tracker.res_alloc[i].quota = NULL;
+ }
+ kfree(priv->mfunc.master.res_tracker.slave_list);
+ priv->mfunc.master.res_tracker.slave_list = NULL;
+ }
+ }
+}
+
+static void update_pkey_index(struct mlx4_dev *dev, int slave,
+ struct mlx4_cmd_mailbox *inbox)
+{
+ u8 sched = *(u8 *)(inbox->buf + 64);
+ u8 orig_index = *(u8 *)(inbox->buf + 35);
+ u8 new_index;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ int port;
+
+ port = (sched >> 6 & 1) + 1;
+
+ new_index = priv->virt2phys_pkey[slave][port - 1][orig_index];
+ *(u8 *)(inbox->buf + 35) = new_index;
+}
+
+static void update_gid(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *inbox,
+ u8 slave)
+{
+ struct mlx4_qp_context *qp_ctx = inbox->buf + 8;
+ enum mlx4_qp_optpar optpar = be32_to_cpu(*(__be32 *) inbox->buf);
+ u32 ts = (be32_to_cpu(qp_ctx->flags) >> 16) & 0xff;
+ int port;
+
+ if (MLX4_QP_ST_UD == ts) {
+ port = (qp_ctx->pri_path.sched_queue >> 6 & 1) + 1;
+ if (mlx4_is_eth(dev, port))
+ qp_ctx->pri_path.mgid_index =
+ mlx4_get_base_gid_ix(dev, slave, port) | 0x80;
+ else
+ qp_ctx->pri_path.mgid_index = slave | 0x80;
+
+ } else if (MLX4_QP_ST_RC == ts || MLX4_QP_ST_XRC == ts || MLX4_QP_ST_UC == ts) {
+ if (optpar & MLX4_QP_OPTPAR_PRIMARY_ADDR_PATH) {
+ port = (qp_ctx->pri_path.sched_queue >> 6 & 1) + 1;
+ if (mlx4_is_eth(dev, port)) {
+ qp_ctx->pri_path.mgid_index +=
+ mlx4_get_base_gid_ix(dev, slave, port);
+ qp_ctx->pri_path.mgid_index &= 0x7f;
+ } else {
+ qp_ctx->pri_path.mgid_index = slave & 0x7F;
+ }
+ }
+ if (optpar & MLX4_QP_OPTPAR_ALT_ADDR_PATH) {
+ port = (qp_ctx->alt_path.sched_queue >> 6 & 1) + 1;
+ if (mlx4_is_eth(dev, port)) {
+ qp_ctx->alt_path.mgid_index +=
+ mlx4_get_base_gid_ix(dev, slave, port);
+ qp_ctx->alt_path.mgid_index &= 0x7f;
+ } else {
+ qp_ctx->alt_path.mgid_index = slave & 0x7F;
+ }
+ }
+ }
+}
+
+static int check_counter_index_validity(struct mlx4_dev *dev, int slave, int port, int idx)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct counter_index *counter, *tmp_counter;
+
+ if (slave == 0) {
+ list_for_each_entry_safe(counter, tmp_counter,
+ &priv->counters_table.global_port_list[port - 1],
+ list) {
+ if (counter->index == idx)
+ return 0;
+ }
+ } else {
+ list_for_each_entry_safe(counter, tmp_counter,
+ &priv->counters_table.vf_list[slave - 1][port - 1],
+ list) {
+ if (counter->index == idx)
+ return 0;
+ }
+ }
+ return -EINVAL;
+}
+
+static int update_vport_qp_param(struct mlx4_dev *dev,
+ struct mlx4_cmd_mailbox *inbox,
+ u8 slave, u32 qpn)
+{
+ struct mlx4_qp_context *qpc = inbox->buf + 8;
+ struct mlx4_vport_oper_state *vp_oper;
+ struct mlx4_priv *priv;
+ u32 qp_type;
+ int port;
+
+ port = (qpc->pri_path.sched_queue & 0x40) ? 2 : 1;
+ priv = mlx4_priv(dev);
+ vp_oper = &priv->mfunc.master.vf_oper[slave].vport[port];
+ qp_type = (be32_to_cpu(qpc->flags) >> 16) & 0xff;
+
+ if (dev->caps.port_type[port] == MLX4_PORT_TYPE_ETH &&
+ qpc->pri_path.counter_index != MLX4_SINK_COUNTER_INDEX) {
+ if (check_counter_index_validity(dev, slave, port,
+ qpc->pri_path.counter_index))
+ return -EINVAL;
+ }
+
+ if (MLX4_VGT != vp_oper->state.default_vlan) {
+ /* the reserved QPs (special, proxy, tunnel)
+ * do not operate over vlans
+ */
+ if (mlx4_is_qp_reserved(dev, qpn))
+ return 0;
+
+ /* force VLAN stripping by clearing vsd; an MLX QP refers to Raw Ethernet */
+ if (qp_type == MLX4_QP_ST_UD ||
+ (qp_type == MLX4_QP_ST_MLX && mlx4_is_eth(dev, port))) {
+ if (dev->caps.bmme_flags & MLX4_BMME_FLAG_VSD_INIT2RTR) {
+ *(__be32 *)inbox->buf =
+ cpu_to_be32(be32_to_cpu(*(__be32 *)inbox->buf) |
+ MLX4_QP_OPTPAR_VLAN_STRIPPING);
+ qpc->param3 &= ~cpu_to_be32(MLX4_STRIP_VLAN);
+ } else {
+ struct mlx4_update_qp_params params = {.flags = 0};
+
+ mlx4_update_qp(dev, qpn, MLX4_UPDATE_QP_VSD, &params);
+ }
+ }
+
+ /* preserve IF_COUNTER flag */
+ qpc->pri_path.vlan_control &=
+ MLX4_CTRL_ETH_SRC_CHECK_IF_COUNTER;
+ if (vp_oper->state.link_state == IFLA_VF_LINK_STATE_DISABLE &&
+ dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_UPDATE_QP) {
+ qpc->pri_path.vlan_control |=
+ MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED |
+ MLX4_VLAN_CTRL_ETH_TX_BLOCK_PRIO_TAGGED |
+ MLX4_VLAN_CTRL_ETH_TX_BLOCK_UNTAGGED |
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_PRIO_TAGGED |
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_UNTAGGED |
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_TAGGED;
+ } else if (0 != vp_oper->state.default_vlan) {
+ qpc->pri_path.vlan_control |=
+ MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED |
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_PRIO_TAGGED |
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_UNTAGGED;
+ } else { /* priority tagged */
+ qpc->pri_path.vlan_control |=
+ MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED |
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_TAGGED;
+ }
+
+ qpc->pri_path.fvl_rx |= MLX4_FVL_RX_FORCE_ETH_VLAN;
+ qpc->pri_path.vlan_index = vp_oper->vlan_idx;
+ qpc->pri_path.fl |= MLX4_FL_CV | MLX4_FL_ETH_HIDE_CQE_VLAN;
+ qpc->pri_path.feup |= MLX4_FEUP_FORCE_ETH_UP | MLX4_FVL_FORCE_ETH_VLAN;
+ qpc->pri_path.sched_queue &= 0xC7;
+ qpc->pri_path.sched_queue |= (vp_oper->state.default_qos) << 3;
+ qpc->qos_vport = vp_oper->state.qos_vport;
+ }
+ if (vp_oper->state.spoofchk) {
+ qpc->pri_path.feup |= MLX4_FSM_FORCE_ETH_SRC_MAC;
+ qpc->pri_path.grh_mylmc = (0x80 & qpc->pri_path.grh_mylmc) + vp_oper->mac_idx;
+ }
+ return 0;
+}
+
+static int mpt_mask(struct mlx4_dev *dev)
+{
+ return dev->caps.num_mpts - 1;
+}
+
+const char *mlx4_resource_type_to_str(enum mlx4_resource t)
+{
+ switch (t) {
+ case RES_QP:
+ return "QP";
+ case RES_CQ:
+ return "CQ";
+ case RES_SRQ:
+ return "SRQ";
+ case RES_XRCD:
+ return "XRCD";
+ case RES_MPT:
+ return "MPT";
+ case RES_MTT:
+ return "MTT";
+ case RES_MAC:
+ return "MAC";
+ case RES_VLAN:
+ return "VLAN";
+ case RES_COUNTER:
+ return "COUNTER";
+ case RES_FS_RULE:
+ return "FS_RULE";
+ case RES_EQ:
+ return "EQ";
+ default:
+ return "INVALID RESOURCE";
+ }
+}
+
+static void *find_res(struct mlx4_dev *dev, u64 res_id,
+ enum mlx4_resource type)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+
+ return res_tracker_lookup(&priv->mfunc.master.res_tracker.res_tree[type],
+ res_id);
+}
+
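+/* Look up a tracked resource, check that it is idle and owned by @slave,
+ * then mark it busy. The previous state is saved in from_state so that
+ * put_res() can restore it.
+ */
+static int _get_res(struct mlx4_dev *dev, int slave, u64 res_id,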
+ enum mlx4_resource type,
+ void *res, const char *func_name)
+{
+ struct res_common *r;
+ int err = 0;
+
+ spin_lock_irq(mlx4_tlock(dev));
+ r = find_res(dev, res_id, type);
+ if (!r) {
+ err = -ENOENT;
+ goto exit;
+ }
+
+ if (r->state == RES_ANY_BUSY) {
+ mlx4_warn(dev,
+ "%s(%d) trying to get resource %llx of type %s, but it's already taken by %s\n",
+ func_name, slave, res_id, mlx4_resource_type_to_str(type),
+ r->func_name);
+ err = -EBUSY;
+ goto exit;
+ }
+
+ if (r->owner != slave) {
+ err = -EPERM;
+ goto exit;
+ }
+
+ r->from_state = r->state;
+ r->state = RES_ANY_BUSY;
+ r->func_name = func_name;
+
+ if (res)
+ *((struct res_common **)res) = r;
+
+exit:
+ spin_unlock_irq(mlx4_tlock(dev));
+ return err;
+}
+
+#define get_res(dev, slave, res_id, type, res) \
+ _get_res((dev), (slave), (res_id), (type), (res), __func__)
+
+int mlx4_get_slave_from_resource_id(struct mlx4_dev *dev,
+ enum mlx4_resource type,
+ u64 res_id, int *slave)
+{
+
+ struct res_common *r;
+ int err = -ENOENT;
+ int id = res_id;
+
+ if (type == RES_QP)
+ id &= 0x7fffff;
+ spin_lock(mlx4_tlock(dev));
+
+ r = find_res(dev, id, type);
+ if (r) {
+ *slave = r->owner;
+ err = 0;
+ }
+ spin_unlock(mlx4_tlock(dev));
+
+ return err;
+}
+
+static void put_res(struct mlx4_dev *dev, int slave, u64 res_id,
+ enum mlx4_resource type)
+{
+ struct res_common *r;
+
+ spin_lock_irq(mlx4_tlock(dev));
+ r = find_res(dev, res_id, type);
+ if (r) {
+ r->state = r->from_state;
+ r->func_name = "";
+ }
+ spin_unlock_irq(mlx4_tlock(dev));
+}
+
+static struct res_common *alloc_qp_tr(int id)
+{
+ struct res_qp *ret;
+
+ ret = kzalloc(sizeof *ret, GFP_KERNEL);
+ if (!ret)
+ return NULL;
+
+ ret->com.res_id = id;
+ ret->com.state = RES_QP_RESERVED;
+ ret->local_qpn = id;
+ INIT_LIST_HEAD(&ret->mcg_list);
+ spin_lock_init(&ret->mcg_spl);
+ atomic_set(&ret->ref_count, 0);
+
+ return &ret->com;
+}
+
+static struct res_common *alloc_mtt_tr(int id, int order)
+{
+ struct res_mtt *ret;
+
+ ret = kzalloc(sizeof *ret, GFP_KERNEL);
+ if (!ret)
+ return NULL;
+
+ ret->com.res_id = id;
+ ret->order = order;
+ ret->com.state = RES_MTT_ALLOCATED;
+ atomic_set(&ret->ref_count, 0);
+
+ return &ret->com;
+}
+
+static struct res_common *alloc_mpt_tr(int id, int key)
+{
+ struct res_mpt *ret;
+
+ ret = kzalloc(sizeof *ret, GFP_KERNEL);
+ if (!ret)
+ return NULL;
+
+ ret->com.res_id = id;
+ ret->com.state = RES_MPT_RESERVED;
+ ret->key = key;
+
+ return &ret->com;
+}
+
+static struct res_common *alloc_eq_tr(int id)
+{
+ struct res_eq *ret;
+
+ ret = kzalloc(sizeof *ret, GFP_KERNEL);
+ if (!ret)
+ return NULL;
+
+ ret->com.res_id = id;
+ ret->com.state = RES_EQ_RESERVED;
+
+ return &ret->com;
+}
+
+static struct res_common *alloc_cq_tr(int id)
+{
+ struct res_cq *ret;
+
+ ret = kzalloc(sizeof *ret, GFP_KERNEL);
+ if (!ret)
+ return NULL;
+
+ ret->com.res_id = id;
+ ret->com.state = RES_CQ_ALLOCATED;
+ atomic_set(&ret->ref_count, 0);
+
+ return &ret->com;
+}
+
+static struct res_common *alloc_srq_tr(int id)
+{
+ struct res_srq *ret;
+
+ ret = kzalloc(sizeof *ret, GFP_KERNEL);
+ if (!ret)
+ return NULL;
+
+ ret->com.res_id = id;
+ ret->com.state = RES_SRQ_ALLOCATED;
+ atomic_set(&ret->ref_count, 0);
+
+ return &ret->com;
+}
+
+static struct res_common *alloc_counter_tr(int id)
+{
+ struct res_counter *ret;
+
+ ret = kzalloc(sizeof *ret, GFP_KERNEL);
+ if (!ret)
+ return NULL;
+
+ ret->com.res_id = id;
+ ret->com.state = RES_COUNTER_ALLOCATED;
+
+ return &ret->com;
+}
+
+static struct res_common *alloc_xrcdn_tr(int id)
+{
+ struct res_xrcdn *ret;
+
+ ret = kzalloc(sizeof *ret, GFP_KERNEL);
+ if (!ret)
+ return NULL;
+
+ ret->com.res_id = id;
+ ret->com.state = RES_XRCD_ALLOCATED;
+
+ return &ret->com;
+}
+
+static struct res_common *alloc_fs_rule_tr(u64 id, int qpn)
+{
+ struct res_fs_rule *ret;
+
+ ret = kzalloc(sizeof *ret, GFP_KERNEL);
+ if (!ret)
+ return NULL;
+
+ ret->com.res_id = id;
+ ret->com.state = RES_FS_RULE_ALLOCATED;
+ ret->qpn = qpn;
+ return &ret->com;
+}
+
+static struct res_common *alloc_tr(u64 id, enum mlx4_resource type, int slave,
+ int extra)
+{
+ struct res_common *ret;
+
+ switch (type) {
+ case RES_QP:
+ ret = alloc_qp_tr(id);
+ break;
+ case RES_MPT:
+ ret = alloc_mpt_tr(id, extra);
+ break;
+ case RES_MTT:
+ ret = alloc_mtt_tr(id, extra);
+ break;
+ case RES_EQ:
+ ret = alloc_eq_tr(id);
+ break;
+ case RES_CQ:
+ ret = alloc_cq_tr(id);
+ break;
+ case RES_SRQ:
+ ret = alloc_srq_tr(id);
+ break;
+ case RES_MAC:
+ pr_err("implementation missing\n");
+ return NULL;
+ case RES_COUNTER:
+ ret = alloc_counter_tr(id);
+ break;
+ case RES_XRCD:
+ ret = alloc_xrcdn_tr(id);
+ break;
+ case RES_FS_RULE:
+ ret = alloc_fs_rule_tr(id, extra);
+ break;
+ default:
+ return NULL;
+ }
+ if (ret)
+ ret->owner = slave;
+
+ return ret;
+}
+
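+/* Allocate tracker entries for ids [base, base + count) and insert them
+ * into the per-type tree and the slave's resource list. On any failure,
+ * the entries inserted so far are removed and all entries are freed.
+ */
+static int add_res_range(struct mlx4_dev *dev, int slave, u64 base, int count,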
+ enum mlx4_resource type, int extra)
+{
+ int i;
+ int err;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct res_common **res_arr;
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct rb_root *root = &tracker->res_tree[type];
+
+ res_arr = kzalloc(count * sizeof *res_arr, GFP_KERNEL);
+ if (!res_arr)
+ return -ENOMEM;
+
+ for (i = 0; i < count; ++i) {
+ res_arr[i] = alloc_tr(base + i, type, slave, extra);
+ if (!res_arr[i]) {
+ for (--i; i >= 0; --i)
+ kfree(res_arr[i]);
+
+ kfree(res_arr);
+ return -ENOMEM;
+ }
+ }
+
+ spin_lock_irq(mlx4_tlock(dev));
+ for (i = 0; i < count; ++i) {
+ if (find_res(dev, base + i, type)) {
+ err = -EEXIST;
+ goto undo;
+ }
+ err = res_tracker_insert(root, res_arr[i]);
+ if (err)
+ goto undo;
+ list_add_tail(&res_arr[i]->list,
+ &tracker->slave_list[slave].res_list[type]);
+ }
+ spin_unlock_irq(mlx4_tlock(dev));
+ kfree(res_arr);
+
+ return 0;
+
+undo:
+ for (--i; i >= 0; --i) {
+ rb_erase(&res_arr[i]->node, root);
+ list_del_init(&res_arr[i]->list);
+ }
+
+ spin_unlock_irq(mlx4_tlock(dev));
+
+ for (i = 0; i < count; ++i)
+ kfree(res_arr[i]);
+
+ kfree(res_arr);
+
+ return err;
+}
+
+static int remove_qp_ok(struct res_qp *res)
+{
+ if (res->com.state == RES_QP_BUSY || atomic_read(&res->ref_count) ||
+ !list_empty(&res->mcg_list)) {
+ pr_err("resource tracker: fail to remove qp, state %d, ref_count %d\n",
+ res->com.state, atomic_read(&res->ref_count));
+ return -EBUSY;
+ } else if (res->com.state != RES_QP_RESERVED) {
+ return -EPERM;
+ }
+
+ return 0;
+}
+
+static int remove_mtt_ok(struct res_mtt *res, int order)
+{
+ if (res->com.state == RES_MTT_BUSY ||
+ atomic_read(&res->ref_count)) {
+ pr_devel("%s-%d: state %s, ref_count %d\n",
+ __func__, __LINE__,
+ mtt_states_str(res->com.state),
+ atomic_read(&res->ref_count));
+ return -EBUSY;
+ } else if (res->com.state != RES_MTT_ALLOCATED)
+ return -EPERM;
+ else if (res->order != order)
+ return -EINVAL;
+
+ return 0;
+}
+
+static int remove_mpt_ok(struct res_mpt *res)
+{
+ if (res->com.state == RES_MPT_BUSY)
+ return -EBUSY;
+ else if (res->com.state != RES_MPT_RESERVED)
+ return -EPERM;
+
+ return 0;
+}
+
+static int remove_eq_ok(struct res_eq *res)
+{
+ if (res->com.state == RES_EQ_BUSY)
+ return -EBUSY;
+ else if (res->com.state != RES_EQ_RESERVED)
+ return -EPERM;
+
+ return 0;
+}
+
+static int remove_counter_ok(struct res_counter *res)
+{
+ if (res->com.state == RES_COUNTER_BUSY)
+ return -EBUSY;
+ else if (res->com.state != RES_COUNTER_ALLOCATED)
+ return -EPERM;
+
+ return 0;
+}
+
+static int remove_xrcdn_ok(struct res_xrcdn *res)
+{
+ if (res->com.state == RES_XRCD_BUSY)
+ return -EBUSY;
+ else if (res->com.state != RES_XRCD_ALLOCATED)
+ return -EPERM;
+
+ return 0;
+}
+
+static int remove_fs_rule_ok(struct res_fs_rule *res)
+{
+ if (res->com.state == RES_FS_RULE_BUSY)
+ return -EBUSY;
+ else if (res->com.state != RES_FS_RULE_ALLOCATED)
+ return -EPERM;
+
+ return 0;
+}
+
+static int remove_cq_ok(struct res_cq *res)
+{
+ if (res->com.state == RES_CQ_BUSY)
+ return -EBUSY;
+ else if (res->com.state != RES_CQ_ALLOCATED)
+ return -EPERM;
+
+ return 0;
+}
+
+static int remove_srq_ok(struct res_srq *res)
+{
+ if (res->com.state == RES_SRQ_BUSY)
+ return -EBUSY;
+ else if (res->com.state != RES_SRQ_ALLOCATED)
+ return -EPERM;
+
+ return 0;
+}
+
+static int remove_ok(struct res_common *res, enum mlx4_resource type, int extra)
+{
+ switch (type) {
+ case RES_QP:
+ return remove_qp_ok((struct res_qp *)res);
+ case RES_CQ:
+ return remove_cq_ok((struct res_cq *)res);
+ case RES_SRQ:
+ return remove_srq_ok((struct res_srq *)res);
+ case RES_MPT:
+ return remove_mpt_ok((struct res_mpt *)res);
+ case RES_MTT:
+ return remove_mtt_ok((struct res_mtt *)res, extra);
+ case RES_MAC:
+ return -ENOSYS;
+ case RES_EQ:
+ return remove_eq_ok((struct res_eq *)res);
+ case RES_COUNTER:
+ return remove_counter_ok((struct res_counter *)res);
+ case RES_XRCD:
+ return remove_xrcdn_ok((struct res_xrcdn *)res);
+ case RES_FS_RULE:
+ return remove_fs_rule_ok((struct res_fs_rule *)res);
+ default:
+ return -EINVAL;
+ }
+}
+
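+/* Remove ids [base, base + count) in two passes under the tracker lock:
+ * first check that every id exists, is owned by @slave and is removable,
+ * and only then erase and free the entries, so removal is all-or-nothing.
+ */
+static int rem_res_range(struct mlx4_dev *dev, int slave, u64 base, int count,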
+ enum mlx4_resource type, int extra)
+{
+ u64 i;
+ int err;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct res_common *r;
+
+ spin_lock_irq(mlx4_tlock(dev));
+ for (i = base; i < base + count; ++i) {
+ r = res_tracker_lookup(&tracker->res_tree[type], i);
+ if (!r) {
+ err = -ENOENT;
+ goto out;
+ }
+ if (r->owner != slave) {
+ err = -EPERM;
+ goto out;
+ }
+ err = remove_ok(r, type, extra);
+ if (err)
+ goto out;
+ }
+
+ for (i = base; i < base + count; ++i) {
+ r = res_tracker_lookup(&tracker->res_tree[type], i);
+ rb_erase(&r->node, &tracker->res_tree[type]);
+ list_del(&r->list);
+ kfree(r);
+ }
+ err = 0;
+
+out:
+ spin_unlock_irq(mlx4_tlock(dev));
+
+ return err;
+}
+
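+/* Start a QP state transition on behalf of @slave. If moving from the
+ * current state to @state is legal, save the current state in from_state,
+ * record @state in to_state and park the QP in RES_QP_BUSY until
+ * res_end_move() or res_abort_move() completes the transition.
+ */
+static int qp_res_start_move_to(struct mlx4_dev *dev, int slave, int qpn,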
+ enum res_qp_states state, struct res_qp **qp,
+ int alloc)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct res_qp *r;
+ int err = 0;
+
+ spin_lock_irq(mlx4_tlock(dev));
+ r = res_tracker_lookup(&tracker->res_tree[RES_QP], qpn);
+ if (!r)
+ err = -ENOENT;
+ else if (r->com.owner != slave)
+ err = -EPERM;
+ else {
+ switch (state) {
+ case RES_QP_BUSY:
+ mlx4_dbg(dev, "%s: failed RES_QP, 0x%llx\n",
+ __func__, r->com.res_id);
+ err = -EBUSY;
+ break;
+
+ case RES_QP_RESERVED:
+ if (r->com.state == RES_QP_MAPPED && !alloc)
+ break;
+
+ mlx4_dbg(dev, "failed RES_QP, 0x%llx\n", r->com.res_id);
+ err = -EINVAL;
+ break;
+
+ case RES_QP_MAPPED:
+ if ((r->com.state == RES_QP_RESERVED && alloc) ||
+ r->com.state == RES_QP_HW)
+ break;
+ else {
+ mlx4_dbg(dev, "failed RES_QP, 0x%llx\n",
+ r->com.res_id);
+ err = -EINVAL;
+ }
+
+ break;
+
+ case RES_QP_HW:
+ if (r->com.state != RES_QP_MAPPED)
+ err = -EINVAL;
+ break;
+ default:
+ err = -EINVAL;
+ }
+
+ if (!err) {
+ r->com.from_state = r->com.state;
+ r->com.to_state = state;
+ r->com.state = RES_QP_BUSY;
+ if (qp)
+ *qp = r;
+ }
+ }
+
+ spin_unlock_irq(mlx4_tlock(dev));
+
+ return err;
+}
+
+static int mr_res_start_move_to(struct mlx4_dev *dev, int slave, int index,
+ enum res_mpt_states state, struct res_mpt **mpt)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct res_mpt *r;
+ int err = 0;
+
+ spin_lock_irq(mlx4_tlock(dev));
+ r = res_tracker_lookup(&tracker->res_tree[RES_MPT], index);
+ if (!r)
+ err = -ENOENT;
+ else if (r->com.owner != slave)
+ err = -EPERM;
+ else {
+ switch (state) {
+ case RES_MPT_BUSY:
+ err = -EINVAL;
+ break;
+
+ case RES_MPT_RESERVED:
+ if (r->com.state != RES_MPT_MAPPED)
+ err = -EINVAL;
+ break;
+
+ case RES_MPT_MAPPED:
+ if (r->com.state != RES_MPT_RESERVED &&
+ r->com.state != RES_MPT_HW)
+ err = -EINVAL;
+ break;
+
+ case RES_MPT_HW:
+ if (r->com.state != RES_MPT_MAPPED)
+ err = -EINVAL;
+ break;
+ default:
+ err = -EINVAL;
+ }
+
+ if (!err) {
+ r->com.from_state = r->com.state;
+ r->com.to_state = state;
+ r->com.state = RES_MPT_BUSY;
+ if (mpt)
+ *mpt = r;
+ }
+ }
+
+ spin_unlock_irq(mlx4_tlock(dev));
+
+ return err;
+}
+
+static int eq_res_start_move_to(struct mlx4_dev *dev, int slave, int index,
+ enum res_eq_states state, struct res_eq **eq)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct res_eq *r;
+ int err = 0;
+
+ spin_lock_irq(mlx4_tlock(dev));
+ r = res_tracker_lookup(&tracker->res_tree[RES_EQ], index);
+ if (!r)
+ err = -ENOENT;
+ else if (r->com.owner != slave)
+ err = -EPERM;
+ else {
+ switch (state) {
+ case RES_EQ_BUSY:
+ err = -EINVAL;
+ break;
+
+ case RES_EQ_RESERVED:
+ if (r->com.state != RES_EQ_HW)
+ err = -EINVAL;
+ break;
+
+ case RES_EQ_HW:
+ if (r->com.state != RES_EQ_RESERVED)
+ err = -EINVAL;
+ break;
+
+ default:
+ err = -EINVAL;
+ }
+
+ if (!err) {
+ r->com.from_state = r->com.state;
+ r->com.to_state = state;
+ r->com.state = RES_EQ_BUSY;
+ if (eq)
+ *eq = r;
+ }
+ }
+
+ spin_unlock_irq(mlx4_tlock(dev));
+
+ return err;
+}
+
+static int cq_res_start_move_to(struct mlx4_dev *dev, int slave, int cqn,
+ enum res_cq_states state, struct res_cq **cq)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct res_cq *r;
+ int err;
+
+ spin_lock_irq(mlx4_tlock(dev));
+ r = res_tracker_lookup(&tracker->res_tree[RES_CQ], cqn);
+ if (!r) {
+ err = -ENOENT;
+ } else if (r->com.owner != slave) {
+ err = -EPERM;
+ } else if (state == RES_CQ_ALLOCATED) {
+ if (r->com.state != RES_CQ_HW)
+ err = -EINVAL;
+ else if (atomic_read(&r->ref_count))
+ err = -EBUSY;
+ else
+ err = 0;
+ } else if (state != RES_CQ_HW || r->com.state != RES_CQ_ALLOCATED) {
+ err = -EINVAL;
+ } else {
+ err = 0;
+ }
+
+ if (!err) {
+ r->com.from_state = r->com.state;
+ r->com.to_state = state;
+ r->com.state = RES_CQ_BUSY;
+ if (cq)
+ *cq = r;
+ }
+
+ spin_unlock_irq(mlx4_tlock(dev));
+
+ return err;
+}
+
+static int srq_res_start_move_to(struct mlx4_dev *dev, int slave, int index,
+ enum res_srq_states state, struct res_srq **srq)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct res_srq *r;
+ int err = 0;
+
+ spin_lock_irq(mlx4_tlock(dev));
+ r = res_tracker_lookup(&tracker->res_tree[RES_SRQ], index);
+ if (!r) {
+ err = -ENOENT;
+ } else if (r->com.owner != slave) {
+ err = -EPERM;
+ } else if (state == RES_SRQ_ALLOCATED) {
+ if (r->com.state != RES_SRQ_HW)
+ err = -EINVAL;
+ else if (atomic_read(&r->ref_count))
+ err = -EBUSY;
+ } else if (state != RES_SRQ_HW || r->com.state != RES_SRQ_ALLOCATED) {
+ err = -EINVAL;
+ }
+
+ if (!err) {
+ r->com.from_state = r->com.state;
+ r->com.to_state = state;
+ r->com.state = RES_SRQ_BUSY;
+ if (srq)
+ *srq = r;
+ }
+
+ spin_unlock_irq(mlx4_tlock(dev));
+
+ return err;
+}
+
+static void res_abort_move(struct mlx4_dev *dev, int slave,
+ enum mlx4_resource type, int id)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct res_common *r;
+
+ spin_lock_irq(mlx4_tlock(dev));
+ r = res_tracker_lookup(&tracker->res_tree[type], id);
+ if (r && (r->owner == slave))
+ r->state = r->from_state;
+ spin_unlock_irq(mlx4_tlock(dev));
+}
+
+static void res_end_move(struct mlx4_dev *dev, int slave,
+ enum mlx4_resource type, int id)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct res_common *r;
+
+ spin_lock_irq(mlx4_tlock(dev));
+ r = res_tracker_lookup(&tracker->res_tree[type], id);
+ if (r && (r->owner == slave))
+ r->state = r->to_state;
+ spin_unlock_irq(mlx4_tlock(dev));
+}
+
+static int valid_reserved(struct mlx4_dev *dev, int slave, int qpn)
+{
+ return mlx4_is_qp_reserved(dev, qpn) &&
+ (mlx4_is_master(dev) || mlx4_is_guest_proxy(dev, slave, qpn));
+}
+
+static int fw_reserved(struct mlx4_dev *dev, int qpn)
+{
+ return qpn < dev->caps.reserved_qps_cnt[MLX4_QP_REGION_FW];
+}
+
+static int qp_alloc_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param)
+{
+ int err;
+ int count;
+ int align;
+ int base;
+ int qpn;
+ u8 flags;
+
+ switch (op) {
+ case RES_OP_RESERVE:
+ count = get_param_l(&in_param) & 0xffffff;
+ /* Turn off all unsupported QP allocation flags that the
+ * slave tries to set.
+ */
+ flags = (get_param_l(&in_param) >> 24) & dev->caps.alloc_res_qp_mask;
+ align = get_param_h(&in_param);
+ err = mlx4_grant_resource(dev, slave, RES_QP, count, 0);
+ if (err)
+ return err;
+
+ err = __mlx4_qp_reserve_range(dev, count, align, &base, flags);
+ if (err) {
+ mlx4_release_resource(dev, slave, RES_QP, count, 0);
+ return err;
+ }
+
+ err = add_res_range(dev, slave, base, count, RES_QP, 0);
+ if (err) {
+ mlx4_release_resource(dev, slave, RES_QP, count, 0);
+ __mlx4_qp_release_range(dev, base, count);
+ return err;
+ }
+ set_param_l(out_param, base);
+ break;
+ case RES_OP_MAP_ICM:
+ qpn = get_param_l(&in_param) & 0x7fffff;
+ if (valid_reserved(dev, slave, qpn)) {
+ err = add_res_range(dev, slave, qpn, 1, RES_QP, 0);
+ if (err)
+ return err;
+ }
+
+ err = qp_res_start_move_to(dev, slave, qpn, RES_QP_MAPPED,
+ NULL, 1);
+ if (err)
+ return err;
+
+ if (!fw_reserved(dev, qpn)) {
+ err = __mlx4_qp_alloc_icm(dev, qpn, GFP_KERNEL);
+ if (err) {
+ res_abort_move(dev, slave, RES_QP, qpn);
+ return err;
+ }
+ }
+
+ res_end_move(dev, slave, RES_QP, qpn);
+ break;
+
+ default:
+ err = -EINVAL;
+ break;
+ }
+ return err;
+}
+
+static int mtt_alloc_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param)
+{
+ int err = -EINVAL;
+ int base;
+ int order;
+
+ if (op != RES_OP_RESERVE_AND_MAP)
+ return err;
+
+ order = get_param_l(&in_param);
+
+ err = mlx4_grant_resource(dev, slave, RES_MTT, 1 << order, 0);
+ if (err)
+ return err;
+
+ base = __mlx4_alloc_mtt_range(dev, order);
+ if (base == -1) {
+ mlx4_release_resource(dev, slave, RES_MTT, 1 << order, 0);
+ return -ENOMEM;
+ }
+
+ err = add_res_range(dev, slave, base, 1, RES_MTT, order);
+ if (err) {
+ mlx4_release_resource(dev, slave, RES_MTT, 1 << order, 0);
+ __mlx4_free_mtt_range(dev, base, order);
+ } else {
+ set_param_l(out_param, base);
+ }
+
+ return err;
+}
+
+static int mpt_alloc_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param)
+{
+ int err = -EINVAL;
+ int index;
+ int id;
+ struct res_mpt *mpt;
+
+ switch (op) {
+ case RES_OP_RESERVE:
+ err = mlx4_grant_resource(dev, slave, RES_MPT, 1, 0);
+ if (err)
+ break;
+
+ index = __mlx4_mpt_reserve(dev);
+ if (index == -1) {
+ mlx4_release_resource(dev, slave, RES_MPT, 1, 0);
+ /* err is 0 here; report the failed reservation */
+ err = -ENOMEM;
+ break;
+ }
+ id = index & mpt_mask(dev);
+
+ err = add_res_range(dev, slave, id, 1, RES_MPT, index);
+ if (err) {
+ mlx4_release_resource(dev, slave, RES_MPT, 1, 0);
+ __mlx4_mpt_release(dev, index);
+ break;
+ }
+ set_param_l(out_param, index);
+ break;
+ case RES_OP_MAP_ICM:
+ index = get_param_l(&in_param);
+ id = index & mpt_mask(dev);
+ err = mr_res_start_move_to(dev, slave, id,
+ RES_MPT_MAPPED, &mpt);
+ if (err)
+ return err;
+
+ err = __mlx4_mpt_alloc_icm(dev, mpt->key, GFP_KERNEL);
+ if (err) {
+ res_abort_move(dev, slave, RES_MPT, id);
+ return err;
+ }
+
+ res_end_move(dev, slave, RES_MPT, id);
+ break;
+ }
+ return err;
+}
+
+static int cq_alloc_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param)
+{
+ int cqn;
+ int err;
+
+ switch (op) {
+ case RES_OP_RESERVE_AND_MAP:
+ err = mlx4_grant_resource(dev, slave, RES_CQ, 1, 0);
+ if (err)
+ break;
+
+ err = __mlx4_cq_alloc_icm(dev, &cqn);
+ if (err) {
+ mlx4_release_resource(dev, slave, RES_CQ, 1, 0);
+ break;
+ }
+
+ err = add_res_range(dev, slave, cqn, 1, RES_CQ, 0);
+ if (err) {
+ mlx4_release_resource(dev, slave, RES_CQ, 1, 0);
+ __mlx4_cq_free_icm(dev, cqn);
+ break;
+ }
+
+ set_param_l(out_param, cqn);
+ break;
+
+ default:
+ err = -EINVAL;
+ }
+
+ return err;
+}
+
+static int srq_alloc_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param)
+{
+ int srqn;
+ int err;
+
+ switch (op) {
+ case RES_OP_RESERVE_AND_MAP:
+ err = mlx4_grant_resource(dev, slave, RES_SRQ, 1, 0);
+ if (err)
+ break;
+
+ err = __mlx4_srq_alloc_icm(dev, &srqn);
+ if (err) {
+ mlx4_release_resource(dev, slave, RES_SRQ, 1, 0);
+ break;
+ }
+
+ err = add_res_range(dev, slave, srqn, 1, RES_SRQ, 0);
+ if (err) {
+ mlx4_release_resource(dev, slave, RES_SRQ, 1, 0);
+ __mlx4_srq_free_icm(dev, srqn);
+ break;
+ }
+
+ set_param_l(out_param, srqn);
+ break;
+
+ default:
+ err = -EINVAL;
+ }
+
+ return err;
+}
+
+static int mac_find_smac_ix_in_slave(struct mlx4_dev *dev, int slave, int port,
+ u8 smac_index, u64 *mac)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct list_head *mac_list =
+ &tracker->slave_list[slave].res_list[RES_MAC];
+ struct mac_res *res, *tmp;
+
+ list_for_each_entry_safe(res, tmp, mac_list, list) {
+ if (res->smac_index == smac_index && res->port == (u8) port) {
+ *mac = res->mac;
+ return 0;
+ }
+ }
+ return -ENOENT;
+}
+
+static int mac_add_to_slave(struct mlx4_dev *dev, int slave, u64 mac, int port, u8 smac_index)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct list_head *mac_list =
+ &tracker->slave_list[slave].res_list[RES_MAC];
+ struct mac_res *res, *tmp;
+
+ list_for_each_entry_safe(res, tmp, mac_list, list) {
+ if (res->mac == mac && res->port == (u8) port) {
+ /* mac found. update ref count */
+ ++res->ref_count;
+ return 0;
+ }
+ }
+
+ if (mlx4_grant_resource(dev, slave, RES_MAC, 1, port))
+ return -EINVAL;
+ res = kzalloc(sizeof *res, GFP_KERNEL);
+ if (!res) {
+ mlx4_release_resource(dev, slave, RES_MAC, 1, port);
+ return -ENOMEM;
+ }
+ res->mac = mac;
+ res->port = (u8) port;
+ res->smac_index = smac_index;
+ res->ref_count = 1;
+ list_add_tail(&res->list,
+ &tracker->slave_list[slave].res_list[RES_MAC]);
+ return 0;
+}
+
+static void mac_del_from_slave(struct mlx4_dev *dev, int slave, u64 mac,
+ int port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct list_head *mac_list =
+ &tracker->slave_list[slave].res_list[RES_MAC];
+ struct mac_res *res, *tmp;
+
+ list_for_each_entry_safe(res, tmp, mac_list, list) {
+ if (res->mac == mac && res->port == (u8) port) {
+ if (!--res->ref_count) {
+ list_del(&res->list);
+ mlx4_release_resource(dev, slave, RES_MAC, 1, port);
+ kfree(res);
+ }
+ break;
+ }
+ }
+}
+
+static void rem_slave_macs(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct list_head *mac_list =
+ &tracker->slave_list[slave].res_list[RES_MAC];
+ struct mac_res *res, *tmp;
+ int i;
+
+ list_for_each_entry_safe(res, tmp, mac_list, list) {
+ list_del(&res->list);
+ /* drop one reference to the MAC for each time the slave referenced it */
+ for (i = 0; i < res->ref_count; i++)
+ __mlx4_unregister_mac(dev, res->port, res->mac);
+ mlx4_release_resource(dev, slave, RES_MAC, 1, res->port);
+ kfree(res);
+ }
+}
+
+static int mac_alloc_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param, int in_port)
+{
+ int err = -EINVAL;
+ int port;
+ u64 mac;
+ u8 smac_index;
+
+ if (op != RES_OP_RESERVE_AND_MAP)
+ return err;
+
+ port = !in_port ? get_param_l(out_param) : in_port;
+ port = mlx4_slave_convert_port(
+ dev, slave, port);
+
+ if (port < 0)
+ return -EINVAL;
+ mac = in_param;
+
+ err = __mlx4_register_mac(dev, port, mac);
+ if (err >= 0) {
+ smac_index = err;
+ set_param_l(out_param, err);
+ err = 0;
+ }
+
+ if (!err) {
+ err = mac_add_to_slave(dev, slave, mac, port, smac_index);
+ if (err)
+ __mlx4_unregister_mac(dev, port, mac);
+ }
+ return err;
+}
+
+static int vlan_add_to_slave(struct mlx4_dev *dev, int slave, u16 vlan,
+ int port, int vlan_index)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct list_head *vlan_list =
+ &tracker->slave_list[slave].res_list[RES_VLAN];
+ struct vlan_res *res, *tmp;
+
+ list_for_each_entry_safe(res, tmp, vlan_list, list) {
+ if (res->vlan == vlan && res->port == (u8) port) {
+ /* vlan found. update ref count */
+ ++res->ref_count;
+ return 0;
+ }
+ }
+
+ if (mlx4_grant_resource(dev, slave, RES_VLAN, 1, port))
+ return -EINVAL;
+ res = kzalloc(sizeof(*res), GFP_KERNEL);
+ if (!res) {
+ mlx4_release_resource(dev, slave, RES_VLAN, 1, port);
+ return -ENOMEM;
+ }
+ res->vlan = vlan;
+ res->port = (u8) port;
+ res->vlan_index = vlan_index;
+ res->ref_count = 1;
+ list_add_tail(&res->list,
+ &tracker->slave_list[slave].res_list[RES_VLAN]);
+ return 0;
+}
+
+
+static void vlan_del_from_slave(struct mlx4_dev *dev, int slave, u16 vlan,
+ int port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct list_head *vlan_list =
+ &tracker->slave_list[slave].res_list[RES_VLAN];
+ struct vlan_res *res, *tmp;
+
+ list_for_each_entry_safe(res, tmp, vlan_list, list) {
+ if (res->vlan == vlan && res->port == (u8) port) {
+ if (!--res->ref_count) {
+ list_del(&res->list);
+ mlx4_release_resource(dev, slave, RES_VLAN,
+ 1, port);
+ kfree(res);
+ }
+ break;
+ }
+ }
+}
+
+static void rem_slave_vlans(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct list_head *vlan_list =
+ &tracker->slave_list[slave].res_list[RES_VLAN];
+ struct vlan_res *res, *tmp;
+ int i;
+
+ list_for_each_entry_safe(res, tmp, vlan_list, list) {
+ list_del(&res->list);
+ /* drop one reference to the VLAN for each time the slave referenced it */
+ for (i = 0; i < res->ref_count; i++)
+ __mlx4_unregister_vlan(dev, res->port, res->vlan);
+ mlx4_release_resource(dev, slave, RES_VLAN, 1, res->port);
+ kfree(res);
+ }
+}
+
+static int vlan_alloc_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param, int in_port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_slave_state *slave_state = priv->mfunc.master.slave_state;
+ int err;
+ u16 vlan;
+ int vlan_index;
+ int port;
+
+ port = !in_port ? get_param_l(out_param) : in_port;
+
+ if (!port || op != RES_OP_RESERVE_AND_MAP)
+ return -EINVAL;
+
+ port = mlx4_slave_convert_port(
+ dev, slave, port);
+
+ if (port < 0)
+ return -EINVAL;
+ /* Upstream kernels treated VLAN reg/unreg as a no-op; keep that behavior. */
+ if (!in_port && port > 0 && port <= dev->caps.num_ports) {
+ slave_state[slave].old_vlan_api = true;
+ return 0;
+ }
+
+ vlan = (u16) in_param;
+
+ err = __mlx4_register_vlan(dev, port, vlan, &vlan_index);
+ if (!err) {
+ set_param_l(out_param, (u32) vlan_index);
+ err = vlan_add_to_slave(dev, slave, vlan, port, vlan_index);
+ if (err)
+ __mlx4_unregister_vlan(dev, port, vlan);
+ }
+ return err;
+}
+
+static int counter_alloc_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param, int port)
+{
+ u32 index;
+ int err;
+
+ if (op != RES_OP_RESERVE)
+ return -EINVAL;
+
+ if (port != 0)
+ port = mlx4_slave_convert_port(dev, slave, port);
+
+ if (port < 0)
+ return -EINVAL;
+
+ err = __mlx4_counter_alloc(dev, slave, port, &index);
+ if (!err)
+ set_param_l(out_param, index);
+
+ return err;
+}
+
+static int xrcdn_alloc_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param)
+{
+ u32 xrcdn;
+ int err;
+
+ if (op != RES_OP_RESERVE)
+ return -EINVAL;
+
+ err = __mlx4_xrcd_alloc(dev, &xrcdn);
+ if (err)
+ return err;
+
+ err = add_res_range(dev, slave, xrcdn, 1, RES_XRCD, 0);
+ if (err)
+ __mlx4_xrcd_free(dev, xrcdn);
+ else
+ set_param_l(out_param, xrcdn);
+
+ return err;
+}
+
+int mlx4_ALLOC_RES_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ int alop = vhcr->op_modifier;
+
+ switch (vhcr->in_modifier & 0xFF) {
+ case RES_QP:
+ err = qp_alloc_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param);
+ break;
+
+ case RES_MTT:
+ err = mtt_alloc_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param);
+ break;
+
+ case RES_MPT:
+ err = mpt_alloc_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param);
+ break;
+
+ case RES_CQ:
+ err = cq_alloc_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param);
+ break;
+
+ case RES_SRQ:
+ err = srq_alloc_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param);
+ break;
+
+ case RES_MAC:
+ err = mac_alloc_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param,
+ (vhcr->in_modifier >> 8) & 0xFF);
+ break;
+
+ case RES_VLAN:
+ err = vlan_alloc_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param,
+ (vhcr->in_modifier >> 8) & 0xFF);
+ break;
+
+ case RES_COUNTER:
+ err = counter_alloc_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param,
+ (vhcr->in_modifier >> 8) & 0xFF);
+ break;
+
+ case RES_XRCD:
+ err = xrcdn_alloc_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param);
+ break;
+
+ default:
+ err = -EINVAL;
+ break;
+ }
+
+ return err;
+}
+
+static int qp_free_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param)
+{
+ int err;
+ int count;
+ int base;
+ int qpn;
+
+ switch (op) {
+ case RES_OP_RESERVE:
+ base = get_param_l(&in_param) & 0x7fffff;
+ count = get_param_h(&in_param);
+ err = rem_res_range(dev, slave, base, count, RES_QP, 0);
+ if (err)
+ break;
+ mlx4_release_resource(dev, slave, RES_QP, count, 0);
+ __mlx4_qp_release_range(dev, base, count);
+ break;
+ case RES_OP_MAP_ICM:
+ qpn = get_param_l(&in_param) & 0x7fffff;
+ err = qp_res_start_move_to(dev, slave, qpn, RES_QP_RESERVED,
+ NULL, 0);
+ if (err)
+ return err;
+
+ if (!fw_reserved(dev, qpn))
+ __mlx4_qp_free_icm(dev, qpn);
+
+ res_end_move(dev, slave, RES_QP, qpn);
+
+ if (valid_reserved(dev, slave, qpn))
+ err = rem_res_range(dev, slave, qpn, 1, RES_QP, 0);
+ break;
+ default:
+ err = -EINVAL;
+ break;
+ }
+ return err;
+}
+
+static int mtt_free_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param)
+{
+ int err = -EINVAL;
+ int base;
+ int order;
+
+ if (op != RES_OP_RESERVE_AND_MAP)
+ return err;
+
+ base = get_param_l(&in_param);
+ order = get_param_h(&in_param);
+ err = rem_res_range(dev, slave, base, 1, RES_MTT, order);
+ if (!err) {
+ mlx4_release_resource(dev, slave, RES_MTT, 1 << order, 0);
+ __mlx4_free_mtt_range(dev, base, order);
+ }
+ return err;
+}
+
+static int mpt_free_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param)
+{
+ int err = -EINVAL;
+ int index;
+ int id;
+ struct res_mpt *mpt;
+
+ switch (op) {
+ case RES_OP_RESERVE:
+ index = get_param_l(&in_param);
+ id = index & mpt_mask(dev);
+ err = get_res(dev, slave, id, RES_MPT, &mpt);
+ if (err)
+ break;
+ index = mpt->key;
+ put_res(dev, slave, id, RES_MPT);
+
+ err = rem_res_range(dev, slave, id, 1, RES_MPT, 0);
+ if (err)
+ break;
+ mlx4_release_resource(dev, slave, RES_MPT, 1, 0);
+ __mlx4_mpt_release(dev, index);
+ break;
+ case RES_OP_MAP_ICM:
+ index = get_param_l(&in_param);
+ id = index & mpt_mask(dev);
+ err = mr_res_start_move_to(dev, slave, id,
+ RES_MPT_RESERVED, &mpt);
+ if (err)
+ return err;
+
+ __mlx4_mpt_free_icm(dev, mpt->key);
+ res_end_move(dev, slave, RES_MPT, id);
+ break;
+ default:
+ err = -EINVAL;
+ break;
+ }
+ return err;
+}
+
+static int cq_free_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param)
+{
+ int cqn;
+ int err;
+
+ switch (op) {
+ case RES_OP_RESERVE_AND_MAP:
+ cqn = get_param_l(&in_param);
+ err = rem_res_range(dev, slave, cqn, 1, RES_CQ, 0);
+ if (err)
+ break;
+
+ mlx4_release_resource(dev, slave, RES_CQ, 1, 0);
+ __mlx4_cq_free_icm(dev, cqn);
+ break;
+
+ default:
+ err = -EINVAL;
+ break;
+ }
+
+ return err;
+}
+
+static int srq_free_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param)
+{
+ int srqn;
+ int err;
+
+ switch (op) {
+ case RES_OP_RESERVE_AND_MAP:
+ srqn = get_param_l(&in_param);
+ err = rem_res_range(dev, slave, srqn, 1, RES_SRQ, 0);
+ if (err)
+ break;
+
+ mlx4_release_resource(dev, slave, RES_SRQ, 1, 0);
+ __mlx4_srq_free_icm(dev, srqn);
+ break;
+
+ default:
+ err = -EINVAL;
+ break;
+ }
+
+ return err;
+}
+
+static int mac_free_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param, int in_port)
+{
+ int port;
+ int err = 0;
+
+ switch (op) {
+ case RES_OP_RESERVE_AND_MAP:
+ port = !in_port ? get_param_l(out_param) : in_port;
+ port = mlx4_slave_convert_port(
+ dev, slave, port);
+
+ if (port < 0)
+ return -EINVAL;
+ mac_del_from_slave(dev, slave, in_param, port);
+ __mlx4_unregister_mac(dev, port, in_param);
+ break;
+ default:
+ err = -EINVAL;
+ break;
+ }
+
+ return err;
+
+}
+
+static int vlan_free_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param, int port)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_slave_state *slave_state = priv->mfunc.master.slave_state;
+ int err = 0;
+
+ port = mlx4_slave_convert_port(
+ dev, slave, port);
+
+ if (port < 0)
+ return -EINVAL;
+ switch (op) {
+ case RES_OP_RESERVE_AND_MAP:
+ if (slave_state[slave].old_vlan_api)
+ return 0;
+ if (!port)
+ return -EINVAL;
+ vlan_del_from_slave(dev, slave, in_param, port);
+ __mlx4_unregister_vlan(dev, port, in_param);
+ break;
+ default:
+ err = -EINVAL;
+ break;
+ }
+
+ return err;
+}
+
+static int counter_free_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param, int port)
+{
+ int index;
+
+ if (op != RES_OP_RESERVE)
+ return -EINVAL;
+
+ index = get_param_l(&in_param);
+
+ __mlx4_counter_free(dev, slave, port, index);
+
+ return 0;
+}
+
+static int xrcdn_free_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param)
+{
+ int xrcdn;
+ int err;
+
+ if (op != RES_OP_RESERVE)
+ return -EINVAL;
+
+ xrcdn = get_param_l(&in_param);
+ err = rem_res_range(dev, slave, xrcdn, 1, RES_XRCD, 0);
+ if (err)
+ return err;
+
+ __mlx4_xrcd_free(dev, xrcdn);
+
+ return err;
+}
+
+int mlx4_FREE_RES_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err = -EINVAL;
+ int alop = vhcr->op_modifier;
+
+ switch (vhcr->in_modifier & 0xFF) {
+ case RES_QP:
+ err = qp_free_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param);
+ break;
+
+ case RES_MTT:
+ err = mtt_free_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param);
+ break;
+
+ case RES_MPT:
+ err = mpt_free_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param);
+ break;
+
+ case RES_CQ:
+ err = cq_free_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param);
+ break;
+
+ case RES_SRQ:
+ err = srq_free_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param);
+ break;
+
+ case RES_MAC:
+ err = mac_free_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param,
+ (vhcr->in_modifier >> 8) & 0xFF);
+ break;
+
+ case RES_VLAN:
+ err = vlan_free_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param,
+ (vhcr->in_modifier >> 8) & 0xFF);
+ break;
+
+ case RES_COUNTER:
+ err = counter_free_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param,
+ (vhcr->in_modifier >> 8) & 0xFF);
+ break;
+
+ case RES_XRCD:
+ err = xrcdn_free_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param);
+ break;
+
+ default:
+ break;
+ }
+ return err;
+}
+
+/* ugly but other choices are uglier */
+static int mr_phys_mpt(struct mlx4_mpt_entry *mpt)
+{
+ return (be32_to_cpu(mpt->flags) >> 9) & 1;
+}
+
+static int mr_get_mtt_addr(struct mlx4_mpt_entry *mpt)
+{
+ return (int)be64_to_cpu(mpt->mtt_addr) & 0xfffffff8;
+}
+
+static int mr_get_mtt_size(struct mlx4_mpt_entry *mpt)
+{
+ return be32_to_cpu(mpt->mtt_sz);
+}
+
+static u32 mr_get_pd(struct mlx4_mpt_entry *mpt)
+{
+ return be32_to_cpu(mpt->pd_flags) & 0x00ffffff;
+}
+
+static int mr_is_fmr(struct mlx4_mpt_entry *mpt)
+{
+ return be32_to_cpu(mpt->pd_flags) & MLX4_MPT_PD_FLAG_FAST_REG;
+}
+
+static int mr_is_bind_enabled(struct mlx4_mpt_entry *mpt)
+{
+ return be32_to_cpu(mpt->flags) & MLX4_MPT_FLAG_BIND_ENABLE;
+}
+
+static int mr_is_region(struct mlx4_mpt_entry *mpt)
+{
+ return be32_to_cpu(mpt->flags) & MLX4_MPT_FLAG_REGION;
+}
+
+static int qp_get_mtt_addr(struct mlx4_qp_context *qpc)
+{
+ return be32_to_cpu(qpc->mtt_base_addr_l) & 0xfffffff8;
+}
+
+static int srq_get_mtt_addr(struct mlx4_srq_context *srqc)
+{
+ return be32_to_cpu(srqc->mtt_base_addr_l) & 0xfffffff8;
+}
+
+static int qp_get_mtt_size(struct mlx4_qp_context *qpc)
+{
+ int page_shift = (qpc->log_page_size & 0x3f) + 12;
+ int log_sq_size = (qpc->sq_size_stride >> 3) & 0xf;
+ int log_sq_stride = qpc->sq_size_stride & 7;
+ int log_rq_size = (qpc->rq_size_stride >> 3) & 0xf;
+ int log_rq_stride = qpc->rq_size_stride & 7;
+ int srq = (be32_to_cpu(qpc->srqn) >> 24) & 1;
+ int rss = (be32_to_cpu(qpc->flags) >> 13) & 1;
+ u32 ts = (be32_to_cpu(qpc->flags) >> 16) & 0xff;
+ int xrc = (ts == MLX4_QP_ST_XRC) ? 1 : 0;
+ int sq_size;
+ int rq_size;
+ int total_pages;
+ int total_mem;
+ int page_offset = (be32_to_cpu(qpc->params2) >> 6) & 0x3f;
+
+ sq_size = 1 << (log_sq_size + log_sq_stride + 4);
+ rq_size = (srq|rss|xrc) ? 0 : (1 << (log_rq_size + log_rq_stride + 4));
+ total_mem = sq_size + rq_size;
+ total_pages =
+ roundup_pow_of_two((total_mem + (page_offset << 6)) >>
+ page_shift);
+
+ return total_pages;
+}
+
+static int check_mtt_range(struct mlx4_dev *dev, int slave, int start,
+ int size, struct res_mtt *mtt)
+{
+ int res_start = mtt->com.res_id;
+ int res_size = (1 << mtt->order);
+
+ if (start < res_start || start + size > res_start + res_size)
+ return -EPERM;
+ return 0;
+}
+
+int mlx4_SW2HW_MPT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ int index = vhcr->in_modifier;
+ struct res_mtt *mtt;
+ struct res_mpt *mpt;
+ int mtt_base = mr_get_mtt_addr(inbox->buf) / dev->caps.mtt_entry_sz;
+ int phys;
+ int id;
+ u32 pd;
+ int pd_slave;
+
+ id = index & mpt_mask(dev);
+ err = mr_res_start_move_to(dev, slave, id, RES_MPT_HW, &mpt);
+ if (err)
+ return err;
+
+ /* Disable memory windows for VFs. */
+ if (!mr_is_region(inbox->buf)) {
+ err = -EPERM;
+ goto ex_abort;
+ }
+
+ /* Make sure that the PD bits related to the slave id are zeros. */
+ pd = mr_get_pd(inbox->buf);
+ pd_slave = (pd >> 17) & 0x7f;
+ if (pd_slave != 0 && --pd_slave != slave) {
+ err = -EPERM;
+ goto ex_abort;
+ }
+
+ if (mr_is_fmr(inbox->buf)) {
+ /* FMR and Bind Enable are forbidden in slave devices. */
+ if (mr_is_bind_enabled(inbox->buf)) {
+ err = -EPERM;
+ goto ex_abort;
+ }
+ /* FMR and Memory Windows are also forbidden. */
+ if (!mr_is_region(inbox->buf)) {
+ err = -EPERM;
+ goto ex_abort;
+ }
+ }
+
+ phys = mr_phys_mpt(inbox->buf);
+ if (!phys) {
+ err = get_res(dev, slave, mtt_base, RES_MTT, &mtt);
+ if (err)
+ goto ex_abort;
+
+ err = check_mtt_range(dev, slave, mtt_base,
+ mr_get_mtt_size(inbox->buf), mtt);
+ if (err)
+ goto ex_put;
+
+ mpt->mtt = mtt;
+ }
+
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+ if (err)
+ goto ex_put;
+
+ if (!phys) {
+ atomic_inc(&mtt->ref_count);
+ put_res(dev, slave, mtt->com.res_id, RES_MTT);
+ }
+
+ res_end_move(dev, slave, RES_MPT, id);
+ return 0;
+
+ex_put:
+ if (!phys)
+ put_res(dev, slave, mtt->com.res_id, RES_MTT);
+ex_abort:
+ res_abort_move(dev, slave, RES_MPT, id);
+
+ return err;
+}
+
+int mlx4_HW2SW_MPT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ int index = vhcr->in_modifier;
+ struct res_mpt *mpt;
+ int id;
+
+ id = index & mpt_mask(dev);
+ err = mr_res_start_move_to(dev, slave, id, RES_MPT_MAPPED, &mpt);
+ if (err)
+ return err;
+
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+ if (err)
+ goto ex_abort;
+
+ if (mpt->mtt)
+ atomic_dec(&mpt->mtt->ref_count);
+
+ res_end_move(dev, slave, RES_MPT, id);
+ return 0;
+
+ex_abort:
+ res_abort_move(dev, slave, RES_MPT, id);
+
+ return err;
+}
+
+int mlx4_QUERY_MPT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ int index = vhcr->in_modifier;
+ struct res_mpt *mpt;
+ int id;
+
+ id = index & mpt_mask(dev);
+ err = get_res(dev, slave, id, RES_MPT, &mpt);
+ if (err)
+ return err;
+
+ if (mpt->com.from_state == RES_MPT_MAPPED) {
+ /* In order to allow rereg in SRIOV, we need to alter the MPT entry. To do
+ * that, the VF must read the MPT. But since the MPT entry memory is not
+ * in the VF's virtual memory space, it must use QUERY_MPT to obtain the
+ * entry contents. To guarantee that the MPT cannot be changed, the driver
+ * must perform HW2SW_MPT before this query and return the MPT entry to HW
+ * ownership following the change. The change here allows the VF to
+ * perform QUERY_MPT also when the entry is in SW ownership.
+ */
+ struct mlx4_mpt_entry *mpt_entry = mlx4_table_find(
+ &mlx4_priv(dev)->mr_table.dmpt_table,
+ mpt->key, NULL);
+
+ if (NULL == mpt_entry || NULL == outbox->buf) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ memcpy(outbox->buf, mpt_entry, sizeof(*mpt_entry));
+
+ err = 0;
+ } else if (mpt->com.from_state == RES_MPT_HW) {
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+ } else {
+ err = -EBUSY;
+ goto out;
+ }
+
+out:
+ put_res(dev, slave, id, RES_MPT);
+ return err;
+}
+
+static int qp_get_rcqn(struct mlx4_qp_context *qpc)
+{
+ return be32_to_cpu(qpc->cqn_recv) & 0xffffff;
+}
+
+static int qp_get_scqn(struct mlx4_qp_context *qpc)
+{
+ return be32_to_cpu(qpc->cqn_send) & 0xffffff;
+}
+
+static u32 qp_get_srqn(struct mlx4_qp_context *qpc)
+{
+ return be32_to_cpu(qpc->srqn) & 0x1ffffff;
+}
+
+static void adjust_proxy_tun_qkey(struct mlx4_dev *dev, struct mlx4_vhcr *vhcr,
+ struct mlx4_qp_context *context)
+{
+ u32 qpn = vhcr->in_modifier & 0xffffff;
+ u32 qkey = 0;
+
+ if (mlx4_get_parav_qkey(dev, qpn, &qkey))
+ return;
+
+ /* adjust qkey in qp context */
+ context->qkey = cpu_to_be32(qkey);
+}
+
+int mlx4_RST2INIT_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ int qpn = vhcr->in_modifier & 0x7fffff;
+ struct res_mtt *mtt;
+ struct res_qp *qp;
+ struct mlx4_qp_context *qpc = inbox->buf + 8;
+ int mtt_base = qp_get_mtt_addr(qpc) / dev->caps.mtt_entry_sz;
+ int mtt_size = qp_get_mtt_size(qpc);
+ struct res_cq *rcq;
+ struct res_cq *scq;
+ int rcqn = qp_get_rcqn(qpc);
+ int scqn = qp_get_scqn(qpc);
+ u32 srqn = qp_get_srqn(qpc) & 0xffffff;
+ int use_srq = (qp_get_srqn(qpc) >> 24) & 1;
+ struct res_srq *srq;
+ int local_qpn = be32_to_cpu(qpc->local_qpn) & 0xffffff;
+
+ err = qp_res_start_move_to(dev, slave, qpn, RES_QP_HW, &qp, 0);
+ if (err)
+ return err;
+ qp->local_qpn = local_qpn;
+ qp->sched_queue = 0;
+ qp->param3 = 0;
+ qp->vlan_control = 0;
+ qp->fvl_rx = 0;
+ qp->pri_path_fl = 0;
+ qp->vlan_index = 0;
+ qp->feup = 0;
+ qp->qpc_flags = be32_to_cpu(qpc->flags);
+
+ err = get_res(dev, slave, mtt_base, RES_MTT, &mtt);
+ if (err)
+ goto ex_abort;
+
+ err = check_mtt_range(dev, slave, mtt_base, mtt_size, mtt);
+ if (err)
+ goto ex_put_mtt;
+
+ err = get_res(dev, slave, rcqn, RES_CQ, &rcq);
+ if (err)
+ goto ex_put_mtt;
+
+ if (scqn != rcqn) {
+ err = get_res(dev, slave, scqn, RES_CQ, &scq);
+ if (err)
+ goto ex_put_rcq;
+ } else
+ scq = rcq;
+
+ if (use_srq) {
+ err = get_res(dev, slave, srqn, RES_SRQ, &srq);
+ if (err)
+ goto ex_put_scq;
+ }
+
+ adjust_proxy_tun_qkey(dev, vhcr, qpc);
+ update_pkey_index(dev, slave, inbox);
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+ if (err)
+ goto ex_put_srq;
+ atomic_inc(&mtt->ref_count);
+ qp->mtt = mtt;
+ atomic_inc(&rcq->ref_count);
+ qp->rcq = rcq;
+ atomic_inc(&scq->ref_count);
+ qp->scq = scq;
+
+ if (scqn != rcqn)
+ put_res(dev, slave, scqn, RES_CQ);
+
+ if (use_srq) {
+ atomic_inc(&srq->ref_count);
+ put_res(dev, slave, srqn, RES_SRQ);
+ qp->srq = srq;
+ }
+ put_res(dev, slave, rcqn, RES_CQ);
+ put_res(dev, slave, mtt_base, RES_MTT);
+ res_end_move(dev, slave, RES_QP, qpn);
+
+ return 0;
+
+ex_put_srq:
+ if (use_srq)
+ put_res(dev, slave, srqn, RES_SRQ);
+ex_put_scq:
+ if (scqn != rcqn)
+ put_res(dev, slave, scqn, RES_CQ);
+ex_put_rcq:
+ put_res(dev, slave, rcqn, RES_CQ);
+ex_put_mtt:
+ put_res(dev, slave, mtt_base, RES_MTT);
+ex_abort:
+ res_abort_move(dev, slave, RES_QP, qpn);
+
+ return err;
+}
+
+static int eq_get_mtt_addr(struct mlx4_eq_context *eqc)
+{
+ return be32_to_cpu(eqc->mtt_base_addr_l) & 0xfffffff8;
+}
+
+static int eq_get_mtt_size(struct mlx4_eq_context *eqc)
+{
+ int log_eq_size = eqc->log_eq_size & 0x1f;
+ int page_shift = (eqc->log_page_size & 0x3f) + 12;
+
+ if (log_eq_size + 5 < page_shift)
+ return 1;
+
+ return 1 << (log_eq_size + 5 - page_shift);
+}
+
+static int cq_get_mtt_addr(struct mlx4_cq_context *cqc)
+{
+ return be32_to_cpu(cqc->mtt_base_addr_l) & 0xfffffff8;
+}
+
+static int cq_get_mtt_size(struct mlx4_cq_context *cqc)
+{
+ int log_cq_size = (be32_to_cpu(cqc->logsize_usrpage) >> 24) & 0x1f;
+ int page_shift = (cqc->log_page_size & 0x3f) + 12;
+
+ if (log_cq_size + 5 < page_shift)
+ return 1;
+
+ return 1 << (log_cq_size + 5 - page_shift);
+}
+
+int mlx4_SW2HW_EQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ int eqn = vhcr->in_modifier;
+ int res_id = (slave << 10) | eqn;
+ struct mlx4_eq_context *eqc = inbox->buf;
+ int mtt_base = eq_get_mtt_addr(eqc) / dev->caps.mtt_entry_sz;
+ int mtt_size = eq_get_mtt_size(eqc);
+ struct res_eq *eq;
+ struct res_mtt *mtt;
+
+ err = add_res_range(dev, slave, res_id, 1, RES_EQ, 0);
+ if (err)
+ return err;
+ err = eq_res_start_move_to(dev, slave, res_id, RES_EQ_HW, &eq);
+ if (err)
+ goto out_add;
+
+ err = get_res(dev, slave, mtt_base, RES_MTT, &mtt);
+ if (err)
+ goto out_move;
+
+ err = check_mtt_range(dev, slave, mtt_base, mtt_size, mtt);
+ if (err)
+ goto out_put;
+
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+ if (err)
+ goto out_put;
+
+ atomic_inc(&mtt->ref_count);
+ eq->mtt = mtt;
+ put_res(dev, slave, mtt->com.res_id, RES_MTT);
+ res_end_move(dev, slave, RES_EQ, res_id);
+ return 0;
+
+out_put:
+ put_res(dev, slave, mtt->com.res_id, RES_MTT);
+out_move:
+ res_abort_move(dev, slave, RES_EQ, res_id);
+out_add:
+ rem_res_range(dev, slave, res_id, 1, RES_EQ, 0);
+ return err;
+}
+
+int mlx4_CONFIG_DEV_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ u8 get = vhcr->op_modifier;
+
+ if (get != 1)
+ return -EPERM;
+
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+
+ return err;
+}
+
+static int get_containing_mtt(struct mlx4_dev *dev, int slave, int start,
+ int len, struct res_mtt **res)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct res_mtt *mtt;
+ int err = -EINVAL;
+
+ spin_lock_irq(mlx4_tlock(dev));
+ list_for_each_entry(mtt, &tracker->slave_list[slave].res_list[RES_MTT],
+ com.list) {
+ if (!check_mtt_range(dev, slave, start, len, mtt)) {
+ *res = mtt;
+ mtt->com.from_state = mtt->com.state;
+ mtt->com.state = RES_MTT_BUSY;
+ err = 0;
+ break;
+ }
+ }
+ spin_unlock_irq(mlx4_tlock(dev));
+
+ return err;
+}
+
+static int verify_qp_parameters(struct mlx4_dev *dev,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ enum qp_transition transition, u8 slave)
+{
+ u32 qp_type;
+ u32 qpn;
+ struct mlx4_qp_context *qp_ctx;
+ enum mlx4_qp_optpar optpar;
+ int port;
+ int num_gids;
+
+ qp_ctx = inbox->buf + 8;
+ qp_type = (be32_to_cpu(qp_ctx->flags) >> 16) & 0xff;
+ optpar = be32_to_cpu(*(__be32 *) inbox->buf);
+
+ if (slave != mlx4_master_func_num(dev))
+ qp_ctx->params2 &= ~MLX4_QP_BIT_FPP;
+
+ switch (qp_type) {
+ case MLX4_QP_ST_RC:
+ case MLX4_QP_ST_XRC:
+ case MLX4_QP_ST_UC:
+ switch (transition) {
+ case QP_TRANS_INIT2RTR:
+ case QP_TRANS_RTR2RTS:
+ case QP_TRANS_RTS2RTS:
+ case QP_TRANS_SQD2SQD:
+ case QP_TRANS_SQD2RTS:
+ if (slave != mlx4_master_func_num(dev)) {
+ if (optpar & MLX4_QP_OPTPAR_PRIMARY_ADDR_PATH) {
+ port = (qp_ctx->pri_path.sched_queue >> 6 & 1) + 1;
+ if (dev->caps.port_mask[port] != MLX4_PORT_TYPE_IB)
+ num_gids = mlx4_get_slave_num_gids(dev, slave, port);
+ else
+ num_gids = 1;
+ if (qp_ctx->pri_path.mgid_index >= num_gids)
+ return -EINVAL;
+ }
+ if (optpar & MLX4_QP_OPTPAR_ALT_ADDR_PATH) {
+ port = (qp_ctx->alt_path.sched_queue >> 6 & 1) + 1;
+ if (dev->caps.port_mask[port] != MLX4_PORT_TYPE_IB)
+ num_gids = mlx4_get_slave_num_gids(dev, slave, port);
+ else
+ num_gids = 1;
+ if (qp_ctx->alt_path.mgid_index >= num_gids)
+ return -EINVAL;
+ }
+ }
+ break;
+ default:
+ break;
+ }
+ break;
+
+ case MLX4_QP_ST_MLX:
+ qpn = vhcr->in_modifier & 0x7fffff;
+ port = (qp_ctx->pri_path.sched_queue >> 6 & 1) + 1;
+ if (transition == QP_TRANS_INIT2RTR &&
+ slave != mlx4_master_func_num(dev) &&
+ mlx4_is_qp_reserved(dev, qpn) &&
+ !mlx4_vf_smi_enabled(dev, slave, port)) {
+ /* only enabled VFs may create MLX proxy QPs */
+ mlx4_err(dev, "%s: unprivileged slave %d attempting to create an MLX proxy special QP on port %d\n",
+ __func__, slave, port);
+ return -EPERM;
+ }
+ break;
+
+ default:
+ break;
+ }
+
+ return 0;
+}
+
+int mlx4_WRITE_MTT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ struct mlx4_mtt mtt;
+ __be64 *page_list = inbox->buf;
+ u64 *pg_list = (u64 *)page_list;
+ int i;
+ struct res_mtt *rmtt = NULL;
+ int start = be64_to_cpu(page_list[0]);
+ int npages = vhcr->in_modifier;
+ int err;
+
+ err = get_containing_mtt(dev, slave, start, npages, &rmtt);
+ if (err)
+ return err;
+
+ /* Call the SW implementation of write_mtt:
+ * - Prepare a dummy mtt struct
+ * - Translate inbox contents to simple addresses in host endianness */
+ mtt.offset = 0; /* TBD this is broken but I don't handle it since
+ we don't really use it */
+ mtt.order = 0;
+ mtt.page_shift = 0;
+ for (i = 0; i < npages; ++i)
+ pg_list[i + 2] = (be64_to_cpu(page_list[i + 2]) & ~1ULL);
+
+ err = __mlx4_write_mtt(dev, &mtt, be64_to_cpu(page_list[0]), npages,
+ ((u64 *)page_list + 2));
+
+ if (rmtt)
+ put_res(dev, slave, rmtt->com.res_id, RES_MTT);
+
+ return err;
+}
+
+int mlx4_HW2SW_EQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int eqn = vhcr->in_modifier;
+ int res_id = eqn | (slave << 10);
+ struct res_eq *eq;
+ int err;
+
+ err = eq_res_start_move_to(dev, slave, res_id, RES_EQ_RESERVED, &eq);
+ if (err)
+ return err;
+
+ err = get_res(dev, slave, eq->mtt->com.res_id, RES_MTT, NULL);
+ if (err)
+ goto ex_abort;
+
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+ if (err)
+ goto ex_put;
+
+ atomic_dec(&eq->mtt->ref_count);
+ put_res(dev, slave, eq->mtt->com.res_id, RES_MTT);
+ res_end_move(dev, slave, RES_EQ, res_id);
+ rem_res_range(dev, slave, res_id, 1, RES_EQ, 0);
+
+ return 0;
+
+ex_put:
+ put_res(dev, slave, eq->mtt->com.res_id, RES_MTT);
+ex_abort:
+ res_abort_move(dev, slave, RES_EQ, res_id);
+
+ return err;
+}
+
+int mlx4_GEN_EQE(struct mlx4_dev *dev, int slave, struct mlx4_eqe *eqe)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_slave_event_eq_info *event_eq;
+ struct mlx4_cmd_mailbox *mailbox;
+ u32 in_modifier = 0;
+ int err;
+ int res_id;
+ struct res_eq *req;
+
+ if (!priv->mfunc.master.slave_state)
+ return -EINVAL;
+
+ /* check for slave valid, slave not PF, and slave active */
+ if (slave < 0 || slave >= dev->num_slaves ||
+ slave == dev->caps.function ||
+ !priv->mfunc.master.slave_state[slave].active)
+ return 0;
+
+ event_eq = &priv->mfunc.master.slave_state[slave].event_eq[eqe->type];
+
+ /* Create the event only if the slave is registered */
+ if (event_eq->eqn < 0)
+ return 0;
+
+ mutex_lock(&priv->mfunc.master.gen_eqe_mutex[slave]);
+ res_id = (slave << 10) | event_eq->eqn;
+ err = get_res(dev, slave, res_id, RES_EQ, &req);
+ if (err)
+ goto unlock;
+
+ if (req->com.from_state != RES_EQ_HW) {
+ err = -EINVAL;
+ goto put;
+ }
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox)) {
+ err = PTR_ERR(mailbox);
+ goto put;
+ }
+
+ if (eqe->type == MLX4_EVENT_TYPE_CMD) {
+ ++event_eq->token;
+ eqe->event.cmd.token = cpu_to_be16(event_eq->token);
+ }
+
+ memcpy(mailbox->buf, (u8 *) eqe, 28);
+
+ in_modifier = (slave & 0xff) | ((event_eq->eqn & 0x3ff) << 16);
+
+ err = mlx4_cmd(dev, mailbox->dma, in_modifier, 0,
+ MLX4_CMD_GEN_EQE, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_NATIVE);
+
+ put_res(dev, slave, res_id, RES_EQ);
+ mutex_unlock(&priv->mfunc.master.gen_eqe_mutex[slave]);
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+
+put:
+ put_res(dev, slave, res_id, RES_EQ);
+
+unlock:
+ mutex_unlock(&priv->mfunc.master.gen_eqe_mutex[slave]);
+ return err;
+}
+
+int mlx4_QUERY_EQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int eqn = vhcr->in_modifier;
+ int res_id = eqn | (slave << 10);
+ struct res_eq *eq;
+ int err;
+
+ err = get_res(dev, slave, res_id, RES_EQ, &eq);
+ if (err)
+ return err;
+
+ if (eq->com.from_state != RES_EQ_HW) {
+ err = -EINVAL;
+ goto ex_put;
+ }
+
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+
+ex_put:
+ put_res(dev, slave, res_id, RES_EQ);
+ return err;
+}
+
+int mlx4_SW2HW_CQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ int cqn = vhcr->in_modifier;
+ struct mlx4_cq_context *cqc = inbox->buf;
+ int mtt_base = cq_get_mtt_addr(cqc) / dev->caps.mtt_entry_sz;
+ struct res_cq *cq;
+ struct res_mtt *mtt;
+
+ err = cq_res_start_move_to(dev, slave, cqn, RES_CQ_HW, &cq);
+ if (err)
+ return err;
+ err = get_res(dev, slave, mtt_base, RES_MTT, &mtt);
+ if (err)
+ goto out_move;
+ err = check_mtt_range(dev, slave, mtt_base, cq_get_mtt_size(cqc), mtt);
+ if (err)
+ goto out_put;
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+ if (err)
+ goto out_put;
+ atomic_inc(&mtt->ref_count);
+ cq->mtt = mtt;
+ put_res(dev, slave, mtt->com.res_id, RES_MTT);
+ res_end_move(dev, slave, RES_CQ, cqn);
+ return 0;
+
+out_put:
+ put_res(dev, slave, mtt->com.res_id, RES_MTT);
+out_move:
+ res_abort_move(dev, slave, RES_CQ, cqn);
+ return err;
+}
+
+int mlx4_HW2SW_CQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ int cqn = vhcr->in_modifier;
+ struct res_cq *cq;
+
+ err = cq_res_start_move_to(dev, slave, cqn, RES_CQ_ALLOCATED, &cq);
+ if (err)
+ return err;
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+ if (err)
+ goto out_move;
+ atomic_dec(&cq->mtt->ref_count);
+ res_end_move(dev, slave, RES_CQ, cqn);
+ return 0;
+
+out_move:
+ res_abort_move(dev, slave, RES_CQ, cqn);
+ return err;
+}
+
+int mlx4_QUERY_CQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int cqn = vhcr->in_modifier;
+ struct res_cq *cq;
+ int err;
+
+ err = get_res(dev, slave, cqn, RES_CQ, &cq);
+ if (err)
+ return err;
+
+ if (cq->com.from_state != RES_CQ_HW)
+ goto ex_put;
+
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+ex_put:
+ put_res(dev, slave, cqn, RES_CQ);
+
+ return err;
+}
+
+static int handle_resize(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd,
+ struct res_cq *cq)
+{
+ int err;
+ struct res_mtt *orig_mtt;
+ struct res_mtt *mtt;
+ struct mlx4_cq_context *cqc = inbox->buf;
+ int mtt_base = cq_get_mtt_addr(cqc) / dev->caps.mtt_entry_sz;
+
+ err = get_res(dev, slave, cq->mtt->com.res_id, RES_MTT, &orig_mtt);
+ if (err)
+ return err;
+
+ if (orig_mtt != cq->mtt) {
+ err = -EINVAL;
+ goto ex_put;
+ }
+
+ err = get_res(dev, slave, mtt_base, RES_MTT, &mtt);
+ if (err)
+ goto ex_put;
+
+ err = check_mtt_range(dev, slave, mtt_base, cq_get_mtt_size(cqc), mtt);
+ if (err)
+ goto ex_put1;
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+ if (err)
+ goto ex_put1;
+ atomic_dec(&orig_mtt->ref_count);
+ put_res(dev, slave, orig_mtt->com.res_id, RES_MTT);
+ atomic_inc(&mtt->ref_count);
+ cq->mtt = mtt;
+ put_res(dev, slave, mtt->com.res_id, RES_MTT);
+ return 0;
+
+ex_put1:
+ put_res(dev, slave, mtt->com.res_id, RES_MTT);
+ex_put:
+ put_res(dev, slave, orig_mtt->com.res_id, RES_MTT);
+
+ return err;
+
+}
+
+int mlx4_MODIFY_CQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int cqn = vhcr->in_modifier;
+ struct res_cq *cq;
+ int err;
+
+ err = get_res(dev, slave, cqn, RES_CQ, &cq);
+ if (err)
+ return err;
+
+ if (cq->com.from_state != RES_CQ_HW)
+ goto ex_put;
+
+ if (vhcr->op_modifier == 0) {
+ err = handle_resize(dev, slave, vhcr, inbox, outbox, cmd, cq);
+ goto ex_put;
+ }
+
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+ex_put:
+ put_res(dev, slave, cqn, RES_CQ);
+
+ return err;
+}
+
+static int srq_get_mtt_size(struct mlx4_srq_context *srqc)
+{
+ int log_srq_size = (be32_to_cpu(srqc->state_logsize_srqn) >> 24) & 0xf;
+ int log_rq_stride = srqc->logstride & 7;
+ int page_shift = (srqc->log_page_size & 0x3f) + 12;
+
+ if (log_srq_size + log_rq_stride + 4 < page_shift)
+ return 1;
+
+ return 1 << (log_srq_size + log_rq_stride + 4 - page_shift);
+}
+
+int mlx4_SW2HW_SRQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ int srqn = vhcr->in_modifier;
+ struct res_mtt *mtt;
+ struct res_srq *srq;
+ struct mlx4_srq_context *srqc = inbox->buf;
+ int mtt_base = srq_get_mtt_addr(srqc) / dev->caps.mtt_entry_sz;
+
+ if (srqn != (be32_to_cpu(srqc->state_logsize_srqn) & 0xffffff))
+ return -EINVAL;
+
+ err = srq_res_start_move_to(dev, slave, srqn, RES_SRQ_HW, &srq);
+ if (err)
+ return err;
+ err = get_res(dev, slave, mtt_base, RES_MTT, &mtt);
+ if (err)
+ goto ex_abort;
+ err = check_mtt_range(dev, slave, mtt_base, srq_get_mtt_size(srqc),
+ mtt);
+ if (err)
+ goto ex_put_mtt;
+
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+ if (err)
+ goto ex_put_mtt;
+
+ atomic_inc(&mtt->ref_count);
+ srq->mtt = mtt;
+ put_res(dev, slave, mtt->com.res_id, RES_MTT);
+ res_end_move(dev, slave, RES_SRQ, srqn);
+ return 0;
+
+ex_put_mtt:
+ put_res(dev, slave, mtt->com.res_id, RES_MTT);
+ex_abort:
+ res_abort_move(dev, slave, RES_SRQ, srqn);
+
+ return err;
+}
+
+int mlx4_HW2SW_SRQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ int srqn = vhcr->in_modifier;
+ struct res_srq *srq;
+
+ err = srq_res_start_move_to(dev, slave, srqn, RES_SRQ_ALLOCATED, &srq);
+ if (err)
+ return err;
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+ if (err)
+ goto ex_abort;
+ atomic_dec(&srq->mtt->ref_count);
+ if (srq->cq)
+ atomic_dec(&srq->cq->ref_count);
+ res_end_move(dev, slave, RES_SRQ, srqn);
+
+ return 0;
+
+ex_abort:
+ res_abort_move(dev, slave, RES_SRQ, srqn);
+
+ return err;
+}
+
+int mlx4_QUERY_SRQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ int srqn = vhcr->in_modifier;
+ struct res_srq *srq;
+
+ err = get_res(dev, slave, srqn, RES_SRQ, &srq);
+ if (err)
+ return err;
+ if (srq->com.from_state != RES_SRQ_HW) {
+ err = -EBUSY;
+ goto out;
+ }
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+out:
+ put_res(dev, slave, srqn, RES_SRQ);
+ return err;
+}
+
+int mlx4_ARM_SRQ_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ int srqn = vhcr->in_modifier;
+ struct res_srq *srq;
+
+ err = get_res(dev, slave, srqn, RES_SRQ, &srq);
+ if (err)
+ return err;
+
+ if (srq->com.from_state != RES_SRQ_HW) {
+ err = -EBUSY;
+ goto out;
+ }
+
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+out:
+ put_res(dev, slave, srqn, RES_SRQ);
+ return err;
+}
+
+int mlx4_GEN_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ int qpn = vhcr->in_modifier & 0x7fffff;
+ struct res_qp *qp;
+
+ err = get_res(dev, slave, qpn, RES_QP, &qp);
+ if (err)
+ return err;
+ if (qp->com.from_state != RES_QP_HW) {
+ err = -EBUSY;
+ goto out;
+ }
+
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+out:
+ put_res(dev, slave, qpn, RES_QP);
+ return err;
+}
+
+int mlx4_INIT2INIT_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ struct mlx4_qp_context *context = inbox->buf + 8;
+ adjust_proxy_tun_qkey(dev, vhcr, context);
+ update_pkey_index(dev, slave, inbox);
+ return mlx4_GEN_QP_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+}
+
+static int adjust_qp_sched_queue(struct mlx4_dev *dev, int slave,
+ struct mlx4_qp_context *qpc,
+ struct mlx4_cmd_mailbox *inbox)
+{
+ enum mlx4_qp_optpar optpar = be32_to_cpu(*(__be32 *)inbox->buf);
+ u8 pri_sched_queue;
+ int port = mlx4_slave_convert_port(
+ dev, slave, (qpc->pri_path.sched_queue >> 6 & 1) + 1) - 1;
+
+ if (port < 0)
+ return -EINVAL;
+
+ pri_sched_queue = (qpc->pri_path.sched_queue & ~(1 << 6)) |
+ ((port & 1) << 6);
+
+ if (optpar & MLX4_QP_OPTPAR_PRIMARY_ADDR_PATH ||
+ mlx4_is_eth(dev, port + 1)) {
+ qpc->pri_path.sched_queue = pri_sched_queue;
+ }
+
+ if (optpar & MLX4_QP_OPTPAR_ALT_ADDR_PATH) {
+ port = mlx4_slave_convert_port(
+ dev, slave, (qpc->alt_path.sched_queue >> 6 & 1)
+ + 1) - 1;
+ if (port < 0)
+ return -EINVAL;
+ qpc->alt_path.sched_queue =
+ (qpc->alt_path.sched_queue & ~(1 << 6)) |
+ (port & 1) << 6;
+ }
+ return 0;
+}
+
+static int roce_verify_mac(struct mlx4_dev *dev, int slave,
+ struct mlx4_qp_context *qpc,
+ struct mlx4_cmd_mailbox *inbox)
+{
+ u64 mac;
+ int port;
+ u32 ts = (be32_to_cpu(qpc->flags) >> 16) & 0xff;
+ u8 sched = *(u8 *)(inbox->buf + 64);
+ u8 smac_ix;
+
+ port = (sched >> 6 & 1) + 1;
+ if (mlx4_is_eth(dev, port) && (ts != MLX4_QP_ST_MLX)) {
+ smac_ix = qpc->pri_path.grh_mylmc & 0x7f;
+ if (mac_find_smac_ix_in_slave(dev, slave, port, smac_ix, &mac))
+ return -ENOENT;
+ }
+ return 0;
+}
+
+int mlx4_INIT2RTR_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ struct mlx4_qp_context *qpc = inbox->buf + 8;
+ int qpn = vhcr->in_modifier & 0x7fffff;
+ struct res_qp *qp;
+ u8 orig_sched_queue;
+ __be32 orig_param3 = qpc->param3;
+ u8 orig_vlan_control = qpc->pri_path.vlan_control;
+ u8 orig_fvl_rx = qpc->pri_path.fvl_rx;
+ u8 orig_pri_path_fl = qpc->pri_path.fl;
+ u8 orig_vlan_index = qpc->pri_path.vlan_index;
+ u8 orig_feup = qpc->pri_path.feup;
+
+ err = adjust_qp_sched_queue(dev, slave, qpc, inbox);
+ if (err)
+ return err;
+ err = verify_qp_parameters(dev, vhcr, inbox, QP_TRANS_INIT2RTR, slave);
+ if (err)
+ return err;
+
+ if (roce_verify_mac(dev, slave, qpc, inbox))
+ return -EINVAL;
+
+ update_pkey_index(dev, slave, inbox);
+ update_gid(dev, inbox, (u8)slave);
+ adjust_proxy_tun_qkey(dev, vhcr, qpc);
+ orig_sched_queue = qpc->pri_path.sched_queue;
+
+ err = get_res(dev, slave, qpn, RES_QP, &qp);
+ if (err)
+ return err;
+ if (qp->com.from_state != RES_QP_HW) {
+ err = -EBUSY;
+ goto out;
+ }
+
+ err = update_vport_qp_param(dev, inbox, slave, qpn);
+ if (err)
+ goto out;
+
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+out:
+ /* if no error, save sched queue value passed in by VF. This is
+ * essentially the QOS value provided by the VF. This will be useful
+ * if we allow dynamic changes from VST back to VGT
+ */
+ if (!err) {
+ qp->sched_queue = orig_sched_queue;
+ qp->param3 = orig_param3;
+ qp->vlan_control = orig_vlan_control;
+ qp->fvl_rx = orig_fvl_rx;
+ qp->pri_path_fl = orig_pri_path_fl;
+ qp->vlan_index = orig_vlan_index;
+ qp->feup = orig_feup;
+ }
+ put_res(dev, slave, qpn, RES_QP);
+ return err;
+}
+
+int mlx4_RTR2RTS_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ struct mlx4_qp_context *context = inbox->buf + 8;
+
+ err = adjust_qp_sched_queue(dev, slave, context, inbox);
+ if (err)
+ return err;
+ err = verify_qp_parameters(dev, vhcr, inbox, QP_TRANS_RTR2RTS, slave);
+ if (err)
+ return err;
+
+ if ((dev->caps.roce_mode == MLX4_ROCE_MODE_2) ||
+ (dev->caps.roce_mode == MLX4_ROCE_MODE_1_5_PLUS_2) ||
+ (dev->caps.roce_mode == MLX4_ROCE_MODE_1_PLUS_2)) {
+ int qpn = vhcr->in_modifier & 0x7fffff;
+ context->roce_entropy = cpu_to_be16(mlx4_qp_roce_entropy(dev,qpn));
+ }
+
+ update_pkey_index(dev, slave, inbox);
+ update_gid(dev, inbox, (u8)slave);
+ adjust_proxy_tun_qkey(dev, vhcr, context);
+ return mlx4_GEN_QP_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+}
+
+int mlx4_RTS2RTS_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ struct mlx4_qp_context *context = inbox->buf + 8;
+
+ err = adjust_qp_sched_queue(dev, slave, context, inbox);
+ if (err)
+ return err;
+ err = verify_qp_parameters(dev, vhcr, inbox, QP_TRANS_RTS2RTS, slave);
+ if (err)
+ return err;
+
+ update_pkey_index(dev, slave, inbox);
+ update_gid(dev, inbox, (u8)slave);
+ adjust_proxy_tun_qkey(dev, vhcr, context);
+ return mlx4_GEN_QP_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+}
+
+
+int mlx4_SQERR2RTS_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ struct mlx4_qp_context *context = inbox->buf + 8;
+ int err = adjust_qp_sched_queue(dev, slave, context, inbox);
+ if (err)
+ return err;
+ adjust_proxy_tun_qkey(dev, vhcr, context);
+ return mlx4_GEN_QP_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+}
+
+int mlx4_SQD2SQD_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ struct mlx4_qp_context *context = inbox->buf + 8;
+
+ err = adjust_qp_sched_queue(dev, slave, context, inbox);
+ if (err)
+ return err;
+ err = verify_qp_parameters(dev, vhcr, inbox, QP_TRANS_SQD2SQD, slave);
+ if (err)
+ return err;
+
+ adjust_proxy_tun_qkey(dev, vhcr, context);
+ update_gid(dev, inbox, (u8)slave);
+ update_pkey_index(dev, slave, inbox);
+ return mlx4_GEN_QP_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+}
+
+int mlx4_SQD2RTS_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ struct mlx4_qp_context *context = inbox->buf + 8;
+
+ err = adjust_qp_sched_queue(dev, slave, context, inbox);
+ if (err)
+ return err;
+ err = verify_qp_parameters(dev, vhcr, inbox, QP_TRANS_SQD2RTS, slave);
+ if (err)
+ return err;
+
+ adjust_proxy_tun_qkey(dev, vhcr, context);
+ update_gid(dev, inbox, (u8)slave);
+ update_pkey_index(dev, slave, inbox);
+ return mlx4_GEN_QP_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+}
+
+int mlx4_2RST_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ int qpn = vhcr->in_modifier & 0x7fffff;
+ struct res_qp *qp;
+
+ err = qp_res_start_move_to(dev, slave, qpn, RES_QP_MAPPED, &qp, 0);
+ if (err)
+ return err;
+ err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+ if (err)
+ goto ex_abort;
+
+ atomic_dec(&qp->mtt->ref_count);
+ atomic_dec(&qp->rcq->ref_count);
+ atomic_dec(&qp->scq->ref_count);
+ if (qp->srq)
+ atomic_dec(&qp->srq->ref_count);
+ res_end_move(dev, slave, RES_QP, qpn);
+ return 0;
+
+ex_abort:
+ res_abort_move(dev, slave, RES_QP, qpn);
+
+ return err;
+}
+
+static struct res_gid *find_gid(struct mlx4_dev *dev, int slave,
+ struct res_qp *rqp, u8 *gid)
+{
+ struct res_gid *res;
+
+ list_for_each_entry(res, &rqp->mcg_list, list) {
+ if (!memcmp(res->gid, gid, 16))
+ return res;
+ }
+ return NULL;
+}
+
+static int add_mcg_res(struct mlx4_dev *dev, int slave, struct res_qp *rqp,
+ u8 *gid, enum mlx4_protocol prot,
+ enum mlx4_steer_type steer, u64 reg_id)
+{
+ struct res_gid *res;
+ int err;
+
+ res = kzalloc(sizeof *res, GFP_KERNEL);
+ if (!res)
+ return -ENOMEM;
+
+ spin_lock_irq(&rqp->mcg_spl);
+ if (find_gid(dev, slave, rqp, gid)) {
+ kfree(res);
+ err = -EEXIST;
+ } else {
+ memcpy(res->gid, gid, 16);
+ res->prot = prot;
+ res->steer = steer;
+ res->reg_id = reg_id;
+ list_add_tail(&res->list, &rqp->mcg_list);
+ err = 0;
+ }
+ spin_unlock_irq(&rqp->mcg_spl);
+
+ return err;
+}
+
+static int rem_mcg_res(struct mlx4_dev *dev, int slave, struct res_qp *rqp,
+ u8 *gid, enum mlx4_protocol prot,
+ enum mlx4_steer_type steer, u64 *reg_id)
+{
+ struct res_gid *res;
+ int err;
+
+ spin_lock_irq(&rqp->mcg_spl);
+ res = find_gid(dev, slave, rqp, gid);
+ if (!res || res->prot != prot || res->steer != steer)
+ err = -EINVAL;
+ else {
+ *reg_id = res->reg_id;
+ list_del(&res->list);
+ kfree(res);
+ err = 0;
+ }
+ spin_unlock_irq(&rqp->mcg_spl);
+
+ return err;
+}
+
+static int qp_attach(struct mlx4_dev *dev, int slave, struct mlx4_qp *qp,
+ u8 gid[16], int block_loopback, enum mlx4_protocol prot,
+ enum mlx4_steer_type type, u64 *reg_id)
+{
+ switch (dev->caps.steering_mode) {
+ case MLX4_STEERING_MODE_DEVICE_MANAGED: {
+ int port = mlx4_slave_convert_port(dev, slave, gid[5]);
+ if (port < 0)
+ return port;
+ return mlx4_trans_to_dmfs_attach(dev, qp, gid, port,
+ block_loopback, prot,
+ reg_id);
+ }
+ case MLX4_STEERING_MODE_B0:
+ if (prot == MLX4_PROT_ETH) {
+ int port = mlx4_slave_convert_port(dev, slave, gid[5]);
+ if (port < 0)
+ return port;
+ gid[5] = port;
+ }
+ return mlx4_qp_attach_common(dev, qp, gid,
+ block_loopback, prot, type);
+ default:
+ return -EINVAL;
+ }
+}
+
+static int qp_detach(struct mlx4_dev *dev, struct mlx4_qp *qp,
+ u8 gid[16], enum mlx4_protocol prot,
+ enum mlx4_steer_type type, u64 reg_id)
+{
+ switch (dev->caps.steering_mode) {
+ case MLX4_STEERING_MODE_DEVICE_MANAGED:
+ return mlx4_flow_detach(dev, reg_id);
+ case MLX4_STEERING_MODE_B0:
+ return mlx4_qp_detach_common(dev, qp, gid, prot, type);
+ default:
+ return -EINVAL;
+ }
+}
+
+static int mlx4_adjust_port(struct mlx4_dev *dev, int slave,
+ u8 *gid, enum mlx4_protocol prot)
+{
+ int real_port;
+
+ if (prot != MLX4_PROT_ETH)
+ return 0;
+
+ if (dev->caps.steering_mode == MLX4_STEERING_MODE_B0 ||
+ dev->caps.steering_mode == MLX4_STEERING_MODE_DEVICE_MANAGED) {
+ real_port = mlx4_slave_convert_port(dev, slave, gid[5]);
+ if (real_port < 0)
+ return -EINVAL;
+ gid[5] = real_port;
+ }
+
+ return 0;
+}
+
+int mlx4_QP_ATTACH_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ struct mlx4_qp qp; /* dummy for calling attach/detach */
+ u8 *gid = inbox->buf;
+ enum mlx4_protocol prot = (vhcr->in_modifier >> 28) & 0x7;
+ int err;
+ int qpn;
+ struct res_qp *rqp;
+ u64 reg_id = 0;
+ int attach = vhcr->op_modifier;
+ int block_loopback = vhcr->in_modifier >> 31;
+ u8 steer_type_mask = 2;
+ enum mlx4_steer_type type = (gid[7] & steer_type_mask) >> 1;
+
+ qpn = vhcr->in_modifier & 0xffffff;
+ err = get_res(dev, slave, qpn, RES_QP, &rqp);
+ if (err)
+ return err;
+
+ qp.qpn = qpn;
+ if (attach) {
+ err = qp_attach(dev, slave, &qp, gid, block_loopback, prot,
+ type, &reg_id);
+ if (err) {
+ pr_err("Failed to attach rule to qp 0x%x\n", qpn);
+ goto ex_put;
+ }
+ err = add_mcg_res(dev, slave, rqp, gid, prot, type, reg_id);
+ if (err)
+ goto ex_detach;
+ } else {
+ err = mlx4_adjust_port(dev, slave, gid, prot);
+ if (err)
+ goto ex_put;
+
+ err = rem_mcg_res(dev, slave, rqp, gid, prot, type, &reg_id);
+ if (err)
+ goto ex_put;
+
+ err = qp_detach(dev, &qp, gid, prot, type, reg_id);
+ if (err)
+ pr_err("Failed to detach rule from qp 0x%x reg_id = 0x%llx\n",
+ qpn, reg_id);
+ }
+ put_res(dev, slave, qpn, RES_QP);
+ return err;
+
+ex_detach:
+ qp_detach(dev, &qp, gid, prot, type, reg_id);
+ex_put:
+ put_res(dev, slave, qpn, RES_QP);
+ return err;
+}
+
+/*
+ * MAC validation for Flow Steering rules.
+ * VF can attach rules only with a mac address which is assigned to it.
+ */
+static int validate_eth_header_mac(int slave,
+ struct _rule_hw *eth_header,
+ struct list_head *rlist)
+{
+ struct mac_res *res, *tmp;
+ __be64 be_mac;
+
+ /* make sure it isn't a multicast or broadcast MAC */
+ if (!is_multicast_ether_addr(eth_header->eth.dst_mac) &&
+ !is_broadcast_ether_addr(eth_header->eth.dst_mac)) {
+ list_for_each_entry_safe(res, tmp, rlist, list) {
+ be_mac = cpu_to_be64(res->mac << 16);
+ if (ether_addr_equal((u8 *)&be_mac, eth_header->eth.dst_mac))
+ return 0;
+ }
+ pr_err("MAC %pM doesn't belong to VF %d, Steering rule rejected\n",
+ eth_header->eth.dst_mac, slave);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int handle_eth_header_mcast_prio(struct mlx4_net_trans_rule_hw_ctrl *ctrl,
+ struct _rule_hw *eth_header)
+{
+ if (is_multicast_ether_addr(eth_header->eth.dst_mac) ||
+ is_broadcast_ether_addr(eth_header->eth.dst_mac)) {
+ struct mlx4_net_trans_rule_hw_eth *eth =
+ (struct mlx4_net_trans_rule_hw_eth *)eth_header;
+ struct _rule_hw *next_rule = (struct _rule_hw *)(eth + 1);
+ bool last_rule = next_rule->size == 0 && next_rule->id == 0 &&
+ next_rule->rsvd == 0;
+
+ if (last_rule)
+ ctrl->prio = cpu_to_be16(MLX4_DOMAIN_NIC);
+ }
+
+ return 0;
+}
+
+/*
+ * In case of missing eth header, prepend an eth header with a MAC address
+ * assigned to the VF.
+ */
+static int add_eth_header(struct mlx4_dev *dev, int slave,
+ struct mlx4_cmd_mailbox *inbox,
+ struct list_head *rlist, int header_id)
+{
+ struct mac_res *res, *tmp;
+ u8 port;
+ struct mlx4_net_trans_rule_hw_ctrl *ctrl;
+ struct mlx4_net_trans_rule_hw_eth *eth_header;
+ struct mlx4_net_trans_rule_hw_ipv4 *ip_header;
+ struct mlx4_net_trans_rule_hw_tcp_udp *l4_header;
+ __be64 be_mac = 0;
+ __be64 mac_msk = cpu_to_be64(MLX4_MAC_MASK << 16);
+
+ ctrl = (struct mlx4_net_trans_rule_hw_ctrl *)inbox->buf;
+ port = ctrl->port;
+ eth_header = (struct mlx4_net_trans_rule_hw_eth *)(ctrl + 1);
+
+ /* Clear a space in the inbox for eth header */
+ switch (header_id) {
+ case MLX4_NET_TRANS_RULE_ID_IPV4:
+ ip_header =
+ (struct mlx4_net_trans_rule_hw_ipv4 *)(eth_header + 1);
+ memmove(ip_header, eth_header,
+ sizeof(*ip_header) + sizeof(*l4_header));
+ break;
+ case MLX4_NET_TRANS_RULE_ID_TCP:
+ case MLX4_NET_TRANS_RULE_ID_UDP:
+ l4_header = (struct mlx4_net_trans_rule_hw_tcp_udp *)
+ (eth_header + 1);
+ memmove(l4_header, eth_header, sizeof(*l4_header));
+ break;
+ default:
+ return -EINVAL;
+ }
+ list_for_each_entry_safe(res, tmp, rlist, list) {
+ if (port == res->port) {
+ be_mac = cpu_to_be64(res->mac << 16);
+ break;
+ }
+ }
+ if (!be_mac) {
+ pr_err("Failed adding eth header to FS rule, Can't find matching MAC for port %d\n",
+ port);
+ return -EINVAL;
+ }
+
+ memset(eth_header, 0, sizeof(*eth_header));
+ eth_header->size = sizeof(*eth_header) >> 2;
+ eth_header->id = cpu_to_be16(__sw_id_hw[MLX4_NET_TRANS_RULE_ID_ETH]);
+ memcpy(eth_header->dst_mac, &be_mac, ETH_ALEN);
+ memcpy(eth_header->dst_mac_msk, &mac_msk, ETH_ALEN);
+
+ return 0;
+
+}
+
+#define MLX4_UPD_QP_PATH_MASK_SUPPORTED ( \
+ 1ULL << MLX4_UPD_QP_PATH_MASK_MAC_INDEX |\
+ 1ULL << MLX4_UPD_QP_PATH_MASK_ETH_SRC_CHECK_MC_LB)
+int mlx4_UPDATE_QP_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd_info)
+{
+ int err;
+ u32 qpn = vhcr->in_modifier & 0xffffff;
+ struct res_qp *rqp;
+ u64 mac;
+ unsigned port;
+ u64 pri_addr_path_mask;
+ struct mlx4_update_qp_context *cmd;
+ int smac_index;
+
+ cmd = (struct mlx4_update_qp_context *)inbox->buf;
+
+ pri_addr_path_mask = be64_to_cpu(cmd->primary_addr_path_mask);
+ if (cmd->qp_mask || cmd->secondary_addr_path_mask ||
+ (pri_addr_path_mask & ~MLX4_UPD_QP_PATH_MASK_SUPPORTED))
+ return -EPERM;
+
+ if ((pri_addr_path_mask &
+ (1ULL << MLX4_UPD_QP_PATH_MASK_ETH_SRC_CHECK_MC_LB)) &&
+ !(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_UPDATE_QP_SRC_CHECK_LB)) {
+ mlx4_warn(dev,
+ "Trying to set src check LB for slave %d, "
+ "but it isn't supported\n",
+ slave);
+ return -ENOTSUPP;
+ }
+
+ /* Just change the smac for the QP */
+ err = get_res(dev, slave, qpn, RES_QP, &rqp);
+ if (err) {
+ mlx4_err(dev, "Updating qpn 0x%x for slave %d rejected\n", qpn, slave);
+ return err;
+ }
+
+ port = (rqp->sched_queue >> 6 & 1) + 1;
+
+ if (pri_addr_path_mask & (1ULL << MLX4_UPD_QP_PATH_MASK_MAC_INDEX)) {
+ smac_index = cmd->qp_context.pri_path.grh_mylmc;
+ err = mac_find_smac_ix_in_slave(dev, slave, port,
+ smac_index, &mac);
+
+ if (err) {
+ mlx4_err(dev, "Failed to update qpn 0x%x, MAC is invalid. smac_ix: %d\n",
+ qpn, smac_index);
+ goto err_mac;
+ }
+ }
+
+ err = mlx4_cmd(dev, inbox->dma,
+ vhcr->in_modifier, 0,
+ MLX4_CMD_UPDATE_QP, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+ if (err) {
+ mlx4_err(dev, "Failed to update qpn 0x%x, command failed\n", qpn);
+ goto err_mac;
+ }
+
+err_mac:
+ put_res(dev, slave, qpn, RES_QP);
+ return err;
+}
+
+int validate_flow_steering_vf_Spec(struct mlx4_dev *dev, int slave,
+ struct _rule_hw *rule_header,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct list_head *rlist = &tracker->slave_list[slave].res_list[RES_MAC];
+ int header_id;
+ enum mlx4_net_trans_promisc_mode rule_type;
+ struct mlx4_net_trans_rule_hw_ctrl *ctrl;
+
+ ctrl = (struct mlx4_net_trans_rule_hw_ctrl *)inbox->buf;
+ rule_type = mlx4_map_hw_to_sw_steering_mode(dev, ctrl->type);
+ if (rule_type != MLX4_FS_REGULAR && rule_type != MLX4_FS_MC_DEFAULT)
+ return -EPERM;
+
+ if (rule_type != MLX4_FS_REGULAR)
+ return 0;
+
+ header_id = map_hw_to_sw_id(be16_to_cpu(rule_header->id));
+
+ switch (header_id) {
+ case MLX4_NET_TRANS_RULE_ID_ETH:
+ if (validate_eth_header_mac(slave, rule_header, rlist))
+ return -EINVAL;
+ break;
+ case MLX4_NET_TRANS_RULE_ID_IB:
+ break;
+ case MLX4_NET_TRANS_RULE_ID_IPV4:
+ case MLX4_NET_TRANS_RULE_ID_TCP:
+ case MLX4_NET_TRANS_RULE_ID_UDP:
+ pr_warn("Can't attach FS rule without L2 headers, adding L2 header\n");
+ if (add_eth_header(dev, slave, inbox, rlist, header_id))
+ return -EINVAL;
+
+ vhcr->in_modifier +=
+ sizeof(struct mlx4_net_trans_rule_hw_eth) >> 2;
+ break;
+ default:
+ pr_err("Corrupted mailbox\n");
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+int mlx4_QP_FLOW_STEERING_ATTACH_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+
+ int err;
+ int qpn;
+ struct res_qp *rqp;
+ struct mlx4_net_trans_rule_hw_ctrl *ctrl;
+ struct _rule_hw *rule_header;
+ int header_id;
+
+ if (dev->caps.steering_mode !=
+ MLX4_STEERING_MODE_DEVICE_MANAGED)
+ return -EOPNOTSUPP;
+
+ ctrl = (struct mlx4_net_trans_rule_hw_ctrl *)inbox->buf;
+ /* ctrl->port is a u8, so check the int result before storing it */
+ err = mlx4_slave_convert_port(dev, slave, ctrl->port);
+ if (err <= 0)
+ return -EINVAL;
+ ctrl->port = err;
+ qpn = be32_to_cpu(ctrl->qpn) & 0xffffff;
+ err = get_res(dev, slave, qpn, RES_QP, &rqp);
+ if (err) {
+ pr_err("Steering rule with qpn 0x%x rejected\n", qpn);
+ return err;
+ }
+ rule_header = (struct _rule_hw *)(ctrl + 1);
+ header_id = map_hw_to_sw_id(be16_to_cpu(rule_header->id));
+
+ if (header_id == MLX4_NET_TRANS_RULE_ID_ETH) {
+ err = handle_eth_header_mcast_prio(ctrl, rule_header);
+ if (err)
+ goto err_put;
+ }
+
+ /* validate VF */
+ if (slave != dev->caps.function) {
+ err = validate_flow_steering_vf_Spec(dev, slave, rule_header,
+ vhcr, inbox);
+ if (err)
+ goto err_put;
+ }
+
+ err = mlx4_cmd_imm(dev, inbox->dma, &vhcr->out_param,
+ vhcr->in_modifier, 0,
+ MLX4_QP_FLOW_STEERING_ATTACH, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+ if (err)
+ goto err_put;
+
+ err = add_res_range(dev, slave, vhcr->out_param, 1, RES_FS_RULE, qpn);
+ if (err) {
+ mlx4_err(dev, "Failed to add flow steering resources\n");
+ /* detach rule */
+ mlx4_cmd(dev, vhcr->out_param, 0, 0,
+ MLX4_QP_FLOW_STEERING_DETACH, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+ goto err_put;
+ }
+ atomic_inc(&rqp->ref_count);
+err_put:
+ put_res(dev, slave, qpn, RES_QP);
+ return err;
+}
+
+int mlx4_QP_FLOW_STEERING_DETACH_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ int err;
+ struct res_qp *rqp;
+ struct res_fs_rule *rrule;
+
+ if (dev->caps.steering_mode !=
+ MLX4_STEERING_MODE_DEVICE_MANAGED)
+ return -EOPNOTSUPP;
+
+ err = get_res(dev, slave, vhcr->in_param, RES_FS_RULE, &rrule);
+ if (err)
+ return err;
+ /* Release the rule from busy state before removal */
+ put_res(dev, slave, vhcr->in_param, RES_FS_RULE);
+ err = get_res(dev, slave, rrule->qpn, RES_QP, &rqp);
+ if (err)
+ return err;
+
+ err = mlx4_cmd(dev, vhcr->in_param, 0, 0,
+ MLX4_QP_FLOW_STEERING_DETACH, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+ if (!err)
+ atomic_dec(&rqp->ref_count);
+
+ put_res(dev, slave, rrule->qpn, RES_QP);
+ if (!err)
+ rem_res_range(dev, slave, vhcr->in_param, 1, RES_FS_RULE, 0);
+
+ return err;
+}
+
+enum {
+ BUSY_MAX_RETRIES = 10
+};
+
+int mlx4_QUERY_IF_STAT_wrapper(struct mlx4_dev *dev, int slave,
+ struct mlx4_vhcr *vhcr,
+ struct mlx4_cmd_mailbox *inbox,
+ struct mlx4_cmd_mailbox *outbox,
+ struct mlx4_cmd_info *cmd)
+{
+ return mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd);
+}
+
+static void detach_qp(struct mlx4_dev *dev, int slave, struct res_qp *rqp)
+{
+ struct res_gid *rgid;
+ struct res_gid *tmp;
+ struct mlx4_qp qp; /* dummy for calling attach/detach */
+
+ list_for_each_entry_safe(rgid, tmp, &rqp->mcg_list, list) {
+ switch (dev->caps.steering_mode) {
+ case MLX4_STEERING_MODE_DEVICE_MANAGED:
+ mlx4_flow_detach(dev, rgid->reg_id);
+ break;
+ case MLX4_STEERING_MODE_B0:
+ qp.qpn = rqp->local_qpn;
+ (void) mlx4_qp_detach_common(dev, &qp, rgid->gid,
+ rgid->prot, rgid->steer);
+ break;
+ }
+ list_del(&rgid->list);
+ kfree(rgid);
+ }
+}
+
+static int _move_all_busy(struct mlx4_dev *dev, int slave,
+ enum mlx4_resource type, int print)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker =
+ &priv->mfunc.master.res_tracker;
+ struct list_head *rlist = &tracker->slave_list[slave].res_list[type];
+ struct res_common *r;
+ struct res_common *tmp;
+ int busy;
+
+ busy = 0;
+ spin_lock_irq(mlx4_tlock(dev));
+ list_for_each_entry_safe(r, tmp, rlist, list) {
+ if (r->owner == slave) {
+ if (!r->removing) {
+ if (r->state == RES_ANY_BUSY) {
+ if (print)
+ mlx4_dbg(dev,
+ "%s id 0x%llx is busy\n",
+ resource_str(type),
+ r->res_id);
+ ++busy;
+ } else {
+ r->from_state = r->state;
+ r->state = RES_ANY_BUSY;
+ r->removing = 1;
+ }
+ }
+ }
+ }
+ spin_unlock_irq(mlx4_tlock(dev));
+
+ return busy;
+}
+
+static int move_all_busy(struct mlx4_dev *dev, int slave,
+ enum mlx4_resource type)
+{
+ unsigned long begin;
+ int busy;
+
+ begin = jiffies;
+ do {
+ busy = _move_all_busy(dev, slave, type, 0);
+ if (time_after(jiffies, begin + 5 * HZ))
+ break;
+ if (busy)
+ cond_resched();
+ } while (busy);
+
+ if (busy)
+ busy = _move_all_busy(dev, slave, type, 1);
+
+ return busy;
+}
+static void rem_slave_qps(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct list_head *qp_list =
+ &tracker->slave_list[slave].res_list[RES_QP];
+ struct res_qp *qp;
+ struct res_qp *tmp;
+ int state;
+ u64 in_param;
+ int qpn;
+ int err;
+
+ err = move_all_busy(dev, slave, RES_QP);
+ if (err)
+ mlx4_warn(dev, "rem_slave_qps: Could not move all qps to busy for slave %d\n",
+ slave);
+
+ spin_lock_irq(mlx4_tlock(dev));
+ list_for_each_entry_safe(qp, tmp, qp_list, com.list) {
+ spin_unlock_irq(mlx4_tlock(dev));
+ if (qp->com.owner == slave) {
+ qpn = qp->com.res_id;
+ detach_qp(dev, slave, qp);
+ state = qp->com.from_state;
+ while (state != 0) {
+ switch (state) {
+ case RES_QP_RESERVED:
+ spin_lock_irq(mlx4_tlock(dev));
+ rb_erase(&qp->com.node,
+ &tracker->res_tree[RES_QP]);
+ list_del(&qp->com.list);
+ spin_unlock_irq(mlx4_tlock(dev));
+ if (!valid_reserved(dev, slave, qpn)) {
+ __mlx4_qp_release_range(dev, qpn, 1);
+ mlx4_release_resource(dev, slave,
+ RES_QP, 1, 0);
+ }
+ kfree(qp);
+ state = 0;
+ break;
+ case RES_QP_MAPPED:
+ if (!valid_reserved(dev, slave, qpn))
+ __mlx4_qp_free_icm(dev, qpn);
+ state = RES_QP_RESERVED;
+ break;
+ case RES_QP_HW:
+ in_param = slave;
+ err = mlx4_cmd(dev, in_param,
+ qp->local_qpn, 2,
+ MLX4_CMD_2RST_QP,
+ MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+ if (err)
+ mlx4_dbg(dev, "rem_slave_qps: failed to move slave %d qpn %d to reset\n",
+ slave, qp->local_qpn);
+ atomic_dec(&qp->rcq->ref_count);
+ atomic_dec(&qp->scq->ref_count);
+ atomic_dec(&qp->mtt->ref_count);
+ if (qp->srq)
+ atomic_dec(&qp->srq->ref_count);
+ state = RES_QP_MAPPED;
+ break;
+ default:
+ state = 0;
+ }
+ }
+ }
+ spin_lock_irq(mlx4_tlock(dev));
+ }
+ spin_unlock_irq(mlx4_tlock(dev));
+}
+
+static void rem_slave_srqs(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct list_head *srq_list =
+ &tracker->slave_list[slave].res_list[RES_SRQ];
+ struct res_srq *srq;
+ struct res_srq *tmp;
+ int state;
+ u64 in_param;
+ LIST_HEAD(tlist);
+ int srqn;
+ int err;
+
+ err = move_all_busy(dev, slave, RES_SRQ);
+ if (err)
+ mlx4_warn(dev, "rem_slave_srqs: Could not move all srqs - too busy for slave %d\n",
+ slave);
+
+ spin_lock_irq(mlx4_tlock(dev));
+ list_for_each_entry_safe(srq, tmp, srq_list, com.list) {
+ spin_unlock_irq(mlx4_tlock(dev));
+ if (srq->com.owner == slave) {
+ srqn = srq->com.res_id;
+ state = srq->com.from_state;
+ while (state != 0) {
+ switch (state) {
+ case RES_SRQ_ALLOCATED:
+ __mlx4_srq_free_icm(dev, srqn);
+ spin_lock_irq(mlx4_tlock(dev));
+ rb_erase(&srq->com.node,
+ &tracker->res_tree[RES_SRQ]);
+ list_del(&srq->com.list);
+ spin_unlock_irq(mlx4_tlock(dev));
+ mlx4_release_resource(dev, slave,
+ RES_SRQ, 1, 0);
+ kfree(srq);
+ state = 0;
+ break;
+
+ case RES_SRQ_HW:
+ in_param = slave;
+ err = mlx4_cmd(dev, in_param, srqn, 1,
+ MLX4_CMD_HW2SW_SRQ,
+ MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+ if (err)
+ mlx4_dbg(dev, "rem_slave_srqs: failed to move slave %d srq %d to SW ownership\n",
+ slave, srqn);
+
+ atomic_dec(&srq->mtt->ref_count);
+ if (srq->cq)
+ atomic_dec(&srq->cq->ref_count);
+ state = RES_SRQ_ALLOCATED;
+ break;
+
+ default:
+ state = 0;
+ }
+ }
+ }
+ spin_lock_irq(mlx4_tlock(dev));
+ }
+ spin_unlock_irq(mlx4_tlock(dev));
+}
+
+static void rem_slave_cqs(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct list_head *cq_list =
+ &tracker->slave_list[slave].res_list[RES_CQ];
+ struct res_cq *cq;
+ struct res_cq *tmp;
+ int state;
+ u64 in_param;
+ LIST_HEAD(tlist);
+ int cqn;
+ int err;
+
+ err = move_all_busy(dev, slave, RES_CQ);
+ if (err)
+ mlx4_warn(dev, "rem_slave_cqs: Could not move all cqs - too busy for slave %d\n",
+ slave);
+
+ spin_lock_irq(mlx4_tlock(dev));
+ list_for_each_entry_safe(cq, tmp, cq_list, com.list) {
+ spin_unlock_irq(mlx4_tlock(dev));
+ if (cq->com.owner == slave && !atomic_read(&cq->ref_count)) {
+ cqn = cq->com.res_id;
+ state = cq->com.from_state;
+ while (state != 0) {
+ switch (state) {
+ case RES_CQ_ALLOCATED:
+ __mlx4_cq_free_icm(dev, cqn);
+ spin_lock_irq(mlx4_tlock(dev));
+ rb_erase(&cq->com.node,
+ &tracker->res_tree[RES_CQ]);
+ list_del(&cq->com.list);
+ spin_unlock_irq(mlx4_tlock(dev));
+ mlx4_release_resource(dev, slave,
+ RES_CQ, 1, 0);
+ kfree(cq);
+ state = 0;
+ break;
+
+ case RES_CQ_HW:
+ in_param = slave;
+ err = mlx4_cmd(dev, in_param, cqn, 1,
+ MLX4_CMD_HW2SW_CQ,
+ MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+ if (err)
+ mlx4_dbg(dev, "rem_slave_cqs: failed to move slave %d cq %d to SW ownership\n",
+ slave, cqn);
+ atomic_dec(&cq->mtt->ref_count);
+ state = RES_CQ_ALLOCATED;
+ break;
+
+ default:
+ state = 0;
+ }
+ }
+ }
+ spin_lock_irq(mlx4_tlock(dev));
+ }
+ spin_unlock_irq(mlx4_tlock(dev));
+}
+
+static void rem_slave_mrs(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct list_head *mpt_list =
+ &tracker->slave_list[slave].res_list[RES_MPT];
+ struct res_mpt *mpt;
+ struct res_mpt *tmp;
+ int state;
+ u64 in_param;
+ LIST_HEAD(tlist);
+ int mptn;
+ int err;
+
+ err = move_all_busy(dev, slave, RES_MPT);
+ if (err)
+ mlx4_warn(dev, "rem_slave_mrs: Could not move all mpts - too busy for slave %d\n",
+ slave);
+
+ spin_lock_irq(mlx4_tlock(dev));
+ list_for_each_entry_safe(mpt, tmp, mpt_list, com.list) {
+ spin_unlock_irq(mlx4_tlock(dev));
+ if (mpt->com.owner == slave) {
+ mptn = mpt->com.res_id;
+ state = mpt->com.from_state;
+ while (state != 0) {
+ switch (state) {
+ case RES_MPT_RESERVED:
+ __mlx4_mpt_release(dev, mpt->key);
+ spin_lock_irq(mlx4_tlock(dev));
+ rb_erase(&mpt->com.node,
+ &tracker->res_tree[RES_MPT]);
+ list_del(&mpt->com.list);
+ spin_unlock_irq(mlx4_tlock(dev));
+ mlx4_release_resource(dev, slave,
+ RES_MPT, 1, 0);
+ kfree(mpt);
+ state = 0;
+ break;
+
+ case RES_MPT_MAPPED:
+ __mlx4_mpt_free_icm(dev, mpt->key);
+ state = RES_MPT_RESERVED;
+ break;
+
+ case RES_MPT_HW:
+ in_param = slave;
+ err = mlx4_cmd(dev, in_param, mptn, 0,
+ MLX4_CMD_HW2SW_MPT,
+ MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+ if (err)
+ mlx4_dbg(dev, "rem_slave_mrs: failed to move slave %d mpt %d to SW ownership\n",
+ slave, mptn);
+ if (mpt->mtt)
+ atomic_dec(&mpt->mtt->ref_count);
+ state = RES_MPT_MAPPED;
+ break;
+ default:
+ state = 0;
+ }
+ }
+ }
+ spin_lock_irq(mlx4_tlock(dev));
+ }
+ spin_unlock_irq(mlx4_tlock(dev));
+}
+
+static void rem_slave_mtts(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker =
+ &priv->mfunc.master.res_tracker;
+ struct list_head *mtt_list =
+ &tracker->slave_list[slave].res_list[RES_MTT];
+ struct res_mtt *mtt;
+ struct res_mtt *tmp;
+ int state;
+ LIST_HEAD(tlist);
+ int base;
+ int err;
+
+ err = move_all_busy(dev, slave, RES_MTT);
+ if (err)
+ mlx4_warn(dev, "rem_slave_mtts: Could not move all mtts - too busy for slave %d\n",
+ slave);
+
+ spin_lock_irq(mlx4_tlock(dev));
+ list_for_each_entry_safe(mtt, tmp, mtt_list, com.list) {
+ spin_unlock_irq(mlx4_tlock(dev));
+ if (mtt->com.owner == slave) {
+ base = mtt->com.res_id;
+ state = mtt->com.from_state;
+ while (state != 0) {
+ switch (state) {
+ case RES_MTT_ALLOCATED:
+ __mlx4_free_mtt_range(dev, base,
+ mtt->order);
+ spin_lock_irq(mlx4_tlock(dev));
+ rb_erase(&mtt->com.node,
+ &tracker->res_tree[RES_MTT]);
+ list_del(&mtt->com.list);
+ spin_unlock_irq(mlx4_tlock(dev));
+ mlx4_release_resource(dev, slave, RES_MTT,
+ 1 << mtt->order, 0);
+ kfree(mtt);
+ state = 0;
+ break;
+
+ default:
+ state = 0;
+ }
+ }
+ }
+ spin_lock_irq(mlx4_tlock(dev));
+ }
+ spin_unlock_irq(mlx4_tlock(dev));
+}
+
+static void rem_slave_fs_rule(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker =
+ &priv->mfunc.master.res_tracker;
+ struct list_head *fs_rule_list =
+ &tracker->slave_list[slave].res_list[RES_FS_RULE];
+ struct res_fs_rule *fs_rule;
+ struct res_fs_rule *tmp;
+ int state;
+ u64 base;
+ int err;
+
+ err = move_all_busy(dev, slave, RES_FS_RULE);
+ if (err)
+ mlx4_warn(dev, "rem_slave_fs_rule: Could not move all fs rules to busy for slave %d\n",
+ slave);
+
+ spin_lock_irq(mlx4_tlock(dev));
+ list_for_each_entry_safe(fs_rule, tmp, fs_rule_list, com.list) {
+ spin_unlock_irq(mlx4_tlock(dev));
+ if (fs_rule->com.owner == slave) {
+ base = fs_rule->com.res_id;
+ state = fs_rule->com.from_state;
+ while (state != 0) {
+ switch (state) {
+ case RES_FS_RULE_ALLOCATED:
+ /* detach rule */
+ err = mlx4_cmd(dev, base, 0, 0,
+ MLX4_QP_FLOW_STEERING_DETACH,
+ MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+
+ spin_lock_irq(mlx4_tlock(dev));
+ rb_erase(&fs_rule->com.node,
+ &tracker->res_tree[RES_FS_RULE]);
+ list_del(&fs_rule->com.list);
+ spin_unlock_irq(mlx4_tlock(dev));
+ kfree(fs_rule);
+ state = 0;
+ break;
+
+ default:
+ state = 0;
+ }
+ }
+ }
+ spin_lock_irq(mlx4_tlock(dev));
+ }
+ spin_unlock_irq(mlx4_tlock(dev));
+}
+
+static void rem_slave_eqs(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct list_head *eq_list =
+ &tracker->slave_list[slave].res_list[RES_EQ];
+ struct res_eq *eq;
+ struct res_eq *tmp;
+ int err;
+ int state;
+ LIST_HEAD(tlist);
+ int eqn;
+
+ err = move_all_busy(dev, slave, RES_EQ);
+ if (err)
+ mlx4_warn(dev, "rem_slave_eqs: Could not move all eqs - too busy for slave %d\n",
+ slave);
+
+ spin_lock_irq(mlx4_tlock(dev));
+ list_for_each_entry_safe(eq, tmp, eq_list, com.list) {
+ spin_unlock_irq(mlx4_tlock(dev));
+ if (eq->com.owner == slave) {
+ eqn = eq->com.res_id;
+ state = eq->com.from_state;
+ while (state != 0) {
+ switch (state) {
+ case RES_EQ_RESERVED:
+ spin_lock_irq(mlx4_tlock(dev));
+ rb_erase(&eq->com.node,
+ &tracker->res_tree[RES_EQ]);
+ list_del(&eq->com.list);
+ spin_unlock_irq(mlx4_tlock(dev));
+ kfree(eq);
+ state = 0;
+ break;
+
+ case RES_EQ_HW:
+ err = mlx4_cmd(dev, slave, eqn & 0x3ff,
+ 1, MLX4_CMD_HW2SW_EQ,
+ MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_NATIVE);
+ if (err)
+ mlx4_dbg(dev, "rem_slave_eqs: failed to move slave %d eqs %d to SW ownership\n",
+ slave, eqn & 0x3ff);
+ atomic_dec(&eq->mtt->ref_count);
+ state = RES_EQ_RESERVED;
+ break;
+
+ default:
+ state = 0;
+ }
+ }
+ }
+ spin_lock_irq(mlx4_tlock(dev));
+ }
+ spin_unlock_irq(mlx4_tlock(dev));
+}
+
+static void rem_slave_counters(struct mlx4_dev *dev, int slave)
+{
+ __mlx4_slave_counters_free(dev, slave);
+}
+
+static void rem_slave_xrcdns(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct list_head *xrcdn_list =
+ &tracker->slave_list[slave].res_list[RES_XRCD];
+ struct res_xrcdn *xrcd;
+ struct res_xrcdn *tmp;
+ int err;
+ int xrcdn;
+
+ err = move_all_busy(dev, slave, RES_XRCD);
+ if (err)
+ mlx4_warn(dev, "rem_slave_xrcdns: Could not move all xrcdns - too busy for slave %d\n",
+ slave);
+
+ spin_lock_irq(mlx4_tlock(dev));
+ list_for_each_entry_safe(xrcd, tmp, xrcdn_list, com.list) {
+ if (xrcd->com.owner == slave) {
+ xrcdn = xrcd->com.res_id;
+ rb_erase(&xrcd->com.node, &tracker->res_tree[RES_XRCD]);
+ list_del(&xrcd->com.list);
+ kfree(xrcd);
+ __mlx4_xrcd_free(dev, xrcdn);
+ }
+ }
+ spin_unlock_irq(mlx4_tlock(dev));
+}
+
+void mlx4_delete_all_resources_for_slave(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ mlx4_reset_roce_gids(dev, slave);
+ mutex_lock(&priv->mfunc.master.res_tracker.slave_list[slave].mutex);
+ rem_slave_vlans(dev, slave);
+ rem_slave_macs(dev, slave);
+ rem_slave_fs_rule(dev, slave);
+ rem_slave_qps(dev, slave);
+ rem_slave_srqs(dev, slave);
+ rem_slave_cqs(dev, slave);
+ rem_slave_mrs(dev, slave);
+ rem_slave_eqs(dev, slave);
+ rem_slave_mtts(dev, slave);
+ rem_slave_counters(dev, slave);
+ rem_slave_xrcdns(dev, slave);
+ mutex_unlock(&priv->mfunc.master.res_tracker.slave_list[slave].mutex);
+}
+
+static void update_qos_vpp(struct mlx4_update_qp_context *ctx,
+ struct mlx4_vf_immed_vlan_work *work)
+{
+ ctx->qp_mask |= cpu_to_be64(1ULL << MLX4_UPD_QP_MASK_QOS_VPP);
+ ctx->qp_context.qos_vport = work->qos_vport;
+}
+#ifdef KMOD_MODIFIED
+void mlx4_vf_immed_vlan_work_handler(struct mlx4_vf_immed_vlan_work *work)
+{
+ //struct mlx4_vf_immed_vlan_work *work =
+ // container_of(_work, struct mlx4_vf_immed_vlan_work, work);
+#endif
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_update_qp_context *upd_context;
+ struct mlx4_dev *dev = &work->priv->dev;
+ struct mlx4_resource_tracker *tracker =
+ &work->priv->mfunc.master.res_tracker;
+ struct list_head *qp_list =
+ &tracker->slave_list[work->slave].res_list[RES_QP];
+ struct res_qp *qp;
+ struct res_qp *tmp;
+ u64 qp_path_mask_vlan_ctrl =
+ ((1ULL << MLX4_UPD_QP_PATH_MASK_ETH_TX_BLOCK_UNTAGGED) |
+ (1ULL << MLX4_UPD_QP_PATH_MASK_ETH_TX_BLOCK_1P) |
+ (1ULL << MLX4_UPD_QP_PATH_MASK_ETH_TX_BLOCK_TAGGED) |
+ (1ULL << MLX4_UPD_QP_PATH_MASK_ETH_RX_BLOCK_UNTAGGED) |
+ (1ULL << MLX4_UPD_QP_PATH_MASK_ETH_RX_BLOCK_1P) |
+ (1ULL << MLX4_UPD_QP_PATH_MASK_ETH_RX_BLOCK_TAGGED));
+
+ u64 qp_path_mask = ((1ULL << MLX4_UPD_QP_PATH_MASK_VLAN_INDEX) |
+ (1ULL << MLX4_UPD_QP_PATH_MASK_FVL) |
+ (1ULL << MLX4_UPD_QP_PATH_MASK_CV) |
+ (1ULL << MLX4_UPD_QP_PATH_MASK_ETH_HIDE_CQE_VLAN) |
+ (1ULL << MLX4_UPD_QP_PATH_MASK_FEUP) |
+ (1ULL << MLX4_UPD_QP_PATH_MASK_FVL_RX) |
+ (1ULL << MLX4_UPD_QP_PATH_MASK_SCHED_QUEUE));
+
+ int err;
+ int port, errors = 0;
+ u8 vlan_control;
+
+ if (mlx4_is_slave(dev)) {
+ mlx4_warn(dev, "Trying to update-qp in slave %d\n",
+ work->slave);
+ goto out;
+ }
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ goto out;
+ if (work->flags & MLX4_VF_IMMED_VLAN_FLAG_LINK_DISABLE) /* block all */
+ vlan_control = MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED |
+ MLX4_VLAN_CTRL_ETH_TX_BLOCK_PRIO_TAGGED |
+ MLX4_VLAN_CTRL_ETH_TX_BLOCK_UNTAGGED |
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_PRIO_TAGGED |
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_UNTAGGED |
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_TAGGED;
+ else if (!work->vlan_id)
+ vlan_control = MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED |
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_TAGGED;
+ else
+ vlan_control = MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED |
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_PRIO_TAGGED |
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_UNTAGGED;
+
+ upd_context = mailbox->buf;
+ upd_context->qp_mask = cpu_to_be64(1ULL << MLX4_UPD_QP_MASK_VSD);
+
+ spin_lock_irq(mlx4_tlock(dev));
+ list_for_each_entry_safe(qp, tmp, qp_list, com.list) {
+ spin_unlock_irq(mlx4_tlock(dev));
+ if (qp->com.owner == work->slave) {
+ if (qp->com.from_state != RES_QP_HW ||
+ !qp->sched_queue || /* no INIT2RTR trans yet */
+ mlx4_is_qp_reserved(dev, qp->local_qpn) ||
+ qp->qpc_flags & (1 << MLX4_RSS_QPC_FLAG_OFFSET)) {
+ spin_lock_irq(mlx4_tlock(dev));
+ continue;
+ }
+ port = (qp->sched_queue >> 6 & 1) + 1;
+ if (port != work->port) {
+ spin_lock_irq(mlx4_tlock(dev));
+ continue;
+ }
+ if (MLX4_QP_ST_RC == ((qp->qpc_flags >> 16) & 0xff))
+ upd_context->primary_addr_path_mask = cpu_to_be64(qp_path_mask);
+ else
+ upd_context->primary_addr_path_mask =
+ cpu_to_be64(qp_path_mask | qp_path_mask_vlan_ctrl);
+ if (work->vlan_id == MLX4_VGT) {
+ upd_context->qp_context.param3 = qp->param3;
+ upd_context->qp_context.pri_path.vlan_control = qp->vlan_control;
+ upd_context->qp_context.pri_path.fvl_rx = qp->fvl_rx;
+ upd_context->qp_context.pri_path.vlan_index = qp->vlan_index;
+ upd_context->qp_context.pri_path.fl = qp->pri_path_fl;
+ upd_context->qp_context.pri_path.feup = qp->feup;
+ upd_context->qp_context.pri_path.sched_queue =
+ qp->sched_queue;
+ } else {
+ upd_context->qp_context.param3 = qp->param3 & ~cpu_to_be32(MLX4_STRIP_VLAN);
+ upd_context->qp_context.pri_path.vlan_control = vlan_control;
+ upd_context->qp_context.pri_path.vlan_index = work->vlan_ix;
+ upd_context->qp_context.pri_path.fvl_rx =
+ qp->fvl_rx | MLX4_FVL_RX_FORCE_ETH_VLAN;
+ upd_context->qp_context.pri_path.fl =
+ qp->pri_path_fl | MLX4_FL_CV | MLX4_FL_ETH_HIDE_CQE_VLAN;
+ upd_context->qp_context.pri_path.feup =
+ qp->feup | MLX4_FEUP_FORCE_ETH_UP | MLX4_FVL_FORCE_ETH_VLAN;
+ upd_context->qp_context.pri_path.sched_queue =
+ qp->sched_queue & 0xC7;
+ upd_context->qp_context.pri_path.sched_queue |=
+ ((work->qos & 0x7) << 3);
+
+ if (dev->caps.flags2 &
+ MLX4_DEV_CAP_FLAG2_QOS_VPP)
+ update_qos_vpp(upd_context, work);
+ }
+
+ err = mlx4_cmd(dev, mailbox->dma,
+ qp->local_qpn & 0xffffff,
+ 0, MLX4_CMD_UPDATE_QP,
+ MLX4_CMD_TIME_CLASS_C, MLX4_CMD_NATIVE);
+ if (err) {
+ mlx4_info(dev, "UPDATE_QP failed for slave %d, port %d, qpn %d (%d)\n",
+ work->slave, port, qp->local_qpn, err);
+ errors++;
+ }
+ }
+ spin_lock_irq(mlx4_tlock(dev));
+ }
+ spin_unlock_irq(mlx4_tlock(dev));
+ mlx4_free_cmd_mailbox(dev, mailbox);
+
+ if (errors)
+ mlx4_err(dev, "%d UPDATE_QP failures for slave %d, port %d\n",
+ errors, work->slave, work->port);
+
+ /* unregister previous vlan_id if needed and we had no errors
+ * while updating the QPs
+ */
+ if (work->flags & MLX4_VF_IMMED_VLAN_FLAG_VLAN && !errors &&
+ NO_INDX != work->orig_vlan_ix)
+ __mlx4_unregister_vlan(&work->priv->dev, work->port,
+ work->orig_vlan_id);
+out:
+ kfree(work);
+ return;
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/sense.c b/drivers/net/mlnx_uio/mlnx/mlx4/sense.c
new file mode 100644
index 0000000..8935b3f
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/sense.c
@@ -0,0 +1,153 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2007 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+
+
+#include "mlx4.h"
+
+int mlx4_SENSE_PORT(struct mlx4_dev *dev, int port,
+ enum mlx4_port_type *type)
+{
+ u64 out_param;
+ int err = 0;
+
+ err = mlx4_cmd_imm(dev, 0, &out_param, port, 0,
+ MLX4_CMD_SENSE_PORT, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_WRAPPED);
+ if (err) {
+ mlx4_err(dev, "Sense command failed for port: %d\n", port);
+ return err;
+ }
+
+ if (out_param > 2) {
+ mlx4_err(dev, "Sense returned illegal value: 0x%llx\n", out_param);
+ return -EINVAL;
+ }
+
+ *type = out_param;
+ return 0;
+}
+
+void mlx4_do_sense_ports(struct mlx4_dev *dev,
+ enum mlx4_port_type *stype,
+ enum mlx4_port_type *defaults)
+{
+ struct mlx4_sense *sense = &mlx4_priv(dev)->sense;
+ int err;
+ int i;
+
+ for (i = 1; i <= dev->caps.num_ports; i++) {
+ stype[i - 1] = 0;
+ if (sense->do_sense_port[i] && sense->sense_allowed[i] &&
+ dev->caps.possible_type[i] == MLX4_PORT_TYPE_AUTO) {
+ err = mlx4_SENSE_PORT(dev, i, &stype[i - 1]);
+ if (err)
+ stype[i - 1] = defaults[i - 1];
+ } else
+ stype[i - 1] = defaults[i - 1];
+ }
+
+ /*
+ * If sensed nothing, remain in current configuration.
+ */
+ for (i = 0; i < dev->caps.num_ports; i++)
+ stype[i] = stype[i] ? stype[i] : defaults[i];
+
+}
+
+#ifdef KMOD_MODIFIED
+static void mlx4_sense_port(struct mlx4_sense *sense)
+{
+ //struct delayed_work *delay = to_delayed_work(work);
+ //struct mlx4_sense *sense = container_of(delay, struct mlx4_sense,
+ // sense_poll);
+ struct mlx4_dev *dev = sense->dev;
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ enum mlx4_port_type stype[MLX4_MAX_PORTS];
+
+ mutex_lock(&priv->port_mutex);
+ mlx4_do_sense_ports(dev, stype, &dev->caps.port_type[1]);
+
+ if (mlx4_check_port_params(dev, stype))
+ goto sense_again;
+
+ if (mlx4_change_port_types(dev, stype))
+ mlx4_err(dev, "Failed to change port_types\n");
+
+sense_again:
+ mutex_unlock(&priv->port_mutex);
+#ifdef KMOD_DISABLED
+ queue_delayed_work(mlx4_wq , &sense->sense_poll,
+ round_jiffies_relative(MLX4_SENSE_RANGE));
+#endif
+ return;
+}
+#endif
+
+void mlx4_start_sense(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_sense *sense = &priv->sense;
+
+ if (!(dev->caps.flags & MLX4_DEV_CAP_FLAG_DPDP))
+ return;
+#ifdef KMOD_DISABLED
+ queue_delayed_work(mlx4_wq , &sense->sense_poll,
+ round_jiffies_relative(MLX4_SENSE_RANGE));
+#endif
+}
+
+void mlx4_stop_sense(struct mlx4_dev *dev)
+{
+#ifdef KMOD_DISABLED
+ cancel_delayed_work_sync(&mlx4_priv(dev)->sense.sense_poll);
+#endif
+}
+
+void mlx4_sense_init(struct mlx4_dev *dev)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_sense *sense = &priv->sense;
+ int port;
+
+ sense->dev = dev;
+ for (port = 1; port <= dev->caps.num_ports; port++)
+ sense->do_sense_port[port] = 1;
+#ifdef KMOD_DISABLED
+ INIT_DEFERRABLE_WORK(&sense->sense_poll, mlx4_sense_port);
+#endif
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx4/srq.c b/drivers/net/mlnx_uio/mlnx/mlx4/srq.c
new file mode 100644
index 0000000..3da1255
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx4/srq.c
@@ -0,0 +1,314 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved.
+ * Copyright (c) 2007, 2008 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+
+
+#include "mlx4.h"
+#include "icm.h"
+#include "log2.h"
+
+void mlx4_srq_event(struct mlx4_dev *dev, u32 srqn, int event_type)
+{
+ struct mlx4_srq_table *srq_table = &mlx4_priv(dev)->srq_table;
+ struct mlx4_srq *srq;
+
+ spin_lock(&srq_table->lock);
+
+ srq = radix_tree_lookup(&srq_table->tree, srqn & (dev->caps.num_srqs - 1));
+ if (srq)
+ atomic_inc(&srq->refcount);
+
+ spin_unlock(&srq_table->lock);
+
+ if (!srq) {
+ mlx4_warn(dev, "Async event for bogus SRQ %08x\n", srqn);
+ return;
+ }
+
+ srq->event(srq, event_type);
+
+ if (atomic_dec_and_test(&srq->refcount))
+ complete(&srq->free);
+}
+
+static int mlx4_SW2HW_SRQ(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox,
+ int srq_num)
+{
+ return mlx4_cmd(dev, mailbox->dma, srq_num, 0,
+ MLX4_CMD_SW2HW_SRQ, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_WRAPPED);
+}
+
+static int mlx4_HW2SW_SRQ(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox,
+ int srq_num)
+{
+ return mlx4_cmd_box(dev, 0, mailbox ? mailbox->dma : 0, srq_num,
+ mailbox ? 0 : 1, MLX4_CMD_HW2SW_SRQ,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+}
+
+static int mlx4_ARM_SRQ(struct mlx4_dev *dev, int srq_num, int limit_watermark)
+{
+ return mlx4_cmd(dev, limit_watermark, srq_num, 0, MLX4_CMD_ARM_SRQ,
+ MLX4_CMD_TIME_CLASS_B, MLX4_CMD_WRAPPED);
+}
+
+static int mlx4_QUERY_SRQ(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox,
+ int srq_num)
+{
+ return mlx4_cmd_box(dev, 0, mailbox->dma, srq_num, 0, MLX4_CMD_QUERY_SRQ,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+}
+
+int __mlx4_srq_alloc_icm(struct mlx4_dev *dev, int *srqn)
+{
+ struct mlx4_srq_table *srq_table = &mlx4_priv(dev)->srq_table;
+ int err;
+
+
+ *srqn = mlx4_bitmap_alloc(&srq_table->bitmap);
+ if (*srqn == -1)
+ return -ENOMEM;
+
+ err = mlx4_table_get(dev, &srq_table->table, *srqn, GFP_KERNEL);
+ if (err)
+ goto err_out;
+
+ err = mlx4_table_get(dev, &srq_table->cmpt_table, *srqn, GFP_KERNEL);
+ if (err)
+ goto err_put;
+ return 0;
+
+err_put:
+ mlx4_table_put(dev, &srq_table->table, *srqn);
+
+err_out:
+ mlx4_bitmap_free(&srq_table->bitmap, *srqn, MLX4_NO_RR);
+ return err;
+}
+
+static int mlx4_srq_alloc_icm(struct mlx4_dev *dev, int *srqn)
+{
+ u64 out_param;
+ int err;
+
+ if (mlx4_is_mfunc(dev)) {
+ err = mlx4_cmd_imm(dev, 0, &out_param, RES_SRQ,
+ RES_OP_RESERVE_AND_MAP,
+ MLX4_CMD_ALLOC_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+ if (!err)
+ *srqn = get_param_l(&out_param);
+
+ return err;
+ }
+ return __mlx4_srq_alloc_icm(dev, srqn);
+}
+
+void __mlx4_srq_free_icm(struct mlx4_dev *dev, int srqn)
+{
+ struct mlx4_srq_table *srq_table = &mlx4_priv(dev)->srq_table;
+
+ mlx4_table_put(dev, &srq_table->cmpt_table, srqn);
+ mlx4_table_put(dev, &srq_table->table, srqn);
+ mlx4_bitmap_free(&srq_table->bitmap, srqn, MLX4_NO_RR);
+}
+
+static void mlx4_srq_free_icm(struct mlx4_dev *dev, int srqn)
+{
+ u64 in_param = 0;
+
+ if (mlx4_is_mfunc(dev)) {
+ set_param_l(&in_param, srqn);
+ if (mlx4_cmd(dev, in_param, RES_SRQ, RES_OP_RESERVE_AND_MAP,
+ MLX4_CMD_FREE_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED))
+ mlx4_warn(dev, "Failed freeing srq:%d\n", srqn);
+ return;
+ }
+ __mlx4_srq_free_icm(dev, srqn);
+}
+
+int mlx4_srq_alloc(struct mlx4_dev *dev, u32 pdn, u32 cqn, u16 xrcd,
+ struct mlx4_mtt *mtt, u64 db_rec, struct mlx4_srq *srq)
+{
+ struct mlx4_srq_table *srq_table = &mlx4_priv(dev)->srq_table;
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_srq_context *srq_context;
+ u64 mtt_addr;
+ int err;
+
+ err = mlx4_srq_alloc_icm(dev, &srq->srqn);
+ if (err)
+ return err;
+
+ spin_lock_irq(&srq_table->lock);
+ err = radix_tree_insert(&srq_table->tree, srq->srqn, srq);
+ spin_unlock_irq(&srq_table->lock);
+ if (err)
+ goto err_icm;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox)) {
+ err = PTR_ERR(mailbox);
+ goto err_radix;
+ }
+
+ srq_context = mailbox->buf;
+ srq_context->state_logsize_srqn = cpu_to_be32((ilog2(srq->max) << 24) |
+ srq->srqn);
+ srq_context->logstride = srq->wqe_shift - 4;
+ srq_context->xrcd = cpu_to_be16(xrcd);
+ srq_context->pg_offset_cqn = cpu_to_be32(cqn & 0xffffff);
+ srq_context->log_page_size = mtt->page_shift - MLX4_ICM_PAGE_SHIFT;
+
+ mtt_addr = mlx4_mtt_addr(dev, mtt);
+ srq_context->mtt_base_addr_h = mtt_addr >> 32;
+ srq_context->mtt_base_addr_l = cpu_to_be32(mtt_addr & 0xffffffff);
+ srq_context->pd = cpu_to_be32(pdn);
+ srq_context->db_rec_addr = cpu_to_be64(db_rec);
+
+ err = mlx4_SW2HW_SRQ(dev, mailbox, srq->srqn);
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ if (err)
+ goto err_radix;
+
+ atomic_set(&srq->refcount, 1);
+ init_completion(&srq->free);
+
+ return 0;
+
+err_radix:
+ spin_lock_irq(&srq_table->lock);
+ radix_tree_delete(&srq_table->tree, srq->srqn);
+ spin_unlock_irq(&srq_table->lock);
+
+err_icm:
+ mlx4_srq_free_icm(dev, srq->srqn);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_srq_alloc);
+
+void mlx4_srq_free(struct mlx4_dev *dev, struct mlx4_srq *srq)
+{
+ struct mlx4_srq_table *srq_table = &mlx4_priv(dev)->srq_table;
+ int err;
+
+ err = mlx4_HW2SW_SRQ(dev, NULL, srq->srqn);
+ if (err)
+ mlx4_warn(dev, "HW2SW_SRQ failed (%d) for SRQN %06x\n", err, srq->srqn);
+
+ spin_lock_irq(&srq_table->lock);
+ radix_tree_delete(&srq_table->tree, srq->srqn);
+ spin_unlock_irq(&srq_table->lock);
+
+ if (atomic_dec_and_test(&srq->refcount))
+ complete(&srq->free);
+ wait_for_completion(&srq->free);
+
+ mlx4_srq_free_icm(dev, srq->srqn);
+}
+EXPORT_SYMBOL_GPL(mlx4_srq_free);
+
+int mlx4_srq_arm(struct mlx4_dev *dev, struct mlx4_srq *srq, int limit_watermark)
+{
+ return mlx4_ARM_SRQ(dev, srq->srqn, limit_watermark);
+}
+EXPORT_SYMBOL_GPL(mlx4_srq_arm);
+
+int mlx4_srq_query(struct mlx4_dev *dev, struct mlx4_srq *srq, int *limit_watermark)
+{
+ struct mlx4_cmd_mailbox *mailbox;
+ struct mlx4_srq_context *srq_context;
+ int err;
+
+ mailbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(mailbox))
+ return PTR_ERR(mailbox);
+
+ srq_context = mailbox->buf;
+
+ err = mlx4_QUERY_SRQ(dev, mailbox, srq->srqn);
+ if (err)
+ goto err_out;
+ *limit_watermark = be16_to_cpu(srq_context->limit_watermark);
+
+err_out:
+ mlx4_free_cmd_mailbox(dev, mailbox);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx4_srq_query);
+
+int mlx4_init_srq_table(struct mlx4_dev *dev)
+{
+ struct mlx4_srq_table *srq_table = &mlx4_priv(dev)->srq_table;
+ int err;
+
+ spin_lock_init(&srq_table->lock);
+ INIT_RADIX_TREE(&srq_table->tree, GFP_ATOMIC);
+ if (mlx4_is_slave(dev))
+ return 0;
+
+ err = mlx4_bitmap_init(&srq_table->bitmap, dev->caps.num_srqs,
+ dev->caps.num_srqs - 1, dev->caps.reserved_srqs, 0);
+ if (err)
+ return err;
+
+ return 0;
+}
+
+void mlx4_cleanup_srq_table(struct mlx4_dev *dev)
+{
+ if (mlx4_is_slave(dev))
+ return;
+ mlx4_bitmap_cleanup(&mlx4_priv(dev)->srq_table.bitmap);
+}
+
+struct mlx4_srq *mlx4_srq_lookup(struct mlx4_dev *dev, u32 srqn)
+{
+ struct mlx4_srq_table *srq_table = &mlx4_priv(dev)->srq_table;
+ struct mlx4_srq *srq;
+ unsigned long flags;
+
+ spin_lock_irqsave(&srq_table->lock, flags);
+ srq = radix_tree_lookup(&srq_table->tree,
+ srqn & (dev->caps.num_srqs - 1));
+ spin_unlock_irqrestore(&srq_table->lock, flags);
+
+ return srq;
+}
+EXPORT_SYMBOL_GPL(mlx4_srq_lookup);
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/Kconfig b/drivers/net/mlnx_uio/mlnx/mlx5/core/Kconfig
new file mode 100644
index 0000000..8ff57e8
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/Kconfig
@@ -0,0 +1,8 @@
+#
+# Mellanox driver configuration
+#
+
+config MLX5_CORE
+ tristate
+ depends on PCI
+ default n
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/Makefile b/drivers/net/mlnx_uio/mlnx/mlx5/core/Makefile
new file mode 100644
index 0000000..6965254
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/Makefile
@@ -0,0 +1,9 @@
+ccflags-y += $(MLNX_CFLAGS)
+
+obj-$(CONFIG_MLX5_CORE) += mlx5_core.o
+
+mlx5_core-y := main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
+ health.o mcg.o cq.o srq.o alloc.o qp.o port.o mr.o pd.o \
+ mad.o wq.o flow_table.o vport.o transobj.o en_main.o \
+ en_flow_table.o en_ethtool.o en_tx.o en_rx.o en_txrx.o \
+ sriov.o params.o en_debugfs.o
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/alloc.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/alloc.c
new file mode 100644
index 0000000..70f6e71
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/alloc.c
@@ -0,0 +1,273 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+
+#include "mlx5_core.h"
+
+/* Handling for queue buffers -- we allocate a bunch of memory and
+ * register it in a memory region at HCA virtual address 0. If the
+ * requested size is > max_direct, we split the allocation into
+ * multiple pages, so we don't require too much contiguous memory.
+ */
+
+static void *mlx5_dma_zalloc_coherent_node(struct mlx5_core_dev *dev,
+ size_t size, dma_addr_t *dma_handle,
+ int node)
+{
+ struct mlx5_priv *priv = &dev->priv;
+ int original_node;
+ void *cpu_handle;
+
+ mutex_lock(&priv->alloc_mutex);
+ original_node = dev_to_node(&dev->pdev->dev);
+ set_dev_node(&dev->pdev->dev, node);
+ cpu_handle = dma_zalloc_coherent(&dev->pdev->dev, size,
+ dma_handle, GFP_KERNEL);
+ set_dev_node(&dev->pdev->dev, original_node);
+ mutex_unlock(&priv->alloc_mutex);
+ return cpu_handle;
+}
+
+int mlx5_buf_alloc_node(struct mlx5_core_dev *dev, int size, int max_direct,
+ struct mlx5_buf *buf, int node)
+{
+ dma_addr_t t;
+
+ buf->size = size;
+ if (size <= max_direct) {
+ buf->nbufs = 1;
+ buf->npages = 1;
+ buf->page_shift = (u8)get_order(size) + PAGE_SHIFT;
+ buf->direct.buf = mlx5_dma_zalloc_coherent_node(dev, size,
+ &t, node);
+ if (!buf->direct.buf)
+ return -ENOMEM;
+
+ buf->direct.map = t;
+
+ while (t & ((1 << buf->page_shift) - 1)) {
+ --buf->page_shift;
+ buf->npages *= 2;
+ }
+ } else {
+ int i;
+
+ buf->direct.buf = NULL;
+ buf->nbufs = (size + PAGE_SIZE - 1) / PAGE_SIZE;
+ buf->npages = buf->nbufs;
+ buf->page_shift = PAGE_SHIFT;
+ buf->page_list = kcalloc(buf->nbufs, sizeof(*buf->page_list),
+ GFP_KERNEL);
+ if (!buf->page_list)
+ return -ENOMEM;
+
+ for (i = 0; i < buf->nbufs; i++) {
+ buf->page_list[i].buf =
+ mlx5_dma_zalloc_coherent_node(dev, PAGE_SIZE,
+ &t, node);
+ if (!buf->page_list[i].buf)
+ goto err_free;
+
+ buf->page_list[i].map = t;
+ }
+
+ if (BITS_PER_LONG == 64) {
+ struct page **pages;
+ pages = kmalloc(sizeof(*pages) * (buf->nbufs + 1),
+ GFP_KERNEL);
+ if (!pages)
+ goto err_free;
+ for (i = 0; i < buf->nbufs; i++)
+ pages[i] = virt_to_page(buf->page_list[i].buf);
+ pages[buf->nbufs] = pages[0];
+ buf->direct.buf = vmap(pages, buf->nbufs + 1, VM_MAP,
+ PAGE_KERNEL);
+ kfree(pages);
+ if (!buf->direct.buf)
+ goto err_free;
+ }
+ }
+
+ return 0;
+
+err_free:
+ mlx5_buf_free(dev, buf);
+
+ return -ENOMEM;
+}
+
+int mlx5_buf_alloc(struct mlx5_core_dev *dev, int size, int max_direct,
+ struct mlx5_buf *buf)
+{
+ return mlx5_buf_alloc_node(dev, size, max_direct,
+ buf, dev->priv.numa_node);
+}
+EXPORT_SYMBOL_GPL(mlx5_buf_alloc);
+
+void mlx5_buf_free(struct mlx5_core_dev *dev, struct mlx5_buf *buf)
+{
+ int i;
+
+ if (buf->nbufs == 1)
+ dma_free_coherent(&dev->pdev->dev, buf->size, buf->direct.buf,
+ buf->direct.map);
+ else {
+ if (BITS_PER_LONG == 64)
+ vunmap(buf->direct.buf);
+
+ for (i = 0; i < buf->nbufs; i++)
+ if (buf->page_list[i].buf)
+ dma_free_coherent(&dev->pdev->dev, PAGE_SIZE,
+ buf->page_list[i].buf,
+ buf->page_list[i].map);
+ kfree(buf->page_list);
+ }
+}
+EXPORT_SYMBOL_GPL(mlx5_buf_free);
+
+static struct mlx5_db_pgdir *mlx5_alloc_db_pgdir(struct mlx5_core_dev *dev,
+ int node)
+{
+ struct mlx5_db_pgdir *pgdir;
+
+ pgdir = kzalloc(sizeof(*pgdir), GFP_KERNEL);
+ if (!pgdir)
+ return NULL;
+
+ bitmap_fill(pgdir->bitmap, MLX5_DB_PER_PAGE);
+
+ pgdir->db_page = mlx5_dma_zalloc_coherent_node(dev, PAGE_SIZE,
+ &pgdir->db_dma, node);
+ if (!pgdir->db_page) {
+ kfree(pgdir);
+ return NULL;
+ }
+
+ return pgdir;
+}
+
+static int mlx5_alloc_db_from_pgdir(struct mlx5_db_pgdir *pgdir,
+ struct mlx5_db *db)
+{
+ int offset;
+ int i;
+
+ i = find_first_bit(pgdir->bitmap, MLX5_DB_PER_PAGE);
+ if (i >= MLX5_DB_PER_PAGE)
+ return -ENOMEM;
+
+ __clear_bit(i, pgdir->bitmap);
+
+ db->u.pgdir = pgdir;
+ db->index = i;
+ offset = db->index * L1_CACHE_BYTES;
+ db->db = pgdir->db_page + offset / sizeof(*pgdir->db_page);
+ db->dma = pgdir->db_dma + offset;
+
+ db->db[0] = 0;
+ db->db[1] = 0;
+
+ return 0;
+}
+
+int mlx5_db_alloc_node(struct mlx5_core_dev *dev, struct mlx5_db *db, int node)
+{
+ struct mlx5_db_pgdir *pgdir;
+ int ret = 0;
+
+ mutex_lock(&dev->priv.pgdir_mutex);
+
+ list_for_each_entry(pgdir, &dev->priv.pgdir_list, list)
+ if (!mlx5_alloc_db_from_pgdir(pgdir, db))
+ goto out;
+
+ pgdir = mlx5_alloc_db_pgdir(dev, node);
+ if (!pgdir) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ list_add(&pgdir->list, &dev->priv.pgdir_list);
+
+ /* This should never fail -- we just allocated an empty page: */
+ WARN_ON(mlx5_alloc_db_from_pgdir(pgdir, db));
+
+out:
+ mutex_unlock(&dev->priv.pgdir_mutex);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(mlx5_db_alloc_node);
+
+int mlx5_db_alloc(struct mlx5_core_dev *dev, struct mlx5_db *db)
+{
+ return mlx5_db_alloc_node(dev, db, dev->priv.numa_node);
+}
+EXPORT_SYMBOL_GPL(mlx5_db_alloc);
+
+void mlx5_db_free(struct mlx5_core_dev *dev, struct mlx5_db *db)
+{
+ mutex_lock(&dev->priv.pgdir_mutex);
+
+ __set_bit(db->index, db->u.pgdir->bitmap);
+
+ if (bitmap_full(db->u.pgdir->bitmap, MLX5_DB_PER_PAGE)) {
+ dma_free_coherent(&(dev->pdev->dev), PAGE_SIZE,
+ db->u.pgdir->db_page, db->u.pgdir->db_dma);
+ list_del(&db->u.pgdir->list);
+ kfree(db->u.pgdir);
+ }
+
+ mutex_unlock(&dev->priv.pgdir_mutex);
+}
+EXPORT_SYMBOL_GPL(mlx5_db_free);
+
+
+void mlx5_fill_page_array(struct mlx5_buf *buf, __be64 *pas)
+{
+ u64 addr;
+ int i;
+
+ for (i = 0; i < buf->npages; i++) {
+ if (buf->nbufs == 1)
+ addr = buf->direct.map + (i << buf->page_shift);
+ else
+ addr = buf->page_list[i].map;
+
+ pas[i] = cpu_to_be64(addr);
+ }
+}
+EXPORT_SYMBOL_GPL(mlx5_fill_page_array);
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/cmd.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/cmd.c
new file mode 100644
index 0000000..1e8b721
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/cmd.c
@@ -0,0 +1,2069 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+
+#include "mlx5_core.h"
+
+enum {
+ CMD_IF_REV = 5,
+};
+
+enum {
+ CMD_MODE_POLLING,
+ CMD_MODE_EVENTS
+};
+
+enum {
+ MLX5_CMD_DELIVERY_STAT_OK = 0x0,
+ MLX5_CMD_DELIVERY_STAT_SIGNAT_ERR = 0x1,
+ MLX5_CMD_DELIVERY_STAT_TOK_ERR = 0x2,
+ MLX5_CMD_DELIVERY_STAT_BAD_BLK_NUM_ERR = 0x3,
+ MLX5_CMD_DELIVERY_STAT_OUT_PTR_ALIGN_ERR = 0x4,
+ MLX5_CMD_DELIVERY_STAT_IN_PTR_ALIGN_ERR = 0x5,
+ MLX5_CMD_DELIVERY_STAT_FW_ERR = 0x6,
+ MLX5_CMD_DELIVERY_STAT_IN_LENGTH_ERR = 0x7,
+ MLX5_CMD_DELIVERY_STAT_OUT_LENGTH_ERR = 0x8,
+ MLX5_CMD_DELIVERY_STAT_RES_FLD_NOT_CLR_ERR = 0x9,
+ MLX5_CMD_DELIVERY_STAT_CMD_DESCR_ERR = 0x10,
+};
+
+static int cmd_sysfs_init(struct mlx5_core_dev *dev);
+static void cmd_sysfs_cleanup(struct mlx5_core_dev *dev);
+
+static struct mlx5_cmd_work_ent *alloc_cmd(struct mlx5_cmd *cmd,
+ struct mlx5_cmd_msg *in,
+ struct mlx5_cmd_msg *out,
+ void *uout, int uout_size,
+ mlx5_cmd_cbk_t cbk,
+ void *context, int page_queue)
+{
+ gfp_t alloc_flags = cbk ? GFP_ATOMIC : GFP_KERNEL;
+ struct mlx5_cmd_work_ent *ent;
+
+ ent = kzalloc(sizeof(*ent), alloc_flags);
+ if (!ent)
+ return ERR_PTR(-ENOMEM);
+
+ ent->in = in;
+ ent->out = out;
+ ent->uout = uout;
+ ent->uout_size = uout_size;
+ ent->callback = cbk;
+ ent->context = context;
+ ent->cmd = cmd;
+ ent->page_queue = page_queue;
+
+ return ent;
+}
+
+static u8 alloc_token(struct mlx5_cmd *cmd)
+{
+ u8 token;
+
+ spin_lock(&cmd->token_lock);
+ cmd->token++;
+ if (cmd->token == 0)
+ cmd->token++;
+ token = cmd->token;
+ spin_unlock(&cmd->token_lock);
+
+ return token;
+}
+
+static int alloc_ent(struct mlx5_cmd *cmd)
+{
+ unsigned long flags;
+ int ret;
+
+ spin_lock_irqsave(&cmd->alloc_lock, flags);
+ ret = find_first_bit(&cmd->bitmask, cmd->max_reg_cmds);
+ if (ret < cmd->max_reg_cmds)
+ clear_bit(ret, &cmd->bitmask);
+ spin_unlock_irqrestore(&cmd->alloc_lock, flags);
+
+ return ret < cmd->max_reg_cmds ? ret : -ENOMEM;
+}
+
+static void free_ent(struct mlx5_cmd *cmd, int idx)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&cmd->alloc_lock, flags);
+ set_bit(idx, &cmd->bitmask);
+ spin_unlock_irqrestore(&cmd->alloc_lock, flags);
+}
+
+static struct mlx5_cmd_layout *get_inst(struct mlx5_cmd *cmd, int idx)
+{
+ return cmd->cmd_buf + (idx << cmd->log_stride);
+}
+
+static u8 xor8_buf(void *buf, int len)
+{
+ u8 *ptr = buf;
+ u8 sum = 0;
+ int i;
+
+ for (i = 0; i < len; i++)
+ sum ^= ptr[i];
+
+ return sum;
+}
+
+static int verify_block_sig(struct mlx5_cmd_prot_block *block)
+{
+ if (xor8_buf(block->rsvd0, sizeof(*block) - sizeof(block->data) - 1) != 0xff)
+ return -EINVAL;
+
+ if (xor8_buf(block, sizeof(*block)) != 0xff)
+ return -EINVAL;
+
+ return 0;
+}
+
+static void calc_block_sig(struct mlx5_cmd_prot_block *block, u8 token,
+ int csum)
+{
+ block->token = token;
+ if (csum) {
+ block->ctrl_sig = ~xor8_buf(block->rsvd0, sizeof(*block) -
+ sizeof(block->data) - 2);
+ block->sig = ~xor8_buf(block, sizeof(*block) - 1);
+ }
+}
+
+static void calc_chain_sig(struct mlx5_cmd_msg *msg, u8 token, int csum)
+{
+ struct mlx5_cmd_mailbox *next = msg->next;
+
+ while (next) {
+ calc_block_sig(next->buf, token, csum);
+ next = next->next;
+ }
+}
+
+static void set_signature(struct mlx5_cmd_work_ent *ent, int csum)
+{
+ ent->lay->sig = ~xor8_buf(ent->lay, sizeof(*ent->lay));
+ calc_chain_sig(ent->in, ent->token, csum);
+ calc_chain_sig(ent->out, ent->token, csum);
+}
+
+static void poll_timeout(struct mlx5_cmd_work_ent *ent)
+{
+ unsigned long poll_end = jiffies + msecs_to_jiffies(MLX5_CMD_TIMEOUT_MSEC + 1000);
+ u8 own;
+
+ do {
+ own = ent->lay->status_own;
+ if (!(own & CMD_OWNER_HW)) {
+ ent->ret = 0;
+ return;
+ }
+ usleep_range(5000, 10000);
+ } while (time_before(jiffies, poll_end));
+
+ ent->ret = -ETIMEDOUT;
+}
+
+static void free_cmd(struct mlx5_cmd_work_ent *ent)
+{
+ kfree(ent);
+}
+
+
+static int verify_signature(struct mlx5_cmd_work_ent *ent)
+{
+ struct mlx5_cmd_mailbox *next = ent->out->next;
+ int err;
+ u8 sig;
+
+ sig = xor8_buf(ent->lay, sizeof(*ent->lay));
+ if (sig != 0xff)
+ return -EINVAL;
+
+ while (next) {
+ err = verify_block_sig(next->buf);
+ if (err)
+ return err;
+
+ next = next->next;
+ }
+
+ return 0;
+}
+
+static void dump_buf(void *buf, int size, int data_only, int offset)
+{
+ __be32 *p = buf;
+ int i;
+
+ for (i = 0; i < size; i += 16) {
+ pr_debug("%03x: %08x %08x %08x %08x\n", offset, be32_to_cpu(p[0]),
+ be32_to_cpu(p[1]), be32_to_cpu(p[2]),
+ be32_to_cpu(p[3]));
+ p += 4;
+ offset += 16;
+ }
+ if (!data_only)
+ pr_debug("\n");
+}
+
+static int mlx5_internal_err_ret_value(struct mlx5_core_dev *dev, u16 op)
+{
+ switch (op) {
+ case MLX5_CMD_OP_DISABLE_HCA:
+ case MLX5_CMD_OP_DESTROY_MKEY:
+ case MLX5_CMD_OP_TEARDOWN_HCA:
+ case MLX5_CMD_OP_DESTROY_EQ:
+ case MLX5_CMD_OP_DESTROY_CQ:
+ case MLX5_CMD_OP_DESTROY_QP:
+ case MLX5_CMD_OP_DESTROY_PSV:
+ case MLX5_CMD_OP_DESTROY_SRQ:
+ case MLX5_CMD_OP_DESTROY_XRC_SRQ:
+ case MLX5_CMD_OP_DESTROY_DCT:
+ case MLX5_CMD_OP_DEALLOC_PD:
+ case MLX5_CMD_OP_DEALLOC_UAR:
+ case MLX5_CMD_OP_DETTACH_FROM_MCG:
+ case MLX5_CMD_OP_DEALLOC_XRCD:
+ case MLX5_CMD_OP_MANAGE_PAGES:
+ case MLX5_CMD_OP_DESTROY_TIR:
+ case MLX5_CMD_OP_DESTROY_SQ:
+ case MLX5_CMD_OP_DESTROY_RQ:
+ case MLX5_CMD_OP_DESTROY_RMP:
+ case MLX5_CMD_OP_DESTROY_RQT:
+ case MLX5_CMD_OP_DESTROY_FLOW_TABLE:
+ case MLX5_CMD_OP_DESTROY_FLOW_GROUP:
+ return MLX5_CMD_STAT_OK;
+
+ case MLX5_CMD_OP_QUERY_HCA_CAP:
+ case MLX5_CMD_OP_QUERY_ADAPTER:
+ case MLX5_CMD_OP_INIT_HCA:
+ case MLX5_CMD_OP_ENABLE_HCA:
+ case MLX5_CMD_OP_QUERY_PAGES:
+ case MLX5_CMD_OP_SET_HCA_CAP:
+ case MLX5_CMD_OP_CREATE_MKEY:
+ case MLX5_CMD_OP_QUERY_MKEY:
+ case MLX5_CMD_OP_QUERY_SPECIAL_CONTEXTS:
+ case MLX5_CMD_OP_PAGE_FAULT_RESUME:
+ case MLX5_CMD_OP_CREATE_EQ:
+ case MLX5_CMD_OP_QUERY_EQ:
+ case MLX5_CMD_OP_CREATE_CQ:
+ case MLX5_CMD_OP_QUERY_CQ:
+ case MLX5_CMD_OP_MODIFY_CQ:
+ case MLX5_CMD_OP_CREATE_QP:
+ case MLX5_CMD_OP_RST2INIT_QP:
+ case MLX5_CMD_OP_INIT2RTR_QP:
+ case MLX5_CMD_OP_RTR2RTS_QP:
+ case MLX5_CMD_OP_RTS2RTS_QP:
+ case MLX5_CMD_OP_SQERR2RTS_QP:
+ case MLX5_CMD_OP_2ERR_QP:
+ case MLX5_CMD_OP_2RST_QP:
+ case MLX5_CMD_OP_QUERY_QP:
+ case MLX5_CMD_OP_MAD_IFC:
+ case MLX5_CMD_OP_INIT2INIT_QP:
+ case MLX5_CMD_OP_CREATE_PSV:
+ case MLX5_CMD_OP_CREATE_SRQ:
+ case MLX5_CMD_OP_QUERY_SRQ:
+ case MLX5_CMD_OP_ARM_RQ:
+ case MLX5_CMD_OP_CREATE_DCT:
+ case MLX5_CMD_OP_DRAIN_DCT:
+ case MLX5_CMD_OP_QUERY_DCT:
+ case MLX5_CMD_OP_ALLOC_PD:
+ case MLX5_CMD_OP_ALLOC_UAR:
+ case MLX5_CMD_OP_ATTACH_TO_MCG:
+ case MLX5_CMD_OP_ALLOC_XRCD:
+ case MLX5_CMD_OP_ACCESS_REG:
+ return -EIO;
+ default:
+ mlx5_core_err(dev, "Unknown FW command\n");
+ return -EINVAL;
+ }
+}
+
+const char *mlx5_command_str(int command)
+{
+ switch (command) {
+ case MLX5_CMD_OP_QUERY_HCA_CAP:
+ return "QUERY_HCA_CAP";
+
+ case MLX5_CMD_OP_SET_HCA_CAP:
+ return "SET_HCA_CAP";
+
+ case MLX5_CMD_OP_QUERY_ADAPTER:
+ return "QUERY_ADAPTER";
+
+ case MLX5_CMD_OP_INIT_HCA:
+ return "INIT_HCA";
+
+ case MLX5_CMD_OP_TEARDOWN_HCA:
+ return "TEARDOWN_HCA";
+
+ case MLX5_CMD_OP_ENABLE_HCA:
+ return "ENABLE_HCA";
+
+ case MLX5_CMD_OP_DISABLE_HCA:
+ return "DISABLE_HCA";
+
+ case MLX5_CMD_OP_QUERY_PAGES:
+ return "QUERY_PAGES";
+
+ case MLX5_CMD_OP_MANAGE_PAGES:
+ return "MANAGE_PAGES";
+
+ case MLX5_CMD_OP_CREATE_MKEY:
+ return "CREATE_MKEY";
+
+ case MLX5_CMD_OP_QUERY_MKEY:
+ return "QUERY_MKEY";
+
+ case MLX5_CMD_OP_DESTROY_MKEY:
+ return "DESTROY_MKEY";
+
+ case MLX5_CMD_OP_QUERY_SPECIAL_CONTEXTS:
+ return "QUERY_SPECIAL_CONTEXTS";
+
+ case MLX5_CMD_OP_CREATE_EQ:
+ return "CREATE_EQ";
+
+ case MLX5_CMD_OP_DESTROY_EQ:
+ return "DESTROY_EQ";
+
+ case MLX5_CMD_OP_QUERY_EQ:
+ return "QUERY_EQ";
+
+ case MLX5_CMD_OP_CREATE_CQ:
+ return "CREATE_CQ";
+
+ case MLX5_CMD_OP_DESTROY_CQ:
+ return "DESTROY_CQ";
+
+ case MLX5_CMD_OP_QUERY_CQ:
+ return "QUERY_CQ";
+
+ case MLX5_CMD_OP_MODIFY_CQ:
+ return "MODIFY_CQ";
+
+ case MLX5_CMD_OP_CREATE_QP:
+ return "CREATE_QP";
+
+ case MLX5_CMD_OP_DESTROY_QP:
+ return "DESTROY_QP";
+
+ case MLX5_CMD_OP_RST2INIT_QP:
+ return "RST2INIT_QP";
+
+ case MLX5_CMD_OP_INIT2RTR_QP:
+ return "INIT2RTR_QP";
+
+ case MLX5_CMD_OP_RTR2RTS_QP:
+ return "RTR2RTS_QP";
+
+ case MLX5_CMD_OP_RTS2RTS_QP:
+ return "RTS2RTS_QP";
+
+ case MLX5_CMD_OP_SQERR2RTS_QP:
+ return "SQERR2RTS_QP";
+
+ case MLX5_CMD_OP_2ERR_QP:
+ return "2ERR_QP";
+
+ case MLX5_CMD_OP_2RST_QP:
+ return "2RST_QP";
+
+ case MLX5_CMD_OP_QUERY_QP:
+ return "QUERY_QP";
+
+ case MLX5_CMD_OP_MAD_IFC:
+ return "MAD_IFC";
+
+ case MLX5_CMD_OP_INIT2INIT_QP:
+ return "INIT2INIT_QP";
+
+ case MLX5_CMD_OP_CREATE_PSV:
+ return "CREATE_PSV";
+
+ case MLX5_CMD_OP_DESTROY_PSV:
+ return "DESTROY_PSV";
+
+ case MLX5_CMD_OP_CREATE_SRQ:
+ return "CREATE_SRQ";
+
+ case MLX5_CMD_OP_DESTROY_SRQ:
+ return "DESTROY_SRQ";
+
+ case MLX5_CMD_OP_QUERY_SRQ:
+ return "QUERY_SRQ";
+
+ case MLX5_CMD_OP_ARM_RQ:
+ return "ARM_RQ";
+
+ case MLX5_CMD_OP_CREATE_XRC_SRQ:
+ return "CREATE_XRC_SRQ";
+
+ case MLX5_CMD_OP_DESTROY_XRC_SRQ:
+ return "DESTROY_XRC_SRQ";
+
+ case MLX5_CMD_OP_QUERY_XRC_SRQ:
+ return "QUERY_XRC_SRQ";
+
+ case MLX5_CMD_OP_ARM_XRC_SRQ:
+ return "ARM_XRC_SRQ";
+
+ case MLX5_CMD_OP_CREATE_DCT:
+ return "CREATE_DCT";
+
+ case MLX5_CMD_OP_DESTROY_DCT:
+ return "DESTROY_DCT";
+
+ case MLX5_CMD_OP_DRAIN_DCT:
+ return "DRAIN_DCT";
+
+ case MLX5_CMD_OP_QUERY_DCT:
+ return "QUERY_DCT";
+
+ case MLX5_CMD_OP_ARM_DCT_FOR_KEY_VIOLATION:
+ return "ARM_DCT";
+
+ case MLX5_CMD_OP_ALLOC_PD:
+ return "ALLOC_PD";
+
+ case MLX5_CMD_OP_DEALLOC_PD:
+ return "DEALLOC_PD";
+
+ case MLX5_CMD_OP_ALLOC_UAR:
+ return "ALLOC_UAR";
+
+ case MLX5_CMD_OP_DEALLOC_UAR:
+ return "DEALLOC_UAR";
+
+ case MLX5_CMD_OP_ATTACH_TO_MCG:
+ return "ATTACH_TO_MCG";
+
+ case MLX5_CMD_OP_DETTACH_FROM_MCG:
+ return "DETTACH_FROM_MCG";
+
+ case MLX5_CMD_OP_ALLOC_XRCD:
+ return "ALLOC_XRCD";
+
+ case MLX5_CMD_OP_DEALLOC_XRCD:
+ return "DEALLOC_XRCD";
+
+ case MLX5_CMD_OP_ACCESS_REG:
+ return "ACCESS_REG";
+
+ case MLX5_CMD_OP_QUERY_HCA_VPORT_CONTEXT:
+ return "QUERY_HCA_VPORT_CONTEXT";
+
+ case MLX5_CMD_OP_MODIFY_HCA_VPORT_CONTEXT:
+ return "MODIFY_HCA_VPORT_CONTEXT";
+
+ case MLX5_CMD_OP_QUERY_HCA_VPORT_PKEY:
+ return "QUERY_HCA_VPORT_PKEY";
+
+ case MLX5_CMD_OP_QUERY_HCA_VPORT_GID:
+ return "QUERY_HCA_VPORT_GID";
+
+ case MLX5_CMD_OP_QUERY_VPORT_COUNTER:
+ return "QUERY_VPORT_COUNTER";
+
+ default: return "unknown command opcode";
+ }
+}
+
+static void dump_command(struct mlx5_core_dev *dev,
+ struct mlx5_cmd_work_ent *ent, int input)
+{
+ u16 op = be16_to_cpu(((struct mlx5_inbox_hdr *)(ent->lay->in))->opcode);
+ struct mlx5_cmd_msg *msg = input ? ent->in : ent->out;
+ struct mlx5_cmd_mailbox *next = msg->next;
+ int data_only;
+ u32 offset = 0;
+ int dump_len;
+
+ data_only = !!(mlx5_core_debug_mask & (1 << MLX5_CMD_DATA));
+
+ if (data_only)
+ mlx5_core_dbg_mask(dev, 1 << MLX5_CMD_DATA,
+ "dump command data %s(0x%x) %s\n",
+ mlx5_command_str(op), op,
+ input ? "INPUT" : "OUTPUT");
+ else
+ mlx5_core_dbg(dev, "dump command %s(0x%x) %s\n",
+ mlx5_command_str(op), op,
+ input ? "INPUT" : "OUTPUT");
+
+ if (data_only) {
+ if (input) {
+ dump_buf(ent->lay->in, sizeof(ent->lay->in), 1, offset);
+ offset += sizeof(ent->lay->in);
+ } else {
+ dump_buf(ent->lay->out, sizeof(ent->lay->out), 1, offset);
+ offset += sizeof(ent->lay->out);
+ }
+ } else {
+ dump_buf(ent->lay, sizeof(*ent->lay), 0, offset);
+ offset += sizeof(*ent->lay);
+ }
+
+ while (next && offset < msg->len) {
+ if (data_only) {
+ dump_len = min_t(int, MLX5_CMD_DATA_BLOCK_SIZE, msg->len - offset);
+ dump_buf(next->buf, dump_len, 1, offset);
+ offset += MLX5_CMD_DATA_BLOCK_SIZE;
+ } else {
+ mlx5_core_dbg(dev, "command block:\n");
+ dump_buf(next->buf, sizeof(struct mlx5_cmd_prot_block), 0, offset);
+ offset += sizeof(struct mlx5_cmd_prot_block);
+ }
+ next = next->next;
+ }
+
+ if (data_only)
+ pr_debug("\n");
+}
+
+static void cmd_work_handler(struct work_struct *work)
+{
+ struct mlx5_cmd_work_ent *ent = container_of(work, struct mlx5_cmd_work_ent, work);
+ struct mlx5_cmd *cmd = ent->cmd;
+ struct mlx5_core_dev *dev = container_of(cmd, struct mlx5_core_dev, cmd);
+ struct mlx5_cmd_layout *lay;
+ struct semaphore *sem;
+
+ sem = ent->page_queue ? &cmd->pages_sem : &cmd->sem;
+ down(sem);
+ if (!ent->page_queue) {
+ ent->idx = alloc_ent(cmd);
+ if (ent->idx < 0) {
+ mlx5_core_err(dev, "failed to allocate command entry\n");
+ up(sem);
+ return;
+ }
+ } else {
+ ent->idx = cmd->max_reg_cmds;
+ }
+
+ ent->token = alloc_token(cmd);
+ cmd->ent_arr[ent->idx] = ent;
+ lay = get_inst(cmd, ent->idx);
+ ent->lay = lay;
+ memset(lay, 0, sizeof(*lay));
+ memcpy(lay->in, ent->in->first.data, sizeof(lay->in));
+ ent->op = be32_to_cpu(lay->in[0]) >> 16;
+ if (ent->in->next)
+ lay->in_ptr = cpu_to_be64(ent->in->next->dma);
+ lay->inlen = cpu_to_be32(ent->in->len);
+ if (ent->out->next)
+ lay->out_ptr = cpu_to_be64(ent->out->next->dma);
+ lay->outlen = cpu_to_be32(ent->out->len);
+ lay->type = MLX5_PCI_CMD_XPORT;
+ lay->token = ent->token;
+ lay->status_own = CMD_OWNER_HW;
+ set_signature(ent, !cmd->checksum_disabled);
+ dump_command(dev, ent, 1);
+#ifdef HAVE_KTIME_GET_NS
+ ent->ts1 = ktime_get_ns();
+#else
+ ktime_get_ts(&ent->ts1);
+#endif
+
+ /* ring doorbell after the descriptor is valid */
+ mlx5_core_dbg(dev, "writing 0x%x to command doorbell\n", 1 << ent->idx);
+ wmb();
+ iowrite32be(1 << ent->idx, &dev->iseg->cmd_dbell);
+ mmiowb();
+ /* if not in polling don't use ent after this point */
+ if (cmd->mode == CMD_MODE_POLLING) {
+ poll_timeout(ent);
+ /* make sure we read the descriptor after ownership is SW */
+ rmb();
+ mlx5_cmd_comp_handler(dev, 1UL << ent->idx);
+ }
+}
+
+static const char *deliv_status_to_str(u8 status)
+{
+ switch (status) {
+ case MLX5_CMD_DELIVERY_STAT_OK:
+ return "no errors";
+ case MLX5_CMD_DELIVERY_STAT_SIGNAT_ERR:
+ return "signature error";
+ case MLX5_CMD_DELIVERY_STAT_TOK_ERR:
+ return "token error";
+ case MLX5_CMD_DELIVERY_STAT_BAD_BLK_NUM_ERR:
+ return "bad block number";
+ case MLX5_CMD_DELIVERY_STAT_OUT_PTR_ALIGN_ERR:
+ return "output pointer not aligned to block size";
+ case MLX5_CMD_DELIVERY_STAT_IN_PTR_ALIGN_ERR:
+ return "input pointer not aligned to block size";
+ case MLX5_CMD_DELIVERY_STAT_FW_ERR:
+ return "firmware internal error";
+ case MLX5_CMD_DELIVERY_STAT_IN_LENGTH_ERR:
+ return "command input length error";
+ case MLX5_CMD_DELIVERY_STAT_OUT_LENGTH_ERR:
+ return "command output length error";
+ case MLX5_CMD_DELIVERY_STAT_RES_FLD_NOT_CLR_ERR:
+ return "reserved fields not cleared";
+ case MLX5_CMD_DELIVERY_STAT_CMD_DESCR_ERR:
+ return "bad command descriptor type";
+ default:
+ return "unknown status code";
+ }
+}
+
+static u16 msg_to_opcode(struct mlx5_cmd_msg *in)
+{
+ struct mlx5_inbox_hdr *hdr = (struct mlx5_inbox_hdr *)(in->first.data);
+
+ return be16_to_cpu(hdr->opcode);
+}
+
+static int wait_func(struct mlx5_core_dev *dev, struct mlx5_cmd_work_ent *ent)
+{
+ unsigned long timeout = msecs_to_jiffies(MLX5_CMD_TIMEOUT_MSEC);
+ struct mlx5_cmd *cmd = &dev->cmd;
+ int err;
+
+ if (cmd->mode == CMD_MODE_POLLING) {
+ wait_for_completion(&ent->done);
+ err = ent->ret;
+ } else {
+ if (!wait_for_completion_timeout(&ent->done, timeout))
+ err = -ETIMEDOUT;
+ else
+ err = 0;
+ }
+ if (err == -ETIMEDOUT) {
+ mlx5_core_warn(dev, "%s(0x%x) timeout. Will cause a leak of a command resource\n",
+ mlx5_command_str(msg_to_opcode(ent->in)),
+ msg_to_opcode(ent->in));
+ }
+ mlx5_core_dbg(dev, "err %d, delivery status %s(%d)\n",
+ err, deliv_status_to_str(ent->status), ent->status);
+
+ return err;
+}
+
+/* Notes:
+ * 1. Callback functions may not sleep
+ * 2. page queue commands do not support asynchronous completion
+ */
+static int mlx5_cmd_invoke(struct mlx5_core_dev *dev, struct mlx5_cmd_msg *in,
+ struct mlx5_cmd_msg *out, void *uout, int uout_size,
+ mlx5_cmd_cbk_t callback,
+ void *context, int page_queue, u8 *status)
+{
+ struct mlx5_cmd *cmd = &dev->cmd;
+ struct mlx5_cmd_work_ent *ent;
+ struct mlx5_cmd_stats *stats;
+#ifndef HAVE_KTIME_GET_NS
+ ktime_t t1, t2, delta;
+#endif
+ int err = 0;
+ s64 ds;
+ u16 op;
+
+ if (pci_channel_offline(dev->pdev) ||
+ (dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)) {
+ /* Device is going through error recovery
+ * and cannot accept commands.
+ */
+ return mlx5_internal_err_ret_value(dev, msg_to_opcode(in));
+ }
+
+ if (callback && page_queue)
+ return -EINVAL;
+
+ ent = alloc_cmd(cmd, in, out, uout, uout_size, callback, context,
+ page_queue);
+ if (IS_ERR(ent))
+ return PTR_ERR(ent);
+
+ if (!callback)
+ init_completion(&ent->done);
+
+ INIT_WORK(&ent->work, cmd_work_handler);
+ if (page_queue) {
+ cmd_work_handler(&ent->work);
+ } else if (!queue_work(cmd->wq, &ent->work)) {
+ mlx5_core_warn(dev, "failed to queue work\n");
+ err = -ENOMEM;
+ goto out_free;
+ }
+
+ if (!callback) {
+ err = wait_func(dev, ent);
+ if (err == -ETIMEDOUT)
+ goto out;
+
+#ifdef HAVE_KTIME_GET_NS
+ ds = ent->ts2 - ent->ts1;
+#else
+ t1 = timespec_to_ktime(ent->ts1);
+ t2 = timespec_to_ktime(ent->ts2);
+ delta = ktime_sub(t2, t1);
+ ds = ktime_to_ns(delta);
+#endif
+ op = be16_to_cpu(((struct mlx5_inbox_hdr *)in->first.data)->opcode);
+ if (op < ARRAY_SIZE(cmd->stats)) {
+ stats = &cmd->stats[op];
+ spin_lock_irq(&stats->lock);
+ stats->sum += ds;
+ ++stats->n;
+ spin_unlock_irq(&stats->lock);
+ }
+ mlx5_core_dbg_mask(dev, 1 << MLX5_CMD_TIME,
+ "fw exec time for %s is %lld nsec\n",
+ mlx5_command_str(op), ds);
+ *status = ent->status;
+ free_cmd(ent);
+ }
+
+ return err;
+
+out_free:
+ free_cmd(ent);
+out:
+ return err;
+}
+
+static ssize_t dbg_write(struct file *filp, const char __user *buf,
+ size_t count, loff_t *pos)
+{
+ struct mlx5_core_dev *dev = filp->private_data;
+ struct mlx5_cmd_debug *dbg = &dev->cmd.dbg;
+ char lbuf[3];
+ int err;
+
+ if (!dbg->in_msg || !dbg->out_msg)
+ return -ENOMEM;
+
+ if (copy_from_user(lbuf, buf, sizeof(lbuf)))
+ return -EFAULT;
+
+ lbuf[sizeof(lbuf) - 1] = 0;
+
+ if (strcmp(lbuf, "go"))
+ return -EINVAL;
+
+ err = mlx5_cmd_exec(dev, dbg->in_msg, dbg->inlen, dbg->out_msg, dbg->outlen);
+
+ return err ? err : count;
+}
+
+
+static const struct file_operations fops = {
+ .owner = THIS_MODULE,
+ .open = simple_open,
+ .write = dbg_write,
+};
+
+static int mlx5_copy_to_msg(struct mlx5_cmd_msg *to, void *from, int size)
+{
+ struct mlx5_cmd_prot_block *block;
+ struct mlx5_cmd_mailbox *next;
+ int copy;
+
+ if (!to || !from)
+ return -ENOMEM;
+
+ copy = min_t(int, size, sizeof(to->first.data));
+ memcpy(to->first.data, from, copy);
+ size -= copy;
+ from += copy;
+
+ next = to->next;
+ while (size) {
+ if (!next) {
+ /* this is a BUG */
+ return -ENOMEM;
+ }
+
+ copy = min_t(int, size, MLX5_CMD_DATA_BLOCK_SIZE);
+ block = next->buf;
+ memcpy(block->data, from, copy);
+ from += copy;
+ size -= copy;
+ next = next->next;
+ }
+
+ return 0;
+}
+
+static int mlx5_copy_from_msg(void *to, struct mlx5_cmd_msg *from, int size)
+{
+ struct mlx5_cmd_prot_block *block;
+ struct mlx5_cmd_mailbox *next;
+ int copy;
+
+ if (!to || !from)
+ return -ENOMEM;
+
+ copy = min_t(int, size, sizeof(from->first.data));
+ memcpy(to, from->first.data, copy);
+ size -= copy;
+ to += copy;
+
+ next = from->next;
+ while (size) {
+ if (!next) {
+ /* this is a BUG */
+ return -ENOMEM;
+ }
+
+ copy = min_t(int, size, MLX5_CMD_DATA_BLOCK_SIZE);
+ block = next->buf;
+
+ memcpy(to, block->data, copy);
+ to += copy;
+ size -= copy;
+ next = next->next;
+ }
+
+ return 0;
+}
+
+static struct mlx5_cmd_mailbox *alloc_cmd_box(struct mlx5_core_dev *dev,
+ gfp_t flags)
+{
+ struct mlx5_cmd_mailbox *mailbox;
+
+ mailbox = kmalloc(sizeof(*mailbox), flags);
+ if (!mailbox)
+ return ERR_PTR(-ENOMEM);
+
+ mailbox->buf = pci_pool_alloc(dev->cmd.pool, flags,
+ &mailbox->dma);
+ if (!mailbox->buf) {
+ mlx5_core_dbg(dev, "failed allocation\n");
+ kfree(mailbox);
+ return ERR_PTR(-ENOMEM);
+ }
+ memset(mailbox->buf, 0, sizeof(struct mlx5_cmd_prot_block));
+ mailbox->next = NULL;
+
+ return mailbox;
+}
+
+static void free_cmd_box(struct mlx5_core_dev *dev,
+ struct mlx5_cmd_mailbox *mailbox)
+{
+ pci_pool_free(dev->cmd.pool, mailbox->buf, mailbox->dma);
+ kfree(mailbox);
+}
+
+static struct mlx5_cmd_msg *mlx5_alloc_cmd_msg(struct mlx5_core_dev *dev,
+ gfp_t flags, int size)
+{
+ struct mlx5_cmd_mailbox *tmp, *head = NULL;
+ struct mlx5_cmd_prot_block *block;
+ struct mlx5_cmd_msg *msg;
+ int blen;
+ int err;
+ int n;
+ int i;
+
+ msg = kzalloc(sizeof(*msg), flags);
+ if (!msg)
+ return ERR_PTR(-ENOMEM);
+
+ blen = size - min_t(int, sizeof(msg->first.data), size);
+ n = (blen + MLX5_CMD_DATA_BLOCK_SIZE - 1) / MLX5_CMD_DATA_BLOCK_SIZE;
+
+ for (i = 0; i < n; i++) {
+ tmp = alloc_cmd_box(dev, flags);
+ if (IS_ERR(tmp)) {
+ mlx5_core_warn(dev, "failed allocating block\n");
+ err = PTR_ERR(tmp);
+ goto err_alloc;
+ }
+
+ block = tmp->buf;
+ tmp->next = head;
+ block->next = cpu_to_be64(tmp->next ? tmp->next->dma : 0);
+ block->block_num = cpu_to_be32(n - i - 1);
+ head = tmp;
+ }
+ msg->next = head;
+ msg->len = size;
+ return msg;
+
+err_alloc:
+ while (head) {
+ tmp = head->next;
+ free_cmd_box(dev, head);
+ head = tmp;
+ }
+ kfree(msg);
+
+ return ERR_PTR(err);
+}
+
+static void mlx5_free_cmd_msg(struct mlx5_core_dev *dev,
+ struct mlx5_cmd_msg *msg)
+{
+ struct mlx5_cmd_mailbox *head = msg->next;
+ struct mlx5_cmd_mailbox *next;
+
+ while (head) {
+ next = head->next;
+ free_cmd_box(dev, head);
+ head = next;
+ }
+ kfree(msg);
+}
+
+static ssize_t data_write(struct file *filp, const char __user *buf,
+ size_t count, loff_t *pos)
+{
+ struct mlx5_core_dev *dev = filp->private_data;
+ struct mlx5_cmd_debug *dbg = &dev->cmd.dbg;
+ void *ptr;
+ int err;
+
+ if (*pos != 0)
+ return -EINVAL;
+
+ kfree(dbg->in_msg);
+ dbg->in_msg = NULL;
+ dbg->inlen = 0;
+
+ ptr = kzalloc(count, GFP_KERNEL);
+ if (!ptr)
+ return -ENOMEM;
+
+ if (copy_from_user(ptr, buf, count)) {
+ err = -EFAULT;
+ goto out;
+ }
+ dbg->in_msg = ptr;
+ dbg->inlen = count;
+
+ *pos = count;
+
+ return count;
+
+out:
+ kfree(ptr);
+ return err;
+}
+
+static ssize_t data_read(struct file *filp, char __user *buf, size_t count,
+ loff_t *pos)
+{
+ struct mlx5_core_dev *dev = filp->private_data;
+ struct mlx5_cmd_debug *dbg = &dev->cmd.dbg;
+ int copy;
+
+ if (*pos)
+ return 0;
+
+ if (!dbg->out_msg)
+ return -ENOMEM;
+
+ copy = min_t(int, count, dbg->outlen);
+ if (copy_to_user(buf, dbg->out_msg, copy))
+ return -EFAULT;
+
+ *pos += copy;
+
+ return copy;
+}
+
+static const struct file_operations dfops = {
+ .owner = THIS_MODULE,
+ .open = simple_open,
+ .write = data_write,
+ .read = data_read,
+};
+
+static ssize_t outlen_read(struct file *filp, char __user *buf, size_t count,
+ loff_t *pos)
+{
+ struct mlx5_core_dev *dev = filp->private_data;
+ struct mlx5_cmd_debug *dbg = &dev->cmd.dbg;
+ char outlen[8];
+ int err;
+
+ if (*pos)
+ return 0;
+
+ err = snprintf(outlen, sizeof(outlen), "%d", dbg->outlen);
+ if (err < 0)
+ return err;
+
+ if (copy_to_user(buf, &outlen, err))
+ return -EFAULT;
+
+ *pos += err;
+
+ return err;
+}
+
+static ssize_t outlen_write(struct file *filp, const char __user *buf,
+ size_t count, loff_t *pos)
+{
+ struct mlx5_core_dev *dev = filp->private_data;
+ struct mlx5_cmd_debug *dbg = &dev->cmd.dbg;
+ char outlen_str[8];
+ int outlen;
+ void *ptr;
+ int err;
+
+ if (*pos != 0 || count > 6)
+ return -EINVAL;
+
+ kfree(dbg->out_msg);
+ dbg->out_msg = NULL;
+ dbg->outlen = 0;
+
+ if (copy_from_user(outlen_str, buf, count))
+ return -EFAULT;
+
+ outlen_str[count] = 0;
+
+ err = sscanf(outlen_str, "%d", &outlen);
+ if (err != 1)
+ return -EINVAL;
+
+ ptr = kzalloc(outlen, GFP_KERNEL);
+ if (!ptr)
+ return -ENOMEM;
+
+ dbg->out_msg = ptr;
+ dbg->outlen = outlen;
+
+ *pos = count;
+
+ return count;
+}
+
+static const struct file_operations olfops = {
+ .owner = THIS_MODULE,
+ .open = simple_open,
+ .write = outlen_write,
+ .read = outlen_read,
+};
+
+static void set_wqname(struct mlx5_core_dev *dev)
+{
+ struct mlx5_cmd *cmd = &dev->cmd;
+
+ snprintf(cmd->wq_name, sizeof(cmd->wq_name), "mlx5_cmd_%s",
+ dev_name(&dev->pdev->dev));
+}
+
+static void clean_debug_files(struct mlx5_core_dev *dev)
+{
+ struct mlx5_cmd_debug *dbg = &dev->cmd.dbg;
+
+ if (!mlx5_debugfs_root)
+ return;
+
+ mlx5_cmdif_debugfs_cleanup(dev);
+ debugfs_remove_recursive(dbg->dbg_root);
+}
+
+static int create_debugfs_files(struct mlx5_core_dev *dev)
+{
+ struct mlx5_cmd_debug *dbg = &dev->cmd.dbg;
+ int err = -ENOMEM;
+
+ if (!mlx5_debugfs_root)
+ return 0;
+
+ dbg->dbg_root = debugfs_create_dir("cmd", dev->priv.dbg_root);
+ if (!dbg->dbg_root)
+ return err;
+
+ dbg->dbg_in = debugfs_create_file("in", 0400, dbg->dbg_root,
+ dev, &dfops);
+ if (!dbg->dbg_in)
+ goto err_dbg;
+
+ dbg->dbg_out = debugfs_create_file("out", 0200, dbg->dbg_root,
+ dev, &dfops);
+ if (!dbg->dbg_out)
+ goto err_dbg;
+
+ dbg->dbg_outlen = debugfs_create_file("out_len", 0600, dbg->dbg_root,
+ dev, &olfops);
+ if (!dbg->dbg_outlen)
+ goto err_dbg;
+
+ dbg->dbg_status = debugfs_create_u8("status", 0600, dbg->dbg_root,
+ &dbg->status);
+ if (!dbg->dbg_status)
+ goto err_dbg;
+
+ dbg->dbg_run = debugfs_create_file("run", 0200, dbg->dbg_root, dev, &fops);
+ if (!dbg->dbg_run)
+ goto err_dbg;
+
+ mlx5_cmdif_debugfs_init(dev);
+
+ return 0;
+
+err_dbg:
+ clean_debug_files(dev);
+ return err;
+}
+
+void mlx5_cmd_use_events(struct mlx5_core_dev *dev)
+{
+ struct mlx5_cmd *cmd = &dev->cmd;
+ int i;
+
+ for (i = 0; i < cmd->max_reg_cmds; i++)
+ down(&cmd->sem);
+
+ down(&cmd->pages_sem);
+
+ flush_workqueue(cmd->wq);
+
+ cmd->mode = CMD_MODE_EVENTS;
+
+ up(&cmd->pages_sem);
+ for (i = 0; i < cmd->max_reg_cmds; i++)
+ up(&cmd->sem);
+}
+
+void mlx5_cmd_use_polling(struct mlx5_core_dev *dev)
+{
+ struct mlx5_cmd *cmd = &dev->cmd;
+ int i;
+
+ for (i = 0; i < cmd->max_reg_cmds; i++)
+ down(&cmd->sem);
+
+ down(&cmd->pages_sem);
+
+ flush_workqueue(cmd->wq);
+ cmd->mode = CMD_MODE_POLLING;
+
+ up(&cmd->pages_sem);
+ for (i = 0; i < cmd->max_reg_cmds; i++)
+ up(&cmd->sem);
+}
+
+static void free_msg(struct mlx5_core_dev *dev, struct mlx5_cmd_msg *msg)
+{
+ unsigned long flags;
+
+ if (msg->ch) {
+ spin_lock_irqsave(&msg->ch->lock, flags);
+ list_add_tail(&msg->list, &msg->ch->head);
+ msg->ch->free++;
+ spin_unlock_irqrestore(&msg->ch->lock, flags);
+ } else {
+ mlx5_free_cmd_msg(dev, msg);
+ }
+}
+
+void mlx5_cmd_comp_handler(struct mlx5_core_dev *dev, unsigned long vector)
+{
+ struct mlx5_cmd *cmd = &dev->cmd;
+ struct mlx5_cmd_work_ent *ent;
+ mlx5_cmd_cbk_t callback;
+ void *context;
+ int err;
+ int i;
+#ifndef HAVE_KTIME_GET_NS
+ ktime_t t1, t2, delta;
+#endif
+ s64 ds;
+ struct mlx5_cmd_stats *stats;
+ unsigned long flags;
+
+ for (i = 0; i < (1 << cmd->log_sz); i++) {
+ if (test_bit(i, &vector)) {
+ struct semaphore *sem;
+
+ ent = cmd->ent_arr[i];
+ if (ent->page_queue)
+ sem = &cmd->pages_sem;
+ else
+ sem = &cmd->sem;
+#ifdef HAVE_KTIME_GET_NS
+ ent->ts2 = ktime_get_ns();
+#else
+ ktime_get_ts(&ent->ts2);
+#endif
+ memcpy(ent->out->first.data, ent->lay->out, sizeof(ent->lay->out));
+ dump_command(dev, ent, 0);
+ if (!ent->ret) {
+ if (!cmd->checksum_disabled)
+ ent->ret = verify_signature(ent);
+ else
+ ent->ret = 0;
+ ent->status = ent->lay->status_own >> 1;
+ mlx5_core_dbg(dev, "command completed. ret 0x%x, delivery status %s(0x%x)\n",
+ ent->ret, deliv_status_to_str(ent->status), ent->status);
+ }
+ free_ent(cmd, ent->idx);
+ if (ent->callback) {
+#ifdef HAVE_KTIME_GET_NS
+ ds = ent->ts2 - ent->ts1;
+#else
+ t1 = timespec_to_ktime(ent->ts1);
+ t2 = timespec_to_ktime(ent->ts2);
+ delta = ktime_sub(t2, t1);
+ ds = ktime_to_ns(delta);
+#endif
+ if (ent->op < ARRAY_SIZE(cmd->stats)) {
+ stats = &cmd->stats[ent->op];
+ spin_lock_irqsave(&stats->lock, flags);
+ stats->sum += ds;
+ ++stats->n;
+ spin_unlock_irqrestore(&stats->lock, flags);
+ }
+
+ callback = ent->callback;
+ context = ent->context;
+ err = ent->ret;
+ if (!err)
+ err = mlx5_copy_from_msg(ent->uout,
+ ent->out,
+ ent->uout_size);
+
+ mlx5_free_cmd_msg(dev, ent->out);
+ free_msg(dev, ent->in);
+
+ free_cmd(ent);
+ callback(err, context);
+ } else {
+ complete(&ent->done);
+ }
+ up(sem);
+ }
+ }
+}
+EXPORT_SYMBOL(mlx5_cmd_comp_handler);
+
+static int status_to_err(u8 status)
+{
+ return status ? -1 : 0; /* TBD more meaningful codes */
+}
+
+static struct mlx5_cmd_msg *alloc_msg(struct mlx5_core_dev *dev, int in_size,
+ gfp_t gfp)
+{
+ struct mlx5_cmd_msg *msg = ERR_PTR(-ENOMEM);
+ struct mlx5_cmd *cmd = &dev->cmd;
+ struct mlx5_cmd_cache_head *ch = NULL;
+ int i;
+ int miss_accounted = 0;
+ int total_accounted = 0;
+
+ if (in_size > 16) {
+ for (i = 0; i < MLX5_NUM_COMMAND_CACHES; i++) {
+ ch = &cmd->cache.ch[i];
+ if (in_size <= ch->max_inbox_size) {
+ spin_lock_irq(&ch->lock);
+ if (!total_accounted) {
+ ch->total_commands++;
+ total_accounted = 1;
+ }
+ if (!list_empty(&ch->head)) {
+ msg = list_entry(ch->head.next, typeof(*msg), list);
+ /* For cached lists, we must explicitly state what is
+ * the real size
+ */
+ msg->len = in_size;
+ list_del(&msg->list);
+ ch->free--;
+ spin_unlock_irq(&ch->lock);
+ break;
+ }
+ if (!miss_accounted) {
+ ch->miss++;
+ miss_accounted = 1;
+ }
+ spin_unlock_irq(&ch->lock);
+ }
+ }
+ }
+
+ if (IS_ERR(msg)) {
+ if (in_size > 16)
+ atomic_inc(&cmd->cache.real_miss);
+ msg = mlx5_alloc_cmd_msg(dev, gfp, in_size);
+ }
+
+ return msg;
+}
+
+static u16 opcode_from_in(struct mlx5_inbox_hdr *in)
+{
+ return be16_to_cpu(in->opcode);
+}
+
+static int is_manage_pages(struct mlx5_inbox_hdr *in)
+{
+ return be16_to_cpu(in->opcode) == MLX5_CMD_OP_MANAGE_PAGES;
+}
+
+static int cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
+ int out_size, mlx5_cmd_cbk_t callback, void *context)
+{
+ struct mlx5_cmd_msg *inb;
+ struct mlx5_cmd_msg *outb;
+ int pages_queue;
+ gfp_t gfp;
+ int err;
+ u8 status = 0;
+
+ if (dev->priv.sriov.vf_partial_init) {
+ mlx5_core_warn(dev, "device is not initialized\n");
+ return -EPERM;
+ }
+
+ if (pci_channel_offline(dev->pdev) ||
+ dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
+ return mlx5_internal_err_ret_value(dev, opcode_from_in(in));
+
+ pages_queue = is_manage_pages(in);
+ gfp = callback ? GFP_ATOMIC : GFP_KERNEL;
+
+ inb = alloc_msg(dev, in_size, gfp);
+ if (IS_ERR(inb)) {
+ err = PTR_ERR(inb);
+ return err;
+ }
+
+ err = mlx5_copy_to_msg(inb, in, in_size);
+ if (err) {
+ mlx5_core_warn(dev, "err %d\n", err);
+ goto out_in;
+ }
+
+ outb = mlx5_alloc_cmd_msg(dev, gfp, out_size);
+ if (IS_ERR(outb)) {
+ err = PTR_ERR(outb);
+ goto out_in;
+ }
+
+ err = mlx5_cmd_invoke(dev, inb, outb, out, out_size, callback, context,
+ pages_queue, &status);
+ if (err)
+ goto out_out;
+
+ mlx5_core_dbg(dev, "err %d, status %d\n", err, status);
+ if (status) {
+ err = status_to_err(status);
+ goto out_out;
+ }
+
+ if (!callback)
+ err = mlx5_copy_from_msg(out, outb, out_size);
+
+out_out:
+ if (!callback)
+ mlx5_free_cmd_msg(dev, outb);
+
+out_in:
+ if (!callback)
+ free_msg(dev, inb);
+ return err;
+}
+
+int mlx5_cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
+ int out_size)
+{
+ return cmd_exec(dev, in, in_size, out, out_size, NULL, NULL);
+}
+EXPORT_SYMBOL(mlx5_cmd_exec);
+
+int mlx5_cmd_exec_cb(struct mlx5_core_dev *dev, void *in, int in_size,
+ void *out, int out_size, mlx5_cmd_cbk_t callback,
+ void *context)
+{
+ return cmd_exec(dev, in, in_size, out, out_size, callback, context);
+}
+EXPORT_SYMBOL(mlx5_cmd_exec_cb);
+
+static void destroy_msg_cache(struct mlx5_core_dev *dev)
+{
+ struct mlx5_cmd_cache_head *ch;
+ struct mlx5_cmd_msg *msg;
+ struct mlx5_cmd_msg *n;
+ int i;
+
+ for (i = 0; i < MLX5_NUM_COMMAND_CACHES; i++) {
+ ch = &dev->cmd.cache.ch[i];
+ list_for_each_entry_safe(msg, n, &ch->head, list) {
+ list_del(&msg->list);
+ ch->free--;
+ mlx5_free_cmd_msg(dev, msg);
+ }
+ }
+
+ cmd_sysfs_cleanup(dev);
+}
+
+static unsigned cmd_cache_num_ent[MLX5_NUM_COMMAND_CACHES] = {
+ 512, 32, 16, 8, 2
+};
+
+static unsigned cmd_cache_ent_size[MLX5_NUM_COMMAND_CACHES] = {
+ 16 + MLX5_CMD_DATA_BLOCK_SIZE,
+ 16 + MLX5_CMD_DATA_BLOCK_SIZE * 2,
+ 16 + MLX5_CMD_DATA_BLOCK_SIZE * 16,
+ 16 + MLX5_CMD_DATA_BLOCK_SIZE * 256,
+ 16 + MLX5_CMD_DATA_BLOCK_SIZE * 512,
+};
+
+static int create_msg_cache(struct mlx5_core_dev *dev)
+{
+ struct mlx5_cmd *cmd = &dev->cmd;
+ struct mlx5_cmd_msg *msg;
+ struct mlx5_cmd_cache_head *ch;
+ int err;
+ int i;
+ int k;
+
+ for (k = 0; k < MLX5_NUM_COMMAND_CACHES; k++) {
+ ch = &cmd->cache.ch[k];
+ spin_lock_init(&ch->lock);
+ INIT_LIST_HEAD(&ch->head);
+ ch->num_ent = cmd_cache_num_ent[k];
+ ch->max_inbox_size = cmd_cache_ent_size[k];
+ ch->miss = 0;
+
+ for (i = 0; i < ch->num_ent; i++) {
+ msg = mlx5_alloc_cmd_msg(dev, GFP_KERNEL, ch->max_inbox_size);
+ if (IS_ERR(msg)) {
+ err = PTR_ERR(msg);
+ goto ex_err;
+ }
+ msg->ch = ch;
+ ch->free++;
+ list_add_tail(&msg->list, &ch->head);
+ }
+ }
+
+ err = cmd_sysfs_init(dev);
+ if (err)
+ goto ex_err;
+
+ return 0;
+
+ex_err:
+ destroy_msg_cache(dev);
+ return err;
+}
+
+static int alloc_cmd_page(struct mlx5_core_dev *dev, struct mlx5_cmd *cmd)
+{
+ struct device *ddev = &dev->pdev->dev;
+
+ cmd->cmd_alloc_buf = dma_zalloc_coherent(ddev, MLX5_ADAPTER_PAGE_SIZE,
+ &cmd->alloc_dma, GFP_KERNEL);
+ if (!cmd->cmd_alloc_buf)
+ return -ENOMEM;
+
+ /* make sure it is aligned to 4K */
+ if (!((uintptr_t)cmd->cmd_alloc_buf & (MLX5_ADAPTER_PAGE_SIZE - 1))) {
+ cmd->cmd_buf = cmd->cmd_alloc_buf;
+ cmd->dma = cmd->alloc_dma;
+ cmd->alloc_size = MLX5_ADAPTER_PAGE_SIZE;
+ return 0;
+ }
+
+ dma_free_coherent(ddev, MLX5_ADAPTER_PAGE_SIZE, cmd->cmd_alloc_buf,
+ cmd->alloc_dma);
+ cmd->cmd_alloc_buf = dma_zalloc_coherent(ddev,
+ 2 * MLX5_ADAPTER_PAGE_SIZE - 1,
+ &cmd->alloc_dma, GFP_KERNEL);
+ if (!cmd->cmd_alloc_buf)
+ return -ENOMEM;
+
+ cmd->cmd_buf = PTR_ALIGN(cmd->cmd_alloc_buf, MLX5_ADAPTER_PAGE_SIZE);
+ cmd->dma = ALIGN(cmd->alloc_dma, MLX5_ADAPTER_PAGE_SIZE);
+ cmd->alloc_size = 2 * MLX5_ADAPTER_PAGE_SIZE - 1;
+ return 0;
+}
+
+static void free_cmd_page(struct mlx5_core_dev *dev, struct mlx5_cmd *cmd)
+{
+ struct device *ddev = &dev->pdev->dev;
+
+ dma_free_coherent(ddev, cmd->alloc_size, cmd->cmd_alloc_buf,
+ cmd->alloc_dma);
+}
+
+int mlx5_cmd_init(struct mlx5_core_dev *dev)
+{
+ int size = sizeof(struct mlx5_cmd_prot_block);
+ int align = roundup_pow_of_two(size);
+ struct mlx5_cmd *cmd = &dev->cmd;
+ u32 cmd_h, cmd_l;
+ u16 cmd_if_rev;
+ int err;
+ int i;
+
+ memset(cmd, 0, sizeof(*cmd));
+ cmd_if_rev = cmdif_rev(dev);
+ if (cmd_if_rev != CMD_IF_REV) {
+ dev_err(&dev->pdev->dev,
+ "Driver cmdif rev(%d) differs from firmware's(%d)\n",
+ CMD_IF_REV, cmd_if_rev);
+ return -EINVAL;
+ }
+
+ cmd->pool = pci_pool_create("mlx5_cmd", dev->pdev, size, align, 0);
+ if (!cmd->pool)
+ return -ENOMEM;
+
+ err = alloc_cmd_page(dev, cmd);
+ if (err)
+ goto err_free_pool;
+
+ cmd_l = ioread32be(&dev->iseg->cmdq_addr_l_sz) & 0xff;
+ cmd->log_sz = cmd_l >> 4 & 0xf;
+ cmd->log_stride = cmd_l & 0xf;
+ if (1 << cmd->log_sz > MLX5_MAX_COMMANDS) {
+ dev_err(&dev->pdev->dev, "firmware reports too many outstanding commands %d\n",
+ 1 << cmd->log_sz);
+ err = -EINVAL;
+ goto err_free_page;
+ }
+
+ if (cmd->log_sz + cmd->log_stride > MLX5_ADAPTER_PAGE_SHIFT) {
+ dev_err(&dev->pdev->dev, "command queue size overflow\n");
+ err = -EINVAL;
+ goto err_free_page;
+ }
+
+ cmd->checksum_disabled = 1;
+ cmd->max_reg_cmds = (1 << cmd->log_sz) - 1;
+ cmd->bitmask = (1 << cmd->max_reg_cmds) - 1;
+
+ cmd->cmdif_rev = ioread32be(&dev->iseg->cmdif_rev_fw_sub) >> 16;
+ if (cmd->cmdif_rev > CMD_IF_REV) {
+ dev_err(&dev->pdev->dev, "driver does not support command interface version. driver %d, firmware %d\n",
+ CMD_IF_REV, cmd->cmdif_rev);
+ err = -ENOTSUPP;
+ goto err_free_page;
+ }
+
+ spin_lock_init(&cmd->alloc_lock);
+ spin_lock_init(&cmd->token_lock);
+ for (i = 0; i < ARRAY_SIZE(cmd->stats); i++)
+ spin_lock_init(&cmd->stats[i].lock);
+
+ sema_init(&cmd->sem, cmd->max_reg_cmds);
+ sema_init(&cmd->pages_sem, 1);
+
+ cmd_h = (u32)((u64)(cmd->dma) >> 32);
+ cmd_l = (u32)(cmd->dma);
+ if (cmd_l & 0xfff) {
+ dev_err(&dev->pdev->dev, "invalid command queue address\n");
+ err = -ENOMEM;
+ goto err_free_page;
+ }
+
+ iowrite32be(cmd_h, &dev->iseg->cmdq_addr_h);
+ iowrite32be(cmd_l, &dev->iseg->cmdq_addr_l_sz);
+
+ /* Make sure firmware sees the complete address before we proceed */
+ wmb();
+
+ mlx5_core_dbg(dev, "descriptor at dma 0x%llx\n", (unsigned long long)(cmd->dma));
+
+ cmd->mode = CMD_MODE_POLLING;
+
+ err = create_msg_cache(dev);
+ if (err) {
+ dev_err(&dev->pdev->dev, "failed to create command cache\n");
+ goto err_free_page;
+ }
+
+ set_wqname(dev);
+ cmd->wq = create_singlethread_workqueue(cmd->wq_name);
+ if (!cmd->wq) {
+ dev_err(&dev->pdev->dev, "failed to create command workqueue\n");
+ err = -ENOMEM;
+ goto err_cache;
+ }
+
+ err = create_debugfs_files(dev);
+ if (err) {
+ err = -ENOMEM;
+ goto err_wq;
+ }
+
+ return 0;
+
+err_wq:
+ destroy_workqueue(cmd->wq);
+
+err_cache:
+ destroy_msg_cache(dev);
+
+err_free_page:
+ free_cmd_page(dev, cmd);
+
+err_free_pool:
+ pci_pool_destroy(cmd->pool);
+
+ return err;
+}
+EXPORT_SYMBOL(mlx5_cmd_init);
+
+void mlx5_cmd_cleanup(struct mlx5_core_dev *dev)
+{
+ struct mlx5_cmd *cmd = &dev->cmd;
+
+ clean_debug_files(dev);
+ destroy_workqueue(cmd->wq);
+ destroy_msg_cache(dev);
+ free_cmd_page(dev, cmd);
+ pci_pool_destroy(cmd->pool);
+}
+EXPORT_SYMBOL(mlx5_cmd_cleanup);
+
+static const char *cmd_status_str(u8 status)
+{
+ switch (status) {
+ case MLX5_CMD_STAT_OK:
+ return "OK";
+ case MLX5_CMD_STAT_INT_ERR:
+ return "internal error";
+ case MLX5_CMD_STAT_BAD_OP_ERR:
+ return "bad operation";
+ case MLX5_CMD_STAT_BAD_PARAM_ERR:
+ return "bad parameter";
+ case MLX5_CMD_STAT_BAD_SYS_STATE_ERR:
+ return "bad system state";
+ case MLX5_CMD_STAT_BAD_RES_ERR:
+ return "bad resource";
+ case MLX5_CMD_STAT_RES_BUSY:
+ return "resource busy";
+ case MLX5_CMD_STAT_LIM_ERR:
+ return "limits exceeded";
+ case MLX5_CMD_STAT_BAD_RES_STATE_ERR:
+ return "bad resource state";
+ case MLX5_CMD_STAT_IX_ERR:
+ return "bad index";
+ case MLX5_CMD_STAT_NO_RES_ERR:
+ return "no resources";
+ case MLX5_CMD_STAT_BAD_INP_LEN_ERR:
+ return "bad input length";
+ case MLX5_CMD_STAT_BAD_OUTP_LEN_ERR:
+ return "bad output length";
+ case MLX5_CMD_STAT_BAD_QP_STATE_ERR:
+ return "bad QP state";
+ case MLX5_CMD_STAT_BAD_PKT_ERR:
+ return "bad packet (discarded)";
+ case MLX5_CMD_STAT_BAD_SIZE_OUTS_CQES_ERR:
+ return "bad size too many outstanding CQEs";
+ default:
+ return "unknown status";
+ }
+}
+
+static int cmd_status_to_err(u8 status)
+{
+ switch (status) {
+ case MLX5_CMD_STAT_OK: return 0;
+ case MLX5_CMD_STAT_INT_ERR: return -EIO;
+ case MLX5_CMD_STAT_BAD_OP_ERR: return -EINVAL;
+ case MLX5_CMD_STAT_BAD_PARAM_ERR: return -EINVAL;
+ case MLX5_CMD_STAT_BAD_SYS_STATE_ERR: return -EIO;
+ case MLX5_CMD_STAT_BAD_RES_ERR: return -EINVAL;
+ case MLX5_CMD_STAT_RES_BUSY: return -EBUSY;
+ case MLX5_CMD_STAT_LIM_ERR: return -ENOMEM;
+ case MLX5_CMD_STAT_BAD_RES_STATE_ERR: return -EINVAL;
+ case MLX5_CMD_STAT_IX_ERR: return -EINVAL;
+ case MLX5_CMD_STAT_NO_RES_ERR: return -EAGAIN;
+ case MLX5_CMD_STAT_BAD_INP_LEN_ERR: return -EIO;
+ case MLX5_CMD_STAT_BAD_OUTP_LEN_ERR: return -EIO;
+ case MLX5_CMD_STAT_BAD_QP_STATE_ERR: return -EINVAL;
+ case MLX5_CMD_STAT_BAD_PKT_ERR: return -EINVAL;
+ case MLX5_CMD_STAT_BAD_SIZE_OUTS_CQES_ERR: return -EINVAL;
+ default: return -EIO;
+ }
+}
+
+/* Kept only until all commands are converted to the set/get macros. */
+int mlx5_cmd_status_to_err(struct mlx5_outbox_hdr *hdr)
+{
+ if (!hdr->status)
+ return 0;
+
+ pr_warn("command failed, status %s(0x%x), syndrome 0x%x\n",
+ cmd_status_str(hdr->status), hdr->status,
+ be32_to_cpu(hdr->syndrome));
+
+ return cmd_status_to_err(hdr->status);
+}
+
+int mlx5_cmd_status_to_err_v2(void *ptr)
+{
+ u32 syndrome;
+ u8 status;
+
+ status = be32_to_cpu(*(__be32 *)ptr) >> 24;
+ if (!status)
+ return 0;
+
+ syndrome = be32_to_cpu(*(__be32 *)(ptr + 4));
+
+ pr_warn("command failed, status %s(0x%x), syndrome 0x%x\n",
+ cmd_status_str(status), status, syndrome);
+
+ return cmd_status_to_err(status);
+}
+
+struct cmd_cache_attribute {
+ struct attribute attr;
+ ssize_t (*show)(struct mlx5_cmd_cache_head *,
+ struct cmd_cache_attribute *, char *buf);
+ ssize_t (*store)(struct mlx5_cmd_cache_head *, struct cmd_cache_attribute *,
+ const char *buf, size_t count);
+};
+
+static ssize_t free_show(struct mlx5_cmd_cache_head *ch,
+ struct cmd_cache_attribute *ca,
+ char *buf)
+{
+ return snprintf(buf, 20, "%d\n", ch->free);
+}
+
+static ssize_t num_ent_show(struct mlx5_cmd_cache_head *ch,
+ struct cmd_cache_attribute *ca,
+ char *buf)
+{
+ return snprintf(buf, 20, "%d\n", ch->num_ent);
+}
+
+static ssize_t num_ent_store(struct mlx5_cmd_cache_head *ch,
+ struct cmd_cache_attribute *ca,
+ const char *buf, size_t count)
+{
+ struct mlx5_cmd_msg *msg;
+ struct mlx5_cmd_msg *n;
+ LIST_HEAD(remove_list);
+ LIST_HEAD(add_list);
+ unsigned long flags;
+ int err = count;
+ int add = 0;
+ int remove;
+ u32 var;
+ int i;
+
+#if (LINUX_VERSION_CODE > KERNEL_VERSION(2, 6, 18))
+ if (kstrtouint(buf, 0, &var))
+#else
+ if (sscanf(buf, "%u", &var) != 1)
+#endif
+ return -EINVAL;
+
+ spin_lock_irqsave(&ch->lock, flags);
+ if (var < ch->num_ent) {
+ remove = ch->num_ent - var;
+ for (i = 0; i < remove; i++) {
+ if (!list_empty(&ch->head)) {
+ msg = list_entry(ch->head.next, typeof(*msg), list);
+ list_del(&msg->list);
+ list_add(&msg->list, &remove_list);
+ ch->free--;
+ ch->num_ent--;
+ } else {
+ err = -EBUSY;
+ break;
+ }
+ }
+ } else if (var > ch->num_ent) {
+ add = var - ch->num_ent;
+ }
+ spin_unlock_irqrestore(&ch->lock, flags);
+
+ list_for_each_entry_safe(msg, n, &remove_list, list) {
+ list_del(&msg->list);
+ mlx5_free_cmd_msg(ch->dev, msg);
+ }
+
+ for (i = 0; i < add; i++) {
+ msg = mlx5_alloc_cmd_msg(ch->dev, GFP_KERNEL, ch->max_inbox_size);
+ if (IS_ERR(msg)) {
+ err = PTR_ERR(msg);
+ if (i)
+ pr_warn("could add only %d entries\n", i);
+ break;
+ }
+ list_add(&msg->list, &add_list);
+ }
+
+ spin_lock_irqsave(&ch->lock, flags);
+ list_for_each_entry_safe(msg, n, &add_list, list) {
+ list_del(&msg->list);
+ list_add_tail(&msg->list, &ch->head);
+ ch->num_ent++;
+ ch->free++;
+ }
+ spin_unlock_irqrestore(&ch->lock, flags);
+
+ return err;
+}
+
+static ssize_t miss_store(struct mlx5_cmd_cache_head *ch,
+ struct cmd_cache_attribute *ca,
+ const char *buf, size_t count)
+{
+ unsigned long flags;
+ u32 var;
+
+#if (LINUX_VERSION_CODE > KERNEL_VERSION(2, 6, 18))
+ if (kstrtouint(buf, 0, &var))
+#else
+ if (sscanf(buf, "%u", &var) != 1)
+#endif
+ return -EINVAL;
+
+ if (var) {
+ pr_warn("you may only clear the miss value\n");
+ return -EINVAL;
+ }
+
+ spin_lock_irqsave(&ch->lock, flags);
+ ch->miss = 0;
+ spin_unlock_irqrestore(&ch->lock, flags);
+
+ return count;
+}
+
+static ssize_t miss_show(struct mlx5_cmd_cache_head *ch,
+ struct cmd_cache_attribute *ca,
+ char *buf)
+{
+ return snprintf(buf, 20, "%d\n", ch->miss);
+}
+
+static ssize_t total_commands_show(struct mlx5_cmd_cache_head *ch,
+ struct cmd_cache_attribute *ca,
+ char *buf)
+{
+ return snprintf(buf, 20, "%d\n", ch->total_commands);
+}
+
+static ssize_t total_commands_store(struct mlx5_cmd_cache_head *ch,
+ struct cmd_cache_attribute *ca,
+ const char *buf, size_t count)
+{
+ unsigned long flags;
+ u32 var;
+
+#if (LINUX_VERSION_CODE > KERNEL_VERSION(2, 6, 18))
+ if (kstrtouint(buf, 0, &var))
+#else
+ if (sscanf(buf, "%u", &var) != 1)
+#endif
+ return -EINVAL;
+
+ if (var) {
+ pr_warn("you may only clear the total_commands value\n");
+ return -EINVAL;
+ }
+
+ spin_lock_irqsave(&ch->lock, flags);
+ ch->total_commands = 0;
+ spin_unlock_irqrestore(&ch->lock, flags);
+
+ return count;
+}
+
+static ssize_t cmd_cache_attr_show(struct kobject *kobj,
+ struct attribute *attr, char *buf)
+{
+ struct cmd_cache_attribute *ca =
+ container_of(attr, struct cmd_cache_attribute, attr);
+ struct mlx5_cmd_cache_head *ch = container_of(kobj, struct mlx5_cmd_cache_head, kobj);
+
+ if (!ca->show)
+ return -EIO;
+
+ return ca->show(ch, ca, buf);
+}
+
+static ssize_t cmd_cache_attr_store(struct kobject *kobj,
+ struct attribute *attr,
+ const char *buf,
+ size_t size)
+{
+ struct cmd_cache_attribute *ca =
+ container_of(attr, struct cmd_cache_attribute, attr);
+ struct mlx5_cmd_cache_head *ch = container_of(kobj, struct mlx5_cmd_cache_head, kobj);
+
+ if (!ca->store)
+ return -EIO;
+
+ return ca->store(ch, ca, buf, size);
+}
+
+static ssize_t real_miss_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct pci_dev *pdev = container_of(dev, struct pci_dev, dev);
+ struct mlx5_core_dev *cdev = pci_get_drvdata(pdev);
+
+ return snprintf(buf, 20, "%d\n", atomic_read(&cdev->cmd.cache.real_miss));
+}
+
+static ssize_t real_miss_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct pci_dev *pdev = container_of(dev, struct pci_dev, dev);
+ struct mlx5_core_dev *cdev = pci_get_drvdata(pdev);
+ u32 var;
+
+#if (LINUX_VERSION_CODE > KERNEL_VERSION(2, 6, 18))
+ if (kstrtouint(buf, 0, &var))
+#else
+ if (sscanf(buf, "%u", &var) != 1)
+#endif
+ return -EINVAL;
+
+ if (var) {
+ pr_warn("you may only clear this value\n");
+ return -EINVAL;
+ }
+
+ atomic_set(&cdev->cmd.cache.real_miss, 0);
+
+ return count;
+}
+
+#ifdef CONFIG_COMPAT_IS_CONST_KOBJECT_SYSFS_OPS
+static const struct sysfs_ops cmd_cache_sysfs_ops = {
+#else
+static struct sysfs_ops cmd_cache_sysfs_ops = {
+#endif
+ .show = cmd_cache_attr_show,
+ .store = cmd_cache_attr_store,
+};
+
+#define CMD_CACHE_ATTR(_name) struct cmd_cache_attribute cmd_cache_attr_##_name = \
+ __ATTR(_name, 0644, _name##_show, _name##_store)
+#define CMD_CACHE_ATTR_RO(_name) struct cmd_cache_attribute cmd_cache_attr_##_name = \
+ __ATTR(_name, 0444, _name##_show, NULL)
+
+static CMD_CACHE_ATTR_RO(free);
+static CMD_CACHE_ATTR(num_ent);
+static CMD_CACHE_ATTR(miss);
+static CMD_CACHE_ATTR(total_commands);
+
+struct cache_pdev_attr {
+ struct attribute attr;
+ struct mlx5_core_dev *dev;
+ struct kobject kobj;
+};
+
+static struct attribute *cmd_cache_default_attrs[] = {
+ &cmd_cache_attr_free.attr,
+ &cmd_cache_attr_num_ent.attr,
+ &cmd_cache_attr_miss.attr,
+ &cmd_cache_attr_total_commands.attr,
+ NULL
+};
+
+static struct kobj_type cmd_cache_type = {
+ .sysfs_ops = &cmd_cache_sysfs_ops,
+ .default_attrs = cmd_cache_default_attrs
+};
+
+static DEVICE_ATTR(real_miss, S_IRUGO, real_miss_show, real_miss_store);
+
+static int cmd_sysfs_init(struct mlx5_core_dev *dev)
+{
+ struct mlx5_cmd *cmd = &dev->cmd;
+ struct cmd_msg_cache *cache = &cmd->cache;
+ struct mlx5_cmd_cache_head *ch;
+ struct device *class_dev = &dev->pdev->dev;
+ int err;
+ int i;
+
+ cache->ko = kobject_create_and_add("commands_cache", &dev->pdev->dev.kobj);
+ if (!cache->ko)
+ return -ENOMEM;
+
+ err = device_create_file(class_dev, &dev_attr_real_miss);
+ if (err)
+ goto err_rm;
+
+ for (i = 0; i < MLX5_NUM_COMMAND_CACHES; i++) {
+ ch = &cache->ch[i];
+ err = kobject_init_and_add(&ch->kobj, &cmd_cache_type,
+ cache->ko, "%d", cmd_cache_ent_size[i]);
+ if (err)
+ goto err_put;
+ ch->dev = dev;
+ kobject_uevent(&ch->kobj, KOBJ_ADD);
+ }
+
+ return 0;
+
+err_put:
+ device_remove_file(class_dev, &dev_attr_real_miss);
+ for (; i >= 0; i--) {
+ ch = &cache->ch[i];
+ kobject_put(&ch->kobj);
+ }
+
+err_rm:
+ kobject_put(cache->ko);
+ return err;
+}
+
+static void cmd_sysfs_cleanup(struct mlx5_core_dev *dev)
+{
+ struct device *class_dev = &dev->pdev->dev;
+ struct mlx5_cmd_cache_head *ch;
+ int i;
+
+ device_remove_file(class_dev, &dev_attr_real_miss);
+ for (i = MLX5_NUM_COMMAND_CACHES - 1; i >= 0; i--) {
+ ch = &dev->cmd.cache.ch[i];
+ if (ch->dev)
+ kobject_put(&ch->kobj);
+ }
+ if (dev->cmd.cache.ko) {
+ kobject_put(dev->cmd.cache.ko);
+ dev->cmd.cache.ko = NULL;
+ }
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/cq.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/cq.c
new file mode 100644
index 0000000..b584864
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/cq.c
@@ -0,0 +1,236 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx5_core.h"
+
+void mlx5_cq_completion(struct mlx5_core_dev *dev, u32 cqn)
+{
+ struct mlx5_core_cq *cq;
+ struct mlx5_cq_table *table = &dev->priv.cq_table;
+
+ rcu_read_lock();
+ cq = radix_tree_lookup(&table->tree, cqn);
+ if (unlikely(!cq)) {
+ rcu_read_unlock();
+ mlx5_core_warn(dev, "Completion event for bogus CQ 0x%x\n", cqn);
+ return;
+ }
+
+ ++cq->arm_sn;
+
+ cq->comp(cq);
+ rcu_read_unlock();
+}
+
+void mlx5_cq_event(struct mlx5_core_dev *dev, u32 cqn, int event_type)
+{
+ struct mlx5_cq_table *table = &dev->priv.cq_table;
+ struct mlx5_core_cq *cq;
+
+ rcu_read_lock();
+ cq = radix_tree_lookup(&table->tree, cqn);
+ if (!cq) {
+ rcu_read_unlock();
+ mlx5_core_warn(dev, "Async event for bogus CQ 0x%x\n", cqn);
+ return;
+ }
+
+ cq->event(cq, event_type);
+ rcu_read_unlock();
+}
+
+
+int mlx5_core_create_cq(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq,
+ struct mlx5_create_cq_mbox_in *in, int inlen)
+{
+ int err;
+ struct mlx5_cq_table *table = &dev->priv.cq_table;
+ struct mlx5_create_cq_mbox_out out;
+ struct mlx5_destroy_cq_mbox_in din;
+ struct mlx5_destroy_cq_mbox_out dout;
+
+ in->hdr.opcode = cpu_to_be16(MLX5_CMD_OP_CREATE_CQ);
+ memset(&out, 0, sizeof(out));
+ err = mlx5_cmd_exec(dev, in, inlen, &out, sizeof(out));
+ if (err)
+ return err;
+
+ if (out.hdr.status)
+ return mlx5_cmd_status_to_err(&out.hdr);
+
+ cq->cqn = be32_to_cpu(out.cqn) & 0xffffff;
+ cq->cons_index = 0;
+ cq->arm_sn = 0;
+
+ spin_lock_irq(&table->lock);
+ err = radix_tree_insert(&table->tree, cq->cqn, cq);
+ spin_unlock_irq(&table->lock);
+ if (err)
+ goto err_cmd;
+
+ cq->pid = current->pid;
+ err = mlx5_debug_cq_add(dev, cq);
+ if (err)
+ mlx5_core_dbg(dev, "failed adding CQ 0x%x to debug file system\n",
+ cq->cqn);
+
+ return 0;
+
+err_cmd:
+ memset(&din, 0, sizeof(din));
+ memset(&dout, 0, sizeof(dout));
+ din.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_DESTROY_CQ);
+ din.cqn = cpu_to_be32(cq->cqn);
+ mlx5_cmd_exec(dev, &din, sizeof(din), &dout, sizeof(dout));
+ return err;
+}
+EXPORT_SYMBOL(mlx5_core_create_cq);
+
+int mlx5_core_destroy_cq(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq)
+{
+ struct mlx5_cq_table *table = &dev->priv.cq_table;
+ struct mlx5_destroy_cq_mbox_in in;
+ struct mlx5_destroy_cq_mbox_out out;
+ struct mlx5_core_cq *tmp;
+ int err;
+
+ spin_lock_irq(&table->lock);
+ tmp = radix_tree_delete(&table->tree, cq->cqn);
+ spin_unlock_irq(&table->lock);
+ synchronize_rcu();
+ if (!tmp) {
+ mlx5_core_warn(dev, "cq 0x%x not found in tree\n", cq->cqn);
+ return -EINVAL;
+ }
+ if (tmp != cq) {
+ mlx5_core_warn(dev, "corruption on cqn 0x%x\n", cq->cqn);
+ return -EINVAL;
+ }
+
+ memset(&in, 0, sizeof(in));
+ memset(&out, 0, sizeof(out));
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_DESTROY_CQ);
+ in.cqn = cpu_to_be32(cq->cqn);
+ err = mlx5_cmd_exec(dev, &in, sizeof(in), &out, sizeof(out));
+ if (err)
+ return err;
+
+ if (out.hdr.status)
+ return mlx5_cmd_status_to_err(&out.hdr);
+
+
+ mlx5_debug_cq_remove(dev, cq);
+ return 0;
+}
+EXPORT_SYMBOL(mlx5_core_destroy_cq);
+
+int mlx5_core_query_cq(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq,
+ struct mlx5_query_cq_mbox_out *out)
+{
+ struct mlx5_query_cq_mbox_in in;
+ int err;
+
+ memset(&in, 0, sizeof(in));
+ memset(out, 0, sizeof(*out));
+
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_QUERY_CQ);
+ in.cqn = cpu_to_be32(cq->cqn);
+ err = mlx5_cmd_exec(dev, &in, sizeof(in), out, sizeof(*out));
+ if (err)
+ return err;
+
+ if (out->hdr.status)
+ return mlx5_cmd_status_to_err(&out->hdr);
+
+ return err;
+}
+EXPORT_SYMBOL(mlx5_core_query_cq);
+
+
+int mlx5_core_modify_cq(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq,
+ struct mlx5_modify_cq_mbox_in *in, int in_sz)
+{
+ struct mlx5_modify_cq_mbox_out out;
+ int err;
+
+ memset(&out, 0, sizeof(out));
+ in->hdr.opcode = cpu_to_be16(MLX5_CMD_OP_MODIFY_CQ);
+ err = mlx5_cmd_exec(dev, in, in_sz, &out, sizeof(out));
+ if (err)
+ return err;
+
+ if (out.hdr.status)
+ return mlx5_cmd_status_to_err(&out.hdr);
+
+ return 0;
+}
+EXPORT_SYMBOL(mlx5_core_modify_cq);
+
+int mlx5_core_modify_cq_moderation(struct mlx5_core_dev *dev,
+ struct mlx5_core_cq *cq,
+ u16 cq_period,
+ u16 cq_max_count)
+{
+ struct mlx5_modify_cq_mbox_in in;
+
+ memset(&in, 0, sizeof(in));
+
+ in.cqn = cpu_to_be32(cq->cqn);
+ in.ctx.cq_period = cpu_to_be16(cq_period);
+ in.ctx.cq_max_count = cpu_to_be16(cq_max_count);
+ in.field_select = cpu_to_be32(MLX5_CQ_MODIFY_PERIOD |
+ MLX5_CQ_MODIFY_COUNT);
+
+ return mlx5_core_modify_cq(dev, cq, &in, sizeof(in));
+}
+
+int mlx5_init_cq_table(struct mlx5_core_dev *dev)
+{
+ struct mlx5_cq_table *table = &dev->priv.cq_table;
+ int err;
+
+ memset(table, 0, sizeof(*table));
+ spin_lock_init(&table->lock);
+ INIT_RADIX_TREE(&table->tree, GFP_ATOMIC);
+ err = mlx5_cq_debugfs_init(dev);
+
+ return err;
+}
+
+void mlx5_cleanup_cq_table(struct mlx5_core_dev *dev)
+{
+ mlx5_cq_debugfs_cleanup(dev);
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/debugfs.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/debugfs.c
new file mode 100644
index 0000000..7b4962b
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/debugfs.c
@@ -0,0 +1,718 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx5_core.h"
+
+enum {
+ QP_PID,
+ QP_STATE,
+ QP_XPORT,
+ QP_MTU,
+ QP_N_RECV,
+ QP_RECV_SZ,
+ QP_N_SEND,
+ QP_LOG_PG_SZ,
+ QP_RQPN,
+};
+
+static char *qp_fields[] = {
+ [QP_PID] = "pid",
+ [QP_STATE] = "state",
+ [QP_XPORT] = "transport",
+ [QP_MTU] = "mtu",
+ [QP_N_RECV] = "num_recv",
+ [QP_RECV_SZ] = "rcv_wqe_sz",
+ [QP_N_SEND] = "num_send",
+ [QP_LOG_PG_SZ] = "log2_page_sz",
+ [QP_RQPN] = "remote_qpn",
+};
+
+enum {
+ DCT_PID,
+ DCT_STATE,
+ DCT_MTU,
+ DCT_KEY_VIOL,
+ DCT_CQN,
+};
+
+static char *dct_fields[] = {
+ [DCT_PID] = "pid",
+ [DCT_STATE] = "state",
+ [DCT_MTU] = "mtu",
+ [DCT_KEY_VIOL] = "key_violations",
+ [DCT_CQN] = "cqn",
+};
+
+enum {
+ EQ_NUM_EQES,
+ EQ_INTR,
+ EQ_LOG_PG_SZ,
+};
+
+static char *eq_fields[] = {
+ [EQ_NUM_EQES] = "num_eqes",
+ [EQ_INTR] = "intr",
+ [EQ_LOG_PG_SZ] = "log_page_size",
+};
+
+enum {
+ CQ_PID,
+ CQ_NUM_CQES,
+ CQ_LOG_PG_SZ,
+};
+
+static char *cq_fields[] = {
+ [CQ_PID] = "pid",
+ [CQ_NUM_CQES] = "num_cqes",
+ [CQ_LOG_PG_SZ] = "log_page_size",
+};
+
+struct dentry *mlx5_debugfs_root;
+EXPORT_SYMBOL(mlx5_debugfs_root);
+
+void mlx5_register_debugfs(void)
+{
+ mlx5_debugfs_root = debugfs_create_dir("mlx5", NULL);
+ if (IS_ERR_OR_NULL(mlx5_debugfs_root))
+ mlx5_debugfs_root = NULL;
+}
+
+void mlx5_unregister_debugfs(void)
+{
+ debugfs_remove(mlx5_debugfs_root);
+}
+
+int mlx5_qp_debugfs_init(struct mlx5_core_dev *dev)
+{
+ if (!mlx5_debugfs_root)
+ return 0;
+
+ atomic_set(&dev->num_qps, 0);
+
+ dev->priv.qp_debugfs = debugfs_create_dir("QPs", dev->priv.dbg_root);
+ if (!dev->priv.qp_debugfs)
+ return -ENOMEM;
+
+ return 0;
+}
+
+void mlx5_qp_debugfs_cleanup(struct mlx5_core_dev *dev)
+{
+ if (!mlx5_debugfs_root)
+ return;
+
+ debugfs_remove_recursive(dev->priv.qp_debugfs);
+}
+
+int mlx5_dct_debugfs_init(struct mlx5_core_dev *dev)
+{
+ if (!mlx5_debugfs_root)
+ return 0;
+
+ dev->priv.dct_debugfs = debugfs_create_dir("DCTs", dev->priv.dbg_root);
+ if (!dev->priv.dct_debugfs)
+ return -ENOMEM;
+
+ return 0;
+}
+
+void mlx5_dct_debugfs_cleanup(struct mlx5_core_dev *dev)
+{
+ if (!mlx5_debugfs_root)
+ return;
+
+ debugfs_remove_recursive(dev->priv.dct_debugfs);
+}
+
+int mlx5_eq_debugfs_init(struct mlx5_core_dev *dev)
+{
+ if (!mlx5_debugfs_root)
+ return 0;
+
+ dev->priv.eq_debugfs = debugfs_create_dir("EQs", dev->priv.dbg_root);
+ if (!dev->priv.eq_debugfs)
+ return -ENOMEM;
+
+ return 0;
+}
+
+void mlx5_eq_debugfs_cleanup(struct mlx5_core_dev *dev)
+{
+ if (!mlx5_debugfs_root)
+ return;
+
+ debugfs_remove_recursive(dev->priv.eq_debugfs);
+}
+
+static ssize_t average_read(struct file *filp, char __user *buf, size_t count,
+ loff_t *pos)
+{
+ struct mlx5_cmd_stats *stats;
+ u64 field = 0;
+ int ret;
+ char tbuf[22];
+
+ if (*pos)
+ return 0;
+
+ stats = filp->private_data;
+ spin_lock_irq(&stats->lock);
+ if (stats->n)
+ field = div64_u64(stats->sum, stats->n);
+ spin_unlock_irq(&stats->lock);
+ ret = snprintf(tbuf, sizeof(tbuf), "%llu\n", field);
+ if (ret > 0) {
+ if (copy_to_user(buf, tbuf, ret))
+ return -EFAULT;
+ }
+
+ *pos += ret;
+ return ret;
+}
+
+
+static ssize_t average_write(struct file *filp, const char __user *buf,
+ size_t count, loff_t *pos)
+{
+ struct mlx5_cmd_stats *stats;
+
+ stats = filp->private_data;
+ spin_lock_irq(&stats->lock);
+ stats->sum = 0;
+ stats->n = 0;
+ spin_unlock_irq(&stats->lock);
+
+ *pos += count;
+
+ return count;
+}
+
+static const struct file_operations stats_fops = {
+ .owner = THIS_MODULE,
+ .open = simple_open,
+ .read = average_read,
+ .write = average_write,
+};
+
+int mlx5_cmdif_debugfs_init(struct mlx5_core_dev *dev)
+{
+ struct mlx5_cmd_stats *stats;
+ struct dentry **cmd;
+ const char *namep;
+ int err;
+ int i;
+
+ if (!mlx5_debugfs_root)
+ return 0;
+
+ cmd = &dev->priv.cmdif_debugfs;
+ *cmd = debugfs_create_dir("commands", dev->priv.dbg_root);
+ if (!*cmd)
+ return -ENOMEM;
+
+ for (i = 0; i < ARRAY_SIZE(dev->cmd.stats); i++) {
+ stats = &dev->cmd.stats[i];
+ namep = mlx5_command_str(i);
+ if (strcmp(namep, "unknown command opcode")) {
+ stats->root = debugfs_create_dir(namep, *cmd);
+ if (!stats->root) {
+ mlx5_core_warn(dev, "failed adding command %d\n",
+ i);
+ err = -ENOMEM;
+ goto out;
+ }
+
+ stats->avg = debugfs_create_file("average", 0400,
+ stats->root, stats,
+ &stats_fops);
+ if (!stats->avg) {
+ mlx5_core_warn(dev, "failed creating debugfs file\n");
+ err = -ENOMEM;
+ goto out;
+ }
+
+ stats->count = debugfs_create_u64("n", 0400,
+ stats->root,
+ &stats->n);
+ if (!stats->count) {
+ mlx5_core_warn(dev, "failed creating debugfs file\n");
+ err = -ENOMEM;
+ goto out;
+ }
+ }
+ }
+
+ return 0;
+out:
+ debugfs_remove_recursive(dev->priv.cmdif_debugfs);
+ dev->priv.cmdif_debugfs = NULL;
+ return err;
+}
+
+void mlx5_cmdif_debugfs_cleanup(struct mlx5_core_dev *dev)
+{
+ if (!mlx5_debugfs_root || !dev->priv.cmdif_debugfs)
+ return;
+
+ debugfs_remove_recursive(dev->priv.cmdif_debugfs);
+}
+
+int mlx5_cq_debugfs_init(struct mlx5_core_dev *dev)
+{
+ if (!mlx5_debugfs_root)
+ return 0;
+
+ dev->priv.cq_debugfs = debugfs_create_dir("CQs", dev->priv.dbg_root);
+ if (!dev->priv.cq_debugfs)
+ return -ENOMEM;
+
+ return 0;
+}
+
+void mlx5_cq_debugfs_cleanup(struct mlx5_core_dev *dev)
+{
+ if (!mlx5_debugfs_root)
+ return;
+
+ debugfs_remove_recursive(dev->priv.cq_debugfs);
+}
+
+static u64 qp_read_field(struct mlx5_core_dev *dev, struct mlx5_core_qp *qp,
+ int index, int *is_str)
+{
+ struct mlx5_query_qp_mbox_out *out;
+ struct mlx5_qp_context *ctx;
+ u64 param = 0;
+ int err;
+ int no_sq;
+
+ out = kzalloc(sizeof(*out), GFP_KERNEL);
+ if (!out)
+ return param;
+
+ err = mlx5_core_qp_query(dev, qp, out, sizeof(*out));
+ if (err) {
+ mlx5_core_warn(dev, "failed to query qp\n");
+ goto out;
+ }
+
+ *is_str = 0;
+ ctx = &out->ctx;
+ switch (index) {
+ case QP_PID:
+ param = qp->pid;
+ break;
+ case QP_STATE:
+ param = (unsigned long)mlx5_qp_state_str(be32_to_cpu(ctx->flags) >> 28);
+ *is_str = 1;
+ break;
+ case QP_XPORT:
+ param = (unsigned long)mlx5_qp_type_str((be32_to_cpu(ctx->flags) >> 16) & 0xff);
+ *is_str = 1;
+ break;
+ case QP_MTU:
+ switch (ctx->mtu_msgmax >> 5) {
+ case IB_MTU_256:
+ param = 256;
+ break;
+ case IB_MTU_512:
+ param = 512;
+ break;
+ case IB_MTU_1024:
+ param = 1024;
+ break;
+ case IB_MTU_2048:
+ param = 2048;
+ break;
+ case IB_MTU_4096:
+ param = 4096;
+ break;
+ default:
+ param = 0;
+ }
+ break;
+ case QP_N_RECV:
+ param = 1 << ((ctx->rq_size_stride >> 3) & 0xf);
+ break;
+ case QP_RECV_SZ:
+ param = 1 << ((ctx->rq_size_stride & 7) + 4);
+ break;
+ case QP_N_SEND:
+ no_sq = be16_to_cpu(ctx->sq_crq_size) >> 15;
+ if (!no_sq)
+ param = 1 << (be16_to_cpu(ctx->sq_crq_size) >> 11);
+ else
+ param = 0;
+ break;
+ case QP_LOG_PG_SZ:
+ param = (be32_to_cpu(ctx->log_pg_sz_remote_qpn) >> 24) & 0x1f;
+ param += 12;
+ break;
+ case QP_RQPN:
+ param = be32_to_cpu(ctx->log_pg_sz_remote_qpn) & 0xffffff;
+ break;
+ }
+
+out:
+ kfree(out);
+ return param;
+}
+
+static u64 dct_read_field(struct mlx5_core_dev *dev, struct mlx5_core_dct *dct,
+ int index, int *is_str)
+{
+ struct mlx5_query_dct_mbox_out *out;
+ struct mlx5_dct_context *ctx;
+ u64 param = 0;
+ int err;
+
+ out = kzalloc(sizeof(*out), GFP_KERNEL);
+ if (!out)
+ return param;
+
+ err = mlx5_core_dct_query(dev, dct, out);
+ if (err) {
+ mlx5_core_warn(dev, "failed to query dct\n");
+ goto out;
+ }
+
+ ctx = &out->ctx;
+ *is_str = 0;
+ switch (index) {
+ case DCT_PID:
+ param = dct->pid;
+ break;
+ case DCT_STATE:
+ param = (unsigned long)mlx5_dct_state_str(ctx->state);
+ *is_str = 1;
+ break;
+ case DCT_MTU:
+ param = ctx->mtu;
+ break;
+ case DCT_KEY_VIOL:
+ param = be32_to_cpu(ctx->access_violations);
+ break;
+ case DCT_CQN:
+ param = be32_to_cpu(ctx->cqn) & 0xffffff;
+ break;
+ }
+
+out:
+ kfree(out);
+ return param;
+}
+
+static u64 eq_read_field(struct mlx5_core_dev *dev, struct mlx5_eq *eq,
+ int index)
+{
+ struct mlx5_query_eq_mbox_out *out;
+ struct mlx5_eq_context *ctx;
+ u64 param = 0;
+ int err;
+
+ out = kzalloc(sizeof(*out), GFP_KERNEL);
+ if (!out)
+ return param;
+
+ ctx = &out->ctx;
+
+ err = mlx5_core_eq_query(dev, eq, out, sizeof(*out));
+ if (err) {
+ mlx5_core_warn(dev, "failed to query eq\n");
+ goto out;
+ }
+
+ switch (index) {
+ case EQ_NUM_EQES:
+ param = 1 << ((be32_to_cpu(ctx->log_sz_usr_page) >> 24) & 0x1f);
+ break;
+ case EQ_INTR:
+ param = ctx->intr;
+ break;
+ case EQ_LOG_PG_SZ:
+ param = (ctx->log_page_size & 0x1f) + 12;
+ break;
+ }
+
+out:
+ kfree(out);
+ return param;
+}
+
+static u64 cq_read_field(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq,
+ int index)
+{
+ struct mlx5_query_cq_mbox_out *out;
+ struct mlx5_cq_context *ctx;
+ u64 param = 0;
+ int err;
+
+ out = kzalloc(sizeof(*out), GFP_KERNEL);
+ if (!out)
+ return param;
+
+ ctx = &out->ctx;
+
+ err = mlx5_core_query_cq(dev, cq, out);
+ if (err) {
+ mlx5_core_warn(dev, "failed to query cq\n");
+ goto out;
+ }
+
+ switch (index) {
+ case CQ_PID:
+ param = cq->pid;
+ break;
+ case CQ_NUM_CQES:
+ param = 1 << ((be32_to_cpu(ctx->log_sz_usr_page) >> 24) & 0x1f);
+ break;
+ case CQ_LOG_PG_SZ:
+ param = (ctx->log_pg_sz & 0x1f) + 12;
+ break;
+ }
+
+out:
+ kfree(out);
+ return param;
+}
+
+static ssize_t dbg_read(struct file *filp, char __user *buf, size_t count,
+ loff_t *pos)
+{
+ struct mlx5_field_desc *desc;
+ struct mlx5_rsc_debug *d;
+ char tbuf[18];
+ int is_str = 0;
+ u64 field;
+ int ret;
+
+ if (*pos)
+ return 0;
+
+ desc = filp->private_data;
+ d = (void *)(desc - desc->i) - sizeof(*d);
+ switch (d->type) {
+ case MLX5_DBG_RSC_QP:
+ field = qp_read_field(d->dev, d->object, desc->i, &is_str);
+ break;
+
+ case MLX5_DBG_RSC_EQ:
+ field = eq_read_field(d->dev, d->object, desc->i);
+ break;
+
+ case MLX5_DBG_RSC_CQ:
+ field = cq_read_field(d->dev, d->object, desc->i);
+ break;
+
+ case MLX5_DBG_RSC_DCT:
+ field = dct_read_field(d->dev, d->object, desc->i, &is_str);
+ break;
+
+ default:
+ mlx5_core_warn(d->dev, "invalid resource type %d\n", d->type);
+ return -EINVAL;
+ }
+
+
+ if (is_str)
+ ret = snprintf(tbuf, sizeof(tbuf), "%s\n", (const char *)(unsigned long)field);
+ else
+ ret = snprintf(tbuf, sizeof(tbuf), "0x%llx\n", field);
+
+ if (ret >= (int)sizeof(tbuf))
+ ret = sizeof(tbuf) - 1; /* output was truncated */
+ if (ret > 0 && copy_to_user(buf, tbuf, ret))
+ return -EFAULT;
+
+ *pos += ret;
+ return ret;
+}
+
+static const struct file_operations fops = {
+ .owner = THIS_MODULE,
+ .open = simple_open,
+ .read = dbg_read,
+};
+
+static int add_res_tree(struct mlx5_core_dev *dev, enum dbg_rsc_type type,
+ struct dentry *root, struct mlx5_rsc_debug **dbg,
+ int rsn, char **field, int nfile, void *data)
+{
+ struct mlx5_rsc_debug *d;
+ char resn[32];
+ int err;
+ int i;
+
+ d = kzalloc(sizeof(*d) + nfile * sizeof(d->fields[0]), GFP_KERNEL);
+ if (!d)
+ return -ENOMEM;
+
+ d->dev = dev;
+ d->object = data;
+ d->type = type;
+ sprintf(resn, "0x%x", rsn);
+ d->root = debugfs_create_dir(resn, root);
+ if (!d->root) {
+ err = -ENOMEM;
+ goto out_free;
+ }
+
+ for (i = 0; i < nfile; i++) {
+ d->fields[i].i = i;
+ d->fields[i].dent = debugfs_create_file(field[i], 0400,
+ d->root, &d->fields[i],
+ &fops);
+ if (!d->fields[i].dent) {
+ err = -ENOMEM;
+ goto out_rem;
+ }
+ }
+ *dbg = d;
+
+ return 0;
+out_rem:
+ debugfs_remove_recursive(d->root);
+
+out_free:
+ kfree(d);
+ return err;
+}
+
+static void rem_res_tree(struct mlx5_rsc_debug *d)
+{
+ debugfs_remove_recursive(d->root);
+ kfree(d);
+}
+
+int mlx5_debug_qp_add(struct mlx5_core_dev *dev, struct mlx5_core_qp *qp)
+{
+ int err;
+
+ if (!mlx5_debugfs_root)
+ return 0;
+
+ err = add_res_tree(dev, MLX5_DBG_RSC_QP, dev->priv.qp_debugfs,
+ &qp->dbg, qp->qpn, qp_fields,
+ ARRAY_SIZE(qp_fields), qp);
+ if (err)
+ qp->dbg = NULL;
+
+ return err;
+}
+
+void mlx5_debug_qp_remove(struct mlx5_core_dev *dev, struct mlx5_core_qp *qp)
+{
+ if (!mlx5_debugfs_root)
+ return;
+
+ if (qp->dbg)
+ rem_res_tree(qp->dbg);
+}
+
+int mlx5_debug_dct_add(struct mlx5_core_dev *dev, struct mlx5_core_dct *dct)
+{
+ int err;
+
+ if (!mlx5_debugfs_root)
+ return 0;
+
+ err = add_res_tree(dev, MLX5_DBG_RSC_DCT, dev->priv.dct_debugfs,
+ &dct->dbg, dct->dctn, dct_fields,
+ ARRAY_SIZE(dct_fields), dct);
+ if (err)
+ dct->dbg = NULL;
+
+ return err;
+}
+
+void mlx5_debug_dct_remove(struct mlx5_core_dev *dev, struct mlx5_core_dct *dct)
+{
+ if (!mlx5_debugfs_root)
+ return;
+
+ if (dct->dbg)
+ rem_res_tree(dct->dbg);
+}
+
+int mlx5_debug_eq_add(struct mlx5_core_dev *dev, struct mlx5_eq *eq)
+{
+ int err;
+
+ if (!mlx5_debugfs_root)
+ return 0;
+
+ err = add_res_tree(dev, MLX5_DBG_RSC_EQ, dev->priv.eq_debugfs,
+ &eq->dbg, eq->eqn, eq_fields,
+ ARRAY_SIZE(eq_fields), eq);
+ if (err)
+ eq->dbg = NULL;
+
+ return err;
+}
+
+void mlx5_debug_eq_remove(struct mlx5_core_dev *dev, struct mlx5_eq *eq)
+{
+ if (!mlx5_debugfs_root)
+ return;
+
+ if (eq->dbg)
+ rem_res_tree(eq->dbg);
+}
+
+int mlx5_debug_cq_add(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq)
+{
+ int err;
+
+ if (!mlx5_debugfs_root)
+ return 0;
+
+ err = add_res_tree(dev, MLX5_DBG_RSC_CQ, dev->priv.cq_debugfs,
+ &cq->dbg, cq->cqn, cq_fields,
+ ARRAY_SIZE(cq_fields), cq);
+ if (err)
+ cq->dbg = NULL;
+
+ return err;
+}
+
+void mlx5_debug_cq_remove(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq)
+{
+ if (!mlx5_debugfs_root)
+ return;
+
+ if (cq->dbg)
+ rem_res_tree(cq->dbg);
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/en.h b/drivers/net/mlnx_uio/mlnx/mlx5/core/en.h
new file mode 100644
index 0000000..8da999a
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/en.h
@@ -0,0 +1,695 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifdef CONFIG_COMPAT_LRO_ENABLED_IPOIB
+#else
+#endif
+#include "linux/mlx5/vport.h"
+#include "wq.h"
+#include "transobj.h"
+#include "mlx5_core.h"
+
+#define MLX5E_MAX_NUM_TC 8
+#define MLX5E_MAX_NUM_PRIO 8
+#define MLX5E_MAX_MTU 9600
+
+#define MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE 0x7
+#define MLX5E_PARAMS_DEFAULT_LOG_SQ_SIZE 0xa
+#define MLX5E_PARAMS_MAXIMUM_LOG_SQ_SIZE 0xd
+
+#define MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE 0x7
+#define MLX5E_PARAMS_DEFAULT_LOG_RQ_SIZE 0xa
+#define MLX5E_PARAMS_MAXIMUM_LOG_RQ_SIZE 0xd
+
+#define MLX5E_PARAMS_DEFAULT_LRO_WQE_SZ (16 * 1024)
+#define MLX5E_PARAMS_DEFAULT_RX_CQ_MODERATION_USEC 0x10
+#define MLX5E_PARAMS_DEFAULT_RX_CQ_MODERATION_PKTS 0x20
+#define MLX5E_PARAMS_DEFAULT_TX_CQ_MODERATION_USEC 0x10
+#define MLX5E_PARAMS_DEFAULT_TX_CQ_MODERATION_PKTS 0x20
+#define MLX5E_PARAMS_DEFAULT_MIN_RX_WQES 0x80
+#define MLX5E_PARAMS_DEFAULT_RX_HASH_LOG_TBL_SZ 0x7
+
+#define MLX5E_TX_CQ_POLL_BUDGET 128
+#define MLX5E_UPDATE_STATS_INTERVAL 200 /* msecs */
+#define MLX5E_SQ_BF_BUDGET 16
+
+static const char vport_strings[][ETH_GSTRING_LEN] = {
+ /* vport statistics */
+ "rx_packets",
+ "rx_bytes",
+ "tx_packets",
+ "tx_bytes",
+ "rx_error_packets",
+ "rx_error_bytes",
+ "tx_error_packets",
+ "tx_error_bytes",
+ "rx_unicast_packets",
+ "rx_unicast_bytes",
+ "tx_unicast_packets",
+ "tx_unicast_bytes",
+ "rx_multicast_packets",
+ "rx_multicast_bytes",
+ "tx_multicast_packets",
+ "tx_multicast_bytes",
+ "rx_broadcast_packets",
+ "rx_broadcast_bytes",
+ "tx_broadcast_packets",
+ "tx_broadcast_bytes",
+
+ /* SW counters */
+ "tso_packets",
+ "tso_bytes",
+ "lro_packets",
+ "lro_bytes",
+ "rx_csum_good",
+ "rx_csum_none",
+ "tx_csum_offload",
+ "tx_queue_stopped",
+ "tx_queue_wake",
+ "tx_queue_dropped",
+ "rx_wqe_err",
+ /* SW LRO statistics */
+#ifdef CONFIG_COMPAT_LRO_ENABLED_IPOIB
+ "sw_lro_aggregated", "sw_lro_flushed", "sw_lro_no_desc",
+#endif
+};
+
+struct mlx5e_vport_stats {
+ /* HW counters */
+ u64 rx_packets;
+ u64 rx_bytes;
+ u64 tx_packets;
+ u64 tx_bytes;
+ u64 rx_error_packets;
+ u64 rx_error_bytes;
+ u64 tx_error_packets;
+ u64 tx_error_bytes;
+ u64 rx_unicast_packets;
+ u64 rx_unicast_bytes;
+ u64 tx_unicast_packets;
+ u64 tx_unicast_bytes;
+ u64 rx_multicast_packets;
+ u64 rx_multicast_bytes;
+ u64 tx_multicast_packets;
+ u64 tx_multicast_bytes;
+ u64 rx_broadcast_packets;
+ u64 rx_broadcast_bytes;
+ u64 tx_broadcast_packets;
+ u64 tx_broadcast_bytes;
+
+ /* SW counters */
+ u64 tso_packets;
+ u64 tso_bytes;
+ u64 lro_packets;
+ u64 lro_bytes;
+ u64 rx_csum_good;
+ u64 rx_csum_none;
+ u64 tx_csum_offload;
+ u64 tx_queue_stopped;
+ u64 tx_queue_wake;
+ u64 tx_queue_dropped;
+ u64 rx_wqe_err;
+
+ /* SW LRO statistics */
+#ifdef CONFIG_COMPAT_LRO_ENABLED_IPOIB
+ u64 sw_lro_aggregated;
+ u64 sw_lro_flushed;
+ u64 sw_lro_no_desc;
+#define NUM_VPORT_COUNTERS 34
+#else
+#define NUM_VPORT_COUNTERS 31
+#endif
+};
+
+static const char pport_strings[][ETH_GSTRING_LEN] = {
+ /* IEEE802.3 counters */
+ "frames_tx",
+ "frames_rx",
+ "check_seq_err",
+ "alignment_err",
+ "octets_tx",
+ "octets_received",
+ "multicast_xmitted",
+ "broadcast_xmitted",
+ "multicast_rx",
+ "broadcast_rx",
+ "in_range_len_errors",
+ "out_of_range_len",
+ "too_long_errors",
+ "symbol_err",
+ "mac_control_tx",
+ "mac_control_rx",
+ "unsupported_op_rx",
+ "pause_ctrl_rx",
+ "pause_ctrl_tx",
+
+ /* RFC2863 counters */
+ "in_octets",
+ "in_ucast_pkts",
+ "in_discards",
+ "in_errors",
+ "in_unknown_protos",
+ "out_octets",
+ "out_ucast_pkts",
+ "out_discards",
+ "out_errors",
+ "in_multicast_pkts",
+ "in_broadcast_pkts",
+ "out_multicast_pkts",
+ "out_broadcast_pkts",
+
+ /* RFC2819 counters */
+ "drop_events",
+ "octets",
+ "pkts",
+ "broadcast_pkts",
+ "multicast_pkts",
+ "crc_align_errors",
+ "undersize_pkts",
+ "oversize_pkts",
+ "fragments",
+ "jabbers",
+ "collisions",
+ "p64octets",
+ "p65to127octets",
+ "p128to255octets",
+ "p256to511octets",
+ "p512to1023octets",
+ "p1024to1518octets",
+ "p1519to2047octets",
+ "p2048to4095octets",
+ "p4096to8191octets",
+ "p8192to10239octets",
+};
+
+#define NUM_IEEE_802_3_COUNTERS 19
+#define NUM_RFC_2863_COUNTERS 13
+#define NUM_RFC_2819_COUNTERS 21
+#define NUM_PPORT_COUNTERS (NUM_IEEE_802_3_COUNTERS + \
+ NUM_RFC_2863_COUNTERS + \
+ NUM_RFC_2819_COUNTERS)
+
+struct mlx5e_pport_stats {
+ __be64 IEEE_802_3_counters[NUM_IEEE_802_3_COUNTERS];
+ __be64 RFC_2863_counters[NUM_RFC_2863_COUNTERS];
+ __be64 RFC_2819_counters[NUM_RFC_2819_COUNTERS];
+};
+
+static const char rq_stats_strings[][ETH_GSTRING_LEN] = {
+ "packets",
+ "csum_none",
+ "lro_packets",
+ "lro_bytes",
+ "wqe_err"
+};
+
+struct mlx5e_rq_stats {
+ u64 packets;
+ u64 csum_none;
+ u64 lro_packets;
+ u64 lro_bytes;
+ u64 wqe_err;
+#define NUM_RQ_STATS 5
+};
+
+static const char sq_stats_strings[][ETH_GSTRING_LEN] = {
+ "packets",
+ "tso_packets",
+ "tso_bytes",
+ "csum_offload_none",
+ "stopped",
+ "wake",
+ "dropped",
+ "nop"
+};
+
+struct mlx5e_sq_stats {
+ u64 packets;
+ u64 tso_packets;
+ u64 tso_bytes;
+ u64 csum_offload_none;
+ u64 stopped;
+ u64 wake;
+ u64 dropped;
+ u64 nop;
+#define NUM_SQ_STATS 8
+};
+
+struct mlx5e_stats {
+ struct mlx5e_vport_stats vport;
+ struct mlx5e_pport_stats pport;
+};
+
+struct mlx5e_params {
+ u8 log_sq_size;
+ u8 log_rq_size;
+ u16 num_channels;
+ u8 default_vlan_prio;
+ u8 num_tc;
+ u16 rx_cq_moderation_usec;
+ u16 rx_cq_moderation_pkts;
+ u16 tx_cq_moderation_usec;
+ u16 tx_cq_moderation_pkts;
+ u16 min_rx_wqes;
+ u16 rx_hash_log_tbl_sz;
+ bool lro_en;
+ u32 lro_wqe_sz;
+ bool rss_hash_xor;
+};
+
+enum {
+ MLX5E_RQ_STATE_POST_WQES_ENABLE,
+};
+
+enum cq_flags {
+ MLX5E_CQ_HAS_CQES = 1,
+};
+
+struct mlx5e_cq {
+ /* data path - accessed per cqe */
+ struct mlx5_cqwq wq;
+ unsigned long flags;
+
+ /* data path - accessed per napi poll */
+ struct napi_struct *napi;
+ struct mlx5_core_cq mcq;
+ struct mlx5e_channel *channel;
+
+ /* control */
+ struct mlx5_wq_ctrl wq_ctrl;
+} ____cacheline_aligned_in_smp;
+
+#ifdef CONFIG_COMPAT_LRO_ENABLED_IPOIB
+
+static const char mlx5e_priv_flags[][ETH_GSTRING_LEN] = {
+ "sw_lro",
+ "hw_lro",
+};
+
+/* SW LRO defines for MLX5 */
+#define MLX5E_RQ_FLAG_SWLRO (1<<0)
+
+#define MLX5E_LRO_MAX_DESC 32
+struct mlx5e_sw_lro {
+ struct net_lro_mgr lro_mgr;
+ struct net_lro_desc lro_desc[MLX5E_LRO_MAX_DESC];
+};
+#endif
+
+struct mlx5e_rq {
+ /* data path */
+ struct mlx5_wq_ll wq;
+ u32 wqe_sz;
+ struct sk_buff **skb;
+
+ struct device *pdev;
+ struct net_device *netdev;
+ struct mlx5e_rq_stats stats;
+ struct mlx5e_cq cq;
+
+ unsigned long state;
+ int ix;
+#ifdef CONFIG_COMPAT_LRO_ENABLED_IPOIB
+ unsigned long flags;
+ struct mlx5e_sw_lro sw_lro;
+#endif
+
+ /* control */
+ struct mlx5_wq_ctrl wq_ctrl;
+ u32 rqn;
+ struct mlx5e_channel *channel;
+} ____cacheline_aligned_in_smp;
+
+struct mlx5e_tx_skb_cb {
+ u32 num_bytes;
+ u8 num_wqebbs;
+ u8 num_dma;
+};
+
+#define MLX5E_TX_SKB_CB(__skb) ((struct mlx5e_tx_skb_cb *)(__skb)->cb)
+
+struct mlx5e_sq_dma {
+ dma_addr_t addr;
+ u32 size;
+};
+
+enum {
+ MLX5E_SQ_STATE_WAKE_TXQ_ENABLE,
+};
+
+struct mlx5e_sq {
+ /* data path */
+
+ /* dirtied @completion */
+ u16 cc;
+ u32 dma_fifo_cc;
+
+ /* dirtied @xmit */
+ u16 pc ____cacheline_aligned_in_smp;
+ u32 dma_fifo_pc;
+ u16 bf_offset;
+ u16 prev_cc;
+ u8 bf_budget;
+ struct mlx5e_sq_stats stats;
+
+ struct mlx5e_cq cq;
+
+ /* pointers to per packet info: write@xmit, read@completion */
+ struct sk_buff **skb;
+ struct mlx5e_sq_dma *dma_fifo;
+
+ /* read only */
+ struct mlx5_wq_cyc wq;
+ u32 dma_fifo_mask;
+ void __iomem *uar_map;
+ void __iomem *uar_bf_map;
+ struct netdev_queue *txq;
+ u32 sqn;
+ u16 bf_buf_size;
+ u16 max_inline;
+ u16 edge;
+ struct device *pdev;
+ __be32 mkey_be;
+ unsigned long state;
+
+ /* control path */
+ struct mlx5_wq_ctrl wq_ctrl;
+ struct mlx5_uar uar;
+ struct mlx5e_channel *channel;
+ int tc;
+} ____cacheline_aligned_in_smp;
+
+static inline bool mlx5e_sq_has_room_for(struct mlx5e_sq *sq, u16 n)
+{
+ return (((sq->wq.sz_m1 & (sq->cc - sq->pc)) >= n) ||
+ (sq->cc == sq->pc));
+}
+
+enum channel_flags {
+ MLX5E_CHANNEL_NAPI_SCHED = 1,
+};
+
+struct mlx5e_channel {
+ /* data path */
+ struct mlx5e_rq rq;
+ struct mlx5e_sq sq[MLX5E_MAX_NUM_TC];
+ struct napi_struct napi;
+ struct device *pdev;
+ struct net_device *netdev;
+ __be32 mkey_be;
+ u8 num_tc;
+ unsigned long flags;
+ int tc_to_txq_map[MLX5E_MAX_NUM_TC];
+
+ /* control */
+ struct mlx5e_priv *priv;
+ int ix;
+ int cpu;
+
+ struct dentry *dfs_root;
+};
+
+enum mlx5e_traffic_types {
+ MLX5E_TT_IPV4_TCP,
+ MLX5E_TT_IPV6_TCP,
+ MLX5E_TT_IPV4_UDP,
+ MLX5E_TT_IPV6_UDP,
+ MLX5E_TT_IPV4_IPSEC_AH,
+ MLX5E_TT_IPV6_IPSEC_AH,
+ MLX5E_TT_IPV4_IPSEC_ESP,
+ MLX5E_TT_IPV6_IPSEC_ESP,
+ MLX5E_TT_IPV4,
+ MLX5E_TT_IPV6,
+ MLX5E_TT_ANY,
+ MLX5E_NUM_TT,
+};
+
+enum {
+ MLX5E_RQT_SPREADING = 0,
+ MLX5E_RQT_DEFAULT_RQ = 1,
+ MLX5E_NUM_RQT = 2,
+};
+
+struct mlx5e_eth_addr_info {
+ u8 addr[ETH_ALEN + 2];
+ u32 tt_vec;
+ u32 ft_ix[MLX5E_NUM_TT]; /* flow table index per traffic type */
+};
+
+#define MLX5E_ETH_ADDR_HASH_SIZE (1 << BITS_PER_BYTE)
+
+struct mlx5e_eth_addr_db {
+ struct hlist_head netdev_uc[MLX5E_ETH_ADDR_HASH_SIZE];
+ struct hlist_head netdev_mc[MLX5E_ETH_ADDR_HASH_SIZE];
+ struct mlx5e_eth_addr_info broadcast;
+ struct mlx5e_eth_addr_info allmulti;
+ struct mlx5e_eth_addr_info promisc;
+ bool broadcast_enabled;
+ bool allmulti_enabled;
+ bool promisc_enabled;
+};
+
+enum {
+ MLX5E_STATE_ASYNC_EVENTS_ENABLE,
+ MLX5E_STATE_OPENED,
+};
+
+struct mlx5e_vlan_db {
+ unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
+ u32 active_vlans_ft_ix[VLAN_N_VID];
+ u32 untagged_rule_ft_ix;
+ u32 any_vlan_rule_ft_ix;
+ bool filter_disabled;
+};
+
+struct mlx5e_flow_table {
+ void *vlan;
+ void *main;
+};
+
+#ifdef CONFIG_COMPAT_LRO_ENABLED_IPOIB
+#define MLX5E_PRIV_FLAG_SWLRO (1<<0)
+#define MLX5E_PRIV_FLAG_HWLRO (1<<1)
+#endif
+
+struct mlx5e_priv {
+ /* priv data path fields - start */
+ int default_vlan_prio;
+ struct mlx5e_sq **txq_to_sq_map;
+#if defined HAVE_VLAN_GRO_RECEIVE || defined HAVE_VLAN_HWACCEL_RX
+ struct vlan_group *vlan_grp;
+#endif
+ /* priv data path fields - end */
+
+ unsigned long state;
+ struct mutex state_lock; /* Protects Interface state */
+ struct mlx5_uar cq_uar;
+ u32 pdn;
+ u32 tdn;
+ struct mlx5_core_mr mr;
+
+ struct mlx5e_channel **channel;
+ u32 tisn[MLX5E_MAX_NUM_TC];
+ u32 rqtn;
+ u32 tirn[MLX5E_NUM_TT];
+
+ struct mlx5e_flow_table ft;
+ struct mlx5e_eth_addr_db eth_addr;
+ struct mlx5e_vlan_db vlan;
+
+ struct mlx5e_params params;
+ spinlock_t async_events_spinlock; /* sync hw events */
+ struct work_struct update_carrier_work;
+ struct work_struct set_rx_mode_work;
+ struct delayed_work update_stats_work;
+
+ struct mlx5_core_dev *mdev;
+ struct net_device *netdev;
+ struct mlx5e_stats stats;
+#ifdef CONFIG_COMPAT_LRO_ENABLED_IPOIB
+ u32 pflags;
+#endif
+#ifndef HAVE_NDO_GET_STATS64
+ struct net_device_stats netdev_stats;
+#endif
+ struct dentry *dfs_root;
+};
+
+#define MLX5E_NET_IP_ALIGN 2
+
+struct mlx5e_tx_wqe {
+ struct mlx5_wqe_ctrl_seg ctrl;
+ struct mlx5_wqe_eth_seg eth;
+};
+
+struct mlx5e_rx_wqe {
+ struct mlx5_wqe_srq_next_seg next;
+ struct mlx5_wqe_data_seg data;
+};
+
+enum mlx5e_link_mode {
+ MLX5E_1000BASE_CX_SGMII = 0,
+ MLX5E_1000BASE_KX = 1,
+ MLX5E_10GBASE_CX4 = 2,
+ MLX5E_10GBASE_KX4 = 3,
+ MLX5E_10GBASE_KR = 4,
+ MLX5E_20GBASE_KR2 = 5,
+ MLX5E_40GBASE_CR4 = 6,
+ MLX5E_40GBASE_KR4 = 7,
+ MLX5E_56GBASE_R4 = 8,
+ MLX5E_10GBASE_CR = 12,
+ MLX5E_10GBASE_SR = 13,
+ MLX5E_10GBASE_ER = 14,
+ MLX5E_40GBASE_SR4 = 15,
+ MLX5E_40GBASE_LR4 = 16,
+ MLX5E_100GBASE_CR4 = 20,
+ MLX5E_100GBASE_SR4 = 21,
+ MLX5E_100GBASE_KR4 = 22,
+ MLX5E_100GBASE_LR4 = 23,
+ MLX5E_100BASE_TX = 24,
+ MLX5E_100BASE_T = 25,
+ MLX5E_10GBASE_T = 26,
+ MLX5E_25GBASE_CR = 27,
+ MLX5E_25GBASE_KR = 28,
+ MLX5E_25GBASE_SR = 29,
+ MLX5E_50GBASE_CR2 = 30,
+ MLX5E_50GBASE_KR2 = 31,
+ MLX5E_LINK_MODES_NUMBER,
+};
+
+#define MLX5E_PROT_MASK(link_mode) (1U << (link_mode))
+
+void mlx5e_send_nop(struct mlx5e_sq *sq, bool notify_hw);
+#if defined(NDO_SELECT_QUEUE_HAS_ACCEL_PRIV) || defined(HAVE_SELECT_QUEUE_FALLBACK_T)
+u16 mlx5e_select_queue(struct net_device *dev, struct sk_buff *skb,
+#ifdef HAVE_SELECT_QUEUE_FALLBACK_T
+ void *accel_priv, select_queue_fallback_t fallback);
+#else
+ void *accel_priv);
+#endif
+#else /* NDO_SELECT_QUEUE_HAS_ACCEL_PRIV || HAVE_SELECT_QUEUE_FALLBACK_T */
+u16 mlx5e_select_queue(struct net_device *dev, struct sk_buff *skb);
+#endif
+
+netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev);
+
+void mlx5e_completion_event(struct mlx5_core_cq *mcq);
+void mlx5e_cq_error_event(struct mlx5_core_cq *mcq, enum mlx5_event event);
+int mlx5e_napi_poll(struct napi_struct *napi, int budget);
+bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq);
+bool mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget);
+bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq);
+void mlx5e_prefetch_cqe(struct mlx5e_cq *cq);
+struct mlx5_cqe64 *mlx5e_get_cqe(struct mlx5e_cq *cq);
+
+void mlx5e_update_stats(struct mlx5e_priv *priv);
+
+int mlx5e_open_flow_table(struct mlx5e_priv *priv);
+void mlx5e_close_flow_table(struct mlx5e_priv *priv);
+void mlx5e_init_eth_addr(struct mlx5e_priv *priv);
+void mlx5e_set_rx_mode_core(struct mlx5e_priv *priv);
+void mlx5e_set_rx_mode_work(struct work_struct *work);
+
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3,10,0))
+int mlx5e_vlan_rx_add_vid(struct net_device *dev, __always_unused __be16 proto,
+ u16 vid);
+#elif (LINUX_VERSION_CODE >= KERNEL_VERSION(3,3,0))
+int mlx5e_vlan_rx_add_vid(struct net_device *dev, u16 vid);
+#else
+void mlx5e_vlan_rx_add_vid(struct net_device *dev, u16 vid);
+#endif
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3,10,0))
+int mlx5e_vlan_rx_kill_vid(struct net_device *dev, __always_unused __be16 proto,
+ u16 vid);
+#elif (LINUX_VERSION_CODE >= KERNEL_VERSION(3,3,0))
+int mlx5e_vlan_rx_kill_vid(struct net_device *dev, u16 vid);
+#else
+void mlx5e_vlan_rx_kill_vid(struct net_device *dev, u16 vid);
+#endif
+void mlx5e_enable_vlan_filter(struct mlx5e_priv *priv);
+void mlx5e_disable_vlan_filter(struct mlx5e_priv *priv);
+int mlx5e_add_all_vlan_rules(struct mlx5e_priv *priv);
+void mlx5e_del_all_vlan_rules(struct mlx5e_priv *priv);
+
+int mlx5e_open_locked(struct net_device *netdev);
+int mlx5e_close_locked(struct net_device *netdev);
+int mlx5e_update_priv_params(struct mlx5e_priv *priv,
+ struct mlx5e_params *new_params);
+
+void mlx5e_create_debugfs(struct mlx5e_priv *priv);
+void mlx5e_destroy_debugfs(struct mlx5e_priv *priv);
+
+static inline void mlx5e_tx_notify_hw(struct mlx5e_sq *sq,
+ struct mlx5e_tx_wqe *wqe, int bf_sz)
+{
+ u16 ofst = MLX5_BF_OFFSET + sq->bf_offset;
+
+ /* ensure wqe is visible to device before updating doorbell record */
+ wmb();
+
+ *sq->wq.db = cpu_to_be32(sq->pc);
+
+ /* ensure doorbell record is visible to device before ringing the
+ * doorbell */
+ wmb();
+
+ if (bf_sz) {
+ __iowrite64_copy(sq->uar_bf_map + ofst, &wqe->ctrl, bf_sz);
+
+ /* flush the write-combining mapped buffer */
+ wmb();
+
+ } else {
+ mlx5_write64((__be32 *)&wqe->ctrl, sq->uar_map + ofst, NULL);
+ }
+
+ sq->bf_offset ^= sq->bf_buf_size;
+}
+
+static inline void mlx5e_cq_arm(struct mlx5e_cq *cq)
+{
+ struct mlx5_core_cq *mcq;
+
+ mcq = &cq->mcq;
+ mlx5_cq_arm(mcq, MLX5_CQ_DB_REQ_NOT, mcq->uar->map, NULL, cq->wq.cc);
+}
+
+extern const struct ethtool_ops mlx5e_ethtool_ops;
+#ifdef HAVE_ETHTOOL_OPS_EXT
+extern const struct ethtool_ops_ext mlx5e_ethtool_ops_ext;
+#endif
+
+#include "post_kmod.h"
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/en_debugfs.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/en_debugfs.c
new file mode 100644
index 0000000..3886874
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/en_debugfs.c
@@ -0,0 +1,115 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2015, Mellanox Technologies inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "en.h"
+
+static void mlx5e_create_channel_debugfs(struct mlx5e_priv *priv,
+ int channel_num)
+{
+ int i;
+ char name[MLX5_MAX_NAME_LEN];
+ struct dentry *channel_root;
+ struct mlx5e_channel *channel;
+
+ snprintf(name, MLX5_MAX_NAME_LEN, "channel-%d", channel_num);
+ channel_root = debugfs_create_dir(name, priv->dfs_root);
+ if (!channel_root) {
+ netdev_err(priv->netdev,
+ "Failed to create channel debugfs for %s\n",
+ priv->netdev->name);
+ return;
+ }
+ priv->channel[channel_num]->dfs_root = channel_root;
+ channel = priv->channel[channel_num];
+
+ for (i = 0; i < priv->params.num_tc; i++) {
+ snprintf(name, MLX5_MAX_NAME_LEN, "sqn-%d", i);
+ debugfs_create_u32(name, S_IRUSR, channel_root,
+ &channel->sq[i].sqn);
+
+ snprintf(name, MLX5_MAX_NAME_LEN, "sq-cqn-%d", i);
+ debugfs_create_u32(name, S_IRUSR, channel_root,
+ &channel->sq[i].cq.mcq.cqn);
+ }
+
+ debugfs_create_u32("rqn", S_IRUSR, channel_root,
+ &channel->rq.rqn);
+
+ debugfs_create_u32("rq-cqn", S_IRUSR, channel_root,
+ &channel->rq.cq.mcq.cqn);
+}
+
+void mlx5e_create_debugfs(struct mlx5e_priv *priv)
+{
+ int i;
+ char name[MLX5_MAX_NAME_LEN];
+
+ priv->dfs_root = debugfs_create_dir(priv->netdev->name, NULL);
+ if (!priv->dfs_root) {
+ netdev_err(priv->netdev, "Failed to init debugfs files for %s\n",
+ priv->netdev->name);
+ return;
+ }
+
+ debugfs_create_u32("uar", S_IRUSR, priv->dfs_root,
+ &priv->cq_uar.index);
+ debugfs_create_u32("pdn", S_IRUSR, priv->dfs_root, &priv->pdn);
+ debugfs_create_u32("mkey", S_IRUSR, priv->dfs_root,
+ &priv->mr.key);
+ debugfs_create_u8("num_tc", S_IRUSR, priv->dfs_root,
+ &priv->params.num_tc);
+
+ for (i = 0; i < priv->params.num_tc; i++) {
+ snprintf(name, MLX5_MAX_NAME_LEN, "tisn-%d", i);
+ debugfs_create_u32(name, S_IRUSR, priv->dfs_root,
+ &priv->tisn[i]);
+ }
+
+ for (i = 0; i < MLX5E_NUM_TT; i++) {
+ snprintf(name, MLX5_MAX_NAME_LEN, "tirn-%d", i);
+ debugfs_create_u32(name, S_IRUSR, priv->dfs_root,
+ &priv->tirn[i]);
+ }
+
+ for (i = 0; i < priv->params.num_channels; i++)
+ mlx5e_create_channel_debugfs(priv, i);
+}
+
+void mlx5e_destroy_debugfs(struct mlx5e_priv *priv)
+{
+ debugfs_remove_recursive(priv->dfs_root);
+ priv->dfs_root = NULL;
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/en_ethtool.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/en_ethtool.c
new file mode 100644
index 0000000..af24fa2
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/en_ethtool.c
@@ -0,0 +1,816 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "en.h"
+
+static void mlx5e_get_drvinfo(struct net_device *dev,
+ struct ethtool_drvinfo *drvinfo)
+{
+ struct mlx5e_priv *priv = netdev_priv(dev);
+ struct mlx5_core_dev *mdev = priv->mdev;
+
+ strlcpy(drvinfo->driver, DRIVER_NAME, sizeof(drvinfo->driver));
+ strlcpy(drvinfo->version, DRIVER_VERSION " (" DRIVER_RELDATE ")",
+ sizeof(drvinfo->version));
+ snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
+ "%d.%d.%d",
+ fw_rev_maj(mdev), fw_rev_min(mdev), fw_rev_sub(mdev));
+ strlcpy(drvinfo->bus_info, pci_name(mdev->pdev),
+ sizeof(drvinfo->bus_info));
+}
+
+static const struct {
+ u32 supported;
+ u32 advertised;
+ u32 speed;
+} ptys2ethtool_table[MLX5E_LINK_MODES_NUMBER] = {
+ [MLX5E_1000BASE_CX_SGMII] = {
+ .supported = SUPPORTED_1000baseKX_Full,
+ .advertised = ADVERTISED_1000baseKX_Full,
+ .speed = SPEED_1000,
+ },
+ [MLX5E_1000BASE_KX] = {
+ .supported = SUPPORTED_1000baseKX_Full,
+ .advertised = ADVERTISED_1000baseKX_Full,
+ .speed = SPEED_1000,
+ },
+ [MLX5E_10GBASE_CX4] = {
+ .supported = SUPPORTED_10000baseKX4_Full,
+ .advertised = ADVERTISED_10000baseKX4_Full,
+ .speed = SPEED_10000,
+ },
+ [MLX5E_10GBASE_KX4] = {
+ .supported = SUPPORTED_10000baseKX4_Full,
+ .advertised = ADVERTISED_10000baseKX4_Full,
+ .speed = SPEED_10000,
+ },
+ [MLX5E_10GBASE_KR] = {
+ .supported = SUPPORTED_10000baseKR_Full,
+ .advertised = ADVERTISED_10000baseKR_Full,
+ .speed = SPEED_10000,
+ },
+ [MLX5E_20GBASE_KR2] = {
+ .supported = SUPPORTED_20000baseKR2_Full,
+ .advertised = ADVERTISED_20000baseKR2_Full,
+ .speed = SPEED_20000,
+ },
+ [MLX5E_40GBASE_CR4] = {
+ .supported = SUPPORTED_40000baseCR4_Full,
+ .advertised = ADVERTISED_40000baseCR4_Full,
+ .speed = SPEED_40000,
+ },
+ [MLX5E_40GBASE_KR4] = {
+ .supported = SUPPORTED_40000baseKR4_Full,
+ .advertised = ADVERTISED_40000baseKR4_Full,
+ .speed = SPEED_40000,
+ },
+ [MLX5E_56GBASE_R4] = {
+ .supported = SUPPORTED_56000baseKR4_Full,
+ .advertised = ADVERTISED_56000baseKR4_Full,
+ .speed = SPEED_56000,
+ },
+ [MLX5E_10GBASE_CR] = {
+ .supported = SUPPORTED_10000baseKR_Full,
+ .advertised = ADVERTISED_10000baseKR_Full,
+ .speed = SPEED_10000,
+ },
+ [MLX5E_10GBASE_SR] = {
+ .supported = SUPPORTED_10000baseKR_Full,
+ .advertised = ADVERTISED_10000baseKR_Full,
+ .speed = SPEED_10000,
+ },
+ [MLX5E_10GBASE_ER] = {
+ .supported = SUPPORTED_10000baseKR_Full,/* TODO: verify */
+ .advertised = ADVERTISED_10000baseKR_Full,
+ .speed = SPEED_10000,
+ },
+ [MLX5E_40GBASE_SR4] = {
+ .supported = SUPPORTED_40000baseSR4_Full,
+ .advertised = ADVERTISED_40000baseSR4_Full,
+ .speed = SPEED_40000,
+ },
+ [MLX5E_40GBASE_LR4] = {
+ .supported = SUPPORTED_40000baseLR4_Full,
+ .advertised = ADVERTISED_40000baseLR4_Full,
+ .speed = SPEED_40000,
+ },
+ [MLX5E_100GBASE_CR4] = {
+ .supported = /*SUPPORTED_100000baseCR4_Full*/ 0,
+ .advertised = /*ADVERTISED_100000baseCR4_Full*/ 0,
+ .speed = SPEED_100000,
+ },
+ [MLX5E_100GBASE_SR4] = {
+ .supported = /*SUPPORTED_100000baseSR4_Full*/ 0,
+ .advertised = /*ADVERTISED_100000baseSR4_Full*/ 0,
+ .speed = SPEED_100000,
+ },
+ [MLX5E_100GBASE_KR4] = {
+ .supported = /*SUPPORTED_100000baseKR4_Full*/ 0,
+ .advertised = /*ADVERTISED_100000baseKR4_Full*/ 0,
+ .speed = SPEED_100000,
+ },
+ [MLX5E_100GBASE_LR4] = {
+ .supported = /*SUPPORTED_100000baseLR4_Full*/ 0,
+ .advertised = /*ADVERTISED_100000baseLR4_Full*/ 0,
+ .speed = SPEED_100000,
+ },
+ [MLX5E_100BASE_TX] = {
+ .supported = /*SUPPORTED_100baseTX_Full*/ 0,
+ .advertised = /*ADVERTISED_100baseTX_Full*/ 0,
+ .speed = SPEED_100,
+ },
+ [MLX5E_100BASE_T] = {
+ .supported = SUPPORTED_100baseT_Full,
+ .advertised = ADVERTISED_100baseT_Full,
+ .speed = SPEED_100,
+ },
+ [MLX5E_10GBASE_T] = {
+ .supported = SUPPORTED_10000baseT_Full,
+ .advertised = ADVERTISED_10000baseT_Full,
+ .speed = SPEED_10000,
+ },
+ [MLX5E_25GBASE_CR] = {
+ .supported = /*SUPPORTED_25000baseCR_Full*/ 0,
+ .advertised = /*ADVERTISED_25000baseCR_Full*/ 0,
+ .speed = SPEED_25000,
+ },
+ [MLX5E_25GBASE_KR] = {
+ .supported = /*SUPPORTED_25000baseKR_Full*/ 0,
+ .advertised = /*ADVERTISED_25000baseKR_Full*/ 0,
+ .speed = SPEED_25000,
+ },
+ [MLX5E_25GBASE_SR] = {
+ .supported = /*SUPPORTED_25000baseSR_Full*/ 0,
+ .advertised = /*ADVERTISED_25000baseSR_Full*/ 0,
+ .speed = SPEED_25000,
+ },
+ [MLX5E_50GBASE_CR2] = {
+ .supported = /*SUPPORTED_50000baseCR2_Full*/ 0,
+ .advertised = /*ADVERTISED_50000baseCR2_Full*/ 0,
+ .speed = SPEED_50000,
+ },
+ [MLX5E_50GBASE_KR2] = {
+ .supported = /*SUPPORTED_50000baseKR2_Full*/ 0,
+ .advertised = /*ADVERTISED_50000baseKR2_Full*/ 0,
+ .speed = SPEED_50000,
+ },
+};
+
+static int mlx5e_get_sset_count(struct net_device *dev, int sset)
+{
+ struct mlx5e_priv *priv = netdev_priv(dev);
+
+ switch (sset) {
+ case ETH_SS_STATS:
+ return NUM_VPORT_COUNTERS + NUM_PPORT_COUNTERS +
+ priv->params.num_channels * NUM_RQ_STATS +
+ priv->params.num_channels * priv->params.num_tc *
+ NUM_SQ_STATS;
+#ifdef CONFIG_COMPAT_LRO_ENABLED_IPOIB
+ case ETH_SS_PRIV_FLAGS:
+ return ARRAY_SIZE(mlx5e_priv_flags);
+#endif
+ /* fallthrough */
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
+static void mlx5e_get_strings(struct net_device *dev,
+ uint32_t stringset, uint8_t *data)
+{
+ int i, j, tc, idx = 0;
+ struct mlx5e_priv *priv = netdev_priv(dev);
+
+ switch (stringset) {
+ case ETH_SS_PRIV_FLAGS:
+#ifdef CONFIG_COMPAT_LRO_ENABLED_IPOIB
+ for (i = 0; i < ARRAY_SIZE(mlx5e_priv_flags); i++)
+ strcpy(data + i * ETH_GSTRING_LEN,
+ mlx5e_priv_flags[i]);
+#endif
+ break;
+
+ case ETH_SS_TEST:
+ /* TODO: not implemented yet */
+ break;
+
+ case ETH_SS_STATS:
+ /* VPORT counters */
+ for (i = 0; i < NUM_VPORT_COUNTERS; i++)
+ strcpy(data + (idx++) * ETH_GSTRING_LEN,
+ vport_strings[i]);
+
+ /* PPORT counters */
+ for (i = 0; i < NUM_PPORT_COUNTERS; i++)
+ strcpy(data + (idx++) * ETH_GSTRING_LEN,
+ pport_strings[i]);
+
+ /* per channel counters */
+ for (i = 0; i < priv->params.num_channels; i++)
+ for (j = 0; j < NUM_RQ_STATS; j++)
+ sprintf(data + (idx++) * ETH_GSTRING_LEN,
+ "rx%d_%s", i, rq_stats_strings[j]);
+
+ for (i = 0; i < priv->params.num_channels; i++)
+ for (tc = 0; tc < priv->params.num_tc; tc++)
+ for (j = 0; j < NUM_SQ_STATS; j++)
+ sprintf(data +
+ (idx++) * ETH_GSTRING_LEN,
+ "tx%d_%d_%s", i, tc,
+ sq_stats_strings[j]);
+ break;
+ }
+}
+
+static void mlx5e_get_ethtool_stats(struct net_device *dev,
+ struct ethtool_stats *stats, u64 *data)
+{
+ struct mlx5e_priv *priv = netdev_priv(dev);
+ int i, j, tc, idx = 0;
+
+ if (!data)
+ return;
+
+ mutex_lock(&priv->state_lock);
+ if (test_bit(MLX5E_STATE_OPENED, &priv->state))
+ mlx5e_update_stats(priv);
+ mutex_unlock(&priv->state_lock);
+
+ for (i = 0; i < NUM_VPORT_COUNTERS; i++)
+ data[idx++] = ((u64 *)&priv->stats.vport)[i];
+
+ for (i = 0; i < NUM_PPORT_COUNTERS; i++)
+ data[idx++] = be64_to_cpu(((__be64 *)&priv->stats.pport)[i]);
+
+ /* per channel counters */
+ for (i = 0; i < priv->params.num_channels; i++)
+ for (j = 0; j < NUM_RQ_STATS; j++)
+ data[idx++] = !test_bit(MLX5E_STATE_OPENED,
+ &priv->state) ? 0 :
+ ((u64 *)&priv->channel[i]->rq.stats)[j];
+
+ for (i = 0; i < priv->params.num_channels; i++)
+ for (tc = 0; tc < priv->params.num_tc; tc++)
+ for (j = 0; j < NUM_SQ_STATS; j++)
+ data[idx++] = !test_bit(MLX5E_STATE_OPENED,
+ &priv->state) ? 0 :
+ ((u64 *)&priv->channel[i]->sq[tc].stats)[j];
+}
+
+static void mlx5e_get_ringparam(struct net_device *dev,
+ struct ethtool_ringparam *param)
+{
+ struct mlx5e_priv *priv = netdev_priv(dev);
+
+ param->rx_max_pending = 1 << MLX5E_PARAMS_MAXIMUM_LOG_RQ_SIZE;
+ param->tx_max_pending = 1 << MLX5E_PARAMS_MAXIMUM_LOG_SQ_SIZE;
+ param->rx_pending = 1 << priv->params.log_rq_size;
+ param->tx_pending = 1 << priv->params.log_sq_size;
+}
+
+static int mlx5e_set_ringparam(struct net_device *dev,
+ struct ethtool_ringparam *param)
+{
+ struct mlx5e_priv *priv = netdev_priv(dev);
+ struct mlx5e_params new_params;
+ u16 min_rx_wqes;
+ u8 log_rq_size;
+ u8 log_sq_size;
+ int err = 0;
+
+ if (param->rx_jumbo_pending) {
+ netdev_info(dev, "%s: rx_jumbo_pending not supported\n",
+ __func__);
+ return -EINVAL;
+ }
+ if (param->rx_mini_pending) {
+ netdev_info(dev, "%s: rx_mini_pending not supported\n",
+ __func__);
+ return -EINVAL;
+ }
+ if (param->rx_pending < (1 << MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE)) {
+ netdev_info(dev, "%s: rx_pending (%d) < min (%d)\n",
+ __func__, param->rx_pending,
+ 1 << MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE);
+ return -EINVAL;
+ }
+ if (param->rx_pending > (1 << MLX5E_PARAMS_MAXIMUM_LOG_RQ_SIZE)) {
+ netdev_info(dev, "%s: rx_pending (%d) > max (%d)\n",
+ __func__, param->rx_pending,
+ 1 << MLX5E_PARAMS_MAXIMUM_LOG_RQ_SIZE);
+ return -EINVAL;
+ }
+ if (param->tx_pending < (1 << MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE)) {
+ netdev_info(dev, "%s: tx_pending (%d) < min (%d)\n",
+ __func__, param->tx_pending,
+ 1 << MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE);
+ return -EINVAL;
+ }
+ if (param->tx_pending > (1 << MLX5E_PARAMS_MAXIMUM_LOG_SQ_SIZE)) {
+ netdev_info(dev, "%s: tx_pending (%d) > max (%d)\n",
+ __func__, param->tx_pending,
+ 1 << MLX5E_PARAMS_MAXIMUM_LOG_SQ_SIZE);
+ return -EINVAL;
+ }
+
+ log_rq_size = order_base_2(param->rx_pending);
+ log_sq_size = order_base_2(param->tx_pending);
+ min_rx_wqes = min_t(u16, param->rx_pending - 1,
+ MLX5E_PARAMS_DEFAULT_MIN_RX_WQES);
+
+ if (log_rq_size == priv->params.log_rq_size &&
+ log_sq_size == priv->params.log_sq_size &&
+ min_rx_wqes == priv->params.min_rx_wqes)
+ return 0;
+
+ mutex_lock(&priv->state_lock);
+ new_params = priv->params;
+ new_params.log_rq_size = log_rq_size;
+ new_params.log_sq_size = log_sq_size;
+ new_params.min_rx_wqes = min_rx_wqes;
+ err = mlx5e_update_priv_params(priv, &new_params);
+ mutex_unlock(&priv->state_lock);
+
+ return err;
+}
+
+#if defined(HAVE_GET_SET_CHANNELS) || defined(HAVE_GET_SET_CHANNELS_EXT)
+static void mlx5e_get_channels(struct net_device *dev,
+ struct ethtool_channels *ch)
+{
+ struct mlx5e_priv *priv = netdev_priv(dev);
+ int ncv = priv->mdev->priv.eq_table.num_comp_vectors;
+
+ ch->max_combined = ncv;
+ ch->combined_count = priv->params.num_channels;
+}
+
+static int mlx5e_set_channels(struct net_device *dev,
+ struct ethtool_channels *ch)
+{
+ struct mlx5e_priv *priv = netdev_priv(dev);
+ int ncv = priv->mdev->priv.eq_table.num_comp_vectors;
+ unsigned int count = ch->combined_count;
+ struct mlx5e_params new_params;
+ int err = 0;
+
+ if (!count) {
+ netdev_info(dev, "%s: combined_count=0 not supported\n",
+ __func__);
+ return -EINVAL;
+ }
+ if (ch->rx_count || ch->tx_count) {
+ netdev_info(dev, "%s: separate rx/tx count not supported\n",
+ __func__);
+ return -EINVAL;
+ }
+ if (count > ncv) {
+ netdev_info(dev, "%s: count (%d) > max (%d)\n",
+ __func__, count, ncv);
+ return -EINVAL;
+ }
+
+ if (priv->params.num_channels == count)
+ return 0;
+
+ mutex_lock(&priv->state_lock);
+ new_params = priv->params;
+ new_params.num_channels = count;
+ err = mlx5e_update_priv_params(priv, &new_params);
+ mutex_unlock(&priv->state_lock);
+
+ return err;
+}
+#endif
+
+static int mlx5e_get_coalesce(struct net_device *netdev,
+ struct ethtool_coalesce *coal)
+{
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+
+ coal->rx_coalesce_usecs = priv->params.rx_cq_moderation_usec;
+ coal->rx_max_coalesced_frames = priv->params.rx_cq_moderation_pkts;
+ coal->tx_coalesce_usecs = priv->params.tx_cq_moderation_usec;
+ coal->tx_max_coalesced_frames = priv->params.tx_cq_moderation_pkts;
+
+ return 0;
+}
+
+static int mlx5e_set_coalesce(struct net_device *netdev,
+ struct ethtool_coalesce *coal)
+{
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+ struct mlx5_core_dev *mdev = priv->mdev;
+ struct mlx5e_channel *c;
+ int tc;
+ int i;
+
+ priv->params.tx_cq_moderation_usec = coal->tx_coalesce_usecs;
+ priv->params.tx_cq_moderation_pkts = coal->tx_max_coalesced_frames;
+ priv->params.rx_cq_moderation_usec = coal->rx_coalesce_usecs;
+ priv->params.rx_cq_moderation_pkts = coal->rx_max_coalesced_frames;
+
+ for (i = 0; i < priv->params.num_channels; ++i) {
+ c = priv->channel[i];
+
+ for (tc = 0; tc < c->num_tc; tc++) {
+ mlx5_core_modify_cq_moderation(mdev,
+ &c->sq[tc].cq.mcq,
+ coal->tx_coalesce_usecs,
+ coal->tx_max_coalesced_frames);
+ }
+
+ mlx5_core_modify_cq_moderation(mdev, &c->rq.cq.mcq,
+ coal->rx_coalesce_usecs,
+ coal->rx_max_coalesced_frames);
+ }
+
+ return 0;
+}
+
+static u32 ptys2ethtool_supported_link(u32 eth_proto_cap)
+{
+ int i;
+ u32 supported_modes = 0;
+
+ for (i = 0; i < MLX5E_LINK_MODES_NUMBER; ++i) {
+ if (eth_proto_cap & MLX5E_PROT_MASK(i))
+ supported_modes |= ptys2ethtool_table[i].supported;
+ }
+ return supported_modes;
+}
+
+static u32 ptys2ethtool_adver_link(u32 eth_proto_cap)
+{
+ int i;
+ u32 advertising_modes = 0;
+
+ for (i = 0; i < MLX5E_LINK_MODES_NUMBER; ++i) {
+ if (eth_proto_cap & MLX5E_PROT_MASK(i))
+ advertising_modes |= ptys2ethtool_table[i].advertised;
+ }
+ return advertising_modes;
+}
+
+static u32 ptys2ethtool_supported_port(u32 eth_proto_cap)
+{
+ /*
+ TODO:
+ MLX5E_40GBASE_LR4 = 16,
+ MLX5E_10GBASE_ER = 14,
+ MLX5E_10GBASE_CX4 = 2,
+ */
+
+ if (eth_proto_cap & (MLX5E_PROT_MASK(MLX5E_10GBASE_CR)
+ | MLX5E_PROT_MASK(MLX5E_10GBASE_SR)
+ | MLX5E_PROT_MASK(MLX5E_40GBASE_CR4)
+ | MLX5E_PROT_MASK(MLX5E_40GBASE_SR4)
+ | MLX5E_PROT_MASK(MLX5E_100GBASE_SR4)
+ | MLX5E_PROT_MASK(MLX5E_1000BASE_CX_SGMII))) {
+ return SUPPORTED_FIBRE;
+ }
+
+ if (eth_proto_cap & (MLX5E_PROT_MASK(MLX5E_100GBASE_KR4)
+ | MLX5E_PROT_MASK(MLX5E_40GBASE_KR4)
+ | MLX5E_PROT_MASK(MLX5E_10GBASE_KR)
+ | MLX5E_PROT_MASK(MLX5E_10GBASE_KX4)
+ | MLX5E_PROT_MASK(MLX5E_1000BASE_KX))) {
+ return SUPPORTED_Backplane;
+ }
+ return 0;
+}
+
+static void get_speed_duplex(struct net_device *netdev,
+ u32 eth_proto_oper,
+ struct ethtool_cmd *cmd)
+{
+ int i;
+ u32 speed = SPEED_UNKNOWN;
+ u8 duplex = DUPLEX_UNKNOWN;
+
+ if (!netif_carrier_ok(netdev))
+ goto out;
+
+ for (i = 0; i < MLX5E_LINK_MODES_NUMBER; ++i) {
+ if (eth_proto_oper & MLX5E_PROT_MASK(i)) {
+ speed = ptys2ethtool_table[i].speed;
+ duplex = DUPLEX_FULL;
+ break;
+ }
+ }
+out:
+ ethtool_cmd_speed_set(cmd, speed);
+ cmd->duplex = duplex;
+}
+
+static void get_supported(u32 eth_proto_cap, u32 *supported)
+{
+ *supported |= ptys2ethtool_supported_port(eth_proto_cap);
+ *supported |= ptys2ethtool_supported_link(eth_proto_cap);
+ *supported |= SUPPORTED_Pause | SUPPORTED_Asym_Pause;
+}
+
+static void get_advertising(u32 eth_proto_cap, u8 tx_pause,
+ u8 rx_pause, u32 *advertising)
+{
+ *advertising |= ptys2ethtool_adver_link(eth_proto_cap);
+ *advertising |= tx_pause ? ADVERTISED_Pause : 0;
+ *advertising |= (tx_pause ^ rx_pause) ? ADVERTISED_Asym_Pause : 0;
+}
+
+static u8 get_connector_port(u32 eth_proto)
+{
+ /*
+ TODO:
+ MLX5E_40GBASE_LR4 = 16,
+ MLX5E_10GBASE_ER = 14,
+ MLX5E_10GBASE_CX4 = 2,
+ */
+
+ if (eth_proto & (MLX5E_PROT_MASK(MLX5E_10GBASE_SR)
+ | MLX5E_PROT_MASK(MLX5E_40GBASE_SR4)
+ | MLX5E_PROT_MASK(MLX5E_100GBASE_SR4)
+ | MLX5E_PROT_MASK(MLX5E_1000BASE_CX_SGMII))) {
+ return PORT_FIBRE;
+ }
+
+ if (eth_proto & (MLX5E_PROT_MASK(MLX5E_40GBASE_CR4)
+ | MLX5E_PROT_MASK(MLX5E_10GBASE_CR)
+ | MLX5E_PROT_MASK(MLX5E_100GBASE_CR4))) {
+ return PORT_DA;
+ }
+
+ if (eth_proto & (MLX5E_PROT_MASK(MLX5E_10GBASE_KX4)
+ | MLX5E_PROT_MASK(MLX5E_10GBASE_KR)
+ | MLX5E_PROT_MASK(MLX5E_40GBASE_KR4)
+ | MLX5E_PROT_MASK(MLX5E_100GBASE_KR4))) {
+ return PORT_NONE;
+ }
+
+ return PORT_OTHER;
+}
+
+static void get_lp_advertising(u32 eth_proto_lp, u32 *lp_advertising)
+{
+
+ *lp_advertising = ptys2ethtool_adver_link(eth_proto_lp);
+}
+
+static int mlx5e_get_settings(struct net_device *netdev,
+ struct ethtool_cmd *cmd)
+{
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+ struct mlx5_core_dev *mdev = priv->mdev;
+ u32 out[MLX5_ST_SZ_DW(ptys_reg)];
+ u32 eth_proto_cap;
+ u32 eth_proto_admin;
+ u32 eth_proto_lp;
+ u32 eth_proto_oper;
+ int err;
+
+ err = mlx5_query_port_ptys(mdev, out, sizeof(out), MLX5_PTYS_EN);
+
+ if (err) {
+ netdev_err(netdev, "%s: query port ptys failed: %d\n",
+ __func__, err);
+ goto err_query_ptys;
+ }
+
+ eth_proto_cap = MLX5_GET(ptys_reg, out, eth_proto_capability);
+ eth_proto_admin = MLX5_GET(ptys_reg, out, eth_proto_admin);
+ eth_proto_oper = MLX5_GET(ptys_reg, out, eth_proto_oper);
+ eth_proto_lp = MLX5_GET(ptys_reg, out, eth_proto_lp_advertise);
+
+ cmd->supported = 0;
+ cmd->advertising = 0;
+
+ get_supported(eth_proto_cap, &cmd->supported);
+ get_advertising(eth_proto_admin, 0, 0, &cmd->advertising);
+ get_speed_duplex(netdev, eth_proto_oper, cmd);
+
+ eth_proto_oper = eth_proto_oper ? eth_proto_oper : eth_proto_cap;
+
+ cmd->port = get_connector_port(eth_proto_oper);
+ get_lp_advertising(eth_proto_lp, &cmd->lp_advertising);
+
+ cmd->transceiver = XCVR_INTERNAL;
+
+ /* TODO
+ set Pause
+ cmd->supported ? SUPPORTED_Autoneg
+ cmd->advertising ? ADVERTISED_Autoneg
+ cmd->autoneg ?
+ cmd->phy_address = 0;
+ cmd->mdio_support = 0;
+ cmd->maxtxpkt = 0;
+ cmd->maxrxpkt = 0;
+ cmd->eth_tp_mdix = ETH_TP_MDI_INVALID;
+ cmd->eth_tp_mdix_ctrl = ETH_TP_MDI_AUTO;
+
+ cmd->lp_advertising |= (priv->port_state.flags & MLX4_EN_PORT_ANC) ?
+ ADVERTISED_Autoneg : 0;
+ */
+
+err_query_ptys:
+ return err;
+}
+
+static u32 mlx5e_ethtool2ptys_adver_link(u32 link_modes)
+{
+ u32 i, ptys_modes = 0;
+
+ for (i = 0; i < MLX5E_LINK_MODES_NUMBER; ++i) {
+ if (ptys2ethtool_table[i].advertised & link_modes)
+ ptys_modes |= MLX5E_PROT_MASK(i);
+ }
+
+ return ptys_modes;
+}
+
+static u32 mlx5e_ethtool2ptys_speed_link(u32 speed)
+{
+ u32 i, speed_links = 0;
+
+ for (i = 0; i < MLX5E_LINK_MODES_NUMBER; ++i) {
+ if (ptys2ethtool_table[i].speed == speed)
+ speed_links |= MLX5E_PROT_MASK(i);
+ }
+
+ return speed_links;
+}
+
+static int mlx5e_set_settings(struct net_device *netdev,
+ struct ethtool_cmd *cmd)
+{
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+ struct mlx5_core_dev *mdev = priv->mdev;
+ u32 link_modes;
+ u32 speed;
+ u32 eth_proto_cap, eth_proto_admin;
+ u8 port_status;
+ int err;
+
+ speed = ethtool_cmd_speed(cmd);
+
+ link_modes = cmd->autoneg == AUTONEG_ENABLE ?
+ mlx5e_ethtool2ptys_adver_link(cmd->advertising) :
+ mlx5e_ethtool2ptys_speed_link(speed);
+
+ err = mlx5_query_port_proto_cap(mdev, &eth_proto_cap, MLX5_PTYS_EN);
+ if (err) {
+ netdev_err(netdev, "%s: query port eth proto cap failed: %d\n",
+ __func__, err);
+ goto out;
+ }
+
+ link_modes = link_modes & eth_proto_cap;
+ if (!link_modes) {
+ netdev_err(netdev, "%s: Not supported link mode(s) requested\n",
+ __func__);
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = mlx5_query_port_proto_admin(mdev, &eth_proto_admin, MLX5_PTYS_EN);
+ if (err) {
+ netdev_err(netdev, "%s: query port eth proto admin failed: %d\n",
+ __func__, err);
+ goto out;
+ }
+
+ if (link_modes == eth_proto_admin)
+ goto out;
+
+ err = mlx5_set_port_proto(mdev, link_modes, MLX5_PTYS_EN);
+ if (err) {
+ netdev_err(netdev, "%s: set port eth proto admin failed: %d\n",
+ __func__, err);
+ goto out;
+ }
+
+ err = mlx5_query_port_status(mdev, &port_status);
+ if (err)
+ goto out;
+
+ if (port_status == MLX5_PORT_DOWN)
+ return 0;
+
+ err = mlx5_set_port_status(mdev, MLX5_PORT_DOWN);
+ if (err)
+ goto out;
+ err = mlx5_set_port_status(mdev, MLX5_PORT_UP);
+out:
+ return err;
+}
+
+#ifdef CONFIG_COMPAT_LRO_ENABLED_IPOIB
+static int mlx5e_set_priv_flags(struct net_device *dev, u32 flags)
+{
+ struct mlx5e_priv *priv = netdev_priv(dev);
+ u32 changes = flags ^ priv->pflags;
+ struct mlx5e_params new_params;
+ bool update_params = false;
+ int i = 0;
+
+ mutex_lock(&priv->state_lock);
+ new_params = priv->params;
+
+ if (changes & MLX5E_PRIV_FLAG_SWLRO) {
+ priv->pflags ^= MLX5E_PRIV_FLAG_SWLRO;
+ if (!test_bit(MLX5E_STATE_OPENED, &priv->state))
+ goto out;
+ for (i = 0; i < priv->params.num_channels; i++)
+ priv->channel[i]->rq.flags ^= MLX5E_RQ_FLAG_SWLRO;
+ }
+
+ if (changes & MLX5E_PRIV_FLAG_HWLRO) {
+ new_params.lro_en = !!(flags & MLX5E_PRIV_FLAG_HWLRO);
+ priv->pflags ^= MLX5E_PRIV_FLAG_HWLRO;
+ update_params = true;
+ if (new_params.lro_en)
+ priv->netdev->flags |= NETIF_F_LRO;
+ else
+ priv->netdev->flags &= ~NETIF_F_LRO;
+ }
+
+ if (update_params)
+ mlx5e_update_priv_params(priv, &new_params);
+out:
+ mutex_unlock(&priv->state_lock);
+ return !(flags == priv->pflags);
+}
+
+static u32 mlx5e_get_priv_flags(struct net_device *dev)
+{
+ struct mlx5e_priv *priv = netdev_priv(dev);
+
+ return priv->pflags;
+}
+#endif
+
+const struct ethtool_ops mlx5e_ethtool_ops = {
+ .get_drvinfo = mlx5e_get_drvinfo,
+ .get_link = ethtool_op_get_link,
+ .get_strings = mlx5e_get_strings,
+ .get_sset_count = mlx5e_get_sset_count,
+ .get_ethtool_stats = mlx5e_get_ethtool_stats,
+ .get_ringparam = mlx5e_get_ringparam,
+ .set_ringparam = mlx5e_set_ringparam,
+#ifdef HAVE_GET_SET_CHANNELS
+ .get_channels = mlx5e_get_channels,
+ .set_channels = mlx5e_set_channels,
+#endif
+ .get_coalesce = mlx5e_get_coalesce,
+ .set_coalesce = mlx5e_set_coalesce,
+ .get_settings = mlx5e_get_settings,
+ .set_settings = mlx5e_set_settings,
+#ifdef CONFIG_COMPAT_LRO_ENABLED_IPOIB
+ .set_priv_flags = mlx5e_set_priv_flags,
+ .get_priv_flags = mlx5e_get_priv_flags,
+#endif
+};
+
+
+#ifdef HAVE_ETHTOOL_OPS_EXT
+const struct ethtool_ops_ext mlx5e_ethtool_ops_ext = {
+ .size = sizeof(struct ethtool_ops_ext),
+#ifdef HAVE_GET_SET_CHANNELS_EXT
+ .get_channels = mlx5e_get_channels,
+ .set_channels = mlx5e_set_channels,
+#endif
+};
+#endif
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/en_flow_table.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/en_flow_table.c
new file mode 100644
index 0000000..0a1b8f8
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/en_flow_table.c
@@ -0,0 +1,1014 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "en.h"
+
+enum {
+ MLX5E_FULLMATCH = 0,
+ MLX5E_ALLMULTI = 1,
+ MLX5E_PROMISC = 2,
+};
+
+enum {
+ MLX5E_UC = 0,
+ MLX5E_MC_IPV4 = 1,
+ MLX5E_MC_IPV6 = 2,
+ MLX5E_MC_OTHER = 3,
+};
+
+enum {
+ MLX5E_ACTION_NONE = 0,
+ MLX5E_ACTION_ADD = 1,
+ MLX5E_ACTION_DEL = 2,
+};
+
+struct mlx5e_eth_addr_hash_node {
+ struct hlist_node hlist;
+ u8 action;
+ struct mlx5e_eth_addr_info ai;
+};
+
+static inline int mlx5e_hash_eth_addr(u8 *addr)
+{
+ return addr[5];
+}
+
+static void mlx5e_add_eth_addr_to_hash(struct hlist_head *hash, u8 *addr)
+{
+ struct mlx5e_eth_addr_hash_node *hn;
+ int ix = mlx5e_hash_eth_addr(addr);
+ int found = 0;
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(3,9,0))
+ struct hlist_node *pos;
+#endif
+
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(3,9,0))
+ hlist_for_each_entry(hn, pos, &hash[ix], hlist)
+#else
+ hlist_for_each_entry(hn, &hash[ix], hlist)
+#endif
+ if (ether_addr_equal_64bits(hn->ai.addr, addr)) {
+ found = 1;
+ break;
+ }
+
+ if (found) {
+ hn->action = MLX5E_ACTION_NONE;
+ return;
+ }
+
+ hn = kzalloc(sizeof(*hn), GFP_ATOMIC);
+ if (!hn)
+ return;
+
+ ether_addr_copy(hn->ai.addr, addr);
+ hn->action = MLX5E_ACTION_ADD;
+
+ hlist_add_head(&hn->hlist, &hash[ix]);
+}
+
+static void mlx5e_del_eth_addr_from_hash(struct mlx5e_eth_addr_hash_node *hn)
+{
+ hlist_del(&hn->hlist);
+ kfree(hn);
+}
+
+static void mlx5e_del_eth_addr_from_flow_table(struct mlx5e_priv *priv,
+ struct mlx5e_eth_addr_info *ai)
+{
+ void *ft = priv->ft.main;
+
+ if (ai->tt_vec & (1 << MLX5E_TT_IPV6_IPSEC_ESP))
+ mlx5_del_flow_table_entry(ft,
+ ai->ft_ix[MLX5E_TT_IPV6_IPSEC_ESP]);
+
+ if (ai->tt_vec & (1 << MLX5E_TT_IPV4_IPSEC_ESP))
+ mlx5_del_flow_table_entry(ft,
+ ai->ft_ix[MLX5E_TT_IPV4_IPSEC_ESP]);
+
+ if (ai->tt_vec & (1 << MLX5E_TT_IPV6_IPSEC_AH))
+ mlx5_del_flow_table_entry(ft,
+ ai->ft_ix[MLX5E_TT_IPV6_IPSEC_AH]);
+
+ if (ai->tt_vec & (1 << MLX5E_TT_IPV4_IPSEC_AH))
+ mlx5_del_flow_table_entry(ft,
+ ai->ft_ix[MLX5E_TT_IPV4_IPSEC_AH]);
+
+ if (ai->tt_vec & (1 << MLX5E_TT_IPV6_TCP))
+ mlx5_del_flow_table_entry(ft,
+ ai->ft_ix[MLX5E_TT_IPV6_TCP]);
+
+ if (ai->tt_vec & (1 << MLX5E_TT_IPV4_TCP))
+ mlx5_del_flow_table_entry(ft,
+ ai->ft_ix[MLX5E_TT_IPV4_TCP]);
+
+ if (ai->tt_vec & (1 << MLX5E_TT_IPV6_UDP))
+ mlx5_del_flow_table_entry(ft,
+ ai->ft_ix[MLX5E_TT_IPV6_UDP]);
+
+ if (ai->tt_vec & (1 << MLX5E_TT_IPV4_UDP))
+ mlx5_del_flow_table_entry(ft,
+ ai->ft_ix[MLX5E_TT_IPV4_UDP]);
+
+ if (ai->tt_vec & (1 << MLX5E_TT_IPV6))
+ mlx5_del_flow_table_entry(ft,
+ ai->ft_ix[MLX5E_TT_IPV6]);
+
+ if (ai->tt_vec & (1 << MLX5E_TT_IPV4))
+ mlx5_del_flow_table_entry(ft,
+ ai->ft_ix[MLX5E_TT_IPV4]);
+
+ if (ai->tt_vec & (1 << MLX5E_TT_ANY))
+ mlx5_del_flow_table_entry(ft,
+ ai->ft_ix[MLX5E_TT_ANY]);
+}
+
+static int mlx5e_get_eth_addr_type(u8 *addr)
+{
+ if (is_unicast_ether_addr(addr))
+ return MLX5E_UC;
+
+ if ((addr[0] == 0x01) &&
+ (addr[1] == 0x00) &&
+ (addr[2] == 0x5e) &&
+ !(addr[3] & 0x80))
+ return MLX5E_MC_IPV4;
+
+ if ((addr[0] == 0x33) &&
+ (addr[1] == 0x33))
+ return MLX5E_MC_IPV6;
+
+ return MLX5E_MC_OTHER;
+}
+
+static u32 mlx5e_get_tt_vec(struct mlx5e_eth_addr_info *ai, int type)
+{
+ int eth_addr_type;
+ u32 ret;
+
+ switch (type) {
+ case MLX5E_FULLMATCH:
+ eth_addr_type = mlx5e_get_eth_addr_type(ai->addr);
+ switch (eth_addr_type) {
+ case MLX5E_UC:
+ ret =
+ (1 << MLX5E_TT_IPV4_TCP) |
+ (1 << MLX5E_TT_IPV6_TCP) |
+ (1 << MLX5E_TT_IPV4_UDP) |
+ (1 << MLX5E_TT_IPV6_UDP) |
+ (1 << MLX5E_TT_IPV4_IPSEC_AH) |
+ (1 << MLX5E_TT_IPV6_IPSEC_AH) |
+ (1 << MLX5E_TT_IPV4_IPSEC_ESP) |
+ (1 << MLX5E_TT_IPV6_IPSEC_ESP) |
+ (1 << MLX5E_TT_IPV4) |
+ (1 << MLX5E_TT_IPV6) |
+ (1 << MLX5E_TT_ANY) |
+ 0;
+ break;
+
+ case MLX5E_MC_IPV4:
+ ret =
+ (1 << MLX5E_TT_IPV4_UDP) |
+ (1 << MLX5E_TT_IPV4) |
+ 0;
+ break;
+
+ case MLX5E_MC_IPV6:
+ ret =
+ (1 << MLX5E_TT_IPV6_UDP) |
+ (1 << MLX5E_TT_IPV6) |
+ 0;
+ break;
+
+ case MLX5E_MC_OTHER:
+ ret =
+ (1 << MLX5E_TT_ANY) |
+ 0;
+ break;
+ }
+
+ break;
+
+ case MLX5E_ALLMULTI:
+ ret =
+ (1 << MLX5E_TT_IPV4_UDP) |
+ (1 << MLX5E_TT_IPV6_UDP) |
+ (1 << MLX5E_TT_IPV4) |
+ (1 << MLX5E_TT_IPV6) |
+ (1 << MLX5E_TT_ANY) |
+ 0;
+ break;
+
+ default: /* MLX5E_PROMISC */
+ ret =
+ (1 << MLX5E_TT_IPV4_TCP) |
+ (1 << MLX5E_TT_IPV6_TCP) |
+ (1 << MLX5E_TT_IPV4_UDP) |
+ (1 << MLX5E_TT_IPV6_UDP) |
+ (1 << MLX5E_TT_IPV4_IPSEC_AH) |
+ (1 << MLX5E_TT_IPV6_IPSEC_AH) |
+ (1 << MLX5E_TT_IPV4_IPSEC_ESP) |
+ (1 << MLX5E_TT_IPV6_IPSEC_ESP) |
+ (1 << MLX5E_TT_IPV4) |
+ (1 << MLX5E_TT_IPV6) |
+ (1 << MLX5E_TT_ANY) |
+ 0;
+ break;
+ }
+
+ return ret;
+}
+
+static int __mlx5e_add_eth_addr_rule(struct mlx5e_priv *priv,
+ struct mlx5e_eth_addr_info *ai, int type,
+ void *flow_context, void *match_criteria)
+{
+ u8 match_criteria_enable = 0;
+ void *match_value;
+ void *dest;
+ u8 *dmac;
+ u8 *match_criteria_dmac;
+ void *ft = priv->ft.main;
+ u32 *tirn = priv->tirn;
+ u32 *ft_ix;
+ u32 tt_vec;
+ int err;
+
+ match_value = MLX5_ADDR_OF(flow_context, flow_context, match_value);
+ dmac = MLX5_ADDR_OF(fte_match_param, match_value,
+ outer_headers.dmac_47_16);
+ match_criteria_dmac = MLX5_ADDR_OF(fte_match_param, match_criteria,
+ outer_headers.dmac_47_16);
+ dest = MLX5_ADDR_OF(flow_context, flow_context, destination);
+
+ MLX5_SET(flow_context, flow_context, action,
+ MLX5_FLOW_CONTEXT_ACTION_FWD_DEST);
+ MLX5_SET(flow_context, flow_context, destination_list_size, 1);
+ MLX5_SET(dest_format_struct, dest, destination_type,
+ MLX5_FLOW_CONTEXT_DEST_TYPE_TIR);
+
+ switch (type) {
+ case MLX5E_FULLMATCH:
+ match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
+ memset(match_criteria_dmac, 0xff, ETH_ALEN);
+ ether_addr_copy(dmac, ai->addr);
+ break;
+
+ case MLX5E_ALLMULTI:
+ match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
+ match_criteria_dmac[0] = 0x01;
+ dmac[0] = 0x01;
+ break;
+
+ case MLX5E_PROMISC:
+ break;
+ }
+
+ tt_vec = mlx5e_get_tt_vec(ai, type);
+
+ ft_ix = &ai->ft_ix[MLX5E_TT_ANY];
+ if (tt_vec & (1 << MLX5E_TT_ANY)) {
+ MLX5_SET(dest_format_struct, dest, destination_id,
+ tirn[MLX5E_TT_ANY]);
+ err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
+ match_criteria, flow_context,
+ ft_ix);
+ if (err) {
+ mlx5e_del_eth_addr_from_flow_table(priv, ai);
+ return err;
+ }
+ ai->tt_vec |= (1 << MLX5E_TT_ANY);
+ }
+
+ match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
+ MLX5_SET_TO_ONES(fte_match_param, match_criteria,
+ outer_headers.ethertype);
+
+ ft_ix = &ai->ft_ix[MLX5E_TT_IPV4];
+ if (tt_vec & (1 << MLX5E_TT_IPV4)) {
+ MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
+ ETH_P_IP);
+ MLX5_SET(dest_format_struct, dest, destination_id,
+ tirn[MLX5E_TT_IPV4]);
+ err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
+ match_criteria, flow_context,
+ ft_ix);
+ if (err) {
+ mlx5e_del_eth_addr_from_flow_table(priv, ai);
+ return err;
+ }
+ ai->tt_vec |= (1 << MLX5E_TT_IPV4);
+ }
+
+ ft_ix = &ai->ft_ix[MLX5E_TT_IPV6];
+ if (tt_vec & (1 << MLX5E_TT_IPV6)) {
+ MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
+ ETH_P_IPV6);
+ MLX5_SET(dest_format_struct, dest, destination_id,
+ tirn[MLX5E_TT_IPV6]);
+ err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
+ match_criteria, flow_context,
+ ft_ix);
+ if (err) {
+ mlx5e_del_eth_addr_from_flow_table(priv, ai);
+ return err;
+ }
+ ai->tt_vec |= (1 << MLX5E_TT_IPV6);
+ }
+
+ MLX5_SET_TO_ONES(fte_match_param, match_criteria,
+ outer_headers.ip_protocol);
+ MLX5_SET(fte_match_param, match_value, outer_headers.ip_protocol,
+ IPPROTO_UDP);
+
+ ft_ix = &ai->ft_ix[MLX5E_TT_IPV4_UDP];
+ if (tt_vec & (1 << MLX5E_TT_IPV4_UDP)) {
+ MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
+ ETH_P_IP);
+ MLX5_SET(dest_format_struct, dest, destination_id,
+ tirn[MLX5E_TT_IPV4_UDP]);
+ err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
+ match_criteria, flow_context,
+ ft_ix);
+ if (err) {
+ mlx5e_del_eth_addr_from_flow_table(priv, ai);
+ return err;
+ }
+ ai->tt_vec |= (1 << MLX5E_TT_IPV4_UDP);
+ }
+
+ ft_ix = &ai->ft_ix[MLX5E_TT_IPV6_UDP];
+ if (tt_vec & (1 << MLX5E_TT_IPV6_UDP)) {
+ MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
+ ETH_P_IPV6);
+ MLX5_SET(dest_format_struct, dest, destination_id,
+ tirn[MLX5E_TT_IPV6_UDP]);
+ err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
+ match_criteria, flow_context,
+ ft_ix);
+ if (err) {
+ mlx5e_del_eth_addr_from_flow_table(priv, ai);
+ return err;
+ }
+ ai->tt_vec |= (1 << MLX5E_TT_IPV6_UDP);
+ }
+
+ MLX5_SET(fte_match_param, match_value, outer_headers.ip_protocol,
+ IPPROTO_TCP);
+
+ ft_ix = &ai->ft_ix[MLX5E_TT_IPV4_TCP];
+ if (tt_vec & (1 << MLX5E_TT_IPV4_TCP)) {
+ MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
+ ETH_P_IP);
+ MLX5_SET(dest_format_struct, dest, destination_id,
+ tirn[MLX5E_TT_IPV4_TCP]);
+ err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
+ match_criteria, flow_context,
+ ft_ix);
+ if (err) {
+ mlx5e_del_eth_addr_from_flow_table(priv, ai);
+ return err;
+ }
+ ai->tt_vec |= (1 << MLX5E_TT_IPV4_TCP);
+ }
+
+ ft_ix = &ai->ft_ix[MLX5E_TT_IPV6_TCP];
+ if (tt_vec & (1 << MLX5E_TT_IPV6_TCP)) {
+ MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
+ ETH_P_IPV6);
+ MLX5_SET(dest_format_struct, dest, destination_id,
+ tirn[MLX5E_TT_IPV6_TCP]);
+ err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
+ match_criteria, flow_context,
+ ft_ix);
+ if (err) {
+ mlx5e_del_eth_addr_from_flow_table(priv, ai);
+ return err;
+ }
+ ai->tt_vec |= (1 << MLX5E_TT_IPV6_TCP);
+ }
+
+ MLX5_SET(fte_match_param, match_value, outer_headers.ip_protocol,
+ IPPROTO_AH);
+
+ ft_ix = &ai->ft_ix[MLX5E_TT_IPV4_IPSEC_AH];
+ if (tt_vec & (1 << MLX5E_TT_IPV4_IPSEC_AH)) {
+ MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
+ ETH_P_IP);
+ MLX5_SET(dest_format_struct, dest, destination_id,
+ tirn[MLX5E_TT_IPV4_IPSEC_AH]);
+ err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
+ match_criteria, flow_context,
+ ft_ix);
+ if (err) {
+ mlx5e_del_eth_addr_from_flow_table(priv, ai);
+ return err;
+ }
+ ai->tt_vec |= (1 << MLX5E_TT_IPV4_IPSEC_AH);
+ }
+
+ ft_ix = &ai->ft_ix[MLX5E_TT_IPV6_IPSEC_AH];
+ if (tt_vec & (1 << MLX5E_TT_IPV6_IPSEC_AH)) {
+ MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
+ ETH_P_IPV6);
+ MLX5_SET(dest_format_struct, dest, destination_id,
+ tirn[MLX5E_TT_IPV6_IPSEC_AH]);
+ err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
+ match_criteria, flow_context,
+ ft_ix);
+ if (err) {
+ mlx5e_del_eth_addr_from_flow_table(priv, ai);
+ return err;
+ }
+ ai->tt_vec |= (1 << MLX5E_TT_IPV6_IPSEC_AH);
+ }
+
+ MLX5_SET(fte_match_param, match_value, outer_headers.ip_protocol,
+ IPPROTO_ESP);
+
+ ft_ix = &ai->ft_ix[MLX5E_TT_IPV4_IPSEC_ESP];
+ if (tt_vec & (1 << MLX5E_TT_IPV4_IPSEC_ESP)) {
+ MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
+ ETH_P_IP);
+ MLX5_SET(dest_format_struct, dest, destination_id,
+ tirn[MLX5E_TT_IPV4_IPSEC_ESP]);
+ err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
+ match_criteria, flow_context,
+ ft_ix);
+ if (err) {
+ mlx5e_del_eth_addr_from_flow_table(priv, ai);
+ return err;
+ }
+ ai->tt_vec |= (1 << MLX5E_TT_IPV4_IPSEC_ESP);
+ }
+
+ ft_ix = &ai->ft_ix[MLX5E_TT_IPV6_IPSEC_ESP];
+ if (tt_vec & (1 << MLX5E_TT_IPV6_IPSEC_ESP)) {
+ MLX5_SET(fte_match_param, match_value, outer_headers.ethertype,
+ ETH_P_IPV6);
+ MLX5_SET(dest_format_struct, dest, destination_id,
+ tirn[MLX5E_TT_IPV6_IPSEC_ESP]);
+ err = mlx5_add_flow_table_entry(ft, match_criteria_enable,
+ match_criteria, flow_context,
+ ft_ix);
+ if (err) {
+ mlx5e_del_eth_addr_from_flow_table(priv, ai);
+ return err;
+ }
+ ai->tt_vec |= (1 << MLX5E_TT_IPV6_IPSEC_ESP);
+ }
+
+ return 0;
+}
+
+static int mlx5e_add_eth_addr_rule(struct mlx5e_priv *priv,
+ struct mlx5e_eth_addr_info *ai, int type)
+{
+ u32 *flow_context;
+ u32 *match_criteria;
+ int err;
+
+ flow_context = mlx5_vzalloc(MLX5_ST_SZ_BYTES(flow_context) +
+ MLX5_ST_SZ_BYTES(dest_format_struct));
+ match_criteria = mlx5_vzalloc(MLX5_ST_SZ_BYTES(fte_match_param));
+ if (!flow_context || !match_criteria) {
+ netdev_err(priv->netdev, "%s: alloc failed\n", __func__);
+ err = -ENOMEM;
+ goto add_eth_addr_rule_out;
+ }
+
+ err = __mlx5e_add_eth_addr_rule(priv, ai, type, flow_context,
+ match_criteria);
+ if (err)
+ netdev_err(priv->netdev, "%s: failed\n", __func__);
+
+add_eth_addr_rule_out:
+ kvfree(match_criteria);
+ kvfree(flow_context);
+ return err;
+}
+
+enum mlx5e_vlan_rule_type {
+ MLX5E_VLAN_RULE_TYPE_UNTAGGED,
+ MLX5E_VLAN_RULE_TYPE_ANY_VID,
+ MLX5E_VLAN_RULE_TYPE_MATCH_VID,
+};
+
+static int mlx5e_add_vlan_rule(struct mlx5e_priv *priv,
+ enum mlx5e_vlan_rule_type rule_type, u16 vid)
+{
+ u8 match_criteria_enable = 0;
+ u32 *flow_context;
+ void *match_value;
+ void *dest;
+ u32 *match_criteria;
+ u32 *ft_ix;
+ int err;
+
+ flow_context = mlx5_vzalloc(MLX5_ST_SZ_BYTES(flow_context) +
+ MLX5_ST_SZ_BYTES(dest_format_struct));
+ match_criteria = mlx5_vzalloc(MLX5_ST_SZ_BYTES(fte_match_param));
+ if (!flow_context || !match_criteria) {
+ netdev_err(priv->netdev, "%s: alloc failed\n", __func__);
+ err = -ENOMEM;
+ goto add_vlan_rule_out;
+ }
+ match_value = MLX5_ADDR_OF(flow_context, flow_context, match_value);
+ dest = MLX5_ADDR_OF(flow_context, flow_context, destination);
+
+ MLX5_SET(flow_context, flow_context, action,
+ MLX5_FLOW_CONTEXT_ACTION_FWD_DEST);
+ MLX5_SET(flow_context, flow_context, destination_list_size, 1);
+ MLX5_SET(dest_format_struct, dest, destination_type,
+ MLX5_FLOW_CONTEXT_DEST_TYPE_FLOW_TABLE);
+ MLX5_SET(dest_format_struct, dest, destination_id,
+ mlx5_get_flow_table_id(priv->ft.main));
+
+ match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
+ MLX5_SET_TO_ONES(fte_match_param, match_criteria,
+ outer_headers.vlan_tag);
+
+ switch (rule_type) {
+ case MLX5E_VLAN_RULE_TYPE_UNTAGGED:
+ ft_ix = &priv->vlan.untagged_rule_ft_ix;
+ break;
+ case MLX5E_VLAN_RULE_TYPE_ANY_VID:
+ ft_ix = &priv->vlan.any_vlan_rule_ft_ix;
+ MLX5_SET(fte_match_param, match_value, outer_headers.vlan_tag,
+ 1);
+ break;
+ default: /* MLX5E_VLAN_RULE_TYPE_MATCH_VID */
+ ft_ix = &priv->vlan.active_vlans_ft_ix[vid];
+ MLX5_SET(fte_match_param, match_value, outer_headers.vlan_tag,
+ 1);
+ MLX5_SET_TO_ONES(fte_match_param, match_criteria,
+ outer_headers.first_vid);
+ MLX5_SET(fte_match_param, match_value, outer_headers.first_vid,
+ vid);
+ break;
+ }
+
+ err = mlx5_add_flow_table_entry(priv->ft.vlan, match_criteria_enable,
+ match_criteria, flow_context, ft_ix);
+ if (err)
+ netdev_err(priv->netdev, "%s: failed\n", __func__);
+
+add_vlan_rule_out:
+ kvfree(match_criteria);
+ kvfree(flow_context);
+ return err;
+}
+
+static void mlx5e_del_vlan_rule(struct mlx5e_priv *priv,
+ enum mlx5e_vlan_rule_type rule_type, u16 vid)
+{
+ switch (rule_type) {
+ case MLX5E_VLAN_RULE_TYPE_UNTAGGED:
+ mlx5_del_flow_table_entry(priv->ft.vlan,
+ priv->vlan.untagged_rule_ft_ix);
+ break;
+ case MLX5E_VLAN_RULE_TYPE_ANY_VID:
+ mlx5_del_flow_table_entry(priv->ft.vlan,
+ priv->vlan.any_vlan_rule_ft_ix);
+ break;
+ case MLX5E_VLAN_RULE_TYPE_MATCH_VID:
+ mlx5_del_flow_table_entry(priv->ft.vlan,
+ priv->vlan.active_vlans_ft_ix[vid]);
+ break;
+ }
+}
+
+void mlx5e_enable_vlan_filter(struct mlx5e_priv *priv)
+{
+ WARN_ON(!mutex_is_locked(&priv->state_lock));
+
+ if (priv->vlan.filter_disabled) {
+ priv->vlan.filter_disabled = false;
+ if (test_bit(MLX5E_STATE_OPENED, &priv->state))
+ mlx5e_del_vlan_rule(priv, MLX5E_VLAN_RULE_TYPE_ANY_VID,
+ 0);
+ }
+}
+
+void mlx5e_disable_vlan_filter(struct mlx5e_priv *priv)
+{
+ WARN_ON(!mutex_is_locked(&priv->state_lock));
+
+ if (!priv->vlan.filter_disabled) {
+ priv->vlan.filter_disabled = true;
+ if (test_bit(MLX5E_STATE_OPENED, &priv->state))
+ mlx5e_add_vlan_rule(priv, MLX5E_VLAN_RULE_TYPE_ANY_VID,
+ 0);
+ }
+}
+
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3,10,0))
+int mlx5e_vlan_rx_add_vid(struct net_device *dev, __always_unused __be16 proto,
+ u16 vid)
+#elif (LINUX_VERSION_CODE >= KERNEL_VERSION(3,3,0))
+int mlx5e_vlan_rx_add_vid(struct net_device *dev, u16 vid)
+#else
+void mlx5e_vlan_rx_add_vid(struct net_device *dev, u16 vid)
+#endif
+{
+ struct mlx5e_priv *priv = netdev_priv(dev);
+ int err = 0;
+
+ mutex_lock(&priv->state_lock);
+
+ if (!test_and_set_bit(vid, priv->vlan.active_vlans) &&
+ test_bit(MLX5E_STATE_OPENED, &priv->state))
+ err = mlx5e_add_vlan_rule(priv, MLX5E_VLAN_RULE_TYPE_MATCH_VID,
+ vid);
+
+ mutex_unlock(&priv->state_lock);
+
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3,3,0))
+ return err;
+#endif
+}
+
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3,10,0))
+int mlx5e_vlan_rx_kill_vid(struct net_device *dev, __always_unused __be16 proto,
+ u16 vid)
+#elif (LINUX_VERSION_CODE >= KERNEL_VERSION(3,3,0))
+int mlx5e_vlan_rx_kill_vid(struct net_device *dev, u16 vid)
+#else
+void mlx5e_vlan_rx_kill_vid(struct net_device *dev, u16 vid)
+#endif
+{
+ struct mlx5e_priv *priv = netdev_priv(dev);
+
+ mutex_lock(&priv->state_lock);
+
+ clear_bit(vid, priv->vlan.active_vlans);
+ if (test_bit(MLX5E_STATE_OPENED, &priv->state))
+ mlx5e_del_vlan_rule(priv, MLX5E_VLAN_RULE_TYPE_MATCH_VID, vid);
+
+ mutex_unlock(&priv->state_lock);
+
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3,3,0))
+ return 0;
+#endif
+}
+
+int mlx5e_add_all_vlan_rules(struct mlx5e_priv *priv)
+{
+ u16 vid;
+ int err;
+
+ for_each_set_bit(vid, priv->vlan.active_vlans, VLAN_N_VID) {
+ err = mlx5e_add_vlan_rule(priv, MLX5E_VLAN_RULE_TYPE_MATCH_VID,
+ vid);
+ if (err)
+ return err;
+ }
+
+ err = mlx5e_add_vlan_rule(priv, MLX5E_VLAN_RULE_TYPE_UNTAGGED, 0);
+ if (err)
+ return err;
+
+ if (priv->vlan.filter_disabled) {
+ err = mlx5e_add_vlan_rule(priv, MLX5E_VLAN_RULE_TYPE_ANY_VID,
+ 0);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+void mlx5e_del_all_vlan_rules(struct mlx5e_priv *priv)
+{
+ u16 vid;
+
+ if (priv->vlan.filter_disabled)
+ mlx5e_del_vlan_rule(priv, MLX5E_VLAN_RULE_TYPE_ANY_VID, 0);
+
+ mlx5e_del_vlan_rule(priv, MLX5E_VLAN_RULE_TYPE_UNTAGGED, 0);
+
+ for_each_set_bit(vid, priv->vlan.active_vlans, VLAN_N_VID)
+ mlx5e_del_vlan_rule(priv, MLX5E_VLAN_RULE_TYPE_MATCH_VID, vid);
+}
+
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(3,9,0))
+#define mlx5e_for_each_hash_node(hn, tmp, hash, i) \
+ for (i = 0; i < MLX5E_ETH_ADDR_HASH_SIZE; i++) \
+ hlist_for_each_entry_safe(hn, n, tmp, &hash[i], hlist)
+#else
+#define mlx5e_for_each_hash_node(hn, tmp, hash, i) \
+ for (i = 0; i < MLX5E_ETH_ADDR_HASH_SIZE; i++) \
+ hlist_for_each_entry_safe(hn, tmp, &hash[i], hlist)
+#endif
+
+static void mlx5e_execute_action(struct mlx5e_priv *priv,
+ struct mlx5e_eth_addr_hash_node *hn)
+{
+ switch (hn->action) {
+ case MLX5E_ACTION_ADD:
+ mlx5e_add_eth_addr_rule(priv, &hn->ai, MLX5E_FULLMATCH);
+ hn->action = MLX5E_ACTION_NONE;
+ break;
+
+ case MLX5E_ACTION_DEL:
+ mlx5e_del_eth_addr_from_flow_table(priv, &hn->ai);
+ mlx5e_del_eth_addr_from_hash(hn);
+ break;
+ }
+}
+
+static void mlx5e_sync_netdev_addr(struct mlx5e_priv *priv)
+{
+ struct net_device *netdev = priv->netdev;
+ struct netdev_hw_addr *ha;
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(2,6,35))
+ struct dev_mc_list *mclist;
+#endif
+
+ netif_addr_lock_bh(netdev);
+
+ mlx5e_add_eth_addr_to_hash(priv->eth_addr.netdev_uc,
+ priv->netdev->dev_addr);
+
+ netdev_for_each_uc_addr(ha, netdev)
+ mlx5e_add_eth_addr_to_hash(priv->eth_addr.netdev_uc, ha->addr);
+
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,35))
+ netdev_for_each_mc_addr(ha, netdev)
+ mlx5e_add_eth_addr_to_hash(priv->eth_addr.netdev_mc, ha->addr);
+#else
+ for (mclist = netdev->mc_list; mclist; mclist = mclist->next)
+ mlx5e_add_eth_addr_to_hash(priv->eth_addr.netdev_mc,
+ mclist->dmi_addr);
+#endif
+
+ netif_addr_unlock_bh(netdev);
+}
+
+static void mlx5e_apply_netdev_addr(struct mlx5e_priv *priv)
+{
+ struct mlx5e_eth_addr_hash_node *hn;
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(3,9,0))
+ struct hlist_node *n, *tmp;
+#else
+ struct hlist_node *tmp;
+#endif
+ int i;
+
+ mlx5e_for_each_hash_node(hn, tmp, priv->eth_addr.netdev_uc, i)
+ mlx5e_execute_action(priv, hn);
+
+ mlx5e_for_each_hash_node(hn, tmp, priv->eth_addr.netdev_mc, i)
+ mlx5e_execute_action(priv, hn);
+}
+
+static void mlx5e_handle_netdev_addr(struct mlx5e_priv *priv)
+{
+ struct mlx5e_eth_addr_hash_node *hn;
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(3,9,0))
+ struct hlist_node *n, *tmp;
+#else
+ struct hlist_node *tmp;
+#endif
+ int i;
+
+ mlx5e_for_each_hash_node(hn, tmp, priv->eth_addr.netdev_uc, i)
+ hn->action = MLX5E_ACTION_DEL;
+ mlx5e_for_each_hash_node(hn, tmp, priv->eth_addr.netdev_mc, i)
+ hn->action = MLX5E_ACTION_DEL;
+
+ if (test_bit(MLX5E_STATE_OPENED, &priv->state))
+ mlx5e_sync_netdev_addr(priv);
+
+ mlx5e_apply_netdev_addr(priv);
+}
+
+void mlx5e_set_rx_mode_core(struct mlx5e_priv *priv)
+{
+ struct mlx5e_eth_addr_db *ea = &priv->eth_addr;
+ struct net_device *ndev = priv->netdev;
+
+ bool rx_mode_enable = test_bit(MLX5E_STATE_OPENED, &priv->state);
+ bool promisc_enabled = rx_mode_enable && (ndev->flags & IFF_PROMISC);
+ bool allmulti_enabled = rx_mode_enable && (ndev->flags & IFF_ALLMULTI);
+ bool broadcast_enabled = rx_mode_enable;
+
+ bool enable_promisc = !ea->promisc_enabled && promisc_enabled;
+ bool disable_promisc = ea->promisc_enabled && !promisc_enabled;
+ bool enable_allmulti = !ea->allmulti_enabled && allmulti_enabled;
+ bool disable_allmulti = ea->allmulti_enabled && !allmulti_enabled;
+ bool enable_broadcast = !ea->broadcast_enabled && broadcast_enabled;
+ bool disable_broadcast = ea->broadcast_enabled && !broadcast_enabled;
+
+ if (enable_promisc)
+ mlx5e_add_eth_addr_rule(priv, &ea->promisc, MLX5E_PROMISC);
+ if (enable_allmulti)
+ mlx5e_add_eth_addr_rule(priv, &ea->allmulti, MLX5E_ALLMULTI);
+ if (enable_broadcast)
+ mlx5e_add_eth_addr_rule(priv, &ea->broadcast, MLX5E_FULLMATCH);
+
+ mlx5e_handle_netdev_addr(priv);
+
+ if (disable_broadcast)
+ mlx5e_del_eth_addr_from_flow_table(priv, &ea->broadcast);
+ if (disable_allmulti)
+ mlx5e_del_eth_addr_from_flow_table(priv, &ea->allmulti);
+ if (disable_promisc)
+ mlx5e_del_eth_addr_from_flow_table(priv, &ea->promisc);
+
+ ea->promisc_enabled = promisc_enabled;
+ ea->allmulti_enabled = allmulti_enabled;
+ ea->broadcast_enabled = broadcast_enabled;
+}
+
+void mlx5e_set_rx_mode_work(struct work_struct *work)
+{
+ struct mlx5e_priv *priv = container_of(work, struct mlx5e_priv,
+ set_rx_mode_work);
+
+ mutex_lock(&priv->state_lock);
+ if (test_bit(MLX5E_STATE_OPENED, &priv->state))
+ mlx5e_set_rx_mode_core(priv);
+ mutex_unlock(&priv->state_lock);
+}
+
+void mlx5e_init_eth_addr(struct mlx5e_priv *priv)
+{
+ ether_addr_copy(priv->eth_addr.broadcast.addr, priv->netdev->broadcast);
+}
+
+static int mlx5e_create_main_flow_table(struct mlx5e_priv *priv)
+{
+ struct mlx5_flow_table_group *g;
+ u8 *dmac;
+
+ g = kcalloc(9, sizeof(*g), GFP_KERNEL);
+ if (!g)
+ return -ENOMEM;
+
+ g[0].log_sz = 3;
+ g[0].match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
+ MLX5_SET_TO_ONES(fte_match_param, g[0].match_criteria,
+ outer_headers.ethertype);
+ MLX5_SET_TO_ONES(fte_match_param, g[0].match_criteria,
+ outer_headers.ip_protocol);
+
+ g[1].log_sz = 1;
+ g[1].match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
+ MLX5_SET_TO_ONES(fte_match_param, g[1].match_criteria,
+ outer_headers.ethertype);
+
+ g[2].log_sz = 0;
+
+ g[3].log_sz = 14;
+ g[3].match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
+ dmac = MLX5_ADDR_OF(fte_match_param, g[3].match_criteria,
+ outer_headers.dmac_47_16);
+ memset(dmac, 0xff, ETH_ALEN);
+ MLX5_SET_TO_ONES(fte_match_param, g[3].match_criteria,
+ outer_headers.ethertype);
+ MLX5_SET_TO_ONES(fte_match_param, g[3].match_criteria,
+ outer_headers.ip_protocol);
+
+ g[4].log_sz = 13;
+ g[4].match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
+ dmac = MLX5_ADDR_OF(fte_match_param, g[4].match_criteria,
+ outer_headers.dmac_47_16);
+ memset(dmac, 0xff, ETH_ALEN);
+ MLX5_SET_TO_ONES(fte_match_param, g[4].match_criteria,
+ outer_headers.ethertype);
+
+ g[5].log_sz = 11;
+ g[5].match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
+ dmac = MLX5_ADDR_OF(fte_match_param, g[5].match_criteria,
+ outer_headers.dmac_47_16);
+ memset(dmac, 0xff, ETH_ALEN);
+
+ g[6].log_sz = 2;
+ g[6].match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
+ dmac = MLX5_ADDR_OF(fte_match_param, g[6].match_criteria,
+ outer_headers.dmac_47_16);
+ dmac[0] = 0x01;
+ MLX5_SET_TO_ONES(fte_match_param, g[6].match_criteria,
+ outer_headers.ethertype);
+ MLX5_SET_TO_ONES(fte_match_param, g[6].match_criteria,
+ outer_headers.ip_protocol);
+
+ g[7].log_sz = 1;
+ g[7].match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
+ dmac = MLX5_ADDR_OF(fte_match_param, g[7].match_criteria,
+ outer_headers.dmac_47_16);
+ dmac[0] = 0x01;
+ MLX5_SET_TO_ONES(fte_match_param, g[7].match_criteria,
+ outer_headers.ethertype);
+
+ g[8].log_sz = 0;
+ g[8].match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
+ dmac = MLX5_ADDR_OF(fte_match_param, g[8].match_criteria,
+ outer_headers.dmac_47_16);
+ dmac[0] = 0x01;
+ priv->ft.main = mlx5_create_flow_table(priv->mdev, 1,
+ MLX5_FLOW_TABLE_TYPE_NIC_RCV,
+ 9, g);
+ kfree(g);
+
+ return priv->ft.main ? 0 : -ENOMEM;
+}
+
+static void mlx5e_destroy_main_flow_table(struct mlx5e_priv *priv)
+{
+ mlx5_destroy_flow_table(priv->ft.main);
+}
+
+static int mlx5e_create_vlan_flow_table(struct mlx5e_priv *priv)
+{
+ struct mlx5_flow_table_group *g;
+
+ g = kcalloc(2, sizeof(*g), GFP_KERNEL);
+ if (!g)
+ return -ENOMEM;
+
+ g[0].log_sz = 12;
+ g[0].match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
+ MLX5_SET_TO_ONES(fte_match_param, g[0].match_criteria,
+ outer_headers.vlan_tag);
+ MLX5_SET_TO_ONES(fte_match_param, g[0].match_criteria,
+ outer_headers.first_vid);
+
+ /* untagged + any vlan id */
+ g[1].log_sz = 1;
+ g[1].match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
+ MLX5_SET_TO_ONES(fte_match_param, g[1].match_criteria,
+ outer_headers.vlan_tag);
+
+ priv->ft.vlan = mlx5_create_flow_table(priv->mdev, 0,
+ MLX5_FLOW_TABLE_TYPE_NIC_RCV,
+ 2, g);
+
+ kfree(g);
+ return priv->ft.vlan ? 0 : -ENOMEM;
+}
+
+static void mlx5e_destroy_vlan_flow_table(struct mlx5e_priv *priv)
+{
+ mlx5_destroy_flow_table(priv->ft.vlan);
+}
+
+int mlx5e_open_flow_table(struct mlx5e_priv *priv)
+{
+ int err;
+
+ err = mlx5e_create_main_flow_table(priv);
+ if (err)
+ return err;
+
+ err = mlx5e_create_vlan_flow_table(priv);
+ if (err)
+ goto err_destroy_main_flow_table;
+
+ return 0;
+
+err_destroy_main_flow_table:
+ mlx5e_destroy_main_flow_table(priv);
+
+ return err;
+}
+
+void mlx5e_close_flow_table(struct mlx5e_priv *priv)
+{
+ mlx5e_destroy_vlan_flow_table(priv);
+ mlx5e_destroy_main_flow_table(priv);
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/en_main.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/en_main.c
new file mode 100644
index 0000000..aa1641a
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/en_main.c
@@ -0,0 +1,2265 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "en.h"
+
+struct mlx5e_rq_param {
+ u32 rqc[MLX5_ST_SZ_DW(rqc)];
+ struct mlx5_wq_param wq;
+};
+
+struct mlx5e_sq_param {
+ u32 sqc[MLX5_ST_SZ_DW(sqc)];
+ struct mlx5_wq_param wq;
+};
+
+struct mlx5e_cq_param {
+ u32 cqc[MLX5_ST_SZ_DW(cqc)];
+ struct mlx5_wq_param wq;
+ u16 eq_ix;
+};
+
+struct mlx5e_channel_param {
+ struct mlx5e_rq_param rq;
+ struct mlx5e_sq_param sq;
+ struct mlx5e_cq_param rx_cq;
+ struct mlx5e_cq_param tx_cq;
+};
+
+static void mlx5e_update_carrier(struct mlx5e_priv *priv)
+{
+ struct mlx5_core_dev *mdev = priv->mdev;
+ u8 port_state;
+
+ port_state = mlx5_query_vport_state(mdev,
+ MLX5_QUERY_VPORT_STATE_IN_OP_MOD_VNIC_VPORT);
+
+ if (port_state == VPORT_STATE_UP)
+ netif_carrier_on(priv->netdev);
+ else
+ netif_carrier_off(priv->netdev);
+}
+
+static void mlx5e_update_carrier_work(struct work_struct *work)
+{
+ struct mlx5e_priv *priv = container_of(work, struct mlx5e_priv,
+ update_carrier_work);
+
+ mutex_lock(&priv->state_lock);
+ if (test_bit(MLX5E_STATE_OPENED, &priv->state))
+ mlx5e_update_carrier(priv);
+ mutex_unlock(&priv->state_lock);
+}
+
+static void mlx5e_update_pport_counters(struct mlx5e_priv *priv)
+{
+ struct mlx5_core_dev *mdev = priv->mdev;
+ struct mlx5e_pport_stats *s = &priv->stats.pport;
+ u32 *in;
+ u32 *out;
+ int sz = MLX5_ST_SZ_BYTES(ppcnt_reg);
+
+ in = mlx5_vzalloc(sz);
+ out = mlx5_vzalloc(sz);
+ if (!in || !out)
+ goto free_out;
+
+ memset(in, 0, sz);
+ memset(out, 0, sz);
+ MLX5_SET(ppcnt_reg, in, local_port, 1);
+
+ MLX5_SET(ppcnt_reg, in, grp, MLX5_IEEE_802_3_COUNTERS_GROUP);
+ mlx5_core_access_reg(mdev, in, sz, out,
+ sz, MLX5_REG_PPCNT, 0, 0);
+ memcpy(s->IEEE_802_3_counters,
+ MLX5_ADDR_OF(ppcnt_reg, out, counter_set),
+ sizeof(s->IEEE_802_3_counters));
+
+ MLX5_SET(ppcnt_reg, in, grp, MLX5_RFC_2863_COUNTERS_GROUP);
+ mlx5_core_access_reg(mdev, in, sz, out,
+ sz, MLX5_REG_PPCNT, 0, 0);
+ memcpy(s->RFC_2863_counters,
+ MLX5_ADDR_OF(ppcnt_reg, out, counter_set),
+ sizeof(s->RFC_2863_counters));
+
+ MLX5_SET(ppcnt_reg, in, grp, MLX5_RFC_2819_COUNTERS_GROUP);
+ mlx5_core_access_reg(mdev, in, sz, out,
+ sz, MLX5_REG_PPCNT, 0, 0);
+ memcpy(s->RFC_2819_counters,
+ MLX5_ADDR_OF(ppcnt_reg, out, counter_set),
+ sizeof(s->RFC_2819_counters));
+
+free_out:
+ kvfree(in);
+ kvfree(out);
+}
+
+
+#ifdef CONFIG_COMPAT_LRO_ENABLED_IPOIB
+static void mlx5e_update_sw_lro_stats(struct mlx5e_priv *priv)
+{
+ int i;
+ struct mlx5e_vport_stats *s = &priv->stats.vport;
+
+ s->sw_lro_aggregated = 0;
+ s->sw_lro_flushed = 0;
+ s->sw_lro_no_desc = 0;
+
+ for (i = 0; i < priv->params.num_channels; i++) {
+ struct mlx5e_rq *rq = &priv->channel[i]->rq;
+
+ s->sw_lro_aggregated += rq->sw_lro.lro_mgr.stats.aggregated;
+ s->sw_lro_flushed += rq->sw_lro.lro_mgr.stats.flushed;
+ s->sw_lro_no_desc += rq->sw_lro.lro_mgr.stats.no_desc;
+ }
+}
+#endif
+
+void mlx5e_update_stats(struct mlx5e_priv *priv)
+{
+ struct mlx5_core_dev *mdev = priv->mdev;
+ struct mlx5e_vport_stats *s = &priv->stats.vport;
+ struct mlx5e_rq_stats *rq_stats;
+ struct mlx5e_sq_stats *sq_stats;
+ u32 in[MLX5_ST_SZ_DW(query_vport_counter_in)];
+ u32 *out;
+ int outlen = MLX5_ST_SZ_BYTES(query_vport_counter_out);
+ u64 tx_offload_none;
+ int i, j;
+
+ out = mlx5_vzalloc(outlen);
+ if (!out)
+ return;
+
+ /* Collect first the SW counters and then HW for consistency */
+ s->tso_packets = 0;
+ s->tso_bytes = 0;
+ s->tx_queue_stopped = 0;
+ s->tx_queue_wake = 0;
+ s->tx_queue_dropped = 0;
+ tx_offload_none = 0;
+ s->lro_packets = 0;
+ s->lro_bytes = 0;
+ s->rx_csum_none = 0;
+ s->rx_wqe_err = 0;
+ for (i = 0; i < priv->params.num_channels; i++) {
+ rq_stats = &priv->channel[i]->rq.stats;
+
+ s->lro_packets += rq_stats->lro_packets;
+ s->lro_bytes += rq_stats->lro_bytes;
+ s->rx_csum_none += rq_stats->csum_none;
+ s->rx_wqe_err += rq_stats->wqe_err;
+
+ for (j = 0; j < priv->params.num_tc; j++) {
+ sq_stats = &priv->channel[i]->sq[j].stats;
+
+ s->tso_packets += sq_stats->tso_packets;
+ s->tso_bytes += sq_stats->tso_bytes;
+ s->tx_queue_stopped += sq_stats->stopped;
+ s->tx_queue_wake += sq_stats->wake;
+ s->tx_queue_dropped += sq_stats->dropped;
+ tx_offload_none += sq_stats->csum_offload_none;
+ }
+ }
+
+#ifdef CONFIG_COMPAT_LRO_ENABLED_IPOIB
+ mlx5e_update_sw_lro_stats(priv);
+#endif
+
+ /* HW counters */
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(query_vport_counter_in, in, opcode,
+ MLX5_CMD_OP_QUERY_VPORT_COUNTER);
+ MLX5_SET(query_vport_counter_in, in, op_mod, 0);
+ MLX5_SET(query_vport_counter_in, in, other_vport, 0);
+
+ memset(out, 0, outlen);
+
+ if (mlx5_cmd_exec(mdev, in, sizeof(in), out, outlen))
+ goto free_out;
+
+#define MLX5_GET_CTR(p, x) \
+ MLX5_GET64(query_vport_counter_out, p, x)
+
+ s->rx_error_packets =
+ MLX5_GET_CTR(out, received_errors.packets);
+ s->rx_error_bytes =
+ MLX5_GET_CTR(out, received_errors.octets);
+ s->tx_error_packets =
+ MLX5_GET_CTR(out, transmit_errors.packets);
+ s->tx_error_bytes =
+ MLX5_GET_CTR(out, transmit_errors.octets);
+
+ s->rx_unicast_packets =
+ MLX5_GET_CTR(out, received_eth_unicast.packets);
+ s->rx_unicast_bytes =
+ MLX5_GET_CTR(out, received_eth_unicast.octets);
+ s->tx_unicast_packets =
+ MLX5_GET_CTR(out, transmitted_eth_unicast.packets);
+ s->tx_unicast_bytes =
+ MLX5_GET_CTR(out, transmitted_eth_unicast.octets);
+
+ s->rx_multicast_packets =
+ MLX5_GET_CTR(out, received_eth_multicast.packets);
+ s->rx_multicast_bytes =
+ MLX5_GET_CTR(out, received_eth_multicast.octets);
+ s->tx_multicast_packets =
+ MLX5_GET_CTR(out, transmitted_eth_multicast.packets);
+ s->tx_multicast_bytes =
+ MLX5_GET_CTR(out, transmitted_eth_multicast.octets);
+
+ s->rx_broadcast_packets =
+ MLX5_GET_CTR(out, received_eth_broadcast.packets);
+ s->rx_broadcast_bytes =
+ MLX5_GET_CTR(out, received_eth_broadcast.octets);
+ s->tx_broadcast_packets =
+ MLX5_GET_CTR(out, transmitted_eth_broadcast.packets);
+ s->tx_broadcast_bytes =
+ MLX5_GET_CTR(out, transmitted_eth_broadcast.octets);
+
+ s->rx_packets =
+ s->rx_unicast_packets +
+ s->rx_multicast_packets +
+ s->rx_broadcast_packets;
+ s->rx_bytes =
+ s->rx_unicast_bytes +
+ s->rx_multicast_bytes +
+ s->rx_broadcast_bytes;
+ s->tx_packets =
+ s->tx_unicast_packets +
+ s->tx_multicast_packets +
+ s->tx_broadcast_packets;
+ s->tx_bytes =
+ s->tx_unicast_bytes +
+ s->tx_multicast_bytes +
+ s->tx_broadcast_bytes;
+
+ /* Update calculated offload counters */
+ s->tx_csum_offload = s->tx_packets - tx_offload_none;
+ s->rx_csum_good = s->rx_packets - s->rx_csum_none;
+
+ mlx5e_update_pport_counters(priv);
+free_out:
+ kvfree(out);
+}
+
+static void mlx5e_update_stats_work(struct work_struct *work)
+{
+ struct delayed_work *dwork = to_delayed_work(work);
+ struct mlx5e_priv *priv = container_of(dwork, struct mlx5e_priv,
+ update_stats_work);
+ mutex_lock(&priv->state_lock);
+ if (test_bit(MLX5E_STATE_OPENED, &priv->state)) {
+ mlx5e_update_stats(priv);
+ schedule_delayed_work(dwork,
+ msecs_to_jiffies(
+ MLX5E_UPDATE_STATS_INTERVAL));
+ }
+ mutex_unlock(&priv->state_lock);
+}
+
+static void __mlx5e_async_event(struct mlx5e_priv *priv,
+ enum mlx5_dev_event event)
+{
+ switch (event) {
+ case MLX5_DEV_EVENT_PORT_UP:
+ case MLX5_DEV_EVENT_PORT_DOWN:
+ schedule_work(&priv->update_carrier_work);
+ break;
+
+ default:
+ break;
+ }
+}
+
+static void mlx5e_async_event(struct mlx5_core_dev *mdev, void *vpriv,
+ enum mlx5_dev_event event, unsigned long param)
+{
+ struct mlx5e_priv *priv = vpriv;
+
+ spin_lock(&priv->async_events_spinlock);
+ if (test_bit(MLX5E_STATE_ASYNC_EVENTS_ENABLE, &priv->state))
+ __mlx5e_async_event(priv, event);
+ spin_unlock(&priv->async_events_spinlock);
+}
+
+static void mlx5e_enable_async_events(struct mlx5e_priv *priv)
+{
+ set_bit(MLX5E_STATE_ASYNC_EVENTS_ENABLE, &priv->state);
+}
+
+static void mlx5e_disable_async_events(struct mlx5e_priv *priv)
+{
+ spin_lock_irq(&priv->async_events_spinlock);
+ clear_bit(MLX5E_STATE_ASYNC_EVENTS_ENABLE, &priv->state);
+ spin_unlock_irq(&priv->async_events_spinlock);
+}
+
+#define MLX5E_HW2SW_MTU(hwmtu) (hwmtu - (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN))
+#define MLX5E_SW2HW_MTU(swmtu) (swmtu + (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN))
+
+static int mlx5e_create_rq(struct mlx5e_channel *c,
+ struct mlx5e_rq_param *param,
+ struct mlx5e_rq *rq)
+{
+ struct mlx5e_priv *priv = c->priv;
+ struct mlx5_core_dev *mdev = priv->mdev;
+ void *rqc = param->rqc;
+ void *rqc_wq = MLX5_ADDR_OF(rqc, rqc, wq);
+ int wq_sz;
+ int err;
+ int i;
+
+ param->wq.db_numa_node = cpu_to_node(c->cpu);
+
+ err = mlx5_wq_ll_create(mdev, &param->wq, rqc_wq, &rq->wq,
+ &rq->wq_ctrl);
+ if (err)
+ return err;
+
+ rq->wq.db = &rq->wq.db[MLX5_RCV_DBR];
+
+ wq_sz = mlx5_wq_ll_get_size(&rq->wq);
+ rq->skb = kzalloc_node(wq_sz * sizeof(*rq->skb), GFP_KERNEL,
+ cpu_to_node(c->cpu));
+ if (!rq->skb) {
+ err = -ENOMEM;
+ goto err_rq_wq_destroy;
+ }
+
+ rq->wqe_sz = (priv->params.lro_en) ? priv->params.lro_wqe_sz :
+ MLX5E_SW2HW_MTU(priv->netdev->mtu);
+ rq->wqe_sz = SKB_DATA_ALIGN(rq->wqe_sz + MLX5E_NET_IP_ALIGN);
+ for (i = 0; i < wq_sz; i++) {
+ struct mlx5e_rx_wqe *wqe = mlx5_wq_ll_get_wqe(&rq->wq, i);
+ u32 byte_count = rq->wqe_sz - MLX5E_NET_IP_ALIGN;
+
+ wqe->data.lkey = c->mkey_be;
+ wqe->data.byte_count =
+ cpu_to_be32(byte_count | MLX5_HW_START_PADDING);
+ }
+
+ rq->pdev = c->pdev;
+ rq->netdev = c->netdev;
+ rq->channel = c;
+ rq->ix = c->ix;
+
+ return 0;
+
+err_rq_wq_destroy:
+ mlx5_wq_destroy(&rq->wq_ctrl);
+
+ return err;
+}
+
+static void mlx5e_destroy_rq(struct mlx5e_rq *rq)
+{
+ kfree(rq->skb);
+ mlx5_wq_destroy(&rq->wq_ctrl);
+}
+
+static int mlx5e_enable_rq(struct mlx5e_rq *rq, struct mlx5e_rq_param *param)
+{
+ struct mlx5e_channel *c = rq->channel;
+ struct mlx5e_priv *priv = c->priv;
+ struct mlx5_core_dev *mdev = priv->mdev;
+
+ void *in;
+ void *rqc;
+ void *wq;
+ int inlen;
+ int err;
+
+ inlen = MLX5_ST_SZ_BYTES(create_rq_in) +
+ sizeof(u64) * rq->wq_ctrl.buf.npages;
+ in = mlx5_vzalloc(inlen);
+ if (!in)
+ return -ENOMEM;
+
+ rqc = MLX5_ADDR_OF(create_rq_in, in, ctx);
+ wq = MLX5_ADDR_OF(rqc, rqc, wq);
+
+ memcpy(rqc, param->rqc, sizeof(param->rqc));
+
+ MLX5_SET(rqc, rqc, cqn, c->rq.cq.mcq.cqn);
+ MLX5_SET(rqc, rqc, state, MLX5_RQC_STATE_RST);
+ MLX5_SET(rqc, rqc, flush_in_error_en, 1);
+ MLX5_SET(wq, wq, log_wq_pg_sz, rq->wq_ctrl.buf.page_shift -
+ PAGE_SHIFT);
+ MLX5_SET64(wq, wq, dbr_addr, rq->wq_ctrl.db.dma);
+
+ mlx5_fill_page_array(&rq->wq_ctrl.buf,
+ (__be64 *)MLX5_ADDR_OF(wq, wq, pas));
+
+ err = mlx5_core_create_rq(mdev, in, inlen, &rq->rqn);
+
+ kvfree(in);
+
+ return err;
+}
+
+static int mlx5e_modify_rq(struct mlx5e_rq *rq, int curr_state, int next_state)
+{
+ struct mlx5e_channel *c = rq->channel;
+ struct mlx5e_priv *priv = c->priv;
+ struct mlx5_core_dev *mdev = priv->mdev;
+
+ void *in;
+ void *rqc;
+ int inlen;
+ int err;
+
+ inlen = MLX5_ST_SZ_BYTES(modify_rq_in);
+ in = mlx5_vzalloc(inlen);
+ if (!in)
+ return -ENOMEM;
+
+ rqc = MLX5_ADDR_OF(modify_rq_in, in, ctx);
+
+ MLX5_SET(modify_rq_in, in, rqn, rq->rqn);
+ MLX5_SET(modify_rq_in, in, rq_state, curr_state);
+ MLX5_SET(rqc, rqc, state, next_state);
+
+ err = mlx5_core_modify_rq(mdev, in, inlen);
+
+ kvfree(in);
+
+ return err;
+}
+
+static void mlx5e_disable_rq(struct mlx5e_rq *rq)
+{
+ struct mlx5e_channel *c = rq->channel;
+ struct mlx5e_priv *priv = c->priv;
+ struct mlx5_core_dev *mdev = priv->mdev;
+
+ mlx5_core_destroy_rq(mdev, rq->rqn);
+}
+
+static int mlx5e_wait_for_min_rx_wqes(struct mlx5e_rq *rq)
+{
+ struct mlx5e_channel *c = rq->channel;
+ struct mlx5e_priv *priv = c->priv;
+ struct mlx5_wq_ll *wq = &rq->wq;
+ int i;
+
+ for (i = 0; i < 1000; i++) {
+ if (wq->cur_sz >= priv->params.min_rx_wqes)
+ return 0;
+
+ msleep(20);
+ }
+
+ return -ETIMEDOUT;
+}
+
+#ifdef CONFIG_COMPAT_LRO_ENABLED_IPOIB
+static int get_skb_hdr(struct sk_buff *skb, void **iphdr,
+ void **tcph, u64 *hdr_flags, void *priv)
+{
+ unsigned int ip_len;
+ struct iphdr *iph;
+
+ if (unlikely(skb->protocol != htons(ETH_P_IP)))
+ return -1;
+
+ /*
+ * In the future we may add an else clause that verifies the
+ * checksum and allows devices which do not calculate checksum
+ * to use LRO.
+ */
+ if (unlikely(skb->ip_summed != CHECKSUM_UNNECESSARY))
+ return -1;
+
+ /* Check for non-TCP packet */
+ skb_reset_network_header(skb);
+ iph = ip_hdr(skb);
+ if (iph->protocol != IPPROTO_TCP)
+ return -1;
+
+ ip_len = ip_hdrlen(skb);
+ skb_set_transport_header(skb, ip_len);
+ *tcph = tcp_hdr(skb);
+
+ /* check if IP header and TCP header are complete */
+ if (ntohs(iph->tot_len) < ip_len + tcp_hdrlen(skb))
+ return -1;
+
+ *hdr_flags = LRO_IPV4 | LRO_TCP;
+ *iphdr = iph;
+
+ return 0;
+}
+
+static void mlx5e_rq_sw_lro_init(struct mlx5e_rq *rq)
+{
+ struct mlx5e_priv *priv = netdev_priv(rq->netdev);
+
+ rq->sw_lro.lro_mgr.max_aggr = 64;
+ rq->sw_lro.lro_mgr.max_desc = MLX5E_LRO_MAX_DESC;
+ rq->sw_lro.lro_mgr.lro_arr = rq->sw_lro.lro_desc;
+ rq->sw_lro.lro_mgr.get_skb_header = get_skb_hdr;
+ rq->sw_lro.lro_mgr.features = LRO_F_NAPI;
+ rq->sw_lro.lro_mgr.frag_align_pad = NET_IP_ALIGN;
+ rq->sw_lro.lro_mgr.dev = rq->netdev;
+ rq->sw_lro.lro_mgr.ip_summed = CHECKSUM_UNNECESSARY;
+ rq->sw_lro.lro_mgr.ip_summed_aggr = CHECKSUM_UNNECESSARY;
+ rq->flags |= (priv->pflags & MLX5E_PRIV_FLAG_SWLRO) ? MLX5E_RQ_FLAG_SWLRO : 0;
+}
+#endif
+
+static int mlx5e_open_rq(struct mlx5e_channel *c,
+ struct mlx5e_rq_param *param,
+ struct mlx5e_rq *rq)
+{
+ int err;
+
+ err = mlx5e_create_rq(c, param, rq);
+ if (err)
+ return err;
+
+ err = mlx5e_enable_rq(rq, param);
+ if (err)
+ goto err_destroy_rq;
+
+#ifdef CONFIG_COMPAT_LRO_ENABLED_IPOIB
+ mlx5e_rq_sw_lro_init(rq);
+#endif
+
+ err = mlx5e_modify_rq(rq, MLX5_RQC_STATE_RST, MLX5_RQC_STATE_RDY);
+ if (err)
+ goto err_disable_rq;
+
+ set_bit(MLX5E_RQ_STATE_POST_WQES_ENABLE, &rq->state);
+ mlx5e_send_nop(&c->sq[0], true); /* trigger mlx5e_post_rx_wqes() */
+
+ return 0;
+
+err_disable_rq:
+ mlx5e_disable_rq(rq);
+err_destroy_rq:
+ mlx5e_destroy_rq(rq);
+
+ return err;
+}
+
+static void mlx5e_close_rq(struct mlx5e_rq *rq)
+{
+ clear_bit(MLX5E_RQ_STATE_POST_WQES_ENABLE, &rq->state);
+ napi_synchronize(&rq->channel->napi); /* prevent mlx5e_post_rx_wqes */
+
+ mlx5e_modify_rq(rq, MLX5_RQC_STATE_RDY, MLX5_RQC_STATE_ERR);
+ while (!mlx5_wq_ll_is_empty(&rq->wq))
+ msleep(20);
+
+ /* avoid destroying rq before mlx5e_poll_rx_cq() is done with it */
+ napi_synchronize(&rq->channel->napi);
+
+ mlx5e_disable_rq(rq);
+ mlx5e_destroy_rq(rq);
+}
+
+static void mlx5e_free_sq_db(struct mlx5e_sq *sq)
+{
+ kfree(sq->dma_fifo);
+ kfree(sq->skb);
+}
+
+static int mlx5e_alloc_sq_db(struct mlx5e_sq *sq, int numa)
+{
+ int wq_sz = mlx5_wq_cyc_get_size(&sq->wq);
+ int df_sz = wq_sz * MLX5_SEND_WQEBB_NUM_DS;
+
+ sq->skb = kzalloc_node(wq_sz * sizeof(*sq->skb), GFP_KERNEL, numa);
+ sq->dma_fifo = kzalloc_node(df_sz * sizeof(*sq->dma_fifo), GFP_KERNEL,
+ numa);
+
+ if (!sq->skb || !sq->dma_fifo) {
+ mlx5e_free_sq_db(sq);
+ return -ENOMEM;
+ }
+
+ sq->dma_fifo_mask = df_sz - 1;
+
+ return 0;
+}
+
+static int mlx5e_create_sq(struct mlx5e_channel *c,
+ int tc,
+ struct mlx5e_sq_param *param,
+ struct mlx5e_sq *sq)
+{
+ struct mlx5e_priv *priv = c->priv;
+ struct mlx5_core_dev *mdev = priv->mdev;
+
+ void *sqc = param->sqc;
+ void *sqc_wq = MLX5_ADDR_OF(sqc, sqc, wq);
+ int txq_ix;
+ int err;
+
+ err = mlx5_alloc_map_uar(mdev, &sq->uar);
+ if (err)
+ return err;
+
+ param->wq.db_numa_node = cpu_to_node(c->cpu);
+
+ err = mlx5_wq_cyc_create(mdev, &param->wq, sqc_wq, &sq->wq,
+ &sq->wq_ctrl);
+ if (err)
+ goto err_unmap_free_uar;
+
+ sq->wq.db = &sq->wq.db[MLX5_SND_DBR];
+ sq->uar_map = sq->uar.map;
+ sq->uar_bf_map = sq->uar.bf_map;
+ sq->bf_buf_size = (1 << MLX5_CAP_GEN(mdev, log_bf_reg_size)) / 2;
+ sq->max_inline = sq->bf_buf_size -
+ sizeof(struct mlx5e_tx_wqe) +
+ 2 /*sizeof(mlx5e_tx_wqe.inline_hdr_start)*/;
+
+ err = mlx5e_alloc_sq_db(sq, cpu_to_node(c->cpu));
+ if (err)
+ goto err_sq_wq_destroy;
+
+ txq_ix = c->ix + tc * priv->params.num_channels;
+ sq->txq = netdev_get_tx_queue(priv->netdev, txq_ix);
+ priv->txq_to_sq_map[txq_ix] = sq;
+
+ sq->pdev = c->pdev;
+ sq->mkey_be = c->mkey_be;
+ sq->channel = c;
+ sq->tc = tc;
+ sq->bf_budget = MLX5E_SQ_BF_BUDGET;
+ sq->edge = (sq->wq.sz_m1 + 1) - MLX5_SEND_WQE_MAX_WQEBBS;
+
+ return 0;
+
+err_sq_wq_destroy:
+ mlx5_wq_destroy(&sq->wq_ctrl);
+
+err_unmap_free_uar:
+ mlx5_unmap_free_uar(mdev, &sq->uar);
+
+ return err;
+}
+
+static void mlx5e_destroy_sq(struct mlx5e_sq *sq)
+{
+ struct mlx5e_channel *c = sq->channel;
+ struct mlx5e_priv *priv = c->priv;
+
+ mlx5e_free_sq_db(sq);
+ mlx5_wq_destroy(&sq->wq_ctrl);
+ mlx5_unmap_free_uar(priv->mdev, &sq->uar);
+}
+
+static int mlx5e_enable_sq(struct mlx5e_sq *sq, struct mlx5e_sq_param *param)
+{
+ struct mlx5e_channel *c = sq->channel;
+ struct mlx5e_priv *priv = c->priv;
+ struct mlx5_core_dev *mdev = priv->mdev;
+
+ void *in;
+ void *sqc;
+ void *wq;
+ int inlen;
+ int err;
+
+ inlen = MLX5_ST_SZ_BYTES(create_sq_in) +
+ sizeof(u64) * sq->wq_ctrl.buf.npages;
+ in = mlx5_vzalloc(inlen);
+ if (!in)
+ return -ENOMEM;
+
+ sqc = MLX5_ADDR_OF(create_sq_in, in, ctx);
+ wq = MLX5_ADDR_OF(sqc, sqc, wq);
+
+ memcpy(sqc, param->sqc, sizeof(param->sqc));
+
+ MLX5_SET(sqc, sqc, tis_num_0, priv->tisn[sq->tc]);
+ MLX5_SET(sqc, sqc, cqn, c->sq[sq->tc].cq.mcq.cqn);
+ MLX5_SET(sqc, sqc, state, MLX5_SQC_STATE_RST);
+ MLX5_SET(sqc, sqc, tis_lst_sz, 1);
+ MLX5_SET(sqc, sqc, flush_in_error_en, 1);
+
+ MLX5_SET(wq, wq, wq_type, MLX5_WQ_TYPE_CYCLIC);
+ MLX5_SET(wq, wq, uar_page, sq->uar.index);
+ MLX5_SET(wq, wq, log_wq_pg_sz, sq->wq_ctrl.buf.page_shift -
+ PAGE_SHIFT);
+ MLX5_SET64(wq, wq, dbr_addr, sq->wq_ctrl.db.dma);
+
+ mlx5_fill_page_array(&sq->wq_ctrl.buf,
+ (__be64 *)MLX5_ADDR_OF(wq, wq, pas));
+
+ err = mlx5_core_create_sq(mdev, in, inlen, &sq->sqn);
+
+ kvfree(in);
+
+ return err;
+}
+
+static int mlx5e_modify_sq(struct mlx5e_sq *sq, int curr_state, int next_state)
+{
+ struct mlx5e_channel *c = sq->channel;
+ struct mlx5e_priv *priv = c->priv;
+ struct mlx5_core_dev *mdev = priv->mdev;
+
+ void *in;
+ void *sqc;
+ int inlen;
+ int err;
+
+ inlen = MLX5_ST_SZ_BYTES(modify_sq_in);
+ in = mlx5_vzalloc(inlen);
+ if (!in)
+ return -ENOMEM;
+
+ sqc = MLX5_ADDR_OF(modify_sq_in, in, ctx);
+
+ MLX5_SET(modify_sq_in, in, sqn, sq->sqn);
+ MLX5_SET(modify_sq_in, in, sq_state, curr_state);
+ MLX5_SET(sqc, sqc, state, next_state);
+
+ err = mlx5_core_modify_sq(mdev, in, inlen);
+
+ kvfree(in);
+
+ return err;
+}
+
+static void mlx5e_disable_sq(struct mlx5e_sq *sq)
+{
+ struct mlx5e_channel *c = sq->channel;
+ struct mlx5e_priv *priv = c->priv;
+ struct mlx5_core_dev *mdev = priv->mdev;
+
+ mlx5_core_destroy_sq(mdev, sq->sqn);
+}
+
+static int mlx5e_open_sq(struct mlx5e_channel *c,
+ int tc,
+ struct mlx5e_sq_param *param,
+ struct mlx5e_sq *sq)
+{
+ int err;
+
+ err = mlx5e_create_sq(c, tc, param, sq);
+ if (err)
+ return err;
+
+ err = mlx5e_enable_sq(sq, param);
+ if (err)
+ goto err_destroy_sq;
+
+ err = mlx5e_modify_sq(sq, MLX5_SQC_STATE_RST, MLX5_SQC_STATE_RDY);
+ if (err)
+ goto err_disable_sq;
+
+ set_bit(MLX5E_SQ_STATE_WAKE_TXQ_ENABLE, &sq->state);
+ netdev_tx_reset_queue(sq->txq);
+ netif_tx_start_queue(sq->txq);
+
+ return 0;
+
+err_disable_sq:
+ mlx5e_disable_sq(sq);
+err_destroy_sq:
+ mlx5e_destroy_sq(sq);
+
+ return err;
+}
+
+/* TODO: make this function general, i.e. move to netdevice.h */
+static inline void netif_tx_disable_queue(struct netdev_queue *txq)
+{
+ __netif_tx_lock_bh(txq);
+ netif_tx_stop_queue(txq);
+ __netif_tx_unlock_bh(txq);
+}
+
+static void mlx5e_close_sq(struct mlx5e_sq *sq)
+{
+ clear_bit(MLX5E_SQ_STATE_WAKE_TXQ_ENABLE, &sq->state);
+ napi_synchronize(&sq->channel->napi); /* prevent netif_tx_wake_queue */
+ netif_tx_disable_queue(sq->txq);
+
+ /* ensure hw is notified of all pending wqes */
+ if (mlx5e_sq_has_room_for(sq, 1))
+ mlx5e_send_nop(sq, true);
+
+ mlx5e_modify_sq(sq, MLX5_SQC_STATE_RDY, MLX5_SQC_STATE_ERR);
+ while (sq->cc != sq->pc) /* wait till sq is empty */
+ msleep(20);
+
+ /* avoid destroying sq before mlx5e_poll_tx_cq() is done with it */
+ napi_synchronize(&sq->channel->napi);
+
+ mlx5e_disable_sq(sq);
+ mlx5e_destroy_sq(sq);
+}
+
+static int mlx5e_create_cq(struct mlx5e_channel *c,
+ struct mlx5e_cq_param *param,
+ struct mlx5e_cq *cq)
+{
+ struct mlx5e_priv *priv = c->priv;
+ struct mlx5_core_dev *mdev = priv->mdev;
+ struct mlx5_core_cq *mcq = &cq->mcq;
+ int eqn_not_used;
+ int irqn;
+ int err;
+ u32 i;
+
+ param->wq.buf_numa_node = cpu_to_node(c->cpu);
+ param->wq.db_numa_node = cpu_to_node(c->cpu);
+ param->eq_ix = c->ix;
+
+ err = mlx5_cqwq_create(mdev, &param->wq, param->cqc, &cq->wq,
+ &cq->wq_ctrl);
+ if (err)
+ return err;
+
+ mlx5_vector2eqn(mdev, param->eq_ix, &eqn_not_used, &irqn);
+
+ cq->napi = &c->napi;
+
+ mcq->cqe_sz = 64;
+ mcq->set_ci_db = cq->wq_ctrl.db.db;
+ mcq->arm_db = cq->wq_ctrl.db.db + 1;
+ *mcq->set_ci_db = 0;
+ *mcq->arm_db = 0;
+ mcq->vector = param->eq_ix;
+ mcq->comp = mlx5e_completion_event;
+ mcq->event = mlx5e_cq_error_event;
+ mcq->irqn = irqn;
+ mcq->uar = &priv->cq_uar;
+
+ for (i = 0; i < mlx5_cqwq_get_size(&cq->wq); i++) {
+ struct mlx5_cqe64 *cqe = mlx5_cqwq_get_wqe(&cq->wq, i);
+
+ cqe->op_own = 0xf1;
+ }
+
+ cq->channel = c;
+
+ return 0;
+}
+
+static void mlx5e_destroy_cq(struct mlx5e_cq *cq)
+{
+ mlx5_wq_destroy(&cq->wq_ctrl);
+}
+
+static int mlx5e_enable_cq(struct mlx5e_cq *cq, struct mlx5e_cq_param *param)
+{
+ struct mlx5e_channel *c = cq->channel;
+ struct mlx5e_priv *priv = c->priv;
+ struct mlx5_core_dev *mdev = priv->mdev;
+ struct mlx5_core_cq *mcq = &cq->mcq;
+
+ void *in;
+ void *cqc;
+ int inlen;
+ int irqn_not_used;
+ int eqn;
+ int err;
+
+ inlen = MLX5_ST_SZ_BYTES(create_cq_in) +
+ sizeof(u64) * cq->wq_ctrl.buf.npages;
+ in = mlx5_vzalloc(inlen);
+ if (!in)
+ return -ENOMEM;
+
+ cqc = MLX5_ADDR_OF(create_cq_in, in, cq_context);
+
+ memcpy(cqc, param->cqc, sizeof(param->cqc));
+
+ mlx5_fill_page_array(&cq->wq_ctrl.buf,
+ (__be64 *)MLX5_ADDR_OF(create_cq_in, in, pas));
+
+ mlx5_vector2eqn(mdev, param->eq_ix, &eqn, &irqn_not_used);
+
+ MLX5_SET(cqc, cqc, c_eqn, eqn);
+ MLX5_SET(cqc, cqc, uar_page, mcq->uar->index);
+ MLX5_SET(cqc, cqc, log_page_size, cq->wq_ctrl.buf.page_shift -
+ PAGE_SHIFT);
+ MLX5_SET64(cqc, cqc, dbr_addr, cq->wq_ctrl.db.dma);
+
+ err = mlx5_core_create_cq(mdev, mcq, in, inlen);
+
+ kvfree(in);
+
+ if (err)
+ return err;
+
+ mlx5e_cq_arm(cq);
+
+ return 0;
+}
+
+static void mlx5e_disable_cq(struct mlx5e_cq *cq)
+{
+ struct mlx5e_channel *c = cq->channel;
+ struct mlx5e_priv *priv = c->priv;
+ struct mlx5_core_dev *mdev = priv->mdev;
+
+ mlx5_core_destroy_cq(mdev, &cq->mcq);
+}
+
+static int mlx5e_open_cq(struct mlx5e_channel *c,
+ struct mlx5e_cq_param *param,
+ struct mlx5e_cq *cq,
+ u16 moderation_usecs,
+ u16 moderation_frames)
+{
+ int err;
+ struct mlx5e_priv *priv = c->priv;
+ struct mlx5_core_dev *mdev = priv->mdev;
+
+ err = mlx5e_create_cq(c, param, cq);
+ if (err)
+ return err;
+
+ err = mlx5e_enable_cq(cq, param);
+ if (err)
+ goto err_destroy_cq;
+
+ err = mlx5_core_modify_cq_moderation(mdev, &cq->mcq,
+ moderation_usecs,
+ moderation_frames);
+ if (err)
+ goto err_destroy_cq;
+
+ return 0;
+
+err_destroy_cq:
+ mlx5e_destroy_cq(cq);
+
+ return err;
+}
+
+static void mlx5e_close_cq(struct mlx5e_cq *cq)
+{
+ mlx5e_disable_cq(cq);
+ mlx5e_destroy_cq(cq);
+}
+
+static int mlx5e_get_cpu(struct mlx5e_priv *priv, int ix)
+{
+#ifdef CONFIG_CPUMASK_OFFSTACK
+ cpumask_var_t affinity_mask = priv->mdev->priv.irq_info[ix].mask;
+
+ return affinity_mask ? cpumask_first(affinity_mask) : 0;
+#else
+ return 0;
+#endif
+}
+
+static void mlx5e_build_tc_to_txq_map(struct mlx5e_channel *c,
+ int num_channels)
+{
+ int i;
+
+ for (i = 0; i < MLX5E_MAX_NUM_TC; i++)
+ c->tc_to_txq_map[i] = c->ix + i * num_channels;
+}
+
+static int mlx5e_open_tx_cqs(struct mlx5e_channel *c,
+ struct mlx5e_channel_param *cparam)
+{
+ struct mlx5e_priv *priv = c->priv;
+ int err;
+ int tc;
+
+ for (tc = 0; tc < c->num_tc; tc++) {
+ err = mlx5e_open_cq(c, &cparam->tx_cq, &c->sq[tc].cq,
+ priv->params.tx_cq_moderation_usec,
+ priv->params.tx_cq_moderation_pkts);
+ if (err)
+ goto err_close_tx_cqs;
+ }
+
+ return 0;
+
+err_close_tx_cqs:
+ for (tc--; tc >= 0; tc--)
+ mlx5e_close_cq(&c->sq[tc].cq);
+
+ return err;
+}
+
+static void mlx5e_close_tx_cqs(struct mlx5e_channel *c)
+{
+ int tc;
+
+ for (tc = 0; tc < c->num_tc; tc++)
+ mlx5e_close_cq(&c->sq[tc].cq);
+}
+
+static int mlx5e_open_sqs(struct mlx5e_channel *c,
+ struct mlx5e_channel_param *cparam)
+{
+ int err;
+ int tc;
+
+ for (tc = 0; tc < c->num_tc; tc++) {
+ err = mlx5e_open_sq(c, tc, &cparam->sq, &c->sq[tc]);
+ if (err)
+ goto err_close_sqs;
+ }
+
+ return 0;
+
+err_close_sqs:
+ for (tc--; tc >= 0; tc--)
+ mlx5e_close_sq(&c->sq[tc]);
+
+ return err;
+}
+
+static void mlx5e_close_sqs(struct mlx5e_channel *c)
+{
+ int tc;
+
+ for (tc = 0; tc < c->num_tc; tc++)
+ mlx5e_close_sq(&c->sq[tc]);
+}
+
+static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
+ struct mlx5e_channel_param *cparam,
+ struct mlx5e_channel **cp)
+{
+ struct net_device *netdev = priv->netdev;
+ int cpu = mlx5e_get_cpu(priv, ix);
+ struct mlx5e_channel *c;
+ int err;
+
+ c = kzalloc_node(sizeof(*c), GFP_KERNEL, cpu_to_node(cpu));
+ if (!c)
+ return -ENOMEM;
+
+ c->priv = priv;
+ c->ix = ix;
+ c->cpu = cpu;
+ c->pdev = &priv->mdev->pdev->dev;
+ c->netdev = priv->netdev;
+ c->mkey_be = cpu_to_be32(priv->mr.key);
+ c->num_tc = priv->params.num_tc;
+
+ mlx5e_build_tc_to_txq_map(c, priv->params.num_channels);
+
+ netif_napi_add(netdev, &c->napi, mlx5e_napi_poll, 64);
+
+ err = mlx5e_open_tx_cqs(c, cparam);
+ if (err)
+ goto err_napi_del;
+
+ err = mlx5e_open_cq(c, &cparam->rx_cq, &c->rq.cq,
+ priv->params.rx_cq_moderation_usec,
+ priv->params.rx_cq_moderation_pkts);
+ if (err)
+ goto err_close_tx_cqs;
+
+ napi_enable(&c->napi);
+
+ err = mlx5e_open_sqs(c, cparam);
+ if (err)
+ goto err_disable_napi;
+
+ err = mlx5e_open_rq(c, &cparam->rq, &c->rq);
+ if (err)
+ goto err_close_sqs;
+
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(3,9,0)) || \
+ defined(CONFIG_COMPAT_IS_NETIF_SET_XPS_QUEUE_NOT_CONST_CPUMASK)
+ netif_set_xps_queue(netdev, (struct cpumask *)get_cpu_mask(c->cpu), ix);
+#else
+ netif_set_xps_queue(netdev, get_cpu_mask(c->cpu), ix);
+#endif
+ *cp = c;
+
+ return 0;
+
+err_close_sqs:
+ mlx5e_close_sqs(c);
+
+err_disable_napi:
+ napi_disable(&c->napi);
+ mlx5e_close_cq(&c->rq.cq);
+
+err_close_tx_cqs:
+ mlx5e_close_tx_cqs(c);
+
+err_napi_del:
+ netif_napi_del(&c->napi);
+ kfree(c);
+
+ return err;
+}
+
+static void mlx5e_close_channel(struct mlx5e_channel *c)
+{
+ mlx5e_close_rq(&c->rq);
+ mlx5e_close_sqs(c);
+ napi_disable(&c->napi);
+ mlx5e_close_cq(&c->rq.cq);
+ mlx5e_close_tx_cqs(c);
+ netif_napi_del(&c->napi);
+ kfree(c);
+}
+
+static void mlx5e_build_rq_param(struct mlx5e_priv *priv,
+ struct mlx5e_rq_param *param)
+{
+ void *rqc = param->rqc;
+ void *wq = MLX5_ADDR_OF(rqc, rqc, wq);
+
+ MLX5_SET(wq, wq, wq_type, MLX5_WQ_TYPE_LINKED_LIST);
+ MLX5_SET(wq, wq, end_padding_mode, MLX5_WQ_END_PAD_MODE_ALIGN);
+ MLX5_SET(wq, wq, log_wq_stride, ilog2(sizeof(struct mlx5e_rx_wqe)));
+ MLX5_SET(wq, wq, log_wq_sz, priv->params.log_rq_size);
+ MLX5_SET(wq, wq, pd, priv->pdn);
+
+ param->wq.buf_numa_node = dev_to_node(&priv->mdev->pdev->dev);
+ param->wq.linear = 1;
+}
+
+static void mlx5e_build_sq_param(struct mlx5e_priv *priv,
+ struct mlx5e_sq_param *param)
+{
+ void *sqc = param->sqc;
+ void *wq = MLX5_ADDR_OF(sqc, sqc, wq);
+
+ MLX5_SET(wq, wq, log_wq_sz, priv->params.log_sq_size);
+ MLX5_SET(wq, wq, log_wq_stride, ilog2(MLX5_SEND_WQE_BB));
+ MLX5_SET(wq, wq, pd, priv->pdn);
+
+ param->wq.buf_numa_node = dev_to_node(&priv->mdev->pdev->dev);
+}
+
+static void mlx5e_build_common_cq_param(struct mlx5e_priv *priv,
+ struct mlx5e_cq_param *param)
+{
+ void *cqc = param->cqc;
+
+ MLX5_SET(cqc, cqc, uar_page, priv->cq_uar.index);
+}
+
+static void mlx5e_build_rx_cq_param(struct mlx5e_priv *priv,
+ struct mlx5e_cq_param *param)
+{
+ void *cqc = param->cqc;
+
+ MLX5_SET(cqc, cqc, log_cq_size, priv->params.log_rq_size);
+
+ mlx5e_build_common_cq_param(priv, param);
+}
+
+static void mlx5e_build_tx_cq_param(struct mlx5e_priv *priv,
+ struct mlx5e_cq_param *param)
+{
+ void *cqc = param->cqc;
+
+ MLX5_SET(cqc, cqc, log_cq_size, priv->params.log_sq_size);
+
+ mlx5e_build_common_cq_param(priv, param);
+}
+
+static void mlx5e_build_channel_param(struct mlx5e_priv *priv,
+ struct mlx5e_channel_param *cparam)
+{
+ memset(cparam, 0, sizeof(*cparam));
+
+ mlx5e_build_rq_param(priv, &cparam->rq);
+ mlx5e_build_sq_param(priv, &cparam->sq);
+ mlx5e_build_rx_cq_param(priv, &cparam->rx_cq);
+ mlx5e_build_tx_cq_param(priv, &cparam->tx_cq);
+}
+
+static int mlx5e_open_channels(struct mlx5e_priv *priv)
+{
+ struct mlx5e_channel_param cparam;
+ int nch = priv->params.num_channels;
+ int err = -ENOMEM;
+ int i;
+ int j;
+
+ priv->channel = kcalloc(nch, sizeof(struct mlx5e_channel *),
+ GFP_KERNEL);
+
+ priv->txq_to_sq_map = kcalloc(nch * priv->params.num_tc,
+ sizeof(struct mlx5e_sq *), GFP_KERNEL);
+
+ if (!priv->channel || !priv->txq_to_sq_map)
+ goto err_free_txq_to_sq_map;
+
+ mlx5e_build_channel_param(priv, &cparam);
+ for (i = 0; i < nch; i++) {
+ err = mlx5e_open_channel(priv, i, &cparam, &priv->channel[i]);
+ if (err)
+ goto err_close_channels;
+ }
+
+ for (j = 0; j < nch; j++) {
+ err = mlx5e_wait_for_min_rx_wqes(&priv->channel[j]->rq);
+ if (err)
+ goto err_close_channels;
+ }
+
+ return 0;
+
+err_close_channels:
+ for (i--; i >= 0; i--)
+ mlx5e_close_channel(priv->channel[i]);
+
+err_free_txq_to_sq_map:
+ kfree(priv->txq_to_sq_map);
+ kfree(priv->channel);
+
+ return err;
+}
+
+static void mlx5e_rename_channels_eqs(struct mlx5e_priv *priv)
+{
+ int i;
+ int err;
+
+ for (i = 0; i < priv->params.num_channels; i++) {
+ err = mlx5_rename_eq(priv->mdev, i, priv->netdev->name);
+ if (err)
+ netdev_err(priv->netdev,
+ "%s: mlx5_rename_eq failed: %d\n",
+ __func__, err);
+ }
+}
+
+static void mlx5e_close_channels(struct mlx5e_priv *priv)
+{
+ int i;
+
+ for (i = 0; i < priv->params.num_channels; i++)
+ mlx5e_close_channel(priv->channel[i]);
+
+ kfree(priv->txq_to_sq_map);
+ kfree(priv->channel);
+}
+
+static int mlx5e_open_tis(struct mlx5e_priv *priv, int tc)
+{
+ struct mlx5_core_dev *mdev = priv->mdev;
+ u32 in[MLX5_ST_SZ_DW(create_tis_in)];
+ void *tisc = MLX5_ADDR_OF(create_tis_in, in, ctx);
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(tisc, tisc, prio, tc);
+ MLX5_SET(tisc, tisc, transport_domain, priv->tdn);
+
+ return mlx5_core_create_tis(mdev, in, sizeof(in), &priv->tisn[tc]);
+}
+
+static void mlx5e_close_tis(struct mlx5e_priv *priv, int tc)
+{
+ mlx5_core_destroy_tis(priv->mdev, priv->tisn[tc]);
+}
+
+static int mlx5e_open_tises(struct mlx5e_priv *priv)
+{
+ int err;
+ int tc;
+
+ for (tc = 0; tc < priv->params.num_tc; tc++) {
+ err = mlx5e_open_tis(priv, tc);
+ if (err)
+ goto err_close_tises;
+ }
+
+ return 0;
+
+err_close_tises:
+ for (tc--; tc >= 0; tc--)
+ mlx5e_close_tis(priv, tc);
+
+ return err;
+}
+
+static void mlx5e_close_tises(struct mlx5e_priv *priv)
+{
+ int tc;
+
+ for (tc = 0; tc < priv->params.num_tc; tc++)
+ mlx5e_close_tis(priv, tc);
+}
+
+static int mlx5e_bits_invert(unsigned long a, int size)
+{
+ int i;
+ int inv = 0;
+
+ for (i = 0; i < size; i++)
+ inv |= (test_bit(size - i - 1, &a) ? 1 : 0) << i;
+
+ return inv;
+}
+
+static int mlx5e_open_rqt(struct mlx5e_priv *priv)
+{
+ struct mlx5_core_dev *mdev = priv->mdev;
+ u32 *in;
+ u32 out[MLX5_ST_SZ_DW(create_rqt_out)];
+ void *rqtc;
+ int inlen;
+ int err;
+ int log_tbl_sz = priv->params.rx_hash_log_tbl_sz;
+ int sz = 1 << log_tbl_sz;
+ int i;
+
+ inlen = MLX5_ST_SZ_BYTES(create_rqt_in) + sizeof(u32) * sz;
+ in = mlx5_vzalloc(inlen);
+ if (!in)
+ return -ENOMEM;
+
+ rqtc = MLX5_ADDR_OF(create_rqt_in, in, rqt_context);
+
+ MLX5_SET(rqtc, rqtc, rqt_actual_size, sz);
+ MLX5_SET(rqtc, rqtc, rqt_max_size, sz);
+
+ for (i = 0; i < sz; i++) {
+ int ix = i;
+
+ if (priv->params.rss_hash_xor)
+ ix = mlx5e_bits_invert(i, log_tbl_sz);
+
+ ix = ix % priv->params.num_channels;
+ MLX5_SET(rqtc, rqtc, rq_num[i], priv->channel[ix]->rq.rqn);
+ }
+
+ MLX5_SET(create_rqt_in, in, opcode, MLX5_CMD_OP_CREATE_RQT);
+
+ memset(out, 0, sizeof(out));
+ err = mlx5_cmd_exec_check_status(mdev, in, inlen, out, sizeof(out));
+ if (!err)
+ priv->rqtn = MLX5_GET(create_rqt_out, out, rqtn);
+
+ kvfree(in);
+
+ return err;
+}
+
+static void mlx5e_close_rqt(struct mlx5e_priv *priv)
+{
+ u32 in[MLX5_ST_SZ_DW(destroy_rqt_in)];
+ u32 out[MLX5_ST_SZ_DW(destroy_rqt_out)];
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(destroy_rqt_in, in, opcode, MLX5_CMD_OP_DESTROY_RQT);
+ MLX5_SET(destroy_rqt_in, in, rqtn, priv->rqtn);
+
+ mlx5_cmd_exec_check_status(priv->mdev, in, sizeof(in), out,
+ sizeof(out));
+}
+
+static void mlx5e_build_tir_ctx(struct mlx5e_priv *priv, u32 *tirc, int tt)
+{
+ void *hfso = MLX5_ADDR_OF(tirc, tirc, rx_hash_field_selector_outer);
+
+ MLX5_SET(tirc, tirc, transport_domain, priv->tdn);
+
+#define ROUGH_MAX_L2_L3_HDR_SZ 256
+
+#define MLX5_HASH_IP (MLX5_HASH_FIELD_SEL_SRC_IP |\
+ MLX5_HASH_FIELD_SEL_DST_IP)
+
+#define MLX5_HASH_IP_L4PORTS (MLX5_HASH_FIELD_SEL_SRC_IP |\
+ MLX5_HASH_FIELD_SEL_DST_IP |\
+ MLX5_HASH_FIELD_SEL_L4_SPORT |\
+ MLX5_HASH_FIELD_SEL_L4_DPORT)
+
+#define MLX5_HASH_IP_IPSEC_SPI (MLX5_HASH_FIELD_SEL_SRC_IP |\
+ MLX5_HASH_FIELD_SEL_DST_IP |\
+ MLX5_HASH_FIELD_SEL_IPSEC_SPI)
+
+ if (priv->params.lro_en) {
+ MLX5_SET(tirc, tirc, lro_enable_mask,
+ MLX5_TIRC_LRO_ENABLE_MASK_IPV4_LRO |
+ MLX5_TIRC_LRO_ENABLE_MASK_IPV6_LRO);
+ MLX5_SET(tirc, tirc, lro_max_ip_payload_size,
+ (priv->params.lro_wqe_sz -
+ ROUGH_MAX_L2_L3_HDR_SZ) >> 8);
+ /* TODO: add the option to choose timer value dynamically */
+ MLX5_SET(tirc, tirc, lro_timeout_period_usecs,
+ MLX5_CAP_ETH(priv->mdev,
+ lro_timer_supported_periods[3]));
+ }
+
+ switch (tt) {
+ case MLX5E_TT_ANY:
+ MLX5_SET(tirc, tirc, disp_type,
+ MLX5_TIRC_DISP_TYPE_DIRECT);
+ MLX5_SET(tirc, tirc, inline_rqn,
+ priv->channel[0]->rq.rqn);
+ break;
+ default:
+ MLX5_SET(tirc, tirc, disp_type,
+ MLX5_TIRC_DISP_TYPE_INDIRECT);
+ MLX5_SET(tirc, tirc, indirect_table,
+ priv->rqtn);
+ if (priv->params.rss_hash_xor) {
+ MLX5_SET(tirc, tirc, rx_hash_fn,
+ MLX5_TIRC_RX_HASH_FN_HASH_INVERTED_XOR8);
+ } else {
+ void *rss_key = MLX5_ADDR_OF(tirc, tirc,
+ rx_hash_toeplitz_key);
+ size_t len = MLX5_FLD_SZ_BYTES(tirc,
+ rx_hash_toeplitz_key);
+
+ MLX5_SET(tirc, tirc, rx_hash_fn,
+ MLX5_TIRC_RX_HASH_FN_HASH_TOEPLITZ);
+ MLX5_SET(tirc, tirc, rx_hash_symmetric, 1);
+
+ netdev_rss_key_fill(rss_key, len);
+ }
+ break;
+ }
+
+ switch (tt) {
+ case MLX5E_TT_IPV4_TCP:
+ MLX5_SET(rx_hash_field_select, hfso, l3_prot_type,
+ MLX5_L3_PROT_TYPE_IPV4);
+ MLX5_SET(rx_hash_field_select, hfso, l4_prot_type,
+ MLX5_L4_PROT_TYPE_TCP);
+ MLX5_SET(rx_hash_field_select, hfso, selected_fields,
+ MLX5_HASH_IP_L4PORTS);
+ break;
+
+ case MLX5E_TT_IPV6_TCP:
+ MLX5_SET(rx_hash_field_select, hfso, l3_prot_type,
+ MLX5_L3_PROT_TYPE_IPV6);
+ MLX5_SET(rx_hash_field_select, hfso, l4_prot_type,
+ MLX5_L4_PROT_TYPE_TCP);
+ MLX5_SET(rx_hash_field_select, hfso, selected_fields,
+ MLX5_HASH_IP_L4PORTS);
+ break;
+
+ case MLX5E_TT_IPV4_UDP:
+ MLX5_SET(rx_hash_field_select, hfso, l3_prot_type,
+ MLX5_L3_PROT_TYPE_IPV4);
+ MLX5_SET(rx_hash_field_select, hfso, l4_prot_type,
+ MLX5_L4_PROT_TYPE_UDP);
+ MLX5_SET(rx_hash_field_select, hfso, selected_fields,
+ MLX5_HASH_IP_L4PORTS);
+ break;
+
+ case MLX5E_TT_IPV6_UDP:
+ MLX5_SET(rx_hash_field_select, hfso, l3_prot_type,
+ MLX5_L3_PROT_TYPE_IPV6);
+ MLX5_SET(rx_hash_field_select, hfso, l4_prot_type,
+ MLX5_L4_PROT_TYPE_UDP);
+ MLX5_SET(rx_hash_field_select, hfso, selected_fields,
+ MLX5_HASH_IP_L4PORTS);
+ break;
+
+ case MLX5E_TT_IPV4_IPSEC_AH:
+ MLX5_SET(rx_hash_field_select, hfso, l3_prot_type,
+ MLX5_L3_PROT_TYPE_IPV4);
+ MLX5_SET(rx_hash_field_select, hfso, selected_fields,
+ MLX5_HASH_IP_IPSEC_SPI);
+ break;
+
+ case MLX5E_TT_IPV6_IPSEC_AH:
+ MLX5_SET(rx_hash_field_select, hfso, l3_prot_type,
+ MLX5_L3_PROT_TYPE_IPV6);
+ MLX5_SET(rx_hash_field_select, hfso, selected_fields,
+ MLX5_HASH_IP_IPSEC_SPI);
+ break;
+
+ case MLX5E_TT_IPV4_IPSEC_ESP:
+ MLX5_SET(rx_hash_field_select, hfso, l3_prot_type,
+ MLX5_L3_PROT_TYPE_IPV4);
+ MLX5_SET(rx_hash_field_select, hfso, selected_fields,
+ MLX5_HASH_IP_IPSEC_SPI);
+ break;
+
+ case MLX5E_TT_IPV6_IPSEC_ESP:
+ MLX5_SET(rx_hash_field_select, hfso, l3_prot_type,
+ MLX5_L3_PROT_TYPE_IPV6);
+ MLX5_SET(rx_hash_field_select, hfso, selected_fields,
+ MLX5_HASH_IP_IPSEC_SPI);
+ break;
+
+ case MLX5E_TT_IPV4:
+ MLX5_SET(rx_hash_field_select, hfso, l3_prot_type,
+ MLX5_L3_PROT_TYPE_IPV4);
+ MLX5_SET(rx_hash_field_select, hfso, selected_fields,
+ MLX5_HASH_IP);
+ break;
+
+ case MLX5E_TT_IPV6:
+ MLX5_SET(rx_hash_field_select, hfso, l3_prot_type,
+ MLX5_L3_PROT_TYPE_IPV6);
+ MLX5_SET(rx_hash_field_select, hfso, selected_fields,
+ MLX5_HASH_IP);
+ break;
+ }
+}
+
+static int mlx5e_open_tir(struct mlx5e_priv *priv, int tt)
+{
+ struct mlx5_core_dev *mdev = priv->mdev;
+ u32 *in;
+ void *tirc;
+ int inlen;
+ int err;
+
+ inlen = MLX5_ST_SZ_BYTES(create_tir_in);
+ in = mlx5_vzalloc(inlen);
+ if (!in)
+ return -ENOMEM;
+
+ tirc = MLX5_ADDR_OF(create_tir_in, in, ctx);
+
+ mlx5e_build_tir_ctx(priv, tirc, tt);
+
+ err = mlx5_core_create_tir(mdev, in, inlen, &priv->tirn[tt]);
+
+ kvfree(in);
+
+ return err;
+}
+
+static void mlx5e_close_tir(struct mlx5e_priv *priv, int tt)
+{
+ mlx5_core_destroy_tir(priv->mdev, priv->tirn[tt]);
+}
+
+static int mlx5e_open_tirs(struct mlx5e_priv *priv)
+{
+ int err;
+ int i;
+
+ for (i = 0; i < MLX5E_NUM_TT; i++) {
+ err = mlx5e_open_tir(priv, i);
+ if (err)
+ goto err_close_tirs;
+ }
+
+ return 0;
+
+err_close_tirs:
+ for (i--; i >= 0; i--)
+ mlx5e_close_tir(priv, i);
+
+ return err;
+}
+
+static void mlx5e_close_tirs(struct mlx5e_priv *priv)
+{
+ int i;
+
+ for (i = 0; i < MLX5E_NUM_TT; i++)
+ mlx5e_close_tir(priv, i);
+}
+
+static void mlx5e_netdev_set_tcs(struct net_device *netdev)
+{
+#ifdef HAVE_NDO_SETUP_TC
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+ int nch = priv->params.num_channels;
+ int ntc = priv->params.num_tc;
+ int prio;
+ int tc;
+
+ netdev_reset_tc(netdev);
+
+ if (ntc == 1)
+ return;
+
+ netdev_set_num_tc(netdev, ntc);
+
+ for (tc = 0; tc < ntc; tc++)
+ netdev_set_tc_queue(netdev, tc, nch, tc * nch);
+
+ for (prio = 0; prio < MLX5E_MAX_NUM_PRIO; prio++)
+ netdev_set_prio_tc_map(netdev, prio, prio % ntc);
+#endif
+}
+
+static int mlx5e_set_dev_port_mtu(struct net_device *netdev)
+{
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+ struct mlx5_core_dev *mdev = priv->mdev;
+ int hw_mtu;
+ int err;
+
+ err = mlx5_set_port_mtu(mdev, MLX5E_SW2HW_MTU(netdev->mtu));
+ if (err)
+ return err;
+
+ mlx5_query_port_oper_mtu(mdev, &hw_mtu);
+
+ if (MLX5E_HW2SW_MTU(hw_mtu) != netdev->mtu)
+ netdev_warn(netdev, "%s: Port MTU %d is different than netdev mtu %d\n",
+ __func__, MLX5E_HW2SW_MTU(hw_mtu), netdev->mtu);
+
+ netdev->mtu = MLX5E_HW2SW_MTU(hw_mtu);
+ return 0;
+}
+
+int mlx5e_open_locked(struct net_device *netdev)
+{
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+ int num_txqs;
+ int err;
+
+ mlx5e_netdev_set_tcs(netdev);
+
+ num_txqs = priv->params.num_channels * priv->params.num_tc;
+ netif_set_real_num_tx_queues(netdev, num_txqs);
+ netif_set_real_num_rx_queues(netdev, priv->params.num_channels);
+
+ err = mlx5e_set_dev_port_mtu(netdev);
+ if (err)
+ return err;
+
+ err = mlx5e_open_tises(priv);
+ if (err) {
+ netdev_err(netdev, "%s: mlx5e_open_tises failed, %d\n",
+ __func__, err);
+ return err;
+ }
+
+ err = mlx5e_open_channels(priv);
+ if (err) {
+ netdev_err(netdev, "%s: mlx5e_open_channels failed, %d\n",
+ __func__, err);
+ goto err_close_tises;
+ }
+
+ err = mlx5e_open_rqt(priv);
+ if (err) {
+ netdev_err(netdev, "%s: mlx5e_open_rqt failed, %d\n",
+ __func__, err);
+ goto err_close_channels;
+ }
+
+ err = mlx5e_open_tirs(priv);
+ if (err) {
+ netdev_err(netdev, "%s: mlx5e_open_tirs failed, %d\n",
+ __func__, err);
+ goto err_close_rqls;
+ }
+
+ err = mlx5e_open_flow_table(priv);
+ if (err) {
+ netdev_err(netdev, "%s: mlx5e_open_flow_table failed, %d\n",
+ __func__, err);
+ goto err_close_tirs;
+ }
+
+ err = mlx5e_add_all_vlan_rules(priv);
+ if (err) {
+ netdev_err(netdev, "%s: mlx5e_add_all_vlan_rules failed, %d\n",
+ __func__, err);
+ goto err_close_flow_table;
+ }
+
+ mlx5e_rename_channels_eqs(priv);
+ mlx5e_init_eth_addr(priv);
+
+ set_bit(MLX5E_STATE_OPENED, &priv->state);
+
+ mlx5e_create_debugfs(priv);
+ mlx5e_update_carrier(priv);
+ mlx5e_set_rx_mode_core(priv);
+
+ schedule_delayed_work(&priv->update_stats_work, 0);
+ return 0;
+
+err_close_flow_table:
+ mlx5e_close_flow_table(priv);
+
+err_close_tirs:
+ mlx5e_close_tirs(priv);
+
+err_close_rqls:
+ mlx5e_close_rqt(priv);
+
+err_close_channels:
+ mlx5e_close_channels(priv);
+
+err_close_tises:
+ mlx5e_close_tises(priv);
+
+ return err;
+}
+
+static int mlx5e_open(struct net_device *netdev)
+{
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+ int err;
+
+ mutex_lock(&priv->state_lock);
+ err = mlx5e_open_locked(netdev);
+ mutex_unlock(&priv->state_lock);
+
+ return err;
+}
+
+int mlx5e_close_locked(struct net_device *netdev)
+{
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+
+ clear_bit(MLX5E_STATE_OPENED, &priv->state);
+
+ mlx5e_set_rx_mode_core(priv);
+ mlx5e_del_all_vlan_rules(priv);
+ netif_carrier_off(priv->netdev);
+ mlx5e_destroy_debugfs(priv);
+ mlx5e_close_flow_table(priv);
+ mlx5e_close_tirs(priv);
+ mlx5e_close_rqt(priv);
+ mlx5e_close_channels(priv);
+ mlx5e_close_tises(priv);
+
+ return 0;
+}
+
+static int mlx5e_close(struct net_device *netdev)
+{
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+ int err;
+
+ mutex_lock(&priv->state_lock);
+ err = mlx5e_close_locked(netdev);
+ mutex_unlock(&priv->state_lock);
+
+ return err;
+}
+
+int mlx5e_update_priv_params(struct mlx5e_priv *priv,
+ struct mlx5e_params *new_params)
+{
+ int err = 0;
+ int was_opened;
+
+ WARN_ON(!mutex_is_locked(&priv->state_lock));
+
+ was_opened = test_bit(MLX5E_STATE_OPENED, &priv->state);
+ if (was_opened)
+ mlx5e_close_locked(priv->netdev);
+
+ priv->params = *new_params;
+
+ if (was_opened)
+ err = mlx5e_open_locked(priv->netdev);
+
+ return err;
+}
+
+#ifdef HAVE_NDO_SETUP_TC
+static int mlx5e_setup_tc(struct net_device *netdev, u8 tc)
+{
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+ struct mlx5e_params new_params;
+ int err;
+
+ if (tc > MLX5E_MAX_NUM_TC)
+ return -EINVAL;
+
+ mutex_lock(&priv->state_lock);
+ new_params = priv->params;
+ new_params.num_tc = tc ? tc : 1;
+ err = mlx5e_update_priv_params(priv, &new_params);
+ mutex_unlock(&priv->state_lock);
+
+ return err;
+}
+#endif
+
+#ifdef HAVE_NDO_GET_STATS64
+static struct rtnl_link_stats64 *
+mlx5e_get_stats(struct net_device *dev, struct rtnl_link_stats64 *stats)
+#else
+static struct net_device_stats *mlx5e_get_stats(struct net_device *dev)
+#endif
+{
+ struct mlx5e_priv *priv = netdev_priv(dev);
+ struct mlx5e_vport_stats *vstats = &priv->stats.vport;
+
+#ifndef HAVE_NDO_GET_STATS64
+ struct net_device_stats *stats = &priv->netdev_stats;
+#endif
+
+ stats->rx_packets = vstats->rx_packets;
+ stats->rx_bytes = vstats->rx_bytes;
+ stats->tx_packets = vstats->tx_packets;
+ stats->tx_bytes = vstats->tx_bytes;
+ stats->multicast = vstats->rx_multicast_packets +
+ vstats->tx_multicast_packets;
+ stats->tx_errors = vstats->tx_error_packets;
+ stats->rx_errors = vstats->rx_error_packets;
+ stats->tx_dropped = vstats->tx_queue_dropped;
+ /* TODO: replace 0s with true values */
+ stats->rx_crc_errors = 0;
+ stats->rx_length_errors = 0;
+
+ return stats;
+}
+
+static void mlx5e_set_rx_mode(struct net_device *dev)
+{
+ struct mlx5e_priv *priv = netdev_priv(dev);
+
+ schedule_work(&priv->set_rx_mode_work);
+}
+
+static int mlx5e_set_mac(struct net_device *netdev, void *addr)
+{
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+ struct sockaddr *saddr = addr;
+
+ if (!is_valid_ether_addr(saddr->sa_data))
+ return -EADDRNOTAVAIL;
+
+ netif_addr_lock_bh(netdev);
+ ether_addr_copy(netdev->dev_addr, saddr->sa_data);
+ netif_addr_unlock_bh(netdev);
+
+ schedule_work(&priv->set_rx_mode_work);
+
+ return 0;
+}
+
+#if (defined(HAVE_NDO_SET_FEATURES) || defined(HAVE_NET_DEVICE_OPS_EXT))
+static int mlx5e_set_features(struct net_device *netdev,
+#ifdef HAVE_NET_DEVICE_OPS_EXT
+ u32 features)
+#else
+ netdev_features_t features)
+#endif
+{
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+ netdev_features_t changes = features ^ netdev->features;
+ struct mlx5e_params new_params;
+ bool update_params = false;
+
+ mutex_lock(&priv->state_lock);
+ new_params = priv->params;
+
+ if (changes & NETIF_F_LRO) {
+ new_params.lro_en = !!(features & NETIF_F_LRO);
+ update_params = true;
+ }
+
+ if (update_params)
+ mlx5e_update_priv_params(priv, &new_params);
+
+ if (changes & NETIF_F_HW_VLAN_CTAG_FILTER) {
+ if (features & NETIF_F_HW_VLAN_CTAG_FILTER)
+ mlx5e_enable_vlan_filter(priv);
+ else
+ mlx5e_disable_vlan_filter(priv);
+ }
+
+ mutex_unlock(&priv->state_lock);
+
+ return 0;
+}
+#endif
+
+static int mlx5e_change_mtu(struct net_device *netdev, int new_mtu)
+{
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+ struct mlx5_core_dev *mdev = priv->mdev;
+ int max_mtu;
+ int err;
+
+ mlx5_query_port_max_mtu(mdev, &max_mtu);
+
+ if (MLX5E_SW2HW_MTU(new_mtu) > min_t(int, MLX5E_MAX_MTU, max_mtu)) {
+ netdev_err(netdev,
+ "%s: Bad MTU (%d) > (%d) Max\n",
+ __func__, new_mtu, max_mtu);
+ return -EINVAL;
+ }
+
+ mutex_lock(&priv->state_lock);
+ netdev->mtu = new_mtu;
+ err = mlx5e_update_priv_params(priv, &priv->params);
+ mutex_unlock(&priv->state_lock);
+
+ return err;
+}
+
+#if defined HAVE_VLAN_GRO_RECEIVE || defined HAVE_VLAN_HWACCEL_RX
+void mlx5e_vlan_register(struct net_device *netdev, struct vlan_group *grp)
+{
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+ priv->vlan_grp = grp;
+}
+#endif
+
+static struct net_device_ops mlx5e_netdev_ops = {
+ .ndo_open = mlx5e_open,
+ .ndo_stop = mlx5e_close,
+ .ndo_start_xmit = mlx5e_xmit,
+#ifdef HAVE_NDO_SETUP_TC
+ .ndo_setup_tc = mlx5e_setup_tc,
+#endif
+/* .ndo_select_queue = mlx5e_select_queue, // issue 549663 */
+#ifdef HAVE_NDO_GET_STATS64
+ .ndo_get_stats64 = mlx5e_get_stats,
+#else
+ .ndo_get_stats = mlx5e_get_stats,
+#endif
+ .ndo_set_rx_mode = mlx5e_set_rx_mode,
+ .ndo_set_mac_address = mlx5e_set_mac,
+ .ndo_vlan_rx_add_vid = mlx5e_vlan_rx_add_vid,
+ .ndo_vlan_rx_kill_vid = mlx5e_vlan_rx_kill_vid,
+
+#if defined HAVE_VLAN_GRO_RECEIVE || defined HAVE_VLAN_HWACCEL_RX
+ .ndo_vlan_rx_register = mlx5e_vlan_register,
+#endif
+#if (defined(HAVE_NDO_SET_FEATURES) && !defined(HAVE_NET_DEVICE_OPS_EXT))
+ .ndo_set_features = mlx5e_set_features,
+#endif
+ .ndo_change_mtu = mlx5e_change_mtu,
+};
+
+#ifdef HAVE_NET_DEVICE_OPS_EXT
+static const struct net_device_ops_ext mlx5_netdev_ops_ext = {
+ .size = sizeof(struct net_device_ops_ext),
+ .ndo_set_features = mlx5e_set_features,
+};
+#endif
+
+static int mlx5e_check_required_hca_cap(struct mlx5_core_dev *mdev)
+{
+ if (MLX5_CAP_GEN(mdev, port_type) != MLX5_CAP_PORT_TYPE_ETH)
+ return -ENOTSUPP;
+ /* TODO: check if more caps are needed */
+ if (!MLX5_CAP_GEN(mdev, eth_net_offloads) ||
+ !MLX5_CAP_GEN(mdev, nic_flow_table) ||
+ /* TODO: move following caps to control path (NETDEV Flags/OPs) */
+ !MLX5_CAP_ETH(mdev, csum_cap) ||
+ !MLX5_CAP_ETH(mdev, max_lso_cap) ||
+ !MLX5_CAP_ETH(mdev, vlan_cap) ||
+ !MLX5_CAP_ETH(mdev, rss_ind_tbl_cap) ||
+ MLX5_CAP_FLOWTABLE(mdev,
+ flow_table_properties_nic_receive.max_ft_level)
+ < 3) {
+ mlx5_core_warn(mdev,
+ "Not creating net device, some required device capabilities are missing\n");
+ return -ENOTSUPP;
+ }
+ return 0;
+}
+
+static void mlx5e_build_netdev_priv(struct mlx5_core_dev *mdev,
+ struct net_device *netdev,
+ int num_comp_vectors)
+{
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+
+ /* TODO: consider link speed for setting the following:
+ * log_sq_size
+ * log_rq_size
+ * cq moderation?
+ * lro_timeout_period_usecs@mlx5e_build_tir_ctx()
+ */
+ priv->params.log_sq_size =
+ MLX5E_PARAMS_DEFAULT_LOG_SQ_SIZE;
+ priv->params.log_rq_size =
+ MLX5E_PARAMS_DEFAULT_LOG_RQ_SIZE;
+ priv->params.rx_cq_moderation_usec =
+ MLX5E_PARAMS_DEFAULT_RX_CQ_MODERATION_USEC;
+ priv->params.rx_cq_moderation_pkts =
+ MLX5E_PARAMS_DEFAULT_RX_CQ_MODERATION_PKTS;
+ priv->params.tx_cq_moderation_usec =
+ MLX5E_PARAMS_DEFAULT_TX_CQ_MODERATION_USEC;
+ priv->params.tx_cq_moderation_pkts =
+ MLX5E_PARAMS_DEFAULT_TX_CQ_MODERATION_PKTS;
+ priv->params.min_rx_wqes =
+ MLX5E_PARAMS_DEFAULT_MIN_RX_WQES;
+ priv->params.rx_hash_log_tbl_sz =
+ (order_base_2(num_comp_vectors) >
+ MLX5E_PARAMS_DEFAULT_RX_HASH_LOG_TBL_SZ) ?
+ order_base_2(num_comp_vectors) :
+ MLX5E_PARAMS_DEFAULT_RX_HASH_LOG_TBL_SZ;
+ priv->params.num_tc = 1;
+ priv->params.default_vlan_prio = 0;
+
+ priv->params.rss_hash_xor = true;
+
+ /* TODO: add user ability to configure lro wqe size */
+ /* we disable lro by default, user can enable via ethtool */
+ priv->params.lro_en = false && !!MLX5_CAP_ETH(priv->mdev, lro_cap);
+ priv->params.lro_wqe_sz =
+ MLX5E_PARAMS_DEFAULT_LRO_WQE_SZ;
+
+ priv->mdev = mdev;
+ priv->netdev = netdev;
+ priv->params.num_channels = num_comp_vectors;
+ priv->default_vlan_prio = priv->params.default_vlan_prio;
+
+ spin_lock_init(&priv->async_events_spinlock);
+ mutex_init(&priv->state_lock);
+
+ INIT_WORK(&priv->update_carrier_work, mlx5e_update_carrier_work);
+ INIT_WORK(&priv->set_rx_mode_work, mlx5e_set_rx_mode_work);
+ INIT_DELAYED_WORK(&priv->update_stats_work, mlx5e_update_stats_work);
+}
+
+static void mlx5e_set_netdev_dev_addr(struct net_device *netdev)
+{
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+
+ mlx5_query_nic_vport_mac_address(priv->mdev, netdev->dev_addr);
+ /* TODO: w4fw: set mac address in nic vport context */
+}
+
+static void mlx5e_build_netdev(struct net_device *netdev)
+{
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+ struct mlx5_core_dev *mdev = priv->mdev;
+
+ SET_NETDEV_DEV(netdev, &mdev->pdev->dev);
+
+ netdev->netdev_ops = &mlx5e_netdev_ops;
+ netdev->watchdog_timeo = 15 * HZ;
+
+#ifdef HAVE_ETHTOOL_OPS_EXT
+ SET_ETHTOOL_OPS(netdev, &mlx5e_ethtool_ops);
+ set_ethtool_ops_ext(netdev, &mlx5e_ethtool_ops_ext);
+#else
+ netdev->ethtool_ops = &mlx5e_ethtool_ops;
+#endif
+
+ netdev->vlan_features = NETIF_F_SG;
+ netdev->vlan_features |= NETIF_F_IP_CSUM;
+ netdev->vlan_features |= NETIF_F_IPV6_CSUM;
+ netdev->vlan_features |= NETIF_F_GRO;
+ netdev->vlan_features |= NETIF_F_TSO;
+ netdev->vlan_features |= NETIF_F_TSO6;
+ netdev->vlan_features |= NETIF_F_RXCSUM;
+#ifdef HAVE_NETIF_F_RXHASH
+ netdev->vlan_features |= NETIF_F_RXHASH;
+#endif
+
+ if (!!MLX5_CAP_ETH(mdev, lro_cap))
+ netdev->vlan_features |= NETIF_F_LRO;
+
+#ifdef HAVE_NETDEV_HW_FEATURES
+ netdev->hw_features = netdev->vlan_features;
+ netdev->hw_features |= NETIF_F_HW_VLAN_CTAG_RX;
+ netdev->hw_features |= NETIF_F_HW_VLAN_CTAG_FILTER;
+
+ netdev->features = netdev->hw_features;
+#else /* HAVE_NETDEV_HW_FEATURES */
+ netdev->features = netdev->vlan_features;
+ netdev->features |= NETIF_F_HW_VLAN_CTAG_RX;
+ netdev->features |= NETIF_F_HW_VLAN_CTAG_FILTER;
+#ifdef HAVE_SET_NETDEV_HW_FEATURES
+ set_netdev_hw_features(netdev, netdev->features);
+#endif
+#endif /* HAVE_NETDEV_HW_FEATURES */
+
+ if (!priv->params.lro_en)
+ netdev->features &= ~NETIF_F_LRO;
+
+ netdev->features |= NETIF_F_HIGHDMA;
+
+#if (LINUX_VERSION_CODE >= KERNEL_VERSION(3,2,0))
+ netdev->priv_flags |= IFF_UNICAST_FLT;
+#endif
+
+#ifdef HAVE_NET_DEVICE_OPS_EXT
+ set_netdev_ops_ext(netdev, &mlx5_netdev_ops_ext);
+#endif
+
+ mlx5e_set_netdev_dev_addr(netdev);
+}
+
+static int mlx5e_create_mkey(struct mlx5e_priv *priv, u32 pdn,
+ struct mlx5_core_mr *mr)
+{
+ struct mlx5_core_dev *mdev = priv->mdev;
+ struct mlx5_create_mkey_mbox_in *in;
+ int err;
+
+ in = mlx5_vzalloc(sizeof(*in));
+ if (!in)
+ return -ENOMEM;
+
+ in->seg.flags = MLX5_PERM_LOCAL_WRITE |
+ MLX5_PERM_LOCAL_READ |
+ MLX5_ACCESS_MODE_PA;
+ in->seg.flags_pd = cpu_to_be32(pdn | MLX5_MKEY_LEN64);
+ in->seg.qpn_mkey7_0 = cpu_to_be32(0xffffff << 8);
+
+ err = mlx5_core_create_mkey(mdev, mr, in, sizeof(*in), NULL, NULL,
+ NULL);
+
+ kvfree(in);
+
+ return err;
+}
+
+static void *mlx5e_create_netdev(struct mlx5_core_dev *mdev)
+{
+ struct net_device *netdev;
+ struct mlx5e_priv *priv;
+ int ncv = mdev->priv.eq_table.num_comp_vectors;
+ int err;
+
+ if (mlx5e_check_required_hca_cap(mdev))
+ return NULL;
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ netdev = alloc_etherdev_mqs(sizeof(struct mlx5e_priv),
+ ncv * MLX5E_MAX_NUM_TC,
+ ncv);
+#else
+ netdev = alloc_etherdev_mq(sizeof(struct mlx5e_priv), ncv);
+#endif
+ if (!netdev) {
+ mlx5_core_err(mdev, "netdev allocation failed\n");
+ return NULL;
+ }
+
+ mlx5e_build_netdev_priv(mdev, netdev, ncv);
+ mlx5e_build_netdev(netdev);
+
+ netif_carrier_off(netdev);
+
+ priv = netdev_priv(netdev);
+
+ err = mlx5_alloc_map_uar(mdev, &priv->cq_uar);
+ if (err) {
+ netdev_err(netdev, "%s: mlx5_alloc_map_uar failed, %d\n",
+ __func__, err);
+ goto err_free_netdev;
+ }
+
+ err = mlx5_core_alloc_pd(mdev, &priv->pdn);
+ if (err) {
+ netdev_err(netdev, "%s: mlx5_core_alloc_pd failed, %d\n",
+ __func__, err);
+ goto err_unmap_free_uar;
+ }
+
+ err = mlx5_alloc_transport_domain(mdev, &priv->tdn);
+ if (err) {
+ netdev_err(netdev, "%s: mlx5_alloc_transport_domain failed, %d\n",
+ __func__, err);
+ goto err_dealloc_pd;
+ }
+
+ err = mlx5e_create_mkey(priv, priv->pdn, &priv->mr);
+ if (err) {
+ netdev_err(netdev, "%s: mlx5e_create_mkey failed, %d\n",
+ __func__, err);
+ goto err_dealloc_transport_domain;
+ }
+
+ err = register_netdev(netdev);
+ if (err) {
+ netdev_err(netdev, "%s: register_netdev failed, %d\n",
+ __func__, err);
+ goto err_destroy_mkey;
+ }
+
+ mlx5e_enable_async_events(priv);
+
+ return priv;
+
+err_destroy_mkey:
+ mlx5_core_destroy_mkey(mdev, &priv->mr);
+
+err_dealloc_transport_domain:
+ mlx5_dealloc_transport_domain(mdev, priv->tdn);
+
+err_dealloc_pd:
+ mlx5_core_dealloc_pd(mdev, priv->pdn);
+
+err_unmap_free_uar:
+ mlx5_unmap_free_uar(mdev, &priv->cq_uar);
+
+err_free_netdev:
+ free_netdev(netdev);
+
+ return NULL;
+}
+
+static void mlx5e_destroy_netdev(struct mlx5_core_dev *mdev, void *vpriv)
+{
+ struct mlx5e_priv *priv = vpriv;
+ struct net_device *netdev = priv->netdev;
+
+ unregister_netdev(netdev);
+ mlx5_core_destroy_mkey(priv->mdev, &priv->mr);
+ mlx5_dealloc_transport_domain(priv->mdev, priv->tdn);
+ mlx5_core_dealloc_pd(priv->mdev, priv->pdn);
+ mlx5_unmap_free_uar(priv->mdev, &priv->cq_uar);
+ mlx5e_disable_async_events(priv);
+ flush_scheduled_work();
+ free_netdev(netdev);
+}
+
+static void *mlx5e_get_netdev(void *vpriv)
+{
+ struct mlx5e_priv *priv = vpriv;
+
+ return priv->netdev;
+}
+
+static struct mlx5_interface mlx5e_interface = {
+ .add = mlx5e_create_netdev,
+ .remove = mlx5e_destroy_netdev,
+ .event = mlx5e_async_event,
+ .protocol = MLX5_INTERFACE_PROTOCOL_ETH,
+ .get_dev = mlx5e_get_netdev,
+};
+
+void mlx5e_init(void)
+{
+ mlx5_register_interface(&mlx5e_interface);
+}
+
+void mlx5e_cleanup(void)
+{
+ mlx5_unregister_interface(&mlx5e_interface);
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/en_rx.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/en_rx.c
new file mode 100644
index 0000000..67081ac
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/en_rx.c
@@ -0,0 +1,310 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "en.h"
+
+static inline int mlx5e_alloc_rx_wqe(struct mlx5e_rq *rq,
+ struct mlx5e_rx_wqe *wqe, u16 ix)
+{
+ struct sk_buff *skb;
+ dma_addr_t dma_addr;
+
+ skb = netdev_alloc_skb(rq->netdev, rq->wqe_sz);
+ if (unlikely(!skb))
+ return -ENOMEM;
+
+ dma_addr = dma_map_single(rq->pdev,
+ /* hw start padding */
+ skb->data,
+ /* hw end padding */
+ rq->wqe_sz,
+ DMA_FROM_DEVICE);
+
+ if (unlikely(dma_mapping_error(rq->pdev, dma_addr)))
+ goto err_free_skb;
+
+ skb_reserve(skb, MLX5E_NET_IP_ALIGN);
+
+ *((dma_addr_t *)skb->cb) = dma_addr;
+ wqe->data.addr = cpu_to_be64(dma_addr + MLX5E_NET_IP_ALIGN);
+
+ rq->skb[ix] = skb;
+
+ return 0;
+
+err_free_skb:
+ dev_kfree_skb(skb);
+
+ return -ENOMEM;
+}
+
+bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq)
+{
+ struct mlx5_wq_ll *wq = &rq->wq;
+
+ if (unlikely(!test_bit(MLX5E_RQ_STATE_POST_WQES_ENABLE, &rq->state)))
+ return false;
+
+ while (!mlx5_wq_ll_is_full(wq)) {
+ struct mlx5e_rx_wqe *wqe = mlx5_wq_ll_get_wqe(wq, wq->head);
+
+ if (unlikely(mlx5e_alloc_rx_wqe(rq, wqe, wq->head)))
+ break;
+
+ mlx5_wq_ll_push(wq, be16_to_cpu(wqe->next.next_wqe_index));
+ }
+
+ /* ensure wqes are visible to device before updating doorbell record */
+ wmb();
+
+ mlx5_wq_ll_update_db_record(wq);
+
+ return !mlx5_wq_ll_is_full(wq);
+}
+
+static void mlx5e_lro_update_hdr(struct sk_buff *skb, struct mlx5_cqe64 *cqe)
+{
+ /* TODO: consider vlans, ip options, ... */
+ struct ethhdr *eth = (struct ethhdr *)(skb->data);
+ struct iphdr *ipv4 = (struct iphdr *)(skb->data + ETH_HLEN);
+ struct ipv6hdr *ipv6 = (struct ipv6hdr *)(skb->data + ETH_HLEN);
+ struct tcphdr *tcp;
+
+ u8 l4_hdr_type = get_cqe_l4_hdr_type(cqe);
+ int tcp_ack = ((CQE_L4_HDR_TYPE_TCP_ACK_NO_DATA == l4_hdr_type) ||
+ (CQE_L4_HDR_TYPE_TCP_ACK_AND_DATA == l4_hdr_type));
+
+ /* TODO: consider vlan */
+ u16 tot_len = be32_to_cpu(cqe->byte_cnt) - ETH_HLEN;
+
+ if (eth->h_proto == htons(ETH_P_IP)) {
+ tcp = (struct tcphdr *)(skb->data + ETH_HLEN +
+ sizeof(struct iphdr));
+ ipv6 = NULL;
+ } else {
+ tcp = (struct tcphdr *)(skb->data + ETH_HLEN +
+ sizeof(struct ipv6hdr));
+ ipv4 = NULL;
+ }
+
+ /* TODO: handle timestamp */
+
+ if (get_cqe_lro_tcppsh(cqe))
+ tcp->psh = 1;
+
+ if (tcp_ack) {
+ tcp->ack = 1;
+ tcp->ack_seq = cqe->lro_ack_seq_num;
+ tcp->window = cqe->lro_tcp_win;
+ }
+
+ if (ipv4) {
+ ipv4->ttl = cqe->lro_min_ttl;
+ ipv4->tot_len = cpu_to_be16(tot_len);
+ ipv4->check = 0;
+ ipv4->check = ip_fast_csum((unsigned char *)ipv4,
+ ipv4->ihl);
+ } else {
+ ipv6->hop_limit = cqe->lro_min_ttl;
+ ipv6->payload_len = cpu_to_be16(tot_len -
+ sizeof(struct ipv6hdr));
+ }
+ /* TODO: handle tcp checksum */
+}
+
+#ifdef HAVE_NETIF_F_RXHASH
+static inline void mlx5e_skb_set_hash(struct mlx5_cqe64 *cqe,
+ struct sk_buff *skb)
+{
+#ifdef HAVE_SKB_SET_HASH
+ u8 cht = cqe->rss_hash_type;
+ int ht = (cht & CQE_RSS_HTYPE_L4) ? PKT_HASH_TYPE_L4 :
+ (cht & CQE_RSS_HTYPE_IP) ? PKT_HASH_TYPE_L3 :
+ PKT_HASH_TYPE_NONE;
+ skb_set_hash(skb, be32_to_cpu(cqe->rss_hash_result), ht);
+#else
+ skb->rxhash = be32_to_cpu(cqe->rss_hash_result);
+#endif
+}
+#endif
+
+static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe,
+ struct mlx5e_rq *rq,
+ struct sk_buff *skb)
+{
+ struct net_device *netdev = rq->netdev;
+ u32 cqe_bcnt = be32_to_cpu(cqe->byte_cnt);
+ int lro_num_seg;
+
+ skb_put(skb, cqe_bcnt);
+
+ lro_num_seg = be32_to_cpu(cqe->srqn) >> 24;
+ if (lro_num_seg > 1) {
+ mlx5e_lro_update_hdr(skb, cqe);
+ skb_shinfo(skb)->gso_size = MLX5E_PARAMS_DEFAULT_LRO_WQE_SZ;
+ rq->stats.lro_packets++;
+ rq->stats.lro_bytes += cqe_bcnt;
+ }
+
+ if (likely(netdev->features & NETIF_F_RXCSUM) &&
+ (cqe->hds_ip_ext & CQE_L2_OK) &&
+ (cqe->hds_ip_ext & CQE_L3_OK) &&
+ (cqe->hds_ip_ext & CQE_L4_OK)) {
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
+ } else {
+ skb->ip_summed = CHECKSUM_NONE;
+ rq->stats.csum_none++;
+ }
+
+ skb->protocol = eth_type_trans(skb, netdev);
+
+ skb_record_rx_queue(skb, rq->ix);
+
+#ifdef HAVE_NETIF_F_RXHASH
+ if (likely(netdev->features & NETIF_F_RXHASH))
+ mlx5e_skb_set_hash(cqe, skb);
+#endif
+
+ if (cqe_has_vlan(cqe))
+#if (LINUX_VERSION_CODE < KERNEL_VERSION(3,10,0))
+ __vlan_hwaccel_put_tag(skb, be16_to_cpu(cqe->vlan_info));
+#else
+ __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q),
+ be16_to_cpu(cqe->vlan_info));
+#endif
+}
+
+bool mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget)
+{
+ struct mlx5e_rq *rq = container_of(cq, struct mlx5e_rq, cq);
+ struct mlx5_cqe64 *cqe;
+#if defined HAVE_VLAN_GRO_RECEIVE || defined HAVE_VLAN_HWACCEL_RX
+ struct net_device *netdev = rq->netdev;
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+ struct mlx5_cqe64 *prev_cqe;
+#endif
+ int i;
+
+ /* avoid accessing cq (dma coherent memory) if not needed */
+ if (!test_and_clear_bit(MLX5E_CQ_HAS_CQES, &cq->flags))
+ return false;
+
+ cqe = mlx5e_get_cqe(cq);
+
+ for (i = 0; i < budget; i++) {
+ struct mlx5e_rx_wqe *wqe;
+ struct sk_buff *skb;
+ __be16 wqe_counter_be;
+ u16 wqe_counter;
+
+ if (!cqe)
+ break;
+
+ mlx5_cqwq_pop(&cq->wq);
+ mlx5e_prefetch_cqe(cq);
+
+ wqe_counter_be = cqe->wqe_counter;
+ wqe_counter = be16_to_cpu(wqe_counter_be);
+ wqe = mlx5_wq_ll_get_wqe(&rq->wq, wqe_counter);
+ skb = rq->skb[wqe_counter];
+ prefetch(skb->data);
+ rq->skb[wqe_counter] = NULL;
+
+ dma_unmap_single(rq->pdev,
+ *((dma_addr_t *)skb->cb),
+ rq->wqe_sz,
+ DMA_FROM_DEVICE);
+
+ if (unlikely((cqe->op_own >> 4) != MLX5_CQE_RESP_SEND)) {
+ rq->stats.wqe_err++;
+ dev_kfree_skb(skb);
+ cqe = mlx5e_get_cqe(cq);
+ goto wq_ll_pop;
+ }
+
+ mlx5e_build_rx_skb(cqe, rq, skb);
+ rq->stats.packets++;
+
+#if defined HAVE_VLAN_GRO_RECEIVE || defined HAVE_VLAN_HWACCEL_RX
+ prev_cqe = cqe;
+#endif
+ cqe = mlx5e_get_cqe(cq);
+
+#ifdef HAVE_SK_BUFF_XMIT_MORE
+ if (cqe)
+ skb->xmit_more = 1;
+#endif
+
+#ifdef CONFIG_COMPAT_LRO_ENABLED_IPOIB
+ if (rq->flags & MLX5E_RQ_FLAG_SWLRO)
+ lro_receive_skb(&rq->sw_lro.lro_mgr, skb, NULL);
+ else
+#endif
+
+#if defined HAVE_VLAN_GRO_RECEIVE || defined HAVE_VLAN_HWACCEL_RX
+ if (priv->vlan_grp && cqe_has_vlan(prev_cqe))
+#ifdef HAVE_VLAN_GRO_RECEIVE
+ vlan_gro_receive(cq->napi, priv->vlan_grp,
+ be16_to_cpu(prev_cqe->vlan_info),
+ skb);
+#else
+ vlan_hwaccel_rx(skb, priv->vlan_grp,
+ be16_to_cpu(prev_cqe->vlan_info));
+#endif
+ else
+#endif
+ napi_gro_receive(cq->napi, skb);
+
+wq_ll_pop:
+ mlx5_wq_ll_pop(&rq->wq, wqe_counter_be,
+ &wqe->next.next_wqe_index);
+ }
+
+ mlx5_cqwq_update_db_record(&cq->wq);
+
+ /* ensure cq space is freed before enabling more cqes */
+ wmb();
+
+ if (i == budget) {
+ set_bit(MLX5E_CQ_HAS_CQES, &cq->flags);
+ return true;
+ }
+#ifdef CONFIG_COMPAT_LRO_ENABLED_IPOIB
+ if (rq->flags & MLX5E_RQ_FLAG_SWLRO)
+ lro_flush_all(&rq->sw_lro.lro_mgr);
+#endif
+ return false;
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/en_tx.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/en_tx.c
new file mode 100644
index 0000000..7a7c3b8
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/en_tx.c
@@ -0,0 +1,392 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "en.h"
+
+#define MLX5E_SQ_NOPS_ROOM MLX5_SEND_WQE_MAX_WQEBBS
+#define MLX5E_SQ_STOP_ROOM (MLX5_SEND_WQE_MAX_WQEBBS +\
+ MLX5E_SQ_NOPS_ROOM)
+
+void mlx5e_send_nop(struct mlx5e_sq *sq, bool notify_hw)
+{
+ struct mlx5_wq_cyc *wq = &sq->wq;
+
+ u16 pi = sq->pc & wq->sz_m1;
+ struct mlx5e_tx_wqe *wqe = mlx5_wq_cyc_get_wqe(wq, pi);
+
+ struct mlx5_wqe_ctrl_seg *cseg = &wqe->ctrl;
+
+ memset(cseg, 0, sizeof(*cseg));
+
+ cseg->opmod_idx_opcode = cpu_to_be32((sq->pc << 8) | MLX5_OPCODE_NOP);
+ cseg->qpn_ds = cpu_to_be32((sq->sqn << 8) | 0x01);
+
+ sq->skb[pi] = NULL;
+ sq->pc++;
+
+ if (notify_hw) {
+ cseg->fm_ce_se = MLX5_WQE_CTRL_CQ_UPDATE;
+ mlx5e_tx_notify_hw(sq, wqe, 0);
+ }
+}
+
+static void mlx5e_dma_pop_last_pushed(struct mlx5e_sq *sq, dma_addr_t *addr,
+ u32 *size)
+{
+ sq->dma_fifo_pc--;
+ *addr = sq->dma_fifo[sq->dma_fifo_pc & sq->dma_fifo_mask].addr;
+ *size = sq->dma_fifo[sq->dma_fifo_pc & sq->dma_fifo_mask].size;
+}
+
+static void mlx5e_dma_unmap_wqe_err(struct mlx5e_sq *sq, struct sk_buff *skb)
+{
+ dma_addr_t addr;
+ u32 size;
+ int i;
+
+ for (i = 0; i < MLX5E_TX_SKB_CB(skb)->num_dma; i++) {
+ mlx5e_dma_pop_last_pushed(sq, &addr, &size);
+ dma_unmap_single(sq->pdev, addr, size, DMA_TO_DEVICE);
+ }
+}
+
+static inline void mlx5e_dma_push(struct mlx5e_sq *sq, dma_addr_t addr,
+ u32 size)
+{
+ sq->dma_fifo[sq->dma_fifo_pc & sq->dma_fifo_mask].addr = addr;
+ sq->dma_fifo[sq->dma_fifo_pc & sq->dma_fifo_mask].size = size;
+ sq->dma_fifo_pc++;
+}
+
+static inline void mlx5e_dma_get(struct mlx5e_sq *sq, u32 i, dma_addr_t *addr,
+ u32 *size)
+{
+ *addr = sq->dma_fifo[i & sq->dma_fifo_mask].addr;
+ *size = sq->dma_fifo[i & sq->dma_fifo_mask].size;
+}
+
+#ifndef HAVE_SELECT_QUEUE_FALLBACK_T
+#define fallback(dev, skb) __netdev_pick_tx(dev, skb)
+#endif
+
+#if defined(NDO_SELECT_QUEUE_HAS_ACCEL_PRIV) || defined(HAVE_SELECT_QUEUE_FALLBACK_T)
+u16 mlx5e_select_queue(struct net_device *dev, struct sk_buff *skb,
+#ifdef HAVE_SELECT_QUEUE_FALLBACK_T
+ void *accel_priv, select_queue_fallback_t fallback)
+#else
+ void *accel_priv)
+#endif
+#else /* NDO_SELECT_QUEUE_HAS_ACCEL_PRIV || HAVE_SELECT_QUEUE_FALLBACK_T */
+u16 mlx5e_select_queue(struct net_device *dev, struct sk_buff *skb)
+#endif
+{
+ struct mlx5e_priv *priv = netdev_priv(dev);
+ int channel_ix = fallback(dev, skb);
+#ifdef HAVE_NETDEV_GET_PRIO_TC_MAP
+ int up = skb_vlan_tag_present(skb) ?
+ skb->vlan_tci >> VLAN_PRIO_SHIFT :
+ priv->default_vlan_prio;
+ int tc = netdev_get_prio_tc_map(dev, up);
+#else
+ /* TODO: QoS and traffic classes are not fully implemented */
+ int tc = 0;
+#endif
+ return priv->channel[channel_ix]->tc_to_txq_map[tc];
+}
+
+static inline u16 mlx5e_get_inline_hdr_size(struct mlx5e_sq *sq,
+ struct sk_buff *skb, bool bf)
+{
+ bool inline_wqe = bf && (skb_headlen(skb) <= sq->max_inline);
+
+ return inline_wqe ? skb_headlen(skb) : (ETH_HLEN + 2/*vlan tag*/);
+}
+
+static netdev_tx_t mlx5e_sq_xmit(struct mlx5e_sq *sq, struct sk_buff *skb)
+{
+ struct mlx5_wq_cyc *wq = &sq->wq;
+
+ u16 pi = sq->pc & wq->sz_m1;
+ struct mlx5e_tx_wqe *wqe = mlx5_wq_cyc_get_wqe(wq, pi);
+
+ struct mlx5_wqe_ctrl_seg *cseg = &wqe->ctrl;
+ struct mlx5_wqe_eth_seg *eseg = &wqe->eth;
+ struct mlx5_wqe_data_seg *dseg;
+
+ u8 opcode = MLX5_OPCODE_SEND;
+ dma_addr_t dma_addr = 0;
+ bool bf = false;
+ u16 headlen;
+ u16 ds_cnt = sizeof(*wqe) / MLX5_SEND_WQE_DS;
+ u16 ihs;
+ int i;
+
+ memset(wqe, 0, sizeof(*wqe));
+
+ if (likely(skb->ip_summed == CHECKSUM_PARTIAL))
+ eseg->cs_flags = MLX5_ETH_WQE_L3_CSUM | MLX5_ETH_WQE_L4_CSUM;
+ else
+ sq->stats.csum_offload_none++;
+
+ if (sq->cc != sq->prev_cc) {
+ sq->prev_cc = sq->cc;
+ sq->bf_budget = (sq->cc == sq->pc) ? MLX5E_SQ_BF_BUDGET : 0;
+ }
+
+ if (skb_is_gso(skb)) {
+ u32 payload_len;
+
+ eseg->mss = cpu_to_be16(skb_shinfo(skb)->gso_size);
+ opcode = MLX5_OPCODE_LSO;
+ ihs = skb_transport_offset(skb) + tcp_hdrlen(skb);
+ payload_len = skb->len - ihs;
+ MLX5E_TX_SKB_CB(skb)->num_bytes = skb->len +
+ (skb_shinfo(skb)->gso_segs - 1) * ihs;
+ sq->stats.tso_packets++;
+ sq->stats.tso_bytes += payload_len;
+ } else {
+ bf = sq->bf_budget &&
+#ifdef HAVE_SK_BUFF_XMIT_MORE
+ !skb->xmit_more &&
+#endif
+ !skb_shinfo(skb)->nr_frags;
+ ihs = mlx5e_get_inline_hdr_size(sq, skb, bf);
+ MLX5E_TX_SKB_CB(skb)->num_bytes = max_t(unsigned int, skb->len,
+ ETH_ZLEN);
+ }
+
+ skb_copy_from_linear_data(skb, eseg->inline_hdr_start, ihs);
+ skb_pull_inline(skb, ihs);
+
+ eseg->inline_hdr_sz = cpu_to_be16(ihs);
+
+ ds_cnt += DIV_ROUND_UP(ihs - sizeof(eseg->inline_hdr_start),
+ MLX5_SEND_WQE_DS);
+
+ dseg = (struct mlx5_wqe_data_seg *)cseg + ds_cnt;
+
+ MLX5E_TX_SKB_CB(skb)->num_dma = 0;
+
+ headlen = skb_headlen(skb);
+ if (headlen) {
+ dma_addr = dma_map_single(sq->pdev, skb->data, headlen,
+ DMA_TO_DEVICE);
+ if (unlikely(dma_mapping_error(sq->pdev, dma_addr)))
+ goto dma_unmap_wqe_err;
+
+ dseg->addr = cpu_to_be64(dma_addr);
+ dseg->lkey = sq->mkey_be;
+ dseg->byte_count = cpu_to_be32(headlen);
+
+ mlx5e_dma_push(sq, dma_addr, headlen);
+ MLX5E_TX_SKB_CB(skb)->num_dma++;
+
+ dseg++;
+ }
+
+ for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+ struct skb_frag_struct *frag = &skb_shinfo(skb)->frags[i];
+ int fsz = skb_frag_size(frag);
+
+ dma_addr = skb_frag_dma_map(sq->pdev, frag, 0, fsz,
+ DMA_TO_DEVICE);
+ if (unlikely(dma_mapping_error(sq->pdev, dma_addr)))
+ goto dma_unmap_wqe_err;
+
+ dseg->addr = cpu_to_be64(dma_addr);
+ dseg->lkey = sq->mkey_be;
+ dseg->byte_count = cpu_to_be32(fsz);
+
+ mlx5e_dma_push(sq, dma_addr, fsz);
+ MLX5E_TX_SKB_CB(skb)->num_dma++;
+
+ dseg++;
+ }
+
+ ds_cnt += MLX5E_TX_SKB_CB(skb)->num_dma;
+
+ cseg->opmod_idx_opcode = cpu_to_be32((sq->pc << 8) | opcode);
+ cseg->qpn_ds = cpu_to_be32((sq->sqn << 8) | ds_cnt);
+
+ sq->skb[pi] = skb;
+
+ MLX5E_TX_SKB_CB(skb)->num_wqebbs = DIV_ROUND_UP(ds_cnt,
+ MLX5_SEND_WQEBB_NUM_DS);
+ sq->pc += MLX5E_TX_SKB_CB(skb)->num_wqebbs;
+
+ netdev_tx_sent_queue(sq->txq, MLX5E_TX_SKB_CB(skb)->num_bytes);
+
+ if (unlikely(!mlx5e_sq_has_room_for(sq, MLX5E_SQ_STOP_ROOM))) {
+ netif_tx_stop_queue(sq->txq);
+ sq->stats.stopped++;
+ }
+#ifdef HAVE_SK_BUFF_XMIT_MORE
+ if (!skb->xmit_more || netif_xmit_stopped(sq->txq))
+#endif
+ {
+ int bf_sz = 0;
+
+ if (bf && sq->uar_bf_map)
+ bf_sz = MLX5E_TX_SKB_CB(skb)->num_wqebbs << 3;
+
+ cseg->fm_ce_se = MLX5_WQE_CTRL_CQ_UPDATE;
+ mlx5e_tx_notify_hw(sq, wqe, bf_sz);
+ }
+
+ sq->bf_budget = bf ? sq->bf_budget - 1 : 0;
+
+ /* fill sq edge with nops to avoid wqe wrap around */
+ while ((sq->pc & wq->sz_m1) > sq->edge)
+ mlx5e_send_nop(sq, false);
+
+ sq->stats.packets++;
+ return NETDEV_TX_OK;
+
+dma_unmap_wqe_err:
+ sq->stats.dropped++;
+ mlx5e_dma_unmap_wqe_err(sq, skb);
+
+ dev_kfree_skb_any(skb);
+
+ return NETDEV_TX_OK;
+}
+
+netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+ struct mlx5e_priv *priv = netdev_priv(dev);
+ struct mlx5e_sq *sq = priv->txq_to_sq_map[skb_get_queue_mapping(skb)];
+
+ return mlx5e_sq_xmit(sq, skb);
+}
+
+bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq)
+{
+ struct mlx5_cqe64 *cqe;
+ struct mlx5e_sq *sq;
+ u32 dma_fifo_cc;
+ u32 nbytes;
+ u16 npkts;
+ u16 sqcc;
+ int i;
+
+ /* avoid accessing cq (dma coherent memory) if not needed */
+ if (!test_and_clear_bit(MLX5E_CQ_HAS_CQES, &cq->flags))
+ return false;
+
+ sq = container_of(cq, struct mlx5e_sq, cq);
+
+ npkts = 0;
+ nbytes = 0;
+
+ /* sq->cc must be updated only after mlx5_cqwq_update_db_record(),
+ * otherwise a cq overrun may occur */
+ sqcc = sq->cc;
+
+ /* avoid dirtying sq cache line every cqe */
+ dma_fifo_cc = sq->dma_fifo_cc;
+
+ cqe = mlx5e_get_cqe(cq);
+
+ for (i = 0; i < MLX5E_TX_CQ_POLL_BUDGET; i++) {
+ u16 wqe_counter;
+ bool last_wqe;
+
+ if (!cqe)
+ break;
+
+ mlx5_cqwq_pop(&cq->wq);
+ mlx5e_prefetch_cqe(cq);
+
+ wqe_counter = be16_to_cpu(cqe->wqe_counter);
+
+ do {
+ struct sk_buff *skb;
+ u16 ci;
+ int j;
+
+ last_wqe = (sqcc == wqe_counter);
+
+ ci = sqcc & sq->wq.sz_m1;
+ skb = sq->skb[ci];
+
+ if (unlikely(!skb)) { /* nop */
+ sq->stats.nop++;
+ sqcc++;
+ continue;
+ }
+
+ for (j = 0; j < MLX5E_TX_SKB_CB(skb)->num_dma; j++) {
+ dma_addr_t addr;
+ u32 size;
+
+ mlx5e_dma_get(sq, dma_fifo_cc, &addr, &size);
+ dma_fifo_cc++;
+ dma_unmap_single(sq->pdev, addr, size,
+ DMA_TO_DEVICE);
+ }
+
+ npkts++;
+ nbytes += MLX5E_TX_SKB_CB(skb)->num_bytes;
+ sqcc += MLX5E_TX_SKB_CB(skb)->num_wqebbs;
+ dev_kfree_skb(skb);
+ } while (!last_wqe);
+
+ cqe = mlx5e_get_cqe(cq);
+ }
+
+ mlx5_cqwq_update_db_record(&cq->wq);
+
+ /* ensure cq space is freed before enabling more cqes */
+ wmb();
+
+ sq->dma_fifo_cc = dma_fifo_cc;
+ sq->cc = sqcc;
+
+ netdev_tx_completed_queue(sq->txq, npkts, nbytes);
+
+ if (netif_tx_queue_stopped(sq->txq) &&
+ mlx5e_sq_has_room_for(sq, MLX5E_SQ_STOP_ROOM) &&
+ likely(test_bit(MLX5E_SQ_STATE_WAKE_TXQ_ENABLE, &sq->state))) {
+ netif_tx_wake_queue(sq->txq);
+ sq->stats.wake++;
+ }
+ if (i == MLX5E_TX_CQ_POLL_BUDGET) {
+ set_bit(MLX5E_CQ_HAS_CQES, &cq->flags);
+ return true;
+ }
+
+ return false;
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/en_txrx.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/en_txrx.c
new file mode 100644
index 0000000..42d5317
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/en_txrx.c
@@ -0,0 +1,118 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "en.h"
+
+void mlx5e_prefetch_cqe(struct mlx5e_cq *cq)
+{
+ struct mlx5_cqwq *wq = &cq->wq;
+ u32 ci = mlx5_cqwq_get_ci(wq);
+ struct mlx5_cqe64 *cqe = mlx5_cqwq_get_wqe(wq, ci);
+
+ prefetch(cqe);
+}
+
+struct mlx5_cqe64 *mlx5e_get_cqe(struct mlx5e_cq *cq)
+{
+ struct mlx5_cqwq *wq = &cq->wq;
+ u32 ci = mlx5_cqwq_get_ci(wq);
+ struct mlx5_cqe64 *cqe = mlx5_cqwq_get_wqe(wq, ci);
+ int cqe_ownership_bit = cqe->op_own & MLX5_CQE_OWNER_MASK;
+ int sw_ownership_val = mlx5_cqwq_get_wrap_cnt(wq) & 1;
+
+ if (cqe_ownership_bit != sw_ownership_val)
+ return NULL;
+
+ /* ensure cqe content is read after cqe ownership bit */
+ rmb();
+
+ return cqe;
+}
+
+int mlx5e_napi_poll(struct napi_struct *napi, int budget)
+{
+ struct mlx5e_channel *c = container_of(napi, struct mlx5e_channel,
+ napi);
+ bool busy = false;
+ int i;
+
+ clear_bit(MLX5E_CHANNEL_NAPI_SCHED, &c->flags);
+
+ busy |= mlx5e_poll_rx_cq(&c->rq.cq, budget);
+
+ busy |= mlx5e_post_rx_wqes(&c->rq);
+
+ for (i = 0; i < c->num_tc; i++)
+ busy |= mlx5e_poll_tx_cq(&c->sq[i].cq);
+
+ if (busy)
+ return budget;
+
+ napi_complete(napi);
+
+ /* avoid losing completion event during/after polling cqs */
+ if (test_bit(MLX5E_CHANNEL_NAPI_SCHED, &c->flags)) {
+ napi_schedule(napi);
+ return 0;
+ }
+
+ for (i = 0; i < c->num_tc; i++)
+ mlx5e_cq_arm(&c->sq[i].cq);
+ mlx5e_cq_arm(&c->rq.cq);
+
+ return 0;
+}
+
+void mlx5e_completion_event(struct mlx5_core_cq *mcq)
+{
+ struct mlx5e_cq *cq = container_of(mcq, struct mlx5e_cq, mcq);
+
+ set_bit(MLX5E_CQ_HAS_CQES, &cq->flags);
+ set_bit(MLX5E_CHANNEL_NAPI_SCHED, &cq->channel->flags);
+ barrier();
+ napi_schedule(cq->napi);
+}
+
+void mlx5e_cq_error_event(struct mlx5_core_cq *mcq, enum mlx5_event event)
+{
+ struct mlx5e_cq *cq = container_of(mcq, struct mlx5e_cq, mcq);
+ struct mlx5e_channel *c = cq->channel;
+ struct mlx5e_priv *priv = c->priv;
+ struct net_device *netdev = priv->netdev;
+
+ netdev_err(netdev, "%s: cqn=0x%.6x event=0x%.2x\n",
+ __func__, mcq->cqn, event);
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/eq.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/eq.c
new file mode 100644
index 0000000..1788fa2
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/eq.c
@@ -0,0 +1,566 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx5_core.h"
+
+enum {
+ MLX5_EQE_SIZE = sizeof(struct mlx5_eqe),
+ MLX5_EQE_OWNER_INIT_VAL = 0x1,
+};
+
+enum {
+ MLX5_EQ_STATE_ARMED = 0x9,
+ MLX5_EQ_STATE_FIRED = 0xa,
+ MLX5_EQ_STATE_ALWAYS_ARMED = 0xb,
+};
+
+enum {
+ MLX5_NUM_SPARE_EQE = 0x80,
+ MLX5_NUM_ASYNC_EQE = 0x100,
+ MLX5_NUM_CMD_EQE = 32,
+};
+
+enum {
+ MLX5_EQ_DOORBEL_OFFSET = 0x40,
+};
+
+#define MLX5_ASYNC_EVENT_MASK ((1ull << MLX5_EVENT_TYPE_PATH_MIG) | \
+ (1ull << MLX5_EVENT_TYPE_COMM_EST) | \
+ (1ull << MLX5_EVENT_TYPE_SQ_DRAINED) | \
+ (1ull << MLX5_EVENT_TYPE_CQ_ERROR) | \
+ (1ull << MLX5_EVENT_TYPE_WQ_CATAS_ERROR) | \
+ (1ull << MLX5_EVENT_TYPE_PATH_MIG_FAILED) | \
+ (1ull << MLX5_EVENT_TYPE_WQ_INVAL_REQ_ERROR) | \
+ (1ull << MLX5_EVENT_TYPE_WQ_ACCESS_ERROR) | \
+ (1ull << MLX5_EVENT_TYPE_PORT_CHANGE) | \
+ (1ull << MLX5_EVENT_TYPE_SRQ_CATAS_ERROR) | \
+ (1ull << MLX5_EVENT_TYPE_SRQ_LAST_WQE) | \
+ (1ull << MLX5_EVENT_TYPE_SRQ_RQ_LIMIT))
+
+struct map_eq_in {
+ u64 mask;
+ u32 reserved;
+ u32 unmap_eqn;
+};
+
+struct cre_des_eq {
+ u8 reserved[15];
+ u8 eqn;
+};
+
+static int mlx5_cmd_destroy_eq(struct mlx5_core_dev *dev, u8 eqn)
+{
+ u32 in[MLX5_ST_SZ_DW(destroy_eq_in)];
+ u32 out[MLX5_ST_SZ_DW(destroy_eq_out)];
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(destroy_eq_in, in, opcode, MLX5_CMD_OP_DESTROY_EQ);
+ MLX5_SET(destroy_eq_in, in, eq_number, eqn);
+
+ memset(out, 0, sizeof(out));
+ return mlx5_cmd_exec_check_status(dev, in, sizeof(in),
+ out, sizeof(out));
+}
+
+static struct mlx5_eqe *get_eqe(struct mlx5_eq *eq, u32 entry)
+{
+ return mlx5_buf_offset(&eq->buf, entry * MLX5_EQE_SIZE);
+}
+
+static struct mlx5_eqe *next_eqe_sw(struct mlx5_eq *eq)
+{
+ struct mlx5_eqe *eqe = get_eqe(eq, eq->cons_index & (eq->nent - 1));
+
+ return ((eqe->owner & 1) ^ !!(eq->cons_index & eq->nent)) ? NULL : eqe;
+}
+
+static const char *eqe_type_str(u8 type)
+{
+ switch (type) {
+ case MLX5_EVENT_TYPE_COMP:
+ return "MLX5_EVENT_TYPE_COMP";
+ case MLX5_EVENT_TYPE_PATH_MIG:
+ return "MLX5_EVENT_TYPE_PATH_MIG";
+ case MLX5_EVENT_TYPE_COMM_EST:
+ return "MLX5_EVENT_TYPE_COMM_EST";
+ case MLX5_EVENT_TYPE_SQ_DRAINED:
+ return "MLX5_EVENT_TYPE_SQ_DRAINED";
+ case MLX5_EVENT_TYPE_SRQ_LAST_WQE:
+ return "MLX5_EVENT_TYPE_SRQ_LAST_WQE";
+ case MLX5_EVENT_TYPE_SRQ_RQ_LIMIT:
+ return "MLX5_EVENT_TYPE_SRQ_RQ_LIMIT";
+ case MLX5_EVENT_TYPE_CQ_ERROR:
+ return "MLX5_EVENT_TYPE_CQ_ERROR";
+ case MLX5_EVENT_TYPE_WQ_CATAS_ERROR:
+ return "MLX5_EVENT_TYPE_WQ_CATAS_ERROR";
+ case MLX5_EVENT_TYPE_PATH_MIG_FAILED:
+ return "MLX5_EVENT_TYPE_PATH_MIG_FAILED";
+ case MLX5_EVENT_TYPE_WQ_INVAL_REQ_ERROR:
+ return "MLX5_EVENT_TYPE_WQ_INVAL_REQ_ERROR";
+ case MLX5_EVENT_TYPE_WQ_ACCESS_ERROR:
+ return "MLX5_EVENT_TYPE_WQ_ACCESS_ERROR";
+ case MLX5_EVENT_TYPE_SRQ_CATAS_ERROR:
+ return "MLX5_EVENT_TYPE_SRQ_CATAS_ERROR";
+ case MLX5_EVENT_TYPE_INTERNAL_ERROR:
+ return "MLX5_EVENT_TYPE_INTERNAL_ERROR";
+ case MLX5_EVENT_TYPE_PORT_CHANGE:
+ return "MLX5_EVENT_TYPE_PORT_CHANGE";
+ case MLX5_EVENT_TYPE_GPIO_EVENT:
+ return "MLX5_EVENT_TYPE_GPIO_EVENT";
+ case MLX5_EVENT_TYPE_REMOTE_CONFIG:
+ return "MLX5_EVENT_TYPE_REMOTE_CONFIG";
+ case MLX5_EVENT_TYPE_DB_BF_CONGESTION:
+ return "MLX5_EVENT_TYPE_DB_BF_CONGESTION";
+ case MLX5_EVENT_TYPE_STALL_EVENT:
+ return "MLX5_EVENT_TYPE_STALL_EVENT";
+ case MLX5_EVENT_TYPE_CMD:
+ return "MLX5_EVENT_TYPE_CMD";
+ case MLX5_EVENT_TYPE_PAGE_REQUEST:
+ return "MLX5_EVENT_TYPE_PAGE_REQUEST";
+ case MLX5_EVENT_TYPE_PAGE_FAULT:
+ return "MLX5_EVENT_TYPE_PAGE_FAULT";
+ case MLX5_EVENT_TYPE_DCT_DRAINED:
+ return "MLX5_EVENT_TYPE_DCT_DRAINED";
+ case MLX5_EVENT_TYPE_DCT_KEY_VIOLATION:
+ return "MLX5_EVENT_TYPE_DCT_KEY_VIOLATION";
+ default:
+ return "Unrecognized event";
+ }
+}
+
+static enum mlx5_dev_event port_subtype_event(u8 subtype)
+{
+ switch (subtype) {
+ case MLX5_PORT_CHANGE_SUBTYPE_DOWN:
+ return MLX5_DEV_EVENT_PORT_DOWN;
+ case MLX5_PORT_CHANGE_SUBTYPE_ACTIVE:
+ return MLX5_DEV_EVENT_PORT_UP;
+ case MLX5_PORT_CHANGE_SUBTYPE_INITIALIZED:
+ return MLX5_DEV_EVENT_PORT_INITIALIZED;
+ case MLX5_PORT_CHANGE_SUBTYPE_LID:
+ return MLX5_DEV_EVENT_LID_CHANGE;
+ case MLX5_PORT_CHANGE_SUBTYPE_PKEY:
+ return MLX5_DEV_EVENT_PKEY_CHANGE;
+ case MLX5_PORT_CHANGE_SUBTYPE_GUID:
+ return MLX5_DEV_EVENT_GUID_CHANGE;
+ case MLX5_PORT_CHANGE_SUBTYPE_CLIENT_REREG:
+ return MLX5_DEV_EVENT_CLIENT_REREG;
+ }
+ return -1;
+}
+
+static void eq_update_ci(struct mlx5_eq *eq, int arm)
+{
+ __be32 __iomem *addr = eq->doorbell + (arm ? 0 : 2);
+ u32 val = (eq->cons_index & 0xffffff) | (eq->eqn << 24);
+ __raw_writel((__force u32) cpu_to_be32(val), addr);
+ /* We still want ordering, just not swabbing, so add a barrier */
+ mb();
+}
+
+static void dump_eqe(struct mlx5_core_dev *dev, void *eqe)
+{
+ __be32 *buf = eqe;
+ int i;
+
+ mlx5_core_warn(dev, "EQE contents: %08x %08x %08x %08x\n",
+ be32_to_cpu(buf[0]), be32_to_cpu(buf[1]),
+ be32_to_cpu(buf[2]), be32_to_cpu(buf[3]));
+ for (i = 4; i < 16; i += 4) {
+ mlx5_core_warn(dev, " %08x %08x %08x %08x\n",
+ be32_to_cpu(buf[i]), be32_to_cpu(buf[i + 1]),
+ be32_to_cpu(buf[i + 2]), be32_to_cpu(buf[i + 3]));
+ }
+}
+
+static int mlx5_eq_int(struct mlx5_core_dev *dev, struct mlx5_eq *eq)
+{
+ struct mlx5_eqe *eqe;
+ int eqes_found = 0;
+ int set_ci = 0;
+ int err;
+ u32 cqn;
+ u32 rsn;
+ u32 dctn;
+ u8 port;
+
+ while ((eqe = next_eqe_sw(eq))) {
+ /*
+ * Make sure we read EQ entry contents after we've
+ * checked the ownership bit.
+ */
+ rmb();
+
+ mlx5_core_dbg(eq->dev, "eqn %d, eqe type %s\n",
+ eq->eqn, eqe_type_str(eqe->type));
+ switch (eqe->type) {
+ case MLX5_EVENT_TYPE_COMP:
+ cqn = be32_to_cpu(eqe->data.comp.cqn) & 0xffffff;
+ mlx5_cq_completion(dev, cqn);
+ break;
+
+ case MLX5_EVENT_TYPE_DCT_DRAINED:
+ case MLX5_EVENT_TYPE_DCT_KEY_VIOLATION:
+ dctn = be32_to_cpu(eqe->data.dct.dctn) & 0xffffff;
+ err = mlx5_rsc_event(dev, dctn, eqe->type);
+ if (err) {
+ mlx5_core_warn(dev, "mlx5_rsc_event failed on eq 0x%x\n", eq->eqn);
+ dump_eqe(dev, eqe);
+ }
+ break;
+
+ case MLX5_EVENT_TYPE_PATH_MIG:
+ case MLX5_EVENT_TYPE_COMM_EST:
+ case MLX5_EVENT_TYPE_SQ_DRAINED:
+ case MLX5_EVENT_TYPE_SRQ_LAST_WQE:
+ case MLX5_EVENT_TYPE_WQ_CATAS_ERROR:
+ case MLX5_EVENT_TYPE_PATH_MIG_FAILED:
+ case MLX5_EVENT_TYPE_WQ_INVAL_REQ_ERROR:
+ case MLX5_EVENT_TYPE_WQ_ACCESS_ERROR:
+ rsn = be32_to_cpu(eqe->data.qp_srq.qp_srq_n) & 0xffffff;
+ mlx5_core_dbg(dev, "event %s(%d) arrived on resource 0x%x\n",
+ eqe_type_str(eqe->type), eqe->type, rsn);
+ mlx5_rsc_event(dev, rsn, eqe->type);
+ break;
+
+ case MLX5_EVENT_TYPE_SRQ_RQ_LIMIT:
+ case MLX5_EVENT_TYPE_SRQ_CATAS_ERROR:
+ rsn = be32_to_cpu(eqe->data.qp_srq.qp_srq_n) & 0xffffff;
+ mlx5_core_dbg(dev, "SRQ event %s(%d): srqn 0x%x\n",
+ eqe_type_str(eqe->type), eqe->type, rsn);
+ mlx5_srq_event(dev, rsn, eqe->type);
+ break;
+
+ case MLX5_EVENT_TYPE_CMD:
+ mlx5_cmd_comp_handler(dev, be32_to_cpu(eqe->data.cmd.vector));
+ break;
+
+ case MLX5_EVENT_TYPE_PORT_CHANGE:
+ port = (eqe->data.port.port >> 4) & 0xf;
+ switch (eqe->sub_type) {
+ case MLX5_PORT_CHANGE_SUBTYPE_DOWN:
+ case MLX5_PORT_CHANGE_SUBTYPE_ACTIVE:
+ case MLX5_PORT_CHANGE_SUBTYPE_LID:
+ case MLX5_PORT_CHANGE_SUBTYPE_PKEY:
+ case MLX5_PORT_CHANGE_SUBTYPE_GUID:
+ case MLX5_PORT_CHANGE_SUBTYPE_CLIENT_REREG:
+ case MLX5_PORT_CHANGE_SUBTYPE_INITIALIZED:
+ if (dev->event)
+ dev->event(dev, port_subtype_event(eqe->sub_type),
+ (unsigned long)port);
+ break;
+ default:
+ mlx5_core_warn(dev, "Port event with unrecognized subtype: port %d, sub_type %d\n",
+ port, eqe->sub_type);
+ }
+ break;
+ case MLX5_EVENT_TYPE_CQ_ERROR:
+ cqn = be32_to_cpu(eqe->data.cq_err.cqn) & 0xffffff;
+ mlx5_core_warn(dev, "CQ error on CQN 0x%x, syndrome 0x%x\n",
+ cqn, eqe->data.cq_err.syndrome);
+ mlx5_cq_event(dev, cqn, eqe->type);
+ break;
+
+ case MLX5_EVENT_TYPE_PAGE_REQUEST:
+ {
+ u16 func_id = be16_to_cpu(eqe->data.req_pages.func_id);
+ s32 npages = be32_to_cpu(eqe->data.req_pages.num_pages);
+
+ mlx5_core_dbg(dev, "page request for func 0x%x, npages %d\n",
+ func_id, npages);
+ mlx5_core_req_pages_handler(dev, func_id, npages);
+ }
+ break;
+
+#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
+ case MLX5_EVENT_TYPE_PAGE_FAULT:
+ mlx5_eq_pagefault(dev, eqe);
+ break;
+#endif
+
+ default:
+ mlx5_core_warn(dev, "Unhandled event 0x%x on EQ 0x%x\n",
+ eqe->type, eq->eqn);
+ break;
+ }
+
+ ++eq->cons_index;
+ eqes_found = 1;
+ ++set_ci;
+
+ /* The HCA will think the queue has overflowed if we
+ * don't tell it we've been processing events. We
+ * create our EQs with MLX5_NUM_SPARE_EQE extra
+ * entries, so we must update our consumer index at
+ * least that often.
+ */
+ if (unlikely(set_ci >= MLX5_NUM_SPARE_EQE)) {
+ eq_update_ci(eq, 0);
+ set_ci = 0;
+ }
+ }
+
+ eq_update_ci(eq, 1);
+
+ return eqes_found;
+}
+
+static irqreturn_t mlx5_msix_handler(int irq, void *eq_ptr)
+{
+ struct mlx5_eq *eq = eq_ptr;
+ struct mlx5_core_dev *dev = eq->dev;
+
+ mlx5_eq_int(dev, eq);
+
+ /* MSI-X vectors always belong to us */
+ return IRQ_HANDLED;
+}
+
+static void init_eq_buf(struct mlx5_eq *eq)
+{
+ struct mlx5_eqe *eqe;
+ int i;
+
+ for (i = 0; i < eq->nent; i++) {
+ eqe = get_eqe(eq, i);
+ eqe->owner = MLX5_EQE_OWNER_INIT_VAL;
+ }
+}
+
+int mlx5_create_map_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq, u8 vecidx,
+ int nent, u64 mask, const char *name, struct mlx5_uar *uar)
+{
+ struct mlx5_priv *priv = &dev->priv;
+ struct mlx5_create_eq_mbox_in *in;
+ struct mlx5_create_eq_mbox_out out;
+ int err;
+ int inlen;
+
+ eq->nent = roundup_pow_of_two(nent + MLX5_NUM_SPARE_EQE);
+ eq->cons_index = 0;
+ err = mlx5_buf_alloc(dev, eq->nent * MLX5_EQE_SIZE, 2 * PAGE_SIZE,
+ &eq->buf);
+ if (err)
+ return err;
+
+ init_eq_buf(eq);
+
+ inlen = sizeof(*in) + sizeof(in->pas[0]) * eq->buf.npages;
+ in = mlx5_vzalloc(inlen);
+ if (!in) {
+ err = -ENOMEM;
+ goto err_buf;
+ }
+ memset(&out, 0, sizeof(out));
+
+ mlx5_fill_page_array(&eq->buf, in->pas);
+
+ in->hdr.opcode = cpu_to_be16(MLX5_CMD_OP_CREATE_EQ);
+ in->ctx.log_sz_usr_page = cpu_to_be32(ilog2(eq->nent) << 24 | uar->index);
+ in->ctx.intr = vecidx;
+ in->ctx.log_page_size = eq->buf.page_shift - MLX5_ADAPTER_PAGE_SHIFT;
+ in->events_mask = cpu_to_be64(mask);
+
+ err = mlx5_cmd_exec(dev, in, inlen, &out, sizeof(out));
+ if (err)
+ goto err_in;
+
+ if (out.hdr.status) {
+ err = mlx5_cmd_status_to_err(&out.hdr);
+ goto err_in;
+ }
+
+ snprintf(priv->irq_info[vecidx].name, MLX5_MAX_IRQ_NAME, "%s@pci:%s",
+ name, pci_name(dev->pdev));
+ eq->eqn = out.eq_number;
+ eq->irqn = vecidx;
+ eq->dev = dev;
+ eq->doorbell = uar->map + MLX5_EQ_DOORBEL_OFFSET;
+ err = request_irq(priv->msix_arr[vecidx].vector, mlx5_msix_handler, 0,
+ priv->irq_info[vecidx].name, eq);
+ if (err)
+ goto err_eq;
+
+ err = mlx5_debug_eq_add(dev, eq);
+ if (err)
+ goto err_irq;
+
+ /* EQs are created in ARMED state
+ */
+ eq_update_ci(eq, 1);
+
+ kvfree(in);
+ return 0;
+
+err_irq:
+ free_irq(priv->msix_arr[vecidx].vector, eq);
+
+err_eq:
+ mlx5_cmd_destroy_eq(dev, eq->eqn);
+
+err_in:
+ kvfree(in);
+
+err_buf:
+ mlx5_buf_free(dev, &eq->buf);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_create_map_eq);
+
+int mlx5_destroy_unmap_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq)
+{
+ struct mlx5_priv *priv = &dev->priv;
+ int err;
+
+ mlx5_debug_eq_remove(dev, eq);
+ free_irq(priv->msix_arr[eq->irqn].vector, eq);
+ err = mlx5_cmd_destroy_eq(dev, eq->eqn);
+ if (err)
+ mlx5_core_warn(dev, "failed to destroy a previously created eq: eqn %d\n",
+ eq->eqn);
+ synchronize_irq(priv->msix_arr[eq->irqn].vector);
+ mlx5_buf_free(dev, &eq->buf);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_destroy_unmap_eq);
+
+int mlx5_eq_init(struct mlx5_core_dev *dev)
+{
+ int err;
+
+ spin_lock_init(&dev->priv.eq_table.lock);
+
+ err = mlx5_eq_debugfs_init(dev);
+
+ return err;
+}
+
+
+void mlx5_eq_cleanup(struct mlx5_core_dev *dev)
+{
+ mlx5_eq_debugfs_cleanup(dev);
+}
+
+int mlx5_start_eqs(struct mlx5_core_dev *dev)
+{
+ struct mlx5_eq_table *table = &dev->priv.eq_table;
+ u64 async_event_mask = MLX5_ASYNC_EVENT_MASK;
+ int err;
+
+ if (MLX5_CAP_GEN(dev, pg))
+ async_event_mask |= (1ull << MLX5_EVENT_TYPE_PAGE_FAULT);
+
+ err = mlx5_create_map_eq(dev, &table->cmd_eq, MLX5_EQ_VEC_CMD,
+ MLX5_NUM_CMD_EQE, 1ull << MLX5_EVENT_TYPE_CMD,
+ "mlx5_cmd_eq", &dev->priv.uuari.uars[0]);
+ if (err) {
+ mlx5_core_warn(dev, "failed to create cmd EQ %d\n", err);
+ return err;
+ }
+
+ mlx5_cmd_use_events(dev);
+
+ err = mlx5_create_map_eq(dev, &table->async_eq, MLX5_EQ_VEC_ASYNC,
+ MLX5_NUM_ASYNC_EQE,
+ dev->aysnc_events_mask | async_event_mask,
+ "mlx5_async_eq", &dev->priv.uuari.uars[0]);
+ if (err) {
+ mlx5_core_warn(dev, "failed to create async EQ %d\n", err);
+ goto err1;
+ }
+
+ err = mlx5_create_map_eq(dev, &table->pages_eq,
+ MLX5_EQ_VEC_PAGES,
+ /* TODO: sriov max_vf + */ 1,
+ 1ull << MLX5_EVENT_TYPE_PAGE_REQUEST, "mlx5_pages_eq",
+ &dev->priv.uuari.uars[0]);
+ if (err) {
+ mlx5_core_warn(dev, "failed to create pages EQ %d\n", err);
+ goto err2;
+ }
+
+ return err;
+
+err2:
+ mlx5_destroy_unmap_eq(dev, &table->async_eq);
+
+err1:
+ mlx5_cmd_use_polling(dev);
+ mlx5_destroy_unmap_eq(dev, &table->cmd_eq);
+ return err;
+}
+
+int mlx5_stop_eqs(struct mlx5_core_dev *dev)
+{
+ struct mlx5_eq_table *table = &dev->priv.eq_table;
+ int err;
+
+ err = mlx5_destroy_unmap_eq(dev, &table->pages_eq);
+ if (err)
+ return err;
+
+ mlx5_destroy_unmap_eq(dev, &table->async_eq);
+ mlx5_cmd_use_polling(dev);
+
+ err = mlx5_destroy_unmap_eq(dev, &table->cmd_eq);
+ if (err)
+ mlx5_cmd_use_events(dev);
+
+ return err;
+}
+
+int mlx5_core_eq_query(struct mlx5_core_dev *dev, struct mlx5_eq *eq,
+ struct mlx5_query_eq_mbox_out *out, int outlen)
+{
+ struct mlx5_query_eq_mbox_in in;
+ int err;
+
+ memset(&in, 0, sizeof(in));
+ memset(out, 0, outlen);
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_QUERY_EQ);
+ in.eqn = eq->eqn;
+ err = mlx5_cmd_exec(dev, &in, sizeof(in), out, outlen);
+ if (err)
+ return err;
+
+ if (out->hdr.status)
+ err = mlx5_cmd_status_to_err(&out->hdr);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_eq_query);
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/flow_table.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/flow_table.c
new file mode 100644
index 0000000..d7d8eab
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/flow_table.c
@@ -0,0 +1,422 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies, Ltd. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx5_core.h"
+
+struct mlx5_ftg {
+ struct mlx5_flow_table_group g;
+ u32 id;
+ u32 start_ix;
+};
+
+struct mlx5_flow_table {
+ struct mlx5_core_dev *dev;
+ u8 level;
+ u8 type;
+ u32 id;
+ struct mutex mutex; /* sync bitmap alloc */
+ u16 num_groups;
+ struct mlx5_ftg *group;
+ unsigned long *bitmap;
+ u32 size;
+};
+
+static int mlx5_set_flow_entry_cmd(struct mlx5_flow_table *ft, u32 group_ix,
+ u32 flow_index, void *flow_context)
+{
+ u32 out[MLX5_ST_SZ_DW(set_fte_out)];
+ u32 *in;
+ void *in_flow_context;
+ int fcdls =
+ MLX5_GET(flow_context, flow_context, destination_list_size) *
+ MLX5_ST_SZ_BYTES(dest_format_struct);
+ int inlen = MLX5_ST_SZ_BYTES(set_fte_in) + fcdls;
+ int err;
+
+ in = mlx5_vzalloc(inlen);
+ if (!in) {
+ mlx5_core_warn(ft->dev, "failed to allocate inbox\n");
+ return -ENOMEM;
+ }
+
+ MLX5_SET(set_fte_in, in, table_type, ft->type);
+ MLX5_SET(set_fte_in, in, table_id, ft->id);
+ MLX5_SET(set_fte_in, in, flow_index, flow_index);
+ MLX5_SET(set_fte_in, in, opcode, MLX5_CMD_OP_SET_FLOW_TABLE_ENTRY);
+
+ in_flow_context = MLX5_ADDR_OF(set_fte_in, in, flow_context);
+ memcpy(in_flow_context, flow_context,
+ MLX5_ST_SZ_BYTES(flow_context) + fcdls);
+
+ MLX5_SET(flow_context, in_flow_context, group_id, ft->group[group_ix].id);
+
+ memset(out, 0, sizeof(out));
+ err = mlx5_cmd_exec_check_status(ft->dev, in, inlen, out,
+ sizeof(out));
+ kvfree(in);
+
+ return err;
+}
+
+static void mlx5_del_flow_entry_cmd(struct mlx5_flow_table *ft, u32 flow_index)
+{
+ u32 in[MLX5_ST_SZ_DW(delete_fte_in)];
+ u32 out[MLX5_ST_SZ_DW(delete_fte_out)];
+
+ memset(in, 0, sizeof(in));
+ memset(out, 0, sizeof(out));
+
+#define MLX5_SET_DFTEI(p, x, v) MLX5_SET(delete_fte_in, p, x, v)
+ MLX5_SET_DFTEI(in, table_type, ft->type);
+ MLX5_SET_DFTEI(in, table_id, ft->id);
+ MLX5_SET_DFTEI(in, flow_index, flow_index);
+ MLX5_SET_DFTEI(in, opcode, MLX5_CMD_OP_DELETE_FLOW_TABLE_ENTRY);
+
+ mlx5_cmd_exec_check_status(ft->dev, in, sizeof(in), out, sizeof(out));
+}
+
+static void mlx5_destroy_flow_group_cmd(struct mlx5_flow_table *ft, int i)
+{
+ u32 in[MLX5_ST_SZ_DW(destroy_flow_group_in)];
+ u32 out[MLX5_ST_SZ_DW(destroy_flow_group_out)];
+
+ memset(in, 0, sizeof(in));
+ memset(out, 0, sizeof(out));
+
+#define MLX5_SET_DFGI(p, x, v) MLX5_SET(destroy_flow_group_in, p, x, v)
+ MLX5_SET_DFGI(in, table_type, ft->type);
+ MLX5_SET_DFGI(in, table_id, ft->id);
+ MLX5_SET_DFGI(in, opcode, MLX5_CMD_OP_DESTROY_FLOW_GROUP);
+ MLX5_SET_DFGI(in, group_id, ft->group[i].id);
+ mlx5_cmd_exec_check_status(ft->dev, in, sizeof(in), out, sizeof(out));
+}
+
+static int mlx5_create_flow_group_cmd(struct mlx5_flow_table *ft, int i)
+{
+ u32 out[MLX5_ST_SZ_DW(create_flow_group_out)];
+ u32 *in;
+ void *in_match_criteria;
+ int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in);
+ struct mlx5_flow_table_group *g = &ft->group[i].g;
+ u32 start_ix = ft->group[i].start_ix;
+ u32 end_ix = start_ix + (1 << g->log_sz) - 1;
+ int err;
+
+ in = mlx5_vzalloc(inlen);
+ if (!in) {
+ mlx5_core_warn(ft->dev, "failed to allocate inbox\n");
+ return -ENOMEM;
+ }
+ in_match_criteria = MLX5_ADDR_OF(create_flow_group_in, in,
+ match_criteria);
+
+ memset(out, 0, sizeof(out));
+
+#define MLX5_SET_CFGI(p, x, v) MLX5_SET(create_flow_group_in, p, x, v)
+ MLX5_SET_CFGI(in, table_type, ft->type);
+ MLX5_SET_CFGI(in, table_id, ft->id);
+ MLX5_SET_CFGI(in, opcode, MLX5_CMD_OP_CREATE_FLOW_GROUP);
+ MLX5_SET_CFGI(in, start_flow_index, start_ix);
+ MLX5_SET_CFGI(in, end_flow_index, end_ix);
+ MLX5_SET_CFGI(in, match_criteria_enable, g->match_criteria_enable);
+
+ memcpy(in_match_criteria, g->match_criteria,
+ MLX5_ST_SZ_BYTES(fte_match_param));
+
+ err = mlx5_cmd_exec_check_status(ft->dev, in, inlen, out,
+ sizeof(out));
+ if (!err)
+ ft->group[i].id = MLX5_GET(create_flow_group_out, out,
+ group_id);
+
+ kvfree(in);
+
+ return err;
+}
+
+static void mlx5_destroy_flow_table_groups(struct mlx5_flow_table *ft)
+{
+ int i;
+
+ for (i = 0; i < ft->num_groups; i++)
+ mlx5_destroy_flow_group_cmd(ft, i);
+}
+
+static int mlx5_create_flow_table_groups(struct mlx5_flow_table *ft)
+{
+ int err;
+ int i;
+
+ for (i = 0; i < ft->num_groups; i++) {
+ err = mlx5_create_flow_group_cmd(ft, i);
+ if (err)
+ goto err_destroy_flow_table_groups;
+ }
+
+ return 0;
+
+err_destroy_flow_table_groups:
+ for (i--; i >= 0; i--)
+ mlx5_destroy_flow_group_cmd(ft, i);
+
+ return err;
+}
+
+static int mlx5_create_flow_table_cmd(struct mlx5_flow_table *ft)
+{
+ u32 in[MLX5_ST_SZ_DW(create_flow_table_in)];
+ u32 out[MLX5_ST_SZ_DW(create_flow_table_out)];
+ int err;
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(create_flow_table_in, in, table_type, ft->type);
+ MLX5_SET(create_flow_table_in, in, level, ft->level);
+ MLX5_SET(create_flow_table_in, in, log_size, order_base_2(ft->size));
+
+ MLX5_SET(create_flow_table_in, in, opcode,
+ MLX5_CMD_OP_CREATE_FLOW_TABLE);
+
+ memset(out, 0, sizeof(out));
+ err = mlx5_cmd_exec_check_status(ft->dev, in, sizeof(in), out,
+ sizeof(out));
+ if (err)
+ return err;
+
+ ft->id = MLX5_GET(create_flow_table_out, out, table_id);
+
+ return 0;
+}
+
+static void mlx5_destroy_flow_table_cmd(struct mlx5_flow_table *ft)
+{
+ u32 in[MLX5_ST_SZ_DW(destroy_flow_table_in)];
+ u32 out[MLX5_ST_SZ_DW(destroy_flow_table_out)];
+
+ memset(in, 0, sizeof(in));
+ memset(out, 0, sizeof(out));
+
+#define MLX5_SET_DFTI(p, x, v) MLX5_SET(destroy_flow_table_in, p, x, v)
+ MLX5_SET_DFTI(in, table_type, ft->type);
+ MLX5_SET_DFTI(in, table_id, ft->id);
+ MLX5_SET_DFTI(in, opcode, MLX5_CMD_OP_DESTROY_FLOW_TABLE);
+
+ mlx5_cmd_exec_check_status(ft->dev, in, sizeof(in), out, sizeof(out));
+}
+
+static int mlx5_find_group(struct mlx5_flow_table *ft, u8 match_criteria_enable,
+ u32 *match_criteria, int *group_ix)
+{
+ void *mc_outer = MLX5_ADDR_OF(fte_match_param, match_criteria,
+ outer_headers);
+ void *mc_misc = MLX5_ADDR_OF(fte_match_param, match_criteria,
+ misc_parameters);
+ void *mc_inner = MLX5_ADDR_OF(fte_match_param, match_criteria,
+ inner_headers);
+ int mc_outer_sz = MLX5_ST_SZ_BYTES(fte_match_set_lyr_2_4);
+ int mc_misc_sz = MLX5_ST_SZ_BYTES(fte_match_set_misc);
+ int mc_inner_sz = MLX5_ST_SZ_BYTES(fte_match_set_lyr_2_4);
+ int i;
+
+ for (i = 0; i < ft->num_groups; i++) {
+ struct mlx5_flow_table_group *g = &ft->group[i].g;
+ void *gmc_outer = MLX5_ADDR_OF(fte_match_param,
+ g->match_criteria,
+ outer_headers);
+ void *gmc_misc = MLX5_ADDR_OF(fte_match_param,
+ g->match_criteria,
+ misc_parameters);
+ void *gmc_inner = MLX5_ADDR_OF(fte_match_param,
+ g->match_criteria,
+ inner_headers);
+
+ if (g->match_criteria_enable != match_criteria_enable)
+ continue;
+
+ if (match_criteria_enable & MLX5_MATCH_OUTER_HEADERS)
+ if (memcmp(mc_outer, gmc_outer, mc_outer_sz))
+ continue;
+
+ if (match_criteria_enable & MLX5_MATCH_MISC_PARAMETERS)
+ if (memcmp(mc_misc, gmc_misc, mc_misc_sz))
+ continue;
+
+ if (match_criteria_enable & MLX5_MATCH_INNER_HEADERS)
+ if (memcmp(mc_inner, gmc_inner, mc_inner_sz))
+ continue;
+
+ *group_ix = i;
+ return 0;
+ }
+
+ return -EINVAL;
+}
+
+static int alloc_flow_index(struct mlx5_flow_table *ft, int group_ix, u32 *ix)
+{
+ struct mlx5_ftg *g = &ft->group[group_ix];
+ int err = 0;
+
+ mutex_lock(&ft->mutex);
+
+ *ix = find_next_zero_bit(ft->bitmap, ft->size, g->start_ix);
+ if (*ix >= (g->start_ix + (1 << g->g.log_sz)))
+ err = -ENOSPC;
+ else
+ __set_bit(*ix, ft->bitmap);
+
+ mutex_unlock(&ft->mutex);
+
+ return err;
+}
+
+static void mlx5_free_flow_index(struct mlx5_flow_table *ft, u32 ix)
+{
+ __clear_bit(ix, ft->bitmap);
+}
+
+int mlx5_add_flow_table_entry(void *flow_table, u8 match_criteria_enable,
+ void *match_criteria, void *flow_context,
+ u32 *flow_index)
+{
+ struct mlx5_flow_table *ft = flow_table;
+ int group_ix;
+ int err;
+
+ err = mlx5_find_group(ft, match_criteria_enable, match_criteria,
+ &group_ix);
+ if (err) {
+ mlx5_core_warn(ft->dev, "mlx5_find_group failed\n");
+ return err;
+ }
+
+ err = alloc_flow_index(ft, group_ix, flow_index);
+ if (err) {
+ mlx5_core_warn(ft->dev, "alloc_flow_index failed\n");
+ return err;
+ }
+
+ return mlx5_set_flow_entry_cmd(ft, group_ix, *flow_index, flow_context);
+}
+EXPORT_SYMBOL(mlx5_add_flow_table_entry);
+
+void mlx5_del_flow_table_entry(void *flow_table, u32 flow_index)
+{
+ struct mlx5_flow_table *ft = flow_table;
+
+ mlx5_del_flow_entry_cmd(ft, flow_index);
+ mlx5_free_flow_index(ft, flow_index);
+}
+EXPORT_SYMBOL(mlx5_del_flow_table_entry);
+
+void *mlx5_create_flow_table(struct mlx5_core_dev *dev, u8 level, u8 table_type,
+ u16 num_groups,
+ struct mlx5_flow_table_group *group)
+{
+ struct mlx5_flow_table *ft;
+ u32 start_ix = 0;
+ u32 ft_size = 0;
+ void *gr;
+ void *bm;
+ int err;
+ int i;
+
+ for (i = 0; i < num_groups; i++)
+ ft_size += (1 << group[i].log_sz);
+
+ ft = kzalloc(sizeof(*ft), GFP_KERNEL);
+ gr = kcalloc(num_groups, sizeof(struct mlx5_ftg), GFP_KERNEL);
+ bm = kcalloc(BITS_TO_LONGS(ft_size), sizeof(uintptr_t), GFP_KERNEL);
+ if (!ft || !gr || !bm)
+ goto err_free_ft;
+
+ ft->group = gr;
+ ft->bitmap = bm;
+ ft->num_groups = num_groups;
+ ft->level = level;
+ ft->type = table_type;
+ ft->size = ft_size;
+ ft->dev = dev;
+ mutex_init(&ft->mutex);
+
+ for (i = 0; i < ft->num_groups; i++) {
+ memcpy(&ft->group[i].g, &group[i], sizeof(*group));
+ ft->group[i].start_ix = start_ix;
+ start_ix += 1 << group[i].log_sz;
+ }
+
+ err = mlx5_create_flow_table_cmd(ft);
+ if (err)
+ goto err_free_ft;
+
+ err = mlx5_create_flow_table_groups(ft);
+ if (err)
+ goto err_destroy_flow_table_cmd;
+
+ return ft;
+
+err_destroy_flow_table_cmd:
+ mlx5_destroy_flow_table_cmd(ft);
+
+err_free_ft:
+ mlx5_core_warn(dev, "failed to alloc flow table\n");
+ kfree(bm);
+ kfree(gr);
+ kfree(ft);
+
+ return NULL;
+}
+EXPORT_SYMBOL(mlx5_create_flow_table);
+
+void mlx5_destroy_flow_table(void *flow_table)
+{
+ struct mlx5_flow_table *ft = flow_table;
+
+ mlx5_destroy_flow_table_groups(ft);
+ mlx5_destroy_flow_table_cmd(ft);
+ kfree(ft->bitmap);
+ kfree(ft->group);
+ kfree(ft);
+}
+EXPORT_SYMBOL(mlx5_destroy_flow_table);
+
+u32 mlx5_get_flow_table_id(void *flow_table)
+{
+ struct mlx5_flow_table *ft = flow_table;
+
+ return ft->id;
+}
+EXPORT_SYMBOL(mlx5_get_flow_table_id);
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/fw.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/fw.c
new file mode 100644
index 0000000..e54998f
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/fw.c
@@ -0,0 +1,199 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx5_core.h"
+
+int mlx5_cmd_query_adapter(struct mlx5_core_dev *dev, u32 *out, int outlen)
+{
+ u32 in[MLX5_ST_SZ_DW(query_adapter_in)];
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(query_adapter_in, in, opcode, MLX5_CMD_OP_QUERY_ADAPTER);
+
+ return mlx5_cmd_exec_check_status(dev, in, sizeof(in), out, outlen);
+
+}
+
+int mlx5_query_board_id(struct mlx5_core_dev *dev)
+{
+ u32 *out;
+ int outlen = MLX5_ST_SZ_BYTES(query_adapter_out);
+ int err;
+
+ out = kzalloc(outlen, GFP_KERNEL);
+ if (!out)
+ return -ENOMEM;
+
+ err = mlx5_cmd_query_adapter(dev, out, outlen);
+ if (err)
+ goto out;
+
+ memcpy(dev->board_id,
+ MLX5_ADDR_OF(query_adapter_out, out,
+ query_adapter_struct.vsd_contd_psid),
+ MLX5_FLD_SZ_BYTES(query_adapter_out,
+ query_adapter_struct.vsd_contd_psid));
+
+out:
+ kfree(out);
+ return err;
+}
+
+int mlx5_core_query_vendor_id(struct mlx5_core_dev *mdev, u32 *vendor_id)
+{
+ u32 *out;
+ int outlen = MLX5_ST_SZ_BYTES(query_adapter_out);
+ int err;
+
+ out = kzalloc(outlen, GFP_KERNEL);
+ if (!out)
+ return -ENOMEM;
+
+ err = mlx5_cmd_query_adapter(mdev, out, outlen);
+ if (err)
+ goto out;
+
+ *vendor_id = MLX5_GET(query_adapter_out, out,
+ query_adapter_struct.ieee_vendor_id);
+out:
+ kfree(out);
+
+ return err;
+}
+EXPORT_SYMBOL(mlx5_core_query_vendor_id);
+
+int mlx5_query_hca_caps(struct mlx5_core_dev *dev)
+{
+ int err;
+
+ err = mlx5_core_get_caps(dev, MLX5_CAP_GENERAL, HCA_CAP_OPMOD_GET_CUR);
+ if (err)
+ return err;
+
+ err = mlx5_core_get_caps(dev, MLX5_CAP_GENERAL, HCA_CAP_OPMOD_GET_MAX);
+ if (err)
+ return err;
+
+ if (MLX5_CAP_GEN(dev, eth_net_offloads)) {
+ err = mlx5_core_get_caps(dev, MLX5_CAP_ETHERNET_OFFLOADS,
+ HCA_CAP_OPMOD_GET_CUR);
+ if (err)
+ return err;
+ err = mlx5_core_get_caps(dev, MLX5_CAP_ETHERNET_OFFLOADS,
+ HCA_CAP_OPMOD_GET_MAX);
+ if (err)
+ return err;
+ }
+
+ if (MLX5_CAP_GEN(dev, pg)) {
+ err = mlx5_core_get_caps(dev, MLX5_CAP_ODP,
+ HCA_CAP_OPMOD_GET_CUR);
+ if (err)
+ return err;
+ err = mlx5_core_get_caps(dev, MLX5_CAP_ODP,
+ HCA_CAP_OPMOD_GET_MAX);
+ if (err)
+ return err;
+ }
+
+ if (MLX5_CAP_GEN(dev, atomic)) {
+ err = mlx5_core_get_caps(dev, MLX5_CAP_ATOMIC,
+ HCA_CAP_OPMOD_GET_CUR);
+ if (err)
+ return err;
+ err = mlx5_core_get_caps(dev, MLX5_CAP_ATOMIC,
+ HCA_CAP_OPMOD_GET_MAX);
+ if (err)
+ return err;
+ }
+
+ if (MLX5_CAP_GEN(dev, roce)) {
+ err = mlx5_core_get_caps(dev, MLX5_CAP_ROCE,
+ HCA_CAP_OPMOD_GET_CUR);
+ if (err)
+ return err;
+ err = mlx5_core_get_caps(dev, MLX5_CAP_ROCE,
+ HCA_CAP_OPMOD_GET_MAX);
+ if (err)
+ return err;
+ }
+
+ if (MLX5_CAP_GEN(dev, nic_flow_table)) {
+ err = mlx5_core_get_caps(dev, MLX5_CAP_FLOW_TABLE,
+ HCA_CAP_OPMOD_GET_CUR);
+ if (err)
+ return err;
+ err = mlx5_core_get_caps(dev, MLX5_CAP_FLOW_TABLE,
+ HCA_CAP_OPMOD_GET_MAX);
+ if (err)
+ return err;
+ }
+
+ err = mlx5_core_query_special_contexts(dev);
+ if (err)
+ return err;
+
+ return 0;
+}
+
+int mlx5_cmd_init_hca(struct mlx5_core_dev *dev)
+{
+ u32 in[MLX5_ST_SZ_DW(init_hca_in)];
+ u32 out[MLX5_ST_SZ_DW(init_hca_out)];
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(init_hca_in, in, opcode, MLX5_CMD_OP_INIT_HCA);
+
+ memset(out, 0, sizeof(out));
+ return mlx5_cmd_exec_check_status(dev, in, sizeof(in),
+ out, sizeof(out));
+}
+
+int mlx5_cmd_teardown_hca(struct mlx5_core_dev *dev)
+{
+ u32 in[MLX5_ST_SZ_DW(teardown_hca_in)];
+ u32 out[MLX5_ST_SZ_DW(teardown_hca_out)];
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(teardown_hca_in, in, opcode, MLX5_CMD_OP_TEARDOWN_HCA);
+
+ memset(out, 0, sizeof(out));
+ return mlx5_cmd_exec_check_status(dev, in, sizeof(in),
+ out, sizeof(out));
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/health.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/health.c
new file mode 100644
index 0000000..a4d918e
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/health.c
@@ -0,0 +1,229 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx5_core.h"
+
+enum {
+ MLX5_HEALTH_POLL_INTERVAL = 2 * HZ,
+ MAX_MISSES = 3,
+};
+
+enum {
+ MLX5_HEALTH_SYNDR_FW_ERR = 0x1,
+ MLX5_HEALTH_SYNDR_IRISC_ERR = 0x7,
+ MLX5_HEALTH_SYNDR_HW_UNRECOVERABLE_ERR = 0x8,
+ MLX5_HEALTH_SYNDR_CRC_ERR = 0x9,
+ MLX5_HEALTH_SYNDR_FETCH_PCI_ERR = 0xa,
+ MLX5_HEALTH_SYNDR_HW_FTL_ERR = 0xb,
+ MLX5_HEALTH_SYNDR_ASYNC_EQ_OVERRUN_ERR = 0xc,
+ MLX5_HEALTH_SYNDR_EQ_ERR = 0xd,
+ MLX5_HEALTH_SYNDR_EQ_INV = 0xe,
+ MLX5_HEALTH_SYNDR_FFSER_ERR = 0xf,
+ MLX5_HEALTH_SYNDR_HIGH_TEMP = 0x10,
+};
+
+static DEFINE_SPINLOCK(health_lock);
+static LIST_HEAD(health_list);
+static struct work_struct health_work;
+
+void mlx5_enter_error_state(struct mlx5_core_dev *dev)
+{
+ unsigned long flags, vector = 0;
+
+ if (dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
+ return;
+
+ mlx5_core_err(dev, "start\n");
+ if (pci_channel_offline(dev->pdev))
+ dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR;
+
+ mlx5_core_event(dev, MLX5_DEV_EVENT_SYS_ERROR, 0);
+ spin_lock_irqsave(&dev->cmd.alloc_lock, flags);
+ vector = ~dev->cmd.bitmask & ~vector & ((1 << dev->cmd.log_sz) - 1);
+ spin_unlock_irqrestore(&dev->cmd.alloc_lock, flags);
+ mlx5_cmd_comp_handler(dev, vector);
+ mlx5_core_err(dev, "end\n");
+}
+
+static void health_care(struct work_struct *work)
+{
+ struct mlx5_core_health *health, *n;
+ struct mlx5_core_dev *dev;
+ struct mlx5_priv *priv;
+ LIST_HEAD(tlist);
+
+ spin_lock_irq(&health_lock);
+ list_splice_init(&health_list, &tlist);
+
+ spin_unlock_irq(&health_lock);
+
+ list_for_each_entry_safe(health, n, &tlist, list) {
+ priv = container_of(health, struct mlx5_priv, health);
+ dev = container_of(priv, struct mlx5_core_dev, priv);
+ dev_err(&dev->pdev->dev, "handling bad device here\n");
+ /* nothing yet */
+ spin_lock_irq(&health_lock);
+ list_del_init(&health->list);
+ spin_unlock_irq(&health_lock);
+ }
+}
+
+static const char *hsynd_str(u8 synd)
+{
+ switch (synd) {
+ case MLX5_HEALTH_SYNDR_FW_ERR:
+ return "firmware internal error";
+ case MLX5_HEALTH_SYNDR_IRISC_ERR:
+ return "irisc not responding";
+ case MLX5_HEALTH_SYNDR_HW_UNRECOVERABLE_ERR:
+ return "unrecoverable hardware error";
+ case MLX5_HEALTH_SYNDR_CRC_ERR:
+ return "firmware CRC error";
+ case MLX5_HEALTH_SYNDR_FETCH_PCI_ERR:
+ return "ICM fetch PCI error";
+ case MLX5_HEALTH_SYNDR_HW_FTL_ERR:
+ return "HW fatal error";
+ case MLX5_HEALTH_SYNDR_ASYNC_EQ_OVERRUN_ERR:
+ return "async EQ buffer overrun";
+ case MLX5_HEALTH_SYNDR_EQ_ERR:
+ return "EQ error";
+ case MLX5_HEALTH_SYNDR_EQ_INV:
+ return "Invalid EQ referenced";
+ case MLX5_HEALTH_SYNDR_FFSER_ERR:
+ return "FFSER error";
+ case MLX5_HEALTH_SYNDR_HIGH_TEMP:
+ return "High temperature";
+ default:
+ return "unrecognized error";
+ }
+}
+
+static u16 read_be16(__be16 __iomem *p)
+{
+ return swab16(readw((__force u16 __iomem *) p));
+}
+
+static u32 read_be32(__be32 __iomem *p)
+{
+ return swab32(readl((__force u32 __iomem *) p));
+}
+
+static void print_health_info(struct mlx5_core_dev *dev)
+{
+ struct mlx5_core_health *health = &dev->priv.health;
+ struct health_buffer __iomem *h = health->health;
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(h->assert_var); i++)
+ dev_err(&dev->pdev->dev, "assert_var[%d] 0x%08x\n", i, read_be32(h->assert_var + i));
+
+ dev_err(&dev->pdev->dev, "assert_exit_ptr 0x%08x\n", read_be32(&h->assert_exit_ptr));
+ dev_err(&dev->pdev->dev, "assert_callra 0x%08x\n", read_be32(&h->assert_callra));
+ dev_err(&dev->pdev->dev, "fw_ver 0x%08x\n", read_be32(&h->fw_ver));
+ dev_err(&dev->pdev->dev, "hw_id 0x%08x\n", read_be32(&h->hw_id));
+ dev_err(&dev->pdev->dev, "irisc_index %d\n", readb(&h->irisc_index));
+ dev_err(&dev->pdev->dev, "synd 0x%x: %s\n", readb(&h->synd), hsynd_str(readb(&h->synd)));
+ dev_err(&dev->pdev->dev, "ext_synd 0x%04x\n", read_be16(&h->ext_synd));
+
+}
+
+static void poll_health(unsigned long data)
+{
+ struct mlx5_core_dev *dev = (struct mlx5_core_dev *)data;
+ struct mlx5_core_health *health = &dev->priv.health;
+ unsigned long next;
+ u32 count;
+
+ count = ioread32be(health->health_counter);
+ if (count == health->prev)
+ ++health->miss_counter;
+ else
+ health->miss_counter = 0;
+
+ health->prev = count;
+ if (health->miss_counter == MAX_MISSES) {
+ dev_err(&dev->pdev->dev, "device's health compromised\n");
+ print_health_info(dev);
+ spin_lock_irq(&health_lock);
+ list_add_tail(&health->list, &health_list);
+ spin_unlock_irq(&health_lock);
+
+ queue_work(mlx5_core_wq, &health_work);
+ } else {
+ get_random_bytes(&next, sizeof(next));
+ next %= HZ;
+ next += jiffies + MLX5_HEALTH_POLL_INTERVAL;
+ mod_timer(&health->timer, next);
+ }
+}
+
+void mlx5_start_health_poll(struct mlx5_core_dev *dev)
+{
+ struct mlx5_core_health *health = &dev->priv.health;
+
+ INIT_LIST_HEAD(&health->list);
+ init_timer(&health->timer);
+ health->health = &dev->iseg->health;
+ health->health_counter = &dev->iseg->health_counter;
+
+ health->timer.data = (unsigned long)dev;
+ health->timer.function = poll_health;
+ health->timer.expires = round_jiffies(jiffies + MLX5_HEALTH_POLL_INTERVAL);
+ add_timer(&health->timer);
+}
+
+void mlx5_stop_health_poll(struct mlx5_core_dev *dev)
+{
+ struct mlx5_core_health *health = &dev->priv.health;
+
+ del_timer_sync(&health->timer);
+
+ spin_lock_irq(&health_lock);
+ if (!list_empty(&health->list))
+ list_del_init(&health->list);
+ spin_unlock_irq(&health_lock);
+
+ flush_workqueue(mlx5_core_wq);
+}
+
+void mlx5_health_cleanup(void)
+{
+}
+
+void __init mlx5_health_init(void)
+{
+ INIT_WORK(&health_work, health_care);
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/mad.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/mad.c
new file mode 100644
index 0000000..e03d33a
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/mad.c
@@ -0,0 +1,78 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx5_core.h"
+
+int mlx5_core_mad_ifc(struct mlx5_core_dev *dev, void *inb, void *outb,
+ u16 opmod, u8 port)
+{
+ struct mlx5_mad_ifc_mbox_in *in = NULL;
+ struct mlx5_mad_ifc_mbox_out *out = NULL;
+ int err;
+
+ in = kzalloc(sizeof(*in), GFP_KERNEL);
+ if (!in)
+ return -ENOMEM;
+
+ out = kzalloc(sizeof(*out), GFP_KERNEL);
+ if (!out) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ in->hdr.opcode = cpu_to_be16(MLX5_CMD_OP_MAD_IFC);
+ in->hdr.opmod = cpu_to_be16(opmod);
+ in->port = port;
+
+ memcpy(in->data, inb, sizeof(in->data));
+
+ err = mlx5_cmd_exec(dev, in, sizeof(*in), out, sizeof(*out));
+ if (err)
+ goto out;
+
+ if (out->hdr.status) {
+ err = mlx5_cmd_status_to_err(&out->hdr);
+ goto out;
+ }
+
+ memcpy(outb, out->data, sizeof(out->data));
+
+out:
+ kfree(out);
+ kfree(in);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_mad_ifc);
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/main.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/main.c
new file mode 100644
index 0000000..92fd490
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/main.c
@@ -0,0 +1,1583 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx5_core.h"
+
+MODULE_AUTHOR("Eli Cohen <eli@mellanox.com>");
+MODULE_DESCRIPTION("Mellanox Connect-IB, ConnectX-4 core driver");
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_VERSION(DRIVER_VERSION);
+
+int mlx5_core_debug_mask;
+module_param_named(debug_mask, mlx5_core_debug_mask, int, 0644);
+MODULE_PARM_DESC(debug_mask, "debug mask: 1 = dump cmd data, 2 = dump cmd exec time, 3 = both. Default=0");
+
+#define MLX5_DEFAULT_PROF 2
+static int prof_sel = MLX5_DEFAULT_PROF;
+module_param_named(prof_sel, prof_sel, int, 0444);
+MODULE_PARM_DESC(prof_sel, "profile selector. Valid range 0 - 2");
+
+struct workqueue_struct *mlx5_core_wq;
+static LIST_HEAD(intf_list);
+static LIST_HEAD(dev_list);
+static DEFINE_MUTEX(intf_mutex);
+
+struct mlx5_device_context {
+ struct list_head list;
+ struct mlx5_interface *intf;
+ void *context;
+};
+
+static struct mlx5_profile profile[] = {
+ [0] = {
+ .mask = 0,
+ },
+ [1] = {
+ .mask = MLX5_PROF_MASK_QP_SIZE |
+ MLX5_PROF_MASK_DCT,
+ .log_max_qp = 12,
+ .dct_enable = 1,
+ },
+ [2] = {
+ .mask = MLX5_PROF_MASK_QP_SIZE |
+ MLX5_PROF_MASK_MR_CACHE |
+ MLX5_PROF_MASK_DCT,
+ .log_max_qp = 18,
+ .dct_enable = 1,
+ .mr_cache[0] = {
+ .size = 500,
+ .limit = 250
+ },
+ .mr_cache[1] = {
+ .size = 500,
+ .limit = 250
+ },
+ .mr_cache[2] = {
+ .size = 500,
+ .limit = 250
+ },
+ .mr_cache[3] = {
+ .size = 500,
+ .limit = 250
+ },
+ .mr_cache[4] = {
+ .size = 500,
+ .limit = 250
+ },
+ .mr_cache[5] = {
+ .size = 500,
+ .limit = 250
+ },
+ .mr_cache[6] = {
+ .size = 500,
+ .limit = 250
+ },
+ .mr_cache[7] = {
+ .size = 500,
+ .limit = 250
+ },
+ .mr_cache[8] = {
+ .size = 500,
+ .limit = 250
+ },
+ .mr_cache[9] = {
+ .size = 500,
+ .limit = 250
+ },
+ .mr_cache[10] = {
+ .size = 500,
+ .limit = 250
+ },
+ .mr_cache[11] = {
+ .size = 500,
+ .limit = 250
+ },
+ .mr_cache[12] = {
+ .size = 64,
+ .limit = 32
+ },
+ .mr_cache[13] = {
+ .size = 32,
+ .limit = 16
+ },
+ .mr_cache[14] = {
+ .size = 16,
+ .limit = 8
+ },
+ },
+};
+
+#define FW_INIT_TIMEOUT_MILI 2000
+#define FW_INIT_WAIT_MS 2
+
+static int wait_fw_init(struct mlx5_core_dev *dev, u32 max_wait_mili)
+{
+ unsigned long end = jiffies + msecs_to_jiffies(max_wait_mili);
+ int err = 0;
+
+ while (fw_initializing(dev)) {
+ if (!mlx5_core_is_pf(dev))
+ return -EAGAIN;
+
+ if (time_after(jiffies, end)) {
+ err = -EBUSY;
+ break;
+ }
+ msleep(FW_INIT_WAIT_MS);
+ }
+
+ return err;
+}
+
+static int set_dma_caps(struct pci_dev *pdev)
+{
+ int err;
+
+ err = pci_set_dma_mask(pdev, DMA_BIT_MASK(64));
+ if (err) {
+ dev_warn(&pdev->dev, "Warning: couldn't set 64-bit PCI DMA mask\n");
+ err = pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
+ if (err) {
+ dev_err(&pdev->dev, "Can't set PCI DMA mask, aborting\n");
+ return err;
+ }
+ }
+
+ err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
+ if (err) {
+ dev_warn(&pdev->dev,
+ "Warning: couldn't set 64-bit consistent PCI DMA mask\n");
+ err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
+ if (err) {
+ dev_err(&pdev->dev,
+ "Can't set consistent PCI DMA mask, aborting\n");
+ return err;
+ }
+ }
+
+ dma_set_max_seg_size(&pdev->dev, 2u * 1024 * 1024 * 1024);
+ return err;
+}
+
+static int mlx5_pci_enable_device(struct mlx5_core_dev *dev)
+{
+ struct pci_dev *pdev = dev->pdev;
+ int err = 0;
+
+ mutex_lock(&dev->pci_status_mutex);
+ if (dev->pci_status == MLX5_PCI_STATUS_DISABLED) {
+ err = pci_enable_device(pdev);
+ if (!err)
+ dev->pci_status = MLX5_PCI_STATUS_ENABLED;
+ }
+ mutex_unlock(&dev->pci_status_mutex);
+
+ return err;
+}
+
+static void mlx5_pci_disable_device(struct mlx5_core_dev *dev)
+{
+ struct pci_dev *pdev = dev->pdev;
+
+ mutex_lock(&dev->pci_status_mutex);
+ if (dev->pci_status == MLX5_PCI_STATUS_ENABLED) {
+ pci_disable_device(pdev);
+ dev->pci_status = MLX5_PCI_STATUS_DISABLED;
+ }
+ mutex_unlock(&dev->pci_status_mutex);
+}
+
+static int request_bar(struct pci_dev *pdev)
+{
+ int err = 0;
+
+ if (!(pci_resource_flags(pdev, 0) & IORESOURCE_MEM)) {
+ dev_err(&pdev->dev, "Missing registers BAR, aborting\n");
+ return -ENODEV;
+ }
+
+ err = pci_request_regions(pdev, DRIVER_NAME);
+ if (err)
+ dev_err(&pdev->dev, "Couldn't get PCI resources, aborting\n");
+
+ return err;
+}
+
+static void release_bar(struct pci_dev *pdev)
+{
+ pci_release_regions(pdev);
+}
+
+enum {
+ PPC_MAX_VECTORS = 32,
+};
+
+static int mlx5_enable_msix(struct mlx5_core_dev *dev)
+{
+ struct mlx5_eq_table *table = &dev->priv.eq_table;
+ int num_eqs = 1 << MLX5_CAP_GEN(dev, log_max_eq);
+ struct mlx5_priv *priv = &dev->priv;
+ int nvec;
+#ifndef HAVE_PCI_ENABLE_MSIX_RANGE
+ int err;
+#endif
+ int i;
+
+ nvec = MLX5_CAP_GEN(dev, num_ports) * num_online_cpus() +
+ MLX5_EQ_VEC_COMP_BASE;
+ nvec = min_t(int, nvec, num_eqs);
+#ifdef CONFIG_PPC
+ nvec = min_t(int, nvec, PPC_MAX_VECTORS);
+#endif
+ if (nvec <= MLX5_EQ_VEC_COMP_BASE)
+ return -ENOMEM;
+
+ priv->msix_arr = kzalloc(nvec * sizeof(*priv->msix_arr), GFP_KERNEL);
+ priv->irq_info = kzalloc(nvec * sizeof(*priv->irq_info), GFP_KERNEL);
+ if (!priv->msix_arr || !priv->irq_info)
+ goto err_free_msix;
+
+ for (i = 0; i < nvec; i++)
+ priv->msix_arr[i].entry = i;
+
+#ifdef HAVE_PCI_ENABLE_MSIX_RANGE
+ nvec = pci_enable_msix_range(dev->pdev, priv->msix_arr,
+ MLX5_EQ_VEC_COMP_BASE + 1, nvec);
+ if (nvec < 0)
+ return nvec;
+
+ table->num_comp_vectors = nvec - MLX5_EQ_VEC_COMP_BASE;
+#else
+retry:
+ table->num_comp_vectors = nvec - MLX5_EQ_VEC_COMP_BASE;
+ err = pci_enable_msix(dev->pdev, priv->msix_arr, nvec);
+ if (err <= 0) {
+ return err;
+ } else if (err > 2) {
+ nvec = err;
+ goto retry;
+ }
+ mlx5_core_dbg(dev, "received %d MSI vectors out of %d requested\n", err, nvec);
+#endif
+
+ return 0;
+
+err_free_msix:
+ kfree(priv->irq_info);
+ kfree(priv->msix_arr);
+ return -ENOMEM;
+}
+
+static void mlx5_disable_msix(struct mlx5_core_dev *dev)
+{
+ struct mlx5_priv *priv = &dev->priv;
+
+ pci_disable_msix(dev->pdev);
+ kfree(priv->irq_info);
+ kfree(priv->msix_arr);
+}
+
+struct mlx5_reg_host_endianess {
+ u8 he;
+ u8 rsvd[15];
+};
+
+
+#define CAP_MASK(pos, size) ((u64)((1 << (size)) - 1) << (pos))
+
+enum {
+ MLX5_CAP_BITS_RW_MASK = CAP_MASK(MLX5_CAP_OFF_CMDIF_CSUM, 2) |
+ MLX5_DEV_CAP_FLAG_DCT,
+};
+
+static u16 to_fw_pkey_sz(u32 size)
+{
+ switch (size) {
+ case 128:
+ return 0;
+ case 256:
+ return 1;
+ case 512:
+ return 2;
+ case 1024:
+ return 3;
+ case 2048:
+ return 4;
+ case 4096:
+ return 5;
+ default:
+ pr_warn("invalid pkey table size %u\n", size);
+ return 0;
+ }
+}
+
+int mlx5_core_query_special_contexts(struct mlx5_core_dev *dev)
+{
+ u32 in[MLX5_ST_SZ_DW(query_special_contexts_in)];
+ u32 out[MLX5_ST_SZ_DW(query_special_contexts_out)];
+ int err;
+
+ memset(in, 0, sizeof(in));
+ memset(out, 0, sizeof(out));
+
+ MLX5_SET(query_special_contexts_in, in, opcode,
+ MLX5_CMD_OP_QUERY_SPECIAL_CONTEXTS);
+ err = mlx5_cmd_exec_check_status(dev, in, sizeof(in), out,
+ sizeof(out));
+ if (err)
+ return err;
+
+ dev->special_contexts.resd_lkey = MLX5_GET(query_special_contexts_out,
+ out, resd_lkey);
+
+ return err;
+}
+
+int mlx5_core_get_caps(struct mlx5_core_dev *dev, enum mlx5_cap_type cap_type,
+ enum mlx5_cap_mode cap_mode)
+{
+ u8 in[MLX5_ST_SZ_BYTES(query_hca_cap_in)];
+ int out_sz = MLX5_ST_SZ_BYTES(query_hca_cap_out);
+ void *out, *hca_caps;
+ u16 opmod = (cap_type << 1) | (cap_mode & 0x01);
+ int err;
+
+ memset(in, 0, sizeof(in));
+ out = kzalloc(out_sz, GFP_KERNEL);
+ if (!out)
+ return -ENOMEM;
+
+ MLX5_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP);
+ MLX5_SET(query_hca_cap_in, in, op_mod, opmod);
+ err = mlx5_cmd_exec(dev, in, sizeof(in), out, out_sz);
+ if (err)
+ goto query_ex;
+
+ err = mlx5_cmd_status_to_err_v2(out);
+ if (err) {
+ mlx5_core_warn(dev,
+ "QUERY_HCA_CAP : type(%x) opmode(%x) Failed(%d)\n",
+ cap_type, cap_mode, err);
+ goto query_ex;
+ }
+
+ hca_caps = MLX5_ADDR_OF(query_hca_cap_out, out, capability);
+
+ switch (cap_mode) {
+ case HCA_CAP_OPMOD_GET_MAX:
+ memcpy(dev->hca_caps_max[cap_type], hca_caps,
+ MLX5_UN_SZ_BYTES(hca_cap_union));
+ break;
+ case HCA_CAP_OPMOD_GET_CUR:
+ memcpy(dev->hca_caps_cur[cap_type], hca_caps,
+ MLX5_UN_SZ_BYTES(hca_cap_union));
+ break;
+ default:
+ mlx5_core_warn(dev,
+ "Tried to query dev cap type(%x) with wrong opmode(%x)\n",
+ cap_type, cap_mode);
+ err = -EINVAL;
+ break;
+ }
+query_ex:
+ kfree(out);
+ return err;
+}
+
+static int set_caps(struct mlx5_core_dev *dev, void *in, int in_sz)
+{
+ u32 out[MLX5_ST_SZ_DW(set_hca_cap_out)];
+ int err;
+
+ memset(out, 0, sizeof(out));
+
+ MLX5_SET(set_hca_cap_in, in, opcode, MLX5_CMD_OP_SET_HCA_CAP);
+ err = mlx5_cmd_exec(dev, in, in_sz, out, sizeof(out));
+ if (err)
+ return err;
+
+ err = mlx5_cmd_status_to_err_v2(out);
+
+ return err;
+}
+
+static int handle_hca_cap(struct mlx5_core_dev *dev)
+{
+ void *set_ctx = NULL;
+ struct mlx5_profile *prof = dev->profile;
+ int err = -ENOMEM;
+ int set_sz = MLX5_ST_SZ_BYTES(set_hca_cap_in);
+ void *set_hca_cap;
+
+ set_ctx = kzalloc(set_sz, GFP_KERNEL);
+ if (!set_ctx)
+ goto query_ex;
+
+ err = mlx5_core_get_caps(dev, MLX5_CAP_GENERAL, HCA_CAP_OPMOD_GET_MAX);
+ if (err)
+ goto query_ex;
+
+ err = mlx5_core_get_caps(dev, MLX5_CAP_GENERAL, HCA_CAP_OPMOD_GET_CUR);
+ if (err)
+ goto query_ex;
+
+ set_hca_cap = MLX5_ADDR_OF(set_hca_cap_in, set_ctx,
+ capability);
+ memcpy(set_hca_cap, dev->hca_caps_cur[MLX5_CAP_GENERAL],
+ MLX5_ST_SZ_BYTES(cmd_hca_cap));
+
+ mlx5_core_dbg(dev, "Current Pkey table size %d Setting new size %d\n",
+ mlx5_to_sw_pkey_sz(MLX5_CAP_GEN(dev, pkey_table_size)),
+ 128);
+ /* we limit the size of the pkey table to 128 entries for now */
+ MLX5_SET(cmd_hca_cap, set_hca_cap, pkey_table_size,
+ to_fw_pkey_sz(128));
+
+ if (prof->mask & MLX5_PROF_MASK_QP_SIZE)
+ MLX5_SET(cmd_hca_cap, set_hca_cap, log_max_qp,
+ prof->log_max_qp);
+
+ /* disable cmdif checksum */
+ MLX5_SET(cmd_hca_cap, set_hca_cap, cmdif_checksum, 0);
+
+ MLX5_SET(cmd_hca_cap, set_hca_cap, log_uar_page_sz, PAGE_SHIFT - 12);
+
+ if (prof->mask & MLX5_PROF_MASK_DCT) {
+ if (prof->dct_enable) {
+ if (MLX5_CAP_GEN_MAX(dev, dct)) {
+ MLX5_SET(cmd_hca_cap, set_hca_cap, dct, 1);
+ dev->aysnc_events_mask |= (1ull << MLX5_EVENT_TYPE_DCT_DRAINED) |
+ (1ull << MLX5_EVENT_TYPE_DCT_KEY_VIOLATION);
+ }
+ } else {
+ MLX5_SET(cmd_hca_cap, set_hca_cap, dct, 0);
+ }
+ }
+
+ err = set_caps(dev, set_ctx, set_sz);
+
+query_ex:
+ kfree(set_ctx);
+ return err;
+}
+
+static int set_hca_ctrl(struct mlx5_core_dev *dev)
+{
+ struct mlx5_reg_host_endianess he_in;
+ struct mlx5_reg_host_endianess he_out;
+ int err;
+
+ if (!mlx5_core_is_pf(dev))
+ return 0;
+
+ memset(&he_in, 0, sizeof(he_in));
+ he_in.he = MLX5_SET_HOST_ENDIANNESS;
+ err = mlx5_core_access_reg(dev, &he_in, sizeof(he_in),
+ &he_out, sizeof(he_out),
+ MLX5_REG_HOST_ENDIANNESS, 0, 1);
+ return err;
+}
+
+int mlx5_core_enable_hca(struct mlx5_core_dev *dev, u16 func_id)
+{
+ u32 in[MLX5_ST_SZ_DW(enable_hca_in)];
+ u32 out[MLX5_ST_SZ_DW(enable_hca_out)];
+
+ memset(in, 0, sizeof(in));
+ MLX5_SET(enable_hca_in, in, opcode, MLX5_CMD_OP_ENABLE_HCA);
+ MLX5_SET(enable_hca_in, in, function_id, func_id);
+ memset(out, 0, sizeof(out));
+
+ return mlx5_cmd_exec_check_status(dev, in, sizeof(in),
+ out, sizeof(out));
+}
+
+int mlx5_core_disable_hca(struct mlx5_core_dev *dev, u16 func_id)
+{
+ u32 in[MLX5_ST_SZ_DW(disable_hca_in)];
+ u32 out[MLX5_ST_SZ_DW(disable_hca_out)];
+
+ memset(in, 0, sizeof(in));
+ MLX5_SET(disable_hca_in, in, opcode, MLX5_CMD_OP_DISABLE_HCA);
+ MLX5_SET(disable_hca_in, in, function_id, func_id);
+ memset(out, 0, sizeof(out));
+
+ return mlx5_cmd_exec_check_status(dev, in, sizeof(in), out, sizeof(out));
+}
+
+static int mlx5_core_set_issi(struct mlx5_core_dev *dev)
+{
+ u32 query_in[MLX5_ST_SZ_DW(query_issi_in)];
+ u32 query_out[MLX5_ST_SZ_DW(query_issi_out)];
+ u32 set_in[MLX5_ST_SZ_DW(set_issi_in)];
+ u32 set_out[MLX5_ST_SZ_DW(set_issi_out)];
+ int err;
+
+ memset(query_in, 0, sizeof(query_in));
+ memset(query_out, 0, sizeof(query_out));
+
+ MLX5_SET(query_issi_in, query_in, opcode, MLX5_CMD_OP_QUERY_ISSI);
+
+ err = mlx5_cmd_exec_check_status(dev, query_in, sizeof(query_in),
+ query_out, sizeof(query_out));
+ if (err) {
+ dev->issi = 0;
+ return 0;
+ }
+
+ dev->supported_issi_mask = MLX5_GET(query_issi_out, query_out, supported_issi_dw0);
+
+ if (dev->supported_issi_mask & (1 << 1)) {
+ memset(set_in, 0, sizeof(set_in));
+ memset(set_out, 0, sizeof(set_out));
+
+ MLX5_SET(set_issi_in, set_in, opcode, MLX5_CMD_OP_SET_ISSI);
+ MLX5_SET(set_issi_in, set_in, current_issi, 1);
+
+ err = mlx5_cmd_exec_check_status(dev, set_in, sizeof(set_in),
+ set_out, sizeof(set_out));
+ if (err) {
+ mlx5_core_warn(dev, "failed to set ISSI=1\n");
+ return err;
+ }
+ dev->issi = 1;
+ return 0;
+ } else if ((dev->supported_issi_mask & (1 << 0)) ||
+ (!dev->supported_issi_mask)) {
+ dev->issi = 0;
+ return 0;
+ }
+
+ return -ENOTSUPP;
+}
+
+static void mlx5_irq_set_affinity_hint(struct mlx5_core_dev *mdev, int i)
+{
+ struct mlx5_priv *priv = &mdev->priv;
+ struct msix_entry *msix = priv->msix_arr;
+ int irq = msix[i + MLX5_EQ_VEC_COMP_BASE].vector;
+ int numa_node = priv->numa_node;
+
+ if (numa_node == -1)
+ numa_node = first_online_node;
+
+ if (!zalloc_cpumask_var(&priv->irq_info[i].mask, GFP_KERNEL)) {
+ mlx5_core_warn(mdev, "zalloc_cpumask_var failed\n");
+ return;
+ }
+
+ if (cpumask_set_cpu_local_first(i, numa_node, priv->irq_info[i].mask)) {
+ mlx5_core_warn(mdev, "cpumask_set_cpu_local_first failed\n");
+ goto err_clear_mask;
+ }
+
+ if (irq_set_affinity_hint(irq, priv->irq_info[i].mask)) {
+ mlx5_core_warn(mdev, "irq_set_affinity_hint failed, irq 0x%.4x\n",
+ irq);
+ goto err_clear_mask;
+ }
+
+ return;
+
+err_clear_mask:
+ free_cpumask_var(priv->irq_info[i].mask);
+#ifdef CONFIG_CPUMASK_OFFSTACK
+ priv->irq_info[i].mask = NULL;
+#endif
+}
+
+static void mlx5_irq_clear_affinity_hint(struct mlx5_core_dev *mdev, int i)
+{
+ struct mlx5_priv *priv = &mdev->priv;
+ struct msix_entry *msix = priv->msix_arr;
+ int irq = msix[i + MLX5_EQ_VEC_COMP_BASE].vector;
+#ifdef CONFIG_CPUMASK_OFFSTACK
+ cpumask_var_t mask = priv->irq_info[i].mask;
+
+ if (!priv->irq_info[i].mask)
+ return;
+#endif
+
+ irq_set_affinity_hint(irq, NULL);
+#ifdef CONFIG_CPUMASK_OFFSTACK
+ free_cpumask_var(mask);
+#else
+ free_cpumask_var(priv->irq_info[i].mask);
+#endif
+}
+
+static void mlx5_irq_set_affinity_hints(struct mlx5_core_dev *mdev)
+{
+ int i;
+
+ for (i = 0; i < mdev->priv.eq_table.num_comp_vectors; i++)
+ mlx5_irq_set_affinity_hint(mdev, i);
+}
+
+static void mlx5_irq_clear_affinity_hints(struct mlx5_core_dev *mdev)
+{
+ int i;
+
+ for (i = 0; i < mdev->priv.eq_table.num_comp_vectors; i++) {
+ mlx5_irq_clear_affinity_hint(mdev, i);
+ }
+}
+
+int mlx5_vector2eqn(struct mlx5_core_dev *dev, int vector, int *eqn, int *irqn)
+{
+ struct mlx5_eq_table *table = &dev->priv.eq_table;
+ struct mlx5_eq *eq, *n;
+ int err = -ENOENT;
+
+ spin_lock(&table->lock);
+ list_for_each_entry_safe(eq, n, &table->comp_eqs_list, list) {
+ if (eq->index == vector) {
+ *eqn = eq->eqn;
+ *irqn = eq->irqn;
+ err = 0;
+ break;
+ }
+ }
+ spin_unlock(&table->lock);
+
+ return err;
+}
+EXPORT_SYMBOL(mlx5_vector2eqn);
+
+int mlx5_rename_eq(struct mlx5_core_dev *dev, int eq_ix, char *name)
+{
+ struct mlx5_priv *priv = &dev->priv;
+ struct mlx5_eq_table *table = &priv->eq_table;
+ struct mlx5_eq *eq;
+ int err = -ENOENT;
+
+ spin_lock(&table->lock);
+ list_for_each_entry(eq, &table->comp_eqs_list, list) {
+ if (eq->index == eq_ix) {
+ int irq_ix = eq_ix + MLX5_EQ_VEC_COMP_BASE;
+
+ snprintf(priv->irq_info[irq_ix].name,
+ MLX5_MAX_IRQ_NAME, "%s-%d",
+ name, eq_ix);
+
+ err = 0;
+ break;
+ }
+ }
+ spin_unlock(&table->lock);
+
+ return err;
+}
+
+static void free_comp_eqs(struct mlx5_core_dev *dev)
+{
+ struct mlx5_eq_table *table = &dev->priv.eq_table;
+ struct mlx5_eq *eq, *n;
+
+ spin_lock(&table->lock);
+ list_for_each_entry_safe(eq, n, &table->comp_eqs_list, list) {
+ list_del(&eq->list);
+ spin_unlock(&table->lock);
+ if (mlx5_destroy_unmap_eq(dev, eq))
+ mlx5_core_warn(dev, "failed to destroy EQ 0x%x\n",
+ eq->eqn);
+ kfree(eq);
+ spin_lock(&table->lock);
+ }
+ spin_unlock(&table->lock);
+}
+
+static int alloc_comp_eqs(struct mlx5_core_dev *dev)
+{
+ struct mlx5_eq_table *table = &dev->priv.eq_table;
+ char name[MLX5_MAX_IRQ_NAME];
+ struct mlx5_eq *eq;
+ int ncomp_vec;
+ int nent;
+ int err;
+ int i;
+
+ INIT_LIST_HEAD(&table->comp_eqs_list);
+ ncomp_vec = table->num_comp_vectors;
+ nent = MLX5_COMP_EQ_SIZE;
+ for (i = 0; i < ncomp_vec; i++) {
+ eq = kzalloc(sizeof(*eq), GFP_KERNEL);
+ if (!eq) {
+ err = -ENOMEM;
+ goto clean;
+ }
+
+ snprintf(name, MLX5_MAX_IRQ_NAME, "mlx5_comp%d", i);
+ err = mlx5_create_map_eq(dev, eq,
+ i + MLX5_EQ_VEC_COMP_BASE, nent, 0,
+ name, &dev->priv.uuari.uars[0]);
+ if (err) {
+ kfree(eq);
+ goto clean;
+ }
+ mlx5_core_dbg(dev, "allocated completion EQN %d\n", eq->eqn);
+ eq->index = i;
+ spin_lock(&table->lock);
+ list_add_tail(&eq->list, &table->comp_eqs_list);
+ spin_unlock(&table->lock);
+ }
+
+ return 0;
+
+clean:
+ free_comp_eqs(dev);
+ return err;
+}
+
+static void mlx5_add_device(struct mlx5_interface *intf, struct mlx5_priv *priv)
+{
+ struct mlx5_device_context *dev_ctx;
+ struct mlx5_core_dev *dev = container_of(priv, struct mlx5_core_dev, priv);
+
+ dev_ctx = kmalloc(sizeof(*dev_ctx), GFP_KERNEL);
+ if (!dev_ctx) {
+ pr_warn("mlx5_add_device: alloc context failed\n");
+ return;
+ }
+
+ dev_ctx->intf = intf;
+ dev_ctx->context = intf->add(dev);
+
+ if (dev_ctx->context) {
+ spin_lock_irq(&priv->ctx_lock);
+ list_add_tail(&dev_ctx->list, &priv->ctx_list);
+ spin_unlock_irq(&priv->ctx_lock);
+ } else {
+ kfree(dev_ctx);
+ }
+}
+
+static void mlx5_remove_device(struct mlx5_interface *intf, struct mlx5_priv *priv)
+{
+ struct mlx5_device_context *dev_ctx;
+ struct mlx5_core_dev *dev = container_of(priv, struct mlx5_core_dev, priv);
+
+ list_for_each_entry(dev_ctx, &priv->ctx_list, list)
+ if (dev_ctx->intf == intf) {
+ spin_lock_irq(&priv->ctx_lock);
+ list_del(&dev_ctx->list);
+ spin_unlock_irq(&priv->ctx_lock);
+
+ intf->remove(dev, dev_ctx->context);
+ kfree(dev_ctx);
+ return;
+ }
+}
+static int mlx5_register_device(struct mlx5_core_dev *dev)
+{
+ struct mlx5_priv *priv = &dev->priv;
+ struct mlx5_interface *intf;
+
+ mutex_lock(&intf_mutex);
+ list_add_tail(&priv->dev_list, &dev_list);
+ list_for_each_entry(intf, &intf_list, list)
+ mlx5_add_device(intf, priv);
+ mutex_unlock(&intf_mutex);
+
+ return 0;
+}
+static void mlx5_unregister_device(struct mlx5_core_dev *dev)
+{
+ struct mlx5_priv *priv = &dev->priv;
+ struct mlx5_interface *intf;
+
+ mutex_lock(&intf_mutex);
+ list_for_each_entry(intf, &intf_list, list)
+ mlx5_remove_device(intf, priv);
+ list_del(&priv->dev_list);
+ mutex_unlock(&intf_mutex);
+}
+
+int mlx5_register_interface(struct mlx5_interface *intf)
+{
+ struct mlx5_priv *priv;
+
+ if (!intf->add || !intf->remove)
+ return -EINVAL;
+
+ mutex_lock(&intf_mutex);
+ list_add_tail(&intf->list, &intf_list);
+ list_for_each_entry(priv, &dev_list, dev_list)
+ mlx5_add_device(intf, priv);
+ mutex_unlock(&intf_mutex);
+
+ return 0;
+}
+EXPORT_SYMBOL(mlx5_register_interface);
+
+void mlx5_unregister_interface(struct mlx5_interface *intf)
+{
+ struct mlx5_priv *priv;
+
+ mutex_lock(&intf_mutex);
+ list_for_each_entry(priv, &dev_list, dev_list)
+ mlx5_remove_device(intf, priv);
+ list_del(&intf->list);
+ mutex_unlock(&intf_mutex);
+}
+EXPORT_SYMBOL(mlx5_unregister_interface);
+
+void *mlx5_get_protocol_dev(struct mlx5_core_dev *mdev, int protocol)
+{
+ struct mlx5_priv *priv = &mdev->priv;
+ struct mlx5_device_context *dev_ctx;
+ unsigned long flags;
+ void *result = NULL;
+
+ spin_lock_irqsave(&priv->ctx_lock, flags);
+
+ list_for_each_entry(dev_ctx, &mdev->priv.ctx_list, list)
+ if ((dev_ctx->intf->protocol == protocol) &&
+ dev_ctx->intf->get_dev) {
+ result = dev_ctx->intf->get_dev(dev_ctx->context);
+ break;
+ }
+
+ spin_unlock_irqrestore(&priv->ctx_lock, flags);
+
+ return result;
+}
+EXPORT_SYMBOL(mlx5_get_protocol_dev);
+
+static int mlx5_pci_init(struct mlx5_core_dev *dev, struct mlx5_priv *priv)
+{
+ struct pci_dev *pdev = dev->pdev;
+ int err = 0;
+
+ pci_set_drvdata(dev->pdev, dev);
+ strncpy(priv->name, dev_name(&pdev->dev), MLX5_MAX_NAME_LEN);
+ priv->name[MLX5_MAX_NAME_LEN - 1] = 0;
+
+ mutex_init(&priv->pgdir_mutex);
+ INIT_LIST_HEAD(&priv->pgdir_list);
+ spin_lock_init(&priv->mkey_lock);
+
+ mutex_init(&priv->alloc_mutex);
+
+ priv->numa_node = dev_to_node(&dev->pdev->dev);
+
+ priv->dbg_root = debugfs_create_dir(dev_name(&pdev->dev), mlx5_debugfs_root);
+ if (!priv->dbg_root)
+ return -ENOMEM;
+
+ err = mlx5_pci_enable_device(dev);
+ if (err) {
+ dev_err(&pdev->dev, "Cannot enable PCI device, aborting\n");
+ goto err_dbg;
+ }
+
+ err = request_bar(pdev);
+ if (err) {
+ dev_err(&pdev->dev, "error requesting BARs, aborting\n");
+ goto err_disable;
+ }
+
+ pci_set_master(pdev);
+
+ err = set_dma_caps(pdev);
+ if (err) {
+ dev_err(&pdev->dev, "Failed setting DMA capabilities mask, aborting\n");
+ goto err_clr_master;
+ }
+
+ dev->iseg_base = pci_resource_start(dev->pdev, 0);
+ dev->iseg = ioremap(dev->iseg_base, sizeof(*dev->iseg));
+ if (!dev->iseg) {
+ err = -ENOMEM;
+ dev_err(&pdev->dev, "Failed mapping initialization segment, aborting\n");
+ goto err_clr_master;
+ }
+
+ return 0;
+
+err_clr_master:
+ pci_clear_master(dev->pdev);
+ release_bar(dev->pdev);
+err_disable:
+ mlx5_pci_disable_device(dev);
+
+err_dbg:
+ debugfs_remove(priv->dbg_root);
+ return err;
+}
+
+static void mlx5_pci_close(struct mlx5_core_dev *dev, struct mlx5_priv *priv)
+{
+ iounmap(dev->iseg);
+ pci_clear_master(dev->pdev);
+ release_bar(dev->pdev);
+ mlx5_pci_disable_device(dev);
+ debugfs_remove(priv->dbg_root);
+}
+
+/* TODO: Calling io_mapping_create_wc spoils the IB user BF mapping as WC.
+ * Fix this before enabling this function.
+static int map_bf_area(struct mlx5_core_dev *dev)
+{
+ resource_size_t bf_start = pci_resource_start(dev->pdev, 0);
+ resource_size_t bf_len = pci_resource_len(dev->pdev, 0);
+
+ dev->priv.bf_mapping = io_mapping_create_wc(bf_start, bf_len);
+
+ return dev->priv.bf_mapping ? 0 : -ENOMEM;
+}
+*/
+
+static void unmap_bf_area(struct mlx5_core_dev *dev)
+{
+ if (dev->priv.bf_mapping)
+ io_mapping_free(dev->priv.bf_mapping);
+}
+
+#define MLX5_IB_MOD "mlx5_ib"
+static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv)
+{
+ struct pci_dev *pdev = dev->pdev;
+ int err;
+
+ err = wait_fw_init(dev, FW_INIT_TIMEOUT_MILI);
+ if (err) {
+ if (err == -EAGAIN) {
+ priv->sriov.vf_partial_init = 1;
+ return 0;
+ }
+
+ dev_err(&dev->pdev->dev, "Firmware over %d MS in initializing state, aborting\n",
+ FW_INIT_TIMEOUT_MILI);
+ return err;
+ }
+
+ mutex_lock(&dev->intf_state_mutex);
+ if (dev->interface_state == MLX5_INTERFACE_STATE_UP) {
+ dev_warn(&dev->pdev->dev, "%s: interface is up, NOP\n",
+ __func__);
+ goto out;
+ }
+
+ dev_info(&pdev->dev, "firmware version: %d.%d.%d\n", fw_rev_maj(dev),
+ fw_rev_min(dev), fw_rev_sub(dev));
+
+ /* On load, clear any previous indication of an internal error;
+ * the device is up.
+ */
+ dev->state = MLX5_DEVICE_STATE_UP;
+
+ err = mlx5_cmd_init(dev);
+ if (err) {
+ dev_err(&pdev->dev, "Failed initializing command interface, aborting\n");
+ goto out_err;
+ }
+
+ mlx5_pagealloc_init(dev);
+
+ err = mlx5_core_enable_hca(dev, 0);
+ if (err) {
+ dev_err(&pdev->dev, "enable hca failed\n");
+ goto err_pagealloc_cleanup;
+ }
+
+ err = mlx5_core_set_issi(dev);
+ if (err) {
+ dev_err(&pdev->dev, "failed to set issi\n");
+ goto err_disable_hca;
+ }
+
+ err = mlx5_satisfy_startup_pages(dev, 1);
+ if (err) {
+ dev_err(&pdev->dev, "failed to allocate boot pages\n");
+ goto err_disable_hca;
+ }
+
+ err = mlx5_update_guids(dev);
+ if (err)
+ dev_err(&pdev->dev, "failed to update guids. continue with default...\n");
+
+ err = set_hca_ctrl(dev);
+ if (err) {
+ dev_err(&pdev->dev, "set_hca_ctrl failed\n");
+ goto reclaim_boot_pages;
+ }
+
+ err = handle_hca_cap(dev);
+ if (err) {
+ dev_err(&pdev->dev, "handle_hca_cap failed\n");
+ goto reclaim_boot_pages;
+ }
+
+ err = mlx5_satisfy_startup_pages(dev, 0);
+ if (err) {
+ dev_err(&pdev->dev, "failed to allocate init pages\n");
+ goto reclaim_boot_pages;
+ }
+
+ err = mlx5_pagealloc_start(dev);
+ if (err) {
+ dev_err(&pdev->dev, "mlx5_pagealloc_start failed\n");
+ goto reclaim_boot_pages;
+ }
+
+ err = mlx5_cmd_init_hca(dev);
+ if (err) {
+ dev_err(&pdev->dev, "init hca failed\n");
+ goto err_pagealloc_stop;
+ }
+
+ mlx5_start_health_poll(dev);
+
+ err = mlx5_query_hca_caps(dev);
+ if (err) {
+ dev_err(&pdev->dev, "query hca failed\n");
+ goto err_stop_poll;
+ }
+
+ err = mlx5_query_board_id(dev);
+ if (err) {
+ dev_err(&pdev->dev, "query board id failed\n");
+ goto err_stop_poll;
+ }
+
+ err = mlx5_enable_msix(dev);
+ if (err) {
+ dev_err(&pdev->dev, "enable msix failed\n");
+ goto err_stop_poll;
+ }
+
+ err = mlx5_eq_init(dev);
+ if (err) {
+ dev_err(&pdev->dev, "failed to initialize eq\n");
+ goto disable_msix;
+ }
+
+ err = mlx5_alloc_uuars(dev, &priv->uuari);
+ if (err) {
+ dev_err(&pdev->dev, "Failed allocating uar, aborting\n");
+ goto err_eq_cleanup;
+ }
+
+ err = mlx5_start_eqs(dev);
+ if (err) {
+ dev_err(&pdev->dev, "Failed to start pages and async EQs\n");
+ goto err_free_uar;
+ }
+
+ err = alloc_comp_eqs(dev);
+ if (err) {
+ dev_err(&pdev->dev, "Failed to alloc completion EQs\n");
+ goto err_stop_eqs;
+ }
+
+ /*
+ * if (map_bf_area(dev))
+ * dev_err(&pdev->dev, "Failed to map blue flame area\n");
+ * TODO: Open this mapping when map_bf_area is fixed
+ */
+
+ mlx5_irq_set_affinity_hints(dev);
+ MLX5_INIT_DOORBELL_LOCK(&priv->cq_uar_lock);
+
+ mlx5_init_cq_table(dev);
+ mlx5_init_qp_table(dev);
+ mlx5_init_srq_table(dev);
+ mlx5_init_mr_table(dev);
+ mlx5_init_dct_table(dev);
+ err = mlx5_sriov_init(dev);
+ if (err) {
+ dev_err(&pdev->dev, "sriov init failed %d\n", err);
+ goto err_reg_dev;
+ }
+
+ err = mlx5_register_device(dev);
+ if (err) {
+ dev_err(&pdev->dev, "mlx5_register_device failed %d\n", err);
+ goto err_sriov;
+ }
+
+ err = request_module_nowait(MLX5_IB_MOD);
+ if (err)
+ pr_info("failed request module on %s\n", MLX5_IB_MOD);
+
+ dev->interface_state = MLX5_INTERFACE_STATE_UP;
+out:
+ mutex_unlock(&dev->intf_state_mutex);
+
+
+ return 0;
+
+err_sriov:
+ if (mlx5_sriov_cleanup(dev))
+ dev_err(&dev->pdev->dev, "sriov cleanup failed\n");
+err_reg_dev:
+ mlx5_cleanup_dct_table(dev);
+ mlx5_cleanup_mr_table(dev);
+ mlx5_cleanup_srq_table(dev);
+ mlx5_cleanup_qp_table(dev);
+ mlx5_cleanup_cq_table(dev);
+ mlx5_irq_clear_affinity_hints(dev);
+ free_comp_eqs(dev);
+
+err_stop_eqs:
+ mlx5_stop_eqs(dev);
+
+err_free_uar:
+ mlx5_free_uuars(dev, &priv->uuari);
+
+err_eq_cleanup:
+ mlx5_eq_cleanup(dev);
+
+disable_msix:
+ mlx5_disable_msix(dev);
+
+err_stop_poll:
+ mlx5_stop_health_poll(dev);
+ if (mlx5_cmd_teardown_hca(dev)) {
+ dev_err(&dev->pdev->dev, "tear_down_hca failed, skip cleanup\n");
+ goto out_err;
+ }
+
+err_pagealloc_stop:
+ mlx5_pagealloc_stop(dev);
+
+reclaim_boot_pages:
+ mlx5_reclaim_startup_pages(dev);
+
+err_disable_hca:
+ mlx5_core_disable_hca(dev, 0);
+
+err_pagealloc_cleanup:
+ mlx5_pagealloc_cleanup(dev);
+ mlx5_cmd_cleanup(dev);
+
+out_err:
+ dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR;
+ mutex_unlock(&dev->intf_state_mutex);
+
+ return err;
+}
+
+static int mlx5_unload_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv)
+{
+ int err = 0;
+
+ if (priv->sriov.vf_partial_init)
+ return 0;
+
+ err = mlx5_sriov_cleanup(dev);
+ if (err) {
+ dev_warn(&dev->pdev->dev, "%s: sriov cleanup failed - abort\n",
+ __func__);
+ return err;
+ }
+ mutex_lock(&dev->intf_state_mutex);
+ if (dev->interface_state == MLX5_INTERFACE_STATE_DOWN) {
+ dev_warn(&dev->pdev->dev, "%s: interface is down, NOP\n",
+ __func__);
+ goto out;
+ }
+
+ mlx5_unregister_device(dev);
+ mlx5_cleanup_dct_table(dev);
+ mlx5_cleanup_mr_table(dev);
+ mlx5_cleanup_srq_table(dev);
+ mlx5_cleanup_qp_table(dev);
+ mlx5_cleanup_cq_table(dev);
+ mlx5_irq_clear_affinity_hints(dev);
+ unmap_bf_area(dev);
+ free_comp_eqs(dev);
+ mlx5_stop_eqs(dev);
+ mlx5_free_uuars(dev, &priv->uuari);
+ mlx5_eq_cleanup(dev);
+ mlx5_disable_msix(dev);
+ mlx5_stop_health_poll(dev);
+ err = mlx5_cmd_teardown_hca(dev);
+ if (err) {
+ dev_err(&dev->pdev->dev, "tear_down_hca failed, skip cleanup\n");
+ goto out;
+ }
+ mlx5_pagealloc_stop(dev);
+ mlx5_reclaim_startup_pages(dev);
+ mlx5_core_disable_hca(dev, 0);
+ mlx5_pagealloc_cleanup(dev);
+ mlx5_cmd_cleanup(dev);
+
+out:
+ dev->interface_state = MLX5_INTERFACE_STATE_DOWN;
+ mutex_unlock(&dev->intf_state_mutex);
+ return err;
+}
+
+void mlx5_core_event(struct mlx5_core_dev *dev, enum mlx5_dev_event event,
+ unsigned long param)
+{
+ struct mlx5_priv *priv = &dev->priv;
+ struct mlx5_device_context *dev_ctx;
+ unsigned long flags;
+
+ spin_lock_irqsave(&priv->ctx_lock, flags);
+
+ list_for_each_entry(dev_ctx, &priv->ctx_list, list)
+ if (dev_ctx->intf->event)
+ dev_ctx->intf->event(dev, dev_ctx->context, event, param);
+
+ spin_unlock_irqrestore(&priv->ctx_lock, flags);
+}
+
+struct mlx5_core_event_handler {
+ void (*event)(struct mlx5_core_dev *dev,
+ enum mlx5_dev_event event,
+ void *data);
+};
+
+
+static int init_one(struct pci_dev *pdev,
+ const struct pci_device_id *id)
+{
+ struct mlx5_core_dev *dev;
+ struct mlx5_priv *priv;
+ int err;
+
+ dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+ if (!dev) {
+ dev_err(&pdev->dev, "kzalloc failed\n");
+ return -ENOMEM;
+ }
+ priv = &dev->priv;
+ priv->pci_dev_data = id->driver_data;
+
+ pci_set_drvdata(pdev, dev);
+
+ if (prof_sel < 0 || prof_sel >= ARRAY_SIZE(profile)) {
+ pr_warn("selected profile out of range, selecting default (%d)\n",
+ MLX5_DEFAULT_PROF);
+ prof_sel = MLX5_DEFAULT_PROF;
+ }
+ dev->profile = &profile[prof_sel];
+ dev->pdev = pdev;
+ dev->event = mlx5_core_event;
+
+ INIT_LIST_HEAD(&priv->ctx_list);
+ spin_lock_init(&priv->ctx_lock);
+ mutex_init(&dev->pci_status_mutex);
+ mutex_init(&dev->intf_state_mutex);
+ err = mlx5_pci_init(dev, priv);
+ if (err) {
+ dev_err(&pdev->dev, "mlx5_pci_init failed with error code %d\n", err);
+ goto clean_dev;
+ }
+
+ err = mlx5_load_one(dev, priv);
+ if (err) {
+ dev_err(&pdev->dev, "mlx5_load_one failed with error code %d\n", err);
+ goto close_pci;
+ }
+
+ pci_save_state(pdev);
+
+ return 0;
+
+close_pci:
+ mlx5_pci_close(dev, priv);
+clean_dev:
+ pci_set_drvdata(pdev, NULL);
+ kfree(dev);
+
+ return err;
+}
+
+static void remove_one(struct pci_dev *pdev)
+{
+ struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
+ struct mlx5_priv *priv = &dev->priv;
+
+ if (mlx5_unload_one(dev, priv)) {
+ dev_err(&dev->pdev->dev, "mlx5_unload_one failed\n");
+ return;
+ }
+ mlx5_pci_close(dev, priv);
+ pci_set_drvdata(pdev, NULL);
+ kfree(dev);
+}
+
+#ifdef CONFIG_PM
+static int suspend(struct device *device)
+{
+ struct pci_dev *pdev = to_pci_dev(device);
+ struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
+ struct mlx5_priv *priv = &dev->priv;
+ int err;
+
+ dev_info(&pdev->dev, "suspend was called\n");
+
+ err = mlx5_unload_one(dev, priv);
+ if (err) {
+ dev_err(&pdev->dev, "mlx5_unload_one failed with error code: %d\n", err);
+ return err;
+ }
+
+ err = pci_save_state(pdev);
+ if (err) {
+ dev_err(&pdev->dev, "pci_save_state failed with error code: %d\n", err);
+ return err;
+ }
+
+ err = pci_enable_wake(pdev, PCI_D3hot, 0);
+ if (err) {
+ dev_err(&pdev->dev, "pci_enable_wake failed with error code: %d\n", err);
+ return err;
+ }
+
+ mlx5_pci_disable_device(dev);
+ err = pci_set_power_state(pdev, PCI_D3hot);
+ if (err) {
+ dev_warn(&pdev->dev, "pci_set_power_state failed with error code: %d\n", err);
+ return err;
+ }
+
+ return 0;
+}
+
+static int resume(struct device *device)
+{
+ struct pci_dev *pdev = to_pci_dev(device);
+ struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
+ struct mlx5_priv *priv = &dev->priv;
+ int err;
+
+ dev_info(&pdev->dev, "resume was called\n");
+
+ err = pci_set_power_state(pdev, PCI_D0);
+ if (err) {
+ dev_warn(&pdev->dev, "pci_set_power_state failed with error code: %d\n", err);
+ return err;
+ }
+
+ pci_restore_state(pdev);
+ err = pci_save_state(pdev);
+ if (err) {
+ dev_err(&pdev->dev, "pci_save_state failed with error code: %d\n", err);
+ return err;
+ }
+ err = mlx5_pci_enable_device(dev);
+ if (err) {
+ dev_err(&pdev->dev, "mlx5_pci_enable_device failed with error code: %d\n", err);
+ return err;
+ }
+ pci_set_master(pdev);
+
+ err = mlx5_load_one(dev, priv);
+ if (err) {
+ dev_err(&pdev->dev, "mlx5_load_one failed with error code: %d\n", err);
+ return err;
+ }
+
+ return 0;
+}
+
+static const struct dev_pm_ops mlnx_pm = {
+ .suspend = suspend,
+ .resume = resume,
+};
+#endif /* CONFIG_PM */
+
+static pci_ers_result_t mlx5_pci_err_detected(struct pci_dev *pdev,
+ pci_channel_state_t state)
+{
+ struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
+ struct mlx5_priv *priv = &dev->priv;
+
+ dev_info(&pdev->dev, "%s was called\n", __func__);
+ mlx5_enter_error_state(dev);
+ mlx5_unload_one(dev, priv);
+ mlx5_pci_disable_device(dev);
+ return state == pci_channel_io_perm_failure ?
+ PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_NEED_RESET;
+}
+
+static pci_ers_result_t mlx5_pci_slot_reset(struct pci_dev *pdev)
+{
+ struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
+ int err = 0;
+
+ dev_info(&pdev->dev, "%s was called\n", __func__);
+
+ err = mlx5_pci_enable_device(dev);
+ if (err) {
+ dev_err(&pdev->dev, "%s: mlx5_pci_enable_device failed with error code: %d\n"
+ , __func__, err);
+ return PCI_ERS_RESULT_DISCONNECT;
+ }
+ pci_set_master(pdev);
+ pci_set_power_state(pdev, PCI_D0);
+ pci_restore_state(pdev);
+
+ return err ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
+}
+
+static void mlx5_pci_resume(struct pci_dev *pdev)
+{
+ struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
+ struct mlx5_priv *priv = &dev->priv;
+ int err;
+
+ dev_info(&pdev->dev, "%s was called\n", __func__);
+
+ pci_save_state(pdev);
+
+ err = mlx5_load_one(dev, priv);
+ if (err)
+ dev_err(&pdev->dev, "%s: mlx5_load_one failed with error code: %d\n"
+ , __func__, err);
+ else
+ dev_info(&pdev->dev, "%s: device recovered\n", __func__);
+}
+
+#ifdef CONFIG_COMPAT_IS_CONST_PCI_ERROR_HANDLERS
+static const struct pci_error_handlers mlx5_err_handler = {
+#else
+static struct pci_error_handlers mlx5_err_handler = {
+#endif
+ .error_detected = mlx5_pci_err_detected,
+ .slot_reset = mlx5_pci_slot_reset,
+ .resume = mlx5_pci_resume
+};
+
+static void shutdown(struct pci_dev *pdev)
+{
+ struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
+ struct mlx5_priv *priv = &dev->priv;
+
+ dev_info(&pdev->dev, "shutdown was called\n");
+ mlx5_unload_one(dev, priv);
+}
+
+static const struct pci_device_id mlx5_core_pci_table[] = {
+ { PCI_VDEVICE(MELLANOX, 0x1011) }, /* Connect-IB */
+ { PCI_VDEVICE(MELLANOX, 0x1012), MLX5_PCI_DEV_IS_VF}, /* Connect-IB VF */
+ { PCI_VDEVICE(MELLANOX, 0x1013) }, /* ConnectX-4 */
+ { PCI_VDEVICE(MELLANOX, 0x1014), MLX5_PCI_DEV_IS_VF}, /* ConnectX-4 VF */
+ { PCI_VDEVICE(MELLANOX, 0x1015) }, /* ConnectX-4LX */
+ { PCI_VDEVICE(MELLANOX, 0x1016), MLX5_PCI_DEV_IS_VF}, /* ConnectX-4LX VF */
+ { PCI_VDEVICE(MELLANOX, 0x1017) },
+ { PCI_VDEVICE(MELLANOX, 0x1018) },
+ { PCI_VDEVICE(MELLANOX, 0x1019) },
+ { PCI_VDEVICE(MELLANOX, 0x101a) },
+ { PCI_VDEVICE(MELLANOX, 0x101b) },
+ { PCI_VDEVICE(MELLANOX, 0x101c) },
+ { PCI_VDEVICE(MELLANOX, 0x101d) },
+ { PCI_VDEVICE(MELLANOX, 0x101e) },
+ { PCI_VDEVICE(MELLANOX, 0x101f) },
+ { PCI_VDEVICE(MELLANOX, 0x1020) },
+ { PCI_VDEVICE(MELLANOX, 0x1021) },
+ { PCI_VDEVICE(MELLANOX, 0x1022) },
+ { PCI_VDEVICE(MELLANOX, 0x1023) },
+ { PCI_VDEVICE(MELLANOX, 0x1024) },
+ { PCI_VDEVICE(MELLANOX, 0x1025) },
+ { PCI_VDEVICE(MELLANOX, 0x1026) },
+ { PCI_VDEVICE(MELLANOX, 0x1027) },
+ { PCI_VDEVICE(MELLANOX, 0x1028) },
+ { PCI_VDEVICE(MELLANOX, 0x1029) },
+ { PCI_VDEVICE(MELLANOX, 0x102a) },
+ { PCI_VDEVICE(MELLANOX, 0x102b) },
+ { PCI_VDEVICE(MELLANOX, 0x102c) },
+ { PCI_VDEVICE(MELLANOX, 0x102d) },
+ { PCI_VDEVICE(MELLANOX, 0x102e) },
+ { PCI_VDEVICE(MELLANOX, 0x102f) },
+ { PCI_VDEVICE(MELLANOX, 0x1030) },
+ { 0, }
+};
+
+MODULE_DEVICE_TABLE(pci, mlx5_core_pci_table);
+
+static struct pci_driver mlx5_core_driver = {
+ .name = DRIVER_NAME,
+ .id_table = mlx5_core_pci_table,
+#ifdef CONFIG_PM
+ .driver = {
+ .pm = &mlnx_pm,
+ },
+#endif /* CONFIG_PM */
+ .probe = init_one,
+ .remove = remove_one,
+ .shutdown = shutdown,
+ .err_handler = &mlx5_err_handler,
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(3,10,0)
+ .sriov_configure = mlx5_core_sriov_configure,
+#endif
+};
+
+static int __init init(void)
+{
+ int err;
+
+ mlx5_register_debugfs();
+ mlx5_core_wq = create_singlethread_workqueue("mlx5_core_wq");
+ if (!mlx5_core_wq) {
+ err = -ENOMEM;
+ goto err_debug;
+ }
+ mlx5_health_init();
+
+ err = pci_register_driver(&mlx5_core_driver);
+ if (err)
+ goto err_health;
+
+ mlx5e_init();
+
+ return 0;
+
+err_health:
+ mlx5_health_cleanup();
+ destroy_workqueue(mlx5_core_wq);
+err_debug:
+ mlx5_unregister_debugfs();
+ return err;
+}
+
+static void __exit cleanup(void)
+{
+ mlx5e_cleanup();
+ pci_unregister_driver(&mlx5_core_driver);
+ mlx5_health_cleanup();
+ destroy_workqueue(mlx5_core_wq);
+ mlx5_unregister_debugfs();
+}
+
+module_init(init);
+module_exit(cleanup);
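
Note on the error handling in mlx5_load_one() above: the function follows
the usual kernel goto-unwind ladder. Each label undoes one initialization
stage and falls through to the labels for the earlier stages, and the last
label (out_err) releases intf_state_mutex. The sketch below shows the same
shape in a self-contained form. It is only an illustration under stated
assumptions: every demo_* name is hypothetical, the integer flag stands in
for the mutex, and the three stages stand in for the driver's much longer
sequence.

```c
#include <errno.h>

/* Hypothetical stand-in for the device state touched by the ladder. */
struct demo_dev {
	int locked;	/* 1 while the (simulated) state lock is held */
	int stages_up;	/* number of stages currently initialized */
	int fail_at;	/* index of the stage that should fail, -1 for none */
};

/* Hypothetical stage init: fails with -EIO at the configured stage. */
static int demo_stage_init(struct demo_dev *d, int stage)
{
	if (stage == d->fail_at)
		return -EIO;
	d->stages_up++;
	return 0;
}

/* Hypothetical stage cleanup: undoes exactly one successful init. */
static void demo_stage_cleanup(struct demo_dev *d)
{
	d->stages_up--;
}

/* Bring up three stages; on failure unwind in reverse order. */
int demo_load(struct demo_dev *d)
{
	int err;

	d->locked = 1;		/* mutex_lock() in the driver */

	err = demo_stage_init(d, 0);
	if (err)
		goto out_unlock;

	err = demo_stage_init(d, 1);
	if (err)
		goto err_stage0;

	err = demo_stage_init(d, 2);
	if (err)
		goto err_stage1;

	d->locked = 0;		/* success path drops the lock too */
	return 0;

err_stage1:
	demo_stage_cleanup(d);	/* undo stage 1 */
err_stage0:
	demo_stage_cleanup(d);	/* undo stage 0 */
out_unlock:
	d->locked = 0;		/* all failure paths funnel through here */
	return err;
}
```

Calling demo_load() with fail_at set to 2 returns -EIO and leaves both
stages_up and locked at zero. That is the property the ladder is meant to
guarantee: a failed load leaves nothing initialized and no lock held.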
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/mcg.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/mcg.c
new file mode 100644
index 0000000..12cd498
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/mcg.c
@@ -0,0 +1,105 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx5_core.h"
+
+struct mlx5_attach_mcg_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 qpn;
+ __be32 rsvd;
+ u8 gid[16];
+};
+
+struct mlx5_attach_mcg_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvf[8];
+};
+
+struct mlx5_detach_mcg_mbox_in {
+ struct mlx5_inbox_hdr hdr;
+ __be32 qpn;
+ __be32 rsvd;
+ u8 gid[16];
+};
+
+struct mlx5_detach_mcg_mbox_out {
+ struct mlx5_outbox_hdr hdr;
+ u8 rsvf[8];
+};
+
+int mlx5_core_attach_mcg(struct mlx5_core_dev *dev, union ib_gid *mgid, u32 qpn)
+{
+ struct mlx5_attach_mcg_mbox_in in;
+ struct mlx5_attach_mcg_mbox_out out;
+ int err;
+
+ memset(&in, 0, sizeof(in));
+ memset(&out, 0, sizeof(out));
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_ATTACH_TO_MCG);
+ memcpy(in.gid, mgid, sizeof(*mgid));
+ in.qpn = cpu_to_be32(qpn);
+ err = mlx5_cmd_exec(dev, &in, sizeof(in), &out, sizeof(out));
+ if (err)
+ return err;
+
+ if (out.hdr.status)
+ err = mlx5_cmd_status_to_err(&out.hdr);
+
+ return err;
+}
+EXPORT_SYMBOL(mlx5_core_attach_mcg);
+
+int mlx5_core_detach_mcg(struct mlx5_core_dev *dev, union ib_gid *mgid, u32 qpn)
+{
+ struct mlx5_detach_mcg_mbox_in in;
+ struct mlx5_detach_mcg_mbox_out out;
+ int err;
+
+ memset(&in, 0, sizeof(in));
+ memset(&out, 0, sizeof(out));
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_DETTACH_FROM_MCG);
+ memcpy(in.gid, mgid, sizeof(*mgid));
+ in.qpn = cpu_to_be32(qpn);
+ err = mlx5_cmd_exec(dev, &in, sizeof(in), &out, sizeof(out));
+ if (err)
+ return err;
+
+ if (out.hdr.status)
+ err = mlx5_cmd_status_to_err(&out.hdr);
+
+ return err;
+}
+EXPORT_SYMBOL(mlx5_core_detach_mcg);
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/mlx5_core.h b/drivers/net/mlnx_uio/mlnx/mlx5/core/mlx5_core.h
new file mode 100644
index 0000000..51adb04
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/mlx5_core.h
@@ -0,0 +1,105 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies, Ltd. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __MLX5_CORE_H__
+#define __MLX5_CORE_H__
+
+
+#define DRIVER_NAME "mlx5_core"
+#define DRIVER_VERSION "3.0-1.0.1"
+#define DRIVER_RELDATE "03 Mar 2015"
+
+extern int mlx5_core_debug_mask;
+
+#define mlx5_core_dbg(dev, format, ...) \
+ pr_debug("%s:%s:%d:(pid %d): " format, \
+ (dev)->priv.name, __func__, __LINE__, current->pid, \
+ ##__VA_ARGS__)
+
+#define mlx5_core_dbg_mask(dev, mask, format, ...) \
+do { \
+ if ((mask) & mlx5_core_debug_mask) \
+ mlx5_core_dbg(dev, format, ##__VA_ARGS__); \
+} while (0)
+
+#define mlx5_core_err(dev, format, ...) \
+ pr_err("%s:%s:%d:(pid %d): " format, \
+ (dev)->priv.name, __func__, __LINE__, current->pid, \
+ ##__VA_ARGS__)
+
+#define mlx5_core_warn(dev, format, ...) \
+ pr_warn("%s:%s:%d:(pid %d): " format, \
+ (dev)->priv.name, __func__, __LINE__, current->pid, \
+ ##__VA_ARGS__)
+
+enum {
+ MLX5_CMD_DATA, /* print command payload only */
+ MLX5_CMD_TIME, /* print command execution time */
+};
+
+static inline int mlx5_cmd_exec_check_status(struct mlx5_core_dev *dev, u32 *in,
+ int in_size, u32 *out,
+ int out_size)
+{
+ int err;
+
+ err = mlx5_cmd_exec(dev, in, in_size, out, out_size);
+ if (err)
+ return err;
+
+ return mlx5_cmd_status_to_err((struct mlx5_outbox_hdr *)out);
+}
+
+int mlx5_query_hca_caps(struct mlx5_core_dev *dev);
+int mlx5_query_board_id(struct mlx5_core_dev *dev);
+
+int mlx5_cmd_init_hca(struct mlx5_core_dev *dev);
+int mlx5_cmd_teardown_hca(struct mlx5_core_dev *dev);
+void mlx5_core_event(struct mlx5_core_dev *dev, enum mlx5_dev_event event,
+ unsigned long param);
+void mlx5_enter_error_state(struct mlx5_core_dev *dev);
+void mlx5_cmd_comp_handler(struct mlx5_core_dev *dev, unsigned long vector);
+int mlx5_rename_eq(struct mlx5_core_dev *dev, int eq_ix, char *name);
+int mlx5_core_sriov_configure(struct pci_dev *dev, int num_vfs);
+int mlx5_core_enable_hca(struct mlx5_core_dev *dev, u16 func_id);
+int mlx5_core_disable_hca(struct mlx5_core_dev *dev, u16 func_id);
+
+void mlx5e_init(void);
+void mlx5e_cleanup(void);
+
+#endif /* __MLX5_CORE_H__ */
+
+#include "post_kmod.h"
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/mr.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/mr.c
new file mode 100644
index 0000000..9590660
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/mr.c
@@ -0,0 +1,251 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx5_core.h"
+
+void mlx5_init_mr_table(struct mlx5_core_dev *dev)
+{
+ struct mlx5_mr_table *table = &dev->priv.mr_table;
+
+ memset(table, 0, sizeof(*table));
+ spin_lock_init(&table->lock);
+ INIT_RADIX_TREE(&table->tree, GFP_ATOMIC);
+}
+
+void mlx5_cleanup_mr_table(struct mlx5_core_dev *dev)
+{
+}
+
+int mlx5_core_create_mkey(struct mlx5_core_dev *dev, struct mlx5_core_mr *mr,
+ struct mlx5_create_mkey_mbox_in *in, int inlen,
+ mlx5_cmd_cbk_t callback, void *context,
+ struct mlx5_create_mkey_mbox_out *out)
+{
+ struct mlx5_mr_table *table = &dev->priv.mr_table;
+ struct mlx5_create_mkey_mbox_out lout;
+ unsigned long flags;
+ int err;
+ u8 key;
+
+ memset(&lout, 0, sizeof(lout));
+ spin_lock_irq(&dev->priv.mkey_lock);
+ key = dev->priv.mkey_key++;
+ spin_unlock_irq(&dev->priv.mkey_lock);
+ in->seg.qpn_mkey7_0 |= cpu_to_be32(key);
+ in->hdr.opcode = cpu_to_be16(MLX5_CMD_OP_CREATE_MKEY);
+ if (callback) {
+ err = mlx5_cmd_exec_cb(dev, in, inlen, out, sizeof(*out),
+ callback, context);
+ return err;
+ } else {
+ err = mlx5_cmd_exec(dev, in, inlen, &lout, sizeof(lout));
+ }
+
+ if (err) {
+ mlx5_core_dbg(dev, "cmd exec failed %d\n", err);
+ return err;
+ }
+
+ if (lout.hdr.status) {
+ mlx5_core_dbg(dev, "status %d\n", lout.hdr.status);
+ return mlx5_cmd_status_to_err(&lout.hdr);
+ }
+
+ mr->iova = be64_to_cpu(in->seg.start_addr);
+ mr->size = be64_to_cpu(in->seg.len);
+ mr->key = mlx5_idx_to_mkey(be32_to_cpu(lout.mkey) & 0xffffff) | key;
+ mr->pd = be32_to_cpu(in->seg.flags_pd) & 0xffffff;
+
+ mlx5_core_dbg(dev, "out 0x%x, key 0x%x, mkey 0x%x\n",
+ be32_to_cpu(lout.mkey), key, mr->key);
+
+ /* connect to MR tree */
+ spin_lock_irqsave(&table->lock, flags);
+ err = radix_tree_insert(&table->tree, mlx5_mkey_to_idx(mr->key), mr);
+ spin_unlock_irqrestore(&table->lock, flags);
+ if (err) {
+ mlx5_core_warn(dev, "failed radix tree insert of mr 0x%x, %d\n",
+ mr->key, err);
+ mlx5_core_destroy_mkey(dev, mr);
+ }
+
+ pr_debug("mkey 0x%x created\n", mr->key);
+
+ return err;
+}
+EXPORT_SYMBOL(mlx5_core_create_mkey);
+
+int mlx5_core_destroy_mkey(struct mlx5_core_dev *dev, struct mlx5_core_mr *mr)
+{
+ struct mlx5_mr_table *table = &dev->priv.mr_table;
+ u32 in[MLX5_ST_SZ_DW(destroy_mkey_in)];
+ u32 out[MLX5_ST_SZ_DW(destroy_mkey_out)];
+ struct mlx5_core_mr *deleted_mr;
+ unsigned long flags;
+ int err;
+
+ memset(in, 0, sizeof(in));
+
+ spin_lock_irqsave(&table->lock, flags);
+ deleted_mr = radix_tree_delete(&table->tree, mlx5_mkey_to_idx(mr->key));
+ spin_unlock_irqrestore(&table->lock, flags);
+ if (!deleted_mr) {
+ mlx5_core_warn(dev, "could not find mkey 0x%x in radix tree\n", mr->key);
+ return -ENOENT;
+ }
+
+ MLX5_SET(destroy_mkey_in, in, opcode, MLX5_CMD_OP_DESTROY_MKEY);
+ MLX5_SET(destroy_mkey_in, in, mkey_index, mlx5_mkey_to_idx(mr->key));
+
+ memset(out, 0, sizeof(out));
+ err = mlx5_cmd_exec_check_status(dev, in, sizeof(in),
+ out, sizeof(out));
+ if (err)
+ return err;
+
+ pr_debug("mkey 0x%x is destroyed\n", mr->key);
+ return 0;
+}
+EXPORT_SYMBOL(mlx5_core_destroy_mkey);
+
+int mlx5_core_query_mkey(struct mlx5_core_dev *dev, struct mlx5_core_mr *mr,
+ struct mlx5_query_mkey_mbox_out *out, int outlen)
+{
+ struct mlx5_query_mkey_mbox_in in;
+ int err;
+
+ memset(&in, 0, sizeof(in));
+ memset(out, 0, outlen);
+
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_QUERY_MKEY);
+ in.mkey = cpu_to_be32(mlx5_mkey_to_idx(mr->key));
+ err = mlx5_cmd_exec(dev, &in, sizeof(in), out, outlen);
+ if (err)
+ return err;
+
+ if (out->hdr.status)
+ return mlx5_cmd_status_to_err(&out->hdr);
+
+ return err;
+}
+EXPORT_SYMBOL(mlx5_core_query_mkey);
+
+int mlx5_core_dump_fill_mkey(struct mlx5_core_dev *dev, struct mlx5_core_mr *mr,
+ u32 *mkey)
+{
+ struct mlx5_query_special_ctxs_mbox_in in;
+ struct mlx5_query_special_ctxs_mbox_out out;
+ int err;
+
+ memset(&in, 0, sizeof(in));
+ memset(&out, 0, sizeof(out));
+
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_QUERY_SPECIAL_CONTEXTS);
+ err = mlx5_cmd_exec(dev, &in, sizeof(in), &out, sizeof(out));
+ if (err)
+ return err;
+
+ if (out.hdr.status)
+ return mlx5_cmd_status_to_err(&out.hdr);
+
+ *mkey = be32_to_cpu(out.dump_fill_mkey);
+
+ return err;
+}
+EXPORT_SYMBOL(mlx5_core_dump_fill_mkey);
+
+int mlx5_core_create_psv(struct mlx5_core_dev *dev, u32 pdn,
+ int npsvs, u32 *sig_index)
+{
+ struct mlx5_allocate_psv_in in;
+ struct mlx5_allocate_psv_out out;
+ int i, err;
+
+ if (npsvs > MLX5_MAX_PSVS)
+ return -EINVAL;
+
+ memset(&in, 0, sizeof(in));
+ memset(&out, 0, sizeof(out));
+
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_CREATE_PSV);
+ in.npsv_pd = cpu_to_be32((npsvs << 28) | pdn);
+ err = mlx5_cmd_exec(dev, &in, sizeof(in), &out, sizeof(out));
+ if (err) {
+ mlx5_core_err(dev, "cmd exec failed %d\n", err);
+ return err;
+ }
+
+ if (out.hdr.status) {
+ mlx5_core_err(dev, "create_psv bad status %d\n",
+ out.hdr.status);
+ return mlx5_cmd_status_to_err(&out.hdr);
+ }
+
+ for (i = 0; i < npsvs; i++)
+ sig_index[i] = be32_to_cpu(out.psv_idx[i]) & 0xffffff;
+
+ return err;
+}
+EXPORT_SYMBOL(mlx5_core_create_psv);
+
+int mlx5_core_destroy_psv(struct mlx5_core_dev *dev, int psv_num)
+{
+ struct mlx5_destroy_psv_in in;
+ struct mlx5_destroy_psv_out out;
+ int err;
+
+ memset(&in, 0, sizeof(in));
+ memset(&out, 0, sizeof(out));
+
+ in.psv_number = cpu_to_be32(psv_num);
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_DESTROY_PSV);
+ err = mlx5_cmd_exec(dev, &in, sizeof(in), &out, sizeof(out));
+ if (err) {
+ mlx5_core_err(dev, "destroy_psv cmd exec failed %d\n", err);
+ goto out;
+ }
+
+ if (out.hdr.status) {
+ mlx5_core_err(dev, "destroy_psv bad status %d\n",
+ out.hdr.status);
+ err = mlx5_cmd_status_to_err(&out.hdr);
+ goto out;
+ }
+
+out:
+ return err;
+}
+EXPORT_SYMBOL(mlx5_core_destroy_psv);
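
Note on the key layout used by mlx5_core_create_mkey() above: the function
takes the 24-bit index returned by firmware, passes it through
mlx5_idx_to_mkey(), and ORs in an 8-bit driver-chosen key. The MR radix
tree is then keyed by mlx5_mkey_to_idx(mr->key), so the low byte does not
affect lookup. The definitions of those two helpers are not part of this
file. The sketch below assumes they shift by 8 bits, as the upstream Linux
driver does. The demo_* names are hypothetical and only illustrate the bit
layout.

```c
#include <stdint.h>

/* Assumed layout: index in bits 31..8, variant key in bits 7..0. */
static inline uint32_t demo_idx_to_mkey(uint32_t idx)
{
	return idx << 8;
}

static inline uint32_t demo_mkey_to_idx(uint32_t mkey)
{
	return mkey >> 8;
}

/*
 * Combine the firmware-returned value (only its low 24 bits are the
 * index) with the driver's 8-bit key, mirroring the expression
 * mlx5_idx_to_mkey(be32_to_cpu(lout.mkey) & 0xffffff) | key.
 */
static inline uint32_t demo_make_mkey(uint32_t fw_out, uint8_t key)
{
	return demo_idx_to_mkey(fw_out & 0xffffff) | key;
}
```

Under that assumption, a firmware value of 0x00abcdef combined with key
0x12 gives the mkey 0xabcdef12, and demo_mkey_to_idx() recovers 0xabcdef
for the radix tree lookup.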
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/pagealloc.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/pagealloc.c
new file mode 100644
index 0000000..a0f07ce
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/pagealloc.c
@@ -0,0 +1,533 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx5_core.h"
+
+enum {
+ MLX5_PAGES_CANT_GIVE = 0,
+ MLX5_PAGES_GIVE = 1,
+ MLX5_PAGES_TAKE = 2
+};
+
+enum {
+ MLX5_BOOT_PAGES = 1,
+ MLX5_INIT_PAGES = 2,
+ MLX5_POST_INIT_PAGES = 3
+};
+
+struct mlx5_pages_req {
+ struct mlx5_core_dev *dev;
+ u16 func_id;
+ s32 npages;
+ struct work_struct work;
+};
+
+struct fw_page {
+ struct rb_node rb_node;
+ u64 addr;
+ struct page *page;
+ u16 func_id;
+ unsigned long bitmask;
+ struct list_head list;
+ unsigned free_count;
+};
+
+struct mlx5_manage_pages_inbox {
+ struct mlx5_inbox_hdr hdr;
+ __be16 rsvd;
+ __be16 func_id;
+ __be32 num_entries;
+ __be64 pas[0];
+};
+
+struct mlx5_manage_pages_outbox {
+ struct mlx5_outbox_hdr hdr;
+ __be32 num_entries;
+ u8 rsvd[4];
+ __be64 pas[0];
+};
+
+enum {
+ MAX_RECLAIM_TIME_MSECS = 5000,
+};
+
+enum {
+ MLX5_MAX_RECLAIM_TIME_MILI = 5000,
+ MLX5_NUM_4K_IN_PAGE = PAGE_SIZE / MLX5_ADAPTER_PAGE_SIZE,
+};
+
+static int insert_page(struct mlx5_core_dev *dev, u64 addr, struct page *page, u16 func_id)
+{
+ struct rb_root *root = &dev->priv.page_root;
+ struct rb_node **new = &root->rb_node;
+ struct rb_node *parent = NULL;
+ struct fw_page *nfp;
+ struct fw_page *tfp;
+ int i;
+
+ while (*new) {
+ parent = *new;
+ tfp = rb_entry(parent, struct fw_page, rb_node);
+ if (tfp->addr < addr)
+ new = &parent->rb_left;
+ else if (tfp->addr > addr)
+ new = &parent->rb_right;
+ else
+ return -EEXIST;
+ }
+
+ nfp = kzalloc(sizeof(*nfp), GFP_KERNEL);
+ if (!nfp)
+ return -ENOMEM;
+
+ nfp->addr = addr;
+ nfp->page = page;
+ nfp->func_id = func_id;
+ nfp->free_count = MLX5_NUM_4K_IN_PAGE;
+ for (i = 0; i < MLX5_NUM_4K_IN_PAGE; i++)
+ set_bit(i, &nfp->bitmask);
+
+ rb_link_node(&nfp->rb_node, parent, new);
+ rb_insert_color(&nfp->rb_node, root);
+ list_add(&nfp->list, &dev->priv.free_list);
+
+ return 0;
+}
+
+static struct fw_page *find_fw_page(struct mlx5_core_dev *dev, u64 addr)
+{
+ struct rb_root *root = &dev->priv.page_root;
+ struct rb_node *tmp = root->rb_node;
+ struct fw_page *result = NULL;
+ struct fw_page *tfp;
+
+ while (tmp) {
+ tfp = rb_entry(tmp, struct fw_page, rb_node);
+ if (tfp->addr < addr) {
+ tmp = tmp->rb_left;
+ } else if (tfp->addr > addr) {
+ tmp = tmp->rb_right;
+ } else {
+ result = tfp;
+ break;
+ }
+ }
+
+ return result;
+}
+
+static int mlx5_cmd_query_pages(struct mlx5_core_dev *dev, u16 *func_id,
+ s32 *npages, int boot)
+{
+ u32 in[MLX5_ST_SZ_DW(query_pages_in)];
+ u32 out[MLX5_ST_SZ_DW(query_pages_out)];
+ int err;
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(query_pages_in, in, opcode, MLX5_CMD_OP_QUERY_PAGES);
+ MLX5_SET(query_pages_in, in, op_mod,
+ boot ? MLX5_BOOT_PAGES : MLX5_INIT_PAGES);
+
+ memset(out, 0, sizeof(out));
+ err = mlx5_cmd_exec_check_status(dev, in, sizeof(in), out, sizeof(out));
+ if (err)
+ return err;
+
+ *npages = MLX5_GET(query_pages_out, out, num_pages);
+ *func_id = MLX5_GET(query_pages_out, out, function_id);
+
+ return 0;
+}
+
+static int alloc_4k(struct mlx5_core_dev *dev, u64 *addr)
+{
+ struct fw_page *fp;
+ unsigned n;
+
+ if (list_empty(&dev->priv.free_list))
+ return -ENOMEM;
+
+ fp = list_entry(dev->priv.free_list.next, struct fw_page, list);
+ n = find_first_bit(&fp->bitmask, 8 * sizeof(fp->bitmask));
+ if (n >= MLX5_NUM_4K_IN_PAGE) {
+ mlx5_core_warn(dev, "alloc 4k bug\n");
+ return -ENOENT;
+ }
+ clear_bit(n, &fp->bitmask);
+ fp->free_count--;
+ if (!fp->free_count)
+ list_del(&fp->list);
+
+ *addr = fp->addr + n * MLX5_ADAPTER_PAGE_SIZE;
+
+ return 0;
+}
+
+static void free_4k(struct mlx5_core_dev *dev, u64 addr)
+{
+ struct fw_page *fwp;
+ int n;
+
+ fwp = find_fw_page(dev, addr & PAGE_MASK);
+ if (!fwp) {
+ mlx5_core_warn(dev, "page not found\n");
+ return;
+ }
+
+ n = (addr & ~PAGE_MASK) >> MLX5_ADAPTER_PAGE_SHIFT;
+ fwp->free_count++;
+ set_bit(n, &fwp->bitmask);
+ if (fwp->free_count == MLX5_NUM_4K_IN_PAGE) {
+ rb_erase(&fwp->rb_node, &dev->priv.page_root);
+ if (fwp->free_count != 1)
+ list_del(&fwp->list);
+ dma_unmap_page(&dev->pdev->dev, addr & PAGE_MASK, PAGE_SIZE,
+ DMA_BIDIRECTIONAL);
+ __free_page(fwp->page);
+ kfree(fwp);
+ } else if (fwp->free_count == 1) {
+ list_add(&fwp->list, &dev->priv.free_list);
+ }
+}
+
+static int alloc_system_page(struct mlx5_core_dev *dev, u16 func_id)
+{
+ struct page *page;
+ u64 addr;
+ int err;
+
+ page = alloc_page(GFP_HIGHUSER);
+ if (!page) {
+ mlx5_core_warn(dev, "failed to allocate page\n");
+ return -ENOMEM;
+ }
+ addr = dma_map_page(&dev->pdev->dev, page, 0,
+ PAGE_SIZE, DMA_BIDIRECTIONAL);
+ if (dma_mapping_error(&dev->pdev->dev, addr)) {
+ mlx5_core_warn(dev, "failed dma mapping page\n");
+ err = -ENOMEM;
+ goto out_alloc;
+ }
+ err = insert_page(dev, addr, page, func_id);
+ if (err) {
+ mlx5_core_err(dev, "failed to track allocated page\n");
+ goto out_mapping;
+ }
+
+ return 0;
+
+out_mapping:
+ dma_unmap_page(&dev->pdev->dev, addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
+
+out_alloc:
+ __free_page(page);
+
+ return err;
+}
+
+static void page_notify_fail(struct mlx5_core_dev *dev, u16 func_id)
+{
+ struct mlx5_manage_pages_inbox *in;
+ struct mlx5_manage_pages_outbox out;
+
+ in = kzalloc(sizeof(*in), GFP_KERNEL);
+ if (!in) {
+ mlx5_core_warn(dev, "allocation failed\n");
+ return;
+ }
+
+ memset(&out, 0, sizeof(out));
+ in->hdr.opcode = cpu_to_be16(MLX5_CMD_OP_MANAGE_PAGES);
+ in->hdr.opmod = cpu_to_be16(MLX5_PAGES_CANT_GIVE);
+ in->func_id = cpu_to_be16(func_id);
+ if (mlx5_cmd_exec(dev, in, sizeof(*in), &out, sizeof(out)))
+ mlx5_core_warn(dev, "page notify failed\n");
+ kfree(in);
+}
+
+static int give_pages(struct mlx5_core_dev *dev, u16 func_id, int npages,
+ int notify_fail)
+{
+ struct mlx5_manage_pages_inbox *in;
+ struct mlx5_manage_pages_outbox out;
+ int inlen;
+ u64 addr;
+ int err;
+ int i;
+
+ inlen = sizeof(*in) + npages * sizeof(in->pas[0]);
+ in = mlx5_vzalloc(inlen);
+ if (!in) {
+ if (notify_fail)
+ page_notify_fail(dev, func_id);
+ mlx5_core_warn(dev, "vzalloc failed %d\n", inlen);
+ return -ENOMEM;
+ }
+ memset(&out, 0, sizeof(out));
+
+ for (i = 0; i < npages; i++) {
+retry:
+ err = alloc_4k(dev, &addr);
+ if (err) {
+ if (err == -ENOMEM)
+ err = alloc_system_page(dev, func_id);
+ if (err)
+ goto out_alloc;
+
+ goto retry;
+ }
+ in->pas[i] = cpu_to_be64(addr);
+ }
+
+ in->hdr.opcode = cpu_to_be16(MLX5_CMD_OP_MANAGE_PAGES);
+ in->hdr.opmod = cpu_to_be16(MLX5_PAGES_GIVE);
+ in->func_id = cpu_to_be16(func_id);
+ in->num_entries = cpu_to_be32(npages);
+ err = mlx5_cmd_exec(dev, in, inlen, &out, sizeof(out));
+ if (err) {
+ mlx5_core_warn(dev, "func_id 0x%x, npages %d, err %d\n",
+ func_id, npages, err);
+ goto out_4k;
+ }
+ if (out.hdr.status) {
+ err = mlx5_cmd_status_to_err(&out.hdr);
+ if (err) {
+ mlx5_core_warn(dev, "func_id 0x%x, npages %d, status %d\n",
+ func_id, npages, out.hdr.status);
+ goto out_4k;
+ }
+ }
+
+ dev->priv.fw_pages += npages;
+
+ mlx5_core_dbg(dev, "err %d\n", err);
+
+ goto out_free;
+
+out_alloc:
+ if (notify_fail)
+ page_notify_fail(dev, func_id);
+
+out_4k:
+ for (i--; i >= 0; i--)
+ free_4k(dev, be64_to_cpu(in->pas[i]));
+out_free:
+ kvfree(in);
+ return err;
+}
+
+static int reclaim_pages(struct mlx5_core_dev *dev, u32 func_id, int npages,
+ int *nclaimed)
+{
+ struct mlx5_manage_pages_inbox in;
+ struct mlx5_manage_pages_outbox *out;
+ int num_claimed;
+ int outlen;
+ u64 addr;
+ int err;
+ int i;
+
+ if (nclaimed)
+ *nclaimed = 0;
+
+ memset(&in, 0, sizeof(in));
+ outlen = sizeof(*out) + npages * sizeof(out->pas[0]);
+ out = mlx5_vzalloc(outlen);
+ if (!out)
+ return -ENOMEM;
+
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_MANAGE_PAGES);
+ in.hdr.opmod = cpu_to_be16(MLX5_PAGES_TAKE);
+ in.func_id = cpu_to_be16(func_id);
+ in.num_entries = cpu_to_be32(npages);
+ mlx5_core_dbg(dev, "npages %d, outlen %d\n", npages, outlen);
+ err = mlx5_cmd_exec(dev, &in, sizeof(in), out, outlen);
+ if (err) {
+ mlx5_core_err(dev, "failed reclaiming pages\n");
+ goto out_free;
+ }
+
+ if (out->hdr.status) {
+ err = mlx5_cmd_status_to_err(&out->hdr);
+ goto out_free;
+ }
+
+ num_claimed = be32_to_cpu(out->num_entries);
+ if (nclaimed)
+ *nclaimed = num_claimed;
+
+ for (i = 0; i < num_claimed; i++) {
+ addr = be64_to_cpu(out->pas[i]);
+ free_4k(dev, addr);
+ }
+ dev->priv.fw_pages -= num_claimed;
+
+out_free:
+ kvfree(out);
+ return err;
+}
+
+static void pages_work_handler(struct work_struct *work)
+{
+ struct mlx5_pages_req *req = container_of(work, struct mlx5_pages_req, work);
+ struct mlx5_core_dev *dev = req->dev;
+ int err = 0;
+
+ if (req->npages < 0)
+ err = reclaim_pages(dev, req->func_id, -1 * req->npages, NULL);
+ else if (req->npages > 0)
+ err = give_pages(dev, req->func_id, req->npages, 1);
+
+ if (err)
+ mlx5_core_warn(dev, "%s fail %d\n",
+ req->npages < 0 ? "reclaim" : "give", err);
+
+ kfree(req);
+}
+
+void mlx5_core_req_pages_handler(struct mlx5_core_dev *dev, u16 func_id,
+ s32 npages)
+{
+ struct mlx5_pages_req *req;
+
+ req = kzalloc(sizeof(*req), GFP_ATOMIC);
+ if (!req) {
+ mlx5_core_warn(dev, "failed to allocate pages request\n");
+ return;
+ }
+
+ req->dev = dev;
+ req->func_id = func_id;
+ req->npages = npages;
+ INIT_WORK(&req->work, pages_work_handler);
+ queue_work(dev->priv.pg_wq, &req->work);
+}
+
+int mlx5_satisfy_startup_pages(struct mlx5_core_dev *dev, int boot)
+{
+ u16 uninitialized_var(func_id);
+ s32 uninitialized_var(npages);
+ int err;
+
+ err = mlx5_cmd_query_pages(dev, &func_id, &npages, boot);
+ if (err)
+ return err;
+
+ mlx5_core_dbg(dev, "requested %d %s pages for func_id 0x%x\n",
+ npages, boot ? "boot" : "init", func_id);
+
+ return give_pages(dev, func_id, npages, 0);
+}
+
+enum {
+ MLX5_BLKS_FOR_RECLAIM_PAGES = 12
+};
+
+static int optimal_reclaimed_pages(void)
+{
+ struct mlx5_cmd_prot_block *block;
+ struct mlx5_cmd_layout *lay;
+ int ret;
+
+ ret = (sizeof(lay->out) + MLX5_BLKS_FOR_RECLAIM_PAGES * sizeof(block->data) -
+ sizeof(struct mlx5_manage_pages_outbox)) /
+ FIELD_SIZEOF(struct mlx5_manage_pages_outbox, pas[0]);
+
+ return ret;
+}
+
+int mlx5_reclaim_startup_pages(struct mlx5_core_dev *dev)
+{
+ unsigned long end = jiffies + msecs_to_jiffies(MAX_RECLAIM_TIME_MSECS);
+ struct fw_page *fwp;
+ struct rb_node *p;
+ int nclaimed = 0;
+ int err = 0;
+
+ do {
+ p = rb_first(&dev->priv.page_root);
+ if (p) {
+ fwp = rb_entry(p, struct fw_page, rb_node);
+ if (dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR) {
+ free_4k(dev, fwp->addr);
+ nclaimed = 1;
+ } else {
+ err = reclaim_pages(dev, fwp->func_id,
+ optimal_reclaimed_pages(),
+ &nclaimed);
+ }
+ if (err) {
+ mlx5_core_warn(dev, "failed reclaiming pages (%d)\n",
+ err);
+ return err;
+ }
+ if (nclaimed)
+ end = jiffies + msecs_to_jiffies(MAX_RECLAIM_TIME_MSECS);
+ }
+ if (time_after(jiffies, end)) {
+ mlx5_core_warn(dev, "FW did not return all pages. giving up...\n");
+ break;
+ }
+ } while (p);
+
+ return 0;
+}
+
+void mlx5_pagealloc_init(struct mlx5_core_dev *dev)
+{
+ dev->priv.page_root = RB_ROOT;
+ INIT_LIST_HEAD(&dev->priv.free_list);
+}
+
+void mlx5_pagealloc_cleanup(struct mlx5_core_dev *dev)
+{
+ /* nothing */
+}
+
+int mlx5_pagealloc_start(struct mlx5_core_dev *dev)
+{
+ dev->priv.pg_wq = create_singlethread_workqueue("mlx5_page_allocator");
+ if (!dev->priv.pg_wq)
+ return -ENOMEM;
+
+ return 0;
+}
+
+void mlx5_pagealloc_stop(struct mlx5_core_dev *dev)
+{
+ destroy_workqueue(dev->priv.pg_wq);
+}
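
Review note on alloc_4k()/free_4k() above: each tracked system page is
carved into MLX5_NUM_4K_IN_PAGE adapter pages, with one bit per 4K chunk
(set = free) and a free_count that mirrors the number of set bits. The
user-space sketch below illustrates only that bookkeeping. All names are
hypothetical, the 64 KiB page size is an assumption chosen to make the
split visible, and the rb-tree, free list and DMA mapping are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes: one 64 KiB system page split into 4 KiB chunks. */
#define SKETCH_PAGE_SIZE   65536u
#define SKETCH_CHUNK_SHIFT 12u
#define SKETCH_CHUNK_SIZE  (1u << SKETCH_CHUNK_SHIFT)
#define SKETCH_NUM_CHUNKS  (SKETCH_PAGE_SIZE / SKETCH_CHUNK_SIZE)

struct sketch_fw_page {
	uint64_t addr;       /* page-aligned base address */
	uint32_t bitmask;    /* bit n set => chunk n is free */
	unsigned free_count; /* number of set bits in bitmask */
};

/* Mark every chunk of a freshly tracked page as free. */
static void sketch_page_init(struct sketch_fw_page *fp, uint64_t addr)
{
	assert((addr & (SKETCH_PAGE_SIZE - 1)) == 0);
	fp->addr = addr;
	fp->bitmask = (1u << SKETCH_NUM_CHUNKS) - 1;
	fp->free_count = SKETCH_NUM_CHUNKS;
}

/* Hand out the lowest free chunk; returns -1 when the page is full. */
static int sketch_alloc_4k(struct sketch_fw_page *fp, uint64_t *addr)
{
	unsigned n;

	for (n = 0; n < SKETCH_NUM_CHUNKS; n++)
		if (fp->bitmask & (1u << n))
			break;
	if (n == SKETCH_NUM_CHUNKS)
		return -1;

	fp->bitmask &= ~(1u << n);
	fp->free_count--;
	*addr = fp->addr + (uint64_t)n * SKETCH_CHUNK_SIZE;
	return 0;
}

/* Return a chunk; yields 1 once the whole page is free again. */
static int sketch_free_4k(struct sketch_fw_page *fp, uint64_t addr)
{
	unsigned n = (unsigned)((addr & (SKETCH_PAGE_SIZE - 1)) >>
				SKETCH_CHUNK_SHIFT);

	fp->bitmask |= 1u << n;
	fp->free_count++;
	return fp->free_count == SKETCH_NUM_CHUNKS;
}
```

In the patch, the "whole page is free" case is the point where free_4k()
erases the rb-tree node, unmaps the DMA mapping and releases the page.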
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/params.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/params.c
new file mode 100644
index 0000000..f253364
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/params.c
@@ -0,0 +1,198 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+
+static char *guids;
+module_param_named(guids, guids, charp, 0444);
+MODULE_PARM_DESC(guids, "guids configuration. This module parameter will be obsolete!");
+
+/* format: dddd:bb:vv.f-nn:nn:nn:nn:nn:nn:nn:nn:pp:pp:pp:pp:pp:pp:pp:pp:qq:qq:qq:qq:qq:qq:qq:qq,
+ *
+ * dddd:bb:vv.f are domain, bus, device, function for the device
+ * nn:nn:nn:nn:nn:nn:nn:nn is the node GUID to configure
+ * pp:pp:pp:pp:pp:pp:pp:pp is the port 1 GUID
+ * qq:qq:qq:qq:qq:qq:qq:qq is the port 2 GUID (optional)
+ *
+ * The comma indicates another record follows
+ */
+
+static u64 extract_guid(int *g)
+{
+ return ((u64)g[0] << 56) |
+ ((u64)g[1] << 48) |
+ ((u64)g[2] << 40) |
+ ((u64)g[3] << 32) |
+ ((u64)g[4] << 24) |
+ ((u64)g[5] << 16) |
+ ((u64)g[6] << 8) |
+ (u64)g[7];
+}
+
+static int is_valid_len(const char *p, int *nport)
+{
+ int tmp;
+ char *x;
+
+ x = strchr(p, ',');
+ if (x)
+ tmp = (int)(x - p);
+ else
+ tmp = strlen(p);
+
+ switch (tmp) {
+ case 47:
+ *nport = 1;
+ break;
+
+ case 71:
+ *nport = 2;
+ break;
+
+ default:
+ return 0;
+ }
+
+ return 1;
+}
+
+static int get_record(const char *p, u64 *node_guid, u64 *port1_guid,
+ u64 *port2_guid, int *nport)
+{
+ int tmp[8];
+ int err;
+ const char *guid_format = "%02x:%02x:%02x:%02x:%02x:%02x:%02x:%02x";
+ int np;
+
+ if (!is_valid_len(p, &np))
+ return -EINVAL;
+
+ err = sscanf(p, guid_format, tmp, tmp + 1, tmp + 2, tmp + 3, tmp + 4,
+ tmp + 5, tmp + 6, tmp + 7);
+ if (err != 8)
+ return -EINVAL;
+
+ *node_guid = extract_guid(tmp);
+ p += 23;
+ if (*p != ':')
+ return -EINVAL;
+ p++;
+
+ err = sscanf(p, guid_format, tmp, tmp + 1, tmp + 2, tmp + 3, tmp + 4,
+ tmp + 5, tmp + 6, tmp + 7);
+ if (err != 8)
+ return -EINVAL;
+ *port1_guid = extract_guid(tmp);
+ if (np != 2) {
+ *nport = np;
+ return 0;
+ }
+
+ p += 23;
+ if (*p != ':')
+ return -EINVAL;
+ p++;
+
+ err = sscanf(p, guid_format, tmp, tmp + 1, tmp + 2, tmp + 3, tmp + 4,
+ tmp + 5, tmp + 6, tmp + 7);
+ if (err != 8)
+ return -EINVAL;
+ *port2_guid = extract_guid(tmp);
+ *nport = np;
+
+ return 0;
+}
+
+int mlx5_update_guids(struct mlx5_core_dev *dev)
+{
+ struct pci_dev *pdev = dev->pdev;
+ const char *devp;
+ char *p = guids;
+ u64 port1_guid = 0;
+ u64 port2_guid = 0;
+ u64 node_guid;
+ int nport;
+ int dlen;
+ int err;
+ struct mlx5_hca_vport_context *req;
+
+ if (!p)
+ return 0;
+
+ devp = dev_name(&pdev->dev);
+ dlen = strlen(devp);
+ while (1) {
+ if (dlen >= strlen(p))
+ return -ENODEV;
+
+ if (!memcmp(devp, p, dlen)) {
+ p += dlen;
+ if (*p != '-')
+ return -EINVAL;
+ p++;
+ break;
+ }
+
+ p = strchr(p, ',');
+ if (!p)
+ return -ENODEV;
+ p++;
+ }
+
+ err = get_record(p, &node_guid, &port1_guid, &port2_guid, &nport);
+ if (err)
+ return err;
+
+ req = kzalloc(sizeof(*req), GFP_KERNEL);
+ if (!req)
+ return -ENOMEM;
+
+ req->node_guid = node_guid;
+ req->port_guid = port1_guid;
+ req->field_select = MLX5_HCA_VPORT_SEL_NODE_GUID | MLX5_HCA_VPORT_SEL_PORT_GUID;
+ err = mlx5_core_modify_hca_vport_context(dev, 0, 1, 0, req);
+ if (err)
+ goto out;
+
+ if (nport == 2) {
+ req->port_guid = port2_guid;
+ err = mlx5_core_modify_hca_vport_context(dev, 0, 2, 0, req);
+ }
+
+out:
+ kfree(req);
+
+ return err;
+}
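
Review note on get_record()/is_valid_len() above: one GUID in text form
is 8 hex bytes plus 7 colons, i.e. 23 characters, so a record with node
GUID and one port GUID is 23 + 1 + 23 = 47 characters and a record with
two port GUIDs is 47 + 1 + 23 = 71. The user-space sketch below shows
that arithmetic and the byte packing done by extract_guid(). The
function names are hypothetical, and it scans into unsigned int, which
is the type the %x conversion expects (the patch passes int pointers).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_GUID_TEXT_LEN 23 /* 8 * 2 hex digits + 7 colons */

/* Expected record length for a node GUID followed by nport port GUIDs. */
static int sketch_record_len(int nport)
{
	return (nport + 1) * SKETCH_GUID_TEXT_LEN + nport;
}

/* Parse "xx:xx:xx:xx:xx:xx:xx:xx" into a big-endian packed 64-bit GUID. */
static int sketch_parse_guid(const char *s, uint64_t *guid)
{
	unsigned int b[8];
	uint64_t v = 0;
	int i;

	assert(s != NULL && guid != NULL);
	if (sscanf(s, "%2x:%2x:%2x:%2x:%2x:%2x:%2x:%2x",
		   &b[0], &b[1], &b[2], &b[3],
		   &b[4], &b[5], &b[6], &b[7]) != 8)
		return -1;

	/* The first byte in the text ends up in the most significant byte. */
	for (i = 0; i < 8; i++)
		v = (v << 8) | (b[i] & 0xff);

	*guid = v;
	return 0;
}
```

For example, the text "00:02:c9:03:00:a1:b2:c3" packs to
0x0002c90300a1b2c3 under this scheme.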
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/pd.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/pd.c
new file mode 100644
index 0000000..3938e9f
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/pd.c
@@ -0,0 +1,73 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx5_core.h"
+
+int mlx5_core_alloc_pd(struct mlx5_core_dev *dev, u32 *pdn)
+{
+ u32 in[MLX5_ST_SZ_DW(alloc_pd_in)];
+ u32 out[MLX5_ST_SZ_DW(alloc_pd_out)];
+ int err;
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(alloc_pd_in, in, opcode, MLX5_CMD_OP_ALLOC_PD);
+
+ memset(out, 0, sizeof(out));
+ err = mlx5_cmd_exec_check_status(dev, in, sizeof(in), out, sizeof(out));
+ if (err)
+ return err;
+
+ *pdn = MLX5_GET(alloc_pd_out, out, pd);
+ return 0;
+}
+EXPORT_SYMBOL(mlx5_core_alloc_pd);
+
+int mlx5_core_dealloc_pd(struct mlx5_core_dev *dev, u32 pdn)
+{
+ u32 in[MLX5_ST_SZ_DW(dealloc_pd_in)];
+ u32 out[MLX5_ST_SZ_DW(dealloc_pd_out)];
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(dealloc_pd_in, in, opcode, MLX5_CMD_OP_DEALLOC_PD);
+ MLX5_SET(dealloc_pd_in, in, pd, pdn);
+
+ memset(out, 0, sizeof(out));
+ return mlx5_cmd_exec_check_status(dev, in, sizeof(in),
+ out, sizeof(out));
+}
+EXPORT_SYMBOL(mlx5_core_dealloc_pd);
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/port.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/port.c
new file mode 100644
index 0000000..6f4cfcb
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/port.c
@@ -0,0 +1,869 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx5_core.h"
+
+#if (LINUX_VERSION_CODE <= KERNEL_VERSION(2,6,39))
+static int mlx5_pci_num_vf(struct pci_dev *pdev)
+{
+ struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
+
+ if (!mlx5_core_is_pf(dev))
+ return 0;
+
+ return dev->priv.sriov.num_vfs;
+}
+#endif
+
+static int is_valid_vf(struct mlx5_core_dev *dev, int vf)
+{
+ struct pci_dev *pdev = dev->pdev;
+
+ if (vf == 1)
+ return 1;
+
+ if (mlx5_core_is_pf(dev))
+#if (LINUX_VERSION_CODE <= KERNEL_VERSION(2,6,39))
+ return (vf <= mlx5_pci_num_vf(pdev)) && (vf >= 1);
+#else
+ return (vf <= pci_num_vf(pdev)) && (vf >= 1);
+#endif
+
+ return 0;
+}
+
+int mlx5_core_access_reg(struct mlx5_core_dev *dev, void *data_in,
+ int size_in, void *data_out, int size_out,
+ u16 reg_num, int arg, int write)
+{
+ struct mlx5_access_reg_mbox_in *in = NULL;
+ struct mlx5_access_reg_mbox_out *out = NULL;
+ int err = -ENOMEM;
+
+ in = mlx5_vzalloc(sizeof(*in) + size_in);
+ if (!in)
+ goto ex;
+
+ out = mlx5_vzalloc(sizeof(*out) + size_out);
+ if (!out)
+ goto ex;
+
+ memcpy(in->data, data_in, size_in);
+ in->hdr.opcode = cpu_to_be16(MLX5_CMD_OP_ACCESS_REG);
+ in->hdr.opmod = cpu_to_be16(!write);
+ in->arg = cpu_to_be32(arg);
+ in->register_id = cpu_to_be16(reg_num);
+ err = mlx5_cmd_exec(dev, in, sizeof(*in) + size_in, out,
+ sizeof(*out) + size_out);
+ if (err)
+ goto ex;
+
+ err = mlx5_cmd_status_to_err(&out->hdr);
+ if (!err)
+ memcpy(data_out, out->data, size_out);
+
+ex:
+ kvfree(out);
+ kvfree(in);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_access_reg);
+
+
+struct mlx5_reg_pcap {
+ u8 rsvd0;
+ u8 port_num;
+ u8 rsvd1[2];
+ __be32 caps_127_96;
+ __be32 caps_95_64;
+ __be32 caps_63_32;
+ __be32 caps_31_0;
+};
+
+static int set_port_caps_pf(struct mlx5_core_dev *dev, u8 port_num,
+ u32 set, u32 clear, u32 cur)
+{
+ struct mlx5_reg_pcap in;
+ struct mlx5_reg_pcap out;
+ u32 tmp;
+ int err;
+
+ tmp = (cur | set) & ~clear;
+
+ memset(&in, 0, sizeof(in));
+ in.caps_127_96 = cpu_to_be32(tmp);
+ in.port_num = port_num;
+
+ err = mlx5_core_access_reg(dev, &in, sizeof(in), &out,
+ sizeof(out), MLX5_REG_PCAP, 0, 1);
+
+ return err;
+}
+
+static int set_port_caps_vf(struct mlx5_core_dev *dev, u8 port_num,
+ u32 set, u32 clear)
+{
+ struct mlx5_hca_vport_context *req;
+ int err;
+
+ req = kzalloc(sizeof(*req), GFP_KERNEL);
+ if (!req)
+ return -ENOMEM;
+
+ req->cap_mask1 = set | ~clear;
+ req->cap_mask1_perm = set | clear;
+ err = mlx5_core_modify_hca_vport_context(dev, 0, port_num,
+ 0, req);
+
+ kfree(req);
+ return err;
+}
+
+int mlx5_set_port_caps(struct mlx5_core_dev *dev, u8 port_num,
+ u32 set, u32 clear, u32 cur)
+{
+ if (mlx5_core_is_pf(dev))
+ return set_port_caps_pf(dev, port_num, set, clear, cur);
+
+ return set_port_caps_vf(dev, port_num, set, clear);
+}
+EXPORT_SYMBOL_GPL(mlx5_set_port_caps);
+
+int mlx5_query_port_ptys(struct mlx5_core_dev *dev, u32 *ptys,
+ int ptys_size, int proto_mask)
+{
+ u32 in[MLX5_ST_SZ_DW(ptys_reg)];
+ int err;
+
+ memset(in, 0, sizeof(in));
+ MLX5_SET(ptys_reg, in, local_port, 1);
+ MLX5_SET(ptys_reg, in, proto_mask, proto_mask);
+
+ err = mlx5_core_access_reg(dev, in, sizeof(in), ptys,
+ ptys_size, MLX5_REG_PTYS, 0, 0);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_query_port_ptys);
+
+int mlx5_query_port_proto_cap(struct mlx5_core_dev *dev,
+ u32 *proto_cap, int proto_mask)
+{
+ u32 out[MLX5_ST_SZ_DW(ptys_reg)];
+ int err;
+
+ err = mlx5_query_port_ptys(dev, out, sizeof(out), proto_mask);
+ if (err)
+ return err;
+
+ if (proto_mask == MLX5_PTYS_EN)
+ *proto_cap = MLX5_GET(ptys_reg, out, eth_proto_capability);
+ else
+ *proto_cap = MLX5_GET(ptys_reg, out, ib_proto_capability);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx5_query_port_proto_cap);
+
+int mlx5_core_access_ptys(struct mlx5_core_dev *dev, struct mlx5_ptys_reg *ptys, int write)
+{
+ int sz = MLX5_ST_SZ_BYTES(ptys_reg);
+ void *out = NULL;
+ void *in = NULL;
+ int err = -ENOMEM;
+
+ in = kzalloc(sz, GFP_KERNEL);
+ out = kzalloc(sz, GFP_KERNEL);
+ if (!in || !out)
+ goto out;
+
+ MLX5_SET(ptys_reg, in, local_port, ptys->local_port);
+ MLX5_SET(ptys_reg, in, proto_mask, ptys->proto_mask);
+ if (write) {
+ MLX5_SET(ptys_reg, in, eth_proto_capability, ptys->eth_proto_cap);
+ MLX5_SET(ptys_reg, in, ib_link_width_capability, ptys->ib_link_width_cap);
+ MLX5_SET(ptys_reg, in, ib_proto_capability, ptys->ib_proto_cap);
+ MLX5_SET(ptys_reg, in, eth_proto_admin, ptys->eth_proto_admin);
+ MLX5_SET(ptys_reg, in, ib_link_width_admin, ptys->ib_link_width_admin);
+ MLX5_SET(ptys_reg, in, ib_proto_admin, ptys->ib_proto_admin);
+ MLX5_SET(ptys_reg, in, eth_proto_oper, ptys->eth_proto_oper);
+ MLX5_SET(ptys_reg, in, ib_link_width_oper, ptys->ib_link_width_oper);
+ MLX5_SET(ptys_reg, in, ib_proto_oper, ptys->ib_proto_oper);
+ MLX5_SET(ptys_reg, in, eth_proto_lp_advertise, ptys->eth_proto_lp_advertise);
+ }
+
+ err = mlx5_core_access_reg(dev, in, sz, out, sz, MLX5_REG_PTYS, 0, !!write);
+ if (err)
+ goto out;
+
+ if (!write) {
+ ptys->local_port = MLX5_GET(ptys_reg, out, local_port);
+ ptys->proto_mask = MLX5_GET(ptys_reg, out, proto_mask);
+ ptys->eth_proto_cap = MLX5_GET(ptys_reg, out, eth_proto_capability);
+ ptys->ib_link_width_cap = MLX5_GET(ptys_reg, out, ib_link_width_capability);
+ ptys->ib_proto_cap = MLX5_GET(ptys_reg, out, ib_proto_capability);
+ ptys->eth_proto_admin = MLX5_GET(ptys_reg, out, eth_proto_admin);
+ ptys->ib_link_width_admin = MLX5_GET(ptys_reg, out, ib_link_width_admin);
+ ptys->ib_proto_admin = MLX5_GET(ptys_reg, out, ib_proto_admin);
+ ptys->eth_proto_oper = MLX5_GET(ptys_reg, out, eth_proto_oper);
+ ptys->ib_link_width_oper = MLX5_GET(ptys_reg, out, ib_link_width_oper);
+ ptys->ib_proto_oper = MLX5_GET(ptys_reg, out, ib_proto_oper);
+ ptys->eth_proto_lp_advertise = MLX5_GET(ptys_reg, out, eth_proto_lp_advertise);
+ }
+
+out:
+ kfree(in);
+ kfree(out);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_access_ptys);
+
+int mlx5_core_access_pvlc(struct mlx5_core_dev *dev, struct mlx5_pvlc_reg *pvlc, int write)
+{
+ int sz = MLX5_ST_SZ_BYTES(pvlc_reg);
+ u8 in[MLX5_ST_SZ_BYTES(pvlc_reg)];
+ u8 out[MLX5_ST_SZ_BYTES(pvlc_reg)];
+ int err;
+
+ memset(out, 0, sizeof(out));
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(pvlc_reg, in, local_port, pvlc->local_port);
+ if (write)
+ MLX5_SET(pvlc_reg, in, vl_admin, pvlc->vl_admin);
+
+ err = mlx5_core_access_reg(dev, in, sz, out, sz, MLX5_REG_PVLC, 0, !!write);
+ if (err)
+ return err;
+
+ if (!write) {
+ pvlc->local_port = MLX5_GET(pvlc_reg, out, local_port);
+ pvlc->vl_hw_cap = MLX5_GET(pvlc_reg, out, vl_hw_cap);
+ pvlc->vl_admin = MLX5_GET(pvlc_reg, out, vl_admin);
+ pvlc->vl_operational = MLX5_GET(pvlc_reg, out, vl_operational);
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_access_pvlc);
+
+static int mtu_to_ib_mtu(int mtu)
+{
+ switch (mtu) {
+ case 256: return 1;
+ case 512: return 2;
+ case 1024: return 3;
+ case 2048: return 4;
+ case 4096: return 5;
+ default:
+ pr_warn("invalid mtu\n");
+ return -1;
+ }
+}
+
+int mlx5_core_access_pmtu(struct mlx5_core_dev *dev, struct mlx5_pmtu_reg *pmtu, int write)
+{
+ int sz = MLX5_ST_SZ_BYTES(pmtu_reg);
+ void *out = NULL;
+ void *in = NULL;
+ int err = -ENOMEM;
+
+ in = kzalloc(sz, GFP_KERNEL);
+ out = kzalloc(sz, GFP_KERNEL);
+ if (!in || !out)
+ goto out;
+
+ MLX5_SET(pmtu_reg, in, local_port, pmtu->local_port);
+ if (write)
+ MLX5_SET(pmtu_reg, in, admin_mtu, pmtu->admin_mtu);
+
+ err = mlx5_core_access_reg(dev, in, sz, out, sz, MLX5_REG_PMTU, 0, !!write);
+ if (err)
+ goto out;
+
+ if (!write) {
+ pmtu->local_port = MLX5_GET(pmtu_reg, out, local_port);
+ pmtu->max_mtu = mtu_to_ib_mtu(MLX5_GET(pmtu_reg, out, max_mtu));
+ pmtu->admin_mtu = mtu_to_ib_mtu(MLX5_GET(pmtu_reg, out, admin_mtu));
+ pmtu->oper_mtu = mtu_to_ib_mtu(MLX5_GET(pmtu_reg, out, oper_mtu));
+ }
+
+out:
+ kfree(in);
+ kfree(out);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_access_pmtu);
+
+int mlx5_query_port_proto_admin(struct mlx5_core_dev *dev,
+ u32 *proto_admin, int proto_mask)
+{
+ u32 out[MLX5_ST_SZ_DW(ptys_reg)];
+ int err;
+
+ err = mlx5_query_port_ptys(dev, out, sizeof(out), proto_mask);
+ if (err)
+ return err;
+
+ if (proto_mask == MLX5_PTYS_EN)
+ *proto_admin = MLX5_GET(ptys_reg, out, eth_proto_admin);
+ else
+ *proto_admin = MLX5_GET(ptys_reg, out, ib_proto_admin);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx5_query_port_proto_admin);
+
+int mlx5_set_port_proto(struct mlx5_core_dev *dev, u32 proto_admin,
+ int proto_mask)
+{
+ u32 in[MLX5_ST_SZ_DW(ptys_reg)];
+ u32 out[MLX5_ST_SZ_DW(ptys_reg)];
+ int err;
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(ptys_reg, in, local_port, 1);
+ MLX5_SET(ptys_reg, in, proto_mask, proto_mask);
+ if (proto_mask == MLX5_PTYS_EN)
+ MLX5_SET(ptys_reg, in, eth_proto_admin, proto_admin);
+ else
+ MLX5_SET(ptys_reg, in, ib_proto_admin, proto_admin);
+
+ err = mlx5_core_access_reg(dev, in, sizeof(in), out,
+ sizeof(out), MLX5_REG_PTYS, 0, 1);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_set_port_proto);
+
+int mlx5_set_port_status(struct mlx5_core_dev *dev,
+ enum mlx5_port_status status)
+{
+ u32 in[MLX5_ST_SZ_DW(paos_reg)];
+ u32 out[MLX5_ST_SZ_DW(paos_reg)];
+ int err;
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(paos_reg, in, admin_status, status);
+ MLX5_SET(paos_reg, in, ase, 1);
+
+ err = mlx5_core_access_reg(dev, in, sizeof(in), out,
+ sizeof(out), MLX5_REG_PAOS, 0, 1);
+ return err;
+}
+
+int mlx5_query_port_status(struct mlx5_core_dev *dev, u8 *status)
+{
+ u32 in[MLX5_ST_SZ_DW(paos_reg)];
+ u32 out[MLX5_ST_SZ_DW(paos_reg)];
+ int err;
+
+ memset(in, 0, sizeof(in));
+
+ err = mlx5_core_access_reg(dev, in, sizeof(in), out,
+ sizeof(out), MLX5_REG_PAOS, 0, 0);
+ if (err)
+ return err;
+
+ *status = MLX5_GET(paos_reg, out, oper_status);
+ return err;
+}
+
+static void mlx5_query_port_mtu(struct mlx5_core_dev *dev,
+ int *admin_mtu, int *max_mtu, int *oper_mtu)
+{
+ u32 in[MLX5_ST_SZ_DW(pmtu_reg)];
+ u32 out[MLX5_ST_SZ_DW(pmtu_reg)];
+
+ memset(in, 0, sizeof(in));
+ memset(out, 0, sizeof(out));
+ MLX5_SET(pmtu_reg, in, local_port, 1);
+
+ mlx5_core_access_reg(dev, in, sizeof(in), out,
+ sizeof(out), MLX5_REG_PMTU, 0, 0);
+
+ if (max_mtu)
+ *max_mtu = MLX5_GET(pmtu_reg, out, max_mtu);
+ if (oper_mtu)
+ *oper_mtu = MLX5_GET(pmtu_reg, out, oper_mtu);
+ if (admin_mtu)
+ *admin_mtu = MLX5_GET(pmtu_reg, out, admin_mtu);
+}
+
+int mlx5_set_port_mtu(struct mlx5_core_dev *dev, int mtu)
+{
+ u32 in[MLX5_ST_SZ_DW(pmtu_reg)];
+ u32 out[MLX5_ST_SZ_DW(pmtu_reg)];
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(pmtu_reg, in, admin_mtu, mtu);
+ MLX5_SET(pmtu_reg, in, local_port, 1);
+
+ return mlx5_core_access_reg(dev, in, sizeof(in), out,
+ sizeof(out), MLX5_REG_PMTU, 0, 1);
+}
+EXPORT_SYMBOL_GPL(mlx5_set_port_mtu);
+
+void mlx5_query_port_max_mtu(struct mlx5_core_dev *dev, int *max_mtu)
+{
+ mlx5_query_port_mtu(dev, NULL, max_mtu, NULL);
+}
+EXPORT_SYMBOL_GPL(mlx5_query_port_max_mtu);
+
+void mlx5_query_port_oper_mtu(struct mlx5_core_dev *dev, int *oper_mtu)
+{
+ mlx5_query_port_mtu(dev, NULL, NULL, oper_mtu);
+}
+EXPORT_SYMBOL_GPL(mlx5_query_port_oper_mtu);
+
+int mlx5_core_query_gids(struct mlx5_core_dev *dev, u8 other_vport,
+ u8 port_num, u16 vf_num, u16 gid_index,
+ union ib_gid *gid)
+{
+ int in_sz = MLX5_ST_SZ_BYTES(query_hca_vport_gid_in);
+ int out_sz = MLX5_ST_SZ_BYTES(query_hca_vport_gid_out);
+ int is_group_manager;
+ void *out = NULL;
+ void *in = NULL;
+ union ib_gid *tmp;
+ int tbsz;
+ int nout;
+ int err;
+
+ vf_num += 1;
+ if (!is_valid_vf(dev, vf_num)) {
+ mlx5_core_warn(dev, "invalid vf number %d\n", vf_num);
+ return -EINVAL;
+ }
+
+ is_group_manager = MLX5_CAP_GEN(dev, vport_group_manager);
+ tbsz = mlx5_get_gid_table_len(MLX5_CAP_GEN(dev, gid_table_size));
+ mlx5_core_dbg(dev, "vf_num %d, index %d, gid_table_size %d\n",
+ vf_num, gid_index, tbsz);
+
+ if (gid_index > tbsz && gid_index != 0xffff)
+ return -EINVAL;
+
+ if (gid_index == 0xffff)
+ nout = tbsz;
+ else
+ nout = 1;
+
+ out_sz += nout * sizeof(*gid);
+
+ in = kzalloc(in_sz, GFP_KERNEL);
+ out = kzalloc(out_sz, GFP_KERNEL);
+ if (!in || !out) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ MLX5_SET(query_hca_vport_gid_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_VPORT_GID);
+ if (other_vport) {
+ if (is_group_manager) {
+ MLX5_SET(query_hca_vport_gid_in, in, vport_number, vf_num);
+ MLX5_SET(query_hca_vport_gid_in, in, other_vport, 1);
+ } else {
+ err = -EPERM;
+ goto out;
+ }
+ }
+ MLX5_SET(query_hca_vport_gid_in, in, gid_index, gid_index);
+
+ if (MLX5_CAP_GEN(dev, num_ports) == 2)
+ MLX5_SET(query_hca_vport_gid_in, in, port_num, port_num);
+
+ err = mlx5_cmd_exec(dev, in, in_sz, out, out_sz);
+ if (err)
+ goto out;
+
+ err = mlx5_cmd_status_to_err_v2(out);
+ if (err)
+ goto out;
+
+ tmp = out + MLX5_ST_SZ_BYTES(query_hca_vport_gid_out);
+ gid->global.subnet_prefix = tmp->global.subnet_prefix;
+ gid->global.interface_id = tmp->global.interface_id;
+
+out:
+ kfree(in);
+ kfree(out);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_query_gids);
+
+int mlx5_core_query_pkeys(struct mlx5_core_dev *dev, u8 other_vport,
+ u8 port_num, u16 vf_num, u16 pkey_index,
+ u16 *pkey)
+{
+ int in_sz = MLX5_ST_SZ_BYTES(query_hca_vport_pkey_in);
+ int out_sz = MLX5_ST_SZ_BYTES(query_hca_vport_pkey_out);
+ int is_group_manager;
+ void *out = NULL;
+ void *in = NULL;
+ void *pkarr;
+ int nout;
+ int tbsz;
+ int err;
+ int i;
+
+ is_group_manager = MLX5_CAP_GEN(dev, vport_group_manager);
+ mlx5_core_dbg(dev, "vf_num %d\n", vf_num);
+
+ vf_num += 1;
+ if (!is_valid_vf(dev, vf_num)) {
+ mlx5_core_warn(dev, "invalid vf number %d\n", vf_num);
+ return -EINVAL;
+ }
+
+ tbsz = mlx5_to_sw_pkey_sz(MLX5_CAP_GEN(dev, pkey_table_size));
+ if (pkey_index > tbsz && pkey_index != 0xffff)
+ return -EINVAL;
+
+ if (pkey_index == 0xffff)
+ nout = tbsz;
+ else
+ nout = 1;
+
+ out_sz += nout * MLX5_ST_SZ_BYTES(pkey);
+
+ in = kzalloc(in_sz, GFP_KERNEL);
+ out = kzalloc(out_sz, GFP_KERNEL);
+ if (!in || !out) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ MLX5_SET(query_hca_vport_pkey_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_VPORT_PKEY);
+ if (other_vport) {
+ if (is_group_manager) {
+ MLX5_SET(query_hca_vport_pkey_in, in, vport_number, vf_num);
+ MLX5_SET(query_hca_vport_pkey_in, in, other_vport, 1);
+ } else {
+ err = -EPERM;
+ goto out;
+ }
+ }
+ MLX5_SET(query_hca_vport_pkey_in, in, pkey_index, pkey_index);
+
+ if (MLX5_CAP_GEN(dev, num_ports) == 2)
+ MLX5_SET(query_hca_vport_pkey_in, in, port_num, port_num);
+
+ err = mlx5_cmd_exec(dev, in, in_sz, out, out_sz);
+ if (err)
+ goto out;
+
+ err = mlx5_cmd_status_to_err_v2(out);
+ if (err)
+ goto out;
+
+ pkarr = out + MLX5_ST_SZ_BYTES(query_hca_vport_pkey_out);
+ for (i = 0; i < nout; i++, pkey++, pkarr += MLX5_ST_SZ_BYTES(pkey))
+ *pkey = MLX5_GET_PR(pkey, pkarr, pkey);
+
+out:
+ kfree(in);
+ kfree(out);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_query_pkeys);
+
+#define MLX5_ENABLE_HCA_MASK (MLX5_HCA_VPORT_SEL_NODE_GUID | MLX5_HCA_VPORT_SEL_PORT_GUID)
+int mlx5_core_check_enable_vf_hca(struct mlx5_core_dev *dev, u32 field_select, u8 vport_num)
+{
+ struct mlx5_core_sriov *sriov = &dev->priv.sriov;
+ struct mlx5_vf_context *ctx;
+ int err;
+
+ if (!mlx5_core_is_pf(dev))
+ return 0;
+
+ if (!sriov)
+ return -ENOMEM;
+
+ if (!sriov->vfs_ctx)
+ return -ENOMEM;
+
+ vport_num += 1;
+ if (!is_valid_vf(dev, vport_num)) {
+ mlx5_core_warn(dev, "invalid vf number %d\n", vport_num);
+ return -EINVAL;
+ }
+
+ ctx = &sriov->vfs_ctx[vport_num - 1];
+ ctx->state_mask |= (MLX5_ENABLE_HCA_MASK & field_select);
+ if (ctx->state_mask != MLX5_ENABLE_HCA_MASK || ctx->enabled)
+ return 0;
+
+ err = mlx5_core_enable_hca(dev, vport_num);
+ if (!err)
+ ctx->enabled = 1;
+
+ return err;
+}
+
+int mlx5_core_modify_hca_vport_context(struct mlx5_core_dev *dev,
+ u8 other_vport, u8 port_num,
+ u16 vf_num,
+ struct mlx5_hca_vport_context *req)
+{
+ int in_sz = MLX5_ST_SZ_BYTES(modify_hca_vport_context_in);
+ u8 out[MLX5_ST_SZ_BYTES(modify_hca_vport_context_out)];
+ int is_group_manager;
+ void *in;
+ int err;
+ void *ctx;
+
+ mlx5_core_dbg(dev, "vf_num %d\n", vf_num);
+ is_group_manager = MLX5_CAP_GEN(dev, vport_group_manager);
+ vf_num += 1;
+ if (!is_valid_vf(dev, vf_num)) {
+ mlx5_core_warn(dev, "invalid vf number %d\n", vf_num);
+ return -EINVAL;
+ }
+
+ in = kzalloc(in_sz, GFP_KERNEL);
+ if (!in)
+ return -ENOMEM;
+
+ memset(out, 0, sizeof(out));
+ MLX5_SET(modify_hca_vport_context_in, in, opcode, MLX5_CMD_OP_MODIFY_HCA_VPORT_CONTEXT);
+ if (other_vport) {
+ if (is_group_manager) {
+ MLX5_SET(modify_hca_vport_context_in, in, other_vport, 1);
+ MLX5_SET(modify_hca_vport_context_in, in, vport_number, vf_num);
+ } else {
+ err = -EPERM;
+ goto ex;
+ }
+ }
+
+ if (MLX5_CAP_GEN(dev, num_ports) == 2)
+ MLX5_SET(modify_hca_vport_context_in, in, port_num, port_num);
+
+ ctx = MLX5_ADDR_OF(modify_hca_vport_context_in, in, hca_vport_context);
+ MLX5_SET(hca_vport_context, ctx, field_select, req->field_select);
+ MLX5_SET(hca_vport_context, ctx, sm_virt_aware, req->sm_virt_aware);
+ MLX5_SET(hca_vport_context, ctx, has_smi, req->has_smi);
+ MLX5_SET(hca_vport_context, ctx, has_raw, req->has_raw);
+ MLX5_SET(hca_vport_context, ctx, vport_state_policy, req->policy);
+ MLX5_SET(hca_vport_context, ctx, port_physical_state, req->phys_state);
+ MLX5_SET(hca_vport_context, ctx, vport_state, req->vport_state);
+ MLX5_SET64(hca_vport_context, ctx, port_guid, req->port_guid);
+ MLX5_SET64(hca_vport_context, ctx, node_guid, req->node_guid);
+ MLX5_SET(hca_vport_context, ctx, cap_mask1, req->cap_mask1);
+ MLX5_SET(hca_vport_context, ctx, cap_mask1_field_select, req->cap_mask1_perm);
+ MLX5_SET(hca_vport_context, ctx, cap_mask2, req->cap_mask2);
+ MLX5_SET(hca_vport_context, ctx, cap_mask2_field_select, req->cap_mask2_perm);
+ MLX5_SET(hca_vport_context, ctx, lid, req->lid);
+ MLX5_SET(hca_vport_context, ctx, init_type_reply, req->init_type_reply);
+ MLX5_SET(hca_vport_context, ctx, lmc, req->lmc);
+ MLX5_SET(hca_vport_context, ctx, subnet_timeout, req->subnet_timeout);
+ MLX5_SET(hca_vport_context, ctx, sm_lid, req->sm_lid);
+ MLX5_SET(hca_vport_context, ctx, sm_sl, req->sm_sl);
+ MLX5_SET(hca_vport_context, ctx, qkey_violation_counter, req->qkey_violation_counter);
+ MLX5_SET(hca_vport_context, ctx, pkey_violation_counter, req->pkey_violation_counter);
+ err = mlx5_cmd_exec(dev, in, in_sz, out, sizeof(out));
+ if (err)
+ goto ex;
+
+ err = mlx5_cmd_status_to_err_v2(out);
+
+ex:
+ kfree(in);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_modify_hca_vport_context);
+
+int mlx5_core_query_hca_vport_context(struct mlx5_core_dev *dev,
+ u8 other_vport, u8 port_num,
+ u16 vf_num,
+ struct mlx5_hca_vport_context *rep)
+{
+ int out_sz = MLX5_ST_SZ_BYTES(query_hca_vport_context_out);
+ u32 in[MLX5_ST_SZ_DW(query_hca_vport_context_in)];
+ int is_group_manager;
+ void *out;
+ void *ctx;
+ int err;
+
+ mlx5_core_dbg(dev, "vf_num %d\n", vf_num);
+ is_group_manager = MLX5_CAP_GEN(dev, vport_group_manager);
+ vf_num += 1;
+ if (!is_valid_vf(dev, vf_num)) {
+ mlx5_core_warn(dev, "invalid vf number %d\n", vf_num);
+ return -EINVAL;
+ }
+
+ memset(in, 0, sizeof(in));
+ out = kzalloc(out_sz, GFP_KERNEL);
+ if (!out)
+ return -ENOMEM;
+
+ MLX5_SET(query_hca_vport_context_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_VPORT_CONTEXT);
+
+ if (other_vport) {
+ if (is_group_manager) {
+ MLX5_SET(query_hca_vport_context_in, in, other_vport, 1);
+ MLX5_SET(query_hca_vport_context_in, in, vport_number, vf_num);
+ } else {
+ err = -EPERM;
+ goto ex;
+ }
+ }
+
+ if (MLX5_CAP_GEN(dev, num_ports) == 2)
+ MLX5_SET(query_hca_vport_context_in, in, port_num, port_num);
+
+ err = mlx5_cmd_exec(dev, in, sizeof(in), out, out_sz);
+ if (err)
+ goto ex;
+ err = mlx5_cmd_status_to_err_v2(out);
+ if (err)
+ goto ex;
+
+ ctx = MLX5_ADDR_OF(query_hca_vport_context_out, out, hca_vport_context);
+ rep->field_select = MLX5_GET_PR(hca_vport_context, ctx, field_select);
+ rep->sm_virt_aware = MLX5_GET_PR(hca_vport_context, ctx, sm_virt_aware);
+ rep->has_smi = MLX5_GET_PR(hca_vport_context, ctx, has_smi);
+ rep->has_raw = MLX5_GET_PR(hca_vport_context, ctx, has_raw);
+ rep->policy = MLX5_GET_PR(hca_vport_context, ctx, vport_state_policy);
+ rep->phys_state = MLX5_GET_PR(hca_vport_context, ctx,
+ port_physical_state);
+ rep->vport_state = MLX5_GET_PR(hca_vport_context, ctx, vport_state);
+ rep->port_physical_state = MLX5_GET_PR(hca_vport_context, ctx,
+ port_physical_state);
+ rep->port_guid = MLX5_GET64_PR(hca_vport_context, ctx, port_guid);
+ rep->node_guid = MLX5_GET64_PR(hca_vport_context, ctx, node_guid);
+ rep->cap_mask1 = MLX5_GET_PR(hca_vport_context, ctx, cap_mask1);
+ rep->cap_mask1_perm = MLX5_GET_PR(hca_vport_context, ctx,
+ cap_mask1_field_select);
+ rep->cap_mask2 = MLX5_GET_PR(hca_vport_context, ctx, cap_mask2);
+ rep->cap_mask2_perm = MLX5_GET_PR(hca_vport_context, ctx,
+ cap_mask2_field_select);
+ rep->lid = MLX5_GET_PR(hca_vport_context, ctx, lid);
+ rep->init_type_reply = MLX5_GET_PR(hca_vport_context, ctx,
+ init_type_reply);
+ rep->lmc = MLX5_GET_PR(hca_vport_context, ctx, lmc);
+ rep->subnet_timeout = MLX5_GET_PR(hca_vport_context, ctx,
+ subnet_timeout);
+ rep->sm_lid = MLX5_GET_PR(hca_vport_context, ctx, sm_lid);
+ rep->sm_sl = MLX5_GET_PR(hca_vport_context, ctx, sm_sl);
+ rep->qkey_violation_counter = MLX5_GET_PR(hca_vport_context, ctx,
+ qkey_violation_counter);
+ rep->pkey_violation_counter = MLX5_GET_PR(hca_vport_context, ctx,
+ pkey_violation_counter);
+ rep->grh_required = MLX5_GET_PR(hca_vport_context, ctx, grh_required);
+ rep->sys_image_guid = MLX5_GET64_PR(hca_vport_context, ctx,
+ system_image_guid);
+
+ex:
+ kfree(out);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_query_hca_vport_context);
+
+int mlx5_core_query_vport_counter(struct mlx5_core_dev *dev, u8 other_vport,
+ u8 port_num, u16 vf_num,
+ struct mlx5_vport_counters *vc)
+{
+ int out_sz = MLX5_ST_SZ_BYTES(query_vport_counter_out);
+ int in_sz = MLX5_ST_SZ_BYTES(query_vport_counter_in);
+ int is_group_manager;
+ void *out;
+ void *in;
+ int err;
+
+ mlx5_core_dbg(dev, "vf_num %d\n", vf_num);
+ is_group_manager = MLX5_CAP_GEN(dev, vport_group_manager);
+ vf_num += 1;
+ if (!is_valid_vf(dev, vf_num)) {
+ mlx5_core_warn(dev, "invalid vf number %d\n", vf_num);
+ return -EINVAL;
+ }
+
+ in = kzalloc(in_sz, GFP_KERNEL);
+ out = kzalloc(out_sz, GFP_KERNEL);
+ if (!in || !out) {
+ err = -ENOMEM;
+ goto ex;
+ }
+
+ MLX5_SET(query_vport_counter_in, in, opcode, MLX5_CMD_OP_QUERY_VPORT_COUNTER);
+ if (other_vport) {
+ if (is_group_manager) {
+ MLX5_SET(query_vport_counter_in, in, other_vport, 1);
+ MLX5_SET(query_vport_counter_in, in, vport_number, vf_num);
+ } else {
+ err = -EPERM;
+ goto ex;
+ }
+ }
+ if (MLX5_CAP_GEN(dev, num_ports) == 2)
+ MLX5_SET(query_vport_counter_in, in, port_num, port_num);
+
+ err = mlx5_cmd_exec(dev, in, in_sz, out, out_sz);
+ if (err)
+ goto ex;
+ err = mlx5_cmd_status_to_err_v2(out);
+ if (err)
+ goto ex;
+
+ vc->received_errors.packets = MLX5_GET64_PR(query_vport_counter_out, out, received_errors.packets);
+ vc->received_errors.octets = MLX5_GET64_PR(query_vport_counter_out, out, received_errors.octets);
+ vc->transmit_errors.packets = MLX5_GET64_PR(query_vport_counter_out, out, transmit_errors.packets);
+ vc->transmit_errors.octets = MLX5_GET64_PR(query_vport_counter_out, out, transmit_errors.octets);
+ vc->received_ib_unicast.packets = MLX5_GET64_PR(query_vport_counter_out, out, received_ib_unicast.packets);
+ vc->received_ib_unicast.octets = MLX5_GET64_PR(query_vport_counter_out, out, received_ib_unicast.octets);
+ vc->transmitted_ib_unicast.packets = MLX5_GET64_PR(query_vport_counter_out, out, transmitted_ib_unicast.packets);
+ vc->transmitted_ib_unicast.octets = MLX5_GET64_PR(query_vport_counter_out, out, transmitted_ib_unicast.octets);
+ vc->received_ib_multicast.packets = MLX5_GET64_PR(query_vport_counter_out, out, received_ib_multicast.packets);
+ vc->received_ib_multicast.octets = MLX5_GET64_PR(query_vport_counter_out, out, received_ib_multicast.octets);
+ vc->transmitted_ib_multicast.packets = MLX5_GET64_PR(query_vport_counter_out, out, transmitted_ib_multicast.packets);
+ vc->transmitted_ib_multicast.octets = MLX5_GET64_PR(query_vport_counter_out, out, transmitted_ib_multicast.octets);
+ vc->received_eth_broadcast.packets = MLX5_GET64_PR(query_vport_counter_out, out, received_eth_broadcast.packets);
+ vc->received_eth_broadcast.octets = MLX5_GET64_PR(query_vport_counter_out, out, received_eth_broadcast.octets);
+ vc->transmitted_eth_broadcast.packets = MLX5_GET64_PR(query_vport_counter_out, out, transmitted_eth_broadcast.packets);
+ vc->transmitted_eth_broadcast.octets = MLX5_GET64_PR(query_vport_counter_out, out, transmitted_eth_broadcast.octets);
+
+ex:
+ kfree(in);
+ kfree(out);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_query_vport_counter);
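Note on mlx5_core_check_enable_vf_hca() above: a VF is enabled only after
both the node GUID and the port GUID have been selected, and at most once.
The following is a minimal user-space sketch of that gating only. The bit
values, the vf_ctx type, and the vf_note_fields() helper are hypothetical
names made up for illustration and are not part of the patch. The sketch
also assumes the enable command always succeeds, whereas the patch sets
ctx->enabled only when mlx5_core_enable_hca() returns 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values; the real ones come from the mlx5 headers. */
#define SEL_PORT_GUID (1u << 0)
#define SEL_NODE_GUID (1u << 1)
#define ENABLE_MASK   (SEL_PORT_GUID | SEL_NODE_GUID)

/* Hypothetical stand-in for the per-VF context used in the patch. */
struct vf_ctx {
	uint32_t state_mask; /* GUID fields seen so far */
	int enabled;         /* set once the VF has been enabled */
};

/*
 * Record the fields selected by one modify request.
 * Returns 1 if this call is the one that should enable the VF,
 * 0 if fields are still missing or the VF is already enabled.
 */
static int vf_note_fields(struct vf_ctx *ctx, uint32_t field_select)
{
	/* Only the two GUID bits take part in the gating. */
	ctx->state_mask |= field_select & ENABLE_MASK;

	if (ctx->state_mask != ENABLE_MASK || ctx->enabled)
		return 0;

	/* Assumption: the enable command succeeds. */
	ctx->enabled = 1;
	return 1;
}
```

With a zeroed vf_ctx, a request that selects only the node GUID returns 0,
a later request that selects the port GUID returns 1, and any request after
that returns 0 again.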
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/qp.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/qp.c
new file mode 100644
index 0000000..c9810c5
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/qp.c
@@ -0,0 +1,639 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+
+
+#include "mlx5_core.h"
+
+static struct mlx5_core_rsc_common *mlx5_get_rsc(struct mlx5_core_dev *dev,
+ u32 rsn)
+{
+ struct mlx5_qp_table *table = &dev->priv.qp_table;
+ struct mlx5_core_rsc_common *common;
+
+ spin_lock(&table->lock);
+
+ common = radix_tree_lookup(&table->tree, rsn);
+ if (common)
+ atomic_inc(&common->refcount);
+
+ spin_unlock(&table->lock);
+
+ if (!common) {
+ mlx5_core_warn(dev, "Async event for bogus resource 0x%x\n",
+ rsn);
+ return NULL;
+ }
+ return common;
+}
+
+void mlx5_core_put_rsc(struct mlx5_core_rsc_common *common)
+{
+ if (atomic_dec_and_test(&common->refcount))
+ complete(&common->free);
+}
+
+int mlx5_rsc_event(struct mlx5_core_dev *dev, u32 rsn, int event_type)
+{
+ struct mlx5_core_rsc_common *common = mlx5_get_rsc(dev, rsn);
+ struct mlx5_core_dct *dct;
+ struct mlx5_core_qp *qp;
+
+ if (!common)
+ return -1;
+
+ switch (common->res) {
+ case MLX5_RES_QP:
+ qp = (struct mlx5_core_qp *)common;
+ qp->event(qp, event_type);
+ break;
+
+ case MLX5_RES_DCT:
+ dct = (struct mlx5_core_dct *)common;
+ if (event_type == MLX5_EVENT_TYPE_DCT_DRAINED)
+ complete(&dct->drained);
+ else
+ dct->event(dct, event_type);
+ break;
+
+ default:
+ mlx5_core_warn(dev, "invalid resource type for 0x%x\n", rsn);
+ }
+
+ mlx5_core_put_rsc(common);
+ return 0;
+}
+
+#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
+void mlx5_eq_pagefault(struct mlx5_core_dev *dev, struct mlx5_eqe *eqe)
+{
+ struct mlx5_eqe_page_fault *pf_eqe = &eqe->data.page_fault;
+ int qpn = be32_to_cpu(pf_eqe->flags_qpn) & MLX5_QPN_MASK;
+ struct mlx5_core_rsc_common *common = mlx5_get_rsc(dev, qpn);
+ struct mlx5_core_qp *qp =
+ container_of(common, struct mlx5_core_qp, common);
+ struct mlx5_pagefault pfault;
+
+ if (!qp) {
+ mlx5_core_warn(dev, "ODP event for non-existent QP %06x\n",
+ qpn);
+ return;
+ }
+
+ pfault.event_subtype = eqe->sub_type;
+ pfault.flags = (be32_to_cpu(pf_eqe->flags_qpn) >> MLX5_QPN_BITS) &
+ (MLX5_PFAULT_REQUESTOR | MLX5_PFAULT_WRITE | MLX5_PFAULT_RDMA);
+ pfault.bytes_committed = be32_to_cpu(
+ pf_eqe->bytes_committed);
+
+ mlx5_core_dbg(dev,
+ "PAGE_FAULT: subtype: 0x%02x, flags: 0x%02x,\n",
+ eqe->sub_type, pfault.flags);
+
+ switch (eqe->sub_type) {
+ case MLX5_PFAULT_SUBTYPE_RDMA:
+ /* RDMA based event */
+ pfault.rdma.r_key =
+ be32_to_cpu(pf_eqe->rdma.r_key);
+ pfault.rdma.packet_size =
+ be16_to_cpu(pf_eqe->rdma.packet_length);
+ pfault.rdma.rdma_op_len =
+ be32_to_cpu(pf_eqe->rdma.rdma_op_len);
+ pfault.rdma.rdma_va =
+ be64_to_cpu(pf_eqe->rdma.rdma_va);
+ mlx5_core_dbg(dev,
+ "PAGE_FAULT: qpn: 0x%06x, r_key: 0x%08x,\n",
+ qpn, pfault.rdma.r_key);
+ mlx5_core_dbg(dev,
+ "PAGE_FAULT: rdma_op_len: 0x%08x,\n",
+ pfault.rdma.rdma_op_len);
+ mlx5_core_dbg(dev,
+ "PAGE_FAULT: rdma_va: 0x%016llx,\n",
+ pfault.rdma.rdma_va);
+ mlx5_core_dbg(dev,
+ "PAGE_FAULT: bytes_committed: 0x%06x\n",
+ pfault.bytes_committed);
+ break;
+
+ case MLX5_PFAULT_SUBTYPE_WQE:
+ /* WQE based event */
+ pfault.wqe.wqe_index =
+ be16_to_cpu(pf_eqe->wqe.wqe_index);
+ pfault.wqe.packet_size =
+ be16_to_cpu(pf_eqe->wqe.packet_length);
+ mlx5_core_dbg(dev,
+ "PAGE_FAULT: qpn: 0x%06x, wqe_index: 0x%04x,\n",
+ qpn, pfault.wqe.wqe_index);
+ mlx5_core_dbg(dev,
+ "PAGE_FAULT: bytes_committed: 0x%06x\n",
+ pfault.bytes_committed);
+ break;
+
+ default:
+ mlx5_core_warn(dev,
+ "Unsupported page fault event sub-type: 0x%02hhx, QP %06x\n",
+ eqe->sub_type, qpn);
+ /* Unsupported page faults should still be resolved by the
+ * page fault handler
+ */
+ }
+
+ if (qp->pfault_handler) {
+ qp->pfault_handler(qp, &pfault);
+ } else {
+ mlx5_core_err(dev,
+ "ODP event for QP %08x, without a fault handler in QP\n",
+ qpn);
+ /* Page fault will remain unresolved. QP will hang until it is
+ * destroyed
+ */
+ }
+
+ mlx5_core_put_rsc(common);
+}
+#endif
+
+int mlx5_core_create_qp(struct mlx5_core_dev *dev,
+ struct mlx5_core_qp *qp,
+ struct mlx5_create_qp_mbox_in *in,
+ int inlen)
+{
+ struct mlx5_qp_table *table = &dev->priv.qp_table;
+ struct mlx5_create_qp_mbox_out out;
+ struct mlx5_destroy_qp_mbox_in din;
+ struct mlx5_destroy_qp_mbox_out dout;
+ int err;
+ void *qpc;
+
+ memset(&out, 0, sizeof(out));
+ in->hdr.opcode = cpu_to_be16(MLX5_CMD_OP_CREATE_QP);
+
+ if (dev->issi) {
+ qpc = MLX5_ADDR_OF(create_qp_in, in, qpc);
+ /* 0xffffff means we ask to work with cqe version 0 */
+ MLX5_SET(qpc, qpc, user_index, 0xffffff);
+ }
+
+ err = mlx5_cmd_exec(dev, in, inlen, &out, sizeof(out));
+ if (err) {
+ mlx5_core_warn(dev, "ret %d\n", err);
+ return err;
+ }
+
+ if (out.hdr.status) {
+ mlx5_core_warn(dev, "current num of QPs 0x%x\n",
+ atomic_read(&dev->num_qps));
+ return mlx5_cmd_status_to_err(&out.hdr);
+ }
+
+ qp->qpn = be32_to_cpu(out.qpn) & 0xffffff;
+ mlx5_core_dbg(dev, "qpn = 0x%x\n", qp->qpn);
+
+ qp->common.res = MLX5_RES_QP;
+ spin_lock_irq(&table->lock);
+ err = radix_tree_insert(&table->tree, qp->qpn, qp);
+ spin_unlock_irq(&table->lock);
+ if (err) {
+ mlx5_core_warn(dev, "err %d\n", err);
+ goto err_cmd;
+ }
+
+ err = mlx5_debug_qp_add(dev, qp);
+ if (err)
+ mlx5_core_dbg(dev, "failed adding QP 0x%x to debug file system\n",
+ qp->qpn);
+
+ qp->pid = current->pid;
+ atomic_set(&qp->common.refcount, 1);
+ atomic_inc(&dev->num_qps);
+ init_completion(&qp->common.free);
+
+ return 0;
+
+err_cmd:
+ memset(&din, 0, sizeof(din));
+ memset(&dout, 0, sizeof(dout));
+ din.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_DESTROY_QP);
+ din.qpn = cpu_to_be32(qp->qpn);
+ mlx5_cmd_exec(dev, &din, sizeof(din), &dout, sizeof(dout));
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_create_qp);
+
+int mlx5_core_destroy_qp(struct mlx5_core_dev *dev,
+ struct mlx5_core_qp *qp)
+{
+ struct mlx5_destroy_qp_mbox_in in;
+ struct mlx5_destroy_qp_mbox_out out;
+ struct mlx5_qp_table *table = &dev->priv.qp_table;
+ unsigned long flags;
+ int err;
+
+ mlx5_debug_qp_remove(dev, qp);
+
+ spin_lock_irqsave(&table->lock, flags);
+ radix_tree_delete(&table->tree, qp->qpn);
+ spin_unlock_irqrestore(&table->lock, flags);
+
+ mlx5_core_put_rsc((struct mlx5_core_rsc_common *)qp);
+ wait_for_completion(&qp->common.free);
+
+ memset(&in, 0, sizeof(in));
+ memset(&out, 0, sizeof(out));
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_DESTROY_QP);
+ in.qpn = cpu_to_be32(qp->qpn);
+ err = mlx5_cmd_exec(dev, &in, sizeof(in), &out, sizeof(out));
+ if (err)
+ return err;
+
+ if (out.hdr.status)
+ return mlx5_cmd_status_to_err(&out.hdr);
+
+ atomic_dec(&dev->num_qps);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_destroy_qp);
+
+int mlx5_core_qp_modify(struct mlx5_core_dev *dev, enum mlx5_qp_state cur_state,
+ enum mlx5_qp_state new_state,
+ struct mlx5_modify_qp_mbox_in *in, int sqd_event,
+ struct mlx5_core_qp *qp)
+{
+ static const u16 optab[MLX5_QP_NUM_STATE][MLX5_QP_NUM_STATE] = {
+ [MLX5_QP_STATE_RST] = {
+ [MLX5_QP_STATE_RST] = MLX5_CMD_OP_2RST_QP,
+ [MLX5_QP_STATE_ERR] = MLX5_CMD_OP_2ERR_QP,
+ [MLX5_QP_STATE_INIT] = MLX5_CMD_OP_RST2INIT_QP,
+ },
+ [MLX5_QP_STATE_INIT] = {
+ [MLX5_QP_STATE_RST] = MLX5_CMD_OP_2RST_QP,
+ [MLX5_QP_STATE_ERR] = MLX5_CMD_OP_2ERR_QP,
+ [MLX5_QP_STATE_INIT] = MLX5_CMD_OP_INIT2INIT_QP,
+ [MLX5_QP_STATE_RTR] = MLX5_CMD_OP_INIT2RTR_QP,
+ },
+ [MLX5_QP_STATE_RTR] = {
+ [MLX5_QP_STATE_RST] = MLX5_CMD_OP_2RST_QP,
+ [MLX5_QP_STATE_ERR] = MLX5_CMD_OP_2ERR_QP,
+ [MLX5_QP_STATE_RTS] = MLX5_CMD_OP_RTR2RTS_QP,
+ },
+ [MLX5_QP_STATE_RTS] = {
+ [MLX5_QP_STATE_RST] = MLX5_CMD_OP_2RST_QP,
+ [MLX5_QP_STATE_ERR] = MLX5_CMD_OP_2ERR_QP,
+ [MLX5_QP_STATE_RTS] = MLX5_CMD_OP_RTS2RTS_QP,
+ },
+ [MLX5_QP_STATE_SQD] = {
+ [MLX5_QP_STATE_RST] = MLX5_CMD_OP_2RST_QP,
+ [MLX5_QP_STATE_ERR] = MLX5_CMD_OP_2ERR_QP,
+ },
+ [MLX5_QP_STATE_SQER] = {
+ [MLX5_QP_STATE_RST] = MLX5_CMD_OP_2RST_QP,
+ [MLX5_QP_STATE_ERR] = MLX5_CMD_OP_2ERR_QP,
+ [MLX5_QP_STATE_RTS] = MLX5_CMD_OP_SQERR2RTS_QP,
+ },
+ [MLX5_QP_STATE_ERR] = {
+ [MLX5_QP_STATE_RST] = MLX5_CMD_OP_2RST_QP,
+ [MLX5_QP_STATE_ERR] = MLX5_CMD_OP_2ERR_QP,
+ }
+ };
+
+ struct mlx5_modify_qp_mbox_out out;
+ int err = 0;
+ u16 op;
+
+ if (cur_state >= MLX5_QP_NUM_STATE || new_state >= MLX5_QP_NUM_STATE ||
+ !optab[cur_state][new_state])
+ return -EINVAL;
+
+ memset(&out, 0, sizeof(out));
+ op = optab[cur_state][new_state];
+ in->hdr.opcode = cpu_to_be16(op);
+ in->qpn = cpu_to_be32(qp->qpn);
+ err = mlx5_cmd_exec(dev, in, sizeof(*in), &out, sizeof(out));
+ if (err)
+ return err;
+
+ return mlx5_cmd_status_to_err(&out.hdr);
+}
+EXPORT_SYMBOL_GPL(mlx5_core_qp_modify);
+
+void mlx5_init_qp_table(struct mlx5_core_dev *dev)
+{
+ struct mlx5_qp_table *table = &dev->priv.qp_table;
+
+ memset(table, 0, sizeof(*table));
+ spin_lock_init(&table->lock);
+ INIT_RADIX_TREE(&table->tree, GFP_ATOMIC);
+ mlx5_qp_debugfs_init(dev);
+}
+
+void mlx5_cleanup_qp_table(struct mlx5_core_dev *dev)
+{
+ mlx5_qp_debugfs_cleanup(dev);
+}
+
+void mlx5_init_dct_table(struct mlx5_core_dev *dev)
+{
+ mlx5_dct_debugfs_init(dev);
+}
+
+void mlx5_cleanup_dct_table(struct mlx5_core_dev *dev)
+{
+ mlx5_dct_debugfs_cleanup(dev);
+}
+
+int mlx5_core_qp_query(struct mlx5_core_dev *dev, struct mlx5_core_qp *qp,
+ struct mlx5_query_qp_mbox_out *out, int outlen)
+{
+ struct mlx5_query_qp_mbox_in in;
+ int err;
+
+ memset(&in, 0, sizeof(in));
+ memset(out, 0, outlen);
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_QUERY_QP);
+ in.qpn = cpu_to_be32(qp->qpn);
+ err = mlx5_cmd_exec(dev, &in, sizeof(in), out, outlen);
+ if (err)
+ return err;
+
+ if (out->hdr.status)
+ return mlx5_cmd_status_to_err(&out->hdr);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_qp_query);
+
+int mlx5_core_xrcd_alloc(struct mlx5_core_dev *dev, u32 *xrcdn)
+{
+ struct mlx5_alloc_xrcd_mbox_in in;
+ struct mlx5_alloc_xrcd_mbox_out out;
+ int err;
+
+ memset(&in, 0, sizeof(in));
+ memset(&out, 0, sizeof(out));
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_ALLOC_XRCD);
+ err = mlx5_cmd_exec(dev, &in, sizeof(in), &out, sizeof(out));
+ if (err)
+ return err;
+
+ if (out.hdr.status)
+ err = mlx5_cmd_status_to_err(&out.hdr);
+ else
+ *xrcdn = be32_to_cpu(out.xrcdn) & 0xffffff;
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_xrcd_alloc);
+
+int mlx5_core_xrcd_dealloc(struct mlx5_core_dev *dev, u32 xrcdn)
+{
+ struct mlx5_dealloc_xrcd_mbox_in in;
+ struct mlx5_dealloc_xrcd_mbox_out out;
+ int err;
+
+ memset(&in, 0, sizeof(in));
+ memset(&out, 0, sizeof(out));
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_DEALLOC_XRCD);
+ in.xrcdn = cpu_to_be32(xrcdn);
+ err = mlx5_cmd_exec(dev, &in, sizeof(in), &out, sizeof(out));
+ if (err)
+ return err;
+
+ if (out.hdr.status)
+ err = mlx5_cmd_status_to_err(&out.hdr);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_xrcd_dealloc);
+
+int mlx5_core_create_dct(struct mlx5_core_dev *dev,
+ struct mlx5_core_dct *dct,
+ struct mlx5_create_dct_mbox_in *in)
+{
+ struct mlx5_qp_table *table = &dev->priv.qp_table;
+ struct mlx5_create_dct_mbox_out out;
+ struct mlx5_destroy_dct_mbox_in din;
+ struct mlx5_destroy_dct_mbox_out dout;
+ int err;
+ void *dctc;
+
+ init_completion(&dct->drained);
+ memset(&out, 0, sizeof(out));
+ in->hdr.opcode = cpu_to_be16(MLX5_CMD_OP_CREATE_DCT);
+
+ if (dev->issi) {
+ dctc = MLX5_ADDR_OF(create_dct_in, in, dct_context_entry);
+ /* 0xffffff means we ask to work with cqe version 0 */
+ MLX5_SET(dctc, dctc, user_index, 0xffffff);
+ }
+
+ err = mlx5_cmd_exec(dev, in, sizeof(*in), &out, sizeof(out));
+ if (err) {
+ mlx5_core_warn(dev, "create DCT failed, ret %d\n", err);
+ return err;
+ }
+
+ if (out.hdr.status)
+ return mlx5_cmd_status_to_err(&out.hdr);
+
+ dct->dctn = be32_to_cpu(out.dctn) & 0xffffff;
+
+ dct->common.res = MLX5_RES_DCT;
+ spin_lock_irq(&table->lock);
+ err = radix_tree_insert(&table->tree, dct->dctn, dct);
+ spin_unlock_irq(&table->lock);
+ if (err) {
+ mlx5_core_warn(dev, "err %d\n", err);
+ goto err_cmd;
+ }
+
+ err = mlx5_debug_dct_add(dev, dct);
+ if (err)
+ mlx5_core_dbg(dev, "failed adding DCT 0x%x to debug file system\n",
+ dct->dctn);
+
+ dct->pid = current->pid;
+ atomic_set(&dct->common.refcount, 1);
+ init_completion(&dct->common.free);
+
+ return 0;
+
+err_cmd:
+ memset(&din, 0, sizeof(din));
+ memset(&dout, 0, sizeof(dout));
+ din.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_DESTROY_DCT);
+ din.dctn = cpu_to_be32(dct->dctn);
+ mlx5_cmd_exec(dev, &din, sizeof(din), &dout, sizeof(dout));
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_create_dct);
+
+static int mlx5_core_drain_dct(struct mlx5_core_dev *dev,
+ struct mlx5_core_dct *dct)
+{
+ struct mlx5_drain_dct_mbox_out out;
+ struct mlx5_drain_dct_mbox_in in;
+ int err;
+
+ memset(&in, 0, sizeof(in));
+ memset(&out, 0, sizeof(out));
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_DRAIN_DCT);
+ in.dctn = cpu_to_be32(dct->dctn);
+ err = mlx5_cmd_exec(dev, &in, sizeof(in), &out, sizeof(out));
+ if (err)
+ return err;
+
+ if (out.hdr.status)
+ return mlx5_cmd_status_to_err(&out.hdr);
+
+ return 0;
+}
+
+int mlx5_core_destroy_dct(struct mlx5_core_dev *dev,
+ struct mlx5_core_dct *dct)
+{
+ struct mlx5_qp_table *table = &dev->priv.qp_table;
+ struct mlx5_destroy_dct_mbox_out out;
+ struct mlx5_destroy_dct_mbox_in in;
+ unsigned long flags;
+ int err;
+
+ err = mlx5_core_drain_dct(dev, dct);
+ if (err) {
+ mlx5_core_warn(dev, "failed drain DCT 0x%x\n", dct->dctn);
+ return err;
+ }
+
+ wait_for_completion(&dct->drained);
+
+ mlx5_debug_dct_remove(dev, dct);
+
+ spin_lock_irqsave(&table->lock, flags);
+ if (radix_tree_delete(&table->tree, dct->dctn) != dct)
+ mlx5_core_warn(dev, "dct delete differs\n");
+ spin_unlock_irqrestore(&table->lock, flags);
+
+ if (atomic_dec_and_test(&dct->common.refcount))
+ complete(&dct->common.free);
+ wait_for_completion(&dct->common.free);
+
+ memset(&in, 0, sizeof(in));
+ memset(&out, 0, sizeof(out));
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_DESTROY_DCT);
+ in.dctn = cpu_to_be32(dct->dctn);
+ err = mlx5_cmd_exec(dev, &in, sizeof(in), &out, sizeof(out));
+ if (err)
+ return err;
+
+ if (out.hdr.status)
+ return mlx5_cmd_status_to_err(&out.hdr);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_destroy_dct);
+
+int mlx5_core_dct_query(struct mlx5_core_dev *dev, struct mlx5_core_dct *dct,
+ struct mlx5_query_dct_mbox_out *out)
+{
+ struct mlx5_query_dct_mbox_in in;
+ int err;
+
+ memset(&in, 0, sizeof(in));
+ memset(out, 0, sizeof(*out));
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_QUERY_DCT);
+ in.dctn = cpu_to_be32(dct->dctn);
+ err = mlx5_cmd_exec(dev, &in, sizeof(in), out, sizeof(*out));
+ if (err)
+ return err;
+
+ if (out->hdr.status)
+ return mlx5_cmd_status_to_err(&out->hdr);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_dct_query);
+
+int mlx5_core_arm_dct(struct mlx5_core_dev *dev, struct mlx5_core_dct *dct)
+{
+ struct mlx5_arm_dct_mbox_out out;
+ struct mlx5_arm_dct_mbox_in in;
+ int err;
+
+ memset(&in, 0, sizeof(in));
+ memset(&out, 0, sizeof(out));
+
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_ARM_DCT_FOR_KEY_VIOLATION);
+ in.dctn = cpu_to_be32(dct->dctn);
+ err = mlx5_cmd_exec(dev, &in, sizeof(in), &out, sizeof(out));
+ if (err)
+ return err;
+
+ if (out.hdr.status)
+ return mlx5_cmd_status_to_err(&out.hdr);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_arm_dct);
+#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
+int mlx5_core_page_fault_resume(struct mlx5_core_dev *dev, u32 qpn,
+ u8 flags, int error)
+{
+ struct mlx5_page_fault_resume_mbox_in in;
+ struct mlx5_page_fault_resume_mbox_out out;
+ int err;
+
+ memset(&in, 0, sizeof(in));
+ memset(&out, 0, sizeof(out));
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_PAGE_FAULT_RESUME);
+ in.hdr.opmod = 0;
+ flags &= (MLX5_PAGE_FAULT_RESUME_REQUESTOR |
+ MLX5_PAGE_FAULT_RESUME_WRITE |
+ MLX5_PAGE_FAULT_RESUME_RDMA);
+ flags |= (error ? MLX5_PAGE_FAULT_RESUME_ERROR : 0);
+ in.flags_qpn = cpu_to_be32((qpn & MLX5_QPN_MASK) |
+ (flags << MLX5_QPN_BITS));
+ err = mlx5_cmd_exec(dev, &in, sizeof(in), &out, sizeof(out));
+ if (err)
+ return err;
+
+ if (out.hdr.status)
+ err = mlx5_cmd_status_to_err(&out.hdr);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_page_fault_resume);
+#endif
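Note on the optab lookup in mlx5_core_qp_modify() above: the table is built
with designated initializers, so every transition that is not listed stays
zero, and a zero entry doubles as "transition not allowed". The following
is a minimal user-space sketch of the same technique. The state names, the
opcode values, and the qp_transition_op() helper are hypothetical and
chosen only for illustration; the real values live in the mlx5 headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced set of QP states. */
enum qp_state { QP_RST, QP_INIT, QP_RTR, QP_RTS, QP_ERR, QP_NUM_STATE };

/* Hypothetical opcodes; all nonzero so that 0 can mean "invalid". */
enum {
	OP_2RST = 1,
	OP_2ERR,
	OP_RST2INIT,
	OP_INIT2INIT,
	OP_INIT2RTR,
	OP_RTR2RTS,
	OP_RTS2RTS,
};

/* optab[cur][next] holds the command opcode; unlisted entries are 0. */
static const uint16_t optab[QP_NUM_STATE][QP_NUM_STATE] = {
	[QP_RST] = {
		[QP_RST]  = OP_2RST,
		[QP_ERR]  = OP_2ERR,
		[QP_INIT] = OP_RST2INIT,
	},
	[QP_INIT] = {
		[QP_RST]  = OP_2RST,
		[QP_ERR]  = OP_2ERR,
		[QP_INIT] = OP_INIT2INIT,
		[QP_RTR]  = OP_INIT2RTR,
	},
	[QP_RTR] = {
		[QP_RST]  = OP_2RST,
		[QP_ERR]  = OP_2ERR,
		[QP_RTS]  = OP_RTR2RTS,
	},
	[QP_RTS] = {
		[QP_RST]  = OP_2RST,
		[QP_ERR]  = OP_2ERR,
		[QP_RTS]  = OP_RTS2RTS,
	},
	[QP_ERR] = {
		[QP_RST]  = OP_2RST,
		[QP_ERR]  = OP_2ERR,
	},
};

/* Returns the opcode for cur -> next, or 0 if it is not allowed. */
static uint16_t qp_transition_op(unsigned int cur, unsigned int next)
{
	/* Check the bounds before indexing, as the patch does. */
	if (cur >= QP_NUM_STATE || next >= QP_NUM_STATE)
		return 0;
	return optab[cur][next];
}
```

The bounds check has to come before the table access; the patch gets this
ordering from the short-circuit || in its combined condition.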
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/sriov.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/sriov.c
new file mode 100644
index 0000000..21589fc
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/sriov.c
@@ -0,0 +1,525 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2014, Mellanox Technologies inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx5_core.h"
+
+static void mlx5_destroy_vfs_sysfs(struct mlx5_core_dev *dev);
+static int mlx5_create_vfs_sysfs(struct mlx5_core_dev *dev, int num_vfs);
+
+#if (LINUX_VERSION_CODE <= KERNEL_VERSION(2,6,39))
+static int mlx5_pci_num_vf(struct pci_dev *pdev)
+{
+ struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
+
+ if (!mlx5_core_is_pf(dev))
+ return 0;
+
+ return dev->priv.sriov.num_vfs;
+}
+#endif
+
+static void mlx5_core_destroy_vfs(struct pci_dev *pdev)
+{
+ struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
+ struct mlx5_core_sriov *sriov = &dev->priv.sriov;
+#if (LINUX_VERSION_CODE <= KERNEL_VERSION(2,6,39))
+ int num_vfs = mlx5_pci_num_vf(pdev);
+#else
+ int num_vfs = pci_num_vf(pdev);
+#endif
+ int err;
+ int vf;
+
+ for (vf = 1; vf <= num_vfs; vf++) {
+ if (sriov->vfs_ctx[vf - 1].enabled) {
+ err = mlx5_core_disable_hca(dev, vf);
+ if (err)
+ mlx5_core_warn(dev, "disable_hca for vf %d failed: %d\n", vf, err);
+ }
+ }
+}
+
+static int mlx5_core_create_vfs(struct pci_dev *pdev, int num_vfs)
+{
+ int err;
+
+ err = pci_enable_sriov(pdev, num_vfs);
+ if (err)
+ dev_warn(&pdev->dev, "enable sriov failed %d\n", err);
+
+ return err;
+}
+
+static int mlx5_core_sriov_enable(struct pci_dev *pdev, int num_vfs)
+{
+ struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
+ struct mlx5_core_sriov *sriov = &dev->priv.sriov;
+#if (LINUX_VERSION_CODE <= KERNEL_VERSION(2,6,39))
+ int cur_vfs = mlx5_pci_num_vf(pdev);
+#else
+ int cur_vfs = pci_num_vf(pdev);
+#endif
+ int err;
+
+ if (cur_vfs) {
+ if (cur_vfs != num_vfs)
+ mlx5_core_destroy_vfs(pdev);
+ else
+ goto out;
+ }
+ kfree(sriov->vfs_ctx);
+ sriov->vfs_ctx = kcalloc(num_vfs, sizeof(*sriov->vfs_ctx), GFP_ATOMIC);
+ if (!sriov->vfs_ctx)
+ return -ENOMEM;
+
+ err = mlx5_core_create_vfs(pdev, num_vfs);
+ if (err) {
+ kfree(sriov->vfs_ctx);
+ sriov->vfs_ctx = NULL;
+ return err;
+ }
+
+out:
+ return num_vfs;
+}
+
+static void mlx5_core_free_vfs(struct mlx5_core_dev *dev)
+{
+ struct mlx5_core_sriov *sriov;
+ int i;
+
+ if (!mlx5_core_is_pf(dev))
+ return;
+
+ sriov = &dev->priv.sriov;
+ for (i = 0; i < sriov->num_vfs; ++i)
+ if (sriov->vfs_ctx[i].enabled) {
+ mlx5_core_disable_hca(dev, i + 1);
+ sriov->vfs_ctx[i].enabled = 0;
+ }
+}
+
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3,10,0)
+static int mlx5_pci_vfs_assigned(struct pci_dev *pdev)
+{
+ return 0;
+}
+#else
+static int mlx5_pci_vfs_assigned(struct pci_dev *pdev)
+{
+ return pci_vfs_assigned(pdev);
+}
+#endif
+
+int mlx5_core_sriov_configure(struct pci_dev *pdev, int num_vfs)
+{
+ struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
+ struct mlx5_core_sriov *sriov = &dev->priv.sriov;
+ int err;
+
+ return -ENOSYS; /* SR-IOV configuration is stubbed out; the code below is unreachable */
+ if (!mlx5_core_is_pf(dev))
+ return -EPERM;
+
+ if (num_vfs < 0)
+ return -EINVAL;
+
+ if (mlx5_pci_vfs_assigned(pdev) && num_vfs != MLX5_SRIOV_UNLOAD_MAGIC) {
+ mlx5_core_warn(dev, "cannot change while VFs are assigned\n");
+ return -EPERM;
+ }
+
+ if (num_vfs == MLX5_SRIOV_UNLOAD_MAGIC)
+ num_vfs = 0;
+
+ mlx5_destroy_vfs_sysfs(dev);
+
+ if (num_vfs > 0) {
+ err = mlx5_core_sriov_enable(pdev, num_vfs);
+ if (err != num_vfs)
+ dev_warn(&pdev->dev, "mlx5_core_sriov_enable failed %d\n", err);
+ else
+ err = mlx5_create_vfs_sysfs(dev, num_vfs);
+
+ return err;
+ }
+
+ kfree(sriov->vfs_ctx);
+ sriov->vfs_ctx = NULL; /* avoid a double free on the next enable */
+
+ pci_disable_sriov(pdev);
+
+ return 0;
+}
+
+struct guid_attribute {
+ struct attribute attr;
+ ssize_t (*show)(struct mlx5_sriov_vf *, struct guid_attribute *, char *buf);
+ ssize_t (*store)(struct mlx5_sriov_vf *, struct guid_attribute *,
+ const char *buf, size_t count);
+};
+
+static ssize_t guid_attr_show(struct kobject *kobj,
+ struct attribute *attr, char *buf)
+{
+ struct guid_attribute *ga =
+ container_of(attr, struct guid_attribute, attr);
+ struct mlx5_sriov_vf *g = container_of(kobj, struct mlx5_sriov_vf, kobj);
+
+ if (!ga->show)
+ return -EIO;
+
+ return ga->show(g, ga, buf);
+}
+
+static ssize_t guid_attr_store(struct kobject *kobj,
+ struct attribute *attr,
+ const char *buf, size_t size)
+{
+ struct guid_attribute *ga =
+ container_of(attr, struct guid_attribute, attr);
+ struct mlx5_sriov_vf *g = container_of(kobj, struct mlx5_sriov_vf, kobj);
+
+ if (!ga->store)
+ return -EIO;
+
+ return ga->store(g, ga, buf, size);
+}
+
+static ssize_t port_show(struct mlx5_sriov_vf *g, struct guid_attribute *oa,
+ char *buf)
+{
+ struct mlx5_core_dev *dev = g->dev;
+ union ib_gid gid;
+ int err;
+ u8 *p;
+
+ err = mlx5_core_query_gids(dev, 1, 1, g->vf, 0, &gid);
+ if (err) {
+ mlx5_core_warn(dev, "failed to query gid at index 0 for vf %d\n", g->vf);
+ return err;
+ }
+
+ p = &gid.raw[8];
+ err = sprintf(buf, "%02x:%02x:%02x:%02x:%02x:%02x:%02x:%02x\n",
+ p[0], p[1], p[2], p[3], p[4], p[5], p[6], p[7]);
+ return err;
+}
+
+static ssize_t port_store(struct mlx5_sriov_vf *g, struct guid_attribute *oa,
+ const char *buf, size_t count)
+{
+ struct mlx5_core_dev *dev = g->dev;
+ struct mlx5_hca_vport_context *in;
+ u64 guid = 0;
+ int err;
+ unsigned int tmp[8]; /* %x conversions expect unsigned int * */
+ int i;
+
+ err = sscanf(buf, "%02x:%02x:%02x:%02x:%02x:%02x:%02x:%02x\n",
+ &tmp[0], &tmp[1], &tmp[2], &tmp[3], &tmp[4], &tmp[5], &tmp[6], &tmp[7]);
+ if (err != 8)
+ return -EINVAL;
+
+ for (i = 0; i < 8; i++)
+ guid += ((u64)tmp[i] << ((7 - i) * 8));
+
+ in = kzalloc(sizeof(*in), GFP_KERNEL);
+ if (!in)
+ return -ENOMEM;
+
+ in->field_select = MLX5_HCA_VPORT_SEL_PORT_GUID;
+ in->port_guid = guid;
+ err = mlx5_core_modify_hca_vport_context(dev, 1, 1, g->vf, in);
+ if (err) {
+ kfree(in);
+ return err;
+ }
+
+ err = mlx5_core_check_enable_vf_hca(dev, in->field_select, g->vf);
+ if (err)
+ mlx5_core_dbg(dev, "failed to enable hca for VF %d\n", g->vf);
+
+ kfree(in);
+ return count;
+}
+
+static ssize_t node_show(struct mlx5_sriov_vf *g, struct guid_attribute *oa,
+ char *buf)
+{
+ struct mlx5_core_dev *dev = g->dev;
+ struct mlx5_hca_vport_context *rep;
+ __be64 guid;
+
+ int err;
+ u8 *p;
+
+ rep = kzalloc(sizeof(*rep), GFP_KERNEL);
+ if (!rep) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ err = mlx5_core_query_hca_vport_context(dev, 1, 1, g->vf, rep);
+ if (err) {
+ mlx5_core_warn(dev, "failed to query node guid for vf %d (%d)\n",
+ g->vf, err);
+ goto free;
+ }
+
+ guid = cpu_to_be64(rep->node_guid);
+ p = (u8 *)&guid;
+ err = sprintf(buf, "%02x:%02x:%02x:%02x:%02x:%02x:%02x:%02x\n",
+ p[0], p[1], p[2], p[3], p[4], p[5], p[6], p[7]);
+
+free:
+ kfree(rep);
+out:
+ return err;
+}
+
+static ssize_t node_store(struct mlx5_sriov_vf *g, struct guid_attribute *oa,
+ const char *buf, size_t count)
+{
+ struct mlx5_core_dev *dev = g->dev;
+ struct mlx5_hca_vport_context *in;
+ u64 guid = 0;
+ int err;
+ unsigned int tmp[8]; /* %x conversions expect unsigned int * */
+ int i;
+
+ err = sscanf(buf, "%02x:%02x:%02x:%02x:%02x:%02x:%02x:%02x\n",
+ &tmp[0], &tmp[1], &tmp[2], &tmp[3], &tmp[4], &tmp[5], &tmp[6], &tmp[7]);
+ if (err != 8)
+ return -EINVAL;
+
+ for (i = 0; i < 8; i++)
+ guid += ((u64)tmp[i] << ((7 - i) * 8));
+
+ in = kzalloc(sizeof(*in), GFP_KERNEL);
+ if (!in)
+ return -ENOMEM;
+
+ in->field_select = MLX5_HCA_VPORT_SEL_NODE_GUID;
+ in->node_guid = guid;
+ err = mlx5_core_modify_hca_vport_context(dev, 1, 1, g->vf, in);
+ if (err) {
+ kfree(in);
+ return err;
+ }
+
+ err = mlx5_core_check_enable_vf_hca(dev, in->field_select, g->vf);
+ if (err)
+ mlx5_core_dbg(dev, "failed to enable hca for VF %d\n", g->vf);
+
+ kfree(in);
+ return count;
+}
+
+#if (LINUX_VERSION_CODE <= KERNEL_VERSION(2,6,39))
+static struct sysfs_ops guid_sysfs_ops = {
+#else
+static const struct sysfs_ops guid_sysfs_ops = {
+#endif
+ .show = guid_attr_show,
+ .store = guid_attr_store,
+};
+
+#define GUID_ATTR(_name) struct guid_attribute guid_attr_##_name = \
+ __ATTR(_name, 0644, _name##_show, _name##_store)
+
+GUID_ATTR(node);
+GUID_ATTR(port);
+
+static struct attribute *guid_default_attrs[] = {
+ &guid_attr_node.attr,
+ &guid_attr_port.attr,
+ NULL
+};
+
+static struct kobj_type guid_type = {
+ .sysfs_ops = &guid_sysfs_ops,
+ .default_attrs = guid_default_attrs
+};
+
+static int mlx5_create_vfs_sysfs(struct mlx5_core_dev *dev, int num_vfs)
+{
+ struct mlx5_core_sriov *sriov = &dev->priv.sriov;
+ struct mlx5_sriov_vf *tmp;
+ int err;
+ int vf;
+
+ sriov->vfs = kcalloc(num_vfs, sizeof(*sriov->vfs), GFP_KERNEL);
+ if (!sriov->vfs)
+ return -ENOMEM;
+
+ for (vf = 0; vf < num_vfs; vf++) {
+ tmp = &sriov->vfs[vf];
+ tmp->dev = dev;
+ tmp->vf = vf;
+ err = kobject_init_and_add(&tmp->kobj, &guid_type, sriov->config,
+ "%d", vf);
+ if (err)
+ goto err_vf;
+
+ kobject_uevent(&tmp->kobj, KOBJ_ADD);
+ }
+ sriov->num_vfs = num_vfs;
+
+ return 0;
+
+err_vf:
+ for (; vf >= 0; vf--) { /* also put the kobject whose init_and_add failed */
+ tmp = &sriov->vfs[vf];
+ kobject_put(&tmp->kobj);
+ }
+ kfree(sriov->vfs);
+ sriov->vfs = NULL;
+ return err;
+}
+
+static void mlx5_destroy_vfs_sysfs(struct mlx5_core_dev *dev)
+{
+ struct mlx5_core_sriov *sriov = &dev->priv.sriov;
+ struct mlx5_sriov_vf *tmp;
+ int vf;
+
+ mlx5_core_free_vfs(dev);
+ for (vf = 1; vf <= sriov->num_vfs; vf++) {
+ tmp = &sriov->vfs[vf - 1];
+ kobject_put(&tmp->kobj);
+ }
+ sriov->num_vfs = 0;
+ kfree(sriov->vfs);
+ sriov->vfs = NULL;
+}
+
+static ssize_t num_vf_store(struct device *device, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct pci_dev *pdev = container_of(device, struct pci_dev, dev);
+ int req_vfs;
+ int err;
+
+ if (kstrtoint(buf, 0, &req_vfs) || req_vfs < 0)
+ return -EINVAL;
+
+ err = mlx5_core_sriov_configure(pdev, req_vfs);
+ if (err)
+ return err;
+
+ return count;
+}
+
+static ssize_t num_vf_show(struct device *device, struct device_attribute *attr,
+ char *buf)
+{
+ struct pci_dev *pdev = container_of(device, struct pci_dev, dev);
+ struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
+ struct mlx5_core_sriov *sriov = &dev->priv.sriov;
+
+ return sprintf(buf, "%d\n", sriov->num_vfs);
+}
+
+static DEVICE_ATTR(mlx5_num_vfs, 0600, num_vf_show, num_vf_store);
+
+static struct device_attribute *mlx5_class_attributes[] = {
+ &dev_attr_mlx5_num_vfs,
+};
+
+static int mlx5_sriov_sysfs_init(struct mlx5_core_dev *dev)
+{
+ struct mlx5_core_sriov *sriov = &dev->priv.sriov;
+ struct device *device = &dev->pdev->dev;
+ int err;
+ int i;
+
+ sriov->config = kobject_create_and_add("sriov", &device->kobj);
+ if (!sriov->config)
+ return -ENOMEM;
+
+ for (i = 0; i < ARRAY_SIZE(mlx5_class_attributes); i++) {
+ err = device_create_file(device, mlx5_class_attributes[i]);
+ if (err)
+ goto err_attr;
+ }
+
+ return 0;
+
+err_attr:
+ kobject_put(sriov->config);
+ sriov->config = NULL;
+ return err;
+}
+
+static void mlx5_sriov_sysfs_cleanup(struct mlx5_core_dev *dev)
+{
+ struct mlx5_core_sriov *sriov = &dev->priv.sriov;
+ struct device *device = &dev->pdev->dev;
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(mlx5_class_attributes); i++)
+ device_remove_file(device, mlx5_class_attributes[i]);
+
+ kobject_put(sriov->config);
+ sriov->config = NULL;
+}
+
+int mlx5_sriov_init(struct mlx5_core_dev *dev)
+{
+ return 0; /* SR-IOV init is stubbed out; the code below is unreachable */
+ if (!mlx5_core_is_pf(dev))
+ return 0;
+
+ return mlx5_sriov_sysfs_init(dev);
+}
+
+int mlx5_sriov_cleanup(struct mlx5_core_dev *dev)
+{
+ struct pci_dev *pdev = dev->pdev;
+ int err;
+
+ return 0; /* SR-IOV cleanup is stubbed out; the code below is unreachable */
+ if (!mlx5_core_is_pf(dev))
+ return 0;
+
+ err = mlx5_core_sriov_configure(pdev, MLX5_SRIOV_UNLOAD_MAGIC);
+ if (err)
+ return err;
+
+ mlx5_sriov_sysfs_cleanup(dev);
+ return 0;
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/srq.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/srq.c
new file mode 100644
index 0000000..2bce3bf
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/srq.c
@@ -0,0 +1,524 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx5_core.h"
+#include "transobj.h"
+
+void mlx5_srq_event(struct mlx5_core_dev *dev, u32 srqn, int event_type)
+{
+ struct mlx5_srq_table *table = &dev->priv.srq_table;
+ struct mlx5_core_srq *srq;
+
+ spin_lock(&table->lock);
+
+ srq = radix_tree_lookup(&table->tree, srqn);
+ if (srq)
+ atomic_inc(&srq->refcount);
+
+ spin_unlock(&table->lock);
+
+ if (!srq) {
+ mlx5_core_warn(dev, "Async event for bogus SRQ 0x%08x\n", srqn);
+ return;
+ }
+
+ srq->event(srq, event_type);
+
+ if (atomic_dec_and_test(&srq->refcount))
+ complete(&srq->free);
+}
+
+static int get_pas_size(void *srqc)
+{
+ u32 log_page_size = MLX5_GET(srqc, srqc, log_page_size) + 12;
+ u32 log_srq_size = MLX5_GET(srqc, srqc, log_srq_size);
+ u32 log_rq_stride = MLX5_GET(srqc, srqc, log_rq_stride);
+ u32 page_offset = MLX5_GET(srqc, srqc, page_offset);
+ u32 po_quanta = 1 << (log_page_size - 6);
+ u32 rq_sz = 1 << (log_srq_size + 4 + log_rq_stride);
+ u32 page_size = 1 << log_page_size;
+ u32 rq_sz_po = rq_sz + (page_offset * po_quanta);
+ u32 rq_num_pas = (rq_sz_po + page_size - 1) / page_size;
+
+ return rq_num_pas * sizeof(u64);
+}
+
+static void rmpc_srqc_reformat(void *srqc, void *rmpc, bool srqc_to_rmpc)
+{
+ void *wq = MLX5_ADDR_OF(rmpc, rmpc, wq);
+
+ if (srqc_to_rmpc) {
+ switch (MLX5_GET(srqc, srqc, state)) {
+ case MLX5_SRQC_STATE_GOOD:
+ MLX5_SET(rmpc, rmpc, state, MLX5_RMPC_STATE_RDY);
+ break;
+ case MLX5_SRQC_STATE_ERROR:
+ MLX5_SET(rmpc, rmpc, state, MLX5_RMPC_STATE_ERR);
+ break;
+ default:
+ pr_warn("%s: %d: Unknown srq state = 0x%x\n", __func__,
+ __LINE__, MLX5_GET(srqc, srqc, state));
+ MLX5_SET(rmpc, rmpc, state, MLX5_GET(srqc, srqc, state));
+ }
+
+ MLX5_SET(wq, wq, wq_signature, MLX5_GET(srqc, srqc, wq_signature));
+ MLX5_SET(wq, wq, log_wq_pg_sz, MLX5_GET(srqc, srqc, log_page_size));
+ MLX5_SET(wq, wq, log_wq_stride, MLX5_GET(srqc, srqc, log_rq_stride) + 4);
+ MLX5_SET(wq, wq, log_wq_sz, MLX5_GET(srqc, srqc, log_srq_size));
+ MLX5_SET(wq, wq, page_offset, MLX5_GET(srqc, srqc, page_offset));
+ MLX5_SET(wq, wq, lwm, MLX5_GET(srqc, srqc, lwm));
+ MLX5_SET(wq, wq, pd, MLX5_GET(srqc, srqc, pd));
+ MLX5_SET64(wq, wq, dbr_addr, MLX5_GET64(srqc, srqc, dbr_addr));
+ } else {
+ switch (MLX5_GET(rmpc, rmpc, state)) {
+ case MLX5_RMPC_STATE_RDY:
+ MLX5_SET(srqc, srqc, state, MLX5_SRQC_STATE_GOOD);
+ break;
+ case MLX5_RMPC_STATE_ERR:
+ MLX5_SET(srqc, srqc, state, MLX5_SRQC_STATE_ERROR);
+ break;
+ default:
+ pr_warn("%s: %d: Unknown rmp state = 0x%x\n",
+ __func__, __LINE__, MLX5_GET(rmpc, rmpc, state));
+ MLX5_SET(srqc, srqc, state, MLX5_GET(rmpc, rmpc, state));
+ }
+
+ MLX5_SET(srqc, srqc, wq_signature, MLX5_GET(wq, wq, wq_signature));
+ MLX5_SET(srqc, srqc, log_page_size, MLX5_GET(wq, wq, log_wq_pg_sz));
+ MLX5_SET(srqc, srqc, log_rq_stride, MLX5_GET(wq, wq, log_wq_stride) - 4);
+ MLX5_SET(srqc, srqc, log_srq_size, MLX5_GET(wq, wq, log_wq_sz));
+ MLX5_SET(srqc, srqc, page_offset, MLX5_GET(wq, wq, page_offset));
+ MLX5_SET(srqc, srqc, lwm, MLX5_GET(wq, wq, lwm));
+ MLX5_SET(srqc, srqc, pd, MLX5_GET(wq, wq, pd));
+ MLX5_SET64(srqc, srqc, dbr_addr, MLX5_GET64(wq, wq, dbr_addr));
+ }
+}
+
+struct mlx5_core_srq *mlx5_core_get_srq(struct mlx5_core_dev *dev, u32 srqn)
+{
+ struct mlx5_srq_table *table = &dev->priv.srq_table;
+ struct mlx5_core_srq *srq;
+
+ spin_lock(&table->lock);
+
+ srq = radix_tree_lookup(&table->tree, srqn);
+ if (srq)
+ atomic_inc(&srq->refcount);
+
+ spin_unlock(&table->lock);
+
+ return srq;
+}
+EXPORT_SYMBOL(mlx5_core_get_srq);
+
+static int create_srq_cmd(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
+ struct mlx5_create_srq_mbox_in *in, int inlen)
+{
+ struct mlx5_create_srq_mbox_out out;
+ int err;
+
+ memset(&out, 0, sizeof(out));
+
+ in->hdr.opcode = cpu_to_be16(MLX5_CMD_OP_CREATE_SRQ);
+
+ err = mlx5_cmd_exec_check_status(dev, (u32 *)in, inlen, (u32 *)(&out), sizeof(out));
+
+ srq->srqn = be32_to_cpu(out.srqn) & 0xffffff;
+
+ return err;
+}
+
+static int destroy_srq_cmd(struct mlx5_core_dev *dev,
+ struct mlx5_core_srq *srq)
+{
+ struct mlx5_destroy_srq_mbox_in in;
+ struct mlx5_destroy_srq_mbox_out out;
+
+ memset(&in, 0, sizeof(in));
+ memset(&out, 0, sizeof(out));
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_DESTROY_SRQ);
+ in.srqn = cpu_to_be32(srq->srqn);
+
+ return mlx5_cmd_exec_check_status(dev, (u32 *)(&in), sizeof(in), (u32 *)(&out), sizeof(out));
+}
+
+static int arm_srq_cmd(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
+ u16 lwm, int is_srq)
+{
+ struct mlx5_arm_srq_mbox_in in;
+ struct mlx5_arm_srq_mbox_out out;
+
+ memset(&in, 0, sizeof(in));
+ memset(&out, 0, sizeof(out));
+
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_ARM_RQ);
+ in.hdr.opmod = cpu_to_be16(!!is_srq);
+ in.srqn = cpu_to_be32(srq->srqn);
+ in.lwm = cpu_to_be16(lwm);
+
+ return mlx5_cmd_exec_check_status(dev, (u32 *)(&in), sizeof(in), (u32 *)(&out), sizeof(out));
+}
+
+static int query_srq_cmd(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
+ struct mlx5_query_srq_mbox_out *out)
+{
+ struct mlx5_query_srq_mbox_in in;
+
+ memset(&in, 0, sizeof(in));
+
+ in.hdr.opcode = cpu_to_be16(MLX5_CMD_OP_QUERY_SRQ);
+ in.srqn = cpu_to_be32(srq->srqn);
+
+ return mlx5_cmd_exec_check_status(dev, (u32 *)(&in), sizeof(in), (u32 *)out, sizeof(*out));
+}
+
+static int create_xrc_srq_cmd(struct mlx5_core_dev *dev,
+ struct mlx5_core_srq *srq,
+ struct mlx5_create_srq_mbox_in *in,
+ int srq_inlen)
+{
+ u32 create_out[MLX5_ST_SZ_DW(create_xrc_srq_out)];
+ void *create_in;
+ void *srqc;
+ void *xrc_srqc;
+ void *pas;
+ int pas_size;
+ int inlen;
+ int err;
+
+ srqc = MLX5_ADDR_OF(create_srq_in, in, srq_context_entry);
+ pas_size = get_pas_size(srqc);
+ inlen = MLX5_ST_SZ_BYTES(create_xrc_srq_in) + pas_size;
+ create_in = mlx5_vzalloc(inlen);
+ if (!create_in)
+ return -ENOMEM;
+
+ xrc_srqc = MLX5_ADDR_OF(create_xrc_srq_in, create_in, xrc_srq_context_entry);
+ pas = MLX5_ADDR_OF(create_xrc_srq_in, create_in, pas);
+
+ memcpy(xrc_srqc, srqc, MLX5_ST_SZ_BYTES(srqc));
+ memcpy(pas, in->pas, pas_size);
+ /* 0xffffff means we ask to work with cqe version 0 */
+ MLX5_SET(srqc, xrc_srqc, user_index, 0xffffff);
+ MLX5_SET(create_xrc_srq_in, create_in, opcode, MLX5_CMD_OP_CREATE_XRC_SRQ);
+
+ memset(create_out, 0, sizeof(create_out));
+ err = mlx5_cmd_exec_check_status(dev, create_in, inlen, create_out, sizeof(create_out));
+ if (err)
+ goto out;
+
+ srq->srqn = MLX5_GET(create_xrc_srq_out, create_out, xrc_srqn);
+out:
+ kvfree(create_in);
+ return err;
+}
+
+static int destroy_xrc_srq_cmd(struct mlx5_core_dev *dev,
+ struct mlx5_core_srq *srq)
+{
+ u32 xrcsrq_in[MLX5_ST_SZ_DW(destroy_xrc_srq_in)];
+ u32 xrcsrq_out[MLX5_ST_SZ_DW(destroy_xrc_srq_out)];
+
+ memset(xrcsrq_in, 0, sizeof(xrcsrq_in));
+ memset(xrcsrq_out, 0, sizeof(xrcsrq_out));
+
+ MLX5_SET(destroy_xrc_srq_in, xrcsrq_in, opcode, MLX5_CMD_OP_DESTROY_XRC_SRQ);
+ MLX5_SET(destroy_xrc_srq_in, xrcsrq_in, xrc_srqn, srq->srqn);
+
+ return mlx5_cmd_exec_check_status(dev, xrcsrq_in, sizeof(xrcsrq_in),
+ xrcsrq_out, sizeof(xrcsrq_out));
+}
+
+static int arm_xrc_srq_cmd(struct mlx5_core_dev *dev,
+ struct mlx5_core_srq *srq, u16 lwm)
+{
+ u32 xrcsrq_in[MLX5_ST_SZ_DW(arm_xrc_srq_in)];
+ u32 xrcsrq_out[MLX5_ST_SZ_DW(arm_xrc_srq_out)];
+
+ memset(xrcsrq_in, 0, sizeof(xrcsrq_in));
+ memset(xrcsrq_out, 0, sizeof(xrcsrq_out));
+
+ MLX5_SET(arm_xrc_srq_in, xrcsrq_in, opcode, MLX5_CMD_OP_ARM_XRC_SRQ);
+ MLX5_SET(arm_xrc_srq_in, xrcsrq_in, op_mod, MLX5_ARM_XRC_SRQ_IN_OP_MOD_XRC_SRQ);
+ MLX5_SET(arm_xrc_srq_in, xrcsrq_in, xrc_srqn, srq->srqn);
+ MLX5_SET(arm_xrc_srq_in, xrcsrq_in, lwm, lwm);
+
+ return mlx5_cmd_exec_check_status(dev, xrcsrq_in, sizeof(xrcsrq_in),
+ xrcsrq_out, sizeof(xrcsrq_out));
+}
+
+static int query_xrc_srq_cmd(struct mlx5_core_dev *dev,
+ struct mlx5_core_srq *srq,
+ struct mlx5_query_srq_mbox_out *out)
+{
+ u32 xrcsrq_in[MLX5_ST_SZ_DW(query_xrc_srq_in)];
+ u32 *xrcsrq_out;
+ void *srqc;
+ void *xrc_srqc;
+ int err;
+
+ xrcsrq_out = mlx5_vzalloc(MLX5_ST_SZ_BYTES(query_xrc_srq_out));
+ if (!xrcsrq_out)
+ return -ENOMEM;
+ memset(xrcsrq_in, 0, sizeof(xrcsrq_in));
+
+ MLX5_SET(query_xrc_srq_in, xrcsrq_in, opcode, MLX5_CMD_OP_QUERY_XRC_SRQ);
+ MLX5_SET(query_xrc_srq_in, xrcsrq_in, xrc_srqn, srq->srqn);
+ err = mlx5_cmd_exec_check_status(dev, xrcsrq_in, sizeof(xrcsrq_in),
+ xrcsrq_out,
+ MLX5_ST_SZ_BYTES(query_xrc_srq_out));
+ if (err)
+ goto out;
+
+ xrc_srqc = MLX5_ADDR_OF(query_xrc_srq_out, xrcsrq_out,
+ xrc_srq_context_entry);
+ srqc = MLX5_ADDR_OF(query_srq_out, out, srq_context_entry);
+ memcpy(srqc, xrc_srqc, MLX5_ST_SZ_BYTES(srqc));
+
+out:
+ kvfree(xrcsrq_out);
+ return err;
+}
+
+static int create_rmp_cmd(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
+ struct mlx5_create_srq_mbox_in *in, int srq_inlen)
+{
+ void *create_in;
+ void *rmpc;
+ void *srqc;
+ int pas_size;
+ int inlen;
+ int err;
+
+ srqc = MLX5_ADDR_OF(create_srq_in, in, srq_context_entry);
+ pas_size = get_pas_size(srqc);
+ inlen = MLX5_ST_SZ_BYTES(create_rmp_in) + pas_size;
+ create_in = mlx5_vzalloc(inlen);
+ if (!create_in)
+ return -ENOMEM;
+
+ rmpc = MLX5_ADDR_OF(create_rmp_in, create_in, ctx);
+
+ memcpy(MLX5_ADDR_OF(rmpc, rmpc, wq.pas), in->pas, pas_size);
+ rmpc_srqc_reformat(srqc, rmpc, true);
+
+ err = mlx5_core_create_rmp(dev, create_in, inlen, &srq->srqn);
+
+ kvfree(create_in);
+ return err;
+}
+
+static int destroy_rmp_cmd(struct mlx5_core_dev *dev,
+ struct mlx5_core_srq *srq)
+{
+ return mlx5_core_destroy_rmp(dev, srq->srqn);
+}
+
+static int arm_rmp_cmd(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq, u16 lwm)
+{
+ void *in;
+ void *rmpc;
+ void *wq;
+ void *bitmask;
+ int err;
+
+ in = mlx5_vzalloc(MLX5_ST_SZ_BYTES(modify_rmp_in));
+ if (!in)
+ return -ENOMEM;
+
+ rmpc = MLX5_ADDR_OF(modify_rmp_in, in, ctx);
+ bitmask = MLX5_ADDR_OF(modify_rmp_in, in, bitmask);
+ wq = MLX5_ADDR_OF(rmpc, rmpc, wq);
+
+ MLX5_SET(modify_rmp_in, in, rmp_state, MLX5_RMPC_STATE_RDY);
+ MLX5_SET(modify_rmp_in, in, rmpn, srq->srqn);
+ MLX5_SET(wq, wq, lwm, lwm);
+ MLX5_SET(rmp_bitmask, bitmask, lwm, 1);
+ MLX5_SET(rmpc, rmpc, state, MLX5_RMPC_STATE_RDY);
+
+ err = mlx5_core_modify_rmp(dev, in, MLX5_ST_SZ_BYTES(modify_rmp_in));
+
+ kvfree(in);
+ return err;
+}
+
+static int query_rmp_cmd(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
+ struct mlx5_query_srq_mbox_out *out)
+{
+ u32 *rmp_out;
+ void *rmpc;
+ void *srqc;
+ int err;
+
+ rmp_out = mlx5_vzalloc(MLX5_ST_SZ_BYTES(query_rmp_out));
+ if (!rmp_out)
+ return -ENOMEM;
+
+ err = mlx5_core_query_rmp(dev, srq->srqn, rmp_out);
+ if (err)
+ goto out;
+
+ srqc = MLX5_ADDR_OF(query_srq_out, out, srq_context_entry);
+ rmpc = MLX5_ADDR_OF(query_rmp_out, rmp_out, rmp_context);
+ rmpc_srqc_reformat(srqc, rmpc, false);
+
+out:
+ kvfree(rmp_out);
+ return err;
+}
+
+static int create_srq_split(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
+ struct mlx5_create_srq_mbox_in *in, int inlen,
+ int is_xrc)
+{
+ if (!dev->issi)
+ return create_srq_cmd(dev, srq, in, inlen);
+ else if (srq->common.res == MLX5_RES_XSRQ)
+ return create_xrc_srq_cmd(dev, srq, in, inlen);
+ else
+ return create_rmp_cmd(dev, srq, in, inlen);
+}
+
+static int destroy_srq_split(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq)
+{
+ if (!dev->issi)
+ return destroy_srq_cmd(dev, srq);
+ else if (srq->common.res == MLX5_RES_XSRQ)
+ return destroy_xrc_srq_cmd(dev, srq);
+ else
+ return destroy_rmp_cmd(dev, srq);
+}
+
+int mlx5_core_create_srq(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
+ struct mlx5_create_srq_mbox_in *in, int inlen,
+ int is_xrc)
+{
+ int err;
+ struct mlx5_srq_table *table = &dev->priv.srq_table;
+
+ srq->common.res = is_xrc ? MLX5_RES_XSRQ : MLX5_RES_SRQ;
+
+ err = create_srq_split(dev, srq, in, inlen, is_xrc);
+ if (err)
+ return err;
+
+ atomic_set(&srq->refcount, 1);
+ init_completion(&srq->free);
+
+ spin_lock_irq(&table->lock);
+ err = radix_tree_insert(&table->tree, srq->srqn, srq);
+ spin_unlock_irq(&table->lock);
+ if (err) {
+ mlx5_core_warn(dev, "err %d, srqn 0x%x\n", err, srq->srqn);
+ goto err_destroy_srq_split;
+ }
+
+ return 0;
+
+err_destroy_srq_split:
+ destroy_srq_split(dev, srq);
+
+ return err;
+}
+EXPORT_SYMBOL(mlx5_core_create_srq);
+
+int mlx5_core_destroy_srq(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq)
+{
+ struct mlx5_srq_table *table = &dev->priv.srq_table;
+ struct mlx5_core_srq *tmp;
+ int err;
+
+ spin_lock_irq(&table->lock);
+ tmp = radix_tree_delete(&table->tree, srq->srqn);
+ spin_unlock_irq(&table->lock);
+ if (!tmp) {
+ mlx5_core_warn(dev, "srq 0x%x not found in tree\n", srq->srqn);
+ return -EINVAL;
+ }
+ if (tmp != srq) {
+ mlx5_core_warn(dev, "corruption on srqn 0x%x\n", srq->srqn);
+ return -EINVAL;
+ }
+
+ err = destroy_srq_split(dev, srq);
+ if (err)
+ return err;
+
+ if (atomic_dec_and_test(&srq->refcount))
+ complete(&srq->free);
+ wait_for_completion(&srq->free);
+
+ return 0;
+}
+EXPORT_SYMBOL(mlx5_core_destroy_srq);
+
+int mlx5_core_query_srq(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
+ struct mlx5_query_srq_mbox_out *out)
+{
+ if (!dev->issi)
+ return query_srq_cmd(dev, srq, out);
+ else if (srq->common.res == MLX5_RES_XSRQ)
+ return query_xrc_srq_cmd(dev, srq, out);
+ else
+ return query_rmp_cmd(dev, srq, out);
+}
+EXPORT_SYMBOL(mlx5_core_query_srq);
+
+int mlx5_core_arm_srq(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
+ u16 lwm, int is_srq)
+{
+ if (!dev->issi)
+ return arm_srq_cmd(dev, srq, lwm, is_srq);
+ else if (srq->common.res == MLX5_RES_XSRQ)
+ return arm_xrc_srq_cmd(dev, srq, lwm);
+ else
+ return arm_rmp_cmd(dev, srq, lwm);
+}
+EXPORT_SYMBOL(mlx5_core_arm_srq);
+
+void mlx5_init_srq_table(struct mlx5_core_dev *dev)
+{
+ struct mlx5_srq_table *table = &dev->priv.srq_table;
+
+ memset(table, 0, sizeof(*table));
+ spin_lock_init(&table->lock);
+ INIT_RADIX_TREE(&table->tree, GFP_ATOMIC);
+}
+
+void mlx5_cleanup_srq_table(struct mlx5_core_dev *dev)
+{
+ /* nothing */
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/transobj.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/transobj.c
new file mode 100644
index 0000000..512889c
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/transobj.c
@@ -0,0 +1,361 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies, Ltd. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+
+#include "mlx5_core.h"
+#include "transobj.h"
+
+int mlx5_alloc_transport_domain(struct mlx5_core_dev *dev, u32 *tdn)
+{
+ u32 in[MLX5_ST_SZ_DW(alloc_transport_domain_in)];
+ u32 out[MLX5_ST_SZ_DW(alloc_transport_domain_out)];
+ int err;
+
+ memset(in, 0, sizeof(in));
+ memset(out, 0, sizeof(out));
+
+ MLX5_SET(alloc_transport_domain_in, in, opcode,
+ MLX5_CMD_OP_ALLOC_TRANSPORT_DOMAIN);
+
+ err = mlx5_cmd_exec_check_status(dev, in, sizeof(in), out, sizeof(out));
+ if (!err)
+ *tdn = MLX5_GET(alloc_transport_domain_out, out,
+ transport_domain);
+
+ return err;
+}
+
+void mlx5_dealloc_transport_domain(struct mlx5_core_dev *dev, u32 tdn)
+{
+ u32 in[MLX5_ST_SZ_DW(dealloc_transport_domain_in)];
+ u32 out[MLX5_ST_SZ_DW(dealloc_transport_domain_out)];
+
+ memset(in, 0, sizeof(in));
+ memset(out, 0, sizeof(out));
+
+ MLX5_SET(dealloc_transport_domain_in, in, opcode,
+ MLX5_CMD_OP_DEALLOC_TRANSPORT_DOMAIN);
+ MLX5_SET(dealloc_transport_domain_in, in, transport_domain, tdn);
+
+ mlx5_cmd_exec_check_status(dev, in, sizeof(in), out, sizeof(out));
+}
+
+int mlx5_core_create_rq(struct mlx5_core_dev *dev, u32 *in, int inlen, u32 *rqn)
+{
+ u32 out[MLX5_ST_SZ_DW(create_rq_out)];
+ int err;
+
+ MLX5_SET(create_rq_in, in, opcode, MLX5_CMD_OP_CREATE_RQ);
+
+ memset(out, 0, sizeof(out));
+ err = mlx5_cmd_exec_check_status(dev, in, inlen, out, sizeof(out));
+ if (!err)
+ *rqn = MLX5_GET(create_rq_out, out, rqn);
+
+ return err;
+}
+
+int mlx5_core_modify_rq(struct mlx5_core_dev *dev, u32 *in, int inlen)
+{
+ u32 out[MLX5_ST_SZ_DW(modify_rq_out)];
+
+ MLX5_SET(modify_rq_in, in, opcode, MLX5_CMD_OP_MODIFY_RQ);
+
+ memset(out, 0, sizeof(out));
+ return mlx5_cmd_exec_check_status(dev, in, inlen, out, sizeof(out));
+}
+
+void mlx5_core_destroy_rq(struct mlx5_core_dev *dev, u32 rqn)
+{
+ u32 in[MLX5_ST_SZ_DW(destroy_rq_in)];
+ u32 out[MLX5_ST_SZ_DW(destroy_rq_out)];
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(destroy_rq_in, in, opcode, MLX5_CMD_OP_DESTROY_RQ);
+ MLX5_SET(destroy_rq_in, in, rqn, rqn);
+
+ mlx5_cmd_exec_check_status(dev, in, sizeof(in), out, sizeof(out));
+}
+
+int mlx5_core_create_sq(struct mlx5_core_dev *dev, u32 *in, int inlen, u32 *sqn)
+{
+ u32 out[MLX5_ST_SZ_DW(create_sq_out)];
+ int err;
+
+ MLX5_SET(create_sq_in, in, opcode, MLX5_CMD_OP_CREATE_SQ);
+
+ memset(out, 0, sizeof(out));
+ err = mlx5_cmd_exec_check_status(dev, in, inlen, out, sizeof(out));
+ if (!err)
+ *sqn = MLX5_GET(create_sq_out, out, sqn);
+
+ return err;
+}
+
+int mlx5_core_modify_sq(struct mlx5_core_dev *dev, u32 *in, int inlen)
+{
+ u32 out[MLX5_ST_SZ_DW(modify_sq_out)];
+
+ MLX5_SET(modify_sq_in, in, opcode, MLX5_CMD_OP_MODIFY_SQ);
+
+ memset(out, 0, sizeof(out));
+ return mlx5_cmd_exec_check_status(dev, in, inlen, out, sizeof(out));
+}
+
+void mlx5_core_destroy_sq(struct mlx5_core_dev *dev, u32 sqn)
+{
+ u32 in[MLX5_ST_SZ_DW(destroy_sq_in)];
+ u32 out[MLX5_ST_SZ_DW(destroy_sq_out)];
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(destroy_sq_in, in, opcode, MLX5_CMD_OP_DESTROY_SQ);
+ MLX5_SET(destroy_sq_in, in, sqn, sqn);
+
+ mlx5_cmd_exec_check_status(dev, in, sizeof(in), out, sizeof(out));
+}
+
+int mlx5_core_create_tir(struct mlx5_core_dev *dev, u32 *in, int inlen,
+ u32 *tirn)
+{
+ u32 out[MLX5_ST_SZ_DW(create_tir_out)];
+ int err;
+
+ MLX5_SET(create_tir_in, in, opcode, MLX5_CMD_OP_CREATE_TIR);
+
+ memset(out, 0, sizeof(out));
+ err = mlx5_cmd_exec_check_status(dev, in, inlen, out, sizeof(out));
+ if (!err)
+ *tirn = MLX5_GET(create_tir_out, out, tirn);
+
+ return err;
+}
+
+void mlx5_core_destroy_tir(struct mlx5_core_dev *dev, u32 tirn)
+{
+ u32 in[MLX5_ST_SZ_DW(destroy_tir_in)];
+ u32 out[MLX5_ST_SZ_DW(destroy_tir_out)];
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(destroy_tir_in, in, opcode, MLX5_CMD_OP_DESTROY_TIR);
+ MLX5_SET(destroy_tir_in, in, tirn, tirn);
+
+ mlx5_cmd_exec_check_status(dev, in, sizeof(in), out, sizeof(out));
+}
+
+int mlx5_core_create_tis(struct mlx5_core_dev *dev, u32 *in, int inlen,
+ u32 *tisn)
+{
+ u32 out[MLX5_ST_SZ_DW(create_tis_out)];
+ int err;
+
+ MLX5_SET(create_tis_in, in, opcode, MLX5_CMD_OP_CREATE_TIS);
+
+ memset(out, 0, sizeof(out));
+ err = mlx5_cmd_exec_check_status(dev, in, inlen, out, sizeof(out));
+ if (!err)
+ *tisn = MLX5_GET(create_tis_out, out, tisn);
+
+ return err;
+}
+
+void mlx5_core_destroy_tis(struct mlx5_core_dev *dev, u32 tisn)
+{
+ u32 in[MLX5_ST_SZ_DW(destroy_tis_in)];
+ u32 out[MLX5_ST_SZ_DW(destroy_tis_out)];
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(destroy_tis_in, in, opcode, MLX5_CMD_OP_DESTROY_TIS);
+ MLX5_SET(destroy_tis_in, in, tisn, tisn);
+
+ mlx5_cmd_exec_check_status(dev, in, sizeof(in), out, sizeof(out));
+}
+
+int mlx5_core_create_rmp(struct mlx5_core_dev *dev, u32 *in, int inlen, u32 *rmpn)
+{
+ u32 out[MLX5_ST_SZ_DW(create_rmp_out)];
+ int err;
+
+ MLX5_SET(create_rmp_in, in, opcode, MLX5_CMD_OP_CREATE_RMP);
+
+ memset(out, 0, sizeof(out));
+ err = mlx5_cmd_exec_check_status(dev, in, inlen, out, sizeof(out));
+ if (!err)
+ *rmpn = MLX5_GET(create_rmp_out, out, rmpn);
+
+ return err;
+}
+
+int mlx5_core_modify_rmp(struct mlx5_core_dev *dev, u32 *in, int inlen)
+{
+ u32 out[MLX5_ST_SZ_DW(modify_rmp_out)];
+
+ MLX5_SET(modify_rmp_in, in, opcode, MLX5_CMD_OP_MODIFY_RMP);
+
+ memset(out, 0, sizeof(out));
+ return mlx5_cmd_exec_check_status(dev, in, inlen, out, sizeof(out));
+}
+
+int mlx5_core_destroy_rmp(struct mlx5_core_dev *dev, u32 rmpn)
+{
+ u32 in[MLX5_ST_SZ_DW(destroy_rmp_in)];
+ u32 out[MLX5_ST_SZ_DW(destroy_rmp_out)];
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(destroy_rmp_in, in, opcode, MLX5_CMD_OP_DESTROY_RMP);
+ MLX5_SET(destroy_rmp_in, in, rmpn, rmpn);
+
+ return mlx5_cmd_exec_check_status(dev, in, sizeof(in), out, sizeof(out));
+}
+
+int mlx5_core_query_rmp(struct mlx5_core_dev *dev, u32 rmpn, u32 *out)
+{
+ u32 in[MLX5_ST_SZ_DW(query_rmp_in)];
+ int outlen = MLX5_ST_SZ_BYTES(query_rmp_out);
+
+ memset(in, 0, sizeof(in));
+ MLX5_SET(query_rmp_in, in, opcode, MLX5_CMD_OP_QUERY_RMP);
+ MLX5_SET(query_rmp_in, in, rmpn, rmpn);
+
+ return mlx5_cmd_exec_check_status(dev, in, sizeof(in), out, outlen);
+}
+
+int mlx5_core_arm_rmp(struct mlx5_core_dev *dev, u32 rmpn, u16 lwm)
+{
+ void *in;
+ void *rmpc;
+ void *wq;
+ void *bitmask;
+ int err;
+
+ in = mlx5_vzalloc(MLX5_ST_SZ_BYTES(modify_rmp_in));
+ if (!in)
+ return -ENOMEM;
+
+ rmpc = MLX5_ADDR_OF(modify_rmp_in, in, ctx);
+ bitmask = MLX5_ADDR_OF(modify_rmp_in, in, bitmask);
+ wq = MLX5_ADDR_OF(rmpc, rmpc, wq);
+
+ MLX5_SET(modify_rmp_in, in, rmp_state, MLX5_RMPC_STATE_RDY);
+ MLX5_SET(modify_rmp_in, in, rmpn, rmpn);
+ MLX5_SET(wq, wq, lwm, lwm);
+ MLX5_SET(rmp_bitmask, bitmask, lwm, 1);
+ MLX5_SET(rmpc, rmpc, state, MLX5_RMPC_STATE_RDY);
+
+ err = mlx5_core_modify_rmp(dev, in, MLX5_ST_SZ_BYTES(modify_rmp_in));
+
+ kvfree(in);
+
+ return err;
+}
+
+int mlx5_core_create_xsrq(struct mlx5_core_dev *dev, u32 *in, int inlen, u32 *xsrqn)
+{
+ u32 out[MLX5_ST_SZ_DW(create_xrc_srq_out)];
+ int err;
+
+ MLX5_SET(create_xrc_srq_in, in, opcode, MLX5_CMD_OP_CREATE_XRC_SRQ);
+
+ memset(out, 0, sizeof(out));
+ err = mlx5_cmd_exec_check_status(dev, in, inlen, out, sizeof(out));
+ if (!err)
+ *xsrqn = MLX5_GET(create_xrc_srq_out, out, xrc_srqn);
+
+ return err;
+}
+
+int mlx5_core_destroy_xsrq(struct mlx5_core_dev *dev, u32 xsrqn)
+{
+ u32 in[MLX5_ST_SZ_DW(destroy_xrc_srq_in)];
+ u32 out[MLX5_ST_SZ_DW(destroy_xrc_srq_out)];
+
+ memset(in, 0, sizeof(in));
+ memset(out, 0, sizeof(out));
+
+ MLX5_SET(destroy_xrc_srq_in, in, opcode, MLX5_CMD_OP_DESTROY_XRC_SRQ);
+ MLX5_SET(destroy_xrc_srq_in, in, xrc_srqn, xsrqn);
+
+ return mlx5_cmd_exec_check_status(dev, in, sizeof(in), out,
+ sizeof(out));
+}
+
+int mlx5_core_query_xsrq(struct mlx5_core_dev *dev, u32 xsrqn, u32 *out)
+{
+ u32 in[MLX5_ST_SZ_DW(query_xrc_srq_in)];
+ void *srqc;
+ void *xrc_srqc;
+ int err;
+
+ memset(in, 0, sizeof(in));
+ MLX5_SET(query_xrc_srq_in, in, opcode, MLX5_CMD_OP_QUERY_XRC_SRQ);
+ MLX5_SET(query_xrc_srq_in, in, xrc_srqn, xsrqn);
+
+ err = mlx5_cmd_exec_check_status(dev, in, sizeof(in),
+ out,
+ MLX5_ST_SZ_BYTES(query_xrc_srq_out));
+ if (!err) {
+ xrc_srqc = MLX5_ADDR_OF(query_xrc_srq_out, out,
+ xrc_srq_context_entry);
+ srqc = MLX5_ADDR_OF(query_srq_out, out, srq_context_entry);
+ memcpy(srqc, xrc_srqc, MLX5_ST_SZ_BYTES(srqc));
+ }
+
+ return err;
+}
+
+int mlx5_core_arm_xsrq(struct mlx5_core_dev *dev, u32 xsrqn, u16 lwm)
+{
+ u32 in[MLX5_ST_SZ_DW(arm_xrc_srq_in)];
+ u32 out[MLX5_ST_SZ_DW(arm_xrc_srq_out)];
+
+ memset(in, 0, sizeof(in));
+ memset(out, 0, sizeof(out));
+
+ MLX5_SET(arm_xrc_srq_in, in, opcode, MLX5_CMD_OP_ARM_XRC_SRQ);
+ MLX5_SET(arm_xrc_srq_in, in, xrc_srqn, xsrqn);
+ MLX5_SET(arm_xrc_srq_in, in, lwm, lwm);
+ MLX5_SET(arm_xrc_srq_in, in, op_mod,
+ MLX5_ARM_XRC_SRQ_IN_OP_MOD_XRC_SRQ);
+
+ return mlx5_cmd_exec_check_status(dev, in, sizeof(in), out,
+ sizeof(out));
+
+}
+
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/transobj.h b/drivers/net/mlnx_uio/mlnx/mlx5/core/transobj.h
new file mode 100644
index 0000000..8de82e5
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/transobj.h
@@ -0,0 +1,68 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies, Ltd. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __TRANSOBJ_H__
+#define __TRANSOBJ_H__
+
+int mlx5_alloc_transport_domain(struct mlx5_core_dev *dev, u32 *tdn);
+void mlx5_dealloc_transport_domain(struct mlx5_core_dev *dev, u32 tdn);
+int mlx5_core_create_rq(struct mlx5_core_dev *dev, u32 *in, int inlen,
+ u32 *rqn);
+int mlx5_core_modify_rq(struct mlx5_core_dev *dev, u32 *in, int inlen);
+void mlx5_core_destroy_rq(struct mlx5_core_dev *dev, u32 rqn);
+int mlx5_core_create_sq(struct mlx5_core_dev *dev, u32 *in, int inlen,
+ u32 *sqn);
+int mlx5_core_modify_sq(struct mlx5_core_dev *dev, u32 *in, int inlen);
+void mlx5_core_destroy_sq(struct mlx5_core_dev *dev, u32 sqn);
+int mlx5_core_create_tir(struct mlx5_core_dev *dev, u32 *in, int inlen,
+ u32 *tirn);
+void mlx5_core_destroy_tir(struct mlx5_core_dev *dev, u32 tirn);
+int mlx5_core_create_tis(struct mlx5_core_dev *dev, u32 *in, int inlen,
+ u32 *tisn);
+void mlx5_core_destroy_tis(struct mlx5_core_dev *dev, u32 tisn);
+int mlx5_core_create_rmp(struct mlx5_core_dev *dev, u32 *in, int inlen, u32 *rmpn);
+int mlx5_core_modify_rmp(struct mlx5_core_dev *dev, u32 *in, int inlen);
+int mlx5_core_destroy_rmp(struct mlx5_core_dev *dev, u32 rmpn);
+int mlx5_core_query_rmp(struct mlx5_core_dev *dev, u32 rmpn, u32 *out);
+int mlx5_core_arm_rmp(struct mlx5_core_dev *dev, u32 rmpn, u16 lwm);
+int mlx5_core_create_xsrq(struct mlx5_core_dev *dev, u32 *in, int inlen, u32 *xsrqn);
+int mlx5_core_destroy_xsrq(struct mlx5_core_dev *dev, u32 xsrqn);
+int mlx5_core_query_xsrq(struct mlx5_core_dev *dev, u32 xsrqn, u32 *out);
+int mlx5_core_arm_xsrq(struct mlx5_core_dev *dev, u32 xsrqn, u16 lwm);
+
+#endif /* __TRANSOBJ_H__ */
+
+#include "post_kmod.h"
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/uar.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/uar.c
new file mode 100644
index 0000000..b69831d
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/uar.c
@@ -0,0 +1,235 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "mlx5_core.h"
+
+enum {
+ NUM_DRIVER_UARS = 4,
+ NUM_LOW_LAT_UUARS = 4,
+};
+
+int mlx5_cmd_alloc_uar(struct mlx5_core_dev *dev, u32 *uarn)
+{
+ u32 in[MLX5_ST_SZ_DW(alloc_uar_in)];
+ u32 out[MLX5_ST_SZ_DW(alloc_uar_out)];
+ int err;
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(alloc_uar_in, in, opcode, MLX5_CMD_OP_ALLOC_UAR);
+
+ memset(out, 0, sizeof(out));
+ err = mlx5_cmd_exec_check_status(dev, in, sizeof(in), out, sizeof(out));
+ if (err)
+ return err;
+
+ *uarn = MLX5_GET(alloc_uar_out, out, uar);
+
+ return 0;
+}
+EXPORT_SYMBOL(mlx5_cmd_alloc_uar);
+
+int mlx5_cmd_free_uar(struct mlx5_core_dev *dev, u32 uarn)
+{
+ u32 in[MLX5_ST_SZ_DW(dealloc_uar_in)];
+ u32 out[MLX5_ST_SZ_DW(dealloc_uar_out)];
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(dealloc_uar_in, in, opcode, MLX5_CMD_OP_DEALLOC_UAR);
+ MLX5_SET(dealloc_uar_in, in, uar, uarn);
+
+ memset(out, 0, sizeof(out));
+ return mlx5_cmd_exec_check_status(dev, in, sizeof(in),
+ out, sizeof(out));
+}
+EXPORT_SYMBOL(mlx5_cmd_free_uar);
+
+static int need_uuar_lock(int uuarn)
+{
+ int tot_uuars = NUM_DRIVER_UARS * MLX5_BF_REGS_PER_PAGE;
+
+ if (uuarn == 0 || uuarn >= tot_uuars - NUM_LOW_LAT_UUARS)
+ return 0;
+
+ return 1;
+}
+
+int mlx5_alloc_uuars(struct mlx5_core_dev *dev, struct mlx5_uuar_info *uuari)
+{
+ int tot_uuars = NUM_DRIVER_UARS * MLX5_BF_REGS_PER_PAGE;
+ struct mlx5_bf *bf;
+ phys_addr_t addr;
+ int err;
+ int i;
+
+ uuari->num_uars = NUM_DRIVER_UARS;
+ uuari->num_low_latency_uuars = NUM_LOW_LAT_UUARS;
+
+ mutex_init(&uuari->lock);
+ uuari->uars = kcalloc(uuari->num_uars, sizeof(*uuari->uars), GFP_KERNEL);
+ if (!uuari->uars)
+ return -ENOMEM;
+
+ uuari->bfs = kcalloc(tot_uuars, sizeof(*uuari->bfs), GFP_KERNEL);
+ if (!uuari->bfs) {
+ err = -ENOMEM;
+ goto out_uars;
+ }
+
+ uuari->bitmap = kcalloc(BITS_TO_LONGS(tot_uuars), sizeof(*uuari->bitmap),
+ GFP_KERNEL);
+ if (!uuari->bitmap) {
+ err = -ENOMEM;
+ goto out_bfs;
+ }
+
+ uuari->count = kcalloc(tot_uuars, sizeof(*uuari->count), GFP_KERNEL);
+ if (!uuari->count) {
+ err = -ENOMEM;
+ goto out_bitmap;
+ }
+
+ for (i = 0; i < uuari->num_uars; i++) {
+ err = mlx5_cmd_alloc_uar(dev, &uuari->uars[i].index);
+ if (err)
+ goto out_count;
+
+ addr = dev->iseg_base + ((phys_addr_t)(uuari->uars[i].index) << PAGE_SHIFT);
+ uuari->uars[i].map = ioremap(addr, PAGE_SIZE);
+ if (!uuari->uars[i].map) {
+ mlx5_cmd_free_uar(dev, uuari->uars[i].index);
+ err = -ENOMEM;
+ goto out_count;
+ }
+ mlx5_core_dbg(dev, "allocated uar index 0x%x, mapped at %p\n",
+ uuari->uars[i].index, uuari->uars[i].map);
+ }
+
+ for (i = 0; i < tot_uuars; i++) {
+ bf = &uuari->bfs[i];
+
+ bf->buf_size = (1 << MLX5_CAP_GEN(dev, log_bf_reg_size)) / 2;
+ bf->uar = &uuari->uars[i / MLX5_BF_REGS_PER_PAGE];
+ bf->regreg = uuari->uars[i / MLX5_BF_REGS_PER_PAGE].map;
+ bf->reg = NULL; /* Add WC support */
+ bf->offset = (i % MLX5_BF_REGS_PER_PAGE) *
+ (1 << MLX5_CAP_GEN(dev, log_bf_reg_size)) +
+ MLX5_BF_OFFSET;
+ bf->need_lock = need_uuar_lock(i);
+ spin_lock_init(&bf->lock);
+ spin_lock_init(&bf->lock32);
+ bf->uuarn = i;
+ }
+
+ return 0;
+
+out_count:
+ for (i--; i >= 0; i--) {
+ iounmap(uuari->uars[i].map);
+ mlx5_cmd_free_uar(dev, uuari->uars[i].index);
+ }
+ kfree(uuari->count);
+
+out_bitmap:
+ kfree(uuari->bitmap);
+
+out_bfs:
+ kfree(uuari->bfs);
+
+out_uars:
+ kfree(uuari->uars);
+ return err;
+}
+
+int mlx5_free_uuars(struct mlx5_core_dev *dev, struct mlx5_uuar_info *uuari)
+{
+ int i = uuari->num_uars;
+
+ for (i--; i >= 0; i--) {
+ iounmap(uuari->uars[i].map);
+ mlx5_cmd_free_uar(dev, uuari->uars[i].index);
+ }
+
+ kfree(uuari->count);
+ kfree(uuari->bitmap);
+ kfree(uuari->bfs);
+ kfree(uuari->uars);
+
+ return 0;
+}
+
+int mlx5_alloc_map_uar(struct mlx5_core_dev *mdev, struct mlx5_uar *uar)
+{
+ phys_addr_t pfn;
+ phys_addr_t uar_bar_start;
+ int err;
+
+ err = mlx5_cmd_alloc_uar(mdev, &uar->index);
+ if (err) {
+ mlx5_core_warn(mdev, "mlx5_cmd_alloc_uar() failed, %d\n", err);
+ return err;
+ }
+
+ uar_bar_start = pci_resource_start(mdev->pdev, 0);
+ pfn = (uar_bar_start >> PAGE_SHIFT) + uar->index;
+ uar->map = ioremap(pfn << PAGE_SHIFT, PAGE_SIZE);
+ if (!uar->map) {
+ err = -ENOMEM;
+ mlx5_core_warn(mdev, "ioremap() failed, %d\n", err);
+ goto err_free_uar;
+ }
+
+ if (mdev->priv.bf_mapping)
+ uar->bf_map = io_mapping_map_wc(mdev->priv.bf_mapping,
+ uar->index << PAGE_SHIFT);
+
+ return 0;
+
+err_free_uar:
+ mlx5_cmd_free_uar(mdev, uar->index);
+
+ return err;
+}
+EXPORT_SYMBOL(mlx5_alloc_map_uar);
+
+void mlx5_unmap_free_uar(struct mlx5_core_dev *mdev, struct mlx5_uar *uar)
+{
+ io_mapping_unmap(uar->bf_map);
+ iounmap(uar->map);
+ mlx5_cmd_free_uar(mdev, uar->index);
+}
+EXPORT_SYMBOL(mlx5_unmap_free_uar);
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/vport.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/vport.c
new file mode 100644
index 0000000..27737de
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/vport.c
@@ -0,0 +1,216 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies, Ltd. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "linux/mlx5/vport.h"
+#include "mlx5_core.h"
+
+u8 mlx5_query_vport_state(struct mlx5_core_dev *mdev, u8 opmod)
+{
+ u32 in[MLX5_ST_SZ_DW(query_vport_state_in)];
+ u32 out[MLX5_ST_SZ_DW(query_vport_state_out)];
+ int err;
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(query_vport_state_in, in, opcode,
+ MLX5_CMD_OP_QUERY_VPORT_STATE);
+ MLX5_SET(query_vport_state_in, in, op_mod, opmod);
+
+ err = mlx5_cmd_exec_check_status(mdev, in, sizeof(in), out,
+ sizeof(out));
+ if (err)
+ mlx5_core_warn(mdev, "MLX5_CMD_OP_QUERY_VPORT_STATE failed\n");
+
+ return MLX5_GET(query_vport_state_out, out, state);
+}
+EXPORT_SYMBOL_GPL(mlx5_query_vport_state);
+
+static int mlx5_query_nic_vport_context(struct mlx5_core_dev *mdev, u32 *out,
+ int outlen)
+{
+ u32 in[MLX5_ST_SZ_DW(query_nic_vport_context_in)];
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(query_nic_vport_context_in, in, opcode,
+ MLX5_CMD_OP_QUERY_NIC_VPORT_CONTEXT);
+
+ return mlx5_cmd_exec_check_status(mdev, in, sizeof(in), out, outlen);
+}
+
+int mlx5_query_nic_vport_mac_address(struct mlx5_core_dev *mdev, u8 *addr)
+{
+ u32 *out;
+ int outlen = MLX5_ST_SZ_BYTES(query_nic_vport_context_out);
+ u8 *out_addr;
+ int err;
+
+ out = kzalloc(outlen, GFP_KERNEL);
+ if (!out)
+ return -ENOMEM;
+
+ out_addr = MLX5_ADDR_OF(query_nic_vport_context_out, out,
+ nic_vport_context.permanent_address);
+
+ err = mlx5_query_nic_vport_context(mdev, out, outlen);
+ if (err)
+ goto out;
+
+ ether_addr_copy(addr, &out_addr[2]);
+
+out:
+ kfree(out);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_query_nic_vport_mac_address);
+
+int mlx5_query_nic_vport_system_image_guid(struct mlx5_core_dev *mdev,
+ u64 *system_image_guid)
+{
+ u32 *out;
+ int outlen = MLX5_ST_SZ_BYTES(query_nic_vport_context_out);
+ int err;
+
+ out = kzalloc(outlen, GFP_KERNEL);
+ if (!out)
+ return -ENOMEM;
+
+ err = mlx5_query_nic_vport_context(mdev, out, outlen);
+ if (err)
+ goto out;
+
+ *system_image_guid = MLX5_GET64(query_nic_vport_context_out, out,
+ nic_vport_context.system_image_guid);
+out:
+ kfree(out);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_query_nic_vport_system_image_guid);
+
+int mlx5_query_nic_vport_node_guid(struct mlx5_core_dev *mdev, u64 *node_guid)
+{
+ u32 *out;
+ int outlen = MLX5_ST_SZ_BYTES(query_nic_vport_context_out);
+ int err;
+
+ out = kzalloc(outlen, GFP_KERNEL);
+ if (!out)
+ return -ENOMEM;
+
+ err = mlx5_query_nic_vport_context(mdev, out, outlen);
+ if (err)
+ goto out;
+
+ *node_guid = MLX5_GET64(query_nic_vport_context_out, out,
+ nic_vport_context.node_guid);
+
+out:
+ kfree(out);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_query_nic_vport_node_guid);
+
+int mlx5_query_nic_vport_qkey_viol_cntr(struct mlx5_core_dev *mdev,
+ u16 *qkey_viol_cntr)
+{
+ u32 *out;
+ int outlen = MLX5_ST_SZ_BYTES(query_nic_vport_context_out);
+ int err;
+
+ out = kzalloc(outlen, GFP_KERNEL);
+ if (!out)
+ return -ENOMEM;
+
+ err = mlx5_query_nic_vport_context(mdev, out, outlen);
+ if (err)
+ goto out;
+
+ *qkey_viol_cntr = MLX5_GET(query_nic_vport_context_out, out,
+ nic_vport_context.qkey_violation_counter);
+
+out:
+ kfree(out);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_query_nic_vport_qkey_viol_cntr);
+
+static int mlx5_modify_nic_vport_context(struct mlx5_core_dev *mdev, void *in,
+ int inlen)
+{
+ u32 out[MLX5_ST_SZ_DW(modify_nic_vport_context_out)];
+
+ MLX5_SET(modify_nic_vport_context_in, in, opcode,
+ MLX5_CMD_OP_MODIFY_NIC_VPORT_CONTEXT);
+
+ memset(out, 0, sizeof(out));
+ return mlx5_cmd_exec_check_status(mdev, in, inlen, out, sizeof(out));
+}
+
+static int mlx5_nic_vport_enable_disable_roce(struct mlx5_core_dev *mdev,
+ int enable_disable)
+{
+ void *in;
+ int inlen = MLX5_ST_SZ_BYTES(modify_nic_vport_context_in);
+ int err;
+
+ in = mlx5_vzalloc(inlen);
+ if (!in) {
+ mlx5_core_warn(mdev, "failed to allocate inbox\n");
+ return -ENOMEM;
+ }
+
+ MLX5_SET(modify_nic_vport_context_in, in, field_select.roce_en, 1);
+ MLX5_SET(modify_nic_vport_context_in, in, nic_vport_context.roce_en,
+ enable_disable);
+
+ err = mlx5_modify_nic_vport_context(mdev, in, inlen);
+
+ kvfree(in);
+
+ return err;
+}
+
+int mlx5_nic_vport_enable_roce(struct mlx5_core_dev *mdev)
+{
+ return mlx5_nic_vport_enable_disable_roce(mdev, 1);
+}
+EXPORT_SYMBOL_GPL(mlx5_nic_vport_enable_roce);
+
+int mlx5_nic_vport_disable_roce(struct mlx5_core_dev *mdev)
+{
+ return mlx5_nic_vport_enable_disable_roce(mdev, 0);
+}
+EXPORT_SYMBOL_GPL(mlx5_nic_vport_disable_roce);
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/wq.c b/drivers/net/mlnx_uio/mlnx/mlx5/core/wq.c
new file mode 100644
index 0000000..08e947c
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/wq.c
@@ -0,0 +1,195 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies, Ltd. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "wq.h"
+#include "mlx5_core.h"
+
+u32 mlx5_wq_cyc_get_size(struct mlx5_wq_cyc *wq)
+{
+ return (u32)wq->sz_m1 + 1;
+}
+
+u32 mlx5_cqwq_get_size(struct mlx5_cqwq *wq)
+{
+ return wq->sz_m1 + 1;
+}
+
+u32 mlx5_wq_ll_get_size(struct mlx5_wq_ll *wq)
+{
+ return (u32)wq->sz_m1 + 1;
+}
+
+static u32 mlx5_wq_cyc_get_byte_size(struct mlx5_wq_cyc *wq)
+{
+ return mlx5_wq_cyc_get_size(wq) << wq->log_stride;
+}
+
+static u32 mlx5_cqwq_get_byte_size(struct mlx5_cqwq *wq)
+{
+ return mlx5_cqwq_get_size(wq) << wq->log_stride;
+}
+
+static u32 mlx5_wq_ll_get_byte_size(struct mlx5_wq_ll *wq)
+{
+ return mlx5_wq_ll_get_size(wq) << wq->log_stride;
+}
+
+int mlx5_wq_cyc_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
+ void *wqc, struct mlx5_wq_cyc *wq,
+ struct mlx5_wq_ctrl *wq_ctrl)
+{
+ int max_direct = param->linear ? INT_MAX : 0;
+ int err;
+
+ wq->log_stride = MLX5_GET(wq, wqc, log_wq_stride);
+ wq->sz_m1 = (1 << MLX5_GET(wq, wqc, log_wq_sz)) - 1;
+
+ err = mlx5_db_alloc_node(mdev, &wq_ctrl->db, param->db_numa_node);
+ if (err) {
+ mlx5_core_warn(mdev, "mlx5_db_alloc_node() failed, %d\n", err);
+ return err;
+ }
+
+ err = mlx5_buf_alloc_node(mdev, mlx5_wq_cyc_get_byte_size(wq),
+ max_direct, &wq_ctrl->buf,
+ param->buf_numa_node);
+ if (err) {
+ mlx5_core_warn(mdev, "mlx5_buf_alloc_node() failed, %d\n", err);
+ goto err_db_free;
+ }
+
+ wq->buf = wq_ctrl->buf.direct.buf;
+ wq->db = wq_ctrl->db.db;
+
+ wq_ctrl->mdev = mdev;
+
+ return 0;
+
+err_db_free:
+ mlx5_db_free(mdev, &wq_ctrl->db);
+
+ return err;
+}
+
+int mlx5_cqwq_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
+ void *cqc, struct mlx5_cqwq *wq,
+ struct mlx5_wq_ctrl *wq_ctrl)
+{
+ int max_direct = param->linear ? INT_MAX : 0;
+ int err;
+
+ wq->log_stride = 6 + MLX5_GET(cqc, cqc, cqe_sz);
+ wq->log_sz = MLX5_GET(cqc, cqc, log_cq_size);
+ wq->sz_m1 = (1 << wq->log_sz) - 1;
+
+ err = mlx5_db_alloc_node(mdev, &wq_ctrl->db, param->db_numa_node);
+ if (err) {
+ mlx5_core_warn(mdev, "mlx5_db_alloc_node() failed, %d\n", err);
+ return err;
+ }
+
+ err = mlx5_buf_alloc_node(mdev, mlx5_cqwq_get_byte_size(wq),
+ max_direct, &wq_ctrl->buf,
+ param->buf_numa_node);
+ if (err) {
+ mlx5_core_warn(mdev, "mlx5_buf_alloc_node() failed, %d\n", err);
+ goto err_db_free;
+ }
+
+ wq->buf = wq_ctrl->buf.direct.buf;
+ wq->db = wq_ctrl->db.db;
+
+ wq_ctrl->mdev = mdev;
+
+ return 0;
+
+err_db_free:
+ mlx5_db_free(mdev, &wq_ctrl->db);
+
+ return err;
+}
+
+int mlx5_wq_ll_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
+ void *wqc, struct mlx5_wq_ll *wq,
+ struct mlx5_wq_ctrl *wq_ctrl)
+{
+ struct mlx5_wqe_srq_next_seg *next_seg;
+ int max_direct = param->linear ? INT_MAX : 0;
+ int err;
+ int i;
+
+ wq->log_stride = MLX5_GET(wq, wqc, log_wq_stride);
+ wq->sz_m1 = (1 << MLX5_GET(wq, wqc, log_wq_sz)) - 1;
+
+ err = mlx5_db_alloc_node(mdev, &wq_ctrl->db, param->db_numa_node);
+ if (err) {
+ mlx5_core_warn(mdev, "mlx5_db_alloc_node() failed, %d\n", err);
+ return err;
+ }
+
+ err = mlx5_buf_alloc_node(mdev, mlx5_wq_ll_get_byte_size(wq),
+ max_direct, &wq_ctrl->buf,
+ param->buf_numa_node);
+ if (err) {
+ mlx5_core_warn(mdev, "mlx5_buf_alloc_node() failed, %d\n", err);
+ goto err_db_free;
+ }
+
+ wq->buf = wq_ctrl->buf.direct.buf;
+ wq->db = wq_ctrl->db.db;
+
+ for (i = 0; i < wq->sz_m1; i++) {
+ next_seg = mlx5_wq_ll_get_wqe(wq, i);
+ next_seg->next_wqe_index = cpu_to_be16(i + 1);
+ }
+ next_seg = mlx5_wq_ll_get_wqe(wq, i);
+ wq->tail_next = &next_seg->next_wqe_index;
+
+ wq_ctrl->mdev = mdev;
+
+ return 0;
+
+err_db_free:
+ mlx5_db_free(mdev, &wq_ctrl->db);
+
+ return err;
+}
+
+void mlx5_wq_destroy(struct mlx5_wq_ctrl *wq_ctrl)
+{
+ mlx5_buf_free(wq_ctrl->mdev, &wq_ctrl->buf);
+ mlx5_db_free(wq_ctrl->mdev, &wq_ctrl->db);
+}
diff --git a/drivers/net/mlnx_uio/mlnx/mlx5/core/wq.h b/drivers/net/mlnx_uio/mlnx/mlx5/core/wq.h
new file mode 100644
index 0000000..29172b0
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlnx/mlx5/core/wq.h
@@ -0,0 +1,177 @@
+#ifndef K_CONVERTED
+#define K_CONVERTED
+#endif
+#include "kmod.h"
+/*
+ * Copyright (c) 2013-2015, Mellanox Technologies, Ltd. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __MLX5_WQ_H__
+#define __MLX5_WQ_H__
+
+
+struct mlx5_wq_param {
+ int linear;
+ int buf_numa_node;
+ int db_numa_node;
+};
+
+struct mlx5_wq_ctrl {
+ struct mlx5_core_dev *mdev;
+ struct mlx5_buf buf;
+ struct mlx5_db db;
+};
+
+struct mlx5_wq_cyc {
+ void *buf;
+ __be32 *db;
+ u16 sz_m1;
+ u8 log_stride;
+};
+
+struct mlx5_cqwq {
+ void *buf;
+ __be32 *db;
+ u32 sz_m1;
+ u32 cc; /* consumer counter */
+ u8 log_sz;
+ u8 log_stride;
+};
+
+struct mlx5_wq_ll {
+ void *buf;
+ __be32 *db;
+ __be16 *tail_next;
+ u16 sz_m1;
+ u16 head;
+ u16 wqe_ctr;
+ u16 cur_sz;
+ u8 log_stride;
+};
+
+int mlx5_wq_cyc_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
+ void *wqc, struct mlx5_wq_cyc *wq,
+ struct mlx5_wq_ctrl *wq_ctrl);
+u32 mlx5_wq_cyc_get_size(struct mlx5_wq_cyc *wq);
+
+int mlx5_cqwq_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
+ void *cqc, struct mlx5_cqwq *wq,
+ struct mlx5_wq_ctrl *wq_ctrl);
+u32 mlx5_cqwq_get_size(struct mlx5_cqwq *wq);
+
+int mlx5_wq_ll_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
+ void *wqc, struct mlx5_wq_ll *wq,
+ struct mlx5_wq_ctrl *wq_ctrl);
+u32 mlx5_wq_ll_get_size(struct mlx5_wq_ll *wq);
+
+void mlx5_wq_destroy(struct mlx5_wq_ctrl *wq_ctrl);
+
+static inline u16 mlx5_wq_cyc_ctr2ix(struct mlx5_wq_cyc *wq, u16 ctr)
+{
+ return ctr & wq->sz_m1;
+}
+
+static inline void *mlx5_wq_cyc_get_wqe(struct mlx5_wq_cyc *wq, u16 ix)
+{
+ return wq->buf + (ix << wq->log_stride);
+}
+
+static inline int mlx5_wq_cyc_cc_bigger(u16 cc1, u16 cc2)
+{
+ int equal = (cc1 == cc2);
+ int smaller = 0x8000 & (cc1 - cc2);
+
+ return !equal && !smaller;
+}
+
+static inline u32 mlx5_cqwq_get_ci(struct mlx5_cqwq *wq)
+{
+ return wq->cc & wq->sz_m1;
+}
+
+static inline void *mlx5_cqwq_get_wqe(struct mlx5_cqwq *wq, u32 ix)
+{
+ return wq->buf + (ix << wq->log_stride);
+}
+
+static inline u32 mlx5_cqwq_get_wrap_cnt(struct mlx5_cqwq *wq)
+{
+ return wq->cc >> wq->log_sz;
+}
+
+static inline void mlx5_cqwq_pop(struct mlx5_cqwq *wq)
+{
+ wq->cc++;
+}
+
+static inline void mlx5_cqwq_update_db_record(struct mlx5_cqwq *wq)
+{
+ *wq->db = cpu_to_be32(wq->cc & 0xffffff);
+}
+
+static inline int mlx5_wq_ll_is_full(struct mlx5_wq_ll *wq)
+{
+ return wq->cur_sz == wq->sz_m1;
+}
+
+static inline int mlx5_wq_ll_is_empty(struct mlx5_wq_ll *wq)
+{
+ return !wq->cur_sz;
+}
+
+static inline void *mlx5_wq_ll_get_wqe(struct mlx5_wq_ll *wq, u16 ix)
+{
+ return wq->buf + (ix << wq->log_stride);
+}
+
+static inline void mlx5_wq_ll_push(struct mlx5_wq_ll *wq, u16 head_next)
+{
+ wq->head = head_next;
+ wq->wqe_ctr++;
+ wq->cur_sz++;
+}
+
+static inline void mlx5_wq_ll_pop(struct mlx5_wq_ll *wq, __be16 ix,
+ __be16 *next_tail_next)
+{
+ *wq->tail_next = ix;
+ wq->tail_next = next_tail_next;
+ wq->cur_sz--;
+}
+
+static inline void mlx5_wq_ll_update_db_record(struct mlx5_wq_ll *wq)
+{
+ *wq->db = cpu_to_be32(wq->wqe_ctr);
+}
+
+#endif /* __MLX5_WQ_H__ */
+
+#include "post_kmod.h"
diff --git a/drivers/net/mlnx_uio/mlx4_en_special.h b/drivers/net/mlnx_uio/mlx4_en_special.h
new file mode 100644
index 0000000..efb2c54
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlx4_en_special.h
@@ -0,0 +1,21 @@
+/*
+ * mlx4_en_special.h
+ *
+ * Created on: Nov 4, 2014
+ * Author: leeopop
+ */
+
+#ifndef MLX4_EN_SPECIAL_H_
+#define MLX4_EN_SPECIAL_H_
+
+struct rte_mbuf;
+typedef void (*mlx4_tx_completion_callback_t)(uint64_t timestamp, struct rte_mbuf* mbuf, void* arg);
+
+int mlx4_set_tx_timestamp(int port, int queue_id, int use);
+int mlx4_set_rx_timestamp(int port, int queue_id, int use);
+int mlx4_poll_tx_cq(int port, int txq);
+int mlx4_set_tx_completion_callback(int port, int queue_id, mlx4_tx_completion_callback_t callback, void* arg);
+uint64_t mlx4_read_dev_clock_hz(int port);
+uint64_t mlx4_read_dev_clock(int port);
+
+#endif /* MLX4_EN_SPECIAL_H_ */
diff --git a/drivers/net/mlnx_uio/mlx4_uio.c b/drivers/net/mlnx_uio/mlx4_uio.c
new file mode 100644
index 0000000..c8a31bf
--- /dev/null
+++ b/drivers/net/mlnx_uio/mlx4_uio.c
@@ -0,0 +1,1026 @@
+/*
+ * mlx4_uio.c
+ *
+ * Created on: Jun 30, 2015
+ * Author: leeopop
+ */
+
+#include "kmod.h"
+#include "mlx4_uio.h"
+#include "mlnx/mlx4/mlx4_en.h"
+#include "dcbnl.h"
+#include "mlx4/device.h"
+
+#include "mlx4_uio_helper.h"
+#include "log2.h"
+#include "mlx4_en_special.h"
+#include <rte_mbuf.h>
+
+#ifdef CONFIG_INFINIBAND_WQE_FORMAT
+ #define INIT_OWNER_BIT cpu_to_be32(1 << 30)
+#else
+ #define INIT_OWNER_BIT 0xffffffff
+#endif
+
+void mlx4_eth_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
+{
+ struct mlx4_en_priv* priv = rtedev_priv(dev);
+ int port = priv->port;
+ struct mlx4_en_dev* mdev = priv->mdev;
+ struct mlx4_en_port_profile* prof = priv->prof;
+
+ uint64_t rss_offloads = ETH_RSS_IPV4 | ETH_RSS_IPV6 | ETH_RSS_NONFRAG_IPV4_TCP | ETH_RSS_NONFRAG_IPV6_TCP;
+ if(mdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_UDP_RSS)
+ rss_offloads |= (ETH_RSS_NONFRAG_IPV4_UDP | ETH_RSS_NONFRAG_IPV6_UDP);
+ if(mdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_RSS_IP_FRAG)
+ rss_offloads |= (ETH_RSS_FRAG_IPV4 | ETH_RSS_FRAG_IPV6);
+
+
+ struct rte_eth_dev_info rte_info = {
+ .pci_dev = mdev->rte_pdev,
+ .driver_name = "mlx4_uio",
+ .if_index = 0,
+ .min_rx_bufsize = MLX4_EN_SMALL_PKT_SIZE,
+ .max_rx_pktlen = mdev->dev->caps.eth_mtu_cap[priv->port],
+ .max_rx_queues = MLX4_EN_MAX_RX_SIZE,
+ .max_tx_queues = MLX4_EN_MAX_TX_SIZE,
+ .max_mac_addrs = (1 << mdev->dev->caps.log_num_macs),
+ .max_hash_mac_addrs = MLX4_EN_MAC_HASH_SIZE,
+ .max_vfs = MLX4_MAX_NUM_VF_P_PORT,
+ .max_vmdq_pools = 0, //MLX4_MFUNC_EQ_NUM, //XXX
+ .rx_offload_capa = DEV_RX_OFFLOAD_IPV4_CKSUM | DEV_RX_OFFLOAD_UDP_CKSUM | DEV_RX_OFFLOAD_TCP_CKSUM,
+ .tx_offload_capa = DEV_TX_OFFLOAD_IPV4_CKSUM | DEV_TX_OFFLOAD_UDP_CKSUM | DEV_TX_OFFLOAD_TCP_CKSUM,
+ .reta_size = mdev->dev->caps.max_rss_tbl_sz,
+ .flow_type_rss_offloads = rss_offloads,
+ .default_rxconf = {.rx_drop_en = 1},
+ //.default_txconf = {0,},
+ .vmdq_pool_base = 0,
+ .vmdq_queue_base = 0,
+ .vmdq_queue_num = 0,
+ };
+
+ memcpy(dev_info, &rte_info, sizeof(struct rte_eth_dev_info));
+}
+
+static int mlx4_eth_dev_configure(struct rte_eth_dev *dev)
+{
+ if(rte_persistent_init() < 0)
+ return -1;
+ /*
+ * Initialize driver private data
+ */
+
+ struct mlx4_en_priv* priv = rtedev_priv(dev);
+ priv->counter_index = 0xff;
+ spin_lock_init(&priv->stats_lock);
+#ifdef HAVE_VXLAN_ENABLED
+ INIT_WORK(&priv->vxlan_add_task, mlx4_en_add_vxlan_offloads);
+ INIT_WORK(&priv->vxlan_del_task, mlx4_en_del_vxlan_offloads);
+#endif
+#ifdef CONFIG_RFS_ACCEL
+ INIT_LIST_HEAD(&priv->filters);
+ spin_lock_init(&priv->filters_lock);
+#endif
+
+ int port = priv->port;
+ struct mlx4_en_dev* mdev = priv->mdev;
+ struct mlx4_en_port_profile* prof = priv->prof;
+ priv->rte_dev = dev;
+ //priv->mdev = mdev;
+ //priv->prof = prof;
+ //priv->port = port;
+ priv->port_up = false;
+ priv->flags = prof->flags;
+ //priv->pflags = MLX4_EN_PRIV_FLAGS_BLUEFLAME;
+ priv->pflags = 0; //disable blueflame
+ priv->ctrl_flags = cpu_to_be32(MLX4_WQE_CTRL_CQ_UPDATE |
+ MLX4_WQE_CTRL_SOLICITED);
+
+ priv->cqe_factor = (mdev->dev->caps.cqe_size == 64) ? 1 : 0;
+ priv->cqe_size = mdev->dev->caps.cqe_size;
+ priv->mac_index = -1;
+ priv->msg_enable = 0xFFFFFFFF;//MLX4_EN_MSG_LEVEL;
+
+ int i, err;
+ for (i = 0; i < MLX4_EN_MAC_HASH_SIZE; ++i)
+ INIT_HLIST_HEAD(&priv->mac_hash[i]);
+
+ /* Query for default mac and max mtu */
+ priv->max_mtu = mdev->dev->caps.eth_mtu_cap[priv->port];
+
+ if (mdev->dev->caps.rx_checksum_flags_port[priv->port] &
+ MLX4_RX_CSUM_MODE_VAL_NON_TCP_UDP)
+ priv->flags |= MLX4_EN_FLAG_RX_CSUM_NON_TCP_UDP;
+
+ priv->stride = prof->inline_scatter_thold >= MIN_INLINE_SCATTER ?
+ prof->inline_scatter_thold :
+ roundup_pow_of_two(sizeof(struct mlx4_en_rx_desc) +
+ DS_SIZE * MLX4_EN_MAX_RX_FRAGS);
+
+ priv->allocated = 0;
+
+ mdev->rte_pndev[port] = dev;
+ mdev->rte_upper[port] = NULL;
+
+ if(dev->data->dev_conf.rxmode.jumbo_frame)
+ dev->data->mtu = dev->data->dev_conf.rxmode.max_rx_pkt_len;
+ dev->data->mtu = RTE_MIN(priv->max_mtu, dev->data->mtu);
+ int eff_mtu = dev->data->mtu + ETH_HLEN + VLAN_HLEN;
+ priv->eff_mtu = eff_mtu;
+
+
+ return 0;
+}
+
+int mlx4_eth_rx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t rx_queue_id,
+ uint16_t nb_rx_desc,
+ unsigned int socket_id,
+ const struct rte_eth_rxconf *rx_conf,
+ struct rte_mempool *mb_pool)
+{
+ struct mlx4_en_priv* priv = rtedev_priv(dev);
+ int port = priv->port;
+ struct mlx4_en_dev* mdev = priv->mdev;
+ struct mlx4_en_rx_ring* rxq = rte_zmalloc_socket("mlx4_rx_ring",
+ sizeof(struct mlx4_en_rx_ring), RTE_CACHE_LINE_SIZE, dev->pci_dev->numa_node);
+ rxq->mb_pool = mb_pool;
+ int ret = mlx4_en_create_cq(priv, &rxq->rx_cq, nb_rx_desc, rx_queue_id, RX, dev->pci_dev->numa_node);
+ if(ret < 0)
+ return ret;
+
+ {
+ int stride = priv->stride;
+ int size = nb_rx_desc;
+
+ rxq->prod = 0;
+ rxq->cons = 0;
+ rxq->size = size;
+ rxq->size_mask = size - 1;
+ rxq->stride = stride;
+ rxq->log_stride = ffs(rxq->stride) - 1;
+ rxq->buf_size = rxq->size * rxq->stride + TXBB_SIZE;
+
+ /* Allocate HW buffers on provided NUMA node */
+ //set_dev_node(&mdev->dev->persist->pdev->dev, node);
+ ret = mlx4_alloc_hwq_res(mdev->dev, &rxq->wqres,
+ rxq->buf_size, 2 * PAGE_SIZE);
+ //set_dev_node(&mdev->dev->persist->pdev->dev, mdev->dev->numa_node);
+ if (ret)
+ return ret;
+
+ ret = mlx4_en_map_buffer(&rxq->wqres.buf);
+ if (ret) {
+ en_err(priv, "Failed to map RX buffer\n");
+ return ret;
+ }
+ rxq->buf = rxq->wqres.buf.direct.buf;
+
+ rxq->enable_hwtstamp = 0;
+ }
+ {
+ int tmp;
+ size_t frag_size = rte_pktmbuf_data_room_size(mb_pool) - RTE_PKTMBUF_HEADROOM;
+ int eff_mtu = priv->eff_mtu;
+
+ int buf_size = 0;
+ int i = 0;
+
+ while (buf_size < eff_mtu) {
+ buf_size += frag_size;
+ i++;
+ }
+ assert(i<=MLX4_EN_MAX_RX_FRAGS);
+ rxq->num_frags = i;
+ rxq->frag_size = frag_size;
+
+ assert(rte_mempool_count(mb_pool) >= (rxq->num_frags * rxq->size));
+
+ tmp = nb_rx_desc * sizeof(struct rte_mbuf*) * i;
+ rxq->rx_info = rte_zmalloc_socket("mlx4_rx_info",
+ tmp, RTE_CACHE_LINE_SIZE, dev->pci_dev->numa_node);
+ if (!rxq->rx_info) {
+ ret = -ENOMEM;
+ return ret;
+ }
+
+ en_dbg(DRV, priv, "Allocated rx_info ring at addr:%p size:%d\n",
+ rxq->rx_info, tmp);
+
+ en_dbg(DRV, priv, "Rx buffer scatter-list (Q idx:%d effective-mtu:%d num_frags:%d):\n",
+ rx_queue_id, eff_mtu, i);
+ }
+ dev->data->rx_queues[rx_queue_id] = rxq;
+ return ret;
+}
+
+int mlx4_eth_tx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t tx_queue_id,
+ uint16_t nb_tx_desc,
+ unsigned int socket_id,
+ const struct rte_eth_txconf *tx_conf)
+{
+ int ret;
+
+ struct mlx4_en_priv* priv = rtedev_priv(dev);
+ int port = priv->port;
+ struct mlx4_en_dev* mdev = priv->mdev;
+ struct mlx4_en_tx_ring* txq = rte_zmalloc_socket("mlx4_tx_ring",
+ sizeof(struct mlx4_en_tx_ring), RTE_CACHE_LINE_SIZE, dev->pci_dev->numa_node);
+ ret = mlx4_en_create_cq(priv, &txq->tx_cq, nb_tx_desc, tx_queue_id, TX, dev->pci_dev->numa_node);
+ if(ret < 0)
+ return ret;
+
+ {
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_en_tx_ring *ring = txq;
+ int tmp;
+ int size = nb_tx_desc;
+ int stride = TXBB_SIZE;
+ int queue_index = tx_queue_id;
+
+ ring->size = size;
+ ring->size_mask = size - 1;
+ ring->stride = stride;
+
+ tmp = size * sizeof(struct mlx4_en_tx_info);
+ ring->tx_info = rte_zmalloc_socket("mlx4_tx_info",
+ tmp, RTE_CACHE_LINE_SIZE, dev->pci_dev->numa_node);
+ if (!ring->tx_info) {
+ ret = -ENOMEM;
+ return ret;
+ }
+
+ en_dbg(DRV, priv, "Allocated tx_info ring at addr:%p size:%d\n",
+ ring->tx_info, tmp);
+
+ ring->bounce_buf = rte_zmalloc_socket("mlx4_tx_bounce",
+ MAX_DESC_SIZE, RTE_CACHE_LINE_SIZE, dev->pci_dev->numa_node);
+ if (!ring->bounce_buf) {
+ ret = -ENOMEM;
+ return ret;
+ }
+ ring->buf_size = ALIGN(size * ring->stride, MLX4_EN_PAGE_SIZE);
+
+ /* Allocate HW buffers on provided NUMA node */
+ //set_dev_node(&mdev->dev->persist->pdev->dev, node);
+ ret = mlx4_alloc_hwq_res(mdev->dev, &ring->wqres, ring->buf_size,
+ 2 * PAGE_SIZE);
+ //set_dev_node(&mdev->dev->persist->pdev->dev, mdev->dev->numa_node);
+ if (ret) {
+ en_err(priv, "Failed allocating hwq resources\n");
+ return ret;
+ }
+
+ ret = mlx4_en_map_buffer(&ring->wqres.buf);
+ if (ret) {
+ en_err(priv, "Failed to map TX buffer\n");
+ return ret;
+ }
+
+ ring->buf = ring->wqres.buf.direct.buf;
+
+ en_dbg(DRV, priv, "Allocated TX ring (addr:%p) - buf:%p size:%d buf_size:%d dma:%llx\n",
+ ring, ring->buf, ring->size, ring->buf_size,
+ (unsigned long long) ring->wqres.buf.direct.map);
+
+ ret = mlx4_qp_reserve_range(mdev->dev, 1, 1, &ring->qpn,
+ MLX4_RESERVE_ETH_BF_QP);
+ if (ret) {
+ en_err(priv, "failed reserving qp for TX ring\n");
+ return ret;
+ }
+
+ ret = mlx4_qp_alloc(mdev->dev, ring->qpn, &ring->qp, GFP_KERNEL);
+ if (ret) {
+ en_err(priv, "Failed allocating qp %d\n", ring->qpn);
+ return ret;
+ }
+ ring->qp.event = mlx4_en_sqp_event;
+
+ //ret = mlx4_bf_alloc(mdev->dev, &ring->bf, node);
+ //if (ret)
+ {
+ en_dbg(DRV, priv, "working without blueflame (%d)\n", ret);
+ ring->bf.uar = &mdev->priv_uar;
+ ring->bf.uar->map = mdev->uar_map;
+ ring->bf_enabled = false;
+ ring->bf_alloced = false;
+ priv->pflags &= ~MLX4_EN_PRIV_FLAGS_BLUEFLAME;
+ }
+ //else {
+ // ring->bf_alloced = true;
+ // ring->bf_enabled = !!(priv->pflags &
+ // MLX4_EN_PRIV_FLAGS_BLUEFLAME);
+ //}
+
+ //ring->hwtstamp_tx_type = priv->hwtstamp_config.tx_type;
+ ring->enable_hwtstamp = 0;
+ ring->queue_index = queue_index;
+
+ //if (queue_index < priv->num_tx_rings_p_up && cpu_online(queue_index))
+ // cpumask_set_cpu(queue_index, &ring->affinity_mask);
+ }
+ dev->data->tx_queues[tx_queue_id] = txq;
+ return ret;
+}
+
+uint16_t
+mlx4_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
+ uint16_t nb_pkts)
+{
+ struct mlx4_en_rx_ring* ring = rx_queue;
+ return mlx4_en_process_rx_cq(ring, rx_pkts, nb_pkts);
+}
+
+int mlx4_en_xmit(struct rte_mbuf *mbuf, struct mlx4_en_tx_ring *ring)
+{
+ struct rte_eth_dev* dev = ring->tx_cq.rte_dev;
+ struct mlx4_en_priv *priv = rtedev_priv(dev);
+ struct mlx4_en_tx_desc *tx_desc;
+ struct mlx4_wqe_data_seg *data;
+ struct mlx4_en_tx_info *tx_info;
+ int nr_txbb;
+ int desc_size;
+ int real_size;
+ u32 index;
+ __be32 op_own;
+ //u16 vlan_tag = 0;
+ int i_frag;
+ bool bounce = false;
+ bool stop_queue;
+ //u32 ring_cons;
+ __be32 owner_bit;
+
+ owner_bit = (ring->prod & ring->size) ?
+ cpu_to_be32(MLX4_EN_BIT_DESC_OWN) : 0;
+
+ /* fetch ring->cons far ahead before needing it to avoid stall */
+ //ring_cons = ACCESS_ONCE(ring->cons);
+
+ real_size = CTRL_SIZE + (mbuf->nb_segs) * DS_SIZE;
+
+ /* Align descriptor to TXBB size */
+ desc_size = ALIGN(real_size, TXBB_SIZE);
+ nr_txbb = desc_size / TXBB_SIZE;
+ if (unlikely(nr_txbb > MAX_DESC_TXBBS)) {
+ en_warn(priv, "Oversized header or SG list\n");
+ goto tx_drop;
+ }
+
+ //vlan_tag = mbuf->vlan_tci;
+
+
+
+ /* Packet is good - grab an index and transmit it */
+ index = ring->prod & ring->size_mask;
+
+ /* See if we have enough space for whole descriptor TXBB for setting
+ * SW ownership on next descriptor; if not, use a bounce buffer. */
+ if (likely(index + nr_txbb <= ring->size)) {
+ tx_desc = ring->buf + index * TXBB_SIZE;
+ } else {
+ tx_desc = (struct mlx4_en_tx_desc *) ring->bounce_buf;
+ bounce = true;
+ }
+
+ /* Save mbuf in tx_info ring */
+ tx_info = &ring->tx_info[index];
+ tx_info->mbuf = mbuf;
+ tx_info->nr_txbb = nr_txbb;
+
+ data = &tx_desc->data;
+
+
+ {
+ struct mlx4_wqe_data_seg * data_iter;
+ struct rte_mbuf* mbuf_iter;
+ dma_addr_t dma = 0;
+ u32 byte_count = 0;
+
+ /* Map fragments if any */
+ mbuf_iter = mbuf;
+ data_iter = data;
+ for(i_frag = 0; i_frag < mbuf->nb_segs; i_frag++)
+ {
+ byte_count = rte_pktmbuf_data_len(mbuf_iter);
+ dma = mbuf_iter->buf_physaddr + mbuf_iter->data_off;
+ data_iter->addr = cpu_to_be64(dma);
+ data_iter->lkey = ring->mr_key;
+ wmb();
+ data_iter->byte_count = SET_BYTE_COUNT(byte_count);
+ ++data_iter;
+ mbuf_iter = mbuf_iter->next;
+ }
+
+ /* tx completion can avoid cache line miss for common cases */
+ //tx_info->map0_dma = dma;
+ //tx_info->map0_byte_count = byte_count;
+ }
+
+
+ /* Prepare ctrl segment apart from opcode+ownership, which depends on
+ * whether LSO is used */
+ tx_desc->ctrl.srcrb_flags = priv->ctrl_flags;
+ if (mbuf->ol_flags & (PKT_TX_L4_MASK | PKT_TX_IP_CKSUM)) {
+ tx_desc->ctrl.srcrb_flags |= cpu_to_be32(MLX4_WQE_CTRL_IP_CSUM |
+ MLX4_WQE_CTRL_TCP_UDP_CSUM);
+ }
+
+ if (priv->flags & MLX4_EN_FLAG_ENABLE_HW_LOOPBACK) {
+ struct ethhdr *ethh;
+
+ /* Copy dst mac address to wqe. This allows loopback in eSwitch,
+ * so that VFs and PF can communicate with each other
+ */
+ ethh = rte_pktmbuf_mtod(mbuf, struct ethhdr *);
+ tx_desc->ctrl.srcrb_flags16[0] = cpu_to_be16(*((__be16 *)ethh->h_dest));
+ tx_desc->ctrl.imm = cpu_to_be32(*((__be32 *)(ethh->h_dest + 2)));
+ }
+
+ {
+ /* Normal (Non LSO) packet */
+ op_own = cpu_to_be32(MLX4_OPCODE_SEND);
+ }
+
+ op_own |= owner_bit;
+
+ ring->prod += nr_txbb;
+
+ /* If we used a bounce buffer then copy descriptor back into place */
+ if (unlikely(bounce))
+ tx_desc = mlx4_en_bounce_to_desc(priv, ring, index, desc_size);
+
+
+
+ real_size = (real_size / 16) & 0x3f;
+
+ {
+ tx_desc->ctrl.vlan_tag = 0;//cpu_to_be16(vlan_tag);
+ //tx_desc->ctrl.ins_vlan = MLX4_WQE_CTRL_INS_VLAN *
+// !!mbuf->vlan_tci;
+ tx_desc->ctrl.ins_vlan = 0;
+ tx_desc->ctrl.fence_size = real_size;
+
+ /* Ensure new descriptor hits memory
+ * before setting ownership of this descriptor to HW
+ */
+ wmb();
+ tx_desc->ctrl.owner_opcode = op_own;
+
+ }
+ return 1;
+
+tx_drop:
+ rte_pktmbuf_free(mbuf);
+ return 1;
+}
+
+
+
+uint16_t
+mlx4_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
+ uint16_t nb_pkts)
+{
+ struct mlx4_en_tx_ring* ring = tx_queue;
+
+ uint16_t sent = 0;
+ mlx4_en_process_tx_cq(ring);
+ while(sent < nb_pkts)
+ {
+ if(mlx4_txq_is_full(ring))
+ break;
+ sent += mlx4_en_xmit(tx_pkts[sent], ring);
+ }
+ if(sent > 0)
+ {
+ wmb();
+ /* Since there is no iowrite*_native() that writes the
+ * value as is, without byteswapping - using the one
+ * that doesn't do byteswapping in the relevant arch
+ * endianness.
+ */
+ write32(
+ ring->doorbell_qpn,
+ ring->bf.uar->map + MLX4_SEND_DOORBELL);
+ }
+ return sent;
+}
+
+
+int mlx4_en_dev_link_update(struct rte_eth_dev *eth_dev, int wait_to_complete)
+{
+ struct mlx4_en_priv *en_priv = rtedev_priv(eth_dev);
+ int ret = mlx4_en_QUERY_PORT(en_priv->mdev, en_priv->port);
+ assert(ret == 0);
+
+ eth_dev->data->dev_link.link_status = en_priv->port_state.link_state;
+ eth_dev->data->dev_link.link_duplex = ETH_LINK_FULL_DUPLEX;
+ switch(en_priv->port_state.link_speed)
+ {
+ case SPEED_100:
+ eth_dev->data->dev_link.link_speed = ETH_LINK_SPEED_100;
+ break;
+ case SPEED_1000:
+ eth_dev->data->dev_link.link_speed = ETH_LINK_SPEED_1000;
+ break;
+ case SPEED_10000:
+ eth_dev->data->dev_link.link_speed = ETH_LINK_SPEED_10G;
+ break;
+ case SPEED_20000:
+ eth_dev->data->dev_link.link_speed = ETH_LINK_SPEED_20G;
+ break;
+ case SPEED_40000:
+ eth_dev->data->dev_link.link_speed = ETH_LINK_SPEED_40G;
+ break;
+ case SPEED_56000:
+ RTE_LOG(WARNING, EAL, "56G is not recognized in DPDK, assuming 40G\n");
+ eth_dev->data->dev_link.link_speed = ETH_LINK_SPEED_40G;
+ break;
+ case -1:
+ msleep(500);
+ break;
+ default:
+ RTE_LOG(ERR, EAL, "Unknown link type %d\n", en_priv->port_state.link_speed);
+ break;
+ }
+
+ return 0;
+}
+
+void mlx4_en_dev_promiscuous_enable(struct rte_eth_dev *eth_dev)
+{
+ struct mlx4_en_priv *priv = rtedev_priv(eth_dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ {
+ int err = 0;
+
+ if (!(priv->flags & MLX4_EN_FLAG_PROMISC)) {
+ //if (netif_msg_rx_status(priv))
+ en_warn(priv, "Entering promiscuous mode\n");
+ priv->flags |= MLX4_EN_FLAG_PROMISC;
+
+ /* Enable promiscuous mode */
+ switch (mdev->dev->caps.steering_mode) {
+ case MLX4_STEERING_MODE_DEVICE_MANAGED:
+ err = mlx4_flow_steer_promisc_add(mdev->dev,
+ priv->port,
+ priv->base_qpn,
+ MLX4_FS_ALL_DEFAULT);
+ if (err)
+ en_err(priv, "Failed enabling promiscuous mode\n");
+ priv->flags |= MLX4_EN_FLAG_MC_PROMISC;
+ break;
+
+ case MLX4_STEERING_MODE_B0:
+ err = mlx4_unicast_promisc_add(mdev->dev,
+ priv->base_qpn,
+ priv->port);
+ if (err)
+ en_err(priv, "Failed enabling unicast promiscuous mode\n");
+
+ /* Add the default qp number as multicast
+ * promisc
+ */
+ if (!(priv->flags & MLX4_EN_FLAG_MC_PROMISC)) {
+ err = mlx4_multicast_promisc_add(mdev->dev,
+ priv->base_qpn,
+ priv->port);
+ if (err)
+ en_err(priv, "Failed enabling multicast promiscuous mode\n");
+ priv->flags |= MLX4_EN_FLAG_MC_PROMISC;
+ }
+ break;
+
+ case MLX4_STEERING_MODE_A0:
+ err = mlx4_SET_PORT_qpn_calc(mdev->dev,
+ priv->port,
+ priv->base_qpn,
+ 1);
+ if (err)
+ en_err(priv, "Failed enabling promiscuous mode\n");
+ break;
+ }
+
+ /* Disable port multicast filter (unconditionally) */
+ err = mlx4_SET_MCAST_FLTR(mdev->dev, priv->port, 0,
+ 0, MLX4_MCAST_DISABLE);
+ if (err)
+ en_err(priv, "Failed disabling multicast filter\n");
+ }
+ }
+}
+void mlx4_en_dev_promiscuous_disable(struct rte_eth_dev *eth_dev)
+{
+ struct mlx4_en_priv *priv = rtedev_priv(eth_dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ {
+ int err = 0;
+
+ //if (netif_msg_rx_status(priv))
+ en_warn(priv, "Leaving promiscuous mode\n");
+ priv->flags &= ~MLX4_EN_FLAG_PROMISC;
+
+ /* Disable promiscuous mode */
+ switch (mdev->dev->caps.steering_mode) {
+ case MLX4_STEERING_MODE_DEVICE_MANAGED:
+ err = mlx4_flow_steer_promisc_remove(mdev->dev,
+ priv->port,
+ MLX4_FS_ALL_DEFAULT);
+ if (err)
+ en_err(priv, "Failed disabling promiscuous mode\n");
+ priv->flags &= ~MLX4_EN_FLAG_MC_PROMISC;
+ break;
+
+ case MLX4_STEERING_MODE_B0:
+ err = mlx4_unicast_promisc_remove(mdev->dev,
+ priv->base_qpn,
+ priv->port);
+ if (err)
+ en_err(priv, "Failed disabling unicast promiscuous mode\n");
+ /* Disable Multicast promisc */
+ if (priv->flags & MLX4_EN_FLAG_MC_PROMISC) {
+ err = mlx4_multicast_promisc_remove(mdev->dev,
+ priv->base_qpn,
+ priv->port);
+ if (err)
+ en_err(priv, "Failed disabling multicast promiscuous mode\n");
+ priv->flags &= ~MLX4_EN_FLAG_MC_PROMISC;
+ }
+ break;
+
+ case MLX4_STEERING_MODE_A0:
+ err = mlx4_SET_PORT_qpn_calc(mdev->dev,
+ priv->port,
+ priv->base_qpn, 0);
+ if (err)
+ en_err(priv, "Failed disabling promiscuous mode\n");
+ break;
+ }
+ }
+}
+
+static unsigned long en_stats_adder(__be64 *start, __be64 *next, int num)
+{
+ __be64 *curr = start;
+ unsigned long ret = 0;
+ int i;
+ int offset = next - start;
+
+ for (i = 0; i < num; i++) {
+ ret += be64_to_cpu(*curr);
+ curr += offset;
+ }
+
+ return ret;
+}
+
+void mlx4_en_dev_stats_get(struct rte_eth_dev *eth_dev,
+ struct rte_eth_stats *igb_stats)
+{
+ struct mlx4_en_priv *priv = rtedev_priv(eth_dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int port = priv->port;
+ int reset = priv->stat_reset;
+ {
+ priv->stat_reset = 0;
+ struct mlx4_en_vport_stats tmp_vport_stats;
+ struct mlx4_en_stat_out_mbox *mlx4_en_stats;
+ struct mlx4_en_stat_out_flow_control_mbox *flowstats;
+ struct mlx4_en_vport_stats *vport_stats = &priv->vport_stats;
+ struct mlx4_cmd_mailbox *mailbox;
+ u64 in_mod = reset << 8 | port;
+ int err;
+ int i, read_counters = 0;
+
+ mailbox = mlx4_alloc_cmd_mailbox(mdev->dev);
+ if (IS_ERR(mailbox))
+ return;// PTR_ERR(mailbox);
+ err = mlx4_cmd_box(mdev->dev, 0, mailbox->dma, in_mod, 0,
+ MLX4_CMD_DUMP_ETH_STATS, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_NATIVE);
+ if (err)
+ goto out;
+
+ mlx4_en_stats = mailbox->buf;
+
+
+ /* net device stats */
+ igb_stats->ierrors = be64_to_cpu(mlx4_en_stats->PCS) +
+ be32_to_cpu(mlx4_en_stats->RJBBR) +
+ be32_to_cpu(mlx4_en_stats->RCRC) +
+ be32_to_cpu(mlx4_en_stats->RRUNT) +
+ be64_to_cpu(mlx4_en_stats->RInRangeLengthErr) +
+ be64_to_cpu(mlx4_en_stats->ROutRangeLengthErr) +
+ be32_to_cpu(mlx4_en_stats->RSHORT) +
+ en_stats_adder(&mlx4_en_stats->RGIANT_prio_0,
+ &mlx4_en_stats->RGIANT_prio_1,
+ NUM_PRIORITIES);
+ igb_stats->oerrors = en_stats_adder(&mlx4_en_stats->TGIANT_prio_0,
+ &mlx4_en_stats->TGIANT_prio_1,
+ NUM_PRIORITIES);
+ igb_stats->imcasts = en_stats_adder(&mlx4_en_stats->MCAST_prio_0,
+ &mlx4_en_stats->MCAST_prio_1,
+ NUM_PRIORITIES);
+ igb_stats->imissed = be32_to_cpu(mlx4_en_stats->RDROP);
+ igb_stats->ibadlen = be32_to_cpu(mlx4_en_stats->RdropLength);
+ igb_stats->ibadcrc = be32_to_cpu(mlx4_en_stats->RCRC);
+ igb_stats->oerrors += be32_to_cpu(mlx4_en_stats->TDROP);
+
+ /* RX stats */
+ igb_stats->ipackets = en_stats_adder(&mlx4_en_stats->RTOT_prio_0,
+ &mlx4_en_stats->RTOT_prio_1,
+ NUM_PRIORITIES);
+ igb_stats->ibytes = en_stats_adder(&mlx4_en_stats->ROCT_prio_0,
+ &mlx4_en_stats->ROCT_prio_1,
+ NUM_PRIORITIES);
+
+
+ /* Tx stats */
+ igb_stats->opackets = en_stats_adder(&mlx4_en_stats->TTOT_prio_0,
+ &mlx4_en_stats->TTOT_prio_1,
+ NUM_PRIORITIES);
+ igb_stats->obytes = en_stats_adder(&mlx4_en_stats->TOCT_prio_0,
+ &mlx4_en_stats->TOCT_prio_1,
+ NUM_PRIORITIES);
+
+
+ out:
+ mlx4_free_cmd_mailbox(mdev->dev, mailbox);
+ }
+}
+
+void mlx4_en_dev_stats_reset(struct rte_eth_dev *eth_dev)
+{
+ struct mlx4_en_priv *priv = rtedev_priv(eth_dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int port = priv->port;
+ int reset = 1;
+ if(priv->stat_reset)
+ {
+ priv->stat_reset = 0;
+ struct mlx4_cmd_mailbox *mailbox;
+ u64 in_mod = reset << 8 | port;
+
+ mailbox = mlx4_alloc_cmd_mailbox(mdev->dev);
+ if (IS_ERR(mailbox))
+ return;// PTR_ERR(mailbox);
+ mlx4_cmd_box(mdev->dev, 0, mailbox->dma, in_mod, 0,
+ MLX4_CMD_DUMP_ETH_STATS, MLX4_CMD_TIME_CLASS_B,
+ MLX4_CMD_NATIVE);
+ mlx4_free_cmd_mailbox(mdev->dev, mailbox);
+ }
+ else
+ priv->stat_reset = 1;
+}
+
+int mlx4_en_dev_start(struct rte_eth_dev *dev)
+{
+ struct mlx4_en_priv *priv = rtedev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ struct mlx4_en_cq *cq;
+ struct mlx4_en_tx_ring *tx_ring;
+ struct mlx4_en_rx_ring *rx_ring;
+ int rx_index = 0;
+ int tx_index = 0;
+ int err = 0;
+ int i;
+ int j;
+ u8 mc_list[16] = {0};
+
+ if (priv->port_up) {
+ en_dbg(DRV, priv, "start port called while port already up\n");
+ return 0;
+ }
+
+ INIT_LIST_HEAD(&priv->mc_list);
+ INIT_LIST_HEAD(&priv->curr_list);
+ INIT_LIST_HEAD(&priv->ethtool_list);
+ memset(&priv->ethtool_rules[0], 0,
+ sizeof(struct ethtool_flow_id) * MAX_NUM_OF_FS_RULES);
+
+ /* Calculate Rx buf size */
+ //dev->data->mtu = min(dev->data->mtu, priv->max_mtu);
+ //mlx4_en_calc_rx_buf(dev);
+ //en_dbg(DRV, priv, "Rx buf size:%d\n", priv->rx_skb_size);
+
+ /* Configure rx cq's and rings */
+ err = mlx4_en_activate_rx_rings(priv);
+ if (err) {
+ en_err(priv, "Failed to activate RX rings\n");
+ return err;
+ }
+ for (i = 0; i < dev->data->nb_rx_queues; i++) {
+ rx_ring = dev->data->rx_queues[i];
+ cq = &rx_ring->rx_cq;
+
+ err = mlx4_en_activate_cq(priv, cq, i, rx_ring->enable_hwtstamp);
+ if (err) {
+ en_err(priv, "Failed activating Rx CQ\n");
+ //mlx4_en_free_affinity_hint(priv, i);
+ return err;
+ }
+
+ for (j = 0; j < cq->size; j++) {
+ struct mlx4_cqe *cqe = NULL;
+
+ cqe = mlx4_en_get_cqe(cq->buf, j, priv->cqe_size) +
+ priv->cqe_factor;
+ cqe->owner_sr_opcode = MLX4_CQE_OWNER_MASK;
+ }
+
+ err = mlx4_en_set_cq_moder(priv, cq);
+ if (err) {
+ en_err(priv, "Failed setting cq moderation parameters\n");
+ mlx4_en_deactivate_cq(priv, cq);
+ //mlx4_en_free_affinity_hint(priv, i);
+ return err;
+ }
+ //mlx4_en_arm_cq(priv, cq);
+ //priv->rx_ring[i]->cqn = cq->mcq.cqn;
+ ++rx_index;
+ }
+
+ /* Set qp number */
+ en_dbg(DRV, priv, "Getting qp number for port %d\n", priv->port);
+ err = mlx4_en_get_qp(priv);
+ if (err) {
+ en_err(priv, "Failed getting eth qp\n");
+ return err;
+ }
+ mdev->mac_removed[priv->port] = 0;
+
+ /* gets default allocated counter index from func cap */
+ /* or sink counter index if no resources */
+ priv->counter_index = mdev->dev->caps.def_counter_index[priv->port - 1];
+
+ en_dbg(DRV, priv, "%s: default counter index %d for port %d\n",
+ __func__, priv->counter_index, priv->port);
+
+ err = mlx4_en_config_rss_steer(dev);
+ if (err) {
+ en_err(priv, "Failed configuring rss steering\n");
+ return err;
+ }
+
+ err = mlx4_en_create_drop_qp(priv);
+ if (err)
+ return err;
+
+ /* Configure tx cq's and rings */
+ for (i = 0; i < dev->data->nb_tx_queues; i++) {
+ /* Configure cq */
+ tx_ring = dev->data->tx_queues[i];
+ cq = &tx_ring->tx_cq;
+ err = mlx4_en_activate_cq(priv, cq, i, tx_ring->enable_hwtstamp);
+ if (err) {
+ en_err(priv, "Failed allocating Tx CQ\n");
+ return err;
+ }
+ err = mlx4_en_set_cq_moder(priv, cq);
+ if (err) {
+ en_err(priv, "Failed setting cq moderation parameters\n");
+ mlx4_en_deactivate_cq(priv, cq);
+ return err;
+ }
+ en_dbg(DRV, priv, "Resetting index of collapsed CQ:%d to -1\n", i);
+ cq->buf->wqe_index = cpu_to_be16(0xffff);
+
+ /* Configure ring */
+ //tx_ring = priv->tx_ring[i];
+#ifdef HAVE_NEW_TX_RING_SCHEME
+ err = mlx4_en_activate_tx_ring(priv, tx_ring, cq->mcq.cqn,
+ i / priv->num_tx_rings_p_up);
+#else
+ err = mlx4_en_activate_tx_ring(priv, tx_ring, cq->mcq.cqn);
+#endif
+ if (err) {
+ en_err(priv, "Failed allocating Tx ring\n");
+ mlx4_en_deactivate_cq(priv, cq);
+ return err;
+ }
+ //tx_ring->tx_queue = netdev_get_tx_queue(dev, i);
+
+ /* Arm CQ for TX completions */
+ //mlx4_en_arm_cq(priv, cq);
+
+ /* Set initial ownership of all Tx TXBBs to SW (1) */
+ for (j = 0; j < tx_ring->buf_size; j += STAMP_STRIDE)
+ *((u32 *) (tx_ring->buf + j)) = INIT_OWNER_BIT;
+ ++tx_index;
+ }
+
+ /* Configure port */
+ err = mlx4_SET_PORT_general(mdev->dev, priv->port,
+ priv->eff_mtu + ETH_FCS_LEN,
+ priv->prof->tx_pause,
+ priv->prof->tx_ppp,
+ priv->prof->rx_pause,
+ priv->prof->rx_ppp);
+ if (err) {
+ en_err(priv, "Failed setting port general configurations for port %d, with error %d\n",
+ priv->port, err);
+ return err;
+ }
+ /* Set default qp number */
+ err = mlx4_SET_PORT_qpn_calc(mdev->dev, priv->port, priv->base_qpn, 0);
+ if (err) {
+ en_err(priv, "Failed setting default qp numbers\n");
+ return err;
+ }
+
+ if (mdev->dev->caps.tunnel_offload_mode == MLX4_TUNNEL_OFFLOAD_MODE_VXLAN) {
+ err = mlx4_SET_PORT_VXLAN(mdev->dev, priv->port, VXLAN_STEER_BY_OUTER_MAC, 1);
+ if (err) {
+ en_err(priv, "Failed setting port L2 tunnel configuration, err %d\n",
+ err);
+ return err;
+ }
+ }
+
+ /* Init port */
+ en_dbg(HW, priv, "Initializing port\n");
+ err = mlx4_INIT_PORT(mdev->dev, priv->port);
+ if (err) {
+ en_err(priv, "Failed Initializing port\n");
+ return err;
+ }
+
+ /* Attach rx QP to broadcast address */
+ memset(&mc_list[10], 0xff, ETH_ALEN);
+ mc_list[5] = priv->port; /* needed for B0 steering support */
+ if (mlx4_multicast_attach(mdev->dev, &priv->rss_map.indir_qp, mc_list,
+ priv->port, 0, MLX4_PROT_ETH,
+ &priv->broadcast_id))
+ mlx4_warn(mdev, "Failed Attaching Broadcast\n");
+
+ /* Must redo promiscuous mode setup. */
+ priv->flags &= ~(MLX4_EN_FLAG_PROMISC | MLX4_EN_FLAG_MC_PROMISC);
+
+ /* Schedule multicast task to populate multicast list */
+ //queue_work(mdev->workqueue, &priv->rx_mode_task);
+
+#ifdef HAVE_VXLAN_DYNAMIC_PORT
+ if (priv->mdev->dev->caps.tunnel_offload_mode == MLX4_TUNNEL_OFFLOAD_MODE_VXLAN)
+ vxlan_get_rx_port(dev);
+#endif
+
+ priv->port_up = true;
+ //netif_tx_start_all_queues(dev);
+ //netif_device_attach(dev);
+ assert(dev->data->nb_rx_queues == rx_index);
+ assert(dev->data->nb_tx_queues == tx_index);
+
+ return 0;
+}
+
+
+const struct eth_dev_ops mlx4_eth_dev_ops = {
+ .dev_configure = mlx4_eth_dev_configure,
+ .dev_infos_get = mlx4_eth_dev_infos_get,
+ .rx_queue_setup = mlx4_eth_rx_queue_setup,
+ .tx_queue_setup = mlx4_eth_tx_queue_setup,
+ .link_update = mlx4_en_dev_link_update,
+ .promiscuous_enable = mlx4_en_dev_promiscuous_enable,
+ .promiscuous_disable = mlx4_en_dev_promiscuous_disable,
+ .stats_get = mlx4_en_dev_stats_get,
+ .stats_reset = mlx4_en_dev_stats_reset,
+ .dev_start = mlx4_en_dev_start,
+};
+
+
+int mlx4_set_tx_timestamp(int port, int queue_id, int use)
+{
+ struct mlx4_en_tx_ring* ring = rte_eth_devices[port].data->tx_queues[queue_id];
+ ring->enable_hwtstamp = !!use;
+ return 0;
+}
+int mlx4_set_rx_timestamp(int port, int queue_id, int use)
+{
+ struct mlx4_en_rx_ring* ring = rte_eth_devices[port].data->rx_queues[queue_id];
+ ring->enable_hwtstamp = !!use;
+ return 0;
+}
+
+int mlx4_poll_tx_cq(int port, int txq)
+{
+ struct mlx4_en_tx_ring* ring = rte_eth_devices[port].data->tx_queues[txq];
+ return mlx4_en_process_tx_cq(ring);
+}
+int mlx4_set_tx_completion_callback(int port, int queue_id, mlx4_tx_completion_callback_t callback, void* arg)
+{
+ struct mlx4_en_tx_ring* ring = rte_eth_devices[port].data->tx_queues[queue_id];
+ ring->tx_tstamp_callback = callback;
+ ring->tx_tstamp_callback_arg = arg;
+ return 0;
+}
+uint64_t mlx4_read_dev_clock_hz(int port)
+{
+ struct mlx4_en_priv* priv = rte_eth_devices[port].data->dev_private;
+ return priv->mdev->dev->caps.hca_core_clock * 1000000UL; //MHz
+}
+uint64_t mlx4_read_dev_clock(int port)
+{
+ struct mlx4_en_priv* priv = rte_eth_devices[port].data->dev_private;
+ return mlx4_read_clock(priv->mdev->dev);
+}
diff --git a/drivers/net/mlnx_uio/prepare.py b/drivers/net/mlnx_uio/prepare.py
new file mode 100755
index 0000000..d6a5ae4
--- /dev/null
+++ b/drivers/net/mlnx_uio/prepare.py
@@ -0,0 +1,28 @@
+#!/usr/bin/env python3
+import re
+import os
+import os.path
+import sys
+
+source_ext = ['.c','.cc','.cpp','.cxx']
+header_ext = ['.h','.hpp','.hh','.hxx']
+
+root_dir = sys.argv[1]
+
+makefile_path = os.path.join(root_dir, "driver_sources.mk")
+
+target_source = []
+
+for curdir, subdirs, files in os.walk(os.path.join(root_dir, "mlnx")):
+ for f in files:
+ name,ext = os.path.splitext(f)
+ if not (ext in source_ext):
+ continue
+ target_source.append((curdir, f))
+
+with open (makefile_path, "w") as makefile:
+ for (d,source) in target_source:
+ relpath = os.path.relpath(os.path.join(d,source), root_dir)
+ makefile.write("\tSRCS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += " + relpath + "\n")
+
+
diff --git a/drivers/net/mlnx_uio/rte_pmd_mlnx_uio_version.map b/drivers/net/mlnx_uio/rte_pmd_mlnx_uio_version.map
new file mode 100644
index 0000000..ef35398
--- /dev/null
+++ b/drivers/net/mlnx_uio/rte_pmd_mlnx_uio_version.map
@@ -0,0 +1,4 @@
+DPDK_2.0 {
+
+ local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index ad6f633..e0f38d8 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -70,6 +70,9 @@ ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
_LDLIBS-$(CONFIG_RTE_LIBRTE_KNI) += -lrte_kni
_LDLIBS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += -lrte_ivshmem
endif
+ifeq ($(CONFIG_RTE_EAL_PERSISTENT_MEM),y)
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PERSISTENT) += -lrte_persistent
+endif
_LDLIBS-$(CONFIG_RTE_LIBRTE_PIPELINE) += -lrte_pipeline
_LDLIBS-$(CONFIG_RTE_LIBRTE_TABLE) += -lrte_table
@@ -102,6 +105,8 @@ endif
_LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += -libverbs
+_LDLIBS-$(CONFIG_RTE_EAL_PERSISTENT_MEM) += -lnuma
+
_LDLIBS-y += --start-group
ifeq ($(CONFIG_RTE_BUILD_COMBINE_LIBS),n)
@@ -137,6 +142,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_RING) += -lrte_pmd_ring
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += -lrte_pmd_pcap
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET) += -lrte_pmd_af_packet
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += -lrte_pmd_null
+_LDLIBS-$(CONFIG_RTE_LIBRTE_MLNX_UIO_PMD) += -lrte_pmd_mlnx_uio
endif # ! $(CONFIG_RTE_BUILD_SHARED_LIB)
--
2.1.4
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] Native uio-based PMD for Mellanox ConnectX-3 devices
2015-07-06 13:28 [dpdk-dev] [PATCH 0/2] Native uio-based PMD for Mellanox ConnectX-3 devices leeopop
2015-07-06 13:28 ` [dpdk-dev] [PATCH 1/2] eal/persistent: new library to hold memory region after program exit leeopop
2015-07-06 13:28 ` [dpdk-dev] [PATCH 2/2] mlnx_uio: new poll mode driver leeopop
@ 2015-07-06 14:17 ` Thomas Monjalon
2015-07-06 15:57 ` Keunhong Lee
2 siblings, 1 reply; 13+ messages in thread
From: Thomas Monjalon @ 2015-07-06 14:17 UTC (permalink / raw)
To: leeopop; +Cc: dev
2015-07-06 22:28, leeopop:
> This is a native UIO-based PMD for Mellanox ConnectX-3 devices.
> It uses a persistent memory library in order to provide a persistent
> scratch area for the mlx4 HCA driver.
What is the benefit of this UIO approach compared to the OFED based driver?
> We release the driver itself under BSD license, but to use it for
> commercial products, you may have to re-implement the separated GPL sources.
The GPL sources are not really separated.
The DPDK libraries must be BSD-licensed.
> The GPL affected source codes reside in the mlnx_uio/kernel directory.
It seems that a large part of the GPL driver was also copied in mlnx_uio/mlnx/.
Given that you are dropping a huge GPL codebase (whose copyright you don't own)
into a BSD library, and that you didn't give your real name in the Signed-off-by line,
it is a NACK.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH 1/2] eal/persistent: new library to hold memory region after program exit
2015-07-06 13:28 ` [dpdk-dev] [PATCH 1/2] eal/persistent: new library to hold memory region after program exit leeopop
@ 2015-07-06 14:34 ` Avi Kivity
2015-07-06 14:41 ` Thomas Monjalon
2015-07-06 19:19 ` Stephen Hemminger
1 sibling, 1 reply; 13+ messages in thread
From: Avi Kivity @ 2015-07-06 14:34 UTC (permalink / raw)
To: leeopop, dev
On 07/06/2015 04:28 PM, leeopop wrote:
> Some NICs use a host memory region as their scratch area.
> When DPDK user applications terminate, all the memory regions are lost or
> re-initialized (memzone), which causes HW faults.
> This library maintains shared memory regions that are persistent across
> multiple executions and terminations of user-level applications.
> It also manages physically contiguous memory regions.
>
> Signed-off-by: leeopop <dlrmsghd@gmail.com>
>
Does DPDK accept anonymous sign-offs?
DCO usually requires a real name.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH 1/2] eal/persistent: new library to hold memory region after program exit
2015-07-06 14:34 ` Avi Kivity
@ 2015-07-06 14:41 ` Thomas Monjalon
0 siblings, 0 replies; 13+ messages in thread
From: Thomas Monjalon @ 2015-07-06 14:41 UTC (permalink / raw)
To: Avi Kivity; +Cc: dev
2015-07-06 17:34, Avi Kivity:
> On 07/06/2015 04:28 PM, leeopop wrote:
> > Some NICs use host memory region as their scratch area.
> > When DPDK user applications terminate, all the memory regions are lost,
> > re-initialized (memzone), which causes HW faults.
> > This library maintains shared memory regions that are persistent across
> > multiple executions and terminations of user-level applications.
> > It also manages physically contiguous memory regions.
> >
> > Signed-off-by: leeopop <dlrmsghd@gmail.com>
> >
>
> Does dpdk accept anonymous signoffs?
>
> DCO usually requires a real name.
Exactly. We follow Linux guidelines for DCO:
http://dpdk.org/dev#send
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] Native uio-based PMD for Mellanox ConnectX-3 devices
2015-07-06 14:17 ` [dpdk-dev] [PATCH 0/2] Native uio-based PMD for Mellanox ConnectX-3 devices Thomas Monjalon
@ 2015-07-06 15:57 ` Keunhong Lee
2015-07-06 16:14 ` Thomas Monjalon
0 siblings, 1 reply; 13+ messages in thread
From: Keunhong Lee @ 2015-07-06 15:57 UTC (permalink / raw)
To: Thomas Monjalon; +Cc: dev
Answer 1. The UIO-based driver is faster than the ib-based driver.
It can saturate a 40G link with MTU-sized packets using a single thread, while
the ib wrapper cannot.
Answer 2. Sorry, I missed that. I'll send a new patch email with my real
name.
Question 1. Is it OK if I separate the GPL-based and BSD-based code into
separate patches?
The mlx4 kernel driver itself is dual-licensed, so I think it is considered
BSD in my source code.
The only source code under GPL is the bitmap, integer logarithm, and red-black
tree code contained in the mlnx_uio/kernel directory.
Keunhong.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] Native uio-based PMD for Mellanox ConnectX-3 devices
2015-07-06 15:57 ` Keunhong Lee
@ 2015-07-06 16:14 ` Thomas Monjalon
2015-07-06 17:55 ` Keunhong Lee
0 siblings, 1 reply; 13+ messages in thread
From: Thomas Monjalon @ 2015-07-06 16:14 UTC (permalink / raw)
To: Keunhong Lee; +Cc: dev
2015-07-07 00:57, Keunhong Lee:
> Answer 1. The UIO-based driver is faster than the ib-based driver.
> It can saturate a 40G link with MTU-sized packets using a single thread, while
> the ib wrapper cannot.
OK, interesting. Do you have numbers and details about your testbed/scenario?
> Answer 2. Sorry, I missed that. I'll make a new patch email with my real
> name.
>
> Question 1. Is it OK if I separate the GPL-based and BSD-based code into
> separate patches?
> The mlx4 kernel driver itself is dual-licensed, so I think it is considered
> BSD in my source code.
> The only source code under GPL is the bitmap, integer logarithm, and red-black
> tree code contained in the mlnx_uio/kernel directory.
These parts will be built into the user-space driver library, right?
It would change the license, which is not desirable.
Technically, your approach may be interesting.
But from a maintenance point of view, this huge codebase may be a nightmare.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] Native uio-based PMD for Mellanox ConnectX-3 devices
2015-07-06 16:14 ` Thomas Monjalon
@ 2015-07-06 17:55 ` Keunhong Lee
2015-07-07 6:50 ` Olga Shern
0 siblings, 1 reply; 13+ messages in thread
From: Keunhong Lee @ 2015-07-06 17:55 UTC (permalink / raw)
To: Thomas Monjalon; +Cc: dev
We found that the mlx4 driver, with an optimized fragmentation configuration,
performs as fast as the native PMD.
I think we have to reconsider using the native driver rather than the ib driver.
Keunhong.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH 1/2] eal/persistent: new library to hold memory region after program exit
2015-07-06 13:28 ` [dpdk-dev] [PATCH 1/2] eal/persistent: new library to hold memory region after program exit leeopop
2015-07-06 14:34 ` Avi Kivity
@ 2015-07-06 19:19 ` Stephen Hemminger
1 sibling, 0 replies; 13+ messages in thread
From: Stephen Hemminger @ 2015-07-06 19:19 UTC (permalink / raw)
To: leeopop; +Cc: dev
Am I right that this library is using Unix shared memory to get
persistent storage? That is going to be slower and, more worryingly,
the kernel makes no guarantees that the virtual to physical mapping
will not change.
There is also an ABI issue in this patch. You are introducing a new
API here, so the symbols should go in a new 2.1 section.
> + if((len / RTE_PGSIZE_4K) > 1)
> + {
> + shmget_flag |= SHM_HUGETLB;
> + }
Please follow kernel coding style. This is one of many examples that
need to be fixed before the patch can be accepted.
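For illustration, the quoted hunk could look roughly like this once it follows kernel style (space after `if`, no braces around a single statement, blank line after declarations). This is only a sketch: `persistent_shm_flags` and `PGSIZE_4K` are hypothetical names standing in for the patch's code and for RTE_PGSIZE_4K, and the 0600 mode is an assumption in place of the patch's SHM_R | SHM_W.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Assumption: stands in for DPDK's RTE_PGSIZE_4K so the sketch builds alone. */
#define PGSIZE_4K (1ULL << 12)

/* Fallback for C libraries that hide the Linux-specific flag. */
#ifndef SHM_HUGETLB
#define SHM_HUGETLB 04000
#endif

/* Hypothetical helper: build the shmget() flags for a region of len bytes. */
static int persistent_shm_flags(size_t len)
{
	/* Assumption: owner read/write (0600) instead of SHM_R | SHM_W. */
	int shmget_flag = IPC_CREAT | IPC_EXCL | 0600;

	if ((len / PGSIZE_4K) > 1)
		shmget_flag |= SHM_HUGETLB;

	return shmget_flag;
}
```

With these definitions a 4 KiB request gets no SHM_HUGETLB, while a request of two pages or more does, which matches the condition in the quoted hunk.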
ERROR: "foo* bar" should be "foo *bar"
#166: FILE: lib/librte_eal/common/include/rte_pci.h:210:
+ void* priv; /**< Private data. */
ERROR: "foo* bar" should be "foo *bar"
#195: FILE: lib/librte_eal/common/include/rte_persistent_mem.h:20:
+extern void* persistent_allocated_memory[RTE_MAX_NUMA_NODES][RTE_EAL_PERSISTENT_MEM_COUNT];
ERROR: "foo* bar" should be "foo *bar"
#302: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:44:
+static void* reserve_shared_zone(int subindex, uint32_t len, int socket_id)
ERROR: do not use C99 // comments
#307: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:49:
+ int shmget_flag = IPC_CREAT | SHM_R | SHM_W | IPC_EXCL; // | SHM_LOCKED;
WARNING: Missing a blank line after declarations
#310: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:52:
+ int err;
+ if((len / RTE_PGSIZE_4K) > 1)
ERROR: that open brace { should be on the previous line
#310: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:52:
+ if((len / RTE_PGSIZE_4K) > 1)
+ {
ERROR: space required before the open parenthesis '('
#310: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:52:
+ if((len / RTE_PGSIZE_4K) > 1)
ERROR: "foo* bar" should be "foo *bar"
#316: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:58:
+ void* addr = 0;
WARNING: Missing a blank line after declarations
#318: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:60:
+ int clear = 1;
+ if(shmid < 0)
ERROR: that open brace { should be on the previous line
#318: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:60:
+ if(shmid < 0)
+ {
ERROR: space required before the open parenthesis '('
#318: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:60:
+ if(shmid < 0)
ERROR: do not use C99 // comments
#320: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:62:
+ //Reuse existing
ERROR: that open brace { should be on the previous line
#328: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:70:
+ if(socket_id != SOCKET_ID_ANY)
+ {
ERROR: space required before the open parenthesis '('
#328: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:70:
+ if(socket_id != SOCKET_ID_ANY)
ERROR: "foo * bar" should be "foo *bar"
#330: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:72:
+ struct bitmask * mask = numa_bitmask_alloc(RTE_MAX_NUMA_NODES);
WARNING: Missing a blank line after declarations
#331: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:73:
+ struct bitmask * mask = numa_bitmask_alloc(RTE_MAX_NUMA_NODES);
+ mask = numa_bitmask_clearall(mask);
ERROR: that open brace { should be on the previous line
#336: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:78:
+ if(ret < 0)
+ {
ERROR: space required before the open parenthesis '('
#336: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:78:
+ if(ret < 0)
ERROR: that open brace { should be on the previous line
#344: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:86:
+ if(clear)
+ {
ERROR: space required before the open parenthesis '('
#344: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:86:
+ if(clear)
WARNING: Missing a blank line after declarations
#350: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:92:
+ size_t size;
+ volatile uint8_t reader = 0; //this prevents from being optimized out
ERROR: do not use C99 // comments
#350: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:92:
+ volatile uint8_t reader = 0; //this prevents from being optimized out
WARNING: Use of volatile is usually wrong: see Documentation/volatile-considered-harmful.txt
#350: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:92:
+ volatile uint8_t reader = 0; //this prevents from being optimized out
ERROR: "(foo*)" should be "(foo *)"
#351: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:93:
+ volatile uint8_t* readp = (uint8_t*)addr;
ERROR: "foo* bar" should be "foo *bar"
#351: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:93:
+ volatile uint8_t* readp = (uint8_t*)addr;
WARNING: Use of volatile is usually wrong: see Documentation/volatile-considered-harmful.txt
#351: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:93:
+ volatile uint8_t* readp = (uint8_t*)addr;
WARNING: Missing a blank line after declarations
#352: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:94:
+ volatile uint8_t* readp = (uint8_t*)addr;
+ for(size = 0; size < len; size++)
ERROR: that open brace { should be on the previous line
#352: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:94:
+ for(size = 0; size < len; size++)
+ {
ERROR: space required before the open parenthesis '('
#352: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:94:
+ for(size = 0; size < len; size++)
ERROR: "foo* bar" should be "foo *bar"
#364: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:106:
+void* persistent_allocated_memory[RTE_MAX_NUMA_NODES][SHM_COUNT];
ERROR: do not initialise statics to 0 or NULL
#366: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:108:
+static int numa_count = 0;
ERROR: do not use C99 // comments
#375: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:117:
+ assert(SHM_SIZE == RTE_PGSIZE_2M); //XXX considering only 2MB pages.
WARNING: Missing a blank line after declarations
#377: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:119:
+ int num_numa = numa_num_configured_nodes();
+ if(num_numa == 0)
ERROR: space required before the open parenthesis '('
#377: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:119:
+ if(num_numa == 0)
WARNING: Missing a blank line after declarations
#382: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:124:
+ int k;
+ for(node = 0; node < RTE_MAX_NUMA_NODES; node++)
ERROR: space required before the open parenthesis '('
#382: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:124:
+ for(node = 0; node < RTE_MAX_NUMA_NODES; node++)
ERROR: spaces required around that '=' (ctx:VxV)
#383: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:125:
+ for(k=0; k<SHM_COUNT; k++)
^
ERROR: spaces required around that '<' (ctx:VxV)
#383: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:125:
+ for(k=0; k<SHM_COUNT; k++)
^
ERROR: space required before the open parenthesis '('
#383: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:125:
+ for(k=0; k<SHM_COUNT; k++)
ERROR: that open brace { should be on the previous line
#386: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:128:
+ for(node = 0; node < num_numa; node++)
+ {
ERROR: space required before the open parenthesis '('
#386: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:128:
+ for(node = 0; node < num_numa; node++)
WARNING: Missing a blank line after declarations
#389: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:131:
+ int cur_socket = num_numa > 1 ? node : SOCKET_ID_ANY;
+ for(k=0; k<SHM_COUNT/num_numa; k++)
ERROR: that open brace { should be on the previous line
#389: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:131:
+ for(k=0; k<SHM_COUNT/num_numa; k++)
+ {
ERROR: spaces required around that '=' (ctx:VxV)
#389: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:131:
+ for(k=0; k<SHM_COUNT/num_numa; k++)
^
ERROR: spaces required around that '<' (ctx:VxV)
#389: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:131:
+ for(k=0; k<SHM_COUNT/num_numa; k++)
^
ERROR: space required before the open parenthesis '('
#389: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:131:
+ for(k=0; k<SHM_COUNT/num_numa; k++)
WARNING: Missing a blank line after declarations
#392: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:134:
+ int zone_index = ((SHM_COUNT/num_numa)*node + k);
+ persistent_allocated_memory[node][k] = reserve_shared_zone(zone_index,
ERROR: that open brace { should be on the previous line
#394: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:136:
+ if(persistent_allocated_memory[node][k] == 0)
+ {
ERROR: space required before the open parenthesis '('
#394: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:136:
+ if(persistent_allocated_memory[node][k] == 0)
WARNING: quoted string split across lines
#397: FILE: lib/librte_eal/linuxapp/eal/eal_persistent_mem.c:139:
+ RTE_LOG(ERR, EAL, "Cannot allocate shared zone index %d."
+ "node: %d, local index: %d\n", zone_index, node, k);
ERROR: do not initialise statics to 0 or NULL
#533: FILE: lib/librte_persistent/rte_persistent.c:26:
+static struct rte_hash* allocated_segments = 0;
ERROR: "foo* bar" should be "foo *bar"
#533: FILE: lib/librte_persistent/rte_persistent.c:26:
+static struct rte_hash* allocated_segments = 0;
ERROR: open brace '{' following struct go on the same line
#536: FILE: lib/librte_persistent/rte_persistent.c:29:
+struct alloc_info
+{
ERROR: do not use C99 // comments
#537: FILE: lib/librte_persistent/rte_persistent.c:30:
+ void* addr; //0 if not allocated
ERROR: "foo* bar" should be "foo *bar"
#537: FILE: lib/librte_persistent/rte_persistent.c:30:
+ void* addr; //0 if not allocated
ERROR: do not initialise statics to 0 or NULL
#550: FILE: lib/librte_persistent/rte_persistent.c:43:
+static int __initialized = 0;
ERROR: that open brace { should be on the previous line
#554: FILE: lib/librte_persistent/rte_persistent.c:47:
+ if(!__initialized)
+ {
ERROR: space required before the open parenthesis '('
#554: FILE: lib/librte_persistent/rte_persistent.c:47:
+ if(!__initialized)
ERROR: that open brace { should be on the previous line
#557: FILE: lib/librte_persistent/rte_persistent.c:50:
+ struct rte_hash_parameters hash_param =
+ {
ERROR: "(foo*)" should be "(foo *)"
#561: FILE: lib/librte_persistent/rte_persistent.c:54:
+ .key_len = sizeof(void*),
ERROR: do not use C99 // comments
#562: FILE: lib/librte_persistent/rte_persistent.c:55:
+ .hash_func = 0, //DEFAULT_HASH_FUNC,
WARNING: Missing a blank line after declarations
#571: FILE: lib/librte_persistent/rte_persistent.c:64:
+ int k;
+ for(k=0; k<SEGMENT_COUNT; k++)
ERROR: spaces required around that '=' (ctx:VxV)
#571: FILE: lib/librte_persistent/rte_persistent.c:64:
+ for(k=0; k<SEGMENT_COUNT; k++)
^
ERROR: spaces required around that '<' (ctx:VxV)
#571: FILE: lib/librte_persistent/rte_persistent.c:64:
+ for(k=0; k<SEGMENT_COUNT; k++)
^
ERROR: space required before the open parenthesis '('
#571: FILE: lib/librte_persistent/rte_persistent.c:64:
+ for(k=0; k<SEGMENT_COUNT; k++)
ERROR: "foo* bar" should be "foo *bar"
#588: FILE: lib/librte_persistent/rte_persistent.c:81:
+void* rte_persistent_alloc(size_t size, int socket)
WARNING: Missing a blank line after declarations
#591: FILE: lib/librte_persistent/rte_persistent.c:84:
+ int num_numa = rte_persistent_memory_num_numa();
+ if(socket == SOCKET_ID_ANY)
ERROR: that open brace { should be on the previous line
#591: FILE: lib/librte_persistent/rte_persistent.c:84:
+ if(socket == SOCKET_ID_ANY)
+ {
ERROR: space required before the open parenthesis '('
#591: FILE: lib/librte_persistent/rte_persistent.c:84:
+ if(socket == SOCKET_ID_ANY)
WARNING: Missing a blank line after declarations
#600: FILE: lib/librte_persistent/rte_persistent.c:93:
+ int num_page = (size / ALLOC_UNIT);
+ if(size % ALLOC_UNIT)
ERROR: space required before the open parenthesis '('
#600: FILE: lib/librte_persistent/rte_persistent.c:93:
+ if(size % ALLOC_UNIT)
WARNING: Missing a blank line after declarations
#605: FILE: lib/librte_persistent/rte_persistent.c:98:
+ int k;
+ for(k=0; k<num_page; k++)
ERROR: that open brace { should be on the previous line
#605: FILE: lib/librte_persistent/rte_persistent.c:98:
+ for(k=0; k<num_page; k++)
+ {
ERROR: spaces required around that '=' (ctx:VxV)
#605: FILE: lib/librte_persistent/rte_persistent.c:98:
+ for(k=0; k<num_page; k++)
^
ERROR: spaces required around that '<' (ctx:VxV)
#605: FILE: lib/librte_persistent/rte_persistent.c:98:
+ for(k=0; k<num_page; k++)
^
ERROR: space required before the open parenthesis '('
#605: FILE: lib/librte_persistent/rte_persistent.c:98:
+ for(k=0; k<num_page; k++)
ERROR: "foo* bar" should be "foo *bar"
#611: FILE: lib/librte_persistent/rte_persistent.c:104:
+ void* found_buffer = 0;
WARNING: Missing a blank line after declarations
#612: FILE: lib/librte_persistent/rte_persistent.c:105:
+ void* found_buffer = 0;
+ for(k=l_start; k<(l_start + l_range); k++)
ERROR: that open brace { should be on the previous line
#612: FILE: lib/librte_persistent/rte_persistent.c:105:
+ for(k=l_start; k<(l_start + l_range); k++)
+ {
ERROR: spaces required around that '=' (ctx:VxV)
#612: FILE: lib/librte_persistent/rte_persistent.c:105:
+ for(k=l_start; k<(l_start + l_range); k++)
^
ERROR: spaces required around that '<' (ctx:VxV)
#612: FILE: lib/librte_persistent/rte_persistent.c:105:
+ for(k=l_start; k<(l_start + l_range); k++)
^
ERROR: space required before the open parenthesis '('
#612: FILE: lib/librte_persistent/rte_persistent.c:105:
+ for(k=l_start; k<(l_start + l_range); k++)
ERROR: "foo* bar" should be "foo *bar"
#614: FILE: lib/librte_persistent/rte_persistent.c:107:
+ char* start = alloc_array[k];
ERROR: "foo* bar" should be "foo *bar"
#615: FILE: lib/librte_persistent/rte_persistent.c:108:
+ char* found = strstr(start, find_str);
ERROR: that open brace { should be on the previous line
#617: FILE: lib/librte_persistent/rte_persistent.c:110:
+ if(found)
+ {
ERROR: space required before the open parenthesis '('
#617: FILE: lib/librte_persistent/rte_persistent.c:110:
+ if(found)
WARNING: Missing a blank line after declarations
#620: FILE: lib/librte_persistent/rte_persistent.c:113:
+ int offset = found - start;
+ found_buffer = persistent_allocated_memory[socket][k];
WARNING: Missing a blank line after declarations
#624: FILE: lib/librte_persistent/rte_persistent.c:117:
+ int j;
+ for(j=0; j<num_page; j++)
ERROR: that open brace { should be on the previous line
#624: FILE: lib/librte_persistent/rte_persistent.c:117:
+ for(j=0; j<num_page; j++)
+ {
ERROR: spaces required around that '=' (ctx:VxV)
#624: FILE: lib/librte_persistent/rte_persistent.c:117:
+ for(j=0; j<num_page; j++)
^
ERROR: spaces required around that '<' (ctx:VxV)
#624: FILE: lib/librte_persistent/rte_persistent.c:117:
+ for(j=0; j<num_page; j++)
^
ERROR: space required before the open parenthesis '('
#624: FILE: lib/librte_persistent/rte_persistent.c:117:
+ for(j=0; j<num_page; j++)
WARNING: Missing a blank line after declarations
#629: FILE: lib/librte_persistent/rte_persistent.c:122:
+ int index = rte_hash_add_key(allocated_segments, &found_buffer);
+ assert(index >= 0);
ERROR: "foo* bar" should be "foo *bar"
#639: FILE: lib/librte_persistent/rte_persistent.c:132:
+ void* user = found_buffer;
WARNING: Missing a blank line after declarations
#642: FILE: lib/librte_persistent/rte_persistent.c:135:
+ size_t diff = RTE_MAX((uint64_t)user, hw) - RTE_MIN((uint64_t)user, hw);
+ for(j = 0; j < num_page; j++)
ERROR: that open brace { should be on the previous line
#642: FILE: lib/librte_persistent/rte_persistent.c:135:
+ for(j = 0; j < num_page; j++)
+ {
ERROR: space required before the open parenthesis '('
#642: FILE: lib/librte_persistent/rte_persistent.c:135:
+ for(j = 0; j < num_page; j++)
ERROR: "(foo*)" should be "(foo *)"
#645: FILE: lib/librte_persistent/rte_persistent.c:138:
+ void* cur_user = ((char*)user + shift);
ERROR: "foo* bar" should be "foo *bar"
#645: FILE: lib/librte_persistent/rte_persistent.c:138:
+ void* cur_user = ((char*)user + shift);
WARNING: line over 120 characters
#647: FILE: lib/librte_persistent/rte_persistent.c:140:
+ size_t cur_diff = RTE_MAX((uint64_t)cur_user, cur_hw) - RTE_MIN((uint64_t)cur_user, cur_hw);
ERROR: that open brace { should be on the previous line
#649: FILE: lib/librte_persistent/rte_persistent.c:142:
+ if(cur_diff != diff)
+ {
ERROR: space required before the open parenthesis '('
#649: FILE: lib/librte_persistent/rte_persistent.c:142:
+ if(cur_diff != diff)
WARNING: line over 120 characters
#651: FILE: lib/librte_persistent/rte_persistent.c:144:
+ RTE_LOG(ERR, EAL, "Hugepage is not contiguous, curdiff: %lX, expected: %lX\n", cur_diff, diff);
ERROR: space required before the open parenthesis '('
#658: FILE: lib/librte_persistent/rte_persistent.c:151:
+ if(!found_buffer)
ERROR: "foo* bar" should be "foo *bar"
#663: FILE: lib/librte_persistent/rte_persistent.c:156:
+phys_addr_t rte_persistent_hw_addr(const void* addr)
ERROR: space required before the open parenthesis '('
#665: FILE: lib/librte_persistent/rte_persistent.c:158:
+ if(addr == 0)
ERROR: "(foo*)" should be "(foo *)"
#667: FILE: lib/librte_persistent/rte_persistent.c:160:
+ int index = rte_hash_lookup(allocated_segments, (const void*)&addr);
WARNING: Missing a blank line after declarations
#668: FILE: lib/librte_persistent/rte_persistent.c:161:
+ int index = rte_hash_lookup(allocated_segments, (const void*)&addr);
+ assert(index >= 0);
ERROR: "foo* bar" should be "foo *bar"
#674: FILE: lib/librte_persistent/rte_persistent.c:167:
+size_t rte_persistent_mem_length(const void* addr)
ERROR: "(foo*)" should be "(foo *)"
#676: FILE: lib/librte_persistent/rte_persistent.c:169:
+ int index = rte_hash_lookup(allocated_segments, (const void*)&addr);
WARNING: Missing a blank line after declarations
#677: FILE: lib/librte_persistent/rte_persistent.c:170:
+ int index = rte_hash_lookup(allocated_segments, (const void*)&addr);
+ assert(index >= 0);
ERROR: "foo* bar" should be "foo *bar"
#683: FILE: lib/librte_persistent/rte_persistent.c:176:
+void rte_persistent_free(void* addr)
ERROR: "(foo*)" should be "(foo *)"
#685: FILE: lib/librte_persistent/rte_persistent.c:178:
+ int index = rte_hash_lookup(allocated_segments, (const void*)&addr);
WARNING: Missing a blank line after declarations
#686: FILE: lib/librte_persistent/rte_persistent.c:179:
+ int index = rte_hash_lookup(allocated_segments, (const void*)&addr);
+ assert(index >= 0);
ERROR: "(foo*)" should be "(foo *)"
#700: FILE: lib/librte_persistent/rte_persistent.c:193:
+ rte_hash_del_key(allocated_segments, (const void*)&addr);
WARNING: Missing a blank line after declarations
#703: FILE: lib/librte_persistent/rte_persistent.c:196:
+ int k;
+ for(k=0; k<len; k++)
ERROR: spaces required around that '=' (ctx:VxV)
#703: FILE: lib/librte_persistent/rte_persistent.c:196:
+ for(k=0; k<len; k++)
^
ERROR: spaces required around that '<' (ctx:VxV)
#703: FILE: lib/librte_persistent/rte_persistent.c:196:
+ for(k=0; k<len; k++)
^
ERROR: space required before the open parenthesis '('
#703: FILE: lib/librte_persistent/rte_persistent.c:196:
+ for(k=0; k<len; k++)
ERROR: "foo* bar" should be "foo *bar"
#726: FILE: lib/librte_persistent/rte_persistent.h:15:
+void* rte_persistent_alloc(size_t size, int socket);
ERROR: "foo* bar" should be "foo *bar"
#727: FILE: lib/librte_persistent/rte_persistent.h:16:
+phys_addr_t rte_persistent_hw_addr(const void* addr);
ERROR: "foo* bar" should be "foo *bar"
#728: FILE: lib/librte_persistent/rte_persistent.h:17:
+void rte_persistent_free(void* addr);
ERROR: "foo* bar" should be "foo *bar"
#729: FILE: lib/librte_persistent/rte_persistent.h:18:
+size_t rte_persistent_mem_length(const void* addr);
total: 96 errors, 28 warnings, 573 lines checked
/tmp/persist.patch has style problems, please review.
NOTE: If any of the errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.
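Style aside, the code flagged around rte_persistent.c:135-144 appears to check physical contiguity by comparing, page by page, the distance between the user (virtual) address and the hardware (physical) address. A minimal sketch of that idea in kernel style, assuming the per-page addresses are already available in two arrays. `pages_are_contiguous`, `virt` and `phys` are illustrative names, not part of the patch, and the sketch uses wrapping subtraction instead of the patch's RTE_MAX/RTE_MIN difference:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return 1 if every page keeps the same virtual-to-physical offset as
 * page 0, 0 otherwise. virt[] and phys[] hold one address per page
 * (hypothetical inputs; the patch derives them differently).
 */
static int pages_are_contiguous(const uint64_t *virt, const uint64_t *phys,
				size_t num_page)
{
	uint64_t diff;
	size_t j;

	if (num_page == 0)
		return 1;

	/* Unsigned wrap-around is intended: only equality matters. */
	diff = phys[0] - virt[0];
	for (j = 1; j < num_page; j++) {
		if (phys[j] - virt[j] != diff)
			return 0;
	}

	return 1;
}
```

A check like this only describes the mapping at the moment it runs. It does not address the concern that the kernel may change the virtual to physical mapping later.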
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] Native uio-based PMD for Mellanox ConnectX-3 devices
2015-07-06 17:55 ` Keunhong Lee
@ 2015-07-07 6:50 ` Olga Shern
2015-07-07 7:02 ` Pavel Odintsov
0 siblings, 1 reply; 13+ messages in thread
From: Olga Shern @ 2015-07-07 6:50 UTC (permalink / raw)
To: Keunhong Lee, Thomas Monjalon; +Cc: dev
Hi Keunhong,
I disagree with you regarding the performance of the ConnectX-3 PMD driver based on verbs. We get line rate on a 40G link with message size smaller than 512B.
In MLNX_OFED 3.0 we introduced new verbs, called accelerated verbs, which improve the performance of RAW QP for DPDK by more than 100%. The PMD changes based on this new infrastructure have been submitted to this list and are part of the DPDK 2.1 release.
Our approach is to improve the performance of userspace verbs rather than to maintain the huge amount of code that you have submitted.
Another advantage of using the userspace verbs API is that you have a so-called bifurcated driver by design. The kernel driver can work side by side with the DPDK PMD, if needed, and there are no security issues with this model.
Best Regards,
Olga
________________________________________________________________
Olga Shern
Sr. Manager, Acceleration libraries team (Accelio, DPDK, VMA)
Mellanox Technologies, Raanana Israel
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] Native uio-based PMD for Mellanox ConnectX-3 devices
2015-07-07 6:50 ` Olga Shern
@ 2015-07-07 7:02 ` Pavel Odintsov
2015-07-07 9:18 ` Olga Shern
0 siblings, 1 reply; 13+ messages in thread
From: Pavel Odintsov @ 2015-07-07 7:02 UTC (permalink / raw)
To: Olga Shern; +Cc: dev
Hello!
Sorry for going off topic, but this question is very important to me.
Olga, can I achieve line rate on 40GE with 64-byte packets using Mellanox cards?
On Tue, Jul 7, 2015 at 9:50 AM, Olga Shern <olgas@mellanox.com> wrote:
> Hi Keunhong,
>
> I disagree with you regarding the performance of the ConnectX-3 PMD driver based on verbs. We get line rate on a 40G link with message sizes smaller than 512B.
> In MLNX_OFED 3.0 we introduced new verbs, called accelerated verbs, which improve the performance of RAW QP for DPDK by more than 100%. The PMD changes based on this new infrastructure have been submitted to this list and are part of the DPDK 2.1 release.
> Our approach is to improve the performance of userspace verbs rather than to maintain the huge amount of code that you have submitted.
>
> Another advantage of using the userspace verbs API is that you get a so-called bifurcated driver by design. The kernel driver can work side by side with the DPDK PMD, if needed, and there are no security issues with this model.
>
> Best Regards,
> Olga
>
>
> ________________________________________________________________
> Olga Shern
> Sr. Manager, Acceleration libraries team (Accelio, DPDK, VMA)
> Mellanox Technologies, Raanana Israel
>
>
>
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Keunhong Lee
> Sent: Monday, July 06, 2015 8:56 PM
> To: Thomas Monjalon
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/2] Native uio-based PMD for Mellanox ConnectX-3 devices
>
> We found that, with an optimized fragmentation configuration, the mlx4 driver performs as fast as the native PMD.
> I think we have to reconsider using the native driver rather than the ib driver.
>
> Keunhong.
>
> 2015-07-07 1:14 GMT+09:00 Thomas Monjalon <thomas.monjalon@6wind.com>:
>
>> 2015-07-07 00:57, Keunhong Lee:
>> > Answer 1. The UIO-based driver is faster than the ib-based driver.
>> > It can saturate a 40G link with MTU-sized packets using a single
>> > thread, while the ib wrapper cannot.
>>
>> OK, interesting. Do you have numbers and details about your
>> testbed/scenario?
>>
>> > Answer 2. Sorry, I missed that. I'll make a new patch email with my
>> > real name.
>> >
>> > Question 1. Is it OK if I separate the GPL-based and BSD-based code
>> > into separate patches?
>> > The mlx4 kernel driver itself is dual-licensed, so I think it is
>> > considered as BSD in my source code.
>> > The only source code under GPL is the bitmap, integer logarithm, and
>> > red-black tree contained in the mlnx_uio/kernel directory.
>>
>> These parts will be built in the user-space driver library, right?
>> It would change the license, which is not desirable.
>>
>> Technically, your approach may be interesting.
>> But from a maintenance point of view, this huge codebase may be a
>> nightmare.
>>
>>
>> > 2015-07-06 23:17 GMT+09:00 Thomas Monjalon <thomas.monjalon@6wind.com>:
>> >
>> > > 2015-07-06 22:28, leeopop:
>> > > > This is a native UIO-based PMD for Mellanox ConnectX-3 devices.
>> > > > It uses a persistent memory library in order to provide a
>> > > > persistent scratch area for the mlx4 HCA driver.
>> > >
>> > > What is the benefit of this UIO approach compared to the OFED
>> > > based driver?
>> > >
>> > > > We release the driver itself under BSD license, but to use it
>> > > > for commercial products, you may have to re-implement the
>> > > > separated GPL sources.
>> > >
>> > > The GPL sources are not really separated.
>> > > The DPDK libraries must be BSD-licensed.
>> > >
>> > > > The GPL affected source codes reside in the mlnx_uio/kernel
>> > > > directory.
>> > >
>> > > It seems that a large part of the GPL driver was also copied in
>> > > mlnx_uio/mlnx/.
>> > >
>> > > Given that you are dropping a huge GPL codebase (whose copyright
>> > > you don't own) in a BSD library, and that you didn't give your
>> > > real name in the signed-off line, it is NACK.
>>
>>
>>
>>
--
Sincerely yours, Pavel Odintsov
* Re: [dpdk-dev] [PATCH 0/2] Native uio-based PMD for Mellanox ConnectX-3 devices
2015-07-07 7:02 ` Pavel Odintsov
@ 2015-07-07 9:18 ` Olga Shern
0 siblings, 0 replies; 13+ messages in thread
From: Olga Shern @ 2015-07-07 9:18 UTC (permalink / raw)
To: Pavel Odintsov; +Cc: dev
Hi Pavel,
A new Mellanox NIC, ConnectX-4, can achieve line rate with 64-byte packets on a 40G link.
We will have a PMD that supports this NIC in DPDK 2.2.
By the way, this is also an advantage of using the verbs API: we can support a new NIC with almost no effort.
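For reference, the packet rate that "line rate" implies can be worked out from the frame size. The sketch below is not from the original thread; it assumes standard Ethernet framing, where each frame carries 20 extra bytes on the wire (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap), and it treats the frame size as including the 4-byte FCS. The function name `line_rate_pps` is made up for illustration.

```python
# Per-frame overhead on the wire that is not part of the frame itself:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def line_rate_pps(link_bps: float, frame_bytes: int) -> float:
    """Maximum frames per second for a link speed and frame size (FCS included)."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_per_frame


if __name__ == "__main__":
    # 64 B is the minimum frame, 1518 B is a full frame at the default 1500-byte MTU.
    for size in (64, 512, 1518):
        print(f"{size:>5} B: {line_rate_pps(40e9, size) / 1e6:.2f} Mpps")
```

Under these assumptions a 40G link needs about 59.52 Mpps at 64 bytes, 9.40 Mpps at 512 bytes, and 3.25 Mpps at 1518 bytes, which is why saturating the link with MTU-sized packets is a much weaker claim than reaching line rate with minimum-size packets.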
Best Regards,
Olga
-----Original Message-----
From: Pavel Odintsov [mailto:pavel.odintsov@gmail.com]
Sent: Tuesday, July 07, 2015 10:02 AM
To: Olga Shern
Cc: Keunhong Lee; Thomas Monjalon; dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/2] Native uio-based PMD for Mellanox ConnectX-3 devices
Hello!
Sorry for going off topic, but this question is very important to me.
Olga, can I achieve line rate on 40GE with 64-byte packets using Mellanox cards?
On Tue, Jul 7, 2015 at 9:50 AM, Olga Shern <olgas@mellanox.com> wrote:
> Hi Keunhong,
>
> I disagree with you regarding the performance of the ConnectX-3 PMD driver based on verbs. We get line rate on a 40G link with message sizes smaller than 512B.
> In MLNX_OFED 3.0 we introduced new verbs, called accelerated verbs, which improve the performance of RAW QP for DPDK by more than 100%. The PMD changes based on this new infrastructure have been submitted to this list and are part of the DPDK 2.1 release.
> Our approach is to improve the performance of userspace verbs rather than to maintain the huge amount of code that you have submitted.
>
> Another advantage of using the userspace verbs API is that you get a so-called bifurcated driver by design. The kernel driver can work side by side with the DPDK PMD, if needed, and there are no security issues with this model.
>
> Best Regards,
> Olga
>
>
> ________________________________________________________________
> Olga Shern
> Sr. Manager, Acceleration libraries team (Accelio, DPDK, VMA)
> Mellanox Technologies, Raanana Israel
>
>
>
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Keunhong Lee
> Sent: Monday, July 06, 2015 8:56 PM
> To: Thomas Monjalon
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/2] Native uio-based PMD for Mellanox ConnectX-3 devices
>
> We found that, with an optimized fragmentation configuration, the mlx4 driver performs as fast as the native PMD.
> I think we have to reconsider using the native driver rather than the ib driver.
>
> Keunhong.
>
> 2015-07-07 1:14 GMT+09:00 Thomas Monjalon <thomas.monjalon@6wind.com>:
>
>> 2015-07-07 00:57, Keunhong Lee:
>> > Answer 1. The UIO-based driver is faster than the ib-based driver.
>> > It can saturate a 40G link with MTU-sized packets using a single
>> > thread, while the ib wrapper cannot.
>>
>> OK, interesting. Do you have numbers and details about your
>> testbed/scenario?
>>
>> > Answer 2. Sorry, I missed that. I'll make a new patch email with my
>> > real name.
>> >
>> > Question 1. Is it OK if I separate the GPL-based and BSD-based code
>> > into separate patches?
>> > The mlx4 kernel driver itself is dual-licensed, so I think it is
>> > considered as BSD in my source code.
>> > The only source code under GPL is the bitmap, integer logarithm, and
>> > red-black tree contained in the mlnx_uio/kernel directory.
>>
>> These parts will be built in the user-space driver library, right?
>> It would change the license, which is not desirable.
>>
>> Technically, your approach may be interesting.
>> But from a maintenance point of view, this huge codebase may be a
>> nightmare.
>>
>>
>> > 2015-07-06 23:17 GMT+09:00 Thomas Monjalon <thomas.monjalon@6wind.com>:
>> >
>> > > 2015-07-06 22:28, leeopop:
>> > > > This is a native UIO-based PMD for Mellanox ConnectX-3 devices.
>> > > > It uses a persistent memory library in order to provide a
>> > > > persistent scratch area for the mlx4 HCA driver.
>> > >
>> > > What is the benefit of this UIO approach compared to the OFED
>> > > based driver?
>> > >
>> > > > We release the driver itself under BSD license, but to use it
>> > > > for commercial products, you may have to re-implement the
>> > > > separated GPL sources.
>> > >
>> > > The GPL sources are not really separated.
>> > > The DPDK libraries must be BSD-licensed.
>> > >
>> > > > The GPL affected source codes reside in the mlnx_uio/kernel
>> > > > directory.
>> > >
>> > > It seems that a large part of the GPL driver was also copied in
>> > > mlnx_uio/mlnx/.
>> > >
>> > > Given that you are dropping a huge GPL codebase (whose copyright
>> > > you don't own) in a BSD library, and that you didn't give your
>> > > real name in the signed-off line, it is NACK.
>>
>>
>>
>>
--
Sincerely yours, Pavel Odintsov
end of thread, other threads:[~2015-07-07 9:18 UTC | newest]
Thread overview: 13+ messages
2015-07-06 13:28 [dpdk-dev] [PATCH 0/2] Native uio-based PMD for Mellanox ConnectX-3 devices leeopop
2015-07-06 13:28 ` [dpdk-dev] [PATCH 1/2] eal/persistent: new library to hold memory region after program exit leeopop
2015-07-06 14:34 ` Avi Kivity
2015-07-06 14:41 ` Thomas Monjalon
2015-07-06 19:19 ` Stephen Hemminger
2015-07-06 13:28 ` [dpdk-dev] [PATCH 2/2] mlnx_uio: new poll mode driver leeopop
2015-07-06 14:17 ` [dpdk-dev] [PATCH 0/2] Native uio-based PMD for Mellanox ConnectX-3 devices Thomas Monjalon
2015-07-06 15:57 ` Keunhong Lee
2015-07-06 16:14 ` Thomas Monjalon
2015-07-06 17:55 ` Keunhong Lee
2015-07-07 6:50 ` Olga Shern
2015-07-07 7:02 ` Pavel Odintsov
2015-07-07 9:18 ` Olga Shern