Network Test Access Point (TAP) is a network monitoring service
commonly adopted in SDN-based network infrastructures. When VMs are
inter-connected over virtual switches, TAP requires the vSwitch to mirror
network traffic from specific workload VM ports to the TAP
device/VM ports. Classical mirroring implementations in the vSwitch make
an extra copy of the source packets, which significantly degrades the
throughput the vSwitch could normally achieve. Therefore, we
propose a new set of APIs to support high-throughput packet mirroring
through hardware offloading.

The proposal consists of three major parts:
- Mirror registration APIs
- Mirror offload/customization callbacks
- Shared mirror data path

In this proposal, mirroring happens between a pair of ethdev ports (one
source port and one mirror port) and is configurable on a per-port,
per-direction basis, i.e. applications invoke the mirroring API to
register source ports and traffic directions (tx or rx). The
registration API then attaches the mirror data path to the source port
as a standard ethdev tx or rx callback. If the application specifies a
custom mirror offload function, that function is executed within the
mirror data path.

The mirror data path intercepts the packets flowing over the registered
source ports and, rather than copying them, simply transmits them to the
destination (mirror) port with an incremented mbuf reference count. In
this way, the same packet data is transmitted to both the mirror port
and the original traffic destination.

In addition, the proposed APIs make it possible to implement more
complicated mirroring scenarios. Two examples are flow-based mirroring
and MAC address matching, both of which are commonly used in the industry.
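
The zero-copy idea described above can be illustrated with a small
stand-alone sketch. It does not use DPDK: struct pkt, struct port,
port_enqueue(), mirror_burst() and match_dst_mac() are all hypothetical
stand-ins invented for this illustration, and the fixed queue depth of 32
is an arbitrary assumption. In DPDK itself the corresponding steps would
presumably be rte_mbuf_refcnt_update() on each mbuf followed by
rte_eth_tx_burst() on the mirror port, with the function installed through
rte_eth_add_rx_callback() or rte_eth_add_tx_callback().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QUEUE_DEPTH 32 /* arbitrary depth for this sketch */

/* Hypothetical stand-in for an mbuf: only the reference count matters. */
struct pkt {
	uint16_t refcnt;     /* number of owners still holding the buffer */
	uint8_t dst_mac[6];  /* destination MAC, used by the example filter */
};

/* Hypothetical stand-in for the transmit queue of a port. */
struct port {
	struct pkt *queue[QUEUE_DEPTH];
	uint16_t count;
};

/* Stand-in for the custom mirror offload callback: nonzero means mirror. */
typedef int (*mirror_filter_t)(const struct pkt *p, void *ctx);

/* Enqueue one packet; returns 1 on success, 0 if the queue is full. */
int port_enqueue(struct port *p, struct pkt *m)
{
	if (p->count >= QUEUE_DEPTH)
		return 0;
	p->queue[p->count++] = m;
	return 1;
}

/*
 * Mirror a burst without copying packet data: each selected packet gets
 * one extra reference and is queued on the mirror port. The caller keeps
 * its own references and forwards the burst as usual.
 * Returns the number of packets mirrored.
 */
uint16_t mirror_burst(struct port *mirror, struct pkt **pkts, uint16_t n,
		      mirror_filter_t filter, void *ctx)
{
	uint16_t i, mirrored = 0;

	assert(mirror != NULL && pkts != NULL);
	for (i = 0; i < n; i++) {
		if (filter != NULL && !filter(pkts[i], ctx))
			continue;
		pkts[i]->refcnt++;         /* share the buffer, do not copy */
		if (!port_enqueue(mirror, pkts[i])) {
			pkts[i]->refcnt--; /* queue full: give the reference back */
			continue;
		}
		mirrored++;
	}
	return mirrored;
}

/* Example filter: mirror only packets whose destination MAC equals ctx. */
int match_dst_mac(const struct pkt *p, void *ctx)
{
	return memcmp(p->dst_mac, ctx, sizeof(p->dst_mac)) == 0;
}
```

Passing NULL as the filter mirrors every packet in the burst; passing
match_dst_mac with a 6-byte address as ctx gives the MAC address matching
case. Because both ports end up holding the same buffer, the buffer can
only be released once both owners have dropped their reference.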

Mirror registration API prototypes are defined as follows:
int rte_mirror_register(uint16_t src_port,
			struct rte_mirror_param *param);
int rte_mirror_unregister(uint16_t src_port);

struct rte_mirror_param contains all of the parameters the mirror data
path will use, e.g. the mirror port number, the source traffic direction
to be mirrored, the custom mirroring offload function pointer, the
application context, etc. Notably, applications should pass down a
spinlock through this structure, which is used to synchronize packet
data access between the application and the mirror data path.

Our prior studies demonstrate that this methodology is capable of doubling
the mirroring performance compared to the default OVS port mirroring
(refer to the paper in IEEE Xplore for further details:
https://ieeexplore.ieee.org/document/9110293)
An OVS implementation was also submitted to the OVS community for review
and comments (refer to the following OVS RFC patch:
https://patchwork.ozlabs.org/project/openvswitch/patch/1595596858-78846-2-git-send-email-emma.finn@intel.com/)

We are considering implementing the mirroring APIs as a standalone library
in DPDK, but it is also reasonable to place them inside the ethdev layer or
within the vhost-pmd, considering the potential usage scenarios.

Signed-off-by: Liang-min Wang <liang-min.wang@intel.com>
Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Signed-off-by: Timothy Miskell <timothy.miskell@intel.com>
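
To make the registration interface described in the RFC more concrete,
here is a stand-alone sketch of what the parameter structure and the
register/unregister pair might look like. The RFC only lists the items
the structure carries, so every name, type and error code below is an
assumption made for illustration (hence the sketch_ prefix), a C11
atomic_flag stands in for the spinlock, and the rule that a port cannot
mirror to itself is likewise assumed rather than taken from the RFC.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 32 /* arbitrary limit for this sketch */
static_assert(SKETCH_MAX_PORTS <= UINT16_MAX, "port ids are 16-bit");

enum sketch_mirror_dir { SKETCH_MIRROR_RX, SKETCH_MIRROR_TX };

/* Hypothetical custom offload hook: nonzero means mirror the packet. */
typedef int (*sketch_mirror_cb)(const void *pkt, void *ctx);

/* Hypothetical layout: the RFC names these items but not their types. */
struct sketch_mirror_param {
	uint16_t dst_port;          /* mirror port number */
	enum sketch_mirror_dir dir; /* source traffic direction to mirror */
	sketch_mirror_cb offload;   /* optional custom offload function */
	void *ctx;                  /* application context for the hook */
	atomic_flag *lock;          /* lock shared with the application */
};

/* One slot per source port; `active` marks a registered mirror. */
static struct sketch_slot {
	int active;
	struct sketch_mirror_param param;
} sketch_table[SKETCH_MAX_PORTS];

/* Record the mirror configuration for src_port; 0 on success. */
int sketch_mirror_register(uint16_t src_port,
			   const struct sketch_mirror_param *param)
{
	if (src_port >= SKETCH_MAX_PORTS || param == NULL ||
	    param->lock == NULL)
		return -EINVAL;
	if (param->dst_port == src_port)
		return -EINVAL; /* assumption: no mirroring onto itself */
	if (sketch_table[src_port].active)
		return -EEXIST;
	sketch_table[src_port].param = *param;
	sketch_table[src_port].active = 1;
	return 0;
}

/* Drop the mirror configuration for src_port; 0 on success. */
int sketch_mirror_unregister(uint16_t src_port)
{
	if (src_port >= SKETCH_MAX_PORTS || !sketch_table[src_port].active)
		return -ENOENT;
	sketch_table[src_port].active = 0;
	return 0;
}
```

An application would fill in one sketch_mirror_param per source port and
direction, register it, and later unregister it by source port alone. A
real implementation would additionally install and remove the ethdev rx
or tx callback at these two points; that part is left out here.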

30/07/2020 05:23, Patrick Fu:
> Network Test Access Point (TAP) is a network monitoring service
> commonly adopted in SDN-based network infrastructures. When VMs are
> inter-connected over virtual switches, TAP requires the vSwitch to mirror
> network traffic from specific workload VM ports to the TAP
> device/VM ports. Classical mirroring implementations in the vSwitch make
> an extra copy of the source packets, which significantly degrades the
> throughput the vSwitch could normally achieve. Therefore, we
> propose a new set of APIs to support high-throughput packet mirroring
> through hardware offloading.
>
> The proposal consists of three major parts:
> - Mirror registration APIs
> - Mirror offload/customization callbacks
> - Shared mirror data path
>
> In this proposal, mirroring happens between a pair of ethdev ports (one
> source port and one mirror port) and is configurable on a per-port,
> per-direction basis, i.e. applications invoke the mirroring API to
> register source ports and traffic directions (tx or rx). The
> registration API then attaches the mirror data path to the source port
> as a standard ethdev tx or rx callback. If the application specifies a
> custom mirror offload function, that function is executed within the
> mirror data path.
>
> The mirror data path intercepts the packets flowing over the registered
> source ports and, rather than copying them, simply transmits them to the
> destination (mirror) port with an incremented mbuf reference count. In
> this way, the same packet data is transmitted to both the mirror port
> and the original traffic destination.
>
> In addition, the proposed APIs make it possible to implement more
> complicated mirroring scenarios. Two examples are flow-based mirroring
> and MAC address matching, both of which are commonly used in the industry.
>
> Mirror registration API prototypes are defined as follows:
> int rte_mirror_register(uint16_t src_port,
> 			struct rte_mirror_param *param);
> int rte_mirror_unregister(uint16_t src_port);
>
> struct rte_mirror_param contains all of the parameters the mirror data
> path will use, e.g. the mirror port number, the source traffic direction
> to be mirrored, the custom mirroring offload function pointer, the
> application context, etc. Notably, applications should pass down a
> spinlock through this structure, which is used to synchronize packet
> data access between the application and the mirror data path.
>
> Our prior studies demonstrate that this methodology is capable of doubling
> the mirroring performance compared to the default OVS port mirroring
> (refer to the paper in IEEE Xplore for further details:
> https://ieeexplore.ieee.org/document/9110293)
> An OVS implementation was also submitted to the OVS community for review
> and comments (refer to the following OVS RFC patch:
> https://patchwork.ozlabs.org/project/openvswitch/patch/1595596858-78846-2-git-send-email-emma.finn@intel.com/)
>
> We are considering implementing the mirroring APIs as a standalone library
> in DPDK, but it is also reasonable to place them inside the ethdev layer or
> within the vhost-pmd, considering the potential usage scenarios.
>
> Signed-off-by: Liang-min Wang <liang-min.wang@intel.com>
> Signed-off-by: Patrick Fu <patrick.fu@intel.com>
> Signed-off-by: Timothy Miskell <timothy.miskell@intel.com>

I assume you consider deprecating rte_eth_mirror_rule_set()
http://doc.dpdk.org/api/rte__ethdev_8h.html#a1c88c5e86f0358981443600f05069091

Please consider reviewing this implementation in rte_flow:
https://patches.dpdk.org/patch/73279/

Hi Thomas,

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Thursday, July 30, 2020 2:33 PM
> To: Fu, Patrick <patrick.fu@intel.com>
> Cc: dev@dpdk.org; Yigit, Ferruh <ferruh.yigit@intel.com>;
> maxime.coquelin@redhat.com; Richardson, Bruce
> <bruce.richardson@intel.com>; Wang, Zhihong <zhihong.wang@intel.com>;
> Wang, Liang-min <liang-min.wang@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Miskell, Timothy
> <timothy.miskell@intel.com>; Liang, Cunming <cunming.liang@intel.com>;
> arybchenko@solarflare.com; Jiawei Wang <jiaweiw@mellanox.com>;
> orika@mellanox.com
> Subject: Re: [dpdk-dev] [RFC] lib: introduce traffic mirroring API
>
> I assume you consider deprecating rte_eth_mirror_rule_set()
> http://doc.dpdk.org/api/rte__ethdev_8h.html#a1c88c5e86f0358981443600f05069091
>
Not exactly.
rte_eth_mirror_rule_set() is a vendor-dependent API which allows an admin to
configure two components (traffic source and traffic destination) of the
same NIC so that packets can be copied from the traffic source to the
traffic destination in hardware. The API allows a vendor to implement this
function via a hardware-dependent offloading capability. In contrast, this
RFC proposes two high-level, vendor-independent APIs that allow an admin to
configure traffic mirroring from device A to device B, where devices A and B
may come from different vendors. In particular, our initial target is
software virtual devices such as virtio/vhost, where there is no hardware
mirroring support.

> Please consider reviewing this implementation in rte_flow:
> https://patches.dpdk.org/patch/73279/
>
For the same reason, that patch targets different use cases from our RFC.

Thanks,
Patrick

31/07/2020 04:34, Fu, Patrick:
> Hi Thomas,
>
> From: Thomas Monjalon <thomas@monjalon.net>
> >
> > I assume you consider deprecating rte_eth_mirror_rule_set()
> > http://doc.dpdk.org/api/rte__ethdev_8h.html#a1c88c5e86f0358981443600f05069091
> >
> Not exactly.
> rte_eth_mirror_rule_set() is a vendor-dependent API which allows an admin to
> configure two components (traffic source and traffic destination) of the
> same NIC so that packets can be copied from the traffic source to the
> traffic destination in hardware. The API allows a vendor to implement this
> function via a hardware-dependent offloading capability. In contrast, this
> RFC proposes two high-level, vendor-independent APIs that allow an admin to
> configure traffic mirroring from device A to device B, where devices A and B
> may come from different vendors. In particular, our initial target is
> software virtual devices such as virtio/vhost, where there is no hardware
> mirroring support.
>
> > Please consider reviewing this implementation in rte_flow:
> > https://patches.dpdk.org/patch/73279/
> >
> For the same reason, that patch targets different use cases from our RFC.

We should not have different APIs depending on the device.
Please look at how to unify them in a single API.

Hi Thomas,
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, July 31, 2020 5:32 PM
> To: Fu, Patrick <patrick.fu@intel.com>
> Cc: dev@dpdk.org; Yigit, Ferruh <ferruh.yigit@intel.com>;
> maxime.coquelin@redhat.com; Richardson, Bruce
> <bruce.richardson@intel.com>; Wang, Zhihong <zhihong.wang@intel.com>;
> Wang, Liang-min <liang-min.wang@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Miskell, Timothy
> <timothy.miskell@intel.com>; Liang, Cunming <cunming.liang@intel.com>;
> arybchenko@solarflare.com; Jiawei Wang <jiaweiw@mellanox.com>;
> orika@mellanox.com
> Subject: Re: [dpdk-dev] [RFC] lib: introduce traffic mirroring API
>
> 31/07/2020 04:34, Fu, Patrick:
> > Hi Thomas,
> >
> > From: Thomas Monjalon <thomas@monjalon.net>
> > >
> > > I assume you consider deprecating rte_eth_mirror_rule_set()
> > > http://doc.dpdk.org/api/rte__ethdev_8h.html#a1c88c5e86f0358981443600f05069091
> > >
> > Not exactly.
> > rte_eth_mirror_rule_set() is a vendor-dependent API which allows an
> > admin to configure two components (traffic source and traffic
> > destination) of the same NIC so that packets can be copied from the
> > traffic source to the traffic destination in hardware. The API allows a
> > vendor to implement this function via a hardware-dependent offloading
> > capability. In contrast, this RFC proposes two high-level,
> > vendor-independent APIs that allow an admin to configure traffic
> > mirroring from device A to device B, where devices A and B may come from
> > different vendors. In particular, our initial target is software virtual
> > devices such as virtio/vhost, where there is no hardware mirroring
> > support.
> >
> > > Please consider reviewing this implementation in rte_flow:
> > > https://patches.dpdk.org/patch/73279/
> > >
> > For the same reason, that patch targets different use cases from
> > our RFC.
>
> We should not have different APIs depending on the device.
> Please look at how to unify them in a single API.
>
I believe the proposed APIs work at a different abstraction level than the existing APIs.
But we can look into whether they could be unified.
So in general, do you think it is the right direction to add a common framework
in DPDK to support traffic mirroring across devices and for vdev devices?

Thanks,
Patrick