DPDK patches and discussions
* [dpdk-dev] 2.3 Roadmap
@ 2015-11-30 20:50 O'Driscoll, Tim
  2015-11-30 21:50 ` Thomas Monjalon
                   ` (4 more replies)
  0 siblings, 5 replies; 33+ messages in thread
From: O'Driscoll, Tim @ 2015-11-30 20:50 UTC (permalink / raw)
  To: dev

As we're nearing the completion of the 2.2 release, I'd like to start a discussion on plans for 2.3. To kick this off, below are the features that we're hoping to submit for this release.

If others are prepared to contribute their plans, then we could build a complete view of the release which Thomas can maintain on the dpdk.org roadmap page, and make sure we're not duplicating work.


IPsec Sample Application: A sample application will be created which will show how DPDK and the new cryptodev API can be used to implement IPsec. Use of the cryptodev API will allow either hardware or software encryption to be used. IKE will not be implemented so the SA/SP DBs will be statically configured.

Cryptodev Support for SNOW 3G: The cryptodev API, and the hardware and software crypto PMDs that it supports, will be enhanced to support the SNOW 3G cipher.

External Mempool Manager: SoCs and some software applications that use DPDK have their own memory allocation capabilities. This feature will allow DPDK to work with an external mempool manager.

Packet Framework (Edge Router Use Case):
- Further performance tuning for the vPE use case.
- Support for load balancing within a pipeline.
- Support for CPU utilization measurements within a pipeline.
- Improvements for the functional pipelines, tables and ports.

Ethdev Enhancements: Merge parts of the Packet Framework ports library into ethdev so they can be used without the Packet Framework. The initial focus is to add support for buffered TX to ethdev.

Live Migration: The main infrastructure to support live migration of VMs was implemented over the last few DPDK releases via the Link Bonding and PCI Hot Plug features. This feature will involve further investigation, prototyping and enhancements to improve live migration support in DPDK.

Tcpdump Support: Support for tcpdump will be added to DPDK. This will improve usability and debugging of DPDK applications.

Increase Next Hops for LPM (IPv4): The number of next hops for IPv4 LPM is currently limited to 256. This will be extended to allow a greater number of next hops.

Fm10k Enhancements: FTAG-based forwarding, and performance tuning.

Support Intel Resource Director Technology: A library will be added to DPDK to support the following Intel CPU technologies:
- CAT - Cache Allocation Technology (LLC aka L3)
- CDP - Code Data Prioritization (extension of CAT)
- CMT - Cache Monitoring Technology (LLC)
- MBM - Memory Bandwidth Monitoring, to local and remote RAM
These technologies are currently available via cgroups and perf, but this feature will provide closer integration with DPDK and a sample application showing how they can be used.

I40e Enhancements:
- Flow Director input set Alignment
- Ethertype configuration for QinQ support
- Flow Director Support for Tunnels (QinQ, GRE/NVGRE, VXLAN)
- Flow Director Support for IP Proto and IP TOS
- VEB switching
- Floating VEB
- IPGRE Support
- Set VF MAC address
- Rework PCIe extended tag enabling by using DPDK interfaces

Virtio/Vhost Enhancements:
- Virtio 1.0 support
- Vhost software TSO
- Vhost/virtio performance tuning

Container Enhancements:
- Virtio for containers
- Hugetlbfs mount point size
- Cgroup resource awareness 
- Enable short-lived DPDK applications

Generic Tunneling API:
- Implement virtual flow device framework
- Implement generic virtual device management APIs, including the following callback functions:
  - flow_ethdev_start/stop/configure/close/info_get
  - ethdev_rx/tx_queue_setup/release
  - flow_ethdev_tunnel_configure/setup/destroy
  - flow_ethdev_tunnel_pkt_decap/encap
- Implement flow device PMD driver APIs
  - rte_eth_flow_dev_create/remove/ others
- Integrate the VXLAN protocol (including VXLAN decap/encap optimization) into this framework, initially on i40e only.


Tim

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] 2.3 Roadmap
  2015-11-30 20:50 [dpdk-dev] 2.3 Roadmap O'Driscoll, Tim
@ 2015-11-30 21:50 ` Thomas Monjalon
  2015-11-30 22:19 ` Dave Neary
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 33+ messages in thread
From: Thomas Monjalon @ 2015-11-30 21:50 UTC (permalink / raw)
  To: O'Driscoll, Tim; +Cc: dev

It looks very ambitious :)
Thank you Intel for pushing forward!

2015-11-30 20:50, O'Driscoll, Tim:
> As we're nearing the completion of the 2.2 release, I'd like to start a discussion on plans for 2.3. To kick this off, below are the features that we're hoping to submit for this release.
> 
> [...]


* Re: [dpdk-dev] 2.3 Roadmap
  2015-11-30 20:50 [dpdk-dev] 2.3 Roadmap O'Driscoll, Tim
  2015-11-30 21:50 ` Thomas Monjalon
@ 2015-11-30 22:19 ` Dave Neary
  2015-12-01 11:57   ` O'Driscoll, Tim
  2015-11-30 22:30 ` Hobywan Kenoby
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 33+ messages in thread
From: Dave Neary @ 2015-11-30 22:19 UTC (permalink / raw)
  To: O'Driscoll, Tim, dev

Hi Tim,

Just curious about one item on the list:

On 11/30/2015 03:50 PM, O'Driscoll, Tim wrote:
> IPsec Sample Application: A sample application will be created which will show how DPDK and the new cryptodev API can be used to implement IPsec. Use of the cryptodev API will allow either hardware or software encryption to be used. IKE will not be implemented so the SA/SP DBs will be statically configured.

Do you anticipate this application living in the dpdk repo, or in a
separate tree?

Thanks,
Dave.

-- 
Dave Neary - NFV/SDN Community Strategy
Open Source and Standards, Red Hat - http://community.redhat.com
Ph: +1-978-399-2182 / Cell: +1-978-799-3338


* Re: [dpdk-dev] 2.3 Roadmap
  2015-11-30 20:50 [dpdk-dev] 2.3 Roadmap O'Driscoll, Tim
  2015-11-30 21:50 ` Thomas Monjalon
  2015-11-30 22:19 ` Dave Neary
@ 2015-11-30 22:30 ` Hobywan Kenoby
  2015-12-01 11:52   ` O'Driscoll, Tim
  2015-11-30 22:53 ` Kyle Larose
  2015-12-01 12:59 ` Matthew Hall
  4 siblings, 1 reply; 33+ messages in thread
From: Hobywan Kenoby @ 2015-11-30 22:30 UTC (permalink / raw)
  To: O'Driscoll, Tim, dev


Hi,

CAT And CDP technologies look very intriguing.... Could you elaborate a little on those?

-HK
________________________________________
From: dev <dev-bounces@dpdk.org> on behalf of O'Driscoll, Tim <tim.odriscoll@intel.com>
Sent: Monday, November 30, 2015 9:50:58 PM
To: dev@dpdk.org
Subject: [dpdk-dev] 2.3 Roadmap

As we're nearing the completion of the 2.2 release, I'd like to start a discussion on plans for 2.3. To kick this off, below are the features that we're hoping to submit for this release.

[...]


* Re: [dpdk-dev] 2.3 Roadmap
  2015-11-30 20:50 [dpdk-dev] 2.3 Roadmap O'Driscoll, Tim
                   ` (2 preceding siblings ...)
  2015-11-30 22:30 ` Hobywan Kenoby
@ 2015-11-30 22:53 ` Kyle Larose
  2015-12-01  1:16   ` Stephen Hemminger
  2015-12-01 12:59 ` Matthew Hall
  4 siblings, 1 reply; 33+ messages in thread
From: Kyle Larose @ 2015-11-30 22:53 UTC (permalink / raw)
  To: O'Driscoll, Tim, dev

Hi Tim,

On Mon, Nov 30, 2015 at 3:50 PM, O'Driscoll, Tim <tim.odriscoll@intel.com> wrote:

> Tcpdump Support: Support for tcpdump will be added to DPDK. This will improve usability and debugging of DPDK applications.

I'm curious about the proposed tcpdump support. Is there a concrete plan for this, or is that still being looked into? Sandvine is interested in contributing to this effort. Anything we can do to help?

Thanks,

Kyle 


* Re: [dpdk-dev] 2.3 Roadmap
  2015-11-30 22:53 ` Kyle Larose
@ 2015-12-01  1:16   ` Stephen Hemminger
  2015-12-01 10:03     ` Bruce Richardson
  0 siblings, 1 reply; 33+ messages in thread
From: Stephen Hemminger @ 2015-12-01  1:16 UTC (permalink / raw)
  To: Kyle Larose; +Cc: dev

On Mon, 30 Nov 2015 22:53:50 +0000
Kyle Larose <klarose@sandvine.com> wrote:

> Hi Tim,
> 
> On Mon, Nov 30, 2015 at 3:50 PM, O'Driscoll, Tim <tim.odriscoll@intel.com> wrote:
> 
> > Tcpdump Support: Support for tcpdump will be added to DPDK. This will improve usability and debugging of DPDK applications.
> 
> I'm curious about the proposed tcpdump support. Is there a concrete plan for this, or is that still being looked into? Sandvine is interested in contributing to this effort. Anything we can do to help?
> 
> Thanks,
> 
> Kyle 

We discussed at OVSCon doing a simple example of how to have a thread use named-pipe
support (already in tcpdump and wireshark). More complex solutions require changes to
libpcap and application interaction.


* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01  1:16   ` Stephen Hemminger
@ 2015-12-01 10:03     ` Bruce Richardson
  2015-12-01 11:26       ` Yoshinobu Inoue
  2015-12-01 14:27       ` Panu Matilainen
  0 siblings, 2 replies; 33+ messages in thread
From: Bruce Richardson @ 2015-12-01 10:03 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

On Mon, Nov 30, 2015 at 05:16:55PM -0800, Stephen Hemminger wrote:
> On Mon, 30 Nov 2015 22:53:50 +0000
> Kyle Larose <klarose@sandvine.com> wrote:
> 
> > Hi Tim,
> > 
> > On Mon, Nov 30, 2015 at 3:50 PM, O'Driscoll, Tim <tim.odriscoll@intel.com> wrote:
> > 
> > > Tcpdump Support: Support for tcpdump will be added to DPDK. This will improve usability and debugging of DPDK applications.
> > 
> > I'm curious about the proposed tcpdump support. Is there a concrete plan for this, or is that still being looked into? Sandvine is interested in contributing to this effort. Anything we can do to help?
> > 
> > Thanks,
> > 
> > Kyle 
> 
> We discussed an Ovscon doing a simple example of how to have a thread use named pipe
> support (already in tcpdump and wireshark). More complex solutions require changes to
> libpcap and application interaction.

Our current thinking is to use kni to mirror packets into the kernel itself,
so that all standard linux capture tools can then be used.

/Bruce


* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 10:03     ` Bruce Richardson
@ 2015-12-01 11:26       ` Yoshinobu Inoue
  2015-12-01 11:58         ` Bruce Richardson
  2015-12-01 14:27       ` Panu Matilainen
  1 sibling, 1 reply; 33+ messages in thread
From: Yoshinobu Inoue @ 2015-12-01 11:26 UTC (permalink / raw)
  To: bruce.richardson; +Cc: dev

Hello DPDK list,

So far I've mostly just been skimming the list, as I've been working on my
company's product based on DPDK 1.6 for some time and haven't yet fully
caught up with newer releases. But since I implemented packet capture for
my product in just the way described here (using KNI),

> Our current thinking is to use kni to mirror packets into the kernel itself,
> so that all standard linux capture tools can then be used.
> 
> /Bruce

I'd like to offer some comments from my own experience:

 - In our case, KNI is always enabled for each DPDK port, as it seemed
   handy that packet rx/tx stats and up/down status can be checked with
   ifconfig as well. Interface MIBs also become available via
   /sys/devices/pci.../net.

   As far as we could tell, just creating the KNI device itself didn't
   noticeably affect dataplane performance.
   (There is just a little overhead when we update statistics.)

 - I inserted a rte_kni_tx_burst() call into the packet RX path (after
   rte_eth_rx_burst) and the TX path (after rte_eth_tx_burst, on the
   assumption that a just-sent packet is not yet freed; it is only freed
   when its tx descriptor is overwritten by a later tx packet).

   The call to rte_kni_tx_burst() is enabled/disabled by an external
   capture enable/disable command.

   I copy the packet beforehand and pass the copy to rte_kni_tx_burst().
   In the TX path we might avoid the copy by just incrementing the packet
   refcnt by 1, but I haven't tried that hack much.

   The packets sent to rte_kni_tx_burst() can then be captured with normal
   libpcap tools like tcpdump on the corresponding KNI device.

   The performance loss with capture enabled was roughly 20-30%.
   (This will likely vary with many factors.)

   A side benefit of this approach is that we can enable tx-only or
   rx-only capture.


 - Some considerations:

   - Some people won't want a capture on/off check on every pass through
     the normal fast-path tx/rx route. Neither did I, so I created a
     fast-path send routine and a slow-path send routine and switched a
     function pointer when capture is enabled/disabled. I'm not sure it
     was worth the effort, though.

   - With this approach, everyone needs to create their own capture
     enable/disable command in their implementation, which can be a bit
     of a bother.

     I'm not sure whether it's possible, but it would be great if, as with
     normal tcpdump, invoking tcpdump on a KNI interface could somehow be
     signalled to the corresponding DPDK port application, so that the
     call to rte_kni_tx_burst() could be enabled/disabled automatically.

   
Thanks,
Yoshinobu Inoue



* Re: [dpdk-dev] 2.3 Roadmap
  2015-11-30 22:30 ` Hobywan Kenoby
@ 2015-12-01 11:52   ` O'Driscoll, Tim
  0 siblings, 0 replies; 33+ messages in thread
From: O'Driscoll, Tim @ 2015-12-01 11:52 UTC (permalink / raw)
  To: Hobywan Kenoby, dev


> -----Original Message-----
> From: Hobywan Kenoby [mailto:hobywank@hotmail.com]
> Sent: Monday, November 30, 2015 10:30 PM
> To: O'Driscoll, Tim; dev@dpdk.org
> Subject: Re: 2.3 Roadmap
> 
> 
> Hi,
> 
> CAT And CDP technologies look very intriguing.... Could you elaborate a
> little on those?

We're working on a white paper which should be available soon. In the meantime, there's more information on these technologies at:
https://www-ssl.intel.com/content/www/us/en/communications/cache-monitoring-cache-allocation-technologies.html
https://01.org/packet-processing/cache-monitoring-technology-memory-bandwidth-monitoring-cache-allocation-technology-code-and-data


Tim

> 
> -HK
> [...]


* Re: [dpdk-dev] 2.3 Roadmap
  2015-11-30 22:19 ` Dave Neary
@ 2015-12-01 11:57   ` O'Driscoll, Tim
  0 siblings, 0 replies; 33+ messages in thread
From: O'Driscoll, Tim @ 2015-12-01 11:57 UTC (permalink / raw)
  To: Dave Neary, dev


> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Dave Neary
> Sent: Monday, November 30, 2015 10:19 PM
> To: O'Driscoll, Tim; dev@dpdk.org
> Subject: Re: [dpdk-dev] 2.3 Roadmap
> 
> Hi Tim,
> 
> Just curious about one item on the list:
> 
> On 11/30/2015 03:50 PM, O'Driscoll, Tim wrote:
> > IPsec Sample Application: A sample application will be created which
> will show how DPDK and the new cryptodev API can be used to implement
> IPsec. Use of the cryptodev API will allow either hardware or software
> encryption to be used. IKE will not be implemented so the SA/SP DBs will
> be statically configured.
> 
> Do you anticipate this application living in the dpdk repo, or in a
> separate tree?

Good question. As a sample application showing how DPDK can be used to implement IPsec, I believe it belongs within the DPDK repo.

When we have an Architecture Board in place, I think one of the first tasks for the board should be to clarify the scope of DPDK. Venky agreed at the Userspace event to draft an initial statement on this. If the outcome of that is that the IPsec sample app doesn't belong within DPDK, then we can put it into a separate tree.


Tim
> 
> Thanks,
> Dave.
> 
> --
> Dave Neary - NFV/SDN Community Strategy
> Open Source and Standards, Red Hat - http://community.redhat.com
> Ph: +1-978-399-2182 / Cell: +1-978-799-3338


* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 11:26       ` Yoshinobu Inoue
@ 2015-12-01 11:58         ` Bruce Richardson
  2015-12-01 13:42           ` Matthew Hall
  2015-12-02  0:53           ` Yoshinobu Inoue
  0 siblings, 2 replies; 33+ messages in thread
From: Bruce Richardson @ 2015-12-01 11:58 UTC (permalink / raw)
  To: Yoshinobu Inoue; +Cc: dev

On Tue, Dec 01, 2015 at 08:26:39PM +0900, Yoshinobu Inoue wrote:
> [...]
Hi,

that is indeed very similar to what we are thinking ourselves. Is there any of
what you have already done that you could contribute publicly to save us
duplicating some of your effort? [The one big difference is that we are not
thinking of enabling KNI permanently for each port, as the ethtool support is
only present for a couple of NIC types, and solving that is a separate issue. :-)]

/Bruce

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] 2.3 Roadmap
  2015-11-30 20:50 [dpdk-dev] 2.3 Roadmap O'Driscoll, Tim
                   ` (3 preceding siblings ...)
  2015-11-30 22:53 ` Kyle Larose
@ 2015-12-01 12:59 ` Matthew Hall
  2015-12-01 13:16   ` O'Driscoll, Tim
  4 siblings, 1 reply; 33+ messages in thread
From: Matthew Hall @ 2015-12-01 12:59 UTC (permalink / raw)
  To: O'Driscoll, Tim; +Cc: dev

On Mon, Nov 30, 2015 at 08:50:58PM +0000, O'Driscoll, Tim wrote:
> Increase Next Hops for LPM (IPv4): The number of next hops for IPv4 LPM is 
> currently limited to 256. This will be extended to allow a greater number of 
> next hops.

In other threads, we previously proposed doing increased LPM4 *and* LPM6.

Having incompatible support between both is a huge headache for me.

And I already contributed patches to fix the issue in both versions.

Thanks,
Matthew

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 12:59 ` Matthew Hall
@ 2015-12-01 13:16   ` O'Driscoll, Tim
  2015-12-01 13:44     ` Matthew Hall
  0 siblings, 1 reply; 33+ messages in thread
From: O'Driscoll, Tim @ 2015-12-01 13:16 UTC (permalink / raw)
  To: Matthew Hall; +Cc: dev


> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matthew Hall
> Sent: Tuesday, December 1, 2015 1:00 PM
> To: O'Driscoll, Tim
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] 2.3 Roadmap
> 
> On Mon, Nov 30, 2015 at 08:50:58PM +0000, O'Driscoll, Tim wrote:
> > Increase Next Hops for LPM (IPv4): The number of next hops for IPv4
> LPM is
> > currently limited to 256. This will be extended to allow a greater
> number of
> > next hops.
> 
> In other threads, we previously proposed doing increased LPM4 *and*
> LPM6.
> 
> Having incompatible support between both is a huge headache for me.
> 
> And I already contributed patches to fix the issue in both versions.

True. The goal is to merge the best of the various patches that were submitted on this. This could involve changes to IPv6 as well as IPv4.


Tim

> 
> Thanks,
> Matthew

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 11:58         ` Bruce Richardson
@ 2015-12-01 13:42           ` Matthew Hall
  2015-12-01 14:45             ` Kyle Larose
  2015-12-02  0:53           ` Yoshinobu Inoue
  1 sibling, 1 reply; 33+ messages in thread
From: Matthew Hall @ 2015-12-01 13:42 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Tue, Dec 01, 2015 at 11:58:16AM +0000, Bruce Richardson wrote:
> Hi,
> 
> that is indeed very similar to what we are thinking ourselves. Is there any of
> what you have already done that you could contribute publically to save us
> duplicating some of your effort? [The one big difference, is that we are not
> thinking of enabling kni permanently for each port, as the ethtool support is
> only present for a couple of NIC types, and solving that is a separate issue.:-)]
> 
> /Bruce

Personally I was looking at something a bit different, because I wanted the 
ability to support lightning-fast BPF expressions for security purposes, not 
just debugging captures.

As a result of participating here, I got hold of a copy of the bpfjit 
implementation from Alexander Nasonov, who wrote it for the BSD kernel, with 
some tweaks to support compiling in userspace mode on Linux and BSD.

I am planning to use this to do the captures so you don't incur the headaches 
or performance issues of rte_kni.

I am curious how I might be able to link it up with the standard libpcap-based 
tools to get an end-to-end solution with minimal loss.

Matthew.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 13:16   ` O'Driscoll, Tim
@ 2015-12-01 13:44     ` Matthew Hall
  2015-12-01 13:57       ` Bruce Richardson
  0 siblings, 1 reply; 33+ messages in thread
From: Matthew Hall @ 2015-12-01 13:44 UTC (permalink / raw)
  To: O'Driscoll, Tim; +Cc: dev

On Tue, Dec 01, 2015 at 01:16:47PM +0000, O'Driscoll, Tim wrote:
> True. The goal is to merge the best of the various patches that were 
> submitted on this. This could involve changes to IPv6 as well as IPv4.
> 
> 
> Tim

If it's possible to fix IPv6 as well this would be good for me. Offering a 
large nexthop space on all protocols is very good for BGP / core routing and 
security inspection applications. Using this feature, I will be able to detect 
interactions with bad subnets and IPs at tremendous speed compared to 
competing solutions.

Matthew.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 13:44     ` Matthew Hall
@ 2015-12-01 13:57       ` Bruce Richardson
  2015-12-01 19:49         ` Matthew Hall
  0 siblings, 1 reply; 33+ messages in thread
From: Bruce Richardson @ 2015-12-01 13:57 UTC (permalink / raw)
  To: Matthew Hall; +Cc: dev

On Tue, Dec 01, 2015 at 08:44:57AM -0500, Matthew Hall wrote:
> On Tue, Dec 01, 2015 at 01:16:47PM +0000, O'Driscoll, Tim wrote:
> > True. The goal is to merge the best of the various patches that were 
> > submitted on this. This could involve changes to IPv6 as well as IPv4.
> > 
> > 
> > Tim
> 
> If it's possible to fix IPv6 as well this would be good for me. Offering a 
> large nexthop space on all protocols is very good for BGP / core routing and 
> security inspection applications. Using this feature, I will be able to detect 
> interactions with bad subnets and IPs at tremendous speed compared to 
> competing solutions.
> 
> Matthew.

Hi Matthew,

Couple of follow-up questions on this:
* do you need the exact same number of bits in both implementations? If we support
21 bits of data in IPv6 and 24 in IPv4, is that an issue compared to supporting
just 21 bits in both for compatibility?
* related to this - how much data are you looking to store in the tables?


Thanks,
/Bruce

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 10:03     ` Bruce Richardson
  2015-12-01 11:26       ` Yoshinobu Inoue
@ 2015-12-01 14:27       ` Panu Matilainen
  2015-12-01 14:48         ` Vincent JARDIN
  1 sibling, 1 reply; 33+ messages in thread
From: Panu Matilainen @ 2015-12-01 14:27 UTC (permalink / raw)
  To: Bruce Richardson, Stephen Hemminger; +Cc: dev

On 12/01/2015 12:03 PM, Bruce Richardson wrote:
> On Mon, Nov 30, 2015 at 05:16:55PM -0800, Stephen Hemminger wrote:
>> On Mon, 30 Nov 2015 22:53:50 +0000
>> Kyle Larose <klarose@sandvine.com> wrote:
>>
>>> Hi Tim,
>>>
>>> On Mon, Nov 30, 2015 at 3:50 PM, O'Driscoll, Tim <tim.odriscoll@intel.com> wrote:
>>>
>>>> Tcpdump Support: Support for tcpdump will be added to DPDK. This will improve usability and debugging of DPDK applications.
>>>
>>> I'm curious about the proposed tcpdump support. Is there a concrete plan for this, or is that still being looked into? Sandvine is interested in contributing to this effort. Anything we can do to help?
>>>
>>> Thanks,
>>>
>>> Kyle
>>
>> We discussed an Ovscon doing a simple example of how to have a thread use named pipe
>> support (already in tcpdump and wireshark). More complex solutions require changes to
>> libpcap and application interaction.
>
> Our current thinking is to use kni to mirror packets into the kernel itself,
> so that all standard linux capture tools can then be used.

The problem with that (unless I'm missing something here) is that KNI 
requires using out-of-tree kernel modules which makes it pretty much a 
non-option for distros.

	- Panu -

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 13:42           ` Matthew Hall
@ 2015-12-01 14:45             ` Kyle Larose
  2015-12-01 19:28               ` Matthew Hall
  0 siblings, 1 reply; 33+ messages in thread
From: Kyle Larose @ 2015-12-01 14:45 UTC (permalink / raw)
  To: Matthew Hall; +Cc: dev

On Tue, Dec 1, 2015 at 8:42 AM, Matthew Hall <mhall@mhcomputing.net> wrote:


> I am planning to use this to do the captures so you don't incur the headache
> or performance issues with rte_kni.
>
> I am curious how I might be able to link it up w/ the standard libpcap based
> tools to get an end-to-end solution with minimal loss

Earlier Stephen mentioned using the named pipe behaviour of tcpdump.
Is there an opportunity to take what you have mentioned here and marry
it to the named pipe output to get the perf you need?

Personally I have written a tool which sends packets out of a ring PMD
in my main application. The associated rings are polled by another
application which dumps the packets to a pcap file. This has fairly
good performance, though my implementation is quite crude. My point
here is that sending to a named pipe may not be a bad idea. I don't
know how the perf would compare, but it can't be that bad, can it? :P
Things might get tricky though if we need multiple queues to handle
the rate of traffic being captured. Not sure how tcpdump would like
that.

I've also considered using cuse to expose a cdev that tcpdump could
read from, but I'm not sure that tcpdump has that ability yet.


>
> Matthew.

Thanks,

Kyle

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 14:27       ` Panu Matilainen
@ 2015-12-01 14:48         ` Vincent JARDIN
  2015-12-01 14:58           ` Panu Matilainen
  2015-12-02 11:24           ` Neil Horman
  0 siblings, 2 replies; 33+ messages in thread
From: Vincent JARDIN @ 2015-12-01 14:48 UTC (permalink / raw)
  To: Panu Matilainen, Bruce Richardson, Stephen Hemminger; +Cc: dev

On 01/12/2015 15:27, Panu Matilainen wrote:
> The problem with that (unless I'm missing something here) is that KNI
> requires using out-of-tree kernel modules which makes it pretty much a
> non-option for distros.

It works fine with some distros. I do not think it should be an argument.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 14:48         ` Vincent JARDIN
@ 2015-12-01 14:58           ` Panu Matilainen
  2015-12-01 15:16             ` Christian Ehrhardt
  2015-12-01 15:19             ` Bruce Richardson
  2015-12-02 11:24           ` Neil Horman
  1 sibling, 2 replies; 33+ messages in thread
From: Panu Matilainen @ 2015-12-01 14:58 UTC (permalink / raw)
  To: Vincent JARDIN, Bruce Richardson, Stephen Hemminger; +Cc: dev

On 12/01/2015 04:48 PM, Vincent JARDIN wrote:
> On 01/12/2015 15:27, Panu Matilainen wrote:
>> The problem with that (unless I'm missing something here) is that KNI
>> requires using out-of-tree kernel modules which makes it pretty much a
>> non-option for distros.
>
> It works fine with some distros. I do not think it should be an argument.

It's not a question of *working*; it's that out-of-tree kernel modules are 
considered unsupportable by the kernel people. So relying on KNI would 
make the otherwise important and desirable tcpdump feature non-existent 
on at least Fedora and RHEL, where such modules are practically outright 
banned by distro policies.

	- Panu -

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 14:58           ` Panu Matilainen
@ 2015-12-01 15:16             ` Christian Ehrhardt
  2015-12-01 15:19             ` Bruce Richardson
  1 sibling, 0 replies; 33+ messages in thread
From: Christian Ehrhardt @ 2015-12-01 15:16 UTC (permalink / raw)
  To: Panu Matilainen; +Cc: dev

On Tue, Dec 1, 2015 at 3:58 PM, Panu Matilainen <pmatilai@redhat.com> wrote:
> On 12/01/2015 04:48 PM, Vincent JARDIN wrote:
>>
>> On 01/12/2015 15:27, Panu Matilainen wrote:
>>>
>>> The problem with that (unless I'm missing something here) is that KNI
>>> requires using out-of-tree kernel modules which makes it pretty much a
>>> non-option for distros.
>>
>>
>> It works fine with some distros. I do not think it should be an argument.
>
>
> Its not a question of *working*, its that out-of-tree kernel modules are
> considered unsupportable by the kernel people. So relying on KNI would make
> the otherwise important and desirable tcpdump feature non-existent on at
> least Fedora and RHEL where such modules are practically outright banned by
> distro policies.
>
>         - Panu -

+1 to that argument from an Ubuntu Point-of-View

Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 14:58           ` Panu Matilainen
  2015-12-01 15:16             ` Christian Ehrhardt
@ 2015-12-01 15:19             ` Bruce Richardson
  2015-12-01 15:31               ` Aaron Conole
  1 sibling, 1 reply; 33+ messages in thread
From: Bruce Richardson @ 2015-12-01 15:19 UTC (permalink / raw)
  To: Panu Matilainen; +Cc: dev

On Tue, Dec 01, 2015 at 04:58:08PM +0200, Panu Matilainen wrote:
> On 12/01/2015 04:48 PM, Vincent JARDIN wrote:
> >On 01/12/2015 15:27, Panu Matilainen wrote:
> >>The problem with that (unless I'm missing something here) is that KNI
> >>requires using out-of-tree kernel modules which makes it pretty much a
> >>non-option for distros.
> >
> >It works fine with some distros. I do not think it should be an argument.
> 
> Its not a question of *working*, its that out-of-tree kernel modules are
> considered unsupportable by the kernel people. So relying on KNI would make
> the otherwise important and desireable tcpdump feature non-existent on at
> least Fedora and RHEL where such modules are practically outright banned by
> distro policies.
> 
> 	- Panu -

Yes, KNI is a bit of a problem right now in that way.

How about a solution which is just based around the idea of setting up a generic
port mirroring callback? Hopefully in the future we can get KNI exposed as a PMD;
we already have a ring PMD, and we could possibly do a generic file/fifo PMD.
Between the three, we could then have multiple options for intercepting traffic
going in/out of an app. The callback would just have to copy the traffic to the
selected interface before returning it to the app as normal?

/Bruce

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 15:19             ` Bruce Richardson
@ 2015-12-01 15:31               ` Aaron Conole
  2015-12-01 15:54                 ` Richardson, Bruce
  2015-12-01 19:32                 ` Matthew Hall
  0 siblings, 2 replies; 33+ messages in thread
From: Aaron Conole @ 2015-12-01 15:31 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

Bruce Richardson <bruce.richardson@intel.com> writes:
> On Tue, Dec 01, 2015 at 04:58:08PM +0200, Panu Matilainen wrote:
>> On 12/01/2015 04:48 PM, Vincent JARDIN wrote:
>> >On 01/12/2015 15:27, Panu Matilainen wrote:
>> >>The problem with that (unless I'm missing something here) is that KNI
>> >>requires using out-of-tree kernel modules which makes it pretty much a
>> >>non-option for distros.
>> >
>> >It works fine with some distros. I do not think it should be an argument.
>> 
>> Its not a question of *working*, its that out-of-tree kernel modules are
>> considered unsupportable by the kernel people. So relying on KNI would make
>> the otherwise important and desireable tcpdump feature non-existent on at
>> least Fedora and RHEL where such modules are practically outright banned by
>> distro policies.
>> 
>> 	- Panu -
>
> Yes, KNI is a bit of a problem right now in that way.
>
> How about a solution which is just based around the idea of setting up a generic
> port mirroring callback? Hopefully in the future we can get KNI
> exposed as a PMD,
> and we already have a ring PMD, and could possibly do a generic file/fifo PMD.
> Between the 3, we could then have multiple options for intercepting traffic
> going in/out of an app. The callback would just have to copy the traffic to the
> selected interface before returning it to the app as normal?
>
> /Bruce

I'm actually working on a patch series that uses a TAP device (it has so far
been only lightly tested) called back from the port input. The benefit
is no dependency on kernel modules (just TUN/TAP support). I don't have
a way of signalling sampling, so right now it's just drinking from the
firehose. It's nothing I'm ready to put out publicly (because it's ugly -
just a PoC), but it allows a few things:

1) on demand on/off using standard linux tools (ifconfig/ip to set tap
   device up/down)
2) Can work with any tool which reads off of standard linux interfaces
   (tcpdump/wireshark work out of the box, but you could plug in any
   pcap or non-pcap tool)
3) Doesn't require changes to the application (no command line switches
   during startup, etc.)

As I said, I'm not ready to put it out there publicly, because I haven't
had a chance to check the performance, and it's definitely not following
any kind of DPDK-like coding style. Just wanted to throw this out as
food for thought - if you think this approach is worthwhile I can try to
prioritize it, at least to get an RFC series out.

-Aaron

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 15:31               ` Aaron Conole
@ 2015-12-01 15:54                 ` Richardson, Bruce
  2015-12-02  1:38                   ` Wiles, Keith
  2015-12-01 19:32                 ` Matthew Hall
  1 sibling, 1 reply; 33+ messages in thread
From: Richardson, Bruce @ 2015-12-01 15:54 UTC (permalink / raw)
  To: Aaron Conole; +Cc: dev



> -----Original Message-----
> From: Aaron Conole [mailto:aconole@redhat.com]
> Sent: Tuesday, December 1, 2015 3:31 PM
> To: Richardson, Bruce <bruce.richardson@intel.com>
> Cc: Panu Matilainen <pmatilai@redhat.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] 2.3 Roadmap
> 
> Bruce Richardson <bruce.richardson@intel.com> writes:
> > On Tue, Dec 01, 2015 at 04:58:08PM +0200, Panu Matilainen wrote:
> >> On 12/01/2015 04:48 PM, Vincent JARDIN wrote:
> >> >On 01/12/2015 15:27, Panu Matilainen wrote:
> >> >>The problem with that (unless I'm missing something here) is that
> >> >>KNI requires using out-of-tree kernel modules which makes it pretty
> >> >>much a non-option for distros.
> >> >
> >> >It works fine with some distros. I do not think it should be an
> argument.
> >>
> >> Its not a question of *working*, its that out-of-tree kernel modules
> >> are considered unsupportable by the kernel people. So relying on KNI
> >> would make the otherwise important and desireable tcpdump feature
> >> non-existent on at least Fedora and RHEL where such modules are
> >> practically outright banned by distro policies.
> >>
> >> 	- Panu -
> >
> > Yes, KNI is a bit of a problem right now in that way.
> >
> > How about a solution which is just based around the idea of setting up
> > a generic port mirroring callback? Hopefully in the future we can get
> > KNI exposed as a PMD, and we already have a ring PMD, and could
> > possibly do a generic file/fifo PMD.
> > Between the 3, we could then have multiple options for intercepting
> > traffic going in/out of an app. The callback would just have to copy
> > the traffic to the selected interface before returning it to the app as
> normal?
> >
> > /Bruce
> 
> I'm actually working on a patch series that uses a TAP device (it's
> currently been only minorly tested) called back from the port input. The
> benefit is no dependancy on kernel modules (just TUN/TAP support). I don't
> have a way of signaling sampling, so right now, it's just drinking from
> the firehose. Nothing I'm ready to put out publicly (because it's ugly -
> just a PoC), but it allows a few things:
> 
> 1) on demand on/off using standard linux tools (ifconfig/ip to set tap
>    device up/down)
> 2) Can work with any tool which reads off of standard linux interfaces
>    (tcpdump/wireshark work out of the box, but you could plug in any
>    pcap or non-pcap tool)
> 3) Doesn't require changes to the application (no command line switches
>    during startup, etc.)
> 
> As I said, I'm not ready to put it out there publicly, because I haven't
> had a chance to check the performance, and it's definitely not following
> any kind of DPDK-like coding style. Just wanted to throw this out as food
> for thought - if you think this approach is worthwhile I can try to
> prioritize it, at least to get an RFC series out.
> 
> -Aaron

Once I had a generic file-handling PMD written, I was then considering extending
it to work with TUN/TAP too. :-)
I think a TAP PMD would be useful for the downstream distros who can't package
KNI as it is right now.

/Bruce

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 14:45             ` Kyle Larose
@ 2015-12-01 19:28               ` Matthew Hall
  0 siblings, 0 replies; 33+ messages in thread
From: Matthew Hall @ 2015-12-01 19:28 UTC (permalink / raw)
  To: Kyle Larose; +Cc: dev

On Tue, Dec 01, 2015 at 09:45:56AM -0500, Kyle Larose wrote:
> Earlier Stephen mentioned using the named pipe behaviour of tcpdump.
> Is there an opportunity to take what you have mentioned here and marry
> it to the named pipe output to get the perf you need?

I am wondering about the same thing. But I didn't want to limit the scope of 
solutions too much so I didn't specifically enumerate this possibility.

Matthew.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 15:31               ` Aaron Conole
  2015-12-01 15:54                 ` Richardson, Bruce
@ 2015-12-01 19:32                 ` Matthew Hall
  1 sibling, 0 replies; 33+ messages in thread
From: Matthew Hall @ 2015-12-01 19:32 UTC (permalink / raw)
  To: Aaron Conole; +Cc: dev

On Tue, Dec 01, 2015 at 10:31:02AM -0500, Aaron Conole wrote:
> The benefit is no dependancy on kernel modules (just TUN/TAP support). I 
> don't have a way of signaling sampling, so right now, it's just drinking 
> from the firehose.

This is actually quite a good idea. Many years ago I coded up a simple 
connector between DPDK and TAP devices for use with some legacy applications 
that did not support DPDK.

I could definitely connect the output of user-space bpfjit to a TAP device 
quite easily.

I am somewhat less clear on how to connect tcpdump or other standard libpcap 
based entities up, so that one could change the capture filters or other 
settings from outside the DPDK application. I am hoping some of the network 
API experts can comment on this since I'm just a security specialist.

How are you letting people configure the capture filter in this scenario?

Matthew.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 13:57       ` Bruce Richardson
@ 2015-12-01 19:49         ` Matthew Hall
  2015-12-02 12:35           ` Bruce Richardson
  0 siblings, 1 reply; 33+ messages in thread
From: Matthew Hall @ 2015-12-01 19:49 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Tue, Dec 01, 2015 at 01:57:39PM +0000, Bruce Richardson wrote:
> Hi Matthew,
> 
> Couple of follow-up questions on this:
> * do you need the exact same number of bits in both implementations? If we support
> 21 bits of data in IPv6 and 24 in IPv4 is that an issue compared to supporting
> 21 bits just in both for compatibility.
> * related to this - how much data are you looking to store in the tables?
> 
> Thanks,
> /Bruce

Let me provide some more detailed high level examples of some security use 
cases so we could consider what makes sense.

1) Spamhaus provides a list of approximately 800 CIDR blocks which are so 
bad that they recommend null-routing them as widely as possible:

https://www.spamhaus.org/drop/
https://www.spamhaus.org/drop/drop.txt
https://www.spamhaus.org/drop/edrop.txt

In the old implementation I couldn't even fit all of those, and doing 
something like this seems to be a must-have feature for security.

2) Team Cymru provides lists of Bogons for IPv4 and IPv6. In IPv4, there are 
3600 bogon CIDR blocks because many things are in-use. But the IPv6 table has 
65000 CIDR blocks, because it is larger, newer, and more sparse.

http://www.team-cymru.org/Services/Bogons/fullbogons-ipv4.txt
http://www.team-cymru.org/Services/Bogons/fullbogons-ipv6.txt

Being able to monitor these would be another must-have for security and is 
quite popular for core routing from what I have heard.

3) At any given time, through various methods, I am aware of around 350,000 to 
2.5 million recent bad IP addresses. Technically single entries could be 
matched using rte_hash. But it is quite common in the security world, to look 
at the number of bad IPs in a class C, and then flag the entire subnet as 
suspect if more than a few bad IPs are present there.

Some support for some level of this is a must-have for security and firewall 
use cases.

4) Of course, it goes without saying that fitting the contents of the entire 
Internet BGP prefix list for IPv4 and IPv6 is a must-have for core routing 
although less needed for security. I am not an expert in this. Some very basic 
statistics I located with a quick search suggest one needs about 600,000 
prefixes (presumably for IPv4). It would help if some router experts could 
clarify it and help me know what the story is for IPv6.

http://www.cidr-report.org/as2.0/#General_Status

5) Considering all of the above, it seems like 22 or 23 unsigned lookup bits 
are required (4194304 or 8388608 entries) if you want comprehensive bad IP 
detection. And probably 21 unsigned bits for basic security support. But that 
would not necessarily leave a whole lot of headroom depending on the details.

Matthew.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 11:58         ` Bruce Richardson
  2015-12-01 13:42           ` Matthew Hall
@ 2015-12-02  0:53           ` Yoshinobu Inoue
  1 sibling, 0 replies; 33+ messages in thread
From: Yoshinobu Inoue @ 2015-12-02  0:53 UTC (permalink / raw)
  To: bruce.richardson; +Cc: dev

Hello Bruce,

> Hi,
> 
> that is indeed very similar to what we are thinking ourselves. Is there any of
> what you have already done that you could contribute publically to save us
> duplicating some of your effort? [The one big difference, is that we are not
> thinking of enabling kni permanently for each port, as the ethtool support is
> only present for a couple of NIC types, and solving that is a separate issue.:-)]
> 
> /Bruce

It seems there has been some progress in the thread, but if there is still
anything worthwhile to do, yes, I think I can.
My DPDK 1.6 work has finished, and my boss suggested that I try to spend more
time on contributing something from now on.
I'm trying to get used to DPDK 2.1 (skipping 1.7 and 1.8) and will then try 2.2
and the current tree in the near future.


From a rough look at the later progress in the thread, being able to capture the
packet flow anywhere in the code path, and to filter it using BPF, seems quite
nice and desirable.

I have another comment from a more operational point of view. I once implemented
a packet capture API which copies packets to a shared-memory FIFO queue, from
which a modified outer libpcap library reads the packets out, thus enabling
capture anywhere. But it ended up that I didn't use the functionality very much,
as it requires hard-coding a call to the capture API at each place where capture
is wanted, and that was somewhat bothersome.
There was another simple packet trace mechanism which does tracing at several
pre-defined points, and it was OK for many cases.

I think that when introducing this kind of anywhere-capture, it will be quite
easy for a normal user to use if each capture point is pre-defined, such as the
input and output points of each packet-processing module and of each internal
FIFO queue.
Some outer tool which displays the internal functional topology would also be
desirable (plain ASCII art would be OK), e.g. showing each capture point by ID
number, showing whether capturing at each of them is enabled or not, and letting
us enable/disable each capture point easily by specifying its ID number.

If there were modified libpcap tools for this mechanism, that would be even more
helpful; for example, they could specify a capture point ID number and dump the
packets from that point, and could also set and update the BPF filter per
capture point ID number.


Regards,
Yoshinobu Inoue


From: Bruce Richardson <bruce.richardson@intel.com>
Subject: Re: [dpdk-dev] 2.3 Roadmap
Date: Tue, 1 Dec 2015 11:58:16 +0000

> On Tue, Dec 01, 2015 at 08:26:39PM +0900, Yoshinobu Inoue wrote:
>> Hello DPDK list,
>> 
>> I've been so far just roughly reading messages, as I've been working on my
>> company's product based on DPDK1.6 for some time and haven't yet much
>> catched up to newer releases,
>> but as I implemented packet capture for my product just in a way as commented
>> here, (using KNI),
>> 
>> > Our current thinking is to use kni to mirror packets into the kernel itself,
>> > so that all standard linux capture tools can then be used.
>> > 
>> > /Bruce
>> 
>> I felt like giving commets from my humble experiences,,,
>> 
>>  - In our case, KNI is always enabeld for each DPDK port,
>>    as it seemed handy that packet rx/tx stat and up/down status can be checked
>>    by ifconfig as well.
>>    Also, iface MIB become available as well via /sys/devices/pci.../net.
>> 
>>    As far as we checked it, it seemed that just creating KNI itself didn't much
>>    affect Dplane performance.
>>    (Just when we update statistics, there is a bit of overhead.)
>> 
>>  - I inserted rte_kni_tx_burst() call to
>>    packet RX path (after rte_eth_rx_burst) and TX path (after rte_eth_tx_burst,
>>    based on an assumption that just sent out pkt is not yet freed, it will be
>>    freed when the tx descriptor is overwritten by some time later tx packet.).
>> 
>>    The call to rte_kni_tx_burst() is enabled/disabled by some external capture
>>    enable/disable command.
>> 
>>    I copy the packet beforehand and pass the copied one to rte_kni_tx_burst().
>>    In TX path, we might not need to copy if we just increment the packet refcnt
>>    by 1, but haven't yet tried such hack much.
>> 
>>    The packets sent to rte_kni_tx_burst() can then be captured by normal libpcap
>>    tools like tcpdump on the correspondent KNI.
>> 
>>    The performance loss when the capture was enabled was roughly 20-30%.
>>    (Perhaps it might change based on many factors.)
>> 
>>    By the way, in this way, we can enable tx-only or rx-only capture.
>> 
>> 
>>  - Some considerations,
>> 
>>    -  Someone might not like capture on/off check everytime on normal fast path
>>       tx/rx route.
>>       I too, so created fastpath send routine and slowpath send routine,
>>       and switched the function pointer when the capture is enabled/disabled.
>>       But not sure if it was worth for the effort.
>> 
>>    - In this approach, everyone needs to create their own capture enable/disable
>>      command in their implementation, and it could be a bit bothering.
>> 
>>      I myself am not sure if it's possible, but if as in normal tcpdump,
>>      an invocation of tcpdump to a KNI interfce could be somehow notified to
>>      the correspondent DPDK port user application, and then the call to
>>      rte_kni_tx_burst() could be automatically enabled/disabled, that's cool.
>> 
>>    
>> Thanks,
>> Yoshinobu Inoue
>> 
> Hi,
> 
> that is indeed very similar to what we are thinking ourselves. Is there any of
> what you have already done that you could contribute publicly to save us
> duplicating some of your effort? [The one big difference is that we are not
> thinking of enabling kni permanently for each port, as the ethtool support is
> only present for a couple of NIC types, and solving that is a separate issue. :-)]
> 
> /Bruce
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 15:54                 ` Richardson, Bruce
@ 2015-12-02  1:38                   ` Wiles, Keith
  2015-12-02  2:42                     ` Matthew Hall
  0 siblings, 1 reply; 33+ messages in thread
From: Wiles, Keith @ 2015-12-02  1:38 UTC (permalink / raw)
  To: Richardson, Bruce, Aaron Conole; +Cc: dev

On 12/1/15, 10:54 AM, "dev on behalf of Richardson, Bruce" <dev-bounces@dpdk.org on behalf of bruce.richardson@intel.com> wrote:

>
>
>> -----Original Message-----
>> From: Aaron Conole [mailto:aconole@redhat.com]
>> Sent: Tuesday, December 1, 2015 3:31 PM
>> To: Richardson, Bruce <bruce.richardson@intel.com>
>> Cc: Panu Matilainen <pmatilai@redhat.com>; dev@dpdk.org
>> Subject: Re: [dpdk-dev] 2.3 Roadmap
>> 
>> Bruce Richardson <bruce.richardson@intel.com> writes:
>> > On Tue, Dec 01, 2015 at 04:58:08PM +0200, Panu Matilainen wrote:
>> >> On 12/01/2015 04:48 PM, Vincent JARDIN wrote:
>> >> >On 01/12/2015 15:27, Panu Matilainen wrote:
>> >> >>The problem with that (unless I'm missing something here) is that
>> >> >>KNI requires using out-of-tree kernel modules which makes it pretty
>> >> >>much a non-option for distros.
>> >> >
>> >> >It works fine with some distros. I do not think it should be an
>> argument.
>> >>
>> >> It's not a question of *working*, it's that out-of-tree kernel modules
>> >> are considered unsupportable by the kernel people. So relying on KNI
>> >> would make the otherwise important and desirable tcpdump feature
>> >> non-existent on at least Fedora and RHEL, where such modules are
>> >> practically outright banned by distro policies.
>> >>
>> >> 	- Panu -
>> >
>> > Yes, KNI is a bit of a problem right now in that way.
>> >
>> > How about a solution which is just based around the idea of setting up
>> > a generic port mirroring callback? Hopefully in the future we can get
>> > KNI exposed as a PMD, and we already have a ring PMD, and could
>> > possibly do a generic file/fifo PMD.
>> > Between the 3, we could then have multiple options for intercepting
>> > traffic going in/out of an app. The callback would just have to copy
>> > the traffic to the selected interface before returning it to the app as
>> normal?
>> >
>> > /Bruce
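[Editor's note: the generic mirroring callback Bruce suggests can be sketched standalone as below. The registration API is modeled loosely on DPDK's rte_eth_add_rx_callback(); all types and signatures here are simplified stand-ins, and the capture sink is just a counter standing in for a ring/KNI/file PMD.]

```c
#include <stdint.h>
#include <stddef.h>

struct mbuf { uint16_t len; };  /* stand-in for struct rte_mbuf */

typedef uint16_t (*rx_callback_fn)(uint16_t port, struct mbuf **pkts,
                                   uint16_t nb, void *user);

static rx_callback_fn rx_cb;
static void *rx_cb_arg;

/* Stand-in for rte_eth_add_rx_callback(): register a hook on the RX path. */
static void add_rx_callback(rx_callback_fn fn, void *arg)
{
    rx_cb_arg = arg;
    rx_cb = fn;
}

/* What rte_eth_rx_burst() effectively does after the PMD fills pkts:
 * run the registered callback before returning the burst to the app. */
static uint16_t rx_burst(uint16_t port, struct mbuf **pkts, uint16_t nb)
{
    if (rx_cb)
        nb = rx_cb(port, pkts, nb, rx_cb_arg);
    return nb;
}

/* The mirror callback: forward copies to the capture sink, then hand the
 * burst back to the application unchanged. */
static uint16_t mirror_cb(uint16_t port, struct mbuf **pkts, uint16_t nb,
                          void *user)
{
    (void)port;
    (void)pkts;
    unsigned int *mirrored = user;
    *mirrored += nb;  /* real code: clone the mbufs, tx to the capture port */
    return nb;
}
```

The application's fast path never changes: it keeps calling rx_burst(), and capture is enabled or disabled purely by registering or removing the callback.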
>> 
>> I'm actually working on a patch series that uses a TAP device (it's
>> currently only lightly tested), called back from the port input. The
>> benefit is no dependency on kernel modules (just TUN/TAP support). I don't
>> have a way of signaling sampling, so right now it's just drinking from
>> the firehose. Nothing I'm ready to put out publicly (because it's ugly -
>> just a PoC), but it allows a few things:
>> 
>> 1) on demand on/off using standard linux tools (ifconfig/ip to set tap
>>    device up/down)
>> 2) Can work with any tool which reads off of standard linux interfaces
>>    (tcpdump/wireshark work out of the box, but you could plug in any
>>    pcap or non-pcap tool)
>> 3) Doesn't require changes to the application (no command line switches
>>    during startup, etc.)
>> 
>> As I said, I'm not ready to put it out there publicly, because I haven't
>> had a chance to check the performance, and it's definitely not following
>> any kind of DPDK-like coding style. Just wanted to throw this out as food
>> for thought - if you think this approach is worthwhile I can try to
>> prioritize it, at least to get an RFC series out.
>> 
>> -Aaron
>
>Once I had a generic file-handling PMD written, I was then considering extending
>it to work with TUN/TAP too. :-)
>I think a TAP PMD would be useful for the downstream distros who can't package
>KNI as it is right now.

In Pktgen I used a tap interface to wireshark, and that worked very nicely; the only problem is it was slow :-(
Having a tap PMD would be nice, as it would let me remove that code from Pktgen.
>
>/Bruce
>


Regards,
Keith






* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-02  1:38                   ` Wiles, Keith
@ 2015-12-02  2:42                     ` Matthew Hall
  0 siblings, 0 replies; 33+ messages in thread
From: Matthew Hall @ 2015-12-02  2:42 UTC (permalink / raw)
  To: Wiles, Keith; +Cc: dev

On Wed, Dec 02, 2015 at 01:38:07AM +0000, Wiles, Keith wrote:
> In Pktgen I used tap interface to wireshark and that worked very nicely the 
> only problem is it was slow :-(
> 
> Having a tap PMD would be nice to be able to remove that code from Pktgen.

All the approaches we've discussed so far have a serious architectural issue. 
The entire point of BPF / libpcap was to avoid crossing system boundaries 
with an arbitrarily, unsustainably large volume of unfiltered packets, which 
will be tossed out anyway as irrelevant to the particular debugging objective.

In the past it was the kernel -> user boundary.

In our case it is the Data Plane -> Management Plane boundary.

If we don't use something similar to libpcap offline mode (which I am using 
presently in my code) or preferably the user-space bpfjit (which I am working 
up to using eventually), it's going to be more or less impossible for this to 
work properly and not blow stuff up with anything close to wirespeed traffic 
going through the links being monitored. Especially with 10, 40, 100, ad 
nauseam, gigabit links.

With the classic BPF / libpcap, it's absolutely possible to get it to work, 
without causing a big performance problem, or causing a huge packet dump 
meltdown, or any other issues in the process. We need to find a way to achieve 
the same objective in our new environment as well.

One possible option, if some kernel guys could assist with figuring out the 
trick, would be if we could capture the BPF ioctl / syscall / whatever it 
secretly does on the TAP or KNI interface, when it passes the capture filter 
to the kernel, and steal the filter for use in pcap offline or userspace 
bpfjit inside of DPDK. Then supply only the packets meeting the filter back 
onto said interface.

Matthew.


* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 14:48         ` Vincent JARDIN
  2015-12-01 14:58           ` Panu Matilainen
@ 2015-12-02 11:24           ` Neil Horman
  1 sibling, 0 replies; 33+ messages in thread
From: Neil Horman @ 2015-12-02 11:24 UTC (permalink / raw)
  To: Vincent JARDIN; +Cc: dev

On Tue, Dec 01, 2015 at 03:48:54PM +0100, Vincent JARDIN wrote:
> On 01/12/2015 15:27, Panu Matilainen wrote:
> >The problem with that (unless I'm missing something here) is that KNI
> >requires using out-of-tree kernel modules which makes it pretty much a
> >non-option for distros.
> 
> It works fine with some distros. I do not think it should be an argument.
> 
Strictly speaking it _works_ fine with all distros.  The problem is that some
choose not to use out-of-tree modules, and that will limit dpdk's reach, which
is not what you want.

Neil


* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-01 19:49         ` Matthew Hall
@ 2015-12-02 12:35           ` Bruce Richardson
  2015-12-02 15:47             ` Matthew Hall
  0 siblings, 1 reply; 33+ messages in thread
From: Bruce Richardson @ 2015-12-02 12:35 UTC (permalink / raw)
  To: Matthew Hall; +Cc: dev

On Tue, Dec 01, 2015 at 02:49:46PM -0500, Matthew Hall wrote:
> On Tue, Dec 01, 2015 at 01:57:39PM +0000, Bruce Richardson wrote:
> > Hi Matthew,
> > 
> > Couple of follow-up questions on this:
> > * do you need the exact same number of bits in both implementations? If we support
> > 21 bits of data in IPv6 and 24 in IPv4 is that an issue compared to supporting
> > 21 bits just in both for compatibility.
> > * related to this - how much data are you looking to store in the tables?
> > 
> > Thanks,
> > /Bruce
> 
> Let me provide some more detailed high level examples of some security use 
> cases so we could consider what makes sense.
> 
> 1) Spamhaus provides a list of approximately 800 CIDR blocks which are so 
> bad that they recommend null-routing them as widely as possible:
> 
> https://www.spamhaus.org/drop/
> https://www.spamhaus.org/drop/drop.txt
> https://www.spamhaus.org/drop/edrop.txt
> 
> In the old implementation I couldn't even fit all of those, and doing 
> something like this seems to be a must-have feature for security.
> 
> 2) Team Cymru provides lists of Bogons for IPv4 and IPv6. In IPv4, there are 
> 3600 bogon CIDR blocks because many things are in use. But the IPv6 table has 
> 65000 CIDR blocks, because it is larger, newer, and more sparse.
> 
> http://www.team-cymru.org/Services/Bogons/fullbogons-ipv4.txt
> http://www.team-cymru.org/Services/Bogons/fullbogons-ipv6.txt
> 
> Being able to monitor these would be another must-have for security and is 
> quite popular for core routing from what I have heard.
> 
> 3) At any given time, through various methods, I am aware of around 350,000 to 
> 2.5 million recent bad IP addresses. Technically single entries could be 
> matched using rte_hash. But it is quite common in the security world, to look 
> at the number of bad IPs in a class C, and then flag the entire subnet as 
> suspect if more than a few bad IPs are present there.
> 
> Some support for some level of this is a must-have for security and firewall 
> use cases.
> 
> 4) Of course, it goes without saying that fitting the contents of the entire 
> Internet BGP prefix list for IPv4 and IPv6 is a must-have for core routing 
> although less needed for security. I am not an expert in this. Some very basic 
> statistics I located with a quick search suggest one needs about 600,000 
> prefixes (presumably for IPv4). It would help if some router experts could 
> clarify it and help me know what the story is for IPv6.
> 
> http://www.cidr-report.org/as2.0/#General_Status
> 
> 5) Considering all of the above, it seems like 22 or 23 unsigned lookup bits 
> are required (4194304 or 8388608 entries) if you want comprehensive bad IP 
> detection. And probably 21 unsigned bits for basic security support. But that 
> would not necessarily leave a whole lot of headroom depending on the details.
> 
> Matthew.

Hi Matthew,

thanks for the info, but I'm not sure I understand it correctly. It seems to
me that you are mostly referring to the depths/sizes of the tables being used,
rather than to the "data-size" being stored in each entry, which was actually
what I was asking about. Is that correct? If so, it seems that - looking initially
at IPv4 LPM only - you are more looking for an increase in the number of tbl8's
for lookup, rather than necessarily an increase in the 8-bit user data being stored
with each entry. [And assuming similar interest for v6] Am I right in 
thinking this?

Thanks,
/Bruce


* Re: [dpdk-dev] 2.3 Roadmap
  2015-12-02 12:35           ` Bruce Richardson
@ 2015-12-02 15:47             ` Matthew Hall
  0 siblings, 0 replies; 33+ messages in thread
From: Matthew Hall @ 2015-12-02 15:47 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Wed, Dec 02, 2015 at 12:35:16PM +0000, Bruce Richardson wrote:
> Hi Matthew,
> 
> thanks for the info, but I'm not sure I understand it correctly. It seems to
> me that you are mostly referring to the depths/sizes of the tables being used,
> rather than to the "data-size" being stored in each entry, which was actually
> what I was asking about. Is that correct? If so, it seems that - looking initially
> at IPv4 LPM only - you are more looking for an increase in the number of tbl8's
> for lookup, rather than necessarily an increase in the 8-bit user data being stored
> with each entry. [And assuming similar interest for v6] Am I right in 
> thinking this?
> 
> Thanks,
> /Bruce

This question is a result of a different way of looking at things between 
routing / networking and security. I actually need to increase the size of 
user data as I did in my patches.

1. There is an assumption, when LPM is used for routing, that many millions of 
inputs might map to a smaller number of outputs.

2. This assumption is not true in the security ecosystem. If I have several 
million CIDR blocks and bad IPs, I need a separate user data value output for 
each value input.

This is because, every time I have a bad IP, CIDR, Domain, URL, or Email, I 
create a security indicator tracking struct for each one of these. In the IP 
and CIDR case I find the struct using rte_hash (possibly for single IPs) and 
rte_lpm.

For Domain, URL, and Email, rte_hash cannot be used, because it wrongly assumes 
all inputs are of equal length. So I had to use a different hash table.

4. The struct contains things such as a unique 64-bit unsigned integer for 
each separate IP or CIDR triggered, to allow looking up contextual data about 
the threat it represents. These IDs are defined by upstream threat databases, 
so I can't crunch them down to fit inside rte_lpm. The structs also include 
stats on how many times an indicator is seen, what kind of security threat it 
represents, etc., without which you can't do the valuable security enrichment 
needed to respond to any events generated.

5. This means, if I want to support X million security indicators, regardless 
if they are IP, CIDR, Domain, URL, or Email, then I need X million distinct 
user data values to look up all the context that goes with them.

Matthew.
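[Editor's note: the indirection Matthew describes — LPM user data used as an index into a table of per-indicator structs — can be sketched as below. The toy linear longest-prefix match stands in for rte_lpm, whose stored user data was only 8 bits wide at the time (hence the request to widen it); all names and sizes are illustrative.]

```c
#include <stdint.h>
#include <stddef.h>

struct indicator {
    uint64_t threat_id;  /* ID assigned by the upstream threat database */
    uint64_t hits;       /* times this CIDR/IP has been seen */
};

struct route { uint32_t net; uint8_t depth; uint32_t idx; };

/* Longest-prefix match over a small rule list; returns the user-data
 * index into the indicator table, or -1 if nothing matches.  rte_lpm
 * does the same lookup in O(1) via its tbl24/tbl8 structure. */
static int lpm_lookup(const struct route *r, size_t n, uint32_t ip)
{
    int best = -1, best_depth = -1;
    for (size_t i = 0; i < n; i++) {
        uint32_t mask = r[i].depth ? ~0u << (32 - r[i].depth) : 0;
        if ((ip & mask) == (r[i].net & mask) && r[i].depth > best_depth) {
            best = (int)r[i].idx;
            best_depth = r[i].depth;
        }
    }
    return best;
}

#define IP(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))
```

Because every CIDR block gets its own struct, the number of distinct user-data values must equal the number of rules, which is why an 8-bit field cannot cover millions of indicators.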


end of thread, other threads:[~2015-12-02 15:47 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-30 20:50 [dpdk-dev] 2.3 Roadmap O'Driscoll, Tim
2015-11-30 21:50 ` Thomas Monjalon
2015-11-30 22:19 ` Dave Neary
2015-12-01 11:57   ` O'Driscoll, Tim
2015-11-30 22:30 ` Hobywan Kenoby
2015-12-01 11:52   ` O'Driscoll, Tim
2015-11-30 22:53 ` Kyle Larose
2015-12-01  1:16   ` Stephen Hemminger
2015-12-01 10:03     ` Bruce Richardson
2015-12-01 11:26       ` Yoshinobu Inoue
2015-12-01 11:58         ` Bruce Richardson
2015-12-01 13:42           ` Matthew Hall
2015-12-01 14:45             ` Kyle Larose
2015-12-01 19:28               ` Matthew Hall
2015-12-02  0:53           ` Yoshinobu Inoue
2015-12-01 14:27       ` Panu Matilainen
2015-12-01 14:48         ` Vincent JARDIN
2015-12-01 14:58           ` Panu Matilainen
2015-12-01 15:16             ` Christian Ehrhardt
2015-12-01 15:19             ` Bruce Richardson
2015-12-01 15:31               ` Aaron Conole
2015-12-01 15:54                 ` Richardson, Bruce
2015-12-02  1:38                   ` Wiles, Keith
2015-12-02  2:42                     ` Matthew Hall
2015-12-01 19:32                 ` Matthew Hall
2015-12-02 11:24           ` Neil Horman
2015-12-01 12:59 ` Matthew Hall
2015-12-01 13:16   ` O'Driscoll, Tim
2015-12-01 13:44     ` Matthew Hall
2015-12-01 13:57       ` Bruce Richardson
2015-12-01 19:49         ` Matthew Hall
2015-12-02 12:35           ` Bruce Richardson
2015-12-02 15:47             ` Matthew Hall
