DPDK patches and discussions
 help / color / mirror / Atom feed
From: Andrzej Ostruszka <aostruszka@marvell.com>
To: <dev@dpdk.org>
Subject: [dpdk-dev] [PATCH v2 0/4] Introduce IF proxy library
Date: Tue, 10 Mar 2020 12:10:33 +0100	[thread overview]
Message-ID: <20200310111037.31451-1-aostruszka@marvell.com> (raw)
In-Reply-To: <20200306164104.15528-1-aostruszka@marvell.com>

What is this useful for
=======================

Usually, when an ethernet port is assigned to DPDK it vanishes from the
system and user looses ability to control it via normal configuration
utilities (e.g. those from iproute2 package).  Moreover by default DPDK
application is not aware of the network configuration of the system.

To address both of these issues application needs to:
- add some command line interface (or other mechanism) allowing for
  control of the port and its configuration
- query the status of network configuration and monitor its changes

The purpose of this library is to help with both of these tasks (as long
as they remain in domain of configuration available to the system).  In
other words, if DPDK application has some special needs, that cannot be
addressed by the normal system configuration utilities, then they need
to be solved by the application itself.

The connection between DPDK and system is based on the existence of
ports that are visible to both DPDK and system (like Tap, KNI and
possibly some other drivers).  These ports serve as an interface
proxies.

Let's visualize the action of the library by the following example:

              Linux             |            DPDK
==============================================================
                                |
                                |   +-------+       +-------+
                                |   | Port1 |       | Port2 |
"ip link set dev tap1 mtu 1600" |   +-------+       +-------+
                          |     |       ^              ^ ^
                          |  +------+   | mtu_change   | |
                          `->| Tap1 |---' callback     | |
                             +------+                  | |
"ip addr add 198.51.100.14 \    |                      | |
                  dev tap2"     |                      | |
                          |  +------+                  | |
                          +->| Tap2 |------------------' |
                          |  +------+  addr_add callback |
"ip route add 198.0.2.0/24 \    |  |                     |
                  dev tap2"     |  | route_add callback  |
                                |  `---------------------'

So we have two ports Port1 and Port2 that are not visible to the system.
We create two proxy interfaces (here based on Tap driver) and bind the
ports to their proxies.  When user issues a command changing MTU for
Tap1 interface the library notes this and calls "mtu_change" callback
for the Port1.  Similarly when user adds an IPv4 address to the Tap2
interface "addr_add" callback is called for the Port2 and the same
happens for configuration of routing rule pointing to Tap2.  Apart from
callbacks this library can notify about changes via adding events to
notification queues.  See below for more inforamtion about that and
a complete list of available callbacks.

Please note that nothing has been mentioned about forwarding of the
packets between system and DPDK.  Since the proxies are normal DPDK
ports you can receive/send to them via usual RX/TX burst API.  However
since the library is not aware of the structure of packet processing
used by the application it cannot automatically forward the packets - it
is responsibility of the application to include proxy ports into its
packet processing engine.

As mentioned above the intention of the library is to:
- provide information about network configuration that would allow
  application to decide what to do with the packets received on DPDK
  ports,
- allow for control of the ports via standard configuration utilities

Although the library only helps you to identify proxy for given port
(and vice versa) and calls appropriate callbacks it does open some
interesting possibilities.  For example you can use the proxy ports to
forward packets for protocols that you do not wish to handle in DPDK
application to the system protocol stack and just listen to the
configuration changes - so that way you can "offload" handling of those
protocols to the system.

How to use it
=============

Usage of this library is rather simple.  You have to:
1. Create proxy (if you don't have port suitable for being proxy or you
  have one but do not wish to use it as a proxy).
2. Bind port to proxy.
3. Register callbacks and/or event queues.
4. Start listening to the network configuration.

The only mandatory requirement for DPDK port to be able to act as
a proxy is that it is visible in the system - this is checked during
port to proxy binding by calling rte_eth_dev_info_get() on proxy port
and inspecting 'if_index' field (it has to be non-zero).
One can create such port in the application by calling:

  proxy_id = rte_ifpx_create(RTE_IFPX_DEFAULT);

Upon success this returns id of DPDK proxy port created
(RTE_MAX_ETHPORTS on failure).  The argument selects type of proxy port
to create (currently Tap/KNI only).  This function actually is just
a wrapper around:

  uint16_t rte_ifpx_create_by_devarg(const char *devarg);

creating valid 'devarg' string for the chosen type of proxy.  If you have
other driver capable of acting as a proxy you can call
rte_ifpx_create_by_devarg() directly passing appropriate argument.

Once you have id of both port and proxy you can bind the two via:

  rte_ifpx_port_bind(port_id, proxy_id);

This creates logical binding - as mentioned above there is no automatic
packet forwarding.  With this binding whenever user changes the state of
proxy interface in the system (link up/down, change mac/mtu, add/remove
IPv4/IPv6) you get appropriate notification for the bound port.

So far we've mentioned several times that the library calls callbacks.
They are grouped in 'struct rte_ifpx_callbacks' and user provides them
to the library via:

  rte_ifpx_callbacks_register(&cbs);

It is worth mentioning that the context (lcore/thread) in which these
callbacks are called is implementation defined.  It might differ between
different platforms, so the application needs to assume that some kind
of inter lcore/thread synchronization/communication is required.

Apart from notification via callbacks this library also supports
notifying about the changes via adding events to the configured
notification queues.  The queues are registered via:

  int rte_ifpx_queue_add(struct rte_ring *r);

and the actual logic used is: if there is callback registered then it is
called, if it returns non-zero then event is considered completed,
otherwise event is added to each configured notification queue.
That way application can update data structures that are safe to be
modified by single writer from within callback or do the common
preprocessing steps (if any needed) in callback and data that is
replicated can be updated during handling of queued events.

Once we have bindings in place and notification configured, the only
essential part that remains is to get the current network configuration
and start listening to its changes.  This is accomplished via a call to:

  rte_ifpx_listen();

And basically this is all one needs to understand how to use this
library.  Other less essential parts include:
- ability to query what events are available for given platform
- getting mapping between proxy and port
- unbinding the ports from proxy
- destroying proxy port
- closing the listening service
- getting basic information about proxy


Currently available features and implementation
===============================================

The library's API is system independent but it obviously needs some
system dependent parts.  We provide exemplary Linux implementation (based
on netlink sockets).  Very similar implementation is possible for
FreeBSD (with the usage of PF_ROUTE sockets).  Windows implementation
would need to differ much (probably IP Helper library would be of some help).

Here is the list of currently implemented callbacks:

struct rte_ifpx_callbacks {
  int (*mac_change)(const struct rte_ifpx_mac_change *event);
  int (*mtu_change)(const struct rte_ifpx_mtu_change *event);
  int (*link_change)(const struct rte_ifpx_link_change *event);
  int (*addr_add)(const struct rte_ifpx_addr_change *event);
  int (*addr_del)(const struct rte_ifpx_addr_change *event);
  int (*addr6_add)(const struct rte_ifpx_addr6_change *event);
  int (*addr6_del)(const struct rte_ifpx_addr6_change *event);
  int (*route_add)(const struct rte_ifpx_route_change *event);
  int (*route_del)(const struct rte_ifpx_route_change *event);
  int (*route6_add)(const struct rte_ifpx_route6_change *event);
  int (*route6_del)(const struct rte_ifpx_route6_change *event);
  int (*neigh_add)(const struct rte_ifpx_neigh_change *event);
  int (*neigh_del)(const struct rte_ifpx_neigh_change *event);
  int (*neigh6_add)(const struct rte_ifpx_neigh6_change *event);
  int (*neigh6_del)(const struct rte_ifpx_neigh6_change *event);
  int (*cfg_done)(void);
};

They are all rather self-descriptive with the exception of the last one.
When the user calls rte_ifpx_listen() the library first queries the
system for its current configuration.  That might require several
request/reply exchanges between DPDK and system and once it is finished
this callback is called to let application know that all info has been
gathered.

It is worth to mention also that while typical case would be a 1-to-1
mapping between port and proxy, the 1-to-many mapping is also supported.
In that case related callbacks will be called for each port bound to
given proxy interface - it is application responsibility to define
semantic of such mapping (e.g. all changes apply to all ports, or link
changes apply to all but other are accepted in "round robin" fashion, or
some other logic).

As mentioned above Linux implementation is based on netlink socket.
This socket is registered as file descriptor in EAL interrupts
(similarly to how EAL alarms are implemented).

What has changed since the RFC
==============================

- Platform dependent parts has been separated into a ifpx_platform
  structure with callbacks for initialization, getting information about
  the interface, listening to the changes and closing of the library.
  That should allow easier reimplementation.

- Notification scheme has been changed - instead of having just
  callbacks now event queueing is also available (or a mix of those
  two).

- Filtering of events only related to the proxy ports - previously all
  network configuration changes were reported.  But DPDK application
  doesn't need to know whole configuration - only just portion related
  to the proxy ports.  If a packet comes that does not match rules then
  it can be forwarded via proxy to the system to decide what to do with
  it.  If that is not desired and such packets should be dropped then
  null port can be created with proxy and e.g. default route installed
  on it.

- Removed previous example which was just printing notification.
  Instead added a simplified (stripped vectorization and other
  performance improvements) version of l3fwd that should serve as an
  example of using this library in real applications.

Changes in V2
=============
- Cleaned up checkpatch warnings
- Removed dead/unused code and added gateway clearing in l3fwd-ifpx


With regards
Andrzej Ostruszka

Note: Patch 4 in this series has a dependency on:
  https://patchwork.dpdk.org/patch/66492/
so I add here this newly proposed tag here:
Depends-on: series-8862

Andrzej Ostruszka (4):
  lib: introduce IF Proxy library
  if_proxy: add library documentation
  if_proxy: add simple functionality test
  if_proxy: add example application

 MAINTAINERS                                   |    6 +
 app/test/Makefile                             |    5 +
 app/test/meson.build                          |    4 +
 app/test/test_if_proxy.c                      |  707 +++++++++++
 config/common_base                            |    5 +
 config/common_linux                           |    1 +
 doc/guides/prog_guide/if_proxy_lib.rst        |  142 +++
 doc/guides/prog_guide/index.rst               |    1 +
 examples/Makefile                             |    1 +
 examples/l3fwd-ifpx/Makefile                  |   60 +
 examples/l3fwd-ifpx/l3fwd.c                   | 1131 +++++++++++++++++
 examples/l3fwd-ifpx/l3fwd.h                   |   98 ++
 examples/l3fwd-ifpx/main.c                    |  740 +++++++++++
 examples/l3fwd-ifpx/meson.build               |   11 +
 examples/meson.build                          |    2 +-
 lib/Makefile                                  |    2 +
 .../common/include/rte_eal_interrupts.h       |    2 +
 lib/librte_eal/linux/eal/eal_interrupts.c     |   14 +-
 lib/librte_if_proxy/Makefile                  |   29 +
 lib/librte_if_proxy/if_proxy_common.c         |  494 +++++++
 lib/librte_if_proxy/if_proxy_priv.h           |   97 ++
 lib/librte_if_proxy/linux/Makefile            |    4 +
 lib/librte_if_proxy/linux/if_proxy.c          |  550 ++++++++
 lib/librte_if_proxy/meson.build               |   19 +
 lib/librte_if_proxy/rte_if_proxy.h            |  561 ++++++++
 lib/librte_if_proxy/rte_if_proxy_version.map  |   19 +
 lib/meson.build                               |    2 +-
 27 files changed, 4701 insertions(+), 6 deletions(-)
 create mode 100644 app/test/test_if_proxy.c
 create mode 100644 doc/guides/prog_guide/if_proxy_lib.rst
 create mode 100644 examples/l3fwd-ifpx/Makefile
 create mode 100644 examples/l3fwd-ifpx/l3fwd.c
 create mode 100644 examples/l3fwd-ifpx/l3fwd.h
 create mode 100644 examples/l3fwd-ifpx/main.c
 create mode 100644 examples/l3fwd-ifpx/meson.build
 create mode 100644 lib/librte_if_proxy/Makefile
 create mode 100644 lib/librte_if_proxy/if_proxy_common.c
 create mode 100644 lib/librte_if_proxy/if_proxy_priv.h
 create mode 100644 lib/librte_if_proxy/linux/Makefile
 create mode 100644 lib/librte_if_proxy/linux/if_proxy.c
 create mode 100644 lib/librte_if_proxy/meson.build
 create mode 100644 lib/librte_if_proxy/rte_if_proxy.h
 create mode 100644 lib/librte_if_proxy/rte_if_proxy_version.map

-- 
2.17.1


  parent reply	other threads:[~2020-03-10 11:10 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-06 16:41 [dpdk-dev] [PATCH " Andrzej Ostruszka
2020-03-06 16:41 ` [dpdk-dev] [PATCH 1/4] lib: introduce IF Proxy library Andrzej Ostruszka
2020-03-31 12:36   ` Harman Kalra
2020-03-31 15:37     ` Andrzej Ostruszka [C]
2020-04-01  5:29   ` Varghese, Vipin
2020-04-01 20:08     ` Andrzej Ostruszka [C]
2020-04-08  3:04       ` Varghese, Vipin
2020-04-08 18:13         ` Andrzej Ostruszka [C]
2020-03-06 16:41 ` [dpdk-dev] [PATCH 2/4] if_proxy: add library documentation Andrzej Ostruszka
2020-03-06 16:41 ` [dpdk-dev] [PATCH 3/4] if_proxy: add simple functionality test Andrzej Ostruszka
2020-03-06 16:41 ` [dpdk-dev] [PATCH 4/4] if_proxy: add example application Andrzej Ostruszka
2020-03-06 17:17 ` [dpdk-dev] [PATCH 0/4] Introduce IF proxy library Andrzej Ostruszka
2020-03-10 11:10 ` Andrzej Ostruszka [this message]
2020-03-10 11:10   ` [dpdk-dev] [PATCH v2 1/4] lib: introduce IF Proxy library Andrzej Ostruszka
2020-07-02  0:34     ` Stephen Hemminger
2020-07-07 20:13       ` Andrzej Ostruszka [C]
2020-07-08 16:07         ` Morten Brørup
2020-07-09  8:43           ` Andrzej Ostruszka [C]
2020-07-22  0:40             ` Thomas Monjalon
2020-07-22  8:45               ` Jerin Jacob
2020-07-22  8:56                 ` Thomas Monjalon
2020-07-22  9:09                   ` Jerin Jacob
2020-07-22  9:27                     ` Thomas Monjalon
2020-07-22  9:54                       ` Jerin Jacob
2020-07-23 14:09                         ` [dpdk-dev] [EXT] " Andrzej Ostruszka [C]
2020-03-10 11:10   ` [dpdk-dev] [PATCH v2 2/4] if_proxy: add library documentation Andrzej Ostruszka
2020-03-10 11:10   ` [dpdk-dev] [PATCH v2 3/4] if_proxy: add simple functionality test Andrzej Ostruszka
2020-03-10 11:10   ` [dpdk-dev] [PATCH v2 4/4] if_proxy: add example application Andrzej Ostruszka
2020-03-25  8:08   ` [dpdk-dev] [PATCH v2 0/4] Introduce IF proxy library David Marchand
2020-03-25 11:11     ` Morten Brørup
2020-03-26 17:42       ` Andrzej Ostruszka
2020-04-02 13:48         ` Andrzej Ostruszka [C]
2020-04-03 17:19           ` Thomas Monjalon
2020-04-03 19:09             ` Jerin Jacob
2020-04-03 21:18               ` Morten Brørup
2020-04-03 21:57                 ` Thomas Monjalon
2020-04-04 10:18                   ` Jerin Jacob
2020-04-10 10:41                     ` Morten Brørup
2020-04-04 18:30             ` Andrzej Ostruszka [C]
2020-04-04 19:58               ` Thomas Monjalon
2020-04-10 10:03                 ` Morten Brørup
2020-04-10 12:28                   ` Jerin Jacob
2020-03-26 12:41     ` Andrzej Ostruszka
2020-03-30 19:23       ` Andrzej Ostruszka
2020-04-03 21:42   ` Thomas Monjalon
2020-04-04 18:07     ` Andrzej Ostruszka [C]
2020-04-04 19:51       ` Thomas Monjalon
2020-04-16 16:11 ` [dpdk-dev] [PATCH " Stephen Hemminger
2020-04-16 16:49   ` Jerin Jacob
2020-04-16 17:04     ` Stephen Hemminger
2020-04-16 17:26       ` Andrzej Ostruszka [C]
2020-04-16 17:27       ` Jerin Jacob
2020-04-16 17:12     ` Andrzej Ostruszka [C]
2020-04-16 17:19       ` Stephen Hemminger
2020-05-04  8:53 ` [dpdk-dev] [PATCH v3 " Andrzej Ostruszka
2020-05-04  8:53   ` [dpdk-dev] [PATCH v3 1/4] lib: introduce IF Proxy library Andrzej Ostruszka
2020-05-04  8:53   ` [dpdk-dev] [PATCH v3 2/4] if_proxy: add library documentation Andrzej Ostruszka
2020-05-04  8:53   ` [dpdk-dev] [PATCH v3 3/4] if_proxy: add simple functionality test Andrzej Ostruszka
2020-05-04  8:53   ` [dpdk-dev] [PATCH v3 4/4] if_proxy: add example application Andrzej Ostruszka
2020-06-22  9:21 ` [dpdk-dev] [PATCH v4 0/4] Introduce IF proxy library Andrzej Ostruszka
2020-06-22  9:21   ` [dpdk-dev] [PATCH v4 1/4] lib: introduce IF Proxy library Andrzej Ostruszka
2020-06-22  9:21   ` [dpdk-dev] [PATCH v4 2/4] if_proxy: add library documentation Andrzej Ostruszka
2020-06-22  9:21   ` [dpdk-dev] [PATCH v4 3/4] if_proxy: add simple functionality test Andrzej Ostruszka
2020-06-22  9:21   ` [dpdk-dev] [PATCH v4 4/4] if_proxy: add example application Andrzej Ostruszka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200310111037.31451-1-aostruszka@marvell.com \
    --to=aostruszka@marvell.com \
    --cc=dev@dpdk.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).