DPDK patches and discussions
 help / color / Atom feed
From: Andrzej Ostruszka <amo@semihalf.com>
To: Morten Brørup <mb@smartsharesystems.com>,
	Andrzej Ostruszka <aostruszka@marvell.com>,
	David Marchand <david.marchand@redhat.com>
Cc: dev <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v2 0/4] Introduce IF proxy library
Date: Thu, 26 Mar 2020 18:42:18 +0100
Message-ID: <7886ce41-84eb-c7c0-fa4d-cd3a44825630@semihalf.com> (raw)
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35C60EFD@smartserver.smartshare.dk>

On 3/25/20 12:11 PM, Morten Brørup wrote:
[...]
>>> - Notification scheme has been changed - instead of having just
>>>   callbacks now event queueing is also available (or a mix of those
>>>   two).
> 
> Thank you for adding event queueing!

That was actually a good input from you - thank you.

> David mentions ABI forward compatibility below.
> Consider using a dynamically sized generic TLV (type, length, value)
> message format instead of a big union structure for the events. This
> would make it easier to extend the list of event types without breaking
> the ABI.

My understanding is that David was talking about registering of
callbacks and you want to extend this to event definition.

So let's focus on one example:
...
	RTE_IFPX_NEIGH_ADD,
	RTE_IFPX_NEIGH_DEL,
...
struct rte_ifpx_neigh_change {
	uint16_t port_id;
	struct rte_ether_addr mac;
	uint32_t ip;
};

Right now the event is defined as:

struct rte_ifpx_event {
	enum rte_ifpx_event_type type;
	union {
	...
		struct rte_ifpx_neigh_change neigh_change;
	...
	};
};

So what the user does is a switch on event->type:

	switch (ev->type) {
		case RTE_IFPX_NEIGH_ADD:
			handle_neigh_add(lconf, &ev->neigh_change);
			break;
		case RTE_IFPX_NEIGH_DEL:
			handle_neigh_del(lconf, &ev->neigh_change);
			break;

How does adding more event types to this union would break ABI?  User
gets event from the queue (allocated by the lib) checks the type and
casts the pointer past the 'type' to proper event definition.  And when
done with the event simply free()s it (BTW right now it is malloc() not
rte_malloc() - should I change that?).  If app links against newer
version of lib then it might get type which it does not
understand/handle so it should skip (possibly with a warning).  I'm not
sure how changing rte_ifpx_event to:

struct rte_ifpx_event {
	enut rte_ifpx_event_type type;
	int length;
	uint8_t data[];
};

would help here.  The user would need to cast data based on event type
whereas now it takes address of a proper union member - and the union is
there only to avoid casting.  In both cases what is important is that
RTE_IFPX_NEIGH_ADD/DEL and "struct rte_ifpx_neigh_change" don't change
between versions (new values can be added - or new versions of the
previously existing events when trying to make a change).

And for the callbacks it is more or less the same - library will prepare
data and call callback with a pointer to this data.  Handling of new
event types should be automatic when I implement what David wanted -
simply lib callback for the new event will be NULL nothing will be
called and application will work without problems.

> And I am still strongly opposed to the callback method:

Noted - however for now I would like to keep them.  I don't have much
experience with this library so if they prove to be inadequate then we
will remove them.  Right now they seem to add some flexibility that I like:
- if something should be changed globally and once (and it is safe to do
  so!) then it can be done from the callback
- if something can be prepared once and consumed later by lcores then it
  can be done in callback and the callback returns 0 so that event is
  still queued and lcores (under assumption that queues are per lcore)
  pick up what has been prepared.

> The callbacks are handled as DPDK interrupts, which are running in a non-DPDK
> thread, i.e. a running callback may be preempted by some other Linux process.
> This makes it difficult to implement callbacks correctly.
> The risk of someone calling a non-thread safe function from a callback is high,
> e.g. DPDK hash table manipulation (except lookup) is not thread safe.
> 
> Your documentation is far too vague about this:
> Please note however that the context in which these callbacks are 
> called is most probably different from the one in which packets are 
> handled and it is application writer responsibility to use proper 
> synchronization mechanisms - if they are needed.
> 
> You need a big fat WARNING about how difficult the DPDK interrupt thread is to
> work with. As I described above, it is not "most probably" it is "certainly" a
> very different kind of context.

OK.  Will update in next version.

> Did you check that the functions you use in your example callbacks are all
> thread safe and non-blocking, so they can safely be called from a non-DPDK thread
> that may be preempted by a another Linux process?

I believe so.  However there is a big question whether my assumption
about LPM is correct.  I've looked at the code and it looks like it so
but I'm not in power to authoritatively declare it.  So again, to me LPM
looks like safe to be changed by a single writer while being used by
multiple readers (with an obvious transient period when rule is being
expanded and some IPs might go with an old and some with a new destination).

With regards
Andrzej Ostruszka

  reply index

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-06 16:41 [dpdk-dev] [PATCH " Andrzej Ostruszka
2020-03-06 16:41 ` [dpdk-dev] [PATCH 1/4] lib: introduce IF Proxy library Andrzej Ostruszka
2020-03-31 12:36   ` Harman Kalra
2020-03-31 15:37     ` Andrzej Ostruszka [C]
2020-04-01  5:29   ` Varghese, Vipin
2020-04-01 20:08     ` Andrzej Ostruszka [C]
2020-03-06 16:41 ` [dpdk-dev] [PATCH 2/4] if_proxy: add library documentation Andrzej Ostruszka
2020-03-06 16:41 ` [dpdk-dev] [PATCH 3/4] if_proxy: add simple functionality test Andrzej Ostruszka
2020-03-06 16:41 ` [dpdk-dev] [PATCH 4/4] if_proxy: add example application Andrzej Ostruszka
2020-03-06 17:17 ` [dpdk-dev] [PATCH 0/4] Introduce IF proxy library Andrzej Ostruszka
2020-03-10 11:10 ` [dpdk-dev] [PATCH v2 " Andrzej Ostruszka
2020-03-10 11:10   ` [dpdk-dev] [PATCH v2 1/4] lib: introduce IF Proxy library Andrzej Ostruszka
2020-03-10 11:10   ` [dpdk-dev] [PATCH v2 2/4] if_proxy: add library documentation Andrzej Ostruszka
2020-03-10 11:10   ` [dpdk-dev] [PATCH v2 3/4] if_proxy: add simple functionality test Andrzej Ostruszka
2020-03-10 11:10   ` [dpdk-dev] [PATCH v2 4/4] if_proxy: add example application Andrzej Ostruszka
2020-03-25  8:08   ` [dpdk-dev] [PATCH v2 0/4] Introduce IF proxy library David Marchand
2020-03-25 11:11     ` Morten Brørup
2020-03-26 17:42       ` Andrzej Ostruszka [this message]
2020-03-26 12:41     ` Andrzej Ostruszka
2020-03-30 19:23       ` Andrzej Ostruszka

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7886ce41-84eb-c7c0-fa4d-cd3a44825630@semihalf.com \
    --to=amo@semihalf.com \
    --cc=aostruszka@marvell.com \
    --cc=david.marchand@redhat.com \
    --cc=dev@dpdk.org \
    --cc=mb@smartsharesystems.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

DPDK patches and discussions

Archives are clonable:
	git clone --mirror http://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ http://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev


Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/ public-inbox