DPDK patches and discussions
 help / color / mirror / Atom feed
From: Bruce Richardson <bruce.richardson@intel.com>
To: dev@dpdk.org
Subject: [dpdk-dev] [PATCH v2 10/13] mbuf: split mbuf across two cache lines.
Date: Thu, 11 Sep 2014 14:15:44 +0100	[thread overview]
Message-ID: <1410441347-22840-11-git-send-email-bruce.richardson@intel.com> (raw)
In-Reply-To: <1410441347-22840-1-git-send-email-bruce.richardson@intel.com>

This change splits the mbuf in two to move the pool and next pointers to
the second cache line. This frees up 16 bytes in first cache line.

The reason for this change is that we believe that there is no possible
way that we can ever fit all the fields we need to fit into a 64-byte
mbuf, and so we need to start looking at a 128-byte mbuf instead. Examples
of new fields that need to fit in, include -
* 32-bits more for filter information for support for the new filters in
  the i40e driver (and possibly other future drivers)
* an additional 2-4 bytes for storing info on a second vlan tag to allow
  drivers to support double Vlan/QinQ
* 4-bytes for storing a sequence number to enable out of order packet
  processing and subsequent packet reordering
as well as potentially a number of other fields or splitting out fields
that are superimposed over each other right now, e.g. for the qos scheduler.
We also want to allow space for use by other non-Intel NIC drivers that may
be open-sourced to dpdk.org in the future too, where they support fields
and offloads that currently supported hardware doesn't.

If we accept the fact of a 2-cache-line mbuf, then the issue becomes
how to rework things so that we spread our fields over the two
cache lines while causing the lowest slow-down possible. The general
approach that we are looking to take is to focus the first cache
line on fields that are updated on RX , so that receive only deals
with one cache line. The second cache line can be used for application
data and information that will only be used on the TX leg. This would
allow us to work on the first cache line in RX as now, and have the
second cache line being prefetched in the background so that it is
available when necessary. Hardware prefetches should help us out
here. We also may move rarely used, or slow-path RX fields e.g. such
as those for chained mbufs with jumbo frames, to the second
cache line, depending upon the performance impact and bytes savings
achieved.

Updated in V2:
* Expanded commit description to include contents of a previous mail
  describing some of the logic behind expanding mbuf to two cache lines
* Update kni mbuf structure

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 app/test/test_mbuf.c                                          | 2 +-
 lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h | 6 +++---
 lib/librte_mbuf/rte_mbuf.h                                    | 3 ++-
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index 1b25481..66bcbc5 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -782,7 +782,7 @@ test_failing_mbuf_sanity_check(void)
 static int
 test_mbuf(void)
 {
-	RTE_BUILD_BUG_ON(sizeof(struct rte_mbuf) != 64);
+	RTE_BUILD_BUG_ON(sizeof(struct rte_mbuf) != CACHE_LINE_SIZE * 2);
 
 	/* create pktmbuf pool if it does not exist */
 	if (pktmbuf_pool == NULL) {
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
index ab022bd..25ed672 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
@@ -108,7 +108,7 @@ struct rte_kni_fifo {
  * Padding is necessary to assure the offsets of these fields
  */
 struct rte_kni_mbuf {
-	void *buf_addr;
+	void *buf_addr __attribute__((__aligned__(64)));
 	char pad0[10];
 	uint16_t data_off;      /**< Start address of data in segment buffer. */
 	char pad1[4];
@@ -117,9 +117,9 @@ struct rte_kni_mbuf {
 	uint16_t data_len;      /**< Amount of data in segment buffer. */
 	uint32_t pkt_len;       /**< Total pkt len: sum of all segment data_len. */
 	char pad3[8];
-	void *pool;
+	void *pool __attribute__((__aligned__(64)));
 	void *next;
-} __attribute__((__aligned__(64)));
+};
 
 /*
  * Struct used to create a KNI device. Passed to the kernel in IOCTL call
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 34900d4..508021b 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -176,7 +176,8 @@ struct rte_mbuf {
 		uint32_t sched;   /**< Hierarchical scheduler */
 	} hash;                   /**< hash information */
 
-	/* fields only used in slow path or on TX */
+	/* second cache line - fields only used in slow path or on TX */
+	MARKER cacheline1 __rte_cache_aligned;
 	struct rte_mempool *pool; /**< Pool from which mbuf was allocated. */
 	struct rte_mbuf *next;    /**< Next segment of scattered packet. */
 
-- 
1.9.3

  parent reply	other threads:[~2014-09-11 13:11 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-03 15:49 [dpdk-dev] [PATCH 00/13] Mbuf Structure Rework, part 2 Bruce Richardson
2014-09-03 15:49 ` [dpdk-dev] [PATCH 01/13] mbuf: replace data pointer by an offset Bruce Richardson
2014-09-08  9:52   ` Olivier MATZ
2014-09-08  9:55     ` Olivier MATZ
2014-09-03 15:49 ` [dpdk-dev] [PATCH 02/13] mbuf: reorder fields by time of use Bruce Richardson
2014-09-08 10:17   ` Olivier MATZ
2014-09-03 15:49 ` [dpdk-dev] [PATCH 03/13] mbuf: add packet_type field Bruce Richardson
2014-09-08 10:17   ` Olivier MATZ
2014-09-08 10:33     ` Yerden Zhumabekov
2014-09-08 11:17       ` Olivier MATZ
2014-09-09  3:59         ` Zhang, Helin
     [not found]           ` <540EB428.9060706@6wind.com>
2014-09-09  8:45             ` Zhang, Helin
2014-09-09  9:47             ` Richardson, Bruce
2014-09-09 15:05         ` Jim Thompson
2014-09-03 15:49 ` [dpdk-dev] [PATCH 04/13] mbuf: expand ol_flags field to 64-bits Bruce Richardson
2014-09-08 10:25   ` Olivier MATZ
2014-09-09  9:00     ` Richardson, Bruce
2014-09-03 15:49 ` [dpdk-dev] [PATCH 05/13] mbuf: introduce a flag to indicate a control mbuf Bruce Richardson
2014-09-08 11:53   ` Olivier MATZ
2014-09-03 15:49 ` [dpdk-dev] [PATCH 06/13] mbuf: minor changes for readability Bruce Richardson
2014-09-08 12:03   ` Olivier MATZ
2014-09-03 15:49 ` [dpdk-dev] [PATCH 07/13] mbuf: use macros only to access the mbuf metadata Bruce Richardson
2014-09-08 12:05   ` Olivier MATZ
2014-09-09  9:01     ` Richardson, Bruce
2014-09-12 16:56       ` Dumitrescu, Cristian
2014-09-12 21:02         ` Olivier MATZ
2014-09-16 20:07           ` Dumitrescu, Cristian
2014-09-16 22:06             ` Ramia, Kannan Babu
2014-09-17 10:31               ` Richardson, Bruce
2014-09-17 14:01                 ` Thomas Monjalon
2014-09-10 15:09     ` Bruce Richardson
2014-09-10 15:31       ` Olivier MATZ
2014-09-03 15:49 ` [dpdk-dev] [PATCH 08/13] mbuf: add named points inside the mbuf structure Bruce Richardson
2014-09-08 12:08   ` Olivier MATZ
2014-09-03 15:49 ` [dpdk-dev] [PATCH 09/13] ixgbe: rework vector pmd following mbuf changes Bruce Richardson
2014-09-03 15:49 ` [dpdk-dev] [PATCH 10/13] mbuf: split mbuf across two cache lines Bruce Richardson
2014-09-08 12:10   ` Olivier MATZ
2014-09-03 15:49 ` [dpdk-dev] [PATCH 11/13] mbuf: move l2_len and l3_len to second cache line Bruce Richardson
2014-09-04  5:08   ` Yerden Zhumabekov
2014-09-04 10:27     ` Bruce Richardson
2014-09-04 11:00       ` Yerden Zhumabekov
2014-09-04 11:55         ` Bruce Richardson
2014-09-03 15:49 ` [dpdk-dev] [PATCH 12/13] ixgbe: Fix perf regression due to moved pool ptr Bruce Richardson
2014-09-03 15:49 ` [dpdk-dev] [PATCH 13/13] ixgbe: Improve slow-path perf: vector scattered RX Bruce Richardson
2014-09-11 13:15 ` [dpdk-dev] [PATCH v2 00/13] Mbuf Structure Rework, part 2 Bruce Richardson
2014-09-11 13:15   ` [dpdk-dev] [PATCH v2 01/13] mbuf: replace data pointer by an offset Bruce Richardson
2014-09-11 13:15   ` [dpdk-dev] [PATCH v2 02/13] mbuf: reorder fields by time of use Bruce Richardson
2014-09-15  7:11     ` Liu, Jijiang
2014-09-15  8:19       ` Richardson, Bruce
2014-09-11 13:15   ` [dpdk-dev] [PATCH v2 03/13] mbuf: expand ol_flags field to 64-bits Bruce Richardson
2014-09-11 13:15   ` [dpdk-dev] [PATCH v2 04/13] mbuf: introduce a flag to indicate a control mbuf Bruce Richardson
2014-09-11 13:15   ` [dpdk-dev] [PATCH v2 05/13] mbuf: minor changes for readability Bruce Richardson
2014-09-11 13:15   ` [dpdk-dev] [PATCH v2 06/13] mbuf: use macros only to access the mbuf metadata Bruce Richardson
2014-09-11 13:15   ` [dpdk-dev] [PATCH v2 07/13] mbuf: move metadata macros to rte_port library Bruce Richardson
2014-09-11 13:15   ` [dpdk-dev] [PATCH v2 08/13] mbuf: add named points inside the mbuf structure Bruce Richardson
2014-09-11 13:15   ` [dpdk-dev] [PATCH v2 09/13] ixgbe: rework vector pmd following mbuf changes Bruce Richardson
2014-09-11 13:15   ` Bruce Richardson [this message]
2014-09-11 13:15   ` [dpdk-dev] [PATCH v2 11/13] mbuf: move l2_len and l3_len to second cache line Bruce Richardson
2014-09-11 13:15   ` [dpdk-dev] [PATCH v2 12/13] ixgbe: Fix perf regression due to moved pool ptr Bruce Richardson
2014-09-15 16:20     ` [dpdk-dev] [PATCH v3 " Bruce Richardson
2014-09-11 13:15   ` [dpdk-dev] [PATCH v2 13/13] ixgbe: Improve slow-path perf: vector scattered RX Bruce Richardson
2014-09-17 22:35   ` [dpdk-dev] [PATCH v2 00/13] Mbuf Structure Rework, part 2 Thomas Monjalon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1410441347-22840-11-git-send-email-bruce.richardson@intel.com \
    --to=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).