DPDK patches and discussions
 help / color / mirror / Atom feed
From: Jerin Jacob <jerinjacobk@gmail.com>
To: Thomas Monjalon <thomas@monjalon.net>
Cc: dpdk-dev <dev@dpdk.org>,
	David Marchand <david.marchand@redhat.com>,
	 Ferruh Yigit <ferruh.yigit@intel.com>,
	Olivier Matz <olivier.matz@6wind.com>,
	 Morten Brørup <mb@smartsharesystems.com>,
	 "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	 Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>,
	 Viacheslav Ovsiienko <viacheslavo@nvidia.com>,
	Ajit Khaparde <ajit.khaparde@broadcom.com>,
	 Jerin Jacob <jerinj@marvell.com>,
	Hemant Agrawal <hemant.agrawal@nxp.com>,
	 Ray Kinsella <mdr@ashroe.eu>,
	Neil Horman <nhorman@tuxdriver.com>,
	 Nithin Dabilpuram <ndabilpuram@marvell.com>,
	Kiran Kumar K <kirankumark@marvell.com>
Subject: Re: [dpdk-dev] [PATCH 1/1] mbuf: move pool pointer in first half
Date: Sat, 7 Nov 2020 22:42:46 +0530
Message-ID: <CALBAE1N3XMH5_+F6TjhcdhnKZiB0kPobon1ajUoPSsi-XQUfoA@mail.gmail.com> (raw)
In-Reply-To: <20201107155306.463148-1-thomas@monjalon.net>

On Sat, Nov 7, 2020 at 10:04 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> The mempool pointer in the mbuf struct is moved
> from the second to the first half.
> It should increase performance on most systems having 64-byte cache line,

> i.e. mbuf is split in two cache lines.

But In any event, Tx needs to touch the pool to freeing back to the
pool upon  Tx completion. Right?
Not able to understand the motivation for moving it to the first 64B cache line?
The gain varies from driver to driver. For example, a Typical
ARM-based NPU does not need to
touch the pool in Rx and its been filled by HW. Whereas it needs to
touch in Tx if the reference count is implemented.


>
> Due to this change, tx_offload is moved, so some vector data paths
> may need to be adjusted. Note: OCTEON TX2 check is removed temporarily!

It will be breaking the Tx path, Please just don't remove the static
assert without adjusting the code.

>
> Moving this field gives more space to dynfield1.
>
> This is how the mbuf layout looks like (pahole-style):
>
> word  type                              name                byte  size
>  0    void *                            buf_addr;         /*   0 +  8 */
>  1    rte_iova_t                        buf_iova          /*   8 +  8 */
>       /* --- RTE_MARKER64               rearm_data;                   */
>  2    uint16_t                          data_off;         /*  16 +  2 */
>       uint16_t                          refcnt;           /*  18 +  2 */
>       uint16_t                          nb_segs;          /*  20 +  2 */
>       uint16_t                          port;             /*  22 +  2 */
>  3    uint64_t                          ol_flags;         /*  24 +  8 */
>       /* --- RTE_MARKER                 rx_descriptor_fields1;        */
>  4    uint32_t             union        packet_type;      /*  32 +  4 */
>       uint32_t                          pkt_len;          /*  36 +  4 */
>  5    uint16_t                          data_len;         /*  40 +  2 */
>       uint16_t                          vlan_tci;         /*  42 +  2 */
>  5.5  uint64_t             union        hash;             /*  44 +  8 */
>  6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
>       uint16_t                          buf_len;          /*  54 +  2 */
>  7    struct rte_mempool *              pool;             /*  56 +  8 */
>       /* --- RTE_MARKER                 cacheline1;                   */
>  8    struct rte_mbuf *                 next;             /*  64 +  8 */
>  9    uint64_t             union        tx_offload;       /*  72 +  8 */
> 10    struct rte_mbuf_ext_shared_info * shinfo;           /*  80 +  8 */
> 11    uint16_t                          priv_size;        /*  88 +  2 */
>       uint16_t                          timesync;         /*  90 +  2 */
> 11.5  uint32_t                          dynfield1[9];     /*  92 + 36 */
> 16    /* --- END                                             128      */
>
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  doc/guides/rel_notes/deprecation.rst | 7 -------
>  drivers/net/octeontx2/otx2_ethdev.c  | 2 --
>  lib/librte_kni/rte_kni_common.h      | 3 ++-
>  lib/librte_mbuf/rte_mbuf.h           | 1 -
>  lib/librte_mbuf/rte_mbuf_core.h      | 5 ++---
>  lib/librte_mbuf/rte_mbuf_dyn.c       | 1 -
>  6 files changed, 4 insertions(+), 15 deletions(-)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index f3258eb3f7..efb09f0c5e 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -81,13 +81,6 @@ Deprecation Notices
>    us extending existing enum/define.
>    One solution can be using a fixed size array instead of ``.*MAX.*`` value.
>
> -* mbuf: Some fields will be converted to dynamic API in DPDK 20.11
> -  in order to reserve more space for the dynamic fields, as explained in
> -  `this presentation <https://www.youtube.com/watch?v=Ttl6MlhmzWY>`_.
> -  As a consequence, the layout of the ``struct rte_mbuf`` will be re-arranged,
> -  avoiding impact on vectorized implementation of the driver datapaths,
> -  while evaluating performance gains of a better use of the first cache line.
> -
>  * ethdev: The flow director API, including ``rte_eth_conf.fdir_conf`` field,
>    and the related structures (``rte_fdir_*`` and ``rte_eth_fdir_*``),
>    will be removed in DPDK 20.11.
> diff --git a/drivers/net/octeontx2/otx2_ethdev.c b/drivers/net/octeontx2/otx2_ethdev.c
> index 6cebbe677d..d6e0f1dd03 100644
> --- a/drivers/net/octeontx2/otx2_ethdev.c
> +++ b/drivers/net/octeontx2/otx2_ethdev.c
> @@ -748,8 +748,6 @@ nix_tx_offload_flags(struct rte_eth_dev *eth_dev)
>                          offsetof(struct rte_mbuf, buf_iova) + 16);
>         RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, pkt_len) !=
>                          offsetof(struct rte_mbuf, ol_flags) + 12);
> -       RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, tx_offload) !=
> -                        offsetof(struct rte_mbuf, pool) + 2 * sizeof(void *));
>
>         if (conf & DEV_TX_OFFLOAD_VLAN_INSERT ||
>             conf & DEV_TX_OFFLOAD_QINQ_INSERT)
> diff --git a/lib/librte_kni/rte_kni_common.h b/lib/librte_kni/rte_kni_common.h
> index 36d66e2ffa..ffb3182731 100644
> --- a/lib/librte_kni/rte_kni_common.h
> +++ b/lib/librte_kni/rte_kni_common.h
> @@ -84,10 +84,11 @@ struct rte_kni_mbuf {
>         char pad2[4];
>         uint32_t pkt_len;       /**< Total pkt len: sum of all segment data_len. */
>         uint16_t data_len;      /**< Amount of data in segment buffer. */
> +       char pad3[14];
> +       void *pool;
>
>         /* fields on second cache line */
>         __attribute__((__aligned__(RTE_CACHE_LINE_MIN_SIZE)))
> -       void *pool;
>         void *next;             /**< Physical address of next mbuf in kernel. */
>  };
>
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 3190a29cce..c4c9ebfaa0 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -1107,7 +1107,6 @@ rte_pktmbuf_attach_extbuf(struct rte_mbuf *m, void *buf_addr,
>  static inline void
>  rte_mbuf_dynfield_copy(struct rte_mbuf *mdst, const struct rte_mbuf *msrc)
>  {
> -       memcpy(&mdst->dynfield0, msrc->dynfield0, sizeof(mdst->dynfield0));
>         memcpy(&mdst->dynfield1, msrc->dynfield1, sizeof(mdst->dynfield1));
>  }
>
> diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> index debaace95a..567551deab 100644
> --- a/lib/librte_mbuf/rte_mbuf_core.h
> +++ b/lib/librte_mbuf/rte_mbuf_core.h
> @@ -586,12 +586,11 @@ struct rte_mbuf {
>
>         uint16_t buf_len;         /**< Length of segment buffer. */
>
> -       uint64_t dynfield0[1]; /**< Reserved for dynamic fields. */
> +       struct rte_mempool *pool; /**< Pool from which mbuf was allocated. */
>
>         /* second cache line - fields only used in slow path or on TX */
>         RTE_MARKER cacheline1 __rte_cache_min_aligned;
>
> -       struct rte_mempool *pool; /**< Pool from which mbuf was allocated. */
>         struct rte_mbuf *next;    /**< Next segment of scattered packet. */
>
>         /* fields to support TX offloads */
> @@ -645,7 +644,7 @@ struct rte_mbuf {
>         /** Timesync flags for use with IEEE1588. */
>         uint16_t timesync;
>
> -       uint32_t dynfield1[7]; /**< Reserved for dynamic fields. */
> +       uint32_t dynfield1[9]; /**< Reserved for dynamic fields. */
>  } __rte_cache_aligned;
>
>  /**
> diff --git a/lib/librte_mbuf/rte_mbuf_dyn.c b/lib/librte_mbuf/rte_mbuf_dyn.c
> index fd3e019a22..7d5e942bf0 100644
> --- a/lib/librte_mbuf/rte_mbuf_dyn.c
> +++ b/lib/librte_mbuf/rte_mbuf_dyn.c
> @@ -125,7 +125,6 @@ init_shared_mem(void)
>                  * rte_mbuf_dynfield_copy().
>                  */
>                 memset(shm, 0, sizeof(*shm));
> -               mark_free(dynfield0);
>                 mark_free(dynfield1);
>
>                 /* init free_flags */
> --
> 2.28.0
>

  reply	other threads:[~2020-11-07 17:13 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-07 15:53 Thomas Monjalon
2020-11-07 17:12 ` Jerin Jacob [this message]
2020-11-07 18:39   ` Thomas Monjalon
2020-11-07 19:05     ` Jerin Jacob
2020-11-07 20:33       ` Thomas Monjalon
2020-11-09  5:18         ` Jerin Jacob
2020-11-09  8:04           ` Thomas Monjalon
2020-11-09  8:27             ` Jerin Jacob
2020-11-09  9:47               ` Bruce Richardson
2020-11-09 12:01                 ` Jerin Jacob
2020-11-09 12:59                   ` Thomas Monjalon
2020-11-09 13:35                     ` Jerin Jacob
2020-11-09 14:02                       ` Thomas Monjalon
2020-11-09 14:08                         ` Jerin Jacob
2020-11-09 14:42                           ` Thomas Monjalon
2020-11-09 14:53                             ` Jerin Jacob
2020-11-09  8:16           ` Morten Brørup
2020-11-09 10:06             ` [dpdk-dev] [dpdk-techboard] " Bruce Richardson
2020-11-09 10:21               ` Morten Brørup
2020-11-09 18:04                 ` Stephen Hemminger
2020-11-10  7:15                   ` Morten Brørup
2020-11-07 18:57 ` [dpdk-dev] " Morten Brørup
2020-11-09 10:08   ` Bruce Richardson
2020-11-09 10:30     ` Morten Brørup
2020-11-09 10:33     ` Ananyev, Konstantin
2020-11-09 10:36       ` Bruce Richardson
2020-11-09 11:24       ` Ananyev, Konstantin
2020-11-09 21:29 ` [dpdk-dev] [PATCH v2 0/2] move mbuf pool pointer Thomas Monjalon
2020-11-09 21:29   ` [dpdk-dev] [PATCH v2 1/2] drivers: disable OCTEON TX2 in 32-bit build Thomas Monjalon
2020-11-10 18:05     ` Jerin Jacob
2020-11-09 21:29   ` [dpdk-dev] [PATCH v2 2/2] mbuf: move pool pointer in first half Thomas Monjalon
2020-11-10 10:05     ` Morten Brørup
2020-11-10 10:44       ` Thomas Monjalon
2020-11-10 16:25     ` Olivier Matz
2020-11-10 18:06       ` Jerin Jacob
2020-11-12 14:39         ` Thomas Monjalon
2020-11-10 18:08       ` Stephen Hemminger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALBAE1N3XMH5_+F6TjhcdhnKZiB0kPobon1ajUoPSsi-XQUfoA@mail.gmail.com \
    --to=jerinjacobk@gmail.com \
    --cc=ajit.khaparde@broadcom.com \
    --cc=andrew.rybchenko@oktetlabs.ru \
    --cc=david.marchand@redhat.com \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@intel.com \
    --cc=hemant.agrawal@nxp.com \
    --cc=jerinj@marvell.com \
    --cc=kirankumark@marvell.com \
    --cc=konstantin.ananyev@intel.com \
    --cc=mb@smartsharesystems.com \
    --cc=mdr@ashroe.eu \
    --cc=ndabilpuram@marvell.com \
    --cc=nhorman@tuxdriver.com \
    --cc=olivier.matz@6wind.com \
    --cc=thomas@monjalon.net \
    --cc=viacheslavo@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

DPDK patches and discussions

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ https://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git