From: "Dumitrescu, Cristian" <cristian.dumitrescu@intel.com>
To: Stephen Hemminger <stephen@networkplumber.org>,
"dev@dpdk.org" <dev@dpdk.org>
Cc: Stephen Hemminger <shemming@brocade.com>
Subject: Re: [dpdk-dev] [PATCH v2 2/7] rte_sched: expand scheduler hierarchy for more VLAN's
Date: Fri, 20 Feb 2015 18:18:27 +0000
Message-ID: <3EB4FA525960D640B5BDFFD6A3D891263231B256@IRSMSX108.ger.corp.intel.com>
In-Reply-To: <1423116294-17080-2-git-send-email-stephen@networkplumber.org>
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Stephen
> Hemminger
> Sent: Thursday, February 5, 2015 6:05 AM
> To: dev@dpdk.org
> Cc: Stephen Hemminger
> Subject: [dpdk-dev] [PATCH v2 2/7] rte_sched: expand scheduler hierarchy
> for more VLAN's
>
> From: Stephen Hemminger <shemming@brocade.com>
>
> The QoS subport is limited to 8 bits in original code.
> But customers demanded ability to support full number of VLAN's (4096)
> therefore use the full part of the tag field of mbuf.
>
> Resize the pipe as well to allow for more pipes in future and
> avoid expensive bitfield access.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> v2 use tag area rather than claiming reserved bit which isn't documented
>
> lib/librte_mbuf/rte_mbuf.h | 5 ++++-
> lib/librte_sched/rte_sched.h | 38 ++++++++++++++++++++++++-------------
> -
> 2 files changed, 28 insertions(+), 15 deletions(-)
>
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 16059c6..8f0c3a4 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -258,7 +258,10 @@ struct rte_mbuf {
> /**< First 4 flexible bytes or FD ID, dependent on
> PKT_RX_FDIR_* flag in ol_flags. */
> } fdir; /**< Filter identifier if FDIR enabled */
> - uint32_t sched; /**< Hierarchical scheduler */
> + struct {
> + uint32_t lo;
> + uint32_t hi;
> + } sched; /**< Hierarchical scheduler */
> uint32_t usr; /**< User defined tags. See
> @rte_distributor_process */
> } hash; /**< hash information */
>
> diff --git a/lib/librte_sched/rte_sched.h b/lib/librte_sched/rte_sched.h
> index e6bba22..dda287f 100644
> --- a/lib/librte_sched/rte_sched.h
> +++ b/lib/librte_sched/rte_sched.h
> @@ -195,16 +195,20 @@ struct rte_sched_port_params {
> #endif
> };
>
> -/** Path through the scheduler hierarchy used by the scheduler enqueue
> operation to
> -identify the destination queue for the current packet. Stored in the field
> hash.sched
> -of struct rte_mbuf of each packet, typically written by the classification
> stage and read by
> -scheduler enqueue.*/
> +/*
> + * Path through the scheduler hierarchy used by the scheduler enqueue
> + * operation to identify the destination queue for the current
> + * packet. Stored in the field pkt.hash.sched of struct rte_mbuf of
> + * each packet, typically written by the classification stage and read
> + * by scheduler enqueue.
> + */
> struct rte_sched_port_hierarchy {
> - uint32_t queue:2; /**< Queue ID (0 .. 3) */
> - uint32_t traffic_class:2; /**< Traffic class ID (0 .. 3)*/
> - uint32_t pipe:20; /**< Pipe ID */
> - uint32_t subport:6; /**< Subport ID */
> - uint32_t color:2; /**< Color */
> + uint16_t queue:2; /**< Queue ID (0 .. 3) */
> + uint16_t traffic_class:2; /**< Traffic class ID (0 .. 3)*/
> + uint16_t color:2; /**< Color */
> + uint16_t unused:10;
> + uint16_t subport; /**< Subport ID */
> + uint32_t pipe; /**< Pipe ID */
> };
Extending the number of bits allocated for mbuf->sched makes sense to me. I agree with this partitioning.
>
> /*
> @@ -350,12 +354,15 @@ rte_sched_queue_read_stats(struct
> rte_sched_port *port,
> */
> static inline void
> rte_sched_port_pkt_write(struct rte_mbuf *pkt,
> - uint32_t subport, uint32_t pipe, uint32_t traffic_class, uint32_t
> queue, enum rte_meter_color color)
> + uint32_t subport, uint32_t pipe,
> + uint32_t traffic_class,
> + uint32_t queue, enum rte_meter_color color)
> {
> - struct rte_sched_port_hierarchy *sched = (struct
> rte_sched_port_hierarchy *) &pkt->hash.sched;
> + struct rte_sched_port_hierarchy *sched
> + = (struct rte_sched_port_hierarchy *) &pkt->hash.sched;
>
> - sched->color = (uint32_t) color;
> sched->subport = subport;
> + sched->color = (uint32_t) color;
> sched->pipe = pipe;
> sched->traffic_class = traffic_class;
> sched->queue = queue;
> @@ -379,9 +386,12 @@ rte_sched_port_pkt_write(struct rte_mbuf *pkt,
> *
> */
> static inline void
> -rte_sched_port_pkt_read_tree_path(struct rte_mbuf *pkt, uint32_t
> *subport, uint32_t *pipe, uint32_t *traffic_class, uint32_t *queue)
> +rte_sched_port_pkt_read_tree_path(struct rte_mbuf *pkt, uint32_t
> *subport,
> + uint32_t *pipe, uint32_t *traffic_class,
> + uint32_t *queue)
> {
> - struct rte_sched_port_hierarchy *sched = (struct
> rte_sched_port_hierarchy *) &pkt->hash.sched;
> + struct rte_sched_port_hierarchy *sched
> + = (struct rte_sched_port_hierarchy *) &pkt->hash.sched;
>
> *subport = sched->subport;
> *pipe = sched->pipe;
> --
> 2.1.4
The functions used to access the mbuf->sched field are very slow.
rte_sched_port_pkt_write(), rte_sched_port_pkt_read_tree_path() and rte_sched_port_pkt_read_color() access the bitfields directly, and gcc seems to do a particularly bad job of optimizing code that uses bitfields. Although less readable, a faster alternative is to implement these functions with shift, mask and OR operations instead of bitfield accesses, which seems to save dozens of cycles per packet.
Stephen, based on the conversation we had a while ago, would you be OK to make this change now and resubmit the patch?
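To make the suggestion concrete, here is a rough sketch of the shift/mask approach. It is not taken from the patch or from librte_sched: the helper names (sched_path_pack, sched_path_unpack, sched_path_color) and the bit positions are assumptions, chosen only to mirror the field widths of the proposed struct rte_sched_port_hierarchy (2-bit queue, 2-bit traffic class, 2-bit color, 16-bit subport, 32-bit pipe) packed into a single 64-bit word.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed layout of the 64-bit scheduler word (hypothetical, for
 * illustration only):
 *   bits  0..1   queue
 *   bits  2..3   traffic class
 *   bits  4..5   color
 *   bits  6..15  unused
 *   bits 16..31  subport
 *   bits 32..63  pipe
 */
#define SCHED_QUEUE_SHIFT    0
#define SCHED_TC_SHIFT       2
#define SCHED_COLOR_SHIFT    4
#define SCHED_SUBPORT_SHIFT  16
#define SCHED_PIPE_SHIFT     32

#define SCHED_QUEUE_MASK     0x3u
#define SCHED_TC_MASK        0x3u
#define SCHED_COLOR_MASK     0x3u
#define SCHED_SUBPORT_MASK   0xFFFFu

/* Build the whole word with shifts and ORs, so the caller can store it once. */
static inline uint64_t
sched_path_pack(uint32_t subport, uint32_t pipe, uint32_t traffic_class,
		uint32_t queue, uint32_t color)
{
	/* Debug-only range checks; compiled out when NDEBUG is defined. */
	assert(subport <= SCHED_SUBPORT_MASK);
	assert(traffic_class <= SCHED_TC_MASK);
	assert(queue <= SCHED_QUEUE_MASK);
	assert(color <= SCHED_COLOR_MASK);

	return ((uint64_t)(queue & SCHED_QUEUE_MASK) << SCHED_QUEUE_SHIFT) |
	       ((uint64_t)(traffic_class & SCHED_TC_MASK) << SCHED_TC_SHIFT) |
	       ((uint64_t)(color & SCHED_COLOR_MASK) << SCHED_COLOR_SHIFT) |
	       ((uint64_t)(subport & SCHED_SUBPORT_MASK) << SCHED_SUBPORT_SHIFT) |
	       ((uint64_t)pipe << SCHED_PIPE_SHIFT);
}

/* Recover the tree path from the word with shifts and masks. */
static inline void
sched_path_unpack(uint64_t word, uint32_t *subport, uint32_t *pipe,
		  uint32_t *traffic_class, uint32_t *queue)
{
	*queue = (uint32_t)(word >> SCHED_QUEUE_SHIFT) & SCHED_QUEUE_MASK;
	*traffic_class = (uint32_t)(word >> SCHED_TC_SHIFT) & SCHED_TC_MASK;
	*subport = (uint32_t)(word >> SCHED_SUBPORT_SHIFT) & SCHED_SUBPORT_MASK;
	*pipe = (uint32_t)(word >> SCHED_PIPE_SHIFT);
}

/* Recover the color field only. */
static inline uint32_t
sched_path_color(uint64_t word)
{
	return (uint32_t)(word >> SCHED_COLOR_SHIFT) & SCHED_COLOR_MASK;
}
```

The point of building the word in one expression is that the write path becomes a single 64-bit store rather than a series of read-modify-write sequences on individual bitfields. How the word maps onto the lo/hi halves of hash.sched, and whether this actually beats the bitfield version on a given compiler, would still need to be checked and measured.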