* [dpdk-dev] [PATCH] mbuf: extend rte_mbuf_prefetch_part* to support more prefetching methods
@ 2016-05-31 3:06 Jianbo Liu
2016-05-31 19:28 ` Olivier MATZ
2016-05-31 20:00 ` Stephen Hemminger
0 siblings, 2 replies; 10+ messages in thread
From: Jianbo Liu @ 2016-05-31 3:06 UTC (permalink / raw)
To: olivier.matz, jerin.jacob, dev; +Cc: Jianbo Liu
Change the rte_mbuf_prefetch_part* inline functions into macros that take the prefetch method as a parameter.
Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
---
drivers/net/fm10k/fm10k_rxtx_vec.c | 8 ++++----
drivers/net/i40e/i40e_rxtx_vec.c | 8 ++++----
drivers/net/ixgbe/ixgbe_rxtx_vec.c | 8 ++++----
drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c | 12 ++++++++----
drivers/net/mlx4/mlx4.c | 4 ++--
drivers/net/mlx5/mlx5_rxtx.c | 4 ++--
examples/ipsec-secgw/ipsec-secgw.c | 2 +-
lib/librte_mbuf/rte_mbuf.h | 25 +++++++++++++------------
8 files changed, 38 insertions(+), 33 deletions(-)
diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c b/drivers/net/fm10k/fm10k_rxtx_vec.c
index ef256a5..0e4c91c 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -487,10 +487,10 @@ fm10k_recv_raw_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
rte_compiler_barrier();
if (split_packet) {
- rte_mbuf_prefetch_part2(rx_pkts[pos]);
- rte_mbuf_prefetch_part2(rx_pkts[pos + 1]);
- rte_mbuf_prefetch_part2(rx_pkts[pos + 2]);
- rte_mbuf_prefetch_part2(rx_pkts[pos + 3]);
+ RTE_MBUF_PREFETCH_PART2(prefetch0, rx_pkts[pos]);
+ RTE_MBUF_PREFETCH_PART2(prefetch0, rx_pkts[pos + 1]);
+ RTE_MBUF_PREFETCH_PART2(prefetch0, rx_pkts[pos + 2]);
+ RTE_MBUF_PREFETCH_PART2(prefetch0, rx_pkts[pos + 3]);
}
/* D.1 pkt 3,4 convert format from desc to pktmbuf */
diff --git a/drivers/net/i40e/i40e_rxtx_vec.c b/drivers/net/i40e/i40e_rxtx_vec.c
index eef80d9..a5c4847 100644
--- a/drivers/net/i40e/i40e_rxtx_vec.c
+++ b/drivers/net/i40e/i40e_rxtx_vec.c
@@ -297,10 +297,10 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
_mm_storeu_si128((__m128i *)&rx_pkts[pos+2], mbp2);
if (split_packet) {
- rte_mbuf_prefetch_part2(rx_pkts[pos]);
- rte_mbuf_prefetch_part2(rx_pkts[pos + 1]);
- rte_mbuf_prefetch_part2(rx_pkts[pos + 2]);
- rte_mbuf_prefetch_part2(rx_pkts[pos + 3]);
+ RTE_MBUF_PREFETCH_PART2(prefetch0, rx_pkts[pos]);
+ RTE_MBUF_PREFETCH_PART2(prefetch0, rx_pkts[pos + 1]);
+ RTE_MBUF_PREFETCH_PART2(prefetch0, rx_pkts[pos + 2]);
+ RTE_MBUF_PREFETCH_PART2(prefetch0, rx_pkts[pos + 3]);
}
/* avoid compiler reorder optimization */
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
index 09f4892..55adb56 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
@@ -308,10 +308,10 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
_mm_storeu_si128((__m128i *)&rx_pkts[pos+2], mbp2);
if (split_packet) {
- rte_mbuf_prefetch_part2(rx_pkts[pos]);
- rte_mbuf_prefetch_part2(rx_pkts[pos + 1]);
- rte_mbuf_prefetch_part2(rx_pkts[pos + 2]);
- rte_mbuf_prefetch_part2(rx_pkts[pos + 3]);
+ RTE_MBUF_PREFETCH_PART2(prefetch0, rx_pkts[pos]);
+ RTE_MBUF_PREFETCH_PART2(prefetch0, rx_pkts[pos + 1]);
+ RTE_MBUF_PREFETCH_PART2(prefetch0, rx_pkts[pos + 2]);
+ RTE_MBUF_PREFETCH_PART2(prefetch0, rx_pkts[pos + 3]);
}
/* avoid compiler reorder optimization */
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
index 9c1d124..941b2d5 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
@@ -280,10 +280,14 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
vst1q_u64((uint64_t *)&rx_pkts[pos + 2], mbp2);
if (split_packet) {
- rte_prefetch_non_temporal(&rx_pkts[pos]->cacheline1);
- rte_prefetch_non_temporal(&rx_pkts[pos + 1]->cacheline1);
- rte_prefetch_non_temporal(&rx_pkts[pos + 2]->cacheline1);
- rte_prefetch_non_temporal(&rx_pkts[pos + 3]->cacheline1);
+ RTE_MBUF_PREFETCH_PART2(prefetch_non_temporal,
+ rx_pkts[pos]);
+ RTE_MBUF_PREFETCH_PART2(prefetch_non_temporal,
+ rx_pkts[pos + 1]);
+ RTE_MBUF_PREFETCH_PART2(prefetch_non_temporal,
+ rx_pkts[pos + 2]);
+ RTE_MBUF_PREFETCH_PART2(prefetch_non_temporal,
+ rx_pkts[pos + 3]);
}
/* D.1 pkt 3,4 convert format from desc to pktmbuf */
diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 9ed1491..677ca02 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -3283,8 +3283,8 @@ mlx4_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
* Fetch initial bytes of packet descriptor into a
* cacheline while allocating rep.
*/
- rte_mbuf_prefetch_part1(seg);
- rte_mbuf_prefetch_part2(seg);
+ RTE_MBUF_PREFETCH_PART1(prefetch0, seg);
+ RTE_MBUF_PREFETCH_PART2(prefetch0, seg);
ret = rxq->if_cq->poll_length_flags(rxq->cq, NULL, NULL,
&flags);
if (unlikely(ret < 0)) {
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 29bfcec..3d853c5 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1134,8 +1134,8 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
* Fetch initial bytes of packet descriptor into a
* cacheline while allocating rep.
*/
- rte_mbuf_prefetch_part1(seg);
- rte_mbuf_prefetch_part2(seg);
+ RTE_MBUF_PREFETCH_PART1(prefetch0, seg);
+ RTE_MBUF_PREFETCH_PART2(prefetch0, seg);
ret = rxq->poll(rxq->cq, NULL, NULL, &flags, &vlan_tci);
if (unlikely(ret < 0)) {
struct ibv_wc wc;
diff --git a/examples/ipsec-secgw/ipsec-secgw.c b/examples/ipsec-secgw/ipsec-secgw.c
index ebd7c23..2da94b3 100644
--- a/examples/ipsec-secgw/ipsec-secgw.c
+++ b/examples/ipsec-secgw/ipsec-secgw.c
@@ -298,7 +298,7 @@ prepare_tx_burst(struct rte_mbuf *pkts[], uint16_t nb_pkts, uint8_t port)
const int32_t prefetch_offset = 2;
for (i = 0; i < (nb_pkts - prefetch_offset); i++) {
- rte_mbuf_prefetch_part2(pkts[i + prefetch_offset]);
+ RTE_MBUF_PREFETCH_PART2(prefetch0, pkts[i + prefetch_offset]);
prepare_tx_pkt(pkts[i], port);
}
/* Process left packets */
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 11fa06d..f01754c 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -849,14 +849,15 @@ struct rte_mbuf {
* in the receive path. If the cache line of the architecture is higher than
* 64B, the second part will also be prefetched.
*
+ * @param method
+ * The prefetch method: prefetch0, prefetch1, prefetch2 or
+ * prefetch_non_temporal.
+ *
* @param m
* The pointer to the mbuf.
*/
-static inline void
-rte_mbuf_prefetch_part1(struct rte_mbuf *m)
-{
- rte_prefetch0(&m->cacheline0);
-}
+#define RTE_MBUF_PREFETCH_PART1(method, m) \
+ rte_##method(&(m)->cacheline0)
/**
* Prefetch the second part of the mbuf
@@ -866,19 +867,19 @@ rte_mbuf_prefetch_part1(struct rte_mbuf *m)
* this function does nothing as it is expected that the full mbuf is
* already in cache.
*
+ * @param method
+ * The prefetch method: prefetch0, prefetch1, prefetch2 or
+ * prefetch_non_temporal.
+ *
* @param m
* The pointer to the mbuf.
*/
-static inline void
-rte_mbuf_prefetch_part2(struct rte_mbuf *m)
-{
#if RTE_CACHE_LINE_SIZE == 64
- rte_prefetch0(&m->cacheline1);
+#define RTE_MBUF_PREFETCH_PART2(method, m) \
+ rte_##method(&(m)->cacheline1)
#else
- RTE_SET_USED(m);
+#define RTE_MBUF_PREFETCH_PART2(method, m)
#endif
-}
-
static inline uint16_t rte_pktmbuf_priv_size(struct rte_mempool *mp);
--
2.4.11
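For readers following the patch, here is a minimal sketch of how the token-pasting macro selects a prefetch routine. All demo_* and DEMO_* names are hypothetical stand-ins (they are not DPDK symbols), and the stand-in routines only record which variant was called instead of issuing a real prefetch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_mbuf: two 64-byte cache lines. */
struct demo_mbuf {
	char cacheline0[64];
	char cacheline1[64];
};

/* Records of the last call, so the expansion can be observed. */
static const void *last_addr;
static int last_method; /* 0 = prefetch0, 1 = prefetch_non_temporal */

/* Stand-in for rte_prefetch0(): records the call instead of prefetching. */
static inline void demo_prefetch0(const volatile void *p)
{
	last_addr = (const void *)p;
	last_method = 0;
}

/* Stand-in for rte_prefetch_non_temporal(). */
static inline void demo_prefetch_non_temporal(const volatile void *p)
{
	last_addr = (const void *)p;
	last_method = 1;
}

/* Same shape as the proposed macro: the method name is token-pasted
 * onto a prefix to form the name of the routine to call. */
#define DEMO_MBUF_PREFETCH_PART2(method, m) \
	demo_##method(&(m)->cacheline1)

/* Expands to demo_prefetch_non_temporal(&(m)->cacheline1).
 * Returns 1 if the expected routine ran on the expected address. */
static int demo_run(struct demo_mbuf *m)
{
	DEMO_MBUF_PREFETCH_PART2(prefetch_non_temporal, m);
	return last_method == 1 && last_addr == (const void *)&m->cacheline1;
}
```

Because the first argument is pasted verbatim, any identifier that yields an existing demo_* function name is accepted by the preprocessor.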
* Re: [dpdk-dev] [PATCH] mbuf: extend rte_mbuf_prefetch_part* to support more prefetching methods
2016-05-31 3:06 [dpdk-dev] [PATCH] mbuf: extend rte_mbuf_prefetch_part* to support more prefetching methods Jianbo Liu
@ 2016-05-31 19:28 ` Olivier MATZ
2016-06-01 3:29 ` Jianbo Liu
2016-05-31 20:00 ` Stephen Hemminger
1 sibling, 1 reply; 10+ messages in thread
From: Olivier MATZ @ 2016-05-31 19:28 UTC (permalink / raw)
To: Jianbo Liu, jerin.jacob, dev
Hi Jianbo,
On 05/31/2016 05:06 AM, Jianbo Liu wrote:
> Change the inline function to macro with parameters
>
> Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
>
> [...]
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -849,14 +849,15 @@ struct rte_mbuf {
> * in the receive path. If the cache line of the architecture is higher than
> * 64B, the second part will also be prefetched.
> *
> + * @param method
> + * The prefetch method: prefetch0, prefetch1, prefetch2 or
> + * prefetch_non_temporal.
> + *
> * @param m
> * The pointer to the mbuf.
> */
> -static inline void
> -rte_mbuf_prefetch_part1(struct rte_mbuf *m)
> -{
> - rte_prefetch0(&m->cacheline0);
> -}
> +#define RTE_MBUF_PREFETCH_PART1(method, m) \
> + rte_##method(&(m)->cacheline0)
I'm not a big fan of this macro, because it allows
really anything to be done:
RTE_MBUF_PREFETCH_PART1(pktmbuf_free, m)
would expand as:
rte_pktmbuf_free(m)
I'd prefer to have a switch case like this, almost similar
to what Keith proposed in the initial discussion for my
patch:
enum rte_mbuf_prefetch_type {
    PREFETCH0,
    PREFETCH1,
    ...
};
static inline void
rte_mbuf_prefetch_part1(enum rte_mbuf_prefetch_type type,
    struct rte_mbuf *m)
{
    switch (type) {
    case PREFETCH0:
        rte_prefetch0(&m->cacheline0);
        break;
    case PREFETCH1:
        rte_prefetch1(&m->cacheline0);
        break;
    ...
    }
}
Some questions: could you give some details about the use
of non-temporal prefetch in ixgbe_vec_neon? What are the
pros and cons, and would it be useful in other drivers?
Currently all drivers are doing prefetch0 when they prefetch
the mbuf structure. Some drivers use prefetch1 for data.
By the way, I did not try to apply the patch, but it looks
like it's on top of dpdk-next-net/rel_16_07, right?
Thanks,
Olivier
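A complete, compilable version of the enum-plus-switch idea above might look like the following sketch. The demo_* names and the DEMO_PREFETCH_NON_TEMPORAL enumerator are assumptions made for illustration; the stand-in routines count calls rather than prefetching, so the dispatch can be checked.

```c
#include <assert.h>

/* Hypothetical stand-in for struct rte_mbuf. */
struct demo_mbuf {
	char cacheline0[64];
	char cacheline1[64];
};

/* Closed set of accepted prefetch methods (names are illustrative). */
enum demo_prefetch_type {
	DEMO_PREFETCH0,
	DEMO_PREFETCH1,
	DEMO_PREFETCH_NON_TEMPORAL,
};

/* One call counter per method, indexed by the enum value. */
static int demo_calls[3];

/* Stand-ins for rte_prefetch0/1/non_temporal: count instead of prefetch. */
static inline void demo_prefetch0(const volatile void *p)
{
	(void)p;
	demo_calls[DEMO_PREFETCH0]++;
}

static inline void demo_prefetch1(const volatile void *p)
{
	(void)p;
	demo_calls[DEMO_PREFETCH1]++;
}

static inline void demo_prefetch_non_temporal(const volatile void *p)
{
	(void)p;
	demo_calls[DEMO_PREFETCH_NON_TEMPORAL]++;
}

/* Typed dispatcher: both arguments are checked by the compiler. */
static inline void
demo_mbuf_prefetch_part1(enum demo_prefetch_type type, struct demo_mbuf *m)
{
	switch (type) {
	case DEMO_PREFETCH0:
		demo_prefetch0(&m->cacheline0);
		break;
	case DEMO_PREFETCH1:
		demo_prefetch1(&m->cacheline0);
		break;
	case DEMO_PREFETCH_NON_TEMPORAL:
		demo_prefetch_non_temporal(&m->cacheline0);
		break;
	}
}
```

When the type argument is a compile-time constant and the function is inlined, an optimizing compiler can usually reduce the switch to the single selected call, so this form need not cost more than the macro.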
* Re: [dpdk-dev] [PATCH] mbuf: extend rte_mbuf_prefetch_part* to support more prefetching methods
2016-05-31 19:28 ` Olivier MATZ
@ 2016-06-01 3:29 ` Jianbo Liu
2016-06-01 6:00 ` Jerin Jacob
2016-06-02 7:10 ` Olivier MATZ
0 siblings, 2 replies; 10+ messages in thread
From: Jianbo Liu @ 2016-06-01 3:29 UTC (permalink / raw)
To: Olivier MATZ; +Cc: Jerin Jacob, dev
On 1 June 2016 at 03:28, Olivier MATZ <olivier.matz@6wind.com> wrote:
> Hi Jianbo,
>
> On 05/31/2016 05:06 AM, Jianbo Liu wrote:
>> Change the inline function to macro with parameters
>>
>> Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
>>
>> [...]
>> --- a/lib/librte_mbuf/rte_mbuf.h
>> +++ b/lib/librte_mbuf/rte_mbuf.h
>> @@ -849,14 +849,15 @@ struct rte_mbuf {
>> * in the receive path. If the cache line of the architecture is higher than
>> * 64B, the second part will also be prefetched.
>> *
>> + * @param method
>> + * The prefetch method: prefetch0, prefetch1, prefetch2 or
>> + * prefetch_non_temporal.
>> + *
>> * @param m
>> * The pointer to the mbuf.
>> */
>> -static inline void
>> -rte_mbuf_prefetch_part1(struct rte_mbuf *m)
>> -{
>> - rte_prefetch0(&m->cacheline0);
>> -}
>> +#define RTE_MBUF_PREFETCH_PART1(method, m) \
>> + rte_##method(&(m)->cacheline0)
>
> I'm not very fan of this macro, because it allows to
> really do everything):
>
> RTE_MBUF_PREFETCH_PART1(pktmbuf_free, m)
>
> would expand as:
>
> rte_pktmbuf_free(m)
>
>
> I'd prefer to have a switch case like this, almost similar
> to what Keith proposed in the initial discussion for my
> patch:
>
> enum rte_mbuf_prefetch_type {
> PREFETCH0,
> PREFETCH1,
> ...
> };
>
> static inline void
> rte_mbuf_prefetch_part1(enum rte_mbuf_prefetch_type type,
> struct rte_mbuf *m)
> {
> switch (type) {
> case PREFETCH0:
> rte_prefetch0(&m->cacheline0);
> break;
> case PREFETCH1:
> rte_prefetch1(&m->cacheline0);
> break;
> ...
> }
>
How about adding these to forbid illegal use of this macro?
enum rte_mbuf_prefetch_type {
    ENUM_prefetch0,
    ENUM_prefetch1,
    ...
};
#define RTE_MBUF_PREFETCH_PART1(type, m) \
    if (ENUM_##type == ENUM_prefetch0) \
        rte_prefetch0(&(m)->cacheline0); \
    else if (ENUM_##type == ENUM_prefetch1) \
        rte_prefetch1(&(m)->cacheline0); \
    ....
>
> Some questions: could you give some details about the use
> of non-temporal prefetch in ixgbe_vec_neon? What are the
> pros and cons, and would it be useful in other drivers?
> Currently all drivers are doing prefetch0 when they prefetch
> the mbuf structure. Some drivers use prefetch1 for data.
>
It's for performance reasons, and only on the armv8a platform.
>
> By the way, I did not try to apply the patch, but it looks
> it's on top of dpdk-next-net/rel_16_07, right?
>
Yes
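The guarded macro suggested in this reply can be sketched as below. This is an illustration only: the demo_* and DEMO_* names are hypothetical, the stand-in routines count calls instead of prefetching, and the do { } while (0) wrapper is an addition (not part of the proposal) so that the macro behaves as a single statement inside if/else.

```c
#include <assert.h>

/* Hypothetical stand-in for struct rte_mbuf. */
struct demo_mbuf {
	char cacheline0[64];
	char cacheline1[64];
};

/* Only names listed here can be pasted after ENUM_ successfully. */
enum demo_prefetch_type {
	ENUM_prefetch0,
	ENUM_prefetch1,
};

static int demo_calls[2];

/* Stand-ins for rte_prefetch0/1: count calls instead of prefetching. */
static inline void demo_prefetch0(const volatile void *p)
{
	(void)p;
	demo_calls[ENUM_prefetch0]++;
}

static inline void demo_prefetch1(const volatile void *p)
{
	(void)p;
	demo_calls[ENUM_prefetch1]++;
}

/* ENUM_##type must name an existing enumerator, otherwise the
 * expansion refers to an undeclared identifier and fails to compile. */
#define DEMO_MBUF_PREFETCH_PART1(type, m) do {			\
	if (ENUM_##type == ENUM_prefetch0)			\
		demo_prefetch0(&(m)->cacheline0);		\
	else if (ENUM_##type == ENUM_prefetch1)			\
		demo_prefetch1(&(m)->cacheline0);		\
} while (0)

static void demo_use(struct demo_mbuf *m, int cond)
{
	if (cond)
		DEMO_MBUF_PREFETCH_PART1(prefetch0, m);
	else
		DEMO_MBUF_PREFETCH_PART1(prefetch1, m);
	/* DEMO_MBUF_PREFETCH_PART1(pktmbuf_free, m) would be rejected:
	 * ENUM_pktmbuf_free is not declared anywhere. */
}
```

The comparison is between constants, so a compiler can normally discard the branches that are not taken. The mbuf argument is still not type-checked by the macro itself.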
* Re: [dpdk-dev] [PATCH] mbuf: extend rte_mbuf_prefetch_part* to support more prefetching methods
2016-06-01 3:29 ` Jianbo Liu
@ 2016-06-01 6:00 ` Jerin Jacob
2016-06-02 9:04 ` Jianbo Liu
2016-06-02 7:10 ` Olivier MATZ
1 sibling, 1 reply; 10+ messages in thread
From: Jerin Jacob @ 2016-06-01 6:00 UTC (permalink / raw)
To: Jianbo Liu; +Cc: Olivier MATZ, dev
On Wed, Jun 01, 2016 at 11:29:47AM +0800, Jianbo Liu wrote:
> On 1 June 2016 at 03:28, Olivier MATZ <olivier.matz@6wind.com> wrote:
> > Hi Jianbo,
> >
> > On 05/31/2016 05:06 AM, Jianbo Liu wrote:
> >> Change the inline function to macro with parameters
> >>
> >> Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
> >>
> >> [...]
> >> --- a/lib/librte_mbuf/rte_mbuf.h
> >> +++ b/lib/librte_mbuf/rte_mbuf.h
> >> @@ -849,14 +849,15 @@ struct rte_mbuf {
> >> * in the receive path. If the cache line of the architecture is higher than
> >> * 64B, the second part will also be prefetched.
> >> *
> >> + * @param method
> >> + * The prefetch method: prefetch0, prefetch1, prefetch2 or
> >> + * prefetch_non_temporal.
> >> + *
> >> * @param m
> >> * The pointer to the mbuf.
> >> */
> >> -static inline void
> >> -rte_mbuf_prefetch_part1(struct rte_mbuf *m)
> >> -{
> >> - rte_prefetch0(&m->cacheline0);
> >> -}
> >> +#define RTE_MBUF_PREFETCH_PART1(method, m) \
> >> + rte_##method(&(m)->cacheline0)
> >
> > I'm not very fan of this macro, because it allows to
> > really do everything):
> >
> > RTE_MBUF_PREFETCH_PART1(pktmbuf_free, m)
> >
> > would expand as:
> >
> > rte_pktmbuf_free(m)
> >
> >
> > I'd prefer to have a switch case like this, almost similar
> > to what Keith proposed in the initial discussion for my
> > patch:
> >
> > enum rte_mbuf_prefetch_type {
> > PREFETCH0,
> > PREFETCH1,
> > ...
> > };
> >
> > static inline void
> > rte_mbuf_prefetch_part1(enum rte_mbuf_prefetch_type type,
> > struct rte_mbuf *m)
> > {
> > switch (type) {
> > case PREFETCH0:
> > rte_prefetch0(&m->cacheline0);
> > break;
> > case PREFETCH1:
> > rte_prefetch1(&m->cacheline0);
> > break;
> > ...
> > }
> >
> How about adding these to forbid the illegal use of this macro?
> enum rte_mbuf_prefetch_type {
> ENUM_prefetch0,
> ENUM_prefetch1,
> ...
> };
>
> #define RTE_MBUF_PREFETCH_PART1(type, m) \
> if (ENUM_##type == ENUM_prefretch0) \
> rte_prefetch0(&(m)->cacheline0); \
> else if (ENUM_##type == ENUM_prefetch1) \
> rte_prefetch1(&(m)->cacheline0); \
> ....
>
> >
> > Some questions: could you give some details about the use
> > of non-temporal prefetch in ixgbe_vec_neon? What are the
> > pros and cons, and would it be useful in other drivers?
> > Currently all drivers are doing prefetch0 when they prefetch
> > the mbuf structure. Some drivers use prefetch1 for data.
> >
> It's for performance consideration, and only on armv8a platform.
Strictly speaking, it is not armv8 specific; IA also implements this API, with
the _MM_HINT_NTA hint.
Do we really need the non-temporal/transient version of prefetch for ixgbe?
If so, it would make sense to use it on x86 as well, right?
The primary use case for the transient version would be pipeline
mode, where the same CPU won't consume the packet.
/**
 * Prefetch a cache line into all cache levels (non-temporal/transient
 * version)
 *
 * The non-temporal prefetch is intended as a prefetch hint that
 * processor will use the prefetched data only once or short period,
 * unlike the rte_prefetch0() function which imply that prefetched
 * data to use repeatedly.
 *
 * @param p
 *   Address to prefetch
 */
static inline void rte_prefetch_non_temporal(const volatile void *p);
>
> >
> > By the way, I did not try to apply the patch, but it looks
> > it's on top of dpdk-next-net/rel_16_07, right?
> >
> Yes
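To make the temporal versus non-temporal distinction concrete, here is a small sketch that assumes a GCC- or Clang-compatible compiler and uses its __builtin_prefetch builtin as a stand-in for the DPDK routines. The demo_* names are hypothetical. The third argument of the builtin is the locality hint: 3 asks for the data to be kept in all cache levels, 0 says the data has no temporal locality.

```c
#include <assert.h>
#include <stddef.h>

/* Temporal hint: the data is expected to be reused, keep it cached. */
static inline void demo_prefetch_temporal(const void *p)
{
	__builtin_prefetch(p, 0, 3);
}

/* Non-temporal hint: the data is expected to be used once or briefly. */
static inline void demo_prefetch_non_temporal(const void *p)
{
	__builtin_prefetch(p, 0, 0);
}

/* Read every byte exactly once, prefetching one 64-byte line ahead.
 * A single pass over the data is the kind of access pattern the
 * non-temporal hint is meant for. */
static unsigned long demo_sum_once(const unsigned char *buf, size_t len)
{
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		if ((i % 64) == 0 && i + 64 < len)
			demo_prefetch_non_temporal(buf + i + 64);
		sum += buf[i];
	}
	return sum;
}

/* Read the same small table several times; here the temporal hint
 * is the natural choice because the lines are reused. */
static unsigned long demo_sum_repeated(const unsigned char *buf, size_t len,
				       int rounds)
{
	unsigned long sum = 0;
	size_t i;
	int r;

	demo_prefetch_temporal(buf);
	for (r = 0; r < rounds; r++)
		for (i = 0; i < len; i++)
			sum += buf[i];
	return sum;
}
```

A prefetch is only a hint: it never changes the result of the computation, and whether either variant helps depends on the CPU and on which core later touches the data.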
* Re: [dpdk-dev] [PATCH] mbuf: extend rte_mbuf_prefetch_part* to support more prefetching methods
2016-06-01 6:00 ` Jerin Jacob
@ 2016-06-02 9:04 ` Jianbo Liu
2016-06-02 9:30 ` Jerin Jacob
0 siblings, 1 reply; 10+ messages in thread
From: Jianbo Liu @ 2016-06-02 9:04 UTC (permalink / raw)
To: Jerin Jacob; +Cc: Olivier MATZ, dev
On 1 June 2016 at 14:00, Jerin Jacob <jerin.jacob@caviumnetworks.com> wrote:
> On Wed, Jun 01, 2016 at 11:29:47AM +0800, Jianbo Liu wrote:
>> On 1 June 2016 at 03:28, Olivier MATZ <olivier.matz@6wind.com> wrote:
>> > Hi Jianbo,
>> >
>> > On 05/31/2016 05:06 AM, Jianbo Liu wrote:
>> >> Change the inline function to macro with parameters
>> >>
>> >> Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
>> >>
>> >> [...]
[...]
>> It's for performance consideration, and only on armv8a platform.
>
> Strictly it is not armv8 specific, IA also implemented this API with
> _MM_HINT_NTA hint.
I mean this patch only affects the ixgbe vector PMD on the armv8 platform.
>
> Do we really need non-temporal/transient version of prefetch for ixgbe?
Strictly speaking, we don't have to, since we don't know how applications use
the mbuf header.
But isn't it highly likely that the second part is used only once or
for a short period, given that prefetching is done only when split_packet is not
NULL?
> If so, for x86 also it makes sense to keep it? Right?
>
> The primary use case for transient version would be use with pipe line
> line mode where the same cpu wont consume the packet.
>
> /**
> * Prefetch a cache line into all cache levels (non-temporal/transient
> * version)
> *
> * The non-temporal prefetch is intended as a prefetch hint that
> * processor will
> * use the prefetched data only once or short period, unlike the
> * rte_prefetch0() function which imply that prefetched data to use
> * repeatedly.
> *
> * @param p
> * Address to prefetch
> */
> static inline void rte_prefetch_non_temporal(const volatile void *p);
>
>>
>> >
>> > By the way, I did not try to apply the patch, but it looks
>> > it's on top of dpdk-next-net/rel_16_07, right?
>> >
>> Yes
* Re: [dpdk-dev] [PATCH] mbuf: extend rte_mbuf_prefetch_part* to support more prefetching methods
2016-06-02 9:04 ` Jianbo Liu
@ 2016-06-02 9:30 ` Jerin Jacob
2016-06-21 14:56 ` Olivier Matz
0 siblings, 1 reply; 10+ messages in thread
From: Jerin Jacob @ 2016-06-02 9:30 UTC (permalink / raw)
To: Jianbo Liu; +Cc: Olivier MATZ, dev
On Thu, Jun 02, 2016 at 05:04:13PM +0800, Jianbo Liu wrote:
> On 1 June 2016 at 14:00, Jerin Jacob <jerin.jacob@caviumnetworks.com> wrote:
> > On Wed, Jun 01, 2016 at 11:29:47AM +0800, Jianbo Liu wrote:
> >> On 1 June 2016 at 03:28, Olivier MATZ <olivier.matz@6wind.com> wrote:
> >> > Hi Jianbo,
> >> >
> >> > On 05/31/2016 05:06 AM, Jianbo Liu wrote:
> >> >> Change the inline function to macro with parameters
> >> >>
> >> >> Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
> >> >>
> >> >> [...]
> [...]
> >> It's for performance consideration, and only on armv8a platform.
> >
> > Strictly it is not armv8 specific, IA also implemented this API with
> > _MM_HINT_NTA hint.
>
> I mean this patch is only for ixgbe vector PMD on armv8 platform.
>
> >
> > Do we really need non-temporal/transient version of prefetch for ixgbe?
>
> Strictly speaking, we don't have to since we don't know how APPs use
> the mbuf header.
Then IMO it makes sense to keep the same behavior as the x86 ixgbe driver.
On the upside, we may then not need the new macros for part prefetching.
Jerin
> But, is it high possibility that the second part is used only once or
> short period because prefetching is done only when split_packet is not
> NULL?
>
> > If so, for x86 also it makes sense to keep it? Right?
> >
> > The primary use case for transient version would be use with pipe line
> > line mode where the same cpu wont consume the packet.
> >
> > /**
> > * Prefetch a cache line into all cache levels (non-temporal/transient
> > * version)
> > *
> > * The non-temporal prefetch is intended as a prefetch hint that
> > * processor will
> > * use the prefetched data only once or short period, unlike the
> > * rte_prefetch0() function which imply that prefetched data to use
> > * repeatedly.
> > *
> > * @param p
> > * Address to prefetch
> > */
> > static inline void rte_prefetch_non_temporal(const volatile void *p);
> >
> >>
> >> >
> >> > By the way, I did not try to apply the patch, but it looks
> >> > it's on top of dpdk-next-net/rel_16_07, right?
> >> >
> >> Yes
* Re: [dpdk-dev] [PATCH] mbuf: extend rte_mbuf_prefetch_part* to support more prefetching methods
2016-06-02 9:30 ` Jerin Jacob
@ 2016-06-21 14:56 ` Olivier Matz
0 siblings, 0 replies; 10+ messages in thread
From: Olivier Matz @ 2016-06-21 14:56 UTC (permalink / raw)
To: Jerin Jacob, Jianbo Liu; +Cc: dev
Hi,
On 06/02/2016 11:30 AM, Jerin Jacob wrote:
> On Thu, Jun 02, 2016 at 05:04:13PM +0800, Jianbo Liu wrote:
>> On 1 June 2016 at 14:00, Jerin Jacob <jerin.jacob@caviumnetworks.com> wrote:
>>> On Wed, Jun 01, 2016 at 11:29:47AM +0800, Jianbo Liu wrote:
>>>> On 1 June 2016 at 03:28, Olivier MATZ <olivier.matz@6wind.com> wrote:
>>>>> Hi Jianbo,
>>>>>
>>>>> On 05/31/2016 05:06 AM, Jianbo Liu wrote:
>>>>>> Change the inline function to macro with parameters
>>>>>>
>>>>>> Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
>>>>>>
>>>>>> [...]
>> [...]
>>>> It's for performance consideration, and only on armv8a platform.
>>>
>>> Strictly it is not armv8 specific, IA also implemented this API with
>>> _MM_HINT_NTA hint.
>>
>> I mean this patch is only for ixgbe vector PMD on armv8 platform.
>>
>>>
>>> Do we really need non-temporal/transient version of prefetch for ixgbe?
>>
>> Strictly speaking, we don't have to since we don't know how APPs use
>> the mbuf header.
>
> Then IMO it makes sense to keep the same behavior as x86 ixgbe driver.
> Then on the upside, We may not need the new macros for part prefetching
>
> Jerin
Knowing that http://www.dpdk.org/dev/patchwork/patch/13992/ has been
submitted, I think this patch can be marked as closed in patchwork.
* Re: [dpdk-dev] [PATCH] mbuf: extend rte_mbuf_prefetch_part* to support more prefetching methods
2016-06-01 3:29 ` Jianbo Liu
2016-06-01 6:00 ` Jerin Jacob
@ 2016-06-02 7:10 ` Olivier MATZ
2016-06-02 9:12 ` Jianbo Liu
1 sibling, 1 reply; 10+ messages in thread
From: Olivier MATZ @ 2016-06-02 7:10 UTC (permalink / raw)
To: Jianbo Liu; +Cc: Jerin Jacob, dev
Hi Jianbo,
On 06/01/2016 05:29 AM, Jianbo Liu wrote:
>> enum rte_mbuf_prefetch_type {
>> > PREFETCH0,
>> > PREFETCH1,
>> > ...
>> > };
>> >
>> > static inline void
>> > rte_mbuf_prefetch_part1(enum rte_mbuf_prefetch_type type,
>> > struct rte_mbuf *m)
>> > {
>> > switch (type) {
>> > case PREFETCH0:
>> > rte_prefetch0(&m->cacheline0);
>> > break;
>> > case PREFETCH1:
>> > rte_prefetch1(&m->cacheline0);
>> > break;
>> > ...
>> > }
>> >
> How about adding these to forbid the illegal use of this macro?
> enum rte_mbuf_prefetch_type {
> ENUM_prefetch0,
> ENUM_prefetch1,
> ...
> };
>
> #define RTE_MBUF_PREFETCH_PART1(type, m) \
> if (ENUM_##type == ENUM_prefretch0) \
> rte_prefetch0(&(m)->cacheline0); \
> else if (ENUM_##type == ENUM_prefetch1) \
> rte_prefetch1(&(m)->cacheline0); \
> ....
>
As Stephen stated, a static inline is better than a macro, mainly
because it is understood by the compiler instead of being a dumb
code replacement.
Any reason why you would prefer a macro in that case?
Regards
Olivier
* Re: [dpdk-dev] [PATCH] mbuf: extend rte_mbuf_prefetch_part* to support more prefetching methods
2016-06-02 7:10 ` Olivier MATZ
@ 2016-06-02 9:12 ` Jianbo Liu
0 siblings, 0 replies; 10+ messages in thread
From: Jianbo Liu @ 2016-06-02 9:12 UTC (permalink / raw)
To: Olivier MATZ; +Cc: Jerin Jacob, dev
On 2 June 2016 at 15:10, Olivier MATZ <olivier.matz@6wind.com> wrote:
> Hi Jianbo,
>
> On 06/01/2016 05:29 AM, Jianbo Liu wrote:
>>> enum rte_mbuf_prefetch_type {
>>> > PREFETCH0,
>>> > PREFETCH1,
>>> > ...
>>> > };
>>> >
>>> > static inline void
>>> > rte_mbuf_prefetch_part1(enum rte_mbuf_prefetch_type type,
>>> > struct rte_mbuf *m)
>>> > {
>>> > switch (type) {
>>> > case PREFETCH0:
>>> > rte_prefetch0(&m->cacheline0);
>>> > break;
>>> > case PREFETCH1:
>>> > rte_prefetch1(&m->cacheline0);
>>> > break;
>>> > ...
>>> > }
>>> >
>> How about adding these to forbid the illegal use of this macro?
>> enum rte_mbuf_prefetch_type {
>> ENUM_prefetch0,
>> ENUM_prefetch1,
>> ...
>> };
>>
>> #define RTE_MBUF_PREFETCH_PART1(type, m) \
>> if (ENUM_##type == ENUM_prefretch0) \
>> rte_prefetch0(&(m)->cacheline0); \
>> else if (ENUM_##type == ENUM_prefetch1) \
>> rte_prefetch1(&(m)->cacheline0); \
>> ....
>>
>
> As Stephen stated, a static inline is better than a macro, mainly
> because it is understood by the compiler instead of beeing a dumb
> code replacement.
>
> Any reason why you would prefer a macro in that case?
>
For simplicity. Otherwise, we may have to write several
similar functions for the different prefetch types.
* Re: [dpdk-dev] [PATCH] mbuf: extend rte_mbuf_prefetch_part* to support more prefetching methods
2016-05-31 3:06 [dpdk-dev] [PATCH] mbuf: extend rte_mbuf_prefetch_part* to support more prefetching methods Jianbo Liu
2016-05-31 19:28 ` Olivier MATZ
@ 2016-05-31 20:00 ` Stephen Hemminger
1 sibling, 0 replies; 10+ messages in thread
From: Stephen Hemminger @ 2016-05-31 20:00 UTC (permalink / raw)
To: Jianbo Liu; +Cc: olivier.matz, jerin.jacob, dev
On Tue, 31 May 2016 08:36:06 +0530
Jianbo Liu <jianbo.liu@linaro.org> wrote:
> Change the inline function to macro with parameters
>
> Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
Going from typed (inline) to untyped (macro) is a step backwards
in code safety.
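The loss of type checking can be illustrated with a short sketch. The demo_* and DEMO_* names, and the unrelated struct demo_other, are hypothetical and exist only to show the effect; the stand-in prefetch routine counts calls instead of prefetching.

```c
#include <assert.h>

/* Hypothetical stand-in for struct rte_mbuf. */
struct demo_mbuf {
	char cacheline0[64];
	char cacheline1[64];
};

/* An unrelated type that happens to have a member of the same name. */
struct demo_other {
	int cacheline0;
};

static int demo_calls;

/* Stand-in for rte_prefetch0(): counts calls instead of prefetching. */
static inline void demo_prefetch0(const volatile void *p)
{
	(void)p;
	demo_calls++;
}

/* Untyped: accepts any expression whose target has a cacheline0 member. */
#define DEMO_PREFETCH_PART1(method, m) \
	demo_##method(&(m)->cacheline0)

/* Typed: the argument must be a struct demo_mbuf pointer. */
static inline void demo_mbuf_prefetch_part1(struct demo_mbuf *m)
{
	demo_prefetch0(&m->cacheline0);
}

static void demo_compare(struct demo_mbuf *m, struct demo_other *o)
{
	demo_mbuf_prefetch_part1(m);
	DEMO_PREFETCH_PART1(prefetch0, m);

	/* Compiles without complaint even though o is not an mbuf. */
	DEMO_PREFETCH_PART1(prefetch0, o);

	/* demo_mbuf_prefetch_part1(o) would be diagnosed by the compiler
	 * as passing an incompatible pointer type. */
}
```

The macro form silently accepts the wrong object, while the inline function gives the compiler enough information to reject it.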