DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
@ 2015-07-29 15:07 Martin Weiser
  2015-07-29 18:12 ` Ananyev, Konstantin
  0 siblings, 1 reply; 19+ messages in thread
From: Martin Weiser @ 2015-07-29 15:07 UTC (permalink / raw)
  To: helin.zhang, olivier.matz; +Cc: dev

Hi Helin, Hi Olivier,

we are seeing an issue with the ixgbe and i40e drivers which we could
track down to our setting of the private area size of the mbufs.
The issue can be easily reproduced with the l2fwd example application
when a small modification is done: just set the priv_size parameter in
the call to the rte_pktmbuf_pool_create function to an odd number like
1. In our tests this causes every call to rte_eth_rx_burst to return 32
(which is the setting of nb_pkts) nonsense mbufs although no packets are
received on the interface and the hardware counters do not report any
received packets.
Interestingly this does not happen if we force the scattered rx path.

I assume the drivers have some expectations regarding the alignment of
the buf_addr in the mbuf and setting an odd private are size breaks this
alignment in the rte_pktmbuf_init function. If this is the case then one
possible fix might be to enforce an alignment on the private area size.

Best regards,
Martin

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
  2015-07-29 15:07 [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd Martin Weiser
@ 2015-07-29 18:12 ` Ananyev, Konstantin
  2015-07-29 20:24   ` Zhang, Helin
  2015-08-02 22:32   ` [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd Thomas Monjalon
  0 siblings, 2 replies; 19+ messages in thread
From: Ananyev, Konstantin @ 2015-07-29 18:12 UTC (permalink / raw)
  To: Martin Weiser, Zhang, Helin, olivier.matz; +Cc: dev

Hi Martin,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Martin Weiser
> Sent: Wednesday, July 29, 2015 4:07 PM
> To: Zhang, Helin; olivier.matz@6wind.com
> Cc: dev@dpdk.org
> Subject: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
> 
> Hi Helin, Hi Olivier,
> 
> we are seeing an issue with the ixgbe and i40e drivers which we could
> track down to our setting of the private area size of the mbufs.
> The issue can be easily reproduced with the l2fwd example application
> when a small modification is done: just set the priv_size parameter in
> the call to the rte_pktmbuf_pool_create function to an odd number like
> 1. In our tests this causes every call to rte_eth_rx_burst to return 32
> (which is the setting of nb_pkts) nonsense mbufs although no packets are
> received on the interface and the hardware counters do not report any
> received packets.

From Niantic datasheet:

"7.1.6.1 Advanced Receive Descriptors — Read Format
Table 7-15 lists the advanced receive descriptor programming by the software. The
...
Packet Buffer Address (64)
This is the physical address of the packet buffer. The lowest bit is A0 (LSB of the
address).
Header Buffer Address (64)
The physical address of the header buffer with the lowest bit being Descriptor Done (DD).
When a packet spans in multiple descriptors, the header buffer address is used only on
the first descriptor. During the programming phase, software must set the DD bit to zero
(see the description of the DD bit in this section). This means that header buffer
addresses are always word aligned."

Right now, in ixgbe PMD we always setup  Packet Buffer Address (PBA)and Header Buffer Address (HBA) to the same value:
buf_physaddr + RTE_PKTMBUF_HEADROOM
So when pirv_size==1, DD bit in RXD is always set to one by SW itself, and then SW considers that HW already done with it.
In other words, right now for ixgbe you can't use RX buffer that is not aligned on word boundary.

So the advice would be, right now - don't set priv_size to the odd value.
As we don't support split header feature anyway, I think we can fix it just by  always setting HBA in the RXD to zero.
Could you try the fix for ixgbe below?

Same story with FVL, I believe.
Konstantin


> Interestingly this does not happen if we force the scattered rx path.
> 
> I assume the drivers have some expectations regarding the alignment of
> the buf_addr in the mbuf and setting an odd private are size breaks this
> alignment in the rte_pktmbuf_init function. If this is the case then one
> possible fix might be to enforce an alignment on the private area size.
> 
> Best regards,
> Martin

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index a0c8847..94967c5 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1183,7 +1183,7 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool reset_mbuf)

                /* populate the descriptors */
                dma_addr = rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
-               rxdp[i].read.hdr_addr = dma_addr;
+               rxdp[i].read.hdr_addr = 0;
                rxdp[i].read.pkt_addr = dma_addr;
        }

@@ -1414,7 +1414,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
                rxe->mbuf = nmb;
                dma_addr =
                        rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(nmb));
-               rxdp->read.hdr_addr = dma_addr;
+               rxdp->read.hdr_addr = 0;
                rxdp->read.pkt_addr = dma_addr;

                /*
@@ -1741,7 +1741,7 @@ next_desc:
                        rxe->mbuf = nmb;

                        rxm->data_off = RTE_PKTMBUF_HEADROOM;
-                       rxdp->read.hdr_addr = dma;
+                       rxdp->read.hdr_addr = 0;
                        rxdp->read.pkt_addr = dma;
                } else
                        rxe->mbuf = NULL;
@@ -3633,7 +3633,7 @@ ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
                dma_addr =
                        rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mbuf));
                rxd = &rxq->rx_ring[i];
-               rxd->read.hdr_addr = dma_addr;
+               rxd->read.hdr_addr = 0;
                rxd->read.pkt_addr = dma_addr;
                rxe[i].mbuf = mbuf;
        }
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
index 6c1647e..16a9c64 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
@@ -56,6 +56,8 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
                        RTE_PKTMBUF_HEADROOM);
        __m128i dma_addr0, dma_addr1;

+       const __m128i hba_msk = _mm_set_epi64x(0, UINT64_MAX);
+
        rxdp = rxq->rx_ring + rxq->rxrearm_start;

        /* Pull 'n' more MBUFs into the software ring */
@@ -108,6 +110,9 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
                dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
                dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);

+               dma_addr0 =  _mm_and_si128(dma_addr0, hba_msk);
+               dma_addr1 =  _mm_and_si128(dma_addr1, hba_msk);
+
                /* flush desc with pa dma_addr */
                _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
                _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
bash-4.2$ cat patch1
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index a0c8847..94967c5 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1183,7 +1183,7 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool reset_mbuf)

                /* populate the descriptors */
                dma_addr = rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
-               rxdp[i].read.hdr_addr = dma_addr;
+               rxdp[i].read.hdr_addr = 0;
                rxdp[i].read.pkt_addr = dma_addr;
        }

@@ -1414,7 +1414,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
                rxe->mbuf = nmb;
                dma_addr =
                        rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(nmb));
-               rxdp->read.hdr_addr = dma_addr;
+               rxdp->read.hdr_addr = 0;
                rxdp->read.pkt_addr = dma_addr;

                /*
@@ -1741,7 +1741,7 @@ next_desc:
                        rxe->mbuf = nmb;

                        rxm->data_off = RTE_PKTMBUF_HEADROOM;
-                       rxdp->read.hdr_addr = dma;
+                       rxdp->read.hdr_addr = 0;
                        rxdp->read.pkt_addr = dma;
                } else
                        rxe->mbuf = NULL;
@@ -3633,7 +3633,7 @@ ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
                dma_addr =
                        rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mbuf));
                rxd = &rxq->rx_ring[i];
-               rxd->read.hdr_addr = dma_addr;
+               rxd->read.hdr_addr = 0;
                rxd->read.pkt_addr = dma_addr;
                rxe[i].mbuf = mbuf;
        }
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
index 6c1647e..16a9c64 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
@@ -56,6 +56,8 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
                        RTE_PKTMBUF_HEADROOM);
        __m128i dma_addr0, dma_addr1;

+       const __m128i hba_msk = _mm_set_epi64x(0, UINT64_MAX);
+
        rxdp = rxq->rx_ring + rxq->rxrearm_start;

        /* Pull 'n' more MBUFs into the software ring */
@@ -108,6 +110,9 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
                dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
                dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);

+               dma_addr0 =  _mm_and_si128(dma_addr0, hba_msk);
+               dma_addr1 =  _mm_and_si128(dma_addr1, hba_msk);
+
                /* flush desc with pa dma_addr */
                _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
                _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
  2015-07-29 18:12 ` Ananyev, Konstantin
@ 2015-07-29 20:24   ` Zhang, Helin
  2015-07-30  8:12     ` Olivier MATZ
  2015-08-02 22:32   ` [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd Thomas Monjalon
  1 sibling, 1 reply; 19+ messages in thread
From: Zhang, Helin @ 2015-07-29 20:24 UTC (permalink / raw)
  To: Ananyev, Konstantin, Martin Weiser, olivier.matz; +Cc: dev

Hi Martin

Thank you very much for the good catch!

The similar situation in i40e, as explained by Konstantin.
As header split hasn't been supported by DPDK till now. It would be better to put the header address in RX descriptor to 0.
But in the future, during header split enabling. We may need to pay extra attention to that. As at least x710 datasheet said specifically as below.
"The header address should be set by the software to an even number (word aligned address)". We may need to find a way to ensure that during mempool/mbuf allocation.

Regards,
Helin

> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Wednesday, July 29, 2015 11:12 AM
> To: Martin Weiser; Zhang, Helin; olivier.matz@6wind.com
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf
> private area size is odd
> 
> Hi Martin,
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Martin Weiser
> > Sent: Wednesday, July 29, 2015 4:07 PM
> > To: Zhang, Helin; olivier.matz@6wind.com
> > Cc: dev@dpdk.org
> > Subject: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when
> > mbuf private area size is odd
> >
> > Hi Helin, Hi Olivier,
> >
> > we are seeing an issue with the ixgbe and i40e drivers which we could
> > track down to our setting of the private area size of the mbufs.
> > The issue can be easily reproduced with the l2fwd example application
> > when a small modification is done: just set the priv_size parameter in
> > the call to the rte_pktmbuf_pool_create function to an odd number like
> > 1. In our tests this causes every call to rte_eth_rx_burst to return
> > 32 (which is the setting of nb_pkts) nonsense mbufs although no
> > packets are received on the interface and the hardware counters do not
> > report any received packets.
> 
> From Niantic datasheet:
> 
> "7.1.6.1 Advanced Receive Descriptors — Read Format Table 7-15 lists the
> advanced receive descriptor programming by the software. The ...
> Packet Buffer Address (64)
> This is the physical address of the packet buffer. The lowest bit is A0 (LSB of the
> address).
> Header Buffer Address (64)
> The physical address of the header buffer with the lowest bit being Descriptor
> Done (DD).
> When a packet spans in multiple descriptors, the header buffer address is used
> only on the first descriptor. During the programming phase, software must set
> the DD bit to zero (see the description of the DD bit in this section). This means
> that header buffer addresses are always word aligned."
> 
> Right now, in ixgbe PMD we always setup  Packet Buffer Address (PBA)and
> Header Buffer Address (HBA) to the same value:
> buf_physaddr + RTE_PKTMBUF_HEADROOM
> So when pirv_size==1, DD bit in RXD is always set to one by SW itself, and then
> SW considers that HW already done with it.
> In other words, right now for ixgbe you can't use RX buffer that is not aligned on
> word boundary.
> 
> So the advice would be, right now - don't set priv_size to the odd value.
> As we don't support split header feature anyway, I think we can fix it just by
> always setting HBA in the RXD to zero.
> Could you try the fix for ixgbe below?
> 
> Same story with FVL, I believe.
> Konstantin
> 
> 
> > Interestingly this does not happen if we force the scattered rx path.
> >
> > I assume the drivers have some expectations regarding the alignment of
> > the buf_addr in the mbuf and setting an odd private are size breaks
> > this alignment in the rte_pktmbuf_init function. If this is the case
> > then one possible fix might be to enforce an alignment on the private area size.
> >
> > Best regards,
> > Martin
> 
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
> index a0c8847..94967c5 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> @@ -1183,7 +1183,7 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool
> reset_mbuf)
> 
>                 /* populate the descriptors */
>                 dma_addr =
> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
> -               rxdp[i].read.hdr_addr = dma_addr;
> +               rxdp[i].read.hdr_addr = 0;
>                 rxdp[i].read.pkt_addr = dma_addr;
>         }
> 
> @@ -1414,7 +1414,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf
> **rx_pkts,
>                 rxe->mbuf = nmb;
>                 dma_addr =
> 
> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(nmb));
> -               rxdp->read.hdr_addr = dma_addr;
> +               rxdp->read.hdr_addr = 0;
>                 rxdp->read.pkt_addr = dma_addr;
> 
>                 /*
> @@ -1741,7 +1741,7 @@ next_desc:
>                         rxe->mbuf = nmb;
> 
>                         rxm->data_off = RTE_PKTMBUF_HEADROOM;
> -                       rxdp->read.hdr_addr = dma;
> +                       rxdp->read.hdr_addr = 0;
>                         rxdp->read.pkt_addr = dma;
>                 } else
>                         rxe->mbuf = NULL; @@ -3633,7 +3633,7 @@
> ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
>                 dma_addr =
> 
> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mbuf));
>                 rxd = &rxq->rx_ring[i];
> -               rxd->read.hdr_addr = dma_addr;
> +               rxd->read.hdr_addr = 0;
>                 rxd->read.pkt_addr = dma_addr;
>                 rxe[i].mbuf = mbuf;
>         }
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> index 6c1647e..16a9c64 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> @@ -56,6 +56,8 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>                         RTE_PKTMBUF_HEADROOM);
>         __m128i dma_addr0, dma_addr1;
> 
> +       const __m128i hba_msk = _mm_set_epi64x(0, UINT64_MAX);
> +
>         rxdp = rxq->rx_ring + rxq->rxrearm_start;
> 
>         /* Pull 'n' more MBUFs into the software ring */ @@ -108,6 +110,9 @@
> ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>                 dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
>                 dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);
> 
> +               dma_addr0 =  _mm_and_si128(dma_addr0, hba_msk);
> +               dma_addr1 =  _mm_and_si128(dma_addr1, hba_msk);
> +
>                 /* flush desc with pa dma_addr */
>                 _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
>                 _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
> bash-4.2$ cat patch1 diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
> b/drivers/net/ixgbe/ixgbe_rxtx.c index a0c8847..94967c5 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> @@ -1183,7 +1183,7 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool
> reset_mbuf)
> 
>                 /* populate the descriptors */
>                 dma_addr =
> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
> -               rxdp[i].read.hdr_addr = dma_addr;
> +               rxdp[i].read.hdr_addr = 0;
>                 rxdp[i].read.pkt_addr = dma_addr;
>         }
> 
> @@ -1414,7 +1414,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf
> **rx_pkts,
>                 rxe->mbuf = nmb;
>                 dma_addr =
> 
> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(nmb));
> -               rxdp->read.hdr_addr = dma_addr;
> +               rxdp->read.hdr_addr = 0;
>                 rxdp->read.pkt_addr = dma_addr;
> 
>                 /*
> @@ -1741,7 +1741,7 @@ next_desc:
>                         rxe->mbuf = nmb;
> 
>                         rxm->data_off = RTE_PKTMBUF_HEADROOM;
> -                       rxdp->read.hdr_addr = dma;
> +                       rxdp->read.hdr_addr = 0;
>                         rxdp->read.pkt_addr = dma;
>                 } else
>                         rxe->mbuf = NULL; @@ -3633,7 +3633,7 @@
> ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
>                 dma_addr =
> 
> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mbuf));
>                 rxd = &rxq->rx_ring[i];
> -               rxd->read.hdr_addr = dma_addr;
> +               rxd->read.hdr_addr = 0;
>                 rxd->read.pkt_addr = dma_addr;
>                 rxe[i].mbuf = mbuf;
>         }
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> index 6c1647e..16a9c64 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> @@ -56,6 +56,8 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>                         RTE_PKTMBUF_HEADROOM);
>         __m128i dma_addr0, dma_addr1;
> 
> +       const __m128i hba_msk = _mm_set_epi64x(0, UINT64_MAX);
> +
>         rxdp = rxq->rx_ring + rxq->rxrearm_start;
> 
>         /* Pull 'n' more MBUFs into the software ring */ @@ -108,6 +110,9 @@
> ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>                 dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
>                 dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);
> 
> +               dma_addr0 =  _mm_and_si128(dma_addr0, hba_msk);
> +               dma_addr1 =  _mm_and_si128(dma_addr1, hba_msk);
> +
>                 /* flush desc with pa dma_addr */
>                 _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
>                 _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
  2015-07-29 20:24   ` Zhang, Helin
@ 2015-07-30  8:12     ` Olivier MATZ
  2015-07-30  9:00       ` Ananyev, Konstantin
  0 siblings, 1 reply; 19+ messages in thread
From: Olivier MATZ @ 2015-07-30  8:12 UTC (permalink / raw)
  To: Zhang, Helin, Ananyev, Konstantin, Martin Weiser; +Cc: dev

Hi,

On 07/29/2015 10:24 PM, Zhang, Helin wrote:
> Hi Martin
>
> Thank you very much for the good catch!
>
> The similar situation in i40e, as explained by Konstantin.
> As header split hasn't been supported by DPDK till now. It would be better to put the header address in RX descriptor to 0.
> But in the future, during header split enabling. We may need to pay extra attention to that. As at least x710 datasheet said specifically as below.
> "The header address should be set by the software to an even number (word aligned address)". We may need to find a way to ensure that during mempool/mbuf allocation.

Indeed it would be good to force the priv_size to be aligned.

The priv_size could be aligned automatically in
rte_pktmbuf_pool_create(). The only possible problem I could see
is that it would break applications that access to the data buffer
by doing (sizeof(mbuf) + sizeof(priv)), which is probably not the
best thing to do (I didn't find any applications like this in dpdk).

For applications that directly use rte_mempool_create() instead of
rte_pktmbuf_pool_create(), we could add a check using an assert in 
rte_pktmbuf_init() and some words in the documentation.

The question is: what should be the proper alignment? I would say
at least 8 bytes, but maybe cache_aligned is an option too.

Regards,
Olivier


>
> Regards,
> Helin
>
>> -----Original Message-----
>> From: Ananyev, Konstantin
>> Sent: Wednesday, July 29, 2015 11:12 AM
>> To: Martin Weiser; Zhang, Helin; olivier.matz@6wind.com
>> Cc: dev@dpdk.org
>> Subject: RE: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf
>> private area size is odd
>>
>> Hi Martin,
>>
>>> -----Original Message-----
>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Martin Weiser
>>> Sent: Wednesday, July 29, 2015 4:07 PM
>>> To: Zhang, Helin; olivier.matz@6wind.com
>>> Cc: dev@dpdk.org
>>> Subject: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when
>>> mbuf private area size is odd
>>>
>>> Hi Helin, Hi Olivier,
>>>
>>> we are seeing an issue with the ixgbe and i40e drivers which we could
>>> track down to our setting of the private area size of the mbufs.
>>> The issue can be easily reproduced with the l2fwd example application
>>> when a small modification is done: just set the priv_size parameter in
>>> the call to the rte_pktmbuf_pool_create function to an odd number like
>>> 1. In our tests this causes every call to rte_eth_rx_burst to return
>>> 32 (which is the setting of nb_pkts) nonsense mbufs although no
>>> packets are received on the interface and the hardware counters do not
>>> report any received packets.
>>
>>  From Niantic datasheet:
>>
>> "7.1.6.1 Advanced Receive Descriptors — Read Format Table 7-15 lists the
>> advanced receive descriptor programming by the software. The ...
>> Packet Buffer Address (64)
>> This is the physical address of the packet buffer. The lowest bit is A0 (LSB of the
>> address).
>> Header Buffer Address (64)
>> The physical address of the header buffer with the lowest bit being Descriptor
>> Done (DD).
>> When a packet spans in multiple descriptors, the header buffer address is used
>> only on the first descriptor. During the programming phase, software must set
>> the DD bit to zero (see the description of the DD bit in this section). This means
>> that header buffer addresses are always word aligned."
>>
>> Right now, in ixgbe PMD we always setup  Packet Buffer Address (PBA)and
>> Header Buffer Address (HBA) to the same value:
>> buf_physaddr + RTE_PKTMBUF_HEADROOM
>> So when pirv_size==1, DD bit in RXD is always set to one by SW itself, and then
>> SW considers that HW already done with it.
>> In other words, right now for ixgbe you can't use RX buffer that is not aligned on
>> word boundary.
>>
>> So the advice would be, right now - don't set priv_size to the odd value.
>> As we don't support split header feature anyway, I think we can fix it just by
>> always setting HBA in the RXD to zero.
>> Could you try the fix for ixgbe below?
>>
>> Same story with FVL, I believe.
>> Konstantin
>>
>>
>>> Interestingly this does not happen if we force the scattered rx path.
>>>
>>> I assume the drivers have some expectations regarding the alignment of
>>> the buf_addr in the mbuf and setting an odd private are size breaks
>>> this alignment in the rte_pktmbuf_init function. If this is the case
>>> then one possible fix might be to enforce an alignment on the private area size.
>>>
>>> Best regards,
>>> Martin
>>
>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
>> index a0c8847..94967c5 100644
>> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
>> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
>> @@ -1183,7 +1183,7 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool
>> reset_mbuf)
>>
>>                  /* populate the descriptors */
>>                  dma_addr =
>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
>> -               rxdp[i].read.hdr_addr = dma_addr;
>> +               rxdp[i].read.hdr_addr = 0;
>>                  rxdp[i].read.pkt_addr = dma_addr;
>>          }
>>
>> @@ -1414,7 +1414,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf
>> **rx_pkts,
>>                  rxe->mbuf = nmb;
>>                  dma_addr =
>>
>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(nmb));
>> -               rxdp->read.hdr_addr = dma_addr;
>> +               rxdp->read.hdr_addr = 0;
>>                  rxdp->read.pkt_addr = dma_addr;
>>
>>                  /*
>> @@ -1741,7 +1741,7 @@ next_desc:
>>                          rxe->mbuf = nmb;
>>
>>                          rxm->data_off = RTE_PKTMBUF_HEADROOM;
>> -                       rxdp->read.hdr_addr = dma;
>> +                       rxdp->read.hdr_addr = 0;
>>                          rxdp->read.pkt_addr = dma;
>>                  } else
>>                          rxe->mbuf = NULL; @@ -3633,7 +3633,7 @@
>> ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
>>                  dma_addr =
>>
>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mbuf));
>>                  rxd = &rxq->rx_ring[i];
>> -               rxd->read.hdr_addr = dma_addr;
>> +               rxd->read.hdr_addr = 0;
>>                  rxd->read.pkt_addr = dma_addr;
>>                  rxe[i].mbuf = mbuf;
>>          }
>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>> b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>> index 6c1647e..16a9c64 100644
>> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>> @@ -56,6 +56,8 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>>                          RTE_PKTMBUF_HEADROOM);
>>          __m128i dma_addr0, dma_addr1;
>>
>> +       const __m128i hba_msk = _mm_set_epi64x(0, UINT64_MAX);
>> +
>>          rxdp = rxq->rx_ring + rxq->rxrearm_start;
>>
>>          /* Pull 'n' more MBUFs into the software ring */ @@ -108,6 +110,9 @@
>> ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>>                  dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
>>                  dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);
>>
>> +               dma_addr0 =  _mm_and_si128(dma_addr0, hba_msk);
>> +               dma_addr1 =  _mm_and_si128(dma_addr1, hba_msk);
>> +
>>                  /* flush desc with pa dma_addr */
>>                  _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
>>                  _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
>> bash-4.2$ cat patch1 diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
>> b/drivers/net/ixgbe/ixgbe_rxtx.c index a0c8847..94967c5 100644
>> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
>> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
>> @@ -1183,7 +1183,7 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool
>> reset_mbuf)
>>
>>                  /* populate the descriptors */
>>                  dma_addr =
>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
>> -               rxdp[i].read.hdr_addr = dma_addr;
>> +               rxdp[i].read.hdr_addr = 0;
>>                  rxdp[i].read.pkt_addr = dma_addr;
>>          }
>>
>> @@ -1414,7 +1414,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf
>> **rx_pkts,
>>                  rxe->mbuf = nmb;
>>                  dma_addr =
>>
>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(nmb));
>> -               rxdp->read.hdr_addr = dma_addr;
>> +               rxdp->read.hdr_addr = 0;
>>                  rxdp->read.pkt_addr = dma_addr;
>>
>>                  /*
>> @@ -1741,7 +1741,7 @@ next_desc:
>>                          rxe->mbuf = nmb;
>>
>>                          rxm->data_off = RTE_PKTMBUF_HEADROOM;
>> -                       rxdp->read.hdr_addr = dma;
>> +                       rxdp->read.hdr_addr = 0;
>>                          rxdp->read.pkt_addr = dma;
>>                  } else
>>                          rxe->mbuf = NULL; @@ -3633,7 +3633,7 @@
>> ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
>>                  dma_addr =
>>
>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mbuf));
>>                  rxd = &rxq->rx_ring[i];
>> -               rxd->read.hdr_addr = dma_addr;
>> +               rxd->read.hdr_addr = 0;
>>                  rxd->read.pkt_addr = dma_addr;
>>                  rxe[i].mbuf = mbuf;
>>          }
>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>> b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>> index 6c1647e..16a9c64 100644
>> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>> @@ -56,6 +56,8 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>>                          RTE_PKTMBUF_HEADROOM);
>>          __m128i dma_addr0, dma_addr1;
>>
>> +       const __m128i hba_msk = _mm_set_epi64x(0, UINT64_MAX);
>> +
>>          rxdp = rxq->rx_ring + rxq->rxrearm_start;
>>
>>          /* Pull 'n' more MBUFs into the software ring */ @@ -108,6 +110,9 @@
>> ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>>                  dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
>>                  dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);
>>
>> +               dma_addr0 =  _mm_and_si128(dma_addr0, hba_msk);
>> +               dma_addr1 =  _mm_and_si128(dma_addr1, hba_msk);
>> +
>>                  /* flush desc with pa dma_addr */
>>                  _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
>>                  _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
  2015-07-30  8:12     ` Olivier MATZ
@ 2015-07-30  9:00       ` Ananyev, Konstantin
  2015-07-30  9:10         ` Olivier MATZ
  0 siblings, 1 reply; 19+ messages in thread
From: Ananyev, Konstantin @ 2015-07-30  9:00 UTC (permalink / raw)
  To: Olivier MATZ, Zhang, Helin, Martin Weiser; +Cc: dev

Hi Olivier,

> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Thursday, July 30, 2015 9:12 AM
> To: Zhang, Helin; Ananyev, Konstantin; Martin Weiser
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
> 
> Hi,
> 
> On 07/29/2015 10:24 PM, Zhang, Helin wrote:
> > Hi Martin
> >
> > Thank you very much for the good catch!
> >
> > The similar situation in i40e, as explained by Konstantin.
> > As header split hasn't been supported by DPDK till now. It would be better to put the header address in RX descriptor to 0.
> > But in the future, during header split enabling. We may need to pay extra attention to that. As at least x710 datasheet said
> specifically as below.
> > "The header address should be set by the software to an even number (word aligned address)". We may need to find a way to
> ensure that during mempool/mbuf allocation.
> 
> Indeed it would be good to force the priv_size to be aligned.
> 
> The priv_size could be aligned automatically in
> rte_pktmbuf_pool_create(). The only possible problem I could see
> is that it would break applications that access to the data buffer
> by doing (sizeof(mbuf) + sizeof(priv)), which is probably not the
> best thing to do (I didn't find any applications like this in dpdk).


Might be just make rte_pktmbuf_pool_create() fail if input priv_size % MIN_ALIGN != 0?

> 
> For applications that directly use rte_mempool_create() instead of
> rte_pktmbuf_pool_create(), we could add a check using an assert in
> rte_pktmbuf_init() and some words in the documentation.
> 
> The question is: what should be the proper alignment? I would say
> at least 8 bytes, but maybe cache_aligned is an option too.

8 bytes seems enough to me.

Konstantin 

> 
> Regards,
> Olivier
> 
> 
> >
> > Regards,
> > Helin
> >
> >> -----Original Message-----
> >> From: Ananyev, Konstantin
> >> Sent: Wednesday, July 29, 2015 11:12 AM
> >> To: Martin Weiser; Zhang, Helin; olivier.matz@6wind.com
> >> Cc: dev@dpdk.org
> >> Subject: RE: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf
> >> private area size is odd
> >>
> >> Hi Martin,
> >>
> >>> -----Original Message-----
> >>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Martin Weiser
> >>> Sent: Wednesday, July 29, 2015 4:07 PM
> >>> To: Zhang, Helin; olivier.matz@6wind.com
> >>> Cc: dev@dpdk.org
> >>> Subject: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when
> >>> mbuf private area size is odd
> >>>
> >>> Hi Helin, Hi Olivier,
> >>>
> >>> we are seeing an issue with the ixgbe and i40e drivers which we could
> >>> track down to our setting of the private area size of the mbufs.
> >>> The issue can be easily reproduced with the l2fwd example application
> >>> when a small modification is done: just set the priv_size parameter in
> >>> the call to the rte_pktmbuf_pool_create function to an odd number like
> >>> 1. In our tests this causes every call to rte_eth_rx_burst to return
> >>> 32 (which is the setting of nb_pkts) nonsense mbufs although no
> >>> packets are received on the interface and the hardware counters do not
> >>> report any received packets.
> >>
> >>  From Niantic datasheet:
> >>
> >> "7.1.6.1 Advanced Receive Descriptors — Read Format Table 7-15 lists the
> >> advanced receive descriptor programming by the software. The ...
> >> Packet Buffer Address (64)
> >> This is the physical address of the packet buffer. The lowest bit is A0 (LSB of the
> >> address).
> >> Header Buffer Address (64)
> >> The physical address of the header buffer with the lowest bit being Descriptor
> >> Done (DD).
> >> When a packet spans in multiple descriptors, the header buffer address is used
> >> only on the first descriptor. During the programming phase, software must set
> >> the DD bit to zero (see the description of the DD bit in this section). This means
> >> that header buffer addresses are always word aligned."
> >>
> >> Right now, in ixgbe PMD we always setup  Packet Buffer Address (PBA)and
> >> Header Buffer Address (HBA) to the same value:
> >> buf_physaddr + RTE_PKTMBUF_HEADROOM
> >> So when pirv_size==1, DD bit in RXD is always set to one by SW itself, and then
> >> SW considers that HW already done with it.
> >> In other words, right now for ixgbe you can't use RX buffer that is not aligned on
> >> word boundary.
> >>
> >> So the advice would be, right now - don't set priv_size to the odd value.
> >> As we don't support split header feature anyway, I think we can fix it just by
> >> always setting HBA in the RXD to zero.
> >> Could you try the fix for ixgbe below?
> >>
> >> Same story with FVL, I believe.
> >> Konstantin
> >>
> >>
> >>> Interestingly this does not happen if we force the scattered rx path.
> >>>
> >>> I assume the drivers have some expectations regarding the alignment of
> >>> the buf_addr in the mbuf and setting an odd private are size breaks
> >>> this alignment in the rte_pktmbuf_init function. If this is the case
> >>> then one possible fix might be to enforce an alignment on the private area size.
> >>>
> >>> Best regards,
> >>> Martin
> >>
> >> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
> >> index a0c8847..94967c5 100644
> >> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> >> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> >> @@ -1183,7 +1183,7 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool
> >> reset_mbuf)
> >>
> >>                  /* populate the descriptors */
> >>                  dma_addr =
> >> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
> >> -               rxdp[i].read.hdr_addr = dma_addr;
> >> +               rxdp[i].read.hdr_addr = 0;
> >>                  rxdp[i].read.pkt_addr = dma_addr;
> >>          }
> >>
> >> @@ -1414,7 +1414,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf
> >> **rx_pkts,
> >>                  rxe->mbuf = nmb;
> >>                  dma_addr =
> >>
> >> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(nmb));
> >> -               rxdp->read.hdr_addr = dma_addr;
> >> +               rxdp->read.hdr_addr = 0;
> >>                  rxdp->read.pkt_addr = dma_addr;
> >>
> >>                  /*
> >> @@ -1741,7 +1741,7 @@ next_desc:
> >>                          rxe->mbuf = nmb;
> >>
> >>                          rxm->data_off = RTE_PKTMBUF_HEADROOM;
> >> -                       rxdp->read.hdr_addr = dma;
> >> +                       rxdp->read.hdr_addr = 0;
> >>                          rxdp->read.pkt_addr = dma;
> >>                  } else
> >>                          rxe->mbuf = NULL; @@ -3633,7 +3633,7 @@
> >> ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
> >>                  dma_addr =
> >>
> >> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mbuf));
> >>                  rxd = &rxq->rx_ring[i];
> >> -               rxd->read.hdr_addr = dma_addr;
> >> +               rxd->read.hdr_addr = 0;
> >>                  rxd->read.pkt_addr = dma_addr;
> >>                  rxe[i].mbuf = mbuf;
> >>          }
> >> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >> b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >> index 6c1647e..16a9c64 100644
> >> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >> @@ -56,6 +56,8 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
> >>                          RTE_PKTMBUF_HEADROOM);
> >>          __m128i dma_addr0, dma_addr1;
> >>
> >> +       const __m128i hba_msk = _mm_set_epi64x(0, UINT64_MAX);
> >> +
> >>          rxdp = rxq->rx_ring + rxq->rxrearm_start;
> >>
> >>          /* Pull 'n' more MBUFs into the software ring */ @@ -108,6 +110,9 @@
> >> ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
> >>                  dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
> >>                  dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);
> >>
> >> +               dma_addr0 =  _mm_and_si128(dma_addr0, hba_msk);
> >> +               dma_addr1 =  _mm_and_si128(dma_addr1, hba_msk);
> >> +
> >>                  /* flush desc with pa dma_addr */
> >>                  _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
> >>                  _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
> >> bash-4.2$ cat patch1 diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
> >> b/drivers/net/ixgbe/ixgbe_rxtx.c index a0c8847..94967c5 100644
> >> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> >> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> >> @@ -1183,7 +1183,7 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool
> >> reset_mbuf)
> >>
> >>                  /* populate the descriptors */
> >>                  dma_addr =
> >> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
> >> -               rxdp[i].read.hdr_addr = dma_addr;
> >> +               rxdp[i].read.hdr_addr = 0;
> >>                  rxdp[i].read.pkt_addr = dma_addr;
> >>          }
> >>
> >> @@ -1414,7 +1414,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf
> >> **rx_pkts,
> >>                  rxe->mbuf = nmb;
> >>                  dma_addr =
> >>
> >> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(nmb));
> >> -               rxdp->read.hdr_addr = dma_addr;
> >> +               rxdp->read.hdr_addr = 0;
> >>                  rxdp->read.pkt_addr = dma_addr;
> >>
> >>                  /*
> >> @@ -1741,7 +1741,7 @@ next_desc:
> >>                          rxe->mbuf = nmb;
> >>
> >>                          rxm->data_off = RTE_PKTMBUF_HEADROOM;
> >> -                       rxdp->read.hdr_addr = dma;
> >> +                       rxdp->read.hdr_addr = 0;
> >>                          rxdp->read.pkt_addr = dma;
> >>                  } else
> >>                          rxe->mbuf = NULL; @@ -3633,7 +3633,7 @@
> >> ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
> >>                  dma_addr =
> >>
> >> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mbuf));
> >>                  rxd = &rxq->rx_ring[i];
> >> -               rxd->read.hdr_addr = dma_addr;
> >> +               rxd->read.hdr_addr = 0;
> >>                  rxd->read.pkt_addr = dma_addr;
> >>                  rxe[i].mbuf = mbuf;
> >>          }
> >> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >> b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >> index 6c1647e..16a9c64 100644
> >> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >> @@ -56,6 +56,8 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
> >>                          RTE_PKTMBUF_HEADROOM);
> >>          __m128i dma_addr0, dma_addr1;
> >>
> >> +       const __m128i hba_msk = _mm_set_epi64x(0, UINT64_MAX);
> >> +
> >>          rxdp = rxq->rx_ring + rxq->rxrearm_start;
> >>
> >>          /* Pull 'n' more MBUFs into the software ring */ @@ -108,6 +110,9 @@
> >> ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
> >>                  dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
> >>                  dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);
> >>
> >> +               dma_addr0 =  _mm_and_si128(dma_addr0, hba_msk);
> >> +               dma_addr1 =  _mm_and_si128(dma_addr1, hba_msk);
> >> +
> >>                  /* flush desc with pa dma_addr */
> >>                  _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
> >>                  _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
  2015-07-30  9:00       ` Ananyev, Konstantin
@ 2015-07-30  9:10         ` Olivier MATZ
  2015-07-30  9:43           ` Ananyev, Konstantin
  0 siblings, 1 reply; 19+ messages in thread
From: Olivier MATZ @ 2015-07-30  9:10 UTC (permalink / raw)
  To: Ananyev, Konstantin, Zhang, Helin, Martin Weiser; +Cc: dev

Hi Konstantin,

On 07/30/2015 11:00 AM, Ananyev, Konstantin wrote:
> Hi Olivier,
>
>> -----Original Message-----
>> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
>> Sent: Thursday, July 30, 2015 9:12 AM
>> To: Zhang, Helin; Ananyev, Konstantin; Martin Weiser
>> Cc: dev@dpdk.org
>> Subject: Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
>>
>> Hi,
>>
>> On 07/29/2015 10:24 PM, Zhang, Helin wrote:
>>> Hi Martin
>>>
>>> Thank you very much for the good catch!
>>>
>>> The similar situation in i40e, as explained by Konstantin.
>>> As header split hasn't been supported by DPDK till now. It would be better to put the header address in RX descriptor to 0.
>>> But in the future, during header split enabling. We may need to pay extra attention to that. As at least x710 datasheet said
>> specifically as below.
>>> "The header address should be set by the software to an even number (word aligned address)". We may need to find a way to
>> ensure that during mempool/mbuf allocation.
>>
>> Indeed it would be good to force the priv_size to be aligned.
>>
>> The priv_size could be aligned automatically in
>> rte_pktmbuf_pool_create(). The only possible problem I could see
>> is that it would break applications that access to the data buffer
>> by doing (sizeof(mbuf) + sizeof(priv)), which is probably not the
>> best thing to do (I didn't find any applications like this in dpdk).
>
>
> Might be just make rte_pktmbuf_pool_create() fail if input priv_size % MIN_ALIGN != 0?

Hmm maybe it would break more applications: an odd priv_size is
probably rare, but a priv_size that is not aligned to 8 bytes is
maybe more common.
It's maybe safer to align the size transparently?


Regards,
Olivier



>
>>
>> For applications that directly use rte_mempool_create() instead of
>> rte_pktmbuf_pool_create(), we could add a check using an assert in
>> rte_pktmbuf_init() and some words in the documentation.
>>
>> The question is: what should be the proper alignment? I would say
>> at least 8 bytes, but maybe cache_aligned is an option too.
>
> 8 bytes seems enough to me.
>
> Konstantin
>
>>
>> Regards,
>> Olivier
>>
>>
>>>
>>> Regards,
>>> Helin
>>>
>>>> -----Original Message-----
>>>> From: Ananyev, Konstantin
>>>> Sent: Wednesday, July 29, 2015 11:12 AM
>>>> To: Martin Weiser; Zhang, Helin; olivier.matz@6wind.com
>>>> Cc: dev@dpdk.org
>>>> Subject: RE: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf
>>>> private area size is odd
>>>>
>>>> Hi Martin,
>>>>
>>>>> -----Original Message-----
>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Martin Weiser
>>>>> Sent: Wednesday, July 29, 2015 4:07 PM
>>>>> To: Zhang, Helin; olivier.matz@6wind.com
>>>>> Cc: dev@dpdk.org
>>>>> Subject: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when
>>>>> mbuf private area size is odd
>>>>>
>>>>> Hi Helin, Hi Olivier,
>>>>>
>>>>> we are seeing an issue with the ixgbe and i40e drivers which we could
>>>>> track down to our setting of the private area size of the mbufs.
>>>>> The issue can be easily reproduced with the l2fwd example application
>>>>> when a small modification is done: just set the priv_size parameter in
>>>>> the call to the rte_pktmbuf_pool_create function to an odd number like
>>>>> 1. In our tests this causes every call to rte_eth_rx_burst to return
>>>>> 32 (which is the setting of nb_pkts) nonsense mbufs although no
>>>>> packets are received on the interface and the hardware counters do not
>>>>> report any received packets.
>>>>
>>>>   From Niantic datasheet:
>>>>
>>>> "7.1.6.1 Advanced Receive Descriptors — Read Format Table 7-15 lists the
>>>> advanced receive descriptor programming by the software. The ...
>>>> Packet Buffer Address (64)
>>>> This is the physical address of the packet buffer. The lowest bit is A0 (LSB of the
>>>> address).
>>>> Header Buffer Address (64)
>>>> The physical address of the header buffer with the lowest bit being Descriptor
>>>> Done (DD).
>>>> When a packet spans in multiple descriptors, the header buffer address is used
>>>> only on the first descriptor. During the programming phase, software must set
>>>> the DD bit to zero (see the description of the DD bit in this section). This means
>>>> that header buffer addresses are always word aligned."
>>>>
>>>> Right now, in ixgbe PMD we always setup  Packet Buffer Address (PBA)and
>>>> Header Buffer Address (HBA) to the same value:
>>>> buf_physaddr + RTE_PKTMBUF_HEADROOM
>>>> So when pirv_size==1, DD bit in RXD is always set to one by SW itself, and then
>>>> SW considers that HW already done with it.
>>>> In other words, right now for ixgbe you can't use RX buffer that is not aligned on
>>>> word boundary.
>>>>
>>>> So the advice would be, right now - don't set priv_size to the odd value.
>>>> As we don't support split header feature anyway, I think we can fix it just by
>>>> always setting HBA in the RXD to zero.
>>>> Could you try the fix for ixgbe below?
>>>>
>>>> Same story with FVL, I believe.
>>>> Konstantin
>>>>
>>>>
>>>>> Interestingly this does not happen if we force the scattered rx path.
>>>>>
>>>>> I assume the drivers have some expectations regarding the alignment of
>>>>> the buf_addr in the mbuf and setting an odd private are size breaks
>>>>> this alignment in the rte_pktmbuf_init function. If this is the case
>>>>> then one possible fix might be to enforce an alignment on the private area size.
>>>>>
>>>>> Best regards,
>>>>> Martin
>>>>
>>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
>>>> index a0c8847..94967c5 100644
>>>> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
>>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
>>>> @@ -1183,7 +1183,7 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool
>>>> reset_mbuf)
>>>>
>>>>                   /* populate the descriptors */
>>>>                   dma_addr =
>>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
>>>> -               rxdp[i].read.hdr_addr = dma_addr;
>>>> +               rxdp[i].read.hdr_addr = 0;
>>>>                   rxdp[i].read.pkt_addr = dma_addr;
>>>>           }
>>>>
>>>> @@ -1414,7 +1414,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf
>>>> **rx_pkts,
>>>>                   rxe->mbuf = nmb;
>>>>                   dma_addr =
>>>>
>>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(nmb));
>>>> -               rxdp->read.hdr_addr = dma_addr;
>>>> +               rxdp->read.hdr_addr = 0;
>>>>                   rxdp->read.pkt_addr = dma_addr;
>>>>
>>>>                   /*
>>>> @@ -1741,7 +1741,7 @@ next_desc:
>>>>                           rxe->mbuf = nmb;
>>>>
>>>>                           rxm->data_off = RTE_PKTMBUF_HEADROOM;
>>>> -                       rxdp->read.hdr_addr = dma;
>>>> +                       rxdp->read.hdr_addr = 0;
>>>>                           rxdp->read.pkt_addr = dma;
>>>>                   } else
>>>>                           rxe->mbuf = NULL; @@ -3633,7 +3633,7 @@
>>>> ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
>>>>                   dma_addr =
>>>>
>>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mbuf));
>>>>                   rxd = &rxq->rx_ring[i];
>>>> -               rxd->read.hdr_addr = dma_addr;
>>>> +               rxd->read.hdr_addr = 0;
>>>>                   rxd->read.pkt_addr = dma_addr;
>>>>                   rxe[i].mbuf = mbuf;
>>>>           }
>>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>> b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>> index 6c1647e..16a9c64 100644
>>>> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>> @@ -56,6 +56,8 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>>>>                           RTE_PKTMBUF_HEADROOM);
>>>>           __m128i dma_addr0, dma_addr1;
>>>>
>>>> +       const __m128i hba_msk = _mm_set_epi64x(0, UINT64_MAX);
>>>> +
>>>>           rxdp = rxq->rx_ring + rxq->rxrearm_start;
>>>>
>>>>           /* Pull 'n' more MBUFs into the software ring */ @@ -108,6 +110,9 @@
>>>> ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>>>>                   dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
>>>>                   dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);
>>>>
>>>> +               dma_addr0 =  _mm_and_si128(dma_addr0, hba_msk);
>>>> +               dma_addr1 =  _mm_and_si128(dma_addr1, hba_msk);
>>>> +
>>>>                   /* flush desc with pa dma_addr */
>>>>                   _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
>>>>                   _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
>>>> bash-4.2$ cat patch1 diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
>>>> b/drivers/net/ixgbe/ixgbe_rxtx.c index a0c8847..94967c5 100644
>>>> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
>>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
>>>> @@ -1183,7 +1183,7 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool
>>>> reset_mbuf)
>>>>
>>>>                   /* populate the descriptors */
>>>>                   dma_addr =
>>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
>>>> -               rxdp[i].read.hdr_addr = dma_addr;
>>>> +               rxdp[i].read.hdr_addr = 0;
>>>>                   rxdp[i].read.pkt_addr = dma_addr;
>>>>           }
>>>>
>>>> @@ -1414,7 +1414,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf
>>>> **rx_pkts,
>>>>                   rxe->mbuf = nmb;
>>>>                   dma_addr =
>>>>
>>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(nmb));
>>>> -               rxdp->read.hdr_addr = dma_addr;
>>>> +               rxdp->read.hdr_addr = 0;
>>>>                   rxdp->read.pkt_addr = dma_addr;
>>>>
>>>>                   /*
>>>> @@ -1741,7 +1741,7 @@ next_desc:
>>>>                           rxe->mbuf = nmb;
>>>>
>>>>                           rxm->data_off = RTE_PKTMBUF_HEADROOM;
>>>> -                       rxdp->read.hdr_addr = dma;
>>>> +                       rxdp->read.hdr_addr = 0;
>>>>                           rxdp->read.pkt_addr = dma;
>>>>                   } else
>>>>                           rxe->mbuf = NULL; @@ -3633,7 +3633,7 @@
>>>> ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
>>>>                   dma_addr =
>>>>
>>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mbuf));
>>>>                   rxd = &rxq->rx_ring[i];
>>>> -               rxd->read.hdr_addr = dma_addr;
>>>> +               rxd->read.hdr_addr = 0;
>>>>                   rxd->read.pkt_addr = dma_addr;
>>>>                   rxe[i].mbuf = mbuf;
>>>>           }
>>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>> b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>> index 6c1647e..16a9c64 100644
>>>> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>> @@ -56,6 +56,8 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>>>>                           RTE_PKTMBUF_HEADROOM);
>>>>           __m128i dma_addr0, dma_addr1;
>>>>
>>>> +       const __m128i hba_msk = _mm_set_epi64x(0, UINT64_MAX);
>>>> +
>>>>           rxdp = rxq->rx_ring + rxq->rxrearm_start;
>>>>
>>>>           /* Pull 'n' more MBUFs into the software ring */ @@ -108,6 +110,9 @@
>>>> ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>>>>                   dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
>>>>                   dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);
>>>>
>>>> +               dma_addr0 =  _mm_and_si128(dma_addr0, hba_msk);
>>>> +               dma_addr1 =  _mm_and_si128(dma_addr1, hba_msk);
>>>> +
>>>>                   /* flush desc with pa dma_addr */
>>>>                   _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
>>>>                   _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
  2015-07-30  9:10         ` Olivier MATZ
@ 2015-07-30  9:43           ` Ananyev, Konstantin
  2015-07-30 11:22             ` Olivier MATZ
  0 siblings, 1 reply; 19+ messages in thread
From: Ananyev, Konstantin @ 2015-07-30  9:43 UTC (permalink / raw)
  To: Olivier MATZ, Zhang, Helin, Martin Weiser; +Cc: dev



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Thursday, July 30, 2015 10:10 AM
> To: Ananyev, Konstantin; Zhang, Helin; Martin Weiser
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
> 
> Hi Konstantin,
> 
> On 07/30/2015 11:00 AM, Ananyev, Konstantin wrote:
> > Hi Olivier,
> >
> >> -----Original Message-----
> >> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> >> Sent: Thursday, July 30, 2015 9:12 AM
> >> To: Zhang, Helin; Ananyev, Konstantin; Martin Weiser
> >> Cc: dev@dpdk.org
> >> Subject: Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
> >>
> >> Hi,
> >>
> >> On 07/29/2015 10:24 PM, Zhang, Helin wrote:
> >>> Hi Martin
> >>>
> >>> Thank you very much for the good catch!
> >>>
> >>> The similar situation in i40e, as explained by Konstantin.
> >>> As header split hasn't been supported by DPDK till now. It would be better to put the header address in RX descriptor to 0.
> >>> But in the future, during header split enabling. We may need to pay extra attention to that. As at least x710 datasheet said
> >> specifically as below.
> >>> "The header address should be set by the software to an even number (word aligned address)". We may need to find a way to
> >> ensure that during mempool/mbuf allocation.
> >>
> >> Indeed it would be good to force the priv_size to be aligned.
> >>
> >> The priv_size could be aligned automatically in
> >> rte_pktmbuf_pool_create(). The only possible problem I could see
> >> is that it would break applications that access to the data buffer
> >> by doing (sizeof(mbuf) + sizeof(priv)), which is probably not the
> >> best thing to do (I didn't find any applications like this in dpdk).
> >
> >
> > Might be just make rte_pktmbuf_pool_create() fail if input priv_size % MIN_ALIGN != 0?
> 
> Hmm maybe it would break more applications: an odd priv_size is
> probably rare, but a priv_size that is not aligned to 8 bytes is
> maybe more common.

My thought was that rte_mempool_create() was just introduced in 2.1,
so if we add extra requirement for the input parameter now -
there would be no ABI breakage, and not many people started to use it already. 
For me just seems a bit easier and more straightforward then silent alignment -
user would not have wrong assumptions here.
Though if you think that a silent alignment would be more convenient
for most users - I wouldn't insist.
Konstantin

> It's maybe safer to align the size transparently?
> 
> 
> Regards,
> Olivier
> 
> 
> 
> >
> >>
> >> For applications that directly use rte_mempool_create() instead of
> >> rte_pktmbuf_pool_create(), we could add a check using an assert in
> >> rte_pktmbuf_init() and some words in the documentation.
> >>
> >> The question is: what should be the proper alignment? I would say
> >> at least 8 bytes, but maybe cache_aligned is an option too.
> >
> > 8 bytes seems enough to me.
> >
> > Konstantin
> >
> >>
> >> Regards,
> >> Olivier
> >>
> >>
> >>>
> >>> Regards,
> >>> Helin
> >>>
> >>>> -----Original Message-----
> >>>> From: Ananyev, Konstantin
> >>>> Sent: Wednesday, July 29, 2015 11:12 AM
> >>>> To: Martin Weiser; Zhang, Helin; olivier.matz@6wind.com
> >>>> Cc: dev@dpdk.org
> >>>> Subject: RE: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf
> >>>> private area size is odd
> >>>>
> >>>> Hi Martin,
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Martin Weiser
> >>>>> Sent: Wednesday, July 29, 2015 4:07 PM
> >>>>> To: Zhang, Helin; olivier.matz@6wind.com
> >>>>> Cc: dev@dpdk.org
> >>>>> Subject: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when
> >>>>> mbuf private area size is odd
> >>>>>
> >>>>> Hi Helin, Hi Olivier,
> >>>>>
> >>>>> we are seeing an issue with the ixgbe and i40e drivers which we could
> >>>>> track down to our setting of the private area size of the mbufs.
> >>>>> The issue can be easily reproduced with the l2fwd example application
> >>>>> when a small modification is done: just set the priv_size parameter in
> >>>>> the call to the rte_pktmbuf_pool_create function to an odd number like
> >>>>> 1. In our tests this causes every call to rte_eth_rx_burst to return
> >>>>> 32 (which is the setting of nb_pkts) nonsense mbufs although no
> >>>>> packets are received on the interface and the hardware counters do not
> >>>>> report any received packets.
> >>>>
> >>>>   From Niantic datasheet:
> >>>>
> >>>> "7.1.6.1 Advanced Receive Descriptors — Read Format Table 7-15 lists the
> >>>> advanced receive descriptor programming by the software. The ...
> >>>> Packet Buffer Address (64)
> >>>> This is the physical address of the packet buffer. The lowest bit is A0 (LSB of the
> >>>> address).
> >>>> Header Buffer Address (64)
> >>>> The physical address of the header buffer with the lowest bit being Descriptor
> >>>> Done (DD).
> >>>> When a packet spans in multiple descriptors, the header buffer address is used
> >>>> only on the first descriptor. During the programming phase, software must set
> >>>> the DD bit to zero (see the description of the DD bit in this section). This means
> >>>> that header buffer addresses are always word aligned."
> >>>>
> >>>> Right now, in ixgbe PMD we always setup  Packet Buffer Address (PBA)and
> >>>> Header Buffer Address (HBA) to the same value:
> >>>> buf_physaddr + RTE_PKTMBUF_HEADROOM
> >>>> So when pirv_size==1, DD bit in RXD is always set to one by SW itself, and then
> >>>> SW considers that HW already done with it.
> >>>> In other words, right now for ixgbe you can't use RX buffer that is not aligned on
> >>>> word boundary.
> >>>>
> >>>> So the advice would be, right now - don't set priv_size to the odd value.
> >>>> As we don't support split header feature anyway, I think we can fix it just by
> >>>> always setting HBA in the RXD to zero.
> >>>> Could you try the fix for ixgbe below?
> >>>>
> >>>> Same story with FVL, I believe.
> >>>> Konstantin
> >>>>
> >>>>
> >>>>> Interestingly this does not happen if we force the scattered rx path.
> >>>>>
> >>>>> I assume the drivers have some expectations regarding the alignment of
> >>>>> the buf_addr in the mbuf and setting an odd private are size breaks
> >>>>> this alignment in the rte_pktmbuf_init function. If this is the case
> >>>>> then one possible fix might be to enforce an alignment on the private area size.
> >>>>>
> >>>>> Best regards,
> >>>>> Martin
> >>>>
> >>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
> >>>> index a0c8847..94967c5 100644
> >>>> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> >>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> >>>> @@ -1183,7 +1183,7 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool
> >>>> reset_mbuf)
> >>>>
> >>>>                   /* populate the descriptors */
> >>>>                   dma_addr =
> >>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
> >>>> -               rxdp[i].read.hdr_addr = dma_addr;
> >>>> +               rxdp[i].read.hdr_addr = 0;
> >>>>                   rxdp[i].read.pkt_addr = dma_addr;
> >>>>           }
> >>>>
> >>>> @@ -1414,7 +1414,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf
> >>>> **rx_pkts,
> >>>>                   rxe->mbuf = nmb;
> >>>>                   dma_addr =
> >>>>
> >>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(nmb));
> >>>> -               rxdp->read.hdr_addr = dma_addr;
> >>>> +               rxdp->read.hdr_addr = 0;
> >>>>                   rxdp->read.pkt_addr = dma_addr;
> >>>>
> >>>>                   /*
> >>>> @@ -1741,7 +1741,7 @@ next_desc:
> >>>>                           rxe->mbuf = nmb;
> >>>>
> >>>>                           rxm->data_off = RTE_PKTMBUF_HEADROOM;
> >>>> -                       rxdp->read.hdr_addr = dma;
> >>>> +                       rxdp->read.hdr_addr = 0;
> >>>>                           rxdp->read.pkt_addr = dma;
> >>>>                   } else
> >>>>                           rxe->mbuf = NULL; @@ -3633,7 +3633,7 @@
> >>>> ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
> >>>>                   dma_addr =
> >>>>
> >>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mbuf));
> >>>>                   rxd = &rxq->rx_ring[i];
> >>>> -               rxd->read.hdr_addr = dma_addr;
> >>>> +               rxd->read.hdr_addr = 0;
> >>>>                   rxd->read.pkt_addr = dma_addr;
> >>>>                   rxe[i].mbuf = mbuf;
> >>>>           }
> >>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >>>> b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >>>> index 6c1647e..16a9c64 100644
> >>>> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >>>> @@ -56,6 +56,8 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
> >>>>                           RTE_PKTMBUF_HEADROOM);
> >>>>           __m128i dma_addr0, dma_addr1;
> >>>>
> >>>> +       const __m128i hba_msk = _mm_set_epi64x(0, UINT64_MAX);
> >>>> +
> >>>>           rxdp = rxq->rx_ring + rxq->rxrearm_start;
> >>>>
> >>>>           /* Pull 'n' more MBUFs into the software ring */ @@ -108,6 +110,9 @@
> >>>> ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
> >>>>                   dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
> >>>>                   dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);
> >>>>
> >>>> +               dma_addr0 =  _mm_and_si128(dma_addr0, hba_msk);
> >>>> +               dma_addr1 =  _mm_and_si128(dma_addr1, hba_msk);
> >>>> +
> >>>>                   /* flush desc with pa dma_addr */
> >>>>                   _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
> >>>>                   _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
> >>>> bash-4.2$ cat patch1 diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
> >>>> b/drivers/net/ixgbe/ixgbe_rxtx.c index a0c8847..94967c5 100644
> >>>> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> >>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> >>>> @@ -1183,7 +1183,7 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool
> >>>> reset_mbuf)
> >>>>
> >>>>                   /* populate the descriptors */
> >>>>                   dma_addr =
> >>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
> >>>> -               rxdp[i].read.hdr_addr = dma_addr;
> >>>> +               rxdp[i].read.hdr_addr = 0;
> >>>>                   rxdp[i].read.pkt_addr = dma_addr;
> >>>>           }
> >>>>
> >>>> @@ -1414,7 +1414,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf
> >>>> **rx_pkts,
> >>>>                   rxe->mbuf = nmb;
> >>>>                   dma_addr =
> >>>>
> >>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(nmb));
> >>>> -               rxdp->read.hdr_addr = dma_addr;
> >>>> +               rxdp->read.hdr_addr = 0;
> >>>>                   rxdp->read.pkt_addr = dma_addr;
> >>>>
> >>>>                   /*
> >>>> @@ -1741,7 +1741,7 @@ next_desc:
> >>>>                           rxe->mbuf = nmb;
> >>>>
> >>>>                           rxm->data_off = RTE_PKTMBUF_HEADROOM;
> >>>> -                       rxdp->read.hdr_addr = dma;
> >>>> +                       rxdp->read.hdr_addr = 0;
> >>>>                           rxdp->read.pkt_addr = dma;
> >>>>                   } else
> >>>>                           rxe->mbuf = NULL; @@ -3633,7 +3633,7 @@
> >>>> ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
> >>>>                   dma_addr =
> >>>>
> >>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mbuf));
> >>>>                   rxd = &rxq->rx_ring[i];
> >>>> -               rxd->read.hdr_addr = dma_addr;
> >>>> +               rxd->read.hdr_addr = 0;
> >>>>                   rxd->read.pkt_addr = dma_addr;
> >>>>                   rxe[i].mbuf = mbuf;
> >>>>           }
> >>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >>>> b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >>>> index 6c1647e..16a9c64 100644
> >>>> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >>>> @@ -56,6 +56,8 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
> >>>>                           RTE_PKTMBUF_HEADROOM);
> >>>>           __m128i dma_addr0, dma_addr1;
> >>>>
> >>>> +       const __m128i hba_msk = _mm_set_epi64x(0, UINT64_MAX);
> >>>> +
> >>>>           rxdp = rxq->rx_ring + rxq->rxrearm_start;
> >>>>
> >>>>           /* Pull 'n' more MBUFs into the software ring */ @@ -108,6 +110,9 @@
> >>>> ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
> >>>>                   dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
> >>>>                   dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);
> >>>>
> >>>> +               dma_addr0 =  _mm_and_si128(dma_addr0, hba_msk);
> >>>> +               dma_addr1 =  _mm_and_si128(dma_addr1, hba_msk);
> >>>> +
> >>>>                   /* flush desc with pa dma_addr */
> >>>>                   _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
> >>>>                   _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
> >


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
  2015-07-30  9:43           ` Ananyev, Konstantin
@ 2015-07-30 11:22             ` Olivier MATZ
  2015-07-30 13:47               ` Thomas Monjalon
  0 siblings, 1 reply; 19+ messages in thread
From: Olivier MATZ @ 2015-07-30 11:22 UTC (permalink / raw)
  To: Ananyev, Konstantin, Zhang, Helin, Martin Weiser; +Cc: dev

Hi,

On 07/30/2015 11:43 AM, Ananyev, Konstantin wrote:
>
>
>> -----Original Message-----
>> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
>> Sent: Thursday, July 30, 2015 10:10 AM
>> To: Ananyev, Konstantin; Zhang, Helin; Martin Weiser
>> Cc: dev@dpdk.org
>> Subject: Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
>>
>> Hi Konstantin,
>>
>> On 07/30/2015 11:00 AM, Ananyev, Konstantin wrote:
>>> Hi Olivier,
>>>
>>>> -----Original Message-----
>>>> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
>>>> Sent: Thursday, July 30, 2015 9:12 AM
>>>> To: Zhang, Helin; Ananyev, Konstantin; Martin Weiser
>>>> Cc: dev@dpdk.org
>>>> Subject: Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
>>>>
>>>> Hi,
>>>>
>>>> On 07/29/2015 10:24 PM, Zhang, Helin wrote:
>>>>> Hi Martin
>>>>>
>>>>> Thank you very much for the good catch!
>>>>>
>>>>> The similar situation in i40e, as explained by Konstantin.
>>>>> As header split hasn't been supported by DPDK till now. It would be better to put the header address in RX descriptor to 0.
>>>>> But in the future, during header split enabling. We may need to pay extra attention to that. As at least x710 datasheet said
>>>> specifically as below.
>>>>> "The header address should be set by the software to an even number (word aligned address)". We may need to find a way to
>>>> ensure that during mempool/mbuf allocation.
>>>>
>>>> Indeed it would be good to force the priv_size to be aligned.
>>>>
>>>> The priv_size could be aligned automatically in
>>>> rte_pktmbuf_pool_create(). The only possible problem I could see
>>>> is that it would break applications that access to the data buffer
>>>> by doing (sizeof(mbuf) + sizeof(priv)), which is probably not the
>>>> best thing to do (I didn't find any applications like this in dpdk).
>>>
>>>
>>> Might be just make rte_pktmbuf_pool_create() fail if input priv_size % MIN_ALIGN != 0?
>>
>> Hmm maybe it would break more applications: an odd priv_size is
>> probably rare, but a priv_size that is not aligned to 8 bytes is
>> maybe more common.
>
> My thought was that rte_mempool_create() was just introduced in 2.1,
> so if we add extra requirement for the input parameter now -
> there would be no ABI breakage, and not many people started to use it already.
> For me just seems a bit easier and more straightforward then silent alignment -
> user would not have wrong assumptions here.
> Though if you think that a silent alignment would be more convenient
> for most users - I wouldn't insist.


Yes, I agree on the principle, but it depends whether this fix
is integrated for 2.1 or not.
I think it may already be a bit late for that, especially as it
is not a very critical bug.

Thomas, what do you think?


Olivier




> Konstantin
>
>> It's maybe safer to align the size transparently?
>>
>>
>> Regards,
>> Olivier
>>
>>
>>
>>>
>>>>
>>>> For applications that directly use rte_mempool_create() instead of
>>>> rte_pktmbuf_pool_create(), we could add a check using an assert in
>>>> rte_pktmbuf_init() and some words in the documentation.
>>>>
>>>> The question is: what should be the proper alignment? I would say
>>>> at least 8 bytes, but maybe cache_aligned is an option too.
>>>
>>> 8 bytes seems enough to me.
>>>
>>> Konstantin
>>>
>>>>
>>>> Regards,
>>>> Olivier
>>>>
>>>>
>>>>>
>>>>> Regards,
>>>>> Helin
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Ananyev, Konstantin
>>>>>> Sent: Wednesday, July 29, 2015 11:12 AM
>>>>>> To: Martin Weiser; Zhang, Helin; olivier.matz@6wind.com
>>>>>> Cc: dev@dpdk.org
>>>>>> Subject: RE: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf
>>>>>> private area size is odd
>>>>>>
>>>>>> Hi Martin,
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Martin Weiser
>>>>>>> Sent: Wednesday, July 29, 2015 4:07 PM
>>>>>>> To: Zhang, Helin; olivier.matz@6wind.com
>>>>>>> Cc: dev@dpdk.org
>>>>>>> Subject: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when
>>>>>>> mbuf private area size is odd
>>>>>>>
>>>>>>> Hi Helin, Hi Olivier,
>>>>>>>
>>>>>>> we are seeing an issue with the ixgbe and i40e drivers which we could
>>>>>>> track down to our setting of the private area size of the mbufs.
>>>>>>> The issue can be easily reproduced with the l2fwd example application
>>>>>>> when a small modification is done: just set the priv_size parameter in
>>>>>>> the call to the rte_pktmbuf_pool_create function to an odd number like
>>>>>>> 1. In our tests this causes every call to rte_eth_rx_burst to return
>>>>>>> 32 (which is the setting of nb_pkts) nonsense mbufs although no
>>>>>>> packets are received on the interface and the hardware counters do not
>>>>>>> report any received packets.
>>>>>>
>>>>>>    From Niantic datasheet:
>>>>>>
>>>>>> "7.1.6.1 Advanced Receive Descriptors — Read Format Table 7-15 lists the
>>>>>> advanced receive descriptor programming by the software. The ...
>>>>>> Packet Buffer Address (64)
>>>>>> This is the physical address of the packet buffer. The lowest bit is A0 (LSB of the
>>>>>> address).
>>>>>> Header Buffer Address (64)
>>>>>> The physical address of the header buffer with the lowest bit being Descriptor
>>>>>> Done (DD).
>>>>>> When a packet spans in multiple descriptors, the header buffer address is used
>>>>>> only on the first descriptor. During the programming phase, software must set
>>>>>> the DD bit to zero (see the description of the DD bit in this section). This means
>>>>>> that header buffer addresses are always word aligned."
>>>>>>
>>>>>> Right now, in ixgbe PMD we always setup  Packet Buffer Address (PBA)and
>>>>>> Header Buffer Address (HBA) to the same value:
>>>>>> buf_physaddr + RTE_PKTMBUF_HEADROOM
>>>>>> So when pirv_size==1, DD bit in RXD is always set to one by SW itself, and then
>>>>>> SW considers that HW already done with it.
>>>>>> In other words, right now for ixgbe you can't use RX buffer that is not aligned on
>>>>>> word boundary.
>>>>>>
>>>>>> So the advice would be, right now - don't set priv_size to the odd value.
>>>>>> As we don't support split header feature anyway, I think we can fix it just by
>>>>>> always setting HBA in the RXD to zero.
>>>>>> Could you try the fix for ixgbe below?
>>>>>>
>>>>>> Same story with FVL, I believe.
>>>>>> Konstantin
>>>>>>
>>>>>>
>>>>>>> Interestingly this does not happen if we force the scattered rx path.
>>>>>>>
>>>>>>> I assume the drivers have some expectations regarding the alignment of
>>>>>>> the buf_addr in the mbuf and setting an odd private are size breaks
>>>>>>> this alignment in the rte_pktmbuf_init function. If this is the case
>>>>>>> then one possible fix might be to enforce an alignment on the private area size.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Martin
>>>>>>
>>>>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
>>>>>> index a0c8847..94967c5 100644
>>>>>> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
>>>>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
>>>>>> @@ -1183,7 +1183,7 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool
>>>>>> reset_mbuf)
>>>>>>
>>>>>>                    /* populate the descriptors */
>>>>>>                    dma_addr =
>>>>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
>>>>>> -               rxdp[i].read.hdr_addr = dma_addr;
>>>>>> +               rxdp[i].read.hdr_addr = 0;
>>>>>>                    rxdp[i].read.pkt_addr = dma_addr;
>>>>>>            }
>>>>>>
>>>>>> @@ -1414,7 +1414,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf
>>>>>> **rx_pkts,
>>>>>>                    rxe->mbuf = nmb;
>>>>>>                    dma_addr =
>>>>>>
>>>>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(nmb));
>>>>>> -               rxdp->read.hdr_addr = dma_addr;
>>>>>> +               rxdp->read.hdr_addr = 0;
>>>>>>                    rxdp->read.pkt_addr = dma_addr;
>>>>>>
>>>>>>                    /*
>>>>>> @@ -1741,7 +1741,7 @@ next_desc:
>>>>>>                            rxe->mbuf = nmb;
>>>>>>
>>>>>>                            rxm->data_off = RTE_PKTMBUF_HEADROOM;
>>>>>> -                       rxdp->read.hdr_addr = dma;
>>>>>> +                       rxdp->read.hdr_addr = 0;
>>>>>>                            rxdp->read.pkt_addr = dma;
>>>>>>                    } else
>>>>>>                            rxe->mbuf = NULL; @@ -3633,7 +3633,7 @@
>>>>>> ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
>>>>>>                    dma_addr =
>>>>>>
>>>>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mbuf));
>>>>>>                    rxd = &rxq->rx_ring[i];
>>>>>> -               rxd->read.hdr_addr = dma_addr;
>>>>>> +               rxd->read.hdr_addr = 0;
>>>>>>                    rxd->read.pkt_addr = dma_addr;
>>>>>>                    rxe[i].mbuf = mbuf;
>>>>>>            }
>>>>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>>>> b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>>>> index 6c1647e..16a9c64 100644
>>>>>> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>>>> @@ -56,6 +56,8 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>>>>>>                            RTE_PKTMBUF_HEADROOM);
>>>>>>            __m128i dma_addr0, dma_addr1;
>>>>>>
>>>>>> +       const __m128i hba_msk = _mm_set_epi64x(0, UINT64_MAX);
>>>>>> +
>>>>>>            rxdp = rxq->rx_ring + rxq->rxrearm_start;
>>>>>>
>>>>>>            /* Pull 'n' more MBUFs into the software ring */ @@ -108,6 +110,9 @@
>>>>>> ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>>>>>>                    dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
>>>>>>                    dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);
>>>>>>
>>>>>> +               dma_addr0 =  _mm_and_si128(dma_addr0, hba_msk);
>>>>>> +               dma_addr1 =  _mm_and_si128(dma_addr1, hba_msk);
>>>>>> +
>>>>>>                    /* flush desc with pa dma_addr */
>>>>>>                    _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
>>>>>>                    _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
>>>>>> bash-4.2$ cat patch1 diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
>>>>>> b/drivers/net/ixgbe/ixgbe_rxtx.c index a0c8847..94967c5 100644
>>>>>> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
>>>>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
>>>>>> @@ -1183,7 +1183,7 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool
>>>>>> reset_mbuf)
>>>>>>
>>>>>>                    /* populate the descriptors */
>>>>>>                    dma_addr =
>>>>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
>>>>>> -               rxdp[i].read.hdr_addr = dma_addr;
>>>>>> +               rxdp[i].read.hdr_addr = 0;
>>>>>>                    rxdp[i].read.pkt_addr = dma_addr;
>>>>>>            }
>>>>>>
>>>>>> @@ -1414,7 +1414,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf
>>>>>> **rx_pkts,
>>>>>>                    rxe->mbuf = nmb;
>>>>>>                    dma_addr =
>>>>>>
>>>>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(nmb));
>>>>>> -               rxdp->read.hdr_addr = dma_addr;
>>>>>> +               rxdp->read.hdr_addr = 0;
>>>>>>                    rxdp->read.pkt_addr = dma_addr;
>>>>>>
>>>>>>                    /*
>>>>>> @@ -1741,7 +1741,7 @@ next_desc:
>>>>>>                            rxe->mbuf = nmb;
>>>>>>
>>>>>>                            rxm->data_off = RTE_PKTMBUF_HEADROOM;
>>>>>> -                       rxdp->read.hdr_addr = dma;
>>>>>> +                       rxdp->read.hdr_addr = 0;
>>>>>>                            rxdp->read.pkt_addr = dma;
>>>>>>                    } else
>>>>>>                            rxe->mbuf = NULL; @@ -3633,7 +3633,7 @@
>>>>>> ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
>>>>>>                    dma_addr =
>>>>>>
>>>>>> rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mbuf));
>>>>>>                    rxd = &rxq->rx_ring[i];
>>>>>> -               rxd->read.hdr_addr = dma_addr;
>>>>>> +               rxd->read.hdr_addr = 0;
>>>>>>                    rxd->read.pkt_addr = dma_addr;
>>>>>>                    rxe[i].mbuf = mbuf;
>>>>>>            }
>>>>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>>>> b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>>>> index 6c1647e..16a9c64 100644
>>>>>> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>>>>> @@ -56,6 +56,8 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>>>>>>                            RTE_PKTMBUF_HEADROOM);
>>>>>>            __m128i dma_addr0, dma_addr1;
>>>>>>
>>>>>> +       const __m128i hba_msk = _mm_set_epi64x(0, UINT64_MAX);
>>>>>> +
>>>>>>            rxdp = rxq->rx_ring + rxq->rxrearm_start;
>>>>>>
>>>>>>            /* Pull 'n' more MBUFs into the software ring */ @@ -108,6 +110,9 @@
>>>>>> ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>>>>>>                    dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
>>>>>>                    dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);
>>>>>>
>>>>>> +               dma_addr0 =  _mm_and_si128(dma_addr0, hba_msk);
>>>>>> +               dma_addr1 =  _mm_and_si128(dma_addr1, hba_msk);
>>>>>> +
>>>>>>                    /* flush desc with pa dma_addr */
>>>>>>                    _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
>>>>>>                    _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
>>>
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
  2015-07-30 11:22             ` Olivier MATZ
@ 2015-07-30 13:47               ` Thomas Monjalon
  2015-07-30 13:56                 ` [dpdk-dev] [PATCH] mbuf: enforce alignment of mbuf private area Olivier Matz
  0 siblings, 1 reply; 19+ messages in thread
From: Thomas Monjalon @ 2015-07-30 13:47 UTC (permalink / raw)
  To: Olivier MATZ; +Cc: dev

2015-07-30 13:22, Olivier MATZ:
> On 07/30/2015 11:43 AM, Ananyev, Konstantin wrote:
> > From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> >> On 07/30/2015 11:00 AM, Ananyev, Konstantin wrote:
> >>> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> >>>> On 07/29/2015 10:24 PM, Zhang, Helin wrote:
> >>>>> The similar situation in i40e, as explained by Konstantin.
> >>>>> As header split hasn't been supported by DPDK till now. It would be better to put the header address in RX descriptor to 0.
> >>>>> But in the future, during header split enabling. We may need to pay extra attention to that. As at least x710 datasheet said
> >>>> specifically as below.
> >>>>> "The header address should be set by the software to an even number (word aligned address)". We may need to find a way to
> >>>> ensure that during mempool/mbuf allocation.
> >>>>
> >>>> Indeed it would be good to force the priv_size to be aligned.
> >>>>
> >>>> The priv_size could be aligned automatically in
> >>>> rte_pktmbuf_pool_create(). The only possible problem I could see
> >>>> is that it would break applications that access to the data buffer
> >>>> by doing (sizeof(mbuf) + sizeof(priv)), which is probably not the
> >>>> best thing to do (I didn't find any applications like this in dpdk).
> >>>
> >>>
> >>> Might be just make rte_pktmbuf_pool_create() fail if input priv_size % MIN_ALIGN != 0?
> >>
> >> Hmm maybe it would break more applications: an odd priv_size is
> >> probably rare, but a priv_size that is not aligned to 8 bytes is
> >> maybe more common.
> >
> > My thought was that rte_mempool_create() was just introduced in 2.1,
> > so if we add extra requirement for the input parameter now -
> > there would be no ABI breakage, and not many people started to use it already.
> > For me just seems a bit easier and more straightforward then silent alignment -
> > user would not have wrong assumptions here.
> > Though if you think that a silent alignment would be more convenient
> > for most users - I wouldn't insist.
> 
> 
> Yes, I agree on the principle, but it depends whether this fix
> is integrated for 2.1 or not.
> I think it may already be a bit late for that, especially as it
> is not a very critical bug.
> 
> Thomas, what do you think?

It is a fix.
Adding a doc comment, an assert and an alignment constraint or a new automatic
alignment in the not yet released function shouldn't hurt.
A patch would be welcome for 2.1. Thanks

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [dpdk-dev] [PATCH] mbuf: enforce alignment of mbuf private area
  2015-07-30 13:47               ` Thomas Monjalon
@ 2015-07-30 13:56                 ` Olivier Matz
  2015-07-30 14:13                   ` Ananyev, Konstantin
                                     ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Olivier Matz @ 2015-07-30 13:56 UTC (permalink / raw)
  To: dev

It looks better to have a data buffer address that is aligned to
8 bytes. This is the case when there is no mbuf private area, but
if there is one, the alignment depends on the size of this area
that is located between the mbuf structure and the data buffer.

Indeed, some drivers expects to have the buffer address aligned
to an even address, and moreover an unaligned buffer may impact
the performance when accessing to network headers.

Add a check in rte_pktmbuf_pool_create() to verify the alignment
constraint before creating the mempool. For applications that use
the alternative way (direct call to rte_mempool_create), also
add an assertion in rte_pktmbuf_init().

By the way, also add the MBUF log type.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_eal/common/include/rte_log.h | 1 +
 lib/librte_mbuf/rte_mbuf.c              | 8 +++++++-
 lib/librte_mbuf/rte_mbuf.h              | 7 +++++--
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index 24a55cc..ede0dca 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -77,6 +77,7 @@ extern struct rte_logs rte_logs;
 #define RTE_LOGTYPE_PORT    0x00002000 /**< Log related to port. */
 #define RTE_LOGTYPE_TABLE   0x00004000 /**< Log related to table. */
 #define RTE_LOGTYPE_PIPELINE 0x00008000 /**< Log related to pipeline. */
+#define RTE_LOGTYPE_MBUF    0x00010000 /**< Log related to mbuf. */
 
 /* these log types can be used in an application */
 #define RTE_LOGTYPE_USER1   0x01000000 /**< User-defined log type 1. */
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 4320dd4..a1ddbb3 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -125,6 +125,7 @@ rte_pktmbuf_init(struct rte_mempool *mp,
 	mbuf_size = sizeof(struct rte_mbuf) + priv_size;
 	buf_len = rte_pktmbuf_data_room_size(mp);
 
+	RTE_MBUF_ASSERT((priv_size & (RTE_MBUF_PRIV_ALIGN - 1)) == 0);
 	RTE_MBUF_ASSERT(mp->elt_size >= mbuf_size);
 	RTE_MBUF_ASSERT(buf_len <= UINT16_MAX);
 
@@ -154,7 +155,12 @@ rte_pktmbuf_pool_create(const char *name, unsigned n,
 	struct rte_pktmbuf_pool_private mbp_priv;
 	unsigned elt_size;
 
-
+	if ((priv_size & (RTE_MBUF_PRIV_ALIGN - 1)) != 0) {
+		RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n",
+			priv_size);
+		rte_errno = EINVAL;
+		return NULL;
+	}
 	elt_size = sizeof(struct rte_mbuf) + (unsigned)priv_size +
 		(unsigned)data_room_size;
 	mbp_priv.mbuf_data_room_size = data_room_size;
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 010b32d..c3b8c98 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -698,6 +698,9 @@ extern "C" {
                                                  RTE_PTYPE_INNER_L4_MASK))
 #endif /* RTE_NEXT_ABI */
 
+/** Alignment constraint of mbuf private area. */
+#define RTE_MBUF_PRIV_ALIGN 8
+
 /**
  * Get the name of a RX offload flag
  *
@@ -1238,7 +1241,7 @@ void rte_pktmbuf_pool_init(struct rte_mempool *mp, void *opaque_arg);
  *   details.
  * @param priv_size
  *   Size of application private are between the rte_mbuf structure
- *   and the data buffer.
+ *   and the data buffer. This value must be aligned to RTE_MBUF_PRIV_ALIGN.
  * @param data_room_size
  *   Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM.
  * @param socket_id
@@ -1250,7 +1253,7 @@ void rte_pktmbuf_pool_init(struct rte_mempool *mp, void *opaque_arg);
  *   with rte_errno set appropriately. Possible rte_errno values include:
  *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
  *    - E_RTE_SECONDARY - function was called from a secondary process instance
- *    - EINVAL - cache size provided is too large
+ *    - EINVAL - cache size provided is too large, or priv_size is not aligned.
  *    - ENOSPC - the maximum number of memzones has already been allocated
  *    - EEXIST - a memzone with the same name already exists
  *    - ENOMEM - no appropriate memory area found in which to create memzone
-- 
2.1.4

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: enforce alignment of mbuf private area
  2015-07-30 13:56                 ` [dpdk-dev] [PATCH] mbuf: enforce alignment of mbuf private area Olivier Matz
@ 2015-07-30 14:13                   ` Ananyev, Konstantin
  2015-07-30 16:06                     ` Olivier MATZ
  2015-07-30 15:33                   ` Zhang, Helin
  2015-07-30 16:22                   ` [dpdk-dev] [PATCH v2] " Olivier Matz
  2 siblings, 1 reply; 19+ messages in thread
From: Ananyev, Konstantin @ 2015-07-30 14:13 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev


Hi Olivier,

If fails to compile for me:

/local/kananye1/dpdk.org-mbprv1/lib/librte_mbuf/rte_mbuf.c: In function ârte_pktmbuf_pool_createâ:
/local/kananye1/dpdk.org-mbprv1/lib/librte_mbuf/rte_mbuf.c:161:3: error: ârte_errnoâ undeclared (first use in this function)
   rte_errno = EINVAL;
   ^
/local/kananye1/dpdk.org-mbprv1/lib/librte_mbuf/rte_mbuf.c:161:3: note: each undeclared identifier is reported only once for each function it appears in

I had to add:

diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index a1ddbb3..04344c0 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -58,6 +58,7 @@
 #include <rte_mbuf.h>
 #include <rte_string_fns.h>
 #include <rte_hexdump.h>
+#include <rte_errno.h>

 /*
  * ctrlmbuf constructor, given as a callback function to

Apart from that - looks good to me.
Konstantin
 
> -----Original Message-----
> From: Olivier Matz [mailto:olivier.matz@6wind.com]
> Sent: Thursday, July 30, 2015 2:56 PM
> To: dev@dpdk.org
> Cc: Ananyev, Konstantin; olivier.matz@6wind.com; Zhang, Helin; martin.weiser@allegro-packets.com; thomas.monjalon@6wind.com
> Subject: [PATCH] mbuf: enforce alignment of mbuf private area
> 
> It looks better to have a data buffer address that is aligned to
> 8 bytes. This is the case when there is no mbuf private area, but
> if there is one, the alignment depends on the size of this area
> that is located between the mbuf structure and the data buffer.
> 
> Indeed, some drivers expects to have the buffer address aligned
> to an even address, and moreover an unaligned buffer may impact
> the performance when accessing to network headers.
> 
> Add a check in rte_pktmbuf_pool_create() to verify the alignment
> constraint before creating the mempool. For applications that use
> the alternative way (direct call to rte_mempool_create), also
> add an assertion in rte_pktmbuf_init().
> 
> By the way, also add the MBUF log type.
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---
>  lib/librte_eal/common/include/rte_log.h | 1 +
>  lib/librte_mbuf/rte_mbuf.c              | 8 +++++++-
>  lib/librte_mbuf/rte_mbuf.h              | 7 +++++--
>  3 files changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
> index 24a55cc..ede0dca 100644
> --- a/lib/librte_eal/common/include/rte_log.h
> +++ b/lib/librte_eal/common/include/rte_log.h
> @@ -77,6 +77,7 @@ extern struct rte_logs rte_logs;
>  #define RTE_LOGTYPE_PORT    0x00002000 /**< Log related to port. */
>  #define RTE_LOGTYPE_TABLE   0x00004000 /**< Log related to table. */
>  #define RTE_LOGTYPE_PIPELINE 0x00008000 /**< Log related to pipeline. */
> +#define RTE_LOGTYPE_MBUF    0x00010000 /**< Log related to mbuf. */
> 
>  /* these log types can be used in an application */
>  #define RTE_LOGTYPE_USER1   0x01000000 /**< User-defined log type 1. */
> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> index 4320dd4..a1ddbb3 100644
> --- a/lib/librte_mbuf/rte_mbuf.c
> +++ b/lib/librte_mbuf/rte_mbuf.c
> @@ -125,6 +125,7 @@ rte_pktmbuf_init(struct rte_mempool *mp,
>  	mbuf_size = sizeof(struct rte_mbuf) + priv_size;
>  	buf_len = rte_pktmbuf_data_room_size(mp);
> 
> +	RTE_MBUF_ASSERT((priv_size & (RTE_MBUF_PRIV_ALIGN - 1)) == 0);
>  	RTE_MBUF_ASSERT(mp->elt_size >= mbuf_size);
>  	RTE_MBUF_ASSERT(buf_len <= UINT16_MAX);
> 
> @@ -154,7 +155,12 @@ rte_pktmbuf_pool_create(const char *name, unsigned n,
>  	struct rte_pktmbuf_pool_private mbp_priv;
>  	unsigned elt_size;
> 
> -
> +	if ((priv_size & (RTE_MBUF_PRIV_ALIGN - 1)) != 0) {
> +		RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n",
> +			priv_size);
> +		rte_errno = EINVAL;
> +		return NULL;
> +	}
>  	elt_size = sizeof(struct rte_mbuf) + (unsigned)priv_size +
>  		(unsigned)data_room_size;
>  	mbp_priv.mbuf_data_room_size = data_room_size;
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 010b32d..c3b8c98 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -698,6 +698,9 @@ extern "C" {
>                                                   RTE_PTYPE_INNER_L4_MASK))
>  #endif /* RTE_NEXT_ABI */
> 
> +/** Alignment constraint of mbuf private area. */
> +#define RTE_MBUF_PRIV_ALIGN 8
> +
>  /**
>   * Get the name of a RX offload flag
>   *
> @@ -1238,7 +1241,7 @@ void rte_pktmbuf_pool_init(struct rte_mempool *mp, void *opaque_arg);
>   *   details.
>   * @param priv_size
>   *   Size of application private are between the rte_mbuf structure
> - *   and the data buffer.
> + *   and the data buffer. This value must be aligned to RTE_MBUF_PRIV_ALIGN.
>   * @param data_room_size
>   *   Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM.
>   * @param socket_id
> @@ -1250,7 +1253,7 @@ void rte_pktmbuf_pool_init(struct rte_mempool *mp, void *opaque_arg);
>   *   with rte_errno set appropriately. Possible rte_errno values include:
>   *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
>   *    - E_RTE_SECONDARY - function was called from a secondary process instance
> - *    - EINVAL - cache size provided is too large
> + *    - EINVAL - cache size provided is too large, or priv_size is not aligned.
>   *    - ENOSPC - the maximum number of memzones has already been allocated
>   *    - EEXIST - a memzone with the same name already exists
>   *    - ENOMEM - no appropriate memory area found in which to create memzone
> --
> 2.1.4

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: enforce alignment of mbuf private area
  2015-07-30 13:56                 ` [dpdk-dev] [PATCH] mbuf: enforce alignment of mbuf private area Olivier Matz
  2015-07-30 14:13                   ` Ananyev, Konstantin
@ 2015-07-30 15:33                   ` Zhang, Helin
  2015-07-30 16:07                     ` Olivier MATZ
  2015-07-30 16:22                   ` [dpdk-dev] [PATCH v2] " Olivier Matz
  2 siblings, 1 reply; 19+ messages in thread
From: Zhang, Helin @ 2015-07-30 15:33 UTC (permalink / raw)
  To: Olivier Matz, dev



> -----Original Message-----
> From: Olivier Matz [mailto:olivier.matz@6wind.com]
> Sent: Thursday, July 30, 2015 6:56 AM
> To: dev@dpdk.org
> Cc: Ananyev, Konstantin; olivier.matz@6wind.com; Zhang, Helin;
> martin.weiser@allegro-packets.com; thomas.monjalon@6wind.com
> Subject: [PATCH] mbuf: enforce alignment of mbuf private area
> 
> It looks better to have a data buffer address that is aligned to
> 8 bytes. This is the case when there is no mbuf private area, but if there is one,
> the alignment depends on the size of this area that is located between the mbuf
> structure and the data buffer.
> 
> Indeed, some drivers expects to have the buffer address aligned to an even
> address, and moreover an unaligned buffer may impact the performance when
> accessing to network headers.
> 
> Add a check in rte_pktmbuf_pool_create() to verify the alignment constraint
> before creating the mempool. For applications that use the alternative way
> (direct call to rte_mempool_create), also add an assertion in rte_pktmbuf_init().
> 
> By the way, also add the MBUF log type.
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---
>  lib/librte_eal/common/include/rte_log.h | 1 +
>  lib/librte_mbuf/rte_mbuf.c              | 8 +++++++-
>  lib/librte_mbuf/rte_mbuf.h              | 7 +++++--
>  3 files changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_eal/common/include/rte_log.h
> b/lib/librte_eal/common/include/rte_log.h
> index 24a55cc..ede0dca 100644
> --- a/lib/librte_eal/common/include/rte_log.h
> +++ b/lib/librte_eal/common/include/rte_log.h
> @@ -77,6 +77,7 @@ extern struct rte_logs rte_logs;
>  #define RTE_LOGTYPE_PORT    0x00002000 /**< Log related to port. */
>  #define RTE_LOGTYPE_TABLE   0x00004000 /**< Log related to table. */
>  #define RTE_LOGTYPE_PIPELINE 0x00008000 /**< Log related to pipeline. */
> +#define RTE_LOGTYPE_MBUF    0x00010000 /**< Log related to mbuf. */
> 
>  /* these log types can be used in an application */
>  #define RTE_LOGTYPE_USER1   0x01000000 /**< User-defined log type 1. */
> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c index
> 4320dd4..a1ddbb3 100644
> --- a/lib/librte_mbuf/rte_mbuf.c
> +++ b/lib/librte_mbuf/rte_mbuf.c
> @@ -125,6 +125,7 @@ rte_pktmbuf_init(struct rte_mempool *mp,
>  	mbuf_size = sizeof(struct rte_mbuf) + priv_size;
>  	buf_len = rte_pktmbuf_data_room_size(mp);
> 
> +	RTE_MBUF_ASSERT((priv_size & (RTE_MBUF_PRIV_ALIGN - 1)) == 0);
Using RTE_ALIGN() could be more readable?

>  	RTE_MBUF_ASSERT(mp->elt_size >= mbuf_size);
>  	RTE_MBUF_ASSERT(buf_len <= UINT16_MAX);
> 
> @@ -154,7 +155,12 @@ rte_pktmbuf_pool_create(const char *name, unsigned
> n,
>  	struct rte_pktmbuf_pool_private mbp_priv;
>  	unsigned elt_size;
> 
> -
> +	if ((priv_size & (RTE_MBUF_PRIV_ALIGN - 1)) != 0) {
Using RTE_ALIGN() could be more readable?

> +		RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n",
> +			priv_size);
> +		rte_errno = EINVAL;
> +		return NULL;
> +	}
>  	elt_size = sizeof(struct rte_mbuf) + (unsigned)priv_size +
>  		(unsigned)data_room_size;
>  	mbp_priv.mbuf_data_room_size = data_room_size; diff --git
> a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h index
> 010b32d..c3b8c98 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -698,6 +698,9 @@ extern "C" {
> 
> RTE_PTYPE_INNER_L4_MASK))  #endif /* RTE_NEXT_ABI */
> 
> +/** Alignment constraint of mbuf private area. */ #define
> +RTE_MBUF_PRIV_ALIGN 8
> +
>  /**
>   * Get the name of a RX offload flag
>   *
> @@ -1238,7 +1241,7 @@ void rte_pktmbuf_pool_init(struct rte_mempool *mp,
> void *opaque_arg);
>   *   details.
>   * @param priv_size
>   *   Size of application private are between the rte_mbuf structure
> - *   and the data buffer.
> + *   and the data buffer. This value must be aligned to
> RTE_MBUF_PRIV_ALIGN.
>   * @param data_room_size
>   *   Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM.
>   * @param socket_id
> @@ -1250,7 +1253,7 @@ void rte_pktmbuf_pool_init(struct rte_mempool *mp,
> void *opaque_arg);
>   *   with rte_errno set appropriately. Possible rte_errno values include:
>   *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config
> structure
>   *    - E_RTE_SECONDARY - function was called from a secondary process
> instance
> - *    - EINVAL - cache size provided is too large
> + *    - EINVAL - cache size provided is too large, or priv_size is not aligned.
>   *    - ENOSPC - the maximum number of memzones has already been
> allocated
>   *    - EEXIST - a memzone with the same name already exists
>   *    - ENOMEM - no appropriate memory area found in which to create
> memzone
> --
> 2.1.4

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: enforce alignment of mbuf private area
  2015-07-30 14:13                   ` Ananyev, Konstantin
@ 2015-07-30 16:06                     ` Olivier MATZ
  0 siblings, 0 replies; 19+ messages in thread
From: Olivier MATZ @ 2015-07-30 16:06 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev

On 07/30/2015 04:13 PM, Ananyev, Konstantin wrote:
>
> Hi Olivier,
>
> If fails to compile for me:
>
> /local/kananye1/dpdk.org-mbprv1/lib/librte_mbuf/rte_mbuf.c: In function ârte_pktmbuf_pool_createâ:
> /local/kananye1/dpdk.org-mbprv1/lib/librte_mbuf/rte_mbuf.c:161:3: error: ârte_errnoâ undeclared (first use in this function)
>     rte_errno = EINVAL;
>     ^
> /local/kananye1/dpdk.org-mbprv1/lib/librte_mbuf/rte_mbuf.c:161:3: note: each undeclared identifier is reported only once for each function it appears in
>
> I had to add:

Sorry I had the same error but I forgot to squash the fix... :/

I'm sending a v2


>
> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> index a1ddbb3..04344c0 100644
> --- a/lib/librte_mbuf/rte_mbuf.c
> +++ b/lib/librte_mbuf/rte_mbuf.c
> @@ -58,6 +58,7 @@
>   #include <rte_mbuf.h>
>   #include <rte_string_fns.h>
>   #include <rte_hexdump.h>
> +#include <rte_errno.h>
>
>   /*
>    * ctrlmbuf constructor, given as a callback function to
>
> Apart from that - looks good to me.
> Konstantin
>
>> -----Original Message-----
>> From: Olivier Matz [mailto:olivier.matz@6wind.com]
>> Sent: Thursday, July 30, 2015 2:56 PM
>> To: dev@dpdk.org
>> Cc: Ananyev, Konstantin; olivier.matz@6wind.com; Zhang, Helin; martin.weiser@allegro-packets.com; thomas.monjalon@6wind.com
>> Subject: [PATCH] mbuf: enforce alignment of mbuf private area
>>
>> It looks better to have a data buffer address that is aligned to
>> 8 bytes. This is the case when there is no mbuf private area, but
>> if there is one, the alignment depends on the size of this area
>> that is located between the mbuf structure and the data buffer.
>>
>> Indeed, some drivers expects to have the buffer address aligned
>> to an even address, and moreover an unaligned buffer may impact
>> the performance when accessing to network headers.
>>
>> Add a check in rte_pktmbuf_pool_create() to verify the alignment
>> constraint before creating the mempool. For applications that use
>> the alternative way (direct call to rte_mempool_create), also
>> add an assertion in rte_pktmbuf_init().
>>
>> By the way, also add the MBUF log type.
>>
>> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
>> ---
>>   lib/librte_eal/common/include/rte_log.h | 1 +
>>   lib/librte_mbuf/rte_mbuf.c              | 8 +++++++-
>>   lib/librte_mbuf/rte_mbuf.h              | 7 +++++--
>>   3 files changed, 13 insertions(+), 3 deletions(-)
>>
>> diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
>> index 24a55cc..ede0dca 100644
>> --- a/lib/librte_eal/common/include/rte_log.h
>> +++ b/lib/librte_eal/common/include/rte_log.h
>> @@ -77,6 +77,7 @@ extern struct rte_logs rte_logs;
>>   #define RTE_LOGTYPE_PORT    0x00002000 /**< Log related to port. */
>>   #define RTE_LOGTYPE_TABLE   0x00004000 /**< Log related to table. */
>>   #define RTE_LOGTYPE_PIPELINE 0x00008000 /**< Log related to pipeline. */
>> +#define RTE_LOGTYPE_MBUF    0x00010000 /**< Log related to mbuf. */
>>
>>   /* these log types can be used in an application */
>>   #define RTE_LOGTYPE_USER1   0x01000000 /**< User-defined log type 1. */
>> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
>> index 4320dd4..a1ddbb3 100644
>> --- a/lib/librte_mbuf/rte_mbuf.c
>> +++ b/lib/librte_mbuf/rte_mbuf.c
>> @@ -125,6 +125,7 @@ rte_pktmbuf_init(struct rte_mempool *mp,
>>   	mbuf_size = sizeof(struct rte_mbuf) + priv_size;
>>   	buf_len = rte_pktmbuf_data_room_size(mp);
>>
>> +	RTE_MBUF_ASSERT((priv_size & (RTE_MBUF_PRIV_ALIGN - 1)) == 0);
>>   	RTE_MBUF_ASSERT(mp->elt_size >= mbuf_size);
>>   	RTE_MBUF_ASSERT(buf_len <= UINT16_MAX);
>>
>> @@ -154,7 +155,12 @@ rte_pktmbuf_pool_create(const char *name, unsigned n,
>>   	struct rte_pktmbuf_pool_private mbp_priv;
>>   	unsigned elt_size;
>>
>> -
>> +	if ((priv_size & (RTE_MBUF_PRIV_ALIGN - 1)) != 0) {
>> +		RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n",
>> +			priv_size);
>> +		rte_errno = EINVAL;
>> +		return NULL;
>> +	}
>>   	elt_size = sizeof(struct rte_mbuf) + (unsigned)priv_size +
>>   		(unsigned)data_room_size;
>>   	mbp_priv.mbuf_data_room_size = data_room_size;
>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
>> index 010b32d..c3b8c98 100644
>> --- a/lib/librte_mbuf/rte_mbuf.h
>> +++ b/lib/librte_mbuf/rte_mbuf.h
>> @@ -698,6 +698,9 @@ extern "C" {
>>                                                    RTE_PTYPE_INNER_L4_MASK))
>>   #endif /* RTE_NEXT_ABI */
>>
>> +/** Alignment constraint of mbuf private area. */
>> +#define RTE_MBUF_PRIV_ALIGN 8
>> +
>>   /**
>>    * Get the name of a RX offload flag
>>    *
>> @@ -1238,7 +1241,7 @@ void rte_pktmbuf_pool_init(struct rte_mempool *mp, void *opaque_arg);
>>    *   details.
>>    * @param priv_size
>>    *   Size of application private are between the rte_mbuf structure
>> - *   and the data buffer.
>> + *   and the data buffer. This value must be aligned to RTE_MBUF_PRIV_ALIGN.
>>    * @param data_room_size
>>    *   Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM.
>>    * @param socket_id
>> @@ -1250,7 +1253,7 @@ void rte_pktmbuf_pool_init(struct rte_mempool *mp, void *opaque_arg);
>>    *   with rte_errno set appropriately. Possible rte_errno values include:
>>    *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
>>    *    - E_RTE_SECONDARY - function was called from a secondary process instance
>> - *    - EINVAL - cache size provided is too large
>> + *    - EINVAL - cache size provided is too large, or priv_size is not aligned.
>>    *    - ENOSPC - the maximum number of memzones has already been allocated
>>    *    - EEXIST - a memzone with the same name already exists
>>    *    - ENOMEM - no appropriate memory area found in which to create memzone
>> --
>> 2.1.4
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH] mbuf: enforce alignment of mbuf private area
  2015-07-30 15:33                   ` Zhang, Helin
@ 2015-07-30 16:07                     ` Olivier MATZ
  0 siblings, 0 replies; 19+ messages in thread
From: Olivier MATZ @ 2015-07-30 16:07 UTC (permalink / raw)
  To: Zhang, Helin, dev

On 07/30/2015 05:33 PM, Zhang, Helin wrote:
>
>
>> -----Original Message-----
>> From: Olivier Matz [mailto:olivier.matz@6wind.com]
>> Sent: Thursday, July 30, 2015 6:56 AM
>> To: dev@dpdk.org
>> Cc: Ananyev, Konstantin; olivier.matz@6wind.com; Zhang, Helin;
>> martin.weiser@allegro-packets.com; thomas.monjalon@6wind.com
>> Subject: [PATCH] mbuf: enforce alignment of mbuf private area
>>
>> It looks better to have a data buffer address that is aligned to
>> 8 bytes. This is the case when there is no mbuf private area, but if there is one,
>> the alignment depends on the size of this area that is located between the mbuf
>> structure and the data buffer.
>>
>> Indeed, some drivers expects to have the buffer address aligned to an even
>> address, and moreover an unaligned buffer may impact the performance when
>> accessing to network headers.
>>
>> Add a check in rte_pktmbuf_pool_create() to verify the alignment constraint
>> before creating the mempool. For applications that use the alternative way
>> (direct call to rte_mempool_create), also add an assertion in rte_pktmbuf_init().
>>
>> By the way, also add the MBUF log type.
>>
>> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
>> ---
>>   lib/librte_eal/common/include/rte_log.h | 1 +
>>   lib/librte_mbuf/rte_mbuf.c              | 8 +++++++-
>>   lib/librte_mbuf/rte_mbuf.h              | 7 +++++--
>>   3 files changed, 13 insertions(+), 3 deletions(-)
>>
>> diff --git a/lib/librte_eal/common/include/rte_log.h
>> b/lib/librte_eal/common/include/rte_log.h
>> index 24a55cc..ede0dca 100644
>> --- a/lib/librte_eal/common/include/rte_log.h
>> +++ b/lib/librte_eal/common/include/rte_log.h
>> @@ -77,6 +77,7 @@ extern struct rte_logs rte_logs;
>>   #define RTE_LOGTYPE_PORT    0x00002000 /**< Log related to port. */
>>   #define RTE_LOGTYPE_TABLE   0x00004000 /**< Log related to table. */
>>   #define RTE_LOGTYPE_PIPELINE 0x00008000 /**< Log related to pipeline. */
>> +#define RTE_LOGTYPE_MBUF    0x00010000 /**< Log related to mbuf. */
>>
>>   /* these log types can be used in an application */
>>   #define RTE_LOGTYPE_USER1   0x01000000 /**< User-defined log type 1. */
>> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c index
>> 4320dd4..a1ddbb3 100644
>> --- a/lib/librte_mbuf/rte_mbuf.c
>> +++ b/lib/librte_mbuf/rte_mbuf.c
>> @@ -125,6 +125,7 @@ rte_pktmbuf_init(struct rte_mempool *mp,
>>   	mbuf_size = sizeof(struct rte_mbuf) + priv_size;
>>   	buf_len = rte_pktmbuf_data_room_size(mp);
>>
>> +	RTE_MBUF_ASSERT((priv_size & (RTE_MBUF_PRIV_ALIGN - 1)) == 0);
> Using RTE_ALIGN() could be more readable?
>
>>   	RTE_MBUF_ASSERT(mp->elt_size >= mbuf_size);
>>   	RTE_MBUF_ASSERT(buf_len <= UINT16_MAX);
>>
>> @@ -154,7 +155,12 @@ rte_pktmbuf_pool_create(const char *name, unsigned
>> n,
>>   	struct rte_pktmbuf_pool_private mbp_priv;
>>   	unsigned elt_size;
>>
>> -
>> +	if ((priv_size & (RTE_MBUF_PRIV_ALIGN - 1)) != 0) {
> Using RTE_ALIGN() could be more readable?

Will do, thanks for commenting.



>
>> +		RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n",
>> +			priv_size);
>> +		rte_errno = EINVAL;
>> +		return NULL;
>> +	}
>>   	elt_size = sizeof(struct rte_mbuf) + (unsigned)priv_size +
>>   		(unsigned)data_room_size;
>>   	mbp_priv.mbuf_data_room_size = data_room_size; diff --git
>> a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h index
>> 010b32d..c3b8c98 100644
>> --- a/lib/librte_mbuf/rte_mbuf.h
>> +++ b/lib/librte_mbuf/rte_mbuf.h
>> @@ -698,6 +698,9 @@ extern "C" {
>>
>> RTE_PTYPE_INNER_L4_MASK))  #endif /* RTE_NEXT_ABI */
>>
>> +/** Alignment constraint of mbuf private area. */ #define
>> +RTE_MBUF_PRIV_ALIGN 8
>> +
>>   /**
>>    * Get the name of a RX offload flag
>>    *
>> @@ -1238,7 +1241,7 @@ void rte_pktmbuf_pool_init(struct rte_mempool *mp,
>> void *opaque_arg);
>>    *   details.
>>    * @param priv_size
>>    *   Size of application private are between the rte_mbuf structure
>> - *   and the data buffer.
>> + *   and the data buffer. This value must be aligned to
>> RTE_MBUF_PRIV_ALIGN.
>>    * @param data_room_size
>>    *   Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM.
>>    * @param socket_id
>> @@ -1250,7 +1253,7 @@ void rte_pktmbuf_pool_init(struct rte_mempool *mp,
>> void *opaque_arg);
>>    *   with rte_errno set appropriately. Possible rte_errno values include:
>>    *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config
>> structure
>>    *    - E_RTE_SECONDARY - function was called from a secondary process
>> instance
>> - *    - EINVAL - cache size provided is too large
>> + *    - EINVAL - cache size provided is too large, or priv_size is not aligned.
>>    *    - ENOSPC - the maximum number of memzones has already been
>> allocated
>>    *    - EEXIST - a memzone with the same name already exists
>>    *    - ENOMEM - no appropriate memory area found in which to create
>> memzone
>> --
>> 2.1.4
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [dpdk-dev] [PATCH v2] mbuf: enforce alignment of mbuf private area
  2015-07-30 13:56                 ` [dpdk-dev] [PATCH] mbuf: enforce alignment of mbuf private area Olivier Matz
  2015-07-30 14:13                   ` Ananyev, Konstantin
  2015-07-30 15:33                   ` Zhang, Helin
@ 2015-07-30 16:22                   ` Olivier Matz
  2015-07-30 16:25                     ` Zhang, Helin
  2015-07-30 21:28                     ` Ananyev, Konstantin
  2 siblings, 2 replies; 19+ messages in thread
From: Olivier Matz @ 2015-07-30 16:22 UTC (permalink / raw)
  To: dev

It looks better to have a data buffer address that is aligned to
8 bytes. This is the case when there is no mbuf private area, but
if there is one, the alignment depends on the size of this area
that is located between the mbuf structure and the data buffer.

Indeed, some drivers expects to have the buffer address aligned
to an even address, and moreover an unaligned buffer may impact
the performance when accessing to network headers.

Add a check in rte_pktmbuf_pool_create() to verify the alignment
constraint before creating the mempool. For applications that use
the alternative way (direct call to rte_mempool_create), also
add an assertion in rte_pktmbuf_init().

By the way, also add the MBUF log type.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_eal/common/include/rte_log.h | 1 +
 lib/librte_mbuf/rte_mbuf.c              | 9 ++++++++-
 lib/librte_mbuf/rte_mbuf.h              | 7 +++++--
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index 24a55cc..ede0dca 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -77,6 +77,7 @@ extern struct rte_logs rte_logs;
 #define RTE_LOGTYPE_PORT    0x00002000 /**< Log related to port. */
 #define RTE_LOGTYPE_TABLE   0x00004000 /**< Log related to table. */
 #define RTE_LOGTYPE_PIPELINE 0x00008000 /**< Log related to pipeline. */
+#define RTE_LOGTYPE_MBUF    0x00010000 /**< Log related to mbuf. */
 
 /* these log types can be used in an application */
 #define RTE_LOGTYPE_USER1   0x01000000 /**< User-defined log type 1. */
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 4320dd4..e416312 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -58,6 +58,7 @@
 #include <rte_mbuf.h>
 #include <rte_string_fns.h>
 #include <rte_hexdump.h>
+#include <rte_errno.h>
 
 /*
  * ctrlmbuf constructor, given as a callback function to
@@ -125,6 +126,7 @@ rte_pktmbuf_init(struct rte_mempool *mp,
 	mbuf_size = sizeof(struct rte_mbuf) + priv_size;
 	buf_len = rte_pktmbuf_data_room_size(mp);
 
+	RTE_MBUF_ASSERT(RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) == priv_size);
 	RTE_MBUF_ASSERT(mp->elt_size >= mbuf_size);
 	RTE_MBUF_ASSERT(buf_len <= UINT16_MAX);
 
@@ -154,7 +156,12 @@ rte_pktmbuf_pool_create(const char *name, unsigned n,
 	struct rte_pktmbuf_pool_private mbp_priv;
 	unsigned elt_size;
 
-
+	if (RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) != priv_size) {
+		RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n",
+			priv_size);
+		rte_errno = EINVAL;
+		return NULL;
+	}
 	elt_size = sizeof(struct rte_mbuf) + (unsigned)priv_size +
 		(unsigned)data_room_size;
 	mbp_priv.mbuf_data_room_size = data_room_size;
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 010b32d..c3b8c98 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -698,6 +698,9 @@ extern "C" {
                                                  RTE_PTYPE_INNER_L4_MASK))
 #endif /* RTE_NEXT_ABI */
 
+/** Alignment constraint of mbuf private area. */
+#define RTE_MBUF_PRIV_ALIGN 8
+
 /**
  * Get the name of a RX offload flag
  *
@@ -1238,7 +1241,7 @@ void rte_pktmbuf_pool_init(struct rte_mempool *mp, void *opaque_arg);
  *   details.
  * @param priv_size
  *   Size of application private are between the rte_mbuf structure
- *   and the data buffer.
+ *   and the data buffer. This value must be aligned to RTE_MBUF_PRIV_ALIGN.
  * @param data_room_size
  *   Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM.
  * @param socket_id
@@ -1250,7 +1253,7 @@ void rte_pktmbuf_pool_init(struct rte_mempool *mp, void *opaque_arg);
  *   with rte_errno set appropriately. Possible rte_errno values include:
  *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
  *    - E_RTE_SECONDARY - function was called from a secondary process instance
- *    - EINVAL - cache size provided is too large
+ *    - EINVAL - cache size provided is too large, or priv_size is not aligned.
  *    - ENOSPC - the maximum number of memzones has already been allocated
  *    - EEXIST - a memzone with the same name already exists
  *    - ENOMEM - no appropriate memory area found in which to create memzone
-- 
2.1.4

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v2] mbuf: enforce alignment of mbuf private area
  2015-07-30 16:22                   ` [dpdk-dev] [PATCH v2] " Olivier Matz
@ 2015-07-30 16:25                     ` Zhang, Helin
  2015-07-30 21:28                     ` Ananyev, Konstantin
  1 sibling, 0 replies; 19+ messages in thread
From: Zhang, Helin @ 2015-07-30 16:25 UTC (permalink / raw)
  To: Olivier Matz, dev



> -----Original Message-----
> From: Olivier Matz [mailto:olivier.matz@6wind.com]
> Sent: Thursday, July 30, 2015 9:22 AM
> To: dev@dpdk.org
> Cc: Ananyev, Konstantin; olivier.matz@6wind.com; Zhang, Helin;
> martin.weiser@allegro-packets.com; thomas.monjalon@6wind.com
> Subject: [PATCH v2] mbuf: enforce alignment of mbuf private area
> 
> It looks better to have a data buffer address that is aligned to
> 8 bytes. This is the case when there is no mbuf private area, but if there is one,
> the alignment depends on the size of this area that is located between the mbuf
> structure and the data buffer.
> 
> Indeed, some drivers expects to have the buffer address aligned to an even
> address, and moreover an unaligned buffer may impact the performance when
> accessing to network headers.
> 
> Add a check in rte_pktmbuf_pool_create() to verify the alignment constraint
> before creating the mempool. For applications that use the alternative way
> (direct call to rte_mempool_create), also add an assertion in rte_pktmbuf_init().
> 
> By the way, also add the MBUF log type.
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v2] mbuf: enforce alignment of mbuf private area
  2015-07-30 16:22                   ` [dpdk-dev] [PATCH v2] " Olivier Matz
  2015-07-30 16:25                     ` Zhang, Helin
@ 2015-07-30 21:28                     ` Ananyev, Konstantin
  2015-08-02 22:35                       ` Thomas Monjalon
  1 sibling, 1 reply; 19+ messages in thread
From: Ananyev, Konstantin @ 2015-07-30 21:28 UTC (permalink / raw)
  To: Olivier Matz, dev



> -----Original Message-----
> From: Olivier Matz [mailto:olivier.matz@6wind.com]
> Sent: Thursday, July 30, 2015 5:22 PM
> To: dev@dpdk.org
> Cc: Ananyev, Konstantin; olivier.matz@6wind.com; Zhang, Helin; martin.weiser@allegro-packets.com; thomas.monjalon@6wind.com
> Subject: [PATCH v2] mbuf: enforce alignment of mbuf private area
> 
> It looks better to have a data buffer address that is aligned to
> 8 bytes. This is the case when there is no mbuf private area, but
> if there is one, the alignment depends on the size of this area
> that is located between the mbuf structure and the data buffer.
> 
> Indeed, some drivers expects to have the buffer address aligned
> to an even address, and moreover an unaligned buffer may impact
> the performance when accessing to network headers.
> 
> Add a check in rte_pktmbuf_pool_create() to verify the alignment
> constraint before creating the mempool. For applications that use
> the alternative way (direct call to rte_mempool_create), also
> add an assertion in rte_pktmbuf_init().
> 
> By the way, also add the MBUF log type.
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd
  2015-07-29 18:12 ` Ananyev, Konstantin
  2015-07-29 20:24   ` Zhang, Helin
@ 2015-08-02 22:32   ` Thomas Monjalon
  1 sibling, 0 replies; 19+ messages in thread
From: Thomas Monjalon @ 2015-08-02 22:32 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev

2015-07-29 18:12, Ananyev, Konstantin:
> As we don't support split header feature anyway, I think we can fix it just by  always setting HBA in the RXD to zero.
> Could you try the fix for ixgbe below?

Please Konstantin, could you send a proper patch for this fix?
Thanks

> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> @@ -1183,7 +1183,7 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool reset_mbuf)
> 
>                 /* populate the descriptors */
>                 dma_addr = rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
> -               rxdp[i].read.hdr_addr = dma_addr;
> +               rxdp[i].read.hdr_addr = 0;
>                 rxdp[i].read.pkt_addr = dma_addr;
>         }
> 
> @@ -1414,7 +1414,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
>                 rxe->mbuf = nmb;
>                 dma_addr =
>                         rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(nmb));
> -               rxdp->read.hdr_addr = dma_addr;
> +               rxdp->read.hdr_addr = 0;
>                 rxdp->read.pkt_addr = dma_addr;
> 
>                 /*
> @@ -1741,7 +1741,7 @@ next_desc:
>                         rxe->mbuf = nmb;
> 
>                         rxm->data_off = RTE_PKTMBUF_HEADROOM;
> -                       rxdp->read.hdr_addr = dma;
> +                       rxdp->read.hdr_addr = 0;
>                         rxdp->read.pkt_addr = dma;
>                 } else
>                         rxe->mbuf = NULL;
> @@ -3633,7 +3633,7 @@ ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
>                 dma_addr =
>                         rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mbuf));
>                 rxd = &rxq->rx_ring[i];
> -               rxd->read.hdr_addr = dma_addr;
> +               rxd->read.hdr_addr = 0;
>                 rxd->read.pkt_addr = dma_addr;
>                 rxe[i].mbuf = mbuf;
>         }
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> index 6c1647e..16a9c64 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> @@ -56,6 +56,8 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>                         RTE_PKTMBUF_HEADROOM);
>         __m128i dma_addr0, dma_addr1;
> 
> +       const __m128i hba_msk = _mm_set_epi64x(0, UINT64_MAX);
> +
>         rxdp = rxq->rx_ring + rxq->rxrearm_start;
> 
>         /* Pull 'n' more MBUFs into the software ring */
> @@ -108,6 +110,9 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>                 dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
>                 dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);
> 
> +               dma_addr0 =  _mm_and_si128(dma_addr0, hba_msk);
> +               dma_addr1 =  _mm_and_si128(dma_addr1, hba_msk);
> +
>                 /* flush desc with pa dma_addr */
>                 _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
>                 _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);
> bash-4.2$ cat patch1
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
> index a0c8847..94967c5 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> @@ -1183,7 +1183,7 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool reset_mbuf)
> 
>                 /* populate the descriptors */
>                 dma_addr = rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
> -               rxdp[i].read.hdr_addr = dma_addr;
> +               rxdp[i].read.hdr_addr = 0;
>                 rxdp[i].read.pkt_addr = dma_addr;
>         }
> 
> @@ -1414,7 +1414,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
>                 rxe->mbuf = nmb;
>                 dma_addr =
>                         rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(nmb));
> -               rxdp->read.hdr_addr = dma_addr;
> +               rxdp->read.hdr_addr = 0;
>                 rxdp->read.pkt_addr = dma_addr;
> 
>                 /*
> @@ -1741,7 +1741,7 @@ next_desc:
>                         rxe->mbuf = nmb;
> 
>                         rxm->data_off = RTE_PKTMBUF_HEADROOM;
> -                       rxdp->read.hdr_addr = dma;
> +                       rxdp->read.hdr_addr = 0;
>                         rxdp->read.pkt_addr = dma;
>                 } else
>                         rxe->mbuf = NULL;
> @@ -3633,7 +3633,7 @@ ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
>                 dma_addr =
>                         rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mbuf));
>                 rxd = &rxq->rx_ring[i];
> -               rxd->read.hdr_addr = dma_addr;
> +               rxd->read.hdr_addr = 0;
>                 rxd->read.pkt_addr = dma_addr;
>                 rxe[i].mbuf = mbuf;
>         }
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> index 6c1647e..16a9c64 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> @@ -56,6 +56,8 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>                         RTE_PKTMBUF_HEADROOM);
>         __m128i dma_addr0, dma_addr1;
> 
> +       const __m128i hba_msk = _mm_set_epi64x(0, UINT64_MAX);
> +
>         rxdp = rxq->rx_ring + rxq->rxrearm_start;
> 
>         /* Pull 'n' more MBUFs into the software ring */
> @@ -108,6 +110,9 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
>                 dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room);
>                 dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room);
> 
> +               dma_addr0 =  _mm_and_si128(dma_addr0, hba_msk);
> +               dma_addr1 =  _mm_and_si128(dma_addr1, hba_msk);
> +
>                 /* flush desc with pa dma_addr */
>                 _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0);
>                 _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1);

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v2] mbuf: enforce alignment of mbuf private area
  2015-07-30 21:28                     ` Ananyev, Konstantin
@ 2015-08-02 22:35                       ` Thomas Monjalon
  0 siblings, 0 replies; 19+ messages in thread
From: Thomas Monjalon @ 2015-08-02 22:35 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev

> > It looks better to have a data buffer address that is aligned to
> > 8 bytes. This is the case when there is no mbuf private area, but
> > if there is one, the alignment depends on the size of this area
> > that is located between the mbuf structure and the data buffer.
> > 
> > Indeed, some drivers expects to have the buffer address aligned
> > to an even address, and moreover an unaligned buffer may impact
> > the performance when accessing to network headers.
> > 
> > Add a check in rte_pktmbuf_pool_create() to verify the alignment
> > constraint before creating the mempool. For applications that use
> > the alternative way (direct call to rte_mempool_create), also
> > add an assertion in rte_pktmbuf_init().
> > 
> > By the way, also add the MBUF log type.
> > 
> > Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> 
> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

Applied, thanks

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2015-08-02 22:36 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-29 15:07 [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd Martin Weiser
2015-07-29 18:12 ` Ananyev, Konstantin
2015-07-29 20:24   ` Zhang, Helin
2015-07-30  8:12     ` Olivier MATZ
2015-07-30  9:00       ` Ananyev, Konstantin
2015-07-30  9:10         ` Olivier MATZ
2015-07-30  9:43           ` Ananyev, Konstantin
2015-07-30 11:22             ` Olivier MATZ
2015-07-30 13:47               ` Thomas Monjalon
2015-07-30 13:56                 ` [dpdk-dev] [PATCH] mbuf: enforce alignment of mbuf private area Olivier Matz
2015-07-30 14:13                   ` Ananyev, Konstantin
2015-07-30 16:06                     ` Olivier MATZ
2015-07-30 15:33                   ` Zhang, Helin
2015-07-30 16:07                     ` Olivier MATZ
2015-07-30 16:22                   ` [dpdk-dev] [PATCH v2] " Olivier Matz
2015-07-30 16:25                     ` Zhang, Helin
2015-07-30 21:28                     ` Ananyev, Konstantin
2015-08-02 22:35                       ` Thomas Monjalon
2015-08-02 22:32   ` [dpdk-dev] Issue with non-scattered rx in ixgbe and i40e when mbuf private area size is odd Thomas Monjalon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).