* [PATCH v1 1/6] net/intel: update E830 Tx Time Queue Context Structure
2025-06-06 21:19 [PATCH v1 0/6] Add TxPP Support for E830 Soumyadeep Hore
@ 2025-06-06 21:19 ` Soumyadeep Hore
2025-06-06 21:19 ` [PATCH v1 2/6] net/intel: add read clock feature in ICE Soumyadeep Hore
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Soumyadeep Hore @ 2025-06-06 21:19 UTC
To: dev, bruce.richardson
Cc: aman.deep.singh, manoj.kumar.subbarao, Paul Greenwalt
From: Paul Greenwalt <paul.greenwalt@intel.com>
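Update the field widths and offsets of the Tx Time Queue Context
Structure to align with the HAS. timer_num shrinks from 3 bits to 1,
so every field after it moves down by two bits.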
Signed-off-by: Soumyadeep Hore <soumyadeep.hore@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
---
drivers/net/intel/ice/base/ice_common.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/drivers/net/intel/ice/base/ice_common.c b/drivers/net/intel/ice/base/ice_common.c
index fce9b070cf..d6be991fe3 100644
--- a/drivers/net/intel/ice/base/ice_common.c
+++ b/drivers/net/intel/ice/base/ice_common.c
@@ -1671,17 +1671,17 @@ const struct ice_ctx_ele ice_txtime_ctx_info[] = {
ICE_CTX_STORE(ice_txtime_ctx, cpuid, 8, 82),
ICE_CTX_STORE(ice_txtime_ctx, tphrd_desc, 1, 90),
ICE_CTX_STORE(ice_txtime_ctx, qlen, 13, 91),
- ICE_CTX_STORE(ice_txtime_ctx, timer_num, 3, 104),
- ICE_CTX_STORE(ice_txtime_ctx, txtime_ena_q, 1, 107),
- ICE_CTX_STORE(ice_txtime_ctx, drbell_mode_32, 1, 108),
- ICE_CTX_STORE(ice_txtime_ctx, ts_res, 4, 109),
- ICE_CTX_STORE(ice_txtime_ctx, ts_round_type, 2, 113),
- ICE_CTX_STORE(ice_txtime_ctx, ts_pacing_slot, 3, 115),
- ICE_CTX_STORE(ice_txtime_ctx, merging_ena, 1, 118),
- ICE_CTX_STORE(ice_txtime_ctx, ts_fetch_prof_id, 4, 119),
- ICE_CTX_STORE(ice_txtime_ctx, ts_fetch_cache_line_aln_thld, 4, 123),
- ICE_CTX_STORE(ice_txtime_ctx, tx_pipe_delay_mode, 1, 127),
- ICE_CTX_STORE(ice_txtime_ctx, int_q_state, 70, 128),
+ ICE_CTX_STORE(ice_txtime_ctx, timer_num, 1, 104),
+ ICE_CTX_STORE(ice_txtime_ctx, txtime_ena_q, 1, 105),
+ ICE_CTX_STORE(ice_txtime_ctx, drbell_mode_32, 1, 106),
+ ICE_CTX_STORE(ice_txtime_ctx, ts_res, 4, 107),
+ ICE_CTX_STORE(ice_txtime_ctx, ts_round_type, 2, 111),
+ ICE_CTX_STORE(ice_txtime_ctx, ts_pacing_slot, 3, 113),
+ ICE_CTX_STORE(ice_txtime_ctx, merging_ena, 1, 116),
+ ICE_CTX_STORE(ice_txtime_ctx, ts_fetch_prof_id, 4, 117),
+ ICE_CTX_STORE(ice_txtime_ctx, ts_fetch_cache_line_aln_thld, 4, 121),
+ ICE_CTX_STORE(ice_txtime_ctx, tx_pipe_delay_mode, 1, 125),
+ ICE_CTX_STORE(ice_txtime_ctx, int_q_state, 70, 126),
{ 0 }
};
--
2.43.0
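A note on checking the new layout: each ICE_CTX_STORE() entry above ends in a (width, lsb) pair, and in this table every field starts where the previous one ends. The sketch below is only an illustration of that arithmetic, not driver code. The struct and function names (sketch_ctx_field, sketch_first_gap, sketch_end_bit) are made up for the example; the numbers are copied from the qlen line and the '+' lines of the hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one ICE_CTX_STORE() entry (not a driver type). */
struct sketch_ctx_field {
	const char *name;
	uint16_t width; /* field width in bits */
	uint16_t lsb;   /* bit position of the least significant bit */
};

/* Values copied from the qlen context line and the '+' lines of the hunk. */
static const struct sketch_ctx_field sketch_txtime_fields[] = {
	{ "qlen", 13, 91 },
	{ "timer_num", 1, 104 },
	{ "txtime_ena_q", 1, 105 },
	{ "drbell_mode_32", 1, 106 },
	{ "ts_res", 4, 107 },
	{ "ts_round_type", 2, 111 },
	{ "ts_pacing_slot", 3, 113 },
	{ "merging_ena", 1, 116 },
	{ "ts_fetch_prof_id", 4, 117 },
	{ "ts_fetch_cache_line_aln_thld", 4, 121 },
	{ "tx_pipe_delay_mode", 1, 125 },
	{ "int_q_state", 70, 126 },
};

#define SKETCH_NB_FIELDS \
	(sizeof(sketch_txtime_fields) / sizeof(sketch_txtime_fields[0]))

/* Return the index of the first field that does not start exactly where
 * the previous field ends, or -1 if the table is contiguous.
 */
static int sketch_first_gap(const struct sketch_ctx_field *f, size_t n)
{
	size_t i;

	for (i = 1; i < n; i++)
		if (f[i].lsb != f[i - 1].lsb + f[i - 1].width)
			return (int)i;
	return -1;
}

/* First bit position past the last field in the table. */
static unsigned int sketch_end_bit(const struct sketch_ctx_field *f, size_t n)
{
	return (unsigned int)f[n - 1].lsb + f[n - 1].width;
}

/* Self-check: the updated table has no gaps or overlaps and ends at bit 196. */
static void sketch_layout_selfcheck(void)
{
	assert(sketch_first_gap(sketch_txtime_fields, SKETCH_NB_FIELDS) == -1);
	assert(sketch_end_bit(sketch_txtime_fields, SKETCH_NB_FIELDS) == 196);
}
```

With the old values the table was also contiguous but ended at bit 198, which matches the two-bit reduction of timer_num.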
* [PATCH v1 2/6] net/intel: add read clock feature in ICE
2025-06-06 21:19 [PATCH v1 0/6] Add TxPP Support for E830 Soumyadeep Hore
2025-06-06 21:19 ` [PATCH v1 1/6] net/intel: update E830 Tx Time Queue Context Structure Soumyadeep Hore
@ 2025-06-06 21:19 ` Soumyadeep Hore
2025-06-06 21:19 ` [PATCH v1 3/6] net/intel: add TxPP Support for E830 Soumyadeep Hore
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Soumyadeep Hore @ 2025-06-06 21:19 UTC
To: dev, bruce.richardson; +Cc: aman.deep.singh, manoj.kumar.subbarao
Add eth_ice_read_clock() to return the current time, which is
needed to schedule packets based on Tx time.
Signed-off-by: Soumyadeep Hore <soumyadeep.hore@intel.com>
---
drivers/net/intel/ice/ice_ethdev.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/drivers/net/intel/ice/ice_ethdev.c b/drivers/net/intel/ice/ice_ethdev.c
index 7cc083ca32..9478ba92df 100644
--- a/drivers/net/intel/ice/ice_ethdev.c
+++ b/drivers/net/intel/ice/ice_ethdev.c
@@ -187,6 +187,7 @@ static int ice_timesync_read_time(struct rte_eth_dev *dev,
static int ice_timesync_write_time(struct rte_eth_dev *dev,
const struct timespec *timestamp);
static int ice_timesync_disable(struct rte_eth_dev *dev);
+static int eth_ice_read_clock(struct rte_eth_dev *dev, uint64_t *clock);
static int ice_fec_get_capability(struct rte_eth_dev *dev, struct rte_eth_fec_capa *speed_fec_capa,
unsigned int num);
static int ice_fec_get(struct rte_eth_dev *dev, uint32_t *fec_capa);
@@ -317,6 +318,7 @@ static const struct eth_dev_ops ice_eth_dev_ops = {
.timesync_read_time = ice_timesync_read_time,
.timesync_write_time = ice_timesync_write_time,
.timesync_disable = ice_timesync_disable,
+ .read_clock = eth_ice_read_clock,
.tm_ops_get = ice_tm_ops_get,
.fec_get_capability = ice_fec_get_capability,
.fec_get = ice_fec_get,
@@ -6935,6 +6937,17 @@ ice_timesync_disable(struct rte_eth_dev *dev)
return 0;
}
+static int
+eth_ice_read_clock(__rte_unused struct rte_eth_dev *dev, uint64_t *clock)
+{
+ struct timespec system_time;
+
+ clock_gettime(CLOCK_REALTIME, &system_time);
+ *clock = system_time.tv_sec * NSEC_PER_SEC + system_time.tv_nsec;
+
+ return 0;
+}
+
static const uint32_t *
ice_buffer_split_supported_hdr_ptypes_get(struct rte_eth_dev *dev __rte_unused,
size_t *no_of_elements)
--
2.43.0
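For readers who want to try the conversion outside the driver, here is a minimal standalone sketch of the same idea: read CLOCK_REALTIME and fold it into a single nanosecond count. It is not the patch's code. The names read_realtime_ns and SKETCH_NSEC_PER_SEC are invented for the example, and unlike the patch it checks the clock_gettime() return value and casts to uint64_t before multiplying.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Local constant for the sketch; the driver uses its own NSEC_PER_SEC. */
#define SKETCH_NSEC_PER_SEC 1000000000ULL

/* Store the current wall-clock time in nanoseconds in *clock.
 * Returns 0 on success, -1 if the clock could not be read.
 */
static int read_realtime_ns(uint64_t *clock)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_REALTIME, &ts) != 0)
		return -1;

	/* Widen before multiplying so the product is computed in 64 bits. */
	*clock = (uint64_t)ts.tv_sec * SKETCH_NSEC_PER_SEC +
		 (uint64_t)ts.tv_nsec;
	return 0;
}

/* Self-check: the read succeeds and yields a non-zero timestamp. */
static void read_clock_selfcheck(void)
{
	uint64_t now = 0;

	assert(read_realtime_ns(&now) == 0);
	assert(now != 0);
}
```

An application would typically add its desired delay to this value and store the result in the mbuf Tx timestamp dynamic field before transmitting.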
* [PATCH v1 3/6] net/intel: add TxPP Support for E830
2025-06-06 21:19 [PATCH v1 0/6] Add TxPP Support for E830 Soumyadeep Hore
2025-06-06 21:19 ` [PATCH v1 1/6] net/intel: update E830 Tx Time Queue Context Structure Soumyadeep Hore
2025-06-06 21:19 ` [PATCH v1 2/6] net/intel: add read clock feature in ICE Soumyadeep Hore
@ 2025-06-06 21:19 ` Soumyadeep Hore
2025-06-06 21:19 ` [PATCH v1 4/6] net/intel: add AVX2 Support for TxPP Soumyadeep Hore
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Soumyadeep Hore @ 2025-06-06 21:19 UTC
To: dev, bruce.richardson; +Cc: aman.deep.singh, manoj.kumar.subbarao
Add support for Tx Time based queues, which are used to schedule
packets based on their Tx timestamp.
Signed-off-by: Soumyadeep Hore <soumyadeep.hore@intel.com>
---
drivers/net/intel/common/tx.h | 14 ++
drivers/net/intel/ice/base/ice_lan_tx_rx.h | 4 +
drivers/net/intel/ice/ice_ethdev.c | 3 +-
drivers/net/intel/ice/ice_ethdev.h | 12 ++
drivers/net/intel/ice/ice_rxtx.c | 233 ++++++++++++++++++++-
drivers/net/intel/ice/ice_rxtx.h | 9 +
6 files changed, 265 insertions(+), 10 deletions(-)
diff --git a/drivers/net/intel/common/tx.h b/drivers/net/intel/common/tx.h
index b0a68bae44..8b958bf8e5 100644
--- a/drivers/net/intel/common/tx.h
+++ b/drivers/net/intel/common/tx.h
@@ -30,6 +30,19 @@ struct ci_tx_entry_vec {
typedef void (*ice_tx_release_mbufs_t)(struct ci_tx_queue *txq);
+/**
+ * Structure associated with Tx Time based queue
+ */
+struct ice_txtime {
+ volatile struct ice_ts_desc *ice_ts_ring; /* Tx time ring virtual address */
+ uint16_t nb_ts_desc; /* number of Tx Time descriptors */
+ uint16_t ts_tail; /* current value of tail register */
+ rte_iova_t ts_ring_dma; /* TX time ring DMA address */
+ const struct rte_memzone *ts_mz;
+ int ts_offset; /* dynamic mbuf Tx timestamp field offset */
+ uint64_t ts_flag; /* dynamic mbuf Tx timestamp flag */
+};
+
struct ci_tx_queue {
union { /* TX ring virtual address */
volatile struct i40e_tx_desc *i40e_tx_ring;
@@ -77,6 +90,7 @@ struct ci_tx_queue {
union {
struct { /* ICE driver specific values */
uint32_t q_teid; /* TX schedule node id. */
+ struct ice_txtime tsq; /* Tx Time based queue */
};
struct { /* I40E driver specific values */
uint8_t dcb_tc;
diff --git a/drivers/net/intel/ice/base/ice_lan_tx_rx.h b/drivers/net/intel/ice/base/ice_lan_tx_rx.h
index f92382346f..8b6c1a07a3 100644
--- a/drivers/net/intel/ice/base/ice_lan_tx_rx.h
+++ b/drivers/net/intel/ice/base/ice_lan_tx_rx.h
@@ -1278,6 +1278,8 @@ struct ice_ts_desc {
#define ICE_TXTIME_MAX_QUEUE 2047
#define ICE_SET_TXTIME_MAX_Q_AMOUNT 127
#define ICE_OP_TXTIME_MAX_Q_AMOUNT 2047
+#define ICE_TXTIME_FETCH_TS_DESC_DFLT 8
+#define ICE_TXTIME_FETCH_PROFILE_CNT 16
/* Tx Time queue context data
*
* The sizes of the variables may be larger than needed due to crossing byte
@@ -1303,8 +1305,10 @@ struct ice_txtime_ctx {
u8 drbell_mode_32;
#define ICE_TXTIME_CTX_DRBELL_MODE_32 1
u8 ts_res;
+#define ICE_TXTIME_CTX_RESOLUTION_128NS 7
u8 ts_round_type;
u8 ts_pacing_slot;
+#define ICE_TXTIME_CTX_FETCH_PROF_ID_0 0
u8 merging_ena;
u8 ts_fetch_prof_id;
u8 ts_fetch_cache_line_aln_thld;
diff --git a/drivers/net/intel/ice/ice_ethdev.c b/drivers/net/intel/ice/ice_ethdev.c
index 9478ba92df..3af9f6ba38 100644
--- a/drivers/net/intel/ice/ice_ethdev.c
+++ b/drivers/net/intel/ice/ice_ethdev.c
@@ -4139,7 +4139,8 @@ ice_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
RTE_ETH_TX_OFFLOAD_VXLAN_TNL_TSO |
RTE_ETH_TX_OFFLOAD_GRE_TNL_TSO |
RTE_ETH_TX_OFFLOAD_IPIP_TNL_TSO |
- RTE_ETH_TX_OFFLOAD_GENEVE_TNL_TSO;
+ RTE_ETH_TX_OFFLOAD_GENEVE_TNL_TSO |
+ RTE_ETH_TX_OFFLOAD_SEND_ON_TIMESTAMP;
dev_info->flow_type_rss_offloads |= ICE_RSS_OFFLOAD_ALL;
}
diff --git a/drivers/net/intel/ice/ice_ethdev.h b/drivers/net/intel/ice/ice_ethdev.h
index bfe093afca..dd86bd030c 100644
--- a/drivers/net/intel/ice/ice_ethdev.h
+++ b/drivers/net/intel/ice/ice_ethdev.h
@@ -17,6 +17,18 @@
#include "base/ice_flow.h"
#include "base/ice_sched.h"
+#define __bf_shf(x) rte_bsf32(x)
+#define FIELD_GET(_mask, _reg) \
+ (__extension__ ({ \
+ typeof(_mask) _x = (_mask); \
+ (typeof(_x))(((_reg) & (_x)) >> __bf_shf(_x)); \
+ }))
+#define FIELD_PREP(_mask, _val) \
+ (__extension__ ({ \
+ typeof(_mask) _x = (_mask); \
+ ((typeof(_x))(_val) << __bf_shf(_x)) & (_x); \
+ }))
+
#define ICE_ADMINQ_LEN 32
#define ICE_SBIOQ_LEN 32
#define ICE_MAILBOXQ_LEN 32
diff --git a/drivers/net/intel/ice/ice_rxtx.c b/drivers/net/intel/ice/ice_rxtx.c
index ba1435b9de..b256a1b5b8 100644
--- a/drivers/net/intel/ice/ice_rxtx.c
+++ b/drivers/net/intel/ice/ice_rxtx.c
@@ -740,6 +740,53 @@ ice_rx_queue_stop(struct rte_eth_dev *dev, uint16_t rx_queue_id)
return 0;
}
+/**
+ * ice_setup_txtime_ctx - setup a struct ice_txtime_ctx instance
+ * @txq: The queue on which tstamp ring to configure
+ * @txtime_ctx: Pointer to the Tx time queue context structure to be initialized
+ * @txtime_ena: Tx time enable flag, set to true if Tx time should be enabled
+ */
+static int
+ice_setup_txtime_ctx(struct ci_tx_queue *txq,
+ struct ice_txtime_ctx *txtime_ctx, bool txtime_ena)
+{
+ struct ice_vsi *vsi = txq->ice_vsi;
+ struct ice_hw *hw = ICE_VSI_TO_HW(vsi);
+
+ txtime_ctx->base = txq->tsq.ts_ring_dma >> ICE_TX_CMPLTNQ_CTX_BASE_S;
+
+ /* Tx time Queue Length */
+ txtime_ctx->qlen = txq->tsq.nb_ts_desc;
+
+ if (txtime_ena)
+ txtime_ctx->txtime_ena_q = 1;
+
+ /* PF number */
+ txtime_ctx->pf_num = hw->pf_id;
+
+ switch (vsi->type) {
+ case ICE_VSI_LB:
+ case ICE_VSI_CTRL:
+ case ICE_VSI_ADI:
+ case ICE_VSI_PF:
+ txtime_ctx->vmvf_type = ICE_TLAN_CTX_VMVF_TYPE_PF;
+ break;
+ default:
+ PMD_DRV_LOG(ERR, "Unable to set VMVF type for VSI type %d",
+ vsi->type);
+ return -EINVAL;
+ }
+
+ /* make sure the context is associated with the right VSI */
+ txtime_ctx->src_vsi = vsi->vsi_id;
+
+ txtime_ctx->ts_res = ICE_TXTIME_CTX_RESOLUTION_128NS;
+ txtime_ctx->drbell_mode_32 = ICE_TXTIME_CTX_DRBELL_MODE_32;
+ txtime_ctx->ts_fetch_prof_id = ICE_TXTIME_CTX_FETCH_PROF_ID_0;
+
+ return 0;
+}
+
int
ice_tx_queue_start(struct rte_eth_dev *dev, uint16_t tx_queue_id)
{
@@ -799,11 +846,6 @@ ice_tx_queue_start(struct rte_eth_dev *dev, uint16_t tx_queue_id)
ice_set_ctx(hw, (uint8_t *)&tx_ctx, txq_elem->txqs[0].txq_ctx,
ice_tlan_ctx_info);
- txq->qtx_tail = hw->hw_addr + QTX_COMM_DBELL(txq->reg_idx);
-
- /* Init the Tx tail register*/
- ICE_PCI_REG_WRITE(txq->qtx_tail, 0);
-
/* Fix me, we assume TC always 0 here */
err = ice_ena_vsi_txq(hw->port_info, vsi->idx, 0, tx_queue_id, 1,
txq_elem, buf_len, NULL);
@@ -826,6 +868,40 @@ ice_tx_queue_start(struct rte_eth_dev *dev, uint16_t tx_queue_id)
/* record what kind of descriptor cleanup we need on teardown */
txq->vector_tx = ad->tx_vec_allowed;
+ if (txq->tsq.ts_flag > 0) {
+ struct ice_aqc_set_txtime_qgrp *ts_elem;
+ u8 ts_buf_len = ice_struct_size(ts_elem, txtimeqs, 1);
+ struct ice_txtime_ctx txtime_ctx = { 0 };
+
+ ts_elem = ice_malloc(hw, ts_buf_len);
+ ice_setup_txtime_ctx(txq, &txtime_ctx,
+ true);
+ ice_set_ctx(hw, (u8 *)&txtime_ctx,
+ ts_elem->txtimeqs[0].txtime_ctx,
+ ice_txtime_ctx_info);
+
+ txq->qtx_tail = hw->hw_addr +
+ E830_GLQTX_TXTIME_DBELL_LSB(txq->reg_idx);
+
+ /* Init the Tx time tail register*/
+ ICE_PCI_REG_WRITE(txq->qtx_tail, 0);
+
+ err = ice_aq_set_txtimeq(hw, txq->reg_idx, 1, ts_elem,
+ ts_buf_len, NULL);
+ if (err) {
+ PMD_DRV_LOG(ERR, "Failed to set Tx Time queue context, error: %d", err);
+ rte_free(txq_elem);
+ rte_free(ts_elem);
+ return err;
+ }
+ rte_free(ts_elem);
+ } else {
+ txq->qtx_tail = hw->hw_addr + QTX_COMM_DBELL(txq->reg_idx);
+
+ /* Init the Tx tail register*/
+ ICE_PCI_REG_WRITE(txq->qtx_tail, 0);
+ }
+
dev->data->tx_queue_state[tx_queue_id] = RTE_ETH_QUEUE_STATE_STARTED;
rte_free(txq_elem);
@@ -1046,6 +1122,20 @@ ice_reset_tx_queue(struct ci_tx_queue *txq)
txq->last_desc_cleaned = (uint16_t)(txq->nb_tx_desc - 1);
txq->nb_tx_free = (uint16_t)(txq->nb_tx_desc - 1);
+
+ if (txq->tsq.ts_flag > 0) {
+ size = sizeof(struct ice_ts_desc) * txq->tsq.nb_ts_desc;
+ for (i = 0; i < size; i++)
+ ((volatile char *)txq->tsq.ice_ts_ring)[i] = 0;
+
+ for (i = 0; i < txq->tsq.nb_ts_desc; i++) {
+ volatile struct ice_ts_desc *tsd =
+ &txq->tsq.ice_ts_ring[i];
+ tsd->tx_desc_idx_tstamp = 0;
+ }
+
+ txq->tsq.ts_tail = 0;
+ }
}
int
@@ -1080,6 +1170,19 @@ ice_tx_queue_stop(struct rte_eth_dev *dev, uint16_t tx_queue_id)
q_ids[0] = txq->reg_idx;
q_teids[0] = txq->q_teid;
+ if (txq->tsq.ts_flag > 0) {
+ struct ice_aqc_ena_dis_txtime_qgrp txtime_pg;
+ status = ice_aq_ena_dis_txtimeq(hw, q_ids[0], 1, 0,
+ &txtime_pg, NULL);
+ if (status != ICE_SUCCESS) {
+ PMD_DRV_LOG(DEBUG, "Failed to disable Tx time queue");
+ return -EINVAL;
+ }
+ txq->tsq.ts_flag = 0;
+ txq->tsq.ts_offset = -1;
+ dev->dev_ops->timesync_disable(dev);
+ }
+
/* Fix me, we assume TC always 0 here */
status = ice_dis_vsi_txq(hw->port_info, vsi->idx, 0, 1, &q_handle,
q_ids, q_teids, ICE_NO_RESET, 0, NULL);
@@ -1166,6 +1269,7 @@ ice_rx_queue_setup(struct rte_eth_dev *dev,
struct rte_mempool *mp)
{
struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
+ struct ice_hw *hw = ICE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
struct ice_adapter *ad =
ICE_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
struct ice_vsi *vsi = pf->main_vsi;
@@ -1249,7 +1353,7 @@ ice_rx_queue_setup(struct rte_eth_dev *dev,
rxq->xtr_field_offs = ad->devargs.xtr_field_offs;
/* Allocate the maximum number of RX ring hardware descriptor. */
- len = ICE_MAX_RING_DESC;
+ len = ICE_MAX_NUM_DESC_BY_MAC(hw);
/**
* Allocating a little more memory because vectorized/bulk_alloc Rx
@@ -1337,6 +1441,36 @@ ice_rx_queue_release(void *rxq)
rte_free(q);
}
+/**
+ * ice_calc_ts_ring_count - Calculate the number of timestamp descriptors
+ * @hw: pointer to the hardware structure
+ * @tx_desc_count: number of Tx descriptors in the ring
+ *
+ * Return: the number of timestamp descriptors
+ */
+static uint16_t ice_calc_ts_ring_count(struct ice_hw *hw, u16 tx_desc_count)
+{
+ u16 prof = ICE_TXTIME_CTX_FETCH_PROF_ID_0;
+ u16 max_fetch_desc = 0;
+ u16 fetch;
+ u32 reg;
+ u16 i;
+
+ for (i = 0; i < ICE_TXTIME_FETCH_PROFILE_CNT; i++) {
+ reg = rd32(hw, E830_GLTXTIME_FETCH_PROFILE(prof, 0));
+ fetch = FIELD_GET(E830_GLTXTIME_FETCH_PROFILE_FETCH_TS_DESC_M,
+ reg);
+ max_fetch_desc = max(fetch, max_fetch_desc);
+ }
+
+ if (!max_fetch_desc)
+ max_fetch_desc = ICE_TXTIME_FETCH_TS_DESC_DFLT;
+
+ max_fetch_desc = RTE_ALIGN(max_fetch_desc, ICE_REQ_DESC_MULTIPLE);
+
+ return tx_desc_count + max_fetch_desc;
+}
+
int
ice_tx_queue_setup(struct rte_eth_dev *dev,
uint16_t queue_idx,
@@ -1345,6 +1479,7 @@ ice_tx_queue_setup(struct rte_eth_dev *dev,
const struct rte_eth_txconf *tx_conf)
{
struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
+ struct ice_hw *hw = ICE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
struct ice_vsi *vsi = pf->main_vsi;
struct ci_tx_queue *txq;
const struct rte_memzone *tz;
@@ -1469,7 +1604,8 @@ ice_tx_queue_setup(struct rte_eth_dev *dev,
}
/* Allocate TX hardware ring descriptors. */
- ring_size = sizeof(struct ice_tx_desc) * ICE_MAX_RING_DESC;
+ ring_size = sizeof(struct ice_tx_desc) *
+ ICE_MAX_NUM_DESC_BY_MAC(hw);
ring_size = RTE_ALIGN(ring_size, ICE_DMA_MEM_ALIGN);
tz = rte_eth_dma_zone_reserve(dev, "ice_tx_ring", queue_idx,
ring_size, ICE_RING_BASE_ALIGN,
@@ -1507,6 +1643,42 @@ ice_tx_queue_setup(struct rte_eth_dev *dev,
return -ENOMEM;
}
+ if (vsi->type == ICE_VSI_PF &&
+ (offloads & RTE_ETH_TX_OFFLOAD_SEND_ON_TIMESTAMP) &&
+ txq->tsq.ts_offset == 0 && hw->phy_model == ICE_PHY_E830) {
+ int ret =
+ rte_mbuf_dyn_tx_timestamp_register(&txq->tsq.ts_offset,
+ &txq->tsq.ts_flag);
+ if (ret) {
+ PMD_INIT_LOG(ERR, "Cannot register Tx mbuf field/flag "
+ "for timestamp");
+ return -EINVAL;
+ }
+ dev->dev_ops->timesync_enable(dev);
+
+ ring_size = sizeof(struct ice_ts_desc) *
+ ICE_MAX_NUM_DESC_BY_MAC(hw);
+ ring_size = RTE_ALIGN(ring_size, ICE_DMA_MEM_ALIGN);
+ const struct rte_memzone *ts_z =
+ rte_eth_dma_zone_reserve(dev, "ice_tstamp_ring",
+ queue_idx, ring_size, ICE_RING_BASE_ALIGN,
+ socket_id);
+ if (!ts_z) {
+ ice_tx_queue_release(txq);
+ PMD_INIT_LOG(ERR, "Failed to reserve DMA memory "
+ "for TX timestamp");
+ return -ENOMEM;
+ }
+ txq->tsq.ts_mz = ts_z;
+ txq->tsq.ice_ts_ring = ts_z->addr;
+ txq->tsq.ts_ring_dma = ts_z->iova;
+ txq->tsq.nb_ts_desc =
+ ice_calc_ts_ring_count(ICE_VSI_TO_HW(vsi),
+ txq->nb_tx_desc);
+ } else {
+ txq->tsq.ice_ts_ring = NULL;
+ }
+
ice_reset_tx_queue(txq);
txq->q_set = true;
dev->data->tx_queues[queue_idx] = txq;
@@ -1539,6 +1711,8 @@ ice_tx_queue_release(void *txq)
ci_txq_release_all_mbufs(q, false);
rte_free(q->sw_ring);
+ if (q->tsq.ts_mz)
+ rte_memzone_free(q->tsq.ts_mz);
rte_memzone_free(q->mz);
rte_free(q);
}
@@ -2960,7 +3134,7 @@ ice_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
struct rte_mbuf *tx_pkt;
struct rte_mbuf *m_seg;
uint32_t cd_tunneling_params;
- uint16_t tx_id;
+ uint16_t tx_id, ts_id;
uint16_t nb_tx;
uint16_t nb_used;
uint16_t nb_ctx;
@@ -2979,6 +3153,9 @@ ice_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
tx_id = txq->tx_tail;
txe = &sw_ring[tx_id];
+ if (txq->tsq.ts_flag > 0)
+ ts_id = txq->tsq.ts_tail;
+
/* Check if the descriptor ring needs to be cleaned. */
if (txq->nb_tx_free < txq->tx_free_thresh)
(void)ice_xmit_cleanup(txq);
@@ -3166,10 +3343,48 @@ ice_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
txd->cmd_type_offset_bsz |=
rte_cpu_to_le_64(((uint64_t)td_cmd) <<
ICE_TXD_QW1_CMD_S);
+
+ if (txq->tsq.ts_flag > 0) {
+ uint64_t txtime = *RTE_MBUF_DYNFIELD(tx_pkt,
+ txq->tsq.ts_offset, uint64_t *);
+ uint32_t tstamp = (uint32_t)(txtime % NS_PER_S) >>
+ ICE_TXTIME_CTX_RESOLUTION_128NS;
+ if (tx_id == 0)
+ txq->tsq.ice_ts_ring[ts_id].tx_desc_idx_tstamp =
+ rte_cpu_to_le_32(FIELD_PREP(ICE_TXTIME_TX_DESC_IDX_M,
+ txq->nb_tx_desc) | FIELD_PREP(ICE_TXTIME_STAMP_M,
+ tstamp));
+ else
+ txq->tsq.ice_ts_ring[ts_id].tx_desc_idx_tstamp =
+ rte_cpu_to_le_32(FIELD_PREP(ICE_TXTIME_TX_DESC_IDX_M,
+ tx_id) | FIELD_PREP(ICE_TXTIME_STAMP_M, tstamp));
+ ts_id++;
+ /* Handling MDD issue causing Tx Hang */
+ if (ts_id == txq->tsq.nb_ts_desc) {
+ uint16_t fetch = txq->tsq.nb_ts_desc - txq->nb_tx_desc;
+ ts_id = 0;
+ for (; ts_id < fetch; ts_id++) {
+ if (tx_id == 0)
+ txq->tsq.ice_ts_ring[ts_id].tx_desc_idx_tstamp =
+ rte_cpu_to_le_32(FIELD_PREP(ICE_TXTIME_TX_DESC_IDX_M,
+ txq->nb_tx_desc) | FIELD_PREP(ICE_TXTIME_STAMP_M,
+ tstamp));
+ else
+ txq->tsq.ice_ts_ring[ts_id].tx_desc_idx_tstamp =
+ rte_cpu_to_le_32(FIELD_PREP(ICE_TXTIME_TX_DESC_IDX_M,
+ tx_id) | FIELD_PREP(ICE_TXTIME_STAMP_M, tstamp));
+ }
+ }
+ }
}
end_of_tx:
/* update Tail register */
- ICE_PCI_REG_WRITE(txq->qtx_tail, tx_id);
+ if (txq->tsq.ts_flag > 0) {
+ ICE_PCI_REG_WRITE(txq->qtx_tail, ts_id);
+ txq->tsq.ts_tail = ts_id;
+ } else {
+ ICE_PCI_REG_WRITE(txq->qtx_tail, tx_id);
+ }
txq->tx_tail = tx_id;
return nb_tx;
diff --git a/drivers/net/intel/ice/ice_rxtx.h b/drivers/net/intel/ice/ice_rxtx.h
index 500d630679..a9e8b5c5e9 100644
--- a/drivers/net/intel/ice/ice_rxtx.h
+++ b/drivers/net/intel/ice/ice_rxtx.h
@@ -11,9 +11,18 @@
#define ICE_ALIGN_RING_DESC 32
#define ICE_MIN_RING_DESC 64
#define ICE_MAX_RING_DESC (8192 - 32)
+#define ICE_MAX_RING_DESC_E830 8096
+#define ICE_MAX_NUM_DESC_BY_MAC(hw) ((hw)->phy_model == \
+ ICE_PHY_E830 ? \
+ ICE_MAX_RING_DESC_E830 : \
+ ICE_MAX_RING_DESC)
#define ICE_DMA_MEM_ALIGN 4096
#define ICE_RING_BASE_ALIGN 128
+#define ICE_TXTIME_TX_DESC_IDX_M RTE_GENMASK32(12, 0)
+#define ICE_TXTIME_STAMP_M RTE_GENMASK32(31, 13)
+#define ICE_REQ_DESC_MULTIPLE 32
+
#define ICE_RX_MAX_BURST 32
#define ICE_TX_MAX_BURST 32
--
2.43.0
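To make the timestamp descriptor format in ice_xmit_pkts() easier to follow, the sketch below reproduces the arithmetic in plain C. It is an illustration only: the SKETCH_* constants and the functions encode_ts_desc and ts_ring_count are hypothetical names, with values taken from the masks and defines in this patch (index in bits 12:0, timestamp in bits 31:13, a shift of 7 for the 128 ns resolution, default fetch count of 8, multiple of 32). Byte-order conversion is left out.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NS_PER_S          1000000000ULL
#define SKETCH_TS_RES_SHIFT      7u          /* like ICE_TXTIME_CTX_RESOLUTION_128NS */
#define SKETCH_DESC_IDX_MASK     0x00001FFFu /* bits 12:0, like ICE_TXTIME_TX_DESC_IDX_M */
#define SKETCH_TSTAMP_MASK       0xFFFFE000u /* bits 31:13, like ICE_TXTIME_STAMP_M */
#define SKETCH_TSTAMP_SHIFT      13u
#define SKETCH_FETCH_DFLT        8u          /* like ICE_TXTIME_FETCH_TS_DESC_DFLT */
#define SKETCH_REQ_DESC_MULTIPLE 32u         /* like ICE_REQ_DESC_MULTIPLE */

/* Build one 32-bit timestamp descriptor word (host byte order).
 * A tx_id of 0 is written as nb_tx_desc, as in the scalar Tx path.
 */
static uint32_t encode_ts_desc(uint64_t txtime_ns, uint16_t tx_id,
			       uint16_t nb_tx_desc)
{
	uint32_t idx = (tx_id == 0) ? nb_tx_desc : tx_id;
	/* Sub-second part of the launch time, in 128 ns units. */
	uint32_t tstamp = (uint32_t)(txtime_ns % SKETCH_NS_PER_S) >>
			  SKETCH_TS_RES_SHIFT;

	return (idx & SKETCH_DESC_IDX_MASK) |
	       ((tstamp << SKETCH_TSTAMP_SHIFT) & SKETCH_TSTAMP_MASK);
}

/* Size of the timestamp ring: Tx ring size plus the fetch count rounded
 * up to a multiple of 32 (a fetch count of 0 falls back to the default).
 */
static uint16_t ts_ring_count(uint16_t tx_desc_count, uint16_t max_fetch_desc)
{
	uint32_t fetch = max_fetch_desc ? max_fetch_desc : SKETCH_FETCH_DFLT;

	fetch = (fetch + SKETCH_REQ_DESC_MULTIPLE - 1) /
		SKETCH_REQ_DESC_MULTIPLE * SKETCH_REQ_DESC_MULTIPLE;
	return (uint16_t)(tx_desc_count + fetch);
}

/* Self-check with hand-computed values. */
static void ts_desc_selfcheck(void)
{
	/* 128 ns past the second -> tstamp 1; index 5 -> (1 << 13) | 5. */
	assert(encode_ts_desc(1000000128ULL, 5, 512) == 8197u);
	/* tx_id 0 is encoded as nb_tx_desc (512). */
	assert(encode_ts_desc(1000000128ULL, 0, 512) == 8704u);
	assert(ts_ring_count(512, 0) == 544);
	assert(ts_ring_count(512, 40) == 576);
}
```

One consequence of the masks worth keeping in mind: the timestamp field is 19 bits wide, so it can only represent 2^19 units of 128 ns, roughly 67 ms, before the value wraps.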
* [PATCH v1 4/6] net/intel: add AVX2 Support for TxPP
2025-06-06 21:19 [PATCH v1 0/6] Add TxPP Support for E830 Soumyadeep Hore
` (2 preceding siblings ...)
2025-06-06 21:19 ` [PATCH v1 3/6] net/intel: add TxPP Support for E830 Soumyadeep Hore
@ 2025-06-06 21:19 ` Soumyadeep Hore
2025-06-06 21:19 ` [PATCH v1 5/6] net/intel: add AVX512 " Soumyadeep Hore
2025-06-06 21:19 ` [PATCH v1 6/6] doc: announce TxPP support for E830 adapters Soumyadeep Hore
5 siblings, 0 replies; 7+ messages in thread
From: Soumyadeep Hore @ 2025-06-06 21:19 UTC
To: dev, bruce.richardson; +Cc: aman.deep.singh, manoj.kumar.subbarao
Add support for Tx Time based queues in the AVX2 vector Tx path.
Signed-off-by: Soumyadeep Hore <soumyadeep.hore@intel.com>
---
drivers/net/intel/ice/ice_rxtx_vec_avx2.c | 134 +++++++++++++++++++-
drivers/net/intel/ice/ice_rxtx_vec_common.h | 17 +++
2 files changed, 149 insertions(+), 2 deletions(-)
diff --git a/drivers/net/intel/ice/ice_rxtx_vec_avx2.c b/drivers/net/intel/ice/ice_rxtx_vec_avx2.c
index 0c54b325c6..56274b9135 100644
--- a/drivers/net/intel/ice/ice_rxtx_vec_avx2.c
+++ b/drivers/net/intel/ice/ice_rxtx_vec_avx2.c
@@ -848,6 +848,127 @@ ice_vtx(volatile struct ice_tx_desc *txdp,
}
}
+static __rte_always_inline void
+ice_vts1(volatile struct ice_ts_desc *ts, struct rte_mbuf *pkt,
+ uint16_t tx_tail, uint16_t nb_tx_desc, int ts_offset)
+{
+ ts->tx_desc_idx_tstamp = ice_get_ts_queue_desc(pkt,
+ tx_tail, nb_tx_desc, ts_offset);
+}
+
+static __rte_always_inline void
+ice_vts4(volatile struct ice_ts_desc *ts, struct rte_mbuf **pkt,
+ uint16_t nb_pkts, uint16_t tx_tail, uint16_t nb_tx_desc,
+ int ts_offset)
+{
+ uint16_t tx_id;
+
+ for (; nb_pkts > 3; ts += 4, pkt += 4, nb_pkts -= 4,
+ tx_tail += 4) {
+ tx_id = tx_tail + 4;
+ uint32_t ts_dsc3 = ice_get_ts_queue_desc(pkt[3],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 3;
+ uint32_t ts_dsc2 = ice_get_ts_queue_desc(pkt[2],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 2;
+ uint32_t ts_dsc1 = ice_get_ts_queue_desc(pkt[1],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 1;
+ uint32_t ts_dsc0 = ice_get_ts_queue_desc(pkt[0],
+ tx_id, nb_tx_desc, ts_offset);
+ __m128i desc0_3 = _mm_set_epi32(ts_dsc3, ts_dsc2,
+ ts_dsc1, ts_dsc0);
+ _mm_store_si128(RTE_CAST_PTR(void *, ts), desc0_3);
+ }
+
+ /* do any last ones */
+ while (nb_pkts) {
+ tx_tail++;
+ ice_vts1(ts, *pkt, tx_tail, nb_tx_desc, ts_offset);
+ ts++, pkt++, nb_pkts--;
+ }
+}
+
+static __rte_always_inline void
+ice_vts(volatile struct ice_ts_desc *ts, struct rte_mbuf **pkt,
+ uint16_t nb_pkts, uint16_t tx_tail, uint16_t nb_tx_desc,
+ int ts_offset)
+{
+ uint16_t tx_id;
+
+ for (; nb_pkts > 7; ts += 8, pkt += 8, nb_pkts -= 8,
+ tx_tail += 8) {
+ tx_id = tx_tail + 8;
+ uint32_t ts_dsc7 = ice_get_ts_queue_desc(pkt[7],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 7;
+ uint32_t ts_dsc6 = ice_get_ts_queue_desc(pkt[6],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 6;
+ uint32_t ts_dsc5 = ice_get_ts_queue_desc(pkt[5],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 5;
+ uint32_t ts_dsc4 = ice_get_ts_queue_desc(pkt[4],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 4;
+ uint32_t ts_dsc3 = ice_get_ts_queue_desc(pkt[3],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 3;
+ uint32_t ts_dsc2 = ice_get_ts_queue_desc(pkt[2],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 2;
+ uint32_t ts_dsc1 = ice_get_ts_queue_desc(pkt[1],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 1;
+ uint32_t ts_dsc0 = ice_get_ts_queue_desc(pkt[0],
+ tx_id, nb_tx_desc, ts_offset);
+ __m256i desc0_7 = _mm256_set_epi32(ts_dsc7, ts_dsc6,
+ ts_dsc5, ts_dsc4, ts_dsc3, ts_dsc2,
+ ts_dsc1, ts_dsc0);
+ _mm256_storeu_si256(RTE_CAST_PTR(void *, ts), desc0_7);
+ }
+
+ /* do any last ones */
+ if (nb_pkts)
+ ice_vts4(ts, pkt, nb_pkts, tx_tail, nb_tx_desc,
+ ts_offset);
+}
+
+static __rte_always_inline uint16_t
+ice_xmit_fixed_ts_burst_vec_avx512(struct ci_tx_queue *txq,
+ struct rte_mbuf **tx_pkts, uint16_t nb_pkts,
+ uint16_t tx_tail)
+{
+ volatile struct ice_ts_desc *ts;
+ uint16_t n, ts_id, fetch;
+
+ ts_id = txq->tsq.ts_tail;
+ ts = &txq->tsq.ice_ts_ring[ts_id];
+
+ n = (uint16_t)(txq->tsq.nb_ts_desc - ts_id);
+ if (nb_pkts >= n) {
+ ice_vts(ts, tx_pkts, n, txq->tx_tail, txq->nb_tx_desc,
+ txq->tsq.ts_offset);
+ tx_pkts += n;
+ ts += n;
+ tx_tail += n;
+ nb_pkts = (uint16_t)(nb_pkts - n);
+ ts_id = 0;
+ ts = &txq->tsq.ice_ts_ring[ts_id];
+ fetch = txq->tsq.nb_ts_desc - txq->nb_tx_desc;
+ for (; ts_id < fetch; ts_id++, ts++)
+ ice_vts1(ts, *tx_pkts, tx_tail + 1,
+ txq->nb_tx_desc, txq->tsq.ts_offset);
+ }
+
+ ice_vts(ts, tx_pkts, nb_pkts, tx_tail, txq->nb_tx_desc,
+ txq->tsq.ts_offset);
+ ts_id = (uint16_t)(ts_id + nb_pkts);
+
+ return ts_id;
+}
+
static __rte_always_inline uint16_t
ice_xmit_fixed_burst_vec_avx2(void *tx_queue, struct rte_mbuf **tx_pkts,
uint16_t nb_pkts, bool offload)
@@ -855,7 +976,7 @@ ice_xmit_fixed_burst_vec_avx2(void *tx_queue, struct rte_mbuf **tx_pkts,
struct ci_tx_queue *txq = (struct ci_tx_queue *)tx_queue;
volatile struct ice_tx_desc *txdp;
struct ci_tx_entry_vec *txep;
- uint16_t n, nb_commit, tx_id;
+ uint16_t n, nb_commit, tx_id, ts_id;
uint64_t flags = ICE_TD_CMD;
uint64_t rs = ICE_TX_DESC_CMD_RS | ICE_TD_CMD;
@@ -875,6 +996,10 @@ ice_xmit_fixed_burst_vec_avx2(void *tx_queue, struct rte_mbuf **tx_pkts,
txq->nb_tx_free = (uint16_t)(txq->nb_tx_free - nb_pkts);
+ if (txq->tsq.ts_flag > 0)
+ ts_id = ice_xmit_fixed_ts_burst_vec_avx512(txq,
+ tx_pkts, nb_commit, tx_id);
+
n = (uint16_t)(txq->nb_tx_desc - tx_id);
if (nb_commit >= n) {
ci_tx_backlog_entry_vec(txep, tx_pkts, n);
@@ -910,7 +1035,12 @@ ice_xmit_fixed_burst_vec_avx2(void *tx_queue, struct rte_mbuf **tx_pkts,
txq->tx_tail = tx_id;
- ICE_PCI_REG_WC_WRITE(txq->qtx_tail, txq->tx_tail);
+ if (txq->tsq.ts_flag > 0) {
+ ICE_PCI_REG_WC_WRITE(txq->qtx_tail, ts_id);
+ txq->tsq.ts_tail = ts_id;
+ } else {
+ ICE_PCI_REG_WC_WRITE(txq->qtx_tail, txq->tx_tail);
+ }
return nb_pkts;
}
diff --git a/drivers/net/intel/ice/ice_rxtx_vec_common.h b/drivers/net/intel/ice/ice_rxtx_vec_common.h
index 7933c26366..9166a0408a 100644
--- a/drivers/net/intel/ice/ice_rxtx_vec_common.h
+++ b/drivers/net/intel/ice/ice_rxtx_vec_common.h
@@ -215,4 +215,21 @@ ice_txd_enable_offload(struct rte_mbuf *tx_pkt,
*txd_hi |= ((uint64_t)td_cmd) << ICE_TXD_QW1_CMD_S;
}
+
+static inline uint32_t
+ice_get_ts_queue_desc(struct rte_mbuf *pkt, uint16_t tx_tail,
+ uint16_t nb_tx_desc, int ts_offset)
+{
+ uint64_t txtime;
+ uint32_t tstamp, ts_desc;
+
+ tx_tail = (tx_tail > nb_tx_desc) ? (tx_tail - nb_tx_desc) :
+ tx_tail;
+ txtime = *RTE_MBUF_DYNFIELD(pkt, ts_offset, uint64_t *);
+ tstamp = (uint32_t)(txtime % NS_PER_S) >>
+ ICE_TXTIME_CTX_RESOLUTION_128NS;
+ ts_desc = rte_cpu_to_le_32(FIELD_PREP(ICE_TXTIME_TX_DESC_IDX_M,
+ (tx_tail)) | FIELD_PREP(ICE_TXTIME_STAMP_M, tstamp));
+ return ts_desc;
+}
#endif
--
2.43.0
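The vector helpers above (ice_vts1, ice_vts4, ice_vts) all rely on the index handling in ice_get_ts_queue_desc(): packet k of a batch is tagged with tx_tail + k + 1, and a value larger than nb_tx_desc is reduced by nb_tx_desc. A scalar reference for just that index sequence can be handy when comparing vector output against expectations. The sketch below is such a reference under those assumptions; wrap_next_idx and fill_ts_indices are hypothetical names, not functions from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Mirror of the wrap rule in ice_get_ts_queue_desc(): indices stay in
 * the range 1..nb_tx_desc, so nb_tx_desc itself is not wrapped.
 */
static uint16_t wrap_next_idx(uint32_t idx, uint16_t nb_tx_desc)
{
	return (uint16_t)(idx > nb_tx_desc ? idx - nb_tx_desc : idx);
}

/* Scalar reference: descriptor index for each of n packets in a batch
 * that starts at tx_tail.
 */
static void fill_ts_indices(uint16_t *out, uint16_t n, uint16_t tx_tail,
			    uint16_t nb_tx_desc)
{
	uint16_t k;

	for (k = 0; k < n; k++)
		out[k] = wrap_next_idx((uint32_t)tx_tail + k + 1u, nb_tx_desc);
}

/* Self-check: a batch of 4 starting at tail 510 in a 512-entry ring. */
static void ts_index_selfcheck(void)
{
	uint16_t idx[4];

	fill_ts_indices(idx, 4, 510, 512);
	assert(idx[0] == 511);
	assert(idx[1] == 512);
	assert(idx[2] == 1);
	assert(idx[3] == 2);
}
```

This agrees with the scalar path in patch 3, where a tx_id of 0 is written as nb_tx_desc.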
* [PATCH v1 5/6] net/intel: add AVX512 Support for TxPP
2025-06-06 21:19 [PATCH v1 0/6] Add TxPP Support for E830 Soumyadeep Hore
` (3 preceding siblings ...)
2025-06-06 21:19 ` [PATCH v1 4/6] net/intel: add AVX2 Support for TxPP Soumyadeep Hore
@ 2025-06-06 21:19 ` Soumyadeep Hore
2025-06-06 21:19 ` [PATCH v1 6/6] doc: announce TxPP support for E830 adapters Soumyadeep Hore
5 siblings, 0 replies; 7+ messages in thread
From: Soumyadeep Hore @ 2025-06-06 21:19 UTC
To: dev, bruce.richardson; +Cc: aman.deep.singh, manoj.kumar.subbarao
Add support for Tx Time based queues in the AVX512 vector Tx path.
Signed-off-by: Soumyadeep Hore <soumyadeep.hore@intel.com>
---
drivers/net/intel/ice/ice_rxtx_vec_avx512.c | 205 +++++++++++++++++++-
1 file changed, 203 insertions(+), 2 deletions(-)
diff --git a/drivers/net/intel/ice/ice_rxtx_vec_avx512.c b/drivers/net/intel/ice/ice_rxtx_vec_avx512.c
index bd49be07c9..6cdd368c38 100644
--- a/drivers/net/intel/ice/ice_rxtx_vec_avx512.c
+++ b/drivers/net/intel/ice/ice_rxtx_vec_avx512.c
@@ -912,6 +912,198 @@ ice_vtx(volatile struct ice_tx_desc *txdp, struct rte_mbuf **pkt,
}
}
+static __rte_always_inline void
+ice_vts1(volatile struct ice_ts_desc *ts, struct rte_mbuf *pkt,
+ uint16_t tx_tail, uint16_t nb_tx_desc, int ts_offset)
+{
+ ts->tx_desc_idx_tstamp = ice_get_ts_queue_desc(pkt,
+ tx_tail, nb_tx_desc, ts_offset);
+}
+
+static __rte_always_inline void
+ice_vts4(volatile struct ice_ts_desc *ts, struct rte_mbuf **pkt,
+ uint16_t nb_pkts, uint16_t tx_tail, uint16_t nb_tx_desc,
+ int ts_offset)
+{
+ uint16_t tx_id;
+
+ for (; nb_pkts > 3; ts += 4, pkt += 4, nb_pkts -= 4,
+ tx_tail += 4) {
+ tx_id = tx_tail + 4;
+ uint32_t ts_dsc3 = ice_get_ts_queue_desc(pkt[3],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 3;
+ uint32_t ts_dsc2 = ice_get_ts_queue_desc(pkt[2],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 2;
+ uint32_t ts_dsc1 = ice_get_ts_queue_desc(pkt[1],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 1;
+ uint32_t ts_dsc0 = ice_get_ts_queue_desc(pkt[0],
+ tx_id, nb_tx_desc, ts_offset);
+ __m128i desc0_3 = _mm_set_epi32(ts_dsc3, ts_dsc2,
+ ts_dsc1, ts_dsc0);
+ _mm_store_si128(RTE_CAST_PTR(void *, ts), desc0_3);
+ }
+
+ /* do any last ones */
+ while (nb_pkts) {
+ tx_tail++;
+ ice_vts1(ts, *pkt, tx_tail, nb_tx_desc, ts_offset);
+ ts++, pkt++, nb_pkts--;
+ }
+}
+
+static __rte_always_inline void
+ice_vts8(volatile struct ice_ts_desc *ts, struct rte_mbuf **pkt,
+ uint16_t nb_pkts, uint16_t tx_tail, uint16_t nb_tx_desc,
+ int ts_offset)
+{
+ uint16_t tx_id;
+
+ for (; nb_pkts > 7; ts += 8, pkt += 8, nb_pkts -= 8,
+ tx_tail += 8) {
+ tx_id = tx_tail + 8;
+ uint32_t ts_dsc7 = ice_get_ts_queue_desc(pkt[7],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 7;
+ uint32_t ts_dsc6 = ice_get_ts_queue_desc(pkt[6],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 6;
+ uint32_t ts_dsc5 = ice_get_ts_queue_desc(pkt[5],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 5;
+ uint32_t ts_dsc4 = ice_get_ts_queue_desc(pkt[4],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 4;
+ uint32_t ts_dsc3 = ice_get_ts_queue_desc(pkt[3],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 3;
+ uint32_t ts_dsc2 = ice_get_ts_queue_desc(pkt[2],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 2;
+ uint32_t ts_dsc1 = ice_get_ts_queue_desc(pkt[1],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 1;
+ uint32_t ts_dsc0 = ice_get_ts_queue_desc(pkt[0],
+ tx_id, nb_tx_desc, ts_offset);
+ __m256i desc0_7 = _mm256_set_epi32(ts_dsc7, ts_dsc6,
+ ts_dsc5, ts_dsc4, ts_dsc3, ts_dsc2,
+ ts_dsc1, ts_dsc0);
+ _mm256_storeu_si256(RTE_CAST_PTR(void *, ts), desc0_7);
+ }
+
+ /* do any last ones */
+ if (nb_pkts)
+ ice_vts4(ts, pkt, nb_pkts, tx_tail, nb_tx_desc,
+ ts_offset);
+}
+
+static __rte_always_inline void
+ice_vts(volatile struct ice_ts_desc *ts, struct rte_mbuf **pkt,
+ uint16_t nb_pkts, uint16_t tx_tail, uint16_t nb_tx_desc,
+ int ts_offset)
+{
+ uint16_t tx_id;
+
+ for (; nb_pkts > 15; ts += 16, pkt += 16, nb_pkts -= 16,
+ tx_tail += 16) {
+ tx_id = tx_tail + 16;
+ uint32_t ts_dsc15 = ice_get_ts_queue_desc(pkt[15],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 15;
+ uint32_t ts_dsc14 = ice_get_ts_queue_desc(pkt[14],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 14;
+ uint32_t ts_dsc13 = ice_get_ts_queue_desc(pkt[13],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 13;
+ uint32_t ts_dsc12 = ice_get_ts_queue_desc(pkt[12],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 12;
+ uint32_t ts_dsc11 = ice_get_ts_queue_desc(pkt[11],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 11;
+ uint32_t ts_dsc10 = ice_get_ts_queue_desc(pkt[10],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 10;
+ uint32_t ts_dsc9 = ice_get_ts_queue_desc(pkt[9],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 9;
+ uint32_t ts_dsc8 = ice_get_ts_queue_desc(pkt[8],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 8;
+ uint32_t ts_dsc7 = ice_get_ts_queue_desc(pkt[7],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 7;
+ uint32_t ts_dsc6 = ice_get_ts_queue_desc(pkt[6],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 6;
+ uint32_t ts_dsc5 = ice_get_ts_queue_desc(pkt[5],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 5;
+ uint32_t ts_dsc4 = ice_get_ts_queue_desc(pkt[4],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 4;
+ uint32_t ts_dsc3 = ice_get_ts_queue_desc(pkt[3],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 3;
+ uint32_t ts_dsc2 = ice_get_ts_queue_desc(pkt[2],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 2;
+ uint32_t ts_dsc1 = ice_get_ts_queue_desc(pkt[1],
+ tx_id, nb_tx_desc, ts_offset);
+ tx_id = tx_tail + 1;
+ uint32_t ts_dsc0 = ice_get_ts_queue_desc(pkt[0],
+ tx_id, nb_tx_desc, ts_offset);
+ __m512i desc0_15 = _mm512_set_epi32(ts_dsc15, ts_dsc14,
+ ts_dsc13, ts_dsc12, ts_dsc11, ts_dsc10,
+ ts_dsc9, ts_dsc8, ts_dsc7, ts_dsc6,
+ ts_dsc5, ts_dsc4, ts_dsc3, ts_dsc2,
+ ts_dsc1, ts_dsc0);
+ _mm512_storeu_si512(RTE_CAST_PTR(void *, ts), desc0_15);
+ }
+
+ /* do any last ones */
+ if (nb_pkts)
+ ice_vts8(ts, pkt, nb_pkts, tx_tail, nb_tx_desc,
+ ts_offset);
+}
+
+static __rte_always_inline uint16_t
+ice_xmit_fixed_ts_burst_vec_avx512(struct ci_tx_queue *txq,
+ struct rte_mbuf **tx_pkts, uint16_t nb_pkts,
+ uint16_t tx_tail)
+{
+ volatile struct ice_ts_desc *ts;
+ uint16_t n, ts_id, fetch;
+
+ ts_id = txq->tsq.ts_tail;
+ ts = &txq->tsq.ice_ts_ring[ts_id];
+
+ n = (uint16_t)(txq->tsq.nb_ts_desc - ts_id);
+ if (nb_pkts >= n) {
+ ice_vts(ts, tx_pkts, n, txq->tx_tail, txq->nb_tx_desc,
+ txq->tsq.ts_offset);
+ tx_pkts += n;
+ ts += n;
+ tx_tail += n;
+ nb_pkts = (uint16_t)(nb_pkts - n);
+ ts_id = 0;
+ ts = &txq->tsq.ice_ts_ring[ts_id];
+ fetch = txq->tsq.nb_ts_desc - txq->nb_tx_desc;
+ for (; ts_id < fetch; ts_id++, ts++)
+ ice_vts1(ts, *tx_pkts, tx_tail + 1,
+ txq->nb_tx_desc, txq->tsq.ts_offset);
+ }
+
+ ice_vts(ts, tx_pkts, nb_pkts, tx_tail, txq->nb_tx_desc,
+ txq->tsq.ts_offset);
+ ts_id = (uint16_t)(ts_id + nb_pkts);
+
+ return ts_id;
+}
+
static __rte_always_inline uint16_t
ice_xmit_fixed_burst_vec_avx512(void *tx_queue, struct rte_mbuf **tx_pkts,
uint16_t nb_pkts, bool do_offload)
@@ -919,7 +1111,7 @@ ice_xmit_fixed_burst_vec_avx512(void *tx_queue, struct rte_mbuf **tx_pkts,
struct ci_tx_queue *txq = (struct ci_tx_queue *)tx_queue;
volatile struct ice_tx_desc *txdp;
struct ci_tx_entry_vec *txep;
- uint16_t n, nb_commit, tx_id;
+ uint16_t n, nb_commit, tx_id, ts_id = 0;
uint64_t flags = ICE_TD_CMD;
uint64_t rs = ICE_TX_DESC_CMD_RS | ICE_TD_CMD;
@@ -940,6 +1132,10 @@ ice_xmit_fixed_burst_vec_avx512(void *tx_queue, struct rte_mbuf **tx_pkts,
txq->nb_tx_free = (uint16_t)(txq->nb_tx_free - nb_pkts);
+ if (txq->tsq.ts_flag > 0)
+ ts_id = ice_xmit_fixed_ts_burst_vec_avx512(txq,
+ tx_pkts, nb_commit, tx_id);
+
n = (uint16_t)(txq->nb_tx_desc - tx_id);
if (nb_commit >= n) {
ci_tx_backlog_entry_vec(txep, tx_pkts, n);
@@ -975,7 +1171,12 @@ ice_xmit_fixed_burst_vec_avx512(void *tx_queue, struct rte_mbuf **tx_pkts,
txq->tx_tail = tx_id;
- ICE_PCI_REG_WC_WRITE(txq->qtx_tail, txq->tx_tail);
+ if (txq->tsq.ts_flag > 0) {
+ ICE_PCI_REG_WC_WRITE(txq->qtx_tail, ts_id);
+ txq->tsq.ts_tail = ts_id;
+ } else {
+ ICE_PCI_REG_WC_WRITE(txq->qtx_tail, txq->tx_tail);
+ }
return nb_pkts;
}
--
2.43.0
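
Note on the batching above: ice_vts() handles 16 descriptors per iteration, hands the remainder to ice_vts8(), which in turn hands its remainder to ice_vts4(), and packet i in a burst is always paired with tx_tail + i + 1. A scalar sketch of that cascade is given below, for illustration only. The names sketch_desc(), fill_batch() and fill_cascade() are hypothetical, and sketch_desc() merely records the tx_id it receives; it does not reproduce ice_get_ts_queue_desc(), whose body (including any wrap against nb_tx_desc or use of ts_offset) is not shown in this part of the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ice_get_ts_queue_desc(): it only records the
 * tx_id it was given, so the pairing of packets and tail ids is visible. */
static uint32_t sketch_desc(uint16_t tx_id)
{
	return tx_id;
}

/* Fill `batch` consecutive slots; packet i is paired with tx_tail + i + 1. */
static void fill_batch(uint32_t *ts, uint16_t batch, uint16_t tx_tail)
{
	for (uint16_t i = 0; i < batch; i++)
		ts[i] = sketch_desc((uint16_t)(tx_tail + i + 1));
}

/* Cascade 16 -> 8 -> 4 -> 1, assumed to mirror ice_vts -> ice_vts8 ->
 * ice_vts4 and its single-descriptor tail loop. */
static void fill_cascade(uint32_t *ts, uint16_t nb_pkts, uint16_t tx_tail)
{
	static const uint16_t widths[] = { 16, 8, 4, 1 };

	assert(ts != NULL);
	for (size_t w = 0; w < sizeof(widths) / sizeof(widths[0]); w++) {
		while (nb_pkts >= widths[w]) {
			fill_batch(ts, widths[w], tx_tail);
			ts += widths[w];
			tx_tail = (uint16_t)(tx_tail + widths[w]);
			nb_pkts = (uint16_t)(nb_pkts - widths[w]);
		}
	}
}
```

With a burst of 29 packets the cascade runs one batch each of 16, 8, 4 and 1, and the result should be identical to a plain one-at-a-time loop over the same burst.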
* [PATCH v1 6/6] doc: announce TxPP support for E830 adapters
2025-06-06 21:19 [PATCH v1 0/6] Add TxPP Support for E830 Soumyadeep Hore
` (4 preceding siblings ...)
2025-06-06 21:19 ` [PATCH v1 5/6] net/intel: add AVX512 " Soumyadeep Hore
@ 2025-06-06 21:19 ` Soumyadeep Hore
5 siblings, 0 replies; 7+ messages in thread
From: Soumyadeep Hore @ 2025-06-06 21:19 UTC (permalink / raw)
To: dev, bruce.richardson; +Cc: aman.deep.singh, manoj.kumar.subbarao
E830 adapters now support Tx time-based queues. Document the feature and
how to enable it in the ice NIC guide.
Signed-off-by: Soumyadeep Hore <soumyadeep.hore@intel.com>
---
doc/guides/nics/ice.rst | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/doc/guides/nics/ice.rst b/doc/guides/nics/ice.rst
index 77985ae5a2..73c5477946 100644
--- a/doc/guides/nics/ice.rst
+++ b/doc/guides/nics/ice.rst
@@ -415,6 +415,22 @@ and add the ``--force-max-simd-bitwidth=64`` startup parameter to disable vector
examples/dpdk-ptpclient -c f -n 3 -a 0000:ec:00.1 --force-max-simd-bitwidth=64 -- -T 1 -p 0x1 -c 1
+Tx Packet Pacing
+~~~~~~~~~~~~~~~~
+
+To deliver a timestamp with every packet, a special type of Tx Host Queue,
+the TS Queue, is used. This feature is currently supported only on E830 adapters.
+
+The Tx offload ``RTE_ETH_TX_OFFLOAD_SEND_ON_TIMESTAMP`` enables the feature.
+For example:
+
+.. code-block:: console
+
+ dpdk-testpmd -a 0000:31:00.0 -c f -n 4 -- -i --tx-offloads=0x200000
+ set fwd txonly
+ set txtimes 30000000,1000000
+ start
+
Generic Flow Support
~~~~~~~~~~~~~~~~~~~~
--
2.43.0
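
Note on the example above: the value 0x200000 passed to --tx-offloads is the bit mask for RTE_ETH_TX_OFFLOAD_SEND_ON_TIMESTAMP. A small sketch of that correspondence follows. The macro SKETCH_TX_OFFLOAD_SEND_ON_TIMESTAMP and the helper send_on_timestamp_requested() are illustrative names, not DPDK API, and the bit position (21) is an assumption that should be checked against rte_ethdev.h in the tree being built.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for illustration only. Real applications take
 * RTE_ETH_TX_OFFLOAD_SEND_ON_TIMESTAMP from <rte_ethdev.h>; the value here
 * is assumed to be bit 21. */
#define SKETCH_TX_OFFLOAD_SEND_ON_TIMESTAMP (UINT64_C(1) << 21)

/* Compile-time check that the assumed bit matches the testpmd mask. */
static_assert(SKETCH_TX_OFFLOAD_SEND_ON_TIMESTAMP == 0x200000,
	      "mask used in the testpmd example");

/* Return nonzero when the offload mask requests send-on-timestamp. */
static int send_on_timestamp_requested(uint64_t tx_offloads)
{
	return (tx_offloads & SKETCH_TX_OFFLOAD_SEND_ON_TIMESTAMP) != 0;
}
```

An application would typically OR the real flag into its Tx offload configuration before configuring the port, rather than hard-coding the numeric mask.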